idnits 2.17.1 

draft-ietf-avt-avpf-ccm-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 20.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 2959.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2970.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2977.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2983.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The document has examples using IPv4 documentation addresses according
     to RFC6890, but does not use any IPv6 documentation addresses.  Maybe
     there should be IPv6 examples, too?


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  == Line 758 has weird spacing: '...sg type    mul...'

  == Line 1142 has weird spacing: '...     ab  c   s...'

  == Line 1144 has weird spacing: '...     ba   s...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (August 1, 2007) is 6106 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFCxxxx' is mentioned on line 2811, but not defined

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226)

  ** Obsolete normative reference: RFC 4234 (Obsoleted by RFC 5234)

  -- Obsolete informational reference (is this intentional?): RFC 2032
     (Obsoleted by RFC 4587)

  == Outdated reference: A later version (-12) exists of
     draft-ietf-avt-profile-savpf-10

  -- Obsolete informational reference (is this intentional?): RFC 3525
     (Obsoleted by RFC 5125)

  -- Obsolete informational reference (is this intentional?): RFC 3448
     (Obsoleted by RFC 5348)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-avt-topologies-06


     Summary: 4 errors (**), 0 flaws (~~), 7 warnings (==), 11 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                   Stephan Wenger
3	INTERNET-DRAFT                                           Umesh Chandra
4	Expires: February 2008                                           Nokia
5	Intended Status: Proposed Standard                   Magnus Westerlund
6	                                                             Bo Burman
7	                                                              Ericsson
8	                                                        August 1, 2007

10	                       Codec Control Messages in the
11	               RTP Audio-Visual Profile with Feedback (AVPF)

13	                      <draft-ietf-avt-avpf-ccm-09.txt>

15	Status of this Memo

17	   By submitting this Internet-Draft, each author represents that any
18	   applicable patent or other IPR claims of which he or she is aware
19	   have been or will be disclosed, and any of which he or she becomes
20	   aware will be disclosed, in accordance with Section 6 of BCP 79.

22	   Internet-Drafts are working documents of the Internet Engineering
23	   Task Force (IETF), its areas, and its working groups.  Note that
24	   other groups may also distribute working documents as Internet-
25	   Drafts.

27	   Internet-Drafts are draft documents valid for a maximum of six
28	   months and may be updated, replaced, or obsoleted by other documents
29	   at any time.  It is inappropriate to use Internet-Drafts as
30	   reference material or to cite them other than as "work in progress."

32	   The list of current Internet-Drafts can be accessed at
33	   http://www.ietf.org/ietf/1id-abstracts.txt.

35	   The list of Internet-Draft Shadow Directories can be accessed at
36	   http://www.ietf.org/shadow.html.

38	Copyright Notice

40	   Copyright (C) The IETF Trust (2007).

42	Abstract

44	   This document specifies a few extensions to the messages defined in
45	   the Audio-Visual Profile with Feedback (AVPF).  They are helpful
46	   primarily in conversational multimedia scenarios where centralized
47	   multipoint functionalities are in use.  However, some are also
48	   usable in smaller multicast environments and point-to-point calls.

50	   The extensions discussed are messages related to the ITU-T H.271
51	   Video Back Channel, Full Intra Request, Temporary Maximum Media
52	   Stream Bit Rate and Temporal Spatial Trade-off.

54	TABLE OF CONTENTS

56	1.   Introduction
57	2.   Definitions
58	   2.1. Glossary
59	   2.2. Terminology
60	   2.3. Topologies
61	3.   Motivation
62	   3.1. Use Cases
63	   3.2. Using the Media Path
64	   3.3. Using AVPF
65	      3.3.1. Reliability
66	   3.4. Multicast
67	   3.5. Feedback Messages
68	      3.5.1. Full Intra Request Command
69	         3.5.1.1. Reliability
70	      3.5.2. Temporal Spatial Trade-off Request and Notification
71	         3.5.2.1. Point-to-Point
72	         3.5.2.2. Point-to-Multipoint Using Multicast or
73	                  Translators
74	         3.5.2.3. Point-to-Multipoint Using RTP Mixer
75	         3.5.2.4. Reliability
76	      3.5.3. H.271 Video Back Channel Message
77	         3.5.3.1. Reliability
78	      3.5.4. Temporary Maximum Media Stream Bit Rate Request and
79	             Notification
80	         3.5.4.1. Behavior for media receivers using TMMBR
81	         3.5.4.2. Algorithm for establishing current limitations
82	         3.5.4.3. Use of TMMBR in a Mixer Based Multipoint
83	                  Operation
84	         3.5.4.4. Use of TMMBR in Point-to-Multipoint Using
85	                  Multicast or Translators
86	         3.5.4.5. Use of TMMBR in Point-to-point operation
87	         3.5.4.6. Reliability
88	4.   RTCP Receiver Report Extensions
89	   4.1. Design Principles of the Extension Mechanism
90	   4.2. Transport Layer Feedback Messages
91	      4.2.1. Temporary Maximum Media Stream Bit Rate Request
92	             (TMMBR)
93	         4.2.1.1. Message Format
94	         4.2.1.2. Semantics
95	         4.2.1.3. Timing Rules
96	         4.2.1.4. Handling in Translator and Mixers
97	      4.2.2. Temporary Maximum Media Stream Bit Rate Notification
98	             (TMMBN)
99	         4.2.2.1. Message Format
100	         4.2.2.2. Semantics
101	         4.2.2.3. Timing Rules
102	         4.2.2.4. Handling by Translators and Mixers
103	   4.3. Payload Specific Feedback Messages
104	      4.3.1. Full Intra Request (FIR)
105	         4.3.1.1. Message Format
106	         4.3.1.2. Semantics
107	         4.3.1.3. Timing Rules
108	         4.3.1.4. Handling of FIR Message in Mixer and
109	                  Translators
110	         4.3.1.5. Remarks
111	      4.3.2. Temporal-Spatial Trade-off Request (TSTR)
112	         4.3.2.1. Message Format
113	         4.3.2.2. Semantics
114	         4.3.2.3. Timing Rules
115	         4.3.2.4. Handling of message in Mixers and Translators
116	         4.3.2.5. Remarks
117	      4.3.3. Temporal-Spatial Trade-off Notification (TSTN)
118	         4.3.3.1. Message Format
119	         4.3.3.2. Semantics
120	         4.3.3.3. Timing Rules
121	         4.3.3.4. Handling of TSTN in Mixer and Translators
122	         4.3.3.5. Remarks
123	      4.3.4. H.271 Video Back Channel Message (VBCM)
124	         4.3.4.1. Message Format
125	         4.3.4.2. Semantics
126	         4.3.4.3. Timing Rules
127	         4.3.4.4. Handling of message in Mixer or Translator
128	         4.3.4.5. Remarks
129	5.   Congestion Control
130	6.   Security Considerations
131	7.   SDP Definitions
132	   7.1. Extension of the rtcp-fb Attribute
133	   7.2. Offer-Answer
134	   7.3. Examples
135	8.   IANA Considerations
136	9.   Contributors
137	10.  Acknowledgements
138	11.  References
139	   11.1. Normative references
140	   11.2. Informative references
141	12.  Authors' Addresses
142	1. Introduction

144	   When the Audio-Visual Profile with Feedback (AVPF) [RFC4585] was
145	   developed, the main emphasis lay in the efficient support of point-
146	   to-point and small multipoint scenarios without centralized
147	   multipoint control.  However, in practice, many small multipoint
148	   conferences operate utilizing devices known as Multipoint Control
149	   Units (MCUs).  Long-standing experience of the conversational video
150	   conferencing industry suggests that there is a need for a few
151	   additional feedback messages, to support centralized multipoint
152	   conferencing efficiently.  Some of the messages have applications
153	   beyond centralized multipoint, and this is indicated in the
154	   description of the message.  This is especially true for the message
155	   intended to carry ITU-T Rec. H.271 [H.271] bit strings for Video
156	   Back Channel messages.

158	   In Real-time Transport Protocol (RTP) [RFC3550] terminology, MCUs
159	   comprise mixers and translators.  Most MCUs also include signaling
160	   support.  During the development of this memo, it was noticed that
161	   there is considerable confusion in the community related to the use
162	   of terms such as mixer, translator, and MCU.  In response to these
163	   concerns, a number of topologies have been identified that are of
164	   practical relevance to the industry, but are not documented in
165	   sufficient detail in [RFC3550].  These topologies are documented in
166	   [Topologies], and understanding this memo requires previous or
167	   parallel study of [Topologies].

169	   Some of the messages defined here are forward only, in that they do
170	   not require an explicit notification to the message emitter that
171	   they have been received or that indicates the message receiver's
172	   actions.  Other messages require a response, leading to a two-way
173	   communication model that one could view as useful for control
174	   purposes.  However, it is not the intention of this memo to open up
175	   RTP Control Protocol (RTCP) to a generalized control protocol.  All
176	   mentioned messages have relatively strict real-time constraints, in
177	   the sense that their value diminishes with increased delay.  This
178	   makes the use of more traditional control protocol means, such as
179	   Session Initiation Protocol (SIP) re-INVITEs [RFC3261], undesirable
180	   when used for the same purpose.  Furthermore, all messages are of a
181	   very simple format that can be easily processed by an RTP/RTCP
182	   sender/receiver.  Finally, and most importantly, all messages relate
183	   only to the RTP stream with which they are associated, and not to
184	   any other property of a communication system.  In particular, none
185	   of them relate to the properties of the access links traversed by
186	   the session.

188	2. Definitions

190	2.1. Glossary

192	   AIMD   - Additive Increase Multiplicative Decrease
193	   AVPF   - The extended RTP profile for RTCP-based feedback
194	   FEC    - Forward Error Correction
195	   FCI    - Feedback Control Information [RFC4585]
196	   FIR    - Full Intra Request
197	   MCU    - Multipoint Control Unit
198	   MPEG   - Moving Picture Experts Group
199	   TMMBN  - Temporary Maximum Media Stream Bit Rate Notification
200	   TMMBR  - Temporary Maximum Media Stream Bit Rate Request
201	   PLI    - Picture Loss Indication
202	   PR     - Packet rate
203	   QP     - Quantizer Parameter
204	   RTT    - Round trip time
205	   SSRC   - Synchronization Source
206	   TSTN   - Temporal Spatial Trade-off Notification
207	   TSTR   - Temporal Spatial Trade-off Request
208	   VBCM   - Video Back Channel Message

210	2.2. Terminology

212	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
213	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
214	   this document are to be interpreted as described in RFC 2119
215	   [RFC2119].

217	   Message:
218	          An RTCP feedback message [RFC4585] defined by this
219	          specification, of one of the following types:

221	          Request:
222	              Message that requires acknowledgement

224	          Command:
225	              Message that forces the receiver to an action

227	          Indication:
228	              Message that reports a situation

230	          Notification:
231	              Message that provides a notification that an event has
232	              occurred.  Notifications are commonly generated in
233	              response to a Request.

235	          Note that, with the exception of "Notification", this
236	          terminology is in alignment with ITU-T Rec. H.245 [H245].

238	   Decoder Refresh Point:
239	          A bit string, packetized in one or more RTP packets, which
240	          completely resets the decoder to a known state.

242	          Examples of "hard" decoder refresh points are Intra pictures
243	          in H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 part 2, and
244	          Instantaneous Decoder Refresh (IDR) pictures in H.264.
245	          "Gradual" decoder refresh points may also be used; see for
246	          example [AVC].  While both "hard" and "gradual" decoder
247	          refresh points are acceptable in the scope of this
248	          specification, in most cases the user experience will benefit
249	          from using a "hard" decoder refresh point.

251	          A decoder refresh point also contains all header information
252	          above the picture layer (or equivalent, depending on the
253	          video compression standard) that is conveyed in-band.  In
254	          H.264, for example, a decoder refresh point contains
255	          parameter set Network Abstraction Layer (NAL) units that
256	          generate parameter sets necessary for the decoding of the
257	          following slice/data partition NAL units (and that are not
258	          conveyed out of band).

260	   Decoding:
261	          The operation of reconstructing the media stream.

263	   Rendering:
264	          The operation of presenting (parts of) the reconstructed
265	          media stream to the user.

267	   Stream thinning:
268	          The operation of removing some of the packets from a media
269	          stream.  Stream thinning, preferably, is media-aware,
270	          implying that media packets are removed in the order of
271	          increasing relevance to the reproductive quality.  However,
272	          even when employing media-aware stream thinning, most media
273	          streams quickly lose quality when subjected to increasing
274	          levels of thinning.  Media-unaware stream thinning leads to
275	          even worse quality degradation.  In contrast to transcoding,
276	          stream thinning is typically seen as a computationally
277	          lightweight operation.

279	   Media:
280	          Often used (sometimes in conjunction with terms like bit
281	          rate, stream, sender ...) to identify the content of the
282	          forward RTP packet stream (carrying the codec data), to which
283	          the codec control message applies.

285	   Media Stream:
286	          The stream of RTP packets labeled with a single
287	          Synchronization Source (SSRC) carrying the media (and also in
288	          some cases repair information such as retransmission or
289	          Forward Error Correction (FEC) information).

291	   Total media bit rate:
292	          The total bits per second transferred in a media stream,
293	          measured at an observer-selected protocol layer and averaged
294	          over a reasonable timescale, the length of which depends on
295	          the application.  In general, a media sender and a media
296	          receiver will observe different total media bit rates for the
297	          same stream, first because they may have selected different
298	          reference protocol layers, and second, because of changes in
299	          per-packet overhead along the transmission path.  The goal
300	          with bit rate averaging is to be able to ignore any
301	          burstiness on very short timescales, below for example 100
302	          ms, introduced by scheduling or link layer packetization
303	          effects.

305	   Maximum total media bit rate:
306	          The upper limit on total media bit rate for a given media
307	          stream at a particular receiver and for its selected protocol
308	          layer.  Note that this value cannot be measured on the
309	          received media stream; instead, it needs to be calculated or
310	          determined through other means, such as QoS negotiations or
311	          local resource limitations. Also note that this value is an
312	          average (on a timescale that is reasonable for the
313	          application) and that it may be different from the
314	          instantaneous bit-rate seen by packets in the media stream.

316	   Overhead:
317	          All protocol header information required to convey a packet
318	          with media data from sender to receiver, from the application
319	          layer down to a pre-defined protocol level (for example down
320	          to, and including, the IP header).  Overhead may include, for
321	          example, IP, UDP, and RTP headers, any layer 2 headers, any
322	          Contributing Sources (CSRCs), RTP-Padding, and RTP header
323	          extensions.  Overhead excludes any RTP payload headers and
324	          the payload itself.

326	   Net media bit rate:
327	          The bit rate carried by a media stream, net of overhead.
328	          That is, the bits per second accounted for by encoded media,
329	          any applicable payload headers, and any directly associated
330	          meta payload information placed in the RTP packet.  A typical
331	          example of the latter is redundancy data provided by the use
332	          of RFC 2198 [RFC2198].  Note that, unlike the total media bit
333	          rate, the net media bit rate will have the same value at the
334	          media sender and at the media receiver unless any mixing or
335	          translating of the media has occurred.

337	          For a given observer, the total media bit rate for a media
338	          stream is equal to the net media bit rate plus the product
339	          of the per-packet overhead, as defined above, and the packet
340	          rate.
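
   As a worked illustration of this relationship, the following minimal
   Python sketch computes the total media bit rate seen by one observer.
   The function name and the example values (an observer measuring down
   to and including the IPv4 header, no CSRCs or header extensions, 50
   packets per second, 64 kbit/s net media bit rate) are assumptions
   chosen for illustration and are not defined by this memo.

```python
def total_media_bit_rate(net_bit_rate_bps, overhead_bytes_per_packet,
                         packet_rate_pps):
    """Return the total media bit rate in bit/s for one observer.

    total = net media bit rate + per-packet overhead * packet rate,
    with the overhead converted from bytes to bits.
    """
    overhead_bps = overhead_bytes_per_packet * 8 * packet_rate_pps
    return net_bit_rate_bps + overhead_bps


# Assumed per-packet overhead: IPv4 (20) + UDP (8) + fixed RTP header (12).
IPV4_UDP_RTP_OVERHEAD_BYTES = 20 + 8 + 12

if __name__ == "__main__":
    # 64000 + 40 * 8 * 50 = 80000 bit/s
    print(total_media_bit_rate(64_000, IPV4_UDP_RTP_OVERHEAD_BYTES, 50))
```

   An observer that selects a different reference protocol layer would
   pass a different overhead value and therefore obtain a different
   total for the same net media bit rate.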

342	   Feasible region:
343	          The set of all combinations of packet rate and net media bit
344	          rate that do not exceed the restrictions in maximum media bit
345	          rate placed on a given media sender by the Temporary Maximum
346	          Media Stream Bit Rate Request (TMMBR) messages it has
347	          received.  The feasible region will change as new TMMBR
348	          messages are received.
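
   The feasible region can be pictured as a membership test.  The
   sketch below is illustrative only: it assumes that each received
   TMMBR tuple can be reduced to a maximum total media bit rate and a
   per-packet overhead, and that a sender setting is feasible when the
   resulting total media bit rate stays within every such limit.  The
   type and field names are hypothetical and do not reflect the
   message format.

```python
from typing import Iterable, NamedTuple


class TmmbrTuple(NamedTuple):
    """Hypothetical, simplified view of one received TMMBR limit."""
    max_total_bit_rate_bps: float
    overhead_bytes_per_packet: float


def in_feasible_region(packet_rate_pps: float,
                       net_bit_rate_bps: float,
                       tuples: Iterable[TmmbrTuple]) -> bool:
    """True if the (packet rate, net bit rate) pair violates no limit."""
    return all(
        net_bit_rate_bps + t.overhead_bytes_per_packet * 8 * packet_rate_pps
        <= t.max_total_bit_rate_bps
        for t in tuples
    )


if __name__ == "__main__":
    limits = [TmmbrTuple(128_000, 40), TmmbrTuple(100_000, 20)]
    print(in_feasible_region(50, 80_000, limits))   # within both limits
    print(in_feasible_region(100, 90_000, limits))  # exceeds the second
```

   With no tuples received, the test places no restriction on the
   sender, which matches the idea that the region changes only as new
   TMMBR messages arrive.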

350	   Bounding set:
351	          The set of TMMBR tuples, selected from all those received at
352	          a given media sender, that define the feasible region for
353	          that media sender.  The media sender uses an algorithm such
354	          as that in section 3.5.4.2 to determine or iteratively
355	          approximate the current bounding set, and reports that set
356	          back to the media receivers in a Temporary Maximum Media
357	          Stream Bit Rate Notification (TMMBN) message.

359	2.3. Topologies

361	   Please refer to [Topologies] for an in-depth discussion.  The
362	   topologies referred to throughout this memo are labeled
363	   (consistently with [Topologies]) as follows:

365	   Topo-Point-to-Point . . . . . Point-to-point communication
366	   Topo-Multicast  . . . . . . . Multicast communication
367	   Topo-Translator . . . . . . . Translator based
368	   Topo-Mixer  . . . . . . . . . Mixer based
369	   Topo-RTP-switch-MCU . . . . . RTP stream switching MCU
370	   Topo-RTCP-terminating-MCU . . Mixer but terminating RTCP

372	3. Motivation

374	   This section discusses the motivation and usage of the different
375	   video and media control messages.  The video control messages have
376	   been under discussion for a long time, and a requirement draft was
377	   drawn up [Basso].  This draft has expired; however, we quote relevant
378	   sections of it to provide motivation and requirements.

380	3.1. Use Cases

382	   There are a number of possible usages for the proposed feedback
383	   messages.  Let us begin by looking through the use cases Basso et
384	   al. [Basso] proposed.  Some of the use cases have been reformulated
385	   and comments have been added.

387	   1. An RTP video mixer composes multiple encoded video sources into a
388	      single encoded video stream.  Each time a video source is added,
389	      the RTP mixer needs to request a decoder refresh point from the
390	      video source, so as to start an uncorrupted prediction chain on
391	      the spatial area of the mixed picture occupied by the data from
392	      the new video source.

394	   2. An RTP video mixer receives multiple encoded RTP video streams
395	      from conference participants, and dynamically selects one of the
396	      streams to be included in its output RTP stream.  At the time of
397	      a bit stream change (determined through means such as voice
398	      activation or the user interface), the mixer requests a decoder
399	      refresh point from the remote source, in order to avoid using
400	      unrelated content as reference data for inter picture prediction.
401	      After requesting the decoder refresh point, the video mixer stops
402	      the delivery of the current RTP stream and monitors the RTP
403	      stream from the new source until it detects data belonging to the
404	      decoder refresh point.  At that time, the RTP mixer starts
405	      forwarding the newly selected stream to the receiver(s).

407	   3. An application needs to signal to the remote encoder that the
408	      desired trade-off between temporal and spatial resolution has
409	      changed.  For example, one user may prefer a higher frame rate
410	      and a lower spatial quality, and another user may prefer the
411	      opposite.  This choice is also highly content dependent.  Many
412	      current video conferencing systems offer in the user interface a
413	      mechanism to make this selection, usually in the form of a
414	      slider.  The mechanism is helpful in point-to-point, centralized
415	      multipoint and non-centralized multipoint uses.

417	   4. Use case 4 of the Basso draft applies only to Picture Loss
418	      Indication (PLI) as defined in AVPF [RFC4585] and is not
419	      reproduced here.

421	   5. Use case 5 of the Basso draft relates to a mechanism known as
422	      "freeze picture request".  Sending freeze picture requests
423	      over a non-reliable forward RTCP channel has been identified as
424	      problematic.  Therefore, no freeze picture request has been
425	      included in this memo, and the use case discussion is not
426	      reproduced here.

428	   6. A video mixer dynamically selects one of the received video
429	      streams to be sent out to participants and tries to provide the
430	      highest bit rate possible to all participants, while minimizing
431	      stream trans-rating.  One way of achieving this is to set up
432	      sessions with endpoints using the maximum bit rate accepted by
433	      each endpoint, and accepted by the call admission method used by
434	      the mixer.  By means of commands that reduce the maximum media
435	      stream bit rate below what has been negotiated during session set
436	      up, the mixer can reduce the maximum bit rate sent by endpoints
437	      to the lowest of all the accepted bit rates.  As the lowest
438	      accepted bit rate changes due to endpoints joining and leaving or
439	      due to network congestion, the mixer can adjust the limits at
440	      which endpoints can send their streams to match the new value.
441	      The mixer then requests a new maximum bit rate, which is equal to
442	      or less than the maximum bit rate negotiated at session setup for
443	      a specific media stream, and the remote endpoint can respond with
444	      the actual bit rate that it can support.

446	   The picture that Basso et al. draw up covers most applications we
447	   foresee.  However, we would like to extend the list with two
448	   additional use cases:

450	   7. Currently deployed congestion control algorithms (AIMD and TFRC
451	      [RFC3448]) probe for additional available capacity as long as
452	      there is something to send.  With congestion control algorithms
453	      using packet loss as the indication for congestion, this probing
454	      generally results in reduced media quality (often to a point
455	      where the distortion is large enough to make the media unusable),
456	      due to packet loss and increased delay.

458	      In a number of deployment scenarios, especially cellular ones,
459	      the bottleneck link is often the last hop link.  That cellular
460	      link also commonly has some type of QoS negotiation enabling the
461	      cellular device to learn the maximal bit rate available over this
462	      last hop.  A media receiver behind this link can, in most (if not
463	      all) cases, calculate at least an upper bound for the bit rate
464	      available for each media stream it presently receives.  How this
465	      is done is an implementation detail and not discussed herein.
466	      Indicating the maximum available bit rate to the transmitting
467	      party for the various media streams can be beneficial to prevent
468	      that party from probing for bandwidth for this stream in excess
469	      of a known hard limit.  For cellular or other mobile devices, the
470	      known available bit rate for each stream (deduced from the link
471	      bit rate) can change quickly, due to handover to another
472	      transmission technology, QoS renegotiation due to congestion,
473	      etc.  To enable minimal disruption of service, quick convergence
474	      is necessary, and therefore media path signaling is desirable.

476	   8. The use of reference picture selection (RPS) as an error
477	      resilience tool was introduced in 1997 as NEWPRED [NEWPRED],
478	      and is now widely deployed.  When RPS is in use, simplistically
479	      put, the receiver can send a feedback message to the sender,
480	      indicating a reference picture that should be used for future
481	      prediction.  ([NEWPRED] mentions other forms of feedback as
482	      well.)  AVPF contains a mechanism for conveying such a message,
483	      but does not specify the codec or the syntax to which the
484	      message should conform.  Recently, the ITU-T finalized Rec.
485	      H.271, which (among other message types) also includes a feedback
486	      message.  It is expected that this feedback message will fairly
487	      quickly enjoy wide support.  Therefore, a mechanism to convey
488	      feedback messages according to H.271 appears to be desirable.

490	3.2. Using the Media Path

492	   There are multiple reasons why we use the media path for the codec
493	   control messages.

495	   First, systems employing MCUs often separate the control and media
496	   processing parts.  As these messages are intended for or generated
497	   by the media part rather than the signaling part of the MCU, having
498	   them on the media path avoids transmission across interfaces and
499	   unnecessary control traffic between signaling and processing.  If
500	   the MCU is physically decomposed, the use of the media path avoids
501	   the need for media control protocol extensions (e.g. in MEGACO
502	   [RFC3525]).

504	   Secondly, the signaling path quite commonly contains several
505	   signaling entities, e.g. SIP proxies and application servers.
506	   Bypassing these signaling entities avoids delay for several
507	   reasons.  Proxies have less stringent delay requirements than media
508	   processing and, due to their complex and more generic nature, may
509	   introduce significant processing delay.  The topological locations
510	   of the signaling entities are also commonly not optimized for
511	   minimal delay, but rather towards other architectural goals.  Thus,
512	   the signaling path can be significantly longer in both the
513	   geographical and the delay sense.

515	3.3. Using AVPF

517	   The AVPF feedback message framework [RFC4585] provides the
518	   appropriate framework to implement the new messages.  AVPF
519	   implements rules controlling the timing of feedback messages to
520	   avoid congestion through network flooding by RTCP traffic.  We re-
521	   use these rules by referencing AVPF.

523	   The signaling setup for AVPF allows each individual type of function
524	   to be configured or negotiated on an RTP session basis.

526	3.3.1. Reliability

528	   The use of RTCP messages implies that each message transfer is
529	   unreliable, unless the lower layer transport provides reliability.
530	   The different messages proposed in this specification have different
531	   requirements in terms of reliability.  However, in all cases, the
532	   reaction to an (occasional) loss of a feedback message is specified.

534	3.4. Multicast

536	   The codec control messages might be used with multicast.  The RTCP
537	   timing rules specified in [RFC3550] and [RFC4585] ensure that the
538	   messages do not cause overload of the RTCP connection.  The use of
539	   multicast may result in the reception of messages with inconsistent
540	   semantics.  The reaction to inconsistencies depends on the message
541	   type, and is discussed for each message type separately.

543	3.5. Feedback Messages

545	   This section describes the semantics of the different feedback
546	   messages and how they apply to the different use cases.

548	3.5.1. Full Intra Request Command

550	   A Full Intra Request (FIR) Command, when received by the designated
551	   media sender, requires that the media sender sends a Decoder Refresh
552	   Point (see 2.2) at the earliest opportunity.  The evaluation of such
553	   opportunity includes the current encoder coding strategy and the
554	   current available network resources.

556	   FIR is also known as an "instantaneous decoder refresh request",
557	   "fast video update request" or "video fast update request".

559	   Using a decoder refresh point implies refraining from using any
560	   picture sent prior to that point as a reference for the encoding
561	   process of any subsequent picture sent in the stream.  For
562	   predictive media types that are not video, the analogue applies.
563	   For example, if in MPEG-4 systems scene updates are used, the
564	   decoder refresh point consists of the full representation of the
565	   scene and is not delta-coded relative to previous updates.

567	   Decoder refresh points, especially Intra or IDR pictures, are in
568	   general several times larger in size than predicted pictures.  Thus,
569	   in scenarios in which the available bit rate is small, the use of a
570	   decoder refresh point implies a delay that is significantly longer
571	   than the typical picture duration.

573	   Usage in multicast is possible; however aggregation of the commands
574	   is recommended.  A receiver that receives a request closely after
575	   sending a decoder refresh point -- within 2 times the longest Round
576	   Trip Time (RTT) known, plus any AVPF-induced RTCP packet sending
577	   delays -- should await a second request message to ensure that the
578	   media receiver has not been served by the previously delivered
579	   decoder refresh point.  The reason for the specified delay is to
580	   avoid sending unnecessary decoder refresh points.  A session
581	   participant may have sent its own request while another
582	   participant's request was in-flight to them.  Suppressing those
583	   requests that may have been sent without knowledge about the other
584	   request avoids this issue.

586	   Using the FIR command to recover from errors is explicitly
587	   disallowed, and instead the PLI message defined in AVPF [RFC4585]
588	   should be used.  The PLI message reports lost pictures and has been
589	   included in AVPF for precisely that purpose.

591	   Full Intra Request is applicable in use-cases 1 and 2.

593	3.5.1.1. Reliability

595	   The FIR message results in the delivery of a decoder refresh point,
596	   unless the message is lost.  Decoder refresh points are easily
597	   identifiable from the bit stream.  Therefore, there is no need for
598	   protocol-level notification, and a simple command repetition
599	   mechanism is sufficient for ensuring the level of reliability
600	   required.  However, the potential use of repetition does require a
601	   mechanism to prevent the recipient from responding to messages
602	   already received and responded to.

604	   To ensure the best possible reliability, a sender of FIR may repeat
605	   the FIR request until the desired content has been received.  The
606	   repetition interval is determined by the RTCP timing rules
607	   applicable to the session.  Upon reception of a complete decoder
608	   refresh point or the detection of an attempt to send a decoder
609	   refresh point (which got damaged due to a packet loss), the
610	   repetition of the FIR must stop.  If another FIR is necessary, the
611	   request sequence number must be increased.  A FIR sender shall not
612	   have more than one FIR request (different request sequence number)
613	   outstanding at any time per media sender in the session.
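
As a non-normative illustration, the requester-side rules above can be
sketched in Python.  The class and method names are assumptions of this
sketch and are not defined by this memo; one instance is assumed per
media sender, and the caller is assumed to invoke on_rtcp_opportunity()
whenever the RTCP timing rules permit a transmission.

```python
class FirRequester:
    """Requester-side FIR state for one media sender (illustrative only)."""

    def __init__(self):
        self.seq_nr = 0           # request sequence number of the latest FIR
        self.outstanding = False  # True while a request awaits its refresh point

    def request(self):
        """Start a new FIR; at most one may be outstanding at any time."""
        if self.outstanding:
            raise RuntimeError("a FIR request is already outstanding")
        self.seq_nr += 1          # a new request uses an increased number
        self.outstanding = True
        return self.seq_nr

    def on_rtcp_opportunity(self):
        """Return the sequence number to repeat, or None if nothing is due."""
        return self.seq_nr if self.outstanding else None

    def on_refresh_point_seen(self):
        """Call on a complete or detectably damaged decoder refresh point."""
        self.outstanding = False  # repetition must stop
```

A repetition reuses the same sequence number; only a genuinely new
request increases it.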

615	   The receiver of FIR (i.e. the media sender) behaves in complementary
616	   fashion to ensure delivery of a decoder refresh point.  If it
617	   receives repetitions of the FIR more than 2*RTT after it has sent a
618	   decoder refresh point, it shall send a new decoder refresh point.
619	   Two round trip times allow time for the decoder refresh point to
620	   arrive back to the requestor and for the end of repetitions of FIR
621	   to reach and be detected by the media sender.
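
The complementary behaviour of the media sender might be sketched as
follows.  This is a minimal Python sketch under stated assumptions: a
single requester, times given in seconds by the caller, and an RTT
estimate that is already available.  All names are hypothetical.

```python
class FirResponder:
    """Media-sender handling of FIR from one requester (illustrative only)."""

    def __init__(self, rtt):
        self.rtt = rtt                 # round trip time estimate, in seconds
        self.last_seq = None           # sequence number of the last FIR served
        self.last_refresh_time = None  # when the last refresh point was sent

    def on_fir(self, seq_nr, now):
        """Return True if a new decoder refresh point should be sent."""
        if seq_nr != self.last_seq:
            # A request not seen before: serve it.
            self.last_seq = seq_nr
            self.last_refresh_time = now
            return True
        # A repetition of a request already served.  React only if it still
        # arrives more than 2*RTT after the refresh point was sent.
        if now - self.last_refresh_time > 2 * self.rtt:
            self.last_refresh_time = now
            return True
        return False
```

Repetitions that arrive within 2*RTT are ignored because the refresh
point already sent may not yet have reached the requester.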

623	   An RTP mixer or RTP switching MCU that receives a FIR from a media
624	   receiver is responsible for ensuring that a decoder refresh point is
625	   delivered to the requesting receiver.  It may be necessary for the
626	   mixer/MCU to generate FIR commands.  From a reliability perspective,
627	   the two legs (FIR-requesting endpoint to mixer/MCU, and mixer/MCU to
628	   decoder refresh point generating endpoint) are handled independently
629	   from each other.

631	3.5.2. Temporal Spatial Trade-off Request and Notification

633	   The Temporal Spatial Trade-off Request (TSTR) instructs the video
634	   encoder to change its trade-off between temporal and spatial
635	   resolution.  Index values from 0 to 31 indicate monotonically a
636	   desire for higher frame rate.  That is, a requester asking for an
637	   index of 0 prefers high spatial quality and is willing to accept a
638	   low frame rate, whereas a requester asking for 31 wishes for a high
639	   frame rate, potentially at the cost of low spatial quality.
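
As an illustration only, a user interface control could be mapped onto
the index range as sketched below in Python.  The function name and the
normalised position in [0.0, 1.0] are assumptions of this sketch; the
memo defines only the index range and its meaning.

```python
def position_to_tstr_index(position):
    """Map a normalised control position to a TSTR index in 0..31.

    0.0 favours spatial quality (index 0), 1.0 favours frame rate (index 31).
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be within [0.0, 1.0]")
    return round(position * 31)
```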

641	   In general the encoder reaction time may be significantly longer
642	   than the typical picture duration.  See use case 3 for an example.
643	   The encoder decides whether and to what extent the request results
644	   in a change of the trade-off.  It returns a Temporal Spatial Trade-
645	   Off Notification (TSTN) message to indicate the trade-off that it
646	   will use henceforth.

648	   TSTR and TSTN have been introduced primarily because it is believed
649	   that control protocol mechanisms, e.g. a SIP re-invite, are too
650	   heavyweight and too slow to allow for a reasonable user experience.
651	   Consider, for example, a user interface where the remote user
652	   selects the temporal/spatial trade-off with a slider.  Immediate
653	   feedback to any slider movement is required for a reasonable user
654	   experience.  A SIP re-INVITE [RFC3261] would require at least two
655	   round-trips more (compared to the TSTR/TSTN mechanism) and may
656	   involve proxies and other complex mechanisms.  Even in a well-
657	   designed system, it could take a second or so until the new trade-
658	   off is finally selected.  Furthermore the use of RTCP solves the
659	   multicast use case very efficiently.

661	   The use of TSTR and TSTN in multipoint scenarios is a non-trivial
662	   subject, and can be achieved in many implementation-specific ways.
663	   Problems stem from the fact that TSTRs will typically arrive
664	   unsynchronized, and may request different trade-off values for the
665	   same stream and/or endpoint encoder.  This memo does not specify a
666	   translator's, mixer's or endpoint's reaction to the reception of a
667	   suggested trade-off as conveyed in the TSTR.  We only require the
668	   receiver of a TSTR message to reply to it by sending a TSTN,
669	   carrying the new trade-off chosen by its own criteria (which may or
670	   may not be based on the trade-off conveyed by the TSTR).  In other
671	   words, the trade-off sent in TSTR is a non-binding recommendation,
672	   nothing more.

674	   Three TSTR/TSTN scenarios need to be distinguished, based on the
675	   topologies described in [Topologies].  The scenarios are described
676	   in the following sub-clauses.

678	3.5.2.1. Point-to-Point

680	   In this most trivial case (Topo-Point-to-Point), the media sender
681	   typically adjusts its temporal/spatial trade-off based on the
682	   requested value in TSTR, subject to its own capabilities.  The TSTN
683	   message conveys back the new trade-off value (which may be identical
684	   to the old one if, for example, the sender is not capable of
685	   adjusting its trade-off).

687	3.5.2.2. Point-to-Multipoint Using Multicast or Translators

689	   RTCP Multicast is used either with media multicast according to
690	   Topo-Multicast, or following RFC 3550's translator model according
691	   to Topo-Translator.  In these cases, unsynchronized TSTR messages
692	   from different receivers may be received, possibly with different
693	   requested trade-offs (because of different user preferences).  This
694	   memo does not specify how the media sender tunes its trade-off.
695	   Possible strategies include selecting the mean or median of all
696	   trade-off requests received, giving priority to certain
697	   participants, or continuing to use the previously selected trade-off
698	   (e.g. when the sender is not capable of adjusting it).  Again, all
699	   TSTR messages need to be acknowledged by TSTN, and the value
700	   conveyed back has to reflect the decision made.
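
Since this memo leaves the strategy open, the following Python sketch
shows only one possible way to combine requests.  The function name, the
strategy labels and the rounding to the nearest index are assumptions of
this sketch.

```python
from statistics import mean, median


def choose_tradeoff(requested, strategy="median", current=None):
    """Pick one trade-off index (0..31) from the indices requested via TSTR.

    'current' is the trade-off in use; it is returned when nothing was
    requested or when the sender keeps its previous selection.
    """
    if not requested or strategy == "keep":
        return current
    if strategy == "mean":
        value = mean(requested)
    elif strategy == "median":
        value = median(requested)
    else:
        raise ValueError("unknown strategy: " + strategy)
    # Clamp to the valid index range after rounding.
    return max(0, min(31, round(value)))
```

Whatever value is chosen is the one to convey back in the TSTN sent in
response to each TSTR.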

702	3.5.2.3. Point-to-Multipoint Using RTP Mixer

704	   In this scenario (Topo-Mixer) the RTP mixer receives all TSTR
705	   messages, and has the opportunity to act on them based on its own
706	   criteria.  In most cases, the mixer should form a "consensus" of
707	   potentially conflicting TSTR messages arriving from different
708	   participants, and initiate its own TSTR message(s) to the media
709	   sender(s).  As in the previous scenario, the strategy for forming
710	   this "consensus" is up to the implementation, and can, for example,
711	   encompass averaging the participants' request values, giving
712	   priority to certain participants, or using session default values.

714	   Even if a mixer or translator performs transcoding, it is very
715	   difficult to deliver media with the requested trade-off, unless the
716	   content the mixer or translator receives is already close to that
717	   trade-off.  Thus, if the mixer changes its trade-off, it needs to
718	   request the media sender(s) to use the new value, by creating a TSTR
719	   of its own.  Upon reaching a decision on the used trade-off it
720	   includes that value in the acknowledgement to the downstream
721	   requestors.  Only in cases where the original source has
722	   substantially higher quality (and bit rate) is it likely that
723	   transcoding alone can result in the requested trade-off.

725	3.5.2.4. Reliability

727	   A request and reception acknowledgement mechanism is specified.  The
728	   Temporal Spatial Trade-off Notification (TSTN) message informs the
729	   requester that its request has been received, and what trade-off is
730	   used henceforth.  This acknowledgment mechanism is desirable for at
731	   least the following reasons:

733	   o A change in the trade-off cannot be directly identified from the
734	     media bit stream.
735	   o User feedback cannot be implemented without knowing the chosen
736	     trade-off value, according to the media sender's constraints.
737	   o Repetitive sending of messages requesting an unimplementable
738	     trade-off can be avoided.

740	3.5.3. H.271 Video Back Channel Message

742	   ITU-T Rec. H.271 defines syntax, semantics, and suggested encoder
743	   reaction to a video back channel message.  The structure defined in
744	   this memo is used to transparently convey such a message from media
745	   receiver to media sender.  In this memo, we refrain from an in-depth
746	   discussion of the available code points within H.271 and refer to
747	   the specification text [H.271] instead.

749	   However, we note that some H.271 messages bear similarities with
750	   native messages of AVPF and this memo.  Furthermore, we note that
751	   some H.271 messages are known to require caution in multicast
752	   environments -- or are plainly not usable in multicast or multipoint
753	   scenarios.  Table 1 provides a brief, oversimplified overview of the
754	   messages currently defined in H.271, their roughly corresponding
755	   AVPF or CCM messages (the latter as specified in this memo), and an
756	   indication of our current knowledge of their multicast safety.

758	   H.271 msg type       AVPF/CCM msg    multicast-safe
759	   --------------------------------------------------------------------
760	   0 (when used for
761	     reference picture
762	     selection)         AVPF RPSI       No (positive ACK of pictures)
763	   1 picture loss       AVPF PLI        Yes
764	   2 partial loss       AVPF SLI        Yes
765	   3 one parameter CRC  N/A             Yes (no required sender action)
766	   4 all parameter CRC  N/A             Yes (no required sender action)
767	   5 refresh point      CCM FIR         Yes

769	   Table 1: H.271 messages and their AVPF/CCM equivalents

771	          Note: H.271 message type 0 is not a strict equivalent to
772	          AVPF's Reference Picture Selection Indication (RPSI); it is
773	          an indication of known-as-correct reference picture(s) at the
774	          decoder.  It does not command an encoder to use a defined
775	          reference picture (the form of control information envisioned
776	          to be carried in RPSI).  However, it is believed and intended
777	          that H.271 message type 0 will be used for the same purpose
778	          as AVPF's RPSI -- although other use forms are also possible.
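
For implementers, the content of Table 1 can be captured in a small
lookup structure.  The Python sketch below is illustrative; the names
are hypothetical, and the entry for type 0 applies only when that
message is used for reference picture selection.

```python
# H.271 message type -> (roughly corresponding AVPF/CCM message, multicast-safe)
H271_EQUIVALENTS = {
    0: ("AVPF RPSI", False),  # positive ACK of pictures
    1: ("AVPF PLI", True),
    2: ("AVPF SLI", True),
    3: (None, True),          # one parameter CRC, no required sender action
    4: (None, True),          # all parameter CRC, no required sender action
    5: ("CCM FIR", True),
}


def native_alternative(h271_type):
    """Return the roughly corresponding AVPF/CCM message name, or None."""
    return H271_EQUIVALENTS[h271_type][0]


def is_multicast_safe(h271_type):
    """Return the multicast-safety indication given in Table 1."""
    return H271_EQUIVALENTS[h271_type][1]
```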

780	   In response to the opaqueness of the H.271 messages, especially with
781	   respect to the multicast safety, the following guidelines MUST be
782	   followed when an implementation wishes to employ the H.271 video
783	   back channel message:

785	   1. Implementations utilizing the H.271 feedback message MUST stay in
786	      compliance with congestion control principles, as outlined in
787	      section 5.

789	   2. An implementation SHOULD utilize the IETF-native messages as
790	      defined in [RFC4585] and in this memo instead of similar messages
791	      defined in [H.271].  Our current understanding of similar
792	      messages is documented in Table 1 above.  One good reason to
793	      divert from the SHOULD statement above would be if it is clearly
794	      understood that, for a given application and video compression
795	      standard, the aforementioned "similarity" is not given, in
796	      contrast to what the table indicates.

798	   3. It has been observed that some of the H.271 code points currently
799	      in existence are not multicast-safe.  Therefore, the sensible
800	      thing to do is not to use the H.271 feedback message type in
801	      multicast environments.  It MAY be used only when all the issues
802	      mentioned later are fully understood by the implementer, and
803	      properly taken into account by all endpoints.  In all other
804	      cases, the H.271 message type MUST NOT be used in conjunction
805	      with multicast.

807	   4. It has been observed that even in centralized multipoint
808	      environments, where the mixer should theoretically be able to
809	      resolve issues as documented below, the implementation of such a
810	      mixer and cooperative endpoints is a very difficult and tedious
811	      task.  Therefore, H.271 messages MUST NOT be used in centralized
812	      multipoint scenarios, unless all the issues mentioned below are
813	      fully understood by the implementer, and properly taken into
814	      account by both mixer and endpoints.

816	   Issues to be taken into account when considering the use of H.271 in
817	   multipoint environments:

819	   1. Different state on different receivers.  In many environments it
820	      cannot be guaranteed that the decoder state of all media
821	      receivers is identical at any given point in time.  The most
822	      obvious reason for such a possible misalignment of state is a
823	      loss that occurs on the path to only one of many media receivers.
824	      However, there are other not so obvious reasons, such as recent
825	      joins to the multipoint conference (be it by joining the
826	      multicast group or through additional mixer output).  Different
827	      states can lead the media receivers to issue potentially
828	      contradicting H.271 messages (or one media receiver issuing an
829	      H.271 message that, when observed by the media sender, is not
830	      helpful for the other media receivers).  A naive reaction of the
831	      media sender to these contradicting messages can lead to
832	      unpredictable and annoying results.

834	   2. Combining messages from different media receivers in a media
835	      sender is a non-trivial task.  As reasons, we note that these
836	      messages may be contradicting each other, and that their
837	      transport is unreliable (there may well be other reasons).  In
838	      case of many H.271 messages (i.e. types 0, 2, 3, and 4), the
839	      algorithm for combining must be aware both of the
840	      network/protocol environment (i.e. with respect to congestion)
841	      and of the media codec employed, as H.271 messages of a given
842	      type can have different semantics for different media codecs.

844	   3. The suppression of requests may need to go beyond the basic
845	      mechanisms described in AVPF (which are driven exclusively by
846	      timing and transport considerations on the protocol level).  For
847	      example, a receiver is often required to refrain from (or delay)
848	      generating requests, based on information it receives from the
849	      media stream.  For instance, it makes no sense for a receiver to
850	      issue a FIR when a transmission of an Intra/IDR picture is
851	      ongoing.

853	   4. When using the non-multicast-safe messages (e.g. H.271 type 0
854	      positive ACK of received pictures/slices) in larger multicast
855	      groups, the media receiver will likely be forced to delay or even
856	      omit sending these messages.  For the media sender this looks
857	      like data has not been properly received (although it was
858	      received properly), and a naively implemented media sender reacts
859	      to these perceived problems where it should not.

861	3.5.3.1. Reliability

863	   H.271 Video Back Channel messages do not require reliable
864	   transmission, and confirmation of the reception of a message can be
865	   derived from the forward video bit stream.  Therefore, no specific
866	   reception acknowledgement is specified.

868	   With respect to re-sending rules, clause 3.5.1.1 applies.

870	3.5.4. Temporary Maximum Media Stream Bit Rate Request and Notification

872	   A receiver, translator or mixer uses the Temporary Maximum Media
873	   Stream Bit Rate Request (TMMBR, "timber") to request a sender to
874	   limit the maximum bit rate for a media stream (see 2.2) to, or
875	   below, the provided value.  The Temporary Maximum Media Stream
876	   Bit Rate Notification (TMMBN) contains the media sender's
877	   current view of the most limiting subset of the TMMBR-defined
878	   limits it has received, to help the participants to suppress
879	   TMMBR requests that would not further restrict the media
880	   sender.  The primary usage for the TMMBR/TMMBN messages is in
881	   a scenario with an MCU or mixer (use case 6), corresponding
882	   to Topo-Translator or Topo-Mixer, but also to
883	   Topo-Point-to-Point.

885	   Each temporary limitation on the media stream is expressed as a
886	   tuple.  The first component of the tuple is the maximum total media
887	   bit rate (as defined in section 2.2) that the media receiver is
888	   currently prepared to accept for this media stream.  The second
889	   component is the per-packet overhead that the media receiver has
890	   observed for this media stream at its chosen reference protocol
891	   layer.

893	   As indicated in section 2.2, the overhead as observed by the sender
894	   of the TMMBR (i.e. the media receiver) may differ from the overhead
895	   observed at the receiver of the TMMBR (i.e. the media sender) due to
896	   use of a different reference protocol layer at the other end or due
897	   to the intervention of translators or mixers that affect the amount
898	   of per packet overhead.  For example, a gateway in between the two
899	   that converts between IPv4 and IPv6 affects the per-packet overhead
900	   by 20 bytes.  Other mechanisms that change the overhead include
901	   tunnels.  The problem with varying overhead is also discussed in
902	   [RFC3890].  As will be seen in the description of the algorithm for
903	   use of TMMBR, the difference in perceived overhead between the
904	   sending and receiving ends presents no difficulty because
905	   calculations are carried out in terms of variables that have the
906	   same value at the sender as at the receiver -- for example, packet
907	   rate and net media rate.

909	   Reporting both maximum total media bit rate and per-packet overhead
910	   allows different receivers to provide bit rate and overhead values
911	   for different protocol layers, for example at the IP level, at the
912	   outer part of a tunnel protocol, or at the link layer.  The protocol
913	   level a peer reports on depends on the level of integration the peer
914	   has, as it needs to be able to extract the information from that
915	   protocol level.  For example, an application with no knowledge of
916	   the IP version it is running over cannot meaningfully determine the
917	   overhead of the IP header, and hence will not want to include IP
918	   overhead in the overhead or maximum total media bit rate
919	   calculation.

921	   It is expected that most peers will be able to report values at
922	   least for the IP layer.  In certain implementations it may be
923	   advantageous to also include information pertaining to the link
924	   layer, which in turn allows for a more precise overhead calculation
925	   and a better optimization of connectivity resources.

927	   The Temporary Maximum Media Stream Bit Rate messages are generic
928	   messages that can be applied to any RTP packet stream.  This
929	   separates them from the other codec control messages defined in this
930	   specification, which apply only to specific media types or payload
931	   formats.  The TMMBR functionality applies to the transport, and the
932	   requirements the transport places on the media encoding.

934	   The reasoning below assumes that the participants have negotiated a
935	   session maximum bit rate, using a signaling protocol.  This value
936	   can be global, for example in case of point-to-point, multicast, or
937	   translators.  It may also be local between the participant and the
938	   peer or mixer.  In either case, the bit rate negotiated in signaling
939	   is the one that the participant guarantees to be able to handle
940	   (depacketize and decode).  In practice, the connectivity of the
941	   participant also influences the negotiated value -- it does not make
942	   much sense to negotiate a total media bit rate that one's network
943	   interface does not support.

945	   It is also beneficial to have negotiated a maximum packet rate for
946	   the session or sender.  RFC 3890 provides an SDP [RFC4566] attribute
947	   that can be used for this purpose; however, that attribute is not
948	   usable in RTP sessions established using offer/answer [RFC3264].
949	   Therefore an optional maximum packet rate signaling parameter is
950	   specified in this memo.

952	   An already established maximum total media bit rate may be changed
953	   at any time, subject to the timing rules governing the sending of
954	   feedback messages. The limit may change to any value between zero
955	   and the session maximum, as negotiated during session establishment
956	   signaling.  However, even if a sender has received a TMMBR message
957	   allowing an increase in the bit rate, all increases must be governed
958	   by a congestion control mechanism.  TMMBR indicates known
959	   limitations only, usually in the local environment, and does not
960	   provide any guarantees about the full path.  Furthermore, any
961	   increases in TMMBR-established bit rate limits are to be executed
962	   only after a certain delay from the sending of the TMMBN message
963	   that notifies the world about the increase in limit.  The delay is
964	   specified as at least twice the longest RTT as known by the media
965	   sender, plus the media sender's calculation of the required wait
966	   time for the sending of another TMMBR message for this session based
967	   on AVPF timing rules.  This delay is introduced to allow other
968	   session participants to make known their bit rate limit
969	   requirements, which may be lower.
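
The rules for raising the bit rate can be sketched as follows.  This is
a non-normative Python sketch: the function names are hypothetical, all
times are in seconds, and the rate permitted by congestion control is
assumed to be supplied by a separate mechanism not shown here.

```python
def increase_hold_off(longest_rtt, tmmbr_wait):
    """Minimum delay between sending the TMMBN and applying an increase.

    longest_rtt: longest RTT known to the media sender
    tmmbr_wait:  sender's estimate of the wait before another TMMBR could
                 be sent in this session under the AVPF timing rules
    """
    return 2 * longest_rtt + tmmbr_wait


def may_apply_increase(tmmbn_sent_at, now, longest_rtt, tmmbr_wait):
    """True once the hold-off period since the TMMBN has elapsed."""
    return now - tmmbn_sent_at >= increase_hold_off(longest_rtt, tmmbr_wait)


def target_bit_rate(tmmbr_limit, session_max, congestion_rate):
    """The sender never exceeds any of the three bounds."""
    return min(tmmbr_limit, session_max, congestion_rate)
```

Decreases are not subject to this hold-off; it applies only to
increases in the limit.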

971	   If it is likely that the new value indicated by TMMBR will be valid
972	   for the remainder of the session, the TMMBR sender is expected to
973	   perform a renegotiation of the session upper limit using the session
974	   signaling protocol.

976	3.5.4.1. Behavior for media receivers using TMMBR

978	   This section is an informal description of behavior described more
979	   precisely in section 4.2.

981	   A media sender begins the session limited by the maximum media bit
982	   rate and maximum packet rate negotiated in session signaling, if
983	   any. Note that this value may be negotiated for another protocol
984	   layer than the one the participant uses in its TMMBR messages.  Each
985	   media receiver selects a reference protocol layer, forms an estimate
986	   of the overhead it is observing (or estimates it if no packets have
987	   been seen yet) at that reference level, and determines the maximum
988	   total media bit rate it can accept, taking into account its own
989	   limitations and any transport path limitations of which it may be
990	   aware.  In case the current limitations are more restrictive than
991	   what was agreed on in the session signaling, the media receiver
992	   reports its initial estimate of these two quantities to the media
993	   sender using a TMMBR message.  Overall message traffic is reduced by
994	   the possibility of including tuples for multiple media senders in
995	   the same TMMBR message.

997	   The media sender applies an algorithm such as that specified in
998	   section 3.5.4.2 to select which of the tuples it has received are
999	   most limiting (i.e. the bounding set as defined in section 2.2).  It
1000	   modifies its operation to stay within the feasible region (as
1001	   defined in section 2.2), and also sends out a TMMBN notification to
1002	   the media receivers indicating the selected bounding set.

1004	   If a media receiver does not own one of the tuples in the bounding
1005	   set reported by the TMMBN, it applies the same algorithm as the
1006	   media sender to determine if its current estimated (maximum total
1007	   media bit rate, overhead) tuple would enter the bounding set if
1008	   known to the media sender.  If so, it issues a TMMBR request
1009	   reporting the tuple value to the sender.  Otherwise it takes no
1010	   action for the moment.  Periodically, its estimated tuple values may
1011	   change or it may receive a new TMMBN.  If so, it reapplies the
1012	   algorithm to decide whether it needs to issue a TMMBR request.

1014	   If, alternatively, a media receiver owns one of the tuples in the
1015	   reported bounding set, it takes no action until such time as its
1016	   estimate of its own tuple values changes.  At that time it sends a
1017	   TMMBR request to the media sender to report the changed values.

1019	   A media receiver may change status between owner and non-owner of a
1020	   bounding tuple between one TMMBN message and the next.  Thus, it
1021	   must check the contents of each TMMBN to determine its subsequent
1022	   actions.
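
The decision a media receiver takes on each TMMBN might be sketched as
below.  The Python sketch is illustrative only: the names are
hypothetical, the bounding set is assumed to be available as a mapping
from owner identifier to (maximum total media bit rate, overhead) tuple,
and the test for entering the bounding set is passed in as a function
because that algorithm is not reproduced here.

```python
def should_send_tmmbr(my_id, my_tuple, bounding_set, would_enter):
    """Decide whether to issue a TMMBR after receiving a TMMBN.

    my_id:        identifier of this media receiver
    my_tuple:     current (max total media bit rate, overhead) estimate
    bounding_set: dict mapping owner identifier -> reported tuple
    would_enter:  callable(my_tuple, bounding_set) -> bool, standing in for
                  the same algorithm the media sender applies
    """
    if my_id in bounding_set:
        # Owner of a bounding tuple: act only when the own estimate changed.
        return my_tuple != bounding_set[my_id]
    # Non-owner: request only if the tuple would enter the bounding set.
    return would_enter(my_tuple, bounding_set)
```

The check is repeated for every TMMBN received and whenever the local
estimate changes, since ownership may change between notifications.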

1024	   Implementations may use other algorithms of their choosing, as long
1025	   as the bit rate limitations resulting from the exchange of TMMBR and
1026	   TMMBN messages are at least as strict (at least as low, in the bit
1027	   rate dimension) as the ones resulting from the use of the
1028	   aforementioned algorithm.

1030	   Obviously, in point-to-point cases, when there is only one media
1031	   receiver, this receiver becomes "owner" once it receives the first
1032	   TMMBN in response to its own TMMBR, and stays "owner" for the rest
1033	   of the session.  Therefore, when it is known that there will always
1034	   be only a single media receiver, the above algorithm is not
1035	   required.  Media receivers that are aware they are the only ones in
1036	   a session can send TMMBR messages with bit rate limits both higher
1037	   and lower than the previously notified limit, at any time (subject
1038	   to the AVPF [RFC4585] RTCP RR send timing rules).  However, it may
1039	   be difficult for a session participant to determine if it is the
1040	   only receiver in the session.  Because of this any implementation of
1041	   TMMBR is required to include the algorithm described in the next
1042	   section or a stricter equivalent.

1044	3.5.4.2. Algorithm for establishing current limitations

1046	   This section introduces an example algorithm for the calculation of
1047	   a session limit.  Other algorithms can be employed, as long as the
1048	   result of the calculation is at least as restrictive as the result
1049	   that is obtained by this algorithm.

1051	   First, it is important to consider the implications of using a tuple
1052	   for limiting the media sender's behavior.  The bit rate and the
1053	   overhead value result in a two-dimensional solution space for the
1054	   calculation of the bit rate of media streams.  Fortunately, the two
1055	   variables are linked.  Specifically, the bit rate available for RTP
1056	   payloads is equal to the TMMBR reported bit rate minus the product
1057	   of the packet rate used and the TMMBR reported overhead converted to
1058	   bits.  As a result, when different bit rate/overhead combinations
1059	   need to be considered, the packet rate determines the correct
1060	   limitation.  This is perhaps best explained by an example:

1062	   Example:

1064	   Receiver A: TMMBR_max total BR = 35 kbps, TMMBR_OH = 40 bytes
1065	   Receiver B: TMMBR_max total BR = 40 kbps, TMMBR_OH = 60 bytes

1067	   For a given packet rate (PR) the bit rate available for media
1068	   payloads in RTP will be:

1070	   Max_net media_BR_A =
1071	      TMMBR_max total BR_A - PR * TMMBR_OH_A * 8 ... (1)
1072	   Max_net media_BR_B =
1073	      TMMBR_max total BR_B - PR * TMMBR_OH_B * 8 ... (2)

1075	   For a PR = 20 these calculations will yield a Max_net media_BR_A =
1076	   28600 bps and Max_net media_BR_B = 30400 bps, which suggests that
1077	   receiver A is the limiting one for this packet rate.  However, at a
1078	   certain PR there is a switchover point at which receiver B becomes
1079	   the limiting one.  The switchover point can be identified by setting
1080	   Max_net media_BR_A equal to Max_net media_BR_B and solving for PR:

1082	         TMMBR_max total BR_A - TMMBR_max total BR_B
1083	   PR =  ------------------------------------------- ... (3)
1084	                8*(TMMBR_OH_A - TMMBR_OH_B)

1086	   which, for the numbers above yields 31.25 as the switchover point
1087	   between the two limits.  That is, for packet rates below 31.25 per
1088	   second, receiver A is the limiting receiver, and for higher packet
1089	   rates, receiver B is more limiting.  The implications of this
1090	   behavior have to be considered by implementations that are going to
1091	   control media encoding and its packetization.  As exemplified above,
1092	   multiple TMMBR limits may apply to the trade-off between net media
1093	   bit rate and packet rate.  Which limitation applies depends on the
1094	   packet rate being considered.
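
   As an illustration only, equations (1) through (3) can be evaluated
   directly.  The sketch below is not part of the specification; the
   function names are hypothetical and chosen for this example.  Bit
   rates are in bits per second, overhead in bytes, and packet rate in
   packets per second.

```python
def net_media_bitrate(max_total_br, overhead_bytes, packet_rate):
    """Equations (1)/(2): bit rate left for RTP payloads at a packet rate."""
    return max_total_br - packet_rate * overhead_bytes * 8


def switchover_packet_rate(br_a, oh_a, br_b, oh_b):
    """Equation (3): packet rate at which the two limits are equal.

    Returns None for equal overheads (parallel lines never cross).
    """
    if oh_a == oh_b:
        return None
    return (br_a - br_b) / (8 * (oh_a - oh_b))


# Receiver A: 35 kbps, 40 bytes; Receiver B: 40 kbps, 60 bytes
print(net_media_bitrate(35000, 40, 20))               # 28600
print(net_media_bitrate(40000, 60, 20))               # 30400
print(switchover_packet_rate(35000, 40, 40000, 60))   # 31.25
```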

1096	   This also has implications for how the TMMBR mechanism needs to
1097	   work.  First, there is the possibility that multiple TMMBR tuples
1098	   are providing limitations on the media sender.  Second, there is a
1099	   need for any session participant (media sender and receivers) to be
1100	   able to determine if a given tuple will become a limitation upon the
1101	   media sender, or if the set of already given limitations is stricter
1102	   than the given values.  Without the ability to make this
1103	   determination, the suppression of TMMBR requests would not work.

1105	   The basic idea of the algorithm is as follows.  Each TMMBR tuple can
1106	   be viewed as the equation of a straight line (cf. equations (1) and
1107	   (2)) in a space where packet rate lies along the X-axis and maximum
1108	   bit rate lies along the Y-axis. The lower envelope of the set of
1109	   lines corresponding to the complete set of TMMBR tuples, together
1110	   with the X and Y axes, defines a polygon. Points lying within this
1111	   polygon are combinations of packet rate and bit rate that meet all
1112	   of the TMMBR constraints. The highest feasible packet rate within
1113	   this region is the minimum of the rate at which the bounding polygon
1114	   meets the X-axis or the session maximum packet rate (SMAXPR,
1115	   measured in packets per second) provided by signaling, if any.
1116	   Typically a media sender will prefer to operate at a lower rate than
1117	   this theoretical maximum, so as to increase the rate at which actual
1118	   media content reaches the receivers.  The purpose of the algorithm
1119	   is to distinguish the TMMBR tuples constituting the bounding set and
1120	   thus delineate the feasible region, so that the media sender can
1121	   select its preferred operating point within that region.
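
   The geometric view can be sketched in a few lines.  The example
   below is a non-normative illustration: it tests whether a (packet
   rate, net bit rate) point lies in the feasible region and computes
   the highest feasible packet rate.  Tuples are represented as
   (max total bit rate, overhead in bytes) pairs, and the function
   names are assumptions made for this example.

```python
import math


def is_feasible(packet_rate, net_bitrate, tuples, smaxpr=None):
    """True if the operating point satisfies every TMMBR tuple.

    tuples: iterable of (max_total_br_bps, overhead_bytes) pairs.
    smaxpr: session maximum packet rate, or None if not signaled.
    """
    if packet_rate < 0 or net_bitrate < 0:
        return False
    if smaxpr is not None and packet_rate > smaxpr:
        return False
    # Each tuple is the line y = br - 8 * oh * x; the point must lie
    # on or below every such line.
    return all(net_bitrate <= br - 8 * oh * packet_rate for br, oh in tuples)


def highest_feasible_packet_rate(tuples, smaxpr=None):
    """Lowest X-axis crossing of any tuple, capped by SMAXPR if given."""
    crossings = [br / (8 * oh) for br, oh in tuples if oh > 0]
    limit = min(crossings, default=math.inf)
    return limit if smaxpr is None else min(limit, smaxpr)


limits = [(35000, 40), (40000, 60)]
print(is_feasible(20, 28600, limits))   # True (exactly on line A)
print(is_feasible(40, 21000, limits))   # False (above line B)
```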

1123	   Figure 1 below shows a bounding polygon formed by TMMBR tuples A and
1124	   B. A third tuple C lies outside the bounding polygon and is
1125	   therefore irrelevant in determining feasible tradeoffs between media
1126	   rate and packet rate.  The line labeled ss..s represents the limit
1127	   on packet rate imposed by the session maximum packet rate (SMAXPR)
1128	   obtained by signaling during session setup.  In Figure 1 the limit
1129	   determined by tuple B happens to be more restrictive than SMAXPR.
1130	   The situation could easily be the reverse, meaning that the bounding
1131	   polygon is terminated on the right by the vertical line representing
1132	   the SMAXPR constraint.

1134	   Net  ^
1135	   Media|a   c   b             s
1136	   Bit  |  a   c  b            s
1137	   Rate |    a   c b           s
1138	        |      a   cb          s
1139	        |        a   c         s
1140	        |          a  bc       s
1141	        |            a b c     s
1142	        |              ab  c   s
1143	        |  Feasible      b   c s
1144	        |   region        ba   s
1145	        |                  b a s c
1146	        |                   b  s   c
1147	        |                    b s a
1148	        |_____________________bs________
1149	        +------------------------------>____________

1151	              Packet rate

1153	    Figure 1 - Geometric Interpretation of TMMBR Tuples

1155	   Note that the slopes of the lines making up the bounding polygon are
1156	   increasingly negative as one moves in the direction of increasing
1157	   packet rate.  Note also that with slight rearrangement, equations
1158	   (1) and (2) have the canonical form:

1160	          y = mx + b

1162	   where
1163	     m is the slope and has value equal to the negative of the tuple
1164	     overhead (in bits),
1165	   and
1166	     b is the y-intercept and has value equal to the tuple maximum
1167	     total media bit rate.

1169	   These observations lead to the conclusion that when processing the
1170	   TMMBR tuples to select the initial bounding set, one should sort and
1171	   process the tuples by order of increasing overhead. Once a
1172	   particular tuple has been added to the bounding set, all tuples not
1173	   already selected and having lower overhead can be eliminated,
1174	   because the next side of the bounding polygon has to be steeper
1175	   (i.e. the corresponding TMMBR must have higher overhead) than the
1176	   latest added tuple.

1178	   Line cc..c in Figure 1 illustrates another principle. This line is
1179	   parallel to line aa..a, but has a higher Y-intercept.  That is, the
1180	   corresponding TMMBR tuple contains a higher maximum total media bit
1181	   rate value.  Since line cc..c is outside the bounding polygon, it
1182	   illustrates the conclusion that if two TMMBR tuples have the same
1183	   overhead value, the one with higher maximum total media bit rate
1184	   value cannot be part of the bounding set and can be set aside.

1186	   Two further observations complete the algorithm.  Obviously, moving
1187	   from the left, the successive corners of the bounding polygon (i.e.
1188	   the intersection points between successive pairs of sides) lie at
1189	   successively higher packet rates.  On the other hand, again moving
1190	   from the left, each successive line making up the bounding set
1191	   crosses the X-axis at a lower packet rate.

1193	   The complete algorithm can now be specified.  The algorithm works
1194	   with two lists of TMMBR tuples, the candidate list X and the
1195	   selected list Y, both ordered by increasing overhead value.  The
1196	   algorithm terminates when all members of X have been discarded or
1197	   removed for processing.  Membership of the selected list Y is
1198	   probationary until the algorithm is complete.  Each member of the
1199	   selected list is associated with an intersection value, which is the
1200	   packet rate at which the line corresponding to that TMMBR tuple
1201	   intersects with the line corresponding to the previous TMMBR tuple
1202	   in the selected list.  Each member of the selected list is also
1203	   associated with a maximum packet rate value, which is the lesser of
1204	   the session maximum packet rate SMAXPR (if any) and the packet rate
1205	   at which the line corresponding to that tuple crosses the X-axis.

1207	   When the algorithm terminates, the selected list is equal to the
1208	   bounding set as defined in section 2.2.

1210	Initial Algorithm

1212	   This algorithm is used by the media sender when it has received one
1213	   or more TMMBR requests and before it has determined a bounding set
1214	   for the first time.

1216	   1. Sort the TMMBR tuples by order of increasing overhead.  This is
1217	      the initial candidate list X.

1219	   2. When multiple tuples in the candidate list have the same overhead
1220	      value, discard all but the one with the lowest maximum total media
1221	      bit rate value.

1223	   3. Select and remove from the candidate list the TMMBR tuple with the
1224	      lowest maximum total media bit rate value.  If there is more than
1225	      one tuple with that value, choose the one with the highest
1226	      overhead value.  This is the first member of the selected list Y.
1227	      Set its intersection value equal to zero.  Calculate its maximum
1228	      packet rate as the minimum of SMAXPR (if available) and the value
1229	      obtained from the following formula, which is the packet rate at
1230	      which the corresponding line crosses the X-axis.

1232	          Max PR = TMMBR max total BR / (8 * TMMBR OH) ... (4)

1234	   4. Discard from the candidate list all tuples with a lower overhead
1235	      value than the selected tuple.

1237	   5. Remove the first remaining tuple from the candidate list for
1238	      processing.  Call this the current candidate.

1240	   6. Calculate the packet rate PR at the intersection of the line
1241	      generated by the current candidate with the line generated by the
1242	      last tuple in the selected list Y, using equation (3).

1244	   7. If the calculated value PR is equal to or lower than the
1245	      intersection value stored for the last tuple of the selected list,
1246	      discard the last tuple of the selected list and go back to step 6
1247	      (retaining the same current candidate).

1249	      Note that the choice of the initial member of the selected list Y
1250	      in step 3 guarantees that the selected list will never be emptied
1251	      by this process, meaning that the algorithm must eventually (if
1252	      not immediately) fall through to step 8.

1254	   8. (This step is reached when the calculated PR value of the current
1255	      candidate is greater than the intersection value of the current
1256	      last member of the selected list Y.)  If the calculated value PR
1257	      of the current candidate is lower than the maximum packet rate
1258	      associated with the last tuple in the selected list, add the
1259	      current candidate tuple to the end of the selected list.  Store PR
1260	      as its intersection value.  Calculate its maximum packet rate as
1261	      the lesser of SMAXPR (if available) and the maximum packet rate
1262	      calculated using equation (4).

1264	   9. If any tuples remain in the candidate list, go back to step 5.
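
   Steps 1 through 9 can be sketched as follows.  This is a minimal,
   non-normative illustration under stated assumptions: the class and
   function names are hypothetical, a tuple with zero overhead is
   treated as never crossing the X-axis, and smaxpr=None means that no
   session maximum packet rate was signaled.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TmmbrTuple:
    ssrc: int
    max_br: float   # maximum total media bit rate, bits per second
    overhead: int   # per-packet overhead, bytes


@dataclass
class Selected:
    tup: TmmbrTuple
    intersection: float  # packet rate where this line meets the previous one
    max_pr: float        # lesser of SMAXPR and the X-axis crossing


def max_packet_rate(t, smaxpr):
    # Equation (4), capped by SMAXPR when one is available.
    crossing = math.inf if t.overhead == 0 else t.max_br / (8 * t.overhead)
    return crossing if smaxpr is None else min(smaxpr, crossing)


def intersection(a, b):
    # Equation (3); callers guarantee a.overhead != b.overhead.
    return (a.max_br - b.max_br) / (8 * (a.overhead - b.overhead))


def bounding_set(tuples, smaxpr=None):
    if not tuples:
        return []
    # Step 1: candidate list X, ordered by increasing overhead.
    x = sorted(tuples, key=lambda t: (t.overhead, t.max_br))
    # Step 2: per overhead value keep only the lowest bit rate, which is
    # the first entry for that overhead in the sort order above.
    deduped = []
    for t in x:
        if not deduped or deduped[-1].overhead != t.overhead:
            deduped.append(t)
    # Step 3: lowest bit rate overall; ties go to the highest overhead.
    first = min(deduped, key=lambda t: (t.max_br, -t.overhead))
    y = [Selected(first, 0.0, max_packet_rate(first, smaxpr))]
    # Step 4 (this also removes the selected tuple itself from X).
    x = [t for t in deduped if t.overhead > first.overhead]
    # Steps 5 and 9: take the remaining candidates in order.
    for cand in x:
        # Steps 6 and 7: drop selected tuples that the candidate undercuts.
        pr = intersection(cand, y[-1].tup)
        while pr <= y[-1].intersection:
            y.pop()
            pr = intersection(cand, y[-1].tup)
        # Step 8: keep the candidate only if it intersects the last
        # selected line before that line's maximum packet rate.
        if pr < y[-1].max_pr:
            y.append(Selected(cand, pr, max_packet_rate(cand, smaxpr)))
    return y
```

   With the example tuples A (35 kbps, 40 bytes) and B (40 kbps,
   60 bytes) plus a tuple C parallel to A (45 kbps, 40 bytes), this
   sketch selects A and B, with B's intersection value at 31.25
   packets per second.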

1266	Incremental Algorithm

1268	   The previous algorithm covered the initial case, where no selected
1269	   list had previously been created.  It also applied only to the media
1270	   sender.  When a previously-created selected list is available at
1271	   either the media sender or media receiver, two other cases can be
1272	   considered:

1274	        o when a TMMBR tuple not currently in the selected list is a
1275	          candidate for addition;

1277	        o when the values change in a TMMBR tuple currently in the
1278	          selected list.

1280	   At the media receiver these cases correspond respectively to those
1281	   of the non-owner and owner of a tuple in the TMMBN-reported bounding
1282	   set.

1284	   In either case, the process of updating the selected list to take
1285	   account of the new/changed tuple can use the basic algorithm
1286	   described above, with the modification that the initial candidate
1287	   set consists only of the existing selected list and the new or
1288	   changed tuple.  Some further optimization is possible (beyond
1289	   starting with a reduced candidate set) by taking advantage of the
1290	   following observations.

1292	   The first observation is that if the new/changed candidate becomes
1293	   part of the new selected list, the result may be to cause zero or
1294	   more other tuples to be dropped from the list.  However, if more
1295	   than one other tuple is dropped, the dropped tuples will be
1296	   consecutive.  This can be confirmed geometrically by visualizing a
1297	   new line that cuts off a series of segments from the previously-
1298	   existing bounding polygon.  The cut-off segments are connected one
1299	   to the next, the geometric equivalent of consecutive tuples in a
1300	   list ordered by overhead value.  Beyond the dropped set in either
1301	   direction all of the tuples that were in the earlier selected list
1302	   will be in the updated one.  The second observation is that, leaving
1303	   aside the new candidate, the order of tuples remaining in the
1304	   updated selected list is unchanged because their overhead values
1305	   have not changed.

1307	   The consequence of these two observations is that, once the
1308	   placement of the new candidate and the extent of the dropped set of
1309	   tuples (if any) have been determined, the remaining tuples can be
1310	   copied directly from the candidate list into the selected list,
1311	   preserving their order.  This conclusion suggests the following
1312	   modified algorithm:

1314	       o Run steps 1-4 of the basic algorithm.

1316	       o If the new candidate has survived steps 2 and 4 and has become
1317	          the new first member of the selected list, run steps 5-9 on
1318	          subsequent candidates until another candidate is added to the
1319	          selected list.  Then move all remaining candidates to the
1320	          selected list, preserving their order.

1322	       o If the new candidate has survived steps 2 and 4 and has not
1323	          become the new first member of the selected list, start by
1324	          moving all tuples in the candidate list with lower overhead
1325	          values than that of the new candidate to the selected list,
1326	          preserving their order.  Run steps 5 through 9 for the new
1327	          candidate, with the modification that the intersection values
1328	          and maximum packet rates for the tuples on the selected list
1329	          have to be calculated on the fly because they were not
1330	          previously stored.  Continue processing only until a
1331	          subsequent tuple has been added to the selected list, then
1332	          move all remaining candidates to the selected list, preserving
1333	          their order.

1335	          Note that the new candidate could be added to the selected
1336	          list only to be dropped again when the next tuple is
1337	          processed.  It can easily be seen that in this case the new
1338	          candidate does not displace any of the earlier tuples in the
1339	          selected list.  The limitations of ASCII art make this
1340	          difficult to show in a figure.  Line cc..c in Figure 1 would
1341	          be an example if it had a steeper slope (tuple C had a higher
1342	          overhead value), but still intersected line aa..a beyond where
1343	          line aa..a intersects line bb..b.

1345	   The algorithm just described is approximate, because it does not
1346	   take account of tuples outside the selected list.  To see how such
1347	   tuples can become relevant, consider Figure 1 and suppose that the
1348	   maximum total media bit rate in tuple A increases to the point that
1349	   line aa..a moves outside line cc..c.  Tuple A will remain in the
1350	   bounding set calculated by the media sender.  However, once it
1351	   issues a new TMMBN, media receiver C will apply the algorithm and
1352	   discover that its tuple C should now enter the bounding set.  It
1353	   will issue a TMMBR request to the media sender, which will repeat
1354	   its calculation and come to the appropriate conclusion.

1356	   The rules of section 4.2 require that the media sender refrain from
1357	   raising its sending rate until media receivers have had a chance to
1358	   respond to the TMMBN.  In the example just given, this delay ensures
1359	   that the relaxation of tuple A does not actually result in an
1360	   attempt to send media at a rate exceeding the capacity at C.

1362	3.5.4.3. Use of TMMBR in a Mixer Based Multipoint Operation

1364	   Assume a small mixer-based multiparty conference is ongoing, as
1365	   depicted in Topo-Mixer of [Topologies].  All participants have
1366	   negotiated a common maximum bit rate that this session can use.  The
1367	   conference operates over a number of unicast paths between the
1368	   participants and the mixer.  The congestion situation on each of
1369	   these paths can be monitored by the participant in question and by
1370	   the mixer, utilizing, for example, RTCP receiver reports (RR) or the
1371	   transport protocol, e.g. DCCP [RFC4340].  However, any given
1372	   participant has no knowledge of the congestion situation of the
1373	   connections to the other participants.  Worse, without mechanisms
1374	   similar to the ones discussed in this draft, the mixer (which is
1375	   aware of the congestion situation on all connections it manages) has
1376	   no standardized means to inform media senders to slow down, short of
1377	   forging its own receiver reports (which is undesirable).  In
1378	   principle, a mixer confronted with such a situation is obliged to
1379	   thin or transcode streams intended for connections that detected
1380	   congestion.

1382	   In practice, unfortunately, media-aware stream thinning is a very
1383	   difficult and cumbersome operation and adds undesirable delay.  If
1384	   media-unaware, it leads very quickly to unacceptable reproduced
1385	   media quality.  Hence, a means to slow down senders even in the
1386	   absence of congestion on their connections to the mixer is
1387	   desirable.

1389	   To allow the mixer to throttle traffic on the individual links,
1390	   without performing transcoding, there is a need for a mechanism that
1391	   enables the mixer to ask a participant's media encoders to limit the
1392	   media stream bit rate they are currently generating.  TMMBR provides
1393	   the required mechanism.  When the mixer detects congestion between
1394	   itself and a given participant, it executes the following procedure:

1396	   1. It starts thinning the media traffic to the congested participant
1397	      to the supported bit rate.

1399	   2. It uses TMMBR to request the media sender(s) to reduce the total
1400	      media bit rate sent by them to the mixer, to a value that is in
1401	      compliance with congestion control principles for the slowest
1402	      link.  "Slowest" refers here to the available bandwidth / bit
1403	      rate / capacity and packet rate after congestion control.

1405	   3. As soon as the bit rate has been reduced by the sending party, the
1406	      mixer stops stream thinning implicitly, because there is no need
1407	      for it once the stream is in compliance with congestion control.

1409	   This use of stream thinning as an immediate reaction tool followed
1410	   up by a quick control mechanism appears to be a reasonable
1411	   compromise between media quality and the need to combat congestion.

1413	3.5.4.4. Use of TMMBR in Point-to-Multipoint Using Multicast or
1414	         Translators

1416	   In these topologies, corresponding to Topo-Multicast or Topo-
1417	   Translator, RTCP RRs are transmitted globally.  This allows all
1418	   participants to detect transmission problems such as congestion, on
1419	   a medium timescale.  As all media senders are aware of the
1420	   congestion situation of all media receivers, the rationale for the
1421	   use of TMMBR in the previous section does not apply.  However, even
1422	   in this case the congestion control response can be improved when
1423	   the unicast links are using congestion controlled transport
1424	   protocols (such as TCP or DCCP).  A peer may also report local
1425	   limitations to the media sender.

1427	3.5.4.5. Use of TMMBR in Point-to-point operation

1429	   In use case 7 it is possible to use TMMBR to improve the performance
1430	   when the known upper limit of the bit rate changes.  In this use
1431	   case the signaling protocol has established an upper limit for the
1432	   session and total media bit rates.  However, at the time of
1433	   transport link bit rate reduction, a receiver can avoid serious
1434	   congestion by sending a TMMBR to the sending side.  Thus, TMMBR is
1435	   useful for putting restrictions on the application and thus placing
1436	   the congestion control mechanism in the right ballpark.  However,
1437	   TMMBR is usually unable to provide the continuously quick feedback
1438	   loop required for real congestion control.  Nor do its semantics
1439	   match those of congestion control given its different purpose.  For
1440	   these reasons TMMBR SHALL NOT be used as a substitute for congestion
1441	   control.

1443	3.5.4.6. Reliability

1445	   The reaction of a media sender to the reception of a TMMBR message
1446	   is not immediately identifiable through inspection of the media
1447	   stream.  Therefore, a more explicit mechanism is needed to avoid
1448	   unnecessary re-sending of TMMBR messages.  Using a statistically
1449	   based retransmission scheme would only provide statistical
1450	   guarantees of the request being received.  It would also not avoid
1451	   the retransmission of already received messages.  In addition, it
1452	   would not allow for easy suppression of other participants'
1453	   requests.  For these reasons, a mechanism based on explicit
1454	   notification is used.

1456	   Upon the reception of a request a media sender sends a TMMBN
1457	   notification containing the current bounding set, and indicating
1458	   which session participants own that limit.  In multicast scenarios,
1459	   that allows all other participants to suppress any request they may
1460	   have, if their limitations are less strict than the current ones
1461	   (i.e. define lines lying outside the feasible region as defined in
1462	   section 2.2).  Keeping and notifying only the bounding set of tuples
1463	   allows for small message sizes and media sender states.  A media
1464	   sender only keeps state for the SSRCs of the current owners of the
1465	   bounding set of tuples; all other requests and their sources are not
1466	   saved.  Once the bounding set has been established, new TMMBR
1467	   messages should be generated only by owners of the bounding tuples
1468	   and by other entities that determine (by applying the algorithm of
1469	   section 3.5.4.2 or its equivalent) that their limitations should now
1470	   be part of the bounding set.

1472	4. RTCP Receiver Report Extensions

1474	   This memo specifies six new feedback messages.  The Full Intra
1475	   Request (FIR), Temporal-Spatial Trade-off Request (TSTR), Temporal-
1476	   Spatial Trade-off Notification (TSTN), and Video Back Channel
1477	   Message (VBCM) are "Payload Specific Feedback Messages" as defined
1478	   in Section 6.3 of AVPF [RFC4585].  The Temporary Maximum Media
1479	   Stream Bit Rate Request (TMMBR) and Temporary Maximum Media Stream
1480	   Bit Rate Notification (TMMBN) are "Transport Layer Feedback
1481	   Messages" as defined in Section 6.2 of AVPF.

1483	   The new feedback messages are defined in the following subsections,
1484	   following a similar structure to that in sections 6.2 and 6.3 of the
1485	   AVPF specification [RFC4585].

1487	4.1. Design Principles of the Extension Mechanism

1489	   RTCP was originally introduced as a channel to convey presence,
1490	   reception quality statistics and hints on the desired media coding.
1491	   A limited set of media control mechanisms were introduced in early
1492	   RTP payload formats for video formats, for example in RFC 2032
1493	   [RFC2032].  However, this specification, for the first time,
1494	   suggests a two-way handshake for some of its messages.  There is a
1495	   danger that this introduction could be misunderstood as a precedent
1496	   for the use of RTCP as an RTP session control protocol.  To prevent
1497	   such a misunderstanding, this subsection attempts to clarify the
1498	   scope of the extensions specified in this memo, and strongly
1499	   suggests that future extensions follow the rationale spelled out
1500	   here, or compellingly explain why they deviate from the rationale.

1502	   In this memo, and in AVPF [RFC4585], only such messages have been
1503	   included as:

1505	   a) have comparatively strict real-time constraints, which prevent
1506	      the use of mechanisms such as a SIP re-invite in most application
1507	      scenarios.  The real-time constraints are explained separately
1508	      for each message where necessary.

1510	   b) are multicast-safe in that the reaction to potentially
1511	      contradicting feedback messages is specified, as necessary for
1512	      each message; and

1514	   c) are directly related to activities of a certain media codec,
1515	      class of media codecs (e.g. video codecs), or a given RTP packet
1516	      stream.

1518	   In this memo, a two-way handshake is introduced only for messages
1519	   for which:

1521	   a) a notification or acknowledgement is required due to their
1522	      nature. An analysis to determine whether this requirement exists
1523	      has been performed separately for each message.

1525	   b) the notification or acknowledgement cannot be easily derived from
1526	      the media bit stream.

1528	   All messages in AVPF [RFC4585] and in this memo present their
1529	   contents in a simple, fixed binary format.  This accommodates media
1530	   receivers which have not implemented higher control protocol
1531	   functionalities (SDP, XML parsers and such) in their media path.

1533	   Messages that do not conform to the design principles just described
1534	   are not an appropriate use of RTCP or of the Codec Control Framework
1535	   defined in this document.

1537	4.2. Transport Layer Feedback Messages

1539	   As specified in section 6.1 of RFC 4585 [RFC4585], Transport Layer
1540	   Feedback messages are identified by the RTCP packet type value RTPFB
1541	   (205).

1543	   In AVPF, one message of this category was defined.  This memo
1544	   specifies two more such messages.  They are identified by means of
1545	   the FMT parameter as follows:

1547	   Assigned in AVPF [RFC4585]:

1549	      1:    Generic NACK
1550	      31:   reserved for future expansion of the identifier number
1551	            space

1553	   Assigned in this memo:

1555	      2:    reserved (see note below)
1556	      3:    Temporary Maximum Media Stream Bit Rate Request (TMMBR)
1557	      4:    Temporary Maximum Media Stream Bit Rate Notification
1558	            (TMMBN)

1560	          Note: early drafts of AVPF [RFC4585] reserved FMT=2 for a
1561	          code point that was later removed.  It has been pointed
1562	          out that there may be implementations in the field using this
1563	          value in accordance with the expired draft.  As there is
1564	          sufficient numbering space available, we mark FMT=2 as
1565	          reserved so as to avoid possible interoperability problems
1566	          with any such early implementations.

1568	   Available for assignment:

1570	      0:    unassigned
1571	      5-30: unassigned

1573	   The following subsections define the formats of the FCI entries for
1574	   the TMMBR and TMMBN messages, respectively, and specify the
1575	   associated behaviour at the media sender and receiver.

1577	4.2.1. Temporary Maximum Media Stream Bit Rate Request (TMMBR)

1579	   The Temporary Maximum Media Stream Bit Rate Request is identified by
1580	   RTCP packet type value PT=RTPFB and FMT=3.

1582	   The FCI field of a Temporary Maximum Media Stream Bit-Rate Request
1583	   (TMMBR) message SHALL contain one or more FCI entries.

1585	4.2.1.1. Message Format

1587	   The Feedback Control Information (FCI) consists of one or more TMMBR
1588	   FCI entries with the following syntax:

1590	    0                   1                   2                   3
1591	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1592	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1593	   |                              SSRC                             |
1594	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1595	   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1596	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1598	    Figure 2 - Syntax of an FCI entry in the TMMBR message

1600	     SSRC (32 bits): The SSRC value of the media sender that is
1601	              requested to obey the new maximum bit rate.

1603	     MxTBR Exp (6 bits): The exponential scaling of the mantissa for
1604	              the maximum total media bit rate value.  The value is an
1605	              unsigned integer [0..63].

1607	     MxTBR Mantissa (17 bits): The mantissa of the maximum total media
1608	              bit rate value as an unsigned integer.

1610	     Measured Overhead (9 bits): The measured average packet overhead
1611	              value in bytes.  The measurement SHALL be done according
1612	              to the description in section 4.2.1.2. The value is an
1613	              unsigned integer [0..511].

1615	   The maximum total media bit rate (MxTBR) value in bits per second is
1616	   calculated from the MxTBR exponent (exp) and mantissa in the
1617	   following way:

1619	      MxTBR = mantissa * 2^exp

1621	   This allows for 17 bits of resolution in the range 0 to 131071*2^63
1622	   (approximately 1.2*10^24).

1624	   The length of the TMMBR feedback message SHALL be set to 2+2*N where
1625	   N is the number of TMMBR FCI entries.
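
   As a non-normative sketch, one FCI entry can be packed and unpacked
   as shown below.  The function names are hypothetical.  The choice to
   use the smallest exponent that lets the mantissa fit in 17 bits, and
   to round the mantissa down so that the encoded limit never exceeds
   the requested one, is an assumption of this example and is not
   mandated by the text above.

```python
import struct


def encode_mxtbr(bitrate):
    """Split a bit rate into (exp, mantissa) with a 17-bit mantissa."""
    if bitrate < 0:
        raise ValueError("bit rate must be non-negative")
    exp, mantissa = 0, int(bitrate)
    while mantissa > 0x1FFFF:      # does not fit in 17 bits yet
        mantissa >>= 1             # rounds down
        exp += 1
    if exp > 63:
        raise ValueError("bit rate too large for a 6-bit exponent")
    return exp, mantissa


def pack_tmmbr_fci(ssrc, bitrate, overhead):
    """Build one 8-byte TMMBR FCI entry in network byte order."""
    if not 0 <= overhead <= 0x1FF:
        raise ValueError("overhead must fit in 9 bits")
    exp, mantissa = encode_mxtbr(bitrate)
    word = (exp << 26) | (mantissa << 9) | overhead
    return struct.pack("!II", ssrc, word)


def unpack_tmmbr_fci(data):
    """Return (ssrc, MxTBR in bits per second, overhead in bytes)."""
    ssrc, word = struct.unpack("!II", data)
    exp = word >> 26
    mantissa = (word >> 9) & 0x1FFFF
    overhead = word & 0x1FF
    return ssrc, mantissa << exp, overhead   # MxTBR = mantissa * 2^exp


def tmmbr_length_field(num_entries):
    """Value of the RTCP length field: 2 + 2*N."""
    return 2 + 2 * num_entries


entry = pack_tmmbr_fci(0x12345678, 1000000, 40)
print(unpack_tmmbr_fci(entry))   # (305419896, 1000000, 40)
```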

1627	4.2.1.2. Semantics

1629	Behaviour at the Media Receiver (Sender of the TMMBR)

1631	   TMMBR is used to indicate a transport related limitation at the
1632	   reporting entity acting as a media receiver.  TMMBR has the form of
1633	   a tuple containing two components.  The first value is the highest
1634	   bit rate per sender of a media stream, available at a receiver-
1635	   chosen protocol layer, which the receiver currently supports in this
1636	   RTP session.  The second value is the measured header overhead in
1637	   bytes as defined in section 2.2 and measured at the chosen protocol
1638	   layer in the packets received for the stream.  The measurement of
1639	   the overhead is a running average that is updated for each packet
1640	   received for this particular media source (SSRC), using the
1641	   following formula:

1643	       avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH,

1645	   where avg_OH is the running (exponentially smoothed) average and
1646	   pckt_OH is the overhead observed in the latest packet.
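
   The running average can be sketched as below (non-normative).  The
   function name is hypothetical.  How the average is seeded is not
   stated above; this example assumes that the overhead of the first
   packet received is used as the initial value.

```python
def update_avg_overhead(avg_oh, pckt_oh):
    """Return the new running average after observing one packet.

    avg_oh:  previous average in bytes, or None before the first packet
             (seeding with the first sample is an assumption).
    pckt_oh: overhead observed in the latest packet, in bytes.
    """
    if avg_oh is None:
        return float(pckt_oh)
    return 15 / 16 * avg_oh + 1 / 16 * pckt_oh


avg = None
for observed in (40, 40, 56):
    avg = update_avg_overhead(avg, observed)
print(avg)   # 41.0
```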

1648	   If a maximum bit rate has been negotiated through signaling, the
1649	   maximum total media bit rate that the receiver reports in a TMMBR
1650	   message MUST NOT exceed the negotiated value converted to a common
1651	   basis (i.e. with overheads adjusted to bring it to the same
1652	   reference protocol layer).

1654	   Within the common packet header for feedback messages (as defined in
1655	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
1656	   indicates the source of the request, and the "SSRC of media source"
1657	   is not used and SHALL be set to 0.  Within a particular TMMBR FCI
1658	   entry, the "SSRC of media sender" in the FCI field denotes the media
1659	   sender the tuple applies to.  This is useful in the multicast or
1660	   translator topologies where the reporting entity may address all of
1661	   the media senders in a single TMMBR message using multiple FCI
1662	   entries.

1664	   The media receiver SHALL save the contents of the latest TMMBN
1665	   message received from each media sender.

1667	   The media receiver MAY send a TMMBR FCI entry to a particular media
1668	   sender under the following circumstances:

1670	     o   before any TMMBN message has been received from that media
1671	          sender;

1673	     o   when the media receiver has been identified as the source of a
1674	          bounding tuple within the latest TMMBN message received from
1675	          that media sender, and the value of the maximum total media
1676	          bit rate or the overhead relating to that media sender has
1677	          changed;

1679	     o   when the media receiver has not been identified as the source
1680	          of a bounding tuple within the latest TMMBN message received
1681	          from that media sender, and, after the media receiver applies
1682	          the incremental algorithm from section 3.5.4.2 or a stricter
1683	          equivalent, the media receiver's tuple relating to that media
1684	          sender is determined to belong to the bounding set.
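   The three circumstances listed above can be summarised as a small
   decision function.  This is a non-normative sketch: the function
   and parameter names are hypothetical, and the outcome of the
   algorithm from section 3.5.4.2 is passed in as a boolean rather
   than computed here.

```python
from typing import Optional, Set


def may_send_tmmbr(
    my_ssrc: int,
    last_tmmbn_owners: Optional[Set[int]],
    own_tuple_changed: bool,
    would_join_bounding_set: bool,
) -> bool:
    """Return True if a TMMBR FCI entry may be sent to a media sender.

    last_tmmbn_owners: SSRCs of the tuple owners in the latest TMMBN
        from that sender, or None if no TMMBN has been received yet.
    own_tuple_changed: the receiver's bit rate or overhead changed.
    would_join_bounding_set: result of applying the incremental
        algorithm (computed elsewhere; an input to this sketch).
    """
    if last_tmmbn_owners is None:
        return True                      # no TMMBN received yet
    if my_ssrc in last_tmmbn_owners:
        return own_tuple_changed         # owner: only on change
    return would_join_bounding_set       # non-owner: only if bounding
```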

1686	   A TMMBR FCI entry MAY be repeated in subsequent TMMBR messages if no
1687	   Temporary Maximum Media Stream Bit-Rate Notification (TMMBN) FCI has
1688	   been received from the media sender at the time of transmission of
1689	   the next RTCP packet.  The bit rate value of a TMMBR FCI entry MAY
1690	   be changed from one TMMBR message to the next.  The overhead
1691	   measurement SHALL be updated to the current value of avg_OH each
1692	   time the entry is sent.

1694	   If the value set by a TMMBR message is expected to be permanent, the
1695	   TMMBR setting party SHOULD renegotiate the session parameters to
1696	   reflect that using session setup signaling, e.g. a SIP re-invite.

1698	Behaviour at the Media Sender (Receiver of the TMMBR)

1700	   When it receives a TMMBR message containing an FCI entry relating to
1701	   it, the media sender SHALL use an initial or incremental algorithm
1702	   as applicable to determine the bounding set of tuples based on the
1703	   new information.  The algorithm used SHALL be at least as strict as
1704	   the corresponding algorithm defined in section 3.5.4.2.  The media
1706	   sender MAY accumulate TMMBR requests over a small interval (relative
1707	   to the RTCP sending interval) before making this calculation.

1709	   Once it has determined the bounding set of tuples, the media sender
1710	   MAY use any combination of packet rate and net media bit rate within
1711	   the feasible region that these tuples describe to produce a lower
1712	   total media stream bit rate, as it may need to address a congestion
1713	   situation or other limiting factors.  See section 5 (congestion
1714	   control) for more discussion.

1717	   If the media sender concludes that it can increase the maximum total
1718	   media bit rate value, it SHALL wait before actually doing so, for a
1719	   period long enough to allow a media receiver to respond to the TMMBN
1720	   if it determines that its tuple belongs in the bounding set.  This
1721	   delay period is estimated by the formula:

1723	      2 * RTT + T_Dither_Max,

1725	   where RTT is the longest round trip time known to the media sender
1726	   and T_Dither_Max is defined in section 3.4 of [RFC4585].  Even in
1727	   point-to-point sessions a media sender MUST obey the aforementioned
1728	   rule, as it is not guaranteed that a participant is able to
1729	   determine correctly whether all the sources are co-located in a
1730	   single node, and are coordinated.
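   A non-normative sketch of the waiting period follows.  The
   function name is hypothetical, all times are assumed to be in
   seconds, and T_Dither_Max is taken as an input whose value comes
   from [RFC4585].

```python
from typing import Iterable


def increase_hold_time(known_rtts: Iterable[float], t_dither_max: float) -> float:
    """Delay before raising the bit rate: 2 * RTT + T_Dither_Max.

    RTT is the longest round trip time known to the media sender.
    """
    longest_rtt = max(known_rtts)
    return 2 * longest_rtt + t_dither_max
```

   For instance, with known round trip times of 50 ms and 120 ms and
   a T_Dither_Max of 100 ms, the sender waits about 340 ms.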

1732	   A TMMBN message SHALL be sent by the media sender at the earliest
1733	   possible point in time, in response to any TMMBR messages received
1734	   since the last sending of TMMBN.  The TMMBN message indicates the
1735	   calculated set of bounding tuples and the owners of those tuples at
1736	   the time of the transmission of the message.

1738	   An SSRC may time out according to the default rules for RTP session
1739	   participants, i.e. the media sender has not received any RTP or RTCP
1740	   packets from the owner for the last five regular reporting
1741	   intervals.  An SSRC may also explicitly leave the session, with the
1742	   participant indicating this through the transmission of an RTCP BYE
1743	   packet or using an external signaling channel.  If the media sender
1744	   determines that the owner of a tuple in the bounding set has left
1745	   the session, the media sender SHALL transmit a new TMMBN containing
1746	   the previously-determined set of bounding tuples but with the tuple
1747	   belonging to the departed owner removed.
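   The departure handling described above can be sketched as follows
   (non-normative).  The tuple layout (owner SSRC, bit rate,
   overhead) and both function names are assumptions of this sketch.

```python
from typing import List, Tuple

BoundingTuple = Tuple[int, int, int]  # (owner SSRC, MxTBR in bit/s, overhead in bytes)


def has_timed_out(last_heard: float, now: float, reporting_interval: float) -> bool:
    """True if nothing was received for five regular reporting intervals."""
    return now - last_heard > 5 * reporting_interval


def remove_departed_owner(
    bounding_set: List[BoundingTuple], departed_ssrc: int
) -> List[BoundingTuple]:
    """Previously determined bounding set minus the departed owner's tuple.

    The result is the content of the new TMMBN; it may be empty.
    """
    return [t for t in bounding_set if t[0] != departed_ssrc]
```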

1749	   A media sender MAY proactively initiate the equivalent to a TMMBR
1750	   message to itself, when it is aware that its transmission path is
1751	   more restrictive than the current limitations.  As a result, a TMMBN
1752	   indicating the media source itself as the owner of a tuple is sent,
1753	   thereby avoiding unnecessary TMMBR messages from other
1754	   participants. However, like any other participant, when the media
1755	   sender becomes aware of changed limitations, it is required to
1756	   change the tuple, and to send a corresponding TMMBN.

1758	Discussion

1760	   Due to the unreliable nature of transport of TMMBR and TMMBN, the
1761	   above rules may lead to the sending of TMMBR messages which appear
1762	   to disobey those rules.  Furthermore, in multicast scenarios it can
1763	   happen that more than one "non-owning" session participant may
1764	   determine, rightly or wrongly, that its tuple belongs in the
1765	   bounding set.  This is not critical for a number of reasons:

1767	   a) If a TMMBR message is lost in transmission, either the media
1768	      sender sends a new TMMBN message in response to some other media
1769	      receiver or it does not send a new TMMBN message at all.  In the
1770	      first case, the media receiver applies the incremental algorithm
1771	      and, if it determines that its tuple should be part of the
1772	      bounding set, sends out another TMMBR.  In the second case, it
1773	      repeats the sending of a TMMBR unconditionally.  Either way, the
1774	      media sender eventually gets the information it needs.

1776	   b) Similarly, if a TMMBN message gets lost, the media receiver that
1777	      has sent the corresponding TMMBR request does not receive the
1778	      notification and is expected to re-send the request and trigger
1779	      the transmission of another TMMBN.

1781	   c) If multiple competing TMMBR messages are sent by different
1782	      session participants, then the algorithm can be applied taking
1783	      all of these messages into account, and the resulting TMMBN
1784	      provides the participants with an updated view of how their
1785	      tuples compare with the bounding set.

1787	   d) If more than one session participant happens to send TMMBR
1788	      messages at the same time and with the same tuple component
1789	      values, it does not matter which of those tuples is taken into
1790	      the bounding set.  The losing session participant will determine,
1791	      after applying the algorithm, that its tuple does not enter the
1792	      bounding set, and will therefore stop sending its TMMBR request.

1794	   It is important to consider the security risks involved with faked
1795	   TMMBRs.  See the security considerations in Section 6
1796	.

1798	   As indicated already, the feedback messages may be used in both
1799	   multicast and unicast sessions in any of the specified topologies.
1800	   However, for sessions with a large number of participants, using the
1801	   lowest common denominator, as required by this mechanism, may not be
1802	   the most suitable course of action.  Large sessions may need to
1803	   consider other ways to adapt the bit rate to participants'
1804	   capabilities, such as partitioning the session into different
1805	   quality tiers, or using some other method of achieving bit rate
1806	   scalability.

1808	4.2.1.3. Timing Rules

1810	   The first transmission of the TMMBR request message MAY use early or
1811	   immediate feedback in cases when timeliness is desirable.  Any
1812	   repetition of a request message SHOULD use regular RTCP mode for its
1813	   transmission timing.

1815	4.2.1.4. Handling in Translators and Mixers

1817	   Media translators and mixers will need to receive and respond to
1818	   TMMBR messages as they are part of the chain that provides a certain
1819	   media stream to the receiver.  The mixer or translator may act
1820	   locally on the TMMBR request and thus generate a TMMBN to indicate
1821	   that it has done so.  Alternatively, in the case of a media
1822	   translator it can forward the request, or in the case of a mixer
1823	   generate one of its own and pass it forward.  In the latter case,
1824	   the mixer will need to send a TMMBN back to the original requestor
1825	   to indicate that it is handling the request.

1827	4.2.2. Temporary Maximum Media Stream Bit Rate Notification (TMMBN)

1829	   The Temporary Maximum Media Stream Bit Rate Notification is
1830	   identified by RTCP packet type value PT=RTPFB and FMT=4.

1832	   The FCI field of the TMMBN Feedback message may contain zero, one or
1833	   more TMMBN FCI entries.

1835	4.2.2.1. Message Format

1837	   The Feedback Control Information (FCI) consists of zero, one or more
1838	   TMMBN FCI entries with the following syntax:

1840	    0                   1                   2                   3
1841	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1842	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1843	   |                              SSRC                             |
1844	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1845	   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1846	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1848	    Figure 3 - Syntax of an FCI entry in the TMMBN message

1849	     SSRC (32 bits): The SSRC value of the "owner" of this tuple.

1851	     MxTBR Exp (6 bits): The exponential scaling of the mantissa for
1852	              the maximum total media bit rate value.  The value is an
1853	              unsigned integer [0..63].

1855	     MxTBR Mantissa (17 bits): The mantissa of the maximum total media
1856	              bit rate value as an unsigned integer.

1858	     Measured Overhead (9 bits): The measured average packet overhead
1859	              value in bytes represented as an unsigned integer.

1861	   Thus, the FCI within the TMMBN message contains entries indicating
1862	   the bounding tuples.  For each tuple, the entry gives the owner by
1863	   the SSRC, followed by the applicable maximum total media bit rate
1864	   and overhead value.

1866	   The length of the TMMBN message SHALL be set to 2+2*N where N is the
1867	   number of TMMBN FCI entries.
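   To make the layout concrete, the following non-normative Python
   sketch serialises a complete TMMBN message, including the common
   feedback header of [RFC4585] (version 2, no padding, PT=RTPFB,
   which is 205).  The function names and the tuple layout are
   assumptions of this sketch.

```python
import struct
from typing import List, Sequence, Tuple

RTPFB = 205      # RTCP packet type for transport layer feedback
FMT_TMMBN = 4

Entry = Tuple[int, int, int, int]  # (owner SSRC, exp, mantissa, overhead)


def pack_tmmbn(sender_ssrc: int, entries: Sequence[Entry]) -> bytes:
    """Build a TMMBN with zero or more FCI entries."""
    first_byte = (2 << 6) | FMT_TMMBN          # V=2, P=0, FMT=4
    length = 2 + 2 * len(entries)              # 2+2*N
    # "SSRC of media source" is not used and is set to 0.
    out = struct.pack("!BBHII", first_byte, RTPFB, length, sender_ssrc, 0)
    for ssrc, exp, mantissa, overhead in entries:
        if not (0 <= exp < 64 and 0 <= mantissa < (1 << 17) and 0 <= overhead < 512):
            raise ValueError("field out of range")
        word = (exp << 26) | (mantissa << 9) | overhead
        out += struct.pack("!II", ssrc, word)
    return out


def unpack_tmmbn_entries(packet: bytes) -> List[Entry]:
    """Extract the FCI entries that follow the 12-byte header."""
    entries = []
    for offset in range(12, len(packet), 8):
        ssrc, word = struct.unpack_from("!II", packet, offset)
        entries.append((ssrc, word >> 26, (word >> 9) & 0x1FFFF, word & 0x1FF))
    return entries
```

   Calling pack_tmmbn with an empty entry list produces the 12-byte
   TMMBN without any FCI.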

1869	4.2.2.2. Semantics

1871	   This feedback message is used to notify the senders of any TMMBR
1872	   message that one or more TMMBR messages have been received or that
1873	   an owner has left the session.  It indicates to all participants the
1874	   current set of bounding tuples and the "owners" of those tuples.

1876	   Within the common packet header for feedback messages (as defined in
1877	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
1878	   indicates the source of the notification.  The "SSRC of media
1879	   source" is not used and SHALL be set to 0.

1881	   A TMMBN message SHALL be scheduled for transmission after the
1882	   reception of a TMMBR message with an FCI entry identifying this
1883	   media sender.  Only a single TMMBN SHALL be sent, even if more than
1884	   one TMMBR message is received between the scheduling of the
1885	   transmission and the actual transmission of the TMMBN message.  The
1886	   TMMBN message indicates the bounding tuples and their owners at the
1887	   time of transmitting the message.  The bounding tuples included
1888	   SHALL be the set arrived at through application of the applicable
1889	   algorithm of section 3.5.4.2 or an equivalent, applied to the
1890	   previous bounding set if any and tuples received in TMMBR messages
1891	   since the last TMMBN was transmitted.

1893	   The reception of a TMMBR message SHALL still result in the
1894	   transmission of a TMMBN message even if, after application of the
1895	   algorithm, the newly reported TMMBR tuple is not accepted into the
1896	   bounding set.  In such a case the bounding tuples and their owners
1897	   are not changed, unless the TMMBR was from an owner of a tuple
1898	   within the previously calculated bounding set.  This procedure
1899	   allows session participants that did not see the last TMMBN message
1900	   to get a correct view of this media sender's state.

1902	   As indicated in section 4.2.1.2, when a media sender determines that
1903	   an "owner" of a bounding tuple has left the session, then that tuple
1904	   is removed from the bounding set, and the media sender SHALL send a
1905	   TMMBN message indicating the remaining bounding tuples.  If there
1906	   are no remaining bounding tuples a TMMBN without any FCI SHALL be
1907	   sent to indicate this.  Without a remaining bounding tuple, the
1908	   maximum media bit rate and maximum packet rate negotiated in session
1909	   signaling, if any, apply.

1911	     Note: if any media receivers remain in the session, this will be
1912	     a temporary situation.  The empty TMMBN will cause every
1913	     remaining media receiver to determine that its limitation belongs
1914	     in the bounding set and send a TMMBR in consequence.

1916	   In unicast scenarios (i.e. where a single sender talks to a single
1917	   receiver), the aforementioned algorithm to determine ownership
1918	   degenerates to the media receiver becoming the "owner" of the one
1919	   bounding tuple as soon as the media receiver has issued the first
1920	   TMMBR message.

1922	4.2.2.3. Timing Rules

1924	   The TMMBN acknowledgement SHOULD be sent as soon as allowed by the
1925	   applied timing rules for the session.  Immediate or early feedback
1926	   mode SHOULD be used for these messages.

1928	4.2.2.4. Handling by Translators and Mixers

1930	   As discussed in Section 4.2.1.4, mixers or translators may need to
1931	   issue TMMBN messages as responses to TMMBR messages for SSRCs
1932	   handled by them.

1934	4.3. Payload Specific Feedback Messages

1936	   As specified by section 6.1 of RFC 4585 [RFC4585], Payload-Specific
1937	   FB messages are identified by the RTCP packet type value PSFB (206).

1939	   AVPF [RFC4585] defines three payload-specific feedback messages and
1940	   one application layer feedback message.  This memo specifies four
1941	   additional payload-specific feedback messages.  All are identified
1942	   by means of the FMT parameter as follows:

1944	   Assigned in [RFC4585]:

1946	     1:     Picture Loss Indication (PLI)
1947	     2:     Slice Loss Indication (SLI)
1948	     3:     Reference Picture Selection Indication (RPSI)
1949	     15:    Application layer FB message
1950	     31:    reserved for future expansion of the number space

1952	   Assigned in this memo:

1954	     4:     Full Intra Request Command (FIR)
1955	     5:     Temporal-Spatial Trade-off Request (TSTR)
1956	     6:     Temporal-Spatial Trade-off Notification (TSTN)
1957	     7:     Video Back Channel Message (VBCM)

1959	   Unassigned:

1961	     0:     unassigned
1962	     8-14:  unassigned
1963	     16-30: unassigned
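   For implementers, the FMT assignments listed above could be
   captured in a lookup such as the following non-normative sketch.
   The enum and function names are hypothetical; the numeric values
   are those in the list.

```python
from enum import IntEnum


class PsfbFmt(IntEnum):
    """FMT values for payload-specific feedback (PT=PSFB)."""
    PLI = 1
    SLI = 2
    RPSI = 3
    FIR = 4
    TSTR = 5
    TSTN = 6
    VBCM = 7
    AFB = 15   # application layer FB message


def classify_fmt(fmt: int) -> str:
    """Return the message name, "unassigned", or "reserved"."""
    if not 0 <= fmt <= 31:
        raise ValueError("FMT is a 5-bit field")
    if fmt == 31:
        return "reserved"
    try:
        return PsfbFmt(fmt).name
    except ValueError:
        return "unassigned"
```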

1965	   The following subsections define the new FCI formats for the
1966	   payload-specific feedback messages.

1968	4.3.1. Full Intra Request (FIR)

1970	   The FIR message is identified by RTCP packet type value PT=PSFB and
1971	   FMT=4.

1973	   The FCI field MUST contain one or more FIR entries.  Each entry
1974	   applies to a different media sender, identified by its SSRC.

1976	4.3.1.1. Message Format

1978	   The Feedback Control Information (FCI) for the Full Intra Request
1979	   consists of one or more FCI entries, the content of which is
1980	   depicted in Figure 4.  The length of the FIR feedback message MUST
1981	   be set to 2+2*N, where N is the number of FCI entries.

1983	    0                   1                   2                   3
1984	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1985	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1986	   |                              SSRC                             |
1987	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1988	   | Seq. nr       |    Reserved                                   |
1989	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1991	    Figure 4 - Syntax of an FCI entry in the FIR message

1993	     SSRC (32 bits): The SSRC value of the media sender which is
1994	              requested to send a decoder refresh point.

1996	     Seq. nr (8 bits): Command sequence number.  The sequence number
1997	              space is unique for each pairing of the SSRC of command
1998	              source and the SSRC of the command target.  The sequence
1999	              number SHALL be increased by 1 modulo 256 for each new
2000	              command.  A repetition SHALL NOT increase the sequence
2001	              number.  The initial value is arbitrary.

2003	     Reserved (24 bits): All bits SHALL be set to 0 by the sender and
2004	              SHALL be ignored on reception.

2006	   The semantics of this feedback message is independent of the RTP
2007	   payload type.
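   The FCI entry layout and the sequence number rules can be sketched
   as follows (non-normative).  The class and function names are
   hypothetical, and starting at 0 by default is merely one
   permissible choice, since the initial value is arbitrary.

```python
import struct
from typing import Dict, Tuple


class FirSequencer:
    """One sequence number space per (command source, command target) pair."""

    def __init__(self) -> None:
        self._seq: Dict[Tuple[int, int], int] = {}

    def new_command(self, source_ssrc: int, target_ssrc: int, initial: int = 0) -> int:
        """Sequence number for a new command: previous + 1 modulo 256."""
        key = (source_ssrc, target_ssrc)
        if key in self._seq:
            self._seq[key] = (self._seq[key] + 1) % 256
        else:
            self._seq[key] = initial % 256
        return self._seq[key]

    def repetition(self, source_ssrc: int, target_ssrc: int) -> int:
        """A repetition reuses the current sequence number unchanged."""
        return self._seq[(source_ssrc, target_ssrc)]


def pack_fir_entry(target_ssrc: int, seq_nr: int) -> bytes:
    """SSRC (32 bits), Seq. nr (8 bits), Reserved (24 zero bits)."""
    return struct.pack("!IB3x", target_ssrc, seq_nr)
```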

2009	4.3.1.2. Semantics

2011	   Within the common packet header for feedback messages (as defined in
2012	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2013	   indicates the source of the request, and the "SSRC of media source"
2014	   is not used and SHALL be set to 0.  The SSRCs of the media senders
2015	   to which the FIR command applies are in the corresponding FCI
2016	   entries.  A FIR message MAY contain requests to multiple media
2017	   senders, using one FCI entry per target media sender.

2019	   Upon reception of FIR, the encoder MUST send a decoder refresh point
2020	   (see section 2.2) as soon as possible.

2022	   The sender MUST consider congestion control as outlined in section
2023	   5, which MAY restrict its ability to send a decoder refresh point
2024	   quickly.

2026	   FIR SHALL NOT be sent as a reaction to picture losses -- it is
2027	   RECOMMENDED to use PLI [RFC4585] instead.  FIR SHOULD be used only
2028	   in situations where not sending a decoder refresh point would render
2029	   the video unusable for the users.

2031	   A typical example where sending FIR is appropriate is when, in a
2032	   multipoint conference, a new user joins the session and no regular
2033	   decoder refresh point interval is established.  Another example
2034	   would be a video switching MCU that changes streams.  Here,
2035	   normally, the MCU issues a FIR to the new sender so as to force it to
2036	   emit a decoder refresh point.  The decoder refresh point normally
2037	   includes a Freeze Picture Release (defined outside this
2038	   specification), which re-starts the rendering process of the
2039	   receivers.  Both techniques mentioned are commonly used in MCU-based
2040	   multipoint conferences.

2042	   Other RTP payload specifications such as RFC 2032 [RFC2032] already
2043	   define a feedback mechanism for certain codecs.  An application
2044	   supporting both schemes MUST use the feedback mechanism defined in
2045	   this specification when sending feedback.  For backward
2046	   compatibility reasons such an application SHOULD also be capable of
2047	   receiving and reacting to the feedback scheme defined in the
2048	   respective RTP payload format, if this is required by that payload
2049	   format.

2051	4.3.1.3. Timing Rules

2053	   The timing follows the rules outlined in section 3 of [RFC4585].
2054	   FIR commands MAY be used with early or immediate feedback.  The FIR
2055	   feedback message MAY be repeated.  If using immediate feedback mode
2056	   the repetition SHOULD wait at least one RTT before being sent.  In
2057	   early or regular RTCP mode the repetition is sent in the next
2058	   regular RTCP packet.

2060	4.3.1.4. Handling of FIR Message in Mixers and Translators

2062	   A media translator or a mixer performing media encoding of the
2063	   content for which the session participant has issued a FIR is
2064	   responsible for acting upon it.  A mixer acting upon a FIR SHOULD
2065	   NOT forward the message unaltered; instead it SHOULD issue a FIR
2066	   itself.

2068	4.3.1.5. Remarks

2070	   Currently, video appears to be the only useful application for FIR,
2071	   as it appears to be the only RTP payload widely deployed that relies
2072	   heavily on media prediction across RTP packet boundaries.  However,
2073	   use of FIR could also reasonably be envisioned for other media types
2074	   that share essential properties with compressed video, namely cross-
2075	   frame prediction (whatever a frame may be for that media type).  One
2076	   possible example may be the dynamic updates of MPEG-4 scene
2077	   descriptions.  It is suggested that payload formats for such media
2078	   types refer to FIR and other message types defined in this
2079	   specification and in AVPF [RFC4585], instead of creating similar
2080	   mechanisms in the payload specifications.  The payload
2081	   specifications may have to explain how the payload-specific
2082	   terminologies map to the video-centric terminology used herein.

2084	   In conjunction with video codecs, FIR messages typically trigger the
2085	   sending of full intra or IDR pictures.  Both are several times
2086	   larger than predicted (inter) pictures.  Their size is independent
2087	   of the time they are generated.  In most environments, especially
2088	   when employing bandwidth-limited links, the use of an intra picture
2089	   implies an allowed delay that is a significant multiple of the
2090	   typical frame duration.  An example: if the sending frame rate is 10
2091	   fps, and an intra picture is assumed to be 10 times as big as an
2092	   inter picture, then a full second of latency has to be accepted.  In
2093	   such an environment there is no need for a particularly short delay
2094	   in sending the FIR message.  Hence, waiting for the next possible
2095	   time slot allowed by RTCP timing rules as per [RFC4585] should not
2096	   have an overly negative impact on the system performance.
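   The arithmetic behind the example above can be written out as a
   small non-normative sketch.  It assumes a link whose rate is just
   sufficient to carry one inter picture per frame interval, so an
   intra picture occupies the link for as many frame intervals as it
   is times larger.  The function name is hypothetical.

```python
def intra_picture_delay(frame_rate_fps: float, intra_to_inter_ratio: float) -> float:
    """Seconds needed to send one intra picture under the stated assumption."""
    frame_duration = 1.0 / frame_rate_fps
    return intra_to_inter_ratio * frame_duration
```

   With 10 fps and a ratio of 10, the result is one second, matching
   the figure given in the text.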

2098	   Mandating a maximum delay for completing the sending of a decoder
2099	   refresh point would be desirable from an application viewpoint, but
2100	   is problematic from a congestion control point of view.  "As soon as
2101	   possible" as mentioned above appears to be a reasonable compromise.

2103	   In environments where the sender has no control over the codec (e.g.
2104	   when streaming pre-recorded and pre-coded content), the reaction to
2105	   this command cannot be specified.  One suitable reaction of a sender
2106	   would be to skip forward in the video bit stream to the next decoder
2107	   refresh point.  In other scenarios, it may be preferable not to
2108	   react to the command at all, e.g. when streaming to a large
2109	   multicast group.  Other reactions may also be possible.  When
2110	   deciding on a strategy, a sender could take into account factors
2111	   such as the size of the receiving group, the "importance" of the
2112	   sender of the FIR message (however "importance" may be defined in
2113	   this specific application), the frequency of decoder refresh points
2114	   in the content, and so on.  However, a session which predominately
2115	   handles pre-coded content is not expected to use FIR at all.

2117	   The relationship between the Picture Loss Indication and FIR is as
2118	   follows.  As discussed in section 6.3.1 of AVPF [RFC4585], a Picture
2119	   Loss Indication informs the encoder about the loss of a picture and
2120	   hence the likelihood of misalignment of the reference pictures
2121	   between the encoder and decoder.  Such a scenario is normally
2122	   related to losses in an ongoing connection.  In point-to-point
2123	   scenarios, and without the presence of advanced error resilience
2124	   tools, one possible option for an encoder is to send a
2125	   decoder refresh point.  However, there are other options.  One
2126	   example is that the media sender ignores the PLI, because the
2127	   embedded stream redundancy is likely to clean up the reproduced
2128	   picture within a reasonable amount of time.  The FIR, in contrast,
2129	   leaves a (real-time) encoder no choice but to send a decoder refresh
2130	   point.  It does not allow the encoder to take into account any
2131	   considerations such as the ones mentioned above.

2133	4.3.2. Temporal-Spatial Trade-off Request (TSTR)

2135	   The TSTR feedback message is identified by RTCP packet type value
2136	   PT=PSFB and FMT=5.

2138	   The FCI field MUST contain one or more TSTR FCI entries.

2140	4.3.2.1. Message Format

2142	   The content of the FCI entry for the Temporal-Spatial Trade-off
2143	   Request is depicted in Figure 5.  The length of the feedback message
2144	   MUST be set to 2+2*N, where N is the number of FCI entries included.

2146	    0                   1                   2                   3
2147	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2148	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2149	   |                              SSRC                             |
2150	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2151	   |  Seq. nr      |  Reserved                           | Index   |
2152	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2154	    Figure 5 - Syntax of an FCI Entry in the TSTR Message

2156	     SSRC (32 bits): The SSRC of the media sender which is requested to
2157	              apply the tradeoff value given in Index.

2159	     Seq. nr (8 bits): Request sequence number.  The sequence number
2160	              space is unique for each pairing of the SSRC of the
2161	              request source and the SSRC of the request target.  The
2162	              sequence number SHALL be increased by 1 modulo 256 for
2163	              each new command.  A repetition SHALL NOT increase the
2164	              sequence number.  The initial value is arbitrary.

2166	     Reserved (19 bits): All bits SHALL be set to 0 by the sender and
2167	              SHALL be ignored on reception.

2169	     Index (5 bits): An integer value between 0 and 31 that indicates
2170	              the relative trade-off that is requested.  An index value
2171	              of 0 indicates highest possible spatial quality, while 31
2172	              indicates highest possible temporal resolution.
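   A non-normative sketch of packing and parsing one TSTR FCI entry
   follows.  The function names are hypothetical; the field widths
   are those given above (8-bit sequence number, 19 reserved bits,
   5-bit Index).

```python
import struct
from typing import Tuple


def pack_tstr_entry(target_ssrc: int, seq_nr: int, index: int) -> bytes:
    """Serialise SSRC, Seq. nr, Reserved (zero) and Index."""
    if not 0 <= seq_nr <= 255:
        raise ValueError("sequence number is an 8-bit field")
    if not 0 <= index <= 31:
        raise ValueError("Index is a 5-bit field")
    word = (seq_nr << 24) | index      # reserved bits stay 0
    return struct.pack("!II", target_ssrc, word)


def unpack_tstr_entry(data: bytes) -> Tuple[int, int, int]:
    """Return (SSRC, Seq. nr, Index); reserved bits are ignored."""
    ssrc, word = struct.unpack("!II", data)
    return ssrc, word >> 24, word & 0x1F
```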

2174	4.3.2.2. Semantics

2176	   A decoder can suggest a temporal-spatial trade-off level by sending
2177	   a TSTR message to an encoder.  If the encoder is capable of
2178	   adjusting its temporal-spatial trade-off, it SHOULD take into
2179	   account the received TSTR message for future coding of pictures.  A
2180	   value of 0 suggests a high spatial quality and a value of 31
2181	   suggests a high frame rate.  The progression of values from 0 to 31
2182	   indicates monotonically a desire for higher frame rate.  The index
2183	   values do not correspond to precise values of spatial quality or
2184	   frame rate.

2186	   The reaction to the reception of more than one TSTR message by a
2187	   media sender from different media receivers is left open to the
2188	   implementation.  The selected trade-off SHALL be communicated to the
2189	   media receivers by the means of the TSTN message.

2191	   Within the common packet header for feedback messages (as defined in
2192	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2193	   indicates the source of the request, and the "SSRC of media source"
2194	   is not used and SHALL be set to 0.  The SSRCs of the media senders
2195	   to which the TSTR applies are in the corresponding FCI entries.

2197	   A TSTR message MAY contain requests to multiple media senders, using
2198	   one FCI entry per target media sender.

2200	4.3.2.3. Timing Rules

2202	   The timing follows the rules outlined in section 3 of [RFC4585].
2203	   This request message is not time critical and SHOULD be sent using
2204	   regular RTCP timing.  Only if it is known that the user interface
2205	   requires quick feedback, the message MAY be sent with early or
2206	   immediate feedback timing.

2208	4.3.2.4. Handling of Message in Mixers and Translators

2209	   A mixer or media translator that encodes content sent to the session
2210	   participant issuing the TSTR SHALL consider the request to determine
2211	   if it can fulfill it by changing its own encoding parameters.  A
2212	   media translator unable to fulfill the request MAY forward the
2213	   request unaltered towards the media sender.  A mixer encoding for
2214	   multiple session participants will need to consider the joint needs
2215	   of these participants before generating a TSTR on its own behalf
2216	   towards the media sender.  See also the discussion in Section 3.5.2.

2218	4.3.2.5. Remarks

2220	   The term "spatial quality" does not necessarily refer to the
2221	   resolution as measured by the number of pixels the reconstructed
2222	   video is using.  In fact, in most scenarios the video resolution
2223	   stays constant during the lifetime of a session.  However, all video
2224	   compression standards have means to adjust the spatial quality at a
2225	   given resolution, often influenced by the Quantizer Parameter or QP.
2226	   A numerically low QP results in a good reconstructed picture
2227	   quality, whereas a numerically high QP yields a coarse picture.  The
2228	   typical reaction of an encoder to this request is to change its rate
2229	   control parameters to use a lower frame rate and a numerically lower
2230	   (on average) QP, or vice versa.  The precise mapping of Index value
2231	   to frame rate and QP is intentionally left open here, as it depends
2232	   on factors such as the compression standard employed, spatial
2233	   resolution, content, bit rate, and so on.

2235	4.3.3. Temporal-Spatial Trade-off Notification (TSTN)

2237	   The TSTN message is identified by RTCP packet type value PT=PSFB and
2238	   FMT=6.

2240	   The FCI field SHALL contain one or more TSTN FCI entries.

2242	4.3.3.1. Message Format

2244	   The content of an FCI entry for the Temporal-Spatial Trade-off
2245	   Notification is depicted in Figure 6.  The length of the TSTN
2246	   message MUST be set to 2+2*N, where N is the number of FCI entries.

2248	    0                   1                   2                   3
2249	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2250	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2251	   |                              SSRC                             |
2252	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2253	   |  Seq. nr      |  Reserved                           | Index   |
2254	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2255	   Figure 6 - Syntax of the TSTN

2257	     SSRC (32 bits): The SSRC of the source of the TSTR request which
2258	              resulted in this Notification.

2260	     Seq. nr (8 bits): The sequence number value from the TSTR request
2261	              that is being acknowledged.

2263	     Reserved (19 bits): All bits SHALL be set to 0 by the sender and
2264	              SHALL be ignored on reception.

2266	     Index (5 bits): The trade-off value the media sender is using
2267	              henceforth.

2269	      Informative note: The returned trade-off value (Index) may differ
2270	      from the requested one, for example in cases where a media encoder
2271	      cannot tune its trade-off, or when pre-recorded content is used.
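   The FCI entry layout of Figure 6 can be illustrated with a minimal
   Python sketch.  The function names are hypothetical and are not
   defined by this memo; the sketch only assumes network byte order and
   the field widths given above (32-bit SSRC, 8-bit Seq. nr, 19
   reserved bits, 5-bit Index).

```python
import struct
from typing import Tuple


def pack_tstn_entry(ssrc: int, seq_nr: int, index: int) -> bytes:
    """Pack one TSTN FCI entry (hypothetical helper, not part of the memo)."""
    if not 0 <= ssrc <= 0xFFFFFFFF:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= seq_nr <= 0xFF:
        raise ValueError("Seq. nr must fit in 8 bits")
    if not 0 <= index <= 0x1F:
        raise ValueError("Index must fit in 5 bits")
    # Second word: Seq. nr in the top 8 bits, Index in the low 5 bits,
    # and the 19 reserved bits in between left at 0.
    second_word = (seq_nr << 24) | index
    return struct.pack("!II", ssrc, second_word)


def unpack_tstn_entry(data: bytes) -> Tuple[int, int, int]:
    """Return (ssrc, seq_nr, index); reserved bits are ignored on reception."""
    ssrc, second_word = struct.unpack("!II", data[:8])
    return ssrc, second_word >> 24, second_word & 0x1F
```

   Masking the second word on reception, rather than rejecting entries
   with non-zero reserved bits, follows the rule that those bits are
   ignored by the receiver.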

2273	4.3.3.2. Semantics

2275	   This feedback message is used to acknowledge the reception of a
2276	   TSTR.  For each TSTR received that targets the session participant,
2277	   a TSTN FCI entry SHALL be sent in a TSTN feedback message.  A
2278	   single TSTN message MAY acknowledge multiple requests using multiple
2279	   FCI entries.  The index value included SHALL be the same in all FCI
2280	   entries of the TSTN message.  Including an FCI entry for each
2281	   requestor allows each requesting entity to determine that the media
2282	   sender received the request.  The Notification SHALL also be sent in
2283	   response to TSTR repetitions received.  If the request receiver has
2284	   received TSTRs with several different sequence numbers from a single
2285	   requestor, it SHALL only respond to the request with the highest
2286	   (modulo 256) sequence number.  Note that the highest sequence number
2287	   may be a smaller integer value due to the wrapping of the field.
2288	   Section A.1 of [RFC3550] has an algorithm for keeping track of the
2289	   highest received sequence number for RTP packets; this could be
2290	   adapted for this usage.
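   The "highest (modulo 256)" comparison can be sketched as follows.
   This is a simplified illustration, not the algorithm of Section A.1
   of [RFC3550]: it assumes that a forward distance of 1 to 127 means
   "newer", and the names is_newer and TstrTracker are hypothetical.

```python
from typing import Dict


def is_newer(seq: int, reference: int) -> bool:
    """True if seq is ahead of reference in modulo-256 arithmetic.

    Assumption: a forward distance of 1..127 counts as newer.
    """
    return 0 < (seq - reference) % 256 < 128


class TstrTracker:
    """Tracks the highest TSTR sequence number seen per requestor SSRC."""

    def __init__(self) -> None:
        self._highest: Dict[int, int] = {}

    def should_acknowledge(self, ssrc: int, seq: int) -> bool:
        """Return True for a new highest request or a repetition of it."""
        highest = self._highest.get(ssrc)
        if highest is None or is_newer(seq, highest):
            self._highest[ssrc] = seq
            return True
        # A repetition of the current highest request is still
        # acknowledged; anything older is not.
        return seq == highest
```

   With this sketch, a request carrying sequence number 1 is treated
   as newer than one carrying 254, which matches the remark about the
   wrapping of the field.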

2292	   The TSTN SHALL include the Temporal-Spatial Trade-off index that
2293	   will be used as a result of the request.  This is not necessarily
2294	   the same index as requested, as the media sender may need to
2295	   aggregate requests from several requesting session participants.  It
2296	   may also have some other policies or rules that limit the selection.

2298	   Within the common packet header for feedback messages (as defined in
2299	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2300	   indicates the source of the Notification, and the "SSRC of media
2301	   source" is not used and SHALL be set to 0.  The SSRCs of the
2302	   requesting entities to which the Notification applies are in the
2303	   corresponding FCI entries.
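   Putting the header rules and the FCI entries together, a complete
   TSTN feedback packet could be assembled as in the sketch below.
   The function name is hypothetical.  The numeric value 206 for PSFB
   and the common header layout (V, P, FMT, PT, length, two SSRC
   fields) are taken from [RFC4585] and [RFC3550], not from this
   section.

```python
import struct
from typing import List, Tuple

PT_PSFB = 206   # payload-specific feedback, value assigned in RFC 4585
FMT_TSTN = 6    # FMT value of the TSTN message


def build_tstn_packet(sender_ssrc: int,
                      requests: List[Tuple[int, int]],
                      index: int) -> bytes:
    """Build a TSTN packet acknowledging (requestor_ssrc, seq_nr) pairs.

    The same Index is written into every FCI entry.
    """
    if not requests:
        raise ValueError("a TSTN needs at least one FCI entry")
    if not 0 <= index <= 0x1F:
        raise ValueError("Index must fit in 5 bits")
    first_octet = (2 << 6) | FMT_TSTN      # V=2, P=0, FMT=6
    length = 2 + 2 * len(requests)          # 2+2*N, in 32-bit words
    # "SSRC of media source" is not used and is set to 0.
    header = struct.pack("!BBHII", first_octet, PT_PSFB, length,
                         sender_ssrc, 0)
    fci = b"".join(struct.pack("!II", ssrc, (seq << 24) | index)
                   for ssrc, seq in requests)
    return header + fci
```

   For two acknowledged requests the length field is 6, so the packet
   occupies (6 + 1) * 4 = 28 octets.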

2305	4.3.3.3. Timing Rules

2307	   The timing follows the rules outlined in section 3 of [RFC4585].
2308	   This acknowledgement message is not extremely time critical and
2309	   SHOULD be sent using regular RTCP timing.

2311	4.3.3.4. Handling of TSTN in Mixer and Translators

2313	   A mixer or translator that acts upon a TSTR SHALL also send the
2314	   corresponding TSTN.  In cases where it needs to forward a TSTR
2315	   itself, the notification message MAY need to be delayed until the
2316	   TSTR has been responded to.

2318	4.3.3.5. Remarks

2320	   None

2322	4.3.4. H.271 Video Back Channel Message (VBCM)

2324	   The VBCM is identified by RTCP packet type value PT=PSFB and FMT=7.

2326	   The FCI field MUST contain one or more VBCM FCI entries.

2328	4.3.4.1. Message Format

2330	   The syntax of an FCI entry within the VBCM indication is depicted in
2331	   Figure 7.

2333	    0                   1                   2                   3
2334	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2335	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2336	   |                              SSRC                             |
2337	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2338	   | Seq. nr       |0| Payload Type| Length                        |
2339	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2340	   |                    VBCM Octet String....      |    Padding    |
2341	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2343	   Figure 7 - Syntax of an FCI Entry in the VBCM Message
2344	   SSRC (32 bits): The SSRC value of the media sender that is requested
2345	          to instruct its encoder to react to the VBCM message.

2347	   Seq. nr (8 bits): Command sequence number.  The sequence number
2348	          space is unique for pairing of the SSRC of command source and
2349	          the SSRC of the command target.  The sequence number SHALL be
2350	          increased by 1 modulo 256 for each new command.  A repetition
2351	          SHALL NOT increase the sequence number.  The initial value is
2352	          arbitrary.

2354	   0: Must be set to 0 by the sender and should not be acted upon by
2355	          the message receiver.

2357	   Payload Type (7 bits): The RTP payload type for which the VBCM bit
2358	          stream must be interpreted.

2360	   Length (16 bits): The length of the VBCM octet string in octets
2361	          exclusive of any padding octets.

2363	   VBCM Octet String (Variable length): This is the octet string
2364	          generated by the decoder carrying a specific feedback sub-
2365	          message.

2367	   Padding (Variable length): Bits set to 0 to make up a 32 bit
2368	          boundary.
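   The variable-length FCI entry of Figure 7 can be sketched in Python
   as shown below.  The helper names are hypothetical; the sketch
   assumes network byte order and treats the H.271 octet string as
   opaque bytes produced elsewhere.

```python
import struct
from typing import Tuple


def pack_vbcm_entry(ssrc: int, seq_nr: int, payload_type: int,
                    octets: bytes) -> bytes:
    """Pack one VBCM FCI entry, padded with zeros to a 32-bit boundary."""
    if not 0 <= seq_nr <= 0xFF:
        raise ValueError("Seq. nr must fit in 8 bits")
    if not 0 <= payload_type <= 0x7F:
        raise ValueError("RTP payload type must fit in 7 bits")
    if len(octets) > 0xFFFF:
        raise ValueError("octet string too long for the Length field")
    # The single bit in front of the payload type is always sent as 0.
    header = struct.pack("!IBBH", ssrc, seq_nr, payload_type, len(octets))
    padding = b"\x00" * (-len(octets) % 4)
    return header + octets + padding


def unpack_vbcm_entry(data: bytes) -> Tuple[int, int, int, bytes, int]:
    """Return (ssrc, seq_nr, payload_type, octets, octets_consumed)."""
    ssrc, seq_nr, second, length = struct.unpack("!IBBH", data[:8])
    octets = data[8:8 + length]
    if len(octets) != length:
        raise ValueError("truncated VBCM octet string")
    consumed = 8 + length + (-length % 4)   # Length excludes the padding
    return ssrc, seq_nr, second & 0x7F, octets, consumed
```

   Returning the number of octets consumed lets a caller step through
   several FCI entries carried in the same VBCM message.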

2370	4.3.4.2. Semantics

2372	   The "payload" of the VBCM indication carries different types of
2373	   codec-specific feedback information.  The type of feedback
2374	   information can be classified as a 'status report' (such as an
2375	   indication that a bit stream was received without errors, or that a
2376	   partial or complete picture or block was lost) or an 'update
2377	   request' (such as a complete refresh of the bit stream).

2379	          Note: There are possible overlaps between the VBCM sub-
2380	          messages and CCM/AVPF feedback messages, such as FIR.  Please
2381	          see section 3.5.3 for further discussion.

2383	   The different types of feedback sub-messages carried in the VBCM are
2384	   indicated by the "payloadType" as defined in [VBCM].  These sub-
2385	   message types are reproduced below for convenience.  "payloadType",
2386	   in ITU-T Rec. H.271 terminology, refers to the sub-type of the H.271
2387	   message and should not be confused with an RTP payload type.

2389	   Payload          Message Content
2390	   Type
2391	   --------------------------------------------------------------------

2393	   0      One or more pictures without detected bit stream error
2394	          mismatch
2395	   1      One or more pictures that are entirely or partially lost
2396	   2      A set of blocks of one picture that is entirely or partially
2397	          lost
2398	   3      CRC for one parameter set
2399	   4      CRC for all parameter sets of a certain type
2400	   5      A "reset" request indicating that the sender should completely
2401	          refresh the video bit stream as if no prior bit stream data
2402	          had been received
2403	   > 5    Reserved for future use by ITU-T

2405	   Table 2: H.271 message types ("payloadTypes")

2407	   The bit string or the "payload" of a VBCM message is of variable
2408	   length and is self-contained and coded in a variable length, binary
2409	   format.  The media sender necessarily has to be able to parse this
2410	   optimized binary format to make use of VBCM messages.

2412	   Each of the different types of sub-messages (indicated by
2413	   payloadType) may have different semantics depending on the codec
2414	   used.

2416	   Within the common packet header for feedback messages (as defined in
2417	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2418	   indicates the source of the request, and the "SSRC of media source"
2419	   is not used and SHALL be set to 0.  The SSRCs of the media senders
2420	   to which the VBCM message applies are in the corresponding FCI
2421	   entries.  The sender of the VBCM message MAY send H.271 messages to
2422	   multiple media senders and MAY send more than one H.271 message to
2423	   the same media sender within the same VBCM message.

2425	4.3.4.3. Timing Rules

2427	   The timing follows the rules outlined in section 3 of [RFC4585].
2428	   The different sub-message types may have different properties with
2429	   regard to the timing of messages that should be used.  If several
2430	   different types are included in the same feedback packet then the
2431	   requirements for the sub-message type with the most stringent
2432	   requirements should be followed.

2434	4.3.4.4. Handling of message in Mixer or Translator

2436	   The handling of VBCM in a mixer or translator is sub-message type
2437	   dependent.

2439	4.3.4.5. Remarks

2441	   Please see section 3.5.3 for a discussion of the usage of H.271
2442	   messages and messages defined in AVPF [RFC4585] and this memo with
2443	   similar functionality.

2445	     Note: There has been some discussion whether the RTP payload type
2446	     field in this message is needed.  It will be needed if there is
2447	     potentially more than one VBCM-capable RTP payload type in the
2448	     same session, and the semantics of a given VBCM message changes
2449	     between payload types.  For example, the picture identification
2450	     mechanism in messages of H.271 type 0 is fundamentally different
2451	     between H.263 and H.264 (although both use the same syntax).
2452	     Therefore, the payload type field is justified here.  There was a
2453	     further comment that for TSTR and FIR such a need does not exist,
2454	     because the semantics of TSTR and FIR are either loosely enough
2455	     defined, or generic enough, to apply to all video payloads
2456	     currently in existence/envisioned.

2458	5. Congestion Control

2460	   The correct application of the AVPF [RFC4585] timing rules prevents
2461	   the network from being flooded by feedback messages.  Hence,
2462	   assuming a correct implementation and configuration, the RTCP
2463	   channel cannot break its bit rate commitment and introduce
2464	   congestion.

2466	   The reception of some of the feedback messages modifies the
2467	   behaviour of the media senders or, more specifically, the media
2468	   encoders.  Thus, the modified behaviour MUST respect the bandwidth
2469	   limits that the application of congestion control provides.  For
2470	   example, when a media sender is reacting to a FIR, the unusually
2471	   high number of packets that form the decoder refresh point have to
2472	   be paced in compliance with the congestion control algorithm, even
2473	   if the user experience suffers from a slowly transmitted decoder
2474	   refresh point.

2476	   A change of the Temporary Maximum Media Stream Bit Rate value can
2477	   only mitigate congestion, but not cause congestion as long as
2478	   congestion control is also employed.  An increase of the value by a
2479	   request REQUIRES the media sender to use congestion control when
2480	   increasing its transmission rate to that value.  A reduction of the
2481	   value results in a reduced transmission bit rate, thus reducing the
2482	   risk for congestion.

2484	6. Security Considerations

2486	   The defined messages have certain properties that have security
2487	   implications.  These must be addressed and taken into account by
2488	   users of this protocol.

2490	   The defined setup signaling mechanism is sensitive to modification
2491	   attacks that can result in session creation with sub-optimal
2492	   configuration, and, in the worst case, session rejection.  To
2493	   prevent this type of attack, authentication and integrity protection
2494	   of the setup signaling is required.

2496	   Spoofed or maliciously created feedback messages of the type defined
2497	   in this specification can have the following implications:

2499	        a. severely reduced media bit rate due to false TMMBR messages
2500	           that set the maximum to a very low value;

2502	        b. assignment of the ownership of a bounding tuple to the wrong
2503	           participant within a TMMBN message, potentially causing
2504	           unnecessary oscillation in the bounding set as the mistakenly
2505	           identified owner reports a change in its tuple and the true
2506	           owner possibly holds back on changes until a correct TMMBN
2507	           message reaches the participants;

2509	        c. sending TSTR requests that result in a video quality
2510	           different from the user's desire, rendering the session less
2511	           useful;

2513	        d. sending multiple FIR commands to reduce the frame-rate, and
2514	           make the video jerky, due to the frequent usage of decoder
2515	           refresh points.

2517	   To prevent these attacks there is a need to apply authentication and
2518	   integrity protection of the feedback messages.  This can be
2519	   accomplished against threats external to the current RTP session
2520	   using the RTP profile that combines SRTP [SRTP] and AVPF into SAVPF
2521	   [SAVPF].  In the mixer cases, separate security contexts and
2522	   filtering can be applied between the mixer and the participants,
2523	   thus protecting other users on the mixer from a misbehaving
2524	   participant.

2526	7. SDP Definitions

2528	   Section 4 of [RFC4585] defines a new SDP [RFC4566] attribute, rtcp-
2529	   fb, that may be used to negotiate the capability to handle specific
2530	   AVPF commands and indications, such as Reference Picture Selection,
2531	   Picture Loss Indication etc.  The ABNF for rtcp-fb is described in
2532	   section 4.2 of [RFC4585].  In this section we extend the rtcp-fb
2533	   attribute to include the commands and indications that are described
2534	   for codec control in the present document.  We also discuss the
2535	   Offer/Answer implications for the codec control commands and
2536	   indications.

2538	7.1. Extension of the rtcp-fb Attribute

2540	   As described in AVPF [RFC4585], the rtcp-fb attribute indicates the
2541	   capability of using RTCP feedback.  AVPF specifies that the rtcp-fb
2542	   attribute must only be used as a media level attribute and must not
2543	   be provided at session level.  All the rules described in [RFC4585]
2544	   for rtcp-fb attribute relating to payload type and to multiple rtcp-
2545	   fb attributes in a session description also apply to the new
2546	   feedback messages defined in this memo.

2548	   The ABNF [RFC4234] for rtcp-fb as defined in [RFC4585] is:

2550	     "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF

2552	   where rtcp-fb-pt is the payload type and rtcp-fb-val defines the
2553	   type of the feedback message such as ack, nack, trr-int and rtcp-fb-
2554	   id.  For example, to indicate the support of feedback of picture
2555	   loss indication, the sender declares the following in SDP

2557	         v=0
2558	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2559	         s=Media with feedback
2560	         t=0 0
2561	         c=IN IP4 host.example.com
2562	         m=video 49170 RTP/AVPF 98
2563	         a=rtpmap:98 H263-1998/90000
2564	         a=rtcp-fb:98 nack pli

2566	   In this document we define a new feedback value "ccm" which
2567	   indicates the support of codec control using RTCP feedback messages.
2568	   The "ccm" feedback value SHOULD be used with parameters that
2569	   indicate the specific codec control commands supported. In this
2570	   draft we define four such parameters, namely:

2572	      o  "fir" indicates support of the Full Intra Request (FIR).
2573	      o  "tmmbr" indicates support of the Temporary Maximum Media Stream
2574	         Bit Rate Request/Notification (TMMBR/TMMBN).  It has an
2575	         optional sub parameter to indicate the session maximum packet
2576	         rate (measured in packets per second) to be used.  If not
2577	         included this defaults to infinity.
2578	      o  "tstr" indicates support of the Temporal-Spatial Trade-off
2579	         Request/Notification (TSTR/TSTN).
2580	      o  "vbcm" indicates support of H.271 video back channel messages
2581	         (VBCM).  It has zero or more subparameters identifying the
2582	         supported H.271 "payloadType" values.

2584	   In the ABNF for rtcp-fb-val defined in [RFC4585], there is a
2585	   placeholder called rtcp-fb-id to define new feedback types.  "ccm"
2586	   is defined as a new feedback type in this document and the ABNF for
2587	   the parameters for ccm are defined here (please refer to section 4.2
2588	   of [RFC4585] for complete ABNF syntax).

2590	   rtcp-fb-param = SP "app" [SP byte-string]
2591	                 / SP rtcp-fb-ccm-param
2592	                 /     ; empty

2594	   rtcp-fb-ccm-param = "ccm" SP ccm-param

2596	   ccm-param  = "fir"   ; Full Intra Request
2597	              / "tmmbr" [SP "smaxpr=" MaxPacketRateValue]
2598	                        ; Temporary max media bit rate
2599	              / "tstr"  ; Temporal Spatial Trade Off
2600	              / "vbcm" *(SP subMessageType) ; H.271 VBCM messages
2601	              / token [SP byte-string]
2602	                         ; for future commands/indications
2603	   subMessageType = 1*8DIGIT
2604	   byte-string = <as defined in section 4.2 of [RFC4585] >
2605	   MaxPacketRateValue = 1*15DIGIT
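   As an illustration of the grammar above, the following Python
   sketch parses a single "a=rtcp-fb" line carrying a "ccm" value.
   It is a simplified reading of the ABNF: the names CcmParam and
   parse_ccm_line are hypothetical, the payload type is assumed to be
   either "*" or a decimal number, and the byte-string of an unknown
   future parameter is accepted but not interpreted.

```python
import re
from typing import NamedTuple, Optional, Tuple


class CcmParam(NamedTuple):
    payload_type: str             # "*" or a decimal payload type
    name: str                     # "fir", "tmmbr", "tstr", "vbcm", or a token
    smaxpr: Optional[int]         # only for "tmmbr", if present
    vbcm_types: Tuple[int, ...]   # only for "vbcm"


_LINE = re.compile(r"^a=rtcp-fb:(\*|\d+) ccm (\S+)(?: (.*))?$")


def parse_ccm_line(line: str) -> Optional[CcmParam]:
    """Parse one attribute line; return None if it is not a ccm line."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    payload_type, name, rest = match.groups()
    smaxpr = None
    vbcm_types: Tuple[int, ...] = ()
    if name in ("fir", "tstr") and rest:
        raise ValueError(name + " takes no sub-parameters")
    if name == "tmmbr" and rest:
        sub = re.fullmatch(r"smaxpr=(\d{1,15})", rest)
        if sub is None:
            raise ValueError("malformed smaxpr sub-parameter")
        smaxpr = int(sub.group(1))
    elif name == "vbcm" and rest:
        parts = rest.split(" ")
        if not all(re.fullmatch(r"\d{1,8}", p) for p in parts):
            raise ValueError("malformed vbcm subMessageType")
        vbcm_types = tuple(int(p) for p in parts)
    return CcmParam(payload_type, name, smaxpr, vbcm_types)
```

   Lines that carry other feedback values, such as "nack pli", are
   reported as None so that a caller can hand them to other parsing
   code.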

2607	7.2. Offer-Answer

2609	   The Offer/Answer [RFC3264] implications for codec control protocol
2610	   feedback messages are similar to those described in [RFC4585].  The
2611	   offerer MAY indicate the capability to support selected codec
2612	   commands and indications.  The answerer MUST remove all ccm
2613	   parameters corresponding to the CCM messages that it does not wish
2614	   to support in this particular media session (for example because it
2615	   does not implement the message in question, or because its
2616	   application logic suggests the support of the message adds no
2617	   value).  The answerer MUST NOT add new ccm parameters in addition to
2618	   what has been offered.  The answer is binding for the media session
2619	   and both offerer and answerer MUST NOT use any feedback messages
2620	   other than what both sides have explicitly indicated as being
2621	   supported.  In other words, only the joint subset of CCM parameters
2622	   from the offer and answer may be used.

2624	   Note that including a CCM parameter in an offer or answer indicates
2625	   that the party (offerer or answerer) is at least capable of
2626	   receiving the corresponding CCM message(s) and acting upon them.  In
2627	   cases when the reception of a negotiated CCM message mandates the
2628	   party to respond with another CCM message, it must also have that
2629	   capability.  Although it is not mandated to initiate CCM messages of
2630	   any negotiated type, it is generally expected that a party will
2631	   initiate CCM messages when appropriate.

2633	   The session maximum packet rate parameter part of the TMMBR
2634	   indication is declarative and everyone SHALL use the highest value
2635	   indicated in a response.  If the session maximum packet rate
2636	   parameter is not present in an offer it SHALL NOT be included by the
2637	   answerer.
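   The answerer's side of these rules can be sketched as follows.  The
   data model is an assumption made for illustration only: each side
   is represented as a mapping from ccm parameter name to an optional
   smaxpr value, and the function name is hypothetical.  The sketch
   covers the removal of unsupported parameters, the ban on adding new
   ones, and the rule that smaxpr is not included in the answer when
   the offer lacked it; it does not model which packet rate value the
   parties finally use.

```python
from typing import Dict, Optional


def build_ccm_answer(offer: Dict[str, Optional[int]],
                     local: Dict[str, Optional[int]]
                     ) -> Dict[str, Optional[int]]:
    """Return the ccm parameters to place in the answer.

    Keys are ccm parameter names; values are the smaxpr sub-parameter
    for "tmmbr" (or None when absent or not applicable).
    """
    answer: Dict[str, Optional[int]] = {}
    for name, offered_smaxpr in offer.items():
        if name not in local:
            continue              # drop what the answerer does not support
        smaxpr = None
        if name == "tmmbr" and offered_smaxpr is not None:
            smaxpr = local[name]  # answerer declares its own value, if any
        answer[name] = smaxpr
    # Iterating over the offer guarantees that nothing new is added.
    return answer
```

   Applied to an offer of "tstr", "fir", and "tmmbr" with smaxpr=120
   and an answerer that supports only "fir" and "tstr", the result
   contains "tstr" and "fir" only.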

2639	7.3. Examples

2641	   Example 1: The following SDP describes a point-to-point video call
2642	   with H.263, with the originator of the call declaring its capability
2643	   to support the FIR and TSTR/TSTN codec control messages.  The SDP is
2644	   carried in a high level signaling protocol like SIP.

2646	         v=0
2647	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2648	         s=Point-to-Point call
2649	         c=IN IP4 192.0.2.124
2650	         m=audio 49170 RTP/AVP 0
2651	         a=rtpmap:0 PCMU/8000
2652	         m=video 51372 RTP/AVPF 98
2653	         a=rtpmap:98 H263-1998/90000
2654	         a=rtcp-fb:98 ccm tstr
2655	         a=rtcp-fb:98 ccm fir

2657	   In the above example, when the sender receives a TSTR message from
2658	   the remote party it is capable of adjusting the trade off as
2659	   indicated in the RTCP TSTN feedback message.

2661	   Example 2: The following SDP describes a SIP end point joining a
2662	   video mixer that is hosting a multiparty video conferencing session.

2664	   The participant supports only the FIR (Full Intra Request) codec
2665	   control command and it declares it in its session description.

2667	         v=0
2668	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2669	         s=Multiparty Video Call
2670	         c=IN IP4 192.0.2.124
2671	         m=audio 49170 RTP/AVP 0
2672	         a=rtpmap:0 PCMU/8000
2673	         m=video 51372 RTP/AVPF 98
2674	         a=rtpmap:98 H263-1998/90000
2675	         a=rtcp-fb:98 ccm fir

2677	   When the video MCU decides to route the video of this participant it
2678	   sends an RTCP FIR feedback message.  Upon receiving this feedback
2679	   message the end point is required to send a decoder refresh point.

2681	   Example 3: The following example describes the Offer/Answer
2682	   implications for the codec control messages.  The Offerer wishes to
2683	   support "tstr", "fir" and "tmmbr".  The offered SDP is

2685	   -------------> Offer
2686	         v=0
2687	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2688	         s=Offer/Answer
2689	         c=IN IP4 192.0.2.124
2690	         m=audio 49170 RTP/AVP 0
2691	         a=rtpmap:0 PCMU/8000
2692	         m=video 51372 RTP/AVPF 98
2693	         a=rtpmap:98 H263-1998/90000
2694	         a=rtcp-fb:98 ccm tstr
2695	         a=rtcp-fb:98 ccm fir
2696	         a=rtcp-fb:* ccm tmmbr smaxpr=120

2698	   The answerer wishes to support only the FIR and TSTR/TSTN messages
2699	   and the answerer SDP is

2701	   <---------------- Answer

2703	         v=0
2704	         o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2705	         s=Offer/Answer
2706	         c=IN IP4 192.0.2.37
2707	         m=audio 47190 RTP/AVP 0
2708	         a=rtpmap:0 PCMU/8000
2709	         m=video 53273 RTP/AVPF 98
2710	         a=rtpmap:98 H263-1998/90000
2711	         a=rtcp-fb:98 ccm tstr
2712	         a=rtcp-fb:98 ccm fir

2714	   Example 4: The following example describes the Offer/Answer
2715	   implications for H.271 Video back channel messages (VBCM).  The
2716	   Offerer wishes to support VBCM and the sub-messages of payloadType 1
2717	   (one or more pictures that are entirely or partially lost) and 2 (a
2718	   set of blocks of one picture that are entirely or partially lost).

2720	   -------------> Offer
2721	         v=0
2722	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2723	         s=Offer/Answer
2724	         c=IN IP4 192.0.2.124
2725	         m=audio 49170 RTP/AVP 0
2726	         a=rtpmap:0 PCMU/8000
2727	         m=video 51372 RTP/AVPF 98
2728	         a=rtpmap:98 H263-1998/90000
2729	         a=rtcp-fb:98 ccm vbcm 1 2

2731	   The answerer wishes to support sub-messages of type 1 only.

2733	   <---------------- Answer

2735	         v=0
2736	         o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2737	         s=Offer/Answer
2738	         c=IN IP4 192.0.2.37
2739	         m=audio 47190 RTP/AVP 0
2740	         a=rtpmap:0 PCMU/8000
2741	         m=video 53273 RTP/AVPF 98
2742	         a=rtpmap:98 H263-1998/90000
2743	         a=rtcp-fb:98 ccm vbcm 1

2745	   So, in the above example, only VBCM indications comprised of
2746	   "payloadType" 1 will be supported.
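   The narrowing of VBCM sub-message types in Example 4 can be
   sketched in the same way.  The sketch assumes that the sub-message
   types follow the same subset rule as the ccm parameters themselves,
   as the example suggests; both function names are hypothetical.

```python
from typing import Iterable, List, Set


def answer_vbcm_types(offered: Iterable[int],
                      supported: Set[int]) -> List[int]:
    """Keep the offered H.271 payloadType values the answerer supports."""
    return [t for t in offered if t in supported]


def vbcm_attribute(payload_type: int, types: Iterable[int]) -> str:
    """Format an a=rtcp-fb line for "ccm vbcm" with its sub-message types."""
    fields = ["a=rtcp-fb:%d" % payload_type, "ccm", "vbcm"]
    fields.extend(str(t) for t in types)
    return " ".join(fields)
```

   For the offer above, answer_vbcm_types([1, 2], {1}) keeps only
   type 1, and vbcm_attribute(98, [1]) produces the attribute line
   used in the answer.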

2748	8. IANA Considerations

2750	   The new value "ccm" needs to be registered with IANA in the "rtcp-
2751	   fb" Attribute Values registry located at the time of publication at:
2752	   http://www.iana.org/assignments/sdp-parameters

2754	   Value name:       ccm
2755	   Long Name:        Codec Control Commands and Indications
2756	   Reference:        RFC XXXX

2758	   A new registry "Codec Control Messages" needs to be created to hold
2759	   "ccm" parameters located at time of publication at:
2760	   http://www.iana.org/assignments/sdp-parameters

2762	   New registrations in this registry follow the "Specification
2763	   required" policy as defined by [RFC2434].  In addition, each
2764	   registration is required to indicate any additional RTCP feedback
2765	   types, such as "nack" or "ack", with which the parameter is usable.

2767	   The initial content of the registry is the following values:

2769	   Value name:       fir
2770	   Long name:        Full Intra Request Command
2771	   Usable with:      ccm
2772	   Reference:        RFC XXXX

2774	   Value name:       tmmbr
2775	   Long name:        Temporary Maximum Media Stream Bit Rate
2776	   Usable with:      ccm
2777	   Reference:        RFC XXXX

2779	   Value name:       tstr
2780	   Long name:        Temporal Spatial Trade Off
2781	   Usable with:      ccm
2782	   Reference:        RFC XXXX

2784	   Value name:       vbcm
2785	   Long name:        H.271 video back channel messages
2786	   Usable with:      ccm
2787	   Reference:        RFC XXXX

2789	   The following values need to be registered as FMT values in the "FMT
2790	   Values for RTPFB Payload Types" registry located at the time of
2791	   publication at: http://www.iana.org/assignments/rtp-parameters
2792	   RTPFB range
2793	   Name           Long Name                         Value  Reference
2794	   -------------- --------------------------------- -----  ---------
2795	                  Reserved                             2   [RFCxxxx]
2796	   TMMBR          Temporary Maximum Media Stream Bit   3   [RFCxxxx]
2797	                  Rate Request
2798	   TMMBN          Temporary Maximum Media Stream Bit   4   [RFCxxxx]
2799	                  Rate Notification

2801	   The following values need to be registered as FMT values in the "FMT
2802	   Values for PSFB Payload Types" registry located at the time of
2803	   publication at: http://www.iana.org/assignments/rtp-parameters

2805	   PSFB range
2806	   Name           Long Name                             Value Reference
2807	   -------------- ---------------------------------     ----- -------
2808	   FIR            Full Intra Request Command              4   [RFCxxxx]
2809	   TSTR           Temporal-Spatial Trade-off Request      5   [RFCxxxx]
2810	   TSTN           Temporal-Spatial Trade-off Notification 6   [RFCxxxx]
2811	   VBCM           Video Back Channel Message              7   [RFCxxxx]

2813	9. Contributors

2815	   Tom Taylor has made a very significant contribution, for which the
2816	   authors are very grateful, to this specification by helping rewrite
2817	   the specification.  The parts regarding the algorithm for
2818	   determining bounding sets for TMMBR have especially benefited.

2820	10.  Acknowledgements

2822	   The authors would like to thank Andrea Basso, Orit Levin, and Nermeen
2823	   Ismail for their work on the requirement and discussion draft
2824	   [Basso].

2826	   Drafts of this memo were reviewed and extensively commented on by
2827	   Roni Even, Colin Perkins, Randell Jesup, Keith Lantz, Harikishan
2828	   Desineni, Guido Franceschini, and others.  The authors appreciate
2829	   these reviews.

2831	   Funding for the RFC Editor function is currently provided by the
2832	   Internet Society.

2834	11.  References

2836	11.1. Normative references

2838	   [RFC4585]   Ott, J., Wenger, S., Sato, N., Burmeister, C., Rey, J.,
2839	                "Extended RTP Profile for Real-Time Transport Control
2840	                Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
2841	                July 2006.
2842	   [RFC2119]   Bradner, S., "Key words for use in RFCs to Indicate
2843	                Requirement Levels", BCP 14, RFC 2119, March 1997.
2844	   [RFC3550]   Schulzrinne, H.,  Casner, S., Frederick, R., and V.
2845	                Jacobson, "RTP: A Transport Protocol for Real-Time
2846	                Applications", STD 64, RFC 3550, July 2003.
2847	   [RFC4566]   Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
2848	                Description Protocol", RFC 4566, July 2006.
2849	   [RFC3264]   Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
2850	                with Session Description Protocol (SDP)", RFC 3264, June
2851	                2002.
2852	   [RFC2434]   Narten, T. and H. Alvestrand, "Guidelines for Writing an
2853	                IANA Considerations Section in RFCs", BCP 26, RFC 2434,
2854	                October 1998.
2855	   [RFC4234]   Crocker, D. and P. Overell, "Augmented BNF for Syntax
2856	                Specifications: ABNF", RFC 4234, October 2005.

2858	11.2. Informative references

2860	   [Basso]     A. Basso, et al., "Requirements for transport of video
2861	                control commands", draft-basso-avt-videoconreq-02.txt,
2862	                expired Internet Draft, October 2004.
2863	   [AVC]       Joint Video Team of ITU-T and ISO/IEC JTC 1, Draft ITU-T
2864	                Recommendation and Final Draft International Standard of
2865	                Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC
2866	                14496-10 AVC), Joint Video Team (JVT) of ISO/IEC MPEG
2867	                and ITU-T VCEG, JVT-G050, March 2003.
2868	   [H245]      ITU-T Rec. H.245, "Control protocol for multimedia
2869	                communication", May 2006.
2870	   [NEWPRED]   S. Fukunaga, T. Nakai, and H. Inoue, "Error Resilient
2871	                Video Coding by Dynamic Replacing of Reference
2872	                Pictures," in Proc. Globecom'96, vol. 3, pp. 1503 - 1508,
2873	                1996.
2874	   [SRTP]      Baugher, M., McGrew, D., Naslund, M., Carrara, E., and
2875	                K. Norrman, "The Secure Real-time Transport Protocol
2876	                (SRTP)", RFC 3711, March 2004.
2877	   [RFC2032]   Turletti, T. and C. Huitema, "RTP Payload Format for
2878	                H.261 Video Streams", RFC 2032, October 1996.

2880	   [SAVPF]     J. Ott, E. Carrara, "Extended Secure RTP Profile for
2881	                RTCP-based Feedback (RTP/SAVPF)," draft-ietf-avt-
2882	                profile-savpf-10.txt, February, 2007.
2883	   [RFC3525]   Groves, C., Pantaleo, M., Anderson, T., and T. Taylor,
2884	                "Gateway Control Protocol Version 1", RFC 3525, June
2885	                2003.
2886	   [RFC3448]   M. Handley, S. Floyd, J. Padhye, J. Widmer, "TCP
2887	                Friendly Rate Control (TFRC): Protocol Specification",
2888	                RFC 3448, January 2003.
2889	   [VBCM]      ITU-T Rec. H.271, "Video Back Channel Messages", June
2890	                2006.
2891	   [RFC3890]   Westerlund, M., "A Transport Independent Bandwidth
2892	                Modifier for the Session Description Protocol (SDP)",
2893	                RFC 3890, September 2004.
2894	   [RFC4340]   Kohler, E., Handley, M., and S. Floyd, "Datagram
2895	                Congestion Control Protocol (DCCP)", RFC 4340, March
2896	                2006.
2897	   [RFC3261]   Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
2898	                A., Peterson, J., Sparks, R., Handley, M., and E.
2899	                Schooler, "SIP: Session Initiation Protocol", RFC 3261,
2900	                June 2002.
2901	   [RFC2198]   Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
2902	                Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
2903	                Parisis, "RTP Payload for Redundant Audio Data", RFC
2904	                2198, September 1997.
2905	   [Topologies] M. Westerlund, and S. Wenger, "RTP Topologies", draft-
2906	                ietf-avt-topologies-06, work in progress, August 2007.

2908	12.  Authors' Addresses

2910	   Stephan Wenger
2911	   Nokia Corporation
2912	   975, Page Mill Road,
2913	   Palo Alto, CA 94304
2914	   USA

2916	   Phone: +1-650-862-7368
2917	   EMail: stewe@stewe.org

2919	   Umesh Chandra
2920	   Nokia Research Center
2921	   975, Page Mill Road,
2922	   Palo Alto, CA 94304
2923	   USA

2925	   Phone: +1-650-796-7502
2926	   EMail: Umesh.1.Chandra@nokia.com

2928	   Magnus Westerlund
2929	   Ericsson Research
2930	   Ericsson AB
2931	   SE-164 80 Stockholm, SWEDEN

2933	   Phone: +46 8 7190000
2934	   EMail: magnus.westerlund@ericsson.com

2936	   Bo Burman
2937	   Ericsson Research
2938	   Ericsson AB
2939	   SE-164 80 Stockholm, SWEDEN

2941	   Phone: +46 8 7190000
2942	   EMail: bo.burman@ericsson.com

2944	Full Copyright Statement

2946	   Copyright (C) The IETF Trust (2007).

2948	   This document is subject to the rights, licenses and restrictions
2949	   contained in BCP 78, and except as set forth therein, the authors
2950	   retain all their rights.

2952	   This document and the information contained herein are provided on an
2953	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2954	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST
2955	   AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
2956	   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
2957	   THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
2958	   IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
2959	   PURPOSE.

2961	Intellectual Property

2963	   The IETF takes no position regarding the validity or scope of any
2964	   Intellectual Property Rights or other rights that might be claimed to
2965	   pertain to the implementation or use of the technology described in
2966	   this document or the extent to which any license under such rights
2967	   might or might not be available; nor does it represent that it has
2968	   made any independent effort to identify any such rights.  Information
2969	   on the procedures with respect to rights in RFC documents can be
2970	   found in BCP 78 and BCP 79.

2972	   Copies of IPR disclosures made to the IETF Secretariat and any
2973	   assurances of licenses to be made available, or the result of an
2974	   attempt made to obtain a general license or permission for the use of
2975	   such proprietary rights by implementers or users of this
2976	   specification can be obtained from the IETF on-line IPR repository at
2977	   http://www.ietf.org/ipr.

2979	   The IETF invites any interested party to bring to its attention any
2980	   copyrights, patents or patent applications, or other proprietary
2981	   rights that may cover technology that may be required to implement
2982	   this standard.  Please address the information to the IETF at
2983	   ietf-ipr@ietf.org.

2985	Acknowledgement

2987	   Funding for the RFC Editor function is provided by the IETF
2988	   Administrative Support Activity (IASA).

2990	RFC Editor Considerations

2992	   The RFC editor is requested to replace all occurrences of XXXX with
2993	   the RFC number this document receives.