idnits 2.17.1 

draft-ietf-avt-avpf-ccm-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 19.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 2962.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2973.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2980.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2986.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The document has examples using IPv4 documentation addresses according
     to RFC6890, but does not use any IPv6 documentation addresses.  Maybe
     there should be IPv6 examples, too?


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  == Line 756 has weird spacing: '...sg type    mul...'

  == Line 1143 has weird spacing: '...     ab  c   s...'

  == Line 1145 has weird spacing: '...     ba   s...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 26, 2007) is 6020 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFCxxxx' is mentioned on line 2811, but not defined

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226)

  ** Obsolete normative reference: RFC 4234 (Obsoleted by RFC 5234)

  -- Obsolete informational reference (is this intentional?): RFC 2032
     (Obsoleted by RFC 4587)

  == Outdated reference: A later version (-12) exists of
     draft-ietf-avt-profile-savpf-11

  -- Obsolete informational reference (is this intentional?): RFC 3525
     (Obsoleted by RFC 5125)

  -- Obsolete informational reference (is this intentional?): RFC 3448
     (Obsoleted by RFC 5348)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-avt-topologies-06

  == Outdated reference: A later version (-13) exists of
     draft-levin-mmusic-xml-media-control-11


     Summary: 4 errors (**), 0 flaws (~~), 8 warnings (==), 11 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                   Stephan Wenger
3	INTERNET-DRAFT                                           Umesh Chandra
4	Expires: April 2008                                              Nokia
5	Intended Status: Proposed Standard                   Magnus Westerlund
6	                                                             Bo Burman
7	                                                              Ericsson
8	                                                      October 26, 2007

10	                       Codec Control Messages in the
11	               RTP Audio-Visual Profile with Feedback (AVPF)
12	                      <draft-ietf-avt-avpf-ccm-10.txt>

14	Status of this Memo

16	   By submitting this Internet-Draft, each author represents that any
17	   applicable patent or other IPR claims of which he or she is aware
18	   have been or will be disclosed, and any of which he or she becomes
19	   aware will be disclosed, in accordance with Section 6 of BCP 79.

21	   Internet-Drafts are working documents of the Internet Engineering
22	   Task Force (IETF), its areas, and its working groups.  Note that
23	   other groups may also distribute working documents as Internet-
24	   Drafts.

26	   Internet-Drafts are draft documents valid for a maximum of six
27	   months and may be updated, replaced, or obsoleted by other documents
28	   at any time.  It is inappropriate to use Internet-Drafts as
29	   reference material or to cite them other than as "work in progress."

31	   The list of current Internet-Drafts can be accessed at
32	   http://www.ietf.org/ietf/1id-abstracts.txt.

34	   The list of Internet-Draft Shadow Directories can be accessed at
35	   http://www.ietf.org/shadow.html.

37	Copyright Notice

39	   Copyright (C) The IETF Trust (2007).

41	Abstract

43	   This document specifies a few extensions to the messages defined in
44	   the Audio-Visual Profile with Feedback (AVPF).  They are helpful
45	   primarily in conversational multimedia scenarios where centralized
46	   multipoint functionalities are in use.  However, some are also
47	   usable in smaller multicast environments and point-to-point calls.

49	   The extensions discussed are messages related to the ITU-T H.271
50	   Video Back Channel, Full Intra Request, Temporary Maximum Media
51	   Stream Bit Rate and Temporal Spatial Trade-off.

53	TABLE OF CONTENTS

55	1.   Introduction..................................................5
56	2.   Definitions...................................................6
57	   2.1. Glossary...................................................6
58	   2.2. Terminology................................................6
59	   2.3. Topologies.................................................9
60	3.   Motivation...................................................10
61	   3.1. Use Cases.................................................10
62	   3.2. Using the Media Path......................................12
63	   3.3. Using AVPF................................................13
64	      3.3.1. Reliability..........................................13
65	   3.4. Multicast.................................................13
66	   3.5. Feedback Messages.........................................13
67	      3.5.1. Full Intra Request Command...........................13
68	         3.5.1.1. Reliability.....................................14
69	      3.5.2. Temporal Spatial Trade-off Request and Notification..15
70	         3.5.2.1. Point-to-Point..................................16
71	         3.5.2.2. Point-to-Multipoint Using Multicast or Translators 16
72	         3.5.2.3. Point-to-Multipoint Using RTP Mixer.............17
73	         3.5.2.4. Reliability.....................................17
74	      3.5.3. H.271 Video Back Channel Message.....................18
75	         3.5.3.1. Reliability.....................................20
76	      3.5.4. Temporary Maximum Media Stream Bit Rate Request and
77	      Notification................................................20
78	         3.5.4.1. Behavior for media receivers using TMMBR........23
79	         3.5.4.2. Algorithm for establishing current limitations..24
80	         3.5.4.3. Use of TMMBR in a Mixer Based Multipoint Operation 31
81	         3.5.4.4. Use of TMMBR in Point-to-Multipoint Using
82	                  Multicast or Translators........................32
83	         3.5.4.5. Use of TMMBR in Point-to-point operation........32
84	         3.5.4.6. Reliability.....................................33
85	4.   RTCP Receiver Report Extensions..............................34
86	   4.1. Design Principles of the Extension Mechanism..............34
87	   4.2. Transport Layer Feedback Messages.........................35
88	      4.2.1. Temporary Maximum Media Stream Bit Rate Request (TMMBR) 36
89	         4.2.1.1. Message Format..................................36
90	         4.2.1.2. Semantics.......................................37
91	         4.2.1.3. Timing Rules....................................41
92	         4.2.1.4. Handling in Translator and Mixers...............41
93	      4.2.2. Temporary Maximum Media Stream Bit Rate Notification
94	             (TMMBN)..............................................41
95	         4.2.2.1. Message Format..................................41
96	         4.2.2.2. Semantics.......................................42
97	         4.2.2.3. Timing Rules....................................43
98	         4.2.2.4. Handling by Translators and Mixers..............43
99	   4.3. Payload Specific Feedback Messages........................43
100	      4.3.1. Full Intra Request (FIR).............................44
101	         4.3.1.1. Message Format..................................44
102	         4.3.1.2. Semantics.......................................45
103	         4.3.1.3. Timing Rules....................................46
104	         4.3.1.4. Handling of FIR Message in Mixer and Translators 46
105	         4.3.1.5. Remarks.........................................46
106	      4.3.2. Temporal-Spatial Trade-off Request (TSTR)............48
107	         4.3.2.1. Message Format..................................48
108	         4.3.2.2. Semantics.......................................49
109	         4.3.2.3. Timing Rules....................................49
110	         4.3.2.4. Handling of message in Mixers and Translators...50
111	         4.3.2.5. Remarks.........................................50
112	      4.3.3. Temporal-Spatial Trade-off Notification (TSTN).......50
113	         4.3.3.1. Message Format..................................50
114	         4.3.3.2. Semantics.......................................51
115	         4.3.3.3. Timing Rules....................................52
116	         4.3.3.4. Handling of TSTN in Mixer and Translators.......52
117	         4.3.3.5. Remarks.........................................52
118	      4.3.4. H.271 Video Back Channel Message (VBCM)..............52
119	         4.3.4.1. Message Format..................................52
120	         4.3.4.2. Semantics.......................................53
121	         4.3.4.3. Timing Rules....................................55
122	         4.3.4.4. Handling of message in Mixer or Translator......55
123	         4.3.4.5. Remarks.........................................55
124	5.   Congestion Control...........................................55
125	6.   Security Considerations......................................56
126	7.   SDP Definitions..............................................57
127	   7.1. Extension of the rtcp-fb Attribute........................57
128	   7.2. Offer-Answer..............................................59
129	   7.3. Examples..................................................59
130	8.   IANA Considerations..........................................63
131	9.   Contributors.................................................64
132	10.  Acknowledgements.............................................64
133	11.  References...................................................65
134	   11.1. Normative references.....................................65
135	   11.2. Informative references...................................65
136	12.  Authors' Addresses...........................................67
137	1. Introduction

139	   When the Audio-Visual Profile with Feedback (AVPF) [RFC4585] was
140	   developed, the main emphasis lay in the efficient support of point-
141	   to-point and small multipoint scenarios without centralized
142	   multipoint control.  However, in practice, many small multipoint
143	   conferences operate utilizing devices known as Multipoint Control
144	   Units (MCUs).  Long-standing experience of the conversational video
145	   conferencing industry suggests that there is a need for a few
146	   additional feedback messages, to support centralized multipoint
147	   conferencing efficiently.  Some of the messages have applications
148	   beyond centralized multipoint, and this is indicated in the
149	   description of the message.  This is especially true for the message
150	   intended to carry ITU-T Rec. H.271 [H.271] bit strings for Video
151	   Back Channel messages.

153	   In Real-time Transport Protocol (RTP) [RFC3550] terminology, MCUs
154	   comprise mixers and translators.  Most MCUs also include signaling
155	   support.  During the development of this memo, it was noticed that
156	   there is considerable confusion in the community related to the use
157	   of terms such as mixer, translator, and MCU.  In response to these
158	   concerns, a number of topologies have been identified that are of
159	   practical relevance to the industry, but are not documented in
160	   sufficient detail in [RFC3550].  These topologies are documented in
161	   [Topologies], and understanding this memo requires previous or
162	   parallel study of [Topologies].

164	   Some of the messages defined here are forward only, in that they do
165	   not require an explicit notification to the message emitter that
166	   they have been received and/or indicating the message receiver's
167	   actions.  Other messages require a response, leading to a two-way
168	   communication model that one could view as useful for control
169	   purposes.  However, it is not the intention of this memo to open up
170	   RTP Control Protocol (RTCP) to a generalized control protocol.  All
171	   mentioned messages have relatively strict real-time constraints, in
172	   the sense that their value diminishes with increased delay.  This
173	   makes the use of more traditional control protocol means, such as
174	   Session Initiation Protocol (SIP) [RFC3261], undesirable when used
175	   for the same purpose.  That is why this solution is recommended
176	   instead of "XML Schema for Media Control" [XML-MC], which uses SIP
177	   Info to transfer XML messages with semantics similar to those
178	   defined in this memo.  Furthermore, all messages are of a very
179	   simple format that can be easily processed by an RTP/RTCP
180	   sender/receiver.  Finally, and most importantly, all messages relate
181	   only to the RTP stream with which they are associated, and not to
182	   any other property of a communication system.  In particular, none
183	   of them relate to the properties of the access links traversed by
184	   the session.

186	2. Definitions

188	2.1. Glossary

190	   AIMD   - Additive Increase Multiplicative Decrease
191	   AVPF   - The extended RTP profile for RTCP-based feedback
192	   FEC    - Forward Error Correction
193	   FCI    - Feedback Control Information [RFC4585]
194	   FIR    - Full Intra Request
195	   MCU    - Multipoint Control Unit
196	   MPEG   - Moving Picture Experts Group
197	   TMMBN  - Temporary Maximum Media Stream Bit Rate Notification
198	   TMMBR  - Temporary Maximum Media Stream Bit Rate Request
199	   PLI    - Picture Loss Indication
200	   PR     - Packet rate
201	   QP     - Quantizer Parameter
202	   RTT    - Round trip time
203	   SSRC   - Synchronization Source
204	   TSTN   - Temporal Spatial Trade-off Notification
205	   TSTR   - Temporal Spatial Trade-off Request
206	   VBCM   - Video Back Channel Message

208	2.2. Terminology

210	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
211	   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in
212	   this document are to be interpreted as described in RFC 2119
213	   [RFC2119].

215	      Message:
216	          An RTCP feedback message [RFC4585] defined by this
217	          specification, of one of the following types:

219	          Request:
220	              Message that requires acknowledgement

222	          Command:
223	              Message that forces the receiver to an action

225	          Indication:
226	              Message that reports a situation

228	          Notification:
229	              Message that provides a notification that an event has
230	              occurred. Notifications are commonly generated in
231	              response to a Request.

233	          Note that, with the exception of "Notification", this
234	          terminology is in alignment with ITU-T Rec. H.245 [H245].

236	     Decoder Refresh Point:
237	          A bit string, packetized in one or more RTP packets, which
238	          completely resets the decoder to a known state.

240	          Examples for "hard" decoder refresh points are Intra pictures
241	          in H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 part 2, and
242	          Instantaneous Decoder Refresh (IDR) pictures in H.264.
243	          "Gradual" decoder refresh points may also be used; see for
244	          example [AVC].  While both "hard" and "gradual" decoder
245	          refresh points are acceptable in the scope of this
246	          specification, in most cases the user experience will benefit
247	          from using a "hard" decoder refresh point.

249	          A decoder refresh point also contains all header information
250	          above the picture layer (or equivalent, depending on the
251	          video compression standard) that is conveyed in-band.  In
252	          H.264, for example, a decoder refresh point contains
253	          parameter set Network Abstraction Layer (NAL) units that
254	          generate parameter sets necessary for the decoding of the
255	          following slice/data partition NAL units (and that are not
256	          conveyed out of band).

258	   Decoding:
259	          The operation of reconstructing the media stream.

261	   Rendering:
262	          The operation of presenting (parts of) the reconstructed
263	          media stream to the user.

265	   Stream thinning:
266	          The operation of removing some of the packets from a media
267	          stream.  Stream thinning, preferably, is media-aware,
268	          implying that media packets are removed in the order of
269	          increasing relevance to the reproductive quality.  However,
270	          even when employing media-aware stream thinning, most media
271	          streams quickly lose quality when subjected to increasing
272	          levels of thinning.  Media-unaware stream thinning leads to
273	          even worse quality degradation.  In contrast to transcoding,
274	          stream thinning is typically seen as a computationally
275	          lightweight operation.

277	   Media:
278	          Often used (sometimes in conjunction with terms like bit
279	          rate, stream, sender ...) to identify the content of the
280	          forward RTP packet stream (carrying the codec data), to which
281	          the codec control message applies.

283	   Media Stream:
284	          The stream of RTP packets labeled with a single
285	          Synchronization Source (SSRC) carrying the media (and also in
286	          some cases repair information such as retransmission or
287	          Forward Error Correction (FEC) information).

289	   Total media bit rate:
290	          The total bits per second transferred in a media stream,
291	          measured at an observer-selected protocol layer and averaged
292	          over a reasonable timescale, the length of which depends on
293	          the application.  In general, a media sender and a media
294	          receiver will observe different total media bit rates for the
295	          same stream, first because they may have selected different
296	          reference protocol layers, and second, because of changes in
297	          per-packet overhead along the transmission path.  The goal
298	          with bit rate averaging is to be able to ignore any
299	          burstiness on very short timescales, below for example 100
300	          ms, introduced by scheduling or link layer packetization
301	          effects.
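
The averaging described above can be sketched as follows.  This is a
minimal illustration in Python, not part of this specification: the
function name and the (arrival time, size) packet representation are
assumptions chosen for the example, and packet sizes are assumed to
already include all headers down to the observer's selected protocol
layer.

```python
def average_total_bit_rate(packets, window_s):
    """Average total media bit rate (bits/s) over the most recent window.

    packets  -- list of (arrival_time_s, size_bytes), sorted by arrival
                time; size_bytes is assumed to include headers down to
                the observer's chosen protocol layer (hypothetical
                representation, not defined by this memo)
    window_s -- averaging window in seconds, chosen by the application
    """
    if not packets or window_s <= 0:
        return 0.0
    window_end = packets[-1][0]
    window_start = window_end - window_s
    # Count only the packets that arrived inside the window.
    bits = sum(size * 8 for t, size in packets if t > window_start)
    return bits / window_s


# Five 1000-byte packets, one every 0.5 s; average over the last 1 s.
trace = [(0.0, 1000), (0.5, 1000), (1.0, 1000), (1.5, 1000), (2.0, 1000)]
rate = average_total_bit_rate(trace, 1.0)
```

A longer window hides more short-term burstiness, at the cost of
reacting more slowly to genuine changes in the stream's rate.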

303	   Maximum total media bit rate:
304	          The upper limit on total media bit rate for a given media
305	          stream at a particular receiver and for its selected protocol
306	          layer. Note that this value cannot be measured on the
307	          received media stream; instead, it needs to be calculated or
308	          determined through other means, such as QoS negotiations or
309	          local resource limitations. Also note that this value is an
310	          average (on a timescale that is reasonable for the
311	          application) and that it may be different from the
312	          instantaneous bit rate seen by packets in the media stream.

314	   Overhead:
315	          All protocol header information required to convey a packet
316	          with media data from sender to receiver, from the application
317	          layer down to a pre-defined protocol level (for example down
318	          to, and including, the IP header).  Overhead may include, for
319	          example, IP, UDP, and RTP headers, any layer 2 headers, any
320	          Contributing Sources (CSRCs), RTP-Padding, and RTP header
321	          extensions.  Overhead excludes any RTP payload headers and
322	          the payload itself.

324	   Net media bit rate:
325	          The bit rate carried by a media stream, net of overhead.
326	          That is, the bits per second accounted for by encoded media,
327	          any applicable payload headers, and any directly associated
328	          meta payload information placed in the RTP packet.  A typical
329	          example of the latter is redundancy data provided by the use
330	          of RFC 2198 [RFC2198].  Note that, unlike the total media bit
331	          rate, the net media bit rate will have the same value at the
332	          media sender and at the media receiver unless any mixing or
333	          translating of the media has occurred.

335	          For a given observer, the total media bit rate for a media
336	          stream is equal to the sum of the net media bit rate and the
337	          per-packet overhead as defined above multiplied by the packet
338	          rate.
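
The relation in the preceding paragraph can be written out as a short
Python sketch.  The function and parameter names are hypothetical, and
the per-packet overhead is assumed to be given in bytes, so it is
converted to bits before being added to the net media bit rate.

```python
def total_media_bit_rate(net_bit_rate_bps, overhead_bytes, packet_rate_pps):
    """Total media bit rate (bits/s) seen by one observer.

    net_bit_rate_bps -- net media bit rate in bits per second
    overhead_bytes   -- per-packet overhead at the observer's chosen
                        protocol layer, in bytes (assumed unit)
    packet_rate_pps  -- packet rate in packets per second
    """
    overhead_bps = overhead_bytes * 8 * packet_rate_pps
    return net_bit_rate_bps + overhead_bps


# 64 kbit/s of net media sent as 50 packets/s, with 40 bytes of
# IPv4 (20) + UDP (8) + RTP (12) header overhead per packet.
total = total_media_bit_rate(64_000, 40, 50)
```

Because the overhead term scales with the packet rate, two observers
that count different protocol layers will in general compute different
totals for the same stream.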

340	   Feasible region:
341	          The set of all combinations of packet rate and net media bit
342	          rate that do not exceed the restrictions in maximum media bit
343	          rate placed on a given media sender by the Temporary Maximum
344	          Media Stream Bit Rate Request (TMMBR) messages it has
345	          received.  The feasible region will change as new TMMBR
346	          messages are received.
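
A membership test for the feasible region might look like the sketch
below.  It is illustrative only.  It assumes that each TMMBR
restriction can be represented as a pair (maximum total media bit rate
in bits per second, per-packet overhead in bytes); this representation
and all names are assumptions made for the example and are not defined
in this section.

```python
def is_feasible(net_bit_rate_bps, packet_rate_pps, restrictions):
    """Return True if the operating point satisfies every restriction.

    restrictions -- iterable of (max_total_bps, overhead_bytes) pairs,
                    one per received TMMBR restriction (assumed shape)
    """
    return all(
        net_bit_rate_bps + overhead_bytes * 8 * packet_rate_pps
        <= max_total_bps
        for max_total_bps, overhead_bytes in restrictions
    )


# Two hypothetical receivers with different limits and overheads.
limits = [(80_000, 40), (100_000, 100)]
ok_low_rate = is_feasible(64_000, 25, limits)   # fewer, larger packets
ok_high_rate = is_feasible(64_000, 50, limits)  # more, smaller packets
```

In this example the same net media bit rate is feasible at 25 packets
per second but not at 50, because the second restriction has a larger
per-packet overhead.  This is why the feasible region is defined over
combinations of packet rate and net media bit rate rather than over bit
rate alone.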

348	   Bounding set:
349	          The set of TMMBR tuples, selected from all those received at
350	          a given media sender, that define the feasible region for
351	          that media sender.  The media sender uses an algorithm such
352	          as that in section 3.5.4.2 to determine or iteratively
353	          approximate the current bounding set, and reports that set
354	          back to the media receivers in a Temporary Maximum Media
355	          Stream Bit Rate Notification (TMMBN) message.

357	2.3. Topologies

359	   Please refer to [Topologies] for an in-depth discussion.  The
360	   topologies referred to throughout this memo are labeled
361	   (consistently with [Topologies]) as follows:

363	   Topo-Point-to-Point . . . . . Point-to-point communication
364	   Topo-Multicast  . . . . . . . Multicast communication
365	   Topo-Translator . . . . . . . Translator based
366	   Topo-Mixer  . . . . . . . . . Mixer based
367	   Topo-RTP-switch-MCU . . . . . RTP stream switching MCU
368	   Topo-RTCP-terminating-MCU . . Mixer but terminating RTCP

370	3. Motivation

372	   This section discusses the motivation and usage of the different
373	   video and media control messages.  The video control messages have
374	   been under discussion for a long time, and a requirement draft was
375	   drawn up [Basso].  This draft has expired; however, we quote relevant
376	   sections of it to provide motivation and requirements.

378	3.1. Use Cases

380	   There are a number of possible usages for the proposed feedback
381	   messages.  Let us begin by looking through the use cases Basso et
382	   al. [Basso] proposed.  Some of the use cases have been reformulated
383	   and comments have been added.

385	   1. An RTP video mixer composes multiple encoded video sources into a
386	      single encoded video stream.  Each time a video source is added,
387	      the RTP mixer needs to request a decoder refresh point from the
388	      video source, so as to start an uncorrupted prediction chain on
389	      the spatial area of the mixed picture occupied by the data from
390	      the new video source.

392	   2. An RTP video mixer receives multiple encoded RTP video streams
393	      from conference participants, and dynamically selects one of the
394	      streams to be included in its output RTP stream.  At the time of
395	      a bit stream change (determined through means such as voice
396	      activation or the user interface), the mixer requests a decoder
397	      refresh point from the remote source, in order to avoid using
398	      unrelated content as reference data for inter picture prediction.
399	      After requesting the decoder refresh point, the video mixer stops
400	      the delivery of the current RTP stream and monitors the RTP
401	      stream from the new source until it detects data belonging to the
402	      decoder refresh point.  At that time, the RTP mixer starts
403	      forwarding the newly selected stream to the receiver(s).

405	   3. An application needs to signal to the remote encoder that the
406	      desired trade-off between temporal and spatial resolution has
407	      changed.  For example, one user may prefer a higher frame rate
408	      and a lower spatial quality, and another user may prefer the
409	      opposite.  This choice is also highly content dependent.  Many
410	      current video conferencing systems offer in the user interface a
411	      mechanism to make this selection, usually in the form of a
412	      slider.  The mechanism is helpful in point-to-point, centralized
413	      multipoint and non-centralized multipoint uses.

415	   4. Use case 4 of the Basso draft applies only to Picture Loss
416	      Indication (PLI) as defined in AVPF [RFC4585] and is not
417	      reproduced here.

419	   5. Use case 5 of the Basso draft relates to a mechanism known as
420	      "freeze picture request".  Sending freeze picture requests
421	      over a non-reliable forward RTCP channel has been identified as
422	      problematic.  Therefore, no freeze picture request has been
423	      included in this memo, and the use case discussion is not
424	      reproduced here.

426	   6. A video mixer dynamically selects one of the received video
427	      streams to be sent out to participants and tries to provide the
428	      highest bit rate possible to all participants, while minimizing
429	      stream trans-rating.  One way of achieving this is to set up
430	      sessions with endpoints using the maximum bit rate accepted by
431	      each endpoint, and accepted by the call admission method used by
432	      the mixer.  By means of commands that reduce the maximum media
433	      stream bit rate below what has been negotiated during session set
434	      up, the mixer can reduce the maximum bit rate sent by endpoints
435	      to the lowest of all the accepted bit rates.  As the lowest
436	      accepted bit rate changes due to endpoints joining and leaving or
437	      due to network congestion, the mixer can adjust the limits at
438	      which endpoints can send their streams to match the new value.
439	      The mixer then requests a new maximum bit rate, which is equal to
440	      or less than the maximum bit rate negotiated at session setup for
441	      a specific media stream, and the remote endpoint can respond with
442	      the actual bit rate that it can support.

444	   The picture Basso et al. draw up covers most applications we
445	   foresee.  However, we would like to extend the list with two
446	   additional use cases:

448	   7. Currently deployed congestion control algorithms (AIMD and TFRC
449	      [RFC3448]) probe for additional available capacity as long as
450	      there is something to send.  With congestion control algorithms
451	      using packet loss as the indication for congestion, this probing
452	      generally results in reduced media quality (often to a point
453	      where the distortion is large enough to make the media unusable),
454	      due to packet loss and increased delay.

456	      In a number of deployment scenarios, especially cellular ones,
457	      the bottleneck link is often the last hop link.  That cellular
458	      link also commonly has some type of QoS negotiation enabling the
459	      cellular device to learn the maximal bit rate available over this
460	      last hop.  A media receiver behind this link can, in most (if not
461	      all) cases, calculate at least an upper bound for the bit rate
462	      available for each media stream it presently receives.  How this
463	      is done is an implementation detail and not discussed herein.
464	      Indicating the maximum available bit rate to the transmitting
465	      party for the various media streams can be beneficial to prevent
466	      that party from probing for bandwidth for this stream in excess
467	      of a known hard limit.  For cellular or other mobile devices, the
468	      known available bit rate for each stream (deduced from the link
469	      bit rate) can change quickly, due to handover to another
470	      transmission technology, QoS renegotiation due to congestion,
471	      etc.  To enable minimal disruption of service, quick convergence
472	      is necessary, and therefore media path signaling is desirable.

474	   8. The use of reference picture selection (RPS) as an error
475	      resilience tool was introduced in 1997 as NEWPRED [NEWPRED],
476	      and is now widely deployed.  When RPS is in use, simplistically
477	      put, the receiver can send a feedback message to the sender,
478	      indicating a reference picture that should be used for future
479	      prediction.  ([NEWPRED] mentions other forms of feedback as
480	      well.)  AVPF contains a mechanism for conveying such a message,
481	      but does not specify the codec or the syntax to which the message
482	      should conform.  Recently, the ITU-T finalized Rec. H.271, which
483	      (among other message types) also includes a feedback message.  It
484	      is expected that this feedback message will fairly quickly enjoy
485	      wide support.  Therefore, a mechanism to convey feedback messages
486	      according to H.271 appears to be desirable.

488	3.2. Using the Media Path

490	   There are two reasons why we use the media path for the codec
491	   control messages.

493	   First, systems employing MCUs often separate the control and media
494	   processing parts.  As these messages are intended for or generated
495	   by the media part rather than the signaling part of the MCU, having
496	   them on the media path avoids transmission across interfaces and
497	   unnecessary control traffic between signaling and processing.  If
498	   the MCU is physically decomposed, the use of the media path avoids
499	   the need for media control protocol extensions (e.g. in MEGACO
500	   [RFC3525]).

502	   Secondly, the signaling path quite commonly contains several
503	   signaling entities, e.g. SIP proxies and application servers.
504	   Avoiding going through signaling entities avoids delay for several
505	   reasons.  Proxies have less stringent delay requirements than media
506	   processing and due to their complex and more generic nature may
507	   result in significant processing delay.  The topological locations
508	   of the signaling entities are also commonly not optimized for
509	   minimal delay, but rather towards other architectural goals.  Thus,
510	   the signaling path can be significantly longer in both the
511	   geographical and the delay sense.

513	3.3. Using AVPF

515	   The AVPF feedback message framework [RFC4585] provides the
516	   appropriate framework to implement the new messages.  AVPF
517	   implements rules controlling the timing of feedback messages to
518	   avoid congestion through network flooding by RTCP traffic.  We re-
519	   use these rules by referencing AVPF.

521	   The signaling setup for AVPF allows each individual type of function
522	   to be configured or negotiated on an RTP session basis.

524	3.3.1. Reliability

526	   The use of RTCP messages implies that each message transfer is
527	   unreliable, unless the lower layer transport provides reliability.
528	   The different messages proposed in this specification have different
529	   requirements in terms of reliability.  However, in all cases, the
530	   reaction to an (occasional) loss of a feedback message is specified.

532	3.4. Multicast

534	   The codec control messages might be used with multicast.  The RTCP
535	   timing rules specified in [RFC3550] and [RFC4585] ensure that the
536	   messages do not cause overload of the RTCP connection.  The use of
537	   multicast may result in the reception of messages with inconsistent
538	   semantics.  The reaction to inconsistencies depends on the message
539	   type, and is discussed for each message type separately.

541	3.5. Feedback Messages

543	   This section describes the semantics of the different feedback
544	   messages and how they apply to the different use cases.

546	3.5.1. Full Intra Request Command

548	   A Full Intra Request (FIR) Command, when received by the designated
549	   media sender, requires the media sender to send a Decoder Refresh
550	   Point (see 2.2) at the earliest opportunity.  The evaluation of such
551	   an opportunity includes the current encoder coding strategy and the
552	   current available network resources.

554	   FIR is also known as an "instantaneous decoder refresh request",
555	   "fast video update request" or "video fast update request".

557	   Using a decoder refresh point implies refraining from using any
558	   picture sent prior to that point as a reference for the encoding
559	   process of any subsequent picture sent in the stream.  For
560	   predictive media types that are not video, the analogue applies.
561	   For example, if in MPEG-4 systems scene updates are used, the
562	   decoder refresh point consists of the full representation of the
563	   scene and is not delta-coded relative to previous updates.

565	   Decoder refresh points, especially Intra or IDR pictures, are in
566	   general several times larger in size than predicted pictures.  Thus,
567	   in scenarios in which the available bit rate is small, the use of a
568	   decoder refresh point implies a delay that is significantly longer
569	   than the typical picture duration.

571	   Usage in multicast is possible; however, aggregation of the commands
572	   is recommended.  A media sender that receives a request closely
573	   after sending a decoder refresh point -- within 2 times the longest
574	   Round Trip Time (RTT) known, plus any AVPF-induced RTCP packet
575	   sending delays -- should await a second request message to ensure
576	   that the media receiver has not been served by the previously
577	   delivered decoder refresh point.  The reason for the specified delay
578	   is to avoid sending unnecessary decoder refresh points.  A session
579	   participant may have sent its own request while another
580	   participant's request was in-flight to them.  Suppressing those
581	   requests that may have been sent without knowledge about the other
582	   request avoids this issue.
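
   The waiting rule above can be sketched as a simple timing check.
   This is an illustrative sketch only and not part of the
   specification; the function name and its parameters are
   hypothetical, and all times are assumed to be in seconds.

```python
def should_await_second_request(now, last_refresh_sent, longest_rtt, avpf_delay):
    """Return True if a request arrived so soon after the last decoder
    refresh point that the media sender should await a second request."""
    if last_refresh_sent is None:
        return False  # no refresh point sent yet, so none can be in flight
    # Window: 2 times the longest known RTT plus AVPF-induced sending delays.
    window = 2 * longest_rtt + avpf_delay
    return (now - last_refresh_sent) < window
```

   A request that arrives after the window has elapsed is assumed to
   come from a participant that the earlier refresh point did not help.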

584	   Using the FIR command to recover from errors is explicitly
585	   disallowed, and instead the PLI message defined in AVPF [RFC4585]
586	   should be used.  The PLI message reports lost pictures and has been
587	   included in AVPF for precisely that purpose.

589	   Full Intra Request is applicable in use-cases 1 and 2.

591	3.5.1.1. Reliability

593	   The FIR message results in the delivery of a decoder refresh point,
594	   unless the message is lost.  Decoder refresh points are easily
595	   identifiable from the bit stream.  Therefore, there is no need for
596	   protocol-level notification, and a simple command repetition
597	   mechanism is sufficient for ensuring the level of reliability
598	   required.  However, the potential use of repetition does require a
599	   mechanism to prevent the recipient from responding to messages
600	   already received and responded to.

602	   To ensure the best possible reliability, a sender of FIR may repeat
603	   the FIR request until the desired content has been received.  The
604	   repetition interval is determined by the RTCP timing rules
605	   applicable to the session.  Upon reception of a complete decoder
606	   refresh point or the detection of an attempt to send a decoder
607	   refresh point (which got damaged due to a packet loss), the
608	   repetition of the FIR must stop.  If another FIR is necessary, the
609	   request sequence number must be increased.  A FIR sender shall not
610	   have more than one FIR request (different request sequence number)
611	   outstanding at any time per media sender in the session.
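
   The requester-side rules above might be tracked with a small state
   object, sketched below.  The class and method names are
   hypothetical.  The modulo-256 wrap assumes an 8-bit request
   sequence number field, and the pacing of repetitions (governed by
   the RTCP timing rules) is deliberately not modeled.

```python
class FirRequester:
    """Tracks FIR state toward one media sender (hypothetical helper)."""

    def __init__(self):
        self.seq = 0              # sequence number of the latest FIR
        self.outstanding = False  # True while a FIR awaits its refresh point

    def new_request(self):
        """Start a new FIR; only one may be outstanding per media sender."""
        if self.outstanding:
            raise RuntimeError("a FIR is already outstanding for this sender")
        self.seq = (self.seq + 1) % 256  # assumed 8-bit sequence number
        self.outstanding = True
        return self.seq

    def repetition(self):
        """Sequence number to repeat, or None if nothing is outstanding."""
        return self.seq if self.outstanding else None

    def on_refresh_point(self):
        """A refresh point (complete or visibly damaged) was detected."""
        self.outstanding = False
```

   Repetitions reuse the same sequence number, which is what lets the
   media sender tell a repetition from a genuinely new request.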

613	   The receiver of FIR (i.e. the media sender) behaves in complementary
614	   fashion to ensure delivery of a decoder refresh point.  If it
615	   receives repetitions of the FIR more than 2*RTT after it has sent a
616	   decoder refresh point, it shall send a new decoder refresh point.
617	   Two round trip times allow time for the decoder refresh point to
618	   arrive back to the requestor and for the end of repetitions of FIR
619	   to reach and be detected by the media sender.
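
   A minimal sketch of the complementary media sender behavior
   follows.  It covers only the repetition rule of this subsection for
   requests identified by requester SSRC and sequence number; the
   class name, the use of seconds, and the single shared RTT estimate
   are assumptions made for illustration.

```python
class FirResponder:
    """Decides whether a received FIR needs a new decoder refresh point."""

    def __init__(self, rtt):
        self.rtt = rtt            # round trip time estimate, in seconds
        self.last_seq = {}        # requester SSRC -> last sequence number seen
        self.last_refresh = None  # time the last refresh point was sent

    def on_fir(self, ssrc, seq, now):
        """Return True if a new decoder refresh point should be sent."""
        repeated = self.last_seq.get(ssrc) == seq
        recent = (self.last_refresh is not None
                  and now - self.last_refresh <= 2 * self.rtt)
        if repeated and recent:
            # Already answered; the refresh point may still be in flight.
            return False
        # New request, or a repetition more than 2*RTT after the last
        # refresh point: send a (new) decoder refresh point.
        self.last_seq[ssrc] = seq
        self.last_refresh = now
        return True
```

   A repetition that arrives within 2*RTT of the last refresh point is
   ignored, while a later repetition is taken as a sign that the
   refresh point was lost.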

621	   An RTP mixer or RTP switching MCU that receives a FIR from a media
622	   receiver is responsible for ensuring that a decoder refresh point is
623	   delivered to the requesting receiver.  It may be necessary for the
624	   mixer/MCU to generate FIR commands.  From a reliability perspective,
625	   the two legs (FIR-requesting endpoint to mixer/MCU, and mixer/MCU to
626	   decoder refresh point generating endpoint) are handled independently
627	   from each other.

629	3.5.2. Temporal Spatial Trade-off Request and Notification

631	   The Temporal Spatial Trade-off Request (TSTR) instructs the video
632	   encoder to change its trade-off between temporal and spatial
633	   resolution.  Index values from 0 to 31 indicate monotonically a
634	   desire for higher frame rate.  That is, a requester asking for an
635	   index of 0 prefers high spatial quality and is willing to accept a
636	   low frame rate, whereas a requester asking for 31 wishes a high
637	   frame rate, potentially at the cost of low spatial quality.

639	   In general the encoder reaction time may be significantly longer
640	   than the typical picture duration.  See use case 3 for an example.
641	   The encoder decides whether and to what extent the request results
642	   in a change of the trade-off.  It returns a Temporal Spatial Trade-
643	   Off Notification (TSTN) message to indicate the trade-off that it
644	   will use henceforth.

646	   TSTR and TSTN have been introduced primarily because it is believed
647	   that control protocol mechanisms, e.g. a SIP re-INVITE, are too
648	   heavyweight and too slow to allow for a reasonable user experience.
649	   Consider, for example, a user interface where the remote user
650	   selects the temporal/spatial trade-off with a slider.  Immediate
651	   feedback to any slider movement is required for a reasonable user
652	   experience.  A SIP re-INVITE [RFC3261] would require at least two
653	   more round trips (compared to the TSTR/TSTN mechanism) and may
654	   involve proxies and other complex mechanisms.  Even in a
655	   well-designed system, it could take a second or so until the new
656	   trade-off is finally selected.  Furthermore, the use of RTCP solves
657	   the multicast use case very efficiently.

659	   The use of TSTR and TSTN in multipoint scenarios is a non-trivial
660	   subject, and can be achieved in many implementation-specific ways.
661	   Problems stem from the fact that TSTRs will typically arrive
662	   unsynchronized, and may request different trade-off values for the
663	   same stream and/or endpoint encoder.  This memo does not specify a
664	   translator's, mixer's or endpoint's reaction to the reception of a
665	   suggested trade-off as conveyed in the TSTR.  We only require the
666	   receiver of a TSTR message to reply to it by sending a TSTN,
667	   carrying the new trade-off chosen by its own criteria (which may or
668	   may not be based on the trade-off conveyed by the TSTR).  In other
669	   words, the trade-off sent in TSTR is a non-binding recommendation,
670	   nothing more.

672	   Three TSTR/TSTN scenarios need to be distinguished, based on the
673	   topologies described in [Topologies].  The scenarios are described
674	   in the following sub-clauses.

676	3.5.2.1. Point-to-Point

678	   In this most trivial case (Topo-Point-to-Point), the media sender
679	   typically adjusts its temporal/spatial trade-off based on the
680	   requested value in TSTR, subject to its own capabilities.  The TSTN
681	   message conveys back the new trade-off value (which may be identical
682	   to the old one if, for example, the sender is not capable of
683	   adjusting its trade-off).

685	3.5.2.2. Point-to-Multipoint Using Multicast or Translators

687	   RTCP Multicast is used either with media multicast according to
688	   Topo-Multicast, or following RFC 3550's translator model according
689	   to Topo-Translator.  In these cases, unsynchronized TSTR messages
690	   from different receivers may be received, possibly with different
691	   requested trade-offs (because of different user preferences).  This
692	   memo does not specify how the media sender tunes its trade-off.
693	   Possible strategies include selecting the mean or median of all
694	   trade-off requests received, giving priority to certain
695	   participants, or continuing to use the previously selected trade-off
696	   (e.g. when the sender is not capable of adjusting it).  Again, all
697	   TSTR messages need to be acknowledged by TSTN, and the value
698	   conveyed back has to reflect the decision made.
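
   One of the possible strategies named above, selecting the median of
   the received requests, could look like the following sketch.  The
   function name and parameters are hypothetical, and the low median
   is an arbitrary choice made here so that the result is always an
   index some receiver actually requested; this memo does not mandate
   any particular strategy.

```python
from statistics import median_low

def choose_tradeoff(requested, current, adjustable=True):
    """Pick the trade-off index (0..31) to report back in TSTN."""
    if not adjustable or not requested:
        return current  # keep the previously selected trade-off
    for index in requested:
        if not 0 <= index <= 31:
            raise ValueError("trade-off index must be in the range 0..31")
    return median_low(requested)
```

   Whatever value is chosen, it is the one conveyed back in the TSTN
   sent in response to each TSTR.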

700	3.5.2.3. Point-to-Multipoint Using RTP Mixer

702	   In this scenario (Topo-Mixer) the RTP mixer receives all TSTR
703	   messages, and has the opportunity to act on them based on its own
704	   criteria.  In most cases, the mixer should form a "consensus" of
705	   potentially conflicting TSTR messages arriving from different
706	   participants, and initiate its own TSTR message(s) to the media
707	   sender(s).  As in the previous scenario, the strategy for forming
708	   this "consensus" is up to the implementation, and can, for example,
709	   encompass averaging the participants' request values, giving
710	   priority to certain participants, or using session default values.

712	   Even if a mixer or translator performs transcoding, it is very
713	   difficult to deliver media with the requested trade-off, unless the
714	   content the mixer or translator receives is already close to that
715	   trade-off.  Thus, if the mixer changes its trade-off, it needs to
716	   request the media sender(s) to use the new value, by creating a TSTR
717	   of its own.  Upon reaching a decision on the used trade-off it
718	   includes that value in the acknowledgement to the downstream
719	   requestors.  Only in cases where the original source has
720	   substantially higher quality (and bit rate) is it likely that
721	   transcoding alone can result in the requested trade-off.

723	3.5.2.4. Reliability

725	   A request and reception acknowledgement mechanism is specified.  The
726	   Temporal Spatial Trade-off Notification (TSTN) message informs the
727	   requester that its request has been received, and what trade-off is
728	   used henceforth.  This acknowledgement mechanism is desirable for at
729	   least the following reasons:

731	   o A change in the trade-off cannot be directly identified from the
732	     media bit stream.
733	   o User feedback cannot be implemented without knowing the chosen
734	     trade-off value, according to the media sender's constraints.
735	   o Repetitive sending of messages requesting an unimplementable
736	     trade-off can be avoided.

738	3.5.3. H.271 Video Back Channel Message

740	   ITU-T Rec. H.271 defines syntax, semantics, and suggested encoder
741	   reaction to a video back channel message.  The structure defined in
742	   this memo is used to transparently convey such a message from media
743	   receiver to media sender.  In this memo, we refrain from an in-depth
744	   discussion of the available code points within H.271 and refer to
745	   the specification text [H.271] instead.

747	   However, we note that some H.271 messages bear similarities with
748	   native messages of AVPF and this memo.  Furthermore, we note that
749	   some H.271 messages are known to require caution in multicast
750	   environments -- or are plainly not usable in multicast or multipoint
751	   scenarios.  Table 1 provides a brief, oversimplified overview of the
752	   messages currently defined in H.271, their roughly corresponding
753	   AVPF or CCM messages (the latter as specified in this memo), and an
754	   indication of our current knowledge of their multicast safety.

756	   H.271 msg type      AVPF/CCM msg type    multicast-safe
757	   --------------------------------------------------------------------
758	   0 (when used for
759	     reference picture
760	     selection)         AVPF RPSI       No (positive ACK of pictures)
761	   1 picture loss       AVPF PLI        Yes
762	   2 partial loss       AVPF SLI        Yes
763	   3 one parameter CRC  N/A             Yes (no required sender action)
764	   4 all parameter CRC  N/A             Yes (no required sender action)
765	   5 refresh point      CCM FIR         Yes

767	   Table 1: H.271 messages and their AVPF/CCM equivalents

769	          Note: H.271 message type 0 is not a strict equivalent to
770	          AVPF's Reference Picture Selection Indication (RPSI); it is
771	          an indication of known-as-correct reference picture(s) at the
772	          decoder.  It does not command an encoder to use a defined
773	          reference picture (the form of control information envisioned
774	          to be carried in RPSI).  However, it is believed and intended
775	          that H.271 message type 0 will be used for the same purpose
776	          as AVPF's RPSI -- although other use forms are also possible.
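
   For implementers, Table 1 can be captured as a small lookup
   structure, sketched below.  The names are hypothetical, and the
   entries merely restate the table; in particular, the entry for
   type 0 applies only when that message is used for reference picture
   selection.

```python
# H.271 message type -> (roughly corresponding AVPF/CCM message,
#                        multicast-safe according to Table 1)
H271_EQUIVALENTS = {
    0: ("AVPF RPSI", False),  # only when used for reference picture selection
    1: ("AVPF PLI", True),
    2: ("AVPF SLI", True),
    3: (None, True),          # no equivalent; no required sender action
    4: (None, True),          # no equivalent; no required sender action
    5: ("CCM FIR", True),
}

def native_alternative(h271_type):
    """Return the roughly corresponding AVPF/CCM message, or None."""
    return H271_EQUIVALENTS[h271_type][0]

def is_multicast_safe(h271_type):
    """Return the multicast-safety indication from Table 1."""
    return H271_EQUIVALENTS[h271_type][1]
```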

778	   In response to the opaqueness of the H.271 messages, especially with
779	   respect to the multicast safety, the following guidelines MUST be
780	   followed when an implementation wishes to employ the H.271 video
781	   back channel message:

783	   1. Implementations utilizing the H.271 feedback message MUST stay in
784	      compliance with congestion control principles, as outlined in
785	      section 5.

787	   2. An implementation SHOULD utilize the IETF-native messages as
788	      defined in [RFC4585] and in this memo instead of similar messages
789	      defined in [H.271].  Our current understanding of similar
790	      messages is documented in Table 1 above.  One good reason to
791	      deviate from the SHOULD statement above would be if it is clearly
792	      understood that, for a given application and video compression
793	      standard, the aforementioned "similarity" is not given, in
794	      contrast to what the table indicates.

796	   3. It has been observed that some of the H.271 code points currently
797	      in existence are not multicast-safe.  Therefore, the sensible
798	      thing to do is not to use the H.271 feedback message type in
799	      multicast environments.  It MAY be used only when all the issues
800	      mentioned later are fully understood by the implementer, and
801	      properly taken into account by all endpoints.  In all other
802	      cases, the H.271 message type MUST NOT be used in conjunction
803	      with multicast.

805	   4. It has been observed that even in centralized multipoint
806	      environments, where the mixer should theoretically be able to
807	      resolve issues as documented below, the implementation of such a
808	      mixer and cooperative endpoints is a very difficult and tedious
809	      task.  Therefore, H.271 messages MUST NOT be used in centralized
810	      multipoint scenarios, unless all the issues mentioned below are
811	      fully understood by the implementer, and properly taken into
812	      account by both mixer and endpoints.

814	   Issues to be taken into account when considering the use of H.271 in
815	   multipoint environments:

817	   1. Different state on different receivers.  In many environments it
818	      cannot be guaranteed that the decoder state of all media
819	      receivers is identical at any given point in time.  The most
820	      obvious reason for such a possible misalignment of state is a
821	      loss that occurs on the path to only one of many media receivers.
822	      However, there are other not so obvious reasons, such as recent
823	      joins to the multipoint conference (be it by joining the
824	      multicast group or through additional mixer output).  Different
825	      states can lead the media receivers to issue potentially
826	      contradicting H.271 messages (or one media receiver issuing an
827	      H.271 message that, when observed by the media sender, is not
828	      helpful for the other media receivers).  A naive reaction of the
829	      media sender to these contradicting messages can lead to
830	      unpredictable and annoying results.

832	   2. Combining messages from different media receivers in a media
833	      sender is a non-trivial task.  As reasons, we note that these
834	      messages may be contradicting each other, and that their
835	      transport is unreliable (there may well be other reasons).  In
836	      case of many H.271 messages (i.e. types 0, 2, 3, and 4), the
837	      algorithm for combining must be aware both of the
838	      network/protocol environment (i.e. with respect to congestion)
839	      and of the media codec employed, as H.271 messages of a given
840	      type can have different semantics for different media codecs.

842	   3. The suppression of requests may need to go beyond the basic
843	      mechanisms described in AVPF (which are driven exclusively by
844	      timing and transport considerations on the protocol level).  For
845	      example, a receiver is often required to refrain from (or delay)
846	      generating requests, based on information it receives from the
847	      media stream.  For instance, it makes no sense for a receiver to
848	      issue a FIR when a transmission of an Intra/IDR picture is
849	      ongoing.

851	   4. When using the non-multicast-safe messages (e.g. H.271 type 0
852	      positive ACK of received pictures/slices) in larger multicast
853	      groups, the media receiver will likely be forced to delay or even
854	      omit sending these messages.  For the media sender this looks
855	      like data has not been properly received (although it was
856	      received properly), and a naively implemented media sender reacts
857	      to these perceived problems where it should not.

859	3.5.3.1. Reliability

861	   H.271 Video Back Channel messages do not require reliable
862	   transmission, and confirmation of the reception of a message can be
863	   derived from the forward video bit stream.  Therefore, no specific
864	   reception acknowledgement is specified.

866	   With respect to re-sending rules, clause 3.5.1.1 applies.

868	3.5.4. Temporary Maximum Media Stream Bit Rate Request and Notification

870	   A receiver, translator or mixer uses the Temporary Maximum Media
871	   Stream Bit Rate Request (TMMBR, "timber") to request a sender to
872	   limit the maximum bit rate for a media stream (see 2.2) to, or
873	   below, the provided value.  The Temporary Maximum Media Stream Bit
874	   Rate Notification (TMMBN) contains the media sender's current view
875	   of the most limiting subset of the TMMBR-defined limits it has
876	   received, to help the participants to suppress TMMBR requests that
877	   would not further restrict the media sender.  The primary usage for
878	   the TMMBR/TMMBN messages is in a scenario with an MCU or mixer (use
879	   case 6), corresponding to Topo-Translator or Topo-Mixer, but also to
880	   Topo-Point-to-Point.

882	   Each temporary limitation on the media stream is expressed as a
883	   tuple.  The first component of the tuple is the maximum total media
884	   bit rate (as defined in section 2.2) that the media receiver is
885	   currently prepared to accept for this media stream.  The second
886	   component is the per-packet overhead that the media receiver has
887	   observed for this media stream at its chosen reference protocol
888	   layer.

890	   As indicated in section 2.2, the overhead as observed by the sender
891	   of the TMMBR (i.e. the media receiver) may differ from the overhead
892	   observed at the receiver of the TMMBR (i.e. the media sender) due to
893	   use of a different reference protocol layer at the other end or due
894	   to the intervention of translators or mixers that affect the amount
895	   of per packet overhead.  For example, a gateway in between the two
896	   that converts between IPv4 and IPv6 affects the per-packet overhead
897	   by 20 bytes.  Other mechanisms that change the overhead include
898	   tunnels.  The problem with varying overhead is also discussed in
899	   [RFC3890].  As will be seen in the description of the algorithm for
900	   use of TMMBR, the difference in perceived overhead between the
901	   sending and receiving ends presents no difficulty because
902	   calculations are carried out in terms of variables that have the
903	   same value at the sender as at the receiver -- for example, packet
904	   rate and net media rate.

906	   Reporting both maximum total media bit rate and per-packet overhead
907	   allows different receivers to provide bit rate and overhead values
908	   for different protocol layers, for example at the IP level, at the
909	   outer part of a tunnel protocol, or at the link layer.  The protocol
910	   level a peer reports on depends on the level of integration the peer
911	   has, as it needs to be able to extract the information from that
912	   protocol level.  For example, an application with no knowledge of
913	   the IP version it is running over cannot meaningfully determine the
914	   overhead of the IP header, and hence will not want to include IP
915	   overhead in the overhead or maximum total media bit rate
916	   calculation.

918	   It is expected that most peers will be able to report values at
919	   least for the IP layer.  In certain implementations it may be
920	   advantageous to also include information pertaining to the link
921	   layer, which in turn allows for a more precise overhead calculation
922	   and a better optimization of connectivity resources.

924	   The Temporary Maximum Media Stream Bit Rate messages are generic
925	   messages that can be applied to any RTP packet stream.  This
926	   separates them from the other codec control messages defined in this
927	   specification, which apply only to specific media types or payload
928	   formats.  The TMMBR functionality applies to the transport, and the
929	   requirements the transport places on the media encoding.

931	   The reasoning below assumes that the participants have negotiated a
932	   session maximum bit rate, using a signaling protocol.  This value
933	   can be global, for example in case of point-to-point, multicast, or
934	   translators.  It may also be local between the participant and the
935	   peer or mixer.  In either case, the bit rate negotiated in signaling
936	   is the one that the participant guarantees to be able to handle
937	   (depacketize and decode).  In practice, the connectivity of the
938	   participant also influences the negotiated value -- it does not make
939	   much sense to negotiate a total media bit rate that one's network
940	   interface does not support.

942	   It is also beneficial to have negotiated a maximum packet rate for
943	   the session or sender.  RFC 3890 provides an SDP [RFC4566] attribute
944	   that can be used for this purpose; however, that attribute is not
945	   usable in RTP sessions established using offer/answer [RFC3264].
946	   Therefore an optional maximum packet rate signaling parameter is
947	   specified in this memo.

949	   An already established maximum total media bit rate may be changed
950	   at any time, subject to the timing rules governing the sending of
951	   feedback messages. The limit may change to any value between zero
952	   and the session maximum, as negotiated during session establishment
953	   signaling.  However, even if a sender has received a TMMBR message
954	   allowing an increase in the bit rate, all increases must be governed
955	   by a congestion control mechanism.  TMMBR indicates known
956	   limitations only, usually in the local environment, and does not
957	   provide any guarantees about the full path.  Furthermore, any
958	   increases in TMMBR-established bit rate limits are to be executed
959	   only after a certain delay from the sending of the TMMBN message
960	   that notifies the world about the increase in limit.  The delay is
961	   specified as at least twice the longest RTT as known by the media
962	   sender, plus the media sender's calculation of the required wait
963	   time for the sending of another TMMBR message for this session based
964	   on AVPF timing rules.  This delay is introduced to allow other
965	   session participants to make known their bit rate limit
966	   requirements, which may be lower.
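
   The delay rule for bit rate increases can be sketched as follows.
   The function names and parameters are hypothetical and times are
   assumed to be in seconds.  The check is necessary but not
   sufficient, because any increase remains subject to congestion
   control.

```python
def earliest_increase_time(tmmbn_sent_at, longest_rtt, avpf_wait):
    """Earliest time at which a raised limit may be put into effect.

    avpf_wait is the media sender's estimate of the wait required by
    the AVPF timing rules before another TMMBR could be sent.
    """
    return tmmbn_sent_at + 2 * longest_rtt + avpf_wait

def may_raise_bit_rate(now, tmmbn_sent_at, longest_rtt, avpf_wait):
    """True once the mandatory delay after the TMMBN has elapsed."""
    return now >= earliest_increase_time(tmmbn_sent_at, longest_rtt, avpf_wait)
```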

968	   If it is likely that the new value indicated by TMMBR will be valid
969	   for the remainder of the session, the TMMBR sender is expected to
970	   perform a renegotiation of the session upper limit using the session
971	   signaling protocol.

973	3.5.4.1. Behavior for media receivers using TMMBR

975	   This section is an informal description of behavior described more
976	   precisely in section 4.2.

978	   A media sender begins the session limited by the maximum media bit
979	   rate and maximum packet rate negotiated in session signaling, if
980	   any. Note that this value may be negotiated for another protocol
981	   layer than the one the participant uses in its TMMBR messages.  Each
982	   media receiver selects a reference protocol layer, forms an estimate
983	   of the overhead it is observing (or estimates it if no packets have
984	   been seen yet) at that reference level, and determines the maximum
985	   total media bit rate it can accept, taking into account its own
986	   limitations and any transport path limitations of which it may be
987	   aware.  In case the current limitations are more restricting than
988	   what was agreed on in the session signaling, the media receiver
989	   reports its initial estimate of these two quantities to the media
990	   sender using a TMMBR message.  Overall message traffic is reduced by
991	   the possibility of including tuples for multiple media senders in
992	   the same TMMBR message.

994	   The media sender applies an algorithm such as that specified in
995	   section 3.5.4.2 to select which of the tuples it has received are
996	   most limiting (i.e. the bounding set as defined in section 2.2).  It
997	   modifies its operation to stay within the feasible region (as
998	   defined in section 2.2), and also sends out a TMMBN notification to
999	   the media receivers indicating the selected bounding set. That
1000	   notification also indicates who was responsible for the tuples in
1001	   the bounding set, i.e. the "owner"(s) of the limitation. A session
1002	   participant that owns no tuple in the bounding set is called a "non-
1003	   owner".

1005	   If a media receiver does not own one of the tuples in the bounding
1006	   set reported by the TMMBN, it applies the same algorithm as the
1007	   media sender to determine if its current estimated (maximum total
1008	   media bit rate, overhead) tuple would enter the bounding set if
1009	   known to the media sender.  If so, it issues a TMMBR request
1010	   reporting the tuple value to the sender.  Otherwise it takes no
1011	   action for the moment.  Periodically, its estimated tuple values may
1012	   change or it may receive a new TMMBN.  If so, it reapplies the
1013	   algorithm to decide whether it needs to issue a TMMBR request.

1015	   If, alternatively, a media receiver owns one of the tuples in the
1016	   reported bounding set, it takes no action until such time as its
1017	   estimate of its own tuple values changes.  At that time it sends a
1018	   TMMBR request to the media sender to report the changed values.

1020	   A media receiver may change status between owner and non-owner of a
1021	   bounding tuple between one TMMBN message and the next.  Thus, it
1022	   must check the contents of each TMMBN to determine its subsequent
1023	   actions.
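
   The owner and non-owner rules above can be summarized in a short
   decision function.  This is a sketch only; the names are
   hypothetical, and would_enter stands for a caller-supplied
   predicate that applies the media sender's bounding set algorithm to
   the receiver's own current tuple.

```python
def next_action(my_ssrc, bounding_set, my_tuple_changed, would_enter):
    """Decide what a media receiver does after a TMMBN or a local change.

    bounding_set maps owner SSRC -> (max_total_bit_rate, overhead), as
    reported in the most recent TMMBN.
    """
    if my_ssrc in bounding_set:
        # Owner: stay silent until the local estimate changes.
        return "send TMMBR" if my_tuple_changed else "no action"
    # Non-owner: request only if the tuple would enter the bounding set.
    return "send TMMBR" if would_enter(bounding_set) else "no action"
```

   Because ownership can change from one TMMBN to the next, the
   function is re-evaluated for every TMMBN received.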

1025	   Implementations may use other algorithms of their choosing, as long
1026	   as the bit rate limitations resulting from the exchange of TMMBR and
1027	   TMMBN messages are at least as strict (at least as low, in the bit
1028	   rate dimension) as the ones resulting from the use of the
1029	   aforementioned algorithm.

1031	   Obviously, in point-to-point cases, when there is only one media
1032	   receiver, this receiver becomes "owner" once it receives the first
1033	   TMMBN in response to its own TMMBR, and stays "owner" for the rest
1034	   of the session.  Therefore, when it is known that there will always
1035	   be only a single media receiver, the above algorithm is not
1036	   required.  Media receivers that are aware they are the only ones in
1037	   a session can send TMMBR messages with bit rate limits both higher
1038	   and lower than the previously notified limit, at any time (subject
1039	   to the AVPF [RFC4585] RTCP RR send timing rules).  However, it may
1040	   be difficult for a session participant to determine if it is the
1041	   only receiver in the session.  Because of this, any implementation of
1042	   TMMBR is required to include the algorithm described in the next
1043	   section or a stricter equivalent.

1045	3.5.4.2. Algorithm for establishing current limitations

1047	   This section introduces an example algorithm for the calculation of
1048	   a session limit.  Other algorithms can be employed, as long as the
1049	   result of the calculation is at least as restrictive as the result
1050	   that is obtained by this algorithm.

1052	   First, it is important to consider the implications of using a tuple
1053	   for limiting the media sender's behavior.  The bit rate and the
1054	   overhead value result in a two-dimensional solution space for the
1055	   calculation of the bit rate of media streams.  Fortunately, the two
1056	   variables are linked.  Specifically, the bit rate available for RTP
1057	   payloads is equal to the TMMBR reported bit rate minus the product
1058	   of the packet rate used and the TMMBR reported overhead converted
1059	   to bits.  As a result, when different bit rate/overhead combinations
1060	   need to be considered, the packet rate determines the correct
1061	   limitation.  This is perhaps best explained by an example:

1063	   Example:

1065	   Receiver A: TMMBR_max total BR = 35 kbps, TMMBR_OH = 40 bytes
1066	   Receiver B: TMMBR_max total BR = 40 kbps, TMMBR_OH = 60 bytes
1067	   For a given packet rate (PR) the bit rate available for media
1068	   payloads in RTP will be:

1070	   Max_net media_BR_A =
1071	       TMMBR_max total BR_A - PR * TMMBR_OH_A * 8 ... (1)

1073	   Max_net media_BR_B =
1074	       TMMBR_max total BR_B - PR * TMMBR_OH_B * 8 ... (2)

1076	   For a PR = 20 these calculations will yield a Max_net media_BR_A =
1077	   28600 bps and Max_net media_BR_B = 30400 bps, which suggests that
1078	   receiver A is the limiting one for this packet rate.  However, at a
1079	   certain PR there is a switchover point at which receiver B becomes
1080	   the limiting one.  The switchover point can be identified by setting
1081	   Max_net media_BR_A equal to Max_net media_BR_B and solving for PR:

1083	         TMMBR_max total BR_A - TMMBR_max total BR_B
1084	   PR =  ------------------------------------------- ... (3)
1085	                8*(TMMBR_OH_A - TMMBR_OH_B)

1087	   which, for the numbers above yields 31.25 as the switchover point
1088	   between the two limits.  That is, for packet rates below 31.25 per
1089	   second, receiver A is the limiting receiver, and for higher packet
1090	   rates, receiver B is more limiting.  The implications of this
1091	   behavior have to be considered by implementations that are going to
1092	   control media encoding and its packetization.  As exemplified above,
1093	   multiple TMMBR limits may apply to the trade-off between net media
1094	   bit rate and packet rate.  Which limitation applies depends on the
1095	   packet rate being considered.
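
   The arithmetic of this example can be reproduced with a short
   Python sketch.  It is illustrative only: the function names are not
   defined by this memo and were chosen for this example.

```python
def net_media_bitrate(max_total_br_bps, overhead_bytes, packet_rate):
    """Equations (1) and (2): bit rate left for RTP payloads."""
    return max_total_br_bps - packet_rate * overhead_bytes * 8


def switchover_packet_rate(br_a, oh_a, br_b, oh_b):
    """Equation (3): packet rate at which two tuples limit equally."""
    if oh_a == oh_b:
        # Parallel lines never cross, so there is no switchover point.
        raise ValueError("tuples with equal overhead have no switchover")
    return (br_a - br_b) / (8 * (oh_a - oh_b))


receiver_a = (35_000, 40)  # (TMMBR_max total BR in bps, TMMBR_OH in bytes)
receiver_b = (40_000, 60)

print(net_media_bitrate(*receiver_a, 20))                # 28600
print(net_media_bitrate(*receiver_b, 20))                # 30400
print(switchover_packet_rate(*receiver_a, *receiver_b))  # 31.25
```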

1097	   This also has implications for how the TMMBR mechanism needs to
1098	   work.  First, there is the possibility that multiple TMMBR tuples
1099	   are providing limitations on the media sender.  Second, there is a
1100	   need for any session participant (media sender and receivers) to be
1101	   able to determine if a given tuple will become a limitation upon the
1102	   media sender, or if the set of already given limitations is stricter
1103	   than the given values.  In the absence of the ability to make this
1104	   determination, the suppression of TMMBR requests would not work.

1106	   The basic idea of the algorithm is as follows.  Each TMMBR tuple can
1107	   be viewed as the equation of a straight line (cf. equations (1) and
1108	   (2)) in a space where packet rate lies along the X-axis and maximum
1109	   bit rate lies along the Y-axis.  The lower envelope of the set of
1110	   lines corresponding to the complete set of TMMBR tuples, together
1111	   with the X and Y axes, defines a polygon.  Points lying within this
1112	   polygon are combinations of packet rate and bit rate that meet all
1113	   of the TMMBR constraints.  The highest feasible packet rate within
1114	   this region is the lesser of the rate at which the bounding polygon
1115	   meets the X-axis and the session maximum packet rate (SMAXPR,
1116	   measured in packets per second) provided by signaling, if any.
1117	   Typically a media sender will prefer to operate at a lower rate than
1118	   this theoretical maximum, so as to increase the rate at which actual
1119	   media content reaches the receivers.  The purpose of the algorithm
1120	   is to distinguish the TMMBR tuples constituting the bounding set and
1121	   thus delineate the feasible region, so that the media sender can
1122	   select its preferred operating point within that region.
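
   The geometric picture can be made concrete with a brute-force
   sketch that evaluates the lower envelope directly, without first
   computing the bounding set.  The helper names and the (bit rate,
   overhead) pair representation are assumptions made for this
   example, not part of the mechanism.

```python
def feasible_net_bitrate(tuples, packet_rate, smaxpr=None):
    """Highest net media bit rate allowed at packet_rate, or None.

    tuples is an iterable of (max total bit rate in bps, overhead in
    bytes).  None means the packet rate lies outside the feasible region.
    """
    if packet_rate < 0:
        return None
    if smaxpr is not None and packet_rate > smaxpr:
        return None
    # Each tuple is a line y = b - 8 * overhead * x; take the lowest one.
    limit = min(br - 8 * oh * packet_rate for br, oh in tuples)
    return limit if limit >= 0 else None


def highest_feasible_packet_rate(tuples, smaxpr=None):
    """Lesser of SMAXPR and the earliest X-axis crossing of any line."""
    crossings = [br / (8 * oh) for br, oh in tuples if oh > 0]
    if smaxpr is not None:
        crossings.append(smaxpr)
    return min(crossings) if crossings else float("inf")


tuples = [(35_000, 40), (40_000, 60)]
print(feasible_net_bitrate(tuples, 20))   # 28600, the first tuple limits
print(feasible_net_bitrate(tuples, 40))   # 20800, the second tuple limits
print(feasible_net_bitrate(tuples, 100))  # None, outside the polygon
```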

1124	   Figure 1 below shows a bounding polygon formed by TMMBR tuples A and
1125	   B. A third tuple C lies outside the bounding polygon and is
1126	   therefore irrelevant in determining feasible tradeoffs between media
1127	   rate and packet rate.  The line labeled ss..s represents the limit
1128	   on packet rate imposed by the session maximum packet rate (SMAXPR)
1129	   obtained by signaling during session setup.  In Figure 1 the limit
1130	   determined by tuple B happens to be more restrictive than SMAXPR.
1131	   The situation could easily be the reverse, meaning that the bounding
1132	   polygon is terminated on the right by the vertical line representing
1133	   the SMAXPR constraint.

1135	   Net  ^
1136	   Media|a   c   b             s
1137	   Bit  |  a   c  b            s
1138	   Rate |    a   c b           s
1139	        |      a   cb          s
1140	        |        a   c         s
1141	        |          a  bc       s
1142	        |            a b c     s
1143	        |              ab  c   s
1144	        |  Feasible      b   c s
1145	        |   region        ba   s
1146	        |                  b a s c
1147	        |                   b  s   c
1148	        |                    b s a
1149	        |                     bs
1150	        +------------------------------>

1152	              Packet rate

1154	    Figure 1 - Geometric Interpretation of TMMBR Tuples

1156	   Note that the slopes of the lines making up the bounding polygon are
1157	   increasingly negative as one moves in the direction of increasing
1158	   packet rate.  Note also that with slight rearrangement, equations
1159	   (1) and (2) have the canonical form:

1161	          y = mx + b

1163	   where
1164	     m is the slope and has value equal to the negative of the tuple
1165	     overhead (in bits),
1166	   and
1167	     b is the y-intercept and has value equal to the tuple maximum
1168	     total media bit rate.

1170	   These observations lead to the conclusion that when processing the
1171	   TMMBR tuples to select the initial bounding set, one should sort and
1172	   process the tuples by order of increasing overhead. Once a
1173	   particular tuple has been added to the bounding set, all tuples not
1174	   already selected and having lower overhead can be eliminated,
1175	   because the next side of the bounding polygon has to be steeper
1176	   (i.e. the corresponding TMMBR must have higher overhead) than the
1177	   latest added tuple.

1179	   Line cc..c in Figure 1 illustrates another principle. This line is
1180	   parallel to line aa..a, but has a higher Y-intercept.  That is, the
1181	   corresponding TMMBR tuple contains a higher maximum total media bit
1182	   rate value.  Since line cc..c is outside the bounding polygon, it
1183	   illustrates the conclusion that if two TMMBR tuples have the same
1184	   overhead value, the one with higher maximum total media bit rate
1185	   value cannot be part of the bounding set and can be set aside.

1187	   Two further observations complete the algorithm.  Obviously, moving
1188	   from the left, the successive corners of the bounding polygon (i.e.
1189	   the intersection points between successive pairs of sides) lie at
1190	   successively higher packet rates.  On the other hand, again moving
1191	   from the left, each successive line making up the bounding set
1192	   crosses the X-axis at a lower packet rate.

1194	   The complete algorithm can now be specified.  The algorithm works
1195	   with two lists of TMMBR tuples, the candidate list X and the
1196	   selected list Y, both ordered by increasing overhead value.  The
1197	   algorithm terminates when all members of X have been discarded or
1198	   removed for processing.  Membership of the selected list Y is
1199	   probationary until the algorithm is complete.  Each member of the
1200	   selected list is associated with an intersection value, which is the
1201	   packet rate at which the line corresponding to that TMMBR tuple
1202	   intersects with the line corresponding to the previous TMMBR tuple
1203	   in the selected list.  Each member of the selected list is also
1204	   associated with a maximum packet rate value, which is the lesser of
1205	   the session maximum packet rate SMAXPR (if any) and the packet rate
1206	   at which the line corresponding to that tuple crosses the X-axis.

1208	   When the algorithm terminates, the selected list is equal to the
1209	   bounding set as defined in section 2.2.

1211	Initial Algorithm

1213	   This algorithm is used by the media sender when it has received one
1214	   or more TMMBR requests and before it has determined a bounding set
1215	   for the first time.

1217	   1. Sort the TMMBR tuples by order of increasing overhead.  This is
1218	      the initial candidate list X.

1220	   2. When multiple tuples in the candidate list have the same overhead
1221	      value, discard all but the one with the lowest maximum total media
1222	      bit rate value.

1224	   3. Select and remove from the candidate list the TMMBR tuple with the
1225	      lowest maximum total media bit rate value.  If there is more than
1226	      one tuple with that value, choose the one with the highest
1227	      overhead value.  This is the first member of the selected list Y.
1228	      Set its intersection value equal to zero.  Calculate its maximum
1229	      packet rate as the minimum of SMAXPR (if available) and the value
1230	      obtained from the following formula, which is the packet rate at
1231	      which the corresponding line crosses the X-axis.

1233	          Max PR = TMMBR max total BR / (8 * TMMBR OH) ... (4)

1235	   4. Discard from the candidate list all tuples with a lower overhead
1236	      value than the selected tuple.

1238	   5. Remove the first remaining tuple from the candidate list for
1239	      processing.  Call this the current candidate.

1241	   6. Calculate the packet rate PR at the intersection of the line
1242	      generated by the current candidate with the line generated by the
1243	      last tuple in the selected list Y, using equation (3).

1245	   7. If the calculated value PR is equal to or lower than the
1246	      intersection value stored for the last tuple of the selected list,
1247	      discard the last tuple of the selected list and go back to step 6
1248	      (retaining the same current candidate).

1250	      Note that the choice of the initial member of the selected list Y
1251	      in step 3 guarantees that the selected list will never be emptied
1252	      by this process, meaning that the algorithm must eventually (if
1253	      not immediately) fall through to step 8.

1255	   8. (This step is reached when the calculated PR value of the current
1256	      candidate is greater than the intersection value of the current
1257	      last member of the selected list Y.)  If the calculated value PR
1258	      of the current candidate is lower than the maximum packet rate
1259	      associated with the last tuple in the selected list, add the
1260	      current candidate tuple to the end of the selected list.  Store PR
1261	      as its intersection value.  Calculate its maximum packet rate as
1262	      the lesser of SMAXPR (if available) and the maximum packet rate
1263	      calculated using equation (4).  Otherwise, discard the candidate.

1265	   9. If any tuples remain in the candidate list, go back to step 5.
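
   Steps 1 through 9 can be sketched in Python as follows.  This is a
   non-normative illustration: the Selected record, the function
   names, and the representation of a tuple as a (bit rate, overhead)
   pair are assumptions of this example.  A candidate that fails the
   test in step 8 is simply dropped.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Selected:
    br: float            # maximum total media bit rate, bits per second
    oh: int              # overhead, bytes per packet
    intersection: float  # packet rate where this line meets the previous one
    max_pr: float        # lesser of SMAXPR and the X-axis crossing


def max_packet_rate(br: float, oh: int, smaxpr: Optional[float]) -> float:
    """Equation (4), capped by SMAXPR when one was signaled."""
    x_cross = br / (8 * oh) if oh > 0 else float("inf")
    return x_cross if smaxpr is None else min(smaxpr, x_cross)


def initial_bounding_set(tuples: Iterable[Tuple[float, int]],
                         smaxpr: Optional[float] = None) -> List[Selected]:
    # Steps 1 and 2: keep one tuple per overhead value (lowest bit rate
    # wins) and sort by increasing overhead.
    lowest = {}
    for br, oh in tuples:
        if oh not in lowest or br < lowest[oh]:
            lowest[oh] = br
    candidates = sorted((oh, br) for oh, br in lowest.items())
    if not candidates:
        return []

    # Step 3: lowest bit rate first; ties go to the highest overhead.
    first_oh, first_br = min(candidates, key=lambda c: (c[1], -c[0]))
    selected = [Selected(first_br, first_oh, 0.0,
                         max_packet_rate(first_br, first_oh, smaxpr))]

    # Step 4: drop candidates with lower overhead (and the selected one).
    candidates = [c for c in candidates if c[0] > first_oh]

    # Steps 5 and 9: process the remaining candidates in order.
    for oh, br in candidates:
        while True:
            last = selected[-1]
            # Step 6: intersection with the last selected line, equation (3).
            pr = (last.br - br) / (8 * (last.oh - oh))
            if pr <= last.intersection:
                selected.pop()  # Step 7: the last selected tuple is shadowed.
            else:
                break
        # Step 8: keep the candidate only if it cuts the feasible region.
        if pr < selected[-1].max_pr:
            selected.append(
                Selected(br, oh, pr, max_packet_rate(br, oh, smaxpr)))
    return selected


result = initial_bounding_set([(35_000, 40), (40_000, 60), (45_000, 40)])
print([(s.br, s.oh, s.intersection) for s in result])
# [(35000, 40, 0.0), (40000, 60, 31.25)]
```

   The third input tuple mirrors line cc..c of Figure 1: it has the
   same overhead as the first tuple but a higher bit rate, so step 2
   removes it.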

1267	Incremental Algorithm

1269	   The previous algorithm covered the initial case, where no selected
1270	   list had previously been created.  It also applied only to the media
1271	   sender.  When a previously-created selected list is available at
1272	   either the media sender or media receiver, two other cases can be
1273	   considered:

1275	        o when a TMMBR tuple not currently in the selected list is a
1276	          candidate for addition;

1278	        o when the values change in a TMMBR tuple currently in the
1279	          selected list.

1281	   At the media receiver these cases correspond respectively to those
1282	   of the non-owner and owner of a tuple in the TMMBN-reported bounding
1283	   set.

1285	   In either case, the process of updating the selected list to take
1286	   account of the new/changed tuple can use the basic algorithm
1287	   described above, with the modification that the initial candidate
1288	   set consists only of the existing selected list and the new or
1289	   changed tuple.  Some further optimization is possible (beyond
1290	   starting with a reduced candidate set) by taking advantage of the
1291	   following observations.

1293	   The first observation is that if the new/changed candidate becomes
1294	   part of the new selected list, the result may be to cause zero or
1295	   more other tuples to be dropped from the list.  However, if more
1296	   than one other tuple is dropped, the dropped tuples will be
1297	   consecutive.  This can be confirmed geometrically by visualizing a
1298	   new line that cuts off a series of segments from the previously-
1299	   existing bounding polygon.  The cut-off segments are connected one
1300	   to the next, the geometric equivalent of consecutive tuples in a
1301	   list ordered by overhead value.  Beyond the dropped set in either
1302	   direction all of the tuples that were in the earlier selected list
1303	   will be in the updated one.  The second observation is that, leaving
1304	   aside the new candidate, the order of tuples remaining in the
1305	   updated selected list is unchanged because their overhead values
1306	   have not changed.

1308	   The consequence of these two observations is that, once the
1309	   placement of the new candidate and the extent of the dropped set of
1310	   tuples (if any) have been determined, the remaining tuples can be
1311	   copied directly from the candidate list into the selected list,
1312	   preserving their order.  This conclusion suggests the following
1313	   modified algorithm:

1315	       o Run steps 1-4 of the basic algorithm.

1317	       o If the new candidate has survived steps 2 and 4 and has become
1318	          the new first member of the selected list, run steps 5-9 on
1319	          subsequent candidates until another candidate is added to the
1320	          selected list.  Then move all remaining candidates to the
1321	          selected list, preserving their order.

1323	       o If the new candidate has survived steps 2 and 4 and has not
1324	          become the new first member of the selected list, start by
1325	          moving all tuples in the candidate list with lower overhead
1326	          values than that of the new candidate to the selected list,
1327	          preserving their order.  Run steps 5 through 9 for the new
1328	          candidate, with the modification that the intersection values
1329	          and maximum packet rates for the tuples on the selected list
1330	          have to be calculated on the fly because they were not
1331	          previously stored.  Continue processing only until a
1332	          subsequent tuple has been added to the selected list, then
1333	          move all remaining candidates to the selected list, preserving
1334	          their order.

1336	          Note that the new candidate could be added to the selected
1337	          list only to be dropped again when the next tuple is
1338	          processed.  It can easily be seen that in this case the new
1339	          candidate does not displace any of the earlier tuples in the
1340	          selected list.  The limitations of ASCII art make this
1341	          difficult to show in a figure.  Line cc..c in Figure 1 would
1342	          be an example if it had a steeper slope (tuple C had a higher
1343	          overhead value), but still intersected line aa..a beyond where
1344	          line aa..a intersects line bb..b.

1346	   The algorithm just described is approximate, because it does not
1347	   take account of tuples outside the selected list.  To see how such
1348	   tuples can become relevant, consider Figure 1 and suppose that the
1349	   maximum total media bit rate in tuple A increases to the point that
1350	   line aa..a moves outside line cc..c.  Tuple A will remain in the
1351	   bounding set calculated by the media sender.  However, once it
1352	   issues a new TMMBN, media receiver C will apply the algorithm and
1353	   discover that its tuple C should now enter the bounding set.  It
1354	   will issue a TMMBR request to the media sender, which will repeat
1355	   its calculation and come to the appropriate conclusion.

1357	   The rules of section 4.2 require that the media sender refrain from
1358	   raising its sending rate until media receivers have had a chance to
1359	   respond to the TMMBN.  In the example just given, this delay ensures
1360	   that the relaxation of tuple A does not actually result in an
1361	   attempt to send media at a rate exceeding the capacity at C.

1363	3.5.4.3. Use of TMMBR in a Mixer Based Multipoint Operation

1365	   Assume a small mixer-based multiparty conference is ongoing, as
1366	   depicted in Topo-Mixer of [Topologies].  All participants have
1367	   negotiated a common maximum bit rate that this session can use.  The
1368	   conference operates over a number of unicast paths between the
1369	   participants and the mixer.  The congestion situation on each of
1370	   these paths can be monitored by the participant in question and by
1371	   the mixer, utilizing, for example, RTCP receiver reports (RR) or the
1372	   transport protocol, e.g. DCCP [RFC4340].  However, any given
1373	   participant has no knowledge of the congestion situation of the
1374	   connections to the other participants.  Worse, without mechanisms
1375	   similar to the ones discussed in this draft, the mixer (which is
1376	   aware of the congestion situation on all connections it manages) has
1377	   no standardized means to inform media senders to slow down, short of
1378	   forging its own receiver reports (which is undesirable).  In
1379	   principle, a mixer confronted with such a situation is obliged to
1380	   thin or transcode streams intended for connections that detected
1381	   congestion.

1383	   In practice, unfortunately, media-aware stream thinning is a very
1384	   difficult and cumbersome operation and adds undesirable delay.  If
1385	   media-unaware, it leads very quickly to unacceptable reproduced
1386	   media quality.  Hence, a means to slow down senders even in the
1387	   absence of congestion on their connections to the mixer is
1388	   desirable.

1390	   To allow the mixer to throttle traffic on the individual links,
1391	   without performing transcoding, there is a need for a mechanism that
1392	   enables the mixer to ask a participant's media encoders to limit the
1393	   media stream bit rate they are currently generating.  TMMBR provides
1394	   the required mechanism.  When the mixer detects congestion between
1395	   itself and a given participant, it executes the following procedure:

1397	   1. It starts thinning the media traffic to the congested participant
1398	      to the supported bit rate.

1400	   2. It uses TMMBR to request the media sender(s) to reduce the total
1401	      media bit rate sent by them to the mixer, to a value that is in
1402	      compliance with congestion control principles for the slowest
1403	      link.  Slow refers here to the available bandwidth / bit rate /
1404	      capacity and packet rate after congestion control.

1406	   3. As soon as the bit rate has been reduced by the sending party, the
1407	      mixer stops stream thinning implicitly, because there is no need
1408	      for it once the stream is in compliance with congestion control.

1410	   This use of stream thinning as an immediate reaction tool followed
1411	   up by a quick control mechanism appears to be a reasonable
1412	   compromise between media quality and the need to combat congestion.

1414	3.5.4.4. Use of TMMBR in Point-to-Multipoint Using Multicast or
1415	   Translators

1417	   In these topologies, corresponding to Topo-Multicast or Topo-
1418	   Translator, RTCP RRs are transmitted globally.  This allows all
1419	   participants to detect transmission problems such as congestion, on
1420	   a medium timescale.  As all media senders are aware of the
1421	   congestion situation of all media receivers, the rationale for the
1422	   use of TMMBR in the previous section does not apply.  However, even
1423	   in this case the congestion control response can be improved when
1424	   the unicast links are using congestion controlled transport
1425	   protocols (such as TCP or DCCP).  A peer may also report local
1426	   limitations to the media sender.

1428	3.5.4.5. Use of TMMBR in Point-to-point operation

1430	   In use case 7 it is possible to use TMMBR to improve the performance
1431	   when the known upper limit of the bit rate changes.  In this use
1432	   case the signaling protocol has established an upper limit for the
1433	   session and total media bit rates.  However, at the time of
1434	   transport link bit rate reduction, a receiver can avoid serious
1435	   congestion by sending a TMMBR to the sending side.  Thus, TMMBR is
1436	   useful for putting restrictions on the application and thus placing
1437	   the congestion control mechanism in the right ballpark.  However,
1438	   TMMBR is usually unable to provide the continuously quick feedback
1439	   loop required for real congestion control.  Nor do its semantics
1440	   match those of congestion control given its different purpose.  For
1441	   these reasons TMMBR SHALL NOT be used as a substitute for congestion
1442	   control.

1444	3.5.4.6. Reliability

1446	   The reaction of a media sender to the reception of a TMMBR message
1447	   is not immediately identifiable through inspection of the media
1448	   stream.  Therefore, a more explicit mechanism is needed to avoid
1449	   unnecessary re-sending of TMMBR messages.  Using a statistically
1450	   based retransmission scheme would only provide statistical
1451	   guarantees of the request being received.  It would also not avoid
1452	   the retransmission of already received messages.  In addition, it
1453	   would not allow for easy suppression of other participants'
1454	   requests.  For these reasons, a mechanism based on explicit
1455	   notification is used.

1457	   Upon the reception of a request a media sender sends a TMMBN
1458	   notification containing the current bounding set, and indicating
1459	   which session participants own that limit.  In multicast scenarios,
1460	   that allows all other participants to suppress any request they may
1461	   have, if their limitations are less strict than the current ones
1462	   (i.e. define lines lying outside the feasible region as defined in
1463	   section 2.2).  Keeping and notifying only the bounding set of tuples
1464	   allows for small message sizes and media sender states.  A media
1465	   sender only keeps state for the SSRCs of the current owners of the
1466	   bounding set of tuples; all other requests and their sources are not
1467	   saved.  Once the bounding set has been established, new TMMBR
1468	   messages should be generated only by owners of the bounding tuples
1469	   and by other entities that determine (by applying the algorithm of
1470	   section 3.5.4.2 or its equivalent) that their limitations should now
1471	   be part of the bounding set.

1473	4. RTCP Receiver Report Extensions

1475	   This memo specifies six new feedback messages.  The Full Intra
1476	   Request (FIR), Temporal-Spatial Trade-off Request (TSTR), Temporal-
1477	   Spatial Trade-off Notification (TSTN), and Video Back Channel
1478	   Message (VBCM) are "Payload Specific Feedback Messages" as defined
1479	   in Section 6.3 of AVPF [RFC4585].  The Temporary Maximum Media
1480	   Stream Bit Rate Request (TMMBR) and Temporary Maximum Media Stream
1481	   Bit Rate Notification (TMMBN) are "Transport Layer Feedback
1482	   Messages" as defined in Section 6.2 of AVPF.

1484	   The new feedback messages are defined in the following subsections,
1485	   following a similar structure to that in sections 6.2 and 6.3 of the
1486	   AVPF specification [RFC4585].

1488	4.1. Design Principles of the Extension Mechanism

1490	   RTCP was originally introduced as a channel to convey presence,
1491	   reception quality statistics and hints on the desired media coding.
1492	   A limited set of media control mechanisms was introduced in early
1493	   RTP payload formats for video, for example in RFC 2032
1494	   [RFC2032].  However, this specification, for the first time,
1495	   suggests a two-way handshake for some of its messages.  There is a
1496	   danger that this introduction could be misunderstood as a precedent
1497	   for the use of RTCP as an RTP session control protocol.  To prevent
1498	   such a misunderstanding, this subsection attempts to clarify the
1499	   scope of the extensions specified in this memo, and strongly
1500	   suggests that future extensions follow the rationale spelled out
1501	   here, or compellingly explain why they deviate from the rationale.

1503	   In this memo, and in AVPF [RFC4585], only such messages have been
1504	   included as:

1506	   a) have comparatively strict real-time constraints, which prevent
1507	      the use of mechanisms such as a SIP re-invite in most application
1508	      scenarios.  The real-time constraints are explained separately
1509	      for each message where necessary.

1511	   b) are multicast-safe in that the reaction to potentially
1512	      contradicting feedback messages is specified, as necessary for
1513	      each message; and

1515	   c) are directly related to activities of a certain media codec,
1516	      class of media codecs (e.g. video codecs), or a given RTP packet
1517	      stream.

1519	   In this memo, a two-way handshake is introduced only for messages
1520	   for which:

1522	   a) a notification or acknowledgement is required due to their
1523	      nature. An analysis to determine whether this requirement exists
1524	      has been performed separately for each message.

1526	   b) the notification or acknowledgement cannot be easily derived from
1527	      the media bit stream.

1529	   All messages in AVPF [RFC4585] and in this memo present their
1530	   contents in a simple, fixed binary format.  This accommodates media
1531	   receivers which have not implemented higher control protocol
1532	   functionalities (SDP, XML parsers and such) in their media path.

1534	   Messages that do not conform to the design principles just described
1535	   are not an appropriate use of RTCP or of the Codec Control Framework
1536	   defined in this document.

1538	4.2. Transport Layer Feedback Messages

1540	   As specified in section 6.1 of RFC 4585 [RFC4585], Transport Layer
1541	   Feedback messages are identified by the RTCP packet type value RTPFB
1542	   (205).

1544	   In AVPF, one message of this category was defined.  This memo
1545	   specifies two more such messages.  They are identified by means of
1546	   the FMT parameter as follows:

1548	   Assigned in AVPF [RFC4585]:

1550	      1:    Generic NACK
1551	      31:   reserved for future expansion of the identifier number
1552	            space

1554	   Assigned in this memo:

1556	      2:    reserved (see note below)
1557	      3:    Temporary Maximum Media Stream Bit Rate Request (TMMBR)
1558	      4:    Temporary Maximum Media Stream Bit Rate Notification
1559	            (TMMBN)

1561	          Note: early drafts of AVPF [RFC4585] reserved FMT=2 for a
1562	          code point that has later been removed.  It has been pointed
1563	          out that there may be implementations in the field using this
1564	          value in accordance with the expired draft.  As there is
1565	          reserved so as to avoid possible interoperability problems with
1566	          reserved so to avoid possible interoperability problems with
1567	          any such early implementations.

1569	   Available for assignment:

1571	      0:    unassigned
1572	      5-30: unassigned

1574	   The following subsections define the formats of the Feedback Control
1575	   Information (FCI) entries for the TMMBR and TMMBN messages,
1576	   respectively, and specify the associated behaviour at the media
1577	   sender and receiver.

1579	4.2.1. Temporary Maximum Media Stream Bit Rate Request (TMMBR)

1581	   The Temporary Maximum Media Stream Bit Rate Request is identified by
1582	   RTCP packet type value PT=RTPFB and FMT=3.

1584	   The FCI field of a Temporary Maximum Media Stream Bit Rate Request
1585	   (TMMBR) message SHALL contain one or more FCI entries.

1587	4.2.1.1. Message Format

1589	   The Feedback Control Information (FCI) consists of one or more TMMBR
1590	   FCI entries with the following syntax:

1592	    0                   1                   2                   3
1593	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1594	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1595	   |                              SSRC                             |
1596	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1597	   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1598	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1600	    Figure 2 - Syntax of an FCI entry in the TMMBR message

1602	     SSRC (32 bits): The SSRC value of the media sender that is
1603	              requested to obey the new maximum bit rate.

1605	     MxTBR Exp (6 bits): The exponential scaling of the mantissa for
1606	              the maximum total media bit rate value.  The value is an
1607	              unsigned integer [0..63].

1609	     MxTBR Mantissa (17 bits): The mantissa of the maximum total media
1610	              bit rate value as an unsigned integer.

1612	     Measured Overhead (9 bits): The measured average packet overhead
1613	              value in bytes.  The measurement SHALL be done according
1614	              to the description in section 4.2.1.2. The value is an
1615	              unsigned integer [0..511].

1617	   The maximum total media bit rate (MxTBR) value in bits per second is
1618	   calculated from the MxTBR exponent (exp) and mantissa in the
1619	   following way:

1621	      MxTBR = mantissa * 2^exp

1623	   This allows for 17 bits of resolution in the range 0 to 131071*2^63
1624	   (approximately 1.2*10^24).
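
   A possible way to convert between a bit rate and the exponent and
   mantissa pair is sketched below.  The text above does not say how a
   value that needs more than 17 bits should be rounded; rounding down,
   so that the encoded limit never exceeds the intended one, is an
   assumption of this example, as are the function names.

```python
MANTISSA_BITS = 17
MAX_EXP = 63


def encode_mxtbr(bitrate_bps: int):
    """Return (exp, mantissa) such that mantissa * 2**exp <= bitrate_bps."""
    if bitrate_bps < 0:
        raise ValueError("bit rate must be non-negative")
    exp = 0
    mantissa = bitrate_bps
    # Shift right until the mantissa fits in 17 bits (this rounds down).
    while mantissa >= (1 << MANTISSA_BITS):
        mantissa >>= 1
        exp += 1
    if exp > MAX_EXP:
        raise ValueError("bit rate too large for a 6-bit exponent")
    return exp, mantissa


def decode_mxtbr(exp: int, mantissa: int) -> int:
    """MxTBR = mantissa * 2^exp, in bits per second."""
    return mantissa << exp


print(encode_mxtbr(35_000))     # (0, 35000)
print(encode_mxtbr(1_000_000))  # (3, 125000)
print(decode_mxtbr(3, 125000))  # 1000000
```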

1626	   The length of the TMMBR feedback message SHALL be set to 2+2*N where
1627	   N is the number of TMMBR FCI entries.
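
   The following sketch packs a complete TMMBR message using Python's
   struct module.  The FCI layout follows Figure 2; the 12-octet common
   header (version, padding, FMT, PT, length, and the two SSRC fields)
   is taken from section 6.1 of [RFC4585].  The function names are
   hypothetical, and the example does no validation beyond simple
   range checks.

```python
import struct

RTPFB = 205     # RTCP packet type for transport layer feedback
FMT_TMMBR = 3   # FMT value assigned to TMMBR


def pack_fci(ssrc, exp, mantissa, overhead):
    """One FCI entry: SSRC, then 6-bit exp, 17-bit mantissa, 9-bit overhead."""
    if not (0 <= exp < 64 and 0 <= mantissa < (1 << 17)
            and 0 <= overhead < 512):
        raise ValueError("field out of range")
    word = (exp << 26) | (mantissa << 9) | overhead
    return struct.pack("!II", ssrc, word)


def unpack_fci(data):
    """Inverse of pack_fci: returns (ssrc, exp, mantissa, overhead)."""
    ssrc, word = struct.unpack("!II", data)
    return ssrc, word >> 26, (word >> 9) & 0x1FFFF, word & 0x1FF


def build_tmmbr(sender_ssrc, entries):
    """entries is a non-empty list of (ssrc, exp, mantissa, overhead)."""
    if not entries:
        raise ValueError("a TMMBR message carries one or more FCI entries")
    first_octet = (2 << 6) | FMT_TMMBR   # V=2, P=0, FMT=3
    length = 2 + 2 * len(entries)        # in 32-bit words, minus one
    # "SSRC of media source" is not used by TMMBR and is set to 0.
    header = struct.pack("!BBHII", first_octet, RTPFB, length, sender_ssrc, 0)
    return header + b"".join(pack_fci(*e) for e in entries)


packet = build_tmmbr(0x11111111, [(0x22222222, 0, 35000, 40)])
print(len(packet))              # 20
print(unpack_fci(packet[12:]))  # (572662306, 0, 35000, 40)
```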

1629	4.2.1.2. Semantics

1631	Behaviour at the Media Receiver (Sender of the TMMBR)

1633	   TMMBR is used to indicate a transport related limitation at the
1634	   reporting entity acting as a media receiver.  TMMBR has the form of
1635	   a tuple containing two components.  The first value is the highest
1636	   bit rate per sender of a media stream, available at a receiver-
1637	   chosen protocol layer, which the receiver currently supports in this
1638	   RTP session.  The second value is the measured header overhead in
1639	   bytes as defined in section 2.2 and measured at the chosen protocol
1640	   layer in the packets received for the stream.  The measurement of
1641	   the overhead is a running average that is updated for each packet
1642	   received for this particular media source (SSRC), using the
1643	   following formula:

1645	       avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH,

1647	   where avg_OH is the running (exponentially smoothed) average and
1648	   pckt_OH is the overhead observed in the latest packet.
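
   The running average can be maintained per media source with a few
   lines of code.  In this sketch the class name is hypothetical, and
   two choices are assumptions not stated above: the average is seeded
   with the overhead of the first packet, and the reported value is
   rounded and clamped to the 9-bit field range.

```python
class OverheadEstimator:
    """Running average of per-packet overhead for one SSRC."""

    def __init__(self):
        self.avg_oh = None  # no packet seen yet

    def update(self, pckt_oh):
        if self.avg_oh is None:
            # Assumption: seed the average with the first observation.
            self.avg_oh = float(pckt_oh)
        else:
            # avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH
            self.avg_oh = 15 / 16 * self.avg_oh + 1 / 16 * pckt_oh
        return self.avg_oh

    def reported_value(self):
        """Integer for the Measured Overhead field, in [0..511]."""
        if self.avg_oh is None:
            raise ValueError("no packets observed yet")
        return max(0, min(511, round(self.avg_oh)))


est = OverheadEstimator()
est.update(40)
print(est.update(56))        # 41.0
print(est.reported_value())  # 41
```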

1650	   If a maximum bit rate has been negotiated through signaling, the
1651	   maximum total media bit rate that the receiver reports in a TMMBR
1652	   message MUST NOT exceed the negotiated value converted to a common
1653	   basis (i.e. with overheads adjusted to bring it to the same
1654	   reference protocol layer).

1656	   Within the common packet header for feedback messages (as defined in
1657	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
1658	   indicates the source of the request, and the "SSRC of media source"
1659	   is not used and SHALL be set to 0.  Within a particular TMMBR FCI
1660	   entry, the "SSRC of media sender" in the FCI field denotes the media
1661	   sender the tuple applies to.  This is useful in the multicast or
1662	   translator topologies where the reporting entity may address all of
1663	   the media senders in a single TMMBR message using multiple FCI
1664	   entries.

1666	   The media receiver SHALL save the contents of the latest TMMBN
1667	   message received from each media sender.

1669	   The media receiver MAY send a TMMBR FCI entry to a particular media
1670	   sender under the following circumstances:

1672	     o   before any TMMBN message has been received from that media
1673	          sender;

1675	     o   when the media receiver has been identified as the source of a
1676	          bounding tuple within the latest TMMBN message received from
1677	          that media sender, and the value of the maximum total media
1678	          bit rate or the overhead relating to that media sender has
1679	          changed;

1681	     o   when the media receiver has not been identified as the source
1682	          of a bounding tuple within the latest TMMBN message received
1683	          from that media sender, and, after the media receiver applies
1684	          the incremental algorithm from section 3.5.4.2 or a stricter
1685	          equivalent, the media receiver's tuple relating to that media
1686	          sender is determined to belong to the bounding set.
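
   The three circumstances listed above can be summarised as a decision
   function.  The Python sketch below is illustrative only; all names
   are hypothetical.  The incremental algorithm of section 3.5.4.2 is
   not reproduced here, so its outcome is passed in as a boolean.

```python
from typing import Optional, Set


def may_send_tmmbr(
    latest_tmmbn_owners: Optional[Set[int]],
    own_ssrc: int,
    own_tuple_changed: bool,
    own_tuple_in_bounding_set: bool,
) -> bool:
    """Decide whether a TMMBR FCI entry may be sent to one media sender.

    latest_tmmbn_owners: owner SSRCs listed in the latest TMMBN from
        that media sender, or None if no TMMBN has been received yet.
    own_tuple_changed: the receiver's bit rate or overhead for that
        media sender has changed.
    own_tuple_in_bounding_set: result of applying the incremental
        algorithm (computed elsewhere) to the receiver's tuple.
    """
    if latest_tmmbn_owners is None:
        # No TMMBN received from this media sender so far.
        return True
    if own_ssrc in latest_tmmbn_owners:
        # Current owner: report only when the tuple has changed.
        return own_tuple_changed
    # Not an owner: report only if the tuple would enter the bounding set.
    return own_tuple_in_bounding_set
```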

1688	   A TMMBR FCI entry MAY be repeated in subsequent TMMBR messages if no
1689	   Temporary Maximum Media Stream Bit-Rate Notification (TMMBN) FCI has
1690	   been received from the media sender at the time of transmission of
1691	   the next RTCP packet.  The bit rate value of a TMMBR FCI entry MAY
1692	   be changed from one TMMBR message to the next.  The overhead
1693	   measurement SHALL be updated to the current value of avg_OH each
1694	   time the entry is sent.

1696	   If the value set by a TMMBR message is expected to be permanent, the
1697	   TMMBR setting party SHOULD renegotiate the session parameters to
1698	   reflect that using session setup signaling, e.g. a SIP re-invite.

1700	Behaviour at the Media Sender (Receiver of the TMMBR)

1702	   When it receives a TMMBR message containing an FCI entry relating to
1703	   it, the media sender SHALL use an initial or incremental algorithm
1704	   as applicable to determine the bounding set of tuples based on the
1705	   new information.  The algorithm used SHALL be at least as strict as
1706	   the corresponding algorithm defined in section 3.5.4.2.  The media
1707	   sender MAY accumulate TMMBR requests over a small interval (relative
1708	   to the RTCP sending interval) before making this calculation.

1710	   Once it has determined the bounding set of tuples, the media sender
1711	   MAY use any combination of packet rate and net media bit rate within
1712	   the feasible region that these tuples describe to produce a lower
1713	   total media stream bit rate, as it may need to address a congestion
1714	   situation or other limiting factors.  See section 5 (congestion
1715	   control) for more discussion.

1717	   If the media sender concludes that it can increase the maximum total
1718	   media bit rate value, it SHALL wait before actually doing so, for a
1719	   period long enough to allow a media receiver to respond to the TMMBN
1720	   if it determines that its tuple belongs in the bounding set.  This
1721	   delay period is estimated by the formula:

1723	      2 * RTT + T_Dither_Max,

1725	   where RTT is the longest round trip time known to the media sender
1726	   and T_Dither_Max is defined in section 3.4 of [RFC4585].  Even in
1727	   point-to-point sessions a media sender MUST obey the
1728	   aforementioned rule, as it is not guaranteed that a participant is
1729	   able to determine correctly whether all the sources are co-located
1730	   in a single node, and are coordinated.
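
   As a small worked example, the waiting period could be computed as
   in the Python sketch below.  The function name and the use of
   seconds as the unit are assumptions; T_Dither_Max is taken as an
   input because its value is defined in [RFC4585], not here.

```python
from typing import Iterable


def increase_hold_off(known_rtts: Iterable[float], t_dither_max: float) -> float:
    """Return 2 * RTT + T_Dither_Max, with RTT the longest known RTT.

    All values are in seconds (an assumption of this sketch).
    """
    longest_rtt = max(known_rtts)
    return 2.0 * longest_rtt + t_dither_max
```

   For known round trip times of 0.125 s and 0.25 s and a T_Dither_Max
   of 0.5 s, the media sender would wait 1.0 s before raising the
   limit.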

1732	   A TMMBN message SHALL be sent by the media sender at the earliest
1733	   possible point in time, in response to any TMMBR messages received
1734	   since the last sending of TMMBN.  The TMMBN message indicates the
1735	   calculated set of bounding tuples and the owners of those tuples at
1736	   the time of the transmission of the message.

1738	   An SSRC may time out according to the default rules for RTP session
1739	   participants, i.e. the media sender has not received any RTP or RTCP
1740	   packets from the owner for the last five regular reporting
1741	   intervals.  An SSRC may also explicitly leave the session, with the
1742	   participant indicating this through the transmission of an RTCP BYE
1743	   packet or using an external signaling channel.  If the media sender
1744	   determines that the owner of a tuple in the bounding set has left
1745	   the session, the media sender SHALL transmit a new TMMBN containing
1746	   the previously-determined set of bounding tuples but with the tuple
1747	   belonging to the departed owner removed.
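
   The handling of a departed owner could look like the following
   Python sketch.  Representing the bounding set as a dictionary keyed
   by owner SSRC, and the function name, are assumptions made for
   illustration.

```python
from typing import Dict, Tuple

# Hypothetical representation: owner SSRC -> (MxTBR in bit/s, overhead in bytes)
BoundingSet = Dict[int, Tuple[int, int]]


def remove_departed_owner(
    bounding_set: BoundingSet, departed_ssrc: int
) -> Tuple[BoundingSet, bool]:
    """Drop the departed owner's tuple from the bounding set.

    Returns the remaining bounding set and a flag telling whether a new
    TMMBN has to be transmitted (True only if the departed SSRC owned a
    tuple in the set).
    """
    if departed_ssrc not in bounding_set:
        return dict(bounding_set), False
    remaining = {
        ssrc: tup for ssrc, tup in bounding_set.items() if ssrc != departed_ssrc
    }
    return remaining, True
```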

1749	   A media sender MAY proactively initiate the equivalent to a TMMBR
1750	   message to itself, when it is aware that its transmission path is
1751	   more restrictive than the current limitations.  As a result, a TMMBN
1752	   indicating the media source itself as the owner of a tuple is sent,
1753	   thereby avoiding unnecessary TMMBR messages from other
1754	   participants. However, like any other participant, when the media
1755	   sender becomes aware of changed limitations, it is required to
1756	   change the tuple, and to send a corresponding TMMBN.

1758	Discussion

1760	   Due to the unreliable nature of transport of TMMBR and TMMBN, the
1761	   above rules may lead to the sending of TMMBR messages which appear
1762	   to disobey those rules.  Furthermore, in multicast scenarios it can
1763	   happen that more than one "non-owning" session participant may
1764	   determine, rightly or wrongly, that its tuple belongs in the
1765	   bounding set.  This is not critical for a number of reasons:

1767	   a) If a TMMBR message is lost in transmission, either the media
1768	      sender sends a new TMMBN message in response to some other media
1769	      receiver or it does not send a new TMMBN message at all.  In the
1770	      first case, the media receiver applies the incremental algorithm
1771	      and, if it determines that its tuple should be part of the
1772	      bounding set, sends out another TMMBR.  In the second case, it
1773	      repeats the sending of a TMMBR unconditionally.  Either way, the
1774	      media sender eventually gets the information it needs.

1776	   b) Similarly, if a TMMBN message gets lost, the media receiver that
1777	      has sent the corresponding TMMBR request does not receive the
1778	      notification and is expected to re-send the request and trigger
1779	      the transmission of another TMMBN.

1781	   c) If multiple competing TMMBR messages are sent by different
1782	      session participants, then the algorithm can be applied taking
1783	      all of these messages into account, and the resulting TMMBN
1784	      provides the participants with an updated view of how their
1785	      tuples compare with the bounding set.

1787	   d) If more than one session participant happens to send TMMBR
1788	      messages at the same time and with the same tuple component
1789	      values, it does not matter which of those tuples is taken into
1790	      the bounding set.  The losing session participant will determine,
1791	      after applying the algorithm, that its tuple does not enter the
1792	      bounding set, and will therefore stop sending its TMMBR request.

1794	   It is important to consider the security risks involved with faked
1795	   TMMBRs.  See the security considerations in Section 6.

1797	   As indicated already, the feedback messages may be used in both
1798	   multicast and unicast sessions in any of the specified topologies.
1799	   However, for sessions with a large number of participants, using the
1800	   lowest common denominator, as required by this mechanism, may not be
1801	   the most suitable course of action.  Large sessions may need to
1802	   consider other ways to adapt the bit rate to participants'
1803	   capabilities, such as partitioning the session into different
1804	   quality tiers, or using some other method of achieving bit rate
1805	   scalability.

1807	4.2.1.3. Timing Rules

1809	   The first transmission of the TMMBR request message MAY use early or
1810	   immediate feedback in cases when timeliness is desirable.  Any
1811	   repetition of a request message SHOULD use regular RTCP mode for its
1812	   transmission timing.

1814	4.2.1.4. Handling in Translators and Mixers

1816	   Media translators and mixers will need to receive and respond to
1817	   TMMBR messages as they are part of the chain that provides a certain
1818	   media stream to the receiver.  The mixer or translator may act
1819	   locally on the TMMBR request and thus generate a TMMBN to indicate
1820	   that it has done so.  Alternatively, a media translator can forward
1821	   the request, and a mixer can generate a request of its own and
1822	   pass it forward.  In the latter case,
1823	   the mixer will need to send a TMMBN back to the original requestor
1824	   to indicate that it is handling the request.

1826	4.2.2. Temporary Maximum Media Stream Bit Rate Notification (TMMBN)

1828	   The Temporary Maximum Media Stream Bit Rate Notification is
1829	   identified by RTCP packet type value PT=RTPFB and FMT=4.

1831	   The FCI field of the TMMBN Feedback message may contain zero, one or
1832	   more TMMBN FCI entries.

1834	4.2.2.1. Message Format

1836	   The Feedback Control Information (FCI) consists of zero, one or more
1837	   TMMBN FCI entries with the following syntax:

1839	    0                   1                   2                   3
1840	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1841	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1842	   |                              SSRC                             |
1843	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1844	   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1845	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1846	    Figure 3 - Syntax of an FCI entry in the TMMBN message

1848	     SSRC (32 bits): The SSRC value of the "owner" of this tuple.

1850	     MxTBR Exp (6 bits): The exponential scaling of the mantissa for
1851	              the maximum total media bit rate value.  The value is an
1852	              unsigned integer [0..63].

1854	     MxTBR Mantissa (17 bits): The mantissa of the maximum total media
1855	              bit rate value as an unsigned integer.

1857	     Measured Overhead (9 bits): The measured average packet overhead
1858	              value in bytes represented as an unsigned integer
1859	              [0..511].

1861	   Thus, the FCI within the TMMBN message contains entries indicating
1862	   the bounding tuples.  For each tuple, the entry gives the owner by
1863	   the SSRC, followed by the applicable maximum total media bit rate
1864	   and overhead value.

1866	   The length of the TMMBN message SHALL be set to 2+2*N where N is the
1867	   number of TMMBN FCI entries.
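
   The following Python sketch shows one way a complete TMMBN message
   could be serialised.  It is illustrative only.  The function names
   are hypothetical; the common header layout (V=2, P=0, FMT, PT,
   length, two SSRC fields) and the RTPFB packet type value 205 are
   taken from [RFC4585].

```python
import struct
from typing import Iterable, Tuple

RTPFB = 205    # RTCP packet type for transport layer FB messages [RFC4585]
FMT_TMMBN = 4  # FMT value of the TMMBN message

# (owner SSRC, MxTBR Exp, MxTBR Mantissa, Measured Overhead)
Entry = Tuple[int, int, int, int]


def pack_tmmbn_fci_entry(ssrc: int, exp: int, mantissa: int, overhead: int) -> bytes:
    """Pack one FCI entry: 32-bit SSRC, then 6 + 17 + 9 bits."""
    if not (0 <= ssrc < 1 << 32 and 0 <= exp < 1 << 6
            and 0 <= mantissa < 1 << 17 and 0 <= overhead < 1 << 9):
        raise ValueError("field out of range")
    word = (exp << 26) | (mantissa << 9) | overhead
    return struct.pack("!II", ssrc, word)


def pack_tmmbn(sender_ssrc: int, entries: Iterable[Entry]) -> bytes:
    """Build a TMMBN message; an empty entry list yields a TMMBN with no FCI."""
    fci = b"".join(pack_tmmbn_fci_entry(*e) for e in entries)
    n = len(fci) // 8                   # each FCI entry is 8 bytes
    first_byte = (2 << 6) | FMT_TMMBN   # V=2, P=0, FMT=4
    # Length field is 2+2*N; "SSRC of media source" is set to 0.
    header = struct.pack("!BBHII", first_byte, RTPFB, 2 + 2 * n, sender_ssrc, 0)
    return header + fci
```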

1869	4.2.2.2. Semantics

1871	   This feedback message is used to notify the senders of any TMMBR
1872	   message that one or more TMMBR messages have been received or that
1873	   an owner has left the session.  It indicates to all participants the
1874	   current set of bounding tuples and the "owners" of those tuples.

1876	   Within the common packet header for feedback messages (as defined in
1877	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
1878	   indicates the source of the notification.  The "SSRC of media
1879	   source" is not used and SHALL be set to 0.

1881	   A TMMBN message SHALL be scheduled for transmission after the
1882	   reception of a TMMBR message with an FCI entry identifying this
1883	   media sender.  Only a single TMMBN SHALL be sent, even if more than
1884	   one TMMBR message is received between the scheduling of the
1885	   transmission and the actual transmission of the TMMBN message.  The
1886	   TMMBN message indicates the bounding tuples and their owners at the
1887	   time of transmitting the message.  The bounding tuples included
1888	   SHALL be the set arrived at through application of the applicable
1889	   algorithm of section 3.5.4.2 or an equivalent, applied to the
1890	   previous bounding set, if any, and tuples received in TMMBR messages
1891	   since the last TMMBN was transmitted.

1893	   The reception of a TMMBR message SHALL still result in the
1894	   transmission of a TMMBN message even if, after application of the
1895	   algorithm, the newly reported TMMBR tuple is not accepted into the
1896	   bounding set.  In such a case the bounding tuples and their owners
1897	   are not changed, unless the TMMBR was from an owner of a tuple
1898	   within the previously calculated bounding set.  This procedure
1899	   allows session participants that did not see the last TMMBN message
1900	   to get a correct view of this media sender's state.

1902	   As indicated in section 4.2.1.2, when a media sender determines that
1903	   an "owner" of a bounding tuple has left the session, then that tuple
1904	   is removed from the bounding set, and the media sender SHALL send a
1905	   TMMBN message indicating the remaining bounding tuples.  If there
1906	   are no remaining bounding tuples a TMMBN without any FCI SHALL be
1907	   sent to indicate this.  Without a remaining bounding tuple, the
1908	   maximum media bit rate and maximum packet rate negotiated in session
1909	   signaling, if any, apply.

1911	     Note: if any media receivers remain in the session, this last will
1912	     be a temporary situation.  The empty TMMBN will cause every
1913	     remaining media receiver to determine that its limitation belongs
1914	     in the bounding set and send a TMMBR in consequence.

1916	   In unicast scenarios (i.e. where a single sender talks to a single
1917	   receiver), the aforementioned algorithm to determine ownership
1918	   degenerates to the media receiver becoming the "owner" of the one
1919	   bounding tuple as soon as the media receiver has issued the first
1920	   TMMBR message.

1922	4.2.2.3. Timing Rules

1924	   The TMMBN acknowledgement SHOULD be sent as soon as allowed by the
1925	   applied timing rules for the session.  Immediate or early feedback
1926	   mode SHOULD be used for these messages.

1928	4.2.2.4. Handling by Translators and Mixers

1930	   As discussed in Section 4.2.1.4, mixers or translators may need to
1931	   issue TMMBN messages as responses to TMMBR messages for SSRCs
1932	   handled by them.

1934	4.3. Payload Specific Feedback Messages
1935	   As specified by section 6.1 of RFC 4585 [RFC4585], Payload-Specific
1936	   FB messages are identified by the RTCP packet type value PSFB (206).

1938	   AVPF [RFC4585] defines three payload-specific feedback messages and
1939	   one application layer feedback message.  This memo specifies four
1940	   additional payload-specific feedback messages.  All are identified
1941	   by means of the FMT parameter as follows:

1943	   Assigned in [RFC4585]:

1945	     1:     Picture Loss Indication (PLI)
1946	     2:     Slice Loss Indication (SLI)
1947	     3:     Reference Picture Selection Indication (RPSI)
1948	     15:    Application layer FB message
1949	     31:    reserved for future expansion of the number space

1951	   Assigned in this memo:

1953	     4:     Full Intra Request Command (FIR)
1954	     5:     Temporal-Spatial Trade-off Request (TSTR)
1955	     6:     Temporal-Spatial Trade-off Notification (TSTN)
1956	     7:     Video Back Channel Message (VBCM)

1958	   Unassigned:

1960	     0:     unassigned
1961	     8-14:  unassigned
1962	     16-30: unassigned

1964	   The following subsections define the new FCI formats for the
1965	   payload-specific feedback messages.

1967	4.3.1. Full Intra Request (FIR)

1969	   The FIR message is identified by RTCP packet type value PT=PSFB and
1970	   FMT=4.

1972	   The FCI field MUST contain one or more FIR entries.  Each entry
1973	   applies to a different media sender, identified by its SSRC.

1975	4.3.1.1. Message Format

1977	   The Feedback Control Information (FCI) for the Full Intra Request
1978	   consists of one or more FCI entries, the content of which is
1979	   depicted in Figure 4.  The length of the FIR feedback message MUST
1980	   be set to 2+2*N, where N is the number of FCI entries.

1982	    0                   1                   2                   3
1983	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1984	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1985	   |                              SSRC                             |
1986	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1987	   | Seq. nr       |    Reserved                                   |
1988	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1990	    Figure 4 - Syntax of an FCI entry in the FIR message

1992	     SSRC (32 bits): The SSRC value of the media sender which is
1993	              requested to send a decoder refresh point.

1995	     Seq. nr (8 bits): Command sequence number.  The sequence number
1996	              space is unique for each pairing of the SSRC of command
1997	              source and the SSRC of the command target.  The sequence
1998	              number SHALL be increased by 1 modulo 256 for each new
1999	              command.  A repetition SHALL NOT increase the sequence
2000	              number.  The initial value is arbitrary.

2002	     Reserved (24 bits): All bits SHALL be set to 0 by the sender and
2003	              SHALL be ignored on reception.

2005	   The semantics of this feedback message is independent of the RTP
2006	   payload type.
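
   A minimal Python sketch of the FIR FCI entry and its sequence number
   handling is given below for illustration.  The function names are
   hypothetical.  The caller is assumed to keep one sequence number per
   pairing of command source and command target, and to call
   next_fir_seq() only for a new command, not for a repetition.

```python
import struct


def next_fir_seq(previous_seq: int) -> int:
    """Sequence number for a NEW command: previous value + 1 modulo 256."""
    return (previous_seq + 1) % 256


def pack_fir_entry(target_ssrc: int, seq_nr: int) -> bytes:
    """Pack one FIR FCI entry: 32-bit SSRC, 8-bit Seq. nr, 24 reserved bits (0)."""
    if not 0 <= target_ssrc < 1 << 32:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= seq_nr < 256:
        raise ValueError("sequence number must fit in 8 bits")
    return struct.pack("!II", target_ssrc, seq_nr << 24)
```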

2008	4.3.1.2. Semantics

2010	   Within the common packet header for feedback messages (as defined in
2011	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2012	   indicates the source of the request, and the "SSRC of media source"
2013	   is not used and SHALL be set to 0.  The SSRCs of the media senders
2014	   to which the FIR command applies are in the corresponding FCI
2015	   entries.  A FIR message MAY contain requests to multiple media
2016	   senders, using one FCI entry per target media sender.

2018	   Upon reception of FIR, the encoder MUST send a decoder refresh point
2019	   (see section 2.2) as soon as possible.

2021	   The sender MUST consider congestion control as outlined in section
2022	   5, which MAY restrict its ability to send a decoder refresh point
2023	   quickly.

2025	   FIR SHALL NOT be sent as a reaction to picture losses -- it is
2026	   RECOMMENDED to use PLI [RFC4585] instead.  FIR SHOULD be used only
2027	   in situations where not sending a decoder refresh point would render
2028	   the video unusable for the users.

2030	   A typical example where sending FIR is appropriate is when, in a
2031	   multipoint conference, a new user joins the session and no regular
2032	   decoder refresh point interval is established.  Another example
2033	   would be a video switching MCU that changes streams.  Here,
2034	   normally, the MCU issues a FIR to the new sender so as to force it to
2035	   emit a decoder refresh point.  The decoder refresh point normally
2036	   includes a Freeze Picture Release (defined outside this
2037	   specification), which re-starts the rendering process of the
2038	   receivers.  Both techniques mentioned are commonly used in MCU-based
2039	   multipoint conferences.

2041	   Other RTP payload specifications such as RFC 2032 [RFC2032] already
2042	   define a feedback mechanism for certain codecs.  An application
2043	   supporting both schemes MUST use the feedback mechanism defined in
2044	   this specification when sending feedback.  For backward
2045	   compatibility reasons such an application SHOULD also be capable of
2046	   receiving and reacting to the feedback scheme defined in the
2047	   respective RTP payload format, if this is required by that payload
2048	   format.

2050	4.3.1.3. Timing Rules

2052	   The timing follows the rules outlined in section 3 of [RFC4585].
2053	   FIR commands MAY be used with early or immediate feedback.  The FIR
2054	   feedback message MAY be repeated.  If using immediate feedback mode
2055	   the repetition SHOULD wait at least one RTT before being sent.  In
2056	   early or regular RTCP mode the repetition is sent in the next
2057	   regular RTCP packet.

2059	4.3.1.4. Handling of FIR Message in Mixers and Translators

2061	   A media translator or a mixer performing media encoding of the
2062	   content for which the session participant has issued a FIR is
2063	   responsible for acting upon it.  A mixer acting upon a FIR SHOULD
2064	   NOT forward the message unaltered; instead it SHOULD issue a FIR
2065	   itself.

2067	4.3.1.5. Remarks
2068	   Currently, video appears to be the only useful application for FIR,
2069	   as it appears to be the only RTP payload widely deployed that relies
2070	   heavily on media prediction across RTP packet boundaries.  However,
2071	   use of FIR could also reasonably be envisioned for other media types
2072	   that share essential properties with compressed video, namely cross-
2073	   frame prediction (whatever a frame may be for that media type).  One
2074	   possible example may be the dynamic updates of MPEG-4 scene
2075	   descriptions.  It is suggested that payload formats for such media
2076	   types refer to FIR and other message types defined in this
2077	   specification and in AVPF [RFC4585], instead of creating similar
2078	   mechanisms in the payload specifications.  The payload
2079	   specifications may have to explain how the payload-specific
2080	   terminologies map to the video-centric terminology used herein.

2082	   In conjunction with video codecs, FIR messages typically trigger the
2083	   sending of full intra or IDR pictures.  Both are several times
2084	   larger than predicted (inter) pictures.  Their size is independent
2085	   of the time they are generated.  In most environments, especially
2086	   when employing bandwidth-limited links, the use of an intra picture
2087	   implies an allowed delay that is a significant multiple of the
2088	   typical frame duration.  An example: if the sending frame rate is 10
2089	   fps, and an intra picture is assumed to be 10 times as big as an
2090	   inter picture, then a full second of latency has to be accepted.  In
2091	   such an environment there is no need for a particularly short delay
2092	   in sending the FIR message.  Hence, waiting for the next possible
2093	   time slot allowed by RTCP timing rules as per [RFC4585] should not
2094	   have an overly negative impact on the system performance.
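
   The arithmetic behind the example above can be written out as the
   short Python sketch below.  It rests on an assumption that the text
   leaves implicit: the link carries exactly one inter picture per
   frame interval, so a picture N times as large occupies N frame
   intervals.  The function name is hypothetical.

```python
def intra_picture_latency(frame_rate_fps: float, intra_to_inter_ratio: float) -> float:
    """Seconds needed to send one intra picture.

    Assumption: the available bit rate transports exactly one inter
    picture per frame interval (1 / frame_rate_fps seconds).
    """
    if frame_rate_fps <= 0:
        raise ValueError("frame rate must be positive")
    return intra_to_inter_ratio / frame_rate_fps
```

   At 10 fps and a size ratio of 10, the result is 1.0 second, which
   matches the figure given above.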

2096	   Mandating a maximum delay for completing the sending of a decoder
2097	   refresh point would be desirable from an application viewpoint, but
2098	   is problematic from a congestion control point of view.  "As soon as
2099	   possible" as mentioned above appears to be a reasonable compromise.

2101	   In environments where the sender has no control over the codec (e.g.
2102	   when streaming pre-recorded and pre-coded content), the reaction to
2103	   this command cannot be specified.  One suitable reaction of a sender
2104	   would be to skip forward in the video bit stream to the next decoder
2105	   refresh point.  In other scenarios, it may be preferable not to
2106	   react to the command at all, e.g. when streaming to a large
2107	   multicast group.  Other reactions may also be possible.  When
2108	   deciding on a strategy, a sender could take into account factors
2109	   such as the size of the receiving group, the "importance" of the
2110	   sender of the FIR message (however "importance" may be defined in
2111	   this specific application), the frequency of decoder refresh points
2112	   in the content, and so on.  However, a session which predominantly
2113	   handles pre-coded content is not expected to use FIR at all.

2115	   The relationship between the Picture Loss Indication and FIR is as
2116	   follows.  As discussed in section 6.3.1 of AVPF [RFC4585], a Picture
2117	   Loss Indication informs the decoder about the loss of a picture and
2118	   hence the likelihood of misalignment of the reference pictures
2119	   between the encoder and decoder.  Such a scenario is normally
2120	   related to losses in an ongoing connection.  In point-to-point
2121	   scenarios, and without the presence of advanced error resilience
2122	   tools, one possible option for an encoder consists in sending a
2123	   decoder refresh point.  However, there are other options.  One
2124	   example is that the media sender ignores the PLI, because the
2125	   embedded stream redundancy is likely to clean up the reproduced
2126	   picture within a reasonable amount of time.  The FIR, in contrast,
2127	   leaves a (real-time) encoder no choice but to send a decoder refresh
2128	   point.  It does not allow the encoder to take into account any
2129	   considerations such as the ones mentioned above.

2131	4.3.2. Temporal-Spatial Trade-off Request (TSTR)

2133	   The TSTR feedback message is identified by RTCP packet type value
2134	   PT=PSFB and FMT=5.

2136	   The FCI field MUST contain one or more TSTR FCI entries.

2138	4.3.2.1. Message Format

2140	   The content of the FCI entry for the Temporal-Spatial Trade-off
2141	   Request is depicted in Figure 5.  The length of the feedback message
2142	   MUST be set to 2+2*N, where N is the number of FCI entries included.

2144	    0                   1                   2                   3
2145	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2146	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2147	   |                              SSRC                             |
2148	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2149	   |  Seq. nr      |  Reserved                           | Index   |
2150	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2152	    Figure 5 - Syntax of an FCI Entry in the TSTR Message

2154	     SSRC (32 bits): The SSRC of the media sender which is requested to
2155	              apply the tradeoff value given in Index.

2157	     Seq. nr (8 bits): Request sequence number.  The sequence number
2158	              space is unique for each pairing of the SSRC of the
2159	              request source and the SSRC of the request target.  The
2160	              sequence number SHALL be increased by 1 modulo 256 for
2161	              each new request.  A repetition SHALL NOT increase the
2162	              sequence number.  The initial value is arbitrary.

2164	     Reserved (19 bits): All bits SHALL be set to 0 by the sender and
2165	              SHALL be ignored on reception.

2167	     Index (5 bits): An integer value between 0 and 31 that indicates
2168	              the relative trade-off that is requested.  An index value
2169	              of 0 indicates highest possible spatial quality, while 31
2170	              indicates highest possible temporal resolution.
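
   For illustration, a TSTR FCI entry could be packed and parsed as in
   the Python sketch below.  The function names are hypothetical.  The
   parser masks out the 19 reserved bits, in line with the rule that
   they are ignored on reception.

```python
import struct
from typing import Tuple


def pack_tstr_entry(target_ssrc: int, seq_nr: int, index: int) -> bytes:
    """Pack SSRC (32), Seq. nr (8), Reserved (19, all zero), Index (5)."""
    if not 0 <= target_ssrc < 1 << 32:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= seq_nr < 256:
        raise ValueError("sequence number must fit in 8 bits")
    if not 0 <= index <= 31:
        raise ValueError("index must be between 0 and 31")
    return struct.pack("!II", target_ssrc, (seq_nr << 24) | index)


def unpack_tstr_entry(data: bytes) -> Tuple[int, int, int]:
    """Return (SSRC, Seq. nr, Index); reserved bits are ignored."""
    ssrc, word = struct.unpack("!II", data)
    return ssrc, word >> 24, word & 0x1F
```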

2172	4.3.2.2. Semantics

2174	   A decoder can suggest a temporal-spatial trade-off level by sending
2175	   a TSTR message to an encoder.  If the encoder is capable of
2176	   adjusting its temporal-spatial trade-off, it SHOULD take into
2177	   account the received TSTR message for future coding of pictures.  A
2178	   value of 0 suggests a high spatial quality and a value of 31
2179	   suggests a high frame rate.  The progression of values from 0 to 31
2180	   indicates monotonically a desire for higher frame rate.  The index
2181	   values do not correspond to precise values of spatial quality or
2182	   frame rate.

2184	   The reaction to the reception of more than one TSTR message by a
2185	   media sender from different media receivers is left open to the
2186	   implementation.  The selected trade-off SHALL be communicated to the
2187	   media receivers by the means of the TSTN message.

2189	   Within the common packet header for feedback messages (as defined in
2190	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2191	   indicates the source of the request, and the "SSRC of media source"
2192	   is not used and SHALL be set to 0.  The SSRCs of the media senders
2193	   to which the TSTR applies are in the corresponding FCI entries.

2195	   A TSTR message MAY contain requests to multiple media senders, using
2196	   one FCI entry per target media sender.

2198	4.3.2.3. Timing Rules

2200	   The timing follows the rules outlined in section 3 of [RFC4585].
2201	   This request message is not time critical and SHOULD be sent using
2202	   regular RTCP timing.  The message MAY be sent with early or
2203	   immediate feedback timing only if it is known that the user
2204	   interface requires quick feedback.

2206	4.3.2.4. Handling of Message in Mixers and Translators

2208	   A mixer or media translator that encodes content sent to the session
2209	   participant issuing the TSTR SHALL consider the request to determine
2210	   if it can fulfill it by changing its own encoding parameters.  A
2211	   media translator unable to fulfill the request MAY forward the
2212	   request unaltered towards the media sender.  A mixer encoding for
2213	   multiple session participants will need to consider the joint needs
2214	   of these participants before generating a TSTR on its own behalf
2215	   towards the media sender.  See also the discussion in Section 3.5.2.

2217	4.3.2.5. Remarks

2219	   The term "spatial quality" does not necessarily refer to the
2220	   resolution as measured by the number of pixels the reconstructed
2221	   video is using.  In fact, in most scenarios the video resolution
2222	   stays constant during the lifetime of a session.  However, all video
2223	   compression standards have means to adjust the spatial quality at a
2224	   given resolution, often influenced by the Quantizer Parameter or QP.
2225	   A numerically low QP results in a good reconstructed picture
2226	   quality, whereas a numerically high QP yields a coarse picture.  The
2227	   typical reaction of an encoder to this request is to change its rate
2228	   control parameters to use a lower frame rate and a numerically lower
2229	   (on average) QP, or vice versa.  The precise mapping of Index value
2230	   to frame rate and QP is intentionally left open here, as it depends
2231	   on factors such as the compression standard employed, spatial
2232	   resolution, content, bit rate, and so on.

2234	4.3.3. Temporal-Spatial Trade-off Notification (TSTN)

2236	   The TSTN message is identified by RTCP packet type value PT=PSFB and
2237	   FMT=6.

2239	   The FCI field SHALL contain one or more TSTN FCI entries.

2241	4.3.3.1. Message Format

2243	   The content of an FCI entry for the Temporal-Spatial Trade-off
2244	   Notification is depicted in Figure 6.  The length of the TSTN
2245	   message MUST be set to 2+2*N, where N is the number of FCI entries.

2247	    0                   1                   2                   3
2248	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2249	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2250	   |                              SSRC                             |
2251	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2252	   |  Seq. nr      |  Reserved                           | Index   |
2253	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2255	    Figure 6 - Syntax of the TSTN

2257	     SSRC (32 bits): The SSRC of the source of the TSTR request which
2258	              resulted in this Notification.

2260	     Seq. nr (8 bits): The sequence number value from the TSTR request
2261	              that is being acknowledged.

2263	     Reserved (19 bits): All bits SHALL be set to 0 by the sender and
2264	              SHALL be ignored on reception.

2266	     Index (5 bits): The trade-off value the media sender is using
2267	              henceforth.

2269	      Informative note: The returned trade-off value (Index) may differ
2270	      from the requested one, for example in cases where a media encoder
2271	      cannot tune its trade-off, or when pre-recorded content is used.
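
   As an illustration of the layout in Figure 6, the following Python
   sketch packs and parses a single TSTN FCI entry and computes the
   RTCP length field from the number of entries.  The function names
   are hypothetical and not part of this specification; the sketch
   assumes network byte order, as used for all RTCP fields.

```python
import struct


def pack_tstn_entry(ssrc: int, seq_nr: int, index: int) -> bytes:
    """Pack one 8-octet TSTN FCI entry laid out as in Figure 6."""
    if not 0 <= ssrc <= 0xFFFFFFFF:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= seq_nr <= 0xFF:
        raise ValueError("sequence number must fit in 8 bits")
    if not 0 <= index <= 0x1F:
        raise ValueError("Index must fit in 5 bits")
    # Seq. nr is the top 8 bits, the 19 reserved bits are sent as 0,
    # and Index is the low 5 bits of the second 32-bit word.
    word = (seq_nr << 24) | index
    return struct.pack("!II", ssrc, word)


def unpack_tstn_entry(data: bytes) -> tuple:
    """Return (ssrc, seq_nr, index); reserved bits are ignored."""
    if len(data) != 8:
        raise ValueError("a TSTN FCI entry is exactly 8 octets")
    ssrc, word = struct.unpack("!II", data)
    return ssrc, word >> 24, word & 0x1F


def tstn_length_field(num_entries: int) -> int:
    """Length field value for a TSTN message with N FCI entries."""
    return 2 + 2 * num_entries
```

   The parser masks out the reserved bits rather than rejecting
   non-zero values, matching the rule that they are ignored on
   reception.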

2273	4.3.3.2. Semantics

2275	   This feedback message is used to acknowledge the reception of a
2276	   TSTR.  For each received TSTR targeted at the session participant, a
2277	   TSTN FCI entry SHALL be sent in a TSTN feedback message.  A
2278	   single TSTN message MAY acknowledge multiple requests using multiple
2279	   FCI entries.  The index value included SHALL be the same in all FCI
2280	   entries of the TSTN message.  Including an FCI for each requestor
2281	   allows each requesting entity to determine that the media sender
2282	   received the request.  The Notification SHALL also be sent in
2283	   response to TSTR repetitions received.  If the request receiver has
2284	   received TSTR with several different sequence numbers from a single
2285	   requestor it SHALL only respond to the request with the highest
2286	   (modulo 256) sequence number.  Note that the highest sequence number
2287	   may be a smaller integer value due to the wrapping of the field.
2288	   Section A.1 of [RFC3550] has an algorithm for keeping track of the
2289	   highest received sequence number for RTP packets; this could be
2290	   adapted for this usage.
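
   The wrap-around comparison can be sketched as follows.  This is
   not the algorithm of Section A.1 of [RFC3550]; it is a simpler
   serial-number style comparison, under the assumption that the
   sequence numbers received from one requestor never span more than
   half of the 8-bit space.  The function names are hypothetical.

```python
def is_newer_seq(candidate: int, current: int) -> bool:
    """True if candidate is ahead of current, modulo 256.

    Assumes the two values are less than 128 steps apart.
    """
    diff = (candidate - current) % 256
    return 0 < diff < 128


def highest_seq(received):
    """Return the highest (modulo 256) sequence number, in arrival order."""
    highest = None
    for seq in received:
        if highest is None or is_newer_seq(seq, highest):
            highest = seq
    return highest
```

   For instance, after receiving 254, 255, 1 and then a late 0 from
   the same requestor, the value to acknowledge is 1, even though it
   is numerically smaller than 255.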

2292	   The TSTN SHALL include the Temporal-Spatial Trade-off index that
2293	   will be used as a result of the request.  This is not necessarily
2294	   the same index as requested, as the media sender may need to
2295	   aggregate requests from several requesting session participants.  It
2296	   may also have some other policies or rules that limit the selection.

2298	   Within the common packet header for feedback messages (as defined in
2299	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2300	   indicates the source of the Notification, and the "SSRC of media
2301	   source" is not used and SHALL be set to 0.  The SSRCs of the
2302	   requesting entities to which the Notification applies are in the
2303	   corresponding FCI entries.

2305	4.3.3.3. Timing Rules

2307	   The timing follows the rules outlined in section 3 of [RFC4585].
2308	   This acknowledgement message is not extremely time critical and
2309	   SHOULD be sent using regular RTCP timing.

2311	4.3.3.4. Handling of TSTN in Mixer and Translators

2313	   A mixer or translator that acts upon a TSTR SHALL also send the
2314	   corresponding TSTN.  In cases where it needs to forward a TSTR
2315	   itself the notification message MAY need to be delayed until the
2316	   TSTR has been responded to.

2318	4.3.3.5. Remarks

2320	   None

2322	4.3.4. H.271 Video Back Channel Message (VBCM)

2324	   The VBCM is identified by RTCP packet type value PT=PSFB and FMT=7.

2326	   The FCI field MUST contain one or more VBCM FCI entries.

2328	4.3.4.1. Message Format

2330	   The syntax of an FCI entry within the VBCM indication is depicted in
2331	   Figure 7.

2333	    0                   1                   2                   3
2334	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2335	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2336	   |                              SSRC                             |
2337	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2338	   | Seq. nr       |0| Payload Type| Length                        |
2339	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2340	   |                    VBCM Octet String....      |    Padding    |
2341	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2343	   Figure 7 - Syntax of an FCI Entry in the VBCM Message

2345	   SSRC (32 bits): The SSRC value of the media sender that is requested
2346	          to instruct its encoder to react to the VBCM message.

2348	   Seq. nr (8 bits): Command sequence number.  The sequence number
2349	          space is unique for pairing of the SSRC of command source and
2350	          the SSRC of the command target.  The sequence number SHALL be
2351	          increased by 1 modulo 256 for each new command.  A repetition
2352	          SHALL NOT increase the sequence number.  The initial value is
2353	          arbitrary.

2355	   0: Must be set to 0 by the sender and should not be acted upon by
2356	          the message receiver.

2358	   Payload Type (7 bits): The RTP payload type for which the VBCM bit
2359	          stream must be interpreted.

2361	   Length (16 bits): The length of the VBCM octet string in octets
2362	          exclusive of any padding octets.

2364	   VBCM Octet String (Variable length): This is the octet string
2365	          generated by the decoder carrying a specific feedback sub-
2366	          message.

2368	   Padding (Variable length): Bits set to 0 to make up a 32-bit
2369	          boundary.
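
   The following Python sketch shows one way to serialize and parse
   a VBCM FCI entry as laid out in Figure 7, including the zero
   padding to a 32-bit boundary.  The function names are
   hypothetical; the H.271 octet string itself is treated as opaque
   bytes and is not interpreted.

```python
import struct


def pack_vbcm_entry(ssrc: int, seq_nr: int, payload_type: int,
                    octets: bytes) -> bytes:
    """Pack one VBCM FCI entry, padded with zeros to a 32-bit boundary."""
    if not 0 <= ssrc <= 0xFFFFFFFF:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= seq_nr <= 0xFF:
        raise ValueError("sequence number must fit in 8 bits")
    if not 0 <= payload_type <= 0x7F:
        raise ValueError("RTP payload type must fit in 7 bits")
    if len(octets) > 0xFFFF:
        raise ValueError("octet string too long for 16-bit Length")
    # The bit preceding Payload Type is always 0, so the payload type
    # can be written directly as one octet.  Length excludes padding.
    header = struct.pack("!IBBH", ssrc, seq_nr, payload_type, len(octets))
    padding = b"\x00" * (-len(octets) % 4)
    return header + octets + padding


def unpack_vbcm_entry(data: bytes) -> tuple:
    """Return (ssrc, seq_nr, payload_type, octets) from one FCI entry."""
    if len(data) < 8:
        raise ValueError("entry shorter than its fixed 8-octet header")
    ssrc, seq_nr, pt_octet, length = struct.unpack("!IBBH", data[:8])
    octets = data[8:8 + length]
    if len(octets) != length:
        raise ValueError("entry truncated")
    # Mask off the leading bit, which the receiver does not act upon.
    return ssrc, seq_nr, pt_octet & 0x7F, octets
```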

2371	4.3.4.2. Semantics

2373	   The "payload" of the VBCM indication carries different types of
2374	   codec-specific feedback information.  The type of feedback
2375	   information can be classified as a 'status report' (such as an
2376	   indication that a bit stream was received without errors, or that a
2377	   partial or complete picture or block was lost) or 'update requests'
2378	   (such as complete refresh of the bit stream).

2380	          Note: There are possible overlaps between the VBCM sub-
2381	          messages and CCM/AVPF feedback messages, such as FIR.  Please
2382	          see section 3.5.3 for further discussion.

2384	   The different types of feedback sub-messages carried in the VBCM are
2385	   indicated by the "payloadType" as defined in [VBCM].  These sub-
2386	   message types are reproduced below for convenience.  "payloadType",
2387	   in ITU-T Rec. H.271 terminology, refers to the sub-type of the H.271
2388	   message and should not be confused with an RTP payload type.

2390	   Payload          Message Content
2391	   Type
2392	   --------------------------------------------------------------------
2393	   0      One or more pictures without detected bit stream error
2394	          mismatch
2395	   1      One or more pictures that are entirely or partially lost
2396	   2      A set of blocks of one picture that is entirely or partially
2397	          lost
2398	   3      CRC for one parameter set
2399	   4      CRC for all parameter sets of a certain type
2400	   5      A "reset" request indicating that the sender should
2401	          completely refresh the video bit stream as if no prior
2402	          bit stream data had been received
2404	   > 5    Reserved for future use by ITU-T

2406	   Table 2: H.271 message types ("payloadTypes")

2408	   The bit string or the "payload" of a VBCM message is of variable
2409	   length and is self-contained and coded in a variable length, binary
2410	   format.  The media sender necessarily has to be able to parse this
2411	   optimized binary format to make use of VBCM messages.

2413	   Each of the different types of sub-messages (indicated by
2414	   payloadType) may have different semantics depending on the codec
2415	   used.

2417	   Within the common packet header for feedback messages (as defined in
2418	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2419	   indicates the source of the request, and the "SSRC of media source"
2420	   is not used and SHALL be set to 0.  The SSRCs of the media senders
2421	   to which the VBCM message applies are in the corresponding FCI
2422	   entries.  The sender of the VBCM message MAY send H.271 messages to
2423	   multiple media senders and MAY send more than one H.271 message to
2424	   the same media sender within the same VBCM message.

2426	4.3.4.3. Timing Rules

2428	   The timing follows the rules outlined in section 3 of [RFC4585].
2429	   The different sub-message types may have different properties with
2430	   regard to the timing of messages that should be used.  If several
2431	   different types are included in the same feedback packet then the
2432	   requirements for the sub-message type with the most stringent
2433	   requirements should be followed.

2435	4.3.4.4. Handling of message in Mixer or Translator

2437	   The handling of VBCM in a mixer or translator is sub-message type
2438	   dependent.

2440	4.3.4.5. Remarks

2442	   Please see section 3.5.3 for a discussion of the usage of H.271
2443	   messages and messages defined in AVPF [RFC4585] and this memo with
2444	   similar functionality.

2446	     Note: There has been some discussion whether the RTP payload type
2447	     field in this message is needed.  It will be needed if there is
2448	     potentially more than one VBCM-capable RTP payload type in the
2449	     same session, and the semantics of a given VBCM message changes
2450	     between payload types.  For example, the picture identification
2451	     mechanism in messages of H.271 type 0 is fundamentally different
2452	     between H.263 and H.264 (although both use the same syntax).
2453	     Therefore, the payload field is justified here.  There was a
2454	     further comment that for TSTR and FIR such a need does not exist,
2455	     because the semantics of TSTR and FIR are either loosely enough
2456	     defined, or generic enough, to apply to all video payloads
2457	     currently in existence/envisioned.

2459	5. Congestion Control

2461	   The correct application of the AVPF [RFC4585] timing rules prevents
2462	   the network from being flooded by feedback messages.  Hence,
2463	   assuming a correct implementation and configuration, the RTCP
2464	   channel cannot break its bit rate commitment and introduce
2465	   congestion.

2467	   The reception of some of the feedback messages modifies the
2468	   behaviour of the media senders or, more specifically, the media
2469	   encoders.  Thus, modified behaviour MUST respect the bandwidth
2470	   limits that the application of congestion control provides.  For
2471	   example, when a media sender is reacting to a FIR, the unusually
2472	   high number of packets that form the decoder refresh point have to
2473	   be paced in compliance with the congestion control algorithm, even
2474	   if the user experience suffers from a slowly transmitted decoder
2475	   refresh point.

2477	   A change of the Temporary Maximum Media Stream Bit Rate value can
2478	   only mitigate congestion, but not cause congestion as long as
2479	   congestion control is also employed.  An increase of the value by a
2480	   request REQUIRES the media sender to use congestion control when
2481	   increasing its transmission rate to that value.  A reduction of the
2482	   value results in a reduced transmission bit rate, thus reducing the
2483	   risk for congestion.

2485	6. Security Considerations

2487	   The defined messages have certain properties that have security
2488	   implications.  These must be addressed and taken into account by
2489	   users of this protocol.

2491	   The defined setup signaling mechanism is sensitive to modification
2492	   attacks that can result in session creation with sub-optimal
2493	   configuration, and, in the worst case, session rejection.  To
2494	   prevent this type of attack, authentication and integrity protection
2495	   of the setup signaling is required.

2497	   Spoofed or maliciously created feedback messages of the type defined
2498	   in this specification can have the following implications:

2500	        a. severely reduced media bit rate due to false TMMBR messages
2501	           that set the maximum to a very low value;

2503	        b. assignment of the ownership of a bounding tuple to the wrong
2504	           participant within a TMMBN message, potentially causing
2505	           unnecessary oscillation in the bounding set as the mistakenly
2506	           identified owner reports a change in its tuple and the true
2507	           owner possibly holds back on changes until a correct TMMBN
2508	           message reaches the participants;

2510	        c. sending TSTR requests that result in a video quality
2511	           different from the user's desire, rendering the session less
2512	           useful;

2514	        d. sending multiple FIR commands to reduce the frame-rate, and
2515	           make the video jerky, due to the frequent usage of decoder
2516	           refresh points.

2518	   To prevent these attacks there is a need to apply authentication and
2519	   integrity protection of the feedback messages.  This can be
2520	   accomplished against threats external to the current RTP session
2521	   using the RTP profile that combines SRTP [SRTP] and AVPF into SAVPF
2522	   [SAVPF].  In the mixer cases, separate security contexts and
2523	   filtering can be applied between the mixer and the participants,
2524	   thus protecting other users on the mixer from a misbehaving
2525	   participant.

2527	7. SDP Definitions

2529	   Section 4 of [RFC4585] defines a new SDP [RFC4566] attribute, rtcp-
2530	   fb, that may be used to negotiate the capability to handle specific
2531	   AVPF commands and indications, such as Reference Picture Selection,
2532	   Picture Loss Indication etc.  The ABNF for rtcp-fb is described in
2533	   section 4.2 of [RFC4585].  In this section we extend the rtcp-fb
2534	   attribute to include the commands and indications that are described
2535	   for codec control in the present document.  We also discuss the
2536	   Offer/Answer implications for the codec control commands and
2537	   indications.

2539	7.1. Extension of the rtcp-fb Attribute

2541	   As described in AVPF [RFC4585], the rtcp-fb attribute indicates the
2542	   capability of using RTCP feedback.  AVPF specifies that the rtcp-fb
2543	   attribute must only be used as a media level attribute and must not
2544	   be provided at session level.  All the rules described in [RFC4585]
2545	   for rtcp-fb attribute relating to payload type and to multiple rtcp-
2546	   fb attributes in a session description also apply to the new
2547	   feedback messages defined in this memo.

2549	   The ABNF [RFC4234] for rtcp-fb as defined in [RFC4585] is

2551	     "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF

2553	   where rtcp-fb-pt is the payload type and rtcp-fb-val defines the
2554	   type of the feedback message such as ack, nack, trr-int and rtcp-fb-
2555	   id.  For example, to indicate the support of feedback of picture
2556	   loss indication, the sender declares the following in SDP

2558	         v=0
2559	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2560	         s=Media with feedback
2561	         t=0 0
2562	         c=IN IP4 host.example.com
2563	         m=video 49170 RTP/AVPF 98
2564	         a=rtpmap:98 H263-1998/90000
2565	         a=rtcp-fb:98 nack pli

2567	   In this document we define a new feedback value "ccm" which
2568	   indicates the support of codec control using RTCP feedback messages.
2569	   The "ccm" feedback value SHOULD be used with parameters that
2570	   indicate the specific codec control commands supported. In this
2571	   draft we define four such parameters, namely:

2573	      o  "fir" indicates support of the Full Intra Request (FIR).
2574	      o  "tmmbr" indicates support of the Temporary Maximum Media Stream
2575	         Bit Rate Request/Notification (TMMBR/TMMBN).  It has an
2576	         optional subparameter to indicate the session maximum packet
2577	         rate (measured in packets per second) to be used.  If not
2578	         included, this defaults to infinity.
2579	      o  "tstr" indicates support of the Temporal-Spatial Trade-off
2580	         Request/Notification (TSTR/TSTN).
2581	      o  "vbcm" indicates support of H.271 video back channel messages
2582	         (VBCM).  It has zero or more subparameters identifying the
2583	         supported H.271 "payloadType" values.

2585	   In the ABNF for rtcp-fb-val defined in [RFC4585], there is a
2586	   placeholder called rtcp-fb-id to define new feedback types.  "ccm"
2587	   is defined as a new feedback type in this document and the ABNF for
2588	   the parameters for ccm are defined here (please refer to section 4.2
2589	   of [RFC4585] for complete ABNF syntax).

2591	   rtcp-fb-param = SP "app" [SP byte-string]
2592	                 / SP rtcp-fb-ccm-param
2593	                 /     ; empty

2595	   rtcp-fb-ccm-param = "ccm" SP ccm-param

2597	   ccm-param  = "fir"   ; Full Intra Request
2598	              / "tmmbr" [SP "smaxpr=" MaxPacketRateValue]
2599	                        ; Temporary max media bit rate
2600	              / "tstr"  ; Temporal Spatial Trade Off
2601	              / "vbcm" *(SP subMessageType) ; H.271 VBCM messages
2602	              / token [SP byte-string]
2603	                         ; for future commands/indications
2604	   subMessageType = 1*8DIGIT
2605	   byte-string = <as defined in section 4.2 of [RFC4585] >
2606	   MaxPacketRateValue = 1*15DIGIT
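
   As a rough illustration of the grammar above, the Python sketch
   below parses a single "a=rtcp-fb" line carrying a "ccm" parameter.
   It is not a complete SDP parser: it assumes single spaces between
   tokens and handles only the four parameters defined in this memo,
   passing anything else through unparsed.  The function name and the
   dictionary keys are hypothetical.

```python
import re

# Matches "a=rtcp-fb:<pt> ccm <param>[ <rest>]"; <pt> is digits or "*".
_CCM_LINE = re.compile(r"^a=rtcp-fb:(\*|\d+) ccm (\S+)(?: (.*))?$")


def parse_ccm_line(line: str):
    """Return a dict describing a ccm attribute, or None if not ccm."""
    match = _CCM_LINE.match(line.strip())
    if match is None:
        return None
    pt, name, rest = match.group(1), match.group(2), match.group(3) or ""
    result = {"pt": pt, "param": name}
    if name == "tmmbr":
        if rest:
            # MaxPacketRateValue = 1*15DIGIT
            smaxpr = re.fullmatch(r"smaxpr=(\d{1,15})", rest)
            if smaxpr is None:
                raise ValueError("malformed tmmbr subparameter")
            result["smaxpr"] = int(smaxpr.group(1))
    elif name == "vbcm":
        # subMessageType = 1*8DIGIT, zero or more occurrences
        subtypes = rest.split()
        if not all(re.fullmatch(r"\d{1,8}", s) for s in subtypes):
            raise ValueError("malformed vbcm subMessageType")
        result["sub_message_types"] = [int(s) for s in subtypes]
    elif rest:
        # Future commands/indications: keep the byte-string as is.
        result["extra"] = rest
    return result
```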

2608	7.2. Offer-Answer

2610	   The Offer/Answer [RFC3264] implications for codec control protocol
2611	   feedback messages are similar to those described in [RFC4585].  The
2612	   offerer MAY indicate the capability to support selected codec
2613	   commands and indications.  The answerer MUST remove all ccm
2614	   parameters corresponding to the CCM messages that it does not wish
2615	   to support in this particular media session (for example because it
2616	   does not implement the message in question, or because its
2617	   application logic suggests the support of the message adds no
2618	   value).  The answerer MUST NOT add new ccm parameters in addition to
2619	   what has been offered.  The answer is binding for the media session
2620	   and both offerer and answerer MUST NOT use any feedback messages
2621	   other than what both sides have explicitly indicated as being
2622	   supported.  In other words, only the joint subset of CCM parameters
2623	   from the offer and answer may be used.
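
   The answerer's side of this rule amounts to an intersection, which
   can be sketched in Python as follows.  The function names are
   hypothetical, and the sketch covers only the selection of
   parameters, not the generation of the SDP itself.

```python
def answer_ccm_params(offered, supported):
    """Keep the offered ccm parameters the answerer wishes to support.

    Nothing outside the offer is ever added; offer order is preserved.
    """
    return [name for name in offered if name in supported]


def answer_vbcm_subtypes(offered_types, supported_types):
    """Joint subset of H.271 "payloadType" values for "ccm vbcm"."""
    return sorted(set(offered_types) & set(supported_types))
```

   With an offer of "tstr", "fir" and "tmmbr" and an answerer that
   wishes to support only "fir" and "tstr", the first function yields
   "tstr" and "fir", in the order they were offered.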

2625	   Note that including a CCM parameter in an offer or answer indicates
2626	   that the party (offerer or answerer) is at least capable of
2627	   receiving the corresponding CCM message(s) and acting upon them.  In
2628	   cases when the reception of a negotiated CCM message mandates the
2629	   party to respond with another CCM message, it must also have that
2630	   capability.  Although it is not mandated to initiate CCM messages of
2631	   any negotiated type, it is generally expected that a party will
2632	   initiate CCM messages when appropriate.

2634	   The session maximum packet rate parameter part of the TMMBR
2635	   indication is declarative and everyone SHALL use the highest value
2636	   indicated in a response.  If the session maximum packet rate
2637	   parameter is not present in an offer it SHALL NOT be included by the
2638	   answerer.

2640	7.3. Examples

2642	   Example 1: The following SDP describes a point-to-point video call
2643	   with H.263, with the originator of the call declaring its capability
2644	   to support the FIR and TSTR/TSTN codec control messages.  The SDP is
2645	   carried in a high level signaling protocol like SIP.

2647	         v=0
2648	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2649	         s=Point-to-Point call
2650	         c=IN IP4 192.0.2.124
2651	         m=audio 49170 RTP/AVP 0
2652	         a=rtpmap:0 PCMU/8000
2653	         m=video 51372 RTP/AVPF 98
2654	         a=rtpmap:98 H263-1998/90000
2655	         a=rtcp-fb:98 ccm tstr
2656	         a=rtcp-fb:98 ccm fir

2658	   In the above example, when the sender receives a TSTR message from
2659	   the remote party it is capable of adjusting the trade off as
2660	   indicated in the RTCP TSTN feedback message.

2662	   Example 2: The following SDP describes a SIP end point joining a
2663	   video mixer that is hosting a multiparty video conferencing session.
2664	   The participant supports only the FIR (Full Intra Request) codec
2665	   control command and it declares it in its session description.

2667	         v=0
2668	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2669	         s=Multiparty Video Call
2670	         c=IN IP4 192.0.2.124
2671	         m=audio 49170 RTP/AVP 0
2672	         a=rtpmap:0 PCMU/8000
2673	         m=video 51372 RTP/AVPF 98
2674	         a=rtpmap:98 H263-1998/90000
2675	         a=rtcp-fb:98 ccm fir

2677	   When the video MCU decides to route the video of this participant it
2678	   sends an RTCP FIR feedback message.  Upon receiving this feedback
2679	   message, the end point is required to send a decoder refresh point.

2681	   Example 3: The following example describes the Offer/Answer
2682	   implications for the codec control messages.  The Offerer wishes to
2683	   support "tstr", "fir" and "tmmbr".  The offered SDP is

2685	   -------------> Offer
2686	         v=0
2687	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2688	         s=Offer/Answer
2689	         c=IN IP4 192.0.2.124
2690	         m=audio 49170 RTP/AVP 0
2691	         a=rtpmap:0 PCMU/8000
2692	         m=video 51372 RTP/AVPF 98
2693	         a=rtpmap:98 H263-1998/90000
2694	         a=rtcp-fb:98 ccm tstr
2695	         a=rtcp-fb:98 ccm fir
2696	         a=rtcp-fb:* ccm tmmbr smaxpr=120

2698	   The answerer wishes to support only the FIR and TSTR/TSTN messages
2699	   and the answerer SDP is

2701	   <---------------- Answer

2703	         v=0
2704	         o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2705	         s=Offer/Answer
2706	         c=IN IP4 192.0.2.37
2707	         m=audio 47190 RTP/AVP 0
2708	         a=rtpmap:0 PCMU/8000
2709	         m=video 53273 RTP/AVPF 98
2710	         a=rtpmap:98 H263-1998/90000
2711	         a=rtcp-fb:98 ccm tstr
2712	         a=rtcp-fb:98 ccm fir

2714	   Example 4: The following example describes the Offer/Answer
2715	   implications for H.271 Video back channel messages (VBCM).  The
2716	   Offerer wishes to support VBCM and the sub-messages of payloadType 1
2717	   (one or more pictures that are entirely or partially lost) and 2 (a
2718	   set of blocks of one picture that are entirely or partially lost).

2720	   -------------> Offer
2721	         v=0
2722	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2723	         s=Offer/Answer
2724	         c=IN IP4 192.0.2.124
2725	         m=audio 49170 RTP/AVP 0
2726	         a=rtpmap:0 PCMU/8000
2727	         m=video 51372 RTP/AVPF 98
2728	         a=rtpmap:98 H263-1998/90000
2729	         a=rtcp-fb:98 ccm vbcm 1 2

2731	   The answerer wishes to support sub-messages of type 1 only.

2733	   <---------------- Answer

2735	         v=0
2736	         o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2737	         s=Offer/Answer
2738	         c=IN IP4 192.0.2.37
2739	         m=audio 47190 RTP/AVP 0
2740	         a=rtpmap:0 PCMU/8000
2741	         m=video 53273 RTP/AVPF 98
2742	         a=rtpmap:98 H263-1998/90000
2743	         a=rtcp-fb:98 ccm vbcm 1

2745	   So, in the above example, only VBCM indications comprised of
2746	   "payloadType" 1 will be supported.

2748	8. IANA Considerations

2750	   The new value "ccm" needs to be registered with IANA in the "rtcp-
2751	   fb" Attribute Values registry located at the time of publication at:
2752	   http://www.iana.org/assignments/sdp-parameters

2754	   Value name:       ccm
2755	   Long Name:        Codec Control Commands and Indications
2756	   Reference:        RFC XXXX

2758	   A new registry "Codec Control Messages" needs to be created to hold
2759	   "ccm" parameters located at time of publication at:
2760	   http://www.iana.org/assignments/sdp-parameters

2762	   New registrations in this registry follow the "Specification
2763	   required" policy as defined by [RFC2434].  In addition, they are
2764	   required to indicate which additional RTCP feedback types, such as
2765	   "nack" or "ack", if any, they are usable with.

2767	   The initial content of the registry is the following values:

2769	   Value name:       fir
2770	   Long name:        Full Intra Request Command
2771	   Usable with:      ccm
2772	   Reference:        RFC XXXX

2774	   Value name:       tmmbr
2775	   Long name:        Temporary Maximum Media Stream Bit Rate
2776	   Usable with:      ccm
2777	   Reference:        RFC XXXX

2779	   Value name:       tstr
2780	   Long name:        Temporal-Spatial Trade-off
2781	   Usable with:      ccm
2782	   Reference:        RFC XXXX

2784	   Value name:       vbcm
2785	   Long name:        H.271 video back channel messages
2786	   Usable with:      ccm
2787	   Reference:        RFC XXXX

2789	   The following values need to be registered as FMT values in the "FMT
2790	   Values for RTPFB Payload Types" registry located at the time of
2791	   publication at: http://www.iana.org/assignments/rtp-parameters

2792	   RTPFB range
2793	   Name           Long Name                         Value  Reference
2794	   -------------- --------------------------------- -----  ---------
2795	                  Reserved                             2   [RFCxxxx]
2796	   TMMBR          Temporary Maximum Media Stream Bit   3   [RFCxxxx]
2797	                  Rate Request
2798	   TMMBN          Temporary Maximum Media Stream Bit   4   [RFCxxxx]
2799	                  Rate Notification

2801	   The following values need to be registered as FMT values in the "FMT
2802	   Values for PSFB Payload Types" registry located at the time of
2803	   publication at: http://www.iana.org/assignments/rtp-parameters

2805	   PSFB range
2806	   Name           Long Name                             Value Reference
2807	   -------------- ---------------------------------     ----- -------
2808	   FIR            Full Intra Request Command              4   [RFCxxxx]
2809	   TSTR           Temporal-Spatial Trade-off Request      5   [RFCxxxx]
2810	   TSTN           Temporal-Spatial Trade-off Notification 6   [RFCxxxx]
2811	   VBCM           Video Back Channel Message              7   [RFCxxxx]

2813	9. Contributors

2815	   Tom Taylor has made a very significant contribution to this
2816	   specification, for which the authors are very grateful, by helping
2817	   rewrite the specification.  In particular, the parts regarding the
2818	   algorithm for determining bounding sets for TMMBR have benefited.

2820	10.  Acknowledgements

2822	   The authors would like to thank Andrea Basso, Orit Levin, and
2823	   Nermeen Ismail for their work on the requirement and discussion
2824	   draft [Basso].

2826	   Drafts of this memo were reviewed and extensively commented on by
2827	   Roni Even, Colin Perkins, Randell Jesup, Keith Lantz, Harikishan
2828	   Desineni, Guido Franceschini, and others.  The authors appreciate
2829	   these reviews.

2831	   Funding for the RFC Editor function is currently provided by the
2832	   Internet Society.

2834	11.  References

2836	11.1. Normative references

2838	   [RFC4585]   Ott, J., Wenger, S., Sato, N., Burmeister, C., Rey, J.,
2839	                "Extended RTP Profile for Real-Time Transport Control
2840	                Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
2841	                July 2006
2842	   [RFC2119]   Bradner, S., "Key words for use in RFCs to Indicate
2843	                Requirement Levels", BCP 14, RFC 2119, March 1997.
2844	   [RFC3550]   Schulzrinne, H.,  Casner, S., Frederick, R., and V.
2845	                Jacobson, "RTP: A Transport Protocol for Real-Time
2846	                Applications", STD 64, RFC 3550, July 2003.
2847	   [RFC4566]   Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
2848	                Description Protocol", RFC 4566, July 2006.
2849	   [RFC3264]   Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
2850	                with Session Description Protocol (SDP)", RFC 3264, June
2851	                2002.
2852	   [RFC2434]   Narten, T. and H. Alvestrand, "Guidelines for Writing an
2853	                IANA Considerations Section in RFCs", BCP 26, RFC 2434,
2854	                October 1998.
2855	   [RFC4234]   Crocker, D. and P. Overell, "Augmented BNF for Syntax
2856	                Specifications: ABNF", RFC 4234, October 2005.

2858	11.2. Informative references

2860	   [Basso]     A. Basso, et. al., "Requirements for transport of video
2861	                control commands", draft-basso-avt-videoconreq-02.txt,
2862	                expired Internet Draft, October 2004.
2863	   [AVC]       Joint Video Team of ITU-T and ISO/IEC JTC 1, Draft ITU-T
2864	                Recommendation and Final Draft International Standard of
2865	                Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC
2866	                14496-10 AVC), Joint Video Team (JVT) of ISO/IEC MPEG
2867	                and ITU-T VCEG, JVT-G050, March 2003.
2868	   [H245]      ITU-T Rec. H.245, "Control protocol for multimedia
2869	                communication", May 2006.
2870	   [NEWPRED]   S. Fukunaga, T. Nakai, and H. Inoue, "Error Resilient
2871	                Video Coding by Dynamic Replacing of Reference
2872	                Pictures," in Proc. Globecom'96, vol. 3, pp. 1503 - 1508,
2873	                1996.
2874	   [SRTP]      Baugher, M., McGrew, D., Naslund, M., Carrara, E., and
2875	                K. Norrman, "The Secure Real-time Transport Protocol
2876	                (SRTP)", RFC 3711, March 2004.
2877	   [RFC2032]   Turletti, T. and C. Huitema, "RTP Payload Format for
2878	                H.261 Video Streams", RFC 2032, October 1996.

2880	   [SAVPF]     J. Ott, E. Carrara, "Extended Secure RTP Profile for
2881	                RTCP-based Feedback (RTP/SAVPF)," draft-ietf-avt-
2882	                profile-savpf-11.txt, February, 2007.
2883	   [RFC3525]   Groves, C., Pantaleo, M., Anderson, T., and T. Taylor,
2884	                "Gateway Control Protocol Version 1", RFC 3525, June
2885	                2003.
2886	   [RFC3448]   M. Handley, S. Floyd, J. Padhye, J. Widmer, "TCP
2887	                Friendly Rate Control (TFRC): Protocol Specification",
2888	                RFC 3448, Jan 2003
2889	   [VBCM]      ITU-T Rec. H.271, "Video Back Channel Messages", June
2890	                2006
2891	   [RFC3890]   Westerlund, M., "A Transport Independent Bandwidth
2892	                Modifier for the Session Description Protocol (SDP)",
2893	                RFC 3890, September 2004.
2894	   [RFC4340]   Kohler, E., Handley, M., and S. Floyd, "Datagram
2895	                Congestion Control Protocol (DCCP)", RFC 4340, March
2896	                2006.
2897	   [RFC3261]   Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
2898	                A., Peterson, J., Sparks, R., Handley, M., and E.
2899	                Schooler, "SIP: Session Initiation Protocol", RFC 3261,
2900	                June 2002.
2901	   [RFC2198]   Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
2902	                Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
2903	                Parisis, "RTP Payload for Redundant Audio Data", RFC
2904	                2198, September 1997.
2905	   [Topologies] M. Westerlund and S. Wenger, "RTP Topologies", draft-
2906	                ietf-avt-topologies-06, work in progress, August 2007.
2907	   [XML-MC]    O. Levin, R. Even, P. Hagendorf, "XML Schema for Media
2908	                Control," draft-levin-mmusic-xml-media-control-11, work
2909	                in progress, July 2007.

2911	12.  Authors' Addresses

2913	   Stephan Wenger
2914	   Nokia Corporation
2915	   975, Page Mill Road,
2916	   Palo Alto, CA 94304
2917	   USA

2919	   Phone: +1-650-862-7368
2920	   EMail: stewe@stewe.org

2922	   Umesh Chandra
2923	   Nokia Research Center
2924	   975, Page Mill Road,
2925	   Palo Alto, CA 94304
2926	   USA

2928	   Phone: +1-650-796-7502
2929	   EMail: Umesh.1.Chandra@nokia.com

2931	   Magnus Westerlund
2932	   Ericsson Research
2933	   Ericsson AB
2934	   SE-164 80 Stockholm, SWEDEN

2936	   Phone: +46 8 7190000
2937	   EMail: magnus.westerlund@ericsson.com

2939	   Bo Burman
2940	   Ericsson Research
2941	   Ericsson AB
2942	   SE-164 80 Stockholm, SWEDEN

2944	   Phone: +46 8 7190000
2945	   EMail: bo.burman@ericsson.com

2947	Full Copyright Statement

2949	   Copyright (C) The IETF Trust (2007).

2951	   This document is subject to the rights, licenses and restrictions
2952	   contained in BCP 78, and except as set forth therein, the authors
2953	   retain all their rights.

2955	   This document and the information contained herein are provided on an
2956	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2957	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST
2958	   AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
2959	   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
2960	   THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
2961	   IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
2962	   PURPOSE.

2964	Intellectual Property

2966	   The IETF takes no position regarding the validity or scope of any
2967	   Intellectual Property Rights or other rights that might be claimed to
2968	   pertain to the implementation or use of the technology described in
2969	   this document or the extent to which any license under such rights
2970	   might or might not be available; nor does it represent that it has
2971	   made any independent effort to identify any such rights.  Information
2972	   on the procedures with respect to rights in RFC documents can be
2973	   found in BCP 78 and BCP 79.

2975	   Copies of IPR disclosures made to the IETF Secretariat and any
2976	   assurances of licenses to be made available, or the result of an
2977	   attempt made to obtain a general license or permission for the use of
2978	   such proprietary rights by implementers or users of this
2979	   specification can be obtained from the IETF on-line IPR repository at
2980	   http://www.ietf.org/ipr.

2982	   The IETF invites any interested party to bring to its attention any
2983	   copyrights, patents or patent applications, or other proprietary
2984	   rights that may cover technology that may be required to implement
2985	   this standard.  Please address the information to the IETF at
2986	   ietf-ipr@ietf.org.

2988	Acknowledgement

2990	   Funding for the RFC Editor function is provided by the IETF
2991	   Administrative Support Activity (IASA).

2993	RFC Editor Considerations

2995	   The RFC editor is requested to replace all occurrences of XXXX with
2996	   the RFC number this document receives.