idnits 2.17.1 

draft-ietf-avt-avpf-ccm-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 19.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 2987.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 3000.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 3009.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 3015.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  -- The document has examples using IPv4 documentation addresses according
     to RFC6890, but does not use any IPv6 documentation addresses.  Maybe
     there should be IPv6 examples, too?


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  == Line 769 has weird spacing: '...sg type    mul...'

  == Line 1159 has weird spacing: '...     ab  c   s...'

  == Line 1161 has weird spacing: '...     ba   s...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 30, 2007) is 6174 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFCxxxx' is mentioned on line 2835, but not defined

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226)

  ** Obsolete normative reference: RFC 4234 (Obsoleted by RFC 5234)

  -- Obsolete informational reference (is this intentional?): RFC 2032
     (Obsoleted by RFC 4587)

  == Outdated reference: A later version (-12) exists of
     draft-ietf-avt-profile-savpf-10

  -- Obsolete informational reference (is this intentional?): RFC 3525
     (Obsoleted by RFC 5125)

  -- Obsolete informational reference (is this intentional?): RFC 3448
     (Obsoleted by RFC 5348)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-avt-topologies-04


     Summary: 4 errors (**), 0 flaws (~~), 9 warnings (==), 11 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                   Stephan Wenger
3	INTERNET-DRAFT                                           Umesh Chandra
4	Expires: October 2007                                            Nokia
5	                                                     Magnus Westerlund
6	                                                             Bo Burman
7	                                                              Ericsson
8	                                                          May 30, 2007

10	                      Codec Control Messages in the
11	              RTP Audio-Visual Profile with Feedback (AVPF)
12	                    <draft-ietf-avt-avpf-ccm-07.txt>

14	Status of this Memo

16	   By submitting this Internet-Draft, each author represents that any
17	   applicable patent or other IPR claims of which he or she is aware
18	   have been or will be disclosed, and any of which he or she becomes
19	   aware will be disclosed, in accordance with Section 6 of BCP 79.

21	   Internet-Drafts are working documents of the Internet Engineering
22	   Task Force (IETF), its areas, and its working groups.  Note that
23	   other groups may also distribute working documents as Internet-
24	   Drafts.

26	   Internet-Drafts are draft documents valid for a maximum of six
27	   months and may be updated, replaced, or obsoleted by other
28	   documents at any time.  It is inappropriate to use Internet-Drafts
29	   as reference material or to cite them other than as "work in
30	   progress."

32	   The list of current Internet-Drafts can be accessed at
33	   http://www.ietf.org/ietf/1id-abstracts.txt.

35	   The list of Internet-Draft Shadow Directories can be accessed at
36	   http://www.ietf.org/shadow.html.

38	Copyright Notice

40	   Copyright (C) The IETF Trust (2007).

42	Abstract

44	   This document specifies a few extensions to the messages defined
45	   in the Audio-Visual Profile with Feedback (AVPF).  They are
46	   helpful primarily in conversational multimedia scenarios where
47	   centralized multipoint functionalities are in use.  However, some
48	   are also usable in smaller multicast environments and point-to-
49	   point calls.  The extensions discussed are messages related to the
50	   ITU-T H.271 Video Back Channel, Full Intra Request, Temporary
51	   Maximum Media Stream Bit Rate and Temporal Spatial Trade-off.

53	TABLE OF CONTENTS

55	1. Introduction
56	2. Definitions
57	   2.1. Glossary
58	   2.2. Terminology
59	   2.3. Topologies
60	3. Motivation (Informative)
61	   3.1. Use Cases
62	   3.2. Using the Media Path
63	   3.3. Using AVPF
64	      3.3.1. Reliability
65	   3.4. Multicast
66	   3.5. Feedback Messages
67	      3.5.1. Full Intra Request Command
68	         3.5.1.1. Reliability
69	      3.5.2. Temporal Spatial Trade-off Request and Notification
70	         3.5.2.1. Point-to-Point
71	         3.5.2.2. Point-to-Multipoint Using Multicast or
72	                  Translators
73	         3.5.2.3. Point-to-Multipoint Using RTP Mixer
74	         3.5.2.4. Reliability
75	      3.5.3. H.271 Video Back Channel Message
76	         3.5.3.1. Reliability
77	      3.5.4. Temporary Maximum Media Stream Bit Rate Request and
78	             Notification
79	         3.5.4.1. Behavior for media receivers using TMMBR
80	         3.5.4.2. Algorithm for establishing current limitations
81	         3.5.4.3. Use of TMMBR in a Mixer Based Multipoint
82	                  Operation
83	         3.5.4.4. Use of TMMBR in Point-to-Multipoint Using
84	                  Multicast or Translators
85	         3.5.4.5. Use of TMMBR in Point-to-point operation
86	         3.5.4.6. Reliability
87	4. RTCP Receiver Report Extensions
88	   4.1. Design Principles of the Extension Mechanism
89	   4.2. Transport Layer Feedback Messages
90	      4.2.1. Temporary Maximum Media Stream Bit Rate Request
91	             (TMMBR)
92	         4.2.1.1. Message Format
93	         4.2.1.2. Semantics
94	         4.2.1.3. Timing Rules
95	         4.2.1.4. Handling in Translator and Mixers
96	      4.2.2. Temporary Maximum Media Stream Bit Rate Notification
97	             (TMMBN)
98	         4.2.2.1. Message Format
99	         4.2.2.2. Semantics
100	         4.2.2.3. Timing Rules
101	         4.2.2.4. Handling by Translators and Mixers
102	   4.3. Payload Specific Feedback Messages
103	      4.3.1. Full Intra Request (FIR)
104	         4.3.1.1. Message Format
105	         4.3.1.2. Semantics
106	         4.3.1.3. Timing Rules
107	         4.3.1.4. Handling of FIR Message in Mixer and Translators
108	         4.3.1.5. Remarks
109	      4.3.2. Temporal-Spatial Trade-off Request (TSTR)
110	         4.3.2.1. Message Format
111	         4.3.2.2. Semantics
112	         4.3.2.3. Timing Rules
113	         4.3.2.4. Handling of message in Mixers and Translators
114	         4.3.2.5. Remarks
115	      4.3.3. Temporal-Spatial Trade-off Notification (TSTN)
116	         4.3.3.1. Message Format
117	         4.3.3.2. Semantics
118	         4.3.3.3. Timing Rules
119	         4.3.3.4. Handling of TSTN in Mixer and Translators
120	         4.3.3.5. Remarks
121	      4.3.4. H.271 Video Back Channel Message (VBCM)
122	         4.3.4.1. Message Format
123	         4.3.4.2. Semantics
124	         4.3.4.3. Timing Rules
125	         4.3.4.4. Handling of message in Mixer or Translator
126	         4.3.4.5. Remarks
127	5. Congestion Control
128	6. Security Considerations
129	7. SDP Definitions
130	   7.1. Extension of the rtcp-fb Attribute
131	   7.2. Offer-Answer
132	   7.3. Examples
133	8. IANA Considerations
134	9. Acknowledgements
135	10. References
136	   10.1. Normative references
137	   10.2. Informative references
138	11. Authors' Addresses
139	1. Introduction

141	   When the Audio-Visual Profile with Feedback (AVPF) [RFC4585] was
142	   developed, the main emphasis lay in the efficient support of
143	   point-to-point and small multipoint scenarios without centralized
144	   multipoint control.  However, in practice, many small multipoint
145	   conferences operate utilizing devices known as Multipoint Control
146	   Units (MCUs).  Long-standing experience of the conversational
147	   video conferencing industry suggests that there is a need for a
148	   few additional feedback messages, to support centralized
149	   multipoint conferencing efficiently.  Some of the messages have
150	   applications beyond centralized multipoint, and this is indicated
151	   in the description of the message.  This is especially true for
152	   the message intended to carry ITU-T Rec. H.271 [H.271] bit strings
153	   for Video Back Channel messages.

155	   In Real-time Transport Protocol (RTP) [RFC3550] terminology, MCUs
156	   comprise mixers and translators.  Most MCUs also include signaling
157	   support.  During the development of this memo, it was noticed that
158	   there is considerable confusion in the community related to the
159	   use of terms such as mixer, translator, and MCU.  In response to
160	   these concerns, a number of topologies have been identified that
161	   are of practical relevance to the industry, but are not documented
162	   in sufficient detail in [RFC3550].  These topologies are
163	   documented in [Topologies], and understanding this memo requires
164	   previous or parallel study of [Topologies].

166	   Some of the messages defined here are forward only, in that they
167	   do not require an explicit notification to the message emitter
168	   that they have been received and/or indicating the message
169	   receiver's actions.  Other messages require a response, leading to
170	   a two-way communication model that one could view as useful for
171	   control purposes.  However, it is not the intention of this memo
172	   to open up RTP Control Protocol (RTCP) to a generalized control
173	   protocol.  All mentioned messages have relatively strict real-time
174	   constraints, in the sense that their value diminishes with
175	   increased delay.  This makes the use of more traditional control
176	   protocol means, such as Session Initiation Protocol (SIP) re-
177	   INVITEs [RFC3261], undesirable when used for the same purpose.
178	   Furthermore, all messages are of a very simple format that can be
179	   easily processed by an RTP/RTCP sender/receiver.  Finally, and
180	   most importantly, all messages relate only to the RTP stream with
181	   which they are associated, and not to any other property of a
182	   communication system.  In particular, none of them relate to the
183	   properties of the access links traversed by the session.

185	2. Definitions

187	2.1. Glossary

189	   AIMD   - Additive Increase Multiplicative Decrease
190	   AVPF   - The extended RTP profile for RTCP-based feedback
191	   FEC    - Forward Error Correction
192	   FCI    - Feedback Control Information [RFC4585]
193	   FIR    - Full Intra Request
194	   MCU    - Multipoint Control Unit
195	   MPEG   - Moving Picture Experts Group
196	   TMMBN  - Temporary Maximum Media Stream Bit Rate Notification
197	   TMMBR  - Temporary Maximum Media Stream Bit Rate Request
198	   PLI    - Picture Loss Indication
199	   PR     - Packet rate
200	   QP     - Quantizer Parameter
201	   RTT    - Round trip time
202	   SSRC   - Synchronization Source
203	   TSTN   - Temporal Spatial Trade-off Notification
204	   TSTR   - Temporal Spatial Trade-off Request
205	   VBCM   - Video Back Channel Message indication.

207	2.2. Terminology

209	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
210	   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and
211	   "OPTIONAL" in this document are to be interpreted as described in
212	   RFC 2119 [RFC2119].

214	      Message:
215	          An RTCP feedback message [RFC4585] defined by this
216	          specification, of one of the following types:

218	          Request:
219	              Message that requires acknowledgement

221	          Command:
222	              Message that forces the receiver to an action

224	          Indication:
225	              Message that reports a situation

227	          Notification:
228	              Message that provides a notification that an event has
229	              occurred. Notifications are commonly generated in
230	              response to a Request.

232	          Note that, with the exception of "Notification", this
233	          terminology is in alignment with ITU-T Rec. H.245 [H245].
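
The four message types above can be modeled as a small lookup table. The following Python sketch is illustrative only: the names `MessageKind`, `MESSAGE_KINDS`, and `expects_response` are hypothetical, and the classification of each message is inferred from the message names used in this memo rather than taken from normative text.

```python
from enum import Enum


class MessageKind(Enum):
    """The four message types named in the terminology (descriptions paraphrased)."""
    REQUEST = "requires acknowledgement"
    COMMAND = "forces the receiver to an action"
    INDICATION = "reports a situation"
    NOTIFICATION = "reports that an event has occurred"


# Assumption of this sketch: each message is classified by its name
# (e.g. "Full Intra Request Command", "... Request", "... Notification").
MESSAGE_KINDS = {
    "FIR": MessageKind.COMMAND,
    "TSTR": MessageKind.REQUEST,
    "TSTN": MessageKind.NOTIFICATION,
    "TMMBR": MessageKind.REQUEST,
    "TMMBN": MessageKind.NOTIFICATION,
    "VBCM": MessageKind.INDICATION,
}


def expects_response(name: str) -> bool:
    """True if the named message is a Request, i.e. it needs acknowledgement."""
    return MESSAGE_KINDS[name] is MessageKind.REQUEST
```

Under these assumptions, `expects_response("TMMBR")` is true, while `expects_response("FIR")` is false, because a Command is forward only.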

235	     Decoder Refresh Point:
236	          A bit string, packetized in one or more RTP packets, which
237	          completely resets the decoder to a known state.

239	          Examples for "hard" decoder refresh points are Intra
240	          pictures in H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 part
241	          2, and Instantaneous Decoder Refresh (IDR) pictures in
242	          H.264.  "Gradual" decoder refresh points may also be used;
243	          see for example [AVC].  While both "hard" and "gradual"
244	          decoder refresh points are acceptable in the scope of this
245	          specification, in most cases the user experience will
246	          benefit from using a "hard" decoder refresh point.

248	          A decoder refresh point also contains all header
249	          information above the picture layer (or equivalent,
250	          depending on the video compression standard) that is
251	          conveyed in-band.  In H.264, for example, a decoder refresh
252	          point contains parameter set Network Adaptation Layer (NAL)
253	          units that generate parameter sets necessary for the
254	          decoding of the following slice/data partition NAL units
255	          (and that are not conveyed out of band).

257	   Decoding:
258	          The operation of reconstructing the media stream.

260	   Rendering:
261	          The operation of presenting (parts of) the reconstructed
262	          media stream to the user.

264	   Stream thinning:
265	          The operation of removing some of the packets from a media
266	          stream.  Stream thinning, preferably, is media-aware,
267	          implying that media packets are removed in the order of
268	          increasing relevance to the reproductive quality.  However,
269	          even when employing media-aware stream thinning, most media
270	          streams quickly lose quality when subject to increasing
271	          levels of thinning.  Media-unaware stream thinning leads to
272	          even worse quality degradation.  In contrast to
273	          transcoding, stream thinning is typically seen as a
274	          computationally lightweight operation.

276	   Media:
277	          Often used (sometimes in conjunction with terms like bit
278	          rate, stream, sender ...) to identify the content of the
279	          forward RTP packet stream (carrying the codec data), to
280	          which the codec control message applies.

282	   Media Stream:
283	          The stream of RTP packets labeled with a single
284	          Synchronization Source (SSRC) carrying the media (and also
285	          in some cases repair information such as retransmission or
286	          Forward Error Correction (FEC) information).

288	   Total media bit rate:
289	          The total bits per second transferred in a media stream,
290	          measured at an observer-selected protocol layer and
291	          averaged over a reasonable timescale, the length of which
292	          depends on the application.  In general, a media sender and
293	          a media receiver will observe different total media bit
294	          rates for the same stream, first because they may have
295	          selected different reference protocol layers, and second,
296	          because of changes in per-packet overhead along the
297	          transmission path.  The goal with bit rate averaging is to
298	          be able to ignore any burstiness on very short timescales,
299	          below for example 100 ms, introduced by scheduling or link
300	          layer packetization effects.

302	   Maximum total media bit rate:
303	          The upper limit on total media bit rate for a given media
304	          stream at a particular receiver and for its selected
305	          protocol layer. Note that this value cannot be measured on
306	          the received media stream, instead it needs to be
307	          calculated or determined through other means, such as QoS
308	          negotiations or local resource limitations. Also note that
309	          this value is an average (on a timescale that is reasonable
310	          for the application) and that it may be different from the
311	          instantaneous bit-rate seen by packets in the media stream.

313	   Overhead:
314	          All protocol header information required to convey a packet
315	          with media data from sender to receiver, from the
316	          application layer down to a pre-defined protocol level (for
317	          example down to, and including, the IP header).  Overhead
318	          may include, for example, IP, UDP, and RTP headers, any
319	          layer 2 headers, any Contributing Sources (CSRCs), RTP-
320	          Padding, and RTP header extensions.  Overhead excludes any
321	          RTP payload headers and the payload itself.

323	   Net media bit rate:
324	          The bit rate carried by a media stream, net of overhead.
325	          That is, the bits per second accounted for by encoded
326	          media, any applicable payload headers, and any directly
327	          associated meta payload information placed in the RTP
328	          packet.  A typical example of the latter is redundancy data
329	          provided by the use of RFC 2198 [RFC2198].  Note that,
330	          unlike the total media bit rate, the net media bit rate
331	          will have the same value at the media sender and at the
332	          media receiver unless any mixing or translating of the
333	          media has occurred.

335	          For a given observer, the total media bit rate for a media
336	          stream is equal to the sum of the net media bit rate and
337	          the per-packet overhead as defined above multiplied by the
338	          packet rate.
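
The relation between total and net media bit rate can be made concrete with a small worked example. The Python sketch below is not part of the specification. The function name and the numeric values are hypothetical, and overhead is assumed to be given in bytes per packet.

```python
def total_media_bit_rate(net_bit_rate_bps: float,
                         overhead_bytes_per_packet: float,
                         packet_rate_pps: float) -> float:
    """Total media bit rate (bits/s) as seen by one observer.

    total = net media bit rate + per-packet overhead * packet rate
    """
    overhead_bps = overhead_bytes_per_packet * 8 * packet_rate_pps
    return net_bit_rate_bps + overhead_bps


# Hypothetical stream: 64 kbit/s of net media at 50 packets per second.
# Observer A counts IPv4 + UDP + RTP headers: 20 + 8 + 12 = 40 bytes.
observer_a = total_media_bit_rate(64_000, 40, 50)
# Observer B sees the same stream over IPv6: 40 + 8 + 12 = 60 bytes.
observer_b = total_media_bit_rate(64_000, 60, 50)
```

With these numbers, the two observers see 80 kbit/s and 88 kbit/s respectively for the same net media bit rate, which is why a sender and a receiver generally observe different total media bit rates.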

340	   Feasible region:
341	          The set of all combinations of packet rate and net media
342	          bit rate that do not exceed the restrictions in maximum
343	          media bit rate placed on a given media sender by the
344	          Temporary Maximum Media Stream Bit-rate Request (TMMBR)
345	          messages it has received.  The feasible region will change
346	          as new TMMBR messages are received.

348	   Bounding set:
349	          The set of TMMBR tuples, selected from all those received
350	          at a given media sender, that define the feasible region
351	          for that media sender.  The media sender uses an algorithm
352	          such as that in section 3.5.4.2 to determine or
353	          iteratively approximate the current bounding set, and
354	          reports that set back to the media receivers in a
355	          Temporary Maximum Media Stream Bit-rate Notification
356	          (TMMBN) message.
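
A membership test for the feasible region can be sketched as follows. This is a minimal illustration, not the algorithm of section 3.5.4.2. It assumes that each TMMBR tuple carries a maximum total media bit rate and a per-packet overhead in bytes, and the names `TmmbrTuple` and `is_feasible` are hypothetical.

```python
from typing import Iterable, NamedTuple


class TmmbrTuple(NamedTuple):
    # Assumed contents of one received TMMBR restriction.
    max_total_bit_rate_bps: float
    overhead_bytes_per_packet: float


def is_feasible(net_bit_rate_bps: float,
                packet_rate_pps: float,
                limits: Iterable[TmmbrTuple]) -> bool:
    """True if (packet rate, net bit rate) satisfies every restriction.

    For each tuple, the total bit rate implied by its overhead value
    must not exceed that tuple's maximum total media bit rate.
    """
    return all(
        net_bit_rate_bps + t.overhead_bytes_per_packet * 8 * packet_rate_pps
        <= t.max_total_bit_rate_bps
        for t in limits
    )


# Two hypothetical restrictions received by a media sender.
limits = [TmmbrTuple(100_000, 40), TmmbrTuple(110_000, 60)]
```

In this sketch, a sender with no restrictions has an unbounded feasible region, and each additional tuple can only shrink it. The bounding set is the subset of tuples that actually shapes the region.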

358	2.3. Topologies

360	   Please refer to [Topologies] for an in-depth discussion.  The
361	   topologies referred to throughout this memo are labeled
362	   (consistently with [Topologies]) as follows:

364	   Topo-Point-to-Point . . . . point-to-point communication
365	   Topo-Multicast  . . . . . . multicast communication as in RFC 3550
366	   Topo-Translator . . . . . . translator based as in RFC 3550
367	   Topo-Mixer  . . . . . . . . mixer based as in RFC 3550
368	   Topo-Video-switch-MCU . . . video switching MCU
369	   Topo-RTCP-terminating-MCU . mixer but terminating RTCP

371	3. Motivation

373	   This section discusses the motivation and usage of the different
374	   video and media control messages.  The video control messages have
375	   been under discussion for a long time, and a requirement draft was
376	   drawn up [Basso].  This draft has expired; however, we quote
377	   relevant sections of it to provide motivation and requirements.

379	3.1. Use Cases

382	   There are a number of possible usages for the proposed feedback
383	   messages.  Let us begin by looking through the use cases Basso et
384	   al. [Basso] proposed.  Some of the use cases have been
385	   reformulated and comments have been added.

387	   1. An RTP video mixer composes multiple encoded video sources into
388	      a single encoded video stream.  Each time a video source is
389	      added, the RTP mixer needs to request a decoder refresh point
390	      from the video source, so as to start an uncorrupted prediction
391	      chain on the spatial area of the mixed picture occupied by the
392	      data from the new video source.

394	   2. An RTP video mixer receives multiple encoded RTP video streams
395	      from conference participants, and dynamically selects one of
396	      the streams to be included in its output RTP stream.  At the
397	      time of a bit stream change (determined through means such as
398	      voice activation or the user interface), the mixer requests a
399	      decoder refresh point from the remote source, in order to avoid
400	      using unrelated content as reference data for inter picture
401	      prediction.  After requesting the decoder refresh point, the
402	      video mixer stops the delivery of the current RTP stream and
403	      monitors the RTP stream from the new source until it detects
404	      data belonging to the decoder refresh point.  At that time, the
405	      RTP mixer starts forwarding the newly selected stream to the
406	      receiver(s).

408	   3. An application needs to signal to the remote encoder that the
409	      desired trade-off between temporal and spatial resolution has
410	      changed.  For example, one user may prefer a higher frame rate
411	      and a lower spatial quality, and another user may prefer the
412	      opposite.  This choice is also highly content dependent.  Many
413	      current video conferencing systems offer in the user interface
414	      a mechanism to make this selection, usually in the form of a
415	      slider.  The mechanism is helpful in point-to-point,
416	      centralized multipoint and non-centralized multipoint uses.

418	   4. Use case 4 of the Basso draft applies only to Picture Loss
419	      Indication (PLI) as defined in AVPF [RFC4585] and is not
420	      reproduced here.

422	   5. Use case 5 of the Basso draft relates to a mechanism known as
423	      "freeze picture request".  Sending freeze picture requests
424	      over a non-reliable forward RTCP channel has been identified as
425	      problematic.  Therefore, no freeze picture request has been
426	      included in this memo, and the use case discussion is not
427	      reproduced here.

429	   6. A video mixer dynamically selects one of the received video
430	      streams to be sent out to participants and tries to provide the
431	      highest bit rate possible to all participants, while minimizing
432	      stream trans-rating.  One way of achieving this is to set up
433	      sessions with endpoints using the maximum bit rate accepted by
434	      each endpoint, and accepted by the call admission method used
435	      by the mixer.  By means of commands that reduce the maximum
436	      media stream bit rate below what has been negotiated during
437	      session set up, the mixer can reduce the maximum bit rate sent
438	      by endpoints to the lowest of all the accepted bit rates.  As
439	      the lowest accepted bit rate changes due to endpoints joining
440	      and leaving or due to network congestion, the mixer can adjust
441	      the limits at which endpoints can send their streams to match
442	      the new value.  The mixer then requests a new maximum bit rate,
443	      which is equal to or less than the maximum bit rate negotiated
444	      at session setup for a specific media stream, and the remote
445	      endpoint can respond with the actual bit rate that it can
446	      support.
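
The stream-switching behavior of use case 2 can be sketched as a small state machine. The Python code below is illustrative only. The class and field names are hypothetical, the actual request message is replaced by a list append, and the mixer is assumed to be able to tell whether a packet belongs to a decoder refresh point.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    ssrc: int
    is_refresh_point: bool  # assumption: detectable by the mixer


class SwitchingMixer:
    def __init__(self, current_ssrc: int) -> None:
        self.forwarding: Optional[int] = current_ssrc  # stream sent onward
        self.pending: Optional[int] = None             # awaiting refresh point
        self.refresh_requests: List[int] = []          # stand-in for requests sent

    def select(self, new_ssrc: int) -> None:
        """Switch sources: request a refresh point and stop forwarding."""
        self.refresh_requests.append(new_ssrc)
        self.forwarding = None
        self.pending = new_ssrc

    def on_packet(self, pkt: Packet) -> bool:
        """Return True if the packet is forwarded to the receiver(s)."""
        if pkt.ssrc == self.pending and pkt.is_refresh_point:
            # Refresh point detected: start forwarding the new stream.
            self.forwarding = self.pending
            self.pending = None
        return pkt.ssrc == self.forwarding
```

After `select()` is called, nothing is forwarded until the first refresh-point packet from the new source arrives. This keeps unrelated content out of the receivers' inter picture prediction chain.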

448	   The picture drawn up by Basso et al. covers most applications we
449	   foresee.  However, we would like to extend the list with two
450	   additional use cases:

452	   7. Currently deployed congestion control algorithms (AIMD and TFRC
453	      [RFC3448]) probe for additional available capacity as long as
454	      there is something to send.  With congestion control algorithms
455	      using packet loss as the indication for congestion, this
456	      probing does generally result in reduced media quality (often
457	      to a point where the distortion is large enough to make the
458	      media unusable), due to packet loss and increased delay.

460	      In a number of deployment scenarios, especially cellular ones,
461	      the bottleneck link is often the last hop link.  That cellular
462	      link also commonly has some type of QoS negotiation enabling
463	      the cellular device to learn the maximal bit rate available
464	      over this last hop.  A media receiver behind this link can, in
465	      most (if not all) cases, calculate at least an upper bound for
466	      the bit rate available for each media stream it presently
467	      receives.  How this is done is an implementation detail and not
468	      discussed herein.  Indicating the maximum available bit rate to
469	      the transmitting party for the various media streams can be
470	      beneficial to prevent that party from probing for bandwidth for
471	      this stream in excess of a known hard limit.  For cellular or
472	      other mobile devices, the known available bit rate for each
473	      stream (deduced from the link bit rate) can change quickly, due
474	      to handover to another transmission technology, QoS
475	      renegotiation due to congestion, etc.  To enable minimal
476	      disruption of service, quick convergence is necessary, and
477	      therefore media path signaling is desirable.

479	   8. The use of reference picture selection (RPS) as an error
480	      resilience tool was introduced in 1997 as NEWPRED
481	      [NEWPRED], and is now widely deployed.  When RPS is in use,
482	      simplistically put, the receiver can send a feedback message to
483	      the sender, indicating a reference picture that should be used
484	      for future prediction.  ([NEWPRED] mentions other forms of
485	      feedback as well.)  AVPF contains a mechanism for conveying
486	      such a message, but does not specify the codec to which the
487	      message applies or the syntax to which it should conform.
488	      Recently, the ITU-T finalized Rec. H.271, which (among other
489	      message types) also includes a feedback message.  It is
490	      expected that this feedback message will fairly quickly enjoy
491	      wide support.  Therefore, a mechanism to convey feedback
492	      messages according to H.271 appears to be desirable.

494	3.2. Using the Media Path

496	   There are multiple reasons why we use the media path for the codec
497	   control messages.

499	   First, systems employing MCUs often separate the control and media
500	   processing parts.  As these messages are intended for or generated
501	   by the media part rather than the signaling part of the MCU,
502	   having them on the media path avoids transmission across
503	   interfaces and unnecessary control traffic between signaling and
504	   processing.  If the MCU is physically decomposed, the use of the
505	   media path avoids the need for media control protocol extensions
506	   (e.g. in MEGACO [RFC3525]).

508	   Secondly, the signaling path quite commonly contains several
509	   signaling entities, e.g. SIP proxies and application servers.

511	   Avoiding going through signaling entities avoids delay for several
512	   reasons.  Proxies have less stringent delay requirements than
513	   media processing and due to their complex and more generic nature
514	   may result in significant processing delay.  The topological
515	   locations of the signaling entities are also commonly not
516	   optimized for minimal delay, but rather towards other
517	   architectural goals.  Thus the signaling path can be significantly
518	   longer in both the geographical and the delay sense.

520	3.3. Using AVPF

522	   The AVPF feedback message framework [RFC4585] provides the
523	   appropriate framework to implement the new messages.  AVPF
524	   implements rules controlling the timing of feedback messages to
525	   avoid congestion through network flooding by RTCP traffic.  We re-
526	   use these rules by referencing AVPF.

528	   The signaling setup for AVPF allows each individual type of
529	   function to be configured or negotiated on an RTP session basis.

531	3.3.1. Reliability

533	   The use of RTCP messages implies that each message transfer is
534	   unreliable, unless the lower layer transport provides reliability.
535	   The different messages proposed in this specification have
536	   different requirements in terms of reliability.  However, in all
537	   cases, the reaction to an (occasional) loss of a feedback message
538	   is specified.

540	3.4. Multicast

542	   The codec control messages might be used with multicast.  The RTCP
543	   timing rules specified in [RFC3550] and [RFC4585] ensure that the
544	   messages do not cause overload of the RTCP connection.  The use of
545	   multicast may result in the reception of messages with
546	   inconsistent semantics.   The reaction to inconsistencies depends
547	   on the message type, and is discussed for each message type
548	   separately.

550	3.5. Feedback Messages

552	   This section describes the semantics of the different feedback
553	   messages and how they apply to the different use cases.

555	3.5.1. Full Intra Request Command

557	   A Full Intra Request (FIR) Command, when received by the
558	   designated media sender, requires that the media sender sends a
559	   Decoder Refresh Point (see 2.2) at the earliest opportunity.  The
560	   evaluation of such opportunity includes the current encoder coding
561	   strategy and the current available network resources.

563	   FIR is also known as an "instantaneous decoder refresh request" or
564	   "video fast update request".

566	   Using a decoder refresh point implies refraining from using any
567	   picture sent prior to that point as a reference for the encoding
568	   process of any subsequent picture sent in the stream.  For
569	   predictive media types that are not video, the analogue applies.
570	   For example, if in MPEG-4 systems scene updates are used, the
571	   decoder refresh point consists of the full representation of the
572	   scene and is not delta-coded relative to previous updates.

574	   Decoder refresh points, especially Intra or IDR pictures, are in
575	   general several times larger in size than predicted pictures.
576	   Thus, in scenarios in which the available bit rate is small, the
577	   use of a decoder refresh point implies a delay that is
578	   significantly longer than the typical picture duration.

580	   Usage in multicast is possible; however aggregation of the
581	   commands is recommended.  A media sender that receives a request
582	   shortly (within 2 times the longest Round Trip Time (RTT) known,
583	   plus any AVPF-induced RTCP packet sending delays, if those are
584	   known) after sending a decoder refresh point should await a
585	   second request message to ensure that the media receiver has not
586	   been served by the previously delivered decoder refresh point.
587	   The reason for the specified delay is to avoid sending unnecessary
588	   decoder refresh points.  A session participant may have sent its
589	   own request while another participant's request was in-flight to
590	   them.  Suppressing those requests that may have been sent without
591	   knowledge about the other request avoids this issue.

593	   Using the FIR command to recover from errors is explicitly
594	   disallowed, and instead the PLI message defined in AVPF [RFC4585]
595	   should be used.  The PLI message reports lost pictures and has
596	   been included in AVPF for precisely that purpose.

598	   Full Intra Request is applicable in use-cases 1 and 2.

600	3.5.1.1. Reliability
601	   The FIR message results in the delivery of a decoder refresh
602	   point, unless the message is lost.  Decoder refresh points are
603	   easily identifiable from the bit stream.  Therefore, there is no
604	   need for protocol-level notification, and a simple command
605	   repetition mechanism is sufficient for ensuring the level of
606	   reliability required.  However, the potential use of repetition
607	   does require a mechanism to prevent the recipient from responding
608	   to messages already received and responded to.

610	   To ensure the best possible reliability, a sender of FIR may
611	   repeat the FIR request until the desired content has been
612	   received.  The repetition interval is determined by the RTCP
613	   timing rules applicable to the session.  Upon reception of a
614	   complete decoder refresh point or the detection of an attempt to
615	   send a decoder refresh point (which got damaged due to a packet
616	   loss), the repetition of the FIR must stop.  If another FIR is
617	   necessary, the request sequence number must be increased.  A FIR
618	   sender shall not have more than one FIR request (different request
619	   sequence number) outstanding at any time per media sender in the
620	   session.

622	   The receiver of FIR (i.e. the media sender) behaves in
623	   complementary fashion to ensure delivery of a decoder refresh
624	   point.  If it receives repetitions of the FIR more than 2*RTT
625	   after it has sent a decoder refresh point, it shall send a new
626	   decoder refresh point.  Two round trip times allow time for the
627	   decoder refresh point to arrive back to the requestor and for the
628	   end of repetitions of FIR to reach and be detected by the media
629	   sender.
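
   The media sender behavior described above can be illustrated with
   a minimal, non-normative sketch.  It assumes a single requester,
   a known RTT estimate, and that a decoder refresh point is sent at
   the moment the function returns True.  The class and method names
   are hypothetical and are not defined by this memo.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FirResponder:
    """Media-sender side of FIR handling for one requester (sketch)."""
    rtt: float                             # RTT estimate in seconds (assumed known)
    last_seq: Optional[int] = None         # last FIR sequence number acted upon
    last_refresh: Optional[float] = None   # time the last refresh point was sent

    def on_fir(self, seq: int, now: float) -> bool:
        """Return True if a decoder refresh point should be sent now."""
        if seq != self.last_seq:
            # A different sequence number marks a new request.
            send = True
        else:
            # A repetition is answered again only if it arrives more
            # than 2*RTT after the refresh point that answered it.
            send = now - self.last_refresh > 2 * self.rtt
        if send:
            self.last_seq = seq
            self.last_refresh = now
        return send


responder = FirResponder(rtt=0.1)
first = responder.on_fir(seq=1, now=0.0)        # new request
early_repeat = responder.on_fir(seq=1, now=0.15)  # within 2*RTT
late_repeat = responder.on_fir(seq=1, now=0.25)   # beyond 2*RTT
```

   A repetition that arrives within 2*RTT is treated as already
   answered, which is one way to avoid responding twice to the same
   request.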

631	   An RTP mixer that receives a FIR from a media receiver is
632	   responsible for ensuring that a decoder refresh point is delivered to
633	   the requesting receiver.  It may be necessary for the mixer to
634	   generate FIR commands.  From a reliability perspective, the two
635	   legs (FIR-requesting endpoint to mixer, and mixer to decoder
636	   refresh point generating endpoint) are handled independently from
637	   each other.

639	3.5.2. Temporal Spatial Trade-off Request and Notification

641	   The Temporal Spatial Trade-off Request (TSTR) instructs the video
642	   encoder to change its trade-off between temporal and spatial
643	   resolution.  Index values from 0 to 31 indicate monotonically a
644	   desire for higher frame rate.  That is, a requester asking for an
645	   index of 0 prefers a high quality and is willing to accept a low
646	   frame rate, whereas a requester asking for 31 wishes for a high frame
647	   rate, potentially at the cost of low spatial quality.

649	   In general the encoder reaction time may be significantly longer
650	   than the typical picture duration.  See use case 3 for an example.
651	   The encoder decides whether and to what extent the request results
652	   in a change of the trade-off.  It returns a Temporal Spatial
653	   Trade-Off Notification (TSTN) message to indicate the trade-off
654	   that it will use henceforth.

656	   TSTR and TSTN have been introduced primarily because it is
657	   believed that control protocol mechanisms, e.g. a SIP re-INVITE,
658	   are too heavyweight and too slow to allow for a reasonable user
659	   experience.  Consider, for example, a user interface where the
660	   remote user selects the temporal/spatial trade-off with a slider
661	   (as is common in state-of-the-art video conferencing systems).
662	   Immediate feedback to any slider movement is required for a
663	   reasonable user experience.  A SIP re-INVITE [RFC3261] would
664	   require at least two round-trips more (compared to the TSTR/TSTN
665	   mechanism) and may involve proxies and other complex mechanisms.
666	   Even in a well-designed system, it could take a second or so until
667	   the new trade-off is finally selected.
668	   Furthermore the use of RTCP solves the multicast use case very
669	   efficiently.

671	   The use of TSTR and TSTN in multipoint scenarios is a non-trivial
672	   subject, and can be achieved in many implementation-specific ways.
673	   Problems stem from the fact that TSTRs will typically arrive
674	   unsynchronized, and may request different trade-off values for the
675	   same stream and/or endpoint encoder.  This memo does not specify a
676	   translator, mixer or endpoint's reaction to the reception of a
677	   suggested trade-off as conveyed in the TSTR.  We only require the
678	   receiver of a TSTR message to reply to it by sending a TSTN,
679	   carrying the new trade-off chosen by its own criteria (which may
680	   or may not be based on the trade-off conveyed by the TSTR).  In
681	   other words, the trade-off sent in TSTR is a non-binding
682	   recommendation, nothing more.

684	   Four TSTR/TSTN scenarios need to be distinguished, based on the
685	   topologies described in [Topologies].  The scenarios are described
686	   in the following sub-clauses.

688	3.5.2.1. Point-to-Point

690	   In this most trivial case (Topo-Point-to-Point), the media sender
691	   typically adjusts its temporal/spatial trade-off based on the
692	   requested value in TSTR, subject to its own capabilities.  The
693	   TSTN message conveys back the new trade-off value (which may be
694	   identical to the old one if, for example, the sender is not
695	   capable of adjusting its trade-off).

697	3.5.2.2. Point-to-Multipoint Using Multicast or Translators

699	   RTCP Multicast is used either with media multicast according to
700	   Topo-Multicast, or following RFC 3550's translator model according
701	   to Topo-Translator.  In these cases, unsynchronized TSTR messages
702	   from different receivers may be received, possibly with different
703	   requested trade-offs (because of different user preferences).
704	   This memo does not specify how the media sender tunes its trade-
705	   off.  Possible strategies include selecting the mean or median of
706	   all trade-off requests received, giving priority to certain
707	   participants, or continuing to use the previously selected trade-
708	   off (e.g. when the sender is not capable of adjusting it).  Again,
709	   all TSTR messages need to be acknowledged by TSTN, and the value
710	   conveyed back has to reflect the decision made.
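
   As a non-normative illustration of one of the strategies named
   above, the following sketch selects the median of the requested
   index values.  This memo does not mandate any strategy; the
   function name and the choice of the lower median (to keep the
   result an integer index) are assumptions made for the example.

```python
from statistics import median_low


def choose_tradeoff(requested, current):
    """Pick one trade-off index (0..31) from unsynchronized TSTR values.

    `requested` holds the index values received in TSTR messages;
    `current` is the trade-off in use, kept if nothing usable arrived.
    """
    valid = [v for v in requested if 0 <= v <= 31]
    if not valid:
        return current
    # median_low returns an element of the list, so the result is
    # always a valid integer index.
    return median_low(valid)


# The value returned here is what each TSTN would convey back.
chosen = choose_tradeoff([0, 31, 10], current=5)
```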

712	3.5.2.3. Point-to-Multipoint Using RTP Mixer

714	   In this scenario (Topo-Mixer) the RTP mixer receives all TSTR
715	   messages, and has the opportunity to act on them based on its own
716	   criteria.  In most cases, the mixer should form a "consensus" of
717	   potentially conflicting TSTR messages arriving from different
718	   participants, and initiate its own TSTR message(s) to the media
719	   sender(s).  As in the previous scenario, the strategy for forming
720	   this "consensus" is up to the implementation, and can, for
721	   example, encompass averaging the participants' request values,
722	   giving priority to certain participants, or using session default
723	   values.

725	   Even if a mixer or translator performs transcoding, it is very
726	   difficult to deliver media with the requested trade-off, unless
727	   the content the mixer or translator receives is already close to
728	   that trade-off.  Thus if the mixer changes its trade-off, it needs
729	   to request the media sender(s) to use the new value, by creating a
730	   TSTR of its own.  Upon reaching a decision on the used trade-off
731	   it includes that value in the acknowledgement to the downstream
732	   requestors.  Only in cases where the original source has
733	   substantially higher quality (and bit rate), is it likely that
734	   transcoding alone can result in the requested trade-off.

736	3.5.2.4. Reliability
737	   A request and reception acknowledgement mechanism is specified.
738	   The Temporal Spatial Trade-off Notification (TSTN) message informs
739	   the request-sender that its request has been received, and what
740	   trade-off is used henceforth.  This acknowledgment mechanism is
741	   desirable for at least the following reasons:

743	   o A change in the trade-off cannot be directly identified from the
744	     media bit stream.
745	   o User feedback cannot be implemented without knowing the chosen
746	     trade-off value, according to the media sender's constraints.
747	   o Repetitive sending of messages requesting an unimplementable
748	     trade-off can be avoided.

750	3.5.3. H.271 Video Back Channel Message

752	   ITU-T Rec. H.271 defines syntax, semantics, and suggested encoder
753	   reaction to a video back channel message.  The structure defined
754	   in this memo is used to transparently convey such a message from
755	   media receiver to media sender.  In this memo, we refrain from an
756	   in-depth discussion of the available code points within H.271 and
757	   refer to the specification text [H.271] instead.

759	   However, we note that some H.271 messages bear similarities with
760	   native messages of AVPF and this memo.  Furthermore, we note that
761	   some H.271 messages are known to require caution in multicast
762	   environments -- or are plainly not usable in multicast or
763	   multipoint scenarios.  Table 1 provides a brief, oversimplifying
764	   overview of the messages currently defined in H.271, their roughly
765	   corresponding AVPF or CCM messages (the latter as specified in
766	   this memo), and an indication of our current knowledge of their
767	   multicast safety.

769	   H.271 msg type       AVPF/CCM msg type    multicast-safe
770	   ---------------------------------------------------------------------
771	   0 (when used for
772	     reference picture
773	      selection)        AVPF RPSI        No (positive ACK of pictures)
774	   1 picture loss       AVPF PLI         Yes
775	   2 partial loss       AVPF SLI         Yes
776	   3 one parameter CRC  N/A              Yes (no required sender action)
777	   4 all parameter CRC  N/A              Yes (no required sender action)
778	   5 refresh point      CCM FIR          Yes

780	   Table 1: H.271 messages and their AVPF/CCM equivalents

782	          Note: H.271 message type 0 is not a strict equivalent to
783	          AVPF's Reference Picture Selection Indication (RPSI); it is
784	          an indication of known-as-correct reference picture(s) at
785	          the decoder.  It does not command an encoder to use a
786	          defined reference picture (the form of control information
787	          envisioned to be carried in RPSI).  However, it is believed
788	          and intended that H.271 message type 0 will be used for the
789	          same purpose as AVPF's RPSI -- although other use forms are
790	          also possible.

792	   In response to the opaqueness of the H.271 messages especially
793	   with respect to the multicast safety, the following guidelines
794	   MUST be followed when an implementation wishes to employ the H.271
795	   video back channel message:

797	   1. Implementations utilizing the H.271 feedback message MUST stay
798	      in compliance with congestion control principles, as outlined
799	      in section 5.

802	   2. An implementation SHOULD utilize the IETF-native messages as
803	      defined in [RFC4585] and in this memo instead of similar
804	      messages defined in [H.271].  Our current understanding of
805	      similar messages is documented in Table 1 above.  One good
806	      reason to divert from the SHOULD statement above would be if it
807	      is clearly understood that, for a given application and video
808	      compression standard, the aforementioned "similarity" is not
809	      given, in contrast to what the table indicates.

812	   3. It has been observed that some of the H.271 code points
813	      currently in existence are not multicast-safe.  Therefore, the
814	      sensible thing to do is not to use the H.271 feedback message
815	      type in multicast environments.  It MAY be used only when all
816	      the issues mentioned later are fully understood by the
817	      implementer, and properly taken into account by all endpoints.
818	      In all other cases, the H.271 message type MUST NOT be used in
819	      conjunction with multicast.

821	   4. It has been observed that even in centralized multipoint
822	      environments, where the mixer should theoretically be able to
823	      resolve issues as documented below, the implementation of such
824	      a mixer and cooperative endpoints is a very difficult and
825	      tedious task.  Therefore, H.271 messages MUST NOT be used in
826	      centralized multipoint scenarios, unless all the issues
827	      mentioned below are fully understood by the implementer, and
828	      properly taken into account by both mixer and endpoints.
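
   Table 1 and guideline 2 can be restated as data, as in the
   following non-normative sketch.  The dictionary and function names
   are hypothetical.  The entry for type 0 applies only when that
   message is used for reference picture selection, and the mapping
   reflects the "roughly corresponding" messages of Table 1, not a
   strict equivalence.

```python
# H.271 message type -> (roughly corresponding native message,
#                        multicast-safe according to Table 1)
H271_EQUIVALENTS = {
    0: ("AVPF RPSI", False),  # only when used for reference picture selection
    1: ("AVPF PLI", True),
    2: ("AVPF SLI", True),
    3: (None, True),          # no native equivalent listed
    4: (None, True),          # no native equivalent listed
    5: ("CCM FIR", True),
}


def preferred_message(h271_type):
    """Return the IETF-native message to prefer, or "H.271" if none is listed."""
    native, _multicast_safe = H271_EQUIVALENTS[h271_type]
    return native if native is not None else "H.271"
```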

830	   Issues to be taken into account when considering the use of H.271
831	   in multipoint environments:

833	   1. Different state on different receivers.  In many environments
834	      it cannot be guaranteed that the decoder state of all media
835	      receivers is identical at any given point in time.  The most
836	      obvious reason for such a possible misalignment of state is a
837	      loss that occurs on the path to only one of many media
838	      receivers.  However, there are other not so obvious reasons,
839	      such as recent joins to the multipoint conference (be it by
840	      joining the multicast group or through additional mixer
841	      output).  Different states can lead the media receivers to
842	      issue potentially contradicting H.271 messages (or one media
843	      receiver issuing an H.271 message that, when observed by the
844	      media sender, is not helpful for the other media receivers).  A
845	      naive reaction of the media sender to these contradicting
846	      messages can lead to unpredictable and annoying results.

848	   2. Combining messages from different media receivers in a media
849	      sender is a non-trivial task.  As reasons, we note that these
850	      messages may be contradicting each other, and that their
851	      transport is unreliable (there may well be other reasons).  In
852	      case of many H.271 messages (i.e. types 0, 2, 3, and 4), the
853	      algorithm for combining must be aware both of the
854	      network/protocol environment (i.e. with respect to congestion)
855	      and of the media codec employed, as H.271 messages of a given
856	      type can have different semantics for different media codecs.

858	   3. The suppression of requests may need to go beyond the basic
859	      mechanisms described in AVPF (which are driven exclusively by
860	      timing and transport considerations on the protocol level).
861	      For example, a receiver is often required to refrain from (or
862	      delay) generating requests, based on information it receives
863	      from the media stream.  For instance, it makes no sense for a
864	      receiver to issue a FIR when a transmission of an Intra/IDR
865	      picture is ongoing.

867	   4. When using the non-multicast-safe messages (e.g. H.271 type 0
868	      positive ACK of received pictures/slices) in larger multicast
869	      groups, the media receiver will likely be forced to delay or
870	      even omit sending these messages.  For the media sender this
871	      looks like data has not been properly received (although it was
872	      received properly), and a naively implemented media sender
873	      reacts to these perceived problems where it should not.

875	3.5.3.1. Reliability

877	   H.271 Video Back Channel messages do not require reliable
878	   transmission, and confirmation of the reception of a message can
879	   be derived from the forward video bit stream.  Therefore, no
880	   specific reception acknowledgement is specified.

882	   With respect to re-sending rules, clause 3.5.1.1. applies.

884	3.5.4. Temporary Maximum Media Stream Bit Rate Request and
885	   Notification

887	   A receiver, translator or mixer uses the Temporary Maximum Media
888	   Stream Bit Rate Request (TMMBR, "timber") to request a sender to
889	   limit the maximum bit rate for a media stream (see 2.2) to, or
890	   below, the provided value.  The Temporary Maximum Media Stream Bit
891	   Rate Notification (TMMBN) contains the media sender's current view
892	   of the most limiting subset of the TMMBR-defined limits it has
893	   received, to help the participants to suppress TMMBR requests that
894	   would not further restrict the media sender.  The primary usage
895	   for the TMMBR/TMMBN messages is in a scenario with an MCU or mixer
896	   (use case 6), corresponding to Topo-Translator or Topo-Mixer, but
897	   also to Topo-Point-to-Point.

899	   Each temporary limitation on the media stream is expressed as a
900	   tuple.  The first component of the tuple is the maximum total
901	   media bit rate (as defined in section 2.2) that the media receiver
902	   is currently prepared to accept for this media stream.  The second
903	   component is the per-packet overhead that the media receiver has
904	   observed for this media stream at its chosen reference protocol
905	   layer.

907	   As indicated in section 2.2, the overhead as observed by the
908	   sender of the TMMBR (i.e. the media receiver) may differ from the
909	   overhead observed at the receiver of the TMMBR (i.e. the media
910	   sender) due to use of a different reference protocol layer at the
911	   other end or due to the intervention of translators or mixers that
912	   affect the amount of per packet overhead.  For example, a gateway
913	   in between the two that converts between IPv4 and IPv6 affects the
914	   per-packet overhead by 20 bytes.  Other mechanisms that change the
915	   overhead include tunnels.  The problem with varying overhead is
916	   also discussed in [RFC3890].  As will be seen in the description
917	   of the algorithm for use of TMMBR, the difference in perceived
918	   overhead between the sending and receiving ends presents no
919	   difficulty because calculations are carried out in terms of
920	   variables (packet rate, net media bit rate) that have the same
921	   value at the sender as at the receiver.

923	   Reporting both maximum total media bit rate and per-packet
924	   overhead allows different receivers to provide bit rate and
925	   overhead values for different protocol layers, for example at the
926	   IP level, at the outer part of a tunnel protocol, or at the link
927	   layer.  The protocol level a peer reports on depends on the level
928	   of integration the peer has, as it needs to be able to extract the
929	   information from that protocol level.  For example, an application
930	   with no knowledge of the IP version it is running over cannot
931	   meaningfully determine the overhead of the IP header, and hence
932	   will not want to include IP overhead in the overhead or maximum
933	   total media bit rate calculation.

935	   It is expected that most peers will be able to report values at
936	   least for the IP layer.  In certain implementations it may be
937	   advantageous to also include information pertaining to the link
938	   layer, which in turn allows for a more precise overhead
939	   calculation and a better optimization of connectivity resources.

941	   The Temporary Maximum Media Stream Bit Rate messages are generic
942	   messages that can be applied to any RTP packet stream.  This
943	   separates them from the other codec control messages defined in
944	   this specification, which apply only to specific media types or
945	   payload formats.  The TMMBR functionality applies to the
946	   transport, and the requirements the transport places on the media
947	   encoding.

949	   The reasoning below assumes that the participants have negotiated
950	   a session maximum bit rate, using a signaling protocol.  This
951	   value can be global, for example in case of point-to-point,
952	   multicast, or translators.  It may also be local between the
953	   participant and the peer or mixer.  In either case, the bit rate
954	   negotiated in signaling is the one that the participant guarantees
955	   to be able to handle (depacketize and decode).  In practice, the
956	   connectivity of the participant also influences the negotiated
957	   value -- it does not make much sense to negotiate a total media
958	   bit rate that one's network interface does not support.

960	   It is also beneficial to have negotiated a maximum packet rate for
961	   the session or sender.  RFC 3890 provides an SDP [RFC4566]
962	   attribute that can be used for this purpose; however, that
963	   attribute is not usable in RTP sessions established using
964	   offer/answer [RFC3264].  Therefore an optional maximum packet rate
965	   signaling parameter is specified in this memo.

967	   An already established maximum total media bit rate may be changed
968	   at any time, subject to the timing rules governing the sending of
969	   feedback messages. The limit may change to any value between zero
970	   and the session maximum, as negotiated during session
971	   establishment signaling.  However, even if a sender has received a
972	   TMMBR message allowing an increase in the bit rate, all increases
973	   must be governed by a congestion control mechanism.  TMMBR
974	   indicates known limitations only, usually in the local
975	   environment, and does not provide any guarantees about the full
976	   path.  Furthermore, any increases in TMMBR-established bit rate
977	   limits are to be executed only after a certain delay from the
978	   sending of the TMMBN message that notifies the world about the
979	   increase in limit.  The delay is specified as at least twice the
980	   longest RTT as known by the media sender, plus the media sender's
981	   calculation of the required wait time for the sending of another
982	   TMMBR message for this session based on AVPF timing rules.  This
983	   delay is introduced to allow other session participants to make
984	   known their bit rate limit requirements, which may be lower.
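
   The minimum delay before an increased limit takes effect can be
   written out as a short, non-normative sketch.  The function and
   parameter names are hypothetical; the wait time for another TMMBR
   is assumed to have been derived separately from the AVPF timing
   rules and is simply passed in.

```python
def increase_delay(known_rtts, tmmbr_wait):
    """Minimum delay before executing an increase in the bit rate limit.

    known_rtts: RTT values known to the media sender (same unit as
                tmmbr_wait, for example milliseconds).
    tmmbr_wait: the sender's estimate of the wait required for the
                sending of another TMMBR under AVPF timing rules.
    """
    return 2 * max(known_rtts) + tmmbr_wait


delay_ms = increase_delay([50, 200], tmmbr_wait=500)
```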

986	   If it is likely that the new value indicated by TMMBR will be
987	   valid for the remainder of the session, the TMMBR sender is
988	   expected to perform a renegotiation of the session upper limit
989	   using the session signaling protocol.

991	3.5.4.1. Behavior for media receivers using TMMBR

993	   This section is an informal description of behavior described
994	   more precisely in section 4.2.

996	   A media sender begins the session limited by the maximum media bit
997	   rate and maximum packet rate negotiated in session signaling, if
998	   any. Note that this value may be negotiated for another protocol
999	   layer than the one the participant uses in its TMMBR messages.
1000	   Each media receiver selects a reference protocol layer, forms an
1001	   estimate of the overhead it is observing (or expects to observe, if
1002	   no packets have been seen yet) at that reference level, and determines
1003	   the maximum total media bit rate it can accept, taking into
1004	   account its own limitations and any transport path limitations of
1005	   which it may be aware.  In case the current limitations are more
1006	   restrictive than what was agreed on in the session signaling, the
1007	   media receiver reports its initial estimate of these two
1008	   quantities to the media sender using a TMMBR message.  Overall
1009	   message traffic is reduced by the possibility of including tuples
1010	   for multiple media senders in the same TMMBR message.

1012	   The media sender applies an algorithm such as that specified in
1013	   section 3.5.4.2 to select which of the tuples it has received are
1014	   most limiting (i.e. the bounding set as defined in section 2.2).
1015	   It modifies its operation to stay within the feasible region (as
1016	   defined in section 2.2), and also sends out a TMMBN notification
1017	   to the media receivers indicating the selected bounding set.

1019	   If a media receiver does not own one of the tuples in the bounding
1020	   set reported by the TMMBN, it applies the same algorithm as the
1021	   media sender to determine if its current estimated (maximum total
1022	   media bit rate, overhead) tuple would enter the bounding set if
1023	   known to the media sender.  If so, it issues a TMMBR request
1024	   reporting the tuple value to the sender.  Otherwise it takes no
1025	   action for the moment.  Periodically, its estimated tuple values
1026	   may change or it may receive a new TMMBN.  If so, it reapplies the
1027	   algorithm to decide whether it needs to issue a TMMBR request.

1029	   If, alternatively, a media receiver owns one of the tuples in the
1030	   reported bounding set, it takes no action until such time as its
1031	   estimate of its own tuple values changes.  At that time it sends a
1032	   TMMBR request to the media sender to report the changed values.

1034	   A media receiver may change status between owner and non-owner of
1035	   a bounding tuple between one TMMBN message and the next.  Thus it
1036	   must check the contents of each TMMBN to determine its subsequent
1037	   actions.
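
   The owner and non-owner cases above reduce to a small decision
   that a media receiver re-evaluates on each TMMBN and on each
   change of its own estimate.  The sketch below is non-normative;
   the function name is hypothetical, and the test of whether a tuple
   would enter the bounding set is assumed to be computed elsewhere
   and passed in as a boolean.

```python
def should_send_tmmbr(owns_bounding_tuple, own_tuple_changed,
                      would_enter_bounding_set):
    """Decide whether a media receiver issues a TMMBR request.

    owns_bounding_tuple:      the latest TMMBN lists one of this
                              receiver's tuples in the bounding set.
    own_tuple_changed:        this receiver's estimated tuple has changed.
    would_enter_bounding_set: result of applying the bounding set
                              algorithm to this receiver's tuple.
    """
    if owns_bounding_tuple:
        # An owner only reports changes to its own tuple values.
        return own_tuple_changed
    # A non-owner reports only if its tuple would become bounding.
    return would_enter_bounding_set
```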

1039	   Implementations may use other algorithms of their choosing, as
1040	   long as the bit rate limitations resulting from the exchange of
1041	   TMMBR and TMMBN messages are at least as strict (at least as low,
1042	   in the bit rate dimension) as the ones resulting from the use of
1043	   the aforementioned algorithm.

1045	   Obviously, in point-to-point cases, when there is only one media
1046	   receiver, this receiver becomes "owner" once it receives the first
1047	   TMMBN in response to its own TMMBR, and stays "owner" for the rest
1048	   of the session.  Therefore, when it is known that there will
1049	   always be only a single media receiver, the above algorithm is not
1050	   required.  Media receivers that are aware they are the only ones
1051	   in a session can send TMMBR messages with bit rate limits both
1052	   higher and lower than the previously notified limit, at any time
1053	   (subject to the AVPF [RFC4585] RTCP RR send timing rules).
1054	   However, it may be difficult for a session participant to
1055	   determine if it is the only receiver in the session.  Because of
1056	   this, any implementation of TMMBR is required to include the
1057	   algorithm described in the next section or a stricter equivalent.

1059	3.5.4.2. Algorithm for establishing current limitations

1061	   This section introduces an example algorithm for the calculation
1062	   of a session limit.  Other algorithms can be employed, as long as
1063	   the result of the calculation is at least as restrictive as the
1064	   result that is obtained by this algorithm.

1066	   First it is important to consider the implications of using a
1067	   tuple for limiting the media sender's behavior.  The bit rate and
1068	   the overhead value result in a two-dimensional solution space for
1069	   the calculation of the bit rate of media streams.  Fortunately the
1070	   two variables are linked. Specifically, the bit rate available for
1071	   RTP payloads is equal to the TMMBR reported bit rate minus the
1072	   product of the packet rate used and the TMMBR reported overhead
1073	   converted to bits.  As a result, when different bit rate/overhead
1074	   combinations need to be considered, the packet rate determines the
1075	   correct limitation.  This is perhaps best explained by an example:

1077	   Example:

1079	   Receiver A: TMMBR_max total BR = 35 kbps, TMMBR_OH = 40 bytes
1080	   Receiver B: TMMBR_max total BR = 40 kbps, TMMBR_OH = 60 bytes

1082	   For a given packet rate (PR) the bit rate available for media
1083	   payloads in RTP will be:

1085	   Max_net media_BR_A = TMMBR_max total BR_A - PR * TMMBR_OH_A * 8
1086	   ... (1)
1087	   Max_net media_BR_B = TMMBR_max total BR_B - PR * TMMBR_OH_B * 8
1088	   ... (2)

1090	   For a PR = 20 these calculations will yield a Max_net media_BR_A =
1091	   28600 bps and Max_net media_BR_B = 30400 bps, which suggests that
1092	   receiver A is the limiting one for this packet rate.  However at a
1093	   certain PR there is a switchover point at which receiver B becomes
1094	   the limiting one.  The switchover point can be identified by
1095	   setting Max_net media_BR_A equal to Max_net media_BR_B and solving
1096	   for PR:

1098	         TMMBR_max total BR_A - TMMBR_max total BR_B
1099	   PR =  ------------------------------------------- ... (3)
1100	                8*(TMMBR_OH_A - TMMBR_OH_B)

1102	   which, for the numbers above, yields 31.25 as the switchover point
1103	   between the two limits.  That is, for packet rates below 31.25 per
1104	   second, receiver A is the limiting receiver, and for higher packet
1105	   rates, receiver B is more limiting.  The implications of this
1106	   behavior have to be considered by implementations that are going
1107	   to control media encoding and its packetization.  As exemplified
1108	   above, multiple TMMBR limits may apply to the trade-off between
1109	   net media bit rate and packet rate.  Which limitation applies
1110	   depends on the packet rate being considered.
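
   The arithmetic of this example can be reproduced with a minimal
   Python sketch.  The function and variable names below are
   illustrative and are not defined by this memo.  Bit rates are in
   bits per second, overheads in bytes, and packet rates in packets
   per second.

```python
def net_media_bit_rate(max_total_br, overhead_bytes, packet_rate):
    """Equations (1) and (2): bit rate left for RTP payloads."""
    return max_total_br - packet_rate * overhead_bytes * 8


def switchover_packet_rate(br_a, oh_a, br_b, oh_b):
    """Equation (3): packet rate at which two tuples limit equally."""
    if oh_a == oh_b:
        # Equal overheads give parallel lines, which never intersect.
        raise ValueError("tuples with equal overhead have no switchover")
    return (br_a - br_b) / (8 * (oh_a - oh_b))


def limiting_net_rate(tuples, packet_rate):
    """Lowest net media bit rate allowed by any (bit rate, overhead) tuple."""
    return min(net_media_bit_rate(br, oh, packet_rate) for br, oh in tuples)


# Receivers A and B from the example above.
RECEIVER_A = (35000, 40)
RECEIVER_B = (40000, 60)
```

   Below the switchover point receiver A yields the lower net rate,
   and above it receiver B does, which is why limiting_net_rate()
   takes the minimum over all tuples at the packet rate of interest.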

1112	   This also has implications for how the TMMBR mechanism needs to
1113	   work.  First, there is the possibility that multiple TMMBR tuples
1114	   are providing limitations on the media sender.  Secondly there is
1115	   a need for any session participant (media sender and receivers) to
1116	   be able to determine if a given tuple will become a limitation
1117	   upon the media sender, or if the set of already given limitations
1118	   is stricter than the given values.  In the absence of the ability
1119	   to make this determination the suppression of TMMBR requests would
1120	   not work.

1122	   The basic idea of the algorithm is as follows.  Each TMMBR tuple
1123	   can be viewed as the equation of a straight line (cf. equations
1124	   (1) and (2)) in a space where packet rate lies along the X-axis
1125	   and maximum bit rate lies along the Y-axis. The lower envelope of
1126	   the set of lines corresponding to the complete set of TMMBR tuples
1127	   defines a polygon. Points lying along or below this polygon are
1128	   combinations of packet rate and bit rate that meet all of the
1129	   TMMBR constraints. The highest feasible packet rate within this
1130	   region is the minimum of the rate at which the bounding polygon
1131	   meets the X-axis or the session maximum packet rate (SMAXPR)
1132	   provided by signaling, if any. Typically a media sender will
1133	   prefer to operate at a lower rate than this theoretical maximum,
1134	   so as to increase the rate at which actual media content reaches
1135	   the receivers.  The purpose of the algorithm is to distinguish the
1136	   TMMBR tuples constituting the bounding set and thus delineate the
1137	   feasible region, so that the media sender can select its preferred
1138	   operating point within that region.

1140	   Figure 1 below shows a bounding polygon formed by TMMBR tuples A
1141	   and B. A third tuple C lies outside the bounding polygon and is
1142	   therefore irrelevant in determining feasible tradeoffs between
1143	   media rate and packet rate.  The line labeled ss..s represents the
1144	   limit on packet rate imposed by the session maximum packet rate
1145	   (SMAXPR) obtained by signaling during session setup.  In Figure 1
1146	   the limit determined by tuple B happens to be more restrictive
1147	   than SMAXPR.  The situation could easily be the reverse, meaning
1148	   that the bounding polygon is terminated on the right by the
1149	   vertical line representing the SMAXPR constraint.

1151	   Net  ^
1152	   Media|a   c   b             s
1153	   Bit  |  a   c  b            s
1154	   Rate |    a   c b           s
1155	        |      a   cb          s
1156	        |        a   c         s
1157	        |          a  bc       s
1158	        |            a b c     s
1159	        |              ab  c   s
1160	        |  Feasible      b   c s
1161	        |   region        ba   s
1162	        |                  b a s c
1163	        |                   b  s   c
1164	        |                    b s a
1165	        |                     bs
1166	        +------------------------------>
1167	              Packet rate

1169	    Figure 1 - Geometric Interpretation of TMMBR Tuples

1171	   Note that the slopes of the lines making up the bounding polygon
1172	   are increasingly negative as one moves in the direction of
1173	   increasing packet rate.  Note also that with slight rearrangement,
1174	   equations (1) and (2) have the canonical form:

1176	          y = mx + b

1178	   where
1179	     m is the slope and has value equal to the negative of the tuple
1180	     overhead (in bits),
1181	   and
1182	     b is the y-intercept and has value equal to the tuple maximum
1183	     total media bit rate.

1185	   These observations lead to the conclusion that when processing the
1186	   TMMBR tuples to select the initial bounding set, one should sort
1187	   and process the tuples by order of increasing overhead. Once a
1188	   particular tuple has been added to the bounding set, all tuples
1189	   not already selected and having lower overhead can be eliminated,
1190	   because the next side of the bounding polygon has to be steeper
1191	   (i.e. the corresponding TMMBR must have higher overhead) than the
1192	   latest added tuple.

1194	   Line cc..c in Figure 1 illustrates another principle. This line is
1195	   parallel to line aa..a, but has a higher Y-intercept.  That is,
1196	   the corresponding TMMBR tuple contains a higher maximum total
1197	   media bit rate value.  Since line cc..c is outside the bounding
1198	   polygon, it illustrates the conclusion that if two TMMBR tuples
1199	   have the same overhead value, the one with higher maximum total
1200	   media bit rate value cannot be part of the bounding set and can be
1201	   set aside.

1203	   Two further observations complete the algorithm.  Obviously,
1204	   moving from the left, the successive corners of the bounding
1205	   polygon (i.e. the intersection points between successive pairs of
1206	   sides) lie at successively higher packet rates.  On the other
1207	   hand, again moving from the left, each successive line making up
1208	   the bounding set crosses the X-axis at a lower packet rate.

1210	   The complete algorithm can now be specified.  The algorithm works
1211	   with two lists of TMMBR tuples, the candidate list X and the
1212	   selected list Y, both ordered by increasing overhead value.  The
1213	   algorithm terminates when all members of X have been discarded or
1214	   removed for processing.  Membership of the selected list Y is
1215	   probationary until the algorithm is complete.  Each member of the
1216	   selected list is associated with an intersection value, which is
1217	   the packet rate at which the line corresponding to that TMMBR
1218	   tuple intersects with the line corresponding to the previous TMMBR
1219	   tuple in the selected list.  Each member of the selected list is
1220	   also associated with a maximum packet rate value, which is the
1221	   lesser of the session maximum packet rate SMAXPR (if any) and the
1222	   packet rate at which the line corresponding to that tuple crosses
1223	   the X-axis.

1225	   When the algorithm terminates, the selected list is equal to the
1226	   bounding set as defined in section 2.2.

1228	Initial Algorithm

1230	   This algorithm is used by the media sender when it has received
1231	   one or more TMMBR requests and before it has determined a bounding
1232	   set for the first time.

1234	   1. Sort the TMMBR tuples by order of increasing overhead.  This is
1235	      the initial candidate list X.

1237	   2. When multiple tuples in the candidate list have the same
1238	      overhead value, discard all but the one with the lowest maximum
1239	      total media bit rate value.

1241	   3. Select and remove from the candidate list the TMMBR tuple with
1242	      the lowest maximum total media bit rate value.  If there is more
1243	      than one tuple with that value, choose the one with the highest
1244	      overhead value.  This is the first member of the selected list
1245	      Y.  Set its intersection value equal to zero.  Calculate its
1246	      maximum packet rate as the minimum of SMAXPR (if available) and
1247	      the value obtained from the following formula, which is the
1248	      packet rate at which the corresponding line crosses the X-axis.

1250	          Max PR = TMMBR max total BR / (8 * TMMBR OH) ... (4)

1252	   4. Discard from the candidate list all tuples with a lower overhead
1253	      value than the selected tuple.

1255	   5. Remove the first remaining tuple from the candidate list for
1256	      processing.  Call this the current candidate.

1258	   6. Calculate the packet rate PR at the intersection of the line
1259	      generated by the current candidate with the line generated by
1260	      the last tuple in the selected list Y, using equation (3).

1262	   7. If the calculated value PR is equal to or lower than the
1263	      intersection value stored for the last tuple of the selected
1264	      list, discard the last tuple of the selected list and go back to
1265	      step 6 (retaining the same current candidate).

1267	      Note that the choice of the initial member of the selected list
1268	      Y in step 3 guarantees that the selected list will never be
1269	      emptied by this process, meaning that the algorithm must
1270	      eventually (if not immediately) fall through to step 8.

1272	   8. (This step is reached when the calculated PR value of the
1273	      current candidate is greater than the intersection value of the
1274	      current last member of the selected list Y.)  If the calculated
1275	      value PR of the current candidate is lower than the maximum
1276	      packet rate associated with the last tuple in the selected list,
1277	      add the current candidate tuple to the end of the selected list.
1278	      Store PR as its intersection value.  Calculate its maximum
1279	      packet rate as the lesser of SMAXPR (if available) and the
1280	      maximum packet rate calculated using equation (4).

1282	   9. If any tuples remain in the candidate list, go back to step 5.
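
   The steps above can be sketched in Python as follows.  This is an
   illustration under stated assumptions, not a normative
   implementation.  Each tuple is represented as a pair (maximum
   total media bit rate in bits per second, overhead in bytes), the
   function names are hypothetical, and smaxpr is None when no
   session maximum packet rate was signaled.  As a further
   assumption, a tuple with zero overhead is treated as a line that
   never crosses the X-axis.

```python
import math


def intersection_pr(t1, t2):
    """Equation (3): packet rate where the lines of two tuples meet."""
    (br1, oh1), (br2, oh2) = t1, t2
    return (br1 - br2) / (8 * (oh1 - oh2))


def max_packet_rate(t, smaxpr=None):
    """Lesser of SMAXPR (if any) and the X-axis crossing, equation (4)."""
    br, oh = t
    crossing = br / (8 * oh) if oh > 0 else math.inf  # assumption for oh == 0
    return crossing if smaxpr is None else min(smaxpr, crossing)


def initial_bounding_set(tuples, smaxpr=None):
    # Step 1: sort by increasing overhead (ties broken by bit rate).
    ordered = sorted(tuples, key=lambda t: (t[1], t[0]))
    # Step 2: per overhead value keep only the lowest bit rate.
    candidates = []
    for t in ordered:
        if not candidates or candidates[-1][1] != t[1]:
            candidates.append(t)
    if not candidates:
        return []
    # Step 3: lowest bit rate first; on a tie, the highest overhead.
    first = min(candidates, key=lambda t: (t[0], -t[1]))
    selected = [first]
    intersections = [0.0]
    max_prs = [max_packet_rate(first, smaxpr)]
    # Step 4: drop candidates with lower overhead (and the selected tuple).
    candidates = [t for t in candidates if t[1] > first[1]]
    # Steps 5 and 9: process the remaining candidates in order.
    for cand in candidates:
        # Steps 6 and 7: pop selected tuples that the candidate cuts off.
        pr = intersection_pr(cand, selected[-1])
        while pr <= intersections[-1]:
            selected.pop()
            intersections.pop()
            max_prs.pop()
            pr = intersection_pr(cand, selected[-1])
        # Step 8: keep the candidate only if it limits inside the region.
        if pr < max_prs[-1]:
            selected.append(cand)
            intersections.append(pr)
            max_prs.append(max_packet_rate(cand, smaxpr))
    return selected
```

   With the tuples A = (35000, 40) and B = (40000, 60) from the
   example in this section, plus a tuple parallel to A with a higher
   bit rate, the sketch selects A and B.  With an SMAXPR below the
   switchover point of 31.25, only A remains.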

1284	Incremental Algorithm
1285	   The previous algorithm covered the initial case, where no selected
1286	   list had previously been created.  It also applied only to the
1287	   media sender.  When a previously-created selected list is
1288	   available at either the media sender or media receiver, two other
1289	   cases can be considered:

1291	        o when a TMMBR tuple not currently in the selected list is a
1292	          candidate for addition;

1294	        o when the values change in a TMMBR tuple currently in the
1295	          selected list.

1297	   At the media receiver these cases correspond respectively to those
1298	   of the non-owner and owner of a tuple in the TMMBN-reported
1299	   bounding set.

1301	   In either case, the process of updating the selected list to take
1302	   account of the new/changed tuple can use the basic algorithm
1303	   described above, with the modification that the initial candidate
1304	   set consists only of the existing selected list and the new or
1305	   changed tuple.  Some further optimization is possible (beyond
1306	   starting with a reduced candidate set) by taking advantage of the
1307	   following observations.

1309	   The first observation is that if the new/changed candidate becomes
1310	   part of the new selected list, the result may be to cause zero or
1311	   more other tuples to be dropped from the list.  However, if more
1312	   than one other tuple is dropped, the dropped tuples will be
1313	   consecutive.  This can be confirmed geometrically by visualizing a
1314	   new line that cuts off a series of segments from the previously-
1315	   existing bounding polygon.  The cut-off segments are connected one
1316	   to the next, the geometric equivalent of consecutive tuples in a
1317	   list ordered by overhead value.  Beyond the dropped set in either
1318	   direction all of the tuples that were in the earlier selected list
1319	   will be in the updated one.  The second observation is that,
1320	   leaving aside the new candidate, the order of tuples remaining in
1321	   the updated selected list is unchanged because their overhead
1322	   values have not changed.

1324	   The consequence of these two observations is that, once the
1325	   placement of the new candidate and the extent of the dropped set
1326	   of tuples (if any) have been determined, the remaining tuples can
1327	   be copied directly from the candidate list into the selected list,
1328	   preserving their order.  This conclusion suggests the following
1329	   modified algorithm:

1331	       o Run steps 1-4 of the basic algorithm.

1333	       o If the new candidate has survived steps 2 and 4 and has
1334	          become the new first member of the selected list, run steps
1335	          5-9 on subsequent candidates until another candidate is
1336	          added to the selected list.  Then move all remaining
1337	          candidates to the selected list, preserving their order.

1339	       o If the new candidate has survived steps 2 and 4 and has not
1340	          become the new first member of the selected list, start by
1341	          moving all tuples in the candidate list with lower overhead
1342	          values than that of the new candidate to the selected list,
1343	          preserving their order.  Run steps 5 through 9 for the new
1344	          candidate, with the modification that the intersection
1345	          values and maximum packet rates for the tuples on the
1346	          selected list have to be calculated on the fly because they
1347	          were not previously stored.  Continue processing only until
1348	          a subsequent tuple has been added to the selected list, then
1349	          move all remaining candidates to the selected list,
1350	          preserving their order.

1352	          Note that the new candidate could be added to the selected
1353	          list only to be dropped again when the next tuple is
1354	          processed.  It can easily be seen that in this case the new
1355	          candidate does not displace any of the earlier tuples in the
1356	          selected list.  The limitations of ASCII art make this
1357	          difficult to show in a figure.  Line cc..c in Figure 1 would
1358	          be an example if it had a steeper slope (tuple C had a
1359	          higher overhead value), but still intersected line aa..a
1360	          beyond where line aa..a intersects line bb..b.

1362	   The algorithm just described is approximate, because it does not
1363	   take account of tuples outside the selected list.  To see how such
1364	   tuples can become relevant, consider Figure 1 and suppose that the
1365	   maximum total media bit rate in tuple A increases to the point
1366	   that line aa..a moves outside line cc..c.  Tuple A will remain in
1367	   the bounding set calculated by the media sender.  However, once it
1368	   issues a new TMMBN, media receiver C will apply the algorithm and
1369	   discover that its tuple C should now enter the bounding set.  It
1370	   will issue a TMMBR request to the media sender, which will repeat
1371	   its calculation and come to the appropriate conclusion.

1373	   The rules of section 4.2 require that the media sender refrain
1374	   from raising its sending rate until media receivers have had a
1375	   chance to respond to the TMMBN.  In the example just given, this
1376	   delay ensures that the relaxation of tuple A does not actually
1377	   result in an attempt to send media at a rate exceeding the
1378	   capacity at C.

1380	3.5.4.3. Use of TMMBR in a Mixer Based Multipoint Operation

1382	   Assume a small mixer-based multiparty conference is ongoing, as
1383	   depicted in Topo-Mixer of [Topologies].  All participants have
1384	   negotiated a common maximum bit rate that this session can use.
1385	   The conference operates over a number of unicast paths between the
1386	   participants and the mixer.  The congestion situation on each of
1387	   these paths can be monitored by the participant in question and by
1388	   the mixer, utilizing, for example, RTCP receiver reports (RR) or
1389	   the transport protocol, e.g. DCCP [RFC4340].  However, any given
1390	   participant has no knowledge of the congestion situation of the
1391	   connections to the other participants.  Worse, without mechanisms
1392	   similar to the ones discussed in this draft, the mixer (which is
1393	   aware of the congestion situation on all connections it manages)
1394	   has no standardized means to inform media senders to slow down,
1395	   short of forging its own receiver reports (which is undesirable).
1396	   In principle, a mixer confronted with such a situation is obliged
1397	   to thin or transcode streams intended for connections that
1398	   detected congestion.

1400	   In practice, media-aware stream thinning is unfortunately a very
1401	   difficult and cumbersome operation and adds undesirable delay.  If
1402	   media-unaware, it leads very quickly to unacceptable reproduced
1403	   media quality.  Hence, a means to slow down senders even in the
1404	   absence of congestion on their connections to the mixer is
1405	   desirable.

1407	   To allow the mixer to throttle traffic on the individual links,
1408	   without performing transcoding, there is a need for a mechanism
1409	   that enables the mixer to ask a participant's media encoders to
1410	   limit the media stream bit rate they are currently generating.
1411	   TMMBR provides the required mechanism.  When the mixer detects
1412	   congestion between itself and a given participant, it executes the
1413	   following procedure:

1415	   1. It starts thinning the media traffic to the congested
1416	      participant to the supported bit rate.

1418	   2. It uses TMMBR to request the media sender(s) to reduce the
1419	      total media bit rate sent by them to the mixer, to a value that
1420	      is in compliance with congestion control principles for the
1421	      slowest link.  Slow refers here to the available bandwidth /
1422	      bit rate / capacity and packet rate after congestion control.

1424	   3. As soon as the bit rate has been reduced by the sending party,
1425	      the mixer stops stream thinning implicitly, because there is no
1426	      need for it once the stream is in compliance with congestion
1427	      control.

1429	   This use of stream thinning as an immediate reaction tool followed
1430	   up by a quick control mechanism appears to be a reasonable
1431	   compromise between media quality and the need to combat
1432	   congestion.

1434	3.5.4.4. Use of TMMBR in Point-to-Multipoint Using Multicast or
1435	   Translators

1437	   In these topologies, corresponding to Topo-Multicast or Topo-
1438	   Translator, RTCP RRs are transmitted globally.  This allows all
1439	   participants to detect transmission problems such as congestion,
1440	   on a medium timescale.  As all media senders are aware of the
1441	   congestion situation of all media receivers, the rationale for the
1442	   use of TMMBR in the previous section does not apply.  However,
1443	   even in this case the congestion control response can be improved
1444	   when the unicast links are using congestion controlled transport
1445	   protocols (such as TCP or DCCP).  A peer may also report local
1446	   limitations to the media sender.

1448	3.5.4.5. Use of TMMBR in Point-to-point operation

1450	   In use case 7 it is possible to use TMMBR to improve the
1451	   performance when the known upper limit of the bit rate changes.
1452	   In this use case the signaling protocol has established an upper
1453	   limit for the session and total media bit rates.  However, at the
1454	   time of transport link bit rate reduction, a receiver can avoid
1455	   serious congestion by sending a TMMBR to the sending side.  Thus
1456	   TMMBR is useful for putting restrictions on the application and
1457	   thus placing the congestion control mechanism in the right
1458	   ballpark.  However TMMBR is usually unable to provide the
1459	   continuously quick feedback loop required for real congestion
1460	   control.  Nor do its semantics match those of congestion control
1461	   given its different purpose.  For these reasons TMMBR SHALL NOT be
1462	   used as a substitute for congestion control.

1464	3.5.4.6. Reliability

1466	   The reaction of a media sender to the reception of a TMMBR message
1467	   is not immediately identifiable through inspection of the media
1468	   stream.  Therefore, a more explicit mechanism is needed to avoid
1469	   unnecessary re-sending of TMMBR messages.  Using a statistically
1470	   based retransmission scheme would only provide statistical
1471	   guarantees of the request being received.  It would also not avoid
1472	   the retransmission of already received messages.  In addition, it
1473	   would not allow for easy suppression of other participants'
1474	   requests.  For these reasons, a mechanism based on explicit
1475	   notification is used.

1477	   Upon the reception of a request a media sender sends a TMMBN
1478	   notification containing the current bounding set, and indicating
1479	   which session participants own that limit.  In multicast
1480	   scenarios, that allows all other participants to suppress any
1481	   request they may have, if their limitations are less strict than
1482	   the current ones (i.e. define lines lying outside the feasible
1483	   region as defined in section 2.2).  Keeping and notifying only the
1484	   bounding set of tuples allows for small message sizes and media
1485	   sender states.  A media sender only keeps state for the SSRCs of
1486	   the current owners of the bounding set of tuples; all other
1487	   requests and their sources are not saved.  Once the bounding set
1488	   has been established, new TMMBR messages should be generated only
1489	   by owners of the bounding tuples and by other entities that
1490	   determine (by applying the algorithm of section 3.5.4.2 or its
1491	   equivalent) that their limitations should now be part of the
1492	   bounding set.

1494	4. RTCP Receiver Report Extensions

1496	   This memo specifies six new feedback messages.  The Full Intra
1497	   Request (FIR), Temporal-Spatial Trade-off Request (TSTR),
1498	   Temporal-Spatial Trade-off Notification (TSTN), and Video Back
1499	   Channel Message (VBCM) are "Payload Specific Feedback Messages" as
1500	   defined in Section 6.3 of AVPF [RFC4585].  The Temporary Maximum
1501	   Media Stream Bit Rate Request (TMMBR) and Temporary Maximum Media
1502	   Stream Bit Rate Notification (TMMBN) are "Transport Layer Feedback
1503	   Messages" as defined in Section 6.2 of AVPF.

1505	   The new feedback messages are defined in the following
1506	   subsections, following a similar structure to that in sections 6.2
1507	   and 6.3 of the AVPF specification [RFC4585].

1509	4.1. Design Principles of the Extension Mechanism

1511	   RTCP was originally introduced as a channel to convey presence,
1512	   reception quality statistics and hints on the desired media
1513	   coding.  A limited set of media control mechanisms were introduced
1514	   in early RTP payload formats for video formats, for example in RFC
1515	   2032 [RFC2032].  However, this specification, for the first time,
1516	   suggests a two-way handshake for some of its messages.  There is
1517	   a danger that this introduction could be misunderstood as a
1518	   precedent for the use of RTCP as an RTP session control protocol.
1519	   To prevent such a misunderstanding, this subsection attempts to
1520	   clarify the scope of the extensions specified in this memo, and
1521	   strongly suggests that future extensions follow the rationale
1522	   spelled out here, or compellingly explain why they deviate from the
1523	   rationale.

1525	   In this memo, and in AVPF [RFC4585], only such messages have been
1526	   included as:

1528	   a) have comparatively strict real-time constraints, which prevent
1529	      the use of mechanisms such as a SIP re-invite in most
1530	      application scenarios.  The real-time constraints are explained
1531	      separately for each message where necessary.

1533	   b) are multicast-safe in that the reaction to potentially
1534	      contradicting feedback messages is specified, as necessary for
1535	      each message; and

1537	   c) are directly related to activities of a certain media codec,
1538	      class of media codecs (e.g. video codecs), or a given RTP
1539	      packet stream.

1541	   In this memo, a two-way handshake is introduced only for messages
1542	   for which:

1544	   a) a notification or acknowledgement is required due to their
1545	      nature. An analysis to determine whether this requirement
1546	      exists has been performed separately for each message.

1548	   b) the notification or acknowledgement cannot be easily derived
1549	      from the media bit stream.

1551	   All messages in AVPF [RFC4585] and in this memo present their
1552	   contents in a simple, fixed binary format.  This accommodates
1553	   media receivers which have not implemented higher control protocol
1554	   functionalities (SDP, XML parsers and such) in their media path.

1556	   Messages that do not conform to the design principles just
1557	   described are not an appropriate use of RTCP or of the Codec
1558	   Control Framework defined in this document.

1560	4.2. Transport Layer Feedback Messages

1562	   As specified in section 6.1 of RFC 4585 [RFC4585], Transport Layer
1563	   Feedback messages are identified by the RTCP packet type value
1564	   RTPFB (205).

1566	   In AVPF, one message of this category had been defined.  This memo
1567	   specifies two more such messages.  They are identified by means of
1568	   the FMT parameter as follows:

1570	   Assigned in AVPF [RFC4585]:

1572	      1:    Generic NACK
1573	      31:   reserved for future expansion of the identifier number
1574	            space

1576	   Assigned in this memo:

1578	      2:    reserved (see note below)
1579	      3:    Temporary Maximum Media Stream Bit Rate Request (TMMBR)
1580	      4:    Temporary Maximum Media Stream Bit Rate Notification (TMMBN)

1582	          Note: early drafts of AVPF [RFC4585] reserved FMT=2 for a
1583	          code point that has later been removed.  It has been
1584	          pointed out that there may be implementations in the field
1585	          using this value in accordance with the expired draft.  As
1586	          there is sufficient numbering space available, we mark
1587	          FMT=2 as reserved so as to avoid possible interoperability
1588	          problems with any such early implementations.

1590	   Available for assignment:

1592	      0:    unassigned
1593	      5-30: unassigned

1595	   The following subsections define the formats of the FCI entries
1596	   for the TMMBR and TMMBN messages respectively and specify the
1597	   associated behaviour at the media sender and receiver.

1599	4.2.1. Temporary Maximum Media Stream Bit Rate Request (TMMBR)

1601	   The FCI field of a Temporary Maximum Media Stream Bit-Rate Request
1602	   (TMMBR) message SHALL contain one or more FCI entries.

1604	4.2.1.1. Message Format

1606	   The Feedback Control Information (FCI) consists of one or more
1607	   TMMBR FCI entries with the following syntax:

1609	    0                   1                   2                   3
1610	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1611	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1612	   |                              SSRC                             |
1613	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1614	   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1615	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1617	    Figure 2 - Syntax of an FCI entry in the TMMBR message

1619	     SSRC (32 bits): The SSRC value of the media sender that is
1620	              requested to obey the new maximum bit rate.

1622	     MxTBR Exp (6 bits): The exponential scaling of the mantissa for
1623	              the maximum total media bit rate value.  The value is an
1624	              unsigned integer [0..63].

1626	     MxTBR Mantissa (17 bits): The mantissa of the maximum total
1627	              media bit rate value as an unsigned integer.

1629	     Measured Overhead (9 bits): The measured average packet overhead
1630	              value in bytes.  The measurement SHALL be done according
1631	              to the description in section 4.2.1.2.  The value is an
1632	              unsigned integer [0..511].

1634	   The maximum total media bit rate (MxTBR) value in bits per second
1635	   is calculated from the MxTBR exponent (exp) and mantissa in the
1636	   following way:

1638	      MxTBR = mantissa * 2^exp

1640	   This allows for 17 bits of resolution in the range 0 to
1641	   131071*2^63 (approximately 1.2*10^24).

1643	   The length of the TMMBR feedback message SHALL be set to 2+2*N
1644	   where N is the number of TMMBR FCI entries.
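
   The field layout of Figure 2 and the MxTBR formula can be
   illustrated with a minimal Python sketch.  The function names are
   hypothetical.  The sketch assumes network byte order, as used
   elsewhere in RTCP, and it assumes that an encoder rounds the bit
   rate down when it does not fit in the 17-bit mantissa, so that the
   encoded limit never exceeds the requested one.  This memo does not
   prescribe a rounding rule here.

```python
import struct


def encode_tmmbr_fci(ssrc, max_total_br, overhead_bytes):
    """Pack one TMMBR FCI entry: SSRC, 6-bit exp, 17-bit mantissa, 9-bit OH."""
    if not 0 <= overhead_bytes <= 0x1FF:
        raise ValueError("overhead does not fit in 9 bits")
    if max_total_br < 0:
        raise ValueError("bit rate must be non-negative")
    exp = 0
    mantissa = max_total_br
    while mantissa > 0x1FFFF:
        mantissa >>= 1  # assumption: round down, never up
        exp += 1
    if exp > 63:
        raise ValueError("bit rate does not fit in the exp/mantissa format")
    word = (exp << 26) | (mantissa << 9) | overhead_bytes
    return struct.pack("!II", ssrc, word)


def decode_tmmbr_fci(data):
    """Return (ssrc, MxTBR in bits per second, overhead in bytes)."""
    ssrc, word = struct.unpack("!II", data)
    exp = word >> 26
    mantissa = (word >> 9) & 0x1FFFF
    overhead_bytes = word & 0x1FF
    return ssrc, mantissa << exp, overhead_bytes  # MxTBR = mantissa * 2^exp


def tmmbr_length_field(num_entries):
    """Length field of the feedback message: 2 + 2*N."""
    return 2 + 2 * num_entries
```

   Each FCI entry occupies two 32-bit words, which is where the 2*N
   term of the length field comes from.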

1646	4.2.1.2. Semantics

1648	Behaviour at the Media Receiver (Sender of the TMMBR)

1650	   TMMBR is used to indicate a transport related limitation at the
1651	   reporting entity acting as a media receiver.  TMMBR has the form
1652	   of a tuple containing two components.  The first value is the
1653	   highest bit rate per sender of a media stream, available at a
1654	   receiver-chosen protocol layer, which the receiver currently
1655	   supports in this RTP session.  The second value is the measured
1656	   header overhead in bytes as defined in section 2.2 and measured at
1657	   the chosen protocol layer in the packets received for the stream.
1658	   The measurement of the overhead is a running average that is
1659	   updated for each packet received for this particular media source
1660	   (SSRC), using the following formula:

1662	       avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH,

1664	   where avg_OH is the running (exponentially smoothed) average and
1665	   pckt_OH is the overhead observed in the latest packet.
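
   The running average can be sketched in a few lines of Python.
   The names are illustrative.  The text above does not say how the
   average is initialized, so the sketch assumes that it is seeded
   with the overhead of the first packet received.

```python
def update_avg_overhead(avg_oh, pckt_oh):
    """avg_OH (new) = 15/16 * avg_OH (old) + 1/16 * pckt_OH."""
    return (15 * avg_oh + pckt_oh) / 16


def running_overhead(packet_overheads):
    """Fold the formula over per-packet overheads for one SSRC."""
    avg = None
    for pckt_oh in packet_overheads:
        if avg is None:
            avg = float(pckt_oh)  # assumption: seed with the first packet
        else:
            avg = update_avg_overhead(avg, pckt_oh)
    return avg
```

   The 1/16 weight means that a single packet with unusual overhead
   moves the reported value only slightly.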

1667	   If a maximum bit rate has been negotiated through signaling, the
1668	   maximum total media bit rate that the receiver reports in a TMMBR
1669	   message MUST NOT exceed the negotiated value converted to a common
1670	   basis (i.e. with overheads adjusted to bring it to the same
1671	   reference protocol layer).

1673	   Within the common packet header for feedback messages (as defined
1674	   in section 6.1 of [RFC4585]), the "SSRC of the packet sender"
1675	   field indicates the source of the request, and the "SSRC of media
1676	   source" is not used and SHALL be set to 0.  Within a particular
1677	   TMMBR FCI entry, the "SSRC of media sender" in the FCI field
1678	   denotes the media sender the tuple applies to.  This is useful in
1679	   the multicast or translator topologies where the reporting entity
1680	   may address all of the media senders in a single TMMBR message
1681	   using multiple FCI entries.

1683	   The media receiver SHALL save the contents of the latest TMMBN
1684	   message received from each media sender.

1686	   The media receiver MAY send a TMMBR FCI entry to a particular
1687	   media sender under the following circumstances:

1689	     o   before any TMMBN message has been received from that media
1690	          sender;

1692	     o   when the media receiver has been identified as the source of
1693	          a bounding tuple within the latest TMMBN message received
1694	          from that media sender, and the value of the maximum total
1695	          media bit rate or the overhead relating to that media sender
1696	          has changed;

1698	     o   when the media receiver has not been identified as the
1699	          source of a bounding tuple within the latest TMMBN message
1700	          received from that media sender, and, after the media
1701	          receiver applies the incremental algorithm from section
1702	          3.5.4.2 or a stricter equivalent, the media receiver's tuple
1703	          relating to that media sender is determined to belong to the
1704	          bounding set.

1706	   A TMMBR FCI entry MAY be repeated in subsequent TMMBR messages if
1707	   no Temporary Maximum Media Stream Bit-Rate Notification (TMMBN)
1708	   FCI has been received from the media sender at the time of
1709	   transmission of the next RTCP packet.  The bit rate value of a
1710	   TMMBR FCI entry MAY be changed from one TMMBR message to the next.
1711	   The overhead measurement SHALL be updated to the current value of
1712	   avg_OH each time the entry is sent.

1714	   If the value set by a TMMBR message is expected to be permanent,
1715	   the TMMBR setting party SHOULD renegotiate the session parameters
1716	   to reflect that using session setup signaling, e.g. a SIP re-
1717	   invite.

1719	Behaviour at the Media Sender (Receiver of the TMMBR)

1721	   When it receives a TMMBR message containing an FCI entry relating
1722	   to it, the media sender SHALL use an initial or incremental
1723	   algorithm as applicable to determine the bounding set of tuples
1724	   based on the new information.  The algorithm used SHALL be at
1725	   least as strict as the corresponding algorithm defined in section
1726	   3.5.4.2.  The media sender MAY accumulate TMMBR requests over a
1728	   small interval (relative to the RTCP sending interval) before
1729	   making this calculation.

1731	   Once it has determined the bounding set of tuples, the media
1732	   sender MAY use any combination of packet rate and net media bit
1733	   rate within the feasible region that these tuples describe to
1734	   produce a lower total media stream bit rate, as it may need to
1735	   address a congestion situation or other limiting factors.  See
1736	   section 5 (congestion control) for more discussion.

1738	   If the media sender concludes that it can increase the maximum
1739	   total media bit rate value, it SHALL wait before actually doing
1740	   so, for a period long enough to allow a media receiver to respond
1741	   to the TMMBN if it determines that its tuple belongs in the
1742	   bounding set.  This delay period is estimated by the formula:

1744	      2 * RTT + T_Dither_Max,

1746	   where RTT is the longest round trip time known to the media sender
1747	   and T_Dither_Max is defined in section 3.4 of [RFC4585].
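
   Informative sketch (not part of the specification): the waiting
   period before a bit rate increase can be computed as below.  The
   function name, the use of milliseconds, and the sample values
   are assumptions for illustration; T_Dither_Max itself is taken
   as an input, since its value is defined in [RFC4585].

```python
from typing import Iterable


def increase_delay_ms(known_rtts_ms: Iterable[int], t_dither_max_ms: int) -> int:
    """Return 2 * RTT + T_Dither_Max, with RTT the longest known RTT."""
    longest_rtt = max(known_rtts_ms)
    return 2 * longest_rtt + t_dither_max_ms


# Hypothetical round trip times to three receivers, in milliseconds.
delay = increase_delay_ms([50, 120, 80], t_dither_max_ms=100)
```

   Using the longest known RTT gives the most distant receiver time
   to see the TMMBN and answer with a TMMBR before the sender raises
   its rate.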

1749	   A TMMBN message SHALL be sent by the media sender at the earliest
1750	   possible point in time, in response to any TMMBR messages received
1751	   since the last sending of TMMBN.  The TMMBN message indicates the
1752	   calculated set of bounding tuples and the owners of those tuples
1753	   at the time of the transmission of the message.

1755	   An SSRC may time out according to the default rules for RTP
1756	   session participants, i.e. the media sender has not received any
1757	   RTP or RTCP packets from the owner for the last five regular
1758	   reporting intervals.  An SSRC may also explicitly leave the
1759	   session, with the participant indicating this through the
1760	   transmission of an RTCP BYE packet or using an external signaling
1761	   channel.  If the media sender determines that the owner of a tuple
1762	   in the bounding set has left the session, the media sender SHALL
1763	   transmit a new TMMBN containing the previously-determined set of
1764	   bounding tuples but with the tuple belonging to the departed owner
1765	   removed.

1767	   A media sender MAY proactively initiate the equivalent to a TMMBR
1768	   message to itself, when it is aware that its transmission path is
1769	   more restrictive than the current limitations.  As a result, a
1770	   TMMBN indicating the media source itself as the owner of a tuple
1771	   is being sent, thereby avoiding unnecessary TMMBR messages from
1772	   other participants. However, like any other participant, when the
1773	   media sender becomes aware of changed limitations, it is required
1774	   to change the tuple, and to send a corresponding TMMBN.

1776	Discussion

1778	   Due to the unreliable nature of transport of TMMBR and TMMBN, the
1779	   above rules may lead to the sending of TMMBR messages which appear
1780	   to disobey those rules.  Furthermore, in multicast scenarios it
1781	   can happen that more than one "non-owning" session participant may
1782	   determine, rightly or wrongly, that its tuple belongs in the
1783	   bounding set.  This is not critical for a number of reasons:

1785	   a) If a TMMBR message is lost in transmission, either the media
1786	      sender sends a new TMMBN message in response to some other
1787	      media receiver or it does not send a new TMMBN message at all.
1788	      In the first case, the media receiver applies the incremental
1789	      algorithm and, if it determines that its tuple should be part
1790	      of the bounding set, sends out another TMMBR.  In the second
1791	      case, it repeats the sending of a TMMBR unconditionally.
1792	      Either way, the media sender eventually gets the information it
1793	      needs.

1795	   b) Similarly, if a TMMBN message gets lost, the media receiver
1796	      that has sent the corresponding TMMBR request does not receive
1797	      the notification and is expected to re-send the request and
1798	      trigger the transmission of another TMMBN.

1800	   c) If multiple competing TMMBR messages are sent by different
1801	      session participants, then the algorithm can be applied taking
1802	      all of these messages into account, and the resulting TMMBN
1803	      provides the participants with an updated view of how their
1804	      tuples compare with the bounding set.

1806	   d) If more than one session participant happens to send TMMBR
1807	      messages at the same time and with the same tuple component
1808	      values, it does not matter which tuple, if either, is taken into
1809	      the bounding set.  The losing session participant will
1810	      determine after applying the algorithm that its tuple does not
1811	      enter the bounding set, and will therefore stop sending its
1812	      TMMBR request.

1814	   It is important to consider the security risks involved with faked
1815	   TMMBRs.  See the security considerations in Section 6.

1817	   As indicated already, the feedback messages may be used in both
1818	   multicast and unicast sessions in any of the specified topologies.
1819	   However, for sessions with a large number of participants, using
1820	   the lowest common denominator, as required by this mechanism, may
1821	   not be the most suitable course of action.  Large sessions may
1822	   need to consider other ways to adapt the bit rate to participants'
1823	   capabilities, such as partitioning the session into different
1824	   quality tiers, or using some other method of achieving bit rate
1825	   scalability.

1827	4.2.1.3. Timing Rules

1829	   The first transmission of the TMMBR request message MAY use early
1830	   or immediate feedback in cases when timeliness is desirable.  Any
1831	   repetition of a request message SHOULD use regular RTCP mode for
1832	   its transmission timing.

1834	4.2.1.4. Handling in Translators and Mixers

1836	   Media translators and mixers will need to receive and respond to
1837	   TMMBR messages as they are part of the chain that provides a
1838	   certain media stream to the receiver.  The mixer or translator may
1839	   act locally on the TMMBR request and thus generate a TMMBN to
1840	   indicate that it has done so.  Alternatively, in the case of a
1841	   media translator it can forward the request, or in the case of a
1842	   mixer generate one of its own and pass it forward.  In the latter
1843	   case, the mixer will need to send a TMMBN back to the original
1844	   requestor to indicate that it is handling the request.

1846	4.2.2. Temporary Maximum Media Stream Bit Rate Notification (TMMBN)

1848	   The FCI field of the TMMBN Feedback message may contain zero, one
1849	   or more TMMBN FCI entries.

1851	4.2.2.1. Message Format

1853	   The Feedback Control Information (FCI) consists of zero, one or
1854	   more TMMBN FCI entries with the following syntax:

1856	    0                   1                   2                   3
1857	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1858	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1859	   |                              SSRC                             |
1860	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1861	   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1862	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1864	    Figure 3 - Syntax of an FCI entry in the TMMBN message

1865	     SSRC (32 bits): The SSRC value of the "owner" of this tuple.

1867	     MxTBR Exp (6 bits): The exponential scaling of the mantissa for
1868	              the maximum total media bit rate value.  The value is an
1869	              unsigned integer [0..63].

1871	     MxTBR Mantissa (17 bits): The mantissa of the maximum total
1872	              media bit rate value as an unsigned integer.

1874	     Measured Overhead (9 bits): The measured average packet overhead
1875	              value in bytes represented as an unsigned integer.

1877	   Thus the FCI within the TMMBN message contains entries indicating
1878	   the bounding tuples.  For each tuple, the entry gives the owner by
1879	   the SSRC, followed by the applicable maximum total media bit rate
1880	   and overhead value.

1882	   The length of the TMMBN message SHALL be set to 2+2*N where N is
1883	   the number of TMMBN FCI entries.
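
   Informative sketch (not part of the specification): the FCI
   entry layout of Figure 3 and the length rule above can be
   illustrated as follows.  The function names are hypothetical.
   The value 2+2*N follows from the RTCP convention of counting
   32-bit words minus one: the common feedback header occupies three
   words and each FCI entry occupies two.

```python
import struct
from typing import Tuple


def pack_tmmbn_entry(ssrc: int, exp: int, mantissa: int, overhead: int) -> bytes:
    """Pack one FCI entry: SSRC, then 6-bit exp, 17-bit mantissa,
    9-bit measured overhead, all in network byte order."""
    if not 0 <= ssrc <= 0xFFFFFFFF:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= exp <= 63:
        raise ValueError("exp must fit in 6 bits")
    if not 0 <= mantissa <= 0x1FFFF:
        raise ValueError("mantissa must fit in 17 bits")
    if not 0 <= overhead <= 0x1FF:
        raise ValueError("overhead must fit in 9 bits")
    word = (exp << 26) | (mantissa << 9) | overhead
    return struct.pack("!II", ssrc, word)


def unpack_tmmbn_entry(data: bytes) -> Tuple[int, int, int, int]:
    """Return (ssrc, exp, mantissa, overhead) from an 8-byte entry."""
    ssrc, word = struct.unpack("!II", data)
    return ssrc, word >> 26, (word >> 9) & 0x1FFFF, word & 0x1FF


def feedback_length(n_entries: int) -> int:
    """Value of the RTCP length field for N FCI entries: 2 + 2*N."""
    return 2 + 2 * n_entries


entry = pack_tmmbn_entry(0x12345678, exp=3, mantissa=125000, overhead=40)
fields = unpack_tmmbn_entry(entry)
```

   A TMMBN without any FCI entry therefore carries a length of 2,
   and one with three bounding tuples carries a length of 8.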

1885	4.2.2.2. Semantics

1887	   This feedback message is used to notify the senders of any TMMBR
1888	   message that one or more TMMBR messages have been received or that
1889	   an owner has left the session.  It indicates to all participants
1890	   the current set of bounding tuples and the "owners" of those
1891	   tuples.

1893	   Within the common packet header for feedback messages (as defined
1894	   in section 6.1 of [RFC4585]), the "SSRC of the packet sender"
1895	   field indicates the source of the notification.  The "SSRC of
1896	   media source" is not used and SHALL be set to 0.

1898	   A TMMBN message SHALL be scheduled for transmission after the
1899	   reception of a TMMBR message with an FCI entry identifying this
1900	   media sender.  Only a single TMMBN SHALL be sent, even if more
1901	   than one TMMBR message is received between the scheduling of the
1902	   transmission and the actual transmission of the TMMBN message.
1903	   The TMMBN message indicates the bounding tuples and their owners
1904	   at the time of transmitting the message.  The bounding tuples
1905	   included SHALL be the set arrived at through application of the
1906	   applicable algorithm of section 3.5.4.2 or an equivalent, applied
1907	   to the previous bounding set if any and tuples received in TMMBR
1908	   messages since the last TMMBN was transmitted.

1910	   The reception of a TMMBR message SHALL still result in the
1911	   transmission of a TMMBN message even if, after application of the
1912	   algorithm, the newly reported TMMBR tuple is not accepted into the
1913	   bounding set.  In such a case the bounding tuples and their owners
1914	   are not changed, unless the TMMBR was from an owner of a tuple
1915	   within the previously calculated bounding set.  This procedure
1916	   allows session participants that did not see the last TMMBN
1917	   message to get a correct view of this media sender's state.

1919	   As indicated in section 4.2.1.2, when a media sender determines
1920	   that an "owner" of a bounding tuple has left the session, then
1921	   that tuple is removed from the bounding set, and the media sender
1922	   SHALL send a TMMBN message indicating the remaining bounding
1923	   tuples.  If there are no remaining bounding tuples a TMMBN without
1924	   any FCI SHALL be sent to indicate this.

1926	     Note: if any media receivers remain in the session, this last
1927	     will be a temporary situation.  The empty TMMBN will cause every
1928	     remaining media receiver to determine that its limitation
1929	     belongs in the bounding set and send a TMMBR in consequence.

1931	   In unicast scenarios (i.e. where a single sender talks to a single
1932	   receiver), the aforementioned algorithm to determine ownership
1933	   degenerates to the media receiver becoming the "owner" of the one
1934	   bounding tuple as soon as the media receiver has issued the first
1935	   TMMBR message.

1937	4.2.2.3. Timing Rules

1939	   The TMMBN acknowledgement SHOULD be sent as soon as allowed by the
1940	   applied timing rules for the session.  Immediate or early feedback
1941	   mode SHOULD be used for these messages.

1943	4.2.2.4. Handling by Translators and Mixers

1945	   As discussed in Section 4.2.1.4, mixers or translators may need to
1946	   issue TMMBN messages as responses to TMMBR messages for SSRCs
1947	   handled by them.

1949	4.3. Payload Specific Feedback Messages

1951	   As specified by section 6.1 of RFC 4585 [RFC4585], Payload-
1952	   Specific FB messages are identified by the RTCP packet type value
1953	   PT=PSFB (206).

1955	   AVPF [RFC4585] defines three payload-specific feedback messages
1956	   and one application layer feedback message.  This memo specifies
1957	   four additional payload-specific feedback messages.  All are
1958	   identified by means of the FMT parameter as follows:

1960	   Assigned in [RFC4585]:

1962	     1:     Picture Loss Indication (PLI)
1963	     2:     Slice Loss Indication (SLI)
1964	     3:     Reference Picture Selection Indication (RPSI)
1965	     15:    Application layer FB message
1966	     31:    reserved for future expansion of the number space

1968	   Assigned in this memo:

1970	     4:     Full Intra Request Command (FIR)
1971	     5:     Temporal-Spatial Trade-off Request (TSTR)
1972	     6:     Temporal-Spatial Trade-off Notification (TSTN)
1973	     7:     Video Back Channel Message (VBCM)

1975	   Unassigned:

1977	     0:     unassigned
1978	     8-14:  unassigned
1979	     16-30: unassigned
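
   Informative sketch (not part of the specification): a receiver
   might dispatch on the FMT value using a table such as the one
   below.  The identifier names (PsfbFmt, classify_fmt, AFB) are
   hypothetical; only the numeric assignments come from the lists
   above.

```python
from enum import IntEnum

RTCP_PT_PSFB = 206  # RTCP packet type for payload-specific feedback


class PsfbFmt(IntEnum):
    PLI = 1    # Picture Loss Indication          [RFC4585]
    SLI = 2    # Slice Loss Indication            [RFC4585]
    RPSI = 3   # Reference Picture Selection Ind. [RFC4585]
    FIR = 4    # Full Intra Request               (this memo)
    TSTR = 5   # Temporal-Spatial Trade-off Req.  (this memo)
    TSTN = 6   # Temporal-Spatial Trade-off Not.  (this memo)
    VBCM = 7   # Video Back Channel Message       (this memo)
    AFB = 15   # Application layer FB message     [RFC4585]


def classify_fmt(fmt: int) -> str:
    """Return the message name, 'reserved', or 'unassigned'."""
    if not 0 <= fmt <= 31:
        raise ValueError("FMT is a 5-bit field")
    if fmt == 31:
        return "reserved"
    try:
        return PsfbFmt(fmt).name
    except ValueError:
        return "unassigned"
```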

1981	   The following subsections define the new FCI formats for the
1982	   payload-specific feedback messages.

1984	4.3.1. Full Intra Request (FIR)

1986	   The FIR message is identified by RTCP packet type value PT=PSFB
1987	   and FMT=4.

1989	   The FCI field MUST contain one or more FIR entries.  Each entry
1990	   applies to a different media sender, identified by its SSRC.

1992	4.3.1.1. Message Format

1994	   The Feedback Control Information (FCI) for the Full Intra Request
1995	   consists of one or more FCI entries, the content of which is
1996	   depicted in Figure 4.  The length of the FIR feedback message MUST
1997	   be set to 2+2*N, where N is the number of FCI entries.

1999	    0                   1                   2                   3
2000	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2001	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2002	   |                              SSRC                             |
2003	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2004	   | Seq. nr       |    Reserved                                   |
2005	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2007	    Figure 4 - Syntax of an FCI entry in the FIR message

2009	     SSRC (32 bits): The SSRC value of the media sender which is
2010	              requested to send a decoder refresh point.

2012	     Seq. nr (8 bits): Command sequence number.  The sequence number
2013	              space is unique for each pairing of the SSRC of command
2014	              source and the SSRC of the command target.  The sequence
2015	              number SHALL be increased by 1 modulo 256 for each new
2016	              command.  A repetition SHALL NOT increase the sequence
2017	              number.  The initial value is arbitrary.

2019	     Reserved (24 bits): All bits SHALL be set to 0 by the sender and
2020	              SHALL be ignored on reception.

2022	   The semantics of this feedback message is independent of the RTP
2023	   payload type.
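
   Informative sketch (not part of the specification): the FIR FCI
   entry of Figure 4 and the sequence number rules can be
   illustrated as follows.  The class and function names are
   hypothetical.  One FirSequencer instance is assumed per command
   source SSRC, so that keying by target SSRC yields one sequence
   number space per (source, target) pairing.

```python
import struct
from typing import Dict


class FirSequencer:
    """Tracks FIR command sequence numbers per target SSRC."""

    def __init__(self, initial: int = 0) -> None:
        # The initial value is arbitrary; it is reduced to 8 bits.
        self._initial = initial & 0xFF
        self._last: Dict[int, int] = {}

    def new_command(self, target_ssrc: int) -> int:
        """A new command increases the number by 1 modulo 256."""
        if target_ssrc in self._last:
            seq = (self._last[target_ssrc] + 1) % 256
        else:
            seq = self._initial
        self._last[target_ssrc] = seq
        return seq

    def repetition(self, target_ssrc: int) -> int:
        """A repetition reuses the number of the last command."""
        return self._last[target_ssrc]


def pack_fir_entry(target_ssrc: int, seq_nr: int) -> bytes:
    """SSRC (32 bits), Seq. nr (8 bits), Reserved (24 zero bits)."""
    return struct.pack("!IB3x", target_ssrc, seq_nr)


sequencer = FirSequencer(initial=255)
first = sequencer.new_command(0xAABBCCDD)    # uses the initial value
second = sequencer.new_command(0xAABBCCDD)   # wraps from 255 to 0
repeat = sequencer.repetition(0xAABBCCDD)    # unchanged on repetition
entry = pack_fir_entry(0xAABBCCDD, second)
```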

2025	4.3.1.2. Semantics

2027	   Upon reception of FIR, the encoder MUST send a decoder refresh
2028	   point (see section 2.2) as soon as possible.

2030	     Note: Currently, video appears to be the only useful application
2031	     for FIR, as it appears to be the only RTP payload widely
2032	     deployed that relies heavily on media prediction across RTP
2033	     packet boundaries.  However, use of FIR could also reasonably be
2034	     envisioned for other media types that share essential properties
2035	     with compressed video, namely cross-frame prediction (whatever a
2036	     frame may be for that media type).  One possible example may be
2037	     the dynamic updates of MPEG-4 scene descriptions.  It is
2038	     suggested that payload formats for such media types refer to FIR
2039	     and other message types defined in this specification and in
2040	     AVPF [RFC4585], instead of creating similar mechanisms in the
2041	     payload specifications.  The payload specifications may have to
2042	     explain how the payload-specific terminologies map to the video-
2043	     centric terminology used herein.

2045	     Note: In environments where the sender has no control over the
2046	     codec (e.g. when streaming pre-recorded and pre-coded content),
2047	     the reaction to this command cannot be specified.  One suitable
2048	     reaction of a sender would be to skip forward in the video bit
2049	     stream to the next decoder refresh point.  In other scenarios,
2050	     it may be preferable not to react to the command at all, e.g.
2051	     when streaming to a large multicast group.  Other reactions may
2052	     also be possible.  When deciding on a strategy, a sender could
2053	     take into account factors such as the size of the receiving
2054	     group, the "importance" of the sender of the FIR message
2055	     (however "importance" may be defined in this specific
2056	     application), the frequency of decoder refresh points in the
2057	     content, and so on.  However, a session which predominantly
2058	     handles pre-coded content is not expected to use FIR at all.

2060	   The sender MUST consider congestion control as outlined in section
2061	   5, which MAY restrict its ability to send a decoder refresh point
2062	   quickly.

2064	     Note: The relationship between the Picture Loss Indication and
2065	     FIR is as follows.  As discussed in section 6.3.1 of AVPF
2066	     [RFC4585], a Picture Loss Indication informs the encoder about
2067	     the loss of a picture and hence the likelihood of misalignment
2068	     of the reference pictures between the encoder and decoder.  Such
2069	     a scenario is normally related to losses in an ongoing
2070	     connection.  In point-to-point scenarios, and without the
2071	     presence of advanced error resilience tools, one possible option
2072	     for an encoder consists in sending a decoder refresh point.
2073	     However, there are other options.  One example is that the media
2074	     sender ignores the PLI, because the embedded stream redundancy
2075	     is likely to clean up the reproduced picture within a reasonable
2076	     amount of time.  The FIR, in contrast, leaves a (real-time)
2077	     encoder no choice but to send a decoder refresh point.  It does
2078	     not allow the encoder to take into account any considerations
2079	     such as the ones mentioned above.

2081	     Note: Mandating a maximum delay for completing the sending of a
2082	     decoder refresh point would be desirable from an application
2083	     viewpoint, but is problematic from a congestion control point of
2084	     view.  "As soon as possible" as mentioned above appears to be a
2085	     reasonable compromise.

2087	   FIR SHALL NOT be sent as a reaction to picture losses -- it is
2088	   RECOMMENDED to use PLI instead.  FIR SHOULD be used only in
2089	   situations where not sending a decoder refresh point would render
2090	   the video unusable for the users.

2092	     Note: A typical example where sending FIR is appropriate is
2093	     when, in a multipoint conference, a new user joins the session
2094	     and no regular decoder refresh point interval is established.
2095	     Another example would be a video switching MCU that changes
2096	     streams.  Here, normally, the MCU issues a FIR to the new sender
2097	     so as to force it to emit a decoder refresh point.  The decoder
2098	     refresh point normally includes a Freeze Picture Release
2099	     (defined outside this specification), which re-starts the
2100	     rendering process of the receivers.  Both techniques mentioned
2101	     are commonly used in MCU-based multipoint conferences.

2103	   Other RTP payload specifications such as RFC 2032 [RFC2032]
2104	   already define a feedback mechanism for certain codecs.  An
2105	   application supporting both schemes MUST use the feedback
2106	   mechanism defined in this specification when sending feedback.
2107	   For backward compatibility reasons, such an application SHOULD
2108	   also be capable of receiving and reacting to the feedback scheme
2109	   defined in the respective RTP payload format, if this is required
2110	   by that payload format.

2112	   Within the common packet header for feedback messages (as defined
2113	   in section 6.1 of [RFC4585]), the "SSRC of the packet sender"
2114	   field indicates the source of the request, and the "SSRC of media
2115	   source" is not used and SHALL be set to 0.  The SSRCs of the media
2116	   senders to which the FIR command applies are in the corresponding
2117	   FCI entries.  A FIR message MAY contain requests to multiple
2118	   media senders, using one FCI entry per target media sender.

2120	4.3.1.3. Timing Rules

2122	   The timing follows the rules outlined in section 3 of [RFC4585].
2123	   FIR commands MAY be used with early or immediate feedback.  The
2124	   FIR feedback message MAY be repeated.  If using immediate feedback
2125	   mode the repetition SHOULD wait at least one RTT before being
2126	   sent.  In early or regular RTCP mode the repetition is sent in the
2127	   next regular RTCP packet.

2129	4.3.1.4. Handling of FIR Message in Mixers and Translators

2131	   A media translator or a mixer performing media encoding of the
2132	   content for which the session participant has issued a FIR is
2133	   responsible for acting upon it.  A mixer acting upon a FIR SHOULD
2134	   NOT forward the message unaltered; instead it SHOULD issue a FIR
2135	   itself.

2137	4.3.1.5. Remarks

2139	   In conjunction with video codecs, FIR messages typically trigger
2140	   the sending of full intra or IDR pictures.  Both are several times
2141	   larger than predicted (inter) pictures.  Their size is independent
2142	   of the time they are generated.  In most environments, especially
2143	   when employing bandwidth-limited links, the use of an intra
2144	   picture implies an allowed delay that is a significant multiple of
2145	   the typical frame duration.  An example: if the sending frame rate
2146	   is 10 fps, and an intra picture is assumed to be 10 times as big
2147	   as an inter picture, then a full second of latency has to be
2148	   accepted.  In such an environment there is no need for a
2149	   particularly short delay in sending the FIR message.  Hence
2150	   waiting for the next possible time slot allowed by RTCP timing
2151	   rules as per [RFC4585] should not have an overly negative impact
2152	   on the system performance.
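
   Informative sketch (not part of the specification): the latency
   figure in the example above can be reproduced with the short
   calculation below.  It assumes, for illustration only, that the
   link rate is just sufficient to carry one inter picture per
   frame interval, so that an intra picture occupies the link for
   as many frame intervals as its size ratio.

```python
def intra_picture_latency_s(frame_rate_fps: float, intra_to_inter_ratio: float) -> float:
    """Seconds needed to send one intra picture.

    Assumption: the link carries exactly one inter picture per frame
    interval, so an intra picture takes `ratio` frame intervals.
    """
    frame_interval_s = 1.0 / frame_rate_fps
    return intra_to_inter_ratio * frame_interval_s


latency = intra_picture_latency_s(frame_rate_fps=10, intra_to_inter_ratio=10)
```

   For 10 fps and a size ratio of 10 this gives one second, which is
   large compared with the additional delay introduced by waiting
   for the next RTCP transmission opportunity.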

2154	4.3.2. Temporal-Spatial Trade-off Request (TSTR)

2156	   The TSTR feedback message is identified by RTCP packet type value
2157	   PT=PSFB and FMT=5.

2159	   The FCI field MUST contain one or more TSTR FCI entries.

2161	4.3.2.1. Message Format

2163	   The content of the FCI entry for the Temporal-Spatial Trade-off
2164	   Request is depicted in Figure 5.  The length of the feedback
2165	   message MUST be set to 2+2*N, where N is the number of FCI entries
2166	   included.

2168	    0                   1                   2                   3
2169	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2170	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2171	   |                              SSRC                             |
2172	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2173	   |  Seq nr.      |  Reserved                           | Index   |
2174	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2176	    Figure 5 - Syntax of an FCI Entry in the TSTR Message

2177	     SSRC (32 bits): The SSRC of the media sender which is requested
2178	              to apply the tradeoff value given in Index.

2180	     Seq. nr (8 bits): Request sequence number.  The sequence number
2181	              space is unique for each pairing of the SSRC of request
2182	              source and the SSRC of the request target.  The sequence
2183	              number SHALL be increased by 1 modulo 256 for each new
2184	              command.  A repetition SHALL NOT increase the sequence
2185	              number.  The initial value is arbitrary.

2187	     Reserved (19 bits): All bits SHALL be set to 0 by the sender and
2188	              SHALL be ignored on reception.

2190	     Index (5 bits): An integer value between 0 and 31 that indicates
2191	              the relative trade off that is requested.  An index
2192	              value of 0 indicates the highest possible spatial quality,
2193	              while 31 indicates the highest possible temporal resolution.
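
   Informative sketch (not part of the specification): the FCI
   entry of Figure 5 can be packed and parsed as shown below.  The
   function names are hypothetical.  The parser masks out the
   Reserved bits, in line with the rule that they are ignored on
   reception.

```python
import struct
from typing import Tuple


def pack_tstr_entry(target_ssrc: int, seq_nr: int, index: int) -> bytes:
    """SSRC (32), Seq nr. (8), Reserved (19 zero bits), Index (5)."""
    if not 0 <= seq_nr <= 255:
        raise ValueError("sequence number must fit in 8 bits")
    if not 0 <= index <= 31:
        raise ValueError("index must be between 0 and 31")
    word = (seq_nr << 24) | index
    return struct.pack("!II", target_ssrc, word)


def unpack_tstr_entry(data: bytes) -> Tuple[int, int, int]:
    """Return (ssrc, seq_nr, index); Reserved bits are ignored."""
    ssrc, word = struct.unpack("!II", data)
    return ssrc, word >> 24, word & 0x1F


# Index 31 asks for the highest possible temporal resolution.
entry = pack_tstr_entry(0x00000001, seq_nr=2, index=31)
parsed = unpack_tstr_entry(entry)
```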

2195	4.3.2.2. Semantics

2197	   A decoder can suggest a temporal-spatial trade-off level by
2198	   sending a TSTR message to an encoder.  If the encoder is capable
2199	   of adjusting its temporal-spatial trade-off, it SHOULD take into
2200	   account the received TSTR message for future coding of pictures.
2201	   A value of 0 suggests a high spatial quality and a value of 31
2202	   suggests a high frame rate.  The progression of values from 0 to
2203	   31 indicates monotonically a desire for higher frame rate.  The
2204	   index values do not correspond to precise values of spatial
2205	   quality or frame rate.

2207	   The reaction to the reception of more than one TSTR message by a
2208	   media sender from different media receivers is left open to the
2209	   implementation.  The selected trade-off SHALL be communicated to
2210	   the media receivers by the means of the TSTN message.

2212	   Within the common packet header for feedback messages (as defined
2213	   in section 6.1 of [RFC4585]), the "SSRC of the packet sender"
2214	   field indicates the source of the request, and the "SSRC of media
2215	   source" is not used and SHALL be set to 0.  The SSRCs of the media
2216	   senders to which the TSTR applies are in the corresponding FCI
2217	   entries.

2219	   A TSTR message MAY contain requests to multiple media senders,
2220	   using one FCI entry per target media sender.

2222	4.3.2.3. Timing Rules

2224	   The timing follows the rules outlined in section 3 of [RFC4585].
2225	   This request message is not time critical and SHOULD be sent using
2226	   regular RTCP timing.  Only if it is known that the user interface
2227	   requires a quick feedback, the message MAY be sent with early or
2228	   immediate feedback timing.

2230	4.3.2.4. Handling of message in Mixers and Translators

2232	   A mixer or media translator that encodes content sent to the
2233	   session participant issuing the TSTR SHALL consider the request to
2234	   determine if it can fulfill it by changing its own encoding
2235	   parameters.  A media translator unable to fulfill the request MAY
2236	   forward the request unaltered towards the media sender.  A mixer
2237	   encoding for multiple session participants will need to consider
2238	   the joint needs of these participants before generating a TSTR on
2239	   its own behalf towards the media sender.  See also the discussion
2240	   in Section 3.5.2.

2243	4.3.2.5. Remarks

2245	   The term "spatial quality" does not necessarily refer to the
2246	   resolution, measured by the number of pixels the reconstructed
2247	   video is using.  In fact, in most scenarios the video resolution
2248	   stays constant during the lifetime of a session.  However, all
2249	   video compression standards have means to adjust the spatial
2250	   quality at a given resolution, often influenced by the Quantizer
2251	   Parameter or QP.  A numerically low QP results in a good
2252	   reconstructed picture quality, whereas a numerically high QP
2253	   yields a coarse picture.  The typical reaction of an encoder to
2254	   this request is to change its rate control parameters to use a
2255	   lower frame rate and a numerically lower (on average) QP, or vice
2256	   versa.  The precise mapping of Index value to frame rate and QP is
2257	   intentionally left open here, as it depends on factors such as the
2258	   compression standard employed, spatial resolution, content, bit
2259	   rate, and so on.

2261	4.3.3. Temporal-Spatial Trade-off Notification (TSTN)

2263	   The TSTN message is identified by RTCP packet type value PT=PSFB
2264	   and FMT=6.

2266	   The FCI field SHALL contain one or more TSTN FCI entries.

2268	4.3.3.1. Message Format

2270	   The content of an FCI entry for the Temporal-Spatial Trade-off
2271	   Notification is depicted in Figure 6.  The length of the TSTN
2272	   message MUST be set to 2+2*N, where N is the number of FCI
2273	   entries.

2275	    0                   1                   2                   3
2276	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2277	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2278	   |                              SSRC                             |
2279	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2280	   |  Seq nr.      |  Reserved                           | Index   |
2281	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2283	    Figure 6 - Syntax of the TSTN

2285	     SSRC (32 bits): The SSRC of the source of the TSTR request which
2286	              resulted in this Notification.

2288	     Seq. nr (8 bits): The sequence number value from the TSTR
2289	              that is being acknowledged.

2291	     Reserved (19 bits): All bits SHALL be set to 0 by the sender and
2292	              SHALL be ignored on reception.

2294	     Index (5 bits): The trade-off value the media sender is using
2295	              henceforth.

2297	      Informative note: The returned trade-off value (Index) may
2298	      differ from the requested one, for example in cases where a
2299	      media encoder cannot tune its trade-off, or when pre-recorded
2300	      content is used.
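
   The FCI entry layout of Figure 6 can be illustrated with a short,
   non-normative Python sketch.  The function names pack_tstn_fci and
   unpack_tstn_fci are assumptions made for this example only; they
   are not defined by this memo.

```python
import struct


def pack_tstn_fci(ssrc, seq_nr, index):
    """Pack one TSTN FCI entry: SSRC (32), Seq. nr (8), Reserved (19), Index (5)."""
    if not 0 <= ssrc <= 0xFFFFFFFF:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= seq_nr <= 0xFF:
        raise ValueError("sequence number must fit in 8 bits")
    if not 0 <= index <= 0x1F:
        raise ValueError("index must fit in 5 bits")
    # Second 32-bit word: Seq. nr in the top 8 bits, 19 reserved bits
    # left at 0, Index in the low 5 bits.
    second_word = (seq_nr << 24) | index
    return struct.pack("!II", ssrc, second_word)


def unpack_tstn_fci(entry):
    """Return (ssrc, seq_nr, index); reserved bits are ignored on reception."""
    ssrc, second_word = struct.unpack("!II", entry)
    return ssrc, second_word >> 24, second_word & 0x1F
```

   Masking with 0x1F on reception reflects the rule that the Reserved
   bits are ignored by the receiver.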

2302	4.3.3.2. Semantics

2304	   This feedback message is used to acknowledge the reception of a
2305	   TSTR.  One TSTN entry in a TSTN feedback message SHALL be sent for
2306	   each TSTR entry targeted to this session participant, i.e., each
2307	   received TSTR entry whose SSRC field contains the receiving
2308	   entity's SSRC.  A single TSTN message MAY acknowledge
2309	   multiple requests using multiple FCI entries.  The index value
2310	   included SHALL be the same in all FCI entries of the TSTN message.
2311	   Including an FCI for each requestor allows each requesting entity
2312	   to determine that the media sender received the request.  The
2313	   Notification SHALL also be sent in response to TSTR repetitions
2314	   received.  If the request receiver has received TSTRs with several
2315	   different sequence numbers from a single requestor, it SHALL only
2316	   respond to the request with the highest (modulo 256) sequence
2317	   number.
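
   Selecting the request with the highest (modulo 256) sequence number
   could be sketched as below.  This memo does not spell out how
   wrap-around is resolved; the half-window comparison used here (a
   difference of 1..127 counts as "later") is an assumption borrowed
   from common serial-number arithmetic, and both function names are
   hypothetical.

```python
def is_later(seq_a, seq_b):
    """True if seq_a is later than seq_b in 8-bit wrap-around order.

    Assumption: differences of 1..127 (modulo 256) count as later.
    """
    diff = (seq_a - seq_b) % 256
    return 0 < diff < 128


def request_to_acknowledge(requests):
    """Given (seq_nr, index) pairs from one requestor, return the latest one."""
    latest = None
    for request in requests:
        if latest is None or is_later(request[0], latest[0]):
            latest = request
    return latest
```

   With this rule, sequence number 1 is treated as later than 255, so
   a requestor whose counter has wrapped is still handled correctly.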

2319	   The TSTN SHALL include the Temporal-Spatial Trade-off index that
2320	   will be used as a result of the request.  This is not necessarily
2321	   the same index as requested, as the media sender may need to
2322	   aggregate requests from several requesting session participants.
2323	   It may also have some other policies or rules that limit the
2324	   selection.

2326	   Within the common packet header for feedback messages (as defined
2327	   in section 6.1 of [RFC4585]), the "SSRC of the packet sender"
2328	   field indicates the source of the Notification, and the "SSRC of
2329	   media source" is not used and SHALL be set to 0.  The SSRCs of the
2330	   requesting entities to which the Notification applies are in the
2331	   corresponding FCI entries.
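
   As a non-normative illustration, a complete TSTN message with its
   common feedback header might be assembled as follows.  The sketch
   assumes version 2, no padding, and the PSFB packet type value 206
   from [RFC4585]; the function name build_tstn is hypothetical.

```python
import struct

PT_PSFB = 206   # payload-specific feedback packet type, per [RFC4585]
FMT_TSTN = 6    # FMT value for TSTN in this memo


def build_tstn(sender_ssrc, index, acks):
    """Build a TSTN message.

    acks is a list of (requestor_ssrc, seq_nr) pairs; the same index
    value is placed in every FCI entry.
    """
    if not acks:
        raise ValueError("a TSTN carries one or more FCI entries")
    first_byte = (2 << 6) | FMT_TSTN      # V=2, P=0, FMT=6
    length = 2 + 2 * len(acks)            # in 32-bit words minus one
    # "SSRC of media source" is not used and is set to 0.
    packet = struct.pack("!BBHII", first_byte, PT_PSFB, length, sender_ssrc, 0)
    for requestor_ssrc, seq_nr in acks:
        packet += struct.pack("!II", requestor_ssrc, (seq_nr << 24) | index)
    return packet
```

   The length field follows the 2+2*N rule given for the message
   format, so two FCI entries yield a length of 6 and a 28-octet
   packet.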

2333	4.3.3.3. Timing Rules

2335	   The timing follows the rules outlined in section 3 of [RFC4585].
2336	   This acknowledgement message is not extremely time critical and
2337	   SHOULD be sent using regular RTCP timing.

2339	4.3.3.4. Handling of TSTN in Mixers and Translators

2341	   A mixer or translator that acts upon a TSTR SHALL also send the
2342	   corresponding TSTN.  In cases where it needs to forward a TSTR
2343	   itself, the notification message MAY need to be delayed until the
2344	   TSTR has been responded to.

2346	4.3.3.5. Remarks

2348	   None

2350	4.3.4. H.271 Video Back Channel Message (VBCM)

2352	   The VBCM is identified by RTCP packet type value PT=PSFB and
2353	   FMT=7.

2355	   The FCI field MUST contain one or more VBCM FCI entries.

2357	4.3.4.1. Message Format

2359	   The syntax of an FCI entry within the VBCM indication is depicted
2360	   in Figure 7.

2362	    0                   1                   2                   3
2363	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2364	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2365	   |                              SSRC                             |
2366	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2367	   | Seq. nr       |0| Payload Type| Length                        |
2368	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2369	   |                    VBCM Octet String....      |    Padding    |
2370	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2372	   Figure 7 - Syntax of an FCI Entry in the VBCM Message

2374	   SSRC (32 bits): The SSRC value of the media sender that is
2375	          requested to instruct its encoder to react to the VBCM
2376	          message.

2378	   Seq. nr (8 bits): Command sequence number.  The sequence number
2379	          space is unique for each pairing of the SSRC of the command
2380	          source and the SSRC of the command target.  The sequence
2381	          number SHALL be increased by 1 modulo 256 for each new
2382	          command.  A repetition SHALL NOT increase the sequence
2383	          number.  The initial value is arbitrary.

2385	   0: Must be set to 0 by the sender and should not be acted upon by
2386	          the message receiver.

2388	   Payload Type (7 bits): The RTP payload type for which the VBCM bit
2389	          stream must be interpreted.

2391	   Length (16 bits): The length of the VBCM octet string in octets,
2392	          exclusive of any padding octets.

2394	   VBCM Octet String (Variable length): This is the octet string
2395	          generated by the decoder carrying a specific feedback sub-
2396	          message.

2398	   Padding (Variable length): Bits set to 0 to make up a 32 bit
2399	          boundary.
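
   The layout of Figure 7, including the zero padding up to the next
   32-bit boundary, can be sketched in Python as follows.  This is an
   illustration only; the function name pack_vbcm_fci is an assumption
   of this example, and the octet string is treated as opaque.

```python
import struct


def pack_vbcm_fci(ssrc, seq_nr, rtp_payload_type, octet_string):
    """Pack one VBCM FCI entry and pad it to a 32-bit boundary."""
    if not 0 <= ssrc <= 0xFFFFFFFF:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= seq_nr <= 0xFF:
        raise ValueError("sequence number must fit in 8 bits")
    if not 0 <= rtp_payload_type <= 0x7F:
        raise ValueError("RTP payload type must fit in 7 bits")
    if len(octet_string) > 0xFFFF:
        raise ValueError("octet string too long for the 16-bit Length field")
    # The bit in front of Payload Type is 0 because the value is at most 0x7F.
    # Length counts the octet string only, not the padding.
    header = struct.pack("!IBBH", ssrc, seq_nr, rtp_payload_type,
                         len(octet_string))
    padding = b"\x00" * (-len(octet_string) % 4)
    return header + octet_string + padding
```

   A 5-octet string, for example, is followed by 3 padding octets,
   while the Length field still reports 5.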

2401	4.3.4.2. Semantics

2403	   The "payload" of the VBCM indication carries different types of
2404	   codec-specific feedback information.  The type of feedback
2405	   information can be classified as a 'status report' (such as an
2406	   indication that a bit stream was received without errors, or that
2407	   a partial or complete picture or block was lost) or 'update
2408	   requests' (such as complete refresh of the bit stream).

2410	          Note: There are possible overlaps between the VBCM sub-
2411	          messages and CCM/AVPF feedback messages, such as FIR.  Please
2412	          see section 3.5.3 for further discussion.

2414	   The different types of feedback sub-messages carried in the VBCM
2415	   are indicated by the "payloadType" as defined in [VBCM].  These
2416	   sub-message types are reproduced below for convenience.
2417	   "payloadType", in ITU-T Rec. H.271 terminology, refers to the sub-
2418	   type of the H.271 message and should not be confused with an RTP
2419	   payload type.

2421	   Payload          Message Content
2422	   Type
2423	   ---------------------------------------------------------------------
2424	   0      One or more pictures without detected bit stream error
2425	          mismatch
2426	   1      One or more pictures that are entirely or partially lost
2427	   2      A set of blocks of one picture that is entirely or partially
2428	          lost
2429	   3      CRC for one parameter set
2430	   4      CRC for all parameter sets of a certain type
2431	   5      A "reset" request indicating that the sender should completely
2432	          refresh the video bit stream as if no prior bit stream data
2433	          had been received
2434	   > 5    Reserved for future use by ITU-T

2436	   Table 2: H.271 message types ("payloadTypes")

2438	   The bit string or the "payload" of a VBCM message is of variable
2439	   length and is self-contained and coded in a variable length,
2440	   binary format.  The media sender necessarily has to be able to
2441	   parse this optimized binary format to make use of VBCM messages.

2443	   Each of the different types of sub-messages (indicated by
2444	   payloadType) may have different semantics depending on the codec
2445	   used.

2447	   Within the common packet header for feedback messages (as defined
2448	   in section 6.1 of [RFC4585]), the "SSRC of the packet sender"
2449	   field indicates the source of the request, and the "SSRC of media
2450	   source" is not used and SHALL be set to 0.  The SSRCs of the media
2451	   senders to which the VBCM message applies are in the
2452	   corresponding FCI entries.  The sender of the VBCM message MAY
2453	   send H.271 messages to multiple media senders and MAY send more
2454	   than one H.271 message to the same media sender within the same
2455	   VBCM message.

2457	4.3.4.3. Timing Rules

2459	   The timing follows the rules outlined in section 3 of [RFC4585].
2460	   The different sub-message types may have different properties with
2461	   regard to the timing of messages that should be used.  If several
2462	   different types are included in the same feedback packet then the
2463	   requirements for the sub-message type with the most stringent
2464	   requirements should be followed.

2466	4.3.4.4. Handling of Message in Mixers or Translators

2468	   The handling of VBCM in a mixer or translator is sub-message type
2469	   dependent.

2471	4.3.4.5. Remarks

2473	   Please see section 3.5.3 for a discussion of the usage of H.271
2475	   messages and messages defined in AVPF [RFC4585] and this memo with
2476	   similar functionality.

2478	     Note: There has been some discussion whether the payload type
2479	     field in this message is needed.  It will be needed if there is
2480	     potentially more than one VBCM-capable RTP payload type in the
2481	     same session, and the semantics of a given VBCM message changes
2482	     between payload types.  For example, the picture identification
2483	     mechanism in messages of H.271 type 0 is fundamentally different
2484	     between H.263 and H.264 (although both use the same syntax).
2485	     Therefore, the payload type field is justified here.  There was
2486	     a further comment that for TSTR and FIR such a need does not
2487	     exist, because the semantics of TSTR and FIR are either loosely
2488	     enough defined, or generic enough, to apply to all video
2489	     payloads currently in existence/envisioned.

2491	5. Congestion Control

2493	   The correct application of the AVPF [RFC4585] timing rules
2494	   prevents the network from being flooded by feedback messages.
2495	   Hence, assuming a correct implementation and configuration, the
2496	   RTCP channel cannot break its bit rate commitment and introduce
2497	   congestion.

2499	   The reception of some of the feedback messages modifies the
2500	   behaviour of the media senders or, more specifically, the media
2501	   encoders.  Thus modified behaviour MUST respect the bandwidth
2502	   limits that the application of congestion control provides.  For
2503	   example, when a media sender is reacting to a FIR, the unusually
2504	   high number of packets that form the decoder refresh point have to
2505	   be paced in compliance with the congestion control algorithm, even
2506	   if the user experience suffers from a slowly transmitted decoder
2507	   refresh point.
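
   As a rough illustration of such pacing, the sketch below spreads
   the packets of a decoder refresh point over time instead of sending
   them as a burst.  It assumes a fixed rate limit that some
   congestion control algorithm has already provided; the function
   name paced_send_times is hypothetical, and real algorithms are
   considerably more involved.

```python
def paced_send_times(packet_sizes, rate_limit_bps, start=0.0):
    """Return the earliest send time in seconds for each packet.

    packet_sizes are in octets; rate_limit_bps is the permitted bit
    rate in bits per second (assumed to be given, not computed here).
    """
    if rate_limit_bps <= 0:
        raise ValueError("rate limit must be positive")
    times = []
    next_time = start
    for size in packet_sizes:
        times.append(next_time)
        # The following packet may leave only after this one has been
        # accounted for at the permitted rate.
        next_time += size * 8 / rate_limit_bps
    return times
```

   A lower rate limit stretches the schedule, which is the slowly
   transmitted decoder refresh point referred to above.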

2509	   A change of the Temporary Maximum Media Stream Bit Rate value can
2510	   only mitigate congestion, but not cause congestion as long as
2511	   congestion control is also employed.  An increase of the value by
2512	   a request REQUIRES the media sender to use congestion control when
2513	   increasing its transmission rate to that value.  A reduction of
2514	   the value results in a reduced transmission bit rate thus reducing
2515	   the risk for congestion.

2517	6. Security Considerations

2519	   The defined messages have certain properties that have security
2520	   implications.  These must be addressed and taken into account by
2521	   users of this protocol.

2523	   The defined setup signaling mechanism is sensitive to modification
2524	   attacks that can result in session creation with sub-optimal
2525	   configuration, and, in the worst case, session rejection.  To
2526	   prevent this type of attack, authentication and integrity
2527	   protection of the setup signaling is required.

2529	   Spoofed or maliciously created feedback messages of the type
2530	   defined in this specification can have the following implications:

2532	        a. severely reduced media bit rate due to false TMMBR messages
2533	           that set the maximum to a very low value;

2535	        b. assignment of the ownership of a bounding tuple to the
2536	           wrong participant within a TMMBN message, potentially
2537	           causing unnecessary oscillation in the bounding set as the
2538	           mistakenly identified owner reports a change in its tuple
2539	           and the true owner possibly holds back on changes until a
2540	           correct TMMBN message reaches the participants;

2542	        c. sending TSTR requests that result in a video quality
2543	           different from the user's desire, rendering the session
2544	           less useful;

2546	        d. sending frequent FIR commands that potentially reduce the
2547	           frame rate, making the video jerky, due to the frequent
2548	           usage of decoder refresh points.

2550	   To prevent these attacks there is a need to apply authentication
2551	   and integrity protection of the feedback messages.  This can be
2552	   accomplished against threats external to the current RTP session
2553	   using the RTP profile that combines SRTP [SRTP] and AVPF into
2554	   SAVPF [SAVPF].  In the mixer cases, separate security contexts and
2555	   filtering can be applied between the mixer and the participants
2556	   thus protecting other users on the mixer from a misbehaving
2557	   participant.

2559	7. SDP Definitions

2561	   Section 4 of [RFC4585] defines a new SDP [RFC4566] attribute,
2562	   rtcp-fb, that may be used to negotiate the capability to handle
2563	   specific AVPF commands and indications, such as Reference Picture
2564	   Selection, Picture Loss Indication etc.  The ABNF for rtcp-fb is
2565	   described in section 4.2 of [RFC4585].  In this section we extend
2566	   the rtcp-fb attribute to include the commands and indications that
2567	   are described for codec control protocol in the present document.
2568	   We also discuss the Offer/Answer implications for the codec
2569	   control commands and indications.

2571	7.1. Extension of the rtcp-fb Attribute

2573	   As described in AVPF [RFC4585], the rtcp-fb attribute indicates
2574	   the capability of using RTCP feedback.  AVPF specifies that the
2575	   rtcp-fb attribute must only be used as a media level attribute and
2576	   must not be provided at session level.  All the rules described in
2577	   [RFC4585] for rtcp-fb attribute relating to payload type and to
2578	   multiple rtcp-fb attributes in a session description also apply to
2579	   the new feedback messages defined in this memo.

2581	   The ABNF [RFC4234] for rtcp-fb as defined in [RFC4585] is

2583	     "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF

2585	   where rtcp-fb-pt is the payload type and rtcp-fb-val defines the
2586	   type of the feedback message such as ack, nack, trr-int and rtcp-
2587	   fb-id.  For example, to indicate the support of feedback of picture
2588	   loss indication, the sender declares the following in SDP:

2590	         v=0
2591	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2592	         s=Media with feedback
2593	         t=0 0
2594	         c=IN IP4 host.example.com
2595	         m=video 49170 RTP/AVPF 98
2596	         a=rtpmap:98 H263-1998/90000
2597	         a=rtcp-fb:98 nack pli

2599	   In this document we define a new feedback value "ccm" which
2600	   indicates the support of codec control using RTCP feedback
2601	   messages.  The "ccm" feedback value SHOULD be used with
2602	   parameters, which indicate the specific codec control commands
2603	   supported.  In this draft we define four parameters, which can be
2604	   used with the ccm feedback value type.

2606	      o  "fir" indicates the support of the Full Intra Request (FIR).
2607	      o  "tmmbr" indicates the support of the Temporary Maximum Media
2608	         Stream Bit Rate Request/Notification (TMMBR/TMMBN).  It has
2609	         an optional sub parameter to indicate the session maximum
2610	         packet rate to be used.  If not included this defaults to
2611	         infinity.
2612	      o  "tstr" indicates the support of the Temporal-Spatial Trade-
2613	         off Request/Notification (TSTR/TSTN).
2614	      o  "vbcm" indicates the support of H.271 video back channel
2615	         messages (VBCM).  It has zero or more subparameters
2616	         identifying the supported H.271 "payloadType" values.

2618	   In the ABNF for rtcp-fb-val defined in [RFC4585], there is a
2619	   placeholder called rtcp-fb-id to define new feedback types.  "ccm"
2620	   is defined as a new feedback type in this document and the ABNF
2621	   for the parameters for ccm are defined here (please refer to
2622	   section 4.2 of [RFC4585] for complete ABNF syntax).

2624	   rtcp-fb-param = SP "app" [SP byte-string]
2625	                 / SP rtcp-fb-ccm-param
2626	                 /     ; empty

2628	   rtcp-fb-ccm-param = "ccm" SP ccm-param

2630	   ccm-param  = "fir"   ; Full Intra Request
2631	              / "tmmbr" [SP "smaxpr=" MaxPacketRateValue]
2632	                        ; Temporary max media bit rate
2633	              / "tstr"  ; Temporal Spatial Trade Off
2634	              / "vbcm" *(SP subMessageType) ; H.271 VBCM messages
2635	              / token [SP byte-string]
2636	                         ; for future commands/indications
2637	   subMessageType = 1*8DIGIT
2638	   byte-string = <as defined in section 4.2 of [RFC4585] >
2639	   MaxPacketRateValue = 1*15DIGIT
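
   A minimal, non-normative sketch of a parser for the "ccm" forms
   above is given below.  It is not a complete rtcp-fb parser: the
   function name parse_ccm_attribute and the shape of the returned
   dictionary are assumptions of this example, and unknown tokens are
   kept as opaque text.

```python
def _is_digits(text, max_len):
    """True if text consists of 1..max_len ASCII digits."""
    return text.isascii() and text.isdigit() and 1 <= len(text) <= max_len


def parse_ccm_attribute(line):
    """Parse 'a=rtcp-fb:<pt> ccm <param> ...'; return None for non-ccm lines."""
    prefix = "a=rtcp-fb:"
    if not line.startswith(prefix):
        return None
    fields = line[len(prefix):].split()
    if len(fields) < 3 or fields[1] != "ccm":
        return None
    pt, _, param, *rest = fields
    result = {"pt": pt, "param": param}
    if param in ("fir", "tstr"):
        if rest:
            raise ValueError(param + " takes no sub-parameters")
    elif param == "tmmbr":
        if rest:
            # Optional: "smaxpr=" MaxPacketRateValue (1*15DIGIT)
            value = rest[0][len("smaxpr="):]
            if (len(rest) != 1 or not rest[0].startswith("smaxpr=")
                    or not _is_digits(value, 15)):
                raise ValueError("malformed smaxpr sub-parameter")
            result["smaxpr"] = int(value)
    elif param == "vbcm":
        # Zero or more subMessageType values (1*8DIGIT each)
        if not all(_is_digits(sub, 8) for sub in rest):
            raise ValueError("malformed subMessageType")
        result["sub_message_types"] = [int(sub) for sub in rest]
    elif rest:
        # token [SP byte-string]: reserved for future commands/indications
        result["extra"] = " ".join(rest)
    return result
```

   Lines that carry other feedback values, such as "nack pli", return
   None so that they can be handed to whatever code handles
   [RFC4585] attributes.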

2641	7.2. Offer-Answer

2643	   The Offer/Answer [RFC3264] implications for codec control protocol
2644	   feedback messages are similar to those described in [RFC4585].  The
2645	   offerer MAY indicate the capability to support selected codec
2646	   commands and indications.  The answerer MUST remove all ccm
2647	   parameters which it does not understand or does not wish to use in
2648	   this particular media session.  The answerer MUST NOT add new ccm
2649	   parameters in addition to what has been offered.  The answer is
2650	   binding for the media session and both offerer and answerer MUST
2651	   only use feedback messages negotiated in this way.

2653	   The session maximum packet rate parameter part of the TMMBR
2654	   indication is declarative and everyone shall use the highest value
2655	   indicated in a response.  If the session maximum packet rate
2656	   parameter is not present in an offer it SHALL NOT be included by
2657	   the answerer.
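
   The answerer's filtering rules can be sketched as follows.  The
   data shapes and the function name answer_ccm_params are assumptions
   of this example.  Dropping "vbcm" entirely when none of the offered
   sub-message types is supported is also an assumption, since this
   memo does not address that case, and handling of the smaxpr
   sub-parameter is left out.

```python
def answer_ccm_params(offered, supported):
    """Return the ccm parameters an answerer would keep.

    offered:   list of (name, sub_types) pairs taken from the offer;
               sub_types is a list that is only non-empty for "vbcm".
    supported: dict mapping each parameter name the answerer wishes
               to use to the set of sub-types it supports.
    """
    answer = []
    for name, sub_types in offered:
        if name not in supported:
            continue  # remove parameters not understood or not wanted
        if name == "vbcm" and sub_types:
            kept = [s for s in sub_types if s in supported["vbcm"]]
            if kept:
                answer.append((name, kept))
        else:
            answer.append((name, []))
    # Only offered parameters are iterated, so nothing new is ever added.
    return answer
```

   Because the loop runs over the offer only, a parameter that the
   answerer supports but that was not offered can never appear in the
   answer.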

2659	7.3. Examples

2661	   Example 1: The following SDP describes a point-to-point video call
2662	   with H.263, with the originator of the call declaring its
2663	   capability to support the FIR and TSTR/TSTN codec control
2664	   messages.  The SDP is carried in a high level signaling protocol
2665	   like SIP.

2667	         v=0
2668	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2669	         s=Point-to-Point call
2670	         c=IN IP4 192.0.2.124
2671	         m=audio 49170 RTP/AVP 0
2672	         a=rtpmap:0 PCMU/8000
2673	         m=video 51372 RTP/AVPF 98
2674	         a=rtpmap:98 H263-1998/90000
2675	         a=rtcp-fb:98 ccm tstr
2676	         a=rtcp-fb:98 ccm fir

2678	   In the above example, when the sender receives a TSTR message from
2679	   the remote party it is capable of adjusting the trade off as
2680	   indicated in the RTCP TSTN feedback message.

2682	   Example 2: The following SDP describes a SIP end point joining a
2683	   video mixer that is hosting a multiparty video conferencing
2684	   session.  The participant supports only the FIR (Full Intra
2685	   Request) codec control command and it declares it in its session
2686	   description.

2688	         v=0
2689	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2690	         s=Multiparty Video Call
2691	         c=IN IP4 192.0.2.124
2692	         m=audio 49170 RTP/AVP 0
2693	         a=rtpmap:0 PCMU/8000
2694	         m=video 51372 RTP/AVPF 98
2695	         a=rtpmap:98 H263-1998/90000
2696	         a=rtcp-fb:98 ccm fir

2698	   When the video MCU decides to route the video of this participant
2699	   it sends an RTCP FIR feedback message.  Upon receiving this
2700	   feedback message the end point is required to generate a
2701	   decoder refresh point.

2703	   Example 3: The following example describes the Offer/Answer
2704	   implications for the codec control messages.  The Offerer wishes
2705	   to support "tstr", "fir" and "tmmbr".  The offered SDP is

2707	   -------------> Offer
2708	         v=0
2709	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2710	         s=Offer/Answer
2711	         c=IN IP4 192.0.2.124
2712	         m=audio 49170 RTP/AVP 0
2713	         a=rtpmap:0 PCMU/8000
2714	         m=video 51372 RTP/AVPF 98
2715	         a=rtpmap:98 H263-1998/90000
2716	         a=rtcp-fb:98 ccm tstr
2717	         a=rtcp-fb:98 ccm fir
2718	         a=rtcp-fb:* ccm tmmbr smaxpr=120

2720	   The answerer wishes to support only the FIR and TSTR/TSTN messages
2721	   and the answerer SDP is

2723	   <---------------- Answer

2725	         v=0
2726	         o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2727	         s=Offer/Answer
2728	         c=IN IP4 192.0.2.37
2729	         m=audio 47190 RTP/AVP 0
2730	         a=rtpmap:0 PCMU/8000
2731	         m=video 53273 RTP/AVPF 98
2732	         a=rtpmap:98 H263-1998/90000
2733	         a=rtcp-fb:98 ccm tstr
2734	         a=rtcp-fb:98 ccm fir

2736	   Example 4: The following example describes the Offer/Answer
2737	   implications for H.271 Video back channel messages (VBCM).  The
2738	   Offerer wishes to support VBCM and the sub-messages of payloadType
2739	   1 (one or more pictures that are entirely or partially lost) and 2
2740	   (a set of blocks of one picture that are entirely or partially
2741	   lost).

2743	   -------------> Offer
2744	         v=0
2745	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2746	         s=Offer/Answer
2747	         c=IN IP4 192.0.2.124
2748	         m=audio 49170 RTP/AVP 0
2749	         a=rtpmap:0 PCMU/8000
2750	         m=video 51372 RTP/AVPF 98
2751	         a=rtpmap:98 H263-1998/90000
2752	         a=rtcp-fb:98 ccm vbcm 1 2

2754	   The answerer wishes to support sub-messages of type 1 only:

2756	   <---------------- Answer

2758	         v=0
2759	         o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2760	         s=Offer/Answer
2761	         c=IN IP4 192.0.2.37
2762	         m=audio 47190 RTP/AVP 0
2763	         a=rtpmap:0 PCMU/8000
2764	         m=video 53273 RTP/AVPF 98
2765	         a=rtpmap:98 H263-1998/90000
2766	         a=rtcp-fb:98 ccm vbcm 1

2768	   So in the above example only VBCM indications comprised of
2769	   "payloadType" 1 will be supported.

2771	8. IANA Considerations

2773	   The new value "ccm" needs to be registered with IANA in the "rtcp-
2774	   fb" Attribute Values registry located at the time of publication
2775	   at:
2776	   http://www.iana.org/assignments/sdp-parameters

2778	   Value name:       ccm
2779	   Long Name:        Codec Control Commands and Indications
2780	   Reference:        RFC XXXX

2782	   A new registry "Codec Control Messages" needs to be created to
2783	   hold "ccm" parameters located at time of publication at:
2784	   http://www.iana.org/assignments/sdp-parameters

2786	   New registrations in this registry follow the "Specification
2787	   required" policy as defined by [RFC2434].  In addition, they are
2788	   required to indicate any additional RTCP feedback types, such as
2789	   "nack" and "ack", with which the parameter is usable.

2791	   The initial content of the registry is the following values:

2793	   Value name:       fir
2794	   Long name:        Full Intra Request Command
2795	   Usable with:      ccm
2796	   Reference:        RFC XXXX

2798	   Value name:       tmmbr
2799	   Long name:        Temporary Maximum Media Stream Bit Rate
2800	   Usable with:      ccm
2801	   Reference:        RFC XXXX

2803	   Value name:       tstr
2804	   Long name:        Temporal-Spatial Trade-Off
2805	   Usable with:      ccm
2806	   Reference:        RFC XXXX

2808	   Value name:       vbcm
2809	   Long name:        H.271 video back channel messages
2810	   Usable with:      ccm
2811	   Reference:        RFC XXXX

2813	   The following values need to be registered as FMT values in the
2814	   "FMT Values for RTPFB Payload Types" registry located at the time
2815	   of publication at: http://www.iana.org/assignments/rtp-parameters
2816	   RTPFB range
2817	   Name           Long Name                         Value  Reference
2818	   -------------- --------------------------------- -----  ---------
2819	                  Reserved                             2   [RFCxxxx]
2820	   TMMBR          Temporary Maximum Media Stream Bit   3   [RFCxxxx]
2821	                  Rate Request
2822	   TMMBN          Temporary Maximum Media Stream Bit   4   [RFCxxxx]
2823	                  Rate Notification

2825	   The following values need to be registered as FMT values in the
2826	   "FMT Values for PSFB Payload Types" registry located at the time
2827	   of publication at: http://www.iana.org/assignments/rtp-parameters

2829	   PSFB range
2830	   Name           Long Name                             Value Reference
2831	   -------------- ---------------------------------     ----- ---------
2832	   FIR            Full Intra Request Command              4   [RFCxxxx]
2833	   TSTR           Temporal-Spatial Trade-off Request      5   [RFCxxxx]
2834	   TSTN           Temporal-Spatial Trade-off Notification 6   [RFCxxxx]
2835	   VBCM           Video Back Channel Message              7   [RFCxxxx]

2837	9. Contributors

2839	   Tom Taylor has made a very significant contribution to this
2840	   specification, for which the authors are very grateful, by helping
2841	   to rewrite it.  The parts regarding the algorithm for determining
2842	   bounding sets for TMMBR have benefited especially.

2844	10.  Acknowledgements

2846	   The authors would like to thank Andrea Basso, Orit Levin, and
2847	   Nermeen Ismail for their work on the requirement and discussion
2848	   draft [Basso].

2850	   Drafts of this memo were reviewed and extensively commented on by
2851	   Roni Even, Colin Perkins, Randell Jesup, Keith Lantz, Harikishan
2852	   Desineni, Guido Franceschini and others.  The authors appreciate
2853	   these reviews.

2855	   Funding for the RFC Editor function is currently provided by the
2856	   Internet Society.

2858	11.  References

2860	11.1. Normative references

2862	   [RFC4585]   Ott, J., Wenger, S., Sato, N., Burmeister, C., and J.
2863	                Rey, "Extended RTP Profile for Real-Time Transport
2864	                Control Protocol (RTCP)-Based Feedback (RTP/AVPF)",
2865	                RFC 4585, July 2006.
2866	   [RFC2119]   Bradner, S., "Key words for use in RFCs to Indicate
2867	                Requirement Levels", BCP 14, RFC 2119, March 1997.
2868	   [RFC3550]   Schulzrinne, H.,  Casner, S., Frederick, R., and V.
2869	                Jacobson, "RTP: A Transport Protocol for Real-Time
2870	                Applications", STD 64, RFC 3550, July 2003.
2871	   [RFC4566]   Handley, M., Jacobson, V., and C. Perkins, "SDP:
2872	                Session Description Protocol", RFC 4566, July 2006.
2873	   [RFC3264]   Rosenberg, J. and H. Schulzrinne, "An Offer/Answer
2874	                Model with Session Description Protocol (SDP)", RFC
2875	                3264, June 2002.
2876	   [RFC2434]   Narten, T. and H. Alvestrand, "Guidelines for Writing
2877	                an IANA Considerations Section in RFCs", BCP 26, RFC
2878	                2434, October 1998.
2879	   [RFC4234]   Crocker, D. and P. Overell, "Augmented BNF for Syntax
2880	                Specifications: ABNF", RFC 4234, October 2005.

2882	11.2. Informative references

2884	   [Basso]     A. Basso, et al., "Requirements for transport of
2885	                video control commands", draft-basso-avt-videoconreq-
2886	                02.txt, expired Internet Draft, October 2004.
2887	   [AVC]       Joint Video Team of ITU-T and ISO/IEC JTC 1, Draft
2888	                ITU-T Recommendation and Final Draft International
2889	                Standard of Joint Video Specification (ITU-T Rec.
2890	                H.264 | ISO/IEC 14496-10 AVC), Joint Video Team (JVT)
2891	                of ISO/IEC MPEG and ITU-T VCEG, JVT-G050, March 2003.
2892	   [H245]      ITU-T Rec. H.245, "Control protocol for multimedia
2893	                communication", May 2006.
2894	   [NEWPRED]   S. Fukunaga, T. Nakai, and H. Inoue, "Error Resilient
2895	                Video Coding by Dynamic Replacing of Reference
2896	                Pictures," in Proc. Globecom'96, vol. 3, pp. 1503 -
2897	                1508, 1996.
2898	   [SRTP]      Baugher, M., McGrew, D., Naslund, M., Carrara, E., and
2899	                K. Norrman, "The Secure Real-time Transport Protocol
2900	                (SRTP)", RFC 3711, March 2004.
2901	   [RFC2032]   Turletti, T. and C. Huitema, "RTP Payload Format for
2902	                H.261 Video Streams", RFC 2032, October 1996.

2904	   [SAVPF]     J. Ott, E. Carrara, "Extended Secure RTP Profile for
2905	                RTCP-based Feedback (RTP/SAVPF)," draft-ietf-avt-
2906	                profile-savpf-10.txt, February, 2007.
2907	   [RFC3525]   Groves, C., Pantaleo, M., Anderson, T., and T. Taylor,
2908	                "Gateway Control Protocol Version 1", RFC 3525, June
2909	                2003.
2910	   [RFC3448]   M. Handley, S. Floyd, J. Padhye, J. Widmer, "TCP
2911	                Friendly Rate Control (TFRC): Protocol Specification",
2912	                RFC 3448, January 2003.
2913	   [VBCM]      ITU-T Rec. H.271, "Video Back Channel Messages", June
2914	                2006.
2915	   [RFC3890]   Westerlund, M., "A Transport Independent Bandwidth
2916	                Modifier for the Session Description Protocol (SDP)",
2917	                RFC 3890, September 2004.
2918	   [RFC4340]   Kohler, E., Handley, M., and S. Floyd, "Datagram
2919	                Congestion Control Protocol (DCCP)", RFC 4340, March
2920	                2006.
2921	   [RFC3261]   Rosenberg, J., Schulzrinne, H., Camarillo, G.,
2922	                Johnston, A., Peterson, J., Sparks, R., Handley, M.,
2923	                and E. Schooler, "SIP: Session Initiation Protocol",
2924	                RFC 3261, June 2002.
2925	   [RFC2198]   Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
2926	                Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
2927	                Parisis, "RTP Payload for Redundant Audio Data", RFC
2928	                2198, September 1997.
2929	   [Topologies] M. Westerlund, and S. Wenger, "RTP Topologies",
2930	                draft-ietf-avt-topologies-04, work in progress, Feb
2931	                2007.

2933	12.  Authors' Addresses

2935	   Stephan Wenger
2936	   Nokia Corporation
2937	   975, Page Mill Road,
2938	   Palo Alto, CA 94304
2939	   USA

2941	   Phone: +1-650-862-7368
2942	   EMail: stewe@stewe.org

2944	   Umesh Chandra
2945	   Nokia Research Center
2946	   975, Page Mill Road,
2947	   Palo Alto, CA 94304
2948	   USA

2950	   Phone: +1-650-796-7502
2951	   EMail: Umesh.Chandra@nokia.com

2953	   Magnus Westerlund
2954	   Ericsson Research
2955	   Ericsson AB
2956	   SE-164 80 Stockholm, SWEDEN

2958	   Phone: +46 8 7190000
2959	   EMail: magnus.westerlund@ericsson.com

2961	   Bo Burman
2962	   Ericsson Research
2963	   Ericsson AB
2964	   SE-164 80 Stockholm, SWEDEN

2966	   Phone: +46 8 7190000
2967	   EMail: bo.burman@ericsson.com

2969	Full Copyright Statement

2971	   Copyright (C) The IETF Trust (2007).

2973	   This document is subject to the rights, licenses and restrictions
2974	   contained in BCP 78, and except as set forth therein, the authors
2975	   retain all their rights.

2977	   This document and the information contained herein are provided on
2978	   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
2979	   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
2980	   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
2981	   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
2982	   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
2983	   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
2987	   FOR A PARTICULAR PURPOSE.

2989	Intellectual Property

2991	   The IETF takes no position regarding the validity or scope of any
2992	   Intellectual Property Rights or other rights that might be claimed
2993	   to
2994	   pertain to the implementation or use of the technology described in
2995	   this document or the extent to which any license under such rights
2996	   might or might not be available; nor does it represent that it has
2997	   made any independent effort to identify any such rights.
2998	   Information
2999	   on the procedures with respect to rights in RFC documents can be
3000	   found in BCP 78 and BCP 79.

3002	   Copies of IPR disclosures made to the IETF Secretariat and any
3003	   assurances of licenses to be made available, or the result of an
3004	   attempt made to obtain a general license or permission for the use
3005	   of
3006	   such proprietary rights by implementers or users of this
3007	   specification can be obtained from the IETF on-line IPR repository
3008	   at
3009	   http://www.ietf.org/ipr.

3011	   The IETF invites any interested party to bring to its attention any
3012	   copyrights, patents or patent applications, or other proprietary
3013	   rights that may cover technology that may be required to implement
3014	   this standard.  Please address the information to the IETF at
3015	   ietf-ipr@ietf.org.

3017	Acknowledgement

3019	   Funding for the RFC Editor function is provided by the IETF
3020	   Administrative Support Activity (IASA).

3022	RFC Editor Considerations

3024	   The RFC editor is requested to replace all occurrences of XXXX
3025	   with the RFC number this document receives.