idnits 2.17.1 

draft-ietf-avt-avpf-ccm-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 19.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 2897.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2908.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2915.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2921.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The document has examples using IPv4 documentation addresses according
     to RFC6890, but does not use any IPv6 documentation addresses.  Maybe
     there should be IPv6 examples, too?


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  == Line 752 has weird spacing: '...sg type    mul...'

  == Line 1132 has weird spacing: '...     ab  c   s...'

  == Line 1134 has weird spacing: '...     ba   s...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 14, 2007) is 6185 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFCxxxx' is mentioned on line 2751, but not defined

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-avt-topologies-04

  ** Downref: Normative reference to an Informational draft:
     draft-ietf-avt-topologies (ref. 'Topologies')

  ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226)

  ** Obsolete normative reference: RFC 4234 (Obsoleted by RFC 5234)

  == Outdated reference: A later version (-12) exists of
     draft-ietf-avt-profile-savpf-10

  -- Obsolete informational reference (is this intentional?): RFC 3525
     (Obsoleted by RFC 5125)

  -- Obsolete informational reference (is this intentional?): RFC 3448
     (Obsoleted by RFC 5348)


     Summary: 5 errors (**), 0 flaws (~~), 8 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                   Stephan Wenger
3	INTERNET-DRAFT                                           Umesh Chandra
4	Expires: October 2007                                            Nokia
5	                                                     Magnus Westerlund
6	                                                             Bo Burman
7	                                                              Ericsson
8	                                                          May 14, 2007

10	                        Codec Control Messages in the
11	                RTP Audio-Visual Profile with Feedback (AVPF)
12	                       <draft-ietf-avt-avpf-ccm-05.txt>

14	Status of this Memo

16	   By submitting this Internet-Draft, each author represents that any
17	   applicable patent or other IPR claims of which he or she is aware
18	   have been or will be disclosed, and any of which he or she becomes
19	   aware will be disclosed, in accordance with Section 6 of BCP 79.

21	   Internet-Drafts are working documents of the Internet Engineering
22	   Task Force (IETF), its areas, and its working groups.  Note that
23	   other groups may also distribute working documents as Internet-
24	   Drafts.

26	   Internet-Drafts are draft documents valid for a maximum of six months
27	   and may be updated, replaced, or obsoleted by other documents at any
28	   time.  It is inappropriate to use Internet-Drafts as reference
29	   material or to cite them other than as "work in progress."

31	   The list of current Internet-Drafts can be accessed at
32	   http://www.ietf.org/ietf/1id-abstracts.txt.

34	   The list of Internet-Draft Shadow Directories can be accessed at
35	   http://www.ietf.org/shadow.html.

37	Copyright Notice

39	   Copyright (C) The IETF Trust (2007).

41	Abstract

43	   This document specifies a few extensions to the messages defined in
44	   the Audio-Visual Profile with Feedback (AVPF).  They are helpful
45	   primarily in conversational multimedia scenarios where centralized
46	   multipoint functionalities are in use.  However, some are also usable
47	   in smaller multicast environments and point-to-point calls.  The
48	   extensions discussed are messages related to the ITU-T H.271 Video
49	   Back Channel, Full Intra Request, Temporary Maximum Media Stream Bit
50	   Rate and Temporal Spatial Trade-off.

52	TABLE OF CONTENTS

54	1. Introduction
55	2. Definitions
56	   2.1. Glossary
57	   2.2. Terminology
58	   2.3. Topologies
59	3. Motivation (Informative)
60	   3.1. Use Cases
61	   3.2. Using the Media Path
62	   3.3. Using AVPF
63	      3.3.1. Reliability
64	   3.4. Multicast
65	   3.5. Feedback Messages
66	      3.5.1. Full Intra Request Command
67	         3.5.1.1. Reliability
68	      3.5.2. Temporal Spatial Trade-off Request and Notification
69	         3.5.2.1. Point-to-Point
70	         3.5.2.2. Point-to-Multipoint Using Multicast or
71	                  Translators
72	         3.5.2.3. Point-to-Multipoint Using RTP Mixer
73	         3.5.2.4. Reliability
74	      3.5.3. H.271 Video Back Channel Message
75	         3.5.3.1. Reliability
76	      3.5.4. Temporary Maximum Media Stream Bit Rate Request and
77	             Notification
78	         3.5.4.1. Behavior for media receivers using TMMBR
79	         3.5.4.2. Algorithm for establishing current limitations
80	         3.5.4.3. Use of TMMBR in a Mixer Based Multipoint
81	                  Operation
82	         3.5.4.4. Use of TMMBR in Point-to-Multipoint Using
83	                  Multicast or Translators
84	         3.5.4.5. Use of TMMBR in Point-to-point operation
85	         3.5.4.6. Reliability
86	4. RTCP Receiver Report Extensions
87	   4.1. Design Principles of the Extension Mechanism
88	   4.2. Transport Layer Feedback Messages
89	      4.2.1. Temporary Maximum Media Stream Bit Rate Request
90	             (TMMBR)
91	         4.2.1.1. Message Format
92	         4.2.1.2. Semantics
93	         4.2.1.3. Timing Rules
94	         4.2.1.4. Handling in Translator and Mixers
95	      4.2.2. Temporary Maximum Media Stream Bit Rate Notification
96	             (TMMBN)
97	         4.2.2.1. Message Format
98	         4.2.2.2. Semantics
99	         4.2.2.3. Timing Rules
100	         4.2.2.4. Handling by Translators and Mixers
101	   4.3. Payload Specific Feedback Messages
102	      4.3.1. Full Intra Request (FIR)
103	         4.3.1.1. Message Format
104	         4.3.1.2. Semantics
105	         4.3.1.3. Timing Rules
106	         4.3.1.4. Handling of FIR Message in Mixer and
107	                  Translators
108	         4.3.1.5. Remarks
109	      4.3.2. Temporal-Spatial Trade-off Request (TSTR)
110	         4.3.2.1. Message Format
111	         4.3.2.2. Semantics
112	         4.3.2.3. Timing Rules
113	         4.3.2.4. Handling of message in Mixers and Translators
114	         4.3.2.5. Remarks
115	      4.3.3. Temporal-Spatial Trade-off Notification (TSTN)
116	         4.3.3.1. Message Format
117	         4.3.3.2. Semantics
118	         4.3.3.3. Timing Rules
119	         4.3.3.4. Handling of TSTN in Mixer and Translators
120	         4.3.3.5. Remarks
121	      4.3.4. H.271 Video Back Channel Message (VBCM)
122	         4.3.4.1. Message Format
123	         4.3.4.2. Semantics
124	         4.3.4.3. Timing Rules
125	         4.3.4.4. Handling of message in Mixer or Translator
126	         4.3.4.5. Remarks
127	5. Congestion Control
128	6. Security Considerations
129	7. SDP Definitions
130	   7.1. Extension of the rtcp-fb Attribute
131	   7.2. Offer-Answer
132	   7.3. Examples
133	8. IANA Considerations
134	9. Acknowledgements
135	10. References
136	   10.1. Normative references
137	   10.2. Informative references
138	11. Authors' Addresses
139	1. Introduction

141	   When the Audio-Visual Profile with Feedback (AVPF) [RFC4585] was
142	   developed, the main emphasis lay in the efficient support of point-
143	   to-point and small multipoint scenarios without centralized
144	   multipoint control.  However, in practice, many small multipoint
145	   conferences operate utilizing devices known as Multipoint Control
146	   Units (MCUs).  Long-standing experience of the conversational video
147	   conferencing industry suggests that there is a need for a few
148	   additional feedback messages, to support centralized multipoint
149	   conferencing efficiently.  Some of the messages have applications
150	   beyond centralized multipoint, and this is indicated in the
151	   description of the message.  This is especially true for the message
152	   intended to carry ITU-T Rec. H.271 [H.271] bit strings for Video Back
153	   Channel messages.

155	   In Real-time Transport Protocol (RTP) [RFC3550] terminology, MCUs
156	   comprise mixers and translators.  Most MCUs also include signaling
157	   support.  During the development of this memo, it was noticed that
158	   there is considerable confusion in the community related to the use
159	   of terms such as mixer, translator, and MCU.  In response to these
160	   concerns, a number of topologies have been identified that are of
161	   practical relevance to the industry, but are not documented in
162	   sufficient detail in [RFC3550].  These topologies are documented in
163	   [Topologies], and understanding this memo requires previous or
164	   parallel study of [Topologies].

166	   Some of the messages defined here are forward only, in that they do
167	   not require an explicit notification to the message emitter that they
168	   have been received and/or indicating the message receiver's actions.
169	   Other messages require a response, leading to a two way communication
170	   model that one could view as useful for control purposes.  However,
171	   it is not the intention of this memo to open up RTP Control Protocol
172	   (RTCP) to a generalized control protocol.  All mentioned messages
173	   have relatively strict real-time constraints, in the sense that their
174	   value diminishes with increased delay.  This makes the use of more
175	   traditional control protocol means, such as Session Initiation
176	   Protocol (SIP) re-INVITEs [RFC3261], undesirable when used for the
177	   same purpose.  Furthermore, all messages are of a very simple format
178	   that can be easily processed by an RTP/RTCP sender/receiver.
179	   Finally, all messages relate only to the RTP stream with which they
180	   are associated, and not to any other property of a communication
181	   system.  In particular, none of them relate to the properties of the
182	   access links traversed by the session.

184	2. Definitions

186	2.1. Glossary

188	   AIMD   - Additive Increase Multiplicative Decrease
189	   AVPF   - The extended RTP profile for RTCP-based feedback
190	   FEC    - Forward Error Correction
191	   FCI    - Feedback Control Information [RFC4585]
192	   FIR    - Full Intra Request
193	   MCU    - Multipoint Control Unit
194	   MPEG   - Moving Picture Experts Group
195	   TMMBN  - Temporary Maximum Media Stream Bit Rate Notification
196	   TMMBR  - Temporary Maximum Media Stream Bit Rate Request
197	   PLI    - Picture Loss Indication
198	   PR     - Packet rate
199	   QP     - Quantizer Parameter
200	   RTT    - Round trip time
201	   SSRC   - Synchronization Source
202	   TSTN   - Temporal Spatial Trade-off Notification
203	   TSTR   - Temporal Spatial Trade-off Request
204	   VBCM   - Video Back Channel Message indication.

206	2.2. Terminology

208	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
209	   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
210	   document are to be interpreted as described in RFC 2119 [RFC2119].

212	      Message:
213	          An RTCP feedback message [RFC4585] defined by this
214	          specification, of one of the following types:

216	          Request:
217	              Message that requires acknowledgement

219	          Command:
220	              Message that forces the receiver to an action

222	          Indication:
223	              Message that reports a situation

225	          Notification:
227	              Message that provides a notification that an event has
228	              occurred. Notifications are commonly generated in response
229	              to a Request.

231	          Note that, with the exception of "Notification", this
232	          terminology is in alignment with ITU-T Rec. H.245 [H245].

234	     Decoder Refresh Point:
235	          A bit string, packetized in one or more RTP packets, which
236	          completely resets the decoder to a known state.

238	          Examples for "hard" decoder refresh points are Intra pictures
239	          in H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 part 2, and
240	          Instantaneous Decoder Refresh (IDR) pictures in H.264.
241	          "Gradual" decoder refresh points may also be used; see for
242	          example [AVC].  While both "hard" and "gradual" decoder
243	          refresh points are acceptable in the scope of this
244	          specification, in most cases the user experience will benefit
245	          from using a "hard" decoder refresh point.

247	          A decoder refresh point also contains all header information
248	          above the picture layer (or equivalent, depending on the video
249	          compression standard) that is conveyed in-band.  In H.264, for
250	          example, a decoder refresh point contains parameter set
251	          Network Adaptation Layer (NAL) units that generate parameter
252	          sets necessary for the decoding of the following slice/data
253	          partition NAL units (and that are not conveyed out of band).

255	   Decoding:
256	          The operation of reconstructing the media stream.

258	   Rendering:
259	          The operation of presenting (parts of) the reconstructed media
260	          stream to the user.

262	   Stream thinning:
263	          The operation of removing some of the packets from a media
264	          stream.  Stream thinning, preferably, is media-aware, implying
265	          that media packets are removed in the order of increasing
266	          relevance to the reproductive quality.  However, even when
267	          employing media-aware stream thinning, most media streams
268	          quickly lose quality when subject to increasing levels of
269	          thinning.  Media-unaware stream thinning leads to even worse
270	          quality degradation.  In contrast to transcoding, stream
271	          thinning is typically seen as a computationally lightweight
272	          operation.

274	   Media:

276	          Often used (sometimes in conjunction with terms like bit rate,
277	          stream, sender ...) to identify the content of the forward RTP
278	          packet stream (carrying the codec data), to which the codec
279	          control message applies.

281	   Media Stream:
282	          The stream of RTP packets labeled with a single
283	          Synchronization Source (SSRC) carrying the media (and also in
284	          some cases repair information such as retransmission or
285	          Forward Error Correction (FEC) information).

287	   Total media bit rate:
288	          The total bits per second transferred in a media stream,
289	          measured at an observer-selected protocol layer and averaged
290	          over a reasonable timescale, the length of which depends on
291	          the application.  In general, a media sender and a media
292	          receiver will observe different total media bit rates for the
293	          same stream, first because they may have selected different
294	          reference protocol layers, and second, because of changes in
295	          per-packet overhead along the transmission path.  The goal
296	          with bit rate averaging is to be able to ignore any burstiness
297	          on very short timescales, below for example 100 ms, introduced
298	          by scheduling or link layer packetization effects.

300	   Maximum total media bit rate:
301	          The upper limit on total media bit rate for a given media
302	          stream at a particular receiver and for its selected protocol
303	          layer. Note that this value cannot be measured on the received
304	          media stream, instead it needs to be calculated or determined
305	          through other means, such as QoS negotiations or local
306	          resource limitations. Also note that this value is an average
307	          (on a timescale that is reasonable for the application) and
308	          that it may be different from the instantaneous bit-rate seen
309	          by packets in the media stream.

311	   Overhead:
312	          All protocol header information required to convey a packet
313	          with media data from sender to receiver, from the application
314	          layer down to a pre-defined protocol level (for example down
315	          to, and including, the IP header).  Overhead may include, for
316	          example, IP, UDP, and RTP headers, any layer 2 headers, any
317	          Contributing Sources (CSRCs), RTP-Padding, and RTP header
318	          extensions.  Overhead excludes any RTP payload headers and the
319	          payload itself.

321	   Net media bit rate:
322	          The bit rate carried by a media stream, net of overhead.  That
323	          is, the bits per second accounted for by encoded media, any
324	          applicable payload headers, and any directly associated meta
325	          payload information placed in the RTP packet.  A typical
326	          example of the latter is redundancy data provided by the use
327	          of RFC 2198 [RFC2198].  Note that, unlike the total media bit
328	          rate, the net media bit rate will have the same value at the
329	          media sender and at the media receiver unless any mixing or
330	          translating of the media has occurred.

332	          For a given observer, the total media bit rate for a media
333	          stream is equal to the sum of the net media bit rate and the
334	          per-packet overhead as defined above multiplied by the packet
335	          rate.
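
          As an illustration only, the relation above can be written
          as a small Python function.  The function name, the
          parameter names, and the choice of bytes as the unit for
          per-packet overhead are assumptions made for this sketch;
          they are not defined by this memo.

```python
def total_media_bit_rate(net_bit_rate_bps, overhead_bytes_per_packet,
                         packet_rate_pps):
    """Total media bit rate seen by one observer, in bits per second.

    Assumes the observer has already chosen its reference protocol
    layer and knows the per-packet overhead (in bytes) down to that layer.
    """
    # Overhead is paid once per packet, so it scales with the packet rate.
    overhead_bps = overhead_bytes_per_packet * 8 * packet_rate_pps
    return net_bit_rate_bps + overhead_bps


# Example: 64 kbit/s of encoded media sent as 50 packets per second,
# with 40 bytes of IPv4 + UDP + RTP headers per packet (20 + 8 + 12).
print(total_media_bit_rate(64_000, 40, 50))  # 80000
```

          Two observers that count different headers as overhead (for
          example, one including layer 2 headers and one stopping at
          IP) obtain different totals for the same net media bit rate
          and packet rate.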

337	   Feasible region:
338	          The set of all combinations of packet rate and net media bit
339	          rate that do not exceed the restrictions in maximum media bit
340	          rate placed on a given media sender by the Temporary Maximum
341	          Media Stream Bit-rate Request (TMMBR)  messages it has
342	          received.  The feasible region will change as new TMMBR
343	          messages are received.
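
          A minimal sketch of a feasibility test follows.  It assumes
          that each TMMBR restriction can be represented as a pair of
          a maximum total media bit rate and the per-packet overhead
          observed by the requesting receiver; the names TmmbrTuple
          and is_feasible are hypothetical and serve illustration
          only.

```python
from typing import Iterable, NamedTuple


class TmmbrTuple(NamedTuple):
    """One restriction, as assumed for this sketch."""
    max_total_bit_rate_bps: float     # upper limit requested by a receiver
    overhead_bytes_per_packet: float  # overhead as seen by that receiver


def is_feasible(net_bit_rate_bps: float, packet_rate_pps: float,
                restrictions: Iterable[TmmbrTuple]) -> bool:
    """Return True if the operating point violates none of the restrictions."""
    for r in restrictions:
        # Total bit rate this receiver would observe at the operating point.
        total = net_bit_rate_bps + r.overhead_bytes_per_packet * 8 * packet_rate_pps
        if total > r.max_total_bit_rate_bps:
            return False
    return True


restrictions = [TmmbrTuple(100_000, 40), TmmbrTuple(90_000, 20)]
print(is_feasible(64_000, 50, restrictions))   # True
print(is_feasible(64_000, 150, restrictions))  # False
```

          In the second call the higher packet rate raises the
          overhead share enough to exceed the first restriction, even
          though the net media bit rate is unchanged.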

345	   Bounding set:
346	          The set of TMMBR tuples, selected from all those received at a
347	          given media sender, that define the feasible region for that
348	          media sender.  The media sender uses an algorithm such as that
349	          in section 3.5.4.2 to determine or iteratively approximate the
350	          current bounding set, and reports that set back to the media
351	          receivers in a Temporary Maximum Media Stream Bit-rate
352	          Notification (TMMBN) message.

354	2.3. Topologies

356	   Please refer to [Topologies] for an in-depth discussion.  The
357	   topologies referred to throughout this memo are labeled (consistently
358	   with [Topologies]) as follows:

360	   Topo-Point-to-Point . . . . . point-to-point communication
361	   Topo-Multicast  . . . . . . . multicast communication as in RFC 3550
362	   Topo-Translator . . . . . . . translator based as in RFC 3550
363	   Topo-Mixer  . . . . . . . . . mixer based as in RFC 3550
364	   Topo-Video-switch-MCU . . . . video switching MCU
365	   Topo-RTCP-terminating-MCU . . mixer but terminating RTCP

367	3. Motivation (Informative)

369	   This section discusses the motivation and usage of the different
370	   video and media control messages.  The video control messages have
371	   been under discussion for a long time, and a requirement draft was
372	   drawn up [Basso].  This draft has expired; however, we quote relevant
373	   sections of it to provide motivation and requirements.

375	3.1. Use Cases

377	   There are a number of possible usages for the proposed feedback
378	   messages.  Let us begin by looking through the use cases Basso et al.
379	   [Basso] proposed.  Some of the use cases have been reformulated and
380	   comments have been added.

382	   1. An RTP video mixer composes multiple encoded video sources into a
383	      single encoded video stream.  Each time a video source is added,
384	      the RTP mixer needs to request a decoder refresh point from the
385	      video source, so as to start an uncorrupted prediction chain on
386	      the spatial area of the mixed picture occupied by the data from
387	      the new video source.

389	   2. An RTP video mixer receives multiple encoded RTP video streams
390	      from conference participants, and dynamically selects one of the
391	      streams to be included in its output RTP stream.  At the time of a
392	      bit stream change (determined through means such as voice
393	      activation or the user interface), the mixer requests a decoder
394	      refresh point from the remote source, in order to avoid using
395	      unrelated content as reference data for inter picture prediction.
396	      After requesting the decoder refresh point, the video mixer stops
397	      the delivery of the current RTP stream and monitors the RTP stream
398	      from the new source until it detects data belonging to the decoder
399	      refresh point.  At that time, the RTP mixer starts forwarding the
400	      newly selected stream to the receiver(s).

402	   3. An application needs to signal to the remote encoder that the
403	      desired trade-off between temporal and spatial resolution has
404	      changed.  For example, one user may prefer a higher frame rate and
405	      a lower spatial quality, and another user may prefer the opposite.
406	      This choice is also highly content dependent.  Many current video
407	      conferencing systems offer in the user interface a mechanism to
408	      make this selection, usually in the form of a slider.  The
409	      mechanism is helpful in point-to-point, centralized multipoint and
410	      non-centralized multipoint uses.

412	   4. Use case 4 of the Basso draft applies only to Picture Loss
413	      Indication (PLI) as defined in AVPF [RFC4585] and is not
414	      reproduced here.

416	   5. Use case 5 of the Basso draft relates to a mechanism known as
417	      "freeze picture request".  Sending freeze picture requests
418	      over a non-reliable forward RTCP channel has been identified as
419	      problematic.  Therefore, no freeze picture request has been
420	      included in this memo, and the use case discussion is not
421	      reproduced here.

423	   6. A video mixer dynamically selects one of the received video
424	      streams to be sent out to participants and tries to provide the
425	      highest bit rate possible to all participants, while minimizing
426	      stream trans-rating.  One way of achieving this is to set up
427	      sessions with endpoints using the maximum bit rate accepted by
428	      each endpoint, and accepted by the call admission method used by
429	      the mixer.  By means of commands that reduce the maximum media
430	      stream bit rate below what has been negotiated during session set
431	      up, the mixer can reduce the maximum bit rate sent by endpoints to
432	      the lowest of all the accepted bit rates.  As the lowest accepted
433	      bit rate changes due to endpoints joining and leaving or due to
434	      network congestion, the mixer can adjust the limits at which
435	      endpoints can send their streams to match the new value.  The
436	      mixer then requests a new maximum bit rate, which is equal to or
437	      less than the maximum bit rate negotiated at session setup for a
438	      specific media stream, and the remote endpoint can respond with
439	      the actual bit rate that it can support.
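
      The switching behavior described in use case 2 can be sketched
      as a small state machine.  This is an illustration under stated
      assumptions, not a prescribed implementation: the Packet and
      StreamSwitcher classes, the starts_refresh_point flag (assumed
      to be set by a codec-specific payload parser), and the
      request_refresh hook (which could, for example, emit a Full
      Intra Request) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Packet:
    ssrc: int
    payload: bytes
    starts_refresh_point: bool = False  # hypothetical flag from a payload parser


@dataclass
class StreamSwitcher:
    request_refresh: Callable[[int], None]  # hypothetical hook, e.g. sends a FIR
    active_ssrc: Optional[int] = None       # source currently being forwarded
    pending_ssrc: Optional[int] = None      # source awaiting a refresh point
    forwarded: List[Packet] = field(default_factory=list)

    def select(self, ssrc: int) -> None:
        """Switch sources: stop forwarding and request a decoder refresh point."""
        self.active_ssrc = None
        self.pending_ssrc = ssrc
        self.request_refresh(ssrc)

    def on_packet(self, pkt: Packet) -> None:
        """Forward a packet only once its source has delivered a refresh point."""
        if pkt.ssrc == self.pending_ssrc and pkt.starts_refresh_point:
            self.active_ssrc, self.pending_ssrc = pkt.ssrc, None
        if pkt.ssrc == self.active_ssrc:
            self.forwarded.append(pkt)


requests: List[int] = []
switcher = StreamSwitcher(request_refresh=requests.append)
switcher.select(1)
switcher.on_packet(Packet(1, b"inter"))        # dropped: no refresh point yet
switcher.on_packet(Packet(1, b"intra", True))  # forwarding starts here
switcher.on_packet(Packet(1, b"inter"))        # forwarded
print(requests, len(switcher.forwarded))
```

      The sketch ignores loss of the request and of the refresh point
      itself; a real mixer would need a retry strategy for both.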

441	   The picture Basso et al. draw up covers most applications we
442	   foresee.  However, we would like to extend the list with two
443	   additional use cases:

445	   7. Currently deployed congestion control algorithms (AIMD and TFRC
446	      [RFC3448]) probe for additional available capacity as long as
447	      there is something to send.  With congestion control algorithms
448	      using packet loss as the indication for congestion, this probing
449	      generally results in reduced media quality (often to a point
450	      where the distortion is large enough to make the media unusable),
451	      due to packet loss and increased delay.

453	      In a number of deployment scenarios, especially cellular ones, the
454	      bottleneck link is often the last hop link.  That cellular link
455	      also commonly has some type of QoS negotiation enabling the
456	      cellular device to learn the maximal bit rate available over this
457	      last hop.  A media receiver behind this link can, in most (if not
458	      all) cases, calculate at least an upper bound for the bit rate
459	      available for each media stream it presently receives.  How this
460	      is done is an implementation detail and not discussed herein.
461	      Indicating the maximum available bit rate to the transmitting
462	      party for the various media streams can be beneficial to prevent
463	      that party from probing for bandwidth for this stream in excess of
464	      a known hard limit.  For cellular or other mobile devices, the
465	      known available bit rate for each stream (deduced from the link
466	      bit rate) can change quickly, due to handover to another
467	      transmission technology, QoS renegotiation due to congestion, etc.
468	      To enable minimal disruption of service, quick convergence is
469	      necessary, and therefore media path signaling is desirable.

471	   8. The use of reference picture selection (RPS) as an error
472	      resilience tool was introduced in 1997 as NEWPRED [NEWPRED],
473	      and is now widely deployed.  When RPS is in use, simplistically
474	      put, the receiver can send a feedback message to the sender,
475	      indicating a reference picture that should be used for future
476	      prediction. ([NEWPRED] mentions other forms of feedback as well.)
477	      AVPF contains a mechanism for conveying such a message, but does
478	      not specify the codec for which, or the syntax to which, the
479	      message should conform.  Recently, the ITU-T finalized Rec. H.271
480	      which (among other message types) also includes a feedback
481	      message.  It is expected that this feedback message will fairly
482	      quickly enjoy wide support.  Therefore, a mechanism to convey
483	      feedback messages according to H.271 appears to be desirable.

485	3.2. Using the Media Path

487	   There are multiple reasons why we use the media path for the codec
488	   control messages.

490	   First, systems employing MCUs often separate the control and media
491	   processing parts.  As these messages are intended for or generated by
492	   the media part rather than the signaling part of the MCU, having them
493	   on the media path avoids transmission across interfaces and
494	   unnecessary control traffic between signaling and processing.  If the
495	   MCU is physically decomposed, the use of the media path avoids the
496	   need for media control protocol extensions (e.g. in MEGACO
497	   [RFC3525]).

499	   Secondly, the signaling path quite commonly contains several
500	   signaling entities, e.g. SIP proxies and application servers.
501	   Bypassing these signaling entities avoids delay for several
502	   reasons.  Proxies have less stringent delay requirements than media
503	   processing and, due to their complex and more generic nature, may
504	   add significant processing delay.  The topological locations of
505	   the signaling entities are also commonly not optimized for minimal
506	   delay, but rather for other architectural goals.  Thus, the
507	   signaling path can be significantly longer in both the geographical
508	   and the delay sense.

510	3.3. Using AVPF

512	   The AVPF feedback message framework [RFC4585] provides the
513	   appropriate framework to implement the new messages.  AVPF implements
514	   rules controlling the timing of feedback messages to avoid congestion
515	   through network flooding by RTCP traffic.  We re-use these rules by
516	   referencing AVPF.

518	   The signaling setup for AVPF allows each individual type of function
519	   to be configured or negotiated on an RTP session basis.

521	3.3.1. Reliability

523	   The use of RTCP messages implies that each message transfer is
524	   unreliable, unless the lower layer transport provides reliability.
525	   The different messages proposed in this specification have different
526	   requirements in terms of reliability.  However, in all cases, the
527	   reaction to an (occasional) loss of a feedback message is specified.

529	3.4. Multicast

531	   The codec control messages might be used with multicast.  The RTCP
532	   timing rules specified in [RFC3550] and [RFC4585] ensure that the
533	   messages do not cause overload of the RTCP connection.  The use of
534	   multicast may result in the reception of messages with inconsistent
535	   semantics.  The reaction to inconsistencies depends on the message
536	   type, and is discussed for each message type separately.

538	3.5. Feedback Messages

540	   This section describes the semantics of the different feedback
541	   messages and how they apply to the different use cases.

543	3.5.1. Full Intra Request Command

545	   A Full Intra Request (FIR) Command, when received by the designated
546	   media sender, requires that the media sender send a Decoder Refresh
547	   Point (see 2.2) at the earliest opportunity.  The evaluation of such
548	   opportunity includes the current encoder coding strategy and the
549	   current available network resources.

551	   FIR is also known as an "instantaneous decoder refresh request" or
552	   "video fast update request".

554	   Using a decoder refresh point implies refraining from using any
555	   picture sent prior to that point as a reference for the encoding
556	   process of any subsequent picture sent in the stream.  For predictive
557	   media types that are not video, the analogue applies.  For example,
558	   if in MPEG-4 systems scene updates are used, the decoder refresh
559	   point consists of the full representation of the scene and is not
560	   delta-coded relative to previous updates.

562	   Decoder refresh points, especially Intra or IDR pictures, are in
563	   general several times larger in size than predicted pictures.  Thus,
564	   in scenarios in which the available bit rate is small, the use of a
565	   decoder refresh point implies a delay that is significantly longer
566	   than the typical picture duration.

568	   Usage in multicast is possible; however, aggregation of the commands
569	   is recommended.  A receiver of the command (i.e. the media sender)
570	   that receives a request shortly after sending a decoder refresh
571	   point (within 2 times the longest Round Trip Time (RTT) known, plus
572	   any AVPF-induced RTCP packet sending delays, if those are known)
573	   should await a second request message to ensure that the media
574	   receiver has not been served by the previously delivered decoder
575	   refresh point.  The reason for the specified delay is to avoid
576	   sending unnecessary decoder refresh points.  A session participant
577	   may have sent its own request while another participant's request
578	   was in flight to it.  Suppressing those requests that may have been
579	   sent without knowledge of the other request avoids this issue.

581	   Using the FIR command to recover from errors is explicitly
582	   disallowed, and instead the PLI message defined in AVPF [RFC4585]
583	   should be used.  The PLI message reports lost pictures and has been
584	   included in AVPF for precisely that purpose.

586	   Full Intra Request is applicable in use cases 1 and 2.

588	3.5.1.1. Reliability

590	   The FIR message results in the delivery of a decoder refresh point,
591	   unless the message is lost.  Decoder refresh points are easily
592	   identifiable from the bit stream.  Therefore, there is no need for
593	   protocol-level notification, and a simple command repetition
594	   mechanism is sufficient for ensuring the level of reliability
595	   required.  However, the potential use of repetition does require a
596	   mechanism to prevent the recipient from responding to messages
597	   already received and responded to.

599	   To ensure the best possible reliability, a sender of FIR may repeat
600	   the FIR request until the desired content has been received.  The
601	   repetition interval is determined by the RTCP timing rules applicable
602	   to the session.  Upon reception of a complete decoder refresh point
603	   or the detection of an attempt to send a decoder refresh point (which
604	   got damaged due to a packet loss), the repetition of the FIR must
605	   stop.  If another FIR is necessary, the request sequence number must
606	   be increased.  A FIR sender shall not have more than one FIR request
607	   (different request sequence number) outstanding at any time per media
608	   sender in the session.
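
	   The requester-side rules above can be sketched as follows.  This is
	   a non-normative illustration in Python; the class and method names
	   are hypothetical, and wrap-around of the request sequence number is
	   deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FirRequester:
    """Hypothetical helper: state kept by a FIR sender for ONE media sender."""

    seq_nr: int = 0            # last request sequence number used
    outstanding: bool = False  # True while a FIR still awaits its refresh point

    def new_request(self) -> int:
        """Start a new FIR; at most one may be outstanding at any time."""
        if self.outstanding:
            raise RuntimeError("a FIR is already outstanding for this sender")
        self.seq_nr += 1  # a new request uses an increased sequence number
        self.outstanding = True
        return self.seq_nr

    def repeat(self) -> Optional[int]:
        """Sequence number to resend at the next RTCP opportunity, if any."""
        return self.seq_nr if self.outstanding else None

    def on_refresh_point_detected(self) -> None:
        """Call on a complete OR damaged decoder refresh point: stop repeating."""
        self.outstanding = False
```

	   Repetitions reuse the same sequence number, which is what allows
	   the media sender to tell a repeated request from a new one.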

610	   The receiver of FIR (i.e. the media sender) behaves in complementary
611	   fashion to ensure delivery of a decoder refresh point.  If it
612	   receives repetitions of the FIR more than 2*RTT after it has sent a
613	   decoder refresh point, it shall send a new decoder refresh point.
614	   Two round trip times allow time for the decoder refresh point to
615	   arrive back to the requestor and for the end of repetitions of FIR to
616	   reach and be detected by the media sender.
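
	   A minimal, non-normative sketch of the media sender's side is given
	   below.  FirResponder and on_fir are hypothetical names, times are
	   in seconds, and the sketch assumes that a FIR carrying a new request
	   sequence number is always served, which ignores the multicast
	   aggregation advice of section 3.5.1.

```python
from typing import Optional


class FirResponder:
    """Hypothetical helper: decides when a FIR warrants a new refresh point."""

    def __init__(self, rtt: float) -> None:
        self.rtt = rtt                          # round trip time estimate
        self.last_seq: Optional[int] = None     # sequence number last served
        self.last_refresh_time: float = 0.0     # when the refresh point was sent

    def on_fir(self, seq_nr: int, now: float) -> bool:
        """Return True if a decoder refresh point should be sent now."""
        is_repetition = seq_nr == self.last_seq
        if is_repetition and now - self.last_refresh_time <= 2 * self.rtt:
            # The refresh point may still be in flight, or the requester may
            # not yet have stopped repeating: ignore this repetition.
            return False
        # New request, or repetitions persisting for more than 2*RTT.
        self.last_seq = seq_nr
        self.last_refresh_time = now
        return True
```

	   With rtt = 0.1, a repetition arriving 0.15 s after the refresh point
	   is ignored, whereas one arriving 0.25 s after it triggers a new
	   decoder refresh point.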

618	   An RTP mixer that receives a FIR from a media receiver is
619	   responsible for ensuring that a decoder refresh point is delivered
620	   to the requesting receiver.  It may be necessary for the mixer to
621	   generate FIR commands.  From a reliability perspective, the two legs
622	   (FIR-requesting endpoint to mixer, and mixer to decoder refresh point
623	   generating endpoint) are handled independently from each other.

625	3.5.2. Temporal Spatial Trade-off Request and Notification

627	   The Temporal Spatial Trade-off Request (TSTR) instructs the video
628	   encoder to change its trade-off between temporal and spatial
629	   resolution.  Index values from 0 to 31 indicate a monotonically
630	   increasing desire for a higher frame rate.  That is, a requester
631	   asking for an index of 0 prefers high quality and is willing to
632	   accept a low frame rate, whereas a requester asking for 31 wishes for
633	   a high frame rate, potentially at the cost of low spatial quality.

635	   In general the encoder reaction time may be significantly longer than
636	   the typical picture duration.  See use case 3 for an example.  The
637	   encoder decides whether and to what extent the request results in a
638	   change of the trade-off.  It returns a Temporal Spatial Trade-Off
639	   Notification (TSTN) message to indicate the trade-off that it will
640	   use henceforth.

642	   TSTR and TSTN have been introduced primarily because it is believed
643	   that control protocol mechanisms, e.g. a SIP re-INVITE, are too
644	   heavyweight and too slow to allow for a reasonable user experience.

646	   Consider, for example, a user interface where the remote user selects
647	   the temporal/spatial trade-off with a slider (as is common in
648	   state-of-the-art video conferencing systems).  Immediate feedback
649	   to any slider movement is required for a reasonable user experience.
650	   A SIP re-INVITE [RFC3261] would require at least two round-trips more
651	   (compared to the TSTR/TSTN mechanism) and may involve proxies and
652	   other complex mechanisms.  Even in a well-designed system, it could
653	   take a second or so until the new trade-off is finally selected.
654	   Furthermore, the use of RTCP solves the multicast use case very
655	   efficiently.

657	   The use of TSTR and TSTN in multipoint scenarios is a non-trivial
658	   subject, and can be achieved in many implementation-specific ways.
659	   Problems stem from the fact that TSTRs will typically arrive
660	   unsynchronized, and may request different trade-off values for the
661	   same stream and/or endpoint encoder.  This memo does not specify a
662	   translator, mixer or endpoint's reaction to the reception of a
663	   suggested trade-off as conveyed in the TSTR.  We only require the
664	   receiver of a TSTR message to reply to it by sending a TSTN, carrying
665	   the new trade-off chosen by its own criteria (which may or may not be
666	   based on the trade-off conveyed by the TSTR).  In other words, the
667	   trade-off sent in TSTR is a non-binding recommendation, nothing more.

669	   Four TSTR/TSTN scenarios need to be distinguished, based on the
670	   topologies described in [Topologies].  The scenarios are described in
671	   the following sub-clauses.

673	3.5.2.1. Point-to-Point

675	   In this most trivial case (Topo-Point-to-Point), the media sender
676	   typically adjusts its temporal/spatial trade-off based on the
677	   requested value in TSTR, subject to its own capabilities.  The TSTN
678	   message conveys back the new trade-off value (which may be identical
679	   to the old one if, for example, the sender is not capable of
680	   adjusting its trade-off).

682	3.5.2.2. Point-to-Multipoint Using Multicast or Translators

684	   RTCP Multicast is used either with media multicast according to Topo-
685	   Multicast, or following RFC 3550's translator model according to
686	   Topo-Translator.  In these cases, unsynchronized TSTR messages from
687	   different receivers may be received, possibly with different
688	   requested trade-offs (because of different user preferences).  This
689	   memo does not specify how the media sender tunes its trade-off.
690	   Possible strategies include selecting the mean or median of all
691	   trade-off requests received, giving priority to certain participants,
692	   or continuing to use the previously selected trade-off (e.g. when the
693	   sender is not capable of adjusting it).  Again, all TSTR messages
694	   need to be acknowledged by TSTN, and the value conveyed back has to
695	   reflect the decision made.
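
	   Since this memo leaves the strategy open, the following Python
	   sketch shows just one possibility (the median) and is not a
	   recommendation.  The function name, the requests mapping (a
	   hypothetical requester identifier, such as an SSRC, mapped to the
	   most recently requested index) and the can_adjust flag are
	   assumptions made for illustration.

```python
from statistics import median_low
from typing import Dict


def choose_tradeoff(requests: Dict[int, int], current: int,
                    can_adjust: bool = True) -> int:
    """Return the trade-off index to apply and to report in every TSTN."""
    # Keep only indices inside the 0..31 range used by TSTR.
    valid = [idx for idx in requests.values() if 0 <= idx <= 31]
    if not can_adjust or not valid:
        # Sender cannot adjust (or nothing usable was requested): keep the
        # previously selected trade-off.
        return current
    # median_low always returns one of the requested values, so the result
    # is an integer index that at least one receiver asked for.
    return median_low(valid)
```

	   Whatever value is returned is the one carried in the TSTN sent in
	   reply to each TSTR, so that every requester learns the decision.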

697	3.5.2.3. Point-to-Multipoint Using RTP Mixer

699	   In this scenario (Topo-Mixer) the RTP mixer receives all TSTR
700	   messages, and has the opportunity to act on them based on its own
701	   criteria.  In most cases, the mixer should form a "consensus" of
702	   potentially conflicting TSTR messages arriving from different
703	   participants, and initiate its own TSTR message(s) to the media
704	   sender(s).  As in the previous scenario, the strategy for forming
705	   this "consensus" is up to the implementation, and can, for example,
706	   encompass averaging the participants' request values, giving priority
707	   to certain participants, or using session default values.

709	   Even if a mixer or translator performs transcoding, it is very
710	   difficult to deliver media with the requested trade-off, unless the
711	   content the mixer or translator receives is already close to that
712	   trade-off.  Thus, if the mixer changes its trade-off, it needs to
713	   request the media sender(s) to use the new value, by creating a TSTR
714	   of its own.  Upon reaching a decision on the trade-off to use, it
715	   includes that value in the acknowledgement to the downstream
716	   requestors.  Only in cases where the original source has
717	   substantially higher quality (and bit rate) is it likely that
718	   transcoding alone can result in the requested trade-off.

720	3.5.2.4. Reliability

722	   A request and reception acknowledgement mechanism is specified.  The
723	   Temporal Spatial Trade-off Notification (TSTN) message informs the
724	   request-sender that its request has been received, and what trade-off
725	   is used henceforth.  This acknowledgement mechanism is desirable for
726	   at least the following reasons:

728	   o A change in the trade-off cannot be directly identified from the
729	     media bit stream.
730	   o User feedback cannot be implemented without knowing the chosen
731	     trade-off value, according to the media sender's constraints.
732	   o Repetitive sending of messages requesting an unimplementable trade-
733	     off can be avoided.

735	3.5.3. H.271 Video Back Channel Message

736	   ITU-T Rec. H.271 defines syntax, semantics, and suggested encoder
737	   reaction to a video back channel message.  The structure defined in
738	   this memo is used to transparently convey such a message from media
739	   receiver to media sender.  In this memo, we refrain from an in-depth
740	   discussion of the available code points within H.271 and refer to the
741	   specification text [H.271] instead.

743	   However, we note that some H.271 messages bear similarities with
744	   native messages of AVPF and this memo.  Furthermore, we note that
745	   some H.271 messages are known to require caution in multicast
746	   environments -- or are plainly not usable in multicast or multipoint
747	   scenarios.  Table 1 provides a brief, oversimplifying overview of the
748	   messages currently defined in H.271, their roughly corresponding AVPF
749	   or CCM messages (the latter as specified in this memo), and an
750	   indication of our current knowledge of their multicast safety.

752	   H.271 msg type       AVPF/CCM msg type    multicast-safe
753	   ---------------------------------------------------------------------
754	   0 (when used for
755	     reference picture
756	     selection)         AVPF RPSI        No (positive ACK of pictures)
757	   1 picture loss       AVPF PLI         Yes
758	   2 partial loss       AVPF SLI         Yes
759	   3 one parameter CRC  N/A              Yes (no required sender action)
760	   4 all parameter CRC  N/A              Yes (no required sender action)
761	   5 refresh point      CCM FIR          Yes

763	   Table 1: H.271 messages and their AVPF/CCM equivalents

765	          Note: H.271 message type 0 is not a strict equivalent to
766	          AVPF's Reference Picture Selection Indication (RPSI); it is an
767	          indication of known-as-correct reference picture(s) at the
768	          decoder.  It does not command an encoder to use a defined
769	          reference picture (the form of control information envisioned
770	          to be carried in RPSI).  However, it is believed and intended
771	          that H.271 message type 0 will be used for the same purpose as
772	          AVPF's RPSI -- although other use forms are also possible.

774	   In response to the opaqueness of the H.271 messages, especially with
775	   respect to multicast safety, the following guidelines MUST be
776	   followed when an implementation wishes to employ the H.271 video back
777	   channel message:

779	   1. Implementations utilizing the H.271 feedback message MUST stay in
780	      compliance with congestion control principles, as outlined in
781	      section 5.

783	   2. An implementation SHOULD utilize the IETF-native messages as
784	      defined in [RFC4585] and in this memo instead of similar messages
785	      defined in [H.271].  Our current understanding of similar messages
786	      is documented in Table 1 above.  One good reason to deviate from
787	      the SHOULD statement above would be if it is clearly understood
788	      that, for a given application and video compression standard, the
789	      aforementioned "similarity" does not hold, in contrast to what
790	      the table indicates.

792	   3. It has been observed that some of the H.271 code points currently
793	      in existence are not multicast-safe.  Therefore, the sensible
794	      thing to do is not to use the H.271 feedback message type in
795	      multicast environments.  It MAY be used only when all the issues
796	      mentioned later are fully understood by the implementer, and
797	      properly taken into account by all endpoints.  In all other cases,
798	      the H.271 message type MUST NOT be used in conjunction with
799	      multicast.

801	   4. It has been observed that even in centralized multipoint
802	      environments, where the mixer should theoretically be able to
803	      resolve issues as documented below, the implementation of such a
804	      mixer and cooperative endpoints is a very difficult and tedious
805	      task.  Therefore, H.271 messages MUST NOT be used in centralized
806	      multipoint scenarios, unless all the issues mentioned below are
807	      fully understood by the implementer, and properly taken into
808	      account by both mixer and endpoints.

810	   Issues to be taken into account when considering the use of H.271 in
811	   multipoint environments:

813	   1. Different state on different receivers.  In many environments it
814	      cannot be guaranteed that the decoder state of all media receivers
815	      is identical at any given point in time.  The most obvious reason
816	      for such a possible misalignment of state is a loss that occurs on
817	      the path to only one of many media receivers.  However, there are
818	      other not so obvious reasons, such as recent joins to the
819	      multipoint conference (be it by joining the multicast group or
820	      through additional mixer output).  Different states can lead the
821	      media receivers to issue potentially contradicting H.271 messages
822	      (or one media receiver issuing an H.271 message that, when
823	      observed by the media sender, is not helpful for the other media
824	      receivers).  A naive reaction of the media sender to these
825	      contradicting messages can lead to unpredictable and annoying
826	      results.

828	   2. Combining messages from different media receivers in a media
829	      sender is a non-trivial task.  As reasons, we note that these
830	      messages may contradict each other, and that their transport
831	      is unreliable (there may well be other reasons).  In case of many
832	      H.271 messages (i.e. types 0, 2, 3, and 4), the algorithm for
833	      combining must be aware both of the network/protocol environment
834	      (i.e. with respect to congestion) and of the media codec employed,
835	      as H.271 messages of a given type can have different semantics for
836	      different media codecs.

838	   3. The suppression of requests may need to go beyond the basic
839	      mechanisms described in AVPF (which are driven exclusively by
840	      timing and transport considerations on the protocol level).  For
841	      example, a receiver is often required to refrain from (or delay)
842	      generating requests, based on information it receives from the
843	      media stream.  For instance, it makes no sense for a receiver to
844	      issue a FIR when a transmission of an Intra/IDR picture is
845	      ongoing.

847	   4. When using the non-multicast-safe messages (e.g. H.271 type 0
848	      positive ACK of received pictures/slices) in larger multicast
849	      groups, the media receiver will likely be forced to delay or even
850	      omit sending these messages.  For the media sender this looks like
851	      data has not been properly received (although it was received
852	      properly), and a naively implemented media sender reacts to these
853	      perceived problems where it should not.

855	3.5.3.1. Reliability

857	   H.271 Video Back Channel messages do not require reliable
858	   transmission, and confirmation of the reception of a message can be
859	   derived from the forward video bit stream.  Therefore, no specific
860	   reception acknowledgement is specified.

862	   With respect to re-sending rules, clause 3.5.1.1 applies.

864	3.5.4. Temporary Maximum Media Stream Bit Rate Request and Notification

866	   A receiver, translator or mixer uses the Temporary Maximum Media
867	   Stream Bit Rate Request (TMMBR, "timber") to request a sender to
868	   limit the maximum bit rate for a media stream (see 2.2) to, or below,
869	   the provided value.  The Temporary Maximum Media Stream Bit Rate
870	   Notification (TMMBN) contains the media sender's current view of the
871	   most limiting subset of the TMMBR-defined limits it has received, to
872	   help the participants to suppress TMMBR requests that would not
873	   further restrict the media sender.  The primary usage for the
874	   TMMBR/TMMBN messages is in a scenario with an MCU or mixer (use case
875	   6), corresponding to Topo-Translator or Topo-Mixer, but also to Topo-
876	   Point-to-Point.

878	   Each temporary limitation on the media stream is expressed as a
879	   tuple.  The first component of the tuple is the maximum total media
880	   bit rate (as defined in section 2.2) that the media receiver is
881	   currently prepared to accept for this media stream.  The second
882	   component is the per-packet overhead that the media receiver has
883	   observed for this media stream at its chosen reference protocol
884	   layer.

886	   As indicated in section 2.2, the overhead as observed by the sender
887	   of the TMMBR (i.e. the media receiver) may differ from the overhead
888	   observed at the receiver of the TMMBR (i.e. the media sender) due to
889	   use of a different reference protocol layer at the other end or due
890	   to the intervention of translators or mixers that affect the amount
891	   of per packet overhead.  For example, a gateway in between the two
892	   that converts between IPv4 and IPv6 affects the per-packet overhead
893	   by 20 bytes.  Other mechanisms that change the overhead include
894	   tunnels.  The problem with varying overhead is also discussed in
895	   [RFC3890].  As will be seen in the description of the algorithm for
896	   use of TMMBR, the difference in perceived overhead between the
897	   sending and receiving ends presents no difficulty because
898	   calculations are carried out in terms of variables (packet rate, net
899	   media bit rate) that have the same value at the sender as at the
900	   receiver.

902	   Reporting both maximum total media bit rate and per-packet overhead
903	   allows different receivers to provide bit rate and overhead values
904	   for different protocol layers, for example at the IP level, at the
905	   outer part of a tunnel protocol, or at the link layer.  The protocol
906	   level a peer reports on depends on the level of integration the peer
907	   has, as it needs to be able to extract the information from that
908	   protocol level.  For example, an application with no knowledge of the
909	   IP version it is running over cannot meaningfully determine the
910	   overhead of the IP header, and hence will not want to include IP
911	   overhead in the overhead or maximum total media bit rate calculation.

913	   It is expected that most peers will be able to report values at least
914	   for the IP layer.  In certain implementations it may be advantageous
915	   to also include information pertaining to the link layer, which in
916	   turn allows for a more precise overhead calculation and a better
917	   optimization of connectivity resources.

919	   The Temporary Maximum Media Stream Bit Rate messages are generic
920	   messages that can be applied to any RTP packet stream.  This
921	   separates them from the other codec control messages defined in this
922	   specification, which apply only to specific media types or payload
923	   formats.  The TMMBR functionality applies to the transport, and the
924	   requirements the transport places on the media encoding.

926	   The reasoning below assumes that the participants have negotiated a
927	   session maximum bit rate, using a signaling protocol.  This value can
928	   be global, for example in case of point-to-point, multicast, or
929	   translators.  It may also be local between the participant and the
930	   peer or mixer.  In either case, the bit rate negotiated in signaling
931	   is the one that the participant guarantees to be able to handle
932	   (depacketize and decode).  In practice, the connectivity of the
933	   participant also influences the negotiated value -- it does not make
934	   much sense to negotiate a total media bit rate that one's network
935	   interface does not support.

937	   It is also beneficial to have negotiated a maximum packet rate for
938	   the session or sender.  RFC 3890 provides an SDP [RFC4566] attribute
939	   that can be used for this purpose; however, that attribute is not
940	   usable in RTP sessions established using offer/answer [RFC3264].
941	   Therefore an optional maximum packet rate signaling parameter is
942	   specified in this memo.

944	   An already established maximum total media bit rate may be changed at
945	   any time, subject to the timing rules governing the sending of
946	   feedback messages. The limit may change to any value between zero and
947	   the session maximum, as negotiated during session establishment
948	   signaling.  However, even if a sender has received a TMMBR message
949	   allowing an increase in the bit rate, all increases must be governed
950	   by a congestion control mechanism.  TMMBR indicates known limitations
951	   only, usually in the local environment, and does not provide any
952	   guarantees about the full path.  Furthermore, any increases in TMMBR-
953	   established bit rate limits are to be executed only after a certain
954	   delay from the sending of the TMMBN message that notifies the world
955	   about the increase in limit.  The delay is specified as at least
956	   twice the longest RTT as known by the media sender, plus the media
957	   sender's calculation of the required wait time for the sending of
958	   another TMMBR message for this session based on AVPF timing rules.
959	   This delay is introduced to allow other session participants to make
960	   known their bit rate limit requirements, which may be lower.
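
	   The delay rule can be illustrated with the following non-normative
	   Python sketch.  The function and parameter names are hypothetical,
	   times are in seconds, and the sketch assumes that a lower limit
	   takes effect immediately, which this section does not state
	   explicitly.  The returned value is only a ceiling; any actual
	   increase in sending rate remains subject to congestion control.

```python
def earliest_increase_time(tmmbn_sent_at: float, longest_rtt: float,
                           avpf_wait: float) -> float:
    """Earliest time at which an increased limit may be put into effect."""
    # At least twice the longest known RTT, plus the sender's estimate of
    # the wait before another TMMBR could be sent under AVPF timing rules.
    return tmmbn_sent_at + 2 * longest_rtt + avpf_wait


def limit_in_effect(old_limit: int, new_limit: int, now: float,
                    tmmbn_sent_at: float, longest_rtt: float,
                    avpf_wait: float) -> int:
    """Bit rate ceiling (bps) the media sender should respect at `now`."""
    if new_limit <= old_limit:
        return new_limit  # assumption: decreases apply at once
    if now >= earliest_increase_time(tmmbn_sent_at, longest_rtt, avpf_wait):
        return new_limit  # waiting period over, the increase may be used
    return old_limit      # still waiting for possibly lower limits
```

	   For example, with a TMMBN sent at t = 10.0, a longest RTT of 0.25
	   and an AVPF wait of 0.5, an increase may be used from t = 11.0.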

962	   If it is likely that the new value indicated by TMMBR will be valid
963	   for the remainder of the session, the TMMBR sender is expected to
964	   perform a renegotiation of the session upper limit using the session
965	   signaling protocol.

967	3.5.4.1. Behavior for media receivers using TMMBR

969	   This section is an informal description of behavior described more
970	   precisely in section 4.2.

972	   A media sender begins the session limited by the maximum media bit
973	   rate and maximum packet rate negotiated in session signaling, if any.

975	   Note that this value may be negotiated for another protocol layer
976	   than the one the participant uses in its TMMBR messages.  Each media
977	   receiver selects a reference protocol layer, forms an estimate of the
978	   overhead it is observing (or estimates it if no packets have been
979	   seen yet) at that reference level, and determines the maximum total
980	   media bit rate it can accept, taking into account its own limitations
981	   and any transport path limitations of which it may be aware.  In case
982	   the current limitations are more restrictive than what was agreed on
983	   in the session signaling, the media receiver reports its initial
984	   estimate of these two quantities to the media sender using a TMMBR
985	   message.  Overall message traffic is reduced by the possibility of
986	   including tuples for multiple media senders in the same TMMBR
987	   message.

989	   The media sender applies an algorithm such as that specified in
990	   section 3.5.4.2 to select which of the tuples it has received are
991	   most limiting (i.e. the bounding set as defined in section 2.2).  It
992	   modifies its operation to stay within the feasible region (as defined
993	   in section 2.2), and also sends out a TMMBN notification to the media
994	   receivers indicating the selected bounding set.

996	   If a media receiver does not own one of the tuples in the bounding
997	   set reported by the TMMBN, it applies the same algorithm as the media
998	   sender to determine if its current estimated (maximum total media bit
999	   rate, overhead) tuple would enter the bounding set if known to the
1000	   media sender.  If so, it issues a TMMBR request reporting the tuple
1001	   value to the sender.  Otherwise it takes no action for the moment.
1002	   Periodically, its estimated tuple values may change or it may receive
1003	   a new TMMBN.  If so, it reapplies the algorithm to decide whether it
1004	   needs to issue a TMMBR request.

1006	   If, alternatively, a media receiver owns one of the tuples in the
1007	   reported bounding set, it takes no action until such time as its
1008	   estimate of its own tuple values changes.  At that time it sends a
1009	   TMMBR request to the media sender to report the changed values.

1011	   A media receiver may change status between owner and non-owner of a
1012	   bounding tuple between one TMMBN message and the next.  Thus it must
1013	   check the contents of each TMMBN to determine its subsequent actions.
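
	   The owner/non-owner decision can be sketched as follows.  This
	   Python fragment is illustrative only: the names are hypothetical,
	   and the bounding-set test is replaced by a simpler dominance check
	   (an assumption, not the algorithm of this memo).  The check only
	   suppresses a request when a single reported tuple is at least as
	   limiting at every packet rate, so it errs toward sending a TMMBR.

```python
from typing import Dict, Tuple

# (maximum total media bit rate in bps, per-packet overhead in bytes)
Limit = Tuple[int, int]


def is_dominated(candidate: Limit, other: Limit) -> bool:
    """True if `other` is at least as limiting as `candidate` at every PR."""
    # Lower (or equal) total bit rate AND higher (or equal) overhead leave
    # no packet rate at which `candidate` would be the tighter limit.
    return other[0] <= candidate[0] and other[1] >= candidate[1]


def should_send_tmmbr(my_ssrc: int, my_limit: Limit,
                      bounding_set: Dict[int, Limit]) -> bool:
    """Decide, after receiving a TMMBN, whether to issue a TMMBR."""
    if my_ssrc in bounding_set:
        # Owner: act only once the local estimate differs from the tuple
        # the media sender currently holds for this receiver.
        return my_limit != bounding_set[my_ssrc]
    # Non-owner: stay silent only if some reported tuple dominates ours.
    return not any(is_dominated(my_limit, b) for b in bounding_set.values())
```

	   The function is meant to be re-evaluated for each TMMBN received
	   and whenever the receiver's own estimate changes.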

1015	   Implementations may use other algorithms of their choosing, as long
1016	   as the bit rate limitations resulting from the exchange of TMMBR and
1017	   TMMBN messages are at least as strict (at least as low, in the bit
1018	   rate dimension) as the ones resulting from the use of the
1019	   aforementioned algorithm.

1021	   Obviously, in point-to-point cases, when there is only one media
1022	   receiver, this receiver becomes "owner" once it receives the first
1023	   TMMBN in response to its own TMMBR, and stays "owner" for the rest of
1024	   the session.  Therefore, when it is known that there will always be
1025	   only a single media receiver, the above algorithm is not required.
1026	   Media receivers that are aware they are the only ones in a session
1027	   can send TMMBR messages with bit rate limits both higher and lower
1028	   than the previously notified limit, at any time (subject to the AVPF
1029	   [RFC4585] RTCP RR send timing rules).  However, it may be difficult
1030	   for a session participant to determine if it is the only receiver in
1031	   the session.  Because of this, any implementation of TMMBR is
1032	   required to include the algorithm described in the next section or a
1033	   stricter equivalent.

1035	3.5.4.2. Algorithm for establishing current limitations

1037	   This section introduces an example algorithm for the calculation of a
1038	   session limit.  Other algorithms can be employed, as long as the
1039	   result of the calculation is at least as restrictive as the result
1040	   that is obtained by this algorithm.

1042	   First it is important to consider the implications of using a tuple
1043	   for limiting the media sender's behavior.  The bit rate and the
1044	   overhead value result in a two-dimensional solution space for the
1045	   calculation of the bit rate of media streams.  Fortunately the two
1046	   variables are linked.  Specifically, the bit rate available for RTP
1047	   payloads is equal to the TMMBR reported bit rate minus the product
1048	   of the packet rate used and the TMMBR reported overhead converted to
1049	   bits.  As a result, when different bit rate/overhead combinations
1050	   need to be considered, the packet rate determines the correct
1051	   limitation.  This is perhaps best explained by an example:

1053	   Example:

1055	   Receiver A: TMMBR_max total BR = 35 kbps, TMMBR_OH = 40 bytes
1056	   Receiver B: TMMBR_max total BR = 40 kbps, TMMBR_OH = 60 bytes

1058	   For a given packet rate (PR) the bit rate available for media
1059	   payloads in RTP will be:

1061	   Max_net media_BR_A =
1062	      TMMBR_max total BR_A - PR * TMMBR_OH_A * 8   ... (1)
1063	   Max_net media_BR_B =
1064	      TMMBR_max total BR_B - PR * TMMBR_OH_B * 8   ... (2)

1066	   For a PR = 20 these calculations will yield a Max_net media_BR_A =
1067	   28600 bps and Max_net media_BR_B = 30400 bps, which suggests that
1068	   receiver A is the limiting one for this packet rate.  However at a
1069	   certain PR there is a switchover point at which receiver B becomes
1070	   the limiting one.  The switchover point can be identified by setting
1071	   Max_net media_BR_A equal to Max_net media_BR_B and solving for PR:

1073	         TMMBR_max total BR_A - TMMBR_max total BR_B
1074	   PR =  ------------------------------------------- ... (3)
1075	                8*(TMMBR_OH_A - TMMBR_OH_B)

1077	   which, for the numbers above yields 31.25 as the switchover point
1078	   between the two limits.  That is, for packet rates below 31.25 per
1079	   second, receiver A is the limiting receiver, and for higher packet
1080	   rates, receiver B is more limiting.  The implications of this
1081	   behavior have to be considered by implementations that are going to
1082	   control media encoding and its packetization.  As exemplified above,
1083	   multiple TMMBR limits may apply to the trade-off between net media
1084	   bit rate and packet rate.  Which limitation applies depends on the
1085	   packet rate being considered.
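   The arithmetic of the example can be checked with a short Python
   sketch.  The function and variable names below are illustrative
   assumptions and are not defined by this memo; only equations (1),
   (2), and (3) and the numbers for receivers A and B come from the
   text above.

```python
def net_media_bit_rate(max_total_br_bps, overhead_bytes, packet_rate):
    """Equations (1)/(2): bit rate left for RTP payloads at a packet rate."""
    return max_total_br_bps - packet_rate * overhead_bytes * 8


def switchover_packet_rate(br_a, oh_a, br_b, oh_b):
    """Equation (3): packet rate at which two tuples are equally limiting."""
    return (br_a - br_b) / (8 * (oh_a - oh_b))


# Tuples from the example: (TMMBR_max total BR in bps, TMMBR_OH in bytes).
A = (35_000, 40)
B = (40_000, 60)

net_a = net_media_bit_rate(*A, 20)        # receiver A at PR = 20
net_b = net_media_bit_rate(*B, 20)        # receiver B at PR = 20
switch = switchover_packet_rate(*A, *B)   # switchover point between A and B
```

   Evaluating the same functions at a packet rate above the switchover
   point (for example PR = 40) shows receiver B yielding the lower net
   media bit rate.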

1087	   This also has implications for how the TMMBR mechanism needs to work.
1088	   First, there is the possibility that multiple TMMBR tuples are
1089	   providing limitations on the media sender.  Secondly there is a need
1090	   for any session participant (media sender and receivers) to be able
1091	   to determine if a given tuple will become a limitation upon the media
1092	   sender, or if the set of already given limitations is stricter than
1093	   the given values.  In the absence of the ability to make this
1094	   determination the suppression of TMMBR requests would not work.

1096	   The basic idea of the algorithm is as follows.  Each TMMBR tuple can
1097	   be viewed as the equation of a straight line (cf. equations (1) and
1098	   (2)) in a space where packet rate lies along the X-axis and maximum
1099	   bit rate lies along the Y-axis. The lower envelope of the set of
1100	   lines corresponding to the complete set of TMMBR tuples defines a
1101	   polygon. Points lying along or below this polygon are combinations of
1102	   packet rate and bit rate that meet all of the TMMBR constraints. The
1103	   highest feasible packet rate within this region is the lesser of the
1104	   rate at which the bounding polygon meets the X-axis and the session
1105	   maximum packet rate (SMAXPR) provided by signaling, if any. Typically
1106	   a media sender will prefer to operate at a lower rate than this
1107	   theoretical maximum, so as to increase the rate at which actual media
1108	   content reaches the receivers.  The purpose of the algorithm is to
1109	   distinguish the TMMBR tuples constituting the bounding set and thus
1110	   delineate the feasible region, so that the media sender can select
1111	   its preferred operating point within that region.

1113	   Figure 1 below shows a bounding polygon formed by TMMBR tuples A and
1114	   B. A third tuple C lies outside the bounding polygon and is therefore
1115	   irrelevant in determining feasible tradeoffs between media rate and
1116	   packet rate.  The line labeled ss..s represents the limit on packet
1117	   rate imposed by the session maximum packet rate (SMAXPR) obtained by
1118	   signaling during session setup.  In Figure 1 the limit determined by
1119	   tuple B happens to be more restrictive than SMAXPR.  The situation
1120	   could easily be the reverse, meaning that the bounding polygon is
1121	   terminated on the right by the vertical line representing the SMAXPR
1122	   constraint.

1124	        ^
1125	        |a   c   b             s
1126	   Bit  |  a   c  b            s
1127	   Rate |    a   c b           s
1128	        |      a   cb          s
1129	        |        a   c         s
1130	        |          a  bc       s
1131	        |            a b c     s
1132	        |              ab  c   s
1133	        |  Feasible      b   c s
1134	        |   region        ba   s
1135	        |                  b a s c
1136	        |                   b  s   c
1137	        |                    b s a
1138	        |_____________________bs________
1139	        +------------------------------>

1141	              Packet rate

1143	    Figure 1 - Geometric Interpretation of TMMBR Tuples

1145	   Note that the slopes of the lines making up the bounding polygon are
1146	   increasingly negative as one moves in the direction of increasing
1147	   packet rate.  Note also that with slight rearrangement, equations (1)
1148	   and (2) have the canonical form:

1150	          y = mx + b

1152	   where
1153	     m is the slope and has value equal to the negative of the tuple
1154	     overhead (in bits),
1155	   and
1156	     b is the y-intercept and has value equal to the tuple maximum total
1157	     media bit rate.

1159	   These observations lead to the conclusion that when processing the
1160	   TMMBR tuples to select the initial bounding set, one should sort and
1161	   process the tuples by order of increasing overhead. Once a particular
1162	   tuple has been added to the bounding set, all tuples not already
1163	   selected and having lower overhead can be eliminated, because the
1164	   next side of the bounding polygon has to be steeper (i.e. the
1165	   corresponding TMMBR must have higher overhead) than the latest added
1166	   tuple.

1168	   Line cc..c in Figure 1 illustrates another principle. This line is
1169	   parallel to line aa..a, but has a higher Y-intercept.  That is, the
1170	   corresponding TMMBR tuple contains a higher maximum total media bit
1171	   rate value.  Since line cc..c is outside the bounding polygon, it
1172	   illustrates the conclusion that if two TMMBR tuples have the same
1173	   overhead value, the one with higher maximum total media bit rate
1174	   value cannot be part of the bounding set and can be set aside.

1176	   Two further observations complete the algorithm.  Obviously, moving
1177	   from the left, the successive corners of the bounding polygon (i.e.
1178	   the intersection points between successive pairs of sides) lie at
1179	   successively higher packet rates.  On the other hand, again moving
1180	   from the left, each successive line making up the bounding set
1181	   crosses the X-axis at a lower packet rate.

1183	   The complete algorithm can now be specified.  The algorithm works
1184	   with two lists of TMMBR tuples, the candidate list X and the selected
1185	   list Y, both ordered by increasing overhead value.  The algorithm
1186	   terminates when all members of X have been discarded or removed for
1187	   processing.  Membership of the selected list Y is probationary until
1188	   the algorithm is complete.  Each member of the selected list is
1189	   associated with an intersection value, which is the packet rate at
1190	   which the line corresponding to that TMMBR tuple intersects with the
1191	   line corresponding to the previous TMMBR tuple in the selected list.
1192	   Each member of the selected list is also associated with a maximum
1193	   packet rate value, which is the lesser of the session maximum packet
1194	   rate SMAXPR (if any) and the packet rate at which the line
1195	   corresponding to that tuple crosses the X-axis.

1197	   When the algorithm terminates, the selected list is equal to the
1198	   bounding set as defined in section 2.2.

1200	Initial Algorithm

1202	   This algorithm is used by the media sender when it has received one
1203	   or more TMMBR requests and before it has determined a bounding set
1204	   for the first time.

1206	   1. Sort the TMMBR tuples by order of increasing overhead.  This is
1207	      the initial candidate list X.

1209	   2. When multiple tuples in the candidate list have the same
1210	      overhead value, discard all but the one with the lowest maximum
1211	      total media bit rate value.

1213	   3. Select and remove from the candidate list the TMMBR tuple with the
1214	      lowest maximum total media bit rate value.  If there is more than
1215	      one tuple with that value, choose the one with the highest
1216	      overhead value.  This is the first member of the selected list Y.
1217	      Set its intersection value equal to zero.  Calculate its maximum
1218	      packet rate as the minimum of SMAXPR (if available) and the value
1219	      obtained from the following formula, which is the packet rate at
1220	      which the corresponding line crosses the X-axis.

1222	          Max PR = TMMBR max total BR / (8 * TMMBR OH) ... (4)

1224	   4. Discard from the candidate list all tuples with a lower overhead
1225	      value than the selected tuple.

1227	   5. Remove the first remaining tuple from the candidate list for
1228	      processing.  Call this the current candidate.

1230	   6. Calculate the packet rate PR at the intersection of the line
1231	      generated by the current candidate with the line generated by the
1232	      last tuple in the selected list Y, using equation (3).

1234	   7. If the calculated value PR is equal to or lower than the
1235	      intersection value stored for the last tuple of the selected list,
1236	      discard the last tuple of the selected list and go back to step 6
1237	      (retaining the same current candidate).

1239	      Note that the choice of the initial member of the selected list Y
1240	      in step 3 guarantees that the selected list will never be emptied
1241	      by this process, meaning that the algorithm must eventually (if
1242	      not immediately) fall through to step 8.

1244	   8. (This step is reached when the calculated PR value of the current
1245	      candidate is greater than the intersection value of the current
1246	      last member of the selected list Y.)  If the calculated value PR
1247	      of the current candidate is lower than the maximum packet rate
1248	      associated with the last tuple in the selected list, add the
1249	      current candidate tuple to the end of the selected list.  Store
1250	      PR as its intersection value.  Calculate its maximum packet rate
1251	      as the lesser of SMAXPR (if available) and the maximum packet
1252	      rate calculated using equation (4).

1254	   9. If any tuples remain in the candidate list, go back to step 5.
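   Steps 1 through 9 can be sketched in Python as follows.  This is a
   minimal illustration under stated assumptions, not a normative
   implementation: the function name, the dictionary keys, and the
   representation of a tuple as (maximum total media bit rate in bps,
   overhead in bytes) are all hypothetical.  The sketch also assumes
   that at least one tuple is supplied and treats an overhead of zero
   as a line that never crosses the X-axis.

```python
def bounding_set(tuples, smaxpr=None):
    """Return the selected list Y for (max_total_br_bps, overhead_bytes) tuples."""

    def max_packet_rate(br, oh):
        # Equation (4): X-axis crossing, capped by SMAXPR when available.
        crossing = br / (8 * oh) if oh > 0 else float("inf")
        return crossing if smaxpr is None else min(crossing, smaxpr)

    # Steps 1-2: keep the lowest bit rate per overhead value, sort by overhead.
    lowest = {}
    for br, oh in tuples:
        if oh not in lowest or br < lowest[oh]:
            lowest[oh] = br
    candidates = sorted(lowest.items())            # list of (oh, br)

    # Step 3: lowest bit rate first; ties go to the highest overhead.
    first_oh, first_br = min(candidates, key=lambda t: (t[1], -t[0]))
    selected = [{"br": first_br, "oh": first_oh, "isect": 0.0,
                 "max_pr": max_packet_rate(first_br, first_oh)}]

    # Step 4: discard the selected tuple and all tuples with lower overhead.
    candidates = [(oh, br) for oh, br in candidates if oh > first_oh]

    for oh, br in candidates:                      # steps 5 and 9
        while True:
            last = selected[-1]
            # Step 6: intersection with the last selected line, equation (3).
            pr = (last["br"] - br) / (8 * (last["oh"] - oh))
            if pr <= last["isect"]:                # step 7: drop last, retry
                selected.pop()
            else:
                break
        if pr < last["max_pr"]:                    # step 8: candidate is bounding
            selected.append({"br": br, "oh": oh, "isect": pr,
                             "max_pr": max_packet_rate(br, oh)})
    return selected


# Tuples A and B from the earlier example, plus a tuple C parallel to A.
example = bounding_set([(35_000, 40), (40_000, 60), (45_000, 40)])
```

   With these inputs tuple C is discarded in step 2 and the selected
   list holds A followed by B.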

1256	Incremental Algorithm
1257	   The previous algorithm covered the initial case, where no selected
1258	   list had previously been created.  It also applied only to the media
1259	   sender.  When a previously-created selected list is available at
1260	   either the media sender or media receiver, two other cases can be
1261	   considered:

1263	        o when a TMMBR tuple not currently in the selected list is a
1264	          candidate for addition;

1266	        o when the values change in a TMMBR tuple currently in the
1267	          selected list.

1269	   At the media receiver these cases correspond respectively to those
1270	   of the non-owner and owner of a tuple in the TMMBN-reported bounding
1271	   set.

1273	   In either case, the process of updating the selected list to take
1274	   account of the new/changed tuple can use the basic algorithm
1275	   described above, with the modification that the initial candidate
1276	   set consists only of the existing selected list and the new or
1277	   changed tuple.  Some further optimization is possible (beyond
1278	   starting with a reduced candidate set) by taking advantage of the
1279	   following observations.

1281	   The first observation is that if the new/changed candidate becomes
1282	   part of the new selected list, the result may be to cause zero or
1283	   more other tuples to be dropped from the list.  However, if more than
1284	   one other tuple is dropped, the dropped tuples will be consecutive.
1285	   This can be confirmed geometrically by visualizing a new line that
1286	   cuts off a series of segments from the previously-existing bounding
1287	   polygon.  The cut-off segments are connected one to the next, the
1288	   geometric equivalent of consecutive tuples in a list ordered by
1289	   overhead value.  Beyond the dropped set in either direction all of
1290	   the tuples that were in the earlier selected list will be in the
1291	   updated one.  The second observation is that, leaving aside the new
1292	   candidate, the order of tuples remaining in the updated selected list
1293	   is unchanged because their overhead values have not changed.

1295	   The consequence of these two observations is that, once the placement
1296	   of the new candidate and the extent of the dropped set of tuples (if
1297	   any) are determined, the remaining tuples can be copied directly
1298	   from the candidate list into the selected list, preserving their
1299	   order.  This conclusion suggests the following modified algorithm:

1301	       o Run steps 1-4 of the basic algorithm.

1303	       o If the new candidate has survived steps 2 and 4 and has become
1304	          the new first member of the selected list, run steps 5-9 on
1305	          subsequent candidates until another candidate is added to the
1306	          selected list.  Then move all remaining candidates to the
1307	          selected list, preserving their order.

1309	       o If the new candidate has survived steps 2 and 4 and has not
1310	          become the new first member of the selected list, start by
1311	          moving all tuples in the candidate list with lower overhead
1312	          values than that of the new candidate to the selected list,
1313	          preserving their order.  Run steps 5 through 9 for the new
1314	          candidate, with the modification that the intersection values
1315	          and maximum packet rates for the tuples on the selected list
1316	          have to be calculated on the fly because they were not
1317	          previously stored.  Continue processing only until a
1318	          subsequent tuple has been added to the selected list, then
1319	          move all remaining candidates to the selected list, preserving
1320	          their order.

1322	          Note that the new candidate could be added to the selected
1323	          list only to be dropped again when the next tuple is
1324	          processed.  It can easily be seen that in this case the new
1325	          candidate does not displace any of the earlier tuples in the
1326	          selected list.  The limitations of ASCII art make this
1327	          difficult to show in a figure.  Line cc..c in Figure 1 would
1328	          be an example if it had a steeper slope (tuple C had a higher
1329	          overhead value), but still intersected line aa..a beyond where
1330	          line aa..a intersects line bb..b.

1332	   The algorithm just described is approximate, because it does not take
1333	   account of tuples outside the selected list.  To see how such tuples
1334	   can become relevant, consider Figure 1 and suppose that the maximum
1335	   total media bit rate in tuple A increases to the point that line
1336	   aa..a moves outside line cc..c.  Tuple A will remain in the bounding
1337	   set calculated by the media sender.  However, once it issues a new
1338	   TMMBN, media receiver C will apply the algorithm and discover that
1339	   its tuple C should now enter the bounding set.  It will issue a TMMBR
1340	   request to the media sender, which will repeat its calculation and
1341	   come to the appropriate conclusion.

1343	   The rules of section 4.2 require that the media sender refrain from
1344	   raising its sending rate until media receivers have had a chance to
1345	   respond to the TMMBN.  In the example just given, this delay ensures
1346	   that the relaxation of tuple A does not actually result in an attempt
1347	   to send media at a rate exceeding the capacity at C.

1349	3.5.4.3. Use of TMMBR in a Mixer Based Multipoint Operation

1351	   Assume a small mixer-based multiparty conference is ongoing, as
1352	   depicted in Topo-Mixer of [Topologies].  All participants have
1353	   negotiated a common maximum bit rate that this session can use.  The
1354	   conference operates over a number of unicast paths between the
1355	   participants and the mixer.  The congestion situation on each of
1356	   these paths can be monitored by the participant in question and by
1357	   the mixer, utilizing, for example, RTCP receiver reports (RR) or the
1358	   transport protocol, e.g. DCCP [RFC4340].  However, any given
1359	   participant has no knowledge of the congestion situation of the
1360	   connections to the other participants.  Worse, without mechanisms
1361	   similar to the ones discussed in this draft, the mixer (which is
1362	   aware of the congestion situation on all connections it manages) has
1363	   no standardized means to inform media senders to slow down, short of
1364	   forging its own receiver reports (which is undesirable).  In
1365	   principle, a mixer confronted with such a situation is obliged to
1366	   thin or transcode streams intended for connections that detected
1367	   congestion.

1369	   In practice, media-aware stream thinning is unfortunately a very
1370	   difficult and cumbersome operation and adds undesirable delay.  If
1371	   media-unaware, it leads very quickly to unacceptable reproduced media
1372	   quality.  Hence, a means to slow down senders even in the absence of
1373	   congestion on their connections to the mixer is desirable.

1375	   To allow the mixer to throttle traffic on the individual links,
1376	   without performing transcoding, there is a need for a mechanism that
1377	   enables the mixer to ask a participant's media encoders to limit the
1378	   media stream bit rate they are currently generating.  TMMBR provides
1379	   the required mechanism.  When the mixer detects congestion between
1380	   itself and a given participant, it executes the following procedure:

1382	   1. It starts thinning the media traffic to the congested participant
1383	      to the supported bit rate.

1385	   2. It uses TMMBR to request the media sender(s) to reduce the total
1386	      media bit rate sent by them to the mixer, to a value that is in
1387	      compliance with congestion control principles for the slowest
1388	      link.  Slow refers here to the available bandwidth / bit rate /
1389	      capacity and packet rate after congestion control.

1391	   3. As soon as the bit rate has been reduced by the sending party, the
1392	      mixer stops stream thinning implicitly, because there is no need
1393	      for it once the stream is in compliance with congestion control.

1395	   This use of stream thinning as an immediate reaction tool followed up
1396	   by a quick control mechanism appears to be a reasonable compromise
1397	   between media quality and the need to combat congestion.

1399	3.5.4.4. Use of TMMBR in Point-to-Multipoint Using Multicast or
1400	   Translators

1402	   In these topologies, corresponding to Topo-Multicast or Topo-
1403	   Translator, RTCP RRs are transmitted globally.  This allows all
1404	   participants to detect transmission problems such as congestion, on a
1405	   medium timescale.  As all media senders are aware of the congestion
1406	   situation of all media receivers, the rationale for the use of TMMBR
1407	   in the previous section does not apply.  However, even in this case
1408	   the congestion control response can be improved when the unicast
1409	   links are using congestion controlled transport protocols (such as
1410	   TCP or DCCP).  A peer may also report local limitations to the media
1411	   sender.

1413	3.5.4.5. Use of TMMBR in Point-to-point operation

1415	   In use case 7 it is possible to use TMMBR to improve the performance
1416	   when the known upper limit of the bit rate changes.  In this use case
1417	   the signaling protocol has established an upper limit for the session
1418	   and total media bit rates.  However, at the time of transport link
1419	   bit rate reduction, a receiver can avoid serious congestion by
1420	   sending a TMMBR to the sending side.  Thus TMMBR is useful for
1421	   putting restrictions on the application and thus placing the
1422	   congestion control mechanism in the right ballpark.  However TMMBR is
1423	   usually unable to provide the continuously quick feedback loop
1424	   required for real congestion control.  Nor do its semantics match
1425	   those of congestion control given its different purpose.  For these
1426	   reasons TMMBR SHALL NOT be used as a substitute for congestion
1427	   control.

1429	3.5.4.6. Reliability

1431	   The reaction of a media sender to the reception of a TMMBR message is
1432	   not immediately identifiable through inspection of the media stream.
1433	   Therefore, a more explicit mechanism is needed to avoid unnecessary
1434	   re-sending of TMMBR messages.  Using a statistically based
1435	   retransmission scheme would only provide statistical guarantees of
1436	   the request being received.  It would also not avoid the
1437	   retransmission of already received messages.  In addition, it would
1438	   not allow for easy suppression of other participants' requests.  For
1439	   these reasons, a mechanism based on explicit notification is used.

1441	   Upon the reception of a request a media sender sends a TMMBN
1442	   notification containing the current bounding set, and indicating
1443	   which session participants own that limit.  In multicast scenarios,
1444	   that allows all other participants to suppress any request they may
1445	   have, if their limitations are less strict than the current ones
1446	   (i.e. define lines lying outside the feasible region as defined in
1447	   section 2.2).  Keeping and notifying only the bounding set of tuples
1448	   allows for small message sizes and media sender states.  A media
1449	   sender only keeps state for the SSRCs of the current owners of the
1450	   bounding set of tuples; all other requests and their sources are not
1451	   saved.  Once the bounding set has been established, new TMMBR
1452	   messages should be generated only by owners of the bounding tuples
1453	   and by other entities that determine (by applying the algorithm of
1454	   section 3.5.4.2 or its equivalent) that their limitations should now
1455	   be part of the bounding set.

1457	4. RTCP Receiver Report Extensions

1459	   This memo specifies six new feedback messages.  The Full Intra
1460	   Request (FIR), Temporal-Spatial Trade-off Request (TSTR), Temporal-
1461	   Spatial Trade-off Notification (TSTN), and Video Back Channel Message
1462	   (VBCM) are "Payload Specific Feedback Messages" as defined in Section
1463	   6.3 of AVPF [RFC4585].  The Temporary Maximum Media Stream Bit Rate
1464	   Request (TMMBR) and Temporary Maximum Media Stream Bit Rate
1465	   Notification (TMMBN) are "Transport Layer Feedback Messages" as
1466	   defined in Section 6.2 of AVPF.

1468	   The new feedback messages are defined in the following subsections,
1469	   following a similar structure to that in sections 6.2 and 6.3 of the
1470	   AVPF specification [RFC4585].

1472	4.1. Design Principles of the Extension Mechanism

1474	   RTCP was originally introduced as a channel to convey presence,
1475	   reception quality statistics and hints on the desired media coding.
1476	   A limited set of media control mechanisms were introduced in early
1477	   RTP payload formats for video formats, for example in RFC 4587
1478	   [RFC4587].  However, this specification, for the first time, suggests
1479	   a two-way handshake for some of its messages.  There is danger that
1480	   this introduction could be misunderstood as a precedent for the use
1481	   of RTCP as an RTP session control protocol.  To prevent such a
1482	   misunderstanding, this subsection attempts to clarify the scope of
1483	   the extensions specified in this memo, and strongly suggests that
1484	   future extensions follow the rationale spelled out here, or
1485	   compellingly explain why they deviate from the rationale.

1487	   In this memo, and in AVPF [RFC4585], only such messages have been
1488	   included as:

1490	   a) have comparatively strict real-time constraints, which prevent the
1491	      use of mechanisms such as a SIP re-invite in most application
1492	      scenarios.  The real-time constraints are explained separately for
1493	      each message where necessary.

1495	   b) are multicast-safe in that the reaction to potentially
1496	      contradicting feedback messages is specified, as necessary for
1497	      each message; and

1499	   c) are directly related to activities of a certain media codec, class
1500	      of media codecs (e.g. video codecs), or a given RTP packet stream.

1502	   In this memo, a two-way handshake is introduced only for messages for
1503	   which:

1505	   a) a notification or acknowledgement is required due to their nature.
1506	      An analysis to determine whether this requirement exists has been
1507	      performed separately for each message.

1509	   b) the notification or acknowledgement cannot be easily derived from
1510	      the media bit stream.

1512	   All messages in AVPF [RFC4585] and in this memo present their
1513	   contents in a simple, fixed binary format.  This accommodates media
1514	   receivers which have not implemented higher control protocol
1515	   functionalities (SDP, XML parsers and such) in their media path.

1517	4.2. Transport Layer Feedback Messages

1519	   As specified in section 6.1 of RFC 4585 [RFC4585], Transport Layer
1520	   Feedback messages are identified by the RTCP packet type value RTPFB
1521	   (205).

1523	   In AVPF, one message of this category has been defined.  This memo
1524	   specifies two more such messages.  They are identified by means of
1525	   the FMT parameter as follows:

1527	   Assigned in AVPF [RFC4585]:

1529	      1:    Generic NACK
1530	      31:   reserved for future expansion of the identifier number space

1532	   Assigned in this memo:

1534	      2:    reserved (see note below)
1535	      3:    Temporary Maximum Media Stream Bit Rate Request (TMMBR)
1536	      4:    Temporary Maximum Media Stream Bit Rate Notification (TMMBN)

1538	          Note: early drafts of AVPF [RFC4585] reserved FMT=2 for a code
1539	          point that has later been removed.  It has been pointed out
1540	          that there may be implementations in the field using this
1541	          value in accordance with the expired draft.  As there is
1542	          sufficient numbering space available, we mark FMT=2 as
1543	          reserved so as to avoid possible interoperability problems
1544	          with any such early implementations.

1546	   Available for assignment:

1548	      0:    unassigned
1549	      5-30: unassigned

1551	   The following subsections define the formats of the FCI entries for
1552	   the TMMBR and TMMBN messages respectively and specify the associated
1553	   behaviour at the media sender and receiver.

1555	4.2.1. Temporary Maximum Media Stream Bit Rate Request (TMMBR)

1557	   The FCI field of a Temporary Maximum Media Stream Bit-Rate Request
1558	   (TMMBR) message SHALL contain one or more FCI entries.

1560	4.2.1.1. Message Format

1562	   The Feedback Control Information (FCI) consists of one or more TMMBR
1563	   FCI entries with the following syntax:

1565	    0                   1                   2                   3
1566	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1567	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1568	   |                              SSRC                             |
1569	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1570	   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1571	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1573	    Figure 2 - Syntax of an FCI entry in the TMMBR message

1575	     SSRC (32 bits): The SSRC value of the media sender that is
1576	              requested to obey the new maximum bit rate.

1578	     MxTBR Exp (6 bits): The exponential scaling of the mantissa for the
1579	              maximum total media bit rate value.  The value is an
1580	              unsigned integer [0..63].

1582	     MxTBR Mantissa (17 bits): The mantissa of the maximum total media
1583	              bit rate value as an unsigned integer.

1585	     Measured Overhead (9 bits): The measured average packet overhead
1586	              value in bytes.  The measurement SHALL be done according
1587	              to the description in section 4.2.1.2.  The value is an
1588	              unsigned integer [0..511].

1590	   The maximum total media bit rate (MxTBR) value in bits per second is
1591	   calculated from the MxTBR exponent (exp) and mantissa in the
1592	   following way:

1594	      MxTBR = mantissa * 2^exp

1596	   This allows for 17 bits of resolution in the range 0 to 131071*2^63
1597	   (approximately 1.2*10^24).

1599	   The length of the TMMBR feedback message SHALL be set to 2+2*N where
1600	   N is the number of TMMBR FCI entries.
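   The bit layout of Figure 2 can be illustrated with a short Python
   sketch that packs and unpacks a single FCI entry.  The function
   names are hypothetical.  Choosing the smallest exponent that lets
   the mantissa fit in 17 bits, and truncating rather than rounding so
   that the encoded limit never exceeds the intended one, are
   assumptions of this sketch and are not requirements stated in this
   memo.

```python
import struct


def encode_mxtbr(bps):
    """Return (exp, mantissa) with mantissa < 2**17 and mantissa * 2**exp <= bps."""
    exp = 0
    while (bps >> exp) >= (1 << 17):
        exp += 1
    return exp, bps >> exp


def pack_tmmbr_fci(ssrc, bps, overhead):
    """Pack one FCI entry: SSRC (32), MxTBR Exp (6), Mantissa (17), Overhead (9)."""
    if not 0 <= overhead <= 511:
        raise ValueError("overhead does not fit in 9 bits")
    exp, mantissa = encode_mxtbr(bps)
    if exp > 63:
        raise ValueError("bit rate does not fit in the MxTBR fields")
    word = (exp << 26) | (mantissa << 9) | overhead
    return struct.pack("!II", ssrc, word)      # network byte order


def unpack_tmmbr_fci(data):
    """Return (ssrc, mxtbr_bps, overhead_bytes) from an 8-byte FCI entry."""
    ssrc, word = struct.unpack("!II", data)
    exp = word >> 26
    mantissa = (word >> 9) & 0x1FFFF
    overhead = word & 0x1FF
    return ssrc, mantissa << exp, overhead     # MxTBR = mantissa * 2^exp


def tmmbr_length_field(n_entries):
    """Length of the TMMBR feedback message: 2 + 2*N."""
    return 2 + 2 * n_entries


entry = pack_tmmbr_fci(0x12345678, 35_000, 40)
decoded = unpack_tmmbr_fci(entry)
```

   Bit rates of 2^17 bps and above lose their low-order bits in this
   encoding; with truncation the decoded value is never higher than
   the value passed in.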

1602	4.2.1.2. Semantics

1604	Behaviour at the Media Receiver (Sender of the TMMBR)

1606	   TMMBR is used to indicate a transport related limitation at the
1607	   reporting entity acting as a media receiver.  TMMBR has the form of a
1608	   tuple containing two components.  The first value is the highest bit
1609	   rate per sender of a media stream, observed at a receiver-chosen
1610	   protocol layer, which the receiver currently supports in this RTP
1611	   session.  The second value is the measured header overhead in bytes
1612	   as defined in section 2.2 and measured at the chosen protocol layer
1613	   in the packets received for the stream.  The measurement of the
1614	   overhead is a running average that is updated for each packet
1615	   received for this particular media source (SSRC), using the following
1616	   formula:

1618	       avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH,

1620	   where avg_OH is the running (exponentially smoothed) average and
1621	   pckt_OH is the overhead observed in the latest packet.
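   The running average can be sketched in Python as follows.  The
   update rule is the formula above; the function names, the seeding
   of the average with an initial value of 40 bytes, and the rounding
   and clamping used to fit the result into the 9-bit Measured
   Overhead field are assumptions made for illustration only.

```python
def update_avg_overhead(avg_oh, pckt_oh):
    """avg_OH (new) = 15/16 * avg_OH (old) + 1/16 * pckt_OH."""
    return 15 / 16 * avg_oh + 1 / 16 * pckt_oh


def overhead_field(avg_oh):
    """Value for the 9-bit Measured Overhead field (assumed: round and clamp)."""
    return max(0, min(511, round(avg_oh)))


# Assumed starting point of 40 bytes, followed by three received packets.
avg = 40.0
for pckt_oh in (40, 56, 56):
    avg = update_avg_overhead(avg, pckt_oh)

field = overhead_field(avg)
```

   Because each packet contributes only 1/16 of its overhead, a step
   change in per-packet overhead is reflected gradually in the
   reported value.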

1623	   If a maximum bit rate has been negotiated through signaling, the
1624	   maximum total media bit rate that the receiver reports in a TMMBR
1625	   message MUST NOT exceed the negotiated value converted to a common
1626	   basis (i.e. with overheads adjusted to bring it to the same reference
1627	   protocol layer).

1629	   Within the common packet header for feedback messages (as defined in
1630	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
1631	   indicates the source of the request, and the "SSRC of media source"
1632	   is not used and SHALL be set to 0.  Within a particular TMMBR FCI
1633	   entry, the "SSRC of media sender" in the FCI field denotes the media
1634	   sender the tuple applies to.  This is useful in the multicast or
1635	   translator topologies where the reporting entity may address all of
1636	   the media senders in a single TMMBR message using multiple FCI
1637	   entries.

1639	   The media receiver SHALL save the contents of the latest TMMBN
1640	   message received from each media sender.

1642	   The media receiver MAY send a TMMBR FCI entry to a particular media
1643	   sender under the following circumstances:

1645	     o   before any TMMBN message has been received from that media
1646	          sender;

1648	     o   when the media receiver has been identified as the source of a
1649	          bounding tuple within the latest TMMBN message received from
1650	          that media sender, and the value of the maximum total media
1651	          bit rate or the overhead relating to that media sender has
1652	          changed;

1654	     o   when the media receiver has not been identified as the source
1655	          of a bounding tuple within the latest TMMBN message received
1656	          from that media sender, and, after the media receiver applies
1657	          the incremental algorithm from section 3.5.4.2 or a stricter
1658	          equivalent, the media receiver's tuple relating to that media
1659	          sender is determined to belong to the bounding set.

1661	   A TMMBR FCI entry MAY be repeated in subsequent TMMBR messages if no
1662	   Temporary Maximum Media Stream Bit-Rate Notification (TMMBN) FCI has
1663	   been received from the media sender at the time of transmission of
1664	   the next RTCP packet.  The bit rate value of a TMMBR FCI entry MAY be
1665	   changed from one TMMBR message to the next.  The overhead measurement
1666	   SHALL be updated to the current value of avg_OH each time the entry
1667	   is sent.

1669	   If the value set by a TMMBR message is expected to be permanent, the
1670	   TMMBR setting party SHOULD renegotiate the session parameters to
1671	   reflect that using session setup signaling, e.g. a SIP re-invite.

1673	Behaviour at the Media Sender (Receiver of the TMMBR)

1675	   When it receives a TMMBR message containing an FCI entry relating to
1676	   it, the media sender SHALL use an initial or incremental algorithm as
1677	   applicable to determine the bounding set of tuples based on the new
1678	   information.  The algorithm used SHALL be at least as strict as the
1679	   corresponding algorithm defined in section 3.5.4.2.  The media sender
1680	   MAY accumulate TMMBR requests over a small interval (relative to the
1681	   RTCP sending interval) before making this calculation.

1683	   Once it has determined the bounding set of tuples, the media sender
1684	   MAY use any combination of packet rate and net media bit rate within
1685	   the feasible region that these tuples describe to produce a lower
1686	   total media stream bit rate, as it may need to address a congestion
1687	   situation or other limiting factors.  See section 5 (congestion
1688	   control) for more discussion.

1691	   If the media sender concludes that it can increase the maximum total
1692	   media bit rate value, it SHALL wait before actually doing so, for a
1693	   period long enough to allow a media receiver to respond to the TMMBN
1694	   if it determines that its tuple belongs in the bounding set.  This
1695	   delay period is estimated by the formula:

1697	      2 * RTT + T_Dither_Max,

1699	   where RTT is the longest round trip time known to the media sender
1700	   and T_Dither_Max is defined in section 3.4 of [RFC4585].
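
   As a small worked sketch of this delay estimate (the function and
   parameter names are hypothetical, and the numeric inputs are
   example values only):

```python
def increase_hold_off_s(longest_rtt_s: float, t_dither_max_s: float) -> float:
    """Waiting period before raising the bit rate: 2 * RTT + T_Dither_Max."""
    return 2.0 * longest_rtt_s + t_dither_max_s


# Example values only: 250 ms longest known RTT, 500 ms T_Dither_Max.
delay_s = increase_hold_off_s(0.25, 0.5)
```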

1702	   A TMMBN message SHALL be sent by the media sender at the earliest
1703	   possible point in time, in response to any TMMBR messages received
1704	   since the last sending of TMMBN.  The TMMBN message indicates the
1705	   calculated set of bounding tuples and the owners of those tuples at
1706	   the time of the transmission of the message.

1708	   An SSRC may time out according to the default rules for RTP session
1709	   participants, i.e. the media sender has not received any RTP or RTCP
1710	   packets from the owner for the last five regular reporting intervals.
1711	   An SSRC may also explicitly leave the session, with the participant
1712	   indicating this through the transmission of an RTCP BYE packet or
1713	   using an external signaling channel.  If the media sender determines
1714	   that the owner of a tuple in the bounding set has left the session,
1715	   the media sender SHALL transmit a new TMMBN containing the
1716	   previously-determined set of bounding tuples but with the tuple
1717	   belonging to the departed owner removed.

1719	Discussion

1721	   Due to the unreliable nature of transport of TMMBR and TMMBN, the
1722	   above rules may lead to the sending of TMMBR messages which appear to
1723	   disobey those rules.  Furthermore, in multicast scenarios it can
1724	   happen that more than one "non-owning" session participant may
1725	   determine, rightly or wrongly, that its tuple belongs in the bounding
1726	   set.  This is not critical for a number of reasons:

1728	   a) If a TMMBR message is lost in transmission, either the media
1729	      sender sends a new TMMBN message in response to some other media
1730	      receiver or it does not send a new TMMBN message at all.  In the
1731	      first case, the media receiver applies the incremental algorithm
1732	      and, if it determines that its tuple should be part of the
1733	      bounding set, sends out another TMMBR.  In the second case, it
1734	      repeats the sending of a TMMBR unconditionally.  Either way, the
1735	      media sender eventually gets the information it needs.

1737	   b) Similarly, if a TMMBN message gets lost, the media receiver that
1738	      has sent the corresponding TMMBR request does not receive the
1739	      notification and is expected to re-send the request and trigger
1740	      the transmission of another TMMBN.

1742	   c) If multiple competing TMMBR messages are sent by different session
1743	      participants, then the algorithm can be applied taking all of
1744	      these messages into account, and the resulting TMMBN provides the
1745	      participants with an updated view of how their tuples compare with
1746	      the bounding set.

1748	   d) If more than one session participant happens to send TMMBR
1749	      messages at the same time and with the same tuple component
1750	      values, it does not matter which if either tuple is taken into the
1751	      bounding set.  The losing session participant will determine after
1752	      applying the algorithm that its tuple does not enter the bounding
1753	      set, and will therefore stop sending its TMMBR request.

1755	   It is important to consider the security risks involved with faked
1756	   TMMBRs.  See the security considerations in Section 6.

1759	   As indicated already, the feedback messages may be used in both
1760	   multicast and unicast sessions in any of the specified topologies.
1761	   However, for sessions with a large number of participants, using the
1762	   lowest common denominator, as required by this mechanism, may not be
1763	   the most suitable course of action.  Large sessions may need to
1764	   consider other ways to adapt the bit rate to participants'
1765	   capabilities, such as partitioning the session into different quality
1766	   tiers, or using some other method of achieving bit rate scalability.

1768	4.2.1.3. Timing Rules

1770	   The first transmission of the TMMBR request message MAY use early or
1771	   immediate feedback in cases when timeliness is desirable.  Any
1772	   repetition of a request message SHOULD use regular RTCP mode for its
1773	   transmission timing.

1775	4.2.1.4. Handling in Translators and Mixers

1777	   Media translators and mixers will need to receive and respond to
1778	   TMMBR messages as they are part of the chain that provides a certain
1779	   media stream to the receiver.  The mixer or translator may act
1780	   locally on the TMMBR request and thus generate a TMMBN to indicate
1781	   that it has done so.  Alternatively, in the case of a media
1782	   translator it can forward the request, or in the case of a mixer
1783	   generate one of its own and pass it forward.  In the latter case, the
1784	   mixer will need to send a TMMBN back to the original requestor to
1785	   indicate that it is handling the request.

1787	4.2.2. Temporary Maximum Media Stream Bit Rate Notification (TMMBN)

1789	   The FCI field of the TMMBN Feedback message may contain zero, one or
1790	   more TMMBN FCI entries.

1792	4.2.2.1. Message Format

1794	   The Feedback Control Information (FCI) consists of zero, one or more
1795	   TMMBN FCI entries with the following syntax:

1797	    0                   1                   2                   3
1798	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1799	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1800	   |                              SSRC                             |
1801	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1802	   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1803	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1805	    Figure 3 - Syntax of an FCI entry in the TMMBN message

1807	     SSRC (32 bits): The SSRC value of the "owner" of this tuple.

1809	     MxTBR Exp (6 bits): The exponential scaling of the mantissa for the
1810	              maximum total media bit rate value.  The value is an
1811	              unsigned integer [0..63].

1813	     MxTBR Mantissa (17 bits): The mantissa of the maximum total media
1814	              bit rate value as an unsigned integer.

1816	     Measured Overhead (9 bits): The measured average packet overhead
1817	              value in bytes represented as an unsigned integer.

1819	   Thus the FCI within the TMMBN message contains entries indicating the
1820	   bounding tuples.  For each tuple, the entry gives the owner by the
1821	   SSRC, followed by the applicable maximum total media bit rate and
1822	   overhead value.

1824	   The length of the TMMBN message SHALL be set to 2+2*N where N is the
1825	   number of TMMBN FCI entries.
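
   The entry layout of Figure 3 and the length rule can be sketched as
   below.  This is a non-normative illustration: the function names are
   hypothetical, and the relation "bit rate = mantissa * 2^exp" is
   assumed from the TMMBR field definitions rather than restated in
   this subsection.

```python
import struct


def pack_tmmbn_entry(ssrc: int, exp: int, mantissa: int, overhead: int) -> bytes:
    """Pack one 8-byte FCI entry: SSRC, then Exp(6) | Mantissa(17) | Overhead(9)."""
    if not 0 <= ssrc < 2**32:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= exp < 2**6:
        raise ValueError("MxTBR Exp must fit in 6 bits")
    if not 0 <= mantissa < 2**17:
        raise ValueError("MxTBR Mantissa must fit in 17 bits")
    if not 0 <= overhead < 2**9:
        raise ValueError("Measured Overhead must fit in 9 bits")
    word = (exp << 26) | (mantissa << 9) | overhead
    return struct.pack("!II", ssrc, word)  # network byte order


def unpack_tmmbn_entry(data: bytes) -> tuple:
    """Return (ssrc, exp, mantissa, overhead) from one 8-byte FCI entry."""
    ssrc, word = struct.unpack("!II", data)
    return ssrc, word >> 26, (word >> 9) & 0x1FFFF, word & 0x1FF


def tmmbn_length_field(num_entries: int) -> int:
    """Length field value for N FCI entries: 2 + 2*N (N may be zero)."""
    return 2 + 2 * num_entries


def max_total_bit_rate(exp: int, mantissa: int) -> int:
    """Assumed decoding of the bit rate: mantissa * 2^exp, in bits per second."""
    return mantissa << exp
```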

1827	4.2.2.2. Semantics

1829	   This feedback message is used to notify the senders of any TMMBR
1830	   message that one or more TMMBR messages have been received or that an
1831	   owner has left the session.  It indicates to all participants the
1832	   current set of bounding tuples and the "owners" of those tuples.

1834	   Within the common packet header for feedback messages (as defined in
1835	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
1836	   indicates the source of the notification.  The "SSRC of media source"
1837	   is not used and SHALL be set to 0.

1839	   A TMMBN message SHALL be scheduled for transmission after the
1840	   reception of a TMMBR message with an FCI entry identifying this media
1841	   sender.  Only a single TMMBN SHALL be sent, even if more than one
1842	   TMMBR message is received between the scheduling of the transmission
1843	   and the actual transmission of the TMMBN message.  The TMMBN message
1844	   indicates the bounding tuples and their owners at the time of
1845	   transmitting the message.  The bounding tuples included SHALL be the
1846	   set arrived at through application of the applicable algorithm of
1847	   section 3.5.4.2 or an equivalent, applied to the previous bounding
1848	   set if any and tuples received in TMMBR messages since the last TMMBN
1849	   was transmitted.

1851	   The reception of a TMMBR message SHALL still result in the
1852	   transmission of a TMMBN message even if, after application of the
1853	   algorithm, the newly reported TMMBR tuple is not accepted into the
1854	   bounding set.  In such a case the bounding tuples and their owners
1855	   are not changed, unless the TMMBR was from an owner of a tuple within
1856	   the previously calculated bounding set.  This procedure allows
1857	   session participants that did not see the last TMMBN message to get a
1858	   correct view of this media sender's state.

1860	   As indicated in section 4.2.1.2, when a
1861	   media sender determines that an "owner" of a bounding tuple has left
1862	   the session, then that tuple is removed from the bounding set, and
1863	   the media sender SHALL send a TMMBN message indicating the remaining
1864	   bounding tuples.  If there are no remaining bounding tuples a TMMBN
1865	   without any FCI SHALL be sent to indicate this.

1867	     Note: if any media receivers remain in the session, this last will
1868	     be a temporary situation.  The empty TMMBN will cause every
1869	     remaining media receiver to determine that its limitation belongs
1870	     in the bounding set and send a TMMBR in consequence.

1872	   In unicast scenarios (i.e. where a single sender talks to a single
1873	   receiver), the aforementioned algorithm to determine ownership
1874	   degenerates to the media receiver becoming the "owner" of the one
1875	   bounding tuple as soon as the media receiver has issued the first
1876	   TMMBR message.

1878	4.2.2.3. Timing Rules

1881	   The TMMBN acknowledgement SHOULD be sent as soon as allowed by the
1882	   applied timing rules for the session.  Immediate or early feedback
1883	   mode SHOULD be used for these messages.

1885	4.2.2.4. Handling by Translators and Mixers

1887	   As discussed in Section 4.2.1.4, mixers or translators may need to
1888	   issue TMMBN messages as responses to TMMBR messages for SSRCs
1889	   handled by them.

1891	4.3. Payload Specific Feedback Messages

1893	   As specified by section 6.1 of RFC 4585 [RFC4585], Payload-Specific
1894	   FB messages are identified by the RTCP packet type value PT=PSFB
1895	   (206).

1897	   AVPF [RFC4585] defines three payload-specific feedback messages and
1898	   one application layer feedback message.  This memo specifies four
1899	   additional payload-specific feedback messages.  All are identified by
1900	   means of the FMT parameter as follows:

1902	   Assigned in [RFC4585]:

1904	     1:     Picture Loss Indication (PLI)
1905	     2:     Slice Loss Indication (SLI)
1906	     3:     Reference Picture Selection Indication (RPSI)
1907	     15:    Application layer FB message
1908	     31:    reserved for future expansion of the number space

1910	   Assigned in this memo:

1912	     4:     Full Intra Request Command (FIR)
1913	     5:     Temporal-Spatial Trade-off Request (TSTR)
1914	     6:     Temporal-Spatial Trade-off Notification (TSTN)
1915	     7:     Video Back Channel Message (VBCM)

1917	   Unassigned:

1919	     0:     unassigned
1920	     8-14:  unassigned
1921	     16-30: unassigned

1923	   The following subsections define the new FCI formats for the payload-
1924	   specific feedback messages.

1926	4.3.1. Full Intra Request (FIR)

1928	   The FIR message is identified by RTCP packet type value PT=PSFB and
1929	   FMT=4.

1931	   The FCI field MUST contain one or more FIR entries.  Each entry
1932	   applies to a different media sender, identified by its SSRC.

1934	4.3.1.1. Message Format

1936	   The Feedback Control Information (FCI) for the Full Intra Request
1937	   consists of one or more FCI entries, the content of which is depicted
1938	   in Figure 4.  The length of the FIR feedback message MUST be set to
1939	   2+2*N, where N is the number of FCI entries.

1941	    0                   1                   2                   3
1942	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1943	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1944	   |                              SSRC                             |
1945	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1946	   | Seq. nr       |    Reserved                                   |
1947	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1949	    Figure 4 - Syntax of an FCI entry in the FIR message

1951	     SSRC (32 bits): The SSRC value of the media sender which is
1952	              requested to send a decoder refresh point.

1954	     Seq. nr (8 bits): Command sequence number.  The sequence number
1955	              space is unique for each pairing of the SSRC of command
1956	              source and the SSRC of the command target.  The sequence
1957	              number SHALL be increased by 1 modulo 256 for each new
1958	              command.  A repetition SHALL NOT increase the sequence
1959	              number.  The initial value is arbitrary.

1961	     Reserved (24 bits): All bits SHALL be set to 0 by the sender and
1962	              SHALL be ignored on reception.

1964	   The semantics of this feedback message is independent of the RTP
1965	   payload type.
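
   A complete FIR feedback message could be assembled along the
   following lines.  This is a sketch under stated assumptions: the
   helper names are hypothetical, padding is not used (P=0), and the
   common header layout (V=2, P, FMT, PT, length, two SSRC words) is
   taken from section 6.1 of [RFC4585].

```python
import struct

RTCP_PT_PSFB = 206  # payload-specific feedback
FMT_FIR = 4


def next_seq_nr(seq_nr: int) -> int:
    """Sequence number for a new command: increase by 1 modulo 256."""
    return (seq_nr + 1) % 256


def build_fir(sender_ssrc: int, targets: list) -> bytes:
    """Build a FIR message; targets is a list of (media_sender_ssrc, seq_nr)."""
    if not targets:
        raise ValueError("a FIR message needs at least one FCI entry")
    first_byte = (2 << 6) | FMT_FIR  # V=2, P=0, FMT=4
    length = 2 + 2 * len(targets)    # in 32-bit words minus one
    # "SSRC of media source" is unused for FIR and set to 0.
    header = struct.pack("!BBHII", first_byte, RTCP_PT_PSFB, length,
                         sender_ssrc, 0)
    fci = b""
    for ssrc, seq_nr in targets:
        if not 0 <= seq_nr <= 255:
            raise ValueError("Seq. nr must fit in 8 bits")
        # Seq. nr occupies the top 8 bits; the 24 reserved bits are zero.
        fci += struct.pack("!II", ssrc, seq_nr << 24)
    return header + fci
```

   A repetition of a command would reuse the same (SSRC, Seq. nr) pair
   unchanged; only a new command calls next_seq_nr().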

1967	4.3.1.2. Semantics

1969	   Upon reception of FIR, the encoder MUST send a decoder refresh point
1970	   (see section 2.2) as soon as possible.

1972	     Note: Currently, video appears to be the only useful application
1973	     for FIR, as it appears to be the only RTP payload widely deployed
1974	     that relies heavily on media prediction across RTP packet
1975	     boundaries.  However, use of FIR could also reasonably be
1976	     envisioned for other media types that share essential properties
1977	     with compressed video, namely cross-frame prediction (whatever a
1978	     frame may be for that media type).  One possible example may be the
1979	     dynamic updates of MPEG-4 scene descriptions.  It is suggested that
1980	     payload formats for such media types refer to FIR and other message
1981	     types defined in this specification and in AVPF [RFC4585], instead
1982	     of creating similar mechanisms in the payload specifications.  The
1983	     payload specifications may have to explain how the payload-specific
1984	     terminologies map to the video-centric terminology used herein.

1986	     Note: In environments where the sender has no control over the
1987	     codec (e.g. when streaming pre-recorded and pre-coded content), the
1988	     reaction to this command cannot be specified.  One suitable
1989	     reaction of a sender would be to skip forward in the video bit
1990	     stream to the next decoder refresh point.  In other scenarios, it
1991	     may be preferable not to react to the command at all, e.g. when
1992	     streaming to a large multicast group.  Other reactions may also be
1993	     possible.  When deciding on a strategy, a sender could take into
1994	     account factors such as the size of the receiving group, the
1995	     "importance" of the sender of the FIR message (however "importance"
1996	     may be defined in this specific application), the frequency of
1997	     decoder refresh points in the content, and so on.  However a
1998	     session which predominately handles pre-coded content is not
1999	     expected to use FIR at all.

2001	   The sender MUST consider congestion control as outlined in
2002	   section 5, which MAY restrict its ability to send a decoder refresh
2003	   point quickly.

2006	     Note: The relationship between the Picture Loss Indication and FIR
2007	     is as follows.  As discussed in section 6.3.1 of AVPF [RFC4585], a
2008	     Picture Loss Indication informs the encoder about the loss of a
2009	     picture and hence the likelihood of misalignment of the reference
2010	     pictures between the encoder and decoder.  Such a scenario is
2011	     normally related to losses in an ongoing connection.  In point-to-
2012	     point scenarios, and without the presence of advanced error
2013	     resilience tools, one possible option for an encoder consists in
2014	     sending a decoder refresh point.  However, there are other options.
2015	     One example is that the media sender ignores the PLI, because the
2016	     embedded stream redundancy is likely to clean up the reproduced
2017	     picture within a reasonable amount of time.  The FIR, in contrast,
2018	     leaves a (real-time) encoder no choice but to send a decoder
2019	     refresh point.  It does not allow the encoder to take into account
2020	     any considerations such as the ones mentioned above.

2022	     Note: Mandating a maximum delay for completing the sending of a
2023	     decoder refresh point would be desirable from an application
2024	     viewpoint, but is problematic from a congestion control point of
2025	     view.  "As soon as possible" as mentioned above appears to be a
2026	     reasonable compromise.

2028	   FIR SHALL NOT be sent as a reaction to picture losses -- it is
2029	   RECOMMENDED to use PLI instead.  FIR SHOULD be used only in
2030	   situations where not sending a decoder refresh point would render the
2031	   video unusable for the users.

2033	     Note: A typical example where sending FIR is appropriate is when,
2034	     in a multipoint conference, a new user joins the session and no
2035	     regular decoder refresh point interval is established.  Another
2036	     example would be a video switching MCU that changes streams.  Here,
2037	     normally, the MCU issues a FIR to the new sender so as to force it to
2038	     emit a decoder refresh point.  The decoder refresh point normally
2039	     includes a Freeze Picture Release (defined outside this
2040	     specification), which re-starts the rendering process of the
2041	     receivers.  Both techniques mentioned are commonly used in MCU-
2042	     based multipoint conferences.

2044	   Other RTP payload specifications such as RFC 4587 [RFC4587] already
2045	   define a feedback mechanism for certain codecs.  An application
2046	   supporting both schemes MUST use the feedback mechanism defined in
2047	   this specification when sending feedback.  For backward compatibility
2048	   reasons, such an application SHOULD also be capable of receiving and
2049	   reacting to the feedback scheme defined in the respective RTP payload
2050	   format, if this is required by that payload format.

2052	   Within the common packet header for feedback messages (as defined in
2053	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2054	   indicates the source of the request, and the "SSRC of media source"
2055	   is not used and SHALL be set to 0.  The SSRCs of the media senders to
2056	   which the FIR command applies are in the corresponding FCI entries.
2057	   A FIR message MAY contain requests to multiple media senders, using
2058	   one FCI entry per target media sender.

2060	4.3.1.3. Timing Rules

2062	   The timing follows the rules outlined in section 3 of [RFC4585].  FIR
2063	   commands MAY be used with early or immediate feedback.  The FIR
2064	   feedback message MAY be repeated.  If using immediate feedback mode
2065	   the repetition SHOULD wait at least one RTT before being sent.  In
2066	   early or regular RTCP mode the repetition is sent in the next regular
2067	   RTCP packet.

2069	4.3.1.4. Handling of FIR Message in Mixers and Translators

2071	   A media translator or a mixer performing media encoding of the
2072	   content for which the session participant has issued a FIR is
2073	   responsible for acting upon it.  A mixer acting upon a FIR SHOULD NOT
2074	   forward the message unaltered; instead it SHOULD issue a FIR itself.

2076	4.3.1.5. Remarks

2078	   In conjunction with video codecs, FIR messages typically trigger the
2079	   sending of full intra or IDR pictures.  Both are several times larger
2080	   than predicted (inter) pictures.  Their size is independent of the
2081	   time they are generated.  In most environments, especially when
2082	   employing bandwidth-limited links, the use of an intra picture
2083	   implies an allowed delay that is a significant multiple of the
2084	   typical frame duration.  An example: if the sending frame rate is 10
2085	   fps, and an intra picture is assumed to be 10 times as big as an
2086	   inter picture, then a full second of latency has to be accepted.  In
2087	   such an environment there is no need for a particularly short delay
2088	   in sending the FIR message.  Hence waiting for the next possible time
2089	   slot allowed by RTCP timing rules as per [RFC4585] should not have an
2090	   overly negative impact on the system performance.
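
   The arithmetic behind this example can be written out as a short
   sketch.  It assumes, as the example implicitly does, a constant bit
   rate link that carries exactly one inter picture per frame
   interval; the function name is hypothetical.

```python
def intra_picture_latency_s(frame_rate_fps: float,
                            intra_to_inter_size_ratio: float) -> float:
    """Time needed to send one intra picture over a link sized for inter pictures.

    One inter picture takes 1/frame_rate seconds to send, so an intra
    picture that is `ratio` times as big takes ratio/frame_rate seconds.
    """
    return intra_to_inter_size_ratio / frame_rate_fps


# The example from the text: 10 fps and a 10x intra picture.
latency_s = intra_picture_latency_s(10.0, 10.0)
```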

2092	4.3.2. Temporal-Spatial Trade-off Request (TSTR)

2094	   The TSTR feedback message is identified by RTCP packet type value
2095	   PT=PSFB and FMT=5.

2097	   The FCI field MUST contain one or more TSTR FCI entries.

2099	4.3.2.1. Message Format

2101	   The content of the FCI entry for the Temporal-Spatial Trade-off
2102	   Request is depicted in Figure 5.  The length of the feedback message
2103	   MUST be set to 2+2*N, where N is the number of FCI entries included.

2105	    0                   1                   2                   3
2106	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2107	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2108	   |                              SSRC                             |
2109	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2110	   |  Seq nr.      |  Reserved                           | Index   |
2111	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2113	    Figure 5 - Syntax of an FCI Entry in the TSTR Message

2115	     SSRC (32 bits): The SSRC of the media sender which is requested to
2116	              apply the tradeoff value given in Index.

2118	     Seq. nr (8 bits): Request sequence number.  The sequence number
2119	              space is unique for each pairing of the SSRC of request source
2120	              and the SSRC of the request target.  The sequence number
2121	              SHALL be increased by 1 modulo 256 for each new command.
2122	              A repetition SHALL NOT increase the sequence number.  The
2123	              initial value is arbitrary.

2125	     Reserved (19 bits): All bits SHALL be set to 0 by the sender and
2126	              SHALL be ignored on reception.

2128	     Index (5 bits): An integer value between 0 and 31 that indicates
2129	              the relative trade-off that is requested.  An index value
2130	              of 0 indicates the highest possible spatial quality, while
2131	              31 indicates the highest possible temporal resolution.
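
   The bit layout of Figure 5 can be sketched as follows.  The helper
   names are hypothetical and the sketch covers a single FCI entry
   only, not the common feedback message header.

```python
import struct


def pack_tstr_entry(ssrc: int, seq_nr: int, index: int) -> bytes:
    """Pack one 8-byte FCI entry: SSRC, then Seq nr(8) | Reserved(19) | Index(5)."""
    if not 0 <= ssrc < 2**32:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= seq_nr <= 255:
        raise ValueError("Seq nr must fit in 8 bits")
    if not 0 <= index <= 31:
        raise ValueError("Index must be between 0 and 31")
    word = (seq_nr << 24) | index  # the 19 reserved bits stay zero
    return struct.pack("!II", ssrc, word)


def unpack_tstr_entry(data: bytes) -> tuple:
    """Return (ssrc, seq_nr, index); reserved bits are ignored on reception."""
    ssrc, word = struct.unpack("!II", data)
    return ssrc, word >> 24, word & 0x1F
```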

2133	4.3.2.2. Semantics

2135	   A decoder can suggest a temporal-spatial trade-off level by sending a
2136	   TSTR message to an encoder.  If the encoder is capable of adjusting
2137	   its temporal-spatial trade-off, it SHOULD take into account the
2138	   received TSTR message for future coding of pictures.  A value of 0
2139	   suggests a high spatial quality and a value of 31 suggests a high
2140	   frame rate.  The progression of values from 0 to 31 indicates
2141	   monotonically a desire for higher frame rate.  The index values do
2142	   not correspond to precise values of spatial quality or frame rate.

2144	   The reaction to the reception of more than one TSTR message by a
2145	   media sender from different media receivers is left open to the
2146	   implementation.  The selected trade-off SHALL be communicated to the
2147	   media receivers by the means of the TSTN message.

2149	   Within the common packet header for feedback messages (as defined in
2150	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2151	   indicates the source of the request, and the "SSRC of media source"
2152	   is not used and SHALL be set to 0.  The SSRCs of the media senders to
2153	   which the TSTR applies are in the corresponding FCI entries.

2155	   A TSTR message MAY contain requests to multiple media senders, using
2156	   one FCI entry per target media sender.

2158	4.3.2.3. Timing Rules

2160	   The timing follows the rules outlined in section 3 of [RFC4585].
2161	   This request message is not time critical and SHOULD be sent using
2162	   regular RTCP timing.  Only if it is known that the user interface
2163	   requires a quick feedback, the message MAY be sent with early or
2164	   immediate feedback timing.

2166	4.3.2.4. Handling of message in Mixers and Translators

2168	   A mixer or media translator that encodes content sent to the session
2169	   participant issuing the TSTR SHALL consider the request to determine
2170	   if it can fulfill it by changing its own encoding parameters.  A
2171	   media translator unable to fulfill the request MAY forward the
2172	   request unaltered towards the media sender.  A mixer encoding for
2173	   multiple session participants will need to consider the joint needs
2174	   of these participants before generating a TSTR on its own behalf
2175	   towards the media sender.  See also the discussion in Section 3.5.2.

2177	4.3.2.5. Remarks

2179	   The term "spatial quality" does not necessarily refer to the
2180	   resolution, measured by the number of pixels the reconstructed video
2181	   is using.  In fact, in most scenarios the video resolution stays
2182	   constant during the lifetime of a session.  However, all video
2183	   compression standards have means to adjust the spatial quality at a
2184	   given resolution, often influenced by the Quantizer Parameter or QP.
2185	   A numerically low QP results in a good reconstructed picture quality,
2186	   whereas a numerically high QP yields a coarse picture.  The typical
2187	   reaction of an encoder to this request is to change its rate control
2188	   parameters to use a lower frame rate and a numerically lower (on
2189	   average) QP, or vice versa.  The precise mapping of Index value to
2190	   frame rate and QP is intentionally left open here, as it depends on
2191	   factors such as the compression standard employed, spatial
2192	   resolution, content, bit rate, and so on.

2194	4.3.3. Temporal-Spatial Trade-off Notification (TSTN)

2196	   The TSTN message is identified by RTCP packet type value PT=PSFB and
2197	   FMT=6.

2199	   The FCI field SHALL contain one or more TSTN FCI entries.

2201	4.3.3.1. Message Format

2203	   The content of an FCI entry for the Temporal-Spatial Trade-off
2204	   Notification is depicted in Figure 6.  The length of the TSTN message
2205	   MUST be set to 2+2*N, where N is the number of FCI entries.

2207	    0                   1                   2                   3
2208	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2209	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2210	   |                              SSRC                             |
2211	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2212	   |  Seq nr.      |  Reserved                           | Index   |
2213	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2215	    Figure 6 - Syntax of the TSTN

2217	     SSRC (32 bits): The SSRC of the source of the TSTR request which
2218	              resulted in this Notification.

2220	     Seq. nr (8 bits): The sequence number value from the TSTR request
2221	              that is being acknowledged.

2223	     Reserved (19 bits): All bits SHALL be set to 0 by the sender and
2224	              SHALL be ignored on reception.

2226	     Index (5 bits): The trade-off value the media sender is using
2227	              henceforth.

2229	      Informative note: The returned trade-off value (Index) may differ
2230	      from the requested one, for example in cases where a media encoder
2231	      cannot tune its trade-off, or when pre-recorded content is used.

2233	4.3.3.2. Semantics

2235	   This feedback message is used to acknowledge the reception of a TSTR.
2236	   One TSTN entry in a TSTN feedback message SHALL be sent for each TSTR
2237	   entry targeted to this session participant, i.e. each TSTR received
2238	   whose entry carries the receiving entity's SSRC in its SSRC field.

2240	   A single TSTN message MAY acknowledge multiple requests using
2241	   multiple FCI entries.  The index value included SHALL be the same in
2242	   all FCI entries of the TSTN message.  Including an FCI entry for each
2243	   requestor allows each requesting entity to determine that the media
2244	   sender received the request.  The Notification SHALL also be sent in
2245	   response to TSTR repetitions received.  If the request receiver has
2246	   received TSTR with several different sequence numbers from a single
2247	   requestor it SHALL only respond to the request with the highest
2248	   (modulo 256) sequence number.
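
Selecting the request with the highest (modulo 256) sequence number
requires a wrap-aware comparison.  The following Python sketch shows
one possible way to do this.  The half-window rule used here (a value
is newer if it is at most 127 steps ahead) is an assumption of this
example; the text above does not prescribe a particular comparison
method, and the function names are hypothetical.

```python
def seq_newer(a: int, b: int) -> bool:
    """True if a is ahead of b in the modulo-256 sequence space.

    Assumption: a is "newer" when it is 1..127 steps ahead of b.
    """
    return a != b and ((a - b) % 256) < 128

def latest_request(seq_numbers):
    """Return the newest sequence number seen from one requestor."""
    latest = None
    for seq in seq_numbers:
        if latest is None or seq_newer(seq, latest):
            latest = seq
    return latest
```

With this rule a request numbered 1 is treated as newer than one
numbered 255, so a wrap of the 8-bit counter does not cause a stale
request to be acknowledged.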

2250	   The TSTN SHALL include the Temporal-Spatial Trade-off index that will
2251	   be used as a result of the request.  This is not necessarily the same
2252	   index as requested, as the media sender may need to aggregate
2253	   requests from several requesting session participants.  It may also
2254	   have some other policies or rules that limit the selection.

2256	   Within the common packet header for feedback messages (as defined in
2257	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2258	   indicates the source of the Notification, and the "SSRC of media
2259	   source" is not used and SHALL be set to 0.  The SSRCs of the
2260	   requesting entities to which the Notification applies are in the
2261	   corresponding FCI entries.
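
As an illustration of the header rules above, the following Python
sketch assembles a complete TSTN feedback message.  It assumes the
common feedback header layout of [RFC4585] (V=2, P=0, 5-bit FMT, 8-bit
PT, 16-bit length), the RTCP packet type value 206 for PSFB, and the
FMT value 6 listed for TSTN in the IANA Considerations section.  The
function name is hypothetical.

```python
import struct

PSFB = 206      # RTCP packet type for payload-specific feedback
FMT_TSTN = 6    # FMT value requested for TSTN in this memo

def build_tstn_packet(sender_ssrc: int, index: int, acks) -> bytes:
    """Build one TSTN message.

    acks is a list of (requester_ssrc, seq_nr) pairs; one FCI entry is
    emitted per pair, all carrying the same Index value.
    """
    if not 0 <= index <= 0x1F:
        raise ValueError("Index must fit in 5 bits")
    fci = b"".join(
        struct.pack("!II", ssrc, ((seq_nr & 0xFF) << 24) | index)
        for ssrc, seq_nr in acks
    )
    n_entries = len(fci) // 8
    first_octet = (2 << 6) | FMT_TSTN          # V=2, P=0, FMT=6
    header = struct.pack(
        "!BBHII",
        first_octet,
        PSFB,
        2 + 2 * n_entries,                     # length = 2 + 2*N
        sender_ssrc,                           # SSRC of packet sender
        0,                                     # SSRC of media source: unused
    )
    return header + fci
```

The SSRCs of the requesting entities appear only in the FCI entries,
while the "SSRC of media source" field stays 0 as required.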

2263	4.3.3.3. Timing Rules

2265	   The timing follows the rules outlined in section 3 of [RFC4585].
2266	   This acknowledgement message is not extremely time critical and
2267	   SHOULD be sent using regular RTCP timing.

2269	4.3.3.4. Handling of TSTN in Mixers and Translators

2271	   A mixer or translator that acts upon a TSTR SHALL also send the
2272	   corresponding TSTN.  In cases where it needs to forward a TSTR itself
2273	   the notification message MAY need to be delayed until the TSTR has
2274	   been responded to.

2276	4.3.3.5. Remarks

2278	   None

2280	4.3.4. H.271 Video Back Channel Message (VBCM)

2282	   The VBCM is identified by RTCP packet type value PT=PSFB and FMT=7.

2284	   The FCI field MUST contain one or more VBCM FCI entries.

2286	4.3.4.1. Message Format

2289	   The syntax of an FCI entry within the VBCM indication is depicted in
2290	   Figure 7.

2292	    0                   1                   2                   3
2293	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2294	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2295	   |                              SSRC                             |
2296	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2297	   | Seq. nr       |0| Payload Type| Length                        |
2298	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2299	   |                    VBCM Octet String....      |    Padding    |
2300	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2302	   Figure 7 - Syntax of an FCI Entry in the VBCM Message

2304	   SSRC (32 bits): The SSRC value of the media sender that is requested
2305	          to instruct its encoder to react to the VBCM message.

2307	   Seq. nr (8 bits): Command sequence number.  The sequence number space
2308	          is unique for pairing of the SSRC of command source and the
2309	          SSRC of the command target.  The sequence number SHALL be
2310	          increased by 1 modulo 256 for each new command.  A repetition
2311	          SHALL NOT increase the sequence number.  The initial value is
2312	          arbitrary.

2314	   0: Must be set to 0 by the sender and should not be acted upon by the
2315	          message receiver.

2317	   Payload Type (7 bits): The RTP payload type for which the VBCM bit
2318	          stream must be interpreted.

2320	   Length (16 bits): The length of the VBCM octet string in octets,
2321	          exclusive of any padding octets.

2323	   VBCM Octet String (Variable length): This is the octet string
2324	          generated by the decoder carrying a specific feedback sub-
2325	          message.

2327	   Padding (Variable length): Bits set to 0 to make up a 32-bit
2328	          boundary.
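
The VBCM FCI entry layout in Figure 7, including the zero padding to a
32-bit boundary, can be sketched in Python as follows.  The function
names are hypothetical; the sketch assumes only the field widths
listed above and network byte order, and treats the H.271 octet
string as opaque.

```python
import struct

def pack_vbcm_entry(ssrc: int, seq_nr: int, payload_type: int,
                    octets: bytes) -> bytes:
    """Pack one VBCM FCI entry (hypothetical helper name)."""
    if not 0 <= ssrc <= 0xFFFFFFFF:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= seq_nr <= 0xFF:
        raise ValueError("Seq. nr must fit in 8 bits")
    if not 0 <= payload_type <= 0x7F:
        raise ValueError("RTP payload type must fit in 7 bits")
    if len(octets) > 0xFFFF:
        raise ValueError("VBCM octet string too long for Length field")
    # The bit in front of Payload Type is 0 because payload_type <= 0x7F.
    header = struct.pack("!IBBH", ssrc, seq_nr, payload_type, len(octets))
    padding = b"\x00" * (-len(octets) % 4)   # pad to a 32-bit boundary
    return header + octets + padding

def unpack_vbcm_entry(data: bytes):
    """Return (ssrc, seq_nr, payload_type, octets, consumed_octets)."""
    if len(data) < 8:
        raise ValueError("truncated VBCM FCI entry")
    ssrc, seq_nr, pt_octet, length = struct.unpack("!IBBH", data[:8])
    if len(data) < 8 + length:
        raise ValueError("truncated VBCM octet string")
    octets = data[8:8 + length]
    consumed = 8 + length + (-length % 4)    # skip padding as well
    return ssrc, seq_nr, pt_octet & 0x7F, octets, consumed
```

Returning the number of consumed octets lets a caller step through
several FCI entries carried in the same VBCM message.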

2330	4.3.4.2. Semantics
2331	   The "payload" of the VBCM indication carries different types of
2332	   codec-specific feedback information.  The type of feedback
2333	   information can be classified as a 'status report' (such as an
2334	   indication that a bit stream was received without errors, or that a
2335	   partial or complete picture or block was lost) or 'update requests'
2336	   (such as complete refresh of the bit stream).

2338	          Note: There are possible overlaps between the VBCM sub-
2339	          messages and CCM/AVPF feedback messages, such as FIR.  Please
2340	          see section 3.5.3 for further discussion.

2342	   The different types of feedback sub-messages carried in the VBCM are
2343	   indicated by the "payloadType" as defined in [VBCM].  These sub-
2344	   message types are reproduced below for convenience.  "payloadType",
2345	   in ITU-T Rec. H.271 terminology, refers to the sub-type of the H.271
2346	   message and should not be confused with an RTP payload type.

2348	   Payload          Message Content
2349	   Type
2350	   ---------------------------------------------------------------------
2351	   0      One or more pictures without detected bit stream error
2352	          mismatch
2353	   1      One or more pictures that are entirely or partially lost
2354	   2      A set of blocks of one picture that is entirely or partially
2355	          lost
2356	   3      CRC for one parameter set
2357	   4      CRC for all parameter sets of a certain type
2358	   5      A "reset" request indicating that the sender should completely
2359	          refresh the video bit stream as if no prior bit stream data
2360	          had been received
2361	   > 5    Reserved for future use by ITU-T

2363	   Table 2: H.271 message types ("payloadTypes")

2365	   The bit string or the "payload" of a VBCM message is of variable
2366	   length and is self-contained and coded in a variable length, binary
2367	   format.  The media sender necessarily has to be able to parse this
2368	   optimized binary format to make use of VBCM messages.

2370	   Each of the different types of sub-messages (indicated by
2371	   payloadType) may have different semantics depending on the codec
2372	   used.

2374	   Within the common packet header for feedback messages (as defined in
2375	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2376	   indicates the source of the request, and the "SSRC of media source"
2377	   is not used and SHALL be set to 0.  The SSRCs of the media senders to
2378	   which the VBCM message applies are in the corresponding FCI
2379	   entries.  The sender of the VBCM message MAY send H.271 messages to
2380	   multiple media senders and MAY send more than one H.271 message to
2381	   the same media sender within the same VBCM message.

2383	4.3.4.3. Timing Rules

2385	   The timing follows the rules outlined in section 3 of [RFC4585].  The
2386	   different sub-message types may have different properties with regard
2387	   to the timing of messages that should be used.  If several different
2388	   types are included in the same feedback packet then the requirements
2389	   for the sub-message type with the most stringent requirements should
2390	   be followed.

2392	4.3.4.4. Handling of message in Mixer or Translator

2394	   The handling of VBCM in a mixer or translator is sub-message type
2395	   dependent.

2397	4.3.4.5. Remarks

2399	   Please see section 3.5.3 for a discussion of the usage of H.271
2400	   messages and messages defined in AVPF [RFC4585] and this memo with
2401	   similar functionality.

2403	     Note: There has been some discussion whether the payload type field
2404	     in this message is needed.  It will be needed if there is
2405	     potentially more than one VBCM-capable RTP payload type in the same
2406	     session, and the semantics of a given VBCM message changes between
2407	     payload types.  For example, the picture identification mechanism
2408	     in messages of H.271 type 0 is fundamentally different between
2409	     H.263 and H.264 (although both use the same syntax).  Therefore,
2410	     the payload type field is justified here.  There was a further
2411	     comment that for TSTR and FIR such a need does not exist, because
2412	     the semantics of TSTR and FIR are either loosely enough defined, or
2413	     generic enough, to apply to all video payloads currently in
2414	     existence/envisioned.

2416	5. Congestion Control

2418	   The correct application of the AVPF [RFC4585] timing rules prevents
2419	   the network from being flooded by feedback messages.  Hence, assuming
2420	   a correct implementation and configuration, the RTCP channel cannot
2421	   break its bit rate commitment and introduce congestion.

2423	   The reception of some of the feedback messages modifies the behaviour
2424	   of the media senders or, more specifically, the media encoders.  This
2425	   modified behaviour MUST respect the bandwidth limits that the
2426	   application of congestion control provides.  For example, when a
2427	   media sender is reacting to a FIR, the unusually high number of
2428	   packets that form the decoder refresh point have to be paced in
2429	   compliance with the congestion control algorithm, even if the user
2430	   experience suffers from a slowly transmitted decoder refresh point.

2432	   A change of the Temporary Maximum Media Stream Bit Rate value can
2433	   only mitigate congestion, but not cause congestion as long as
2434	   congestion control is also employed.  An increase of the value by a
2435	   request REQUIRES the media sender to use congestion control when
2436	   increasing its transmission rate to that value.  A reduction of the
2437	   value results in a reduced transmission bit rate thus reducing the
2438	   risk for congestion.

2440	6. Security Considerations

2442	   The defined messages have certain properties that have security
2443	   implications.  These must be addressed and taken into account by
2444	   users of this protocol.

2446	   The defined setup signaling mechanism is sensitive to modification
2447	   attacks that can result in session creation with sub-optimal
2448	   configuration, and, in the worst case, session rejection.  To prevent
2449	   this type of attack, authentication and integrity protection of the
2450	   setup signaling is required.

2452	   Spoofed or maliciously created feedback messages of the type defined
2453	   in this specification can have the following implications:

2455	        a. severely reduced media bit rate due to false TMMBR messages
2456	           that set the maximum to a very low value;

2458	        b. assignment of the ownership of a bounding tuple to the wrong
2459	           participant within a TMMBN message, potentially causing
2460	           unnecessary oscillation in the bounding set as the mistakenly
2461	           identified owner reports a change in its tuple and the true
2462	           owner possibly holds back on changes until a correct TMMBN
2463	           message reaches the participants;

2465	        c. sending TSTR requests that result in a video quality
2466	           different from the user's desire, rendering the session less
2467	           useful;

2469	        d. frequent FIR commands that potentially reduce the frame rate,
2470	           making the video jerky, due to the frequent usage of decoder
2471	           refresh points.

2473	   To prevent these attacks there is a need to apply authentication and
2474	   integrity protection of the feedback messages.  This can be
2475	   accomplished against threats external to the current RTP session
2476	   using the RTP profile that combines SRTP [SRTP] and AVPF into SAVPF
2477	   [SAVPF].  In the mixer cases, separate security contexts and
2478	   filtering can be applied between the mixer and the participants thus
2479	   protecting other users on the mixer from a misbehaving participant.

2481	7. SDP Definitions

2483	   Section 4 of [RFC4585] defines a new SDP [RFC4566] attribute, rtcp-
2484	   fb, that may be used to negotiate the capability to handle specific
2485	   AVPF commands and indications, such as Reference Picture Selection,
2486	   Picture Loss Indication etc.  The ABNF for rtcp-fb is described in
2487	   section 4.2 of [RFC4585].  In this section we extend the rtcp-fb
2488	   attribute to include the commands and indications that are described
2489	   for codec control protocol in the present document.  We also discuss
2490	   the Offer/Answer implications for the codec control commands and
2491	   indications.

2493	7.1. Extension of the rtcp-fb Attribute

2495	   As described in AVPF [RFC4585], the rtcp-fb attribute indicates the
2496	   capability of using RTCP feedback.  AVPF specifies that the rtcp-fb
2497	   attribute must only be used as a media level attribute and must not
2498	   be provided at session level.  All the rules described in [RFC4585]
2499	   for rtcp-fb attribute relating to payload type and to multiple rtcp-
2500	   fb attributes in a session description also apply to the new feedback
2501	   messages defined in this memo.

2503	   The ABNF [RFC4234] for rtcp-fb as defined in [RFC4585] is

2505	     "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF

2507	   where rtcp-fb-pt is the payload type and rtcp-fb-val defines the type
2508	   of the feedback message such as ack, nack, trr-int and rtcp-fb-id.
2509	   For example, to indicate the support of feedback of picture loss
2510	   indication, the sender declares the following in SDP:

2512	         v=0
2513	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2514	         s=Media with feedback
2515	         t=0 0
2516	         c=IN IP4 host.example.com
2517	         m=video 49170 RTP/AVPF 98
2518	         a=rtpmap:98 H263-1998/90000
2519	         a=rtcp-fb:98 nack pli

2521	   In this document we define a new feedback value "ccm" which indicates
2522	   the support of codec control using RTCP feedback messages.  The "ccm"
2523	   feedback value SHOULD be used with parameters, which indicate the
2524	   specific codec control commands supported.  In this draft we define
2525	   four parameters, which can be used with the ccm feedback value type.

2527	      o  "fir" indicates the support of the Full Intra Request (FIR).
2528	      o  "tmmbr" indicates the support of the Temporary Maximum Media
2529	         Stream Bit Rate Request/Notification (TMMBR/TMMBN).  It has an
2530	         optional sub parameter to indicate the session maximum packet
2531	         rate to be used.  If not included this defaults to infinity.
2532	      o  "tstr" indicates the support of the Temporal-Spatial Trade-off
2533	         Request/Notification (TSTR/TSTN).
2534	      o  "vbcm" indicates the support of H.271 video back channel
2535	         messages (VBCM).  It has zero or more subparameters identifying
2536	         the supported H.271 "payloadType" values.

2538	   In the ABNF for rtcp-fb-val defined in [RFC4585], there is a
2539	   placeholder called rtcp-fb-id to define new feedback types.  "ccm" is
2540	   defined as a new feedback type in this document and the ABNF for the
2541	   parameters for ccm are defined here (please refer to section 4.2 of
2542	   [RFC4585] for complete ABNF syntax).

2544	   rtcp-fb-param = SP "app" [SP byte-string]
2545	                 / SP rtcp-fb-ccm-param
2546	                 /     ; empty

2548	   rtcp-fb-ccm-param = "ccm" SP ccm-param

2550	   ccm-param  = "fir"   ; Full Intra Request
2551	              / "tmmbr" [SP "smaxpr=" MaxPacketRateValue]
2552	                        ; Temporary max media bit rate
2553	              / "tstr"  ; Temporal Spatial Trade Off
2554	              / "vbcm" *(SP subMessageType) ; H.271 VBCM messages
2555	              / token [SP byte-string]
2556	                         ; for future commands/indications
2557	   subMessageType = 1*8DIGIT
2558	   byte-string = <as defined in section 4.2 of [RFC4585] >
2559	   MaxPacketRateValue = 1*15DIGIT
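
To illustrate how the ccm-param grammar above maps onto concrete
attribute lines, the following Python sketch parses a single rtcp-fb
attribute.  It is a simplified reading of the ABNF, not a complete
SDP parser: the function name and the returned dictionary keys are
hypothetical, and tokens reserved for future commands are passed
through without validation.

```python
def parse_ccm(attr_line: str):
    """Parse 'a=rtcp-fb:<pt> ccm <param> ...'.

    Returns (pt, param, args), or None if the line is not a ccm
    rtcp-fb attribute.  Raises ValueError on malformed parameters.
    """
    prefix = "a=rtcp-fb:"
    if not attr_line.startswith(prefix):
        return None
    fields = attr_line[len(prefix):].split()
    if len(fields) < 3 or fields[1] != "ccm":
        return None
    pt, param, rest = fields[0], fields[2], fields[3:]

    def is_digits(text: str, max_len: int) -> bool:
        return text.isascii() and text.isdigit() and len(text) <= max_len

    if param in ("fir", "tstr"):
        if rest:
            raise ValueError(f"{param} takes no sub-parameters")
        return pt, param, {}
    if param == "tmmbr":
        smaxpr = None                      # absent: no packet rate limit
        if rest:
            key = "smaxpr="
            if len(rest) != 1 or not rest[0].startswith(key):
                raise ValueError("expected a single smaxpr=<value>")
            value = rest[0][len(key):]
            if not is_digits(value, 15):   # MaxPacketRateValue = 1*15DIGIT
                raise ValueError("bad smaxpr value")
            smaxpr = int(value)
        return pt, param, {"smaxpr": smaxpr}
    if param == "vbcm":
        for item in rest:
            if not is_digits(item, 8):     # subMessageType = 1*8DIGIT
                raise ValueError("bad H.271 sub-message type")
        return pt, param, {"sub_message_types": [int(i) for i in rest]}
    # token [SP byte-string]: reserved for future commands/indications
    return pt, param, {"raw": rest}
```

For instance, parsing "a=rtcp-fb:98 ccm vbcm 1 2" with this sketch
yields the payload type "98", the parameter "vbcm", and the H.271
sub-message types 1 and 2.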

2561	7.2. Offer-Answer

2563	   The Offer/Answer [RFC3264] implications for codec control protocol
2564	   feedback messages are similar to those described in [RFC4585].  The
2565	   offerer MAY indicate the capability to support selected codec
2566	   commands and indications.  The answerer MUST remove all ccm
2567	   parameters which it does not understand or does not wish to use in
2568	   this particular media session.  The answerer MUST NOT add new ccm
2569	   parameters in addition to what has been offered.  The answer is
2570	   binding for the media session and both offerer and answerer MUST only
2571	   use feedback messages negotiated in this way.

2573	   The session maximum packet rate parameter part of the TMMBR
2574	   indication is declarative and everyone shall use the highest value
2575	   indicated in a response.  If the session maximum packet rate
2576	   parameter is not present in an offer it SHALL NOT be included by the
2577	   answerer.
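
The answerer rules above (remove what is not understood or wanted,
never add anything) can be sketched in Python as follows.  The data
model is an assumption of this example: each side is a dictionary
from ccm parameter name to a collection of H.271 sub-message types,
which is only meaningful for "vbcm".  Handling of the smaxpr
sub-parameter is not modelled, and dropping "vbcm" entirely when no
offered sub-message type is supported is a choice made for this
sketch, not a rule stated in this memo.

```python
def build_ccm_answer(offered: dict, supported: dict) -> dict:
    """Return the ccm parameters to place in the answer.

    Only names present in the offer are considered, so the answer can
    never contain a parameter that was not offered.
    """
    answer = {}
    for name, offered_types in offered.items():
        if name not in supported:
            continue                      # removed by the answerer
        if name == "vbcm" and offered_types:
            kept = [t for t in offered_types if t in supported["vbcm"]]
            if not kept:
                continue                  # assumption: drop vbcm entirely
            answer[name] = kept
        else:
            answer[name] = []
    return answer
```

Applied to an offer of "tstr", "fir" and "tmmbr" by an answerer that
supports only "fir" and "tstr", the sketch keeps "tstr" and "fir" and
removes "tmmbr".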

2579	7.3. Examples

2581	   Example 1: The following SDP describes a point-to-point video call
2582	   with H.263, with the originator of the call declaring its capability
2583	   to support the FIR and TSTR/TSTN codec control messages.  The SDP is
2584	   carried in a high level signaling protocol like SIP.

2586	         v=0
2587	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2588	         s=Point-to-Point call
2589	         c=IN IP4 192.0.2.124
2590	         m=audio 49170 RTP/AVP 0
2591	         a=rtpmap:0 PCMU/8000
2592	         m=video 51372 RTP/AVPF 98
2593	         a=rtpmap:98 H263-1998/90000
2594	         a=rtcp-fb:98 ccm tstr
2595	         a=rtcp-fb:98 ccm fir

2597	   In the above example, when the sender receives a TSTR message from
2598	   the remote party it is capable of adjusting the trade off as
2599	   indicated in the RTCP TSTN feedback message.

2601	   Example 2: The following SDP describes a SIP end point joining a
2602	   video mixer that is hosting a multiparty video conferencing session.
2603	   The participant supports only the FIR (Full Intra Request) codec
2604	   control command and it declares it in its session description.

2606	         v=0
2607	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2608	         s=Multiparty Video Call
2609	         c=IN IP4 192.0.2.124
2610	         m=audio 49170 RTP/AVP 0
2611	         a=rtpmap:0 PCMU/8000
2612	         m=video 51372 RTP/AVPF 98
2613	         a=rtpmap:98 H263-1998/90000
2614	         a=rtcp-fb:98 ccm fir

2616	   When the video MCU decides to route the video of this participant it
2617	   sends an RTCP FIR feedback message.  Upon receiving this feedback
2618	   message the end point is required to send a decoder refresh point.

2620	   Example 3: The following example describes the Offer/Answer
2621	   implications for the codec control messages.  The Offerer wishes to
2622	   support "tstr", "fir" and "tmmbr".  The offered SDP is

2624	   -------------> Offer
2625	         v=0
2626	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2627	         s=Offer/Answer
2628	         c=IN IP4 192.0.2.124
2629	         m=audio 49170 RTP/AVP 0
2630	         a=rtpmap:0 PCMU/8000
2631	         m=video 51372 RTP/AVPF 98
2632	         a=rtpmap:98 H263-1998/90000
2633	         a=rtcp-fb:98 ccm tstr
2634	         a=rtcp-fb:98 ccm fir
2635	         a=rtcp-fb:* ccm tmmbr smaxpr=120

2637	   The answerer wishes to support only the FIR and TSTR/TSTN messages
2638	   and the answerer SDP is

2640	   <---------------- Answer

2642	         v=0
2643	         o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2644	         s=Offer/Answer
2645	         c=IN IP4 192.0.2.37
2646	         m=audio 47190 RTP/AVP 0
2647	         a=rtpmap:0 PCMU/8000
2648	         m=video 53273 RTP/AVPF 98
2649	         a=rtpmap:98 H263-1998/90000
2650	         a=rtcp-fb:98 ccm tstr
2651	         a=rtcp-fb:98 ccm fir

2653	   Example 4: The following example describes the Offer/Answer
2654	   implications for H.271 Video back channel messages (VBCM).  The
2655	   Offerer wishes to support VBCM and the sub-messages of payloadType 1
2656	   (one or more pictures that are entirely or partially lost) and 2 (a
2657	   set of blocks of one picture that are entirely or partially lost).

2659	   -------------> Offer
2660	         v=0
2661	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2662	         s=Offer/Answer
2663	         c=IN IP4 192.0.2.124
2664	         m=audio 49170 RTP/AVP 0
2665	         a=rtpmap:0 PCMU/8000
2666	         m=video 51372 RTP/AVPF 98
2667	         a=rtpmap:98 H263-1998/90000
2668	         a=rtcp-fb:98 ccm vbcm 1 2

2670	   The answerer wishes to support sub-messages of type 1 only:

2672	   <---------------- Answer

2674	         v=0
2675	         o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2676	         s=Offer/Answer
2677	         c=IN IP4 192.0.2.37
2678	         m=audio 47190 RTP/AVP 0
2679	         a=rtpmap:0 PCMU/8000
2680	         m=video 53273 RTP/AVPF 98
2681	         a=rtpmap:98 H263-1998/90000
2682	         a=rtcp-fb:98 ccm vbcm 1

2684	   So in the above example only VBCM indications comprised of
2685	   "payloadType" 1 will be supported.

2687	8. IANA Considerations

2689	   The new value "ccm" needs to be registered with IANA in the "rtcp-fb"
2690	   Attribute Values registry located at the time of publication at:
2691	   http://www.iana.org/assignments/sdp-parameters

2693	   Value name:       ccm
2694	   Long Name:        Codec Control Commands and Indications
2695	   Reference:        RFC XXXX

2697	   A new registry "Codec Control Messages" needs to be created to hold
2698	   "ccm" parameters located at time of publication at:
2699	   http://www.iana.org/assignments/sdp-parameters

2701	   New registrations in this registry follow the "Specification
2702	   required" policy as defined by [RFC2434].  In addition, they are
2703	   required to indicate any additional RTCP feedback types, such as
2704	   "nack" and "ack".

2706	   The initial content of the registry is the following values:

2708	   Value name:       fir
2709	   Long name:        Full Intra Request Command
2710	   Usable with:      ccm
2711	   Reference:        RFC XXXX

2713	   Value name:       tmmbr
2714	   Long name:        Temporary Maximum Media Stream Bit Rate
2715	   Usable with:      ccm
2716	   Reference:        RFC XXXX

2718	   Value name:       tstr
2719	   Long name:        Temporal Spatial Trade Off
2720	   Usable with:      ccm
2721	   Reference:        RFC XXXX

2723	   Value name:       vbcm
2724	   Long name:        H.271 video back channel messages
2725	   Usable with:      ccm
2726	   Reference:        RFC XXXX

2728	   The following values need to be registered as FMT values in the "FMT
2729	   Values for RTPFB Payload Types" registry located at the time of
2730	   publication at: http://www.iana.org/assignments/rtp-parameters

2732	   RTPFB range
2733	   Name           Long Name                         Value  Reference
2734	   -------------- --------------------------------- -----  ---------
2735	                  Reserved                             2   [RFCxxxx]
2736	   TMMBR          Temporary Maximum Media Stream Bit   3   [RFCxxxx]
2737	                  Rate Request
2738	   TMMBN          Temporary Maximum Media Stream Bit   4   [RFCxxxx]
2739	                  Rate Notification

2741	   The following values need to be registered as FMT values in the "FMT
2742	   Values for PSFB Payload Types" registry located at the time of
2743	   publication at: http://www.iana.org/assignments/rtp-parameters

2745	   PSFB range
2746	   Name           Long Name                             Value  Reference
2747	   -------------- ---------------------------------     -----  ---------
2748	   FIR            Full Intra Request Command              4    [RFCxxxx]
2749	   TSTR           Temporal-Spatial Trade-off Request      5    [RFCxxxx]
2750	   TSTN           Temporal-Spatial Trade-off Notification 6    [RFCxxxx]
2751	   VBCM           Video Back Channel Message              7    [RFCxxxx]

2753	9. Contributors

2755	   Tom Taylor has made a very significant contribution, for which the
2756	   authors are very grateful, to this specification by helping rewrite
2757	   the specification.  In particular, the parts regarding the algorithm
2758	   for determining bounding sets for TMMBR have benefited.

2760	10.  Acknowledgements

2762	   The authors would like to thank Andrea Basso, Orit Levin, and Nermeen
2763	   Ismail for their work on the requirement and discussion draft
2764	   [Basso].

2766	   Drafts of this memo were reviewed and extensively commented on by
2767	   Roni Even, Colin Perkins, Randell Jesup, Keith Lantz, Harikishan
2768	   Desineni, Guido Franceschini and others.  The authors appreciate
2769	   these reviews.

2770	   Funding for the RFC Editor function is currently provided by the
2771	   Internet Society.

2773	11.  References

2775	11.1. Normative references

2777	   [RFC4585]    Ott, J., Wenger, S., Sato, N., Burmeister, C., Rey, J.,
2778	                "Extended RTP Profile for Real-Time Transport Control
2779	                Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
2780	                July 2006.
2781	   [RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
2782	                Requirement Levels", BCP 14, RFC 2119, March 1997.
2783	   [RFC3550]    Schulzrinne, H.,  Casner, S., Frederick, R., and V.
2784	                Jacobson, "RTP: A Transport Protocol for Real-Time
2785	                Applications", STD 64, RFC 3550, July 2003.
2786	   [RFC4566]    Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
2787	                Description Protocol", RFC 4566, July 2006.
2788	   [RFC3264]    Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
2789	                with Session Description Protocol (SDP)", RFC 3264, June
2790	                2002.
2791	   [Topologies] M. Westerlund, and S. Wenger, "RTP Topologies", draft-
2792	                ietf-avt-topologies-04, work in progress, Feb 2007.
2793	   [RFC2434]    Narten, T. and H. Alvestrand, "Guidelines for Writing an
2794	                IANA Considerations Section in RFCs", BCP 26, RFC 2434,
2795	                October 1998.
2796	   [RFC4234]    Crocker, D. and P. Overell, "Augmented BNF for Syntax
2797	                Specifications: ABNF", RFC 4234, October 2005.

2799	11.2. Informative references

2801	   [Basso]      A. Basso, et. al., "Requirements for transport of video
2802	                control commands", draft-basso-avt-videoconreq-02.txt,
2803	                expired Internet Draft, October 2004.
2804	   [AVC]        Joint Video Team of ITU-T and ISO/IEC JTC 1, Draft ITU-T
2805	                Recommendation and Final Draft International Standard of
2806	                Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC
2807	                14496-10 AVC), Joint Video Team (JVT) of ISO/IEC MPEG
2808	                and ITU-T VCEG, JVT-G050, March 2003.
2809	   [H245]       ITU-T Rec. H.245, "Control protocol for multimedia
2810	                communication", May 2006.
2811	   [NEWPRED]    S. Fukunaga, T. Nakai, and H. Inoue, "Error Resilient
2812	                Video Coding by Dynamic Replacing of Reference
2813	                Pictures," in Proc. Globecom'96, vol. 3, pp. 1503 - 1508,
2814	                1996.
2815	   [SRTP]       Baugher, M., McGrew, D., Naslund, M., Carrara, E., and
2816	                K. Norrman, "The Secure Real-time Transport Protocol
2817	                (SRTP)", RFC 3711, March 2004.
2818	   [RFC4587]    Even, R., "RTP Payload Format for H.261 Video Streams",
2819	                RFC 4587, August 2006.

2821	   [SAVPF]      J. Ott, E. Carrara, "Extended Secure RTP Profile for
2822	                RTCP-based Feedback (RTP/SAVPF),"
2823	                draft-ietf-avt-profile-savpf-10.txt, Feb, 2007.
2824	   [RFC3525]    Groves, C., Pantaleo, M., Anderson, T., and T. Taylor,
2825	                "Gateway Control Protocol Version 1", RFC 3525, June
2826	                2003.
2827	   [RFC3448]    M. Handley, S. Floyd, J. Padhye, J. Widmer, "TCP
2828	                Friendly Rate Control (TFRC): Protocol Specification",
2829	                RFC 3448, Jan 2003.
2830	   [VBCM]       ITU-T Rec. H.271, "Video Back Channel Messages", June
2831	                2006.
2832	   [RFC3890]    Westerlund, M., "A Transport Independent Bandwidth
2833	                Modifier for the Session Description Protocol (SDP)",
2834	                RFC 3890, September 2004.
2835	   [RFC4340]    Kohler, E., Handley, M., and S. Floyd, "Datagram
2836	                Congestion Control Protocol (DCCP)", RFC 4340, March
2837	                2006.
2838	   [RFC3261]    Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
2839	                A., Peterson, J., Sparks, R., Handley, M., and E.
2840	                Schooler, "SIP: Session Initiation Protocol", RFC 3261,
2841	                June 2002.
2842	   [RFC2198]    Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
2843	                Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
2844	                Parisis, "RTP Payload for Redundant Audio Data", RFC
2845	                2198, September 1997.

2847	12.  Authors' Addresses

2849	   Stephan Wenger
2850	   Nokia Corporation
2851	   975, Page Mill Road,
2852	   Palo Alto,CA 94304
2853	   USA

2855	   Phone: +1-650-862-7368
2856	   EMail: stewe@stewe.org

2858	   Umesh Chandra
2859	   Nokia Research Center
2860	   975, Page Mill Road,
2861	   Palo Alto,CA 94304
2862	   USA

2864	   Phone: +1-650-796-7502
2865	   EMail: Umesh.Chandra@nokia.com

2866	   Magnus Westerlund
2867	   Ericsson Research
2868	   Ericsson AB
2869	   SE-164 80 Stockholm, SWEDEN

2871	   Phone: +46 8 7190000
2872	   EMail: magnus.westerlund@ericsson.com

2874	   Bo Burman
2875	   Ericsson Research
2876	   Ericsson AB
2877	   SE-164 80 Stockholm, SWEDEN

2879	   Phone: +46 8 7190000
2880	   EMail: bo.burman@ericsson.com

2882	Full Copyright Statement

2884	   Copyright (C) The IETF Trust (2007).

2886	   This document is subject to the rights, licenses and restrictions
2887	   contained in BCP 78, and except as set forth therein, the authors
2888	   retain all their rights.

2890	   This document and the information contained herein are provided on an
2891	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2892	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST
2893	   AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
2894	   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
2895	   THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
2896	   IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
2897	   PURPOSE.

2899	Intellectual Property

2901	   The IETF takes no position regarding the validity or scope of any
2902	   Intellectual Property Rights or other rights that might be claimed to
2903	   pertain to the implementation or use of the technology described in
2904	   this document or the extent to which any license under such rights
2905	   might or might not be available; nor does it represent that it has
2906	   made any independent effort to identify any such rights.  Information
2907	   on the procedures with respect to rights in RFC documents can be
2908	   found in BCP 78 and BCP 79.

2910	   Copies of IPR disclosures made to the IETF Secretariat and any
2911	   assurances of licenses to be made available, or the result of an
2912	   attempt made to obtain a general license or permission for the use of
2913	   such proprietary rights by implementers or users of this
2914	   specification can be obtained from the IETF on-line IPR repository at
2915	   http://www.ietf.org/ipr.

2917	   The IETF invites any interested party to bring to its attention any
2918	   copyrights, patents or patent applications, or other proprietary
2919	   rights that may cover technology that may be required to implement
2920	   this standard.  Please address the information to the IETF at
2921	   ietf-ipr@ietf.org.

2923	Acknowledgement

2925	   Funding for the RFC Editor function is provided by the IETF
2926	   Administrative Support Activity (IASA).

2928	RFC Editor Considerations

2930	   The RFC editor is requested to replace all occurrences of XXXX with
2931	   the RFC number this document receives.