

2	Network Working Group                                   Stephan Wenger
3	INTERNET-DRAFT                                           Umesh Chandra
4	Expires: October 2007                                            Nokia
5	Intended Status: Proposed Standard                   Magnus Westerlund
6	                                                             Bo Burman
7	                                                              Ericsson
8	                                                          May 28, 2007

10	                        Codec Control Messages in the
11	                RTP Audio-Visual Profile with Feedback (AVPF)
12	                      <draft-ietf-avt-avpf-ccm-06.txt>

14	Status of this Memo

16	   By submitting this Internet-Draft, each author represents that any
17	   applicable patent or other IPR claims of which he or she is aware
18	   have been or will be disclosed, and any of which he or she becomes
19	   aware will be disclosed, in accordance with Section 6 of BCP 79.

21	   Internet-Drafts are working documents of the Internet Engineering
22	   Task Force (IETF), its areas, and its working groups.  Note that
23	   other groups may also distribute working documents as Internet-
24	   Drafts.

26	   Internet-Drafts are draft documents valid for a maximum of six months
27	   and may be updated, replaced, or obsoleted by other documents at any
28	   time.  It is inappropriate to use Internet-Drafts as reference
29	   material or to cite them other than as "work in progress."

31	   The list of current Internet-Drafts can be accessed at
32	   http://www.ietf.org/ietf/1id-abstracts.txt.

34	   The list of Internet-Draft Shadow Directories can be accessed at
35	   http://www.ietf.org/shadow.html.

37	Copyright Notice

39	   Copyright (C) The IETF Trust (2007).

41	Abstract

43	   This document specifies a few extensions to the messages defined in
44	   the Audio-Visual Profile with Feedback (AVPF).  They are helpful
45	   primarily in conversational multimedia scenarios where centralized
46	   multipoint functionalities are in use.  However, some are also usable
47	   in smaller multicast environments and point-to-point calls.  The
48	   extensions discussed are messages related to the ITU-T H.271 Video
49	   Back Channel, Full Intra Request, Temporary Maximum Media Stream Bit
50	   Rate and Temporal Spatial Trade-off.

52	TABLE OF CONTENTS

54	1. Introduction
55	2. Definitions
56	   2.1. Glossary
57	   2.2. Terminology
58	   2.3. Topologies
59	3. Motivation (Informative)
60	   3.1. Use Cases
61	   3.2. Using the Media Path
62	   3.3. Using AVPF
63	      3.3.1. Reliability
64	   3.4. Multicast
65	   3.5. Feedback Messages
66	      3.5.1. Full Intra Request Command
67	         3.5.1.1. Reliability
68	      3.5.2. Temporal Spatial Trade-off Request and Notification
69	         3.5.2.1. Point-to-Point
70	         3.5.2.2. Point-to-Multipoint Using Multicast or
71	                  Translators
72	         3.5.2.3. Point-to-Multipoint Using RTP Mixer
73	         3.5.2.4. Reliability
74	      3.5.3. H.271 Video Back Channel Message
75	         3.5.3.1. Reliability
76	      3.5.4. Temporary Maximum Media Stream Bit Rate Request and
77	             Notification
78	         3.5.4.1. Behavior for media receivers using TMMBR
79	         3.5.4.2. Algorithm for establishing current limitations
80	         3.5.4.3. Use of TMMBR in a Mixer Based Multipoint
81	                  Operation
82	         3.5.4.4. Use of TMMBR in Point-to-Multipoint Using
83	                  Multicast or Translators
84	         3.5.4.5. Use of TMMBR in Point-to-point operation
85	         3.5.4.6. Reliability
86	4. RTCP Receiver Report Extensions
87	   4.1. Design Principles of the Extension Mechanism
88	   4.2. Transport Layer Feedback Messages
89	      4.2.1. Temporary Maximum Media Stream Bit Rate Request
90	             (TMMBR)
91	         4.2.1.1. Message Format
92	         4.2.1.2. Semantics
93	         4.2.1.3. Timing Rules
94	         4.2.1.4. Handling in Translator and Mixers
95	      4.2.2. Temporary Maximum Media Stream Bit Rate Notification
96	             (TMMBN)
97	         4.2.2.1. Message Format
98	         4.2.2.2. Semantics
99	         4.2.2.3. Timing Rules
100	         4.2.2.4. Handling by Translators and Mixers
101	   4.3. Payload Specific Feedback Messages
102	      4.3.1. Full Intra Request (FIR)
103	         4.3.1.1. Message Format
104	         4.3.1.2. Semantics
105	         4.3.1.3. Timing Rules
106	         4.3.1.4. Handling of FIR Message in Mixer and
107	                  Translators
108	         4.3.1.5. Remarks
109	      4.3.2. Temporal-Spatial Trade-off Request (TSTR)
110	         4.3.2.1. Message Format
111	         4.3.2.2. Semantics
112	         4.3.2.3. Timing Rules
113	         4.3.2.4. Handling of message in Mixers and Translators
114	         4.3.2.5. Remarks
115	      4.3.3. Temporal-Spatial Trade-off Notification (TSTN)
116	         4.3.3.1. Message Format
117	         4.3.3.2. Semantics
118	         4.3.3.3. Timing Rules
119	         4.3.3.4. Handling of TSTN in Mixer and Translators
120	         4.3.3.5. Remarks
121	      4.3.4. H.271 Video Back Channel Message (VBCM)
122	         4.3.4.1. Message Format
123	         4.3.4.2. Semantics
124	         4.3.4.3. Timing Rules
125	         4.3.4.4. Handling of message in Mixer or Translator
126	         4.3.4.5. Remarks
127	5. Congestion Control
128	6. Security Considerations
129	7. SDP Definitions
130	   7.1. Extension of the rtcp-fb Attribute
131	   7.2. Offer-Answer
132	   7.3. Examples
133	8. IANA Considerations
134	9. Acknowledgements
135	10. References
136	   10.1. Normative references
137	   10.2. Informative references
138	11. Authors' Addresses
139	1. Introduction

141	   When the Audio-Visual Profile with Feedback (AVPF) [RFC4585] was
142	   developed, the main emphasis lay in the efficient support of point-
143	   to-point and small multipoint scenarios without centralized
144	   multipoint control.  However, in practice, many small multipoint
145	   conferences operate utilizing devices known as Multipoint Control
146	   Units (MCUs).  Long-standing experience of the conversational video
147	   conferencing industry suggests that there is a need for a few
148	   additional feedback messages, to support centralized multipoint
149	   conferencing efficiently.  Some of the messages have applications
150	   beyond centralized multipoint, and this is indicated in the
151	   description of the message.  This is especially true for the message
152	   intended to carry ITU-T Rec. H.271 [H.271] bit strings for Video Back
153	   Channel messages.

155	   In Real-time Transport Protocol (RTP) [RFC3550] terminology, MCUs
156	   comprise mixers and translators.  Most MCUs also include signaling
157	   support.  During the development of this memo, it was noticed that
158	   there is considerable confusion in the community related to the use
159	   of terms such as mixer, translator, and MCU.  In response to these
160	   concerns, a number of topologies have been identified that are of
161	   practical relevance to the industry, but are not documented in
162	   sufficient detail in [RFC3550].  These topologies are documented in
163	   [Topologies], and understanding this memo requires previous or
164	   parallel study of [Topologies].

166	   Some of the messages defined here are forward only, in that they do
167	   not require an explicit notification to the message emitter that they
168	   have been received and/or indicating the message receiver's actions.
169	   Other messages require a response, leading to a two way communication
170	   model that one could view as useful for control purposes.  However,
171	   it is not the intention of this memo to open up RTP Control Protocol
172	   (RTCP) to a generalized control protocol.  All mentioned messages
173	   have relatively strict real-time constraints, in the sense that their
174	   value diminishes with increased delay.  This makes the use of more
175	   traditional control protocol means, such as Session Initiation
176	   Protocol (SIP) re-INVITEs [RFC3261], undesirable when used for the
177	   same purpose.  Furthermore, all messages are of a very simple format
178	   that can be easily processed by an RTP/RTCP sender/receiver.
179	   Finally, all messages relate only to the RTP stream with which they
180	   are associated, and not to any other property of a communication
181	   system.  In particular, none of them relate to the properties of the
182	   access links traversed by the session.

184	2. Definitions

186	2.1. Glossary

188	   AIMD   - Additive Increase Multiplicative Decrease
189	   AVPF   - The extended RTP profile for RTCP-based feedback
190	   FEC    - Forward Error Correction
191	   FCI    - Feedback Control Information [RFC4585]
192	   FIR    - Full Intra Request
193	   MCU    - Multipoint Control Unit
194	   MPEG   - Moving Picture Experts Group
195	   TMMBN  - Temporary Maximum Media Stream Bit Rate Notification
196	   TMMBR  - Temporary Maximum Media Stream Bit Rate Request
197	   PLI    - Picture Loss Indication
198	   PR     - Packet rate
199	   QP     - Quantizer Parameter
200	   RTT    - Round trip time
201	   SSRC   - Synchronization Source
202	   TSTN   - Temporal Spatial Trade-off Notification
203	   TSTR   - Temporal Spatial Trade-off Request
204	   VBCM   - Video Back Channel Message

206	2.2. Terminology

208	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
209	   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
210	   document are to be interpreted as described in RFC 2119 [RFC2119].

212	      Message:
213	          An RTCP feedback message [RFC4585] defined by this
214	          specification, of one of the following types:

216	          Request:
217	              Message that requires acknowledgement

219	          Command:
220	              Message that forces the receiver to an action

222	          Indication:
223	              Message that reports a situation

225	          Notification:
227	              Message that provides a notification that an event has
228	              occurred. Notifications are commonly generated in response
229	              to a Request.

231	          Note that, with the exception of "Notification", this
232	          terminology is in alignment with ITU-T Rec. H.245 [H245].

234	     Decoder Refresh Point:
235	          A bit string, packetized in one or more RTP packets, which
236	          completely resets the decoder to a known state.

238	          Examples of "hard" decoder refresh points are Intra pictures
239	          in H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 part 2, and
240	          Instantaneous Decoder Refresh (IDR) pictures in H.264.
241	          "Gradual" decoder refresh points may also be used; see for
242	          example [AVC].  While both "hard" and "gradual" decoder
243	          refresh points are acceptable in the scope of this
244	          specification, in most cases the user experience will benefit
245	          from using a "hard" decoder refresh point.

247	          A decoder refresh point also contains all header information
248	          above the picture layer (or equivalent, depending on the video
249	          compression standard) that is conveyed in-band.  In H.264, for
250	          example, a decoder refresh point contains parameter set
251	          Network Adaptation Layer (NAL) units that generate parameter
252	          sets necessary for the decoding of the following slice/data
253	          partition NAL units (and that are not conveyed out of band).

255	   Decoding:
256	          The operation of reconstructing the media stream.

258	   Rendering:
259	          The operation of presenting (parts of) the reconstructed media
260	          stream to the user.

262	   Stream thinning:
263	          The operation of removing some of the packets from a media
264	          stream.  Stream thinning, preferably, is media-aware, implying
265	          that media packets are removed in the order of increasing
266	          relevance to the reproductive quality.  However, even when
267	          employing media-aware stream thinning, most media streams
268	          quickly lose quality when subject to increasing levels of
269	          thinning.  Media-unaware stream thinning leads to even worse
270	          quality degradation.  In contrast to transcoding, stream
271	          thinning is typically seen as a computationally lightweight
272	          operation.

274	   Media:
276	          Often used (sometimes in conjunction with terms like bit rate,
277	          stream, sender ...) to identify the content of the forward RTP
278	          packet stream (carrying the codec data), to which the codec
279	          control message applies.

281	   Media Stream:
282	          The stream of RTP packets labeled with a single
283	          Synchronization Source (SSRC) carrying the media (and also in
284	          some cases repair information such as retransmission or
285	          Forward Error Correction (FEC) information).

287	   Total media bit rate:
288	          The total bits per second transferred in a media stream,
289	          measured at an observer-selected protocol layer and averaged
290	          over a reasonable timescale, the length of which depends on
291	          the application.  In general, a media sender and a media
292	          receiver will observe different total media bit rates for the
293	          same stream, first because they may have selected different
294	          reference protocol layers, and second, because of changes in
295	          per-packet overhead along the transmission path.  The goal
296	          with bit rate averaging is to be able to ignore any burstiness
297	          on very short timescales, below for example 100 ms, introduced
298	          by scheduling or link layer packetization effects.

300	   Maximum total media bit rate:
301	          The upper limit on total media bit rate for a given media
302	          stream at a particular receiver and for its selected protocol
303	          layer. Note that this value cannot be measured on the received
304	          media stream; instead, it needs to be calculated or determined
305	          through other means, such as QoS negotiations or local
306	          resource limitations. Also note that this value is an average
307	          (on a timescale that is reasonable for the application) and
308	          that it may be different from the instantaneous bit-rate seen
309	          by packets in the media stream.

311	   Overhead:
312	          All protocol header information required to convey a packet
313	          with media data from sender to receiver, from the application
314	          layer down to a pre-defined protocol level (for example down
315	          to, and including, the IP header).  Overhead may include, for
316	          example, IP, UDP, and RTP headers, any layer 2 headers, any
317	          Contributing Sources (CSRCs), RTP-Padding, and RTP header
318	          extensions.  Overhead excludes any RTP payload headers and the
319	          payload itself.

321	   Net media bit rate:
322	          The bit rate carried by a media stream, net of overhead.  That
323	          is, the bits per second accounted for by encoded media, any
324	          applicable payload headers, and any directly associated meta
325	          payload information placed in the RTP packet.  A typical
326	          example of the latter is redundancy data provided by the use
327	          of RFC 2198 [RFC2198].  Note that, unlike the total media bit
328	          rate, the net media bit rate will have the same value at the
329	          media sender and at the media receiver unless any mixing or
330	          translating of the media has occurred.

332	          For a given observer, the total media bit rate for a media
333	          stream is equal to the sum of the net media bit rate and the
334	          per-packet overhead as defined above multiplied by the packet
335	          rate.
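
The relation above can be illustrated with a small numeric sketch.  The
function name, the units, and the sample values (64 kbit/s of payload,
40 bytes of IPv4/UDP/RTP headers per packet, 50 packets per second) are
assumptions chosen for illustration and do not appear in this memo.

```python
def total_media_bit_rate(net_bit_rate_bps, overhead_bytes_per_packet, packet_rate_pps):
    """Total media bit rate seen by one observer, in bits per second.

    total = net media bit rate + per-packet overhead * packet rate,
    with the overhead converted from bytes to bits.
    """
    return net_bit_rate_bps + 8 * overhead_bytes_per_packet * packet_rate_pps


# Hypothetical operating point: 64 kbit/s net, 40-byte overhead, 50 packets/s.
# 64000 + 8 * 40 * 50 = 80000 bits per second.
print(total_media_bit_rate(64_000, 40, 50))
```

An observer that counts a different set of headers would plug in a
different overhead value and therefore obtain a different total for the
same stream.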

337	   Feasible region:
338	          The set of all combinations of packet rate and net media bit
339	          rate that do not exceed the restrictions in maximum media bit
340	          rate placed on a given media sender by the Temporary Maximum
341	          Media Stream Bit-rate Request (TMMBR) messages it has
342	          received.  The feasible region will change as new TMMBR
343	          messages are received.
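
A membership test for the feasible region can be sketched as follows.
The sketch assumes that each received TMMBR restriction can be reduced
to a pair of values: a maximum total media bit rate and the per-packet
overhead observed by the requesting receiver.  The tuple layout, the
names, and the sample numbers are illustrative assumptions only; the
normative message format is defined later in this memo.

```python
from typing import Iterable, NamedTuple


class TmmbrLimit(NamedTuple):
    """Hypothetical in-memory form of one received TMMBR restriction."""
    max_total_bps: float   # maximum total media bit rate, bits per second
    overhead_bytes: float  # per-packet overhead at the requester, bytes


def in_feasible_region(net_bps: float, packet_rate: float,
                       limits: Iterable[TmmbrLimit]) -> bool:
    """Return True if (packet_rate, net_bps) violates none of the limits.

    For each limit, the total rate implied by the operating point is
    net_bps + 8 * overhead_bytes * packet_rate.
    """
    return all(
        net_bps + 8 * lim.overhead_bytes * packet_rate <= lim.max_total_bps
        for lim in limits
    )


# Two hypothetical receivers with different limits and overheads.
limits = [TmmbrLimit(100_000, 40), TmmbrLimit(120_000, 100)]
print(in_feasible_region(64_000, 50, limits))   # both limits hold
print(in_feasible_region(64_000, 100, limits))  # second limit is exceeded
```

Because the overhead term grows with the packet rate, a limit with a
higher bit rate but a larger overhead can become the binding one at
high packet rates, which is why more than one tuple may be needed to
describe the region.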

345	   Bounding set:
346	          The set of TMMBR tuples, selected from all those received at a
347	          given media sender, that define the feasible region for that
348	          media sender.  The media sender uses an algorithm such as that
349	          in section 3.5.4.2 to determine or iteratively approximate the
350	          current bounding set, and reports that set back to the media
351	          receivers in a Temporary Maximum Media Stream Bit-rate
352	          Notification (TMMBN) message.

354	2.3. Topologies

356	   Please refer to [Topologies] for an in-depth discussion.  The
357	   topologies referred to throughout this memo are labeled (consistently
358	   with [Topologies]) as follows:

360	   Topo-Point-to-Point . . . . . point-to-point communication
361	   Topo-Multicast  . . . . . . . multicast communication as in RFC 3550
362	   Topo-Translator . . . . . . . translator based as in RFC 3550
363	   Topo-Mixer  . . . . . . . . . mixer based as in RFC 3550
364	   Topo-Video-switch-MCU . . . . video switching MCU
365	   Topo-RTCP-terminating-MCU . . mixer but terminating RTCP

367	3. Motivation (Informative)

369	   This section discusses the motivation and usage of the different
370	   video and media control messages.  The video control messages have
371	   been under discussion for a long time, and a requirement draft was
372	   drawn up [Basso].  This draft has expired; however, we quote relevant
373	   sections of it to provide motivation and requirements.

375	3.1. Use Cases

377	   There are a number of possible usages for the proposed feedback
378	   messages.  Let us begin by looking through the use cases Basso et al.
379	   [Basso] proposed.  Some of the use cases have been reformulated and
380	   comments have been added.

382	   1. An RTP video mixer composes multiple encoded video sources into a
383	      single encoded video stream.  Each time a video source is added,
384	      the RTP mixer needs to request a decoder refresh point from the
385	      video source, so as to start an uncorrupted prediction chain on
386	      the spatial area of the mixed picture occupied by the data from
387	      the new video source.

389	   2. An RTP video mixer receives multiple encoded RTP video streams
390	      from conference participants, and dynamically selects one of the
391	      streams to be included in its output RTP stream.  At the time of a
392	      bit stream change (determined through means such as voice
393	      activation or the user interface), the mixer requests a decoder
394	      refresh point from the remote source, in order to avoid using
395	      unrelated content as reference data for inter picture prediction.
396	      After requesting the decoder refresh point, the video mixer stops
397	      the delivery of the current RTP stream and monitors the RTP stream
398	      from the new source until it detects data belonging to the decoder
399	      refresh point.  At that time, the RTP mixer starts forwarding the
400	      newly selected stream to the receiver(s).

402	   3. An application needs to signal to the remote encoder that the
403	      desired trade-off between temporal and spatial resolution has
404	      changed.  For example, one user may prefer a higher frame rate and
405	      a lower spatial quality, and another user may prefer the opposite.
406	      This choice is also highly content dependent.  Many current video
407	      conferencing systems offer in the user interface a mechanism to
408	      make this selection, usually in the form of a slider.  The
409	      mechanism is helpful in point-to-point, centralized multipoint and
410	      non-centralized multipoint uses.

412	   4. Use case 4 of the Basso draft applies only to Picture Loss
413	      Indication (PLI) as defined in AVPF [RFC4585] and is not
414	      reproduced here.

416	   5. Use case 5 of the Basso draft relates to a mechanism known as
417	      "freeze picture request".  Sending freeze picture requests
418	      over a non-reliable forward RTCP channel has been identified as
419	      problematic.  Therefore, no freeze picture request has been
420	      included in this memo, and the use case discussion is not
421	      reproduced here.

423	   6. A video mixer dynamically selects one of the received video
424	      streams to be sent out to participants and tries to provide the
425	      highest bit rate possible to all participants, while minimizing
426	      stream trans-rating.  One way of achieving this is to set up
427	      sessions with endpoints using the maximum bit rate accepted by
428	      each endpoint, and accepted by the call admission method used by
429	      the mixer.  By means of commands that reduce the maximum media
430	      stream bit rate below what has been negotiated during session set
431	      up, the mixer can reduce the maximum bit rate sent by endpoints to
432	      the lowest of all the accepted bit rates.  As the lowest accepted
433	      bit rate changes due to endpoints joining and leaving or due to
434	      network congestion, the mixer can adjust the limits at which
435	      endpoints can send their streams to match the new value.  The
436	      mixer then requests a new maximum bit rate, which is equal to or
437	      less than the maximum bit rate negotiated at session setup for a
438	      specific media stream, and the remote endpoint can respond with
439	      the actual bit rate that it can support.
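
The bit rate selection described in use case 6 can be sketched as
follows.  This is an illustration under stated assumptions, not a
description of any particular mixer: the function name, the endpoint
labels, and the bit rates are hypothetical.

```python
def common_send_limit(accepted_bps, negotiated_bps):
    """Bit rate limit a mixer might request from a sending endpoint.

    accepted_bps maps each receiving endpoint to the maximum bit rate it
    currently accepts.  The result is the lowest of those rates, and it
    never exceeds the rate negotiated at session setup.
    """
    if not accepted_bps:
        return negotiated_bps
    return min(min(accepted_bps.values()), negotiated_bps)


# Hypothetical conference with three receivers.
accepted = {"A": 512_000, "B": 384_000, "C": 768_000}
print(common_send_limit(accepted, 1_000_000))  # limited by endpoint B

# Endpoint B leaves, so the lowest accepted rate rises.
del accepted["B"]
print(common_send_limit(accepted, 1_000_000))  # now limited by endpoint A
```

Each time the set of accepted rates changes, the mixer would recompute
the limit and request the new value from the sending endpoints.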

441	   The picture that Basso et al. draw up covers most applications we
442	   foresee.  However, we would like to extend the list with two
443	   additional use cases:

445	   7. Currently deployed congestion control algorithms (AIMD and TFRC
446	      [RFC3448]) probe for additional available capacity as long as
447	      there is something to send.  With congestion control algorithms
448	      using packet loss as the indication for congestion, this probing
449	      generally results in reduced media quality (often to a point
450	      where the distortion is large enough to make the media unusable),
451	      due to packet loss and increased delay.

453	      In a number of deployment scenarios, especially cellular ones, the
454	      bottleneck link is often the last hop link.  That cellular link
455	      also commonly has some type of QoS negotiation enabling the
456	      cellular device to learn the maximal bit rate available over this
457	      last hop.  A media receiver behind this link can, in most (if not
458	      all) cases, calculate at least an upper bound for the bit rate
459	      available for each media stream it presently receives.  How this
460	      is done is an implementation detail and not discussed herein.
461	      Indicating the maximum available bit rate to the transmitting
462	      party for the various media streams can be beneficial to prevent
463	      that party from probing for bandwidth for this stream in excess of
464	      a known hard limit.  For cellular or other mobile devices, the
465	      known available bit rate for each stream (deduced from the link
466	      bit rate) can change quickly, due to handover to another
467	      transmission technology, QoS renegotiation due to congestion, etc.
468	      To enable minimal disruption of service, quick convergence is
469	      necessary, and therefore media path signaling is desirable.

471	   8. The use of reference picture selection (RPS) as an error
472	      resilience tool was introduced in 1997 as NEWPRED [NEWPRED],
473	      and is now widely deployed.  When RPS is in use, simplistically
474	      put, the receiver can send a feedback message to the sender,
475	      indicating a reference picture that should be used for future
476	      prediction. ([NEWPRED] mentions other forms of feedback as well.)
477	      AVPF contains a mechanism for conveying such a message, but does
478	      not specify the codec or the syntax to which the message should
479	      conform.  Recently, the ITU-T finalized Rec. H.271,
480	      which (among other message types) also includes a feedback
481	      message.  It is expected that this feedback message will fairly
482	      quickly enjoy wide support.  Therefore, a mechanism to convey
483	      feedback messages according to H.271 appears to be desirable.

485	3.2. Using the Media Path

487	   There are multiple reasons why we use the media path for the codec
488	   control messages.

490	   First, systems employing MCUs often separate the control and media
491	   processing parts.  As these messages are intended for or generated by
492	   the media part rather than the signaling part of the MCU, having them
493	   on the media path avoids transmission across interfaces and
494	   unnecessary control traffic between signaling and processing.  If the
495	   MCU is physically decomposed, the use of the media path avoids the
496	   need for media control protocol extensions (e.g. in MEGACO
497	   [RFC3525]).

499	   Secondly, the signaling path quite commonly contains several
500	   signaling entities, e.g. SIP proxies and application servers.
501	   Bypassing these signaling entities avoids delay for several
502	   reasons.  Proxies have less stringent delay requirements than media
503	   processing and, due to their complex and more generic nature, may
504	   add significant processing delay.  The topological locations of
505	   the signaling entities are also commonly not optimized for minimal
506	   delay, but rather for other architectural goals.  Thus, the
507	   signaling path can be significantly longer in both the geographical
508	   and the delay sense.

510	3.3. Using AVPF

512	   The AVPF feedback message framework [RFC4585] provides the
513	   appropriate framework to implement the new messages.  AVPF implements
514	   rules controlling the timing of feedback messages to avoid congestion
515	   through network flooding by RTCP traffic.  We re-use these rules by
516	   referencing AVPF.

518	   The signaling setup for AVPF allows each individual type of function
519	   to be configured or negotiated on an RTP session basis.

521	3.3.1. Reliability

523	   The use of RTCP messages implies that each message transfer is
524	   unreliable, unless the lower layer transport provides reliability.
525	   The different messages proposed in this specification have different
526	   requirements in terms of reliability.  However, in all cases, the
527	   reaction to an (occasional) loss of a feedback message is specified.

529	3.4. Multicast

531	   The codec control messages might be used with multicast.  The RTCP
532	   timing rules specified in [RFC3550] and [RFC4585] ensure that the
533	   messages do not cause overload of the RTCP connection.  The use of
534	   multicast may result in the reception of messages with inconsistent
535	   semantics.   The reaction to inconsistencies depends on the message
536	   type, and is discussed for each message type separately.

538	3.5. Feedback Messages

540	   This section describes the semantics of the different feedback
541	   messages and how they apply to the different use cases.

543	3.5.1. Full Intra Request Command

545	   A Full Intra Request (FIR) Command, when received by the designated
546	   media sender, requires that the media sender sends a Decoder Refresh
547	   Point (see 2.2) at the earliest opportunity.  The evaluation of such
548	   opportunity includes the current encoder coding strategy and the
549	   current available network resources.

551	   FIR is also known as an "instantaneous decoder refresh request" or
552	   "video fast update request".

554	   Using a decoder refresh point implies refraining from using any
555	   picture sent prior to that point as a reference for the encoding
556	   process of any subsequent picture sent in the stream.  For predictive
557	   media types that are not video, the analogue applies.  For example,
558	   if in MPEG-4 systems scene updates are used, the decoder refresh
559	   point consists of the full representation of the scene and is not
560	   delta-coded relative to previous updates.

562	   Decoder refresh points, especially Intra or IDR pictures, are in
563	   general several times larger in size than predicted pictures.  Thus,
564	   in scenarios in which the available bit rate is small, the use of a
565	   decoder refresh point implies a delay that is significantly longer
566	   than the typical picture duration.

568	   Usage in multicast is possible; however, aggregation of the commands
569	   is recommended.  A media sender that receives a request closely
570	   (within 2 times the longest Round Trip Time (RTT) known, plus any
571	   AVPF-induced RTCP packet sending delays, if those are known) after
572	   sending a decoder refresh point should await a second request
573	   message to ensure that the media receiver has not been served by the
574	   previously delivered decoder refresh point.  The specified delay
575	   avoids sending unnecessary decoder refresh points.  A session
576	   participant may have sent its own request while another participant's
577	   request was in-flight to them.  Suppressing those requests that may
578	   have been sent without knowledge about the other request avoids this
579	   issue.

581	   Using the FIR command to recover from errors is explicitly
582	   disallowed, and instead the PLI message defined in AVPF [RFC4585]
583	   should be used.  The PLI message reports lost pictures and has been
584	   included in AVPF for precisely that purpose.

586	   Full Intra Request is applicable in use-cases 1 and 2.

588	3.5.1.1. Reliability

590	   The FIR message results in the delivery of a decoder refresh point,
591	   unless the message is lost.  Decoder refresh points are easily
592	   identifiable from the bit stream.  Therefore, there is no need for
593	   protocol-level notification, and a simple command repetition
594	   mechanism is sufficient for ensuring the level of reliability
595	   required.  However, the potential use of repetition does require a
596	   mechanism to prevent the recipient from responding to messages
597	   already received and responded to.

599	   To ensure the best possible reliability, a sender of FIR may repeat
600	   the FIR request until the desired content has been received.  The
601	   repetition interval is determined by the RTCP timing rules applicable
602	   to the session.  Upon reception of a complete decoder refresh point
603	   or the detection of an attempt to send a decoder refresh point (which
604	   got damaged due to a packet loss), the repetition of the FIR must
605	   stop.  If another FIR is necessary, the request sequence number must
606	   be increased.  A FIR sender shall not have more than one FIR request
607	   (different request sequence number) outstanding at any time per media
608	   sender in the session.
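
   The requester-side rules above can be sketched as follows.  This is
   a minimal illustration, not part of the specification: the class
   and method names are hypothetical, and the 8-bit wrap-around of the
   request sequence number is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FirRequester:
    """Tracks the single outstanding FIR toward one media sender."""

    seq_nr: int = 0            # request sequence number (8-bit wrap assumed)
    outstanding: bool = False  # True while a request is being repeated

    def start_request(self) -> int:
        """Begin a new FIR; refuse if one is already outstanding."""
        if self.outstanding:
            raise RuntimeError("only one FIR may be outstanding per sender")
        self.seq_nr = (self.seq_nr + 1) % 256  # new request, new number
        self.outstanding = True
        return self.seq_nr

    def on_rtcp_send_opportunity(self) -> Optional[int]:
        """Return the sequence number to repeat, or None if idle."""
        return self.seq_nr if self.outstanding else None

    def on_refresh_point_seen(self) -> None:
        """Stop repeating once a refresh point (complete or damaged) is seen."""
        self.outstanding = False
```

   Repetitions reuse the same sequence number, which is what allows
   the media sender to recognize them as repetitions of one request.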

610	   The receiver of FIR (i.e. the media sender) behaves in complementary
611	   fashion to ensure delivery of a decoder refresh point.  If it
612	   receives repetitions of the FIR more than 2*RTT after it has sent a
613	   decoder refresh point, it shall send a new decoder refresh point.
614	   Two round trip times allow time for the decoder refresh point to
615	   arrive back to the requestor and for the end of repetitions of FIR to
616	   reach and be detected by the media sender.
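
   The 2*RTT rule for the media sender can be sketched as follows.
   This is an illustration under stated assumptions: the function name
   and the timestamp representation (seconds from a hypothetical
   monotonic clock) are not defined by this memo, and the sketch
   ignores request sequence numbers.

```python
from typing import Optional


def should_send_refresh_point(now: float,
                              last_refresh_sent: Optional[float],
                              rtt: float) -> bool:
    """Decide whether a received FIR warrants a new decoder refresh point.

    now, last_refresh_sent: timestamps in seconds (hypothetical clock).
    rtt: round-trip time estimate in seconds toward the requester.
    """
    if last_refresh_sent is None:
        return True  # no refresh point sent yet: honor the request
    # A repetition arriving within 2*RTT may have been sent before the
    # previous refresh point reached the requester, so it is ignored.
    return (now - last_refresh_sent) > 2 * rtt
```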

618	   An RTP mixer that receives a FIR from a media receiver is
619	   responsible for ensuring that a decoder refresh point is delivered
620	   to the requesting receiver.  It may be necessary for the mixer to
621	   generate FIR commands.  From a reliability perspective, the two legs
622	   (FIR-requesting endpoint to mixer, and mixer to decoder refresh point
623	   generating endpoint) are handled independently from each other.

625	3.5.2. Temporal Spatial Trade-off Request and Notification

627	   The Temporal Spatial Trade-off Request (TSTR) instructs the video
628	   encoder to change its trade-off between temporal and spatial
629	   resolution.  Index values from 0 to 31 indicate monotonically a
630	   desire for higher frame rate.  That is, a requester asking for an
631	   index of 0 prefers a high quality and is willing to accept a low
632	   frame rate, whereas a requester asking for 31 wishes a high frame
633	   rate, potentially at the cost of low spatial quality.

635	   In general the encoder reaction time may be significantly longer than
636	   the typical picture duration.  See use case 3 for an example.  The
637	   encoder decides whether and to what extent the request results in a
638	   change of the trade-off.  It returns a Temporal Spatial Trade-Off
639	   Notification (TSTN) message to indicate the trade-off that it will
640	   use henceforth.

642	   TSTR and TSTN have been introduced primarily because it is believed
643	   that control protocol mechanisms, e.g. a SIP re-invite, are too
644	   heavyweight and too slow to allow for a reasonable user experience.

646	   Consider, for example, a user interface where the remote user selects
647	   the temporal/spatial trade-off with a slider (as is common in
648	   state-of-the-art video conferencing systems).  An immediate feedback
649	   to any slider movement is required for a reasonable user experience.
650	   A SIP re-INVITE [RFC3261] would require at least two round-trips more
651	   (compared to the TSTR/TSTN mechanism) and may involve proxies and
652	   other complex mechanisms.  Even in a well-designed system, it could
653	   take a second or so until the new trade-off is finally selected.
654	   Furthermore the use of RTCP solves the multicast use case very
655	   efficiently.

657	   The use of TSTR and TSTN in multipoint scenarios is a non-trivial
658	   subject, and can be achieved in many implementation-specific ways.
659	   Problems stem from the fact that TSTRs will typically arrive
660	   unsynchronized, and may request different trade-off values for the
661	   same stream and/or endpoint encoder.  This memo does not specify a
662	   translator, mixer or endpoint's reaction to the reception of a
663	   suggested trade-off as conveyed in the TSTR.  We only require the
664	   receiver of a TSTR message to reply to it by sending a TSTN, carrying
665	   the new trade-off chosen by its own criteria (which may or may not be
666	   based on the trade-off conveyed by the TSTR).  In other words, the
667	   trade-off sent in TSTR is a non-binding recommendation, nothing more.

669	   Four TSTR/TSTN scenarios need to be distinguished, based on the
670	   topologies described in [Topologies].  The scenarios are described in
671	   the following sub-clauses.

673	3.5.2.1. Point-to-Point

675	   In this most trivial case (Topo-Point-to-Point), the media sender
676	   typically adjusts its temporal/spatial trade-off based on the
677	   requested value in TSTR, subject to its own capabilities.  The TSTN
678	   message conveys back the new trade-off value (which may be identical
679	   to the old one if, for example, the sender is not capable of
680	   adjusting its trade-off).

682	3.5.2.2. Point-to-Multipoint Using Multicast or Translators

684	   RTCP Multicast is used either with media multicast according to Topo-
685	   Multicast, or following RFC 3550's translator model according to
686	   Topo-Translator.  In these cases, unsynchronized TSTR messages from
687	   different receivers may be received, possibly with different
688	   requested trade-offs (because of different user preferences).  This
689	   memo does not specify how the media sender tunes its trade-off.
690	   Possible strategies include selecting the mean or median of all
691	   trade-off requests received, giving priority to certain participants,
692	   or continuing to use the previously selected trade-off (e.g. when the
693	   sender is not capable of adjusting it).  Again, all TSTR messages
694	   need to be acknowledged by TSTN, and the value conveyed back has to
695	   reflect the decision made.
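
   One of the strategies mentioned above, selecting the median of the
   requested trade-offs, might look as follows.  This memo does not
   mandate any strategy; the function name, the fallback to the
   previous value, and the choice of the lower median are assumptions
   made for this sketch.

```python
from statistics import median_low
from typing import Iterable


def consensus_index(requested: Iterable[int], previous: int) -> int:
    """Pick one trade-off index (0..31) from several TSTR requests.

    requested: index values received from different receivers.
    previous: the trade-off currently in use, kept if nothing valid arrives.
    """
    valid = [i for i in requested if 0 <= i <= 31]  # drop out-of-range values
    if not valid:
        return previous
    # median_low always returns one of the requested values, so the
    # result is an integer index that some receiver actually asked for.
    return median_low(valid)
```

   Whatever value is chosen is the one reported back in the TSTN.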

697	3.5.2.3. Point-to-Multipoint Using RTP Mixer

699	   In this scenario (Topo-Mixer) the RTP mixer receives all TSTR
700	   messages, and has the opportunity to act on them based on its own
701	   criteria.  In most cases, the mixer should form a "consensus" of
702	   potentially conflicting TSTR messages arriving from different
703	   participants, and initiate its own TSTR message(s) to the media
704	   sender(s).  As in the previous scenario, the strategy for forming
705	   this "consensus" is up to the implementation, and can, for example,
706	   encompass averaging the participants' request values, giving priority
707	   to certain participants, or using session default values.

709	   Even if a mixer or translator performs transcoding, it is very
710	   difficult to deliver media with the requested trade-off, unless the
711	   content the mixer or translator receives is already close to that
712	   trade-off.  Thus if the mixer changes its trade-off, it needs to
713	   request the media sender(s) to use the new value, by creating a TSTR
714	   of its own.  Upon reaching a decision on the trade-off to use, the
715	   mixer includes that value in the acknowledgement to the downstream
716	   requestors.  Only in cases where the original source has
717	   substantially higher quality (and bit rate) is it likely that
718	   transcoding alone can result in the requested trade-off.

720	3.5.2.4. Reliability

722	   A request and reception acknowledgement mechanism is specified.  The
723	   Temporal Spatial Trade-off Notification (TSTN) message informs the
724	   request-sender that its request has been received, and what trade-off
725	   is used henceforth.  This acknowledgment mechanism is desirable for
726	   at least the following reasons:

728	   o A change in the trade-off cannot be directly identified from the
729	     media bit stream.
730	   o User feedback cannot be implemented without knowing the chosen
731	     trade-off value, according to the media sender's constraints.
732	   o Repetitive sending of messages requesting an unimplementable trade-
733	     off can be avoided.

735	3.5.3. H.271 Video Back Channel Message
736	   ITU-T Rec. H.271 defines syntax, semantics, and suggested encoder
737	   reaction to a video back channel message.  The structure defined in
738	   this memo is used to transparently convey such a message from media
739	   receiver to media sender.  In this memo, we refrain from an in-depth
740	   discussion of the available code points within H.271 and refer to the
741	   specification text [H.271] instead.

743	   However, we note that some H.271 messages bear similarities with
744	   native messages of AVPF and this memo.  Furthermore, we note that
745	   some H.271 messages are known to require caution in multicast
746	   environments -- or are plainly not usable in multicast or multipoint
747	   scenarios.  Table 1 provides a brief, oversimplifying overview of the
748	   messages currently defined in H.271, their roughly corresponding AVPF
749	   or CCM messages (the latter as specified in this memo), and an
750	   indication of our current knowledge of their multicast safety.

752	   H.271 message        AVPF/CCM message multicast-safe
753	   ---------------------------------------------------------------------
754	   0 (when used for
755	     reference picture
756	     selection)         AVPF RPSI        No (positive ACK of pictures)
757	   1 picture loss       AVPF PLI         Yes
758	   2 partial loss       AVPF SLI         Yes
759	   3 one parameter CRC  N/A              Yes (no required sender action)
760	   4 all parameter CRC  N/A              Yes (no required sender action)
761	   5 refresh point      CCM FIR          Yes

763	   Table 1: H.271 messages and their AVPF/CCM equivalents

765	          Note: H.271 message type 0 is not a strict equivalent to
766	          AVPF's Reference Picture Selection Indication (RPSI); it is an
767	          indication of known-as-correct reference picture(s) at the
768	          decoder.  It does not command an encoder to use a defined
769	          reference picture (the form of control information envisioned
770	          to be carried in RPSI).  However, it is believed and intended
771	          that H.271 message type 0 will be used for the same purpose as
772	          AVPF's RPSI -- although other use forms are also possible.

774	   Because H.271 messages are opaque, especially with respect to
775	   multicast safety, the following guidelines MUST be followed when
776	   an implementation wishes to employ the H.271 video back channel
777	   message:

779	   1. Implementations utilizing the H.271 feedback message MUST stay in
780	      compliance with congestion control principles, as outlined in
781	      section 5.

783	   2. An implementation SHOULD utilize the IETF-native messages as
784	      defined in [RFC4585] and in this memo instead of similar messages
785	      defined in [H.271].  Our current understanding of similar messages
786	      is documented in Table 1 above.  One good reason to deviate from
787	      the SHOULD statement above would be a clear understanding
788	      that, for a given application and video compression standard, the
789	      aforementioned "similarity" does not hold, contrary to what
790	      the table indicates.

792	   3. It has been observed that some of the H.271 code points currently
793	      in existence are not multicast-safe.  Therefore, the sensible
794	      thing to do is not to use the H.271 feedback message type in
795	      multicast environments.  It MAY be used only when all the issues
796	      mentioned later are fully understood by the implementer, and
797	      properly taken into account by all endpoints.  In all other cases,
798	      the H.271 message type MUST NOT be used in conjunction with
799	      multicast.

801	   4. It has been observed that even in centralized multipoint
802	      environments, where the mixer should theoretically be able to
803	      resolve issues as documented below, the implementation of such a
804	      mixer and cooperative endpoints is a very difficult and tedious
805	      task.  Therefore, H.271 messages MUST NOT be used in centralized
806	      multipoint scenarios, unless all the issues mentioned below are
807	      fully understood by the implementer, and properly taken into
808	      account by both mixer and endpoints.

810	   Issues to be taken into account when considering the use of H.271 in
811	   multipoint environments:

813	   1. Different state on different receivers.  In many environments it
814	      cannot be guaranteed that the decoder state of all media receivers
815	      is identical at any given point in time.  The most obvious reason
816	      for such a possible misalignment of state is a loss that occurs on
817	      the path to only one of many media receivers.  However, there are
818	      other not so obvious reasons, such as recent joins to the
819	      multipoint conference (be it by joining the multicast group or
820	      through additional mixer output).  Different states can lead the
821	      media receivers to issue potentially contradicting H.271 messages
822	      (or one media receiver issuing an H.271 message that, when
823	      observed by the media sender, is not helpful for the other media
824	      receivers).  A naive reaction of the media sender to these
825	      contradicting messages can lead to unpredictable and annoying
826	      results.

828	   2. Combining messages from different media receivers in a media
829	      sender is a non-trivial task.  As reasons, we note that these
830	      messages may be contradicting each other, and that their transport
831	      is unreliable (there may well be other reasons).  In case of many
832	      H.271 messages (i.e. types 0, 2, 3, and 4), the algorithm for
833	      combining must be aware both of the network/protocol environment
834	      (i.e. with respect to congestion) and of the media codec employed,
835	      as H.271 messages of a given type can have different semantics for
836	      different media codecs.

838	   3. The suppression of requests may need to go beyond the basic
839	      mechanisms described in AVPF (which are driven exclusively by
840	      timing and transport considerations on the protocol level).  For
841	      example, a receiver is often required to refrain from (or delay)
842	      generating requests, based on information it receives from the
843	      media stream.  For instance, it makes no sense for a receiver to
844	      issue a FIR when a transmission of an Intra/IDR picture is
845	      ongoing.

847	   4. When using the non-multicast-safe messages (e.g. H.271 type 0
848	      positive ACK of received pictures/slices) in larger multicast
849	      groups, the media receiver will likely be forced to delay or even
850	      omit sending these messages.  For the media sender this looks like
851	      data has not been properly received (although it was received
852	      properly), and a naively implemented media sender reacts to these
853	      perceived problems where it should not.

855	3.5.3.1. Reliability

857	   H.271 Video Back Channel messages do not require reliable
858	   transmission, and confirmation of the reception of a message can be
859	   derived from the forward video bit stream.  Therefore, no specific
860	   reception acknowledgement is specified.

862	   With respect to re-sending rules, clause 3.5.1.1 applies.

864	3.5.4. Temporary Maximum Media Stream Bit Rate Request and Notification

866	   A receiver, translator or mixer uses the Temporary Maximum Media
867	   Stream Bit Rate Request (TMMBR, "timber") to request a sender to
868	   limit the maximum bit rate for a media stream (see 2.2) to, or below,
869	   the provided value.  The Temporary Maximum Media Stream Bit Rate
870	   Notification (TMMBN) contains the media sender's current view of the
871	   most limiting subset of the TMMBR-defined limits it has received, to
872	   help the participants to suppress TMMBR requests that would not
873	   further restrict the media sender.  The primary usage for the
874	   TMMBR/TMMBN messages is in a scenario with an MCU or mixer (use case
875	   6), corresponding to Topo-Translator or Topo-Mixer, but also to
876	   Topo-Point-to-Point.

878	   Each temporary limitation on the media stream is expressed as a
879	   tuple.  The first component of the tuple is the maximum total media
880	   bit rate (as defined in section 2.2) that the media receiver is
881	   currently prepared to accept for this media stream.  The second
882	   component is the per-packet overhead that the media receiver has
883	   observed for this media stream at its chosen reference protocol
884	   layer.

886	   As indicated in section 2.2, the overhead as observed by the sender
887	   of the TMMBR (i.e. the media receiver) may differ from the overhead
888	   observed at the receiver of the TMMBR (i.e. the media sender) due to
889	   use of a different reference protocol layer at the other end or due
890	   to the intervention of translators or mixers that affect the amount
891	   of per packet overhead.  For example, a gateway in between the two
892	   that converts between IPv4 and IPv6 affects the per-packet overhead
893	   by 20 bytes.  Other mechanisms that change the overhead include
894	   tunnels.  The problem with varying overhead is also discussed in
895	   [RFC3890].  As will be seen in the description of the algorithm for
896	   use of TMMBR, the difference in perceived overhead between the
897	   sending and receiving ends presents no difficulty because
898	   calculations are carried out in terms of variables (packet rate, net
899	   media bit rate) that have the same value at the sender as at the
900	   receiver.

902	   Reporting both maximum total media bit rate and per-packet overhead
903	   allows different receivers to provide bit rate and overhead values
904	   for different protocol layers, for example at the IP level, at the
905	   outer part of a tunnel protocol, or at the link layer.  The protocol
906	   level a peer reports on depends on the level of integration the peer
907	   has, as it needs to be able to extract the information from that
908	   protocol level.  For example, an application with no knowledge of the
909	   IP version it is running over cannot meaningfully determine the
910	   overhead of the IP header, and hence will not want to include IP
911	   overhead in the overhead or maximum total media bit rate calculation.

913	   It is expected that most peers will be able to report values at least
914	   for the IP layer.  In certain implementations it may be advantageous
915	   to also include information pertaining to the link layer, which in
916	   turn allows for a more precise overhead calculation and a better
917	   optimization of connectivity resources.

919	   The Temporary Maximum Media Stream Bit Rate messages are generic
920	   messages that can be applied to any RTP packet stream.  This
921	   separates them from the other codec control messages defined in this
922	   specification, which apply only to specific media types or payload
923	   formats.  The TMMBR functionality applies to the transport, and the
924	   requirements the transport places on the media encoding.

926	   The reasoning below assumes that the participants have negotiated a
927	   session maximum bit rate, using a signaling protocol.  This value can
928	   be global, for example in case of point-to-point, multicast, or
929	   translators.  It may also be local between the participant and the
930	   peer or mixer.  In either case, the bit rate negotiated in signaling
931	   is the one that the participant guarantees to be able to handle
932	   (depacketize and decode).  In practice, the connectivity of the
933	   participant also influences the negotiated value -- it does not make
934	   much sense to negotiate a total media bit rate that one's network
935	   interface does not support.

937	   It is also beneficial to have negotiated a maximum packet rate for
938	   the session or sender.  RFC 3890 provides an SDP [RFC4566] attribute
939	   that can be used for this purpose; however, that attribute is not
940	   usable in RTP sessions established using offer/answer [RFC3264].
941	   Therefore an optional maximum packet rate signaling parameter is
942	   specified in this memo.

944	   An already established maximum total media bit rate may be changed at
945	   any time, subject to the timing rules governing the sending of
946	   feedback messages. The limit may change to any value between zero and
947	   the session maximum, as negotiated during session establishment
948	   signaling.  However, even if a sender has received a TMMBR message
949	   allowing an increase in the bit rate, all increases must be governed
950	   by a congestion control mechanism.  TMMBR indicates known limitations
951	   only, usually in the local environment, and does not provide any
952	   guarantees about the full path.  Furthermore, any increases in TMMBR-
953	   established bit rate limits are to be executed only after a certain
954	   delay from the sending of the TMMBN message that notifies the world
955	   about the increase in limit.  The delay is specified as at least
956	   twice the longest RTT as known by the media sender, plus the media
957	   sender's calculation of the required wait time for the sending of
958	   another TMMBR message for this session based on AVPF timing rules.
959	   This delay is introduced to allow other session participants to make
960	   known their bit rate limit requirements, which may be lower.
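
   The delay rule for bit rate increases can be written out as a short
   calculation.  This is only a sketch: the function and parameter
   names are hypothetical, and all times are assumed to be in seconds.

```python
def earliest_increase_time(tmmbn_sent_at: float,
                           longest_rtt: float,
                           avpf_wait: float) -> float:
    """Earliest time at which a raised TMMBR limit may take effect.

    tmmbn_sent_at: when the TMMBN announcing the higher limit was sent.
    longest_rtt: longest RTT known to the media sender.
    avpf_wait: the sender's estimate of how long a participant must wait
        before it can send another TMMBR under the AVPF timing rules.
    """
    # At least twice the longest RTT, plus the AVPF-derived wait time.
    return tmmbn_sent_at + 2 * longest_rtt + avpf_wait
```

   The result is a lower bound only; any increase remains subject to
   congestion control.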

962	   If it is likely that the new value indicated by TMMBR will be valid
963	   for the remainder of the session, the TMMBR sender is expected to
964	   perform a renegotiation of the session upper limit using the session
965	   signaling protocol.

967	3.5.4.1. Behavior for media receivers using TMMBR

969	   This section is an informal description of behavior described more
970	   precisely in section 4.2.

972	   A media sender begins the session limited by the maximum media bit
973	   rate and maximum packet rate negotiated in session signaling, if any.

975	   Note that this value may be negotiated for another protocol layer
976	   than the one the participant uses in its TMMBR messages.  Each media
977	   receiver selects a reference protocol layer, forms an estimate of the
978	   overhead it is observing (or estimates it if no packets have been
979	   seen yet) at that reference level, and determines the maximum total
980	   media bit rate it can accept, taking into account its own limitations
981	   and any transport path limitations of which it may be aware.  In case
982	   the current limitations are more restrictive than what was agreed on
983	   in the session signaling, the media receiver reports its initial
984	   estimate of these two quantities to the media sender using a TMMBR
985	   message.  Overall message traffic is reduced by the possibility of
986	   including tuples for multiple media senders in the same TMMBR
987	   message.

989	   The media sender applies an algorithm such as that specified in
990	   section 3.5.4.2 to select which of the tuples it has received are
991	   most limiting (i.e. the bounding set as defined in section 2.2).  It
992	   modifies its operation to stay within the feasible region (as defined
993	   in section 2.2), and also sends out a TMMBN notification to the media
994	   receivers indicating the selected bounding set.

996	   If a media receiver does not own one of the tuples in the bounding
997	   set reported by the TMMBN, it applies the same algorithm as the media
998	   sender to determine if its current estimated (maximum total media bit
999	   rate, overhead) tuple would enter the bounding set if known to the
1000	   media sender.  If so, it issues a TMMBR request reporting the tuple
1001	   value to the sender.  Otherwise it takes no action for the moment.
1002	   Periodically, its estimated tuple values may change or it may receive
1003	   a new TMMBN.  If so, it reapplies the algorithm to decide whether it
1004	   needs to issue a TMMBR request.

1006	   If, alternatively, a media receiver owns one of the tuples in the
1007	   reported bounding set, it takes no action until such time as its
1008	   estimate of its own tuple values changes.  At that time it sends a
1009	   TMMBR request to the media sender to report the changed values.

1011	   A media receiver may change status between owner and non-owner of a
1012	   bounding tuple between one TMMBN message and the next.  Thus it must
1013	   check the contents of each TMMBN to determine its subsequent actions.

1015	   Implementations may use other algorithms of their choosing, as long
1016	   as the bit rate limitations resulting from the exchange of TMMBR and
1017	   TMMBN messages are at least as strict (at least as low, in the bit
1018	   rate dimension) as the ones resulting from the use of the
1019	   aforementioned algorithm.

1021	   Obviously, in point-to-point cases, when there is only one media
1022	   receiver, this receiver becomes "owner" once it receives the first
1023	   TMMBN in response to its own TMMBR, and stays "owner" for the rest of
1024	   the session.  Therefore, when it is known that there will always be
1025	   only a single media receiver, the above algorithm is not required.
1026	   Media receivers that are aware they are the only ones in a session
1027	   can send TMMBR messages with bit rate limits both higher and lower
1028	   than the previously notified limit, at any time (subject to the AVPF
1029	   [RFC4585] RTCP RR send timing rules).  However, it may be difficult
1030	   for a session participant to determine if it is the only receiver in
1031	   the session.  Because of this, any implementation of TMMBR is
1032	   required to include the algorithm described in the next section or a
1033	   stricter equivalent.

1035	3.5.4.2. Algorithm for establishing current limitations

1037	   This section introduces an example algorithm for the calculation of a
1038	   session limit.  Other algorithms can be employed, as long as the
1039	   result of the calculation is at least as restrictive as the result
1040	   that is obtained by this algorithm.

1042	   First it is important to consider the implications of using a tuple
1043	   for limiting the media sender's behavior.  The bit rate and the
1044	   overhead value result in a two-dimensional solution space for the
1045	   calculation of the bit rate of media streams.  Fortunately, the two
1046	   variables are linked.  Specifically, the bit rate available for RTP
1047	   payloads equals the TMMBR reported bit rate minus the product of the
1048	   packet rate used and the TMMBR reported overhead converted to
1049	   bits.  As a result, when different bit rate/overhead combinations
1050	   need to be considered, the packet rate determines the correct
1051	   limitation.  This is perhaps best explained by an example:

1053	   Example:

1055	   Receiver A: TMMBR_max total BR = 35 kbps, TMMBR_OH = 40 bytes
1056	   Receiver B: TMMBR_max total BR = 40 kbps, TMMBR_OH = 60 bytes

1058	   For a given packet rate (PR) the bit rate available for media
1059	   payloads in RTP will be:

1061	   Max_net media_BR_A =
1062	      TMMBR_max total BR_A - PR * TMMBR_OH_A * 8 ... (1)
1063	   Max_net media_BR_B =
1064	      TMMBR_max total BR_B - PR * TMMBR_OH_B * 8 ... (2)

1066	   For a PR = 20 these calculations will yield a Max_net media_BR_A =
1067	   28600 bps and Max_net media_BR_B = 30400 bps, which suggests that
1068	   receiver A is the limiting one for this packet rate.  However at a
1069	   certain PR there is a switchover point at which receiver B becomes
1070	   the limiting one.  The switchover point can be identified by setting
1071	   Max_net media_BR_A equal to Max_net media_BR_B and solving for PR:

1073	         TMMBR_max total BR_A - TMMBR_max total BR_B
1074	   PR =  ------------------------------------------- ... (3)
1075	                8*(TMMBR_OH_A - TMMBR_OH_B)

1077	   which, for the numbers above yields 31.25 as the switchover point
1078	   between the two limits.  That is, for packet rates below 31.25 per
1079	   second, receiver A is the limiting receiver, and for higher packet
1080	   rates, receiver B is more limiting.  The implications of this
1081	   behavior have to be considered by implementations that are going to
1082	   control media encoding and its packetization.  As exemplified above,
1083	   multiple TMMBR limits may apply to the trade-off between net media
1084	   bit rate and packet rate.  Which limitation applies depends on the
1085	   packet rate being considered.
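
   The arithmetic of this example can be checked with a short Python
   sketch.  The function names are illustrative only and are not
   defined by this memo.  Bit rates are in bits per second and overhead
   values in bytes, as in equations (1) to (3).

```python
def net_media_bit_rate(max_total_br, overhead, packet_rate):
    """Equations (1) and (2): bit rate left for RTP payloads, in bps."""
    return max_total_br - packet_rate * overhead * 8


def switchover_packet_rate(br_a, oh_a, br_b, oh_b):
    """Equation (3): packet rate at which both tuples give the same limit."""
    if oh_a == oh_b:
        # Equal overheads give parallel lines, so there is no switchover.
        raise ValueError("tuples with equal overhead never switch over")
    return (br_a - br_b) / (8 * (oh_a - oh_b))


receiver_a = (35_000, 40)  # (TMMBR_max total BR in bps, TMMBR_OH in bytes)
receiver_b = (40_000, 60)

print(net_media_bit_rate(*receiver_a, 20))               # 28600
print(net_media_bit_rate(*receiver_b, 20))               # 30400
print(switchover_packet_rate(*receiver_a, *receiver_b))  # 31.25
```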

1087	   This also has implications for how the TMMBR mechanism needs to work.
1088	   First, there is the possibility that multiple TMMBR tuples are
1089	   providing limitations on the media sender.  Secondly there is a need
1090	   for any session participant (media sender and receivers) to be able
1091	   to determine if a given tuple will become a limitation upon the media
1092	   sender, or if the set of already given limitations is stricter than
1093	   the given values.  In the absence of the ability to make this
1094	   determination the suppression of TMMBR requests would not work.

1096	   The basic idea of the algorithm is as follows.  Each TMMBR tuple can
1097	   be viewed as the equation of a straight line (cf. equations (1) and
1098	   (2)) in a space where packet rate lies along the X-axis and maximum
1099	   bit rate lies along the Y-axis. The lower envelope of the set of
1100	   lines corresponding to the complete set of TMMBR tuples defines a
1101	   polygon. Points lying along or below this polygon are combinations of
1102	   packet rate and bit rate that meet all of the TMMBR constraints. The
1103	   highest feasible packet rate within this region is the minimum of the
1104	   rate at which the bounding polygon meets the X-axis and the session
1105	   maximum packet rate (SMAXPR) provided by signaling, if any. Typically
1106	   a media sender will prefer to operate at a lower rate than this
1107	   theoretical maximum, so as to increase the rate at which actual media
1108	   content reaches the receivers.  The purpose of the algorithm is to
1109	   distinguish the TMMBR tuples constituting the bounding set and thus
1110	   delineate the feasible region, so that the media sender can select
1111	   its preferred operating point within that region.
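
   The geometric view can be sketched in Python as a test of whether an
   operating point lies in the feasible region.  This is an
   illustration rather than part of the algorithm: the function names
   are assumptions, tuples are written as (maximum total bit rate in
   bps, overhead in bytes) pairs, and the sample values are chosen for
   the example only.

```python
def envelope(tuples, packet_rate):
    """Lowest net media bit rate allowed by any tuple at this packet rate."""
    return min(br - 8 * oh * packet_rate for br, oh in tuples)


def is_feasible(tuples, packet_rate, net_media_br, smaxpr=None):
    """True if the operating point lies on or below the bounding polygon."""
    if packet_rate < 0 or net_media_br < 0:
        return False
    if smaxpr is not None and packet_rate > smaxpr:
        return False
    return net_media_br <= envelope(tuples, packet_rate)


tuples = [(35_000, 40), (40_000, 60)]   # example values, (bps, bytes)
print(is_feasible(tuples, 20, 28_000))  # True: the 40-byte tuple allows 28600
print(is_feasible(tuples, 40, 21_000))  # False: the 60-byte tuple allows 20800
```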

1113	   Figure 1 below shows a bounding polygon formed by TMMBR tuples A and
1114	   B. A third tuple C lies outside the bounding polygon and is therefore
1115	   irrelevant in determining feasible tradeoffs between media rate and
1116	   packet rate.  The line labeled ss..s represents the limit on packet
1117	   rate imposed by the session maximum packet rate (SMAXPR) obtained by
1118	   signaling during session setup.  In Figure 1 the limit determined by
1119	   tuple B happens to be more restrictive than SMAXPR.  The situation
1120	   could easily be the reverse, meaning that the bounding polygon is
1121	   terminated on the right by the vertical line representing the SMAXPR
1122	   constraint.

1124	   Net  ^
1125	   Media|a   c   b             s
1126	   Bit  |  a   c  b            s
1127	   Rate |    a   c b           s
1128	        |      a   cb          s
1129	        |        a   c         s
1130	        |          a  bc       s
1131	        |            a b c     s
1132	        |              ab  c   s
1133	        |  Feasible      b   c s
1134	        |   region        ba   s
1135	        |                  b a s c
1136	        |                   b  s   c
1137	        |                    b s a
1138	        |                     bs
1139	        +------------------------------>

1141	              Packet rate

1143	    Figure 1 - Geometric Interpretation of TMMBR Tuples

1145	   Note that the slopes of the lines making up the bounding polygon are
1146	   increasingly negative as one moves in the direction of increasing
1147	   packet rate.  Note also that with slight rearrangement, equations (1)
1148	   and (2) have the canonical form:

1150	          y = mx + b

1152	   where
1153	     m is the slope and has value equal to the negative of the tuple
1154	     overhead (in bits),
1155	   and
1156	     b is the y-intercept and has value equal to the tuple maximum total
1157	     media bit rate.

1159	   These observations lead to the conclusion that when processing the
1160	   TMMBR tuples to select the initial bounding set, one should sort and
1161	   process the tuples by order of increasing overhead. Once a particular
1162	   tuple has been added to the bounding set, all tuples not already
1163	   selected and having lower overhead can be eliminated, because the
1164	   next side of the bounding polygon has to be steeper (i.e. the
1165	   corresponding TMMBR must have higher overhead) than the latest added
1166	   tuple.

1168	   Line cc..c in Figure 1 illustrates another principle. This line is
1169	   parallel to line aa..a, but has a higher Y-intercept.  That is, the
1170	   corresponding TMMBR tuple contains a higher maximum total media bit
1171	   rate value.  Since line cc..c is outside the bounding polygon, it
1172	   illustrates the conclusion that if two TMMBR tuples have the same
1173	   overhead value, the one with higher maximum total media bit rate
1174	   value cannot be part of the bounding set and can be set aside.

1176	   Two further observations complete the algorithm.  Obviously, moving
1177	   from the left, the successive corners of the bounding polygon (i.e.
1178	   the intersection points between successive pairs of sides) lie at
1179	   successively higher packet rates.  On the other hand, again moving
1180	   from the left, each successive line making up the bounding set
1181	   crosses the X-axis at a lower packet rate.

1183	   The complete algorithm can now be specified. The algorithm works with
1184	   two lists of TMMBR tuples, the candidate list X and the selected
1185	   list Y, both ordered by increasing overhead value.  The algorithm
1186	   terminates when all members of X have been discarded or removed for
1187	   processing.  Membership of the selected list Y is probationary until
1188	   the algorithm is complete.  Each member of the selected list is
1189	   associated with an intersection value, which is the packet rate at
1190	   which the line corresponding to that TMMBR tuple intersects with the
1191	   line corresponding to the previous TMMBR tuple in the selected list.
1192	   Each member of the selected list is also associated with a maximum
1193	   packet rate value, which is the lesser of the session maximum packet
1194	   rate SMAXPR (if any) and the packet rate at which the line
1195	   corresponding to that tuple crosses the X-axis.

1197	   When the algorithm terminates, the selected list is equal to the
1198	   bounding set as defined in section 2.2.

1200	Initial Algorithm

1202	   This algorithm is used by the media sender when it has received one
1203	   or more TMMBR requests and before it has determined a bounding set
1204	   for the first time.

1206	   1. Sort the TMMBR tuples by order of increasing overhead.  This is
1207	      the initial candidate list X.

1209	   2. When multiple tuples in the candidate list have the same overhead
1210	      value, discard all but the one with the lowest maximum total media
1211	      bit rate value.

1213	   3. Select and remove from the candidate list the TMMBR tuple with the
1214	      lowest maximum total media bit rate value.  If there is more than
1215	      one tuple with that value, choose the one with the highest
1216	      overhead value.  This is the first member of the selected list Y.
1217	      Set its intersection value equal to zero.  Calculate its maximum
1218	      packet rate as the minimum of SMAXPR (if available) and the value
1219	      obtained from the following formula, which is the packet rate at
1220	      which the corresponding line crosses the X-axis.

1222	          Max PR = TMMBR max total BR / (8 * TMMBR OH) ... (4)

1224	   4. Discard from the candidate list all tuples with a lower overhead
1225	      value than the selected tuple.

1227	   5. Remove the first remaining tuple from the candidate list for
1228	      processing.  Call this the current candidate.

1230	   6. Calculate the packet rate PR at the intersection of the line
1231	      generated by the current candidate with the line generated by the
1232	      last tuple in the selected list Y, using equation (3).

1234	   7. If the calculated value PR is equal to or lower than the
1235	      intersection value stored for the last tuple of the selected list,
1236	      discard the last tuple of the selected list and go back to step 6
1237	      (retaining the same current candidate).

1239	      Note that the choice of the initial member of the selected list Y
1240	      in step 3 guarantees that the selected list will never be emptied
1241	      by this process, meaning that the algorithm must eventually (if
1242	      not immediately) fall through to step 8.

1244	   8. (This step is reached when the calculated PR value of the current
1245	      candidate is greater than the intersection value of the current
1246	      last member of the selected list Y.)  If the calculated value PR
1247	      of the current candidate is lower than the maximum packet rate
1248	      associated with the last tuple in the selected list, add the
1249	      current candidate tuple to the end of the selected list.  Store PR
1250	      as its intersection value.  Calculate its maximum packet rate as
1251	      the lesser of SMAXPR (if available) and the maximum packet rate
1252	      calculated using equation (4).

1254	   9. If any tuples remain in the candidate list, go back to step 5.
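
   Steps 1 through 9 can be sketched in Python as follows.  This is a
   minimal illustration under stated assumptions, not a normative
   implementation: the type and function names are invented for the
   example, every overhead value is assumed to be greater than zero (so
   that equation (4) is defined), and the bit rate chosen for tuple C
   is arbitrary.

```python
from typing import List, NamedTuple, Optional


class Tmmbr(NamedTuple):
    ssrc: int
    max_br: float   # maximum total media bit rate, bits per second
    overhead: int   # per-packet overhead in bytes (assumed > 0)


class Selected(NamedTuple):
    tup: Tmmbr
    intersection: float  # packet rate where this line meets the previous one
    max_pr: float        # lesser of SMAXPR and the X-axis crossing


def intersection_pr(a: Tmmbr, b: Tmmbr) -> float:
    """Equation (3)."""
    return (a.max_br - b.max_br) / (8 * (a.overhead - b.overhead))


def max_packet_rate(t: Tmmbr, smaxpr: Optional[float]) -> float:
    """Equation (4), capped by SMAXPR when one was signalled."""
    crossing = t.max_br / (8 * t.overhead)
    return crossing if smaxpr is None else min(crossing, smaxpr)


def bounding_set(tuples: List[Tmmbr],
                 smaxpr: Optional[float] = None) -> List[Selected]:
    if not tuples:
        return []
    # Steps 1 and 2: sort by overhead; per overhead keep the lowest bit rate.
    candidates: List[Tmmbr] = []
    for t in sorted(tuples, key=lambda t: (t.overhead, t.max_br)):
        if not candidates or candidates[-1].overhead != t.overhead:
            candidates.append(t)
    # Step 3: lowest bit rate first; ties go to the highest overhead.
    first = min(candidates, key=lambda t: (t.max_br, -t.overhead))
    selected = [Selected(first, 0.0, max_packet_rate(first, smaxpr))]
    # Step 4 (this also removes the selected tuple from the candidates).
    candidates = [t for t in candidates if t.overhead > first.overhead]
    # Steps 5 to 9.
    for cand in candidates:
        pr = intersection_pr(cand, selected[-1].tup)      # step 6
        while pr <= selected[-1].intersection:            # step 7
            selected.pop()
            pr = intersection_pr(cand, selected[-1].tup)
        if pr < selected[-1].max_pr:                      # step 8
            selected.append(
                Selected(cand, pr, max_packet_rate(cand, smaxpr)))
    return selected


a, b, c = Tmmbr(1, 35_000, 40), Tmmbr(2, 40_000, 60), Tmmbr(3, 45_000, 40)
print([s.tup.ssrc for s in bounding_set([c, b, a])])             # [1, 2]
print([s.tup.ssrc for s in bounding_set([c, b, a], smaxpr=25)])  # [1]
```

   In the second call the session maximum packet rate of 25 lies below
   the intersection of the two lines at 31.25, so the second tuple
   never becomes a limit and step 8 leaves it out.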

1256	Incremental Algorithm
1257	   The previous algorithm covered the initial case, where no selected
1258	   list had previously been created.  It also applied only to the media
1259	   sender.  When a previously-created selected list is available at
1260	   either the media sender or media receiver, two other cases can be
1261	   considered:

1263	        o when a TMMBR tuple not currently in the selected list is a
1264	          candidate for addition;

1266	        o when the values change in a TMMBR tuple currently in the
1267	          selected list.

1269	   At the media receiver these cases correspond respectively to those of
1270	   the non-owner and owner of a tuple in the TMMBN-reported bounding
1271	   set.

1273	   In either case, the process of updating the selected list to take
1274	   account of the new/changed tuple can use the basic algorithm
1275	   described above, with the modification that the initial candidate set
1276	   consists only of the existing selected list and the new or changed
1277	   tuple.  Some further optimization is possible (beyond starting with a
1278	   reduced candidate set) by taking advantage of the following
1279	   observations.

1281	   The first observation is that if the new/changed candidate becomes
1282	   part of the new selected list, the result may be to cause zero or
1283	   more other tuples to be dropped from the list.  However, if more than
1284	   one other tuple is dropped, the dropped tuples will be consecutive.
1285	   This can be confirmed geometrically by visualizing a new line that
1286	   cuts off a series of segments from the previously-existing bounding
1287	   polygon.  The cut-off segments are connected one to the next, the
1288	   geometric equivalent of consecutive tuples in a list ordered by
1289	   overhead value.  Beyond the dropped set in either direction all of
1290	   the tuples that were in the earlier selected list will be in the
1291	   updated one.  The second observation is that, leaving aside the new
1292	   candidate, the order of tuples remaining in the updated selected list
1293	   is unchanged because their overhead values have not changed.

1295	   The consequence of these two observations is that, once the placement
1296	   of the new candidate and the extent of the dropped set of tuples (if
1297	   any) have been determined, the remaining tuples can be copied directly
1298	   from the candidate list into the selected list, preserving their
1299	   order.  This conclusion suggests the following modified algorithm:

1301	       o Run steps 1-4 of the basic algorithm.

1303	       o If the new candidate has survived steps 2 and 4 and has become
1304	          the new first member of the selected list, run steps 5-9 on
1305	          subsequent candidates until another candidate is added to the
1306	          selected list.  Then move all remaining candidates to the
1307	          selected list, preserving their order.

1309	       o If the new candidate has survived steps 2 and 4 and has not
1310	          become the new first member of the selected list, start by
1311	          moving all tuples in the candidate list with lower overhead
1312	          values than that of the new candidate to the selected list,
1313	          preserving their order.  Run steps 5 through 9 for the new
1314	          candidate, with the modification that the intersection values
1315	          and maximum packet rates for the tuples on the selected list
1316	          have to be calculated on the fly because they were not
1317	          previously stored. Continue processing only until a subsequent
1318	          tuple has been added to the selected list, then move all
1319	          remaining candidates to the selected list, preserving their
1320	          order.

1322	          Note that the new candidate could be added to the selected
1323	          list only to be dropped again when the next tuple is
1324	          processed.  It can easily be seen that in this case the new
1325	          candidate does not displace any of the earlier tuples in the
1326	          selected list.  The limitations of ASCII art make this
1327	          difficult to show in a figure.  Line cc..c in Figure 1 would
1328	          be an example if it had a steeper slope (tuple C had a higher
1329	          overhead value), but still intersected line aa..a beyond
1330	          where line aa..a intersects line bb..b.

1332	   The algorithm just described is approximate, because it does not take
1333	   account of tuples outside the selected list.  To see how such tuples
1334	   can become relevant, consider Figure 1 and suppose that the maximum
1335	   total media bit rate in tuple A increases to the point that line
1336	   aa..a moves outside line cc..c.  Tuple A will remain in the bounding
1337	   set calculated by the media sender.  However, once it issues a new
1338	   TMMBN, media receiver C will apply the algorithm and discover that
1339	   its tuple C should now enter the bounding set.  It will issue a TMMBR
1340	   request to the media sender, which will repeat its calculation and
1341	   come to the appropriate conclusion.

1343	   The rules of section 4.2 require that the media sender refrain from
1345	   raising its sending rate until media receivers have had a chance to
1346	   respond to the TMMBN.  In the example just given, this delay ensures
1347	   that the relaxation of tuple A does not actually result in an attempt
1348	   to send media at a rate exceeding the capacity at C.
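
   For the first incremental case (a tuple not currently in the
   selected list), a media receiver needs only a yes/no answer.  The
   Python sketch below is not the list-based procedure of this section.
   It is a brute-force geometric check that is intended to give the
   same answer: the new tuple matters if its line dips strictly below
   the lower envelope of the reported bounding set somewhere between
   packet rate zero and the highest feasible packet rate.  Because the
   envelope is concave, testing the two end points and the corners is
   sufficient.  The function names are illustrative, tuples are (bit
   rate in bps, overhead in bytes) pairs, and every overhead is assumed
   to be greater than zero.

```python
def line(t, pr):
    """Net media bit rate allowed by tuple t = (bps, bytes) at packet rate pr."""
    br, oh = t
    return br - 8 * oh * pr


def highest_feasible_pr(bounding, smaxpr=None):
    """Where the lower envelope meets the X-axis, capped by SMAXPR."""
    crossing = min(br / (8 * oh) for br, oh in bounding)
    return crossing if smaxpr is None else min(crossing, smaxpr)


def enters_bounding_set(new, bounding, smaxpr=None):
    if not bounding:
        return True  # assumption: with no reported limit, any tuple is one
    pr_max = highest_feasible_pr(bounding, smaxpr)
    points = {0.0, pr_max}
    for i, (br_a, oh_a) in enumerate(bounding):
        for br_b, oh_b in bounding[i + 1:]:
            if oh_a != oh_b:
                pr = (br_a - br_b) / (8 * (oh_a - oh_b))  # equation (3)
                if 0 < pr < pr_max:
                    points.add(pr)
    return any(line(new, pr) < min(line(t, pr) for t in bounding)
               for pr in points)


reported = [(35_000, 40), (40_000, 60)]              # example bounding set
print(enters_bounding_set((45_000, 40), reported))   # False
print(enters_bounding_set((36_000, 100), reported))  # True
```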

1350	3.5.4.3. Use of TMMBR in a Mixer Based Multipoint Operation

1352	   Assume a small mixer-based multiparty conference is ongoing, as
1353	   depicted in Topo-Mixer of [Topologies].  All participants have
1354	   negotiated a common maximum bit rate that this session can use.  The
1355	   conference operates over a number of unicast paths between the
1356	   participants and the mixer.  The congestion situation on each of
1357	   these paths can be monitored by the participant in question and by
1358	   the mixer, utilizing, for example, RTCP receiver reports (RR) or the
1359	   transport protocol, e.g. DCCP [RFC4340].  However, any given
1360	   participant has no knowledge of the congestion situation of the
1361	   connections to the other participants.  Worse, without mechanisms
1362	   similar to the ones discussed in this draft, the mixer (which is
1363	   aware of the congestion situation on all connections it manages) has
1364	   no standardized means to inform media senders to slow down, short of
1365	   forging its own receiver reports (which is undesirable).  In
1366	   principle, a mixer confronted with such a situation is obliged to
1367	   thin or transcode streams intended for connections that detected
1368	   congestion.

1370	   In practice, media-aware stream thinning is unfortunately a very
1371	   difficult and cumbersome operation and adds undesirable delay.  If
1372	   media-unaware, it leads very quickly to unacceptable reproduced media
1373	   quality.  Hence, a means to slow down senders even in the absence of
1374	   congestion on their connections to the mixer is desirable.

1376	   To allow the mixer to throttle traffic on the individual links,
1377	   without performing transcoding, there is a need for a mechanism that
1378	   enables the mixer to ask a participant's media encoders to limit the
1379	   media stream bit rate they are currently generating.  TMMBR provides
1380	   the required mechanism.  When the mixer detects congestion between
1381	   itself and a given participant, it executes the following procedure:

1383	   1. It starts thinning the media traffic to the congested participant
1384	      to the supported bit rate.

1386	   2. It uses TMMBR to request the media sender(s) to reduce the total
1387	      media bit rate sent by them to the mixer, to a value that is in
1388	      compliance with congestion control principles for the slowest
1389	      link.  Slow refers here to the available bandwidth / bit rate /
1390	      capacity and packet rate after congestion control.

1392	   3. As soon as the bit rate has been reduced by the sending party, the
1393	      mixer stops stream thinning implicitly, because there is no need
1394	      for it once the stream is in compliance with congestion control.

1396	   This use of stream thinning as an immediate reaction tool followed up
1397	   by a quick control mechanism appears to be a reasonable compromise
1398	   between media quality and the need to combat congestion.

1400	3.5.4.4. Use of TMMBR in Point-to-Multipoint Using Multicast or
1401	   Translators

1403	   In these topologies, corresponding to Topo-Multicast or Topo-
1404	   Translator, RTCP RRs are transmitted globally.  This allows all
1405	   participants to detect transmission problems such as congestion, on a
1406	   medium timescale.  As all media senders are aware of the congestion
1407	   situation of all media receivers, the rationale for the use of TMMBR
1408	   in the previous section does not apply.  However, even in this case
1409	   the congestion control response can be improved when the unicast
1410	   links are using congestion controlled transport protocols (such as
1411	   TCP or DCCP).  A peer may also report local limitations to the media
1412	   sender.

1414	3.5.4.5. Use of TMMBR in Point-to-point operation

1416	   In use case 7 it is possible to use TMMBR to improve the performance
1417	   when the known upper limit of the bit rate changes.  In this use case
1418	   the signaling protocol has established an upper limit for the session
1419	   and total media bit rates.  However, at the time of transport link
1420	   bit rate reduction, a receiver can avoid serious congestion by
1421	   sending a TMMBR to the sending side.  Thus TMMBR is useful for
1422	   putting restrictions on the application and thus placing the
1423	   congestion control mechanism in the right ballpark.  However TMMBR is
1424	   usually unable to provide the continuously quick feedback loop
1425	   required for real congestion control.  Nor do its semantics match
1426	   those of congestion control given its different purpose.  For these
1427	   reasons TMMBR SHALL NOT be used as a substitute for congestion
1428	   control.

1430	3.5.4.6. Reliability

1432	   The reaction of a media sender to the reception of a TMMBR message is
1433	   not immediately identifiable through inspection of the media stream.
1434	   Therefore, a more explicit mechanism is needed to avoid unnecessary
1435	   re-sending of TMMBR messages.  Using a statistically based
1436	   retransmission scheme would only provide statistical guarantees of
1437	   the request being received.  It would also not avoid the
1438	   retransmission of already received messages.  In addition, it would
1439	   not allow for easy suppression of other participants' requests.  For
1440	   these reasons, a mechanism based on explicit notification is used.

1442	   Upon the reception of a request, a media sender sends a TMMBN
1443	   message containing the current bounding set and indicating which
1444	   session participants own its tuples.  In multicast scenarios,
1445	   that allows all other participants to suppress any request they may
1446	   have, if their limitations are less strict than the current ones
1447	   (i.e. define lines lying outside the feasible region as defined in
1448	   section 2.2).  Keeping and notifying only the bounding set of tuples
1449	   allows for small message sizes and media sender states.  A media
1450	   sender only keeps state for the SSRCs of the current owners of the
1451	   bounding set of tuples; all other requests and their sources are not
1452	   saved.  Once the bounding set has been established, new TMMBR
1453	   messages should be generated only by owners of the bounding tuples
1454	   and by other entities that determine (by applying the algorithm of
1455	   section 3.5.4.2 or its equivalent) that their limitations should now
1456	   be part of the bounding set.

1458	4. RTCP Receiver Report Extensions

1460	   This memo specifies six new feedback messages.  The Full Intra
1461	   Request (FIR), Temporal-Spatial Trade-off Request (TSTR), Temporal-
1462	   Spatial Trade-off Notification (TSTN), and Video Back Channel Message
1463	   (VBCM) are "Payload Specific Feedback Messages" as defined in Section
1464	   6.3 of AVPF [RFC4585].  The Temporary Maximum Media Stream Bit Rate
1465	   Request (TMMBR) and Temporary Maximum Media Stream Bit Rate
1466	   Notification (TMMBN) are "Transport Layer Feedback Messages" as
1467	   defined in Section 6.2 of AVPF.

1469	   The new feedback messages are defined in the following subsections,
1470	   following a similar structure to that in sections 6.2 and 6.3 of the
1471	   AVPF specification [RFC4585].

1473	4.1. Design Principles of the Extension Mechanism

1475	   RTCP was originally introduced as a channel to convey presence,
1476	   reception quality statistics and hints on the desired media coding.
1477	   A limited set of media control mechanisms was introduced in early
1478	   RTP payload formats for video, for example in RFC 4587
1479	   [RFC4587].  However, this specification, for the first time, suggests
1480	   a two-way handshake for some of its messages.  There is danger that
1481	   this introduction could be misunderstood as a precedent for the use
1482	   of RTCP as an RTP session control protocol.  To prevent such a
1483	   misunderstanding, this subsection attempts to clarify the scope of
1484	   the extensions specified in this memo, and strongly suggests that
1485	   future extensions follow the rationale spelled out here, or
1486	   compellingly explain why they deviate from the rationale.

1488	   In this memo, and in AVPF [RFC4585], only such messages have been
1489	   included as:

1491	   a) have comparatively strict real-time constraints, which prevent the
1492	      use of mechanisms such as a SIP re-invite in most application
1493	      scenarios.  The real-time constraints are explained separately for
1494	      each message where necessary.

1496	   b) are multicast-safe in that the reaction to potentially
1497	      contradicting feedback messages is specified, as necessary for
1498	      each message; and

1500	   c) are directly related to activities of a certain media codec, class
1501	      of media codecs (e.g. video codecs), or a given RTP packet stream.

1503	   In this memo, a two-way handshake is introduced only for messages for
1504	   which:

1506	   a) a notification or acknowledgement is required due to their nature.
1507	      An analysis to determine whether this requirement exists has been
1508	      performed separately for each message.

1510	   b) the notification or acknowledgement cannot be easily derived from
1511	      the media bit stream.

1513	   All messages in AVPF [RFC4585] and in this memo present their
1514	   contents in a simple, fixed binary format.  This accommodates media
1515	   receivers which have not implemented higher control protocol
1516	   functionalities (SDP, XML parsers and such) in their media path.

1518	4.2. Transport Layer Feedback Messages

1520	   As specified in section 6.1 of RFC 4585 [RFC4585], Transport Layer
1521	   Feedback messages are identified by the RTCP packet type value RTPFB
1522	   (205).

1524	   In AVPF, one message of this category had been defined.  This memo
1525	   specifies two more such messages.  They are identified by means of
1526	   the FMT parameter as follows:

1528	   Assigned in AVPF [RFC4585]:

1530	      1:    Generic NACK
1531	      31:   reserved for future expansion of the identifier number space

1533	   Assigned in this memo:

1535	      2:    reserved (see note below)
1536	      3:    Temporary Maximum Media Stream Bit Rate Request (TMMBR)
1537	      4:    Temporary Maximum Media Stream Bit Rate Notification (TMMBN)

1539	          Note: early drafts of AVPF [RFC4585] reserved FMT=2 for a code
1540	          point that has later been removed.  It has been pointed out
1541	          that there may be implementations in the field using this
1542	          value in accordance with the expired draft.  As there is
1543	          sufficient numbering space available, we mark FMT=2 as
1544	          reserved to avoid possible interoperability problems with
1545	          any such early implementations.

1547	   Available for assignment:

1549	      0:    unassigned
1550	      5-30: unassigned

1552	   The following subsections define the formats of the FCI entries for
1553	   the TMMBR and TMMBN messages, respectively, and specify the
1554	   associated behaviour at the media sender and receiver.

1556	4.2.1.   Temporary Maximum Media Stream Bit Rate Request (TMMBR)

1558	   The FCI field of a Temporary Maximum Media Stream Bit-Rate Request
1559	   (TMMBR) message SHALL contain one or more FCI entries.

1561	4.2.1.1. Message Format

1563	   The Feedback Control Information (FCI) consists of one or more TMMBR
1564	   FCI entries with the following syntax:

1566	    0                   1                   2                   3
1567	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1568	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1569	   |                              SSRC                             |
1570	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1571	   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1572	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1574	    Figure 2 - Syntax of an FCI entry in the TMMBR message

1576	     SSRC (32 bits): The SSRC value of the media sender that is
1577	              requested to obey the new maximum bit rate.

1579	     MxTBR Exp (6 bits): The exponential scaling of the mantissa for the
1580	              maximum total media bit rate value.  The value is an
1581	              unsigned integer [0..63].

1583	     MxTBR Mantissa (17 bits): The mantissa of the maximum total media
1584	              bit rate value as an unsigned integer.

1586	     Measured Overhead (9 bits): The measured average packet overhead
1587	              value in bytes.  The measurement SHALL be done according
1588	              to the description in section 4.2.1.2.  The value is an
1590	              unsigned integer [0..511].

1592	   The maximum total media bit rate (MxTBR) value in bits per second is
1593	   calculated from the MxTBR exponent (exp) and mantissa in the
1594	   following way:

1596	      MxTBR = mantissa * 2^exp

1598	   This allows for 17 bits of resolution in the range 0 to 131071*2^63
1599	   (approximately 1.2*10^24).

1601	   The length of the TMMBR feedback message SHALL be set to 2+2*N where
1602	   N is the number of TMMBR FCI entries.
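
   The layout of Figure 2 and the MxTBR formula can be illustrated with
   a small Python sketch that packs and unpacks one FCI entry.  The
   function names are invented for this example.  Dropping low-order
   bits when a rate needs more than 17 bits is an assumption made here
   so that the encoded value never exceeds the requested one; the text
   above does not prescribe a rounding rule.

```python
import struct


def encode_mxtbr(bit_rate):
    """Split a bit rate into (exp, mantissa) with a 17-bit mantissa."""
    if bit_rate < 0:
        raise ValueError("bit rate must not be negative")
    exp, mantissa = 0, int(bit_rate)
    while mantissa >= 1 << 17:
        mantissa >>= 1  # assumption: truncate rather than round
        exp += 1
    if exp > 63:
        raise ValueError("bit rate does not fit in MxTBR")
    return exp, mantissa


def pack_fci(ssrc, bit_rate, overhead):
    """Build one 8-byte TMMBR FCI entry in network byte order."""
    if not 0 <= overhead < 1 << 9:
        raise ValueError("overhead must fit in 9 bits")
    exp, mantissa = encode_mxtbr(bit_rate)
    return struct.pack("!II", ssrc, exp << 26 | mantissa << 9 | overhead)


def unpack_fci(entry):
    """Return (ssrc, MxTBR in bps, overhead in bytes) from an FCI entry."""
    ssrc, word = struct.unpack("!II", entry)
    exp, mantissa = word >> 26, (word >> 9) & 0x1FFFF
    return ssrc, mantissa << exp, word & 0x1FF


entry = pack_fci(0x12345678, 1_000_000, 40)
print(len(entry))         # 8
print(unpack_fci(entry))  # (305419896, 1000000, 40)
```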

1604	4.2.1.2. Semantics

1606	Behaviour at the Media Receiver (Sender of the TMMBR)

1608	   TMMBR is used to indicate a transport related limitation at the
1609	   reporting entity acting as a media receiver.  TMMBR has the form of a
1610	   tuple containing two components.  The first value is the highest bit
1611	   rate per sender of a media stream, available at a receiver-chosen
1612	   protocol layer, which the receiver currently supports in this RTP
1613	   session.  The second value is the measured header overhead in bytes
1614	   as defined in section 2.2 and measured at the chosen protocol layer
1615	   in the packets received for the stream.  The measurement of the
1616	   overhead is a running average that is updated for each packet
1617	   received for this particular media source (SSRC), using the following
1618	   formula:

1620	       avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH,

1622	   where avg_OH is the running (exponentially smoothed) average and
1623	   pckt_OH is the overhead observed in the latest packet.
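
   A minimal Python sketch of this running average is shown below.  The
   class name is illustrative, and seeding the average with the first
   observed packet is an assumption; the formula above does not state
   an initial value.

```python
class OverheadEstimator:
    """Per-SSRC running average of the packet overhead, in bytes."""

    def __init__(self):
        self.avg_oh = None

    def update(self, pckt_oh):
        if self.avg_oh is None:
            # Assumption: seed the average with the first observed packet.
            self.avg_oh = float(pckt_oh)
        else:
            self.avg_oh = 15 / 16 * self.avg_oh + 1 / 16 * pckt_oh
        return self.avg_oh


est = OverheadEstimator()
est.update(40)
print(est.update(56))  # 41.0
```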

1625	   If a maximum bit rate has been negotiated through signaling, the
1626	   maximum total media bit rate that the receiver reports in a TMMBR
1627	   message MUST NOT exceed the negotiated value converted to a common
1628	   basis (i.e. with overheads adjusted to bring it to the same reference
1629	   protocol layer).

1631	   Within the common packet header for feedback messages (as defined in
1632	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
1633	   indicates the source of the request, and the "SSRC of media source"
1634	   is not used and SHALL be set to 0.  Within a particular TMMBR FCI
1635	   entry, the "SSRC of media sender" in the FCI field denotes the media
1636	   sender the tuple applies to.  This is useful in the multicast or
1637	   translator topologies where the reporting entity may address all of
1638	   the media senders in a single TMMBR message using multiple FCI
1639	   entries.

1641	   The media receiver SHALL save the contents of the latest TMMBN
1642	   message received from each media sender.

1644	   The media receiver MAY send a TMMBR FCI entry to a particular media
1645	   sender under the following circumstances:

1647	     o   before any TMMBN message has been received from that media
1648	          sender;

1650	     o   when the media receiver has been identified as the source of a
1651	          bounding tuple within the latest TMMBN message received from
1652	          that media sender, and the value of the maximum total media
1653	          bit rate or the overhead relating to that media sender has
1654	          changed;

1656	     o   when the media receiver has not been identified as the source
1657	          of a bounding tuple within the latest TMMBN message received
1658	          from that media sender, and, after the media receiver applies
1659	          the incremental algorithm from section 3.5.4.2 or a stricter
1660	          equivalent, the media receiver's tuple relating to that media
1661	          sender is determined to belong to the bounding set.

1663	   A TMMBR FCI entry MAY be repeated in subsequent TMMBR messages if no
1664	   Temporary Maximum Media Stream Bit-Rate Notification (TMMBN) FCI has
1665	   been received from the media sender at the time of transmission of
1666	   the next RTCP packet.  The bit rate value of a TMMBR FCI entry MAY be
1667	   changed from one TMMBR message to the next.  The overhead measurement
1668	   SHALL be updated to the current value of avg_OH each time the entry
1669	   is sent.

1671	   If the value set by a TMMBR message is expected to be permanent, the
1672	   TMMBR setting party SHOULD renegotiate the session parameters to
1673	   reflect that using session setup signaling, e.g. a SIP re-invite.

1675	Behaviour at the Media Sender (Receiver of the TMMBR)

1677	   When it receives a TMMBR message containing an FCI entry relating to
1678	   it, the media sender SHALL use an initial or incremental algorithm as
1679	   applicable to determine the bounding set of tuples based on the new
1680	   information.  The algorithm used SHALL be at least as strict as the
1681	   corresponding algorithm defined in section 3.5.4.2.  The media sender
1682	   MAY accumulate TMMBR requests over a small interval (relative to the
1683	   RTCP sending interval) before making this calculation.

1685	   Once it has determined the bounding set of tuples, the media sender
1686	   MAY use any combination of packet rate and net media bit rate within
1687	   the feasible region that these tuples describe to produce a lower
1688	   total media stream bit rate, as it may need to address a congestion
1689	   situation or other limiting factors.  See section 5 (congestion
1690	   control) for more discussion.

1692	   If the media sender concludes that it can increase the maximum total
1693	   media bit rate value, it SHALL wait before actually doing so, for a
1694	   period long enough to allow a media receiver to respond to the TMMBN
1695	   if it determines that its tuple belongs in the bounding set.  This
1696	   delay period is estimated by the formula:

1698	      2 * RTT + T_Dither_Max,

1700	   where RTT is the longest round trip time known to the media sender
1701	   and T_Dither_Max is defined in section 3.4 of [RFC4585].
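
As a hedged illustration of the delay formula, the helper below computes the
waiting period from a set of known round trip times.  The function name and
the use of seconds as the unit are assumptions of this sketch; the value of
T_Dither_Max has to be derived as described in [RFC4585].

```python
def increase_hold_off(known_rtts_s, t_dither_max_s):
    """Return 2 * RTT + T_Dither_Max, using the longest known RTT."""
    longest_rtt = max(known_rtts_s)
    return 2.0 * longest_rtt + t_dither_max_s


# Example values are purely illustrative.
delay_s = increase_hold_off([0.125, 0.25], 0.5)
```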

1703	   A TMMBN message SHALL be sent by the media sender at the earliest
1704	   possible point in time, in response to any TMMBR messages received
1705	   since the last sending of TMMBN.  The TMMBN message indicates the
1706	   calculated set of bounding tuples and the owners of those tuples at
1707	   the time of the transmission of the message.

1709	   An SSRC may time out according to the default rules for RTP session
1710	   participants, i.e. the media sender has not received any RTP or RTCP
1711	   packets from the owner for the last five regular reporting intervals.
1712	   An SSRC may also explicitly leave the session, with the participant
1713	   indicating this through the transmission of an RTCP BYE packet or
1714	   using an external signaling channel.  If the media sender determines
1715	   that the owner of a tuple in the bounding set has left the session,
1716	   the media sender SHALL transmit a new TMMBN containing the
1717	   previously-determined set of bounding tuples but with the tuple
1718	   belonging to the departed owner removed.

1720	   A media sender MAY proactively initiate the equivalent to a TMMBR
1721	   message to itself, when it is aware that its transmission path is
1722	   more restrictive than the current limitations.  As a result, a TMMBN
1723	   indicating the media source itself as the owner of a tuple is sent,
1724	   thereby avoiding unnecessary TMMBR messages from other
1725	   participants.  However, like any other participant, when the media
1726	   sender becomes aware of changed limitations, it is required to change
1727	   the tuple, and to send a corresponding TMMBN.

1729	Discussion

1731	   Due to the unreliable nature of transport of TMMBR and TMMBN, the
1732	   above rules may lead to the sending of TMMBR messages which appear to
1733	   disobey those rules.  Furthermore, in multicast scenarios it can
1734	   happen that more than one "non-owning" session participant may
1735	   determine, rightly or wrongly, that its tuple belongs in the bounding
1736	   set.  This is not critical for a number of reasons:

1738	   a) If a TMMBR message is lost in transmission, either the media
1739	      sender sends a new TMMBN message in response to some other media
1740	      receiver or it does not send a new TMMBN message at all.  In the
1741	      first case, the media receiver applies the incremental algorithm
1742	      and, if it determines that its tuple should be part of the
1743	      bounding set, sends out another TMMBR.  In the second case, it
1744	      repeats the sending of a TMMBR unconditionally.  Either way, the
1745	      media sender eventually gets the information it needs.

1747	   b) Similarly, if a TMMBN message gets lost, the media receiver that
1748	      has sent the corresponding TMMBR request does not receive the
1749	      notification and is expected to re-send the request and trigger
1750	      the transmission of another TMMBN.

1752	   c) If multiple competing TMMBR messages are sent by different session
1753	      participants, then the algorithm can be applied taking all of
1754	      these messages into account, and the resulting TMMBN provides the
1755	      participants with an updated view of how their tuples compare with
1756	      the bounding set.

1758	   d) If more than one session participant happens to send TMMBR
1759	      messages at the same time and with the same tuple component
1760	      values, it does not matter which if either tuple is taken into the
1761	      bounding set.  The losing session participant will determine after
1762	      applying the algorithm that its tuple does not enter the bounding
1763	      set, and will therefore stop sending its TMMBR request.

1765	   It is important to consider the security risks involved with faked
1766	   TMMBRs.  See the security considerations in Section 6.

1768	   As indicated already, the feedback messages may be used in both
1769	   multicast and unicast sessions in any of the specified topologies.
1770	   However, for sessions with a large number of participants, using the
1771	   lowest common denominator, as required by this mechanism, may not be
1772	   the most suitable course of action.  Large sessions may need to
1773	   consider other ways to adapt the bit rate to participants'
1774	   capabilities, such as partitioning the session into different quality
1775	   tiers, or using some other method of achieving bit rate scalability.

1777	4.2.1.3. Timing Rules

1779	   The first transmission of the TMMBR request message MAY use early or
1780	   immediate feedback in cases when timeliness is desirable.  Any
1781	   repetition of a request message SHOULD use regular RTCP mode for its
1782	   transmission timing.

1784	4.2.1.4. Handling in Translators and Mixers
1785	   Media translators and mixers will need to receive and respond to
1786	   TMMBR messages as they are part of the chain that provides a certain
1787	   media stream to the receiver.  The mixer or translator may act
1788	   locally on the TMMBR request and thus generate a TMMBN to indicate
1789	   that it has done so.  Alternatively, in the case of a media
1790	   translator it can forward the request, or in the case of a mixer
1791	   generate one of its own and pass it forward.  In the latter case, the
1792	   mixer will need to send a TMMBN back to the original requestor to
1793	   indicate that it is handling the request.

1795	4.2.2. Temporary Maximum Media Stream Bit Rate Notification (TMMBN)

1797	   The FCI field of the TMMBN Feedback message may contain zero, one or
1798	   more TMMBN FCI entries.

1800	4.2.2.1. Message Format

1802	   The Feedback Control Information (FCI) consists of zero, one or more
1803	   TMMBN FCI entries with the following syntax:

1805	    0                   1                   2                   3
1806	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1807	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1808	   |                              SSRC                             |
1809	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1810	   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1811	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1813	    Figure 3 - Syntax of an FCI entry in the TMMBN message

1815	     SSRC (32 bits): The SSRC value of the "owner" of this tuple.

1817	     MxTBR Exp (6 bits): The exponential scaling of the mantissa for the
1818	              maximum total media bit rate value.  The value is an
1819	              unsigned integer [0..63].

1821	     MxTBR Mantissa (17 bits): The mantissa of the maximum total media
1822	              bit rate value as an unsigned integer.

1824	     Measured Overhead (9 bits): The measured average packet overhead
1825	              value in bytes represented as an unsigned integer.

1827	   Thus the FCI within the TMMBN message contains entries indicating the
1828	   bounding tuples.  For each tuple, the entry gives the owner by the
1829	   SSRC, followed by the applicable maximum total media bit rate and
1830	   overhead value.

1832	   The length of the TMMBN message SHALL be set to 2+2*N where N is the
1833	   number of TMMBN FCI entries.
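
The field widths described above (32-bit SSRC, then 6, 17 and 9 bits) can be
packed into two network-order 32-bit words.  The sketch below is illustrative
only; the function names are hypothetical and it does not interpret the bit
rate value.

```python
import struct


def pack_tmmbn_entry(ssrc, exp, mantissa, overhead):
    """Encode one TMMBN FCI entry as 8 bytes in network byte order."""
    if not (0 <= ssrc < 2**32 and 0 <= exp < 2**6
            and 0 <= mantissa < 2**17 and 0 <= overhead < 2**9):
        raise ValueError("field value out of range")
    word = (exp << 26) | (mantissa << 9) | overhead
    return struct.pack("!II", ssrc, word)


def unpack_tmmbn_entry(data):
    """Decode 8 bytes into (ssrc, exp, mantissa, overhead)."""
    ssrc, word = struct.unpack("!II", data)
    return ssrc, word >> 26, (word >> 9) & 0x1FFFF, word & 0x1FF


def tmmbn_length_field(num_entries):
    """Length value 2+2*N for a message carrying N FCI entries."""
    return 2 + 2 * num_entries


entry = pack_tmmbn_entry(0x12345678, 2, 12500, 40)
```

An empty TMMBN (no FCI entries) yields a length value of 2 under this rule.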

1835	4.2.2.2. Semantics

1837	   This feedback message is used to notify the senders of any TMMBR
1838	   message that one or more TMMBR messages have been received or that an
1839	   owner has left the session.  It indicates to all participants the
1840	   current set of bounding tuples and the "owners" of those tuples.

1842	   Within the common packet header for feedback messages (as defined in
1843	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
1844	   indicates the source of the notification.  The "SSRC of media source"
1845	   is not used and SHALL be set to 0.

1847	   A TMMBN message SHALL be scheduled for transmission after the
1848	   reception of a TMMBR message with an FCI entry identifying this media
1849	   sender.  Only a single TMMBN SHALL be sent, even if more than one
1850	   TMMBR message is received between the scheduling of the transmission
1851	   and the actual transmission of the TMMBN message.  The TMMBN message
1852	   indicates the bounding tuples and their owners at the time of
1853	   transmitting the message.  The bounding tuples included SHALL be the
1854	   set arrived at through application of the applicable algorithm of
1855	   section 3.5.4.2 or an equivalent, applied to the previous bounding
1856	   set if any and tuples received in TMMBR messages since the last TMMBN
1857	   was transmitted.

1859	   The reception of a TMMBR message SHALL still result in the
1860	   transmission of a TMMBN message even if, after application of the
1861	   algorithm, the newly reported TMMBR tuple is not accepted into the
1862	   bounding set.  In such a case the bounding tuples and their owners
1863	   are not changed, unless the TMMBR was from an owner of a tuple within
1864	   the previously calculated bounding set.  This procedure allows
1865	   session participants that did not see the last TMMBN message to get a
1866	   correct view of this media sender's state.

1868	   As indicated in section 4.2.1.2, when a media sender determines that
1869	   an "owner" of a bounding tuple has left the session, then that tuple
1870	   is removed from the bounding set, and the media sender SHALL send a
1871	   TMMBN message indicating the remaining bounding tuples.  If there are
1872	   no remaining bounding tuples a TMMBN without any FCI SHALL be sent to
1873	   indicate this.

1875	     Note: if any media receivers remain in the session, this last will
1876	     be a temporary situation.  The empty TMMBN will cause every
1877	     remaining media receiver to determine that its limitation belongs
1878	     in the bounding set and send a TMMBR in consequence.

1880	   In unicast scenarios (i.e. where a single sender talks to a single
1881	   receiver), the aforementioned algorithm to determine ownership
1882	   degenerates to the media receiver becoming the "owner" of the one
1883	   bounding tuple as soon as the media receiver has issued the first
1884	   TMMBR message.

1886	4.2.2.3. Timing Rules

1888	   The TMMBN acknowledgement SHOULD be sent as soon as allowed by the
1889	   applied timing rules for the session.  Immediate or early feedback
1890	   mode SHOULD be used for these messages.

1892	4.2.2.4. Handling by Translators and Mixers

1894	   As discussed in Section 4.2.1.4, mixers or translators may need to
1895	   issue TMMBN messages as responses to TMMBR messages for SSRCs
1896	   handled by them.

1898	4.3. Payload Specific Feedback Messages

1900	   As specified by section 6.1 of RFC 4585 [RFC4585], Payload-Specific
1901	   FB messages are identified by the RTCP packet type value PT=PSFB
1902	   (206).

1904	   AVPF [RFC4585] defines three payload-specific feedback messages and
1905	   one application layer feedback message.  This memo specifies four
1906	   additional payload-specific feedback messages.  All are identified by
1907	   means of the FMT parameter as follows:

1909	   Assigned in [RFC4585]:

1911	     1:     Picture Loss Indication (PLI)
1912	     2:     Slice Loss Indication (SLI)
1913	     3:     Reference Picture Selection Indication (RPSI)
1914	     15:    Application layer FB message
1915	     31:    reserved for future expansion of the number space

1917	   Assigned in this memo:

1919	     4:     Full Intra Request Command (FIR)
1920	     5:     Temporal-Spatial Trade-off Request (TSTR)
1921	     6:     Temporal-Spatial Trade-off Notification (TSTN)
1922	     7:     Video Back Channel Message (VBCM)

1924	   Unassigned:

1926	     0:     unassigned
1927	     8-14:  unassigned
1928	     16-30: unassigned

1930	   The following subsections define the new FCI formats for the payload-
1931	   specific feedback messages.

1933	4.3.1. Full Intra Request (FIR)

1935	   The FIR message is identified by RTCP packet type value PT=PSFB and
1936	   FMT=4.

1938	   The FCI field MUST contain one or more FIR entries.  Each entry
1939	   applies to a different media sender, identified by its SSRC.

1941	4.3.1.1. Message Format

1943	   The Feedback Control Information (FCI) for the Full Intra Request
1944	   consists of one or more FCI entries, the content of which is depicted
1945	   in Figure 4.  The length of the FIR feedback message MUST be set to
1946	   2+2*N, where N is the number of FCI entries.

1948	    0                   1                   2                   3
1949	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1950	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1951	   |                              SSRC                             |
1952	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1953	   | Seq. nr       |    Reserved                                   |
1954	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1956	    Figure 4 - Syntax of an FCI entry in the FIR message

1958	     SSRC (32 bits): The SSRC value of the media sender which is
1959	              requested to send a decoder refresh point.

1961	     Seq. nr (8 bits): Command sequence number.  The sequence number
1962	              space is unique for each pairing of the SSRC of command
1963	              source and the SSRC of the command target.  The sequence
1964	              number SHALL be increased by 1 modulo 256 for each new
1965	              command.  A repetition SHALL NOT increase the sequence
1966	              number.  The initial value is arbitrary.

1968	     Reserved (24 bits): All bits SHALL be set to 0 by the sender and
1969	              SHALL be ignored on reception.

1971	   The semantics of this feedback message is independent of the RTP
1972	   payload type.
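
A possible encoding of the FIR FCI entry, together with the sequence number
rule (increase by 1 modulo 256 for a new command, no increase for a
repetition), is sketched below.  The class and function names are hypothetical,
and one FirSequencer instance is assumed to represent a single command source
SSRC.

```python
import struct


def pack_fir_entry(target_ssrc, seq_nr):
    """Encode SSRC (32 bits), Seq. nr (8 bits) and 24 zeroed reserved bits."""
    return struct.pack("!IB3x", target_ssrc, seq_nr & 0xFF)


class FirSequencer:
    """Tracks one sequence number per command target SSRC."""

    def __init__(self, initial_seq=0):
        self._initial = initial_seq & 0xFF  # the initial value is arbitrary
        self._seq = {}

    def new_command(self, target_ssrc):
        if target_ssrc in self._seq:
            self._seq[target_ssrc] = (self._seq[target_ssrc] + 1) % 256
        else:
            self._seq[target_ssrc] = self._initial
        return pack_fir_entry(target_ssrc, self._seq[target_ssrc])

    def repetition(self, target_ssrc):
        # A repetition reuses the sequence number of the last command.
        return pack_fir_entry(target_ssrc, self._seq[target_ssrc])


sequencer = FirSequencer(initial_seq=255)
first = sequencer.new_command(0xAABBCCDD)
repeat = sequencer.repetition(0xAABBCCDD)
second = sequencer.new_command(0xAABBCCDD)
```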

1974	4.3.1.2. Semantics

1976	   Upon reception of FIR, the encoder MUST send a decoder refresh point
1977	   (see section 2.2) as soon as possible.

1979	     Note: Currently, video appears to be the only useful application
1980	     for FIR, as it appears to be the only RTP payload widely deployed
1981	     that relies heavily on media prediction across RTP packet
1982	     boundaries.  However, use of FIR could also reasonably be
1983	     envisioned for other media types that share essential properties
1984	     with compressed video, namely cross-frame prediction (whatever a
1985	     frame may be for that media type).  One possible example may be the
1986	     dynamic updates of MPEG-4 scene descriptions.  It is suggested that
1987	     payload formats for such media types refer to FIR and other message
1988	     types defined in this specification and in AVPF [RFC4585], instead
1989	     of creating similar mechanisms in the payload specifications.  The
1990	     payload specifications may have to explain how the payload-specific
1991	     terminologies map to the video-centric terminology used herein.

1993	     Note: In environments where the sender has no control over the
1994	     codec (e.g. when streaming pre-recorded and pre-coded content), the
1995	     reaction to this command cannot be specified.  One suitable
1996	     reaction of a sender would be to skip forward in the video bit
1997	     stream to the next decoder refresh point.  In other scenarios, it
1998	     may be preferable not to react to the command at all, e.g. when
1999	     streaming to a large multicast group.  Other reactions may also be
2000	     possible.  When deciding on a strategy, a sender could take into
2001	     account factors such as the size of the receiving group, the
2002	     "importance" of the sender of the FIR message (however "importance"
2003	     may be defined in this specific application), the frequency of
2004	     decoder refresh points in the content, and so on.  However a
2005	     session which predominately handles pre-coded content is not
2006	     expected to use FIR at all.

2008	   The sender MUST consider congestion control as outlined in section 5,
2009	   which MAY restrict its ability to send a decoder refresh point
2010	   quickly.

2012	     Note: The relationship between the Picture Loss Indication and FIR
2013	     is as follows.  As discussed in section 6.3.1 of AVPF [RFC4585], a
2014	     Picture Loss Indication informs the encoder about the loss of a
2015	     picture and hence the likelihood of misalignment of the reference
2016	     pictures between the encoder and decoder.  Such a scenario is
2017	     normally related to losses in an ongoing connection.  In point-to-
2018	     point scenarios, and without the presence of advanced error
2019	     resilience tools, one possible option for an encoder consists in
2020	     sending a decoder refresh point.  However, there are other options.
2021	     One example is that the media sender ignores the PLI, because the
2022	     embedded stream redundancy is likely to clean up the reproduced
2023	     picture within a reasonable amount of time.  The FIR, in contrast,
2024	     leaves a (real-time) encoder no choice but to send a decoder
2025	     refresh point.  It does not allow the encoder to take into account
2026	     any considerations such as the ones mentioned above.

2028	     Note: Mandating a maximum delay for completing the sending of a
2029	     decoder refresh point would be desirable from an application
2030	     viewpoint, but is problematic from a congestion control point of
2031	     view.  "As soon as possible" as mentioned above appears to be a
2032	     reasonable compromise.

2034	   FIR SHALL NOT be sent as a reaction to picture losses -- it is
2035	   RECOMMENDED to use PLI instead.  FIR SHOULD be used only in
2036	   situations where not sending a decoder refresh point would render the
2037	   video unusable for the users.

2039	     Note: A typical example where sending FIR is appropriate is when,
2040	     in a multipoint conference, a new user joins the session and no
2041	     regular decoder refresh point interval is established.  Another
2042	     example would be a video switching MCU that changes streams.  Here,
2043	     normally, the MCU issues a FIR to the new sender to force it to
2044	     emit a decoder refresh point.  The decoder refresh point normally
2045	     includes a Freeze Picture Release (defined outside this
2046	     specification), which re-starts the rendering process of the
2047	     receivers.  Both techniques mentioned are commonly used in MCU-
2048	     based multipoint conferences.

2050	   Other RTP payload specifications such as RFC 4587 [RFC4587] already
2051	   define a feedback mechanism for certain codecs.  An application
2052	   supporting both schemes MUST use the feedback mechanism defined in
2053	   this specification when sending feedback.  For backward compatibility
2054	   reasons, such an application SHOULD also be able to receive and
2055	   react to the feedback scheme defined in the respective RTP payload
2056	   format, if this is required by that payload format.

2058	   Within the common packet header for feedback messages (as defined in
2059	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2060	   indicates the source of the request, and the "SSRC of media source"
2061	   is not used and SHALL be set to 0.  The SSRCs of the media senders to
2062	   which the FIR command applies are in the corresponding FCI entries.
2063	   A FIR message MAY contain requests to multiple media senders, using
2064	   one FCI entry per target media sender.

2066	4.3.1.3. Timing Rules

2068	   The timing follows the rules outlined in section 3 of [RFC4585].  FIR
2069	   commands MAY be used with early or immediate feedback.  The FIR
2070	   feedback message MAY be repeated.  If using immediate feedback mode
2071	   the repetition SHOULD wait at least one RTT before being sent.  In
2072	   early or regular RTCP mode the repetition is sent in the next regular
2073	   RTCP packet.

2075	4.3.1.4. Handling of FIR Message in Mixer and Translators

2077	   A media translator or a mixer performing media encoding of the
2078	   content for which the session participant has issued a FIR is
2079	   responsible for acting upon it.  A mixer acting upon a FIR SHOULD NOT
2080	   forward the message unaltered; instead it SHOULD issue a FIR itself.

2082	4.3.1.5. Remarks

2084	   In conjunction with video codecs, FIR messages typically trigger the
2085	   sending of full intra or IDR pictures.  Both are several times larger
2086	   than predicted (inter) pictures.  Their size is independent of the
2087	   time they are generated.  In most environments, especially when
2088	   employing bandwidth-limited links, the use of an intra picture
2089	   implies an allowed delay that is a significant multiple of the
2090	   typical frame duration.  An example: if the sending frame rate is 10
2091	   fps, and an intra picture is assumed to be 10 times as big as an
2092	   inter picture, then a full second of latency has to be accepted.  In
2093	   such an environment there is no need for a particularly short delay
2094	   in sending the FIR message.  Hence waiting for the next possible time
2095	   slot allowed by RTCP timing rules as per [RFC4585] should not have an
2096	   overly negative impact on the system performance.
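
The arithmetic behind the example can be made explicit.  The sketch below
assumes, as the example implicitly does, that the link is dimensioned so that
one inter picture is transmitted in exactly one frame interval; the function
name is hypothetical.

```python
def intra_transmission_time_s(frame_rate_fps, intra_to_inter_ratio):
    """Time to send one intra picture when an inter picture takes one
    frame interval (assumption of this sketch)."""
    return intra_to_inter_ratio / float(frame_rate_fps)


latency_s = intra_transmission_time_s(10, 10)  # the example from the text
```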

2098	4.3.2. Temporal-Spatial Trade-off Request (TSTR)

2100	   The TSTR feedback message is identified by RTCP packet type value
2101	   PT=PSFB and FMT=5.

2103	   The FCI field MUST contain one or more TSTR FCI entries.

2105	4.3.2.1. Message Format

2107	   The content of the FCI entry for the Temporal-Spatial Trade-off
2108	   Request is depicted in Figure 5.  The length of the feedback message
2109	   MUST be set to 2+2*N, where N is the number of FCI entries included.

2111	    0                   1                   2                   3
2112	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2113	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2114	   |                              SSRC                             |
2115	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2116	   |  Seq nr.      |  Reserved                           | Index   |
2117	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2119	    Figure 5 - Syntax of an FCI Entry in the TSTR Message

2121	     SSRC (32 bits): The SSRC of the media sender which is requested to
2122	              apply the tradeoff value given in Index.

2124	     Seq. nr (8 bits): Request sequence number.  The sequence number
2125	              space is unique for each pairing of the SSRC of the
2126	              request source and the SSRC of the request target.  The
2127	              sequence number SHALL be increased by 1 modulo 256 for
2128	              each new command.  A repetition SHALL NOT increase the
2129	              sequence number.  The initial value is arbitrary.

2131	     Reserved (19 bits): All bits SHALL be set to 0 by the sender and
2132	              SHALL be ignored on reception.

2134	     Index (5 bits): An integer value between 0 and 31 that indicates
2135	              the relative trade off that is requested.  An index value
2136	              of 0 indicates highest possible spatial quality, while 31
2137	              indicates highest possible temporal resolution.
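
The TSTR entry layout described above (SSRC, 8-bit sequence number, 19
reserved bits, 5-bit Index) could be handled as in the following sketch.  The
function names are hypothetical; reserved bits are written as zero and ignored
when parsing, as the field description requires.

```python
import struct


def pack_tstr_entry(target_ssrc, seq_nr, index):
    """Encode one TSTR FCI entry as 8 bytes in network byte order."""
    if not (0 <= seq_nr < 256 and 0 <= index < 32):
        raise ValueError("field value out of range")
    word = (seq_nr << 24) | index  # the 19 reserved bits stay zero
    return struct.pack("!II", target_ssrc, word)


def unpack_tstr_entry(data):
    """Decode 8 bytes into (ssrc, seq_nr, index), ignoring reserved bits."""
    ssrc, word = struct.unpack("!II", data)
    return ssrc, word >> 24, word & 0x1F


request = pack_tstr_entry(0x01020304, 7, 31)
```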

2139	4.3.2.2. Semantics

2141	   A decoder can suggest a temporal-spatial trade-off level by sending a
2142	   TSTR message to an encoder.  If the encoder is capable of adjusting
2143	   its temporal-spatial trade-off, it SHOULD take into account the
2144	   received TSTR message for future coding of pictures.  A value of 0
2145	   suggests a high spatial quality and a value of 31 suggests a high
2146	   frame rate.  The progression of values from 0 to 31 indicates
2147	   monotonically a desire for higher frame rate.  The index values do
2148	   not correspond to precise values of spatial quality or frame rate.

2150	   The reaction to the reception of more than one TSTR message by a
2151	   media sender from different media receivers is left open to the
2152	   implementation.  The selected trade-off SHALL be communicated to the
2153	   media receivers by the means of the TSTN message.

2155	   Within the common packet header for feedback messages (as defined in
2156	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2157	   indicates the source of the request, and the "SSRC of media source"
2158	   is not used and SHALL be set to 0.  The SSRCs of the media senders to
2159	   which the TSTR applies are in the corresponding FCI entries.

2161	   A TSTR message MAY contain requests to multiple media senders, using
2162	   one FCI entry per target media sender.

2164	4.3.2.3. Timing Rules

2166	   The timing follows the rules outlined in section 3 of [RFC4585].
2167	   This request message is not time critical and SHOULD be sent using
2168	   regular RTCP timing.  The message MAY be sent with early or
2169	   immediate feedback timing only if it is known that the user
2170	   interface requires quick feedback.

2172	4.3.2.4. Handling of message in Mixers and Translators

2174	   A mixer or media translator that encodes content sent to the session
2175	   participant issuing the TSTR SHALL consider the request to determine
2176	   if it can fulfill it by changing its own encoding parameters.  A
2177	   media translator unable to fulfill the request MAY forward the
2178	   request unaltered towards the media sender.  A mixer encoding for
2179	   multiple session participants will need to consider the joint needs
2180	   of these participants before generating a TSTR on its own behalf
2181	   towards the media sender.  See also the discussion in Section 3.5.2.

2183	4.3.2.5. Remarks

2185	   The term "spatial quality" does not necessarily refer to the
2186	   resolution, measured by the number of pixels the reconstructed video
2187	   is using.  In fact, in most scenarios the video resolution stays
2188	   constant during the lifetime of a session.  However, all video
2189	   compression standards have means to adjust the spatial quality at a
2190	   given resolution, often influenced by the Quantizer Parameter or QP.
2191	   A numerically low QP results in a good reconstructed picture quality,
2192	   whereas a numerically high QP yields a coarse picture.  The typical
2193	   reaction of an encoder to this request is to change its rate control
2194	   parameters to use a lower frame rate and a numerically lower (on
2195	   average) QP, or vice versa.  The precise mapping of Index value to
2196	   frame rate and QP is intentionally left open here, as it depends on
2197	   factors such as the compression standard employed, spatial
2198	   resolution, content, bit rate, and so on.

2200	4.3.3. Temporal-Spatial Trade-off Notification (TSTN)

2202	   The TSTN message is identified by RTCP packet type value PT=PSFB and
2203	   FMT=6.

2205	   The FCI field SHALL contain one or more TSTN FCI entries.

2207	4.3.3.1. Message Format

2209	   The content of an FCI entry for the Temporal-Spatial Trade-off
2210	   Notification is depicted in Figure 6.  The length of the TSTN message
2211	   MUST be set to 2+2*N, where N is the number of FCI entries.

2213	    0                   1                   2                   3
2214	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2215	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2216	   |                              SSRC                             |
2217	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2218	   |  Seq nr.      |  Reserved                           | Index   |
2219	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2221	    Figure 6 - Syntax of the TSTN

2223	     SSRC (32 bits): The SSRC of the source of the TSTR request which
2224	              resulted in this Notification.

2226	     Seq. nr (8 bits): The sequence number value from the TSTR request
2227	              that is being acknowledged.

2229	     Reserved (19 bits): All bits SHALL be set to 0 by the sender and
2230	              SHALL be ignored on reception.

2232	     Index (5 bits): The trade-off value the media sender is using
2233	              henceforth.

2235	      Informative note: The returned trade-off value (Index) may differ
2236	      from the requested one, for example in cases where a media encoder
2237	      cannot tune its trade-off, or when pre-recorded content is used.

2239	4.3.3.2. Semantics

2241	   This feedback message is used to acknowledge the reception of a TSTR.
2242	   One TSTN entry in a TSTN feedback message SHALL be sent for each TSTR
2243	   entry targeted to this session participant, i.e., each TSTR entry
2244	   received whose SSRC field contains the receiving entity's SSRC.
2245	   A single TSTN message MAY acknowledge multiple requests using
2246	   multiple FCI entries.  The index value included SHALL be the same in
2247	   all FCI entries of the TSTN message.  Including an FCI entry for each
2248	   requestor allows each requesting entity to determine that the media
2249	   sender received the request.  The Notification SHALL also be sent in
2250	   response to TSTR repetitions received.  If the request receiver has
2251	   received TSTR with several different sequence numbers from a single
2252	   requestor it SHALL only respond to the request with the highest
2253	   (modulo 256) sequence number.
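
      Informative note: The text does not spell out how "highest
      (modulo 256)" is to be evaluated.  The following non-normative
      Python sketch ASSUMES serial-number style comparison with a
      half-range window of 128, which is one plausible reading and not
      a requirement of this memo.  The names seq_newer and
      latest_requests are hypothetical.

```python
def seq_newer(a: int, b: int) -> bool:
    """True if 8-bit sequence number a is ahead of b, modulo 256.

    Assumption: a is "higher" when it is 1..127 steps ahead of b.
    """
    return 0 < ((a - b) & 0xFF) < 0x80


def latest_requests(requests):
    """Keep, per requestor SSRC, only the request with the highest
    (modulo 256) sequence number.

    requests: iterable of (requestor_ssrc, seq_nr, index) tuples.
    Returns a dict mapping requestor_ssrc -> (seq_nr, index).
    """
    latest = {}
    for ssrc, seq_nr, index in requests:
        current = latest.get(ssrc)
        if current is None or seq_newer(seq_nr, current[0]):
            latest[ssrc] = (seq_nr, index)
    return latest
```

      A repetition carries the same sequence number as the original
      request, so it does not displace the stored entry but would still
      be acknowledged by the TSTN.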

2255	   The TSTN SHALL include the Temporal-Spatial Trade-off index that will
2256	   be used as a result of the request.  This is not necessarily the same
2257	   index as requested, as the media sender may need to aggregate
2258	   requests from several requesting session participants.  It may also
2259	   have some other policies or rules that limit the selection.

2261	   Within the common packet header for feedback messages (as defined in
2262	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2263	   indicates the source of the Notification, and the "SSRC of media
2264	   source" is not used and SHALL be set to 0.  The SSRCs of the
2265	   requesting entities to which the Notification applies are in the
2266	   corresponding FCI entries.
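
      Informative note: The header rules above can be sketched in
      Python as follows.  This is a non-normative illustration; the
      function name build_tstn_packet is hypothetical.  The packet type
      value 206 for PSFB and the header layout (version, padding bit,
      FMT, PT, length, two SSRC fields) are taken from [RFC4585], not
      from this memo.

```python
import struct

RTCP_VERSION = 2
PT_PSFB = 206   # payload-specific feedback packet type, per RFC 4585
FMT_TSTN = 6    # FMT value used for TSTN in this memo


def build_tstn_packet(sender_ssrc: int, fci_entries) -> bytes:
    """Build a TSTN feedback packet from already-encoded FCI entries.

    fci_entries: non-empty list of 8-octet TSTN FCI entries (bytes).
    """
    if not fci_entries:
        raise ValueError("a TSTN needs at least one FCI entry")
    if any(len(entry) != 8 for entry in fci_entries):
        raise ValueError("each TSTN FCI entry is exactly 8 octets")
    first_octet = (RTCP_VERSION << 6) | FMT_TSTN  # padding bit left at 0
    length = 2 + 2 * len(fci_entries)             # 32-bit words minus one
    # "SSRC of media source" is not used for TSTN and is set to 0.
    header = struct.pack("!BBHII", first_octet, PT_PSFB, length,
                         sender_ssrc, 0)
    return header + b"".join(fci_entries)
```

      The SSRCs of the requesting entities travel only inside the FCI
      entries, so one packet can acknowledge several requestors.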

2268	4.3.3.3. Timing Rules

2270	   The timing follows the rules outlined in section 3 of [RFC4585].
2271	   This acknowledgement message is not extremely time critical and
2272	   SHOULD be sent using regular RTCP timing.

2274	4.3.3.4. Handling of TSTN in Mixers and Translators

2276	   A mixer or translator that acts upon a TSTR SHALL also send the
2277	   corresponding TSTN.  In cases where it needs to forward a TSTR itself
2278	   the notification message MAY need to be delayed until the TSTR has
2279	   been responded to.

2281	4.3.3.5. Remarks

2282	   None

2284	4.3.4. H.271 Video Back Channel Message (VBCM)

2286	   The VBCM is identified by RTCP packet type value PT=PSFB and FMT=7.

2288	   The FCI field MUST contain one or more VBCM FCI entries.

2290	4.3.4.1. Message Format

2292	   The syntax of an FCI entry within the VBCM indication is depicted in
2293	   Figure 7.

2295	    0                   1                   2                   3
2296	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2297	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2298	   |                              SSRC                             |
2299	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2300	   | Seq. nr       |0| Payload Type| Length                        |
2301	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2302	   |                    VBCM Octet String....      |    Padding    |
2303	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2305	   Figure 7 - Syntax of an FCI Entry in the VBCM Message

2307	   SSRC (32 bits): The SSRC value of the media sender that is requested
2308	          to instruct its encoder to react to the VBCM message

2310	   Seq. nr (8 bits): Command sequence number.  The sequence number space
2311	          is unique for pairing of the SSRC of command source and the
2312	          SSRC of the command target.  The sequence number SHALL be
2313	          increased by 1 modulo 256 for each new command.  A repetition
2314	          SHALL NOT increase the sequence number.  The initial value is
2315	          arbitrary.

2317	   0: Must be set to 0 by the sender and should not be acted upon by the
2318	          message receiver.

2320	   Payload Type (7 bits): The RTP payload type for which the VBCM bit
2321	          stream must be interpreted.

2323	   Length (16 bits): The length of the VBCM octet string in octets
2324	          exclusive of any padding octets

2326	   VBCM Octet String (Variable length): This is the octet string
2327	          generated by the decoder carrying a specific feedback sub-
2328	          message.

2330	   Padding (Variable length): Bits set to 0 to make up a 32 bit
2331	          boundary.
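
      Informative note: The following non-normative Python sketch
      illustrates the FCI entry layout of Figure 7, including padding
      to a 32-bit boundary and the modulo 256 sequence number
      increment.  The function names (pack_vbcm_entry,
      unpack_vbcm_entry, next_seq) are hypothetical and not part of
      this specification.

```python
import struct


def pack_vbcm_entry(ssrc: int, seq_nr: int, payload_type: int,
                    octet_string: bytes) -> bytes:
    """Pack one VBCM FCI entry, zero-padded to a multiple of 4 octets."""
    if not 0 <= payload_type <= 0x7F:
        raise ValueError("RTP payload type must fit in 7 bits")
    if not 0 <= seq_nr <= 0xFF:
        raise ValueError("sequence number must fit in 8 bits")
    if len(octet_string) > 0xFFFF:
        raise ValueError("VBCM octet string too long for 16-bit Length")
    # Seq. nr (8 bits), zero bit + Payload Type (8 bits together),
    # Length (16 bits, excluding padding).
    fixed = struct.pack("!IBBH", ssrc, seq_nr, payload_type,
                        len(octet_string))
    padding = b"\x00" * (-len(octet_string) % 4)
    return fixed + octet_string + padding


def unpack_vbcm_entry(data: bytes):
    """Return (ssrc, seq_nr, payload_type, octet_string, consumed)."""
    ssrc, seq_nr, pt_octet, length = struct.unpack("!IBBH", data[:8])
    payload_type = pt_octet & 0x7F   # the leading bit is not acted upon
    octet_string = data[8:8 + length]
    consumed = 8 + length + (-length % 4)
    return ssrc, seq_nr, payload_type, octet_string, consumed


def next_seq(seq_nr: int) -> int:
    """Sequence number for a new command (not for a repetition)."""
    return (seq_nr + 1) % 256
```

      The "consumed" value lets a parser step to the next FCI entry
      when a VBCM message carries more than one.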

2333	4.3.4.2. Semantics

2335	   The "payload" of the VBCM indication carries different types of
2336	   codec-specific feedback information.  The type of feedback
2337	   information can be classified as a 'status report' (such as an
2338	   indication that a bit stream was received without errors, or that a
2339	   partial or complete picture or block was lost) or an 'update request'
2340	   (such as complete refresh of the bit stream).

2342	          Note: There are possible overlaps between the VBCM
2343	          sub-messages and CCM/AVPF feedback messages, such as FIR.
2344	          Please see section 3.5.3 for further discussion.

2346	   The different types of feedback sub-messages carried in the VBCM are
2347	   indicated by the "payloadType" as defined in [VBCM].  These sub-
2348	   message types are reproduced below for convenience.  "payloadType",
2349	   in ITU-T Rec. H.271 terminology, refers to the sub-type of the H.271
2350	   message and should not be confused with an RTP payload type.

2352	   Payload          Message Content
2353	   Type
2354	   ---------------------------------------------------------------------
2355	   0      One or more pictures without detected bit stream error
2356	          mismatch
2357	   1      One or more pictures that are entirely or partially lost
2358	   2      A set of blocks of one picture that is entirely or partially
2359	          lost
2360	   3      CRC for one parameter set
2361	   4      CRC for all parameter sets of a certain type
2362	   5      A "reset" request indicating that the sender should completely
2363	          refresh the video bit stream as if no prior bit stream data
2364	          had been received
2365	   > 5    Reserved for future use by ITU-T

2367	   Table 2: H.271 message types ("payloadTypes")

2369	   The bit string or the "payload" of a VBCM message is of variable
2370	   length and is self-contained and coded in a variable length, binary
2371	   format.  The media sender necessarily has to be able to parse this
2372	   optimized binary format to make use of VBCM messages.

2374	   Each of the different types of sub-messages (indicated by
2375	   payloadType) may have different semantics depending on the codec
2376	   used.

2378	   Within the common packet header for feedback messages (as defined in
2379	   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
2380	   indicates the source of the request, and the "SSRC of media source"
2381	   is not used and SHALL be set to 0.  The SSRCs of the media senders to
2382	   which the VBCM message applies are in the corresponding FCI
2383	   entries.  The sender of the VBCM message MAY send H.271 messages to
2384	   multiple media senders and MAY send more than one H.271 message to
2385	   the same media sender within the same VBCM message.

2387	4.3.4.3. Timing Rules

2389	   The timing follows the rules outlined in section 3 of [RFC4585].  The
2390	   different sub-message types may have different properties with regard
2391	   to the timing of messages that should be used.  If several different
2392	   types are included in the same feedback packet then the requirements
2393	   for the sub-message type with the most stringent requirements should
2394	   be followed.

2396	4.3.4.4. Handling of message in Mixer or Translator

2398	   The handling of VBCM in a mixer or translator is sub-message type
2399	   dependent.

2401	4.3.4.5. Remarks

2403	   Please see section 3.5.3 for a discussion of the usage of H.271
2404	   messages and messages defined in AVPF [RFC4585] and this memo with
2405	   similar functionality.

2407	     Note: There has been some discussion whether the payload type field
2408	     in this message is needed.  It will be needed if there is
2409	     potentially more than one VBCM-capable RTP payload type in the same
2410	     session, and the semantics of a given VBCM message changes between
2411	     payload types.  For example, the picture identification mechanism
2412	     in messages of H.271 type 0 is fundamentally different between
2413	     H.263 and H.264 (although both use the same syntax).  Therefore,
2414	     the payload type field is justified here.  There was a further
2415	     comment that for TSTR and FIR such a need does not exist, because
2416	     the semantics of TSTR and FIR are either loosely enough defined, or
2417	     generic enough, to apply to all video payloads currently in
2418	     existence/envisioned.

2420	5. Congestion Control

2422	   The correct application of the AVPF [RFC4585] timing rules prevents
2423	   the network from being flooded by feedback messages.  Hence, assuming
2424	   a correct implementation and configuration, the RTCP channel cannot
2425	   break its bit rate commitment and introduce congestion.

2427	   The reception of some of the feedback messages modifies the behaviour
2428	   of the media senders or, more specifically, the media encoders.  The
2429	   modified behaviour MUST respect the bandwidth limits that the
2430	   application of congestion control provides.  For example, when a
2431	   media sender is reacting to a FIR, the unusually high number of
2432	   packets that form the decoder refresh point have to be paced in
2433	   compliance with the congestion control algorithm, even if the user
2434	   experience suffers from a slowly transmitted decoder refresh point.

2436	   A change of the Temporary Maximum Media Stream Bit Rate value can
2437	   only mitigate congestion, but not cause congestion as long as
2438	   congestion control is also employed.  An increase of the value by a
2439	   request REQUIRES the media sender to use congestion control when
2440	   increasing its transmission rate to that value.  A reduction of the
2441	   value results in a reduced transmission bit rate thus reducing the
2442	   risk for congestion.

2444	6. Security Considerations

2446	   The defined messages have certain properties that have security
2447	   implications.  These must be addressed and taken into account by
2448	   users of this protocol.

2450	   The defined setup signaling mechanism is sensitive to modification
2451	   attacks that can result in session creation with sub-optimal
2452	   configuration, and, in the worst case, session rejection.  To prevent
2453	   this type of attack, authentication and integrity protection of the
2454	   setup signaling is required.

2456	   Spoofed or maliciously created feedback messages of the type defined
2457	   in this specification can have the following implications:

2459	        a. severely reduced media bit rate due to false TMMBR messages
2460	           that set the maximum to a very low value;

2462	        b. assignment of the ownership of a bounding tuple to the wrong
2463	           participant within a TMMBN message, potentially causing
2464	           unnecessary oscillation in the bounding set as the mistakenly
2465	           identified owner reports a change in its tuple and the true
2466	           owner possibly holds back on changes until a correct TMMBN
2467	           message reaches the participants;

2469	        c. TSTR requests that result in a video quality different
2470	           from the user's desire, rendering the session less
2471	           useful;

2473	        d. frequent FIR commands that potentially reduce the frame
2474	           rate, making the video jerky, due to the frequent usage of
2475	           decoder refresh points.

2477	   To prevent these attacks there is a need to apply authentication and
2478	   integrity protection of the feedback messages.  This can be
2479	   accomplished against threats external to the current RTP session
2480	   using the RTP profile that combines SRTP [SRTP] and AVPF into SAVPF
2481	   [SAVPF].  In the mixer cases, separate security contexts and
2482	   filtering can be applied between the mixer and the participants thus
2483	   protecting other users on the mixer from a misbehaving participant.

2485	7. SDP Definitions

2487	   Section 4 of [RFC4585] defines a new SDP [RFC4566] attribute, rtcp-
2488	   fb, that may be used to negotiate the capability to handle specific
2489	   AVPF commands and indications, such as Reference Picture Selection,
2490	   Picture Loss Indication etc.  The ABNF for rtcp-fb is described in
2491	   section 4.2 of [RFC4585].  In this section we extend the rtcp-fb
2492	   attribute to include the commands and indications that are described
2493	   for codec control protocol in the present document.  We also discuss
2494	   the Offer/Answer implications for the codec control commands and
2495	   indications.

2497	7.1. Extension of the rtcp-fb Attribute

2499	   As described in AVPF [RFC4585], the rtcp-fb attribute indicates the
2500	   capability of using RTCP feedback.  AVPF specifies that the rtcp-fb
2501	   attribute must only be used as a media level attribute and must not
2502	   be provided at session level.  All the rules described in [RFC4585]
2503	   for rtcp-fb attribute relating to payload type and to multiple rtcp-
2504	   fb attributes in a session description also apply to the new feedback
2505	   messages defined in this memo.

2507	   The ABNF [RFC4234] for rtcp-fb as defined in [RFC4585] is:
2508	     "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF

2510	   where rtcp-fb-pt is the payload type and rtcp-fb-val defines the type
2511	   of the feedback message such as ack, nack, trr-int and rtcp-fb-id.
2512	   For example, to indicate support for picture loss indication
2513	   feedback, the sender declares the following in SDP:

2515	         v=0
2516	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2517	         s=Media with feedback
2518	         t=0 0
2519	         c=IN IP4 host.example.com
2520	         m=video 49170 RTP/AVPF 98
2521	         a=rtpmap:98 H263-1998/90000
2522	         a=rtcp-fb:98 nack pli

2524	   In this document we define a new feedback value "ccm" which indicates
2525	   the support of codec control using RTCP feedback messages.  The "ccm"
2526	   feedback value SHOULD be used with parameters, which indicate the
2527	   specific codec control commands supported.  In this draft we define
2528	   four parameters, which can be used with the ccm feedback value type.

2530	      o  "fir" indicates the support of the Full Intra Request (FIR).
2531	      o  "tmmbr" indicates the support of the Temporary Maximum Media
2532	         Stream Bit Rate Request/Notification (TMMBR/TMMBN).  It has an
2533	         optional sub parameter to indicate the session maximum packet
2534	         rate to be used.  If not included this defaults to infinity.
2535	      o  "tstr" indicates the support of the Temporal-Spatial Trade-off
2536	         Request/Notification (TSTR/TSTN).
2537	      o  "vbcm" indicates the support of H.271 video back channel
2538	         messages (VBCM).  It has zero or more subparameters identifying
2539	         the supported H.271 "payloadType" values.

2541	   In the ABNF for rtcp-fb-val defined in [RFC4585], there is a
2542	   placeholder called rtcp-fb-id to define new feedback types.  "ccm" is
2543	   defined as a new feedback type in this document and the ABNF for the
2544	   parameters for ccm are defined here (please refer to section 4.2 of
2545	   [RFC4585] for complete ABNF syntax).

2547	   rtcp-fb-param = SP "app" [SP byte-string]
2548	                 / SP rtcp-fb-ccm-param
2549	                 /     ; empty

2551	   rtcp-fb-ccm-param = "ccm" SP ccm-param

2553	   ccm-param  = "fir"   ; Full Intra Request
2554	              / "tmmbr" [SP "smaxpr=" MaxPacketRateValue]
2555	                        ; Temporary max media bit rate
2556	              / "tstr"  ; Temporal Spatial Trade Off
2557	              / "vbcm" *(SP subMessageType) ; H.271 VBCM messages
2558	              / token [SP byte-string]
2559	                         ; for future commands/indications
2560	   subMessageType = 1*8DIGIT
2561	   byte-string = <as defined in section 4.2 of [RFC4585] >
2562	   MaxPacketRateValue = 1*15DIGIT
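
      Informative note: As a non-normative illustration of the grammar
      above, the following Python sketch parses a single "a=rtcp-fb"
      line that carries a ccm parameter.  The function name
      parse_ccm_attribute and the returned dictionary keys are
      hypothetical.  As a simplification, the payload type is assumed
      to be either "*" or a decimal number, and the byte-string of
      future commands is returned unparsed.

```python
import re

_CCM_LINE = re.compile(r"^a=rtcp-fb:(\*|\d+) ccm (\S+)(?: (.*))?$")


def parse_ccm_attribute(line: str):
    """Parse one rtcp-fb line; return None if it is not a ccm line."""
    m = _CCM_LINE.match(line.strip())
    if m is None:
        return None
    pt, param, rest = m.group(1), m.group(2), m.group(3)
    result = {"pt": pt, "param": param}
    if param in ("fir", "tstr"):
        if rest is not None:
            raise ValueError(param + " takes no sub-parameters")
    elif param == "tmmbr":
        if rest is not None:
            sm = re.fullmatch(r"smaxpr=(\d{1,15})", rest)
            if sm is None:
                raise ValueError("malformed smaxpr sub-parameter")
            result["smaxpr"] = int(sm.group(1))
    elif param == "vbcm":
        subtypes = rest.split(" ") if rest else []
        if not all(re.fullmatch(r"\d{1,8}", s) for s in subtypes):
            raise ValueError("malformed vbcm subMessageType")
        result["subtypes"] = [int(s) for s in subtypes]
    else:
        # token [SP byte-string], reserved for future commands
        result["extra"] = rest
    return result
```

      The digit limits in the regular expressions mirror the 1*15DIGIT
      and 1*8DIGIT rules of the ABNF.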

2564	7.2. Offer-Answer

2566	   The Offer/Answer [RFC3264] implications for codec control protocol
2567	   feedback messages are similar to those described in [RFC4585].  The
2568	   offerer MAY indicate the capability to support selected codec
2569	   commands and indications.  The answerer MUST remove all ccm
2570	   parameters which it does not understand or does not wish to use in
2571	   this particular media session.  The answerer MUST NOT add new ccm
2572	   parameters in addition to what has been offered.  The answer is
2573	   binding for the media session and both offerer and answerer MUST only
2574	   use feedback messages negotiated in this way.

2576	   The session maximum packet rate parameter part of the TMMBR
2577	   indication is declarative and everyone shall use the highest value
2578	   indicated in a response.  If the session maximum packet rate
2579	   parameter is not present in an offer it SHALL NOT be included by the
2580	   answerer.
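
      Informative note: The answerer rules above can be sketched in
      Python as follows.  This is a non-normative illustration under
      stated assumptions: the function name answer_ccm and the
      dictionary representation are hypothetical; the answerer is
      ASSUMED to declare its own smaxpr value when the offer carried
      one; and a "vbcm" parameter whose offered sub-message types are
      all unsupported is ASSUMED to be dropped entirely.

```python
def answer_ccm(offer: dict, supported: dict) -> dict:
    """Derive the ccm parameters of an answer from those of an offer.

    offer / supported: map a ccm parameter name ("fir", "tmmbr",
    "tstr", "vbcm") to a dict of options, e.g. {"smaxpr": 120} for
    tmmbr or {"subtypes": [1, 2]} for vbcm.
    """
    answer = {}
    for name, opts in offer.items():
        if name not in supported:
            continue  # remove parameters not understood or not wanted
        mine = supported[name]
        out = {}
        # smaxpr is only included if it was present in the offer.
        if name == "tmmbr" and "smaxpr" in opts and "smaxpr" in mine:
            out["smaxpr"] = mine["smaxpr"]
        if name == "vbcm" and "subtypes" in opts:
            kept = [s for s in opts["subtypes"]
                    if s in mine.get("subtypes", [])]
            if not kept:
                continue  # assumption: nothing in common, drop vbcm
            out["subtypes"] = kept
        answer[name] = out
    # Only offered names are iterated, so nothing new is ever added.
    return answer
```

      Because the loop runs over the offered parameters only, the
      answer can never contain a ccm parameter that was not offered.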

2582	7.3. Examples

2584	   Example 1: The following SDP describes a point-to-point video call
2585	   with H.263, with the originator of the call declaring its capability
2586	   to support the FIR and TSTR/TSTN codec control messages.  The SDP is
2587	   carried in a high level signaling protocol like SIP.

2589	         v=0
2590	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2591	         s=Point-to-Point call
2592	         c=IN IP4 192.0.2.124
2593	         m=audio 49170 RTP/AVP 0
2594	         a=rtpmap:0 PCMU/8000
2595	         m=video 51372 RTP/AVPF 98
2596	         a=rtpmap:98 H263-1998/90000
2597	         a=rtcp-fb:98 ccm tstr
2598	         a=rtcp-fb:98 ccm fir

2600	   In the above example, when the sender receives a TSTR message from
2601	   the remote party it is capable of adjusting the trade off as
2602	   indicated in the RTCP TSTN feedback message.

2604	   Example 2: The following SDP describes a SIP end point joining a
2605	   video mixer that is hosting a multiparty video conferencing session.
2606	   The participant supports only the FIR (Full Intra Request) codec
2607	   control command and it declares it in its session description.

2609	         v=0
2610	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2611	         s=Multiparty Video Call
2612	         c=IN IP4 192.0.2.124
2613	         m=audio 49170 RTP/AVP 0
2614	         a=rtpmap:0 PCMU/8000
2615	         m=video 51372 RTP/AVPF 98
2616	         a=rtpmap:98 H263-1998/90000
2617	         a=rtcp-fb:98 ccm fir

2619	   When the video MCU decides to route the video of this participant, it
2620	   sends an RTCP FIR feedback message.  Upon receiving this message, the
2621	   end point is required to generate a decoder refresh point.

2623	   Example 3: The following example describes the Offer/Answer
2624	   implications for the codec control messages.  The Offerer wishes to
2625	   support "tstr", "fir" and "tmmbr".  The offered SDP is

2627	   -------------> Offer
2628	         v=0
2629	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2630	         s=Offer/Answer
2631	         c=IN IP4 192.0.2.124
2632	         m=audio 49170 RTP/AVP 0
2633	         a=rtpmap:0 PCMU/8000
2634	         m=video 51372 RTP/AVPF 98
2635	         a=rtpmap:98 H263-1998/90000
2636	         a=rtcp-fb:98 ccm tstr
2637	         a=rtcp-fb:98 ccm fir
2638	         a=rtcp-fb:* ccm tmmbr smaxpr=120

2640	   The answerer wishes to support only the FIR and TSTR/TSTN messages
2641	   and the answerer SDP is

2643	   <---------------- Answer
2644	         v=0
2645	         o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2646	         s=Offer/Answer
2647	         c=IN IP4 192.0.2.37
2648	         m=audio 47190 RTP/AVP 0
2649	         a=rtpmap:0 PCMU/8000
2650	         m=video 53273 RTP/AVPF 98
2651	         a=rtpmap:98 H263-1998/90000
2652	         a=rtcp-fb:98 ccm tstr
2653	         a=rtcp-fb:98 ccm fir

2655	   Example 4: The following example describes the Offer/Answer
2656	   implications for H.271 Video back channel messages (VBCM).  The
2657	   Offerer wishes to support VBCM and the sub-messages of payloadType 1
2658	   (one or more pictures that are entirely or partially lost) and 2 (a
2659	   set of blocks of one picture that are entirely or partially lost).

2661	   -------------> Offer
2662	         v=0
2663	         o=alice 3203093520 3203093520 IN IP4 host.example.com
2664	         s=Offer/Answer
2665	         c=IN IP4 192.0.2.124
2666	         m=audio 49170 RTP/AVP 0
2667	         a=rtpmap:0 PCMU/8000
2668	         m=video 51372 RTP/AVPF 98
2669	         a=rtpmap:98 H263-1998/90000
2670	         a=rtcp-fb:98 ccm vbcm 1 2

2672	   The answerer wishes to support sub-messages of type 1 only:

2674	   <---------------- Answer

2676	         v=0
2677	         o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2678	         s=Offer/Answer
2679	         c=IN IP4 192.0.2.37
2680	         m=audio 47190 RTP/AVP 0
2681	         a=rtpmap:0 PCMU/8000
2682	         m=video 53273 RTP/AVPF 98
2683	         a=rtpmap:98 H263-1998/90000
2684	         a=rtcp-fb:98 ccm vbcm 1

2686	   So in the above example only VBCM indications comprised of
2687	   "payloadType" 1 will be supported.

2689	8. IANA Considerations

2691	   The new value "ccm" needs to be registered with IANA in the "rtcp-fb"
2692	   Attribute Values registry located at the time of publication at:
2693	   http://www.iana.org/assignments/sdp-parameters

2695	   Value name:       ccm
2696	   Long Name:        Codec Control Commands and Indications
2697	   Reference:        RFC XXXX

2699	   A new registry "Codec Control Messages" needs to be created to hold
2700	   "ccm" parameters located at time of publication at:
2701	   http://www.iana.org/assignments/sdp-parameters

2703	   New registrations in this registry follow the "Specification
2704	   required" policy as defined by [RFC2434].  In addition, they are
2705	   required to indicate which additional RTCP feedback types, such as
2706	   "nack" or "ack", if any, the parameter is usable with.

2708	   The initial content of the registry is the following values:

2710	   Value name:       fir
2711	   Long name:        Full Intra Request Command
2712	   Usable with:      ccm
2713	   Reference:        RFC XXXX

2715	   Value name:       tmmbr
2716	   Long name:        Temporary Maximum Media Stream Bit Rate
2717	   Usable with:      ccm
2718	   Reference:        RFC XXXX

2720	   Value name:       tstr
2721	   Long name:        Temporal-Spatial Trade-off
2722	   Usable with:      ccm
2723	   Reference:        RFC XXXX

2725	   Value name:       vbcm
2726	   Long name:        H.271 video back channel messages
2727	   Usable with:      ccm
2728	   Reference:        RFC XXXX

2730	   The following values need to be registered as FMT values in the "FMT
2731	   Values for RTPFB Payload Types" registry located at the time of
2732	   publication at: http://www.iana.org/assignments/rtp-parameters

2733	   RTPFB range
2734	   Name           Long Name                         Value  Reference
2735	   -------------- --------------------------------- -----  ---------
2736	                  Reserved                             2   [RFCxxxx]
2737	   TMMBR          Temporary Maximum Media Stream Bit   3   [RFCxxxx]
2738	                  Rate Request
2739	   TMMBN          Temporary Maximum Media Stream Bit   4   [RFCxxxx]
2740	                  Rate Notification

2742	   The following values need to be registered as FMT values in the "FMT
2743	   Values for PSFB Payload Types" registry located at the time of
2744	   publication at: http://www.iana.org/assignments/rtp-parameters

2746	   PSFB range
2747	   Name           Long Name                             Value  Reference
2748	   -------------- ---------------------------------     -----  ---------
2749	   FIR            Full Intra Request Command              4    [RFCxxxx]
2750	   TSTR           Temporal-Spatial Trade-off Request      5    [RFCxxxx]
2751	   TSTN           Temporal-Spatial Trade-off Notification 6    [RFCxxxx]
2752	   VBCM           Video Back Channel Message              7    [RFCxxxx]

2754	9. Contributors

2756	   Tom Taylor has made a very significant contribution, for which the
2757	   authors are very grateful, to this specification by helping rewrite
2758	   the specification.  The parts regarding the algorithm for determining
2759	   bounding sets for TMMBR have especially benefited.

2761	10. Acknowledgements

2763	   The authors would like to thank Andrea Basso, Orit Levin, and Nermeen
2764	   Ismail for their work on the requirement and discussion draft
2765	   [Basso].

2767	   Drafts of this memo were reviewed and extensively commented by Roni
2768	   Even, Colin Perkins, Randell Jesup, Keith Lantz, Harikishan Desineni,
2769	   Guido Franceschini and others.  The authors appreciate these reviews.

2771	   Funding for the RFC Editor function is currently provided by the
2772	   Internet Society.

2774	11. References

2776	11.1. Normative references

2778	   [RFC4585]   Ott, J., Wenger, S., Sato, N., Burmeister, C., and J.
2779	                Rey, "Extended RTP Profile for Real-Time Transport
2780	                Control Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC
2781	                4585, July 2006.
2782	   [RFC2119]   Bradner, S., "Key words for use in RFCs to Indicate
2783	                Requirement Levels", BCP 14, RFC 2119, March 1997.
2784	   [RFC3550]   Schulzrinne, H.,  Casner, S., Frederick, R., and V.
2785	                Jacobson, "RTP: A Transport Protocol for Real-Time
2786	                Applications", STD 64, RFC 3550, July 2003.
2787	   [RFC4566]   Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
2788	                Description Protocol", RFC 4566, July 2006.
2789	   [RFC3264]   Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
2790	                with Session Description Protocol (SDP)", RFC 3264, June
2791	                2002.
2792	   [Topologies] M. Westerlund, and S. Wenger, "RTP Topologies", draft-
2793	                ietf-avt-topologies-04, work in progress, Feb 2007
2794	   [RFC2434]   Narten, T. and H. Alvestrand, "Guidelines for Writing an
2795	                IANA Considerations Section in RFCs", BCP 26, RFC 2434,
2796	                October 1998.
2797	   [RFC4234]   Crocker, D. and P. Overell, "Augmented BNF for Syntax
2798	                Specifications: ABNF", RFC 4234, October 2005.

2800	11.2. Informative references

2802	   [Basso]     A. Basso, et. al., "Requirements for transport of video
2803	                control commands", draft-basso-avt-videoconreq-02.txt,
2804	                expired Internet Draft, October 2004.
2805	   [AVC]       Joint Video Team of ITU-T and ISO/IEC JTC 1, Draft ITU-T
2806	                Recommendation and Final Draft International Standard of
2807	                Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC
2808	                14496-10 AVC), Joint Video Team (JVT) of ISO/IEC MPEG
2809	                and ITU-T VCEG, JVT-G050, March 2003.
2810	   [H245]      ITU-T Rec. H.245, "Control protocol for multimedia
2811	                communication", May 2006.
2812	   [NEWPRED]   S. Fukunaga, T. Nakai, and H. Inoue, "Error Resilient
2813	                Video Coding by Dynamic Replacing of Reference
2814	                Pictures," in Proc. Globecom'96, vol. 3, pp. 1503 - 1508,
2815	                1996.
2816	   [SRTP]      Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
2817	                Norrman, "The Secure Real-time Transport Protocol
2818	                (SRTP)", RFC 3711, March 2004.

2820	   [RFC4587]   Even, R., "RTP Payload Format for H.261 Video Streams",
2821	                RFC 4587, August 2006.

2823	   [SAVPF]     J. Ott, E. Carrara, "Extended Secure RTP Profile for
2824	                RTCP-based Feedback (RTP/SAVPF)," draft-ietf-avt-
2825	                profile-savpf-10.txt, February, 2007.
2826	   [RFC3525]   Groves, C., Pantaleo, M., Anderson, T., and T. Taylor,
2827	                "Gateway Control Protocol Version 1", RFC 3525, June
2828	                2003.
2829	   [RFC3448]   Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP
2830	                Friendly Rate Control (TFRC): Protocol Specification",
2831	                RFC 3448, January 2003.
2832	   [VBCM]      ITU-T Rec. H.271, "Video Back Channel Messages", June
2833	                2006
2834	   [RFC3890]   Westerlund, M., "A Transport Independent Bandwidth
2835	                Modifier for the Session Description Protocol (SDP)",
2836	                RFC 3890, September 2004.
2837	   [RFC4340]   Kohler, E., Handley, M., and S. Floyd, "Datagram
2838	                Congestion Control Protocol (DCCP)", RFC 4340, March
2839	                2006.
2840	   [RFC3261]   Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
2841	                A., Peterson, J., Sparks, R., Handley, M., and E.
2842	                Schooler, "SIP: Session Initiation Protocol", RFC 3261,
2843	                June 2002.
2844	   [RFC2198]   Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
2845	                Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
2846	                Parisis, "RTP Payload for Redundant Audio Data", RFC
2847	                2198, September 1997.

2849	12. Authors' Addresses

2851	   Stephan Wenger
2852	   Nokia Corporation
2853	   975, Page Mill Road,
2854	   Palo Alto, CA 94304
2855	   USA

2857	   Phone: +1-650-862-7368
2858	   EMail: stewe@stewe.org

2860	   Umesh Chandra
2861	   Nokia Research Center
2862	   975, Page Mill Road,
2863	   Palo Alto, CA 94304
2864	   USA

2866	   Phone: +1-650-796-7502
2867	   EMail: Umesh.Chandra@nokia.com

2869	   Magnus Westerlund
2870	   Ericsson Research
2871	   Ericsson AB
2872	   SE-164 80 Stockholm, SWEDEN

2874	   Phone: +46 8 7190000
2875	   EMail: magnus.westerlund@ericsson.com

2877	   Bo Burman
2878	   Ericsson Research
2879	   Ericsson AB
2880	   SE-164 80 Stockholm, SWEDEN

2882	   Phone: +46 8 7190000
2883	   EMail: bo.burman@ericsson.com

2885	Full Copyright Statement

2887	   Copyright (C) The IETF Trust (2007).

2889	   This document is subject to the rights, licenses and restrictions
2890	   contained in BCP 78, and except as set forth therein, the authors
2891	   retain all their rights.

2893	   This document and the information contained herein are provided on an
2894	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2895	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST
2896	   AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
2897	   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
2898	   THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
2899	   IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
2900	   PURPOSE.

2902	Intellectual Property

2904	   The IETF takes no position regarding the validity or scope of any
2905	   Intellectual Property Rights or other rights that might be claimed to
2906	   pertain to the implementation or use of the technology described in
2907	   this document or the extent to which any license under such rights
2908	   might or might not be available; nor does it represent that it has
2909	   made any independent effort to identify any such rights.  Information
2910	   on the procedures with respect to rights in RFC documents can be
2911	   found in BCP 78 and BCP 79.

2913	   Copies of IPR disclosures made to the IETF Secretariat and any
2914	   assurances of licenses to be made available, or the result of an
2915	   attempt made to obtain a general license or permission for the use of
2916	   such proprietary rights by implementers or users of this
2917	   specification can be obtained from the IETF on-line IPR repository at
2918	   http://www.ietf.org/ipr.

2920	   The IETF invites any interested party to bring to its attention any
2921	   copyrights, patents or patent applications, or other proprietary
2922	   rights that may cover technology that may be required to implement
2923	   this standard.  Please address the information to the IETF at
2924	   ietf-ipr@ietf.org.

2926	Acknowledgement

2928	   Funding for the RFC Editor function is provided by the IETF
2929	   Administrative Support Activity (IASA).

2931	RFC Editor Considerations

2933	   The RFC editor is requested to replace all occurrences of XXXX with
2934	   the RFC number this document receives.