idnits 2.17.1 draft-ietf-avt-avpf-ccm-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 19. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 2987. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 3000. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 3009. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 3015. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. -- The document has examples using IPv4 documentation addresses according to RFC6890, but does not use any IPv6 documentation addresses. Maybe there should be IPv6 examples, too? Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year == Line 769 has weird spacing: '...sg type mul...' == Line 1159 has weird spacing: '... ab c s...' == Line 1161 has weird spacing: '... ba s...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. 
If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (May 30, 2007) is 6174 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFCxxxx' is mentioned on line 2835, but not defined ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866) ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226) ** Obsolete normative reference: RFC 4234 (Obsoleted by RFC 5234) -- Obsolete informational reference (is this intentional?): RFC 2032 (Obsoleted by RFC 4587) == Outdated reference: A later version (-12) exists of draft-ietf-avt-profile-savpf-10 -- Obsolete informational reference (is this intentional?): RFC 3525 (Obsoleted by RFC 5125) -- Obsolete informational reference (is this intentional?): RFC 3448 (Obsoleted by RFC 5348) == Outdated reference: A later version (-07) exists of draft-ietf-avt-topologies-04 Summary: 4 errors (**), 0 flaws (~~), 9 warnings (==), 11 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Working Group Stephan Wenger 3 INTERNET-DRAFT Umesh Chandra 4 Expires: October 2007 Nokia 5 Magnus Westerlund 6 Bo Burman 7 Ericsson 8 May 30, 2007 10 Codec Control Messages in the 11 RTP Audio-Visual Profile with Feedback (AVPF) 12 <draft-ietf-avt-avpf-ccm-07.txt> 14 Status of this Memo 16 By submitting this Internet-Draft, each author represents that any 17 applicable patent or other IPR claims of which he or she is aware 18 have been or will be disclosed, and any of which he or she becomes 19 aware will be disclosed, in accordance with Section 6 of BCP 79. 21 Internet-Drafts are working documents of the Internet Engineering 22 Task Force (IETF), its areas, and its working groups. Note that 23 other groups may also distribute working documents as Internet- 24 Drafts. 26 Internet-Drafts are draft documents valid for a maximum of six 27 months and may be updated, replaced, or obsoleted by other 28 documents at any time. It is inappropriate to use Internet-Drafts 29 as reference material or to cite them other than as "work in 30 progress." 32 The list of current Internet-Drafts can be accessed at 33 http://www.ietf.org/ietf/1id-abstracts.txt. 35 The list of Internet-Draft Shadow Directories can be accessed at 36 http://www.ietf.org/shadow.html. 38 Copyright Notice 40 Copyright (C) The IETF Trust (2007). 42 Abstract 44 This document specifies a few extensions to the messages defined 45 in the Audio-Visual Profile with Feedback (AVPF). They are 46 helpful primarily in conversational multimedia scenarios where 47 centralized multipoint functionalities are in use. However some 48 are also usable in smaller multicast environments and point-to- 49 point calls. The extensions discussed are messages related to the 50 ITU-T H.271 Video Back Channel, Full Intra Request, Temporary 51 Maximum Media Stream Bit Rate and Temporal Spatial Trade-off. 53 TABLE OF CONTENTS 55 1. 
Introduction....................................................5 56 2. Definitions.....................................................6 57 2.1. Glossary...................................................6 58 2.2. Terminology................................................6 59 2.3. Topologies.................................................9 60 3. Motivation (Informative).......................................10 61 3.1. Use Cases.................................................10 62 3.2. Using the Media Path......................................12 63 3.3. Using AVPF................................................13 64 3.3.1. Reliability..........................................13 65 3.4. Multicast.................................................13 66 3.5. Feedback Messages.........................................13 67 3.5.1. Full Intra Request Command...........................14 68 3.5.1.1. Reliability.....................................14 69 3.5.2. Temporal Spatial Trade-off Request and Notification..15 70 3.5.2.1. Point-to-Point..................................16 71 3.5.2.2. Point-to-Multipoint Using Multicast or 72 Translators.....................................17 73 3.5.2.3. Point-to-Multipoint Using RTP Mixer.............17 74 3.5.2.4. Reliability.....................................17 75 3.5.3. H.271 Video Back Channel Message.....................18 76 3.5.3.1. Reliability.....................................21 77 3.5.4. Temporary Maximum Media Stream Bit Rate Request and 78 Notification.........................................21 79 3.5.4.1. Behavior for media receivers using TMMBR........23 80 3.5.4.2. Algorithm for establishing current limitations..25 81 3.5.4.3. Use of TMMBR in a Mixer Based Multipoint 82 Operation.......................................32 83 3.5.4.4. Use of TMMBR in Point-to-Multipoint Using 84 Multicast or Translators........................33 85 3.5.4.5. Use of TMMBR in Point-to-point operation........33 86 3.5.4.6. 
Reliability.....................................33 87 4. RTCP Receiver Report Extensions................................35 88 4.1. Design Principles of the Extension Mechanism..............35 89 4.2. Transport Layer Feedback Messages.........................36 90 4.2.1. Temporary Maximum Media Stream Bit Rate Request 91 (TMMBR)..............................................37 92 4.2.1.1. Message Format..................................37 93 4.2.1.2. Semantics.......................................38 94 4.2.1.3. Timing Rules....................................42 95 4.2.1.4. Handling in Translator and Mixers...............42 96 4.2.2. Temporary Maximum Media Stream Bit Rate Notification 97 (TMMBN)..............................................42 98 4.2.2.1. Message Format..................................42 99 4.2.2.2. Semantics.......................................43 100 4.2.2.3. Timing Rules....................................44 101 4.2.2.4. Handling by Translators and Mixers..............44 102 4.3. Payload Specific Feedback Messages........................44 103 4.3.1. Full Intra Request (FIR).............................45 104 4.3.1.1. Message Format..................................45 105 4.3.1.2. Semantics.......................................46 106 4.3.1.3. Timing Rules....................................48 107 4.3.1.4. Handling of FIR Message in Mixer and Translators48 108 4.3.1.5. Remarks.........................................49 109 4.3.2. Temporal-Spatial Trade-off Request (TSTR)............49 110 4.3.2.1. Message Format..................................49 111 4.3.2.2. Semantics.......................................50 112 4.3.2.3. Timing Rules....................................51 113 4.3.2.4. Handling of message in Mixers and Translators...51 114 4.3.2.5. Remarks.........................................51 115 4.3.3. Temporal-Spatial Trade-off Notification (TSTN).......51 116 4.3.3.1. Message Format..................................52 117 4.3.3.2. 
Semantics.......................................52 118 4.3.3.3. Timing Rules....................................53 119 4.3.3.4. Handling of TSTN in Mixer and Translators.......53 120 4.3.3.5. Remarks.........................................53 121 4.3.4. H.271 Video Back Channel Message (VBCM)..............53 122 4.3.4.1. Message Format..................................54 123 4.3.4.2. Semantics.......................................55 124 4.3.4.3. Timing Rules....................................56 125 4.3.4.4. Handling of message in Mixer or Translator......56 126 4.3.4.5. Remarks.........................................56 127 5. Congestion Control.............................................57 128 6. Security Considerations........................................57 129 7. SDP Definitions................................................58 130 7.1. Extension of the rtcp-fb Attribute........................58 131 7.2. Offer-Answer..............................................60 132 7.3. Examples..................................................60 133 8. IANA Considerations............................................64 134 9. Acknowledgements...............................................65 135 10. References....................................................67 136 10.1. Normative references.....................................67 137 10.2. Informative references...................................67 138 11. Authors' Addresses............................................69 139 1. Introduction 141 When the Audio-Visual Profile with Feedback (AVPF) [RFC4585] was 142 developed, the main emphasis lay in the efficient support of 143 point-to-point and small multipoint scenarios without centralized 144 multipoint control. However, in practice, many small multipoint 145 conferences operate utilizing devices known as Multipoint Control 146 Units (MCUs). 
Long-standing experience of the conversational 147 video conferencing industry suggests that there is a need for a 148 few additional feedback messages, to support centralized 149 multipoint conferencing efficiently. Some of the messages have 150 applications beyond centralized multipoint, and this is indicated 151 in the description of the message. This is especially true for 152 the message intended to carry ITU-T Rec. H.271 [H.271] bit strings 153 for Video Back Channel messages. 155 In Real-time Transport Protocol (RTP) [RFC3550] terminology, MCUs 156 comprise mixers and translators. Most MCUs also include signaling 157 support. During the development of this memo, it was noticed that 158 there is considerable confusion in the community related to the 159 use of terms such as mixer, translator, and MCU. In response to 160 these concerns, a number of topologies have been identified that 161 are of practical relevance to the industry, but are not documented 162 in sufficient detail in [RFC3550]. These topologies are 163 documented in [Topologies], and understanding this memo requires 164 previous or parallel study of [Topologies]. 166 Some of the messages defined here are forward only, in that they 167 do not require an explicit notification to the message emitter 168 that they have been received and/or indicating the message 169 receiver's actions. Other messages require a response, leading to 170 a two way communication model that one could view as useful for 171 control purposes. However, it is not the intention of this memo 172 to open up RTP Control Protocol (RTCP) to a generalized control 173 protocol. All mentioned messages have relatively strict real-time 174 constraints, in the sense that their value diminishes with 175 increased delay. This makes the use of more traditional control 176 protocol means, such as Session Initiation Protocol (SIP) re- 177 INVITEs [RFC3261], undesirable when used for the same purpose. 
178 Furthermore, all messages are of a very simple format that can be 179 easily processed by an RTP/RTCP sender/receiver. Finally, and 180 most importantly, all messages relate only to the RTP stream with 181 which they are associated, and not to any other property of a 182 communication system. In particular, none of them relate to the 183 properties of the access links traversed by the session. 185 2. Definitions 187 2.1. Glossary 189 AIMD - Additive Increase Multiplicative Decrease 190 AVPF - The extended RTP profile for RTCP-based feedback 191 FEC - Forward Error Correction 192 FCI - Feedback Control Information [RFC4585] 193 FIR - Full Intra Request 194 MCU - Multipoint Control Unit 195 MPEG - Moving Picture Experts Group 196 TMMBN - Temporary Maximum Media Stream Bit Rate Notification 197 TMMBR - Temporary Maximum Media Stream Bit Rate Request 198 PLI - Picture Loss Indication 199 PR - Packet rate 200 QP - Quantizer Parameter 201 RTT - Round trip time 202 SSRC - Synchronization Source 203 TSTN - Temporal Spatial Trade-off Notification 204 TSTR - Temporal Spatial Trade-off Request 205 VBCM - Video Back Channel Message indication. 207 2.2. Terminology 209 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL 210 NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and 211 "OPTIONAL" in this document are to be interpreted as described in 212 RFC 2119 [RFC2119]. 214 Message: 215 An RTCP feedback message [RFC4585] defined by this 216 specification, of one of the following types: 218 Request: 219 Message that requires acknowledgement 221 Command: 222 Message that forces the receiver to an action 224 Indication: 225 Message that reports a situation 227 Notification: 228 Message that provides a notification that an event has 229 occurred. Notifications are commonly generated in 230 response to a Request. 232 Note that, with the exception of "Notification", this 233 terminology is in alignment with ITU-T Rec. H.245 [H245]. 
235 Decoder Refresh Point: 236 A bit string, packetized in one or more RTP packets, which 237 completely resets the decoder to a known state. 239 Examples for "hard" decoder refresh points are Intra 240 pictures in H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 part 241 2, and Instantaneous Decoder Refresh (IDR) pictures in 242 H.264. "Gradual" decoder refresh points may also be used; 243 see for example [AVC]. While both "hard" and "gradual" 244 decoder refresh points are acceptable in the scope of this 245 specification, in most cases the user experience will 246 benefit from using a "hard" decoder refresh point. 248 A decoder refresh point also contains all header 249 information above the picture layer (or equivalent, 250 depending on the video compression standard) that is 251 conveyed in-band. In H.264, for example, a decoder refresh 252 point contains parameter set Network Adaptation Layer (NAL) 253 units that generate parameter sets necessary for the 254 decoding of the following slice/data partition NAL units 255 (and that are not conveyed out of band). 257 Decoding: 258 The operation of reconstructing the media stream. 260 Rendering: 261 The operation of presenting (parts of) the reconstructed 262 media stream to the user. 264 Stream thinning: 265 The operation of removing some of the packets from a media 266 stream. Stream thinning, preferably, is media-aware, 267 implying that media packets are removed in the order of 268 increasing relevance to the reproductive quality. However 269 even when employing media-aware stream thinning, most media 270 streams quickly lose quality when subject to increasing 271 levels of thinning. Media-unaware stream thinning leads to 272 even worse quality degradation. In contrast to 273 transcoding, stream thinning is typically seen as a 274 computationally lightweight operation. 276 Media: 277 Often used (sometimes in conjunction with terms like bit 278 rate, stream, sender ...) 
to identify the content of the 279 forward RTP packet stream (carrying the codec data), to 280 which the codec control message applies. 282 Media Stream: 283 The stream of RTP packets labeled with a single 284 Synchronization Source (SSRC) carrying the media (and also 285 in some cases repair information such as retransmission or 286 Forward Error Correction (FEC) information). 288 Total media bit rate: 289 The total bits per second transferred in a media stream, 290 measured at an observer-selected protocol layer and 291 averaged over a reasonable timescale, the length of which 292 depends on the application. In general, a media sender and 293 a media receiver will observe different total media bit 294 rates for the same stream, first because they may have 295 selected different reference protocol layers, and second, 296 because of changes in per-packet overhead along the 297 transmission path. The goal with bit rate averaging is to 298 be able to ignore any burstiness on very short timescales, 299 below for example 100 ms, introduced by scheduling or link 300 layer packetization effects. 302 Maximum total media bit rate: 303 The upper limit on total media bit rate for a given media 304 stream at a particular receiver and for its selected 305 protocol layer. Note that this value cannot be measured on 306 the received media stream, instead it needs to be 307 calculated or determined through other means, such as QoS 308 negotiations or local resource limitations. Also note that 309 this value is an average (on a timescale that is reasonable 310 for the application) and that it may be different from the 311 instantaneous bit-rate seen by packets in the media stream. 313 Overhead: 314 All protocol header information required to convey a packet 315 with media data from sender to receiver, from the 316 application layer down to a pre-defined protocol level (for 317 example down to, and including, the IP header). 
Overhead 318 may include, for example, IP, UDP, and RTP headers, any 319 layer 2 headers, any Contributing Sources (CSRCs), RTP 320 padding, and RTP header extensions. Overhead excludes any 321 RTP payload headers and the payload itself. 323 Net media bit rate: 324 The bit rate carried by a media stream, net of overhead. 325 That is, the bits per second accounted for by encoded 326 media, any applicable payload headers, and any directly 327 associated meta payload information placed in the RTP 328 packet. A typical example of the latter is redundancy data 329 provided by the use of RFC 2198 [RFC2198]. Note that, 330 unlike the total media bit rate, the net media bit rate 331 will have the same value at the media sender and at the 332 media receiver unless any mixing or translating of the 333 media has occurred. 335 For a given observer, the total media bit rate for a media 336 stream is equal to the net media bit rate plus the product 337 of the per-packet overhead (as defined above) and the 338 packet rate. 340 Feasible region: 341 The set of all combinations of packet rate and net media 342 bit rate that do not exceed the restrictions in maximum 343 media bit rate placed on a given media sender by the 344 Temporary Maximum Media Stream Bit Rate Request (TMMBR) 345 messages it has received. The feasible region will change 346 as new TMMBR messages are received. 348 Bounding set: 349 The set of TMMBR tuples, selected from all those received 350 at a given media sender, that define the feasible region 351 for that media sender. The media sender uses an algorithm 352 such as that in section 3.5.4.2 353 to determine or iteratively 354 approximate the current bounding set, and reports that set 355 back to the media receivers in a Temporary Maximum Media 356 Stream Bit Rate Notification (TMMBN) message. 358 2.3. Topologies 360 Please refer to [Topologies] for an in-depth discussion. 
The 361 topologies referred to throughout this memo are labeled 362 (consistently with [Topologies]) as follows: 364 Topo-Point-to-Point . . . . point-to-point communication 365 Topo-Multicast . . . . . . multicast communication as in RFC 3550 366 Topo-Translator . . . . . . translator based as in RFC 3550 367 Topo-Mixer . . . . . . . . mixer based as in RFC 3550 368 Topo-Video-switch-MCU . . . video switching MCU 369 Topo-RTCP-terminating-MCU . mixer but terminating RTCP 371 3. Motivation 373 This section discusses the motivation and usage of the different 374 video and media control messages. The video control messages have 375 been under discussion for a long time, and a requirements draft was 376 drawn up [Basso]. This draft has expired; however, we quote 377 relevant sections of it to provide motivation and requirements. 379 3.1. Use Cases 382 There are a number of possible usages for the proposed feedback 383 messages. Let us begin by looking through the use cases Basso et 384 al. [Basso] proposed. Some of the use cases have been 385 reformulated and comments have been added. 387 1. An RTP video mixer composes multiple encoded video sources into 388 a single encoded video stream. Each time a video source is 389 added, the RTP mixer needs to request a decoder refresh point 390 from the video source, so as to start an uncorrupted prediction 391 chain on the spatial area of the mixed picture occupied by the 392 data from the new video source. 394 2. An RTP video mixer receives multiple encoded RTP video streams 395 from conference participants, and dynamically selects one of 396 the streams to be included in its output RTP stream. At the 397 time of a bit stream change (determined through means such as 398 voice activation or the user interface), the mixer requests a 399 decoder refresh point from the remote source, in order to avoid 400 using unrelated content as reference data for inter picture 401 prediction. 
After requesting the decoder refresh point, the 402 video mixer stops the delivery of the current RTP stream and 403 monitors the RTP stream from the new source until it detects 404 data belonging to the decoder refresh point. At that time, the 405 RTP mixer starts forwarding the newly selected stream to the 406 receiver(s). 408 3. An application needs to signal to the remote encoder that the 409 desired trade-off between temporal and spatial resolution has 410 changed. For example, one user may prefer a higher frame rate 411 and a lower spatial quality, and another user may prefer the 412 opposite. This choice is also highly content dependent. Many 413 current video conferencing systems offer in the user interface 414 a mechanism to make this selection, usually in the form of a 415 slider. The mechanism is helpful in point-to-point, 416 centralized multipoint and non-centralized multipoint uses. 418 4. Use case 4 of the Basso draft applies only to Picture Loss 419 Indication (PLI) as defined in AVPF [RFC4585] and is not 420 reproduced here. 422 5. Use case 5 of the Basso draft relates to a mechanism known as 423 "freeze picture request". Sending freeze picture requests 424 over a non-reliable forward RTCP channel has been identified as 425 problematic. Therefore, no freeze picture request has been 426 included in this memo, and the use case discussion is not 427 reproduced here. 429 6. A video mixer dynamically selects one of the received video 430 streams to be sent out to participants and tries to provide the 431 highest bit rate possible to all participants, while minimizing 432 stream trans-rating. One way of achieving this is to set up 433 sessions with endpoints using the maximum bit rate accepted by 434 each endpoint, and accepted by the call admission method used 435 by the mixer. 
By means of commands that reduce the maximum 436 media stream bit rate below what has been negotiated during 437 session setup, the mixer can reduce the maximum bit rate sent 438 by endpoints to the lowest of all the accepted bit rates. As 439 the lowest accepted bit rate changes due to endpoints joining 440 and leaving or due to network congestion, the mixer can adjust 441 the limits at which endpoints can send their streams to match 442 the new value. The mixer then requests a new maximum bit rate, 443 which is equal to or less than the maximum bit rate negotiated 444 at session setup for a specific media stream, and the remote 445 endpoint can respond with the actual bit rate that it can 446 support. 448 The picture Basso et al. draw up covers most applications we 449 foresee. However, we would like to extend the list with two 450 additional use cases: 452 7. Currently deployed congestion control algorithms (AIMD and TFRC 453 [RFC3448]) probe for additional available capacity as long as 454 there is something to send. With congestion control algorithms 455 using packet loss as the indication for congestion, this 456 probing generally results in reduced media quality (often 457 to a point where the distortion is large enough to make the 458 media unusable), due to packet loss and increased delay. 460 In a number of deployment scenarios, especially cellular ones, 461 the bottleneck link is often the last hop link. That cellular 462 link also commonly has some type of QoS negotiation enabling 463 the cellular device to learn the maximal bit rate available 464 over this last hop. A media receiver behind this link can, in 465 most (if not all) cases, calculate at least an upper bound for 466 the bit rate available for each media stream it presently 467 receives. How this is done is an implementation detail and not 468 discussed herein. 
Indicating the maximum available bit rate to 469 the transmitting party for the various media streams can be 470 beneficial to prevent that party from probing for bandwidth for 471 this stream in excess of a known hard limit. For cellular or 472 other mobile devices, the known available bit rate for each 473 stream (deduced from the link bit rate) can change quickly, due 474 to handover to another transmission technology, QoS 475 renegotiation due to congestion, etc. To enable minimal 476 disruption of service, quick convergence is necessary, and 477 therefore media path signaling is desirable. 479 8. The use of reference picture selection (RPS) as an error 480 resilience tool was introduced in 1997 as NEWPRED 481 [NEWPRED], and is now widely deployed. When RPS is in use, 482 simplistically put, the receiver can send a feedback message to 483 the sender, indicating a reference picture that should be used 484 for future prediction. ([NEWPRED] mentions other forms of 485 feedback as well.) AVPF contains a mechanism for conveying 486 such a message, but does not specify the codec for which, or 487 the syntax to which, the message should conform. 488 Recently, the ITU-T finalized Rec. H.271 which (among other 489 message types) also includes a feedback message. It is 490 expected that this feedback message will fairly quickly enjoy 491 wide support. Therefore, a mechanism to convey feedback 492 messages according to H.271 appears to be desirable. 494 3.2. Using the Media Path 496 There are multiple reasons why we use the media path for the codec 497 control messages. 499 First, systems employing MCUs often separate the control and media 500 processing parts. As these messages are intended for or generated 501 by the media part rather than the signaling part of the MCU, 502 having them on the media path avoids transmission across 503 interfaces and unnecessary control traffic between signaling and 504 processing. 
If the MCU is physically decomposed, the use of the 505 media path avoids the need for media control protocol extensions 506 (e.g. in MEGACO [RFC3525]). 508 Secondly, the signaling path quite commonly contains several 509 signaling entities, e.g. SIP proxies and application servers. 511 Avoiding going through signaling entities avoids delay for several 512 reasons. Proxies have less stringent delay requirements than 513 media processing and due to their complex and more generic nature 514 may result in significant processing delay. The topological 515 locations of the signaling entities are also commonly not 516 optimized for minimal delay, but rather towards other 517 architectural goals. Thus the signaling path can be significantly 518 longer in both geographical and delay sense. 520 3.3. Using AVPF 522 The AVPF feedback message framework [RFC4585] provides the 523 appropriate framework to implement the new messages. AVPF 524 implements rules controlling the timing of feedback messages to 525 avoid congestion through network flooding by RTCP traffic. We re- 526 use these rules by referencing AVPF. 528 The signaling setup for AVPF allows each individual type of 529 function to be configured or negotiated on an RTP session basis. 531 3.3.1. Reliability 533 The use of RTCP messages implies that each message transfer is 534 unreliable, unless the lower layer transport provides reliability. 535 The different messages proposed in this specification have 536 different requirements in terms of reliability. However, in all 537 cases, the reaction to an (occasional) loss of a feedback message 538 is specified. 540 3.4. Multicast 542 The codec control messages might be used with multicast. The RTCP 543 timing rules specified in [RFC3550] and [RFC4585] ensure that the 544 messages do not cause overload of the RTCP connection. The use of 545 multicast may result in the reception of messages with 546 inconsistent semantics. 
The reaction to inconsistencies depends 547 on the message type, and is discussed for each message type 548 separately. 550 3.5. Feedback Messages 552 This section describes the semantics of the different feedback 553 messages and how they apply to the different use cases. 555 3.5.1. Full Intra Request Command 557 A Full Intra Request (FIR) Command, when received by the 558 designated media sender, requires that the media sender sends a 559 Decoder Refresh Point (see 2.2) at the earliest opportunity. The 560 evaluation of such opportunity includes the current encoder coding 561 strategy and the current available network resources. 563 FIR is also known as an "instantaneous decoder refresh request" or 564 "video fast update request". 566 Using a decoder refresh point implies refraining from using any 567 picture sent prior to that point as a reference for the encoding 568 process of any subsequent picture sent in the stream. For 569 predictive media types that are not video, the analogue applies. 570 For example, if in MPEG-4 systems scene updates are used, the 571 decoder refresh point consists of the full representation of the 572 scene and is not delta-coded relative to previous updates. 574 Decoder refresh points, especially Intra or IDR pictures, are in 575 general several times larger in size than predicted pictures. 576 Thus, in scenarios in which the available bit rate is small, the 577 use of a decoder refresh point implies a delay that is 578 significantly longer than the typical picture duration. 580 Usage in multicast is possible; however aggregation of the 581 commands is recommended. A receiver that receives a request 582 closely (within 2 times the longest Round Trip Time (RTT) known, 583 plus any AVPF-induced RTCP packet sending delays, if those are 584 known) after sending a decoder refresh point, should await a 585 second request message to ensure that the media receiver has not 586 been served by the previously delivered decoder refresh point. 
The reason for the specified delay is to avoid sending unnecessary
decoder refresh points. A session participant may have sent its
own request while another participant's request was in flight to
it. Suppressing those requests that may have been sent without
knowledge about the other request avoids this issue.

Using the FIR command to recover from errors is explicitly
disallowed, and instead the PLI message defined in AVPF [RFC4585]
should be used. The PLI message reports lost pictures and has
been included in AVPF for precisely that purpose.

Full Intra Request is applicable in use cases 1 and 2.

3.5.1.1. Reliability

The FIR message results in the delivery of a decoder refresh
point, unless the message is lost. Decoder refresh points are
easily identifiable from the bit stream. Therefore, there is no
need for protocol-level notification, and a simple command
repetition mechanism is sufficient for ensuring the level of
reliability required. However, the potential use of repetition
does require a mechanism to prevent the recipient from responding
to messages already received and responded to.

To ensure the best possible reliability, a sender of FIR may
repeat the FIR request until the desired content has been
received. The repetition interval is determined by the RTCP
timing rules applicable to the session. Upon reception of a
complete decoder refresh point or the detection of an attempt to
send a decoder refresh point (which got damaged due to a packet
loss), the repetition of the FIR must stop. If another FIR is
necessary, the request sequence number must be increased. An FIR
sender shall not have more than one FIR request (different request
sequence number) outstanding at any time per media sender in the
session.

The receiver of FIR (i.e. the media sender) behaves in
complementary fashion to ensure delivery of a decoder refresh
point. If it receives repetitions of the FIR more than 2*RTT
after it has sent a decoder refresh point, it shall send a new
decoder refresh point. Two round trip times allow time for the
decoder refresh point to arrive back to the requestor and for the
end of repetitions of FIR to reach and be detected by the media
sender.

An RTP mixer that receives an FIR from a media receiver is
responsible for ensuring that a decoder refresh point is delivered
to the requesting receiver. It may be necessary for the mixer to
generate FIR commands. From a reliability perspective, the two
legs (FIR-requesting endpoint to mixer, and mixer to decoder
refresh point generating endpoint) are handled independently from
each other.

3.5.2. Temporal Spatial Trade-off Request and Notification

The Temporal Spatial Trade-off Request (TSTR) instructs the video
encoder to change its trade-off between temporal and spatial
resolution. Index values from 0 to 31 indicate monotonically a
desire for higher frame rate. That is, a requester asking for an
index of 0 prefers high quality and is willing to accept a low
frame rate, whereas a requester asking for 31 wishes a high frame
rate, potentially at the cost of low spatial quality.

In general the encoder reaction time may be significantly longer
than the typical picture duration. See use case 3 for an example.
The encoder decides whether and to what extent the request results
in a change of the trade-off. It returns a Temporal Spatial
Trade-off Notification (TSTN) message to indicate the trade-off
that it will use henceforth.

TSTR and TSTN have been introduced primarily because it is
believed that control protocol mechanisms, e.g. a SIP re-INVITE,
are too heavyweight and too slow to allow for a reasonable user
experience.
Consider, for example, a user interface where the
remote user selects the temporal/spatial trade-off with a slider
(as is common in state-of-the-art video conferencing systems).
Immediate feedback to any slider movement is required for a
reasonable user experience. A SIP re-INVITE [RFC3261] would
require at least two round-trips more (compared to the TSTR/TSTN
mechanism) and may involve proxies and other complex mechanisms.
Even in a well-designed system, it could take a second or so until
the new trade-off is finally selected.
Furthermore the use of RTCP solves the multicast use case very
efficiently.

The use of TSTR and TSTN in multipoint scenarios is a non-trivial
subject, and can be achieved in many implementation-specific ways.
Problems stem from the fact that TSTRs will typically arrive
unsynchronized, and may request different trade-off values for the
same stream and/or endpoint encoder. This memo does not specify a
translator, mixer or endpoint's reaction to the reception of a
suggested trade-off as conveyed in the TSTR. We only require the
receiver of a TSTR message to reply to it by sending a TSTN,
carrying the new trade-off chosen by its own criteria (which may
or may not be based on the trade-off conveyed by the TSTR). In
other words, the trade-off sent in TSTR is a non-binding
recommendation, nothing more.

Three TSTR/TSTN scenarios need to be distinguished, based on the
topologies described in [Topologies]. The scenarios are described
in the following sub-clauses.

3.5.2.1. Point-to-Point

In this most trivial case (Topo-Point-to-Point), the media sender
typically adjusts its temporal/spatial trade-off based on the
requested value in TSTR, subject to its own capabilities.
The
TSTN message conveys back the new trade-off value (which may be
identical to the old one if, for example, the sender is not
capable of adjusting its trade-off).

3.5.2.2. Point-to-Multipoint Using Multicast or Translators

RTCP Multicast is used either with media multicast according to
Topo-Multicast, or following RFC 3550's translator model according
to Topo-Translator. In these cases, unsynchronized TSTR messages
from different receivers may be received, possibly with different
requested trade-offs (because of different user preferences).
This memo does not specify how the media sender tunes its
trade-off. Possible strategies include selecting the mean or
median of all trade-off requests received, giving priority to
certain participants, or continuing to use the previously selected
trade-off (e.g. when the sender is not capable of adjusting it).
Again, all TSTR messages need to be acknowledged by TSTN, and the
value conveyed back has to reflect the decision made.

3.5.2.3. Point-to-Multipoint Using RTP Mixer

In this scenario (Topo-Mixer) the RTP mixer receives all TSTR
messages, and has the opportunity to act on them based on its own
criteria. In most cases, the mixer should form a "consensus" of
potentially conflicting TSTR messages arriving from different
participants, and initiate its own TSTR message(s) to the media
sender(s). As in the previous scenario, the strategy for forming
this "consensus" is up to the implementation, and can, for
example, encompass averaging the participants' request values,
giving priority to certain participants, or using session default
values.

Even if a mixer or translator performs transcoding, it is very
difficult to deliver media with the requested trade-off, unless
the content the mixer or translator receives is already close to
that trade-off.
Thus if the mixer changes its trade-off, it needs
to request the media sender(s) to use the new value, by creating a
TSTR of its own. Upon reaching a decision on the used trade-off
it includes that value in the acknowledgement to the downstream
requestors. Only in cases where the original source has
substantially higher quality (and bit rate) is it likely that
transcoding alone can result in the requested trade-off.

3.5.2.4. Reliability

A request and reception acknowledgement mechanism is specified.
The Temporal Spatial Trade-off Notification (TSTN) message informs
the request-sender that its request has been received, and what
trade-off is used henceforth. This acknowledgement mechanism is
desirable for at least the following reasons:

o A change in the trade-off cannot be directly identified from the
  media bit stream.
o User feedback cannot be implemented without knowing the chosen
  trade-off value, according to the media sender's constraints.
o Repetitive sending of messages requesting an unimplementable
  trade-off can be avoided.

3.5.3. H.271 Video Back Channel Message

ITU-T Rec. H.271 defines syntax, semantics, and suggested encoder
reaction to a video back channel message. The structure defined
in this memo is used to transparently convey such a message from
media receiver to media sender. In this memo, we refrain from an
in-depth discussion of the available code points within H.271 and
refer to the specification text [H.271] instead.

However, we note that some H.271 messages bear similarities with
native messages of AVPF and this memo. Furthermore, we note that
some H.271 messages are known to require caution in multicast
environments -- or are plainly not usable in multicast or
multipoint scenarios.
Table 1 provides a brief, oversimplifying
overview of the messages currently defined in H.271, their roughly
corresponding AVPF or CCM messages (the latter as specified in
this memo), and an indication of our current knowledge of their
multicast safety.

H.271 msg type         AVPF/CCM msg type  multicast-safe
---------------------------------------------------------------------
0 (when used for
  reference picture
  selection)           AVPF RPSI          No (positive ACK of pictures)
1 picture loss         AVPF PLI           Yes
2 partial loss         AVPF SLI           Yes
3 one parameter CRC    N/A                Yes (no required sender action)
4 all parameter CRC    N/A                Yes (no required sender action)
5 refresh point        CCM FIR            Yes

Table 1: H.271 messages and their AVPF/CCM equivalents

   Note: H.271 message type 0 is not a strict equivalent to
   AVPF's Reference Picture Selection Indication (RPSI); it is
   an indication of known-as-correct reference picture(s) at
   the decoder. It does not command an encoder to use a
   defined reference picture (the form of control information
   envisioned to be carried in RPSI). However, it is believed
   and intended that H.271 message type 0 will be used for the
   same purpose as AVPF's RPSI -- although other use forms are
   also possible.

In response to the opaqueness of the H.271 messages especially
with respect to the multicast safety, the following guidelines
MUST be followed when an implementation wishes to employ the H.271
video back channel message:

1. Implementations utilizing the H.271 feedback message MUST stay
   in compliance with congestion control principles, as outlined
   in section 5.

2. An implementation SHOULD utilize the IETF-native messages as
   defined in [RFC4585] and in this memo instead of similar
   messages defined in [H.271]. Our current understanding of
   similar messages is documented in Table 1 above.
   One good
   reason to deviate from the SHOULD statement above would be if it
   is clearly understood that, for a given application and video
   compression standard, the aforementioned "similarity" is not
   given, in contrast to what the table indicates.

3. It has been observed that some of the H.271 code points
   currently in existence are not multicast-safe. Therefore, the
   sensible thing to do is not to use the H.271 feedback message
   type in multicast environments. It MAY be used only when all
   the issues mentioned later are fully understood by the
   implementer, and properly taken into account by all endpoints.
   In all other cases, the H.271 message type MUST NOT be used in
   conjunction with multicast.

4. It has been observed that even in centralized multipoint
   environments, where the mixer should theoretically be able to
   resolve issues as documented below, the implementation of such
   a mixer and cooperative endpoints is a very difficult and
   tedious task. Therefore, H.271 messages MUST NOT be used in
   centralized multipoint scenarios, unless all the issues
   mentioned below are fully understood by the implementer, and
   properly taken into account by both mixer and endpoints.

Issues to be taken into account when considering the use of H.271
in multipoint environments:

1. Different state on different receivers. In many environments
   it cannot be guaranteed that the decoder state of all media
   receivers is identical at any given point in time. The most
   obvious reason for such a possible misalignment of state is a
   loss that occurs on the path to only one of many media
   receivers. However, there are other not so obvious reasons,
   such as recent joins to the multipoint conference (be it by
   joining the multicast group or through additional mixer
   output).
   Different states can lead the media receivers to
   issue potentially contradicting H.271 messages (or one media
   receiver issuing an H.271 message that, when observed by the
   media sender, is not helpful for the other media receivers). A
   naive reaction of the media sender to these contradicting
   messages can lead to unpredictable and annoying results.

2. Combining messages from different media receivers in a media
   sender is a non-trivial task. As reasons, we note that these
   messages may be contradicting each other, and that their
   transport is unreliable (there may well be other reasons). In
   case of many H.271 messages (i.e. types 0, 2, 3, and 4), the
   algorithm for combining must be aware both of the
   network/protocol environment (i.e. with respect to congestion)
   and of the media codec employed, as H.271 messages of a given
   type can have different semantics for different media codecs.

3. The suppression of requests may need to go beyond the basic
   mechanisms described in AVPF (which are driven exclusively by
   timing and transport considerations on the protocol level).
   For example, a receiver is often required to refrain from (or
   delay) generating requests, based on information it receives
   from the media stream. For instance, it makes no sense for a
   receiver to issue an FIR when a transmission of an Intra/IDR
   picture is ongoing.

4. When using the non-multicast-safe messages (e.g. H.271 type 0
   positive ACK of received pictures/slices) in larger multicast
   groups, the media receiver will likely be forced to delay or
   even omit sending these messages. For the media sender this
   looks like data has not been properly received (although it was
   received properly), and a naively implemented media sender
   reacts to these perceived problems where it should not.

3.5.3.1. Reliability

H.271 Video Back Channel messages do not require reliable
transmission, and confirmation of the reception of a message can
be derived from the forward video bit stream. Therefore, no
specific reception acknowledgement is specified.

With respect to re-sending rules, clause 3.5.1.1 applies.

3.5.4. Temporary Maximum Media Stream Bit Rate Request and
Notification

A receiver, translator or mixer uses the Temporary Maximum Media
Stream Bit Rate Request (TMMBR, "timber") to request a sender to
limit the maximum bit rate for a media stream (see section 2.2)
to, or below, the provided value. The Temporary Maximum Media
Stream Bit Rate Notification (TMMBN) contains the media sender's
current view of the most limiting subset of the TMMBR-defined
limits it has received, to help the participants to suppress TMMBR
requests that would not further restrict the media sender. The
primary usage for the TMMBR/TMMBN messages is in a scenario with
an MCU or mixer (use case 6), corresponding to Topo-Translator or
Topo-Mixer, but also to Topo-Point-to-Point.

Each temporary limitation on the media stream is expressed as a
tuple. The first component of the tuple is the maximum total
media bit rate (as defined in section 2.2) that the media receiver
is currently prepared to accept for this media stream. The second
component is the per-packet overhead that the media receiver has
observed for this media stream at its chosen reference protocol
layer.

As indicated in section 2.2, the overhead as observed by the
sender of the TMMBR (i.e. the media receiver) may differ from the
overhead observed at the receiver of the TMMBR (i.e. the media
sender) due to use of a different reference protocol layer at the
other end or due to the intervention of translators or mixers that
affect the amount of per-packet overhead.
For example, a gateway
in between the two that converts between IPv4 and IPv6 changes the
per-packet overhead by 20 bytes. Other mechanisms that change the
overhead include tunnels. The problem with varying overhead is
also discussed in [RFC3890]. As will be seen in the description
of the algorithm for use of TMMBR, the difference in perceived
overhead between the sending and receiving ends presents no
difficulty because calculations are carried out in terms of
variables (packet rate, net media bit rate) that have the same
value at the sender as at the receiver.

Reporting both maximum total media bit rate and per-packet
overhead allows different receivers to provide bit rate and
overhead values for different protocol layers, for example at the
IP level, at the outer part of a tunnel protocol, or at the link
layer. The protocol level a peer reports on depends on the level
of integration the peer has, as it needs to be able to extract the
information from that protocol level. For example, an application
with no knowledge of the IP version it is running over cannot
meaningfully determine the overhead of the IP header, and hence
will not want to include IP overhead in the overhead or maximum
total media bit rate calculation.

It is expected that most peers will be able to report values at
least for the IP layer. In certain implementations it may be
advantageous to also include information pertaining to the link
layer, which in turn allows for a more precise overhead
calculation and a better optimization of connectivity resources.

The Temporary Maximum Media Stream Bit Rate messages are generic
messages that can be applied to any RTP packet stream. This
separates them from the other codec control messages defined in
this specification, which apply only to specific media types or
payload formats.
The TMMBR functionality applies to the
transport, and the requirements the transport places on the media
encoding.

The reasoning below assumes that the participants have negotiated
a session maximum bit rate, using a signaling protocol. This
value can be global, for example in case of point-to-point,
multicast, or translators. It may also be local between the
participant and the peer or mixer. In either case, the bit rate
negotiated in signaling is the one that the participant guarantees
to be able to handle (depacketize and decode). In practice, the
connectivity of the participant also influences the negotiated
value -- it does not make much sense to negotiate a total media
bit rate that one's network interface does not support.

It is also beneficial to have negotiated a maximum packet rate for
the session or sender. RFC 3890 provides an SDP [RFC4566]
attribute that can be used for this purpose; however, that
attribute is not usable in RTP sessions established using
offer/answer [RFC3264]. Therefore an optional maximum packet rate
signaling parameter is specified in this memo.

An already established maximum total media bit rate may be changed
at any time, subject to the timing rules governing the sending of
feedback messages. The limit may change to any value between zero
and the session maximum, as negotiated during session
establishment signaling. However, even if a sender has received a
TMMBR message allowing an increase in the bit rate, all increases
must be governed by a congestion control mechanism. TMMBR
indicates known limitations only, usually in the local
environment, and does not provide any guarantees about the full
path. Furthermore, any increases in TMMBR-established bit rate
limits are to be executed only after a certain delay from the
sending of the TMMBN message that notifies the world about the
increase in limit.
The delay is specified as at least twice the
longest RTT as known by the media sender, plus the media sender's
calculation of the required wait time for the sending of another
TMMBR message for this session based on AVPF timing rules. This
delay is introduced to allow other session participants to make
known their bit rate limit requirements, which may be lower.

If it is likely that the new value indicated by TMMBR will be
valid for the remainder of the session, the TMMBR sender is
expected to perform a renegotiation of the session upper limit
using the session signaling protocol.

3.5.4.1. Behavior for media receivers using TMMBR

This section is an informal description of behavior described
more precisely in section 4.2.

A media sender begins the session limited by the maximum media bit
rate and maximum packet rate negotiated in session signaling, if
any. Note that this value may be negotiated for another protocol
layer than the one the participant uses in its TMMBR messages.
Each media receiver selects a reference protocol layer, forms an
estimate of the overhead it is observing (or estimates it if no
packets have been seen yet) at that reference level, and determines
the maximum total media bit rate it can accept, taking into
account its own limitations and any transport path limitations of
which it may be aware. In case the current limitations are more
restricting than what was agreed on in the session signaling, the
media receiver reports its initial estimate of these two
quantities to the media sender using a TMMBR message. Overall
message traffic is reduced by the possibility of including tuples
for multiple media senders in the same TMMBR message.

The media sender applies an algorithm such as that specified in
section 3.5.4.2 to select which of the tuples it has received are
most limiting (i.e. the bounding set as defined in section 2.2).
It modifies its operation to stay within the feasible region (as
defined in section 2.2), and also sends out a TMMBN notification
to the media receivers indicating the selected bounding set.

If a media receiver does not own one of the tuples in the bounding
set reported by the TMMBN, it applies the same algorithm as the
media sender to determine if its current estimated (maximum total
media bit rate, overhead) tuple would enter the bounding set if
known to the media sender. If so, it issues a TMMBR request
reporting the tuple value to the sender. Otherwise it takes no
action for the moment. Periodically, its estimated tuple values
may change or it may receive a new TMMBN. If so, it reapplies the
algorithm to decide whether it needs to issue a TMMBR request.

If, alternatively, a media receiver owns one of the tuples in the
reported bounding set, it takes no action until such time as its
estimate of its own tuple values changes. At that time it sends a
TMMBR request to the media sender to report the changed values.

A media receiver may change status between owner and non-owner of
a bounding tuple between one TMMBN message and the next. Thus it
must check the contents of each TMMBN to determine its subsequent
actions.

Implementations may use other algorithms of their choosing, as
long as the bit rate limitations resulting from the exchange of
TMMBR and TMMBN messages are at least as strict (at least as low,
in the bit rate dimension) as the ones resulting from the use of
the aforementioned algorithm.

Obviously, in point-to-point cases, when there is only one media
receiver, this receiver becomes "owner" once it receives the first
TMMBN in response to its own TMMBR, and stays "owner" for the rest
of the session.
Therefore, when it is known that there will
always be only a single media receiver, the above algorithm is not
required. Media receivers that are aware they are the only ones
in a session can send TMMBR messages with bit rate limits both
higher and lower than the previously notified limit, at any time
(subject to the AVPF [RFC4585] RTCP RR send timing rules).
However, it may be difficult for a session participant to
determine if it is the only receiver in the session. Because of
this any implementation of TMMBR is required to include the
algorithm described in the next section or a stricter equivalent.

3.5.4.2. Algorithm for establishing current limitations

This section introduces an example algorithm for the calculation
of a session limit. Other algorithms can be employed, as long as
the result of the calculation is at least as restrictive as the
result that is obtained by this algorithm.

First it is important to consider the implications of using a
tuple for limiting the media sender's behavior. The bit rate and
the overhead value result in a two-dimensional solution space for
the calculation of the bit rate of media streams. Fortunately the
two variables are linked. Specifically, the bit rate available for
RTP payloads is equal to the TMMBR reported bit rate minus the
product of the packet rate used and the TMMBR reported overhead
converted to bits. As a result, when different bit rate/overhead
combinations need to be considered, the packet rate determines the
correct limitation. This is perhaps best explained by an example:

Example:

Receiver A: TMMBR_max total BR = 35 kbps, TMMBR_OH = 40 bytes
Receiver B: TMMBR_max total BR = 40 kbps, TMMBR_OH = 60 bytes

For a given packet rate (PR) the bit rate available for media
payloads in RTP will be:

Max_net media_BR_A = TMMBR_max total BR_A - PR * TMMBR_OH_A * 8
                                                           ... (1)
Max_net media_BR_B = TMMBR_max total BR_B - PR * TMMBR_OH_B * 8
                                                           ... (2)

For a PR = 20 these calculations will yield a Max_net media_BR_A =
28600 bps and Max_net media_BR_B = 30400 bps, which suggests that
receiver A is the limiting one for this packet rate. However at a
certain PR there is a switchover point at which receiver B becomes
the limiting one. The switchover point can be identified by
setting Max_net media_BR_A equal to Max_net media_BR_B and
breaking out PR:

     TMMBR_max total BR_A - TMMBR_max total BR_B
PR = -------------------------------------------  ... (3)
            8*(TMMBR_OH_A - TMMBR_OH_B)

which, for the numbers above, yields 31.25 as the switchover point
between the two limits. That is, for packet rates below 31.25 per
second, receiver A is the limiting receiver, and for higher packet
rates, receiver B is more limiting. The implications of this
behavior have to be considered by implementations that are going
to control media encoding and its packetization. As exemplified
above, multiple TMMBR limits may apply to the trade-off between
net media bit rate and packet rate. Which limitation applies
depends on the packet rate being considered.

This also has implications for how the TMMBR mechanism needs to
work. First, there is the possibility that multiple TMMBR tuples
are providing limitations on the media sender. Secondly there is
a need for any session participant (media sender and receivers) to
be able to determine if a given tuple will become a limitation
upon the media sender, or if the set of already given limitations
is stricter than the given values. In the absence of the ability
to make this determination the suppression of TMMBR requests would
not work.

The basic idea of the algorithm is as follows. Each TMMBR tuple
can be viewed as the equation of a straight line (cf. equations
(1) and (2)) in a space where packet rate lies along the X-axis
and maximum bit rate lies along the Y-axis. The lower envelope of
the set of lines corresponding to the complete set of TMMBR tuples
defines a polygon. Points lying along or below this polygon are
combinations of packet rate and bit rate that meet all of the
TMMBR constraints. The highest feasible packet rate within this
region is the minimum of the rate at which the bounding polygon
meets the X-axis or the session maximum packet rate (SMAXPR)
provided by signaling, if any. Typically a media sender will
prefer to operate at a lower rate than this theoretical maximum,
so as to increase the rate at which actual media content reaches
the receivers. The purpose of the algorithm is to distinguish the
TMMBR tuples constituting the bounding set and thus delineate the
feasible region, so that the media sender can select its preferred
operating point within that region.

Figure 1 below shows a bounding polygon formed by TMMBR tuples A
and B. A third tuple C lies outside the bounding polygon and is
therefore irrelevant in determining feasible tradeoffs between
media rate and packet rate. The line labeled ss..s represents the
limit on packet rate imposed by the session maximum packet rate
(SMAXPR) obtained by signaling during session setup. In Figure 1
the limit determined by tuple B happens to be more restrictive
than SMAXPR. The situation could easily be the reverse, meaning
that the bounding polygon is terminated on the right by the
vertical line representing the SMAXPR constraint.
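
The arithmetic of equations (1) to (3) can be checked with a
short Python sketch. It is an illustration only: the function
names are hypothetical and not part of this memo, each tuple is
assumed to be (maximum total media bit rate in bps, overhead in
bytes), and the sample values are those of receivers A and B in
the example earlier in this section.

```python
# Hypothetical helper names; tuples are (max total bit rate in bps, overhead in bytes).

def max_net_media_bitrate(max_total_br, overhead_bytes, packet_rate):
    """Equations (1)/(2): bit rate left for RTP payloads at a given packet rate."""
    return max_total_br - packet_rate * overhead_bytes * 8


def switchover_packet_rate(br_a, oh_a, br_b, oh_b):
    """Equation (3): packet rate at which the two limits are equal."""
    if oh_a == oh_b:
        # Parallel lines never cross; the lower bit rate always limits.
        raise ValueError("tuples with equal overhead have no switchover point")
    return (br_a - br_b) / (8 * (oh_a - oh_b))


def limiting_net_bitrate(tuples, packet_rate):
    """Lowest net media bit rate allowed by any tuple at this packet rate."""
    return min(max_net_media_bitrate(br, oh, packet_rate) for br, oh in tuples)


receiver_a = (35_000, 40)
receiver_b = (40_000, 60)

net_a = max_net_media_bitrate(*receiver_a, packet_rate=20)   # 28600 bps
net_b = max_net_media_bitrate(*receiver_b, packet_rate=20)   # 30400 bps
switch = switchover_packet_rate(*receiver_a, *receiver_b)    # 31.25 packets/s
```

Evaluating limiting_net_bitrate() at packet rates on either side
of the switchover point shows receiver A limiting below 31.25
packets per second and receiver B limiting above it.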
1151 Net ^ 1152 Media|a c b s 1153 Bit | a c b s 1154 Rate | a c b s 1155 | a cb s 1156 | a c s 1157 | a bc s 1158 | a b c s 1159 | ab c s 1160 | Feasible b c s 1161 | region ba s 1162 | b a s c 1163 | b s c 1164 | b s a 1165 | bs 1166 +------------------------------> 1167 Packet rate 1169 Figure 1 - Geometric Interpretation of TMMBR Tuples 1171 Note that the slopes of the lines making up the bounding polygon 1172 are increasingly negative as one moves in the direction of 1173 increasing packet rate. Note also that with slight rearrangement, 1174 equations (1) and (2) have the canonical form: 1176 y = mx + b 1178 where 1179 m is the slope and has value equal to the negative of the tuple 1180 overhead (in bits), 1181 and 1182 b is the y-intercept and has value equal to the tuple maximum 1183 total media bit rate. 1185 These observations lead to the conclusion that when processing the 1186 TMMBR tuples to select the initial bounding set, one should sort 1187 and process the tuples by order of increasing overhead. Once a 1188 particular tuple has been added to the bounding set, all tuples 1189 not already selected and having lower overhead can be eliminated, 1190 because the next side of the bounding polygon has to be steeper 1191 (i.e. the corresponding TMMBR must have higher overhead) than the 1192 latest added tuple. 1194 Line cc..c in Figure 1 illustrates another principle. This line is 1195 parallel to line aa..a, but has a higher Y-intercept. That is, 1196 the corresponding TMMBR tuple contains a higher maximum total 1197 media bit rate value. Since line cc..c is outside the bounding 1198 polygon, it illustrates the conclusion that if two TMMBR tuples 1199 have the same overhead value, the one with higher maximum total 1200 media bit rate value cannot be part of the bounding set and can be 1201 set aside. 1203 Two further observations complete the algorithm. Obviously, 1204 moving from the left, the successive corners of the bounding 1205 polygon (i.e. 
the intersection points between successive pairs of 1206 sides) lie at successively higher packet rates. On the other 1207 hand, again moving from the left, each successive line making up 1208 the bounding set crosses the X-axis at a lower packet rate. 1210 The complete algorithm can now be specified. The algorithm works 1211 with two lists of TMMBR tuples, the candidate list X and the 1212 selected list Y, both ordered by increasing overhead value. The 1213 algorithm terminates when all members of X have been discarded or 1214 removed for processing. Membership of the selected list Y is 1215 probationary until the algorithm is complete. Each member of the 1216 selected list is associated with an intersection value, which is 1217 the packet rate at which the line corresponding to that TMMBR 1218 tuple intersects with the line corresponding to the previous TMMBR 1219 tuple in the selected list. Each member of the selected list is 1220 also associated with a maximum packet rate value, which is the 1221 lesser of the session maximum packet rate SMAXPR (if any) and the 1222 packet rate at which the line corresponding to that tuple crosses 1223 the X-axis. 1225 When the algorithm terminates, the selected list is equal to the 1226 bounding set as defined in section 2.2. 1228 Initial Algorithm 1230 This algorithm is used by the media sender when it has received 1231 one or more TMMBR requests and before it has determined a bounding 1232 set for the first time. 1234 1. Sort the TMMBR tuples by order of increasing overhead. This is 1235 the initial candidate list X. 1237 2. When multiple tuples in the candidate list have the same 1238 overhead value, discard all but the one with the lowest maximum 1239 total media bit rate value. 1241 3. Select and remove from the candidate list the TMMBR tuple with 1242 the lowest maximum total media bit rate value. If there is more 1243 than one tuple with that value, choose the one with the highest 1244 overhead value. 
This is the first member of the selected list 1245 Y. Set its intersection value equal to zero. Calculate its 1246 maximum packet rate as the minimum of SMAXPR (if available) and 1247 the value obtained from the following formula, which is the 1248 packet rate at which the corresponding line crosses the X-axis. 1250 Max PR = TMMBR max total BR / (8 * TMMBR OH) ... (4) 1252 4. Discard from the candidate list all tuples with a lower overhead 1253 value than the selected tuple. 1255 5. Remove the first remaining tuple from the candidate list for 1256 processing. Call this the current candidate. 1258 6. Calculate the packet rate PR at the intersection of the line 1259 generated by the current candidate with the line generated by 1260 the last tuple in the selected list Y, using equation (3). 1262 7. If the calculated value PR is equal to or lower than the 1263 intersection value stored for the last tuple of the selected 1264 list, discard the last tuple of the selected list and go back to 1265 step 6 (retaining the same current candidate). 1267 Note that the choice of the initial member of the selected list 1268 Y in step 3 guarantees that the selected list will never be 1269 emptied by this process, meaning that the algorithm must 1270 eventually (if not immediately) fall through to step 8. 1272 8. (This step is reached when the calculated PR value of the 1273 current candidate is greater than the intersection value of the 1274 current last member of the selected list Y.) If the calculated 1275 value PR of the current candidate is lower than the maximum 1276 packet rate associated with the last tuple in the selected list, 1277 add the current candidate tuple to the end of the selected list. 1278 Store PR as its intersection value. Calculate its maximum 1279 packet rate as the lesser of SMAXPR (if available) and the 1280 maximum packet rate calculated using equation (4). Otherwise, discard the current candidate. 1282 9. If any tuples remain in the candidate list, go back to step 5.
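The steps of the initial algorithm can be sketched in Python as follows. This is an illustrative sketch only, not part of the specification. The names TmmbrTuple, x_axis_pr, intersection_pr and initial_bounding_set are hypothetical. The intersection formula is an assumption: it is obtained by equating two lines of the form y = MxTBR - 8*OH*x, which is the form implied by equation (4), and is taken to be equivalent to equation (3). A tuple with zero overhead is treated as never crossing the X-axis, which is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TmmbrTuple:
    ssrc: int                        # owner of the tuple
    max_br: float                    # maximum total media bit rate, bits/s
    overhead: int                    # per-packet overhead, bytes
    intersection: float = 0.0        # filled in when the tuple joins list Y
    max_pr: float = float("inf")     # filled in when the tuple joins list Y


def x_axis_pr(t: TmmbrTuple, smaxpr: Optional[float]) -> float:
    """Equation (4), limited by SMAXPR when one was signaled."""
    pr = t.max_br / (8 * t.overhead) if t.overhead > 0 else float("inf")
    return pr if smaxpr is None else min(pr, smaxpr)


def intersection_pr(a: TmmbrTuple, b: TmmbrTuple) -> float:
    """Packet rate where the lines of a and b cross (assumed form of eq. 3)."""
    return (b.max_br - a.max_br) / (8 * (b.overhead - a.overhead))


def initial_bounding_set(tuples: List[TmmbrTuple],
                         smaxpr: Optional[float] = None) -> List[TmmbrTuple]:
    if not tuples:
        return []
    # Steps 1 and 2: one tuple per overhead value (the lowest bit rate),
    # ordered by increasing overhead.
    lowest = {}
    for t in tuples:
        if t.overhead not in lowest or t.max_br < lowest[t.overhead].max_br:
            lowest[t.overhead] = t
    candidates = [lowest[oh] for oh in sorted(lowest)]
    # Step 3: lowest bit rate first; ties go to the highest overhead.
    first = min(candidates, key=lambda t: (t.max_br, -t.overhead))
    first.intersection = 0.0
    first.max_pr = x_axis_pr(first, smaxpr)
    selected = [first]
    # Step 4: drop tuples with lower overhead (this also removes `first`).
    candidates = [t for t in candidates if t.overhead > first.overhead]
    # Steps 5 to 9: process the remaining candidates in order.
    for cand in candidates:
        pr = intersection_pr(selected[-1], cand)                  # step 6
        while len(selected) > 1 and pr <= selected[-1].intersection:
            selected.pop()                                        # step 7
            pr = intersection_pr(selected[-1], cand)
        if pr < selected[-1].max_pr:                              # step 8
            cand.intersection = pr
            cand.max_pr = x_axis_pr(cand, smaxpr)
            selected.append(cand)
    return selected
```

With tuples resembling Figure 1, say A = (100000 bit/s, 40 bytes), B = (200000 bit/s, 100 bytes) and C = (150000 bit/s, 40 bytes), tuple C is removed in step 2 because it is parallel to A with a higher bit rate, and the function returns A followed by B.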
1284 Incremental Algorithm 1285 The previous algorithm covered the initial case, where no selected 1286 list had previously been created. It also applied only to the 1287 media sender. When a previously-created selected list is 1288 available at either the media sender or media receiver, two other 1289 cases can be considered: 1291 o when a TMMBR tuple not currently in the selected list is a 1292 candidate for addition; 1294 o when the values change in a TMMBR tuple currently in the 1295 selected list. 1297 At the media receiver these cases correspond respectively to those 1298 of the non-owner and owner of a tuple in the TMMBN-reported 1299 bounding set. 1301 In either case, the process of updating the selected list to take 1302 account of the new/changed tuple can use the basic algorithm 1303 described above, with the modification that the initial candidate 1304 set consists only of the existing selected list and the new or 1305 changed tuple. Some further optimization is possible (beyond 1306 starting with a reduced candidate set) by taking advantage of the 1307 following observations. 1309 The first observation is that if the new/changed candidate becomes 1310 part of the new selected list, the result may be to cause zero or 1311 more other tuples to be dropped from the list. However, if more 1312 than one other tuple is dropped, the dropped tuples will be 1313 consecutive. This can be confirmed geometrically by visualizing a 1314 new line that cuts off a series of segments from the previously- 1315 existing bounding polygon. The cut-off segments are connected one 1316 to the next, the geometric equivalent of consecutive tuples in a 1317 list ordered by overhead value. Beyond the dropped set in either 1318 direction all of the tuples that were in the earlier selected list 1319 will be in the updated one. 
The second observation is that, 1320 leaving aside the new candidate, the order of tuples remaining in 1321 the updated selected list is unchanged because their overhead 1322 values have not changed. 1324 The consequence of these two observations is that, once the 1325 placement of the new candidate and the extent of the dropped set 1326 of tuples (if any) has been determined, the remaining tuples can 1327 be copied directly from the candidate list into the selected list, 1328 preserving their order. This conclusion suggests the following 1329 modified algorithm: 1331 o Run steps 1-4 of the basic algorithm. 1333 o If the new candidate has survived steps 2 and 4 and has 1334 become the new first member of the selected list, run steps 1335 5-9 on subsequent candidates until another candidate is 1336 added to the selected list. Then move all remaining 1337 candidates to the selected list, preserving their order. 1339 o If the new candidate has survived steps 2 and 4 and has not 1340 become the new first member of the selected list, start by 1341 moving all tuples in the candidate list with lower overhead 1342 values than that of the new candidate to the selected list, 1343 preserving their order. Run steps 5 through 9 for the new 1344 candidate, with the modification that the intersection 1345 values and maximum packet rates for the tuples on the 1346 selected list have to be calculated on the fly because they 1347 were not previously stored. Continue processing only until 1348 a subsequent tuple has been added to the selected list, then 1349 move all remaining candidates to the selected list, 1350 preserving their order. 1352 Note that the new candidate could be added to the selected 1353 list only to be dropped again when the next tuple is 1354 processed. It can easily be seen that in this case the new 1355 candidate does not displace any of the earlier tuples in the 1356 selected list. The limitations of ASCII art make this 1357 difficult to show in a figure. 
Line cc..c in Figure 1 would 1358 be an example if it had a steeper slope (tuple C had a 1359 higher overhead value), but still intersected line aa..a 1360 beyond where line aa..a intersects line bb..b. 1362 The algorithm just described is approximate, because it does not 1363 take account of tuples outside the selected list. To see how such 1364 tuples can become relevant, consider Figure 1 and suppose that the 1365 maximum total media bit rate in tuple A increases to the point 1366 that line aa..a moves outside line cc..c. Tuple A will remain in 1367 the bounding set calculated by the media sender. However, once it 1368 issues a new TMMBN, media receiver C will apply the algorithm and 1369 discover that its tuple C should now enter the bounding set. It 1370 will issue a TMMBR request to the media sender, which will repeat 1371 its calculation and come to the appropriate conclusion. 1373 The rules of section 4.2 require that the media sender refrain 1374 from raising its sending rate until media receivers have had a 1375 chance to respond to the TMMBN. In the example just given, this 1376 delay ensures that the relaxation of tuple A does not actually 1377 result in an attempt to send media at a rate exceeding the 1378 capacity at C. 1380 3.5.4.3. Use of TMMBR in a Mixer Based Multipoint Operation 1382 Assume a small mixer-based multiparty conference is ongoing, as 1383 depicted in Topo-Mixer of [Topologies]. All participants have 1384 negotiated a common maximum bit rate that this session can use. 1385 The conference operates over a number of unicast paths between the 1386 participants and the mixer. The congestion situation on each of 1387 these paths can be monitored by the participant in question and by 1388 the mixer, utilizing, for example, RTCP receiver reports (RR) or 1389 the transport protocol, e.g. DCCP [RFC4340]. However, any given 1390 participant has no knowledge of the congestion situation of the 1391 connections to the other participants. 
Worse, without mechanisms 1392 similar to the ones discussed in this draft, the mixer (which is 1393 aware of the congestion situation on all connections it manages) 1394 has no standardized means to tell media senders to slow down, 1395 short of forging its own receiver reports (which is undesirable). 1396 In principle, a mixer confronted with such a situation is obliged 1397 to thin or transcode streams intended for connections on which 1398 congestion was detected. 1400 In practice, media-aware stream thinning is unfortunately a very 1401 difficult and cumbersome operation and adds undesirable delay. 1402 Media-unaware thinning leads very quickly to unacceptable reproduced 1403 media quality. Hence, a means to slow down senders even in the 1404 absence of congestion on their connections to the mixer is 1405 desirable. 1407 To allow the mixer to throttle traffic on the individual links, 1408 without performing transcoding, there is a need for a mechanism 1409 that enables the mixer to ask a participant's media encoders to 1410 limit the media stream bit rate they are currently generating. 1411 TMMBR provides the required mechanism. When the mixer detects 1412 congestion between itself and a given participant, it executes the 1413 following procedure: 1415 1. It starts thinning the media traffic to the congested 1416 participant to the supported bit rate. 1418 2. It uses TMMBR to request the media sender(s) to reduce the 1419 total media bit rate sent by them to the mixer, to a value that 1420 is in compliance with congestion control principles for the 1421 slowest link. Slow refers here to the available bandwidth / 1422 bit rate / capacity and packet rate after congestion control. 1424 3. As soon as the bit rate has been reduced by the sending party, 1425 the mixer stops stream thinning implicitly, because there is no 1426 need for it once the stream is in compliance with congestion 1427 control.
1429 This use of stream thinning as an immediate reaction tool followed 1430 up by a quick control mechanism appears to be a reasonable 1431 compromise between media quality and the need to combat 1432 congestion. 1434 3.5.4.4. Use of TMMBR in Point-to-Multipoint Using Multicast or 1435 Translators 1437 In these topologies, corresponding to Topo-Multicast or Topo- 1438 Translator, RTCP RRs are transmitted globally. This allows all 1439 participants to detect transmission problems such as congestion, 1440 on a medium timescale. As all media senders are aware of the 1441 congestion situation of all media receivers, the rationale for the 1442 use of TMMBR in the previous section does not apply. However, 1443 even in this case the congestion control response can be improved 1444 when the unicast links are using congestion controlled transport 1445 protocols (such as TCP or DCCP). A peer may also report local 1446 limitations to the media sender. 1448 3.5.4.5. Use of TMMBR in Point-to-point operation 1450 In use case 7 it is possible to use TMMBR to improve the 1451 performance when the known upper limit of the bit rate changes. 1452 In this use case the signaling protocol has established an upper 1453 limit for the session and total media bit rates. However, at the 1454 time of transport link bit rate reduction, a receiver can avoid 1455 serious congestion by sending a TMMBR to the sending side. Thus 1456 TMMBR is useful for putting restrictions on the application and 1457 thus placing the congestion control mechanism in the right 1458 ballpark. However TMMBR is usually unable to provide the 1459 continuously quick feedback loop required for real congestion 1460 control. Nor do its semantics match those of congestion control 1461 given its different purpose. For these reasons TMMBR SHALL NOT be 1462 used as a substitute for congestion control. 1464 3.5.4.6. 
Reliability 1466 The reaction of a media sender to the reception of a TMMBR message 1467 is not immediately identifiable through inspection of the media 1468 stream. Therefore, a more explicit mechanism is needed to avoid 1469 unnecessary re-sending of TMMBR messages. Using a statistically 1470 based retransmission scheme would only provide statistical 1471 guarantees of the request being received. It would also not avoid 1472 the retransmission of already received messages. In addition, it 1473 would not allow for easy suppression of other participants' 1474 requests. For these reasons, a mechanism based on explicit 1475 notification is used. 1477 Upon the reception of a request a media sender sends a TMMBN 1478 notification containing the current bounding set, and indicating 1479 which session participants own that limit. In multicast 1480 scenarios, that allows all other participants to suppress any 1481 request they may have, if their limitations are less strict than 1482 the current ones (i.e. define lines lying outside the feasible 1483 region as defined in section 2.2). Keeping and notifying only the 1484 bounding set of tuples allows for small message sizes and media 1485 sender states. A media sender only keeps state for the SSRCs of 1486 the current owners of the bounding set of tuples; all other 1487 requests and their sources are not saved. Once the bounding set 1488 has been established, new TMMBR messages should be generated only 1489 by owners of the bounding tuples and by other entities that 1490 determine (by applying the algorithm of section 3.5.4.2 or its 1491 equivalent) that their limitations should now be part of the 1492 bounding set. 1494 4. RTCP Receiver Report Extensions 1496 This memo specifies six new feedback messages. 
The Full Intra 1497 Request (FIR), Temporal-Spatial Trade-off Request (TSTR), 1498 Temporal-Spatial Trade-off Notification (TSTN), and Video Back 1499 Channel Message (VBCM) are "Payload Specific Feedback Messages" as 1500 defined in Section 6.3 of AVPF [RFC4585]. The Temporary Maximum 1501 Media Stream Bit Rate Request (TMMBR) and Temporary Maximum Media 1502 Stream Bit Rate Notification (TMMBN) are "Transport Layer Feedback 1503 Messages" as defined in Section 6.2 of AVPF. 1505 The new feedback messages are defined in the following 1506 subsections, following a similar structure to that in sections 6.2 1507 and 6.3 of the AVPF specification [RFC4585]. 1509 4.1. Design Principles of the Extension Mechanism 1511 RTCP was originally introduced as a channel to convey presence, 1512 reception quality statistics and hints on the desired media 1513 coding. A limited set of media control mechanisms was introduced 1514 in early RTP payload formats for video formats, for example in RFC 1515 2032 [RFC2032]. However, this specification, for the first time, 1516 suggests a two-way handshake for some of its messages. There is a 1517 danger that this introduction could be misunderstood as a 1518 precedent for the use of RTCP as an RTP session control protocol. 1519 To prevent such a misunderstanding, this subsection attempts to 1520 clarify the scope of the extensions specified in this memo, and 1521 strongly suggests that future extensions follow the rationale 1522 spelled out here, or compellingly explain why they deviate from the 1523 rationale. 1525 In this memo, and in AVPF [RFC4585], only such messages have been 1526 included as: 1528 a) have comparatively strict real-time constraints, which prevent 1529 the use of mechanisms such as a SIP re-invite in most 1530 application scenarios (the real-time constraints are explained 1531 separately for each message where necessary);
1533 b) are multicast-safe in that the reaction to potentially 1534 contradicting feedback messages is specified, as necessary for 1535 each message; and 1537 c) are directly related to activities of a certain media codec, 1538 class of media codecs (e.g. video codecs), or a given RTP 1539 packet stream. 1541 In this memo, a two-way handshake is introduced only for messages 1542 for which: 1544 a) a notification or acknowledgement is required due to their 1545 nature. An analysis to determine whether this requirement 1546 exists has been performed separately for each message. 1548 b) the notification or acknowledgement cannot be easily derived 1549 from the media bit stream. 1551 All messages in AVPF [RFC4585] and in this memo present their 1552 contents in a simple, fixed binary format. This accommodates 1553 media receivers which have not implemented higher control protocol 1554 functionalities (SDP, XML parsers and such) in their media path. 1556 Messages that do not conform to the design principles just 1557 described are not an appropriate use of RTCP or of the Codec 1558 Control Framework defined in this document. 1560 4.2. Transport Layer Feedback Messages 1562 As specified in section 6.1 of RFC 4585 [RFC4585], Transport Layer 1563 Feedback messages are identified by the RTCP packet type value 1564 RTPFB (205). 1566 In AVPF, one message of this category had been defined. This memo 1567 specifies two more such messages. They are identified by means of 1568 the FMT parameter as follows: 1570 Assigned in AVPF [RFC4585]: 1572 1: Generic NACK 1573 31: reserved for future expansion of the identifier number 1574 space 1576 Assigned in this memo: 1578 2: reserved (see note below) 1579 3: Temporary Maximum Media Stream Bit Rate Request (TMMBR) 1580 4: Temporary Maximum Media Stream Bit Rate Notification (TMMBN) 1582 Note: early drafts of AVPF [RFC4585] reserved FMT=2 for a 1583 code point that has later been removed. 
It has been 1584 pointed out that there may be implementations in the field 1585 using this value in accordance with the expired draft. As 1586 there is sufficient numbering space available, we mark 1587 FMT=2 as reserved so as to avoid possible interoperability 1588 problems with any such early implementations. 1590 Available for assignment: 1592 0: unassigned 1593 5-30: unassigned 1595 The following subsections define the formats of the FCI entries 1596 for the TMMBR and TMMBN messages respectively and specify the 1597 associated behaviour at the media sender and receiver. 1599 4.2.1. Temporary Maximum Media Stream Bit Rate Request (TMMBR) 1601 The FCI field of a Temporary Maximum Media Stream Bit-Rate Request 1602 (TMMBR) message SHALL contain one or more FCI entries. 1604 4.2.1.1. Message Format 1606 The Feedback Control Information (FCI) consists of one or more 1607 TMMBR FCI entries with the following syntax:
1609     0                   1                   2                   3
1610     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1611    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1612    |                              SSRC                             |
1613    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1614    | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1615    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1617 Figure 2 - Syntax of an FCI entry in the TMMBR message
1619 SSRC (32 bits): The SSRC value of the media sender that is 1620 requested to obey the new maximum bit rate. 1622 MxTBR Exp (6 bits): The exponential scaling of the mantissa for 1623 the maximum total media bit rate value. The value is an 1624 unsigned integer [0..63]. 1626 MxTBR Mantissa (17 bits): The mantissa of the maximum total 1627 media bit rate value as an unsigned integer. 1629 Measured Overhead (9 bits): The measured average packet overhead 1630 value in bytes. The measurement SHALL be done according 1631 to the description in section 4.2.1.2. The value is an 1632 unsigned integer [0..511].
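The field layout of Figure 2 can be illustrated with a short Python sketch. The function names pack_fci, unpack_fci and mxtbr_bps are hypothetical and chosen for illustration only. The sketch assumes network byte order, as is usual for RTCP, and relies on the fact that a 9-bit field can hold values from 0 to 511.

```python
import struct


def pack_fci(ssrc: int, exp: int, mantissa: int, overhead: int) -> bytes:
    """Serialize one FCI entry: 32-bit SSRC, then 6 + 17 + 9 bits."""
    if not (0 <= exp < 1 << 6):
        raise ValueError("MxTBR Exp must fit in 6 bits")
    if not (0 <= mantissa < 1 << 17):
        raise ValueError("MxTBR Mantissa must fit in 17 bits")
    if not (0 <= overhead < 1 << 9):
        raise ValueError("Measured Overhead must fit in 9 bits")
    word = (exp << 26) | (mantissa << 9) | overhead
    return struct.pack("!II", ssrc, word)


def unpack_fci(data: bytes):
    """Return (ssrc, exp, mantissa, overhead) from the first 8 bytes."""
    ssrc, word = struct.unpack("!II", data[:8])
    return ssrc, word >> 26, (word >> 9) & 0x1FFFF, word & 0x1FF


def mxtbr_bps(exp: int, mantissa: int) -> int:
    """Maximum total media bit rate in bits per second: mantissa * 2^exp."""
    return mantissa << exp
```

For example, an exponent of 10 with a mantissa of 125 represents 125 * 1024 = 128000 bits per second.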
1634 The maximum total media bit rate (MxTBR) value in bits per second 1635 is calculated from the MxTBR exponent (exp) and mantissa in the 1636 following way: 1638 MxTBR = mantissa * 2^exp 1640 This allows for 17 bits of resolution in the range 0 to 1641 131072*2^63 (approximately 1.2*10^24). 1643 The length of the TMMBR feedback message SHALL be set to 2+2*N 1644 where N is the number of TMMBR FCI entries. 1646 4.2.1.2. Semantics 1648 Behaviour at the Media Receiver (Sender of the TMMBR) 1650 TMMBR is used to indicate a transport related limitation at the 1651 reporting entity acting as a media receiver. TMMBR has the form 1652 of a tuple containing two components. The first value is the 1653 highest bit rate per sender of a media stream, available at a 1654 receiver-chosen protocol layer, which the receiver currently 1655 supports in this RTP session. The second value is the measured 1656 header overhead in bytes as defined in section 2.2 and measured at 1657 the chosen protocol layer in the packets received for the stream. 1658 The measurement of the overhead is a running average that is 1659 updated for each packet received for this particular media source 1660 (SSRC), using the following formula: 1662 avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH, 1664 where avg_OH is the running (exponentially smoothed) average and 1665 pckt_OH is the overhead observed in the latest packet. 1667 If a maximum bit rate has been negotiated through signaling, the 1668 maximum total media bit rate that the receiver reports in a TMMBR 1669 message MUST NOT exceed the negotiated value converted to a common 1670 basis (i.e. with overheads adjusted to bring it to the same 1671 reference protocol layer). 1673 Within the common packet header for feedback messages (as defined 1674 in section 6.1 of [RFC4585]), the "SSRC of the packet sender" 1675 field indicates the source of the request, and the "SSRC of media 1676 source" is not used and SHALL be set to 0. 
Within a particular 1677 TMMBR FCI entry, the "SSRC of media sender" in the FCI field 1678 denotes the media sender the tuple applies to. This is useful in 1679 the multicast or translator topologies where the reporting entity 1680 may address all of the media senders in a single TMMBR message 1681 using multiple FCI entries. 1683 The media receiver SHALL save the contents of the latest TMMBN 1684 message received from each media sender. 1686 The media receiver MAY send a TMMBR FCI entry to a particular 1687 media sender under the following circumstances: 1689 o before any TMMBN message has been received from that media 1690 sender; 1692 o when the media receiver has been identified as the source of 1693 a bounding tuple within the latest TMMBN message received 1694 from that media sender, and the value of the maximum total 1695 media bit rate or the overhead relating to that media sender 1696 has changed; 1698 o when the media receiver has not been identified as the 1699 source of a bounding tuple within the latest TMMBN message 1700 received from that media sender, and, after the media 1701 receiver applies the incremental algorithm from section 1702 3.5.4.2 or a stricter equivalent, the media receiver's tuple 1703 relating to that media sender is determined to belong to the 1704 bounding set. 1706 A TMMBR FCI entry MAY be repeated in subsequent TMMBR messages if 1707 no Temporary Maximum Media Stream Bit-Rate Notification (TMMBN) 1708 FCI has been received from the media sender at the time of 1709 transmission of the next RTCP packet. The bit rate value of a 1710 TMMBR FCI entry MAY be changed from one TMMBR message to the next. 1711 The overhead measurement SHALL be updated to the current value of 1712 avg_OH each time the entry is sent. 1714 If the value set by a TMMBR message is expected to be permanent, 1715 the TMMBR setting party SHOULD renegotiate the session parameters 1716 to reflect that using session setup signaling, e.g. a SIP re- 1717 invite. 
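The receiver-side bookkeeping described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation. The names OverheadAverage and encode_mxtbr are hypothetical. Seeding the average with the first observed packet, and rounding avg_OH to the nearest byte with a clamp to the 9-bit field, are assumptions that the text does not specify. The encoder rounds down so that the encoded bit rate never exceeds the requested (or negotiated) value.

```python
class OverheadAverage:
    """Running average avg_OH kept per media source (SSRC)."""

    def __init__(self):
        self.avg_oh = None

    def on_packet(self, pckt_oh: int) -> float:
        if self.avg_oh is None:
            # Assumption: the first packet seeds the average.
            self.avg_oh = float(pckt_oh)
        else:
            # avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH
            self.avg_oh = 15 / 16 * self.avg_oh + 1 / 16 * pckt_oh
        return self.avg_oh

    def field_value(self) -> int:
        """Value for the Measured Overhead field (assumed rounding rule)."""
        if self.avg_oh is None:
            raise ValueError("no packet observed yet")
        return min(511, int(self.avg_oh + 0.5))


def encode_mxtbr(bits_per_second: int):
    """Return (exp, mantissa) with mantissa * 2^exp <= bits_per_second."""
    if bits_per_second < 0:
        raise ValueError("bit rate must be non-negative")
    exp = 0
    while (bits_per_second >> exp) >= 1 << 17:
        exp += 1
    if exp > 63:
        raise ValueError("bit rate too large for the 6-bit exponent")
    return exp, bits_per_second >> exp
```

For instance, after packets with 40 and then 56 bytes of overhead the average is 41.0, and a limit of 1000000 bits per second encodes as exponent 3 and mantissa 125000.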
1719 Behaviour at the Media Sender (Receiver of the TMMBR) 1721 When it receives a TMMBR message containing an FCI entry relating 1722 to it, the media sender SHALL use an initial or incremental 1723 algorithm as applicable to determine the bounding set of tuples 1724 based on the new information. The algorithm used SHALL be at 1725 least as strict as the corresponding algorithm defined in section 1726 3.5.4.2. 1727 The media sender MAY accumulate TMMBR requests over a 1728 small interval (relative to the RTCP sending interval) before 1729 making this calculation. 1731 Once it has determined the bounding set of tuples, the media 1732 sender MAY use any combination of packet rate and net media bit 1733 rate within the feasible region that these tuples describe to 1734 produce a lower total media stream bit rate, as it may need to 1735 address a congestion situation or other limiting factors. See 1736 section 5 (congestion control) for more discussion. 1738 If the media sender concludes that it can increase the maximum 1739 total media bit rate value, it SHALL wait before actually doing 1740 so, for a period long enough to allow a media receiver to respond 1741 to the TMMBN if it determines that its tuple belongs in the 1742 bounding set. This delay period is estimated by the formula: 1744 2 * RTT + T_Dither_Max, 1746 where RTT is the longest round trip time known to the media sender 1747 and T_Dither_Max is defined in section 3.4 of [RFC4585]. 1749 A TMMBN message SHALL be sent by the media sender at the earliest 1750 possible point in time, in response to any TMMBR messages received 1751 since the last sending of TMMBN. The TMMBN message indicates the 1752 calculated set of bounding tuples and the owners of those tuples 1753 at the time of the transmission of the message. 1755 An SSRC may time out according to the default rules for RTP 1756 session participants, i.e. 
the media sender has not received any 1757 RTP or RTCP packets from the owner for the last five regular 1758 reporting intervals. An SSRC may also explicitly leave the 1759 session, with the participant indicating this through the 1760 transmission of an RTCP BYE packet or using an external signaling 1761 channel. If the media sender determines that the owner of a tuple 1762 in the bounding set has left the session, the media sender SHALL 1763 transmit a new TMMBN containing the previously-determined set of 1764 bounding tuples but with the tuple belonging to the departed owner 1765 removed. 1767 A media sender MAY proactively initiate the equivalent to a TMMBR 1768 message to itself, when it is aware that its transmission path is 1769 more restrictive than the current limitations. As a result, a 1770 TMMBN indicating the media source itself as the owner of a tuple 1771 is sent, thereby avoiding unnecessary TMMBR messages from 1772 other participants. However, like any other participant, when the 1773 media sender becomes aware of changed limitations, it is required 1774 to change the tuple, and to send a corresponding TMMBN. 1776 Discussion 1778 Due to the unreliable nature of transport of TMMBR and TMMBN, the 1779 above rules may lead to the sending of TMMBR messages which appear 1780 to disobey those rules. Furthermore, in multicast scenarios it 1781 can happen that more than one "non-owning" session participant 1782 determines, rightly or wrongly, that its tuple belongs in the 1783 bounding set. This is not critical for a number of reasons: 1785 a) If a TMMBR message is lost in transmission, either the media 1786 sender sends a new TMMBN message in response to some other 1787 media receiver or it does not send a new TMMBN message at all. 1788 In the first case, the media receiver applies the incremental 1789 algorithm and, if it determines that its tuple should be part 1790 of the bounding set, sends out another TMMBR. 
In the second 1791 case, it repeats the sending of a TMMBR unconditionally. 1792 Either way, the media sender eventually gets the information it 1793 needs. 1795 b) Similarly, if a TMMBN message gets lost, the media receiver 1796 that has sent the corresponding TMMBR request does not receive 1797 the notification and is expected to re-send the request and 1798 trigger the transmission of another TMMBN. 1800 c) If multiple competing TMMBR messages are sent by different 1801 session participants, then the algorithm can be applied taking 1802 all of these messages into account, and the resulting TMMBN 1803 provides the participants with an updated view of how their 1804 tuples compare with the bounding set. 1806 d) If more than one session participant happens to send TMMBR 1807 messages at the same time and with the same tuple component 1808 values, it does not matter which of the tuples, if either, is taken into 1809 the bounding set. The losing session participant will 1810 determine after applying the algorithm that its tuple does not 1811 enter the bounding set, and will therefore stop sending its 1812 TMMBR request. 1814 It is important to consider the security risks involved with faked 1815 TMMBRs. See the security considerations in Section 6. 1817 As indicated already, the feedback messages may be used in both 1818 multicast and unicast sessions in any of the specified topologies. 1819 However, for sessions with a large number of participants, using 1820 the lowest common denominator, as required by this mechanism, may 1821 not be the most suitable course of action. Large sessions may 1822 need to consider other ways to adapt the bit rate to participants' 1823 capabilities, such as partitioning the session into different 1824 quality tiers, or using some other method of achieving bit rate 1825 scalability. 1827 4.2.1.3. Timing Rules 1829 The first transmission of the TMMBR request message MAY use early 1830 or immediate feedback in cases when timeliness is desirable. 
Any 1831 repetition of a request message SHOULD use regular RTCP mode for 1832 its transmission timing. 1834 4.2.1.4. Handling in Translators and Mixers 1836 Media translators and mixers will need to receive and respond to 1837 TMMBR messages as they are part of the chain that provides a 1838 certain media stream to the receiver. The mixer or translator may 1839 act locally on the TMMBR request and thus generate a TMMBN to 1840 indicate that it has done so. Alternatively, in the case of a 1841 media translator it can forward the request, or in the case of a 1842 mixer generate one of its own and pass it forward. In the latter 1843 case, the mixer will need to send a TMMBN back to the original 1844 requestor to indicate that it is handling the request. 1846 4.2.2. Temporary Maximum Media Stream Bit Rate Notification (TMMBN) 1848 The FCI field of the TMMBN Feedback message may contain zero, one 1849 or more TMMBN FCI entries. 1851 4.2.2.1. Message Format 1853 The Feedback Control Information (FCI) consists of zero, one or 1854 more TMMBN FCI entries with the following syntax:
1856     0                   1                   2                   3
1857     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1858    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1859    |                              SSRC                             |
1860    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1861    | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1862    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1864 Figure 3 - Syntax of an FCI entry in the TMMBN message
1865 SSRC (32 bits): The SSRC value of the "owner" of this tuple. 1867 MxTBR Exp (6 bits): The exponential scaling of the mantissa for 1868 the maximum total media bit rate value. The value is an 1869 unsigned integer [0..63]. 1871 MxTBR Mantissa (17 bits): The mantissa of the maximum total 1872 media bit rate value as an unsigned integer. 1874 Measured Overhead (9 bits): The measured average packet overhead 1875 value in bytes represented as an unsigned integer.
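As an illustration, a complete TMMBN feedback packet could be assembled as in the Python sketch below. The function name build_tmmbn and the representation of entries as (ssrc, exp, mantissa, overhead) tuples are hypothetical. The sketch assumes the common feedback packet header of section 6.1 of [RFC4585] (V=2, P=0, FMT, PT, length, SSRC of packet sender, SSRC of media source), uses PT=RTPFB (205) and FMT=4 from section 4.2, sets the "SSRC of media source" to 0, and uses a length of 2+2*N as stated in this section.

```python
import struct

RTPFB = 205        # RTCP packet type for transport layer feedback
FMT_TMMBN = 4      # FMT value assigned to TMMBN in this memo


def build_tmmbn(sender_ssrc: int, entries) -> bytes:
    """Build a TMMBN packet; `entries` holds the bounding tuples."""
    n = len(entries)
    first_octet = (2 << 6) | FMT_TMMBN          # V=2, P=0, FMT=4
    header = struct.pack("!BBHII",
                         first_octet,
                         RTPFB,
                         2 + 2 * n,             # length in 32-bit words minus one
                         sender_ssrc,           # SSRC of packet sender
                         0)                     # SSRC of media source: unused
    fci = b""
    for ssrc, exp, mantissa, overhead in entries:
        if not (0 <= exp < 64 and 0 <= mantissa < 1 << 17
                and 0 <= overhead < 1 << 9):
            raise ValueError("FCI field out of range")
        fci += struct.pack("!II", ssrc,
                           (exp << 26) | (mantissa << 9) | overhead)
    return header + fci
```

An empty entry list yields the 12-byte TMMBN without any FCI, and each bounding tuple adds 8 bytes.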
1877 Thus the FCI within the TMMBN message contains entries indicating 1878 the bounding tuples. For each tuple, the entry gives the owner by 1879 the SSRC, followed by the applicable maximum total media bit rate 1880 and overhead value. 1882 The length of the TMMBN message SHALL be set to 2+2*N where N is 1883 the number of TMMBN FCI entries. 1885 4.2.2.2. Semantics 1887 This feedback message is used to notify the senders of any TMMBR 1888 message that one or more TMMBR messages have been received or that 1889 an owner has left the session. It indicates to all participants 1890 the current set of bounding tuples and the "owners" of those 1891 tuples. 1893 Within the common packet header for feedback messages (as defined 1894 in section 6.1 of [RFC4585]), the "SSRC of the packet sender" 1895 field indicates the source of the notification. The "SSRC of 1896 media source" is not used and SHALL be set to 0. 1898 A TMMBN message SHALL be scheduled for transmission after the 1899 reception of a TMMBR message with an FCI entry identifying this 1900 media sender. Only a single TMMBN SHALL be sent, even if more 1901 than one TMMBR message is received between the scheduling of the 1902 transmission and the actual transmission of the TMMBN message. 1903 The TMMBN message indicates the bounding tuples and their owners 1904 at the time of transmitting the message. The bounding tuples 1905 included SHALL be the set arrived at through application of the 1906 applicable algorithm of section 3.5.4.2 or an equivalent, applied 1907 to the previous bounding set if any and tuples received in TMMBR 1908 messages since the last TMMBN was transmitted. 1910 The reception of a TMMBR message SHALL still result in the 1911 transmission of a TMMBN message even if, after application of the 1912 algorithm, the newly reported TMMBR tuple is not accepted into the 1913 bounding set. 
In such a case the bounding tuples and their owners 1914 are not changed, unless the TMMBR was from an owner of a tuple 1915 within the previously calculated bounding set. This procedure 1916 allows session participants that did not see the last TMMBN 1917 message to get a correct view of this media sender's state. 1919 As indicated in section 4.2.1.2, when a media sender determines 1920 that an "owner" of a bounding tuple has left the session, then 1921 that tuple is removed from the bounding set, and the media sender 1922 SHALL send a TMMBN message indicating the remaining bounding 1923 tuples. If there are no remaining bounding tuples a TMMBN without 1924 any FCI SHALL be sent to indicate this. 1926 Note: if any media receivers remain in the session, this last 1927 will be a temporary situation. The empty TMMBN will cause every 1928 remaining media receiver to determine that its limitation 1929 belongs in the bounding set and send a TMMBR in consequence. 1931 In unicast scenarios (i.e. where a single sender talks to a single 1932 receiver), the aforementioned algorithm to determine ownership 1933 degenerates to the media receiver becoming the "owner" of the one 1934 bounding tuple as soon as the media receiver has issued the first 1935 TMMBR message. 1937 4.2.2.3. Timing Rules 1939 The TMMBN acknowledgement SHOULD be sent as soon as allowed by the 1940 applied timing rules for the session. Immediate or early feedback 1941 mode SHOULD be used for these messages. 1943 4.2.2.4. Handling by Translators and Mixers 1945 As discussed in Section 4.2.1.4 mixers or translators may need to 1946 issue TMMBN messages as responses to TMMBR messages for SSRC's 1947 handled by them. 1949 4.3. Payload Specific Feedback Messages 1951 As specified by section 6.1 of RFC 4585 [RFC4585], Payload- 1952 Specific FB messages are identified by the RTCP packet type value 1953 PT=PSFB (206). 
1955 AVPF [RFC4585] defines three payload-specific feedback messages 1956 and one application layer feedback message. This memo specifies 1957 four additional payload-specific feedback messages. All are 1958 identified by means of the FMT parameter as follows: 1960 Assigned in [RFC4585]: 1962 1: Picture Loss Indication (PLI) 1963 2: Slice Loss Indication (SLI) 1964 3: Reference Picture Selection Indication (RPSI) 1965 15: Application layer FB message 1966 31: reserved for future expansion of the number space 1968 Assigned in this memo: 1970 4: Full Intra Request Command (FIR) 1971 5: Temporal-Spatial Trade-off Request (TSTR) 1972 6: Temporal-Spatial Trade-off Notification (TSTN) 1973 7: Video Back Channel Message (VBCM) 1975 Unassigned: 1977 0: unassigned 1978 8-14: unassigned 1979 16-30: unassigned 1981 The following subsections define the new FCI formats for the 1982 payload-specific feedback messages. 1984 4.3.1. Full Intra Request (FIR) 1986 The FIR message is identified by RTCP packet type value PT=PSFB 1987 and FMT=4. 1989 The FCI field MUST contain one or more FIR entries. Each entry 1990 applies to a different media sender, identified by its SSRC. 1992 4.3.1.1. Message Format 1994 The Feedback Control Information (FCI) for the Full Intra Request 1995 consists of one or more FCI entries, the content of which is 1996 depicted in Figure 4. The length of the FIR feedback message MUST 1997 be set to 2+2*N, where N is the number of FCI entries. 1999 0 1 2 3 2000 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2001 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2002 | SSRC | 2003 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2004 | Seq. nr | Reserved | 2005 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2007 Figure 4 - Syntax of an FCI entry in the FIR message 2009 SSRC (32 bits): The SSRC value of the media sender which is 2010 requested to send a decoder refresh point. 2012 Seq.
nr (8 bits): Command sequence number. The sequence number 2013 space is unique for each pairing of the SSRC of command 2014 source and the SSRC of the command target. The sequence 2015 number SHALL be increased by 1 modulo 256 for each new 2016 command. A repetition SHALL NOT increase the sequence 2017 number. The initial value is arbitrary. 2019 Reserved (24 bits): All bits SHALL be set to 0 by the sender and 2020 SHALL be ignored on reception. 2022 The semantics of this feedback message is independent of the RTP 2023 payload type. 2025 4.3.1.2. Semantics 2027 Upon reception of FIR, the encoder MUST send a decoder refresh 2028 point (see section 2.2) as soon as possible. 2030 Note: Currently, video appears to be the only useful application 2031 for FIR, as it appears to be the only RTP payload widely 2032 deployed that relies heavily on media prediction across RTP 2033 packet boundaries. However, use of FIR could also reasonably be 2034 envisioned for other media types that share essential properties 2035 with compressed video, namely cross-frame prediction (whatever a 2036 frame may be for that media type). One possible example may be 2037 the dynamic updates of MPEG-4 scene descriptions. It is 2038 suggested that payload formats for such media types refer to FIR 2039 and other message types defined in this specification and in 2040 AVPF [RFC4585], instead of creating similar mechanisms in the 2041 payload specifications. The payload specifications may have to 2042 explain how the payload-specific terminologies map to the video- 2043 centric terminology used herein. 2045 Note: In environments where the sender has no control over the 2046 codec (e.g. when streaming pre-recorded and pre-coded content), 2047 the reaction to this command cannot be specified. One suitable 2048 reaction of a sender would be to skip forward in the video bit 2049 stream to the next decoder refresh point. 
In other scenarios, 2050 it may be preferable not to react to the command at all, e.g. 2051 when streaming to a large multicast group. Other reactions may 2052 also be possible. When deciding on a strategy, a sender could 2053 take into account factors such as the size of the receiving 2054 group, the "importance" of the sender of the FIR message 2055 (however "importance" may be defined in this specific 2056 application), the frequency of decoder refresh points in the 2057 content, and so on. However a session which predominately 2058 handles pre-coded content is not expected to use FIR at all. 2060 The sender MUST consider congestion control as outlined in section 2061 5, which MAY restrict its ability to send a decoder refresh point 2062 quickly. 2064 Note: The relationship between the Picture Loss Indication and 2065 FIR is as follows. As discussed in section 6.3.1 of AVPF 2066 [RFC4585], a Picture Loss Indication informs the decoder about 2067 the loss of a picture and hence the likelihood of misalignment 2068 of the reference pictures between the encoder and decoder. Such 2069 a scenario is normally related to losses in an ongoing 2070 connection. In point-to-point scenarios, and without the 2071 presence of advanced error resilience tools, one possible option 2072 for an encoder consists in sending a decoder refresh point. 2073 However, there are other options. One example is that the media 2074 sender ignores the PLI, because the embedded stream redundancy 2075 is likely to clean up the reproduced picture within a reasonable 2076 amount of time. The FIR, in contrast, leaves a (real-time) 2077 encoder no choice but to send a decoder refresh point. It does 2078 not allow the encoder to take into account any considerations 2079 such as the ones mentioned above. 
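As a non-normative illustration of the FIR message format in section 4.3.1.1, the following sketch assembles a complete FIR feedback message. The function names are hypothetical. The layout of the common feedback header (version 2, no padding, FMT, PT, length, and the two SSRC fields) is assumed to be as defined in section 6.1 of [RFC4585].

```python
import struct

PT_PSFB = 206  # payload-specific feedback packet type
FMT_FIR = 4    # FMT value assigned to FIR in this memo


def next_seq_nr(seq_nr):
    """Sequence number for a new command; repetitions reuse the old value."""
    return (seq_nr + 1) % 256


def build_fir(sender_ssrc, targets):
    """Build a FIR message; targets is a list of (target_ssrc, seq_nr)."""
    if not targets:
        raise ValueError("FIR needs at least one FCI entry")
    first_octet = (2 << 6) | FMT_FIR   # V=2, P=0, FMT=4
    length = 2 + 2 * len(targets)      # length field value, 2+2*N
    # "SSRC of media source" is not used by FIR and is set to 0.
    msg = struct.pack("!BBHII", first_octet, PT_PSFB, length, sender_ssrc, 0)
    for target_ssrc, seq_nr in targets:
        if not 0 <= seq_nr < 256:
            raise ValueError("sequence number out of range")
        # Seq. nr occupies the top 8 bits; the 24 reserved bits are 0.
        msg += struct.pack("!II", target_ssrc, seq_nr << 24)
    return msg
```

A sender repeating a FIR would call build_fir() again with the same sequence number, and would call next_seq_nr() only when issuing a new command to the same target.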
2081 Note: Mandating a maximum delay for completing the sending of a 2082 decoder refresh point would be desirable from an application 2083 viewpoint, but is problematic from a congestion control point of 2084 view. "As soon as possible" as mentioned above appears to be a 2085 reasonable compromise. 2087 FIR SHALL NOT be sent as a reaction to picture losses -- it is 2088 RECOMMENDED to use PLI instead. FIR SHOULD be used only in 2089 situations where not sending a decoder refresh point would render 2090 the video unusable for the users. 2092 Note: A typical example where sending FIR is appropriate is 2093 when, in a multipoint conference, a new user joins the session 2094 and no regular decoder refresh point interval is established. 2095 Another example would be a video switching MCU that changes 2096 streams. Here, normally, the MCU issues a FIR to the new sender 2097 so as to force it to emit a decoder refresh point. The decoder 2098 refresh point normally includes a Freeze Picture Release 2099 (defined outside this specification), which re-starts the 2100 rendering process of the receivers. Both techniques mentioned 2101 are commonly used in MCU-based multipoint conferences. 2103 Other RTP payload specifications such as RFC 2032 [RFC2032] 2104 already define a feedback mechanism for certain codecs. An 2105 application supporting both schemes MUST use the feedback 2106 mechanism defined in this specification when sending feedback. 2107 For backward compatibility reasons, such an application SHOULD 2108 also be capable of receiving and reacting to the feedback scheme 2109 defined in the respective RTP payload format, if this is required 2110 by that payload format. 2112 Within the common packet header for feedback messages (as defined 2113 in section 6.1 of [RFC4585]), the "SSRC of the packet sender" 2114 field indicates the source of the request, and the "SSRC of media 2115 source" is not used and SHALL be set to 0.
The SSRCs of the media 2116 senders to which the FIR command applies are in the corresponding 2117 FCI entries. A FIR message MAY contain requests to multiple 2118 media senders, using one FCI entry per target media sender. 2120 4.3.1.3. Timing Rules 2122 The timing follows the rules outlined in section 3 of [RFC4585]. 2123 FIR commands MAY be used with early or immediate feedback. The 2124 FIR feedback message MAY be repeated. If using immediate feedback 2125 mode the repetition SHOULD wait at least one RTT before being 2126 sent. In early or regular RTCP mode the repetition is sent in the 2127 next regular RTCP packet. 2129 4.3.1.4. Handling of FIR Message in Mixer and Translators 2131 A media translator or a mixer performing media encoding of the 2132 content for which the session participant has issued a FIR is 2133 responsible for acting upon it. A mixer acting upon a FIR SHOULD 2134 NOT forward the message unaltered; instead it SHOULD issue a FIR 2135 itself. 2137 4.3.1.5. Remarks 2139 In conjunction with video codecs, FIR messages typically trigger 2140 the sending of full intra or IDR pictures. Both are several times 2141 larger than predicted (inter) pictures. Their size is independent 2142 of the time they are generated. In most environments, especially 2143 when employing bandwidth-limited links, the use of an intra 2144 picture implies an allowed delay that is a significant multiple of 2145 the typical frame duration. An example: if the sending frame rate 2146 is 10 fps, and an intra picture is assumed to be 10 times as big 2147 as an inter picture, then a full second of latency has to be 2148 accepted. In such an environment there is no need for a 2149 particularly short delay in sending the FIR message. Hence 2150 waiting for the next possible time slot allowed by RTCP timing 2151 rules as per [RFC4585] should not have an overly negative impact 2152 on the system performance. 2154 4.3.2.
Temporal-Spatial Trade-off Request (TSTR) 2156 The TSTR feedback message is identified by RTCP packet type value 2157 PT=PSFB and FMT=5. 2159 The FCI field MUST contain one or more TSTR FCI entries. 2161 4.3.2.1. Message Format 2163 The content of the FCI entry for the Temporal-Spatial Trade-off 2164 Request is depicted in Figure 5. The length of the feedback 2165 message MUST be set to 2+2*N, where N is the number of FCI entries 2166 included. 2168 0 1 2 3 2169 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2170 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2171 | SSRC | 2172 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2173 | Seq nr. | Reserved | Index | 2174 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2176 Figure 5 - Syntax of an FCI Entry in the TSTR Message 2177 SSRC (32 bits): The SSRC of the media sender which is requested 2178 to apply the tradeoff value given in Index. 2180 Seq. nr (8 bits): Request sequence number. The sequence number 2181 space is unique for each pairing of the SSRC of request 2182 source and the SSRC of the request target. The sequence 2183 number SHALL be increased by 1 modulo 256 for each new 2184 command. A repetition SHALL NOT increase the sequence 2185 number. The initial value is arbitrary. 2187 Reserved (19 bits): All bits SHALL be set to 0 by the sender and 2188 SHALL be ignored on reception. 2190 Index (5 bits): An integer value between 0 and 31 that indicates 2191 the relative trade off that is requested. An index 2192 value of 0 indicates the highest possible spatial quality, while 2193 31 indicates the highest possible temporal resolution. 2195 4.3.2.2. Semantics 2197 A decoder can suggest a temporal-spatial trade-off level by 2198 sending a TSTR message to an encoder. If the encoder is capable 2199 of adjusting its temporal-spatial trade-off, it SHOULD take into 2200 account the received TSTR message for future coding of pictures.
2201 A value of 0 suggests a high spatial quality and a value of 31 2202 suggests a high frame rate. The progression of values from 0 to 2203 31 indicates monotonically a desire for higher frame rate. The 2204 index values do not correspond to precise values of spatial 2205 quality or frame rate. 2207 The reaction to the reception of more than one TSTR message by a 2208 media sender from different media receivers is left open to the 2209 implementation. The selected trade-off SHALL be communicated to 2210 the media receivers by means of the TSTN message. 2212 Within the common packet header for feedback messages (as defined 2213 in section 6.1 of [RFC4585]), the "SSRC of the packet sender" 2214 field indicates the source of the request, and the "SSRC of media 2215 source" is not used and SHALL be set to 0. The SSRCs of the media 2216 senders to which the TSTR applies are in the corresponding FCI 2217 entries. 2219 A TSTR message MAY contain requests to multiple media senders, 2220 using one FCI entry per target media sender. 2222 4.3.2.3. Timing Rules 2224 The timing follows the rules outlined in section 3 of [RFC4585]. 2225 This request message is not time critical and SHOULD be sent using 2226 regular RTCP timing. Only if it is known that the user interface 2227 requires quick feedback, the message MAY be sent with early or 2228 immediate feedback timing. 2230 4.3.2.4. Handling of message in Mixers and Translators 2232 A mixer or media translator that encodes content sent to the 2233 session participant issuing the TSTR SHALL consider the request to 2234 determine if it can fulfill it by changing its own encoding 2235 parameters. A media translator unable to fulfill the request MAY 2236 forward the request unaltered towards the media sender. A mixer 2237 encoding for multiple session participants will need to consider 2238 the joint needs of these participants before generating a TSTR on 2239 its own behalf towards the media sender.
See also the discussion 2240 in Section 3.5.2. 2243 4.3.2.5. Remarks 2245 The term "spatial quality" does not necessarily refer to the 2246 resolution, measured by the number of pixels the reconstructed 2247 video is using. In fact, in most scenarios the video resolution 2248 stays constant during the lifetime of a session. However, all 2249 video compression standards have means to adjust the spatial 2250 quality at a given resolution, often influenced by the Quantizer 2251 Parameter or QP. A numerically low QP results in a good 2252 reconstructed picture quality, whereas a numerically high QP 2253 yields a coarse picture. The typical reaction of an encoder to 2254 this request is to change its rate control parameters to use a 2255 lower frame rate and a numerically lower (on average) QP, or vice 2256 versa. The precise mapping of Index value to frame rate and QP is 2257 intentionally left open here, as it depends on factors such as the 2258 compression standard employed, spatial resolution, content, bit 2259 rate, and so on. 2261 4.3.3. Temporal-Spatial Trade-off Notification (TSTN) 2263 The TSTN message is identified by RTCP packet type value PT=PSFB 2264 and FMT=6. 2266 The FCI field SHALL contain one or more TSTN FCI entries. 2268 4.3.3.1. Message Format 2270 The content of an FCI entry for the Temporal-Spatial Trade-off 2271 Notification is depicted in Figure 6. The length of the TSTN 2272 message MUST be set to 2+2*N, where N is the number of FCI 2273 entries. 2275 0 1 2 3 2276 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2277 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2278 | SSRC | 2279 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2280 | Seq nr. | Reserved | Index | 2281 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2283 Figure 6 - Syntax of the TSTN 2285 SSRC (32 bits): The SSRC of the source of the TSTR request which 2286 resulted in this Notification. 2288 Seq.
nr (8 bits): The sequence number value from the TSTR 2289 request that is being acknowledged. 2291 Reserved (19 bits): All bits SHALL be set to 0 by the sender and 2292 SHALL be ignored on reception. 2294 Index (5 bits): The trade-off value the media sender is using 2295 henceforth. 2297 Informative note: The returned trade-off value (Index) may 2298 differ from the requested one, for example in cases where a 2299 media encoder cannot tune its trade-off, or when pre-recorded 2300 content is used. 2302 4.3.3.2. Semantics 2304 This feedback message is used to acknowledge the reception of a 2305 TSTR. One TSTN entry in a TSTN feedback message SHALL be sent for 2306 each TSTR entry targeted to this session participant, i.e. each 2307 TSTR entry received whose SSRC field contains the 2308 receiving entity's SSRC. A single TSTN message MAY acknowledge 2309 multiple requests using multiple FCI entries. The index value 2310 included SHALL be the same in all FCI entries of the TSTN message. 2311 Including an FCI entry for each requestor allows each requesting entity 2312 to determine that the media sender received the request. The 2313 Notification SHALL also be sent in response to TSTR repetitions 2314 received. If the request receiver has received TSTR with several 2315 different sequence numbers from a single requestor it SHALL only 2316 respond to the request with the highest (modulo 256) sequence 2317 number. 2319 The TSTN SHALL include the Temporal-Spatial Trade-off index that 2320 will be used as a result of the request. This is not necessarily 2321 the same index as requested, as the media sender may need to 2322 aggregate requests from several requesting session participants. 2323 It may also have some other policies or rules that limit the 2324 selection.
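The TSTR and TSTN FCI entries share one layout, and the rule of responding only to the highest (modulo 256) sequence number can be sketched as follows. This is a non-normative illustration with hypothetical function names. The comparison assumes serial-number style arithmetic, in which a value is treated as newer when it is fewer than 128 steps ahead modulo 256; the text above does not define the comparison in more detail.

```python
import struct
from functools import reduce


def pack_tst_entry(ssrc, seq_nr, index):
    """Pack one 8-octet TSTR or TSTN FCI entry (hypothetical helper)."""
    if not (0 <= ssrc < 2**32 and 0 <= seq_nr < 256 and 0 <= index < 32):
        raise ValueError("field out of range")
    # Seq. nr in the top 8 bits, 19 reserved zero bits, 5-bit Index.
    return struct.pack("!II", ssrc, (seq_nr << 24) | index)


def unpack_tst_entry(data):
    """Return (ssrc, seq_nr, index); the reserved bits are ignored."""
    ssrc, word = struct.unpack("!II", data[:8])
    return ssrc, word >> 24, word & 0x1F


def is_newer(a, b):
    # Assumption: a is newer than b if it is 1..127 steps ahead modulo 256.
    return 0 < (a - b) % 256 < 128


def latest_seq_nr(seq_nrs):
    """Pick the sequence number to acknowledge for one requestor."""
    return reduce(lambda best, s: s if is_newer(s, best) else best, seq_nrs)
```

For instance, a media sender that has received requests numbered 254, 255 and 1 from one requestor would acknowledge sequence number 1 under this assumption, since the counter wrapped around.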
2326 Within the common packet header for feedback messages (as defined 2327 in section 6.1 of [RFC4585]), the "SSRC of the packet sender" 2328 field indicates the source of the Notification, and the "SSRC of 2329 media source" is not used and SHALL be set to 0. The SSRCs of the 2330 requesting entities to which the Notification applies are in the 2331 corresponding FCI entries. 2333 4.3.3.3. Timing Rules 2335 The timing follows the rules outlined in section 3 of [RFC4585]. 2336 This acknowledgement message is not extremely time critical and 2337 SHOULD be sent using regular RTCP timing. 2339 4.3.3.4. Handling of TSTN in Mixer and Translators 2341 A mixer or translator that acts upon a TSTR SHALL also send the 2342 corresponding TSTN. In cases where it needs to forward a TSTR 2343 itself the notification message MAY need to be delayed until the 2344 TSTR has been responded to. 2346 4.3.3.5. Remarks 2348 None 2350 4.3.4. H.271 Video Back Channel Message (VBCM) 2352 The VBCM is identified by RTCP packet type value PT=PSFB and 2353 FMT=7. 2355 The FCI field MUST contain one or more VBCM FCI entries. 2357 4.3.4.1. Message Format 2359 The syntax of an FCI entry within the VBCM indication is depicted 2360 in Figure 7. 2362 0 1 2 3 2363 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2364 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2365 | SSRC | 2366 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2367 | Seq. nr |0| Payload Type| Length | 2368 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2369 | VBCM Octet String.... | Padding | 2370 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2372 Figure 7 - Syntax of an FCI Entry in the VBCM Message 2374 SSRC (32 bits): The SSRC value of the media sender that is 2375 requested to instruct its encoder to react to the VBCM 2376 message 2378 Seq. nr (8 bits): Command sequence number. 
The sequence number 2379 space is unique for each pairing of the SSRC of command source 2380 and the SSRC of the command target. The sequence number 2381 SHALL be increased by 1 modulo 256 for each new command. A 2382 repetition SHALL NOT increase the sequence number. The 2383 initial value is arbitrary. 2385 0: Must be set to 0 by the sender and should not be acted upon by 2386 the message receiver. 2388 Payload Type (7 bits): The RTP payload type for which the VBCM bit 2389 stream must be interpreted. 2391 Length (16 bits): The length of the VBCM octet string in octets 2392 exclusive of any padding octets. 2394 VBCM Octet String (Variable length): This is the octet string 2395 generated by the decoder carrying a specific feedback sub- 2396 message. 2398 Padding (Variable length): Bits set to 0 to make up a 32 bit 2399 boundary. 2401 4.3.4.2. Semantics 2403 The "payload" of the VBCM indication carries different types of 2404 codec-specific feedback information. The type of feedback 2405 information can be classified as a 'status report' (such as an 2406 indication that a bit stream was received without errors, or that 2407 a partial or complete picture or block was lost) or 'update 2408 requests' (such as complete refresh of the bit stream). 2410 Note: There are possible overlaps between the VBCM sub- 2411 messages and CCM/AVPF feedback messages, such as FIR. Please 2412 see section 3.5.3 for further discussion. 2414 The different types of feedback sub-messages carried in the VBCM 2415 are indicated by the "payloadType" as defined in [VBCM]. These 2416 sub-message types are reproduced below for convenience. 2417 "payloadType", in ITU-T Rec. H.271 terminology, refers to the sub- 2418 type of the H.271 message and should not be confused with an RTP 2419 payload type.
2421 Payload Message Content 2422 Type 2423 --------------------------------------------------------------------- 2424 0 One or more pictures without detected bit stream error 2425 mismatch 2426 1 One or more pictures that are entirely or partially lost 2427 2 A set of blocks of one picture that is entirely or partially 2428 lost 2429 3 CRC for one parameter set 2430 4 CRC for all parameter sets of a certain type 2431 5 A "reset" request indicating that the sender should completely 2432 refresh the video bit stream as if no prior bit stream data 2433 had been received 2434 > 5 Reserved for future use by ITU-T 2436 Table 2: H.271 message types ("payloadTypes") 2438 The bit string or the "payload" of a VBCM message is of variable 2439 length and is self-contained and coded in a variable length, 2440 binary format. The media sender necessarily has to be able to 2441 parse this optimized binary format to make use of VBCM messages. 2443 Each of the different types of sub-messages (indicated by 2444 payloadType) may have different semantics depending on the codec 2445 used. 2447 Within the common packet header for feedback messages (as defined 2448 in section 6.1 of [RFC4585]), the "SSRC of the packet sender" 2449 field indicates the source of the request, and the "SSRC of media 2450 source" is not used and SHALL be set to 0. The SSRCs of the media 2451 senders to which the VBCM message applies are in the 2452 corresponding FCI entries. The sender of the VBCM message MAY 2453 send H.271 messages to multiple media senders and MAY send more 2454 than one H.271 message to the same media sender within the same 2455 VBCM message. 2457 4.3.4.3. Timing Rules 2459 The timing follows the rules outlined in section 3 of [RFC4585]. 2460 The different sub-message types may have different properties 2461 with regard to the timing of messages that should be used.
If several 2462 different types are included in the same feedback packet then the 2463 requirements for the sub-message type with the most stringent 2464 requirements should be followed. 2466 4.3.4.4. Handling of message in Mixer or Translator 2468 The handling of VBCM in a mixer or translator is sub-message type 2469 dependent. 2471 4.3.4.5. Remarks 2473 Please see section 3.5.3 for a discussion of the usage of H.271 2475 messages and messages defined in AVPF [RFC4585] and this memo with 2476 similar functionality. 2478 Note: There has been some discussion whether the payload type 2479 field in this message is needed. It will be needed if there is 2480 potentially more than one VBCM-capable RTP payload type in the 2481 same session, and the semantics of a given VBCM message changes 2482 between payload types. For example, the picture identification 2483 mechanism in messages of H.271 type 0 is fundamentally different 2484 between H.263 and H.264 (although both use the same syntax). 2485 Therefore, the payload type field is justified here. There was a 2486 further comment that for TSTR and FIR such a need does not 2487 exist, because the semantics of TSTR and FIR are either loosely 2488 enough defined, or generic enough, to apply to all video 2489 payloads currently in existence/envisioned. 2491 5. Congestion Control 2493 The correct application of the AVPF [RFC4585] timing rules 2494 prevents the network from being flooded by feedback messages. 2495 Hence, assuming a correct implementation and configuration, the 2496 RTCP channel cannot break its bit rate commitment and introduce 2497 congestion. 2499 The reception of some of the feedback messages modifies the 2500 behaviour of the media senders or, more specifically, the media 2501 encoders. Thus modified behaviour MUST respect the bandwidth 2502 limits that the application of congestion control provides.
For 2503 example, when a media sender is reacting to a FIR, the unusually 2504 high number of packets that form the decoder refresh point have to 2505 be paced in compliance with the congestion control algorithm, even 2506 if the user experience suffers from a slowly transmitted decoder 2507 refresh point. 2509 A change of the Temporary Maximum Media Stream Bit Rate value can 2510 only mitigate congestion, but not cause congestion as long as 2511 congestion control is also employed. An increase of the value by 2512 a request REQUIRES the media sender to use congestion control when 2513 increasing its transmission rate to that value. A reduction of 2514 the value results in a reduced transmission bit rate thus reducing 2515 the risk for congestion. 2517 6. Security Considerations 2519 The defined messages have certain properties that have security 2520 implications. These must be addressed and taken into account by 2521 users of this protocol. 2523 The defined setup signaling mechanism is sensitive to modification 2524 attacks that can result in session creation with sub-optimal 2525 configuration, and, in the worst case, session rejection. To 2526 prevent this type of attack, authentication and integrity 2527 protection of the setup signaling is required. 2529 Spoofed or maliciously created feedback messages of the type 2530 defined in this specification can have the following implications: 2532 a. severely reduced media bit rate due to false TMMBR messages 2533 that set the maximum to a very low value; 2535 b. assignment of the ownership of a bounding tuple to the 2536 wrong participant within a TMMBN message, potentially 2537 causing unnecessary oscillation in the bounding set as the 2538 mistakenly identified owner reports a change in its tuple 2539 and the true owner possibly holds back on changes until a 2540 correct TMMBN message reaches the participants; 2542 c.
sending TSTR requests that result in a video quality 2543 different from the user's desire, rendering the session 2544 less useful. 2546 d. Frequent FIR commands will potentially reduce the frame- 2547 rate, making the video jerky, due to the frequent usage of 2548 decoder refresh points. 2550 To prevent these attacks there is a need to apply authentication 2551 and integrity protection of the feedback messages. This can be 2552 accomplished against threats external to the current RTP session 2553 using the RTP profile that combines SRTP [SRTP] and AVPF into 2554 SAVPF [SAVPF]. In the mixer cases, separate security contexts and 2555 filtering can be applied between the mixer and the participants 2556 thus protecting other users on the mixer from a misbehaving 2557 participant. 2559 7. SDP Definitions 2561 Section 4 of [RFC4585] defines a new SDP [RFC4566] attribute, 2562 rtcp-fb, that may be used to negotiate the capability to handle 2563 specific AVPF commands and indications, such as Reference Picture 2564 Selection, Picture Loss Indication etc. The ABNF for rtcp-fb is 2565 described in section 4.2 of [RFC4585]. In this section we extend 2566 the rtcp-fb attribute to include the commands and indications that 2567 are described for codec control protocol in the present document. 2568 We also discuss the Offer/Answer implications for the codec 2569 control commands and indications. 2571 7.1. Extension of the rtcp-fb Attribute 2573 As described in AVPF [RFC4585], the rtcp-fb attribute indicates 2574 the capability of using RTCP feedback. AVPF specifies that the 2575 rtcp-fb attribute must only be used as a media level attribute and 2576 must not be provided at session level. All the rules described in 2577 [RFC4585] for rtcp-fb attribute relating to payload type and to 2578 multiple rtcp-fb attributes in a session description also apply to 2579 the new feedback messages defined in this memo. 
   The ABNF [RFC4234] for rtcp-fb as defined in [RFC4585] is

      "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF

   where rtcp-fb-pt is the payload type and rtcp-fb-val defines the
   type of the feedback message, such as ack, nack, trr-int, and
   rtcp-fb-id.  For example, to indicate the support of feedback of
   picture loss indication, the sender declares the following in SDP:

      v=0
      o=alice 3203093520 3203093520 IN IP4 host.example.com
      s=Media with feedback
      t=0 0
      c=IN IP4 host.example.com
      m=video 49170 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 nack pli

   In this document, we define a new feedback value "ccm", which
   indicates the support of codec control using RTCP feedback
   messages.  The "ccm" feedback value SHOULD be used with parameters
   that indicate the specific codec control commands supported.  In
   this document, we define four parameters that can be used with the
   ccm feedback value type.

   o  "fir" indicates the support of the Full Intra Request (FIR).

   o  "tmmbr" indicates the support of the Temporary Maximum Media
      Stream Bit Rate Request/Notification (TMMBR/TMMBN).  It has an
      optional subparameter to indicate the session maximum packet
      rate to be used.  If not included, this defaults to infinity.

   o  "tstr" indicates the support of the Temporal-Spatial Trade-off
      Request/Notification (TSTR/TSTN).

   o  "vbcm" indicates the support of H.271 video back channel
      messages (VBCM).  It has zero or more subparameters identifying
      the supported H.271 "payloadType" values.

   In the ABNF for rtcp-fb-val defined in [RFC4585], there is a
   placeholder called rtcp-fb-id to define new feedback types.  "ccm"
   is defined as a new feedback type in this document, and the ABNF
   for the parameters for ccm is defined here (please refer to section
   4.2 of [RFC4585] for complete ABNF syntax).

      rtcp-fb-param      = SP "app" [SP byte-string]
                         / SP rtcp-fb-ccm-param
                         / ; empty

      rtcp-fb-ccm-param  = "ccm" SP ccm-param

      ccm-param          = "fir"   ; Full Intra Request
                         / "tmmbr" [SP "smaxpr=" MaxPacketRateValue]
                                   ; Temporary max media bit rate
                         / "tstr"  ; Temporal-Spatial Trade-Off
                         / "vbcm" *(SP subMessageType)
                                   ; H.271 VBCM messages
                         / token [SP byte-string]
                                   ; for future commands/indications
      subMessageType     = 1*8DIGIT
      byte-string        = <as defined in section 4.2 of [RFC4585]>
      MaxPacketRateValue = 1*15DIGIT

7.2. Offer-Answer

   The Offer/Answer [RFC3264] implications for codec control protocol
   feedback messages are similar to those described in [RFC4585].  The
   offerer MAY indicate the capability to support selected codec
   commands and indications.  The answerer MUST remove all ccm
   parameters that it does not understand or does not wish to use in
   this particular media session.  The answerer MUST NOT add new ccm
   parameters in addition to what has been offered.  The answer is
   binding for the media session, and both offerer and answerer MUST
   only use feedback messages negotiated in this way.

   The session maximum packet rate parameter part of the TMMBR
   indication is declarative, and everyone shall use the highest value
   indicated in a response.  If the session maximum packet rate
   parameter is not present in an offer, it SHALL NOT be included by
   the answerer.

7.3. Examples

   Example 1: The following SDP describes a point-to-point video call
   with H.263, with the originator of the call declaring its
   capability to support the FIR and TSTR/TSTN codec control messages.
   The SDP is carried in a high-level signaling protocol such as SIP.

      v=0
      o=alice 3203093520 3203093520 IN IP4 host.example.com
      s=Point-to-Point call
      c=IN IP4 192.0.2.124
      m=audio 49170 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 51372 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm tstr
      a=rtcp-fb:98 ccm fir

   In the above example, when the sender receives a TSTR message from
   the remote party, it is capable of adjusting the trade-off as
   indicated in the RTCP TSTN feedback message.

   Example 2: The following SDP describes a SIP end point joining a
   video mixer that is hosting a multiparty video conferencing
   session.  The participant supports only the FIR (Full Intra
   Request) codec control command and declares this in its session
   description.

      v=0
      o=alice 3203093520 3203093520 IN IP4 host.example.com
      s=Multiparty Video Call
      c=IN IP4 192.0.2.124
      m=audio 49170 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 51372 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm fir

   When the video MCU decides to route the video of this participant,
   it sends an RTCP FIR feedback message.  Upon receiving this
   feedback message, the end point is required to generate a decoder
   refresh point.

   Example 3: The following example describes the Offer/Answer
   implications for the codec control messages.  The offerer wishes to
   support "tstr", "fir", and "tmmbr".
   The offered SDP is

      -------------> Offer

      v=0
      o=alice 3203093520 3203093520 IN IP4 host.example.com
      s=Offer/Answer
      c=IN IP4 192.0.2.124
      m=audio 49170 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 51372 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm tstr
      a=rtcp-fb:98 ccm fir
      a=rtcp-fb:* ccm tmmbr smaxpr=120

   The answerer wishes to support only the FIR and TSTR/TSTN messages,
   and the answerer SDP is

      <---------------- Answer

      v=0
      o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
      s=Offer/Answer
      c=IN IP4 192.0.2.37
      m=audio 47190 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 53273 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm tstr
      a=rtcp-fb:98 ccm fir

   Example 4: The following example describes the Offer/Answer
   implications for H.271 video back channel messages (VBCM).  The
   offerer wishes to support VBCM and the sub-messages of payloadType
   1 (one or more pictures that are entirely or partially lost) and 2
   (a set of blocks of one picture that are entirely or partially
   lost).

      -------------> Offer

      v=0
      o=alice 3203093520 3203093520 IN IP4 host.example.com
      s=Offer/Answer
      c=IN IP4 192.0.2.124
      m=audio 49170 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 51372 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm vbcm 1 2

   The answerer wishes to support only sub-messages of type 1:

      <---------------- Answer

      v=0
      o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
      s=Offer/Answer
      c=IN IP4 192.0.2.37
      m=audio 47190 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 53273 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm vbcm 1

   So, in the above example, only VBCM indications comprised of
   "payloadType" 1 will be supported.

8. IANA Considerations

   The new value "ccm" needs to be registered with IANA in the
   "rtcp-fb" Attribute Values registry located at the time of
   publication at:
   http://www.iana.org/assignments/sdp-parameters

      Value name:  ccm
      Long Name:   Codec Control Commands and Indications
      Reference:   RFC XXXX

   A new registry "Codec Control Messages" needs to be created to hold
   "ccm" parameters, located at the time of publication at:
   http://www.iana.org/assignments/sdp-parameters

   New registrations in this registry follow the "Specification
   required" policy as defined by [RFC2434].  In addition, they are
   required to indicate which additional RTCP feedback types, if any
   (such as "nack" or "ack"), they are usable with.

   The initial content of the registry is the following values:

      Value name:   fir
      Long name:    Full Intra Request Command
      Usable with:  ccm
      Reference:    RFC XXXX

      Value name:   tmmbr
      Long name:    Temporary Maximum Media Stream Bit Rate
      Usable with:  ccm
      Reference:    RFC XXXX

      Value name:   tstr
      Long name:    Temporal-Spatial Trade-Off
      Usable with:  ccm
      Reference:    RFC XXXX

      Value name:   vbcm
      Long name:    H.271 video back channel messages
      Usable with:  ccm
      Reference:    RFC XXXX

   The following values need to be registered as FMT values in the
   "FMT Values for RTPFB Payload Types" registry located at the time
   of publication at: http://www.iana.org/assignments/rtp-parameters

   RTPFB range
   Name      Long Name                                Value  Reference
   --------  ---------------------------------------  -----  ---------
   Reserved                                           2      [RFCxxxx]
   TMMBR     Temporary Maximum Media Stream Bit Rate  3      [RFCxxxx]
             Request
   TMMBN     Temporary Maximum Media Stream Bit Rate  4      [RFCxxxx]
             Notification

   The following values need to be registered as FMT values in the
   "FMT Values for PSFB Payload Types" registry located at the time of
   publication at: http://www.iana.org/assignments/rtp-parameters

   PSFB range
   Name      Long Name                                Value  Reference
   --------  ---------------------------------------  -----  ---------
   FIR       Full Intra Request Command               4      [RFCxxxx]
   TSTR      Temporal-Spatial Trade-off Request       5      [RFCxxxx]
   TSTN      Temporal-Spatial Trade-off Notification  6      [RFCxxxx]
   VBCM      Video Back Channel Message               7      [RFCxxxx]

9. Contributors

   Tom Taylor has made a very significant contribution to this
   specification, for which the authors are very grateful, by helping
   rewrite it.  The parts regarding the algorithm for determining
   bounding sets for TMMBR have benefited especially.

10. Acknowledgements

   The authors would like to thank Andrea Basso, Orit Levin, and
   Nermeen Ismail for their work on the requirement and discussion
   draft [Basso].

   Drafts of this memo were reviewed and extensively commented on by
   Roni Even, Colin Perkins, Randell Jesup, Keith Lantz, Harikishan
   Desineni, Guido Franceschini, and others.  The authors appreciate
   these reviews.

   Funding for the RFC Editor function is currently provided by the
   Internet Society.

11. References

11.1. Normative references

   [RFC4585]    Ott, J., Wenger, S., Sato, N., Burmeister, C., and J.
                Rey, "Extended RTP Profile for Real-Time Transport
                Control Protocol (RTCP)-Based Feedback (RTP/AVPF)",
                RFC 4585, July 2006.
   [RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
                Requirement Levels", BCP 14, RFC 2119, March 1997.
   [RFC3550]    Schulzrinne, H., Casner, S., Frederick, R., and V.
                Jacobson, "RTP: A Transport Protocol for Real-Time
                Applications", STD 64, RFC 3550, July 2003.
   [RFC4566]    Handley, M., Jacobson, V., and C. Perkins, "SDP:
                Session Description Protocol", RFC 4566, July 2006.
   [RFC3264]    Rosenberg, J. and H. Schulzrinne, "An Offer/Answer
                Model with Session Description Protocol (SDP)", RFC
                3264, June 2002.
   [RFC2434]    Narten, T. and H. Alvestrand, "Guidelines for Writing
                an IANA Considerations Section in RFCs", BCP 26, RFC
                2434, October 1998.
   [RFC4234]    Crocker, D. and P. Overell, "Augmented BNF for Syntax
                Specifications: ABNF", RFC 4234, October 2005.

11.2. Informative references

   [Basso]      Basso, A., et al., "Requirements for transport of
                video control commands",
                draft-basso-avt-videoconreq-02.txt, expired Internet
                Draft, October 2004.
   [AVC]        Joint Video Team of ITU-T and ISO/IEC JTC 1, Draft
                ITU-T Recommendation and Final Draft International
                Standard of Joint Video Specification (ITU-T Rec.
                H.264 | ISO/IEC 14496-10 AVC), Joint Video Team (JVT)
                of ISO/IEC MPEG and ITU-T VCEG, JVT-G050, March 2003.
   [H245]       ITU-T Rec. H.245, "Control protocol for multimedia
                communication", May 2006.
   [NEWPRED]    Fukunaga, S., Nakai, T., and H. Inoue, "Error
                Resilient Video Coding by Dynamic Replacing of
                Reference Pictures", in Proc. Globecom'96, vol. 3,
                pp. 1503-1508, 1996.
   [SRTP]       Baugher, M., McGrew, D., Naslund, M., Carrara, E., and
                K. Norrman, "The Secure Real-time Transport Protocol
                (SRTP)", RFC 3711, March 2004.
   [RFC2032]    Turletti, T. and C. Huitema, "RTP Payload Format for
                H.261 Video Streams", RFC 2032, October 1996.
   [SAVPF]      Ott, J. and E. Carrara, "Extended Secure RTP Profile
                for RTCP-based Feedback (RTP/SAVPF)",
                draft-ietf-avt-profile-savpf-10.txt, February 2007.
   [RFC3525]    Groves, C., Pantaleo, M., Anderson, T., and T. Taylor,
                "Gateway Control Protocol Version 1", RFC 3525, June
                2003.
   [RFC3448]    Handley, M., Floyd, S., Padhye, J., and J. Widmer,
                "TCP Friendly Rate Control (TFRC): Protocol
                Specification", RFC 3448, January 2003.
   [VBCM]       ITU-T Rec. H.271, "Video Back Channel Messages", June
                2006.
   [RFC3890]    Westerlund, M., "A Transport Independent Bandwidth
                Modifier for the Session Description Protocol (SDP)",
                RFC 3890, September 2004.
   [RFC4340]    Kohler, E., Handley, M., and S. Floyd, "Datagram
                Congestion Control Protocol (DCCP)", RFC 4340, March
                2006.
   [RFC3261]    Rosenberg, J., Schulzrinne, H., Camarillo, G.,
                Johnston, A., Peterson, J., Sparks, R., Handley, M.,
                and E. Schooler, "SIP: Session Initiation Protocol",
                RFC 3261, June 2002.
   [RFC2198]    Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
                Handley, M., Bolot, J., Vega-Garcia, A., and S.
                Fosse-Parisis, "RTP Payload for Redundant Audio Data",
                RFC 2198, September 1997.
   [Topologies] Westerlund, M. and S. Wenger, "RTP Topologies",
                draft-ietf-avt-topologies-04, work in progress,
                February 2007.

12. Authors' Addresses

   Stephan Wenger
   Nokia Corporation
   975, Page Mill Road,
   Palo Alto, CA 94304
   USA

   Phone: +1-650-862-7368
   EMail: stewe@stewe.org

   Umesh Chandra
   Nokia Research Center
   975, Page Mill Road,
   Palo Alto, CA 94304
   USA

   Phone: +1-650-796-7502
   EMail: Umesh.Chandra@nokia.com

   Magnus Westerlund
   Ericsson Research
   Ericsson AB
   SE-164 80 Stockholm, SWEDEN

   Phone: +46 8 7190000
   EMail: magnus.westerlund@ericsson.com

   Bo Burman
   Ericsson Research
   Ericsson AB
   SE-164 80 Stockholm, SWEDEN

   Phone: +46 8 7190000
   EMail: bo.burman@ericsson.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).

RFC Editor Considerations

   The RFC editor is requested to replace all occurrences of XXXX with
   the RFC number this document receives.