Network Working Group                                     Stephan Wenger
INTERNET-DRAFT                                             Umesh Chandra
Expires: February 2008                                             Nokia
Intended Status: Proposed Standard                     Magnus Westerlund
                                                               Bo Burman
                                                                Ericsson
                                                          August 1, 2007

                     Codec Control Messages in the
             RTP Audio-Visual Profile with Feedback (AVPF)

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document specifies a few extensions to the messages defined in
   the Audio-Visual Profile with Feedback (AVPF).  They are helpful
   primarily in conversational multimedia scenarios where centralized
   multipoint functionalities are in use.  However, some are also
   usable in smaller multicast environments and point-to-point calls.

   The extensions discussed are messages related to the ITU-T H.271
   Video Back Channel, Full Intra Request, Temporary Maximum Media
   Stream Bit Rate and Temporal Spatial Trade-off.

TABLE OF CONTENTS

   1. Introduction
   2. Definitions
      2.1. Glossary
      2.2. Terminology
      2.3. Topologies
   3. Motivation
      3.1. Use Cases
      3.2. Using the Media Path
      3.3. Using AVPF
         3.3.1. Reliability
      3.4. Multicast
      3.5. Feedback Messages
         3.5.1. Full Intra Request Command
            3.5.1.1. Reliability
         3.5.2. Temporal Spatial Trade-off Request and Notification
            3.5.2.1. Point-to-Point
            3.5.2.2. Point-to-Multipoint Using Multicast or Translators
            3.5.2.3. Point-to-Multipoint Using RTP Mixer
            3.5.2.4. Reliability
         3.5.3. H.271 Video Back Channel Message
            3.5.3.1. Reliability
         3.5.4. Temporary Maximum Media Stream Bit Rate Request and
                Notification
            3.5.4.1. Behavior for media receivers using TMMBR
            3.5.4.2. Algorithm for establishing current limitations
            3.5.4.3. Use of TMMBR in a Mixer Based Multipoint Operation
            3.5.4.4. Use of TMMBR in Point-to-Multipoint Using
                     Multicast or Translators
            3.5.4.5. Use of TMMBR in Point-to-point operation
            3.5.4.6. Reliability
   4. RTCP Receiver Report Extensions
      4.1. Design Principles of the Extension Mechanism
      4.2. Transport Layer Feedback Messages
         4.2.1. Temporary Maximum Media Stream Bit Rate Request (TMMBR)
            4.2.1.1. Message Format
            4.2.1.2. Semantics
            4.2.1.3. Timing Rules
            4.2.1.4. Handling in Translator and Mixers
         4.2.2. Temporary Maximum Media Stream Bit Rate Notification
                (TMMBN)
            4.2.2.1. Message Format
            4.2.2.2. Semantics
            4.2.2.3. Timing Rules
            4.2.2.4. Handling by Translators and Mixers
      4.3. Payload Specific Feedback Messages
         4.3.1. Full Intra Request (FIR)
            4.3.1.1. Message Format
            4.3.1.2. Semantics
            4.3.1.3. Timing Rules
            4.3.1.4. Handling of FIR Message in Mixer and Translators
            4.3.1.5. Remarks
         4.3.2. Temporal-Spatial Trade-off Request (TSTR)
            4.3.2.1. Message Format
            4.3.2.2. Semantics
            4.3.2.3. Timing Rules
            4.3.2.4. Handling of message in Mixers and Translators
            4.3.2.5. Remarks
         4.3.3. Temporal-Spatial Trade-off Notification (TSTN)
            4.3.3.1. Message Format
            4.3.3.2. Semantics
            4.3.3.3. Timing Rules
            4.3.3.4. Handling of TSTN in Mixer and Translators
            4.3.3.5. Remarks
         4.3.4. H.271 Video Back Channel Message (VBCM)
            4.3.4.1. Message Format
            4.3.4.2. Semantics
            4.3.4.3. Timing Rules
            4.3.4.4. Handling of message in Mixer or Translator
            4.3.4.5. Remarks
   5. Congestion Control
   6. Security Considerations
   7. SDP Definitions
      7.1. Extension of the rtcp-fb Attribute
      7.2. Offer-Answer
      7.3. Examples
   8. IANA Considerations
   9. Contributors
   10. Acknowledgements
   11. References
      11.1. Normative references
      11.2. Informative references
   12. Authors' Addresses

1. Introduction

   When the Audio-Visual Profile with Feedback (AVPF) [RFC4585] was
   developed, the main emphasis lay in the efficient support of
   point-to-point and small multipoint scenarios without centralized
   multipoint control.  However, in practice, many small multipoint
   conferences operate utilizing devices known as Multipoint Control
   Units (MCUs).
   Long-standing experience of the conversational video
   conferencing industry suggests that there is a need for a few
   additional feedback messages, to support centralized multipoint
   conferencing efficiently.  Some of the messages have applications
   beyond centralized multipoint, and this is indicated in the
   description of the message.  This is especially true for the message
   intended to carry ITU-T Rec. H.271 [H.271] bit strings for Video
   Back Channel messages.

   In Real-time Transport Protocol (RTP) [RFC3550] terminology, MCUs
   comprise mixers and translators.  Most MCUs also include signaling
   support.  During the development of this memo, it was noticed that
   there is considerable confusion in the community related to the use
   of terms such as mixer, translator, and MCU.  In response to these
   concerns, a number of topologies have been identified that are of
   practical relevance to the industry, but are not documented in
   sufficient detail in [RFC3550].  These topologies are documented in
   [Topologies], and understanding this memo requires previous or
   parallel study of [Topologies].

   Some of the messages defined here are forward only, in that they do
   not require an explicit notification to the message emitter that
   they have been received and/or indicating the message receiver's
   actions.  Other messages require a response, leading to a two way
   communication model that one could view as useful for control
   purposes.  However, it is not the intention of this memo to open up
   RTP Control Protocol (RTCP) to a generalized control protocol.  All
   mentioned messages have relatively strict real-time constraints, in
   the sense that their value diminishes with increased delay.  This
   makes the use of more traditional control protocol means, such as
   Session Initiation Protocol (SIP) re-INVITEs [RFC3261], undesirable
   when used for the same purpose.
   Furthermore, all messages are of a very simple format that can be
   easily processed by an RTP/RTCP sender/receiver.  Finally, and most
   importantly, all messages relate only to the RTP stream with which
   they are associated, and not to any other property of a
   communication system.  In particular, none of them relate to the
   properties of the access links traversed by the session.

2. Definitions

2.1. Glossary

   AIMD  - Additive Increase Multiplicative Decrease
   AVPF  - The extended RTP profile for RTCP-based feedback
   FEC   - Forward Error Correction
   FCI   - Feedback Control Information [RFC4585]
   FIR   - Full Intra Request
   MCU   - Multipoint Control Unit
   MPEG  - Moving Picture Experts Group
   TMMBN - Temporary Maximum Media Stream Bit Rate Notification
   TMMBR - Temporary Maximum Media Stream Bit Rate Request
   PLI   - Picture Loss Indication
   PR    - Packet rate
   QP    - Quantizer Parameter
   RTT   - Round trip time
   SSRC  - Synchronization Source
   TSTN  - Temporal Spatial Trade-off Notification
   TSTR  - Temporal Spatial Trade-off Request
   VBCM  - Video Back Channel Message indication.

2.2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119
   [RFC2119].

   Message:
      An RTCP feedback message [RFC4585] defined by this
      specification, of one of the following types:

      Request:
         Message that requires acknowledgement

      Command:
         Message that forces the receiver to an action

      Indication:
         Message that reports a situation

      Notification:
         Message that provides a notification that an event has
         occurred.  Notifications are commonly generated in
         response to a Request.

      Note that, with the exception of "Notification", this
      terminology is in alignment with ITU-T Rec. H.245 [H245].

   Decoder Refresh Point:
      A bit string, packetized in one or more RTP packets, which
      completely resets the decoder to a known state.

      Examples of "hard" decoder refresh points are Intra pictures
      in H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 part 2, and
      Instantaneous Decoder Refresh (IDR) pictures in H.264.
      "Gradual" decoder refresh points may also be used; see for
      example [AVC].  While both "hard" and "gradual" decoder
      refresh points are acceptable in the scope of this
      specification, in most cases the user experience will benefit
      from using a "hard" decoder refresh point.

      A decoder refresh point also contains all header information
      above the picture layer (or equivalent, depending on the
      video compression standard) that is conveyed in-band.  In
      H.264, for example, a decoder refresh point contains
      parameter set Network Abstraction Layer (NAL) units that
      generate parameter sets necessary for the decoding of the
      following slice/data partition NAL units (and that are not
      conveyed out of band).

   Decoding:
      The operation of reconstructing the media stream.

   Rendering:
      The operation of presenting (parts of) the reconstructed
      media stream to the user.

   Stream thinning:
      The operation of removing some of the packets from a media
      stream.  Stream thinning, preferably, is media-aware,
      implying that media packets are removed in the order of
      increasing relevance to the reproductive quality.  However,
      even when employing media-aware stream thinning, most media
      streams quickly lose quality when subjected to increasing
      levels of thinning.  Media-unaware stream thinning leads to
      even worse quality degradation.  In contrast to transcoding,
      stream thinning is typically seen as a computationally
      lightweight operation.

   Media:
      Often used (sometimes in conjunction with terms like bit
      rate, stream, sender ...)
      to identify the content of the
      forward RTP packet stream (carrying the codec data), to which
      the codec control message applies.

   Media Stream:
      The stream of RTP packets labeled with a single
      Synchronization Source (SSRC) carrying the media (and also in
      some cases repair information such as retransmission or
      Forward Error Correction (FEC) information).

   Total media bit rate:
      The total bits per second transferred in a media stream,
      measured at an observer-selected protocol layer and averaged
      over a reasonable timescale, the length of which depends on
      the application.  In general, a media sender and a media
      receiver will observe different total media bit rates for the
      same stream, first because they may have selected different
      reference protocol layers, and second, because of changes in
      per-packet overhead along the transmission path.  The goal
      with bit rate averaging is to be able to ignore any
      burstiness on very short timescales, below for example 100
      ms, introduced by scheduling or link layer packetization
      effects.

   Maximum total media bit rate:
      The upper limit on total media bit rate for a given media
      stream at a particular receiver and for its selected protocol
      layer.  Note that this value cannot be measured on the
      received media stream; instead, it needs to be calculated or
      determined through other means, such as QoS negotiations or
      local resource limitations.  Also note that this value is an
      average (on a timescale that is reasonable for the
      application) and that it may be different from the
      instantaneous bit-rate seen by packets in the media stream.

   Overhead:
      All protocol header information required to convey a packet
      with media data from sender to receiver, from the application
      layer down to a pre-defined protocol level (for example down
      to, and including, the IP header).
      Overhead may include, for
      example, IP, UDP, and RTP headers, any layer 2 headers, any
      Contributing Sources (CSRCs), RTP-Padding, and RTP header
      extensions.  Overhead excludes any RTP payload headers and
      the payload itself.

   Net media bit rate:
      The bit rate carried by a media stream, net of overhead.
      That is, the bits per second accounted for by encoded media,
      any applicable payload headers, and any directly associated
      meta payload information placed in the RTP packet.  A typical
      example of the latter is redundancy data provided by the use
      of RFC 2198 [RFC2198].  Note that, unlike the total media bit
      rate, the net media bit rate will have the same value at the
      media sender and at the media receiver unless any mixing or
      translating of the media has occurred.

      For a given observer, the total media bit rate for a media
      stream is equal to the sum of the net media bit rate and the
      per-packet overhead as defined above multiplied by the packet
      rate.

   Feasible region:
      The set of all combinations of packet rate and net media bit
      rate that do not exceed the restrictions in maximum media bit
      rate placed on a given media sender by the Temporary Maximum
      Media Stream Bit-rate Request (TMMBR) messages it has
      received.  The feasible region will change as new TMMBR
      messages are received.

   Bounding set:
      The set of TMMBR tuples, selected from all those received at
      a given media sender, that define the feasible region for
      that media sender.  The media sender uses an algorithm such
      as that in section 3.5.4.2 to determine or iteratively
      approximate the current bounding set, and reports that set
      back to the media receivers in a Temporary Maximum Media
      Stream Bit-rate Notification (TMMBN) message.

2.3. Topologies

   Please refer to [Topologies] for an in-depth discussion.
   The topologies referred to throughout this memo are labeled
   (consistently with [Topologies]) as follows:

   Topo-Point-to-Point . . . . . Point-to-point communication
   Topo-Multicast  . . . . . . . Multicast communication
   Topo-Translator . . . . . . . Translator based
   Topo-Mixer  . . . . . . . . . Mixer based
   Topo-RTP-switch-MCU . . . . . RTP stream switching MCU
   Topo-RTCP-terminating-MCU . . Mixer but terminating RTCP

3. Motivation

   This section discusses the motivation and usage of the different
   video and media control messages.  The video control messages have
   been under discussion for a long time, and a requirement draft was
   drawn up [Basso].  This draft has expired; however, we quote relevant
   sections of it to provide motivation and requirements.

3.1. Use Cases

   There are a number of possible usages for the proposed feedback
   messages.  Let us begin by looking through the use cases Basso et
   al. [Basso] proposed.  Some of the use cases have been reformulated
   and comments have been added.

   1. An RTP video mixer composes multiple encoded video sources into a
      single encoded video stream.  Each time a video source is added,
      the RTP mixer needs to request a decoder refresh point from the
      video source, so as to start an uncorrupted prediction chain on
      the spatial area of the mixed picture occupied by the data from
      the new video source.

   2. An RTP video mixer receives multiple encoded RTP video streams
      from conference participants, and dynamically selects one of the
      streams to be included in its output RTP stream.  At the time of
      a bit stream change (determined through means such as voice
      activation or the user interface), the mixer requests a decoder
      refresh point from the remote source, in order to avoid using
      unrelated content as reference data for inter picture prediction.
      After requesting the decoder refresh point, the video mixer stops
      the delivery of the current RTP stream and monitors the RTP
      stream from the new source until it detects data belonging to the
      decoder refresh point.  At that time, the RTP mixer starts
      forwarding the newly selected stream to the receiver(s).

   3. An application needs to signal to the remote encoder that the
      desired trade-off between temporal and spatial resolution has
      changed.  For example, one user may prefer a higher frame rate
      and a lower spatial quality, and another user may prefer the
      opposite.  This choice is also highly content dependent.  Many
      current video conferencing systems offer in the user interface a
      mechanism to make this selection, usually in the form of a
      slider.  The mechanism is helpful in point-to-point, centralized
      multipoint and non-centralized multipoint uses.

   4. Use case 4 of the Basso draft applies only to Picture Loss
      Indication (PLI) as defined in AVPF [RFC4585] and is not
      reproduced here.

   5. Use case 5 of the Basso draft relates to a mechanism known as
      "freeze picture request".  Sending freeze picture requests
      over a non-reliable forward RTCP channel has been identified as
      problematic.  Therefore, no freeze picture request has been
      included in this memo, and the use case discussion is not
      reproduced here.

   6. A video mixer dynamically selects one of the received video
      streams to be sent out to participants and tries to provide the
      highest bit rate possible to all participants, while minimizing
      stream trans-rating.  One way of achieving this is to set up
      sessions with endpoints using the maximum bit rate accepted by
      each endpoint, and accepted by the call admission method used by
      the mixer.
      By means of commands that reduce the maximum media
      stream bit rate below what has been negotiated during session set
      up, the mixer can reduce the maximum bit rate sent by endpoints
      to the lowest of all the accepted bit rates.  As the lowest
      accepted bit rate changes due to endpoints joining and leaving or
      due to network congestion, the mixer can adjust the limits at
      which endpoints can send their streams to match the new value.
      The mixer then requests a new maximum bit rate, which is equal to
      or less than the maximum bit rate negotiated at session setup for
      a specific media stream, and the remote endpoint can respond with
      the actual bit rate that it can support.

   The picture that Basso et al. draw up covers most applications we
   foresee.  However, we would like to extend the list with two
   additional use cases:

   7. Currently deployed congestion control algorithms (AIMD and TFRC
      [RFC3448]) probe for additional available capacity as long as
      there is something to send.  With congestion control algorithms
      using packet loss as the indication for congestion, this probing
      generally results in reduced media quality (often to a point
      where the distortion is large enough to make the media unusable),
      due to packet loss and increased delay.

      In a number of deployment scenarios, especially cellular ones,
      the bottleneck link is often the last hop link.  That cellular
      link also commonly has some type of QoS negotiation enabling the
      cellular device to learn the maximal bit rate available over this
      last hop.  A media receiver behind this link can, in most (if not
      all) cases, calculate at least an upper bound for the bit rate
      available for each media stream it presently receives.  How this
      is done is an implementation detail and not discussed herein.
      Indicating the maximum available bit rate to the transmitting
      party for the various media streams can be beneficial to prevent
      that party from probing for bandwidth for this stream in excess
      of a known hard limit.  For cellular or other mobile devices, the
      known available bit rate for each stream (deduced from the link
      bit rate) can change quickly, due to handover to another
      transmission technology, QoS renegotiation due to congestion,
      etc.  To enable minimal disruption of service, quick convergence
      is necessary, and therefore media path signaling is desirable.

   8. The use of reference picture selection (RPS) as an error
      resilience tool was introduced in 1997 as NEWPRED [NEWPRED],
      and is now widely deployed.  When RPS is in use, simplistically
      put, the receiver can send a feedback message to the sender,
      indicating a reference picture that should be used for future
      prediction.  ([NEWPRED] mentions other forms of feedback as
      well.)  AVPF contains a mechanism for conveying such a message,
      but does not specify for which codec, or according to which
      syntax, the message should be formed.  Recently, the ITU-T
      finalized Rec. H.271 which (among other message types) also
      includes a feedback message.  It is expected that this feedback
      message will fairly quickly enjoy wide support.  Therefore, a
      mechanism to convey feedback messages according to H.271 appears
      to be desirable.

3.2. Using the Media Path

   There are multiple reasons why we use the media path for the codec
   control messages.

   First, systems employing MCUs often separate the control and media
   processing parts.  As these messages are intended for or generated
   by the media part rather than the signaling part of the MCU, having
   them on the media path avoids transmission across interfaces and
   unnecessary control traffic between signaling and processing.
   If the MCU is physically decomposed, the use of the media path
   avoids the need for media control protocol extensions (e.g. in
   MEGACO [RFC3525]).

   Secondly, the signaling path quite commonly contains several
   signaling entities, e.g. SIP proxies and application servers.
   Avoiding going through signaling entities avoids delay for several
   reasons.  Proxies have less stringent delay requirements than media
   processing and due to their complex and more generic nature may
   result in significant processing delay.  The topological locations
   of the signaling entities are also commonly not optimized for
   minimal delay, but rather towards other architectural goals.  Thus,
   the signaling path can be significantly longer in both the
   geographical and the delay sense.

3.3. Using AVPF

   The AVPF feedback message framework [RFC4585] provides the
   appropriate framework to implement the new messages.  AVPF
   implements rules controlling the timing of feedback messages to
   avoid congestion through network flooding by RTCP traffic.  We
   re-use these rules by referencing AVPF.

   The signaling setup for AVPF allows each individual type of function
   to be configured or negotiated on an RTP session basis.

3.3.1. Reliability

   The use of RTCP messages implies that each message transfer is
   unreliable, unless the lower layer transport provides reliability.
   The different messages proposed in this specification have different
   requirements in terms of reliability.  However, in all cases, the
   reaction to an (occasional) loss of a feedback message is specified.

3.4. Multicast

   The codec control messages might be used with multicast.  The RTCP
   timing rules specified in [RFC3550] and [RFC4585] ensure that the
   messages do not cause overload of the RTCP connection.  The use of
   multicast may result in the reception of messages with inconsistent
   semantics.
   The reaction to inconsistencies depends on the message
   type, and is discussed for each message type separately.

3.5. Feedback Messages

   This section describes the semantics of the different feedback
   messages and how they apply to the different use cases.

3.5.1. Full Intra Request Command

   A Full Intra Request (FIR) Command, when received by the designated
   media sender, requires that the media sender send a Decoder Refresh
   Point (see Section 2.2) at the earliest opportunity.  The evaluation
   of such opportunity includes the current encoder coding strategy and
   the current available network resources.

   FIR is also known as an "instantaneous decoder refresh request",
   "fast video update request" or "video fast update request".

   Using a decoder refresh point implies refraining from using any
   picture sent prior to that point as a reference for the encoding
   process of any subsequent picture sent in the stream.  For
   predictive media types that are not video, the analogue applies.
   For example, if in MPEG-4 systems scene updates are used, the
   decoder refresh point consists of the full representation of the
   scene and is not delta-coded relative to previous updates.

   Decoder refresh points, especially Intra or IDR pictures, are in
   general several times larger in size than predicted pictures.  Thus,
   in scenarios in which the available bit rate is small, the use of a
   decoder refresh point implies a delay that is significantly longer
   than the typical picture duration.

   Usage in multicast is possible; however, aggregation of the commands
   is recommended.
A receiver that receives a request closely after sending a decoder
refresh point -- within 2 times the longest Round Trip Time (RTT)
known, plus any AVPF-induced RTCP packet sending delays -- should
await a second request message to ensure that the media receiver has
not been served by the previously delivered decoder refresh point.
The reason for the specified delay is to avoid sending unnecessary
decoder refresh points. A session participant may have sent its own
request while another participant's request was in-flight to them.
Suppressing those requests that may have been sent without knowledge
about the other request avoids this issue.

Using the FIR command to recover from errors is explicitly
disallowed, and instead the PLI message defined in AVPF [RFC4585]
should be used. The PLI message reports lost pictures and has been
included in AVPF for precisely that purpose.

Full Intra Request is applicable in use-cases 1 and 2.

3.5.1.1. Reliability

The FIR message results in the delivery of a decoder refresh point,
unless the message is lost. Decoder refresh points are easily
identifiable from the bit stream. Therefore, there is no need for
protocol-level notification, and a simple command repetition
mechanism is sufficient for ensuring the level of reliability
required. However, the potential use of repetition does require a
mechanism to prevent the recipient from responding to messages
already received and responded to.

To ensure the best possible reliability, a sender of FIR may repeat
the FIR request until the desired content has been received. The
repetition interval is determined by the RTCP timing rules
applicable to the session. Upon reception of a complete decoder
refresh point or the detection of an attempt to send a decoder
refresh point (which got damaged due to a packet loss), the
repetition of the FIR must stop.
If another FIR is necessary, the request sequence number must be
increased. A FIR sender shall not have more than one FIR request
(different request sequence number) outstanding at any time per
media sender in the session.

The receiver of FIR (i.e. the media sender) behaves in complementary
fashion to ensure delivery of a decoder refresh point. If it
receives repetitions of the FIR more than 2*RTT after it has sent a
decoder refresh point, it shall send a new decoder refresh point.
Two round trip times allow time for the decoder refresh point to
arrive back to the requestor and for the end of repetitions of FIR
to reach and be detected by the media sender.

An RTP mixer or RTP switching MCU that receives a FIR from a media
receiver is responsible for ensuring that a decoder refresh point is
delivered to the requesting receiver. It may be necessary for the
mixer/MCU to generate FIR commands. From a reliability perspective,
the two legs (FIR-requesting endpoint to mixer/MCU, and mixer/MCU to
decoder refresh point generating endpoint) are handled independently
from each other.

3.5.2. Temporal Spatial Trade-off Request and Notification

The Temporal Spatial Trade-off Request (TSTR) instructs the video
encoder to change its trade-off between temporal and spatial
resolution. Index values from 0 to 31 indicate monotonically a
desire for higher frame rate. That is, a requester asking for an
index of 0 prefers a high quality and is willing to accept a low
frame rate, whereas a requester asking for 31 wishes a high frame
rate, potentially at the cost of low spatial quality.

In general the encoder reaction time may be significantly longer
than the typical picture duration. See use case 3 for an example.
The encoder decides whether and to what extent the request results
in a change of the trade-off.
It returns a Temporal Spatial Trade-off Notification (TSTN) message
to indicate the trade-off that it will use henceforth.

TSTR and TSTN have been introduced primarily because it is believed
that control protocol mechanisms, e.g. a SIP re-INVITE, are too
heavyweight and too slow to allow for a reasonable user experience.
Consider, for example, a user interface where the remote user
selects the temporal/spatial trade-off with a slider. Immediate
feedback to any slider movement is required for a reasonable user
experience. A SIP re-INVITE [RFC3261] would require at least two
round-trips more (compared to the TSTR/TSTN mechanism) and may
involve proxies and other complex mechanisms. Even in a
well-designed system, it could take a second or so until the new
trade-off is finally selected. Furthermore, the use of RTCP solves
the multicast use case very efficiently.

The use of TSTR and TSTN in multipoint scenarios is a non-trivial
subject, and can be achieved in many implementation-specific ways.
Problems stem from the fact that TSTRs will typically arrive
unsynchronized, and may request different trade-off values for the
same stream and/or endpoint encoder. This memo does not specify a
translator's, mixer's or endpoint's reaction to the reception of a
suggested trade-off as conveyed in the TSTR. We only require the
receiver of a TSTR message to reply to it by sending a TSTN,
carrying the new trade-off chosen by its own criteria (which may or
may not be based on the trade-off conveyed by the TSTR). In other
words, the trade-off sent in TSTR is a non-binding recommendation,
nothing more.

Three TSTR/TSTN scenarios need to be distinguished, based on the
topologies described in [Topologies]. The scenarios are described
in the following sub-clauses.

3.5.2.1. Point-to-Point

In this most trivial case (Topo-Point-to-Point), the media sender
typically adjusts its temporal/spatial trade-off based on the
requested value in TSTR, subject to its own capabilities. The TSTN
message conveys back the new trade-off value (which may be identical
to the old one if, for example, the sender is not capable of
adjusting its trade-off).

3.5.2.2. Point-to-Multipoint Using Multicast or Translators

RTCP Multicast is used either with media multicast according to
Topo-Multicast, or following RFC 3550's translator model according
to Topo-Translator. In these cases, unsynchronized TSTR messages
from different receivers may be received, possibly with different
requested trade-offs (because of different user preferences). This
memo does not specify how the media sender tunes its trade-off.
Possible strategies include selecting the mean or median of all
trade-off requests received, giving priority to certain
participants, or continuing to use the previously selected trade-off
(e.g. when the sender is not capable of adjusting it). Again, all
TSTR messages need to be acknowledged by TSTN, and the value
conveyed back has to reflect the decision made.

3.5.2.3. Point-to-Multipoint Using RTP Mixer

In this scenario (Topo-Mixer) the RTP mixer receives all TSTR
messages, and has the opportunity to act on them based on its own
criteria. In most cases, the mixer should form a "consensus" of
potentially conflicting TSTR messages arriving from different
participants, and initiate its own TSTR message(s) to the media
sender(s). As in the previous scenario, the strategy for forming
this "consensus" is up to the implementation, and can, for example,
encompass averaging the participants' request values, giving
priority to certain participants, or using session default values.
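
As a purely illustrative sketch of one such "consensus" strategy, the
following Python fragment takes the median of the most recent TSTR
index per participant, lets requests from designated priority
participants override the rest, and falls back to a session default
when no valid request is known. The function name, the dictionary
layout, the priority mechanism and the default value of 15 are
assumptions made for this example only; this memo deliberately leaves
the strategy to the implementation.

```python
from statistics import median


def tstr_consensus(requests, priority=(), default_index=15):
    """Pick one trade-off index (0..31) from conflicting TSTR requests.

    requests:      dict mapping a participant identifier (for example
                   an SSRC) to the latest index that participant asked
                   for.
    priority:      identifiers whose requests, when present, are the
                   only ones considered (hypothetical policy knob).
    default_index: hypothetical session default used when no valid
                   request is available.
    """
    # Ignore values outside the 0..31 range defined for TSTR.
    valid = {p: i for p, i in requests.items() if 0 <= i <= 31}
    preferred = [valid[p] for p in priority if p in valid]
    pool = preferred or list(valid.values())
    if not pool:
        return default_index
    # median() yields x.5 for an even number of requests; round half
    # up to an integer index and clamp to the valid range.
    return max(0, min(31, int(median(pool) + 0.5)))
```

Whatever value such a function returns is only the mixer's own
choice; it still has to be conveyed upstream in the mixer's TSTR and
reported back to the requesters in TSTN.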

Even if a mixer or translator performs transcoding, it is very
difficult to deliver media with the requested trade-off, unless the
content the mixer or translator receives is already close to that
trade-off. Thus, if the mixer changes its trade-off, it needs to
request the media sender(s) to use the new value, by creating a TSTR
of its own. Upon reaching a decision on the used trade-off it
includes that value in the acknowledgement to the downstream
requestors. Only in cases where the original source has
substantially higher quality (and bit rate) is it likely that
transcoding alone can result in the requested trade-off.

3.5.2.4. Reliability

A request and reception acknowledgement mechanism is specified. The
Temporal Spatial Trade-off Notification (TSTN) message informs the
requester that its request has been received, and what trade-off is
used henceforth. This acknowledgment mechanism is desirable for at
least the following reasons:

o  A change in the trade-off cannot be directly identified from the
   media bit stream.
o  User feedback cannot be implemented without knowing the chosen
   trade-off value, according to the media sender's constraints.
o  Repetitive sending of messages requesting an unimplementable
   trade-off can be avoided.

3.5.3. H.271 Video Back Channel Message

ITU-T Rec. H.271 defines syntax, semantics, and suggested encoder
reaction to a video back channel message. The structure defined in
this memo is used to transparently convey such a message from media
receiver to media sender. In this memo, we refrain from an in-depth
discussion of the available code points within H.271 and refer to
the specification text [H.271] instead.

However, we note that some H.271 messages bear similarities with
native messages of AVPF and this memo.
Furthermore, we note that some H.271 messages are known to require
caution in multicast environments -- or are plainly not usable in
multicast or multipoint scenarios. Table 1 provides a brief,
oversimplified overview of the messages currently defined in H.271,
their roughly corresponding AVPF or CCM messages (the latter as
specified in this memo), and an indication of our current knowledge
of their multicast safety.

H.271 msg type        AVPF/CCM msg type  multicast-safe
--------------------------------------------------------------------
0 (when used for
  reference picture
  selection)          AVPF RPSI          No (positive ACK of pictures)
1 picture loss        AVPF PLI           Yes
2 partial loss        AVPF SLI           Yes
3 one parameter CRC   N/A                Yes (no required sender action)
4 all parameter CRC   N/A                Yes (no required sender action)
5 refresh point       CCM FIR            Yes

       Table 1: H.271 messages and their AVPF/CCM equivalents

   Note: H.271 message type 0 is not a strict equivalent to
   AVPF's Reference Picture Selection Indication (RPSI); it is
   an indication of known-as-correct reference picture(s) at the
   decoder. It does not command an encoder to use a defined
   reference picture (the form of control information envisioned
   to be carried in RPSI). However, it is believed and intended
   that H.271 message type 0 will be used for the same purpose
   as AVPF's RPSI -- although other use forms are also possible.

In response to the opaqueness of the H.271 messages, especially with
respect to the multicast safety, the following guidelines MUST be
followed when an implementation wishes to employ the H.271 video
back channel message:

1. Implementations utilizing the H.271 feedback message MUST stay in
   compliance with congestion control principles, as outlined in
   section 5.

2. An implementation SHOULD utilize the IETF-native messages as
   defined in [RFC4585] and in this memo instead of similar messages
   defined in [H.271]. Our current understanding of similar
   messages is documented in Table 1 above. One good reason to
   deviate from the SHOULD statement above would be if it is clearly
   understood that, for a given application and video compression
   standard, the aforementioned "similarity" is not given, in
   contrast to what the table indicates.

3. It has been observed that some of the H.271 code points currently
   in existence are not multicast-safe. Therefore, the sensible
   thing to do is not to use the H.271 feedback message type in
   multicast environments. It MAY be used only when all the issues
   mentioned later are fully understood by the implementer, and
   properly taken into account by all endpoints. In all other
   cases, the H.271 message type MUST NOT be used in conjunction
   with multicast.

4. It has been observed that even in centralized multipoint
   environments, where the mixer should theoretically be able to
   resolve issues as documented below, the implementation of such a
   mixer and cooperative endpoints is a very difficult and tedious
   task. Therefore, H.271 messages MUST NOT be used in centralized
   multipoint scenarios, unless all the issues mentioned below are
   fully understood by the implementer, and properly taken into
   account by both mixer and endpoints.

Issues to be taken into account when considering the use of H.271 in
multipoint environments:

1. Different state on different receivers. In many environments it
   cannot be guaranteed that the decoder state of all media
   receivers is identical at any given point in time. The most
   obvious reason for such a possible misalignment of state is a
   loss that occurs on the path to only one of many media receivers.
   However, there are other not so obvious reasons, such as recent
   joins to the multipoint conference (be it by joining the
   multicast group or through additional mixer output). Different
   states can lead the media receivers to issue potentially
   contradicting H.271 messages (or one media receiver issuing an
   H.271 message that, when observed by the media sender, is not
   helpful for the other media receivers). A naive reaction of the
   media sender to these contradicting messages can lead to
   unpredictable and annoying results.

2. Combining messages from different media receivers in a media
   sender is a non-trivial task. As reasons, we note that these
   messages may be contradicting each other, and that their
   transport is unreliable (there may well be other reasons). In
   case of many H.271 messages (i.e. types 0, 2, 3, and 4), the
   algorithm for combining must be aware both of the
   network/protocol environment (i.e. with respect to congestion)
   and of the media codec employed, as H.271 messages of a given
   type can have different semantics for different media codecs.

3. The suppression of requests may need to go beyond the basic
   mechanisms described in AVPF (which are driven exclusively by
   timing and transport considerations on the protocol level). For
   example, a receiver is often required to refrain from (or delay)
   generating requests, based on information it receives from the
   media stream. For instance, it makes no sense for a receiver to
   issue a FIR when a transmission of an Intra/IDR picture is
   ongoing.

4. When using the non-multicast-safe messages (e.g. H.271 type 0
   positive ACK of received pictures/slices) in larger multicast
   groups, the media receiver will likely be forced to delay or even
   omit sending these messages.
   For the media sender this looks like data has not been properly
   received (although it was received properly), and a naively
   implemented media sender reacts to these perceived problems where
   it should not.

3.5.3.1. Reliability

H.271 Video Back Channel messages do not require reliable
transmission, and confirmation of the reception of a message can be
derived from the forward video bit stream. Therefore, no specific
reception acknowledgement is specified.

With respect to re-sending rules, clause 3.5.1.1 applies.

3.5.4. Temporary Maximum Media Stream Bit Rate Request and Notification

A receiver, translator or mixer uses the Temporary Maximum Media
Stream Bit Rate Request (TMMBR, "timber") to request a sender to
limit the maximum bit rate for a media stream (see section 2.2) to,
or below, the provided value. The Temporary Maximum Media Stream Bit
Rate Notification (TMMBN) contains the media sender's current view
of the most limiting subset of the TMMBR-defined limits it has
received, to help the participants to suppress TMMBR requests that
would not further restrict the media sender. The primary usage for
the TMMBR/TMMBN messages is in a scenario with an MCU or mixer (use
case 6), corresponding to Topo-Translator or Topo-Mixer, but also to
Topo-Point-to-Point.

Each temporary limitation on the media stream is expressed as a
tuple. The first component of the tuple is the maximum total media
bit rate (as defined in section 2.2) that the media receiver is
currently prepared to accept for this media stream. The second
component is the per-packet overhead that the media receiver has
observed for this media stream at its chosen reference protocol
layer.

As indicated in section 2.2, the overhead as observed by the sender
of the TMMBR (i.e. the media receiver) may differ from the overhead
observed at the receiver of the TMMBR (i.e. the media sender) due to
use of a different reference protocol layer at the other end or due
to the intervention of translators or mixers that affect the amount
of per-packet overhead. For example, a gateway in between the two
that converts between IPv4 and IPv6 changes the per-packet overhead
by 20 bytes. Other mechanisms that change the overhead include
tunnels. The problem with varying overhead is also discussed in
[RFC3890]. As will be seen in the description of the algorithm for
use of TMMBR, the difference in perceived overhead between the
sending and receiving ends presents no difficulty because
calculations are carried out in terms of variables that have the
same value at the sender as at the receiver -- for example, packet
rate and net media rate.

Reporting both maximum total media bit rate and per-packet overhead
allows different receivers to provide bit rate and overhead values
for different protocol layers, for example at the IP level, at the
outer part of a tunnel protocol, or at the link layer. The protocol
level a peer reports on depends on the level of integration the peer
has, as it needs to be able to extract the information from that
protocol level. For example, an application with no knowledge of
the IP version it is running over cannot meaningfully determine the
overhead of the IP header, and hence will not want to include IP
overhead in the overhead or maximum total media bit rate
calculation.

It is expected that most peers will be able to report values at
least for the IP layer. In certain implementations it may be
advantageous to also include information pertaining to the link
layer, which in turn allows for a more precise overhead calculation
and a better optimization of connectivity resources.

The Temporary Maximum Media Stream Bit Rate messages are generic
messages that can be applied to any RTP packet stream.
This separates them from the other codec control messages defined in
this specification, which apply only to specific media types or
payload formats. The TMMBR functionality applies to the transport,
and the requirements the transport places on the media encoding.

The reasoning below assumes that the participants have negotiated a
session maximum bit rate, using a signaling protocol. This value
can be global, for example in case of point-to-point, multicast, or
translators. It may also be local between the participant and the
peer or mixer. In either case, the bit rate negotiated in signaling
is the one that the participant guarantees to be able to handle
(depacketize and decode). In practice, the connectivity of the
participant also influences the negotiated value -- it does not make
much sense to negotiate a total media bit rate that one's network
interface does not support.

It is also beneficial to have negotiated a maximum packet rate for
the session or sender. RFC 3890 provides an SDP [RFC4566] attribute
that can be used for this purpose; however, that attribute is not
usable in RTP sessions established using offer/answer [RFC3264].
Therefore an optional maximum packet rate signaling parameter is
specified in this memo.

An already established maximum total media bit rate may be changed
at any time, subject to the timing rules governing the sending of
feedback messages. The limit may change to any value between zero
and the session maximum, as negotiated during session establishment
signaling. However, even if a sender has received a TMMBR message
allowing an increase in the bit rate, all increases must be governed
by a congestion control mechanism. TMMBR indicates known
limitations only, usually in the local environment, and does not
provide any guarantees about the full path.
Furthermore, any increases in TMMBR-established bit rate limits are
to be executed only after a certain delay from the sending of the
TMMBN message that notifies the world about the increase in limit.
The delay is specified as at least twice the longest RTT as known by
the media sender, plus the media sender's calculation of the
required wait time for the sending of another TMMBR message for this
session based on AVPF timing rules. This delay is introduced to
allow other session participants to make known their bit rate limit
requirements, which may be lower.

If it is likely that the new value indicated by TMMBR will be valid
for the remainder of the session, the TMMBR sender is expected to
perform a renegotiation of the session upper limit using the session
signaling protocol.

3.5.4.1. Behavior for media receivers using TMMBR

This section is an informal description of behavior described more
precisely in section 4.2.

A media sender begins the session limited by the maximum media bit
rate and maximum packet rate negotiated in session signaling, if
any. Note that this value may be negotiated for another protocol
layer than the one the participant uses in its TMMBR messages. Each
media receiver selects a reference protocol layer, forms an estimate
of the overhead it is observing at that reference level (or
estimates it if no packets have been seen yet), and determines the
maximum total media bit rate it can accept, taking into account its
own limitations and any transport path limitations of which it may
be aware. In case the current limitations are more restrictive than
what was agreed on in the session signaling, the media receiver
reports its initial estimate of these two quantities to the media
sender using a TMMBR message.
Overall message traffic is reduced by the possibility of including
tuples for multiple media senders in the same TMMBR message.

The media sender applies an algorithm such as that specified in
section 3.5.4.2 to select which of the tuples it has received are
most limiting (i.e. the bounding set as defined in section 2.2). It
modifies its operation to stay within the feasible region (as
defined in section 2.2), and also sends out a TMMBN notification to
the media receivers indicating the selected bounding set.

If a media receiver does not own one of the tuples in the bounding
set reported by the TMMBN, it applies the same algorithm as the
media sender to determine if its current estimated (maximum total
media bit rate, overhead) tuple would enter the bounding set if
known to the media sender. If so, it issues a TMMBR request
reporting the tuple value to the sender. Otherwise it takes no
action for the moment. Periodically, its estimated tuple values may
change or it may receive a new TMMBN. If so, it reapplies the
algorithm to decide whether it needs to issue a TMMBR request.

If, alternatively, a media receiver owns one of the tuples in the
reported bounding set, it takes no action until such time as its
estimate of its own tuple values changes. At that time it sends a
TMMBR request to the media sender to report the changed values.

A media receiver may change status between owner and non-owner of a
bounding tuple between one TMMBN message and the next. Thus, it
must check the contents of each TMMBN to determine its subsequent
actions.

Implementations may use other algorithms of their choosing, as long
as the bit rate limitations resulting from the exchange of TMMBR and
TMMBN messages are at least as strict (at least as low, in the bit
rate dimension) as the ones resulting from the use of the
aforementioned algorithm.
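
The receiver-side decision described above can be sketched as
follows. This is a minimal illustration under stated assumptions,
not the procedure of section 4.2. The names TmmbrTuple, envelope,
would_enter_bounding_set and should_send_tmmbr are hypothetical. The
sketch assumes that every tuple has a positive overhead and that a
tuple limits the net media bit rate at a given packet rate to the
maximum total media bit rate minus packet rate times overhead times
8, as in section 3.5.4.2. Instead of the list-based algorithm it
uses a direct geometric test: a new tuple matters if its line falls
below the existing limits at some feasible packet rate, which only
needs checking at packet rate zero, at the highest feasible packet
rate, and at the crossing points of the existing lines.

```python
from dataclasses import dataclass
from itertools import combinations

EPS = 1e-6  # tolerance for floating-point comparisons (assumption)


@dataclass(frozen=True)
class TmmbrTuple:
    owner: int             # SSRC of the receiver that reported it
    max_total_bps: float   # maximum total media bit rate
    overhead_bytes: float  # per-packet overhead, assumed > 0

    def net_bps(self, pr):
        """Net media bit rate allowed at packet rate pr (packets/s)."""
        return self.max_total_bps - pr * self.overhead_bytes * 8


def envelope(bounding_set, pr):
    """Lowest net media bit rate allowed by any tuple at rate pr."""
    return min(t.net_bps(pr) for t in bounding_set)


def would_enter_bounding_set(candidate, bounding_set, smaxpr=None):
    """True if candidate is more limiting at some feasible rate."""
    if not bounding_set:
        # No TMMBR limit known yet; comparing against the limits from
        # session signaling is outside the scope of this sketch.
        return True
    # Highest feasible packet rate: where the envelope reaches zero,
    # capped by the session maximum packet rate if one was signaled.
    limits = [t.max_total_bps / (t.overhead_bytes * 8)
              for t in bounding_set]
    if smaxpr is not None:
        limits.append(smaxpr)
    pr_max = min(limits)
    # The gap between the envelope and a straight line is piecewise
    # linear, so its largest value lies at an end point or a corner.
    rates = {0.0, pr_max}
    for a, b in combinations(bounding_set, 2):
        if a.overhead_bytes != b.overhead_bytes:
            pr = ((a.max_total_bps - b.max_total_bps)
                  / (8 * (a.overhead_bytes - b.overhead_bytes)))
            if 0 < pr < pr_max:
                rates.add(pr)
    return any(candidate.net_bps(pr) < envelope(bounding_set, pr) - EPS
               for pr in rates)


def should_send_tmmbr(mine, bounding_set, smaxpr=None):
    """Decide on a TMMBR after a TMMBN or a change of own estimate."""
    owned = [t for t in bounding_set if t.owner == mine.owner]
    if owned:
        # Owner: report only when the estimated values have changed.
        return mine not in owned
    return would_enter_bounding_set(mine, bounding_set, smaxpr)
```

With a reported bounding set of (35 kbps, 40 bytes) and (40 kbps,
60 bytes), a non-owner estimating (36 kbps, 50 bytes) would send a
TMMBR under this sketch, because its line passes below the point
where the two existing lines cross, while a non-owner estimating
(50 kbps, 40 bytes) would stay silent.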

Obviously, in point-to-point cases, when there is only one media
receiver, this receiver becomes "owner" once it receives the first
TMMBN in response to its own TMMBR, and stays "owner" for the rest
of the session. Therefore, when it is known that there will always
be only a single media receiver, the above algorithm is not
required. Media receivers that are aware they are the only ones in
a session can send TMMBR messages with bit rate limits both higher
and lower than the previously notified limit, at any time (subject
to the AVPF [RFC4585] RTCP RR send timing rules). However, it may
be difficult for a session participant to determine if it is the
only receiver in the session. Because of this, any implementation of
TMMBR is required to include the algorithm described in the next
section or a stricter equivalent.

3.5.4.2. Algorithm for establishing current limitations

This section introduces an example algorithm for the calculation of
a session limit. Other algorithms can be employed, as long as the
result of the calculation is at least as restrictive as the result
that is obtained by this algorithm.

First, it is important to consider the implications of using a tuple
for limiting the media sender's behavior. The bit rate and the
overhead value result in a two-dimensional solution space for the
calculation of the bit rate of media streams. Fortunately, the two
variables are linked. Specifically, the bit rate available for RTP
payloads is equal to the TMMBR reported bit rate minus the product
of the packet rate used and the TMMBR reported overhead converted to
bits. As a result, when different bit rate/overhead combinations
need to be considered, the packet rate determines the correct
limitation.
This is perhaps best explained by an example:

Example:

   Receiver A: TMMBR_max total BR = 35 kbps, TMMBR_OH = 40 bytes
   Receiver B: TMMBR_max total BR = 40 kbps, TMMBR_OH = 60 bytes

For a given packet rate (PR) the bit rate available for media
payloads in RTP will be:

   Max_net media_BR_A =
      TMMBR_max total BR_A - PR * TMMBR_OH_A * 8        ... (1)

   Max_net media_BR_B =
      TMMBR_max total BR_B - PR * TMMBR_OH_B * 8        ... (2)

For a PR = 20 these calculations will yield a Max_net media_BR_A =
28600 bps and Max_net media_BR_B = 30400 bps, which suggests that
receiver A is the limiting one for this packet rate. However, at a
certain PR there is a switchover point at which receiver B becomes
the limiting one. The switchover point can be identified by setting
Max_net media_BR_A equal to Max_net media_BR_B and breaking out PR:

        TMMBR_max total BR_A - TMMBR_max total BR_B
   PR = -------------------------------------------     ... (3)
               8*(TMMBR_OH_A - TMMBR_OH_B)

which, for the numbers above, yields 31.25 as the switchover point
between the two limits. That is, for packet rates below 31.25 per
second, receiver A is the limiting receiver, and for higher packet
rates, receiver B is more limiting. The implications of this
behavior have to be considered by implementations that are going to
control media encoding and its packetization. As exemplified above,
multiple TMMBR limits may apply to the trade-off between net media
bit rate and packet rate. Which limitation applies depends on the
packet rate being considered.

This also has implications for how the TMMBR mechanism needs to
work. First, there is the possibility that multiple TMMBR tuples
are providing limitations on the media sender.
Secondly, there is a need for any session participant (media sender
and receivers) to be able to determine if a given tuple will become
a limitation upon the media sender, or if the set of already given
limitations is stricter than the given values. In the absence of
the ability to make this determination the suppression of TMMBR
requests would not work.

The basic idea of the algorithm is as follows. Each TMMBR tuple can
be viewed as the equation of a straight line (cf. equations (1) and
(2)) in a space where packet rate lies along the X-axis and maximum
bit rate lies along the Y-axis. The lower envelope of the set of
lines corresponding to the complete set of TMMBR tuples, together
with the X and Y axes, defines a polygon. Points lying within this
polygon are combinations of packet rate and bit rate that meet all
of the TMMBR constraints. The highest feasible packet rate within
this region is the minimum of the rate at which the bounding polygon
meets the X-axis and the session maximum packet rate (SMAXPR,
measured in packets per second) provided by signaling, if any.
Typically a media sender will prefer to operate at a lower rate than
this theoretical maximum, so as to increase the rate at which actual
media content reaches the receivers. The purpose of the algorithm
is to distinguish the TMMBR tuples constituting the bounding set and
thus delineate the feasible region, so that the media sender can
select its preferred operating point within that region.

Figure 1 below shows a bounding polygon formed by TMMBR tuples A and
B. A third tuple C lies outside the bounding polygon and is
therefore irrelevant in determining feasible tradeoffs between media
rate and packet rate. The line labeled ss..s represents the limit
on packet rate imposed by the session maximum packet rate (SMAXPR)
obtained by signaling during session setup.
In Figure 1 the limit determined by tuple B happens to be more
restrictive than SMAXPR. The situation could easily be the reverse,
meaning that the bounding polygon is terminated on the right by the
vertical line representing the SMAXPR constraint.

   Net  ^
   Media|a   c   b             s
   Bit  |  a   c  b            s
   Rate |    a   c b           s
        |      a   cb          s
        |        a   c         s
        |          a  bc       s
        |            a b c     s
        |              ab  c   s
        |  Feasible      b   c s
        |   region        ba   s
        |                  b a s c
        |                   b  s   c
        |                    b s a
        |_____________________bs________
        +------------------------------>

                  Packet rate

     Figure 1 - Geometric Interpretation of TMMBR Tuples

Note that the slopes of the lines making up the bounding polygon are
increasingly negative as one moves in the direction of increasing
packet rate. Note also that with slight rearrangement, equations
(1) and (2) have the canonical form:

   y = mx + b

where
   m is the slope and has value equal to the negative of the tuple
   overhead (in bits),
and
   b is the y-intercept and has value equal to the tuple maximum
   total media bit rate.

These observations lead to the conclusion that when processing the
TMMBR tuples to select the initial bounding set, one should sort and
process the tuples by order of increasing overhead. Once a
particular tuple has been added to the bounding set, all tuples not
already selected and having lower overhead can be eliminated,
because the next side of the bounding polygon has to be steeper
(i.e. the corresponding TMMBR must have higher overhead) than the
latest added tuple.

Line cc..c in Figure 1 illustrates another principle. This line is
parallel to line aa..a, but has a higher Y-intercept. That is, the
corresponding TMMBR tuple contains a higher maximum total media bit
rate value.
Since line cc..c is outside the bounding polygon, it illustrates the
conclusion that if two TMMBR tuples have the same overhead value, the
one with higher maximum total media bit rate value cannot be part of
the bounding set and can be set aside.

Two further observations complete the algorithm. Obviously, moving
from the left, the successive corners of the bounding polygon (i.e.
the intersection points between successive pairs of sides) lie at
successively higher packet rates. On the other hand, again moving
from the left, each successive line making up the bounding set
crosses the X-axis at a lower packet rate.

The complete algorithm can now be specified. The algorithm works
with two lists of TMMBR tuples, the candidate list X and the
selected list Y, both ordered by increasing overhead value. The
algorithm terminates when all members of X have been discarded or
removed for processing. Membership of the selected list Y is
probationary until the algorithm is complete. Each member of the
selected list is associated with an intersection value, which is the
packet rate at which the line corresponding to that TMMBR tuple
intersects with the line corresponding to the previous TMMBR tuple
in the selected list. Each member of the selected list is also
associated with a maximum packet rate value, which is the lesser of
the session maximum packet rate SMAXPR (if any) and the packet rate
at which the line corresponding to that tuple crosses the X-axis.

When the algorithm terminates, the selected list is equal to the
bounding set as defined in section 2.2.

Initial Algorithm

This algorithm is used by the media sender when it has received one
or more TMMBR requests and before it has determined a bounding set
for the first time.

1. Sort the TMMBR tuples by order of increasing overhead. This is
   the initial candidate list X.

2. When multiple tuples in the candidate list have the same overhead
   value, discard all but the one with the lowest maximum total media
   bit rate value.

3. Select and remove from the candidate list the TMMBR tuple with the
   lowest maximum total media bit rate value. If there is more than
   one tuple with that value, choose the one with the highest
   overhead value. This is the first member of the selected list Y.
   Set its intersection value equal to zero. Calculate its maximum
   packet rate as the minimum of SMAXPR (if available) and the value
   obtained from the following formula, which is the packet rate at
   which the corresponding line crosses the X-axis.

      Max PR = TMMBR max total BR / (8 * TMMBR OH)    ... (4)

4. Discard from the candidate list all tuples with a lower overhead
   value than the selected tuple.

5. Remove the first remaining tuple from the candidate list for
   processing. Call this the current candidate.

6. Calculate the packet rate PR at the intersection of the line
   generated by the current candidate with the line generated by the
   last tuple in the selected list Y, using equation (3).

7. If the calculated value PR is equal to or lower than the
   intersection value stored for the last tuple of the selected list,
   discard the last tuple of the selected list and go back to step 6
   (retaining the same current candidate).

   Note that the choice of the initial member of the selected list Y
   in step 3 guarantees that the selected list will never be emptied
   by this process, meaning that the algorithm must eventually (if
   not immediately) fall through to step 8.

8. (This step is reached when the calculated PR value of the current
   candidate is greater than the intersection value of the current
   last member of the selected list Y.)
   If the calculated value PR of the current candidate is lower than
   the maximum packet rate associated with the last tuple in the
   selected list, add the current candidate tuple to the end of the
   selected list. Store PR as its intersection value. Calculate its
   maximum packet rate as the lesser of SMAXPR (if available) and the
   maximum packet rate calculated using equation (4).

9. If any tuples remain in the candidate list, go back to step 5.

Incremental Algorithm

The previous algorithm covered the initial case, where no selected
list had previously been created. It also applied only to the media
sender. When a previously-created selected list is available at
either the media sender or media receiver, two other cases can be
considered:

o  when a TMMBR tuple not currently in the selected list is a
   candidate for addition;

o  when the values change in a TMMBR tuple currently in the
   selected list.

At the media receiver these cases correspond respectively to those
of the non-owner and owner of a tuple in the TMMBN-reported bounding
set.

In either case, the process of updating the selected list to take
account of the new/changed tuple can use the basic algorithm
described above, with the modification that the initial candidate
set consists only of the existing selected list and the new or
changed tuple. Some further optimization is possible (beyond
starting with a reduced candidate set) by taking advantage of the
following observations.

The first observation is that if the new/changed candidate becomes
part of the new selected list, the result may be to cause zero or
more other tuples to be dropped from the list. However, if more
than one other tuple is dropped, the dropped tuples will be
consecutive.
This can be confirmed geometrically by visualizing a new line that
cuts off a series of segments from the previously-existing bounding
polygon. The cut-off segments are connected one to the next, the
geometric equivalent of consecutive tuples in a list ordered by
overhead value. Beyond the dropped set in either direction all of
the tuples that were in the earlier selected list will be in the
updated one. The second observation is that, leaving aside the new
candidate, the order of tuples remaining in the updated selected
list is unchanged because their overhead values have not changed.

The consequence of these two observations is that, once the
placement of the new candidate and the extent of the dropped set of
tuples (if any) has been determined, the remaining tuples can be
copied directly from the candidate list into the selected list,
preserving their order. This conclusion suggests the following
modified algorithm:

o  Run steps 1-4 of the basic algorithm.

o  If the new candidate has survived steps 2 and 4 and has become
   the new first member of the selected list, run steps 5-9 on
   subsequent candidates until another candidate is added to the
   selected list. Then move all remaining candidates to the
   selected list, preserving their order.

o  If the new candidate has survived steps 2 and 4 and has not
   become the new first member of the selected list, start by
   moving all tuples in the candidate list with lower overhead
   values than that of the new candidate to the selected list,
   preserving their order. Run steps 5 through 9 for the new
   candidate, with the modification that the intersection values
   and maximum packet rates for the tuples on the selected list
   have to be calculated on the fly because they were not
   previously stored.
   Continue processing only until a subsequent tuple has been added
   to the selected list, then move all remaining candidates to the
   selected list, preserving their order.

   Note that the new candidate could be added to the selected
   list only to be dropped again when the next tuple is
   processed. It can easily be seen that in this case the new
   candidate does not displace any of the earlier tuples in the
   selected list. The limitations of ASCII art make this
   difficult to show in a figure. Line cc..c in Figure 1 would
   be an example if it had a steeper slope (tuple C had a higher
   overhead value), but still intersected line aa..a beyond where
   line aa..a intersects line bb..b.

The algorithm just described is approximate, because it does not
take account of tuples outside the selected list. To see how such
tuples can become relevant, consider Figure 1 and suppose that the
maximum total media bit rate in tuple A increases to the point that
line aa..a moves outside line cc..c. Tuple A will remain in the
bounding set calculated by the media sender. However, once it
issues a new TMMBN, media receiver C will apply the algorithm and
discover that its tuple C should now enter the bounding set. It
will issue a TMMBR request to the media sender, which will repeat
its calculation and come to the appropriate conclusion.

The rules of section 4.2 require that the media sender refrain from
raising its sending rate until media receivers have had a chance to
respond to the TMMBN. In the example just given, this delay ensures
that the relaxation of tuple A does not actually result in an
attempt to send media at a rate exceeding the capacity at C.

3.5.4.3. Use of TMMBR in a Mixer Based Multipoint Operation

Assume a small mixer-based multiparty conference is ongoing, as
depicted in Topo-Mixer of [Topologies].
All participants have negotiated a common maximum bit rate that this
session can use. The conference operates over a number of unicast
paths between the participants and the mixer. The congestion
situation on each of these paths can be monitored by the participant
in question and by the mixer, utilizing, for example, RTCP receiver
reports (RR) or the transport protocol, e.g. DCCP [RFC4340].
However, any given participant has no knowledge of the congestion
situation of the connections to the other participants. Worse,
without mechanisms similar to the ones discussed in this draft, the
mixer (which is aware of the congestion situation on all connections
it manages) has no standardized means to inform media senders to
slow down, short of forging its own receiver reports (which is
undesirable). In principle, a mixer confronted with such a
situation is obliged to thin or transcode streams intended for
connections that detected congestion.

In practice, unfortunately, media-aware stream thinning is a very
difficult and cumbersome operation and adds undesirable delay. If
media-unaware, it leads very quickly to unacceptable reproduced
media quality. Hence, a means to slow down senders even in the
absence of congestion on their connections to the mixer is
desirable.

To allow the mixer to throttle traffic on the individual links,
without performing transcoding, there is a need for a mechanism that
enables the mixer to ask a participant's media encoders to limit the
media stream bit rate they are currently generating. TMMBR provides
the required mechanism. When the mixer detects congestion between
itself and a given participant, it executes the following procedure:

1. It starts thinning the media traffic to the congested participant
   to the supported bit rate.

2. It uses TMMBR to request the media sender(s) to reduce the total
   media bit rate sent by them to the mixer, to a value that is in
   compliance with congestion control principles for the slowest
   link. Slow refers here to the available bandwidth / bit rate /
   capacity and packet rate after congestion control.

3. As soon as the bit rate has been reduced by the sending party, the
   mixer stops stream thinning implicitly, because there is no need
   for it once the stream is in compliance with congestion control.

This use of stream thinning as an immediate reaction tool followed
up by a quick control mechanism appears to be a reasonable
compromise between media quality and the need to combat congestion.

3.5.4.4. Use of TMMBR in Point-to-Multipoint Using Multicast or
         Translators

In these topologies, corresponding to Topo-Multicast or
Topo-Translator, RTCP RRs are transmitted globally. This allows all
participants to detect transmission problems such as congestion, on
a medium timescale. As all media senders are aware of the
congestion situation of all media receivers, the rationale for the
use of TMMBR in the previous section does not apply. However, even
in this case the congestion control response can be improved when
the unicast links are using congestion controlled transport
protocols (such as TCP or DCCP). A peer may also report local
limitations to the media sender.

3.5.4.5. Use of TMMBR in Point-to-point operation

In use case 7 it is possible to use TMMBR to improve the performance
when the known upper limit of the bit rate changes. In this use
case the signaling protocol has established an upper limit for the
session and total media bit rates. However, at the time of
transport link bit rate reduction, a receiver can avoid serious
congestion by sending a TMMBR to the sending side.
Thus, TMMBR is useful for putting restrictions on the application
and thus placing the congestion control mechanism in the right
ballpark. However, TMMBR is usually unable to provide the
continuously quick feedback loop required for real congestion
control. Nor do its semantics match those of congestion control
given its different purpose. For these reasons TMMBR SHALL NOT be
used as a substitute for congestion control.

3.5.4.6. Reliability

The reaction of a media sender to the reception of a TMMBR message
is not immediately identifiable through inspection of the media
stream. Therefore, a more explicit mechanism is needed to avoid
unnecessary re-sending of TMMBR messages. Using a statistically
based retransmission scheme would only provide statistical
guarantees of the request being received. It would also not avoid
the retransmission of already received messages. In addition, it
would not allow for easy suppression of other participants'
requests. For these reasons, a mechanism based on explicit
notification is used.

Upon the reception of a request a media sender sends a TMMBN
notification containing the current bounding set, and indicating
which session participants own that limit. In multicast scenarios,
that allows all other participants to suppress any request they may
have, if their limitations are less strict than the current ones
(i.e. define lines lying outside the feasible region as defined in
section 2.2). Keeping and notifying only the bounding set of tuples
allows for small message sizes and media sender states. A media
sender only keeps state for the SSRCs of the current owners of the
bounding set of tuples; all other requests and their sources are not
saved.
Once the bounding set has been established, new TMMBR messages
should be generated only by owners of the bounding tuples and by
other entities that determine (by applying the algorithm of section
3.5.4.2 or its equivalent) that their limitations should now be part
of the bounding set.

4. RTCP Receiver Report Extensions

This memo specifies six new feedback messages. The Full Intra
Request (FIR), Temporal-Spatial Trade-off Request (TSTR),
Temporal-Spatial Trade-off Notification (TSTN), and Video Back
Channel Message (VBCM) are "Payload Specific Feedback Messages" as
defined in Section 6.3 of AVPF [RFC4585]. The Temporary Maximum
Media Stream Bit Rate Request (TMMBR) and Temporary Maximum Media
Stream Bit Rate Notification (TMMBN) are "Transport Layer Feedback
Messages" as defined in Section 6.2 of AVPF.

The new feedback messages are defined in the following subsections,
following a similar structure to that in sections 6.2 and 6.3 of the
AVPF specification [RFC4585].

4.1. Design Principles of the Extension Mechanism

RTCP was originally introduced as a channel to convey presence,
reception quality statistics and hints on the desired media coding.
A limited set of media control mechanisms were introduced in early
RTP payload formats for video formats, for example in RFC 2032
[RFC2032]. However, this specification, for the first time,
suggests a two-way handshake for some of its messages. There is
danger that this introduction could be misunderstood as a precedent
for the use of RTCP as an RTP session control protocol. To prevent
such a misunderstanding, this subsection attempts to clarify the
scope of the extensions specified in this memo, and strongly
suggests that future extensions follow the rationale spelled out
here, or compellingly explain why they divert from the rationale.

In this memo, and in AVPF [RFC4585], only such messages have been
included as:

a) have comparatively strict real-time constraints, which prevent
   the use of mechanisms such as a SIP re-invite in most application
   scenarios. The real-time constraints are explained separately
   for each message where necessary.

b) are multicast-safe in that the reaction to potentially
   contradicting feedback messages is specified, as necessary for
   each message; and

c) are directly related to activities of a certain media codec,
   class of media codecs (e.g. video codecs), or a given RTP packet
   stream.

In this memo, a two-way handshake is introduced only for messages
for which:

a) a notification or acknowledgement is required due to their
   nature. An analysis to determine whether this requirement exists
   has been performed separately for each message.

b) the notification or acknowledgement cannot be easily derived from
   the media bit stream.

All messages in AVPF [RFC4585] and in this memo present their
contents in a simple, fixed binary format. This accommodates media
receivers which have not implemented higher control protocol
functionalities (SDP, XML parsers and such) in their media path.

Messages that do not conform to the design principles just described
are not an appropriate use of RTCP or of the Codec Control Framework
defined in this document.

4.2. Transport Layer Feedback Messages

As specified in section 6.1 of RFC 4585 [RFC4585], Transport Layer
Feedback messages are identified by the RTCP packet type value RTPFB
(205).

In AVPF, one message of this category had been defined. This memo
specifies two more such messages.
They are identified by means of the FMT parameter as follows:

Assigned in AVPF [RFC4585]:

   1:    Generic NACK
   31:   reserved for future expansion of the identifier number
         space

Assigned in this memo:

   2:    reserved (see note below)
   3:    Temporary Maximum Media Stream Bit Rate Request (TMMBR)
   4:    Temporary Maximum Media Stream Bit Rate Notification
         (TMMBN)

   Note: early drafts of AVPF [RFC4585] reserved FMT=2 for a
   code point that has later been removed. It has been pointed
   out that there may be implementations in the field using this
   value in accordance with the expired draft. As there is
   sufficient numbering space available, we mark FMT=2 as
   reserved so as to avoid possible interoperability problems
   with any such early implementations.

Available for assignment:

   0:    unassigned
   5-30: unassigned

The following subsections define the formats of the FCI entries for
the TMMBR and TMMBN messages respectively and specify the associated
behaviour at the media sender and receiver.

4.2.1. Temporary Maximum Media Stream Bit Rate Request (TMMBR)

The Temporary Maximum Media Stream Bit Rate Request is identified by
RTCP packet type value PT=RTPFB and FMT=3.

The FCI field of a Temporary Maximum Media Stream Bit-Rate Request
(TMMBR) message SHALL contain one or more FCI entries.

4.2.1.1. Message Format

The Feedback Control Information (FCI) consists of one or more TMMBR
FCI entries with the following syntax:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 2 - Syntax of an FCI entry in the TMMBR message

SSRC (32 bits): The SSRC value of the media sender that is
         requested to obey the new maximum bit rate.

MxTBR Exp (6 bits): The exponential scaling of the mantissa for
         the maximum total media bit rate value. The value is an
         unsigned integer [0..63].

MxTBR Mantissa (17 bits): The mantissa of the maximum total media
         bit rate value as an unsigned integer.

Measured Overhead (9 bits): The measured average packet overhead
         value in bytes. The measurement SHALL be done according
         to the description in section 4.2.1.2. The value is an
         unsigned integer [0..511].

The maximum total media bit rate (MxTBR) value in bits per second is
calculated from the MxTBR exponent (exp) and mantissa in the
following way:

   MxTBR = mantissa * 2^exp

This allows for 17 bits of resolution in the range 0 to 131071*2^63
(approximately 1.2*10^24).

The length of the TMMBR feedback message SHALL be set to 2+2*N where
N is the number of TMMBR FCI entries.

4.2.1.2. Semantics

Behaviour at the Media Receiver (Sender of the TMMBR)

TMMBR is used to indicate a transport related limitation at the
reporting entity acting as a media receiver. TMMBR has the form of
a tuple containing two components.
The first value is the highest bit rate per sender of a media
stream, available at a receiver-chosen protocol layer, which the
receiver currently supports in this RTP session. The second value
is the measured header overhead in bytes as defined in section 2.2
and measured at the chosen protocol layer in the packets received
for the stream. The measurement of the overhead is a running
average that is updated for each packet received for this particular
media source (SSRC), using the following formula:

   avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH,

where avg_OH is the running (exponentially smoothed) average and
pckt_OH is the overhead observed in the latest packet.

If a maximum bit rate has been negotiated through signaling, the
maximum total media bit rate that the receiver reports in a TMMBR
message MUST NOT exceed the negotiated value converted to a common
basis (i.e. with overheads adjusted to bring it to the same
reference protocol layer).

Within the common packet header for feedback messages (as defined in
section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
indicates the source of the request, and the "SSRC of media source"
is not used and SHALL be set to 0. Within a particular TMMBR FCI
entry, the "SSRC of media sender" in the FCI field denotes the media
sender the tuple applies to. This is useful in the multicast or
translator topologies where the reporting entity may address all of
the media senders in a single TMMBR message using multiple FCI
entries.

The media receiver SHALL save the contents of the latest TMMBN
message received from each media sender.
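
As an informal illustration of the avg_OH running average defined
above, the following Python sketch applies the update once per
received packet. The function names, the seeding of the average with
the first observed overhead, and the rounding and clamping of the
result for the 9-bit Measured Overhead field are assumptions of this
sketch; the text above specifies only the update formula itself.

```python
def update_avg_overhead(avg_oh, pckt_oh):
    """One step of avg_OH(new) = 15/16*avg_OH(old) + 1/16*pckt_OH."""
    return (15.0 / 16.0) * avg_oh + (1.0 / 16.0) * pckt_oh


def running_overhead(packet_overheads):
    """Fold per-packet overheads (in bytes) into one smoothed average.

    Assumption: the average is seeded with the first observed
    overhead, since no initial value is specified in the text.
    Returns None if no packet has been observed.
    """
    avg_oh = None
    for pckt_oh in packet_overheads:
        if avg_oh is None:
            avg_oh = float(pckt_oh)
        else:
            avg_oh = update_avg_overhead(avg_oh, pckt_oh)
    return avg_oh


def overhead_field(avg_oh):
    """Integer value for the 9-bit Measured Overhead field.

    Rounding to nearest and clamping to 0..511 are assumptions.
    """
    return max(0, min(511, int(round(avg_oh))))


# Two packets with 40 bytes of overhead, then one with 56 bytes.
avg = running_overhead([40, 40, 56])
print(avg, overhead_field(avg))
```

Because the weight of each new sample is only 1/16, a single packet
with unusual overhead moves the reported value only slightly.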

The media receiver MAY send a TMMBR FCI entry to a particular media
sender under the following circumstances:

o  before any TMMBN message has been received from that media
   sender;

o  when the media receiver has been identified as the source of a
   bounding tuple within the latest TMMBN message received from
   that media sender, and the value of the maximum total media
   bit rate or the overhead relating to that media sender has
   changed;

o  when the media receiver has not been identified as the source
   of a bounding tuple within the latest TMMBN message received
   from that media sender, and, after the media receiver applies
   the incremental algorithm from section 3.5.4.2 or a stricter
   equivalent, the media receiver's tuple relating to that media
   sender is determined to belong to the bounding set.

A TMMBR FCI entry MAY be repeated in subsequent TMMBR messages if no
Temporary Maximum Media Stream Bit-Rate Notification (TMMBN) FCI has
been received from the media sender at the time of transmission of
the next RTCP packet. The bit rate value of a TMMBR FCI entry MAY
be changed from one TMMBR message to the next. The overhead
measurement SHALL be updated to the current value of avg_OH each
time the entry is sent.

If the value set by a TMMBR message is expected to be permanent, the
TMMBR setting party SHOULD renegotiate the session parameters to
reflect that using session setup signaling, e.g. a SIP re-invite.

Behaviour at the Media Sender (Receiver of the TMMBR)

When it receives a TMMBR message containing an FCI entry relating to
it, the media sender SHALL use an initial or incremental algorithm
as applicable to determine the bounding set of tuples based on the
new information. The algorithm used SHALL be at least as strict as
the corresponding algorithm defined in section 3.5.4.2.
The media sender MAY accumulate TMMBR requests over a small interval
(relative to the RTCP sending interval) before making this
calculation.

Once it has determined the bounding set of tuples, the media sender
MAY use any combination of packet rate and net media bit rate within
the feasible region that these tuples describe to produce a lower
total media stream bit rate, as it may need to address a congestion
situation or other limiting factors. See section 5 (congestion
control) for more discussion.

If the media sender concludes that it can increase the maximum total
media bit rate value, it SHALL wait before actually doing so, for a
period long enough to allow a media receiver to respond to the TMMBN
if it determines that its tuple belongs in the bounding set. This
delay period is estimated by the formula:

   2 * RTT + T_Dither_Max,

where RTT is the longest round trip time known to the media sender
and T_Dither_Max is defined in section 3.4 of [RFC4585]. Even in
point-to-point sessions a media sender MUST obey the aforementioned
rule, as it is not guaranteed that a participant is able to
determine correctly whether all the sources are co-located in a
single node, and are coordinated.

A TMMBN message SHALL be sent by the media sender at the earliest
possible point in time, in response to any TMMBR messages received
since the last sending of TMMBN. The TMMBN message indicates the
calculated set of bounding tuples and the owners of those tuples at
the time of the transmission of the message.

An SSRC may time out according to the default rules for RTP session
participants, i.e. the media sender has not received any RTP or RTCP
packets from the owner for the last five regular reporting
intervals.
An SSRC may also explicitly leave the session, with the participant
indicating this through the transmission of an RTCP BYE packet or
using an external signaling channel. If the media sender determines
that the owner of a tuple in the bounding set has left the session,
the media sender shall transmit a new TMMBN containing the
previously-determined set of bounding tuples but with the tuple
belonging to the departed owner removed.

A media sender MAY proactively initiate the equivalent of a TMMBR
message to itself, when it is aware that its transmission path is
more restrictive than the current limitations. As a result, a TMMBN
indicating the media source itself as the owner of a tuple is sent,
thereby avoiding unnecessary TMMBR messages from other participants.
However, like any other participant, when the media sender becomes
aware of changed limitations, it is required to change the tuple,
and to send a corresponding TMMBN.

Discussion

Due to the unreliable nature of transport of TMMBR and TMMBN, the
above rules may lead to the sending of TMMBR messages which appear
to disobey those rules. Furthermore, in multicast scenarios it can
happen that more than one "non-owning" session participant may
determine, rightly or wrongly, that its tuple belongs in the
bounding set. This is not critical for a number of reasons:

a) If a TMMBR message is lost in transmission, either the media
   sender sends a new TMMBN message in response to some other media
   receiver or it does not send a new TMMBN message at all. In the
   first case, the media receiver applies the incremental algorithm
   and, if it determines that its tuple should be part of the
   bounding set, sends out another TMMBR. In the second case, it
   repeats the sending of a TMMBR unconditionally. Either way, the
   media sender eventually gets the information it needs.

b) Similarly, if a TMMBN message gets lost, the media receiver that
   has sent the corresponding TMMBR request does not receive the
   notification and is expected to re-send the request and trigger
   the transmission of another TMMBN.

c) If multiple competing TMMBR messages are sent by different
   session participants, then the algorithm can be applied taking
   all of these messages into account, and the resulting TMMBN
   provides the participants with an updated view of how their
   tuples compare with the bounding set.

d) If more than one session participant happens to send TMMBR
   messages at the same time and with the same tuple component
   values, it does not matter which of those tuples is taken into
   the bounding set. The losing session participant will determine,
   after applying the algorithm, that its tuple does not enter the
   bounding set, and will therefore stop sending its TMMBR request.

It is important to consider the security risks involved with faked
TMMBRs. See the security considerations in Section 6.

As indicated already, the feedback messages may be used in both
multicast and unicast sessions in any of the specified topologies.
However, for sessions with a large number of participants, using the
lowest common denominator, as required by this mechanism, may not be
the most suitable course of action. Large sessions may need to
consider other ways to adapt the bit rate to participants'
capabilities, such as partitioning the session into different
quality tiers, or using some other method of achieving bit rate
scalability.

4.2.1.3. Timing Rules

The first transmission of the TMMBR request message MAY use early or
immediate feedback in cases when timeliness is desirable. Any
repetition of a request message SHOULD use regular RTCP mode for its
transmission timing.

4.2.1.4. Handling in Translators and Mixers

Media translators and mixers will need to receive and respond to
TMMBR messages as they are part of the chain that provides a certain
media stream to the receiver. The mixer or translator may act
locally on the TMMBR request and thus generate a TMMBN to indicate
that it has done so. Alternatively, in the case of a media
translator it can forward the request, or in the case of a mixer
generate one of its own and pass it forward. In the latter case,
the mixer will need to send a TMMBN back to the original requestor
to indicate that it is handling the request.

4.2.2. Temporary Maximum Media Stream Bit Rate Notification (TMMBN)

The Temporary Maximum Media Stream Bit Rate Notification is
identified by RTCP packet type value PT=RTPFB and FMT=4.

The FCI field of the TMMBN Feedback message may contain zero, one or
more TMMBN FCI entries.

4.2.2.1. Message Format

The Feedback Control Information (FCI) consists of zero, one or more
TMMBN FCI entries with the following syntax:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 3 - Syntax of an FCI entry in the TMMBN message

SSRC (32 bits): The SSRC value of the "owner" of this tuple.

MxTBR Exp (6 bits): The exponential scaling of the mantissa for
         the maximum total media bit rate value. The value is an
         unsigned integer [0..63].

MxTBR Mantissa (17 bits): The mantissa of the maximum total media
         bit rate value as an unsigned integer.

Measured Overhead (9 bits): The measured average packet overhead
         value in bytes represented as an unsigned integer.
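
The FCI entry layout of Figure 3, which is the same 64-bit layout
used for TMMBR entries, can be illustrated with the following Python
sketch. It is not part of the specification. The function names are
hypothetical, and the choice to pick the smallest exponent and to
truncate the mantissa downwards (so that the encoded rate never
exceeds the requested one) is an assumption of this sketch. The field
widths, the field order, and the relation MxTBR = mantissa * 2^exp
are taken from the text.

```python
import struct


def pack_tmmb_entry(ssrc, max_total_bitrate, overhead):
    """Encode one FCI entry: SSRC (32), Exp (6), Mantissa (17), OH (9)."""
    if not 0 <= ssrc <= 0xFFFFFFFF:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= overhead <= 0x1FF:
        raise ValueError("overhead must fit in 9 bits")
    if max_total_bitrate < 0:
        raise ValueError("bit rate must be non-negative")
    exp = 0
    mantissa = max_total_bitrate
    # Assumption: shift right until the mantissa fits in 17 bits,
    # which rounds the encoded bit rate down.
    while mantissa > 0x1FFFF:
        mantissa >>= 1
        exp += 1
    if exp > 63:
        raise ValueError("bit rate too large for a 6-bit exponent")
    word = (exp << 26) | (mantissa << 9) | overhead
    # RTCP fields are carried in network (big-endian) byte order.
    return struct.pack("!II", ssrc, word)


def unpack_tmmb_entry(data):
    """Decode one 8-byte FCI entry into (ssrc, MxTBR in bit/s, overhead)."""
    ssrc, word = struct.unpack("!II", data[:8])
    exp = word >> 26
    mantissa = (word >> 9) & 0x1FFFF
    overhead = word & 0x1FF
    return ssrc, mantissa << exp, overhead


entry = pack_tmmb_entry(0x12345678, 1000000, 40)
print(entry.hex(), unpack_tmmb_entry(entry))
```

Rates of up to 131071 bit/s are encoded exactly with exp = 0; larger
values lose their low-order bits as the exponent grows.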
1861 Thus, the FCI within the TMMBN message contains entries indicating 1862 the bounding tuples. For each tuple, the entry gives the owner by 1863 the SSRC, followed by the applicable maximum total media bit rate 1864 and overhead value. 1866 The length of the TMMBN message SHALL be set to 2+2*N where N is the 1867 number of TMMBN FCI entries. 1869 4.2.2.2. Semantics 1871 This feedback message is used to notify the senders of any TMMBR 1872 message that one or more TMMBR messages have been received or that 1873 an owner has left the session. It indicates to all participants the 1874 current set of bounding tuples and the "owners" of those tuples. 1876 Within the common packet header for feedback messages (as defined in 1877 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 1878 indicates the source of the notification. The "SSRC of media 1879 source" is not used and SHALL be set to 0. 1881 A TMMBN message SHALL be scheduled for transmission after the 1882 reception of a TMMBR message with an FCI entry identifying this 1883 media sender. Only a single TMMBN SHALL be sent, even if more than 1884 one TMMBR message is received between the scheduling of the 1885 transmission and the actual transmission of the TMMBN message. The 1886 TMMBN message indicates the bounding tuples and their owners at the 1887 time of transmitting the message. The bounding tuples included 1888 SHALL be the set arrived at through application of the applicable 1889 algorithm of section 3.5.4.2 or an equivalent, applied to the 1890 previous bounding set if any and tuples received in TMMBR messages 1891 since the last TMMBN was transmitted. 1893 The reception of a TMMBR message SHALL still result in the 1894 transmission of a TMMBN message even if, after application of the 1895 algorithm, the newly reported TMMBR tuple is not accepted into the 1896 bounding set. 
In such a case the bounding tuples and their owners 1897 are not changed, unless the TMMBR was from an owner of a tuple 1898 within the previously calculated bounding set. This procedure 1899 allows session participants that did not see the last TMMBN message 1900 to get a correct view of this media sender's state. 1902 As indicated in section 4.2.1.2, when a media sender determines that 1903 an "owner" of a bounding tuple has left the session, then that tuple 1904 is removed from the bounding set, and the media sender SHALL send a 1905 TMMBN message indicating the remaining bounding tuples. If there 1906 are no remaining bounding tuples, a TMMBN without any FCI SHALL be 1907 sent to indicate this. Without a remaining bounding tuple, the 1908 maximum media bit rate and maximum packet rate negotiated in session 1909 signaling, if any, apply. 1911 Note: if any media receivers remain in the session, this 1912 will be a temporary situation. The empty TMMBN will cause every 1913 remaining media receiver to determine that its limitation belongs 1914 in the bounding set and send a TMMBR in consequence. 1916 In unicast scenarios (i.e. where a single sender talks to a single 1917 receiver), the aforementioned algorithm to determine ownership 1918 degenerates to the media receiver becoming the "owner" of the one 1919 bounding tuple as soon as the media receiver has issued the first 1920 TMMBR message. 1922 4.2.2.3. Timing Rules 1924 The TMMBN acknowledgement SHOULD be sent as soon as allowed by the 1925 applied timing rules for the session. Immediate or early feedback 1926 mode SHOULD be used for these messages. 1928 4.2.2.4. Handling by Translators and Mixers 1930 As discussed in Section 4.2.1.4, mixers or translators may need to 1931 issue TMMBN messages as responses to TMMBR messages for SSRCs 1932 handled by them. 1934 4.3. 
Payload Specific Feedback Messages 1936 As specified by section 6.1 of RFC 4585 [RFC4585], Payload-Specific 1937 FB messages are identified by the RTCP packet type value PSFB (206). 1939 AVPF [RFC4585] defines three payload-specific feedback messages and 1940 one application layer feedback message. This memo specifies four 1941 additional payload-specific feedback messages. All are identified 1942 by means of the FMT parameter as follows: 1944 Assigned in [RFC4585]: 1946 1: Picture Loss Indication (PLI) 1947 2: Slice Loss Indication (SLI) 1948 3: Reference Picture Selection Indication (RPSI) 1949 15: Application layer FB message 1950 31: reserved for future expansion of the number space 1952 Assigned in this memo: 1954 4: Full Intra Request Command (FIR) 1955 5: Temporal-Spatial Trade-off Request (TSTR) 1956 6: Temporal-Spatial Trade-off Notification (TSTN) 1957 7: Video Back Channel Message (VBCM) 1959 Unassigned: 1961 0: unassigned 1962 8-14: unassigned 1963 16-30: unassigned 1965 The following subsections define the new FCI formats for the 1966 payload-specific feedback messages. 1968 4.3.1. Full Intra Request (FIR) 1970 The FIR message is identified by RTCP packet type value PT=PSFB and 1971 FMT=4. 1973 The FCI field MUST contain one or more FIR entries. Each entry 1974 applies to a different media sender, identified by its SSRC. 1976 4.3.1.1. Message Format 1978 The Feedback Control Information (FCI) for the Full Intra Request 1979 consists of one or more FCI entries, the content of which is 1980 depicted in Figure 4. The length of the FIR feedback message MUST 1981 be set to 2+2*N, where N is the number of FCI entries. 1983 0 1 2 3 1984 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1985 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1986 | SSRC | 1987 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1988 | Seq. 
nr | Reserved | 1989 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1991 Figure 4 - Syntax of an FCI entry in the FIR message 1993 SSRC (32 bits): The SSRC value of the media sender which is 1994 requested to send a decoder refresh point. 1996 Seq. nr (8 bits): Command sequence number. The sequence number 1997 space is unique for each pairing of the SSRC of command 1998 source and the SSRC of the command target. The sequence 1999 number SHALL be increased by 1 modulo 256 for each new 2000 command. A repetition SHALL NOT increase the sequence 2001 number. The initial value is arbitrary. 2003 Reserved (24 bits): All bits SHALL be set to 0 by the sender and 2004 SHALL be ignored on reception. 2006 The semantics of this feedback message is independent of the RTP 2007 payload type. 2009 4.3.1.2. Semantics 2011 Within the common packet header for feedback messages (as defined in 2012 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 2013 indicates the source of the request, and the "SSRC of media source" 2014 is not used and SHALL be set to 0. The SSRCs of the media senders 2015 to which the FIR command applies are in the corresponding FCI 2016 entries. A FIR message MAY contain requests to multiple media 2017 senders, using one FCI entry per target media sender. 2019 Upon reception of FIR, the encoder MUST send a decoder refresh point 2020 (see section 2.2) as soon as possible. 2022 The sender MUST consider congestion control as outlined in section 2023 5, which MAY restrict its ability to send a decoder refresh point 2024 quickly. 2026 FIR SHALL NOT be sent as a reaction to picture losses -- it is 2027 RECOMMENDED to use PLI [RFC4585] instead. FIR SHOULD be used only 2028 in situations where not sending a decoder refresh point would render 2029 the video unusable for the users. 
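As an illustration only, the FIR message described above can be assembled as in the following sketch. The function names are hypothetical. The common feedback header layout (V=2, P=0, FMT, PT, length, two SSRC words) is taken from section 6.1 of [RFC4585], and the sketch assumes no padding and no compound-packet framing.

```python
import struct

PSFB = 206      # payload-specific feedback packet type
FMT_FIR = 4     # FMT value assigned to FIR in this memo


def next_fir_seq(seq_nr):
    """Sequence number for a new command; repetitions reuse the old value."""
    return (seq_nr + 1) % 256


def build_fir(sender_ssrc, targets):
    """Build one FIR feedback message.

    targets is a list of (media_sender_ssrc, seq_nr) pairs, one per
    FCI entry. Hypothetical helper, not a complete RTCP implementation.
    """
    if not targets:
        raise ValueError("FIR needs at least one FCI entry")
    first_octet = (2 << 6) | FMT_FIR          # V=2, P=0, FMT=4
    length = 2 + 2 * len(targets)             # as required for FIR
    # "SSRC of media source" is not used and is set to 0.
    packet = struct.pack("!BBHII", first_octet, PSFB, length, sender_ssrc, 0)
    for ssrc, seq_nr in targets:
        # Seq. nr in the top 8 bits, 24 reserved bits set to 0.
        packet += struct.pack("!II", ssrc, (seq_nr & 0xFF) << 24)
    return packet
```

A repetition of a request would call build_fir again with the same seq_nr, while a new command would first pass the previous value through next_fir_seq.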
2031 A typical example where sending FIR is appropriate is when, in a 2032 multipoint conference, a new user joins the session and no regular 2033 decoder refresh point interval is established. Another example 2034 would be a video switching MCU that changes streams. Here, 2035 normally, the MCU issues a FIR to the new sender so as to force it to 2036 emit a decoder refresh point. The decoder refresh point normally 2037 includes a Freeze Picture Release (defined outside this 2038 specification), which re-starts the rendering process of the 2039 receivers. Both techniques mentioned are commonly used in MCU-based 2040 multipoint conferences. 2042 Other RTP payload specifications such as RFC 2032 [RFC2032] already 2043 define a feedback mechanism for certain codecs. An application 2044 supporting both schemes MUST use the feedback mechanism defined in 2045 this specification when sending feedback. For backward 2046 compatibility reasons such an application SHOULD also be capable of 2047 receiving and reacting to the feedback scheme defined in the 2048 respective RTP payload format, if this is required by that payload 2049 format. 2051 4.3.1.3. Timing Rules 2053 The timing follows the rules outlined in section 3 of [RFC4585]. 2054 FIR commands MAY be used with early or immediate feedback. The FIR 2055 feedback message MAY be repeated. If using immediate feedback mode 2056 the repetition SHOULD wait at least one RTT before being sent. In 2057 early or regular RTCP mode the repetition is sent in the next 2058 regular RTCP packet. 2060 4.3.1.4. Handling of FIR Message in Mixer and Translators 2062 A media translator or a mixer performing media encoding of the 2063 content for which the session participant has issued a FIR is 2064 responsible for acting upon it. A mixer acting upon a FIR SHOULD 2065 NOT forward the message unaltered; instead it SHOULD issue a FIR 2066 itself. 2068 4.3.1.5. 
Remarks 2070 Currently, video appears to be the only useful application for FIR, 2071 as it appears to be the only RTP payload widely deployed that relies 2072 heavily on media prediction across RTP packet boundaries. However, 2073 use of FIR could also reasonably be envisioned for other media types 2074 that share essential properties with compressed video, namely cross- 2075 frame prediction (whatever a frame may be for that media type). One 2076 possible example may be the dynamic updates of MPEG-4 scene 2077 descriptions. It is suggested that payload formats for such media 2078 types refer to FIR and other message types defined in this 2079 specification and in AVPF [RFC4585], instead of creating similar 2080 mechanisms in the payload specifications. The payload 2081 specifications may have to explain how the payload-specific 2082 terminologies map to the video-centric terminology used herein. 2084 In conjunction with video codecs, FIR messages typically trigger the 2085 sending of full intra or IDR pictures. Both are several times 2086 larger than predicted (inter) pictures. Their size is independent 2087 of the time they are generated. In most environments, especially 2088 when employing bandwidth-limited links, the use of an intra picture 2089 implies an allowed delay that is a significant multiple of the 2090 typical frame duration. An example: if the sending frame rate is 10 2091 fps, and an intra picture is assumed to be 10 times as big as an 2092 inter picture, then a full second of latency has to be accepted. In 2093 such an environment there is no need for a particularly short delay 2094 in sending the FIR message. Hence, waiting for the next possible 2095 time slot allowed by RTCP timing rules as per [RFC4585] should not 2096 have an overly negative impact on the system performance. 
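The latency figure in the example above follows from simple arithmetic, sketched below. The function name is hypothetical, and the sketch assumes that the link is dimensioned to carry exactly one inter picture per frame interval, so that an intra picture occupies the link for as many frame intervals as its size ratio.

```python
def intra_picture_delay_seconds(frame_rate_fps, intra_to_inter_ratio):
    """Time needed to send one intra picture over a link that carries
    exactly one inter picture per frame interval (an assumption).
    """
    if frame_rate_fps <= 0 or intra_to_inter_ratio <= 0:
        raise ValueError("arguments must be positive")
    frame_interval = 1.0 / frame_rate_fps
    return intra_to_inter_ratio * frame_interval
```

With a frame rate of 10 fps and a size ratio of 10, the frame interval is 0.1 seconds and the intra picture needs 10 such intervals, which gives the full second mentioned in the text.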
2098 Mandating a maximum delay for completing the sending of a decoder 2099 refresh point would be desirable from an application viewpoint, but 2100 is problematic from a congestion control point of view. "As soon as 2101 possible" as mentioned above appears to be a reasonable compromise. 2103 In environments where the sender has no control over the codec (e.g. 2104 when streaming pre-recorded and pre-coded content), the reaction to 2105 this command cannot be specified. One suitable reaction of a sender 2106 would be to skip forward in the video bit stream to the next decoder 2107 refresh point. In other scenarios, it may be preferable not to 2108 react to the command at all, e.g. when streaming to a large 2109 multicast group. Other reactions may also be possible. When 2110 deciding on a strategy, a sender could take into account factors 2111 such as the size of the receiving group, the "importance" of the 2112 sender of the FIR message (however "importance" may be defined in 2113 this specific application), the frequency of decoder refresh points 2114 in the content, and so on. However, a session which predominately 2115 handles pre-coded content is not expected to use FIR at all. 2117 The relationship between the Picture Loss Indication and FIR is as 2118 follows. As discussed in section 6.3.1 of AVPF [RFC4585], a Picture 2119 Loss Indication informs the decoder about the loss of a picture and 2120 hence the likelihood of misalignment of the reference pictures 2121 between the encoder and decoder. Such a scenario is normally 2122 related to losses in an ongoing connection. In point-to-point 2123 scenarios, and without the presence of advanced error resilience 2124 tools, one possible option for an encoder consists in sending a 2125 decoder refresh point. However, there are other options. 
One 2126 example is that the media sender ignores the PLI, because the 2127 embedded stream redundancy is likely to clean up the reproduced 2128 picture within a reasonable amount of time. The FIR, in contrast, 2129 leaves a (real-time) encoder no choice but to send a decoder refresh 2130 point. It does not allow the encoder to take into account any 2131 considerations such as the ones mentioned above. 2133 4.3.2. Temporal-Spatial Trade-off Request (TSTR) 2135 The TSTR feedback message is identified by RTCP packet type value 2136 PT=PSFB and FMT=5. 2138 The FCI field MUST contain one or more TSTR FCI entries. 2140 4.3.2.1. Message Format 2142 The content of the FCI entry for the Temporal-Spatial Trade-off 2143 Request is depicted in Figure 5. The length of the feedback message 2144 MUST be set to 2+2*N, where N is the number of FCI entries included. 2146 0 1 2 3 2147 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2148 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2149 | SSRC | 2150 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2151 | Seq nr. | Reserved | Index | 2152 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2154 Figure 5 - Syntax of an FCI Entry in the TSTR Message 2156 SSRC (32 bits): The SSRC of the media sender which is requested to 2157 apply the tradeoff value given in Index. 2159 Seq. nr (8 bits): Request sequence number. The sequence number 2160 space is unique for pairing of the SSRC of request source 2161 and the SSRC of the request target. The sequence number 2162 SHALL be increased by 1 modulo 256 for each new command. 2163 A repetition SHALL NOT increase the sequence number. The 2164 initial value is arbitrary. 2166 Reserved (19 bits): All bits SHALL be set to 0 by the sender and 2167 SHALL be ignored on reception. 2169 Index (5 bits): An integer value between 0 and 31 that indicates 2170 the relative trade-off that is requested. 
An index value 2171 of 0 indicates highest possible spatial quality, while 31 2172 indicates highest possible temporal resolution. 2174 4.3.2.2. Semantics 2176 A decoder can suggest a temporal-spatial trade-off level by sending 2177 a TSTR message to an encoder. If the encoder is capable of 2178 adjusting its temporal-spatial trade-off, it SHOULD take into 2179 account the received TSTR message for future coding of pictures. A 2180 value of 0 suggests a high spatial quality and a value of 31 2181 suggests a high frame rate. The progression of values from 0 to 31 2182 monotonically indicates a desire for a higher frame rate. The index 2183 values do not correspond to precise values of spatial quality or 2184 frame rate. 2186 The reaction to the reception of more than one TSTR message by a 2187 media sender from different media receivers is left open to the 2188 implementation. The selected trade-off SHALL be communicated to the 2189 media receivers by the means of the TSTN message. 2191 Within the common packet header for feedback messages (as defined in 2192 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 2193 indicates the source of the request, and the "SSRC of media source" 2194 is not used and SHALL be set to 0. The SSRCs of the media senders 2195 to which the TSTR applies are in the corresponding FCI entries. 2197 A TSTR message MAY contain requests to multiple media senders, using 2198 one FCI entry per target media sender. 2200 4.3.2.3. Timing Rules 2202 The timing follows the rules outlined in section 3 of [RFC4585]. 2203 This request message is not time critical and SHOULD be sent using 2204 regular RTCP timing. Only if it is known that the user interface 2205 requires quick feedback, the message MAY be sent with early or 2206 immediate feedback timing. 2208 4.3.2.4. 
Handling of message in Mixers and Translators 2209 A mixer or media translator that encodes content sent to the session 2210 participant issuing the TSTR SHALL consider the request to determine 2211 if it can fulfill it by changing its own encoding parameters. A 2212 media translator unable to fulfill the request MAY forward the 2213 request unaltered towards the media sender. A mixer encoding for 2214 multiple session participants will need to consider the joint needs 2215 of these participants before generating a TSTR on its own behalf 2216 towards the media sender. See also the discussion in Section 3.5.2. 2218 4.3.2.5. Remarks 2220 The term "spatial quality" does not necessarily refer to the 2221 resolution as measured by the number of pixels the reconstructed 2222 video is using. In fact, in most scenarios the video resolution 2223 stays constant during the lifetime of a session. However, all video 2224 compression standards have means to adjust the spatial quality at a 2225 given resolution, often influenced by the Quantizer Parameter or QP. 2226 A numerically low QP results in a good reconstructed picture 2227 quality, whereas a numerically high QP yields a coarse picture. The 2228 typical reaction of an encoder to this request is to change its rate 2229 control parameters to use a lower frame rate and a numerically lower 2230 (on average) QP, or vice versa. The precise mapping of Index value 2231 to frame rate and QP is intentionally left open here, as it depends 2232 on factors such as the compression standard employed, spatial 2233 resolution, content, bit rate, and so on. 2235 4.3.3. Temporal-Spatial Trade-off Notification (TSTN) 2237 The TSTN message is identified by RTCP packet type value PT=PSFB and 2238 FMT=6. 2240 The FCI field SHALL contain one or more TSTN FCI entries. 2242 4.3.3.1. Message Format 2244 The content of an FCI entry for the Temporal-Spatial Trade-off 2245 Notification is depicted in Figure 6. 
The length of the TSTN 2246 message MUST be set to 2+2*N, where N is the number of FCI entries. 2248 0 1 2 3 2249 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2250 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2251 | SSRC | 2252 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2253 | Seq nr. | Reserved | Index | 2254 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2255 Figure 6 - Syntax of the TSTN 2257 SSRC (32 bits): The SSRC of the source of the TSTR request which 2258 resulted in this Notification. 2260 Seq. nr (8 bits): The sequence number value from the TSTR request 2261 that is being acknowledged. 2263 Reserved (19 bits): All bits SHALL be set to 0 by the sender and 2264 SHALL be ignored on reception. 2266 Index (5 bits): The trade-off value the media sender is using 2267 henceforth. 2269 Informative note: The returned trade-off value (Index) may differ 2270 from the requested one, for example in cases where a media encoder 2271 cannot tune its trade-off, or when pre-recorded content is used. 2273 4.3.3.2. Semantics 2275 This feedback message is used to acknowledge the reception of a 2276 TSTR. For each TSTR received targeted at the session participant, a 2277 TSTN entry SHALL be sent included in a TSTN feedback message. A 2278 single TSTN message MAY acknowledge multiple requests using multiple 2279 FCI entries. The index value included SHALL be the same in all FCI 2280 entries of the TSTN message. Including a FCI for each requestor 2281 allows each requesting entity to determine that the media sender 2282 received the request. The Notification SHALL also be sent in 2283 response to TSTR repetitions received. If the request receiver has 2284 received TSTR with several different sequence numbers from a single 2285 requestor it SHALL only respond to the request with the highest 2286 (modulo 256) sequence number. 
Note that the highest sequence number 2287 may be a smaller integer value due to the wrapping of the field. 2288 Section A.1 of [RFC3550] has an algorithm for keeping track of the 2289 highest received sequence number for RTP packets, this could be 2290 adapted for this usage. 2292 The TSTN SHALL include the Temporal-Spatial Trade-off index that 2293 will be used as a result of the request. This is not necessarily 2294 the same index as requested, as the media sender may need to 2295 aggregate requests from several requesting session participants. It 2296 may also have some other policies or rules that limit the selection. 2298 Within the common packet header for feedback messages (as defined in 2299 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 2300 indicates the source of the Notification, and the "SSRC of media 2301 source" is not used and SHALL be set to 0. The SSRCs of the 2302 requesting entities to which the Notification applies are in the 2303 corresponding FCI entries. 2305 4.3.3.3. Timing Rules 2307 The timing follows the rules outlined in section 3 of [RFC4585]. 2308 This acknowledgement message is not extremely time critical and 2309 SHOULD be sent using regular RTCP timing. 2311 4.3.3.4. Handling of TSTN in Mixer and Translators 2313 A mixer or translator that acts upon a TSTR SHALL also send the 2314 corresponding TSTN. In cases where it needs to forward a TSTR 2315 itself the notification message MAY need to be delayed until the 2316 TSTR has been responded to. 2318 4.3.3.5. Remarks 2320 None 2322 4.3.4. H.271 Video Back Channel Message (VBCM) 2324 The VBCM is identified by RTCP packet type value PT=PSFB and FMT=7. 2326 The FCI field MUST contain one or more VBCM FCI entries. 2328 4.3.4.1. Message Format 2330 The syntax of an FCI entry within the VBCM indication is depicted in 2331 Figure 7. 
2333 0 1 2 3 2334 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2335 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2336 | SSRC | 2337 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2338 | Seq. nr |0| Payload Type| Length | 2339 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2340 | VBCM Octet String.... | Padding | 2341 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2343 Figure 7 - Syntax of an FCI Entry in the VBCM Message 2344 SSRC (32 bits): The SSRC value of the media sender that is requested 2345 to instruct its encoder to react to the VBCM message 2347 Seq. nr (8 bits): Command sequence number. The sequence number 2348 space is unique for pairing of the SSRC of command source and 2349 the SSRC of the command target. The sequence number SHALL be 2350 increased by 1 modulo 256 for each new command. A repetition 2351 SHALL NOT increase the sequence number. The initial value is 2352 arbitrary. 2354 0: Must be set to 0 by the sender and should not be acted upon by 2355 the message receiver. 2357 Payload Type (7 bits): The RTP payload type for which the VBCM bit 2358 stream must be interpreted. 2360 Length (16 bits): The length of the VBCM octet string in octets 2361 exclusive of any padding octets 2363 VBCM Octet String (Variable length): This is the octet string 2364 generated by the decoder carrying a specific feedback sub- 2365 message. 2367 Padding (Variable length): Bits set to 0 to make up a 32 bit 2368 boundary. 2370 4.3.4.2. Semantics 2372 The "payload" of the VBCM indication carries different types of 2373 codec-specific, feedback information. The type of feedback 2374 information can be classified as a 'status report' (such as an 2375 indication that a bit stream was received without errors, or that a 2376 partial or complete picture or block was lost) or 'update requests' 2377 (such as complete refresh of the bit stream). 
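The variable-length VBCM FCI entry of Figure 7 can be sketched as follows. This is an illustration under stated assumptions, not a normative encoder. The function names are hypothetical, and the H.271 octet string is treated as opaque bytes supplied by the caller.

```python
import struct


def pack_vbcm_entry(ssrc, seq_nr, rtp_payload_type, octets):
    """Pack one VBCM FCI entry, padded with zero bits to a 32-bit boundary."""
    if not 0 <= rtp_payload_type < 128:
        raise ValueError("RTP payload type must fit in 7 bits")
    if len(octets) > 0xFFFF:
        raise ValueError("octet string too long for the 16-bit Length field")
    # The bit ahead of the payload type is always sent as 0.
    header = struct.pack("!IBBH", ssrc, seq_nr & 0xFF,
                         rtp_payload_type & 0x7F, len(octets))
    padding = b"\x00" * (-len(octets) % 4)
    return header + octets + padding


def unpack_vbcm_entry(data):
    """Return (ssrc, seq_nr, rtp_payload_type, octets, consumed_octets)."""
    ssrc, seq_nr, pt_octet, length = struct.unpack("!IBBH", data[:8])
    octets = data[8:8 + length]          # Length excludes padding
    consumed = 8 + length + (-length % 4)
    # The leading bit is masked off and not acted upon.
    return ssrc, seq_nr, pt_octet & 0x7F, octets, consumed
```

Returning the number of consumed octets lets a receiver step through several entries carried in the same VBCM message.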
2379 Note: There are possible overlaps between the VBCM sub- 2380 messages and CCM/AVPF feedback messages, such as FIR. Please 2381 see section 3.5.3 for further discussion. 2383 The different types of feedback sub-messages carried in the VBCM are 2384 indicated by the "payloadType" as defined in [VBCM]. These sub- 2385 message types are reproduced below for convenience. "payloadType", 2386 in ITU-T Rec. H.271 terminology, refers to the sub-type of the H.271 2387 message and should not be confused with an RTP payload type. 2389 Payload Message Content 2390 Type 2391 -------------------------------------------------------------------- 2393 0 One or more pictures without detected bit stream error 2394 mismatch 2395 1 One or more pictures that are entirely or partially lost 2396 2 A set of blocks of one picture that is entirely or partially 2397 lost 2398 3 CRC for one parameter set 2399 4 CRC for all parameter sets of a certain type 2400 5 A "reset" request indicating that the sender should completely 2401 refresh the video bit stream as if no prior bit stream data 2402 had been received 2403 > 5 Reserved for future use by ITU-T 2405 Table 2: H.271 message types ("payloadTypes") 2407 The bit string or the "payload" of a VBCM message is of variable 2408 length and is self-contained and coded in a variable length, binary 2409 format. The media sender necessarily has to be able to parse this 2410 optimized binary format to make use of VBCM messages. 2412 Each of the different types of sub-messages (indicated by 2413 payloadType) may have different semantics depending on the codec 2414 used. 2416 Within the common packet header for feedback messages (as defined in 2417 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 2418 indicates the source of the request, and the "SSRC of media source" 2419 is not used and SHALL be set to 0. The SSRCs of the media senders 2420 to which the VBCM message applies are in the corresponding FCI 2421 entries. 
The sender of the VBCM message MAY send H.271 messages to 2422 multiple media senders and MAY send more than one H.271 message to 2423 the same media sender within the same VBCM message. 2425 4.3.4.3. Timing Rules 2427 The timing follows the rules outlined in section 3 of [RFC4585]. 2428 The different sub-message types may have different properties in 2429 regards to the timing of messages that should be used. If several 2430 different types are included in the same feedback packet then the 2431 requirements for the sub-message type with the most stringent 2432 requirements should be followed. 2434 4.3.4.4. Handling of message in Mixer or Translator 2436 The handling of VBCM in a mixer or translator is sub-message type 2437 dependent. 2439 4.3.4.5. Remarks 2441 Please see section 3.5.3 for a discussion of the usage of H.271 2442 messages and messages defined in AVPF [RFC4585] and this memo with 2443 similar functionality. 2445 Note: There has been some discussion whether the RTP payload type 2446 field in this message is needed. It will be needed if there is 2447 potentially more than one VBCM-capable RTP payload type in the 2448 same session, and the semantics of a given VBCM message changes 2449 between payload types. For example, the picture identification 2450 mechanism in messages of H.271 type 0 is fundamentally different 2451 between H.263 and H.264 (although both use the same syntax). 2452 Therefore, the payload field is justified here. There was a 2453 further comment that for TSTR and FIR such a need does not exist, 2454 because the semantics of TSTR and FIR are either loosely enough 2455 defined, or generic enough, to apply to all video payloads 2456 currently in existence/envisioned. 2458 5. Congestion Control 2460 The correct application of the AVPF [RFC4585] timing rules prevents 2461 the network from being flooded by feedback messages. 
Hence, 2462 assuming a correct implementation and configuration, the RTCP 2463 channel cannot break its bit rate commitment and introduce 2464 congestion. 2466 The reception of some of the feedback messages modifies the 2467 behaviour of the media senders or, more specifically, the media 2468 encoders. Thus, modified behaviour MUST respect the bandwidth 2469 limits that the application of congestion control provides. For 2470 example, when a media sender is reacting to a FIR, the unusually 2471 high number of packets that form the decoder refresh point have to 2472 be paced in compliance with the congestion control algorithm, even 2473 if the user experience suffers from a slowly transmitted decoder 2474 refresh point. 2476 A change of the Temporary Maximum Media Stream Bit Rate value can 2477 only mitigate congestion, but not cause congestion as long as 2478 congestion control is also employed. An increase of the value by a 2479 request REQUIRES the media sender to use congestion control when 2480 increasing its transmission rate to that value. A reduction of the 2481 value results in a reduced transmission bit rate, thus reducing the 2482 risk of congestion. 2484 6. Security Considerations 2486 The defined messages have certain properties that have security 2487 implications. These must be addressed and taken into account by 2488 users of this protocol. 2490 The defined setup signaling mechanism is sensitive to modification 2491 attacks that can result in session creation with sub-optimal 2492 configuration, and, in the worst case, session rejection. To 2493 prevent this type of attack, authentication and integrity protection 2494 of the setup signaling is required. 2496 Spoofed or maliciously created feedback messages of the type defined 2497 in this specification can have the following implications: 2499 a. severely reduced media bit rate due to false TMMBR messages 2500 that set the maximum to a very low value; 2502 b. 
assignment of the ownership of a bounding tuple to the wrong 2503 participant within a TMMBN message, potentially causing 2504 unnecessary oscillation in the bounding set as the mistakenly 2505 identified owner reports a change in its tuple and the true 2506 owner possibly holds back on changes until a correct TMMBN 2507 message reaches the participants; 2509 c. sending TSTR requests that result in a video quality 2510 different from the user's desire, rendering the session less 2511 useful; 2513 d. sending multiple FIR commands to reduce the frame-rate, and 2514 make the video jerky, due to the frequent usage of decoder 2515 refresh points. 2517 To prevent these attacks there is a need to apply authentication and 2518 integrity protection of the feedback messages. This can be 2519 accomplished against threats external to the current RTP session 2520 using the RTP profile that combines SRTP [SRTP] and AVPF into SAVPF 2521 [SAVPF]. In the mixer cases, separate security contexts and 2522 filtering can be applied between the mixer and the participants, 2523 thus protecting other users on the mixer from a misbehaving 2524 participant. 2526 7. SDP Definitions 2528 Section 4 of [RFC4585] defines a new SDP [RFC4566] attribute, rtcp- 2529 fb, that may be used to negotiate the capability to handle specific 2530 AVPF commands and indications, such as Reference Picture Selection, 2531 Picture Loss Indication etc. The ABNF for rtcp-fb is described in 2532 section 4.2 of [RFC4585]. In this section we extend the rtcp-fb 2533 attribute to include the commands and indications that are described 2534 for codec control in the present document. We also discuss the 2535 Offer/Answer implications for the codec control commands and 2536 indications. 2538 7.1. Extension of the rtcp-fb Attribute 2540 As described in AVPF [RFC4585], the rtcp-fb attribute indicates the 2541 capability of using RTCP feedback. 
AVPF specifies that the rtcp-fb 2542 attribute must only be used as a media level attribute and must not 2543 be provided at session level. All the rules described in [RFC4585] 2544 for the rtcp-fb attribute relating to payload type and to multiple rtcp- 2545 fb attributes in a session description also apply to the new 2546 feedback messages defined in this memo. 2548 The ABNF [RFC4234] for rtcp-fb as defined in [RFC4585] is 2550 "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF 2552 where rtcp-fb-pt is the payload type and rtcp-fb-val defines the 2553 type of the feedback message such as ack, nack, trr-int and rtcp-fb- 2554 id. For example, to indicate the support of feedback of picture 2555 loss indication, the sender declares the following in SDP 2557 v=0 2558 o=alice 3203093520 3203093520 IN IP4 host.example.com 2559 s=Media with feedback 2560 t=0 0 2561 c=IN IP4 host.example.com 2562 m=video 49170 RTP/AVPF 98 2563 a=rtpmap:98 H263-1998/90000 2564 a=rtcp-fb:98 nack pli 2566 In this document we define a new feedback value "ccm" which 2567 indicates the support of codec control using RTCP feedback messages. 2568 The "ccm" feedback value SHOULD be used with parameters that 2569 indicate the specific codec control commands supported. In this 2570 draft we define four such parameters, namely: 2572 o "fir" indicates support of the Full Intra Request (FIR). 2573 o "tmmbr" indicates support of the Temporary Maximum Media Stream 2574 Bit Rate Request/Notification (TMMBR/TMMBN). It has an 2575 optional subparameter to indicate the session maximum packet 2576 rate (measured in packets per second) to be used. If not 2577 included, this defaults to infinity. 2578 o "tstr" indicates support of the Temporal-Spatial Trade-off 2579 Request/Notification (TSTR/TSTN). 2580 o "vbcm" indicates support of H.271 video back channel messages 2581 (VBCM). It has zero or more subparameters identifying the 2582 supported H.271 "payloadType" values.
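As an illustration of the four parameters listed above, the following Python sketch classifies "a=rtcp-fb" attribute lines that carry the "ccm" value. It is a minimal sketch under stated assumptions and not part of this specification: the function name parse_ccm_line and the layout of the returned dictionary are hypothetical, the "smaxpr=" spelling of the tmmbr subparameter is taken from the ABNF given in this section, and no validation beyond what is shown is attempted.

```python
import math


def parse_ccm_line(line):
    """Parse one 'a=rtcp-fb:<pt> ccm <param> ...' SDP line.

    Returns a dict describing the ccm parameter, or None when the
    line is not an rtcp-fb attribute carrying the "ccm" value.
    (Hypothetical helper, for illustration only.)
    """
    prefix = "a=rtcp-fb:"
    if not line.startswith(prefix):
        return None
    fields = line[len(prefix):].split()
    if len(fields) < 2 or fields[1] != "ccm":
        return None
    # "ccm" SHOULD carry a parameter, but tolerate its absence.
    param = fields[2] if len(fields) > 2 else None
    result = {"pt": fields[0], "param": param}
    rest = fields[3:]
    if param == "tmmbr":
        # An absent session maximum packet rate defaults to infinity.
        result["smaxpr"] = math.inf
        for item in rest:
            if item.startswith("smaxpr="):
                result["smaxpr"] = int(item[len("smaxpr="):])
    elif param == "vbcm":
        # Zero or more H.271 "payloadType" values.
        result["payload_types"] = [int(item) for item in rest]
    return result
```

With this sketch, "a=rtcp-fb:98 ccm vbcm 1 2" yields the payload types 1 and 2, while "a=rtcp-fb:98 nack pli" yields None because it does not carry the "ccm" value.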
2584 In the ABNF for rtcp-fb-val defined in [RFC4585], there is a 2585 placeholder called rtcp-fb-id to define new feedback types. "ccm" 2586 is defined as a new feedback type in this document and the ABNF for 2587 the parameters for ccm is defined here (please refer to section 4.2 2588 of [RFC4585] for complete ABNF syntax). 2590 rtcp-fb-param = SP "app" [SP byte-string] 2591 / SP rtcp-fb-ccm-param 2592 / ; empty 2594 rtcp-fb-ccm-param = "ccm" SP ccm-param 2596 ccm-param = "fir" ; Full Intra Request 2597 / "tmmbr" [SP "smaxpr=" MaxPacketRateValue] 2598 ; Temporary max media bit rate 2599 / "tstr" ; Temporal Spatial Trade Off 2600 / "vbcm" *(SP subMessageType) ; H.271 VBCM messages 2601 / token [SP byte-string] 2602 ; for future commands/indications 2603 subMessageType = 1*8DIGIT 2604 byte-string = <as defined in section 4.2 of [RFC4585]> 2605 MaxPacketRateValue = 1*15DIGIT 2607 7.2. Offer-Answer 2609 The Offer/Answer [RFC3264] implications for codec control protocol 2610 feedback messages are similar to those described in [RFC4585]. The 2611 offerer MAY indicate the capability to support selected codec 2612 commands and indications. The answerer MUST remove all ccm 2613 parameters corresponding to the CCM messages that it does not wish 2614 to support in this particular media session (for example because it 2615 does not implement the message in question, or because its 2616 application logic suggests the support of the message adds no 2617 value). The answerer MUST NOT add new ccm parameters in addition to 2618 what has been offered. The answer is binding for the media session 2619 and both offerer and answerer MUST NOT use any feedback messages 2620 other than what both sides have explicitly indicated as being 2621 supported. In other words, only the joint subset of CCM parameters 2622 from the offer and answer may be used.
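The answerer rules above amount to a set intersection that preserves the offer. The following Python sketch shows one way to express them. It is an illustration under stated assumptions rather than a normative procedure: the function names answer_ccm_params and usable_ccm_params are hypothetical, ccm parameters are represented by their bare names only, and the handling of vbcm subparameters and of the session maximum packet rate is left out.

```python
def answer_ccm_params(offered, supported):
    """Return the ccm parameters an answerer may place in its answer.

    offered:   ccm parameter names from the offer, in offer order
    supported: ccm parameter names the answerer is willing to use
    """
    # Remove what the answerer does not wish to support; never add
    # a parameter that the offer did not contain.
    return [p for p in offered if p in supported]


def usable_ccm_params(offered, answered):
    """Return the joint subset that both sides may use in the session."""
    extra = set(answered) - set(offered)
    if extra:
        # The answerer MUST NOT add new ccm parameters.
        raise ValueError("answer adds parameters not in the offer: %s"
                         % sorted(extra))
    return [p for p in offered if p in answered]


offer = ["tstr", "fir", "tmmbr"]
answer = answer_ccm_params(offer, supported={"fir", "tstr", "vbcm"})
print(answer)                            # ['tstr', 'fir']
print(usable_ccm_params(offer, answer))  # ['tstr', 'fir']
```

In this run the answerer also implements "vbcm", but since the offer did not include it, it stays out of the answer, and "tmmbr" is dropped because the answerer does not wish to support it.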
2624 Note that including a CCM parameter in an offer or answer indicates 2625 that the party (offerer or answerer) is at least capable of 2626 receiving the corresponding CCM message(s) and acting upon them. In 2627 cases when the reception of a negotiated CCM message mandates the 2628 party to respond with another CCM message, it must also have that 2629 capability. Although it is not mandated to initiate CCM messages of 2630 any negotiated type, it is generally expected that a party will 2631 initiate CCM messages when appropriate. 2633 The session maximum packet rate parameter part of the TMMBR 2634 indication is declarative and everyone SHALL use the highest value 2635 indicated in a response. If the session maximum packet rate 2636 parameter is not present in an offer, it SHALL NOT be included by the 2637 answerer. 2639 7.3. Examples 2641 Example 1: The following SDP describes a point-to-point video call 2642 with H.263, with the originator of the call declaring its capability 2643 to support the FIR and TSTR/TSTN codec control messages. The SDP is 2644 carried in a high-level signaling protocol like SIP. 2646 v=0 2647 o=alice 3203093520 3203093520 IN IP4 host.example.com 2648 s=Point-to-Point call 2649 c=IN IP4 192.0.2.124 2650 m=audio 49170 RTP/AVP 0 2651 a=rtpmap:0 PCMU/8000 2652 m=video 51372 RTP/AVPF 98 2653 a=rtpmap:98 H263-1998/90000 2654 a=rtcp-fb:98 ccm tstr 2655 a=rtcp-fb:98 ccm fir 2657 In the above example, when the sender receives a TSTR message from 2658 the remote party, it is capable of adjusting the trade-off as 2659 indicated in the RTCP TSTN feedback message. 2661 Example 2: The following SDP describes a SIP end point joining a 2662 video mixer that is hosting a multiparty video conferencing session. 2664 The participant supports only the FIR (Full Intra Request) codec 2665 control command and it declares it in its session description.
2667 v=0 2668 o=alice 3203093520 3203093520 IN IP4 host.example.com 2669 s=Multiparty Video Call 2670 c=IN IP4 192.0.2.124 2671 m=audio 49170 RTP/AVP 0 2672 a=rtpmap:0 PCMU/8000 2673 m=video 51372 RTP/AVPF 98 2674 a=rtpmap:98 H263-1998/90000 2675 a=rtcp-fb:98 ccm fir 2677 When the video MCU decides to route the video of this participant, it 2678 sends an RTCP FIR feedback message. Upon receiving this feedback 2679 message, the end point is required to generate a decoder refresh point. 2681 Example 3: The following example describes the Offer/Answer 2682 implications for the codec control messages. The Offerer wishes to 2683 support "tstr", "fir" and "tmmbr". The offered SDP is 2685 -------------> Offer 2686 v=0 2687 o=alice 3203093520 3203093520 IN IP4 host.example.com 2688 s=Offer/Answer 2689 c=IN IP4 192.0.2.124 2690 m=audio 49170 RTP/AVP 0 2691 a=rtpmap:0 PCMU/8000 2692 m=video 51372 RTP/AVPF 98 2693 a=rtpmap:98 H263-1998/90000 2694 a=rtcp-fb:98 ccm tstr 2695 a=rtcp-fb:98 ccm fir 2696 a=rtcp-fb:* ccm tmmbr smaxpr=120 2698 The answerer wishes to support only the FIR and TSTR/TSTN messages, 2699 and the answerer's SDP is 2701 <---------------- Answer 2703 v=0 2704 o=alice 3203093520 3203093524 IN IP4 otherhost.example.com 2705 s=Offer/Answer 2706 c=IN IP4 192.0.2.37 2707 m=audio 47190 RTP/AVP 0 2708 a=rtpmap:0 PCMU/8000 2709 m=video 53273 RTP/AVPF 98 2710 a=rtpmap:98 H263-1998/90000 2711 a=rtcp-fb:98 ccm tstr 2712 a=rtcp-fb:98 ccm fir 2714 Example 4: The following example describes the Offer/Answer 2715 implications for H.271 Video back channel messages (VBCM). The 2716 Offerer wishes to support VBCM and the sub-messages of payloadType 1 2717 (one or more pictures that are entirely or partially lost) and 2 (a 2718 set of blocks of one picture that are entirely or partially lost).
2720 -------------> Offer 2721 v=0 2722 o=alice 3203093520 3203093520 IN IP4 host.example.com 2723 s=Offer/Answer 2724 c=IN IP4 192.0.2.124 2725 m=audio 49170 RTP/AVP 0 2726 a=rtpmap:0 PCMU/8000 2727 m=video 51372 RTP/AVPF 98 2728 a=rtpmap:98 H263-1998/90000 2729 a=rtcp-fb:98 ccm vbcm 1 2 2731 The answerer wishes to support sub-messages of type 1 only. 2733 <---------------- Answer 2735 v=0 2736 o=alice 3203093520 3203093524 IN IP4 otherhost.example.com 2737 s=Offer/Answer 2738 c=IN IP4 192.0.2.37 2739 m=audio 47190 RTP/AVP 0 2740 a=rtpmap:0 PCMU/8000 2741 m=video 53273 RTP/AVPF 98 2742 a=rtpmap:98 H263-1998/90000 2743 a=rtcp-fb:98 ccm vbcm 1 2745 So, in the above example, only VBCM indications comprised of 2746 "payloadType" 1 will be supported. 2748 8. IANA Considerations 2750 The new value "ccm" needs to be registered with IANA in the "rtcp- 2751 fb" Attribute Values registry located at the time of publication at: 2752 http://www.iana.org/assignments/sdp-parameters 2754 Value name: ccm 2755 Long Name: Codec Control Commands and Indications 2756 Reference: RFC XXXX 2758 A new registry "Codec Control Messages" needs to be created to hold 2759 "ccm" parameters located at time of publication at: 2760 http://www.iana.org/assignments/sdp-parameters 2762 New registrations in this registry follow the "Specification 2763 required" policy as defined by [RFC2434]. In addition, they are 2764 required to indicate which additional RTCP feedback types, if any, 2765 such as "nack" or "ack", they are usable with.
2767 The initial content of the registry is the following values: 2769 Value name: fir 2770 Long name: Full Intra Request Command 2771 Usable with: ccm 2772 Reference: RFC XXXX 2774 Value name: tmmbr 2775 Long name: Temporary Maximum Media Stream Bit Rate 2776 Usable with: ccm 2777 Reference: RFC XXXX 2779 Value name: tstr 2780 Long name: Temporal-Spatial Trade-off 2781 Usable with: ccm 2782 Reference: RFC XXXX 2784 Value name: vbcm 2785 Long name: H.271 Video Back Channel Messages 2786 Usable with: ccm 2787 Reference: RFC XXXX 2789 The following values need to be registered as FMT values in the "FMT 2790 Values for RTPFB Payload Types" registry located at the time of 2791 publication at: http://www.iana.org/assignments/rtp-parameters 2792 RTPFB range 2793 Name Long Name Value Reference 2794 -------------- --------------------------------- ----- --------- 2795 Reserved 2 [RFCxxxx] 2796 TMMBR Temporary Maximum Media Stream Bit 3 [RFCxxxx] 2797 Rate Request 2798 TMMBN Temporary Maximum Media Stream Bit 4 [RFCxxxx] 2799 Rate Notification 2801 The following values need to be registered as FMT values in the "FMT 2802 Values for PSFB Payload Types" registry located at the time of 2803 publication at: http://www.iana.org/assignments/rtp-parameters 2805 PSFB range 2806 Name Long Name Value Reference 2807 -------------- --------------------------------- ----- ------- 2808 FIR Full Intra Request Command 4 [RFCxxxx] 2809 TSTR Temporal-Spatial Trade-off Request 5 [RFCxxxx] 2810 TSTN Temporal-Spatial Trade-off Notification 6 [RFCxxxx] 2811 VBCM Video Back Channel Message 7 [RFCxxxx] 2813 9. Contributors 2815 Tom Taylor has made a very significant contribution, for which the 2816 authors are very grateful, to this specification by helping rewrite 2817 the specification. The parts regarding the algorithm for 2818 determining bounding sets for TMMBR have especially benefited. 2820 10.
Acknowledgements 2822 The authors would like to thank Andrea Basso, Orit Levin, and Nermeen 2823 Ismail for their work on the requirement and discussion draft 2824 [Basso]. 2826 Drafts of this memo were reviewed and extensively commented on by Roni 2827 Even, Colin Perkins, Randell Jesup, Keith Lantz, Harikishan 2828 Desineni, Guido Franceschini and others. The authors appreciate 2829 these reviews. 2831 Funding for the RFC Editor function is currently provided by the 2832 Internet Society. 2834 11. References 2836 11.1. Normative references 2838 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, 2839 "Extended RTP Profile for Real-Time Transport Control 2840 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, 2841 July 2006. 2842 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 2843 Requirement Levels", BCP 14, RFC 2119, March 1997. 2844 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. 2845 Jacobson, "RTP: A Transport Protocol for Real-Time 2846 Applications", STD 64, RFC 3550, July 2003. 2847 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 2848 Description Protocol", RFC 4566, July 2006. 2849 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model 2850 with Session Description Protocol (SDP)", RFC 3264, June 2851 2002. 2852 [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an 2853 IANA Considerations Section in RFCs", BCP 26, RFC 2434, 2854 October 1998. 2855 [RFC4234] Crocker, D. and P. Overell, "Augmented BNF for Syntax 2856 Specifications: ABNF", RFC 4234, October 2005. 2858 11.2. Informative references 2860 [Basso] A. Basso, et al., "Requirements for transport of video 2861 control commands", draft-basso-avt-videoconreq-02.txt, 2862 expired Internet Draft, October 2004. 2863 [AVC] Joint Video Team of ITU-T and ISO/IEC JTC 1, Draft ITU-T 2864 Recommendation and Final Draft International Standard of 2865 Joint Video Specification (ITU-T Rec.
H.264 | ISO/IEC 2866 14496-10 AVC), Joint Video Team (JVT) of ISO/IEC MPEG 2867 and ITU-T VCEG, JVT-G050, March 2003. 2868 [H245] ITU-T Rec. H.245, "Control protocol for multimedia 2869 communication", May 2006. 2870 [NEWPRED] S. Fukunaga, T. Nakai, and H. Inoue, "Error Resilient 2871 Video Coding by Dynamic Replacing of Reference 2872 Pictures," in Proc. Globecom'96, vol. 3, pp. 1503-1508, 2873 1996. 2874 [SRTP] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and 2875 K. Norrman, "The Secure Real-time Transport Protocol 2876 (SRTP)", RFC 3711, March 2004. 2877 [RFC2032] Turletti, T. and C. Huitema, "RTP Payload Format for 2878 H.261 Video Streams", RFC 2032, October 1996. 2880 [SAVPF] J. Ott, E. Carrara, "Extended Secure RTP Profile for 2881 RTCP-based Feedback (RTP/SAVPF)," draft-ietf-avt- 2882 profile-savpf-10.txt, February 2007. 2883 [RFC3525] Groves, C., Pantaleo, M., Anderson, T., and T. Taylor, 2884 "Gateway Control Protocol Version 1", RFC 3525, June 2885 2003. 2886 [RFC3448] M. Handley, S. Floyd, J. Padhye, J. Widmer, "TCP 2887 Friendly Rate Control (TFRC): Protocol Specification", 2888 RFC 3448, January 2003. 2889 [VBCM] ITU-T Rec. H.271, "Video Back Channel Messages", June 2890 2006. 2891 [RFC3890] Westerlund, M., "A Transport Independent Bandwidth 2892 Modifier for the Session Description Protocol (SDP)", 2893 RFC 3890, September 2004. 2894 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 2895 Congestion Control Protocol (DCCP)", RFC 4340, March 2896 2006. 2897 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 2898 A., Peterson, J., Sparks, R., Handley, M., and E. 2899 Schooler, "SIP: Session Initiation Protocol", RFC 3261, 2900 June 2002. 2901 [RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., 2902 Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse- 2903 Parisis, "RTP Payload for Redundant Audio Data", RFC 2904 2198, September 1997. 2905 [Topologies] M. Westerlund, and S.
Wenger, "RTP Topologies", draft- 2906 ietf-avt-topologies-06, work in progress, Aug 2007. 2908 12. Authors' Addresses 2910 Stephan Wenger 2911 Nokia Corporation 2912 975, Page Mill Road, 2913 Palo Alto,CA 94304 2914 USA 2916 Phone: +1-650-862-7368 2917 EMail: stewe@stewe.org 2919 Umesh Chandra 2920 Nokia Research Center 2921 975, Page Mill Road, 2922 Palo Alto,CA 94304 2923 USA 2925 Phone: +1-650-796-7502 2926 Email: Umesh.1.Chandra@nokia.com 2928 Magnus Westerlund 2929 Ericsson Research 2930 Ericsson AB 2931 SE-164 80 Stockholm, SWEDEN 2933 Phone: +46 8 7190000 2934 EMail: magnus.westerlund@ericsson.com 2936 Bo Burman 2937 Ericsson Research 2938 Ericsson AB 2939 SE-164 80 Stockholm, SWEDEN 2941 Phone: +46 8 7190000 2942 EMail: bo.burman@ericsson.com 2944 Full Copyright Statement 2946 Copyright (C) The IETF Trust (2007). 2948 This document is subject to the rights, licenses and restrictions 2949 contained in BCP 78, and except as set forth therein, the authors 2950 retain all their rights. 2952 This document and the information contained herein are provided on an 2953 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 2954 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST 2955 AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, 2956 EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT 2957 THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY 2958 IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR 2959 PURPOSE. 2961 Intellectual Property 2963 The IETF takes no position regarding the validity or scope of any 2964 Intellectual Property Rights or other rights that might be claimed to 2965 pertain to the implementation or use of the technology described in 2966 this document or the extent to which any license under such rights 2967 might or might not be available; nor does it represent that it has 2968 made any independent effort to identify any such rights. 
Information 2969 on the procedures with respect to rights in RFC documents can be 2970 found in BCP 78 and BCP 79. 2972 Copies of IPR disclosures made to the IETF Secretariat and any 2973 assurances of licenses to be made available, or the result of an 2974 attempt made to obtain a general license or permission for the use of 2975 such proprietary rights by implementers or users of this 2976 specification can be obtained from the IETF on-line IPR repository at 2977 http://www.ietf.org/ipr. 2979 The IETF invites any interested party to bring to its attention any 2980 copyrights, patents or patent applications, or other proprietary 2981 rights that may cover technology that may be required to implement 2982 this standard. Please address the information to the IETF at 2983 ietf-ipr@ietf.org. 2985 Acknowledgement 2987 Funding for the RFC Editor function is provided by the IETF 2988 Administrative Support Activity (IASA). 2990 RFC Editor Considerations 2992 The RFC editor is requested to replace all occurrences of XXXX with 2993 the RFC number this document receives.