idnits 2.17.1 draft-ietf-avt-avpf-ccm-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 19. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 2897. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2908. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2915. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2921. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- -- The document has examples using IPv4 documentation addresses according to RFC6890, but does not use any IPv6 documentation addresses. Maybe there should be IPv6 examples, too? Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year == Line 752 has weird spacing: '...sg type mul...' == Line 1132 has weird spacing: '... ab c s...' == Line 1134 has weird spacing: '... ba s...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. 
If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (May 14, 2007) is 6185 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFCxxxx' is mentioned on line 2751, but not defined ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866) == Outdated reference: A later version (-07) exists of draft-ietf-avt-topologies-04 ** Downref: Normative reference to an Informational draft: draft-ietf-avt-topologies (ref. 'Topologies') ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226) ** Obsolete normative reference: RFC 4234 (Obsoleted by RFC 5234) == Outdated reference: A later version (-12) exists of draft-ietf-avt-profile-savpf-10 -- Obsolete informational reference (is this intentional?): RFC 3525 (Obsoleted by RFC 5125) -- Obsolete informational reference (is this intentional?): RFC 3448 (Obsoleted by RFC 5348) Summary: 5 errors (**), 0 flaws (~~), 8 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Working Group Stephan Wenger 3 INTERNET-DRAFT Umesh Chandra 4 Expires: October 2007 Nokia 5 Magnus Westerlund 6 Bo Burman 7 Ericsson 8 May 14, 2007 10 Codec Control Messages in the 11 RTP Audio-Visual Profile with Feedback (AVPF) 12 draft-ietf-avt-avpf-ccm-05.txt 14 Status of this Memo 16 By submitting this Internet-Draft, each author represents that any 17 applicable patent or other IPR claims of which he or she is aware 18 have been or will be disclosed, and any of which he or she becomes 19 aware will be disclosed, in accordance with Section 6 of BCP 79. 21 Internet-Drafts are working documents of the Internet Engineering 22 Task Force (IETF), its areas, and its working groups. Note that 23 other groups may also distribute working documents as Internet- 24 Drafts. 26 Internet-Drafts are draft documents valid for a maximum of six months 27 and may be updated, replaced, or obsoleted by other documents at any 28 time. It is inappropriate to use Internet-Drafts as reference 29 material or to cite them other than as "work in progress." 31 The list of current Internet-Drafts can be accessed at 32 http://www.ietf.org/ietf/1id-abstracts.txt. 34 The list of Internet-Draft Shadow Directories can be accessed at 35 http://www.ietf.org/shadow.html. 37 Copyright Notice 39 Copyright (C) The IETF Trust (2007). 41 Abstract 43 This document specifies a few extensions to the messages defined in 44 the Audio-Visual Profile with Feedback (AVPF). They are helpful 45 primarily in conversational multimedia scenarios where centralized 46 multipoint functionalities are in use. However, some are also usable 47 in smaller multicast environments and point-to-point calls. The 48 extensions discussed are messages related to the ITU-T H.271 Video 49 Back Channel, Full Intra Request, Temporary Maximum Media Stream Bit 50 Rate and Temporal Spatial Trade-off. 52 TABLE OF CONTENTS 54 1. 
Introduction....................................................5 55 2. Definitions.....................................................6 56 2.1. Glossary...................................................6 57 2.2. Terminology................................................6 58 2.3. Topologies.................................................9 59 3. Motivation (Informative).......................................10 60 3.1. Use Cases.................................................10 61 3.2. Using the Media Path......................................12 62 3.3. Using AVPF................................................13 63 3.3.1. Reliability..........................................13 64 3.4. Multicast.................................................13 65 3.5. Feedback Messages.........................................13 66 3.5.1. Full Intra Request Command...........................13 67 3.5.1.1. Reliability.....................................14 68 3.5.2. Temporal Spatial Trade-off Request and Notification..15 69 3.5.2.1. Point-to-Point..................................16 70 3.5.2.2. Point-to-Multipoint Using Multicast or 71 Translators.....................................16 72 3.5.2.3. Point-to-Multipoint Using RTP Mixer.............17 73 3.5.2.4. Reliability.....................................17 74 3.5.3. H.271 Video Back Channel Message.....................17 75 3.5.3.1. Reliability.....................................20 76 3.5.4. Temporary Maximum Media Stream Bit Rate Request and 77 Notification................................................20 78 3.5.4.1. Behavior for media receivers using TMMBR........22 79 3.5.4.2. Algorithm for establishing current limitations..24 80 3.5.4.3. Use of TMMBR in a Mixer Based Multipoint 81 Operation.......................................30 82 3.5.4.4. Use of TMMBR in Point-to-Multipoint Using 83 Multicast or Translators........................32 84 3.5.4.5. Use of TMMBR in Point-to-point operation........32 85 3.5.4.6. 
Reliability.....................................32 86 4. RTCP Receiver Report Extensions................................34 87 4.1. Design Principles of the Extension Mechanism..............34 88 4.2. Transport Layer Feedback Messages.........................35 89 4.2.1. Temporary Maximum Media Stream Bit Rate Request 90 (TMMBR)..............................................36 91 4.2.1.1. Message Format..................................36 92 4.2.1.2. Semantics.......................................37 93 4.2.1.3. Timing Rules....................................40 94 4.2.1.4. Handling in Translator and Mixers...............40 95 4.2.2. Temporary Maximum Media Stream Bit Rate Notification 96 (TMMBN)..............................................41 97 4.2.2.1. Message Format..................................41 98 4.2.2.2. Semantics.......................................41 99 4.2.2.3. Timing Rules....................................43 100 4.2.2.4. Handling by Translators and Mixers..............43 101 4.3. Payload Specific Feedback Messages........................43 102 4.3.1. Full Intra Request (FIR).............................44 103 4.3.1.1. Message Format..................................44 104 4.3.1.2. Semantics.......................................45 105 4.3.1.3. Timing Rules....................................47 106 4.3.1.4. Handling of FIR Message in Mixer and 107 Translators.................................... 47 108 4.3.1.5. Remarks.........................................47 109 4.3.2. Temporal-Spatial Trade-off Request (TSTR)............47 110 4.3.2.1. Message Format..................................47 111 4.3.2.2. Semantics.......................................48 112 4.3.2.3. Timing Rules....................................49 113 4.3.2.4. Handling of message in Mixers and Translators...49 114 4.3.2.5. Remarks.........................................49 115 4.3.3. Temporal-Spatial Trade-off Notification (TSTN).......50 116 4.3.3.1. 
Message Format..................................50 117 4.3.3.2. Semantics.......................................50 118 4.3.3.3. Timing Rules....................................51 119 4.3.3.4. Handling of TSTN in Mixer and Translators.......51 120 4.3.3.5. Remarks.........................................51 121 4.3.4. H.271 Video Back Channel Message (VBCM)..............51 122 4.3.4.1. Message Format..................................52 123 4.3.4.2. Semantics.......................................52 124 4.3.4.3. Timing Rules....................................54 125 4.3.4.4. Handling of message in Mixer or Translator......54 126 4.3.4.5. Remarks.........................................54 127 5. Congestion Control.............................................54 128 6. Security Considerations........................................55 129 7. SDP Definitions................................................56 130 7.1. Extension of the rtcp-fb Attribute........................56 131 7.2. Offer-Answer..............................................58 132 7.3. Examples..................................................58 133 8. IANA Considerations............................................61 134 9. Acknowledgements...............................................62 135 10. References....................................................63 136 10.1. Normative references.....................................63 137 10.2. Informative references...................................63 138 11. Authors' Addresses............................................64 139 1. Introduction 141 When the Audio-Visual Profile with Feedback (AVPF) [RFC4585] was 142 developed, the main emphasis lay in the efficient support of point- 143 to-point and small multipoint scenarios without centralized 144 multipoint control. However, in practice, many small multipoint 145 conferences operate utilizing devices known as Multipoint Control 146 Units (MCUs). 
Long-standing experience of the conversational video 147 conferencing industry suggests that there is a need for a few 148 additional feedback messages, to support centralized multipoint 149 conferencing efficiently. Some of the messages have applications 150 beyond centralized multipoint, and this is indicated in the 151 description of the message. This is especially true for the message 152 intended to carry ITU-T Rec. H.271 [H.271] bit strings for Video Back 153 Channel messages. 155 In Real-time Transport Protocol (RTP) [RFC3550] terminology, MCUs 156 comprise mixers and translators. Most MCUs also include signaling 157 support. During the development of this memo, it was noticed that 158 there is considerable confusion in the community related to the use 159 of terms such as mixer, translator, and MCU. In response to these 160 concerns, a number of topologies have been identified that are of 161 practical relevance to the industry, but are not documented in 162 sufficient detail in [RFC3550]. These topologies are documented in 163 [Topologies], and understanding this memo requires previous or 164 parallel study of [Topologies]. 166 Some of the messages defined here are forward only, in that they do 167 not require an explicit notification to the message emitter that they 168 have been received and/or indicating the message receiver's actions. 169 Other messages require a response, leading to a two way communication 170 model that one could view as useful for control purposes. However, 171 it is not the intention of this memo to open up RTP Control Protocol 172 (RTCP) to a generalized control protocol. All mentioned messages 173 have relatively strict real-time constraints, in the sense that their 174 value diminishes with increased delay. This makes the use of more 175 traditional control protocol means, such as Session Initiation 176 Protocol (SIP) re-INVITEs [RFC3261], undesirable when used for the 177 same purpose. 
Furthermore, all messages are of a very simple format 178 that can be easily processed by an RTP/RTCP sender/receiver. 179 Finally, all messages relate only to the RTP stream with which they 180 are associated, and not to any other property of a communication 181 system. In particular, none of them relate to the properties of the 182 access links traversed by the session. 184 2. Definitions 186 2.1. Glossary 188 AIMD - Additive Increase Multiplicative Decrease 189 AVPF - The extended RTP profile for RTCP-based feedback 190 FEC - Forward Error Correction 191 FCI - Feedback Control Information [RFC4585] 192 FIR - Full Intra Request 193 MCU - Multipoint Control Unit 194 MPEG - Moving Picture Experts Group 195 TMMBN - Temporary Maximum Media Stream Bit Rate Notification 196 TMMBR - Temporary Maximum Media Stream Bit Rate Request 197 PLI - Picture Loss Indication 198 PR - Packet rate 199 QP - Quantizer Parameter 200 RTT - Round trip time 201 SSRC - Synchronization Source 202 TSTN - Temporal Spatial Trade-off Notification 203 TSTR - Temporal Spatial Trade-off Request 204 VBCM - Video Back Channel Message indication. 206 2.2. Terminology 208 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 209 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 210 document are to be interpreted as described in RFC 2119 [RFC2119]. 212 Message: 213 An RTCP feedback message [RFC4585] defined by this 214 specification, of one of the following types: 216 Request: 217 Message that requires acknowledgement 219 Command: 220 Message that forces the receiver to an action 222 Indication: 223 Message that reports a situation 225 Notification: 227 Message that provides a notification that an event has 228 occurred. Notifications are commonly generated in response 229 to a Request. 231 Note that, with the exception of "Notification", this 232 terminology is in alignment with ITU-T Rec. H.245 [H245]. 
234 Decoder Refresh Point: 235 A bit string, packetized in one or more RTP packets, which 236 completely resets the decoder to a known state. 238 Examples for "hard" decoder refresh points are Intra pictures 239 in H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 part 2, and 240 Instantaneous Decoder Refresh (IDR) pictures in H.264. 241 "Gradual" decoder refresh points may also be used; see for 242 example [AVC]. While both "hard" and "gradual" decoder 243 refresh points are acceptable in the scope of this 244 specification, in most cases the user experience will benefit 245 from using a "hard" decoder refresh point. 247 A decoder refresh point also contains all header information 248 above the picture layer (or equivalent, depending on the video 249 compression standard) that is conveyed in-band. In H.264, for 250 example, a decoder refresh point contains parameter set 251 Network Adaptation Layer (NAL) units that generate parameter 252 sets necessary for the decoding of the following slice/data 253 partition NAL units (and that are not conveyed out of band). 255 Decoding: 256 The operation of reconstructing the media stream. 258 Rendering: 259 The operation of presenting (parts of) the reconstructed media 260 stream to the user. 262 Stream thinning: 263 The operation of removing some of the packets from a media 264 stream. Stream thinning, preferably, is media-aware, implying 265 that media packets are removed in the order of increasing 266 relevance to the reproductive quality. However even when 267 employing media-aware stream thinning, most media streams 268 quickly lose quality when subject to increasing levels of 269 thinning. Media-unaware stream thinning leads to even worse 270 quality degradation. In contrast to transcoding, stream 271 thinning is typically seen as a computationally lightweight 272 operation. 274 Media: 276 Often used (sometimes in conjunction with terms like bit rate, 277 stream, sender ...) 
to identify the content of the forward RTP 278 packet stream (carrying the codec data), to which the codec 279 control message applies. 281 Media Stream: 282 The stream of RTP packets labeled with a single 283 Synchronization Source (SSRC) carrying the media (and also in 284 some cases repair information such as retransmission or 285 Forward Error Correction (FEC) information). 287 Total media bit rate: 288 The total bits per second transferred in a media stream, 289 measured at an observer-selected protocol layer and averaged 290 over a reasonable timescale, the length of which depends on 291 the application. In general, a media sender and a media 292 receiver will observe different total media bit rates for the 293 same stream, first because they may have selected different 294 reference protocol layers, and second, because of changes in 295 per-packet overhead along the transmission path. The goal 296 with bit rate averaging is to be able to ignore any burstiness 297 on very short timescales, below for example 100 ms, introduced 298 by scheduling or link layer packetization effects. 300 Maximum total media bit rate: 301 The upper limit on total media bit rate for a given media 302 stream at a particular receiver and for its selected protocol 303 layer. Note that this value cannot be measured on the received 304 media stream, instead it needs to be calculated or determined 305 through other means, such as QoS negotiations or local 306 resource limitations. Also note that this value is an average 307 (on a timescale that is reasonable for the application) and 308 that it may be different from the instantaneous bit-rate seen 309 by packets in the media stream. 311 Overhead: 312 All protocol header information required to convey a packet 313 with media data from sender to receiver, from the application 314 layer down to a pre-defined protocol level (for example down 315 to, and including, the IP header). 
Overhead may include, for 316 example, IP, UDP, and RTP headers, any layer 2 headers, any 317 Contributing Sources (CSRCs), RTP-Padding, and RTP header 318 extensions. Overhead excludes any RTP payload headers and the 319 payload itself. 321 Net media bit rate: 322 The bit rate carried by a media stream, net of overhead. That 323 is, the bits per second accounted for by encoded media, any 324 applicable payload headers, and any directly associated meta 325 payload information placed in the RTP packet. A typical 326 example of the latter is redundancy data provided by the use 327 of RFC 2198 [RFC2198]. Note that, unlike the total media bit 328 rate, the net media bit rate will have the same value at the 329 media sender and at the media receiver unless any mixing or 330 translating of the media has occurred. 332 For a given observer, the total media bit rate for a media 333 stream is equal to the sum of the net media bit rate and the 334 per-packet overhead as defined above multiplied by the packet 335 rate. 337 Feasible region: 338 The set of all combinations of packet rate and net media bit 339 rate that do not exceed the restrictions in maximum media bit 340 rate placed on a given media sender by the Temporary Maximum 341 Media Stream Bit-rate Request (TMMBR) messages it has 342 received. The feasible region will change as new TMMBR 343 messages are received. 345 Bounding set: 346 The set of TMMBR tuples, selected from all those received at a 347 given media sender, that define the feasible region for that 348 media sender. The media sender uses an algorithm such as that 349 in section 3.5.4.2 to determine or iteratively approximate the 350 current bounding set, and reports that set back to the media 351 receivers in a Temporary Maximum Media Stream Bit-rate 352 Notification (TMMBN) message. 354 2.3. Topologies 356 Please refer to [Topologies] for an in depth discussion. 
The 357 topologies referred to throughout this memo are labeled (consistently 358 with [Topologies]) as follows: 360 Topo-Point-to-Point . . . . . point-to-point communication 361 Topo-Multicast . . . . . . . multicast communication as in RFC 3550 362 Topo-Translator . . . . . . . translator based as in RFC 3550 363 Topo-Mixer . . . . . . . . . mixer based as in RFC 3550 364 Topo-Video-switch-MCU . . . . video switching MCU, 365 Topo-RTCP-terminating-MCU . . mixer but terminating RTCP 367 3. Motivation (Informative) 369 This section discusses the motivation and usage of the different 370 video and media control messages. The video control messages have 371 been under discussion for a long time, and a requirement draft was 372 drawn up [Basso]. This draft has expired; however we quote relevant 373 sections of it to provide motivation and requirements. 375 3.1. Use Cases 377 There are a number of possible usages for the proposed feedback 378 messages. Let us begin by looking through the use cases Basso et al. 379 [Basso] proposed. Some of the use cases have been reformulated and 380 comments have been added. 382 1. An RTP video mixer composes multiple encoded video sources into a 383 single encoded video stream. Each time a video source is added, 384 the RTP mixer needs to request a decoder refresh point from the 385 video source, so as to start an uncorrupted prediction chain on 386 the spatial area of the mixed picture occupied by the data from 387 the new video source. 389 2. An RTP video mixer receives multiple encoded RTP video streams 390 from conference participants, and dynamically selects one of the 391 streams to be included in its output RTP stream. At the time of a 392 bit stream change (determined through means such as voice 393 activation or the user interface), the mixer requests a decoder 394 refresh point from the remote source, in order to avoid using 395 unrelated content as reference data for inter picture prediction. 
396 After requesting the decoder refresh point, the video mixer stops 397 the delivery of the current RTP stream and monitors the RTP stream 398 from the new source until it detects data belonging to the decoder 399 refresh point. At that time, the RTP mixer starts forwarding the 400 newly selected stream to the receiver(s). 402 3. An application needs to signal to the remote encoder that the 403 desired trade-off between temporal and spatial resolution has 404 changed. For example, one user may prefer a higher frame rate and 405 a lower spatial quality, and another user may prefer the opposite. 406 This choice is also highly content dependent. Many current video 407 conferencing systems offer in the user interface a mechanism to 408 make this selection, usually in the form of a slider. The 409 mechanism is helpful in point-to-point, centralized multipoint and 410 non-centralized multipoint uses. 412 4. Use case 4 of the Basso draft applies only to Picture Loss 413 Indication (PLI) as defined in AVPF [RFC4585] and is not 414 reproduced here. 416 5. Use case 5 of the Basso draft relates to a mechanism known as 417 "freeze picture request". Sending freeze picture requests 418 over a non-reliable forward RTCP channel has been identified as 419 problematic. Therefore, no freeze picture request has been 420 included in this memo, and the use case discussion is not 421 reproduced here. 423 6. A video mixer dynamically selects one of the received video 424 streams to be sent out to participants and tries to provide the 425 highest bit rate possible to all participants, while minimizing 426 stream trans-rating. One way of achieving this is to set up 427 sessions with endpoints using the maximum bit rate accepted by 428 each endpoint, and accepted by the call admission method used by 429 the mixer. 
By means of commands that reduce the maximum media 430 stream bit rate below what has been negotiated during session set 431 up, the mixer can reduce the maximum bit rate sent by endpoints to 432 the lowest of all the accepted bit rates. As the lowest accepted 433 bit rate changes due to endpoints joining and leaving or due to 434 network congestion, the mixer can adjust the limits at which 435 endpoints can send their streams to match the new value. The 436 mixer then requests a new maximum bit rate, which is equal to or 437 less than the maximum bit rate negotiated at session setup for a 438 specific media stream, and the remote endpoint can respond with 439 the actual bit rate that it can support. 441 The picture that Basso et al. draw up covers most applications we 442 foresee. However, we would like to extend the list with two 443 additional use cases: 445 7. Currently deployed congestion control algorithms (AIMD and TFRC 446 [RFC3448]) probe for additional available capacity as long as 447 there is something to send. With congestion control algorithms 448 using packet loss as the indication for congestion, this probing 449 generally results in reduced media quality (often to a point 450 where the distortion is large enough to make the media unusable), 451 due to packet loss and increased delay. 453 In a number of deployment scenarios, especially cellular ones, the 454 bottleneck link is often the last hop link. That cellular link 455 also commonly has some type of QoS negotiation enabling the 456 cellular device to learn the maximal bit rate available over this 457 last hop. A media receiver behind this link can, in most (if not 458 all) cases, calculate at least an upper bound for the bit rate 459 available for each media stream it presently receives. How this 460 is done is an implementation detail and not discussed herein. 
461 Indicating the maximum available bit rate to the transmitting 462 party for the various media streams can be beneficial to prevent 463 that party from probing for bandwidth for this stream in excess of 464 a known hard limit. For cellular or other mobile devices, the 465 known available bit rate for each stream (deduced from the link 466 bit rate) can change quickly, due to handover to another 467 transmission technology, QoS renegotiation due to congestion, etc. 468 To enable minimal disruption of service, quick convergence is 469 necessary, and therefore media path signaling is desirable. 471 8. The use of reference picture selection (RPS) as an error 472 resilience tool was introduced in 1997 as NEWPRED [NEWPRED], 473 and is now widely deployed. When RPS is in use, simplistically 474 put, the receiver can send a feedback message to the sender, 475 indicating a reference picture that should be used for future 476 prediction. ([NEWPRED] mentions other forms of feedback as well.) 477 AVPF contains a mechanism for conveying such a message, but does 478 not specify the codec or the syntax to which the 479 message should conform. Recently, the ITU-T finalized Rec. H.271 480 which (among other message types) also includes a feedback 481 message. It is expected that this feedback message will fairly 482 quickly enjoy wide support. Therefore, a mechanism to convey 483 feedback messages according to H.271 appears to be desirable. 485 3.2. Using the Media Path 487 There are multiple reasons why we use the media path for the codec 488 control messages. 490 First, systems employing MCUs often separate the control and media 491 processing parts. As these messages are intended for or generated by 492 the media part rather than the signaling part of the MCU, having them 493 on the media path avoids transmission across interfaces and 494 unnecessary control traffic between signaling and processing. 
If the 495 MCU is physically decomposed, the use of the media path avoids the 496 need for media control protocol extensions (e.g. in MEGACO 497 [RFC3525]). 499 Secondly, the signaling path quite commonly contains several 500 signaling entities, e.g. SIP proxies and application servers. 501 Avoiding going through signaling entities avoids delay for several 502 reasons. Proxies have less stringent delay requirements than media 503 processing and due to their complex and more generic nature may 504 result in significant processing delay. The topological locations of 505 the signaling entities are also commonly not optimized for minimal 506 delay, but rather towards other architectural goals. Thus the 507 signaling path can be significantly longer in both geographical and 508 delay sense. 510 3.3. Using AVPF 512 The AVPF feedback message framework [RFC4585] provides the 513 appropriate framework to implement the new messages. AVPF implements 514 rules controlling the timing of feedback messages to avoid congestion 515 through network flooding by RTCP traffic. We re-use these rules by 516 referencing AVPF. 518 The signaling setup for AVPF allows each individual type of function 519 to be configured or negotiated on an RTP session basis. 521 3.3.1. Reliability 523 The use of RTCP messages implies that each message transfer is 524 unreliable, unless the lower layer transport provides reliability. 525 The different messages proposed in this specification have different 526 requirements in terms of reliability. However, in all cases, the 527 reaction to an (occasional) loss of a feedback message is specified. 529 3.4. Multicast 531 The codec control messages might be used with multicast. The RTCP 532 timing rules specified in [RFC3550] and [RFC4585] ensure that the 533 messages do not cause overload of the RTCP connection. The use of 534 multicast may result in the reception of messages with inconsistent 535 semantics. 
The reaction to inconsistencies depends on the message 536 type, and is discussed for each message type separately. 538 3.5. Feedback Messages 540 This section describes the semantics of the different feedback 541 messages and how they apply to the different use cases. 543 3.5.1. Full Intra Request Command 545 A Full Intra Request (FIR) Command, when received by the designated 546 media sender, requires that the media sender sends a Decoder Refresh 547 Point (see 2.2) at the earliest opportunity. The evaluation of such 548 opportunity includes the current encoder coding strategy and the 549 current available network resources. 551 FIR is also known as an "instantaneous decoder refresh request" or 552 "video fast update request". 554 Using a decoder refresh point implies refraining from using any 555 picture sent prior to that point as a reference for the encoding 556 process of any subsequent picture sent in the stream. For predictive 557 media types that are not video, the analogue applies. For example, 558 if in MPEG-4 systems scene updates are used, the decoder refresh 559 point consists of the full representation of the scene and is not 560 delta-coded relative to previous updates. 562 Decoder refresh points, especially Intra or IDR pictures, are in 563 general several times larger in size than predicted pictures. Thus, 564 in scenarios in which the available bit rate is small, the use of a 565 decoder refresh point implies a delay that is significantly longer 566 than the typical picture duration. 568 Usage in multicast is possible; however aggregation of the commands 569 is recommended. A receiver that receives a request closely (within 2 570 times the longest Round Trip Time (RTT) known, plus any AVPF-induced 571 RTCP packet sending delays, if those are known) after sending a 572 decoder refresh point, should await a second request message to 573 ensure that the media receiver has not been served by the previously 574 delivered decoder refresh point. 
The reason for the specified delay 575 is to avoid sending unnecessary decoder refresh points. A session 576 participant may have sent its own request while another participant's 577 request was in-flight to them. Suppressing those requests that may 578 have been sent without knowledge about the other request avoids this 579 issue. 581 Using the FIR command to recover from errors is explicitly 582 disallowed, and instead the PLI message defined in AVPF [RFC4585] 583 should be used. The PLI message reports lost pictures and has been 584 included in AVPF for precisely that purpose. 586 Full Intra Request is applicable in use-cases 1 and 2. 588 3.5.1.1. Reliability 590 The FIR message results in the delivery of a decoder refresh point, 591 unless the message is lost. Decoder refresh points are easily 592 identifiable from the bit stream. Therefore, there is no need for 593 protocol-level notification, and a simple command repetition 594 mechanism is sufficient for ensuring the level of reliability 595 required. However, the potential use of repetition does require a 596 mechanism to prevent the recipient from responding to messages 597 already received and responded to. 599 To ensure the best possible reliability, a sender of FIR may repeat 600 the FIR request until the desired content has been received. The 601 repetition interval is determined by the RTCP timing rules applicable 602 to the session. Upon reception of a complete decoder refresh point 603 or the detection of an attempt to send a decoder refresh point (which 604 got damaged due to a packet loss), the repetition of the FIR must 605 stop. If another FIR is necessary, the request sequence number must 606 be increased. A FIR sender shall not have more than one FIR request 607 (different request sequence number) outstanding at any time per media 608 sender in the session. 610 The receiver of FIR (i.e. the media sender) behaves in complementary 611 fashion to ensure delivery of a decoder refresh point. 
If it 612 receives repetitions of the FIR more than 2*RTT after it has sent a 613 decoder refresh point, it shall send a new decoder refresh point. 614 Two round trip times allow time for the decoder refresh point to 615 arrive back to the requestor and for the end of repetitions of FIR to 616 reach and be detected by the media sender. 618 An RTP mixer that receives an FIR from a media receiver is 619 responsible to ensure that a decoder refresh point is delivered to 620 the requesting receiver. It may be necessary for the mixer to 621 generate FIR commands. From a reliability perspective, the two legs 622 (FIR-requesting endpoint to mixer, and mixer to decoder refresh point 623 generating endpoint) are handled independently from each other. 625 3.5.2. Temporal Spatial Trade-off Request and Notification 627 The Temporal Spatial Trade-off Request (TSTR) instructs the video 628 encoder to change its trade-off between temporal and spatial 629 resolution. Index values from 0 to 31 indicate monotonically a 630 desire for higher frame rate. That is, a requester asking for an 631 index of 0 prefers a high quality and is willing to accept a low 632 frame rate, whereas a requester asking for 31 wishes a high frame 633 rate, potentially at the cost of low spatial quality. 635 In general the encoder reaction time may be significantly longer than 636 the typical picture duration. See use case 3 for an example. The 637 encoder decides whether and to what extent the request results in a 638 change of the trade-off. It returns a Temporal Spatial Trade-Off 639 Notification (TSTN) message to indicate the trade-off that it will 640 use henceforth. 642 TSTR and TSTN have been introduced primarily because it is believed 643 that control protocol mechanisms, e.g. a SIP re-invite, are too 644 heavyweight and too slow to allow for a reasonable user experience. 
646 Consider, for example, a user interface where the remote user selects 647 the temporal/spatial trade-off with a slider (as is common in 648 state-of-the-art video conferencing systems). Immediate feedback 649 to any slider movement is required for a reasonable user experience. 650 A SIP re-INVITE [RFC3261] would require at least two more round trips 651 (compared to the TSTR/TSTN mechanism) and may involve proxies and 652 other complex mechanisms. Even in a well-designed system, it could 653 take a second or so until the new trade-off is finally selected. 654 Furthermore, the use of RTCP solves the multicast use case very 655 efficiently. 657 The use of TSTR and TSTN in multipoint scenarios is a non-trivial 658 subject, and can be achieved in many implementation-specific ways. 659 Problems stem from the fact that TSTRs will typically arrive 660 unsynchronized, and may request different trade-off values for the 661 same stream and/or endpoint encoder. This memo does not specify a 662 translator, mixer or endpoint's reaction to the reception of a 663 suggested trade-off as conveyed in the TSTR. We only require the 664 receiver of a TSTR message to reply to it by sending a TSTN, carrying 665 the new trade-off chosen by its own criteria (which may or may not be 666 based on the trade-off conveyed by the TSTR). In other words, the 667 trade-off sent in TSTR is a non-binding recommendation, nothing more. 669 Four TSTR/TSTN scenarios need to be distinguished, based on the 670 topologies described in [Topologies]. The scenarios are described in 671 the following sub-clauses. 673 3.5.2.1. Point-to-Point 675 In this most trivial case (Topo-Point-to-Point), the media sender 676 typically adjusts its temporal/spatial trade-off based on the 677 requested value in TSTR, subject to its own capabilities.
The TSTN 678 message conveys back the new trade-off value (which may be identical 679 to the old one if, for example, the sender is not capable of 680 adjusting its trade-off). 682 3.5.2.2. Point-to-Multipoint Using Multicast or Translators 684 RTCP Multicast is used either with media multicast according to Topo- 685 Multicast, or following RFC 3550's translator model according to 686 Topo-Translator. In these cases, unsynchronized TSTR messages from 687 different receivers may be received, possibly with different 688 requested trade-offs (because of different user preferences). This 689 memo does not specify how the media sender tunes its trade-off. 690 Possible strategies include selecting the mean or median of all 691 trade-off requests received, giving priority to certain participants, 692 or continuing to use the previously selected trade-off (e.g. when the 693 sender is not capable of adjusting it). Again, all TSTR messages 694 need to be acknowledged by TSTN, and the value conveyed back has to 695 reflect the decision made. 697 3.5.2.3. Point-to-Multipoint Using RTP Mixer 699 In this scenario (Topo-Mixer) the RTP mixer receives all TSTR 700 messages, and has the opportunity to act on them based on its own 701 criteria. In most cases, the mixer should form a "consensus" of 702 potentially conflicting TSTR messages arriving from different 703 participants, and initiate its own TSTR message(s) to the media 704 sender(s). As in the previous scenario, the strategy for forming 705 this "consensus" is up to the implementation, and can, for example, 706 encompass averaging the participants' request values, giving priority 707 to certain participants, or using session default values. 709 Even if a mixer or translator performs transcoding, it is very 710 difficult to deliver media with the requested trade-off, unless the 711 content the mixer or translator receives is already close to that 712 trade-off. 
Thus if the mixer changes its trade-off, it needs to 713 request the media sender(s) to use the new value, by creating a TSTR 714 of its own. Upon reaching a decision on the used trade-off it 715 includes that value in the acknowledgement to the downstream 716 requestors. Only in cases where the original source has 717 substantially higher quality (and bit rate), is it likely that 718 transcoding alone can result in the requested trade-off. 720 3.5.2.4. Reliability 722 A request and reception acknowledgement mechanism is specified. The 723 Temporal Spatial Trade-off Notification (TSTN) message informs the 724 request-sender that its request has been received, and what trade-off 725 is used henceforth. This acknowledgement mechanism is desirable for 726 at least the following reasons: 728 o A change in the trade-off cannot be directly identified from the 729 media bit stream. 730 o User feedback cannot be implemented without knowing the chosen 731 trade-off value, according to the media sender's constraints. 732 o Repetitive sending of messages requesting an unimplementable 733 trade-off can be avoided. 735 3.5.3. H.271 Video Back Channel Message 736 ITU-T Rec. H.271 defines syntax, semantics, and suggested encoder 737 reaction to a video back channel message. The structure defined in 738 this memo is used to transparently convey such a message from media 739 receiver to media sender. In this memo, we refrain from an in-depth 740 discussion of the available code points within H.271 and refer to the 741 specification text [H.271] instead. 743 However, we note that some H.271 messages bear similarities with 744 native messages of AVPF and this memo. Furthermore, we note that 745 some H.271 messages are known to require caution in multicast 746 environments -- or are plainly not usable in multicast or multipoint 747 scenarios.
Table 1 provides a brief, oversimplifying overview of the 748 messages currently defined in H.271, their roughly corresponding AVPF 749 or CCM messages (the latter as specified in this memo), and an 750 indication of our current knowledge of their multicast safety. 752 H.271 msg type AVPF/CCM msg type multicast-safe 753 --------------------------------------------------------------------- 754 0 (when used for 755 reference picture 756 selection) AVPF RPSI No (positive ACK of pictures) 757 1 picture loss AVPF PLI Yes 758 2 partial loss AVPF SLI Yes 759 3 one parameter CRC N/A Yes (no required sender action) 760 4 all parameter CRC N/A Yes (no required sender action) 761 5 refresh point CCM FIR Yes 763 Table 1: H.271 messages and their AVPF/CCM equivalents 765 Note: H.271 message type 0 is not a strict equivalent to 766 AVPF's Reference Picture Selection Indication (RPSI); it is an 767 indication of known-as-correct reference picture(s) at the 768 decoder. It does not command an encoder to use a defined 769 reference picture (the form of control information envisioned 770 to be carried in RPSI). However, it is believed and intended 771 that H.271 message type 0 will be used for the same purpose as 772 AVPF's RPSI -- although other use forms are also possible. 774 In response to the opaqueness of the H.271 messages especially with 775 respect to the multicast safety, the following guidelines MUST be 776 followed when an implementation wishes to employ the H.271 video back 777 channel message: 779 1. Implementations utilizing the H.271 feedback message MUST stay in 780 compliance with congestion control principles, as outlined in 781 section 5. 783 2. An implementation SHOULD utilize the IETF-native messages as 784 defined in [RFC4585] and in this memo instead of similar messages 785 defined in [H.271]. Our current understanding of similar messages 786 is documented in Table 1 above. 
One good reason to divert from 787 the SHOULD statement above would be if it is clearly understood 788 that, for a given application and video compression standard, the 789 aforementioned "similarity" is not given, in contrast to what 790 the table indicates. 792 3. It has been observed that some of the H.271 code points currently 793 in existence are not multicast-safe. Therefore, the sensible 794 thing to do is not to use the H.271 feedback message type in 795 multicast environments. It MAY be used only when all the issues 796 mentioned later are fully understood by the implementer, and 797 properly taken into account by all endpoints. In all other cases, 798 the H.271 message type MUST NOT be used in conjunction with 799 multicast. 801 4. It has been observed that even in centralized multipoint 802 environments, where the mixer should theoretically be able to 803 resolve issues as documented below, the implementation of such a 804 mixer and cooperative endpoints is a very difficult and tedious 805 task. Therefore, H.271 messages MUST NOT be used in centralized 806 multipoint scenarios, unless all the issues mentioned below are 807 fully understood by the implementer, and properly taken into 808 account by both mixer and endpoints. 810 Issues to be taken into account when considering the use of H.271 in 811 multipoint environments: 813 1. Different state on different receivers. In many environments it 814 cannot be guaranteed that the decoder state of all media receivers 815 is identical at any given point in time. The most obvious reason 816 for such a possible misalignment of state is a loss that occurs on 817 the path to only one of many media receivers. However, there are 818 other not so obvious reasons, such as recent joins to the 819 multipoint conference (be it by joining the multicast group or 820 through additional mixer output). 
Different states can lead the 821 media receivers to issue potentially contradicting H.271 messages 822 (or one media receiver issuing an H.271 message that, when 823 observed by the media sender, is not helpful for the other media 824 receivers). A naive reaction of the media sender to these 825 contradicting messages can lead to unpredictable and annoying 826 results. 828 2. Combining messages from different media receivers in a media 829 sender is a non-trivial task. As reasons, we note that these 830 messages may be contradicting each other, and that their transport 831 is unreliable (there may well be other reasons). In case of many 832 H.271 messages (i.e. types 0, 2, 3, and 4), the algorithm for 833 combining must be aware both of the network/protocol environment 834 (i.e. with respect to congestion) and of the media codec employed, 835 as H.271 messages of a given type can have different semantics for 836 different media codecs. 838 3. The suppression of requests may need to go beyond the basic 839 mechanisms described in AVPF (which are driven exclusively by 840 timing and transport considerations on the protocol level). For 841 example, a receiver is often required to refrain from (or delay) 842 generating requests, based on information it receives from the 843 media stream. For instance, it makes no sense for a receiver to 844 issue a FIR when a transmission of an Intra/IDR picture is 845 ongoing. 847 4. When using the non-multicast-safe messages (e.g. H.271 type 0 848 positive ACK of received pictures/slices) in larger multicast 849 groups, the media receiver will likely be forced to delay or even 850 omit sending these messages. For the media sender this looks like 851 data has not been properly received (although it was received 852 properly), and a naively implemented media sender reacts to these 853 perceived problems where it should not. 855 3.5.3.1. 
Reliability 857 H.271 Video Back Channel messages do not require reliable 858 transmission, and confirmation of the reception of a message can be 859 derived from the forward video bit stream. Therefore, no specific 860 reception acknowledgement is specified. 862 With respect to re-sending rules, clause 3.5.1.1. applies. 864 3.5.4. Temporary Maximum Media Stream Bit Rate Request and Notification 866 A receiver, translator or mixer uses the Temporary Maximum Media 867 Stream Bit Rate Request (TMMBR, "timber") to request a sender to 868 limit the maximum bit rate for a media stream (see 2.2) to, or below, 869 the provided value. The Temporary Maximum Media Stream Bit Rate 870 Notification (TMMBN) contains the media sender's current view of the 871 most limiting subset of the TMMBR-defined limits it has received, to 872 help the participants to suppress TMMBR requests that would not 873 further restrict the media sender. The primary usage for the 874 TMMBR/TMMBN messages is in a scenario with an MCU or mixer (use case 875 6), corresponding to Topo-Translator or Topo-Mixer, but also to Topo- 876 Point-to-Point. 878 Each temporary limitation on the media stream is expressed as a 879 tuple. The first component of the tuple is the maximum total media 880 bit rate (as defined in section 2.2) that the media receiver is 881 currently prepared to accept for this media stream. The second 882 component is the per-packet overhead that the media receiver has 883 observed for this media stream at its chosen reference protocol 884 layer. 886 As indicated in section 2.2, the overhead as observed by the sender 887 of the TMMBR (i.e. the media receiver) may differ from the overhead 888 observed at the receiver of the TMMBR (i.e. the media sender) due to 889 use of a different reference protocol layer at the other end or due 890 to the intervention of translators or mixers that affect the amount 891 of per packet overhead. 
For example, a gateway in between the two 892 that converts between IPv4 and IPv6 affects the per-packet overhead 893 by 20 bytes. Other mechanisms that change the overhead include 894 tunnels. The problem with varying overhead is also discussed in 895 [RFC3890]. As will be seen in the description of the algorithm for 896 use of TMMBR, the difference in perceived overhead between the 897 sending and receiving ends presents no difficulty because 898 calculations are carried out in terms of variables (packet rate, net 899 media bit rate) that have the same value at the sender as at the 900 receiver. 902 Reporting both maximum total media bit rate and per-packet overhead 903 allows different receivers to provide bit rate and overhead values 904 for different protocol layers, for example at the IP level, at the 905 outer part of a tunnel protocol, or at the link layer. The protocol 906 level a peer reports on depends on the level of integration the peer 907 has, as it needs to be able to extract the information from that 908 protocol level. For example, an application with no knowledge of the 909 IP version it is running over can not meaningfully determine the 910 overhead of the IP header, and hence will not want to include IP 911 overhead in the overhead or maximum total media bit rate calculation. 913 It is expected that most peers will be able to report values at least 914 for the IP layer. In certain implementations it may be advantageous 915 to also include information pertaining to the link layer, which in 916 turn allows for a more precise overhead calculation and a better 917 optimization of connectivity resources. 919 The Temporary Maximum Media Stream Bit Rate messages are generic 920 messages that can be applied to any RTP packet stream. This 921 separates them from the other codec control messages defined in this 922 specification, which apply only to specific media types or payload 923 formats. 
The TMMBR functionality applies to the transport, and the 924 requirements the transport places on the media encoding. 926 The reasoning below assumes that the participants have negotiated a 927 session maximum bit rate, using a signaling protocol. This value can 928 be global, for example in case of point-to-point, multicast, or 929 translators. It may also be local between the participant and the 930 peer or mixer. In either case, the bit rate negotiated in signaling 931 is the one that the participant guarantees to be able to handle 932 (depacketize and decode). In practice, the connectivity of the 933 participant also influences the negotiated value -- it does not make 934 much sense to negotiate a total media bit rate that one's network 935 interface does not support. 937 It is also beneficial to have negotiated a maximum packet rate for 938 the session or sender. RFC 3890 provides an SDP [RFC4566] attribute 939 that can be used for this purpose; however, that attribute is not 940 usable in RTP sessions established using offer/answer [RFC3264]. 941 Therefore an optional maximum packet rate signaling parameter is 942 specified in this memo. 944 An already established maximum total media bit rate may be changed at 945 any time, subject to the timing rules governing the sending of 946 feedback messages. The limit may change to any value between zero and 947 the session maximum, as negotiated during session establishment 948 signaling. However, even if a sender has received a TMMBR message 949 allowing an increase in the bit rate, all increases must be governed 950 by a congestion control mechanism. TMMBR indicates known limitations 951 only, usually in the local environment, and does not provide any 952 guarantees about the full path. Furthermore, any increases in TMMBR- 953 established bit rate limits are to be executed only after a certain 954 delay from the sending of the TMMBN message that notifies the world 955 about the increase in limit. 
The delay is specified as at least 956 twice the longest RTT as known by the media sender, plus the media 957 sender's calculation of the required wait time for the sending of 958 another TMMBR message for this session based on AVPF timing rules. 959 This delay is introduced to allow other session participants to make 960 known their bit rate limit requirements, which may be lower. 962 If it is likely that the new value indicated by TMMBR will be valid 963 for the remainder of the session, the TMMBR sender is expected to 964 perform a renegotiation of the session upper limit using the session 965 signaling protocol. 967 3.5.4.1. Behavior for media receivers using TMMBR 969 This section is an informal description of behavior described more 970 precisely in section 4.2. 972 A media sender begins the session limited by the maximum media bit 973 rate and maximum packet rate negotiated in session signaling, if any. 975 Note that this value may be negotiated for another protocol layer 976 than the one the participant uses in its TMMBR messages. Each media 977 receiver selects a reference protocol layer, forms an estimate of the 978 overhead it is observing (or estimates it if no packets have been 979 seen yet) at that reference level, and determines the maximum total 980 media bit rate it can accept, taking into account its own limitations 981 and any transport path limitations of which it may be aware. In case 982 the current limitations are more restrictive than what was agreed on 983 in the session signaling, the media receiver reports its initial 984 estimate of these two quantities to the media sender using a TMMBR 985 message. Overall message traffic is reduced by the possibility of 986 including tuples for multiple media senders in the same TMMBR 987 message. 989 The media sender applies an algorithm such as that specified in 990 section 3.5.4.2 to select which of the tuples it has received are 991 most limiting (i.e. the bounding set as defined in section 2.2).
It 992 modifies its operation to stay within the feasible region (as defined 993 in section 2.2), and also sends out a TMMBN notification to the media 994 receivers indicating the selected bounding set. 996 If a media receiver does not own one of the tuples in the bounding 997 set reported by the TMMBN, it applies the same algorithm as the media 998 sender to determine if its current estimated (maximum total media bit 999 rate, overhead) tuple would enter the bounding set if known to the 1000 media sender. If so, it issues a TMMBR request reporting the tuple 1001 value to the sender. Otherwise it takes no action for the moment. 1002 Periodically, its estimated tuple values may change or it may receive 1003 a new TMMBN. If so, it reapplies the algorithm to decide whether it 1004 needs to issue a TMMBR request. 1006 If, alternatively, a media receiver owns one of the tuples in the 1007 reported bounding set, it takes no action until such time as its 1008 estimate of its own tuple values changes. At that time it sends a 1009 TMMBR request to the media sender to report the changed values. 1011 A media receiver may change status between owner and non-owner of a 1012 bounding tuple between one TMMBN message and the next. Thus it must 1013 check the contents of each TMMBN to determine its subsequent actions. 1015 Implementations may use other algorithms of their choosing, as long 1016 as the bit rate limitations resulting from the exchange of TMMBR and 1017 TMMBN messages are at least as strict (at least as low, in the bit 1018 rate dimension) as the ones resulting from the use of the 1019 aforementioned algorithm. 1021 Obviously, in point-to-point cases, when there is only one media 1022 receiver, this receiver becomes "owner" once it receives the first 1023 TMMBN in response to its own TMMBR, and stays "owner" for the rest of 1024 the session. Therefore, when it is known that there will always be 1025 only a single media receiver, the above algorithm is not required. 
1026 Media receivers that are aware they are the only ones in a session 1027 can send TMMBR messages with bit rate limits both higher and lower 1028 than the previously notified limit, at any time (subject to the AVPF 1029 [RFC4585] RTCP RR send timing rules). However, it may be difficult 1030 for a session participant to determine if it is the only receiver in 1031 the session. Because of this any implementation of TMMBR is required 1032 to include the algorithm described in the next section or a stricter 1033 equivalent. 1035 3.5.4.2. Algorithm for establishing current limitations 1037 This section introduces an example algorithm for the calculation of a 1038 session limit. Other algorithms can be employed, as long as the 1039 result of the calculation is at least as restrictive as the result 1040 that is obtained by this algorithm. 1042 First it is important to consider the implications of using a tuple 1043 for limiting the media sender's behavior. The bit rate and the 1044 overhead value result in a two-dimensional solution space for the 1045 calculation of the bit rate of media streams. Fortunately the two 1046 variables are linked. Specifically, the bit rate available for RTP 1047 payloads is equal to the TMMBR reported bit rate minus the packet 1048 rate used, multiplied by the TMMBR reported overhead converted to 1049 bits. As a result, when different bit rate/overhead combinations 1050 need to be considered, the packet rate determines the correct 1051 limitation. This is perhaps best explained by an example: 1053 Example: 1055 Receiver A: TMMBR_max total BR = 35 kbps, TMMBR_OH = 40 bytes 1056 Receiver B: TMMBR_max total BR = 40 kbps, TMMBR_OH = 60 bytes 1058 For a given packet rate (PR) the bit rate available for media 1059 payloads in RTP will be: 1061 Max_net media_BR_A = TMMBR_max total BR_A - PR * TMMBR_OH_A * 8 ... 1062 (1) 1063 Max_net media_BR_B = TMMBR_max total BR_B - PR * TMMBR_OH_B * 8 ... 
1064 (2) 1066 For PR = 20 these calculations yield Max_net media_BR_A = 1067 28600 bps and Max_net media_BR_B = 30400 bps, which suggests that 1068 receiver A is the limiting one for this packet rate. However, at a 1069 certain PR there is a switchover point at which receiver B becomes 1070 the limiting one. The switchover point can be identified by setting 1071 Max_net media_BR_A equal to Max_net media_BR_B and solving for PR: 1074 PR = (TMMBR_max total BR_A - TMMBR_max total BR_B) / (8 * (TMMBR_OH_A - TMMBR_OH_B)) ... (3) 1077 which, for the numbers above, yields 31.25 as the switchover point 1078 between the two limits. That is, for packet rates below 31.25 per 1079 second, receiver A is the limiting receiver, and for higher packet 1080 rates, receiver B is more limiting. The implications of this 1081 behavior have to be considered by implementations that are going to 1082 control media encoding and its packetization. As exemplified above, 1083 multiple TMMBR limits may apply to the trade-off between net media 1084 bit rate and packet rate. Which limitation applies depends on the 1085 packet rate being considered. 1087 This also has implications for how the TMMBR mechanism needs to work. 1088 First, there is the possibility that multiple TMMBR tuples are 1089 providing limitations on the media sender. Secondly, there is a need 1090 for any session participant (media sender and receivers) to be able 1091 to determine if a given tuple will become a limitation upon the media 1092 sender, or if the set of already given limitations is stricter than 1093 the given values. In the absence of the ability to make this 1094 determination the suppression of TMMBR requests would not work. 1096 The basic idea of the algorithm is as follows. Each TMMBR tuple can 1097 be viewed as the equation of a straight line (cf.
equations (1) and 1098 (2)) in a space where packet rate lies along the X-axis and maximum 1099 bit rate lies along the Y-axis. The lower envelope of the set of 1100 lines corresponding to the complete set of TMMBR tuples defines a 1101 polygon. Points lying along or below this polygon are combinations of 1102 packet rate and bit rate that meet all of the TMMBR constraints. The 1103 highest feasible packet rate within this region is the minimum of the 1104 rate at which the bounding polygon meets the X-axis and the session 1105 maximum packet rate (SMAXPR) provided by signaling, if any. Typically 1106 a media sender will prefer to operate at a lower rate than this 1107 theoretical maximum, so as to increase the rate at which actual media 1108 content reaches the receivers. The purpose of the algorithm is to 1109 distinguish the TMMBR tuples constituting the bounding set and thus 1110 delineate the feasible region, so that the media sender can select 1111 its preferred operating point within that region. 1113 Figure 1 below shows a bounding polygon formed by TMMBR tuples A and 1114 B. A third tuple C lies outside the bounding polygon and is therefore 1115 irrelevant in determining feasible trade-offs between media rate and 1116 packet rate. The line labeled ss..s represents the limit on packet 1117 rate imposed by the session maximum packet rate (SMAXPR) obtained by 1118 signaling during session setup. In Figure 1 the limit determined by 1119 tuple B happens to be more restrictive than SMAXPR. The situation 1120 could easily be the reverse, meaning that the bounding polygon is 1121 terminated on the right by the vertical line representing the SMAXPR 1122 constraint.
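The arithmetic behind equations (1) to (3) and the lower-envelope idea can be checked with a short sketch. This is illustrative only and not part of the specification: the names TmmbrTuple, max_net_media_br, switchover_pr, and limiting_net_media_br are hypothetical helpers introduced here, and the values are those of Receiver A and Receiver B from the example in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TmmbrTuple:
    """Hypothetical container for one TMMBR limitation."""
    max_total_br: int  # maximum total media bit rate, bits per second
    overhead: int      # per-packet overhead, bytes


def max_net_media_br(t: TmmbrTuple, pr: float) -> float:
    """Equations (1)/(2): bit rate left for RTP payloads at packet rate pr."""
    return t.max_total_br - pr * t.overhead * 8


def switchover_pr(a: TmmbrTuple, b: TmmbrTuple) -> float:
    """Equation (3): packet rate at which the lines of a and b intersect."""
    if a.overhead == b.overhead:
        # Equal overhead means parallel lines, so there is no intersection.
        raise ValueError("tuples with equal overhead never intersect")
    return (a.max_total_br - b.max_total_br) / (8 * (a.overhead - b.overhead))


def limiting_net_media_br(tuples, pr: float) -> float:
    """Lower envelope: the most restrictive net media bit rate at pr."""
    return min(max_net_media_br(t, pr) for t in tuples)


# Values from the example: Receiver A and Receiver B.
receiver_a = TmmbrTuple(max_total_br=35_000, overhead=40)
receiver_b = TmmbrTuple(max_total_br=40_000, overhead=60)

if __name__ == "__main__":
    for pr in (20, 40):
        limit = limiting_net_media_br([receiver_a, receiver_b], pr)
        print(f"PR={pr}: net media bit rate limit = {limit} bps")
    print("switchover PR =", switchover_pr(receiver_a, receiver_b))
```

Evaluating the envelope at a packet rate on either side of the switchover point shows which receiver's tuple is limiting there, which is the property the bounding set is meant to capture.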
[Plot: bit rate (Y-axis) versus packet rate (X-axis), showing the lines aa..a, bb..b and cc..c for TMMBR tuples A, B and C, the vertical line ss..s for SMAXPR, and the feasible region lying below lines aa..a and bb..b.] 1143 Figure 1 - Geometric Interpretation of TMMBR Tuples 1145 Note that the slopes of the lines making up the bounding polygon are 1146 increasingly negative as one moves in the direction of increasing 1147 packet rate. Note also that with slight rearrangement, equations (1) 1148 and (2) have the canonical form: 1150 y = mx + b 1152 where 1153 m is the slope and has value equal to the negative of the tuple 1154 overhead (in bits), 1155 and 1156 b is the y-intercept and has value equal to the tuple maximum total 1157 media bit rate. 1159 These observations lead to the conclusion that when processing the 1160 TMMBR tuples to select the initial bounding set, one should sort and 1161 process the tuples by order of increasing overhead. Once a particular 1162 tuple has been added to the bounding set, all tuples not already 1163 selected and having lower overhead can be eliminated, because the 1164 next side of the bounding polygon has to be steeper (i.e. the 1165 corresponding TMMBR must have higher overhead) than the latest added 1166 tuple. 1168 Line cc..c in Figure 1 illustrates another principle. This line is 1169 parallel to line aa..a, but has a higher Y-intercept. That is, the 1170 corresponding TMMBR tuple contains a higher maximum total media bit 1171 rate value. Since line cc..c is outside the bounding polygon, it 1172 illustrates the conclusion that if two TMMBR tuples have the same 1173 overhead value, the one with higher maximum total media bit rate 1174 value cannot be part of the bounding set and can be set aside. 1176 Two further observations complete the algorithm.
Obviously, moving 1177 from the left, the successive corners of the bounding polygon (i.e. 1178 the intersection points between successive pairs of sides) lie at 1179 successively higher packet rates. On the other hand, again moving 1180 from the left, each successive line making up the bounding set 1181 crosses the X-axis at a lower packet rate. 1183 The complete algorithm can now be specified. The algorithm works 1184 with two lists of TMMBR tuples, the candidate list X and the selected 1185 list Y, both ordered by increasing overhead value. The algorithm 1186 terminates when all members of X have been discarded or removed for 1187 processing. Membership of the selected list Y is probationary until 1188 the algorithm is complete. Each member of the selected list is 1189 associated with an intersection value, which is the packet rate at 1190 which the line corresponding to that TMMBR tuple intersects with the 1191 line corresponding to the previous TMMBR tuple in the selected list. 1192 Each member of the selected list is also associated with a maximum 1193 packet rate value, which is the lesser of the session maximum packet 1194 rate SMAXPR (if any) and the packet rate at which the line 1195 corresponding to that tuple crosses the X-axis. 1197 When the algorithm terminates, the selected list is equal to the 1198 bounding set as defined in section 2.2. 1200 Initial Algorithm 1202 This algorithm is used by the media sender when it has received one 1203 or more TMMBR requests and before it has determined a bounding set 1204 for the first time. 1206 1. Sort the TMMBR tuples by order of increasing overhead. This is 1207 the initial candidate list X. 1209 2. When multiple tuples in the candidate list have the same 1210 overhead value, discard all but the one with the lowest maximum 1211 total media bit rate value. 1213 3. Select and remove from the candidate list the TMMBR tuple with the 1214 lowest maximum total media bit rate value. 
If there is more than 1215 one tuple with that value, choose the one with the highest 1216 overhead value. This is the first member of the selected list Y. 1217 Set its intersection value equal to zero. Calculate its maximum 1218 packet rate as the minimum of SMAXPR (if available) and the value 1219 obtained from the following formula, which is the packet rate at 1220 which the corresponding line crosses the X-axis. 1222 Max PR = TMMBR max total BR / (8 * TMMBR OH) ... (4) 1224 4. Discard from the candidate list all tuples with a lower overhead 1225 value than the selected tuple. 1227 5. Remove the first remaining tuple from the candidate list for 1228 processing. Call this the current candidate. 1230 6. Calculate the packet rate PR at the intersection of the line 1231 generated by the current candidate with the line generated by the 1232 last tuple in the selected list Y, using equation (3). 1234 7. If the calculated value PR is equal to or lower than the 1235 intersection value stored for the last tuple of the selected list, 1236 discard the last tuple of the selected list and go back to step 6 1237 (retaining the same current candidate). 1239 Note that the choice of the initial member of the selected list Y 1240 in step 3 guarantees that the selected list will never be emptied 1241 by this process, meaning that the algorithm must eventually (if 1242 not immediately) fall through to step 8. 1244 8. (This step is reached when the calculated PR value of the current 1245 candidate is greater than the intersection value of the current 1246 last member of the selected list Y.) If the calculated value PR 1247 of the current candidate is lower than the maximum packet rate 1248 associated with the last tuple in the selected list, add the 1249 current candidate tuple to the end of the selected list. Store 1250 PR as its intersection value.
Calculate its maximum packet rate 1251 as the lesser of SMAXPR (if available) and the maximum packet 1252 rate calculated using equation (4). 1254 9. If any tuples remain in the candidate list, go back to step 5. 1256 Incremental Algorithm 1257 The previous algorithm covered the initial case, where no selected 1258 list had previously been created. It also applied only to the media 1259 sender. When a previously-created selected list is available at 1260 either the media sender or media receiver, two other cases can be 1261 considered: 1263 o when a TMMBR tuple not currently in the selected list is a 1264 candidate for addition; 1266 o when the values change in a TMMBR tuple currently in the 1267 selected list. 1269 At the media receiver these cases correspond respectively to those 1270 of the non-owner and owner of a tuple in the TMMBN-reported bounding 1271 set. 1273 In either case, the process of updating the selected list to take 1274 account of the new/changed tuple can use the basic algorithm 1275 described above, with the modification that the initial candidate 1276 set consists only of the existing selected list and the new or 1277 changed tuple. Some further optimization is possible (beyond 1278 starting with a reduced candidate set) by taking advantage of the 1279 following observations. 1281 The first observation is that if the new/changed candidate becomes 1282 part of the new selected list, the result may be to cause zero or 1283 more other tuples to be dropped from the list. However, if more than 1284 one other tuple is dropped, the dropped tuples will be consecutive. 1285 This can be confirmed geometrically by visualizing a new line that 1286 cuts off a series of segments from the previously-existing bounding 1287 polygon. The cut-off segments are connected one to the next, the 1288 geometric equivalent of consecutive tuples in a list ordered by 1289 overhead value. 
Beyond the dropped set in either direction all of 1290 the tuples that were in the earlier selected list will be in the 1291 updated one. The second observation is that, leaving aside the new 1292 candidate, the order of tuples remaining in the updated selected list 1293 is unchanged because their overhead values have not changed. 1295 The consequence of these two observations is that, once the placement 1296 of the new candidate and the extent of the dropped set of tuples (if 1297 any) has been determined, the remaining tuples can be copied directly 1298 from the candidate list into the selected list, preserving their 1299 order. This conclusion suggests the following modified algorithm: 1301 o Run steps 1-4 of the basic algorithm. 1303 o If the new candidate has survived steps 2 and 4 and has become 1304 the new first member of the selected list, run steps 5-9 on 1305 subsequent candidates until another candidate is added to the 1306 selected list. Then move all remaining candidates to the 1307 selected list, preserving their order. 1309 o If the new candidate has survived steps 2 and 4 and has not 1310 become the new first member of the selected list, start by 1311 moving all tuples in the candidate list with lower overhead 1312 values than that of the new candidate to the selected list, 1313 preserving their order. Run steps 5 through 9 for the new 1314 candidate, with the modification that the intersection values 1315 and maximum packet rates for the tuples on the selected list 1316 have to be calculated on the fly because they were not 1317 previously stored. Continue processing only until a 1318 subsequent tuple has been added to the selected list, then 1319 move all remaining candidates to the selected list, preserving 1320 their order. 1322 Note that the new candidate could be added to the selected 1323 list only to be dropped again when the next tuple is 1324 processed. 
It can easily be seen that in this case the new 1325 candidate does not displace any of the earlier tuples in the 1326 selected list. The limitations of ASCII art make this 1327 difficult to show in a figure. Line cc..c in Figure 1 would 1328 be an example if it had a steeper slope (tuple C had a higher 1329 overhead value), but still intersected line aa..a beyond where 1330 line aa..a intersects line bb..b. 1332 The algorithm just described is approximate, because it does not take 1333 account of tuples outside the selected list. To see how such tuples 1334 can become relevant, consider Figure 1 and suppose that the maximum 1335 total media bit rate in tuple A increases to the point that line 1336 aa..a moves outside line cc..c. Tuple A will remain in the bounding 1337 set calculated by the media sender. However, once it issues a new 1338 TMMBN, media receiver C will apply the algorithm and discover that 1339 its tuple C should now enter the bounding set. It will issue a TMMBR 1340 request to the media sender, which will repeat its calculation and 1341 come to the appropriate conclusion. 1343 The rules of section 4.2 require that the media sender refrain from 1344 raising its sending rate until media receivers have had a chance to 1345 respond to the TMMBN. In the example just given, this delay ensures 1346 that the relaxation of tuple A does not actually result in an attempt 1347 to send media at a rate exceeding the capacity at C. 1349 3.5.4.3. Use of TMMBR in a Mixer Based Multipoint Operation 1351 Assume a small mixer-based multiparty conference is ongoing, as 1352 depicted in Topo-Mixer of [Topologies]. All participants have 1353 negotiated a common maximum bit rate that this session can use. The 1354 conference operates over a number of unicast paths between the 1355 participants and the mixer. 
The congestion situation on each of 1356 these paths can be monitored by the participant in question and by 1357 the mixer, utilizing, for example, RTCP receiver reports (RR) or the 1358 transport protocol, e.g. DCCP [RFC4340]. However, any given 1359 participant has no knowledge of the congestion situation of the 1360 connections to the other participants. Worse, without mechanisms 1361 similar to the ones discussed in this draft, the mixer (which is 1362 aware of the congestion situation on all connections it manages) has 1363 no standardized means to inform media senders to slow down, short of 1364 forging its own receiver reports (which is undesirable). In 1365 principle, a mixer confronted with such a situation is obliged to 1366 thin or transcode streams intended for connections that detected 1367 congestion. 1369 In practice, media-aware stream thinning is unfortunately a very 1370 difficult and cumbersome operation and adds undesirable delay. If 1371 media-unaware, it leads very quickly to unacceptable reproduced media 1372 quality. Hence, a means to slow down senders even in the absence of 1373 congestion on their connections to the mixer is desirable. 1375 To allow the mixer to throttle traffic on the individual links, 1376 without performing transcoding, there is a need for a mechanism that 1377 enables the mixer to ask a participant's media encoders to limit the 1378 media stream bit rate they are currently generating. TMMBR provides 1379 the required mechanism. When the mixer detects congestion between 1380 itself and a given participant, it executes the following procedure: 1382 1. It starts thinning the media traffic to the congested participant 1383 to the supported bit rate. 1385 2. It uses TMMBR to request the media sender(s) to reduce the total 1386 media bit rate sent by them to the mixer, to a value that is in 1387 compliance with congestion control principles for the slowest 1388 link. 
Slow refers here to the available bandwidth / bit rate / 1389 capacity and packet rate after congestion control. 1391 3. As soon as the bit rate has been reduced by the sending part, the 1392 mixer stops stream thinning implicitly, because there is no need 1393 for it once the stream is in compliance with congestion control. 1395 This use of stream thinning as an immediate reaction tool followed up 1396 by a quick control mechanism appears to be a reasonable compromise 1397 between media quality and the need to combat congestion. 1399 3.5.4.4. Use of TMMBR in Point-to-Multipoint Using Multicast or 1400 Translators 1402 In these topologies, corresponding to Topo-Multicast or Topo- 1403 Translator, RTCP RRs are transmitted globally. This allows all 1404 participants to detect transmission problems such as congestion, on a 1405 medium timescale. As all media senders are aware of the congestion 1406 situation of all media receivers, the rationale for the use of TMMBR 1407 in the previous section does not apply. However, even in this case 1408 the congestion control response can be improved when the unicast 1409 links are using congestion controlled transport protocols (such as 1410 TCP or DCCP). A peer may also report local limitations to the media 1411 sender. 1413 3.5.4.5. Use of TMMBR in Point-to-point operation 1415 In use case 7 it is possible to use TMMBR to improve the performance 1416 when the known upper limit of the bit rate changes. In this use case 1417 the signaling protocol has established an upper limit for the session 1418 and total media bit rates. However, at the time of transport link 1419 bit rate reduction, a receiver can avoid serious congestion by 1420 sending a TMMBR to the sending side. Thus TMMBR is useful for 1421 putting restrictions on the application and thus placing the 1422 congestion control mechanism in the right ballpark. 
However TMMBR is 1423 usually unable to provide the continuously quick feedback loop 1424 required for real congestion control. Nor do its semantics match 1425 those of congestion control given its different purpose. For these 1426 reasons TMMBR SHALL NOT be used as a substitute for congestion 1427 control. 1429 3.5.4.6. Reliability 1431 The reaction of a media sender to the reception of a TMMBR message is 1432 not immediately identifiable through inspection of the media stream. 1433 Therefore, a more explicit mechanism is needed to avoid unnecessary 1434 re-sending of TMMBR messages. Using a statistically based 1435 retransmission scheme would only provide statistical guarantees of 1436 the request being received. It would also not avoid the 1437 retransmission of already received messages. In addition, it would 1438 not allow for easy suppression of other participants' requests. For 1439 these reasons, a mechanism based on explicit notification is used. 1441 Upon the reception of a request a media sender sends a TMMBN 1442 notification containing the current bounding set, and indicating 1443 which session participants own that limit. In multicast scenarios, 1444 that allows all other participants to suppress any request they may 1445 have, if their limitations are less strict than the current ones 1446 (i.e. define lines lying outside the feasible region as defined in 1447 section 2.2). Keeping and notifying only the bounding set of tuples 1448 allows for small message sizes and media sender states. A media 1449 sender only keeps state for the SSRCs of the current owners of the 1450 bounding set of tuples; all other requests and their sources are not 1451 saved. Once the bounding set has been established, new TMMBR 1452 messages should be generated only by owners of the bounding tuples 1453 and by other entities that determine (by applying the algorithm of 1454 section 3.5.4.2 or its equivalent) that their limitations should now 1455 be part of the bounding set. 
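The initial algorithm of section 3.5.4.2 (steps 1 through 9) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a normative implementation. The names TmmbrTuple, Selected, max_packet_rate, intersection_rate and bounding_set are hypothetical. The intersection formula is assumed to follow from the y = mx + b form with slope equal to -8 times the overhead in bytes, which is consistent with equation (4); the handling of a zero overhead value and of an empty input is likewise an assumption.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TmmbrTuple:
    ssrc: int        # owner of the tuple
    max_tbr: float   # maximum total media bit rate, bits per second
    overhead: int    # measured per-packet overhead, bytes


@dataclass
class Selected:
    tup: TmmbrTuple
    intersection: float  # packet rate where this line meets the previous one
    max_pr: float        # lesser of SMAXPR and the X-axis crossing


def max_packet_rate(t: TmmbrTuple, smaxpr: Optional[float]) -> float:
    # Equation (4). Assumption: a zero-overhead line never meets the X-axis.
    crossing = math.inf if t.overhead == 0 else t.max_tbr / (8 * t.overhead)
    return crossing if smaxpr is None else min(smaxpr, crossing)


def intersection_rate(a: TmmbrTuple, b: TmmbrTuple) -> float:
    # Assumed form of equation (3): solve
    #   a.max_tbr - 8*a.overhead*PR == b.max_tbr - 8*b.overhead*PR
    return (b.max_tbr - a.max_tbr) / (8 * (b.overhead - a.overhead))


def bounding_set(tuples: List[TmmbrTuple],
                 smaxpr: Optional[float] = None) -> List[TmmbrTuple]:
    if not tuples:
        return []
    # Step 1: sort by increasing overhead (ties: lowest bit rate first).
    ordered = sorted(tuples, key=lambda t: (t.overhead, t.max_tbr))
    # Step 2: for equal overhead keep only the lowest bit rate.
    cand: List[TmmbrTuple] = []
    for t in ordered:
        if not cand or cand[-1].overhead != t.overhead:
            cand.append(t)
    # Step 3: lowest bit rate starts the selected list (ties: highest overhead).
    first = min(cand, key=lambda t: (t.max_tbr, -t.overhead))
    selected = [Selected(first, 0.0, max_packet_rate(first, smaxpr))]
    # Step 4: drop candidates with lower overhead (and the selected tuple).
    cand = [t for t in cand if t.overhead > first.overhead]
    for current in cand:                      # steps 5 and 9
        while True:
            last = selected[-1]
            pr = intersection_rate(last.tup, current)   # step 6
            if pr <= last.intersection:                 # step 7
                selected.pop()
                continue
            break
        if pr < last.max_pr:                            # step 8
            selected.append(
                Selected(current, pr, max_packet_rate(current, smaxpr)))
    return [s.tup for s in selected]
```

With tuples shaped like those of Figure 1 (A with low overhead, C parallel to A with a higher bit rate, B with higher overhead), the sketch returns A and B and sets C aside in step 2. Supplying a sufficiently small smaxpr value cuts the polygon off before B is reached, so that only A remains.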
1457 4. RTCP Receiver Report Extensions 1459 This memo specifies six new feedback messages. The Full Intra 1460 Request (FIR), Temporal-Spatial Trade-off Request (TSTR), Temporal- 1461 Spatial Trade-off Notification (TSTN), and Video Back Channel Message 1462 (VBCM) are "Payload Specific Feedback Messages" as defined in Section 1463 6.3 of AVPF [RFC4585]. The Temporary Maximum Media Stream Bit Rate 1464 Request (TMMBR) and Temporary Maximum Media Stream Bit Rate 1465 Notification (TMMBN) are "Transport Layer Feedback Messages" as 1466 defined in Section 6.2 of AVPF. 1468 The new feedback messages are defined in the following subsections, 1469 following a similar structure to that in sections 6.2 and 6.3 of the 1470 AVPF specification [RFC4585]. 1472 4.1. Design Principles of the Extension Mechanism 1474 RTCP was originally introduced as a channel to convey presence, 1475 reception quality statistics and hints on the desired media coding. 1476 A limited set of media control mechanisms were introduced in early 1477 RTP payload formats for video formats, for example in RFC 4587 1478 [RFC4587]. However, this specification, for the first time, suggests 1479 a two-way handshake for some of its messages. There is danger that 1480 this introduction could be misunderstood as a precedent for the use 1481 of RTCP as an RTP session control protocol. To prevent such a 1482 misunderstanding, this subsection attempts to clarify the scope of 1483 the extensions specified in this memo, and strongly suggests that 1484 future extensions follow the rationale spelled out here, or 1485 compellingly explain why they divert from the rationale. 1487 In this memo, and in AVPF [RFC4585], only such messages have been 1488 included as: 1490 a) have comparatively strict real-time constraints, which prevent the 1491 use of mechanisms such as a SIP re-invite in most application 1492 scenarios. The real-time constraints are explained separately for 1493 each message where necessary. 
1495 b) are multicast-safe in that the reaction to potentially 1496 contradicting feedback messages is specified, as necessary for 1497 each message; and 1499 c) are directly related to activities of a certain media codec, class 1500 of media codecs (e.g. video codecs), or a given RTP packet stream. 1502 In this memo, a two-way handshake is introduced only for messages for 1503 which: 1505 a) a notification or acknowledgement is required due to their nature. 1506 An analysis to determine whether this requirement exists has been 1507 performed separately for each message. 1509 b) the notification or acknowledgement cannot be easily derived from 1510 the media bit stream. 1512 All messages in AVPF [RFC4585] and in this memo present their 1513 contents in a simple, fixed binary format. This accommodates media 1514 receivers which have not implemented higher control protocol 1515 functionalities (SDP, XML parsers and such) in their media path. 1517 4.2. Transport Layer Feedback Messages 1519 As specified in section 6.1 of RFC 4585 [RFC4585], Transport Layer 1520 Feedback messages are identified by the RTCP packet type value RTPFB 1521 (205). 1523 In AVPF, one message of this category had been defined. This memo 1524 specifies two more such messages. They are identified by means of 1525 the FMT parameter as follows: 1527 Assigned in AVPF [RFC4585]: 1529 1: Generic NACK 1530 31: reserved for future expansion of the identifier number space 1532 Assigned in this memo: 1534 2: reserved (see note below) 1535 3: Temporary Maximum Media Stream Bit Rate Request (TMMBR) 1536 4: Temporary Maximum Media Stream Bit Rate Notification (TMMBN) 1538 Note: early drafts of AVPF [RFC4585] reserved FMT=2 for a code 1539 point that has later been removed. It has been pointed out 1540 that there may be implementations in the field using this 1541 value in accordance with the expired draft. 
As there is 1542 sufficient numbering space available, we mark FMT=2 as 1543 reserved so as to avoid possible interoperability problems with 1544 any such early implementations. 1546 Available for assignment: 1548 0: unassigned 1549 5-30: unassigned 1551 The following subsections define the formats of the FCI entries for 1552 the TMMBR and TMMBN messages respectively and specify the associated 1553 behaviour at the media sender and receiver. 1555 4.2.1. Temporary Maximum Media Stream Bit Rate Request (TMMBR) 1557 The FCI field of a Temporary Maximum Media Stream Bit-Rate Request 1558 (TMMBR) message SHALL contain one or more FCI entries. 1560 4.2.1.1. Message Format 1562 The Feedback Control Information (FCI) consists of one or more TMMBR 1563 FCI entries with the following syntax:
1565     0                   1                   2                   3
1566     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1567    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1568    |                              SSRC                             |
1569    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1570    | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
1571    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1573 Figure 2 - Syntax of an FCI entry in the TMMBR message
1575 SSRC (32 bits): The SSRC value of the media sender that is 1576 requested to obey the new maximum bit rate. 1578 MxTBR Exp (6 bits): The exponential scaling of the mantissa for the 1579 maximum total media bit rate value. The value is an 1580 unsigned integer [0..63]. 1582 MxTBR Mantissa (17 bits): The mantissa of the maximum total media 1583 bit rate value as an unsigned integer. 1585 Measured Overhead (9 bits): The measured average packet overhead 1586 value in bytes. The measurement SHALL be done according 1587 to the description in section 4.2.1.2. The value is an 1588 unsigned integer [0..511].
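The bit layout of a single FCI entry can be illustrated with a short Python sketch. The function names pack_fci_entry and unpack_fci_entry are hypothetical and are not defined by this memo. The sketch assumes network byte order, as used elsewhere in RTCP, and places the 6-bit exponent in the most significant bits of the second word, followed by the 17-bit mantissa and the 9-bit overhead.

```python
import struct


def pack_fci_entry(ssrc: int, exp: int, mantissa: int, overhead: int) -> bytes:
    """Pack one 8-byte FCI entry: SSRC, then exp(6) | mantissa(17) | overhead(9)."""
    if not 0 <= ssrc < (1 << 32):
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= exp < (1 << 6):
        raise ValueError("MxTBR Exp must fit in 6 bits")
    if not 0 <= mantissa < (1 << 17):
        raise ValueError("MxTBR Mantissa must fit in 17 bits")
    if not 0 <= overhead < (1 << 9):
        raise ValueError("Measured Overhead must fit in 9 bits")
    word = (exp << 26) | (mantissa << 9) | overhead
    return struct.pack("!II", ssrc, word)


def unpack_fci_entry(data: bytes):
    """Return (ssrc, exp, mantissa, overhead) from the first 8 bytes of data."""
    ssrc, word = struct.unpack("!II", data[:8])
    exp = word >> 26                    # top 6 bits
    mantissa = (word >> 9) & 0x1FFFF    # middle 17 bits
    overhead = word & 0x1FF             # low 9 bits
    return ssrc, exp, mantissa, overhead
```

A round trip such as unpack_fci_entry(pack_fci_entry(0x12345678, 3, 125000, 40)) returns the four input values unchanged. Range checks are done before packing so that an oversized field cannot silently spill into its neighbour.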
1590 The maximum total media bit rate (MxTBR) value in bits per second is 1591 calculated from the MxTBR exponent (exp) and mantissa in the 1592 following way: 1594 MxTBR = mantissa * 2^exp 1596 This allows for 17 bits of resolution in the range 0 to 131072*2^63 1597 (approximately 1.2*10^24). 1599 The length of the TMMBR feedback message SHALL be set to 2+2*N where 1600 N is the number of TMMBR FCI entries. 1602 4.2.1.2. Semantics 1604 Behaviour at the Media Receiver (Sender of the TMMBR) 1606 TMMBR is used to indicate a transport related limitation at the 1607 reporting entity acting as a media receiver. TMMBR has the form of a 1608 tuple containing two components. The first value is the highest bit 1609 rate per sender of a media stream, observed at a receiver-chosen 1610 protocol layer, which the receiver currently supports in this RTP 1611 session. The second value is the measured header overhead in bytes 1612 as defined in section 2.2 and measured at the chosen protocol layer 1613 in the packets received for the stream. The measurement of the 1614 overhead is a running average that is updated for each packet 1615 received for this particular media source (SSRC), using the following 1616 formula: 1618 avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH, 1620 where avg_OH is the running (exponentially smoothed) average and 1621 pckt_OH is the overhead observed in the latest packet. 1623 If a maximum bit rate has been negotiated through signaling, the 1624 maximum total media bit rate that the receiver reports in a TMMBR 1625 message MUST NOT exceed the negotiated value converted to a common 1626 basis (i.e. with overheads adjusted to bring it to the same reference 1627 protocol layer). 1629 Within the common packet header for feedback messages (as defined in 1630 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 1631 indicates the source of the request, and the "SSRC of media source" 1632 is not used and SHALL be set to 0. 
Within a particular TMMBR FCI 1633 entry, the "SSRC of media sender" in the FCI field denotes the media 1634 sender the tuple applies to. This is useful in the multicast or 1635 translator topologies where the reporting entity may address all of 1636 the media senders in a single TMMBR message using multiple FCI 1637 entries. 1639 The media receiver SHALL save the contents of the latest TMMBN 1640 message received from each media sender. 1642 The media receiver MAY send a TMMBR FCI entry to a particular media 1643 sender under the following circumstances: 1645 o before any TMMBN message has been received from that media 1646 sender; 1648 o when the media receiver has been identified as the source of a 1649 bounding tuple within the latest TMMBN message received from 1650 that media sender, and the value of the maximum total media 1651 bit rate or the overhead relating to that media sender has 1652 changed; 1654 o when the media receiver has not been identified as the source 1655 of a bounding tuple within the latest TMMBN message received 1656 from that media sender, and, after the media receiver applies 1657 the incremental algorithm from section 3.5.4.2 or a stricter 1658 equivalent, the media receiver's tuple relating to that media 1659 sender is determined to belong to the bounding set. 1661 A TMMBR FCI entry MAY be repeated in subsequent TMMBR messages if no 1662 Temporary Maximum Media Stream Bit-Rate Notification (TMMBN) FCI has 1663 been received from the media sender at the time of transmission of 1664 the next RTCP packet. The bit rate value of a TMMBR FCI entry MAY be 1665 changed from one TMMBR message to the next. The overhead measurement 1666 SHALL be updated to the current value of avg_OH each time the entry 1667 is sent. 1669 If the value set by a TMMBR message is expected to be permanent, the 1670 TMMBR setting party SHOULD renegotiate the session parameters to 1671 reflect that using session setup signaling, e.g. a SIP re-invite. 
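The receiver-side arithmetic described above (the running overhead average and the MxTBR = mantissa * 2^exp representation) can be sketched as follows. The function names are hypothetical. Rounding the mantissa down when a bit rate does not fit exactly is an assumption of this sketch, chosen so that the encoded value never exceeds the limit the receiver intends to signal; the memo does not prescribe a rounding rule.

```python
def update_avg_overhead(avg_oh: float, pckt_oh: float) -> float:
    """avg_OH (new) = 15/16 * avg_OH (old) + 1/16 * pckt_OH."""
    return 15.0 / 16.0 * avg_oh + pckt_oh / 16.0


def encode_mxtbr(bits_per_second: int):
    """Return (exp, mantissa) with mantissa < 2^17 and
    mantissa * 2^exp <= bits_per_second (assumed round-down policy)."""
    if bits_per_second < 0:
        raise ValueError("bit rate must be non-negative")
    exp = 0
    mantissa = int(bits_per_second)
    while mantissa >= (1 << 17):
        mantissa >>= 1   # drop the least significant bit, i.e. round down
        exp += 1
    if exp > 63:
        raise ValueError("bit rate too large for a 6-bit exponent")
    return exp, mantissa


def decode_mxtbr(exp: int, mantissa: int) -> int:
    """MxTBR = mantissa * 2^exp, in bits per second."""
    return mantissa << exp
```

For example, a limit of 1,000,000 bits per second encodes as exponent 3 and mantissa 125000, which decodes back to exactly 1,000,000. A running average of 40 bytes updated with a 56-byte packet overhead becomes 41 bytes.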
1673 Behaviour at the Media Sender (Receiver of the TMMBR) 1675 When it receives a TMMBR message containing an FCI entry relating to 1676 it, the media sender SHALL use an initial or incremental algorithm as 1677 applicable to determine the bounding set of tuples based on the new 1678 information. The algorithm used SHALL be at least as strict as the 1679 corresponding algorithm defined in section 3.5.4.2. The media sender 1680 MAY accumulate TMMBR requests over a small interval (relative to the 1681 RTCP sending interval) before making this calculation. 1683 Once it has determined the bounding set of tuples, the media sender 1684 MAY use any combination of packet rate and net media bit rate within 1685 the feasible region that these tuples describe to produce a lower 1686 total media stream bit rate, as it may need to address a congestion 1687 situation or other limiting factors. See section 5 1688 (congestion 1689 control) for more discussion. 1691 If the media sender concludes that it can increase the maximum total 1692 media bit rate value, it SHALL wait before actually doing so, for a 1693 period long enough to allow a media receiver to respond to the TMMBN 1694 if it determines that its tuple belongs in the bounding set. This 1695 delay period is estimated by the formula: 1697 2 * RTT + T_Dither_Max, 1699 where RTT is the longest round trip time known to the media sender 1700 and T_Dither_Max is defined in section 3.4 of [RFC4585]. 1702 A TMMBN message SHALL be sent by the media sender at the earliest 1703 possible point in time, in response to any TMMBR messages received 1704 since the last sending of TMMBN. The TMMBN message indicates the 1705 calculated set of bounding tuples and the owners of those tuples at 1706 the time of the transmission of the message. 1708 An SSRC may time out according to the default rules for RTP session 1709 participants, i.e. 
the media sender has not received any RTP or RTCP 1710 packets from the owner for the last five regular reporting intervals. 1711 An SSRC may also explicitly leave the session, with the participant 1712 indicating this through the transmission of an RTCP BYE packet or 1713 using an external signaling channel. If the media sender determines 1714 that the owner of a tuple in the bounding set has left the session, 1715 the media sender shall transmit a new TMMBN containing the 1716 previously-determined set of bounding tuples but with the tuple 1717 belonging to the departed owner removed. 1719 Discussion 1721 Due to the unreliable nature of transport of TMMBR and TMMBN, the 1722 above rules may lead to the sending of TMMBR messages which appear to 1723 disobey those rules. Furthermore, in multicast scenarios it can 1724 happen that more than one "non-owning" session participant may 1725 determine, rightly or wrongly, that its tuple belongs in the bounding 1726 set. This is not critical for a number of reasons: 1728 a) If a TMMBR message is lost in transmission, either the media 1729 sender sends a new TMMBN message in response to some other media 1730 receiver or it does not send a new TMMBN message at all. In the 1731 first case, the media receiver applies the incremental algorithm 1732 and, if it determines that its tuple should be part of the 1733 bounding set, sends out another TMMBR. In the second case, it 1734 repeats the sending of a TMMBR unconditionally. Either way, the 1735 media sender eventually gets the information it needs. 1737 b) Similarly, if a TMMBN message gets lost, the media receiver that 1738 has sent the corresponding TMMBR request does not receive the 1739 notification and is expected to re-send the request and trigger 1740 the transmission of another TMMBN. 
1742 c) If multiple competing TMMBR messages are sent by different session 1743 participants, then the algorithm can be applied taking all of 1744 these messages into account, and the resulting TMMBN provides the 1745 participants with an updated view of how their tuples compare with 1746 the bounding set. 1748 d) If more than one session participant happens to send TMMBR 1749 messages at the same time and with the same tuple component 1750 values, it does not matter which tuple, if either, is taken into the 1751 bounding set. The losing session participant will determine after 1752 applying the algorithm that its tuple does not enter the bounding 1753 set, and will therefore stop sending its TMMBR request. 1755 It is important to consider the security risks involved with faked 1756 TMMBRs. See the security considerations in Section 6. 1759 As indicated already, the feedback messages may be used in both 1760 multicast and unicast sessions in any of the specified topologies. 1761 However, for sessions with a large number of participants, using the 1762 lowest common denominator, as required by this mechanism, may not be 1763 the most suitable course of action. Large sessions may need to 1764 consider other ways to adapt the bit rate to participants' 1765 capabilities, such as partitioning the session into different quality 1766 tiers, or using some other method of achieving bit rate scalability. 1768 4.2.1.3. Timing Rules 1770 The first transmission of the TMMBR request message MAY use early or 1771 immediate feedback in cases when timeliness is desirable. Any 1772 repetition of a request message SHOULD use regular RTCP mode for its 1773 transmission timing. 1775 4.2.1.4. Handling in Translators and Mixers 1777 Media translators and mixers will need to receive and respond to 1778 TMMBR messages as they are part of the chain that provides a certain 1779 media stream to the receiver.
The mixer or translator may act 1780 locally on the TMMBR request and thus generate a TMMBN to indicate 1781 that it has done so. Alternatively, in the case of a media 1782 translator it can forward the request, or in the case of a mixer 1783 generate one of its own and pass it forward. In the latter case, the 1784 mixer will need to send a TMMBN back to the original requestor to 1785 indicate that it is handling the request. 1787 4.2.2. Temporary Maximum Media Stream Bit Rate Notification (TMMBN) 1789 The FCI field of the TMMBN Feedback message may contain zero, one or 1790 more TMMBN FCI entries. 1792 4.2.2.1. Message Format 1794 The Feedback Control Information (FCI) consists of zero, one or more 1795 TMMBN FCI entries with the following syntax: 1797 0 1 2 3 1798 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1799 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1800 | SSRC | 1801 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1802 | MxTBR Exp | MxTBR Mantissa |Measured Overhead| 1803 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1805 Figure 3 - Syntax of an FCI entry in the TMMBN message 1807 SSRC (32 bits): The SSRC value of the "owner" of this tuple. 1809 MxTBR Exp (6 bits): The exponential scaling of the mantissa for the 1810 maximum total media bit rate value. The value is an 1811 unsigned integer [0..63]. 1813 MxTBR Mantissa (17 bits): The mantissa of the maximum total media 1814 bit rate value as an unsigned integer. 1816 Measured Overhead (9 bits): The measured average packet overhead 1817 value in bytes represented as an unsigned integer. 1819 Thus the FCI within the TMMBN message contains entries indicating the 1820 bounding tuples. For each tuple, the entry gives the owner by the 1821 SSRC, followed by the applicable maximum total media bit rate and 1822 overhead value. 1824 The length of the TMMBN message SHALL be set to 2+2*N where N is the 1825 number of TMMBN FCI entries. 
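The length rule 2+2*N can be checked against the usual RTCP convention that the length field counts 32-bit words minus one. The following Python sketch assembles a complete TMMBN packet. The function name build_tmmbn is hypothetical, and the common feedback header layout (version 2, no padding, 5-bit FMT, packet type, length, SSRC of packet sender, SSRC of media source) is written here as understood from section 6.1 of [RFC4585]; it should be verified against that specification before use.

```python
import struct

RTPFB = 205       # transport layer feedback packet type
FMT_TMMBN = 4     # FMT value assigned to TMMBN in this memo


def build_tmmbn(sender_ssrc: int, entries) -> bytes:
    """Build a TMMBN packet.

    entries is a sequence of (owner_ssrc, exp, mantissa, overhead) tuples;
    an empty sequence yields a TMMBN without any FCI.
    """
    n = len(entries)
    first_byte = (2 << 6) | FMT_TMMBN   # V=2, P=0, FMT=4
    length = 2 + 2 * n                  # as required for TMMBN
    # "SSRC of media source" is not used and is set to 0.
    packet = struct.pack("!BBHII", first_byte, RTPFB, length, sender_ssrc, 0)
    for owner_ssrc, exp, mantissa, overhead in entries:
        word = (exp << 26) | (mantissa << 9) | overhead
        packet += struct.pack("!II", owner_ssrc, word)
    return packet
```

A packet with one entry is 20 bytes long, i.e. five 32-bit words, and carries a length field of 4, which equals 2+2*1. An empty TMMBN is 12 bytes long with a length field of 2.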
1827 4.2.2.2. Semantics 1829 This feedback message is used to notify the senders of any TMMBR 1830 message that one or more TMMBR messages have been received or that an 1831 owner has left the session. It indicates to all participants the 1832 current set of bounding tuples and the "owners" of those tuples. 1834 Within the common packet header for feedback messages (as defined in 1835 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 1836 indicates the source of the notification. The "SSRC of media source" 1837 is not used and SHALL be set to 0. 1839 A TMMBN message SHALL be scheduled for transmission after the 1840 reception of a TMMBR message with an FCI entry identifying this media 1841 sender. Only a single TMMBN SHALL be sent, even if more than one 1842 TMMBR message is received between the scheduling of the transmission 1843 and the actual transmission of the TMMBN message. The TMMBN message 1844 indicates the bounding tuples and their owners at the time of 1845 transmitting the message. The bounding tuples included SHALL be the 1846 set arrived at through application of the applicable algorithm of 1847 section 3.5.4.2 or an equivalent, applied to the previous bounding 1848 set if any and tuples received in TMMBR messages since the last TMMBN 1849 was transmitted. 1851 The reception of a TMMBR message SHALL still result in the 1852 transmission of a TMMBN message even if, after application of the 1853 algorithm, the newly reported TMMBR tuple is not accepted into the 1854 bounding set. In such a case the bounding tuples and their owners 1855 are not changed, unless the TMMBR was from an owner of a tuple within 1856 the previously calculated bounding set. This procedure allows 1857 session participants that did not see the last TMMBN message to get a 1858 correct view of this media sender's state. 1860 As indicated earlier,
when a 1861 media sender determines that an "owner" of a bounding tuple has left 1862 the session, then that tuple is removed from the bounding set, and 1863 the media sender SHALL send a TMMBN message indicating the remaining 1864 bounding tuples. If there are no remaining bounding tuples a TMMBN 1865 without any FCI SHALL be sent to indicate this. 1867 Note: if any media receivers remain in the session, this last will 1868 be a temporary situation. The empty TMMBN will cause every 1869 remaining media receiver to determine that its limitation belongs 1870 in the bounding set and send a TMMBR in consequence. 1872 In unicast scenarios (i.e. where a single sender talks to a single 1873 receiver), the aforementioned algorithm to determine ownership 1874 degenerates to the media receiver becoming the "owner" of the one 1875 bounding tuple as soon as the media receiver has issued the first 1876 TMMBR message. 1878 4.2.2.3. 1879 Timing Rules 1881 The TMMBN acknowledgement SHOULD be sent as soon as allowed by the 1882 applied timing rules for the session. Immediate or early feedback 1883 mode SHOULD be used for these messages. 1885 4.2.2.4. Handling by Translators and Mixers 1887 As discussed in Section 4.2.1.4, mixers or translators may need to 1888 issue TMMBN messages as responses to TMMBR messages for SSRCs 1889 handled by them. 1891 4.3. Payload Specific Feedback Messages 1893 As specified by section 6.1 of RFC 4585 [RFC4585], Payload-Specific 1894 FB messages are identified by the RTCP packet type value PT=PSFB 1895 (206). 1897 AVPF [RFC4585] defines three payload-specific feedback messages and 1898 one application layer feedback message. This memo specifies four 1899 additional payload-specific feedback messages.
All are identified by 1900 means of the FMT parameter as follows: 1902 Assigned in [RFC4585]: 1904 1: Picture Loss Indication (PLI) 1905 2: Slice Lost Indication (SLI) 1906 3: Reference Picture Selection Indication (RPSI) 1907 15: Application layer FB message 1908 31: reserved for future expansion of the number space 1910 Assigned in this memo: 1912 4: Full Intra Request Command (FIR) 1913 5: Temporal-Spatial Trade-off Request (TSTR) 1914 6: Temporal-Spatial Trade-off Notification (TSTN) 1915 7: Video Back Channel Message (VBCM) 1917 Unassigned: 1919 0: unassigned 1920 8-14: unassigned 1921 16-30: unassigned 1923 The following subsections define the new FCI formats for the payload- 1924 specific feedback messages. 1926 4.3.1. Full Intra Request (FIR) 1928 The FIR message is identified by RTCP packet type value PT=PSFB and 1929 FMT=4. 1931 The FCI field MUST contain one or more FIR entries. Each entry 1932 applies to a different media sender, identified by its SSRC. 1934 4.3.1.1. Message Format 1936 The Feedback Control Information (FCI) for the Full Intra Request 1937 consists of one or more FCI entries, the content of which is depicted 1938 in Figure 4. The length of the FIR feedback message MUST be set to 1939 2+2*N, where N is the number of FCI entries. 1941 0 1 2 3 1942 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1943 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1944 | SSRC | 1945 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1946 | Seq. nr | Reserved | 1947 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1949 Figure 4 - Syntax of an FCI entry in the FIR message 1951 SSRC (32 bits): The SSRC value of the media sender which is 1952 requested to send a decoder refresh point. 1954 Seq. nr (8 bits): Command sequence number. The sequence number 1955 space is unique for each pairing of the SSRC of command 1956 source and the SSRC of the command target. 
The sequence 1957 number SHALL be increased by 1 modulo 256 for each new 1958 command. A repetition SHALL NOT increase the sequence 1959 number. The initial value is arbitrary. 1961 Reserved (24 bits): All bits SHALL be set to 0 by the sender and 1962 SHALL be ignored on reception. 1964 The semantics of this feedback message is independent of the RTP 1965 payload type. 1967 4.3.1.2. Semantics 1969 Upon reception of FIR, the encoder MUST send a decoder refresh point 1970 (see section 2.2) as soon as possible. 1972 Note: Currently, video appears to be the only useful application 1973 for FIR, as it appears to be the only RTP payload widely deployed 1974 that relies heavily on media prediction across RTP packet 1975 boundaries. However, use of FIR could also reasonably be 1976 envisioned for other media types that share essential properties 1977 with compressed video, namely cross-frame prediction (whatever a 1978 frame may be for that media type). One possible example may be the 1979 dynamic updates of MPEG-4 scene descriptions. It is suggested that 1980 payload formats for such media types refer to FIR and other message 1981 types defined in this specification and in AVPF [RFC4585], instead 1982 of creating similar mechanisms in the payload specifications. The 1983 payload specifications may have to explain how the payload-specific 1984 terminologies map to the video-centric terminology used herein. 1986 Note: In environments where the sender has no control over the 1987 codec (e.g. when streaming pre-recorded and pre-coded content), the 1988 reaction to this command cannot be specified. One suitable 1989 reaction of a sender would be to skip forward in the video bit 1990 stream to the next decoder refresh point. In other scenarios, it 1991 may be preferable not to react to the command at all, e.g. when 1992 streaming to a large multicast group. Other reactions may also be 1993 possible. 
When deciding on a strategy, a sender could take into 1994 account factors such as the size of the receiving group, the 1995 "importance" of the sender of the FIR message (however "importance" 1996 may be defined in this specific application), the frequency of 1997 decoder refresh points in the content, and so on. However a 1998 session which predominately handles pre-coded content is not 1999 expected to use FIR at all. 2001 The sender MUST consider congestion control as outlined in 2002 section 5 2003 ., which MAY restrict its ability to send a decoder refresh 2004 point quickly. 2006 Note: The relationship between the Picture Loss Indication and FIR 2007 is as follows. As discussed in section 6.3.1 of AVPF [RFC4585], a 2008 Picture Loss Indication informs the decoder about the loss of a 2009 picture and hence the likelihood of misalignment of the reference 2010 pictures between the encoder and decoder. Such a scenario is 2011 normally related to losses in an ongoing connection. In point-to- 2012 point scenarios, and without the presence of advanced error 2013 resilience tools, one possible option for an encoder consists in 2014 sending a decoder refresh point. However, there are other options. 2015 One example is that the media sender ignores the PLI, because the 2016 embedded stream redundancy is likely to clean up the reproduced 2017 picture within a reasonable amount of time. The FIR, in contrast, 2018 leaves a (real-time) encoder no choice but to send a decoder 2019 refresh point. It does not allow the encoder to take into account 2020 any considerations such as the ones mentioned above. 2022 Note: Mandating a maximum delay for completing the sending of a 2023 decoder refresh point would be desirable from an application 2024 viewpoint, but is problematic from a congestion control point of 2025 view. "As soon as possible" as mentioned above appears to be a 2026 reasonable compromise. 
2028 FIR SHALL NOT be sent as a reaction to picture losses -- it is 2029 RECOMMENDED to use PLI instead. FIR SHOULD be used only in 2030 situations where not sending a decoder refresh point would render the 2031 video unusable for the users. 2033 Note: A typical example where sending FIR is appropriate is when, 2034 in a multipoint conference, a new user joins the session and no 2035 regular decoder refresh point interval is established. Another 2036 example would be a video switching MCU that changes streams. Here, 2037 normally, the MCU issues a FIR to the new sender so as to force it to 2038 emit a decoder refresh point. The decoder refresh point normally 2039 includes a Freeze Picture Release (defined outside this 2040 specification), which re-starts the rendering process of the 2041 receivers. Both techniques mentioned are commonly used in MCU- 2042 based multipoint conferences. 2044 Other RTP payload specifications such as RFC 4587 [RFC4587] already 2045 define a feedback mechanism for certain codecs. An application 2046 supporting both schemes MUST use the feedback mechanism defined in 2047 this specification when sending feedback. For backward compatibility 2048 reasons, such an application SHOULD also be capable of receiving and 2049 reacting to the feedback scheme defined in the respective RTP payload 2050 format, if this is required by that payload format. 2052 Within the common packet header for feedback messages (as defined in 2053 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 2054 indicates the source of the request, and the "SSRC of media source" 2055 is not used and SHALL be set to 0. The SSRCs of the media senders to 2056 which the FIR command applies are in the corresponding FCI entries. 2057 A FIR message MAY contain requests to multiple media senders, using 2058 one FCI entry per target media sender. 2060 4.3.1.3. Timing Rules 2062 The timing follows the rules outlined in section 3 of [RFC4585].
FIR 2063 commands MAY be used with early or immediate feedback. The FIR 2064 feedback message MAY be repeated. If using immediate feedback mode 2065 the repetition SHOULD wait at least one RTT before being sent. In 2066 early or regular RTCP mode the repetition is sent in the next regular 2067 RTCP packet. 2069 4.3.1.4. Handling of FIR Message in Mixer and Translators 2071 A media translator or a mixer performing media encoding of the 2072 content for which the session participant has issued a FIR is 2073 responsible for acting upon it. A mixer acting upon a FIR SHOULD NOT 2074 forward the message unaltered; instead it SHOULD issue a FIR itself. 2076 4.3.1.5. Remarks 2078 In conjunction with video codecs, FIR messages typically trigger the 2079 sending of full intra or IDR pictures. Both are several times larger 2080 than predicted (inter) pictures. Their size is independent of the 2081 time they are generated. In most environments, especially when 2082 employing bandwidth-limited links, the use of an intra picture 2083 implies an allowed delay that is a significant multiple of the 2084 typical frame duration. An example: if the sending frame rate is 10 2085 fps, and an intra picture is assumed to be 10 times as big as an 2086 inter picture, then the intra picture occupies 10 frame intervals of 100 ms each, and a full second of latency has to be accepted. In 2087 such an environment there is no need for a particularly short delay 2088 in sending the FIR message. Hence waiting for the next possible time 2089 slot allowed by RTCP timing rules as per [RFC4585] should not have an 2090 overly negative impact on the system performance. 2092 4.3.2. Temporal-Spatial Trade-off Request (TSTR) 2094 The TSTR feedback message is identified by RTCP packet type value 2095 PT=PSFB and FMT=5. 2097 The FCI field MUST contain one or more TSTR FCI entries. 2099 4.3.2.1. Message Format 2101 The content of the FCI entry for the Temporal-Spatial Trade-off 2102 Request is depicted in Figure 5.
The length of the feedback message 2103 MUST be set to 2+2*N, where N is the number of FCI entries included. 2105 0 1 2 3 2106 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2107 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2108 | SSRC | 2109 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2110 | Seq. nr | Reserved | Index | 2111 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2113 Figure 5 - Syntax of an FCI Entry in the TSTR Message 2115 SSRC (32 bits): The SSRC of the media sender which is requested to 2116 apply the trade-off value given in Index. 2118 Seq. nr (8 bits): Request sequence number. The sequence number 2119 space is unique for each pairing of the SSRC of request source 2120 and the SSRC of the request target. The sequence number 2121 SHALL be increased by 1 modulo 256 for each new command. 2122 A repetition SHALL NOT increase the sequence number. The 2123 initial value is arbitrary. 2125 Reserved (19 bits): All bits SHALL be set to 0 by the sender and 2126 SHALL be ignored on reception. 2128 Index (5 bits): An integer value between 0 and 31 that indicates 2129 the relative trade-off that is requested. An index value 2130 of 0 indicates the highest possible spatial quality, while 31 2131 indicates the highest possible temporal resolution. 2133 4.3.2.2. Semantics 2135 A decoder can suggest a temporal-spatial trade-off level by sending a 2136 TSTR message to an encoder. If the encoder is capable of adjusting 2137 its temporal-spatial trade-off, it SHOULD take into account the 2138 received TSTR message for future coding of pictures. A value of 0 2139 suggests a high spatial quality and a value of 31 suggests a high 2140 frame rate. The progression of values from 0 to 31 monotonically indicates 2141 a desire for higher frame rate. The index values do 2142 not correspond to precise values of spatial quality or frame rate.
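The TSTR FCI entry of Figure 5 and its sequence number rule can be sketched as follows. This Python fragment is an illustration under stated assumptions rather than a normative procedure: the class and function names are hypothetical, and the initial sequence number is chosen by the caller because the text leaves it arbitrary.

```python
import struct


def pack_tstr_entry(target_ssrc: int, seq_nr: int, index: int) -> bytes:
    """Pack one TSTR FCI entry: SSRC, then Seq. nr (8) | Reserved (19) | Index (5)."""
    if not 0 <= target_ssrc < 2**32:
        raise ValueError("SSRC must fit in 32 bits")
    if not 0 <= seq_nr <= 255:
        raise ValueError("Seq. nr must fit in 8 bits")
    if not 0 <= index <= 31:
        raise ValueError("Index must be between 0 and 31")
    word = (seq_nr << 24) | index  # the 19 reserved bits stay 0
    return struct.pack("!II", target_ssrc, word)


class TstrRequester:
    """Hypothetical helper keeping one sequence number space per request target."""

    def __init__(self, initial_seq: int = 0) -> None:
        self._initial_seq = initial_seq % 256
        self._last = {}  # target SSRC -> (seq_nr, index) of the latest request

    def new_request(self, target_ssrc: int, index: int) -> bytes:
        """A new request increases the sequence number by 1 modulo 256."""
        if target_ssrc in self._last:
            seq_nr = (self._last[target_ssrc][0] + 1) % 256
        else:
            seq_nr = self._initial_seq
        entry = pack_tstr_entry(target_ssrc, seq_nr, index)
        self._last[target_ssrc] = (seq_nr, index)
        return entry

    def repeat(self, target_ssrc: int) -> bytes:
        """A repetition reuses the sequence number of the latest request."""
        seq_nr, index = self._last[target_ssrc]
        return pack_tstr_entry(target_ssrc, seq_nr, index)
```

One requester object corresponds to one request source SSRC, so the per-target dictionary gives the pairing of source and target that the Seq. nr field description calls for.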
2144 The reaction to the reception of more than one TSTR message by a 2145 media sender from different media receivers is left open to the 2146 implementation. The selected trade-off SHALL be communicated to the 2147 media receivers by means of the TSTN message. 2149 Within the common packet header for feedback messages (as defined in 2150 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 2151 indicates the source of the request, and the "SSRC of media source" 2152 is not used and SHALL be set to 0. The SSRCs of the media senders to 2153 which the TSTR applies are in the corresponding FCI entries. 2155 A TSTR message MAY contain requests to multiple media senders, using 2156 one FCI entry per target media sender. 2158 4.3.2.3. Timing Rules 2160 The timing follows the rules outlined in section 3 of [RFC4585]. 2161 This request message is not time critical and SHOULD be sent using 2162 regular RTCP timing. The message MAY be sent with early or 2163 immediate feedback timing only if it is known that the user interface 2164 requires quick feedback. 2166 4.3.2.4. Handling of message in Mixers and Translators 2168 A mixer or media translator that encodes content sent to the session 2169 participant issuing the TSTR SHALL consider the request to determine 2170 if it can fulfill it by changing its own encoding parameters. A 2171 media translator unable to fulfill the request MAY forward the 2172 request unaltered towards the media sender. A mixer encoding for 2173 multiple session participants will need to consider the joint needs 2174 of these participants before generating a TSTR on its own behalf 2175 towards the media sender. See also the discussion in Section 3.5.2. 2177 4.3.2.5. Remarks 2179 The term "spatial quality" does not necessarily refer to the 2180 resolution, measured by the number of pixels the reconstructed video 2181 is using. In fact, in most scenarios the video resolution stays 2182 constant during the lifetime of a session.
However, all video 2183 compression standards have means to adjust the spatial quality at a 2184 given resolution, often influenced by the Quantizer Parameter or QP. 2185 A numerically low QP results in a good reconstructed picture quality, 2186 whereas a numerically high QP yields a coarse picture. The typical 2187 reaction of an encoder to this request is to change its rate control 2188 parameters to use a lower frame rate and a numerically lower (on 2189 average) QP, or vice versa. The precise mapping of Index value to 2190 frame rate and QP is intentionally left open here, as it depends on 2191 factors such as the compression standard employed, spatial 2192 resolution, content, bit rate, and so on. 2194 4.3.3. Temporal-Spatial Trade-off Notification (TSTN) 2196 The TSTN message is identified by RTCP packet type value PT=PSFB and 2197 FMT=6. 2199 The FCI field SHALL contain one or more TSTN FCI entries. 2201 4.3.3.1. Message Format 2203 The content of an FCI entry for the Temporal-Spatial Trade-off 2204 Notification is depicted in Figure 6. The length of the TSTN message 2205 MUST be set to 2+2*N, where N is the number of FCI entries. 2207 0 1 2 3 2208 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2209 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2210 | SSRC | 2211 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2212 | Seq. nr | Reserved | Index | 2213 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2215 Figure 6 - Syntax of the TSTN 2217 SSRC (32 bits): The SSRC of the source of the TSTR request which 2218 resulted in this Notification. 2220 Seq. nr (8 bits): The sequence number value from the TSTR request 2221 that is being acknowledged. 2223 Reserved (19 bits): All bits SHALL be set to 0 by the sender and 2224 SHALL be ignored on reception. 2226 Index (5 bits): The trade-off value the media sender is using 2227 henceforth.
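Seen from the requesting side, the TSTN fields listed above can be read back as in the following Python sketch. It is illustrative only: the names are hypothetical, and the matching rule (own SSRC plus the sequence number of the outstanding request) is one plausible reading of the SSRC and Seq. nr field descriptions.

```python
import struct
from typing import NamedTuple


class TstnEntry(NamedTuple):
    requester_ssrc: int  # SSRC of the source of the acknowledged TSTR
    seq_nr: int          # sequence number copied from that request
    index: int           # trade-off value the media sender is using henceforth


def parse_tstn_entry(data: bytes) -> TstnEntry:
    """Parse one 8-octet TSTN FCI entry; the 19 reserved bits are ignored."""
    ssrc, word = struct.unpack("!II", data)
    return TstnEntry(ssrc, word >> 24, word & 0x1F)


def acknowledges(entry: TstnEntry, own_ssrc: int, pending_seq_nr: int) -> bool:
    """True if the entry acknowledges this participant's outstanding request."""
    return entry.requester_ssrc == own_ssrc and entry.seq_nr == pending_seq_nr
```

The returned Index is reported as-is and is not compared with the requested value, since the media sender may select a different trade-off.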
2229 Informative note: The returned trade-off value (Index) may differ 2230 from the requested one, for example in cases where a media encoder 2231 cannot tune its trade-off, or when pre-recorded content is used. 2233 4.3.3.2. Semantics 2235 This feedback message is used to acknowledge the reception of a TSTR. 2236 One TSTN entry in a TSTN feedback message SHALL be sent for each TSTR 2237 entry targeted to this session participant, i.e. each received TSTR 2238 entry whose SSRC field contains the receiving entity's SSRC. 2240 A single TSTN message MAY acknowledge multiple requests using 2241 multiple FCI entries. The index value included SHALL be the same in 2242 all FCI entries of the TSTN message. Including an FCI entry for each 2243 requestor allows each requesting entity to determine that the media 2244 sender received the request. The Notification SHALL also be sent in 2245 response to TSTR repetitions received. If the request receiver has 2246 received TSTR messages with several different sequence numbers from a single 2247 requestor it SHALL only respond to the request with the highest 2248 (modulo 256) sequence number. 2250 The TSTN SHALL include the Temporal-Spatial Trade-off index that will 2251 be used as a result of the request. This is not necessarily the same 2252 index as requested, as the media sender may need to aggregate 2253 requests from several requesting session participants. It may also 2254 have some other policies or rules that limit the selection. 2256 Within the common packet header for feedback messages (as defined in 2257 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 2258 indicates the source of the Notification, and the "SSRC of media 2259 source" is not used and SHALL be set to 0. The SSRCs of the 2260 requesting entities to which the Notification applies are in the 2261 corresponding FCI entries. 2263 4.3.3.3. Timing Rules 2265 The timing follows the rules outlined in section 3 of [RFC4585].
2266 This acknowledgement message is not extremely time critical and 2267 SHOULD be sent using regular RTCP timing. 2269 4.3.3.4. Handling of TSTN in Mixer and Translators 2271 A mixer or translator that acts upon a TSTR SHALL also send the 2272 corresponding TSTN. In cases where it needs to forward a TSTR itself 2273 the notification message MAY need to be delayed until the TSTR has 2274 been responded to. 2276 4.3.3.5. Remarks 2278 None 2280 4.3.4. H.271 Video Back Channel Message (VBCM) 2282 The VBCM is identified by RTCP packet type value PT=PSFB and FMT=7. 2284 The FCI field MUST contain one or more VBCM FCI entries. 2286 4.3.4.1. 2287 Message Format 2289 The syntax of an FCI entry within the VBCM indication is depicted in 2290 Figure 7. 2292 0 1 2 3 2293 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2294 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2295 | SSRC | 2296 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2297 | Seq. nr |0| Payload Type| Length | 2298 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2299 | VBCM Octet String.... | Padding | 2300 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2302 Figure 7 - Syntax of an FCI Entry in the VBCM Message 2304 SSRC (32 bits): The SSRC value of the media sender that is requested 2305 to instruct its encoder to react to the VBCM message 2307 Seq. nr (8 bits): Command sequence number. The sequence number space 2308 is unique for pairing of the SSRC of command source and the 2309 SSRC of the command target. The sequence number SHALL be 2310 increased by 1 modulo 256 for each new command. A repetition 2311 SHALL NOT increase the sequence number. The initial value is 2312 arbitrary. 2314 0: Must be set to 0 by the sender and should not be acted upon by the 2315 message receiver. 2317 Payload Type (7 bits): The RTP payload type for which the VBCM bit 2318 stream must be interpreted. 
2320 Length (16 bits): The length of the VBCM octet string in octets 2321 exclusive of any padding octets. 2323 VBCM Octet String (Variable length): This is the octet string 2324 generated by the decoder carrying a specific feedback sub- 2325 message. 2327 Padding (Variable length): Bits set to 0 to make up a 32-bit 2328 boundary. 2330 4.3.4.2. Semantics 2331 The "payload" of the VBCM indication carries different types of 2332 codec-specific feedback information. The type of feedback 2333 information can be classified as a 'status report' (such as an 2334 indication that a bit stream was received without errors, or that a 2335 partial or complete picture or block was lost) or an 'update request' 2336 (such as complete refresh of the bit stream). 2338 Note: There are possible overlaps between the VBCM sub- 2339 messages and CCM/AVPF feedback messages, such as FIR. Please see 2340 section 3.5.3 for further discussion. 2342 The different types of feedback sub-messages carried in the VBCM are 2343 indicated by the "payloadType" as defined in [VBCM]. These sub- 2344 message types are reproduced below for convenience. "payloadType", 2345 in ITU-T Rec. H.271 terminology, refers to the sub-type of the H.271 2346 message and should not be confused with an RTP payload type.
2348 Payload Message Content 2349 Type 2350 --------------------------------------------------------------------- 2351 0 One or more pictures without detected bit stream error 2352 mismatch 2353 1 One or more pictures that are entirely or partially lost 2354 2 A set of blocks of one picture that is entirely or partially 2355 lost 2356 3 CRC for one parameter set 2357 4 CRC for all parameter sets of a certain type 2358 5 A "reset" request indicating that the sender should completely 2359 refresh the video bit stream as if no prior bit stream data 2360 had been received 2361 > 5 Reserved for future use by ITU-T 2363 Table 2: H.271 message types ("payloadTypes") 2365 The bit string or the "payload" of a VBCM message is of variable 2366 length and is self-contained and coded in a variable length, binary 2367 format. The media sender necessarily has to be able to parse this 2368 optimized binary format to make use of VBCM messages. 2370 Each of the different types of sub-messages (indicated by 2371 payloadType) may have different semantics depending on the codec 2372 used. 2374 Within the common packet header for feedback messages (as defined in 2375 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 2376 indicates the source of the request, and the "SSRC of media source" 2377 is not used and SHALL be set to 0. The SSRCs of the media senders to 2378 which the VBCM message applies to are in the corresponding FCI 2379 entries. The sender of the VBCM message MAY send H.271 messages to 2380 multiple media senders and MAY send more than one H.271 message to 2381 the same media sender within the same VBCM message. 2383 4.3.4.3. Timing Rules 2385 The timing follows the rules outlined in section 3 of [RFC4585]. The 2386 different sub-message types may have different properties in regards 2387 to the timing of messages that should be used. 
If several different 2388 types are included in the same feedback packet then the requirements 2389 for the sub-message type with the most stringent requirements should 2390 be followed. 2392 4.3.4.4. Handling of message in Mixer or Translator 2394 The handling of VBCM in a mixer or translator is sub-message type 2395 dependent. 2397 4.3.4.5. Remarks 2399 Please see section 3.5.3 for a discussion of the usage of H.271 2400 messages and messages defined in AVPF [RFC4585] and this memo with 2401 similar functionality. 2403 Note: There has been some discussion about whether the payload type field 2404 in this message is needed. It will be needed if there is 2405 potentially more than one VBCM-capable RTP payload type in the same 2406 session, and the semantics of a given VBCM message change between 2407 payload types. For example, the picture identification mechanism 2408 in messages of H.271 type 0 is fundamentally different between 2409 H.263 and H.264 (although both use the same syntax). Therefore, 2410 the payload type field is justified here. There was a further comment 2411 that for TSTR and FIR such a need does not exist, because the 2412 semantics of TSTR and FIR are either loosely enough defined, or 2413 generic enough, to apply to all video payloads currently in 2414 existence/envisioned. 2416 5. Congestion Control 2418 The correct application of the AVPF [RFC4585] timing rules prevents 2419 the network from being flooded by feedback messages. Hence, assuming 2420 a correct implementation and configuration, the RTCP channel cannot 2421 break its bit rate commitment and introduce congestion. 2423 The reception of some of the feedback messages modifies the behaviour 2424 of the media senders or, more specifically, the media encoders. Thus 2425 modified behaviour MUST respect the bandwidth limits that the 2426 application of congestion control provides.
For example, when a 2427 media sender is reacting to a FIR, the unusually high number of 2428 packets that form the decoder refresh point have to be paced in 2429 compliance with the congestion control algorithm, even if the user 2430 experience suffers from a slowly transmitted decoder refresh point. 2432 A change of the Temporary Maximum Media Stream Bit Rate value can 2433 only mitigate congestion, but not cause congestion as long as 2434 congestion control is also employed. An increase of the value by a 2435 request REQUIRES the media sender to use congestion control when 2436 increasing its transmission rate to that value. A reduction of the 2437 value results in a reduced transmission bit rate thus reducing the 2438 risk for congestion. 2440 6. Security Considerations 2442 The defined messages have certain properties that have security 2443 implications. These must be addressed and taken into account by 2444 users of this protocol. 2446 The defined setup signaling mechanism is sensitive to modification 2447 attacks that can result in session creation with sub-optimal 2448 configuration, and, in the worst case, session rejection. To prevent 2449 this type of attack, authentication and integrity protection of the 2450 setup signaling is required. 2452 Spoofed or maliciously created feedback messages of the type defined 2453 in this specification can have the following implications: 2455 a. severely reduced media bit rate due to false TMMBR messages 2456 that sets the maximum to a very low value; 2458 b. assignment of the ownership of a bounding tuple to the wrong 2459 participant within a TMMBN message, potentially causing 2460 unnecessary oscillation in the bounding set as the mistakenly 2461 identified owner reports a change in its tuple and the true 2462 owner possibly holds back on changes until a correct TMMBN 2463 message reaches the participants; 2465 c. 
sending TSTR requests that result in a video quality 2466 different from the user's desire, rendering the session less 2467 useful. 2469 d. Frequent FIR commands will potentially reduce the frame-rate, 2470 making the video jerky, due to the frequent usage of decoder 2471 refresh points. 2473 To prevent these attacks there is a need to apply authentication and 2474 integrity protection of the feedback messages. This can be 2475 accomplished against threats external to the current RTP session 2476 using the RTP profile that combines SRTP [SRTP] and AVPF into SAVPF 2477 [SAVPF]. In the mixer cases, separate security contexts and 2478 filtering can be applied between the mixer and the participants thus 2479 protecting other users on the mixer from a misbehaving participant. 2481 7. SDP Definitions 2483 Section 4 of [RFC4585] defines a new SDP [RFC4566] attribute, rtcp- 2484 fb, that may be used to negotiate the capability to handle specific 2485 AVPF commands and indications, such as Reference Picture Selection, 2486 Picture Loss Indication etc. The ABNF for rtcp-fb is described in 2487 section 4.2 of [RFC4585]. In this section we extend the rtcp-fb 2488 attribute to include the commands and indications that are described 2489 for codec control protocol in the present document. We also discuss 2490 the Offer/Answer implications for the codec control commands and 2491 indications. 2493 7.1. Extension of the rtcp-fb Attribute 2495 As described in AVPF [RFC4585], the rtcp-fb attribute indicates the 2496 capability of using RTCP feedback. AVPF specifies that the rtcp-fb 2497 attribute must only be used as a media level attribute and must not 2498 be provided at session level. All the rules described in [RFC4585] 2499 for rtcp-fb attribute relating to payload type and to multiple rtcp- 2500 fb attributes in a session description also apply to the new feedback 2501 messages defined in this memo. 
2503    The ABNF [RFC4234] for rtcp-fb as defined in [RFC4585] is

2505       "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF

2507    where rtcp-fb-pt is the payload type and rtcp-fb-val defines the type
2508    of the feedback message such as ack, nack, trr-int and rtcp-fb-id.
2509    For example, to indicate the support of feedback of picture loss
2510    indication, the sender declares the following in SDP

2512       v=0
2513       o=alice 3203093520 3203093520 IN IP4 host.example.com
2514       s=Media with feedback
2515       t=0 0
2516       c=IN IP4 host.example.com
2517       m=video 49170 RTP/AVPF 98
2518       a=rtpmap:98 H263-1998/90000
2519       a=rtcp-fb:98 nack pli

2521    In this document we define a new feedback value "ccm" which indicates
2522    the support of codec control using RTCP feedback messages. The "ccm"
2523    feedback value SHOULD be used with parameters, which indicate the
2524    specific codec control commands supported. In this draft we define
2525    four parameters, which can be used with the ccm feedback value type.

2527    o  "fir" indicates the support of the Full Intra Request (FIR).
2528    o  "tmmbr" indicates the support of the Temporary Maximum Media
2529       Stream Bit Rate Request/Notification (TMMBR/TMMBN). It has an
2530       optional subparameter to indicate the session maximum packet
2531       rate to be used. If not included this defaults to infinity.
2532    o  "tstr" indicates the support of the Temporal-Spatial Trade-off
2533       Request/Notification (TSTR/TSTN).
2534    o  "vbcm" indicates the support of H.271 video back channel
2535       messages (VBCM). It has zero or more subparameters identifying
2536       the supported H.271 "payloadType" values.

2538    In the ABNF for rtcp-fb-val defined in [RFC4585], there is a
2539    placeholder called rtcp-fb-id to define new feedback types. "ccm" is
2540    defined as a new feedback type in this document and the ABNF for the
2541    parameters for ccm are defined here (please refer to section 4.2 of
2542    [RFC4585] for complete ABNF syntax).

2544    rtcp-fb-param = SP "app" [SP byte-string]
2545                  / SP rtcp-fb-ccm-param
2546                  / ; empty

2548    rtcp-fb-ccm-param = "ccm" SP ccm-param

2550    ccm-param = "fir"   ; Full Intra Request
2551              / "tmmbr" [SP "smaxpr=" MaxPacketRateValue]
2552                        ; Temporary max media bit rate
2553              / "tstr"  ; Temporal Spatial Trade Off
2554              / "vbcm" *(SP subMessageType) ; H.271 VBCM messages
2555              / token [SP byte-string]
2556                        ; for future commands/indications
2557    subMessageType = 1*8DIGIT
2558    byte-string = <as defined in section 4.2 of [RFC4585]>
2559    MaxPacketRateValue = 1*15DIGIT

2561 7.2. Offer-Answer

2563    The Offer/Answer [RFC3264] implications for codec control protocol
2564    feedback messages are similar to those described in [RFC4585]. The
2565    offerer MAY indicate the capability to support selected codec
2566    commands and indications. The answerer MUST remove all ccm
2567    parameters which it does not understand or does not wish to use in
2568    this particular media session. The answerer MUST NOT add new ccm
2569    parameters in addition to what has been offered. The answer is
2570    binding for the media session and both offerer and answerer MUST only
2571    use feedback messages negotiated in this way.

2573    The session maximum packet rate parameter part of the TMMBR
2574    indication is declarative and everyone shall use the highest value
2575    indicated in a response. If the session maximum packet rate
2576    parameter is not present in an offer it SHALL NOT be included by the
2577    answerer.

2579 7.3. Examples

2581    Example 1: The following SDP describes a point-to-point video call
2582    with H.263, with the originator of the call declaring its capability
2583    to support the FIR and TSTR/TSTN codec control messages. The SDP is
2584    carried in a high level signaling protocol like SIP.

2586       v=0
2587       o=alice 3203093520 3203093520 IN IP4 host.example.com
2588       s=Point-to-Point call
2589       c=IN IP4 192.0.2.124
2590       m=audio 49170 RTP/AVP 0
2591       a=rtpmap:0 PCMU/8000
2592       m=video 51372 RTP/AVPF 98
2593       a=rtpmap:98 H263-1998/90000
2594       a=rtcp-fb:98 ccm tstr
2595       a=rtcp-fb:98 ccm fir

2597    In the above example, when the sender receives a TSTR message from
2598    the remote party it is capable of adjusting the trade off as
2599    indicated in the RTCP TSTN feedback message.

2601    Example 2: The following SDP describes a SIP end point joining a
2602    video mixer that is hosting a multiparty video conferencing session.
2603    The participant supports only the FIR (Full Intra Request) codec
2604    control command and it declares it in its session description.

2606       v=0
2607       o=alice 3203093520 3203093520 IN IP4 host.example.com
2608       s=Multiparty Video Call
2609       c=IN IP4 192.0.2.124
2610       m=audio 49170 RTP/AVP 0
2611       a=rtpmap:0 PCMU/8000
2612       m=video 51372 RTP/AVPF 98
2613       a=rtpmap:98 H263-1998/90000
2614       a=rtcp-fb:98 ccm fir

2616    When the video MCU decides to route the video of this participant it
2617    sends an RTCP FIR feedback message. Upon receiving this feedback
2618    message the end point is required to send a decoder refresh point.

2620    Example 3: The following example describes the Offer/Answer
2621    implications for the codec control messages. The Offerer wishes to
2622    support "tstr", "fir" and "tmmbr". The offered SDP is

2624    -------------> Offer
2625       v=0
2626       o=alice 3203093520 3203093520 IN IP4 host.example.com
2627       s=Offer/Answer
2628       c=IN IP4 192.0.2.124
2629       m=audio 49170 RTP/AVP 0
2630       a=rtpmap:0 PCMU/8000
2631       m=video 51372 RTP/AVPF 98
2632       a=rtpmap:98 H263-1998/90000
2633       a=rtcp-fb:98 ccm tstr
2634       a=rtcp-fb:98 ccm fir
2635       a=rtcp-fb:* ccm tmmbr smaxpr=120

2637    The answerer wishes to support only the FIR and TSTR/TSTN messages
2638    and the answerer SDP is

2640    <---------------- Answer

2642       v=0
2643       o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2644       s=Offer/Answer
2645       c=IN IP4 192.0.2.37
2646       m=audio 47190 RTP/AVP 0
2647       a=rtpmap:0 PCMU/8000
2648       m=video 53273 RTP/AVPF 98
2649       a=rtpmap:98 H263-1998/90000
2650       a=rtcp-fb:98 ccm tstr
2651       a=rtcp-fb:98 ccm fir

2653    Example 4: The following example describes the Offer/Answer
2654    implications for H.271 Video back channel messages (VBCM). The
2655    Offerer wishes to support VBCM and the sub-messages of payloadType 1
2656    (one or more pictures that are entirely or partially lost) and 2 (a
2657    set of blocks of one picture that are entirely or partially lost).

2659    -------------> Offer
2660       v=0
2661       o=alice 3203093520 3203093520 IN IP4 host.example.com
2662       s=Offer/Answer
2663       c=IN IP4 192.0.2.124
2664       m=audio 49170 RTP/AVP 0
2665       a=rtpmap:0 PCMU/8000
2666       m=video 51372 RTP/AVPF 98
2667       a=rtpmap:98 H263-1998/90000
2668       a=rtcp-fb:98 ccm vbcm 1 2

2670    The answerer wishes to support sub-messages of type 1 only

2672    <---------------- Answer

2674       v=0
2675       o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2676       s=Offer/Answer
2677       c=IN IP4 192.0.2.37
2678       m=audio 47190 RTP/AVP 0
2679       a=rtpmap:0 PCMU/8000
2680       m=video 53273 RTP/AVPF 98
2681       a=rtpmap:98 H263-1998/90000
2682       a=rtcp-fb:98 ccm vbcm 1

2684    So in the above example only VBCM indications comprised of
2685    "payloadType" 1 will be supported.

2687 8. IANA Considerations

2689    The new value "ccm" needs to be registered with IANA in the "rtcp-fb"
2690    Attribute Values registry located at the time of publication at:
2691    http://www.iana.org/assignments/sdp-parameters

2693       Value name: ccm
2694       Long Name:  Codec Control Commands and Indications
2695       Reference:  RFC XXXX

2697    A new registry "Codec Control Messages" needs to be created to hold
2698    "ccm" parameters located at time of publication at:
2699    http://www.iana.org/assignments/sdp-parameters

2701    New registrations in this registry follow the "Specification
2702    required" policy as defined by [RFC2434]. In addition, they are
2703    required to indicate which, if any, additional RTCP feedback types,
2704    such as "nack" or "ack", they can be used with.

2706    The initial content of the registry is the following values:

2708       Value name:  fir
2709       Long name:   Full Intra Request Command
2710       Usable with: ccm
2711       Reference:   RFC XXXX

2713       Value name:  tmmbr
2714       Long name:   Temporary Maximum Media Stream Bit Rate
2715       Usable with: ccm
2716       Reference:   RFC XXXX

2718       Value name:  tstr
2719       Long name:   Temporal Spatial Trade Off
2720       Usable with: ccm
2721       Reference:   RFC XXXX

2723       Value name:  vbcm
2724       Long name:   H.271 video back channel messages
2725       Usable with: ccm
2726       Reference:   RFC XXXX

2728    The following values need to be registered as FMT values in the "FMT
2729    Values for RTPFB Payload Types" registry located at the time of
2730    publication at: http://www.iana.org/assignments/rtp-parameters

2732    RTPFB range
2733    Name           Long Name                                Value Reference
2734    -------------- ---------------------------------------  ----- ---------
2735    Reserved                                                  2   [RFCxxxx]
2736    TMMBR          Temporary Maximum Media Stream Bit         3   [RFCxxxx]
2737                   Rate Request
2738    TMMBN          Temporary Maximum Media Stream Bit         4   [RFCxxxx]
2739                   Rate Notification

2741    The following values need to be registered as FMT values in the "FMT
2742    Values for PSFB Payload Types" registry located at the time of
2743    publication at: http://www.iana.org/assignments/rtp-parameters

2745    PSFB range
2746    Name           Long Name                                Value Reference
2747    -------------- ---------------------------------------  ----- ---------
2748    FIR            Full Intra Request Command                 4   [RFCxxxx]
2749    TSTR           Temporal-Spatial Trade-off Request         5   [RFCxxxx]
2750    TSTN           Temporal-Spatial Trade-off Notification    6   [RFCxxxx]
2751    VBCM           Video Back Channel Message                 7   [RFCxxxx]

2753 9. Contributors

2755    Tom Taylor has made a very significant contribution, for which the
2756    authors are very grateful, to this specification by helping rewrite
2757    the specification. Especially the parts regarding the algorithm for
2758    determining bounding sets for TMMBR have benefited.

2760 10. Acknowledgements

2762    The authors would like to thank Andrea Basso, Orit Levin, and Nermeen
2763    Ismail for their work on the requirement and discussion draft
2764    [Basso].

2766    Drafts of this memo were reviewed and extensively commented on by Roni
2767    Even, Colin Perkins, Randell Jesup, Keith Lantz, Harikishan Desineni,
2768    Guido Franceschini and others. The authors appreciate these reviews.

2770    Funding for the RFC Editor function is currently provided by the
2771    Internet Society.

2773 11. References

2775 11.1. Normative references

2777    [RFC4585]    Ott, J., Wenger, S., Sato, N., Burmeister, C., Rey, J.,
2778                 "Extended RTP Profile for Real-Time Transport Control
2779                 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
2780                 July 2006.
2781    [RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
2782                 Requirement Levels", BCP 14, RFC 2119, March 1997.
2783    [RFC3550]    Schulzrinne, H., Casner, S., Frederick, R., and V.
2784                 Jacobson, "RTP: A Transport Protocol for Real-Time
2785                 Applications", STD 64, RFC 3550, July 2003.
2786    [RFC4566]    Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
2787                 Description Protocol", RFC 4566, July 2006.
2788    [RFC3264]    Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
2789                 with Session Description Protocol (SDP)", RFC 3264, June
2790                 2002.
2791    [Topologies] M. Westerlund, and S. Wenger, "RTP Topologies", draft-
2792                 ietf-avt-topologies-04, work in progress, Feb 2007.
2793    [RFC2434]    Narten, T. and H. Alvestrand, "Guidelines for Writing an
2794                 IANA Considerations Section in RFCs", BCP 26, RFC 2434,
2795                 October 1998.
2796    [RFC4234]    Crocker, D. and P. Overell, "Augmented BNF for Syntax
2797                 Specifications: ABNF", RFC 4234, October 2005.

2799 11.2. Informative references

2801    [Basso]      A. Basso, et al., "Requirements for transport of video
2802                 control commands", draft-basso-avt-videoconreq-02.txt,
2803                 expired Internet Draft, October 2004.
2804    [AVC]        Joint Video Team of ITU-T and ISO/IEC JTC 1, Draft ITU-T
2805                 Recommendation and Final Draft International Standard of
2806                 Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC
2807                 14496-10 AVC), Joint Video Team (JVT) of ISO/IEC MPEG
2808                 and ITU-T VCEG, JVT-G050, March 2003.
2809    [H245]       ITU-T Rec. H.245, "Control protocol for multimedia
2810                 communication", May 2006.
2811    [NEWPRED]    S. Fukunaga, T. Nakai, and H. Inoue, "Error Resilient
2812                 Video Coding by Dynamic Replacing of Reference
2813                 Pictures," in Proc. Globecom'96, vol. 3, pp. 1503 - 1508,
2814                 1996.
2815    [SRTP]       Baugher, M., McGrew, D., Naslund, M., Carrara, E., and
2816                 K. Norrman, "The Secure Real-time Transport Protocol
2817                 (SRTP)", RFC 3711, March 2004.
2818    [RFC4587]    Even, R., "RTP Payload Format for H.261 Video Streams",
2819                 RFC 4587, August 2006.

2821    [SAVPF]      J. Ott, E. Carrara, "Extended Secure RTP Profile for
2822                 RTCP-based Feedback (RTP/SAVPF),"
2823                 draft-ietf-avt-profile-savpf-10.txt, Feb, 2007.
2824    [RFC3525]    Groves, C., Pantaleo, M., Anderson, T., and T. Taylor,
2825                 "Gateway Control Protocol Version 1", RFC 3525, June
2826                 2003.
2827    [RFC3448]    M. Handley, S. Floyd, J. Padhye, J. Widmer, "TCP
2828                 Friendly Rate Control (TFRC): Protocol Specification",
2829                 RFC 3448, Jan 2003.
2830    [VBCM]       ITU-T Rec. H.271, "Video Back Channel Messages", June
2831                 2006.
2832    [RFC3890]    Westerlund, M., "A Transport Independent Bandwidth
2833                 Modifier for the Session Description Protocol (SDP)",
2834                 RFC 3890, September 2004.
2835    [RFC4340]    Kohler, E., Handley, M., and S. Floyd, "Datagram
2836                 Congestion Control Protocol (DCCP)", RFC 4340, March
2837                 2006.
2838    [RFC3261]    Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
2839                 A., Peterson, J., Sparks, R., Handley, M., and E.
2840                 Schooler, "SIP: Session Initiation Protocol", RFC 3261,
2841                 June 2002.
2842    [RFC2198]    Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
2843                 Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
2844                 Parisis, "RTP Payload for Redundant Audio Data", RFC
2845                 2198, September 1997.

2847 12. Authors' Addresses

2849    Stephan Wenger
2850    Nokia Corporation
2851    975, Page Mill Road,
2852    Palo Alto, CA 94304
2853    USA

2855    Phone: +1-650-862-7368
2856    EMail: stewe@stewe.org

2858    Umesh Chandra
2859    Nokia Research Center
2860    975, Page Mill Road,
2861    Palo Alto, CA 94304
2862    USA

2864    Phone: +1-650-796-7502
2865    EMail: Umesh.Chandra@nokia.com

2866    Magnus Westerlund
2867    Ericsson Research
2868    Ericsson AB
2869    SE-164 80 Stockholm, SWEDEN

2871    Phone: +46 8 7190000
2872    EMail: magnus.westerlund@ericsson.com

2874    Bo Burman
2875    Ericsson Research
2876    Ericsson AB
2877    SE-164 80 Stockholm, SWEDEN

2879    Phone: +46 8 7190000
2880    EMail: bo.burman@ericsson.com

2882 Full Copyright Statement

2884    Copyright (C) The IETF Trust (2007).

2886    This document is subject to the rights, licenses and restrictions
2887    contained in BCP 78, and except as set forth therein, the authors
2888    retain all their rights.

2890    This document and the information contained herein are provided on an
2891    "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2892    OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST
2893    AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
2894    EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
2895    THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
2896    IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
2897    PURPOSE.

2899 Intellectual Property

2901    The IETF takes no position regarding the validity or scope of any
2902    Intellectual Property Rights or other rights that might be claimed to
2903    pertain to the implementation or use of the technology described in
2904    this document or the extent to which any license under such rights
2905    might or might not be available; nor does it represent that it has
2906    made any independent effort to identify any such rights. Information
2907    on the procedures with respect to rights in RFC documents can be
2908    found in BCP 78 and BCP 79.

2910    Copies of IPR disclosures made to the IETF Secretariat and any
2911    assurances of licenses to be made available, or the result of an
2912    attempt made to obtain a general license or permission for the use of
2913    such proprietary rights by implementers or users of this
2914    specification can be obtained from the IETF on-line IPR repository at
2915    http://www.ietf.org/ipr.

2917    The IETF invites any interested party to bring to its attention any
2918    copyrights, patents or patent applications, or other proprietary
2919    rights that may cover technology that may be required to implement
2920    this standard. Please address the information to the IETF at
2921    ietf-ipr@ietf.org.

2923 Acknowledgement

2925    Funding for the RFC Editor function is provided by the IETF
2926    Administrative Support Activity (IASA).

2928 RFC Editor Considerations

2930    The RFC editor is requested to replace all occurrences of XXXX with
2931    the RFC number this document receives.