Network Working Group                                     Stephan Wenger
INTERNET-DRAFT                                             Umesh Chandra
Expires: October 2007                                              Nokia
Intended Status: Proposed Standard                     Magnus Westerlund
                                                               Bo Burman
                                                                Ericsson
                                                            May 28, 2007

                     Codec Control Messages in the
             RTP Audio-Visual Profile with Feedback (AVPF)
                   <draft-ietf-avt-avpf-ccm-06.txt>

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document specifies a few extensions to the messages defined in
   the Audio-Visual Profile with Feedback (AVPF). They are helpful
   primarily in conversational multimedia scenarios where centralized
   multipoint functionalities are in use. However, some are also usable
   in smaller multicast environments and point-to-point calls. The
   extensions discussed are messages related to the ITU-T H.271 Video
   Back Channel, Full Intra Request, Temporary Maximum Media Stream Bit
   Rate and Temporal Spatial Trade-off.

TABLE OF CONTENTS

   1. Introduction
   2. Definitions
      2.1. Glossary
      2.2. Terminology
      2.3. Topologies
   3. Motivation (Informative)
      3.1. Use Cases
      3.2. Using the Media Path
      3.3. Using AVPF
         3.3.1. Reliability
      3.4. Multicast
      3.5. Feedback Messages
         3.5.1. Full Intra Request Command
            3.5.1.1. Reliability
         3.5.2. Temporal Spatial Trade-off Request and Notification
            3.5.2.1. Point-to-Point
            3.5.2.2. Point-to-Multipoint Using Multicast or
                     Translators
            3.5.2.3. Point-to-Multipoint Using RTP Mixer
            3.5.2.4. Reliability
         3.5.3. H.271 Video Back Channel Message
            3.5.3.1. Reliability
         3.5.4. Temporary Maximum Media Stream Bit Rate Request and
                Notification
            3.5.4.1. Behavior for media receivers using TMMBR
            3.5.4.2. Algorithm for establishing current limitations
            3.5.4.3. Use of TMMBR in a Mixer Based Multipoint
                     Operation
            3.5.4.4. Use of TMMBR in Point-to-Multipoint Using
                     Multicast or Translators
            3.5.4.5. Use of TMMBR in Point-to-point operation
            3.5.4.6. Reliability
   4. RTCP Receiver Report Extensions
      4.1. Design Principles of the Extension Mechanism
      4.2. Transport Layer Feedback Messages
         4.2.1. Temporary Maximum Media Stream Bit Rate Request
                (TMMBR)
            4.2.1.1. Message Format
            4.2.1.2. Semantics
            4.2.1.3. Timing Rules
            4.2.1.4. Handling in Translator and Mixers
         4.2.2. Temporary Maximum Media Stream Bit Rate Notification
                (TMMBN)
            4.2.2.1. Message Format
            4.2.2.2. Semantics
            4.2.2.3. Timing Rules
            4.2.2.4. Handling by Translators and Mixers
      4.3. Payload Specific Feedback Messages
         4.3.1. Full Intra Request (FIR)
            4.3.1.1. Message Format
            4.3.1.2. Semantics
            4.3.1.3. Timing Rules
            4.3.1.4. Handling of FIR Message in Mixer and
                     Translators
            4.3.1.5. Remarks
         4.3.2. Temporal-Spatial Trade-off Request (TSTR)
            4.3.2.1. Message Format
            4.3.2.2. Semantics
            4.3.2.3. Timing Rules
            4.3.2.4. Handling of message in Mixers and Translators
            4.3.2.5. Remarks
         4.3.3. Temporal-Spatial Trade-off Notification (TSTN)
            4.3.3.1. Message Format
            4.3.3.2. Semantics
            4.3.3.3. Timing Rules
            4.3.3.4. Handling of TSTN in Mixer and Translators
            4.3.3.5. Remarks
         4.3.4. H.271 Video Back Channel Message (VBCM)
            4.3.4.1. Message Format
            4.3.4.2. Semantics
            4.3.4.3. Timing Rules
            4.3.4.4. Handling of message in Mixer or Translator
            4.3.4.5. Remarks
   5. Congestion Control
   6. Security Considerations
   7. SDP Definitions
      7.1. Extension of the rtcp-fb Attribute
      7.2. Offer-Answer
      7.3. Examples
   8. IANA Considerations
   9. Acknowledgements
   10. References
      10.1. Normative references
      10.2. Informative references
   11. Authors' Addresses

1. Introduction

   When the Audio-Visual Profile with Feedback (AVPF) [RFC4585] was
   developed, the main emphasis lay in the efficient support of point-
   to-point and small multipoint scenarios without centralized
   multipoint control. However, in practice, many small multipoint
   conferences operate utilizing devices known as Multipoint Control
   Units (MCUs). Long-standing experience of the conversational video
   conferencing industry suggests that there is a need for a few
   additional feedback messages, to support centralized multipoint
   conferencing efficiently. Some of the messages have applications
   beyond centralized multipoint, and this is indicated in the
   description of the message. This is especially true for the message
   intended to carry ITU-T Rec. H.271 [H.271] bit strings for Video Back
   Channel messages.

   In Real-time Transport Protocol (RTP) [RFC3550] terminology, MCUs
   comprise mixers and translators. Most MCUs also include signaling
   support. During the development of this memo, it was noticed that
   there is considerable confusion in the community related to the use
   of terms such as mixer, translator, and MCU. In response to these
   concerns, a number of topologies have been identified that are of
   practical relevance to the industry, but are not documented in
   sufficient detail in [RFC3550]. These topologies are documented in
   [Topologies], and understanding this memo requires previous or
   parallel study of [Topologies].

   Some of the messages defined here are forward only, in that they do
   not require an explicit notification to the message emitter that they
   have been received and/or indicating the message receiver's actions.
   Other messages require a response, leading to a two-way communication
   model that one could view as useful for control purposes. However,
   it is not the intention of this memo to open up RTP Control Protocol
   (RTCP) to a generalized control protocol. All mentioned messages
   have relatively strict real-time constraints, in the sense that their
   value diminishes with increased delay. This makes the use of more
   traditional control protocol means, such as Session Initiation
   Protocol (SIP) re-INVITEs [RFC3261], undesirable when used for the
   same purpose. Furthermore, all messages are of a very simple format
   that can be easily processed by an RTP/RTCP sender/receiver.
   Finally, all messages relate only to the RTP stream with which they
   are associated, and not to any other property of a communication
   system. In particular, none of them relate to the properties of the
   access links traversed by the session.

2. Definitions

2.1. Glossary

   AIMD   - Additive Increase Multiplicative Decrease
   AVPF   - The extended RTP profile for RTCP-based feedback
   FEC    - Forward Error Correction
   FCI    - Feedback Control Information [RFC4585]
   FIR    - Full Intra Request
   MCU    - Multipoint Control Unit
   MPEG   - Moving Picture Experts Group
   TMMBN  - Temporary Maximum Media Stream Bit Rate Notification
   TMMBR  - Temporary Maximum Media Stream Bit Rate Request
   PLI    - Picture Loss Indication
   PR     - Packet rate
   QP     - Quantizer Parameter
   RTT    - Round trip time
   SSRC   - Synchronization Source
   TSTN   - Temporal Spatial Trade-off Notification
   TSTR   - Temporal Spatial Trade-off Request
   VBCM   - Video Back Channel Message indication

2.2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   Message:
         An RTCP feedback message [RFC4585] defined by this
         specification, of one of the following types:

         Request:
            Message that requires acknowledgement

         Command:
            Message that forces the receiver to an action

         Indication:
            Message that reports a situation

         Notification:
            Message that provides a notification that an event has
            occurred. Notifications are commonly generated in response
            to a Request.

         Note that, with the exception of "Notification", this
         terminology is in alignment with ITU-T Rec. H.245 [H245].

   Decoder Refresh Point:
         A bit string, packetized in one or more RTP packets, which
         completely resets the decoder to a known state.

         Examples for "hard" decoder refresh points are Intra pictures
         in H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 part 2, and
         Instantaneous Decoder Refresh (IDR) pictures in H.264.
         "Gradual" decoder refresh points may also be used; see for
         example [AVC]. While both "hard" and "gradual" decoder
         refresh points are acceptable in the scope of this
         specification, in most cases the user experience will benefit
         from using a "hard" decoder refresh point.

         A decoder refresh point also contains all header information
         above the picture layer (or equivalent, depending on the video
         compression standard) that is conveyed in-band. In H.264, for
         example, a decoder refresh point contains parameter set
         Network Abstraction Layer (NAL) units that generate parameter
         sets necessary for the decoding of the following slice/data
         partition NAL units (and that are not conveyed out of band).

   Decoding:
         The operation of reconstructing the media stream.

   Rendering:
         The operation of presenting (parts of) the reconstructed media
         stream to the user.

   Stream thinning:
         The operation of removing some of the packets from a media
         stream. Stream thinning, preferably, is media-aware, implying
         that media packets are removed in the order of increasing
         relevance to the reproductive quality. However, even when
         employing media-aware stream thinning, most media streams
         quickly lose quality when subject to increasing levels of
         thinning. Media-unaware stream thinning leads to even worse
         quality degradation. In contrast to transcoding, stream
         thinning is typically seen as a computationally lightweight
         operation.

   Media:
         Often used (sometimes in conjunction with terms like bit rate,
         stream, sender ...) to identify the content of the forward RTP
         packet stream (carrying the codec data), to which the codec
         control message applies.

   Media Stream:
         The stream of RTP packets labeled with a single
         Synchronization Source (SSRC) carrying the media (and also in
         some cases repair information such as retransmission or
         Forward Error Correction (FEC) information).

   Total media bit rate:
         The total bits per second transferred in a media stream,
         measured at an observer-selected protocol layer and averaged
         over a reasonable timescale, the length of which depends on
         the application. In general, a media sender and a media
         receiver will observe different total media bit rates for the
         same stream, first because they may have selected different
         reference protocol layers, and second, because of changes in
         per-packet overhead along the transmission path. The goal
         with bit rate averaging is to be able to ignore any burstiness
         on very short timescales, below for example 100 ms, introduced
         by scheduling or link layer packetization effects.

   Maximum total media bit rate:
         The upper limit on total media bit rate for a given media
         stream at a particular receiver and for its selected protocol
         layer. Note that this value cannot be measured on the received
         media stream; instead, it needs to be calculated or determined
         through other means, such as QoS negotiations or local
         resource limitations. Also note that this value is an average
         (on a timescale that is reasonable for the application) and
         that it may be different from the instantaneous bit rate seen
         by packets in the media stream.
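
   As a rough illustration of why two observers can see different total
   media bit rates for the same stream, the sketch below applies the
   relation stated under "Net media bit rate" in this section (total
   rate equals net rate plus per-packet overhead times packet rate) at
   two reference protocol layers. The function name, the 64 kbit/s net
   rate, and the 50 packets per second are assumptions chosen only for
   illustration; the header sizes are the minimum IPv4, UDP, and fixed
   RTP header lengths.

```python
IPV4_HEADER_BYTES = 20   # IPv4 header without options
UDP_HEADER_BYTES = 8
RTP_HEADER_BYTES = 12    # fixed RTP header, no CSRCs or header extensions


def total_media_bit_rate(net_bit_rate_bps, packet_rate_pps, overhead_bytes):
    """Total bit rate = net bit rate + per-packet overhead * packet rate."""
    return net_bit_rate_bps + overhead_bytes * 8 * packet_rate_pps


net_rate = 64_000   # assumed net media bit rate, in bits per second
packet_rate = 50    # assumed packet rate, in packets per second

# Observer A selects the RTP layer as its reference protocol layer.
rate_at_rtp = total_media_bit_rate(net_rate, packet_rate, RTP_HEADER_BYTES)

# Observer B measures down to, and including, the IP header.
rate_at_ip = total_media_bit_rate(
    net_rate,
    packet_rate,
    IPV4_HEADER_BYTES + UDP_HEADER_BYTES + RTP_HEADER_BYTES,
)

print(rate_at_rtp, rate_at_ip)
```

   With these assumed figures the two observers differ by 11200 bit/s
   even though the net media bit rate is identical for both.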

   Overhead:
         All protocol header information required to convey a packet
         with media data from sender to receiver, from the application
         layer down to a pre-defined protocol level (for example down
         to, and including, the IP header). Overhead may include, for
         example, IP, UDP, and RTP headers, any layer 2 headers, any
         Contributing Sources (CSRCs), RTP-Padding, and RTP header
         extensions. Overhead excludes any RTP payload headers and the
         payload itself.

   Net media bit rate:
         The bit rate carried by a media stream, net of overhead. That
         is, the bits per second accounted for by encoded media, any
         applicable payload headers, and any directly associated meta
         payload information placed in the RTP packet. A typical
         example of the latter is redundancy data provided by the use
         of RFC 2198 [RFC2198]. Note that, unlike the total media bit
         rate, the net media bit rate will have the same value at the
         media sender and at the media receiver unless any mixing or
         translating of the media has occurred.

         For a given observer, the total media bit rate for a media
         stream is equal to the sum of the net media bit rate and the
         per-packet overhead as defined above multiplied by the packet
         rate.

   Feasible region:
         The set of all combinations of packet rate and net media bit
         rate that do not exceed the restrictions in maximum media bit
         rate placed on a given media sender by the Temporary Maximum
         Media Stream Bit Rate Request (TMMBR) messages it has
         received. The feasible region will change as new TMMBR
         messages are received.

   Bounding set:
         The set of TMMBR tuples, selected from all those received at a
         given media sender, that define the feasible region for that
         media sender. The media sender uses an algorithm such as that
         in section 3.5.4.2 to determine or iteratively approximate the
         current bounding set, and reports that set back to the media
         receivers in a Temporary Maximum Media Stream Bit Rate
         Notification (TMMBN) message.

2.3. Topologies

   Please refer to [Topologies] for an in-depth discussion. The
   topologies referred to throughout this memo are labeled (consistently
   with [Topologies]) as follows:

   Topo-Point-to-Point . . . . . point-to-point communication
   Topo-Multicast  . . . . . . . multicast communication as in RFC 3550
   Topo-Translator . . . . . . . translator based as in RFC 3550
   Topo-Mixer  . . . . . . . . . mixer based as in RFC 3550
   Topo-Video-switch-MCU . . . . video switching MCU
   Topo-RTCP-terminating-MCU . . mixer but terminating RTCP

3. Motivation (Informative)

   This section discusses the motivation and usage of the different
   video and media control messages. The video control messages have
   been under discussion for a long time, and a requirements draft was
   drawn up [Basso]. This draft has expired; however, we quote relevant
   sections of it to provide motivation and requirements.

3.1. Use Cases

   There are a number of possible usages for the proposed feedback
   messages. Let us begin by looking through the use cases Basso et al.
   [Basso] proposed. Some of the use cases have been reformulated and
   comments have been added.

   1. An RTP video mixer composes multiple encoded video sources into a
      single encoded video stream. Each time a video source is added,
      the RTP mixer needs to request a decoder refresh point from the
      video source, so as to start an uncorrupted prediction chain on
      the spatial area of the mixed picture occupied by the data from
      the new video source.

   2. An RTP video mixer receives multiple encoded RTP video streams
      from conference participants, and dynamically selects one of the
      streams to be included in its output RTP stream. At the time of a
      bit stream change (determined through means such as voice
      activation or the user interface), the mixer requests a decoder
      refresh point from the remote source, in order to avoid using
      unrelated content as reference data for inter picture prediction.
      After requesting the decoder refresh point, the video mixer stops
      the delivery of the current RTP stream and monitors the RTP stream
      from the new source until it detects data belonging to the decoder
      refresh point. At that time, the RTP mixer starts forwarding the
      newly selected stream to the receiver(s).

   3. An application needs to signal to the remote encoder that the
      desired trade-off between temporal and spatial resolution has
      changed. For example, one user may prefer a higher frame rate and
      a lower spatial quality, and another user may prefer the opposite.
      This choice is also highly content dependent. Many current video
      conferencing systems offer in the user interface a mechanism to
      make this selection, usually in the form of a slider. The
      mechanism is helpful in point-to-point, centralized multipoint and
      non-centralized multipoint uses.

   4. Use case 4 of the Basso draft applies only to Picture Loss
      Indication (PLI) as defined in AVPF [RFC4585] and is not
      reproduced here.

   5. Use case 5 of the Basso draft relates to a mechanism known as
      "freeze picture request". Sending freeze picture requests
      over a non-reliable forward RTCP channel has been identified as
      problematic. Therefore, no freeze picture request has been
      included in this memo, and the use case discussion is not
      reproduced here.

   6. A video mixer dynamically selects one of the received video
      streams to be sent out to participants and tries to provide the
      highest bit rate possible to all participants, while minimizing
      stream trans-rating. One way of achieving this is to set up
      sessions with endpoints using the maximum bit rate accepted by
      each endpoint, and accepted by the call admission method used by
      the mixer. By means of commands that reduce the maximum media
      stream bit rate below what has been negotiated during session set
      up, the mixer can reduce the maximum bit rate sent by endpoints to
      the lowest of all the accepted bit rates. As the lowest accepted
      bit rate changes due to endpoints joining and leaving or due to
      network congestion, the mixer can adjust the limits at which
      endpoints can send their streams to match the new value. The
      mixer then requests a new maximum bit rate, which is equal to or
      less than the maximum bit rate negotiated at session setup for a
      specific media stream, and the remote endpoint can respond with
      the actual bit rate that it can support.

   The picture that Basso et al. draw up covers most applications we
   foresee. However, we would like to extend the list with two
   additional use cases:

   7. Currently deployed congestion control algorithms (AIMD and TFRC
      [RFC3448]) probe for additional available capacity as long as
      there is something to send. With congestion control algorithms
      using packet loss as the indication for congestion, this probing
      generally results in reduced media quality (often to a point
      where the distortion is large enough to make the media unusable),
      due to packet loss and increased delay.

      In a number of deployment scenarios, especially cellular ones, the
      bottleneck link is often the last hop link. That cellular link
      also commonly has some type of QoS negotiation enabling the
      cellular device to learn the maximal bit rate available over this
      last hop. A media receiver behind this link can, in most (if not
      all) cases, calculate at least an upper bound for the bit rate
      available for each media stream it presently receives. How this
      is done is an implementation detail and not discussed herein.
      Indicating the maximum available bit rate to the transmitting
      party for the various media streams can be beneficial to prevent
      that party from probing for bandwidth for this stream in excess of
      a known hard limit. For cellular or other mobile devices, the
      known available bit rate for each stream (deduced from the link
      bit rate) can change quickly, due to handover to another
      transmission technology, QoS renegotiation due to congestion, etc.
      To enable minimal disruption of service, quick convergence is
      necessary, and therefore media path signaling is desirable.

   8. The use of reference picture selection (RPS) as an error
      resilience tool was introduced in 1997 as NEWPRED [NEWPRED],
      and is now widely deployed. When RPS is in use, simplistically
      put, the receiver can send a feedback message to the sender,
      indicating a reference picture that should be used for future
      prediction. ([NEWPRED] mentions other forms of feedback as well.)
      AVPF contains a mechanism for conveying such a message, but does
      not specify the codec to which the message applies or the syntax
      to which it should conform. Recently, the ITU-T finalized Rec.
      H.271, which (among other message types) also includes a feedback
      message. It is expected that this feedback message will fairly
      quickly enjoy wide support. Therefore, a mechanism to convey
      feedback messages according to H.271 appears to be desirable.

3.2. Using the Media Path

   There are multiple reasons why we use the media path for the codec
   control messages.

   First, systems employing MCUs often separate the control and media
   processing parts. As these messages are intended for or generated by
   the media part rather than the signaling part of the MCU, having them
   on the media path avoids transmission across interfaces and
   unnecessary control traffic between signaling and processing. If the
   MCU is physically decomposed, the use of the media path avoids the
   need for media control protocol extensions (e.g. in MEGACO
   [RFC3525]).

   Second, the signaling path quite commonly contains several
   signaling entities, e.g. SIP proxies and application servers.
   Bypassing these signaling entities avoids delay for several
   reasons. Proxies have less stringent delay requirements than media
   processing and due to their complex and more generic nature may
   result in significant processing delay. The topological locations of
   the signaling entities are also commonly not optimized for minimal
   delay, but rather towards other architectural goals. Thus the
   signaling path can be significantly longer in both the geographical
   and the delay sense.

3.3. Using AVPF

   The AVPF feedback message framework [RFC4585] provides the
   appropriate framework to implement the new messages. AVPF implements
   rules controlling the timing of feedback messages to avoid congestion
   through network flooding by RTCP traffic. We re-use these rules by
   referencing AVPF.

   The signaling setup for AVPF allows each individual type of function
   to be configured or negotiated on an RTP session basis.

3.3.1. Reliability

   The use of RTCP messages implies that each message transfer is
   unreliable, unless the lower layer transport provides reliability.
   The different messages proposed in this specification have different
   requirements in terms of reliability. However, in all cases, the
   reaction to an (occasional) loss of a feedback message is specified.

3.4. Multicast

   The codec control messages might be used with multicast. The RTCP
   timing rules specified in [RFC3550] and [RFC4585] ensure that the
   messages do not cause overload of the RTCP connection. The use of
   multicast may result in the reception of messages with inconsistent
   semantics. The reaction to inconsistencies depends on the message
   type, and is discussed for each message type separately.

3.5. Feedback Messages

   This section describes the semantics of the different feedback
   messages and how they apply to the different use cases.

3.5.1. Full Intra Request Command

   A Full Intra Request (FIR) Command, when received by the designated
   media sender, requires that the media sender send a Decoder Refresh
   Point (see section 2.2) at the earliest opportunity. The evaluation
   of such opportunity includes the current encoder coding strategy and
   the current available network resources.

   FIR is also known as an "instantaneous decoder refresh request" or
   "video fast update request".

   Using a decoder refresh point implies refraining from using any
   picture sent prior to that point as a reference for the encoding
   process of any subsequent picture sent in the stream. For predictive
   media types that are not video, the analogue applies. For example,
   if in MPEG-4 systems scene updates are used, the decoder refresh
   point consists of the full representation of the scene and is not
   delta-coded relative to previous updates.

   Decoder refresh points, especially Intra or IDR pictures, are in
   general several times larger in size than predicted pictures. Thus,
   in scenarios in which the available bit rate is small, the use of a
   decoder refresh point implies a delay that is significantly longer
   than the typical picture duration.
568 Usage in multicast is possible; however aggregation of the commands 569 is recommended. A receiver that receives a request closely (within 2 570 times the longest Round Trip Time (RTT) known, plus any AVPF-induced 571 RTCP packet sending delays, if those are known) after sending a 572 decoder refresh point, should await a second request message to 573 ensure that the media receiver has not been served by the previously 574 delivered decoder refresh point. The reason for the specified delay 575 is to avoid sending unnecessary decoder refresh points. A session 576 participant may have sent its own request while another participant's 577 request was in-flight to them. Suppressing those requests that may 578 have been sent without knowledge about the other request avoids this 579 issue. 581 Using the FIR command to recover from errors is explicitly 582 disallowed, and instead the PLI message defined in AVPF [RFC4585] 583 should be used. The PLI message reports lost pictures and has been 584 included in AVPF for precisely that purpose. 586 Full Intra Request is applicable in use-cases 1 and 2. 588 3.5.1.1. Reliability 590 The FIR message results in the delivery of a decoder refresh point, 591 unless the message is lost. Decoder refresh points are easily 592 identifiable from the bit stream. Therefore, there is no need for 593 protocol-level notification, and a simple command repetition 594 mechanism is sufficient for ensuring the level of reliability 595 required. However, the potential use of repetition does require a 596 mechanism to prevent the recipient from responding to messages 597 already received and responded to. 599 To ensure the best possible reliability, a sender of FIR may repeat 600 the FIR request until the desired content has been received. The 601 repetition interval is determined by the RTCP timing rules applicable 602 to the session. 
Upon reception of a complete decoder refresh point 603 or the detection of an attempt to send a decoder refresh point (which 604 was damaged by packet loss), the repetition of the FIR must 605 stop. If another FIR is necessary, the request sequence number must 606 be increased. A FIR sender shall not have more than one FIR request 607 (different request sequence number) outstanding at any time per media 608 sender in the session. 610 The receiver of FIR (i.e. the media sender) behaves in complementary 611 fashion to ensure delivery of a decoder refresh point. If it 612 receives repetitions of the FIR more than 2*RTT after it has sent a 613 decoder refresh point, it shall send a new decoder refresh point. 614 Two round trip times allow time for the decoder refresh point to 615 arrive back at the requestor and for the end of repetitions of FIR to 616 reach and be detected by the media sender. 618 An RTP mixer that receives an FIR from a media receiver is 619 responsible for ensuring that a decoder refresh point is delivered to 620 the requesting receiver. It may be necessary for the mixer to 621 generate FIR commands. From a reliability perspective, the two legs 622 (FIR-requesting endpoint to mixer, and mixer to decoder refresh point 623 generating endpoint) are handled independently from each other. 625 3.5.2. Temporal Spatial Trade-off Request and Notification 627 The Temporal Spatial Trade-off Request (TSTR) instructs the video 628 encoder to change its trade-off between temporal and spatial 629 resolution. Index values from 0 to 31 indicate monotonically a 630 desire for higher frame rate. That is, a requester asking for an 631 index of 0 prefers a high quality and is willing to accept a low 632 frame rate, whereas a requester asking for 31 wishes a high frame 633 rate, potentially at the cost of low spatial quality. 635 In general the encoder reaction time may be significantly longer than 636 the typical picture duration. See use case 3 for an example.
The 637 encoder decides whether and to what extent the request results in a 638 change of the trade-off. It returns a Temporal Spatial Trade-off 639 Notification (TSTN) message to indicate the trade-off that it will 640 use henceforth. 642 TSTR and TSTN have been introduced primarily because it is believed 643 that control protocol mechanisms, e.g. a SIP re-INVITE, are too 644 heavyweight and too slow to allow for a reasonable user experience. 646 Consider, for example, a user interface where the remote user selects 647 the temporal/spatial trade-off with a slider (as is common in 648 state-of-the-art video conferencing systems). Immediate feedback 649 to any slider movement is required for a reasonable user experience. 650 A SIP re-INVITE [RFC3261] would require at least two round-trips more 651 (compared to the TSTR/TSTN mechanism) and may involve proxies and 652 other complex mechanisms. Even in a well-designed system, it could 653 take a second or so until the new trade-off is finally selected. 654 Furthermore, the use of RTCP solves the multicast use case very 655 efficiently. 657 The use of TSTR and TSTN in multipoint scenarios is a non-trivial 658 subject, and can be achieved in many implementation-specific ways. 659 Problems stem from the fact that TSTRs will typically arrive 660 unsynchronized, and may request different trade-off values for the 661 same stream and/or endpoint encoder. This memo does not specify a 662 translator, mixer or endpoint's reaction to the reception of a 663 suggested trade-off as conveyed in the TSTR. We only require the 664 receiver of a TSTR message to reply to it by sending a TSTN, carrying 665 the new trade-off chosen by its own criteria (which may or may not be 666 based on the trade-off conveyed by the TSTR). In other words, the 667 trade-off sent in TSTR is a non-binding recommendation, nothing more. 669 Four TSTR/TSTN scenarios need to be distinguished, based on the 670 topologies described in [Topologies].
The scenarios are described in 671 the following sub-clauses. 673 3.5.2.1. Point-to-Point 675 In this most trivial case (Topo-Point-to-Point), the media sender 676 typically adjusts its temporal/spatial trade-off based on the 677 requested value in TSTR, subject to its own capabilities. The TSTN 678 message conveys back the new trade-off value (which may be identical 679 to the old one if, for example, the sender is not capable of 680 adjusting its trade-off). 682 3.5.2.2. Point-to-Multipoint Using Multicast or Translators 684 RTCP Multicast is used either with media multicast according to Topo- 685 Multicast, or following RFC 3550's translator model according to 686 Topo-Translator. In these cases, unsynchronized TSTR messages from 687 different receivers may be received, possibly with different 688 requested trade-offs (because of different user preferences). This 689 memo does not specify how the media sender tunes its trade-off. 690 Possible strategies include selecting the mean or median of all 691 trade-off requests received, giving priority to certain participants, 692 or continuing to use the previously selected trade-off (e.g. when the 693 sender is not capable of adjusting it). Again, all TSTR messages 694 need to be acknowledged by TSTN, and the value conveyed back has to 695 reflect the decision made. 697 3.5.2.3. Point-to-Multipoint Using RTP Mixer 699 In this scenario (Topo-Mixer) the RTP mixer receives all TSTR 700 messages, and has the opportunity to act on them based on its own 701 criteria. In most cases, the mixer should form a "consensus" of 702 potentially conflicting TSTR messages arriving from different 703 participants, and initiate its own TSTR message(s) to the media 704 sender(s). As in the previous scenario, the strategy for forming 705 this "consensus" is up to the implementation, and can, for example, 706 encompass averaging the participants' request values, giving priority 707 to certain participants, or using session default values. 
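As one illustration of the strategies listed above, the following Python sketch forms a "consensus" from a set of TSTR index values by taking the median. It is not part of this memo: the function name, the handling of out-of-range values, and the choice of the lower median (so that the result is always a value some participant actually requested) are assumptions made for the example.

```python
from statistics import median_low

MIN_INDEX, MAX_INDEX = 0, 31  # TSTR index range given in section 3.5.2


def consensus_tradeoff(requested, current, can_adjust=True):
    """Pick the trade-off index to use and to report back in TSTN.

    requested  -- TSTR index values received from different participants
    current    -- trade-off index currently in use
    can_adjust -- False if the encoder cannot change its trade-off
    """
    # Ignore values outside the 0..31 range (hypothetical policy).
    valid = [r for r in requested if MIN_INDEX <= r <= MAX_INDEX]
    if not can_adjust or not valid:
        # Keep the previously selected trade-off; TSTN still reports it.
        return current
    # Lower median: always one of the values that was requested.
    return median_low(valid)
```

Whatever strategy is chosen, the value returned here is the one the TSTN would carry, so that every requester learns the decision that was made.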
709 Even if a mixer or translator performs transcoding, it is very 710 difficult to deliver media with the requested trade-off, unless the 711 content the mixer or translator receives is already close to that 712 trade-off. Thus if the mixer changes its trade-off, it needs to 713 request the media sender(s) to use the new value, by creating a TSTR 714 of its own. Upon reaching a decision on the used trade-off it 715 includes that value in the acknowledgement to the downstream 716 requestors. Only in cases where the original source has 717 substantially higher quality (and bit rate), is it likely that 718 transcoding alone can result in the requested trade-off. 720 3.5.2.4. Reliability 722 A request and reception acknowledgement mechanism is specified. The 723 Temporal Spatial Trade-off Notification (TSTN) message informs the 724 request-sender that its request has been received, and what trade-off 725 is used henceforth. This acknowledgment mechanism is desirable for 726 at least the following reasons: 728 o A change in the trade-off cannot be directly identified from the 729 media bit stream. 730 o User feedback cannot be implemented without knowing the chosen 731 trade-off value, according to the media sender's constraints. 732 o Repetitive sending of messages requesting an unimplementable trade- 733 off can be avoided. 735 3.5.3. H.271 Video Back Channel Message 736 ITU-T Rec. H.271 defines syntax, semantics, and suggested encoder 737 reaction to a video back channel message. The structure defined in 738 this memo is used to transparently convey such a message from media 739 receiver to media sender. In this memo, we refrain from an in-depth 740 discussion of the available code points within H.271 and refer to the 741 specification text [H.271] instead. 743 However, we note that some H.271 messages bear similarities with 744 native messages of AVPF and this memo. 
Furthermore, we note that 745 some H.271 messages are known to require caution in multicast 746 environments -- or are plainly not usable in multicast or multipoint 747 scenarios. Table 1 provides a brief, oversimplifying overview of the 748 messages currently defined in H.271, their roughly corresponding AVPF 749 or CCM messages (the latter as specified in this memo), and an 750 indication of our current knowledge of their multicast safety.
752 H.271 msg type        AVPF/CCM msg type   multicast-safe
753 ---------------------------------------------------------------------
754 0 (when used for
755   reference picture
756   selection)          AVPF RPSI           No (positive ACK of pictures)
757 1 picture loss        AVPF PLI            Yes
758 2 partial loss        AVPF SLI            Yes
759 3 one parameter CRC   N/A                 Yes (no required sender action)
760 4 all parameter CRC   N/A                 Yes (no required sender action)
761 5 refresh point       CCM FIR             Yes
763 Table 1: H.271 messages and their AVPF/CCM equivalents
765 Note: H.271 message type 0 is not a strict equivalent to 766 AVPF's Reference Picture Selection Indication (RPSI); it is an 767 indication of known-as-correct reference picture(s) at the 768 decoder. It does not command an encoder to use a defined 769 reference picture (the form of control information envisioned 770 to be carried in RPSI). However, it is believed and intended 771 that H.271 message type 0 will be used for the same purpose as 772 AVPF's RPSI -- although other use forms are also possible. 774 In response to the opaqueness of the H.271 messages especially with 775 respect to the multicast safety, the following guidelines MUST be 776 followed when an implementation wishes to employ the H.271 video back 777 channel message: 779 1. Implementations utilizing the H.271 feedback message MUST stay in 780 compliance with congestion control principles, as outlined in 781 section 5. 783 2.
An implementation SHOULD utilize the IETF-native messages as 784 defined in [RFC4585] and in this memo instead of similar messages 785 defined in [H.271]. Our current understanding of similar messages 786 is documented in Table 1 above. One good reason to divert from 787 the SHOULD statement above would be if it is clearly understood 788 that, for a given application and video compression standard, the 789 aforementioned "similarity" is not given, in contrast to what 790 the table indicates. 792 3. It has been observed that some of the H.271 code points currently 793 in existence are not multicast-safe. Therefore, the sensible 794 thing to do is not to use the H.271 feedback message type in 795 multicast environments. It MAY be used only when all the issues 796 mentioned later are fully understood by the implementer, and 797 properly taken into account by all endpoints. In all other cases, 798 the H.271 message type MUST NOT be used in conjunction with 799 multicast. 801 4. It has been observed that even in centralized multipoint 802 environments, where the mixer should theoretically be able to 803 resolve issues as documented below, the implementation of such a 804 mixer and cooperative endpoints is a very difficult and tedious 805 task. Therefore, H.271 messages MUST NOT be used in centralized 806 multipoint scenarios, unless all the issues mentioned below are 807 fully understood by the implementer, and properly taken into 808 account by both mixer and endpoints. 810 Issues to be taken into account when considering the use of H.271 in 811 multipoint environments: 813 1. Different state on different receivers. In many environments it 814 cannot be guaranteed that the decoder state of all media receivers 815 is identical at any given point in time. The most obvious reason 816 for such a possible misalignment of state is a loss that occurs on 817 the path to only one of many media receivers. 
However, there are 818 other not so obvious reasons, such as recent joins to the 819 multipoint conference (be it by joining the multicast group or 820 through additional mixer output). Different states can lead the 821 media receivers to issue potentially contradicting H.271 messages 822 (or one media receiver issuing an H.271 message that, when 823 observed by the media sender, is not helpful for the other media 824 receivers). A naive reaction of the media sender to these 825 contradicting messages can lead to unpredictable and annoying 826 results. 828 2. Combining messages from different media receivers in a media 829 sender is a non-trivial task. As reasons, we note that these 830 messages may be contradicting each other, and that their transport 831 is unreliable (there may well be other reasons). In case of many 832 H.271 messages (i.e. types 0, 2, 3, and 4), the algorithm for 833 combining must be aware both of the network/protocol environment 834 (i.e. with respect to congestion) and of the media codec employed, 835 as H.271 messages of a given type can have different semantics for 836 different media codecs. 838 3. The suppression of requests may need to go beyond the basic 839 mechanisms described in AVPF (which are driven exclusively by 840 timing and transport considerations on the protocol level). For 841 example, a receiver is often required to refrain from (or delay) 842 generating requests, based on information it receives from the 843 media stream. For instance, it makes no sense for a receiver to 844 issue a FIR when a transmission of an Intra/IDR picture is 845 ongoing. 847 4. When using the non-multicast-safe messages (e.g. H.271 type 0 848 positive ACK of received pictures/slices) in larger multicast 849 groups, the media receiver will likely be forced to delay or even 850 omit sending these messages. 
For the media sender this looks like 851 data has not been properly received (although it was received 852 properly), and a naively implemented media sender reacts to these 853 perceived problems where it should not. 855 3.5.3.1. Reliability 857 H.271 Video Back Channel messages do not require reliable 858 transmission, and confirmation of the reception of a message can be 859 derived from the forward video bit stream. Therefore, no specific 860 reception acknowledgement is specified. 862 With respect to re-sending rules, clause 3.5.1.1. applies. 864 3.5.4. Temporary Maximum Media Stream Bit Rate Request and Notification 866 A receiver, translator or mixer uses the Temporary Maximum Media 867 Stream Bit Rate Request (TMMBR, "timber") to request a sender to 868 limit the maximum bit rate for a media stream (see 2.2) to, or below, 869 the provided value. The Temporary Maximum Media Stream Bit Rate 870 Notification (TMMBN) contains the media sender's current view of the 871 most limiting subset of the TMMBR-defined limits it has received, to 872 help the participants to suppress TMMBR requests that would not 873 further restrict the media sender. The primary usage for the 874 TMMBR/TMMBN messages is in a scenario with an MCU or mixer (use case 875 6), corresponding to Topo-Translator or Topo-Mixer, but also to 876 Topo-Point-to-Point. 878 Each temporary limitation on the media stream is expressed as a 879 tuple. The first component of the tuple is the maximum total media 880 bit rate (as defined in section 2.2) that the media receiver is 881 currently prepared to accept for this media stream. The second 882 component is the per-packet overhead that the media receiver has 883 observed for this media stream at its chosen reference protocol 884 layer. 886 As indicated in section 2.2, the overhead as observed by the sender 887 of the TMMBR (i.e. the media receiver) may differ from the overhead 888 observed at the receiver of the TMMBR (i.e. 
the media sender) due to 889 use of a different reference protocol layer at the other end or due 890 to the intervention of translators or mixers that affect the amount 891 of per-packet overhead. For example, a gateway between the two 892 that converts between IPv4 and IPv6 changes the per-packet overhead 893 by 20 bytes. Other mechanisms that change the overhead include 894 tunnels. The problem with varying overhead is also discussed in 895 [RFC3890]. As will be seen in the description of the algorithm for 896 use of TMMBR, the difference in perceived overhead between the 897 sending and receiving ends presents no difficulty because 898 calculations are carried out in terms of variables (packet rate, net 899 media bit rate) that have the same value at the sender as at the 900 receiver. 902 Reporting both maximum total media bit rate and per-packet overhead 903 allows different receivers to provide bit rate and overhead values 904 for different protocol layers, for example at the IP level, at the 905 outer part of a tunnel protocol, or at the link layer. The protocol 906 level a peer reports on depends on the level of integration the peer 907 has, as it needs to be able to extract the information from that 908 protocol level. For example, an application with no knowledge of the 909 IP version it is running over cannot meaningfully determine the 910 overhead of the IP header, and hence will not want to include IP 911 overhead in the overhead or maximum total media bit rate calculation. 913 It is expected that most peers will be able to report values at least 914 for the IP layer. In certain implementations it may be advantageous 915 to also include information pertaining to the link layer, which in 916 turn allows for a more precise overhead calculation and a better 917 optimization of connectivity resources. 919 The Temporary Maximum Media Stream Bit Rate messages are generic 920 messages that can be applied to any RTP packet stream.
This 921 separates them from the other codec control messages defined in this 922 specification, which apply only to specific media types or payload 923 formats. The TMMBR functionality applies to the transport, and the 924 requirements the transport places on the media encoding. 926 The reasoning below assumes that the participants have negotiated a 927 session maximum bit rate, using a signaling protocol. This value can 928 be global, for example in case of point-to-point, multicast, or 929 translators. It may also be local between the participant and the 930 peer or mixer. In either case, the bit rate negotiated in signaling 931 is the one that the participant guarantees to be able to handle 932 (depacketize and decode). In practice, the connectivity of the 933 participant also influences the negotiated value -- it does not make 934 much sense to negotiate a total media bit rate that one's network 935 interface does not support. 937 It is also beneficial to have negotiated a maximum packet rate for 938 the session or sender. RFC 3890 provides an SDP [RFC4566] attribute 939 that can be used for this purpose; however, that attribute is not 940 usable in RTP sessions established using offer/answer [RFC3264]. 941 Therefore an optional maximum packet rate signaling parameter is 942 specified in this memo. 944 An already established maximum total media bit rate may be changed at 945 any time, subject to the timing rules governing the sending of 946 feedback messages. The limit may change to any value between zero and 947 the session maximum, as negotiated during session establishment 948 signaling. However, even if a sender has received a TMMBR message 949 allowing an increase in the bit rate, all increases must be governed 950 by a congestion control mechanism. TMMBR indicates known limitations 951 only, usually in the local environment, and does not provide any 952 guarantees about the full path. 
Furthermore, any increases in TMMBR- 953 established bit rate limits are to be executed only after a certain 954 delay from the sending of the TMMBN message that notifies the world 955 about the increase in limit. The delay is specified as at least 956 twice the longest RTT as known by the media sender, plus the media 957 sender's calculation of the required wait time for the sending of 958 another TMMBR message for this session based on AVPF timing rules. 959 This delay is introduced to allow other session participants to make 960 known their bit rate limit requirements, which may be lower. 962 If it is likely that the new value indicated by TMMBR will be valid 963 for the remainder of the session, the TMMBR sender is expected to 964 perform a renegotiation of the session upper limit using the session 965 signaling protocol. 967 3.5.4.1. Behavior for media receivers using TMMBR 969 This section is an informal description of behavior described more 970 precisely in section 4.2. 972 A media sender begins the session limited by the maximum media bit 973 rate and maximum packet rate negotiated in session signaling, if any. 975 Note that this value may be negotiated for a different protocol layer 976 than the one the participant uses in its TMMBR messages. Each media 977 receiver selects a reference protocol layer, forms an estimate of the 978 overhead it is observing (or estimates it if no packets have been 979 seen yet) at that reference level, and determines the maximum total 980 media bit rate it can accept, taking into account its own limitations 981 and any transport path limitations of which it may be aware. If 982 the current limitations are more restrictive than what was agreed on 983 in the session signaling, the media receiver reports its initial 984 estimate of these two quantities to the media sender using a TMMBR 985 message.
Overall message traffic is reduced by the possibility of 986 including tuples for multiple media senders in the same TMMBR 987 message. 989 The media sender applies an algorithm such as that specified in 990 section 3.5.4.2 to select which of the tuples it has received are 991 most limiting (i.e. the bounding set as defined in section 2.2). It 992 modifies its operation to stay within the feasible region (as defined 993 in section 2.2), and also sends out a TMMBN notification to the media 994 receivers indicating the selected bounding set. 996 If a media receiver does not own one of the tuples in the bounding 997 set reported by the TMMBN, it applies the same algorithm as the media 998 sender to determine if its current estimated (maximum total media bit 999 rate, overhead) tuple would enter the bounding set if known to the 1000 media sender. If so, it issues a TMMBR request reporting the tuple 1001 value to the sender. Otherwise it takes no action for the moment. 1002 Periodically, its estimated tuple values may change or it may receive 1003 a new TMMBN. If so, it reapplies the algorithm to decide whether it 1004 needs to issue a TMMBR request. 1006 If, alternatively, a media receiver owns one of the tuples in the 1007 reported bounding set, it takes no action until such time as its 1008 estimate of its own tuple values changes. At that time it sends a 1009 TMMBR request to the media sender to report the changed values. 1011 A media receiver may change status between owner and non-owner of a 1012 bounding tuple between one TMMBN message and the next. Thus it must 1013 check the contents of each TMMBN to determine its subsequent actions. 1015 Implementations may use other algorithms of their choosing, as long 1016 as the bit rate limitations resulting from the exchange of TMMBR and 1017 TMMBN messages are at least as strict (at least as low, in the bit 1018 rate dimension) as the ones resulting from the use of the 1019 aforementioned algorithm. 
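The receiver-side rules described in this section can be summarised as a small decision function. The Python sketch below is illustrative only: the function name and its three boolean inputs are hypothetical, and `would_enter_bounding_set` stands for the outcome of applying the bounding-set algorithm to the receiver's own tuple, which is not implemented here.

```python
def receiver_should_send_tmmbr(owns_bounding_tuple,
                               own_estimate_changed,
                               would_enter_bounding_set):
    """Decide, after a TMMBN or a change of estimate, whether to send TMMBR.

    owns_bounding_tuple      -- the latest TMMBN lists one of our tuples
    own_estimate_changed     -- our (bit rate, overhead) estimate changed
    would_enter_bounding_set -- our current tuple would join the bounding
                                set if the media sender knew about it
    """
    if owns_bounding_tuple:
        # An owner stays silent until its own estimate changes,
        # then reports the changed values.
        return own_estimate_changed
    # A non-owner only speaks up if its tuple would further restrict
    # the media sender; otherwise it takes no action for the moment.
    return would_enter_bounding_set
```

Because ownership can change from one TMMBN to the next, such a function would be re-evaluated on every TMMBN received and whenever the local estimate changes.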
1021 Obviously, in point-to-point cases, when there is only one media 1022 receiver, this receiver becomes "owner" once it receives the first 1023 TMMBN in response to its own TMMBR, and stays "owner" for the rest of 1024 the session. Therefore, when it is known that there will always be 1025 only a single media receiver, the above algorithm is not required. 1026 Media receivers that are aware they are the only ones in a session 1027 can send TMMBR messages with bit rate limits both higher and lower 1028 than the previously notified limit, at any time (subject to the AVPF 1029 [RFC4585] RTCP RR send timing rules). However, it may be difficult 1030 for a session participant to determine if it is the only receiver in 1031 the session. Because of this any implementation of TMMBR is required 1032 to include the algorithm described in the next section or a stricter 1033 equivalent. 1035 3.5.4.2. Algorithm for establishing current limitations 1037 This section introduces an example algorithm for the calculation of a 1038 session limit. Other algorithms can be employed, as long as the 1039 result of the calculation is at least as restrictive as the result 1040 that is obtained by this algorithm. 1042 First it is important to consider the implications of using a tuple 1043 for limiting the media sender's behavior. The bit rate and the 1044 overhead value result in a two-dimensional solution space for the 1045 calculation of the bit rate of media streams. Fortunately the two 1046 variables are linked. Specifically, the bit rate available for RTP 1047 payloads is equal to the TMMBR reported bit rate minus the packet 1048 rate used, multiplied by the TMMBR reported overhead converted to 1049 bits. As a result, when different bit rate/overhead combinations 1050 need to be considered, the packet rate determines the correct 1051 limitation. 
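The relation between the two tuple components can be written out as a short Python sketch. The function and parameter names are illustrative and not taken from this memo; bit rates are assumed to be in bits per second, overhead in bytes per packet, and packet rate in packets per second.

```python
def net_media_bit_rate(max_total_bps, overhead_bytes, packet_rate):
    """Bit rate left for RTP payloads under one TMMBR tuple.

    Reported total bit rate minus the packet rate multiplied by the
    reported per-packet overhead converted from bytes to bits.
    """
    return max_total_bps - packet_rate * overhead_bytes * 8


def limiting_net_bit_rate(tuples, packet_rate):
    """Tightest payload bit rate over several (bit rate, overhead) tuples.

    The tuple that yields the minimum depends on the packet rate, which
    is why the packet rate determines the applicable limitation.
    """
    return min(net_media_bit_rate(br, oh, packet_rate) for br, oh in tuples)
```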
This is perhaps best explained by an example:
1053 Example:
1055 Receiver A: TMMBR_max total BR = 35 kbps, TMMBR_OH = 40 bytes
1056 Receiver B: TMMBR_max total BR = 40 kbps, TMMBR_OH = 60 bytes
1058 For a given packet rate (PR) the bit rate available for media 1059 payloads in RTP will be:
1061 Max_net media_BR_A = TMMBR_max total BR_A - PR * TMMBR_OH_A * 8 ... 1062 (1)
1063 Max_net media_BR_B = TMMBR_max total BR_B - PR * TMMBR_OH_B * 8 ... 1064 (2)
1066 For PR = 20 packets per second these calculations will yield Max_net media_BR_A = 1067 28600 bps and Max_net media_BR_B = 30400 bps, which suggests that 1068 receiver A is the limiting one for this packet rate. However, at a 1069 certain PR there is a switchover point at which receiver B becomes 1070 the limiting one. The switchover point can be identified by setting 1071 Max_net media_BR_A equal to Max_net media_BR_B and solving for PR:
1073        TMMBR_max total BR_A - TMMBR_max total BR_B
1074 PR = ------------------------------------------- ... (3)
1075             8*(TMMBR_OH_A - TMMBR_OH_B)
1077 which, for the numbers above, yields 31.25 as the switchover point 1078 between the two limits. That is, for packet rates below 31.25 per 1079 second, receiver A is the limiting receiver, and for higher packet 1080 rates, receiver B is more limiting. The implications of this 1081 behavior have to be considered by implementations that are going to 1082 control media encoding and its packetization. As exemplified above, 1083 multiple TMMBR limits may apply to the trade-off between net media 1084 bit rate and packet rate. Which limitation applies depends on the 1085 packet rate being considered. 1087 This also has implications for how the TMMBR mechanism needs to work. 1088 First, there is the possibility that multiple TMMBR tuples are 1089 providing limitations on the media sender.
Secondly, there is a need 1090 for any session participant (media sender and receivers) to be able 1091 to determine if a given tuple will become a limitation upon the media 1092 sender, or if the set of already given limitations is stricter than 1093 the given values. Without the ability to make this 1094 determination, the suppression of TMMBR requests would not work. 1096 The basic idea of the algorithm is as follows. Each TMMBR tuple can 1097 be viewed as the equation of a straight line (cf. equations (1) and 1098 (2)) in a space where packet rate lies along the X-axis and net media 1099 bit rate lies along the Y-axis. The lower envelope of the set of 1100 lines corresponding to the complete set of TMMBR tuples defines a 1101 polygon. Points lying along or below this polygon are combinations of 1102 packet rate and bit rate that meet all of the TMMBR constraints. The 1103 highest feasible packet rate within this region is the minimum of the 1104 rate at which the bounding polygon meets the X-axis and the session 1105 maximum packet rate (SMAXPR) provided by signaling, if any. Typically 1106 a media sender will prefer to operate at a lower rate than this 1107 theoretical maximum, so as to increase the rate at which actual media 1108 content reaches the receivers. The purpose of the algorithm is to 1109 distinguish the TMMBR tuples constituting the bounding set and thus 1110 delineate the feasible region, so that the media sender can select 1111 its preferred operating point within that region. 1113 Figure 1 below shows a bounding polygon formed by TMMBR tuples A and 1114 B. A third tuple C lies outside the bounding polygon and is therefore 1115 irrelevant in determining feasible trade-offs between media rate and 1116 packet rate. The line labeled ss..s represents the limit on packet 1117 rate imposed by the session maximum packet rate (SMAXPR) obtained by 1118 signaling during session setup.
In Figure 1 the limit determined by 1119 tuple B happens to be more restrictive than SMAXPR. The situation 1120 could easily be the reverse, meaning that the bounding polygon is 1121 terminated on the right by the vertical line representing the SMAXPR 1122 constraint.
1124 Net ^
1125 Media|a c b s
1126 Bit | a c b s
1127 Rate | a c b s
1128 | a cb s
1129 | a c s
1130 | a bc s
1131 | a b c s
1132 | ab c s
1133 | Feasible b c s
1134 | region ba s
1135 | b a s c
1136 | b s c
1137 | b s a
1138 | bs
1139 +------------------------------>
1141 Packet rate
1143 Figure 1 - Geometric Interpretation of TMMBR Tuples
1145 Note that the slopes of the lines making up the bounding polygon are 1146 increasingly negative as one moves in the direction of increasing 1147 packet rate. Note also that with slight rearrangement, equations (1) 1148 and (2) have the canonical form:
1150 y = mx + b
1152 where
1153 m is the slope and has value equal to the negative of the tuple 1154 overhead (in bits),
1155 and
1156 b is the y-intercept and has value equal to the tuple maximum total 1157 media bit rate.
1159 These observations lead to the conclusion that when processing the 1160 TMMBR tuples to select the initial bounding set, one should sort and 1161 process the tuples by order of increasing overhead. Once a particular 1162 tuple has been added to the bounding set, all tuples not already 1163 selected and having lower overhead can be eliminated, because the 1164 next side of the bounding polygon has to be steeper (i.e. the 1165 corresponding TMMBR must have higher overhead) than the latest added 1166 tuple. 1168 Line cc..c in Figure 1 illustrates another principle. This line is 1169 parallel to line aa..a, but has a higher Y-intercept. That is, the 1170 corresponding TMMBR tuple contains a higher maximum total media bit 1171 rate value.
   Since line cc..c is outside the bounding polygon, it
   illustrates the conclusion that if two TMMBR tuples have the same
   overhead value, the one with higher maximum total media bit rate
   value cannot be part of the bounding set and can be set aside.

   Two further observations complete the algorithm. Obviously, moving
   from the left, the successive corners of the bounding polygon (i.e.
   the intersection points between successive pairs of sides) lie at
   successively higher packet rates. On the other hand, again moving
   from the left, each successive line making up the bounding set
   crosses the X-axis at a lower packet rate.

   The complete algorithm can now be specified. The algorithm works with
   two lists of TMMBR tuples, the candidate list X and the selected
   list Y, both ordered by increasing overhead value. The algorithm
   terminates when all members of X have been discarded or removed for
   processing. Membership of the selected list Y is probationary until
   the algorithm is complete. Each member of the selected list is
   associated with an intersection value, which is the packet rate at
   which the line corresponding to that TMMBR tuple intersects with the
   line corresponding to the previous TMMBR tuple in the selected list.
   Each member of the selected list is also associated with a maximum
   packet rate value, which is the lesser of the session maximum packet
   rate SMAXPR (if any) and the packet rate at which the line
   corresponding to that tuple crosses the X-axis.

   When the algorithm terminates, the selected list is equal to the
   bounding set as defined in section 2.2.

   Initial Algorithm

   This algorithm is used by the media sender when it has received one
   or more TMMBR requests and before it has determined a bounding set
   for the first time.

   1. Sort the TMMBR tuples by order of increasing overhead. This is
      the initial candidate list X.

   2.
      When multiple tuples in the candidate list have the same overhead
      value, discard all but the one with the lowest maximum total media
      bit rate value.

   3. Select and remove from the candidate list the TMMBR tuple with the
      lowest maximum total media bit rate value. If there is more than
      one tuple with that value, choose the one with the highest
      overhead value. This is the first member of the selected list Y.
      Set its intersection value equal to zero. Calculate its maximum
      packet rate as the minimum of SMAXPR (if available) and the value
      obtained from the following formula, which is the packet rate at
      which the corresponding line crosses the X-axis.

         Max PR = TMMBR max total BR / (8 * TMMBR OH) ... (4)

   4. Discard from the candidate list all tuples with a lower overhead
      value than the selected tuple.

   5. Remove the first remaining tuple from the candidate list for
      processing. Call this the current candidate.

   6. Calculate the packet rate PR at the intersection of the line
      generated by the current candidate with the line generated by the
      last tuple in the selected list Y, using equation (3).

   7. If the calculated value PR is equal to or lower than the
      intersection value stored for the last tuple of the selected list,
      discard the last tuple of the selected list and go back to step 6
      (retaining the same current candidate).

      Note that the choice of the initial member of the selected list Y
      in step 3 guarantees that the selected list will never be emptied
      by this process, meaning that the algorithm must eventually (if
      not immediately) fall through to step 8.

   8. (This step is reached when the calculated PR value of the current
      candidate is greater than the intersection value of the current
      last member of the selected list Y.)
      If the calculated value PR
      of the current candidate is lower than the maximum packet rate
      associated with the last tuple in the selected list, add the
      current candidate tuple to the end of the selected list. Store PR
      as its intersection value. Calculate its maximum packet rate as
      the lesser of SMAXPR (if available) and the maximum packet rate
      calculated using equation (4).

   9. If any tuples remain in the candidate list, go back to step 5.

   Incremental Algorithm

   The previous algorithm covered the initial case, where no selected
   list had previously been created. It also applied only to the media
   sender. When a previously-created selected list is available at
   either the media sender or media receiver, two other cases can be
   considered:

   o  when a TMMBR tuple not currently in the selected list is a
      candidate for addition;

   o  when the values change in a TMMBR tuple currently in the
      selected list.

   At the media receiver these cases correspond respectively to those of
   the non-owner and owner of a tuple in the TMMBN-reported bounding
   set.

   In either case, the process of updating the selected list to take
   account of the new/changed tuple can use the basic algorithm
   described above, with the modification that the initial candidate set
   consists only of the existing selected list and the new or changed
   tuple. Some further optimization is possible (beyond starting with a
   reduced candidate set) by taking advantage of the following
   observations.

   The first observation is that if the new/changed candidate becomes
   part of the new selected list, the result may be to cause zero or
   more other tuples to be dropped from the list. However, if more than
   one other tuple is dropped, the dropped tuples will be consecutive.
   This can be confirmed geometrically by visualizing a new line that
   cuts off a series of segments from the previously-existing bounding
   polygon. The cut-off segments are connected one to the next, the
   geometric equivalent of consecutive tuples in a list ordered by
   overhead value. Beyond the dropped set in either direction all of
   the tuples that were in the earlier selected list will be in the
   updated one. The second observation is that, leaving aside the new
   candidate, the order of tuples remaining in the updated selected list
   is unchanged because their overhead values have not changed.

   The consequence of these two observations is that, once the placement
   of the new candidate and the extent of the dropped set of tuples (if
   any) have been determined, the remaining tuples can be copied
   directly from the candidate list into the selected list, preserving
   their order. This conclusion suggests the following modified
   algorithm:

   o  Run steps 1-4 of the basic algorithm.

   o  If the new candidate has survived steps 2 and 4 and has become
      the new first member of the selected list, run steps 5-9 on
      subsequent candidates until another candidate is added to the
      selected list. Then move all remaining candidates to the
      selected list, preserving their order.

   o  If the new candidate has survived steps 2 and 4 and has not
      become the new first member of the selected list, start by
      moving all tuples in the candidate list with lower overhead
      values than that of the new candidate to the selected list,
      preserving their order. Run steps 5 through 9 for the new
      candidate, with the modification that the intersection values
      and maximum packet rates for the tuples on the selected list
      have to be calculated on the fly because they were not
      previously stored.
      Continue processing only until a subsequent
      tuple has been added to the selected list, then move all
      remaining candidates to the selected list, preserving their
      order.

         Note that the new candidate could be added to the selected
         list only to be dropped again when the next tuple is
         processed. It can easily be seen that in this case the new
         candidate does not displace any of the earlier tuples in the
         selected list. The limitations of ASCII art make this
         difficult to show in a figure. Line cc..c in Figure 1 would
         be an example if it had a steeper slope (tuple C had a higher
         overhead value), but still intersected line aa..a beyond
         where line aa..a intersects line bb..b.

   The algorithm just described is approximate, because it does not take
   account of tuples outside the selected list. To see how such tuples
   can become relevant, consider Figure 1 and suppose that the maximum
   total media bit rate in tuple A increases to the point that line
   aa..a moves outside line cc..c. Tuple A will remain in the bounding
   set calculated by the media sender. However, once it issues a new
   TMMBN, media receiver C will apply the algorithm and discover that
   its tuple C should now enter the bounding set. It will issue a TMMBR
   request to the media sender, which will repeat its calculation and
   come to the appropriate conclusion.

   The rules of section 4.2 require that the media sender refrain from
   raising its sending rate until media receivers have had a chance to
   respond to the TMMBN. In the example just given, this delay ensures
   that the relaxation of tuple A does not actually result in an attempt
   to send media at a rate exceeding the capacity at C.

3.5.4.3. Use of TMMBR in a Mixer Based Multipoint Operation

   Assume a small mixer-based multiparty conference is ongoing, as
   depicted in Topo-Mixer of [Topologies].
   All participants have
   negotiated a common maximum bit rate that this session can use. The
   conference operates over a number of unicast paths between the
   participants and the mixer. The congestion situation on each of
   these paths can be monitored by the participant in question and by
   the mixer, utilizing, for example, RTCP receiver reports (RR) or the
   transport protocol, e.g. DCCP [RFC4340]. However, any given
   participant has no knowledge of the congestion situation of the
   connections to the other participants. Worse, without mechanisms
   similar to the ones discussed in this draft, the mixer (which is
   aware of the congestion situation on all connections it manages) has
   no standardized means to inform media senders to slow down, short of
   forging its own receiver reports (which is undesirable). In
   principle, a mixer confronted with such a situation is obliged to
   thin or transcode streams intended for connections that detected
   congestion.

   In practice, media-aware stream thinning is unfortunately a very
   difficult and cumbersome operation and adds undesirable delay. If
   media-unaware, it leads very quickly to unacceptable reproduced media
   quality. Hence, a means to slow down senders even in the absence of
   congestion on their connections to the mixer is desirable.

   To allow the mixer to throttle traffic on the individual links,
   without performing transcoding, there is a need for a mechanism that
   enables the mixer to ask a participant's media encoders to limit the
   media stream bit rate they are currently generating. TMMBR provides
   the required mechanism. When the mixer detects congestion between
   itself and a given participant, it executes the following procedure:

   1. It starts thinning the media traffic to the congested participant
      to the supported bit rate.

   2.
      It uses TMMBR to request the media sender(s) to reduce the total
      media bit rate sent by them to the mixer, to a value that is in
      compliance with congestion control principles for the slowest
      link. Slow refers here to the available bandwidth / bit rate /
      capacity and packet rate after congestion control.

   3. As soon as the bit rate has been reduced by the sending party, the
      mixer stops stream thinning implicitly, because there is no need
      for it once the stream is in compliance with congestion control.

   This use of stream thinning as an immediate reaction tool followed up
   by a quick control mechanism appears to be a reasonable compromise
   between media quality and the need to combat congestion.

3.5.4.4. Use of TMMBR in Point-to-Multipoint Using Multicast or
         Translators

   In these topologies, corresponding to Topo-Multicast or
   Topo-Translator, RTCP RRs are transmitted globally. This allows all
   participants to detect transmission problems such as congestion, on a
   medium timescale. As all media senders are aware of the congestion
   situation of all media receivers, the rationale for the use of TMMBR
   in the previous section does not apply. However, even in this case
   the congestion control response can be improved when the unicast
   links are using congestion controlled transport protocols (such as
   TCP or DCCP). A peer may also report local limitations to the media
   sender.

3.5.4.5. Use of TMMBR in Point-to-point operation

   In use case 7 it is possible to use TMMBR to improve the performance
   when the known upper limit of the bit rate changes. In this use case
   the signaling protocol has established an upper limit for the session
   and total media bit rates. However, at the time of transport link
   bit rate reduction, a receiver can avoid serious congestion by
   sending a TMMBR to the sending side.
   Thus TMMBR is useful for
   putting restrictions on the application and thus placing the
   congestion control mechanism in the right ballpark. However, TMMBR is
   usually unable to provide the continuously quick feedback loop
   required for real congestion control. Nor do its semantics match
   those of congestion control given its different purpose. For these
   reasons TMMBR SHALL NOT be used as a substitute for congestion
   control.

3.5.4.6. Reliability

   The reaction of a media sender to the reception of a TMMBR message is
   not immediately identifiable through inspection of the media stream.
   Therefore, a more explicit mechanism is needed to avoid unnecessary
   re-sending of TMMBR messages. Using a statistically based
   retransmission scheme would only provide statistical guarantees of
   the request being received. It would also not avoid the
   retransmission of already received messages. In addition, it would
   not allow for easy suppression of other participants' requests. For
   these reasons, a mechanism based on explicit notification is used.

   Upon the reception of a request a media sender sends a TMMBN
   notification containing the current bounding set, and indicating
   which session participants own that limit. In multicast scenarios,
   that allows all other participants to suppress any request they may
   have, if their limitations are less strict than the current ones
   (i.e. define lines lying outside the feasible region as defined in
   section 2.2). Keeping and notifying only the bounding set of tuples
   allows for small message sizes and media sender states. A media
   sender only keeps state for the SSRCs of the current owners of the
   bounding set of tuples; all other requests and their sources are not
   saved.
   Once the bounding set has been established, new TMMBR
   messages should be generated only by owners of the bounding tuples
   and by other entities that determine (by applying the algorithm of
   section 3.5.4.2 or its equivalent) that their limitations should now
   be part of the bounding set.

4. RTCP Receiver Report Extensions

   This memo specifies six new feedback messages. The Full Intra
   Request (FIR), Temporal-Spatial Trade-off Request (TSTR),
   Temporal-Spatial Trade-off Notification (TSTN), and Video Back
   Channel Message (VBCM) are "Payload Specific Feedback Messages" as
   defined in Section 6.3 of AVPF [RFC4585]. The Temporary Maximum Media
   Stream Bit Rate Request (TMMBR) and Temporary Maximum Media Stream
   Bit Rate Notification (TMMBN) are "Transport Layer Feedback Messages"
   as defined in Section 6.2 of AVPF.

   The new feedback messages are defined in the following subsections,
   following a similar structure to that in sections 6.2 and 6.3 of the
   AVPF specification [RFC4585].

4.1. Design Principles of the Extension Mechanism

   RTCP was originally introduced as a channel to convey presence,
   reception quality statistics and hints on the desired media coding.
   A limited set of media control mechanisms were introduced in early
   RTP payload formats for video formats, for example in RFC 4587
   [RFC4587]. However, this specification, for the first time, suggests
   a two-way handshake for some of its messages. There is a danger that
   this introduction could be misunderstood as a precedent for the use
   of RTCP as an RTP session control protocol. To prevent such a
   misunderstanding, this subsection attempts to clarify the scope of
   the extensions specified in this memo, and strongly suggests that
   future extensions follow the rationale spelled out here, or
   compellingly explain why they deviate from the rationale.

   In this memo, and in AVPF [RFC4585], only such messages have been
   included as:

   a) have comparatively strict real-time constraints, which prevent the
      use of mechanisms such as a SIP re-invite in most application
      scenarios. The real-time constraints are explained separately for
      each message where necessary.

   b) are multicast-safe in that the reaction to potentially
      contradicting feedback messages is specified, as necessary for
      each message; and

   c) are directly related to activities of a certain media codec, class
      of media codecs (e.g. video codecs), or a given RTP packet stream.

   In this memo, a two-way handshake is introduced only for messages for
   which:

   a) a notification or acknowledgement is required due to their nature.
      An analysis to determine whether this requirement exists has been
      performed separately for each message.

   b) the notification or acknowledgement cannot be easily derived from
      the media bit stream.

   All messages in AVPF [RFC4585] and in this memo present their
   contents in a simple, fixed binary format. This accommodates media
   receivers which have not implemented higher control protocol
   functionalities (SDP, XML parsers and such) in their media path.

4.2. Transport Layer Feedback Messages

   As specified in section 6.1 of RFC 4585 [RFC4585], Transport Layer
   Feedback messages are identified by the RTCP packet type value RTPFB
   (205).

   In AVPF, one message of this category had been defined. This memo
   specifies two more such messages.
   They are identified by means of
   the FMT parameter as follows:

   Assigned in AVPF [RFC4585]:

      1:    Generic NACK
      31:   reserved for future expansion of the identifier number space

   Assigned in this memo:

      2:    reserved (see note below)
      3:    Temporary Maximum Media Stream Bit Rate Request (TMMBR)
      4:    Temporary Maximum Media Stream Bit Rate Notification (TMMBN)

         Note: early drafts of AVPF [RFC4585] reserved FMT=2 for a code
         point that has later been removed. It has been pointed out
         that there may be implementations in the field using this
         value in accordance with the expired draft. As there is
         sufficient numbering space available, we mark FMT=2 as
         reserved so as to avoid possible interoperability problems with
         any such early implementations.

   Available for assignment:

      0:    unassigned
      5-30: unassigned

   The following subsections define the formats of the FCI entries for
   the TMMBR and TMMBN messages respectively and specify the associated
   behaviour at the media sender and receiver.

4.2.1. Temporary Maximum Media Stream Bit Rate Request (TMMBR)

   The FCI field of a Temporary Maximum Media Stream Bit-Rate Request
   (TMMBR) message SHALL contain one or more FCI entries.

4.2.1.1. Message Format

   The Feedback Control Information (FCI) consists of one or more TMMBR
   FCI entries with the following syntax:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                              SSRC                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | MxTBR Exp |  MxTBR Mantissa                 |Measured Overhead|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Figure 2 - Syntax of an FCI entry in the TMMBR message

   SSRC (32 bits): The SSRC value of the media sender that is
         requested to obey the new maximum bit rate.

   MxTBR Exp (6 bits): The exponential scaling of the mantissa for the
         maximum total media bit rate value. The value is an
         unsigned integer [0..63].

   MxTBR Mantissa (17 bits): The mantissa of the maximum total media
         bit rate value as an unsigned integer.

   Measured Overhead (9 bits): The measured average packet overhead
         value in bytes. The measurement SHALL be done according
         to the description in section 4.2.1.2. The value is an
         unsigned integer [0..511].

   The maximum total media bit rate (MxTBR) value in bits per second is
   calculated from the MxTBR exponent (exp) and mantissa in the
   following way:

      MxTBR = mantissa * 2^exp

   This allows for 17 bits of resolution in the range 0 to 131072*2^63
   (approximately 1.2*10^24).

   The length of the TMMBR feedback message SHALL be set to 2+2*N where
   N is the number of TMMBR FCI entries.

4.2.1.2. Semantics

   Behaviour at the Media Receiver (Sender of the TMMBR)

   TMMBR is used to indicate a transport related limitation at the
   reporting entity acting as a media receiver. TMMBR has the form of a
   tuple containing two components. The first value is the highest bit
   rate per sender of a media stream, available at a receiver-chosen
   protocol layer, which the receiver currently supports in this RTP
   session. The second value is the measured header overhead in bytes
   as defined in section 2.2 and measured at the chosen protocol layer
   in the packets received for the stream. The measurement of the
   overhead is a running average that is updated for each packet
   received for this particular media source (SSRC), using the following
   formula:

      avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH,

   where avg_OH is the running (exponentially smoothed) average and
   pckt_OH is the overhead observed in the latest packet.
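
   The field layout and the two formulas above can be illustrated with
   a minimal Python sketch. It is not part of the specification. The
   function names (encode_mxtbr, pack_fci, unpack_fci, update_avg_oh)
   are hypothetical, and the choice to round the mantissa down, so that
   the encoded limit never exceeds the requested one, is an assumption
   rather than something this memo mandates.

```python
import struct
from typing import Tuple


def encode_mxtbr(bitrate_bps: int) -> Tuple[int, int]:
    """Split a bit rate into (exp, mantissa) so that mantissa fits in 17 bits."""
    if bitrate_bps < 0:
        raise ValueError("bit rate must be non-negative")
    exp, mantissa = 0, int(bitrate_bps)
    while mantissa > 0x1FFFF:      # larger than a 17-bit unsigned value
        mantissa >>= 1             # assumption: round down, never up
        exp += 1
    if exp > 63:
        raise ValueError("bit rate too large for a 6-bit exponent")
    return exp, mantissa


def pack_fci(ssrc: int, bitrate_bps: int, overhead: int) -> bytes:
    """Pack one 8-byte FCI entry: SSRC, then exp(6) | mantissa(17) | overhead(9)."""
    if not 0 <= overhead <= 0x1FF:
        raise ValueError("overhead must fit in 9 bits")
    exp, mantissa = encode_mxtbr(bitrate_bps)
    word = (exp << 26) | (mantissa << 9) | overhead
    return struct.pack("!II", ssrc, word)   # network byte order


def unpack_fci(data: bytes) -> Tuple[int, int, int]:
    """Return (ssrc, MxTBR in bits per second, overhead in bytes)."""
    ssrc, word = struct.unpack("!II", data[:8])
    exp = word >> 26
    mantissa = (word >> 9) & 0x1FFFF
    overhead = word & 0x1FF
    return ssrc, mantissa << exp, overhead   # MxTBR = mantissa * 2^exp


def update_avg_oh(avg_oh: float, pckt_oh: float) -> float:
    """avg_OH (new) = 15/16*avg_OH (old) + 1/16*pckt_OH."""
    return 15 / 16 * avg_oh + 1 / 16 * pckt_oh
```

   For example, under these assumptions a limit of 1000000 bits per
   second is carried as mantissa 125000 with exponent 3, and the
   running average is expected to be rounded to an integer before it
   is placed in the 9-bit Measured Overhead field.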

   If a maximum bit rate has been negotiated through signaling, the
   maximum total media bit rate that the receiver reports in a TMMBR
   message MUST NOT exceed the negotiated value converted to a common
   basis (i.e. with overheads adjusted to bring it to the same reference
   protocol layer).

   Within the common packet header for feedback messages (as defined in
   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
   indicates the source of the request, and the "SSRC of media source"
   is not used and SHALL be set to 0. Within a particular TMMBR FCI
   entry, the "SSRC of media sender" in the FCI field denotes the media
   sender the tuple applies to. This is useful in the multicast or
   translator topologies where the reporting entity may address all of
   the media senders in a single TMMBR message using multiple FCI
   entries.

   The media receiver SHALL save the contents of the latest TMMBN
   message received from each media sender.

   The media receiver MAY send a TMMBR FCI entry to a particular media
   sender under the following circumstances:

   o  before any TMMBN message has been received from that media
      sender;

   o  when the media receiver has been identified as the source of a
      bounding tuple within the latest TMMBN message received from
      that media sender, and the value of the maximum total media
      bit rate or the overhead relating to that media sender has
      changed;

   o  when the media receiver has not been identified as the source
      of a bounding tuple within the latest TMMBN message received
      from that media sender, and, after the media receiver applies
      the incremental algorithm from section 3.5.4.2 or a stricter
      equivalent, the media receiver's tuple relating to that media
      sender is determined to belong to the bounding set.
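
   The three circumstances above can be sketched in Python as follows.
   This is an illustration only, not a normative implementation. It
   follows steps 1-9 of the initial algorithm of section 3.5.4.2 and,
   for the incremental case, relies on the statement in that section
   that the basic algorithm can be run on the existing selected list
   plus the new or changed tuple. The names Tmmbr, bounding_set and
   may_send_tmmbr are hypothetical. The intersection formula is derived
   here from the line form y = mx + b with slope -8*OH and is assumed
   to correspond to equation (3). Treating a zero overhead as an
   unbounded packet rate is also an assumption.

```python
from typing import List, NamedTuple, Optional


class Tmmbr(NamedTuple):
    ssrc: int        # owner of the tuple
    max_br: float    # maximum total media bit rate, bits per second
    oh: float        # measured overhead, bytes per packet


def max_pr(t: Tmmbr, smaxpr: Optional[float]) -> float:
    # Equation (4): packet rate at which the line crosses the X-axis.
    x = t.max_br / (8 * t.oh) if t.oh > 0 else float("inf")
    return x if smaxpr is None else min(x, smaxpr)


def intersection(a: Tmmbr, b: Tmmbr) -> float:
    # Packet rate where lines y = max_br - 8*oh*x of a and b cross.
    return (b.max_br - a.max_br) / (8 * (b.oh - a.oh))


def bounding_set(tuples: List[Tmmbr],
                 smaxpr: Optional[float] = None) -> List[Tmmbr]:
    if not tuples:
        return []
    # Steps 1-2: one tuple per overhead value (lowest bit rate), sorted.
    best = {}
    for t in tuples:
        if t.oh not in best or t.max_br < best[t.oh].max_br:
            best[t.oh] = t
    candidates = sorted(best.values(), key=lambda t: t.oh)
    # Step 3: lowest bit rate first; ties go to the highest overhead.
    first = min(candidates, key=lambda t: (t.max_br, -t.oh))
    selected = [(first, 0.0, max_pr(first, smaxpr))]
    # Step 4: drop candidates with lower overhead than the first member.
    candidates = [t for t in candidates if t.oh > first.oh]
    for cand in candidates:                      # steps 5 and 9
        while True:
            last, last_x, last_max = selected[-1]
            pr = intersection(last, cand)        # step 6
            if pr <= last_x:                     # step 7
                selected.pop()
                continue
            break
        if pr < last_max:                        # step 8
            selected.append((cand, pr, max_pr(cand, smaxpr)))
    return [t for t, _, _ in selected]


def may_send_tmmbr(own: Tmmbr, last_tmmbn: Optional[List[Tmmbr]],
                   values_changed: bool,
                   smaxpr: Optional[float] = None) -> bool:
    if last_tmmbn is None:                       # no TMMBN received yet
        return True
    if any(t.ssrc == own.ssrc for t in last_tmmbn):
        return values_changed                    # owner: only on change
    updated = bounding_set(list(last_tmmbn) + [own], smaxpr)
    return any(t.ssrc == own.ssrc for t in updated)
```

   With hypothetical tuples A = (100000 bit/s, 40 bytes), B = (160000
   bit/s, 400 bytes) and C = (120000 bit/s, 40 bytes), the sketch
   yields the bounding set {A, B}, so a receiver owning C would refrain
   from sending, which mirrors the situation drawn in Figure 1.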

   A TMMBR FCI entry MAY be repeated in subsequent TMMBR messages if no
   Temporary Maximum Media Stream Bit-Rate Notification (TMMBN) FCI has
   been received from the media sender at the time of transmission of
   the next RTCP packet. The bit rate value of a TMMBR FCI entry MAY be
   changed from one TMMBR message to the next. The overhead measurement
   SHALL be updated to the current value of avg_OH each time the entry
   is sent.

   If the value set by a TMMBR message is expected to be permanent, the
   TMMBR setting party SHOULD renegotiate the session parameters to
   reflect that using session setup signaling, e.g. a SIP re-invite.

   Behaviour at the Media Sender (Receiver of the TMMBR)

   When it receives a TMMBR message containing an FCI entry relating to
   it, the media sender SHALL use an initial or incremental algorithm as
   applicable to determine the bounding set of tuples based on the new
   information. The algorithm used SHALL be at least as strict as the
   corresponding algorithm defined in section 3.5.4.2. The media sender
   MAY accumulate TMMBR requests over a small interval (relative to the
   RTCP sending interval) before making this calculation.

   Once it has determined the bounding set of tuples, the media sender
   MAY use any combination of packet rate and net media bit rate within
   the feasible region that these tuples describe to produce a lower
   total media stream bit rate, as it may need to address a congestion
   situation or other limiting factors. See section 5 (congestion
   control) for more discussion.

   If the media sender concludes that it can increase the maximum total
   media bit rate value, it SHALL wait before actually doing so, for a
   period long enough to allow a media receiver to respond to the TMMBN
   if it determines that its tuple belongs in the bounding set.
   This
   delay period is estimated by the formula:

      2 * RTT + T_Dither_Max,

   where RTT is the longest round trip time known to the media sender
   and T_Dither_Max is defined in section 3.4 of [RFC4585].

   A TMMBN message SHALL be sent by the media sender at the earliest
   possible point in time, in response to any TMMBR messages received
   since the last sending of TMMBN. The TMMBN message indicates the
   calculated set of bounding tuples and the owners of those tuples at
   the time of the transmission of the message.

   An SSRC may time out according to the default rules for RTP session
   participants, i.e. the media sender has not received any RTP or RTCP
   packets from the owner for the last five regular reporting intervals.
   An SSRC may also explicitly leave the session, with the participant
   indicating this through the transmission of an RTCP BYE packet or
   using an external signaling channel. If the media sender determines
   that the owner of a tuple in the bounding set has left the session,
   the media sender shall transmit a new TMMBN containing the
   previously-determined set of bounding tuples but with the tuple
   belonging to the departed owner removed.

   A media sender MAY proactively initiate the equivalent to a TMMBR
   message to itself, when it is aware that its transmission path is
   more restrictive than the current limitations. As a result, a TMMBN
   indicating the media source itself as the owner of a tuple is
   sent, thereby avoiding unnecessary TMMBR messages from other
   participants. However, like any other participant, when the media
   sender becomes aware of changed limitations, it is required to change
   the tuple, and to send a corresponding TMMBN.

   Discussion

   Due to the unreliable nature of transport of TMMBR and TMMBN, the
   above rules may lead to the sending of TMMBR messages which appear to
   disobey those rules.
   Furthermore, in multicast scenarios it can
   happen that more than one "non-owning" session participant may
   determine, rightly or wrongly, that its tuple belongs in the bounding
   set. This is not critical for a number of reasons:

   a) If a TMMBR message is lost in transmission, either the media
      sender sends a new TMMBN message in response to some other media
      receiver or it does not send a new TMMBN message at all. In the
      first case, the media receiver applies the incremental algorithm
      and, if it determines that its tuple should be part of the
      bounding set, sends out another TMMBR. In the second case, it
      repeats the sending of a TMMBR unconditionally. Either way, the
      media sender eventually gets the information it needs.

   b) Similarly, if a TMMBN message gets lost, the media receiver that
      has sent the corresponding TMMBR request does not receive the
      notification and is expected to re-send the request and trigger
      the transmission of another TMMBN.

   c) If multiple competing TMMBR messages are sent by different session
      participants, then the algorithm can be applied taking all of
      these messages into account, and the resulting TMMBN provides the
      participants with an updated view of how their tuples compare with
      the bounding set.

   d) If more than one session participant happens to send TMMBR
      messages at the same time and with the same tuple component
      values, it does not matter which if either tuple is taken into the
      bounding set. The losing session participant will determine after
      applying the algorithm that its tuple does not enter the bounding
      set, and will therefore stop sending its TMMBR request.

   It is important to consider the security risks involved with faked
   TMMBRs. See the security considerations in Section 6.

   As indicated already, the feedback messages may be used in both
   multicast and unicast sessions in any of the specified topologies.
   However, for sessions with a large number of participants, using the
   lowest common denominator, as required by this mechanism, may not be
   the most suitable course of action. Large sessions may need to
   consider other ways to adapt the bit rate to participants'
   capabilities, such as partitioning the session into different quality
   tiers, or using some other method of achieving bit rate scalability.

4.2.1.3. Timing Rules

   The first transmission of the TMMBR request message MAY use early or
   immediate feedback in cases when timeliness is desirable. Any
   repetition of a request message SHOULD use regular RTCP mode for its
   transmission timing.

4.2.1.4. Handling in Translators and Mixers

   Media translators and mixers will need to receive and respond to
   TMMBR messages as they are part of the chain that provides a certain
   media stream to the receiver. The mixer or translator may act
   locally on the TMMBR request and thus generate a TMMBN to indicate
   that it has done so. Alternatively, in the case of a media
   translator it can forward the request, or in the case of a mixer
   generate one of its own and pass it forward. In the latter case, the
   mixer will need to send a TMMBN back to the original requestor to
   indicate that it is handling the request.

4.2.2. Temporary Maximum Media Stream Bit Rate Notification (TMMBN)

   The FCI field of the TMMBN Feedback message may contain zero, one or
   more TMMBN FCI entries.

4.2.2.1.
Message Format 1802 The Feedback Control Information (FCI) consists of zero, one or more 1803 TMMBN FCI entries with the following syntax: 1805 0 1 2 3 1806 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1807 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1808 | SSRC | 1809 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1810 | MxTBR Exp | MxTBR Mantissa |Measured Overhead| 1811 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1813 Figure 3 - Syntax of an FCI entry in the TMMBN message 1815 SSRC (32 bits): The SSRC value of the "owner" of this tuple. 1817 MxTBR Exp (6 bits): The exponential scaling of the mantissa for the 1818 maximum total media bit rate value. The value is an 1819 unsigned integer [0..63]. 1821 MxTBR Mantissa (17 bits): The mantissa of the maximum total media 1822 bit rate value as an unsigned integer. 1824 Measured Overhead (9 bits): The measured average packet overhead 1825 value in bytes represented as an unsigned integer. 1827 Thus the FCI within the TMMBN message contains entries indicating the 1828 bounding tuples. For each tuple, the entry gives the owner by the 1829 SSRC, followed by the applicable maximum total media bit rate and 1830 overhead value. 1832 The length of the TMMBN message SHALL be set to 2+2*N where N is the 1833 number of TMMBN FCI entries. 1835 4.2.2.2. Semantics 1837 This feedback message is used to notify the senders of any TMMBR 1838 message that one or more TMMBR messages have been received or that an 1839 owner has left the session. It indicates to all participants the 1840 current set of bounding tuples and the "owners" of those tuples. 1842 Within the common packet header for feedback messages (as defined in 1843 section 6.1 of [RFC4585]), the "SSRC of the packet sender" field 1844 indicates the source of the notification. The "SSRC of media source" 1845 is not used and SHALL be set to 0. 
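
   Informative sketch: the FCI entry layout of Figure 3 and the length
   rule of section 4.2.2.1 can be illustrated with a short Python
   parser. The function names are hypothetical and not part of this
   specification. The sketch assumes that the bit rate is reconstructed
   as mantissa * 2^exp, in the same way as for the TMMBR FCI entry.

```python
import struct


def parse_tmmbn_fci(fci):
    """Split a TMMBN FCI field into (ssrc, max_total_bit_rate, overhead) tuples.

    Assumption: the bit rate is mantissa * 2**exp, as for the TMMBR entry.
    """
    if len(fci) % 8 != 0:
        raise ValueError("TMMBN FCI must be a whole number of 8-octet entries")
    entries = []
    for offset in range(0, len(fci), 8):
        # Two 32-bit words in network byte order: SSRC, then the packed fields.
        ssrc, word = struct.unpack_from("!II", fci, offset)
        exp = word >> 26                   # MxTBR Exp, top 6 bits
        mantissa = (word >> 9) & 0x1FFFF   # MxTBR Mantissa, next 17 bits
        overhead = word & 0x1FF            # Measured Overhead, low 9 bits
        entries.append((ssrc, mantissa << exp, overhead))
    return entries


def tmmbn_length_field(n_entries):
    """Value of the RTCP length field for a TMMBN with n_entries FCI entries."""
    return 2 + 2 * n_entries
```

   An empty FCI yields an empty list, which corresponds to a TMMBN
   sent without any FCI entries.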

   A TMMBN message SHALL be scheduled for transmission after the
   reception of a TMMBR message with an FCI entry identifying this media
   sender. Only a single TMMBN SHALL be sent, even if more than one
   TMMBR message is received between the scheduling of the transmission
   and the actual transmission of the TMMBN message. The TMMBN message
   indicates the bounding tuples and their owners at the time of
   transmitting the message. The bounding tuples included SHALL be the
   set arrived at through application of the applicable algorithm of
   section 3.5.4.2 or an equivalent, applied to the previous bounding
   set if any and tuples received in TMMBR messages since the last TMMBN
   was transmitted.

   The reception of a TMMBR message SHALL still result in the
   transmission of a TMMBN message even if, after application of the
   algorithm, the newly reported TMMBR tuple is not accepted into the
   bounding set. In such a case the bounding tuples and their owners
   are not changed, unless the TMMBR was from an owner of a tuple within
   the previously calculated bounding set. This procedure allows
   session participants that did not see the last TMMBN message to get a
   correct view of this media sender's state.

   As indicated in section 4.2.1.2, when a media sender determines that
   an "owner" of a bounding tuple has left the session, then that tuple
   is removed from the bounding set, and the media sender SHALL send a
   TMMBN message indicating the remaining bounding tuples. If there are
   no remaining bounding tuples a TMMBN without any FCI SHALL be sent to
   indicate this.

      Note: if any media receivers remain in the session, this last will
      be a temporary situation. The empty TMMBN will cause every
      remaining media receiver to determine that its limitation belongs
      in the bounding set and send a TMMBR in consequence.

   In unicast scenarios (i.e.
   where a single sender talks to a single
   receiver), the aforementioned algorithm to determine ownership
   degenerates to the media receiver becoming the "owner" of the one
   bounding tuple as soon as the media receiver has issued the first
   TMMBR message.

4.2.2.3. Timing Rules

   The TMMBN acknowledgement SHOULD be sent as soon as allowed by the
   applied timing rules for the session. Immediate or early feedback
   mode SHOULD be used for these messages.

4.2.2.4. Handling by Translators and Mixers

   As discussed in Section 4.2.1.4, mixers or translators may need to
   issue TMMBN messages as responses to TMMBR messages for SSRCs
   handled by them.

4.3. Payload Specific Feedback Messages

   As specified by section 6.1 of RFC 4585 [RFC4585], Payload-Specific
   FB messages are identified by the RTCP packet type value PT=PSFB
   (206).

   AVPF [RFC4585] defines three payload-specific feedback messages and
   one application layer feedback message. This memo specifies four
   additional payload-specific feedback messages. All are identified by
   means of the FMT parameter as follows:

   Assigned in [RFC4585]:

      1:     Picture Loss Indication (PLI)
      2:     Slice Loss Indication (SLI)
      3:     Reference Picture Selection Indication (RPSI)
      15:    Application layer FB message
      31:    reserved for future expansion of the number space

   Assigned in this memo:

      4:     Full Intra Request Command (FIR)
      5:     Temporal-Spatial Trade-off Request (TSTR)
      6:     Temporal-Spatial Trade-off Notification (TSTN)
      7:     Video Back Channel Message (VBCM)

   Unassigned:

      0:     unassigned
      8-14:  unassigned
      16-30: unassigned

   The following subsections define the new FCI formats for the
   payload-specific feedback messages.

4.3.1. Full Intra Request (FIR)

   The FIR message is identified by RTCP packet type value PT=PSFB and
   FMT=4.

   The FCI field MUST contain one or more FIR entries.
   Each entry
   applies to a different media sender, identified by its SSRC.

4.3.1.1. Message Format

   The Feedback Control Information (FCI) for the Full Intra Request
   consists of one or more FCI entries, the content of which is depicted
   in Figure 4. The length of the FIR feedback message MUST be set to
   2+2*N, where N is the number of FCI entries.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Seq. nr       |    Reserved                                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 4 - Syntax of an FCI entry in the FIR message

   SSRC (32 bits): The SSRC value of the media sender which is
            requested to send a decoder refresh point.

   Seq. nr (8 bits): Command sequence number. The sequence number
            space is unique for each pairing of the SSRC of command
            source and the SSRC of the command target. The sequence
            number SHALL be increased by 1 modulo 256 for each new
            command. A repetition SHALL NOT increase the sequence
            number. The initial value is arbitrary.

   Reserved (24 bits): All bits SHALL be set to 0 by the sender and
            SHALL be ignored on reception.

   The semantics of this feedback message is independent of the RTP
   payload type.

4.3.1.2. Semantics

   Upon reception of FIR, the encoder MUST send a decoder refresh point
   (see section 2.2) as soon as possible.

      Note: Currently, video appears to be the only useful application
      for FIR, as it appears to be the only RTP payload widely deployed
      that relies heavily on media prediction across RTP packet
      boundaries.
      However, use of FIR could also reasonably be
      envisioned for other media types that share essential properties
      with compressed video, namely cross-frame prediction (whatever a
      frame may be for that media type). One possible example may be the
      dynamic updates of MPEG-4 scene descriptions. It is suggested that
      payload formats for such media types refer to FIR and other message
      types defined in this specification and in AVPF [RFC4585], instead
      of creating similar mechanisms in the payload specifications. The
      payload specifications may have to explain how the payload-specific
      terminologies map to the video-centric terminology used herein.

      Note: In environments where the sender has no control over the
      codec (e.g. when streaming pre-recorded and pre-coded content), the
      reaction to this command cannot be specified. One suitable
      reaction of a sender would be to skip forward in the video bit
      stream to the next decoder refresh point. In other scenarios, it
      may be preferable not to react to the command at all, e.g. when
      streaming to a large multicast group. Other reactions may also be
      possible. When deciding on a strategy, a sender could take into
      account factors such as the size of the receiving group, the
      "importance" of the sender of the FIR message (however "importance"
      may be defined in this specific application), the frequency of
      decoder refresh points in the content, and so on. However, a
      session that predominantly handles pre-coded content is not
      expected to use FIR at all.

   The sender MUST consider congestion control as outlined in section 5,
   which MAY restrict its ability to send a decoder refresh point
   quickly.

      Note: The relationship between the Picture Loss Indication and FIR
      is as follows.
      As discussed in section 6.3.1 of AVPF [RFC4585], a
      Picture Loss Indication informs the encoder about the loss of a
      picture and hence the likelihood of misalignment of the reference
      pictures between the encoder and decoder. Such a scenario is
      normally related to losses in an ongoing connection. In
      point-to-point scenarios, and without the presence of advanced
      error resilience tools, one possible option for an encoder
      consists in sending a decoder refresh point. However, there are
      other options. One example is that the media sender ignores the
      PLI, because the embedded stream redundancy is likely to clean up
      the reproduced picture within a reasonable amount of time. The
      FIR, in contrast, leaves a (real-time) encoder no choice but to
      send a decoder refresh point. It does not allow the encoder to
      take into account any considerations such as the ones mentioned
      above.

      Note: Mandating a maximum delay for completing the sending of a
      decoder refresh point would be desirable from an application
      viewpoint, but is problematic from a congestion control point of
      view. "As soon as possible" as mentioned above appears to be a
      reasonable compromise.

   FIR SHALL NOT be sent as a reaction to picture losses -- it is
   RECOMMENDED to use PLI instead. FIR SHOULD be used only in
   situations where not sending a decoder refresh point would render the
   video unusable for the users.

      Note: A typical example where sending FIR is appropriate is when,
      in a multipoint conference, a new user joins the session and no
      regular decoder refresh point interval is established. Another
      example would be a video switching MCU that changes streams. Here,
      normally, the MCU issues a FIR to the new sender so as to force it
      to emit a decoder refresh point.
      The decoder refresh point normally
      includes a Freeze Picture Release (defined outside this
      specification), which re-starts the rendering process of the
      receivers. Both techniques mentioned are commonly used in
      MCU-based multipoint conferences.

   Other RTP payload specifications such as RFC 4587 [RFC4587] already
   define a feedback mechanism for certain codecs. An application
   supporting both schemes MUST use the feedback mechanism defined in
   this specification when sending feedback. For backward compatibility
   reasons, such an application SHOULD also be capable of receiving and
   reacting to the feedback scheme defined in the respective RTP payload
   format, if this is required by that payload format.

   Within the common packet header for feedback messages (as defined in
   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
   indicates the source of the request, and the "SSRC of media source"
   is not used and SHALL be set to 0. The SSRCs of the media senders to
   which the FIR command applies are in the corresponding FCI entries.
   A FIR message MAY contain requests to multiple media senders, using
   one FCI entry per target media sender.

4.3.1.3. Timing Rules

   The timing follows the rules outlined in section 3 of [RFC4585]. FIR
   commands MAY be used with early or immediate feedback. The FIR
   feedback message MAY be repeated. If using immediate feedback mode
   the repetition SHOULD wait at least one RTT before being sent. In
   early or regular RTCP mode the repetition is sent in the next regular
   RTCP packet.

4.3.1.4. Handling of FIR Message in Mixers and Translators

   A media translator or a mixer performing media encoding of the
   content for which the session participant has issued a FIR is
   responsible for acting upon it. A mixer acting upon a FIR SHOULD NOT
   forward the message unaltered; instead it SHOULD issue a FIR itself.

4.3.1.5. Remarks

   In conjunction with video codecs, FIR messages typically trigger the
   sending of full intra or IDR pictures. Both are several times larger
   than predicted (inter) pictures. Their size is independent of the
   time they are generated. In most environments, especially when
   employing bandwidth-limited links, the use of an intra picture
   implies an allowed delay that is a significant multiple of the
   typical frame duration. An example: if the sending frame rate is 10
   fps, and an intra picture is assumed to be 10 times as big as an
   inter picture, then a full second of latency has to be accepted. In
   such an environment there is no need for a particularly short delay
   in sending the FIR message. Hence waiting for the next possible time
   slot allowed by RTCP timing rules as per [RFC4585] should not have an
   overly negative impact on the system performance.

4.3.2. Temporal-Spatial Trade-off Request (TSTR)

   The TSTR feedback message is identified by RTCP packet type value
   PT=PSFB and FMT=5.

   The FCI field MUST contain one or more TSTR FCI entries.

4.3.2.1. Message Format

   The content of the FCI entry for the Temporal-Spatial Trade-off
   Request is depicted in Figure 5. The length of the feedback message
   MUST be set to 2+2*N, where N is the number of FCI entries included.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Seq. nr       |  Reserved                           | Index   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 5 - Syntax of an FCI Entry in the TSTR Message

   SSRC (32 bits): The SSRC of the media sender which is requested to
            apply the tradeoff value given in Index.

   Seq. nr (8 bits): Request sequence number.
            The sequence number
            space is unique for each pairing of the SSRC of request
            source and the SSRC of the request target. The sequence
            number SHALL be increased by 1 modulo 256 for each new
            command. A repetition SHALL NOT increase the sequence
            number. The initial value is arbitrary.

   Reserved (19 bits): All bits SHALL be set to 0 by the sender and
            SHALL be ignored on reception.

   Index (5 bits): An integer value between 0 and 31 that indicates
            the relative trade-off that is requested. An index value
            of 0 indicates the highest possible spatial quality, while
            31 indicates the highest possible temporal resolution.

4.3.2.2. Semantics

   A decoder can suggest a temporal-spatial trade-off level by sending a
   TSTR message to an encoder. If the encoder is capable of adjusting
   its temporal-spatial trade-off, it SHOULD take into account the
   received TSTR message for future coding of pictures. A value of 0
   suggests a high spatial quality and a value of 31 suggests a high
   frame rate. The progression of values from 0 to 31 indicates
   monotonically a desire for higher frame rate. The index values do
   not correspond to precise values of spatial quality or frame rate.

   The reaction to the reception of more than one TSTR message by a
   media sender from different media receivers is left open to the
   implementation. The selected trade-off SHALL be communicated to the
   media receivers by the means of the TSTN message.

   Within the common packet header for feedback messages (as defined in
   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
   indicates the source of the request, and the "SSRC of media source"
   is not used and SHALL be set to 0. The SSRCs of the media senders to
   which the TSTR applies are in the corresponding FCI entries.

   A TSTR message MAY contain requests to multiple media senders, using
   one FCI entry per target media sender.

4.3.2.3. Timing Rules

   The timing follows the rules outlined in section 3 of [RFC4585].
   This request message is not time critical and SHOULD be sent using
   regular RTCP timing. Only if it is known that the user interface
   requires quick feedback, the message MAY be sent with early or
   immediate feedback timing.

4.3.2.4. Handling of message in Mixers and Translators

   A mixer or media translator that encodes content sent to the session
   participant issuing the TSTR SHALL consider the request to determine
   if it can fulfill it by changing its own encoding parameters. A
   media translator unable to fulfill the request MAY forward the
   request unaltered towards the media sender. A mixer encoding for
   multiple session participants will need to consider the joint needs
   of these participants before generating a TSTR on its own behalf
   towards the media sender. See also the discussion in Section 3.5.2.

4.3.2.5. Remarks

   The term "spatial quality" does not necessarily refer to the
   resolution, measured by the number of pixels the reconstructed video
   is using. In fact, in most scenarios the video resolution stays
   constant during the lifetime of a session. However, all video
   compression standards have means to adjust the spatial quality at a
   given resolution, often influenced by the Quantizer Parameter or QP.
   A numerically low QP results in a good reconstructed picture quality,
   whereas a numerically high QP yields a coarse picture. The typical
   reaction of an encoder to this request is to change its rate control
   parameters to use a lower frame rate and a numerically lower (on
   average) QP, or vice versa. The precise mapping of Index value to
   frame rate and QP is intentionally left open here, as it depends on
   factors such as the compression standard employed, spatial
   resolution, content, bit rate, and so on.

4.3.3. Temporal-Spatial Trade-off Notification (TSTN)

   The TSTN message is identified by RTCP packet type value PT=PSFB and
   FMT=6.

   The FCI field SHALL contain one or more TSTN FCI entries.

4.3.3.1. Message Format

   The content of an FCI entry for the Temporal-Spatial Trade-off
   Notification is depicted in Figure 6. The length of the TSTN message
   MUST be set to 2+2*N, where N is the number of FCI entries.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Seq. nr       |  Reserved                           | Index   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 6 - Syntax of the TSTN

   SSRC (32 bits): The SSRC of the source of the TSTR request which
            resulted in this Notification.

   Seq. nr (8 bits): The sequence number value from the TSTR request
            that is being acknowledged.

   Reserved (19 bits): All bits SHALL be set to 0 by the sender and
            SHALL be ignored on reception.

   Index (5 bits): The trade-off value the media sender is using
            henceforth.

      Informative note: The returned trade-off value (Index) may differ
      from the requested one, for example in cases where a media encoder
      cannot tune its trade-off, or when pre-recorded content is used.

4.3.3.2. Semantics

   This feedback message is used to acknowledge the reception of a TSTR.
   One TSTN entry in a TSTN feedback message SHALL be sent for each TSTR
   entry targeted to this session participant, i.e. each TSTR entry
   received whose SSRC field carries the receiving entity's SSRC.
   A single TSTN message MAY acknowledge multiple requests using
   multiple FCI entries. The index value included SHALL be the same in
   all FCI entries of the TSTN message.
   Including an FCI entry for each
   requestor allows each requesting entity to determine that the media
   sender received the request. The Notification SHALL also be sent in
   response to TSTR repetitions received. If the request receiver has
   received TSTR with several different sequence numbers from a single
   requestor it SHALL only respond to the request with the highest
   (modulo 256) sequence number.

   The TSTN SHALL include the Temporal-Spatial Trade-off index that will
   be used as a result of the request. This is not necessarily the same
   index as requested, as the media sender may need to aggregate
   requests from several requesting session participants. It may also
   have some other policies or rules that limit the selection.

   Within the common packet header for feedback messages (as defined in
   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
   indicates the source of the Notification, and the "SSRC of media
   source" is not used and SHALL be set to 0. The SSRCs of the
   requesting entities to which the Notification applies are in the
   corresponding FCI entries.

4.3.3.3. Timing Rules

   The timing follows the rules outlined in section 3 of [RFC4585].
   This acknowledgement message is not extremely time critical and
   SHOULD be sent using regular RTCP timing.

4.3.3.4. Handling of TSTN in Mixers and Translators

   A mixer or translator that acts upon a TSTR SHALL also send the
   corresponding TSTN. In cases where it needs to forward a TSTR itself
   the notification message MAY need to be delayed until the TSTR has
   been responded to.

4.3.3.5. Remarks

   None

4.3.4. H.271 Video Back Channel Message (VBCM)

   The VBCM is identified by RTCP packet type value PT=PSFB and FMT=7.

   The FCI field MUST contain one or more VBCM FCI entries.

4.3.4.1. Message Format

   The syntax of an FCI entry within the VBCM indication is depicted in
   Figure 7.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Seq. nr       |0| Payload Type| Length                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    VBCM Octet String....      |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 7 - Syntax of an FCI Entry in the VBCM Message

   SSRC (32 bits): The SSRC value of the media sender that is requested
      to instruct its encoder to react to the VBCM message.

   Seq. nr (8 bits): Command sequence number. The sequence number space
      is unique for each pairing of the SSRC of command source and the
      SSRC of the command target. The sequence number SHALL be
      increased by 1 modulo 256 for each new command. A repetition
      SHALL NOT increase the sequence number. The initial value is
      arbitrary.

   0: Must be set to 0 by the sender and should not be acted upon by the
      message receiver.

   Payload Type (7 bits): The RTP payload type for which the VBCM bit
      stream must be interpreted.

   Length (16 bits): The length of the VBCM octet string in octets
      exclusive of any padding octets.

   VBCM Octet String (Variable length): This is the octet string
      generated by the decoder carrying a specific feedback
      sub-message.

   Padding (Variable length): Bits set to 0 to make up a 32 bit
      boundary.

4.3.4.2. Semantics

   The "payload" of the VBCM indication carries different types of
   codec-specific feedback information. The type of feedback
   information can be classified as a 'status report' (such as an
   indication that a bit stream was received without errors, or that a
   partial or complete picture or block was lost) or 'update requests'
   (such as complete refresh of the bit stream).
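
   Informative sketch: because the VBCM octet string is of variable
   length, a receiver has to walk the FCI field entry by entry. The
   Python fragment below shows one way to do this for the layout of
   Figure 7. The function name is hypothetical, and the sketch assumes
   that each octet string is padded up to the next multiple of 4
   octets, which follows from the 8-octet entry header and the 32 bit
   boundary rule given for the Padding field.

```python
import struct


def parse_vbcm_fci(fci):
    """Return a list of (ssrc, seq_nr, payload_type, octet_string) tuples."""
    entries = []
    offset = 0
    while offset < len(fci):
        if len(fci) - offset < 8:
            raise ValueError("truncated VBCM FCI entry header")
        # SSRC (32), Seq. nr (8), zero bit + Payload Type (1+7), Length (16).
        ssrc, seq_nr, pt_octet, length = struct.unpack_from("!IBBH", fci, offset)
        offset += 8
        payload_type = pt_octet & 0x7F      # the leading "0" bit is not acted upon
        padded_length = (length + 3) & ~3   # round up to a 32 bit boundary
        if len(fci) - offset < padded_length:
            raise ValueError("truncated VBCM octet string")
        octet_string = fci[offset:offset + length]  # padding octets are dropped
        offset += padded_length
        entries.append((ssrc, seq_nr, payload_type, octet_string))
    return entries
```

   The returned octet string is the H.271 message itself. Interpreting
   it is codec specific and outside the scope of this sketch.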

      Note: There are possible overlaps between the VBCM sub-messages
      and CCM/AVPF feedback messages, such as FIR. Please see
      section 3.5.3 for further discussion.

   The different types of feedback sub-messages carried in the VBCM are
   indicated by the "payloadType" as defined in [VBCM]. These
   sub-message types are reproduced below for convenience.
   "payloadType", in ITU-T Rec. H.271 terminology, refers to the
   sub-type of the H.271 message and should not be confused with an RTP
   payload type.

   Payload  Message Content
   Type
   ---------------------------------------------------------------------
   0        One or more pictures without detected bit stream error
            mismatch
   1        One or more pictures that are entirely or partially lost
   2        A set of blocks of one picture that is entirely or partially
            lost
   3        CRC for one parameter set
   4        CRC for all parameter sets of a certain type
   5        A "reset" request indicating that the sender should
            completely refresh the video bit stream as if no prior bit
            stream data had been received
   > 5      Reserved for future use by ITU-T

           Table 2: H.271 message types ("payloadTypes")

   The bit string or the "payload" of a VBCM message is of variable
   length and is self-contained and coded in a variable length, binary
   format. The media sender necessarily has to be able to parse this
   optimized binary format to make use of VBCM messages.

   Each of the different types of sub-messages (indicated by
   payloadType) may have different semantics depending on the codec
   used.

   Within the common packet header for feedback messages (as defined in
   section 6.1 of [RFC4585]), the "SSRC of the packet sender" field
   indicates the source of the request, and the "SSRC of media source"
   is not used and SHALL be set to 0. The SSRCs of the media senders to
   which the VBCM message applies are in the corresponding FCI
   entries.
   The sender of the VBCM message MAY send H.271 messages to
   multiple media senders and MAY send more than one H.271 message to
   the same media sender within the same VBCM message.

4.3.4.3. Timing Rules

   The timing follows the rules outlined in section 3 of [RFC4585]. The
   different sub-message types may have different properties with regard
   to the timing of messages that should be used. If several different
   types are included in the same feedback packet then the requirements
   for the sub-message type with the most stringent requirements should
   be followed.

4.3.4.4. Handling of message in Mixer or Translator

   The handling of VBCM in a mixer or translator is sub-message type
   dependent.

4.3.4.5. Remarks

   Please see section 3.5.3 for a discussion of the usage of H.271
   messages and messages defined in AVPF [RFC4585] and this memo with
   similar functionality.

      Note: There has been some discussion whether the payload type field
      in this message is needed. It will be needed if there is
      potentially more than one VBCM-capable RTP payload type in the same
      session, and the semantics of a given VBCM message changes between
      payload types. For example, the picture identification mechanism
      in messages of H.271 type 0 is fundamentally different between
      H.263 and H.264 (although both use the same syntax). Therefore,
      the payload type field is justified here. There was a further
      comment that for TSTR and FIR such a need does not exist, because
      the semantics of TSTR and FIR are either loosely enough defined, or
      generic enough, to apply to all video payloads currently in
      existence/envisioned.

5. Congestion Control

   The correct application of the AVPF [RFC4585] timing rules prevents
   the network from being flooded by feedback messages.
   Hence, assuming
   a correct implementation and configuration, the RTCP channel cannot
   break its bit rate commitment and introduce congestion.

   The reception of some of the feedback messages modifies the behaviour
   of the media senders or, more specifically, the media encoders. Thus
   modified behaviour MUST respect the bandwidth limits that the
   application of congestion control provides. For example, when a
   media sender is reacting to a FIR, the unusually high number of
   packets that form the decoder refresh point have to be paced in
   compliance with the congestion control algorithm, even if the user
   experience suffers from a slowly transmitted decoder refresh point.

   A change of the Temporary Maximum Media Stream Bit Rate value can
   only mitigate congestion, but not cause congestion as long as
   congestion control is also employed. An increase of the value by a
   request REQUIRES the media sender to use congestion control when
   increasing its transmission rate to that value. A reduction of the
   value results in a reduced transmission bit rate thus reducing the
   risk for congestion.

6. Security Considerations

   The defined messages have certain properties that have security
   implications. These must be addressed and taken into account by
   users of this protocol.

   The defined setup signaling mechanism is sensitive to modification
   attacks that can result in session creation with sub-optimal
   configuration, and, in the worst case, session rejection. To prevent
   this type of attack, authentication and integrity protection of the
   setup signaling is required.

   Spoofed or maliciously created feedback messages of the type defined
   in this specification can have the following implications:

   a. severely reduced media bit rate due to false TMMBR messages
      that set the maximum to a very low value;

   b. assignment of the ownership of a bounding tuple to the wrong
      participant within a TMMBN message, potentially causing
      unnecessary oscillation in the bounding set as the mistakenly
      identified owner reports a change in its tuple and the true
      owner possibly holds back on changes until a correct TMMBN
      message reaches the participants;

   c. TSTR requests that result in a video quality different from the
      user's desire, rendering the session less useful;

   d. frequent FIR commands that potentially reduce the frame rate,
      making the video jerky, due to the frequent usage of decoder
      refresh points.

   To prevent these attacks there is a need to apply authentication and
   integrity protection of the feedback messages. This can be
   accomplished against threats external to the current RTP session
   using the RTP profile that combines SRTP [SRTP] and AVPF into SAVPF
   [SAVPF]. In the mixer cases, separate security contexts and
   filtering can be applied between the mixer and the participants thus
   protecting other users on the mixer from a misbehaving participant.

7. SDP Definitions

   Section 4 of [RFC4585] defines a new SDP [RFC4566] attribute,
   rtcp-fb, that may be used to negotiate the capability to handle
   specific AVPF commands and indications, such as Reference Picture
   Selection, Picture Loss Indication etc. The ABNF for rtcp-fb is
   described in section 4.2 of [RFC4585]. In this section we extend the
   rtcp-fb attribute to include the commands and indications that are
   described for codec control protocol in the present document. We
   also discuss the Offer/Answer implications for the codec control
   commands and indications.

7.1. Extension of the rtcp-fb Attribute

   As described in AVPF [RFC4585], the rtcp-fb attribute indicates the
   capability of using RTCP feedback.
AVPF specifies that the rtcp-fb
2501     attribute must only be used as a media level attribute and must not
2502     be provided at session level. All the rules described in [RFC4585]
2503     for the rtcp-fb attribute relating to payload type and to multiple rtcp-
2504     fb attributes in a session description also apply to the new feedback
2505     messages defined in this memo.

2507     The ABNF [RFC4234] for rtcp-fb as defined in [RFC4585] is
2508        "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF

2510     where rtcp-fb-pt is the payload type and rtcp-fb-val defines the type
2511     of the feedback message such as ack, nack, trr-int and rtcp-fb-id.
2512     For example, to indicate the support of feedback of picture loss
2513     indication, the sender declares the following in SDP

2515        v=0
2516        o=alice 3203093520 3203093520 IN IP4 host.example.com
2517        s=Media with feedback
2518        t=0 0
2519        c=IN IP4 host.example.com
2520        m=video 49170 RTP/AVPF 98
2521        a=rtpmap:98 H263-1998/90000
2522        a=rtcp-fb:98 nack pli

2524     In this document we define a new feedback value "ccm" which indicates
2525     the support of codec control using RTCP feedback messages. The "ccm"
2526     feedback value SHOULD be used with parameters, which indicate the
2527     specific codec control commands supported. In this draft we define
2528     four parameters, which can be used with the ccm feedback value type.

2530     o "fir" indicates the support of the Full Intra Request (FIR).
2531     o "tmmbr" indicates the support of the Temporary Maximum Media
2532       Stream Bit Rate Request/Notification (TMMBR/TMMBN). It has an
2533       optional sub parameter to indicate the session maximum packet
2534       rate to be used. If not included this defaults to infinity.
2535     o "tstr" indicates the support of the Temporal-Spatial Trade-off
2536       Request/Notification (TSTR/TSTN).
2537     o "vbcm" indicates the support of H.271 video back channel
2538       messages (VBCM). It has zero or more subparameters identifying
2539       the supported H.271 "payloadType" values.
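
The four parameters listed above can be read out of a session description with very little code. The following is a minimal sketch in Python, not a reference implementation. The names CcmAttr and parse_ccm are hypothetical and appear nowhere in this memo. The sketch assumes one attribute per line, single spaces between fields, and the "smaxpr=" spelling used in this memo's examples for the session maximum packet rate. Trailing text after "fir", "tstr", or an unknown token is ignored.

```python
import re
from typing import NamedTuple, Optional, Tuple


class CcmAttr(NamedTuple):
    """Hypothetical container for one parsed 'a=rtcp-fb:<pt> ccm ...' line."""
    payload_type: str             # "*" or the payload type number as text
    param: str                    # "fir", "tmmbr", "tstr", "vbcm" or a future token
    smaxpr: Optional[int]         # only set for "tmmbr"; None means not signalled
    vbcm_types: Tuple[int, ...]   # only filled for "vbcm"


def parse_ccm(line: str) -> Optional[CcmAttr]:
    """Return a CcmAttr for a ccm rtcp-fb attribute line, or None otherwise."""
    m = re.fullmatch(r"a=rtcp-fb:(\*|\d+) ccm (\S+)(?: (.*))?", line.strip())
    if m is None:
        return None  # not an rtcp-fb line, or not a ccm feedback value
    pt, param, rest = m.group(1), m.group(2), m.group(3) or ""
    smaxpr: Optional[int] = None
    vbcm_types: Tuple[int, ...] = ()
    if param == "tmmbr" and rest:
        # Optional sub parameter: session maximum packet rate.
        sm = re.fullmatch(r"smaxpr=(\d{1,15})", rest)
        if sm is None:
            return None
        smaxpr = int(sm.group(1))
    elif param == "vbcm" and rest:
        # Zero or more H.271 "payloadType" values, each up to 8 digits.
        fields = rest.split(" ")
        if not all(re.fullmatch(r"\d{1,8}", f) for f in fields):
            return None
        vbcm_types = tuple(int(f) for f in fields)
    return CcmAttr(pt, param, smaxpr, vbcm_types)
```

Calling parse_ccm on each "a=" line of a media section and discarding the None results gives the set of codec control parameters declared for that media stream. A missing smaxpr is reported as None so the caller can apply the "defaults to infinity" rule itself.
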
2541     In the ABNF for rtcp-fb-val defined in [RFC4585], there is a
2542     placeholder called rtcp-fb-id to define new feedback types. "ccm" is
2543     defined as a new feedback type in this document and the ABNF for the
2544     parameters for ccm are defined here (please refer to section 4.2 of
2545     [RFC4585] for complete ABNF syntax).

2547     rtcp-fb-param = SP "app" [SP byte-string]
2548                   / SP rtcp-fb-ccm-param
2549                   / ; empty

2551     rtcp-fb-ccm-param = "ccm" SP ccm-param

2553     ccm-param = "fir"   ; Full Intra Request
2554               / "tmmbr" [SP "smaxpr=" MaxPacketRateValue]
2555                         ; Temporary max media bit rate
2556               / "tstr"  ; Temporal Spatial Trade Off
2557               / "vbcm" *(SP subMessageType) ; H.271 VBCM messages
2558               / token [SP byte-string]
2559                         ; for future commands/indications
2560     subMessageType = 1*8DIGIT
2561     byte-string = <as defined in section 4.2 of [RFC4585]>
2562     MaxPacketRateValue = 1*15DIGIT

2564  7.2. Offer-Answer

2566     The Offer/Answer [RFC3264] implications for codec control protocol
2567     feedback messages are similar to those described in [RFC4585]. The
2568     offerer MAY indicate the capability to support selected codec
2569     commands and indications. The answerer MUST remove all ccm
2570     parameters which it does not understand or does not wish to use in
2571     this particular media session. The answerer MUST NOT add new ccm
2572     parameters in addition to what has been offered. The answer is
2573     binding for the media session and both offerer and answerer MUST only
2574     use feedback messages negotiated in this way.

2576     The session maximum packet rate parameter part of the TMMBR
2577     indication is declarative and everyone shall use the highest value
2578     indicated in a response. If the session maximum packet rate
2579     parameter is not present in an offer it SHALL NOT be included by the
2580     answerer.

2582  7.3. Examples

2584     Example 1: The following SDP describes a point-to-point video call
2585     with H.263, with the originator of the call declaring its capability
2586     to support the FIR and TSTR/TSTN codec control messages.
The SDP is
2587     carried in a high level signaling protocol like SIP.

2589        v=0
2590        o=alice 3203093520 3203093520 IN IP4 host.example.com
2591        s=Point-to-Point call
2592        c=IN IP4 192.0.2.124
2593        m=audio 49170 RTP/AVP 0
2594        a=rtpmap:0 PCMU/8000
2595        m=video 51372 RTP/AVPF 98
2596        a=rtpmap:98 H263-1998/90000
2597        a=rtcp-fb:98 ccm tstr
2598        a=rtcp-fb:98 ccm fir

2600     In the above example, when the sender receives a TSTR message from
2601     the remote party, it is capable of adjusting the trade-off as
2602     indicated in the RTCP TSTN feedback message.

2604     Example 2: The following SDP describes a SIP end point joining a
2605     video mixer that is hosting a multiparty video conferencing session.
2606     The participant supports only the FIR (Full Intra Request) codec
2607     control command and it declares it in its session description.

2609        v=0
2610        o=alice 3203093520 3203093520 IN IP4 host.example.com
2611        s=Multiparty Video Call
2612        c=IN IP4 192.0.2.124
2613        m=audio 49170 RTP/AVP 0
2614        a=rtpmap:0 PCMU/8000
2615        m=video 51372 RTP/AVPF 98
2616        a=rtpmap:98 H263-1998/90000
2617        a=rtcp-fb:98 ccm fir

2619     When the video MCU decides to route the video of this participant, it
2620     sends an RTCP FIR feedback message. Upon receiving this feedback
2621     message, the end point is required to generate a decoder refresh point.

2623     Example 3: The following example describes the Offer/Answer
2624     implications for the codec control messages. The Offerer wishes to
2625     support "tstr", "fir" and "tmmbr".
The offered SDP is

2627        -------------> Offer
2628        v=0
2629        o=alice 3203093520 3203093520 IN IP4 host.example.com
2630        s=Offer/Answer
2631        c=IN IP4 192.0.2.124
2632        m=audio 49170 RTP/AVP 0
2633        a=rtpmap:0 PCMU/8000
2634        m=video 51372 RTP/AVPF 98
2635        a=rtpmap:98 H263-1998/90000
2636        a=rtcp-fb:98 ccm tstr
2637        a=rtcp-fb:98 ccm fir
2638        a=rtcp-fb:* ccm tmmbr smaxpr=120

2640     The answerer wishes to support only the FIR and TSTR/TSTN messages
2641     and the answerer SDP is

2643        <---------------- Answer
2644        v=0
2645        o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2646        s=Offer/Answer
2647        c=IN IP4 192.0.2.37
2648        m=audio 47190 RTP/AVP 0
2649        a=rtpmap:0 PCMU/8000
2650        m=video 53273 RTP/AVPF 98
2651        a=rtpmap:98 H263-1998/90000
2652        a=rtcp-fb:98 ccm tstr
2653        a=rtcp-fb:98 ccm fir

2655     Example 4: The following example describes the Offer/Answer
2656     implications for H.271 Video back channel messages (VBCM). The
2657     Offerer wishes to support VBCM and the sub-messages of payloadType 1
2658     (one or more pictures that are entirely or partially lost) and 2 (a
2659     set of blocks of one picture that are entirely or partially lost).

2661        -------------> Offer
2662        v=0
2663        o=alice 3203093520 3203093520 IN IP4 host.example.com
2664        s=Offer/Answer
2665        c=IN IP4 192.0.2.124
2666        m=audio 49170 RTP/AVP 0
2667        a=rtpmap:0 PCMU/8000
2668        m=video 51372 RTP/AVPF 98
2669        a=rtpmap:98 H263-1998/90000
2670        a=rtcp-fb:98 ccm vbcm 1 2

2672     The answerer wishes to support sub-messages of type 1 only

2674        <---------------- Answer

2676        v=0
2677        o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
2678        s=Offer/Answer
2679        c=IN IP4 192.0.2.37
2680        m=audio 47190 RTP/AVP 0
2681        a=rtpmap:0 PCMU/8000
2682        m=video 53273 RTP/AVPF 98
2683        a=rtpmap:98 H263-1998/90000
2684        a=rtcp-fb:98 ccm vbcm 1

2686     So in the above example only VBCM indications comprised of
2687     "payloadType" 1 will be supported.

2689  8.
IANA Considerations

2691     The new value "ccm" needs to be registered with IANA in the "rtcp-fb"
2692     Attribute Values registry located at the time of publication at:
2693     http://www.iana.org/assignments/sdp-parameters

2695     Value name:  ccm
2696     Long Name:   Codec Control Commands and Indications
2697     Reference:   RFC XXXX

2699     A new registry "Codec Control Messages" needs to be created to hold
2700     "ccm" parameters located at time of publication at:
2701     http://www.iana.org/assignments/sdp-parameters

2703     New registrations in this registry follow the "Specification
2704     required" policy as defined by [RFC2434]. In addition, they are
2705     required to indicate any additional RTCP feedback types, such as
2706     "nack" and "ack".

2708     The initial content of the registry is the following values:

2710     Value name:  fir
2711     Long name:   Full Intra Request Command
2712     Usable with: ccm
2713     Reference:   RFC XXXX

2715     Value name:  tmmbr
2716     Long name:   Temporary Maximum Media Stream Bit Rate
2717     Usable with: ccm
2718     Reference:   RFC XXXX

2720     Value name:  tstr
2721     Long name:   Temporal-Spatial Trade-off
2722     Usable with: ccm
2723     Reference:   RFC XXXX

2725     Value name:  vbcm
2726     Long name:   H.271 video back channel messages
2727     Usable with: ccm
2728     Reference:   RFC XXXX

2730     The following values need to be registered as FMT values in the "FMT
2731     Values for RTPFB Payload Types" registry located at the time of
2732     publication at: http://www.iana.org/assignments/rtp-parameters
2733     RTPFB range
2734     Name           Long Name                          Value  Reference
2735     -------------- ---------------------------------  -----  ---------
2736     Reserved                                              2  [RFCxxxx]
2737     TMMBR          Temporary Maximum Media Stream Bit     3  [RFCxxxx]
2738                    Rate Request
2739     TMMBN          Temporary Maximum Media Stream Bit     4  [RFCxxxx]
2740                    Rate Notification

2742     The following values need to be registered as FMT values in the "FMT
2743     Values for PSFB Payload Types" registry located at the time of
2744     publication at: http://www.iana.org/assignments/rtp-parameters

2746
PSFB range
2747     Name           Long Name                                Value  Reference
2748     -------------- ---------------------------------------  -----  ---------
2749     FIR            Full Intra Request Command                   4  [RFCxxxx]
2750     TSTR           Temporal-Spatial Trade-off Request           5  [RFCxxxx]
2751     TSTN           Temporal-Spatial Trade-off Notification      6  [RFCxxxx]
2752     VBCM           Video Back Channel Message                   7  [RFCxxxx]

2754  9. Contributors

2756     Tom Taylor has made a very significant contribution, for which the
2757     authors are very grateful, to this specification by helping rewrite
2758     the specification. Especially the parts regarding the algorithm for
2759     determining bounding sets for TMMBR have benefited.

2761  10. Acknowledgements

2763     The authors would like to thank Andrea Basso, Orit Levin, and Nermeen
2764     Ismail for their work on the requirement and discussion draft
2765     [Basso].

2767     Drafts of this memo were reviewed and extensively commented on by Roni
2768     Even, Colin Perkins, Randell Jesup, Keith Lantz, Harikishan Desineni,
2769     Guido Franceschini and others. The authors appreciate these reviews.

2771     Funding for the RFC Editor function is currently provided by the
2772     Internet Society.

2774  11. References

2776  11.1. Normative references

2778     [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., Rey, J.,
2779               "Extended RTP Profile for Real-Time Transport Control
2780               Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
2781               July 2006.
2782     [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
2783               Requirement Levels", BCP 14, RFC 2119, March 1997.
2784     [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
2785               Jacobson, "RTP: A Transport Protocol for Real-Time
2786               Applications", STD 64, RFC 3550, July 2003.
2787     [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
2788               Description Protocol", RFC 4566, July 2006.
2789     [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
2790               with Session Description Protocol (SDP)", RFC 3264, June
2791               2002.
2792     [Topologies] M. Westerlund, and S.
Wenger, "RTP Topologies", draft-
2793               ietf-avt-topologies-04, work in progress, Feb 2007.
2794     [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an
2795               IANA Considerations Section in RFCs", BCP 26, RFC 2434,
2796               October 1998.
2797     [RFC4234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
2798               Specifications: ABNF", RFC 4234, October 2005.

2800  11.2. Informative references

2802     [Basso]   A. Basso, et. al., "Requirements for transport of video
2803               control commands", draft-basso-avt-videoconreq-02.txt,
2804               expired Internet Draft, October 2004.
2805     [AVC]     Joint Video Team of ITU-T and ISO/IEC JTC 1, Draft ITU-T
2806               Recommendation and Final Draft International Standard of
2807               Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC
2808               14496-10 AVC), Joint Video Team (JVT) of ISO/IEC MPEG
2809               and ITU-T VCEG, JVT-G050, March 2003.
2810     [H245]    ITU-T Rec. H.245, "Control protocol for multimedia
2811               communication", May 2006.
2812     [NEWPRED] S. Fukunaga, T. Nakai, and H. Inoue, "Error Resilient
2813               Video Coding by Dynamic Replacing of Reference
2814               Pictures," in Proc. Globecom'96, vol. 3, pp. 1503 - 1508,
2815               1996.
2816     [SRTP]    Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
2817               Norrman, "The Secure Real-time Transport Protocol
2818               (SRTP)", RFC 3711, March 2004.

2820     [RFC4587] Even, R., "RTP Payload Format for H.261 Video Streams",
2821               RFC 4587, August 2006.

2823     [SAVPF]   J. Ott, E. Carrara, "Extended Secure RTP Profile for
2824               RTCP-based Feedback (RTP/SAVPF)," draft-ietf-avt-
2825               profile-savpf-10.txt, February, 2007.
2826     [RFC3525] Groves, C., Pantaleo, M., Anderson, T., and T. Taylor,
2827               "Gateway Control Protocol Version 1", RFC 3525, June
2828               2003.
2829     [RFC3448] M. Handley, S. Floyd, J. Padhye, J. Widmer, "TCP Friendly
2830               Rate Control (TFRC): Protocol Specification", RFC 3448,
2831               Jan 2003.
2832     [VBCM]    ITU-T Rec.
H.271, "Video Back Channel Messages", June
2833               2006.
2834     [RFC3890] Westerlund, M., "A Transport Independent Bandwidth
2835               Modifier for the Session Description Protocol (SDP)",
2836               RFC 3890, September 2004.
2837     [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram
2838               Congestion Control Protocol (DCCP)", RFC 4340, March
2839               2006.
2840     [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
2841               A., Peterson, J., Sparks, R., Handley, M., and E.
2842               Schooler, "SIP: Session Initiation Protocol", RFC 3261,
2843               June 2002.
2844     [RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
2845               Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
2846               Parisis, "RTP Payload for Redundant Audio Data", RFC
2847               2198, September 1997.

2849  12. Authors' Addresses

2851     Stephan Wenger
2852     Nokia Corporation
2853     975, Page Mill Road,
2854     Palo Alto, CA 94304
2855     USA

2857     Phone: +1-650-862-7368
2858     EMail: stewe@stewe.org

2860     Umesh Chandra
2861     Nokia Research Center
2862     975, Page Mill Road,
2863     Palo Alto, CA 94304
2864     USA

2866     Phone: +1-650-796-7502
2867     EMail: Umesh.Chandra@nokia.com

2869     Magnus Westerlund
2870     Ericsson Research
2871     Ericsson AB
2872     SE-164 80 Stockholm, SWEDEN

2874     Phone: +46 8 7190000
2875     EMail: magnus.westerlund@ericsson.com

2877     Bo Burman
2878     Ericsson Research
2879     Ericsson AB
2880     SE-164 80 Stockholm, SWEDEN

2882     Phone: +46 8 7190000
2883     EMail: bo.burman@ericsson.com

2885  Full Copyright Statement

2887     Copyright (C) The IETF Trust (2007).

2889     This document is subject to the rights, licenses and restrictions
2890     contained in BCP 78, and except as set forth therein, the authors
2891     retain all their rights.
2893     This document and the information contained herein are provided on an
2894     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2895     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST
2896     AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
2897     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
2898     THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
2899     IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
2900     PURPOSE.

2902  Intellectual Property

2904     The IETF takes no position regarding the validity or scope of any
2905     Intellectual Property Rights or other rights that might be claimed to
2906     pertain to the implementation or use of the technology described in
2907     this document or the extent to which any license under such rights
2908     might or might not be available; nor does it represent that it has
2909     made any independent effort to identify any such rights. Information
2910     on the procedures with respect to rights in RFC documents can be
2911     found in BCP 78 and BCP 79.

2913     Copies of IPR disclosures made to the IETF Secretariat and any
2914     assurances of licenses to be made available, or the result of an
2915     attempt made to obtain a general license or permission for the use of
2916     such proprietary rights by implementers or users of this
2917     specification can be obtained from the IETF on-line IPR repository at
2918     http://www.ietf.org/ipr.

2920     The IETF invites any interested party to bring to its attention any
2921     copyrights, patents or patent applications, or other proprietary
2922     rights that may cover technology that may be required to implement
2923     this standard. Please address the information to the IETF at
2924     ietf-ipr@ietf.org.

2926  Acknowledgement

2928     Funding for the RFC Editor function is provided by the IETF
2929     Administrative Support Activity (IASA).
2931  RFC Editor Considerations

2933     The RFC editor is requested to replace all occurrences of XXXX with
2934     the RFC number this document receives.