Network Working Group                                     Stephan Wenger
INTERNET-DRAFT                                             Umesh Chandra
Expires: February 2007                                             Nokia
                                                       Magnus Westerlund
                                                               Bo Burman
                                                                Ericsson
                                                         August 28, 2006

                     Codec Control Messages in the
               Audio-Visual Profile with Feedback (AVPF)
                   <draft-ietf-avt-avpf-ccm-00.txt>

Status of this Memo

By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (C) The Internet Society (2006).

Abstract

This document specifies a few extensions to the messages defined in
the Audio-Visual Profile with Feedback (AVPF). They are helpful
primarily in conversational multimedia scenarios where centralized
multipoint functionalities are in use. However some are also usable
in smaller multicast environments and point-to-point calls. The
extensions discussed are Full Intra Request, Temporary Maximum Media
Bit-rate and Temporal Spatial Trade-off.

TABLE OF CONTENTS

1. Introduction
2. Definitions
   2.1. Glossary
   2.2. Terminology
   2.3. Topologies
3. Motivation (Informative)
   3.1. Use Cases
   3.2. Using the Media Path
   3.3. Using AVPF
      3.3.1. Reliability
   3.4.
Multicast
   3.5. Feedback Messages
      3.5.1. Full Intra Request Command
         3.5.1.1. Reliability
      3.5.2. Temporal Spatial Trade-off Request and Announcement
         3.5.2.1. Point-to-point
         3.5.2.2. Point-to-Multipoint using Multicast or Translators
         3.5.2.3. Point-to-Multipoint using RTP Mixer
         3.5.2.4. Reliability
      3.5.3. H.271 Video Back Channel Message conforming to ITU-T
             Rec. H.271
         3.5.3.1. Reliability
      3.5.4. Temporary Maximum Media Bit-rate Request
         3.5.4.1. MCU based Multi-point operation
         3.5.4.2. Point-to-Multipoint using Multicast or Translators
         3.5.4.3. Point-to-point operation
         3.5.4.4. Reliability
4. RTCP Receiver Report Extensions
   4.1. Design Principles of the Extension Mechanism
   4.2. Transport Layer Feedback Messages
      4.2.1. Temporary Maximum Media Bit-rate Request (TMMBR)
         4.2.1.1. Semantics
         4.2.1.2. Message Format
         4.2.1.3. Timing Rules
      4.2.2. Temporary Maximum Media Bit-rate Notification (TMMBN)
         4.2.2.1. Semantics
         4.2.2.2. Message Format
         4.2.2.3. Timing Rules
   4.3. Payload Specific Feedback Messages
      4.3.1. Full Intra Request (FIR) command
         4.3.1.1.
Semantics
         4.3.1.2. Message Format
         4.3.1.3. Timing Rules
         4.3.1.4. Remarks
      4.3.2. Temporal-Spatial Trade-off Request (TSTR)
         4.3.2.1. Semantics
         4.3.2.2. Message Format
         4.3.2.3. Timing Rules
         4.3.2.4. Remarks
      4.3.3. Temporal-Spatial Trade-off Announcement (TSTA)
         4.3.3.1. Semantics
         4.3.3.2. Message Format
         4.3.3.3. Timing Rules
         4.3.3.4. Remarks
      4.3.4. H.271 VideoBackChannelMessage (VBCM)
5. Congestion Control
6. Security Considerations
7. SDP Definitions
   7.1. Extension of rtcp-fb attribute
   7.2. Offer-Answer
   7.3. Examples
8. IANA Considerations
9. Acknowledgements
10. References
   10.1. Normative references
   10.2. Informative references
11. Authors' Addresses
12. List of Changes relative to previous drafts

1.
Introduction

When the Audio-Visual Profile with Feedback (AVPF) [RFC4548] was
developed, the main emphasis lay in the efficient support of
point-to-point and small multipoint scenarios without centralized
multipoint control. However, in practice, many small multipoint
conferences operate utilizing devices known as Multipoint Control
Units (MCUs). Long-standing experience of the conversational video
conferencing industry suggests that there is a need for a few
additional feedback messages to efficiently support MCU-based
multipoint conferencing. Some of the messages have applications
beyond centralized multipoint, and this is indicated in the
description of the message. This is especially true for the message
intended to carry ITU-T Rec. H.271 [H.271] bitstrings for video back
channel messages.

In RTP [RFC3550] terminology, MCUs comprise mixers and translators.
Most MCUs also include signalling support. During the development of
this memo, it was noticed that there is considerable confusion in the
community related to the use of terms such as "mixer", "translator",
and "MCU". In response to these concerns, a number of topologies
have been identified that are of practical relevance to the industry,
but were not envisioned (or at least not documented in sufficient
detail) in RTP. These topologies are documented in [Topologies], and
this memo frequently refers to sections in that document.

Some of the messages defined here are forward only, in that they do
not require an explicit acknowledgement. Other messages require
acknowledgement, leading to a two-way communication model that some
may regard as useful for control purposes. It is not the intention
of this memo to turn RTCP into a generalized control protocol.
All mentioned messages have relatively strict real-time constraints,
in the sense that their value diminishes with increased delay. This
makes the use of more traditional control protocol means, such as SIP
re-invites, undesirable. Furthermore, all messages are of a very
simple format that can be easily processed by an RTP/RTCP
sender/receiver. Finally, all messages refer only to the RTP stream
they relate to, and not to any other property of a communication
system.

The Full Intra Request (FIR) Command requires the receiver of the
message (and sender of the stream) to immediately insert a decoder
refresh point. In video coding, one commonly used form of a decoder
refresh point is an IDR or Intra picture. Other codecs may have
other forms of decoder refresh points. In order to fulfil congestion
control constraints, sending a decoder refresh point may imply a
significant drop in frame rate, as such points are commonly much
larger than regular predicted content. The use of this message is
restricted to cases where no other means of decoder refresh can be
employed, e.g. during the join-phase of a new participant in a
multipoint conference. Using the FIR command for error resilience
purposes is explicitly disallowed; AVPF's PLI message, which reports
lost pictures and was included in AVPF for precisely that purpose, is
to be used instead. The message does not require an acknowledgement,
as the presence of a decoder refresh point can be easily derived from
the media bit stream. Today, the FIR message appears to be useful
primarily with video streams, but in the future it may also become
helpful in conjunction with other media codecs that support
prediction across RTP packets.
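
As an illustration of why no acknowledgement is needed, the following
Python sketch shows how a FIR sender might recognize a decoder
refresh point in the incoming media. It is a minimal sketch under
stated assumptions, not part of this memo: it assumes H.264 video
carried in the RTP payload format of RFC 3984, and the function name
contains_idr is hypothetical. It looks only for an IDR slice and does
not check that the in-band parameter sets that also belong to a
decoder refresh point are present.

```python
IDR_SLICE = 5   # H.264 nal_unit_type: coded slice of an IDR picture
STAP_A = 24     # RFC 3984 single-time aggregation packet (type A)
FU_A = 28       # RFC 3984 fragmentation unit (type A)


def contains_idr(payload: bytes) -> bool:
    """Return True if an H.264 RTP payload carries an IDR slice.

    For fragmented NAL units only the first fragment counts, so one
    IDR slice is reported once. Hypothetical helper for illustration.
    """
    if not payload:
        return False
    nal_type = payload[0] & 0x1F          # low 5 bits of the NAL header
    if nal_type == IDR_SLICE:             # single NAL unit packet
        return True
    if nal_type == FU_A and len(payload) >= 2:
        fu_header = payload[1]
        is_start = bool(fu_header & 0x80)  # S bit: first fragment
        return is_start and (fu_header & 0x1F) == IDR_SLICE
    if nal_type == STAP_A:
        offset = 1                         # skip the STAP-A header byte
        # Each aggregated unit is a 2-byte size followed by the NAL unit.
        while offset + 2 < len(payload):
            size = int.from_bytes(payload[offset:offset + 2], "big")
            offset += 2
            if size == 0 or offset + size > len(payload):
                return False               # malformed aggregation packet
            if payload[offset] & 0x1F == IDR_SLICE:
                return True
            offset += size
    return False
```

A FIR sender could stop repeating its request once such a check
succeeds on the stream of the media sender it addressed.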

The Temporary Maximum Media Bit-rate Request (TMMBR) message allows a
media receiver to signal to the media sender the current maximum
supported media bit-rate for a given media stream. Once a bandwidth
limitation is established by the media sender, that sender notifies
the initiator of the request, and all other session participants, by
sending a TMMBN notification message. One usage scenario is limiting
media senders in multiparty conferencing to the slowest receiver's
maximum media bandwidth reception/handling capability. Such a use is
helpful, for example, because the receiver's situation may have
changed due to computational load, or because the receiver has just
joined the conference and it is helpful to inform media sender(s)
about its constraints, without waiting for congestion-induced
bandwidth reduction. Another application involves graceful bandwidth
adaptation in scenarios where the upper limit of the connection
bandwidth to a receiver changes, but is known in the interval between
these dynamic changes. The TMMBR message is useful for all media
types that are not inherently of constant bit rate.

The Video Back Channel Message (VBCM) allows bit streams conforming
to ITU-T Rec. H.271 [H.271] to be conveyed from a video receiver to
a video sender. This ITU-T Recommendation defines codepoints for a
number of video-specific feedback messages. Examples include
messages to signal
- the corruption of reference pictures or parts thereof,
- the corruption of decoder state information, e.g. parameter sets,
- the suggestion of using a reference picture other than the one
  typically used, e.g. to support the NEWPRED algorithm [NEWPRED].
The ITU-T plans to add codepoints to H.271 every time a need arises,
e.g. with the introduction of new video codecs or new tools into
existing video codecs.
There is some overlap between H.271 messages and "native" messages
specified in this memo and in AVPF. Examples include the PLI message
of [RFC4548] and the FIR message specified herein. As a general
rule, the "native" messages should be preferred over the sending of
VBCM messages when all senders and receivers implement this memo.
However, if gateways are in the picture, it may be more advisable to
utilize VBCM. Similarly, for feedback message types that exist in
H.271 but do not exist in this memo or AVPF, there is no other choice
but to use VBCM.
Video feedback channel messages according to H.271 do not require
acknowledgements on a protocol level, because the appropriate
reaction of the video encoder and sender can be derived from the
forward video bit stream.

Finally, the Temporal-Spatial Trade-off Request (TSTR) message
enables a video receiver to signal to the video sender its preference
for spatial quality or high temporal resolution (frame rate). The
receiver of the video stream typically generates this signal based on
input from its user interface, so as to react to explicit requests of
the user. However, some implicit use forms are also known. For
example, the trade-offs commonly used for live video and document
camera content are different. Obviously, this indication is relevant
only with respect to video transmission. The message is acknowledged
by an announcement message indicating the newly chosen trade-off, so
as to allow immediate user feedback.

2. Definitions

2.1.
Glossary

ASM   - Any Source Multicast
AVPF  - The Extended RTP Profile for RTCP-based Feedback
FEC   - Forward Error Correction
FIR   - Full Intra Request
MCU   - Multipoint Control Unit
MPEG  - Moving Picture Experts Group
PtM   - Point to Multipoint
PtP   - Point to Point
TMMBN - Temporary Maximum Media Bit-rate Notification
TMMBR - Temporary Maximum Media Bit-rate Request
PLI   - Picture Loss Indication
TSTA  - Temporal Spatial Trade-off Announcement
TSTR  - Temporal Spatial Trade-off Request
VBCM  - Video Back Channel Message indication.

2.2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].

Message:
   Codepoint defined by this specification, of one of the
   following types:

   Request:
      Message that requires Acknowledgement

   Acknowledgement:
      Message that answers a Request

   Command:
      Message that forces the receiver to an action

   Indication:
      Message that reports a situation

   Notification:
      See Indication.

   Note that, with the exception of "Notification", this terminology
   is in alignment with ITU-T Rec. H.245.

Decoder Refresh Point:
   A bit string, packetised in one or more RTP packets, which
   completely resets the decoder to a known state. Typical
   examples of Decoder Refresh Points are H.261 Intra pictures
   and H.264 IDR pictures. However, there are also much more
   complex decoder refresh points.

   Typical examples for "hard" decoder refresh points are Intra
   pictures in H.261, H.263, MPEG 1, MPEG 2, and MPEG-4 part 2,
   and IDR pictures in H.264. "Gradual" decoder refresh points
   may also be used; see for example [AVC].
   While both "hard" and "gradual" decoder refresh points are
   acceptable in the scope of this specification, in most cases the
   user experience will benefit from using a "hard" decoder refresh
   point.

   A decoder refresh point also contains all header information
   above the picture layer (or equivalent, depending on the
   video compression standard) that is conveyed in-band. In
   H.264, for example, a decoder refresh point contains
   parameter set NAL units that generate parameter sets
   necessary for the decoding of the following slice/data
   partition NAL units (and that are not conveyed out of band).
   To the best of the authors' knowledge, the term "Decoder
   Refresh Point" has been formally defined only in H.264; hence
   we are referring here to this video compression standard.

Decoding:
   The operation of reconstructing the media stream.

Rendering:
   The operation of presenting (parts of) the reconstructed
   media stream to the user.

Stream thinning:
   The operation of removing some of the packets from a media
   stream. Stream thinning is preferably performed in a
   media-aware fashion, implying that media packets are removed in
   the order of their relevance to the reproductive quality.
   However, even when employing media-aware stream thinning, most
   media streams quickly lose quality when subject to increasing
   levels of thinning. Media-unaware stream thinning leads to
   even worse quality degradation.

2.3. Topologies

Please refer to [Topologies] for an in-depth discussion.

3. Motivation (Informative)

This section discusses the motivation and usage of the different
video and media control messages. The video control messages have
been under discussion for a long time, and a requirement draft was
drawn up [Basso]. This draft has expired; however, we quote
relevant sections of it to provide motivation and requirements.

3.1.
Use Cases

There are a number of possible usages for the proposed feedback
messages. We begin by looking through the use cases Basso et al.
[Basso] proposed. Some of the use cases have been reformulated and
commented on:

1. An RTP video mixer composes multiple encoded video sources into a
   single encoded video stream. Each time a video source is added,
   the RTP mixer needs to request a decoder refresh point from the
   video source, so as to start an uncorrupted prediction chain on
   the spatial area of the mixed picture occupied by the data from
   the new video source.

2. An RTP video mixer receives multiple encoded RTP video streams
   from conference participants, and dynamically selects one of the
   streams to be included in its output RTP stream. At the time of a
   bit stream change (determined through means such as voice
   activation or the user interface), the mixer requests a decoder
   refresh point from the remote source, in order to avoid using
   unrelated content as reference data for inter picture prediction.
   After requesting the decoder refresh point, the video mixer stops
   the delivery of the current RTP stream and monitors the RTP
   stream from the new source until it detects data belonging to the
   decoder refresh point. At that time, the RTP mixer starts
   forwarding the newly selected stream to the receiver(s).

3. An application needs to signal to the remote encoder a request to
   change the desired trade-off in temporal/spatial resolution.
   For example, one user may prefer a higher frame rate and a lower
   spatial quality, and another user may prefer the opposite. This
   choice is also highly content dependent. Many current video
   conferencing systems offer in the user interface a mechanism to
   make this selection, usually in the form of a slider. The
   mechanism is helpful in point-to-point, centralized multipoint and
   non-centralized multipoint uses.

4. Use case 4 of the Basso draft applies only to AVPF's PLI and is
   not reproduced here.

5. Use case 5 of the Basso draft relates to a mechanism known as
   "freeze picture request". Sending freeze picture requests over a
   non-reliable forward RTCP channel has been identified as
   problematic. Therefore, no freeze picture request has been
   included in this memo, and the use case discussion is not
   reproduced here.

6. A video mixer dynamically selects one of the received video
   streams to be sent out to participants and tries to provide the
   highest bit rate possible to all participants, while minimizing
   stream transrating. One way of achieving this is to set up
   sessions with endpoints using the maximum bit rate accepted by
   that endpoint, and by the call admission method used by the mixer.
   By means of commands that allow reducing the maximum media bitrate
   below what has been negotiated during session setup, the mixer
   can then reduce the maximum bit rate sent by endpoints to the
   lowest common denominator of all received streams. As the lowest
   common denominator changes due to endpoints joining, leaving, or
   network congestion, the mixer can adjust the limits to which
   endpoints can send their streams to match the new limit. The mixer
   then would request a new maximum bit rate, which is equal to or
   less than the maximum bit-rate negotiated at session setup, for a
   specific media stream, and the remote endpoint can respond with
   the actual bit-rate that it can support.

The picture Basso et al. draw up covers most applications we
foresee. However, we would like to extend the list with two
additional use cases:

7. The congestion control algorithms in use (AIMD and TFRC) probe
   for more bandwidth as long as there is something to send.
   With congestion control using packet loss as the indication of
   congestion, this probing generally results in reduced media
   quality (often to a point where the distortion is large enough to
   make the media unusable), due to packet loss and increased delay.
   In a number of deployment scenarios, especially cellular ones, the
   bottleneck link is often the last hop link. That cellular link
   also commonly has some type of QoS negotiation enabling the
   cellular device to learn the maximal bit-rate available over this
   last hop. Thus, indicating the maximum available bit-rate to the
   transmitting party can be beneficial to prevent it from even
   trying to exceed the known hard limit. For cellular or other
   mobile devices, the known available bit-rate can also change
   quickly due to handover to another transmission technology, QoS
   renegotiation due to congestion, etc. To minimize disruption of
   service and to allow quick convergence, especially in cases of
   reduced bandwidth, a media path signalling method is desired.

8. The use of reference picture selection as an error resilience tool
   was introduced in 1997 as NEWPRED [NEWPRED], and is now widely
   deployed. It operates by the receiver sending a feedback message
   to the sender, indicating a reference picture that should be used
   for future prediction. AVPF contains a mechanism for conveying
   such a message, but does not specify to which codec and to which
   syntax the message conforms. Recently, the ITU-T finalized Rec.
   H.271 which (among other message types) also includes a feedback
   message. It is expected that this feedback message will fairly
   quickly enjoy wide support. Therefore, a mechanism to convey
   feedback messages according to H.271 appears to be desirable.

3.2.
Using the Media Path
There are multiple reasons why we propose to use the media path for
the codec control messages. First, systems employing MCUs often
separate the control and media processing parts. As these messages
are intended for or generated by the media part rather than the
signalling part of the MCU, having them on the media path avoids
interfaces and unnecessary control traffic between signalling and
processing. If the MCU is physically decomposed, the use of the
media path avoids the need for media control protocol extensions
(e.g. in MEGACO [RFC3525]).

Secondly, the signalling path quite commonly contains several
signalling entities, e.g. SIP-proxies and application servers.
Avoiding signalling entities avoids delay for several reasons.
Proxies have less stringent delay requirements than media processing
and, due to their complex and more generic nature, may introduce
significant processing delay. The topological locations of the
signalling entities are also commonly not optimized for minimal
delay, but rather for other architectural goals. Thus the signalling
path can be significantly longer in both the geographical and the
delay sense.

3.3. Using AVPF

The AVPF feedback message framework provides a simple way of
implementing the new messages. Furthermore, AVPF implements rules
controlling the timing of feedback messages so as to avoid congestion
through network flooding. We re-use these rules by referencing
AVPF.

The signalling setup for AVPF allows each individual type of function
to be configured or negotiated on a per-RTP-session basis.

3.3.1. Reliability

The use of RTCP messages implies that each message transfer is
unreliable, unless the lower layer transport provides reliability.
The different messages proposed in this specification have different
requirements in terms of reliability.
However, in all cases, the reaction to an (occasional) loss of a
feedback message is specified.

3.4. Multicast

The media related requests might be used with multicast. The RTCP
timing rules specified in [RFC3550] and [RFC4548] ensure that the
messages do not cause overload of the RTCP connection. The use of
multicast may result in the reception of messages with inconsistent
semantics. The reaction to inconsistencies depends on the message
type, and is discussed for each message type separately.

3.5. Feedback Messages

This section describes the semantics of the different feedback
messages and how they apply to the different use cases.

3.5.1. Full Intra Request Command

A Full Intra Request (FIR) command, when received by the designated
media sender, requires that the media sender send a "decoder refresh
point" (see Section 2.2) at the earliest opportunity. The evaluation
of such an opportunity includes the current encoder coding strategy
and the currently available network resources.

FIR is also known as an "instantaneous decoder refresh request" or
"video fast update request".

Using a decoder refresh point implies refraining from using any
picture sent prior to that point as a reference for the encoding
process of any subsequent picture sent in the stream. For predictive
media types that are not video, the analogue applies. For example,
if scene updates are used in MPEG-4 Systems, the decoder refresh
point consists of the full representation of the scene and is not
delta-coded relative to previous updates.

Decoder refresh points, especially Intra or IDR pictures, are in
general several times larger in size than predicted pictures. Thus,
in scenarios in which the available bandwidth is small, the use of a
decoder refresh point implies a delay that is significantly longer
than the typical picture duration.
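
A rough worked example of this effect is sketched below in Python.
It assumes a constant bit rate channel and purely illustrative
numbers that do not come from this memo (64 kbit/s, 10 pictures per
second, a decoder refresh point five times the size of an average
picture); the function names are hypothetical.

```python
def refresh_point_delay(refresh_bits: int, bitrate_bps: float) -> float:
    """Seconds needed to transmit a decoder refresh point of this size."""
    if bitrate_bps <= 0:
        raise ValueError("bit rate must be positive")
    return refresh_bits / bitrate_bps


def picture_durations(refresh_bits: int, bitrate_bps: float,
                      frame_rate_hz: float) -> float:
    """The same delay expressed in multiples of the picture duration."""
    return refresh_point_delay(refresh_bits, bitrate_bps) * frame_rate_hz


# Illustrative assumptions only, not values taken from this memo.
BITRATE_BPS = 64_000
FRAME_RATE_HZ = 10
AVERAGE_PICTURE_BITS = BITRATE_BPS // FRAME_RATE_HZ  # bit budget per picture
REFRESH_BITS = 5 * AVERAGE_PICTURE_BITS              # refresh point, 5x larger

delay_s = refresh_point_delay(REFRESH_BITS, BITRATE_BPS)
delay_pictures = picture_durations(REFRESH_BITS, BITRATE_BPS, FRAME_RATE_HZ)
```

Under these assumptions the refresh point occupies the channel for
half a second, i.e. five picture durations, during which no other
picture can be delivered without exceeding the bit rate.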

Usage in multicast is possible; however, aggregation of the commands
is recommended. A media sender that receives a request shortly
(within 2 times the longest known round-trip time (RTT)) after
sending a decoder refresh point should await a second request message
to ensure that the media receiver has not been served by the
previously delivered decoder refresh point. The reason for delaying
2 times the longest known RTT is to avoid sending unnecessary decoder
refresh points. A session participant may have sent its own request
while another participant's request was in flight to it. Suppressing
those requests that may have been sent without knowledge of the other
request avoids this issue.

Full Intra Request is applicable in use cases 1, 2, and 5.

3.5.1.1. Reliability

The FIR message results in the delivery of a decoder refresh point,
unless the message is lost. Decoder refresh points are easily
identifiable from the bit stream. Therefore, there is no need for
protocol-level acknowledgement, and a simple command repetition
mechanism is sufficient for ensuring the level of reliability
required. However, the potential use of repetition does require a
mechanism to prevent the recipient from responding to messages
already received and responded to.

To ensure the best possible reliability, a sender of FIR may repeat
the FIR request until a response has been received. The repetition
interval is determined by the RTCP timing rules the session operates
under. Upon reception of a complete decoder refresh point or the
detection of an attempt to send a decoder refresh point (which was
damaged by packet loss), the repetition of the FIR must stop. If
another FIR is necessary, the request sequence number must be
increased.
To combat loss of the decoder refresh points sent, a media sender
that receives repetitions of the FIR 2*RTT after the transmission of
the decoder refresh point shall send a new decoder refresh point. Two
round trip times allow time for the request to arrive at the media
sender and the decoder refresh point to arrive back at the requestor.
A FIR sender shall not have more than one FIR request (different
request sequence number) outstanding at any time per media sender in
the session.

An RTP Mixer that receives an FIR from a media receiver is
responsible for ensuring that a decoder refresh point is delivered to
the requesting receiver. It may be necessary for the MCU to generate
FIR commands. The two legs (FIR-requesting endpoint to MCU, and MCU
to decoder refresh point generating endpoint) are handled
independently from each other from a reliability perspective.

3.5.2. Temporal Spatial Trade-off Request and Announcement

The Temporal Spatial Trade-off Request (TSTR) instructs the video
encoder to change its trade-off between temporal and spatial
resolution. Index values from 0 to 31 indicate monotonically a
desire for higher frame rate. In general, the encoder reaction time
may be significantly longer than the typical picture duration. See
use case 3 for an example. The encoder decides whether the request
results in a change of the trade-off. An acknowledgement process has
been defined to provide feedback on the trade-off that is used
henceforth.

   Informative note: TSTR and TSTA have been introduced primarily
   because it is believed that control protocol mechanisms, e.g. a SIP
   re-invite, are too heavyweight and too slow to allow for a
   reasonable user experience. Consider, for example, a user
   interface where the remote user selects the temporal/spatial
   trade-off with a slider (as is common in state-of-the-art video
   conferencing systems).
   Immediate feedback to any slider movement is required for a
   reasonable user experience. A SIP re-invite would require at least
   2 more round trips (compared to the TSTR/TSTA mechanism) and may
   involve proxies and other complex mechanisms. Even in a
   well-designed system, it may take a second or so until the new
   trade-off is finally selected. Furthermore, the use of RTCP solves
   the multicast use case very efficiently.

The use of TSTR and TSTA in multipoint scenarios is a non-trivial
subject, and can be solved in many implementation-specific ways.
Problems stem from the fact that TSTRs will typically arrive
unsynchronized, and may request different trade-off values for the
same stream and/or endpoint encoder. This memo does not specify an
MCU's or endpoint's reaction to the reception of a suggested
trade-off as conveyed in the TSTR -- we only require the receiver of
a TSTR message to reply to it by sending a TSTA, carrying the new
trade-off chosen by its own criteria (which may or may not be based
on the trade-off conveyed by TSTR). In other words, the trade-off
sent in TSTR is a non-binding recommendation; nothing more.

With respect to TSTR/TSTA, four scenarios based on the topologies
described in [Topologies] need to be distinguished. The scenarios are
described in the following sub-clauses.

3.5.2.1. Point-to-point

In this most trivial case, the media sender typically adjusts its
temporal/spatial trade-off based on the requested value in TSTR, and
within its capabilities. The TSTA message conveys back the new
trade-off value (which may be identical to the old one if, for
example, the sender is not capable of adjusting its trade-off).

3.5.2.2.
Point-to-Multipoint using Multicast or Translators

RTCP Multicast is used either with media multicast according to
Section 2.3.2 of [Topologies], or following RFC 3550's translator
model according to Section 2.3.3 of [Topologies]. In these cases,
TSTR messages from different receivers may be received
unsynchronized, and possibly with different requested trade-offs
(because of different user preferences). This memo does not specify
how the media sender tunes its trade-off. Possible strategies
include selecting the mean, or median, of all trade-off requests
received, prioritizing certain participants, or continuing to use the
previously selected trade-off (e.g. when the sender is not capable of
adjusting it). Again, all TSTR messages need to be acknowledged by
TSTA, and the value conveyed back has to reflect the decision made.

3.5.2.3. Point-to-Multipoint using RTP Mixer

In this scenario the RTP Mixer receives all TSTR messages, and has
the opportunity to act on them based on its own criteria. In most
cases, the MCU should form a "consensus" of potentially conflicting
TSTR messages arriving from different participants, and initiate its
own TSTR message(s) to the media sender(s). The strategy for forming
this "consensus" is left to the implementation, and can, for
example, encompass averaging the participants' request values,
prioritizing certain participants, or using session default values.
If the Mixer changes its trade-off, it needs to request from the
media sender(s) the use of the new value, by creating a TSTR of its
own. Upon reaching a decision on the trade-off to be used, it
includes that value in the acknowledgement.

Even if a Mixer or Translator performs transcoding, it is very
difficult to deliver media with the requested trade-off, unless the
content the MCU receives is already close to that trade-off.
Only in cases where the original source has substantially higher
quality (and bit-rate) is it likely that transcoding can result in
the requested trade-off.

3.5.2.4. Reliability

A request and reception acknowledgement mechanism is specified. The
Temporal Spatial Trade-off Announcement (TSTA) message informs the
request-sender that its request has been received, and what trade-off
is used henceforth. This acknowledgement mechanism is desirable for
at least the following reasons:

o A change in the trade-off cannot be directly identified from the
  media bit stream,
o User feedback cannot be implemented without information about the
  chosen trade-off value, according to the media sender's
  constraints,
o Repetitive sending of messages requesting an unimplementable
  trade-off can be avoided.

3.5.3. H.271 Video Back Channel Message

ITU-T Rec. H.271 defines syntax, semantics, and suggested encoder
reaction to a video back channel message. The codepoint defined in
this memo is used to convey such a message from media receiver to
media sender.

We refrain from an in-depth discussion of the available codepoints
within H.271 in this memo for a number of reasons. Perhaps the most
important reason is that we expect backward-compatible additions of
codepoints to H.271 outside the update/maturity cycle of this memo.
The situation is similar to RTP payload format specs - the data
carried within the spec is normally not described in any significant
detail.

However, we note that some H.271 messages bear similarities with
"native" messages of AVPF and this memo, which are known to require
caution in multicast environments. One example is the reference
picture feedback message, which appears to be sensitive to
contradictory information.
While it would perhaps be possible to specify an algorithm to resolve
possible contradictions, this would require an amount of awareness of
the details of H.271 and the video codec employed which we would like
to avoid in this memo. Therefore, we err on the side of caution and
discourage the use of VBCM in topologies other than point-to-point
(section 2.3.1 of [Topologies]) and point-to-multipoint utilizing a
mixer (section 2.3.4 of [Topologies]). In the former case,
obviously, no inconsistency problem exists. In the latter case, it
is the mixer's responsibility to resolve the inconsistencies, and the
mixer is media aware and can do so.

3.5.3.1. Reliability

H.271 video back channel messages do not require reliable
transmission, and the reception of a message can be derived from the
forward video bit stream. Therefore, no specific reception
acknowledgement is specified.

With respect to re-sending rules, clause 3.5.1.1 applies.

3.5.4. Temporary Maximum Media Bit-rate Request

A receiver, translator or mixer uses the Temporary Maximum Media
Bit-rate Request (TMMBR, "timber") to request a sender to limit the
maximum bit-rate for a media stream to, or below, the provided value.
The primary usage for this is a scenario with an MCU (use case 6),
corresponding to the topologies in 2.3.3 of [Topologies] (translator)
and 2.3.4 of [Topologies] (mixer), but also 2.3.1 of [Topologies]
(point-to-point).

The temporary maximum media bit-rate messages are generic messages
that can be applied to any media.

The reasoning below assumes that the participants have negotiated a
session maximum bit-rate, using the signalling protocol. This value
can be global, for example in case of point-to-point, multicast, or
translators. It may also be local between the participant and the
peer or mixer.
In both cases, the bit-rate negotiated in signalling is the one that
the participant guarantees to be able to handle (encode and decode).
In practice, the connectivity of the participant also influences the
negotiated value -- it does not necessarily make much sense to
negotiate a media bit-rate that one's network interface does not
support.

An already established temporary bit-rate value may be changed at any
time (subject to the timing rules of the feedback message sending),
and to any value between zero and the session maximum, as negotiated
during signalling. Even if a sender has received a TMMBR message
increasing the bit-rate, all increases must be governed by a
congestion control algorithm. TMMBR only indicates known limitations,
usually in the local environment, and does not provide any
guarantees.

If it is likely that the new bit-rate indicated by TMMBR will be
valid for the remainder of the session, the TMMBR sender can perform
a renegotiation of the session upper limit using the session
signalling protocol.

3.5.4.1. MCU based Multi-point operation

Assume a small multiparty conference is ongoing, as depicted in
Section 2.3.4 of [Topologies]. All participants (A-D) have negotiated
a common maximum bit-rate that this session can use. The conference
operates over a number of unicast links between the participants and
the MCU. The congestion situation on each of these links can easily
be monitored by the participant in question and by the MCU,
utilizing, for example, RTCP Receiver Reports. However, any given
participant has no knowledge of the congestion situation of the
connections to the other participants.
Worse, without mechanisms similar to the ones discussed in this
draft, the MCU (which is aware of the congestion situation on all
connections it manages) has no standardized means to inform
participants to slow down, short of forging its own receiver reports
(which is undesirable). In principle, an MCU confronted with such a
situation is obliged to thin or transcode streams intended for
connections that detected congestion.

In practice, stream thinning - if performed media aware - is
unfortunately a very difficult and cumbersome operation and adds
undesirable delay. If done media unaware, it leads very quickly to
unacceptable reproduced media quality. Hence, means to slow down
senders even in the absence of congestion on their connections to the
MCU are desirable.

To allow the MCU to perform congestion control on the individual
links, without performing transcoding, there is a need for a
mechanism that enables the MCU to request the participants' media
encoders to limit the maximum media bit-rate they currently use. The
MCU handles the detection of a congestion state between itself and a
participant as follows:

1. Start thinning the media traffic to the supported bit-rate.
2. Use the TMMBR to request the media sender(s) to reduce the media
   bit-rate sent by them to the MCU, to a value that is in compliance
   with congestion control principles for the slowest link. Slow
   refers here to the available bandwidth and packet rate after
   congestion control.
3. As soon as the bit-rate has been reduced by the sending party, the
   MCU stops stream thinning implicitly, because there is no need for
   it any more as the stream is in compliance with congestion
   control.

The algorithm above may suggest to some that there is no need for the
TMMBR - it should be sufficient to solely rely on stream thinning.
As much as this is desirable from a network protocol designer's
viewpoint, it has the disadvantage that it doesn't work very
well - the reproduced media quality quickly becomes unusable.

It appears to be a reasonable compromise to rely on stream thinning
as an immediate reaction tool to combat congestion, and have a quick
control mechanism that instructs the original sender to reduce its
bit-rate.

Note also that the standard RTCP receiver report cannot serve the
purpose mentioned. In an environment with RTP Mixers, the RTCP RR is
sent between the RTP receiver in the endpoint and the RTP sender in
the Mixer only - as there is no multicast transmission. The stream
that needs to be bandwidth-reduced, however, is the one between the
original sending endpoint and the Mixer. This endpoint doesn't see
the aforementioned RTCP RRs, and hence needs to be explicitly
informed about desired bandwidth adjustments.

In this topology it is the Mixer's responsibility to collect, and
consider jointly, the different bit-rates which the different links
may support, and to combine them into the bit-rate requested. This
aggregation may also take into account that the Mixer may contain
certain transcoding capabilities (as discussed in section 2.3.4 of
[Topologies]), which can be employed for those few of the session
participants that have the lowest available bit-rates.

3.5.4.2. Point-to-Multipoint using Multicast or Translators

In this topology, RTCP RRs are transmitted globally, which allows for
the detection of transmission problems such as congestion, on a
medium timescale. As all media senders are aware of the congestion
situation of all media receivers, the rationale for the use of TMMBR
given in section 3.5.4.1 does not apply.
However, even in this case the congestion control response can be
improved when the unicast links employ congestion controlled
transport protocols (such as TCP or DCCP). A peer may also report a
local limitation to the media sender.

3.5.4.3. Point-to-point operation

In use case 7 it is possible to use TMMBR to improve the performance
at times of changes in the known upper limit of the bit-rate. In
this use case the signalling protocol has established an upper limit
for the session and media bit-rates. However, at the time of a
transport link bit-rate reduction, a receiver could avoid serious
congestion by sending a TMMBR to the sending side.

3.5.4.4. Reliability

The reaction of a media sender to the reception of a TMMBR message is
not immediately identifiable through inspection of the media stream.
Therefore a more explicit mechanism is needed to avoid unnecessary
re-sending of TMMBR messages. Using a statistically based
retransmission scheme would only provide statistical guarantees of
the request being received. It would also not avoid the
retransmission of already received messages. In addition, it would
not allow for easy suppression of other participants' requests. For
the reasons mentioned, a mechanism based on explicit notification is
used.

Upon the reception of a request, a media sender sends a notification
containing the currently applicable limitation of the bit-rate, and
which session participant owns that limit. That allows all other
participants to suppress any request they may have with a limitation
value equal to or higher than the current one. The identity of the
owner allows for small message sizes and media sender states. A media
sender only keeps state for the SSRC of the current owner of the
limitation; all other requests and their sources are not saved.
Only the participant with the lowest value is allowed to remove or
change its limitation. Otherwise anyone that ever set a limitation
would need to remove it to allow the maximum bit-rate to be raised
beyond that value.

4. RTCP Receiver Report Extensions

This memo specifies six new feedback messages. The Full Intra Request
(FIR), Temporal-Spatial Trade-off Request (TSTR), Temporal-Spatial
Trade-off Announcement (TSTA), and Video Back Channel Message (VBCM)
are "Payload Specific Feedback Messages" in the sense of section 6.3
of AVPF [RFC4548]. The Temporary Maximum Media Bit-rate Request
(TMMBR) and Temporary Maximum Media Bit-rate Notification (TMMBN) are
"Transport Layer Feedback Messages" in the sense of section 6.2 of
AVPF.

In the following subsections, the new feedback messages are defined,
following a similar structure as in the AVPF specification's sections
6.2 and 6.3, respectively.

4.1. Design Principles of the Extension Mechanism

RTCP was originally introduced as a channel to convey presence,
reception quality statistics and hints on the desired media coding.
A limited set of media control mechanisms has been introduced in
early RTP payload formats for video formats, for example in RFC 2032
[RFC2032]. However, this specification, for the first time, suggests
a two-way handshake for one of its messages. There is a danger that
this introduction could be misunderstood as a precedent for the
use of RTCP as an RTP session control protocol. In order to prevent
these misunderstandings, this subsection attempts to clarify the
scope of the extensions specified in this memo, and strongly suggests
that future extensions follow the rationale spelled out here, or
compellingly explain why they deviate from the rationale.

In this memo, and in AVPF [RFC4548], only such messages have been
included which

a) have comparatively strict real-time constraints, which prevent the
   use of mechanisms such as a SIP re-invite in most application
   scenarios. The real-time constraints are explained separately for
   each message where necessary
b) are multicast-safe in that the reaction to potentially
   contradicting feedback messages is specified, as necessary for
   each message
c) are directly related to activities of a certain media codec, class
   of media codecs (e.g. video codecs), or the given media stream.

In this memo, a two-way handshake is only introduced for such
messages

a) that require a notification or acknowledgement due to their
   nature, which is motivated separately for each message
b) for which the notification or acknowledgement cannot be easily
   derived from the media bit stream.

All messages in AVPF [RFC4548] and in this memo follow a number of
common design principles. In particular:

a) Media receivers do not always implement higher control protocol
   functionalities (SDP, XML parsers and such) in their media path.
   Therefore, simple binary representations are used in the feedback
   messages and not an (otherwise desirable) flexible format such as,
   for example, XML.

4.2. Transport Layer Feedback Messages

Transport Layer FB messages are identified by the value RTPFB (205)
as RTCP packet type.

In AVPF, one message of this category has been defined. This memo
specifies two more messages for a total of three messages of this
type.
They are identified by means of the FMT parameter as follows:

0:    unassigned
1:    Generic NACK (as per AVPF)
2:    Temporary Maximum Media Bit-rate Request (TMMBR)
3:    Temporary Maximum Media Bit-rate Notification (TMMBN)
4-30: unassigned
31:   reserved for future expansion of the identifier number space

The following subsections define the formats of the FCI field for
this type of FB message.

4.2.1. Temporary Maximum Media Bit-rate Request (TMMBR)

The FCI field of a TMMBR Feedback message SHALL contain one or more
FCI entries.

4.2.1.1. Semantics

The TMMBR is used to indicate the highest bit-rate per sender of a
media stream which the receiver currently supports in this RTP
session. The media sender MAY use any lower bit-rate, as it may need
to address a congestion situation or other limiting factors. See
section 5 (congestion control) for more discussion.

The "SSRC of the packet sender" field indicates the source of the
request, and the "SSRC of media source" is not used and SHALL be set
to 0. The SSRC of media sender in the FCI field denotes the media
sender the message applies to. This is useful in the multicast or
translator topologies where each media sender may be addressed in a
single TMMBR message using multiple FCIs.

A TMMBR FCI MAY be repeated in subsequent TMMBR messages if no
applicable TMMBN FCI has been received at the time of transmission of
the next RTCP packet. The bit-rate value of a TMMBR FCI MAY be
changed between a previous TMMBR message and the next, regardless of
the eventual reception of an applicable TMMBN FCI.

Please note that a TMMBN message is sent by the media sender at the
earliest possible point in time, as a result of any TMMBR messages
received since the last sending of TMMBN. The TMMBN message
indicates the limit and the owner of that limit at the time of the
transmission of the message.
The limit is the lowest of all values received since the last TMMBN
was transmitted.

A media receiver that is not the owner of the bandwidth limit when
sending a TMMBR MUST request a bandwidth lower than its knowledge of
the currently established bandwidth limit for this media sender.
Therefore, all received requests for bandwidth limits greater than or
equal to the one currently established are ignored. A media receiver
that is the owner of the current bandwidth limit MAY lower the value
further, raise the value, or remove the restriction completely by
setting the bandwidth limit equal to the session limit.

Once a session participant receives the TMMBN in response to its
TMMBR, with its own SSRC, it knows that it "owns" the bandwidth
limitation. Only the "owner" of a bandwidth limitation can raise it
or reset it to the session limit.

Note that, due to the unreliable nature of transport of TMMBR and
TMMBN, TMMBR messages may be sent that disobey the rules above.
Furthermore, in multicast scenarios it can happen that more than one
session participant believes it "owns" the current bandwidth
limitation. This is not critical for a number of reasons:

a) If a TMMBR message is lost in transmission, the media sender does
not learn about the restrictions imposed on it. However, it also
does not send a TMMBN message notifying reception of a request it has
never received. Therefore, no new limit is established, and the
media receiver sending the more restrictive TMMBR is not the owner.
Since this media receiver has not seen a notification corresponding
to its request, it is free to re-send it.

b) Similarly, if a TMMBN message gets lost, the media receiver that
has sent the corresponding TMMBR request does not receive
acknowledgement. In that case, it is also not the "owner" of the
restriction and is free to re-send the request.

c) If multiple competing TMMBR messages are sent by different session
participants, then the resulting TMMBN indicates the lowest bandwidth
requested; the owner is set to the sender of the TMMBR with the
lowest requested bandwidth value.

TMMBR feedback SHOULD NOT be used if the underlying transport
protocol is capable of providing similar feedback information from
the receiver to the sender.

It is also important to consider the security risks involved with
faked TMMBRs. See security considerations in Section 6.

The feedback messages may be used in both multicast and unicast
sessions of any of the specified topologies.

For sessions with a larger number of participants, using the lowest
common denominator, as required by this mechanism, may not be the
most suitable course of action. Larger sessions may need to consider
other ways to support adapted bit-rates for participants, such as
partitioning the session into different quality tiers, or using some
other method of achieving bit-rate scalability.

If the value set by a TMMBR message is expected to be permanent, the
TMMBR setting party is RECOMMENDED to renegotiate the session
parameters to reflect that using the setup signalling.

4.2.1.2. Message Format

The Feedback control information (FCI) consists of one or more TMMBR
FCI entries with the following syntax:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             SSRC                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Maximum bit-rate in units of 128 bits/s            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1 - Syntax for the TMMBR message

SSRC: The SSRC value of the target of this specific maximum
bit-rate request.

Maximum bit-rate: The temporary maximum media bit-rate value in
units of 128 bit/s. With a 32-bit field, this provides a range from
0 to 549755813760 bits/s (~550 Gbit/s) with a granularity of 128
bits/s.

The length of the FB message is to be set to 2+2*N where N is the
number of TMMBR FCI entries.

4.2.1.3. Timing Rules

The first transmission of the request message MAY use early or
immediate feedback in cases when timeliness is desirable. Any
repetition of a request message SHOULD use regular RTCP mode for its
transmission timing.

4.2.2. Temporary Maximum Media Bit-rate Notification (TMMBN)

The FCI field of the TMMBN Feedback message SHALL contain one TMMBN
FCI entry.

4.2.2.1. Semantics

This feedback message is used to notify the senders of any TMMBR
message that one or more TMMBR messages have been received. It
indicates to all participants the currently employed maximum bit-rate
value and the "owner" of the current limitation. The "owner" of a
limitation is the sender of the last (most restrictive) TMMBR message
received by the media sender.

The "SSRC of the packet sender" field indicates the source of the
notification. The "SSRC of media source" SHALL be set to the SSRC of
the media receiver that currently owns the bit-rate limitation.

A TMMBN message SHALL be scheduled for transmission after the
reception of a TMMBR message with an FCI including the session
participant's SSRC. Only a single TMMBN SHALL be sent, even if more
than one TMMBR message is received between the scheduling of the
transmission and the actual transmission of the TMMBN message. The
TMMBN message indicates the limit and the owner of that limit at the
time of transmitting the message. The limit SHALL be the lowest of
all values received since the last TMMBN was transmitted. The
participant that sent that request SHALL become the owner of the
limit.
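
The limit and owner bookkeeping described above can be sketched as
follows. This is a minimal illustration under stated assumptions,
not a normative algorithm: the class and method names are
hypothetical, values are carried in the units of 128 bit/s used on
the wire, requests above the session maximum are clamped, and
clearing the owner when the limit returns to the session maximum is a
choice made for this sketch only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TmmbrState:
    """Media-sender side state for one stream (illustrative)."""

    session_max: int                 # session limit, in units of 128 bit/s
    limit: int = field(init=False)   # currently established limit
    owner: Optional[int] = None      # SSRC of the current owner, if any
    pending: List[Tuple[int, int]] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.limit = self.session_max

    def on_tmmbr(self, ssrc: int, value: int) -> None:
        """Queue a received request; one TMMBN answers all queued requests."""
        self.pending.append((ssrc, value))

    def make_tmmbn(self) -> Tuple[int, Optional[int]]:
        """Apply queued requests and return (limit, owner) for the TMMBN."""
        for ssrc, value in self.pending:
            value = min(value, self.session_max)
            if ssrc == self.owner:
                # The owner may lower, raise, or remove its limitation.
                self.limit = value
                if value >= self.session_max:
                    self.owner = None   # assumption of this sketch
            elif value < self.limit:
                # A more restrictive request takes over ownership.
                self.limit, self.owner = value, ssrc
            # Other requests (greater than or equal to the limit) are ignored.
        self.pending.clear()
        return self.limit, self.owner


state = TmmbrState(session_max=1000)
state.on_tmmbr(0xA, 400)
state.on_tmmbr(0xB, 300)
state.on_tmmbr(0xC, 500)
first_tmmbn = state.make_tmmbn()    # lowest request wins, 0xB owns the limit
```

Because only the current owner's SSRC is stored, the state stays
small regardless of how many participants send requests.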

The reception of a TMMBR message with a transmission limit greater
than or equal to the current limit SHALL still result in the
transmission of a TMMBN message. However, the limit and owner are not
changed, unless the request was from the owner, and the current limit
and owner are indicated in the TMMBN message. This procedure allows
session participants that haven't seen the last TMMBN message to get
a correct view of this media sender's state.

When a media sender determines that the "owner" of a limitation has
left the session, the current limitation is removed, and the media
sender SHALL send a TMMBN message indicating the maximum session
bandwidth.

4.2.2.2. Message Format

The TMMBN Feedback control information (FCI) entry has the following
syntax:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Maximum bit-rate in units of 128 bits/s            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 2 - Syntax for the TMMBN message

Maximum bit-rate: The current temporary maximum media bit-rate
value in units of 128 bit/s.

The length field value of the FB message SHALL be 3.

4.2.2.3. Timing Rules

The acknowledgement SHOULD be sent as soon as allowed by the applied
timing rules for the session. Immediate or early feedback mode SHOULD
be used for these messages.

4.3. Payload Specific Feedback Messages

Payload-Specific FB messages are identified by the value PT=PSFB
(206) as RTCP packet type.

AVPF defines three payload-specific FB messages and one application
layer FB message. This memo specifies four additional payload
specific feedback messages.
All are identified by means of the FMT parameter as follows:

0:     unassigned
1:     Picture Loss Indication (PLI)
2:     Slice Loss Indication (SLI)
3:     Reference Picture Selection Indication (RPSI)
4:     Full Intra Request Command (FIR)
5:     Temporal-Spatial Trade-off Request (TSTR)
6:     Temporal-Spatial Trade-off Announcement (TSTA)
7:     Video Back Channel Message (VBCM)
8-14:  unassigned
15:    Application layer FB message
16-30: unassigned
31:    reserved for future expansion of the sequence number space

The following subsections define the new FCI formats for the
payload-specific FB messages.

4.3.1. Full Intra Request (FIR) command

The FIR command FB message is identified by PT=PSFB and FMT=4.

There MUST be one or more FIR entries contained in the FCI field.

4.3.1.1. Semantics

Upon reception of a FIR message, an encoder MUST send a decoder
refresh point (see Section 2.2) as soon as possible.

   Note: Currently, video appears to be the only useful application
   for FIR, as it appears to be the only widely deployed RTP payload
   type that relies heavily on media prediction across RTP packet
   boundaries. However, use of FIR could also reasonably be
   envisioned for other media types that share essential properties
   with compressed video, namely cross-frame prediction (whatever a
   frame may be for that media type). One possible example may be the
   dynamic updates of MPEG-4 scene descriptions. It is suggested that
   payload formats for such media types refer to FIR and other message
   types defined in this specification and in AVPF, instead of
   creating similar mechanisms in the payload specifications. The
   payload specifications may have to explain how the payload specific
   terminologies map to the video-centric terminology used here.

   Note: In environments where the sender has no control over the
   codec (e.g.
      when streaming pre-recorded and pre-coded content), the
      reaction to this command cannot be specified. One suitable
      reaction of a sender would be to skip forward in the video bit
      stream to the next decoder refresh point. In other scenarios, it
      may be preferable not to react to the command at all, e.g. when
      streaming to a large multicast group. Other reactions may also be
      possible. When deciding on a strategy, a sender could take into
      account factors such as the size of the receiving multicast group,
      the "importance" of the sender of the FIR message (however
      "importance" may be defined in this specific application), the
      frequency of decoder refresh points in the content, and others.
      However, a session that predominantly handles pre-coded content
      should not use FIR at all.

   The sender MUST consider congestion control as outlined in section 5,
   which MAY restrict its ability to send a decoder refresh point
   quickly.

      Note: The relationship between the Picture Loss Indication and FIR
      is as follows. As discussed in section 6.3.1 of AVPF, a Picture
      Loss Indication informs the encoder about the loss of a picture and
      hence the likelihood of misalignment of the reference pictures in
      encoder and decoder. Such a scenario is normally related to losses
      in an ongoing connection. In point-to-point scenarios, and without
      the presence of advanced error resilience tools, one possible
      option an encoder has is to send a decoder refresh point. However,
      there are other options including ignoring the PLI, for example if
      only one receiver of many has sent a PLI or when the embedded
      stream redundancy is likely to clean up the reproduced picture
      within a reasonable amount of time.
      The FIR, in contrast, leaves a real-time encoder no choice but to
      send a decoder refresh point.
      It does not allow the encoder to take into account considerations
      such as the ones mentioned above.

      Note: Mandating a maximum delay for completing the sending of a
      decoder refresh point would be desirable from an application
      viewpoint, but may be problematic from a congestion control point
      of view. "As soon as possible" as mentioned above appears to be a
      reasonable compromise.

   FIR SHALL NOT be sent as a reaction to picture losses - it is
   RECOMMENDED to use PLI instead. FIR SHOULD be used only in such
   situations where not sending a decoder refresh point would render the
   video unusable for the users.

      Note: A typical example where sending FIR is adequate is when, in a
      multipoint conference, a new user joins the session and no regular
      decoder refresh point interval is established. Another example
      would be a video switching MCU that changes streams. Here,
      normally, the MCU issues a freeze picture request (through protocol
      means outside this specification) to the receiver(s), switches the
      streams, and issues a FIR to the new sender so as to force it to
      emit a decoder refresh point. The decoder refresh point normally
      includes a Freeze Picture Release (defined outside this
      specification), which re-starts the rendering process of the
      receivers. Both techniques mentioned are commonly used in
      MCU-based multipoint conferences.

   Other RTP payload specifications such as RFC 2032 [RFC2032] already
   define a feedback mechanism for certain codecs. An application
   supporting both schemes MUST use the feedback mechanism defined in
   this specification when sending feedback. For backward compatibility
   reasons, such an application SHOULD also be capable of receiving and
   reacting to the feedback scheme defined in the respective RTP payload
   format, if this is required by that payload format.
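   The guidance above on when to send FIR rather than PLI can be
   summarised as a small decision function. This is only an
   illustrative, non-normative sketch in Python; the function name and
   the event labels are hypothetical and are not protocol elements.

```python
def choose_feedback(event: str) -> str:
    """Pick the feedback message a receiver or MCU might send.

    Hypothetical event labels (assumptions made for this sketch):
      "picture_loss"  - the decoder detected a lost or damaged picture
      "new_joiner"    - a participant joined and no regular decoder
                        refresh point interval is established
      "stream_switch" - a video switching MCU changed the forwarded stream
    """
    if event == "picture_loss":
        # FIR SHALL NOT be sent as a reaction to picture losses;
        # PLI is RECOMMENDED instead.
        return "PLI"
    if event in ("new_joiner", "stream_switch"):
        # Without a decoder refresh point the video would be unusable.
        return "FIR"
    raise ValueError("unknown event: " + event)
```

   A PLI leaves the encoder free to choose its reaction, whereas a FIR
   does not, which is why the sketch reserves FIR for the two cases
   named in the note above.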
   The "SSRC of the packet sender" field indicates the source of the
   request, and the "SSRC of media source" is not used and SHALL be set
   to 0. The SSRC of the media sender to which the FIR command applies
   is in the FCI.

4.3.1.2. Message Format

   Full Intra Request uses one additional FCI field, the content of
   which is depicted in Figure 3. The length of the FB message MUST be
   set to 2+2*N, where N is the number of FCI entries.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Seq. nr       |    Reserved                                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 3 - Syntax for the FIR message

      SSRC: The SSRC value of the media sender targeted by this
            specific FIR command.

      Seq. nr: Command sequence number. The sequence number space is
            unique for each tuple consisting of the SSRC of command
            source and the SSRC of the command target. The sequence
            number SHALL be increased by 1 modulo 256 for each new
            command. A repetition SHALL NOT increase the sequence
            number. Initial value is arbitrary.

      Reserved: All bits SHALL be set to 0 and SHALL be ignored on
            reception.

   The semantics of this FB message is independent of the RTP payload
   type.

4.3.1.3. Timing Rules

   The timing follows the rules outlined in section 3 of [RFC4548]. FIR
   commands MAY be used with early or immediate feedback. The FIR
   feedback message MAY be repeated. If using immediate feedback mode,
   the repetition SHOULD wait at least one RTT before being sent. In
   early or regular RTCP mode the repetition is sent in the next regular
   RTCP packet.

4.3.1.4. Remarks

   FIR messages typically trigger the sending of full intra or IDR
   pictures.
   Both are several times larger than predicted (inter)
   pictures. Their size is independent of the time they are generated.
   In most environments, especially when employing bandwidth-limited
   links, the use of an intra picture implies an allowed delay that is a
   significant multiple of the typical frame duration. An example: If
   the sending frame rate is 10 fps, and an intra picture is assumed to
   be 10 times as big as an inter picture, then a full second of latency
   has to be accepted. In such an environment there is no need for a
   particularly short delay in sending the FIR message. Hence waiting
   for the next possible time slot allowed by RTCP timing rules as per
   [RFC4548] may not have an overly negative impact on the system
   performance.

4.3.2. Temporal-Spatial Trade-off Request (TSTR)

   The TSTR FB message is identified by PT=PSFB and FMT=5.

   There MUST be one or more TSTR entries contained in the FCI field.

4.3.2.1. Semantics

   A decoder can suggest the use of a temporal-spatial trade-off by
   sending a TSTR message to an encoder. If the encoder is capable of
   adjusting its temporal-spatial trade-off, it SHOULD take into account
   the received TSTR message for future coding of pictures. A value of
   0 suggests a high spatial quality and a value of 31 suggests a high
   frame rate. The values from 0 to 31 indicate monotonically a desire
   for higher frame rate. Actual values do not correspond to precise
   values of spatial quality or frame rate.

   The reaction to the reception of more than one TSTR message by a
   media sender from different media receivers is left open to the
   implementation. The selected trade-off SHALL be communicated to the
   media receivers by means of the TSTA message.

   The "SSRC of the packet sender" field indicates the source of the
   request, and the "SSRC of media source" is not used and SHALL be set
   to 0.
   The SSRC of the media sender to which the TSTR applies is in the
   FCI entries.

   A TSTR message may contain multiple requests to different media
   senders, using multiple FCI entries.

4.3.2.2. Message Format

   The Temporal-Spatial Trade-off Request uses one FCI field, the
   content of which is depicted in Figure 4. The length of the FB
   message MUST be set to 2+2*N, where N is the number of FCI entries
   included.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Seq. nr       |  Reserved                           | Index   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 4 - Syntax of the TSTR

      SSRC: The SSRC value of the target (or the media sender) of this
            specific TSTR request.

      Seq. nr: Request sequence number. The sequence number space is
            unique for each tuple consisting of the SSRC of request
            source and the SSRC of the request target. The sequence
            number SHALL be increased by 1 modulo 256 for each new
            command. A repetition SHALL NOT increase the sequence
            number. Initial value is arbitrary.

      Index: An integer value between 0 and 31 that indicates the
            relative trade-off that is requested. An index value of 0
            indicates the highest possible spatial quality, while 31
            indicates the highest possible temporal resolution.

      Reserved: All bits SHALL be set to 0 and SHALL be ignored on
            reception.

4.3.2.3. Timing Rules

   The timing follows the rules outlined in section 3 of [RFC4548].
   This request message is not time critical and SHOULD be sent using
   regular RTCP timing. Only if it is known that the user interface
   requires a quick feedback, the message MAY be sent with early or
   immediate feedback timing.

4.3.2.4. Remarks

   The term "spatial quality" does not necessarily refer to the
   resolution, measured by the number of pixels the reconstructed video
   is using. In fact, in most scenarios the video resolution stays
   constant during the lifetime of a session. However, all video
   compression standards have means to adjust the spatial quality at a
   given resolution, often influenced by the Quantizer Parameter or QP.
   A numerically low QP results in a good reconstructed picture quality,
   whereas a numerically high QP yields a coarse picture. The typical
   reaction of an encoder to this request is to change its rate control
   parameters to use a lower frame rate and a numerically lower (on
   average) QP, or vice versa. The precise mapping of Index, frame
   rate, and QP is intentionally left open here, as it depends on
   factors such as compression standard employed, spatial resolution,
   content, bit rate, and many more.

4.3.3. Temporal-Spatial Trade-off Announcement (TSTA)

   The TSTA FB message is identified by PT=PSFB and FMT=6.

   There SHALL be one or more TSTA entries contained in the FCI field.

4.3.3.1. Semantics

   This feedback message is used to acknowledge the reception of a TSTR.
   A TSTA entry in a TSTA feedback message SHALL be sent for each TSTR
   entry targeted to this session participant, i.e. each received TSTR
   entry whose SSRC field contains the receiving entity's SSRC.
   The acknowledgement SHALL be sent also for repetitions received. If
   the request receiver has received TSTR with several different
   sequence numbers from a single requestor it SHALL only respond to the
   request with the highest (modulo 256) sequence number.

   The TSTA SHALL include the Temporal-Spatial Trade-off index that will
   be used as a result of the request.
   This is not necessarily the same
   index as requested, as the media sender may need to aggregate
   requests from several requesting session participants. It may also
   have some other policies or rules that limit the selection.

   A single TSTA message MAY acknowledge multiple requests using
   multiple FCI entries.

4.3.3.2. Message Format

   The Temporal-Spatial Trade-off Announcement uses one additional FCI
   field, the content of which is depicted in Figure 5. The length of
   the FB message MUST be set to 2+2*N, where N is the number of FCI
   entries.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Seq. nr       |  Reserved                           | Index   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 5 - Syntax of the TSTA

      SSRC: The SSRC of the source of the TSTR request that is
            acknowledged.

      Seq. nr: The sequence number value from the TSTR request that is
            being acknowledged.

      Index: The trade-off value the media sender is using henceforth.

      Reserved: All bits SHALL be set to 0 and SHALL be ignored on
            reception.

      Informative note: The returned trade-off value (Index) may differ
      from the requested one, for example in cases where a media encoder
      cannot tune its trade-off, or when pre-recorded content is used.

4.3.3.3. Timing Rules

   The timing follows the rules outlined in section 3 of [RFC4548].
   This acknowledgement message is not extremely time critical and
   SHOULD be sent using regular RTCP timing.

4.3.3.4. Remarks

   None

4.3.4. H.271 VideoBackChannelMessage (VBCM)

   The VBCM FB message is identified by PT=PSFB and FMT=7.

   There MUST be one or more VBCM entries contained in the FCI field.
Semantics

   The "payload" of a VBCM indication carries different types of
   codec-specific feedback information. The type of feedback
   information can be classified as a "status report", such as receiving
   the bit stream without errors or the loss of a partial or complete
   picture or block, or as an "update request", such as a complete
   refresh of the bit stream.

      Note: There is possible overlap between the VBCM sub-messages
      and CCM/AVPF feedback messages, such as FIR. Please see section
      3.5.3 for further discussion.

   The different types of feedback sub-messages carried in the VBCM are
   indicated by the "payloadType" as defined in [VBCM]. The different
   sub-message types as defined in [VBCM] are reproduced below for
   convenience. "payloadType", in ITU-T Rec. H.271 terminology, refers
   to the sub-type of the H.271 message and should not be confused with
   an RTP payload type.

   Payload Type   Message Content
   0              One or more pictures without detected bitstream
                  error mismatch
   1              One or more pictures that are entirely or partially
                  lost
   2              A set of blocks of one picture that is entirely or
                  partially lost
   3              CRC for one parameter set
   4              CRC for all parameter sets of a certain type
   5              A "reset" request indicating that the sender should
                  completely refresh the video bitstream as if no
                  prior bitstream data had been received
   > 5            Reserved for future use by ITU-T

   The bit string or the "payload" of a VBCM message is of variable
   length and is self-contained and coded in a variable length, binary
   format. The media sender necessarily has to be able to parse this
   optimized binary format to make use of VBCM messages.

   Each of the different types of sub-messages (indicated by
   payloadType) may have different semantics depending on the codec
   used.

Message Format

   The VBCM indication uses one FCI field and the syntax is depicted in
   Figure 6.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Seq. nr       |0| Payload Type|    Length                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          VBCM Bit String....                  |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 6 - Syntax for VBCM Message

      SSRC: The SSRC value of the media sender targeted by this
         specific VBCM indication message.

      Seq. nr: Command sequence number. The sequence number space
         is unique for each tuple consisting of the SSRC of command
         source and the SSRC of the command target. The sequence number
         SHALL be increased by 1 modulo 256 for each new command. A
         repetition SHALL NOT increase the sequence number. Initial
         value is arbitrary.

      0: Must be set to 0 and should not be acted upon on reception.

      Payload Type: The RTP payload type for which the VBCM bit string
         must be interpreted.

         NOTE: Stephan, I think this payload type is redundant,
         since during the session set-up phase you do lock down the
         payload type that you are going to use for this session. But I
         am keeping it since it is there in RPSI of AVPF. Is there any
         reason for this? From an implementation point of view I can
         see this being helpful (and fast) since you don't have to go
         back to your session information.

      VBCM Bit String: This is the bit string generated by the
         decoder carrying a specific feedback sub-message. It is of
         variable length.

      Padding: Bits set to 0 to make up a 32-bit boundary.

Timing Rules

   The timing follows the rules outlined in section 3 of [RFC4548].

Remarks

   Please see section 3.5.3 for the applicability of the VBCM message
   in relation to messages in both AVPF and this memo with similar
   functionality.
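   The fixed-size FCI entries of Sections 4.3.1 to 4.3.3 (Figures 3, 4
   and 5) can be sketched in Python as follows. This is a minimal,
   non-normative sketch: all function names are hypothetical, only the
   FCI entries are built (not the common FB message header), and the
   variable-length VBCM entry is left out because the text above does
   not define the unit of its Length field.

```python
import struct


def pack_fir_entry(ssrc, seq_nr):
    """Figure 3: 32-bit SSRC, 8-bit Seq. nr, 24 reserved bits set to 0."""
    # "!" selects network byte order; "3x" emits three zero pad bytes.
    return struct.pack("!IB3x", ssrc, seq_nr % 256)


def pack_tst_entry(ssrc, seq_nr, index):
    """Figures 4 and 5: SSRC, 8-bit Seq. nr, 19 reserved bits, 5-bit Index."""
    if not 0 <= index <= 31:
        raise ValueError("Index must be in the range 0..31")
    second_word = ((seq_nr % 256) << 24) | index  # reserved bits stay 0
    return struct.pack("!II", ssrc, second_word)


def unpack_tst_entry(entry):
    """Return (ssrc, seq_nr, index); reserved bits are ignored."""
    ssrc, second_word = struct.unpack("!II", entry)
    return ssrc, second_word >> 24, second_word & 0x1F


def fb_length(n_entries):
    """Length field value 2+2*N for a message with N 8-octet FCI entries."""
    return 2 + 2 * n_entries
```

   The same packing routine serves TSTR and TSTA because Figures 4 and
   5 share one layout; only the meaning of the SSRC and Index fields
   differs between request and announcement.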
5. Congestion Control

   The correct application of the AVPF timing rules prevents the network
   from being flooded by feedback messages. Hence, assuming a correct
   implementation, the RTCP channel cannot break its bit-rate commitment
   and introduce congestion.

   The reception of some of the feedback messages modifies the behaviour
   of the media senders or, more specifically, the media encoders. All
   of these modifications MUST only be performed within the bandwidth
   limits the applied congestion control provides. For example, when
   reacting to a FIR, the unusually high number of packets that form the
   decoder refresh point have to be paced in compliance with the
   congestion control algorithm, even if the user experience suffers
   from a slowly transmitted decoder refresh point.

   A change of the Temporary Maximum Media Bit-rate value can only
   mitigate congestion, but not cause congestion. An increase of the
   value by a request REQUIRES the media sender to use congestion
   control when increasing its transmission rate to that value. A
   reduction of the value results in a reduced transmission bit-rate
   thus reducing the risk for congestion.

6. Security Considerations

   The defined messages have certain properties that have security
   implications. These must be addressed and taken into account by users
   of this protocol.

   The defined setup signalling mechanism is sensitive to modification
   attacks that can result in session creation with sub-optimal
   configuration, and, in the worst case, session rejection. To prevent
   this type of attack, authentication and integrity protection of the
   setup signalling is required.

   Spoofed or maliciously created feedback messages of the type defined
   in this specification can have the following implications:
      a. Severely reduced media bit-rate due to false TMMBR messages
         that set the maximum to a very low value.
      b. The assignment of the ownership of a bit-rate limit with a
         TMMBN message to the wrong participant. This potentially
         freezes the mechanism until a correct TMMBN message reaches
         the participants.
      c. Sending TSTR that result in a video quality different from
         the user's desire, rendering the session less useful.
      d. Frequent FIR commands will potentially reduce the frame-rate
         making the video jerky due to the frequent usage of decoder
         refresh points.

   To prevent these attacks there is a need to apply authentication and
   integrity protection of the feedback messages. This can be
   accomplished against group external threats using the RTP profile
   that combines SRTP [SRTP] and AVPF into SAVPF [SAVPF]. In the MCU
   cases, separate security contexts and filtering can be applied
   between the MCU and the participants thus protecting other MCU users
   from a misbehaving participant.

7. SDP Definitions

   Section 4 of [RFC4548] defines new SDP [RFC2327] attributes that are
   used for the capability exchange of the AVPF commands and
   indications, such as Reference Picture selection, Picture loss
   indication etc. The defined SDP attribute is known as rtcp-fb and its
   ABNF is described in section 4.2 of [RFC4548]. In this section we
   extend the rtcp-fb attribute to include the commands and indications
   that are described in this document for codec control protocol. We
   also discuss the Offer/Answer implications for the codec control
   commands and indications.

7.1. Extension of rtcp-fb attribute

   As described in [RFC4548], the rtcp-fb attribute is defined to
   indicate the capability of using RTCP feedback. As defined in AVPF
   the rtcp-fb attribute must only be used as a media level attribute
   and must not be provided at session level.
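   The media-level-only rule for rtcp-fb can be checked mechanically.
   The following Python sketch is illustrative only; the function name
   and the sample session descriptions in the usage note are
   hypothetical. It reports any rtcp-fb attribute that appears before
   the first "m=" line, i.e. at session level.

```python
def session_level_rtcp_fb(sdp):
    """Return the a=rtcp-fb lines found before the first m= line."""
    offending = []
    for raw_line in sdp.splitlines():
        line = raw_line.strip()
        if line.startswith("m="):
            break  # everything from here on is media level
        if line.startswith("a=rtcp-fb:"):
            offending.append(line)
    return offending
```

   For a description whose only rtcp-fb line follows an "m=video" line
   the function returns an empty list; placing the same attribute
   directly after the "s=" line causes it to be returned as offending.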
   All the rules described in [RFC4548] for the rtcp-fb attribute
   relating to payload type and to multiple rtcp-fb attributes in a
   session description hold for the new feedback messages for codec
   control defined in this document.

   The ABNF for the rtcp-fb attribute as defined in [RFC4548] is

      rtcp-fb-syntax = "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF

   Where rtcp-fb-pt is the payload type and rtcp-fb-val defines the type
   of the feedback message such as ack, nack, trr-int and rtcp-fb-id.
   For example, to indicate the support of feedback of picture loss
   indication, the sender declares the following in SDP

      v=0
      o=alice 3203093520 3203093520 IN IP4 host.example.com
      s=Media with feedback
      t=0 0
      c=IN IP4 host.example.com
      m=video 49170 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 nack pli

   In this document we define a new feedback value type called "ccm"
   which indicates the support of codec control using RTCP feedback
   messages. The "ccm" feedback value should be used with parameters,
   which indicate which codec commands the session may use. In this
   draft we define four parameters, which can be used with the ccm
   feedback value type.

      o "fir" indicates the support of Full Intra Request
      o "tmmbr" indicates the support of Temporary Maximum Media
        Bit-rate
      o "tstr" indicates the support of temporal spatial trade-off
        request.
      o "vbcm" indicates the support of H.271 video back channel
        messages.

   In the ABNF for rtcp-fb-val defined in [RFC4548], there is a
   placeholder called rtcp-fb-id to define new feedback types. The ccm
   is defined as a new feedback type in this document and the ABNF for
   the parameters for ccm are defined here (please refer to section 4.2
   of [RFC4548] for complete ABNF syntax).

      rtcp-fb-param = SP "app" [SP byte-string]
                    / SP rtcp-fb-ccm-param
                    / ; empty

      rtcp-fb-ccm-param = "ccm" SP ccm-param

      ccm-param  = "fir"   ; Full Intra Request
                 / "tmmbr" ; Temporary max media bit rate
                 / "tstr"  ; Temporal Spatial Trade Off
                 / "vbcm" 1*[SP subMessageType] ; H.271 VBCM messages
                 / token [SP byte-string]
                           ; for future commands/indications
      subMessageType = 1*[integer];
      byte-string =

7.2. Offer-Answer

   The Offer/Answer [RFC3264] implications for codec control protocol
   feedback messages are similar to those described in [RFC4548]. The
   offerer MAY indicate the capability to support selected codec
   commands and indications. The answerer MUST remove all ccm
   parameters which it does not understand or does not wish to use in
   this particular media session. The answerer MUST NOT add new ccm
   parameters in addition to what has been offered. The answer is
   binding for the media session and both offerer and answerer MUST only
   use feedback messages negotiated in this way.

7.3. Examples

   Example 1: The following SDP describes a point-to-point video call
   with H.263 with the originator of the call declaring its capability
   to support codec control messages - fir, tstr. The SDP is carried in
   a high level signalling protocol like SIP.

      v=0
      o=alice 3203093520 3203093520 IN IP4 host.example.com
      s=Point-to-Point call
      c=IN IP4 172.11.1.124
      m=audio 49170 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 51372 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm tstr
      a=rtcp-fb:98 ccm fir

   In the above example, when the sender receives a TSTR message from
   the remote party, it can adjust the trade-off, which it then
   indicates in the RTCP TSTA feedback message.

   Example 2: The following SDP describes a SIP end point joining a
   video MCU that is hosting a multiparty video conferencing session.
   The participant supports only the FIR (Full Intra Request) codec
   control command and it declares it in its session description. The
   video MCU can send an FIR RTCP feedback message to this end point
   when it needs to send this participant's video to other participants
   of the conference.

      v=0
      o=alice 3203093520 3203093520 IN IP4 host.example.com
      s=Multiparty Video Call
      c=IN IP4 172.11.1.124
      m=audio 49170 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 51372 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm fir

   When the video MCU decides to route the video of this participant it
   sends an RTCP FIR feedback message. Upon receiving this feedback
   message the end point is required to generate a decoder refresh
   point.

   Example 3: The following example describes the Offer/Answer
   implications for the codec control messages. The Offerer wishes to
   support all the commands and indications of codec control messages.
   The offered SDP is

      -------------> Offer
      v=0
      o=alice 3203093520 3203093520 IN IP4 host.example.com
      s=Offer/Answer
      c=IN IP4 172.11.1.124
      m=audio 49170 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 51372 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm tstr
      a=rtcp-fb:98 ccm fir
      a=rtcp-fb:98 ccm tmmbr

   The answerer only wishes to support the FIR and TSTR messages as the
   codec control messages and the answerer SDP is

      <---------------- Answer

      v=0
      o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
      s=Offer/Answer
      c=IN IP4 189.13.1.37
      m=audio 47190 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 53273 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm tstr
      a=rtcp-fb:98 ccm fir

   Example 4: The following example describes the Offer/Answer
   implications for H.271 Video back channel messages (VBCM).
   The Offerer wishes to support VBCM and the sub-messages of
   payloadType 2 (a set of blocks of one picture that is entirely or
   partially lost), 3 (CRC for one parameter set) and 4 (CRC for all
   parameter sets of a certain type).

      -------------> Offer
      v=0
      o=alice 3203093520 3203093520 IN IP4 host.example.com
      s=Offer/Answer
      c=IN IP4 172.11.1.124
      m=audio 49170 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 51372 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm vbcm 2 3 4

   The answerer wishes to support only sub-messages 3 and 4

      <---------------- Answer

      v=0
      o=alice 3203093520 3203093524 IN IP4 otherhost.example.com
      s=Offer/Answer
      c=IN IP4 189.13.1.37
      m=audio 47190 RTP/AVP 0
      a=rtpmap:0 PCMU/8000
      m=video 53273 RTP/AVPF 98
      a=rtpmap:98 H263-1998/90000
      a=rtcp-fb:98 ccm vbcm 3 4

   So in the above example only VBCM indications with "payloadType" 3
   and 4 will be supported.

8. IANA Considerations

   The new value of ccm for the rtcp-fb attribute needs to be registered
   with IANA.

   Value name:    ccm
   Long Name:     Codec Control Commands and Indications
   Reference:     RFC XXXX

   For use with "ccm" the following values also need to be
   registered.

   Value name:    fir
   Long name:     Full Intra Request Command
   Usable with:   ccm
   Reference:     RFC XXXX

   Value name:    tmmbr
   Long name:     Temporary Maximum Media Bit-rate
   Usable with:   ccm
   Reference:     RFC XXXX

   Value name:    tstr
   Long name:     Temporal Spatial Trade Off
   Usable with:   ccm
   Reference:     RFC XXXX

   Value name:    vbcm
   Long name:     H.271 video back channel messages
   Usable with:   ccm
   Reference:     RFC XXXX

9. Acknowledgements

   The authors would like to thank Andrea Basso, Orit Levin, Nermeen
   Ismail for their work on the requirement and discussion draft
   [Basso].

10. References

10.1. Normative references

   [RFC4548] Ott, J., Wenger, S., Sato, N., Burmeister, C., Rey, J.,
             "Extended RTP Profile for Real-Time Transport Control
             Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
             2006
   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.
   [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
             Jacobson, "RTP: A Transport Protocol for Real-Time
             Applications", STD 64, RFC 3550, July 2003.
   [RFC2327] Handley, M. and V. Jacobson, "SDP: Session Description
             Protocol", RFC 2327, April 1998.
   [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
             with Session Description Protocol (SDP)", RFC 3264, June
             2002.
   [Topologies] M. Westerlund, and S. Wenger, "Topologies", RFC xxxx, x

10.2. Informative references

   [Basso]   A. Basso, et. al., "Requirements for transport of video
             control commands", draft-basso-avt-videoconreq-02.txt,
             expired Internet Draft, October 2004.
   [AVC]     Joint Video Team of ITU-T and ISO/IEC JTC 1, Draft ITU-T
             Recommendation and Final Draft International Standard of
             Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC
             14496-10 AVC), Joint Video Team (JVT) of ISO/IEC MPEG and
             ITU-T VCEG, JVT-G050, March 2003.
   [NEWPRED] S. Fukunaga, T. Nakai, and H. Inoue, "Error Resilient Video
             Coding by Dynamic Replacing of Reference Pictures," in
             Proc. Globecom'96, vol. 3, pp. 1503 - 1508, 1996.
   [SRTP]    Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
             Norrman, "The Secure Real-time Transport Protocol (SRTP)",
             RFC 3711, March 2004.
   [RFC2032] Turletti, T. and C. Huitema, "RTP Payload Format for H.261
             Video Streams", RFC 2032, October 1996.
   [SAVPF]   J. Ott, E. Carrara, "Extended Secure RTP Profile for
             RTCP-based Feedback (RTP/SAVPF)",
             draft-ietf-avt-profile-savpf-02.txt, July, 2005.
   [RFC3525] Groves, C., Pantaleo, M., Anderson, T., and T. Taylor,
             "Gateway Control Protocol Version 1", RFC 3525, June 2003.
   [VBCM]    ITU-T Rec. H.271, "Video Back Channel Messages",
             pre-published, June 2006

   Any 3GPP document can be downloaded from the 3GPP web server,
   "http://www.3gpp.org/", see specifications.

11. Authors' Addresses

   Stephan Wenger
   Nokia Corporation
   P.O. Box 100
   FIN-33721 Tampere
   FINLAND

   Phone: +358-50-486-0637
   EMail: stewe@stewe.org

   Umesh Chandra
   Nokia Research Center
   6000 Connection Drive
   Irving, Texas 75063
   USA

   Phone: +1-972-894-6017
   Email: Umesh.Chandra@nokia.com

   Magnus Westerlund
   Ericsson Research
   Ericsson AB
   SE-164 80 Stockholm, SWEDEN

   Phone: +46 8 7190000
   EMail: magnus.westerlund@ericsson.com

   Bo Burman
   Ericsson Research
   Ericsson AB
   SE-164 80 Stockholm, SWEDEN

   Phone: +46 8 7190000
   EMail: bo.burman@ericsson.com

12. List of Changes relative to previous drafts

   The following changes since draft-wenger-avt-avpf-ccm-01 have been
   made:

   - The topologies have been rewritten and clarified.

   - The TMMBR mechanism has been completely revised to use notification
     and suppress messages in deployments with large common SSRC spaces.

   The following changes since draft-wenger-avt-avpf-ccm-02 have been
   made:

   - Update of section 4.2.2.1 (TMMBN) as per discussions between
     Harikishan Desineni and Magnus Westerlund on the AVT list around
     Feb 21, 2006
   - Section 2.3.4 clarified as per email exchange between Colin Perkins
     and Magnus Westerlund around Feb 24
   - Section 3.5.2 and other occurrences throughout the draft,
     Temporal/Spatial Acknowledgement renamed to Temporal/Spatial
     Announcement

   Changes relative to draft-wenger-avt-avpf-ccm-03

   - Moved "topologies" out to another draft
   - Editorial improvements
   - Added new code point VBCM for H.271 Video back channel messages.
     Several sections (3, 4, and 7) were modified for this new CCM
     message.
   - Removed Basso use case referring to forward Freeze command, added
     justification.

Full Copyright Statement

   Copyright (C) The Internet Society (2006).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.

RFC Editor Considerations

   The RFC editor is requested to replace all occurrences of XXXX with
   the RFC number this document receives.