INTERNET-DRAFT                        Jörg Ott/Universität Bremen TZI
draft-ietf-avt-rtcp-feedback-01.txt   Stephan Wenger/TU Berlin
                                      Shigeru Fukunaga/Oki
                                      Noriyuki Sato/Oki
                                      Koichi Yano/Fast Forward Networks
                                      Akihiro Miyazaki/Matsushita
                                      Koichi Hata/Matsushita
                                      Rolf Hakenberg/Matsushita
                                      Carsten Burmeister/Matsushita

                                      21 November, 2001
                                      Expires May 2002

       Extended RTP Profile for RTCP-based Feedback (RTP/AVPF)

Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC 2026.  Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups.  Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

Real-time media streams are not resilient against packet losses.  RTP
[1] provides all the necessary mechanisms to restore ordering and
timing to properly reproduce a media stream at the recipient.
RTP also provides continuous feedback about the overall reception
quality from all receivers -- thereby allowing the sender(s) in the
mid-term (in the order of several seconds to minutes) to adapt their
coding scheme and transmission behavior to the observed network QoS.
However, except for a few payload-specific mechanisms [10], RTP makes
no provision for timely feedback that would allow a sender to repair
the media stream immediately: through retransmissions, retro-active
FEC, or media-specific mechanisms such as reference picture
selection.

Generally, real-time transport of media streams across IP networks
follows RTP [1] in conjunction with the RTP Profile for Audio and
Video Conferences with Minimal Control [2].  This document modifies
the profile defined in [2] in two ways:

. by providing additional RTCP messages that enable a receiver to
  convey more precise feedback to a sender and

. by adapting the timing algorithm for scheduling RTCP packets in
  order to allow for occasional timely feedback about events
  observed by a receiver (such as lost packets).

The result is an RTP Profile for Audio and Video Conferences with
Minimal Control that allows for more explicit and more immediate
receiver feedback but shares all other properties (including all
other message types and formats, all code points for codecs, payload
formats, scaling capabilities, etc. of [2]).  Therefore, this
document only specifies the additions and modifications to [2] rather
than repeating the entire specification.

1. Introduction

Real-time media streams are not resilient against packet losses.  RTP
[1] provides all the necessary mechanisms to restore ordering and
timing present at the sender to properly reproduce a media stream at
a recipient.
RTP also provides continuous feedback about the overall reception
quality from all receivers -- thereby allowing the sender(s) in the
mid-term (in the order of several seconds to minutes) to adapt their
coding scheme and transmission behavior to the observed network QoS.
However, except for a few payload-specific mechanisms [10], RTP makes
no provision for timely feedback that would allow a sender to repair
the media stream immediately: through retransmissions, retro-active
FEC, or media-specific mechanisms such as reference picture
selection.

Current mechanisms available with RTP to improve error resilience
include audio redundancy coding [7], video redundancy coding [11],
RTP-level FEC [5], and general considerations on more robust media
stream transmission [6].  These mechanisms may be applied proactively
(thereby increasing the bandwidth of a given media stream).
Alternatively, in sufficiently small groups with short RTTs, the
senders may perform repair on-demand, using the above mechanisms
and/or media-encoding-specific approaches.  Note that "small group"
and "sufficiently short RTT" are both highly application dependent.

This document specifies a modified RTP Profile for Audio and Video
Conferences with Minimal Control based upon [1] and [2] by means of
two modifications/additions.  First, to achieve timely feedback, the
concepts of Immediate Feedback messages and Early RTCP messages as
well as algorithms allowing for low delay feedback in small multicast
groups (and preventing feedback implosion in large ones) are
introduced.  Special consideration is given to point-to-point
scenarios.  Second, a small number of general-purpose feedback
messages as well as a format for codec- and application-specific
feedback information are defined as specific RTCP payloads.

1.1 Definitions

The definitions from [1] and [2] apply.
In addition, the following definitions are used in this document:

Early RTCP mode:
   The mode of operation in which a receiver of a media stream
   is, statistically, often (but not always) capable of
   reporting events of interest back to the sender close to
   their occurrence.  In Early RTCP mode, RTCP feedback messages
   are transmitted according to the timing rules defined in this
   document.

Early RTCP packet:
   An Early RTCP packet is a packet which is transmitted earlier
   than would be allowed following the scheduling algorithm of
   [1], the reason being an event observed by a receiver.
   Early RTCP packets may be sent in Immediate Feedback and in
   Early RTCP mode.

Event:
   An observation made by the receiver of a media stream that is
   (potentially) of interest to the sender -- such as a packet
   loss or packet reception, frame loss, etc. -- and thus to be
   reported back to the sender by means of a Feedback message.

Feedback (FB) message:
   An RTCP message as defined in this document used to convey
   events observed at a receiver -- in addition to long term
   receiver status information which is carried in RTCP RRs --
   back to the sender of the media stream.

Feedback (FB) threshold:
   The FB threshold indicates the "borderline" between Immediate
   Feedback and Early RTCP mode.  For a multicast scenario, the
   FB threshold indicates the maximum group size at which, on
   average, each receiver is able to report each event back to
   the sender(s) immediately, i.e. without having to wait for
   its regularly scheduled RTCP interval.  This threshold is
   highly dependent on network QoS (e.g. packet loss probability
   and distribution), codec and packetization in use, and
   application requirements.  Hence, no formal definition is
   presented in this document.

Immediate Feedback mode:
   Mode of operation in which each receiver of a media stream is,
   statistically, capable of reporting each event of interest
   immediately back to the media stream sender.  In Immediate
   Feedback mode, RTCP feedback messages are transmitted
   according to the timing rules defined in this document.

Regular RTCP mode:
   Mode of operation in which no preferred transmission of
   feedback messages is allowed.  Instead, RTCP messages are
   sent following the rules of [1] and may contain feedback
   message information as defined in this document.

Regularly Scheduled RTCP packet:
   An RTCP packet that is not sent as an Early RTCP packet.

1.2 Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [8].

2. RTP and RTCP Packet Formats and Protocol Behavior

The rules defined in [2] also apply to this profile except for those
rules mentioned in the following:

RTCP packet types:
   Three additional RTCP packet types to convey feedback
   information are defined in section 4.

RTCP report intervals:
   This memo describes three modes of operation which influence
   the RTCP report intervals (see section 3.2).  In Regular
   RTCP mode, all rules from [1] apply.  In both Immediate
   Feedback and Early RTCP modes the minimal interval of 5
   seconds between two RTCP reports is dropped and the rules
   specified in section 3 apply if RTCP packets containing
   feedback messages (defined in section 4) are to be
   transmitted.

   The rules set forth in [1] may be overridden by session
   descriptions specifying different parameters (e.g. for the
   bandwidth share assigned to RTCP for senders and receivers,
   respectively).  For sessions defined using the Session
   Description Protocol (SDP) [3], the rules of [4] apply.

Congestion control:
   The same basic rules as detailed in [2] apply.  Beyond this,
   in section 5, further consideration is given to the impact of
   feedback and a sender's reaction to feedback messages.

3. Rules for RTCP Feedback

3.1 Compound RTCP Feedback Packets

Two components constitute RTCP-based feedback as described in this
memo:

. Status reports are contained in SR/RR messages and are transmitted
  at regular intervals as part of compound RTCP packets (which also
  include SDES and possibly other messages); these status reports
  provide an overall indication for the recent reception quality of
  a media stream.

. Feedback messages as defined in this document that indicate loss
  or reception of particular pieces of a media stream (or provide
  some other form of rather immediate feedback on the data
  received).  Rules for the transmission of feedback messages are
  newly introduced in this memo.

RTCP Feedback (FB) messages are just another RTCP packet type (see
section 4).  Therefore, multiple FB messages MAY be combined in a
single compound RTCP packet and they MAY also be sent combined with
other RTCP packets.

RTCP packets containing Feedback packets as defined in this document
MUST contain RTCP packets in the order as defined in [1]:

. OPTIONAL encryption prefix that MUST be present if the RTCP
  message is to be encrypted.
. MANDATORY SR or RR.
. MANDATORY SDES which MUST contain the CNAME item; all other SDES
  items are OPTIONAL.
. One or more FB messages.

The FB MUST be placed in the compound packet after RR and SDES RTCP
packets defined in [1].  The ordering with respect to other RTCP
extensions is not defined.

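The ordering rules above can be illustrated with a short sketch.  It
is only an illustration under stated assumptions: a compound packet
is modeled as a list of packet type names ("SR", "RR", "SDES", "FB"),
the optional encryption prefix is assumed to have been removed
already, and the presence of the CNAME item inside SDES is not
checked.  The function name is_valid_fb_compound is hypothetical and
does not appear in this memo.

```python
def is_valid_fb_compound(packet_types):
    """Check the ordering rules for a compound RTCP packet carrying FB.

    packet_types is a list of strings such as ["RR", "SDES", "FB"].
    This is a simplified model (an assumption of this sketch), not a
    parser for real RTCP packets.
    """
    # The compound packet has to start with an SR or RR.
    if not packet_types or packet_types[0] not in ("SR", "RR"):
        return False
    # SDES is mandatory, and at least one FB message must be present.
    if "SDES" not in packet_types or "FB" not in packet_types:
        return False
    first_fb = packet_types.index("FB")
    first_sdes = packet_types.index("SDES")
    last_report = max(
        i for i, t in enumerate(packet_types) if t in ("SR", "RR")
    )
    # FB messages are placed after the RR/SR and SDES packets.
    return first_fb > first_sdes and first_fb > last_report


print(is_valid_fb_compound(["RR", "SDES", "FB"]))   # True
print(is_valid_fb_compound(["RR", "FB", "SDES"]))   # False
```

Other RTCP packet types may appear in the list; the sketch does not
constrain their position, in line with the statement that the
ordering with respect to other RTCP extensions is not defined.
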
Two types of compound RTCP packets carrying feedback packets are used
in this document:

a) Minimal compound RTCP feedback packet

   A minimal compound RTCP feedback packet MUST contain only the
   mandatory information as listed above: encryption prefix if
   necessary, exactly one RR or SR, exactly one SDES with only the
   CNAME item present, and the feedback message(s).  This is to
   minimize the size of the RTCP packet transmitted to convey
   feedback and thus to maximize the frequency at which feedback can
   be provided while still adhering to the RTCP bandwidth
   limitations.

   This packet format SHOULD be used whenever an RTCP feedback
   message is sent as part of an Early RTCP packet.

b) (Full) compound RTCP feedback packet

   A (full) compound RTCP feedback packet MAY contain any additional
   number of RTCP packets (additional RRs, further SDES items,
   etc.).

   This packet format MUST be used whenever an RTCP feedback message
   is sent as part of a regularly scheduled RTCP packet or in
   Regular RTCP mode.  This packet format MAY also be used to send
   RTCP feedback messages in Immediate Feedback or Early RTCP mode.

RTCP packets that do not contain FB messages are referred to as
non-FB RTCP packets.

3.2 Algorithm Outline

FB messages are part of the RTCP control streams and are thus subject
to the same bandwidth constraints as other RTCP traffic.  This means
in particular that it may not be possible to report an event observed
at a receiver immediately back to the sender.  However, the value of
feedback given to a sender typically decreases over time -- in terms
of the media quality as perceived by the user at the receiving end
and/or the cost required to achieve media stream repair.

RTP [1] and the commonly used RTP profile [2] specify rules when
compound RTCP packets should be sent.
This document modifies those rules in order to allow applications to
report media loss or reception events in a timely manner, to
accommodate algorithms that use FB messages and are sensitive to the
feedback timing.

The modified algorithm can be outlined as follows: Normally, when no
FB messages have to be conveyed, compound RTCP packets are sent
following the rules of RTP [1] -- except that the 5s minimum interval
between RTCP reports is not enforced.  If a receiver detects the need
for an FB message, the receiver waits for a short, random dithering
interval (in case of multicast) and then checks whether it has
already seen a corresponding FB message from any other receiver
(which it can do with all FB messages that are transmitted via
multicast; for unicast sessions, there is no such delay).  If this is
the case then the receiver refrains from sending the FB message and
continues to follow the regular RTCP sending schedule.  If the
receiver has not yet seen a similar FB message from any other
receiver, it checks whether it has recently exceeded its RTCP bit
rate budget to transmit another FB message (without waiting for its
regularly scheduled RTCP transmission time).  Only if this is not the
case does it send the FB message as part of a (minimal) compound RTCP
packet.

FB messages may also be sent as part of full compound RTCP packets
which are interspersed as per [1] in regular intervals.

3.3 Modes of Operation

RTCP-based feedback may operate in one of three modes (figure 1):

a) Immediate Feedback mode: the group size is below the FB threshold
   which gives each receiving party sufficient bandwidth to transmit
   the feedback traffic for the intended purpose.  This means that,
   for each receiver, there is enough bandwidth to report each event
   it is supposed or expected to report by means of a virtually
   "immediate" RTCP feedback packet.

   The group size threshold is a function of a number of parameters
   including (but not necessarily limited to) the type of feedback
   used (e.g. ACK vs. NACK), bandwidth, packet rate, packet loss
   probability and distribution, media type, codec, and -- again
   depending on the type of FB used -- the (worst case or observed)
   frequency of events to report (e.g. frame received, packet lost).

   A special case of this is the ACK mode (where positive
   acknowledgements are used to confirm reception of data) which is
   restricted to point-to-point communications.

b) Early RTCP mode: In this mode, the group size and other parameters
   no longer allow each receiver to react to each event that would be
   worth (or needed) to report.  But feedback can still be given
   sufficiently often so that it allows the sender to adapt the media
   stream transmission accordingly and thereby increase the overall
   reproduced media quality.

c) Regular RTCP mode: From some group size upwards, it is no longer
   useful to provide feedback from individual receivers at all --
   because of the time scale in which the feedback could be provided
   and/or because in large groups the sender(s) have no chance to
   react to individual feedback anymore.

As the feedback algorithm described in this memo scales smoothly,
there is no need for an agreement among the participants on the
precise values of the respective "thresholds" within the group.
Hence the borders between all these modes are allowed to be fluid.

    ACK
  feedback
     V
     :<- - - -  NACK feedback - - - ->//
     :
     :   Immediate   ||
     : Feedback mode ||Early RTCP mode   Regular RTCP mode
     :<=============>||<=============>//<=================>
     :               ||
    -+---------------||---------------//------------------> group size
     2               ||
     Application-specific FB Threshold
        = f(data rate, packet loss, codec, ...)

                  Figure 1: Modes of operation

The respective thresholds depend on a number of technical parameters
(of the codec, the transport, the feedback used, etc.) but also on
the respective application scenarios.  Section 3.6 provides some
useful hints (but no complete precise calculations) on estimating
these thresholds.

3.4 Definitions

The following pieces of state information need to be maintained per
receiver (largely taken from [1]).  Note that all variables (except
for h) are calculated independently at each receiver and so their
local values may differ at a given point in time.

a) Let senders be the number of active senders in the RTP session.

b) Let members be the current estimate of the number of receivers
   in the RTP session.

c) Let T_rtt be the maximum round trip time as measured by RTCP
   (if available to the receiver).  Note that this may be asymmetric.

d) Let tn be the time of the next, and tp the time of the last,
   scheduled RTCP RR transmission, calculated prior to
   reconsideration.

e) Let T_rr be the interval after which, having just sent a regularly
   scheduled RTCP packet, a receiver would schedule the transmission
   of its next RTCP packet following the rules of [1]: T_rr = tn -
   tp.  Note that the 5s minimum interval between two reports as
   defined in [1] SHOULD NOT be enforced.

f) Let t0 be the time at which an event that is to be reported is
   detected by a receiver.

g) Let T_dither_max be the maximum interval for which an RTCP
   feedback packet may be additionally delayed (to prevent
   implosions).

h) Let T_max_fb_delay be the upper bound within which feedback to
   an event needs to be reported back to the sender to be useful at
   all.  Note that this value is application-specific.

i) Let te be the time for which a feedback packet is scheduled.

j) Let T_fd be the actual (randomized) delay for the transmission of
   a feedback message in response to an event that a certain packet P
   caused.

k) Let allow_early be a Boolean variable that indicates whether the
   receiver currently may transmit feedback messages prior to its
   next regularly scheduled RTCP interval tn.  This variable is used
   to throttle the feedback sent by a single receiver.  allow_early
   is adjusted (set to FALSE) after early feedback transmission and
   is reset to TRUE as soon as the next regular RTCP transmission is
   scheduled.

l) Let avg_rtcp_size be the moving average on the RTCP packet size as
   defined in [1].

The feedback situation for an event to report at a receiver is
depicted in figure 2 below.  At time t0, such an event (e.g. a packet
loss) is detected at the receiver.  The receiver decides -- based
upon current T_rtt, group size, and other (application-specific)
parameters -- that a feedback message needs to be sent back to the
sender.

To avoid an implosion of immediate feedback packets, the receiver
MUST delay the transmission of the compound feedback packet by a
random amount T_fd (with the random number evenly distributed in the
interval [0, T_dither_max]).  Transmission of the compound RTCP
packet is then scheduled for te = t0 + T_fd.

The T_dither_max parameter is chosen based upon the round-trip time
or, if the round-trip time is not available, based upon the group
size.

Based upon the parameters influencing T_dither_max and a number of
other parameters (such as the type of feedback to be provided) the
receiver may determine T_max_fb_delay (as static value or dynamically
adjusted) as the upper bound for the feedback information to be
useful when it reaches the sender.

If a compound RTCP feedback packet is scheduled, the time slot for
the next scheduled compound RTCP packet is updated accordingly to a
new tn.

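The randomized delay described above can be sketched in a few lines.
This is a minimal illustration, assuming that times are expressed in
seconds as floating point numbers; the function name
schedule_feedback_time and the rng parameter are hypothetical and
not part of this memo.

```python
import random


def schedule_feedback_time(t0, t_dither_max, rng=random):
    """Return te = t0 + T_fd for an event detected at time t0.

    T_fd is drawn uniformly from [0, t_dither_max]; rng is any object
    with a uniform(a, b) method (the random module by default).
    """
    t_fd = rng.uniform(0.0, t_dither_max)
    return t0 + t_fd


# Unicast case: no dithering, so te equals t0.
print(schedule_feedback_time(10.0, 0.0))  # 10.0
```

Passing a seeded random.Random instance as rng makes the delay
reproducible, which may be convenient when testing a scheduler.
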
         event to
          report
         detected
             |
             |  RTCP feedback range
             |   (T_max_fb_delay)
             vXXXXXXXXXXXXXXXXXXXXXXXXXXX     ) )
|---+--------+-------------+-----+------------| |--------+--------->
    |        |             |     |            ( (        |
    |       t0            te                             |
   tp                                                   tn
              \_______  ________/
                      \/
                 T_dither_max

   Figure 2: Event report and parameters for Early RTCP scheduling

3.5 Early RTCP Algorithm

Assume an active sender S0 (out of S senders) and a number N of
receivers with R being one of these receivers.

Assume further that R has verified that using feedback mechanisms is
reasonable in the current constellation (which is highly application
specific and hence not specified in this memo).

Then, receiver R MUST use the following rules for transmitting one or
more Feedback messages as minimal or full compound RTCP packet:

Initially, R MUST set allow_early := TRUE.

R has transmitted the last RTCP RR packet at tp and has scheduled the
next transmission (prior to reconsideration) for tn.

At time t0, R detects the need to transmit one or more feedback
messages (e.g. because media "units" need to be ACKed or NACKed) and
finds that sending the feedback information is useful for the sender.

R first checks whether there is still a compound RTCP feedback packet
waiting for transmission (scheduled as early or regular RTCP packet).
If so, the new feedback message MUST be appended to the packet; the
schedule for the waiting RTCP feedback packet MUST remain unchanged.
When appending, the feedback information of several RTCP feedback
packets SHOULD be merged into as few packets as possible.

If no RTCP feedback message is already awaiting transmission, a new
(minimal) compound RTCP feedback packet MUST be created and the
minimal interval for T_dither_max MUST be chosen as follows:

i)   If the session is a unicast session (group size = 2) then
     T_dither_max := 0.

ii)  If the receiver has an RTT estimate to the originator of the
     media unit to provide feedback about, then

        T_dither_max := k * T_rtt/2 * members

     with k=1.

iii) If the receiver does not have an RTT estimate to the originator,
     then

        T_dither_max := l * T_rr

     with l=0.5.

The values given above for T_dither_max are minimal values.
Application-specific feedback considerations may make it worthwhile
to increase T_dither_max beyond this value.  This is up to the
discretion of the implementer.

Then, R MUST check whether its next regularly scheduled RTCP packet
is within the time bounds for the RTCP FB (t0 + T_dither_max > tn).

If so, an Early RTCP packet MUST NOT be scheduled; instead the FB
message(s) MUST be stored to be appended to the regular RTCP packet
scheduled for tn.

Otherwise, R MUST check whether it is allowed to transmit an Early
RTCP packet (allow_early == TRUE).

If so, R MUST schedule an Early RTCP packet for te := t0 + RND *
T_dither_max with the RND function evenly distributed between 0
and 1.

If, while waiting for te, R receives RTCP feedback packets
contained in one or more (minimal) compound RTCP packets, R MUST
act as follows for each of the RTCP feedback packets in the one or
more compound RTCP packets received:

1. If R understands the received feedback message's semantics and
   the message content is a superset of the feedback R wanted to
   send then R MUST discard its own feedback message and MUST
   re-schedule the next regular RTCP message transmission for tn (as
   calculated before).

2. If R understands the received feedback message's semantics and
   the message content is not a superset of the feedback R
   wanted to send then R SHOULD transmit its own feedback message
   as scheduled.
   If there is an overlap between the feedback information to send
   and the feedback information received, the amount of feedback
   transmitted is up to R: R MAY send its feedback information
   unchanged, or R MAY eliminate any redundancy between its own
   feedback and the feedback received so far.

3. If R does not understand the received feedback message's
   semantics, R checks whether the compound RTCP packet contains
   a Generic INFO message.  If a Generic INFO message is present
   R performs the comparison based upon this information and
   proceeds with alternative 1. or 2. above depending on the
   outcome of the comparison.  If no Generic INFO message is
   present, then R MAY send its own feedback message as an Early
   RTCP packet.  Alternatively, R MAY re-schedule the next
   regular RTCP message transmission for tn (as calculated
   before) and MAY append the feedback message to the now
   regularly scheduled RTCP message.

Refer to section 4 on the comparison of feedback messages and for
which feedback messages MUST be understood by a receiver.

Otherwise, when te is reached, R MUST transmit the RTCP packet
containing the FB message.  R then MUST set allow_early := FALSE
and MUST recalculate tn := tp + 2*T_rr.  As soon as R sends its
next regularly scheduled RTCP RR (at the new tn), it MUST set
allow_early := TRUE again.

If allow_early == FALSE then R MUST check the time for the next
scheduled RR:

1. If tn - t0 < T_max_fb_delay (i.e. if, despite late reception, the
   feedback could still be useful for the sender) then R MAY create
   an RTCP FB message for transmission along with the RTCP packet at
   tn.

2. Otherwise, R MUST discard the RTCP feedback message.

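The choice of T_dither_max and the subsequent scheduling decision
described in this section can be summarized in a short sketch.  It
is a simplified illustration, not a complete implementation: it
covers only the decision taken at time t0 and leaves out the
suppression rules that apply while waiting for te as well as the
update of tn and allow_early after transmission.  The function
names and the returned action strings are hypothetical; times are
assumed to be in seconds.

```python
def choose_t_dither_max(members, t_rtt=None, t_rr=None, k=1.0, l=0.5):
    """Return the minimal T_dither_max following rules i) to iii).

    t_rtt is the RTT estimate (None if unavailable); t_rr is the
    regular RTCP interval and is needed only when t_rtt is None.
    """
    if members == 2:
        # i) unicast session: no dithering
        return 0.0
    if t_rtt is not None:
        # ii) RTT estimate available
        return k * t_rtt / 2 * members
    # iii) no RTT estimate: derive the bound from T_rr
    return l * t_rr


def early_rtcp_action(t0, tn, t_dither_max, allow_early, t_max_fb_delay):
    """Decide how to handle feedback for an event detected at t0."""
    if t0 + t_dither_max > tn:
        # The regular packet at tn is close enough: append the FB there.
        return "append_to_regular"
    if allow_early:
        # Schedule an Early RTCP packet for te = t0 + RND * T_dither_max.
        return "schedule_early"
    if tn - t0 < t_max_fb_delay:
        # Late feedback is still useful: MAY be sent along with tn.
        return "may_send_with_regular"
    return "discard"


print(choose_t_dither_max(members=10, t_rr=4.0))          # 2.0
print(early_rtcp_action(0.0, 5.0, 0.5, True, 1.0))        # schedule_early
```

Keeping the two steps separate reflects the text above: the values
returned by choose_t_dither_max are minimal values, and an
application may pass a larger T_dither_max to the decision function.
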
583 In regular RTCP intervals as specified by [1] (except for the five 584 second minimum), a full compound RTCP packet is sent (which may also 585 contain a feedback message if one has been created according to the 586 above rules and scheduled for transmission along with the full compound 587 RTCP message). 589 Whenever an RTCP packet is sent or received -- minimal or full 590 compound, early or regularly scheduled -- the avg_rtcp_size variable 591 is updated accordingly (see [1]) and tn is calculated using the 592 new avg_rtcp_size. 594 3.6 Considerations on the Group Size 596 This section provides guidelines on the group sizes at which the 597 various feedback modes may be used. 599 3.6.1 ACK mode 601 The group size MUST be exactly two participants, i.e. point-to-point 602 communications. Unicast addresses SHOULD be used in the session 603 description. 605 For unidirectional as well as bi-directional communication between 606 two parties, 2.5% of the RTP session bandwidth is available for RTCP 607 traffic from the receivers including feedback. Assuming that out of 608 ten RTCP packets, nine are sent as minimal compound RTCP packets and 609 one as a full compound RTCP packet, in a 64kbit/s unidirectional 610 communication scenario a receiver can report 1.5 events per second 611 back to the sender, at 256kbit/s 6 events, and so forth. 613 From 1 Mbit/s upwards, a receiver would be able to acknowledge each 614 individual frame (not packet!) in a 25 fps video stream. 616 ACK strategies MUST be defined accordingly to work properly with 617 these bandwidth limitations. An indication whether or not ACKs are 618 allowed for a session and, if so, which ACK strategy should be used, 619 MAY be conveyed by out-of-band mechanisms, e.g. media-specific 620 attributes in a session description using SDP. 622 3.6.2 NACK mode 624 Negative acknowledgements (or similar types of feedback) MUST be 625 used for all groups larger than two.
Of course, NACKs MAY be used 626 for point-to-point communications as well. 628 Whether or not the use of Immediate or Early RTCP packets should be 629 considered depends upon a number of parameters including session 630 bandwidth, codec, special type of feedback, number of senders and 631 receivers, among many others. 633 The crucial parameters -- to which virtually all of the above can be 634 reduced -- are the allowed minimal interval between two RTCP reports 635 and the (average) number of events that presumably need reporting per 636 time interval (plus their distribution over time, of course). The 637 minimum interval is derived from the available RTCP bandwidth and the 638 expected average size of an RTCP packet. The number of events to 639 report (e.g. per second) may be derived from the packet loss rate and 640 the sender's rate of transmitting packets. From these two values, the 641 allowable group size for the Immediate feedback mode can be 642 calculated. 644 The upper bound for the Early RTCP mode then solely depends on the 645 acceptable quality degradation, i.e. how many events per time 646 interval may go unreported. 648 Example: If a 256kbit/s video with 30 fps is transmitted through a 649 network with an MTU size of some 1500 bytes, then, in most cases, 650 each frame would fit in its own packet leading to a packet rate of 30 651 packets per second. If 5% packet loss occurs in the network (equally 652 distributed, no inter-dependence between receivers), then each 653 receiver will have to report 3 lost packets every two seconds. 654 Assuming a single sender and more than three receivers, 655 3.75% of the session bandwidth is allocated to RTCP traffic from the receivers, i.e. 656 9.6kbit/s. Assuming further a size of 120 bytes for the average 657 compound RTCP packet allows 10 RTCP packets to be sent per second or 658 20 in two seconds.
If every receiver needs to report three packets, 659 this yields a maximum group size of 6-7 receivers if all loss events 660 shall be reported. The rules for transmission of immediate RTCP 661 packets should provide sufficient flexibility for most of this 662 reporting to occur in a timely fashion. 664 Extending this example to determine the upper bound for Early RTCP 665 mode leads to the following considerations: assume that the 666 underlying coding scheme and the application (as well as the tolerant 667 users) allow on the order of one loss without repair per two seconds. 668 Thus the number of packets to be reported by each receiver decreases 669 to two per two seconds, which increases the supportable group size to 10. 670 Assuming further that some number of packet losses are correlated, 671 feedback traffic is further reduced and group sizes of some 12 to 16 672 (maybe even 20) can be reasonably well supported using Early RTCP 673 mode. 675 3.7 Summary of decision steps 677 3.7.1 General Hints 679 Before even considering whether or not to send RTCP feedback 680 information an application has to determine whether this mechanism is 681 applicable: 683 1) An application has to decide whether -- for the current ratio of 684 packet rate with the associated (application-specific) maximum 685 feedback delay and the currently observed round-trip time (if 686 available) -- feedback mechanisms can be applied at all. 688 This decision may obviously be based upon (and dynamically revised 689 following) regular RTCP reception statistics. 691 2) The application has to decide whether -- for a certain observed 692 error rate, assigned bandwidth, frame rate, and group size -- (and 693 which) feedback mechanisms can be applied. 695 Regular RTCP provides valuable input to this step, too. 697 3) If these tests pass, the application has to follow the rules for 698 transmitting Early RTCP packets or regularly scheduled RTCP 699 packets with piggybacked feedback.
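Returning to the group-size example of section 3.6.2, the arithmetic behind the figures quoted there (6-7 receivers for Immediate feedback, 10 for Early RTCP mode) can be reproduced with a few lines of Python. This is a rough sketch under stated assumptions: the function name max_group_size and its parameters are hypothetical, each RTCP packet is assumed to report exactly one loss event, and losses are assumed to be uniformly distributed and uncorrelated.

```python
def max_group_size(session_bw_bps, packet_rate_pps, loss_rate,
                   avg_rtcp_packet_bytes=120, receiver_rtcp_share=0.0375,
                   window_s=2.0, tolerated_unreported=0.0):
    """Estimate how many receivers can report their loss events.

    Assumes one loss event per RTCP packet (an assumption of this sketch).
    """
    # RTCP bandwidth available to all receivers together, in bit/s.
    rtcp_bw_bps = session_bw_bps * receiver_rtcp_share
    # Number of RTCP packets the whole group may send within the window.
    rtcp_packets = rtcp_bw_bps / (avg_rtcp_packet_bytes * 8) * window_s
    # Loss events each receiver wants to report within the same window.
    events = packet_rate_pps * loss_rate * window_s - tolerated_unreported
    if events <= 0:
        return float("inf")   # nothing to report, no bound from this model
    return rtcp_packets / events


immediate = max_group_size(256_000, 30, 0.05)
early = max_group_size(256_000, 30, 0.05, tolerated_unreported=1)
print(round(immediate, 2), round(early, 2))
```

With the values of the example, the first call corresponds to reporting all three losses per two seconds and the second to tolerating one unreported loss per two seconds.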
701 3.7.2 Media Session Attributes 703 Media sessions are typically described using out-of-band mechanisms 704 to convey transport addresses, codec information, etc. between 705 sender(s) and receiver(s). Such a mechanism is composed of a format 706 used to describe a media session and another mechanism for 707 transporting this description. 709 In the IETF, the Session Description Protocol (SDP) is currently used 710 to describe media sessions while protocols such as SIP, SAP, RTSP, 711 and HTTP are used to convey the description. 713 A media session description format MAY include parameters to 714 indicate that RTCP feedback mechanisms are supported in this session 715 and which of the feedback mechanisms may be applied. 717 To do so, the profile "AVPF" MUST be indicated instead of "AVP". 718 Further attributes may be defined to show which type(s) of feedback 719 are supported. 721 Section 4 contains the syntax specification to support RTCP feedback 722 with SDP. Similar specifications for other media session description 723 formats are outside the scope of this specification. 725 4. SDP Definitions 726 This section defines a number of additional SDP parameters that are 727 used to describe a session. All of these are defined as media level 728 attributes. 730 4.1 Profile identification 732 The AV profile defined in [4] is referred to as "AVP" in the context 733 of e.g. the Session Description Protocol (SDP) [3]. The profile 734 specified in this document is referred to as "AVPF". 736 Feedback information following the modified timing rules as specified 737 in this document MUST NOT be sent for a particular media session 738 unless the profile for this session indicates the use of the "AVPF" 739 profile. 741 4.2 RTCP Feedback Capability Attribute 743 A new payload format-specific SDP attribute (for use with "a=fmtp:") 744 is defined to indicate the capability of using RTCP feedback as 745 specified in this document: "rtcp-fb".
The "rtcp-fb" attribute MAY 746 only be used as an SDP media attribute and MUST NOT be provided at 747 the session level. The rtcp-fb attribute MUST only be used in media 748 sessions for which the "AVPF" is specified. 750 The rtcp-fb attribute is used to indicate which RTCP feedback 751 messages MAY be used in this media session for the indicated payload 752 type. If several types of feedback are supported, several a=rtcp-fb: 753 lines MUST be used. 755 If no rtcp-fb attribute is specified the RTP receivers SHOULD assume 756 that the RTP senders only support generic NACKs. In addition, the 757 RTP receivers MAY send feedback using other suitable RTCP feedback 758 packets as defined for the respective media type. The RTP receivers 759 MUST NOT rely on the RTP senders reacting to any of the feedback 760 messages. 762 If one or more rtcp-fb attributes are present in a media session 763 description, the RTP receivers for the media session(s) containing 764 the "rtcp-fb" 766 . MUST ignore all rtcp-fb attributes of which they do not fully 767 understand the semantics (i.e. understand the meaning of all 768 values in the a=fmtp:rtcp-fb line); 770 . SHOULD provide feedback information as specified in this document 771 using any of the RTCP feedback packets as specified in one of the 772 rtcp-fb attributes for this media session; and 774 . MUST NOT use other feedback messages than those listed in one of 775 the rtcp-fb attribute lines. 777 RTP senders MUST be prepared to receive any kind of RTCP feedback 778 messages and MUST silently discard all those RTCP feedback messages 779 that they do not understand. 
781 The syntax of the rtcp-fb attribute is as follows (the feedback types 782 and optional parameters are all case sensitive): 784 rtcp-fb-syntax = "a=fmtp:" WS "rtcp-fb" WS rtcp-fb-value 786 rtcp-fb-value = "ack" rtcp-fb-param 787 | "nack" rtcp-fb-nack-param 788 | rtcp-fb-id rtcp-fb-param 790 rtcp-fb-id = 1*(alpha-numeric | "-" | "_") 792 rtcp-fb-param = "app" 793 | byte-string 794 | ; empty 796 rtcp-fb-nack-param = "pli" 797 | "sli" 798 | "rpsi" 799 | "app" 800 | byte-string 801 | ; empty 803 The literals of the above grammar have the following semantics: 805 Feedback type "ack": 807 This feedback type indicates that positive acknowledgements for 808 feedback are supported. 810 The feedback type "ack" MUST only be used if the media session 811 is allowed to operate in ACK mode as defined in 3.6.1.2. 813 Parameters may be provided to further distinguish different 814 types of positive acknowledgement feedback. If no parameters 815 are present, the Generic ACK as specified in section 4.1.2 is 816 implied. 818 If the parameter "app" is specified, this indicates the use of 819 application layer feedback. In this case, additional parameters 820 following "app" MAY be used to further differentiate various 821 types of application layer feedback. This document does not 822 define any parameters specific to "app". 824 Further parameters for "ack" MAY be defined in other documents. 826 Feedback type "nack": 828 This feedback type indicates that negative acknowledgements for 829 feedback are supported. 831 The feedback type "nack", without parameters, indicates use of 832 the General NACK feedback format as defined in section 4.2.1. 834 The following three parameters are defined in this document for 835 use with "nack" in conjunction with the media type "video": 837 . "pli" indicates the use of Picture Loss Indication feedback 838 as defined in section 4.3.1. 839 . "sli" indicates the use of Slice Loss Indication feedback as 840 defined in section 4.3.2. 841 . 
"rpsi" indicates the use of Reference Picture Selection 842 Indication feedback as defined in section 4.3.3. 843 . "app" indicates the use of application layer feedback. 844 Additional parameters after "app" MAY be provided to 845 differentiate different types of application layer feedback. 846 No parameters specific to "app" are defined in this document. 848 Further parameters for "nack" MAY be defined in other documents. 850 Other feedback types : 852 Other documents MAY define additional types of feedback; to keep 853 the grammar extensible for those cases, the rtcp-fb-id is 854 introduced as a placeholder. A new feedback scheme name needs 855 to be unique (and thus has to be registered with IANA). Along 856 with a new name, its semantics, packet formats (if necessary), 857 and rules for its operation need to be specified. 859 Note that it is assumed that more specific information about 860 application layer feedback (as defined in section 4.2.3) will be 861 conveyed as feedback types and parameters defined elsewhere. Hence, 862 no further provision for any types and parameters is made in this 863 document. 865 Further types of feedback as well as further parameters may be 866 defined in other documents. 868 It is up to the recipients whether or not they send feedback 869 information and up to the sender(s) to make use of feedback provided. 871 4.3 Unicasting 873 If an m= line in the SDP describing a session indicates unicast 874 addresses for a particular media type (and does not operate in multi- 875 unicast mode with all recipients listed explicitly but still 876 addressed via unicast), the RTCP feedback MAY operate in ACK feedback 877 mode. 879 4.4 RTCP Bandwidth Modifiers 881 The standard RTCP bandwidth assignments as defined in [1] and [2] may 882 be overridden by bandwidth modifiers as specified in [4]: b=RS: 883 and b=RR: MAY be used to assign a different bandwidth (measured 884 in bits per second) to RTP senders and receivers, respectively. 
The 885 precedence rules of [4] apply to determine the actual bandwidth to be 886 used by senders and receivers. 888 Applications operating knowingly over highly asymmetric links (such 889 as satellite links) SHOULD use this mechanism to reduce the feedback 890 rate for high bandwidth streams to prevent deterministic congestion 891 of the feedback path(s). 893 4.5 Examples 895 Example 1: The following session description indicates a session made 896 up of an audio and a DTMF stream for point-to-point communication in which 897 the DTMF stream uses Generic ACKs. This session description could be 898 contained in a SIP INVITE, 200 OK, or ACK message to indicate that 899 its sender is capable of and willing to receive feedback for the DTMF 900 stream it transmits. 902 v=0 903 o=alice 3203093520 3203093520 IN IP4 host.example.com 904 s=Media with feedback 905 t=0 0 906 c=IN IP4 host.example.com 907 m=audio 49170 RTP/AVPF 0 96 908 a=rtpmap:0 PCMU/8000 909 a=rtpmap:96 telephone-event/8000 910 a=fmtp:96 0-16 911 a=fmtp:96 rtcp-fb ack 913 Example 2: The following session description indicates a multicast 914 video-only session (using H.263+) with the video source accepting 915 Generic NACKs and Reference Picture Selection. Such a description 916 may have been conveyed using the Session Announcement Protocol (SAP). 918 v=0 919 o=alice 3203093520 3203093520 IN IP4 host.example.com 920 s=Multicast video with feedback 921 t=3203130148 3203137348 922 m=audio 49170 RTP/AVP 0 923 c=IN IP4 224.2.1.183 924 a=rtpmap:0 PCMU/8000 925 m=video 51372 RTP/AVPF 98 926 c=IN IP4 224.2.1.184 927 a=rtpmap:98 H263-1998/90000 928 a=fmtp:98 rtcp-fb nack 929 a=fmtp:98 rtcp-fb nack rpsi 931 5. Interworking and Co-Existence of AVP and AVPF Entities 933 The AVPF profile defined in this document is an extension of the AVP 934 profile as defined in [2]. Both profiles follow the same basic rules 935 (including the upper bandwidth limit for RTCP and the bandwidth 936 assignments to senders and receivers).
Therefore, senders and 937 receivers using either of the two profiles can be mixed in a 938 single session. 940 AVP and AVPF are defined in a way that, from a robustness point of 941 view, the RTP entities do not need to be aware of entities of the 942 respective other profile: they will not disturb each other's 943 functioning. However, the quality of the media presented may suffer. 945 The following considerations apply to senders and receivers when used 946 in a combined session. 948 . AVP entities (senders and receivers) 950 AVP senders will receive RTCP feedback packets from AVPF receivers 951 and ignore these packets. They will see occasional closer spacing 952 of RTCP messages (e.g. violating the 5s rule) by AVPF entities. 953 As the overall bandwidth constraints are adhered to by both types 954 of entities, they will still get their share of the RTCP 955 bandwidth. However, while AVP entities are bound by the 5s rule, 956 depending on the group size and session bandwidth, AVPF entities 957 may provide more frequent RTCP reports than AVP ones will. Also, 958 the overall reporting rate may decrease slightly as AVPF entities 959 may send bigger RTCP packets (due to the extra fields). 961 . AVPF senders 963 AVPF senders will receive feedback information only from AVPF 964 receivers. If they rely on feedback to provide the target media 965 quality, the quality achieved for AVP receivers may be sub- 966 optimal. 968 . AVPF receivers 970 AVPF receivers SHOULD send immediate or early RTCP feedback 971 packets only if all (sending) entities in the media session 972 support AVPF. AVPF receivers MAY send feedback information as 973 part of regularly scheduled compound RTCP packets following the 974 timing rules of [1] and [2] also in media sessions operating in 975 mixed mode. In this case, however, the receiver providing 976 feedback MUST NOT rely on the sender reacting to the feedback at 977 all. 979 6.
Format of RTCP Feedback Messages 981 This section defines the format of the low delay RTCP feedback 982 messages. These messages are classified into three categories as 983 follows: 985 - Transport layer feedback messages 986 - Payload-specific feedback messages 987 - Application layer feedback messages 988 Transport layer feedback messages are intended to transmit general 989 purpose feedback information, i.e. information independent of the 990 particular codec or the application in use. The information is 991 expected to be generated and processed at the transport/RTP layer. 992 Currently, only a generic positive acknowledgement (ACK) and a generic negative 993 acknowledgement (NACK) message are defined. 995 Payload-specific feedback messages transport information that is 996 specific to a certain payload and will be generated and acted upon at 997 the codec "layer". This document defines a common header to be used 998 in conjunction with all payload-specific feedback messages. The 999 definition of specific messages is left to either RTP Payload Format 1000 specifications or to additional feedback format documents. 1002 Application layer feedback messages provide a means to transparently 1003 convey feedback from the receiver's to the sender's application. The 1004 information contained in such a message is not expected to be acted 1005 upon at the transport/RTP or the codec layer. The data to be 1006 exchanged between two application instances is usually defined in the 1007 application protocol's specification and thus can be identified by 1008 the application so that there is no need for additional external 1009 information. Hence, this document defines only a common header to be 1010 used along with all application layer feedback messages. From a 1011 protocol point of view, an application layer feedback message is 1012 treated as a special case of a payload-specific feedback message.
1014 This document defines two transport layer feedback and three (video) 1015 payload-specific feedback messages as well as a container for 1016 application layer feedback messages. Additional transport layer and 1017 payload-specific feedback messages may be defined in other documents 1018 and are registered through IANA (see section IANA considerations). 1020 The general syntax and semantics for the above RTCP feedback message 1021 types are described in the following subsections. 1023 6.1 Common Packet Format for Feedback Messages 1025 All feedback messages share a common packet format that is depicted in 1026 figure 3: 1028 0 1 2 3 1029 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1030 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1031 |V=2|P|0| FMT | PT | length | 1032 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1033 | SSRC of packet sender | 1034 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1035 | SSRC of media source | 1036 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1037 : Feedback Control Information (FCI) : 1038 : : 1040 Figure 3: Common Packet Format for Feedback Messages 1041 The various fields V, P, SSRC and length are defined in the RTP 1042 specification [2], the respective meaning being summarized below: 1044 version (V): 2 bits 1045 This field identifies the RTP version. The current version is 2. 1047 padding (P): 1 bit 1048 If set, the padding bit indicates that the packet contains 1049 additional padding octets at the end which are not part of the 1050 control information but are included in the length field. 1052 Feedback message type (FMT): 4 bits 1053 This field identifies the type of the feedback message and is 1054 interpreted relative to the RTCP message type (transport, 1055 payload-specific, or application feedback). The values for each 1056 of the three feedback types are defined in the respective 1057 sections below.
1059 Payload type (PT): 8 bits 1060 This is the RTCP packet type which identifies the packet as being 1061 an RTCP Feedback Message. Two values are defined (to be assigned by IANA): 1063 Name | Value | Brief Description 1064 ----------+-------+-------------------------------------- 1065 RTPFB | 2xx | Transport layer feedback message 1066 PSFB | 2xy | Payload-specific feedback message 1068 Length: 16 bits 1069 The length of this packet in 32-bit words minus one, including 1070 the header and any padding. This is in line with the definition 1071 of the length field used in RTCP sender and receiver reports [3]. 1073 SSRC of packet sender: 32 bits 1074 The synchronization source identifier for the originator of this 1075 packet. 1077 SSRC of media source: 32 bits 1078 The synchronization source identifier of the media source that 1079 this piece of feedback information is related to. 1081 Feedback Control Information (FCI): variable length 1082 The following three sections define which additional information 1083 is included in the feedback message for each type of feedback. 1084 Each RTCP feedback packet MUST contain exactly one FCI field of 1085 the types defined in sections 6.2 and 6.3. If multiple FCI 1086 fields (even of the same type) need to be conveyed, then several 1087 RTCP feedback packets MUST be generated and concatenated in the 1088 same compound RTCP packet. 1090 6.2 Transport Layer Feedback Messages 1092 Transport Layer Feedback messages are identified by the value RTPFB 1093 as RTCP message type. 1095 Two general purpose transport layer feedback messages are defined so 1096 far: Generic ACK and Generic NACK. They are identified by means of 1097 the FMT parameter as follows: 1099 0: forbidden 1100 1: Generic NACK 1101 2: Generic ACK 1102 3: Generic INFO 1103 4-15: reserved 1105 The following subsections define the packet formats for these 1106 messages. 1108 6.2.1 Generic NACK 1110 The Generic NACK message is identified by PT=RTPFB and FMT=1.
1112 The Generic NACK packet is used to indicate the loss of one or more 1113 RTP packets. The lost packet(s) are identified by means of a 1114 packet identifier and a bit mask. 1116 The Feedback control information (FCI) field has the following 1117 syntax (figure 4): 1119 0 1 2 3 1120 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1121 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1122 | PID | BLP | 1123 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1125 Figure 4: Syntax for the Generic NACK message 1127 Packet ID (PID): 16 bits 1128 The PID field is used to specify a lost packet. Typically, the 1129 RTP sequence number is used for PID as the default format, but 1130 RTP Payload Formats may decide to identify a packet differently. 1132 bitmask of following lost packets (BLP): 16 bits 1133 The BLP allows for reporting losses of any of the 16 RTP packets 1134 immediately following the RTP packet indicated by the PID. The 1135 BLP's definition is identical to that given in [10]. Denoting 1136 the BLP's least significant bit as bit 1, and its most 1137 significant bit as bit 16, then bit i of the bit mask is set to 1 1138 if the receiver has not received RTP packet number PID+i (modulo 1139 2^16) and the receiver decides this packet is lost; bit i is set 1140 to 0 otherwise. Note that the sender MUST NOT assume that a 1141 receiver has received a packet because its bit in the mask was set to 0. 1142 For example, the least significant bit of the BLP would be set to 1143 1 if the packet corresponding to the PID and the following packet 1144 have been lost. However, the sender cannot infer that packets 1145 PID+2 through PID+16 have been received simply because bits 2 1146 through 16 of the BLP are 0; all the sender knows is that the 1147 receiver has not reported them as lost at this time. 1149 6.2.2 Generic ACK 1151 The Generic ACK message is identified by PT=RTPFB and FMT=2.
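As an illustration of the common packet format of section 6.1 and the Generic NACK FCI of section 6.2.1, the following Python sketch assembles a complete Generic NACK packet. It is a sketch under stated assumptions and not a reference implementation: the function names pid_blp and generic_nack_packet are hypothetical, the PID is taken to be the RTP sequence number, the padding bit is always 0, and the RTCP packet type for RTPFB is passed in by the caller because its value is still to be assigned.

```python
import struct

FMT_GENERIC_NACK = 1   # FMT value for Generic NACK (section 6.2)


def pid_blp(pid, also_lost=()):
    """Build the PID/BLP pair for one lost packet plus up to 16 followers."""
    blp = 0
    for seq in also_lost:
        i = (seq - pid) % 65536          # offset from PID, modulo 2^16
        if not 1 <= i <= 16:
            raise ValueError("sequence number not within PID+1..PID+16")
        blp |= 1 << (i - 1)              # bit 1 is the least significant bit
    return pid, blp


def generic_nack_packet(pt, sender_ssrc, media_ssrc, pid, also_lost=()):
    """Serialize header (V=2, P=0, reserved 0 bit, FMT) plus one FCI field."""
    pid, blp = pid_blp(pid, also_lost)
    fci = struct.pack("!HH", pid, blp)
    # Length in 32-bit words minus one, header included (12 header octets).
    length = (12 + len(fci)) // 4 - 1
    first_octet = (2 << 6) | FMT_GENERIC_NACK
    header = struct.pack("!BBHII", first_octet, pt, length,
                         sender_ssrc, media_ssrc)
    return header + fci
```

A receiver that lost packets 100, 101 and 103 would call generic_nack_packet with pid=100 and also_lost=[101, 103], which sets bits 1 and 3 of the BLP.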
1153 The Generic ACK packet is used to indicate that one or several RTP 1154 packets were received correctly. The received packet(s) are 1155 identified by means of a packet identifier and a bit mask. 1156 ACKing of a range of consecutive packets is also possible. 1158 The Feedback control information (FCI) field has the following 1159 syntax: 1161 0 1 2 3 1162 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1163 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1164 | PID |R| BLP/#packets | 1165 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1167 Figure 5: Syntax for the Generic ACK message 1169 Packet ID (1st PID): 16 bits 1170 This PID field is used to specify a correctly received packet. 1171 Typically, the RTP sequence number is used for PID as the default 1172 format, but RTP Payload Formats may decide to identify a packet 1173 differently. 1175 Range of ACKs (R): 1 bit 1176 The R-bit indicates that a range of consecutive packets was 1177 received correctly. If R=1 then the PID field specifies the 1178 first packet of that range and the next field (BLP/#packets) 1179 carries the number of additional packets being acknowledged. If R=0 then PID 1180 specifies the first packet to be acknowledged and BLP/#packets 1181 provides a bit mask to selectively indicate individual packets 1182 that are acknowledged. 1184 Bit mask of lost packets (BLP)/#packets (PID): 15 bits 1185 The semantics of this field depends on the value of the R-bit. 1187 If R=1, this field is used to identify the number of additional 1188 packets to be acknowledged: 1190 #packets = (PID of last ACKed packet) - (1st PID) 1192 That is, #packets MUST indicate the number of packets to be ACKed 1193 minus one. In particular, if only a single packet is to be ACKed 1194 and R=1 then #packets MUST be set to 0x0000.
1196 Example: If all packets between and including PIDx=380 and PIDy = 1197 422 have been received, the Generic ACK would contain PID = PIDx 1198 = 380 and #packets = PIDy - PID = 42. In case the PID wraps 1199 around, modulo arithmetic is used to calculate the number of 1200 packets. 1202 If R=0, this field carries a bit mask. The BLP allows for 1203 reporting reception of any of the 15 RTP packets immediately 1204 following the RTP packet indicated by the PID. The BLP's 1205 definition is identical to that given in [10] except that, here, 1206 BLP is only 15 bits wide. Denoting the BLP's least significant 1207 bit as bit 1, and its most significant bit as bit 15, then bit i 1208 of the bitmask is set to 1 if the receiver has received RTP packet 1209 number PID+i (modulo 2^16) and the receiver decides to ACK this 1210 packet; bit i is set to 0 otherwise. If only the packet 1211 indicated by PID is to be ACKed and R=0 then BLP MUST be set to 1212 0x0000. 1214 6.2.3 Generic INFO 1216 The Generic INFO message is identified by PT=RTPFB and FMT=3. 1218 The Generic INFO packet MUST only be used in conjunction with an 1219 application-specific feedback message. The Generic INFO message 1220 indicates which RTP packets the payload-specific message is about. 1221 The packet(s) in question are identified by means of a packet 1222 identifier and a bit mask. 1224 The sole purpose of the Generic INFO packet is to avoid unnecessary 1225 feedback suppression when payload-specific feedback messages are 1226 mixed with generic ones. 1228 The packet format is the same as for the Generic NACK message defined 1229 in section 6.2.1. 1231 6.3 Payload-Specific Feedback Messages 1233 Payload-Specific Feedback Messages are identified by the value PSFB 1234 as RTCP message type. 1236 Three payload-specific feedback messages are defined so far.
They 1237 are identified by means of the FMT parameter as follows: 1239 0: forbidden 1240 1: Picture Loss Indication (PLI) 1241 2: Slice Lost Indication (SLI) 1242 3: Reference Picture Selection Indication (RPSI) 1243 4-14: reserved 1244 15: Application layer feedback message 1246 The following subsections define the packet formats for these 1247 messages. 1249 AVPF entities MUST include Generic INFO messages along with any 1250 payload-specific ones in compound RTCP packets (early as well as 1251 regularly scheduled ones). The INFO message(s) MUST cover all the 1252 RTP packets to which the payload-specific message(s) apply. This 1253 prevents AVPF entities that do not understand the payload- 1254 specific messages from unnecessarily suppressing their feedback messages. 1256 6.3.1 Picture Loss Indication (PLI) 1258 The PLI feedback message is identified by PT=PSFB and FMT=1. 1260 6.3.1.1 Semantics 1262 With the Picture Loss Indication message a decoder informs the 1263 encoder about the loss of one or more full pictures. 1265 6.3.1.2 Message Format 1267 PLI does not require parameters. Therefore, the length field MUST be 1268 2, and there MUST NOT be any Feedback Control Information. 1270 6.3.1.3 Timing Rules 1272 The timing follows the rules outlined in section 3. In systems that 1273 employ both PLI and other types of feedback it may be advisable to 1274 follow the regular RTCP RR timing rules for PLI, since PLI is not as 1275 delay critical as other FB types. 1277 6.3.1.4 Remarks 1279 PLI messages typically trigger the sending of full Intra pictures. 1280 Intra pictures are several times larger than predicted (Inter) 1281 pictures. Their size is independent of the time they are generated. 1282 In most environments, especially when employing bandwidth-limited 1283 links, the use of an Intra picture implies an allowed delay that is a 1284 significant multiple of the typical frame duration.
An example: If 1285 the sending frame rate is 10 fps, and an Intra picture is assumed to 1286 be 10 times as big as an Inter picture (not an unrealistic 1287 assumption, see [14] for details), then a full second of latency has 1288 to be accepted. In such an environment there is no need for a 1289 particularly short delay in sending the feedback message. Hence 1290 waiting for the next possible time slot allowed by RTCP timing rules 1291 as per [2] does not have a negative impact on the system performance. 1293 6.3.2 Slice Lost Indication (SLI) 1295 The SLI feedback message is identified by PT=PSFB and FMT=2. 1297 6.3.2.1 Semantics 1299 With the Slice Lost Indication a decoder can inform an encoder that 1300 it was unable to decode one, or several consecutive, macroblocks. 1301 The encoder can take appropriate action in order to re-synchronize 1302 encoder and decoder by means of its choice, typically by sending the 1303 lost macroblocks in Intra mode. This feedback message SHALL NOT be 1304 used for video codecs with non-uniform, dynamically changeable 1305 macroblock sizes such as H.263 with Annex Q enabled. In such a case, 1306 an encoder cannot always identify the corrupted spatial region. 1308 6.3.2.2 Format 1310 When FMT indicates a Slice Lost Indication, then there is one 1311 additional FCI field the content of which is depicted in figure 6. 1312 The length of the feedback message MUST be set to 3. 1314 0 1 2 3 1315 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1316 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1317 | First | Number | TR | 1318 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1320 Figure 6: Syntax of the Slice Lost Indication (SLI) 1322 First: 13 bits 1323 The macroblock (MB) address of the first lost macroblock.
   The MB numbering is done such that the macroblock in the upper
   left corner of the picture is considered macroblock number 1 and
   the number for each macroblock increases from left to right and
   then from top to bottom in raster-scan order (such that if there
   is a total of N macroblocks in a picture, the bottom right
   macroblock is considered macroblock number N).

Number: 13 bits
   The number of lost macroblocks, in scan order as discussed above.

TR: 6 bits
   The six least significant bits of the Temporal Reference of the
   picture.

6.3.2.3 Timing Rules

The efficiency of algorithms using the Slice Lost Indication is
reduced greatly when the Indication is not transmitted in a timely
fashion. Motion compensation propagates corrupted pixels that are
not reported as being corrupted. Therefore, the use of the algorithm
discussed in section 3 is highly recommended.

6.3.2.4 Remarks

The First field of the FCI defines the first macroblock of a picture
as 1 and not, as one might expect, as 0. This was done to align
this specification with the comparable mechanism available in H.245.
The maximum number of macroblocks in a picture (2**13 or 8192)
corresponds to the maximum picture sizes of the ITU-T and ISO/IEC
video codecs. If future video codecs offer larger picture sizes
and/or smaller macroblock sizes, then an additional feedback message
has to be defined. The six least significant bits of the Temporal
Reference field are deemed to be sufficient to indicate the picture
in which the loss occurred.

Algorithms have been reported that keep track of the regions affected
by motion compensation, in order to allow for a transmission of Intra
macroblocks to all those areas, regardless of the timing of the FB
(see H.263 (2000) Appendix I [13] and [15]).
When those algorithms are used, the timing of the FB is less critical
than without them. However, those algorithms correct large parts of
the picture and, therefore, have to transmit many more bits in case
of delayed FBs.

6.3.3 Reference Picture Selection Indication (RPSI)

The RPSI feedback message is identified by PT=PSFB and FMT=3.

6.3.3.1 Semantics

Modern video coding standards such as MPEG-4 visual version 2 [12] or
H.263 version 2 [13] allow the use of older reference pictures than
the most recent one. Typically, a first-in-first-out queue of
reference pictures is maintained. If an encoder has learned about a
loss of encoder-decoder synchronicity, a known-as-correct reference
picture can be used. As this reference picture is temporally further
away than usual, the resulting predictively coded picture will use
more bits.

Both MPEG-4 and H.263 define a binary format for the "payload" of an
RPSI message that includes information such as the temporal ID of the
damaged picture and the size of the damaged region. This bit string
is typically small -- a couple of dozen bits --, of variable length,
and self-contained, i.e. it contains all information that is
necessary to perform reference picture selection.

Note that both MPEG-4 and H.263 allow the use of RPSI with positive
feedback information as well. That is, all correctly decoded
pictures are reported. Positive feedback of any form MUST NOT be
used in a multicast environment (reporting positive feedback about
individual reference pictures at RTCP intervals is not expected to be
of much use anyway). For point-to-point communication, positive
feedback MAY be used but, again, the bit rate budget of RTCP feedback
will prevent its use in most scenarios.
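
As an illustration of the SLI format defined in section 6.3.2.2, the
following Python sketch packs and unpacks the 32-bit FCI word (First,
Number, TR). It is a minimal sketch under stated assumptions, not a
normative implementation: the names pack_sli and unpack_sli are
hypothetical, the requirement that Number be at least 1 is an
assumption, and only the FCI word is handled, not the common feedback
message header.

```python
import struct

FIRST_BITS, NUMBER_BITS, TR_BITS = 13, 13, 6


def pack_sli(first, number, tr):
    """Pack First (13 bits), Number (13 bits) and TR (6 bits) into 4 bytes."""
    # Macroblock addresses start at 1 in this draft; 13 bits hold at most 8191.
    if not 1 <= first < (1 << FIRST_BITS):
        raise ValueError("First does not fit the 13-bit field")
    # Assumption: an SLI always reports at least one lost macroblock.
    if not 1 <= number < (1 << NUMBER_BITS):
        raise ValueError("Number does not fit the 13-bit field")
    # Only the six least significant bits of the Temporal Reference are sent.
    tr &= (1 << TR_BITS) - 1
    word = (first << (NUMBER_BITS + TR_BITS)) | (number << TR_BITS) | tr
    return struct.pack("!I", word)  # one 32-bit word in network byte order


def unpack_sli(fci):
    """Return (first, number, tr) from a 4-byte SLI FCI word."""
    (word,) = struct.unpack("!I", fci)
    first = word >> (NUMBER_BITS + TR_BITS)
    number = (word >> TR_BITS) & ((1 << NUMBER_BITS) - 1)
    tr = word & ((1 << TR_BITS) - 1)
    return first, number, tr
```

For example, unpack_sli(pack_sli(1, 3, 5)) returns (1, 3, 5). A
Temporal Reference wider than six bits is truncated by pack_sli, in
line with the TR field definition.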

6.3.3.2 Format

When FMT indicates an RPSI, the length field is set to the number of
bits of the following bit string that contains the RPS information.
This bit string follows byte aligned in the FCI field. Bit padding
is used to achieve 32-bit word alignment of the FCI message (and the
whole packet).

6.3.3.3 Timing Rules

RPS is even more delay critical than algorithms using SLI. This is
due to the fact that the older the RPS message is, the more bits the
encoder has to spend to achieve encoder-decoder synchronicity. See
[14] and [15] for some information about the overhead of RPS for
certain bit rate/frame rate/loss rate scenarios.

Therefore, RPS messages should typically be sent as soon as possible,
employing the algorithm of section 3.

6.4 Application Layer Feedback Messages

Application Layer Feedback Messages are a special case of
payload-specific messages and are identified by PT=PSFB and FMT=15.

These messages are used to transport application defined data
directly from the receiver's to the sender's application. The data
that is transported is not identified by the feedback message.
Therefore, the application must be able to identify the message
payload.

Usually applications define their own set of messages, e.g. NEWPRED
messages in MPEG-4 or feedback messages in H.263/Annex N, U. These
messages do not need any additional information from the RTCP
message. Thus, the application message is simply placed into the FCI
field as follows and the length field is set accordingly.

Application Message (FCI): variable length
   This field contains the original application message that should
   be transported from the receiver to the source. The format is
   application dependent. The length of this field is variable. If
   the application data is not four-byte aligned, padding must be
   added.

7. Early Feedback and Congestion Control

In the previous sections, the feedback messages were defined as well
as the timing rules according to which to send these messages. The
way to react to the feedback received depends on the application
using the feedback mechanisms and hence is beyond the scope of this
document.

However, across all applications, there is a common requirement for
(TCP-friendly) congestion control on the media stream as defined in
[1] and [2] when operating in a best-effort network environment.

Low delay feedback supports the use of congestion control algorithms
in two ways:

   .  The potentially more frequent RTCP messages allow the sender to
      monitor the network state more closely than with regular RTCP
      and therefore enable reacting to upcoming congestion in a more
      timely fashion.

   .  The feedback messages themselves may convey additional
      information as input to congestion control algorithms and thus
      improve reaction over conventional RTCP. (For example,
      ACK-based feedback may even allow closed-loop algorithms to be
      constructed, and NACK-based systems may provide further
      information on the packet loss distribution.)

A congestion control algorithm that shares the available bandwidth
fairly with competing TCP connections, e.g. TFRC [16], SHOULD be used
to determine the data rate for the media stream (if the low delay RTP
session is transmitted in a best effort environment).

RTCP feedback messages or RTCP SR/RR packets that indicate recent
packet loss MUST NOT lead to a (mid-term) increase in the
transmission data rate and SHOULD lead to a (short-term) decrease of
the transmission data rate. Such messages SHOULD cause the sender to
adjust the transmission data rate to the order of the throughput TCP
would achieve under similar conditions (e.g. using TFRC).
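
The rate rules of this section (a report of recent loss never raises
the rate, a loss-free report may raise it toward the TCP-friendly
rate) can be sketched in Python as follows. This is only an
illustration under stated assumptions: the throughput equation and
the defaults b = 1 and t_RTO = 4 * RTT are taken from the published
TFRC specification rather than from this document, the function names
are hypothetical, and a real sender would smooth its rate changes
instead of jumping to the target.

```python
import math


def tcp_friendly_rate(packet_size, rtt, loss_rate, b=1):
    """TFRC throughput equation; result in bytes per second.

    packet_size in bytes, rtt in seconds, loss_rate in (0, 1].
    """
    p = loss_rate
    t_rto = 4 * rtt  # assumed default retransmission timeout
    denominator = (rtt * math.sqrt(2 * b * p / 3)
                   + t_rto * 3 * math.sqrt(3 * b * p / 8)
                   * p * (1 + 32 * p * p))
    return packet_size / denominator


def next_rate(current_rate, loss_reported, packet_size, rtt, loss_rate):
    """Pick the next sending rate (bytes per second) after a report."""
    if loss_rate <= 0:
        # The equation is undefined without a loss estimate: hold the rate.
        return current_rate
    fair = tcp_friendly_rate(packet_size, rtt, loss_rate)
    if loss_reported:
        # Recent loss: never increase, drop to the fair rate if above it.
        return min(current_rate, fair)
    # No recent loss: the sender may move up to the fair rate.
    return max(current_rate, fair)
```

With 1000-byte packets, a round-trip time of 100 ms and a loss rate
of 1%, a sender at 200000 bytes per second that receives a loss
report is moved down to the fair rate, while a sender already below
the fair rate keeps its current rate.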

RTCP feedback messages or RTCP SR/RR packets that indicate no recent
packet loss MAY cause the sender to increase the transmission data
rate to roughly the throughput TCP would achieve under similar
conditions (e.g. using TFRC).

8. Security Considerations

RTP packets transporting information with the proposed payload format
are subject to the security considerations discussed in the RTP
specification [1] and in the RTP/AVP profile specification [2].
This profile does not specify any different security services.

This profile modifies the timing behavior of RTCP, eliminates the
minimum RTCP interval of 5 seconds, and allows for earlier feedback
to be provided by receivers. This approach does not increase the
potential for denial-of-service attacks beyond those discussed in [1]
and [2].

Feedback information is suppressed if unknown RTCP feedback packets
are received. This introduces the risk of a malicious group member
eliminating all early feedback by simply transmitting
payload-specific RTCP feedback packets with random contents that are
recognized neither by any receiver (so they will suppress feedback)
nor by the sender (so no repair actions will be taken).

A malicious group member can also report arbitrarily high loss rates
in the feedback information to make the sender throttle the data
transmission and increase the amount of redundancy information or
take other action to deal with the purported packet loss. This may
result in a degradation of the quality of the reproduced media
stream.

Finally, a malicious group member can act as a large number of group
members and thereby obtain an artificially large share of the early
feedback bandwidth and reduce the reactivity of the other group
members -- possibly even causing them to no longer operate in
immediate or early feedback mode and thus undermining the whole
purpose of this profile.

9. IANA Considerations

The feedback profile as an extension to the profile for audio-visual
conferences with minimal control needs to be registered: "RTP/AVPF".

For the Session Description Protocol, the following "fmtp:" attribute
needs to be registered: "rtcp-fb".

Along with "rtcp-fb", the feedback types "ack" and "nack" need to be
registered.

Along with "nack", the feedback type parameters "sli", "pli", and
"rpsi" need to be registered.

Two RTCP Control Packet Types need to be registered: one for the
class of transport layer feedback messages ("RTPFB") and one for the
class of payload-specific feedback messages ("PSFB").

Within the RTPFB range, three format (FMT) values need to be
registered:

   0:     forbidden
   1:     General NACK
   2:     General ACK

Within the PSFB range, five format (FMT) values need to be
registered:

   0:     forbidden
   1:     Picture Loss Indication (PLI)
   2:     Slice Lost Indication (SLI)
   3:     Reference Picture Selection Indication (RPSI)
   15:    Application layer feedback (AFB)

10. Acknowledgements

This document is a product of the Audio-Visual Transport (AVT)
Working Group of the IETF. The authors would like to thank Steve
Casner and Colin Perkins for their comments and suggestions as well
as for their responsiveness to numerous questions.

11. Full Copyright Statement

Copyright (C) The Internet Society (2001). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works.

However, this document itself may not be modified in any way, such as
by removing the copyright notice or references to the Internet
Society or other Internet organizations, except as needed for the
purpose of developing Internet standards in which case the procedures
for copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."

12. Authors' Addresses

Jörg Ott                              {sip,mailto}:jo@tzi.org
Universität Bremen TZI
MZH 5180
Bibliothekstr. 1
D-28359 Bremen
Germany

Stephan Wenger                        stewe@cs.tu-berlin.de
TU Berlin
Sekr. FR 6-3
Franklinstr. 28-29
D-10587 Berlin
Germany

Shigeru Fukunaga
Oki Electric Industry Co., Ltd.
1-2-27 Shiromi, Chuo-ku, Osaka 540-6025 Japan
Tel.  +81 6 6949 5101
Fax.  +81 6 6949 5108
Mail  fukunaga444@oki.com

Noriyuki Sato
Oki Electric Industry Co., Ltd.
1-2-27 Shiromi, Chuo-ku, Osaka 540-6025 Japan
Tel.  +81 6 6949 5101
Fax.  +81 6 6949 5108
Mail  sato652@oki.com

Koichi Yano
FastForward Networks,
75 Hawthorne St. #601
San Francisco, CA 94105
Tel.  +1.415.430.2500

Akihiro Miyazaki
Matsushita Electric Industrial Co., Ltd
1006, Kadoma, Kadoma City, Osaka, Japan
Tel.  +81-6-6900-9192
Fax.  +81-6-6900-9193
Mail  akihiro@isl.mei.co.jp

Koichi Hata
Matsushita Electric Industrial Co., Ltd
1006, Kadoma, Kadoma City, Osaka, Japan
Tel.  +81-6-6900-9192
Fax.  +81-6-6900-9193
Mail  hata@isl.mei.co.jp

Rolf Hakenberg
Panasonic European Laboratories GmbH
Monzastr. 4c, 63225 Langen, Germany
Tel.  +49-(0)6103-766-162
Fax.  +49-(0)6103-766-166
Mail  hakenberg@panasonic.de

Carsten Burmeister
Panasonic European Laboratories GmbH
Monzastr. 4c, 63225 Langen, Germany
Tel.  +49-(0)6103-766-263
Fax.  +49-(0)6103-766-166
Mail  burmeister@panasonic.de

13. Bibliography

[1]  H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP -
     A Transport Protocol for Real-time Applications," Internet
     Draft, draft-ietf-avt-rtp-new-10.txt, Work in Progress, July
     2001.

[2]  H. Schulzrinne and S. Casner, "RTP Profile for Audio and Video
     Conferences with Minimal Control," Internet Draft,
     draft-ietf-avt-profile-new-11.txt, July 2001.

[3]  M. Handley and V. Jacobson, "SDP: Session Description Protocol,"
     RFC 2327, April 1998.

[4]  S. Casner, "SDP Bandwidth Modifiers for RTCP Bandwidth,"
     Internet Draft, draft-ietf-avt-rtcp-bw-03.txt, July 2001.

[5]  C. Perkins and O. Hodson, "Options for Repair of Streaming
     Media," RFC 2354, June 1998.

[6]  J. Rosenberg and H. Schulzrinne, "An RTP Payload Format for
     Generic Forward Error Correction," RFC 2733, December 1999.

[7]  C. Perkins, I. Kouvelas, O. Hodson, V. Hardman, M. Handley, J.C.
     Bolot, A. Vega-Garcia, and S. Fosse-Parisis, "RTP Payload for
     Redundant Audio Data," RFC 2198, September 1997.

[8]  S. Bradner, "Key words for use in RFCs to Indicate Requirement
     Levels," RFC 2119, March 1997.

[9]  H. Schulzrinne and S. Petrack, "RTP Payload for DTMF Digits,
     Telephony Tones and Telephony Signals," RFC 2833, May 2000.

[10] T. Turletti and C. Huitema, "RTP Payload Format for H.261 Video
     Streams," RFC 2032, October 1996.

[11] C. Bormann, L. Cline, G. Deisher, T. Gardos, C. Maciocco, D.
     Newell, J. Ott, G. Sullivan, S. Wenger, and C. Zhu, "RTP Payload
     Format for the 1998 Version of ITU-T Rec. H.263 Video (H.263+),"
     RFC 2429, October 1998.

[12] ISO/IEC 14496-2:1999/Amd.1:2000, "Information technology -
     Coding of audio-visual objects - Part 2: Visual," July 2000.

[13] ITU-T Recommendation H.263, "Video Coding for Low Bit Rate
     Communication," November 2000.

[14] S. Wenger, "Media-aware Protocols -- transport aware Media
     Coding," Habilitation thesis, in preparation, 2001.

[15] B. Girod, N. Faerber, "Feedback-based error control for mobile
     video transmission," Proceedings IEEE, Vol. 87, No. 10, pp. 1707
     - 1723, October 1999.

[16] M. Handley, J. Padhye, S. Floyd, J. Widmer, "TCP friendly Rate
     Control (TFRC): Protocol Specification," Internet Draft,
     draft-ietf-tsvwg-02.txt, Work in Progress, May 2001.

Appendix A. Some Background and Motivation (Informative)

A.1 Example: Predictive Video Coding

A.1.1 Video Encoder-decoder synchronicity

Most current video coding schemes, such as ITU-T H.261 and H.263 and
ISO/IEC MPEG-1, MPEG-2, and MPEG-4, employ a mechanism known as Inter
Picture Prediction. Each picture is divided into macroblocks of
uniform size. For each macroblock, one or more motion vectors may be
identified and transmitted.
The residual signal after motion compensation is DCT-transformed,
quantized, entropy coded, and transmitted as well. The encoder
reconstructs, based on this information, a so-called reference
picture, which is used to perform the motion compensation and
residual signal coding steps for the subsequent picture. Since the
reference picture is generated using only such information that is
also available at the decoder, the reference picture is identical to
the reconstructed picture at the decoder. Having identical reference
pictures at the encoder and decoder is referred to as encoder-decoder
synchronicity.

Whenever data is damaged or lost on the way between the encoder and
the decoder, the reconstructed picture at the decoder is no longer
identical to the encoder's reference picture -- the encoder-decoder
synchronicity is lost.

Any loss of the encoder-decoder synchronicity results in annoying
artifacts at the decoder. Because the prediction of subsequent
pictures in the decoder is based on a damaged reference picture, the
annoying artifacts are present not only in the picture in which the
loss occurred; they propagate to all subsequent pictures, until,
through source coding based mechanisms, the encoder-decoder
synchronicity is restored. Therefore, the goal of systems employing
predictive video coding in a lossy environment must be to keep the
encoder-decoder synchronicity, or, if this is not possible, to regain
that synchronicity as quickly as possible.

A.1.2 Non-feedback based mechanisms

Avoiding the loss of the encoder-decoder synchronicity corresponds to
avoiding the loss of coded picture data. Such a task can be
performed on the transport layer. In RTP environments, the use of
packet-based FEC is a good example of such a technique.
(The use of TCP or reliable multicast as the transport for media
streams would be an even better one but is inappropriate for
low-delay (interactive) real-time systems.) FEC schemes,
interleaving, and other means for repairing real-time media streams
may also add delay and significant bit rate overhead without being
able to guarantee compensation of virtually all packet losses.

Once the encoder-decoder synchronicity is lost, only source coding
oriented mechanisms can help to regain it. One common way is to send
a non-predictively coded picture (known as Intra picture). Intra
pictures have the disadvantage of being several times bigger than
predictively coded pictures (Inter pictures). Therefore, sending
Intra pictures has negative implications both on the bandwidth and
(in bandwidth limited environments) delay. Another way is to use
Intra macroblock refresh. Here, certain parts of the picture (those
affected by a packet loss) are coded non-predictively in order to
resynchronize the encoder and decoder over time. Intra macroblock
refresh has better delay characteristics than full Intra pictures
because the picture size can be kept constant, but is less efficient
in terms of bit rate/distortion than full Intra pictures. More
sophisticated means such as Reference Picture Selection (RPS) are
also available in modern video coding standards.

Systems not employing feedback channels may use any combination of
the mechanisms described above to add error resilience -- at the cost
of added bit rate and, sometimes, added delay. The number of
additional bits spent for error resilience can be adapted using the
long-term packet loss rate information in the RTCP receiver reports.
But, even when using such adaptive means, it is still likely that
systems spend many more bits than theoretically necessary to achieve
error resilience in order to be on the safe side. In addition, as
regular RTCP feedback is aimed at the longer term, reactivity to
sudden losses is limited. In all practical applications today this
means that fewer bits are available for non-redundant picture data,
and hence the overall picture quality suffers.

A.1.3 Feedback based systems

Feedback-based systems try to avoid spending too many bits for
redundant information by informing the encoder about a loss situation
at the decoder(s). The encoder can then react accordingly and spend
redundant bits only when needed, possibly only for the part of the
picture that was affected by the loss -- thereby reducing the number
of redundant bits and leaving more bits for useful information. As a
result, a higher reproduced picture quality can generally be expected
when feedback channels are available.

Similar to the observations of section A.1.2, transport and source
coding based mechanisms can be distinguished that react to loss
situations reported by feedback.

Transport based systems employing feedback react in a media-unaware
fashion, by re-transmitting lost packets. TCP is a good example of a
protocol following such a scheme. Transport-based feedback in
real-time and/or multicast environments is a complex matter and
subject of a lot of engineering and research in and outside of the
IETF. This specification is not concerned with pure transport-based
feedback.

Source coding based mechanisms may react upon the arrival of a
feedback message indicating a loss situation by adding bits that
restore, or at least make an effort to restore, the encoder-decoder
synchronicity. This process has to be performed by a real-time
encoder.
However, schemes have been reported that allow the use of feedback
also for non-real-time encoders by storing multiple representations
of the same data (e.g. Inter and Intra coded), and dynamically
switching between those representations.

Several types of feedback messages, called Feedback Messages or FB
messages, can be defined for such a case. An FB message can be as
simple as a Boolean condition, indicating for example the loss of a
full picture (and, therefore, the need for a full Intra picture
transmission). Other feedback messages may contain more complex
information such as information about the damage of a spatial region
of the picture. A special form consists of a message the format and
semantics of which are not known at the transport level, because they
are defined in the video codec standards.

A.2 Feedback Messages

Most FB messages contain negative acknowledgement information,
indicating an erroneous situation at the decoder. In others, the
nature of the acknowledgement (positive, negative, or both) is part
of the feedback message itself. When used in multicast environments,
positive acknowledgement must not be used.

This document assumes that feedback messages are transmitted using
RTCP packets. RTCP messages from the receivers to the sender cannot
be sent at any possible time, in order to prevent traffic explosion
in case of large multicast groups. Instead, the bit rate for all
RTCP messages of all receivers together has to obey a maximum
fraction of the total RTP session bit rate, yielding a very limited
bit rate budget for a single receiver when having a large multicast
group. This, in turn, leads to an increased average delay when the
size of the receiving multicast group grows.
(See section 6 of [1] for details.)

This specification defines an algorithm that adheres to the bit rate
limitations for the feedback channel on the long term, but allows
short-term overdrafting for any receiver (but not all of them
simultaneously). Thus, the algorithm allows for better real-time
performance than the one specified in [1]. Traffic explosion in
cases in which many receivers detect picture damage simultaneously
is prevented by dithering.

As this specification assumes a sender that has full control over its
transmission bit rate (e.g. a real-time encoder), there is no scaling
problem on the forward channel. Any reaction to negative feedback
generates additional bits, which have to be conveyed, but this is
taken from the sender's total bit rate budget. The encoder can take
this into account by, for example, changing the encoding mode, packet
size, and so forth. The sender is also free to simply ignore
feedback messages. Adjusting the tradeoff between the reproduced
media quality of all receivers of a multicast group and the amount of
additional repair traffic is a media-dependent, very complex task and
is not covered in this specification.

Finally, frequent RTCP-based feedback messages may provide additional
input to the senders' congestion control algorithms and thus improve
their reactivity towards network congestion.

Feedback messages as well as sender and receiver behavior are to be
specified in separate documents (such as [7]). Such specifications
need to consider that, frequently, packet loss is an indication of
network congestion and thus define mechanisms for media-specific
congestion control in the presence of feedback as defined in this
memo.

A.3 Applications and Relationships to other Standards

This specification is based on RTCP, which implies its use in an RTP
environment.
RTP itself is used in a variety of systems such as in SIP- or
H.323-based multimedia conferencing/telephony, SAP-announced Mbone
conferences, and RTSP-based media streaming.

As for the video codecs, there is currently a small set of standards
that are, for the purpose of this discussion, roughly comparable.
Many mechanisms for regaining encoder-decoder synchronicity are
applicable to all video codecs. Others require certain tools (such
as Reference Picture Selection, aka NEWPRED) that are available only
in certain versions of the standards, and/or optional tools whose use
must be negotiated prior to being used.

A few RTP payload specifications such as RFC 2032 [10] already define
a feedback mechanism for some of the coding algorithms considered in
this specification. An application capable of performing both
schemes MUST use the feedback mechanism defined in this
specification, although, for backward compatibility reasons, it MUST
also be capable of conforming to the feedback scheme defined in the
respective RTP payload format, if this is required by that payload
format.

Also, audio, DTMF, and text streams could benefit from more immediate
feedback even though the redundancy payload formats work well for
these media.

All kinds of non-interactive media streams (such as RTSP-controlled
media streaming applications) could benefit significantly, as without
interactivity there is more time available for media repair.

A.4 Remarks on the size of the multicast group

This specification prevents traffic explosion on the feedback channel
in a way very similar to RTP, with the exception of allowing
individual receivers to overdraft their bit rate budget from time to
time. This is necessary in order to allow for low delay, which is
needed by the algorithms reacting to Feedback messages.

This scaling, however, limits the usefulness of this mechanism in
multicast groups from a certain size upwards (where the size
threshold depends on a number of parameters including loss rate,
frame rate, number of packets per frame, and session bandwidth). The
maximum size of the multicast group is soft and also depends on
application requirements and is therefore not specified here.
Considerations on the multicast group sizes are presented in section
3.5.
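
To give a feel for this scaling, the following Python sketch
estimates the mean regular RTCP reporting interval of one receiver as
the group grows. It is an illustration under stated assumptions, not
part of this specification: the 5% RTCP share of the session
bandwidth and the 75% receiver share are assumed default values from
the RTP specification [1], the 100-byte average compound packet size
is an arbitrary example value, and the function name is hypothetical.
Early feedback and dithering are not modelled.

```python
def mean_rtcp_interval(receivers, session_bw_bps,
                       avg_rtcp_packet_bytes=100,
                       rtcp_fraction=0.05,
                       receiver_share=0.75):
    """Mean seconds between regular RTCP reports of a single receiver."""
    # Bandwidth shared by all receivers for their RTCP reports.
    receiver_rtcp_bps = session_bw_bps * rtcp_fraction * receiver_share
    # Every receiver sends one packet per interval, so the interval
    # grows linearly with the group size.  No minimum interval is
    # applied here.
    return receivers * avg_rtcp_packet_bytes * 8 / receiver_rtcp_bps


if __name__ == "__main__":
    for group_size in (2, 10, 100, 1000):
        interval = mean_rtcp_interval(group_size, 256000)
        print(f"{group_size:5d} receivers: {interval:8.2f} s")
```

Under these assumptions a 256 kbit/s session leaves 9600 bit/s for
receiver reports, so ten receivers each report a little more often
than once per second, while a thousand receivers each report less
than once per minute.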