Internet Engineering Task Force                Audio/Video Transport wg
Internet Draft                             J. Rosenberg, H. Schulzrinne
draft-ietf-avt-byerecon-00.txt            Bell Laboratories/Columbia U.
November 13, 1997
Expires: May 1998

                    New Results in RTP Scalability

STATUS OF THIS MEMO

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as ``work in progress''.

To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

Distribution of this document is unlimited.

1 Abstract

Recently, a number of problems related to RTP scalability to large
multicast groups have been identified. The main problem, congestion
of RTCP packets due to near-simultaneous joins, has been resolved in
previous work.
In this document, we present some additional problems
and describe the solutions. In particular, we discuss the problems of
BYE floods and premature timeouts. To resolve the BYE flood problem,
we propose a BYE reconsideration mechanism. To help alleviate
premature timeouts, we propose a reverse timer reconsideration
algorithm. Both algorithms are simple, and require minimal state and
computation.

2 Introduction

Recently, a number of problems related to RTP [1] scalability to
large multicast groups have been identified [2]. The most serious of
these problems is RTCP congestion in simultaneous joins. In this
scenario, a large number of users join a session at nearly the same
time. This can happen if the join is an automatic response to an
announced session on the mbone [3], or if RTP is being used for
broadcast applications, and a popular show comes on. When this
happens, each user joins the group believing that they are the only
user. They then send their initial RTCP packets within a short period
of time, causing a flood of RTCP reports. This can result in network
and/or access link congestion. For modem dial-in users, the slow
access links can cause congestion even with moderate simultaneous
joins.

To combat this problem, an algorithm called reconsideration has been
proposed [4] [5]. This algorithm causes backoff in the transmission
of RTCP packets as group sizes increase.

However, several new problems have recently been discovered relating
to RTCP scalability. These difficulties are similar to the step-join
congestion problem, but differ in that they require additional
algorithms to resolve. This document describes the two new
difficulties which have been encountered: BYE floods and premature
timeouts.

3 BYE Floods

The BYE flood problem is very similar to the simultaneous join
problem.
Instead of many users joining at the same time, many users leave
the group at the same time. Since an RTP client sends an RTCP BYE
packet when it leaves the group, this causes a flood of BYE packets,
which congests the network. Users can be expected to leave a group
simultaneously for much the same reasons they might join
simultaneously: an automatic leave as a result of the end of the
session (as indicated in the SDP [6] announcement for the session), or
because the show is over, and the users manually exit their
applications.

A number of aspects of the BYE flood problem make it different from
the simultaneous join problem. These must be taken into consideration
when designing an algorithm to reduce the flood. We therefore state
the goals of the BYE flood prevention algorithm as follows:

o  Users often terminate their applications just after leaving the
   session. The algorithm must be aware of this possibility, and
   define the appropriate behavior if an application decides to
   terminate.

o  The algorithm should behave gracefully; when very few users are
   leaving the group simultaneously, users should generally be
   allowed to send their BYE packets right away. It is only in the
   presence of a large number of BYE packets that the algorithm
   should kick in, and force users to hold back on sending their BYE
   packets.

o  The algorithm should be simple, requiring minimal computation and
   storage.

We propose an algorithm called BYE reconsideration to accomplish these
goals. The algorithm operates much like standard reconsideration.
However, instead of counting other users, and using the resulting
count as a multiplier for the packet transmission interval, the client
counts BYE packets, and uses the number of BYE packets received thus
far as the multiplier for the interval. The operation of the algorithm
is as follows.

At some time t_l, the user decides to leave the session.
The application
first checks to see if it has ever sent an RTCP packet. If it has not,
the application must not send a BYE packet. Instead, it should leave
the session silently. Without having sent an RTCP packet, the BYE
packet provides no useful information. Next, the application checks to
see if the group size is less than some threshold, B_t (a value of 50
seems reasonable). If it is, the application may send a BYE packet
immediately, and then leave the session. For small groups, BYE packets
are of significant value (for loose session management), and sending
them immediately is quite important. Since the group contains only a
small number of participants, the flood of packets is limited.

It is possible that a user has a low group size estimate if they
leave a session quickly after joining it. If this session is large,
and there are many users coming and going fairly quickly (typical of
channel surfers), it might appear that this can cause a steady flow of
BYE packets. However, if these clients implement the forward
reconsideration algorithm, they generally will have never sent an RTCP
packet. This is because the new user's RTCP packet transmission will
be constantly reconsidered, as the new user will be receiving RTCP
packets at a steady rate from other users already in the group. This
constant reception of packets will cause the new user to see a steady
growth in the group size, causing its own RTCP packet transmission to
be pushed into the future. Since a user who never sends an RTCP packet
cannot send a BYE packet, these channel surfers will generally send
neither RTCP SDES or RR information nor a BYE packet.
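
The two checks made at leave time can be sketched as a small decision
function. This is only an illustration in Python, not part of the
proposal: the names LeaveAction and leave_action are hypothetical, and
treating a group size exactly equal to the threshold as "large" is an
assumption, since the text only covers sizes below and above it.

```python
from enum import Enum

B_T = 50  # threshold suggested in the text ("a value of 50 seems reasonable")


class LeaveAction(Enum):
    """Hypothetical labels for the three outcomes at leave time."""
    LEAVE_SILENTLY = "leave silently"
    SEND_BYE_NOW = "send BYE immediately"
    RECONSIDER = "run BYE reconsideration"


def leave_action(has_sent_rtcp: bool, group_size: int,
                 threshold: int = B_T) -> LeaveAction:
    # A participant that never sent an RTCP packet must not send a BYE.
    if not has_sent_rtcp:
        return LeaveAction.LEAVE_SILENTLY
    # Small groups: the BYE is valuable and the possible flood is limited.
    if group_size < threshold:
        return LeaveAction.SEND_BYE_NOW
    # Assumption: group_size == threshold is handled like a large group.
    return LeaveAction.RECONSIDER
```

A channel surfer whose first RTCP packet is still being reconsidered
falls into the first branch, whatever its group size estimate is.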
If the user has sent an RTCP packet previously, and the group size
exceeds B_t, the application computes a time interval T as:

   T = R(1/2) * max(T_min, n_l * C)

where T_min is 2.5 seconds, C = avgpktsz/(bw * 0.05), R(1/2) is a
random variable uniformly distributed between 1/2 and 3/2, avgpktsz is
the average size of all BYE packets received thus far, and bw is the
session bandwidth. The average packet size is computed using the same
exponential weighted average filter used to compute the average RTCP
packet size in the current specification. The value is updated through
the filter every time a BYE packet is received or transmitted (not
when it is reconsidered).

The user then schedules the BYE packet to be sent at time t_l + T.
Between t_l and this time, the user increments n_l for each BYE packet
that is received. In this fashion, n_l counts the number of BYEs from
other users since deciding to leave the session.

When this time arrives, the user recomputes T according to the
previous equation. If t_l + T is less than the current time, the BYE
packet may be sent. If t_l + T is more than the current time, the BYE
packet transmission is rescheduled for time t_l + T. At that time, the
computation and comparison are repeated. All along, n_l is incremented
for each BYE packet received.

A BYE packet which is from an SSRC which already sent a BYE (a
duplicate) is ignored. Furthermore, the application should not
increment n_l if it receives a BYE from a user which has never sent an
RTCP packet. Under normal situations, an application should never send
a duplicate BYE packet, or send a BYE if an RTCP packet was never
sent. However, a malicious user may send many BYE packets. If this
check were not made, these BYEs would cause the variable n_l to
increase, and effectively prevent any other user from sending a BYE.
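
The scheduling rules above can be sketched as follows. This is a
minimal illustration in Python under stated assumptions, not a
reference implementation. The class and method names are hypothetical;
n_l is assumed to start at zero; avgpktsz is held fixed rather than
run through the averaging filter; sizes and bandwidth are assumed to
be in consistent units (for example bytes and bytes per second); and
the case t_l + T exactly equal to the current time, which the text
leaves open, is treated as "send".

```python
import random

T_MIN = 2.5  # seconds, as given in the text


def bye_interval(n_l, avg_bye_size, session_bw, rng=random):
    """T = R(1/2) * max(T_min, n_l * C), with C = avgpktsz / (bw * 0.05)."""
    c = avg_bye_size / (session_bw * 0.05)
    return rng.uniform(0.5, 1.5) * max(T_MIN, n_l * c)


class ByeReconsideration:
    """State kept by one participant between deciding to leave and sending BYE."""

    def __init__(self, t_l, avg_bye_size, session_bw, rng=random):
        self.t_l = t_l                    # time the user decided to leave
        self.n_l = 0                      # BYEs counted since t_l (assumed to start at 0)
        self.avg_bye_size = avg_bye_size  # held fixed in this sketch
        self.session_bw = session_bw
        self.rng = rng
        self._bye_ssrcs = set()           # SSRCs whose BYE was already counted
        self.deadline = t_l + self._interval()

    def _interval(self):
        return bye_interval(self.n_l, self.avg_bye_size, self.session_bw, self.rng)

    def on_bye(self, ssrc, sender_is_known_member):
        # Ignore duplicates and BYEs from senders that never sent RTCP.
        if ssrc in self._bye_ssrcs or not sender_is_known_member:
            return
        self._bye_ssrcs.add(ssrc)
        self.n_l += 1

    def on_timer(self, now):
        """Called when the deadline expires; True means the BYE may be sent."""
        candidate = self.t_l + self._interval()
        if candidate <= now:
            return True
        self.deadline = candidate  # reschedule and repeat the check later
        return False
```

An application that wants to exit before on_timer returns True would,
per the rules above, simply terminate without sending the BYE.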
If an application wishes to terminate before it can send a BYE RTCP
packet according to these rules, it must not send a BYE packet.
Instead, it should terminate silently. The BYE reconsideration
algorithm will effectively re-allocate the bandwidth from users who
leave without BYEs to those who wait around to send a BYE.

The effect of this algorithm is to restrict the BYE packet
transmission rate to at most an additional 10% of the session
bandwidth (assuming a very large simultaneous leave). At the same
time, if only a few users are leaving the group (even for a large
group), they will get to send their BYE packets in a timely fashion.
This meets the design objectives described in the beginning of the
section.

We ran numerous simulations to verify the performance of the
algorithm. Even with as many as 10,000 users simultaneously leaving
the session, the BYE reconsideration algorithm maintained the BYE
transmission rate at 10%. This is demonstrated in Figure 1, which
depicts the cumulative number of RTCP packets (BYE and others) sent to
the multicast group over time. At time 10,000, almost all of the users
leave the group. The top line depicts the performance without BYE
reconsideration, where some 10,000 BYE packets are sent all at once.
The lower curve shows performance for BYE reconsideration. Note how
there is only a small increase in packet transmission rates.

[Figure available in Postscript version only]

Figure 1: BYE Reconsideration Performance

4 Premature Timeouts

We have observed a secondary effect when many users simultaneously
leave a group. There are many applications where not all of the users
will leave; some will stick around. An example is a distance learning
application. There are perhaps several hundred students in the
class. When the class ends, most of the students leave at about the
same time.
However, some stay behind to talk with the professor.

We have observed that rapid leaves can cause the remaining users to
time each other out. It can take a significant amount of time for the
users to return. This implies that each user will not see the other
users, which is undesirable for the post-class discussion scenario
just mentioned.

4.1 Quantifying the Problem

The difficulty is related to the way timeouts are handled. In the
current specification [1], a user is timed out if they have not sent
an RTP or RTCP packet within the last five RTCP intervals. In dynamic
groups, the interval itself is dynamic. As many users leave a group,
their BYE packets cause the group size estimate to rapidly decrease.
This, in turn, decreases the timeout interval. If the number of users
who leave the group is sufficiently large, the timeout interval may
decrease so much that the remaining users will time out.

For example, consider a group of 505 users. If the total RTCP rate
is to be limited to 1 packet per second, each user sends RTCP
packets once every 505 seconds (on average). Assume user 1 last sent
an RTCP packet at time 0. The user schedules the next RTCP packet for
time 505. At time 490, 500 of the 505 members (not including user 1)
leave the group, and send BYE packets (assume for the moment that
there is no BYE flood prevention algorithm). Shortly thereafter (say
time 500) the BYE packets have been received, and the remaining 5
users perceive the group size to be 5. Based on this, the timeout
interval is 5 * 5 = 25 seconds. Any user who has not sent a packet
since time 475 will therefore be timed out. User 1 last sent a packet
at time 0, so they are timed out. In fact, odds are good that most of
the remaining 5 users sent their last packet before time 475. Thus,
every user will time out all of the other users.
Furthermore, it may take a
long time for those users to come back. Consider user 2, who did not
leave the group, and who was unfortunate enough to have last sent an
RTCP packet at time 450. They then scheduled their next RTCP packet
for time 955 (since there were still 505 users at the time). After
the exodus at time 490, user 2 will remain, but will not send the
next RTCP packet until time 955 (unless the user sends data, in which
case they will be known via the RTP packet).

The first question to ask is whether BYE flood prevention helps
alleviate this problem. Since the algorithm is designed to reduce the
flood of BYE packets, the group size cannot drop so rapidly. This
does help, of course, but not completely. With network delays, there
still can be spikes in BYE packets. It does not require many BYE
packets for this phenomenon to surface; it only requires that the
ratio of users left after the leave to the number before the leave be
less than around 1/5. This can occur in both small and large groups
alike.

Even assuming the rate of BYE packets is restricted to some constant
factor, the efficacy of the timeout algorithm is reduced. To show
this, we computed the number of packets sent by a user, on average,
during the timeout interval, as the group size decreases linearly
from some initial value N_s to some final value N_f. We denote the
slope of the decrease by r users per second. We also define C as the
multiplier to convert from group size to interval (so that if there
are N group members, each member sends RTCP packets every C*N seconds,
on average), and M as the timeout multiplier (the factor of 5).
If rC is
much smaller than 1, and r is less than (1/(CM))(N_s/N_f - 1), the
number of packets sent during the timeout window is, on average:

   N_p = (-1/ln(1 - rC)) * ln(1 + rCM)

and for rC larger than 1, with r still less than
(1/(CM))(N_s/N_f - 1):

   N_p <= M/(1 + rCM)

In the limit as r goes to zero (that is, nobody leaving the group),
the number of packets is M (according to the first equation), as
expected. However, this quantity decreases rapidly as the slope, r,
increases relative to 1/C. With the BYE prevention algorithm, the
flood of packets can be shown to be bounded at an average rate of 2/C
in the absence of network delays. With network delays, there can be a
spike of packets, but following this spike, the BYE packet rate will
also settle to 2/C initially, gradually decreasing to 1/C. Plugging
r = 2/C and M = 5 into the second equation above, the number of
packets sent is:

   N_p <= 5/11

This means that each sender will send only about half a packet during
the timeout window, on average. In reality, this means that about half
of the users remaining will send one packet, and the other half will
send none. Therefore, many users will still time out. Those users
which do manage to send a packet may still time out if the packet is
lost.

Note that both equations above rely on many users leaving the group
at the same time (the constraint that r < (1/(MC))(N_s/N_f - 1)). If
the number who leave is small relative to the slope of the leave
(r > (1/(MC))(N_s/N_f - 1)), the effects are less severe.

4.2 Reverse Reconsideration

One of the major factors contributing to the premature timeout effect
is the delay between when the group size decreases, and when users
begin to send packets using the resulting smaller interval. In the
example in the previous section, user 2 sent an RTCP packet at time
450, and scheduled the next for time 955.
After the exodus at time
490, user 2 knows that the group size has dropped, but does nothing.
The user instead waits until time 955, sends the packet, and then
computes the next send time. Since the group size is now 5, user 2
will schedule the next packet 5 seconds later, on average. There is
thus a delay of more than 450 seconds between the exodus and when
user 2 gets around to scheduling a packet using the new, smaller
interval.

To resolve this problem, we propose an algorithm called reverse
reconsideration. The idea is simple. If the group membership
decreases, each user reschedules their next packet immediately. The
packet is rescheduled for a time earlier than previously. The amount
earlier is made to depend on how much the group size has decreased.

More specifically, assume that the last time a user sent an RTCP
report is t_p. The next report is scheduled for time t_n, and t_c is
the current time. Before the arrival of a BYE packet at the current
time t_c, there were n_p users. There are now n_c users. Before the
BYE, RTCP packet transmissions should be uniformly scheduled over
time. That means that there should have been n_p packet transmissions
scheduled between t_c and t_c + C*n_p. Now, however, the group size
has decreased to n_c. This should cause there to be n_c packet
transmissions scheduled between t_c and t_c + C*n_c. To accomplish
this, every user should compress the interval between the current time
and their next packet transmission by n_c/n_p. This implies that the
next packet transmission should be rescheduled for a new time t_n,
computed from the previously scheduled t_n on the right-hand side:

   t_n = t_c + (n_c/n_p)(t_n - t_c)

This new value for t_n has two key properties:

1. The new time is always earlier than the previous time.

2. The new time can never be before the current time.

The second property is key; it guarantees that there will not be a
spike of packets transmitted due to a sharp decrease in group size.
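
The rescheduling rule and its two properties can be checked with a
short sketch. This is an illustration in Python only; the function
name reverse_reconsider_next is hypothetical, and leaving the schedule
untouched when the group has not shrunk is an assumption, since the
rule is only described for decreases in membership.

```python
def reverse_reconsider_next(t_c, t_n, n_prev, n_cur):
    """Return the new next-transmission time after the group shrinks.

    Implements t_n' = t_c + (n_c / n_p) * (t_n - t_c), where n_prev is
    the group size before the BYE (n_p) and n_cur the size after (n_c).
    """
    if n_prev <= 0:
        raise ValueError("previous group size must be positive")
    if n_cur >= n_prev:
        # Assumption: only a decrease triggers reverse reconsideration.
        return t_n
    return t_c + (n_cur / n_prev) * (t_n - t_c)


# User 2 from Section 4.1: next packet scheduled for time 955, and the
# group is seen to drop from 505 to 5 members at time 500.
new_t_n = reverse_reconsider_next(t_c=500.0, t_n=955.0, n_prev=505, n_cur=5)
assert 500.0 <= new_t_n < 955.0  # properties 2 and 1, respectively
```

With these numbers the packet moves from time 955 to roughly time
504.5, so user 2 is heard from well inside the 25 second timeout
window rather than several hundred seconds later.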
On the surface, it would seem that an alternate algorithm might be a
direct application of regular (or forward) reconsideration. Such an
implementation might work as follows. At t_c, when the BYE arrives,
the user recomputes the transmission interval T, based on the new
group size n_c. This interval is then added to the previous packet
transmission time t_p. If the result is a time before the current
time, the packet is sent, else it is rescheduled for t_p + T. This
algorithm does not work. Even a moderate decrease in the group size
would cause many users to send their RTCP packets immediately, causing
an additional spike. This is because this version of the algorithm
does not maintain property 2: the new transmission time can be before
the current time.

There is one additional aspect to the reverse reconsideration
algorithm that must be considered: how it interacts with forward
reconsideration. Consider the following example. There are 100 users
in a group. The constant C is equal to 1 second, so that the group as
a whole sends one packet per second. At time 0, user A sends an RTCP
packet, and schedules the next for time 100. At time 50, 50 users
leave the group. User A executes the reverse reconsideration
algorithm, and reschedules their packet for time
50 + (50/100)(100 - 50) = 75. At time 60, one more user joins the
group. At time 75, user A executes the forward reconsideration
algorithm (we assume conditional reconsideration). Since the group
size has increased (51 vs. 50), user A recomputes the interval, which
is now 51 seconds on average. This is then added to the previous
transmission time, which is time 0. The result is t=51, significantly
earlier than the current time t=75. This will cause the user (and in
fact, all other users) to send their packets immediately, even if the
group size further increases.
The problem is that while we have
adjusted the next packet transmission time, t_n, with reverse
reconsideration, we have not adjusted the value of the previous packet
transmission time, t_p. This quantity is used for forward
reconsideration, and must be adjusted as well in order to maintain
consistency.

The adjustment algorithm is simple. The value for t_p is updated when
t_n is updated by reverse reconsideration. Its value is adjusted to:

   t_p = t_c - (n_c/n_p)(t_c - t_p)

The nature of this adjustment is the same as for t_n. In the previous
example, it would have caused t_p to be adjusted from 0 to
50 - (50/100)(50 - 0) = 25. At time 75, when the user is performing
the forward reconsideration algorithm, the interval T=51 is added to
t_p=25, yielding t=76, slightly ahead of the current time t=75, as
expected (since the group has only increased by one member).

4.3 Performance

What is the improvement due to reverse reconsideration? We have been
able to prove that the number of packets sent during the timeout
window has an achievable lower bound of:

   N_p = (1/(rC)) * ln(1 + rCM)

This time, the result is independent of the relative value of rC.
Again, this holds only when the actual number of users who leave is a
significant fraction of the current group size
(N_f/N_s < 1/(1 + MCr)). Assuming that BYE reconsideration is being
used, the BYE rate is limited to around r = 2/C when network delays
are small. Plugging this in, with M = 5:

   N_p = (1/2) ln 11, or approximately 1.2

This means that a user will send about 1.2 packets on average during
the timeout window. This is to be compared to the situation before
reverse reconsideration, where N_p = 5/11. Therefore, reverse
reconsideration affords an improvement of nearly a factor of three.
Now, no users will time out under normal circumstances (on average).
However, a single packet loss may cause a user to time out
prematurely.
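
The joint adjustment of t_p and t_n from Section 4.2, and the figures
quoted in this section, can be checked numerically. The sketch below
is an illustration in Python only; the function names are hypothetical
and the values r*C = 2 and M = 5 are the ones used in the text.

```python
import math


def reverse_reconsider(t_p, t_n, t_c, n_prev, n_cur):
    """Compress both the previous and the next send time toward t_c.

    t_p' = t_c - (n_c/n_p)(t_c - t_p),  t_n' = t_c + (n_c/n_p)(t_n - t_c)
    """
    ratio = n_cur / n_prev
    return t_c - ratio * (t_c - t_p), t_c + ratio * (t_n - t_c)


def packets_without_reverse(rc, m=5):
    """Upper bound M / (1 + rCM) from Section 4.1 (rC larger than 1)."""
    return m / (1 + rc * m)


def packets_with_reverse(rc, m=5):
    """Achievable lower bound (1/rC) * ln(1 + rCM) from this section."""
    return math.log(1 + rc * m) / rc


# User A: sent at time 0, next packet at 100; 50 of 100 users leave at 50.
t_p, t_n = reverse_reconsider(t_p=0.0, t_n=100.0, t_c=50.0, n_prev=100, n_cur=50)
# Forward reconsideration at time 75 with a 51 second interval now
# lands just after the current time instead of far in the past.
forward_check = t_p + 51.0

before = packets_without_reverse(2.0)   # 5/11
after = packets_with_reverse(2.0)       # (1/2) ln 11
improvement = after / before
```

Here t_p and t_n come out as 25 and 75, forward_check as 76, and the
two packet counts as 5/11 and (1/2) ln 11, which is about 1.199.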
5 Conclusion

We presented two additional problems with RTP scalability: BYE
floods and premature timeouts. BYE floods are caused when many users
simultaneously leave a group. To fix the problem, we presented an
algorithm called BYE reconsideration, which works much like forward
reconsideration, but for BYE packets. The second problem, premature
timeouts, is a secondary effect, but still problematic. To help
resolve it, we presented an algorithm called reverse reconsideration,
which shows a roughly threefold improvement. Both algorithms are
extremely simple, requiring no additional memory beyond forward
reconsideration, and requiring a single O(1) computation upon
reception of a BYE packet.

6 Bibliography

[1] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, RTP: a
transport protocol for real-time applications, Tech. Rep. RFC 1889,
Internet Engineering Task Force, Jan. 1996.

[2] B. Aboba, Alternatives for enhancing RTCP scalability, Internet
Draft, Internet Engineering Task Force, Jan. 1997. Work in progress.

[3] M. Handley, SAP: Session announcement protocol, Internet Draft,
Internet Engineering Task Force, Nov. 1996. Work in progress.

[4] J. Rosenberg and H. Schulzrinne, Timer reconsideration for
enhanced RTP scalability, Internet Draft, Internet Engineering Task
Force, July 1997. Work in progress.

[5] J. Rosenberg and H. Schulzrinne, Timer reconsideration for
enhanced RTP scalability, to appear in Proceedings of IEEE
Infocom '98, 1998.

[6] M. Handley and V. Jacobson, SDP: Session description protocol,
Internet Draft, Internet Engineering Task Force, Mar. 1997. Work in
progress.

7 Authors' Addresses

Jonathan Rosenberg
Bell Laboratories, Lucent Technologies
101 Crawfords Corner Rd.
Holmdel, NJ 07733
Rm. 4C-526
email: jdrosen@bell-labs.com

Henning Schulzrinne
Columbia University
M/S 0401
1214 Amsterdam Ave.
New York, NY 10027-7003
email: schulzrinne@cs.columbia.edu