idnits 2.17.1 draft-ietf-avt-rtpsample-06.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? == No 'Intended status' indicated for this document; assuming Proposed Standard == It seems as if not all pages are separated by form feeds - found 0 form feeds but 11 pages Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** There is 1 instance of too long lines in the document, the longest one being 1 character in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (November 22, 1999) is 8921 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? '1' on line 446 looks like a reference -- Missing reference section? '2' on line 450 looks like a reference -- Missing reference section? '3' on line 454 looks like a reference -- Missing reference section? '4' on line 459 looks like a reference -- Missing reference section? '5' on line 463 looks like a reference Summary: 4 errors (**), 0 flaws (~~), 2 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force Audio/Video Transport wg 2 Internet Draft J. Rosenberg, H. Schulzrinne 3 draft-ietf-avt-rtpsample-06.txt dynamicsoft/Columbia U. 4 November 22, 1999 5 Expires: May 22, 2000 7 Sampling of the Group Membership in RTP 9 STATUS OF THIS MEMO 11 This document is an Internet-Draft and is in full conformance with 12 all provisions of Section 10 of RFC2026. 14 Internet-Drafts are working documents of the Internet Engineering 15 Task Force (IETF), its areas, and its working groups. Note that 16 other groups may also distribute working documents as Internet- 17 Drafts. 19 Internet-Drafts are draft documents valid for a maximum of six months 20 and may be updated, replaced, or obsoleted by other documents at any 21 time. It is inappropriate to use Internet- Drafts as reference 22 material or to cite them other than as work in progress. 24 The list of current Internet-Drafts can be accessed at 25 http://www.ietf.org/ietf/1id-abstracts.txt 27 The list of Internet-Draft Shadow Directories can be accessed at 28 http://www.ietf.org/shadow.html. 30 Abstract 32 In large multicast groups, the size of the group membership table 33 maintained by RTP (Real Time Transport Protocol) participants may 34 become unwieldy, particularly for embedded devices with limited 35 memory and processing power. This document discusses mechanisms for 36 sampling of this group membership table in order to reduce the memory 37 requirements. Several mechanisms are proposed, and the performance of 38 each is considered. 40 1 Introduction 42 RTP, the Real Time Transport protocol [1], mandates that RTCP packets 43 be transmitted from each participant with a period roughly 44 proportional to the group size. The group size is obtained by storing 45 a table, containing an entry for each unique SSRC seen in RTP and 46 RTCP packets. As members leave or time out, entries are deleted, and 47 as new members join, entries are added. The table is thus highly 48 dynamic. 50 For large multicast sessions, such as an mbone broadcast or IP based 51 TV distribution, group sizes can be extremely large, on the order of 52 hundreds of thousands to millions of participants. In these 53 environments, RTCP may not always be used, and thus the group 54 membership table isn't needed. However, it is highly desirable for 55 RTP to scale well for groups with one member to groups with one 56 million members, without human intervention to "turn off" RTCP when 57 it's no longer appropriate. This means that the same tools and 58 systems can be used for both small conferences and TV broadcasts in a 59 smooth, scalable fashion. 61 Previous work [2] has identified three major scalability problems 62 with RTP. These are: 64 1. Congestion due to floods of RTCP packets in highly dynamic 65 groups 67 2. Large delays between receipt of RTCP packets from a single 68 user 70 3. Large size of the group membership table 72 The reconsideration algorithm [2] helps to alleviate the first of 73 these. This document addresses the third, that of large group size 74 tables. 76 Storage of an SSRC table with one million members, for example, 77 requires at least four megabytes. As a result, embedded devices with 78 small memory footprints may have difficulty under these conditions. 79 To solve this problem, SSRC sampling has been proposed. SSRC sampling 80 uses statistical sampling to obtain a stochastic estimate of the 81 group membership. There are many issues that arise when this is done. 82 This document reviews these issues and discusses the mechanisms which 83 can be applied by implementors. In particular, it focuses on three 84 methods for adapting the sampling probability as the group membership 85 varies. It is important to note that the IETF has been notified of 86 intellectual property rights claimed in regard to some or all of the 87 specification contained in this document, and in particular to one of 88 the three mechanisms: the binning algorithm described below. For more 89 information consult the online list of claimed rights. The two other 90 approaches presented are inferior to the binning algorithm, but are 91 included as they are believed to be unencumbered by IPR. 93 2 Basic Operation 94 The basic idea behind SSRC sampling is simple. Each participant 95 maintains a key K of 32 bits, and a mask M of 32 bits. Assume that m 96 of the bits in the mask are 1, and the remainder are zero. When an 97 RTCP packet arrives with some SSRC S, rather than placing it in the 98 table, it is first sampled. The sampling is performed by ANDing the 99 key and the mask, and also ANDing the SSRC and the mask. The 100 resulting values are compared. If equal, the SSRC is stored in the 101 table. If not equal, the SSRC is rejected, and the packet is treated 102 as if it were never received. 104 The key can be anything, but is usually chosen to be the SSRC of the 105 user who is performing the sampling. 107 This sampling process can be described mathematically as: 109 D = (K*M == S*M) 111 Where the * operator denotes AND and the == operator denotes a test 112 for equality. D represents the sampling decision. 114 According to the RTP specification, the SSRC's used by session 115 participants are chosen randomly. If the distribution is also 116 uniform, it is easy to see that the above filtering will cause 1 out 117 of 2**m SSRC's to be placed in the table, where m is the number of 118 bits in the mask, M, which are one. Thus, the sampling probability p 119 is 2**-m. 121 Then, to obtain an actual group size estimate, L, the number of 122 entries in the table N is multiplied by 2**m: 124 L = N * 2**m 126 Care must be taken when choosing which bits to set to 1 in the mask. 127 Although the RTP specification mandates randomly chosen SSRC, there 128 are many known implementations which do not conform to this. In 129 particular, the ITU H.323 [3] series of recommendations allows the 130 central control element, the gatekeeper, to assign the least 131 significant 8 bits of the SSRC, while the most significant are 132 randomly chosen by RTP participants. 134 The safest way to handle this problem is to first hash the SSRC using 135 a cryptographically secure hash, such as MD5 [4], and then choose 32 136 of the bits in the result as the SSRC used in the above computation. 137 This provides much better randomness, and doesn't require detailed 138 knowledge about how various implementations actually set the SSRC. 140 2.1 Performance 142 The estimate is more accurate as the value of m decreases, less 143 accurate as it increases. This can be demonstrated analytically. If 144 the actual group size is G, the ratio of the standard deviation to 145 mean of the estimate L (coefficient of variation) is: 147 sqrt((2**m - 1)/G) 149 This equation can be used as a guide for selecting the thresholds for 150 when to change the sampling factor, as discussed below. For example, 151 if the target is a 1% standard deviation to mean, the sampling 152 probability p=2**-m should be no smaller than .5 when there are ten 153 thousand group members. More generally, to achieve a desired standard 154 deviation to mean ratio of T, the sampling probability should be no 155 less than: 157 p > 1 / (1 + G*(T**2)) 159 3 Increasing the Sampling Probability 161 The above simple sampling procedure would work fine if the group size 162 was static. However, it is not. A participant joining an RTP session 163 will initially see just one participant (themselves). As packets are 164 received, the group size as seen by that participant will increase. 165 To handle this, the sampling probability must be made dynamic, and 166 will need to increase and decrease as group sizes vary. 168 The procedure for increasing the sampling probability is easy. A 169 participant starts with a mask with m=0. Under these conditions, 170 every received SSRC will be stored in the table, so there is 171 effectively no sampling. At some point, the value of m is increased. 172 This implies that approximately half of the SSRC already in the table 173 will no longer match the key under the masking operation. In order to 174 maintain a correct estimate, these SSRC must be discarded from the 175 table. New SSRC are only added if they match the key under the new 176 mask. 178 The decision about when to increase the number of bits in the mask is 179 also simple. Let's say an RTP participant has a memory with enough 180 capacity to store C entries in the table. The best estimate of the 181 group is obtained by the largest sampling probability. This also 182 means that the best estimate is obtained the fuller the table is. A 183 reasonable approach is therefore to increase the number of bits in 184 the mask just as the table fills to C. This will generally cause its 185 contents to be reduced by half. Once the table fills again, the 186 number of bits in the mask is further increased. 188 4 Reducing the Sampling Probability 190 If the group size begins to decrease, it may be necessary to reduce 191 the number of bits in the mask. Not doing so will result in extremely 192 poor estimates of the group size. Unfortunately, reducing the number 193 of bits in the mask is more difficult than increasing them. 195 When the number of bits in the mask increases, the user compensates 196 by removing those SSRC which no longer match. When the number of bits 197 decreases, the user should theoretically add back those users whose 198 SSRC now match. However, these SSRC are not known, since the whole 199 point of sampling was to not have to remember them. Therefore, if the 200 number of bits in the mask is just reduced without any changes in the 201 membership table, the group estimate will instantly drop by exactly 202 half. 204 To compensate for this, some kind of algorithm is needed. Two 205 approaches are presented here: a corrective-factor solution, and a 206 binning solution. The binning solution is simpler to understand and 207 performs better. However, we include a discussion of the corrective- 208 factor solution for completeness and comparison, and also because it 209 is believed to be unencumbered by IPR. 211 4.1 Corrective Factors 213 The idea with the corrective factors is to take one of two 214 approaches. In the first, a corrective factor is added to the group 215 size estimate, and in the second, the group size estimate is 216 multiplied by a corrective factor. In both cases, the purpose is to 217 compensate for the change in sample mask. The corrective factors 218 should decay as the "fudged" members are eventually learned about and 219 actually placed in the membership list. 221 The additive factor starts at the difference between the group size 222 estimate before and after the number of bits in the mask is reduced, 223 and decays to 0 (this is not always half the group size estimate, as 224 the corrective factors can be compounded, see below). The 225 multiplicative corrective factor starts at 2, and gradually decays to 226 one. Both factors decay over a time of cL(ts-), where c is the 227 average RTCP packet size divided by the RTCP bandwidth for receivers, 228 and L(ts-) is the group size estimate just before the change in the 229 number of bits in the mask at time ts. The reason for this constant 230 is as follows. In the case where the actual group membership has not 231 changed, those members which were forgotten will still be sending 232 RTCP packets. The amount of time it will take to hear an RTCP packet 233 from each of them is the average RTCP interval, which is cL(ts-). 234 Therefore, by cL(ts-) seconds after the change in the mask, those 235 users who were fudged by the corrective factor should have sent a 236 packet and thus appear in the table. We chose to decay both functions 237 linearly. This is because the rate of arrival of RTCP packets is 238 linear. 240 What happens if the number of bits in the mask is reduced once again 241 before the previous corrective factor has expired? In that case, we 242 compound the factors by using yet another one. Let fi() represent the 243 ith additive correction function, and gi() the ith multiplicative 244 correction function. If ts is the time when the number of bits in the 245 mask is reduced, we can describe the additive correction factor as: 247 / 0 , t < ts 248 | ts + cL(ts-) - t 249 fi(t) = |( L(ts-) - L(ts+)) ---------------- , ts < t < ts+cL(ts-) 250 | cL(ts-) 251 | 0 , t > ts + cL(ts-) 253 and the multiplicative factor as: 255 / 1 , t < ts 256 | 257 | ts + 2cL(ts-) - t 258 gi(t) | ----------------- , ts < t < ts + cL(ts-) 259 | cL(ts-) 260 | 261 1 , t > ts + cL(ts-) 263 Note that in these equations, L(t) denotes the group size estimate 264 obtained including the corrective factors except for the new factor. 265 ts- is the time right before the reduction in the number of bits, and 266 ts+ the time after. As a result, L(ts-) represents the group size 267 estimate before the reduction, and L(ts+) the estimate right after, 268 but not including the new factor. 270 Finally, the actual group size estimate L(t) is given by: 272 ----- 274 L(t) = / fi(t) + N*(2**m) 275 ----- 276 i 278 for the additive factor, and: 280 ------ 281 | | 282 | | 283 L(t)= | | N*(2**m)*gi(t) 285 for the multiplicative factor. 287 Simulations showed that both algorithms performed equally well, but 288 both tended to seriously underestimate the group size when the group 289 membership was rapidly declining [5]. This is demonstrated in the 290 performance data below. 292 As an example, consider computation of the additive factor. The group 293 size is 1000, c is 1 second, and m is two. With a mask of this size, 294 a participant will, on average, observe 250 (N = 250) users. At t=0, 295 the user decides to reduce the number of bits in the mask to 1. As a 296 result, L(0-) is 1000, and L(0+) is 500. The additive factor 297 therefore starts at 500, and decays to zero at time ts + cL(ts-) = 298 1000. At time 500, lets assume N has increased to 375 (this will, on 299 average, be the case if the actual group size has not changed). At 300 time 500, the additive factor is 250. This is added to 2**m times N, 301 which is 750, resulting in a group size estimate of 1000. Now, the 302 user decides to reduce the number of bits in the mask again, so that 303 m=0. Another additive factor is computed. This factor starts at 304 L(ts-) (which is 1000), minus L(ts+). L(ts+) is computed without the 305 new factor; it is the first additive factor at this time (250) plus 306 2**m (1) times N (375). This is 625. As a result, the new additive 307 factor starts at 1000 - 625 (375), and decays to 0 in 1000 seconds. 309 4.2 Binning Algorithm 310 In order to more correctly estimate the group size even when it was 311 rapidly decreasing, a binning algorithm can be used. The algorithm 312 works as follows. There are 32 bins, same as the number of bits in 313 the sample mask. When an RTCP packet from a new user arrives whose 314 SSRC matches the key under the masking operation, it is placed in the 315 mth bin (where m is the number of ones in the mask) otherwise it is 316 discarded. 318 When the number of bits in the mask is to be increased, those members 319 in the bin who still match after the new mask are moved into the next 320 higher bin. Those who don't match are discarded. When the number of 321 bits in the mask is to be decreased, nothing is done. Users in the 322 various bins stay where they are. However, when an RTCP packet for a 323 user shows up, and the user is in a bin with a higher value than the 324 current number of bits in the mask, it is moved into the bin 325 corresponding to the current number of bits in the mask. Finally, the 326 group size estimate L(t) is obtained by: 328 31 329 ---- 331 L(t) = / B(i) * 2**i 332 ---- 333 i=0 335 Where B(i) are the number of users in the ith bin. 337 The algorithm works by basically keeping the old estimate when the 338 number of bits in the mask drops. As users arrive, they are gradually 339 moved into the lower bin, reducing the amount that the higher bin 340 contributes to the total estimate. However, the old estimate is still 341 updated in the sense that users which timeout are removed from the 342 higher bin, and users who send BYE packets are also removed from the 343 higher bin. This allows the older estimate to still adapt, while 344 gradually phasing it out. It is this adaptation which makes it 345 perform much better than the corrective algorithms. The algorithm is 346 also extremely simple. 348 4.3 Comparison 350 The algorithms are all compared via simulation in Table 1. In the 351 simulation, 10,001 users join a group at t=0. At t=10,000, 5000 of 352 them leave. At t=20,000, another 5000 leave. All implement an SSRC 353 sampling algorithm, unconditional forward reconsideration and BYE 354 reconsideration. The table depicts the group size estimate from time 355 20,000 to time 25,000 as seen by the single user present throughout 356 the entire session. In the simulation, a memory size of 1000 SSRC was 357 assumed. The performance without sampling, and with sampling with the 358 additive, multiplicative, and bin-based correction are depicted. 360 As the table shows, the bin based algorithm performs particularly 361 well at capturing the group size estimate towards the tail end of the 362 simulation. 364 Time No Sampling Binned Additive Multiplicative 365 ---- ----------- ------ -------- -------------- 366 20000 5001 5024 5024 5024 367 20250 4379 4352 4352 4352 368 20500 3881 3888 3900 3853 369 20750 3420 3456 3508 3272 370 21000 3018 2992 3100 2701 371 21250 2677 2592 2724 2225 372 21500 2322 2272 2389 1783 373 21750 2034 2096 2125 1414 374 22000 1756 1760 1795 1007 375 22250 1476 1472 1459 582 376 22500 1243 1232 1135 230 377 22750 1047 1040 807 80 378 23000 856 864 468 59 379 23250 683 704 106 44 380 23500 535 512 32 32 381 23750 401 369 24 24 382 24000 290 257 17 17 383 24250 198 177 13 13 384 24500 119 129 11 11 385 24750 59 65 8 8 386 25000 18 1 2 2 388 4.4 Sender Sampling 390 Care must be taken in handling senders when using SSRC sampling. 391 Since the number of senders is generally small, and they contribute 392 significantly to the computation of the RTCP interval, sampling 393 should not be applied to them. However, they must be kept in a 394 separate table, and not be "counted" as part of the general group 395 membership. If they are counted as part of the general group 396 membership, and are not sampled, the group size estimate will be 397 inflated to overemphasize the senders. 399 This is easily demonstrated analytically. Let Ns be the number of 400 senders, and Nr be the number of receivers. The membership table will 401 contain all Ns senders and (1/2)**m of the receivers. The total group 402 size estimate in the current draft is obtained by 2**m times the 403 number of entries in the table. Therefore, the group size estimate 404 becomes: 406 L(t) = (2**m) Ns + Nr 408 which exponentially weights the senders. 410 This is easily compensated for in the binning algorithm. A sender is 411 always placed in the 0th bin. When a sender becomes a receiver, it is 412 moved into the bin corresponding to the current value of m, if its 413 SSRC matches the key under the masked comparison operation. 415 5 Security Considerations 417 The use of SSRC sampling does not appear to introduce any additional 418 security considerations beyond those described in [1]. In fact, SSRC 419 sampling, as described above, can help somewhat in reducing the 420 effect of certain attacks. 422 RTP, when used without authentication of RTCP packets, is susceptible 423 to a spoofing attack. Attackers can inject many RTCP packets into the 424 group, each with a different SSRC. This will cause RTP participants 425 to believe the group membership is much higher than it actually is. 426 The result is that each participant will end up transmitting RTCP 427 packets very infrequently, if ever. When SSRC sampling is used, the 428 problem can be amplified if a participant is not applying a hash to 429 the SSRC before matching them against their key. This is because an 430 attacker can send many packets, each with different SSRC, that match 431 the key. This would cause the group size to inflate exponentially. 432 However, with a hash applied, an attacker cannot guess those SSRC 433 which will match against the key. In fact, an attacker will have to 434 send 2**m different SSRC before finding one that matches, on average. 435 Of course, the effect of a match causes an increase of group 436 membership by 2**m. But, the use of sampling means that an attacker 437 will have to send many packets before an effect can be observed. 439 6 Acknowledgements 441 The authors wish to thank Bill Fenner and Vern Paxson for their 442 comments. 444 7 Bibliography 446 [1] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a 447 transport protocol for real-time applications," Request for Comments 448 (Proposed Standard) 1889, Internet Engineering Task Force, Jan. 1996. 450 [2] J. Rosenberg and H. Schulzrinne, "Timer reconsideration for 451 enhanced RTP scalability," in IEEE Infocom , (San Francisco, 452 California), March/April 1998. 454 [3] International Telecommunication Union, "Visual telephone systems 455 and equipment for local area networks which provide a non-guaranteed 456 quality of service," Recommendation H.323, Telecommunication 457 Standardization Sector of ITU, Geneva, Switzerland, May 1996. 459 [4] R. Rivest, "The MD5 message-digest algorithm," Request for 460 Comments (Informational) 1321, Internet Engineering Task Force, Apr. 461 1992. 463 [5] J. Rosenberg, Protocols and Algorithms for Supporting Distributed 464 Internet Telephony PhD thesis, Columbia University, Dec. 1999. Work 465 in Progress. 467 8 Authors' Addresses 469 Jonathan Rosenberg 470 dynamicsoft 471 200 Executive Drive 472 West Orange, NJ 07052 473 USA 474 email: jdrosen@dynamicsoft.com 476 Henning Schulzrinne 477 Columbia University 478 M/S 0401 479 1214 Amsterdam Ave. 480 New York, NY 10027-7003 481 USA 482 email: schulzrinne@cs.columbia.edu