idnits 2.17.1 draft-ietf-avt-rtpsample-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. ** Expected the document's filename to be given on the first page, but didn't find any == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Couldn't figure out when the document was first submitted -- there may comments or warnings related to the use of a disclaimer for pre-RFC5378 work that could not be issued because of this. Please check the Legal Provisions document at https://trustee.ietf.org/license-info to determine if you need the pre-RFC5378 disclaimer. -- The document date (November 10, 1998) is 9296 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? '1' on line 385 looks like a reference -- Missing reference section? '2' on line 389 looks like a reference -- Missing reference section? '3' on line 393 looks like a reference -- Missing reference section? '4' on line 398 looks like a reference Summary: 9 errors (**), 0 flaws (~~), 3 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force Audio/Video Transport wg 2 Internet Draft J. Rosenberg, H. Schulzrinne 3 ietf-avt-rtpsample-01.txt Bell Laboratories/Columbia U. 4 November 10, 1998 5 Expires: May 10, 1999 7 Sampling of the Group Membership in RTP 9 STATUS OF THIS MEMO 11 This document is an Internet-Draft. Internet-Drafts are working 12 documents of the Internet Engineering Task Force (IETF), its areas, 13 and its working groups. Note that other groups may also distribute 14 working documents as Internet-Drafts. 16 Internet-Drafts are draft documents valid for a maximum of six months 17 and may be updated, replaced, or obsoleted by other documents at any 18 time. It is inappropriate to use Internet-Drafts as reference 19 material or to cite them other than as ``work in progress''. 21 To learn the current status of any Internet-Draft, please check the 22 ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow 23 Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), 24 munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or 25 ftp.isi.edu (US West Coast). 27 Distribution of this document is unlimited. 29 ABSTRACT 31 In large multicast groups, the size of the group 32 membership table maintained by RTP (Real Time Transport 33 Protocol) participants may become unwieldy, particularly 34 for embedded devices with limited memory and processing 35 power. This document discusses mechanisms for sampling of 36 this group membership table in order to reduce the memory 37 requirements. Several mechanisms are proposed, and the 38 performance of each is considered. 40 1 Introduction 42 RTP, the Real Time Transport protocol [1], mandates that RTCP packets 43 be transmitted from each participant with a period roughly 44 proportional to the group size. The group size is obtained by storing 45 a table, containing an entry for each unique SSRC seen in RTP and 46 RTCP packets. As members leave or time out, entries are deleted, and 47 as new members join, entries are added. The table is thus highly 48 dynamic. 50 For large multicast sessions, such as an mbone broadcast or IP based 51 TV distribution, group sizes can be extremely large, on the order of 52 hundreds of thousands to millions of participants. In these 53 environments, RTCP may not always be used, and thus the group 54 membership table isn't needed. However, it is highly desirable for 55 RTP to scale well for groups with one member to groups with one 56 million members, without human intervention to "turn off" RTCP when 57 its no longer appropriate. This means that the same tools and systems 58 can be used for both small conferences and TV broadcasts in a smooth, 59 scalable fashion. 61 Previous work [2] has identified three major scalability problems 62 with RTP. These are: 64 1. Congestion due to floods of RTCP packets in highly dynamic 65 groups 67 2. Large delays between receipt of RTCP packets from a single 68 user 70 3. Large size of the group membership table 72 The reconsideration algorithm helps to alleviate the first of these. 73 This document addresses the third, that of large group size tables. 75 Storage of an SSRC table with one million members, for example, 76 requires at least four megabytes. As a result, embedded devices with 77 small memory footprints may have difficulty under these conditions. 78 To solve this problem, SSRC sampling has been proposed. SSRC sampling 79 uses statistical sampling to obtain a stochastic estimate of the 80 group membership. There are many issues that arise when this is done. 81 This document reviews these issues and discusses the mechanisms which 82 can be applied by implementors. 84 2 Basic Operation 86 The basic idea behind SSRC sampling is simple. Each participant 87 maintains a key K of 32 bits, and a mask M of 32 bits. Assume that m 88 of the M bits in the mask are 1, and the remainder are zero. When an 89 RTCP packet arrives with some SSRC S, rather than placing it in the 90 table, it is first sampled. The sampling is performed by ANDing the 91 key and the mask, and also ANDing the SSRC and the mask. The 92 resulting values are compared. If equal, the SSRC is stored in the 93 table. If not equal, the SSRC is rejected, and the packet is treated 94 as if it were never received. 96 The key can be anything, but is usually chosen to be the SSRC of the 97 user who is performing the sampling. 99 This sampling process can be described mathematically as: 101 D = (K*M == S*M) 103 Where the * operator denotes AND and the = operator denotes a test 104 for equality. D represents the sampling decision. 106 According to the RTP specification, the SSRC's used by session 107 participants are chosen randomly. If the distribution is also 108 uniform, it is easy to see that the above filtering will cause 1 out 109 of 2**m SSRC's to be placed in the table, where m is the number of 110 bits in the mask, M, which are one. Thus, the sampling probability is 111 2**-m. 113 Then, to obtain an actual group size estimate, L, the number of 114 entries in the table N is multiplied by 2**m: 116 L = N * 2**m 118 Care must be taken when choosing which bits to set to 1 in the mask. 119 Although the RTP specification mandates randomly chosen SSRC, there 120 are many known implementations which do not conform to this. In 121 particular, the ITU H.323 [3] series of recommendations allows the 122 central control element, the gatekeeper, to assign the least 123 significant 8 bits of the SSRC, while the most significant are 124 randomly chosen by RTP participants. 126 The safest way to handle this problem is to first hash the SSRC using 127 a cryptographically secure hash, such as MD5 [4], and then choose 32 128 of the bits in the result as the SSRC used in the above computation. 129 This provides much better randomness, and doesn't require detailed 130 knowledge about how various implementations actually set the SSRC. 132 2.1 Performance 134 The estimate is more accurate as the value of m decreases, less 135 accurate as it increases. This can be demonstrated analytically. If 136 the actual group size is G, the standard deviation to mean of the 137 estimate L (coefficient of variation) is: 139 sqrt{(1 - p)/pL} 141 This equation can be used as a guide for selecting the thresholds for 142 when to change the sampling factor, as discussed below. For example, 143 if the target is a 1% standard deviation to mean, the sampling 144 probability should be no smaller than .5 when there are ten thousand 145 group members. More generally, to achieve a desired standard 146 deviation to mean ration of T, the sampling probability should be no 147 less than: 149 p > 1 / (1 + G*(T**2)) 151 3 Increasing the Sampling Probability 153 The above simple sampling procedure would work fine if the group size 154 was static. However, it is not. A participant joining an RTP session 155 will initially see just one participant (themselves). As packets are 156 received, the group size as seen by that participant will increase. 157 To handle this, the sampling probability must be made dynamic, and 158 will need to increase and decrease as group sizes vary. 160 The procedure for increasing the sampling probability is easy. A 161 participant starts with a mask with m=0. Under these conditions, 162 every received SSRC will be stored in the table, so there is 163 effectively no sampling. At some point, the value of m is increased. 164 This implies that approximately half of the SSRC already in the table 165 will no longer match the key under the masking operation. In order to 166 maintain a correct estimate, these SSRC must be discarded from the 167 table. New SSRC are only added if they match the key under the new 168 mask. 170 The decision about when to increase the number of bits in the mask is 171 also simple. Lets say an RTP participant has a memory with enough 172 capacity to store C entries in the table. The best estimate of the 173 group is obtained by the largest sampling probability. This also 174 means that the best estimate is obtained the fuller the table is. A 175 reasonable approach is therefore to increase the number of bits in 176 the mask just as the table fills to C. This will generally cause its 177 contents to be reduced by half. Once the table fills again, the 178 number of bits in the mask is further increased. 180 4 Reducing the Sampling Probability 182 If the group size begins to decrease, it may be necessary to reduce 183 the number of bits in the mask. Not doing so will result in extremely 184 poor estimates of the group size. Unfortunately, reducing the number 185 of bits in the mask is more difficult than increasing them. 187 When the number of bits in the mask increases, the user compensates 188 by removing those SSRC which no longer match. When the number of bits 189 decreases, the user should theoretically add back those users whose 190 SSRC now match. However, these SSRC are not known, since the whole 191 point of sampling was to not have to remember them. Therefore, if the 192 number of bits in the mask is just reduced without any changes in the 193 membership table, the group estimate will instantly drop by exactly 194 half. 196 To compensate for this, some kind of algorithm is needed. Two 197 approaches are presented here: a corrective-factor solution, and a 198 binning solution. 200 4.1 Corrective Factors 202 The idea with the corrective factors is to add in or multiply the 203 group size estimate by some corrective factor to compensate for the 204 change in sample mask. The corrective factors should decay as the 205 "fudged" members are eventually learned about and actually placed in 206 the membership list. 208 The multiplicative factor starts at 2, and gradually decays to one. 209 The additive factor starts at the difference between the group size 210 estimate before and after the number of bits in the mask is reduced, 211 and decays to 0 (this is not always half the group size estimate, as 212 the corrective factors can be compounded, see below). Both factors 213 decay over a time of c*L(ts), where c is the average RTCP packet size 214 divided by 5% of the session bandwidth, and L(ts) is the group size 215 estimate just before the change in the number of bits in the mask at 216 time ts. The reason for this constant is as follows. In the case 217 where the actual group membership has not changed, those members 218 which were forgotten will still be sending RTCP packets. The amount 219 of time it will take to hear an RTCP packet from each of them is the 220 average RTCP interval, which is cL(ts). Therefore, by cL(ts) seconds 221 after the change in the mask, those users who were fudged by the 222 corrective factor should have sent a packet and thus appear in the 223 table. We chose to decay both functions linearly. This is because the 224 rate of arrival of RTCP packets is linear. 226 What happens if the number of bits in the mask is reduced once again 227 before the previous corrective factor has expired? In that case, we 228 compound the factors by using yet another one. Let fi() represent the 229 ith correction function. If ts is the time when the number of bits in 230 the mask is reduced, we can describe the additive correction factor 231 as: 233 / 0 , t < ts 234 | ts + cL(ts-) - t 235 fi(t) = |( L(ts-) - L(ts+)) ---------------- ,ts ts + cL(ts-) 238 \ 240 and the multiplicative factor as: 242 / 1 , t < ts 243 | 244 | ts + 2cL(ts-) - t 245 | ----------------- , ts < t < ts + cL(ts-) 246 | cL(ts-) 247 | 248 \ 1 , t > ts + cL(ts-) 250 Note that in these equations, L(t) denotes the group size estimate 251 obtained including the corrective factors except for the new factor. 253 Finally, the actual group size estimate L(t) is given by: 255 ----- 256 \ 257 L(t) = / fi(t) + N*(2**m) 258 ----- 259 i 261 for the additive factor, and: 263 ------ 264 | | 265 | | 266 L(t)= | | N*(2**m)*fi(t) 268 for the multiplicative factor. 270 Simulations showed that both algorithms performed equally well, but 271 both tended to seriously underestimate the group size when the group 272 membership was rapidly declining. This is demonstrated in the 273 performance curves below. 275 4.2 Binning Algorithm 277 In order to more correctly estimate the group size even when it was 278 rapidly decreasing, a binning algorithm can be used. The algorithm 279 works as follows. There are 32 bins, same as the number of bits in 280 the sample mask. When an RTCP packet from a new user arrives whose 281 SSRC matches the key under the masking operation, it is placed in the 282 mth bin (where m is the number of ones in the mask) otherwise it is 283 discarded. 285 When the number of bits in the mask is to be increased, those members 286 in the bin who still match after the new mask are moved into the next 287 higher bin. Those who don't match are discarded. When the number of 288 bits in the mask is to be decreased, nothing is done. Users in the 289 various bins stay where they are. However, when an RTCP packet for a 290 user shows up, and the user is in a bin with a higher value than the 291 current number of bits in the mask, it is moved into the bin 292 corresponding to the current number of bits in the mask. Finally, the 293 group size estimate L(t) is obtained by: 295 31 296 ---- 297 \ 298 L(t) = / B(i) * 2**i 299 ---- 300 i=0 302 Where B(i) are the number of users in the ith bin. 304 The algorithm works by basically keeping the old estimate when the 305 number of bits in the mask drops. As users arrive, they are gradually 306 moved into the lower bin, reducing the amount that the higher bin 307 contributes to the total estimate. However, the old estimate is still 308 updated in the sense that users which timeout are removed from the 309 higher bin, and users who send BYE packets are also removed from the 310 higher bin. This allows the older estimate to still adapt, while 311 gradually phasing it out. It is this adaptation which makes it 312 perform much better than the corrective algorithms. The algorithm is 313 also extremely simple. 315 4.3 Comparison 317 The algorithms are all compared via simulation in Table 1. In the 318 simulation, 10,001 users join a group at t=0. At t=10,000, 5000 of 319 them leave. At t=20,000, another 5000 leave. All implement an SSRC 320 sampling algorithm, unconditional forward and BYE reconsideration. 321 The table depicts the group size estimate from time 20,000 to time 322 25,000 as seen by the single user present throughout the entire 323 session. In the simulation, a memory size of 1000 SSRC was assumed. 324 The performance without sampling, and with sampling with the 325 additive, multiplicative, and bin-based correction are depicted. 327 As the table shows, the bin based algorithm performs particularly 328 well at capturing the group size estimate towards the tail end of the 329 simulation. 331 Time No Sampling Binned Additive Multiplicative 332 ---- ----------- ------ -------- -------------- 333 20000 5001 5024 5024 5024 334 20250 4379 4352 4352 4352 335 20500 3881 3888 3900 3853 336 20750 3420 3456 3508 3272 337 21000 3018 2992 3100 2701 338 21250 2677 2592 2724 2225 339 21500 2322 2272 2389 1783 340 21750 2034 2096 2125 1414 341 22000 1756 1760 1795 1007 342 22250 1476 1472 1459 582 343 22500 1243 1232 1135 230 344 22750 1047 1040 807 80 345 23000 856 864 468 59 346 23250 683 704 106 44 347 23500 535 512 32 32 348 23750 401 369 24 24 349 24000 290 257 17 17 350 24250 198 177 13 13 351 24500 119 129 11 11 352 24750 59 65 8 8 353 25000 18 1 2 2 355 Table 1 357 4.4 Sender Sampling 359 Care must be taken in handling senders when using SSRC sampling. 360 Since the number of senders is generally small, and they contribute 361 significantly to the computation of the RTCP interval, sampling 362 should not be applied to them. However, they must be kept in a 363 separate table, and not be "counted" as part of the general group 364 membership. If this is done, the group size estimate will be inflated 365 to overemphasize the senders. 367 This is easily demonstrated analytically. Let Ns be the number of 368 senders, and Nr be the number of receivers. The membership table will 369 contain all Ns senders and (1/2)**m of the receivers. The total group 370 size estimate in the current draft is obtained by 2**m times the 371 number of entries in the table. Therefore, the group size estimate 372 becomes: 374 L(t) = (2**m) Ns + Nr 376 which exponentially weights the senders. 378 This is easily compensated for in the binning algorithm. A sender is 379 always placed in the 0th bin. When a sender becomes a receiver, it is 380 moved into the bin corresponding to the current value of m, if its 381 SSRC matches the key under the masked comparison operation. 383 5 Bibliography 385 [1] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a 386 transport protocol for real-time applications," Request for Comments 387 (Proposed Standard) 1889, Internet Engineering Task Force, Jan. 1996. 389 [2] J. Rosenberg and H. Schulzrinne, "Timer reconsideration for 390 enhanced RTP scalability," (San Francisco, California), March/April 391 1998. 393 [3] International Telecommunication Union, "Visual telephone systems 394 and equipment for local area networks which provide a non-guaranteed 395 quality of service," Recommendation H.323, Telecommunication 396 Standardization Sector of ITU, Geneva, Switzerland, May 1996. 398 [4] R. Rivest, "The MD5 message-digest algorithm," Request for 399 Comments (Informational) 1321, Internet Engineering Task Force, Apr. 400 1992. 402 6 Full Copyright Statement 404 Copyright (C) The Internet Society (1998). All Rights Reserved. 406 This document and translations of it may be copied and furnished to 407 others, and derivative works that comment on or otherwise explain it 408 or assist in its implmentation may be prepared, copied, published and 409 distributed, in whole or in part, without restriction of any kind, 410 provided that the above copyright notice and this paragraph are 411 included on all such copies and derivative works. 413 However, this document itself may not be modified in any way, such as 414 by removing the copyright notice or references to the Internet Soci- 415 ety or other Internet organizations, except as needed for the purpose 416 of developing Internet standards in which case the procedures for 417 copyrights defined in the Internet Standards process must be fol- 418 lowed, or as required to translate it into languages other than 419 English. 421 The limited permissions granted above are perpetual and will not be 422 revoked by the Internet Society or its successors or assigns. 424 This document and the information contained herein is provided on an 425 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 426 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 427 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 428 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MER- 429 CHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE." 431 7 Authors' Addresses 433 Jonathan Rosenberg 434 Bell Laboratories, Lucent Technologies 435 101 Crawfords Corner Rd. 436 Holmdel, NJ 07733 437 USA 438 Rm. 4C-526 439 email: jdrosen@bell-labs.com 441 Henning Schulzrinne 442 Columbia University 443 M/S 0401 444 1214 Amsterdam Ave. 445 New York, NY 10027-7003 446 USA 447 email: schulzrinne@cs.columbia.edu