idnits 2.17.1 

draft-ietf-avt-rtpsample-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  ** Expected the document's filename to be given on the first page, but
     didn't find any

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Couldn't figure out when the document was first submitted -- there may
     comments or warnings related to the use of a disclaimer for pre-RFC5378
     work that could not be issued because of this.  Please check the Legal
     Provisions document at https://trustee.ietf.org/license-info to determine
     if you need the pre-RFC5378 disclaimer.

  -- The document date (November 10, 1998) is 9296 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? '1' on line 385 looks like a reference

  -- Missing reference section? '2' on line 389 looks like a reference

  -- Missing reference section? '3' on line 393 looks like a reference

  -- Missing reference section? '4' on line 398 looks like a reference


     Summary: 9 errors (**), 0 flaws (~~), 3 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                 Audio/Video Transport wg
2	Internet Draft                              J. Rosenberg, H. Schulzrinne
3	ietf-avt-rtpsample-01.txt                  Bell Laboratories/Columbia U.
4	November 10, 1998
5	Expires: May 10, 1999

7	                Sampling of the Group Membership in RTP

9	STATUS OF THIS MEMO

11	   This document is an Internet-Draft. Internet-Drafts are working
12	   documents of the Internet Engineering Task Force (IETF), its areas,
13	   and its working groups.  Note that other groups may also distribute
14	   working documents as Internet-Drafts.

16	   Internet-Drafts are draft documents valid for a maximum of six months
17	   and may be updated, replaced, or obsoleted by other documents at any
18	   time.  It is inappropriate to use Internet-Drafts as reference
19	   material or to cite them other than as ``work in progress''.

21	   To learn the current status of any Internet-Draft, please check the
22	   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
23	   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
24	   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
25	   ftp.isi.edu (US West Coast).

27	   Distribution of this document is unlimited.

29	                                 ABSTRACT

31	         In large multicast groups, the size of the group
32	         membership table maintained by RTP (Real Time Transport
33	         Protocol) participants may become unwieldy, particularly
34	         for embedded devices with limited memory and processing
35	         power. This document discusses mechanisms for sampling of
36	         this group membership table in order to reduce the memory
37	         requirements. Several mechanisms are proposed, and the
38	         performance of each is considered.

40	1 Introduction

42	   RTP, the Real Time Transport protocol [1], mandates that RTCP packets
43	   be transmitted from each participant with a period roughly
44	   proportional to the group size. The group size is obtained by storing
45	   a table, containing an entry for each unique SSRC seen in RTP and
46	   RTCP packets. As members leave or time out, entries are deleted, and
47	   as new members join, entries are added. The table is thus highly
48	   dynamic.

50	   For large multicast sessions, such as an mbone broadcast or IP based
51	   TV distribution, group sizes can be extremely large, on the order of
52	   hundreds of thousands to millions of participants. In these
53	   environments, RTCP may not always be used, and thus the group
54	   membership table isn't needed. However, it is highly desirable for
55	   RTP to scale well for groups with one member to groups with one
56	   million members, without human intervention to "turn off" RTCP when
57	   its no longer appropriate. This means that the same tools and systems
58	   can be used for both small conferences and TV broadcasts in a smooth,
59	   scalable fashion.

61	   Previous work [2] has identified three major scalability problems
62	   with RTP. These are:

64	        1.   Congestion due to floods of RTCP packets in highly dynamic
65	             groups

67	        2.   Large delays between receipt of RTCP packets from a single
68	             user

70	        3.   Large size of the group membership table

72	   The reconsideration algorithm helps to alleviate the first of these.
73	   This document addresses the third, that of large group size tables.

75	   Storage of an SSRC table with one million members, for example,
76	   requires at least four megabytes. As a result, embedded devices with
77	   small memory footprints may have difficulty under these conditions.
78	   To solve this problem, SSRC sampling has been proposed. SSRC sampling
79	   uses statistical sampling to obtain a stochastic estimate of the
80	   group membership. There are many issues that arise when this is done.
81	   This document reviews these issues and discusses the mechanisms which
82	   can be applied by implementors.

84	2 Basic Operation

86	   The basic idea behind SSRC sampling is simple. Each participant
87	   maintains a key K of 32 bits, and a mask M of 32 bits. Assume that m
88	   of the M bits in the mask are 1, and the remainder are zero. When an
89	   RTCP packet arrives with some SSRC S, rather than placing it in the
90	   table, it is first sampled. The sampling is performed by ANDing the
91	   key and the mask, and also ANDing the SSRC and the mask. The
92	   resulting values are compared. If equal, the SSRC is stored in the
93	   table. If not equal, the SSRC is rejected, and the packet is treated
94	   as if it were never received.

96	   The key can be anything, but is usually chosen to be the SSRC of the
97	   user who is performing the sampling.

99	   This sampling process can be described mathematically as:

101	   D = (K*M == S*M)

103	   Where the * operator denotes AND and the = operator denotes a test
104	   for equality. D represents the sampling decision.

106	   According to the RTP specification, the SSRC's used by session
107	   participants are chosen randomly. If the distribution is also
108	   uniform, it is easy to see that the above filtering will cause 1 out
109	   of 2**m SSRC's to be placed in the table, where m is the number of
110	   bits in the mask, M, which are one. Thus, the sampling probability is
111	   2**-m.

113	   Then, to obtain an actual group size estimate, L, the number of
114	   entries in the table N is multiplied by 2**m:

116	   L = N * 2**m

118	   Care must be taken when choosing which bits to set to 1 in the mask.
119	   Although the RTP specification mandates randomly chosen SSRC, there
120	   are many known implementations which do not conform to this. In
121	   particular, the ITU H.323 [3] series of recommendations allows the
122	   central control element, the gatekeeper, to assign the least
123	   significant 8 bits of the SSRC, while the most significant are
124	   randomly chosen by RTP participants.

126	   The safest way to handle this problem is to first hash the SSRC using
127	   a cryptographically secure hash, such as MD5 [4], and then choose 32
128	   of the bits in the result as the SSRC used in the above computation.
129	   This provides much better randomness, and doesn't require detailed
130	   knowledge about how various implementations actually set the SSRC.

132	2.1 Performance

134	   The estimate is more accurate as the value of m decreases, less
135	   accurate as it increases. This can be demonstrated analytically. If
136	   the actual group size is G, the standard deviation to mean of the
137	   estimate L (coefficient of variation) is:

139	   sqrt{(1 - p)/pL}

141	   This equation can be used as a guide for selecting the thresholds for
142	   when to change the sampling factor, as discussed below. For example,
143	   if the target is a 1% standard deviation to mean, the sampling
144	   probability should be no smaller than .5 when there are ten thousand
145	   group members. More generally, to achieve a desired standard
146	   deviation to mean ration of T, the sampling probability should be no
147	   less than:

149	   p > 1 / (1 + G*(T**2))

151	3 Increasing the Sampling Probability

153	   The above simple sampling procedure would work fine if the group size
154	   was static. However, it is not. A participant joining an RTP session
155	   will initially see just one participant (themselves). As packets are
156	   received, the group size as seen by that participant will increase.
157	   To handle this, the sampling probability must be made dynamic, and
158	   will need to increase and decrease as group sizes vary.

160	   The procedure for increasing the sampling probability is easy. A
161	   participant starts with a mask with m=0. Under these conditions,
162	   every received SSRC will be stored in the table, so there is
163	   effectively no sampling. At some point, the value of m is increased.
164	   This implies that approximately half of the SSRC already in the table
165	   will no longer match the key under the masking operation. In order to
166	   maintain a correct estimate, these SSRC must be discarded from the
167	   table. New SSRC are only added if they match the key under the new
168	   mask.

170	   The decision about when to increase the number of bits in the mask is
171	   also simple. Lets say an RTP participant has a memory with enough
172	   capacity to store C entries in the table. The best estimate of the
173	   group is obtained by the largest sampling probability. This also
174	   means that the best estimate is obtained the fuller the table is. A
175	   reasonable approach is therefore to increase the number of bits in
176	   the mask just as the table fills to C. This will generally cause its
177	   contents to be reduced by half. Once the table fills again, the
178	   number of bits in the mask is further increased.

180	4 Reducing the Sampling Probability

182	   If the group size begins to decrease, it may be necessary to reduce
183	   the number of bits in the mask. Not doing so will result in extremely
184	   poor estimates of the group size. Unfortunately, reducing the number
185	   of bits in the mask is more difficult than increasing them.

187	   When the number of bits in the mask increases, the user compensates
188	   by removing those SSRC which no longer match. When the number of bits
189	   decreases, the user should theoretically add back those users whose
190	   SSRC now match. However, these SSRC are not known, since the whole
191	   point of sampling was to not have to remember them. Therefore, if the
192	   number of bits in the mask is just reduced without any changes in the
193	   membership table, the group estimate will instantly drop by exactly
194	   half.

196	   To compensate for this, some kind of algorithm is needed. Two
197	   approaches are presented here: a corrective-factor solution, and a
198	   binning solution.

200	4.1 Corrective Factors

202	   The idea with the corrective factors is to add in or multiply the
203	   group size estimate by some corrective factor to compensate for the
204	   change in sample mask. The corrective factors should decay as the
205	   "fudged" members are eventually learned about and actually placed in
206	   the membership list.

208	   The multiplicative factor starts at 2, and gradually decays to one.
209	   The additive factor starts at the difference between the group size
210	   estimate before and after the number of bits in the mask is reduced,
211	   and decays to 0 (this is not always half the group size estimate, as
212	   the corrective factors can be compounded, see below). Both factors
213	   decay over a time of c*L(ts), where c is the average RTCP packet size
214	   divided by 5% of the session bandwidth, and L(ts) is the group size
215	   estimate just before the change in the number of bits in the mask at
216	   time ts. The reason for this constant is as follows. In the case
217	   where the actual group membership has not changed, those members
218	   which were forgotten will still be sending RTCP packets. The amount
219	   of time it will take to hear an RTCP packet from each of them is the
220	   average RTCP interval, which is cL(ts). Therefore, by cL(ts) seconds
221	   after the change in the mask, those users who were fudged by the
222	   corrective factor should have sent a packet and thus appear in the
223	   table. We chose to decay both functions linearly. This is because the
224	   rate of arrival of RTCP packets is linear.

226	   What happens if the number of bits in the mask is reduced once again
227	   before the previous corrective factor has expired? In that case, we
228	   compound the factors by using yet another one. Let fi() represent the
229	   ith correction function. If ts is the time when the number of bits in
230	   the mask is reduced, we can describe the additive correction factor
231	   as:

233	             / 0                                  ,   t < ts
234	             |                   ts + cL(ts-) - t
235	   fi(t)  =  |( L(ts-) - L(ts+)) ---------------- ,ts<t<ts+cL(ts-)
236	             |                        cL(ts-)
237	             | 0                                  ,  t > ts + cL(ts-)
238	             \

240	   and the multiplicative factor as:

242	             /  1                      , t < ts
243	             |
244	             |  ts + 2cL(ts-) - t
245	             |  -----------------      , ts < t < ts + cL(ts-)
246	             |       cL(ts-)
247	             |
248	             \  1                      , t > ts + cL(ts-)

250	   Note that in these equations, L(t) denotes the group size estimate
251	   obtained including the corrective factors except for the new factor.

253	   Finally, the actual group size estimate L(t) is given by:

255	          -----
256	          \
257	   L(t) = /      fi(t) + N*(2**m)
258	          -----
259	            i

261	   for the additive factor, and:

263	          ------
264	           |  |
265	           |  |
266	   L(t)=   |  |  N*(2**m)*fi(t)

268	   for the multiplicative factor.

270	   Simulations showed that both algorithms performed equally well, but
271	   both tended to seriously underestimate the group size when the group
272	   membership was rapidly declining. This is demonstrated in the
273	   performance curves below.

275	4.2 Binning Algorithm

277	   In order to more correctly estimate the group size even when it was
278	   rapidly decreasing, a binning algorithm can be used. The algorithm
279	   works as follows. There are 32 bins, same as the number of bits in
280	   the sample mask. When an RTCP packet from a new user arrives whose
281	   SSRC matches the key under the masking operation, it is placed in the
282	   mth bin (where m is the number of ones in the mask) otherwise it is
283	   discarded.

285	   When the number of bits in the mask is to be increased, those members
286	   in the bin who still match after the new mask are moved into the next
287	   higher bin. Those who don't match are discarded. When the number of
288	   bits in the mask is to be decreased, nothing is done. Users in the
289	   various bins stay where they are. However, when an RTCP packet for a
290	   user shows up, and the user is in a bin with a higher value than the
291	   current number of bits in the mask, it is moved into the bin
292	   corresponding to the current number of bits in the mask. Finally, the
293	   group size estimate L(t) is obtained by:

295	           31
296	          ----
297	          \
298	   L(t) = /    B(i) * 2**i
299	          ----
300	           i=0

302	   Where B(i) are the number of users in the ith bin.

304	   The algorithm works by basically keeping the old estimate when the
305	   number of bits in the mask drops. As users arrive, they are gradually
306	   moved into the lower bin, reducing the amount that the higher bin
307	   contributes to the total estimate. However, the old estimate is still
308	   updated in the sense that users which timeout are removed from the
309	   higher bin, and users who send BYE packets are also removed from the
310	   higher bin. This allows the older estimate to still adapt, while
311	   gradually phasing it out. It is this adaptation which makes it
312	   perform much better than the corrective algorithms. The algorithm is
313	   also extremely simple.

315	4.3 Comparison

317	   The algorithms are all compared via simulation in Table 1. In the
318	   simulation, 10,001 users join a group at t=0. At t=10,000, 5000 of
319	   them leave. At t=20,000, another 5000 leave. All implement an SSRC
320	   sampling algorithm, unconditional forward and BYE reconsideration.
321	   The table depicts the group size estimate from time 20,000 to time
322	   25,000 as seen by the single user present throughout the entire
323	   session. In the simulation, a memory size of 1000 SSRC was assumed.
324	   The performance without sampling, and with sampling with the
325	   additive, multiplicative, and bin-based correction are depicted.

327	   As the table shows, the bin based algorithm performs particularly
328	   well at capturing the group size estimate towards the tail end of the
329	   simulation.

331	   Time    No Sampling     Binned  Additive  Multiplicative
332	   ----    -----------     ------  --------  --------------
333	   20000   5001            5024    5024      5024
334	   20250   4379            4352    4352      4352
335	   20500   3881            3888    3900      3853
336	   20750   3420            3456    3508      3272
337	   21000   3018            2992    3100      2701
338	   21250   2677            2592    2724      2225
339	   21500   2322            2272    2389      1783
340	   21750   2034            2096    2125      1414
341	   22000   1756            1760    1795      1007
342	   22250   1476            1472    1459      582
343	   22500   1243            1232    1135      230
344	   22750   1047            1040    807       80
345	   23000   856             864     468       59
346	   23250   683             704     106       44
347	   23500   535             512     32        32
348	   23750   401             369     24        24
349	   24000   290             257     17        17
350	   24250   198             177     13        13
351	   24500   119             129     11        11
352	   24750   59              65      8         8
353	   25000   18              1       2         2

355	                   Table 1

357	4.4 Sender Sampling

359	   Care must be taken in handling senders when using SSRC sampling.
360	   Since the number of senders is generally small, and they contribute
361	   significantly to the computation of the RTCP interval, sampling
362	   should not be applied to them. However, they must be kept in a
363	   separate table, and not be "counted" as part of the general group
364	   membership. If this is done, the group size estimate will be inflated
365	   to overemphasize the senders.

367	   This is easily demonstrated analytically. Let Ns be the number of
368	   senders, and Nr be the number of receivers. The membership table will
369	   contain all Ns senders and (1/2)**m of the receivers. The total group
370	   size estimate in the current draft is obtained by 2**m times the
371	   number of entries in the table. Therefore, the group size estimate
372	   becomes:

374	   L(t) = (2**m) Ns + Nr

376	   which exponentially weights the senders.

378	   This is easily compensated for in the binning algorithm. A sender is
379	   always placed in the 0th bin. When a sender becomes a receiver, it is
380	   moved into the bin corresponding to the current value of m, if its
381	   SSRC matches the key under the masked comparison operation.

383	5 Bibliography

385	   [1] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a
386	   transport protocol for real-time applications," Request for Comments
387	   (Proposed Standard) 1889, Internet Engineering Task Force, Jan. 1996.

389	   [2] J. Rosenberg and H. Schulzrinne, "Timer reconsideration for
390	   enhanced RTP scalability," (San Francisco, California), March/April
391	   1998.

393	   [3] International Telecommunication Union, "Visual telephone systems
394	   and equipment for local area networks which provide a non-guaranteed
395	   quality of service," Recommendation H.323, Telecommunication
396	   Standardization Sector of ITU, Geneva, Switzerland, May 1996.

398	   [4] R. Rivest, "The MD5 message-digest algorithm," Request for
399	   Comments (Informational) 1321, Internet Engineering Task Force, Apr.
400	   1992.

402	6 Full Copyright Statement

404	   Copyright (C) The Internet Society (1998). All Rights Reserved.

406	   This document and translations of it may be copied and furnished to
407	   others, and derivative works that comment on or otherwise explain it
408	   or assist in its implmentation may be prepared, copied, published and
409	   distributed, in whole or in part, without restriction of any kind,
410	   provided that the above copyright notice and this paragraph are
411	   included on all such copies and derivative works.

413	   However, this document itself may not be modified in any way, such as
414	   by removing the copyright notice or references to the Internet Soci-
415	   ety or other Internet organizations, except as needed for the purpose
416	   of developing Internet standards in which case the procedures for
417	   copyrights defined in the Internet Standards process must be fol-
418	   lowed, or as required to translate it into languages other than
419	   English.

421	   The limited permissions granted above are perpetual and will not be
422	   revoked by the Internet Society or its successors or assigns.

424	   This document and the information contained herein is provided on an
425	   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
426	   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
427	   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
428	   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MER-
429	   CHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."

431	7 Authors' Addresses

433	   Jonathan Rosenberg
434	   Bell Laboratories, Lucent Technologies
435	   101 Crawfords Corner Rd.
436	   Holmdel, NJ 07733
437	   USA
438	   Rm. 4C-526
439	   email: jdrosen@bell-labs.com

441	   Henning Schulzrinne
442	   Columbia University
443	   M/S 0401
444	   1214 Amsterdam Ave.
445	   New York, NY 10027-7003
446	   USA
447	   email: schulzrinne@cs.columbia.edu