idnits 2.17.1 

draft-ietf-avt-rtpsample-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 11 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** There is 1 instance of too long lines in the document, the longest one
     being 1 character in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 22, 1999) is 8921 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? '1' on line 446 looks like a reference

  -- Missing reference section? '2' on line 450 looks like a reference

  -- Missing reference section? '3' on line 454 looks like a reference

  -- Missing reference section? '4' on line 459 looks like a reference

  -- Missing reference section? '5' on line 463 looks like a reference


     Summary: 4 errors (**), 0 flaws (~~), 2 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                 Audio/Video Transport wg
2	Internet Draft                              J. Rosenberg, H. Schulzrinne
3	draft-ietf-avt-rtpsample-06.txt                  dynamicsoft/Columbia U.
4	November 22, 1999
5	Expires: May 22, 2000

7	                Sampling of the Group Membership in RTP

9	STATUS OF THIS MEMO

11	   This document is an Internet-Draft and is in full conformance with
12	   all provisions of Section 10 of RFC2026.

14	   Internet-Drafts are working documents of the Internet Engineering
15	   Task Force (IETF), its areas, and its working groups.  Note that
16	   other groups may also distribute working documents as Internet-
17	   Drafts.

19	   Internet-Drafts are draft documents valid for a maximum of six months
20	   and may be updated, replaced, or obsoleted by other documents at any
21	   time.  It is inappropriate to use Internet- Drafts as reference
22	   material or to cite them other than as work in progress.

24	   The list of current Internet-Drafts can be accessed at
25	   http://www.ietf.org/ietf/1id-abstracts.txt

27	   The list of Internet-Draft Shadow Directories can be accessed at
28	   http://www.ietf.org/shadow.html.

30	Abstract

32	   In large multicast groups, the size of the group membership table
33	   maintained by RTP (Real Time Transport Protocol) participants may
34	   become unwieldy, particularly for embedded devices with limited
35	   memory and processing power. This document discusses mechanisms for
36	   sampling of this group membership table in order to reduce the memory
37	   requirements. Several mechanisms are proposed, and the performance of
38	   each is considered.

40	1 Introduction

42	   RTP, the Real Time Transport protocol [1], mandates that RTCP packets
43	   be transmitted from each participant with a period roughly
44	   proportional to the group size. The group size is obtained by storing
45	   a table, containing an entry for each unique SSRC seen in RTP and
46	   RTCP packets. As members leave or time out, entries are deleted, and
47	   as new members join, entries are added. The table is thus highly
48	   dynamic.

50	   For large multicast sessions, such as an mbone broadcast or IP based
51	   TV distribution, group sizes can be extremely large, on the order of
52	   hundreds of thousands to millions of participants. In these
53	   environments, RTCP may not always be used, and thus the group
54	   membership table isn't needed. However, it is highly desirable for
55	   RTP to scale well for groups with one member to groups with one
56	   million members, without human intervention to "turn off" RTCP when
57	   it's no longer appropriate. This means that the same tools and
58	   systems can be used for both small conferences and TV broadcasts in a
59	   smooth, scalable fashion.

61	   Previous work [2] has identified three major scalability problems
62	   with RTP. These are:

64	        1.   Congestion due to floods of RTCP packets in highly dynamic
65	             groups

67	        2.   Large delays between receipt of RTCP packets from a single
68	             user

70	        3.   Large size of the group membership table

72	   The reconsideration algorithm [2] helps to alleviate the first of
73	   these. This document addresses the third, that of large group size
74	   tables.

76	   Storage of an SSRC table with one million members, for example,
77	   requires at least four megabytes. As a result, embedded devices with
78	   small memory footprints may have difficulty under these conditions.
79	   To solve this problem, SSRC sampling has been proposed. SSRC sampling
80	   uses statistical sampling to obtain a stochastic estimate of the
81	   group membership. There are many issues that arise when this is done.
82	   This document reviews these issues and discusses the mechanisms which
83	   can be applied by implementors. In particular, it focuses on three
84	   methods for adapting the sampling probability as the group membership
85	   varies. It is important to note that the IETF has been notified of
86	   intellectual property rights claimed in regard to some or all of the
87	   specification contained in this document, and in particular to one of
88	   the three mechanisms: the binning algorithm described below. For more
89	   information consult the online list of claimed rights. The two other
90	   approaches presented are inferior to the binning algorithm, but are
91	   included as they are believed to be unencumbered by IPR.

93	2 Basic Operation
94	   The basic idea behind SSRC sampling is simple. Each participant
95	   maintains a key K of 32 bits, and a mask M of 32 bits. Assume that m
96	   of the bits in the mask are 1, and the remainder are zero. When an
97	   RTCP packet arrives with some SSRC S, rather than placing it in the
98	   table, it is first sampled. The sampling is performed by ANDing the
99	   key and the mask, and also ANDing the SSRC and the mask. The
100	   resulting values are compared. If equal, the SSRC is stored in the
101	   table. If not equal, the SSRC is rejected, and the packet is treated
102	   as if it were never received.

104	   The key can be anything, but is usually chosen to be the SSRC of the
105	   user who is performing the sampling.

107	   This sampling process can be described mathematically as:

109	   D = (K*M == S*M)

111	   Where the * operator denotes AND and the == operator denotes a test
112	   for equality. D represents the sampling decision.

114	   According to the RTP specification, the SSRC's used by session
115	   participants are chosen randomly. If the distribution is also
116	   uniform, it is easy to see that the above filtering will cause 1 out
117	   of 2**m SSRC's to be placed in the table, where m is the number of
118	   bits in the mask, M, which are one. Thus, the sampling probability p
119	   is 2**-m.

121	   Then, to obtain an actual group size estimate, L, the number of
122	   entries in the table N is multiplied by 2**m:

124	   L = N * 2**m

126	   Care must be taken when choosing which bits to set to 1 in the mask.
127	   Although the RTP specification mandates randomly chosen SSRC, there
128	   are many known implementations which do not conform to this. In
129	   particular, the ITU H.323 [3] series of recommendations allows the
130	   central control element, the gatekeeper, to assign the least
131	   significant 8 bits of the SSRC, while the most significant are
132	   randomly chosen by RTP participants.

134	   The safest way to handle this problem is to first hash the SSRC using
135	   a cryptographically secure hash, such as MD5 [4], and then choose 32
136	   of the bits in the result as the SSRC used in the above computation.
137	   This provides much better randomness, and doesn't require detailed
138	   knowledge about how various implementations actually set the SSRC.

140	2.1 Performance

142	   The estimate is more accurate as the value of m decreases, less
143	   accurate as it increases. This can be demonstrated analytically. If
144	   the actual group size is G, the ratio of the standard deviation to
145	   mean of the estimate L (coefficient of variation) is:

147	   sqrt((2**m - 1)/G)

149	   This equation can be used as a guide for selecting the thresholds for
150	   when to change the sampling factor, as discussed below. For example,
151	   if the target is a 1% standard deviation to mean, the sampling
152	   probability p=2**-m should be no smaller than .5 when there are ten
153	   thousand group members. More generally, to achieve a desired standard
154	   deviation to mean ratio of T, the sampling probability should be no
155	   less than:

157	   p > 1 / (1 + G*(T**2))

159	3 Increasing the Sampling Probability

161	   The above simple sampling procedure would work fine if the group size
162	   was static. However, it is not. A participant joining an RTP session
163	   will initially see just one participant (themselves). As packets are
164	   received, the group size as seen by that participant will increase.
165	   To handle this, the sampling probability must be made dynamic, and
166	   will need to increase and decrease as group sizes vary.

168	   The procedure for increasing the sampling probability is easy. A
169	   participant starts with a mask with m=0. Under these conditions,
170	   every received SSRC will be stored in the table, so there is
171	   effectively no sampling. At some point, the value of m is increased.
172	   This implies that approximately half of the SSRC already in the table
173	   will no longer match the key under the masking operation. In order to
174	   maintain a correct estimate, these SSRC must be discarded from the
175	   table. New SSRC are only added if they match the key under the new
176	   mask.

178	   The decision about when to increase the number of bits in the mask is
179	   also simple. Let's say an RTP participant has a memory with enough
180	   capacity to store C entries in the table. The best estimate of the
181	   group is obtained by the largest sampling probability. This also
182	   means that the best estimate is obtained the fuller the table is. A
183	   reasonable approach is therefore to increase the number of bits in
184	   the mask just as the table fills to C. This will generally cause its
185	   contents to be reduced by half. Once the table fills again, the
186	   number of bits in the mask is further increased.

188	4 Reducing the Sampling Probability

190	   If the group size begins to decrease, it may be necessary to reduce
191	   the number of bits in the mask. Not doing so will result in extremely
192	   poor estimates of the group size. Unfortunately, reducing the number
193	   of bits in the mask is more difficult than increasing them.

195	   When the number of bits in the mask increases, the user compensates
196	   by removing those SSRC which no longer match. When the number of bits
197	   decreases, the user should theoretically add back those users whose
198	   SSRC now match. However, these SSRC are not known, since the whole
199	   point of sampling was to not have to remember them. Therefore, if the
200	   number of bits in the mask is just reduced without any changes in the
201	   membership table, the group estimate will instantly drop by exactly
202	   half.

204	   To compensate for this, some kind of algorithm is needed. Two
205	   approaches are presented here: a corrective-factor solution, and a
206	   binning solution. The binning solution is simpler to understand and
207	   performs better. However, we include a discussion of the corrective-
208	   factor solution for completeness and comparison, and also because it
209	   is believed to be unencumbered by IPR.

211	4.1 Corrective Factors

213	   The idea with the corrective factors is to take one of two
214	   approaches. In the first, a corrective factor is added to the group
215	   size estimate, and in the second, the group size estimate is
216	   multiplied by a corrective factor. In both cases, the purpose is to
217	   compensate for the change in sample mask. The corrective factors
218	   should decay as the "fudged" members are eventually learned about and
219	   actually placed in the membership list.

221	   The additive factor starts at the difference between the group size
222	   estimate before and after the number of bits in the mask is reduced,
223	   and decays to 0 (this is not always half the group size estimate, as
224	   the corrective factors can be compounded, see below). The
225	   multiplicative corrective factor starts at 2, and gradually decays to
226	   one. Both factors decay over a time of cL(ts-), where c is the
227	   average RTCP packet size divided by the RTCP bandwidth for receivers,
228	   and L(ts-) is the group size estimate just before the change in the
229	   number of bits in the mask at time ts. The reason for this constant
230	   is as follows. In the case where the actual group membership has not
231	   changed, those members which were forgotten will still be sending
232	   RTCP packets. The amount of time it will take to hear an RTCP packet
233	   from each of them is the average RTCP interval, which is cL(ts-).
234	   Therefore, by cL(ts-) seconds after the change in the mask, those
235	   users who were fudged by the corrective factor should have sent a
236	   packet and thus appear in the table. We chose to decay both functions
237	   linearly. This is because the rate of arrival of RTCP packets is
238	   linear.

240	   What happens if the number of bits in the mask is reduced once again
241	   before the previous corrective factor has expired? In that case, we
242	   compound the factors by using yet another one. Let fi() represent the
243	   ith additive correction function, and gi() the ith multiplicative
244	   correction function. If ts is the time when the number of bits in the
245	   mask is reduced, we can describe the additive correction factor as:

247	             / 0                                  ,   t < ts
248	             |                   ts + cL(ts-) - t
249	   fi(t)  =  |( L(ts-) - L(ts+)) ---------------- ,   ts < t < ts+cL(ts-)
250	             |                        cL(ts-)
251	             | 0                                  ,   t > ts + cL(ts-)

253	   and the multiplicative factor as:

255	             /  1                      , t < ts
256	             |
257	             |  ts + 2cL(ts-) - t
258	   gi(t)     |  -----------------      , ts < t < ts + cL(ts-)
259	             |       cL(ts-)
260	             |
261	               1                      , t > ts + cL(ts-)

263	   Note that in these equations, L(t) denotes the group size estimate
264	   obtained including the corrective factors except for the new factor.
265	   ts- is the time right before the reduction in the number of bits, and
266	   ts+ the time after. As a result, L(ts-) represents the group size
267	   estimate before the reduction, and L(ts+) the estimate right after,
268	   but not including the new factor.

270	   Finally, the actual group size estimate L(t) is given by:

272	          -----

274	   L(t) = /      fi(t) + N*(2**m)
275	          -----
276	            i

278	   for the additive factor, and:

280	          ------
281	           |  |
282	           |  |
283	   L(t)=   |  |  N*(2**m)*gi(t)

285	   for the multiplicative factor.

287	   Simulations showed that both algorithms performed equally well, but
288	   both tended to seriously underestimate the group size when the group
289	   membership was rapidly declining [5]. This is demonstrated in the
290	   performance data below.

292	   As an example, consider computation of the additive factor. The group
293	   size is 1000, c is 1 second, and m is two. With a mask of this size,
294	   a participant will, on average, observe 250 (N = 250) users. At t=0,
295	   the user decides to reduce the number of bits in the mask to 1. As a
296	   result, L(0-) is 1000, and L(0+) is 500. The additive factor
297	   therefore starts at 500, and decays to zero at time ts + cL(ts-) =
298	   1000. At time 500, lets assume N has increased to 375 (this will, on
299	   average, be the case if the actual group size has not changed). At
300	   time 500, the additive factor is 250. This is added to 2**m times N,
301	   which is 750, resulting in a group size estimate of 1000. Now, the
302	   user decides to reduce the number of bits in the mask again, so that
303	   m=0. Another additive factor is computed. This factor starts at
304	   L(ts-) (which is 1000), minus L(ts+). L(ts+) is computed without the
305	   new factor; it is the first additive factor at this time (250) plus
306	   2**m (1) times N (375). This is 625. As a result, the new additive
307	   factor starts at 1000 - 625 (375), and decays to 0 in 1000 seconds.

309	4.2 Binning Algorithm
310	   In order to more correctly estimate the group size even when it was
311	   rapidly decreasing, a binning algorithm can be used. The algorithm
312	   works as follows. There are 32 bins, same as the number of bits in
313	   the sample mask. When an RTCP packet from a new user arrives whose
314	   SSRC matches the key under the masking operation, it is placed in the
315	   mth bin (where m is the number of ones in the mask) otherwise it is
316	   discarded.

318	   When the number of bits in the mask is to be increased, those members
319	   in the bin who still match after the new mask are moved into the next
320	   higher bin. Those who don't match are discarded. When the number of
321	   bits in the mask is to be decreased, nothing is done. Users in the
322	   various bins stay where they are. However, when an RTCP packet for a
323	   user shows up, and the user is in a bin with a higher value than the
324	   current number of bits in the mask, it is moved into the bin
325	   corresponding to the current number of bits in the mask. Finally, the
326	   group size estimate L(t) is obtained by:

328	           31
329	          ----

331	   L(t) = /    B(i) * 2**i
332	          ----
333	           i=0

335	   Where B(i) are the number of users in the ith bin.

337	   The algorithm works by basically keeping the old estimate when the
338	   number of bits in the mask drops. As users arrive, they are gradually
339	   moved into the lower bin, reducing the amount that the higher bin
340	   contributes to the total estimate. However, the old estimate is still
341	   updated in the sense that users which timeout are removed from the
342	   higher bin, and users who send BYE packets are also removed from the
343	   higher bin. This allows the older estimate to still adapt, while
344	   gradually phasing it out. It is this adaptation which makes it
345	   perform much better than the corrective algorithms. The algorithm is
346	   also extremely simple.

348	4.3 Comparison

350	   The algorithms are all compared via simulation in Table 1. In the
351	   simulation, 10,001 users join a group at t=0. At t=10,000, 5000 of
352	   them leave. At t=20,000, another 5000 leave. All implement an SSRC
353	   sampling algorithm, unconditional forward reconsideration and BYE
354	   reconsideration. The table depicts the group size estimate from time
355	   20,000 to time 25,000 as seen by the single user present throughout
356	   the entire session. In the simulation, a memory size of 1000 SSRC was
357	   assumed. The performance without sampling, and with sampling with the
358	   additive, multiplicative, and bin-based correction are depicted.

360	   As the table shows, the bin based algorithm performs particularly
361	   well at capturing the group size estimate towards the tail end of the
362	   simulation.

364	   Time    No Sampling     Binned  Additive  Multiplicative
365	   ----    -----------     ------  --------  --------------
366	   20000   5001            5024    5024      5024
367	   20250   4379            4352    4352      4352
368	   20500   3881            3888    3900      3853
369	   20750   3420            3456    3508      3272
370	   21000   3018            2992    3100      2701
371	   21250   2677            2592    2724      2225
372	   21500   2322            2272    2389      1783
373	   21750   2034            2096    2125      1414
374	   22000   1756            1760    1795      1007
375	   22250   1476            1472    1459      582
376	   22500   1243            1232    1135      230
377	   22750   1047            1040    807       80
378	   23000   856             864     468       59
379	   23250   683             704     106       44
380	   23500   535             512     32        32
381	   23750   401             369     24        24
382	   24000   290             257     17        17
383	   24250   198             177     13        13
384	   24500   119             129     11        11
385	   24750   59              65      8         8
386	   25000   18              1       2         2

388	4.4 Sender Sampling

390	   Care must be taken in handling senders when using SSRC sampling.
391	   Since the number of senders is generally small, and they contribute
392	   significantly to the computation of the RTCP interval, sampling
393	   should not be applied to them. However, they must be kept in a
394	   separate table, and not be "counted" as part of the general group
395	   membership. If they are counted as part of the general group
396	   membership, and are not sampled, the group size estimate will be
397	   inflated to overemphasize the senders.

399	   This is easily demonstrated analytically. Let Ns be the number of
400	   senders, and Nr be the number of receivers. The membership table will
401	   contain all Ns senders and (1/2)**m of the receivers. The total group
402	   size estimate in the current draft is obtained by 2**m times the
403	   number of entries in the table. Therefore, the group size estimate
404	   becomes:

406	   L(t) = (2**m) Ns + Nr

408	   which exponentially weights the senders.

410	   This is easily compensated for in the binning algorithm. A sender is
411	   always placed in the 0th bin. When a sender becomes a receiver, it is
412	   moved into the bin corresponding to the current value of m, if its
413	   SSRC matches the key under the masked comparison operation.

415	5 Security Considerations

417	   The use of SSRC sampling does not appear to introduce any additional
418	   security considerations beyond those described in [1]. In fact, SSRC
419	   sampling, as described above, can help somewhat in reducing the
420	   effect of certain attacks.

422	   RTP, when used without authentication of RTCP packets, is susceptible
423	   to a spoofing attack. Attackers can inject many RTCP packets into the
424	   group, each with a different SSRC. This will cause RTP participants
425	   to believe the group membership is much higher than it actually is.
426	   The result is that each participant will end up transmitting RTCP
427	   packets very infrequently, if ever. When SSRC sampling is used, the
428	   problem can be amplified if a participant is not applying a hash to
429	   the SSRC before matching them against their key. This is because an
430	   attacker can send many packets, each with different SSRC, that match
431	   the key. This would cause the group size to inflate exponentially.
432	   However, with a hash applied, an attacker cannot guess those SSRC
433	   which will match against the key. In fact, an attacker will have to
434	   send 2**m different SSRC before finding one that matches, on average.
435	   Of course, the effect of a match causes an increase of group
436	   membership by 2**m. But, the use of sampling means that an attacker
437	   will have to send many packets before an effect can be observed.

439	6 Acknowledgements

441	   The authors wish to thank Bill Fenner and Vern Paxson for their
442	   comments.

444	7 Bibliography

446	   [1] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a
447	   transport protocol for real-time applications," Request for Comments
448	   (Proposed Standard) 1889, Internet Engineering Task Force, Jan. 1996.

450	   [2] J. Rosenberg and H. Schulzrinne, "Timer reconsideration for
451	   enhanced RTP scalability," in IEEE Infocom , (San Francisco,
452	   California), March/April 1998.

454	   [3] International Telecommunication Union, "Visual telephone systems
455	   and equipment for local area networks which provide a non-guaranteed
456	   quality of service," Recommendation H.323, Telecommunication
457	   Standardization Sector of ITU, Geneva, Switzerland, May 1996.

459	   [4] R. Rivest, "The MD5 message-digest algorithm," Request for
460	   Comments (Informational) 1321, Internet Engineering Task Force, Apr.
461	   1992.

463	   [5] J. Rosenberg, Protocols and Algorithms for Supporting Distributed
464	   Internet Telephony PhD thesis, Columbia University, Dec. 1999.  Work
465	   in Progress.

467	8 Authors' Addresses

469	   Jonathan Rosenberg
470	   dynamicsoft
471	   200 Executive Drive
472	   West Orange, NJ 07052
473	   USA
474	   email: jdrosen@dynamicsoft.com

476	   Henning Schulzrinne
477	   Columbia University
478	   M/S 0401
479	   1214 Amsterdam Ave.
480	   New York, NY 10027-7003
481	   USA
482	   email: schulzrinne@cs.columbia.edu