idnits 2.17.1 

draft-briscoe-ama-00.txt:
-(1016): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding
-(1017): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-24) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  ** Bad filename characters: the document name given in the document,
     'draft-briscoe-ama-00.ps,', contains other characters than digits,
     lowercase letters and dash.

  ** Missing revision: the document name given in the document,
     'draft-briscoe-ama-00.ps,', does not give the document revision number

  ~~ Missing draftname component: the document name given in the document,
     'draft-briscoe-ama-00.ps,', does not seem to contain all the document
     name components required ('draft' prefix, document source, document name,
     and revision) -- see https://www.ietf.org/id-info/guidelines#naming for
     more information.

  == Mismatching filename: the document gives the document name as
     'draft-briscoe-ama-00.ps,', but the file name used is
     'draft-briscoe-ama-00'

  == There are 10 instances of lines with non-ascii characters in the
     document.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 1452 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 2 instances of too long lines in the document, the longest one
     being 1 character in excess of 72.

  ** There are 3 instances of lines with control characters in the document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 256 has weird spacing: '...   addr  offse...'

  == Line 477 has weird spacing: '...ount of  trees...'

  == Line 503 has weird spacing: '...   mask  mask ...'

  == Line 506 has weird spacing: '...     wm  wd   ...'

  == Line 535 has weird spacing: '...storage    pro...'

  == (2 more instances...)

  == Couldn't figure out when the document was first submitted -- there may
     comments or warnings related to the use of a disclaimer for pre-RFC5378
     work that could not be issued because of this.  Please check the Legal
     Provisions document at https://trustee.ietf.org/license-info to determine
     if you need the pre-RFC5378 disclaimer.

  -- The document date (21 Nov 1997) is 9651 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  ** Downref: Normative reference to an Experimental RFC: RFC 1075 (ref. '5')

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  -- Possible downref: Non-RFC (?) normative reference: ref. '7'

  ** Downref: Normative reference to an Historic RFC: RFC 2201 (ref. '8')

  ** Downref: Normative reference to an Historic RFC: RFC 2189 (ref. '9')

  -- Possible downref: Non-RFC (?) normative reference: ref. '10'

  -- Possible downref: Non-RFC (?) normative reference: ref. '11'

  -- Possible downref: Non-RFC (?) normative reference: ref. '12'

  -- Possible downref: Non-RFC (?) normative reference: ref. '13'

  -- Possible downref: Non-RFC (?) normative reference: ref. '14'

  -- Possible downref: Non-RFC (?) normative reference: ref. '15'

  -- Possible downref: Non-RFC (?) normative reference: ref. '16'

  -- Possible downref: Non-RFC (?) normative reference: ref. '17'

  ** Obsolete normative reference: RFC 1884 (ref. '18') (Obsoleted by RFC
     2373)

  -- Possible downref: Non-RFC (?) normative reference: ref. '19'

  -- Possible downref: Non-RFC (?) normative reference: ref. '20'

  -- Possible downref: Non-RFC (?) normative reference: ref. '21'

  -- Possible downref: Non-RFC (?) normative reference: ref. '22'

  -- Possible downref: Non-RFC (?) normative reference: ref. '23'

  -- Possible downref: Non-RFC (?) normative reference: ref. '24'

  -- Possible downref: Non-RFC (?) normative reference: ref. '25'

  -- Possible downref: Non-RFC (?) normative reference: ref. '26'

  -- Possible downref: Non-RFC (?) normative reference: ref. '27'

  -- Possible downref: Non-RFC (?) normative reference: ref. '28'


     Summary: 15 errors (**), 1 flaw (~~), 11 warnings (==), 24 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	INTERNET DRAFT		                                     R Briscoe
2	MALLOC(?) Working Group	                                      M Tatham
3	Expiration: 26 May 1998	                                            BT
4	                                                           21 Nov 1997

6	              End to End Aggregation of Multicast Addresses

8	                    draft-briscoe-ama-00.ps,.txt [28]
9	   Note: lack of figures and copious subscripts make the .txt version
10	                            difficult to read

12	Status of this Memo

14	This document is an Internet-Draft.  Internet-Drafts are working
15	documents of the Internet Engineering Task Force (IETF), its areas, and
16	its working groups.  Note that other groups may also distribute working
17	documents as Internet-Drafts.

19	Internet-Drafts are draft documents valid for a maximum of six months
20	and may be updated, replaced, or obsoleted by other documents at any
21	time.  It is inappropriate to use Internet-Drafts as reference material
22	or to cite them other than as "work in progress."

24	To view the entire list of current Internet-Drafts, please check the
25	"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
26	Directories on ftp.is.co.za (Africa), ftp.nordu.net (Europe),
27	munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
28	ftp.isi.edu (US West Coast).

30	                              Abstract

32	This paper presents an approach for solving the inherent problem with
33	multicast routing scalability - by co-operation between end-systems and
34	the network. We introduce an extremely efficient, elegant way to name
35	arbitrary sized inter-meshed aggregations of multicast addresses. This
36	is done in such a way that it is easy to calculate how to change the
37	name to encompass many more related names. We describe how these
38	aggregate names could be used anywhere in place of the set of addresses
39	to which they refer, not by resolving them into multiple operations, but
40	by a single bulk action throughout the routing tree, and in session
41	descriptions potentially including those for reservations. Initial
42	aggregation in end-systems might only reduce the problem by an order of
43	magnitude, but it is believed that this will provide sufficient
44	structure for routers to be able to recognise further aggregation
45	potential.  To improve the chances of router aggregation, address set
46	allocation schemes must fulfil certain criteria that are laid down in
47	this paper.

49	Keywords

51	Multicast, address, aggregation, end to end, scalable, Internet.

53	Contents

55	1. Introduction & Requirements
56	2. Scheme Description
57	  2.1 Construction of an AMA
58	  2.2 Probability of address aggregation
59	  2.3 Further storage optimisation (for IPv4)
60	  2.4 Life-cycle of an Aggregate Multicast Address
61	       2.4.1 How an application would assign an AMA
62	       2.4.2 How a sending application would understand an AMA
63	       2.4.3 How a receiving host would join or leave an AMA
64	       2.4.4 How a receiving application would understand an AMA
65	       2.4.5 How a router would aggregate AMAs
66	       2.4.6 How an application would expand or contract an AMA if
67	       required
68	     2.5 Router Implementation
69	     2.6 Derivation of recommended values for constants
70	       2.6.1 Statistics
71	     2.7 Evolution
72	3. Related Work
73	4. Limitations and Further Work
74	5. Security Considerations
75	6. Conclusions
76	7. Acknowledgements
77	8. References
78	9. Authors' addresses

80	1. Introduction & Requirements

82	The addressing scheme used for Internet multicast has been recognised as
83	unscalable since its inception. Every multicast group has to have a
84	separate entry in the forwarding tables of every router on its path.
85	Because multicast groups, being logical entities, have no direct
86	relationship with physical topology, they cannot directly be matched to
87	the hierarchical design of the Internet. In this paper we present an
88	approach for solving this problem indirectly, in the proven tradition of
89	the Internet; end to end - by co-operation between end-systems and the
90	network.

92	The primary problem is that multicast addresses are fixed by session
93	initiators, but aggregation relies on a clustering pattern emerging from
94	the demography of the receivers. Further, the initiator usually doesn't
95	know who the receivers will be until after the addresses to be used have
96	been fixed. Therefore, any clustering beyond small-scale aggregation
97	within applications will have to be achieved on a longer time-scale by
98	session initiators predicting the likely demography of their session
99	(e.g. based on past experience of similar sessions). The job of these
100	predictive systems will be much simpler if they have some degree of pre-
101	existing aggregation to nucleate around, rather than having to
102	crystallise aggregation from complete chaos without any bootstrap
103	process.

105	A large class of applications utilises multiple multicast addresses
106	internally. Further, many such applications consist of varying sets of
107	multiple multicast groups that are all sub-sets of a bigger set (e.g.
108	news, stock-feeds, network games, virtual worlds, distributed
109	simulations, all applications using layered multicast [1]). Any solution
110	should ensure that such "nucleating applications" use aggregations of
111	addresses that can be identified as such.

113	Related work is described later. Most is in the area of straightforward
114	multicast address allocation, which is, perhaps surprisingly, far from
115	being sorted out. It is very difficult, indeed probably reckless, to
116	comment on proposals that have only been hinted at in the odd mailing
117	list posting, or conference presentation. However, it appears that most
118	schemes are assuming that aggregation will be possible if addresses are
119	allocated based on the topological position in the Internet of the root
120	of the multicast tree. Although this may well be the case, it is only
121	true if most multicast applications will tend to be used by groups of
122	users that happen to use the same ISP (Internet Service Provider), or
123	they use ISPs that have a common parent ISP. No evidence that this is
124	the case has been presented. In some cases the tree root even seems to
125	be (erroneously) assumed to be in the same place as the session
126	initiator (the party requesting the address). Certainly, nowhere does it
127	seem to be taken into account that the likely receiver topology is just
128	as important in determining whether multicast trees can be aggregated.
129	Some of the proposals that are available also seem to implicitly assume
130	that routing state will be aggregated in the same way unicast routing is
131	aggregated - by discarding the right-hand bits of addresses which have
132	common left-hand bit fields, despite there being no greater significance
133	to any bit in a multicast address. These criticisms may prove to be
134	unfounded when the work in question is published in detail.

136	The primary thesis of this paper is that it will never be possible to
137	aggregate multicast state in routers if there is no means to name
138	aggregations of multicast addresses. In unicast, an address prefix is an
139	aggregation name, but for multicast the prefix means nothing. A similar
140	concept is required, but with radically different properties and
141	requirements.

143	These names must take up significantly less space than would the list of
144	addresses themselves. Further, if only some sizes of aggregations can be
145	named, this will lead to wastage of addresses. Therefore, any solution
146	must offer a way to name arbitrary sized aggregations. Further, the
147	naming of aggregations must not make any one type of address more in
148	demand than others (e.g. addresses with trailing zeroes, or with
149	sequences of zeros) or encourage the hoarding of certain addresses. It
150	should also be possible to grow or shrink the size of the set identified
151	by the name, while avoiding clashes with addresses in use by other
152	sessions. An informal list of further requirements of address allocation
153	schemes, which are not directly relevant to the discussion here, is at
154	[26].

156	These aggregation names should themselves be open to aggregation,
157	implying the naming should be recursive. Otherwise the small-scale
158	clustering that the "nucleating applications" will be able to engender,
159	will simply result in these small clusters collecting in routers deeper
160	into the network. Where two names might be aggregated, it should be
161	possible to test this possibility, based on the structure of the names,
162	rather than by trial and error or exhaustive expansion to the lists of
163	addresses that the names resolve to.

165	To encourage their use,  these aggregation names should offer
166	convenience and efficiency for the programmer. Preferably they should
167	enable bulk joins to and leaves from sets of multicast addresses. The
168	"nucleating applications" would then naturally assist the network in
169	aggregating together multicast addresses, not out of any sense of
170	altruism, but because using an atomic name for an aggregation is
171	convenient for the programmer and efficient for the system it runs on.
172	This approach could be accused of moving away from random address
173	allocation and therefore possibly encouraging address hoarding.
174	Certainly, once a pattern emerges, the aim should be to bias address
175	allocation to re-inforce the pattern. However, this is still originally
176	based on random allocation, so shouldn't lead to hoarding as long as we
177	don't aim to re-inforce patterns to such an extent that they become
178	entrenched beyond their useful life.

180	2. Scheme Description

182	The proposed scheme for identifying aggregates of multicast addresses
183	builds directly on IP multicasting [2]. The scheme assigns a new meaning
184	to (a number that looks like) any existing multicast address, but only
185	when it is in in a context associated with (at least) two other
186	parameters that define aggregation size. Multiple sets of these
187	aggregation size parameters can be associated with the same single
188	multicast address to signify a list of aggregates of multicast addresses
189	affording further aggregation. These aggregated addresses would be
190	suitable replacements even for single discrete multicast addresses both
191	in hard multicast routing state on the router and soft state in messages
192	initiating and refreshing it, as well as in session descriptions. Thus,
193	in the context of sending data to or receiving data from a multicast
194	address, there is no change from the existing meaning for the address -
195	it doesn't represent an aggregate, just a single address. In the context
196	of joining or leaving a multicast group and in the context of describing
197	a session, the address means an aggregation (although possibly of size
198	1) if associated with size parameters. This switch of meaning dependent
199	on context is consistent with the fact that the aggregation parameters
200	aren't associated with sending and receiving, only with allocation,
201	joining, leaving and describing.

203	For conciseness, we shall refer to this tuple of multicast address and
204	aggregation parameters as an aggregated multicast address (AMA(i)).
205	Thus, it is intended that AMAs are used in future versions of Internet
206	Group Management Protocol (IGMP) [3, 4] and the various multicast
207	routing protocols (DVMRP [5, 6], CBT [7, 8, 9] & PIM [10, 11, 12]) to
208	replace multicast addresses wherever they are used. It would also be
209	highly advantageous for session directory, invitation and announcement
210	protocols (e.g. SDP [13], SAP [14] & SIP [15]) to evolve to use AMAs in
211	place of multicast addresses. Evolution from the current version of IGMP
212	and multicast routing protocols and from current session description
213	protocols is discussed under the "Evolution" section.

215	Further, it would be natural for evolving multicast address allocation
216	schemes ([20, 21]) to use AMAs to define allocations concisely. Just as
217	sets of related AMAs can be aggregated together into a single AMA, the
218	reverse is also possible. Thus parts of large allocations can be "sub-
219	allocated" to further parties if required. However, allocation of
220	multicast addresses (and hence AMAs) is not as straightforward as for
221	unicast addresses. Allocation issues are discussed under the section on
222	"Probability of address aggregation".

224	Firstly we shall define how an AMA is constructed and how its
225	construction affects its meaning. Then we shall follow through a typical
226	life-cycle of an AMA:
227	1.   how an application would assign an AMA
228	2.   how a sending application would understand an AMA
229	3.   how a receiving host would join or leave an AMA
230	4.   how a receiving application would understand an AMA
231	5.   how a router would aggregate AMAs
232	6.   how an application would expand or contract an AMA if required

234	2.1 Construction of an AMA

236	The construction of an AMA is based on two architectural principles:
237	  1. to identify a range of addresses by starting from a base, using an
238	     offset (d or v) to the minimum and a range value (n or s) to
239	     identify the maximum
240	  2. to use two levels of this structure - the first based on bit-field
241	     widths to home in on a rough area using very few bits, the second
242	     using actual values to achieve the arbitrary aggregation size
243	     requirement

245	  ----------2^d---------->--2^n------
246	                         |-v-->-s->

248	The motivation for having a maximum for the rougher range as well as the
249	fine-grained one is explained later.

251	The address-like component of an AMA is defined to be a concatenation of
252	bit-fields as shown (with recommended values of bit widths shown for
253	IPv4 - see discussion later):

255	  Description:       mcast mask        remainder
256	                     addr  offset
257	                     flags
258	  Value:                   d               r
259	  Width:               4   wd              wr
260	  Recommended width:       4               24
261	                      <--><-->.<-------.--------.------->
262	                      1110XXXX.XXXXXXXX.XXXXXXXX.XXXXXXXX

264	The three associated aggregation parameters are defined below. The
265	recommended values of bit widths are shown for IPv4, but later a scheme
266	for optimising these is proposed (see Further storage optimisation).
267	Introducing this now would just confuse the explanation:

269	  Description:       mask          aggr'n            aggr'n
270	                    width          size              offset
271	  Value:               m            s                  t
272	  Width:               wm           ws                 ws
273	  Recommended width:   4           16                 16
274	                      <-->  <-------.------->  <-------.------->
275	                      XXXX  XXXXXXXX.XXXXXXXX  XXXXXXXX.XXXXXXXX

277	The value in the "mask width" bit field,  m, (recall this is associated
278	with an address, not part of it) determines the width, n, of a bit-mask
279	overlaid on the remainder (as described next), where:

281	  n = m + m0      --------(1)

283	and where m0 is a constant integer (set for this scheme) that determines
284	the minimum size of an AMA (see later for discussion). It is recommended
285	that:

287	  m0 = 0

289	The value in the mask offset field, d, determines the offset in bits
290	from the right-hand end of the address to the right-hand end of the
291	mask, for example, if m0 = 0:

293	                         d              r                m    s
294	Example pseudo-dotted   13 .   221  .  147   .  93       3    5
295	 decimal value:   <--><-->.<-------.--------.------->  <--><-...
296	                  11101101.11011101.10010011.01011101  0011
297	Mask (XXX):       11101101.11011101.XXX10011.01011101
298	                                       <----d--------
299	                                    <-n

301	Note: in these diagrams, the label "d" is used to denote where the value
302	of d is stored and the length of the offset set by this value.

304	If n + d > wr , the bit mask simply wraps round into the right-hand end
305	of the remainder, r.

307	The value within the masked bits, vmin (100b in this example) determines
308	the base or minimum address of the AMA(ii).

310	The value of the aggregate size, s,  (recall this is associated with an
311	address, not part of it) determines the upper or maximum address of the
312	AMA by setting the maximum value in the masked bits, vmax, where:

314	  vmax = vmin + s - 1

316	The AMA is defined as the set of addresses that result when v is varied
317	between its minimum and maximum values. For example, the AMA defined by
318	our example address and example aggregation size parameters follows:

320	                    <v>               dotted decimal
321	  11101101.11011101.10010011.01011101  237.221.147.93  a0
322	  11101101.11011101.10110011.01011101  237.221.179.93  a1
323	  11101101.11011101.11010011.01011101  237.221.211.93  a2
324	  11101101.11011101.11110011.01011101  237.221.243.93  a3
325	  11101101.11011101.00010011.01011101  237.221. 19.93  a4

327	Note that if vmax > 2^n, v may wrap within the bit-field (without
328	carrying outside the bit-field).

330	Note that s=0 means no values of v at all (useful later).
331	Note that meanings of s > 2^n are reserved but undefined.     ------(2)

333	For convenience, we might denote an AMA as so far described by the
334	tuple:

336	  (a, m, s)

338	which for the example above would be:

340	  (237.221.147.93, 3, 5)

342	Where two or more AMAs are disjoint sets that share a common a (even if
343	the masked bits are different) and m, they may be represented by the
344	more general form that is the correct full definition of an AMA:

346	  (a, m, s0, [(t1, s1), ... (ti, si), ... ])

348	where ti is an offset from vmin, and si is the size of the set of
349	addresses based on vmini where:

351	  vmini = vmin + ti.

353	Thus, continuing the above example, all three of the following sets of
354	addresses or AMAs represent the same thing:

356	  a0, a1,      a3, a4
357	  (a0, m, 2), (a3, m, 2)
358	  (a0, m, 2, 3,2)

360	This allows re-use of one address by associating it with multiple
361	aggregate size pairs with minimal extra storage.
362	Further, as the sequence of pairs along the AMA is processed,

364	  if ti >= 2^n, m is incremented until ti < 2^n    --------(3)

366	m remains at its higher value for subsequent pairs, unless it is
367	incremented further later.

369	Again continuing the above example to illustrate this point, if we
370	denote a8 as the address related to a0 like so:

372	                  <-v->                dotted decimal
373	  11101101.11011101.10010011.01011101  237.221.147.93  a0
374	  11101101.11011100.10010011.01011101  237.220.147.93  a8

376	then these two sets represent the same thing:

378	  (a0, 3, 2), (a3, 3, 2), (a8, 4, 7)
379	  (a0, 3, 2, 3,2, 8,7)

381	The motivation for this particular design of scheme should start to
382	become apparent; AMAs can be fully manipulated without caring about
383	which multicast address was randomly chosen to start with as the base.
384	This ensures there is no more demand for any one multicast address than
385	another. vmin doesn't have to be all zeroes to ensure all 2^n
386	combinations of bits under the mask are used. This ensures that the task
387	of aggregating multicast addresses into AMAs and small AMAs into big
388	AMAs should be achievable with just bit-wise logic operations.

390	2.2 Probability of address aggregation

392	Fig 1. attempts to represent multicast trees of various topologies
393	across the Internet as various three-sided shapes. The Internet is
394	represented by the oval with end-systems around the perimeter. The root
395	of each tree is indicated by a dot. The receivers that have joined the
396	tree are spread across the side of the "triangle" opposite the tree
397	root. The shaded areas are an attempt to represent the aggregation
398	potential of a representative topology by representing an even density
399	of receivers around the edge and an even density of routers across the
400	oval, rather than being a representation of the physical density of
401	hosts and routers. Trees "a", "d" and "e" are core-based trees, with
402	tree roots within the network rather than on end-systems. Trees "b" and
403	"c" are source-based with roots on end-systems. It should be noted that
404	these trees are probably not typical, mainly because it is difficult in
405	such a diagram to draw trees that have receivers spread all over the
406	Internet. However the trees have been chosen to illustrate various
407	points.

409	                        {See .ps or [28] for fig}
410	      Fig 1 - Various multicast tree topologies across the Internet

412	Although trees "a" and "d" have roots that could well be in the same
413	domain (autonomous system), their receiver sets have caused them to grow
414	in completely opposite directions so no aggregation would be possible.
415	Tree "e" is completely contained within tree "d" although their roots
416	could well be in completely unrelated domains. Thus it should be
417	possible to aggregate the routing information they need. Tree "a" is
418	nearly contained within tree "b", so it would be desirable to be able to
419	aggregate their routing at least where they overlap. Trees "a" and "c"
420	have a similar relationship to that between "a" and "b". Trees "b" and
421	"c" cross eachother at the network core so some aggregation may be
422	possible where the directions of the trees coincide on certain links.
423	There could even be occaisions where the routing of "a", "b" and "c" can
424	all be aggregated together.

426	It should now be possible to see that, although it is more likely that
427	two trees with close roots will have routing information that is
428	condusive to aggregation, this is by no means a necessary or sufficient
429	condition. In fact it appears that receiver distribution is a better
430	indication of aggregation potential.

432	A multicast address needs to be allocated when a session is initiated,
433	at which time the likely receiver topology may be difficult to predict.
434	For certain types of tree (e.g. source-based) it is much easier to be
435	certain where the root should be at this early stage. For core-based
436	trees, the positioning of the core is ideally dependent on the
437	prediction of the receiver topology (which can be sketchy, as we have
438	already said).

440	Receiver topology is difficult but not impossible to predict. This
441	usually reduces to a marketing-type problem. What is clear is that
442	address allocation schemes that are based exclusively on the position of
443	the root of the multicast tree (or worse the position of the session
444	initiator, who may not even be taking part once the session is going)
445	will not realise anything close to the full aggregation potential of any
446	conceivable topologies of multicast trees that are likely on the
447	Internet. A good scheme must allow addresses to be allocated to sessions
448	based on a combination of root and receiver topology. It is outside the
449	scope of this paper to solve how this allocation would be done, but it
450	is clear that a scheme that disallows a future solution to this problem
451	is a bad scheme.

453	The scheme proposed for AMAs allows just such a combination of
454	allocation between root and receiver topology. Taking the example AMA
455	used already,

457	                                d              r
458	  Example pseudo-              13 .   221  .  147   .  93
459	   dotted decimal value:  <--><-->.<-------.--------.------->
460	                          11101101.11011101.10010011.01011101
461	                                              |<----d--------
462	                                              |
463	                                d             |    f
464	  root-based allocation:      <-->.           |<----.-->
465	  receiver-based allocation:       --------.-->         <----
466	                                   |_________g_______________|

468	The 16 bits, labelled g, to the left of the offset (to which d points)
469	are those that the largest possible mask could cover so they might
470	conceivably all be aggregated together. Therefore these bits should be
471	allocated purely on likely receiver positioning. In the example this
472	field wraps after 11 bits with the remaining 5 bits being at the far
473	right hand end of the address. On the other hand, it is proposed that d
474	and f would be allocated based on the position within the Internet of
475	the root of their multicast tree (possibly combined with the likely
476	direction from the root from which most receivers would join, to take
477	account of  trees like "d" and "e" above). f is defined as the 8 bits to
478	the right of the offset  (wrapping round the length of r if necessary).
479	The mask can never cover f which is why it is available for allocation
480	based on the root position.

482	A simple short term solution for an address allocation scheme might be
483	to generate g, d and f from an algorithm seeded by a unicast address
484	prefix (or a set of a few representative prefixes) that represents the
485	majority of likely receivers and the unicast address of the tree root.
486	The latter would be necessary if the tree were source-based, but it may
487	be possible for the allocation service (or some other service) to
488	suggest the best tree root position if using core-based multicast.

490	2.3 Further storage optimisation (for IPv4)

492	Optimisations considered but rejected are recorded elsewhere [25].

494	2.3.1 Mask width, m

496	In the contexts where AMAs are used, the first four bits of the address
497	are redundant, as a non-multicast address wouldn't make sense. m is the
498	natural candidate to store in place of this 1110 bit pattern which is
499	fixed for IPv4 (class D) addresses (but see below). When the AMA is read
500	from storage, m can be read into another variable and this bit pattern
501	can be re-instated.

503	  Description:       mask  mask        remainder
504	                     width offset
505	  Value:               m   d               r
506	  Width:               wm  wd              wr
507	  Recommended width:   4   4               24
508	                      <--><-->.<-------.--------.------->
509	                      XXXXXXXX.XXXXXXXX.XXXXXXXX.XXXXXXXX

511	However, there is no guarantee that the multicast address range will
512	always be confined to this fixed first four bits. For instance, class E
513	addresses (starting 1111) are currently reserved, and might conceivably
514	form an extension to the IPv4 multicast address space in the future.

516	Thus this optimisation is suggested but if it were adopted, the
517	implications for the future use of other ranges as an extension to the
518	multicast range would need serious consideration. However, as this is an
519	optimisation, implementations could be designed so that if other address
520	ranges ever became valid for multicast, the optimisation could be
521	removed.

523	2.3.2 Aggregation size, s

525	Many applications will have sessions with less than 16 multicast
526	addresses and most will have less than 256. Thus it seems wasteful to
527	use 16 bits for s (and t) when most of the time, most of their leading
528	bits will be zeroes. Also we must not forget that there will be many
529	sessions that only use one multicast address.

531	Therefore, it is recommended that ws be dependent on the value of m at
532	least in the environments indicated:

534	                                        Rule applies in:
535	                                  router storage    protocol fields
536	  If     m = 0   (0000b),  ws = 0        Y                N?
537	  If 0 < m <= 1  (0001b),  ws = 2        N??              N
538	  If 1 < m <= 3  (0011b),  ws = 4        N?               N
539	  If 3 < m <= 7  (0111b),  ws = 8        Y                N?
540	  If 7 < m <= 15 (1111b), ws = 16        Y                Y
541	                                  \--------(4)

543	The question marks indicate where the rule is open for discussion after
544	implementation experience.

546	It is recommended, at least for router storage, that m = 0 is used to
547	indicate an AMA that is actually just a single discrete multicast
548	address, where associating a size of 1 with it would be a waste of
549	space.

551	However, this appears to make the widest possible mask 15 bits, because
552	it precludes using m = 0 to mean n = 16. Although the need for a 16 bit
553	wide mask is likely to be rare, it is still possible to force the mask
554	to be this wide by exploiting the equivalence of the two AMAs in the
555	following example (due to the ability of t to increment m, given in
556	formula (3)):

558	  (a, 16, s) (impossible value of m)
559	  (a, 15, 1,  >2^15, s)

561	The two conditions that lead to sub-byte storage requirements are not
562	recommended as they are likely to be more trouble than they are worth,
563	even for router storage.

565	It has also been assumed that saving a byte or two is not worth it for
566	protocols (like IGMP) as opposed to router storage. The hassle of a
567	conditional width field is probably not worth the small reduction in
568	message size that would result.

570	Note that the width of t would always follow the same rule as that for
571	s. That is:

573	  wti = wsi

575	Note that in all cases (except m = 0) t can always be large enough,
576	within the available field-width, to be capable of incrementing m into
577	the next range, by formula (3). For example, these two AMAs are
578	equivalent :

580	  (a, 8,0)
581	  (a, m=7,s0=0,  t1=128,s1=128)

583	In the second AMA, the starting mask width is 7, so s0, t1 and s1 are 8
584	bits wide. But, by formula (3), because t1 >= 2^7  (has the first bit
585	set) m is incremented to 8.

587	Note that if t forces m to increment into a range that would alter the
588	width of both t and s (formula (4)), the increase in width of t doesn't
589	happen until the next pair of t and s, if at all, while the s in the
590	current pair should be widened immediately. The reasoning, is that the
591	width of t mustn't be affected by its own value.

593	2.4 Life-cycle of an Aggregate Multicast Address

595	2.4.1 How an application would assign an AMA

597	First, the application initiating (or expanding) the session must decide
598	how many multicast addresses it needs (bearing in mind the scheme allows
599	it to expand or shrink from this initial decision later). This we will
600	call smax. (As discussed above, even if smax is 1, it may still make
601	sense to use AMAs.)

603	We assume some smart address allocation scheme has been worked out that
604	meets the requirements laid down in the section on the Probability of
605	address aggregation and may indeed be like the simple one outlined in
606	that section. It should be noted that a sub-optimal, tactical address
607	allocation service wouldn't stop this scheme working, but as the art of
608	allocation improved, aggregation results would improve.

610	The application (or session initiator) now predicts the likely receiver
611	topology for the session it is initiating. From this, it decides the
612	best position for the root of the multicast tree(s) that make up the
613	session (or some other service does this at its request). The
614	application would then pass all these constraining parameters, including
615	the size of AMA it required to the address allocation service. The
616	address allocation service would return an AMA of the required size.
617	To arrive at an AMA, it would internally (probably) allocate g based on
618	the receiver topology and d and f based on the root and receiver
619	topology given to it by the application.

621	Also, internally, it would have to calculate n such that 2^(n-1) < smax,
622	but 2^n >= smax .

624	In other words, it would decide integer n where no more than 2^n
625	multicast addresses are needed in the session. m follows rather
626	straightforwardly from formula (1).

628	For example, if the session needed 6 multicast addresses and m0 = 0, it
629	would use m = n = 3.

631	We shall assume the address allocated is the example address used above.
632	It then joins 6 multicast groups simultaneously by sending one message
633	to join the AMA:

635	  (237.220.147.93, 3, 6)

637	If the allocation scheme is light-weight it may give no guarantees that
638	the any of the addresses haven't been simultaneously allocated to other
639	sessions. Therefore, the application may have to monitor whether any of
640	the multicasts are in use. If they are free, it holds them for itself
641	(as is done with discrete multicast addresses today), then goes ahead
642	and advertises the session (or invites users in) using this AMA.

644	In fact, if there is a finite possibility of simultaneous allocation,
645	the above is not necessarily the best opening strategy for grabbing a
646	set of free multicast addresses. Which strategy is best depends on the
647	size of the set required and the level of utilisation of the address
648	space (which will oscillate daily and increase into the future,
649	decreasing the probability of finding a free set in a single attempt).
650	This is a large enough problem space to become a topic for study in its
651	own right (see Further Work), so it is not gone into in any depth here,
652	save to make some broad generalisations.

654	If the probability of finding a free set of multicast addresses first
655	time is high, the obvious strategy would be to just try another address
656	set if part of the first was busy. If the allocation service couldn't
657	guarantee all its allocations were not in use, it would have to be
658	possible to receive a slightly different response to a repeated
659	identical request.

661	Otherwise it may be best to test a larger set than is required, then
662	drop the busy ones. AMAs make this easy, as it is often possible to
663	derive one smaller AMA from a larger one to deliberately avoid some of
664	the addresses.

666	Yet a third method would be to try a number of contiguous or overlapping
667	AMAs, then drop those that tested busy and amalgamate the remaining ones
668	into one AMA (again often but not always possible).

670	The comments in this section on allocation of AMAs for applications
671	apply equally well to allocation of AMAs to allocation sub-authorities
672	in an allocation hierarchy. However, it is made clear in the section on
673	Probability of aggregation that allocation hierarchies per se are not a
674	panacea where multicast is concerned.

676	2.4.2 How a sending application would understand an AMA

678	This section is included to remind the reader that sending to an AMA
679	probably isn't a valid activity. As explained earlier, the concept of an
680	AMA doesn't exist in the context of sending. If the sending part of an
681	application includes understanding of sessions it may well utilise AMAs
682	(this is also a sure sign it hasn't been written in a structured way!).
683	However, the raw sending aspects will be direct to discrete multicast
684	addresses, and shouldn't involve AMAs, unless they are modelled as an
685	array of multicast addresses.

687	It might be desirable to send the same information to multiple multicast
688	group addresses using only one socket and one flow of packets through
689	the network. If a clear need for this were identified, it would be
690	sensible to use the same addressing scheme as AMAs provide. However, no
691	clear need is immediately apparent to the author, so it would not be
692	sensible to update APIs for sending to multicast sockets and the router
693	code that disseminates multicast packets so that they act on this extra
694	parameter.

696	2.4.3 How a receiving host would join or leave an AMA

698	This is the other main use for AMAs besides in session initiation. An
699	application's request to join an AMA given in a session announcement or
700	invitation would be translated directly into a (future) IGMP call to
701	join that AMA as a single atomic action. If the AMA was a superset of
702	another AMA already having its soft-state joins regularly refreshed by
703	the stack, the stack would have to merge the two AMAs into one expanded
704	AMA (if it didn't the router would, before forwarding the joins).

706	Similarly, a call to leave an AMA would translate directly to a (future)
707	IGMP call to leave. If the call to leave an AMA resolved to a sub-set of
708	the multicast addresses previously joined, the router would be designed
709	to be able to handle contracting the size of the AMA it considered was
710	still of interest on that interface (or layer 2 address) down to the AMA
711	that mapped to the remaining addresses. The host stack would also have
712	to be able to contract its AMA for it to regularly refresh the soft-
713	state of the remaining joined addresses.

715	The stack would have to ensure the frequency of join refreshes remained
716	sufficient while it amalgamated or contracted down any out of phase AMA
717	refreshes.

719	AMAs can be used as a consistent computational type for addressing any
720	number of multicast addresses, whether the AMA resolves to many or just
721	a single multicast address. In fact, APIs (application programming
722	interfaces) could pre-empt the introduction of AMAs into the network, by
723	presenting AMAs to the programmer but having middleware or the stack
724	convert AMAs into sets of discrete multicast addresses until the network
725	is upgraded. However, it would be sensible for routers and protocols to
726	signify an AMA of size 1 by not storing or passing the aggregation size
727	parameters at all (see Further storage optimisation - this would help
728	backward compatibility too). Otherwise the efficiency benefits of the
729	scheme would be offset by the 25% increase in storage required for each
730	non-aggregated, isolated address. On the other hand, to hide AMAs from
731	those programmers not interested in sessions with multiple multicast
732	addresses, it may well be best to implement an API for AMAs by
733	overloading a similar one for discrete multicast addresses(iii).

735	2.4.4 How a receiving application would understand an AMA

737	All the comments above on the irrelevance (and possible relevance) of
738	AMAs to sending data apply equally to receiving.

740	2.4.5 How a router would aggregate AMAs

742	As a router received the equivalent of "join" and "leave" requests and
743	refreshed "joins" it would continually be looking for matches at two
744	levels:

746	1.   on the one hand, the router would continually attempt to merge
747	     incoming AMAs on a per interface basis to reduce the state being
748	     held in its tables.
749	2.   on the other, the router would attempt to aggregate the outgoing
750	     "join" refresh messages it forwarded upstream on a per-interface
751	     basis too.

753	AMAs can be aggregated at two levels (whether by a router, or by a
754	host).
755	1.   where two or more AMAs share a common a (even if the masked bits
756	     are different) and m, overlapping s ranges can be merged
757	2.   where a whole set of combinations under a mask is present (s = 0),
758	     m can be incremented and this can be represented as half a full
759	     range at the wider level, which is then subject to further
760	     aggregation using the first technique again

762	Where multicast trees cross eachother (e.g. "b" and "c" in Fig 1), it
763	would be necessary to "de-aggregate" routing at the point where the
764	routes to the two roots diverge. As stated before, because AMAs can be
765	split as easily as they can be amalgamated, this is not a problem.
766	The section on Router Implementation should be referred to for more
767	detail on these matters.

769	2.4.6 How an application would expand or contract an AMA if required

771	As long as the overall application session was utilising an AMA with s =
772	smax, any one receiver could vary vmin and s around to utilise any
773	subset AMA of the larger AMA set  (useful for many simulation
774	applications). Thus contraction and re-expansion of individual joins is
775	straightforward.

777	To expand a session beyond the original smax but still with a single AMA
778	sometimes requires time for preparation of the ground and a degree of
779	luck. In all cases this is because one is attempting to acquire use of
780	extra multicast addresses without altering the addresses already being
781	used. This limits the sets of interesting multicast addresses to AMAs
782	that are related to the one in use. This is a limitation of the AMA
783	scheme when compared to schemes based on arbitrary masks such as [19]
784	(which has different limitations discussed under Related work). However,
785	this was a concious compromise to avoid the larger storage needed for
786	arbitrary masks as opposed to contiguous ones.

788	The following operations would (probably) be enacted by the address
789	allocation service in response to a request to increase the size of an
790	existing allocation (the uncertainty is because the detailed design of
791	such a service is outside the scope of this paper).

793	Firstly, if not already there, smax should be increased to 2^n and the
794	new addresses tested for prior allocation.

796	If enough addresses still aren't free, n can be incremented without
797	harm, then s can be doubled for every increment and the new addresses
798	tested.

800	If this doesn't find enough free addresses, keeping n incremented , vmin
801	can be changed while s is increased to try addresses related to the
802	current range which will still give the network a chance to aggregate
803	addresses as long as other receivers are co-operating within the same
804	rules.

806	If this doesn't work, one can look for a close but disjoint AMA (or
807	AMAs) and watch for a gap between the current AMA and the new one until
808	it can be closed by a superset AMA later.

810	Beyond that, the only solution is to give up and use more than one AMA,
811	but this is unlikely to be necessary.

813	2.5 Router implementation

815	The discussion in this section shouldn't be taken to imply that this is
816	the best way to implement multicast routing. It is simply necessary to
817	establish at high level that it would theoretically be possible to
818	modify multicast routing implementations to take advantage of AMAs.

820	Currently, the most general form of multicast routing table is a
821	database with (a usually large number of) rows for each unique
822	address/source pair as follows:

824	  a, S, fI, [fOj, ... fOk,] misc

826	where the variables are defined as:

828	  a          : multicast address
829	  S          : source
830	  fI         : incoming interface
831	  fOj ... fOk : list of outgoing interfaces to which to duplicate and
832	  forward packets
833	  misc       : other miscellaneous information not relevant to this
834	  paper (e.g. MTUs, prune state)

836	To take advantage of the aggregation and deaggregation potential of
837	AMAs, the most general modification to this would be for each row to
838	contain the following:

840	  a, m, S, [(tm, sm, fIh,), ... (tn, sn, fIi,)], [(tp, sp, fOj,), ...
841	  (tq, sq, fOk,)], misc

843	Here one row represents the routing for all the related AMAs that are
844	being aggregated and de-aggregated at this router with respect to one
845	source. We define related AMAs as AMAs that can be represented with the
846	same base address and mask, only differing in their offset and size.

848	This is illustrated in Fig 2. It should be noted that the three trees,
849	A1, A2 & A3 represent related AMAs, not discrete multicast trees
850	(although they could be AMAs of size one), because, as noted before,
851	routing is one of the contexts where AMAs can completely replace
852	discrete multicast addresses. Thus, for these related AMAs, only one set
853	of routing information would be passed in or out of each interface (e.g.
854	because all three AMAs "join" f4, only one routing message would enter
855	this interface for them all). Another way of defining related AMAs is to
856	say that it is possible to represent their union as a single AMA (we
857	will call this the super-AMA). The only distinction between the three
858	trees illustrated is that they represent the different sub-sets (of the
859	super-AMA) that share identical routing at this router. Each row such as
860	that presented above represents the complete routing definition for each
861	super-AMA at this router.

863	                        {See .ps or [28] for fig}
864	        Fig 2 - Aggregation and de-aggregation of multicast trees

866	The routing row for this super-AMA is similar to the previous row
867	structure, except that:
868	  �  the multicast address field has subtly different semantics. It
869	     doesn't mean a single address, but it is taken to mean the base
870	     address with which all the offset/size pairs in the row are
871	     associated - the base address of the super-AMA.
872	  �  the mask is included, which for IPv4, could be overlaid over the
873	     front of the base address as described under Further storage
874	     optimisation
875	  �  each outgoing interface is associated with an offset/size pair(s)
876	     (t and s are as already defined under Construction of an AMA above)
877	     which defines the outgoing AMA on that interface when associated
878	     with the base address
879	  �  there may me more than one incoming interface associated with all
880	     the related AMAs being de-aggregated at this router, each of which
881	     is listed with an offset/size pair(s) as for the outgoing
882	     interfaces.

884	For example, if A1, A2 & A3 in Fig 2 were the three AMAs respectively:

886	  (a,m,17), (a,m,0, 31,42), (a,m,0, 20,37)

888	then the routing table entry for the super-AMA (a,m,42) with respect to
889	source S would be:

891	  a, m, S, [(0, 17, f1,), (20, 42, f3,)],
892	   [(0, 42, f4,), (0, 42, f5,), ((0, 17, 31, 42, f6,)], misc

894	When a packet arrives at the incoming interface, it's destination is
895	matched against the AMA associated with each outgoing interface. A match
896	causes it to be duplicated and forwarded out of that interface.

898	Obviously, the row structure above isn't the most efficient for routing
899	look-ups. It is more suited for routing updates. This suggests that a
900	read-optimised data-structure should be built from this write-optimised
901	structure and both held by the router, with the former being regularly
902	refreshed from the latter. This is similar to the way most unicast
903	routing is implemented. Whether there is one read-optimised table per
904	router or a number of sub-sets of the main table specific to each
905	interface is beyond the scope of this paper.

907	2.6 Derivation of recommended values for constants

909	The principles for choosing the constants for IPv4 have been laid out in
910	detail, so that they can be re-applied judiciously for IPv6 once (or
911	while) its multicast addressing scheme [18] is finalised.

913	wd = 4
914	It is an important design feature that the mask offset is contained
915	within the address, rather than added as another parameter outside the
916	address. This is because, as a packet arrives at the router, this offset
917	can be read to find the point at which any mask would be applied. This
918	should greatly speed the process of matching a packet against the AMAs
919	in the routing table. It also crams as much meaning into the address as
920	possible, reducing the extra state held in routers and passed in
921	protocols. The offset doesn't need to be varied as aggregation
922	progresses, so it makes sense for it to be a fixed value, and therefore
923	it might as well be part of the address.

925	For IPv4, a width of 4 for the offset was a compromise. Ideally the
926	width would have allowed any offset value across the remainder (28 -
927	wd). Note that, if the offset did allow the mask to start anywhere in
928	the remainder, this does not mean there would be no space in the
929	remainder that was guaranteed to always be outside the mask (e.g. for
930	address allocation related to the position of the tree root - see "How
931	an application would assign an AMA" above). This depends on the mask
932	width, not the offset size. Therefore, wd could have been set to 5
933	leaving the remainder 23 bits wide. However, this would only have used
934	half the value of the fifth bit, so 4 was chosen as a compromise between
935	availability of more descriptions for AMAs and utilisation of bits. It
936	should be noted that, because AMAs overlap in their use of addresses,
937	there are diminishing returns in having more ways to split the
938	description of sub-groups of the address space. Also a 5/23 split was
939	"binarily awkward" compared to a 4/24 split.

941	wm = 4
942	Some thought was given to limiting the mask to 3 bits wide. This would
943	have led to a maximum AMA size of 2^8, which would probably have been
944	adequate if just considering application aggregation. However, we hope
945	that higher level techniques will be found to enable considerable router
946	aggregation based on the foundations laid by AMAs. Limiting routers to
947	aggregation of just two orders of magnitude seemed short-sighted, when
948	using one more bit would raise the maximum aggregation limit to 2^16
949	(and align on a half-byte boundary). The importance of keeping the size
950	of m down, is that it is stored in the router for each AMA.
951	The bit mask is deliberately fixed on its right-hand end so that when m
952	is varied, the LSB of the mask stays in the same place otherwise large-
953	scale aggregation would be terribly hard.

955	ws = 0, 8 or 16
956	The discussion of this value's potential dependence on m is well
957	rehearsed under Further storage optimisation . The motivation for
958	keeping its size down is again router storage. It is expected that a
959	large number of values of s will be under 16 (at least in edge routers),
960	so it would be profligate to allocate a whole byte, let alone two just
961	because such bit-widths will be needed for higher level aggregation or
962	applications with more complex sessions.

964	m0 = 0
965	We could have made m0 = 1 but this would have led to an extra increment
966	operation every time the AMA was interpreted. Whether this is more
967	efficient than testing for m = 0 is debatable (machine instruction set
968	dependent). n = 0 could be made illegal rather than mapping it to 16,
969	which would only lose a few ridiculously large (?) aggregates.

971	Alternatively, we could even have made m0 = 2, on the grounds that AMAs
972	of size 2 are uninteresting, and AMAs of size 1 can be indicated by no
973	value of s. We decided against this on grounds of caution over lack of
974	elegance. If we put discontinuities in the maths, it makes it much more
975	difficult to build higher order mechanisms that are simple.

977	2.6.1 Statistics

979	Below are listed the number of distinct AMAs of a selection of sizes
980	(note they are not confined to powers of 2):
981	AMA size    distinct AMAs
982	{left as an exercise for the reader or until the author gets round to
983	working out the formula...}

985	2.7 Evolution

987	It is a simple matter to convert an AMA to the list of addresses it
988	refers to. It is also possible to convert lists of addresses into AMAs
989	or lists of AMAs.

991	AMA-enabled hosts and routers shouldn't send or forward joins or leaves
992	to routers that don't understand AMAs, they should convert the AMAs to
993	lists of discrete multicast addresses.

995	This assumes routers can determine which version of a multicast routing
996	protocol their upstream router for each interface is using. This should
997	be straightforward as version stamped routing will also be arriving at
998	the interface down that link (multi-host links will cause complications
999	for some routing protocols). This also assumes hosts can determine which
1000	version of IGMP is being used by their upstream router which should be
1001	possible, but is probably difficult.

1003	Session descriptions would either have to use both forms for an interim
1004	period, or parallel descriptions in the two versions would have to be
1005	transmitted on different channels.

1007	IGMP, multicast routing protocols and the session description protocols
1008	could evolve independently as long as applications and AMA-enabled
1009	routers had the capability installed to convert AMAs into lists of
1010	discrete multicast addresses when necessary.

1012	3. Related Work

1014	The initial schemes for allocation of multicast addresses involve three
1015	distinct classes of address:
1016	  �  addresses that are permanently assigned for specific purposes [23]
1017	  �  address ranges that are permanently assigned for certain uses [23],
1018	     from which single addresses are intended to be temporarily assigned
1019	  �  an address range reserved for assignment within an administrative
1020	     scope which may therefore safely have multiple assignments within
1021	     multiple administrative domains [17]

1023	For several years, multicast group addresses for public Mbone
1024	(experimental Internet multicast backbone overlay) sessions have been
1025	allocated using the sd (session directory) or sdr tools. In addition to
1026	advertising multicast sessions based on the SDP [13], sd and sdr
1027	encompass a proprietary mechanism for allocating multicast addresses to
1028	sessions from an IANA (Internet Assigned Numbers Authority) defined
1029	address range 224.2.128.0-224.2.255.255 [23]. The algorithm used has not
1030	been described publicly in detail, but is essentially random assignment
1031	from the available addresses to ensure very efficient use of the address
1032	space, but taking account of existing allocations to minimise the
1033	probability of collisions. There is also a collision detection algorithm
1034	included for the cases where an address is chosen simultaneously by
1035	multiple parties.

1037	Thus there is currently no standard mechanism for address allocation,
1038	but the author of the sd tools recently agreed to publish the address
1039	allocation mechanism from sdr as a separate protocol - Address
1040	Allocation Protocol (AAP) [20] - which could be incorporated into other
1041	address allocators.

1043	Microsoft created an alternative means for allocation of multicast
1044	addresses by extending their Dynamic Host Configuration Protocol,
1045	originally designed to allocate temporary unicast addresses. Multicast
1046	DHCP (M-DHCP) [22] will allocate multicast addresses, consistent with
1047	the requested Administrative Scope (Link Local, Organisation Local,
1048	Global, etc.) and with a `lease period', the effectiveness of which is
1049	unproven. This lease period can be renewed if desired. The M-DHCP
1050	authors recently removed all aspects of hierarchy from the M-
1051	DHCP protocol, to confine it to allocating multicast addresses to hosts
1052	initiating sessions, rather than also using it to allocate ranges of
1053	addresses down allocation hierarchies. This was in response to criticism
1054	over its static hierarchy and static allocation wasting the address
1055	space. All aspects of multicast address hierarchy, and allocation policy
1056	are expected to be handled by a separate protocol, such as AAP (see
1057	above).

1059	Where allocation of ranges of addresses to organisations is concerned
1060	there is concensus among the known proposals that these should not be
1061	static. Range allocations should be over one (longish) timescale with
1062	the allocation of addresses from within that range for the duration of
1063	individual sessions. The intention is to avoid long term "ownership" of
1064	addresses ranges by assuring organisations that they will be able to
1065	have addresses on demand as long as they co-operate with everyone else
1066	in returning unused address space. Thus the multicast address allocation
1067	proposals are distinct from unicast in the following aspects:
1068	  �  No fixed assignments of address ranges to IP domains - address
1069	     ranges are claimed as needed
1070	  �  Random allocation of addresses to sessions within a domain
1071	  �  Allocations of addresses have finite lifetimes

1073	The MASC (Multicast Address Set Claim) [21] is one such proposal. This
1074	is work in progress but the outline of the mechanism as described by the
1075	authors is as follows:
1076	  1. The ISP (Internet Service Provider) claims an address set (in
1077	     general, the ISP's next level provider will be in control of address
1078	     allocation to the ISP)
1079	  2. Address ranges will be allocated for a set lifetime and in such a
1080	     way that they may be aggregated. Allocation will take account of
1081	     administratively scoped multicast addresses.
1082	  3. The ISP then advertises this address range to other domains using
1083	     BGP4++ [21] (or similar mechanism). All border routers will thus
1084	     hear these announcements. It waits a certain time, however, (typ. 3
1085	     days) before using the address range, to detect collisions
1086	  4. Address allocation mechanisms (such as AAP or M-DHCP (see below))
1087	     will listen to announcements and allocate addresses to sessions
1088	     appropriately.

1090	A possibility being investigated for MASC address range allocation is
1091	"Kampai style" addressing [19] which uses a non-continuous mask to
1092	define the allocated range so is more flexible when range sizes need to
1093	be arbitrarily increased or decreased. However, Kampai-style addressing
1094	allocates ranges in sizes that are powers of two and hence could be very
1095	wasteful where large ranges are concerned. Although there is brief
1096	consideration of a way to remove this restriction, it is admitted it
1097	will be more complex and hasn't been thought through.

1099	The BGMP (Border Gateway Multicast Protocol) draft [24], proposes a new
1100	architecture for Inter-Domain IP multicast:
1101	  1. Address ranges are associated (at least temporarily) with domains,
1102	     allocated using the MASC mechanism described above.
1103	  2. These address ranges will be advertised globally. The proposal is
1104	     that BGP4 will be used for this, given the proposed Multiprotocol
1105	     Extensions [21]. Thus, multicast address ranges will be advertised
1106	     in much the same way as unicast IPv4 NLRIs are advertised in BGP4
1107	     now. They will be advertised at least a day in advance of use, to
1108	     enable any clashes of address allocation to be resolved.
1109	  3. A multicast session initiated within a domain will be allocated an
1110	     address from within the domain's multicast address assignment.
1111	  4. BGMP then assumes an inter-domain shared tree, for which the `root
1112	     domain' (the focal domain of the shared tree) is the domain owning
1113	     the address.

1115	Some alternative ideas generated while preparing this paper (but
1116	rejected in favour of the scheme presented) are recorded elsewhere [25]

1118	4. Limitations and Further Work

1120	The ability to do large-scale aggregation is based on hope that higher
1121	level organisation will be achieved. This scheme "feels" as if it is a
1122	good foundation for solutions to these problems, but the author can't
1123	give any guarantees without more work. This implies it will be necessary
1124	to simulate, build, test & refine.

1126	Aggregation of multicast addresses by this scheme will probably mean it
1127	is more efficient for routers to store multicast routing state keyed on
1128	interface than on multicast address. In other words, a table of
1129	aggregated multicast addresses would be held for each interface, rather
1130	than a table of interfaces for each address. The implications of this
1131	need investigation.

1133	The problem of how a router would efficiently look up each multicast
1134	packet in a table of AMAs has been deliberately left to one side.
1135	Because AMAs are logically similar to unicast address prefixes, similar
1136	techniques should be appropriate. This may not appear obvious, because
1137	an AMA is more obfuscated than a unicast routing prefix. Briefly, the
1138	first item to be extracted from an incoming packet would be d. This
1139	would point to where the mask started. The eight bits to the right of
1140	this mask would then be guaranteed to be unmasked (invariant), so d and
1141	this octet could be looked up in a Patricia trie or similar but more
1142	efficient data-structure [16]. The two potentially masked octets would
1143	then have to be built into another similar look-up table. Finding any
1144	one address in this structure would again be akin to the longest prefix
1145	(shortest mask) problem.

1147	It is possible that m is redundant, but it is believed that keeping it
1148	will make the aggregation maths a lot easier .

1150	Applicability of AMAs as a solution to reliable multicast
1151	clustering/layering needs assessment.

1153	Applicability of AMAs for aggregation in RSVP (reservation protocol)
1154	[27] when used for multicast needs assessment.

1156	Further work is needed on the potential for using AMAs to insulate
1157	upstream routers from high join/leave churn by introducing pessimistic
1158	inertia in the aggregation. The effect on leave latency (particularly
1159	where used for congestion control in layered multicast) would need
1160	careful study.

1162	The assertion that the weak capability for allocation growth (as
1163	compared to kampai-style addressing) is offset by more efficient storage
1164	needs more justification.

1166	It is believed that, due to topological realities, aggregation in the
1167	network will never approach the aggregation  potential of applications
1168	that use sessions with multiple multicast addresses. This would be a
1169	strong argument against aggregation schemes that are not end to end, but
1170	this needs proving.

1172	Evolution from current versions of protocols needs more careful
1173	analysis.

1175	The design of an AMA scheme for IPv6 [18] needs to be done.
1176	ATM (asynchronous transfer mode) multicast may benefit from a scheme
1177	based on similar thinking?

1179	5. Security Considerations

1181	Phasing of AMA aggregation in routers must be designed carefully so that
1182	it is not possible for one user to join to an AMA that overlaps a
1183	neighbouring join, then leave the AMA and cause the neighbours to lose
1184	their join before their soft-state refresh re-instates it.

1186	It is possible that the aggregation of multicast addresses into sets for
1187	use in the description of complex sessions will cause service providers
1188	to hoard multicast addresses more than when they are allocated singly.
1189	The scheme has been carefully designed to avoid such a tendency on
1190	technical grounds, but predicting how selfish people adapt based on
1191	their understanding of a mechanism moves us into the realms of
1192	psychology where anything goes. In other words, address hoarding
1193	shouldn't be any worse than the situation was without this proposal as
1194	long as people don't misunderstand.

1196	This proposal is considered independent of all aspects of the security
1197	of (encrypted, tamper-proofed, authenticated etc.) multicasts.

1199	6. Conclusions

1201	A strong case has been made that, for multicast addresses to be open to
1202	aggregation, there must be a standard way to name arbitrary size groups
1203	of addresses. An efficient, elegant technique for doing this has been
1204	presented which gives equal value to each point of the multicast address
1205	space, thus preserving the principle of randomness designed to prevent
1206	address hoarding. These names have been called aggregated multicast
1207	addresses (AMAs).

1209	For IPv4, AMAs look like a 32 bit IP address coupled with, typically, an
1210	extra octet (but more generally with two octets) representing the size
1211	of the aggregation of addresses. The list of addresses that form the
1212	aggregation are defined by identifying a string of bits (of defined
1213	width) within the address, which are incremented as if they were a
1214	binary number in their own right until the aggregation size is reached.
1215	All but the first four bits of the address-like component represents the
1216	base address of the aggregation from which the incrementation starts.
1217	Because multicast addresses always(iv) start with the same four bits in
1218	IPv4 (1110b), the first four bits of an AMA can be used to store another
1219	value in transit and storage, but replaced by 1110b to derive the base
1220	multicast address of the aggregation when required.  The value stored in
1221	the first four bits of an AMA represents the width of the field within
1222	the AMA which varies to define the list of addresses. Related but
1223	disjoint AMAs can also be represented efficiently by using the most
1224	general AMA form: an AMA followed by a sequence of alternating pairs of
1225	numbers which represent respectively a further jump from the based
1226	address then a further size of aggregation on top of this.

1228	In the scheme recommended for IPv4, potentially an arbitrary-sized
1229	aggregation of any size up to 2^16 - 1 multicast addresses could be
1230	represented by a field the size of an IPv4 address (4B) plus 2B (256kB
1231	reduced to 6B). However, it should be understood, that such ultra-large-
1232	scale aggregation has a low probability of happening without higher-
1233	level organisation of the address space by end-systems. First or second
1234	order aggregation will occur naturally under this scheme due to the
1235	large class of applications that build sessions from multiple multicast
1236	addresses. Medium-scale aggregation will be possible where routers can
1237	identify overlap or concatenation within multicast routing tables, which
1238	was not possible before without a way to describe aggregations of
1239	multicast addresses. This is because the AMA scheme is inherently
1240	recursive, so that it is possible to merge certain sets of AMAs into one
1241	AMA. To improve the chances of aggregation in multicast routing tables,
1242	address set allocation schemes must fulfil certain criteria that are
1243	laid down in this paper.

1245	The proposal is that AMAs will completely replace multicast addresses in
1246	contexts where aggregation makes sense, that is when describing,
1247	joining, leaving or updating the routing of multicast addresses. For
1248	sending data to and receiving data from multicast addresses,
1249	aggregation, and therefore AMAs, are not relevant. This implies
1250	protocols for describing multicast sessions (such as SDP, SAP, SIP, RSVP
1251	etc.) and protocols for updating the routing of multicast addresses
1252	(such as IGMP, DVMRP, CBT, PIM) will all need to be updated to handle
1253	AMAs in place of discrete multicast addresses. Independent evolution of
1254	all these protocols is considered to be reasonably straightforward.
1255	Although this represents a major round of protocol upgrades, all these
1256	protocols are experimental, and it is a commonly held view that
1257	multicast as it stands is not sufficiently scalable for wide-area
1258	deployment.

1260	It should be clarified that, although joining and leaving aggregates of
1261	multicast addresses can be achieved in single bulk operations, AMAs
1262	deliberately overlap in their use of individual addresses. Thus,
1263	allocation remains on a per address basis. In other words, when joining
1264	an AMA, it may be found that some of the addresses within it are in use
1265	if a strong address allocation scheme is not in use. An AMA allocation
1266	procedure is described in the text for mutating the AMA to cover an
1267	overlapping set of addresses to avoid those addresses in use while
1268	testing different addresses, until a full set of unused addresses is
1269	obtained without losing the addresses in the original AMA that were
1270	free.

1272	To summarise, we have presented a new paradigm(v) for multicasting, such
1273	that describing, joining, leaving and updating multicast routing can and
1274	should all be discussed in terms of aggregates of multicast addresses
1275	(even if they are of size unity) rather than discrete multicast
1276	addresses. We have introduced an efficient, elegant way to name such
1277	aggregates that preserves all the architectural principles on which
1278	multicast addressing is founded, and which will allow potentially large-
1279	scale aggregation of multicast addressing by both end-systems and
1280	routers in end to end co-operation.

1282	7. Acknowledgements

1284	Peter Bagnall, BT, for reviews and suggestions.

1286	8. References

1288	[1] S. McCanne, V. Jacobson, and M. Vetterli, (LBNL/UCB) "Receiver-
1289	driven Layered Multicast", In Proceedings of the ACM SIGCOMM'96,
1290	Stanford, California, pp.1-14. Aug 1996,
1291	<URL:ftp://ftp.ee.lbl.gov/papers/mccanne-sigcomm96.ps.gz>

1293	[2] S.E.Deering, "Multicast Routing in Internetworks and Extended LANs",
1294	In Proceedings of the ACM SIGCOMM'88, Stanford, California, Aug 1988

1296	[3] S.E.Deering (Stanford U.), "Host Extensions for IP Multicasting",
1297	IETF RFC, Aug 1989 <rfc1112.txt>

1299	[4] W. Fenner (Xerox PARC), "Internet Group Management Protocol, Version
1300	2", IETF Internet Draft, 01 Nov 1997, <draft-ietf-idmr-igmp-v2-08.txt>

1302	[5] D. Waitzman, C. Partridge, S. Deering, "Distance Vector Multicast
1303	Routing Protocol" v1,  IETF RFC, Nov 1988, <rfc1075.txt>

1305	[6] T. Pusateri (Juniper Networks), "Distance Vector Multicast Routing
1306	Protocol", IETF Internet Draft, 30 Oct 1997, <draft-ietf-idmr-dvmrp-v3-
1307	05.txt>

1309	[7] A.Ballardie, J.Crowcroft, "Core Based Trees: An Architecture for
1310	Scalable Inter-Domain Multicast Routing", In Proceedings of the ACM
1311	SIGCOMM'93, San Francisco, Sep 1993

1313	[8] A. Ballardie, "Core Based Trees (CBT) Multicast Routing
1314	Architecture", IETF RFC, Sep 1997, <rfc2201.txt>

1316	[9] A. Ballardie, "Core Based Trees (CBT version 2) Multicast Routing",
1317	IETF RFC, Sep 1997, <rfc2189.txt>

1319	[10] S.E.Deering, D.Estrin, D.Farinacci, V.Jacobsen, L.Ching-Gung and
1320	L.Wei, "An Architecture for Wide-Area Multicasting". In Proceedings of
1321	the ACM SIGCOMM'94, London, Sep 1994

1323	[11] Deering, S, et. al., "Protocol Independent Multicast Version 2,
1324	Dense Mode Specification", IETF Internet Draft, 28 May 1997 <draft-ietf-
1325	idmr-pim-dm-spec-05.txt>

1327	[12] Estrin, D, et. al., "Protocol Independent Multicast Sparse-Mode
1328	(PIM-SM): Protocol Specification", IETF Internet Draft, 9 Sep 1997,
1329	<draft-ietf-idmr-pim-sm-specv2-00.ps>

1331	[13] M. Handley (ISI) and V. Jacobson (LBNL), "SDP: Session Description
1332	Protocol", IETF Internet Draft, 02 Sep 1997, <draft-ietf-mmusic-sdp-
1333	04.ps>

1335	[14] M. Handley (ISI), "SAP - Session Announcement Protocol", IETF
1336	Internet Draft (currently expired, see Multiparty Multimedia Session
1337	Control IETF working group charter for status
1338	<http://www.cs.utk.edu/~moore/san-jose/wg/mmusic/charter.html>)

1340	[15] M. Handley (ISI), H. Schulzrinne (Columbia U.), E. Schooler
1341	(Caltech), "SIP: Session Initiation Protocol", IETF Internet Draft, 13
1342	Nov 1997, <draft-ietf-mmusic-sip-04.ps>

1344	[16] M.Degermark, A.Brodnik, S.Carlsson, S.Pink, "Small Forwarding
1345	Tables for Fast Routing Look-ups", In Proceedings of the ACM SIGCOMM'97,
1346	Cannes, France, Sep 1997
1347	<http://www.acm.org/sigcomm/sigcomm97/program.html#ab192>

1349	[17] D. Meyer, University of Oregon, "Administratively Scoped IP
1350	Multicast", IETF Internet Draft, Nov 1997, <draft-ietf-mboned-admin-ip-
1351	space-04.txt>

1353	[18] R. Hinden, Ipsilon Networks and S. Deering, Xerox PARC, "IP Version
1354	6 Addressing Architecture", IETF RFC <rfc1884.txt>

1356	[19] P. Tsuchiya, Bellcore, "Efficient and flexible Hierarchical Address
1357	Assignment", to be published - author's address:
1358	<tsuchiya@thumper.bellcore.com>

1360	[20] M. Handley (ISI), "AAP: Address Allocation Protocol"
1361	(presentation), In Proceedings of the 39th IETF (mmusic working group)
1362	<http://www.ietf.org/proceedings/97aug/toc-97aug.html>

1364	[21] D. Estrin (ISI), M. Handley (ISI) and D. Thaler (Merit), "MASC:
1365	Multicast Address Set Claim" (presentation), In Proceedings of the 39th
1366	IETF (idmr working group) <http://www.ietf.org/proceedings/97aug/toc-
1367	97aug.html>

1369	[21] Tony Bates et al., Cisco Systems, Juniper Networks, "Multiprotocol
1370	Extensions for BGP-4", 26 Sep 1997, <draft-ietf-idr-bgp4-multiprotocol-
1371	01.txt>

1373	[22] Baiju V. Patel, Intel Corp, Munil Shah, Microsoft Corp. "Multicast
1374	address allocation extensions to the Dynamic Host Configuration
1375	Protocol", IETF Internet Draft, 24 Nov 1997, <draft-ietf-dhc-mdhcp-
1376	03.txt>

1378	[23] Internet Assigned Numbers Authority, "Internet Multicast
1379	Addresses", <ftp://ftp.isi.edu/in-notes/iana/assignments/multicast-
1380	addresses>

1382	[24] D. Thaler (U. Michigan), D. Estrin (USC/ISI) and  D. Meyer (U.
1383	Oregon) , "Border Gateway Multicast Protocol (BGMP): Protocol
1384	Specification", IETF Internet Draft, 30 Oct 1997, <draft-ietf-idmr-gum-
1385	01.txt>

1387	[25] R. Briscoe, "Rejected ideas for End to End Aggregation of Multicast
1388	Addresses", 13 Nov 1997,
1389	<http://www.labs.bt.com/people/briscorj/projects/lsma/e2ama_rejects.html>

1391	[26] M. Handley, "Multicast address allocation", in archive of postings
1392	to ipmulticast mailing list, 19 Jun 1997,
1393	<http://www.ipmulticast.com/hypermail/ipmulticast/0144.html>

1395	[27] L. Zhang, S. Deering, D. Estrin, S. Shenker, and D. Zappala, "RSVP:
1396	A New Resource ReSerVation Protocol" In IEEE Network, Sep 1993,
1397	<http://www.isi.edu/div7/rsvp/pub.html>

1399	[28] This paper itself, in HTML format
1400	<http://www.labs.bt.com/people/briscorj/projects/lsma/e2ama.html>

1402	9. Authors' Addresses

1404	Bob Briscoe
1405	   B54/74 BT Labs
1406	   Martlesham Heath
1407	   Ipswich, IP5 3RE
1408	   England
1409	   Phone: +44 1473 645196
1410	   Fax:   +44 1473 640929
1411	   EMail: briscorj@boat.bt.com
1412	   Home page: http://www.labs.bt.com/people/briscorj/

1414	Dr Martin Tatham
1415	   B29/129 BT Labs
1416	   Martlesham Heath
1417	   Ipswich, IP5 3RE
1418	   England
1419	   Phone: +44 1473 642498
1420	   Fax:   +44 1473 640709
1421	   EMail: martin.tatham@bt-sys.bt.co.uk

1423	Notes

1425	i) Pronounced as in "If I 'ad an AMA" (American civil rights song in a
1426	cockney accent).

1428	ii) The multicast addresses that are within the bit mask defined by the
1429	width m, but not between vmin & vmax (defined by vmin and s) are not
1430	"wasted" by using an AMA. In the example used above, the addresses with
1431	v = 001, 010 & 011 are not reserved by the AMA. They can be used by a
1432	completely different application and different set of users. The bit
1433	mask is just a way of narrowing the context of the s parameter (to
1434	define when it wraps). It causes no direct effect on address space
1435	utilisation, but without it, it is believed aggregation would be a lot
1436	more difficult.

1438	iii) Application programs and even routers should never need to know the
1439	list of multicast addresses that an AMA resolves to (other than for
1440	evolution from today's protocols). They will not need to bit shift along
1441	multicast addresses to find the value of d, then bit shift d bits back
1442	from the end to find the mask etc. The AMA tuple can be used by
1443	applications in all cases in place of listing the set of addresses it
1444	resolves to {this statement needs proving, mind - I haven't worked
1445	through AMA aggregation, expansion and contraction maths yet, but we
1446	live in hope!}.

1448	iv) A discussion of the issues if this turns out to not always be the
1449	case is under the section on "Further storage optimisation".

1451	v) There are some places where the "p word" really is appropriate.