idnits 2.17.1 

draft-ietf-ion-marsmcs-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-27) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 200 instances of too long lines in the document, the longest
     one being 4 characters in excess of 72.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 167: '...An MCS MUST NOT share its ATM address ...'
     RFC 2119 keyword, line 177: '...ress of the MARS MUST be known to the ...'
     RFC 2119 keyword, line 180: '...startup, the MCS MUST open a point-to-...'
     RFC 2119 keyword, line 181: '...e MCS to the MARS MUST be carried over...'
     RFC 2119 keyword, line 182: '... MARSVC. The MCS MUST register with th...'
     (66 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 18, 1996) is 10022 days in the past.  Is
     this intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'ML93' is mentioned on line 48, but not defined

  == Unused Reference: 'BK95' is defined on line 741, but no explicit
     reference was found in the text

  == Unused Reference: 'LM93' is defined on line 744, but no explicit
     reference was found in the text

  -- Possible downref: Normative reference to a draft: ref. 'BK95' 

  ** Obsolete normative reference: RFC 1577 (ref. 'LM93') (Obsoleted by RFC
     2225)

  -- No information found for draft-luciani-rolc-scsp - is the name correct?

  -- Possible downref: Normative reference to a draft: ref. 'LA96' 

  -- No information found for draft-talpade-ion-multmcs - is the name correct?

  -- Possible downref: Normative reference to a draft: ref. 'TA96' 


     Summary: 13 errors (**), 0 flaws (~~), 4 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                            Talpade and Ammar
2	INTERNET-DRAFT                               Georgia Institute of Technology
3	                                                           November 18, 1996

5	                                                         Expires:  May, 1997

7	                      <draft-ietf-ion-marsmcs-01.txt>
8	      Multicast Server Architectures for MARS-based ATM multicasting.

10	Status of this Memo

12	This document is an Internet-Draft.  Internet-Drafts are working documents
13	of the Internet Engineering Task Force (IETF), its areas, and its working
14	groups.  Note that other groups may also distribute working documents as
15	Internet-Drafts.

17	Internet-Drafts are draft documents valid for a maximum of six months and
18	may be updated, replaced, or obsoleted by other documents at any time.  It
19	is inappropriate to use Internet-Drafts as reference material or to cite
20	them other than as "work in progress."

22	To learn the current status of any Internet-Draft, please check the
23	"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
24	Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe),
25	ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).

27	                                  Abstract

29	A mechanism to support the multicast needs of layer 3 protocols in general,
30	and IP in particular, over UNI 3.0/3.1 based ATM networks has been
31	described in RFC 2022.  Two basic approaches exist for the intra-subnet
32	(intra-cluster) multicasting of IP packets.  One makes use of a mesh of
33	point-to-multipoint VCs (the 'VC Mesh' approach), while the other uses a
34	shared point-to-multipoint tree rooted on a Multicast Server (MCS). This
35	memo provides details on the design and implementation of an MCS, building
36	on the core mechanisms defined in RFC 2022.  It also provides a mechanism
37	for using multiple MCSs per group for providing fault tolerance.  This
38	approach can be used with RFC 2022 based MARS server and clients, without
39	needing any change in their functionality.

41	1 Introduction

43	A solution to the problem of mapping layer 3 multicast service over the
44	connection-oriented ATM service provided by UNI 3.0/3.1 has been presented
45	in [GA96].  A Multicast Address Resolution Server (MARS) is used to
46	maintain a mapping of layer 3 group addresses to ATM addresses in that
47	architecture.  It can be considered to be an extended analog of the ATM ARP
48	Server introduced in RFC 1577 ([ML93]).  Hosts in the ATM network use the
49	MARS to resolve layer 3 multicast addresses into corresponding lists of ATM
50	addresses of group members.  Hosts keep the MARS informed when they need to
51	join or leave a particular layer 3 group.

53	The MARS manages a "cluster" of ATM-attached endpoints.  A "cluster" is
54	defined as

56	"The set of ATM interfaces choosing to participate in direct ATM
57	connections to achieve multicasting of AAL_SDUs between themselves."

59	In practice, a cluster is the set of endpoints that choose to use the same
60	MARS to register their memberships and receive their updates from.

62	A sender in the cluster has two options for multicasting data to the group
63	members.  It can get the list of ATM addresses constituting the
64	group from the MARS, set up a point-to-multipoint virtual circuit (VC) with
65	the group members as leaves, and then proceed to send data out on it.
66	Alternatively, the source can make use of a proxy Multicast Server (MCS).
67	The source transmits data to such an MCS, which in turn uses a
68	point-to-multipoint VC to get the data to the group members.

70	The MCS approach has been briefly introduced in [GA96].  This memo presents
71	a detailed description of MCS architecture and proposes a simple mechanism
72	for supporting multiple MCSs for fault tolerance.  We assume an
73	understanding of the IP multicasting over UNI 3.0/3.1 ATM network concepts
74	described in [GA96], and access to it.  This document is organized as
75	follows.  Section 2 presents interactions with the local UNI 3.0/3.1
76	signaling entity that are used later in the document and have been
77	originally described in [GA96].  Section 3 presents an MCS architecture,
78	along with a description of its interactions with the MARS. Section 4
79	describes the working of an MCS. The possibility of using multiple MCSs for
80	the same layer 3 group, and the mechanism needed to support such usage, is
81	described in section 5.  A comparison of the VC Mesh approach and the MCS
82	approach is presented in Appendix A.

84	2 Interaction with the local UNI 3.0/3.1 signaling entity

86	The following generic signaling functions are presumed to be available to
87	local AAL Users:

89	L_CALL_RQ - Establish a unicast VC to a specific endpoint.
90	L_MULTI_RQ - Establish multicast VC to a specific endpoint.
91	L_MULTI_ADD - Add new leaf node to previously established VC.
92	L_MULTI_DROP - Remove specific leaf node from established VC.
93	L_RELEASE - Release unicast VC, or all Leaves of a multicast VC.

95	The following indications are assumed to be available to AAL Users,
96	generated by the local UNI 3.0/3.1 signaling entity:

98	L_ACK - Successful completion of a local request.
99	L_REMOTE_CALL - A new VC has been established to the AAL User.
100	ERR_L_RQFAILED - A remote ATM endpoint rejected an L_CALL_RQ,
101	                      L_MULTI_RQ, or L_MULTI_ADD.
102	ERR_L_DROP - A remote ATM endpoint dropped off an existing VC.
103	ERR_L_RELEASE - An existing VC was terminated.

105	3 MCS Architecture

107	The MCS acts as a proxy server which multicasts data received from a source
108	to the group members in the cluster.  All multicast sources transmitting to
109	an MCS-based group send the data to the specified MCS. The MCS then
110	forwards the data over a point-to-multipoint VC that it maintains to group
111	members in the cluster.  Each multicast source thus maintains a single
112	point-to-multipoint VC to the designated MCS for the group.  The designated
113	MCS terminates one point-to-multipoint VC from each cluster member that is
114	multicasting to the layer 3 group.  Each group member is the leaf of the
115	point-to-multipoint VC originating from the MCS.

117	A brief introduction to possible MCS architectures has been presented in
118	[GA96].  The main contribution of that document concerning the MCS approach
119	is the specification of the MARS interaction with the MCS. The next section
120	lists control messages exchanged by the MARS and MCS.

122	3.1 Control Messages exchanged by the MCS and the MARS

124	The following control messages are exchanged by the MARS and the MCS.

126	operation code                Control Message

128	      1                       MARS_REQUEST
129	      2                       MARS_MULTI
130	      3                       MARS_MSERV
131	      6                       MARS_NAK
132	      7                       MARS_UNSERV
133	      8                       MARS_SJOIN
134	      9                       MARS_SLEAVE
135	     12                       MARS_REDIRECT_MAP

137	MARS_MSERV and MARS_UNSERV are identical in format to the MARS_JOIN message.
138	MARS_SJOIN and MARS_SLEAVE are also identical in format to MARS_JOIN. As
139	such, their formats and those of MARS_REQUEST, MARS_MULTI, MARS_NAK and
140	MARS_REDIRECT_MAP are described in [GA96].  Their usage is described in
141	section 4.  All control messages are LLC/SNAP encapsulated as described in
142	section 4.2 of [GA96].  (The "mar$" notation used in this document is
143	borrowed from [GA96], and indicates a specific field in the control
144	message.)  Data messages are reflected without any modification by the MCS.

146	3.2 Association with a layer 3 group

148	The simplest MCS architecture involves taking incoming AAL_SDUs from the
149	multicast sources and sending them out over the point-to-multipoint VC to
150	the group members.  The MCS can service just one layer 3 group using this
151	design, as it has no way of distinguishing between traffic destined for
152	different groups.  So each layer 3 MCS-supported group will have its own
153	designated MCS.

155	However, it is desirable in the interests of saving resources to utilize
156	the same MCS to support multiple groups.  This can be done by adding minimal
157	layer 3 specific processing into the MCS. The MCS can now look inside the
158	received AAL_SDUs and determine which layer 3 group they are destined for.
159	A single instance of such an MCS could register its ATM address with the
160	MARS for multiple layer 3 groups, and manage multiple point-to-multipoint
161	VCs, one for each group.  This capability is included in the MCS
162	architecture, as is the capability of having multiple MCSs per group
163	(section 5).

165	4 Working of MCS

167	An MCS MUST NOT share its ATM address with any other cluster member (MARS
168	or otherwise).  However, it may share the same physical ATM interface (even
169	with other MCSs or the MARS), provided that each logical entity has a
170	different ATM address.  This section describes the working of MCS and its
171	interactions with the MARS and other cluster members.

173	4.1 Usage of MARS_MSERV and MARS_UNSERV

175	4.1.1 Registration (and deregistration) with the MARS

177	The ATM address of the MARS MUST be known to the MCS by out-of-band means
178	at startup.  One possible approach for doing this is for the network
179	administrator to specify the MARS address at command line while invoking
180	the MCS. On startup, the MCS MUST open a point-to-point control VC (MARS_VC)
181	with the MARS. All traffic from the MCS to the MARS MUST be carried over
182	the MARS_VC. The MCS MUST register with the MARS using the MARS_MSERV
183	message on startup.  To register, a MARS_MSERV MUST be sent by the MCS to
184	the MARS over the MARS_VC. On receiving this MARS_MSERV, the MARS adds the
185	MCS to the ServerControlVC. The ServerControlVC is maintained by the MARS
186	with all MCSs as leaves, and is used to disseminate general control
187	messages to all the MCSs.  The MCS MUST terminate this VC, and MUST expect
188	a copy of the MCS registration MARS_MSERV on the MARS_VC from the MARS.

190	An MCS can deregister by sending a MARS_UNSERV to the MARS. A copy of this
191	MARS_UNSERV MUST be expected back from the MARS. The MCS will then be
192	dropped from the ServerControlVC.

194	No protocol specific group addresses are included in MCS registration
195	MARS_MSERV and MARS_UNSERV. The mar$flags.register bit MUST be set, the
196	mar$cmi field MUST be set to zero, the mar$flags.sequence field MUST be set
197	to zero, the source ATM address MUST be included and a null source protocol
198	address MAY be specified in these MARS_MSERV and MARS_UNSERV. All other
199	fields are set as described in section 5.2.1 of [GA96] (the MCS can be
200	considered to be a cluster member while reading that section).  The MCS MUST
201	keep retransmitting (section 4.1.3) the MARS_MSERV/MARS_UNSERV over the
202	MARS_VC until it receives a copy back.

204	In case of failure to open the MARSVC, or error on it, the reconnection
205	procedure outlined in section 4.5.2 is to be followed.

207	4.1.2 Registration (and deregistration) of layer 3 groups

209	The MCS can register with the MARS to support particular group(s).  To
210	register a group X, a MARS_MSERV with a <min, max> pair of <X, X> MUST be
211	sent to the MARS. The MCS MUST expect a copy of the MARS_MSERV back from the
212	MARS. The retransmission strategy outlined in section 4.1.3 is to be
213	followed if no copy is received.  Multiple groups can be supported by
214	sending a separate MARS_MSERV for each group.

216	The MCS MUST similarly use MARS_UNSERV if it wants to withdraw support for a
217	specific layer 3 group.  A copy of the group MARS_UNSERV MUST be received,
218	failing which the retransmission strategy in section 4.1.3 is to be
219	followed.

221	The mar$flags.register bit MUST be reset and the mar$flags.sequence field
222	MUST be set to zero in the group MARS_MSERV and MARS_UNSERV. All other
223	fields are set as described in section 5.2.1 of [GA96] (the MCS can be
224	considered to be a cluster member when reading that section).
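
The field settings required for the two kinds of MARS_MSERV/MARS_UNSERV
(MCS registration in section 4.1.1, group registration in this section)
can be summarized in a short sketch.  This is only an illustration: the
function name, the dictionary representation and the "source_atm" and
"pairs" keys are hypothetical, and every field that the text defers to
section 5.2.1 of [GA96] is deliberately left out.

```python
def mserv_fields(src_atm, group=None):
    """Return the field settings this memo prescribes, keyed by field name.

    group=None models the MCS registration (or deregistration) message;
    any other value models a message for a single layer 3 group X.
    Fields governed by [GA96] section 5.2.1 are not modelled here.
    """
    if group is None:
        return {
            "mar$flags.register": 1,    # register bit MUST be set
            "mar$cmi": 0,               # MUST be zero
            "mar$flags.sequence": 0,    # MUST be zero
            "source_atm": src_atm,      # MUST be included
            "pairs": [],                # no group addresses are carried
        }
    return {
        "mar$flags.register": 0,        # register bit MUST be reset
        "mar$flags.sequence": 0,        # MUST be zero
        "source_atm": src_atm,
        "pairs": [(group, group)],      # <min, max> pair of <X, X>
    }
```

Supporting several groups then amounts to building one such group
message per group, since a separate message is sent for each.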

226	4.1.3 Retransmission of MARS_MSERV and MARS_UNSERV

228	Transient problems may cause loss of control messages.  The MCS needs to
229	retransmit MARS_MSERV/MARS_UNSERV at regular intervals when it does not
230	receive a copy back from the MARS. This interval should be no shorter than
231	5 seconds, and a default value of 10 seconds is recommended.  A maximum of
232	5 retransmissions is permitted before a failure is logged.  This MUST be
233	considered a MARS failure, which SHOULD result in the MARS reconnection
234	mechanism described in section 4.5.2.

236	A "copy" is defined as a received message with the following fields
237	matching the previously transmitted MARS_MSERV/MARS_UNSERV:

239	   -  mar$op
240	   -  mar$flags.register
241	   -  mar$pnum
242	   -  Source ATM address
243	   -  first <min, max> pair

245	In addition, a valid copy MUST have the following field values:

247	   -  mar$flags.punched = 0
248	   -  mar$flags.copy = 1

250	If either of the above is not true, the message MUST be dropped without
251	resetting of the MARS_MSERV/MARS_UNSERV timer.  There MUST be only one
252	MARS_MSERV or MARS_UNSERV outstanding at a time.
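
The copy test and the retransmission limit described in this section
might be sketched as follows.  The ControlMsg class and its attribute
names are assumptions standing in for a parsed control message, and
the two return strings are placeholders for whatever the
implementation does next; only the field list, the required flag
values and the numeric limits come from the text above.

```python
from dataclasses import dataclass

RETRANSMIT_INTERVAL_S = 10   # recommended default; no shorter than 5 seconds
MAX_RETRANSMISSIONS = 5      # once these go unanswered, treat as MARS failure


@dataclass(frozen=True)
class ControlMsg:
    """Hypothetical parsed view of a MARS_MSERV or MARS_UNSERV."""
    op: int              # mar$op
    register: int        # mar$flags.register
    pnum: int            # mar$pnum
    src_atm: bytes       # source ATM address
    first_pair: tuple    # first <min, max> pair
    punched: int = 0     # mar$flags.punched
    copy: int = 0        # mar$flags.copy


def is_valid_copy(sent, rcvd):
    """True if rcvd is a valid copy of the message previously sent."""
    fields_match = (
        rcvd.op == sent.op
        and rcvd.register == sent.register
        and rcvd.pnum == sent.pnum
        and rcvd.src_atm == sent.src_atm
        and rcvd.first_pair == sent.first_pair
    )
    # A valid copy additionally has punched = 0 and copy = 1.
    return fields_match and rcvd.punched == 0 and rcvd.copy == 1


def on_timer_expiry(retransmissions_sent):
    """Decide what to do when the timer fires with no copy received."""
    if retransmissions_sent < MAX_RETRANSMISSIONS:
        return "retransmit"
    return "mars_failure"    # proceed as in section 4.5.2
```

A received message that fails is_valid_copy() is simply dropped, and
the timer keeps running as if nothing had arrived.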

254	4.1.4 Processing of MARS_MSERV and MARS_UNSERV

256	The MARS transmits copies of group MARS_MSERV and MARS_UNSERV on the
257	ServerControlVC. So they are also received by MCSs other than the
258	originating one.  This section discusses the processing of these messages
259	by the other MCSs.

261	If a MARS_MSERV is seen that refers to a layer 3 group not supported by the
262	MCS, it MUST be used to track the Server Sequence Number (section 4.5.1)
263	and then silently dropped.

265	If a MARS_MSERV is seen that refers to a layer 3 group supported by the MCS,
266	the MCS learns of the existence of another MCS supporting the same group.
267	This possibility (of multiple MCSs per group) is incorporated in this
268	version of the MCS approach and is discussed in section 5.

270	4.2 Usage of MARS_REQUEST and MARS_MULTI

272	As described in section 5.1, the MCS learns at startup whether it is an
273	active or inactive MCS. After successful registration with the MARS, an MCS
274	which has been designated as inactive for a particular group MUST NOT
275	register to support that group with the MARS. It instead proceeds as in
276	section 5.4.  The active MCS for a group also has to do some special
277	processing, which we describe in that section.  The rest of section 4
278	describes the working of a single active MCS, with section 5 describing the
279	active MCS's actions for supporting multiple MCSs.

281	After the active MCS registers to support a layer 3 group, it uses
282	MARS_REQUEST and MARS_MULTI to obtain information about group membership
283	from the MARS. These messages are also used during the revalidation phase
284	(section 4.5) and when no outgoing VC exists for a received layer 3 packet
285	(section 4.3).

287	On registering to support a particular layer 3 group, the MCS MUST send a
288	MARS_REQUEST to the MARS. The mechanism to retrieve group membership and the
289	format of MARS_REQUEST and MARS_MULTI are described in sections 5.1.1 and
290	5.1.2 of [GA96] respectively.  The MCS MUST use this mechanism for sending
291	(and retransmitting) the MARS_REQUEST and processing the returned
292	MARS_MULTI(/s).  The MARS_MULTI MUST be received correctly, and the MCS MUST
293	use it to initialize its knowledge of group membership.

295	On successful reception of a MARS_MULTI, the MCS MUST attempt to open the
296	outgoing point-to-multipoint VC using the mechanism described in section
297	5.1.3 of [GA96], if any group members exist.  However, the MCS MUST start
298	transmitting data on this VC only after it has opened it successfully with
299	at least one of the group members as a leaf, and after it has attempted to
300	add all the group members at least once.

302	4.3 Usage of outgoing point-to-multipoint VC

304	Cluster members which are sources for MCS-supported layer 3 groups send
305	(encapsulated) layer 3 packets to the designated MCSs.  An MCS, on
306	receiving them from cluster members, has to send them out over the specific
307	point-to-multipoint VC for that layer 3 group.  This VC is set up as
308	described in the previous section.  However, it is possible that no group
309	members currently exist, thus causing no VC to be set up.  So an MCS may
310	have no outgoing VC to forward received layer 3 packets on, in which case
311	it MUST initiate the MARS_REQUEST and MARS_MULTI sequence described in the
312	previous section.  This new MARS_MULTI could contain new members, whose
313	MARS_SJOINs may not have been received by the MCS (and the loss not detected
314	due to absence of traffic on the ServerControlVC).

316	If an MCS learns that there are no group members (MARS_NAK received from
317	MARS), it MUST delay sending out a new MARS_REQUEST for that group for a
318	period no less than 5 seconds and no more than 10 seconds.

320	Layer 3 packets received from cluster members, while no outgoing
321	point-to-multipoint VC exists for that group, MUST be silently dropped
322	after following the guidelines in the previous paragraphs.  This might
323	result in some layer 3 packets being lost until the VC is set up.

325	Each outgoing point-to-multipoint VC has a revalidate flag associated with
326	it.  This flag MUST be checked whenever a layer 3 packet is sent out on
327	that VC. No action is taken if it is not set.  If it is set, the packet is
328	sent out, the revalidation procedure (section 4.5.3) MUST be initiated for
329	this group, and the flag MUST be reset.

331	In case of error on a point-to-multipoint VC, the MCS MUST initiate
332	revalidation procedures for that VC as described in section 4.5.3.

334	Once a point-to-multipoint VC has been set up for a particular layer 3
335	group, the MCS MUST hold the VC open and mark it as the outgoing path for
336	any subsequent layer 3 packets being sent for that group address.  A
337	point-to-multipoint VC MUST NOT have an activity timer associated with it.
338	It is to remain up at all times, unless the MCS explicitly stops supporting
339	that layer 3 group, or no more leaves exist on the VC which causes it to be
340	shut down.  The VC is kept up in spite of non-existent traffic to reduce the
341	delay suffered by MCS supported groups.  If the VC were to be shut down on
342	absence of traffic, the VC reestablishment procedure (needed when new
343	traffic for the layer 3 group appears) would further increase the initial
344	delay, which can be potentially higher than the VC mesh approach anyway as
345	two VCs need to be set up in the MCS case (one from source to MCS, second
346	from MCS to group) as opposed to only one (from source to group) in the VC
347	Mesh approach.  This approach of keeping the VC from the MCS open even in
348	the absence of traffic is experimental.  A decision either way can only be
349	made after gaining experience (either through implementation or simulation)
350	about the implications of keeping the VC open.

352	If the MCS supports multiple layer 3 groups, it MUST follow the procedure
353	outlined in the four previous subsections for each group for which it is an
354	active MCS. Each incoming data AAL_SDU MUST be examined for determining its
355	recipient group, before being forwarded onto the appropriate outgoing
356	point-to-multipoint VC.

358	4.3.1 Group member dropping off a point-to-multipoint VC

360	An ERR_L_DROP may be received during the lifetime of a point-to-multipoint
361	VC indicating that a leaf node has terminated its participation at the ATM
362	level.  The ATM endpoint associated with the ERR_L_DROP MUST be removed from
363	the locally held set associated with the VC. The revalidate flag on the VC
364	MUST be set after a random interval of 1 through 10 seconds.

366	If an ERR_L_RELEASE is received for a VC, then the entire set is cleared and
367	the VC considered to be completely shut down.  A new VC for this layer 3
368	group will be established only on reception of new traffic for the group
369	(as described in section 4.3).

371	4.4 Processing of MARS_SJOIN and MARS_SLEAVE

373	The MARS transmits equivalent MARS_SJOIN/MARS_SLEAVE on the ServerControlVC
374	when it receives MARS_JOIN/MARS_LEAVE from cluster members.  The MCSs keep
375	track of group membership updates through these messages.  The format of
376	these messages is identical to MARS_JOIN and MARS_LEAVE, which are
377	described in section 5.2.1 of [GA96].  It is sufficient to note here that
378	these messages carry the ATM address of the node joining/leaving the
379	group(/s), the group(/s) being joined or left, and a Server Sequence Number
380	from MARS.

382	When a MARS_SJOIN is seen which refers to (or encompasses) a layer 3 group
383	(or groups) supported by the MCS, the following action MUST be taken.  The
384	new member's ATM address is extracted from the MARS_SJOIN. An L_MULTI_ADD is
385	issued for the new member for each of those referred groups which have an
386	outgoing point-to-multipoint VC. An L_MULTI_RQ is issued for the new member
387	for each of those referred groups which have no outgoing VCs.

389	When a MARS_SLEAVE is seen that refers to (or encompasses) a layer 3 group
390	(or groups) supported by the MCS, the following action MUST be taken.  The
391	leaving member's ATM address is extracted.  An L_MULTI_DROP is issued for
392	the member for each of the referred groups which have an outgoing
393	point-to-multipoint VC.

395	There is a possibility of the above requests (L_MULTI_RQ or L_MULTI_ADD or
396	L_MULTI_DROP) failing.  The UNI 3.0/3.1 failure cause must be returned in
397	the ERR_L_RQFAILED signal from the local signaling entity to the AAL User.
398	If the failure cause is not 49 (Quality of Service unavailable), 51 (user
399	cell rate not available - UNI 3.0), 37 (user cell rate not available - UNI
400	3.1), or 41 (Temporary failure), the endpoint's ATM address is dropped from
401	the locally held view of the group by the MCS. Otherwise, the request MUST
402	be re-attempted with increasing delay (initial value between 5 and 10
403	seconds, with delay value doubling after each attempt) until it either
404	succeeds or the multipoint VC is released or a MARS_SLEAVE is received for
405	that group member.  If the VC is open, traffic on the VC MUST continue
406	during these attempts.
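
The failure handling just described reduces to a classification of the
cause value plus a doubling retry delay.  A minimal sketch follows; the
function names and return strings are hypothetical, the choice of a
uniform random initial delay is an assumption (the text only bounds it
to the 5 to 10 second range), and the stopping conditions (success, VC
release, or a leave notification for the member) are left to the
caller.

```python
import random

# Cause values the text singles out for retry: 49 (Quality of Service
# unavailable), 51 and 37 (user cell rate not available, UNI 3.0 and
# UNI 3.1), and 41 (Temporary failure).
RETRY_CAUSES = frozenset({49, 51, 37, 41})


def on_request_failed(cause):
    """Classify the failure cause carried with a request failure."""
    if cause in RETRY_CAUSES:
        return "retry"          # re-attempt with increasing delay
    return "drop_member"        # remove the endpoint from the local view


def retry_delays(attempts, rng=random):
    """Delays (seconds) before each of the first `attempts` retries."""
    delay = rng.uniform(5.0, 10.0)   # assumed: uniform pick in [5, 10]
    delays = []
    for _ in range(attempts):
        delays.append(delay)
        delay *= 2                   # delay doubles after each attempt
    return delays
```

Traffic on an already open VC continues while these retries are
pending, so the delays affect only the endpoint being re-added.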

408	MARS_SJOIN and MARS_SLEAVE are processed differently if multiple MCSs share
409	the members of the same layer 3 group (section 5.4).  MARS_SJOIN and
410	MARS_SLEAVE that do not refer to (or encompass) supported groups MUST be
411	used to track the Server Sequence Number (section 4.5.1), but are otherwise
412	ignored.

414	4.5 Revalidation Procedures

416	The MCS has to initiate revalidation procedures in case of certain failures
417	or errors.

419	4.5.1 Server Sequence Number

421	The MCS needs to track the Server Sequence Number (SSN) in the messages
422	received on the ServerControlVC from the MARS. It is carried in the mar$msn
423	of all messages (except MARS_NAK) sent by the MARS to MCSs.  A jump in SSN
424	implies that the MCS missed the previous message(/s) sent by the MARS. The
425	MCS then sets the revalidate flag on all outgoing point-to-multipoint VCs
426	after a random delay of between 1 and 10 seconds, to avoid all MCSs
427	inundating the MARS simultaneously in case of a more general failure.

429	The only exception to the rule is if a sequence number jump is detected
430	during the establishment of a new group's VC (i.e., a MARS_MULTI was
431	correctly received, but its mar$msn indicated that some previous MARS
432	traffic had been missed on ClusterControlVC). In this case every open VC,
433	EXCEPT the one just being established, MUST have its revalidate flag set at
434	some random interval between 1 and 10 seconds from the time the jump in SSN
435	was detected.  (The VC being established is considered already validated in
436	this case).

438	Each MCS keeps its own 32 bit MCS Sequence Number (MSN) to track the SSN.
439	Whenever a message is received that carries a mar$msn field, the following
440	processing is performed:

442	        Seq.diff = mar$msn - MSN

444	        mar$msn -> MSN

446	        (.... process MARS message ....)

448	        if ((Seq.diff != 1) && (Seq.diff != 0))
449	              then (.... revalidate group membership information ....)

451	The mar$msn value in an individual MARS_MULTI is not used to update the MSN
452	until all parts of the MARS_MULTI (if > 1) have arrived.  (If the mar$msn
453	changes during reception of a MARS_MULTI series, the MARS_MULTI is discarded
454	as described in section 5.1.1 of [GA96]).

456	The MCS sets its MSN to zero on startup.  It gets the current value of SSN
457	when it receives the copy of the registration MARS_MSERV back from the MARS.
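
The sequence number processing in this section can be sketched as a
small class.  Two points are assumptions, not statements of the text:
the subtraction is performed modulo 2^32 so that a wrap from the
largest value to zero counts as a step of one, and the initial
synchronization on the registration copy is modelled as a separate
sync() call that does not report a jump.  The class and method names
are hypothetical.

```python
MASK32 = 0xFFFFFFFF


class SequenceTracker:
    """Tracks the MCS Sequence Number (MSN) against received mar$msn."""

    def __init__(self):
        self.msn = 0                 # MSN is zero on startup

    def sync(self, mar_msn):
        """Adopt the current SSN from the registration copy."""
        self.msn = mar_msn & MASK32

    def update(self, mar_msn):
        """Record mar$msn; return True if revalidation is needed."""
        # Assumed: unsigned 32 bit arithmetic for the difference.
        diff = (mar_msn - self.msn) & MASK32
        self.msn = mar_msn & MASK32
        # Differences of 0 or 1 are normal; anything else is a jump.
        return diff not in (0, 1)
```

For a multi-part reply, update() would be called once, after the last
part has arrived, in line with the rule stated above.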

459	4.5.2 Reconnecting to the MARS

461	The MCSs are assumed to have been configured with the ATM address of at
462	least one MARS at startup.  MCSs MAY choose to maintain a table of ATM
463	addresses, each address representing an alternative MARS which will be
464	contacted in case of failure of the previous one.  This table is assumed to
465	be ordered in descending order of preference.

467	An MCS will decide that it has problems communicating with a MARS if:

469	   * It fails to establish a point-to-point VC with the MARS.

471	   * MARS_REQUEST generates no response (no MARS_MULTI or MARS_NAK returned).

473	   * ServerControlVC fails.

475	   * MARS_MSERV or MARS_UNSERV do not result in their respective copies being
476	     received.

478	Reconnection proceeds as in section 5.4 of [GA96], with MCS-specific
479	actions used where needed.

481	4.5.3 Revalidating a point-to-multipoint VC

483	The revalidate flag associated with a point-to-multipoint VC is checked
484	when a layer 3 packet is to be sent out on the VC. Revalidation procedures
485	MUST be initiated for a point-to-multipoint VC that has its revalidate flag
486	set when a layer 3 packet is being sent out on it.  Thus more active groups
487	get revalidated faster than less active ones.  The revalidation process
488	MUST NOT result in disruption of normal traffic on the VC being
489	revalidated.

491	The revalidation procedure is as follows.  The MCS reissues a MARSREQUEST
492	for the VC being revalidated.  The returned set of members is compared with
493	the locally held set; LMULTI-ADDs MUST be issued for new members, and
494	LMULTI-DROPs MUST be issued for members no longer in the returned set.  The
495	revalidate flag MUST then be reset for the VC.
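
The membership comparison amounts to two set differences.  A minimal
sketch, with a hypothetical function name and members represented as plain
strings for illustration:

```python
def revalidation_changes(local_members, returned_members):
    """Compare the locally held set with the set returned by the MARS.

    Returns (to_add, to_drop): members needing an add operation and
    members needing a drop operation on the point-to-multipoint VC.
    """
    local = set(local_members)
    returned = set(returned_members)
    to_add = sorted(returned - local)    # new members
    to_drop = sorted(local - returned)   # members no longer in the group
    return to_add, to_drop
```

Since only the differences are acted on, leaves present in both sets are
never touched, which is one way to avoid disrupting normal traffic on the
VC.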

497	5 Multiple MCSs for a layer 3 group

499	Having a single MCS for a layer 3 group can cause it to become a single
500	point of failure and a bottleneck for groups with large numbers of active
501	senders.  It is thus desirable to introduce a level of fault tolerance by
502	having multiple MCSs per group.  Support for load sharing is not introduced
503	in this version of the draft so as to reduce the complexity of the
504	protocol.

506	5.1 Outline

508	The protocol described in this draft offers fault tolerance by using
509	multiple MCSs for the same group.  This is achieved by having a standby MCS
510	take over from a failed MCS which had been supporting the group.  The MCS
511	currently supporting a group is referred to as the active MCS, while the one
512	or more standby MCSs are referred to as inactive MCSs.  There is only one
513	active MCS existing at any given instant for an MCS-supported group.  The
514	protocol makes use of the HELLO messages as described in [LA96].

516	To reduce the complexity of the protocol, the following operational
517	guidelines need to be followed.  These guidelines need to be enforced by
518	out-of-band means which are not specified in this document and can be
519	implementation dependent.

521	   * The set of (one or more) MCSs (``mcslist'') that support a particular
522	     IP Multicast group is predetermined and fixed.  This set MUST be known
523	     to each MCS in the set at startup, and the ordering of MCSs in the set
524	     is the same for all MCSs in the set.  An implementation of this would
525	     be to maintain the set of ATM addresses of the MCSs in a file, an
526	     identical copy of which is kept at each MCS in the set.

528	   * All MCSs in ``mcslist'' have to be started up together, with the first
529	     MCS in ``mcslist'' being the last to be started.

531	   * A failed MCS cannot be started up again.

533	5.2 Discussion of Multiple MCSs in operation

535	An MCS on startup determines its position in the ``mcslist''.  If the MCS
536	is not the first in ``mcslist'', it does not register for supporting the
537	group with the MARS. If the MCS is first in the set, it does register to
538	support the group.
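
The startup decision can be illustrated with a short sketch.  The function
name and the string role labels are hypothetical; the only input is the
identical, identically ordered ``mcslist'' held by every MCS.

```python
def startup_role(mcslist, own_address):
    """Return "active" for the first MCS in mcslist, else "inactive".

    Raises ValueError if own_address is not in mcslist, since every MCS
    is expected to appear in the predetermined set.
    """
    position = mcslist.index(own_address)
    return "active" if position == 0 else "inactive"
```

Only an MCS whose role is "active" would go on to register with the MARS.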

540	The first MCS thus becomes the active MCS and supports the group as
541	described in section 4.  The active MCS also opens a point-to-multipoint VC
542	(HelloVC) to the remaining MCSs in the set (the inactive MCSs).  It starts
543	sending HELLO messages on this VC at a fixed interval (HelloInterval
544	seconds).  The inactive MCSs maintain a timer to keep track of the last
545	received HELLO message.  If an inactive MCS does not receive a message
546	within HelloInterval*DeadFactor seconds (values of HelloInterval and
547	DeadFactor are the same at all the MCSs), or if the HelloVC is closed, it
548	assumes failure of the active MCS and attempts to elect a new one.  The
549	election process is described in section 5.5.

551	If an MCS is elected as the new active one, it registers to support the
552	group with the MARS. It also initiates the transmission of HELLO messages
553	to the remaining inactive MCSs.
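
The timer kept by an inactive MCS might look like the following sketch.
The class and method names are hypothetical, time is passed in explicitly
as seconds, and treating the active MCS as failed once exactly
HelloInterval*DeadFactor seconds have elapsed is an assumption about the
boundary case.

```python
class HelloMonitor:
    """Hypothetical HELLO timer for an inactive MCS."""

    def __init__(self, hello_interval, dead_factor, now):
        self.hello_interval = hello_interval
        self.dead_factor = dead_factor
        self.last_hello = now

    def on_hello(self, now, hello_interval=None, dead_factor=None):
        """Restart the timer; adopt new values if the HELLO carries them."""
        self.last_hello = now
        if hello_interval is not None:
            self.hello_interval = hello_interval
        if dead_factor is not None:
            self.dead_factor = dead_factor

    def active_failed(self, now, hello_vc_open=True):
        """True if the HelloVC closed or the dead interval has elapsed."""
        if not hello_vc_open:
            return True
        return now - self.last_hello >= self.hello_interval * self.dead_factor
```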

555	5.3 Inter-MCS control messages

557	The protocol uses HELLO messages in the heartbeat mechanism, and also
558	during the election process.  The format of the HELLO message is based on
559	that described in [LA96].  The Hello message type code is 5.

561	    0                   1                   2                   3
562	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
563	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
564	   | Sender Len    |    Recvr Len  | State | Type  |    unused     |
565	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
566	   |         HelloInterval         |          DeadFactor           |
567	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
568	   |                        IP Multicast address                   |
569	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
570	   |                    Sender ATM address (variable length)       |
571	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
572	   |                  Receiver ATM address (variable length)       |
573	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

575	   Sender Len
576	     This field holds the length in octets of the Sender ATM address.

578	   Recvr Len
579	     This field holds the length in octets of the Receiver ATM
580	     address.

582	   State
583	     Two states are currently defined: No-Op (0x00) and Elected (0x01).
584	     The field is used by a candidate MCS to indicate whether it was
585	     successfully elected.

587	   Type
588	     This is the code for the message type.

590	   HelloInterval
591	     The hello interval advertises the time between sending of
592	     consecutive Hello Messages by an active MCS.  If the time between
593	     Hello messages exceeds the HelloInterval then the Hello is to be
594	     considered late by the inactive MCS.

596	   DeadFactor
597	     This is a multiplier to the HelloInterval. If an inactive MCS
598	     does not receive a Hello message within the interval
599	     HelloInterval*DeadFactor from an active MCS that advertised
600	     the HelloInterval then the inactive MCS MUST consider the active
601	     one to have failed.

603	   IP Multicast address
604	     This field is used to indicate the group to associate the HELLO
605	     message with. It is useful if MCSs can support more than one
606	     group.

608	   Sender ATM address
609	     This is the ATM address of the server which is sending the
610	     Hello.

612	   Receiver ATM address
613	     This is the ATM address of the server which is to reply to
614	     the Hello.  If the sender does not know this address then the
615	     sender sets it to zero. (This happens in the HELLO messages sent
616	     from the active MCS to the inactive ones, as they are multicast
617	     and not sent to one specific receiver).
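
To make the layout concrete, the following sketch encodes and decodes a
HELLO message.  Several details are assumptions rather than statements of
this document: network byte order for multi-octet fields, State occupying
the high-order four bits of the third octet and Type the low-order four
(following the bit positions in the diagram), and the function and
constant names themselves.

```python
import socket
import struct

HELLO_TYPE = 5        # Hello message type code given in the text
STATE_NO_OP = 0x0
STATE_ELECTED = 0x1

# Fixed part: Sender Len, Recvr Len, State|Type, unused, HelloInterval,
# DeadFactor, IP Multicast address.  "!" (network byte order) is assumed.
_FIXED = struct.Struct("!BBBBHH4s")


def pack_hello(state, hello_interval, dead_factor, group,
               sender_atm, receiver_atm=b""):
    """Build a HELLO; an empty receiver_atm gives Recvr Len = 0."""
    state_type = ((state & 0x0F) << 4) | (HELLO_TYPE & 0x0F)
    fixed = _FIXED.pack(len(sender_atm), len(receiver_atm), state_type, 0,
                        hello_interval, dead_factor, socket.inet_aton(group))
    return fixed + sender_atm + receiver_atm


def unpack_hello(data):
    """Parse a HELLO built by pack_hello into a dict of its fields."""
    (sender_len, recvr_len, state_type, _unused,
     hello_interval, dead_factor, group) = _FIXED.unpack_from(data)
    body = data[_FIXED.size:]
    return {
        "state": state_type >> 4,
        "type": state_type & 0x0F,
        "hello_interval": hello_interval,
        "dead_factor": dead_factor,
        "group": socket.inet_ntoa(group),
        "sender_atm": body[:sender_len],
        "receiver_atm": body[sender_len:sender_len + recvr_len],
    }
```

A multicast HELLO from the active MCS would be built with an empty
receiver_atm, while a reply from an inactive MCS to a candidate would
carry both addresses.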

619	5.4 The Multiple MCS protocol

621	As is indicated in section 5.1, all the MCSs supporting the same IP
622	Multicast group MUST be started up together.  The set of MCSs (``mcslist'')
623	MUST be specified to each MCS in the set at startup.  After registering to
624	support the group with the MARS, the first MCS in the set MUST open a
625	point-to-multipoint VC (HelloVC) with the remaining MCSs in the ``mcslist''
626	as leaves, and thus assumes the role of active MCS. It MUST send HELLO
627	messages HelloInterval seconds apart on this VC. The Hello message sent by
628	the active MCS MUST have the Receiver Len set to zero, the State field set
629	to "Elected", with the other fields appropriately set.  The Receiver ATM
630	address field does not exist in this HELLO message.  The initial value of
631	HelloInterval and DeadFactor MUST be the same at all MCSs at startup.  The
632	active MCS can choose to change these values by introducing the new value
633	in the HELLO messages that are sent out.  The active MCS MUST support the
634	group as described in section 4.

636	The other MCSs in ``mcslist'' determine the identity of the first MCS from
637	the ``mcslist''.  They MUST NOT register to support the group with the
638	MARS, and become inactive MCSs.  On startup, an inactive MCS expects HELLO
639	messages from the active MCS. The inactive MCS MUST terminate the HelloVC
640	as a leaf.  A timer MUST be maintained, and if the inactive MCS does not
641	receive a HELLO message from the active one within a period of
642	HelloInterval*DeadFactor seconds, it assumes that the active MCS has
643	died, and initiates the election process as described in section 5.5.  If
644	a HELLO message is received within this period, the inactive MCS takes no
645	further action other than restarting the timer.  The inactive MCSs MUST
646	set their values of HelloInterval and DeadFactor to those specified by
647	the active MCS in the HELLO messages.

649	On failure of the active MCS, a new MCS assumes its role as described in
650	section 5.5.  In this case, the remaining inactive MCSs will expect HELLO
651	messages from this new active MCS as described in the previous paragraph.

653	5.5 Failure handling

655	5.5.1 Failure of active MCS

657	The failure of the active MCS is detected by the inactive MCSs if no HELLO
658	message is received within an interval of HelloInterval*DeadFactor seconds,
659	or if the HelloVC is closed.  In this case the next MCS in ``mcslist''
660	becomes the candidate MCS. It MUST open a point-to-multipoint VC to the
661	remaining inactive MCSs (HelloVC) and send a HELLO message on it with the
662	State field set to No-Op.  The rest of the message is formatted as
663	described earlier.

665	On receiving a HELLO message from a candidate MCS, an inactive MCS MUST
666	open a point-to-point VC to that candidate.  It MUST send a HELLO message
667	back to it, with the Sender and Receiver fields appropriately set (not
668	zero), and the State field being No-Op.  If a HELLO message is received by
669	an inactive MCS from a non-candidate MCS, it is ignored.  If no HELLO
670	message is received from the candidate with the State field set to
671	"Elected" in HelloInterval seconds, the inactive MCS MUST retransmit the
672	HELLO. If no HELLO message with State field set to "Elected" is received by
673	the inactive MCSs within an interval of HelloInterval*DeadFactor seconds,
674	the next MCS in ``mcslist'' is considered as the candidate MCS. Note that
675	the values used for HelloInterval and DeadFactor in the election phase are
676	the default ones.

678	The candidate MCS MUST wait for a period of HelloInterval*DeadFactor
679	seconds to receive HELLO messages from inactive MCSs.  It MUST transmit
680	HELLO messages with State field set to No-Op every HelloInterval seconds
681	during this period.  If it receives messages from at least half of the
682	remaining inactive MCSs during this period, it considers itself elected
683	and assumes the active MCS role.  It then registers to support the group
684	with the MARS, and starts sending HELLO messages every HelloInterval
685	seconds with State field set to "Elected" on the already existing
686	HelloVC. The active MCS can then alter the HelloInterval and DeadFactor
687	values if desired, and communicate the same to the inactive MCSs in the
688	HELLO message.
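
The two decisions in the election, which MCS is the next candidate and
whether a candidate has heard from enough peers, can be sketched as below.
The function names are hypothetical.  Treating a candidate with no
remaining inactive MCSs as elected is an assumption about a case the text
does not address.

```python
def next_candidate(mcslist, current):
    """Return the MCS after `current` in mcslist, or None at the end."""
    index = mcslist.index(current)
    return mcslist[index + 1] if index + 1 < len(mcslist) else None


def is_elected(responders, remaining_inactive):
    """True if at least half of the remaining inactive MCSs replied.

    HELLOs from MCSs outside remaining_inactive are not counted.  With
    no remaining inactive MCSs the result is True (an assumption).
    """
    remaining = set(remaining_inactive)
    heard = remaining & set(responders)
    return 2 * len(heard) >= len(remaining)
```

For example, with ``mcslist'' (A, B, C, D) and A failed, B is the
candidate and needs a reply from at least one of C and D.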

690	5.5.2 Failure of inactive MCS

692	If an inactive MCS drops off the HelloVC, the active MCS MUST make up to
693	three attempts to add that MCS back to the VC, spaced
694	HelloInterval*DeadFactor seconds apart.  If the third attempt also fails,
695	the inactive MCS is considered dead.

697	An MCS, active or inactive, MUST NOT be restarted on its own once it has
698	failed.  Failed MCSs can only be brought back by manual intervention:
699	all the MCSs are shut down and then restarted together.

701	5.6 Compatibility with future MARS and MCS versions

703	Future versions of MCSs can be expected to use an enhanced MARS for load
704	sharing and fault tolerance ([TA96]).  The MCS architecture described in
705	this document is compatible with the enhanced MARS and the future MCS
706	versions.  This is because the active MCS is the only one which
707	communicates with the MARS about the group.  Hence the active MCS will only
708	be informed by the enhanced MARS about the subset of the group that it is
709	to support.  Thus MCSs conforming to this document are compatible with
710	[GA96] based MARS, as well as enhanced MARS.

712	6 Summary

714	This draft describes the architecture of an MCS. It also provides a
715	mechanism for using multiple MCSs per group for providing fault tolerance.
716	This approach can be used with [GA96] based MARS server and clients,
717	without needing any change in their functionality.  It uses the HELLO
718	packet format as described in [LA96] for the heartbeat messages.

720	7 Acknowledgements

722	We would like to acknowledge Grenville Armitage (Bellcore) for reviewing
723	the draft and suggesting improvements towards simplifying the multiple MCS
724	functionalities.  Discussion with Joel Halpern (Newbridge) helped clarify
725	the multiple MCS problem.  Anthony Gallo (IBM RTP) pointed out security
726	issues that are not adequately addressed in the current draft.

728	8 Authors' Addresses

730	Rajesh Talpade - taddy@cc.gatech.edu - (404)-894-6737
731	Mostafa H. Ammar - ammar@cc.gatech.edu - (404)-894-3292

733	College of Computing
734	Georgia Institute of Technology
735	Atlanta, GA 30332-0280
736	References

738	[GA96]   Armitage, G.J., "Support for Multicast over UNI 3.0/3.1 based ATM
739	         networks", RFC 2022.

741	[BK95]   Birman, A., Kandlur, D., Rubas, J., "An extension to the MARS
742	         model", Internet Draft, draft-kandlur-ipatm-mars-directvc-00.txt,
743	         November 1995.
744	[LM93]   Laubach, M., "Classical IP and ARP over ATM", RFC 1577,
745	         Hewlett-Packard Laboratories, December 1993.

747	[LA96]   Luciani, J., G. Armitage, and J. Halpern, "Server Cache
748	         Synchronization Protocol (SCSP) - NBMA", Internet Draft,
749	         draft-luciani-rolc-scsp-02.txt, April 1996.

751	[TA96]   Talpade, R., and Ammar, M.H., "Multiple MCS support using an
752	         enhanced version of the MARS server", Internet Draft (work in
753	         progress), draft-talpade-ion-multmcs-00.txt, June 1996.