Network Working Group                                     Adrian Farrel
Internet Draft                                           Movaz Networks
Category: Informational
Expiration Date: August 2003                              February 2003

         Applicability Statement for Restart Mechanisms for the
                      Label Distribution Protocol

              draft-ietf-mpls-ldp-restart-applic-00.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC 2026 [RFC2026].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

Multiprotocol Label Switching (MPLS) systems will be used in core networks where system downtime must be kept to a minimum. Similarly, where MPLS is at the network edges (for example, in Provider Edge routers) system downtime must also be kept as small as possible. Many MPLS Label Switching Routers (LSRs) may, therefore, exploit Fault Tolerant (FT) hardware or software to provide high availability of the core networks.

The details of how FT is achieved for the various components of an FT LSR, including the switching hardware and the TCP stack, are implementation specific. How the software module itself chooses to implement FT for the state created by the Label Distribution Protocol (LDP) is also implementation specific, but there are several issues in the LDP specification in RFC 3036, "LDP Specification", that make it difficult to implement an FT LSR using the LDP protocols without some extensions to those protocols.

Proposals have been made in "Fault Tolerance for the Label Distribution Protocol (LDP)" [LDP-FT] and "Graceful Restart Mechanism for LDP" [LDP-RESTART] to address these issues.

This document gives guidance on when it is advisable to implement some form of LDP restart mechanism and which approach might be more suitable. The issues and extensions described here are equally applicable to RFC 3212, "Constraint-Based LSP Setup Using LDP" (CR-LDP).

1. Requirements of an LDP FT System

MPLS is a technology that will be used in core networks where system downtime must be kept to an absolute minimum. Similarly, where MPLS is at the network edges (for example, in PE routers as described in RFC 2547) system downtime must also be kept as small as possible.

Many MPLS LSRs may, therefore, exploit FT hardware or software to provide high availability (HA) of core networks.

In order to provide HA, an MPLS system needs to be able to survive a variety of faults with minimal disruption to the Data Plane, including the following fault types:

- failure/hot-swap of the switching fabric in an LSR

- failure/hot-swap of a physical connection between LSRs

- failure of the TCP or LDP stack in an LSR

- software upgrade to the TCP or LDP stacks in an LSR.
The first two examples of faults listed above may be confined to the Data Plane, in which case such faults can be handled by providing redundancy in the Data Plane that is transparent to LDP operating in the Control Plane. However, the failure of the switching fabric or a physical link may have repercussions in the Control Plane since signaling may be disrupted.

The third example may be caused by a variety of events, including processor or other hardware failure, and software failure.

Any of the last three examples may impact the Control Plane and will require action in the Control Plane to recover. Such action should be designed to avoid disrupting traffic in the Data Plane. This is possible because many recent router architectures separate the Control and Data Planes such that forwarding can continue unaffected by recovery action in the Control Plane.

In other scenarios, the Data and Control Planes may be impacted by a fault, but the needs of HA require the coordinated recovery of the Data and Control Planes to the state that existed before the fault.

The provision of protection paths for MPLS LSPs and the protection of links, IP routes or tunnels through the use of protection LSPs is outside the scope of this document. See [MPLS-RECOV] for further information on this subject.

2. General Considerations

In order that the Data and Control Plane states may be successfully recovered after a fault, procedures are required to ensure that the state held on a pair of LDP peers (at least one of which was affected directly by the fault) is synchronized. Such procedures must be implemented in the Control Plane software modules on the peers using Control Plane protocols.

The required actions may operate fully after the failure (reactive recovery) or may contain elements that operate before the fault in order to minimize the actions taken after the fault (proactive recovery). It is rarely feasible to implement actions that operate solely in advance of the failure and do not require any further processing after the failure (preventive recovery); this is because of the dynamic nature of signaling protocols and the unpredictability of fault timing.

Reactive recovery actions may include full re-signaling of state, re-synchronization of state between peers, and synchronization based on checkpointing.

Proactive recovery actions may include hand-shaking state transitions and checkpointing.

3. Specific Issues with the LDP Protocol

LDP uses TCP to provide reliable connections between LSRs over which to exchange protocol messages to distribute labels and to set up LSPs. A pair of LSRs that have such a connection are referred to as LDP peers.

TCP enables LDP to assume reliable transfer of protocol messages. This means that some of the messages do not need to be acknowledged (for example, Label Release).

LDP is defined such that if the TCP connection fails, the LSR should immediately tear down the LSPs associated with the session between the LDP peers, and release any labels and resources assigned to those LSPs.

It is notoriously hard to provide a Fault Tolerant implementation of TCP. To do so might involve making copies of all data sent and received. This is an issue familiar to implementers of other TCP applications such as BGP.
During failover affecting the TCP or LDP stacks, therefore, the TCP connection may be lost. Recovery from this position is made worse by the fact that LDP control messages may have been lost during the connection failure. Since these messages are unconfirmed, it is possible that LSP or label state information will be lost.

The solution to this problem must at the very least include a change to the basic requirements of LDP so that the failure of an LDP session does not require that associated LDP or forwarding state be torn down.

Any changes made to LDP in support of recovery processing must meet the following requirements:

- offer backward-compatibility with LSRs that do not implement the extensions to LDP

- preserve existing protocol rules described in [RFC3036] for handling unexpected duplicate messages and for processing unexpected messages referring to unknown LSPs/labels.

Ideally, any solution applicable to LDP should be equally applicable to CR-LDP.

4. Summary of the Features of LDP FT

LDP Fault Tolerance extensions are described in [LDP-FT]. This approach, illustrated by the sketch at the end of this section, involves:

- negotiation between LDP peers of the intent to support extensions to LDP that facilitate recovery from failover without loss of LSPs

- selection of FT survival on a per LSP/label basis or for all labels on a session

- sequence numbering of LDP messages to facilitate acknowledgement and checkpointing

- acknowledgement of LDP messages to ensure that a full handshake is performed on those messages, either frequently (such as per message) or less frequently as in checkpointing

- solicitation of up-to-date acknowledgement (checkpointing) of previous LDP messages to ensure that the current state is secured, with an additional option that allows an LDP partner to request that state is flushed in both directions if graceful shutdown is required

- a timer to control how long LDP and forwarding state should be retained after LDP session failure before being discarded if LDP communications are not re-established

- exchange of checkpointing information on LDP session recovery to establish what state has been retained by recovering LDP peers

- re-issuing of lost messages after failover to ensure that LSP/label state is correctly recovered after reconnection of the LDP session.

The FT procedures in [LDP-FT] concentrate on the preservation of label state for labels exchanged between a pair of adjacent LSRs when the TCP connection between those LSRs is lost. There is no intention within these procedures to support end-to-end protection for LSPs.
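As a rough illustration of how the sequence numbering, acknowledgement, and re-issue elements listed above fit together, the following Python sketch models a sender that retains unacknowledged messages for replay after failover. All names and structures are invented for illustration; they do not reflect the actual TLVs or procedures of [LDP-FT].

   # Illustrative sketch only: models the acknowledgement and checkpoint
   # ideas of [LDP-FT]; names and structures are hypothetical, not the
   # protocol's actual TLVs or procedures.

   from dataclasses import dataclass, field

   @dataclass
   class FtSender:
       next_seq: int = 1                            # next sequence number
       unacked: dict = field(default_factory=dict)  # seq -> message

       def send(self, message: str) -> int:
           """Number the message and retain it until acknowledged."""
           seq = self.next_seq
           self.next_seq += 1
           self.unacked[seq] = message
           return seq

       def ack(self, acked_seq: int) -> None:
           """An ack (possibly a checkpoint) covers all messages <= acked_seq."""
           for seq in [s for s in self.unacked if s <= acked_seq]:
               del self.unacked[seq]

       def replay_after_failover(self) -> list:
           """Unacknowledged messages must be re-issued on session recovery."""
           return [self.unacked[s] for s in sorted(self.unacked)]

   sender = FtSender()
   for msg in ("Label Mapping for FEC A", "Label Mapping for FEC B",
               "Label Release for FEC A"):
       sender.send(msg)
   sender.ack(2)                          # checkpoint secures messages 1 and 2
   print(sender.replay_after_failover())  # ['Label Release for FEC A']

In this model, a deferred acknowledgement of sequence number 2 acts like a checkpoint: it secures everything up to that point, leaving only later messages to be re-issued when the session is re-established.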
5. Summary of the Features of LDP Graceful Restart

LDP graceful restart extensions are defined in [LDP-RESTART]. This approach involves:

- negotiation between LDP peers of the intent to support extensions to LDP that facilitate recovery from failover without loss of LSPs

- a mechanism whereby an LSR that restarts can relearn LDP state by resynchronization with its peers

- use of the same mechanism to allow LSRs recovering from an LDP session failure to resynchronize LDP state with their peers, provided that at least one of the LSRs has retained state across the failure or has itself resynchronized state with its peers

- a timer to control how long LDP and forwarding state should be retained after LDP session failure before being discarded if LDP communications are not re-established

- a timer to control the length of the period during which resynchronization of state between adjacent peers should be completed.

The procedures in [LDP-RESTART] are applicable to all LSRs, both those with the ability to preserve forwarding state during LDP restart and those without. An LSR that cannot preserve its MPLS forwarding state across an LDP restart will impact MPLS traffic during the restart, but by implementing a subset of the mechanisms in [LDP-RESTART] it can minimize that impact if its neighbors are capable of preserving their forwarding state across the restart of their LDP sessions or Control Planes by implementing the mechanisms in [LDP-RESTART].

6. Applicability Considerations

This section considers the applicability of fault tolerance schemes within LDP networks and considers issues that might lead to the choice of one method or another. Many of the points raised below should be viewed as implementation issues rather than specific drawbacks of either solution.

6.1 General Applicability

The procedures described in [LDP-FT] and [LDP-RESTART] are intended to cover two distinct scenarios. In Session Failure, the LDP peers at the ends of a session remain active, but the session fails and is restarted. Note that session failure does not imply failure of the data channel, even when using an in-band control channel. In Node Failure, the session fails because one of the peers has been restarted (or at least, the LDP component of the node has been restarted). These two scenarios have different implications for the ease of retention of LDP state within an individual LSR, and are described in the sections below.

These techniques are only applicable in LDP networks where at least one LSR has the capability to retain LDP signaling state and the associated forwarding state across LDP session failure and recovery. In [LDP-RESTART] the LSRs retaining state do not need to be adjacent to the failed LSR or session.

If traffic is not to be impacted, both LSRs at the ends of an LDP session must at least preserve forwarding state. Preserving LDP state is not a requirement for preserving traffic.

[LDP-FT] requires that the LSRs at both ends of the session implement the procedures that it describes. Thus, either traffic is preserved and recovery resynchronizes state, or no traffic is preserved and the LSP fails.

Further, to use the procedures of [LDP-FT] to recover state on a session, both LSRs must have a mechanism for maintaining some session state and a way of auditing the forwarding state and the resynchronized control state.
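The following sketch illustrates, in hypothetical Python terms, the separation assumed above between forwarding state (which must survive if traffic is to be preserved) and LDP control state (which may be preserved, reconstructed, or relearned). All field names are invented for this example.

   # Hypothetical illustration of the state separation discussed above:
   # forwarding entries survive in the Data Plane, while LDP bindings are
   # control state that is preserved, reconstructed, or relearned.

   from dataclasses import dataclass

   @dataclass(frozen=True)
   class ForwardingEntry:      # Data Plane state: must survive for traffic
       in_label: int
       out_label: int
       next_hop: str

   @dataclass
   class LdpBinding:           # Control Plane view of the same LSP
       fec: str                # e.g. an address prefix
       label: int
       peer: str               # identity of the advertising peer
       stale: bool = True      # stale until confirmed by resynchronization

   def audit(bindings, confirmed):
       """Keep only control state confirmed by peers after recovery."""
       return [b for b in bindings if (b.fec, b.label) in confirmed]

   survivors = audit([LdpBinding("10.0.1.0/24", 17, "lsr-b")],
                     {("10.0.1.0/24", 17)})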
[LDP-RESTART] is scoped to support preservation of traffic if both LSRs implement the procedures that it describes. Additionally, it functions if only one LSR on the failed session supports retention of forwarding state and implements the mechanisms in the document; in this case traffic will be impacted by the session failure, but the forwarding state will be recovered on session recovery. Further, in the event of simultaneous failures, [LDP-RESTART] is capable of relearning and redistributing state across multiple LSRs by combining its mechanisms with the usual LDP message exchanges of [RFC3036].

6.2 Session Failure

In Session Failure, an LDP session between two peers fails and is restarted. There is no restart of the LSRs at either end of the session, and LDP continues to function on those nodes.

In these cases, it is simple for LDP implementations to retain LDP state associated with the failed session and to associate the state with the new session when it is established. Housekeeping may be applied to determine that the failed session is not returning and to release the old LDP state. Both [LDP-FT] and [LDP-RESTART] handle this case.

Applicability of [LDP-FT] and [LDP-RESTART] to the Session Failure scenario should be considered with respect to the availability of the data plane.

In some cases the failure of the LDP session may be independent of any failure of the physical (or virtual) link(s) between adjacent peers; for example, it might represent a failure of the TCP/IP stack. In these cases the data plane is not impacted and both [LDP-FT] and [LDP-RESTART] are applicable to preserve or restore LDP state.

LDP signaling may also operate out of band; that is, it may use different links from the data plane. In this case, a failure of the LDP session may be a result of a failure of the control channel, but there is no implied failure of the data plane. For this scenario, [LDP-FT] and [LDP-RESTART] are both applicable to preserve or restore LDP state.

In the case where the failure of the LDP session also implies the failure of the data plane, it may be an implementation decision whether LDP peers retain forwarding state, and for how long. In such situations, if forwarding state is retained and the LDP session is re-established, both [LDP-FT] and [LDP-RESTART] are applicable to preserve or restore LDP state.

When the data plane has been disrupted, an objective of a recovery implementation might be to restore data traffic as quickly as possible.

6.3 Controlled Session Failure

In some circumstances the LSRs may know in advance that an LDP session is going to fail; perhaps a link is going to be taken out of service.

[RFC3036] includes provision for controlled shutdown of a session. [LDP-FT] and [LDP-RESTART] allow resynchronization of LDP state upon re-establishment of the session.

[LDP-FT] offers the facility both to checkpoint all state before the shutdown and to quiesce the session so that no new state changes are attempted between the checkpoint and the shutdown. This means that on recovery, resynchronization is simple and fast.

[LDP-RESTART] resynchronizes all state on recovery regardless of the nature of the shutdown.
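Before turning to Node Failure, the following sketch illustrates, with invented names and values, how the two timers summarized in section 5 might interact from the point of view of a neighbor that survives a peer's failure: retained bindings are held as stale while waiting for the session to return, and are released if either the holding timer expires or they are not resynchronized within the recovery window.

   # Invented names and values: how a surviving neighbor might apply the
   # two [LDP-RESTART] timers to bindings retained for a restarting peer.

   RECONNECT_TIMEOUT = 120  # seconds to hold stale state awaiting a session
   RECOVERY_TIME = 60       # seconds allowed for resynchronization afterwards

   def surviving_neighbor(bindings, reconnect_after, refreshed):
       """bindings: retained bindings (all marked stale on session failure);
       reconnect_after: seconds until the session returned (None if never);
       refreshed: bindings re-advertised within the recovery window."""
       if reconnect_after is None or reconnect_after > RECONNECT_TIMEOUT:
           return set()                 # holding timer expired: release all
       return set(bindings) & set(refreshed)  # unrefreshed state is released

   # Peer returns after 30s and re-advertises two of three bindings; the
   # third is released when the recovery window closes.
   print(surviving_neighbor({"L1", "L2", "L3"}, 30, {"L1", "L2"}))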
6.4 Node Failure

Node Failure describes events where a whole node is restarted or where the component responsible for LDP signaling is restarted. Such an event will be perceived by the LSR's peers as session failure, but the restarting node sees the restart as full re-initialization.

The basic requirement is that forwarding state is retained; otherwise the data plane will necessarily be interrupted. If forwarding state is not retained, it may be relearned from saved control state in [LDP-FT]. [LDP-RESTART] does not utilize or expect saved control state, and if a node restarts without preserved forwarding state it informs its neighbors, which immediately delete all label-FEC bindings previously received from the restarted node.

The ways to retain forwarding and control state are numerous and implementation specific, and it is not the purpose of this document to espouse one mechanism or another, nor even to suggest how this might be done. If state has been preserved across the restart, synchronization with peers can be carried out as though recovering from Session Failure, as in the previous section. Both [LDP-FT] and [LDP-RESTART] support this case.

How much control state is retained is largely an implementation choice, but [LDP-FT] requires that at least a small amount of per-session control state be retained. [LDP-RESTART] does not require or expect control state to be retained.

It is also possible that the restarting LSR has not preserved any state. In this case [LDP-FT] is of no help. [LDP-RESTART], however, allows the restarting LSR to relearn state from each adjacent peer through the processes for resynchronizing after Session Failure. Further, in the event of simultaneous failure of multiple adjacent nodes, the nodes at the edge of the failure zone can recover state from their active neighbors and distribute it to the other recovering LSRs without any failed LSR having to have saved state.

6.5 Controlled Node Failure

In some cases (hardware repair, software upgrade, etc.) node failure may be predictable. In these cases all sessions with peers may be shut down, and existing state retention may be enhanced by special actions.

[LDP-FT] checkpointing and quiesce may be applied to all sessions so that state is up to date.

As above, [LDP-RESTART] does not require that state is retained by the restarting node, but can utilize it if it is.

6.6 Speed of Recovery

Speed of recovery is impacted by the amount of signaling required.

If forwarding state is preserved on both LSRs on the failed session, then the recovery time is constrained by the time to resynchronize the state between the two LSRs.

[LDP-FT] may resynchronize very quickly. In a stable network this resolves to a handshake of a checkpoint. At most, resynchronization involves this handshake plus an exchange of messages to handle state changes since the checkpoint was taken. Implementations that support only the periodic checkpointing subset of [LDP-FT] are more likely to have additional state to resynchronize.

[LDP-RESTART] must resynchronize state for all label mappings that have been retained.
At the same time, resources that have been retained by a restarting upstream LSR, but are not actually required because they have been released by the downstream LSR (perhaps because it was in the process of releasing the state), must be held for the full resynchronization time to ensure that they are not needed.

The impact of recovery time will vary according to the use of the network. Both [LDP-FT] and [LDP-RESTART] allow advertisement of new labels while resynchronization is in progress. Issues to consider are the re-availability of falsely retained resources and the conflict between retained label mappings and newly advertised ones, since this may cause incorrect forwarding of data: because labels are advertised from downstream, an LSR upstream of a failure may continue to forward data for one FEC on an old label while the recovering downstream LSR might re-assign that label to another FEC and advertise it. For this reason, restarting LSRs may choose not to advertise new labels until resynchronization with their peers has completed, or may decide to use special techniques to cover the short period of overlap between resynchronization and new LSP setup.

6.7 Scalability

Scalability is largely the same issue as speed of recovery and is governed by the number of LSPs managed through the failed session(s).

Note that there are limits to how small the resynchronization time in [LDP-RESTART] may be made given the capabilities of the LSRs, the throughput on the link between them, and the number of labels that must be resynchronized; a rough worked example follows at the end of this section.

Impact on normal operation should also be considered.

[LDP-FT] requires acknowledgement of all messages. These acknowledgements may be deferred, as for the checkpointing described in section 6.4, or may be frequent. Although acknowledgements can be piggy-backed on other state messages, an option for frequent acknowledgement is to send a message solely for the purpose of acknowledging a state change message. Such an implementation would clearly be unwise in a busy network.

[LDP-RESTART] has no impact on normal operations.
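As the rough worked example promised above, the following back-of-envelope calculation suggests how the resynchronization time in [LDP-RESTART] is bounded by message size, link throughput, and peer processing rate. Every figure is an assumption invented for the example, not a measurement or a recommendation.

   # Back-of-envelope lower bound on resynchronization time; every figure
   # is an invented assumption for illustration, not a measurement.

   bindings = 100_000          # label bindings to resynchronize
   bytes_per_mapping = 50      # assumed size of one Label Mapping message
   link_bps = 10_000_000       # assumed usable control-channel rate (bit/s)
   peer_msgs_per_sec = 5_000   # assumed peer message processing rate

   wire_time = bindings * bytes_per_mapping * 8 / link_bps  # 4.0 seconds
   cpu_time = bindings / peer_msgs_per_sec                  # 20.0 seconds
   print(f"lower bound ~ {max(wire_time, cpu_time):.0f} seconds")

With these invented numbers the peers' processing rate, not the link, dominates, giving a floor of roughly twenty seconds below which the resynchronization timer could not usefully be set.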
6.8 Rate of Change of LDP State

Some networks do not show a high degree of change over time, such as those using targeted LDP sessions; others change the LDP forwarding state frequently, perhaps reacting to changes in routing information on LDP discovery sessions.

The rate of change of LDP state exchanged over an LDP session depends on the application for which the LDP session is being used. LDP sessions used for exchanging bindings to establish hop-by-hop LSPs will typically exchange state reacting to IGP changes. Such exchanges could be frequent. On the other hand, LDP sessions established for exchanging MPLS Layer 2 VPN FECs will typically exhibit a smaller rate of state exchange.

In [LDP-FT] two options exist. The first uses a frequent (up to per-message) acknowledgement system, which is most likely to be applicable in a more dynamic system where it is desirable to preserve the maximum amount of state over a failure in order to reduce the level of resynchronization required and to speed the recovery time.

The second option in [LDP-FT] uses a less frequent acknowledgement scheme known as checkpointing. This is particularly suitable for networks where changes are infrequent or bursty.

[LDP-RESTART] resynchronizes all state on recovery regardless of the rate of change of the network before the failure. This consideration is thus not relevant to the choice of [LDP-RESTART].

6.9 Label Distribution Modes

Both [LDP-FT] and [LDP-RESTART] are suitable for use with Downstream Unsolicited label distribution.

[LDP-RESTART] describes Downstream-On-Demand as an area for future study and is therefore not applicable to a network in which this label distribution mode is used. It is possible that future examination of this issue will reveal that once a label has been distributed in either distribution mode, it can be redistributed by [LDP-RESTART] upon session recovery.

[LDP-FT] is suitable for use in a network that uses Downstream-On-Demand label distribution.

In theory, and according to [RFC3036], even in networks configured to utilize Downstream Unsolicited label distribution, there may be occasions when the use of Downstream-On-Demand distribution is desirable. The use of the Label Request message is not prohibited in a Downstream Unsolicited label distribution LDP network.

Opinion varies as to whether there is a practical requirement for the use of the Label Request message in a Downstream Unsolicited label distribution LDP network. Current deployment experience suggests that there is no such requirement.

6.10 Implementation Complexity

Implementation complexity has consequences for the implementer and also for the deployer, since complex software is more error prone and harder to manage.

[LDP-FT] is a more complex solution than [LDP-RESTART]. In particular, [LDP-RESTART] does not require any modification to the normal signaling and processing of LDP state changing messages.

[LDP-FT] implementations may be simplified by implementing only the checkpointing subset of the functionality.

6.11 Implementation Robustness

In addition to the implications for robustness associated with the complexity of the solutions, consideration should be given to the effects of state preservation on robustness.

If state has become incorrect for whatever reason, then state preservation may retain incorrect state. In extreme cases it may be that the incorrect state is the cause of the failure, in which case preserving that state would be bad.

When state is preserved, the precise amount that is retained is an implementation issue. The basic requirement is that forwarding state is retained (to preserve the data path) and that this state can be accessed by the LDP software component.

In both solutions, if the forwarding state is incorrect and is retained, it will continue to be incorrect. Both solutions have a mechanism to housekeep and free unwanted state after resynchronization is complete. [LDP-RESTART] may be better at eradicating incorrect forwarding state because it replays all the message exchanges that caused the state to be populated.

In [LDP-RESTART] no more data than the forwarding state needs to have been saved by the recovering node. All LDP state may be relearned by message exchanges with peers. Whether those exchanges may cause the same incorrect state to arise on the recovering node is an obvious concern.

In [LDP-FT] the forwarding state must be supplemented by a small amount of state specific to the protocol extensions.
LDP state may be retained directly or reconstructed from the forwarding state. The same issues apply when reconstructing state, but are mitigated by the fact that this is likely a different code path. Errors in the retained state specific to the protocol extensions will persist.

6.12 Interoperability and Backward Compatibility

It is important that new additions to LDP interoperate with existing implementations, at least in providing the existing levels of function.

Both [LDP-FT] and [LDP-RESTART] do this through rules for handling the absence of the FT optional negotiation object during session initialization.

Additionally, [LDP-RESTART] is able to perform limited recovery (that is, redistribution of state) even when only one of the participating LSRs supports the procedures. This may offer considerable advantages in interoperation with legacy implementations.

6.13 Interaction With Other Label Distribution Mechanisms

Many LDP LSRs also run other label distribution mechanisms. These include management interfaces for configuration of static label mappings, other distinct instances of LDP, and other label distribution protocols. The last example includes traffic engineering label distribution protocols that are used to construct tunnels through which LDP LSPs are established.

As with the re-use of individual labels by LDP within a restarting LDP system, care must be taken to prevent labels that need to be retained by a restarting LDP session or protocol component from being used by another label distribution mechanism, since that might compromise data security amongst other things.

It is a matter for implementations to avoid this issue through the use of techniques such as a common label management component or segmented label spaces.

6.14 Applicability to CR-LDP

CR-LDP [RFC3212] utilizes Downstream-On-Demand label distribution. [LDP-RESTART] describes Downstream-On-Demand as an area for future study and is therefore not applicable to CR-LDP. [LDP-FT] is suitable for use in a network entirely based on CR-LDP or in one that is mixed between LDP and CR-LDP.

7. Security Considerations

This document is informational and introduces no new security concerns.

The security considerations pertaining to the original LDP protocol [RFC3036] remain relevant.

[LDP-RESTART] introduces the possibility of additional denial-of-service attacks. All of these attacks may be countered by use of an authentication scheme between LDP peers, such as the MD5-based scheme outlined in [RFC3036].

In MPLS, a data mis-delivery security issue can arise if an LSR continues to use labels after expiration of the session that first caused them to be used. Both [LDP-FT] and [LDP-RESTART] are open to this issue.

8. Intellectual Property Considerations

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11.
Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

Parts of [LDP-FT] are the subject of a patent application by Data Connection Ltd.

Parts of [LDP-RESTART] are the subject of patent applications by Juniper Networks and Redback Networks.

In all cases, the parties have indicated that if the technology is adopted as a standard, they agree to license, on reasonable and non-discriminatory terms, any patent rights they obtain covering such technology to the extent necessary to comply with the standard.

9. References

9.1 Normative References

[RFC2026]     Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[RFC2119]     Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3036]     Andersson, L., et al., "LDP Specification", RFC 3036, January 2001.

[LDP-FT]      Farrel, A., et al., "Fault Tolerance for the Label Distribution Protocol (LDP)", draft-ietf-mpls-ldp-ft-06.txt, September 2002, work in progress.

[LDP-RESTART] Leelanivas, M., et al., "Graceful Restart Mechanism for LDP", draft-ietf-mpls-ldp-restart-05.txt, September 2002, work in progress.

9.2 Informational References

[MPLS-RECOV]  Sharma, V., Hellstrand, F., et al., "Framework for MPLS-based Recovery", draft-ietf-mpls-recovery-frmwrk-07.txt, September 2002, work in progress.

[RFC3212]     Jamoussi, B., et al., "Constraint-Based LSP Setup using LDP", RFC 3212, January 2002.

10. Acknowledgements

The author would like to thank the authors of [LDP-FT] and [LDP-RESTART] for their work on fault tolerance of LDP. Many thanks to Yakov Rekhter, Rahul Aggarwal, Manoj Leelanivas and Andrew Malis for their considered input to this applicability statement.

11. Author Information

Adrian Farrel
Movaz Networks, Inc.
7926 Jones Branch Drive, Suite 615
McLean, VA 22102
Phone: +1 703-847-1867
Email: afarrel@movaz.com

12. Full Copyright Statement

Copyright (C) The Internet Society (2003). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.