idnits 2.17.1 

draft-corson-fastrestore-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 848 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 448 has weird spacing: '...blishes  links...'

  == Line 452 has weird spacing: '...lt, the  route...'

  == Line 457 has weird spacing: '...started  route...'

  == Line 508 has weird spacing: '...nts (or  equiv...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 2000) is 8562 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'MobileIP' is defined on line 731, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-04) exists of
     draft-ramachandra-bgp-restart-00

  ** Obsolete normative reference: RFC 2002 (ref. 'MobileIP') (Obsoleted by
     RFC 3220)

  == Outdated reference: A later version (-11) exists of
     draft-ietf-manet-olsr-01

  == Outdated reference: A later version (-04) exists of
     draft-ietf-manet-tora-spec-02


     Summary: 7 errors (**), 0 flaws (~~), 9 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Personal                                                M. Scott Corson
2	                                                        Ansible Systems
3	Internet Draft                                               A. O'Neill
4	                                                                     BT
5	                                                            G. Tsirtsis
6	                                                                Flarion
7	Document: draft-corson-fastrestore-00.txt
8	Category: Informational                                   November 2000

10	                         IP Fast Restoration
11	                  <draft-corson-fastrestore-00.txt>

13	Status of this Memo

15	   This document is an Internet-Draft and is in full conformance with
16	   all provisions of Section 10 of RFC2026. Internet-Drafts are working
17	   documents of the Internet Engineering Task Force (IETF), its areas,
18	   and its working groups.  Note that other groups may also distribute
19	   working documents as Internet-Drafts.

21	   Internet-Drafts are draft documents valid for a maximum of six
22	   months and may be updated, replaced, or obsoleted by other documents
23	   at any time.  It is inappropriate to use Internet-Drafts as
24	   reference material or to cite them other than as "work in progress."

26	   The list of current Internet-Drafts can be accessed at
27	   http://www.ietf.org/ietf/1id-abstracts.txt

29	   The list of Internet-Draft Shadow Directories can be accessed at
30	   http://www.ietf.org/shadow.html.

32	                                Abstract

34	   This memo describes a generalized approach for achieving IP fast
35	   restoration in response to link and router failures by using IP
36	   routing to support high network availability.  The approach relies
37	   on the utilization of flat, yet highly-localized routing technology
38	   to reduce far-reaching control message propagation, thereby limiting
39	   the effects of a failure and enabling sub-second restoration.  It
40	   then outlines how the Temporally-Ordered Routing Algorithm (TORA) (a
41	   Mobile Ad hoc Network (MANET) routing protocol), when suitably
42	   modified for use in fixed networks, can be used within this
43	   framework. The general approach to routing topology management
44	   within the fast restoration framework is immediate reaction on link
45	   failure (to preserve existing flows) and gradually integration of
46	   new network elements on activation into the domain (to further
47	   assure stability).

49	INDEX

51	      1. Introduction . . . . . . . . . . . . . . . . . . . . .  2
52	           1.1 Terminology. . . . . . . . . . . . . . . . . . .  2
53	      2. Motivation . . . . . . . . . . . . . . . . . . . . . .  3
54	      3. The Fast Restoration Framework . . . . . . . . . . . .  4
55	           3.1 Highly Localized Routing Protocol. . . . . . . .  5
56	           3.2 Link Status Assessment Protocol. . . . . . . . .  5
57	           3.3 Reaction to Failure. . . . . . . . . . . . . . .  6
58	      4. The Temporally-Ordered Routing Algorithm . . . . . . .  7
59	           4.1 Overview . . . . . . . . . . . . . . . . . . . .  7
60	           4.2 Router Activation (Route Creation) . . . . . . .  8
61	           4.3 Link Failure . . . . . . . . . . . . . . . . . .  9
62	           4.4 Link Activation. . . . . . . . . . . . . . . . .  9
63	           4.5 Router Failure (Route Erasure) . . . . . . . . . 10
64	      5. Detecting Router Failures with Single-Homed Prefixes . 10
65	           5.1 Neighbor Advertisement Protocol. . . . . . . . . 11
66	           5.2 Neighbor Reachability Protocol . . . . . . . . . 12
67	      6. Conclusion . . . . . . . . . . . . . . . . . . . . . . 14
68	      7. References . . . . . . . . . . . . . . . . . . . . . . 15

70	1. Introduction

72	   This memo describes a generalized approach for achieving IP fast
73	   restoration in response to link and router failures by using IP
74	   routing to support high network availability.  The approach relies
75	   on the utilization of flat, yet highly-localized routing technology
76	   to reduce far-reaching control message propagation, thereby limiting
77	   the effects of a failure and enabling sub-second restoration.  It
78	   then outlines how the Temporally-Ordered Routing Algorithm (TORA) (a
79	   Mobile Ad hoc Network (MANET) routing protocol), when suitably
80	   modified for use in fixed networks, can be used within this
81	   framework. The general approach to routing topology management
82	   within the fast restoration framework is an immediate reaction on
83	   link failure (to preserve existing flows) and gradually integration
84	   of new network elements on activation into the domain (to further
85	   assure stability).

87	   The draft first discusses the motivation for this draft and its
88	   relevance to a recent IAB report [IAB99] on routing convergence. It
89	   then outlines a framework of protocol mechanisms, and then gives a
90	   fuller description of the main mechanisms.  It then describes a
91	   candidate routing algorithm that can be used within this framework.
92	   It then additionally describes the mechanisms within the framework
93	   to address efficiency and localization issues associated with the
94	   failure of routers advertising single-homed prefixes. The draft
95	   concludes with a discussion of the benefits of this fast restoration
96	   framework.

98	1.1 Terminology
99	   Here are some new terms defined in this draft.

101	   RID    Router ID. A unique ID used by a routing protocol to identify
102	       a router.

104	   RPLF   Routing Protocol Link Failure. A RPLF packet is generated as
105	          a routing protocol's reaction to a link failure for each
106	          destination prefix affected by the failure

108	   RPDF   Routing Protocol Destination Failure. RPDF is a packet
109	          generated to eliminate routing to a destination that has been
110	          partitioned from the rest of the network

112	   LFDT   Link Failure Detection Timer. Failure to receive an FRLS
113	          packet with the neighbor heard flag set within the LFDT
114	          setting indicates link failure.

116	   LFUB   Link Failure Upper Bound. This is the upper bounds of the
117	   time
118	          required to detect failure.

120	   FRNA   Fast Restore Neighbor Advertisement. A router's FRNA
121	          packet also contains the RIDs and corresponding prefixes of
122	          the router's one-hop neighbors.

124	   FRLS   Fast Restore Link Status. A FRLS packet contains a router's
125	          RID and an indication of whether or not the router has
126	          recently heard its neighbor.

128	2. Motivation

130	   The report from the recent IAB Network Layer Workshop [IAB99] raised
131	   several routing issues.  Regarding routing convergence, it noted
132	   that "at least one backbone operator has concerns about the
133	   convergence time of internetwork-wide routing during a failover.
134	   This operator believes that current convergence times are on the
135	   order of half a minute, and possibly getting worse.  Others in the
136	   routing community did not believe that the convergence times are a
137	   current issue.  Some, who believe that real-time applications (e.g.
138	   telephony) require sub-second convergence, are concerned about the
139	   implications of convergence times of a half minute on such
140	   applications. [IAB99]"

142	   It is agreed that existing Internet intra-domain routing protocols
143	   (e.g. [RIP,OSPF]) have limited responsiveness in reacting to
144	   topological changes (i.e. slow convergence) and consequently have
145	   limitations on the size of domains they may support.  These
146	   limitations stem directly from the use of shortest-path routing
147	   based on global topological information exchange.

149	   For any premium service with critically tight availability
150	   requirements, operators are left with the use of hot standby routers
151	   and link level restoration to improve restoration times. Both
152	   mechanisms are however very expensive, and applicable mainly to
153	   large-scale core elements. However, these cannot be cost-effectively
154	   applied in edge networks such as IP enabled Radio Access Networks
155	   and base stations and flat enterprise networks. These will likely be
156	   composed of large numbers of small and necessarily cheap IP routers
157	   which may be interconnected by ad-hoc, non-resilient or unprotected
158	   link layers such as ethernet and various serial lines. The need for
159	   a fast restoring IGP is therefore apparent. In addition, if a cost-
160	   effective IP layer solution could be found which could be made
161	   sympathetic to traffic engineering requirements then the requirement
162	   for hot standby and link layer restoration could be reduced
163	   resulting in cost savings. The traffic engineering implications are
164	   not however considered in this draft.

166	   By relaxing the requirement for shortest path routing only slightly
167	   (with acceptable impact on forwarding efficiency), one can
168	   approximate shortest-path routing while enabling IP fast restoration
169	   and higher levels of domain scalability.  This can be achieved by
170	   utilizing adaptive routing algorithms that localize their response
171	   to topological changes while ensuring loop-free paths.  It is
172	   localization that yields the desired benefits of reduced restoration
173	   time and increased domain size.

175	   The proposed fast restoration approach therefore offers a potential
176	   intra-domain solution to this convergence problem in reaction to
177	   failure. It is presently unclear whether the fast restoration
178	   techniques discussed in this memo have potential applicability to
179	   the associated inter-domain routing problem.

181	3. Fast Restoration Framework

183	   Four general protocol mechanisms are required in the proposed fast
184	   restoration framework. The first two are mandatory in that they
185	   provide the basic fault detection and reaction mechanisms. The
186	   second two are optional and illustrate one way to enable the
187	   reaction to be tailored to whether the failure point is a link or a
188	   neighbor router.

190	   1) A highly localized routing protocol
191	        This is required to enable the protocol to be able to react
192	        extremely quickly to failures, with minimal impact on other
193	        routers within the domain.

195	   2) A link status assessment protocol
196	        This is required to enable a router to rapidly detect link or
197	        router failures.

199	   3) A neighbor advertisement protocol
200	        This is used to enable a router to learn about its local two
201	        hop topology including its neighbors' neighbors and the degree
202	        of connectivity (single/multi-homed) of subnets.

204	   4) A neighbor reachability protocol
205	        This is used to enable a router to identify whether it is the
206	        link or the neighbor router that has failed through
207	        collaboration with its two hop neighbors.

209	3.1 Highly Localized Routing

211	   The general approach to routing topology management within the fast
212	   restoration framework is immediate reaction on link failure (to
213	   preserve existing flows) and gradually integration of new network
214	   elements on activation into the domain (to further assure
215	   stability).

217	   The method by which the aforementioned route adjustment occurs is a
218	   function of the routing algorithm.  It is essential that a routing
219	   algorithm have the following properties to operate within this fast
220	   restoration framework.

222	   1. The protocol reaction to a link failure that does not partition
223	      the network be localized for all destination prefixes.

225	   2. The protocol reaction to a router failure that does not partition
226	      the network should be localized for all destination prefixes.
227	      (i.e. it should be treated as a link failure with respect to
228	      these other prefixes).

230	   3. The protocol reaction to a link failure that does partition the
231	      network be capable of detecting the partition (and erasing routes
232	      if desired).

234	   4. The protocol reaction to a router failure that does partition the
235	      network be capable of erasing routes for these prefixes (if
236	      desired). Single-homed prefixes on the failed router will by
237	      definition create a partition.

239	   The preceding description assumes that a routing protocol reacts to
240	   a link or router failure by sending generic Routing Protocol Link
241	   Failure (RPLF). Routing Protocol Destination Failure (RPDF) packets
242	   as generated as necessary in response to partitions.

244	3.2 Link Status Assessment

246	   The link status assessment protocol is principally concerned with
247	   fast detection of link failures.  Work within the IETF on a
248	   generalized IP level link status assessment algorithm was recently
249	   proposed [FLIP].  A similar capability is required here where a link
250	   consists of a pair of Router IDs (RIDs).  After discovery of a
251	   neighbor (identified by its RID) via the neighbor discovery
252	   protocol, the objective of a link status assessment algorithm would
253	   be to monitor with high resolution (in time) the bi-directional
254	   connectivity of a link with an adjacent neighbor router.

256	   It is desired that a link failure be detectable on millisecond time
257	   scales enabling the higher level mechanisms (to be described) to
258	   quickly react thereby preserving ongoing sessions. The protocol
259	   operates by periodically sending Fast Restore Link Status (FRLS)
260	   packets between adjacent interfaces to reconfirm link bi-
261	   directionality.  The FRLS retransmission period is set to
262	   LINK_PROBE_INTERVAL time units.  A FRLS packet contains a router's
263	   RID and an indication of whether or not the router has recently
264	   heard its neighbor.

266	   FRLS packets are initially sent indicating that the sending router
267	   has not heard its neighbor.  Reception of a FRLS packet at a router
268	   from a neighbor causes the router to set the "neighbor heard" flag
269	   to TRUE for ROUTER_HEARD TIME (perhaps 3.5 times
270	   LINK_PROBE_INTERVAL) during which time it sends FRLS packets back
271	   towards its neighbor indicating that it has heard its neighbor.
272	   Reception of any packet from its neighbor causes the router to reset
273	   the ROUTER_HEARD_TIMER associated with this flag setting. Reception
274	   of a FRLS packet with the neighbor heard flag set to TRUE causes the
275	   router to consider the link with its neighbor as bi-directional.  At
276	   that time the Link Failure Detection Timer (LFDT) is also set to
277	   ROUTER_HEARD_TIME.  Reception of a subsequent FRLS packet with the
278	   neighbor heard flag set resets the LFDT.  Failure to receive an FRLS
279	   packet with the neighbor heard flag set within the LFDT setting
280	   indicates link failure.  Also, if the router considers the link to
281	   be bi-directional, subsequent reception of a FRLS packet without the
282	   neighbor heard flag set immediately indicates link failure (i.e.
283	   this indicates that its neighbor no longer hears the router).  Thus
284	   the sum of ROUTER_HEARD_TIME plus LINK_PROBE_INTERVAL equals Link
285	   Failure Upper Bound (LFUB) and effectively upper bounds the time
286	   required to detect failure.

288	   Alternatively, the protocol may also rely on a link layer
289	   notification of link failure (e.g. loss of carrier) if available.
290	   Such a signal may provide a more accurate and timely indication of
291	   link status.

293	3.3 Reaction to Failure

295	   After the link status assessment protocol detects a link failure
296	   (possible router failure) at a router X with a neighbor Y, the
297	   following processing steps occur at router X:

299	   1) Router X initially drops (or buffers) data packets for
300	      destination routes whose next hop was over the failed link with
301	      neighbor Y.  This clearly includes any single-homed or multi-
302	      homed destination prefixes on the (potentially failed)
303	      neighboring router Y.

305	   2) Router X adjusts routes for all affected destination prefixes
306	      aiming to locally route around the failed link and/or router. Let
307	      us assume that the routing protocol's reaction to a link failure
308	      generates a Routing Protocol Link Failure (RPLF) packet for each
309	      destination prefix. Note that the RPLF activity for a particular
310	      destination will definitely fail in either of two cases.

312	   3) The first would be if the link failure had generated a partition
313	      in the topology towards that destination which places an obvious
314	      requirement that restoration requires a sufficient level of
315	      connectivity between nodes to guarantee the availability of an
316	      alternate path. Partition detection would lead to the generation
317	      of a Routing Protocol Destination Failure (RPDF) packet that
318	      would be used to eliminate the routing for that destination
319	      throughout the domain.

321	   4) The second more specific example occurs if the destination prefix
322	      is single-homed on a router, and that router has itself failed,
323	      rendering restoration around the failure futile. Therefore, for
324	      singly homed prefixes it may be useful to react differently to
325	      the failure of the router. This firstly requires a method to
326	      discover which routers have single-homed prefixes, a method to
327	      discover when such routers fail, and an alternative strategy to
328	      react to this type of failure. These are discussed in section 5.
329	      One such strategy though would be to hold back on pointless RPLF
330	      processing to give the router time to re-boot. This avoids the
331	      RPLF processing causing partition detection and subsequent RPDF
332	      generation, resulting in the elimination of the routing for that
333	      destination throughout the domain. In this case, the domain
334	      relies on interim ICMP unreachable messages to inform
335	      applications of the loss of connectivity to the destination. If
336	      re-boot does not occur within a fixed period then the routes to
337	      that destination can be then be reasonably removed from the
338	      domain through RPDF propagation. This is somewhat similar to the
339	      `graceful' strategy for avoiding routing flaps in BGP
340	      [BGPrestart] by suppressing route erasure on router failure in
341	      anticipation of fast reboot and graceful BGP restart.

343	4 The Temporally-Ordered Routing Algorithm

345	4.1 Overview

347	   We now map these generic packets to a specific routing protocol ---
348	   the Temporally-Ordered Routing Algorithm (TORA) [TORA, TORAdraft].
349	   In TORA, RPLF packets map to Update (UPD) packets while RPDF packets
350	   map to Erase (ERS) packets.  This latter packet type does not yet
351	   appear in [TORA, TORAdraft].

353	   TORA is not a shortest-path routing protocol.  It is a "link
354	   reversal" routing protocol that differs significantly from existing
355	   routing technology.  TORA is similar to distance vector protocols
356	   (e.g. RIP) in that a separate version of the protocol runs
357	   independently for each destination prefix.  Thus its storage
358	   requirements scale with the number of destination prefixes.  But its
359	   similarity ends there.  In its original usage mode, TORA is an on-
360	   demand protocol that builds routes reactively in response to route
361	   requests from a MANET IP kernel.  However, for usage in fixed
362	   networks it is transformed into a more traditional, proactive
363	   protocol.

365	   The key attribute of the protocol remains the same for both
366	   versions, and this is the protocol's aggressive, optimistic reaction
367	   to link failures which is critical within the MANET environment but
368	   is becoming equally important in fixed network scenario's.  On link
369	   failure, when necessary, the protocol immediately reverses the
370	   directions of links in the area near the failure effectively
371	   searching for alternative routes to the destination.  If such exist,
372	   the protocol typically finds them in a time and communication-
373	   efficient, single-pass reaction.  Otherwise (in case of partition),
374	   the reaction detects the partition enabling route erasure if
375	   desired.

377	   We now briefly describe how TORA would function within the fast
378	   restoration framework.

380	4.2 Router Activation (Route Creation)

382	   When a router boots up, it establishes new links with its neighbors.
383	   The router proactively initiates a localized process of height
384	   database synchronization with each of its new neighbors.  Its
385	   neighbors transmit the contents of their height databases to the new
386	   router.  From this and any received routing advertisements (if any),
387	   a router can determine which of its own prefixes are single or
388	   multi-homed.

390	   It then generates an Optimization (OPT) packet for each destination
391	   prefix it wishes to advertise.  If the prefix is single-homed and
392	   heights exist in the neighboring routers, the OPT packet stops at
393	   these neighbors and is not propagated further. Otherwise, the OPT
394	   packet floods fully (or partially) throughout a domain installing
395	   (or reorienting) routing entries for its destination prefix. The
396	   collection of routing entries forms a Directed Acyclic Graph (DAG).
397	   From an individual router's perspective, this means that it will
398	   have one or more feasible next hops for the destination, all of
399	   which are guaranteed to be loop-free.

401	   Loop-free multipath routing is achieved through the use of
402	   "heights".  Each router maintains a height with respect to the
403	   destination, and the destination has the lowest height.  A link is
404	   directed from a higher height to a lower height, and packets may
405	   only be forwarded from routers with higher heights to routers with
406	   lower heights.  Thus packets flow "downhill" towards the
407	   destination.  Each height is guaranteed to be unique and thus the
408	   collection of heights forms a total order.  The protocol is
409	   sometimes also referred to as the "Totally-Ordered Routing
410	   Algorithm".

412	   From a router's perspective, a neighbor that is higher is referred
413	   to as an "upstream" neighbor while a neighbor that is lower is
414	   referred to as a "downstream" neighbor.  A destination router always
415	   has only upstream neighbors, and no other router is allowed to have
416	   upstream neighbors without having at least one downstream neighbor.
417	   It should be understood that all heights and associated link
418	   directions are per IP destination prefix DAG.

420	4.3 Link Failure

422	   A router may lose an upstream neighbor (due to a link failure) at
423	   any time without requiring protocol reaction.  Similarly, it may
424	   lose a downstream neighbor provided it has other downstream
425	   neighbors, or has no downstream neighbors and no upstream neighbors,
426	   without requiring protocol reaction. However, if due to a link
427	   failure a router finds itself with at least one upstream neighbor
428	   without having any downstream neighbors (i.e. it just lost its last
429	   downstream neighbor and is a black hole), the router must adjust its
430	   height so that it has at least one downstream neighbor.  The manner
431	   in which this occurs is described in [TORA,TORAdraft].  The
432	   adjustment results in transmission of an UPD (RPLF) packet to the
433	   router's neighbors advertising its new height.

435	   The effect of this height adjustment is to reverse the direction of
436	   some or all of the router's links; hence the name "link reversal"
437	   algorithm.  Reception of an UPD (RPLF) packet at a neighbor may
438	   cause that neighbor to lose its last downstream link, which is cause
439	   for subsequent height modification and UPD packet transmission.
440	   Examination of the algorithm's reaction on mesh-like topologies
441	   reveals that the number of subsequent UPD (RPLF) transmissions is
442	   typically quite low; often only one UPD (RPLF) transmission is
443	   required. The resultant reaction is thus highly localized.  It is
444	   also strongly de-correlated with the size of the domain.

446	4.4 Link Activation

448	   When a router restarts, it re-establishes  links with its neighbors.
449	   The router initiates a localized process of height database
450	   synchronization with each of its new neighbors.  Its neighbors
451	   transmit the contents of their height databases to the new router.
452	   By default, the  router's new heights are NULL for each destination.
453	   (Note: A NULL height is higher than any non-NULL height, and packets
454	   may be forwarded from NULL to non-NULL heights.  A link between two
455	   routers with NULL heights for a destination is marked as
456	   "unassigned", and packet forwarding over such links is disallowed.)
457	   The re-started  router then performs its own non-NULL height
458	   selection with respect to the existing heights of its neighbors.

460	   The algorithm for height selection is not discussed here, but it
461	   will be described in an upcoming draft detailing modification of
462	   TORA for use in fixed networks.  The procedure only involves a
463	   router's one-hop neighbors.  The height selection procedure occurs
464	   slowly, and a link is gradually brought back into active usage
465	   within the routing domain.

467	4.5 Router Failure (Route Erasure)

469	   When a router fails, then the default behavior is that each neighbor
470	   individually reacts to the router failure as a link failure, sending
471	   UPD (RPLF) packets as described in 3.3 above.  This will restore
472	   paths around the failed router if sufficient physical connectivity
473	   exists.

475	   The failed router itself may have been advertising destination
476	   prefixes into the network and in the case of multi-homed
477	   subnets/prefixes the TORA reaction will result in the paths towards
478	   the failed routing being redirected towards the nearest alternate
479	   router also advertising that prefix.

481	   In the case of single-homed prefixes, the restoration attempt will
482	   clearly fail as those subnets are now partitioned from the core
483	   network. The collective UPD (RPLF) reaction of the routers connected
484	   to the failed router, followed by partition detection via the normal
485	   TORA reflection, would eventually clear the network of routing state
486	   for these single-homed destination prefixes.  This is wasteful if
487	   the router is simply undergoing a re-boot and therefore additional
488	   protocol mechanisms may be warranted as highlighted in 3.3 and
489	   described in section5 below. The aim of these mechanisms in TORA
490	   would be to avoid sending UPD (RPLF) packets for the single-homed
491	   destination prefixes, wait to see if the associated router recovers
492	   in a set time, and if not, only then use ERS (RPDF) packets to
493	   efficiently clear this routing state from the network.

495	5. Detecting Router Failures with Single-Homed Prefixes

497	   As has previously been discussed, it may be useful to use additional
498	   protocol mechanisms to be able to distinguish router failures and
499	   associated single-homed destinations so that futile restoration
500	   processing and premature route elimination can be avoided. One such
501	   approach is outlined below, but the necessity and some of the
502	   mechanics for solving this problem may be critically dependent on
503	   the particular IGP.

505	   In this approach, a bootstrapping neighbor advertisement protocol
506	   enables a router to learn its one-hop neighbors, as well as its
507	   neighbors' one-hop neighbors, called the two-hop neighbor set.  It
508	   works in combination with IPv6 router advertisements (or  equivalent
509	   ICMP router messages in IPv4 (RFC1256)) to enable a router to assess
510	   its routing neighbors. Once a link failure is detected, the neighbor
511	   reachability protocol attempts to determine whether the router at
512	   the other end of the failed link is still reachable, and whether it
513	   is advertising single-homed and/or multi-homed destination prefixes.
514	   With input from the preceding protocols, the highly localized
515	   routing protocol then reacts differently for router failures
516	   involving single-homed prefixes.

518	5.1 Neighbor Advertisement Protocol

520	   This protocol is principally concerned with neighbor discovery and
521	   protocol bootstrapping. As with OSPF, each router is identified by a
522	   Router ID (RID).  It also is associated with at least one IP
523	   destination prefix by which it is globally reachable.  Routes for
524	   this prefix are advertised out of all its interfaces and built by
525	   the routing algorithm.  It may also advertise other destination
526	   prefixes.  These prefixes may be single-homed (i.e. only this router
527	   advertises itself as the last hop for these prefixes) or multi-homed
528	   (i.e. they are advertised by multiple routers as being last hop
529	   routers).

531	   The neighbor advertisement protocol requires that each router
532	   periodically advertises its RID and destination prefix(es) in Fast
533	   Restore Neighbor Advertisement (FRNA) packets to its one-hop routing
534	   neighbors out interfaces on which routing is activated. FRNA packets
535	   would likely be multicast (TTL scope == 1) to a well-known multicast
536	   address.  The transmission period would be relatively long (on the
537	   order of 30 seconds).  A router's FRNA packet also contains the RIDs
538	   and corresponding prefixes of the router's one-hop neighbors that it
539	   has learned from their previous FRNA transmissions. In this way, by
540	   receiving its neighbors' FRNA packets, a router learns the
541	   identities of its neighbors' neighbors in a fashion very similar to
542	   the neighbor advertisement algorithm of [OLSR].  Associated with
543	   each link a router has with a routing neighbor is an instance of a
544	   link status assessment protocol (see section 2.2).  For each
545	   instance a router has a Link Failure Detection Timer (LFDT) for that
546	   link.  This is the time after which a router will declare a link to
547	   its neighbor down if no packets are received from its neighbor.  A
548	   router also computes an Link Failure Upper Bound (LFUB) on the
549	   overall time required to detect a failure on each link, which is
550	   equal to the LFDT value plus a link polling period (to be
551	   described). A router X also advertises in its FRNA transmissions the
552	   maximum computed value over all its links of these bounds, known as
553	   the maximum LFUB (LFUBmax(X)).

555	   The neighbor advertisement protocol operates in conjunction with
556	   IPv6 Router Advertisement (RA) messages (or an equivalent in IPv4)
557	   which are sent out interfaces on which routing is not activated
558	   (e.g. over subnets with attached hosts).  Router advertisement
559	   messages convey the destination prefixes being advertised by other
560	   routers on a given subnet.  By listening to these advertisements, a
561	   router can learn if other routers are advertising destination
562	   prefixes that itself is also advertising.  It thus determines which
563	   of its own last-hop prefixes are single-homed or multi-homed.

565	5.2 Neighbor Reachability Protocol

567	   We now describe the basic operation of a neighbor reachability
568	   protocol. The protocol is assumed to run at a router X, and is
569	   triggered by link failure detection on a link to router Y. After a
570	   short time (estimated sufficient for localized route re-
571	   computation), router X executes a distributed neighbor reachability
572	   protocol to determine reachability of the single-homed destination
573	   prefixes advertised by neighbor Y as follows.

575	   The (potentially failed) neighbor Y's maximum LFUB (LFUBmax) is
576	   known from its previous FRNA transmissions. With this information a
577	   router X creates a RPLF packet as it would if this were a link
578	   failure, but delays transmission of the packet for a time T somewhat
579	   greater than its neighbor Y's LFUBmax; thus T > LFUBmax(Y).  In the
580	   meantime it immediately sends a Fast Restore Link Failure (FRLF)
581	   packet to each of the (potentially failed) neighbor Y's one-hop
582	   neighbors.  Reception of a FRLF packet at such a neighbor Z
583	   instructs that neighbor to mark the link between the indicated
584	   router Y and the sender X as "failed". But the receiver Z keeps the
585	   sender X in its list of the neighbor Y's one-hop neighbors.  This is
586	   to ensure that Z will still send a FRLF packet to X if it also
587	   detects a link failure with Y.  Z will change its notion of Y's
588	   neighbors only on reception of a FRNA message from Y itself.
589	   Subsequent reception of a FRLS packet at router Z from neighbor Y
590	   indicates to the receiver Z that the link between routers X and Y
591	   may indeed have failed, but that router Y itself is still
592	   functioning.

594	   Let us assume that router Y has indeed failed, and that its neighbor
595	   X was the first neighbor to detect this failure, thus sending FRLF
596	   packets to router Y's previous one-hop neighbors.  Subsequently
597	   others of Y's previous neighbor set begin to declare link failures
598	   with Y, each scheduling delayed RPLF transmissions and immediately
599	   sending an FRLF packet to each of Y's past neighbors.  Let us also
600	   assume that Y's neighbor Z is the last neighbor to detect a link
601	   failure with router Y.  It also schedules a RPLF packet for
602	   transmission (suitably delayed) and sends FRLF packets to each of
603	   Y's previous neighbors. By this time (or very shortly thereafter) Z
604	   will likely to be the first of Y's past neighbors to both declare
605	   link failure with Y and have received FRLF messages from all of Y's
606	   previous neighbors (without having heard a subsequent message from
607	   Y).

609	   At that point router Z declares "router failure" of previously
610	   adjacent neighbor Y, and cancels its pending RPLF transmission for
611	   Y. The router Z then queues a Routing Protocol Destination Failure
612	   (RPDF) packet for each of neighbor Y's single-homed destination
613	   prefixes (Note that for efficiency, multiple RPDF packets may be
614	   aggregated and sent within a single aggregate RPDF packet).

616	   Assuming T (T > LFUBmax(Y)) is set appropriately at each of Y's
617	   previous neighbors, none of Y's previous neighbors would have yet
618	   transmitted their pending RPLF packets. The router also sets a
619	   NEIGHBOR_RESTART_TIMER to some value (possibly 30 seconds) while it
620	   waits for neighbor Y to reboot.  During this time packets sent to
621	   Y's single-homed prefixes will be undeliverable resulting in "host
622	   unreachable" ICMP message being returned to the sources.  Assuming
623	   neighbor Y reboots within this time, the queued RPDF transmission is
624	   cancelled.  Otherwise the queued RPDF packet is transmitted when the
625	   timer expires and the routes for Y's single-homed destination
626	   prefixes are erased from the domain.  The objective of this timer is
627	   to suppress route erasure in the case of fast router reboot in a
628	   fashion similar to that proposed in [BGPrestart].

630	   If the NEIGHBOR_RESTART_TIMER expires and T (T > LFUBmax(Y)) is not
631	   set appropriately at each of Y's previous neighbors, possibly others
632	   may also have queued and then issued RPDF packets prior to receiving
633	   an FRDF message originating from router Z.  This is not detrimental.
634	   RPDF packets are destination-specific and source-independent.  Their
635	   collective propagation results in only a single aggregated flood,
636	   regardless of the number of routers originating RPDF packets. An
637	   RPDF packet floods throughout the network immediately, efficiently
638	   clearing state information regarding the effected destinations. An
639	   RPDF packet is sent to the ALL_ROUTERS multicast address.

641	   For maximal efficiency the aforementioned protocol assumes a domain
642	   topology with the characteristic that, after any single router or
643	   link failure, the remaining domain remains well connected.  Thus
644	   router failure does not partition the remaining domain permitting
645	   the failed neighbors to exchange FRLF messages. This is required so
646	   fast restoration can be achieved.

648	   Nevertheless, if the network becomes partitioned the above algorithm
649	   need not result in undo harm provided the routing protocol is
650	   capable of detecting partitions. A router/link failure, in such a
651	   case, will result in multiple link failures being detected since the
652	   failed router Y's neighbors on one side of the partitioned network
653	   will not be able to reach Y's one hop neighbor's on the other side
654	   of the partitioned domain (i.e. FRLF will fail to reach some
655	   neighbors) to reach a more definite conclusion.  Consequently each
656	   such neighbor will transmit a RPLF packet for each single-homed
657	   destination prefix on the potentially failed router.  Each packet
658	   will initiate a distributed computation that performs "distributed
659	   partition detection" and eventually clears the domain of routing
660	   state for its destination.  This approach is less efficient than
661	   that using RPDF packets.

663	   The approach also assumes the FRLF messages are not frequently
664	   dropped.  Loss of at least one FRLF message destined for each of the
665	   failed destination's neighbors would ensure that no neighbor
666	   generates an RPDF packet.  If there are k such neighbors with each
667	   needing to receive all k-1 FRLF packets from the other neighbors,
668	   and the probability of packet dropping is p, then an upper bound on
669	   the probability of this undesirable, joint packet dropping event is
670	   p^k.  That is, each of the k neighbors must have missed at least one
671	   packet from some other neighbor.  If control packets are given
672	   reasonable priority (suitably small p), then this will not occur
673	   very frequently (i.e. only with probability p^k).

675	   If it does occur, each of these neighbors would start to
676	   independently transmit their queued RPLF packets (which is what this
677	   router failure detection mechanism intends to avoid).  This results
678	   in the aforementioned distributed partition detection process and
679	   the worst-case consequences regarding convergence in this case
680	   depend on the particular routing algorithm.

682	6. Conclusions

684	   The fast restoration processing occurs entirely at layer 3 and does
685	   not require special layer 2 considerations, although a layer 2 link
686	   activation/failure notification trigger may be used if available.

688	   The overall restoration approach is constrained only by the speed at
689	   which a router can create, transmit and process the localized
690	   routing updates.  Increases in processing and transmission speeds
691	   will only improve restoration performance over time.

693	   It is anticipated that restoration can frequently be achieved with
694	   little or no noticeable effect on voice over IP sessions.  Some
695	   degradation would likely be experienced by higher rate sessions
696	   given current hardware capabilities.

698	   The approach offers immunity to link instability (such as that
699	   triggered by malfunctioning interface cards).  A link that flip-
700	   flops between UP and DOWN will effectively be removed from the
701	   forwarding paths of its adjacent routers without triggering domain-
702	   wide routing updates or routing re-convergence computations.  Thus
703	   the network will not melt down due to localized link problems.

705	   The approach offers reasonable immunity to router instability as
706	   well.  A router that frequently goes UP and DOWN will not create
707	   domain-wide routing messages (route building followed by erasure) if
708	   it can quickly reboot.  If it cannot, then these messages still only
709	   effect its own prefixes.  With regards to the remaining network
710	   prefixes, the routing protocol will treat such router instability as
711	   a set of individually unstable links, effectively removing them from
712	   the forwarding paths for these prefixes and isolating the network
713	   from harm.

715	   The net effect then is that one can deploy much larger domains with
716	   greatly reduced risk of domain melt down while enabling fast
717	   restoration of IP flows when necessary.

719	7. References

721	   [BGPrestart] S. Ramachandra et al., "Graceful Restart mechanism for
722	   BGP", Internet-Draft, draft-ramachandra-bgp-restart-00.txt, (work in
723	   progress), August 2000.

725	   [FLIP] H.Sandick et al., "Fast LIveness Protocol (FLIP)", Internet
726	   Draft, draft-sandiick-flip-00.txt, February 2000.

728	   [IAB99] M. Kaat, "Overview of 1999 IAB Network Layer Workshop",
729	   Internet-Draft, draft-iab-ntwlyrws-over-03.txt, July 2000.

731	   [MobileIP] C.E. Perkins, ``IP Mobility Support," Internet RFC 2002,
732	   Oct 1996.

734	   [OLSR] P. Jacquet, P. Muhlethaler and A. Qayyum, "Optimized Link
735	   State Routing Protocol", Internet-Draft, draft-ietf-manet-olsr-
736	   01.txt, February 2000.

738	   [OSPF] J. Moy, "OSPF Version 2", Internet RFC 2328, April 1998.

740	   [RIP] G. Malkin, "RIP Version 2", Internet RFC 2453, November 1998.

742	   [TORA] V. Park, M. S. Corson, "The Temporally-Ordered Routing
743	   Algorithm", Proc. IEEE INFOCOM '97, Kobe, Japan, April 1997.

745	   [TORAdraft] V. Park, M. S. Corson, "Temporally-Ordered Routing
746	   Algorithm (TORA) Version 1 Functional Specification", Internet-Draft
747	   (work in progress), draft-ietf-manet-tora-spec-02.txt, October 1999.

749	Author's Addresses

751	      M. Scott Corson
752	      University of Maryland
753	      Ansible Systems
754	      (+1) 301-405-6630
755	      corson@isr.umd.edu
756	      mscorson@ix.netcom.com

758	      Alan O'Neill
759	      BT
760	      (+44) 1473-646459
761	      alan.w.oneill@bt.com

763	      George Tsirtsis
764	      Flarion Technologies
765	      (+44) 020 88260073
766	      g.tsirtsis@Flarion.com
767	      gtsirt@hotmail.com
768	                                Copyright Notice

770	                         Placeholder for ISOC copyright.