idnits 2.17.1 

draft-bryant-ipfrr-tunnels-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3667, Section 5.1 on line 1330.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1391.

  ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line
     1379), which is fine, but *also* found old RFC 2026, Section 10.4C,
     paragraph 1 text on line 1379.

  ** The document claims conformance with section 10 of RFC 2026, but uses
     some RFC 3978/3979 boilerplate.  As RFC 3978/3979 replaces section 10 of
     RFC 2026, you should not claim conformance with it if you have changed to
     using RFC 3978/3979 boilerplate.

  ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure
     Acknowledgement -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.

  ** The document seems to lack an RFC 3979 Section 5, para. 1 IPR Disclosure
     Acknowledgement. 

  ** The document seems to lack an RFC 3979 Section 5, para. 2 IPR Disclosure
     Acknowledgement. 

  ** The document seems to lack an RFC 3979 Section 5, para. 3 IPR Disclosure
     Invitation. 

  ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate
     instead of verbatim RFC 3978 boilerplate.  After 6 May 2005, submission
     of drafts without verbatim RFC 3978 boilerplate is not accepted.

     The following non-3978 patterns matched text found in the document. 
     That text should be removed or replaced:

        By submitting this Internet-Draft, I certify that any applicable patent
        or other IPR claims of which I am aware have been disclosed, or
        will be disclosed, and any of which I become aware will be
        disclosed, in accordance with RFC 3668.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 2004) is 7285 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'A' on line 956 looks like a reference

  -- Missing reference section? 'B' on line 960 looks like a reference

  -- Missing reference section? 'C' on line 960 looks like a reference

  -- Missing reference section? 'J' on line 799 looks like a reference

  -- Missing reference section? 'D' on line 960 looks like a reference

  -- Missing reference section? 'E' on line 960 looks like a reference

  -- Missing reference section? 'BFD' on line 1096 looks like a reference

  -- Missing reference section? 'IPSEC' on line 1311 looks like a reference


     Summary: 11 errors (**), 0 flaws (~~), 2 warnings (==), 12 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Network Working Group                                          S. Bryant
2	Internet Draft                                               C. Filsfils
3	Expiration Date: Nov 2004                                     S. Previdi
4	                                                                M. Shand
5	                                                           Cisco Systems

7	                                                                May 2004

9	                     IP Fast Reroute using tunnels

11	                   draft-bryant-ipfrr-tunnels-00.txt

13	Status of this Memo

15	   This document is an Internet-Draft and is in full conformance with
16	   all provisions of Section 10 of RFC 2026.

18	   Internet-Drafts are working documents of the Internet Engineering
19	   Task Force (IETF), its areas, and its working groups. Note that other
20	   groups may also distribute working documents as Internet-Drafts.
21	   Internet-Drafts are draft documents valid for a maximum of six months
22	   and may be updated, replaced, or obsoleted by other documents at any
23	   time. It is inappropriate to use Internet-Drafts as reference
24	   material or to cite them other than as "work in progress".

26	   The list of current Internet-Drafts can be accessed at
27	   http://www.ietf.org/ietf/1id-abstracts.txt

29	   The list of Internet-Draft Shadow Directories can be accessed at
30	   http://www.ietf.org/shadow.html.

32	Abstract

34	   This draft describes an IP fast re-route mechanism that provides
35	   backup connectivity in the event of a link or router failure. In the
36	   absence of single points of failure and asymmetric costs, the
37	   mechanism provides complete protection against any single failure. If
38	   perfect repair is not possible, the identity of all the unprotected
39	   links and routers is known in advance. The draft also describes the
40	   mechanisms needed to prevent the packet loss caused by loops which
41	   normally occur during the reconvergence of the network following a
42	   failure.

44	Table of Contents
45	1. Introduction......................................................5
46	2. Goals, non-goals, limitations and constraints.....................5
47	  2.1. Goals.........................................................5
48	  2.2. Non-Goals.....................................................6
49	  2.3. Limitations...................................................6
50	  2.4. Constraints...................................................6
51	3. Repair Paths......................................................7
52	  3.1. Tunnels as Repair Paths.......................................7
53	  3.2. Tunnel Requirements..........................................10
54	    3.2.1. Setup....................................................10
55	    3.2.2. Multipoint...............................................10
56	    3.2.3. Directed forwarding......................................10
57	    3.2.4. Security.................................................10
58	4. Construction of Repair Paths.....................................10
59	  4.1. Identifying Repair Path Targets..............................10
60	  4.2. Determining Tunneled Repair Paths............................11
61	    4.2.1. Computing Repair Paths...................................12
62	    4.2.2. Extended P-space.........................................13
63	    4.2.3. Downstream Paths.........................................13
64	    4.2.4. Selecting Repair Paths...................................13
65	  4.3. Assigning Traffic to Repair Paths............................14
66	  4.4. When no Repair Path is Possible..............................14
67	    4.4.1. Unreachable Target.......................................15
68	    4.4.2. Asymmetric Link Costs....................................15
69	    4.4.3. Interference Between Potential Node Repair Paths.........15
70	  4.5. Multi-homed Prefixes.........................................18
71	  4.6. Equal Cost Path Splits.......................................19
72	    4.6.1. Equal Cost Path Splits as Link Repair Paths..............19
73	    4.6.2. Equal Cost Path Splits and Node Failure..................20
74	  4.7. LANs and pseudonodes.........................................20
75	    4.7.1. The Link between Routers A and B is a LAN................21
76	      4.7.1.1. Case 1...............................................21
77	      4.7.1.2. Case 2...............................................21
78	      4.7.1.3. Simplified LAN repair................................22
79	    4.7.2. A LAN exists at the release point........................22
80	    4.7.3. A LAN between B and its neighbors........................22
81	    4.7.4. The LAN is a Transit Subnet..............................23
82	5. Failure Detection and Repair Path Activation.....................23
83	  5.1. Failure Detection............................................23
84	  5.2. Repair Path Activation.......................................23
85	  5.3. Node Failure Detection Mechanism.............................23
86	6. Loop Free Transition.............................................24
87	  6.1. Incremental Cost Advertisement...............................24
88	  6.2. Single Tunnel Per Router.....................................25
89	  6.3. Distributed Tunnels..........................................25
90	  6.4. Ordered SPFs.................................................26
91	7. Restoring Failed Components to Service...........................26
92	8. Implications for Network Management..............................26
93	9. IPFRR Capability.................................................27
94	10. Enhancements to routing protocols...............................27
95	11. IANA considerations.............................................27
96	12. Security Considerations.........................................27
97	Terminology

99	   This section defines words, acronyms, and actions used in this draft.

101	   A              Frequently used to denote a router that is
102	                  the source of a repair path computed in
103	                  anticipation of the failure of a neighboring
104	                  router denoted as B.

106	   B              Frequently used to denote a router whose
107	                  anticipated failure is the subject of repair
108	                  path computations.

110	   Directed       The ability of the repairing router (A) to
111	   forwarding     specify the next hop (Q) on exit from a
112	                  tunnel end-point (P)

114	   Extended       The union of the p-space of the neighbors of
115	   P-space        a specific router with respect to a common
116	                  component.

118	                  Extended p-space does not include the
119	                  additional space reachable through directed
120	                  forwarding.

122	   FIB            Forwarding Information Base. The database
123	                  used by the packet forwarder to determine
124	                  what actions to perform on a packet

126	   IPFRR          IP fast re-route

128	   P              The router in P-space to which a packet is
129	                  tunneled for repair.

131	   PQ             A router that is in both P and Q space and
132	                  hence does not need directed forwarding.

134	   P-space        P-space is the set of routers reachable from
135	                  a specific router without any path
136	                  (including equal cost path splits)
137	                  transiting a specified component.

139	                  For example, the P-space of A is the set of
140	                  routers that A can reach without using B
141	                  (router failure case) or the A-B link (link
142	                  failure case).

144	   Q              The router in Q space, to which the packet
145	                  is directed by router P on exit from the
146	                  repair tunnel. Q will always be adjacent to
147	                  P, or P itself.

149	   Q-space        Q-space is the set of routers from which a
150	                  specific router can be reached without any
151	                  path (including equal cost path splits)
152	                  transiting a specified component.

154	   Routing        The process whereby routers converge on a
155	   transition     new topology. In conventional networks this
156	                  process frequently causes some disruption to
157	                  packet delivery.

159	   RPF            Reverse Path Forwarding. I.e. checking that
160	                  a packet is received over the interface
161	                  which would be used to send packets
162	                  addressed to the source address of the
163	                  packet.

165	   SPF            Shortest Path First, e.g. Dijkstra's
166	                  algorithm.

168	   SPT            Shortest path tree

170	1. Introduction

172	   When the topology of a network changes (due to link or router
173	   failure, recovery or management action), the routers need to converge
174	   on a common view of the new topology. During this process, referred
175	   to as a routing transition, packet delivery between certain
176	   source/destination pairs may be disrupted. This occurs due to the
177	   time it takes for the topology change to be propagated around the
178	   network plus the time it takes each individual router to determine
179	   and then update the forwarding information base (FIB) for the
180	   affected destinations. During this transition, packets are lost due
181	   to the continuing attempts to use the failed component, and due to
182	   forwarding loops. Forwarding loops arise due to the inconsistent FIBs
183	   that occur as a result of the difference in time taken by routers to
184	   execute the transition process.

186	   The service failures caused by routing transitions are largely hidden
187	   by higher-level protocols that retransmit the lost data. However, new
188	   Internet services are emerging which are more sensitive to the packet
189	   disruption that occurs during a transition. To make the transition
190	   transparent to their users, these services require a short routing
191	   transition. Ideally, routing transitions would be completed in zero
192	   time with no packet loss.

194	   Regardless of how optimally the mechanisms involved have been
195	   designed and implemented, it is inevitable that a routing transition
196	   will take some minimum interval that is greater than zero. The
197	   solution described here uses pre-computed backup routes and
198	   controlled notification of network changes. A set of repair paths
199	   temporarily provides substitute connectivity in place of a link or
200	   router that has failed. Once the set of repair paths has been
201	   activated, there should be no further packet loss as a result of the
202	   associated failure. To achieve the maximum benefit from repair paths,
203	   they must be activated as soon as a failure has been detected, and a
204	   controlled transition to normal operation invoked to prevent packet
205	   loss due to micro-looping. The packet loss attributable to the
206	   failure will then be confined to the unavoidable loss that occurs as
207	   a result of the latency of the failure detection mechanism itself.

209	   The mechanisms described here have been designed for use with any
210	   link-state routing protocol.

212	2. Goals, non-goals, limitations and constraints

214	2.1. Goals

216	   The following are the goals of IPFRR:

218	        o  Protect against any link or router failure in the network.

220	        o  No constraints on the network topology or link costs.

222	        o  Never worse than the existing routing convergence mechanism.

224	        o  Co-existence with non-IP fast-reroute capable routers in the
225	           network.

227	2.2. Non-Goals

229	   The following are non-goals of IPFRR:

231	        o  Protection of a single point of failure.

233	        o  To provide protection in the presence of multiple concurrent
234	           failures other than those that occur due to the failure of a
235	           single router.

237	        o  Shared risk group protection.

239	        o  Complete fault coverage in networks that make use of
240	           asymmetric costs.

242	2.3. Limitations

244	   The following limitations apply to IPFRR:

246	        o  Because the mechanisms described here rely on complete
247	           topological information from the link state routing protocol,
248	           they will only work within a single link state flooding
249	           domain.

251	        o  Reverse Path Forwarding (RPF) checks cannot be used in
252	           conjunction with IPFRR. This is because the use of tunnels
253	           may result in packets arriving over different interfaces than
254	           expected.

256	2.4. Constraints

258	   The following constraints are assumed:

260	        o  Following a failure, only the routers adjacent to the failure
261	           have any knowledge of the failure.

263	        o  There is insufficient time following a failure to compute a
264	           repair strategy based on knowledge of the specific failure
265	           that has occurred.

267	        o  Multiple concurrent failures may not be protected.

269	3. Repair Paths

271	   When a router detects an adjacent failure, it uses a set of repair
272	   paths in place of the failed component, and continues to use them
273	   until the completion of the routing transition. Only routers adjacent
274	   to the failed component are aware of the nature of the failure. Once
275	   the routing transition has been completed, the router will have no
276	   further use for the repair paths since all routers in the network
277	   will have revised their forwarding data and the failed link will have
278	   been eliminated from this computation.

280	   Repair paths are pre-computed in anticipation of later failures so
281	   they can be promptly activated when a failure is detected.

283	   Three types of repair path are considered here.

285	     1. Equal cost path-split.

287	     Where a link is being used as a member of an equal cost path-split
288	     set for some destination, the other members of the set may be used
289	     to provide an alternative path, provided that they avoid the
290	     network component being protected.

292	     2. Downstream Path.

294	     A 'downstream path' is a next hop that will get a packet nearer to
295	     its destination. It does not necessarily represent the shortest
296	     path to the destination but has the property that a packet sent on
297	     it will not loop back because, having traversed this hop, it is
298	     then closer to its destination.

300	     3. Tunnel.

302	     A tunneled repair path tunnels traffic to some staging point from
303	     which it will travel to its destination using normal forwarding
304	     without looping back. The repair path can be thought of as
305	     providing a virtual link, originating at a router adjacent to a
306	     failure, and diverting traffic around the failure.
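
   The downstream path condition described in item 2 can be sketched as
   follows. This is a minimal illustration, not part of the mechanism
   specified in this draft: the topology (routers S, E, N, M, D and their
   link costs) and the helper names are hypothetical, and "nearer to its
   destination" is assumed to mean a strictly lower least cost to the
   destination than the repairing router's own.

```python
import heapq

def dijkstra(graph, source):
    """Least cost from source to every reachable router."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist

def build(links):
    """Build a symmetric adjacency map from (router, router, cost) triples."""
    graph = {}
    for u, v, cost in links:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph

def downstream_neighbors(graph, router, dest, protected):
    """Neighbors of router, other than protected, strictly closer to dest."""
    own_cost = dijkstra(graph, router)[dest]
    return sorted(n for n in graph[router]
                  if n != protected and dijkstra(graph, n)[dest] < own_cost)

# Hypothetical topology: S normally reaches D via E at cost 3.
graph = build([("S", "E", 1), ("E", "D", 2), ("S", "N", 3),
               ("N", "D", 2), ("S", "M", 1), ("M", "D", 5)])
candidates = downstream_neighbors(graph, "S", "D", protected="E")
print(candidates)
```

   In this topology N qualifies (its cost to D is 2, lower than S's cost
   of 3) even though the path via N is not the shortest, while M does not
   qualify because its least cost path to D (cost 4) returns through S.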

308	3.1. Tunnels as Repair Paths

310	   The repair strategies described in this draft operate on the basis
311	   that if a packet can somehow be sent to the other side of the
312	   failure, it will subsequently proceed towards its destination exactly
313	   as if it had traversed the failed component. See Figure 1.

315	       Repair Path from A to B
316	         +-----------+
317	         |           |
318	         |           v
319	   ---->[A]---//----[B]----->

321	   Figure 1 Simple Link Repair

323	   Creating a repair path from A to B may require a packet to traverse
324	   an unnatural route. If a suitable natural path starts at a neighbor
325	   (i.e. it is a downstream path), then A can force the packet directly
326	   there. If this is not the case, then A must use a tunnel to force the
327	   packet down the repair path. Note that the tunnel does not have to go
328	   from A to B. The tunnel can terminate at any router in the network,
329	   provided that A can be sure that the packet will proceed correctly to
330	   its destination from that router.

332	   A repair path computed for a link failure may not however work
333	   satisfactorily when the neighboring router has, itself, failed. This
334	   is illustrated in Figure 2.

336	        Repair path from A to B
337	        +-------------------------+
338	        |                         |
339	        |            <------------+
340	   --->[A]---//----[B]----//-----[C]-->
341	        +---------->              |
342	        |                         |
343	        +-------------------------+
344	         Repair Path from C to B

346	   Figure 2 Looping Link Repair when Router Fails

348	   Consider the case of a router B with just two neighbors A and C. When
349	   router B fails, both A and C will observe the failure of their local
350	   link to B, but will have no immediate knowledge that B itself has
351	   failed. If they were both to attempt to repair traffic around their
352	   local link, they would invoke mutual repairs which would loop.

354	   Since it is not easy for a router to immediately distinguish between
355	   a link failure and the failure of its neighbor, repair paths are
356	   calculated in anticipation of adjacent router failure. Thus, for each
357	   of its protected links, router A (Figure 3) pre-computes a set of
358	   tunneled repair paths, one for each of the neighbors (C, D, E) of its
359	   neighbor B on the A-B link. The set of destinations that are normally
360	   assigned to link A-B will be assigned to a repair path based on the
361	   neighbor of B through which router B would have forwarded traffic to
362	   them.

364	             Repair AC
365	          +----------->[C]
366	          |             |
367	          |             |
368	          |             |
369	   ----->[A]----//-----[B]---------[D]
370	          ||            |           ^
371	          ||            |           |
372	          || Repair AE  |           |
373	          |+---------->[E]          |
374	          |                         |
375	          +-------------------------+
376	             Repair AD

378	   Figure 3: Repair paths in anticipation of a router failure

380	   The set of repair paths in Figure 3 will function correctly in the
381	   case of link and router failure. However, in some network topologies
382	   they may not provide a means for traffic to reach router B itself.
383	   This is important in cases where B is a single point of failure and B
384	   is still functional (i.e. the failure was actually a failure of the
385	   A-B link). Hence, in addition to computing repair paths for the
386	   neighbors of its neighbor on a protected link, a router also
387	   calculates a repair path for the neighbor itself. This is illustrated
388	   in Figure 4.

390	             Repair AB
391	          +----------------+
392	          |                |
393	          |  Repair AC     |
394	          |+---------->[C] |
395	          ||            |  /
396	          ||            | /
397	          ||            |/
398	   ----->[A]----//-----[B]---------[D]
399	          ||            |           ^
400	          ||            |           |
401	          || Repair AE  |           |
402	          |+---------->[E]          |
403	          |                         |
404	          +-------------------------+
405	             Repair AD

407	   Figure 4 The full set of A-B repair paths.

409	   In the event of a failure, the only traffic that is assigned to the
410	   link repair path (the AB repair) is that traffic which has no other
411	   path to its destination except via B. As we have already seen, there
412	   is a danger that traffic assigned to this link repair path may loop
413	   if B has failed. Therefore, when the repair paths are invoked, a loop
414	   detection mechanism is used which promptly detects the loop and, upon
415	   detection, withdraws the link (A-B) repair path from service.
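
   The assignment of destinations to the repair paths of Figure 4 can be
   sketched as follows. This is an illustrative sketch under stated
   assumptions: routers F, G, H and K, all link costs, and the function
   names are hypothetical additions to the figure, and the sketch assumes
   there are no equal cost paths. For simplicity, only B itself is mapped
   to the link repair path (AB).

```python
import heapq

def shortest_path_tree(graph, root):
    """Dijkstra from root; returns {router: parent on its least cost path}."""
    dist, parent = {root: 0}, {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v], parent[v] = d + cost, u
                heapq.heappush(heap, (d + cost, v))
    return parent

def repair_target(parent, a, b, dest):
    """Repair path target for dest, or None if dest is not reached via A-B."""
    path = [dest]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    path.reverse()  # path now runs from a to dest
    if len(path) < 2 or path[0] != a or path[1] != b:
        return None
    # The router B would have forwarded to, or B itself (link repair).
    return path[2] if len(path) > 2 else b

# Figure 4 core (A, B, C, D, E) plus hypothetical routers F, G, H, K.
links = [("A", "B", 1), ("B", "C", 1), ("B", "D", 1), ("B", "E", 1),
         ("C", "F", 1), ("D", "G", 1), ("E", "H", 1),
         ("A", "K", 1), ("K", "C", 3), ("K", "D", 3), ("K", "E", 3)]
graph = {}
for u, v, cost in links:
    graph.setdefault(u, {})[v] = cost
    graph.setdefault(v, {})[u] = cost

parent = shortest_path_tree(graph, "A")
assignment = {d: repair_target(parent, "A", "B", d) for d in sorted(graph)}
print(assignment)
```

   Destinations F, G and H are assigned to the repair paths targeting C,
   D and E respectively, B is assigned to the link repair path, and K is
   unaffected because A does not reach it over the A-B link.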

417	3.2. Tunnel Requirements

419	   The specific tunneling mechanism used to provide a repair path is
420	   outside the scope of this document. However, the following sections
421	   describe the requirements for the tunneling mechanism.

423	3.2.1. Setup

425	   When a failure is detected, it is necessary to immediately redirect
426	   traffic to the repair paths. Consequently, the tunnels used must be
427	   provisioned beforehand in anticipation of the failure. IP fast re-
428	   route will determine which tunnels it requires. It must therefore be
429	   possible to establish tunnels automatically, without management
430	   action, and without the need to manually establish context at the
431	   tunnel endpoint.

433	3.2.2. Multipoint

435	   To reduce the number of tunnel endpoints in the network the tunnels
436	   should be multi-point tunnels capable of receiving repair traffic
437	   from any IPFRR router in the network.

439	3.2.3. Directed forwarding

441	   Directed forwarding must be supported such that the router at the
442	   tunnel endpoint (P) can be directed by the router at the tunnel
443	   source (A) to forward the packet directly to a specific neighbor.
444	   Specification of the directed forwarding mechanism is outside the
445	   scope of this document.

447	3.2.4. Security

449	   A lightweight security mechanism should be supported to prevent the
450	   abuse of the repair tunnels by an attacker. This is discussed in more
451	   detail in Section 12.

453	4. Construction of Repair Paths

455	4.1. Identifying Repair Path Targets

457	   To establish protection for a link or node it is necessary to
458	   determine which neighbors of the neighboring node should be targets
459	   of repair paths. Normally all neighbors will be used as repair path
460	   targets. However, in some topologies, not all neighbors will be
461	   needed as targets because, prior to the failure, no traffic was being
462	   forwarded through them by the repairing router.  This can be determined
463	   by examining the normal spanning tree computed by the repairing
464	   router.

466	   In addition, the neighboring router B will also be the target of a
467	   repair path for any destinations for which B is a single point of
468	   failure.

470	4.2. Determining Tunneled Repair Paths

472	   The objective of each tunneled repair path is to deliver traffic to a
473	   target router when a link is observed to have failed. However, it is
474	   seldom possible to use the target router itself as the tunnel
475	   endpoint because other routers on the repair path, that have not
476	   learned of the failure, will forward traffic addressed to it using
477	   their least cost path which may be via the failed link. This is
478	   illustrated in Figure 5 in which all link costs are one in both
479	   directions. Router A's intended repair path for traffic to D when
480	   link A-B fails is the path W-X-Y-Z-D. However, if router A makes D be
481	   the tunnel endpoint and forwards the packet to router W, router W
482	   will immediately return it to A because its least cost path to D is
483	   via A-B-D (cost 3, versus cost 4 via X-Y-Z-D) and it has no knowledge
484	   of the failure of link A-B.

486	              [A]--//--[B]------[D]
487	               |                 |
488	               |                 |
489	              [W]---[X]---[Y]---[Z]

491	   Figure 5. Repair path to target router D.
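
   The forwarding decision made by router W in Figure 5 can be checked
   with a short computation. This is a sketch only: it encodes the
   figure with all link costs set to one, as stated in the text, and the
   helper names are illustrative.

```python
import heapq

def dijkstra(graph, source):
    """Least cost from source to every reachable router."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist

# Figure 5, all link costs one in both directions.
FIGURE_5 = [("A", "B", 1), ("B", "D", 1), ("A", "W", 1), ("W", "X", 1),
            ("X", "Y", 1), ("Y", "Z", 1), ("Z", "D", 1)]
graph = {}
for u, v, cost in FIGURE_5:
    graph.setdefault(u, {})[v] = cost
    graph.setdefault(v, {})[u] = cost

# W has not learned of the failure, so it still uses the full topology.
# Cost to D via a neighbor = link cost + that neighbor's least cost to D.
via = {n: graph["W"][n] + dijkstra(graph, n)["D"] for n in graph["W"]}
next_hop = min(via, key=via.get)
print(via, next_hop)
```

   W's cost to D is 3 via A and 4 via X, so a packet that A tunnels
   directly to D is handed straight back to A by W.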

493	   Thus the tunnel endpoint needs to be somewhere on the repair path
494	   such that packets addressed to the tunnel end point will not loop
495	   back towards router A. In addition, the release point needs to be
496	   somewhere such that when packets are released from the tunnel they
497	   will flow towards the target router (or their actual destination)
498	   without being attracted back to the failed link. By inspection, in
499	   Figure 5, suitable tunnel endpoints are routers X, Y, and Z.

501	   Note that it is not essential that traffic assigned to a repair path
502	   actually traverse the target router for which the repair path was
503	   created. If, for example, in Figure 5, a packet's destination were
504	   normally reached via the path A-B-D-Z-?-?-?, once released at any of
505	   the possible tunnel endpoints, it would arrive at its destination by
506	   the best available route without traversing D.

508	   In general, the properties that are required of tunnel endpoints are:

510	        o  the end point must be reachable from the tunnel source
511	           without traversing the failed link; and

513	        o  once released, tunneled packets will proceed towards their
514	           destination without being attracted back over the failed link
515	           or node.

517	   Provided both of these conditions are met, packets forwarded on the
518	   repair path will not loop.

520	   In some topologies it will not be possible to find a tunnel endpoint
521	   that exhibits both the required properties. For example, in Figure 5,
522	   if the cost of link X-Y were increased from one to four in both
523	   directions, there is no longer a viable endpoint within the fragment
524	   of the topology shown.

526	   To solve this problem we introduce the concept of directed forwarding
527	   from the tunnel endpoint. Directed forwarding allows the originator
528	   of a tunneled packet to instruct that, when it is decapsulated at
529	   the end of the tunnel, it be forwarded via a specific adjacency, and
530	   not be subjected to the normal forwarding decision process. This
531	   effectively allows the tunnel to be extended by one hop. So, for
532	   example, in Figure 5 with the cost of link X-Y set to four, it would
533	   be possible to select X as the tunnel endpoint with the directive
534	   that X always forward the packets it decapsulates via the
535	   adjacency to Y. Thus, router X is reached from A using normal
536	   forwarding, and directed forwarding is then used to force packets to
537	   router Y, from where D can be reached using normal forwarding.

539	   Provided link costs are symmetrical, it can be proved that it is
540	   always possible to compute a tunneled repair path (possibly using
541	   directed forwarding) around a link failure.

543	   The tunnel endpoint (P) and the release point (Q) may be coincident,
544	   or may be separated by at most one hop.

546	4.2.1. Computing Repair Paths

548	   For a router A, determining tunneled repair paths around a
549	   neighboring router B, the set of potential tunnel end points includes
550	   all the routers that can be reached from A using normal forwarding
551	   without traversing the failed link A-B. This is termed the "P-space"
552	   of A with respect to the failure of B. Any router that is on an equal
553	   cost path split via the failed link is excluded from this set.

555	   The resulting set defines all the possible tunnel end points that
556	   could be used in repair paths originating at router A for the failure
557	   of link A-B. This set can be obtained by computing a spanning tree
558	   rooted at A and excising the subtree reached via the A-B link.

560	   The set of possible release points can be determined by computing the
561	   set of routers that can reach the repair path target without
562	   traversing the failed link. This is termed the "Q-space" of the
563	   target with respect to the failure. The Q-space can be obtained by
564	   computing a reverse spanning tree rooted at the repair path target,
565	   with the subtree which traverses the failed link (or node) excised.

567	   The reverse spanning tree uses the cost towards the root rather than
568	   from it and yields the best paths towards the root from other nodes
569	   in the network.

571	   The intersection of the target's Q-space with A's P-space includes
572	   all the possible release points for any repair path not employing
573	   directed forwarding. Where there is no intersection, but there exists
574	   a pair of adjacent routers, P in A's P-space and Q in the target's
575	   Q-space, router P can be used as the tunnel endpoint with directed
576	   forwarding to the release point Q.
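
   The computation above can be sketched as follows. This is a minimal
   illustration, not the draft's normative algorithm: it assumes
   symmetric link costs, a connected topology and a link (not node)
   failure, and it tests membership by comparing SPF distances rather
   than by literally excising a subtree. The function names and the two
   ring topologies are hypothetical.

```python
import heapq


def make_graph(links):
    """Build an undirected graph {node: {neighbor: cost}} from (u, v, cost)."""
    graph = {}
    for u, v, cost in links:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def dijkstra(graph, source):
    """Return the least cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist


def p_space(graph, a, b):
    """Routers for which no shortest path from A (ECMP included) uses A-B."""
    from_a, from_b = dijkstra(graph, a), dijkstra(graph, b)
    return {x for x in graph if graph[a][b] + from_b[x] > from_a[x]}


def q_space(graph, a, b, target):
    """Routers whose shortest paths to target all avoid link A-B."""
    from_a, from_b = dijkstra(graph, a), dijkstra(graph, b)
    to_target = dijkstra(graph, target)  # valid because costs are symmetric
    ab = graph[a][b]

    def uses_link(x):
        return (from_a[x] + ab + from_b[target] == to_target[x]
                or from_b[x] + ab + from_a[target] == to_target[x])

    return {x for x in graph if not uses_link(x)}


def repair_options(graph, a, b, target):
    """Return (release points without directed forwarding, (P, Q) pairs)."""
    p, q = p_space(graph, a, b), q_space(graph, a, b, target)
    directed = {(x, y) for x in p for y in graph[x]
                if y in q and {x, y} != {a, b}}
    return p & q, directed


# Hypothetical unit-cost rings; the protected link is A-B, the target is B.
ring5 = make_graph([("A", "B", 1), ("A", "W", 1), ("W", "X", 1),
                    ("X", "Y", 1), ("Y", "B", 1)])
ring6 = make_graph([("A", "B", 1), ("A", "W", 1), ("W", "X", 1),
                    ("X", "Y", 1), ("Y", "Z", 1), ("Z", "B", 1)])
```

   In the five-router ring the P-space and Q-space intersect, so
   tunneling alone is sufficient. In the six-router ring they do not,
   and the only candidate is a tunnel to X with directed forwarding
   to Y.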

578	4.2.2. Extended P-space

580	   The description in section 4.2.1 calculated router A's P-space rooted
581	   at A itself. However, since router A will only use a repair path when
582	   it has detected the failure of the link A-B, the initial hop of the
583	   repair path need not be subject to A's normal forwarding decision
584	   process. Thus we introduce the concept of extended P-space. Router
585	   A's extended P-space is the union of the P-spaces of each of A's
586	   neighbors. The use of extended P-space may allow router A to repair
587	   to targets that were otherwise unreachable.
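
   A minimal sketch of extended P-space follows, under the same
   assumptions as before (symmetric costs, connected topology, link
   failure). It also assumes that neighbor B is skipped when forming the
   union, since B lies across the failed link. All names and the ring
   topology are hypothetical.

```python
import heapq


def make_graph(links):
    """Build an undirected graph {node: {neighbor: cost}} from (u, v, cost)."""
    graph = {}
    for u, v, cost in links:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def dijkstra(graph, source):
    """Return the least cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist


def p_space_of(graph, root, a, b):
    """Routers that root reaches with no shortest path crossing link A-B."""
    from_root = dijkstra(graph, root)
    from_a, from_b = dijkstra(graph, a), dijkstra(graph, b)
    ab = graph[a][b]

    def crosses(x):
        return (from_root[a] + ab + from_b[x] == from_root[x]
                or from_root[b] + ab + from_a[x] == from_root[x])

    return {x for x in graph if not crosses(x)}


def extended_p_space(graph, a, b):
    """Union of the P-spaces of A's neighbors, excluding B itself."""
    result = set()
    for neighbor in graph[a]:
        if neighbor != b:
            result |= p_space_of(graph, neighbor, a, b)
    return result


# Hypothetical six-router ring with unit costs; the protected link is A-B.
ring6 = make_graph([("A", "B", 1), ("A", "W", 1), ("W", "X", 1),
                    ("X", "Y", 1), ("Y", "Z", 1), ("Z", "B", 1)])
```

   In this ring, Y is on an equal cost path split from A and so is not
   in A's own P-space, but it is in the P-space of neighbor W. Extended
   P-space therefore admits Y as a tunnel endpoint.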

589	4.2.3. Downstream Paths

591	   Under certain circumstances, the target's Q-space will include a
592	   router that is a neighbor of A. This is traditionally referred to as
593	   a downstream path and has the property that a packet sent on it will
594	   not loop back because, having traversed this hop, it is then closer
595	   to its destination. A trivial example of this is shown in Figure 6.

597	                   [A]--//---[B]
598	                    |         |
599	                    2         |
600	                    |         |
601	                   [D]-------[C]

603	   Figure 6. A topology that will permit a single-hop release point

605	   When a downstream path exists, no tunneling is required.
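
   The downstream test can be sketched as a distance comparison: a
   neighbor of A qualifies if it is strictly closer to the destination
   than A is. The sketch assumes symmetric costs and assumes that the
   unlabeled links in Figure 6 have a cost of one. The function names
   are hypothetical.

```python
import heapq


def make_graph(links):
    """Build an undirected graph {node: {neighbor: cost}} from (u, v, cost)."""
    graph = {}
    for u, v, cost in links:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def dijkstra(graph, source):
    """Return the least cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist


def downstream_neighbors(graph, a, b, dest):
    """Neighbors of A (other than B) that are strictly closer to dest than A."""
    to_dest = dijkstra(graph, dest)  # valid because costs are symmetric
    return {n for n in graph[a] if n != b and to_dest[n] < to_dest[a]}


# Figure 6, assuming the unlabeled links A-B, B-C and C-D have cost 1.
figure6 = make_graph([("A", "B", 1), ("A", "D", 2),
                      ("B", "C", 1), ("D", "C", 1)])
```

   With these assumed costs, D is a downstream neighbor for traffic
   destined for C, but not for traffic destined for B itself.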

607	4.2.4. Selecting Repair Paths

609	   The mechanism described in section 4.2 will identify all the possible
610	   release points that can be used to reach each particular target. (The
611	   circumstances when no release points exist are described in
612	   section 4.4.) In a well-connected network there are likely to be
613	   multiple possible release points for each target, and all will work
614	   correctly. For simplicity, one release point per target is chosen.
615	   All will deliver the packets correctly so, arguably, it does not
616	   matter which is chosen. However, one release point may be preferred
617	   over the others on the basis of path cost or some other criterion. It
618	   is an implementation matter as to how the release point is selected.

620	4.3. Assigning Traffic to Repair Paths

622	   Once the repair path for each target has been selected, it is
623	   necessary to determine which of the destinations normally reached via
624	   the protected link should be assigned to which of the repair paths
625	   when the link fails.

627	   This is achieved by recording which neighbor of B would be used to
628	   reach each destination reachable over A-B when running the original
629	   SPF. Traffic assignment is then simply a matter of assigning the
630	   traffic which B would have forwarded via each neighbor to the repair
631	   path which has that neighbor as its target.
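
   The assignment step can be sketched as follows. The sketch records
   the first hop chosen by the SPF at A and at B, and ignores equal cost
   path splits (the first path found wins). The topology and the
   function names are hypothetical.

```python
import heapq


def make_graph(links):
    """Build an undirected graph {node: {neighbor: cost}} from (u, v, cost)."""
    graph = {}
    for u, v, cost in links:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def spf_with_first_hop(graph, source):
    """Dijkstra that also records the first hop used to reach each node."""
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                first_hop[v] = v if u == source else first_hop[u]
                heapq.heappush(heap, (d + cost, v))
    return dist, first_hop


def assign_traffic(graph, a, b):
    """Map each destination A reaches via B to the repair path target."""
    _, hop_a = spf_with_first_hop(graph, a)
    _, hop_b = spf_with_first_hop(graph, b)
    assignment = {}
    for dest, hop in hop_a.items():
        if hop == b:
            # Traffic for B uses the repair path to B; everything else
            # follows the neighbor B would have forwarded it to.
            assignment[dest] = b if dest == b else hop_b[dest]
    return assignment


# Hypothetical topology: B has neighbors C and F besides A.
topology = make_graph([("A", "B", 1), ("B", "C", 1), ("B", "F", 1),
                       ("C", "D", 1), ("F", "E", 1),
                       ("A", "H", 1), ("H", "E", 4)])
```

   Here destinations C and D are assigned to the repair path whose
   target is C, and F and E to the repair path whose target is F.
   Router H is not reached over A-B and so needs no assignment.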

633	   Although the repair paths are calculated based on traffic addressed
634	   to specific targets, it can be proved that the traffic assignment
635	   algorithm guarantees that the repair path can be used for any traffic
636	   assigned to it.

638	   Where B would normally split the traffic to a particular destination
639	   via two or more of its neighbors, it is an implementation decision
640	   whether the repaired traffic should be split across the corresponding
641	   set of repair paths.

643	   The repair path to B itself is normally used just for traffic
644	   destined for B and any prefixes advertised by B. However, under some
645	   circumstances, it may be impossible to compute a repair path to one
646	   or more of B's neighbors, for example, because B is a single point of
647	   failure. In this case traffic for the destinations served by the
648	   otherwise irreparable targets is assigned to the repair path with B
649	   as its target, in the optimistic assumption that router B is still
650	   functioning. If router B is indeed still functioning, this will
651	   ensure delivery of the traffic. If, however, router B has failed, the
652	   traffic on this repair path will loop as previously shown in
653	   section 3.1. The way this is detected, and the course of action when
654	   it is detected, are described in section 5.3.

656	4.4. When no Repair Path is Possible

658	   Under some circumstances, it will not be possible to identify a
659	   repair path to one or more of the targets. This can occur for the
660	   following reasons:

662	        o  The neighboring router that is presumed to have failed
663	           constitutes a single point of failure in the network.

665	        o  Severely asymmetric link costs may cause an otherwise viable
666	           physical repair path to be unusable.

668	        o  Interference may occur between the repair paths of individual
669	           targets.

671	   In practice, these cases are unlikely to be encountered frequently.
672	   Networks that will benefit from the mechanisms described here will
673	   usually exhibit considerable redundancy and are normally operated
674	   with largely symmetric link costs. Note that a router's inability to
675	   compute a full set of repair paths for one of its links does not
676	   necessarily affect its ability to do so for its other links.

678	   Example topologies illustrating each of the three cases above are
679	   described in the following subsections.

681	4.4.1. Unreachable Target

683	   If the failure of a neighboring router makes one or more of its
684	   neighbors genuinely unreachable, clearly it will not be possible to
685	   establish a repair path to such targets. Such single points of
686	   failure are not expected to be encountered frequently in properly
687	   designed networks, and will probably occur only when the network has
688	   previously suffered other failures that have reduced its
689	   connectivity.

691	4.4.2. Asymmetric Link Costs

693	   When link costs have been set asymmetrically, it is possible that a
694	   repair path cannot be constructed even using directed forwarding.

696	   Although it is trivial to construct a network fragment with this
697	   property, this should not be regarded as a major problem. Firstly,
698	   asymmetric link costs are seldom used deliberately. And, secondly,
699	   even when an asymmetric link cost prevents one potential repair path
700	   being used, there will normally be other ones available.

702	4.4.3. Interference Between Potential Node Repair Paths

704	   Under some circumstances the existence of one neighbor may interfere
705	   with a potential repair path to another. Consider the topology shown
706	   in Figure 7 in which all links have a symmetrical cost of one, with
707	   the exception of that between H and G, which has a cost of 3. In this
708	   example, the fact that router F is a neighbor of B prevents the
709	   discovery of a repair path from router A to router C despite the
710	   existence of an apparently suitable path.

712	                   [A]---//---[B]------ [C]
713	                    |          |         |
714	                    |          |         |
715	                   [H]-3-[G]--[F]--[E]--[D]

717	   Figure 7. Interference between repair paths

719	   A repair path from router A to F can use F itself as the release
720	   point by employing directed forwarding from G. However, it is not
721	   possible to identify a suitable release point for a repair path to
722	   router C within the topology shown since there is nowhere that
723	   router A can reach that will subsequently forward traffic to router C
724	   except via the forbidden link B-C (F's least cost path to C is
725	   F-B-C). This is because the extended P-space of router A is separated
726	   by more than one hop from the Q-space of router C.

728	   Since the topology shown in Figure 7 will typically form part of a
729	   much larger topology, a different, and possibly more circuitous
730	   repair path from A to C, that does not go via F, may be discovered.
731	   This is illustrated in Figure 8. In this enhanced topology, a repair
732	   path to C using Y as the release point can be used.

734	     [A]---//---[B]-------[C]
735	      |          |         |
736	      |          |         |
737	     [H]-3-[G]--[F]--[E]--[D]
738	            |         |
739	            |         |
740	           [X]--[Y]--[Z]

742	   Figure 8. Resolving interference in a larger network

744	   Note that, in Figure 8, if the traffic for C were assigned to the
745	   repair path for F, it would correctly reach C because F would assign
746	   it to its repair path to C. That is, packets from A to C would travel
747	   via two successive tunnels. Consequently, this is referred to as a
748	   "secondary repair path". However, it is not always the case that
749	   interference can be handled in this fashion and it is possible to
750	   create looping repair paths.

752	   One possibility of looping repair paths is illustrated in Figure 9.
753	   All links have a symmetrical cost of one with the exception of HG,
754	   which is cost 3 in either direction, and ED and DC which are cost 5
755	   in the indicated direction and cost 1 in the other.

757	                    [A]---//---[B]--------[C]
758	                     |          |          |^
759	                     |          |          |5
760	                    [H]-3-[G]--[F]--[E]---[D]
761	                                        5>

763	   Figure 9. Looping secondary repair paths

765	   In this topology, A can establish a repair path to F, but cannot
766	   establish a repair path to C because of interference. Router A might
767	   assign traffic intended for C onto its repair path to F expecting it
768	   to undergo a secondary repair towards C. However, because of the
769	   asymmetrical link costs, F is unable to establish a repair path to C.
770	   It is only able to establish a repair path to A. If F, like A,
771	   elected to forward repaired traffic to C using its (only) repair path
772	   to A, similarly expecting a secondary repair to get it to its
773	   destination, traffic for C would loop between A and F. Thus when
774	   interference occurs, the possibility of a secondary repair path
775	   cannot be relied upon to ensure that traffic reaches its destination.

777	   In order to determine the viability of secondary repair paths, it is
778	   necessary for each router to take into account the repair paths which
779	   the other neighbors of router B can achieve. These can be computed
780	   locally by running the repair path computation algorithms rooted at
781	   each of those neighbors. It is only necessary to compute the repair
782	   paths from the routers to which router A can establish repair paths,
783	   with targets of those routers to which repair paths have not yet been
784	   established.

786	   It is then possible to determine whether all routers can now be
787	   reached by invoking secondary (or if necessary tertiary, etc.) repair
788	   paths, and if so, to which primary repair path traffic for each
789	   target should be assigned.
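
   One way to sketch this determination is as a breadth-first search
   over the "can establish a primary repair path to" relation. The
   repair sets below are supplied by hand for illustration: the second
   follows the description of Figure 9, and the first is hypothetical.
   The function name is also hypothetical.

```python
from collections import deque


def secondary_repair_chain(primary_repairs, start, target):
    """Return a chain of routers from start to target, or None.

    primary_repairs maps each router to the set of targets to which it
    can establish a primary repair path.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        chain = queue.popleft()
        for nxt in sorted(primary_repairs.get(chain[-1], ())):
            if nxt == target:
                return chain + [nxt]
            if nxt not in seen:  # never revisit: this is what stops loops
                seen.add(nxt)
                queue.append(chain + [nxt])
    return None


# Hypothetical case: A can repair to F, and F can repair to C.
chained = {"A": {"F"}, "F": {"C"}}

# Figure 9 as described: A can repair only to F, and F only to A.
looping = {"A": {"F"}, "F": {"A"}}
```

   For the first input the search returns the chain A, F, C. For the
   Figure 9 input it returns None, so traffic for C must not be placed
   on the repair path to F. The existence of a chain is necessary but
   not sufficient: each primary repair path must also be valid for
   traffic addressed to the secondary target.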

791	   There is another, more subtle, possibility of loops arising when
792	   secondary repair paths are used. This is illustrated in Figure 10,
793	   where all links are cost 1 with the exception of JI which has a cost
794	   5 in that direction and cost 1 in the direction IJ.

796	               [A]---//---[B]--------[C]
797	                |          |          |
798	                |          |          |
799	               [J]         |         [D]
800	               5|          |          |
801	               v|          |          |
802	               [I]---[H]--[G]---[F]--[E]

804	   Figure 10. Example of an apparently non-looping secondary repair path
805	   which results in a loop.

807	   Router A has a primary repair path to G (with a release point of I),
808	   and G has a primary repair path to C (with a release point of E). It
809	   would appear that these form a non-looping secondary repair path from
810	   A to C. As usual, the primary repair path from A to G has been
811	   computed on the basis of destinations normally reachable through BG.
812	   However, when making use of the secondary repair path, the traffic
813	   inserted in the repair path from A to G will be destined not for one
814	   of the routers normally reachable via BG, but for C. Hence this
815	   repair path is not necessarily valid for such traffic, and in this
816	   example it will have a 50% probability of being forwarded back along
817	   the path IJABC, and hence looping.

819	   This problem can in general be avoided by choosing a release point
820	   for the initial primary repair with the property that traffic for the
821	   secondary target (C) is guaranteed to traverse the primary target
822	   (G). This can be achieved by computing the reverse SPF rooted at the
823	   secondary target (C) and examining the sub-tree which traverses the
824	   primary target. It can be proved that in the absence of asymmetric
825	   link costs, such a release point will always exist. Where asymmetric
826	   link costs prevent this, the traffic can be encapsulated to the
827	   intermediate router (G), which may require the use of double
828	   encapsulation. On reaching router G, the traffic for C is
829	   decapsulated and then forwarded in G's primary repair path to C (via
830	   router E, in the example).

832	4.5. Multi-homed Prefixes

834	   Up to this point, it has been assumed that any particular prefix is
835	   "attached" to exactly one router in the network, and consequently
836	   only the routers in the network need be considered when constructing
837	   repair paths, etc. However, in many cases the same prefix will be
838	   attached to two or more routers. Common cases are:

840	        o  The subnet present on a link is advertised from both ends of
841	           the link.

843	        o  Prefixes are propagated from one routing domain to another by
844	           multiple routers.

846	        o  Prefixes are advertised from multiple routers to provide
847	           resilience in the event of the failure of one of the routers.

849	   In general, this causes no particular problems, and the shortest
850	   route to each prefix (and hence which of the routers to which it is
851	   attached should be used to reach it) is resolved by the normal SPF
852	   process. However, in the particular case where one of the instances
853	   of a prefix is attached to router B, or to a router for which router
854	   B is a single point of failure, the situation is more complicated.

856	               P
857	               |
858	               |
859	   [A]---//---[B]--------[C]
860	    |                     |                               P
861	    |                     |                               |
862	   [W]-----[X]----[Y]----[Z]-[G]-[H]-[I]-[J]-[K]-[L]-[M]-[N]

864	   Figure 11. A multi-homed prefix p

866	   Consider a prefix p, which is attached to router B and some other
867	   router N as illustrated in Figure 11. Before the failure of the link
868	   A-B, p is reachable from A via A-B. After the failure it cannot be
869	   assumed that B is still reachable. If traffic to p is assigned to a
870	   link repair path to B (as it would be if p were attached only to B),
871	   and router B has failed, then it would loop and subsequently be
872	   dropped. Traffic for p cannot simply be assigned to whatever repair
873	   path would be used for traffic to N, because other routers, which are
874	   not yet aware of any failure, may direct the traffic back towards B,
875	   since the instance of p attached to B is closer.

877	   A solution is to treat p itself as a neighbor of B, and compute a
878	   repair path with p as a target. However, although correct, this
879	   solution may be infeasible where there are a very large number of
880	   such prefixes, which would result in an unacceptably large
881	   computational overhead.

883	   Some simplification is possible where there exist a large number of
884	   multi-homed prefixes which all share the same connectivity and
885	   metrics. These may be treated as a single router and a single repair
886	   path computed for the entire set of prefixes.

888	   An alternative solution is to tunnel the traffic for a multi-homed
889	   prefix to the router N where it is also attached (see Figure 11). If
890	   this involves a repair path that was already tunneled, then this
891	   requires double encapsulation.

893	4.6. Equal Cost Path Splits

895	   Equal cost path splits may be used as a repair mechanism, but link
896	   and node repairs need to be considered separately.

898	4.6.1. Equal Cost Path Splits as Link Repair Paths

900	   When a link is used as a member of one or more path-split sets, by
901	   definition, the destinations served could be equally well served by
902	   any other member of the path-split set. Therefore, when the link
903	   fails, any destinations that use the link as a path-split may be
904	   immediately assigned to another member of the set. Clearly, if
905	   traffic to some destinations can be repaired using a path split, it
906	   should not also be subject to repair by tunneling. Such destinations
907	   should be identified before performing traffic assignment to tunneled
908	   repair paths.

910	4.6.2. Equal Cost Path Splits and Node Failure

912	   An equal cost path split may traverse the failed node (router B). In
913	   this case, the path split may not be an appropriate repair path.
914	   There are two cases:

916	        o  the path split is a parallel link, having router B as a
917	           direct neighbor, and

919	        o  the path split does not have router B as a direct neighbor,
920	           but the route traverses router B at some point further
921	           downstream.

923	   These are illustrated in Figure 12 and Figure 13 respectively.

925	               +---//---+
926	              [A]      [B]-------[D]
927	               +--------+

929	   Figure 12. A parallel link path split

931	               +-2-//---+
932	              [A]      [B]-------[D]
933	               +--[C]---+

935	   Figure 13. A path split via an intermediate node

937	   In both cases it must be assumed that router B has failed and some
938	   other repair path, diverse with respect to router B, must be used.

940	4.7. LANs and pseudonodes

942	   In link state protocols a LAN is represented by a construct known as
943	   a pseudonode in IS-IS and a network LSA in OSPF.

945	   In order to deal correctly with this representation of LANs, the
946	   algorithms described in this draft require certain modifications.
947	   There are four cases which require consideration. These are described
948	   in the following subsections.

950	4.7.1. The Link between Routers A and B is a LAN

952	   In this case, the link which is being protected is a LAN, and the
953	   router B which has potentially failed is reachable over the LAN. This
954	   is illustrated in Figure 14.

956	              [A]
957	               |
958	     =====================
959	     |    |       |      |
960	    [B]  [C]     [D]    [E]

962	            Figure 14. The link between routers A and B is a LAN

964	   There are two possible failure modes in this case.

966	4.7.1.1. Case 1

968	   Router B or its interface to the LAN may have failed independently of
969	   the rest of the LAN. In this case the remaining routers on the LAN
970	   (routers C, D and E) will remain reachable from router A. These
971	   routers do not appear as direct neighbors of router B in the link
972	   state database and are not treated as neighbors of router B for the
973	   purposes of this specification because no traffic from router A would
974	   be directed through router B to any of these routers. However, each
975	   of these neighboring routers will have router B as a neighbor and
976	   they will initiate their own repair paths in the event of the failure
977	   of router B or its LAN interface.

979	   Repair paths are computed with the non-LAN neighbors of B as targets,
980	   and also B itself (the "link-failure" repair path). Note that since
981	   the remaining neighbors of A on the LAN are assumed to be still
982	   reachable when the link to B has failed, these repair paths may
983	   traverse the LAN.

985	   A separate set of repair paths is required in anticipation of the
986	   potential failure of each router on the LAN.

988	4.7.1.2. Case 2

990	   Router A's interface to the LAN may have failed (or the entire LAN
991	   may have failed). In either event, simultaneous failures will be
992	   observed from router A to all the remaining routers on the LAN
993	   (routers B, C, D and E). In this case, the pseudonode itself can be
994	   treated as the "adjacent" router (i.e. the router normally referred
995	   to as "router B"), and repairs constructed using the normal
996	   mechanisms with all the neighbors of the pseudonode (routers B, C, D
997	   and E) as repair path targets. If one or more of the routers had
998	   failed in addition to the LAN connectivity, treating it as a repair
999	   path target would not be viable, but this would be a case of multiple
1000	   simultaneous failures which is out of scope of this specification.

1002	   The entire sub-tree over A's LAN interface is the failed component
1003	   and is excised from the spanning tree when computing A's extended P-
1004	   space. For the Q-spaces of the targets, the sub-tree over the LAN
1005	   interface of the target is excised.

1007	4.7.1.3. Simplified LAN repair

1009	   A simpler alternative strategy is to always consider the LAN and all
1010	   routers attached to it as failing as a single unit. In this case, a
1011	   single set of repair paths is computed with targets being the entire
1012	   set of non-LAN neighbors of all the routers on the LAN, together with
1013	   "link-repair" paths with all the routers on the LAN as targets. Any
1014	   failure of one or more LAN adjacencies results in these repair paths
1015	   being invoked for all neighbors on the LAN. These repair paths must
1016	   not traverse the LAN, and so must be computed by excising the entire
1017	   sub-tree reachable over A's LAN interface from A's spanning tree
1018	   (i.e. the entire LAN is the failed component). The Q-spaces are
1019	   computed as normal, with the LAN neighbors or their interface to the
1020	   LAN being excised as appropriate. This is simpler than the approach
1021	   proposed above, but will fail to make use of possible repair paths
1022	   (or even path splits) over the LAN. In particular, if the only viable
1023	   repair paths involve the LAN, it will prevent any repair being
1024	   possible.

1026	4.7.2. A LAN exists at the release point

1028	   When computing the viable release points, it may be that one or more
1029	   of the leaf nodes are actually pseudonodes. In this case, the release
1030	   point is deemed to be any of the parent nodes on the LAN by which the
1031	   pseudonode had been reached, and when computing the extended set of
1032	   release points (reachable by directed forwarding), all the remaining
1033	   routers on the LAN may be included.

1035	4.7.3. A LAN between B and its neighbors

1037	   If there is a LAN between router B and one or more of B's neighbors
1038	   (other than router A), then rather than treating each of those
1039	   neighbors as a separate target to which a repair path must be
1040	   computed, the pseudonode itself can be treated as a single target for
1041	   which a repair path can be computed. If there are other neighbors of
1042	   B which are directly attached to B, including those which may also be
1043	   attached to the LAN, they must each still be treated as an individual
1044	   repair path target.

1046	   Normally a repair path with the pseudonode as its target will have a
1047	   release point before the pseudonode. However it is possible that the
1048	   release point would be computed as the pseudonode itself. This will
1049	   occur if the reverse spanning tree rooted at the pseudonode includes
1050	   no routers other than itself. In this case a single repair with the
1051	   pseudonode as target is not possible, and it is necessary to compute
1052	   individual repair paths whose targets are each of the neighbors of B
1053	   on the LAN.

1055	4.7.4. The LAN is a Transit Subnet.

1057	   This is the most common case, where a LAN is traversed by a repair
1058	   path, but is not in any of the special positions described above. In
1059	   this case no special treatment is required, and the normal SPF
1060	   mechanisms are applicable.

1062	5. Failure Detection and Repair Path Activation

1064	   The details of repair path activation are inherently implementation-
1065	   dependent and must be addressed by individual design specifications.
1066	   This section describes the implementation independent aspects of the
1067	   failover to the repair path.

1069	5.1. Failure Detection

1071	   The failure detection mechanism must provide timely detection of the
1072	   failure and activation of the repair paths. The failure detection
1073	   mechanisms may be media specific (for example loss of light), or may
1074	   be generic (for example BFD). Multiple detection mechanisms may be
1075	   used in order to improve detection latency. Note that in the case of
1076	   a LAN it may be necessary to monitor connectivity to all of the
1077	   adjacent routers on the LAN.

1079	5.2. Repair Path Activation

1081	   The mechanism used by the router to activate the repair path
1082	   following failure will be implementation specific.

1084	   An implementation that is capable of withdrawing the repair may delay
1085	   the start of network convergence in order to minimize network
1086	   disruption in the event that the failure was transient.

1088	5.3. Node Failure Detection Mechanism

1090	   When router A detects a failure of the A-B link, it will invoke the
1091	   link repair path from itself to router B. This A-B link repair is
1092	   always invoked because even if all other traffic can be re-routed, B
1093	   is always a single point of failure to itself. If router B has
1094	   failed, the A-B link repair can result in a forwarding loop. A node
1095	   failure detection mechanism is therefore needed. A suitable mechanism
1096	   might be to run BFD [BFD] between A and B, over the A-B link repair
1097	   path.

1099	   When the node failure detection mechanism has determined that router
1100	   B has failed it withdraws the A-B link repair path. The node failure
1101	   detection and revocation of the A-B link repair needs to be
1102	   expedited, in order to minimize the duration of collateral damage to
1103	   the network caused by packets looping around the A-B link repair path.

1105	   If B is a single point of failure to some destinations, then
1106	   withdrawing the A-B link repair has no impact on network
1107	   connectivity, because those destinations will have been rendered
1108	   unreachable by the failure of router B.

1110	   If B is not a single point of failure, but traffic to some
1111	   destinations is being repaired via the A-B link because of the
1112	   inability to provide suitable repair paths, then there are
1113	   destinations that are rendered temporarily unreachable by IPFRR. The
1114	   IPFRR loop free convergence mechanism delays normal convergence of
1115	   the network. Consideration therefore has to be given to the relative
1116	   importance of the traffic being protected and the traffic being
1117	   black-holed. Depending on the outcome of that consideration, the
1118	   IPFRR loop-free strategy may need to be abandoned.

1120	6. Loop Free Transition

1122	   Once the repair paths have been activated, data will again be
1123	   forwarded correctly. At this stage only the routers directly adjacent
1124	   to the failure will be aware of the failure because no routing
1125	   information concerning the failure has yet been propagated to other
1126	   routers. The network now has to be transitioned to normal operation
1127	   using the available components.

1129	   During network transition inconsistent state may lead to the
1130	   formation of micro-loops. During this period, packets may be
1131	   prevented from reaching the repair path, may expire due to transiting
1132	   an excessive number of hops, may be subject to excessive delay, and
1133	   the resultant congestion may disrupt the passage of other packets
1134	   through the network. The use of a loop free transition technique
1135	   allows the network to re-converge without packet loss or disruption.

1137	   Four loop free transition strategies are described:

1139	        o  Incremental cost advertisement

1141	        o  Single Tunnel Per Router

1143	        o  Distributed Tunnels

1145	        o  Ordered SPFs

1147	6.1. Incremental Cost Advertisement

1149	   When a link fails, the cost of the link is normally changed from its
1150	   assigned metric to "infinity". However it can be proved that: if the
1151	   link cost is increased in suitable increments, and the network is
1152	   allowed to stabilize before the next cost increment is advertised,
1153	   then no micro-loops will form.

1155	   This approach has the advantage that it requires no change to the
1156	   routing protocol, and will work with non-IPFRR capable routers.
1157	   However the loop-free transition is slow, particularly if large
1158	   metrics are used, and during this time the network is vulnerable to a
1159	   second failure.

1161	6.2. Single Tunnel Per Router

1163	   When a failure is detected, the routers adjacent to the failure issue
1164	   a "covert" announcement of the failure, which is propagated through
1165	   the network by all routers, but which is understood only by IPFRR
1166	   capable routers. These routers each build a tunnel to the closest
1167	   IPFRR router adjacent to the failure. They then determine which of
1168	   their traffic would transit the failure and place that traffic in the
1169	   tunnel. When all of these tunnels are in place, the failure is then
1170	   announced as normal. Because the tunnel will be unaffected by the
1171	   transition, and because the IPFRR router at the tunnel endpoint will
1172	   continue the repair, no traffic will be disrupted by the failure.
1173	   When the network has converged, the IPFRR routers can withdraw the
1174	   tunnels. The order of tunnel insertion and withdrawal is not
1175	   important, provided the tunnels are all in place before the normal
1176	   announcement.

1178	   This technique has the disadvantage that it requires traffic to be
1179	   tunneled during the transition.

1181	   A further disadvantage of this method is that it requires
1182	   co-operation from all the routers within the routing domain to fully
1183	   protect the network against micro-loops. However, it can be shown
1184	   that micro-loops will be confined to contiguous groups of non-IPFRR
1185	   capable routers, and will only affect traffic arriving at the network
1186	   through one of those routers.
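
   The step in which each router determines which of its traffic would
   transit the failure can be sketched as follows. This is a minimal
   Python illustration, not a definitive implementation: the topology,
   the function names and the use of a single shortest path per
   destination (equal cost paths are ignored) are all assumptions made
   for this example.

```python
import heapq


def shortest_path_tree(graph, source):
    """Dijkstra: return {node: predecessor} on shortest paths from source."""
    dist = {source: 0}
    pred = {source: None}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nbr not in dist or nd < dist[nbr]:
                dist[nbr] = nd
                pred[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    return pred


def destinations_via_link(graph, source, failed_link):
    """Return destinations whose shortest path from source uses failed_link."""
    pred = shortest_path_tree(graph, source)
    link = set(failed_link)
    affected = set()
    for dest in pred:
        node = dest
        # Walk back toward the source, looking for the failed link.
        while pred[node] is not None:
            if {pred[node], node} == link:
                affected.add(dest)
                break
            node = pred[node]
    return affected


# Hypothetical topology: symmetric link costs, link A-B is about to fail.
graph = {
    "S": {"A": 1, "C": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "D": 1},
    "C": {"S": 1, "D": 5},
    "D": {"B": 1, "C": 5},
}
print(sorted(destinations_via_link(graph, "S", ("A", "B"))))  # ['B', 'D']
```

   In this hypothetical topology, S would place its traffic for B and D
   in the tunnel and leave its other traffic unchanged.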

1188	6.3. Distributed Tunnels

1190	   This is similar to the single tunnel per router approach except that
1191	   all IPFRR capable routers calculate a set of repair paths, using the
1192	   same algorithms, for traffic that will be affected by the failure.

1194	   This reduces the load on the tunnel endpoints, but the length of time
1195	   taken to calculate the repairs increases the convergence time.

1197	   This method suffers from the same disadvantages as the single tunnel
1198	   method.

1200	6.4. Ordered SPFs

1202	   Micro-loops occur when a router closer to the failed component
1203	   revises its routes to take account of the failure before a router
1204	   which is further away. By analyzing the reverse spanning tree over
1205	   which traffic is directed to the failed component, it is possible to
1206	   determine a strict ordering which ensures that routers closer to the
1207	   root always process the failure after any routers further away, and
1208	   hence micro-loops are prevented.

1210	   When the failure has been announced, each router waits a multiple of
1211	   some time delay value. The multiple is determined by the router's
1212	   position in the reverse spanning tree, and the delay value is chosen
1213	   to guarantee that a router can complete its processing within this
1214	   time. The convergence time may be reduced by employing a signaling
1215	   mechanism to notify the parent when all the children have completed
1216	   their processing, and hence when it is safe for the parent to
1217	   instantiate its new routes.

1219	   This approach therefore imposes a delay that is bounded by the
1220	   network diameter, although in most cases the delay will be much
1221	   less.

1223	   It requires all routers in the domain to operate according to these
1224	   procedures, and the presence of non co-operating routers can give
1225	   rise to loops for any traffic which traverses them (not just traffic
1226	   which is originated through them).
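
   One way to derive the multiple from a router's position in the
   reverse spanning tree is sketched below in Python. The sketch
   assumes that the multiple is the length of the longest chain of
   routers lying further from the root than the router in question, so
   that a router always waits longer than every router that forwards
   through it. That choice, the function name, the example tree and
   the 500 ms delay value are assumptions made for illustration and
   are not specified by this document.

```python
def ordered_spf_delays(parent, delay_unit):
    """Map each router to the delay before it installs its new routes.

    parent maps each router to its next hop toward the failed component
    in the reverse spanning tree; the root of the tree maps to None.
    """
    children = {router: [] for router in parent}
    for router, up in parent.items():
        if up is not None:
            children[up].append(router)

    height = {}

    def subtree_height(router):
        # Longest chain of routers further from the root than this one.
        if router not in height:
            height[router] = max(
                (1 + subtree_height(c) for c in children[router]),
                default=0,
            )
        return height[router]

    return {r: subtree_height(r) * delay_unit for r in parent}


# Hypothetical reverse spanning tree: R is the root, D is furthest away.
parent = {"R": None, "A": "R", "B": "A", "C": "A", "D": "C"}
delays = ordered_spf_delays(parent, delay_unit=500)  # milliseconds
print(delays)  # {'R': 1500, 'A': 1000, 'B': 0, 'C': 500, 'D': 0}
```

   In this example the routers with nothing behind them (B and D)
   update immediately, and each router nearer the root waits one
   further delay value beyond its slowest child.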

1228	7. Restoring Failed Components to Service

1230	   When a failed neighbor or link is restored to service, it will be
1231	   detected according to the normal operation of the routing protocols
1232	   by the formation of an adjacency. Normally this would result in the
1233	   information about the link being included in newly generated routing
1234	   information. However, just as in the case with increasing costs, the
1235	   sudden decrease in cost from "infinity" to the configured value of
1236	   the link cost may give rise to loops. Each of the loop-free
1237	   transition mechanisms described above has a corresponding mechanism
1238	   that can be used to add a link to the network without the formation
1239	   of micro-loops.

1241	8. Implications for Network Management

1243	   It will be clear from the above that topology changes introduced by
1244	   management action, such as enabling or disabling a link or router, or
1245	   changing the cost metric of a link, may disrupt traffic through the
1246	   formation of micro-loops. It will equally be clear that the loop-free
1247	   convergence strategies described above can be applied to prevent
1248	   such micro-loops.

1250	9. IPFRR Capability

1252	   In the previous sections it has been assumed that all routers in the
1253	   network are capable of acting as IPFRR routers, performing such tasks
1254	   as tunnel termination and directed forwarding. In practice this is
1255	   unlikely to be the case, partially because of the heterogeneous
1256	   nature of a practical network, and partially because of the need to
1257	   progressively deploy such capability. IPFRR therefore needs to
1258	   support some form of capability announcement, and the algorithms need
1259	   to take these capabilities into account when calculating their path
1260	   repair strategies. For example, the ability of routers to function as
1261	   tunnel end points and perform directed forwarding will influence the
1262	   choice of repair path. However, routers which are simply traversed by
1263	   repair paths (tunneled or not) do not need to be IPFRR capable in
1264	   order to guarantee correct operation of the repair paths.

1266	10. Enhancements to Routing Protocols

1268	   It will be seen from the above that a number of enhancements to the
1269	   appropriate routing protocols are needed to support IPFRR. The
1270	   following possible enhancements have been identified:

1272	        o  The ability to advertise IPFRR capability

1274	        o  The ability to advertise tunnel endpoint capability

1276	        o  The ability to advertise directed forwarding identifiers

1278	        o  The ability to announce the start of a loop-free transition,
1279	           and to abort a loop-free transition

1281	        o  The ability to signal transition completion status to
1282	           neighbors

1284	        o  The ability to advertise that a link is protected

1286	   Capability advertisement should make use of existing capability
1287	   mechanisms in the routing protocols. The exact set of enhancements
1288	   will depend on specific IPFRR design choices.

1290	11. IANA Considerations

1292	   There are no IANA considerations that arise from this architectural
1293	   description of IPFRR. However, the changes to the IGPs needed to
1294	   support IPFRR will have IANA considerations.

1296	12. Security Considerations

1298	   Changes to the IGPs to support IPFRR do not introduce any additional
1299	   security risks.

1301	   The security implications of the increased convergence time due to
1302	   the loop avoidance strategy depend on the approach to multiple
1303	   failures. If the presence of multiple failures results in the network
1304	   aborting the loop-free strategy, then the convergence time will be
1305	   similar to that of a conventional network. On the other hand, an
1306	   attacker in a position to disrupt part of a network might use this to
1307	   disrupt the repair of a critical path.

1309	   The tunnel endpoints need to be secured to prevent their use as a
1310	   facility by an attacker. Performance considerations indicate that
1311	   tunnels cannot be secured by IPsec [IPSEC]. A system of packet
1312	   address policing, both at the tunnel endpoints and at the edges of
1313	   the network, would prevent an attacker's packets from arriving at a
1314	   tunnel endpoint and would seem to be the best strategy.

1316	   When a fast re-route is in progress, there may be an unacceptable
1317	   increase in traffic load over the repair path. Network operators need
1318	   to examine the computed repair paths and ensure that they have
1319	   sufficient capacity.

1321	Acknowledgments
1322	   The authors acknowledge the significant technical contributions made
1323	   to this work by their colleagues: John Harper and Kevin Miles.

1325	IPR Disclosure Acknowledgement

1327	   By submitting this Internet-Draft, we certify that any applicable
1328	   patent or other IPR claims of which we are aware have been disclosed,
1329	   and any of which we become aware will be disclosed, in accordance
1330	   with RFC 3668.

1332	Normative References

1334	   Internet-drafts are works in progress available from
1335	   http://www.ietf.org/internet-drafts/

1337	Informative References

1339	   Internet-drafts are works in progress available from
1340	   http://www.ietf.org/internet-drafts/

1342	   BFD       Katz, D., and Ward, D., "Bidirectional Forwarding
1343	             Detection", draft-katz-ward-bfd-01.txt, August
1344	             2003 (work in progress).

1346	   IPSEC     Kent, S., and Atkinson, R., "Security Architecture
1347	             for the Internet Protocol", RFC 2401, November 1998.

1349	Authors' Addresses

1351	   Stewart Bryant
1352	   Cisco Systems,
1353	   250, Longwater Avenue,
1354	   Green Park,
1355	   Reading, RG2 6GB,
1356	   United Kingdom.             Email: stbryant@cisco.com

1358	   Clarence Filsfils
1359	   Cisco Systems,
1360	   De Kleetlaan 6a,
1361	   1831 Diegem,
1362	   Belgium                     Email: cfilsfil@cisco.com

1364	   Stefano Previdi
1365	   Cisco Systems,
1366	   Via Del Serafico 200
1367	   00142 Roma,
1368	   Italy                       Email: sprevidi@cisco.com

1370	   Mike Shand
1371	   Cisco Systems,
1372	   250, Longwater Avenue,
1373	   Green Park,
1374	   Reading, RG2 6GB,
1375	   United Kingdom.             Email: mshand@cisco.com

1377	Full Copyright statement

1379	   Copyright (C) The Internet Society (2004). All Rights Reserved.

1381	   This document is subject to the rights, licenses and restrictions
1382	   contained in BCP 78, and except as set forth therein, the authors
1383	   retain all their rights.

1385	   This document and the information contained herein are provided on an
1386	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1387	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1388	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1389	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1390	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1391	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.