idnits 2.17.1 

draft-ietf-rtgwg-lf-conv-frmwk-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.i or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices.  See
     https://trustee.ietf.org/license-info/.)


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 20, 2009) is 5296 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-13) exists of
     draft-ietf-rtgwg-ipfrr-framework-12

  == Outdated reference: A later version (-11) exists of
     draft-ietf-rtgwg-ipfrr-notvia-addresses-04

  == Outdated reference: A later version (-12) exists of
     draft-ietf-rtgwg-ordered-fib-02

  -- Obsolete informational reference (is this intentional?): RFC 1305
     (Obsoleted by RFC 5905)


     Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	RTGWG                                                           M. Shand
3	Internet-Draft                                                 S. Bryant
4	Intended status: Informational                             Cisco Systems
5	Expires: April 23, 2010                                 October 20, 2009

7	                 A Framework for Loop-free Convergence
8	                   draft-ietf-rtgwg-lf-conv-frmwk-07

10	Status of this Memo

12	   This Internet-Draft is submitted to IETF in full conformance with the
13	   provisions of BCP 78 and BCP 79.

15	   Internet-Drafts are working documents of the Internet Engineering
16	   Task Force (IETF), its areas, and its working groups.  Note that
17	   other groups may also distribute working documents as Internet-
18	   Drafts.

20	   Internet-Drafts are draft documents valid for a maximum of six months
21	   and may be updated, replaced, or obsoleted by other documents at any
22	   time.  It is inappropriate to use Internet-Drafts as reference
23	   material or to cite them other than as "work in progress."

25	   The list of current Internet-Drafts can be accessed at
26	   http://www.ietf.org/ietf/1id-abstracts.txt.

28	   The list of Internet-Draft Shadow Directories can be accessed at
29	   http://www.ietf.org/shadow.html.

31	   This Internet-Draft will expire on April 23, 2010.

33	Copyright Notice

35	   Copyright (c) 2009 IETF Trust and the persons identified as the
36	   document authors.  All rights reserved.

38	   This document is subject to BCP 78 and the IETF Trust's Legal
39	   Provisions Relating to IETF Documents in effect on the date of
40	   publication of this document (http://trustee.ietf.org/license-info).
41	   Please review these documents carefully, as they describe your rights
42	   and restrictions with respect to this document.

44	Abstract

46	   A micro-loop is a packet forwarding loop which may occur transiently
47	   among two or more routers in a hop by hop packet forwarding paradigm.

49	   This framework provides a summary of the causes and consequences of
50	   micro-loops and enables the reader to form a judgement on whether
51	   micro-looping is an issue that needs to be addressed in specific
52	   networks.  It also provides a survey of the currently proposed
53	   mechanisms that may be used to prevent or to suppress the formation
54	   of micro-loops when an IP or MPLS network undergoes topology change
55	   due to failure, repair or management action.  When sufficiently fast
56	   convergence is not available and the topology is susceptible to
57	   micro-loops, use of one or more of these mechanisms may be desirable.

59	Table of Contents

61	   1.  Introduction
62	   2.  The Nature of Micro-loops
63	   3.  Applicability
64	   4.  Micro-loop Control Strategies
65	   5.  Loop mitigation
66	     5.1.  Fast-convergence
67	     5.2.  PLSN
68	   6.  Micro-loop Prevention
69	     6.1.  Incremental Cost Advertisement
70	     6.2.  Nearside Tunneling
71	     6.3.  Farside Tunnels
72	     6.4.  Distributed Tunnels
73	     6.5.  Packet Marking
74	     6.6.  MPLS New Labels
75	     6.7.  Ordered FIB Update
76	     6.8.  Synchronised FIB Update
77	   7.  Using PLSN In Conjunction With Other Methods
78	   8.  Loop Suppression
79	   9.  Compatibility Issues
80	   10. Comparison of Loop-free Convergence Methods
81	   11. IANA Considerations
82	   12. Security Considerations
83	   13. Acknowledgments
84	   14. Informative References
85	   Authors' Addresses

87	1.  Introduction

89	   When there is a change to the network topology (due to the failure or
90	   restoration of a link or router, or as a result of management action)
91	   the routers need to converge on a common view of the new topology and
92	   the paths to be used for forwarding traffic to each destination.
93	   During this process, referred to as a routing transition, packet
94	   delivery between certain source/destination pairs may be disrupted.
95	   This occurs due to the time it takes for the topology change to be
96	   propagated around the network together with the time it takes each
97	   individual router to determine and then update the forwarding
98	   information base (FIB) for the affected destinations.  During this
99	   transition, packets may be lost due to the continuing attempts to use
100	   the failed component, and due to forwarding loops.  Forwarding loops
101	   arise due to the inconsistent FIBs that occur as a result of the
102	   difference in time taken by routers to execute the transition
103	   process.  This is a problem that may occur in both IP networks and
104	   MPLS networks that use the Label Distribution Protocol (LDP)
105	   [RFC5036] as the label switched path (LSP) signaling protocol.

107	   The service failures caused by routing transitions are largely hidden
108	   by higher-level protocols that retransmit the lost data.  However new
109	   Internet services could emerge which are more sensitive to the packet
110	   disruption that occurs during a transition.  To make the transition
111	   transparent to their users, these services would require a short
112	   routing transition.  Ideally, routing transitions would be completed
113	   in zero time with no packet loss.

115	   Regardless of how optimally the mechanisms involved have been
116	   designed and implemented, it is inevitable that a routing transition
117	   will take some minimum interval that is greater than zero.  This has
118	   led to the development of a traffic engineering (TE) fast-reroute
119	   mechanism for MPLS [RFC4090].  Alternative mechanisms that might be
120	   deployed in an MPLS network and mechanisms that may be used in an IP
121	   network are work in progress in the IETF
122	   [I-D.ietf-rtgwg-ipfrr-framework].  The repair mechanism may however
123	   be disrupted by the formation of micro-loops during the period
124	   between the time when the failure is announced, and the time when all
125	   FIBs have been updated to reflect the new topology.

127	   One method of mitigating the effects of micro-loops is to ensure that
128	   the network reconverges in a sufficiently short time that these
129	   effects are inconsequential.  Another method is to design the network
130	   topology to minimise or even eliminate the possibility of micro-
131	   loops.

133	   The propensity to form micro-loops is highly topology dependent and
134	   algorithms are available to identify which links in a network are
135	   subject to micro-looping.  In topologies which are critically
136	   susceptible to the formation of micro-loops, there is little point in
137	   introducing new mechanisms to provide fast re-route, without also
138	   deploying mechanisms that prevent the disruptive effects of micro-
139	   loops.  Unless micro-loop prevention is used in these topologies,
140	   packets may not reach the repair and micro-looping packets may cause
141	   congestion resulting in further packet loss.

143	   The disruptive effect of micro-loops is not confined to periods when
144	   there is a component failure.  Micro-loops can, for example, form
145	   when a component is put back into service following repair.  Micro-
146	   loops can also form as a result of a network maintenance action such
147	   as adding a new network component, removing a network component or
148	   modifying a link cost.

150	   This framework provides a summary of the causes and consequences of
151	   micro-loops and enables the reader to form a judgement on whether
152	   micro-looping is an issue that needs to be addressed in specific
153	   networks.  It also provides a survey of the currently proposed micro-
154	   loop mitigation mechanisms.  When sufficiently fast convergence is
155	   not available and the topology is susceptible to micro-loops, use of
156	   one or more of these mechanisms may be desirable.

158	2.  The Nature of Micro-loops

160	   A micro-loop is a packet forwarding loop which may occur transiently
161	   among two or more routers in a hop by hop packet forwarding paradigm.

163	   Micro-loops may form during the periods when a network is re-
164	   converging following ANY topology change, and are caused by
165	   inconsistent FIBs in the routers.  During the transition, micro-loops
166	   may occur over a single link between a pair of routers that
167	   temporarily use each other as the next hop for a prefix.  Micro-loops
168	   may also form when each router in a cycle of three or more routers
169	   has the next router in the cycle as a next hop for a given prefix.
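
   The two cases above (a pair of routers pointing at each other, and a
   longer cycle of next hops) can be illustrated with a small sketch.
   It assumes a snapshot of each router's next hop for a single prefix
   is available as a dictionary; the function name find_micro_loops and
   the router names are hypothetical and are not part of this framework.

```python
def find_micro_loops(next_hop):
    """Return the forwarding loops present in one FIB snapshot.

    next_hop maps each router to its next hop for a single prefix
    (None at the router that delivers the packet). This data layout
    is an assumption made for illustration.
    """
    loops = []
    in_known_loop = set()
    for start in next_hop:
        path, position = [], {}
        node = start
        # Follow next hops until delivery, a revisit, or a known loop.
        while node is not None and node not in position and node not in in_known_loop:
            position[node] = len(path)
            path.append(node)
            node = next_hop.get(node)
        if node is not None and node in position:
            cycle = path[position[node]:]
            loops.append(cycle)
            in_known_loop.update(cycle)
    return loops


# B has already switched to A for this prefix, but A still points at B.
transient_fibs = {"A": "B", "B": "A", "C": "A", "D": None}
print(find_micro_loops(transient_fibs))

# A cycle of three routers, each using the next one as its next hop.
print(find_micro_loops({"A": "B", "B": "C", "C": "A"}))
```

   Once all FIBs are consistent with the new topology, the same check
   returns an empty list.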

171	   Cyclic loops may occur if one or more of the following conditions are
172	   met:

174	   1.  Asymmetric link costs.

176	   2.  The existence of an equal cost path between a pair of routers
177	       which make different decisions regarding which path to use for
178	       forwarding to a particular destination.  Note that even routers
179	       which do not implement equal cost multi-path (ECMP) forwarding
180	       must make a choice between the available equal cost paths and
181	       unless they make the same choice the condition for cyclic loops
182	       will be fulfilled.

184	   3.  Topology changes affecting multiple links, including single node
185	       and line card failures.

187	   Micro-loops have two undesirable side-effects: congestion and repair
188	   starvation.

190	   o  A looping packet consumes bandwidth until it either escapes as a
191	      result of the re-synchronization of the FIBs, or its TTL expires.
192	      This transiently increases the traffic over a link by as much as
193	      128 times, and may cause the link to become congested.  This
194	      congestion reduces the bandwidth available to other traffic (which
195	      is not otherwise affected by the topology change).  As a result
196	      the "innocent" traffic using the link experiences increased
197	      latency, and is liable to congestive packet loss.

199	   o  In cases where the link or node failure has been protected by a
200	      fast re-route repair, an inconsistency in the FIBs may prevent
201	      some traffic from reaching the failure and hence being repaired.
202	      The repair may thus become starved of traffic and thereby rendered
203	      ineffective.
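
   The amplification figure in the first bullet can be reproduced with
   a short sketch.  It is an illustration only and rests on assumptions
   not stated in the text: the loop is over a single link between two
   routers, the packet enters the loop with 255 hops remaining (the
   largest value of an 8-bit TTL), each link crossing consumes one hop,
   and the loop outlasts the packet.  The name link_crossings is
   hypothetical.

```python
def link_crossings(ttl):
    """Count, per direction, how often one looping packet crosses a link.

    ttl is the number of hops the packet may still take when it first
    enters the two-router loop (an assumption of this sketch).
    """
    crossings = {"A->B": 0, "B->A": 0}
    direction = "A->B"
    while ttl > 0:
        crossings[direction] += 1
        ttl -= 1
        # The packet bounces back over the same link.
        direction = "B->A" if direction == "A->B" else "A->B"
    return crossings


print(link_crossings(255))
```

   Under these assumptions a single packet is carried 128 times in one
   direction and 127 times in the other before its TTL expires.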

205	   Although micro-loops are usually considered in the context of a
206	   failure, similar problems of congestive packet loss and starvation
207	   may also occur if the topology change is the result of management
208	   action.  For example, consider the case where a link is to be taken
209	   out of service by management action.  The link can be retained in
210	   service throughout the transition, thus avoiding the need for any
211	   repair.  However, if micro-loops form, they may cause congestion loss
212	   and may also prevent traffic from reaching the link.

214	   Unless otherwise controlled, micro-loops may form in any part of the
215	   network that forwards (or in the case of a new link, will forward)
216	   packets over a path that includes the affected topology change.  The
217	   time taken to propagate the topology change through the network, and
218	   the non-uniform time taken by each router to calculate the new
219	   shortest path tree (SPT) and update its FIB contribute to the
220	   duration of the packet disruption caused by the micro-loops.  In some
221	   cases a packet may be subject to disruption from micro-loops which
222	   occur sequentially at links along the path, thus further extending
223	   the period of disruption beyond that required to resolve a single
224	   loop.

226	3.  Applicability

228	   Loop free convergence techniques are applicable to any situation in
229	   which micro-loops may form, for example the convergence of a network
230	   following:

232	   1.  Component failure.

234	   2.  Component repair.

236	   3.  Management withdrawal of a component.

238	   4.  Management insertion of a component.

240	   5.  Management change of link cost (either positive or negative).

242	   6.  External cost change, for example change of external gateway as a
243	       result of a BGP change.

245	   7.  A Shared Risk Link Group (SRLG) failure.

247	   In each case, a component may be a link, a set of links or an entire
248	   router.  Throughout this document we use the term SRLG when
249	   describing the procedure to be followed when multiple failures have
250	   occurred whether or not they are members of an explicit SRLG.  In the
251	   case of multiple independent failures, the loop prevention method
252	   described for SRLG may be used provided it is known that all of these
253	   failures have been repaired.

255	   Loop free convergence techniques are applicable to both IP networks
256	   and MPLS enabled networks that use LDP, including LDP networks that
257	   use the single-hop tunnel fast-reroute mechanism.

259	   An assessment of whether loop free convergence techniques are
260	   required should take into account whether or not the interior gateway
261	   protocol (IGP) convergence is sufficiently fast that any micro-loops
262	   are of such short duration that they are not disruptive, and whether
263	   or not the topology is such that micro-loops are likely to form.

265	4.  Micro-loop Control Strategies

267	   Micro-loop control strategies fall into four basic classes:

269	   1.  Micro-loop mitigation

271	   2.  Micro-loop prevention

273	   3.  Micro-loop suppression

274	   4.  Network design to minimise micro-loops

276	   A micro-loop mitigation scheme works by re-converging the network in
277	   such a way that it reduces, but does not eliminate, the formation of
278	   micro-loops.  Such schemes cannot guarantee the productive forwarding
279	   of packets during the transition.

281	   A micro-loop prevention mechanism controls the re-convergence of the
282	   network in such a way that no micro-loops form.  Such a micro-loop
283	   prevention mechanism allows the continued use of any fast repair
284	   method until the network has converged on its new topology, and
285	   prevents the collateral damage that occurs to other traffic for the
286	   duration of each micro-loop.

288	   A micro-loop suppression mechanism attempts to eliminate the
289	   collateral damage caused by micro-loops to other traffic.  This may
290	   be achieved by, for example, using a packet monitoring method that
291	   detects that a packet is looping and drops it.  Such schemes make no
292	   attempt to productively forward the packet throughout the network
293	   transition.

295	   Highly meshed topologies are less susceptible to micro-loops, thus
296	   networks may be designed to minimise the occurrence of micro-loops by
297	   appropriate link placement and metric settings.  However, this
298	   approach may conflict with other design requirements such as cost and
299	   traffic planning and may not accurately track the evolution of the
300	   network, or temporary changes due to outages.

302	   Note that all known micro-loop prevention mechanisms and most micro-
303	   loop mitigation mechanisms extend the duration of the re-convergence
304	   process.  When the failed component is protected by a fast re-route
305	   repair this implies that the converging network requires the repair
306	   to remain in place for longer than would otherwise be the case.  The
307	   extended convergence time means any traffic which is not repaired by
308	   an imperfect repair experiences a significantly longer outage than it
309	   would experience with conventional convergence.

311	   When a component is returned to service, or when a network management
312	   action has taken place, this additional delay does not cause traffic
313	   disruption, because there is no repair involved.  However the
314	   extended delay is undesirable, because it increases the time that the
315	   network takes to be ready for another failure, and hence leaves it
316	   vulnerable to multiple failures.

318	5.  Loop mitigation

320	   There are two approaches to loop mitigation.

322	   o  Fast-convergence

324	   o  A purpose designed loop mitigation mechanism

326	5.1.  Fast-convergence

328	   The duration of micro-loops is dependent on the speed of convergence.
329	   Improving the speed of convergence may therefore be seen as a loop
330	   mitigation technique.

332	5.2.  PLSN

334	   The only known purpose designed loop mitigation approach is the Path
335	   Locking with Safe-Neighbors (PLSN) method described in
336	   [I-D.ietf-rtgwg-microloop-analysis].  In this method, a micro-loop
337	   free next-hop safety condition is defined as follows:

339	   In a symmetric cost network, it is safe for router X to change to the
340	   use of neighbor Y as its next-hop for a specific destination if the
341	   path through Y to that destination satisfies both of the following
342	   criteria:

344	   1.  X considers Y as its loop-free neighbor based on the topology
345	       before the change AND

347	   2.  X considers Y as its downstream neighbor based on the topology
348	       after the change.

350	   In an asymmetric cost network, a stricter safety condition is needed,
351	   and the criterion is that:

353	      X considers Y as its downstream neighbor based on the topology
354	      both before and after the change.

356	   Based on these criteria, destinations are classified by each router
357	   into three classes:

359	   o  Type A destinations: Destinations unaffected by the change (type
360	      A1) and also destinations whose next hop after the change
361	      satisfies the safety criteria (type A2).

363	   o  Type B destinations: Destinations that cannot be sent via the new
364	      primary next-hop because the safety criteria are not satisfied,
365	      but which can be sent via another next-hop that does satisfy the
366	      safety criteria.

368	   o  Type C destinations: All other destinations.
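
   The symmetric-cost safety condition and the three destination
   classes can be sketched as follows.  This is an illustrative sketch
   under stated assumptions, not the procedure of the PLSN draft.  It
   assumes the usual shortest-path inequalities for the two neighbor
   relations: Y is a loop-free neighbor of X for destination D if
   dist(Y,D) < dist(Y,X) + dist(X,D), and Y is a downstream neighbor
   if dist(Y,D) < dist(X,D).  It also approximates "unaffected by the
   change" as "primary next hop unchanged".  All function names and the
   example topology are hypothetical.

```python
import heapq

INF = float("inf")


def spf(graph, src):
    """Dijkstra over graph = {node: {neighbor: cost}}; returns distances."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, INF):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist


def primary_next_hop(graph, dist, x, dest):
    """Lowest-cost neighbor of x towards dest (ties broken by name)."""
    return min(graph[x], key=lambda y: (graph[x][y] + dist[y].get(dest, INF), y))


def is_safe(d_old, d_new, x, y, dest):
    """Symmetric-cost condition: loop-free before AND downstream after."""
    loop_free_before = (d_old[y].get(dest, INF)
                        < d_old[y].get(x, INF) + d_old[x].get(dest, INF))
    downstream_after = d_new[y].get(dest, INF) < d_new[x].get(dest, INF)
    return loop_free_before and downstream_after


def classify(old, new, x, dest):
    """Classify dest at router x as 'A1', 'A2', 'B' or 'C'.

    Assumes both topologies contain the same set of routers.
    """
    d_old = {n: spf(old, n) for n in old}
    d_new = {n: spf(new, n) for n in new}
    old_nh = primary_next_hop(old, d_old, x, dest)
    new_nh = primary_next_hop(new, d_new, x, dest)
    if new_nh == old_nh:
        return "A1"
    if is_safe(d_old, d_new, x, new_nh, dest):
        return "A2"
    if any(is_safe(d_old, d_new, x, y, dest) for y in new[x] if y != new_nh):
        return "B"
    return "C"


# Example: symmetric costs, link B-D fails.
before = {
    "A": {"B": 1, "C": 1, "E": 3},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "D": 4},
    "D": {"B": 1, "C": 4, "E": 3},
    "E": {"A": 3, "D": 3},
}
after = {
    "A": {"B": 1, "C": 1, "E": 3},
    "B": {"A": 1},
    "C": {"A": 1, "D": 4},
    "D": {"C": 4, "E": 3},
    "E": {"A": 3, "D": 3},
}
for router in ("A", "B", "C"):
    print(router, classify(before, after, router, "D"))
```

   In this example router C can move to its new next hop at once (type
   A2), router A has a safe but non-shortest alternative via E (type
   B), and router B has no safe neighbor for destination D (type C).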

370	   Following a topology change, Type A destinations are immediately
371	   changed to go via the new topology.  Type B destinations are
372	   immediately changed to go via the next hop that satisfies the safety
373	   criteria, even though this is not the shortest path.  Type B
374	   destinations continue to go via this path until all routers have
375	   changed their Type C destinations over to the new next hop.  Routers
376	   must not change their Type C destinations until all routers have
377	   changed their Type A2 and Type B destinations to the new or
378	   intermediate (safe) next hop.

380	   Simulations indicate that this approach produces a significant
381	   reduction in the number of links that are subject to micro-looping.
382	   However, unlike the micro-loop prevention methods, it is only a
383	   partial solution.  In particular, micro-loops may form on any link
384	   joining a pair of type C routers.

386	   Because routers delay updating their Type C destination FIB entries,
387	   they will continue to route towards the failure during the time when
388	   the routers are changing their Type A and B destinations, and hence
389	   will continue to productively forward packets provided that viable
390	   repair paths exist.

392	   A backwards compatibility issue arises with PLSN.  If a router is not
393	   capable of micro-loop control, it will not correctly delay its FIB
394	   update.  If all such routers had only type A destinations this loop
395	   mitigation mechanism would work as it was designed.  Alternatively,
396	   if all such incapable routers had only type C destinations, the
397	   "loop-prevention" announcement mechanism used to trigger the tunnel
398	   based schemes (see Sections 6.2 to 6.4) could be used to cause the
399	   Type A and Type B destinations to be changed, with the incapable
400	   routers and routers having type C destinations delaying until they
401	   received the "real" announcement.  Unfortunately, these two
402	   approaches are mutually incompatible.

404	   Note that simulations indicate that in most topologies treating type
405	   B destinations as type C results in only a small degradation in loop
406	   prevention.  Also note that simulation results indicate that in
407	   production networks where some, but not all, links have asymmetric
408	   costs, using the stricter asymmetric cost criterion actually reduces
409	   the number of loop free destinations, because fewer destinations can
410	   be classified as type A or B.

412	   This mechanism operates identically for

414	   o  events that degrade the topology (e.g. link failure),

416	   o  events that improve the topology (e.g. link restoration), and

417	   o  shared risk link group (SRLG) failure.

419	6.  Micro-loop Prevention

421	   Eight micro-loop prevention methods have been proposed:

423	   1.  Incremental cost advertisement

425	   2.  Nearside tunneling

427	   3.  Farside tunneling

429	   4.  Distributed tunnels

431	   5.  Packet marking

433	   6.  New MPLS labels

435	   7.  Ordered FIB update

437	   8.  Synchronized FIB update

439	6.1.  Incremental Cost Advertisement

441	   When a link fails, the cost of the link is normally changed from its
442	   assigned metric to "infinity" in one step.  However, it can be proved
443	   [OPT] that no micro-loops will form if the link cost is increased in
444	   suitable increments, and the network is allowed to stabilize before
445	   the next cost increment is advertised.  Once the link cost has been
446	   increased to a value greater than that of the lowest alternative cost
447	   around the link, the link may be disabled without causing a micro-
448	   loop.

450	   The criterion for a link cost change to be safe is that any link
451	   which is subjected to a cost change of x can only cause loops in a
452	   part of the network that has a cyclic cost less than or equal to x.
453	   Because there may exist links which have a cost of one in each
454	   direction, resulting in a cyclic cost of two, this can result in the
455	   link cost having to be raised in increments of one.  However the
456	   increment can be larger where the minimum cost permits.  Recent work
457	   [OPT] has shown that there are a number of optimizations which can be
458	   applied to the problem in order to determine the exact set of cost
459	   values required and hence minimize the number of increments.
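
   A minimal, unoptimised sketch of the stepping rule described above
   follows.  It assumes that the largest safe increment is one less
   than the minimum cyclic cost in the network (so a minimum cyclic
   cost of two gives increments of one, as in the text), and that the
   last advertised cost is one above the lowest alternative cost around
   the link.  The function and parameter names are hypothetical, and
   the optimizations of [OPT] are not modelled.

```python
def cost_schedule(metric, alt_cost, min_cyclic_cost):
    """List the link costs to advertise, one per convergence round.

    metric          -- currently advertised cost of the failing link
    alt_cost        -- lowest alternative path cost around the link
    min_cyclic_cost -- smallest cyclic cost in the network (assumed known)
    """
    step = min_cyclic_cost - 1
    if step < 1:
        raise ValueError("min_cyclic_cost must be at least 2")
    target = alt_cost + 1          # first cost at which the link is unused
    schedule = []
    cost = metric
    while cost < target:
        cost = min(cost + step, target)
        schedule.append(cost)      # wait for the network to stabilize here
    return schedule


print(cost_schedule(metric=10, alt_cost=15, min_cyclic_cost=2))
print(cost_schedule(metric=10, alt_cost=15, min_cyclic_cost=4))
```

   With a minimum cyclic cost of two the sketch needs six rounds for
   this link, whereas a minimum cyclic cost of four needs only two,
   which shows why the number of increments depends so strongly on the
   metrics in use.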

461	   It will be appreciated that when a link is returned to service, its
462	   cost is reduced in small steps from "infinity" to its final cost,
463	   thereby providing similar micro-loop prevention during a "good-news"
464	   event.  Note that the link cost may be decreased from "infinity" to
465	   any value greater than that of the lowest alternative cost around the
466	   link in one step without causing a micro-loop.

468	   When the failure is an SRLG the link cost increments must be
469	   coordinated across all failing members of the SRLG.  This may be
470	   achieved by completing the transition of one link before starting the
471	   next, or by interleaving the changes.

473	   The incremental cost change approach has the advantage over all other
474	   currently known loop prevention schemes that it requires no change to
475	   the routing protocol.  It will work in any network because it does
476	   not require any co-operation from the other routers in the network.

478	   Where the micro-loop prevention mechanism is being used to support a
479	   planned reconfiguration of the network, the extended total
480	   reconvergence time resulting from the multiple increments is of
481	   limited consequence, particularly where the number of increments has
482	   been optimized.  This, together with the ability to implement this
483	   technique in isolation, makes this method a good candidate for use
484	   with such management initiated changes.

486	   Where the micro-loop prevention mechanism is being used to support
487	   failure recovery, the number of increments required, and hence the
488	   time taken to fully converge, is significant even for small numbers
489	   of increments.  This is because, for the duration of the transition,
490	   some parts of the network continue to use the old forwarding path,
491	   and hence use any repair mechanism for an extended period.  In the
492	   case of a failure that cannot be fully repaired, some destinations
493	   may therefore become unreachable for an extended period.  In addition
494	   the network may be vulnerable to a second failure for the duration of
495	   the controlled re-convergence.

497	   Where large metrics are used and no optimization (such as that
498	   described above) is performed, the incremental cost method can be
499	   extremely slow.  However in cases where the per link metric is small,
500	   either because small values have been assigned by the network
501	   designers, or because of restrictions implicit in the routing
502	   protocol (e.g.  RIP restricts the metric, and BGP using the AS path
503	   length frequently uses an effective metric of one, or a very small
504	   integer for each inter AS hop), the number of required increments can
505	   be acceptably small even without optimizations.

507	6.2.  Nearside Tunneling

509	   This mechanism works by creating an overlay network using tunnels
510	   whose path is not affected by the topology change and carrying the
511	   traffic affected by the change in that new network.  When all the
512	   traffic is in the new, tunnel based, network, the real network is
513	   allowed to converge on the new topology.  Because all the traffic
514	   that would be affected by the change is carried in the overlay
515	   network no micro-loops form.

517	   When a failure is detected (or a link is withdrawn from service), the
518	   router adjacent to the failure issues a new "loop-prevention" routing
519	   message announcing the topology change.  This message is propagated
520	   through the network by all routers, but is only understood by routers
521	   capable of using one of the tunnel based micro-loop prevention
522	   mechanisms.

524	   Each of the micro-loop preventing routers builds a tunnel to the
525	   closest router adjacent to the failure.  They then determine which of
526	   their traffic would transit the failure and place that traffic in the
527	   tunnel.  When all of these tunnels are in place (determined, for
528	   example, by waiting a suitable interval) the failure is announced as
529	   normal.  Because these tunnels will be unaffected by the transition,
530	   and because the routers protecting the link will continue the repair
531	   (or forward across the link being withdrawn), no traffic will be
532	   disrupted by the failure.  When the network has converged these
533	   tunnels are withdrawn, allowing traffic to be forwarded along its new
534	   "natural" path.  The order of tunnel insertion and withdrawal is not
535	   important, provided that the tunnels are all in place before the
536	   normal announcement is issued, and provided that the repair remains
537	   in place until normal convergence has completed.
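
   The step in which each router decides which traffic to place in a
   tunnel, and to which router, can be sketched as follows.  This is an
   illustration under assumptions that go beyond the text: the topology
   is connected, a destination "would transit the failure" if some
   shortest path to it crosses the failing link, and the tunnel
   endpoint is the end of that link nearer to the tunneling router.
   The function names and the example topology are hypothetical.

```python
import heapq


def spf(graph, src):
    """Dijkstra over graph = {node: {neighbor: cost}}; returns distances."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist


def nearside_tunnel_plan(graph, x, failing_link):
    """Map each affected destination to the near-side tunnel endpoint.

    graph is the topology before the change; failing_link is a pair
    of adjacent routers (u, v).
    """
    dist = {n: spf(graph, n) for n in graph}
    u, v = failing_link
    plan = {}
    for dest in graph:
        if dest == x:
            continue
        for near, far in ((u, v), (v, u)):
            via_link = dist[x][near] + graph[near][far] + dist[far][dest]
            # Routers adjacent to the failure repair locally, no tunnel.
            if via_link == dist[x][dest] and near != x:
                plan[dest] = near
    return plan


topology = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "D": 4},
    "D": {"B": 1, "C": 4},
}
print(nearside_tunnel_plan(topology, "A", ("B", "D")))
print(nearside_tunnel_plan(topology, "C", ("B", "D")))
```

   In this example both A and C reach D across the failing link B-D,
   so each places its traffic for D in a tunnel to B, the router on the
   near side of the failure.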

539	   This method completes in bounded time, and is generally much faster
540	   than the incremental cost method.  Depending on the exact design, it
541	   completes in two or three flood-SPF-FIB update cycles.

543	   At the time at which the failure is announced as normal, micro-loops
544	   may form within isolated islands of non-micro-loop preventing
545	   routers.  However, only traffic entering the network via such routers
546	   can micro-loop.  All traffic entering the network via a micro-loop
547	   preventing router will be tunneled correctly to the nearest repairing
548	   router, including, if necessary, being tunneled via a non-micro-loop
549	   preventing router, and will not micro-loop.

551	   Where there is no requirement to prevent the formation of micro-loops
552	   involving non-micro-loop preventing routers, a single, "normal"
553	   announcement may be made, and a local timer used to determine the
554	   time at which transition from tunneled forwarding to normal
555	   forwarding over the new topology may commence.

557	   This technique has the disadvantage that it requires traffic to be
558	   tunneled during the transition.  This is an issue in IP networks
559	   because not all router designs are capable of high performance IP
560	   tunneling.  It is also an issue in MPLS networks because the
561	   encapsulating router has to know the label set that the decapsulating
562	   router is distributing.

564	   A further disadvantage of this method is that it requires co-
565	   operation from all the routers within the routing domain to fully
566	   protect the network against micro-loops.

568	   When a new link is added, the mechanism is run in "reverse".  When
569	   the loop-prevention announcement is heard, routers determine which
570	   traffic they will send over the new link, and tunnel that traffic to
571	   the router on the near side of that link.  This path will not be
572	   affected by the presence of the new link.  When the "normal"
573	   announcement is heard, they then update their FIB to send the traffic
574	   normally according to the new topology.  Any traffic encountering a
575	   router that has not yet updated its FIB will be tunneled to the near
576	   side of the link, and will therefore not loop.

578	   When a management change to the topology is required, again exactly
579	   the same mechanism protects against micro-looping of packets by the
580	   micro-loop preventing routers.

582	   When the failure is an SRLG, the required strategy is to classify
583	   traffic according to the furthest failing member of the SRLG that it
584	   will traverse on its way to the destination, and to tunnel that
585	   traffic to the repairing router for that SRLG member.  This will
586	   require multiple tunnel destinations, in the limiting case, one per
587	   SRLG member.
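
	   A minimal sketch of this classification follows.  It assumes
	   that the router already knows the old-topology path to each
	   destination as a list of router names, and that the repairing
	   router for an SRLG member is the router on the near side of that
	   member in the direction of travel.  The function name and data
	   layout are hypothetical.

```python
def repairing_router_for(path, srlg_links):
    """Return the tunnel destination for traffic that follows `path`.

    path       -- old-topology path, e.g. ["A", "B", "C"]
    srlg_links -- iterable of (router, router) pairs that fail together
    Returns the near-side router of the furthest failing SRLG member
    on the path, or None if the path crosses no member of the SRLG.
    """
    members = {frozenset(link) for link in srlg_links}
    furthest = None
    for near, far in zip(path, path[1:]):
        if frozenset((near, far)) in members:
            furthest = near  # later matches overwrite earlier ones
    return furthest
```

	   Grouping destinations by the value returned gives one traffic
	   class, and hence one tunnel destination, per SRLG member that is
	   actually traversed.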

589	6.3.  Farside Tunnels

591	   Farside tunneling loop prevention requires the loop preventing
592	   routers to place all of the traffic that would traverse the failure
593	   in one or more tunnels terminating at the router (or, in the case of
594	   node failure, routers) at the far side of the failure.  The properties
595	   of this method are a more uniform distribution of repair traffic than
596	   is achieved using the nearside tunnel method, and in the case of
597	   node failure, a reduction in the decapsulation load on any single
598	   router.

600	   Unlike the nearside tunnel method (which uses normal routing to the
601	   repairing router), this method requires the use of a repair path to
602	   the farside router.  This may be provided by the not-via
603	   [I-D.ietf-rtgwg-ipfrr-notvia-addresses] mechanism, in which case no
604	   further computation is needed.

606	   The mode of operation is otherwise identical to the nearside
607	   tunneling loop prevention method (Section 6.2).

609	6.4.  Distributed Tunnels

611	   In the distributed tunnels loop prevention method, each router
612	   calculates its own repair and forwards traffic affected by the
613	   failure using that repair.  Unlike the FRR case, the actual failure
614	   is known at the time of the calculation.  The objective of the loop
615	   preventing routers is to get the packets that would have gone via the
616	   failure into Q-space [I-D.bryant-ipfrr-tunnels] using routers that
617	   are in P-space.  Because packets are decapsulated on entry to
618	   Q-space, rather than being forced to go to the farside of the
619	   failure, more optimal routing may be achieved.  This method is
620	   subject to the same reachability constraints described in
621	   [I-D.bryant-ipfrr-tunnels].

623	   The mode of operation is otherwise identical to the nearside
624	   tunneling loop prevention method (Section 6.2).

626	   An alternative distributed tunnel mechanism is for all routers to
627	   tunnel to the not-via address [I-D.ietf-rtgwg-ipfrr-notvia-addresses]
628	   associated with the failure.

630	6.5.  Packet Marking

632	   If packets could be marked in some way, this information could be
633	   used to assign them to one of:

635	   o  the new topology,

637	   o  the old topology or

639	   o  a transition topology.

641	   They would then be correctly forwarded during the transition.  This
642	   mechanism works identically for both "bad-news" and "good-news"
643	   events.  It also works identically for SRLG failure.  There are three
644	   problems with this solution:

646	   o  A packet marking bit may not be available; for example, a network
647	      supporting both the differentiated services architecture [RFC2475]
648	      and explicit congestion notification [RFC3168] uses all eight bits
649	      of the IPv4 Type of Service field.

651	   o  The mechanism would introduce a non-standard forwarding procedure.

653	   o  Packet marking using either the old or the new topology would
654	      double the size of the FIB; however, some optimizations may be
655	      possible.

657	6.6.  MPLS New Labels

659	   In an MPLS network that is using RFC5036 [RFC5036] for label
660	   distribution, loop free convergence can be achieved through the use
661	   of new labels when the path that a prefix will take through the
662	   network changes.

664	   As described in Section 6.2, the repairing routers issue a loop-
665	   prevention announcement to start the loop free convergence process.
666	   All loop preventing routers calculate the new topology and determine
667	   whether their FIB needs to be changed.  If there is no change in the
668	   FIB they take no part in the following process.

670	   The routers that need to make a change to their FIB consider each
671	   change and check the new next hop to determine whether it will use a
672	   path in the OLD topology which reaches the destination without
673	   traversing the failure (i.e. the next hop is in P-space with respect
674	   to the failure [I-D.bryant-ipfrr-tunnels]).  If so the FIB entry can
675	   be immediately updated.  For all of the remaining FIB entries, the
676	   router issues a new label to each of its neighbors.  This new label
677	   is used to lock the path during the transition in a similar manner to
678	   the previously described loop-free convergence with tunnels method
679	   (Section 6.2).  Routers receiving a new label install it in their
680	   FIB, for MPLS label translation, but do not yet remove the old label
681	   and do not yet use this new label to forward IP packets, i.e., they
682	   prepare to forward using the new label on the new path, but do not
683	   use it yet.  Any packets received continue to be forwarded the old
684	   way, using the old labels, towards the repair.

686	   At some time after the loop-prevention announcement, a normal routing
687	   announcement of the failure is issued.  This announcement must not be
688	   issued until such time as all routers have carried out all of their
689	   loop-prevention announcement triggered activities.  On receipt of the
690	   normal announcement all routers that were delaying convergence move
691	   to their new path for both the new and the old labels.  This involves
692	   changing the IP address entries to use the new labels, AND changing
693	   the old labels to forward using the new labels.

695	   Because the new label path was installed during the loop-prevention
696	   phase, packets reach their destinations as follows:

698	   o  If they do not go via any router using a new label they go via the
699	      repairing router and the repair.

701	   o  If they meet any router that is using the new labels they get
702	      marked with the new labels and reach their destination using the
703	      new path, back-tracking if necessary.

705	   When all routers have changed to the new path the network is
706	   converged.  At some later time, when it can be assumed that all
707	   routers have moved to using the new path, the FIB can be cleaned up
708	   to remove the, now redundant, old labels.

710	   As with other methods, the new labels may be modified to
711	   provide loop prevention for "good news".  There are also a number of
712	   optimizations of this method.

714	6.7.  Ordered FIB Update

716	   The Ordered FIB loop prevention method is described in OFIB
717	   [I-D.ietf-rtgwg-ordered-fib].  Micro-loops occur following a failure
718	   or a cost increase, when a router closer to the failed component
719	   revises its routes to take account of the failure before a router
720	   which is further away.  By analyzing the reverse shortest path tree
721	   (rSPT) over which traffic is directed to the failed component in the
722	   old topology, it is possible to determine a strict ordering which
723	   ensures that nodes closer to the root always process the failure
724	   after any nodes further away, and hence micro-loops are prevented.

726	   When the failure has been announced, each router waits a multiple of
727	   the convergence timer [I-D.atlas-bryant-shand-lf-timers].  The
728	   multiple is determined by the node's position in the rSPT, and the
729	   delay value is chosen to guarantee that a node can complete its
730	   processing within this time.  The convergence time may be reduced by
731	   employing a signaling mechanism to notify the parent when all the
732	   children have completed their processing, and hence when it is safe
733	   for the parent to instantiate its new routes.
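
	   As an illustration, the sketch below derives a per-router delay
	   from the rSPT.  It assumes that a node's multiple (its rank) is
	   the depth of the subtree below it, so that leaf routers have
	   rank zero; this agrees with the ranks quoted in Section 7, but
	   the authoritative definition is in the OFIB draft.  The function
	   names are hypothetical.

```python
def ofib_ranks(parent):
    """Compute a rank for every router in the rSPT.

    parent -- maps each router to its next hop towards the failed
              component in the OLD topology (the root maps to None)
    A leaf has rank 0; any other node has rank 1 + max rank of its children.
    """
    children = {}
    for node, par in parent.items():
        children.setdefault(par, []).append(node)

    ranks = {}

    def height(node):
        if node not in ranks:
            ranks[node] = max(
                (height(child) + 1 for child in children.get(node, [])),
                default=0,
            )
        return ranks[node]

    for node in parent:
        height(node)
    return ranks


def update_delays(parent, convergence_timer):
    """Delay each router by rank * convergence_timer before its FIB update."""
    return {n: r * convergence_timer for n, r in ofib_ranks(parent).items()}
```

	   For the rSPT {"X": None, "R": "X", "S": "X", "T": "S"}, routers
	   R and T have rank 0, S has rank 1 and X has rank 2, so X updates
	   last.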

735	   The property of this approach is therefore that it imposes a delay
736	   which is bounded by the network diameter although in many cases it
737	   will be much less.

739	   When a link is returned to service the convergence process above is
740	   reversed.  A router first determines its distance (in hops) from the
741	   new link in the NEW topology.  Before updating its FIB, it then waits
742	   a time equal to the value of that distance multiplied by the
743	   convergence timer.
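
	   The delay for the link-up case can be sketched as a breadth-first
	   search over the NEW topology.  The sketch assumes that the
	   distance from the new link is measured to the nearer of its two
	   endpoints, and that the endpoints themselves are at distance
	   zero; the function name and the adjacency-list layout are
	   hypothetical.

```python
from collections import deque


def link_up_delays(adjacency, new_link, convergence_timer):
    """Return {router: delay} for a link being returned to service.

    adjacency -- NEW topology as {router: [neighbor, ...]}, including
                 the new link itself
    new_link  -- (router, router) pair naming the endpoints of the new link
    """
    dist = {end: 0 for end in new_link}
    queue = deque(new_link)
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1  # hop count from the new link
                queue.append(nbr)
    return {router: hops * convergence_timer for router, hops in dist.items()}
```

	   Routers adjacent to the new link therefore update first, and
	   routers further away update progressively later.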

745	   It will be seen that network management actions can similarly be
746	   undertaken by treating a cost increase in a manner similar to a
747	   failure and a cost decrease similar to a restoration.

749	   The ordered FIB mechanism requires all nodes in the domain to operate
750	   according to these procedures, and the presence of non co-operating
751	   nodes can give rise to loops for any traffic which traverses them
752	   (not just traffic which is originated through them).  Without
753	   additional mechanisms these loops could remain in place for a
754	   significant time.

756	   It should be noted that this method requires per router ordering, but
757	   not per prefix ordering.  A router must wait its turn to update its
758	   FIB, but it should then update its entire FIB.

760	   When an SRLG failure occurs a router must classify traffic into the
761	   classes that pass over each member of the SRLG.  Each router is then
762	   independently assigned a ranking with respect to each SRLG member for
763	   which they have a traffic class.  These rankings may be different for
764	   each traffic class.  The prefixes of each class are then changed in
765	   the FIB according to the ordering of their specific ranking.  Again,
766	   as for the single failure case, signaling may be used to speed up the
767	   convergence process.

769	   Note that the special SRLG case of a full or partial node failure
770	   can be dealt with without using per prefix ordering, by running a
771	   single reverse SPF computation rooted at the failed node (or common
772	   point of the subset of failing links in the partial case).

774	   There are two classes of signaling optimization that can be applied
775	   to the ordered FIB loop-prevention method:

777	   o  When the router makes NO change, it can signal immediately.  This
778	      significantly reduces the time taken by the network to process
779	      long chains of routers that have no change to make to their FIB.

781	   o  When a router HAS changed, it can signal that it has completed.
782	      This is more problematic since this may be difficult to determine,
783	      particularly in a distributed architecture, and the optimization
784	      obtained is the difference between the actual time taken to make
785	      the FIB change and the worst case timer value.  This saving could
786	      be of the order of one second per hop.

788	   There is another method of executing ordered FIB which is based on
789	   pure signaling [SIG].  Methods that use signaling as an optimization
790	   are safe because eventually they fall back on the established IGP
791	   mechanisms which ensure that networks converge under conditions of
792	   packet loss.  However a mechanism that relies on signaling in order
793	   to converge requires a reliable signaling mechanism which must be
794	   proven to recover from any failure circumstance.

796	6.8.  Synchronised FIB Update

798	   Micro-loops form because of the asynchronous nature of the FIB update
799	   process during a network transition.  In many router architectures it
800	   is the time taken to update the FIB itself that is the dominant term.

802	   One approach would be to have two FIBs and, in a synchronized action
803	   throughout the network, to switch from the old to the new.  One way
804	   to achieve this synchronized change would be to signal or otherwise
805	   determine the wall clock time of the change, and then execute the
806	   change at that time, using NTP [RFC1305] to synchronize the wall
807	   clocks in the routers.
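
	   A minimal sketch of the two-FIB arrangement follows.  The class
	   and parameter names are hypothetical.  The agreed change time is
	   assumed to have been signaled in advance, and the clock_offset
	   parameter models the residual error of a router's synchronized
	   wall clock.

```python
class DualFib:
    """Holds the old and the new FIB and selects between them by time."""

    def __init__(self, old_fib, new_fib, switch_at, clock_offset=0.0):
        self.old_fib = old_fib
        self.new_fib = new_fib
        self.switch_at = switch_at        # agreed wall clock time of the change
        self.clock_offset = clock_offset  # local clock error, in seconds

    def next_hop(self, dest, true_time):
        # The router can only consult its own (imperfect) wall clock.
        local_time = true_time + self.clock_offset
        fib = self.new_fib if local_time >= self.switch_at else self.old_fib
        return fib[dest]
```

	   Two routers with different clock_offset values select different
	   FIBs for a short interval around switch_at; that interval is the
	   window in which a micro-loop can still form.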

809	   This approach has a number of major issues.  Firstly, two complete
810	   FIBs are needed, which may create a scaling issue, and secondly, a
811	   suitable network-wide synchronization method is needed.  However,
812	   neither of these is an insurmountable problem.

814	   Since the FIB change synchronization will not be perfect there may be
815	   some interval during which micro-loops form.  Whether this scheme is
816	   classified as a micro-loop prevention mechanism or a micro-loop
817	   mitigation mechanism within this taxonomy is therefore dependent on
818	   the degree of synchronization achieved.

820	   This mechanism works identically for both "bad-news" and "good-news"
821	   events.  It also works identically for SRLG failure.  Further
822	   consideration needs to be given to interoperating with routers that
823	   do not support this mechanism.  Without a suitable interoperating
824	   mechanism, loops may form for the duration of the synchronization
825	   delay.

827	7.  Using PLSN In Conjunction With Other Methods

829	   All of the tunnel methods and packet marking can be combined with
830	   PLSN (Section 5.2) [I-D.ietf-rtgwg-microloop-analysis] to reduce the
831	   traffic that needs to be protected by the advanced method.
832	   Specifically all traffic could use PLSN except traffic between a pair
833	   of routers both of which consider the destination to be type C. The
834	   type C to type C traffic would be protected from micro-looping
835	   through the use of a loop prevention method.

837	   However, determining whether the new next hop router considers a
838	   destination to be type C may be computationally intensive.  An
839	   alternative approach would be to use a loop prevention method for all
840	   local type C destinations.  This would not require any additional
841	   computation, but would require the additional loop prevention method
842	   to be used in cases which would not have generated loops (i.e. when
843	   the new next-hop router considered this to be a type A or B
844	   destination).

846	   The amount of traffic that would use PLSN is highly dependent on the
847	   network topology and the specific change, but would be expected to be
848	   in the region of 70% to 90% in typical networks.

850	   However, PLSN cannot be combined safely with Ordered FIB.  Consider
851	   the network fragment shown below:

853	                      R
854	                     /|\
855	                    / | \
856	                  1/ 2|  \3
857	                  /   |   \    cost S->T = 10
858	           Y-----X----S----T   cost T->S = 1
859	           |  1     2      |
860	           |1              |
861	           D---------------+
862	                  20

864	   On failure of link XY, according to PLSN, S will regard R as a safe
865	   neighbor for traffic to D. However the ordered FIB rank of both R and
866	   T will be zero and hence these can change their FIBs during the same
867	   time interval.  If R changes before T, then a loop will form around
868	   R, T and S. This can be prevented by using a stronger safety
869	   condition than PLSN currently specifies, at the cost of introducing
870	   more type C routers, and hence reducing the PLSN coverage.
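
	   The loop can be reproduced with a short shortest-path
	   calculation.  The link costs below are one reading of the figure
	   (R-X 1, R-S 2, R-T 3, Y-X 1, X-S 2, Y-D 1, T-D 20, S->T 10,
	   T->S 1) and should be treated as an assumption; the function
	   names are hypothetical.

```python
import heapq


def next_hops(links, dest):
    """Return {router: next hop towards dest}; links maps (u, v) -> cost."""
    incoming = {}
    for (u, v), cost in links.items():
        incoming.setdefault(v, []).append((u, cost))
    best = {dest: (0, None)}
    heap = [(0, dest)]
    while heap:  # Dijkstra run backwards from the destination
        dist, v = heapq.heappop(heap)
        if dist > best[v][0]:
            continue
        for u, cost in incoming.get(v, []):
            if u not in best or dist + cost < best[u][0]:
                best[u] = (dist + cost, v)
                heapq.heappush(heap, (dist + cost, u))
    return {router: hop for router, (_, hop) in best.items()}


def trace(fib, src, dest):
    """Follow next hops from src; stop at dest or on the first revisit."""
    path = [src]
    while path[-1] != dest:
        path.append(fib[path[-1]])
        if path[-1] in path[:-1]:
            break  # a router was revisited: the packet is looping
    return path


links = {("S", "T"): 10, ("T", "S"): 1}  # the one asymmetric link
for a, b, cost in [("R", "X", 1), ("R", "S", 2), ("R", "T", 3), ("Y", "X", 1),
                   ("X", "S", 2), ("Y", "D", 1), ("T", "D", 20)]:
    links[(a, b)] = links[(b, a)] = cost
after = {l: c for l, c in links.items() if set(l) != {"X", "Y"}}

old, new = next_hops(links, "D"), next_hops(after, "D")
# S has moved to R under PLSN and R has updated, but T has not yet updated.
mixed = dict(old, S=new["S"], R=new["R"])
print(trace(mixed, "S", "D"))  # ['S', 'R', 'T', 'S']
```

	   With these costs S forwards to R, R forwards to T on its new
	   path, and T still forwards to S on its old path, which is the
	   loop around R, T and S described above.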

872	8.  Loop Suppression

874	   A micro-loop suppression mechanism recognizes that a packet is
875	   looping and drops it.  One such approach would be for a router to
876	   recognize, by some means, that it had seen the same packet before.
877	   It is difficult to see how sufficiently reliable discrimination could
878	   be achieved without some form of per-router signature such as route
879	   recording.  A packet recognizing approach therefore seems infeasible.

881	   An alternative approach would be to recognize that a packet was
882	   looping by recognizing that it was being sent back to the place that
883	   it had just come from.  This would work for the types of loop that
884	   form in symmetric cost networks, but would not suppress the cyclic
885	   loops that form in asymmetric networks, and as a result of multiple
886	   failures.

888	   This mechanism operates identically for "bad-news" events,
889	   "good-news" events, and SRLG failure.

891	9.  Compatibility Issues

893	   Deployment of any micro-loop control mechanism is a major change to a
894	   network.  Full consideration must be given to interoperation between
895	   routers that are capable of micro-loop control, and those that are
896	   not.  Additionally there may be a desire to limit the complexity of
897	   micro-loop control by choosing a method based purely on its
898	   simplicity.  Any such decision must take into account that if a more
899	   capable scheme is needed in the future, its deployment might be
900	   complicated by interaction with the scheme previously deployed.

902	10.  Comparison of Loop-free Convergence Methods

904	   PLSN [I-D.ietf-rtgwg-microloop-analysis] is an efficient mechanism to
905	   prevent the formation of micro-loops, but is only a partial solution.
906	   It is a useful adjunct to some of the complete solutions, but may
907	   need modification.

909	   Incremental cost advertisement in its simplest form is impractical as
910	   a general solution because it takes too long to complete.  Optimized
911	   Incremental cost advertisement, however, completes in much less time
912	   and requires no assistance from other routers in the network.  It is
913	   therefore useful for network reconfiguration operations.

915	   Packet Marking is probably impractical because of the need to find
916	   the marking bit and to change the forwarding behavior.

918	   Of the remaining methods, distributed tunnels is significantly more
919	   complex than nearside or farside tunnels, and should only be
920	   considered if there is a requirement to distribute the tunnel
921	   decapsulation load.

923	   Synchronised FIBs is a fast method, but has the issue that a suitable
924	   synchronization mechanism needs to be defined.  One method would be
925	   to use NTP [RFC1305]; however, the coupling of routing convergence to
926	   a protocol that uses the network may be a problem.  During the
927	   transition there will be some micro-looping for a short interval
928	   because it is not possible to achieve complete synchronization of the
929	   FIB changeover.

931	   The ordered FIB mechanism has the major advantage that it is a
932	   control plane only solution.  However, SRLGs require a per-
933	   destination calculation, and the convergence delay may be high,
934	   bounded by the network diameter.  The use of signaling as an
935	   accelerator may reduce the number of destinations that experience the
936	   full delay, and hence reduce the total re-convergence time to an
937	   acceptable period.

939	   The nearside and farside tunnel methods deal relatively easily with
940	   SRLGs and uncorrelated changes.  The convergence delay would be
941	   small.  However these methods require the use of tunneled forwarding
942	   which is not supported on all router hardware, and raises issues of
943	   forwarding performance.  When used with PLSN, the amount of traffic
944	   that was tunneled would be significantly reduced, thus reducing the
945	   forwarding performance concerns.  If the selected repair mechanism
946	   requires the use of tunnels, then a tunnel based loop prevention
947	   scheme may be acceptable.

949	11.  IANA Considerations

951	   There are no IANA considerations that arise from this draft.

953	12.  Security Considerations

955	   This document analyzes the problem of micro-loops and summarizes a
956	   number of potential solutions that have been proposed.  These
957	   solutions require only minor modifications to existing routing
958	   protocols and therefore do not add additional security risks.
959	   However a full security analysis would need to be provided within the
960	   specification of a particular solution proposed for deployment.

962	13.  Acknowledgments

964	   The authors would like to acknowledge contributions to this document
965	   made by Clarence Filsfils.

967	14.  Informative References

969	   [I-D.atlas-bryant-shand-lf-timers]
970	              Atlas, A. and S. Bryant, "Synchronisation of Loop Free Timer
971	              Values", draft-atlas-bryant-shand-lf-timers-04 (work in
972	              progress), February 2008.

974	   [I-D.bryant-ipfrr-tunnels]
975	              Bryant, S., Filsfils, C., Previdi, S., and M. Shand, "IP
976	              Fast Reroute using tunnels", draft-bryant-ipfrr-tunnels-03
977	              (work in progress), November 2007.

979	   [I-D.ietf-rtgwg-ipfrr-framework]
980	              Shand, M. and S. Bryant, "IP Fast Reroute Framework",
981	              draft-ietf-rtgwg-ipfrr-framework-12 (work in progress),
982	              September 2009.

984	   [I-D.ietf-rtgwg-ipfrr-notvia-addresses]
985	              Shand, M., Bryant, S., and S. Previdi, "IP Fast Reroute
986	              Using Not-via Addresses",
987	              draft-ietf-rtgwg-ipfrr-notvia-addresses-04 (work in
988	              progress), July 2009.

990	   [I-D.ietf-rtgwg-microloop-analysis]
991	              Zinin, A., "Analysis and Minimization of Microloops in
992	              Link-state Routing Protocols",
993	              draft-ietf-rtgwg-microloop-analysis-01 (work in progress),
994	              October 2005.

996	   [I-D.ietf-rtgwg-ordered-fib]
997	              Francois, P., "Loop-free convergence using oFIB",
998	              draft-ietf-rtgwg-ordered-fib-02 (work in progress),
999	              February 2008.

1001	   [OPT]      Francois, P., Shand, M., and O. Bonaventure, "Disruption
1002	              free topology reconfiguration in OSPF networks", IEEE
1003	              INFOCOM May 2007, Anchorage, 2007.

1005	   [RFC1305]  Mills, D., "Network Time Protocol (Version 3)
1006	              Specification, Implementation", RFC 1305, March 1992.

1008	   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
1009	              and W. Weiss, "An Architecture for Differentiated
1010	              Services", RFC 2475, December 1998.

1012	   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
1013	              of Explicit Congestion Notification (ECN) to IP",
1014	              RFC 3168, September 2001.

1016	   [RFC4090]  Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
1017	              Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
1018	              May 2005.

1020	   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
1021	              Specification", RFC 5036, October 2007.

1023	   [SIG]      Francois, P. and O. Bonaventure, "Avoiding transient loops
1024	              during IGP convergence", IEEE INFOCOM March 2005, Miami,
1025	              Fl, USA, 2005.

1027	Authors' Addresses

1029	   Mike Shand
1030	   Cisco Systems
1031	   250, Longwater Ave,
1032	   Green Park, Reading, RG2 6GB,
1033	   United Kingdom.

1035	   Email: mshand@cisco.com

1037	   Stewart Bryant
1038	   Cisco Systems
1039	   250, Longwater Ave,
1040	   Green Park, Reading, RG2 6GB,
1041	   United Kingdom.

1043	   Email: stbryant@cisco.com