RTGWG                                                           M. Shand
Internet-Draft                                                 S. Bryant
Intended status: Informational                             Cisco Systems
Expires: April 23, 2010                                 October 20, 2009


                 A Framework for Loop-free Convergence
                   draft-ietf-rtgwg-lf-conv-frmwk-07

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 23, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   A micro-loop is a packet forwarding loop which may occur transiently
   among two or more routers in a hop by hop packet forwarding paradigm.

   This framework provides a summary of the causes and consequences of
   micro-loops and enables the reader to form a judgement on whether
   micro-looping is an issue that needs to be addressed in specific
   networks.  It also provides a survey of the currently proposed
   mechanisms that may be used to prevent or to suppress the formation
   of micro-loops when an IP or MPLS network undergoes topology change
   due to failure, repair or management action.  When sufficiently fast
   convergence is not available and the topology is susceptible to
   micro-loops, use of one or more of these mechanisms may be desirable.

Table of Contents

   1.  Introduction
   2.  The Nature of Micro-loops
   3.  Applicability
   4.  Micro-loop Control Strategies
   5.  Loop mitigation
     5.1.  Fast-convergence
     5.2.  PLSN
   6.  Micro-loop Prevention
     6.1.  Incremental Cost Advertisement
     6.2.  Nearside Tunneling
     6.3.  Farside Tunnels
     6.4.  Distributed Tunnels
     6.5.  Packet Marking
     6.6.  MPLS New Labels
     6.7.  Ordered FIB Update
     6.8.  Synchronised FIB Update
   7.  Using PLSN In Conjunction With Other Methods
   8.  Loop Suppression
   9.  Compatibility Issues
   10. Comparison of Loop-free Convergence Methods
   11. IANA Considerations
   12. Security Considerations
   13. Acknowledgments
   14. Informative References
   Authors' Addresses

1.  Introduction

   When there is a change to the network topology (due to the failure or
   restoration of a link or router, or as a result of management action)
   the routers need to converge on a common view of the new topology and
   the paths to be used for forwarding traffic to each destination.
   During this process, referred to as a routing transition, packet
   delivery between certain source/destination pairs may be disrupted.
   This occurs due to the time it takes for the topology change to be
   propagated around the network together with the time it takes each
   individual router to determine and then update the forwarding
   information base (FIB) for the affected destinations.  During this
   transition, packets may be lost due to the continuing attempts to use
   the failed component, and due to forwarding loops.  Forwarding loops
   arise due to the inconsistent FIBs that occur as a result of the
   difference in time taken by routers to execute the transition
   process.
   This is a problem that may occur in both IP networks and MPLS
   networks that use label distribution protocol (LDP) [RFC5036] as the
   label switched path (LSP) signaling protocol.

   The service failures caused by routing transitions are largely hidden
   by higher-level protocols that retransmit the lost data.  However,
   new Internet services could emerge which are more sensitive to the
   packet disruption that occurs during a transition.  To make the
   transition transparent to their users, these services would require a
   short routing transition.  Ideally, routing transitions would be
   completed in zero time with no packet loss.

   Regardless of how optimally the mechanisms involved have been
   designed and implemented, it is inevitable that a routing transition
   will take some minimum interval that is greater than zero.  This has
   led to the development of a traffic engineering (TE) fast-reroute
   mechanism for MPLS [RFC4090].  Alternative mechanisms that might be
   deployed in an MPLS network and mechanisms that may be used in an IP
   network are work in progress in the IETF
   [I-D.ietf-rtgwg-ipfrr-framework].  The repair mechanism may, however,
   be disrupted by the formation of micro-loops during the period
   between the time when the failure is announced, and the time when all
   FIBs have been updated to reflect the new topology.

   One method of mitigating the effects of micro-loops is to ensure that
   the network reconverges in a sufficiently short time that these
   effects are inconsequential.  Another method is to design the network
   topology to minimise or even eliminate the possibility of micro-
   loops.

   The propensity to form micro-loops is highly topology dependent and
   algorithms are available to identify which links in a network are
   subject to micro-looping.
   In topologies which are critically susceptible to the formation of
   micro-loops, there is little point in introducing new mechanisms to
   provide fast re-route, without also deploying mechanisms that prevent
   the disruptive effects of micro-loops.  Unless micro-loop prevention
   is used in these topologies, packets may not reach the repair and
   micro-looping packets may cause congestion resulting in further
   packet loss.

   The disruptive effect of micro-loops is not confined to periods when
   there is a component failure.  Micro-loops can, for example, form
   when a component is put back into service following repair.  Micro-
   loops can also form as a result of a network maintenance action such
   as adding a new network component, removing a network component or
   modifying a link cost.

   This framework provides a summary of the causes and consequences of
   micro-loops and enables the reader to form a judgement on whether
   micro-looping is an issue that needs to be addressed in specific
   networks.  It also provides a survey of the currently proposed micro-
   loop mitigation mechanisms.  When sufficiently fast convergence is
   not available and the topology is susceptible to micro-loops, use of
   one or more of these mechanisms may be desirable.

2.  The Nature of Micro-loops

   A micro-loop is a packet forwarding loop which may occur transiently
   among two or more routers in a hop by hop packet forwarding paradigm.

   Micro-loops may form during the periods when a network is re-
   converging following ANY topology change, and are caused by
   inconsistent FIBs in the routers.  During the transition, micro-loops
   may occur over a single link between a pair of routers that
   temporarily use each other as the next hop for a prefix.  Micro-loops
   may also form when each router in a cycle of three or more routers
   has the next router in the cycle as a next hop for a given prefix.
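
   The two loop shapes described above (a pair of routers pointing at
   each other, and a longer cycle) can be illustrated with a minimal
   sketch.  It assumes a snapshot of the network in which each router
   has exactly one next hop for the prefix of interest; the function
   name find_micro_loops and the dictionary representation are
   illustrative assumptions, not part of any routing protocol.

```python
from typing import Dict, List, Optional


def find_micro_loops(next_hop: Dict[str, Optional[str]]) -> List[List[str]]:
    """Return each forwarding cycle for one prefix as a list of routers.

    next_hop maps a router name to the neighbor it currently forwards to
    for the prefix, or None if the router delivers the packet itself.
    This is a hypothetical snapshot model: one next hop per router.
    """
    loops: List[List[str]] = []
    done = set()                      # routers whose fate is already known
    for start in next_hop:
        path: List[str] = []
        position = {}                 # router -> index in the current walk
        node: Optional[str] = start
        while node is not None and node not in done and node not in position:
            position[node] = len(path)
            path.append(node)
            node = next_hop.get(node)
        if node is not None and node in position:
            loops.append(path[position[node]:])   # the walk re-entered itself
        done.update(path)
    return loops


# A has already moved to its new next hop (B); B still points at A.
transient_fibs = {"A": "B", "B": "A", "C": "A", "D": None}
print(find_micro_loops(transient_fibs))   # [['A', 'B']]
```

   Router C is not reported because it only feeds traffic into the
   loop; it is not itself part of the cycle.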

   Cyclic loops may occur if one or more of the following conditions are
   met:

   1.  Asymmetric link costs.

   2.  The existence of an equal cost path between a pair of routers
       which make different decisions regarding which path to use for
       forwarding to a particular destination.  Note that even routers
       which do not implement equal cost multi-path (ECMP) forwarding
       must make a choice between the available equal cost paths and
       unless they make the same choice the condition for cyclic loops
       will be fulfilled.

   3.  Topology changes affecting multiple links, including single node
       and line card failures.

   Micro-loops have two undesirable side-effects: congestion and repair
   starvation.

   o  A looping packet consumes bandwidth until it either escapes as a
      result of the re-synchronization of the FIBs, or its TTL expires.
      This transiently increases the traffic over a link by as much as
      128 times, and may cause the link to become congested.  This
      congestion reduces the bandwidth available to other traffic (which
      is not otherwise affected by the topology change).  As a result
      the "innocent" traffic using the link experiences increased
      latency, and is liable to congestive packet loss.

   o  In cases where the link or node failure has been protected by a
      fast re-route repair, an inconsistency in the FIBs may prevent
      some traffic from reaching the failure and hence being repaired.
      The repair may thus become starved of traffic and thereby rendered
      ineffective.

   Although micro-loops are usually considered in the context of a
   failure, similar problems of congestive packet loss and starvation
   may also occur if the topology change is the result of management
   action.  For example, consider the case where a link is to be taken
   out of service by management action.  The link can be retained in
   service throughout the transition, thus avoiding the need for any
   repair.
   However, if micro-loops form, they may cause congestion loss and may
   also prevent traffic from reaching the link.

   Unless otherwise controlled, micro-loops may form in any part of the
   network that forwards (or in the case of a new link, will forward)
   packets over a path that includes the affected topology change.  The
   time taken to propagate the topology change through the network, and
   the non-uniform time taken by each router to calculate the new
   shortest path tree (SPT) and update its FIB contribute to the
   duration of the packet disruption caused by the micro-loops.  In some
   cases a packet may be subject to disruption from micro-loops which
   occur sequentially at links along the path, thus further extending
   the period of disruption beyond that required to resolve a single
   loop.

3.  Applicability

   Loop-free convergence techniques are applicable to any situation in
   which micro-loops may form.  For example, the convergence of a
   network following:

   1.  Component failure.

   2.  Component repair.

   3.  Management withdrawal of a component.

   4.  Management insertion of a component.

   5.  Management change of link cost (either positive or negative).

   6.  External cost change, for example change of external gateway as a
       result of a BGP change.

   7.  A Shared Risk Link Group (SRLG) failure.

   In each case, a component may be a link, a set of links or an entire
   router.  Throughout this document we use the term SRLG when
   describing the procedure to be followed when multiple failures have
   occurred whether or not they are members of an explicit SRLG.  In the
   case of multiple independent failures, the loop prevention method
   described for SRLG may be used provided it is known that all of these
   failures have been repaired.

   Loop-free convergence techniques are applicable to both IP networks
   and MPLS enabled networks that use LDP, including LDP networks that
   use the single-hop tunnel fast-reroute mechanism.

   An assessment of whether loop-free convergence techniques are
   required should take into account whether or not the interior gateway
   protocol (IGP) convergence is sufficiently fast that any micro-loops
   are of such short duration that they are not disruptive, and whether
   or not the topology is such that micro-loops are likely to form.

4.  Micro-loop Control Strategies

   Micro-loop control strategies fall into four basic classes:

   1.  Micro-loop mitigation

   2.  Micro-loop prevention

   3.  Micro-loop suppression

   4.  Network design to minimise micro-loops

   A micro-loop mitigation scheme works by re-converging the network in
   such a way that it reduces, but does not eliminate, the formation of
   micro-loops.  Such schemes cannot guarantee the productive forwarding
   of packets during the transition.

   A micro-loop prevention mechanism controls the re-convergence of the
   network in such a way that no micro-loops form.  Such a micro-loop
   prevention mechanism allows the continued use of any fast repair
   method until the network has converged on its new topology, and
   prevents the collateral damage that occurs to other traffic for the
   duration of each micro-loop.

   A micro-loop suppression mechanism attempts to eliminate the
   collateral damage caused by micro-loops to other traffic.  This may
   be achieved by, for example, using a packet monitoring method that
   detects that a packet is looping and drops it.  Such schemes make no
   attempt to productively forward the packet throughout the network
   transition.

   Highly meshed topologies are less susceptible to micro-loops, thus
   networks may be designed to minimise the occurrence of micro-loops by
   appropriate link placement and metric settings.
   However, this approach may conflict with other design requirements
   such as cost and traffic planning and may not accurately track the
   evolution of the network, or temporary changes due to outages.

   Note that all known micro-loop prevention mechanisms and most micro-
   loop mitigation mechanisms extend the duration of the re-convergence
   process.  When the failed component is protected by a fast re-route
   repair this implies that the converging network requires the repair
   to remain in place for longer than would otherwise be the case.  The
   extended convergence time means any traffic which is not repaired by
   an imperfect repair experiences a significantly longer outage than it
   would experience with conventional convergence.

   When a component is returned to service, or when a network management
   action has taken place, this additional delay does not cause traffic
   disruption, because there is no repair involved.  However, the
   extended delay is undesirable, because it increases the time that the
   network takes to be ready for another failure, and hence leaves it
   vulnerable to multiple failures.

5.  Loop mitigation

   There are two approaches to loop mitigation.

   o  Fast-convergence

   o  A purpose-designed loop mitigation mechanism

5.1.  Fast-convergence

   The duration of micro-loops is dependent on the speed of convergence.
   Improving the speed of convergence may therefore be seen as a loop
   mitigation technique.

5.2.  PLSN

   The only known purpose-designed loop mitigation approach is the Path
   Locking with Safe-Neighbors (PLSN) method described in
   [I-D.ietf-rtgwg-microloop-analysis].
   In this method, a micro-loop free next-hop safety condition is
   defined as follows:

   In a symmetric cost network, it is safe for router X to change to the
   use of neighbor Y as its next-hop for a specific destination if the
   path through Y to that destination satisfies both of the following
   criteria:

   1.  X considers Y as its loop-free neighbor based on the topology
       before the change AND

   2.  X considers Y as its downstream neighbor based on the topology
       after the change.

   In an asymmetric cost network, a stricter safety condition is needed,
   and the criterion is that:

      X considers Y as its downstream neighbor based on the topology
      both before and after the change.

   Based on these criteria, destinations are classified by each router
   into three classes:

   o  Type A destinations: Destinations unaffected by the change (type
      A1) and also destinations whose next hop after the change
      satisfies the safety criteria (type A2).

   o  Type B destinations: Destinations that cannot be sent via the new
      primary next-hop because the safety criteria are not satisfied,
      but which can be sent via another next-hop that does satisfy the
      safety criteria.

   o  Type C destinations: All other destinations.

   Following a topology change, Type A destinations are immediately
   changed to go via the new topology.  Type B destinations are
   immediately changed to go via the next hop that satisfies the safety
   criteria, even though this is not the shortest path.  Type B
   destinations continue to go via this path until all routers have
   changed their Type C destinations over to the new next hop.  Routers
   must not change their Type C destinations until all routers have
   changed their Type A2 and Type B destinations to the new or
   intermediate (safe) next hop.

   Simulations indicate that this approach produces a significant
   reduction in the number of links that are subject to micro-looping.
   However, unlike all of the micro-loop prevention methods, it is only
   a partial solution.  In particular, micro-loops may form on any link
   joining a pair of type C routers.

   Because routers delay updating their Type C destination FIB entries,
   they will continue to route towards the failure during the time when
   the routers are changing their Type A and B destinations, and hence
   will continue to productively forward packets provided that viable
   repair paths exist.

   A backwards compatibility issue arises with PLSN.  If a router is not
   capable of micro-loop control, it will not correctly delay its FIB
   update.  If all such routers had only type A destinations this loop
   mitigation mechanism would work as it was designed.  Alternatively,
   if all such incapable routers had only type C destinations, the
   "loop-prevention" announcement mechanism used to trigger the tunnel
   based schemes (see Sections 6.2 to 6.4) could be used to cause the
   Type A and Type B destinations to be changed, with the incapable
   routers and routers having type C destinations delaying until they
   received the "real" announcement.  Unfortunately, these two
   approaches are mutually incompatible.

   Note that simulations indicate that in most topologies treating type
   B destinations as type C results in only a small degradation in loop
   prevention.  Also note that simulation results indicate that in
   production networks where some, but not all, links have asymmetric
   costs, using the stricter asymmetric cost criterion actually reduces
   the number of loop free destinations, because fewer destinations can
   be classified as type A or B.

   This mechanism operates identically for

   o  events that degrade the topology (e.g. link failure),

   o  events that improve the topology (e.g. link restoration), and

   o  shared risk link group (SRLG) failure.

6.  Micro-loop Prevention

   Eight micro-loop prevention methods have been proposed:

   1.  Incremental cost advertisement

   2.  Nearside tunneling

   3.  Farside tunneling

   4.  Distributed tunnels

   5.  Packet marking

   6.  New MPLS labels

   7.  Ordered FIB update

   8.  Synchronized FIB update

6.1.  Incremental Cost Advertisement

   When a link fails, the cost of the link is normally changed from its
   assigned metric to "infinity" in one step.  However, it can be proved
   [OPT] that no micro-loops will form if the link cost is increased in
   suitable increments, and the network is allowed to stabilize before
   the next cost increment is advertised.  Once the link cost has been
   increased to a value greater than that of the lowest alternative cost
   around the link, the link may be disabled without causing a micro-
   loop.

   The criterion for a link cost change to be safe is that any link
   which is subjected to a cost change of x can only cause loops in a
   part of the network that has a cyclic cost less than or equal to x.
   Because there may exist links which have a cost of one in each
   direction, resulting in a cyclic cost of two, this can result in the
   link cost having to be raised in increments of one.  However, the
   increment can be larger where the minimum cost permits.  Recent work
   [OPT] has shown that there are a number of optimizations which can be
   applied to the problem in order to determine the exact set of cost
   values required and hence minimize the number of increments.

   It will be appreciated that when a link is returned to service, its
   cost is reduced in small steps from "infinity" to its final cost,
   thereby providing similar micro-loop prevention during a "good-news"
   event.  Note that the link cost may be decreased from "infinity" to
   any value greater than that of the lowest alternative cost around the
   link in one step without causing a micro-loop.
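
   As an illustration of the "bad-news" case only, the sequence of
   costs to advertise can be sketched as below.  This is a minimal
   sketch, not the optimized procedure of [OPT]: the function name and
   parameters are hypothetical, the safe increment (step) is assumed to
   have been chosen by the operator (for example, one where a cyclic
   cost of two exists), and the caller is assumed to wait for the
   network to stabilize between advertisements.

```python
from typing import List


def cost_increase_schedule(current: int, alt_cost: int, step: int) -> List[int]:
    """Return the link costs to advertise, in order, before disabling a link.

    current  -- the link's assigned metric
    alt_cost -- lowest alternative cost around the link
    step     -- increment assumed to be safe for this network (hypothetical
                input; choosing it is outside the scope of this sketch)
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    schedule = []
    cost = current
    # Keep raising the cost until it exceeds the lowest alternative cost
    # around the link; after that the link may be disabled.
    while cost <= alt_cost:
        cost += step
        schedule.append(cost)
    return schedule


print(cost_increase_schedule(current=10, alt_cost=14, step=1))
# [11, 12, 13, 14, 15]
```

   With a larger safe increment the same transition needs fewer
   advertisements; for example, a step of 2 gives [12, 14, 16].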

   When the failure is an SRLG the link cost increments must be
   coordinated across all failing members of the SRLG.  This may be
   achieved by completing the transition of one link before starting the
   next, or by interleaving the changes.

   The incremental cost change approach has the advantage over all other
   currently known loop prevention schemes that it requires no change to
   the routing protocol.  It will work in any network because it does
   not require any co-operation from the other routers in the network.

   Where the micro-loop prevention mechanism is being used to support a
   planned reconfiguration of the network, the extended total
   reconvergence time resulting from the multiple increments is of
   limited consequence, particularly where the number of increments has
   been optimized.  This, together with the ability to implement this
   technique in isolation, makes this method a good candidate for use
   with such management initiated changes.

   Where the micro-loop prevention mechanism is being used to support
   failure recovery, the number of increments required, and hence the
   time taken to fully converge, is significant even for small numbers
   of increments.  This is because, for the duration of the transition,
   some parts of the network continue to use the old forwarding path,
   and hence use any repair mechanism for an extended period.  In the
   case of a failure that cannot be fully repaired, some destinations
   may therefore become unreachable for an extended period.  In addition
   the network may be vulnerable to a second failure for the duration of
   the controlled re-convergence.

   Where large metrics are used and no optimization (such as that
   described above) is performed, the incremental cost method can be
   extremely slow.
   However, in cases where the per link metric is small, either because
   small values have been assigned by the network designers, or because
   of restrictions implicit in the routing protocol (e.g. RIP restricts
   the metric, and BGP using the AS path length frequently uses an
   effective metric of one, or a very small integer for each inter AS
   hop), the number of required increments can be acceptably small even
   without optimizations.

6.2.  Nearside Tunneling

   This mechanism works by creating an overlay network using tunnels
   whose path is not affected by the topology change and carrying the
   traffic affected by the change in that new network.  When all the
   traffic is in the new, tunnel based, network, the real network is
   allowed to converge on the new topology.  Because all the traffic
   that would be affected by the change is carried in the overlay
   network no micro-loops form.

   When a failure is detected (or a link is withdrawn from service), the
   router adjacent to the failure issues a new "loop-prevention" routing
   message announcing the topology change.  This message is propagated
   through the network by all routers, but is only understood by routers
   capable of using one of the tunnel based micro-loop prevention
   mechanisms.

   Each of the micro-loop preventing routers builds a tunnel to the
   closest router adjacent to the failure.  They then determine which of
   their traffic would transit the failure and place that traffic in the
   tunnel.  When all of these tunnels are in place (determined, for
   example, by waiting a suitable interval) the failure is announced as
   normal.  Because these tunnels will be unaffected by the transition,
   and because the routers protecting the link will continue the repair
   (or forward across the link being withdrawn), no traffic will be
   disrupted by the failure.
   When the network has converged these tunnels are withdrawn, allowing
   traffic to be forwarded along its new "natural" path.  The order of
   tunnel insertion and withdrawal is not important, provided that the
   tunnels are all in place before the normal announcement is issued,
   and provided that the repair remains in place until normal
   convergence has completed.

   This method completes in bounded time, and is generally much faster
   than the incremental cost method.  Depending on the exact design, it
   completes in two or three flood-SPF-FIB update cycles.

   At the time at which the failure is announced as normal, micro-loops
   may form within isolated islands of non-micro-loop preventing
   routers.  However, only traffic entering the network via such routers
   can micro-loop.  All traffic entering the network via a micro-loop
   preventing router will be tunneled correctly to the nearest repairing
   router, including, if necessary, being tunneled via a non-micro-loop
   preventing router, and will not micro-loop.

   Where there is no requirement to prevent the formation of micro-loops
   involving non-micro-loop preventing routers, a single, "normal"
   announcement may be made, and a local timer used to determine the
   time at which transition from tunneled forwarding to normal
   forwarding over the new topology may commence.

   This technique has the disadvantage that it requires traffic to be
   tunneled during the transition.  This is an issue in IP networks
   because not all router designs are capable of high performance IP
   tunneling.  It is also an issue in MPLS networks because the
   encapsulating router has to know the label set that the decapsulating
   router is distributing.

   A further disadvantage of this method is that it requires co-
   operation from all the routers within the routing domain to fully
   protect the network against micro-loops.
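
   The tunnel placement step described earlier in this section (choose
   the closest router adjacent to the failure, then find the traffic
   that would transit the failure) can be sketched as follows.  This is
   an illustrative sketch under simplifying assumptions: symmetric link
   costs, a single shortest path per destination (no ECMP), and a
   topology held as a dictionary of neighbor costs.  The names
   shortest_paths and nearside_tunnel_plan are hypothetical, and the
   "loop-prevention" announcement itself is not modelled.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, Dict[str, int]]     # router -> {neighbor: link cost}


def shortest_paths(graph: Graph, source: str) -> Tuple[Dict[str, int], Dict[str, str]]:
    """Dijkstra on the pre-change topology: (distance, predecessor) maps."""
    dist = {source: 0}
    prev: Dict[str, str] = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue                  # stale heap entry
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    return dist, prev


def nearside_tunnel_plan(graph: Graph, router: str,
                         link: Tuple[str, str]) -> Tuple[Optional[str], List[str]]:
    """Return (tunnel endpoint, destinations to place in the tunnel)."""
    dist, prev = shortest_paths(graph, router)
    a, b = link
    affected = []
    for dest in graph:
        node = dest
        while node in prev:           # walk the path back towards `router`
            if {node, prev[node]} == {a, b}:
                affected.append(dest)  # this path transits the failing link
                break
            node = prev[node]
    if not affected:
        return None, []
    # Tunnel to whichever end of the failing link is closer to this router.
    endpoint = min((a, b), key=lambda r: dist.get(r, float("inf")))
    return endpoint, affected


topology = {
    "S": {"A": 1, "C": 5},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "D": 1},
    "C": {"S": 5, "D": 5},
    "D": {"B": 1, "C": 5},
}
print(nearside_tunnel_plan(topology, "S", ("A", "B")))   # ('A', ['B', 'D'])
```

   Because link costs are positive, the shortest path to the nearer end
   of the link never crosses the link itself, which is why the tunnel
   path in this sketch is unaffected by the change.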

   When a new link is added, the mechanism is run in "reverse".  When
   the loop-prevention announcement is heard, routers determine which
   traffic they will send over the new link, and tunnel that traffic to
   the router on the near side of that link.  This path will not be
   affected by the presence of the new link.  When the "normal"
   announcement is heard, they then update their FIB to send the traffic
   normally according to the new topology.  Any traffic encountering a
   router that has not yet updated its FIB will be tunneled to the near
   side of the link, and will therefore not loop.

   When a management change to the topology is required, again exactly
   the same mechanism protects against micro-looping of packets by the
   micro-loop preventing routers.

   When the failure is an SRLG, the required strategy is to classify
   traffic according to the furthest failing member of the SRLG that it
   will traverse on its way to the destination, and to tunnel that
   traffic to the repairing router for that SRLG member.  This will
   require multiple tunnel destinations, in the limiting case, one per
   SRLG member.

6.3.  Farside Tunnels

   Farside tunneling loop prevention requires the loop preventing
   routers to place all of the traffic that would traverse the failure
   in one or more tunnels terminating at the router (or, in the case of
   node failure, routers) at the far side of the failure.  The
   properties of this method are a more uniform distribution of repair
   traffic than is achieved using the nearside tunnel method, and in the
   case of node failure, a reduction in the decapsulation load on any
   single router.

   Unlike the nearside tunnel method (which uses normal routing to the
   repairing router), this method requires the use of a repair path to
   the farside router.
This may be provided by the not-via [I-D.ietf-rtgwg-ipfrr-notvia-addresses] mechanism, in which case no further computation is needed.

The mode of operation is otherwise identical to the nearside tunneling loop prevention method (Section 6.2).

6.4.  Distributed Tunnels

In the distributed tunnels loop prevention method, each router calculates its own repair and forwards traffic affected by the failure using that repair. Unlike the FRR case, the actual failure is known at the time of the calculation. The objective of the loop preventing routers is to get the packets that would have gone via the failure into Q-space [I-D.bryant-ipfrr-tunnels] using routers that are in P-space. Because packets are decapsulated on entry to Q-space, rather than being forced to go to the farside of the failure, more optimal routing may be achieved. This method is subject to the same reachability constraints described in [I-D.bryant-ipfrr-tunnels].

The mode of operation is otherwise identical to the nearside tunneling loop prevention method (Section 6.2).

An alternative distributed tunnel mechanism is for all routers to tunnel to the not-via address [I-D.ietf-rtgwg-ipfrr-notvia-addresses] associated with the failure.

6.5.  Packet Marking

If packets could be marked in some way, this information could be used to assign them to one of:

o  the new topology,

o  the old topology or

o  a transition topology.

They would then be correctly forwarded during the transition. This mechanism works identically for both "bad-news" and "good-news" events. It also works identically for SRLG failure. There are three problems with this solution:

o  A packet marking bit may not be available. For example, a network supporting both the differentiated services architecture [RFC2475] and explicit congestion notification [RFC3168] uses all eight bits of the IPv4 Type of Service field.

o  The mechanism would introduce a non-standard forwarding procedure.

o  Packet marking using either the old or the new topology would double the size of the FIB; however, some optimizations may be possible.

6.6.  MPLS New Labels

In an MPLS network that is using LDP [RFC5036] for label distribution, loop free convergence can be achieved through the use of new labels when the path that a prefix will take through the network changes.

As described in Section 6.2, the repairing routers issue a loop-prevention announcement to start the loop free convergence process. All loop preventing routers calculate the new topology and determine whether their FIB needs to be changed. If there is no change in the FIB they take no part in the following process.

The routers that need to make a change to their FIB consider each change and check the new next hop to determine whether it will use a path in the OLD topology which reaches the destination without traversing the failure (i.e. the next hop is in P-space with respect to the failure [I-D.bryant-ipfrr-tunnels]). If so, the FIB entry can be immediately updated. For all of the remaining FIB entries, the router issues a new label to each of its neighbors. This new label is used to lock the path during the transition in a similar manner to the previously described loop-free convergence with tunnels method (Section 6.2). Routers receiving a new label install it in their FIB, for MPLS label translation, but do not yet remove the old label and do not yet use this new label to forward IP packets. That is, they prepare to forward using the new label on the new path, but do not use it yet. Any packets received continue to be forwarded the old way, using the old labels, towards the repair.

At some time after the loop-prevention announcement, a normal routing announcement of the failure is issued.
This announcement must not be issued until such time as all routers have carried out all of their loop-prevention announcement triggered activities. On receipt of the normal announcement all routers that were delaying convergence move to their new path for both the new and the old labels. This involves changing the IP address entries to use the new labels, AND changing the old labels to forward using the new labels.

Because the new label path was installed during the loop-prevention phase, packets reach their destinations as follows:

o  If they do not go via any router using a new label they go via the repairing router and the repair.

o  If they meet any router that is using the new labels they get marked with the new labels and reach their destination using the new path, back-tracking if necessary.

When all routers have changed to the new path the network is converged. At some later time, when it can be assumed that all routers have moved to using the new path, the FIB can be cleaned up to remove the, now redundant, old labels.

As with other methods, the new labels may be modified to provide loop prevention for "good news". There are also a number of optimizations of this method.

6.7.  Ordered FIB Update

The Ordered FIB loop prevention method is described in OFIB [I-D.ietf-rtgwg-ordered-fib]. Micro-loops occur following a failure or a cost increase, when a router closer to the failed component revises its routes to take account of the failure before a router which is further away. By analyzing the reverse shortest path tree (rSPT) over which traffic is directed to the failed component in the old topology, it is possible to determine a strict ordering which ensures that nodes closer to the root always process the failure after any nodes further away, and hence micro-loops are prevented.
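
The ordering idea can be illustrated with a minimal sketch. It assumes, for illustration only, that a router's rank is the height of its subtree in the rSPT, so that leaves have rank zero and update first; the authoritative rank definition is in [I-D.ietf-rtgwg-ordered-fib]. The function names (ofib_ranks, update_order) and the example tree are hypothetical.

```python
def ofib_ranks(rspt_parent):
    """Return {router: rank} for an rSPT given as {router: parent}.

    The root (the router adjacent to the failed component) has parent None.
    Assumption: rank is the height of the router's subtree, so a leaf,
    which no other router forwards through, has rank 0.
    """
    children = {}
    for node, parent in rspt_parent.items():
        if parent is not None:
            children.setdefault(parent, []).append(node)

    ranks = {}

    def height(node):
        if node not in ranks:
            kids = children.get(node, [])
            ranks[node] = 1 + max((height(c) for c in kids), default=-1)
        return ranks[node]

    for node in rspt_parent:
        height(node)
    return ranks


def update_order(rspt_parent):
    """Routers sorted so that every router follows all routers in its subtree."""
    ranks = ofib_ranks(rspt_parent)
    return sorted(ranks, key=lambda n: (ranks[n], n))


# Hypothetical rSPT: C forwards via B, and B and D forward via A (the root).
EXAMPLE_RSPT = {"A": None, "B": "A", "C": "B", "D": "A"}
example_ranks = ofib_ranks(EXAMPLE_RSPT)
example_order = update_order(EXAMPLE_RSPT)
```

Because every child has a strictly lower rank than its parent, updating FIBs in increasing rank order means a router never changes its routes while a router further from the failure is still forwarding through it on the old path.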

When the failure has been announced, each router waits a multiple of the convergence timer [I-D.atlas-bryant-shand-lf-timers]. The multiple is determined by the node's position in the rSPT, and the delay value is chosen to guarantee that a node can complete its processing within this time. The convergence time may be reduced by employing a signaling mechanism to notify the parent when all the children have completed their processing, and hence when it is safe for the parent to instantiate its new routes.

The property of this approach is therefore that it imposes a delay which is bounded by the network diameter although in many cases it will be much less.

When a link is returned to service the convergence process above is reversed. A router first determines its distance (in hops) from the new link in the NEW topology. Before updating its FIB, it then waits a time equal to the value of that distance multiplied by the convergence timer.

It will be seen that network management actions can similarly be undertaken by treating a cost increase in a manner similar to a failure and a cost decrease similar to a restoration.

The ordered FIB mechanism requires all nodes in the domain to operate according to these procedures, and the presence of non co-operating nodes can give rise to loops for any traffic which traverses them (not just traffic which is originated through them). Without additional mechanisms these loops could remain in place for a significant time.

It should be noted that this method requires per router ordering, but not per prefix ordering. A router must wait its turn to update its FIB, but it should then update its entire FIB.

When an SRLG failure occurs a router must classify traffic into the classes that pass over each member of the SRLG.
Each router is then independently assigned a ranking with respect to each SRLG member for which they have a traffic class. These rankings may be different for each traffic class. The prefixes of each class are then changed in the FIB according to the ordering of their specific ranking. Again, as for the single failure case, signaling may be used to speed up the convergence process.

Note that the special SRLG case of a full or partial node failure can be dealt with without using per prefix ordering, by running a single reverse SPF computation rooted at the failed node (or common point of the subset of failing links in the partial case).

There are two classes of signaling optimization that can be applied to the ordered FIB loop-prevention method:

o  When the router makes NO change, it can signal immediately. This significantly reduces the time taken by the network to process long chains of routers that have no change to make to their FIB.

o  When a router HAS changed, it can signal that it has completed. This is more problematic since this may be difficult to determine, particularly in a distributed architecture, and the optimization obtained is the difference between the actual time taken to make the FIB change and the worst case timer value. This saving could be of the order of one second per hop.

There is another method of executing ordered FIB which is based on pure signaling [SIG]. Methods that use signaling as an optimization are safe because eventually they fall back on the established IGP mechanisms which ensure that networks converge under conditions of packet loss. However, a mechanism that relies on signaling in order to converge requires a reliable signaling mechanism which must be proven to recover from any failure circumstance.

6.8.  Synchronised FIB Update

Micro-loops form because of the asynchronous nature of the FIB update process during a network transition. In many router architectures it is the time taken to update the FIB itself that is the dominant term.

One approach would be to have two FIBs and, in a synchronized action throughout the network, to switch from the old to the new. One way to achieve this synchronized change would be to signal or otherwise determine the wall clock time of the change, and then execute the change at that time, using NTP [RFC1305] to synchronize the wall clocks in the routers.

This approach has a number of major issues. Firstly, two complete FIBs are needed, which may create a scaling issue, and secondly, a suitable network wide synchronization method is needed. However, neither of these is an insurmountable problem.

Since the FIB change synchronization will not be perfect there may be some interval during which micro-loops form. Whether this scheme is classified as a micro-loop prevention mechanism or a micro-loop mitigation mechanism within this taxonomy is therefore dependent on the degree of synchronization achieved.

This mechanism works identically for both "bad-news" and "good-news" events. It also works identically for SRLG failure. Further consideration needs to be given to interoperating with routers that do not support this mechanism. Without a suitable interoperating mechanism, loops may form for the duration of the synchronization delay.

7.  Using PLSN In Conjunction With Other Methods

All of the tunnel methods and packet marking can be combined with PLSN (Section 5.2) [I-D.ietf-rtgwg-microloop-analysis] to reduce the traffic that needs to be protected by the advanced method. Specifically all traffic could use PLSN except traffic between a pair of routers both of which consider the destination to be type C.
The type C to type C traffic would be protected from micro-looping through the use of a loop prevention method.

However, determining whether the new next hop router considers a destination to be type C may be computationally intensive. An alternative approach would be to use a loop prevention method for all local type C destinations. This would not require any additional computation, but would require the additional loop prevention method to be used in cases which would not have generated loops (i.e. when the new next-hop router considered this to be a type A or B destination).

The amount of traffic that would use PLSN is highly dependent on the network topology and the specific change, but would be expected to be in the region of 70% to 90% in typical networks.

However, PLSN cannot be combined safely with Ordered FIB. Consider the network fragment shown below:

              R
             /|\
            / | \
          1/ 2|  \3
          /   |   \    cost S->T = 10
   Y-----X----S----T   cost T->S = 1
   |  1     2      |
   |1              |
   D---------------+
           20

On failure of link XY, according to PLSN, S will regard R as a safe neighbor for traffic to D. However, the ordered FIB rank of both R and T will be zero and hence these can change their FIBs during the same time interval. If R changes before T, then a loop will form around R, T and S. This can be prevented by using a stronger safety condition than PLSN currently specifies, at the cost of introducing more type C routers, and hence reducing the PLSN coverage.

8.  Loop Suppression

A micro-loop suppression mechanism recognizes that a packet is looping and drops it. One such approach would be for a router to recognize, by some means, that it had seen the same packet before. It is difficult to see how sufficiently reliable discrimination could be achieved without some form of per-router signature such as route recording.
A packet recognizing approach therefore seems infeasible.

An alternative approach would be to recognize that a packet was looping by recognizing that it was being sent back to the place that it had just come from. This would work for the types of loop that form in symmetric cost networks, but would not suppress the cyclic loops that form in asymmetric networks, and as a result of multiple failures.

This mechanism operates identically for "bad-news" events, "good-news" events and SRLG failure.

9.  Compatibility Issues

Deployment of any micro-loop control mechanism is a major change to a network. Full consideration must be given to interoperation between routers that are capable of micro-loop control, and those that are not. Additionally there may be a desire to limit the complexity of micro-loop control by choosing a method based purely on its simplicity. Any such decision must take into account that if a more capable scheme is needed in the future, its deployment might be complicated by interaction with the scheme previously deployed.

10.  Comparison of Loop-free Convergence Methods

PLSN [I-D.ietf-rtgwg-microloop-analysis] is an efficient mechanism to prevent the formation of micro-loops, but is only a partial solution. It is a useful adjunct to some of the complete solutions, but may need modification.

Incremental cost advertisement in its simplest form is impractical as a general solution because it takes too long to complete. Optimized Incremental cost advertisement, however, completes in much less time and requires no assistance from other routers in the network. It is, therefore, useful for network reconfiguration operations.

Packet Marking is probably impractical because of the need to find the marking bit and to change the forwarding behavior.

Of the remaining methods, distributed tunnels is significantly more complex than nearside or farside tunnels, and should only be considered if there is a requirement to distribute the tunnel decapsulation load.

Synchronised FIBs is a fast method, but has the issue that a suitable synchronization mechanism needs to be defined. One method would be to use NTP [RFC1305]; however, the coupling of routing convergence to a protocol that uses the network may be a problem. During the transition there will be some micro-looping for a short interval because it is not possible to achieve complete synchronization of the FIB changeover.

The ordered FIB mechanism has the major advantage that it is a control plane only solution. However, SRLGs require a per-destination calculation, and the convergence delay may be high, bounded by the network diameter. The use of signaling as an accelerator may reduce the number of destinations that experience the full delay, and hence reduce the total re-convergence time to an acceptable period.

The nearside and farside tunnel methods deal relatively easily with SRLGs and uncorrelated changes. The convergence delay would be small. However, these methods require the use of tunneled forwarding, which is not supported on all router hardware and raises issues of forwarding performance. When used with PLSN, the amount of traffic that was tunneled would be significantly reduced, thus reducing the forwarding performance concerns. If the selected repair mechanism requires the use of tunnels, then a tunnel based loop prevention scheme may be acceptable.

11.  IANA Considerations

There are no IANA considerations that arise from this draft.

12.  Security Considerations

This document analyzes the problem of micro-loops and summarizes a number of potential solutions that have been proposed.
These solutions require only minor modifications to existing routing protocols and therefore do not add additional security risks. However, a full security analysis would need to be provided within the specification of a particular solution proposed for deployment.

13.  Acknowledgments

The authors would like to acknowledge contributions to this document made by Clarence Filsfils.

14.  Informative References

[I-D.atlas-bryant-shand-lf-timers]
           Atlas, A. and S. Bryant, "Synchronisation of Loop Free Timer Values", draft-atlas-bryant-shand-lf-timers-04 (work in progress), February 2008.

[I-D.bryant-ipfrr-tunnels]
           Bryant, S., Filsfils, C., Previdi, S., and M. Shand, "IP Fast Reroute using tunnels", draft-bryant-ipfrr-tunnels-03 (work in progress), November 2007.

[I-D.ietf-rtgwg-ipfrr-framework]
           Shand, M. and S. Bryant, "IP Fast Reroute Framework", draft-ietf-rtgwg-ipfrr-framework-12 (work in progress), September 2009.

[I-D.ietf-rtgwg-ipfrr-notvia-addresses]
           Shand, M., Bryant, S., and S. Previdi, "IP Fast Reroute Using Not-via Addresses", draft-ietf-rtgwg-ipfrr-notvia-addresses-04 (work in progress), July 2009.

[I-D.ietf-rtgwg-microloop-analysis]
           Zinin, A., "Analysis and Minimization of Microloops in Link-state Routing Protocols", draft-ietf-rtgwg-microloop-analysis-01 (work in progress), October 2005.

[I-D.ietf-rtgwg-ordered-fib]
           Francois, P., "Loop-free convergence using oFIB", draft-ietf-rtgwg-ordered-fib-02 (work in progress), February 2008.

[OPT]      Francois, P., Shand, M., and O. Bonaventure, "Disruption free topology reconfiguration in OSPF networks", IEEE INFOCOM May 2007, Anchorage, 2007.

[RFC1305]  Mills, D., "Network Time Protocol (Version 3) Specification, Implementation and Analysis", RFC 1305, March 1992.

[RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., and W.
           Weiss, "An Architecture for Differentiated Services", RFC 2475, December 1998.

[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.

[RFC4090]  Pan, P., Swallow, G., and A. Atlas, "Fast Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090, May 2005.

[RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP Specification", RFC 5036, October 2007.

[SIG]      Francois, P. and O. Bonaventure, "Avoiding transient loops during IGP convergence", IEEE INFOCOM March 2005, Miami, FL, USA, 2005.

Authors' Addresses

Mike Shand
Cisco Systems
250, Longwater Ave,
Green Park, Reading, RG2 6GB,
United Kingdom.

Email: mshand@cisco.com

Stewart Bryant
Cisco Systems
250, Longwater Ave,
Green Park, Reading, RG2 6GB,
United Kingdom.

Email: stbryant@cisco.com