Network Working Group                                          S. Bryant
Internet Draft                                               C. Filsfils
Expiration Date: Nov 2004                                     S. Previdi
                                                                M. Shand
                                                           Cisco Systems

                                                                May 2004


                     IP Fast Reroute using tunnels

                   draft-bryant-ipfrr-tunnels-00.txt


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.
   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This draft describes an IP fast re-route mechanism that provides
   backup connectivity in the event of a link or router failure.
   In the absence of single points of failure and asymmetric costs, the
   mechanism provides complete protection against any single failure. If
   perfect repair is not possible, the identity of all the unprotected
   links and routers is known in advance. The draft also describes the
   mechanisms needed to prevent the packet loss caused by loops which
   normally occur during the reconvergence of the network following a
   failure.

Table of Contents

   1. Introduction
   2. Goals, non-goals, limitations and constraints
      2.1. Goals
      2.2. Non-Goals
      2.3. Limitations
      2.4. Constraints
   3. Repair Paths
      3.1. Tunnels as Repair Paths
      3.2. Tunnel Requirements
         3.2.1. Setup
         3.2.2. Multipoint
         3.2.3. Directed forwarding
         3.2.4. Security
   4. Construction of Repair Paths
      4.1. Identifying Repair Path Targets
      4.2. Determining Tunneled Repair Paths
         4.2.1. Computing Repair Paths
         4.2.2. Extended P-space
         4.2.3. Downstream Paths
         4.2.4. Selecting Repair Paths
      4.3. Assigning Traffic to Repair Paths
      4.4. When no Repair Path is Possible
         4.4.1. Unreachable Target
         4.4.2. Asymmetric Link Costs
         4.4.3. Interference Between Potential Node Repair Paths
      4.5. Multi-homed Prefixes
      4.6. Equal Cost Path Splits
         4.6.1. Equal Cost Path Splits as Link Repair Paths
         4.6.2. Equal Cost Path Splits and Node Failure
      4.7. LANs and pseudonodes
         4.7.1. The Link between Routers A and B is a LAN
            4.7.1.1. Case 1
            4.7.1.2. Case 2
            4.7.1.3. Simplified LAN repair
         4.7.2. A LAN exists at the release point
         4.7.3. A LAN between B and its neighbors
         4.7.4. The LAN is a Transit Subnet
   5. Failure Detection and Repair Path Activation
      5.1. Failure Detection
      5.2. Repair Path Activation
      5.3. Node Failure Detection Mechanism
   6. Loop Free Transition
      6.1. Incremental Cost Advertisement
      6.2. Single Tunnel Per Router
      6.3. Distributed Tunnels
      6.4. Ordered SPFs
   7. Restoring Failed Components to Service
   8. Implications for Network Management
   9. IPFRR Capability
   10. Enhancements to routing protocols
   11. IANA considerations
   12. Security Considerations

Terminology

   This section defines words, acronyms, and actions used in this draft.

   A              Frequently used to denote a router that is the
                  source of a repair path computed in anticipation of
                  the failure of a neighboring router denoted as B.

   B              Frequently used to denote a router whose anticipated
                  failure is the subject of repair path computations.

   Directed       The ability of the repairing router (A) to specify
   forwarding     the next hop (Q) on exit from a tunnel endpoint (P).

   Extended       The union of the P-spaces of the neighbors of a
   P-space        specific router with respect to a common component.

                  Extended P-space does not include the additional
                  space reachable through directed forwarding.

   FIB            Forwarding Information Base. The database used by
                  the packet forwarder to determine what actions to
                  perform on a packet.

   IPFRR          IP fast re-route.

   P              The router in P-space to which a packet is tunneled
                  for repair.

   PQ             A router that is in both P-space and Q-space and
                  hence does not need directed forwarding.

   P-space        P-space is the set of routers reachable from a
                  specific router without any path (including equal
                  cost path splits) transiting a specified component.

                  For example, the P-space of A is the set of routers
                  that A can reach without using B (router failure
                  case) or the A-B link (link failure case).

   Q              The router in Q-space to which the packet is
                  directed by router P on exit from the repair tunnel.
                  Q will always be adjacent to P, or P itself.

   Q-space        Q-space is the set of routers from which a specific
                  router can be reached without any path (including
                  equal cost path splits) transiting a specified
                  component.

   Routing        The process whereby routers converge on a new
   transition     topology. In conventional networks this process
                  frequently causes some disruption to packet
                  delivery.

   RPF            Reverse Path Forwarding, i.e. checking that a packet
                  is received over the interface which would be used
                  to send packets addressed to the source address of
                  the packet.

   SPF            Shortest Path First, e.g. Dijkstra's algorithm.

   SPT            Shortest path tree.

1. Introduction

   When the topology of a network changes (due to link or router
   failure, recovery or management action), the routers need to converge
   on a common view of the new topology. During this process, referred
   to as a routing transition, packet delivery between certain
   source/destination pairs may be disrupted. This occurs due to the
   time it takes for the topology change to be propagated around the
   network plus the time it takes each individual router to determine
   and then update the forwarding information base (FIB) for the
   affected destinations. During this transition, packets are lost due
   to the continuing attempts to use the failed component, and due to
   forwarding loops. Forwarding loops arise due to the inconsistent FIBs
   that occur as a result of the difference in time taken by routers to
   execute the transition process.

   The service failures caused by routing transitions are largely hidden
   by higher-level protocols that retransmit the lost data. However, new
   Internet services are emerging which are more sensitive to the packet
   disruption that occurs during a transition. To make the transition
   transparent to their users, these services require a short routing
   transition. Ideally, routing transitions would be completed in zero
   time with no packet loss.

   Regardless of how optimally the mechanisms involved have been
   designed and implemented, it is inevitable that a routing transition
   will take some minimum interval that is greater than zero. The
   solution described here uses pre-computed backup routes and
   controlled notification of network changes. A set of repair paths
   temporarily provides substitute connectivity in place of a link or
   router that has failed. Once the set of repair paths has been
   activated, there should be no further packet loss as a result of the
   associated failure. To achieve the maximum benefit from repair paths,
   they must be activated as soon as a failure has been detected, and a
   controlled transition to normal operation must then be invoked to
   prevent packet loss due to micro-looping. The packet loss
   attributable to the failure will then be confined to the unavoidable
   loss that occurs as a result of the latency of the failure detection
   mechanism itself.

   The mechanisms described here have been designed for use with any
   link-state routing protocol.

2. Goals, non-goals, limitations and constraints

2.1. Goals

   The following are the goals of IPFRR:

   o  Protect against any link or router failure in the network.

   o  No constraints on the network topology or link costs.

   o  Never worse than the existing routing convergence mechanism.

   o  Co-existence with non-IP fast-reroute capable routers in the
      network.

2.2. Non-Goals

   The following are non-goals of IPFRR:

   o  Protection of a single point of failure.

   o  To provide protection in the presence of multiple concurrent
      failures other than those that occur due to the failure of a
      single router.

   o  Shared risk group protection.

   o  Complete fault coverage in networks that make use of
      asymmetric costs.

2.3. Limitations

   The following limitations apply to IPFRR:

   o  Because the mechanisms described here rely on complete
      topological information from the link state routing protocol,
      they will only work within a single link state flooding
      domain.

   o  Reverse Path Forwarding (RPF) checks cannot be used in
      conjunction with IPFRR. This is because the use of tunnels
      may result in packets arriving over different interfaces than
      expected.

2.4. Constraints

   The following constraints are assumed:

   o  Following a failure, only the routers adjacent to the failure
      have any knowledge of the failure.

   o  There is insufficient time following a failure to compute a
      repair strategy based on knowledge of the specific failure
      that has occurred.

   o  Multiple concurrent failures may not be protected.

3. Repair Paths

   When a router detects an adjacent failure, it uses a set of repair
   paths in place of the failed component, and continues to use them
   until the completion of the routing transition. Only routers adjacent
   to the failed component are aware of the nature of the failure. Once
   the routing transition has been completed, the router will have no
   further use for the repair paths since all routers in the network
   will have revised their forwarding data and the failed link will have
   been eliminated from this computation.

   Repair paths are pre-computed in anticipation of later failures so
   they can be promptly activated when a failure is detected.

   Three types of repair path are considered here.

   1. Equal cost path-split.

      Where a link is being used as a member of an equal cost path-split
      set for some destination, the other members of the set may be used
      to provide an alternative path, provided that they avoid the
      network component being protected.

   2. Downstream Path.

      A 'downstream path' is a next hop that will get a packet nearer to
      its destination.
      It does not necessarily represent the shortest path to the
      destination but has the property that a packet sent on it will
      not loop back because, having traversed this hop, it is then
      closer to its destination.

   3. Tunnel.

      A tunneled repair path tunnels traffic to some staging point from
      which it will travel to its destination using normal forwarding
      without looping back. The repair path can be thought of as
      providing a virtual link, originating at a router adjacent to a
      failure, and diverting traffic around the failure.

3.1. Tunnels as Repair Paths

   The repair strategies described in this draft operate on the basis
   that if a packet can somehow be sent to the other side of the
   failure, it will subsequently proceed towards its destination exactly
   as if it had traversed the failed component. See Figure 1.

       Repair Path from A to B
            +-----------+
            |           |
            |           v
      ---->[A]---//----[B]----->

                  Figure 1 Simple Link Repair

   Creating a repair path from A to B may require a packet to traverse
   an unnatural route. If a suitable natural path starts at a neighbor
   (i.e. it is a downstream path), then A can force the packet directly
   there. If this is not the case, then A must use a tunnel to force the
   packet down the repair path. Note that the tunnel does not have to go
   from A to B. The tunnel can terminate at any router in the network,
   provided that A can be sure that the packet will proceed correctly to
   its destination from that router.

   A repair path computed for a link failure may not, however, work
   satisfactorily when the neighboring router has, itself, failed. This
   is illustrated in Figure 2.

             Repair path from A to B
           +-------------------------+
           |                         |
           |            <------------+
      --->[A]---//----[B]----//-----[C]-->
           +---------->              |
           |                         |
           +-------------------------+
             Repair Path from C to B

          Figure 2 Looping Link Repair when Router Fails

   Consider the case of a router B with just two neighbors A and C. When
   router B fails, both A and C will observe the failure of their local
   link to B, but will have no immediate knowledge that B itself has
   failed. If they were both to attempt to repair traffic around their
   local link, they would invoke mutual repairs which would loop.

   Since it is not easy for a router to immediately distinguish between
   a link failure and the failure of its neighbor, repair paths are
   calculated in anticipation of adjacent router failure. Thus, for each
   of its protected links, router A (Figure 3) pre-computes a set of
   tunneled repair paths, one for each of the neighbors (C,D,E) of its
   neighbor B on the A-B link. The set of destinations that are normally
   assigned to link A-B will be assigned to a repair path based on the
   neighbor of B through which router B would have forwarded traffic to
   them.

                 Repair AC
             +----------->[C]
             |             |
             |             |
             |             |
      ----->[A]----//-----[B]---------[D]
             ||            |           ^
             ||            |           |
             || Repair AE  |           |
             |+---------->[E]          |
             |                         |
             +-------------------------+
                      Repair AD

     Figure 3: Repair paths in anticipation of a router failure

   The set of repair paths in Figure 3 will function correctly in the
   case of link and router failure. However, in some network topologies
   they may not provide a means for traffic to reach router B itself.
   This is important in cases where B is a single point of failure and B
   is still functional (i.e. the failure was actually a failure of the
   A-B link).
   Hence, in addition to computing repair paths for the neighbors of
   its neighbor on a protected link, a router also calculates a repair
   path for the neighbor itself. This is illustrated in Figure 4.

                 Repair AB
             +----------------+
             |                |
             |   Repair AC    |
             |+---------->[C] |
             ||            |  /
             ||            | /
             ||            |/
      ----->[A]----//-----[B]---------[D]
             ||            |           ^
             ||            |           |
             || Repair AE  |           |
             |+---------->[E]          |
             |                         |
             +-------------------------+
                      Repair AD

            Figure 4 The full set of A-B repair paths.

   In the event of a failure, the only traffic that is assigned to the
   link repair path (the AB repair) is that traffic which has no other
   path to its destination except via B. As we have already seen, there
   is a danger that traffic assigned to this link repair path may loop
   if B has failed. Therefore, when the repair paths are invoked, a loop
   detection mechanism is used which promptly detects the loop and, upon
   detection, withdraws the link (A-B) repair path from service.

3.2. Tunnel Requirements

   The specific tunneling mechanism used to provide a repair path is
   outside the scope of this document. However, the following sections
   describe the requirements for the tunneling mechanism.

3.2.1. Setup

   When a failure is detected, it is necessary to immediately redirect
   traffic to the repair paths. Consequently, the tunnels used must be
   provisioned beforehand in anticipation of the failure. IP fast
   re-route will determine which tunnels it requires. It must therefore
   be possible to establish tunnels automatically, without management
   action, and without the need to manually establish context at the
   tunnel endpoint.

3.2.2. Multipoint

   To reduce the number of tunnel endpoints in the network, the tunnels
   should be multi-point tunnels capable of receiving repair traffic
   from any IPFRR router in the network.

3.2.3. Directed forwarding

   Directed forwarding must be supported such that the router at the
   tunnel endpoint (P) can be directed by the router at the tunnel
   source (A) to forward the packet directly to a specific neighbor.
   Specification of the directed forwarding mechanism is outside the
   scope of this document.

3.2.4. Security

   A lightweight security mechanism should be supported to prevent the
   abuse of the repair tunnels by an attacker. This is discussed in more
   detail in Section 12.

4. Construction of Repair Paths

4.1. Identifying Repair Path Targets

   To establish protection for a link or node it is necessary to
   determine which neighbors of the neighboring node should be targets
   of repair paths. Normally all neighbors will be used as repair path
   targets. However, in some topologies, not all neighbors will be
   needed as targets because, prior to the failure, no traffic was being
   forwarded through them by the repairing router. This can be
   determined by examining the normal spanning tree computed by the
   repairing router.

   In addition, the neighboring router B will also be the target of a
   repair path for any destinations for which B is a single point of
   failure.

4.2. Determining Tunneled Repair Paths

   The objective of each tunneled repair path is to deliver traffic to a
   target router when a link is observed to have failed. However, it is
   seldom possible to use the target router itself as the tunnel
   endpoint because other routers on the repair path, that have not
   learned of the failure, will forward traffic addressed to it using
   their least cost path which may be via the failed link. This is
   illustrated in Figure 5 in which all link costs are one in both
   directions. Router A's intended repair path for traffic to D when
   link A-B fails is the path W-X-Y-Z-D.
   However, if router A makes D the tunnel endpoint and forwards the
   packet to router W, router W will immediately return it to A because
   its least cost path to D is A-B-D (cost 3 versus cost 4) and it has
   no knowledge of the failure of link A-B.

      [A]--//--[B]------[D]
       |                 |
       |                 |
      [W]---[X]---[Y]---[Z]

            Figure 5. Repair path to target router D.

   Thus the tunnel endpoint needs to be somewhere on the repair path
   such that packets addressed to the tunnel endpoint will not loop
   back towards router A. In addition, the release point needs to be
   somewhere such that when packets are released from the tunnel they
   will flow towards the target router (or their actual destination)
   without being attracted back to the failed link. By inspection, in
   Figure 5, suitable tunnel endpoints are routers X, Y, and Z.

   Note that it is not essential that traffic assigned to a repair path
   actually traverse the target router for which the repair path was
   created. If, for example, in Figure 5, a packet's destination were
   normally reached via the path A-B-D-Z-?-?-?, once released at any of
   the possible tunnel endpoints, it would arrive at its destination by
   the best available route without traversing D.

   In general, the properties that are required of tunnel endpoints are:

   o  the endpoint must be reachable from the tunnel source
      without traversing the failed link; and

   o  once released, tunneled packets will proceed towards their
      destination without being attracted back over the failed link
      or node.

   Provided both of these conditions are met, packets forwarded on the
   repair path will not loop.

   In some topologies it will not be possible to find a tunnel endpoint
   that exhibits both the required properties.
   For example, in Figure 5, if the cost of link X-Y were increased
   from one to four in both directions, there is no longer a viable
   endpoint within the fragment of the topology shown.

   To solve this problem we introduce the concept of directed forwarding
   from the tunnel endpoint. Directed forwarding allows the originator
   of a tunneled packet to instruct that, when it is decapsulated at
   the end of the tunnel, it be forwarded via a specific adjacency, and
   not be subjected to the normal forwarding decision process. This
   effectively allows the tunnel to be extended by one hop. So, for
   example, in Figure 5 with the cost of link X-Y set to four, it would
   be possible to select X as the tunnel endpoint with the directive
   that X always forward the packets it decapsulates via the
   adjacency to Y. Thus, router X is reached from A using normal
   forwarding, and directed forwarding is then used to force packets to
   router Y, from where D can be reached using normal forwarding.

   Provided link costs are symmetrical, it can be proved that it is
   always possible to compute a tunneled repair path (possibly using
   directed forwarding) around a link failure.

   The tunnel endpoint (P) and the release point (Q) may be coincident,
   or may be separated by at most one hop.

4.2.1. Computing Repair Paths

   For a router A, determining tunneled repair paths around a
   neighboring router B, the set of potential tunnel endpoints includes
   all the routers that can be reached from A using normal forwarding
   without traversing the failed link A-B. This is termed the "P-space"
   of A with respect to the failure of B. Any router that is on an equal
   cost path split via the failed link is excluded from this set.

   The resulting set defines all the possible tunnel endpoints that
   could be used in repair paths originating at router A for the failure
   of link A-B.
   This set can be obtained by computing a spanning tree rooted at A
   and excising the subtree reached via the A-B link.

   The set of possible release points can be determined by computing the
   set of routers that can reach the repair path target without
   traversing the failed link. This is termed the "Q-space" of the
   target with respect to the failure. The Q-space can be obtained by
   computing a reverse spanning tree rooted at the repair path target,
   with the subtree which traverses the failed link (or node) excised.

   The reverse spanning tree uses the cost towards the root rather than
   from it and yields the best paths towards the root from other nodes
   in the network.

   The intersection of the target's Q-space with A's P-space includes
   all the possible release points for any repair path not employing
   directed forwarding. Where there is no intersection, but there exists
   a pair of adjacent routers, P in A's P-space and Q in the target's
   Q-space, router P can be used as the tunnel endpoint with directed
   forwarding to the release point Q.

4.2.2. Extended P-space

   The description in section 4.2.1 calculated router A's P-space rooted
   at A itself. However, since router A will only use a repair path when
   it has detected the failure of the link A-B, the initial hop of the
   repair path need not be subject to A's normal forwarding decision
   process. Thus we introduce the concept of extended P-space. Router
   A's extended P-space is the union of the P-spaces of each of A's
   neighbors. The use of extended P-space may allow router A to repair
   to targets that were otherwise unreachable.

4.2.3. Downstream Paths

   Under certain circumstances, the target's Q-space will include a
   router that is a neighbor of A.
   This is traditionally referred to as a downstream path and has the
   property that a packet sent on it will not loop back because, having
   traversed this hop, it is then closer to its destination. A trivial
   example of this is shown in Figure 6.

      [A]--//---[B]
       |         |
       2         |
       |         |
      [D]-------[C]

    Figure 6. A topology that will permit a single-hop release point

   When a downstream path exists, no tunneling is required.

4.2.4. Selecting Repair Paths

   The mechanism described in section 4.2 will identify all the possible
   release points that can be used to reach each particular target. (The
   circumstances when no release points exist are described in
   section 4.4.) In a well-connected network there are likely to be
   multiple possible release points for each target, and all will work
   correctly. For simplicity, one release point per target is chosen.
   All will deliver the packets correctly so, arguably, it does not
   matter which is chosen. However, one release point may be preferred
   over the others on the basis of path cost or some other criterion. It
   is an implementation matter as to how the release point is selected.

4.3. Assigning Traffic to Repair Paths

   Once the repair path for each target has been selected, it is
   necessary to determine which of the destinations normally reached via
   the protected link should be assigned to which of the repair paths
   when the link fails.

   This is achieved by recording which neighbor of B would be used to
   reach each destination reachable over A-B when running the original
   SPF. Traffic assignment is then simply a matter of assigning the
   traffic which B would have forwarded via each neighbor to the repair
   path which has that neighbor as its target.
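
   The repair path computation of sections 4.2.1 and 4.2.2, on which
   this assignment depends, can be illustrated with a short sketch. The
   Python below is not part of the mechanism specified here. It is a
   minimal illustration that assumes symmetric link costs (so that the
   reverse tree used for Q-space has the same costs as the forward
   tree), that leaves B out of the neighbors used for extended P-space,
   and that applies the P-space and Q-space definitions to the topology
   of Figure 5. All names in it (dijkstra, build, p_space,
   extended_p_space, q_space, release_points, FIG5) are hypothetical.

```python
import heapq

INF = float("inf")


def dijkstra(graph, src):
    """Least-cost distance from src to every reachable router."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, INF):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist


def build(links):
    """Build a symmetric-cost topology from (router, router, cost) triples."""
    graph = {}
    for u, v, cost in links:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def p_space(graph, root, link):
    """Routers that root reaches with no least-cost path (equal cost
    path splits included) crossing the protected link."""
    a, b = link
    from_root = dijkstra(graph, root)
    from_end = {a: dijkstra(graph, a), b: dijkstra(graph, b)}
    members = set()
    for n in graph:
        # Cheapest path from root to n that is forced across the link.
        via = min(from_root.get(u, INF) + graph[u][v] + from_end[v].get(n, INF)
                  for u, v in ((a, b), (b, a)))
        if from_root.get(n, INF) < via:
            members.add(n)
    return members


def extended_p_space(graph, root, link):
    """Union of the P-spaces of root's neighbors (assumption: the
    neighbor at the far end of the protected link is left out)."""
    far_end = link[1] if link[0] == root else link[0]
    members = set()
    for nbr in graph[root]:
        if nbr != far_end:
            members |= p_space(graph, nbr, link)
    return members


def q_space(graph, target, link):
    """Routers that reach target without crossing the link. With
    symmetric costs this equals the P-space rooted at the target."""
    return p_space(graph, target, link)


def release_points(graph, root, target, link):
    """Return (release points needing no directed forwarding,
    (P, Q) pairs usable with directed forwarding when none exist)."""
    p = extended_p_space(graph, root, link)
    q = q_space(graph, target, link)
    plain = p & q
    directed = set() if plain else {
        (x, y) for x in p for y in graph[x]
        if y in q and {x, y} != set(link)}
    return plain, directed


# Figure 5, all link costs one in both directions.
FIG5 = [("A", "B", 1), ("B", "D", 1), ("A", "W", 1), ("W", "X", 1),
        ("X", "Y", 1), ("Y", "Z", 1), ("Z", "D", 1)]
```

   Under these assumptions, release_points(build(FIG5), "A", "D",
   ("A", "B")) gives X, Y and Z as release points with no directed
   forwarding, which agrees with the inspection of Figure 5 in
   section 4.2. Raising the X-Y cost to four leaves no common router
   and instead gives the single pair (X, Y), that is, a tunnel to X
   with directed forwarding to Y.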
   Although the repair paths are calculated based on traffic addressed
   to specific targets, it can be proved that the traffic assignment
   algorithm guarantees that the repair path can be used for any traffic
   assigned to it.

   Where B would normally split the traffic to a particular destination
   via two or more of its neighbors, it is an implementation decision
   whether the repaired traffic should be split across the corresponding
   set of repair paths.

   The repair path to B itself is normally used just for traffic
   destined for B and any prefixes advertised by B. However, under some
   circumstances, it may be impossible to compute a repair path to one
   or more of B's neighbors, for example, because B is a single point of
   failure. In this case traffic for the destinations served by the
   otherwise irreparable targets is assigned to the repair path with B
   as its target, on the optimistic assumption that router B is still
   functioning. If router B is indeed still functioning, this will
   ensure delivery of the traffic. If, however, router B has failed, the
   traffic on this repair path will loop as previously shown in
   section 3.1. The way this is detected, and the course of action when
   it is detected, are described in section 5.3.

4.4. When no Repair Path is Possible

   Under some circumstances, it will not be possible to identify a
   repair path to one or more of the targets. This can occur for the
   following reasons:

   o  The neighboring router that is presumed to have failed
      constitutes a single point of failure in the network.

   o  Severely asymmetric link costs may cause an otherwise viable
      physical repair path to be unusable.

   o  Interference may occur between the repair paths of individual
      targets.

   In practice, these cases are unlikely to be encountered frequently.
   Networks that will benefit from the mechanisms described here will
   usually exhibit considerable redundancy and are normally operated
   with largely symmetric link costs. Note that a router's inability to
   compute a full set of repair paths for one of its links does not
   necessarily affect its ability to do so for its other links.

   Example topologies illustrating each of the three cases above are
   described in the following subsections.

4.4.1. Unreachable Target

   If the failure of a neighboring router makes one or more of its
   neighbors genuinely unreachable, clearly it will not be possible to
   establish a repair path to such targets. Such single points of
   failure are not expected to be encountered frequently in properly
   designed networks, and will probably occur only when the network has
   previously suffered other failures that have reduced its
   connectivity.

4.4.2. Asymmetric Link Costs

   When link costs have been set asymmetrically, it is possible that a
   repair path cannot be constructed even using directed forwarding.

   Although it is trivial to construct a network fragment with this
   property, this should not be regarded as a major problem. Firstly,
   asymmetric link costs are seldom used deliberately. And, secondly,
   even when an asymmetric link cost prevents one potential repair path
   being used, there will normally be other ones available.

4.4.3. Interference Between Potential Node Repair Paths

   Under some circumstances the existence of one neighbor may interfere
   with a potential repair path to another. Consider the topology shown
   in Figure 7 in which all links have a symmetrical cost of one, with
   the exception of that between H and G, which has a cost of 3. In this
   example, the fact that router F is a neighbor of B prevents the
   discovery of a repair path from router A to router C despite the
   existence of an apparently suitable path.
            [A]---//---[B]-------[C]
             |          |         |
             |          |         |
            [H]-3-[G]--[F]--[E]--[D]

   Figure 7. Interference between repair paths

   A repair path from router A to F can use F itself as the release
   point by employing directed forwarding from G. However, it is not
   possible to identify a suitable release point for a repair path to
   router C within the topology shown since there is nowhere that
   router A can reach that will subsequently forward traffic to router C
   except via the forbidden link B-C (F's least cost path to C is
   F-B-C). This is because the extended P-space of router A is separated
   by more than one hop from the Q-space of router C.

   Since the topology shown in Figure 7 will typically form part of a
   much larger topology, a different and possibly more circuitous repair
   path from A to C that does not go via F may be discovered. This is
   illustrated in Figure 8. In this enhanced topology, a repair path to
   C using Y as the release point can be used.

            [A]---//---[B]-------[C]
             |          |         |
             |          |         |
            [H]-3-[G]--[F]--[E]--[D]
                   |         |
                   |         |
                  [X]--[Y]--[Z]

   Figure 8. Resolving interference in a larger network

   Note that, in Figure 8, if the traffic for C were assigned to the
   repair path for F, it would correctly reach C because F would assign
   it to its repair path to C. That is, packets from A to C would travel
   via two successive tunnels. Consequently, this is referred to as a
   "secondary repair path". However, it is not always the case that
   interference can be handled in this fashion and it is possible to
   create looping repair paths.

   One possibility of looping repair paths is illustrated in Figure 9.
   All links have a symmetrical cost of one with the exception of HG,
   which is cost 3 in either direction, and ED and DC which are cost 5
   in the indicated direction and cost 1 in the other.
            [A]---//---[B]--------[C]
             |          |          |^
             |          |          |5
            [H]-3-[G]--[F]--[E]---[D]
                               5>

   Figure 9. Looping secondary repair paths

   In this topology, A can establish a repair path to F, but cannot
   establish a repair path to C because of interference. Router A might
   assign traffic intended for C onto its repair path to F expecting it
   to undergo a secondary repair towards C. However, because of the
   asymmetrical link costs, F is unable to establish a repair path to C.
   It is only able to establish a repair path to A. If F, like A,
   elected to forward repaired traffic to C using its (only) repair path
   to A, similarly expecting a secondary repair to get it to its
   destination, traffic for C would loop between A and F. Thus when
   interference occurs, the possibility of a secondary repair path
   cannot be relied upon to ensure that traffic reaches its destination.

   In order to determine the viability of secondary repair paths, it is
   necessary for each router to take into account the repair paths which
   the other neighbors of router B can achieve. These can be computed
   locally by running the repair path computation algorithms rooted at
   each of those neighbors. It is only necessary to compute the repair
   paths from the routers to which router A can establish repair paths,
   with targets of those routers to which repair paths have not yet been
   established.

   It is then possible to determine whether all routers can now be
   reached by invoking secondary (or if necessary tertiary, etc.) repair
   paths, and if so, to which primary repair path traffic for each
   target should be assigned.

   There is another, more subtle, possibility of loops arising when
   secondary repair paths are used. This is illustrated in Figure 10,
   where all links are cost 1 with the exception of JI which has a cost
   5 in that direction and cost 1 in the direction IJ.
            [A]---//---[B]--------[C]
             |          |          |
             |          |          |
            [J]         |         [D]
            5|          |          |
            v|          |          |
            [I]---[H]--[G]---[F]--[E]

   Figure 10. Example of an apparently non-looping secondary repair path
   which results in a loop.

   Router A has a primary repair path to G (with a release point of I),
   and G has a primary repair path to C (with a release point of E). It
   would appear that these form a non-looping secondary repair path from
   A to C. As usual, the primary repair path from A to G has been
   computed on the basis of destinations normally reachable through BG.
   However, when making use of the secondary repair path, the traffic
   inserted in the repair path from A to G will be destined not for one
   of the routers normally reachable via BG, but for C. Hence this
   repair path is not necessarily valid for such traffic, and in this
   example it will have a 50% probability of being forwarded back along
   the path IJABC, and hence looping.

   This problem can in general be avoided by choosing a release point
   for the initial primary repair with the property that traffic for the
   secondary target (C) is guaranteed to traverse the primary target
   (G). This can be achieved by computing the reverse SPF rooted at the
   secondary target (C) and examining the sub-tree which traverses the
   primary target. It can be proved that in the absence of asymmetric
   link costs, such a release point will always exist. Where asymmetric
   link costs prevent this, the traffic can be encapsulated to the
   intermediate router (G), which may require the use of double
   encapsulation. On reaching router G, the traffic for C is
   decapsulated and then forwarded on G's primary repair path to C (via
   router E, in the example).

4.5. Multi-homed Prefixes

   Up to this point, it has been assumed that any particular prefix is
   "attached" to exactly one router in the network, and consequently
   only the routers in the network need be considered when constructing
   repair paths, etc. However, in many cases the same prefix will be
   attached to two or more routers. Common cases are:

   o  The subnet present on a link is advertised from both ends of
      the link.

   o  Prefixes are propagated from one routing domain to another by
      multiple routers.

   o  Prefixes are advertised from multiple routers to provide
      resilience in the event of the failure of one of the routers.

   In general, this causes no particular problems, and the shortest
   route to each prefix (and hence which of the routers to which it is
   attached should be used to reach it) is resolved by the normal SPF
   process. However, in the particular case where one of the instances
   of a prefix is attached to router B, or to a router for which router
   B is a single point of failure, the situation is more complicated.

                  p
                  |
                  |
      [A]---//---[B]--------[C]
       |                     |                               p
       |                     |                               |
      [W]-----[X]----[Y]----[Z]-[G]-[H]-[I]-[J]-[K]-[L]-[M]-[N]

   Figure 11. A multi-homed prefix p

   Consider a prefix p, which is attached to router B and some other
   router N as illustrated in Figure 11. Before the failure of the link
   A-B, p is reachable from A via A-B. After the failure it cannot be
   assumed that B is still reachable. If traffic to p is assigned to a
   link repair path to B (as it would be if p were attached only to B),
   and router B has failed, then it would loop and subsequently be
   dropped. Traffic for p cannot simply be assigned to whatever repair
   path would be used for traffic to N, because other routers, which are
   not yet aware of any failure, may direct the traffic back towards B,
   since the instance of p attached to B is closer.
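   The risk described above can be illustrated with a small Python
   sketch. It is an illustration only: it assumes that the two vertical
   links in Figure 11 are A-W and C-Z, that every link has a cost of 1,
   and that p is attached at zero cost to both B and N. None of these is
   stated in the text, and the function names are hypothetical. For each
   router, the sketch reports which instance of p that router regards as
   closest while it is still unaware of the failure.

```python
from collections import deque

# Links of Figure 11, read as A-W and C-Z for the two vertical links
# (an assumption about the drawing); every link is given cost 1.
LINKS = [("A", "B"), ("B", "C"), ("A", "W"), ("C", "Z"),
         ("W", "X"), ("X", "Y"), ("Y", "Z"), ("Z", "G"), ("G", "H"),
         ("H", "I"), ("I", "J"), ("J", "K"), ("K", "L"), ("L", "M"),
         ("M", "N")]

def hop_counts(links, root):
    """Breadth-first search: hop count from root to every router."""
    adj = {}
    for u, v in links:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def preferred_attachment(links, attachments):
    """For each router, list the attachment point(s) of p it sees as closest."""
    dists = {a: hop_counts(links, a) for a in attachments}
    result = {}
    for router in dists[attachments[0]]:
        best = min(dists[a][router] for a in attachments)
        result[router] = [a for a in attachments if dists[a][router] == best]
    return result

# Routers other than A have not yet heard of the failure, so they still
# use the pre-failure topology, which includes the A-B link.
choice = preferred_attachment(LINKS, ["B", "N"])
for router in ["W", "X", "Y", "Z", "G", "H", "I", "J"]:
    print(router, "prefers p via", choice[router])
```

   Under these assumptions W, X, Y and Z all still prefer the instance
   of p attached to B. Traffic for p that is forwarded natively by any
   of them is therefore sent towards B rather than towards N.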
   A solution is to treat p itself as a neighbor of B, and compute a
   repair path with p as a target. However, although correct, this
   solution may be infeasible where there are a very large number of
   such prefixes, which would result in an unacceptably large
   computational overhead.

   Some simplification is possible where there exist a large number of
   multi-homed prefixes which all share the same connectivity and
   metrics. These may be treated as a single router and a single repair
   path computed for the entire set of prefixes.

   An alternative solution is to tunnel the traffic for a multi-homed
   prefix to the router N where it is also attached (see Figure 11). If
   this involves a repair path that was already tunneled, then this
   requires double encapsulation.

4.6. Equal Cost Path Splits

   Equal cost path splits may be used as a repair mechanism, but link
   and node repairs need to be considered separately.

4.6.1. Equal Cost Path Splits as Link Repair Paths

   When a link is used as a member of one or more path-split sets, by
   definition, the destinations served could be equally well served by
   any other member of the path-split set. Therefore, when the link
   fails, any destinations that use the link as a path-split may be
   immediately assigned to another member of the set. Clearly, if
   traffic to some destinations can be repaired using a path split, it
   should not also be subject to repair by tunneling. Such destinations
   should be identified before performing traffic assignment to tunneled
   repair paths.

4.6.2. Equal Cost Path Splits and Node Failure

   An equal cost path split may traverse the failed node (router B). In
   this case, the path split may not be an appropriate repair path.
   There are two cases:

   o  the path split is a parallel link, having router B as a
      direct neighbor, and

   o  the path split does not have router B as a direct neighbor,
      but the route traverses router B at some point further
      downstream.

   These are illustrated in Figure 12 and Figure 13 respectively.

             +---//---+
            [A]      [B]-------[D]
             +--------+

   Figure 12. A parallel link path split

             +-2-//---+
            [A]      [B]-------[D]
             +--[C]---+

   Figure 13. A path split via an intermediate node

   In both cases it must be assumed that router B has failed and some
   other repair path, diverse with respect to router B, must be used.

4.7. LANs and pseudonodes

   In link state protocols a LAN is represented by a construct known as
   a pseudonode in IS-IS and a network LSA in OSPF.

   In order to deal correctly with this representation of LANs, the
   algorithms described in this draft require certain modifications.
   There are four cases which require consideration. These are described
   in the following subsections.

4.7.1. The Link between Routers A and B is a LAN

   In this case, the link which is being protected is a LAN, and the
   router B which has potentially failed is reachable over the LAN. This
   is illustrated in Figure 14.

                     [A]
                      |
            =====================
             |     |     |     |
            [B]   [C]   [D]   [E]

   Figure 14. The link between routers A and B is a LAN

   There are two possible failure modes in this case.

4.7.1.1. Case 1

   Router B or its interface to the LAN may have failed independently of
   the rest of the LAN. In this case the remaining routers on the LAN
   (routers C, D and E) will remain reachable from router A. These
   routers do not appear as direct neighbors of router B in the link
   state database and are not treated as neighbors of router B for the
   purposes of this specification because no traffic from router A would
   be directed through router B to any of these routers.
   However, each of these neighboring routers will have router B as a
   neighbor and they will initiate their own repair paths in the event
   of the failure of router B or its LAN interface.

   Repair paths are computed with the non-LAN neighbors of B as targets,
   and also B itself (the "link-failure" repair path). Note that since
   the remaining neighbors of A on the LAN are assumed to be still
   reachable when the link to B has failed, these repair paths may
   traverse the LAN.

   A separate set of repair paths is required in anticipation of the
   potential failure of each router on the LAN.

4.7.1.2. Case 2

   Router A's interface to the LAN may have failed (or the entire LAN
   may have failed). In either event, simultaneous failures will be
   observed from router A to all the remaining routers on the LAN
   (routers B, C, D and E). In this case, the pseudonode itself can be
   treated as the "adjacent" router (i.e. the router normally referred
   to as "router B"), and repairs constructed using the normal
   mechanisms with all the neighbors of the pseudonode (routers B, C, D
   and E) as repair path targets. If one or more of the routers had
   failed in addition to the LAN connectivity, treating it as a repair
   path target would not be viable, but this would be a case of multiple
   simultaneous failures which is out of scope of this specification.

   The entire sub-tree over A's LAN interface is the failed component
   and is excised from the spanning tree when computing A's extended
   P-space. For the Q-spaces of the targets, the sub-tree over the LAN
   interface of the target is excised.

4.7.1.3. Simplified LAN repair

   A simpler alternative strategy is to always consider the LAN and all
   routers attached to it as failing as a single unit.
   In this case, a single set of repair paths is computed with targets
   being the entire set of non-LAN neighbors of all the routers on the
   LAN, together with "link-repair" paths with all the routers on the
   LAN as targets. Any failure of one or more LAN adjacencies results in
   these repair paths being invoked for all neighbors on the LAN. These
   repair paths must not traverse the LAN, and so must be computed by
   excising the entire sub-tree reachable over A's LAN interface from
   A's spanning tree (i.e. the entire LAN is the failed component). The
   Q-spaces are computed as normal, with the LAN neighbors or their
   interface to the LAN being excised as appropriate. This is simpler
   than the approach proposed above, but will fail to make use of
   possible repair paths (or even path splits) over the LAN. In
   particular, if the only viable repair paths involve the LAN, it will
   prevent any repair being possible.

4.7.2. A LAN exists at the release point

   When computing the viable release points, it may be that one or more
   of the leaf nodes are actually pseudonodes. In this case, the release
   point is deemed to be any of the parent nodes on the LAN by which the
   pseudonode had been reached, and when computing the extended set of
   release points (reachable by directed forwarding), all the remaining
   routers on the LAN may be included.

4.7.3. A LAN between B and its neighbors

   If there is a LAN between router B and one or more of B's neighbors
   (other than router A), then rather than treating each of those
   neighbors as a separate target to which a repair path must be
   computed, the pseudonode itself can be treated as a single target for
   which a repair path can be computed.
   If there are other neighbors of B which are directly attached to B,
   including those which may also be attached to the LAN, each of them
   must still be treated as an individual repair path target.

   Normally a repair path with the pseudonode as its target will have a
   release point before the pseudonode. However, it is possible that the
   release point would be computed as the pseudonode itself. This will
   occur if the reverse spanning tree rooted at the pseudonode includes
   no routers other than itself. In this case a single repair with the
   pseudonode as target is not possible, and it is necessary to compute
   individual repair paths whose targets are the neighbors of B on the
   LAN.

4.7.4. The LAN is a Transit Subnet

   This is the most common case, where a LAN is traversed by a repair
   path, but is not in any of the special positions described above. In
   this case no special treatment is required, and the normal SPF
   mechanisms are applicable.

5. Failure Detection and Repair Path Activation

   The details of repair path activation are inherently
   implementation-dependent and must be addressed by individual design
   specifications. This section describes the implementation-independent
   aspects of the failover to the repair path.

5.1. Failure Detection

   The failure detection mechanism must provide timely detection of the
   failure and activation of the repair paths. The failure detection
   mechanisms may be media specific (for example loss of light), or may
   be generic (for example BFD). Multiple detection mechanisms may be
   used in order to improve detection latency. Note that in the case of
   a LAN it may be necessary to monitor connectivity to all of the
   adjacent routers on the LAN.

5.2. Repair Path Activation

   The mechanism used by the router to activate the repair path
   following failure will be implementation specific.
   An implementation that is capable of withdrawing the repair may delay
   the start of network convergence in order to minimize network
   disruption in the event that the failure was transient.

5.3. Node Failure Detection Mechanism

   When router A detects a failure of the A-B link, it will invoke the
   link repair path from itself to router B. This A-B link repair is
   always invoked because even if all other traffic can be re-routed, B
   is always a single point of failure to itself. If router B has
   failed, the A-B link repair can result in a forwarding loop. A node
   failure detection mechanism is therefore needed. A suitable mechanism
   might be to run BFD [BFD] between A and B, over the A-B link repair
   path.

   When the node failure detection mechanism has determined that router
   B has failed, the A-B link repair path is withdrawn. The node failure
   detection and revocation of the A-B link repair need to be expedited,
   in order to minimize the duration of collateral damage to the network
   caused by packets looping around the A-B link repair path.

   If B is a single point of failure to some destinations, then
   withdrawing the A-B link repair has no impact on network
   connectivity, because those destinations will have been rendered
   unreachable by the failure of router B.

   If B is not a single point of failure, but traffic to some
   destinations is being repaired via the A-B link repair because of the
   inability to provide suitable repair paths, then there are
   destinations that are rendered temporarily unreachable by IPFRR. The
   IPFRR loop-free convergence mechanism delays normal convergence of
   the network. Consideration therefore has to be given to the relative
   importance of the traffic being protected and the traffic being
   black-holed. Depending on the outcome of that consideration, the
   IPFRR loop-free strategy may need to be abandoned.

6. Loop Free Transition

   Once the repair paths have been activated, data will again be
   forwarded correctly. At this stage only the routers directly adjacent
   to the failure will be aware of the failure because no routing
   information concerning the failure has yet been propagated to other
   routers. The network now has to be transitioned to normal operation
   using the available components.

   During network transition inconsistent state may lead to the
   formation of micro-loops. During this period, packets may be
   prevented from reaching the repair path, may expire due to transiting
   an excessive number of hops, may be subject to excessive delay, and
   the resultant congestion may disrupt the passage of other packets
   through the network. The use of a loop free transition technique
   allows the network to re-converge without packet loss or disruption.

   Four loop free transition strategies are described:

   o  Incremental Cost Advertisement

   o  Single Tunnel Per Router

   o  Distributed Tunnels

   o  Ordered SPFs

6.1. Incremental Cost Advertisement

   When a link fails, the cost of the link is normally changed from its
   assigned metric to "infinity". However, it can be proved that if the
   link cost is increased in suitable increments, and the network is
   allowed to stabilize before the next cost increment is advertised,
   then no micro-loops will form.

   This approach has the advantage that it requires no change to the
   routing protocol, and will work with non-IPFRR capable routers.
   However the loop-free transition is slow, particularly if large
   metrics are used, and during this time the network is vulnerable to a
   second failure.

6.2. Single Tunnel Per Router

   When a failure is detected, the routers adjacent to the failure issue
   a "covert" announcement of the failure, which is propagated through
   the network by all routers, but which is understood only by IPFRR
   capable routers. These routers each build a tunnel to the closest
   IPFRR router adjacent to the failure. They then determine which of
   their traffic would transit the failure and place that traffic in the
   tunnel. When all of these tunnels are in place, the failure is then
   announced as normal. Because the tunnel will be unaffected by the
   transition, and because the IPFRR router at the tunnel endpoint will
   continue the repair, no traffic will be disrupted by the failure.
   When the network has converged, the IPFRR routers can withdraw the
   tunnels. The order of tunnel insertion and withdrawal is not
   important, provided the tunnels are all in place before the normal
   announcement.

   This technique has the disadvantage that it requires traffic to be
   tunneled during the transition.

   A further disadvantage of this method is that it requires
   co-operation from all the routers within the routing domain to fully
   protect the network against micro-loops. However it can be shown that
   micro-loops will be confined to contiguous groups of non-IPFRR
   capable routers, and will only affect traffic arriving at the network
   through one of those routers.

6.3. Distributed Tunnels

   This is similar to the single tunnel per router approach except that
   all IPFRR capable routers calculate their own set of repair paths,
   using the same algorithms, for their traffic that will be affected by
   the failure.

   This reduces the load on the tunnel endpoints, but the length of time
   taken to calculate the repairs increases the convergence time.

   This method suffers from the same disadvantages as the single tunnel
   method.

6.4. Ordered SPFs

   Micro-loops occur when a router closer to the failed component
   revises its routes to take account of the failure before a router
   which is further away. By analyzing the reverse spanning tree over
   which traffic is directed to the failed component, it is possible to
   determine a strict ordering which ensures that routers closer to the
   root always process the failure after any routers further away, and
   hence micro-loops are prevented.

   When the failure has been announced, each router waits a multiple of
   some time delay value. The multiple is determined by the router's
   position in the reverse spanning tree, and the delay value is chosen
   to guarantee that a router can complete its processing within this
   time. The convergence time may be reduced by employing a signaling
   mechanism to notify the parent when all the children have completed
   their processing, and hence when it is safe for the parent to
   instantiate its new routes.

   The property of this approach is therefore that it imposes a delay
   which is bounded by the network diameter, although in most cases it
   will be much less.

   It requires all routers in the domain to operate according to these
   procedures, and the presence of non-co-operating routers can give
   rise to loops for any traffic which traverses them (not just traffic
   which is originated through them).

7. Restoring Failed Components to Service

   When a failed neighbor or link is restored to service, it will be
   detected according to the normal operation of the routing protocols
   by the formation of an adjacency. Normally this would result in the
   information about the link being included in newly generated routing
   information. However, just as in the case with increasing costs, the
   sudden decrease in cost from "infinity" to the configured value of
   the link cost may give rise to loops.
   Each of the loop-free transition mechanisms described above has a
   corresponding mechanism that can be used to add a link to the network
   without the formation of micro-loops.

8. Implications for Network Management

   It will be clear from the above that topology changes introduced by
   management action, such as enabling or disabling a link or router, or
   changing the cost metric of a link, may result in disruption of
   traffic due to the formation of micro-loops. It will also be clear
   that the loop-free convergence strategies described above can equally
   be applied to the prevention of such micro-loops.

9. IPFRR Capability

   In the previous sections it has been assumed that all routers in the
   network are capable of acting as IPFRR routers, performing such tasks
   as tunnel termination and directed forwarding. In practice this is
   unlikely to be the case, partly because of the heterogeneous nature
   of a practical network, and partly because of the need to
   progressively deploy such capability. IPFRR therefore needs to
   support some form of capability announcement, and the algorithms need
   to take these capabilities into account when calculating their path
   repair strategies. For example, the ability of routers to function as
   tunnel end points and perform directed forwarding will influence the
   choice of repair path. However, routers which are simply traversed by
   repair paths (tunneled or not) do not need to be IPFRR capable in
   order to guarantee correct operation of the repair paths.

10. Enhancements to routing protocols

   It will be seen from the above that a number of enhancements to the
   appropriate routing protocols are needed to support IPFRR.
   The following possible enhancements have been identified:

   o  The ability to advertise IPFRR capability

   o  The ability to advertise tunnel endpoint capability

   o  The ability to advertise directed forwarding identifiers

   o  The ability to announce the start of a loop-free transition,
      and to abort a loop-free transition.

   o  The ability to signal transition completion status to
      neighbors.

   o  The ability to advertise that a link is protected.

   Capability advertisement should make use of existing capability
   mechanisms in the routing protocols. The exact set of enhancements
   will depend on specific IPFRR design choices.

11. IANA considerations

   There are no IANA considerations that arise from this architectural
   description of IPFRR. However there will be changes to the IGPs to
   support IPFRR in which there will be IANA considerations.

12. Security Considerations

   Changes to the IGPs to support IPFRR do not introduce any additional
   security risks.

   The security implications of the increased convergence time due to
   the loop avoidance strategy depend on the approach to multiple
   failures. If the presence of multiple failures results in the network
   aborting the loop free strategy, then the convergence time will be
   similar to that of a conventional network. On the other hand, an
   attacker in a position to disrupt part of a network might use this to
   disrupt the repair of a critical path.

   The tunnel endpoints need to be secured to prevent their use as a
   facility by an attacker. Performance considerations indicate that
   tunnels cannot be secured by IPsec [IPSEC]. A system of packet
   address policing, both at the tunnel endpoints and at the edges of
   the network would prevent an attacker's packet arriving at a tunnel
   endpoint and would seem to be the best strategy.

   When a fast re-route is in progress, there may be an unacceptable
   increase in traffic load over the repair path. Network operators
   need to examine the computed repair paths and ensure that they have
   sufficient capacity.

Acknowledgments

   The authors acknowledge the significant technical contributions made
   to this work by their colleagues: John Harper and Kevin Miles.

IPR Disclosure Acknowledgement

   By submitting this Internet-Draft, we certify that any applicable
   patent or other IPR claims of which we are aware have been
   disclosed, and any of which we become aware will be disclosed, in
   accordance with RFC 3668.

Normative References

   Internet-drafts are works in progress available from
   http://www.ietf.org/internet-drafts/

Informative References

   Internet-drafts are works in progress available from
   http://www.ietf.org/internet-drafts/

   BFD        Katz, D., and Ward, D., "Bidirectional Forwarding
              Detection", draft-katz-ward-bfd-01.txt, August 2003
              (work in progress).

   IPSEC      Kent, S., and Atkinson, R., "Security Architecture for
              the Internet Protocol", RFC 2401.

Authors' Addresses

   Stewart Bryant
   Cisco Systems,
   250, Longwater Avenue,
   Green Park,
   Reading, RG2 6GB,
   United Kingdom.
   Email: stbryant@cisco.com

   Clarence Filsfils
   Cisco Systems,
   De Kleetlaan 6a,
   1831 Diegem,
   Belgium
   Email: cfilsfil@cisco.com

   Stefano Previdi
   Cisco Systems,
   Via Del Serafico 200
   00142 Roma,
   Italy
   Email: sprevidi@cisco.com

   Mike Shand
   Cisco Systems,
   250, Longwater Avenue,
   Green Park,
   Reading, RG2 6GB,
   United Kingdom.
   Email: mshand@cisco.com

Full Copyright statement

   Copyright (C) The Internet Society (2004). All Rights Reserved.

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.