idnits 2.17.1 draft-ietf-rtgwg-microloop-analysis-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 17. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 732. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 745. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** The document seems to lack an RFC 3978 Section 5.5 (updated by RFC 4748) Disclaimer -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack an RFC 3979 Section 5, para. 2 IPR Disclosure Acknowledgement -- however, there's a paragraph with a matching beginning. Boilerplate error? Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document is more than 15 pages and seems to lack a Table of Contents. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 16 longer pages, the longest (page 2) being 60 lines == It seems as if not all pages are separated by form feeds - found 0 form feeds but 17 pages Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** There is 1 instance of too long lines in the document, the longest one being 3 characters in excess of 72. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 292: '... The route SHALL be updated with...' RFC 2119 keyword, line 296: '... The route SHALL be updated with...' RFC 2119 keyword, line 298: '...porary next-hops SHALL be used for the...' RFC 2119 keyword, line 299: '... After DELAY_TYPEB, the route SHALL be...' RFC 2119 keyword, line 303: '...imary) next-hops SHALL continue to be ...' (10 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (October 2005) is 6761 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Possible downref: Non-RFC (?) normative reference: ref. 'ISIS' == Outdated reference: A later version (-12) exists of draft-ietf-rtgwg-ipfrr-spec-base-03 == Outdated reference: A later version (-13) exists of draft-ietf-rtgwg-ipfrr-framework-04 -- Obsolete informational reference (is this intentional?): RFC 3137 (ref. 'STUB') (Obsoleted by RFC 6987) Summary: 10 errors (**), 0 flaws (~~), 6 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group Alex Zinin 3 Internet Draft Alcatel 4 Expiration Date: June 2006 October 2005 5 File name: draft-ietf-rtgwg-microloop-analysis-01.txt 7 Analysis and Minimization of Microloops in 8 Link-state Routing Protocols 10 draft-ietf-rtgwg-microloop-analysis-01.txt 12 Status of this Memo 14 By submitting this Internet-Draft, each author represents that any 15 applicable patent or other IPR claims of which he or she is aware 16 have been or will be disclosed, and any of which he or she becomes 17 aware will be disclosed, in accordance with Section 6 of BCP 79. 19 Internet Drafts are working documents of the Internet Engineering 20 Task Force (IETF), its Areas, and its Working Groups. Note that other 21 groups may also distribute working documents as Internet Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or to cite them other than a "work in progress." 28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html. 34 Abstract 36 Link-state routing protocols (e.g. OSPF or IS-IS) are known to 37 converge to a loop-free state within a finite period of time after a 38 change in the topology. It is normal, however, to observe short-term 39 loops during the period of topology update propagation, route 40 recalculation, and forwarding table update, due to the asynchronous 41 nature of link-state protocol operation. This document provides an 42 analysis of formation of such microloops and suggests simple 43 mechanisms to minimize them. 45 1 Introduction 47 Link-state routing protocols, such as [OSPF] and [ISIS] converge to a 48 loop-free state within a finite period of time after a topology 49 change. Additional changes postpone the convergence, but do not get 50 in its way. 52 During the period of convergence, however, link-state protocols 53 exhibit short-term routing table inconsistencies caused by the 54 protocol's asynchronous nature. These incornsistencies may cause 55 short-term packet loops, also known as microloops. For example, see a 56 sample network in Figure 1. 58 +--+ 1 +--+ 59 |A |---------|B | 60 +--+ +--+ 61 | \ 10 | 62 5| ------ |1 63 | \ | 64 +--+ 10 \+--+ 65 |E |---------|C | 66 +--+ +--+ 67 \_ / 68 5 \ /1 (failure) 69 +--+ 70 |D | 71 +--+ 73 Figure 1. Microloop example 75 We are interested in routers A and B and their best paths towards D. 76 Before failure, B's best path to D is B-C-D with cost 2, and A's best 77 path is A-B-C-D with cost 3. When link C-D fails, both C and D 78 announce their link state information with link C-D missing. Within a 79 finite period of time, both A and B shall receive the topology 80 updates and converge on them, installing new best paths: A-E-D (10) 81 for A, and B-A-E-D (11) for B. However, if, due to the timing 82 differences, B calculates and installs its new best path through A 83 before A has a chance to switch from B to E, a microloop will form 84 between A and B for the duration of time required for A to complete 85 its routing table update. 87 Similar microloops may form when other topological changes happen in 88 the network, for example, when a new link or a node is added, a link 89 cost is changed, etc. In summary, whenever a topological change in 90 the network results in changes of the shortest path three (SPT) for 91 more than one node, it is possible for the network to exhibit 92 temporary loops. 94 This document provides an analysis of microloop formation. 95 Specifically, we categorize different types of reconvergence 96 scenarios, and explore their properties. We then show that in certain 97 scenarios microloops do not form, in others they can be eliminated 98 using simple techniques described in this document, and define 99 scenarios where more sophisticated loop avoidance mechanisms may be 100 necessary. 102 It is useful to understand the relationship between [IPFRR] and the 103 technique described here. The two mechanisms play complimentary roles 104 to each other: deploying [IPFRR] without micro-loop prevention only 105 partially addresses the goal of minimizing packet loss during network 106 reconvergence, since packets will be lost due to microloops. On the 107 other hand, micro-loop prevention described in this document relies 108 on [IPFRR] local failure protection, as routers will keep forwarding 109 traffic down the old path until the new next-hops are known to be 110 safe. 112 2 Analysis 114 To analyse the behavior of a network during reconvergence, we look at 115 a given router and its neighbors before failure and during the 116 transition to the new routes. More specifically, we analyse whether 117 switching to the new routing information can result in loop formation 118 or not. 120 2.1 Terminology 122 The following terms are used in the draft. 124 Dopt(X,Y) 125 Integer function defined as the cumulative cost of the least- 126 cost path from node X to node Y in a topology graph. Normally 127 calculated by link-state routing protocols using Dijkstra 128 algorithm as part of regular route calculation procedures. 129 This is the same as "Distance_opt(A,B)" defined in [IPFRR-FW] 131 Downstream neighbor 132 Neighbor N of router S is considered S's downstream neighbor 133 for destination D, if Dopt(N, D) < Dopt(S, D) 135 Primary neighbor 136 Neighbor N of router S is considered S's primary neighbor for 137 destination D, if the path via N is such that D(S, D) is mini- 138 mized, i.e. N provides a shortest path to D according to the 139 SPF calculation. 141 Loop-free neighbor 142 Neighbor N of router S is considered S's loop-free neighbor 143 for destination D, if Dopt(N, D) < Dopt(N, S) + Dopt(S, D). 144 Note that a loop-free neighbor may be, for example, router's 145 primary before and/or after failure. 147 2.2 Next hop safety condition 149 We start the analysis of single-hop loops with the following observa- 150 tion: 152 After a topology change, there are precisely two situations when a 153 microloop between routers X and Y can form: 155 a) Before failure, X uses Y as its next-hop. After failure, Y 156 uses X as its next-hop. Y updates its routes based on the new 157 topology before X. 159 b) Exact opposite of the previous case. Before failure, Y uses X 160 as its next-hop. After failure, X uses Y as its next-hop. X 161 updates its routes based on the new topology before Y. 163 Formulating this for a given calculating router S (either X or Y in 164 the above example) switching to a new primary Pn, a microloop may 165 occur between S and Pn only if Pn was forwarding through S before 166 failure. 168 Based on the above, we can define a general safety condition for any 169 neighbor N (whether new primary or not) of router S that has just 170 learned about a topology change. Note that the condition must satisfy 171 the topological criteria above, and be non-recursive, i.e. not lead 172 to loops if both S and N follow it. 174 Next-hop safety condition: 176 After a topology change, it is safe for router S to switch to 177 neigbor N as its next-hop for a specific destination if the 178 path through N satisfies both of the following criteria: 180 1. S considered N as its loop-free neighbor based on the 181 topology before change AND 183 2. S considers N as its downstream neighbor based on the 184 topology after change. 186 The first requirement ensures that N has not been forwarding 187 traffic to S before the change occured and both S and N used 188 old topology. The second requirement makes sure N does not 189 forward traffic to S when N learns the new topology. Note 190 again, that N is S's any neighbor, and may or may not be used 191 by S as its new primary or a temporary safe neighbor. 193 The difference in the conditions before and after failure is 194 there to make sure that S and N do not recursively consider 195 each other as safe next-hops when they learn about the fail- 196 ure. 198 2.3 Transition types 200 Here, we analyse different types of scenarios that a given router may 201 find itself in after learning about a topology change. 203 For each destination affected by a topological change, the network 204 will have three major types of nodes categorized by the degree of 205 safety of their old primary, new primary, and other neighbors. (Note 206 that we do not yet consider ECMP, which will be discussed in section 207 3.2.) 209 Type A 211 Routers whose new primary next-hops after the topology change 212 are safe and transition to them will not create a microloop. 213 Two subtypes are recognized: 215 A1: Routers whose primaries haven't changed as a result of 216 the topology change 218 A2: Routers whose new primary satisfies the safety condition 220 Type B 222 Routers whose new primary next-hops after the topology change 223 do not satisfy the safety condition, but that have at least 224 one other neighbor that does. Note that such a neighbor can be 225 the router's old primary (type B1) or a neighbor that is nei- 226 ther old nor new primary (type B2). 228 Type C 230 Routers that have no neighbor that satisfies the safety condi- 231 tion. 233 It is clear that nothing special needs to be done for type-A routers 234 as they either do not need to modify their routes or can immediately 235 switch to the new primary next hops. 237 It can also be shown that if type-B routers do not immediately switch 238 to their new primaries, but use their safe next-hops for some time, 239 switching to the new primaries later will not create loops, provided 240 that their downstream routers have also switched to the safe hops or 241 have already switched to the new primaries. 243 NOTE: The above analysis applies to single-hop loops. Multi-hop 244 loops, possible in networks with asymmetric link costs could be 245 prevented by using a tighter safety condition. However, as shown 246 by simulations on real-life network topologies, doing so would 247 decrease micro-loop coverage and thus result in increased number of 248 unprevented single-hop loops. 250 The following section formally defines the mechanism. 252 3. Loop prevention mechanism 254 3.1 Basic procedures 256 The essense of the mechanism defined here, also known as "path lock- 257 ing via safe neighbors" (PLSN), can be informally summarized as fol- 258 lows. Upon a topology change, for each destination: 260 - Each router in the network assesses safety of its new primary 261 next-hops. 263 - If the new primaries are safe, they are used immediately, oth- 264 erwise, partial 265 ordering of updates is introduced: 267 o If non-primary safe neighbors are found, they are used 268 for a period of time, thus locking traffic to a safe path 269 while the new primaries complete their transition to the 270 new routes 272 o If no safe neighbors are found, the forwarding path is 273 locked on the old next-hop for some time to give the new 274 primary enough time to complete route updates. 276 For a description of several architectural constants used in this 277 document (named as "DELAY_xxx"), refer to section 3.4. 279 On receiving a topology update, the router delays its SPF calculation 280 by DELAY_SPF time in order to collect the remaining updates that 281 relate to the same topological event (e.g. update from the router 282 connected to the second end of a point-to-point link in case of a 283 link failure, or updates from other neighbors of a failed node). 285 Upon expiration of DELAY_SPF, the router calculates the new SPT, the 286 new routes, checks the safety status of each neighbor relative to 287 each affected destination using the conditions in section 3.1, and 288 applies the following logic for each route depending on the type of 289 role it finds itself in: 291 Type A: 292 The route SHALL be updated with the new primary next-hops 293 without an additional delay. 295 Type B: 296 The route SHALL be updated with one or more temporary next- 297 hops that satisfy the safety condition without an additional 298 delay. These temporary next-hops SHALL be used for the dura- 299 tion of DELAY_TYPEB. After DELAY_TYPEB, the route SHALL be 300 updated with the new primary next-hops. 302 Type C: 303 The route's old (primary) next-hops SHALL continue to be used 304 for DELAY_TYPEC. After DELAY_TYPEC, the route SHALL be 305 updated with the new primary next-hops. 307 If, after expiration of DELAY_SPF, the router receives a topology 308 update sooner than DELAY_STABLE after the previous one, the router 309 MUST fall back to the regular convergence mechanisms by prematurely 310 expiring DELAY_TYPEB or DELAY_TYPEC timers if they are still running 311 (thus causing immediate installation of the new primary next-hops), 312 MUST recalculate its routing table as soon as practical, and MUST 313 refrain from using the mechanisms described here until it has seen no 314 topological updates for at least DELAY_STABLE. This is a safeguard 315 mechanism to ensure that procedures described here are applied only 316 when a single failure is experienced and that the network converges 317 in a situation where multiple topological events or network instabil- 318 ities are experienced. 320 [ISIS] includes the concept of an Overload bit (OL) that indicates a 321 node in the network that shouldn't be used as transit. A similar 322 notion is introduced in OSPF by [STUB] using LSInfinity link costs. 323 Honoring these conventions, implementations of this document MUST NOT 324 use neighbors with the OL bit set in IS-IS or announcing links to the 325 calculating router with LSInfinity cost in OSPF as temporary safe 326 neighbors. 328 NOTE: In OSPF, if S's neighbor N is a stub router, the S->N link, 329 visited first by the SPF algorithm, will normally have a real link 330 cost, and it is the backwards link N->S announced by N that will 331 have its cost set to LSInfinity. Implementations have to account 332 for this details when satisfying the above requirement. 334 3.2 Equal Cost Multipath Considerations 336 In situations where more than one primary next-hop is available after 337 the topology change, there are several possible combination of their 338 safety properties: 340 1) All new next-hops satisfy the safery condition (a pure type-A 341 situation) 343 2) Some of the new next-hops satisfy the safety condition, some 344 of them do not (a combination of type-A and type-B) 346 3) None of the new next-hops satisfy the safety condition, how- 347 ever, there's at least one other neighbor that satisfies it (a 348 safe non-primary next-hop, causing new primaries to be type-B) 350 4) None of the new next-hops satisfy the safety condition, and 351 there is no other neighbor that satisfies it (a pure type-C 352 situation). 354 For situations 1, 3, and 4 above, the implementation merely follows 355 the basic procedures described in section 3.1 357 For situation 2 (an A/B combination), the implementation: 359 1) SHALL update the route with the new next-hops that satisfy the 360 safety condition without an additional delay 362 2) SHALL add the remaining new next-hops after DELAY_TYPEB. 364 Note that one could potentially use temporary safe neighbors in situ- 365 ation 2 above, however this specification does not recommend this to 366 avoid unnessesary traffic rerouting and hence packet reordering. 368 3.3 Local Failure and IP Fast Reroute Considerations 370 After detecting a local failure and initiating the local repair 371 process if IPFRR is supported, the router directly attached to the 372 point of failure follows the procedures described in this docu- 373 ment--it delays its SPF calculation to collect updates from other 374 routers, calculates new routes, and classifies the next-hops. 376 For routers implementing IPFRR, the difference with routers that 377 learn about the failure from the routing protocol updates, is that 378 one or more of the repairing router's old next-hops has become 379 unavailable, and hence cannot be considered as the temporary safe 380 next-hops for type-B operation. Also, if the router was able to 381 locally repair the failure, and the new primary next-hops do not sat- 382 isfy the safety condition, the router should consider itself in the 383 middle of type-B operation with the temporary safe neighbor engaged 384 as part of IP Fast Reroute operation. 386 Another distinct situation is when the router does not support IPFRR 387 or could not repair the failure, the new primary next-hops do not 388 satisfy the safety condition, and there's no other neighbor that 389 does, i.e. a type-C situation. Unlike other routers in the network, 390 the router directly connected to the network does not have the old 391 next-hop any more, and cannot continue using it. Immediately switch- 392 ing to the new next-hops, on the other hand, may result in a micro- 393 loop. In this situation, the router MUST discard traffic forwarded 394 along the affected route for the duration of DELAY_TYPEC, and then 395 update the routes. Implementations MAY have a configuration option to 396 allow switching immediately to the new next-hops for situations where 397 this type of a micro-loop is not a concern. If implemented, this 398 option MUST be disabled by default. 400 As a result, there are the following possible scenarios: 402 1) If the new primary next-hops satisfy the safery condition, the 403 router updates the routes without an additional delay. 405 2) Otherwise, if the failure could be repaired locally by IP Fast 406 Reroute, the router continues to use the repair path for 407 DELAY_TYPEB and updates the routes with the new primary next- 408 hops after it expires. 410 3) Otherwise (new next-hops are not safe, and IPFRR is not sup- 411 ported or the failure couldn't be repaired), the router dis- 412 cards traffic for DELAY_TYPEC and updates the routes with the 413 new primary next-hops after its expiration. 415 3.4 Architectural Constants 417 The following architectural constants have been used in the descrip- 418 tion of the algorithm above: 420 DELAY_SPF 421 The delay between the moment the router receives the first 422 topology update after a period of stability and the moment it 423 starts its routing table recalculation. This delay is 424 necessary to collect multiple updates originated by different 425 routers that relate to the same topological event. 427 DELAY_STABLE 428 Period of time, during which the network topology is consid- 429 ered to be stable if the router receives no topological 430 updates. When the first update after DELAY_STABLE is received, 431 all other updates that fit within DELAY_SPF are considered as 432 related to a single topological event. 434 DELAY_TYPEB and DELAY_TYPEC 435 Periods of time used by the router to delay installation of 436 new primary next-hops after a topology change when the router 437 has (type-B) or has not (type-C) a safe neighbor to temporary 438 divert the traffic to in the meantime. 440 While correctness and effectiveness of the algorithm described here 441 does not depend on the actual values assigned to the architectural 442 constants, it does depend on the relationship between them, and the 443 assumption that all routers in the same network use the same values. 445 To satisfy these constrains, and yet allow these delays to be 446 decreased as implementations continue to improve towards faster con- 447 vergence, this document defines the architectural constants as con- 448 figurable, specifies the required relationship between the values, 449 and the default values that should be used by the implementations. 451 The following equations define the relationship between the constants 452 that needs to be maintained in order for the mechanism described here 453 to provide desireable results: 455 DELAY_SPF > update-propagation-time 457 DELAY_STABLE > DELAY_TYPEB > DELAY_TYPEC > fault-propagation-time 459 where: 461 o update-propagation-time is the time it is expected to take 462 routers in the network to detect the failure, and originate 463 and propagate new link-state information. 465 o fault-propagation-time is update-propagation plus the time it 466 is expected to take routers in the network to calculate the 467 new SPT, check the safety condition of the neighbors, and 468 install required FIB entries. 470 Because fault-propagation-time includes update-propagation-time, and 471 DELAY_SPF (since every router will delay its SPF according to this 472 document): 474 fault-propagation-time > DELAY_SPF + update-propagation-time 476 and hence the equations above can be converted to one: 478 DELAY_STABLE > DELAY_TYPEB > DELAY_TYPEC > (DELAY_SPF + update-prop- 479 agation-time) 481 The implementations SHOULD use the following default values for the 482 architectural constants: 484 Constant Default val 485 ---------------------------------------- 486 DELAY_SPF 500 msec 487 DELAY_TYPEC 2 sec 488 DELAY_TYPEB 4 sec 489 DELAY_STABLE 10 sec 491 4 Coverage analysis 493 The above algorithm minimizes the probability of loop formation. More 494 specifically, loops will only be possible when two neighboring 495 routers both experience the type C condition after the topology 496 change. Appendix A shows that transitions between A-A, A-B, A-C, and 497 B-C routers are loop-free. 499 While this mechanism does not remove all possible micro-loops, it 500 addresses the majority of them in topologies with a reasonable level 501 of physical redundancy. Topologically, micro-loop coverage provided 502 by this algorithm is very similar to that provided by [IPFRR]. This 503 is due to the fact that similar construct are used by both mecha- 504 nisms. 506 5 Backwards Compatibility Analysis 508 Effectiveness of the mechanism described here relies on the assump- 509 tion that all routers in the network support it. 511 In a situation where some routers do not support the describer mecha- 512 nism, the network will continue to converge properly fundamentals of 513 the routing system are not changed. When a topology change event 514 occurs in such a network, Type-A and Type-B routers will not substan- 515 tially change the convergence patterns, as they will switch to 516 routers that are guaranteed to forward traffic correctly after 517 DELAY_SPF. (Note that routers today already implement a delay 518 similar to DELAY_SPF.) Type-C routers, when mixed with routers not 519 supporting this mechanism, may induce longer than usual micro-loops 520 (up to DELAY_TYPEC), however this delay is in the same order of mag- 521 nitude as in most deployed networks today. 523 6 Security Considerations 525 The mechanism described in this document does not modify any routing 526 protocol messages, and hence no new threats related to packet modifi- 527 cations or replay attacks are introduced. The mechanism changes cer- 528 tain delays used in node-local algorithms and introduces partial 529 event ordering after a topology change has occured. This, however, 530 does not introduce new security risks. For type-B situations, traffic 531 to certain destinations can be temporarily routed via next-hop 532 routers that would not be used with the same topology change if this 533 mechanism wasn't employed. However, these next-hop routers can be 534 used anyway when a different topological change occurs, and hence 535 this can't be viewed as a new security threat. 537 7 Acknowledgements 539 The author would like to thank Don Fedyk, Chris Martin, Alex Audu, 540 Olivier Bonaventure, Stefano Previdi, and other members of the IETF 541 RTGWG for their useful comments. Special thanks go to Alia Atlas, 542 Mike Shand, and Steward Bryant, who were instrumental in development 543 of this mechanism, such as fine-tuning the safety condition, simulat- 544 ing the mechanism, proof-reading the document, and without whom this 545 work wouldn't be possible. 547 8 References 549 8.1 Normative References 551 [OSPF] J. Moy. OSPF version 2. Technical Report RFC 2328, Internet 552 Engineering Task Force, 1998. 554 [ISIS] ISO, "Intermediate system to Intermediate system routeing 555 information exchange protocol for use in conjunction with the 556 Protocol for providing the Connectionless-mode Network Service 557 (ISO 8473)," ISO/IEC 10589:1992. 559 [IPFRR] Atlas, A., "Basic Specification for IP Fast-Reroute: 560 Loop-free Alternates", Internet Engineering Task Force, Work 561 in Progress, draft-ietf-rtgwg-ipfrr-spec-base-03.txt 563 8.2 Informative References 565 [IPFRR-FW] Shand, M., S. Bryant, "IP Fast Reroute Framework", 566 Internet Engineering Task Force, Work in Progress, draft-ietf- 567 rtgwg-ipfrr-framework-04.txt 569 [STUB] Retana, A., et al, OSPF Stub Router Advertisement, RFC 3137, 570 Internet Engineering Task Force, 2001. 572 Author's Address 574 Alex Zinin 575 Alcatel 576 701 E Middlefield Rd 577 Mountain View, CA 94043 578 E-mail: zinin@psg.com 580 Appendix A. Loop formation analysis 582 S is the calculating router discovering the failure through a link-state 583 update. P is the old primary, NP is the new primary. 585 BF: 586 <------ 587 [P]----------------[S]----------------[NP] 588 ...>? 590 AF: 592 ------> 593 [P]----------------[S]----------------[NP] 594 ?<... 596 To analyze possible loop formation, we need to check the following: 598 1) if it is possible for P to start forwarding packets to S 599 before S switches to NP 601 2) if it is possible for NP to be forwarding packets back to S 602 before or after S starts using it 604 Assumptions are that type-As switch-over to NP immediately, and type- 605 Bs and type-Cs wait certain amount of time so that: 607 DELAY_TYPEB > DELAY_TYPEC > fault-propagation-time 609 1. S is type A: 611 BF analysis: 613 1.1 If P is another type-A, then S cannot be its new primary, since 614 S has not been P's LFA before (since it's been fwd'ing through P). 615 Hence, P will not route through S AF, and the will be no loops 616 between P and S. 618 1.2 If P is a type-B, then S hasn't been P's LF neighbor BF, and P 619 will not forward through S at least for DELAY_TYPEB, which gives S 620 enough time to switch to NP. After DELAY_TYPEB P may start using S 621 as it's new primary. 623 1.3 If P is a type-C, then it hasn't been forwarding traffic to S 624 BF, and will not use S as its new primary at least for DELAY_TYPEC, 625 which should give S enough time to switch to NP. 627 1.4 Consequently, no loops will form between a type-A node and it's 628 old primary before the type-A nodes switches to its new primary. 630 AF analysis: 632 1.5 Regardless of its type, NP has not been forwarding packets to S 633 BF and will not do so AF by definition of type-A. 635 1.6 Consequently, no loops will form between a type-A node and it's 636 new primary before or after the type-A nodes switches to it. 638 2. S is type B: 640 BF analysis: 642 2.1 If P is a type-A, then similarly to 1.1 above, there will be no 643 routes between P and S. 645 2.2 If P is another type-B, then similarly to 1.2, S will not be 646 used by P for at least DELAY_TYPEB, and S will have enough time to 647 switch to its safe hops or NP. 649 2.3 If P is a type-C, then similarly to 1.3, S hasn't been receiv- 650 ing traffic from P BF, and will not AF for at least DELAY_TYPEC, 651 which should give S enough time to switch to its safe hops or NP. 653 2.4 Consequently, no loops will form between a type-B node and it's 654 old primary before the type-B nodes switches to its new primary. 656 AF analysis: 658 2.5 If NP is a type-A, then because of the DELAY_TYPEB NP must have 659 had enough time to switch to its new NP, which cannot be S by defi- 660 nition of SPT considering that NP is S's new nexthop in the SPT AF. 662 2.6 If NP is another type-B, then because of DELAY_TYPEB, NP must 663 have had enough time to switch from its old primary and can equally 664 likely be routing through either its safe hops, or its new primary. 665 Neither of the two can be S by definition of a downstream node (for 666 safe hops) and SPT (for new primary). 668 2.7 If NP is a type-C, then because DELAY_TYPEB > DELAY_TYPEC, NP 669 must have had enough time to switch to its new primary, which can't 670 be S by definition of SPT and considering that NP is S's nexthop in 671 the SPT AF. 673 2.8 Consequently, no loops will form between a type-B node and it's 674 new primary before or after the type-A nodes switches to it. 676 3. S is type C: 678 BF analysis: 680 3.1 If P is a type-A, then similarly to 1.1 before, S has not been 681 P's LF neighbor before and hence won't be its new primary, so no 682 loops will form between P and S. 684 3.2 If P is a type-B, then similarly to 1.2, S will not be used by 685 P for at least DELAY_TYPEB, and because DELAY_TYPEB > DELAY_TYPEC, 686 S will have enough time to switch to NP. 688 3.3 If P is another type-C, then it hasn't been using S as its pri- 689 mary BF, but it is possible for P to consider S as its new primary 690 AF and to install routes before S after their DELAY_TYPEC expires. 691 Hence, a microloop is possible between P and S. 693 3.4 Consequently, a microloop between a type-C node and its old 694 primary is possible only if the old primary is also a type-C node 695 and it considers S as its new primary AF. Note that DELAY_TYPEC 696 only delays probably loop formation, but does not increase its 697 duration, as both neighboring routers are using the same delay. 699 AF analysis: 701 3.5 If NP is a type-A, then because of the DELAY_TYPEC NP must have 702 had enough time to switch to its new NP, which cannot be S by defi- 703 nition of SPT considering that NP is S's new nexthop in the SPT AF. 705 3.6 If NP is a type-B, then because of DELAY_TYPEC, NP must have 706 had enough time to switch to its safe hops, which can't be S by 707 definition of a downstream node and considering that NP is S's new 708 SPT next-hop. 710 3.7 If NP is another type-C, a loop is possible if S's DELAY_TYPEC 711 expires before that on NP and NP has been using S as its primary 712 BF. 714 3.8 Consequently, a microloop between a type-C node and its new 715 primary is possible only if the new primary is also a type-C node 716 and S was NP's primary BF. 718 4. Given the above analysis, it can be noted that, for a given fail- 719 ure, presence of single type-C nodes in the network does not create 720 microloops. 721 It is the C-C combination that introduces this potential. 723 IPR Disclaimer 725 The IETF takes no position regarding the validity or scope of any 726 Intellectual Property Rights or other rights that might be claimed to 727 pertain to the implementation or use of the technology described in 728 this document or the extent to which any license under such rights 729 might or might not be available; nor does it represent that it has 730 made any independent effort to identify any such rights. Information 731 on the procedures with respect to rights in RFC documents can be 732 found in BCP 78 and BCP 79. 734 Copies of IPR disclosures made to the IETF Secretariat and any assur- 735 ances of licenses to be made available, or the result of an attempt 736 made to obtain a general license or permission for the use of such 737 proprietary rights by implementers or users of this specification can 738 be obtained from the IETF on-line IPR repository at 739 http://www.ietf.org/ipr. 741 The IETF invites any interested party to bring to its attention any 742 copyrights, patents or patent applications, or other proprietary 743 rights that may cover technology that may be required to implement 744 this standard. Please address the information to the IETF at ietf- 745 ipr@ietf.org. 747 Full Copyright Statement 749 Copyright (C) The Internet Society (2005). This document is subject 750 to the rights, licenses and restrictions contained in BCP 78, and 751 except as set forth therein, the authors retain all their rights. 753 This document and the information contained herein are provided on an 754 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 755 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 756 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 757 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFOR- 758 MATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES 759 OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.