idnits 2.17.1 draft-ietf-rtgwg-mrt-frr-algorithm-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([I-D.ietf-rtgwg-mrt-frr-architecture]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. == There are 19 instances of lines with non-RFC2606-compliant FQDNs in the document. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document date (February 14, 2014) is 3723 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'R' is mentioned on line 1611, but not defined == Missing Reference: 'F' is mentioned on line 658, but not defined == Missing Reference: 'C' is mentioned on line 1611, but not defined == Missing Reference: 'G' is mentioned on line 1611, but not defined == Missing Reference: 'E' is mentioned on line 329, but not defined == Missing Reference: 'D' is mentioned on line 658, but not defined == Missing Reference: 'J' is mentioned on line 169, but not defined == Missing Reference: 'A' is mentioned on line 329, but not defined == Missing Reference: 'B' is mentioned on line 175, but not defined == Missing Reference: 'H' is mentioned on line 1193, but not defined == Missing Reference: 'K' is mentioned on line 658, but not defined == Missing Reference: 'N' is mentioned on line 658, but not defined == Missing Reference: 'X' is mentioned on line 1193, but not defined == Missing Reference: 'Y' is mentioned on line 1193, but not defined == Missing Reference: 'R-small' is mentioned on line 1152, but not defined == Missing Reference: 'R-great' is mentioned on line 1168, but not defined == Missing Reference: 'I' is mentioned on line 1611, but not defined == Unused Reference: 'I-D.atlas-mpls-ldp-mrt' is defined on line 2255, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-rtgwg-ipfrr-notvia-addresses' is defined on line 2267, but no explicit reference was found in the text == Unused Reference: 'LFARevisited' is defined on line 2296, but no explicit reference was found in the text == Unused Reference: 'LightweightNotVia' is defined on line 2302, but no explicit reference was found in the text == Unused Reference: 'RFC5286' is defined on line 2319, but no explicit reference was found in the text == Unused Reference: 'RFC5714' is defined on line 2322, but no explicit reference was found in the text == Unused Reference: 'RFC6571' is defined on line 2325, but no explicit reference was found in the text == Outdated reference: A later version (-10) exists of draft-ietf-rtgwg-mrt-frr-architecture-03 == Outdated reference: A later version (-03) exists of draft-atlas-mpls-ldp-mrt-00 == Outdated reference: A later version (-03) exists of draft-atlas-ospf-mrt-01 == Outdated reference: A later version (-11) exists of draft-ietf-rtgwg-lfa-manageability-03 == Outdated reference: A later version (-11) exists of draft-ietf-rtgwg-remote-lfa-04 == Outdated reference: A later version (-02) exists of draft-li-isis-mrt-00 -- Obsolete informational reference (is this intentional?): RFC 3137 (Obsoleted by RFC 6987) Summary: 1 error (**), 0 flaws (~~), 33 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Routing Area Working Group G. Enyedi, Ed. 3 Internet-Draft A. Csaszar 4 Intended status: Standards Track Ericsson 5 Expires: August 18, 2014 A. Atlas, Ed. 6 C. Bowers 7 Juniper Networks 8 A. Gopalan 9 University of Arizona 10 February 14, 2014 12 Algorithms for computing Maximally Redundant Trees for IP/LDP Fast- 13 Reroute 14 draft-ietf-rtgwg-mrt-frr-algorithm-00 16 Abstract 18 A complete solution for IP and LDP Fast-Reroute using Maximally 19 Redundant Trees is presented in [I-D.ietf-rtgwg-mrt-frr- 20 architecture]. This document defines the associated MRT Lowpoint 21 algorithm that is used in the default MRT profile to compute both the 22 necessary Maximally Redundant Trees with their associated next-hops 23 and the alternates to select for MRT-FRR. 25 Status of This Memo 27 This Internet-Draft is submitted in full conformance with the 28 provisions of BCP 78 and BCP 79. 30 Internet-Drafts are working documents of the Internet Engineering 31 Task Force (IETF). Note that other groups may also distribute 32 working documents as Internet-Drafts. The list of current Internet- 33 Drafts is at http://datatracker.ietf.org/drafts/current/. 35 Internet-Drafts are draft documents valid for a maximum of six months 36 and may be updated, replaced, or obsoleted by other documents at any 37 time. It is inappropriate to use Internet-Drafts as reference 38 material or to cite them other than as "work in progress." 40 This Internet-Draft will expire on August 18, 2014. 42 Copyright Notice 44 Copyright (c) 2014 IETF Trust and the persons identified as the 45 document authors. All rights reserved. 47 This document is subject to BCP 78 and the IETF Trust's Legal 48 Provisions Relating to IETF Documents 49 (http://trustee.ietf.org/license-info) in effect on the date of 50 publication of this document. Please review these documents 51 carefully, as they describe your rights and restrictions with respect 52 to this document. Code Components extracted from this document must 53 include Simplified BSD License text as described in Section 4.e of 54 the Trust Legal Provisions and are provided without warranty as 55 described in the Simplified BSD License. 57 Table of Contents 59 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 60 2. Requirements Language . . . . . . . . . . . . . . . . . . . . 5 61 3. Terminology and Definitions . . . . . . . . . . . . . . . . . 5 62 4. Algorithm Key Concepts . . . . . . . . . . . . . . . . . . . 7 63 4.1. Partial Ordering for Disjoint Paths . . . . . . . . . . . 7 64 4.2. Finding an Ear and the Correct Direction . . . . . . . . 9 65 4.3. Low-Point Values and Their Uses . . . . . . . . . . . . . 11 66 4.4. Blocks in a Graph . . . . . . . . . . . . . . . . . . . . 15 67 4.5. Determining Local-Root and Assigning Block-ID . . . . . . 17 68 5. Algorithm Sections . . . . . . . . . . . . . . . . . . . . . 19 69 5.1. MRT Island Identification . . . . . . . . . . . . . . . . 20 70 5.2. Root Selection . . . . . . . . . . . . . . . . . . . . . 21 71 5.3. Initialization . . . . . . . . . . . . . . . . . . . . . 21 72 5.4. MRT Lowpoint Algorithm: Computing GADAG using lowpoint 73 inheritance . . . . . . . . . . . . . . . . . . . . . . . 22 74 5.5. Augmenting the GADAG by directing all links . . . . . . . 24 75 5.6. Compute MRT next-hops . . . . . . . . . . . . . . . . . . 26 76 5.6.1. MRT next-hops to all nodes partially ordered with 77 respect to the computing node . . . . . . . . . . . . 27 78 5.6.2. MRT next-hops to all nodes not partially ordered with 79 respect to the computing node . . . . . . . . . . . . 28 80 5.6.3. Computing Redundant Tree next-hops in a 2-connected 81 Graph . . . . . . . . . . . . . . . . . . . . . . . . 29 82 5.6.4. Generalizing for graph that isn't 2-connected . . . . 30 83 5.6.5. Complete Algorithm to Compute MRT Next-Hops . . . . . 31 84 5.7. Identify MRT alternates . . . . . . . . . . . . . . . . . 33 85 5.8. Finding FRR Next-Hops for Proxy-Nodes . . . . . . . . . . 37 86 6. MRT Lowpoint Algorithm: Complete Specification . . . . . . . 40 87 7. Algorithm Alternatives and Evaluation . . . . . . . . . . . . 40 88 7.1. Algorithm Evaluation . . . . . . . . . . . . . . . . . . 41 89 8. Algorithm Work to Be Done . . . . . . . . . . . . . . . . . . 51 90 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 51 91 10. Security Considerations . . . . . . . . . . . . . . . . . . . 51 92 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 51 93 11.1. Normative References . . . . . . . . . . . . . . . . . . 51 94 11.2. Informative References . . . . . . . . . . . . . . . . . 51 95 Appendix A. Option 2: Computing GADAG using SPFs . . . . . . . . 53 96 Appendix B. Option 3: Computing GADAG using a hybrid method . . 58 97 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 60 99 1. Introduction 101 MRT Fast-Reroute requires that packets can be forwarded not only on 102 the shortest-path tree, but also on two Maximally Redundant Trees 103 (MRTs), referred to as the MRT-Blue and the MRT-Red. A router which 104 experiences a local failure must also have pre-determined which 105 alternate to use. This document defines how to compute these three 106 things for use in MRT-FRR and describes the algorithm design 107 decisions and rationale. The algorithm is based on those presented 108 in [MRTLinear] and expanded in [EnyediThesis]. The MRT Lowpoint 109 algorithm is required for implementation when the default MRT profile 110 is implemented. 112 Just as packets routed on a hop-by-hop basis require that each router 113 compute a shortest-path tree which is consistent, it is necessary for 114 each router to compute the MRT-Blue next-hops and MRT-Red next-hops 115 in a consistent fashion. This document defines the MRT Lowpoint 116 algorithm to be used as a standard in the default MRT profile for 117 MRT-FRR. 119 As now, a router's FIB will contain primary next-hops for the current 120 shortest-path tree for forwarding traffic. In addition, a router's 121 FIB will contain primary next-hops for the MRT-Blue for forwarding 122 received traffic on the MRT-Blue and primary next-hops for the MRT- 123 Red for forwarding received traffic on the MRT-Red. 125 What alternate next-hops a point-of-local-repair (PLR) selects need 126 not be consistent - but loops must be prevented. To reduce 127 congestion, it is possible for multiple alternate next-hops to be 128 selected; in the context of MRT alternates, each of those alternate 129 next-hops would be equal-cost paths. 131 This document defines an algorithm for selecting an appropriate MRT 132 alternate for consideration. Other alternates, e.g. LFAs that are 133 downstream paths, may be prefered when available and that policy- 134 based alternate selection process[I-D.ietf-rtgwg-lfa-manageability] 135 is not captured in this document. 137 [E]---[D]---| [E]<--[D]<--| [E]-->[D] 138 | | | | ^ | | 139 | | | V | | V 140 [R] [F] [C] [R] [F] [C] [R] [F] [C] 141 | | | ^ ^ | | 142 | | | | | V | 143 [A]---[B]---| [A]-->[B] [A]---[B]<--| 145 (a) (b) (c) 146 a 2-connected graph MRT-Blue towards R MRT-Red towards R 148 Figure 1 150 Algorithms for computing MRTs can handle arbitrary network topologies 151 where the whole network graph is not 2-connected, as in Figure 2, as 152 well as the easier case where the network graph is 2-connected 153 (Figure 1). Each MRT is a spanning tree. The pair of MRTs provide 154 two paths from every node X to the root of the MRTs. Those paths 155 share the minimum number of nodes and the minimum number of links. 156 Each such shared node is a cut-vertex. Any shared links are cut- 157 links. 159 [E]---[D]---| |---[J] 160 | | | | | 161 | | | | | 162 [R] [F] [C]---[G] | 163 | | | | | 164 | | | | | 165 [A]---[B]---| |---[H] 167 (a) a graph that isn't 2-connected 169 [E]<--[D]<--| [J] [E]-->[D]---| |---[J] 170 | ^ | | | | | ^ 171 V | | | V V V | 172 [R] [F] [C]<--[G] | [R] [F] [C]<--[G] | 173 ^ ^ ^ | ^ | | | 174 | | | V | V | | 175 [A]-->[B]---| |---[H] [A]<--[B]<--| [H] 177 (b) MRT-Blue towards R (c) MRT-Red towards R 179 Figure 2 181 2. Requirements Language 183 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 184 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 185 document are to be interpreted as described in [RFC2119] 187 3. Terminology and Definitions 189 network graph: A graph that reflects the network topology where all 190 links connect exactly two nodes and broadcast links have been 191 transformed into a pseudo-node representation (e.g. in OSPF, 192 viewing a Network LSA as representing a pseudo-noe). 194 Redundant Trees (RT): A pair of trees where the path from any node X 195 to the root R on the first tree is node-disjoint with the path 196 from the same node X to the root along the second tree. These can 197 be computed in 2-connected graphs. 199 Maximally Redundant Trees (MRT): A pair of trees where the path 200 from any node X to the root R along the first tree and the path 201 from the same node X to the root along the second tree share the 202 minimum number of nodes and the minimum number of links. Each 203 such shared node is a cut-vertex. Any shared links are cut-links. 204 Any RT is an MRT but many MRTs are not RTs. 206 MRT Island: From the computing router, the set of routers that 207 support a particular MRT profile and are connected. 209 MRT-Red: MRT-Red is used to describe one of the two MRTs; it is 210 used to describe the associated forwarding topology and MT-ID. 211 Specifically, MRT-Red is the decreasing MRT where links in the 212 GADAG are taken in the direction from a higher topologically 213 ordered node to a lower one. 215 MRT-Blue: MRT-Blue is used to describe one of the two MRTs; it is 216 used to describe the associated forwarding topology and MT-ID. 217 Specifically, MRT-Blue is the increasing MRT where links in the 218 GADAG are taken in the direction from a lower topologically 219 ordered node to a higher one. 221 cut-vertex: A vertex whose removal partitions the network. 223 cut-link: A link whose removal partitions the network. A cut-link 224 by definition must be connected between two cut-vertices. If 225 there are multiple parallel links, then they are referred to as 226 cut-links in this document if removing the set of parallel links 227 would partition the network. 229 2-connected: A graph that has no cut-vertices. This is a graph 230 that requires two nodes to be removed before the network is 231 partitioned. 233 spanning tree: A tree containing links that connects all nodes in 234 the network graph. 236 back-edge: In the context of a spanning tree computed via a depth- 237 first search, a back-edge is a link that connects a descendant of 238 a node x with an ancestor of x. 240 2-connected cluster: A maximal set of nodes that are 2-connected. 241 In a network graph with at least one cut-vertex, there will be 242 multiple 2-connected clusters. 244 block: Either a 2-connected cluster, a cut-link, or an isolated 245 vertex. 247 DAG: Directed Acyclic Graph - a digraph containing no directed 248 cycle. 250 ADAG: Almost Directed Acyclic Graph - a digraph that can be 251 transformed into a DAG with removing a single node (the root 252 node). 254 partial ADAG: A subset of an ADAG that doesn't yet contain all the 255 nodes in the block. A partial ADAG is created during the MRT 256 algorithm and then expanded until all nodes in the block are 257 included and it is an ADAG. 259 GADAG: Generalized ADAG - a digraph, which has only ADAGs as all of 260 its blocks. The root of such a block is the node closest to the 261 global root (e.g. with uniform link costs). 263 DFS: Depth-First Search 265 DFS ancestor: A node n is a DFS ancestor of x if n is on the DFS- 266 tree path from the DFS root to x. 268 DFS descendant: A node n is a DFS descendant of x if x is on the 269 DFS-tree path from the DFS root to n. 271 ear: A path along not-yet-included-in-the-GADAG nodes that starts 272 at a node that is already-included-in-the-GADAG and that ends at a 273 node that is already-included-in-the-GADAG. The starting and 274 ending nodes may be the same node if it is a cut-vertex. 276 X >> Y or Y << X: Indicates the relationship between X and Y in a 277 partial order, such as found in a GADAG. X >> Y means that X is 278 higher in the partial order than Y. Y << X means that Y is lower 279 in the partial order than X. 281 X > Y or Y < X: Indicates the relationship between X and Y in the 282 total order, such as found via a topological sort. X > Y means 283 that X is higher in the total order than Y. Y < X means that Y is 284 lower in the total order than X. 286 proxy-node: A node added to the network graph to represent a multi- 287 homed prefix or routers outside the local MRT-fast-reroute- 288 supporting island of routers. The key property of proxy-nodes is 289 that traffic cannot transit them. 291 UNDIRECTED: In the GADAG, each link is marked as OUTGOING, INCOMING 292 or both. Until the directionality of the link is determined, the 293 link is marked as UNDIRECTED to indicate that its direction hasn't 294 been determined. 296 OUTGOING: In the GADAG, each link is marked as OUTGOING, INCOMING 297 or both. A link marked as OUTGOING has direction from the 298 interface's router to the remote end. 300 INCOMING: In the GADAG, each link is marked as OUTGOING, INCOMING 301 or both. A link marked as INCOMING has direction from the remote 302 end to the interface's router. 304 4. Algorithm Key Concepts 306 There are five key concepts that are critical for understanding the 307 MRT Lowpoint algorithm and other algorithms for computing MRTs. The 308 first is the idea of partially ordering the nodes in a network graph 309 with regard to each other and to the GADAG root. The second is the 310 idea of finding an ear of nodes and adding them in the correct 311 direction. The third is the idea of a Low-Point value and how it can 312 be used to identify cut-vertices and to find a second path towards 313 the root. The fourth is the idea that a non-2-connected graph is 314 made up of blocks, where a block is a 2-connected cluster, a cut-link 315 or an isolated node. The fifth is the idea of a local-root for each 316 node; this is used to compute ADAGs in each block. 318 4.1. Partial Ordering for Disjoint Paths 320 Given any two nodes X and Y in a graph, a particular total order 321 means that either X < Y or X > Y in that total order. An example 322 would be a graph where the nodes are ranked based upon their unique 323 IP loopback addresses. In a partial order, there may be some nodes 324 for which it can't be determined whether X << Y or X >> Y. A partial 325 order can be captured in a directed graph, as shown in Figure 3. In 326 a graphical representation, a link directed from X to Y indicates 327 that X is a neighbor of Y in the network graph and X << Y. 329 [A]<---[R] [E] R << A << B << C << D << E 330 | ^ R << A << B << F << G << H << D << E 331 | | 332 V | Unspecified Relationships: 333 [B]--->[C]--->[D] C and F 334 | ^ C and G 335 | | C and H 336 V | 337 [F]--->[G]--->[H] 339 Figure 3: Directed Graph showing a Partial Order 341 To compute MRTs, the root of the MRTs is at both the very bottom and 342 the very top of the partial ordering. This means that from any node 343 X, one can pick nodes higher in the order until the root is reached. 344 Similarly, from any node X, one can pick nodes lower in the order 345 until the root is reached. For instance, in Figure 4, from G the 346 higher nodes picked can be traced by following the directed links and 347 are H, D, E and R. Similarly, from G the lower nodes picked can be 348 traced by reversing the directed links and are F, B, A, and R. A 349 graph that represents this modified partial order is no longer a DAG; 350 it is termed an Almost DAG (ADAG) because if the links directed to 351 the root were removed, it would be a DAG. 353 [A]<---[R]<---[E] R << A << B << C << R 354 | ^ ^ R << A << B << C << D << E << R 355 | | | R << A << B << F << G << H << D << E << R 356 V | | 357 [B]--->[C]--->[D] Unspecified Relationships: 358 | ^ C and F 359 | | C and G 360 V | C and H 361 [F]--->[G]--->[H] 363 Figure 4: ADAG showing a Partial Order with R lowest and highest 365 Most importantly, if a node Y >> X, then Y can only appear on the 366 increasing path from X to the root and never on the decreasing path. 368 Similarly, if a node Z << X, then Z can only appear on the decreasing 369 path from X to the root and never on the inceasing path. 371 When following the increasing paths, it is possible to pick multiple 372 higher nodes and still have the certainty that those paths will be 373 disjoint from the decreasing paths. E.g. in the previous example 374 node B has multiple possibilities to forward packets along an 375 increasing path: it can either forward packets to C or F. 377 4.2. Finding an Ear and the Correct Direction 379 For simplicity, the basic idea of creating a GADAG by adding ears is 380 described assuming that the network graph is a single 2-connected 381 cluster so that an ADAG is sufficient. Generalizing to multiple 382 blocks is done by considering the block-roots instead of the GADAG 383 root - and the actual algorithm is given in Section 5.4. 385 In order to understand the basic idea of finding an ADAG, first 386 suppose that we have already a partial ADAG, which doesn't contain 387 all the nodes in the block yet, and we want to extend it to cover all 388 the nodes. Suppose that we find a path from a node X to Y such that 389 X and Y are already contained by our partial ADAG, but all the 390 remaining nodes along the path are not added to the ADAG yet. We 391 refer to such a path as an ear. 393 Recall that our ADAG is closely related to a partial order, more 394 precisely, if we remove root R, the remaining DAG describes a partial 395 order of the nodes. If we suppose that neither X nor Y is the root, 396 we may be able to compare them. If one of them is definitely lesser 397 with respect to our partial order (say X<B---| A-->B---| 409 (a) (b) (c) 411 (a) A 2-connected graph 412 (b) Partial ADAG (C is not included) 413 (c) Resulting ADAG after adding path (or ear) B-C-D 415 Figure 5 417 In this partial ADAG, node C is not yet included. However, we can 418 find path B-C-D, where both endpoints are contained by this partial 419 ADAG (we say those nodes are *ready* in the sequel), and the 420 remaining node (node C) is not contained yet. If we remove R, the 421 remaining DAG defines a partial order, and with respect to this 422 partial order we can say that B<C and C->D are added). If 424 B >> D, we would add the same path in reverse direction. 426 If in the partial order where an ear's two ends are X and Y, X << Y, 427 then there must already be a directed path from X to Y in the ADAG. 428 The ear must be added in a direction such that it doesn't create a 429 cycle; therefore the ear must go from X to Y. 431 In the case, when X and Y are not ordered with each other, we can 432 select either direction for the ear. We have no restriction since 433 neither of the directions can result in a cycle. In the corner case 434 when one of the endpoints of an ear, say X, is the root (recall that 435 the two endpoints must be different), we could use both directions 436 again for the ear because the root can be considered both as smaller 437 and as greater than Y. However, we strictly pick that direction in 438 which the root is lower than Y. The logic for this decision is 439 explained in Section 5.6 441 A partial ADAG is started by finding a cycle from the root R back to 442 itself. This can be done by selecting a non-ready neighbor N of R 443 and then finding a path from N to R that doesn't use any links 444 between R and N. The direction of the cycle can be assigned either 445 way since it is starting the ordering. 447 Once a partial ADAG is already present, it will always have a node 448 that is not the root R in it. As a brief proof that a partial ADAG 449 can always have ears added to it: just select a non-ready neighbor N 450 of a ready node Q, such that Q is not the root R, find a path from N 451 to the root R in the graph with Q removed. This path is an ear where 452 the first node of the ear is Q, the next is N, then the path until 453 the first ready node the path reached (that second ready node is the 454 other endpoint of the path). Since the graph is 2-connected, there 455 must be a path from N to R without Q. 457 It is always possible to select a non-ready neighbor N of a ready 458 node Q so that Q is not the root R. Because the network is 459 2-connected, N must be connected to two different nodes and only one 460 can be R. Because the initial cycle has already been added to the 461 ADAG, there are ready nodes that are not R. Since the graph is 462 2-connected, while there are non-ready nodes, there must be a non- 463 ready neighbor N of a ready node that is not R. 465 Generic_Find_Ears_ADAG(root) 466 Create an empty ADAG. Add root to the ADAG. 467 Mark root as IN_GADAG. 468 Select an arbitrary cycle containing root. 469 Add the arbitrary cycle to the ADAG. 470 Mark cycle's nodes as IN_GADAG. 471 Add cycle's non-root nodes to process_list. 472 while there exists connected nodes in graph that are not IN_GADAG 473 Select a new ear. Let its endpoints be X and Y. 474 if Y is root or (Y << X) 475 add the ear towards X to the ADAG 476 else // (a) X is root or (b)X << Y or (c) X, Y not ordered 477 Add the ear towards Y to the ADAG 479 Figure 6: Generic Algorithm to find ears and their direction in 480 2-connected graph 482 Algorithm Figure 6 merely requires that a cycle or ear be selected 483 without specifying how. Regardless of the way of selecting the path, 484 we will get an ADAG. The method used for finding and selecting the 485 ears is important; shorter ears result in shorter paths along the 486 MRTs. The MRT Lowpoint algorithm's method using Low-Point 487 Inheritance is defined in Section 5.4. Other methods are described 488 in the Appendices (Appendix A and Appendix B). 490 As an example, consider Figure 5 again. First, we select the 491 shortest cycle containing R, which can be R-A-B-F-D-E (uniform link 492 costs were assumed), so we get to the situation depicted in Figure 5 493 (b). Finally, we find a node next to a ready node; that must be node 494 C and assume we reached it from ready node B. We search a path from C 495 to R without B in the original graph. The first ready node along 496 this is node D, so the open ear is B-C-D. Since B<C 497 and C->D to the ADAG. Since all the nodes are ready, we stop at this 498 point. 500 4.3. Low-Point Values and Their Uses 502 A basic way of computing a spanning tree on a network graph is to run 503 a depth-first-search, such as given in Figure 7. This tree has the 504 important property that if there is a link (x, n), then either n is a 505 DFS ancestor of x or n is a DFS descendant of x. In other words, 506 either n is on the path from the root to x or x is on the path from 507 the root to n. 509 global_variable: dfs_number 511 DFS_Visit(node x, node parent) 512 D(x) = dfs_number 513 dfs_number += 1 514 x.dfs_parent = parent 515 for each link (x, w) 516 if D(w) is not set 517 DFS_Visit(w, x) 519 Run_DFS(node root) 520 dfs_number = 0 521 DFS_Visit(root, NONE) 523 Figure 7: Basic Depth-First Search algorithm 525 Given a node x, one can compute the minimal DFS number of the 526 neighbours of x, i.e. min( D(w) if (x,w) is a link). This gives the 527 highest attachment point neighbouring x. What is interesting, 528 though, is what is the highest attachment point from x and x's 529 descendants. This is what is determined by computing the Low-Point 530 value, as given in Algorithm Figure 9 and illustrated on a graph in 531 Figure 8. 533 [E]---| [J]-------[I] [P]---[O] 534 | | | | | | 535 | | | | | | 536 [R] [D]---[C]--[F] [H]---[K] [N] 537 | | | | | | 538 | | | | | | 539 [A]--------[B] [G]---| [L]---[M] 541 (a) a non-2-connected graph 543 [E]----| [J]---------[I] [P]------[O] 544 (5, ) | (10, ) (9, ) (16, ) (15, ) 545 | | | | | | 546 | | | | | | 547 [R] [D]---[C]---[F] [H]----[K] [N] 548 (0, ) (4, ) (3, ) (6, ) (8, ) (11, ) (14, ) 549 | | | | | | 550 | | | | | | 551 [A]---------[B] [G]----| [L]------[M] 552 (1, ) (2, ) (7, ) (12, ) (13, ) 554 (b) with DFS values assigned (D(x), L(x)) 556 [E]----| [J]---------[I] [P]------[O] 557 (5,0) | (10,3) (9,3) (16,11) (15,11) 558 | | | | | | 559 | | | | | | 560 [R] [D]---[C]---[F] [H]----[K] [N] 561 (0, ) (4,0) (3,0) (6,3) (8,3) (11,11) (14,11) 562 | | | | | | 563 | | | | | | 564 [A]---------[B] [G]----| [L]------[M] 565 (1,0) (2,0) (7,3) (12,11) (13,11) 567 (c) with low-point values assigned (D(x), L(x)) 569 Figure 8 571 global_variable: dfs_number 573 Lowpoint_Visit(node x, node parent, interface p_to_x) 574 D(x) = dfs_number 575 L(x) = D(x) 576 dfs_number += 1 577 x.dfs_parent = parent 578 x.dfs_parent_intf = p_to_x 579 x.lowpoint_parent = NONE 580 for each interface intf of x: 581 if D(intf.remote_node) is not set 582 Lowpoint_Visit(intf.remote_node, x, intf) 583 if L(intf.remote_node) < L(x) 584 L(x) = L(intf.remote_node) 585 x.lowpoint_parent = intf.remote_node 586 x.lowpoint_parent_intf = intf 587 else if intf.remote_node is not parent 588 if D(intf.remote_node) < L(x) 589 L(x) = D(intf.remote) 590 x.lowpoint_parent = intf.remote_node 591 x.lowpoint_parent_intf = intf 593 Run_Lowpoint(node root) 594 dfs_number = 0 595 Lowpoint_Visit(root, NONE, NONE) 597 Figure 9: Computing Low-Point value 599 From the low-point value and lowpoint parent, there are three very 600 useful things which motivate our computation. 602 First, if there is a child c of x such that L(c) >= D(x), then there 603 are no paths in the network graph that go from c or its descendants 604 to an ancestor of x - and therefore x is a cut-vertex. In Figure 8, 605 this can be seen by looking at the DFS children of C. C has two 606 children - D and F and L(F) = 3 = D(C) so it is clear that C is a 607 cut-vertex and F is in a block where C is the block's root. L(D) = 0 608 > 3 = D(C) so D has a path to the ancestors of C; in this case, D can 609 go via E to reach R. Comparing the low-point values of all a node's 610 DFS-children with the node's DFS-value is very useful because it 611 allows identification of the cut-vertices and thus the blocks. 613 Second, by repeatedly following the path given by lowpoint_parent, 614 there is a path from x back to an ancestor of x that does not use the 615 link [x, x.dfs_parent] in either direction. The full path need not 616 be taken, but this gives a way of finding an initial cycle and then 617 ears. 619 Third, as seen in Figure 8, even if L(x) < D(x), there may be a block 620 that contains both the root and a DFS-child of a node while other 621 DFS-children might be in different blocks. In this example, C's 622 child D is in the same block as R while F is not. It is important to 623 realize that the root of a block may also be the root of another 624 block. 626 4.4. Blocks in a Graph 628 A key idea for an MRT algorithm is that any non-2-connected graph is 629 made up by blocks (e.g. 2-connected clusters, cut-links, and/or 630 isolated nodes). To compute GADAGs and thus MRTs, computation is 631 done in each block to compute ADAGs or Redundant Trees and then those 632 ADAGs or Redundant Trees are combined into a GADAG or MRT. 634 [E]---| [J]-------[I] [P]---[O] 635 | | | | | | 636 | | | | | | 637 [R] [D]---[C]--[F] [H]---[K] [N] 638 | | | | | | 639 | | | | | | 640 [A]--------[B] [G]---| [L]---[M] 642 (a) A graph with four blocks that are: 643 3 2-connected clusters and a cut-link 645 [E]<--| [J]<------[I] [P]<--[O] 646 | | | ^ | ^ 647 V | V | V | 648 [R] [D]<--[C] [F] [H]<---[K] [N] 649 ^ | ^ ^ 650 | V | | 651 [A]------->[B] [G]---| [L]-->[M] 653 (b) MRT-Blue 655 [E]---| [J]-------->[I] [P]-->[O] 656 | | | 657 V V V 658 [R] [D]-->[C]<---[F] [H]<---[K] [N] 659 ^ | ^ | ^ | 660 | V | | | V 661 [A]<-------[B] [G]<--| [L]<--[M] 663 (c) MRT-Red 665 Figure 10 667 Consider the example depicted in Figure 10 (a). In this figure, a 668 special graph is presented, showing us all the ways 2-connected 669 clusters can be connected. It has four blocks: block 1 contains R, 670 A, B, C, D, E, block 2 contains C, F, G, H, I, J, block 3 contains K, 671 L, M, N, O, P, and block 4 is a cut-link containing H and K. As can 672 be observed, the first two blocks have one common node (node C) and 673 blocks 2 and 3 do not have any common node, but they are connected 674 through a cut-link that is block 4. No two blocks can have more than 675 one common node, since two blocks with at least 2 common nodes would 676 qualify as a single 2-connected cluster. 678 Moreover, observe that if we want to get from one block to another, 679 we must use a cut-vertex (the cut-vertices in this graph are C, H, 680 K), regardless of the path selected, so we can say that all the paths 681 from block 3 along the MRTs rooted at R will cross K first. This 682 observation means that if we want to find a pair of MRTs rooted at R, 683 then we need to build up a pair of RTs in block 3 with K as a root. 684 Similarly, we need to find another pair of RTs in block 2 with C as a 685 root, and finally, we need the last pair of RTs in block 1 with R as 686 a root. When all the trees are selected, we can simply combine them; 687 when a block is a cut-link (as in block 4), that cut-link is added in 688 the same direction to both of the trees. The resulting trees are 689 depicted in Figure 10 (b) and (c). 691 Similarly, to create a GADAG it is sufficient to compute ADAGs in 692 each block and connect them. 694 It is necessary, therefore, to identify the cut-vertices, the blocks 695 and identify the appropriate local-root to use for each block. 697 4.5. Determining Local-Root and Assigning Block-ID 699 Each node in a network graph has a local-root, which is the cut- 700 vertex (or root) in the same block that is closest to the root. The 701 local-root is used to determine whether two nodes share a common 702 block. 704 Compute_Localroot(node x, node localroot) 705 x.localroot = localroot 706 for each DFS child c 707 if L(c) < D(x) //x is not a cut-vertex 708 Compute_Localroot(c, x.localroot) 709 else 710 mark x as cut-vertex 711 Compute_Localroot(c, x) 713 Compute_Localroot(root, root) 715 Figure 11: A method for computing local-roots 717 There are two different ways of computing the local-root for each 718 node. The stand-alone method is given in Figure 11 and better 719 illustrates the concept; it is used by the MRT algorithms given in 720 the Appendices Appendix A and Appendix B. The MRT Lowpoint algorithm 721 computes the local-root for a block as part of computing the GADAG 722 using lowpoint inheritance; the essence of this computation is given 723 in Figure 12. 725 Get the current node, s. 726 Compute an ear(either through lowpoint inheritance 727 or by following dfs parents) from s to a ready node e. 728 (Thus, s is not e, if there is such ear.) 729 if s is e 730 for each node x in the ear that is not s 731 x.localroot = s 732 else 733 for each node x in the ear that is not s or e 734 x.localroot = e.localroot 736 Figure 12: Ear-based method for computing local-roots 738 Once the local-roots are known, two nodes X and Y are in a common 739 block if and only if one of the following three conditions apply. 741 o Y's local-root is X's local-root : They are in the same block and 742 neither is the cut-vertex closest to the root. 744 o Y's local-root is X: X is the cut-vertex closest to the root for 745 Y's block 747 o Y is X's local-root: Y is the cut-vertex closest to the root for 748 X's block 750 Once we have computed the local-root for each node in the network 751 graph, we can assign for each node, a block id that represents the 752 block in which the node is present. This computation is shown in 753 Figure 13. 755 global_var: max_block_id 757 Assign_Block_ID(x, cur_block_id) 758 x.block_id = cur_block_id 759 foreach DFS child c of x 760 if (c.local_root is x) 761 max_block_id += 1 762 Assign_Block_ID(c, max_block_id) 763 else 764 Assign_Block_ID(c, cur_block_id) 766 max_block_id = 0 767 Assign_Block_ID(root, max_block_id) 769 Figure 13: Assigning block id to identify blocks 771 5. Algorithm Sections 773 This algorithm computes one GADAG that is then used by a router to 774 determine its MRT-Blue and MRT-Red next-hops to all destinations. 775 Finally, based upon that information, alternates are selected for 776 each next-hop to each destination. The different parts of this 777 algorithm are described below. These work on a network graph after 778 its interfaces have been ordered as per Figure 14. 780 1. Compute the local MRT Island for the particular MRT Profile. 781 [See Section 5.1.] 783 2. Select the root to use for the GADAG. [See Section 5.2.] 785 3. Initialize all interfaces to UNDIRECTED. [See Section 5.3.] 787 4. Compute the DFS value,e.g. D(x), and lowpoint value, L(x). [See 788 Figure 9.] 790 5. Construct the GADAG. [See Section 5.4] 792 6. Assign directions to all interfaces that are still UNDIRECTED. 793 [See Section 5.5.] 795 7. From the computing router x, compute the next-hops for the MRT- 796 Blue and MRT-Red. [See Section 5.6.] 798 8. Identify alternates for each next-hop to each destination by 799 determining which one of the blue MRT and the red MRT the 800 computing router x should select. [See Section 5.7.] 802 To ensure consistency in computation, all routers MUST order 803 interfaces identically down to the set of links with the same metric 804 to the same neighboring node. This is necessary for the DFS, where 805 the selection order of the interfaces to explore results in different 806 trees, and for computing the GADAG, where the selection order of the 807 interfaces to use to form ears can result in different GADAGs. The 808 required ordering between two interfaces from the same router x is 809 given in Figure 14. 811 Interface_Compare(interface a, interface b) 812 if a.metric < b.metric 813 return A_LESS_THAN_B 814 if b.metric < a.metric 815 return B_LESS_THAN_A 816 if a.neighbor.loopback_addr < b.neighbor.loopback_addr 817 return A_LESS_THAN_B 818 if b.neighbor.loopback_addr < a.neighbor.loopback_addr 819 return B_LESS_THAN_A 820 // Same metric to same node, so the order doesn't matter anymore for 821 // interoperability. 822 // To have a unique, consistent total order, 823 // tie-break in OSPF based on the link's linkData as 824 // distributed in an OSPF Router-LSA 825 if a.link_data < b.link_data 826 return A_LESS_THAN_B 827 return B_LESS_THAN_A 829 Figure 14: Rules for ranking multiple interfaces. Order is from low 830 to high. 832 5.1. MRT Island Identification 834 The local MRT Island for a particular MRT profile can be determined 835 by starting from the computing router in the network graph and doing 836 a breadth-first-search (BFS), exploring only links that aren't MRT- 837 ineligible. 839 MRT_Island_Identification(topology, computing_rtr, profile_id) 840 for all routers in topology 841 rtr.IN_MRT_ISLAND = FALSE 842 computing_rtr.IN_MRT_ISLAND = TRUE 843 explore_list = { computing_rtr } 844 while (explore_list is not empty) 845 next_rtr = remove_head(explore_list) 846 for each interface in next_rtr 847 if interface is not MRT-ineligible 848 if ((interface.remote_node supports profile_id) and 849 (interface.remote_node.IN_MRT_ISLAND is FALSE)) 850 interface.remote_node.IN_MRT_ISLAND = TRUE 851 add_to_tail(explore_list, interface.remote_node) 853 Figure 15: MRT Island Identification 855 5.2. Root Selection 857 In Section 7 of [I-D.ietf-rtgwg-mrt-frr-architecture], the GADAG Root 858 Selection is described for the MRT default profile. In 859 [I-D.atlas-ospf-mrt] and [I-D.li-isis-mrt], a mechanism is given for 860 routers to advertise the GADAG Root Selection Priority and 861 consistently select a GADAG Root inside the local MRT Island. The 862 MRT Lowpoint algorithm simply requires that all routers in the MRT 863 Island MUST select the same GADAG Root; the mechanism can vary based 864 upon the MRT profile description. Before beginning computation, the 865 network graph is reduced to contain only the set of routers that 866 support the specific MRT profile whose MRTs are being computed. 868 Analysis has shown that the centrality of a router can have a 869 significant impact on the lengths of the alternate paths computed. 870 Therefore, it is RECOMMENDED that off-line analysis that considers 871 the centrality of a router be used to help determine how good a 872 choice a particular router is for the role of GADAG root. 874 5.3. Initialization 876 Before running the algorithm, there is the standard type of 877 initialization to be done, such as clearing any computed DFS-values, 878 lowpoint-values, DFS-parents, lowpoint-parents, any MRT-computed 879 next-hops, and flags associated with algorithm. 881 It is assumed that a regular SPF computation has been run so that the 882 primary next-hops from the computing router to each destination are 883 known. This is required for determining alternates at the last step. 885 Initially, all interfaces MUST be initialized to UNDIRECTED. Whether 886 they are OUTGOING, INCOMING or both is determined when the GADAG is 887 constructed and augmented. 889 Due to manageability consideration 890 [I-D.ietf-rtgwg-lfa-manageability], overload, or a transient cause 891 such as [RFC3137], it is possible that some links and nodes will be 892 marked as unusable. This constraint must be consistently flooded via 893 the IGP [I-D.atlas-ospf-mrt] [I-D.li-isis-mrt] so that links are 894 clearly known to be MRT-ineligible and not explored or used in the 895 MRT algorithm. In the algorithm description, it is assumed that such 896 links and nodes will not be explored or used and no more discussion 897 is given of this restriction. 899 5.4. MRT Lowpoint Algorithm: Computing GADAG using lowpoint inheritance 901 As discussed in Section 4.2, it is necessary to find ears from a node 902 x that is already in the GADAG (known as IN_GADAG). Two different 903 methods are used to find ears in the algorithm. The first is by 904 going to a not IN_GADAG DFS-child and then following the chain of 905 low-point parents until an IN_GADAG node is found. The second is by 906 going to a not IN_GADAG neighbor and then following the chain of DFS 907 parents until an IN_GADAG node is found. As an ear is found, the 908 associated interfaces are marked based on the direction taken. The 909 nodes in the ear are marked as IN_GADAG. In the algorithm, first the 910 ears via DFS-children are found and then the ears via DFS-neighbors 911 are found. 913 By adding both types of ears when an IN_GADAG node is processed, all 914 ears that connect to that node are found. The order in which the 915 IN_GADAG nodes is processed is, of course, key to the algorithm. The 916 order is a stack of ears so the most recent ear is found at the top 917 of the stack. Of course, the stack stores nodes and not ears, so an 918 ordered list of nodes, from the first node in the ear to the last 919 node in the ear, is created as the ear is explored and then that list 920 is pushed onto the stack. 922 Each ear represents a partial order (see Figure 4) and processing the 923 nodes in order along each ear ensures that all ears connecting to a 924 node are found before a node higher in the partial order has its ears 925 explored. This means that the direction of the links in the ear is 926 always from the node x being processed towards the other end of the 927 ear. Additionally, by using a stack of ears, this means that any 928 unprocessed nodes in previous ears can only be ordered higher than 929 nodes in the ears below it on the stack. 931 In this algorithm that depends upon Low-Point inheritance, it is 932 necessary that every node have a low-point parent that is not itself. 933 If a node is a cut-vertex, that may not yet be the case. Therefore, 934 any nodes without a low-point parent will have their low-point parent 935 set to their DFS parent and their low-point value set to the DFS- 936 value of their parent. This assignment also properly allows an ear 937 between two cut-vertices. 939 Finally, the algorithm simultaneously computes each node's local- 940 root, as described in Figure 12. This is further elaborated as 941 follows. The local-root can be inherited from the node at the end of 942 the ear unless the end of the ear is x itself, in which case the 943 local-root for all the nodes in the ear would be x. This is because 944 whenever the first cycle is found in a block, or an ear involving a 945 bridge is computed, the cut-vertex closest to the root would be x 946 itself. In all other scenarios, the properties of lowpoint/dfs 947 parents ensure that the end of the ear will be in the same block, and 948 thus inheriting its local-root would be the correct local-root for 949 all newly added nodes. 951 The pseudo-code for the GADAG algorithm (assuming that the adjustment 952 of lowpoint for cut-vertices has been made) is shown in Figure 16. 954 Construct_Ear(x, Stack, intf, type) 955 ear_list = empty 956 cur_node = intf.remote_node 957 cur_intf = intf 958 not_done = true 960 while not_done 961 cur_intf.UNDIRECTED = false 962 cur_intf.OUTGOING = true 963 cur_intf.remote_intf.UNDIRECTED = false 964 cur_intf.remote_intf.INCOMING = true 966 if cur_node.IN_GADAG is false 967 cur_node.IN_GADAG = true 968 add_to_list_end(ear_list, cur_node) 969 if type is CHILD 970 cur_intf = cur_node.lowpoint_parent_intf 971 cur_node = cur_node.lowpoint_parent 972 else type must be NEIGHBOR 973 cur_intf = cur_node.dfs_parent_intf 974 cur_node = cur_node.dfs_parent 975 else 976 not_done = false 978 if (type is CHILD) and (cur_node is x) 979 //x is a cut-vertex and the local root for 980 //the block in which the ear is computed 981 localroot = x 982 else 983 // Inherit local-root from the end of the ear 984 localroot = cur_node.localroot 985 while ear_list is not empty 986 y = remove_end_item_from_list(ear_list) 987 y.localroot = localroot 988 push(Stack, y) 990 Construct_GADAG_via_Lowpoint(topology, root) 991 root.IN_GADAG = true 992 root.localroot = root 993 Initialize Stack to empty 994 push root onto Stack 995 while (Stack is not empty) 996 x = pop(Stack) 997 foreach interface intf of x 998 if ((intf.remote_node.IN_GADAG == false) and 999 (intf.remote_node.dfs_parent is x)) 1000 Construct_Ear(x, Stack, intf, CHILD) 1001 foreach interface intf of x 1002 if ((intf.remote_node.IN_GADAG == false) and 1003 (intf.remote_node.dfs_parent is not x)) 1004 Construct_Ear(x, Stack, intf, NEIGHBOR) 1006 Construct_GADAG_via_Lowpoint(topology, root) 1008 Figure 16: Low-point Inheritance GADAG algorithm 1010 5.5. Augmenting the GADAG by directing all links 1012 The GADAG, regardless of the algorithm used to construct it, at this 1013 point could be used to find MRTs but the topology does not include 1014 all links in the network graph. That has two impacts. First, there 1015 might be shorter paths that respect the GADAG partial ordering and so 1016 the alternate paths would not be as short as possible. Second, there 1017 may be additional paths between a router x and the root that are not 1018 included in the GADAG. Including those provides potentially more 1019 bandwidth to traffic flowing on the alternates and may reduce 1020 congestion compared to just using the GADAG as currently constructed. 1022 The goal is thus to assign direction to every remaining link marked 1023 as UNDIRECTED to improve the paths and number of paths found when the 1024 MRTs are computed. 1026 To do this, we need to establish a total order that respects the 1027 partial order described by the GADAG. This can be done using Kahn's 1028 topological sort[Kahn_1962_topo_sort] which essentially assigns a 1029 number to a node x only after all nodes before it (e.g. with a link 1030 incoming to x) have had their numbers assigned. The only issue with 1031 the topological sort is that it works on DAGs and not ADAGs or 1032 GADAGs. 1034 To convert a GADAG to a DAG, it is necessary to remove all links that 1035 point to a root of block from within that block. That provides the 1036 necessary conversion to a DAG and then a topological sort can be 1037 done. Finally, all UNDIRECTED links are assigned a direction based 1038 upon the partial ordering. Any UNDIRECTED links that connect to a 1039 root of a block from within that block are assigned a direction 1040 INCOMING to that root. The exact details of this whole process are 1041 captured in Figure 17 1042 Set_Block_Root_Incoming_Links(topo, root, mark_or_clear) 1043 foreach node x in topo 1044 if node x is a cut-vertex or root 1045 foreach interface i of x 1046 if (i.remote_node.localroot is x) 1047 if i.UNDIRECTED 1048 i.OUTGOING = true 1049 i.remote_intf.INCOMING = true 1050 i.UNDIRECTED = false 1051 i.remote_intf.UNDIRECTED = false 1052 if i.INCOMING 1053 if mark_or_clear is mark 1054 if i.OUTGOING // a cut-link 1055 i.STORE_INCOMING = true 1056 i.INCOMING = false 1057 i.remote_intf.STORE_OUTGOING = true 1058 i.remote_intf.OUTGOING = false 1059 i.TEMP_UNUSABLE = true 1060 i.remote_intf.TEMP_UNUSABLE = true 1061 else 1062 i.TEMP_UNUSABLE = false 1063 i.remote_intf.TEMP_UNUSABLE = false 1064 if i.STORE_INCOMING and (mark_or_clear is clear) 1065 i.INCOMING = true 1066 i.STORE_INCOMING = false 1067 i.remote_intf.OUTGOING = true 1068 i.remote_intf.STORE_OUTGOING = false 1070 Run_Topological_Sort_GADAG(topo, root) 1071 Set_Block_Root_Incoming_Links(topo, root, MARK) 1072 foreach node x 1073 set x.unvisited to the count of x's incoming interfaces 1074 that aren't marked TEMP_UNUSABLE 1075 Initialize working_list to empty 1076 Initialize topo_order_list to empty 1077 add_to_list_end(working_list, root) 1078 while working_list is not empty 1079 y = remove_start_item_from_list(working_list) 1080 add_to_list_end(topo_order_list, y) 1081 foreach interface i of y 1082 if (i.OUTGOING) and (not i.TEMP_UNUSABLE) 1083 i.remote_node.unvisited -= 1 1084 if i.remote_node.unvisited is 0 1085 add_to_list_end(working_list, i.remote_node) 1086 next_topo_order = 1 1087 while topo_order_list is not empty 1088 y = remove_start_item_from_list(topo_order_list) 1089 y.topo_order = next_topo_order 1090 next_topo_order += 1 1091 Set_Block_Root_Incoming_Links(topo, root, CLEAR) 1093 Add_Undirected_Links(topo, root) 1094 Run_Topological_Sort_GADAG(topo, root) 1095 foreach node x in topo 1096 foreach interface i of x 1097 if i.UNDIRECTED 1098 if x.topo_order < i.remote_node.topo_order 1099 i.OUTGOING = true 1100 i.UNDIRECTED = false 1101 i.remote_intf.INCOMING = true 1102 i.remote_intf.UNDIRECTED = false 1103 else 1104 i.INCOMING = true 1105 i.UNDIRECTED = false 1106 i.remote_intf.OUTGOING = true 1107 i.remote_intf.UNDIRECTED = false 1109 Add_Undirected_Links(topo, root) 1111 Figure 17: Assigning direction to UNDIRECTED links 1113 Proxy-nodes do not need to be added to the network graph. They 1114 cannot be transited and do not affect the MRTs that are computed. 1115 The details of how the MRT-Blue and MRT-Red next-hops are computed 1116 and how the appropriate alternate next-hops are selected is given in 1117 Section 5.8. 1119 5.6. Compute MRT next-hops 1121 As was discussed in Section 4.1, once a ADAG is found, it is 1122 straightforward to find the next-hops from any node X to the ADAG 1123 root. However, in this algorithm, we want to reuse the common GADAG 1124 and find not only the one pair of MRTs rooted at the GADAG root with 1125 it, but find a pair rooted at each node. This is useful since it is 1126 significantly faster to compute. It may also provide easier 1127 troubleshooting of the MRT-Red and MRT-Blue. 1129 The method for computing differently rooted MRTs from the common 1130 GADAG is based on two ideas. First, if two nodes X and Y are ordered 1131 with respect to each other in the partial order, then an SPF along 1132 OUTGOING links (an increasing-SPF) and an SPF along INCOMING links (a 1133 decreasing-SPF) can be used to find the increasing and decreasing 1134 paths. Second, if two nodes X and Y aren't ordered with respect to 1135 each other in the partial order, then intermediary nodes can be used 1136 to create the paths by increasing/decreasing to the intermediary and 1137 then decreasing/increasing to reach Y. 1139 As usual, the two basic ideas will be discussed assuming the network 1140 is two-connected. The generalization to multiple blocks is discussed 1141 in Section 5.6.4. The full algorithm is given in Section 5.6.5. 1143 5.6.1. MRT next-hops to all nodes partially ordered with respect to the 1144 computing node 1146 To find two node-disjoint paths from the computing router X to any 1147 node Y, depends upon whether Y >> X or Y << X. As shown in 1148 Figure 18, if Y >> X, then there is an increasing path that goes from 1149 X to Y without crossing R; this contains nodes in the interval [X,Y]. 1150 There is also a decreasing path that decreases towards R and then 1151 decreases from R to Y; this contains nodes in the interval 1152 [X,R-small] or [R-great,Y]. The two paths cannot have common nodes 1153 other than X and Y. 1155 [Y]<---(Cloud 2)<--- [X] 1156 | ^ 1157 | | 1158 V | 1159 (Cloud 3)--->[R]--->(Cloud 1) 1161 MRT-Blue path: X->Cloud 2->Y 1162 MRT-Red path: X->Cloud 1->R->Cloud 3->Y 1164 Figure 18: Y >> X 1166 Similar logic applies if Y << X, as shown in Figure 19. In this 1167 case, the increasing path from X increases to R and then increases 1168 from R to Y to use nodes in the intervals [X,R-great] and [R-small, 1169 Y]. The decreasing path from X reaches Y without crossing R and uses 1170 nodes in the interval [Y,X]. 1172 [X]<---(Cloud 2)<--- [Y] 1173 | ^ 1174 | | 1175 V | 1176 (Cloud 3)--->[R]--->(Cloud 1) 1178 MRT-Blue path: X->Cloud 3->R->Cloud 1->Y 1179 MRT-Red path: X->Cloud 2->Y 1181 Figure 19: Y << X 1183 5.6.2. MRT next-hops to all nodes not partially ordered with respect to 1184 the computing node 1186 When X and Y are not ordered, the first path should increase until we 1187 get to a node G, where G >> Y. At G, we need to decrease to Y. The 1188 other path should be just the opposite: we must decrease until we get 1189 to a node H, where H << Y, and then increase. Since R is smaller and 1190 greater than Y, such G and H must exist. It is also easy to see that 1191 these two paths must be node disjoint: the first path contains nodes 1192 in interval [X,G] and [Y,G], while the second path contains nodes in 1193 interval [H,X] and [H,Y]. This is illustrated in Figure 20. It is 1194 necessary to decrease and then increase for the MRT-Blue and increase 1195 and then decrease for the MRT-Red; if one simply increased for one 1196 and decreased for the other, then both paths would go through the 1197 root R. 1199 (Cloud 6)<---[Y]<---(Cloud 5)<------------| 1200 | | 1201 | | 1202 V | 1203 [G]--->(Cloud 4)--->[R]--->(Cloud 1)--->[H] 1204 ^ | 1205 | | 1206 | | 1207 (Cloud 3)<---[X]<---(Cloud 2)<-----------| 1209 MRT-Blue path: decrease to H and increase to Y 1210 X->Cloud 2->H->Cloud 5->Y 1211 MRT-Red path: increase to G and decrease to Y 1212 X->Cloud 3->G->Cloud 6->Y 1214 Figure 20: X and Y unordered 1216 This gives disjoint paths as long as G and H are not the same node. 1217 Since G >> Y and H << Y, if G and H could be the same node, that 1218 would have to be the root R. This is not possible because there is 1219 only one incoming interface to the root R which is created when the 1220 initial cycle is found. Recall from Figure 6 that whenever an ear 1221 was found to have an end that was the root R, the ear was directed 1222 from R so that the associated interface on R is outgoing and not 1223 incoming. Therefore, there must be exactly one node M which is the 1224 largest one before R, so the MRT-Red path will never reach R; it will 1225 turn at M and decrease to Y. 1227 5.6.3. Computing Redundant Tree next-hops in a 2-connected Graph 1229 The basic ideas for computing RT next-hops in a 2-connected graph 1230 were given in Section 5.6.1 and Section 5.6.2. Given these two 1231 ideas, how can we find the trees? 1233 If some node X only wants to find the next-hops (which is usually the 1234 case for IP networks), it is enough to find which nodes are greater 1235 and less than X, and which are not ordered; this can be done by 1236 running an increasing-SPF and a decreasing-SPF rooted at X and not 1237 exploring any links from the ADAG root. ( Traversal algorithms other 1238 than SPF could safely be used instead where one traversal takes the 1239 links in their given directions and the other reverses the links' 1240 directions.) 1242 An increasing-SPF rooted at X and not exploring links from the root 1243 will find the increasing next-hops to all Y >> X. Those increasing 1244 next-hops are X's next-hops on the MRT-Blue to reach Y. An 1245 decreasing-SPF rooted at X and not exploring links from the root will 1246 find the decreasing next-hops to all Z << X. Those decreasing next- 1247 hops are X's next-hops on the MRT-Red to reach Z. Since the root R 1248 is both greater than and less than X, after this increasing-SPF and 1249 decreasing-SPF, X's next-hops on the MRT-Blue and on the MRT-Red to 1250 reach R are known. For every node Y >> X, X's next-hops on the MRT- 1251 Red to reach Y are set to those on the MRT-Red to reach R. For every 1252 node Z << X, X's next-hops on the MRT-Blue to reach Z are set to 1253 those on the MRT-Blue to reach R. 1255 For those nodes, which were not reached, we have the next-hops as 1256 well. The increasing MRT-Blue next-hop for a node, which is not 1257 ordered, is the next-hop along the decreasing MRT-Red towards R and 1258 the decreasing MRT-Red next-hop is the next-hop along the increasing 1259 MRT-Blue towards R. Naturally, since R is ordered with respect to all 1260 the nodes, there will always be an increasing and a decreasing path 1261 towards it. This algorithm does not provide the complete specific 1262 path taken but just the appropriate next-hops to use. The identity 1263 of G and H is not determined. 1265 The final case to considered is when the root R computes its own 1266 next-hops. Since the root R is << all other nodes, running an 1267 increasing-SPF rooted at R will reach all other nodes; the MRT-Blue 1268 next-hops are those found with this increasing-SPF. Similarly, since 1269 the root R is >> all other nodes, running a decreasing-SPF rooted at 1270 R will reach all other nodes; the MRT-Red next-hops are those found 1271 with this decreasing-SPF. 1273 E---D---| E<--D<--| 1274 | | | | ^ | 1275 | | | V | | 1276 R F C R F C 1277 | | | | ^ ^ 1278 | | | V | | 1279 A---B---| A-->B---| 1281 (a) (b) 1282 A 2-connected graph A spanning ADAG rooted at R 1284 Figure 21 1286 As an example consider the situation depicted in Figure 21. There 1287 node C runs an increasing-SPF and a decreasing-SPF The increasing-SPF 1288 reaches D, E and R and the decreasing-SPF reaches B, A and R. So 1289 towards E the increasing next-hop is D (it was reached though D), and 1290 the decreasing next-hop is B (since R was reached though B). Since 1291 both D and B, A and R will compute the next hops similarly, the 1292 packets will reach E. 1294 We have the next-hops towards F as well: since F is not ordered with 1295 respect to C, the MRT-Blue next-hop is the decreasing one towards R 1296 (which is B) and the MRT-Red next-hop is the increasing one towards R 1297 (which is D). Since B is ordered with F, it will find, for its MRT- 1298 Blue, a real increasing next-hop, so packet forwarded to B will get 1299 to F on path C-B-F. Similarly, D will have, for its MRT-Red, a real 1300 decreasing next-hop, and the packet will use path C-D-F. 1302 5.6.4. Generalizing for graph that isn't 2-connected 1304 If a graph isn't 2-connected, then the basic approach given in 1305 Section 5.6.3 needs some extensions to determine the appropriate MRT 1306 next-hops to use for destinations outside the computing router X's 1307 blocks. In order to find a pair of maximally redundant trees in that 1308 graph we need to find a pair of RTs in each of the blocks (the root 1309 of these trees will be discussed later), and combine them. 1311 When computing the MRT next-hops from a router X, there are three 1312 basic differences: 1314 1. Only nodes in a common block with X should be explored in the 1315 increasing-SPF and decreasing-SPF. 1317 2. Instead of using the GADAG root, X's local-root should be used. 1318 This has the following implications: 1320 A. The links from X's local-root should not be explored. 1322 B. If a node is explored in the outgoing SPF so Y >> X, then X's 1323 MRT-Red next-hops to reach Y uses X's MRT-Red next-hops to 1324 reach X's local-root and if Z << X, then X's MRT-Blue next- 1325 hops to reach Z uses X's MRT-Blue next-hops to reach X's 1326 local-root. 1328 C. If a node W in a common block with X was not reached in the 1329 increasing-SPF or decreasing-SPF, then W is unordered with 1330 respect to X. X's MRT-Blue next-hops to W are X's decreasing 1331 aka MRT-Red next-hops to X's local-root. X's MRT-Red next- 1332 hops to W are X's increasing aka Blue MRT next-hops to X's 1333 local-root. 1335 3. For nodes in different blocks, the next-hops must be inherited 1336 via the relevant cut-vertex. 1338 These are all captured in the detailed algorithm given in 1339 Section 5.6.5. 1341 5.6.5. Complete Algorithm to Compute MRT Next-Hops 1343 The complete algorithm to compute MRT Next-Hops for a particular 1344 router X is given in Figure 22. In addition to computing the MRT- 1345 Blue next-hops and MRT-Red next-hops used by X to reach each node Y, 1346 the algorithm also stores an "order_proxy", which is the proper cut- 1347 vertex to reach Y if it is outside the block, and which is used later 1348 in deciding whether the MRT-Blue or the MRT-Red can provide an 1349 acceptable alternate for a particular primary next-hop. 1351 In_Common_Block(x, y) 1352 if (((x.localroot is y.localroot) and (x.block_id is y.block_id)) 1353 or (x is y.localroot) or (y is x.localroot)) 1354 return true 1355 return false 1357 Store_Results(y, direction, spf_root, store_nhs) 1358 if direction is FORWARD 1359 y.higher = true 1360 if store_nhs 1361 y.blue_next_hops = y.next_hops 1362 if direction is REVERSE 1363 y.lower = true 1364 if store_nhs 1365 y.red_next_hops = y.next_hops 1367 SPF_No_Traverse_Root(spf_root, block_root, direction, store_nhs) 1368 Initialize spf_heap to empty 1369 Initialize nodes' spf_metric to infinity and next_hops to empty 1370 spf_root.spf_metric = 0 1371 insert(spf_heap, spf_root) 1372 while (spf_heap is not empty) 1373 min_node = remove_lowest(spf_heap) 1374 Store_Results(min_node, direction, spf_root, store_nhs) 1375 if ((min_node is spf_root) or (min_node is not block_root)) 1376 foreach interface intf of min_node 1377 if (((direction is FORWARD) and intf.OUTGOING) or 1378 ((direction is REVERSE) and intf.INCOMING) and 1379 In_Common_Block(spf_root, intf.remote_node)) 1380 path_metric = min_node.spf_metric + intf.metric 1381 if path_metric < intf.remote_node.spf_metric 1382 intf.remote_node.spf_metric = path_metric 1383 if min_node is spf_root 1384 intf.remote_node.next_hops = make_list(intf) 1385 else 1386 intf.remote_node.next_hops = min_node.next_hops 1387 insert_or_update(spf_heap, intf.remote_node) 1388 else if path_metric is intf.remote_node.spf_metric 1389 if min_node is spf_root 1390 add_to_list(intf.remote_node.next_hops, intf) 1391 else 1392 add_list_to_list(intf.remote_node.next_hops, 1393 min_node.next_hops) 1395 SetEdge(y) 1396 if y.blue_next_hops is empty and y.red_next_hops is empty 1397 if (y.local_root != y) { 1398 SetEdge(y.localroot) 1399 } 1400 y.blue_next_hops = y.localroot.blue_next_hops 1401 y.red_next_hops = y.localroot.red_next_hops 1402 y.order_proxy = y.localroot.order_proxy 1404 Compute_MRT_NextHops(x, root) 1405 foreach node y 1406 y.higher = y.lower = false 1407 clear y.red_next_hops and y.blue_next_hops 1408 y.order_proxy = y 1409 SPF_No_Traverse_Root(x, x.localroot, FORWARD, TRUE) 1410 SPF_No_Traverse_Root(x, x.localroot, REVERSE, TRUE) 1412 // red and blue next-hops are stored to x.localroot as different 1413 // paths are found via the SPF and reverse-SPF. 1414 // Similarly any nodes whose local-root is x will have their 1415 // red_next_hops and blue_next_hops already set. 1417 // Handle nodes in the same block that aren't the local-root 1418 foreach node y 1419 if (y.IN_MRT_ISLAND and (y is not x) and 1420 (y.localroot is x.localroot) and 1421 ((y is x.localroot) or (x is y.localroot) or 1422 (y.block_id is x.block_id))) 1423 if y.higher 1424 y.red_next_hops = x.localroot.red_next_hops 1425 else if y.lower 1426 y.blue_next_hops = x.localroot.blue_next_hops 1427 else 1428 y.blue_next_hops = x.localroot.red_next_hops 1429 y.red_next_hops = x.localroot.blue_next_hops 1431 // Inherit next-hops and order_proxies to other components 1432 if x is not root 1433 root.blue_next_hops = x.localroot.blue_next_hops 1434 root.red_next_hops = x.localroot.red_next_hops 1435 root.order_proxy = x.localroot 1436 foreach node y 1437 if (y is not root) and (y is not x) and y.IN_MRT_ISLAND 1438 SetEdge(y) 1440 max_block_id = 0 1441 Assign_Block_ID(root, max_block_id) 1442 Compute_MRT_NextHops(x, root) 1444 Figure 22 1446 5.7. Identify MRT alternates 1448 At this point, a computing router S knows its MRT-Blue next-hops and 1449 MRT-Red next-hops for each destination in the MRT Island. The 1450 primary next-hops along the SPT are also known. It remains to 1451 determine for each primary next-hop to a destination D, which of the 1452 MRTs avoids the primary next-hop node F. This computation depends 1453 upon data set in Compute_MRT_NextHops such as each node y's 1454 y.blue_next_hops, y.red_next_hops, y.order_proxy, y.higher, y.lower 1455 and topo_orders. Recall that any router knows only which are the 1456 nodes greater and lesser than itself, but it cannot decide the 1457 relation between any two given nodes easily; that is why we need 1458 topological ordering. 1460 For each primary next-hop node F to each destination D, S can call 1461 Select_Alternates(S, D, F, primary_intf) to determine whether to use 1462 the MRT-Blue next-hops as the alternate next-hop(s) for that primary 1463 next hop or to use the MRT-Red next-hops. The algorithm is given in 1464 Figure 23 and discussed afterwards. 1466 Select_Alternates_Internal(S, D, F, primary_intf, 1467 D_lower, D_higher, D_topo_order) 1469 //When D==F, we can do only link protection 1470 if ((D is F) or (D.order_proxy is F)) 1471 if an MRT doesn't use primary_intf 1472 indicate alternate is not node-protecting 1473 return that MRT color 1474 else // parallel links are cut-links 1475 return AVOID_LINK_ON_BLUE 1477 if (D_lower and D_higher and F.lower and F.higher) 1478 if F.topo_order < D_topo_order 1479 return USE_RED 1480 else 1481 return USE_BLUE 1483 if (D_lower and D_higher) 1484 if F.higher 1485 return USE_RED 1486 else 1487 return USE_BLUE 1489 if (F.lower and F.higher) 1490 if D_lower 1491 return USE_RED 1492 else if D_higher 1493 return USE_BLUE 1494 else 1495 if primary_intf.OUTGOING and primary_intf.INCOMING 1496 return AVOID_LINK_ON_BLUE 1497 if primary_intf.OUTGOING is true 1498 return USE_BLUE 1499 if primary_intf.INCOMING is true 1500 return USE_RED 1502 if D_higher 1503 if F.higher 1504 if F.topo_order < D_topo_order 1505 return USE_RED 1506 else 1507 return USE_BLUE 1508 else if F.lower 1509 return USE_BLUE 1510 else 1511 // F and S are neighbors so either F << S or F >> S 1512 else if D_lower 1513 if F.higher 1514 return USE_RED 1515 else if F.lower 1516 if F.topo_order < D_topo_order 1517 return USE_RED 1518 else 1519 return USE_BLUE 1520 else 1521 // F and S are neighbors so either F << S or F >> S 1522 else // D and S not ordered 1523 if F.lower 1524 return USE_RED 1525 else if F.higher 1526 return USE_BLUE 1527 else 1528 // F and S are neighbors so either F << S or F >> S 1530 Select_Alternates(S, D, F, primary_intf) 1531 if D.order_proxy is not D 1532 D_lower = D.order_proxy.lower 1533 D_higher = D.order_proxy.higher 1534 D_topo_order = D.order_proxy.topo_order 1535 else 1536 D_lower = D.lower 1537 D_higher = D.higher 1538 D_topo_order = D.topo_order 1539 return Select_Alternates_Internal(S, D, F, primary_intf, 1540 D_lower, D_higher, D_topo_order) 1542 Figure 23 1544 If either D>>S>>F or D<>D (ii) F<D.topo_order, either case (i) or 1555 case (iii) holds true, which means that selecting the Blue next-hop 1556 is safe. Similarly, if F.topo_order>S, we 1566 should use the Blue next-hop. 1568 Additionally, the cases where either F or D is ordered both higher 1569 and lower must be considered; this can happen when one is a block- 1570 root or its order_proxy is. If D is both higher and lower than S, 1571 then the MRT to use is the one that avoids F so if F is higher, then 1572 the MRT-Red should be used and if F is lower, then the MRT-Blue 1573 should be used; F and S must be ordered because they are neighbors. 1574 If F is both higher and lower, then if D is lower, using the MRT-Red 1575 to decrease reaches D and if D is higher, using the Blue MRT to 1576 increase reaches D; if D is unordered compared to S, then the 1577 situation is a bit more complicated. 1579 In the case where F< F, then use the MRT-Blue (decrease to avoid 1582 that link and then increase). If the link is directed S <- F, then 1583 use the MRT-Red (increase to avoid that link and then decrease). If 1584 the link is S <--> F, then the link must be a cut-link and there is 1585 no node-protecting alternate. If there are multiple links between S 1586 and F, then they can protect against each other; of course, in this 1587 situation, they are probably already ECMP. 1589 Finally, there is the case where D is also F. In this case, only link 1590 protection is possible. The MRT that doesn't use the indicated 1591 primary next-hop is used. If both MRTs use the primary next-hop, 1592 then the primary next-hop must be a cut-link so either MRT could be 1593 used but the set of MRT next-hops must be pruned to avoid that 1594 primary next-hop. To indicate this case, Select_Alternates returns 1595 AVOID_LINK_ON_BLUE. 1597 As an example, consider the ADAG depicted in Figure 24 and first 1598 suppose that G is the source, D is the destination and H is the 1599 failed next-hop. Since D>>G, we need to compare H.topo_order and 1600 D.topo_order. Since D.topo_order>H.topo_order, D must be not smaller 1601 than H, so we should select the decreasing path towards the root. 1602 If, however, the destination were instead J, we must find that 1603 H.topo_order>J.topo_order, so we must choose the increasing Blue 1604 next-hop to J, which is I. In the case, when instead the destination 1605 is C, we find that we need to first decrease to avoid using H, so the 1606 Blue, first decreasing then increasing, path is selected. 1608 [E]<-[D]<-[H]<-[J] 1609 | ^ ^ ^ 1610 V | | | 1611 [R] [C] [G]->[I] 1612 | ^ ^ ^ 1613 V | | | 1614 [A]->[B]->[F]---| 1616 (a) 1617 a 2-connected graph 1619 Figure 24 1621 5.8. Finding FRR Next-Hops for Proxy-Nodes 1623 As discussed in Section 10.2 of 1624 [I-D.ietf-rtgwg-mrt-frr-architecture], it is necessary to find MRT- 1625 Blue and MRT-Red next-hops and MRT-FRR alternates for a named proxy- 1626 nodes. An example case is for a router that is not part of that 1627 local MRT Island, when there is only partial MRT support in the 1628 domain. 1630 A first incorrect and naive approach to handling proxy-nodes, which 1631 cannot be transited, is to simply add these proxy-nodes to the graph 1632 of the network and connect it to the routers through which the new 1633 proxy-node can be reached. Unfortunately, this can introduce some 1634 new ordering between the border routers connected to the new node 1635 which could result in routing MRT paths through the proxy-node. 1636 Thus, this naive approach would need to recompute GADAGs and redo 1637 SPTs for each proxy-node. 1639 Instead of adding the proxy-node to the original network graph, each 1640 individual proxy-node can be individually added to the GADAG. The 1641 proxy-node is connected to at most two nodes in the GADAG. 1642 Section 10.2 of [I-D.ietf-rtgwg-mrt-frr-architecture] defines how the 1643 proxy-node attachments MUST be determined. The degenerate case where 1644 the proxy-node is attached to only one node in the GADAG is trivial 1645 as all needed information can be derived from that attachment node; 1646 if there are different interfaces, then some can be assigned to MRT- 1647 Red and others to MRT_Blue. 1649 Now, consider the proxy-node that is attached to exactly two nodes in 1650 the GADAG. Let the order_proxies of these nodes be A and B. Let the 1651 current node, where next-hop is just being calculated, be S. If one 1652 of these two nodes A and B is the local root of S, let A=S.local_root 1653 and the other one be B. Otherwise, let A.topo_order < B.topo_order. 1655 A valid GADAG was constructed. Instead doing an increasing-SPF and a 1656 decreasing-SPF to find ordering for the proxy-nodes, the following 1657 simple rules, providing the same result, can be used independently 1658 for each different proxy-node. For the following rules, let 1659 X=A.local_root, and if A is the local root, let that be strictly 1660 lower than any other node. Always take the first rule that matches. 1662 Rule Condition Blue NH Red NH Notes 1663 1 S=X Blue to A Red to B 1664 2 S<>B Blue to R Red to B 1666 4 A<> S, 1749 the computing router, MUST be one or more nodes, T, whose topo_order 1750 is in the interval [X.topo_order, Y.topo_order] and where Y >> T or Y 1751 is T. Similarly, the MRT-Red next-hops MUST be have a topo_order in 1752 the interval [R-small.topo_order, X.topo_order] or [Y.topo_order, 1753 R-big.topo_order]. 1755 Implementations SHOULD implement the Select_Alternates() function to 1756 pick an MRT-FRR alternate. 1758 In a future version, this section will include pseudo-code describing 1759 the full code path through the pseudo-code given earlier in the 1760 draft. 1762 7. Algorithm Alternatives and Evaluation 1764 This specification defines the MRT Lowpoint Algorithm, which is one 1765 option among several possible MRT algorithms. Other alternatives are 1766 described in the appendices. 1768 In addition, it is possible to calculate Destination-Rooted GADAG, 1769 where for each destination, a GADAG rooted at that destination is 1770 computed. Then a router can compute the blue MRT and red MRT next- 1771 hops to that destination. Building GADAGs per destination is 1772 computationally more expensive, but may give somewhat shorter 1773 alternate paths. It may be useful for live-live multicast along 1774 MRTs. 1776 7.1. Algorithm Evaluation 1778 The MRT Lowpoint algorithm is the lowest computation of the MRT 1779 algorithms. Two other MRT algorithms are provided in Appendix A and 1780 Appendix B. When analyzed on service provider network topologies, 1781 they did not provide significant differences in the path lenghts for 1782 the alternatives. This section does not focus on that analysis or 1783 the decision to use the MRT Lowpoint algorithm as the default MRT 1784 algorithm; it has the lowest computational and storage requirements 1785 and gave comparable results. 1787 Since this document defines the MRT Lowpoint algorithm for use in 1788 fast-reroute applications, it is useful to compare MRT and Remote LFA 1789 [I-D.ietf-rtgwg-remote-lfa]. This section compares MRT and remote 1790 LFA for IP Fast Reroute in 19 service provider network topologies, 1791 focusing on coverage and alternate path length. Figure 26 shows the 1792 node-protecting coverage provided by local LFA (LLFA), remote LFA 1793 (RLFA), and MRT against different failure scenarios in these 1794 topologies. The coverage values are calculated as the percentage of 1795 source-destination pairs protected by the given IPFRR method relative 1796 to those protectable by optimal routing, against the same failure 1797 modes. More details on alternate selection policies used for this 1798 analysis are described later in this section. 1800 +------------+-----------------------------+ 1801 | Topology | percentage of failure | 1802 | | scenarios covered by | 1803 | | IPFRR method | 1804 | |-----------------------------+ 1805 | | NP_LLFA | NP_RLFA | MRT | 1806 +------------+---------+---------+---------+ 1807 | T201 | 37 | 90 | 100 | 1808 | T202 | 73 | 83 | 100 | 1809 | T203 | 51 | 80 | 100 | 1810 | T204 | 55 | 81 | 100 | 1811 | T205 | 92 | 93 | 100 | 1812 | T206 | 71 | 74 | 100 | 1813 | T207 | 57 | 74 | 100 | 1814 | T208 | 66 | 81 | 100 | 1815 | T209 | 79 | 79 | 100 | 1816 | T210 | 95 | 98 | 100 | 1817 | T211 | 68 | 71 | 100 | 1818 | T212 | 59 | 63 | 100 | 1819 | T213 | 84 | 84 | 100 | 1820 | T214 | 68 | 78 | 100 | 1821 | T215 | 84 | 88 | 100 | 1822 | T216 | 43 | 59 | 100 | 1823 | T217 | 78 | 88 | 100 | 1824 | T218 | 72 | 75 | 100 | 1825 | T219 | 78 | 84 | 100 | 1826 +------------+---------+---------+---------+ 1828 Figure 26 1830 For the topologies analyzed here, LLFA is able to provide node- 1831 protecting coverage ranging from 37% to 95% of the source-destination 1832 pairs, as seen in the column labeled NP_LLFA. The use of RLFA in 1833 addition to LLFA is generally able to increase the node-protecting 1834 coverage. The percentage of node-protecting coverage with RLFA is 1835 provided in the column labeled NP_RLFA, ranges from 59% to 98% for 1836 these topologies. The node-protecting coverage provided by MRT is 1837 100% since MRT is able to provide protection for any source- 1838 destination pair for which a path still exists after the failure. 1840 We would also like to measure the quality of the alternate paths 1841 produced by these different IPFRR methods. An obvious approach is to 1842 take an average of the alternate path costs over all source- 1843 destination pairs and failure modes. However, this presents a 1844 problem, which we will illustrate by presenting an example of results 1845 for one topology using this approach ( Figure 27). In this table, 1846 the average relative path length is the alternate path length for the 1847 IPFRR method divided by the optimal alternate path length, averaged 1848 over all source-destination pairs and failure modes. The first three 1849 columns of data in the table give the path length calculated from the 1850 sum of IGP metrics of the links in the path. The results for 1851 topology T208 show that the metric-based path lengths for NP_LLFA and 1852 NP_RLFA alternates are on average 78 and 66 times longer than the 1853 path lengths for optimal alternates. The metric-based path lengths 1854 for MRT alternates are on average 14 times longer than for optimal 1855 alternates. 1857 +--------+------------------------------------------------+ 1858 | | average relative alternate path length | 1859 | |-----------------------+------------------------+ 1860 |Topology| IGP metric | hopcount | 1861 | |-----------------------+------------------------+ 1862 | |NP_LLFA |NP_RLFA | MRT |NP_LLFA |NP_RLFA | MRT | 1863 +--------+--------+--------+-----+--------+--------+------+ 1864 | T208 | 78.2 | 66.0 | 13.6| 0.99 | 1.01 | 1.32 | 1865 +--------+--------+--------+-----+--------+--------+------+ 1867 Figure 27 1869 The network topology represented by T208 uses values of 10, 100, and 1870 1000 as IGP costs, so small deviations from the optimal alternate 1871 path can result in large differences in relative path length. LLFA, 1872 RLFA, and MRT all allow for at least one hop in the alterate path to 1873 be chosen independent of the cost of the link. This can easily 1874 result in an alternate using a link with cost 1000, which introduces 1875 noise into the path length measurement. In the case of T208, the 1876 adverse effects of using metric-based path lengths is obvious. 1877 However, we have observed that the metric-based path length 1878 introduces noise into alternate path length measurements in several 1879 other topologies as well. For this reason, we have opted to measure 1880 the alternate path length using hopcount. While IGP metrics may be 1881 adjusted by the network operator for a number of reasons (e.g. 1882 traffic engineering), the hopcount is a fairly stable measurement of 1883 path length. As shown in the last three columns of Figure 27, the 1884 hopcount-based alternate path lengths for topology T208 are fairly 1885 well-behaved. 1887 Figure 28, Figure 29, Figure 30, and Figure 31 present the hopcount- 1888 based path length results for the 19 topologies examined. The 1889 topologies in the four tables are grouped based on the size of the 1890 topologies, as measured by the number of nodes, with Figure 28 having 1891 the smallest topologies and Figure 31 having the largest topologies. 1892 Instead of trying to represent the path lengths of a large set of 1893 alternates with a single number, we have chosen to present a 1894 histogram of the path lengths for each IPFRR method and alternate 1895 selection policy studied. The first eight colums of data represent 1896 the percentage of failure scenarios protected by an alternate N hops 1897 longer than the primary path, with the first column representing an 1898 alternate 0 or 1 hops longer than the primary path, all the way up 1899 through the eighth column respresenting an alternate 14 or 15 hops 1900 longer than the primary path. The last column in the table gives the 1901 percentage of failure scenarios for which there is no alternate less 1902 than 16 hops longer than the primary path. In the case of LLFA and 1903 RLFA, this category includes failure scenarios for which no alternate 1904 was found. 1906 For each topology, the first row (labeled OPTIMAL) is the 1907 distribution of the number of hops in excess of the primary path 1908 hopcount for optimally routed alternates. (The optimal routing was 1909 done with respect to IGP metrics, as opposed to hopcount.) The 1910 second row(labeled NP_LLFA) is the distribution of the extra hops for 1911 node-protecting LLFA. The third row (labeled NP_LLFA_THEN_NP_RLFA) 1912 is the hopcount distribution when one adds node-protecting RLFA to 1913 increase the coverage. The alternate selection policy used here 1914 first tries to find a node-protecting LLFA. If that does not exist, 1915 then it tries to find an RLFA, and checks if it is node-protecting. 1916 Comparing the hopcount distribution for RLFA and LLFA across these 1917 topologies, one can see how the coverage is increased at the expense 1918 of using longer alternates. It is also worth noting that while 1919 superficially LLFA and RLFA appear to have better hopcount 1920 distributions than OPTIMAL, the presence of entries in the last 1921 column (no alternate < 16) mainly represent failure scenarios that 1922 are not protected, for which the hopcount is effectively infinite. 1924 The fourth and fifth rows of each topology show the hopcount 1925 distributions for two alternate selection policies using MRT 1926 alternates. The policy represented by the label 1927 NP_LLFA_THEN_MRT_LOWPOINT will first use a node-protecting LLFA. If 1928 a node-protecting LLFA does not exist, then it will use an MRT 1929 alternate. The policy represented by the label MRT_LOWPOINT instead 1930 will use the MRT alternate even if a node-protecting LLFA exists. 1931 One can see from the data that combining node-protecting LLFA with 1932 MRT results in a significant shortening of the alternate hopcount 1933 distribution. 1935 +-------------------------------------------------------------------+ 1936 | | percentage of failure scenarios | 1937 | Topology name | protected by an alternate N hops | 1938 | and | longer than the primary path | 1939 | alternate selection +------------------------------------+ 1940 | policy evaluated | | | | | | | | | no | 1941 | | | | | | |10 |12 |14 | alt| 1942 | |0-1|2-3|4-5|6-7|8-9|-11|-13|-15| <16| 1943 +------------------------------+---+---+---+---+---+---+---+---+----+ 1944 | T201(avg primary hops=3.5) | | | | | | | | | | 1945 | OPTIMAL | 37| 37| 20| 3| 3| | | | | 1946 | NP_LLFA | 37| | | | | | | | 63| 1947 | NP_LLFA_THEN_NP_RLFA | 37| 34| 19| | | | | | 10| 1948 | NP_LLFA_THEN_MRT_LOWPOINT | 37| 33| 21| 6| 3| | | | | 1949 | MRT_LOWPOINT | 33| 36| 23| 6| 3| | | | | 1950 +------------------------------+---+---+---+---+---+---+---+---+----+ 1951 | T202(avg primary hops=4.8) | | | | | | | | | | 1952 | OPTIMAL | 90| 9| | | | | | | | 1953 | NP_LLFA | 71| 2| | | | | | | 27| 1954 | NP_LLFA_THEN_NP_RLFA | 78| 5| | | | | | | 17| 1955 | NP_LLFA_THEN_MRT_LOWPOINT | 80| 12| 5| 2| 1| | | | | 1956 | MRT_LOWPOINT_ONLY | 48| 29| 13| 7| 2| 1| | | | 1957 +------------------------------+---+---+---+---+---+---+---+---+----+ 1958 | T203(avg primary hops=4.1) | | | | | | | | | | 1959 | OPTIMAL | 36| 37| 21| 4| 2| | | | | 1960 | NP_LLFA | 34| 15| 3| | | | | | 49| 1961 | NP_LLFA_THEN_NP_RLFA | 35| 19| 22| 4| | | | | 20| 1962 | NP_LLFA_THEN_MRT_LOWPOINT | 36| 35| 22| 5| 2| | | | | 1963 | MRT_LOWPOINT_ONLY | 31| 35| 26| 7| 2| | | | | 1964 +------------------------------+---+---+---+---+---+---+---+---+----+ 1965 | T204(avg primary hops=3.7) | | | | | | | | | | 1966 | OPTIMAL | 76| 20| 3| 1| | | | | | 1967 | NP_LLFA | 54| 1| | | | | | | 45| 1968 | NP_LLFA_THEN_NP_RLFA | 67| 10| 4| | | | | | 19| 1969 | NP_LLFA_THEN_MRT_LOWPOINT | 70| 18| 8| 3| 1| | | | | 1970 | MRT_LOWPOINT_ONLY | 58| 27| 11| 3| 1| | | | | 1971 +------------------------------+---+---+---+---+---+---+---+---+----+ 1972 | T205(avg primary hops=3.4) | | | | | | | | | | 1973 | OPTIMAL | 92| 8| | | | | | | | 1974 | NP_LLFA | 89| 3| | | | | | | 8| 1975 | NP_LLFA_THEN_NP_RLFA | 90| 4| | | | | | | 7| 1976 | NP_LLFA_THEN_MRT_LOWPOINT | 91| 9| | | | | | | | 1977 | MRT_LOWPOINT_ONLY | 62| 33| 5| 1| | | | | | 1978 +------------------------------+---+---+---+---+---+---+---+---+----+ 1980 Figure 28 1982 +-------------------------------------------------------------------+ 1983 | | percentage of failure scenarios | 1984 | Topology name | protected by an alternate N hops | 1985 | and | longer than the primary path | 1986 | alternate selection +------------------------------------+ 1987 | policy evaluated | | | | | | | | | no | 1988 | | | | | | |10 |12 |14 | alt| 1989 | |0-1|2-3|4-5|6-7|8-9|-11|-13|-15| <16| 1990 +------------------------------+---+---+---+---+---+---+---+---+----+ 1991 | T206(avg primary hops=3.7) | | | | | | | | | | 1992 | OPTIMAL | 63| 30| 7| | | | | | | 1993 | NP_LLFA | 60| 9| 1| | | | | | 29| 1994 | NP_LLFA_THEN_NP_RLFA | 60| 13| 1| | | | | | 26| 1995 | NP_LLFA_THEN_MRT_LOWPOINT | 64| 29| 7| | | | | | | 1996 | MRT_LOWPOINT | 55| 32| 13| | | | | | | 1997 +------------------------------+---+---+---+---+---+---+---+---+----+ 1998 | T207(avg primary hops=3.9) | | | | | | | | | | 1999 | OPTIMAL | 71| 24| 5| 1| | | | | | 2000 | NP_LLFA | 55| 2| | | | | | | 43| 2001 | NP_LLFA_THEN_NP_RLFA | 63| 10| | | | | | | 26| 2002 | NP_LLFA_THEN_MRT_LOWPOINT | 70| 20| 7| 2| 1| | | | | 2003 | MRT_LOWPOINT_ONLY | 57| 29| 11| 3| 1| | | | | 2004 +------------------------------+---+---+---+---+---+---+---+---+----+ 2005 | T208(avg primary hops=4.6) | | | | | | | | | | 2006 | OPTIMAL | 58| 28| 12| 2| 1| | | | | 2007 | NP_LLFA | 53| 11| 3| | | | | | 34| 2008 | NP_LLFA_THEN_NP_RLFA | 56| 17| 7| 1| | | | | 19| 2009 | NP_LLFA_THEN_MRT_LOWPOINT | 58| 19| 10| 7| 3| 1| | | | 2010 | MRT_LOWPOINT_ONLY | 34| 24| 21| 13| 6| 2| 1| | | 2011 +------------------------------+---+---+---+---+---+---+---+---+----+ 2012 | T209(avg primary hops=3.6) | | | | | | | | | | 2013 | OPTIMAL | 85| 14| 1| | | | | | | 2014 | NP_LLFA | 79| | | | | | | | 21| 2015 | NP_LLFA_THEN_NP_RLFA | 79| | | | | | | | 21| 2016 | NP_LLFA_THEN_MRT_LOWPOINT | 82| 15| 2| | | | | | | 2017 | MRT_LOWPOINT_ONLY | 63| 29| 8| | | | | | | 2018 +------------------------------+---+---+---+---+---+---+---+---+----+ 2019 | T210(avg primary hops=2.5) | | | | | | | | | | 2020 | OPTIMAL | 95| 4| 1| | | | | | | 2021 | NP_LLFA | 94| 1| | | | | | | 5| 2022 | NP_LLFA_THEN_NP_RLFA | 94| 3| 1| | | | | | 2| 2023 | NP_LLFA_THEN_MRT_LOWPOINT | 95| 4| 1| | | | | | | 2024 | MRT_LOWPOINT_ONLY | 91| 6| 2| | | | | | | 2025 +------------------------------+---+---+---+---+---+---+---+---+----+ 2027 Figure 29 2029 +-------------------------------------------------------------------+ 2030 | | percentage of failure scenarios | 2031 | Topology name | protected by an alternate N hops | 2032 | and | longer than the primary path | 2033 | alternate selection +------------------------------------+ 2034 | policy evaluated | | | | | | | | | no | 2035 | | | | | | |10 |12 |14 | alt| 2036 | |0-1|2-3|4-5|6-7|8-9|-11|-13|-15| <16| 2037 +------------------------------+---+---+---+---+---+---+---+---+----+ 2038 | T211(avg primary hops=3.3) | | | | | | | | | | 2039 | OPTIMAL | 88| 11| | | | | | | | 2040 | NP_LLFA | 66| 1| | | | | | | 32| 2041 | NP_LLFA_THEN_NP_RLFA | 68| 3| | | | | | | 29| 2042 | NP_LLFA_THEN_MRT_LOWPOINT | 88| 12| | | | | | | | 2043 | MRT_LOWPOINT | 85| 15| 1| | | | | | | 2044 +------------------------------+---+---+---+---+---+---+---+---+----+ 2045 | T212(avg primary hops=3.5) | | | | | | | | | | 2046 | OPTIMAL | 76| 23| 1| | | | | | | 2047 | NP_LLFA | 59| | | | | | | | 41| 2048 | NP_LLFA_THEN_NP_RLFA | 61| 1| 1| | | | | | 37| 2049 | NP_LLFA_THEN_MRT_LOWPOINT | 75| 24| 1| | | | | | | 2050 | MRT_LOWPOINT_ONLY | 66| 31| 3| | | | | | | 2051 +------------------------------+---+---+---+---+---+---+---+---+----+ 2052 | T213(avg primary hops=4.3) | | | | | | | | | | 2053 | OPTIMAL | 91| 9| | | | | | | | 2054 | NP_LLFA | 84| | | | | | | | 16| 2055 | NP_LLFA_THEN_NP_RLFA | 84| | | | | | | | 16| 2056 | NP_LLFA_THEN_MRT_LOWPOINT | 89| 10| 1| | | | | | | 2057 | MRT_LOWPOINT_ONLY | 75| 24| 1| | | | | | | 2058 +------------------------------+---+---+---+---+---+---+---+---+----+ 2059 | T214(avg primary hops=5.8) | | | | | | | | | | 2060 | OPTIMAL | 71| 22| 5| 2| | | | | | 2061 | NP_LLFA | 58| 8| 1| 1| | | | | 32| 2062 | NP_LLFA_THEN_NP_RLFA | 61| 13| 3| 1| | | | | 22| 2063 | NP_LLFA_THEN_MRT_LOWPOINT | 66| 14| 7| 5| 3| 2| 1| 1| 1| 2064 | MRT_LOWPOINT_ONLY | 30| 20| 18| 12| 8| 4| 3| 2| 3| 2065 +------------------------------+---+---+---+---+---+---+---+---+----+ 2066 | T215(avg primary hops=4.8) | | | | | | | | | | 2067 | OPTIMAL | 73| 27| | | | | | | | 2068 | NP_LLFA | 73| 11| | | | | | | 16| 2069 | NP_LLFA_THEN_NP_RLFA | 73| 13| 2| | | | | | 12| 2070 | NP_LLFA_THEN_MRT_LOWPOINT | 74| 19| 3| 2| 1| 1| 1| | | 2071 | MRT_LOWPOINT_ONLY | 32| 31| 16| 12| 4| 3| 1| | | 2072 +------------------------------+---+---+---+---+---+---+---+---+----+ 2074 Figure 30 2076 +-------------------------------------------------------------------+ 2077 | | percentage of failure scenarios | 2078 | Topology name | protected by an alternate N hops | 2079 | and | longer than the primary path | 2080 | alternate selection +------------------------------------+ 2081 | policy evaluated | | | | | | | | | no | 2082 | | | | | | |10 |12 |14 | alt| 2083 | |0-1|2-3|4-5|6-7|8-9|-11|-13|-15| <16| 2084 +------------------------------+---+---+---+---+---+---+---+---+----+ 2085 | T216(avg primary hops=5.2) | | | | | | | | | | 2086 | OPTIMAL | 60| 32| 7| 1| | | | | | 2087 | NP_LLFA | 39| 4| | | | | | | 57| 2088 | NP_LLFA_THEN_NP_RLFA | 46| 12| 2| | | | | | 41| 2089 | NP_LLFA_THEN_MRT_LOWPOINT | 48| 20| 12| 7| 5| 4| 2| 1| 1| 2090 | MRT_LOWPOINT | 28| 25| 18| 11| 7| 6| 3| 2| 1| 2091 +------------------------------+---+---+---+---+---+---+---+---+----+ 2092 | T217(avg primary hops=8.0) | | | | | | | | | | 2093 | OPTIMAL | 81| 13| 5| 1| | | | | | 2094 | NP_LLFA | 74| 3| 1| | | | | | 22| 2095 | NP_LLFA_THEN_NP_RLFA | 76| 8| 3| 1| | | | | 12| 2096 | NP_LLFA_THEN_MRT_LOWPOINT | 77| 7| 5| 4| 3| 2| 1| 1| | 2097 | MRT_LOWPOINT_ONLY | 25| 18| 18| 16| 12| 6| 3| 1| | 2098 +------------------------------+---+---+---+---+---+---+---+---+----+ 2099 | T218(avg primary hops=5.5) | | | | | | | | | | 2100 | OPTIMAL | 85| 14| 1| | | | | | | 2101 | NP_LLFA | 68| 3| | | | | | | 28| 2102 | NP_LLFA_THEN_NP_RLFA | 71| 4| | | | | | | 25| 2103 | NP_LLFA_THEN_MRT_LOWPOINT | 77| 12| 7| 4| 1| | | | | 2104 | MRT_LOWPOINT_ONLY | 37| 29| 21| 10| 3| 1| | | | 2105 +------------------------------+---+---+---+---+---+---+---+---+----+ 2106 | T219(avg primary hops=7.7) | | | | | | | | | | 2107 | OPTIMAL | 77| 15| 5| 1| 1| | | | | 2108 | NP_LLFA | 72| 5| | | | | | | 22| 2109 | NP_LLFA_THEN_NP_RLFA | 73| 8| 2| | | | | | 16| 2110 | NP_LLFA_THEN_MRT_LOWPOINT | 74| 8| 3| 3| 2| 2| 2| 2| 4| 2111 | MRT_LOWPOINT_ONLY | 19| 14| 15| 12| 10| 8| 7| 6| 10| 2112 +------------------------------+---+---+---+---+---+---+---+---+----+ 2114 Figure 31 2116 In the preceding analysis, the following procedure for selecting an 2117 RLFA was used. Nodes were ordered with respect to distance from the 2118 source and checked for membership in Q and P-space. The first node 2119 to satisfy this condition was selected as the RLFA. More 2120 sophisticated methods to select node-protecting RLFAs is an area of 2121 active research. 2123 The analysis presented above uses the MRT Lowpoint Algorithm defined 2124 in this specification with a common GADAG root. The particular 2125 choice of a common GADAG root is expected to affect the quality of 2126 the MRT alternate paths, with a more central common GADAG root 2127 resulting in shorter MRT alternate path lengths. For the analysis 2128 above, the GADAG root was chosen for each topology by calculating 2129 node centrality as the sum of costs of all shortest paths to and from 2130 a given node. The node with the lowest sum was chosen as the common 2131 GADAG root. In actual deployments, the common GADAG root would be 2132 chosen based on the GADAG Root Selection Priority advertised by each 2133 router, the values of which would be determined off-line. 2135 In order to measure how sensitive the MRT alternate path lengths are 2136 to the choice of common GADAG root, we performed the same analysis 2137 using different choices of GADAG root. All of the nodes in the 2138 network were ordered with respect to the node centrality as computed 2139 above. Nodes were chosen at the 0th, 25th, and 50th percentile with 2140 respect to the centrality ordering, with 0th percentile being the 2141 most central node. The distribution of alternate path lengths for 2142 those three choices of GADAG root are shown in Figure 32 for a subset 2143 of the 19 topologies (chosen arbitrarily). The third row for each 2144 topology (labeled MRT_LOWPOINT ( 0 percentile) ) reproduces the 2145 results presented above for MRT_LOWPOINT_ONLY. The fourth and fifth 2146 rows show the alternate path length distibution for the 25th and 50th 2147 percentile choice for GADAG root. One can see some impact on the 2148 path length distribution with the less central choice of GADAG root 2149 resulting in longer path lenghths. 2151 We also looked at the impact of MRT algorithm variant on the 2152 alternate path lengths. The first two rows for each topology present 2153 results of the same alternate path length distribution analysis for 2154 the SPF and Hybrid methods for computing the GADAG. These two 2155 methods are described in Appendix A and Appendix B. For three of the 2156 topologies in this subset (T201, T206, and T211), the use of SPF or 2157 Hybrid methods does not appear to provide a significant advantage 2158 over the Lowpoint method with respect to path length. Instead, the 2159 choice of GADAG root appears to have more impact on the path length. 2160 However, for two of the topologies in this subset(T216 and T219) and 2161 for this particular choice of GAGAG root, the use of the SPF method 2162 results in noticeably shorter alternate path lengths than the use of 2163 the Lowpoint or Hybrid methods. It remains to be determined if this 2164 effect applies generally across more topologies or is sensitive to 2165 choice of GADAG root. 2167 +-------------------------------------------------------------------+ 2168 | Topology name | percentage of failure scenarios | 2169 | | protected by an alternate N hops | 2170 | MRT algorithm variant | longer than the primary path | 2171 | +------------------------------------+ 2172 | (GADAG root | | | | | | | | | no | 2173 | centrality percentile) | | | | | |10 |12 |14 | alt| 2174 | |0-1|2-3|4-5|6-7|8-9|-11|-13|-15| <16| 2175 +------------------------------+---+---+---+---+---+---+---+---+----+ 2176 | T201(avg primary hops=3.5) | | | | | | | | | | 2177 | MRT_HYBRID ( 0 percentile) | 33| 26| 23| 6| 3| | | | | 2178 | MRT_SPF ( 0 percentile) | 33| 36| 23| 6| 3| | | | | 2179 | MRT_LOWPOINT ( 0 percentile) | 33| 36| 23| 6| 3| | | | | 2180 | MRT_LOWPOINT (25 percentile) | 27| 29| 23| 11| 10| | | | | 2181 | MRT_LOWPOINT (50 percentile) | 27| 29| 23| 11| 10| | | | | 2182 +------------------------------+---+---+---+---+---+---+---+---+----+ 2183 | T206(avg primary hops=3.7) | | | | | | | | | | 2184 | MRT_HYBRID ( 0 percentile) | 50| 35| 13| 2| | | | | | 2185 | MRT_SPF ( 0 percentile) | 50| 35| 13| 2| | | | | | 2186 | MRT_LOWPOINT ( 0 percentile) | 55| 32| 13| | | | | | | 2187 | MRT_LOWPOINT (25 percentile) | 47| 25| 22| 6| | | | | | 2188 | MRT_LOWPOINT (50 percentile) | 38| 38| 14| 11| | | | | | 2189 +------------------------------+---+---+---+---+---+---+---+---+----+ 2190 | T211(avg primary hops=3.3) | | | | | | | | | | 2191 | MRT_HYBRID ( 0 percentile) | 86| 14| | | | | | | | 2192 | MRT_SPF ( 0 percentile) | 86| 14| | | | | | | | 2193 | MRT_LOWPOINT ( 0 percentile) | 85| 15| 1| | | | | | | 2194 | MRT_LOWPOINT (25 percentile) | 70| 25| 5| 1| | | | | | 2195 | MRT_LOWPOINT (50 percentile) | 80| 18| 2| | | | | | | 2196 +------------------------------+---+---+---+---+---+---+---+---+----+ 2197 | T216(avg primary hops=5.2) | | | | | | | | | | 2198 | MRT_HYBRID ( 0 percentile) | 23| 22| 18| 13| 10| 7| 4| 2| 2| 2199 | MRT_SPF ( 0 percentile) | 35| 32| 19| 9| 3| 1| | | | 2200 | MRT_LOWPOINT ( 0 percentile) | 28| 25| 18| 11| 7| 6| 3| 2| 1| 2201 | MRT_LOWPOINT (25 percentile) | 24| 20| 19| 16| 10| 6| 3| 1| | 2202 | MRT_LOWPOINT (50 percentile) | 19| 14| 13| 10| 8| 6| 5| 5| 10| 2203 +------------------------------+---+---+---+---+---+---+---+---+----+ 2204 | T219(avg primary hops=7.7) | | | | | | | | | | 2205 | MRT_HYBRID ( 0 percentile) | 20| 16| 13| 10| 7| 5| 5| 5| 3| 2206 | MRT_SPF ( 0 percentile) | 31| 23| 19| 12| 7| 4| 2| 1| | 2207 | MRT_LOWPOINT ( 0 percentile) | 19| 14| 15| 12| 10| 8| 7| 6| 10| 2208 | MRT_LOWPOINT (25 percentile) | 19| 14| 15| 13| 12| 10| 6| 5| 7| 2209 | MRT_LOWPOINT (50 percentile) | 19| 14| 14| 12| 11| 8| 6| 6| 10| 2210 +------------------------------+---+---+---+---+---+---+---+---+----+ 2212 Figure 32 2214 8. Algorithm Work to Be Done 2216 Broadcast Interfaces: The algorithm assumes that broadcast 2217 interfaces are already represented as pseudo-nodes in the network 2218 graph. Given maximal redundancy, one of the MRT will try to avoid 2219 both the pseudo-node and the next hop. The exact rules need to be 2220 fully specified. 2222 9. IANA Considerations 2224 This doument includes no request to IANA. 2226 10. Security Considerations 2228 This architecture is not currently believed to introduce new security 2229 concerns. 2231 11. References 2233 11.1. Normative References 2235 [I-D.ietf-rtgwg-mrt-frr-architecture] 2236 Atlas, A., Kebler, R., Envedi, G., Csaszar, A., Tantsura, 2237 J., Konstantynowicz, M., and R. White, "An Architecture 2238 for IP/LDP Fast-Reroute Using Maximally Redundant Trees", 2239 draft-ietf-rtgwg-mrt-frr-architecture-03 (work in 2240 progress), July 2013. 2242 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 2243 Requirement Levels", BCP 14, RFC 2119, March 1997. 2245 11.2. Informative References 2247 [EnyediThesis] 2248 Enyedi, G., "Novel Algorithms for IP Fast Reroute", 2249 Department of Telecommunications and Media Informatics, 2250 Budapest University of Technology and Economics Ph.D. 2251 Thesis, February 2011, . 2255 [I-D.atlas-mpls-ldp-mrt] 2256 Atlas, A., Tiruveedhula, K., Tantsura, J., and I. 2257 Wijnands, "LDP Extensions to Support Maximally Redundant 2258 Trees", draft-atlas-mpls-ldp-mrt-00 (work in progress), 2259 July 2013. 2261 [I-D.atlas-ospf-mrt] 2262 Atlas, A., Hegde, S., cbowers@juniper.net, c., and J. 2263 Tantsura, "OSPF Extensions to Support Maximally Redundant 2264 Trees", draft-atlas-ospf-mrt-01 (work in progress), 2265 October 2013. 2267 [I-D.ietf-rtgwg-ipfrr-notvia-addresses] 2268 Bryant, S., Previdi, S., and M. Shand, "A Framework for IP 2269 and MPLS Fast Reroute Using Not-via Addresses", draft- 2270 ietf-rtgwg-ipfrr-notvia-addresses-11 (work in progress), 2271 May 2013. 2273 [I-D.ietf-rtgwg-lfa-manageability] 2274 Litkowski, S., Decraene, B., Filsfils, C., Raza, K., 2275 Horneffer, M., and p. psarkar@juniper.net, "Operational 2276 management of Loop Free Alternates", draft-ietf-rtgwg-lfa- 2277 manageability-03 (work in progress), February 2014. 2279 [I-D.ietf-rtgwg-remote-lfa] 2280 Bryant, S., Filsfils, C., Previdi, S., Shand, M., and S. 2281 Ning, "Remote LFA FRR", draft-ietf-rtgwg-remote-lfa-04 2282 (work in progress), November 2013. 2284 [I-D.li-isis-mrt] 2285 Li, Z., Wu, N., Zhao, Q., Atlas, A., cbowers@juniper.net, 2286 c., and J. Tantsura, "Intermediate System to Intermediate 2287 System (IS-IS) Extensions for Maximally Redundant Trees 2288 (MRT)", draft-li-isis-mrt-00 (work in progress), October 2289 2013. 2291 [Kahn_1962_topo_sort] 2292 Kahn, A., "Topological sorting of large networks", 2293 Communications of the ACM, Volume 5, Issue 11 , Nov 1962, 2294 . 2296 [LFARevisited] 2297 Retvari, G., Tapolcai, J., Enyedi, G., and A. Csaszar, "IP 2298 Fast ReRoute: Loop Free Alternates Revisited", Proceedings 2299 of IEEE INFOCOM , 2011, . 2302 [LightweightNotVia] 2303 Enyedi, G., Retvari, G., Szilagyi, P., and A. Csaszar, "IP 2304 Fast ReRoute: Lightweight Not-Via without Additional 2305 Addresses", Proceedings of IEEE INFOCOM , 2009, 2306 . 2308 [MRTLinear] 2309 Enyedi, G., Retvari, G., and A. Csaszar, "On Finding 2310 Maximally Redundant Trees in Strictly Linear Time", IEEE 2311 Symposium on Computers and Comunications (ISCC) , 2009, 2312 . 2315 [RFC3137] Retana, A., Nguyen, L., White, R., Zinin, A., and D. 2316 McPherson, "OSPF Stub Router Advertisement", RFC 3137, 2317 June 2001. 2319 [RFC5286] Atlas, A. and A. Zinin, "Basic Specification for IP Fast 2320 Reroute: Loop-Free Alternates", RFC 5286, September 2008. 2322 [RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC 2323 5714, January 2010. 2325 [RFC6571] Filsfils, C., Francois, P., Shand, M., Decraene, B., 2326 Uttaro, J., Leymann, N., and M. Horneffer, "Loop-Free 2327 Alternate (LFA) Applicability in Service Provider (SP) 2328 Networks", RFC 6571, June 2012. 2330 Appendix A. Option 2: Computing GADAG using SPFs 2332 The basic idea in this option is to use slightly-modified SPF 2333 computations to find ears. In every block, an SPF computation is 2334 first done to find a cycle from the local root and then SPF 2335 computations in that block find ears until there are no more 2336 interfaces to be explored. The used result from the SPF computation 2337 is the path of interfaces indicated by following the previous hops 2338 from the mininized IN_GADAG node back to the SPF root. 2340 To do this, first all cut-vertices must be identified and local-roots 2341 assigned as specified in Figure 12. 2343 The slight modifications to the SPF are as follows. The root of the 2344 block is referred to as the block-root; it is either the GADAG root 2345 or a cut-vertex. 2347 a. The SPF is rooted at a neighbor x of an IN_GADAG node y. All 2348 links between y and x are marked as TEMP_UNUSABLE. They should 2349 not be used during the SPF computation. 2351 b. If y is not the block-root, then it is marked TEMP_UNUSABLE. It 2352 should not be used during the SPF computation. This prevents 2353 ears from starting and ending at the same node and avoids cycles; 2354 the exception is because cycles to/from the block-root are 2355 acceptable and expected. 2357 c. Do not explore links to nodes whose local-root is not the block- 2358 root. This keeps the SPF confined to the particular block. 2360 d. Terminate when the first IN_GADAG node z is minimized. 2362 e. Respect the existing directions (e.g. INCOMING, OUTGOING, 2363 UNDIRECTED) already specified for each interface. 2365 Mod_SPF(spf_root, block_root) 2366 Initialize spf_heap to empty 2367 Initialize nodes' spf_metric to infinity 2368 spf_root.spf_metric = 0 2369 insert(spf_heap, spf_root) 2370 found_in_gadag = false 2371 while (spf_heap is not empty) and (found_in_gadag is false) 2372 min_node = remove_lowest(spf_heap) 2373 if min_node.IN_GADAG is true 2374 found_in_gadag = true 2375 else 2376 foreach interface intf of min_node 2377 if ((intf.OUTGOING or intf.UNDIRECTED) and 2378 ((intf.remote_node.localroot is block_root) or 2379 (intf.remote_node is block_root)) and 2380 (intf.remote_node is not TEMP_UNUSABLE) and 2381 (intf is not TEMP_UNUSABLE)) 2382 path_metric = min_node.spf_metric + intf.metric 2383 if path_metric < intf.remote_node.spf_metric 2384 intf.remote_node.spf_metric = path_metric 2385 intf.remote_node.spf_prev_intf = intf 2386 insert_or_update(spf_heap, intf.remote_node) 2387 return min_node 2389 SPF_for_Ear(cand_intf.local_node,cand_intf.remote_node, block_root, 2390 method) 2391 Mark all interfaces between cand_intf.remote_node 2392 and cand_intf.local_node as TEMP_UNUSABLE 2393 if cand_intf.local_node is not block_root 2394 Mark cand_intf.local_node as TEMP_UNUSABLE 2395 Initialize ear_list to empty 2396 end_ear = Mod_SPF(spf_root, block_root) 2397 y = end_ear.spf_prev_hop 2398 while y.local_node is not spf_root 2399 add_to_list_start(ear_list, y) 2400 y.local_node.IN_GADAG = true 2401 y = y.local_node.spf_prev_intf 2403 if(method is not hybrid) 2404 Set_Ear_Direction(ear_list, cand_intf.local_node, 2405 end_ear,block_root) 2406 Clear TEMP_UNUSABLE from all interfaces between 2407 cand_intf.remote_node and cand_intf.local_node 2408 Clear TEMP_UNUSABLE from cand_intf.local_node 2409 return end_ear 2411 Figure 33: Modified SPF for GADAG computation 2413 Assume that an ear is found by going from y to x and then running an 2414 SPF that terminates by minimizing z (e.g. y<->x...q<->z). Now it is 2415 necessary to determine the direction of the ear; if y << z, then the 2416 path should be y->x...q->z but if y >> z, then the path should be 2417 y<-x...q<-z. In Section 5.4, the same problem was handled by finding 2418 all ears that started at a node before looking at ears starting at 2419 nodes higher in the partial order. In this algorithm, using that 2420 approach could mean that new ears aren't added in order of their 2421 total cost since all ears connected to a node would need to be found 2422 before additional nodes could be found. 2424 The alternative is to track the order relationship of each node with 2425 respect to every other node. This can be accomplished by maintaining 2426 two sets of nodes at each node. The first set, Higher_Nodes, 2427 contains all nodes that are known to be ordered above the node. The 2428 second set, Lower_Nodes, contains all nodes that are known to be 2429 ordered below the node. This is the approach used in this algorithm. 2431 Set_Ear_Direction(ear_list, end_a, end_b, block_root) 2432 // Default of A_TO_B for the following cases: 2433 // (a) end_a and end_b are the same (root) 2434 // or (b) end_a is in end_b's Lower Nodes 2435 // or (c) end_a and end_b were unordered with respect to each 2436 // other 2437 direction = A_TO_B 2438 if (end_b is block_root) and (end_a is not end_b) 2439 direction = B_TO_A 2440 else if end_a is in end_b.Higher_Nodes 2441 direction = B_TO_A 2442 if direction is B_TO_A 2443 foreach interface i in ear_list 2444 i.UNDIRECTED = false 2445 i.INCOMING = true 2446 i.remote_intf.UNDIRECTED = false 2447 i.remote_intf.OUTGOING = true 2448 else 2449 foreach interface i in ear_list 2450 i.UNDIRECTED = false 2451 i.OUTGOING = true 2452 i.remote_intf.UNDIRECTED = false 2453 i.remote_intf.INCOMING = true 2454 if end_a is end_b 2455 return 2456 // Next, update all nodes' Lower_Nodes and Higher_Nodes 2457 if (end_a is in end_b.Higher_Nodes) 2458 foreach node x where x.localroot is block_root 2459 if end_a is in x.Lower_Nodes 2460 foreach interface i in ear_list 2461 add i.remote_node to x.Lower_Nodes 2462 if end_b is in x.Higher_Nodes 2463 foreach interface i in ear_list 2464 add i.local_node to x.Higher_Nodes 2465 else 2466 foreach node x where x.localroot is block_root 2467 if end_b is in x.Lower_Nodes 2468 foreach interface i in ear_list 2469 add i.local_node to x.Lower_Nodes 2470 if end_a is in x.Higher_Nodes 2471 foreach interface i in ear_list 2472 add i.remote_node to x.Higher_Nodes 2474 Figure 34: Algorithm to assign links of an ear direction 2476 A goal of the algorithm is to find the shortest cycles and ears. An 2477 ear is started by going to a neighbor x of an IN_GADAG node y. The 2478 path from x to an IN_GADAG node is minimal, since it is computed via 2479 SPF. Since a shortest path is made of shortest paths, to find the 2480 shortest ears requires reaching from the set of IN_GADAG nodes to the 2481 closest node that isn't IN_GADAG. Therefore, an ordered tree is 2482 maintained of interfaces that could be explored from the IN_GADAG 2483 nodes. The interfaces are ordered by their characteristics of 2484 metric, local loopback address, remote loopback address, and ifindex, 2485 as in the algorithm previously described in Figure 14. 2487 The algorithm ignores interfaces picked from the ordered tree that 2488 belong to the block root if the block in which the interface is 2489 present already has an ear that has been computed. This is necessary 2490 since we allow at most one incoming interface to a block root in each 2491 block. This requirement stems from the way next-hops are computed as 2492 will be seen in Section 5.6. After any ear gets computed, we 2493 traverse the newly added nodes to the GADAG and insert interfaces 2494 whose far end is not yet on the GADAG to the ordered tree for later 2495 processing. 2497 Finally, cut-links are a special case because there is no point in 2498 doing an SPF on a block of 2 nodes. The algorithm identifies cut- 2499 links simply as links where both ends of the link are cut-vertices. 2500 Cut-links can simply be added to the GADAG with both OUTGOING and 2501 INCOMING specified on their interfaces. 2503 add_eligible_interfaces_of_node(ordered_intfs_tree,node) 2504 for each interface of node 2505 if intf.remote_node.IN_GADAG is false 2506 insert(intf,ordered_intfs_tree) 2508 check_if_block_has_ear(x,block_id) 2509 block_has_ear = false 2510 for all interfaces of x 2511 if (intf.remote_node.block_id == block_id) && 2512 (intf.remote_node.IN_GADAG is true) 2513 block_has_ear = true 2514 return block_has_ear 2516 Construct_GADAG_via_SPF(topology, root) 2517 Compute_Localroot (root,root) 2518 Assign_Block_ID(root,0) 2519 root.IN_GADAG = true 2520 add_eligible_interfaces_of_node(ordered_intfs_tree,root) 2521 while ordered_intfs_tree is not empty 2522 cand_intf = remove_lowest(ordered_intfs_tree) 2523 if cand_intf.remote_node.IN_GADAG is false 2524 if L(cand_intf.remote_node) == D(cand_intf.remote_node) 2525 // Special case for cut-links 2526 cand_intf.UNDIRECTED = false 2527 cand_intf.remote_intf.UNDIRECTED = false 2528 cand_intf.OUTGOING = true 2529 cand_intf.INCOMING = true 2530 cand_intf.remote_intf.OUTGOING = true 2531 cand_intf.remote_intf.INCOMING = true 2532 cand_intf.remote_node.IN_GADAG = true 2533 add_eligible_interfaces_of_node( 2534 ordered_intfs_tree,cand_intf.remote_node) 2535 else 2536 if (cand_intf.remote_node.local_root == 2537 cand_intf.local_node) && 2538 check_if_block_has_ear 2539 (cand_intf.local_node, 2540 cand_intf.remote_node.block_id)) 2541 /* Skip the interface since the block root 2542 already has an incoming interface in the 2543 block */ 2544 else 2545 ear_end = SPF_for_Ear(cand_intf.local_node, 2546 cand_intf.remote_node, 2547 cand_intf.remote_node.localroot, 2548 SPF method) 2549 y = ear_end.spf_prev_hop 2550 while y.local_node is not cand_intf.local_node 2551 add_eligible_interfaces_of_node( 2552 ordered_intfs_tree, 2553 y.local_node) 2554 y = y.local_node.spf_prev_intf 2556 Figure 35: SPF-based GADAG algorithm 2558 Appendix B. Option 3: Computing GADAG using a hybrid method 2560 In this option, the idea is to combine the salient features of the 2561 above two options. To this end, we process nodes as they get added 2562 to the GADAG just like in the lowpoint inheritance by maintaining a 2563 stack of nodes. This ensures that we do not need to maintain lower 2564 and higher sets at each node to ascertain ear directions since the 2565 ears will always be directed from the node being processed towards 2566 the end of the ear. To compute the ear however, we resort to an SPF 2567 to have the possibility of better ears (path lentghs) thus giving 2568 more flexibility than the restricted use of lowpoint/dfs parents. 2570 Regarding ears involving a block root, unlike the SPF method which 2571 ignored interfaces of the block root after the first ear, in the 2572 hybrid method we would have to process all interfaces of the block 2573 root before moving on to other nodes in the block since the direction 2574 of an ear is pre-determined. Thus, whenever the block already has an 2575 ear computed, and we are processing an interface of the block root, 2576 we mark the block root as unusable before the SPF run that computes 2577 the ear. This ensures that the SPF terminates at some node other 2578 than the block-root. This in turn guarantees that the block-root has 2579 only one incoming interface in each block, which is necessary for 2580 correctly computing the next-hops on the GADAG. 2582 As in the SPF gadag, bridge ears are handled as a special case. 2584 The entire algorithm is shown below in Figure 36 2586 find_spf_stack_ear(stack, x, y, xy_intf, block_root) 2587 if L(y) == D(y) 2588 // Special case for cut-links 2589 xy_intf.UNDIRECTED = false 2590 xy_intf.remote_intf.UNDIRECTED = false 2591 xy_intf.OUTGOING = true 2592 xy_intf.INCOMING = true 2593 xy_intf.remote_intf.OUTGOING = true 2594 xy_intf.remote_intf.INCOMING = true 2595 xy_intf.remote_node.IN_GADAG = true 2596 push y onto stack 2597 return 2598 else 2599 if (y.local_root == x) && 2600 check_if_block_has_ear(x,y.block_id) 2601 //Avoid the block root during the SPF 2602 Mark x as TEMP_UNUSABLE 2603 end_ear = SPF_for_Ear(x,y,block_root,hybrid) 2604 If x was set as TEMP_UNUSABLE, clear it 2605 cur = end_ear 2606 while (cur != y) 2607 intf = cur.spf_prev_hop 2608 prev = intf.local_node 2609 intf.UNDIRECTED = false 2610 intf.remote_intf.UNDIRECTED = false 2611 intf.OUTGOING = true 2612 intf.remote_intf.INCOMING = true 2613 push prev onto stack 2614 cur = prev 2615 xy_intf.UNDIRECTED = false 2616 xy_intf.remote_intf.UNDIRECTED = false 2617 xy_intf.OUTGOING = true 2618 xy_intf.remote_intf.INCOMING = true 2619 return 2621 Construct_GADAG_via_hybrid(topology,root) 2622 Compute_Localroot (root,root) 2623 Assign_Block_ID(root,0) 2624 root.IN_GADAG = true 2625 Initialize Stack to empty 2626 push root onto Stack 2627 while (Stack is not empty) 2628 x = pop(Stack) 2629 for each interface intf of x 2630 y = intf.remote_node 2631 if y.IN_GADAG is false 2632 find_spf_stack_ear(stack, x, y, intf, y.block_root) 2634 Figure 36: Hybrid GADAG algorithm 2636 Authors' Addresses 2638 Gabor Sandor Enyedi (editor) 2639 Ericsson 2640 Konyves Kalman krt 11 2641 Budapest 1097 2642 Hungary 2644 Email: Gabor.Sandor.Enyedi@ericsson.com 2646 Andras Csaszar 2647 Ericsson 2648 Konyves Kalman krt 11 2649 Budapest 1097 2650 Hungary 2652 Email: Andras.Csaszar@ericsson.com 2654 Alia Atlas (editor) 2655 Juniper Networks 2656 10 Technology Park Drive 2657 Westford, MA 01886 2658 USA 2660 Email: akatlas@juniper.net 2661 Chris Bowers 2662 Juniper Networks 2663 1194 N. Mathilda Ave. 2664 Sunnyvale, CA 94089 2665 USA 2667 Email: cbowers@juniper.net 2669 Abishek Gopalan 2670 University of Arizona 2671 1230 E Speedway Blvd. 2672 Tucson, AZ 85721 2673 USA 2675 Email: abishek@ece.arizona.edu