idnits 2.17.1 draft-ietf-rtgwg-mrt-frr-architecture-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([RFC5286]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document date (July 12, 2013) is 3935 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'R' is mentioned on line 377, but not defined == Missing Reference: 'F' is mentioned on line 1027, but not defined == Missing Reference: 'C' is mentioned on line 377, but not defined == Missing Reference: 'I' is mentioned on line 375, but not defined == Missing Reference: 'G' is mentioned on line 1027, but not defined == Missing Reference: 'J' is mentioned on line 379, but not defined == Missing Reference: 'A' is mentioned on line 752, but not defined == Missing Reference: 'ABR1' is mentioned on line 762, but not defined == Missing Reference: 'H' is mentioned on line 1027, but not defined == Missing Reference: 'E' is mentioned on line 1027, but not defined == Unused Reference: 'RFC2328' is defined on line 1239, but no explicit reference was found in the text == Outdated reference: A later version (-04) exists of draft-enyedi-rtgwg-mrt-frr-algorithm-03 ** Downref: Normative reference to an Informational draft: draft-enyedi-rtgwg-mrt-frr-algorithm (ref. 'I-D.enyedi-rtgwg-mrt-frr-algorithm') ** Downref: Normative reference to an Informational RFC: RFC 5714 == Outdated reference: A later version (-03) exists of draft-atlas-mpls-ldp-mrt-00 == Outdated reference: A later version (-03) exists of draft-atlas-ospf-mrt-00 == Outdated reference: A later version (-12) exists of draft-ietf-mpls-ldp-multi-topology-08 == Outdated reference: A later version (-11) exists of draft-ietf-rtgwg-remote-lfa-02 -- Obsolete informational reference (is this intentional?): RFC 3137 (Obsoleted by RFC 6987) Summary: 3 errors (**), 0 flaws (~~), 18 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Routing Area Working Group A. Atlas, Ed. 3 Internet-Draft R. Kebler 4 Intended status: Standards Track Juniper Networks 5 Expires: January 13, 2014 G. Enyedi 6 A. Csaszar 7 J. Tantsura 8 Ericsson 9 M. Konstantynowicz 10 Cisco Systems 11 R. White 12 VCE 13 July 12, 2013 15 An Architecture for IP/LDP Fast-Reroute Using Maximally Redundant Trees 16 draft-ietf-rtgwg-mrt-frr-architecture-03 18 Abstract 20 With increasing deployment of Loop-Free Alternates (LFA) [RFC5286], 21 it is clear that a complete solution for IP and LDP Fast-Reroute is 22 required. This specification provides that solution. IP/LDP Fast- 23 Reroute with Maximally Redundant Trees (MRT-FRR) is a technology that 24 gives link-protection and node-protection with 100% coverage in any 25 network topology that is still connected after the failure. 27 MRT removes all need to engineer for coverage. MRT is also extremely 28 computationally efficient. For any router in the network, the MRT 29 computation is less than the LFA computation for a node with three or 30 more neighbors. 32 Status of This Memo 34 This Internet-Draft is submitted in full conformance with the 35 provisions of BCP 78 and BCP 79. 37 Internet-Drafts are working documents of the Internet Engineering 38 Task Force (IETF). Note that other groups may also distribute 39 working documents as Internet-Drafts. The list of current Internet- 40 Drafts is at http://datatracker.ietf.org/drafts/current/. 42 Internet-Drafts are draft documents valid for a maximum of six months 43 and may be updated, replaced, or obsoleted by other documents at any 44 time. It is inappropriate to use Internet-Drafts as reference 45 material or to cite them other than as "work in progress." 47 This Internet-Draft will expire on January 13, 2014. 49 Copyright Notice 51 Copyright (c) 2013 IETF Trust and the persons identified as the 52 document authors. All rights reserved. 
54 This document is subject to BCP 78 and the IETF Trust's Legal 55 Provisions Relating to IETF Documents 56 (http://trustee.ietf.org/license-info) in effect on the date of 57 publication of this document. Please review these documents 58 carefully, as they describe your rights and restrictions with respect 59 to this document. Code Components extracted from this document must 60 include Simplified BSD License text as described in Section 4.e of 61 the Trust Legal Provisions and are provided without warranty as 62 described in the Simplified BSD License. 64 Table of Contents 66 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 67 1.1. Importance of 100% Coverage . . . . . . . . . . . . . . . 4 68 1.2. Partial Deployment and Backwards Compatibility . . . . . 5 69 2. Requirements Language . . . . . . . . . . . . . . . . . . . . 6 70 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 71 4. Maximally Redundant Trees (MRT) . . . . . . . . . . . . . . . 7 72 5. Maximally Redundant Trees (MRT) and Fast-Reroute . . . . . . 9 73 6. Unicast Forwarding with MRT Fast-Reroute . . . . . . . . . . 10 74 6.1. LDP Unicast Forwarding - Avoid Tunneling . . . . . . . . 10 75 6.2. IP Unicast Traffic . . . . . . . . . . . . . . . . . . . 11 76 7. Protocol Extensions and Considerations: OSPF and ISIS . . . . 12 77 8. Protocol Extensions and considerations: LDP . . . . . . . . . 14 78 9. Inter-Area and ABR Forwarding Behavior . . . . . . . . . . . 15 79 10. Prefixes Multiply Attached to the MRT Island . . . . . . . . 18 80 10.1. Endpoint Selection . . . . . . . . . . . . . . . . . . . 19 81 10.2. Named Proxy-Nodes . . . . . . . . . . . . . . . . . . . 21 82 10.2.1. Computing if an Island Neighbor (IN) is loop-free . 22 83 10.3. MRT Alternates for Destinations Outside the MRT Island . 23 84 11. Network Convergence and Preparing for the Next Failure . . . 24 85 11.1. Micro-forwarding loop prevention and MRTs . . . . . . . 24 86 11.2. MRT Recalculation . . . . 
. . . . . . . . . . . . . . . 24
87  12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 25
88  13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25
89  14. Security Considerations . . . . . . . . . . . . . . . . . . . 25
90  15. References . . . . . . . . . . . . . . . . . . . . . . . . . 25
91    15.1. Normative References . . . . . . . . . . . . . . . . . . 25
92    15.2. Informative References . . . . . . . . . . . . . . . . . 26
93  Appendix A. General Issues with Area Abstraction . . . . . . . . 27
94  Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 28

96  1. Introduction

98     This document gives a complete solution for IP/LDP fast-reroute
99     [RFC5714]. MRT-FRR creates two alternate trees separate from the
100     primary next-hop forwarding used during stable operation. These two
101     trees are maximally diverse from each other, providing link and node
102     protection for 100% of paths and failures as long as the failure does
103     not cut the network into multiple pieces. This document defines the
104     architecture for IP/LDP fast-reroute with MRT. The associated
105     protocol extensions are defined in [I-D.atlas-ospf-mrt] and
106     [I-D.atlas-mpls-ldp-mrt]. The exact MRT algorithm is defined in
107     [I-D.enyedi-rtgwg-mrt-frr-algorithm].

109     IP/LDP Fast-Reroute with MRT (MRT-FRR) uses two maximally diverse
110     forwarding topologies to provide alternates. A primary next-hop
111     should be on only one of the diverse forwarding topologies; thus, the
112     other can be used to provide an alternate. Once traffic has been
113     moved to one of the MRTs, it is not subject to further repair
114     actions. Thus, the traffic will not loop even if a worse failure
115     (e.g. node) occurs when protection was only available for a simpler
116     failure (e.g. link).
118     In addition to supporting IP and LDP unicast fast-reroute, the
119     diverse forwarding topologies and guarantee of 100% coverage permit
120     fast-reroute technology to be applied to multicast traffic as
121     described in [I-D.atlas-rtgwg-mrt-mc-arch].

123     Other existing or proposed solutions are partial solutions or have
124     significant issues, as described below.

126                 Summary Comparison of IP/LDP FRR Methods

128     +-----------+---------------+---------------+-----------------------+
129     | Method    | Coverage      | Alternate     | Computation (in SPFs) |
130     |           |               | Looping?      |                       |
131     +-----------+---------------+---------------+-----------------------+
132     | MRT-FRR   | 100%          | None          | less than 3           |
133     |           | Link/Node     |               |                       |
134     |           |               |               |                       |
135     | LFA       | Partial       | Possible      | per neighbor          |
136     |           | Link/Node     |               |                       |
137     |           |               |               |                       |
138     | Remote    | Partial       | Possible      | per neighbor (link)   |
139     | LFA       | Link/Node     |               | or neighbor's         |
140     |           |               |               | neighbor (node)       |
141     |           |               |               |                       |
142     | Not-Via   | 100%          | None          | per link and node     |
143     |           | Link/Node     |               |                       |
144     +-----------+---------------+---------------+-----------------------+

146                                  Table 1

148     Loop-Free Alternates (LFA): LFAs [RFC5286] provide limited
149        topology-dependent coverage for link and node protection.
150        Restrictions on choice of alternates can be relaxed to improve
151        coverage, but this can cause forwarding loops if a worse failure
152        is experienced than protected against. Augmenting a network to
153        provide better coverage is NP-hard [LFARevisited]. [RFC6571]
154        discusses the applicability of LFA to different topologies with a
155        focus on common PoP architectures.

157     Remote LFA: Remote LFAs [I-D.ietf-rtgwg-remote-lfa] improve
158        coverage over LFAs for link protection but still cannot guarantee
159        complete coverage. The trade-off of looping traffic to improve
160        coverage is still made.
Remote LFAs can provide node-protection
161        [I-D.litkowski-rtgwg-node-protect-remote-lfa] but cannot guarantee
162        coverage, and the computation required is quite high (an SPF per
163        neighbor's neighbor). [I-D.bryant-ipfrr-tunnels] describes
164        additional mechanisms to further improve coverage, at the cost of
165        added complexity.

167     Not-Via: Not-Via [I-D.ietf-rtgwg-ipfrr-notvia-addresses] is the
168        only other solution that provides 100% coverage for link and node
169        failures and does not have potential looping. However, the
170        computation is very high (an SPF per failure point) and academic
171        implementations [LightweightNotVia] have found the address
172        management complexity to be high.

174  1.1. Importance of 100% Coverage

175     Fast-reroute is based upon the single failure assumption - that the
176     time between single failures is long enough for a network to
177     reconverge and start forwarding on the new shortest paths. That does
178     not imply that the network will only experience one failure or
179     change.

181     It is straightforward to analyze a particular network topology for
182     coverage. However, a real network does not always have the same
183     topology. For instance, maintenance events will take links or nodes
184     out of use. Simply costing out a link can have a significant effect
185     on what LFAs are available. Similarly, after a single failure has
186     happened, the topology has changed and so has its associated coverage.
187     Finally, many networks have new routers or links added and removed;
188     each of those changes can have an effect on the coverage for
189     topology-sensitive methods such as LFA and Remote LFA. If fast-
190     reroute is important for the network services provided, then a method
191     that guarantees 100% coverage is important to accommodate natural
192     network topology changes.

194     Asymmetric link costs are also a common aspect of networks. There
195     are at least three common causes for them.
First, any broadcast
196     interface is represented by a pseudo-node and has asymmetric link
197     costs to and from that pseudo-node. Second, when routers come up or
198     a link with LDP comes up, it is recommended in [RFC5443] and
199     [RFC3137] that the link metric be raised to the maximum cost; this
200     may not be symmetric and for [RFC3137] is not expected to be. Third,
201     techniques such as IGP metric tuning for traffic-engineering can
202     result in asymmetric link costs. A fast-reroute solution needs to
203     handle network topologies with asymmetric link costs.

205     When a network needs to use a micro-loop prevention mechanism
206     [RFC5715] such as Ordered FIB [I-D.ietf-rtgwg-ordered-fib] or Farside
207     Tunneling [RFC5715], then the whole IGP area needs to have alternates
208     available so that the micro-loop prevention mechanism, which requires
209     slower network convergence, can take the necessary time without
210     impacting traffic badly. Without complete coverage, traffic to the
211     unprotected destinations will be dropped for significantly longer
212     than with current convergence - where routers individually converge
213     as fast as possible.

215  1.2. Partial Deployment and Backwards Compatibility

217     MRT-FRR supports partial deployment. As with many new features, the
218     protocols (OSPF, LDP, ISIS) indicate their capability to support MRT.
219     Inside the MRT-capable connected group of routers (referred to as an
220     MRT Island), the MRTs are computed. Alternates to destinations
221     outside the MRT Island are computed and depend upon the existence of
222     a loop-free neighbor of the MRT Island for that destination.

224  2. Requirements Language

226     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
227     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
228     document are to be interpreted as described in [RFC2119].

230  3. Terminology

232     network graph: A graph that reflects the network topology where all
233        links connect exactly two nodes and broadcast links have been
234        transformed into the standard pseudo-node representation.

236     Redundant Trees (RT): A pair of trees where the path from any node
237        X to the root R along the first tree is node-disjoint with the
238        path from the same node X to the root along the second tree.
239        These can be computed in 2-connected graphs.

241     Maximally Redundant Trees (MRT): A pair of trees where the path
242        from any node X to the root R along the first tree and the path
243        from the same node X to the root along the second tree share the
244        minimum number of nodes and the minimum number of links. Each
245        such shared node is a cut-vertex. Any shared links are cut-links.
246        Any RT is an MRT but many MRTs are not RTs.

248     MRT-Red: MRT-Red is used to describe one of the two MRTs; it is
249        used to describe the associated forwarding topology and MT-ID.
250        Specifically, MRT-Red is the decreasing MRT where links in the
251        GADAG are taken in the direction from a higher topologically
252        ordered node to a lower one.

254     MRT-Blue: MRT-Blue is used to describe one of the two MRTs; it is
255        used to describe the associated forwarding topology and MT-ID.
256        Specifically, MRT-Blue is the increasing MRT where links in the
257        GADAG are taken in the direction from a lower topologically
258        ordered node to a higher one.

260     Rainbow MRT: It is useful to have an MT-ID that refers to the
261        multiple MRT topologies and to the default topology. This is
262        referred to as the Rainbow MRT MT-ID and is used by LDP to reduce
263        signaling and permit the same label to always be advertised to all
264        peers for the same (MT-ID, Prefix).

266     MRT Island: From the computing router, the set of routers that
267        support a particular MRT profile and are connected.
269 Island Border Router (IBR): A router in the MRT Island that is 270 connected to a router not in the MRT Island and both routers are 271 in a common area or level. 273 Island Neighbor (IN): A router that is not in the MRT Island but is 274 adjacent to an IBR and in the same area/level as the IBR. 276 cut-link: A link whose removal partitions the network. A cut-link 277 by definition must be connected between two cut-vertices. If 278 there are multiple parallel links, then they are referred to as 279 cut-links in this document if removing the set of parallel links 280 would partition the network graph. 282 cut-vertex: A vertex whose removal partitions the network graph. 284 2-connected: A graph that has no cut-vertices. This is a graph 285 that requires two nodes to be removed before the network is 286 partitioned. 288 2-connected cluster: A maximal set of nodes that are 2-connected. 290 2-edge-connected: A network graph where at least two links must be 291 removed to partition the network. 293 block: Either a 2-connected cluster, a cut-edge, or an isolated 294 vertex. 296 DAG: Directed Acyclic Graph - a graph where all links are directed 297 and there are no cycles in it. 299 ADAG: Almost Directed Acyclic Graph - a graph that, if all links 300 incoming to the root were removed, would be a DAG. 302 GADAG: Generalized ADAG - a graph that is the combination of the 303 ADAGs of all blocks. 305 named proxy-node: A proxy-node can represent a destination prefix 306 that can be attached to the MRT Island via at least two routers. 307 It is named if there is a way that traffic can be encapsulated to 308 reach specifically that proxy node; this could be because there is 309 an LDP FEC for the associated prefix or because MRT-Red and MRT- 310 Blue IP addresses are advertised in an undefined fashion for that 311 proxy-node. 313 4. 
Maximally Redundant Trees (MRT) 315 A pair of Maximally Redundant Trees are directed spanning trees that 316 provide maximally disjoint paths towards their common root. Only 317 links or nodes whose failure would partition the network (i.e. cut- 318 links and cut-vertices) are shared between the trees. The algorithm 319 to compute MRTs is given in [I-D.enyedi-rtgwg-mrt-frr-algorithm]. 320 This algorithm can be computed in O(e + n log n); it is less than 321 three SPFs. Modeling results comparing MRT alternates to the optimal 322 are described in [I-D.enyedi-rtgwg-mrt-frr-algorithm]. This document 323 describes how the MRTs can be used and not how to compute them. 325 MRT provides destination-based trees for each destination. Each 326 router stores its normal primary next-hop(s) as well as MRT-Blue 327 next-hop(s) and MRT-Red next-hop(s) toward each destination. The 328 alternate will be selected between the MRT-Blue and MRT-Red. 330 The most important thing to understand about MRTs is that for each 331 pair of destination-routed MRTs, there is a path from every node X to 332 the destination D on the Blue MRT that is as disjoint as possible 333 from the path on the Red MRT. 335 For example, in Figure 1, there is a network graph that is 336 2-connected in (a) and associated MRTs in (b) and (c). One can 337 consider the paths from B to R; on the Blue MRT, the paths are 338 B->F->D->E->R or B->C->D->E->R. On the Red MRT, the path is B->A->R. 339 These are clearly link and node-disjoint. These MRTs are redundant 340 trees because the paths are disjoint. 342 [E]---[D]---| [E]<--[D]<--| [E]-->[D]---| 343 | | | | ^ | | | 344 | | | V | | V V 345 [R] [F] [C] [R] [F] [C] [R] [F] [C] 346 | | | ^ ^ ^ | | 347 | | | | | | V | 348 [A]---[B]---| [A]-->[B]---| [A]<--[B]<--| 350 (a) (b) (c) 351 a 2-connected graph Blue MRT towards R Red MRT towards R 353 Figure 1: A 2-connected Network 355 By contrast, in Figure 2, the network in (a) is not 2-connected. 
If 356 F, G or the link F<->G failed, then the network would be partitioned. 357 It is clearly impossible to have two link-disjoint or node-disjoint 358 paths from G, I or J to R. The MRTs given in (b) and (c) offer paths 359 that are as disjoint as possible. For instance, the paths from B to 360 R are the same as in Figure 1 and the path from G to R on the Blue 361 MRT is G->F->D->E->R and on the Red MRT is G->F->B->A->R. 363 [E]---[D]---| 364 | | | |----[I] 365 | | | | | 366 [R]---[C] [F]---[G] | 367 | | | | | 368 | | | |----[J] 369 [A]---[B]---| 371 (a) 372 a non-2-connected graph 374 [E]<--[D]<--| [E]-->[D] 375 | ^ | [I] | |----[I] 376 V | | | V V ^ 377 [R] [C] [F]<--[G] | [R]<--[C] [F]<--[G] | 378 ^ ^ ^ V ^ | | 379 | | |----[J] | | [J] 380 [A]-->[B]---| [A]<--[B]<--| 382 (b) (c) 383 Blue MRT towards R Red MRT towards R 385 Figure 2: A non-2-connected network 387 5. Maximally Redundant Trees (MRT) and Fast-Reroute 389 In normal IGP routing, each router has its shortest-path-tree to all 390 destinations. From the perspective of a particular destination, D, 391 this looks like a reverse SPT (rSPT). To use maximally redundant 392 trees, in addition, each destination D has two MRTs associated with 393 it; by convention these will be called the MRT-Blue and MRT-Red. 394 MRT-FRR is realized by using multi-topology forwarding. There is a 395 MRT-Blue forwarding topology and a MRT-Red forwarding topology. 397 Any IP/LDP fast-reroute technique beyond LFA requires an additional 398 dataplane procedure, such as an additional forwarding mechanism. The 399 well-known options are multi-topology forwarding (used by MRT-FRR), 400 tunneling (e.g. [I-D.ietf-rtgwg-ipfrr-notvia-addresses] or 401 [I-D.ietf-rtgwg-remote-lfa]), and per-interface forwarding (e.g. 402 Loop-Free Failure Insensitive Routing in [EnyediThesis]). 
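As a small illustration of the "as disjoint as possible" property discussed for Figures 1 and 2 in Section 4, the following Python sketch compares the Blue and Red paths quoted in that text. It is only a check of the stated example paths, not the MRT algorithm; the helper name shared_elements is hypothetical and the path lists are copied from the prose.

```python
# Hypothetical helper for illustration only. The paths below are the ones
# quoted in the Section 4 text; nothing here computes MRTs.

def shared_elements(blue_path, red_path):
    """Return (shared transit nodes, shared links) of two paths.

    Both paths start at the same source and end at the same root, so the
    endpoints are excluded from the node comparison. Links are compared
    as unordered node pairs.
    """
    blue_transit = set(blue_path[1:-1])
    red_transit = set(red_path[1:-1])
    blue_links = {frozenset(hop) for hop in zip(blue_path, blue_path[1:])}
    red_links = {frozenset(hop) for hop in zip(red_path, red_path[1:])}
    return blue_transit & red_transit, blue_links & red_links


# Figure 1 (2-connected): paths from B to R.
fig1_blue = ["B", "F", "D", "E", "R"]
fig1_red = ["B", "A", "R"]

# Figure 2 (not 2-connected): paths from G to R.
fig2_blue = ["G", "F", "D", "E", "R"]
fig2_red = ["G", "F", "B", "A", "R"]

fig1_nodes, fig1_links = shared_elements(fig1_blue, fig1_red)
fig2_nodes, fig2_links = shared_elements(fig2_blue, fig2_red)
```

For Figure 1 both result sets are empty, which matches the statement that the paths are link and node-disjoint. For Figure 2 the only shared transit node is F and the only shared link is F<->G, which are exactly the cut-vertex and cut-link named in the text.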
404 When there is a link or node failure affecting, but not partitioning, 405 the network, each node will still have at least one path via one of 406 the MRTs to reach the destination D. For example, in Figure 2, C 407 would normally forward traffic to R across the C<->R link. If that 408 C<->R link fails, then C could use the Blue MRT path C->D->E->R. 410 As is always the case with fast-reroute technologies, forwarding does 411 not change until a local failure is detected. Packets are forwarded 412 along the shortest path. The appropriate alternate to use is pre- 413 computed. [I-D.enyedi-rtgwg-mrt-frr-algorithm] describes exactly how 414 to determine whether the MRT-Blue next-hops or the MRT-Red next-hops 415 should be the MRT alternate next-hops for a particular primary next- 416 hop N to a particular destination D. 418 MRT alternates are always available to use. It is a local decision 419 whether to use an MRT alternate, a Loop-Free Alternate or some other 420 type of alternate. 422 As described in [RFC5286], when a worse failure than is anticipated 423 happens, using LFAs that are not downstream neighbors can cause 424 micro-looping. Section 1.1 of [RFC5286] gives an example of link- 425 protecting alternates causing a loop on node failure. Even if a 426 worse failure than anticipated happens, the use of MRT alternates 427 will not cause looping. Therefore, while node-protecting LFAs may be 428 preferred, the certainty that no alternate-induced looping will occur 429 is an advantage of using MRT alternates when the available node- 430 protecting LFA is not a downstream path. 432 6. Unicast Forwarding with MRT Fast-Reroute 434 With LFA, there is no need to tunnel unicast traffic, whether IP or 435 LDP. The traffic is simply sent to an alternate. As mentioned 436 earlier in Section 5, MRT needs multi-topology forwarding. 437 Unfortunately, neither IP nor LDP provides extra bits for a packet to 438 indicate its topology. 
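The pre-computed alternate behavior described in Section 5 can be pictured as a per-destination forwarding entry that holds the primary next-hop plus one next-hop per MRT color. The Python sketch below is an assumption-laden illustration: the class MrtFibEntry and its fields are hypothetical, the alternate color is supplied as input rather than derived (the selection rule lives in [I-D.enyedi-rtgwg-mrt-frr-algorithm]), and the example values are router C's entry toward R as read from Figure 2.

```python
from dataclasses import dataclass


@dataclass
class MrtFibEntry:
    """Hypothetical forwarding entry for one destination (illustration only)."""
    destination: str
    primary_next_hop: str
    blue_next_hop: str
    red_next_hop: str
    alternate_color: str          # "blue" or "red", chosen before any failure
    primary_failed: bool = False

    def on_local_failure(self, failed_next_hop):
        # Forwarding changes only when a failure of the primary is detected.
        if failed_next_hop == self.primary_next_hop:
            self.primary_failed = True

    def next_hop(self):
        if not self.primary_failed:
            return self.primary_next_hop
        # Traffic sent here must also be marked for the chosen MRT topology
        # (label or tunnel), which this sketch does not model.
        if self.alternate_color == "blue":
            return self.blue_next_hop
        return self.red_next_hop


# Router C toward destination R in Figure 2: the primary uses the C<->R
# link, the Blue MRT leaves via D (path C->D->E->R), and the Red MRT
# next-hop is R itself, as read from Figure 2(c).
entry = MrtFibEntry("R", primary_next_hop="R", blue_next_hop="D",
                    red_next_hop="R", alternate_color="blue")
before = entry.next_hop()
entry.on_local_failure("R")
after = entry.next_hop()
```

Before the failure the entry returns R; after the C<->R failure it returns D, the first hop of the Blue MRT path given in the text.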
440 Once the MRTs are computed, the two sets of MRTs are seen by the 441 forwarding plane as essentially two additional topologies. The same 442 considerations apply for forwarding along the MRTs as for handling 443 multiple topologies. 445 6.1. LDP Unicast Forwarding - Avoid Tunneling 447 For LDP, it is very desirable to avoid tunneling because, for at 448 least node protection, tunneling requires knowledge of remote LDP 449 label mappings and thus requires targeted LDP sessions and the 450 associated management complexity. There are two different mechanisms 451 that can be used; Option A MUST be supported. 453 1. Option A - Encode MT-ID in Labels: In addition to sending a 454 single label for a FEC, a router would provide two additional 455 labels with the MT-IDs associated with the Blue MRT or Red MRT 456 forwarding topologies. This is very simple for hardware support. 457 It does reduce the label space for other uses. It also increases 458 the memory to store the labels and the communication required by 459 LDP. 461 2. Option B - Create Topology-Identification Labels: Use the label- 462 stacking ability of MPLS and specify only two additional labels - 463 one for each associated MRT color - by a new FEC type. When 464 sending a packet onto an MRT, first swap the LDP label and then 465 push the topology-identification label for that MRT color. When 466 receiving a packet with a topology-identification label, pop it 467 and use it to guide the next-hop selection in combination with 468 the next label in the stack; then swap the remaining label, if 469 appropriate, and push the topology-identification label for the 470 next-hop. This has minimal usage of additional labels, memory 471 and LDP communication. It does increase the size of packets and 472 the complexity of the required label operations and look-ups. 473 This can use the same mechanisms as are needed for context-aware 474 label spaces. 
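The label operations of the two options above can be sketched as follows. This is a minimal Python illustration under stated assumptions: every label value, table layout, and router name (N1, N2) is invented for the example, and the function names are hypothetical. Option A is modeled as a single lookup on a label that already encodes (FEC, MT-ID); Option B pops the topology-identification label, looks up the remaining label in the context of that MRT color, swaps it, and pushes the next-hop's topology-identification label.

```python
# Illustration only: all labels, tables and router names are invented.

BLUE, RED = "MRT-Blue", "MRT-Red"

# Option A: one label per (FEC, MT-ID), so a plain swap is enough.
# incoming label -> (next-hop router, outgoing label)
OPTION_A_TABLE = {
    501: ("N1", 611),   # label advertised for (FEC, MRT-Blue)
    502: ("N2", 722),   # label advertised for (FEC, MRT-Red)
}

# Option B: topology-identification labels this router advertised.
LOCAL_TOPOLOGY_LABELS = {1001: BLUE, 1002: RED}

# Option B: topology-identification labels learned from each next-hop.
NEIGHBOR_TOPOLOGY_LABELS = {
    "N1": {BLUE: 2001, RED: 2002},
    "N2": {BLUE: 3001, RED: 3002},
}

# Option B: (MRT color, incoming FEC label) -> (next-hop, outgoing FEC label)
OPTION_B_TABLE = {
    (BLUE, 500): ("N1", 610),
    (RED, 500): ("N2", 720),
}


def forward_option_a(label_stack):
    """Swap the top label; the topology is implicit in the label itself."""
    top, *rest = label_stack
    next_hop, out_label = OPTION_A_TABLE[top]
    return next_hop, [out_label, *rest]


def forward_option_b(label_stack):
    """Handle a stack of [topology label, FEC label, ...], top first."""
    topology_label, fec_label, *rest = label_stack
    color = LOCAL_TOPOLOGY_LABELS[topology_label]            # pop
    next_hop, out_label = OPTION_B_TABLE[(color, fec_label)]  # combined lookup
    out_topology = NEIGHBOR_TOPOLOGY_LABELS[next_hop][color]
    return next_hop, [out_topology, out_label, *rest]         # swap, then push
```

The sketch makes the trade-off in the text visible: Option A needs one table entry and one label per FEC per topology but only a single swap, while Option B needs two extra labels in total but carries a deeper stack and performs a two-step lookup at every hop.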
476     Note that with LDP unicast forwarding, regardless of whether
477     topology-identification labels or MT-IDs encoded in labels are used,
478     no additional loopbacks per router are required. This is because LDP
479     labels are used on a hop-by-hop basis to identify MRT-blue and MRT-
480     red forwarding topologies.

482     For greatest hardware compatibility, routers implementing MRT LDP
483     fast-reroute MUST support Option A of encoding the MT-ID in the
484     labels. The extensions to indicate an MT-ID for a FEC are described
485     in Section 3.2.1 of [I-D.ietf-mpls-ldp-multi-topology].

487  6.2. IP Unicast Traffic

489     For IP, there is no currently practical alternative except tunneling
490     to gain the bits needed to indicate the MRT-Blue or MRT-Red
491     forwarding topology. The choice of tunnel egress MAY be flexible
492     since any router closer to the destination than the next-hop can
493     work. This architecture assumes that the original destination in the
494     area is selected (see Section 10 for handling of multi-homed
495     prefixes); another possible choice is the next-next-hop towards the
496     destination. For LDP traffic, using the original destination
497     simplifies MRT-FRR by avoiding the need for targeted LDP sessions to
498     the next-next-hop. For IP, that consideration doesn't apply but
499     consistency with LDP is RECOMMENDED. If the tunnel egress is the
500     original destination router, then the traffic remains on the
501     redundant tree with sub-optimal routing. Selection of the tunnel
502     egress is a router-local decision.

504     There are three options available for marking IP packets with the
505     MRT on which they should be forwarded. For greatest hardware
506     compatibility and ease in removing the MRT-topology marking at
507     area/level boundaries, routers that support MPLS and implement IP MRT
508     fast-reroute MUST support Option A - using an LDP label that
509     indicates the destination and MT-ID.

511     1. Tunnel IP packets via an LDP LSP.
This has the advantage that
512         more installed routers can do line-rate encapsulation and
513         decapsulation. Also, no additional IP addresses would need to be
514         allocated or signaled.

516         a. Option A - LDP Destination-Topology Label: Use a label that
517             indicates both destination and MRT. This method allows easy
518             tunneling to the next-next-hop as well as to the IGP-area
519             destination. For a proxy-node, the destination to use is the
520             non-proxy-node immediately before the proxy-node on that
521             particular color MRT.

523         b. Option B - LDP Topology Label: Use a Topology-Identifier
524             label on top of the IP packet. This is very simple. If
525             tunneling to a next-next-hop is desired, then a two-deep
526             label stack can be used with [ Topology-ID label, Next-Next-
527             Hop Label ].

529     2. Tunnel IP packets in IP. Each router supporting this option
530         would announce two additional loopback addresses and their
531         associated MRT color. Those addresses are used as destination
532         addresses for MRT-blue and MRT-red IP tunnels respectively. They
533         allow the transit nodes to identify the traffic as being
534         forwarded along either MRT-blue or MRT-red tree topology to reach
535         the tunnel destination. Announcing these two additional
536         loopback addresses per router with their MRT color requires IGP
537         extensions.

539  7. Protocol Extensions and Considerations: OSPF and ISIS

541     For simplicity, the approach of defining a well-known profile is
542     taken in [I-D.atlas-ospf-mrt]. The purpose of communicating support
543     for MRT in the IGP is to indicate that the MRT-Blue and MRT-Red
544     forwarding topologies are created for transit traffic. This section
545     describes the various options to be selected. The default MRT
546     profile is described here and the signaling extensions for OSPF are
547     given in [I-D.atlas-ospf-mrt].

549     For any MRT profile, the MRT Island is created by starting from the
550     computing router.
If the computing router supports the default MRT 551 profile, add it to the MRT Island. Add a router to the MRT Island if 552 the router supports the default MRT profile and is connected to the 553 MRT Island via bidirectional links eligible for MRT. 555 If a router advertises support for multiple MRT profiles, then it 556 MUST create the transit forwarding topologies for each of those, 557 unless the profile specifies No Forwarding Mechanism (e.g. as might 558 be done for a profile used only for multicast global protection). A 559 router MUST NOT advertise multiple MRT profiles that overlap in their 560 MRT-Red MT-ID or MRT-Blue MT-ID. 562 The MRT Profile also defines different behaviors such as how MRT 563 recomputation is handled and how area/level boundaries are dealt 564 with. 566 MRT Algorithm: MRT Lowpoint algorithm defined in 567 [I-D.enyedi-rtgwg-mrt-frr-algorithm]. 569 MRT-Red MT-ID: experimental 3997, final value assigned by IANA 570 allocated from the LDP MT-ID space 572 MRT-Blue MT-ID: experimental 3998, final value assigned by IANA 573 allocated from the LDP MT-ID space 575 GADAG Root Selection Priority: Among the routers in the MRT Island 576 and with the highest priority advertised, an implementation MUST 577 pick the router with the highest Router ID to be the GADAG root. 579 Forwarding Mechanisms: LDP 581 Recalculation: Recalculation of MRTs SHOULD occur as described in 582 Section 11.2. This allows the MRT forwarding topologies to 583 support IP/LDP fast-reroute traffic. 585 Area/Level Border Behavior: As described in Section 9, ABRs/LBRs 586 SHOULD ensure that traffic leaving the area also exits the MRT-Red 587 or MRT-Blue forwarding topology. 589 The following describes the aspects to be considered to define a 590 profile to advertise. For some profiles, associated information may 591 need to be distributed, such as GADAG Root Selection Priority, Red 592 MRT Loopback Address, Blue MRT Loopback Address. 
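Before the individual profile aspects are listed, the island construction and GADAG root selection rules stated earlier in this section can be sketched in Python. This is an illustration under assumptions, not the normative procedure: the function names and data layout are hypothetical, links are modeled as directed pairs already filtered for MRT eligibility, Router IDs are plain integers, and a numerically larger GADAG Root Selection Priority is assumed to mean "highest priority".

```python
from collections import deque


def compute_mrt_island(computing_router, supports_profile, links):
    """Grow the MRT Island outward from the computing router.

    supports_profile: set of routers advertising the profile.
    links: set of directed (a, b) pairs that are eligible for MRT; a
           neighbor is only added if the link exists in both directions.
    """
    if computing_router not in supports_profile:
        return set()
    island = {computing_router}
    queue = deque([computing_router])
    while queue:
        current = queue.popleft()
        for a, b in links:
            if a != current or b in island:
                continue
            if b in supports_profile and (b, a) in links:
                island.add(b)
                queue.append(b)
    return island


def select_gadag_root(island, priority, router_id):
    """Highest advertised priority first, then highest Router ID.

    Assumption: a larger number means a higher priority.
    """
    return max(island, key=lambda r: (priority[r], router_id[r]))


# Invented example: C->D is one-way, X does not support the profile, and
# E is reachable only through X.
links = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B"), ("C", "D"),
         ("B", "X"), ("X", "B"), ("X", "E"), ("E", "X")}
supports = {"A", "B", "C", "D", "E"}

island = compute_mrt_island("A", supports, links)
root = select_gadag_root(island,
                         priority={"A": 128, "B": 128, "C": 100},
                         router_id={"A": 1, "B": 2, "C": 3})
```

In this example the island is {A, B, C}: D is excluded because its link is not bidirectional, and E is excluded because it is only connected through a router that does not support the profile. B becomes the GADAG root because it ties with A on priority and has the higher Router ID.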
MRT Algorithm:  This identifies the particular MRT algorithm used by
   the router for this profile. Algorithm transitions can be managed
   by advertising multiple MRT profiles.

MRT-Red MT-ID:  This specifies the MT-ID to be associated with the
   MRT-Red forwarding topology. It is needed for use in LDP
   signaling. All routers in the MRT Island MUST agree on a value.

MRT-Blue MT-ID:  This specifies the MT-ID to be associated with the
   MRT-Blue forwarding topology. It is needed for use in LDP
   signaling. All routers in the MRT Island MUST agree on a value.

GADAG Root Selection Priority:  An MRT profile might specify this to
   provide the network operator with a knob to force a particular
   GADAG root selection. If not specified in the MRT profile, the
   highest Router ID in the profile's MRT Island will be elected the
   GADAG Root. If a GADAG Root Selection Priority is specified, then
   the MRT profile must also specify how the GADAG Root is elected.

Forwarding Mechanism:  This specifies which forwarding mechanisms the
   router supports for transit traffic. An MRT island must program
   appropriate next-hops into the forwarding plane. The known options
   are IPv4, IPv6, LDP, and None. If IPv4 is supported, then both
   MRT-Red and MRT-Blue IPv4 Loopback Addresses SHOULD be specified.
   If IPv6 is supported, both MRT-Red and MRT-Blue IPv6 Loopback
   Addresses SHOULD be specified. If LDP is supported, then LDP
   support and signaling extensions MUST be supported.

MRT-Red Loopback Address:  This provides the router's loopback address
   to reach the router via the MRT-Red forwarding topology. It can,
   of course, be specified for both IPv4 and IPv6.

MRT-Blue Loopback Address:  This provides the router's loopback
   address to reach the router via the MRT-Blue forwarding topology.
   It can, of course, be specified for both IPv4 and IPv6.
Recalculation:  As part of what process and timing should the new MRTs
   be computed on a modified topology? Section 11.2 describes the
   minimum behavior required to support fast-reroute.

Area/Level Border Behavior:  Should inter-area traffic on the MRT-Blue
   or MRT-Red be put back onto the shortest path tree? Should it be
   swapped from MRT-Blue or MRT-Red in one area/level to MRT-Red or
   MRT-Blue in the next area/level to avoid the potential failure of
   an ABR? (See [I-D.atlas-rtgwg-mrt-mc-arch] for use-case details.)

Other Profile-Specific Behavior:  Depending upon the use-case for the
   profile, there may be additional profile-specific behavior.

As with LFA, it is expected that OSPF Virtual Links will not be
supported.

8.  Protocol Extensions and Considerations: LDP

The protocol extensions for LDP are defined in
[I-D.atlas-mpls-ldp-mrt]. A router must indicate that it has the
ability to support MRT; having this explicit allows the use of
MRT-specific processing, such as special handling of FECs sent with
the Rainbow MRT MT-ID.

A FEC sent with the Rainbow MRT MT-ID indicates that the FEC applies
to all the MRT-Blue and MRT-Red MT-IDs in supported MRT profiles as
well as to the default shortest-path based MT-ID 0. The Rainbow MRT
MT-ID is defined to provide an easy way to handle the special
signaling that is needed at ABRs or LBRs. It avoids the problem of
needing to signal different MPLS labels for the same FEC. Because the
Rainbow MRT MT-ID is used only by ABRs/LBRs or the LDP egress, it is
not MRT profile specific. The proposed experimental value is 3999 and
the final value will be assigned by IANA and allocated from the LDP
MT-ID space. The authoritative values are given in
[I-D.atlas-mpls-ldp-mrt].

9.  Inter-Area and ABR Forwarding Behavior

An ABR/LBR has two forwarding roles. First, it forwards traffic inside
its area.
Second, it forwards traffic from one area into another. These same two
roles apply for MRT transit traffic. Traffic on MRT-Red or MRT-Blue
destined inside the area needs to stay on MRT-Red or MRT-Blue in that
area. However, it is desirable for traffic leaving the area to also
exit MRT-Red or MRT-Blue back to the shortest-path forwarding.

For unicast MRT-FRR, the need to stay on an MRT forwarding topology
terminates at the ABR/LBR whose best route is via a different
area/level. It is highly desirable to go back to the default
forwarding topology when leaving an area/level. There are three basic
reasons for this. First, the default topology uses shortest paths; the
packet will thus take the shortest possible route to the destination.
Second, this allows failures that might appear in multiple areas (e.g.
ABR/LBR failures) to be separately identified and repaired around.
Third, the packet can be fast-rerouted again, if necessary, due to a
failure in a different area.

An ABR/LBR that receives a packet on MRT-Red or MRT-Blue towards a
destination in another area/level should forward the packet in the
area/level with the best route along MRT-Red or MRT-Blue. If the
packet came from that area/level, this correctly avoids the failure.
However, if the traffic came from a different area/level, the packet
should be removed from MRT-Red or MRT-Blue and forwarded on the
shortest-path default forwarding topology.

To avoid per-interface forwarding state for MRT-Red and MRT-Blue, the
ABR/LBR needs to arrange that packets destined to a different area
arrive at the ABR/LBR already not marked as MRT-Red or MRT-Blue.

For LDP forwarding where the MPLS label specifies (MT-ID, FEC), the
ABR/LBR is responsible for advertising the proper label to each
neighbor.
Assume that an ABR/LBR has allocated three labels for a particular
destination; those labels are L_primary, L_blue, and L_red. When the
ABR/LBR advertises label bindings to routers in the area with the best
route to the destination, the ABR/LBR provides L_primary for the
default topology, L_blue for the MRT-Blue MT-ID and L_red for the
MRT-Red MT-ID, exactly as expected. However, when the ABR/LBR
advertises label bindings to routers in other areas, the ABR/LBR
advertises L_primary for the Rainbow MRT MT-ID, which is then used for
the default topology, for the MRT-Blue MT-ID and for the MRT-Red
MT-ID.

The ABR/LBR installs all next-hops from the best area: primary
next-hops for L_primary, MRT-Blue next-hops for L_blue, and MRT-Red
next-hops for L_red. Because the ABR/LBR advertised (Rainbow MRT
MT-ID, FEC) with L_primary to neighbors not in the best area, packets
from those neighbors will arrive at the ABR/LBR with a label L_primary
and will be forwarded into the best area along the default topology.
By controlling what labels are advertised, the ABR/LBR can thus
enforce that packets exiting the area do so on the shortest-path
default topology.

If IP forwarding is used, then the ABR/LBR behavior is dependent upon
the outermost IP address. If the outermost IP address is an MRT
loopback address of the ABR/LBR, then the packet is decapsulated and
forwarded based upon the inner IP address, which should go on the
default SPT topology. If the outermost IP address is not an MRT
loopback address of the ABR/LBR, then the packet is simply forwarded
along the associated forwarding topology. A PLR sending traffic to a
destination outside its local area/level will pick the MRT and use the
associated MRT loopback address of the selected ABR/LBR connected to
the external destination.
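
The LDP label-advertisement rule described above (L_primary, L_blue,
L_red and the Rainbow MRT MT-ID) can be illustrated with a minimal
Python sketch. It is not taken from [I-D.atlas-mpls-ldp-mrt]: the
function names, the dictionary layout and the receiver-side expansion
are assumptions made for illustration, and the MT-ID constants are
simply the experimental values quoted in this document.

```python
# Hypothetical illustration only; names and data layout are assumptions.
MT_DEFAULT = 0      # default shortest-path topology (MT-ID 0)
MT_RED = 3997       # experimental MRT-Red MT-ID of the default profile
MT_BLUE = 3998      # experimental MRT-Blue MT-ID of the default profile
MT_RAINBOW = 3999   # proposed experimental Rainbow MRT MT-ID


def bindings_for_neighbor(labels, neighbor_area, best_area):
    """Return {mt_id: label} that an ABR/LBR advertises for one FEC.

    labels: dict with keys "primary", "blue", "red" (locally allocated).
    neighbor_area: area/level of the LDP neighbor.
    best_area: area/level holding the ABR/LBR's best route to the FEC.
    """
    if neighbor_area == best_area:
        # Inside the best area: one distinct label per topology.
        return {
            MT_DEFAULT: labels["primary"],
            MT_BLUE: labels["blue"],
            MT_RED: labels["red"],
        }
    # Other areas: a single Rainbow binding carrying L_primary, so packets
    # from those areas always arrive on the default topology.
    return {MT_RAINBOW: labels["primary"]}


def expand_rainbow(bindings):
    """Sketch of how a receiver could interpret a Rainbow binding."""
    if MT_RAINBOW in bindings:
        label = bindings[MT_RAINBOW]
        return {MT_DEFAULT: label, MT_BLUE: label, MT_RED: label}
    return dict(bindings)
```

With this arrangement a neighbor outside the best area ends up using
the same label for all three topologies, which is why the ABR/LBR
needs no per-interface MRT state.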
Thus, regardless of which of these two forwarding mechanisms is used,
there is no need for additional computation or per-area forwarding
state.

       +----[C]----     --[D]--[E]                --[D]--[E]
       |           \   /         \               /         \
   p--[A] Area 10 [ABR1]  Area 0 [H]--p   +-[ABR1]  Area 0 [H]-+
       |           /   \         /        |      \         /   |
       +----[B]----     --[F]--[G]        |       --[F]--[G]   |
                                          |                    |
                                          | other              |
                                          +----------[p]-------+
                                            area

         (a) Example topology        (b) Proxy node view in Area 0 nodes

       +----[C]<---       [D]->[E]
       V           \             \
    +-[A] Area 10 [ABR1]  Area 0 [H]-+
    |  ^           /             /   |
    |  +----[B]<---       [F]->[G]   V
    |                                |
    +------------->[p]<--------------+

         (c) rSPT towards destination p

         ->[D]->[E]                         -<[D]<-[E]
        /          \                       /         \
   [ABR1]  Area 0 [H]-+             +-[ABR1]         [H]
                  /   |             |      \
         [F]->[G]     V             V       -<[F]<-[G]
                      |             |
                      |             |
            [p]<------+             +--------->[p]

     (d) Blue MRT in Area 0           (e) Red MRT in Area 0

             Figure 3: ABR Forwarding Behavior and MRTs

The other forwarding mechanism described in Section 6 is using
Topology-Identification Labels. This mechanism would require that any
router whose MRT-Red or MRT-Blue next-hop is an ABR/LBR would need to
determine whether the ABR/LBR would forward the packet out of the
area/level. If so, then that router should pop off the
topology-identification label before forwarding the packet to the
ABR/LBR.

For example, in Figure 3, if node H fails, node E has to put traffic
towards prefix p onto MRT-Red. But since node D knows that ABR1 will
use a best route from another area, it is safe for D to pop the
Topology-Identification Label and just forward the packet to ABR1
along the MRT-Red next-hop. ABR1 will use the shortest path in Area
10.

In all cases for ISIS and most cases for OSPF, the penultimate router
can determine what decision the adjacent ABR will make.
The one case where it cannot be determined is when two ASBRs are in
different non-backbone areas attached to the same ABR. In that case,
the ASBR's Area ID may be needed for tie-breaking (prefer the route
with the largest OSPF area ID), and the Area ID isn't announced as
part of the ASBR link-state advertisement (LSA). In this one case,
suboptimal forwarding along the MRT in the other area would happen. If
that becomes a realistic deployment scenario, OSPF extensions could be
considered. This is not covered in [I-D.atlas-ospf-mrt].

10.  Prefixes Multiply Attached to the MRT Island

How a computing router S determines its local MRT Island for each
supported MRT profile is already discussed in Section 7.

There are two types of prefixes or FECs that may be multiply attached
to an MRT Island. The first type is multi-homed prefixes that usually
connect at a domain or protocol boundary. The second type represents
routers that do not support the profile for the MRT Island. The key
difference is whether the traffic, once out of the MRT Island, remains
in the same area/level and might reenter the MRT Island if a loop-free
exit point is not selected.

One property of LFAs that is necessary to preserve is the ability to
protect multi-homed prefixes against ABR failure. For instance, if a
prefix from the backbone is available via both ABR A and ABR B and A
fails, then the traffic should be redirected to B. This can also be
done for backups via MRT.

If ASBR protection is desired, this has additional complexities if the
ASBRs are in different areas. Similarly, protecting labeled BGP
traffic in the event of an ASBR failure has additional complexities
due to the per-ASBR label spaces involved.
As discussed in [RFC5286], a multi-homed prefix could be:

o  An out-of-area prefix announced by more than one ABR,

o  An AS-External route announced by 2 or more ASBRs,

o  A prefix with iBGP multipath to different ASBRs,

o  etc.

There are also two different approaches to protection. The first is to
do endpoint selection to pick a router to tunnel to where that router
is loop-free with respect to the failure-point. Conceptually, the set
of candidate routers to provide LFAs expands to all routers, with an
MRT alternate, attached to the prefix.

The second is to use a proxy-node that can be named via MPLS label or
IP address, and to pick the appropriate label or IP address to reach
it on either MRT-Blue or MRT-Red as appropriate to avoid the failure
point. A proxy-node can represent a destination prefix that can be
attached to the MRT Island via at least two routers. It is termed a
named proxy-node if there is a way that traffic can be encapsulated to
reach specifically that proxy-node; this could be because there is an
LDP FEC for the associated prefix or because MRT-Red and MRT-Blue IP
addresses are advertised in an as-yet undefined fashion for that
proxy-node. Traffic to a named proxy-node may take a different path
than traffic to the attaching router; traffic is also explicitly
forwarded from the attaching router along a predetermined interface
towards the relevant prefixes.

For IP traffic, multi-homed prefixes can use endpoint selection. For
IP traffic that is destined to a router outside the MRT Island, if
that router is the egress for a FEC advertised into the MRT Island,
then the named proxy-node approach can be used.

For LDP traffic, there is always a FEC advertised into the MRT Island.
The named proxy-node approach should be used, unless the computing
router S knows the label for the FEC at the selected endpoint.
If a FEC is advertised from outside the MRT Island into the MRT Island
and the forwarding mechanism specified in the profile includes LDP,
then the routers learning that FEC MUST also advertise labels for
(MRT-Red, FEC) and (MRT-Blue, FEC) to neighbors inside the MRT Island.
If the forwarding mechanism includes LDP, any router receiving a FEC
corresponding to a router outside the MRT Island or to a multi-homed
prefix MUST compute and install the transit MRT-Blue and MRT-Red
next-hops for that FEC; the associated FECs ((MT-ID 0, FEC), (MRT-Red,
FEC), and (MRT-Blue, FEC)) MUST also be provided via LDP to neighbors
inside the MRT Island.

10.1.  Endpoint Selection

Endpoint Selection is a local matter for a router in the MRT Island
since it pertains to selecting and using an alternate and does not
affect the transit MRT-Red and MRT-Blue forwarding topologies.

Let the computing router be S and the next-hop F be the node whose
failure is to be avoided. Let the destination be prefix p. Let A be
the router to which the prefix p is attached for S's shortest path to
p.

The candidates for endpoint selection are those to which the
destination prefix is attached in the area/level. For a particular
candidate B, it is necessary to determine if B is loop-free to reach p
with respect to S and F for node-protection or at least with respect
to S and the link (S, F) for link-protection. If B will always prefer
to send traffic to p via a different area/level, then this is
definitional. Otherwise, distance-based computations are necessary and
an SPF from B's perspective may be necessary. The following equations
give the checks needed; the rationale is similar to that given in
[RFC5286].

   Loop-Free for S: D_opt(B, p) < D_opt(B, S) + D_opt(S, p)

   Loop-Free for F: D_opt(B, p) < D_opt(B, F) + D_opt(F, p)

Because F is the next-hop on S's shortest path to p, D_opt(F, p) =
D_opt(S, p) - D_opt(S, F). The latter check is therefore equivalent to
the following, which avoids the need to compute the shortest path from
F to p.

   Loop-Free for F: D_opt(B, p) < D_opt(B, F) + D_opt(S, p) - D_opt(S, F)

Finally, the rules for Endpoint selection are given below. The basic
idea is to repair to the prefix-advertising router selected for the
shortest-path and only to select and tunnel to a different endpoint if
necessary (e.g. A=F or F is a cut-vertex or the link (S,F) is a
cut-link).

1.  Does S have a node-protecting alternate to A? If so, select that.
    Tunnel the packet to A along that alternate. For example, if LDP
    is the forwarding mechanism, then push the label (MRT-Red, A) or
    (MRT-Blue, A) onto the packet.

2.  If not, then is there a router B that is loop-free to reach p
    while avoiding both F and S? If so, select B as the endpoint.
    Determine the MRT alternate to reach B while avoiding F. Tunnel
    the packet to B along that alternate. For example, with LDP, push
    the label (MRT-Red, B) or (MRT-Blue, B) onto the packet.

3.  If not, then does S have a link-protecting alternate to A? If so,
    select that.

4.  If not, then is there a router B that is loop-free to reach p
    while avoiding S and the link from S to F? If so, select B as the
    endpoint and use the MRT alternate for reaching B from S that
    avoids the link (S,F).

The endpoint selected will receive a packet destined to itself and,
being the egress, will pop that MPLS label (or have signaled Implicit
Null) and forward based on what is underneath. This suffices for IP
traffic where the MPLS labels understood by the endpoint router are
not needed.

10.2.
Named Proxy-Nodes

A clear advantage to using a named proxy-node is that it is possible
to explicitly forward from the MRT Island along an interface to a
loop-free island neighbor (LFIN) when that interface may not be a
primary next-hop. For LDP traffic where the label indicates both the
topology and the FEC, it is necessary to either use a named proxy-node
or deal with learning remote MPLS labels.

A named proxy-node represents one or more destinations and, for LDP
forwarding, has a FEC associated with it that is signaled into the MRT
Island. Therefore, it is possible to explicitly label packets to go to
(MRT-Red, FEC) or (MRT-Blue, FEC); at the border of the MRT Island,
the label will swap to meaning (MT-ID 0, FEC). It would be possible to
have named proxy-nodes for IP forwarding, but this would require
extensions to signal two IP addresses to be associated with MRT-Red
and MRT-Blue for the proxy-node. A named proxy-node can be uniquely
represented by the two routers in the MRT Island to which it is
connected. The extensions to signal such IP addresses are not defined
in [I-D.atlas-ospf-mrt]. The details of what label-bindings must be
originated are described in [I-D.atlas-mpls-ldp-mrt].

Computing the MRT next-hops to a named proxy-node and the MRT
alternate for the computing router S to avoid a particular failure
node F is extremely straightforward. The details of the simple
constant-time functions, Select_Proxy_Node_NHs() and
Select_Alternates_Proxy_Node(), are given in
[I-D.enyedi-rtgwg-mrt-frr-algorithm]. A key point is that computing
these MRT next-hops and alternates can be done as new named
proxy-nodes are added or removed without requiring a new MRT
computation or impacting other existing MRT paths. This maps very well
to, for example, how OSPFv2 ([RFC2328], Section 16.5) does incremental
updates for new summary-LSAs.
The key question is how to attach the named proxy-node to the MRT
Island; all the routers in the MRT Island MUST do this consistently.
No more than 2 routers in the MRT Island can be selected; one should
only be selected if there are no others that meet the necessary
criteria. The named proxy-node is logically part of the area/level.

There are two sources for candidate routers in the MRT Island to
connect to the named proxy-node. The first set are those routers that
are advertising the prefix; the cost assigned to each such router is
the announced cost to the prefix. The second set are those routers in
the MRT Island that are connected to routers not in the MRT Island but
in the same area/level; such routers will be defined as Island Border
Routers (IBRs). The routers connected to the IBRs that are not in the
MRT Island and are in the same area/level are Island Neighbors (INs).

Since packets sent to the named proxy-node along MRT-Red or MRT-Blue
may come from any router inside the MRT Island, it is necessary that
whatever router to which an IBR forwards the packet be loop-free with
regard to the whole MRT Island for the destination. Thus, an IBR is a
candidate router only if it possesses at least one IN whose path to
the prefix does not enter the MRT Island. The cost assigned to each
(IBR, IN) pair is the D_opt(IN, prefix) plus Cost(IBR, IN).

From the set of prefix-advertising routers and the IBRs, the two
lowest cost routers are selected and ties are broken based upon the
lowest Router ID. For ease of discussion, such selected routers are
proxy-node attachment routers and the two selected will be named A and
B.

A proxy-node attachment router has a special forwarding role. When a
packet is received destined to (MRT-Red, prefix) or (MRT-Blue,
prefix), if the proxy-node attachment router is an IBR, it MUST swap
to the default topology (e.g.
swap to the label for (MT-ID 0, prefix) or remove the outer IP
encapsulation) and forward the packet to the IN whose cost was used in
the selection. If the proxy-node attachment router is not an IBR, then
the packet MUST be removed from the MRT forwarding topology and sent
along the interface that caused the router to advertise the prefix;
this interface might be out of the area/level/AS.

10.2.1.  Computing if an Island Neighbor (IN) is loop-free

As discussed, the Island Neighbor needs to be loop-free with regard to
the whole MRT Island for the destination. Conceptually, the cost of
transiting the MRT Island should be regarded as 0. This can be done by
collapsing the MRT Island into a single node, as seen in Figure 4, and
then computing SPFs from each Island Neighbor and from the MRT Island
itself.

     [G]---[E]---(V)---(U)---(T)
      | \   |     |           |
      |  \  |     |           |
      |   \ |     |           |
     [H]---[F]---(R)---(S)----|

     (1) Network Graph with Partial Deployment

     [E],[F],[G],[H]    : No support for MRT
     (R),(S),(T),(U),(V): MRT Island - supports MRT

   [G]---[E]----|                 |---(V)---(U)---(T)
    | \   |     |                 |    |           |
    |  \  | ( MRT Island )    [ proxy ]|           |
    |   \ |     |                 |    |           |
   [H]---[F]----|                 |---(R)---(S)----|

   (2) Graph for determining      (3) Graph for MRT computation
       loop-free neighbors

 Figure 4: Computing alternates to destinations outside the MRT Island

The simple way to do this without manipulating the topology is to
compute the SPFs from each IN and a node in the MRT Island (e.g. the
GADAG root), but use a link metric of 0 for all links between routers
in the MRT Island. The distances computed via SPF this way will be
referred to as Dist_mrt0.

An IN is loop-free with respect to a destination D if: Dist_mrt0(IN,
D) < Dist_mrt0(IN, MRT Island Router) + Dist_mrt0(MRT Island Router,
D).
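
The Dist_mrt0 computation and the loop-free test above can be sketched
in Python as follows. This is only an illustration under stated
assumptions: the adjacency-dictionary graph format and the function
names dist_mrt0() and island_neighbor_is_loop_free() are hypothetical
and do not come from [I-D.enyedi-rtgwg-mrt-frr-algorithm]; the SPF is
a plain Dijkstra run in which links between two MRT Island routers are
given metric 0.

```python
import heapq


def dist_mrt0(graph, island, source):
    """Dijkstra from source; links between two island routers cost 0.

    graph: {node: {neighbor: metric}} (hypothetical input format).
    island: set of routers that belong to the MRT Island.
    Returns {node: distance} for every reachable node.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, metric in graph[u].items():
            # Transiting the MRT Island is regarded as free.
            cost = 0 if (u in island and v in island) else metric
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def island_neighbor_is_loop_free(graph, island, island_neighbor,
                                 island_router, dest):
    """Check Dist_mrt0(IN, D) < Dist_mrt0(IN, R) + Dist_mrt0(R, D)."""
    inf = float("inf")
    from_in = dist_mrt0(graph, island, island_neighbor)
    from_island = dist_mrt0(graph, island, island_router)
    return from_in.get(dest, inf) < (
        from_in.get(island_router, inf) + from_island.get(dest, inf)
    )
```

An unreachable destination yields an infinite distance on both sides,
so the strict comparison fails and the IN is treated as not loop-free.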
Any router in the MRT Island can be used since the cost of transiting
between MRT Island routers is 0. The GADAG Root is recommended for
consistency.

10.3.  MRT Alternates for Destinations Outside the MRT Island

A natural concern with new functionality is how to have it be useful
when it is not deployed across an entire IGP area. In the case of MRT
FRR, where it provides alternates when appropriate LFAs aren't
available, there are also deployment scenarios where it may make sense
to only enable some routers in an area with MRT FRR. A simple example
of such a scenario would be a ring of 6 or more routers that is
connected via two routers to the rest of the area.

Destinations inside the local island can obviously use MRT alternates.
Destinations outside the local island can be treated like a
multi-homed prefix and either Endpoint Selection or Named Proxy-Nodes
can be used. Named Proxy-Nodes MUST be supported when LDP forwarding
is supported and a label-binding for the destination is sent to an
IBR.

Naturally, there are more complicated options to improve coverage,
such as connecting multiple MRT islands across tunnels, but the need
for the additional complexity has not been justified.

11.  Network Convergence and Preparing for the Next Failure

After a failure, MRT detours ensure that packets reach their intended
destination while the IGP has not reconverged onto the new topology.
As link-state updates reach the routers, the IGP process calculates
the new shortest paths. Two things need attention: micro-loop
prevention and MRT re-calculation.

11.1.  Micro-forwarding loop prevention and MRTs

As is well known [RFC5715], micro-loops can occur during IGP
convergence; such loops can be local to the failure or remote from the
failure.
Managing micro-loops is an orthogonal issue to having alternates for
local repair, such as MRT fast-reroute provides.

There are two possible micro-loop prevention mechanisms discussed in
[RFC5715]. The first is Ordered FIB [I-D.ietf-rtgwg-ordered-fib]. The
second is Farside Tunneling which requires tunnels or an alternate
topology to reach routers on the farside of the failure.

Since MRTs provide an alternate topology through which traffic can be
sent and which can be manipulated separately from the SPT, it is
possible that MRTs could be used to support Farside Tunneling. Details
of how to do so are outside the scope of this document.

Micro-loop mitigation mechanisms can also work when combined with MRT.

11.2.  MRT Recalculation

When a failure event happens, traffic is put by the PLRs onto the MRT
topologies. After that, each router recomputes its shortest path tree
(SPT) and moves traffic over to that. Only after all the PLRs have
switched to using their SPTs and traffic has drained from the MRT
topologies should each router install the recomputed MRTs into the
FIBs.

At each router, therefore, the sequence is as follows:

1.  Receive failure notification

2.  Recompute SPT

3.  Install new SPT

4.  If the network was stable before the failure occurred, wait a
    configured (or advertised) period for all routers to be using
    their SPTs and traffic to drain from the MRTs.

5.  Recompute MRTs

6.  Install new MRTs.

While the recomputed MRTs are not installed in the FIB, protection
coverage is lowered. Therefore, it is important to recalculate the
MRTs and install them quickly.

12.  Acknowledgements

The authors would like to thank Mike Shand for his valuable review and
contributions.
The authors would like to thank Joel Halpern, Hannes Gredler, Ted
Qian, Kishore Tiruveedhula, Shraddha Hegde, Santosh Esale, Nitin
Bahadur, Harish Sitaraman, Raveendra Torvi and Chris Bowers for their
suggestions and review.

13.  IANA Considerations

This document includes no request to IANA.

14.  Security Considerations

This architecture is not currently believed to introduce new security
concerns.

15.  References

15.1.  Normative References

[I-D.enyedi-rtgwg-mrt-frr-algorithm]
           Atlas, A., Enyedi, G., Csaszar, A., Gopalan, A., and C.
           Bowers, "Algorithms for computing Maximally Redundant
           Trees for IP/LDP Fast-Reroute",
           draft-enyedi-rtgwg-mrt-frr-algorithm-03 (work in
           progress), July 2013.

[RFC5286]  Atlas, A. and A. Zinin, "Basic Specification for IP Fast
           Reroute: Loop-Free Alternates", RFC 5286, September 2008.

[RFC5714]  Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC
           5714, January 2010.

15.2.  Informative References

[EnyediThesis]
           Enyedi, G., "Novel Algorithms for IP Fast Reroute",
           Department of Telecommunications and Media Informatics,
           Budapest University of Technology and Economics Ph.D.
           Thesis, February 2011.

[I-D.atlas-mpls-ldp-mrt]
           Atlas, A., Tiruveedhula, K., Tantsura, J., and IJ.
           Wijnands, "LDP Extensions to Support Maximally Redundant
           Trees", draft-atlas-mpls-ldp-mrt-00 (work in progress),
           July 2013.

[I-D.atlas-ospf-mrt]
           Atlas, A., Hegde, S., Chris, C., and J. Tantsura, "OSPF
           Extensions to Support Maximally Redundant Trees",
           draft-atlas-ospf-mrt-00 (work in progress), July 2013.

[I-D.atlas-rtgwg-mrt-mc-arch]
           Atlas, A., Kebler, R., Wijnands, I., Csaszar, A., and G.
           Enyedi, "An Architecture for Multicast Protection Using
           Maximally Redundant Trees",
           draft-atlas-rtgwg-mrt-mc-arch-02 (work in progress), July
           2013.

[I-D.bryant-ipfrr-tunnels]
           Bryant, S., Filsfils, C., Previdi, S., and M. Shand, "IP
           Fast Reroute using tunnels", draft-bryant-ipfrr-tunnels-03
           (work in progress), November 2007.

[I-D.ietf-mpls-ldp-multi-topology]
           Zhao, Q., Fang, L., Zhou, C., Li, L., and K. Raza, "LDP
           Extensions for Multi Topology Routing",
           draft-ietf-mpls-ldp-multi-topology-08 (work in progress),
           May 2013.

[I-D.ietf-rtgwg-ipfrr-notvia-addresses]
           Bryant, S., Previdi, S., and M. Shand, "A Framework for IP
           and MPLS Fast Reroute Using Not-via Addresses",
           draft-ietf-rtgwg-ipfrr-notvia-addresses-11 (work in
           progress), May 2013.

[I-D.ietf-rtgwg-ordered-fib]
           Shand, M., Bryant, S., Previdi, S., Filsfils, C.,
           Francois, P., and O. Bonaventure, "Framework for Loop-free
           convergence using oFIB", draft-ietf-rtgwg-ordered-fib-12
           (work in progress), May 2013.

[I-D.ietf-rtgwg-remote-lfa]
           Bryant, S., Filsfils, C., Previdi, S., Shand, M., and S.
           Ning, "Remote LFA FRR", draft-ietf-rtgwg-remote-lfa-02
           (work in progress), May 2013.

[I-D.litkowski-rtgwg-node-protect-remote-lfa]
           Litkowski, S., "Node protecting remote LFA",
           draft-litkowski-rtgwg-node-protect-remote-lfa-00 (work in
           progress), April 2013.

[LFARevisited]
           Retvari, G., Tapolcai, J., Enyedi, G., and A. Csaszar, "IP
           Fast ReRoute: Loop Free Alternates Revisited", Proceedings
           of IEEE INFOCOM, 2011.

[LightweightNotVia]
           Enyedi, G., Retvari, G., Szilagyi, P., and A. Csaszar, "IP
           Fast ReRoute: Lightweight Not-Via without Additional
           Addresses", Proceedings of IEEE INFOCOM, 2009.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.

[RFC3137]  Retana, A., Nguyen, L., White, R., Zinin, A., and D.
           McPherson, "OSPF Stub Router Advertisement", RFC 3137,
           June 2001.

[RFC5443]  Jork, M., Atlas, A., and L. Fang, "LDP IGP
           Synchronization", RFC 5443, March 2009.

[RFC5715]  Shand, M. and S. Bryant, "A Framework for Loop-Free
           Convergence", RFC 5715, January 2010.

[RFC6571]  Filsfils, C., Francois, P., Shand, M., Decraene, B.,
           Uttaro, J., Leymann, N., and M. Horneffer, "Loop-Free
           Alternate (LFA) Applicability in Service Provider (SP)
           Networks", RFC 6571, June 2012.

Appendix A.  General Issues with Area Abstraction

When a multi-homed prefix is connected in two different areas, it may
be impractical to protect them without adding the complexity of
explicit tunneling. This is also a problem for LFA and Remote-LFA.

          50
        |----[ASBR Y]---[B]---[ABR 2]---[C]      Backbone Area 0:
        |                                |           ABR 1, ABR 2, C, D
        |                                |
        |                                |       Area 20:  A, ASBR X
        |                                |
        p ---[ASBR X]---[A]---[ABR 1]---[D]      Area 10: B, ASBR Y
           5                                  p is a Type 1 AS-external

        Figure 5: AS external prefixes in different areas

Consider the network in Figure 5 and assume there is a richer
connective topology that isn't shown, where the same prefix is
announced by ASBR X and ASBR Y which are in different non-backbone
areas. If the link from A to ASBR X fails, then an MRT alternate could
forward the packet to ABR 1 and ABR 1 could forward it to D, but then
D would find the shortest route is back via ABR 1 to Area 20. This
problem occurs because the routers, including the ABR, in one area are
not yet aware of the failure in a different area.

The only way to get it from A to ASBR Y is to explicitly tunnel it to
ASBR Y. If the traffic is unlabeled or the appropriate MPLS labels are
known, then explicit tunneling MAY be used as long as the
shortest-path of the tunnel avoids the failure point.
In that case, A must determine that it should use an explicit tunnel
instead of an MRT alternate.

Authors' Addresses

   Alia Atlas (editor)
   Juniper Networks
   10 Technology Park Drive
   Westford, MA 01886
   USA

   Email: akatlas@juniper.net

   Robert Kebler
   Juniper Networks
   10 Technology Park Drive
   Westford, MA 01886
   USA

   Email: rkebler@juniper.net

   Gabor Sandor Enyedi
   Ericsson
   Konyves Kalman krt 11.
   Budapest 1097
   Hungary

   Email: Gabor.Sandor.Enyedi@ericsson.com

   Andras Csaszar
   Ericsson
   Konyves Kalman krt 11
   Budapest 1097
   Hungary

   Email: Andras.Csaszar@ericsson.com

   Jeff Tantsura
   Ericsson
   300 Holger Way
   San Jose, CA 95134
   USA

   Email: jeff.tantsura@ericsson.com

   Maciek Konstantynowicz
   Cisco Systems

   Email: maciek@bgp.nu

   Russ White
   VCE

   Email: russw@riw.us