idnits 2.17.1 draft-ietf-rtgwg-lfa-manageability-09.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (June 19, 2015) is 3234 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC5307' is defined on line 1141, but no explicit reference was found in the text ** Obsolete normative reference: RFC 4205 (Obsoleted by RFC 5307) ** Downref: Normative reference to an Informational RFC: RFC 6571 == Outdated reference: A later version (-11) exists of draft-ietf-isis-node-admin-tag-02 == Outdated reference: A later version (-04) exists of draft-ietf-isis-prefix-attributes-00 == Outdated reference: A later version (-09) exists of draft-ietf-ospf-node-admin-tag-02 == Outdated reference: A later version (-13) exists of draft-ietf-ospf-prefix-link-attr-06 == Outdated reference: A later version (-13) exists of draft-ietf-rtgwg-rlfa-node-protection-02 Summary: 2 errors (**), 0 flaws (~~), 7 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Routing Area Working Group S. Litkowski, Ed. 3 Internet-Draft B. Decraene 4 Intended status: Standards Track Orange 5 Expires: December 21, 2015 C. Filsfils 6 K. Raza 7 Cisco Systems 8 M. Horneffer 9 Deutsche Telekom 10 P. Sarkar 11 Juniper Networks 12 June 19, 2015 14 Operational management of Loop Free Alternates 15 draft-ietf-rtgwg-lfa-manageability-09 17 Abstract 19 Loop Free Alternates (LFA), as defined in RFC 5286, is an IP Fast 20 ReRoute (IP FRR) mechanism enabling traffic protection for IP traffic 21 (and MPLS LDP traffic by extension). Following first deployment 22 experiences, this document provides operational feedback on LFA, 23 highlights some limitations, and proposes a set of refinements to 24 address those limitations. It also proposes required management 25 specifications. 27 This proposal is also applicable to remote LFA solutions. 29 Requirements Language 31 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 32 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 33 document are to be interpreted as described in [RFC2119]. 35 Status of This Memo 37 This Internet-Draft is submitted in full conformance with the 38 provisions of BCP 78 and BCP 79. 40 Internet-Drafts are working documents of the Internet Engineering 41 Task Force (IETF). Note that other groups may also distribute 42 working documents as Internet-Drafts. The list of current Internet- 43 Drafts is at http://datatracker.ietf.org/drafts/current/. 45 Internet-Drafts are draft documents valid for a maximum of six months 46 and may be updated, replaced, or obsoleted by other documents at any 47 time. It is inappropriate to use Internet-Drafts as reference 48 material or to cite them other than as "work in progress." 49 This Internet-Draft will expire on December 21, 2015. 51 Copyright Notice 53 Copyright (c) 2015 IETF Trust and the persons identified as the 54 document authors.
All rights reserved. 56 This document is subject to BCP 78 and the IETF Trust's Legal 57 Provisions Relating to IETF Documents 58 (http://trustee.ietf.org/license-info) in effect on the date of 59 publication of this document. Please review these documents 60 carefully, as they describe your rights and restrictions with respect 61 to this document. Code Components extracted from this document must 62 include Simplified BSD License text as described in Section 4.e of 63 the Trust Legal Provisions and are provided without warranty as 64 described in the Simplified BSD License. 66 Table of Contents 68 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 69 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3 70 3. Operational issues with default LFA tie breakers . . . . . . 4 71 3.1. Case 1: PE router protecting failures within core network 4 72 3.2. Case 2: PE router chosen to protect core failures while 73 P router LFA exists . . . . . . . . . . . . . . . . . . . 5 74 3.3. Case 3: suboptimal P router alternate choice . . . . . . 6 75 3.4. Case 4: IS-IS overload bit on LFA computing node . . . . 7 76 4. Need for coverage monitoring . . . . . . . . . . . . . . . . 8 77 5. Need for LFA activation granularity . . . . . . . . . . . . . 9 78 6. Configuration requirements . . . . . . . . . . . . . . . . . 9 79 6.1. LFA enabling/disabling scope . . . . . . . . . . . . . . 9 80 6.2. Policy based LFA selection . . . . . . . . . . . . . . . 10 81 6.2.1. Connected vs remote alternates . . . . . . . . . . . 11 82 6.2.2. Mandatory criteria . . . . . . . . . . . . . . . . . 11 83 6.2.3. Enhanced criteria . . . . . . . . . . . . . . . . . . 12 84 6.2.4. Criteria evaluation . . . . . . . . . . . . . . . . . 12 85 6.2.5. Retrieving alternate path attributes . . . . . . . . 16 86 6.2.6. ECMP LFAs . . . . . . . . . . . . . . . . . . . . . . 21 87 7. Operational aspects . . . . . . . . . . . . . . . . . . . . . 22 88 7.1.
IS-IS overload bit on LFA computing node . . . . . . . . 22 89 7.2. Manual triggering of FRR . . . . . . . . . . . . . . . . 23 90 7.3. Required local information . . . . . . . . . . . . . . . 24 91 7.4. Coverage monitoring . . . . . . . . . . . . . . . . . . . 24 92 7.5. LFA and network planning . . . . . . . . . . . . . . . . 25 93 8. Security Considerations . . . . . . . . . . . . . . . . . . . 25 94 9. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 25 95 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 26 96 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 26 97 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 26 98 12.1. Normative References . . . . . . . . . . . . . . . . . . 26 99 12.2. Informative References . . . . . . . . . . . . . . . . . 26 100 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 27 102 1. Introduction 104 Following the first deployments of Loop Free Alternates (LFA), this 105 document provides feedback to the community about the management of 106 LFA. 108 Section 3 provides real use cases illustrating some limitations 109 and suboptimal behavior. 111 Section 5 proposes requirements for activation granularity and 112 policy based selection of the alternate. 114 Section 6 expresses requirements for the operational management of 115 LFA. 117 2. Definitions 119 o Per-prefix LFA : LFA computation and best alternate evaluation are 120 done for each destination prefix, as opposed to the "Per-next hop" 121 simplification also proposed in [RFC5286] Section 3.8. 123 o PE router : Provider Edge router. These routers connect 124 customers. 126 o P router : Provider router. These routers are core routers, 127 without customer connections. They provide transit between PE 128 routers and they form the core network. 130 o Core network : subset of the network composed of P routers and 131 the links between them. 133 o Core link : network link that is part of the core network, i.e.
a P router 134 to P router link. 136 o Link-protecting LFA : alternate providing protection against link 137 failure. 139 o Node-protecting LFA : alternate providing protection against node 140 failure. 142 o Connected alternate : alternate adjacent (at IGP level) to the 143 point of local repair (i.e. an IGP neighbor). 145 o Remote alternate : alternate which does not share an IGP 146 adjacency with the point of local repair. 148 3. Operational issues with default LFA tie breakers 150 [RFC5286] introduces the notion of tie breakers when selecting the 151 LFA among multiple candidate alternate next-hops. When multiple LFAs 152 exist, RFC 5286 favors the selection of the LFA providing the 153 best coverage of the failure cases. While this is indeed a goal, 154 it is one among several, and in some deployments it leads to the 155 selection of a suboptimal LFA. The following sections detail real 156 use cases of such limitations. 158 Note that the use case of LFA computation per destination (per-prefix 159 LFA) is assumed throughout this analysis. We also assume in the 160 network figures that all IP prefixes are advertised with zero cost. 162 3.1. Case 1: PE router protecting failures within core network 164 P1 --------- P2 ---------- P3 --------- P4 165 | 1 100 1 | 166 | | 167 | 100 | 100 168 | | 169 | 1 100 1 | 1 5k 170 P5 --------- P6 ---------- P7 --------- P8 --- P9 -- PE1 171 | | | | | | 172 5k| |5k 5k| |5k | 5k | 5k 173 | | | | | | 174 | +-- PE4 --+ | +---- PE2 ----+ 175 | | | 176 +---- PE5 ----+ | 5k 177 | 178 PE3 180 Figure 1 182 Px routers are P routers using n*10G links. PEs are connected using 183 links with lower bandwidth. 185 In figure 1, let us consider the traffic flowing from PE1 to PE4. 186 The nominal path is P9-P8-P7-P6-PE4. Let us consider the failure of 187 link P7-P8. For P8, P4 is not an LFA and the only available LFA is 188 PE2. 190 When the core link P8-P7 fails, P8 switches all traffic destined to 191 PE4/PE5 towards the node PE2.
Hence a PE node and PE links are used 192 to protect against the failure of a core link. Typically, PE links have less 193 capacity than core links and congestion may occur on PE2 links. Note 194 that although PE2 was not directly affected by the failure, its links 195 become congested and its traffic will suffer from the congestion. 197 In summary, in case of P8-P7 link failure, the impact on customer 198 traffic is: 200 o From PE2's point of view: 202 * without LFA: no impact 204 * with LFA: traffic is partially dropped (but possibly 205 prioritized by a QoS mechanism). It must be highlighted that 206 in such a situation, traffic not affected by the failure may be 207 affected by the congestion. 209 o From P8's point of view: 211 * without LFA: traffic is totally dropped until convergence 212 occurs. 214 * with LFA: traffic is partially dropped (but possibly 215 prioritized by a QoS mechanism). 217 Besides the congestion aspects of using an Edge router as an 218 alternate to protect a core failure, a service provider may consider 219 this a bad routing design and want to prevent it. 221 3.2. Case 2: PE router chosen to protect core failures while P router 222 LFA exists 223 P1 --------- P2 ------------ P3 -------- P4 224 | 1 100 | 1 | 225 | | | 226 | 100 | 30 | 30 227 | | | 228 | 1 50 50 | 10 | 1 5k 229 P5 --------- P6 --- P10 ---- P7 -------- P8 --- P9 -- PE1 230 | | | | \ | 231 5k| |5k 5k| |5k \ 5k | 5k 232 | | | | \ | 233 | +-- PE4 --+ | +---- PE2 ----+ 234 | | | 235 +---- PE5 ----+ | 5k 236 | 237 PE3 239 Figure 2 241 Px routers are P routers meshed with n*10G links. PEs are meshed 242 using links with lower bandwidth. 244 In figure 2, let us consider the traffic coming from PE1 to PE4. 245 The nominal path is P9-P8-P7-P10-P6-PE4. Let us consider the failure of 246 the link P7-P8. For P8, P4 is a link-protecting LFA and PE2 is a 247 node-protecting LFA. PE2 is chosen as best LFA due to its better 248 protection type.
Just like in case 1, this may lead to congestion on 249 PE2 links upon LFA activation. 251 3.3. Case 3: suboptimal P router alternate choice 252 +--- PE3 --+ 253 / \ 254 1000 / \ 1000 255 / \ 256 +----- P1 ---------------- P2 ----+ 257 | | 500 | | 258 | 10 | | | 10 259 | | | | 260 R5 | 10 | 10 R7 261 | | | | 262 | 10 | | | 10 263 | | 500 | | 264 +---- P3 ---------------- P4 -----+ 265 \ / 266 1000 \ / 1000 267 \ / 268 +--- PE1 ---+ 270 Figure 3 272 Px routers are P routers. P1-P2 and P3-P4 links are 1G links. All 273 other inter-Px links are 10G links. 275 In the figure above, let us consider the failure of link P1-P3. For 276 destination PE3, P3 has two possible alternates: 278 o P4, which is node-protecting 280 o R5, which is link-protecting 282 P4 is chosen as best LFA due to its better protection type. However, 283 it may not be desirable to use P4 for bandwidth capacity reasons. A 284 service provider may prefer to use high bandwidth links as the preferred 285 LFA. In this example, preferring shortest path over protection type 286 may achieve the expected behavior, but in cases where metrics do not 287 reflect bandwidth, it would not work and some other criteria would 288 need to be involved when selecting the best LFA. 290 3.4. Case 4: IS-IS overload bit on LFA computing node 291 P1 P2 292 | \ / | 293 50 | 50 \/ 50 | 50 294 | /\ | 295 PE1-+ +-- PE2 296 \ / 297 45 \ / 45 298 -PE3-+ 299 (OL set) 301 Figure 4 303 In the figure above, PE3 has its overload bit set (permanently, for 304 design reasons) and wants to protect traffic using LFA for destination 305 PE2. 307 On PE3, the loop-free condition is not satisfied: 100 !< 45 + 45. 308 PE1 is thus not considered as an LFA. However, thanks to the overload 309 bit set on PE3, we know that PE1 is loop-free, so PE1 is an LFA to 310 reach PE2. 312 When the overload condition is set on a node, LFA behavior must be 313 clarified. 315 4.
Need for coverage monitoring 317 As per [RFC6571], LFA coverage highly depends on the network 318 topology in use. Even if remote LFA ([RFC7490]) significantly extends the 319 coverage of the basic LFA specification, there are still cases 320 where protection would not be available. As network topologies are 321 constantly evolving (network extension, capacity additions, latency 322 optimization ...), the protection coverage may change. Because fast reroute 323 functionality may be critical for some services supported by the 324 network, a service provider must constantly know what protection 325 coverage is currently available on the network. Moreover, predicting 326 the protection coverage in case of network topology change is 327 mandatory. 329 Today, network simulation tools with what-if scenario 330 functionality are often used by service providers for the overall 331 network design (capacity, path optimization ...). Section 7.3, 332 Section 7.4 and Section 7.5 of this document propose adding LFA 333 information to such tools and within routers, so that a service provider 334 is able to: 336 o evaluate protection coverage after a topology change. 338 o adjust the topology change to cover the primary need (e.g. 339 latency optimization or bandwidth increase) as well as LFA 340 protection. 342 o constantly monitor the LFA coverage in the live network and be 343 alerted. 345 Implementers SHOULD document their LFA selection algorithms (default 346 and tuning options) in order to leave the possibility for third-party 347 modules to model these policy-LFA expressions. 349 5. Need for LFA activation granularity 351 Like all FRR mechanisms, LFA installs backup paths in the Forwarding 352 Information Base (FIB). Depending on the hardware used by a service 353 provider, FIB resources may be critical. Activating LFA, by default, 354 on all available components (IGP topologies, interface, address 355 families ...)
may waste FIB resources, as generally in a 356 network only a few destinations should be protected (e.g. loopback 357 addresses supporting MPLS services) compared to the number of 358 destinations in the RIB. 360 Moreover, a service provider may implement multiple different FRR 361 mechanisms in its networks for different usages (MRT, TE FRR). In 362 this scenario, an implementation MAY allow alternates to be computed for 363 a specific destination even if the destination is already protected 364 by another mechanism. This brings redundancy and gives the operator 365 the ability to select the best option for FRR using a policy 366 language. 368 Section 6 of this document proposes some implementation guidelines. 370 6. Configuration requirements 372 Controlling best alternate and LFA activation granularity is a 373 requirement for Service Providers. This section defines 374 configuration requirements for LFA. 376 6.1. LFA enabling/disabling scope 378 The granularity of LFA activation should be controlled (as alternate 379 next hops consume memory in the forwarding plane). 381 An implementation of LFA SHOULD allow its activation with the 382 following criteria: 384 o Per routing context: VRF, virtual/logical router, global routing 385 table, ... 387 o Per interface 389 o Per protocol instance, topology, area 391 o Per prefix: prefix protection SHOULD have a higher priority 392 than interface protection. This means that if a specific 393 prefix must be protected due to a configuration request, LFA must 394 be computed and installed for this prefix even if the primary 395 outgoing interface is not configured for protection. 397 An implementation of LFA MAY allow its activation with the following 398 criteria: 400 o Per address-family: IPv4 unicast, IPv6 unicast 402 o Per MPLS control plane: for MPLS control planes that inherit 403 routing decisions from the IGP routing protocol, the MPLS data plane may 404 be protected by LFA.
The implementation may allow the operator to 405 control this inheritance of protection from the IP prefix to the 406 MPLS label bound to this prefix. The protection inheritance will 407 concern: IP to MPLS, MPLS to MPLS, and MPLS to IP entries. As an 408 example, LDP and segment-routing extensions for IS-IS and OSPF are 409 control planes eligible for this inheritance of protection. 411 6.2. Policy based LFA selection 413 When multiple alternates exist, the LFA selection algorithm is based on 414 tie breakers. Current tie breakers do not provide sufficient control 415 over how the best alternate is chosen. This document proposes an 416 enhanced tie breaker allowing service providers to manage all 417 specific cases: 419 1. An implementation of LFA SHOULD support policy-based decision for 420 determining the best LFA. 422 2. Policy based decision SHOULD be based on multiple criteria, 423 with each criterion having a level of preference. 425 3. If the defined policy does not allow a unique best 426 LFA to be determined, an implementation SHOULD pick only one based on its own 427 decision, as a default behavior. An implementation SHOULD also 428 support election of multiple LFAs, for load-balancing purposes. 430 4. Policy SHOULD be applicable to a protected interface or to a 431 specific set of destinations. In case of application on the 432 protected interface, all destinations primarily routed on this 433 interface SHOULD use the interface policy. 435 5. It is an implementation choice to reevaluate policy dynamically 436 or not (in case of policy change). If a dynamic approach is 437 chosen, the implementation SHOULD recompute the best LFAs and 438 reinstall them in the FIB, without service disruption. If a non- 439 dynamic approach is chosen, the policy would be taken into 440 account upon the next IGP event. In this case, the 441 implementation SHOULD support a command to manually force the 442 recomputation/reinstallation of LFAs. 444 6.2.1.
Connected vs remote alternates 446 In addition to connected LFAs, tunnels (e.g. IP, LDP, RSVP-TE or 447 Segment Routing) to distant routers may be used to complement LFA 448 coverage (the tunnel tail is used as a virtual neighbor). When a router has 449 multiple alternate candidates for a specific destination, it may have 450 connected alternates and remote alternates (reachable via a tunnel). 451 Connected alternates may not always provide an optimal routing path 452 and it may be preferable to select a remote alternate over a 453 connected alternate. Some usage of tunnels to extend LFA ([RFC5286]) 454 coverage is described in [RFC7490] and 455 [I-D.francois-segment-routing-ti-lfa]. These documents present some 456 use cases of LDP tunnels ([RFC7490]) or Segment Routing tunnels 457 ([I-D.francois-segment-routing-ti-lfa]). This document considers any 458 type of tunneling technique to reach remote alternates (IP, GRE, 459 LDP, RSVP-TE, L2TP, Segment Routing ...) and does not restrict the 460 remote alternates to the usage presented in the referenced documents. 462 In figure 1, there is no P router alternate for P8 to reach PE4 or 463 PE5, so P8 is using PE2 as alternate, which may generate congestion 464 when FRR is activated. Instead, we could have a remote alternate for 465 P8 to protect traffic to PE4 and PE5. For example, a tunnel from P8 466 to P3 (following shortest path) can be set up and P8 would be able to 467 use P3 as remote alternate to protect traffic to PE4 and PE5. In 468 this scenario, traffic will not use a PE link during FRR activation. 470 When selecting the best alternate, the selection algorithm MUST 471 consider all available alternates (connected or tunnel). For example, 472 with Remote LFA, computation of the PQ set ([RFC7490]) SHOULD be 473 performed before best alternate selection. 475 6.2.2.
Mandatory criteria 477 An implementation of LFA MUST support the following criteria: 479 o Non candidate link: A link marked as "non candidate" will never be 480 used as LFA. 482 o A primary next hop being protected by another primary next hop of 483 the same prefix (ECMP case). 485 o Type of protection provided by the alternate: link protection, 486 node protection. In case of node protection preference, an 487 implementation SHOULD support fall back to link protection if node 488 protection is not available. 490 o Shortest path: lowest IGP metric used to reach the destination. 492 o SRLG (as defined in [RFC5286] Section 3, see also Section 6.2.4.1 493 for more details). 495 6.2.3. Enhanced criteria 497 An implementation of LFA SHOULD support the following enhanced 498 criteria: 500 o Downstreamness of an alternate: preference of a downstream path 501 over a non downstream path SHOULD be configurable. 503 o Link coloring with: include, exclude and preference based system 504 (see Section 6.2.4.2). 506 o Link Bandwidth (see Section 6.2.4.3). 508 o Alternate preference/Node coloring (see Section 6.2.4.4). 510 6.2.4. Criteria evaluation 512 6.2.4.1. SRLG 514 [RFC5286] Section 3 proposes to reuse GMPLS IGP extensions to encode 515 SRLGs ([RFC4205] and [RFC4203]). The section also describes the 516 algorithm to compute SRLG protection. 518 When SRLG protection is computed, an implementation SHOULD allow the operator to 519 : 521 o Exclude alternates violating SRLG. 523 o Maintain a preference system between alternates based on SRLG 524 violations. How the preference system is implemented is out of 525 scope of this document but here are a few examples: 527 * Preference based on the number of violations. In this case, the 528 more violations, the less preferred. 530 * Preference based on violation cost. In this case, each SRLG 531 violation has an associated cost. The lower violation cost sum 532 is preferred.
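The two example preference systems above can be sketched in a few lines. This is a minimal, non-normative illustration: the function names, the per-SRLG cost table, the default cost of 1, and the sample SRLG values are all assumptions made for this sketch and are not defined by this document.

```python
from typing import Dict, Iterable, List, Set


def srlg_violations(primary: Iterable[str], alternate: Iterable[str]) -> Set[str]:
    """Return the SRLGs that an alternate path shares with the primary path."""
    return set(primary) & set(alternate)


def violation_count(primary: Iterable[str], alternate: Iterable[str]) -> int:
    """Preference based on the number of violations (lower is better)."""
    return len(srlg_violations(primary, alternate))


def violation_cost(primary: Iterable[str], alternate: Iterable[str],
                   cost: Dict[str, int], default: int = 1) -> int:
    """Preference based on the sum of per-SRLG violation costs (lower is better)."""
    return sum(cost.get(s, default) for s in srlg_violations(primary, alternate))


def rank_by_count(primary: Iterable[str],
                  alternates: Dict[str, Iterable[str]]) -> List[str]:
    """Order alternate names from most to least preferred by violation count."""
    primary = list(primary)
    return sorted(alternates,
                  key=lambda n: (violation_count(primary, alternates[n]), n))


def rank_by_cost(primary: Iterable[str], alternates: Dict[str, Iterable[str]],
                 cost: Dict[str, int]) -> List[str]:
    """Order alternate names from most to least preferred by violation cost."""
    primary = list(primary)
    return sorted(alternates,
                  key=lambda n: (violation_cost(primary, alternates[n], cost), n))


if __name__ == "__main__":
    # Hypothetical data: SRLGs of the primary path and of three alternates.
    primary = ["SRLG10", "SRLG6", "SRLG7"]
    alternates = {"A": ["SRLG10"], "B": ["SRLG6", "SRLG7"], "C": ["SRLG4"]}
    print(rank_by_count(primary, alternates))                 # ['C', 'A', 'B']
    print(rank_by_cost(primary, alternates, {"SRLG10": 10}))  # ['C', 'B', 'A']
```

The sample data shows why the two systems can disagree: alternate A has a single violation, but on an SRLG given a high cost, so it ranks ahead of B by count and behind B by cost. Excluding violating alternates altogether corresponds to dropping every alternate whose violation set is non-empty.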
534 When applying SRLG criteria, the SRLG violation check SHOULD be 535 performed on source to alternate as well as alternate to destination 536 paths based on the SRLG set of the primary path. In the case of 537 remote LFA, PQ to destination path attributes would be retrieved from 538 the SPT rooted at PQ. 540 6.2.4.2. Link coloring 542 Link coloring is a powerful system to control the choice of 543 alternates. Protecting interfaces are tagged with colors. Protected 544 interfaces are configured to include some colors with a preference 545 level, and exclude others. 547 Link color information SHOULD be signalled in the IGP. How 548 signalling is done is out of scope of the document but it may be 549 useful to reuse existing admin-groups from traffic-engineering 550 extensions or link attributes extensions like in 551 [I-D.ietf-ospf-prefix-link-attr]. 553 PE2 554 | +---- P4 555 | / 556 PE1 ---- P1 --------- P2 557 | 10Gb 558 1Gb | 559 | 560 P3 562 Figure 8 564 Example: P1 router is connected to three P routers and two PEs. 566 P1 is configured to protect the P1-P4 link. We assume that given the 567 topology, all neighbors are candidate LFAs. We would like to enforce 568 a policy in the network where only a core router may protect against 569 the failure of a core link, and where high capacity links are 570 preferred. 572 In this example, we can use the proposed link coloring by: 574 o Marking PE links with color RED 576 o Marking 10Gb CORE link with color BLUE 577 o Marking 1Gb CORE link with color YELLOW 579 o Configuring the protected interface P1->P4 with: 581 * Include BLUE, preference 200 583 * Include YELLOW, preference 100 585 * Exclude RED 587 Using this, PE links will never be used to protect against P1-P4 link 588 failure and the 10Gb link will be preferred. 590 The main advantage of this solution is that it can easily be 591 duplicated on other interfaces and other nodes without change.
A 592 Service Provider has only to define the color system (associate each color 593 with a meaning), as is already done for TE affinities or BGP 594 communities. 596 An implementation of link coloring: 598 o SHOULD support multiple include and exclude colors on a single 599 protected interface. 601 o SHOULD provide a level of preference between included colors. 603 o SHOULD support multiple colors configuration on a single 604 protecting interface. 606 6.2.4.3. Bandwidth 608 As mentioned in previous sections, not taking the bandwidth 609 of an alternate into account could lead to congestion during FRR activation. We 610 propose to base the bandwidth criterion on the link speed information 611 for the following reasons: 613 o if a router S has a set of X destinations primarily forwarded to N, 614 using per-prefix LFA may lead to a subset of X being protected by a 615 neighbor N1, another subset by N2, another subset by Nx ... 617 o S is not aware of traffic flows to each destination and is not 618 able to evaluate how much traffic will be sent to N1,N2, ... Nx in 619 case of FRR activation. 621 Based on this, it is not useful to gather available bandwidth on 622 alternate paths, as the router does not know how much bandwidth it 623 requires for protection. The proposed link speed approach provides a 624 good approximation at a small cost, as the information is easily 625 available. 627 The bandwidth criterion of the policy framework SHOULD work in at 628 least two ways: 630 o PRUNE : exclude an LFA if the link speed to reach it is lower than the 631 link speed of the primary next hop interface. 633 o PREFER : prefer an LFA based on the link speed to reach it compared 634 to the link speed of the primary next hop interface. 636 6.2.4.4. Alternate preference/Node coloring 638 Rather than tagging interfaces on each node (using link color) to 639 identify alternate node type (as an example), it would be helpful if 640 routers could be identified in the IGP.
This would permit grouped 641 processing on multiple nodes. As an implementation needs to exclude 642 some specific alternates (see Section 6.2.3), an implementation: 644 o SHOULD be able to give a preference to a specific alternate. 646 o SHOULD be able to give a preference to a group of alternates. 648 o SHOULD be able to exclude a group of alternates. 650 A specific alternate may be identified by its interface, IP address 651 or router ID, and a group of alternates may be identified by a marker 652 (tag) (for example, the following IGP extensions can be used: 653 [I-D.ietf-isis-node-admin-tag], [I-D.ietf-ospf-node-admin-tag], 654 [I-D.ietf-isis-prefix-attributes], [I-D.ietf-ospf-prefix-link-attr] 655 ). Using a tag is referred to as Node coloring in comparison to the link 656 coloring option presented in Section 6.2.4.2. 658 Consider the following network: 660 PE3 661 | 662 | 663 PE2 664 | +---- P4 665 | / 666 PE1 ---- P1 -------- P2 667 | 10Gb 668 1Gb | 669 | 670 P3 672 Figure 9 674 In the example above, each node is configured with a specific tag 675 flooded through the IGP. 677 o PE1,PE3: 200 (non candidate). 679 o PE2: 100 (edge/core). 681 o P1,P2,P3: 50 (core). 683 A simple policy could be configured on P1 to choose the best 684 alternate for P1->P4 based on router function/role as follows: 686 o criteria 1 -> alternate preference: exclude tag 100 and 200. 688 o criteria 2 -> bandwidth. 690 6.2.5. Retrieving alternate path attributes 692 6.2.5.1. Alternate path 694 The alternate path is composed of two distinct parts: PLR to 695 alternate and alternate to destination. 697 N1 -- R1 ---- R2 698 /50 \ \ 699 / R3 --- R4 700 / \ 701 S -------- E ------- D 702 \\ // 703 \\ // 704 N2 ---- PQ ---- R5 706 Figure 5 708 In the figure above, we consider a primary path from S to D, S using 709 E as primary nexthop. All metrics are 1 except {S,N1}=50.
Two 710 alternate paths are available: 712 o {S,N1,R1,R2|R3,R4,D} where N1 is a connected alternate: the path 713 is composed of PLR to alternate path which is {S,N1} and alternate 714 to destination path which is {N1,R1,R2|R3,R4,D}. 716 o {S,N2,PQ,R5,D} where PQ is a remote alternate: the path is 717 composed of PLR to alternate path which is {S,N2,PQ} and alternate 718 to destination path which is {PQ,R5,D}. 720 As displayed in the figure, some part of the alternate path may 721 fan out in multipath due to ECMP. 723 6.2.5.2. Alternate path attributes 725 Some criteria listed in the previous sections require 726 retrieving some characteristics of the alternate path (SRLG, bandwidth, 727 color, tag ...). We call these characteristics "path attributes". A 728 path attribute can record a list of node properties (e.g. node tag) 729 or link properties (e.g. link color). 731 This document defines two types of path attributes: 733 o Cumulative attribute: when a path attribute is cumulative, the 734 implementation SHOULD record the value of the attribute on each 735 element (link and node) along the alternate path. SRLG, link 736 color, and node color are cumulative attributes. 738 o Unitary attribute: when a path attribute is unitary, the 739 implementation SHOULD record the value of the attribute only on 740 the first element along the alternate path (first node, or first 741 link). Bandwidth is a unitary attribute. 743 N1 -- R1 ---- R2 744 / \ 745 / 50 R4 746 / \ 747 S -------- E ------- D 749 In the figure above, N1 is a connected alternate to reach D from S. 750 We consider that all links have a RED color except {R1,R2} which is 751 BLUE. We consider all links to be 10Gbps, except {N1,R1} which is 752 2.5Gbps. The bandwidth attribute collected for the alternate path 753 will be 10Gbps. As the attribute is unitary, only the link speed of 754 the first link {S,N1} is recorded.
The link color attribute 755 collected for the alternate path will be {RED,RED,BLUE,RED,RED}. As 756 the attribute is cumulative, the value of the attribute on each link 757 along the path is recorded. 759 6.2.5.3. Connected alternate 761 For alternate path using a connected alternate: 763 o attributes from PLR to alternate are retrieved from the interface 764 connected to the alternate. In case the alternate is connected 765 through multiple interfaces, the evaluation of attributes SHOULD 766 be done once per interface (each interface is considered as a 767 separate alternate) and once per ECMP group of interfaces. 769 o path attributes from alternate to destination are retrieved from 770 the SPF rooted at the alternate. As the alternate is a connected 771 alternate, the SPF has already been computed to find the 772 alternate, so there is no need for additional computation. 774 N1 -- R1 ---- R2 775 50//50 \ 776 // \ 777 i1//i2 \ 778 S -------- E -------- D 780 Figure 6 782 In the figure above, we consider a primary path from S to D, S using 783 E as primary nexthop. All metrics are considered as 1 except {S,N1} 784 links which are using a metric of 50. We consider the following SRLG 785 groups on links: 787 o {S,N1} using i1 : SRLG1,SRLG10 789 o {S,N1} using i2 : SRLG2,SRLG20 790 o {N1,R1} : SRLG3 792 o {R1,R2} : SRLG4 794 o {R2,D} : SRLG5 796 o {S,E} : SRLG10 798 o {E,D} : SRLG6 800 S is connected to the alternate using two interfaces i1 and i2. 802 If i1 and i2 are not part of an ECMP group, the evaluation of 803 attributes is done once per interface, and each interface is 804 considered as a separate alternate path. Two alternate paths will be 805 available with the associated SRLG attributes: 807 o Alternate path #1 : {S,N1 using i1,R1,R2,D}: 808 SRLG1,SRLG10,SRLG3,SRLG4,SRLG5. 810 o Alternate path #2 : {S,N1 using i2,R1,R2,D}: 811 SRLG2,SRLG20,SRLG3,SRLG4,SRLG5.
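The per-interface SRLG collection for Figure 6 can be reproduced with a short sketch. Only the SRLG values are taken from the example above; the table layout, the link keys, and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# SRLG membership of each link in Figure 6. The two parallel {S,N1} links are
# distinguished by the local interface name (i1 or i2).
LINK_SRLGS: Dict[Tuple[str, ...], List[str]] = {
    ("S", "N1", "i1"): ["SRLG1", "SRLG10"],
    ("S", "N1", "i2"): ["SRLG2", "SRLG20"],
    ("N1", "R1"): ["SRLG3"],
    ("R1", "R2"): ["SRLG4"],
    ("R2", "D"): ["SRLG5"],
    ("S", "E"): ["SRLG10"],
    ("E", "D"): ["SRLG6"],
}


def collect_srlgs(links: List[Tuple[str, ...]]) -> List[str]:
    """Cumulative attribute: record the SRLGs of every link along the path,
    in path order and without duplicates."""
    collected: List[str] = []
    for link in links:
        for srlg in LINK_SRLGS[link]:
            if srlg not in collected:
                collected.append(srlg)
    return collected


def shared_risks(alternate: List[str], primary: List[str]) -> List[str]:
    """SRLGs present on both the alternate path and the primary path."""
    return [srlg for srlg in alternate if srlg in primary]


# Alternate-to-destination part, common to both alternate paths.
TAIL = [("N1", "R1"), ("R1", "R2"), ("R2", "D")]

primary = collect_srlgs([("S", "E"), ("E", "D")])
path1 = collect_srlgs([("S", "N1", "i1")] + TAIL)  # alternate path #1
path2 = collect_srlgs([("S", "N1", "i2")] + TAIL)  # alternate path #2

if __name__ == "__main__":
    print(path1, shared_risks(path1, primary))
    print(path2, shared_risks(path2, primary))
```

Comparing each collected list with the SRLG set of the primary path ({S,E} and {E,D}) is what allows a policy to distinguish the two alternate paths.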
813 Alternate path #1 shares risks with the primary path and may be 814 depreferred or pruned by user-defined policy. 816 If i1 and i2 are part of an ECMP group, the evaluation of attributes 817 is done once per ECMP group, and the implementation considers a 818 single alternate path {S,N1 using i1|i2,R1,R2,D} with the following 819 SRLG attributes: SRLG1,SRLG10,SRLG2,SRLG20,SRLG3,SRLG4,SRLG5. 820 The alternate path shares risks with the primary path and may be 821 depreferred or pruned by user-defined policy. 823 6.2.5.4. Remote alternate 825 For an alternate path using a remote alternate (tunnel): 827 o attributes from the PLR to the alternate are retrieved using the 828 PLR's primary SPF if P space is used or using the neighbor's SPF 829 if extended P space is used, combined with the attributes of the 830 link(s) to reach that neighbor. In both cases, no additional SPF 831 is required. 833 o attributes from the alternate to the destination may be retrieved 834 from the SPF rooted at the remote alternate. An additional forward 835 SPF is required for each remote alternate as indicated in 836 [I-D.ietf-rtgwg-rlfa-node-protection] Section 3.2. In some remote 837 alternate scenarios, like [I-D.francois-segment-routing-ti-lfa], 838 alternate-to-destination path attributes may be obtained using a 839 different technique. 841 The number of remote alternates may be very high. In the case of 842 remote LFA, simulations of real-world network topologies have shown 843 that on the order of hundreds of PQs may be possible. The computational 844 overhead to collect all path attributes of all PQ-to-destination 845 paths may grow beyond practical reason. 847 To handle this situation, it is necessary to limit the number of remote 848 alternates to be evaluated to a finite number before collecting 849 alternate path attributes and running the policy evaluation. 850 [I-D.ietf-rtgwg-rlfa-node-protection] Section 2.3.3 provides a way to 851 reduce the number of PQs to be evaluated.
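The limiting step described above might look like the following Python sketch. It is only an illustration: the ranking key (distance from the PLR, then node identifier as a tie-breaker) and the names PQCandidate, prune_pq_candidates, and max_candidates are assumptions made for this example and are not the procedure of [I-D.ietf-rtgwg-rlfa-node-protection].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PQCandidate:
    node_id: str            # identifier of the PQ node (hypothetical field)
    distance_from_plr: int  # IGP cost from the PLR to the PQ node


def prune_pq_candidates(candidates, max_candidates):
    """Keep at most max_candidates PQ nodes before attribute collection.

    Hypothetical sorting strategy: closest to the PLR first, with node_id
    as a deterministic tie-breaker so repeated runs keep the same subset.
    """
    if max_candidates < 0:
        raise ValueError("max_candidates must be non-negative")
    ranked = sorted(candidates, key=lambda c: (c.distance_from_plr, c.node_id))
    return ranked[:max_candidates]


candidates = [
    PQCandidate("PQ3", 30),
    PQCandidate("PQ1", 10),
    PQCandidate("PQ4", 10),
    PQCandidate("PQ2", 20),
]
kept = prune_pq_candidates(candidates, max_candidates=2)
print([c.node_id for c in kept])
```

Only the retained candidates would then need the additional forward SPF and path attribute collection before policy evaluation.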
853 Some other remote alternate techniques using static or dynamic 854 tunnels may not require this pruning.
856                 Link             Remote           Remote
857               alternate        alternate         alternate
858             -------------  ------------------  -------------
859 Alternates  |  LFA      |  |  rLFA (PQs)    |  | Static/   |
860             |           |  |                |  | Dynamic   |
861 sources     |           |  |                |  | tunnels   |
862             -------------  ------------------  -------------
863                   |                |                 |
864                   |                |                 |
865                   |   --------------------------     |
866                   |   | Prune some alternates  |     |
867                   |   | (sorting strategy)     |     |
868                   |   --------------------------     |
869                   |                |                 |
870                   |                |                 |
871             ------------------------------------------------
872             |         Collect alternate attributes         |
873             ------------------------------------------------
874                                    |
875                                    |
876                        -------------------------
877                        |    Evaluate policy    |
878                        -------------------------
879                                    |
880                                    |
881                             Best alternates
883 6.2.5.5. Collecting attributes in case of multipath 885 As described in Section 6.2.5, there may be situations where an 886 alternate path or part of an alternate path fans out to multiple 887 paths (e.g. ECMP). When collecting path attributes in such a case, an 888 implementation SHOULD consider the union of attributes of each 889 sub-path. 891 In Figure 5 (in Section 6.2.5), S has two alternate paths to 892 reach D. Each alternate path fans out into multipath due to ECMP. 893 Consider the following link color attributes: all links are RED 894 except {R1,R3}, which is BLUE. The user wants to use an alternate 895 path with only RED links. The first alternate path 896 {S,N1,R1,R2|R3,R4,D} does not fit the constraint, as {R1,R3} is BLUE. 897 The second alternate path {S,N2,PQ,R5,D} fits the constraint and will 898 be preferred as it uses only RED links. 900 6.2.6.
ECMP LFAs
902           10
903        PE2 - PE3
904         |     |
905      50 |  5  | 50
906         P1----P2
907         \\   //
908      50  \\ // 50
909           PE1
911 Figure 7 913 Links between P1 and PE1 are L1 and L2; links between P2 and PE1 are 914 L3 and L4. 916 In the figure above, the primary path from PE1 to PE2 is through P1 using 917 ECMP on two parallel links L1 and L2. In case of standard ECMP 918 behavior, if L1 fails, the post-convergence next hop would become L2 919 and there would no longer be ECMP. If LFA is activated, as stated in 920 [RFC5286] Section 3.4, "alternate next-hops may themselves also be 921 primary next-hops, but need not be" and "alternate next-hops should 922 maximize the coverage of the failure cases". In this scenario, no 923 alternate provides node protection, so LFA will prefer L2 as the 924 alternate to protect L1, which makes sense compared to post-convergence 925 behavior. 927 Consider a different scenario using Figure 7, where L1 and L2 are 928 configured as a layer 3 bundle using a local feature, and L3/L4 929 as a second layer 3 bundle. Layer 3 bundles are configured so that 930 if a link in the bundle fails, the traffic must be rerouted out 931 of the bundle. Layer 3 bundles are generally introduced to increase 932 bandwidth between nodes. In the nominal situation, ECMP is still 933 available from PE1 to PE2, but if L1 fails, the post-convergence next 934 hop would become ECMP on L3 and L4. In this case, LFA behavior 935 SHOULD be adapted in order to reflect the bandwidth requirement. 937 We would expect the following FIB entry on PE1:
939 On PE1 : PE2 +--> ECMP -> L1
940              |      |
941              |      +----> L2
942              |
943              +--> LFA(ECMP) -> L3
944                       |
945                       +---------> L4
947 If L1 or L2 fails, traffic must be switched to the LFA ECMP 948 bundle rather than using the other primary next hop. 950 As mentioned in [RFC5286] Section 3.4, protecting a link within an 951 ECMP by another primary next hop is not a MUST.
Moreover, as already 952 discussed in this document, maximizing the coverage of the 953 failure case may not be the right approach, and a policy-based choice of 954 alternate may be preferred. 956 An implementation SHOULD allow preferring the protection of a primary next 957 hop by another primary next hop. An implementation SHOULD allow 958 preferring the protection of a primary next hop by a non-primary next hop. An 959 implementation SHOULD allow the use of an ECMP bundle as an LFA. 961 7. Operational aspects 963 7.1. IS-IS overload bit on LFA computing node 965 In [RFC5286], Section 3.5, the setting of the overload bit condition 966 in LFA computation is only taken into account for the case where a 967 neighbor has the overload bit set. 969 In addition to RFC 5286 inequality 1 Loop-Free Criterion 970 (Distance_opt(N, D) < Distance_opt(N, S) + Distance_opt(S, D)), the 971 IS-IS overload bit of the LFA computing node (S) SHOULD be 972 taken into account. Indeed, if it has the overload bit set, no 973 neighbor will loop traffic back to it. 975 7.2. Manual triggering of FRR 977 Service providers often perform manual link shutdown (using router 978 CLI) to perform some network changes/tests. A manual link shutdown 979 may be done at multiple levels: physical interface, logical 980 interface, IGP interface, BFD session ... In particular, testing or 981 troubleshooting FRR requires performing the manual shutdown on the 982 remote end of the link, as a local shutdown generally would not 983 trigger FRR. 985 To improve this situation, an implementation SHOULD support 986 triggering/activating LFA Fast Reroute for a given link when a manual 987 shutdown is done on a component that currently supports FRR 988 activation. 990 An implementation MAY also support FRR activation for a specific 991 interface or a specific prefix on a primary next-hop interface and 992 revert without any action on any running component of the node (links 993 or protocols).
In this use case, the FRR activation time needs to be 994 controlled by a timer in case the operator forgets to revert traffic 995 to the primary path. When the timer expires, the traffic is 996 automatically reverted to the primary path. This makes it easier 997 to test the fast-reroute path and then revert to the primary path 998 without causing a global network convergence. 1000 For example: 1002 o if an implementation supports FRR activation upon BFD session down 1003 event, this implementation SHOULD support FRR activation when a 1004 manual shutdown is done on the BFD session. But if an 1005 implementation does not support FRR activation on BFD session 1006 down, there is no need for this implementation to support FRR 1007 activation on manual shutdown of BFD session. 1009 o if an implementation supports FRR activation on physical link down 1010 event (e.g. Rx laser Off detection, or error threshold raised 1011 ...), this implementation SHOULD support FRR activation when a 1012 manual shutdown at physical interface is done. But if an 1013 implementation does not support FRR activation on physical link 1014 down event, there is no need for this implementation to support 1015 FRR activation on manual physical link shutdown. 1017 o a CLI command may allow switching from the primary path to the FRR path 1018 to test the FRR path for a specific interface or prefix. There is no impact on 1019 the control plane; only the data plane of the local node may be changed. 1020 A similar command may allow switching traffic back from the FRR path 1021 to the primary path. 1023 7.3. Required local information 1025 Introducing LFA requires some enhancements to the standard routing 1026 information provided by implementations. Moreover, because coverage is 1027 not 100%, coverage information is also required. 1029 Hence an implementation: 1031 o MUST be able to display, for every prefix, the primary next hop 1032 as well as the alternate next hop information.
1034 o MUST provide coverage information per activation domain of LFA 1035 (area, level, topology, instance, virtual router, address family 1036 ...). 1038 o MUST provide the number of protected prefixes as well as non-protected 1039 prefixes globally. 1041 o SHOULD provide the number of protected prefixes as well as 1042 non-protected prefixes per link. 1044 o MAY provide the number of protected prefixes as well as non-protected 1045 prefixes per priority if the implementation supports prefix-priority 1046 insertion in RIB/FIB. 1048 o SHOULD provide a reason for choosing an alternate (policy and 1049 criteria) and for excluding an alternate. 1051 o SHOULD provide the list of non-protected prefixes and the reason 1052 why they are not protected (no protection required or no alternate 1053 available). 1055 7.4. Coverage monitoring 1057 It is fairly easy to evaluate the coverage of a network in a nominal 1058 situation, but topology changes may change the coverage. In some 1059 situations, the network may no longer be able to provide the required 1060 level of protection. Hence, it becomes very important for service 1061 providers to be alerted about changes of coverage. 1063 An implementation SHOULD: 1065 o provide an alert system if total coverage (for a node) is below a 1066 defined threshold or comes back to a normal situation. 1068 o provide an alert system if coverage of a specific link is below a 1069 defined threshold or comes back to a normal situation. 1071 An implementation MAY: 1073 o provide an alert system if a specific destination is not protected 1074 anymore or when protection comes back up for this destination. 1076 Although the procedures for providing alerts are beyond the scope of 1077 this document, we recommend that implementations consider standard 1078 and widely used mechanisms like syslog or SNMP traps. 1080 7.5.
LFA and network planning 1082 The operator may choose to run simulations in order to ensure full 1083 coverage of a certain type for the whole network or a given subset of 1084 the network. This is particularly likely if the operator runs the network 1085 in the sense of the third backbone profile described in [RFC6571], 1086 that is, seeks to design and engineer the network topology in a 1087 way that a certain coverage is always achieved. Obviously, a complete 1088 and exact simulation of the IP FRR coverage can only be achieved if 1089 the behavior is deterministic and if the algorithm used is available 1090 to the simulation tool. Thus, an implementation SHOULD: 1092 o Behave deterministically in its LFA selection process, i.e., in the 1093 same topology and with the same policy configuration, the 1094 implementation MUST always choose the same alternate for a given 1095 prefix. 1097 o Document its behavior. The implementation SHOULD provide enough 1098 documentation of its behavior to allow an implementer of a 1099 simulation tool to foresee the exact choice of the LFA 1100 implementation for every prefix in a given topology. This SHOULD 1101 take into account all possible policy configuration options. One 1102 possible way to document this behavior is to disclose the 1103 algorithm used to choose alternates. 1105 8. Security Considerations 1107 This document does not introduce any change in security considerations 1108 compared to [RFC5286]. 1110 9. Contributors 1112 Significant contributions were made by Pierre Francois, Hannes 1113 Gredler, Chris Bowers, Jeff Tantsura, Uma Chunduri and Mustapha 1114 Aissaoui which the authors would like to acknowledge. 1116 10. Acknowledgements 1118 11. IANA Considerations 1120 This document has no action for IANA. 1122 12. References 1124 12.1. Normative References 1126 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1127 Requirement Levels", BCP 14, RFC 2119, March 1997. 1129 [RFC4203] Kompella, K. and Y.
Rekhter, "OSPF Extensions in Support 1130 of Generalized Multi-Protocol Label Switching (GMPLS)", 1131 RFC 4203, October 2005. 1133 [RFC4205] Kompella, K. and Y. Rekhter, "Intermediate System to 1134 Intermediate System (IS-IS) Extensions in Support of 1135 Generalized Multi-Protocol Label Switching (GMPLS)", RFC 1136 4205, October 2005. 1138 [RFC5286] Atlas, A. and A. Zinin, "Basic Specification for IP Fast 1139 Reroute: Loop-Free Alternates", RFC 5286, September 2008. 1141 [RFC5307] Kompella, K. and Y. Rekhter, "IS-IS Extensions in Support 1142 of Generalized Multi-Protocol Label Switching (GMPLS)", 1143 RFC 5307, October 2008. 1145 [RFC6571] Filsfils, C., Francois, P., Shand, M., Decraene, B., 1146 Uttaro, J., Leymann, N., and M. Horneffer, "Loop-Free 1147 Alternate (LFA) Applicability in Service Provider (SP) 1148 Networks", RFC 6571, June 2012. 1150 [RFC7490] Bryant, S., Filsfils, C., Previdi, S., Shand, M., and N. 1151 So, "Remote Loop-Free Alternate (LFA) Fast Reroute (FRR)", 1152 RFC 7490, April 2015. 1154 12.2. Informative References 1156 [I-D.francois-segment-routing-ti-lfa] 1157 Francois, P., Filsfils, C., Bashandy, A., and B. Decraene, 1158 "Topology Independent Fast Reroute using Segment Routing", 1159 draft-francois-segment-routing-ti-lfa-00 (work in 1160 progress), November 2013. 1162 [I-D.ietf-isis-node-admin-tag] 1163 Sarkar, P., Gredler, H., Hegde, S., Litkowski, S., 1164 Decraene, B., Li, Z., Aries, E., Rodriguez, R., and H. 1165 Raghuveer, "Advertising Per-node Admin Tags in IS-IS", 1166 draft-ietf-isis-node-admin-tag-02 (work in progress), June 1167 2015. 1169 [I-D.ietf-isis-prefix-attributes] 1170 Ginsberg, L., Decraene, B., Filsfils, C., Litkowski, S., 1171 Previdi, S., Xu, X., and U. Chunduri, "IS-IS Prefix 1172 Attributes for Extended IP and IPv6 Reachability", draft- 1173 ietf-isis-prefix-attributes-00 (work in progress), May 1174 2015. 
1176 [I-D.ietf-ospf-node-admin-tag] 1177 Hegde, S., Raghuveer, H., Gredler, H., Shakir, R., 1178 Smirnov, A., Li, Z., and B. Decraene, "Advertising per- 1179 node administrative tags in OSPF", draft-ietf-ospf-node- 1180 admin-tag-02 (work in progress), June 2015. 1182 [I-D.ietf-ospf-prefix-link-attr] 1183 Psenak, P., Gredler, H., Shakir, R., Henderickx, W., 1184 Tantsura, J., and A. Lindem, "OSPFv2 Prefix/Link Attribute 1185 Advertisement", draft-ietf-ospf-prefix-link-attr-06 (work 1186 in progress), June 2015. 1188 [I-D.ietf-rtgwg-rlfa-node-protection] 1189 Sarkar, P., Gredler, H., Hegde, S., Bowers, C., Litkowski, 1190 S., and H. Raghuveer, "Remote-LFA Node Protection and 1191 Manageability", draft-ietf-rtgwg-rlfa-node-protection-02 1192 (work in progress), June 2015. 1194 Authors' Addresses 1196 Stephane Litkowski (editor) 1197 Orange 1199 Email: stephane.litkowski@orange.com 1201 Bruno Decraene 1202 Orange 1204 Email: bruno.decraene@orange.com 1205 Clarence Filsfils 1206 Cisco Systems 1208 Email: cfilsfil@cisco.com 1210 Kamran Raza 1211 Cisco Systems 1213 Email: skraza@cisco.com 1215 Martin Horneffer 1216 Deutsche Telekom 1218 Email: Martin.Horneffer@telekom.de 1220 Pushpasis Sarkar 1221 Juniper Networks 1223 Email: psarkar@juniper.net