Network Working Group                                          S. Bryant
Internet-Draft                                               C. Filsfils
Intended status: Standards Track                           Cisco Systems
Expires: December 3, 2012                                       M. Shand
                                                 Independent Contributor
                                                                   N. So
                                                            Verizon Inc.
                                                            June 1, 2012


                             Remote LFA FRR
                       draft-shand-remote-lfa-01

Abstract

   This draft describes an extension to the basic IP fast re-route
   mechanism described in RFC 5286 that provides additional backup
   connectivity when none can be provided by the basic mechanisms.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 3, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1.  Terminology

   This draft uses the terms defined in [RFC5714].  This section defines
   additional terms used in this draft.

   Extended P-space
                  The union of the P-space of the neighbours of a
                  specific router with respect to the protected link.

   P-space        P-space is the set of routers reachable from a
                  specific router without any path (including equal cost
                  path splits) transiting the protected link.

                  For example, the P-space of S is the set of routers
                  that S can reach without using the protected link S-E.

   PQ node        A node which is a member of both the extended P-space
                  and the Q-space.

   Q-space        Q-space is the set of routers from which a specific
                  router can be reached without any path (including
                  equal cost path splits) transiting the protected link.

   Repair tunnel  A tunnel established for the purpose of providing a
                  virtual neighbor which is a Loop Free Alternate.

   Remote LFA     The tail-end of a repair tunnel.  This tail-end is a
                  member of both the extended P-space and the Q-space.
                  It is also termed a "PQ" node.

2.  Introduction

   RFC 5714 [RFC5714] describes a framework for IP Fast Re-route and
   provides a summary of various proposed IPFRR solutions.  A basic
   mechanism using loop-free alternates (LFAs) is described in [RFC5286]
   that provides good repair coverage in many topologies
   [I-D.filsfils-rtgwg-lfa-applicability], especially those that are
   highly meshed.  However, some topologies, notably ring-based
   topologies, are not well protected by LFAs alone.  This is
   illustrated in Figure 1 below.

                        S---E
                       /     \
                      A       D
                       \     /
                        B---C

                 Figure 1: A simple ring topology

   If all link costs are equal, the link S-E cannot be fully protected
   by LFAs.  The destination C is reachable from S by equal-cost
   multipath (ECMP), and so can be protected when S-E fails, but D and E
   cannot be protected using LFAs.

   This draft describes extensions to the basic repair mechanism in
   which tunnels are used to provide additional logical links which can
   then be used as loop-free alternates where none exist in the original
   topology.  For example, if a tunnel is provided between S and C as
   shown in Figure 2, then C, now being a direct neighbor of S, would
   become an LFA for D and E.
   The non-failure traffic distribution is not disrupted by the
   provision of such a tunnel, since it is only used for repair traffic
   and MUST NOT be used for normal traffic.

                        S---E
                       / \   \
                      A   \   D
                       \   \ /
                        B---C

                 Figure 2: The addition of a tunnel

   The use of this technique is not restricted to ring-based topologies,
   but is a general mechanism which can be used to enhance the
   protection provided by LFAs.

3.  Repair Paths

   As with LFA FRR, when a router detects an adjacent link failure, it
   uses one or more repair paths in place of the failed link.  Repair
   paths are pre-computed in anticipation of later failures so they can
   be promptly activated when a failure is detected.

   A tunneled repair path tunnels traffic to some staging point in the
   network from which it is assumed that, in the absence of multiple
   failures, it will travel to its destination using normal forwarding
   without looping back.  This is equivalent to providing a virtual
   loop-free alternate to supplement the physical loop-free alternates.
   Hence the name "Remote LFA FRR".  When a link cannot be entirely
   protected with local LFA neighbors, the protecting router seeks the
   help of a remote LFA staging point.

3.1.  Tunnels as Repair Paths

   Consider an arbitrary protected link S-E.  In LFA FRR, if a path to
   the destination from a neighbor N of S does not cause a packet to
   loop back over the link S-E (i.e. N is a loop-free alternate), then
   S can send the packet to N and the packet will be delivered to the
   destination using the pre-failure forwarding information.  If there
   is no such LFA neighbor, then S may be able to create a virtual LFA
   by using a tunnel to carry the packet to a point in the network which
   is not a direct neighbor of S and from which the packet will be
   delivered to the destination without looping back to S.  In this
   document such a tunnel is termed a repair tunnel.
   The tail-end of this tunnel is called a "remote LFA" or a "PQ node".

   Note that the repair tunnel terminates at some intermediate router
   between S and E, and not E itself.  This is clearly the case, since
   if it were possible to construct a tunnel from S to E then a
   conventional LFA would have been sufficient to effect the repair.

3.2.  Tunnel Requirements

   There are a number of IP in IP tunnel mechanisms that may be used to
   fulfil the requirements of this design, such as IP-in-IP [RFC1853]
   and GRE [RFC1701].

   In an MPLS enabled network using LDP [RFC5036], a simple label stack
   [RFC3032] may be used to provide the required repair tunnel.  In this
   case the outer label is S's neighbor's label for the repair tunnel
   end point, and the inner label is the repair tunnel end point's label
   for the packet destination.  In order for S to obtain the correct
   inner label it is necessary to establish a directed LDP session
   [RFC5036] to the tunnel end point.

   The selection of the specific tunnelling mechanism (and any necessary
   enhancements) used to provide a repair path is outside the scope of
   this document.  The authors simply note that deployment in an
   MPLS/LDP environment is extremely simple and straightforward, as an
   LDP LSP from S to the PQ node is readily available, and hence does
   not require any new protocol extension or design change.  This LSP is
   automatically established as a basic property of LDP behavior.  The
   performance of the encapsulation and decapsulation is also excellent,
   as encapsulation is just a push of one label (like conventional MPLS
   TE FRR) and the decapsulation occurs naturally at the penultimate hop
   before the PQ node.

   When a failure is detected, it is necessary to immediately redirect
   traffic to the repair path.  Consequently, the repair tunnel used
   must be provisioned beforehand in anticipation of the failure.
   Since the location of the repair tunnels is dynamically determined,
   it is necessary to establish the repair tunnels without management
   action.  Multiple repairs may share a tunnel end point.

4.  Construction of Repair Paths

4.1.  Identifying Required Tunneled Repair Paths

   Not all links will require protection using a tunneled repair path.
   If E can already be protected via an LFA, S-E does not need to be
   protected using a repair tunnel, since all destinations normally
   reachable through E must therefore also be protectable by an LFA.
   Such an LFA is frequently termed a "link LFA".  Tunneled repair paths
   are only required for links which do not have a link LFA.

4.2.  Determining Tunnel End Points

   The repair tunnel end point needs to be a node in the network
   reachable from S without traversing S-E.  In addition, the repair
   tunnel end point needs to be a node from which packets will normally
   flow towards their destination without being attracted back to the
   failed link S-E.

   Note that once released from the tunnel, the packet will be
   forwarded, as normal, on the shortest path from the release point to
   its destination.  This may result in the packet traversing the router
   E at the far end of the protected link S-E, but this is not required.

   The properties that are required of repair tunnel end points are
   therefore:

   o  The repair tunnel end point MUST be reachable from the tunnel
      source without traversing the failed link; and

   o  When released, tunneled packets MUST proceed towards their
      destination without being attracted back over the failed link.

   Provided both these requirements are met, packets forwarded over the
   repair tunnel will reach their destination and will not loop.

   In some topologies it will not be possible to find a repair tunnel
   end point that exhibits both the required properties.
   For example, if the ring topology illustrated in Figure 1 had a cost
   of 4 for the link B-C, while the remaining links were cost 1, then it
   would not be possible to establish a tunnel from S to C (without
   resorting to some form of source routing).

4.2.1.  Computing Repair Paths

   The set of routers which can be reached from S without traversing S-E
   is termed the P-space of S with respect to the link S-E.  The P-space
   can be obtained by computing a shortest path tree (SPT) rooted at S
   and excising the sub-tree reached via the link S-E (including those
   routers which are members of an ECMP).  In the case of Figure 1 the
   P-space comprises nodes A and B only.

   The set of routers from which the node E can be reached, by normal
   forwarding, without traversing the link S-E is termed the Q-space of
   E with respect to the link S-E.  The Q-space can be obtained by
   computing a reverse shortest path tree (rSPT) rooted at E, with the
   sub-tree which traverses the failed link excised (including those
   routers which are members of an ECMP).  The rSPT uses the cost
   towards the root rather than from it and yields the best paths
   towards the root from other nodes in the network.  In the case of
   Figure 1 the Q-space comprises nodes C and D only.

   The intersection of E's Q-space with S's P-space defines the set of
   viable repair tunnel end points, known as "PQ nodes".  As can be
   seen, for the case of Figure 1 there is no common node and hence no
   viable repair tunnel end point.

   Note that the Q-space calculation could be conducted for each
   individual destination and a per-destination repair tunnel end point
   determined.  However this would, in the worst case, require an SPF
   computation per destination, which is not considered to be scalable.
   We therefore use the Q-space of E as a proxy for the Q-space of each
   destination.
   This approximation is obviously correct since the repair is only used
   for the set of destinations which were, prior to the failure, routed
   through node E.  This is analogous to the use of link-LFAs rather
   than per-prefix LFAs.

4.2.2.  Extended P-space

   The description in Section 4.2.1 calculated router S's P-space rooted
   at S itself.  However, since router S will only use a repair path
   when it has detected the failure of the link S-E, the initial hop of
   the repair path need not be subject to S's normal forwarding decision
   process.  Thus we introduce the concept of extended P-space.  Router
   S's extended P-space is the union of the P-spaces of each of S's
   neighbours.  The use of extended P-space may allow router S to reach
   potential repair tunnel end points that were otherwise unreachable.

   Another way to describe extended P-space is that it is the union of
   (un-extended) P-space and the set of destinations for which S has a
   per-prefix LFA protecting the link S-E, i.e. the repair tunnel end
   point can be reached either directly or using a per-prefix LFA.

   Since in the case of Figure 1 node A is a per-prefix LFA for the
   destination node C, the set of extended P-space nodes comprises nodes
   A, B and C.  Since node C is also in E's Q-space, there is now a node
   common to both extended P-space and Q-space which can be used as a
   repair tunnel end point to protect the link S-E.

4.2.3.  Selecting Repair Paths

   The mechanisms described above will identify all the possible repair
   tunnel end points that can be used to protect a particular link.  In
   a well-connected network there are likely to be multiple possible
   release points for each protected link.  All will deliver the packets
   correctly so, arguably, it does not matter which is chosen.  However,
   one repair tunnel end point may be preferred over the others on the
   basis of path cost or some other selection criteria.
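
   The computations of Sections 4.2.1 and 4.2.2, followed by a
   cost-based choice, can be sketched as follows.  This is an
   illustrative sketch only, not part of the specification: it assumes
   symmetric link costs, computes all-pairs distances with Dijkstra
   instead of pruning an SPT, and uses hypothetical helper names
   (build_graph, crosses_link, remote_lfa_candidates, closest).  E is
   left out of the neighbours used for extended P-space on the
   assumption that it is only reachable over the failed link.

```python
import heapq

INF = float("inf")

def build_graph(links):
    """Undirected graph from (u, v, cost) triples; costs assumed symmetric."""
    graph = {}
    for u, v, cost in links:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph

def dijkstra(graph, root):
    """Shortest-path cost from root to every node."""
    dist = {node: INF for node in graph}
    dist[root] = 0
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist[v]:
                dist[v] = d + cost
                heapq.heappush(heap, (dist[v], v))
    return dist

def crosses_link(dist, graph, src, dst, link):
    """True if any shortest src->dst path (ECMP included) uses `link`."""
    a, b = link
    cost, best = graph[a][b], dist[src][dst]
    return (dist[src][a] + cost + dist[b][dst] == best
            or dist[src][b] + cost + dist[a][dst] == best)

def remote_lfa_candidates(graph, s, e):
    """Return (P-space, extended P-space, Q-space, PQ nodes) for link s-e."""
    dist = {node: dijkstra(graph, node) for node in graph}
    link = (s, e)

    def p_space(root):
        return {n for n in graph
                if n != s and not crosses_link(dist, graph, root, n, link)}

    # Extended P-space: union of the P-spaces of S's neighbours, keeping
    # the cheapest repair-path cost from S through an eligible first hop.
    ext = {}
    for nbr, cost in graph[s].items():
        if nbr != e:  # assumption: E is not a usable first hop
            for n in p_space(nbr):
                ext[n] = min(ext.get(n, INF), cost + dist[nbr][n])

    q_space = {n for n in graph
               if n != e and not crosses_link(dist, graph, n, e, link)}
    pq = {n: c for n, c in ext.items() if n in q_space}
    return p_space(s), set(ext), q_space, pq

def closest(pq):
    """Pick the PQ node with the lowest repair-path cost from S, if any."""
    return min(pq, key=lambda n: (pq[n], n)) if pq else None

# Figure 1 ring with all link costs equal to 1.
ring = [("S", "E", 1), ("E", "D", 1), ("D", "C", 1),
        ("C", "B", 1), ("B", "A", 1), ("A", "S", 1)]
p, ext_p, q, pq = remote_lfa_candidates(build_graph(ring), "S", "E")
print(sorted(p), sorted(ext_p), sorted(q), closest(pq))
```

   For Figure 1 the sketch gives P-space {A, B}, extended P-space
   {A, B, C} and Q-space {C, D}, so C is the only PQ node.  With the
   B-C cost raised to 4, as in the example of Section 4.2, the PQ set
   comes out empty.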

   In general there are advantages in choosing the repair tunnel end
   point closest (shortest metric) to S.  Choosing the closest maximises
   the opportunity for the traffic to be load balanced once it has been
   released from the tunnel.

   There is no technical requirement for the selection criteria to be
   consistent across all routers, but such consistency may be desirable
   from an operational point of view.

5.  Example Application of Remote LFAs

   An example of a commonly deployed topology which is not fully
   protected by LFAs alone is shown in Figure 3.  PE1 and PE2 are
   connected in the same site.  P1 and P2 may be geographically
   separated (inter-site).  In order to guarantee the lowest latency
   path from/to all other remote PEs, normally the shortest path follows
   the geographical distance of the site locations.  Therefore, to
   ensure this, a lower IGP metric (5) is assigned between PE1 and PE2.
   A high metric (1000) is set on the P-PE links to prevent the PEs
   being used for transit traffic.  The PEs are not individually
   dual-homed in order to reduce costs.

   This is a common topology in SP networks.

   When a failure occurs on the link between PE1 and P1, PE1 does not
   have an LFA for traffic reachable via P1.  Similarly, by symmetry, if
   the link between PE2 and P2 fails, PE2 does not have an LFA for
   traffic reachable via P2.

   Increasing the metric between PE1 and PE2 to allow the LFA would
   impact the normal traffic performance by potentially increasing the
   latency.

                       |    100    |
                      -P1---------P2-
                        \         /
                    1000 \       / 1000
                          PE1---PE2
                              5

                   Figure 3: Example SP topology

   Clearly, full protection can be provided, using the techniques
   described in this draft, by PE1 choosing P2 as a PQ node, and PE2
   choosing P1 as a PQ node.

6.  Historical Note

   The basic concepts behind Remote LFA were invented in 2002 and were
   later included in draft-bryant-ipfrr-tunnels, submitted in 2004.

   draft-bryant-ipfrr-tunnels targeted 100% protection coverage and
   hence included additional mechanisms on top of the Remote LFA
   concept.  The addition of these mechanisms made the proposal very
   complex and computationally intensive, and it was therefore not
   pursued as a working group item.

   As explained in [I-D.filsfils-rtgwg-lfa-applicability], the purpose
   of the LFA FRR technology is not to provide coverage at any cost.  A
   solution for this already exists with MPLS TE FRR.  MPLS TE FRR is a
   mature technology which is able to provide protection in any topology
   thanks to the explicit routing capability of MPLS TE.

   The purpose of LFA FRR technology is to provide a simple FRR solution
   when such a solution is possible.  The first step in this simple
   approach was "local" LFA [RFC5286].  We propose "Remote LFA" as a
   natural second step.  The following section sets out its benefits in
   terms of simplicity, incremental deployment and significant coverage
   increase.

7.  Benefits

   Remote LFAs preserve the benefits of RFC 5286: simplicity,
   incremental deployment and good protection coverage.

7.1.  Simplicity

   The remote LFA algorithm is simple to compute.

   o  The extended P-space does not require any new computation (it is
      known once per-prefix LFA computation is completed).

   o  The Q-space is a single reverse SPF rooted at the neighbor.

   o  The directed LDP session is automatically computed and
      established.

   In edge topologies (square, ring), the position and number of
   directed LDP sessions are deterministic and hence troubleshooting is
   simple.
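
   The first bullet in the list above relies on the observation in
   Section 4.2.2 that extended P-space is the union of P-space and the
   destinations covered by a per-prefix LFA.  The following sketch
   checks this for Figure 1 with unit link costs.  It assumes the basic
   loop-free inequality of RFC 5286, Distance(N, D) < Distance(N, S) +
   Distance(S, D), takes the P-space {A, B} from Section 4.2.1 rather
   than recomputing it, and uses hypothetical names (all_pairs,
   lfa_covered); it is an illustration, not a prescribed algorithm.

```python
from itertools import product

INF = float("inf")
NODES = ["S", "E", "D", "C", "B", "A"]            # Figure 1 ring
LINKS = {("S", "E"): 1, ("E", "D"): 1, ("D", "C"): 1,
         ("C", "B"): 1, ("B", "A"): 1, ("A", "S"): 1}

def all_pairs(nodes, links):
    """Floyd-Warshall distances for an undirected graph."""
    d = {(u, v): (0 if u == v else INF) for u, v in product(nodes, repeat=2)}
    for (u, v), cost in links.items():
        d[u, v] = d[v, u] = cost
    for k, i, j in product(nodes, repeat=3):      # k is the outermost loop
        d[i, j] = min(d[i, j], d[i, k] + d[k, j])
    return d

def lfa_covered(d, nodes, links, s, e):
    """Destinations for which some neighbour of s other than e is loop-free."""
    neighbours = {v if u == s else u for u, v in links if s in (u, v)} - {e}
    return {dest for n in neighbours for dest in nodes
            if dest != s and d[n, dest] < d[n, s] + d[s, dest]}

d = all_pairs(NODES, LINKS)
p_space = {"A", "B"}                              # from Section 4.2.1
extended_p = p_space | lfa_covered(d, NODES, LINKS, "S", "E")
print(sorted(extended_p))
```

   For Figure 1 the result is {A, B, C}, the extended P-space given in
   Section 4.2.2, obtained from distances that a per-prefix LFA
   computation already has available.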

   In core topologies, our simulation indicates that the 90th percentile
   of the number of LDP sessions per node needed to achieve the
   significant Remote LFA coverage observed in Section 7.3 is <= 6.
   This is insignificant compared to the number of LDP sessions commonly
   deployed per router, which is frequently in the several hundreds.

7.2.  Incremental Deployment

   The establishment of the directed LDP session to the PQ node does not
   require any new technology on the PQ node.  Indeed, routers commonly
   support the ability to accept a remote request to open a directed LDP
   session.  The new capability is restricted to the Remote-LFA
   computing node (the originator of the LDP session).

7.3.  Significant Coverage Extension

   The previous sections have already explained how Remote LFAs provide
   protection for frequently occurring edge topologies: squares and
   rings.  In the core, we extend the analysis framework in Section 4.3
   of [I-D.filsfils-rtgwg-lfa-applicability] and provide hereafter the
   Remote LFA coverage results for the 11 topologies:

        +----------+--------------+----------------+------------+
        | Topology | Per-link LFA | Per-prefix LFA | Remote LFA |
        +----------+--------------+----------------+------------+
        | T1       | 45%          | 77%            | 78%        |
        | T2       | 49%          | 99%            | 100%       |
        | T3       | 88%          | 99%            | 99%        |
        | T4       | 68%          | 84%            | 92%        |
        | T5       | 75%          | 94%            | 99%        |
        | T6       | 87%          | 99%            | 100%       |
        | T7       | 16%          | 67%            | 96%        |
        | T8       | 87%          | 100%           | 100%       |
        | T9       | 67%          | 80%            | 98%        |
        | T10      | 98%          | 100%           | 100%       |
        | T11      | 59%          | 77%            | 95%        |
        | Average  | 67%          | 89%            | 96%        |
        | Median   | 68%          | 94%            | 99%        |
        +----------+--------------+----------------+------------+

   Another study [ISOCORE2010] confirms the significant coverage
   increase provided by Remote LFAs.

8.  Complete Protection

   As shown in the previous table, Remote LFA provides for 96% average
   (99% median) protection in the 11 analyzed SP topologies.
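
   The Average and Median rows of the table in Section 7.3 can be
   re-derived from the per-topology figures.  The sketch below uses
   Python's statistics module; rounding the mean to the nearest whole
   percent is an assumption about how the summary row was produced.

```python
from statistics import mean, median

# Coverage in percent: (per-link LFA, per-prefix LFA, Remote LFA).
coverage = {
    "T1": (45, 77, 78),   "T2": (49, 99, 100),   "T3": (88, 99, 99),
    "T4": (68, 84, 92),   "T5": (75, 94, 99),    "T6": (87, 99, 100),
    "T7": (16, 67, 96),   "T8": (87, 100, 100),  "T9": (67, 80, 98),
    "T10": (98, 100, 100), "T11": (59, 77, 95),
}

# Transpose the rows into one tuple per column of the table.
columns = list(zip(*coverage.values()))
averages = [round(mean(col)) for col in columns]
medians = [median(col) for col in columns]
print(averages, medians)
```

   The rounded means and the medians agree with the summary rows of
   the table: 67%, 89%, 96% and 68%, 94%, 99% respectively.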

   In an MPLS network, this is achieved without any scalability impact
   as the tunnels to the PQ nodes are always present as a property of an
   LDP-based deployment.

   In the very few cases where P and Q spaces have an empty
   intersection, one could select the closest node in the Q-space (i.e.
   Qc) and signal an explicitly routed RSVP TE LSP to Qc.  A directed
   LDP session is then established with Qc and the rest of the solution
   is identical.

   The drawbacks of this solution are:

   1.  it is only available in MPLS networks;

   2.  it adds LSPs to the SP infrastructure.

   This extension is described for completeness.  In practice, the
   "Remote LFA" solution should be preferred for three reasons: its
   simplicity, its excellent coverage in the analyzed backbones and its
   complete coverage in the most frequent access/aggregation topologies
   (box or ring).

9.  IANA Considerations

   There are no IANA considerations that arise from this architectural
   description of IPFRR.

10.  Security Considerations

   The security considerations of RFC 5286 also apply.

   To prevent their use as an attack vector, the repair tunnel end
   points SHOULD be assigned from a set of addresses that are not
   reachable from outside the routing domain.

11.  Acknowledgments

   The authors acknowledge the technical contributions made to this work
   by Stefano Previdi.

12.  Informative References

   [I-D.filsfils-rtgwg-lfa-applicability]
              Filsfils, C., Francois, P., Shand, M., Decraene, B.,
              Uttaro, J., Leymann, N., and M. Horneffer, "LFA
              applicability in SP networks",
              draft-filsfils-rtgwg-lfa-applicability-00 (work in
              progress), March 2010.

   [ISOCORE2010]
              So, N., Lin, T., and C. Chen, "LFA (Loop Free Alternates)
              Case Studies in Verizon's LDP Network", 2010.

   [RFC1701]  Hanks, S., Li, T., Farinacci, D., and P. Traina, "Generic
              Routing Encapsulation (GRE)", RFC 1701, October 1994.

   [RFC1853]  Simpson, W., "IP in IP Tunneling", RFC 1853, October 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, January 2001.

   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
              Specification", RFC 5036, October 2007.

   [RFC5286]  Atlas, A. and A. Zinin, "Basic Specification for IP Fast
              Reroute: Loop-Free Alternates", RFC 5286, September 2008.

   [RFC5714]  Shand, M. and S. Bryant, "IP Fast Reroute Framework",
              RFC 5714, January 2010.

Authors' Addresses

   Stewart Bryant
   Cisco Systems
   250, Longwater, Green Park,
   Reading  RG2 6GB
   UK

   Email: stbryant@cisco.com


   Clarence Filsfils
   Cisco Systems
   De Kleetlaan 6a
   1831 Diegem
   Belgium

   Email: cfilsfil@cisco.com


   Mike Shand
   Independent Contributor

   Email: imc.shand@gmail.com


   Ning So
   Verizon Inc.

   Email: ningso@yahoo.com