Network Working Group                                          S. Bryant
Internet-Draft                                               C. Filsfils
Intended status: Standards Track                              S. Previdi
Expires: June 22, 2013                                     Cisco Systems
                                                                M. Shand
                                                 Independent Contributor
                                                                   N. So
                                                     Tata Communications
                                                       December 19, 2012


                             Remote LFA FRR
                     draft-ietf-rtgwg-remote-lfa-01

Abstract

   This draft describes an extension to the basic IP fast re-route
   mechanism described in RFC 5286 that provides additional backup
   connectivity when none can be provided by the basic mechanisms.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 22, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1.  Terminology

   This draft uses the terms defined in [RFC5714].  This section defines
   additional terms used in this draft.

   Extended P-space
                   The union of the P-space of the neighbours of a
                   specific router with respect to the protected link.

   P-space         P-space is the set of routers reachable from a
                   specific router without any path (including equal
                   cost path splits) transiting the protected link.

                   For example, the P-space of S is the set of routers
                   that S can reach without using the protected link
                   S-E.

   PQ node         A node which is a member of both the extended P-space
                   and the Q-space.

   Q-space         Q-space is the set of routers from which a specific
                   router can be reached without any path (including
                   equal cost path splits) transiting the protected
                   link.

   Repair tunnel   A tunnel established for the purpose of providing a
                   virtual neighbor which is a Loop Free Alternate.

   Remote LFA      The tail-end of a repair tunnel.  This tail-end is a
                   member of both the extended P-space and the Q-space.
                   It is also termed a "PQ" node.

2.  Introduction

   RFC 5714 [RFC5714] describes a framework for IP Fast Re-route and
   provides a summary of various proposed IPFRR solutions.  A basic
   mechanism using loop-free alternates (LFAs) is described in [RFC5286]
   that provides good repair coverage in many topologies
   [I-D.filsfils-rtgwg-lfa-applicability], especially those that are
   highly meshed.  However, some topologies, notably ring-based
   topologies, are not well protected by LFAs alone.  This is
   illustrated in Figure 1 below.

                                S---E
                               /     \
                              A       D
                               \     /
                                B---C

                  Figure 1: A simple ring topology

   If all link costs are equal, the link S-E cannot be fully protected
   by LFAs.  The destination C is reachable from S via equal-cost
   multipath (ECMP), and so can be protected when S-E fails, but D and
   E are not protectable using LFAs.

   This draft describes extensions to the basic repair mechanism in
   which tunnels are used to provide additional logical links which can
   then be used as loop free alternates where none exist in the original
   topology.  For example, if a tunnel is provided between S and C as
   shown in Figure 2, then C, now being a direct neighbor of S, would
   become an LFA for D and E.
   The non-failure traffic distribution is not disrupted by the
   provision of such a tunnel since it is only used for repair traffic
   and MUST NOT be used for normal traffic.

                                S---E
                               / \   \
                              A   \   D
                               \   \ /
                                B---C

                 Figure 2: The addition of a tunnel

   The use of this technique is not restricted to ring-based topologies,
   but is a general mechanism which can be used to enhance the
   protection provided by LFAs.

3.  Repair Paths

   As with LFA FRR, when a router detects an adjacent link failure, it
   uses one or more repair paths in place of the failed link.  Repair
   paths are pre-computed in anticipation of later failures so they can
   be promptly activated when a failure is detected.

   A tunneled repair path tunnels traffic to some staging point in the
   network from which it is assumed that, in the absence of multiple
   failures, it will travel to its destination using normal forwarding
   without looping back.  This is equivalent to providing a virtual
   loop-free alternate to supplement the physical loop-free alternates,
   hence the name "Remote LFA FRR".  When a link cannot be entirely
   protected with local LFA neighbors, the protecting router seeks the
   help of a remote LFA staging point.

3.1.  Tunnels as Repair Paths

   Consider an arbitrary protected link S-E.  In LFA FRR, if a path to
   the destination from a neighbor N of S does not cause a packet to
   loop back over the link S-E (i.e., N is a loop-free alternate), then
   S can send the packet to N and the packet will be delivered to the
   destination using the pre-failure forwarding information.  If there
   is no such LFA neighbor, then S may be able to create a virtual LFA
   by using a tunnel to carry the packet to a point in the network which
   is not a direct neighbor of S, and from which the packet will be
   delivered to the destination without looping back to S.  In this
   document such a tunnel is termed a repair tunnel.
   The tail-end of this tunnel is called a "remote LFA" or a "PQ node".

   Note that the repair tunnel terminates at some intermediate router
   between S and E, and not E itself.  This is clearly the case, since
   if it were possible to construct a tunnel from S to E then a
   conventional LFA would have been sufficient to effect the repair.

3.2.  Tunnel Requirements

   There are a number of IP-in-IP tunnel mechanisms that may be used to
   fulfil the requirements of this design, such as IP-in-IP [RFC1853]
   and GRE [RFC1701].

   In an MPLS enabled network using LDP [RFC5036], a simple label stack
   [RFC3032] may be used to provide the required repair tunnel.  In
   this case the outer label is S's neighbor's label for the repair
   tunnel end point, and the inner label is the repair tunnel end
   point's label for the packet destination.  In order for S to obtain
   the correct inner label it is necessary to establish a directed LDP
   session [RFC5036] to the tunnel end point.

   The selection of the specific tunnelling mechanism (and any necessary
   enhancements) used to provide a repair path is outside the scope of
   this document.  The authors simply note that deployment in an
   MPLS/LDP environment is extremely simple and straightforward as an
   LDP LSP from S to the PQ node is readily available, and hence does
   not require any new protocol extension or design change.  This LSP is
   automatically established as a basic property of LDP behavior.  The
   performance of the encapsulation and decapsulation is also excellent
   as encapsulation is just a push of one label (like conventional MPLS
   TE FRR) and the decapsulation occurs naturally at the penultimate hop
   before the PQ node.

   When a failure is detected, it is necessary to immediately redirect
   traffic to the repair path.  Consequently, the repair tunnel used
   must be provisioned beforehand in anticipation of the failure.
   Since the location of the repair tunnels is dynamically determined,
   it is necessary to establish the repair tunnels without management
   action.  Multiple repairs may share a tunnel end point.

4.  Construction of Repair Paths

4.1.  Identifying Required Tunneled Repair Paths

   Not all links will require protection using a tunneled repair path.
   If E can already be protected via an LFA, S-E does not need to be
   protected using a repair tunnel, since all destinations normally
   reachable through E must therefore also be protectable by an LFA.
   Such an LFA is frequently termed a "link LFA".  Tunneled repair paths
   are only required for links which do not have a link LFA.

4.2.  Determining Tunnel End Points

   The repair tunnel end point needs to be a node in the network
   reachable from S without traversing S-E.  In addition, the repair
   tunnel end point needs to be a node from which packets will normally
   flow towards their destination without being attracted back to the
   failed link S-E.

   Note that once released from the tunnel, the packet will be
   forwarded, as normal, on the shortest path from the release point to
   its destination.  This may result in the packet traversing the router
   E at the far end of the protected link S-E, but this is not
   required.

   The properties that are required of repair tunnel end points are
   therefore:

   o  The repair tunnel end point MUST be reachable from the tunnel
      source without traversing the failed link; and

   o  When released, tunneled packets MUST proceed towards their
      destination without being attracted back over the failed link.

   Provided both these requirements are met, packets forwarded over the
   repair tunnel will reach their destination and will not loop.

   In some topologies it will not be possible to find a repair tunnel
   end point that exhibits both the required properties.
   For example, if the ring topology illustrated in Figure 1 had a cost
   of 4 for the link B-C, while the remaining links were cost 1, then it
   would not be possible to establish a tunnel from S to C (without
   resorting to some form of source routing).

4.2.1.  Computing Repair Paths

   The set of routers which can be reached from S without traversing S-E
   is termed the P-space of S with respect to the link S-E.  The P-space
   can be obtained by computing a shortest path tree (SPT) rooted at S
   and excising the sub-tree reached via the link S-E (including those
   routers which are members of an ECMP).  In the case of Figure 1 the
   P-space comprises nodes A and B only.

   The set of routers from which the node E can be reached, by normal
   forwarding, without traversing the link S-E is termed the Q-space of
   E with respect to the link S-E.  The Q-space can be obtained by
   computing a reverse shortest path tree (rSPT) rooted at E, with the
   sub-tree which traverses the failed link excised (including those
   routers which are members of an ECMP).  The rSPT uses the cost
   towards the root rather than from it and yields the best paths
   towards the root from other nodes in the network.  In the case of
   Figure 1 the Q-space comprises nodes C and D only.

   The intersection of E's Q-space with S's P-space defines the set of
   viable repair tunnel end points, known as "PQ nodes".  As can be
   seen, for the case of Figure 1 there is no common node and hence no
   viable repair tunnel end point.

   Note that the Q-space calculation could be conducted for each
   individual destination and a per-destination repair tunnel end point
   determined.  However, this would, in the worst case, require an SPF
   computation per destination which is not considered to be scalable.
   We therefore use the Q-space of E as a proxy for the Q-space of each
   destination.
   This approximation is correct since the repair is only used for the
   set of destinations which were, prior to the failure, routed through
   node E.  This is analogous to the use of link-LFAs rather than
   per-prefix LFAs.

4.2.2.  Extended P-space

   The description in Section 4.2.1 calculated router S's P-space rooted
   at S itself.  However, since router S will only use a repair path
   when it has detected the failure of the link S-E, the initial hop of
   the repair path need not be subject to S's normal forwarding decision
   process.  Thus we introduce the concept of extended P-space.  Router
   S's extended P-space is the union of the P-spaces of each of S's
   neighbours.  The use of extended P-space may allow router S to reach
   potential repair tunnel end points that were otherwise unreachable.

   Another way to describe extended P-space is that it is the union of
   (un-extended) P-space and the set of destinations for which S has a
   per-prefix LFA protecting the link S-E; i.e., the repair tunnel end
   point can be reached either directly or using a per-prefix LFA.

   Since in the case of Figure 1 node A is a per-prefix LFA for the
   destination node C, the set of extended P-space nodes comprises nodes
   A, B and C.  Since node C is also in E's Q-space, there is now a node
   common to both extended P-space and Q-space which can be used as a
   repair tunnel end point to protect the link S-E.

4.2.3.  Selecting Repair Paths

   The mechanisms described above will identify all the possible repair
   tunnel end points that can be used to protect a particular link.  In
   a well-connected network there are likely to be multiple possible
   release points for each protected link.  All will deliver the packets
   correctly so, arguably, it does not matter which is chosen.  However,
   one repair tunnel end point may be preferred over the others on the
   basis of path cost or some other selection criterion.
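
   As an illustration of the computations in Sections 4.2.1 and 4.2.2,
   the following Python sketch derives the P-space, Q-space, extended
   P-space and PQ nodes for the ring of Figure 1.  It is a minimal
   sketch and not part of this specification.  It assumes a connected
   topology with symmetric link costs, it skips E when taking the union
   over S's neighbours, and it counts each remaining neighbour as
   reachable because S can force the first hop of the repair.  All
   function and variable names are hypothetical.

```python
import heapq


def build_graph(links):
    """Return a symmetric adjacency map {node: {neighbor: cost}}."""
    graph = {}
    for (u, v), cost in links.items():
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def shortest_distances(graph, root):
    """Plain Dijkstra SPF: cost of the best path from root to every node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist


def avoids_link(dist, src, dst, link, cost):
    """True if no shortest src->dst path (ECMP included) crosses link."""
    a, b = link
    best = dist[src][dst]
    return (dist[src][a] + cost + dist[b][dst] > best
            and dist[src][b] + cost + dist[a][dst] > best)


def remote_lfa_candidates(links, s, e):
    """Return (P-space, Q-space, extended P-space, PQ nodes) for link s-e."""
    graph = build_graph(links)
    dist = {n: shortest_distances(graph, n) for n in graph}
    link, cost = (s, e), graph[s][e]

    def p_space(root):
        return {n for n in graph
                if n != root and avoids_link(dist, root, n, link, cost)}

    q_space = {n for n in graph
               if n != e and avoids_link(dist, n, e, link, cost)}

    # Assumption: E is skipped, and each other neighbour is itself
    # treated as reachable since S chooses the first hop of the repair.
    extended = set()
    for nbr in graph[s]:
        if nbr != e:
            extended |= {nbr} | p_space(nbr)
    extended.discard(s)
    return p_space(s), q_space, extended, extended & q_space


# Figure 1 ring with all link costs equal to 1.
RING = {("S", "E"): 1, ("E", "D"): 1, ("D", "C"): 1,
        ("C", "B"): 1, ("B", "A"): 1, ("A", "S"): 1}

if __name__ == "__main__":
    p, q, ext, pq = remote_lfa_candidates(RING, "S", "E")
    print(sorted(p), sorted(q), sorted(ext), sorted(pq))
```

   For the equal-cost ring this yields P-space {A, B}, Q-space {C, D},
   extended P-space {A, B, C} and the single PQ node C, in agreement
   with the text.  Raising the cost of link B-C to 4 leaves the PQ set
   empty, which corresponds to the case noted in Section 4.2.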

   In general there are advantages in choosing the repair tunnel end
   point closest (shortest metric) to S.  Choosing the closest maximises
   the opportunity for the traffic to be load balanced once it has been
   released from the tunnel.

   There is no technical requirement for the selection criteria to be
   consistent across all routers, but such consistency may be desirable
   from an operational point of view.

5.  Example Application of Remote LFAs

   An example of a commonly deployed topology which is not fully
   protected by LFAs alone is shown in Figure 3.  PE1 and PE2 are
   connected in the same site.  P1 and P2 may be geographically
   separated (inter-site).  In order to guarantee the lowest latency
   path from/to all other remote PEs, normally the shortest path follows
   the geographical distance of the site locations.  Therefore, to
   ensure this, a lower IGP metric (5) is assigned between PE1 and PE2.
   A high metric (1000) is set on the P-PE links to prevent the PEs
   being used for transit traffic.  The PEs are not individually
   dual-homed in order to reduce costs.

   This is a common topology in SP networks.

   When a failure occurs on the link between PE1 and P2, PE1 does not
   have an LFA for traffic reachable via P2.  Similarly, by symmetry, if
   the link between PE2 and P1 fails, PE2 does not have an LFA for
   traffic reachable via P1.

   Increasing the metric between PE1 and PE2 to allow the LFA would
   impact the normal traffic performance by potentially increasing the
   latency.

                          |    100   |
                         -P2---------P1-
                           \         /
                       1000 \       / 1000
                            PE1---PE2
                                5

                    Figure 3: Example SP topology

   Clearly, full protection can be provided, using the techniques
   described in this draft, by PE1 choosing P1 as a PQ node, and PE2
   choosing P2 as a PQ node.

6.
Historical Note

   The basic concepts behind Remote LFA were invented in 2002 and were
   later included in draft-bryant-ipfrr-tunnels, submitted in 2004.

   draft-bryant-ipfrr-tunnels targeted 100% protection coverage and
   hence included additional mechanisms on top of the Remote LFA
   concept.  The addition of these mechanisms made the proposal very
   complex and computationally intensive and it was therefore not
   pursued as a working group item.

   As explained in [I-D.filsfils-rtgwg-lfa-applicability], the purpose
   of the LFA FRR technology is not to provide coverage at any cost.  A
   solution for this already exists with MPLS TE FRR.  MPLS TE FRR is a
   mature technology which is able to provide protection in any topology
   thanks to the explicit routing capability of MPLS TE.

   The purpose of LFA FRR technology is to provide a simple FRR solution
   when such a solution is possible.  The first step in this pursuit of
   simplicity was "local" LFA [RFC5286].  We propose "Remote LFA" as a
   natural second step.  The following section describes its benefits in
   terms of simplicity, incremental deployment and significant coverage
   increase.

7.  Benefits

   Remote LFAs preserve the benefits of RFC 5286: simplicity,
   incremental deployment and good protection coverage.

7.1.  Simplicity

   The remote LFA algorithm is simple to compute.

   o  The extended P-space does not require any new computation (it is
      known once per-prefix LFA computation is completed).

   o  The Q-space is a single reverse SPF rooted at the neighbor.

   o  The directed LDP session is automatically computed and
      established.

   In edge topologies (square, ring), the position and number of
   directed LDP sessions are deterministic and hence troubleshooting is
   simple.

   In core topologies, our simulation indicates that the 90th percentile
   number of LDP sessions per node to achieve the significant Remote LFA
   coverage observed in Section 7.3 is <= 6.  This is insignificant
   compared to the number of LDP sessions commonly deployed per router,
   which is frequently in the several hundreds.

7.2.  Incremental Deployment

   The establishment of the directed LDP session to the PQ node does not
   require any new technology on the PQ node.  Indeed, routers commonly
   support the ability to accept a remote request to open a directed LDP
   session.  The new capability is restricted to the Remote-LFA
   computing node (the originator of the LDP session).

7.3.  Significant Coverage Extension

   The previous sections have already explained how Remote LFAs provide
   protection for frequently occurring edge topologies: square and
   rings.  In the core, we extend the analysis framework in section 4.3
   of [I-D.filsfils-rtgwg-lfa-applicability] and provide hereafter the
   Remote LFA coverage results for the 11 topologies:

        +----------+--------------+----------------+------------+
        | Topology | Per-link LFA | Per-prefix LFA | Remote LFA |
        +----------+--------------+----------------+------------+
        |    T1    |     45%      |      77%       |    78%     |
        |    T2    |     49%      |      99%       |    100%    |
        |    T3    |     88%      |      99%       |    99%     |
        |    T4    |     68%      |      84%       |    92%     |
        |    T5    |     75%      |      94%       |    99%     |
        |    T6    |     87%      |      99%       |    100%    |
        |    T7    |     16%      |      67%       |    96%     |
        |    T8    |     87%      |      100%      |    100%    |
        |    T9    |     67%      |      80%       |    98%     |
        |   T10    |     98%      |      100%      |    100%    |
        |   T11    |     59%      |      77%       |    95%     |
        | Average  |     67%      |      89%       |    96%     |
        |  Median  |     68%      |      94%       |    99%     |
        +----------+--------------+----------------+------------+

   Another study [ISOCORE2010] confirms the significant coverage
   increase provided by Remote LFAs.

8.  Complete Protection

   As shown in the previous table, Remote LFA provides for 96% average
   (99% median) protection in the 11 analyzed SP topologies.
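
   As a consistency check, the summary rows of the table in Section 7.3
   can be recomputed from the per-topology figures.  The short Python
   sketch below does so; rounding the mean to the nearest whole percent
   is an assumption about how the Average row was produced, and the
   names used are hypothetical.

```python
from statistics import mean, median

# Per-topology coverage percentages (T1..T11) copied from the table.
COVERAGE = {
    "Per-link LFA":   [45, 49, 88, 68, 75, 87, 16, 87, 67, 98, 59],
    "Per-prefix LFA": [77, 99, 99, 84, 94, 99, 67, 100, 80, 100, 77],
    "Remote LFA":     [78, 100, 99, 92, 99, 100, 96, 100, 98, 100, 95],
}


def summarize(values):
    """Return (mean rounded to a whole percent, median) for one column."""
    return round(mean(values)), median(values)


if __name__ == "__main__":
    for name, values in COVERAGE.items():
        avg, med = summarize(values)
        print(f"{name}: average {avg}%, median {med}%")
```

   Under that rounding assumption the recomputed values agree with the
   Average and Median rows of the table for all three columns.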

   In an MPLS network, this is achieved without any scalability impact
   as the tunnels to the PQ nodes are always present as a property of an
   LDP-based deployment.

   In the very few cases where P and Q spaces have an empty
   intersection, one could select the closest node in the Q space (i.e.,
   Qc) and signal an explicitly routed RSVP TE LSP to Qc.  A directed
   LDP session is then established with Qc and the rest of the solution
   is identical.

   The drawbacks of this solution are:

   1.  it is only available for MPLS networks;

   2.  it adds LSPs to the SP infrastructure.

   This extension is described for completeness.  In practice, the
   "Remote LFA" solution should be preferred for three reasons: its
   simplicity, its excellent coverage in the analyzed backbones and its
   complete coverage in the most frequent access/aggregation topologies
   (box or ring).

9.  IANA Considerations

   There are no IANA considerations that arise from this architectural
   description of IPFRR.

10.  Security Considerations

   The security considerations of RFC 5286 also apply.

   To prevent their use as an attack vector, the repair tunnel end
   points SHOULD be assigned from a set of addresses that are not
   reachable from outside the routing domain.

11.  Acknowledgments

   The authors acknowledge the technical contributions made to this work
   by Stefano Previdi.

12.  Informative References

   [I-D.filsfils-rtgwg-lfa-applicability]
              Filsfils, C., Francois, P., Shand, M., Decraene, B.,
              Uttaro, J., Leymann, N., and M. Horneffer, "LFA
              applicability in SP networks",
              draft-filsfils-rtgwg-lfa-applicability-00 (work in
              progress), March 2010.

   [ISOCORE2010]
              So, N., Lin, T., and C. Chen, "LFA (Loop Free Alternates)
              Case Studies in Verizon's LDP Network", 2010.

   [RFC1701]  Hanks, S., Li, T., Farinacci, D., and P. Traina, "Generic
              Routing Encapsulation (GRE)", RFC 1701, October 1994.

   [RFC1853]  Simpson, W., "IP in IP Tunneling", RFC 1853, October 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, January 2001.

   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
              Specification", RFC 5036, October 2007.

   [RFC5286]  Atlas, A. and A. Zinin, "Basic Specification for IP Fast
              Reroute: Loop-Free Alternates", RFC 5286, September 2008.

   [RFC5714]  Shand, M. and S. Bryant, "IP Fast Reroute Framework",
              RFC 5714, January 2010.

Authors' Addresses

   Stewart Bryant
   Cisco Systems
   250, Longwater, Green Park,
   Reading RG2 6GB
   UK

   Email: stbryant@cisco.com


   Clarence Filsfils
   Cisco Systems
   De Kleetlaan 6a
   1831 Diegem
   Belgium

   Email: cfilsfil@cisco.com


   Stefano Previdi
   Cisco Systems

   Email: sprevidi@cisco.com


   Mike Shand
   Independent Contributor

   Email: imc.shand@gmail.com


   Ning So
   Tata Communications
   Mobile Broadband Services

   Email: Ning.So@tatacommunications.com