Network Working Group                               Chandra Ramachandran
Internet Draft                                          Juniper Networks
Intended status: Standards Track                               Ina Minei
                                                             Google, Inc
                                                           Dante Pacella
                                                                 Verizon
                                                              Tarek Saad
                                                      Cisco Systems Inc.
                                                             Ebben Aries

Expires: April 10, 2016                                 October 10, 2015

          Refresh Interval Independent FRR Facility Protection
                   draft-chandra-mpls-ri-rsvp-frr-01

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on April 10, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.
Abstract

   RSVP-TE relies on periodic refresh of RSVP messages to maintain the
   LSP-related state along the reserved path. In the absence of refresh
   messages, the LSP-related state is automatically deleted. Reliance
   on periodic refreshes and refresh timeouts is problematic from the
   scalability point of view. The number of RSVP-TE LSPs that a router
   needs to maintain has been growing in service provider networks, and
   implementations should be capable of handling an increase in LSP
   scale.

   RFC 2961 specifies mechanisms to eliminate the reliance on periodic
   refresh and refresh timeout of RSVP messages, and enables a router
   to increase the message refresh interval to values much larger than
   the default 30 seconds defined in RFC 2205. However, the protocol
   extensions defined in RFC 4090 for supporting fast reroute (FRR)
   using bypass tunnels implicitly rely on refresh timeouts to clean up
   stale state. This document defines extensions to the bypass FRR
   procedures defined in RFC 4090 to support refresh-interval
   independent FRR.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

Table of Contents

   1. Introduction
      1.1. Motivation
   2. Terminology
   3. Problem Description
   4. Solution Aspects
      4.1. Signaling Protection availability in Path RRO
         4.1.1. PLR Behavior
         4.1.2. Remote Signaling Adjacency
         4.1.3. SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object Propagation
         4.1.4. MP Behavior
         4.1.5. "Remote" state on MP
      4.2. Impact of Failures on LSP State
         4.2.1. Non-MP Behavior on Phop Link/Node Failure
         4.2.2. LP-MP Behavior on Phop Link Failure
         4.2.3. LP-MP Behavior on Phop Node Failure
         4.2.4. NP-MP Behavior on Phop Link/Node Failure
         4.2.5. NP-MP Behavior on PLR Link Failure
         4.2.6. Phop Link Failure on a Node that is LP-MP and NP-MP
         4.2.7. Phop Node Failure on Node that is LP-MP and NP-MP
      4.3. Conditional Path Tear
         4.3.1. Sending Conditional Path Tear
         4.3.2. Processing Conditional Path Tear
         4.3.3. CONDITIONS object
      4.4. Remote State Teardown
         4.4.1. PLR Behavior on Local Repair Failure
         4.4.2. PLR Behavior on Resv RRO Change
         4.4.3. LSP Preemption during Local Repair
            4.4.3.1. Preemption on LP-MP after Phop Link failure
            4.4.3.2. Preemption on NP-MP after Phop Link failure
      4.5. Backward Compatibility Procedures
         4.5.1. Detecting Support for Refresh interval Independent FRR
         4.5.2. Procedures for backward compatibility
            4.5.2.1. Lack of support on Downstream Node
            4.5.2.2. Lack of support on Upstream Node
            4.5.2.3. Incremental Deployment
   5. Security Considerations
   6. IANA Considerations
      6.1. New Object - CONDITIONS
   7. Normative References
   8. Informative References
   9. Acknowledgments
   10. Authors' Addresses

1. Introduction

   RSVP-TE Fast Reroute [RFC4090] defines two local repair techniques
   to reroute label switched path (LSP) traffic over a pre-established
   backup tunnel. The facility backup method allows one or more LSPs
   traversing a connected link or node to be protected using a bypass
   tunnel. The many-to-one nature of this local repair technique is
   attractive from the scalability point of view. This document
   enumerates the facility backup procedures in RFC 4090 that rely on
   refresh timeout and hence make the facility backup method
   refresh-interval dependent.

   The RSVP-TE extensions defined in this document enhance the
   facility backup protection mechanism by making the corresponding
   procedures refresh-interval independent.

1.1. Motivation

   Standard RSVP [RFC2205] maintains state via the generation of RSVP
   Path/Resv refresh messages. Refresh messages are used both to
   synchronize state between RSVP neighbors and to recover from lost
   RSVP messages. The use of Refresh messages to cover many possible
   failures has resulted in a number of operational problems.

   - One problem relates to RSVP control plane scaling due to periodic
     refreshes of Path and Resv messages; another relates to the
     reliability and latency of RSVP signaling.

   - An additional problem is the time to clean up the stale state
     after a tear message is lost. For more on these problems see
     Section 1 of RSVP Refresh Overhead Reduction Extensions
     [RFC2961].

   The problems listed above adversely affect RSVP control plane
   scalability, and RSVP-TE [RFC3209] inherited these problems from
   standard RSVP.
   Procedures specified in [RFC2961] address the above-mentioned
   problems by eliminating dependency on refreshes for state
   synchronization and for recovering from lost RSVP messages, and by
   eliminating dependency on refresh timeout for stale state cleanup.
   Implementing these procedures improves RSVP-TE control plane
   scalability. For more details on eliminating dependency on refresh
   timeout for stale state cleanup, refer to the "Refresh Interval
   Independent RSVP" section in [TE-SCALE-REC].

   However, the procedures specified in [RFC2961] do not fully address
   stale state cleanup for facility backup protection [RFC4090], as
   facility backup protection still depends on refresh timeouts for
   stale state cleanup. Thus [RFC2961] is insufficient to address the
   problem of stale state cleanup when facility backup protection is
   used.

   The procedures specified in this document, in combination with
   [RFC2961], eliminate the dependency of facility backup protection
   on refresh timeouts for stale state cleanup. These procedures, in
   combination with [RFC2961], fully address the above-mentioned
   problem of RSVP-TE stale state cleanup, including the cleanup for
   facility backup protection.

   The procedures specified in this document assume reliable delivery
   of RSVP messages, as specified in [RFC2961]. Therefore this document
   makes support for [RFC2961] a prerequisite.

2. Terminology

   The reader is assumed to be familiar with the terminology in
   [RFC2205], [RFC3209], [RFC4090] and [RFC4558].
   Phop node: Previous-hop router along the label switched path

   PPhop node: Previous-Previous-hop router along the LSP

   LP-MP node: Merge Point router at the tail of Link-protecting bypass
   tunnel

   NP-MP node: Merge Point router at the tail of Node-protecting
   bypass tunnel

   Conditional PathTear: PathTear message containing a suggestion to a
   receiving downstream router to retain Path state if the receiving
   router is NP-MP

   Remote PathTear: PathTear message sent from Point of Local Repair
   (PLR) to MP to delete state on MP if PLR had not reliably sent
   backup Path state before

3. Problem Description

                    E
                  /   \
                 /     \
                /       \
               /         \
              /           \
             /             \
            A ----- B ----- C ----- D
                    \             /
                     \           /
                      \         /
                       \       /
                        \     /
                         \   /
                           F

                  Figure 1: Example Topology

   In the topology in Figure 1, consider a large number of LSPs from A
   to D transiting B and C. Assume that the refresh interval has been
   configured to be large, on the order of minutes, and that refresh
   reduction extensions are enabled on all routers.

   Also assume that node protection has been configured for the LSPs
   and the LSPs are protected by each router in the following way:

   - A has made node protection available using bypass LSP A -> E ->
     C; A is the Point of Local Repair (PLR) and C is the Node
     Protecting Merge Point (NP-MP)

   - B has made node protection available using bypass LSP B -> F ->
     D; B is the PLR and D is the NP-MP

   - C has made link protection available using bypass LSP C -> B -> F
     -> D; C is the PLR and D is the Link Protecting Merge Point
     (LP-MP)

   In the above condition, assume that the B-C link fails. The
   following is the sequence of events that is expected to occur for
   all protected LSPs under normal conditions.

   1. B performs local repair and re-directs LSP traffic over the
      bypass LSP B -> F -> D.

   2. B also creates backup state for the LSP and triggers sending of
      backup LSP state to D over the bypass LSP B -> F -> D.

   3. D receives backup LSP states and merges the backups with the
      protected LSPs.

   4. As the link on C over which the LSP states are refreshed has
      failed, C will no longer receive state refreshes. Consequently
      the protected LSP states on C will time out and C will send a
      tear down message for all LSPs.

   While the above sequence of events has been described in [RFC4090],
   there are a few problems for which no mechanism has been specified
   explicitly.

   - If the protected LSP on C times out before D receives signaling
     for the backup LSP, then D would receive PathTear from C prior to
     receiving signaling for the backup LSP, thus resulting in deleting
     the LSP state. This would be possible at scale even with the
     default refresh time.

   - If upon the link failure C is to keep state until its timeout,
     then with a long refresh interval this may result in a large
     amount of stale state on C. Alternatively, if upon the link
     failure C is to delete the state and send PathTear to D, this
     would result in deleting the state on D, thus deleting the LSP. D
     needs a reliable mechanism to determine whether it is MP or not to
     overcome this problem.

   - If head-end A attempts to tear down the LSP after step 1 but
     before step 2 of the above sequence, then B may receive the tear
     down message before step 2 and delete the LSP state from its state
     database. If B deletes its state without informing D, with a long
     refresh interval this could cause (large) buildup of stale state
     on D.

   - If B fails to perform local repair in step 1, then B will delete
     the LSP state from its state database without informing D. As B
     deletes its state without informing D, with a long refresh
     interval this could cause (large) buildup of stale state on D.
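   The first problem above is an ordering race at D. The following is
   a minimal sketch of that race, not part of the specified
   procedures. It is a toy model: the function name and the two event
   names are hypothetical, and it assumes that D ignores a PathTear
   from C once backup state from B has been merged.

```python
def lsp_survives_on_d(arrivals):
    """Replay signaling messages at D in arrival order (toy model).

    Hypothetical event names: "backup_path_from_b" is the backup LSP
    state that B sends over the bypass, "path_tear_from_c" is the tear
    down that C sends after its own state times out.
    """
    has_state = True   # D holds protected LSP state before the failure
    merged = False     # set once backup state has been merged at D
    for msg in arrivals:
        if msg == "backup_path_from_b" and has_state:
            merged = True
        elif msg == "path_tear_from_c" and not merged:
            # Nothing tells D that it is a merge point, so it obeys the tear.
            has_state = False
    return has_state


in_order = ["backup_path_from_b", "path_tear_from_c"]
raced = ["path_tear_from_c", "backup_path_from_b"]
print(lsp_survives_on_d(in_order))
print(lsp_survives_on_d(raced))
```

   In this model the LSP survives only when the backup signaling wins
   the race, which is why D needs to learn that it is an MP before the
   failure occurs.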
   The purpose of this document is to provide solutions to the above
   problems, which will then make it practical to scale up to a large
   number of protected LSPs in the network.

4. Solution Aspects

   The solution consists of five parts.

   - Utilize the MP determination mechanism specified in [SUMMARY-FRR]
     that enables the PLR to signal availability of local protection to
     the MP. In addition, introduce PLR and MP procedures to establish
     a Node-ID hello session between the PLR and the MP to detect
     router failures and to determine capability. See section 4.1 for
     more details.

   - Handle upstream link or node failures by cleaning up LSP states
     if the node has not found itself to be an MP through the MP
     determination mechanism. See section 4.2 for more details.

     The combination of "path state" maintained as Path State Block
     (PSB) and "reservation state" maintained as Reservation State
     Block (RSB) forms an individual LSP state on an RSVP-TE speaker.

   - Introduce extensions to enable a router to send a tear down
     message to the downstream router that enables the receiving
     router to conditionally delete its local state. See section 4.3
     for more details.

   - Enhance facility protection by allowing a PLR to directly send a
     tear down message to the MP without requiring the PLR to either
     have a working bypass LSP or have already signaled backup LSP
     state. See section 4.4 for more details.

   - Introduce extensions to enable the above procedures to be
     backward compatible with routers along the LSP path running
     implementations that do not support these procedures. See section
     4.5 for more details.

4.1. Signaling Protection availability in Path RRO

   [SUMMARY-FRR] defines a mechanism for the PLR to signal the
   association of a protected LSP to a bypass LSP by introducing a new
   sub-object in the RRO carried in Path and Resv messages.
   Implementations supporting this document SHOULD support the
   SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object in the RRO carried in the
   Path message. Implementations MAY also support the
   SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object in the RRO carried in the
   Resv message.

4.1.1. PLR Behavior

   As per the procedures specified in RFC 4090, when a protected LSP
   comes up and if the "local protection desired" flag is set in the
   SESSION_ATTRIBUTE object, each node along the LSP path attempts to
   make local protection available for the LSP.

   - If the "node protection desired" flag is set, then the node
     tries to become a PLR by attempting to create an NP-bypass LSP to
     the NNhop node avoiding the Nhop node on the protected LSP path.
     In case node protection could not be made available after some
     time out, the node attempts to create an LP-bypass LSP to the Nhop
     node avoiding only the link that the protected LSP takes to reach
     Nhop

   - If the "node protection desired" flag is not set, then the PLR
     attempts to create an LP-bypass LSP to the Nhop node avoiding the
     link that the protected LSP takes to reach Nhop

   With regard to the PLR procedures described above and specified in
   RFC 4090, this document specifies the following recommendations
   involving address selection.

   - While selecting the destination address of the bypass LSP, the
     PLR SHOULD attempt to select the router ID of the NNhop or Nhop
     node. If the PLR and the MP are in the same area, then the PLR may
     utilize the TED to determine the router ID from the interface
     address in RRO (if NodeID is not included in RRO). If the PLR and
     the MP are in different IGP areas, then the PLR SHOULD use the
     NodeID address of the NNhop MP if included in the RRO of RESV. If
     the NP-MP in a different area has not included NodeID in RRO, then
     the PLR SHOULD use the NP-MP's interface address present in the
     RRO. The PLR SHOULD use its router ID as the source address of the
     bypass LSP. The PLR SHOULD also include its router ID as the
     NodeID in PATH RRO unless configured explicitly not to include
     NodeID.

   In parallel to the attempt made to create NP-bypass or LP-bypass,
   the PLR SHOULD initiate a Node-ID based Hello session to the NNhop
   or Nhop node respectively to establish the RSVP-TE signaling
   adjacency. This Hello session is used to track the state of the
   adjacency, including detection of adjacency failure.

   - If the bypass LSP comes up, then the PLR SHOULD include the
     SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object in RRO and trigger PATH
     to be sent. If the sub-object is included in PATH RRO, then the
     encoding rules specified in [SUMMARY-FRR] SHOULD be followed.

   - After signaling protection availability, if the PLR finds that
     the protection becomes unavailable then it SHOULD attempt to make
     protection available. The PLR SHOULD wait for a time out before
     removing the SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object in RRO and
     triggering PATH downstream. On the other hand, the PLR need not
     wait for a time out to add the SUMMARY_FRR_BYPASS_ASSIGNMENT sub-
     object in RRO and may immediately trigger PATH downstream.

4.1.2. Remote Signaling Adjacency

   A NodeID based RSVP-TE Hello session is one in which NodeID is used
   in the source and destination address fields of the RSVP Hello.
   [RFC4558] formalizes NodeID based Hello messages between two
   routers. This document extends the NodeID based RSVP Hello session
   to track the state of an RSVP-TE neighbor that is not directly
   connected by at least one interface. In order to apply a NodeID
   based RSVP-TE Hello session between any two routers that are not
   immediate neighbors, the router that supports the extensions defined
   in this document SHOULD set TTL to 255 in the NodeID based Hello
   messages exchanged between PLR and MP.
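   The address selection recommendations of Section 4.1.1 and the
   Hello TTL rule above can be sketched together as follows. This is
   an illustrative sketch only. The MergePointInfo structure and both
   function names are hypothetical, falling back to the interface
   address when a same-area TED lookup yields nothing is an
   assumption, and so is reusing the bypass destination as the Hello
   destination. The addresses are documentation examples.

```python
from dataclasses import dataclass
from typing import Optional

REMOTE_HELLO_TTL = 255  # TTL for Node-ID Hellos between non-adjacent PLR and MP


@dataclass
class MergePointInfo:
    """What a PLR knows about a candidate MP (hypothetical structure)."""
    rro_interface_addr: str        # MP interface address seen in the RRO
    rro_node_id: Optional[str]     # NodeID from the RRO, if the MP included one
    same_igp_area: bool            # True if PLR and MP share an IGP area
    ted_router_id: Optional[str]   # router ID resolved through the TED, if any


def bypass_destination(mp: MergePointInfo) -> str:
    """Choose the bypass LSP destination address for the given MP."""
    if mp.rro_node_id is not None:
        return mp.rro_node_id              # NodeID is preferred when present
    if mp.same_igp_area and mp.ted_router_id is not None:
        return mp.ted_router_id            # same area: map interface to router ID
    return mp.rro_interface_addr           # otherwise use the interface address


def remote_hello(plr_router_id: str, mp: MergePointInfo) -> dict:
    """Describe the Node-ID Hello that the PLR starts toward the MP."""
    return {
        "src": plr_router_id,
        "dst": bypass_destination(mp),     # assumption: same address as the bypass
        "ttl": REMOTE_HELLO_TTL,
    }


np_mp = MergePointInfo("192.0.2.10", None, True, "192.0.2.1")
print(remote_hello("192.0.2.2", np_mp))
```

   Keeping the choice in one function means the bypass LSP and the
   Hello session are aimed at the same address in this sketch, so a
   Hello failure refers to the same node that terminates the bypass.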
   In the rest of the document the term "signaling adjacency", or
   "remote signaling adjacency", refers specifically to the RSVP-TE
   signaling adjacency.

4.1.3. SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object Propagation

   The propagation rules specified in [SUMMARY-FRR] for the
   SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object in the RRO carried in Path
   SHOULD be followed.

4.1.4. MP Behavior

   When the NNhop or Nhop node receives the triggered PATH with a
   "matching" SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object, the node should
   consider itself as the MP for the PLR IP address "corresponding" to
   the SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object in RRO. As specified in
   [SUMMARY-FRR], a SUMMARY_FRR_BYPASS_ASSIGNMENT in PATH RRO is said
   to "match" a node if "Bypass Destination Address" matches a local
   address and "Bypass Tunnel ID" matches an LSP terminating on the
   node. Also, a SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object is said to
   "correspond" to an IP address in RRO if the sub-object is present in
   RRO after the IP address but before the next hop router's IP
   address. The ordering rules of SUMMARY_FRR_BYPASS_ASSIGNMENT
   specified in [SUMMARY-FRR] SHOULD be followed by implementations
   supporting this document.

   In addition to the above procedures, the node SHOULD check the
   presence of a remote signaling adjacency with the PLR (this check is
   needed to detect the network being partitioned). If a matching
   SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object is found in RRO and the
   RSVP-TE signaling adjacency is present, the node concludes that the
   PLR will undertake the refresh-interval independent FRR procedures
   specified in this document. If the PLR has included NodeID in PATH
   RRO, then that NodeID is the remote neighbor address. Otherwise, the
   PLR's interface address in RRO will be the remote neighbor address.
   If a matching SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object is included
   by the PPhop node, then the node concludes it is NP-MP. If a
   matching SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object is included by the
   Phop node, it concludes it is LP-MP.

4.1.5. "Remote" state on MP

   Once a router concludes it is MP for a PLR running refresh-interval
   independent FRR procedures, it SHOULD create a remote path state for
   the LSP. The "remote" state is identical to the protected LSP path
   state except for the difference in the HOP object. The HOP object
   corresponding to the "remote" path state contains the address of the
   remote node signaling adjacency with the PLR.

   The MP SHOULD consider the "remote" path state automatically deleted
   if:

   - MP later receives a PATH with no matching
     SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object corresponding to the PLR
     RRO, or

   - Node signaling adjacency with PLR goes down, or

   - MP receives backup LSP signaling from PLR, or

   - MP receives PathTear, or

   - MP deletes the LSP state on local policy or exception event

   Unlike the normal path state that is either locally generated on
   Ingress or created from a PATH message from the Phop node, the
   "remote" path state is not signaled explicitly from the PLR. The
   purpose of the "remote" path state is to enable the PLR to
   explicitly tear down path and reservation states corresponding to
   the LSP by sending a tear message for the "remote" path state. Such
   a message tearing down "remote" path state is called "Remote"
   PathTear.

   The scenarios in which "Remote" PathTear is applied are described in
   Section 4.4 - Remote State Teardown.

4.2. Impact of Failures on LSP State

   This section describes the procedures for routers on the LSP path
   for different kinds of failures. The procedures described on
   detecting RSVP control plane adjacency failures do not impact the
   RSVP-TE graceful restart mechanisms ([RFC3473], [RFC5063]).
   If the router executing these procedures acts as a helper for a
   neighboring router, then the control plane adjacency will be
   declared as having failed after taking into account the grace period
   extended for the neighbor by the helper.

   It should be noted that even though this section and the subsequent
   sections of the document mention "link failure" and "node failure"
   separately involving upstream or downstream of a protected LSP, a
   router implementing the procedures specified in the document need
   not have a mechanism to distinguish between these two types of
   failures. Optionally, a router MAY run a Node-ID based RSVP-TE
   signaling adjacency with immediate neighbors to distinguish between
   these two types of failures.

4.2.1. Non-MP Behavior on Phop Link/Node Failure

   When a router detects Phop link or Phop node failure and the router
   is not an MP for the LSP, then it SHOULD send Conditional PathTear
   (refer to Section 4.3, "Conditional Path Tear") and delete PSB and
   RSB states corresponding to the LSP.

4.2.2. LP-MP Behavior on Phop Link Failure

   When the Phop link for an LSP fails on a router that is LP-MP for
   the LSP, the LP-MP SHOULD retain PSB and RSB states corresponding to
   the LSP till the occurrence of any of the following events.

   - Node-ID signaling adjacency with Phop PLR goes down, or

   - MP receives normal or "Remote" PathTear for PSB, or

   - MP receives ResvTear for RSB.

4.2.3. LP-MP Behavior on Phop Node Failure

   When a router that is LP-MP for an LSP detects Phop node failure
   from Node-ID signaling adjacency state, the LP-MP SHOULD send normal
   PathTear and delete PSB and RSB states corresponding to the LSP.

4.2.4. NP-MP Behavior on Phop Link/Node Failure

   When a router that is NP-MP for an LSP detects Phop link failure, or
   Phop node failure from Node-ID signaling adjacency, the router
   SHOULD retain PSB and RSB states corresponding to the LSP till the
   occurrence of any of the following events.

   - Remote Node-ID signaling adjacency with PPhop PLR goes down, or

   - MP receives normal or "Remote" PathTear for PSB, or

   - MP receives ResvTear for RSB.

4.2.5. NP-MP Behavior on PLR Link Failure

   If the PLR link that is not attached to NP-MP fails and if NP-MP
   receives Conditional PathTear from the Phop node, then the MP SHOULD
   retain PSB and RSB states corresponding to the LSP till the
   occurrence of any of the following events.

   - Remote Node-ID signaling adjacency with PPhop PLR goes down, or

   - MP receives normal or "Remote" PathTear for PSB, or

   - MP receives ResvTear for RSB.

   Receiving Conditional PathTear from the Phop node will not impact
   the "remote" state from the PLR. Note that the Phop node would send
   Conditional PathTear if it was not an MP.

   In the example topology in Figure 1, assume C and D are NP-MPs for
   PLRs A and B respectively. Now when the A-B link fails, as B is not
   MP and its Phop link signaling adjacency has failed, B will delete
   LSP state (this behavior is required for unprotected LSPs - Section
   4.2.1). In the data plane, that would require B to delete the label
   forwarding entry corresponding to the LSP. So if B's downstream
   nodes C and D continue to retain state, it would not be correct for
   D to continue to assume itself as NP-MP for PLR B.

   The mechanism that enables D to stop considering itself as NP-MP and
   delete "remote" path state is given below.

   1. When C receives Conditional PathTear from B, it decides to
      retain LSP state as it is NP-MP of PLR A. C also SHOULD check
      whether Phop B had previously signaled availability of node
      protection.
      As B had previously signaled NP availability in its PATH RRO, C
      SHOULD remove the SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object
      corresponding to B from the RRO and trigger PATH to D.

   2. When D receives the triggered PATH, it realizes that it is no
      longer NP-MP and so deletes the "remote" path state. D does not
      propagate PATH further down because the only change is in the
      PATH RRO SUMMARY_FRR_BYPASS_ASSIGNMENT sub-object corresponding
      to B.

4.2.6. Phop Link Failure on a Node that is LP-MP and NP-MP

   A router may be both LP-MP and NP-MP at the same time for the Phop
   and PPhop nodes respectively of an LSP. If the Phop link fails on
   such a node, the node SHOULD retain PSB and RSB states corresponding
   to the LSP till the occurrence of any of the following events.

   - Both Node-ID signaling adjacencies with Phop and PPhop nodes go
     down, or

   - MP receives normal or "Remote" PathTear for PSB, or

   - MP receives ResvTear for RSB.

4.2.7. Phop Node Failure on Node that is LP-MP and NP-MP

   If a router that is both LP-MP and NP-MP detects Phop node failure,
   then the node SHOULD retain PSB and RSB states corresponding to the
   LSP till the occurrence of any of the following events.

   - Remote Node-ID signaling adjacency with PPhop PLR goes down, or

   - MP receives normal or "Remote" PathTear for PSB, or

   - MP receives ResvTear for RSB.

4.3. Conditional Path Tear

   In the example provided in Section 4.2.5, "NP-MP Behavior on PLR
   Link Failure", B deletes PSB and RSB states corresponding to the LSP
   once B detects that its link to Phop went down, as B is not an MP.
   If B were to send PathTear normally, then C would delete LSP state
   immediately. In order to avoid this, there should be some mechanism
   by which B can indicate to C that B does not require the receiving
   node to unconditionally delete the LSP state immediately.
For this, B SHOULD add a new optional object called the CONDITIONS
object in PathTear. The new optional object is defined in Section
4.3.3. If node C also understands the new object, then C SHOULD
delete LSP state only if it is not an NP-MP; in other words, C
SHOULD delete LSP state if there is no "remote" PLR state on C.

4.3.1. Sending Conditional Path Tear

A router that is not an MP for an LSP SHOULD delete the PSB and RSB
states corresponding to the LSP if the Phop link or the Phop Node-ID
signaling adjacency goes down (Section 4.2.1). The router SHOULD
send Conditional PathTear if the following are also true:

- Ingress has requested node protection for the LSP, and

- PathTear is not received from the upstream node.

4.3.2. Processing Conditional Path Tear

When a router that is not an NP-MP receives Conditional PathTear,
the node SHOULD delete the PSB and RSB states corresponding to the
LSP, and process Conditional PathTear by considering it as normal
PathTear. Specifically, the node SHOULD NOT propagate Conditional
PathTear downstream but remove the optional object and send normal
PathTear downstream.

When a node that is an NP-MP receives Conditional PathTear, it
SHOULD NOT delete LSP state. The node SHOULD check whether the Phop
node had previously included the SUMMARY_FRR_BYPASS_ASSOCIATION
sub-object in the PATH RRO. If the sub-object had been included
previously by the Phop, then the node SHOULD remove the
SUMMARY_FRR_BYPASS_ASSOCIATION sub-object corresponding to the Phop
from the RRO and trigger PATH downstream.

If Conditional PathTear is received from a neighbor that has not
advertised support (refer to Section 4.5) for the new procedures
defined in this document, then the node SHOULD consider the message
as normal PathTear. The node SHOULD propagate normal PathTear
downstream and delete LSP state.

4.3.3. CONDITIONS object

Any implementation that does not support Conditional PathTear SHOULD
ignore the new object and process the message as normal PathTear
without generating any error. For this reason, the Class-Num of the
new object SHOULD be of the form 10bbbbbb, where 'b' represents a
bit (see Section 3.10 of [RFC2205]).

The new object is called the "CONDITIONS" object. It specifies the
conditions under which the default processing rules of the RSVP-TE
message SHOULD be invoked.

The object has the following format:

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |     Class     |    C-type     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Reserved                           |M|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Length

   This contains the size of the object in bytes and should be set
   to eight.

Class

   To be assigned

C-type

   1

M bit

   This bit indicates that the message SHOULD be processed
   conditionally, depending on whether or not the receiving node is
   a Merge Point.

4.4. Remote State Teardown

If the Ingress wants to tear down the LSP because of a management
event while the LSP is being locally repaired at a transit PLR, it
would not be desirable to wait until backup LSP signaling completes
to perform state cleanup. To enable LSP state cleanup when the LSP
is being locally repaired, the PLR SHOULD send a "remote" PathTear
message instructing the MP to delete the PSB and RSB states
corresponding to the LSP.

Consider the case where node C in the example topology (Figure 1)
has gone down and B locally repairs the LSP.

1. Ingress A receives a management event to tear down the LSP.
2. A sends normal PathTear to B.
3. To enable LSP state cleanup, B SHOULD send "remote" PathTear with
   the destination IP address set to that of D used in the Node-ID
   signaling adjacency with D, and the HOP object containing the
   local address used in the Node-ID signaling adjacency.
4. B then deletes the PSB and RSB states corresponding to the LSP.
5. On D there would be a remote signaling adjacency with B and so D
   SHOULD accept the remote PathTear and delete the PSB and RSB
   states corresponding to the LSP.

4.4.1. PLR Behavior on Local Repair Failure

If local repair fails on the PLR after a failure, then this should
be considered as a case for cleaning up LSP state from the PLR to
the Egress. The PLR would achieve this using "remote" PathTear to
clean up state from the MP. If the MP has retained state, then it
would propagate PathTear downstream, thereby achieving state
cleanup. Note that in the case of link protection, the PathTear
would be directed to the LP-MP node IP address rather than the Nhop
interface address.

4.4.2. PLR Behavior on Resv RRO Change

When a router that has already made NP available detects a change in
the RRO carried in the RESV message, and the RRO change indicates
that the router's former NP-MP is no longer present in the LSP path,
then the router SHOULD send "Remote" PathTear directly to its former
NP-MP.

In the example topology in Figure 1, assume A has made node
protection available and C has concluded it is NP-MP. When the B-C
link fails, C, following the procedure specified in Section 4.2.4 of
this document, will retain state until the remote Node-ID control
plane adjacency with A goes down, or until PathTear or ResvTear is
received for the PSB or RSB respectively. If B also has made node
protection available, B will eventually complete backup LSP
signaling with its NP-MP D and trigger RESV to A with a changed RRO.
The new RRO of the LSP carried in RESV will not contain C. When A
processes the RESV with a new RRO that does not contain C, its
former NP-MP, A SHOULD send "Remote" PathTear to C.
When C receives a "Remote" PathTear for its PSB state, C will send
normal PathTear downstream to D and delete both the PSB and RSB
states corresponding to the LSP. As D has already received backup
LSP signaling from B, D will retain the control plane and forwarding
states corresponding to the LSP.

4.4.3. LSP Preemption during Local Repair

If an LSP is preempted when there is no failure along the path of
the LSP, the node on which preemption occurs would send PathErr and
ResvTear upstream and only delete the forwarding state and RSB state
corresponding to the LSP. But if the LSP is being locally repaired
upstream of the node on which the LSP is preempted, then the node
SHOULD delete both the PSB and RSB states corresponding to the LSP
and send normal PathTear downstream.

4.4.3.1. Preemption on LP-MP after Phop Link failure

If an LSP is preempted on the LP-MP after its Phop or incoming link
has already failed but the backup LSP has not been signaled yet,
then the node SHOULD send normal PathTear and delete both the PSB
and RSB states corresponding to the LSP. The LP-MP retained LSP
state only because the PLR would signal the LSP through backup LSP
signaling. Preemption brings down the LSP, so the node is no longer
LP-MP and needs to clean up LSP state.

4.4.3.2. Preemption on NP-MP after Phop Link failure

If an LSP is preempted on the NP-MP after its Phop link has already
failed but the backup LSP has not been signaled yet, then the node
SHOULD send normal PathTear and delete the PSB and RSB states
corresponding to the LSP. The NP-MP retained LSP state only because
the PLR would signal the LSP through backup LSP signaling.
Preemption brings down the LSP, so the node is no longer NP-MP and
needs to clean up LSP state.

Consider the case where the B-C link goes down in the same example
topology (Figure 1).
As C is NP-MP for PLR A, C will retain LSP state.

1. The LSP is preempted on C.
2. C will delete the RSB state corresponding to the LSP. But C
   cannot send PathErr or ResvTear to PLR A because the backup LSP
   has not been signaled yet.
3. As the only reason for C having retained state after Phop node
   failure was that it was NP-MP, C SHOULD send normal PathTear to
   D and delete the PSB state also. D would also delete the PSB and
   RSB states on receiving PathTear from C.
4. B starts backup LSP signaling to D. But as D does not have the
   LSP state, it will reject the backup LSP PATH and send PathErr
   to B.
5. B will delete its reservation and send ResvTear to A.

4.5. Backward Compatibility Procedures

In this section, "Refresh interval Independent FRR", or RI-RSVP-FRR,
refers to the changes proposed in the previous sections. Any
implementation that does not support them is termed a
"non-RI-RSVP-FRR implementation". The extensions proposed in
[SUMMARY-FRR] are applicable to implementations that do not support
RI-RSVP-FRR. On the other hand, the changes relating to LSP state
cleanup, namely Conditional and remote PathTear, require support
from one-hop and two-hop neighboring nodes along the LSP path. So
procedures that fall under the LSP state cleanup category SHOULD be
turned on only if all nodes involved in node protection FRR, i.e.,
the PLR, the MP and the intermediate node in the case of NP, support
the extensions. Note that for LSPs requesting only link protection,
the PLR and the LP-MP should support the extensions.

4.5.1. Detecting Support for Refresh interval Independent FRR

An implementation supporting the extensions specified in previous
sections (called RI-RSVP-FRR hereafter) SHOULD set the flag "Refresh
interval Independent RSVP" or RI-RSVP in the CAPABILITY object in
Hello messages. The RI-RSVP flag is specified in [TE-SCALE-REC].
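As a minimal sketch of the flag check described above, the following
Python fragment parses a received CAPABILITY object and tests one
flag bit. The Class-Num (134) and C-Type (1) are those of the
CAPABILITY object in [RFC5063]. The bit position used for the
RI-RSVP flag is a placeholder assumed for illustration only, since
the flag is specified in [TE-SCALE-REC] and not in this document.
The function names are likewise hypothetical.

```python
import struct

CAPABILITY_CLASS_NUM = 134  # CAPABILITY object Class-Num (RFC 5063)
CAPABILITY_C_TYPE = 1       # CAPABILITY object C-Type (RFC 5063)

# ASSUMPTION: placeholder bit position for the RI-RSVP flag. The real
# position is specified in [TE-SCALE-REC], not in this document.
RI_RSVP_FLAG = 0x00000008


def build_capability_object(flags: int) -> bytes:
    """Encode an 8-byte CAPABILITY object: Length, Class-Num, C-Type, flags."""
    return struct.pack("!HBBI", 8, CAPABILITY_CLASS_NUM, CAPABILITY_C_TYPE, flags)


def neighbor_supports_ri_rsvp(capability_object: bytes) -> bool:
    """Return True if the received CAPABILITY object has the RI-RSVP flag set."""
    if len(capability_object) < 8:
        return False  # truncated or absent object: treat as unsupported
    length, class_num, c_type, flags = struct.unpack(
        "!HBBI", capability_object[:8]
    )
    if length < 8 or class_num != CAPABILITY_CLASS_NUM or c_type != CAPABILITY_C_TYPE:
        return False  # not a well-formed CAPABILITY object
    return bool(flags & RI_RSVP_FLAG)
```

A node could run such a check on each Hello received from a Phop,
Nhop or remote Node-ID neighbor, and treat a missing or malformed
object the same as a cleared flag.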
- As nodes supporting the extensions SHOULD initiate Node Hellos
  with adjacent nodes, a node on the path of a protected LSP can
  determine whether its Phop or Nhop neighbor supports RI-RSVP-FRR
  enhancements from the Hello messages sent by the neighbor.

- If a node attempts to make node protection available, then the
  PLR SHOULD initiate a remote Node-ID signaling adjacency with the
  NNhop. If the NNhop (a) does not reply to the remote node Hello
  message or (b) does not set the "Enhanced facility protection"
  flag in the CAPABILITY object in the reply, then the PLR can
  conclude that the NNhop does not support RI-RSVP-FRR extensions.

- If node protection is requested for an LSP and if (a) the PPhop
  node has not included a matching SUMMARY_FRR_BYPASS_ASSIGNMENT
  sub-object in the PATH RRO or (b) the PPhop node has not initiated
  remote node Hello messages, then the node SHOULD conclude that the
  PLR does not support RI-RSVP-FRR extensions. The details are
  described in the "Procedures for backward compatibility" section
  below.

Any node that sets the I-bit in its CAPABILITY object MUST also set
the Refresh-Reduction-Capable bit in the common header of all
RSVP-TE messages.

4.5.2. Procedures for backward compatibility

The procedures defined hereafter are performed on a subset of LSPs
that traverse a node, rather than on all LSPs that traverse a node.
This behavior is required to support backward compatibility for a
subset of LSPs traversing nodes running non-RI-RSVP-FRR
implementations.

4.5.2.1. Lack of support on Downstream Node

- If the Nhop does not support the RI-RSVP-FRR extensions, then the
  node SHOULD reduce the "refresh period" in the TIME_VALUES object
  carried in PATH to a small default refresh value.
- If node protection is requested and the NNhop node does not
  support the enhancements, then the node SHOULD reduce the "refresh
  period" in the TIME_VALUES object carried in PATH to a small
  default refresh value.

If the node reduces the refresh time as a result of the above
procedures, it SHOULD NOT send remote PathTear or Conditional
PathTear messages.

Consider the example topology in Figure 1. If C does not support the
RI-RSVP-FRR extensions, then:

- A and B SHOULD reduce the refresh time to the default value of 30
  seconds and trigger PATH.

- If B is not an MP and if the Phop link of B fails, B cannot send
  Conditional PathTear to C but SHOULD time out PSB state from A
  normally. This would be accomplished if A would also reduce the
  refresh time to the default value. So if C does not support the
  RI-RSVP-FRR extensions, then Phop B and PPhop A SHOULD reduce the
  refresh time to a small default value.

4.5.2.2. Lack of support on Upstream Node

- If the Phop node does not support the RI-RSVP-FRR extensions, then
  the node SHOULD reduce the "refresh period" in the TIME_VALUES
  object carried in RESV to a small default refresh value.

- If node protection is requested and the Phop node does not
  support the RI-RSVP-FRR extensions, then the node SHOULD reduce
  the "refresh period" in the TIME_VALUES object carried in PATH to
  the default value.

- If node protection is requested and the PPhop node does not
  support the RI-RSVP-FRR extensions, then the node SHOULD reduce
  the "refresh period" in the TIME_VALUES object carried in RESV to
  the default value.

- If the node reduces the refresh time as a result of the above
  procedures, it SHOULD NOT execute the MP procedures specified in
  Section 4.2 of this document.

4.5.2.3. Incremental Deployment

The backward compatibility procedures described in the previous
sub-sections imply that a router supporting the RI-RSVP-FRR
extensions specified in this document can apply the procedures
specified in the document either in the downstream or upstream
direction of an LSP, depending on the capability of the routers
downstream or upstream in the LSP path.

- RI-RSVP-FRR extensions and procedures are enabled for downstream
  Path, PathTear and ResvErr messages corresponding to an LSP if
  link protection is requested for the LSP and the Nhop node
  supports the extensions.

- RI-RSVP-FRR extensions and procedures are enabled for downstream
  Path, PathTear and ResvErr messages corresponding to an LSP if
  node protection is requested for the LSP and both the Nhop and
  NNhop nodes support the extensions.

- RI-RSVP-FRR extensions and procedures are enabled for upstream
  PathErr, Resv and ResvTear messages corresponding to an LSP if
  link protection is requested for the LSP and the Phop node
  supports the extensions.

- RI-RSVP-FRR extensions and procedures are enabled for upstream
  PathErr, Resv and ResvTear messages corresponding to an LSP if
  node protection is requested for the LSP and both the Phop and
  PPhop nodes support the extensions.

For example, if an implementation supporting the RI-RSVP-FRR
extensions specified in this document is deployed on all routers in
a particular region of the network and if all the LSPs in the
network request node protection, then the FRR extensions will only
be applied for the LSP segments that traverse that region. This will
aid incremental deployment of these extensions and also allow
reaping the benefits of the extensions in the portions of the
network where they are supported.

5. Security Considerations

The security considerations pertaining to [RFC2205], [RFC3209] and
[RFC5920] remain relevant.
This document extends the applicability of Node-ID based Hello
sessions between immediate neighbors. The Node-ID based Hello
session between a PLR and an NP-MP may require the two routers to
exchange Hello messages with a non-immediate neighbor. Therefore,
implementations SHOULD provide the option to configure a Node-ID
neighbor-specific or global authentication key to authenticate
messages received from Node-ID neighbors. The network administrator
MAY utilize this option to enable RSVP-TE routers to authenticate
Node-ID Hello messages received with TTL greater than 1.
Implementations SHOULD also provide the option to specify a limit on
the number of Node-ID based Hello sessions that can be established
on a router supporting the extensions defined in this document.

6. IANA Considerations

6.1. New Object - CONDITIONS

RSVP Change Guidelines [RFC3936] defines the Class-Number name space
for RSVP objects. The name space is managed by IANA.

IANA registry: RSVP Parameters
Subsection: Class Names, Class Numbers, and Class Types

A new RSVP object called the "CONDITIONS" object, using a
Class-Number from the 128-183 range, is defined in Section 4.3.3 of
this document. The Class-Number from the 128-183 range will be
allocated by IANA.

7. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
          Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4090] Pan, P., "Fast Reroute Extensions to RSVP-TE for LSP
          Tunnels", RFC 4090, May 2005.

[RFC2961] Berger, L., "RSVP Refresh Overhead Reduction Extensions",
          RFC 2961, April 2001.

[RFC2205] Braden, R., "Resource Reservation Protocol (RSVP)", RFC
          2205, September 1997.

[RFC4558] Ali, Z., "Node-ID Based Resource Reservation (RSVP) Hello:
          A Clarification Statement", RFC 4558, June 2006.
[RFC3473] Berger, L., "Generalized Multi-Protocol Label Switching
          Signaling Resource Reservation Protocol-Traffic
          Engineering Extensions", RFC 3473, January 2003.

[RFC5063] Satyanarayana, A., "Extensions to GMPLS Resource
          Reservation Protocol Graceful Restart", RFC 5063, October
          2007.

[TE-SCALE-REC] Vishnu Pavan Beeram et al., "Implementation
          Recommendations to improve scalability of RSVP-TE
          Deployments", draft-beeram-teas-rsvp-te-scaling-rec (work
          in progress)

[SUMMARY-FRR] Mike Taillon et al., "RSVP-TE Summary Fast Reroute
          Extensions for LSP Tunnels",
          draft-mtaillon-mpls-summary-frr-rsvpte (work in progress)

8. Informative References

[RFC5439] Yasukawa, S., "An Analysis of Scaling Issues in MPLS-TE
          Core Networks", RFC 5439, February 2009.

[RFC5920] Fang, L., "Security Framework for MPLS and GMPLS
          Networks", RFC 5920, July 2010.

9. Acknowledgments

We are very grateful to Yakov Rekhter for his contributions to the
development of the idea and thorough review of the content of the
draft. Thanks to Raveendra Torvi and Yimin Shen for their comments
and inputs.

10. Authors' Addresses

Chandra Ramachandran
Juniper Networks
Email: csekar@juniper.net

Ina Minei
Google, Inc
Email: inaminei@google.com

Ebben Aries
Email: e@dscp.org

Dante Pacella
Verizon
Email: dante.j.pacella@verizon.com

Tarek Saad
Cisco Systems Inc.
Email: tsaad@cisco.com

Markus Jork
Juniper Networks
Email: mjork@juniper.net

Harish Sitaraman
Juniper Networks
Email: hsitaraman@juniper.net

Vishnu Pavan Beeram
Juniper Networks
Email: vbeeram@juniper.net

Mike Taillon
Cisco Systems Inc.
Email: mtallion@cisco.com