idnits 2.17.1 draft-ietf-ccamp-crankback-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3667, Section 5.1 on line 24. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1534. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1401. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1408. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1414. ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement -- however, there's a paragraph with a matching beginning. Boilerplate error? ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced: By submitting this Internet-Draft, I certify that any applicable patent or other IPR claims of which I am aware have been disclosed, or will be disclosed, and any of which I become aware will be disclosed, in accordance with RFC 3668. 
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == The page length should not exceed 58 lines per page, but there was 31 longer pages, the longest (page 1) being 60 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (October 2004) is 7132 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RC3473' is mentioned on line 188, but not defined == Missing Reference: 'RFC 3209' is mentioned on line 1219, but not defined == Missing Reference: 'RFC 3473' is mentioned on line 1220, but not defined == Missing Reference: 'RFC 2205' is mentioned on line 1191, but not defined == Unused Reference: 'ASON-REQ' is defined on line 1441, but no explicit reference was found in the text == Unused Reference: 'G8080' is defined on line 1458, but no explicit reference was found in the text == Outdated reference: A later version (-05) exists of draft-ietf-mpls-rsvpte-attributes-04 -- Possible downref: Non-RFC (?) normative reference: ref. 'ASON-REQ' == Outdated reference: A later version (-07) exists of draft-ietf-mpls-rsvp-lsp-fastreroute-06 == Outdated reference: A later version (-06) exists of draft-ietf-ccamp-rsvp-te-exclude-route-02 Summary: 5 errors (**), 0 flaws (~~), 11 warnings (==), 8 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group Adrian Farrel (editor) 3 Internet Draft Old Dog Consulting 4 Category: Standards Track 5 Expires: April 2005 Arun Satyanarayana 6 Movaz Networks, Inc. 8 Atsushi Iwata 9 Norihito Fujita 10 NEC Corporation 12 Gerald R. Ash (AT&T) 14 October 2004 16 Crankback Signaling Extensions for MPLS Signaling 17 19 Status of this Memo 21 By submitting this Internet-Draft, I certify that any applicable 22 patent or other IPR claims of which I am aware have been disclosed, 23 or will be disclosed, and any of which I become aware will be 24 disclosed, in accordance with RFC 3668. 
26 Internet-Drafts are working documents of the Internet Engineering 27 Task Force (IETF), its areas, and its working groups. Note that other 28 groups may also distribute working documents as Internet-Drafts. 30 Internet-Drafts are draft documents valid for a maximum of six months 31 and may be updated, replaced, or obsoleted by other documents at any 32 time. It is inappropriate to use Internet-Drafts as reference 33 material or to cite them other than as "work in progress." 35 The list of current Internet-Drafts can be accessed at 36 http://www.ietf.org/ietf/1id-abstracts.txt 38 The list of Internet-Draft Shadow Directories can be 39 accessed at http://www.ietf.org/shadow.html. 41 Abstract 43 In a distributed, constraint-based routing environment, the 44 information used to compute a path may be out of date. This means 45 that Multiprotocol Label Switching (MPLS) label switched path (LSP) 46 setup requests may be blocked by links or nodes without sufficient 47 resources. Crankback is a scheme whereby setup failure information is 48 returned from the point of failure to allow new setup attempts to be 49 made avoiding the blocked resources. Crankback can also be applied to 50 LSP restoration to indicate the location of the failed link or node. 52 This document specifies crankback signaling extensions for use in 53 MPLS signaling using RSVP-TE as defined in "RSVP-TE: Extensions to 54 RSVP for LSP Tunnels", RFC3209, so that the LSP setup request can be 55 retried on an alternate path that detours around blocked links or 56 nodes. This offers significant improvements in the successful setup 57 and recovery ratios for LSPs, especially in situations where a large 58 number of setup requests are triggered at the same time. 60 Table of Contents 62 Section A : Problem Statement 64 1. Terminology......................................................3 65 2. Introduction and Framework.......................................3 66 2.1. 
Background.....................................................3 67 2.2. Repair and Restoration.........................................4 68 3. Discussion: Explicit Versus Implicit Re-routing Indications......5 69 4. Required Operation...............................................6 70 4.1. Resource Failure or Unavailability.............................6 71 4.2. Computation of an Alternate Path...............................6 72 4.2.1 Information Required for Re-routing...........................7 73 4.2.2 Signaling a New Route.........................................7 74 4.3. Persistence of Error Information...............................7 75 4.4. Handling Re-route Failure......................................7 76 4.5. Limiting Re-routing Attempts...................................8 77 5. Existing Protocol Support for Crankback Re-routing...............8 78 5.1. RSVP-TE [RFC 3209].............................................9 79 5.2. GMPLS-RSVP-TE [RFC 3473].......................................9 81 Section B : Solution 83 6. Control of Crankback Operation..................................10 84 6.1. Requesting Crankback and Controlling In-Network Re-routing....10 85 6.2. Action on Detecting a Failure.................................11 86 6.3. Limiting Re-routing Attempts..................................11 87 6.3.1 New Status Codes for Re-routing..............................11 88 6.4. Protocol Control of Re-routing Behavior.......................11 89 7. Reporting Crankback Information.................................12 90 7.1. Required Information..........................................12 91 7.2. 
Protocol Extensions...........................................12 92 7.3 Guidance for Use of IF_ID Error Spec TLVs......................16 93 7.3.1 General Principles...........................................16 94 7.3.2 Error Report TLVs............................................17 95 7.3.3 Fundamental Crankback TLVs...................................17 96 7.3.4 Additional Crankback TLVs....................................18 97 7.3.5 Grouping TLVs by Failure Location............................19 98 7.3.6 Alternate Path identification................................20 99 7.4. Action on Receiving Crankback Information.....................20 100 7.4.1 Re-route Attempts............................................20 101 7.4.2 Location Identifiers of Blocked Links or Nodes...............20 102 7.4.3 Locating Errors within Loose or Abstract Nodes...............21 103 7.4.4 When Re-routing Fails........................................21 104 7.4.5 Aggregation of Crankback Information.........................21 105 7.5. Notification of Errors........................................22 106 7.5.1 ResvErr Processing...........................................22 107 7.5.2 Notify Message Processing....................................22 108 7.6. Error Values..................................................23 109 7.7. Backward Compatibility........................................23 110 8. Routing Protocol Interactions...................................23 111 9. LSP Restoration Considerations..................................24 112 9.1. Upstream of the Fault.........................................24 113 9.2. Downstream of the Fault.......................................25 114 10. IANA Considerations............................................25 115 10.1. Error Codes..................................................25 116 10.2. IF_ID_ERROR_SPEC TLVs........................................25 117 10.3. LSP_ATTRIBUTES Object........................................25 118 11. 
Security Considerations........................................26 119 12. Acknowledgments................................................26 120 13. Intellectual Property Considerations...........................26 121 14. Normative References...........................................26 122 15. Informational References.......................................27 123 16. Authors' Addresses.............................................28 124 17. Disclaimer of Validity.........................................29 125 18. Full Copyright Statement.......................................29 126 A. Experience of Crankback in TDM-based Networks..................30 128 Section A : Problem Statement 130 0. Changes 132 (This section to be removed before publication as an RFC.) 134 0.1 Changes from 01 to 02, and 02 to 03 Versions 136 - Update IPR and copyright 137 - Update references 139 0.2 Changes from 00 to 01 Versions 141 - Removal of background descriptive material pertaining to TDM 142 network experience from section 3 to an Appendix. 143 - Removal of definition of Error Spec TLVs for unnumbered bundled 144 links from section 7.2 to a separate document. 145 - More detailed guidance on which Error Spec TLVs to use when. 146 - Change LSP_ATTRIBUTE flags from hex values to bit numbers. 147 - Typographic errors fixed. 148 - Update references. 150 1. Terminology 152 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 153 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 154 document are to be interpreted as described in [RFC2119]. 156 2. Introduction and Framework 158 2.1. Background 160 RSVP-TE (RSVP Extensions for LSP Tunnels) [RFC3209] can be 161 used for establishing explicitly routed LSPs in an MPLS 162 network. Using RSVP-TE, resources can also be reserved 163 along a path to guarantee or control QoS for traffic 164 carried on the LSP. 
To designate an explicit path that 165 satisfies QoS constraints, it is necessary to discern the 166 resources available at each link or node in the network. 167 For the collection of such resource information, routing 168 protocols, such as OSPF and IS-IS, can be extended to 169 distribute additional state information [RFC2702]. 171 Explicit paths can be computed based on the distributed 172 information at the LSR initiating an LSP and signaled as 173 Explicit Routes during LSP establishment. Explicit Routes 174 may contain 'loose hops' and 'abstract nodes' that convey 175 routing through any of a collection of nodes. This 176 mechanism may be used to devolve parts of the path 177 computation to intermediate nodes such as area border LSRs. 179 In a distributed routing environment, however, the 180 resource information used to compute a constraint-based 181 path may be out of date. This means that a setup request 182 may be blocked, for example, because a link or node along 183 the selected path has insufficient resources. 185 In RSVP-TE, a blocked LSP setup may result in a PathErr 186 message sent to the initiator, or a ResvErr sent to the 187 terminator (egress LSR). These messages may result in the 188 LSP setup being abandoned. In Generalized MPLS [RFC3473] 189 the Notify message may additionally be used to expedite 190 notification of LSP failures to ingress and egress LSRs, 191 or to a specific "repair point". 193 These existing mechanisms provide a certain amount of 194 information about the path of the failed LSP. 196 2.2. Repair and Restoration 198 If the ingress LSR or intermediate area border LSR knows 199 the location of the blocked link or node, the LSR can 200 designate an alternate path and then reissue the setup 201 request. Determination of the identity of the blocked 202 link or node can be achieved by the mechanism known as 203 crankback routing [PNNI, ASH1]. 
In RSVP-TE, crankback 204 signaling requires notifying an upstream LSR of the 205 location of the blocked link or node. In some cases this 206 requires more information than is currently available in 207 the signaling protocols. 209 On the other hand, various restoration schemes for link 210 or node failures have been proposed in [RFC3469] and 211 include fast restoration. These schemes rely on 212 the existence of a backup LSP to protect the primary, but 213 if both the primary and backup paths fail it is necessary 214 to re-establish the LSP on an end-to-end basis avoiding 215 the known failures. Similarly, fast restoration by 216 establishing a restoration path on demand after failure 217 requires computation of a new LSP that avoids the known 218 failures. End-to-end restoration for alternate routing 219 requires the location of the failed link or node. 220 Crankback routing schemes could be used to notify 221 upstream LSRs of the location of the failure. 223 Furthermore, in situations where many link or node 224 failures occur at the same time, the difference between 225 the distributed routing information and the real-time 226 network state becomes much greater than in normal LSP 227 setups. LSP restoration might, therefore, be performed 228 with inaccurate information, which is likely to cause 229 setup blocking. Crankback routing could improve failure 230 recovery in these situations. 232 Generalized MPLS [RFC3471] extends MPLS into networks 233 that manage Layer 2, TDM and lambda resources. In a 234 network without wavelength converters, setup requests are 235 likely to be blocked more often than in a conventional 236 MPLS environment because the same wavelength must be 237 allocated at each Optical Cross-Connect on an end-to-end 238 explicit path. Furthermore, end-to-end restoration is the 239 only way to recover LSP failures. 
This implies that 240 crankback routing would also be useful in a GMPLS 241 network, in particular in dynamic LSP re-routing cases 242 (no backup LSP pre-establishment). 244 3. Discussion: Explicit Versus Implicit Re-routing Indications 246 There have been problems in service provider networks 247 when "inferring" from indirect information that re-routing 248 is allowed. This document proposes the use of an explicit 249 indication that authorizes re-routing. 251 Various existing protocol options and exchanges, including 252 the error values of the PathErr message [RFC2205, RFC3209] 253 and the Notify message [RFC3473], allow an implementation 254 to infer a situation where re-routing can be done. This 255 allows for recovery from network errors or resource 256 contention. 258 However, such inference of recovery signaling is not always 259 desirable since it may be doomed to failure. For example, 260 experience of using release messages in TDM-based networks for 261 analogous implicit and explicit re-routing indications 262 provides some guidance. This background information is given in 263 Appendix A. 265 It is certainly the case that with topology exchange, 266 such as OSPF, the ingress LSR could infer the re-routing 267 condition. However, convergence of routing information is 268 typically slower than the expected LSP setup times. One of 269 the reasons for crankback is to avoid the overhead of 270 available-link-bandwidth flooding, and to more efficiently 271 use local state information to direct alternate routing 272 at the ingress LSR. 274 [ASH1] shows how event-dependent routing can use crankback alone, 275 without available-link-bandwidth flooding, to decide on the 276 re-route path in the network through "learning models". Reducing 277 this flooding reduces overhead and can lead to the ability to 278 support much larger AS sizes. 
280 Therefore, the alternate routing should be indicated based on 281 an explicit indication, and it is best to know the following 282 information separately: 284 - where blockage/congestion occurred 285 - whether alternate routing "should" be attempted. 287 4. Required Operation 289 Section 2 identifies some of the circumstances under which 290 crankback may be useful. Crankback routing is performed as 291 described in the following procedures, when an LSP setup 292 request is blocked along the path, or when an existing LSP fails. 294 4.1. Resource Failure or Unavailability 296 When an LSP setup request is blocked due to unavailable 297 resources, an error message response with the location 298 identifier of the blockage should be returned to the LSR 299 initiating the LSP setup (ingress LSR), the area border 300 LSR, the AS border LSR, or to some other repair point. 302 This error message carries an error specification 303 according to [RFC3209] - this indicates the cause of the 304 error and the node/link on which the error occurred. 305 Crankback operation may require further information as 306 detailed in sections 4.2.1 and 7. 308 4.2. Computation of an Alternate Path 310 In a flat network without partitioning, when the ingress 311 LSR receives the error message it computes an alternate 312 path around the blocked link or node to satisfy QoS 313 constraints using link state information about the network. 314 If an alternate path is found, a new LSP setup request is 315 sent over this path. 317 On the other hand, in a network partitioned into areas 318 such as with hierarchical OSPF, an area border LSR may 319 intercept and terminate the error response, and perform 320 alternate (re-)routing within the downstream area. 322 In a third scenario, any node within an area may act as a 323 repair point. In this case, each LSR behaves much as an 324 area border LSR as described above. 
It can intercept and 325 terminate the error response, and perform alternate 326 routing. This may be particularly useful where domains of 327 computation are applied within the network. However, if 328 all nodes in the network perform re-routing, it is 329 possible to spend excessive network and CPU resources on 330 re-routing attempts that would be better made only at 331 designated re-routing nodes. This scenario is somewhat 332 like 'MPLS fast re-route' [FASTRR], in which any node in 333 the MPLS domain can establish 'local repair' LSPs after 334 failure notification. 336 4.2.1 Information Required for Re-routing 338 In order to correctly compute a route that avoids the 339 blocking problem, a repair point LSR must gather as much 340 crankback information as possible. Ideally, the repair 341 node will be given the node, link and reason for the 342 failure. 344 However, this information may not be enough to help with 345 re-computation. Consider for instance an explicit route 346 that contains a non-explicit abstract node or a loose 347 hop. In this case, the identity of the failed node and link is not 348 necessarily enough to tell the repair point which hop in 349 the explicit route has failed. The crankback information 350 needs to provide context within the explicit route. 352 4.2.2 Signaling a New Route 354 If the crankback information can be used to compute a new 355 route avoiding the blocking problem, the route can be 356 signaled as an Explicit Route. 358 However, it may be that the repair point does not have 359 sufficient topology information to compute an Explicit 360 Route that is guaranteed to avoid the failed link or 361 node. In this case, Route Exclusions [EXCLUDE] may be 362 particularly helpful. To achieve this, [EXCLUDE] allows 363 the crankback information to be presented as route exclusions 364 to force avoidance of the failed node, link or resource. 366 4.3. 
Persistence of Error Information 368 The repair point LSR that computes the alternate path 369 should store the location identifiers of the blockages 370 indicated in the error message until the LSP is 371 successfully established or until the LSR abandons re-routing 372 attempts. Since crankback routing may happen more than once 373 while establishing a specific LSP, a history table of all 374 experienced blockages for this LSP SHOULD be maintained (at 375 least until the routing protocol updates the state of this 376 information) to perform an accurate path computation to 377 detour all blockages. 379 If a second error response is received by a repair point (while 380 it is performing crankback re-routing) it should update the 381 history table that lists all experienced blockages, and use the 382 entire gathered information when making a further re-routing attempt. 384 4.4. Handling Re-route Failure 386 Multiple blockages (for the same LSP) may occur, and successive 387 setup retry attempts may fail. Retaining error information from 388 previous attempts ensures that there is no thrashing of setup 389 attempts, and knowledge of the blockages increases with each 390 attempt. 392 It may be that after several retries, a given repair point is 393 unable to compute a path to the destination (that is, the egress 394 of the LSP) that avoids all of the blockages. In this case, it 395 must pass the error indication upstream. It is most useful to the 396 upstream nodes (and in particular the ingress LSR) that may, 397 themselves, attempt new routes for the LSP setup, if the error 398 indication in this case identifies all of the downstream blockages 399 and also the node that has been unable to compute an alternate path. 401 4.5. Limiting Re-routing Attempts 403 It is important to prevent an endless repetition of LSP 404 setup attempts using crankback routing information after 405 error conditions are signaled, or during periods of high 406 congestion. 
It may also be useful to reduce the number of 407 retries, since failed retries will increase setup latency 408 and degrade performance. 410 The maximum number of crankback re-routing attempts 411 allowed may be limited in a variety of ways. The number 412 may be limited by LSP, by node, by area or by AS. Control 413 of the limit may be applied as a configuration item per 414 LSP, per node, per area or per AS. 416 When the number of retries at a particular node, area or 417 AS is exceeded, the LSR handling the current failure 418 reports the failure upstream to the next node, area or AS 419 where further re-routing attempts may be attempted. It is 420 important that the crankback information provided 421 indicates that routing back through this node, area or AS 422 will not succeed - this situation is similar to that in 423 section 4.4. Note that in some circumstances, such a 424 report will also mean that no further re-routing attempts 425 can possibly succeed - for example, when the egress node 426 is within the failed area. 428 When the maximum number of retries for a specific LSP has 429 been exceeded, the LSR handling the current failure 430 should send an error message upstream indicating "Maximum 431 number of re-routings exceeded". This error will be 432 passed back to the ingress LSR with no further re-routing 433 attempts. The ingress LSR may choose to retry the LSP 434 setup according to local policy and might choose to re-use 435 its original path or seek to compute a path that avoids 436 the blocked resources. In the latter case, it may be 437 useful to indicate the blocked resource in this error 438 message. 440 5. Existing Protocol Support for Crankback Re-routing 442 Crankback re-routing is appropriate for use with RSVP-TE. 444 1) LSP establishment may fail because of an inability to 445 route, perhaps because links are down. In this case a 446 PathErr message is returned to the initiator. 
448 2) LSP establishment may fail because resources are 449 unavailable. This is particularly relevant in GMPLS where 450 explicit label control may be in use. Again, a PathErr 451 message is returned to the initiator. 453 3) Resource reservation may fail during LSP establishment, 454 as the Resv message is processed. If 455 resources are not available on the required link or at a 456 specific node, a ResvErr message is returned to the egress 457 node indicating "Admission Control failure" [RFC2205]. The 458 egress is allowed to change the FLOWSPEC and try again, but 459 in the event that this is not practical or not supported 460 (particularly in the GMPLS context), the egress LSR may 461 choose to take any one of the following actions. 463 - Ignore the situation and allow recovery to happen through 464 Path refresh messages and refresh timeout [RFC2205]. 465 - Send a PathErr message towards the initiator indicating 466 "Admission Control failure". 467 - Send a ResvTear message towards the initiator to abort 468 the LSP setup. 470 Note that in multi-area/AS networks, the ResvErr might be 471 intercepted and acted on at an area/AS border router. 473 4) It is also possible to make resource reservations on the forward 474 path as the Path message is processed. This choice is compatible 475 with LSP setup in GMPLS networks [RFC3471]. In this case, if 476 resources are not available, a PathErr message is returned to 477 the initiator indicating "Admission Control failure". 479 Crankback information would be useful to an upstream node (such as 480 the ingress) if it is supplied on a PathErr or a Notify message that 481 is sent upstream. 483 5.1. RSVP-TE [RFC 3209] 485 In RSVP-TE a failed LSP setup attempt results in a PathErr 486 message being returned upstream. The PathErr message carries an 487 ERROR_SPEC object, which indicates the node or interface 488 reporting the error and the reason for the failure. 
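As an illustration of the information an upstream node obtains from the ERROR_SPEC object, the following minimal Python sketch decodes the body of an IPv4 ERROR_SPEC object, assuming the layout given in [RFC2205] (Class 6, C-Type 1: a 4-byte error node address, a 1-byte flags field, a 1-byte error code and a 2-byte error value). The names ErrorSpec and parse_ipv4_error_spec and the sample bytes are hypothetical and are not defined by this document or by any RSVP implementation.

```python
import ipaddress
import struct
from dataclasses import dataclass

# Error codes as assigned in RFC 2205 and RFC 3209.
ADMISSION_CONTROL_FAILURE = 1
ROUTING_PROBLEM = 24


@dataclass(frozen=True)
class ErrorSpec:
    """Decoded body of an IPv4 ERROR_SPEC object (hypothetical helper type)."""
    node: ipaddress.IPv4Address   # node reporting the error
    flags: int
    code: int                     # reason for the failure
    value: int                    # refinement of the error code


def parse_ipv4_error_spec(body: bytes) -> ErrorSpec:
    """Decode the 8-byte object body; the 4-byte object header is not included."""
    if len(body) != 8:
        raise ValueError("IPv4 ERROR_SPEC body must be exactly 8 bytes")
    # Network byte order: 4-byte address, 1-byte flags, 1-byte code, 2-byte value.
    addr, flags, code, value = struct.unpack("!4sBBH", body)
    return ErrorSpec(ipaddress.IPv4Address(addr), flags, code, value)


# Made-up sample: node 192.0.2.7 reports "Routing Problem" with an arbitrary value.
sample = bytes([192, 0, 2, 7, 0x00, ROUTING_PROBLEM, 0x00, 0x05])
spec = parse_ipv4_error_spec(sample)
print(spec.node, spec.code, spec.value)
```

A re-computing node would typically add spec.node to the set of resources to avoid before it computes a new route.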
490 Crankback re-routing can be performed explicitly avoiding 491 the node or interface reported. 493 5.2. GMPLS-RSVP-TE [RFC 3473] 495 GMPLS extends the error reporting described above by 496 allowing LSRs to report the interface that is in error in 497 addition to the identity of the node reporting the error. 498 This further enhances the ability of a re-computing node 499 to route around the error. 501 GMPLS introduces a targeted Notify message that may be used to 502 report LSP failures direct to a selected node. This message carries 503 the same error reporting facilities as described above. The Notify 504 message may be used to expedite the propagation of error 505 notifications, but in a network that offers crankback routing at 506 multiple nodes there would need to be some agreement between LSRs 507 as to whether PathErr or Notify provides the stimulus for crankback 508 operation. Otherwise, multiple nodes might attempt to repair the LSP 509 at the same time, because 511 1) these messages can flow through different paths before 512 reaching the ingress LSR, and 513 2) the destination of the Notify message might not be the 514 ingress LSR. 516 Section B : Solution 518 6. Control of Crankback Operation 520 6.1. Requesting Crankback and Controlling In-Network Re-routing 522 When a request is made to set up an LSP tunnel, the ingress LSR 523 should specify whether it wants crankback information to be collected 524 in the event of a failure, and whether it requests re-routing 525 attempts by any or specific intermediate nodes. For this purpose, a 526 Re-routing Flag field is added to the protocol setup request 527 messages. The corresponding values are mutually exclusive. 529 No Re-routing The ingress node MAY attempt re-routing after 530 failure. Intermediate nodes SHOULD NOT attempt 531 re-routing after failure. Nodes detecting 532 failures MUST report an error and MAY supply 533 crankback information. This is the default 534 and backwards compatible option. 
536 End-to-end Re-routing The ingress node MAY attempt re-routing after 537 failure. Intermediate nodes SHOULD NOT attempt 538 re-routing after failure. Nodes detecting 539 failures MUST report an error and SHOULD 540 supply crankback information. 542 Boundary Re-routing Intermediate nodes MAY attempt re-routing 543 after failure only if they are Area Border 544 Routers or AS Border Routers. The boundary 545 (ABR/ASBR) can either decide to forward the 546 error message upstream to the ingress 547 LSR or try to select another egress boundary 548 LSR. Other intermediate nodes SHOULD NOT 549 attempt re-routing. Nodes detecting failures 550 MUST report an error and SHOULD supply 551 crankback information. 553 Segment-based Re-routing 554 All nodes MAY attempt re-routing after 555 failure. Nodes detecting failures MUST report 556 an error and SHOULD supply full crankback 557 information. 559 6.2. Action on Detecting a Failure 561 A node that detects the failure to setup an LSP or the failure of an 562 established LSP SHOULD act according to the Re-routing Flag passed on 563 the LSP setup request. 565 If Segment-based Re-routing is allowed, or if Boundary Re-routing is 566 allowed and the detecting node is an ABR or ASBR, the detecting node 567 MAY immediately attempt to re-route. 569 If End-to-end Re-routing is indicated, or if Segment-based or 570 Boundary Re-routing is allowed and the detecting node chooses 571 not to make re-routing attempts (or has exhausted all possible 572 re-routing attempts), the detecting node MUST return a protocol 573 error indication and SHOULD include full crankback information. 575 6.3. Limiting Re-routing Attempts 577 Each repair point SHOULD apply a locally configurable 578 limit to the number of attempts it makes to re-route an 579 LSP. This helps to prevent excessive network usage in the 580 event of significant faults, and allows back-off to other 581 repair points which may have a better chance of routing 582 around the problem. 
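The interaction between the Re-routing Flag of section 6.1, the history of blockages of section 4.3 and the locally configured retry limit can be sketched as follows. This is a minimal Python sketch under stated assumptions, not a definitive implementation: the names Rerouting, may_reroute and RepairPoint are hypothetical, blockages are modeled as nodes only, a breadth-first search stands in for constraint-based path computation, and the ingress is assumed to be free to retry in every mode.

```python
from collections import deque
from enum import Enum


class Rerouting(Enum):
    NONE = 0          # "No Re-routing"
    END_TO_END = 1    # "End-to-end Re-routing"
    BOUNDARY = 2      # "Boundary Re-routing"
    SEGMENT = 3       # "Segment-based Re-routing"


def may_reroute(mode, is_ingress, is_boundary):
    """Return True if this node is permitted to attempt re-routing."""
    if is_ingress:
        return True                  # assumption: the ingress MAY always retry
    if mode is Rerouting.SEGMENT:
        return True                  # all nodes MAY attempt re-routing
    if mode is Rerouting.BOUNDARY:
        return is_boundary           # only ABRs and ASBRs
    return False                     # intermediate nodes SHOULD NOT re-route


class RepairPoint:
    def __init__(self, links, max_attempts):
        self.links = links           # adjacency: node -> set of neighbours
        self.max_attempts = max_attempts
        self.attempts = 0
        self.blocked = set()         # history table of blockages for this LSP

    def _path(self, src, dst):
        """Breadth-first search that skips every node in the history table."""
        prev = {src: None}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                path = []
                while node is not None:
                    path.append(node)
                    node = prev[node]
                return path[::-1]
            for nxt in sorted(self.links.get(node, ())):
                if nxt not in prev and nxt not in self.blocked:
                    prev[nxt] = node
                    queue.append(nxt)
        return None

    def on_failure(self, src, dst, blocked_node):
        """Record a blockage; return (path, None) or (None, error to send upstream)."""
        self.blocked.add(blocked_node)
        if self.attempts >= self.max_attempts:
            return None, "Re-routing limit exceeded"
        self.attempts += 1
        path = self._path(src, dst)
        if path is None:
            return None, "No route avoiding known blockages"
        return path, None


links = {"A": {"B", "C"}, "B": {"D"}, "C": {"D"}, "D": set()}
rp = RepairPoint(links, max_attempts=1)
print(rp.on_failure("A", "D", "B"))   # first retry detours around B
print(rp.on_failure("A", "D", "C"))   # limit reached, error goes upstream
```

Keeping the set of blocked nodes across attempts is what stops the repair point from re-selecting a resource that has already been reported as blocked.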
584 6.3.1 New Status Codes for Re-routing 586 An error code/value of "Routing Problem"/"Re-routing 587 limit exceeded" (24/TBD) indicates that a node 588 has abandoned crankback re-routing because it has reached 589 a threshold for retry attempts. 591 A node receiving an error response with this status code 592 MAY also attempt crankback re-routing, but it is RECOMMENDED 593 that such attempts be limited to the ingress LSR. 595 6.4. Protocol Control of Re-routing Behavior 597 The Session Attribute Object in RSVP-TE is used on Path 598 messages to indicate the capabilities and attributes of the 599 session. This object contains an 8-bit flag field which is 600 used to signal individual Boolean capabilities or attributes. 601 The Re-routing Flag described in section 6.1 would fit 602 naturally into this field, but there is a scarcity of bits, so 603 use is made of the new LSP_ATTRIBUTES object defined in 604 [LSP-ATTRIB]. Three bits are defined for inclusion in the LSP 605 Attributes TLV as follows. The bit numbers below are suggested 606 and actual values are TBD by IETF consensus. 608 Bit Name and Usage 609 Number 611 1 End-to-end re-routing desired. 612 This flag indicates the end-to-end re-routing behavior 613 for an LSP under establishment. This MAY also be used 614 for specifying the behavior of end-to-end LSP restoration 615 for established LSPs. 617 2 Boundary re-routing desired. 618 This flag indicates the boundary re-routing 619 behavior for an LSP under establishment. 620 This MAY also be used for specifying the 621 segment-based (hierarchical) LSP restoration 622 for established LSPs. The boundary ABR/ASBR 623 can either decide to forward the PathErr 624 message upstream to the Head-end LSR or try 625 to select another egress boundary LSR. 627 3 Segment-based re-routing desired. 628 This flag indicates the segment-based 629 re-routing behavior for an LSP under 630 establishment. 
              This MAY also be used
631           for specifying the segment-based LSP
632           restoration for established LSPs.

634   7. Reporting Crankback Information

636   7.1. Required Information

638   As described above, full crankback information SHOULD
639   indicate the node, link and other resources that have
640   been attempted but have failed because of allocation
641   issues or network failure.

643   The default crankback information SHOULD include the
644   interface and the node address.

646   7.2. Protocol Extensions

648   [RFC3473] defines an IF_ID ERROR_SPEC object that can be
649   used on PathErr, ResvErr and Notify messages to convey
650   the information carried in the Error Spec Object defined
651   in [RFC3209]. Additionally, the IF_ID ERROR_SPEC Object
652   has scope for carrying TLVs that identify the link
653   associated with the error.

655   The TLVs for use with this object are defined in [RFC3471], and
656   are listed below. They are used to identify links in the IF_ID
657   PHOP Object and in the IF_ID ERROR_SPEC object to identify the
658   failed resource, which is usually the downstream resource from
659   the reporting node.

661   Type Length Format     Description
662   --------------------------------------------------------------------
663    1      8   IPv4 Addr. IPv4                    (Interface address)
664    2     20   IPv6 Addr. IPv6                    (Interface address)
665    3     12   Compound   IF_INDEX                (Interface index)
666    4     12   Compound   COMPONENT_IF_DOWNSTREAM (Component interface)
667    5     12   Compound   COMPONENT_IF_UPSTREAM   (Component interface)

669   Two further TLVs are defined in [TE-BUNDLE] for use in the IF_ID
670   PHOP Object and in the IF_ID ERROR_SPEC object to identify component
671   links of unnumbered interfaces. Note that the Type values shown here
672   are only suggested values in [TE-BUNDLE] - final values are
673   to be determined by IETF consensus.
675   Type Length Format     Description
676   --------------------------------------------------------------------
677    6     16   Compound   UNUM_COMPONENT_IF_DOWN  (Component interface)
678    7     16   Compound   UNUM_COMPONENT_IF_UP    (Component interface)

680   In order to facilitate reporting of crankback information, the
681   following additional TLVs are defined. Note that the Type values
682   shown here are only suggested values - final values are to be
683   determined by IETF consensus.

685   Type Length Format     Description
686   --------------------------------------------------------------------
687    8    var   See below  DOWNSTREAM_LABEL        (GMPLS label)
688    9    var   See below  UPSTREAM_LABEL          (GMPLS label)
689   10      8   See below  NODE_ID                 (Router Id)
690   11      x   See below  OSPF_AREA               (Area Id)
691   12      x   See below  ISIS_AREA               (Area Id)
692   13      8   See below  AUTONOMOUS_SYSTEM       (Autonomous system)
693   14    var   See below  ERO_CONTEXT             (ERO subobject)
694   15    var   See below  ERO_NEXT_CONTEXT        (ERO subobjects)
695   16      8   IPv4 Addr. PREVIOUS_HOP_IPv4       (Node address)
696   17     20   IPv6 Addr. PREVIOUS_HOP_IPv6       (Node address)
697   18      8   IPv4 Addr. INCOMING_IPv4           (Interface address)
698   19     20   IPv6 Addr.
                             INCOMING_IPv6           (Interface address)
699   20     12   Compound   INCOMING_IF_INDEX       (Interface index)
700   21     12   Compound   INCOMING_COMP_IF_DOWN   (Component interface)
701   22     12   Compound   INCOMING_COMP_IF_UP     (Component interface)
702   23     16   See below  INCOMING_UNUM_COMP_DOWN (Component interface)
703   24     16   See below  INCOMING_UNUM_COMP_UP   (Component interface)
704   25    var   See below  INCOMING_DOWN_LABEL     (GMPLS label)
705   26    var   See below  INCOMING_UP_LABEL       (GMPLS label)
706   27      8   See below  REPORTING_NODE_ID       (Router Id)
707   28      x   See below  REPORTING_OSPF_AREA     (Area Id)
708   29      x   See below  REPORTING_ISIS_AREA     (Area Id)
709   30      8   See below  REPORTING_AS            (Autonomous system)
710   31    var   See below  PROPOSED_ERO            (ERO subobjects)
711   32    var   See below  NODE_EXCLUSIONS         (List of nodes)
712   33    var   See below  LINK_EXCLUSIONS         (List of interfaces)
713   For types 1, 2, 3, 4 and 5, the format of the Value field
714   is already defined in [RFC3471].

716   For types 6 and 7 the format of the Value field is already
717   defined in [TE-BUNDLE].

719   For types 16 and 18, the format of the Value field is
720   the same as for type 1.

722   For types 17 and 19, the format of the Value field is the
723   same as for type 2.

725   For types 20, 21 and 22, the formats of the Value fields
726   are the same as for types 3, 4 and 5 respectively.

728   For types 23 and 24 the Value field is the same as for
729   types 6 and 7 respectively.

731   For types 8, 9, 25 and 26 the length field is variable
732   and the Value field is a label as defined in [RFC3471].
733   As with all uses of labels, it is assumed that any node
734   that can process the label information knows the syntax
735   and semantics of the label from the context. Note that
736   all TLVs are zero-padded to a multiple of four octets so
737   that if a label is not itself a multiple of four octets
738   it must be disambiguated from the trailing zero pads by
739   knowledge derived from the context.
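The zero-padding rule above can be illustrated with a small encoder. This is a sketch under stated assumptions, not a normative encoding: it assumes the 16-bit Type and 16-bit Length header used for these TLVs, and it assumes the Length counts the 4-octet header plus the padded Value (which is why a label shorter than a multiple of four octets cannot be told apart from its padding by length alone). The function name encode_tlv is hypothetical.

```python
import struct


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one TLV, zero-padding the Value to a four-octet boundary.

    Assumption: Length = 4 (header) + length of the padded Value.
    """
    pad = (-len(value)) % 4               # octets needed to reach a multiple of 4
    padded = value + b"\x00" * pad
    header = struct.pack("!HH", tlv_type, 4 + len(padded))  # network byte order
    return header + padded
```

For example, a type 1 TLV carrying the IPv4 address 192.0.2.1 encodes to 8 octets, matching the Length of 8 shown in the table for that type.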
740   For types 10 and 27 the Value field has the format:

742    0                   1                   2                   3
743    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
744   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
745   |                           Router Id                           |
746   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

748      Router Id: 32 bits

750         The Router Id used to identify the node within the IGP.

752   For types 11 and 28 the Value field has the format:

754    0                   1                   2                   3
755    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
756   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
757   |                     OSPF Area Identifier                      |
758   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

760      OSPF Area Identifier

762         The 4-octet area identifier for the node. In the case of
763         ABRs, this identifies the area where the failure has occurred.

765   For types 12 and 29 the Value field has the format:

767    0                   1                   2                   3
768    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
769   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
770   |    Length     |             ISIS Area Identifier              |
771   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
772   ~               ISIS Area Identifier (continued)                ~
773   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

775      Length

777         Length of the actual (non-padded) ISIS Area Identifier
778         in octets. Valid values are from 2 to 11 inclusive.

780      ISIS Area Identifier

782         The variable-length ISIS area identifier. Padded with
783         trailing zeroes to a four-octet boundary.

785   For types 13 and 30 the Value field has the format:

787    0                   1                   2                   3
788    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
789   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
790   |                   Autonomous System Number                    |
791   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

793      Autonomous System Number: 32 bits

795         The AS Number of the associated Autonomous System.
            Note
796         that if 16-bit AS numbers are in use, the low order bits
797         (16 through 31) should be used and the high order bits
798         (0 through 15) should be set to zero.

800   For types 14, 15 and 31 the Value field has the format:

802    0                   1                   2                   3
803    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
804   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
805   |                                                               |
806   ~                        ERO Subobjects                         ~
807   |                                                               |
808   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

810      ERO Subobjects:

812         A sequence of ERO subobjects. Any ERO subobjects are
813         allowed whether defined in [RFC3209], [RFC3473] or other
814         documents. Note that ERO subobjects contain their own
815         type and length fields.

817   For type 32 the Value field has the format:

819    0                   1                   2                   3
820    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
821   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
822   |                                                               |
823   ~                       Node Identifiers                        ~
824   |                                                               |
825   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

827      Node Identifiers:

829         A sequence of TLVs as defined here of types 1, 2 or 10
830         that indicates downstream nodes that have already
831         participated in crankback attempts and have been declared
832         unusable for the current LSP setup attempt.

834   For type 33 the Value field has the format:

836    0                   1                   2                   3
837    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
838   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
839   |                                                               |
840   ~                       Link Identifiers                        ~
841   |                                                               |
842   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

844      Link Identifiers:

846         A sequence of TLVs as defined here of types 3, 4, 5, 6 or 7
847         that indicates incoming interfaces at downstream nodes that
848         have already participated in crankback attempts and have
849         been declared unusable for the current LSP setup attempt.
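Two of the Value field layouts above can be sketched as follows. This is an illustration only: the function names are hypothetical, and it assumes that the four-octet boundary for the ISIS Area Identifier is measured from the start of the Value field, that is, including the one-octet Length shown in the figure.

```python
import struct


def encode_isis_area_value(area_id: bytes) -> bytes:
    """Value field for types 12 and 29: Length octet, area id, zero padding."""
    if not 2 <= len(area_id) <= 11:
        raise ValueError("ISIS Area Identifier must be 2 to 11 octets")
    body = bytes([len(area_id)]) + area_id    # Length is the non-padded size
    return body + b"\x00" * ((-len(body)) % 4)


def encode_as_value(as_number: int) -> bytes:
    """Value field for types 13 and 30: one 32-bit AS number.

    A 16-bit AS number lands in the low order bits, with the high order
    bits zero, simply because it is packed as a 32-bit integer.
    """
    if not 0 <= as_number <= 0xFFFFFFFF:
        raise ValueError("AS number out of range")
    return struct.pack("!I", as_number)
```

The 16-bit AS number 65000, for instance, is carried as the four octets 00 00 FD E8.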
851   7.3 Guidance for Use of IF_ID ERROR_SPEC TLVs

853   7.3.1 General Principles

855   If crankback is not being used but an IF_ID ERROR_SPEC
856   object is included in a PathErr, ResvErr or Notify
857   message, the sender SHOULD include one of the TLVs of
858   type 1 through 5 as described in [RFC3473]. A sender that
859   wishes to report an error with a component link of an
860   unnumbered bundle SHOULD use the new TLVs of type 6 or 7
861   as defined in this document. A sender MAY include
862   additional TLVs from the range 8 through 33 to report
863   crankback information, although this information will at
864   most only be used for logging.

866   If crankback is being used, the sender of a PathErr,
867   ResvErr or Notify message MUST use the IF_ID ERROR_SPEC
868   object and MUST include at least one of the TLVs in the
869   range 1 through 7 as described in [RFC3473] and the
870   previous paragraph. Additional TLVs SHOULD also be
871   included to report further information. The following
872   section gives advice on which TLVs should be used under
873   different circumstances, and which TLVs must be supported
874   by LSRs.

876   Note that all such TLVs are optional and MAY be omitted.
877   Inclusion of the optional TLVs SHOULD be performed where
878   doing so helps to facilitate error reporting and crankback.
879   The TLVs fall into three categories: those that are essential
880   to report the error, those that provide additional information
881   that is or may be fundamental to the utility of crankback, and
882   those that provide additional information that may be useful for
883   crankback in some circumstances.

885   Note that all LSRs MUST be prepared to receive and forward any
886   TLV as per [RFC3473]. There is, however, no requirement for an
887   LSR to actively process any but the error report TLVs.
An LSR
888   that proposes to perform crankback re-routing SHOULD support
889   receipt and processing of all of the fundamental crankback TLVs,
890   and is RECOMMENDED to support the receipt and processing of
891   the additional crankback TLVs.

893   It should be noted, however, that some assumptions about the
894   TLVs that will be used MAY be made based on the deployment
895   scenarios. For example, a router that is deployed in a
896   single-area network does not need to support the receipt and
897   processing of TLV types 28 and 29. Those TLVs might be inserted
898   in an IF_ID ERROR_SPEC object, but would not need to be processed
899   by the receiver of a PathErr message.

901   7.3.2 Error Report TLVs

903   Error Report TLVs are those in the range 1 through 7.

905   As stated above, when crankback information is reported,
906   the IF_ID ERROR_SPEC object MUST be used. When the IF_ID
907   ERROR_SPEC object is used, at least one of the TLVs in
908   the range 1 through 7 MUST be present. The choice of which
909   TLV to use will be dependent on the circumstance of the error
910   and device capabilities. For example, a device that does not
911   support IPv6 will not need the ability to create a TLV of type
912   2. Note, however, that such a device MUST still be prepared
913   to receive and process all error report TLVs.

915   7.3.3 Fundamental Crankback TLVs

917   Many of the TLVs report the specific resource that has
918   failed. For example, TLV type 1 can be used to report that
919   the setup attempt was blocked by some form of resource
920   failure on a specific interface identified by the IP
921   address supplied. TLVs in this category are 1 through 13.

923   These TLVs SHOULD be supplied whenever the node detecting
924   and reporting the failure with crankback information has
925   the information available.

927   TLVs of type 10, 11, 12 and 13 MAY, however, be
928   omitted according to local policy and relevance of the
929   information.
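The TLV categories can be captured in a small lookup, shown here as a sketch. The function and constant names are hypothetical; the ranges are the ones stated in the text (error report TLVs 1 through 7, fundamental crankback TLVs 1 through 13), with types 14 through 33 treated as the additional crankback TLVs of section 7.3.4.

```python
ERROR_REPORT = range(1, 8)     # TLV types 1 through 7
FUNDAMENTAL = range(1, 14)     # TLV types 1 through 13
ADDITIONAL = range(14, 34)     # TLV types 14 through 33 (section 7.3.4)


def categories(tlv_type):
    """Return the category names that apply to a TLV type."""
    names = []
    if tlv_type in ERROR_REPORT:
        names.append("error report")
    if tlv_type in FUNDAMENTAL:
        names.append("fundamental")
    if tlv_type in ADDITIONAL:
        names.append("additional")
    return names


def has_required_error_report(tlv_types):
    """Check the MUST: at least one TLV in the range 1 through 7 is present."""
    return any(t in ERROR_REPORT for t in tlv_types)
```

Note that the two lower ranges overlap: types 1 through 7 are both error report TLVs and fundamental crankback TLVs.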
931   7.3.4 Additional Crankback TLVs

933   Some TLVs help to locate the fault within the context of
934   the path of the LSP that was being set up. TLVs of types
935   14, 15, 16 and 17 help to set the context of the error
936   within the scope of an explicit path that has loose hops
937   or non-precise abstract nodes. The ERO context
938   information is not always a requirement, but a node may
939   notice that it is a member of the next hop in the ERO
940   (such as a loose or non-specific abstract node) and
941   deduce that its upstream neighbor may have selected the
942   path using next hop routing. In this case, providing the
943   ERO context will be useful to the node further upstream that
944   performs re-routing.

946   Reporting nodes SHOULD also supply TLVs from the range 14
947   through 26 as appropriate for reporting the error. The
948   reporting nodes MAY also supply TLVs from the range 27
949   through 33.

951   Note that in deciding whether a TLV in the range 14
952   through 26 "is appropriate", the reporting node should
953   consider, amongst other things, whether the information is
954   pertinent to the cause of the failure. For example, when
955   a cross-connection fails it may be that the outgoing
956   interface is faulted, in which case only the interface
957   (for example, TLV type 1) needs to be reported, but if
958   the problem is that the incoming interface cannot be
959   connected to the outgoing interface because of temporary
960   or permanent cross-connect limitations, the node should
961   also include reference to the incoming interface (for
962   example, TLV type 18).

964   Four TLVs (27, 28, 29 and 30) allow the location of the
965   reporting node to be expanded upon. These TLVs would not
966   be included if the information is not of use within the
967   local system, but might be added by ABRs relaying the
968   error.
      Note that the Reporting Node Id (TLV 27) need not
969   be included if the IP address of the reporting node as
970   indicated in the ERROR_SPEC itself is sufficient to
971   fully identify the node.

973   The last three TLVs (31, 32, and 33) provide additional
974   information for recomputation points. The reporting node
975   (or some node forwarding the error) may supply
976   suggestions about the ERO that could have been used to
977   avoid the error. As the error propagates back upstream
978   and as crankback routing is attempted and fails, it is
979   beneficial to collect lists of failed nodes and links so
980   that they will not be included in further computations
981   performed at upstream nodes. These lists may also be
982   factored into route exclusions [EXCLUDE].

984   Note that there is no ordering requirement on any of the
985   TLVs within the IF_ID Error Spec, and no implication
986   should be drawn from the ordering of the TLVs in a
987   received IF_ID Error Spec.

989   It is left as an implementation detail precisely when to
990   include each of the TLVs according to the capabilities of
991   the system reporting the error.

993   7.3.5 Grouping TLVs by Failure Location

995   Further guidance as to the inclusion of crankback TLVs can be given
996   by grouping the TLVs according to the location of the failure and
997   the context within which it is reported. For example, a TLV that
998   reports an area identifier would only need to be included as the
999   crankback error report transits an area boundary.

1001  Although discussion of aggregation of crankback information is out
1002  of the scope of this document, it should be noted that this topic is
1003  closely aligned to the information presented here.
1005  Resource Failure
1006      8   DOWNSTREAM_LABEL
1007      9   UPSTREAM_LABEL
1008  Interface failures
1009      1   IPv4
1010      2   IPv6
1011      3   IF_INDEX
1012      4   COMPONENT_IF_DOWNSTREAM
1013      5   COMPONENT_IF_UPSTREAM
1014      6   UNUM_COMPONENT_IF_DOWN
1015      7   UNUM_COMPONENT_IF_UP
1016     14   ERO_CONTEXT
1017     15   ERO_NEXT_CONTEXT
1018     16   PREVIOUS_HOP_IPv4
1019     17   PREVIOUS_HOP_IPv6
1020     18   INCOMING_IPv4
1021     19   INCOMING_IPv6
1022     20   INCOMING_IF_INDEX
1023     21   INCOMING_COMP_IF_DOWN
1024     22   INCOMING_COMP_IF_UP
1025     23   INCOMING_UNUM_COMP_DOWN
1026     24   INCOMING_UNUM_COMP_UP
1027     25   INCOMING_DOWN_LABEL
1028     26   INCOMING_UP_LABEL
1029  Node failures
1030     10   NODE_ID
1031     27   REPORTING_NODE_ID
1032  Area failures
1033     11   OSPF_AREA
1034     12   ISIS_AREA
1035     28   REPORTING_OSPF_AREA
1036     29   REPORTING_ISIS_AREA
1037     31   PROPOSED_ERO
1038     32   NODE_EXCLUSIONS
1039     33   LINK_EXCLUSIONS
1040  AS failures
1041     13   AUTONOMOUS_SYSTEM
1042     30   REPORTING_AS

1044  7.3.6 Alternate Path Identification

1046  No new object is used to distinguish between Path/Resv messages
1047  for an alternate LSP. Thus, the alternate LSP uses the same
1048  SESSION and SENDER_TEMPLATE/FILTER_SPEC objects as the ones used
1049  for the initial LSP under re-routing.

1051  7.4. Action on Receiving Crankback Information

1053  7.4.1 Re-route Attempts

1055  As described in section 3, a node receiving crankback information
1056  in a PathErr must first check to see whether it is allowed to
1057  perform re-routing. This is indicated by the Re-routing Flags in
1058  the SESSION_ATTRIBUTE object of the LSP setup request.

1060  If a node is not allowed to perform re-routing it should
1061  forward the PathErr message or, if it is the ingress,
1062  report the LSP as having failed.

1064  If re-routing is allowed, the node should attempt to compute a path
1065  to the destination using the original (received) explicit path and
1066  excluding the failed/blocked node/link. The new path should be added
1067  to an LSP setup request as an explicit route and signaled.
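The recomputation step can be sketched as a path search that skips the reported nodes and links. This is a deliberately simplified illustration: it uses a hop-count breadth-first search over a hypothetical adjacency map rather than a constraint-based computation, it ignores the original explicit route, and the function name and graph representation are assumptions made for the example.

```python
from collections import deque


def compute_alternate_path(graph, src, dst,
                           failed_nodes=frozenset(), failed_links=frozenset()):
    """Return a fewest-hop path from src to dst avoiding failures, or None.

    graph:        dict mapping a node to an iterable of downstream neighbors.
    failed_nodes: nodes reported as failed or blocked.
    failed_links: directed (from_node, to_node) pairs reported as failed.
    """
    if src in failed_nodes or dst in failed_nodes:
        return None
    previous = {src: None}                 # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:        # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if (neighbor in previous or neighbor in failed_nodes
                    or (node, neighbor) in failed_links):
                continue
            previous[neighbor] = node
            queue.append(neighbor)
    return None
```

A result of None corresponds to the case where the node cannot route around the problem and has to pass the error upstream or, at the ingress, report the LSP as failed.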
1069  LSRs performing crankback re-routing should store all received
1070  crankback information for an LSP until the LSP is successfully
1071  established or until the node abandons its attempts to re-route
1072  the LSP. This allows the combination of crankback information
1073  from multiple failures when computing an alternate path.

1075  It is an implementation decision whether the crankback
1076  information is discarded immediately upon successful LSP
1077  establishment or retained for a period in case the LSP fails.

1079  7.4.2 Location Identifiers of Blocked Links or Nodes

1081  In order to compute an alternate path by crankback re-routing,
1082  it is necessary to identify the blocked links or nodes and
1083  their locations. The common identifier of each link or node
1084  in an MPLS network should be specified. Both
1085  protocol-independent and protocol-dependent identifiers
1086  may be specified. Although a general identifier that is
1087  independent of other protocols is preferable, there are a
1088  couple of restrictions on its use as described in the
1089  following subsection.

1091  In link state protocols such as OSPF and IS-IS, each
1092  link and node in a network can be uniquely identified,
1093  for example, by the context of a Router ID and the Link
1094  ID. If the topology and resource information obtained by
1095  OSPF advertisements is used to compute a constraint-based
1096  path, the location of a blockage can be represented by
1097  such identifiers.

1099  Note that, when the routing-protocol-specific link
1100  identifiers are used, the Re-routing Flag on the LSP
1101  setup request must have been set to show support for
1102  boundary or segment-based re-routing.

1104  In this document, we specify routing protocol specific
1105  link and node identifiers for OSPFv2 for IPv4, IS-IS for
1106  IPv4, OSPF for IPv6, and IS-IS for IPv6.
These
1107  identifiers may only be used if segment-based re-routing
1108  is supported, as indicated by the Routing Behavior flag
1109  on the LSP setup request.

1111  7.4.3 Locating Errors within Loose or Abstract Nodes

1113  The explicit route on the original LSP setup request may
1114  contain a loose or an Abstract Node. In these cases, the
1115  crankback information may refer to links or nodes that
1116  were not in the original explicit route.

1118  In order to compute a new path, the repair point may need
1119  to identify the pair of hops (or nodes) in the explicit
1120  route between which the error/blockage occurred.

1122  To assist this, the crankback information reports the top
1123  two hops of the explicit route as received at the
1124  reporting node. The first hop will likely identify the
1125  node or the link; the second hop will identify a 'next'
1126  hop from the original explicit route.

1128  7.4.4 When Re-routing Fails

1130  When a node cannot or chooses not to perform crankback
1131  re-routing, it must forward the PathErr message further upstream.

1133  However, when a node was responsible for expanding or
1134  replacing the explicit route as the LSP setup was
1135  processed, it MUST update the crankback information with
1136  regard to the explicit route that it received. Only if
1137  this is done will the upstream nodes stand a chance of
1138  successfully routing around the problem.

1140  7.4.5 Aggregation of Crankback Information

1142  When a setup blocking error or an error in an established
1143  LSP occurs and crankback information is sent in an error
1144  notification message, some node upstream may choose to
1145  attempt crankback re-routing. If that node's attempts at
1146  re-routing fail, the node will accumulate a set of failure
1147  information. When the node gives up, it must propagate the
1148  failure message further upstream and include crankback
1149  information when it does so.
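The accumulation of failure information at a repair point can be sketched as a small per-LSP record. This is only an illustration of the bookkeeping: the class name, method names and the use of plain strings and tuples as node and link identifiers are all hypothetical, and how the collected sets are aggregated before being reported is not addressed.

```python
class CrankbackRecord:
    """Failure information gathered for one LSP at a repair point."""

    def __init__(self):
        self.failed_nodes = set()
        self.failed_links = set()

    def record(self, nodes=(), links=()):
        """Merge the nodes and links reported by one failed attempt."""
        self.failed_nodes.update(nodes)
        self.failed_links.update(links)

    def exclusions(self):
        """Return sorted node and link lists to report further upstream."""
        return sorted(self.failed_nodes), sorted(self.failed_links)
```

The two lists returned by exclusions() correspond to what a node could carry in the NODE_EXCLUSIONS and LINK_EXCLUSIONS TLVs (types 32 and 33) when it gives up.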
1151  There is no scope in the protocol extensions described
1152  in this document to supply a full list of all of the
1153  failures that have occurred. Such a list would be
1154  indefinitely long and would include more detail than is
1155  required. However, TLVs 32 and 33 allow lists of unusable
1156  links and nodes to be accumulated as the failure is
1157  passed back upstream.

1159  Aggregation may involve reporting all links from a node
1160  as unusable by flagging the node as unusable, or flagging
1161  an ABR as unusable when there is no downstream path
1162  available, and so on. The precise details of how
1163  aggregation of crankback information is performed are
1164  beyond the scope of this document.

1166  7.5. Notification of Errors

1168  7.5.1 ResvErr Processing

1170  As described above, the resource allocation failure for
1171  RSVP-TE may occur on the reverse path when the Resv
1172  message is being processed. In this case, it is still
1173  useful to return the received crankback information to
1174  the ingress LSR. However, when the egress LSR receives
1175  the ResvErr message, per [RFC2205] it still has the option
1176  of re-issuing the Resv with different resource
1177  requirements (although not on an alternate path).

1179  When a ResvErr carrying crankback information is received at
1180  an egress LSR, the egress LSR MAY ignore this object and
1181  perform the same actions as for any other ResvErr. However,
1182  if the egress LSR supports the crankback extensions defined
1183  in this document, and after all local recovery procedures
1184  have failed, it SHOULD generate a PathErr message carrying
1185  the crankback information and send it to the ingress LSR.

1187  If a ResvErr reports on more than one FILTER_SPEC
1188  (because the Resv carried more than one FILTER_SPEC) then
1189  only one set of crankback information should be present
1190  in the ResvErr and it should apply to all FILTER_SPEC
1191  carried.
      In this case, it may be necessary per [RFC2205]
1192  to generate more than one PathErr.

1194  7.5.2 Notify Message Processing

1196  [RFC3473] defines the Notify message to enhance error
1197  reporting in RSVP-TE networks. This message is not
1198  intended to replace the PathErr and ResvErr messages. The
1199  Notify message is sent to addresses requested on the Path
1200  and Resv messages. These addresses could (but need not)
1201  identify the ingress and egress LSRs respectively.

1203  When a network error occurs, such as the failure of link
1204  hardware, the LSRs that detect the error MAY send Notify
1205  messages to the requested addresses. The type of error
1206  that causes a Notify message to be sent is an
1207  implementation detail.

1209  In the event of a failure, an LSR that supports [RFC3473]
1210  and the crankback extensions defined in this document MAY
1211  choose to send a Notify message carrying crankback
1212  information. This would ensure a speedier report of the
1213  error to the ingress/egress LSRs.

1215  7.6. Error Values

1217  Error values for the Error Code "Admission Control
1218  Failure" are defined in [RFC2205]. Error values for the
1219  error code "Routing Problem" are defined in [RFC3209]
1220  and [RFC3473].

1222  A new error value is defined for the error code "Routing
1223  Problem". "Re-routing limit exceeded" indicates that re-routing
1224  has failed because the number of crankback re-routing attempts
1225  has gone beyond the predetermined threshold at an individual LSR.

1227  7.7. Backward Compatibility

1229  It is recognized that not all nodes in an RSVP-TE network
1230  will support the extensions defined in this document. It
1231  is important that an LSR that does not support these
1232  extensions can continue to process a PathErr, ResvErr or
1233  Notify message even if it carries the newly defined IF_ID
1234  ERROR_SPEC information (TLVs).

1236  8.
Routing Protocol Interactions

1238  If the routing-protocol-specific link or node identifiers
1239  are used in the Link and Node IF_ID ERROR_SPEC TLVs
1240  defined above, the signaling has to interact with the
1241  OSPF/IS-IS routing protocol.

1243  For example, when an intermediate LSR issues a PathErr
1244  message, the signaling module of the intermediate LSR
1245  should interact with the routing logic to determine the
1246  routing-protocol-specific link or node ID where the
1247  blockage or fault occurred and carry this information
1248  onto the Link TLV and Node TLV inside the IF_ID
1249  ERROR_SPEC object. The ingress LSR, upon receiving the
1250  error message, should interact with the routing logic to
1251  compute an alternate path by pruning the specified link
1252  ID or node ID in the routing database.

1254  Procedures concerning these protocol interactions are out
1255  of scope of this document.

1257  9. LSP Restoration Considerations

1259  LSP restoration is performed to recover an established
1260  LSP when a failure occurs along the path. In the case of
1261  LSP restoration, the extensions for crankback re-routing
1262  explained above can be applied for improving performance.
1263  This section gives an example of applying the above
1264  extensions to LSP restoration. The goal of this example
1265  is to give a general overview of how this might work, and
1266  not to give a detailed procedure for LSP restoration.

1268  Although there are several techniques for LSP
1269  restoration, this section explains the case of on-demand
1270  LSP restoration, which attempts to set up a new LSP on
1271  demand after detecting an LSP failure.

1273  9.1. Upstream of the Fault

1275  When an LSR detects a fault on an adjacent downstream
1276  link or node, a PathErr message is sent upstream. In
1277  GMPLS, the ERROR_SPEC object may carry a
1278  Path_State_Remove_Flag indication. Each LSR receiving the
1279  message then releases the corresponding LSP.
      (Note that
1280  if the state removal indication is not present on the
1281  PathErr message, the ingress node must issue a PathTear
1282  message to cause the resources to be released.) If the
1283  failed LSP has to be restored at an upstream LSR, the
1284  IF_ID ERROR_SPEC that includes the location information
1285  of the failed link or node is included in the PathErr
1286  message. The ingress, intermediate area border LSR, or
1287  indeed any repair point permitted by the Re-routing
1288  Flags, that receives the PathErr message can terminate
1289  the message and then perform alternate routing.

1291  In a flat network, when the ingress LSR receives the
1292  PathErr message with the IF_ID ERROR_SPEC TLVs, it
1293  computes an alternate path around the blocked link or
1294  node satisfying the QoS constraints. If an alternate path
1295  is found, a new Path message is sent over this path
1296  toward the egress LSR.

1298  In a network segmented into areas, the following
1299  procedures can be used. As explained in Section 8.2, the
1300  LSP restoration behavior is indicated in the Flags field
1301  of the SESSION_ATTRIBUTE object of the Path message. If
1302  the Flags indicate "End-to-end re-routing", the PathErr
1303  message is returned all the way back to the ingress LSR,
1304  which may then issue a new Path message along another
1305  path, which is the same procedure as in the flat network
1306  case above.

1308  If the Flags field indicates Boundary re-routing, the
1309  ingress area border LSR MAY terminate the PathErr message
1310  and then perform alternate routing within the area for
1311  which the area border LSR is the ingress LSR.

1313  If the Flags field indicates segment-based re-routing, any node
1314  MAY apply the procedures described above for Boundary re-routing.

1316  9.2. Downstream of the Fault

1318  This section only applies to errors that occur after an
1319  LSP has been established.
      Note that an LSR that generates
1320  a PathErr with Path_State_Remove Flag SHOULD also send a
1321  PathTear downstream to clean up the LSP.

1323  A node that detects a fault and is downstream of the
1324  fault MAY send a PathErr or Notify message containing an
1325  IF_ID ERROR_SPEC that includes the location information
1326  of the failed link or node, and MAY send a PathTear to
1327  clean up the LSP at all other downstream nodes. However,
1328  if the reservation style for the LSP is Shared Explicit (SE)
1329  the detecting LSR MAY choose not to send a PathTear - this
1330  leaves the downstream LSP state in place and facilitates
1331  make-before-break repair of the LSP re-utilizing downstream
1332  resources. Note that if the detecting node does not send a
1333  PathTear immediately then unused state will time out according
1334  to the normal rules of [RFC2205].

1336  At a well-known merge point, an ABR or an ASBR, a similar
1337  decision might also be made so as to better facilitate
1338  make-before-break repair. In this case a received
1339  PathTear might be 'absorbed' and not propagated further
1340  downstream for an LSP that has SE reservation style.
1341  Note, however, that this is a divergence from the protocol
1342  and might severely impact normal tear-down of LSPs.

1344  10. IANA Considerations

1346  10.1 Error Codes

1348  A new error value is defined for the RSVP-TE "Routing
1349  Problem" error code that is defined in [RFC3209].

1351     TBD     Re-routing limit exceeded.

1353  10.2 IF_ID ERROR_SPEC TLVs

1355  Note that the IF_ID ERROR_SPEC TLV type values are not
1356  currently tracked by IANA. This might be a good
1357  opportunity to move them under IANA control. The values
1358  proposed by this document are found in section 7.2.

1360  10.3 LSP_ATTRIBUTES Object

1362  Three bits are defined for inclusion in the LSP Attributes TLV of
1363  the LSP_ATTRIBUTES object in section 6.4. Suggested values are
1364  supplied. IANA is requested to assign those bits.

1366  11.
Security Considerations

1368  It should be noted that while the extensions in this document
1369  introduce no new security holes in the protocols, should a malicious
1370  user gain protocol access to the network, the crankback information
1371  might be used to prevent establishment of valid LSPs.

1373  The implementation of re-routing attempt thresholds is
1374  particularly important in this context.

1376  The crankback routing extensions and procedures for LSP restoration
1377  as applied to RSVP-TE introduce no further new security
1378  considerations. Refer to [RFC2205], [RFC3209] and [RFC3473] for a
1379  description of applicable security considerations.

1381  12. Acknowledgments

1383  We would like to thank Juha Heinanen and Srinivas Makam
1384  for their review and comments, and Zhi-Wei Lin for his
1385  considered opinions. Thanks, too, to John Drake for
1386  encouraging us to resurrect this document and consider
1387  the use of the IF_ID ERROR_SPEC object. Thanks for a
1388  welcome and very thorough review by Dimitri Papadimitriou.

1390  Simon Marshall-Unitt made contributions to this draft.

1392  13. Intellectual Property Considerations

1394  The IETF takes no position regarding the validity or scope of any
1395  Intellectual Property Rights or other rights that might be claimed to
1396  pertain to the implementation or use of the technology described in
1397  this document or the extent to which any license under such rights
1398  might or might not be available; nor does it represent that it has
1399  made any independent effort to identify any such rights. Information
1400  on the procedures with respect to rights in RFC documents can be
1401  found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

14. Normative References

   [RFC2119]    S. Bradner, "Key words for use in RFCs to Indicate
                Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2205]    R. Braden, et al., "Resource ReSerVation Protocol (RSVP)
                Version 1 Functional Specification", RFC 2205, September
                1997.

   [RFC3209]    D. Awduche, et al., "RSVP-TE: Extensions to RSVP for LSP
                Tunnels", RFC 3209, December 2001.

   [RFC3471]    P. Ashwood-Smith and L. Berger, et al., "Generalized
                MPLS - Signaling Functional Description", RFC 3471,
                January 2003.

   [RFC3473]    L. Berger, et al., "Generalized MPLS Signaling - RSVP-TE
                Extensions", RFC 3473, January 2003.

   [LSP-ATTRIB] A. Farrel, D. Papadimitriou, JP. Vasseur, "Encoding of
                Attributes for Multiprotocol Label Switching (MPLS)
                Label Switched Path (LSP) Establishment Using RSVP-TE",
                draft-ietf-mpls-rsvpte-attributes-04.txt, July 2004,
                work in progress.

   [ASON-REQ]   D. Papadimitriou, J. Drake, J. Ash, A. Farrel, L. Ong,
                "Requirements for Generalized MPLS (GMPLS) Signaling
                Usage and Extensions for Automatically Switched Optical
                Network (ASON)", draft-ietf-ccamp-gmpls-ason-reqts-07.txt,
                October 2004, work in progress.

15. Informational References

   [ASH1]       G. Ash, ITU-T Recommendations E.360.1 through E.360.7,
                "QoS Routing & Related Traffic Engineering Methods for
                IP-, ATM-, & TDM-Based Multiservice Networks", May 2002.

   [FASTRR]     Ping Pan, et al., "Fast Reroute Extensions to RSVP-TE
                for LSP Tunnels",
                draft-ietf-mpls-rsvp-lsp-fastreroute-06.txt, May 2004,
                work in progress.

   [G8080]      ITU-T Recommendation G.8080/Y.1304, Architecture for the
                Automatically Switched Optical Network (ASON), November
                2001. For information on the availability of this
                document, please see http://www.itu.int.

   [EXCLUDE]    C-Y. Lee, A. Farrel and S. De Cnodder, "Exclude Routes -
                Extension to RSVP-TE",
                draft-ietf-ccamp-rsvp-te-exclude-route-02.txt, July 2004,
                work in progress.

   [PNNI]       ATM Forum, "Private Network-Network Interface
                Specification Version 1.0 (PNNI 1.0)", May 1996.

   [RFC2702]    D. Awduche, et al., "Requirements for Traffic
                Engineering Over MPLS", RFC 2702, September 1999.

   [RFC3469]    V. Sharma, et al., "Framework for MPLS-based Recovery",
                RFC 3469, February 2003.

   [TE-BUNDLE]  Z. Ali, A. Farrel, D. Papadimitriou, A. Satyanarayana,
                and A. Zamfir, "Generalized Multi-Protocol Label
                Switching (GMPLS) RSVP-TE signaling using Bundled
                Traffic Engineering (TE) Links",
                draft-dimitri-ccamp-gmpls-rsvp-te-bundled-links-00.txt,
                May 2004, work in progress.

16. Authors' Addresses

   Adrian Farrel (editor)
   Old Dog Consulting
   Phone: +44 (0) 1978 860944
   EMail: adrian@olddog.co.uk

   Arun Satyanarayana
   Movaz Networks, Inc.
   7926 Jones Branch Drive, Suite 615
   McLean, VA 22102
   Phone: (+1) 703-847-1785
   EMail: aruns@movaz.com

   Atsushi Iwata
   NEC Corporation
   Networking Research Laboratories
   1-1, Miyazaki, 4-Chome, Miyamae-ku,
   Kawasaki, Kanagawa, 216-8555, JAPAN
   Phone: +81-(44)-856-2123
   Fax:   +81-(44)-856-2230
   EMail: a-iwata@ah.jp.nec.com

   Norihito Fujita
   NEC Corporation
   Networking Research Laboratories
   1-1, Miyazaki, 4-Chome, Miyamae-ku,
   Kawasaki, Kanagawa, 216-8555, JAPAN
   Phone: +81-(44)-856-2123
   Fax:   +81-(44)-856-2230
   EMail: n-fujita@bk.jp.nec.com

   Gerald R. Ash
   AT&T
   Room MT D5-2A01
   200 Laurel Avenue
   Middletown, NJ 07748, USA
   Phone: (+1) 732-420-4578
   Fax:   (+1) 732-368-8659
   EMail: gash@att.com

17. Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

18. Full Copyright Statement

   Copyright (C) The Internet Society (2004). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Appendix A. Experience of Crankback in TDM-based Networks

   Experience of using release messages in TDM-based networks for
   analogous repair and re-routing purposes provides some guidance.

   One can use the receipt of a release message with a cause value (CV)
   indicating "link congestion" to trigger a re-routing attempt at the
   originating node.
   However, this sometimes leads to problems.

     *--------------------*  *-----------------*
     |                    |  |                 |
     |  N2 ----------- N3-|--|----- AT--- EO2  |
     |  |              | \|  |    / |          |
     |  |              |  |--|-  /  |          |
     |  |              |  |  | \/   |          |
     |  |              |  |  | /\   |          |
     |  |              |  |--|-  \  |          |
     |  |              | /|  |    \ |          |
     |  N1 ----------- N4-|--|----- EO1        |
     |                    |  |                 |
     *--------------------*  *-----------------*
              A-1                    A-2

               Figure 1. Example of network topology

   Figure 1 illustrates four examples based on service-provider
   experiences with respect to crankback (i.e., explicit indication)
   versus implicit indication through a release with CV. In this
   example, N1, N2, N3, and N4 are located in one area (A-1), and AT,
   EO1, and EO2 are in another area (A-2).

   Note that two distinct areas are used in this example to expose the
   issues clearly. In fact, the issues are not limited to multi-area
   networks, but arise whenever path computation is distributed
   throughout the network, for example where loose routes, AS routes,
   or path computation domains are used.

   1. A connection request from node N1 to EO1 may route to N4 and then
      find "all circuits busy". N4 returns a release message to N1 with
      CV34 indicating all circuits busy. Normally, a node such as N1 is
      programmed to block a connection request when receiving CV34,
      although there is good reason to try an alternate route for the
      connection request via N2 and N3.

      Some service providers have implemented a technique called route
      advance (RA), where if a node that is RA capable receives a
      release message with CV34, it will use this as an implicit
      re-route indication and try to find an alternate route for the
      connection request if possible. In this example, alternate route
      N1-N2-N3-EO1 can be tried and may well succeed.

   2. Suppose a connection request goes from N2 to N3 to AT trying to
      reach EO2 and is blocked at link AT-EO2.
      Node AT returns a CV34 and, with RA, N2 may try to re-route
      N2-N1-N4-AT-EO2, but of course this fails again. The problem is
      that N2 cannot tell from the CV34 where the blocking occurred,
      and in this case there is no point in further alternate routing.

   3. However, in another case of a connection request from N2 to EO2,
      suppose that link N3-AT is blocked. In this case N3 should return
      crankback information (and not CV34) so that N2 can alternate
      route via N1-N4-AT-EO2, which may well be successful.

   4. In a final example, for a connection request from EO1 to N2, EO1
      first tries to route the connection request directly to N3.
      However, node N3 may reject the connection request even if there
      is bandwidth available on link N3-EO1 (perhaps for priority
      routing considerations, e.g., reserving bandwidth for high
      priority connection requests). When N3 returns CV34 in the
      release message, EO1 blocks the connection request (a normal
      response to CV34, especially if EO1-N4 is already known to be
      blocked) rather than trying to alternate route through AT-N3-N2,
      which might be successful. If N3 returns crankback information,
      EO1 could respond by trying the alternate route.

   It is certainly the case that with topology exchange, such as OSPF,
   the ingress LSR could infer the re-routing condition. However,
   convergence of routing information is typically slower than the
   expected LSP setup times. One of the reasons for crankback is to
   avoid the overhead of available-link-bandwidth flooding, and to more
   efficiently use local state information to direct alternate routing
   at the ingress LSR.

   [ASH1] shows how event-dependent routing can just use crankback,
   and not available-link-bandwidth flooding, to decide on the
   re-route path in the network through "learning models".
   Reducing this flooding reduces overhead and can lead to the ability
   to support much larger AS sizes.

   Therefore, the alternate routing should be indicated based on an
   explicit indication (as in examples 3 and 4), and it is best to know
   the following information separately:

   a) where blockage/congestion occurred (as in examples 1 and 2),

   and

   b) whether alternate routing "should" be attempted even if there is
      no "blockage" (as in example 4).
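
   The difference between implicit and explicit indication in examples
   2 and 3 can be illustrated with a small, non-normative Python
   sketch. It rests on assumptions that are not part of this document:
   the link list is inferred from the routes named in the examples, the
   function names are hypothetical, and route advance is modeled simply
   as "retry while avoiding the first hop of the failed route", since a
   CV34 release carries no location information.

```python
from collections import deque

# Undirected links of Figure 1, inferred from the routes named in the text.
LINKS = [
    ("N1", "N2"), ("N2", "N3"), ("N3", "N4"), ("N1", "N4"),
    ("N3", "AT"), ("N3", "EO1"), ("N4", "AT"), ("N4", "EO1"),
    ("AT", "EO2"), ("AT", "EO1"),
]


def link(a, b):
    """Order-independent key for the link between nodes a and b."""
    return frozenset((a, b))


TOPOLOGY = {link(a, b) for a, b in LINKS}


def neighbours(node, excluded):
    """Nodes adjacent to `node` over links not in `excluded`, sorted so
    that the search below is deterministic."""
    usable = (l for l in TOPOLOGY if node in l and l not in excluded)
    return sorted(next(iter(l - {node})) for l in usable)


def find_route(src, dst, excluded=frozenset()):
    """Breadth-first search for a fewest-hop route avoiding `excluded`."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in neighbours(path[-1], excluded):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def route_advance(failed_route):
    """Implicit indication (CV34 only): the ingress does not know where
    the blockage is, so this model (an assumption) merely avoids the
    first hop it used last time."""
    first_hop = link(failed_route[0], failed_route[1])
    return find_route(failed_route[0], failed_route[-1], {first_hop})


def crankback(failed_route, blocked):
    """Explicit indication: the blocked link is reported, so the ingress
    excludes exactly that link, and gets None if no route remains."""
    return find_route(failed_route[0], failed_route[-1], {blocked})


def uses(route, l):
    """True if `route` traverses link `l`."""
    return any(link(a, b) == l for a, b in zip(route, route[1:]))


if __name__ == "__main__":
    first_try = ["N2", "N3", "AT", "EO2"]
    # Example 2: link AT-EO2 is blocked.
    retry = route_advance(first_try)
    print("route advance retries:", retry,
          "still blocked:", uses(retry, link("AT", "EO2")))
    print("crankback result:", crankback(first_try, link("AT", "EO2")))
    # Example 3: link N3-AT is blocked.
    print("crankback result:", crankback(first_try, link("N3", "AT")))
```

   Under this model, route advance in example 2 retries
   N2-N1-N4-AT-EO2, which still crosses the blocked link, whereas the
   crankback variant finds no remaining route and makes no pointless
   attempt. In example 3 the crankback variant selects
   N2-N1-N4-AT-EO2, the alternate route named in the text.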