idnits 2.17.1 draft-ietf-mpls-ldp-ft-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 32 longer pages, the longest (page 1) being 60 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** The abstract seems to contain references ([2], [4]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (May 2001) is 8382 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? '1' on line 21 looks like a reference -- Missing reference section? '2' on line 1434 looks like a reference -- Missing reference section? '4' on line 1519 looks like a reference -- Missing reference section? '3' on line 162 looks like a reference -- Missing reference section? '5' on line 839 looks like a reference -- Missing reference section? '6' on line 252 looks like a reference -- Missing reference section? '7' on line 252 looks like a reference -- Missing reference section? '8' on line 252 looks like a reference -- Missing reference section? '9' on line 1438 looks like a reference Summary: 4 errors (**), 0 flaws (~~), 2 warnings (==), 11 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 MPLS WG Adrian Farrel 2 Internet Draft Movaz Networks, Inc. 3 Document: draft-ietf-mpls-ldp-ft-02.txt 4 Expiration Date: October 2001 Paul Brittain 5 MetaSwitch Ltd 7 Philip Matthews 8 Nortel 10 Eric Gray 11 Sandburst 13 May 2001 15 Fault Tolerance for LDP and CR-LDP 17 Status of this Memo 19 This document is an Internet-Draft and is in full conformance with 20 all provisions of Section 10 of RFC2026 [1]. 22 Internet-Drafts are working documents of the Internet Engineering 23 Task Force (IETF), its areas, and its working groups. Note that 24 other groups may also distribute working documents as Internet- 25 Drafts. Internet-Drafts are draft documents valid for a maximum of 26 six months and may be updated, replaced, or obsoleted by other 27 documents at any time. 
It is inappropriate to use Internet-Drafts
28   as reference material or to cite them other than as "work in
29   progress."

31   The list of current Internet-Drafts can be accessed at
32   http://www.ietf.org/ietf/1id-abstracts.txt

34   The list of Internet-Draft Shadow Directories can be accessed at
35   http://www.ietf.org/shadow.html.

37   NOTE: The new TLV type numbers, bit values for flags specified in
38   this draft, and new LDP status code values are preliminary suggested
39   values and have yet to be approved by IANA or the MPLS WG. See the
40   section "IANA Considerations" for further details.

42   Abstract

44   MPLS systems will be used in core networks where system downtime
45   must be kept to an absolute minimum. Many MPLS LSRs may, therefore,
46   exploit Fault Tolerant (FT) hardware or software to provide
47   high availability of the core networks.

49   The details of how FT is achieved for the various components of an FT
50   LSR, including LDP, CR-LDP, the switching hardware and TCP, are
51   implementation specific. This document identifies issues in the
52   CR-LDP specification [2] and the LDP specification [4] that make it
53   difficult to implement an FT LSR using the current LDP and CR-LDP
54   protocols, and proposes enhancements to the LDP specification to ease
55   such FT LSR implementations.

57   The extensions described here are equally applicable to CR-LDP.

59   Contents

61   0. Changes from Previous Version...................................3
62   1. Conventions and Terminology used in this document...............3
63   2. Introduction....................................................4
64   2.1 Fault Tolerance for MPLS.......................................4
65   2.2 Issues with LDP and CR-LDP.....................................5
66   3.
Overview of LDP FT Enhancements.................................6 67 3.1 Establishing an FT LDP Session.................................7 68 3.1.1 Interoperation with Non-FT LSRs.............................7 69 3.2 TCP Connection Failure.........................................7 70 3.2.1 Detecting TCP Connection Failures............................7 71 3.2.2 LDP Processing after Connection Failure......................8 72 3.3 Data Forwarding During TCP Connection Failure..................8 73 3.4 FT LDP Session Reconnection....................................8 74 3.5 Operations on FT Labels........................................9 75 3.6 Label Space Depletion and Replenishment........................9 76 4. FT Operations..................................................10 77 4.1 FT LDP Messages...............................................10 78 4.1.1 FT Label Messages...........................................10 79 4.1.1.1 Scope of FT Labels........................................10 80 4.1.2 FT Address Messages........................................11 81 4.1.3 FT Label Resources Available Notification Messages..........11 82 4.2 FT Operation ACKs.............................................12 83 4.3 Preservation of FT State......................................12 84 4.4 FT Procedure After TCP Failure................................14 85 4.4.1 FT LDP Operations During TCP Failure........................15 86 4.5 FT Procedure After TCP Re-connection..........................16 87 4.5.1 Re-Issuing FT Messages......................................16 88 4.5.2 Interaction with CR-LDP LSP Modification....................17 89 5. Changes to Existing Messages...................................17 90 5.1 LDP Initialization Message....................................17 91 5.2 LDP Keepalive Message.........................................18 92 5.3 All Other LDP Session Messages................................18 93 6. 
New Fields and Values..........................................18 94 6.1 Status Codes..................................................18 95 6.2 FT Session TLV................................................19 96 6.3 FT Protection TLV.............................................20 97 6.4 FT ACK TLV....................................................22 98 7. Example Use....................................................23 99 7.1 Session Failure and Recovery..................................24 100 7.2 Temporary Shutdown............................................26 101 8. Security Considerations........................................27 102 9. Implementation Notes...........................................28 103 9.1 FT Recovery Support on Non-FT LSRs............................28 104 9.2 ACK generation logic..........................................28 105 10. Acknowledgements..............................................29 106 11. Intellectual Property Consideration...........................29 107 12. Full Copyright Statement......................................29 108 13. IANA Considerations...........................................30 109 13.1 FT Session TLV...............................................30 110 13.2 FT Protection TLV............................................30 111 13.3 FT ACK TLV...................................................31 112 13.4 Status Codes.................................................31 113 14. Authors' Addresses............................................31 114 15. References....................................................31 116 0. Changes From Version 1 to Version 2 118 This section to be removed before final publication. 120 2.2 Add paragraph discussing use of this draft for recovery in non- 121 FT systems. 123 3.2.2 Clarify selection of FT Reconnect Timeout value. 125 3.4 Explain procedure when FT Reconnect flag is 'unexpectedly' set. 127 4.1.1.1 Explain re-use of labels from the per platform label space. 
129   4.3 Clarify that the Reconnection Timeout provides an upper limit on
130       the preservation of state, but that other events may cause state
131       to be released sooner.

133   4.4.1 Describe behavior if an LDP peer is unwilling or unable to
134       queue operations during TCP failure.

136   4.5 Describe behavior if an LDP peer is unwilling or unable to
137       queue operations during TCP failure.

139   8. Text to expose security risks concerned with reuse of labels.

141   1. Conventions and Terminology used in this document

143   Definitions of key words and terms applicable to LDP and CR-LDP are
144   inherited from [2] and [4].

146   The term "FT label" is introduced in this document to
147   indicate a label for which fault tolerant operation is used. A
148   "non-FT label" is not fault tolerant and is handled as specified in
149   [2] and [4].

151   The extensions to LDP specified in this document are collectively
152   referred to as the "LDP FT enhancements".

154   In the examples quoted, the following notation is used.

156   Ln : An LSP. For example L1.
157   Pn : An LDP peer. For example P1.

159   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
160   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
161   this document are to be interpreted as described in RFC-2119 [3].

163   2. Introduction

165   High Availability (HA) is typically claimed by equipment vendors
166   when their hardware achieves availability levels of at least 99.999%
167   (five 9s). To implement this, the equipment must be capable of
168   recovering from local hardware and software failures through a
169   process known as fault tolerance (FT).

171   The usual approach to FT involves provisioning backup copies of
172   hardware and/or software. When a primary copy fails, processing is
173   switched to the backup copy. This process, called failover, should
174   result in minimal disruption to the Data Plane.
176   In an FT system, backup resources are sometimes provisioned on a
177   one-to-one basis (1:1), sometimes as one-to-many (1:n), and
178   occasionally as many-to-many (m:n). Whatever backup provisioning is
179   made, the system must switch to the backup automatically on failure
180   of the primary, and the software and hardware state in the backup
181   must be set to replicate the state in the primary at the point
182   of failure.

184   2.1 Fault Tolerance for MPLS

186   MPLS will be used in core networks where system downtime must be kept
187   to an absolute minimum. Many MPLS LSRs may, therefore, exploit FT
188   hardware or software to provide high availability of core networks.

190   In order to provide HA, an MPLS system needs to be able to survive
191   a variety of faults with minimal disruption to the Data Plane,
192   including the following fault types:
193   - failure/hot-swap of a physical connection between LSRs
194   - failure/hot-swap of the switching fabric in the LSR
195   - failure of the TCP or LDP stack in an LSR
196   - software upgrade to the TCP or LDP stacks.

198   The first two examples of faults listed above are confined to the
199   Data Plane. Such faults can be handled by providing redundancy in
200   the Data Plane which is transparent to LDP operating in the Control
201   Plane. The last two example types of fault require action in
202   the Control Plane to recover from the fault without disrupting
203   traffic in the Data Plane. This is possible because many recent
204   router architectures separate the Control and Data Planes such that
205   forwarding can continue unaffected by recovery action in the Control
206   Plane.

208   2.2 Issues with LDP and CR-LDP

210   LDP and CR-LDP use TCP to provide reliable connections between LSRs
211   over which to exchange protocol messages to distribute labels and to
212   set up LSPs. A pair of LSRs that have such a connection are referred
213   to as LDP peers.
215 TCP enables LDP and CR-LDP to assume reliable transfer of protocol 216 messages. This means that some of the messages do not need to be 217 acknowledged (for example, Label Release). 219 LDP and CR-LDP are defined such that if the TCP connection fails, the 220 LSR should immediately tear down the LSPs associated with the session 221 between the LDP peers, and release any labels and resources assigned 222 to those LSPs. 224 It is notoriously hard to provide a Fault Tolerant implementation of 225 TCP. To do so might involve making copies of all data sent and 226 received. This is an issue familiar to implementers of other TCP 227 applications such as BGP. 229 During failover affecting the TCP or LDP stacks, therefore, the TCP 230 connection may be lost. Recovery from this position is made worse by 231 the fact that LDP or CR-LDP control messages may have been lost 232 during the connection failure. Since these messages are unconfirmed, 233 it is possible that LSP or label state information will be lost. 235 This draft describes a solution which involves 236 - negotiation between LDP peers of the intent to support extensions 237 to LDP that facilitate recovery from failover without loss of LSPs 238 - selection of FT survival on a per LSP/label basis 239 - acknowledgement of LDP messages to ensure that a full handshake is 240 performed on those messages 241 - re-issuing lost messages after failover to ensure that LSP/label 242 state is correctly recovered after reconnection of the LDP session. 
244 Other objectives of this draft are to 245 - offer back-compatibility with LSRs that do not implement these 246 proposals 247 - preserve existing protocol rules described in [2] and [4] for 248 handling unexpected duplicate messages and for processing 249 unexpected messages referring to unknown LSPs/labels 250 - integrate with the LSP modification function described in [5] 251 - avoid full state refresh solutions (such as those present in RSVP: 252 see [6], [7] and [8]) whether they be full-time, or limited to 253 post-failover recovery. 255 Note that this draft concentrates on the preservation of label state 256 for labels exchanged between a pair of adjacent LSRs when the TCP 257 connection between those LSRs is lost. This is a requirement for 258 Fault Tolerant operation of LSPs, but a full implementation of end- 259 to-end protection for LSPs requires that this is combined with other 260 techniques that are outside the scope of this draft. 262 In particular, this draft does not attempt to describe how to modify 263 the routing of an LSP or the resources allocated to a label or LSP, 264 which is covered by [5]. This draft also does not address how to 265 provide automatic layer 2/3 protection switching for a label or LSP, 266 which is a separate area for study. 268 This specification does not preclude an implementation from 269 attempting (or require it to attempt) to use the FT behavior 270 described here to recover from a preemptive failure of a connection 271 on a non-FT system due to, for example, a partial system crash. 272 Note, however, that there are potential issues too numerous to list 273 here - not least the likelihood that the same crash will immediately 274 occur when processing the restored data. 276 3. Overview of LDP FT Enhancements 278 The LDP FT enhancements consist of the following main elements, which 279 are described in more detail in the sections that follow. 
281 - The presence of an FT Session TLV on the LDP Initialization 282 message indicates that an LSR supports the LDP FT enhancements on 283 this session. 285 - An FT Reconnect Flag in the FT Session TLV indicates whether an 286 LSR has preserved FT label state across a failure of the TCP 287 connection. 289 - An FT Reconnection Timeout, exchanged on the LDP Initialization 290 message, that indicates the maximum time peer LSRs will preserve 291 FT label state after a failure of the TCP connection. 293 - An FT Protection TLV used to identify operations that affect LDP 294 labels. All LDP messages carrying the FT Protection TLV need to 295 be secured (e.g. to NVRAM) and ACKed to the sending LDP peer in 296 order that the state for FT labels can be correctly recovered 297 after LDP session reconnection. 299 Note that the implementation within an FT system is left open by 300 this draft. An implementation could choose to secure entire 301 messages relating to FT labels, or it could secure only the 302 relevant state information. 304 - Address advertisement is also secured by use of the FT Protection 305 TLV. This enables recovery after LDP session reconnection without 306 the need to re-advertise what may be a very large number of 307 addresses. 309 3.1 Establishing an FT LDP Session 311 In order that the extensions to LDP [4] and CR-LDP [2] described in 312 this draft can be used successfully on an LDP session between a pair 313 of LDP peers, they MUST negotiate that the LDP FT enhancements 314 are to be used on the LDP session. 316 This is done on the LDP Initialization message exchange using a new 317 FT Session TLV. Presence of this TLV indicates that the peer wants 318 to support the LDP FT enhancements on this LDP session. 320 The LDP FT enhancements MUST be supported on an LDP session if both 321 LDP peers include an FT Session TLV on the LDP Initialization 322 message. 
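As an illustration of the negotiation rule in section 3.1, the following Python sketch decides whether the LDP FT enhancements apply to a session. It is not part of the specification: FtSessionTlv, its field names, and ft_enabled are hypothetical names chosen for this example, and the class is an in-memory view rather than the wire encoding of the TLV.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FtSessionTlv:
    """Hypothetical in-memory view of an FT Session TLV (not the wire format)."""
    reconnect_flag: bool      # FT Reconnect Flag
    reconnect_timeout: int    # FT Reconnection Timeout, units left to the encoding


def ft_enabled(local: Optional[FtSessionTlv],
               remote: Optional[FtSessionTlv]) -> bool:
    """Return True only if both Initialization messages carried the TLV.

    None models a peer that did not include the FT Session TLV, for
    example an LSR that does not implement the FT enhancements.
    """
    return local is not None and remote is not None
```

If ft_enabled returns False, no message on that session may carry the FT TLVs; the check is repeated for every new instantiation of the session, since a peer may behave differently on successive TCP connections.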
324 If either LDP Peer does not include the FT Session TLV on the LDP 325 Initialization message, the LDP FT enhancements MUST NOT be used 326 during this LDP session. Use of LDP FT enhancements by a sending 327 LDP peer MUST be interpreted by the receiving LDP peer as a serious 328 protocol error causing the session to be terminated. 330 An LSR MAY present different FT/non-FT behavior on different TCP 331 connections, even if those connections are successive instantiations 332 of the LDP session between the same LDP peers. 334 3.1.1 Interoperation with Non-FT LSRs 336 The FT Session TLV on the LDP Initialization message carries the 337 U-bit. If an LSR does not support the LDP FT enhancements, it will 338 ignore this TLV. Since such partners also do not include the FT 339 Session TLV, all LDP sessions to such LSRs will not use the LDP FT 340 enhancements. 342 The rest of this draft assumes that the LDP sessions under discussion 343 are between LSRs that do support the LDP FT enhancements, except 344 where explicitly stated otherwise. 346 3.2 TCP Connection Failure 348 3.2.1 Detecting TCP Connection Failures 350 TCP connection failures may be detected and reported to the LDP 351 component in a variety of ways. These should all be treated in the 352 same way by the LDP component. 354 - Indication from the management component that a TCP connection or 355 underlying resource is no longer active. 356 - Notification from a hardware management component of an interface 357 failure. 358 - Sockets keepalive timeout. 359 - Sockets send failure. 360 - New (incoming) Socket opened. 361 - LDP protocol timeout. 363 3.2.2 LDP Processing after Connection Failure 365 If the LDP FT enhancements are not in use on an LDP session, the 366 action of the LDP peers on failure of the TCP connection is as 367 specified in [2] and [4]. 
369 All state information and resources associated with non-FT labels 370 MUST be released on the failure of the TCP connection, including 371 deprogramming the non-FT label from the switching hardware. This is 372 equivalent to the behavior specified in [4]. 374 If the LDP FT enhancements are in use on an LDP session, both LDP 375 peers SHOULD preserve state information and resources associated with 376 FT labels exchanged on the LDP session. Both LDP peers SHOULD use a 377 timer to release the preserved state information and resources 378 associated with FT-labels if the TCP connection is not restored 379 within a reasonable period. The behavior when this timer expires is 380 equivalent to the LDP session failure behavior described in [4]. 382 The FT Reconnection Timeout each LDP peer intends to apply to the LDP 383 session is carried in the FT Session TLV on the LDP Initialization 384 messages. Both LDP peers MUST use the value that corresponds to the 385 lesser timeout interval of the two proposed timeout values from the 386 LDP Initialization exchange, where a value of zero is treated as 387 positive infinity. 389 3.3 Data Forwarding During TCP Connection Failure 391 An LSR that implements the LDP FT enhancements SHOULD preserve the 392 programming of the switching hardware across a failover. This 393 ensures that data forwarding is unaffected by the state of the TCP 394 connection between LSRs. 396 It is an integral part of FT failover processing in some hardware 397 configurations that some data packets might be lost. If data loss is 398 not acceptable to the applications using the MPLS network, the LDP FT 399 enhancements described in this draft SHOULD NOT be used. 401 3.4 FT LDP Session Reconnection 403 When a new TCP connection is established, the LDP peers MUST exchange 404 LDP Initialization messages. When a new TCP connection is 405 established after failure, the LDP peers MUST re-exchange LDP 406 Initialization messages. 
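The timeout selection rule in section 3.2.2 (both peers use the lesser of the two proposed values, with zero treated as positive infinity) can be sketched as follows. The function name effective_reconnect_timeout is an assumption made for this example, and the sketch is unit-agnostic; the encoding and units belong to the FT Session TLV definition.

```python
import math


def effective_reconnect_timeout(local_proposal: int,
                                remote_proposal: int) -> float:
    """Return the FT Reconnection Timeout both peers must apply.

    A proposal of 0 means "no limit", so it is mapped to positive
    infinity before the lesser of the two intervals is taken.
    """
    if local_proposal < 0 or remote_proposal < 0:
        raise ValueError("timeout proposals must be non-negative")

    def as_interval(value: int) -> float:
        return math.inf if value == 0 else float(value)

    return min(as_interval(local_proposal), as_interval(remote_proposal))
```

Under this rule a zero proposal never wins against a finite one; state is preserved without a time limit only when both peers propose zero.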
408 If an LDP peer includes the FT Session TLV in the LDP Initialization 409 message for the new instantiation of the LDP session, it MUST also 410 set the FT Reconnect Flag according to whether it has been able to 411 preserve label state. The FT Reconnect Flag is carried in the FT 412 Session TLV. 414 If an LDP peer has preserved all state information for previous 415 instantiations of the LDP session, then it SHOULD set the FT 416 Reconnect Flag to 1 in the FT Session TLV. Otherwise, it MUST set the 417 FT Reconnect Flag to 0. 419 If either LDP peer sets the FT Reconnect Flag to 0, or omits the FT 420 Session TLV, both LDP peers MUST release any state information and 421 resources associated with the previous instantiation of the LDP 422 session between the same LDP peers, including FT label state and 423 Addresses. This ensures that network resources are not permanently 424 lost by one LSR if its LDP peer is forced to undergo a cold start. 426 If an LDP peer changes any session parameters (for example, the label 427 space bounds) from the previous instantiation the nature of any 428 preserved labels may have changed. In particular, previously 429 allocated labels may now be out of range. For this reason, session 430 reconnection MUST use the same parameters as were in use on the 431 session before the failure. If an LDP peer notices that the 432 parameters have been changed by the other peer it SHOULD send a 433 Notification message with the 'FT Session parameters changed' status 434 code. 436 If both LDP peers set the FT Reconnect Flag to 1, both LDP peers MUST 437 use the FT label operation procedures indicated in this draft to 438 complete any label operations on FT labels that were interrupted by 439 the LDP session failure. 
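The reconnection rules above reduce to a single decision: preserved state is used only when both peers set the FT Reconnect Flag to 1. A minimal sketch follows, assuming hypothetical names (ReconnectAction, reconnect_action) and modelling an omitted FT Session TLV as None.

```python
from enum import Enum
from typing import Optional


class ReconnectAction(Enum):
    RESUME_FT_OPERATIONS = "resume"      # both peers preserved state
    RELEASE_PREVIOUS_STATE = "release"   # treat as a fresh session


def reconnect_action(local_flag: Optional[bool],
                     remote_flag: Optional[bool]) -> ReconnectAction:
    """Decide what to do with state from the previous session instantiation.

    Each argument is the FT Reconnect Flag a peer sent on its
    Initialization message, or None if that peer omitted the FT
    Session TLV altogether.
    """
    if local_flag is True and remote_flag is True:
        return ReconnectAction.RESUME_FT_OPERATIONS
    # Flag set to 0 by either peer, or TLV omitted: release FT label
    # state and Addresses held for the previous instantiation.
    return ReconnectAction.RELEASE_PREVIOUS_STATE
```

The sketch does not cover the separate check that session parameters are unchanged from the previous instantiation.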
441 If an LDP peer receives an LDP Initialization message with the FT 442 Reconnect Flag set before it sends its own Initialization message, 443 but has retained no information about the previous version of the 444 session, it MUST respond with an Initialization message with the FT 445 Reconnect Flag clear. If an LDP peer receives an LDP Initialization 446 message with the FT Reconnect Flag set in response to an 447 Initialization message that it has sent with the FT Reconnect Flag 448 clear it MUST act as if no state was retained by either peer on the 449 session. 451 3.5 Operations on FT Labels 453 Label operations on FT labels are made Fault Tolerant by providing 454 acknowledgement of all LDP messages that affect FT labels. 455 Acknowledgements are achieved by means of sequence numbers on these 456 LDP messages. 458 The message exchanges used to achieve acknowledgement of label 459 operations and the procedures used to complete interrupted label 460 operations are detailed in the section "FT Operations". 462 Using these acknowledgements and procedures, it is not necessary for 463 LDP peers to perform a complete re-synchronization of state for all 464 FT labels, either on re-connection of the LDP session between the LDP 465 peers or on a timed basis. 467 3.6 Label Space Depletion and Replenishment 469 When an LDP peer is unable to satisfy a Label Request message because 470 it has no more available labels, it sends a Notification message 471 carrying the status code 'No label resources'. This warns the 472 requesting LDP peer that subsequent Label Request messages are also 473 likely to fail for the same reason. This message does not need to be 474 acknowledged for FT purposes since Label Request messages sent after 475 session recovery will receive the same response. 477 However, the LDP peer that receives a 'No label resources' 478 Notification stops sending Label Request messages until it receives a 479 'Label resources available' Notification message. 
Since this 480 unsolicited Notification might get lost during session failure, it 481 must be protected using the procedures described in this draft. 483 4. FT Operations 485 Once an FT LDP session has been established, using the procedures 486 described in the section "Establishing an FT LDP Session", both LDP 487 peers MUST apply the procedures described in this section for FT LDP 488 message exchanges. 490 If the LDP session has been negotiated to not use the LDP FT 491 enhancements, these procedures MUST NOT be used. 493 4.1 FT LDP Messages 495 4.1.1 FT Label Messages 497 A label is identified as being an FT label if the initial Label 498 Request or Label Mapping message relating to that label carries the 499 FT Protection TLV. 501 If a label is an FT label, all LDP messages affecting that label MUST 502 carry the FT Protection TLV in order that the state of the label can 503 be recovered after a failure of the LDP session. 505 4.1.1.1 Scope of FT Labels 507 The scope of the FT/non-FT status of a label is limited to the 508 LDP message exchanges between a pair of LDP peers. 510 In Ordered Control, when the message is forwarded downstream or 511 upstream, the TLV may be present or absent according to the 512 requirements of the LSR sending the message. 514 If a platform-wide label space is used for FT labels, an FT label 515 value MUST NOT be reused until all LDP FT peers to which the label 516 was passed have acknowledged the withdrawal of the FT label, either 517 by an explicit LABEL WITHDRAW/LABEL RELEASE exchange or implicitly if 518 the LDP session is reconnected after failure but without the FT 519 Reconnect Flag set. In the event that a session is not re- 520 established within the Reconnection Timeout, a label MAY become 521 available for re-use if it is not still in use on some other 522 session. 
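The label re-use rule of section 4.1.1.1 for a platform-wide label space can be sketched as a per-label set of peers that still hold the label. FtLabelTracker and its method names are hypothetical; peers are identified by plain strings for the sake of the example.

```python
from typing import Dict, Set


class FtLabelTracker:
    """Tracks which FT peers still hold each platform-wide FT label."""

    def __init__(self) -> None:
        self._holders: Dict[int, Set[str]] = {}

    def advertise(self, label: int, peer: str) -> None:
        """Record that the label was passed to the peer."""
        self._holders.setdefault(label, set()).add(peer)

    def withdrawal_acknowledged(self, label: int, peer: str) -> None:
        """Explicit Label Withdraw / Label Release exchange completed."""
        self._holders.get(label, set()).discard(peer)

    def session_reset(self, peer: str) -> None:
        """Implicit withdrawal of every label held by the peer.

        Applies when the session reconnects without the FT Reconnect
        Flag set, or is not re-established within the Reconnection
        Timeout.
        """
        for peers in self._holders.values():
            peers.discard(peer)

    def can_reuse(self, label: int) -> bool:
        """A label is reusable once no FT peer holds it any longer."""
        return not self._holders.get(label)
```

For example, a label advertised to P1 and P2 stays reserved after P1 acknowledges its withdrawal and becomes reusable only once P2 has also released it, explicitly or through a session reset.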
524   4.1.2 FT Address Messages

526   If an LDP session uses the LDP FT enhancements, both LDP peers
527   MUST secure Address and Address Withdraw messages using FT Operation
528   ACKs, as described below. This avoids any ambiguity over whether
529   an Address is still valid after the LDP session is reconnected.

531   If an LSR determines that an Address message that it sent on a
532   previous instantiation of a recovered LDP session is no longer valid,
533   it MUST explicitly issue an Address Withdraw for that address when
534   the session is reconnected.

536   If the FT Reconnect Flag is not set by both LDP peers on reconnection
537   of an LDP session (i.e. state has not been preserved), both LDP
538   peers MUST consider all Addresses to have been withdrawn. The LDP
539   peers SHOULD issue new Address messages for all their valid addresses
540   as specified in [4].

542   4.1.3 FT Label Resources Available Notification Messages

544   In LDP, it is possible that a downstream LSR may not have labels
545   available to respond to a Label Request. In this case, as specified
546   in RFC 3036, the downstream LSR must respond with a Notification - No
547   Label Resources message. The upstream LSR then suspends asking for
548   new labels until it receives a Notification - Label Resources
549   Available message from the downstream LSR.

551   When the FT extensions are used on a session:
552   1. The downstream LSR must preserve the label availability state
553      across a failover so that it remembers to send Notification -
554      Label Resources Available when the resources become available.
555   2. The upstream LSR must recall the label availability state across
556      failover so that it can avoid sending Label Requests
557      unnecessarily when it recovers.
558   3. The downstream LSR must use sequence numbers on Notification -
559      Label Resources Available so that it can check that the upstream
560      LSR has received the message and clear its secured state, or resend
561      the message if the upstream LSR recovers without having received it.
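The downstream side of the three requirements above can be sketched as a small piece of secured state per upstream peer. LabelAvailabilityState and its methods are hypothetical names; the sketch assumes that ACKs are cumulative and ignores sequence number wrap-around.

```python
from typing import Optional


class LabelAvailabilityState:
    """Downstream LSR's secured record for one upstream peer."""

    def __init__(self) -> None:
        self.peer_suspended = False              # peer was sent 'No label resources'
        self.unacked_seq: Optional[int] = None   # FT sequence number awaiting ACK

    def sent_no_label_resources(self) -> None:
        """Remember that the upstream peer has stopped sending Label Requests."""
        self.peer_suspended = True

    def labels_now_available(self, seq: int) -> bool:
        """Return True if 'Label resources available' should be sent with seq."""
        if not self.peer_suspended:
            return False
        self.unacked_seq = seq
        return True

    def ack_received(self, acked_seq: int) -> None:
        """An ACK covering the notification clears the secured state."""
        if self.unacked_seq is not None and acked_seq >= self.unacked_seq:
            self.peer_suspended = False
            self.unacked_seq = None

    def seq_to_resend_on_reconnect(self) -> Optional[int]:
        """Sequence number of a notification the peer may have missed, if any."""
        return self.unacked_seq
```

After a failover, a downstream LSR that restored this record would resend the notification whenever seq_to_resend_on_reconnect returns a value, and would otherwise stay silent.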
563 If the FT Reconnect Flag is not set by both LDP peers on reconnection 564 of an LDP session (i.e. state has not been preserved), both LDP 565 peers MUST consider the label availability state to have been reset 566 as if the session had been set up for the first time. 568 4.2 FT Operation ACKs 570 Handshaking of FT LDP messages is achieved by use of ACKs. 571 Correlation between the original message and the ACK is by means of 572 the FT Sequence Number contained in the FT Protection TLV, and passed 573 back in the FT ACK TLV. The FT ACK TLV may be carried on any LDP 574 message that is sent on the TCP connection between LDP peers. 576 An LDP peer maintains a separate FT sequence number for each LDP 577 session it participates in. The FT Sequence number is incremented by 578 one for each FT LDP message (i.e. containing the FT Protection TLV) 579 issued by this LSR on the FT LDP session with which the FT sequence 580 number is associated. 582 When an LDP peer receives a message containing the FT Protection TLV, 583 it MUST take steps to secure this message (or the state information 584 derived from processing the message). Once the message is secured, 585 it MUST be ACKed. However, there is no requirement on the LSR to 586 send this ACK immediately. 588 ACKs may be accumulated to reduce the message flow between LDP peers. 589 For example, if an LSR received FT LDP messages with sequence numbers 590 1, 2, 3, 4, it could send a single ACK with sequence number 4 to ACK 591 receipt and securing of all these messages. 593 ACKs MUST NOT be sent out of sequence, as this is incompatible with 594 the use of accumulated ACKs. Duplicate ACKs (that is two successive 595 messages that acknowledge the same sequence number) are acceptable. 
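The accumulated ACK behavior described above can be sketched with one object per direction. FtSender and FtReceiver are hypothetical names; the sketch assumes numbering starts at 1 (as in the example in the text), relies on TCP delivering messages in order, and does not handle wrap-around or exhaustion of the sequence number space.

```python
from typing import Dict, List, Tuple


class FtSender:
    """Sending side: numbers FT messages and keeps them until ACKed."""

    def __init__(self) -> None:
        self.next_seq = 1
        self.unacked: Dict[int, str] = {}

    def send(self, message: str) -> int:
        """Assign the next FT sequence number and retain the message."""
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = message
        return seq

    def on_ack(self, acked_seq: int) -> None:
        """An ACK for N acknowledges every message numbered N or lower."""
        for seq in [s for s in self.unacked if s <= acked_seq]:
            del self.unacked[seq]

    def to_reissue(self) -> List[Tuple[int, str]]:
        """Messages to re-issue, in order, after session reconnection."""
        return sorted(self.unacked.items())


class FtReceiver:
    """Receiving side: secures each message, then ACKs cumulatively."""

    def __init__(self) -> None:
        self.secured: Dict[int, str] = {}   # stand-in for stable storage
        self.highest_secured = 0

    def on_message(self, seq: int, message: str) -> None:
        self.secured[seq] = message         # secure before acknowledging
        self.highest_secured = max(self.highest_secured, seq)

    def next_ack(self) -> int:
        """Value for the FT ACK TLV on any outgoing message."""
        return self.highest_secured
```

Because on_ack removes everything at or below the acknowledged number, a duplicate ACK is harmless, which matches the statement that duplicate ACKs are acceptable.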
597   If an LDP peer discovers that its sequence number space for a
598   specific session is full of un-acknowledged sequence numbers (because
599   its partner on the session has not acknowledged them in a timely way),
600   it cannot allocate a new sequence number for any further FT LDP
601   message. It SHOULD send a Notification message with the status code
602   "FT Seq Numbers Exhausted".

604   4.3 Preservation of FT State

606   If the LDP FT enhancements are in use on an LDP session, each LDP
607   peer SHOULD NOT release the state information and resources
608   associated with FT labels exchanged on that LDP session when the TCP
609   connection fails. This is contrary to [2] and [4], but allows label
610   operations on FT labels to be completed after re-connection of the
611   TCP connection.

613   Both LDP peers on an LDP session that is using the LDP FT
614   enhancements SHOULD preserve the state information and resources they
615   hold for that LDP session as described below.

617   - An upstream LDP peer SHOULD release the resources (in
618     particular bandwidth) associated with an FT label when it
619     initiates a Label Release or Label Abort message for the label.
620     The upstream LDP peer MUST preserve state information for
621     the label, even if it releases the resources associated with the
622     label, as it may need to reissue the label operation if the
623     TCP connection is interrupted.

625   - An upstream LDP peer MUST release the state information
626     and resources associated with an FT label when it receives an
627     acknowledgement to a Label Release or Label Abort message that it
628     sent for the label, or when it sends a Label Release
629     message in response to a Label Withdraw message received from the
630     downstream LDP peer.
632 - A downstream LDP peer SHOULD NOT release the resources 633 associated with an FT label when it sends a Label Withdraw message 634 for the label as it has not yet received confirmation that the 635 upstream LDP peer has ceased to send data using the label. The 636 downstream LDP peer MUST NOT release the state information it 637 holds for the label as it may yet have to reissue the label 638 operation if the TCP connection is interrupted. 640 - A downstream LDP peer MUST release the resources and state 641 information associated with an FT label when it receives an 642 acknowledgement to a Label Withdraw message for the label. 644 - When the FT Reconnection Timeout expires, an LSR SHOULD release 645 all state information and resources from previous instantiations 646 of the (permanently) failed LDP session. 648 - Either LDP peer MAY elect to release state information based on 649 its internal knowledge of the loss of integrity of the state 650 information or an inability to pend (or queue) LDP operations 651 (as described in section 4.4.1) during a TCP failure. That is, 652 the peer is not required to wait for the duration of the FT 653 Reconnection Timeout before releasing state; the timeout provides 654 an upper limit on the persistence of state. However, In the event 655 that a peer releases state before the expiration of the 656 Reconnection Timeout it MUST NOT re-use any label that was in use 657 on the session until the Reconnection Timeout has expired. 659 - When an LSR receives a Status TLV with the E-bit set in 660 the status code, which causes it to close the TCP connection, the 661 LSR MUST release all state information and resources associated 662 with the session. This behavior is mandated because it is 663 impossible for the LSR to predict the precise state and future 664 behavior of the partner LSR that set the E-bit without knowledge 665 of the implementation of that partner LSR. 
667 Note that the "Temporary Shutdown" status code does not have the 668 E-bit set, and MAY be used during maintenance or upgrade 669 operations to indicate that the LSR intends to preserve state 670 across a closure and re-establishment of the TCP session. 672 - If an LSR determines that it must release state for any single FT 673 label during a failure of the TCP connection on which that label 674 was exchanged, it MUST release all state for all labels on the LDP 675 session. 677 The release of state information and resources associated with non-FT 678 labels is as described in [2] and [4]. 680 Note that a Label Release and the acknowledgement to a Label Withdraw 681 may be received by a downstream LSR in any order. The downstream LSR 682 MAY release its resources on receipt of the first message and MUST 683 release its resources on receipt of the second message. 685 4.4 FT Procedure After TCP Failure 687 When an LSR discovers or is notified of a TCP connection failure it 688 SHOULD start an FT Reconnection Timer to allow a period for 689 re-connection of the TCP connection between the LDP peers. 691 The RECOMMENDED default value for this timer is 5 seconds. During 692 this time, failure must be detected and reported, new hardware may 693 need to be activated, software state must be audited, and a new TCP 694 session must be set up. 696 Once the TCP connection between LDP peers has failed, the active LSR 697 SHOULD attempt to re-establish the TCP connection. The mechanisms, 698 timers and retry counts to re-establish the TCP connection are an 699 implementation choice. It is RECOMMENDED that any attempt to 700 re-establish the connection take account of the failover processing 701 necessary on the peer LSR, the nature of the network between the 702 LDP peers, and the FT Reconnection Timeout chosen on the previous 703 instantiation of the TCP connection (if any). 
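The FT Reconnection Timer started on TCP failure can be pictured with the following minimal sketch. The class name ReconnectTimer and the convention of passing the current time in as a parameter are assumptions made for illustration; the 5 second default is the RECOMMENDED value above, and the treatment of zero as "preserve indefinitely" follows the FT Reconnection Timeout field description in section 6.2.

```python
DEFAULT_FT_RECONNECT_TIMEOUT = 5  # seconds; the RECOMMENDED default


class ReconnectTimer:
    """Hypothetical FT Reconnection Timer; the caller supplies the clock."""

    def __init__(self, timeout=DEFAULT_FT_RECONNECT_TIMEOUT):
        self.timeout = timeout
        self.started_at = None  # None means the timer is not running

    def start(self, now):
        """Called when a TCP connection failure is detected or notified."""
        self.started_at = now

    def stop(self):
        """Called when the TCP connection is re-established in time."""
        self.started_at = None

    def expired(self, now):
        """True once preserved state should be released."""
        if self.started_at is None:
            return False
        if self.timeout == 0:
            return False  # zero: state is preserved indefinitely (section 6.2)
        return now - self.started_at >= self.timeout
```

In use, an implementation might call start() with a monotonic clock reading when the failure is noticed and poll expired() while attempting to re-establish the connection.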
705 If the TCP connection cannot be re-established within the FT 706 Reconnection Timeout period, the LSR detecting this timeout SHOULD 707 release all state preserved for the failed LDP session. If the TCP 708 connection is subsequently re-established (for example, after a 709 further Hello exchange to set up a new LDP session), the LSR MUST set 710 the FT Reconnect Flag to 0 if it released the preserved state 711 information on this timeout event. 713 If the TCP connection is successfully re-established within the FT 714 Reconnection Timeout, both peers MUST re-issue LDP operations that 715 were interrupted by (that is, un-acknowledged as a result of) the TCP 716 connection failure. This procedure is described in section "FT 717 Procedure After TCP Re-connection". 719 The Hold Timer for an FT LDP Session (see [4] section 2.5.5) SHOULD 720 be ignored while the FT Reconnection Timer is running. The hold 721 timer SHOULD be restarted when the TCP connection is re-established. 723 4.4.1 FT LDP Operations During TCP Failure 725 When the LDP FT enhancements are in use for an LDP session, it is 726 possible that an LSR may determine that it needs to send an LDP 727 message to an LDP peer but that the TCP connection to that peer is 728 currently down. These label operations affect the state of FT labels 729 preserved for the failed TCP connection, so it is important that the 730 state changes are passed to the LDP peer when the TCP connection is 731 restored. 733 If an LSR determines that it needs to issue a new FT LDP operation to 734 an LDP peer to which the TCP connection is currently failed, it MUST 735 pend the operation (e.g. on a queue) and complete that operation 736 with the LDP peer when the TCP connection is restored, unless the 737 label operation is overridden by a subsequent additional operation 738 during the TCP connection failure (see section "FT Procedure After 739 TCP Re-connection"). 
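Pending of FT operations while the TCP connection is down might be sketched as below. The class PendingOps and the send callback are hypothetical names introduced only for this illustration; the sketch deliberately leaves out the suppression of overridden operations and the re-issue of messages that were sent but not acknowledged before the failure.

```python
from collections import deque


class PendingOps:
    """Hypothetical queue of FT operations raised during a TCP failure."""

    def __init__(self):
        self.connected = True
        self.queue = deque()

    def on_tcp_failure(self):
        self.connected = False

    def issue(self, op, send):
        """Send op now if the connection is up, otherwise pend it."""
        if self.connected:
            send(op)
        else:
            self.queue.append(op)

    def on_tcp_restored(self, send):
        """Complete the pended operations in their original order."""
        self.connected = True
        while self.queue:
            send(self.queue.popleft())
```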
741 If, during TCP Failure, an LSR determines that it cannot pend an 742 operation which it cannot simply fail (for example a Label Withdraw, 743 Release, or Abort operation), it MUST NOT attempt to re-establish 744 the previous LDP session. The LSR MUST behave as if the Reconnection 745 Timer expired and release all state information with respect to the 746 LDP peer. An LSR may be unable (or unwilling) to pend operations; 747 for instance, if a major routing transition occurred while TCP was 748 inoperable between LDP peers it might result in excessively large 749 numbers of FT LDP Operations. An LSR that releases state before the 750 expiration of the Reconnection Timeout MUST NOT re-use any label that 751 was in use on the session until the Reconnection Timeout has expired. 753 In ordered operation, received FT LDP operations that cannot be 754 correctly forwarded because of a TCP connection failure MAY be 755 processed immediately (provided sufficient state is kept to forward 756 the label operation) or pended for processing when the onward TCP 757 connection is restored and the operation can be correctly forwarded 758 upstream or downstream. Operations on existing FT labels SHOULD NOT 759 be failed during TCP session failure. 761 It is RECOMMENDED that Label Request operations for new FT labels are 762 not pended awaiting the re-establishment of TCP connection that is 763 awaiting recovery at the time the LSR determines that it needs to 764 issue the Label Request message. Instead, such Label Request 765 operations SHOULD be failed and, if necessary, a notification message 766 containing the "No LDP Session" status code sent upstream. 768 Label Requests for new non-FT labels MUST be rejected during TCP 769 connection failure, as specified in [2] and [4]. 771 4.5 FT Procedure After TCP Re-connection 773 The FT operation handshaking described above means that all state 774 changes for FT labels and Address messages are confirmed or 775 reproducible at each LSR. 
777 If the TCP connection between LDP peers fails but is re-connected 778 within the FT Reconnection Timeout, and both LSRs have indicated 779 they will be re-establishing the previous LDP session, both LDP 780 peers on the connection MUST complete any label operations for FT 781 labels that were interrupted by the failure and re-connection of 782 the TCP connection. 784 The procedures for FT Reconnection Timeout MAY have been invoked as 785 a result of either LDP peer being unable (or unwilling) to pend 786 operations which occurred during the TCP Failure (as described in 787 section 4.4.1). 789 If, for any reason, an LSR has been unable to pend operations with 790 respect to an LDP peer, as described in section 4.4.1, the LSR MUST 791 set the FT Reconnect Flag to 0 on re-connection to that LDP peer 792 indicating that no FT state has been preserved. 794 Label operations are completed using the procedure described below. 796 4.5.1 Re-Issuing FT Messages 798 On restoration of the TCP connection between LDP peers, any FT 799 LDP messages that were lost because of the TCP connection 800 failure are re-issued. The LDP peer that receives a re-issued message 801 processes the message as if received for the first time. 803 "Net-zero" combinations of messages need not be re-issued after 804 re-establishment of the TCP connection between LDP peers. This leads 805 to the following rules for re-issuing messages that are not ACKed by 806 the LDP peer on the LDP Initialization message exchange after 807 re-connection of the TCP session. 809 - A Label Request message MUST be re-issued unless a Label Abort 810 would be re-issued for the same FT label. 812 - A Label Mapping message MUST be re-issued unless a Label Withdraw 813 message would be re-issued for the same FT label. 815 - All other messages on the LDP session that carried the FT 816 Protection TLV MUST be re-issued if an acknowledgement had not 817 previously been received. 
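The three rules above can be sketched as a filter over the list of un-acknowledged messages. This is an illustrative reading, not a definitive one: the function name and the (message type, label) tuple representation are assumptions, and the sketch applies the rules literally, so a Label Abort or Label Withdraw that cancels an earlier message is itself still re-issued.

```python
# Message types whose re-issue is suppressed by a later cancelling message.
CANCELLED_BY = {
    "Label Request": "Label Abort",
    "Label Mapping": "Label Withdraw",
}


def messages_to_reissue(unacked):
    """Return the messages to re-issue after re-connection.

    unacked: list of (msg_type, label) tuples in original send order, all
    of which carried the FT Protection TLV and were not ACKed by the peer.
    """
    result = []
    for i, (msg_type, label) in enumerate(unacked):
        canceller = CANCELLED_BY.get(msg_type)
        # Rules 1 and 2: skip a Request/Mapping if its canceller follows.
        if canceller and (canceller, label) in unacked[i + 1:]:
            continue
        # Rule 3: everything else that was not acknowledged is re-issued.
        result.append((msg_type, label))
    return result
```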
819 Any FT label operations that were pended (see section "FT Label 820 Operations During TCP Failure") during the TCP connection failure 821 MUST also be issued on re-establishment of the LDP session, except 822 where they form part of a "net-zero" combination of messages 823 according to the above rules. 825 The determination of "net-zero" FT label operations according to the 826 above rules MAY be performed on pended messages prior to the 827 re-establishment of the TCP connection in order to optimize the use 828 of queue resources. Messages that were sent to the LDP peer before 829 the TCP connection failure, or pended messages that are paired with 830 them, MUST NOT be subject to such optimization until an FT ACK TLV is 831 received from the LDP peer. This ACK allows the LSR to identify 832 which messages were received by the LDP peer prior to the TCP 833 connection failure. 835 4.5.2 Interaction with CR-LDP LSP Modification 837 Re-issuing LDP messages for FT operation is orthogonal to the use of 838 duplicate messages marked with the Modify ActFlg, as specified in 839 [5]. Each time an LSR uses the modification procedure for an FT LSP 840 to issue a new Label Request message, the FT label operation 841 procedures MUST be separately applied to the new Label Request 842 message. 844 5. Changes to Existing Messages 846 5.1 LDP Initialization Message 848 The LDP FT enhancements add the following optional parameters to a 849 LDP Initialization message 851 Optional Parameter Length Value 853 FT Session TLV 4 See below 854 FT ACK TLV 4 See below 856 The encoding for these TLVs is found in Section "New Fields and 857 Values". 859 FT Session 860 If present, specifies the FT behavior of the LDP session. 862 FT ACK TLV 863 If present, specifies the last FT message that the sending LDP 864 peer was able to secure prior to the failure of the previous 865 instantiation of the LDP session. 
This TLV is only present if 866 the FT Reconnect flag is set in the FT Session TLV, in which 867 case this TLV MUST be present. 869 5.2 LDP Keepalive Messages 871 The LDP FT enhancements add the following optional parameter to a 872 LDP Keepalive message 874 Optional Parameter Length Value 876 FT ACK TLV 4 See below 878 The encoding for FT ACK TLV is found in Section "FT ACK TLV". 880 FT ACK TLV 881 If present, specifies the most recent FT message that the 882 sending LDP peer has been able to secure. 884 5.3 All Other LDP Session Messages 886 The LDP FT enhancements add the following optional parameters to all 887 other message types that flow on an LDP session after the LDP 888 Initialization message 890 Optional Parameter Length Value 892 FT Protection TLV 4 See below 893 FT ACK TLV 4 See below 895 The encoding for these TLVs is found in the section "New Fields and 896 Values". 898 FT Protection 899 If present, specifies FT Sequence Number for the LDP message. 901 FT ACK 902 If present, identifies the most recent FT LDP message 903 ACKed by the sending LDP peer. 905 6. New Fields and Values 907 6.1 Status Codes 909 The following new status codes are defined to indicate various 910 conditions specific to the LDP FT enhancements. These status codes 911 are carried in the Status TLV of a Notification message. 913 The "E" column is the required setting of the Status Code E-bit; the 914 "Status Data" column is the value of the 30-bit Status Data field in 915 the Status Code TLV. 917 Note that the setting of the Status Code F-bit is at the discretion 918 of the LSR originating the Status TLV. However, it is RECOMMENDED 919 that the F-bit is not set on Notification messages containing 920 status codes except "No LDP Session" because the duplication of 921 messages SHOULD be restricted to being a per-hop behavior. 
923 Status Code E Status Data 925 No LDP Session 0 0x000000xx 926 Zero FT seqnum 1 0x000000xx 927 Unexpected TLV / 1 0x000000xx 928 Session Not FT 929 Unexpected TLV / 1 0x000000xx 930 Label Not FT 931 Missing FT Protection TLV 1 0x000000xx 932 FT ACK sequence error 1 0x000000xx 933 Temporary Shutdown 0 0x000000xx 934 FT Seq Numbers Exhausted 1 0x000000xx 935 FT Session parameters / 1 0x000000xx 936 changed 938 The Temporary Shutdown status code SHOULD be used in place of 939 the Shutdown status code (which has the E-bit set) if the LSR that is 940 shutting down wishes to inform its LDP peer that it expects to be 941 able to preserve FT label state and to return to service before the 942 FT Reconnection Timer expires. 944 6.2 FT Session TLV 946 LDP peers can negotiate whether the LDP session between them supports 947 FT extensions by using a new OPTIONAL parameter, the FT Session 948 TLV, on LDP Initialization Messages. 950 The FT Session TLV is encoded as follows. 952 0 1 2 3 953 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 954 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 955 |1|0| FT Session TLV (0x0503) | Length (= 4) | 956 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 957 | FT Flags | FT Reconnection Timeout | 958 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 960 FT Flags 961 A 16-bit field that indicates various attributes of the 962 FT support on this LDP session. This field is formatted as 963 follows: 965 0 1 966 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 967 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 968 |R| Reserved | 969 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 971 R: FT Reconnect Flag. 972 Set to 1 if the sending LSR has preserved state and 973 resources for all FT-labels since the previous LDP 974 session between the same LDP peers, and set to 0 975 otherwise. See the section "FT LDP Session 976 Reconnection" for details of how this flag is used.
978 If the FT Reconnect Flag is set, the sending LSR must 979 include an FT ACK TLV on the LDP Initialization message. 981 All other bits in this field are currently reserved and SHOULD 982 be set to zero on transmission and ignored on receipt. 984 FT Reconnection Timeout 985 The period of time the sending LSR will preserve state and 986 resources for FT labels exchanged on the previous instantiation of 987 an FT LDP session that has currently failed. The timeout is 988 encoded as a 16-bit unsigned integer number of seconds. 990 A value of zero in this field means that the sending LSR will 991 preserve state and resources indefinitely. 993 See the section "FT Procedure After TCP Failure" for details of how 994 this field is used. 996 6.3 FT Protection TLV 998 LDP peers use the FT Protection TLV to indicate that an LDP message 999 contains an FT label operation. 1001 The FT Protection TLV MUST NOT be used in messages flowing on an LDP 1002 session that does not support the LDP FT enhancements. Its presence 1003 in such messages SHALL be treated as a protocol error by the 1004 receiving LDP peer which SHOULD send a Notification message with the 1005 'Unexpected TLV Session Not FT' status code. 1007 The FT Protection TLV MAY be carried on an LDP message transported on 1008 the LDP session after the initial exchange of LDP Initialization 1009 messages. In particular, this TLV MAY optionally be present on the 1010 following messages: 1012 - Label Request Messages in downstream on-demand distribution mode 1013 - Label Mapping messages in downstream unsolicited mode. 1015 If a label is to be an FT label, then the Protection TLV MUST be 1016 present: 1017 - on the Label Request message in DoD mode 1018 - on the Label Mapping message in DU mode 1019 - on all subsequent messages concerning this label. 
1021 Here 'subsequent messages concerning this label' means any message 1022 whose Label TLV specifies this label or whose Label Request Message 1023 ID TLV specifies the initial Label Request message. 1025 If a label is not to be an FT label, then the Protection TLV 1026 MUST NOT be present on any of these messages. The presence of the FT 1027 TLV on a message relating to a non-FT label SHALL be treated as a 1028 protocol error by the receiving LDP peer which SHOULD send a 1029 notification message with the 'Unexpected TLV Label Not FT' status 1030 code. 1032 Where a Label Withdraw or Label Release message contains only a FEC 1033 TLV and does not identify a single specific label, the FT TLV should 1034 be included in the message if any label affected by the message is an 1035 FT label. If there is any doubt as to whether an FT TLV should be 1036 present, it is RECOMMENDED that the sender add the TLV. 1038 When an LDP peer receives a Label Withdraw Message or Label Release 1039 message that contains only a FEC, it SHALL accept the FT TLV if it is 1040 present regardless of the FT status of the labels which it affects. 1042 If an LDP session is an FT session as determined by the presence of 1043 the FT Session TLV on the LDP Initialization messages, the FT 1044 Protection TLV MUST be present: 1045 - on all Address messages on the session 1046 - on all Notification messages on the session that have the status 1047 code 'Label Resources Available'. 1049 The FT Protection TLV is encoded as follows. 1051 0 1 2 3 1052 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1053 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1054 |0|0| FT Protection (0x0203) | Length (= 4) | 1055 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1056 | FT Sequence Number | 1057 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1059 FT Sequence Number 1060 The sequence number for this FT label operation. 
The 1061 sequence number is encoded as a 32-bit unsigned integer. The 1062 initial value for this field on a new LDP session is 0x00000001 and 1063 is incremented by one for each FT LDP message issued by the sending 1064 LSR on this LDP session. This field may wrap from 0xFFFFFFFF to 1065 0x00000001. 1067 This field MUST be reset to 0x00000001 if either LDP peer does not 1068 set the FT Reconnect Flag on re-establishment of the TCP 1069 connection. 1071 See the section "FT Operation Acks" for details of how this field 1072 is used. 1074 The special use of 0x00000000 is discussed in the section "FT ACK 1075 TLV" below. 1077 If an LSR receives an FT Protection TLV on a session that does not 1078 support the FT LDP enhancements, it SHOULD send a Notification 1079 message to its LDP peer containing the "Unexpected TLV, Session Not 1080 FT" status code. 1082 If an LSR receives an FT Protection TLV on an operation affecting a 1083 label that it believes is a non-FT label, it SHOULD send a 1084 Notification message to its LDP peer containing the "Unexpected TLV, 1085 Label Not FT" status code. 1087 If an LSR receives a message without the FT Protection TLV affecting 1088 a label that it believes is an FT label, it SHOULD send a 1089 Notification message to its LDP peer containing the "Missing FT 1090 Protection TLV" status code. 1092 If an LSR receives an FT Protection TLV containing a zero FT 1093 Sequence Number, it SHOULD send a Notification message to its LDP 1094 peer containing the "Zero FT Seqnum" status code. 1096 6.4 FT ACK TLV 1098 LDP peers use the FT ACK TLV to acknowledge FT label operations. 1100 The FT ACK TLV MUST NOT be used in messages flowing on an LDP session 1101 that does not support the LDP FT enhancements. Its presence on such 1102 messages SHALL be treated as a protocol error by the receiving LDP 1103 peer. 1105 The FT ACK TLV MAY be present on any LDP message exchanged on an 1106 LDP session after the initial LDP Initialization messages. 
It is 1107 RECOMMENDED that the FT ACK TLV is included on all FT 1108 Keepalive messages in order to ensure that the LDP peers do not 1109 build up a large backlog of unacknowledged state information. 1111 The FT ACK TLV is encoded as follows. 1113 0 1 2 3 1114 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1115 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1116 |0|0| FT ACK (0x0504) | Length (= 4) | 1117 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1118 | FT ACK Sequence Number | 1119 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1121 FT ACK Sequence Number 1122 The sequence number for this most recent FT label message 1123 that the sending LDP peer has received from the receiving LDP 1124 peer and secured against failure of the LDP session. It is not 1125 necessary for the sending peer to have fully processed the message 1126 before ACKing it. For example, an LSR MAY ACK a Label Request 1127 message as soon as it has securely recorded the message, without 1128 waiting until it can send the Label Mapping message in response. 1130 ACKs are cumulative. Receipt of an LDP message containing an FT 1131 ACK TLV with an FT ACK Sequence Number of 12 is treated as the 1132 acknowledgement of all messages from 1 to 12 inclusive (assuming 1133 the LDP session started with a sequence number of 1). 1135 This field MUST be set to 0 if the LSR sending the FT ACK TLV has 1136 not received any FT label operations on this LDP session. This 1137 would apply to LDP sessions to new LDP peers or after an LSR 1138 determines that it must drop all state for a failed TCP connection. 1140 See the section "FT Operation Acks" for details of how this field 1141 is used. 
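The three TLV layouts in sections 6.2 to 6.4 can be illustrated with a short encoding sketch. The function names are hypothetical, and the type codes are simply those shown in the diagrams of this draft. The sketch assumes the usual LDP TLV header layout of [4] (U-bit, F-bit, 14-bit type, 16-bit length) in network byte order, with the R flag occupying bit 0 (the most significant bit) of the FT Flags field.

```python
import struct

# Type codes as shown in the TLV diagrams of this draft.
FT_SESSION_TLV = 0x0503
FT_PROTECTION_TLV = 0x0203
FT_ACK_TLV = 0x0504

U_BIT = 0x8000   # set on the FT Session TLV, clear on the other two
R_FLAG = 0x8000  # FT Reconnect Flag: bit 0 of the 16-bit FT Flags field


def encode_ft_session(reconnect, timeout_seconds):
    """FT Session TLV: U=1, F=0, 16-bit flags, 16-bit timeout in seconds."""
    flags = R_FLAG if reconnect else 0
    return struct.pack("!HHHH", U_BIT | FT_SESSION_TLV, 4,
                       flags, timeout_seconds)


def encode_ft_protection(seq):
    """FT Protection TLV: U=0, F=0, 32-bit FT Sequence Number (never 0)."""
    if seq == 0:
        raise ValueError("zero is not a valid FT Sequence Number")
    return struct.pack("!HHI", FT_PROTECTION_TLV, 4, seq)


def encode_ft_ack(ack_seq):
    """FT ACK TLV: U=0, F=0, 32-bit FT ACK Sequence Number (0 = none)."""
    return struct.pack("!HHI", FT_ACK_TLV, 4, ack_seq)


def decode_tlv(data):
    """Return (type, value bytes), masking off the U and F bits."""
    first, length = struct.unpack_from("!HH", data)
    return first & 0x3FFF, data[4:4 + length]
```

For example, encode_ft_ack(12) yields the eight bytes 05 04 00 04 00 00 00 0c.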
1143 If an LSR receives a message affecting a label that it believes is an 1144 FT label and that message does not contain the FT Protection TLV, it 1145 SHOULD send a Notification message to its LDP peer containing the 1146 "Missing FT Protection TLV" status code. 1148 If an LSR receives an FT ACK TLV that contains an FT ACK Sequence 1149 Number that is less than the previously received FT ACK Sequence 1150 Number (remembering to take account of wrapping), it SHOULD send a 1151 Notification message to its LDP peer containing the "FT ACK 1152 Sequence Error" status code. 1154 7. Example Use 1156 Consider two LDP peers, P1 and P2, implementing LDP over a TCP 1157 connection that connects them, and the message flow shown below. 1159 The parameters shown on each message shown below are as follows: 1161 message (label, sender's FT sequence number, FT ACK number) 1163 A "-" for FT ACK number means that the FT ACK TLV is not included 1164 on that message. "n/a" means that the parameter in question is not 1165 applicable to that type of message. 1167 In the diagrams below, time flows from top to bottom. The relative 1168 position of each message shows when it is transmitted. See the notes 1169 for a description of when each message is received, secured for FT or 1170 processed.
1172 7.1 Session Failure and Recovery 1174 notes P1 P2 1175 ===== == == 1176 (1) Label Request(L1,27,-) 1177 ---------------------------> 1178 Label Request(L2,28,-) 1179 ---------------------------> 1180 (2) Label Request(L3,93,27) 1181 <--------------------------- 1182 (3) Label Request(L1,123,-) 1183 --------------------------> 1184 Label Request(L2,124,-) 1185 --------------------------> 1186 (4) Label Mapping(L1,57,-) 1187 <-------------------------- 1188 Label Mapping(L1,94,28) 1189 <--------------------------- 1190 (5) Label Mapping(L2,58,-) 1191 <-------------------------- 1192 Label Mapping(L2,95,-) 1193 <--------------------------- 1194 (6) Address(n/a,29,-) 1195 ---------------------------> 1196 (7) Label Request(L4,30,-) 1197 ---------------------------> 1198 (8) Keepalive(n/a,n/a,94) 1199 ---------------------------> 1200 (9) Label Abort(L3,96,-) 1201 <--------------------------- 1202 (10) ===== TCP Session lost ===== 1203 : 1204 (11) : Label Withdraw(L1,59,-) 1205 : <-------------------------- 1206 : 1207 (12) === TCP Session restored === 1209 LDP Init(n/a,n/a,94) 1210 ---------------------------> 1211 LDP Init(n/a,n/a,29) 1212 <--------------------------- 1213 (13) Label Request(L4,30,-) 1214 ---------------------------> 1215 (14) Label Mapping(L2,95,-) 1216 <--------------------------- 1217 Label Abort(L3,96,30) 1218 <--------------------------- 1219 (15) Label Withdraw(L1,97,-) 1220 <--------------------------- 1222 Notes: 1223 ====== 1225 (1) Assume that the LDP session has already been initialized. 1226 P1 issues 2 new Label Requests using the next sequence numbers. 1228 (2) P2 issues a third Label Request to P1. At the time of sending 1229 this request, P2 has secured the receipt of the label request 1230 for L1 from P1, so it includes an ACK for that message. 1232 (3) P2 processes the Label Requests for L1 and L2 and forwards them 1233 downstream. Details of downstream processing are not shown in 1234 the diagram above.
1236 (4) P2 receives a Label Mapping from downstream for L1, which it 1237 forwards to P1. It includes an ACK to the Label Request for L2, 1238 as that message has now been secured and processed. 1240 (5) P2 receives the Label Mapping for L2, which it forwards to P1. 1241 This time it does not include an ACK as it has not received any 1242 further messages from P1. 1244 (6) Meanwhile, P1 sends a new Address Message to P2. 1246 (7) P1 also sends a fourth Label Request to P2. 1248 (8) P1 sends a Keepalive message to P2, on which it includes an ACK 1249 for the Label Mapping for L1, which is the latest message P1 has 1250 received and secured at the time the Keepalive is sent. 1252 (9) P2 issues a Label Abort for L3. 1254 (10) At this point, the TCP session goes down. 1256 (11) While the TCP session is down, P2 receives a Label Withdraw 1257 Message for L1, which it queues. 1259 (12) The TCP session is reconnected and P1 and P2 exchange LDP 1260 Initialization messages on the recovered session, which include 1261 ACKs for the last message each peer received and secured prior 1262 to the failure. 1264 (13) From the LDP Init exchange, P1 determines that it needs to 1265 re-issue the Label Request for L4. 1267 (14) Similarly, P2 determines that it needs to re-issue the Label 1268 Mapping for L2 and the Label Abort. 1270 (15) P2 issues the queued Label Withdraw to P1.
1272 7.2 Temporary Shutdown 1274 notes P1 P2 1275 ===== == == 1276 (1) Label Request(L1,27,-) 1277 ---------------------------> 1278 Label Request(L2,28,-) 1279 ---------------------------> 1280 (2) Label Request(L3,93,27) 1281 <--------------------------- 1282 (3) Label Request(L1,123,-) 1283 --------------------------> 1284 Label Request(L2,124,-) 1285 --------------------------> 1286 (4) Label Mapping(L1,57,-) 1287 <-------------------------- 1288 Label Mapping(L1,94,28) 1289 <--------------------------- 1290 (5) Label Mapping(L2,58,-) 1291 <-------------------------- 1292 Label Mapping(L2,95,-) 1293 <--------------------------- 1294 (6) Address(n/a,29,-) 1295 ---------------------------> 1296 (7) Label Request(L4,30,-) 1297 ---------------------------> 1298 (8) Keepalive(n/a,n/a,94) 1299 ---------------------------> 1300 (9) Label Abort(L3,96,-) 1301 <--------------------------- 1302 (10) Notification(Temporary Shutdown) 1303 ---------------------------> 1304 : 1305 (11) : Label Withdraw(L1,59,-) 1306 : <-------------------------- 1307 : 1308 (12) LDP Init(n/a,n/a,94) 1309 ---------------------------> 1310 LDP Init(n/a,n/a,29) 1311 <--------------------------- 1312 (13) Label Request(L4,30,-) 1313 ---------------------------> 1314 (14) Label Mapping(L2,95,-) 1315 <--------------------------- 1316 Label Abort(L3,96,30) 1317 <--------------------------- 1318 (15) Label Withdraw(L1,97,-) 1319 <--------------------------- 1321 Notes: 1322 ====== 1324 Notes are as in the previous example except as follows. 1326 (10) P1 needs to upgrade the software or hardware that it is running. 1327 It issues a Notification message to terminate the LDP session, 1328 but sets the status code as 'Temporary Shutdown' to inform P2 1329 that this is not a fatal error, and P2 should maintain FT state.
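The decisions in notes (13) and (14) can be reproduced with a small sketch that takes the sequence numbers each peer sent before the failure and the FT ACK number carried on the other peer's LDP Initialization message. The function name is hypothetical, and the sketch assumes each peer keeps its sent FT sequence numbers in send order.

```python
def unacked_after_reconnect(sent_seqs, peer_ack):
    """Return the sequence numbers that must be considered for re-issue.

    sent_seqs: FT sequence numbers sent before the failure, in send order.
    peer_ack:  FT ACK Sequence Number from the peer's LDP Initialization
               message (0 means the peer secured nothing).
    """
    if peer_ack not in sent_seqs:
        return list(sent_seqs)  # nothing in this list was acknowledged
    return list(sent_seqs[sent_seqs.index(peer_ack) + 1:])


# Sequence numbers taken from the message flow in section 7.1.
p1_sent = [27, 28, 29, 30]  # Requests L1, L2; Address; Request L4
p2_sent = [93, 94, 95, 96]  # Request L3; Mappings L1, L2; Abort L3

print(unacked_after_reconnect(p1_sent, 29))  # prints [30]
print(unacked_after_reconnect(p2_sent, 94))  # prints [95, 96]
```

The results match the flow: P1 re-issues the Label Request for L4 (sequence number 30), and P2 re-issues the Label Mapping for L2 (95) and the Label Abort for L3 (96) before sending the queued Label Withdraw with the new sequence number 97.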
1330 The TCP connection may also fail during the period that the LDP 1331 session is down (in which case it will need to be 1332 re-established), but it is also possible that the TCP connection 1333 will be preserved. 1335 8. Security Considerations 1337 The LDP FT enhancements inherit similar security considerations to 1338 those discussed in [2] and [4]. 1340 The LDP FT enhancements allow the re-establishment of a TCP 1341 connection between LDP peers without a full re-exchange of the 1342 attributes of established labels, which renders LSRs that implement 1343 the extensions specified in this draft vulnerable to additional 1344 denial-of-service attacks as follows: 1346 - An intruder may impersonate an LDP peer in order to force a 1347 failure and reconnection of the TCP connection, but where the 1348 intruder does not set the FT Reconnect Flag on re-connection. 1349 This forces all FT labels to be released. 1351 - Similarly, an intruder could set the FT Reconnect Flag on 1352 re-establishment of the TCP session without preserving the state 1353 and resources for FT labels. 1355 - An intruder could intercept the traffic between LDP peers and 1356 override the setting of the FT Label Flag to be set to 0 for 1357 all labels. 1359 All of these attacks may be countered by use of an authentication 1360 scheme between LDP peers, such as the MD5-based scheme outlined in 1361 [4]. 1363 Alternative authentication schemes for LDP peers are outside the 1364 scope of this draft, but could be deployed to provide enhanced 1365 security to implementations of LDP, CR-LDP and the LDP FT 1366 enhancements. 1368 As with LDP and CR-LDP, a security issue may exist if an LDP 1369 implementation continues to use labels after expiration of the 1370 session that first caused them to be used. This may arise if the 1371 upstream LSR detects the session failure after the downstream LSR 1372 has released and re-used the label. 
The problem is most obvious 1373 with the platform-wide label space and could result in mis-routing 1374 of data to other than intended destinations and it is conceivable 1375 that these behaviors may be deliberately exploited to either obtain 1376 services without authorization or to deny services to others. 1378 In this draft, the validity of the session may be extended by the FT 1379 Reconnection Timeout, and the session may be re-established in this 1380 period. After the expiry of the Reconnection Timeout the session 1381 must be considered to have failed and the same security issue applies 1382 as described above. 1384 However, the downstream LSR may declare the session as failed before 1385 the expiration of its Reconnection Timeout. This increases the 1386 period during which the downstream LSR might reallocate the label 1387 while the upstream LSR continues to transmit data using the old usage 1388 of the label. To reduce this issue, this draft requires that labels 1389 are not re-used until the Reconnection Timeout has expired. 1391 A further issue might apply if labels were re-used prior to the 1392 expiration of the FT Reconnection Timeout, but this is forbidden by 1393 this draft. 1395 9. Implementation Notes 1397 9.1 FT Recovery Support on Non-FT LSRs 1399 In order to take full advantage of the FT capabilities of LSRs in the 1400 network, it may be that an LSR that does not itself contain the 1401 ability to recover from local hardware or software faults still needs 1402 to support the LDP FT enhancements described in this draft. 1404 Consider an LSR, P1, that is an LDP peer of a fully Fault Tolerant 1405 LSR, P2. If P2 experiences a fault in the hardware or software that 1406 serves an LDP session between P1 and P2, it may fail the TCP 1407 connection between the peers. When the connection is recovered, the 1408 LSPs/labels between P1 and P2 can only be recovered if both LSRs were 1409 applying the FT recovery procedures to the LDP session. 
1411 9.2 ACK generation logic 1413 FT ACKs SHOULD be returned to the sending LSR as soon as is 1414 practicable in order to avoid building up a large quantity of 1415 unacknowledged state changes at the LSR. However, immediate 1416 one-for-one acknowledgements would waste bandwidth unnecessarily. 1418 A possible implementation strategy for sending ACKs to FT LDP 1419 messages is as follows: 1420 - An LSR secures received messages in order and tracks the sequence 1421 number of the most recently secured message, Sr. 1422 - On each LDP KeepAlive that the LSR sends, it attaches an FT ACK 1423 TLV listing Sr. 1424 - Optionally, the LSR may attach an FT ACK TLV to any other LDP 1425 message sent between KeepAlive messages if, for example, Sr has 1426 increased by more than a threshold value since the last ACK sent. 1428 This strategy combines the bandwidth benefits of accumulating 1429 ACKs with timely acknowledgement. 1431 10. Acknowledgments 1433 The work in this draft is based on the LDP and CR-LDP ideas 1434 expressed by the authors of [2] and [4]. 1436 The ACK scheme used in this draft was inspired by the proposal by 1437 David Ward and John Scudder for restarting BGP sessions now included 1438 in [9]. 1440 The authors would also like to acknowledge the careful review and 1441 comments of Nick Weeds, Piers Finlayson, Tim Harrison, Duncan Archer, 1442 Peter Ashwood-Smith, Bob Thomas, S.Manikantan, Adam Sheppard and 1443 Alan Davey. 1445 11. Intellectual Property Consideration 1447 The IETF has been notified of intellectual property rights claimed in 1448 regard to some or all of the specification contained in this 1449 document. For more information, consult the online list of claimed 1450 rights. 1452 12. Full Copyright Statement 1454 Copyright (c) The Internet Society (2000, 2001). All Rights Reserved.
1455 This document and translations of it may be copied and furnished to 1456 others, and derivative works that comment on or otherwise explain it 1457 or assist in its implementation may be prepared, copied, published 1458 and distributed, in whole or in part, without restriction of any 1459 kind, provided that the above copyright notice and this paragraph 1460 are included on all such copies and derivative works. However, this 1461 document itself may not be modified in any way, such as by removing 1462 the copyright notice or references to the Internet Society or other 1463 Internet organizations, except as needed for the purpose of 1464 developing Internet standards in which case the procedures for 1465 copyrights defined in the Internet Standards process must be 1466 followed, or as required to translate it into languages other than 1467 English. 1469 The limited permissions granted above are perpetual and will not be 1470 revoked by the Internet Society or its successors or assigns. 1472 This document and the information contained herein is provided on an 1473 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 1474 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 1475 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 1476 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 1477 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 1479 13. IANA Considerations 1481 This draft requires the use of a number of new TLVs and status codes 1482 from the number spaces within the LDP protocol. This section 1483 explains the logic used by the authors to choose the most appropriate 1484 number space for each new entity, and is intended to assist in the 1485 determination of any final values assigned by IANA or the MPLS WG in 1486 the event that the MPLS WG chooses to advance this draft on the 1487 standards track. 
1489 This section will be removed when the TLV and status code values have 1490 been agreed with IANA. 1492 13.1 FT Session TLV 1494 The FT Session TLV carries attributes that affect the entire LDP 1495 session between LDP peers. It is suggested that the type for this 1496 TLV should be chosen from the 0x05xx range for TLVs that is used in 1497 [4] by other TLVs carrying session-wide attributes. At the time of 1498 this writing, the next available number in this range is 0x0503. 1500 13.2 FT Protection TLV 1502 The FT Protection TLV carries attributes that affect a single label 1503 exchanged between LDP peers. It is suggested that the type for this 1504 TLV should be chosen from the 0x02xx range for TLVs that is used in 1505 [4] by other TLVs carrying label attributes. At the time of this 1506 writing, the next available number in this range is 0x0203. 1508 Consideration was given to using the message number field instead of 1509 a new FT Sequence Number field. However, the authors felt this 1510 placed unacceptable implementation constraints on the use of message 1511 number (e.g. it could no longer be used to reference a control 1512 block). 1514 13.3 FT ACK TLV 1516 The FT ACK TLV may ACK many label operations at once 1517 if cumulative ACKs are used. It is suggested that the type for this 1518 TLV should be chosen from the 0x05xx range for TLVs that is used in 1519 [4] by other TLVs carrying session-wide attributes. At the time of 1520 this writing, the next available number in this range is 0x0504. 1522 Consideration was given to carrying the FT ACK Number in the FT 1523 Protection TLV, but the authors felt this would be inappropriate as 1524 many implementations may wish to carry the ACKs on KeepAlive 1525 messages. 1527 13.4 Status Codes 1529 The authors' current understanding is that MPLS status codes are not 1530 sub-divided into specific ranges for different types of error.
1531 Hence, the numeric status code values assigned for this draft should 1532 simply be the next available values at the time of writing and may be 1533 substituted for other numeric values. 1535 See section "Status Codes" for details of the status codes defined in 1536 this draft. 1538 14. Authors' Addresses 1540 Adrian Farrel (editor) Paul Brittain 1541 Movaz Networks, Inc. Data Connection Ltd. 1542 7926 Jones Branch Drive, Suite 615 Windsor House, Pepper Street, 1543 McLean, VA 22102 Chester, Cheshire 1544 Voice: +1 703-847-1719 CH1 1DF, UK 1545 Email: afarrel@movaz.com Phone: +44-(0)-1244-313440 1546 Fax: +44-(0)-1244-312422 1547 Email: pjb@dataconnection.com 1549 Philip Matthews Eric Gray 1550 Nortel Networks Corp. Sandburst Corporation 1551 P.O. Box 3511 Station C, 600 Federal Street 1552 Ottawa, ON K1Y 4H7 Andover, MA 01810 1553 Canada Phone: +1 978-689-1600 1554 Phone: +1 613-768-3262 eric.gray@sandburst.com 1555 philipma@nortelnetworks.com 1557 15. References 1559 1 Bradner, S., "The Internet Standards Process -- Revision 3", BCP 1560 9, RFC 2026, October 1996. 1562 2 Jamoussi, B., et al., Constraint-Based LSP Setup using LDP, 1563 draft-ietf-mpls-cr-ldp-05.txt, February 2001, (work in progress). 1565 3 Bradner, S., "Key words for use in RFCs to Indicate Requirement 1566 Levels", BCP 14, RFC 2119, March 1997. 1568 4 Andersson, L., et al., LDP Specification, RFC 3036, January 2001. 1570 5 Ash, G., et al., LSP Modification Using CR-LDP, draft-ietf-mpls- 1571 crlsp-modify-03.txt, March 2001 (work in progress). 1573 6 Braden, R., et al., Resource ReSerVation Protocol (RSVP) -- 1574 Version 1, Functional Specification, RFC 2205, September 1997. 1576 7 Berger, L., et al., RSVP Refresh Reduction Extensions, draft- 1577 ietf-rsvp-refresh-reduct-05.txt, June 2000 (work in progress). 1579 8 Swallow, G., et al., Extensions to RSVP for LSP Tunnels, draft- 1580 ietf-mpls-rsvp-lsp-tunnel-08.txt, February 2000 (work in 1581 progress).
1583 9 Ramachandra, S., et al., Graceful Restart Mechanism for BGP, 1584 draft-ietf-idr-restart-00.txt, December 2000 (work in progress). 1586 10 Stewart, R., et al., Stream Control Transmission Protocol, 1587 RFC 2960, October 2000. 1589 11 Moy, J., Hitless OSPF Restart, draft-ietf-ospf-hitless-restart- 1590 00.txt, February 2001 (work in progress).