Network Working Group Toby Smith (Laurel Networks) Internet Draft Andrew G. Malis (Vivace Networks) Expiration Date: April 2002 Jack Shaio (Vivace Networks) Graceful Restart Mechanism for LDP draft-smith-mpls-ldp-restart-00.txt 1. Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.'' The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. 2. Abstract This document proposes a lightweight mechanism that the LDP protocol may use to help minimize the impact introduced by transient interruptions to an LDP session's TCP connection, with a focus on connection preservation for signaled Layer 2 circuits. A new LDP Cork message request/response mechanism is specified. New message types are defined for the delivery of graceful restart events. Finally, procedures for utilizing this mechanism are detailed. 3. Introduction The LDP protocol [1] provides label mapping information to its peer LSRs. In addition to providing label mappings for IP prefixes, LDP has recently been adopted as a signaling mechanism for the establishment of Layer 2 circuits between two provider edge LSRs [2, 3]. Customers expect these circuits, like the physical circuits they emulate, to be highly available connections. Under some circumstances (planned outages, software upgrades), LDP may temporarily lose connectivity to its peer(s). In these circumstances, it is beneficial to the customer to maintain the LDP-established LSPs even in the (temporary) absence of an LDP session. This draft describes a proposal for a lightweight mechanism which allows LDP LSRs to retain their forwarding state, even when the connection to the peer LSR is temporarily lost. The procedure described in this draft has excellent scaling properties: the LDP state is preserved incrementally, such that after an unexpected restart of an LDP session, only the LDP activity not already acknowledged during the previous session needs to be resignaled. In the case of provisioned Layer 2 circuits, it is probable that no resignaling will be necessary. The procedure described in this draft is minimally invasive to the LDP state machine and requires no changes to the LDP message processing procedures. This mechanism may be used in conjunction with a mechanism for the preservation of IP forwarding state; when LDP is being used solely as a signaling mechanism for the establishment of Layer 2 transports, however, such coordination is not required. The remainder of this document is organized as follows: A new LDP Cork message request/response mechanism is specified. New message types are defined for the delivery of graceful restart events. Finally, procedures for utilizing this mechanism are detailed. 4. Overview of Graceful Restart Mechanism LDP LSRs which support this graceful restart mechanism signal this capability with an additional Graceful Restart TLV sent as part of the session's Initialization messages. During normal session operation, each peer periodically issues a Cork message, defined below, which checkpoints the current label advertisement state between the peers. Each cork message is acknowledged by the far end. If an LDP peer is able to recognize that it needs to temporarily drop its connection to its peer, this LSR (termed the Originating Peer) will send a special, final Cork message to each of its peer LSRs (termed the Receiving Peer(s)). When the Receiving Peer receives a final Cork message, it responds with a corresponding final Cork message to the Originating Peer. Upon receiving the final Cork message response from each Receiving Peer, the Originating Peer may sever its TCP connection(s). All forwarding state corresponding to the cached state of the LDP protocol is preserved over the loss of connectivity with the LDP peer. Once the Originating Peer's LDP state is able to be re-established, it reconnects to each of its Receiving Peers, following the standard procedures for establishing TCP connections as specified in [1]. When the TCP session to the Receiving Peer(s) has been re-established, the LSRs exchange Graceful Restart TLVs as part of their Initialization messages. This TLV contains that checkpoint information corresponding to the last exchanged Cork messages, which allows the LSRs to resume operation without readvertising any checkpointed label mapping information. The details of the steps outlined in this section may be found in the Procedures section, below. 5. Message Formats This section describes the new LDP message and TLV formats used by this document. 5.1 Cork Message The LDP Cork message is sent periodically by each participating LSR. The Cork message may be used to checkpoint currently sent information, to acknowledge the reception of a previously received Cork message, or both. The rate at which periodic Cork messages are sent is locally determined by each participating LSR, and is implementation dependent. For example, cork messages may be sent at regular intervals, or after a threshold of sent LDP messages has been exceeded. Cork updates are not necessary if the state of the LSR has not changed since the time the last Cork message was sent. Cork messages with the Final Bit set are used to flush all currently pending label mapping and nexthop messages to the peer LSR, in anticipation of dropping the connection to the peer. The encoding for the Cork Message is: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |0| Cork (0x3F00) | Message Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Message ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Acknowledged Message ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |F|C|A| Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Message ID 32-bit value used to identify this message. Acknowledged Message ID A 32-bit value used to acknowledge the reception of a prior Cork message from the sender. The receiver replies with a Cork message of its own, with this field set to the Message ID of the Cork message it is acknowledging. If the Acknowledgement Bit is not set (see below), this field MUST be ignored. Final Bit A single bit denoting whether this message is the final checkpointing Cork message that the receiver should expect to receive from the sender. Checkpoint Bit A single bit denoting that this Cork message is being used by the sender to checkpoint its currently sent label and address information. An LSR which receives a Cork message with the Checkpoint Bit set MUST acknowledge the reception of this message with a corresponding Cork message with the Acknowledgement Bit set (see below). Cork messages with the Checkpoint Bit set MUST contain a non-zero Message ID. Acknowledgement Bit A single bit denoting that this Cork message is being used by the sender to acknowledge the reception of a previously received Cork message. When the Acknowledgement Bit is set, the Acknowledged Message ID field MUST be set to the Message ID of the Cork message being acknowledged. A single Cork message may have both the Checkpoint and Acknowledgement Bits set, allowing a single message to both checkpoint recently sent information, as well as acknowledge recently received Cork messages. Reserved These 13 bits MUST be filled with zeroes. 5.2 Graceful Restart TLV The Graceful Restart TLV is contained within both the Originating and Receiving Peers' Initialization messages to denote their participation in the graceful restart protocol. The encoding for the Graceful Restart TLV is: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |0|0| Graceful Restart (0x3F00) | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Acknowledged Message ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Restart Timeout | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Acknowledged Message ID If the LSR is establishing a connection to a peer for the first time, this field MUST be set to zero. If an LSR is re-establishing a session with a remote peer with which it had previously exchanged Cork messages, and if the local LSR's Restart Timeout time has not expired, this value MUST contain the Message ID of the last successfully acknowledged Cork message received from the remote peer. If the Restart Timeout time has expired, this value MUST be reset to zero. Restart Timeout 32-bit unsigned non-zero integer that indicates the number of seconds that the sending LSR is willing to wait for re-establishment of the TCP connection between the peers after a restart has begun. This timer is started when the current TCP connection is terminated. The Restart Timeout MUST be calculated by using the smaller of the values sent in the Graceful Restart TLV to the peer LSR and the Restart Timeout value in the Graceful Restart TLV received from the peer LSR. 6. Procedures This section describes in detail the procedures which must be implemented by participating LSRs. An LSR which is capable of participating in this mechanism includes a Graceful Restart TLV in the Initialization message it sends to its remote peer. If the Initialization message received from the remote peer does not contain a Graceful Restart TLV, or if the value contained in the Acknowledged Message ID field is not the value expected from that peer, then the graceful restart mechanism MUST NOT be employed, and no Cork messages may be sent to the remote peer. In this case, if the local LSR has cached any state from a prior session to this peer, that cached state MUST be immediately discarded. For two LSRs which have successfully exchanged Graceful Restart TLVs, the Restart Timeout value used by both LSRs is calculated to be the lesser of the values exchanged by the peers. If this is the first time that the two LSRs have peered, or if the Restart Timeout time from a previous session has expired, the peering LSRs MUST include a value of zero in the Acknowledged Message ID field. When the exchanged Acknowledged Message ID values are non-zero, and neither LSR's Restart Timeout time has expired, both peers MUST resume operation of the LDP session as if all checkpointed sent and received information is still active. Upon returning to such a state, the first message sent by each LSR to its peer MUST be a Cork message with the Acknowledgement Bit set, and the Acknowledged Message ID set to the value contained in the LSR's Graceful Restart TLV Acknowledged Message ID field. If the LSR is unable to restore its state for any reason, it MUST immediately send a Cork message with the Acknowledgement Bit set and containing an Acknowledged Message ID value of zero. In either case, after exchanging Initialization messages with non-zero Acknowledged Message ID values, the first messages exchanged between the peers MUST be Cork messages. If an LSR which is re-establishing cached state after a restart receives an initial Cork message which does not match the value contained in the peer's Graceful Restart TLV, the receiving LSR MUST immediately discard any cached state, as the graceful restart has failed on the peer LSR. After successfully negotiating the use of the graceful restart mechanism, and restoring cached state (if recovering from a prior restart), the peering LSRs resume normal LDP operation. Each LSR periodically checkpoints the label mapping and nexthop information that it has sent to its peer and issues an unsolicited Cork message with the Checkpoint Bit set to its peer. The sending LSR MUST NOT cache the current state of the sent session information until the remote peer acknowledges the receipt of the current Cork message. If the local LSR knows a priori that it is about to restart, it may issue a Cork message with the Final Bit set. After sending a Cork message with the Final bit set, the sending LSR MUST NOT send any further Label Mapping, Label Withdraw, Address, or Address Withdraw messages to the receiving peer. An LSR which receives a Cork message from its peer with the Checkpoint Bit set MUST acknowledge the receipt of this message by responding to the sending peer with a Cork message with the Acknowledgement Bit set. The receiving LSR MUST cache all received session information from the remote peer before acknowledging the reception of a Checkpoint Cork message. If the received Cork message's Final bit is set, the receiving peer immediately sends any pending Label Mapping, Label Withdraw, Address, and Address Withdraw messages to the sending peer, followed by a Cork message with the Final bit set in response. This Cork message may also serve to acknowledge receipt of the sending peer's Final Cork message. After sending the Cork message, the receiving peer MUST not send any more Label Mapping, Label Withdraw, Address, or Address Withdraw messages to the sending peer. An LSR which is expecting to be restarted initiates the graceful restart by sending a Cork message with the Final bit set to its peer. This LSR may restart upon receiving both a corresponding Final Cork message from its peer, and upon receiving a Acknowledgement Cork message from its peer. These two messages may be consolidated into a single message with the Final, Checkpoint and Acknowledgement Bits set. LSRs participating in this graceful restart mechanism do not expect to see a fatal Notification message from their remote peer before restarting. If an LSR sends a fatal Notification message to its remote peer, or receives a fatal Notification from its remote peer, the LSR MUST discard any cached LDP state immediately. 7. Operational Considerations This document describes a mechanism for the graceful re-establishment of LDP sessions, with a focus on providing a simple signaling recovery mechanism for Layer 2 transport LSPs. Given that the establishment of IP LSPs via LDP relies upon the existence of an underlying IGP to determine the network topology, a complete graceful restart mechanism requires a degree of coordination between LDP and its underlying IGP when restarting. This document does not address ways in which the IGP state may be preserved during a graceful restart. 8. Security Considerations Given that this document describes a mechanism for preserving LDP session state during periods of lost connectivity, there may be concern that this proposal introduces new security risks. However, since the re-establishment of the LDP session is based upon the same mechanisms described in [1], and since the cached LDP session state is only eligible for use if an LDP session is re-established to a peer which had previously been peering with the LSR, the authors believe that this proposal does not impact the underlying security model of LDP. 9. References [1] "LDP Specification", L. Andersson, P. Doolan, N. Feldman, A. Fredette, B. Thomas. RFC3036 [2] "Transport of Layer 2 Frames Over MPLS", draft-martini- l2circuit-trans-mpls-08.txt. ( work in progress ) [3] "MPLS-based Layer 2 VPNs", Kompella, et. al., draft- kompella-mpls-l2vpn-02.txt. ( work in progress ) 10. Author Information Toby Smith Laurel Networks, Inc. 1300 Omega Drive Pittsburgh, PA 15205 Email: tob@laurelnetworks.com Andrew G. Malis Vivace Networks, Inc. 2730 Orchard Parkway San Jose, CA 95134 Phone: +1 408 383 7223 Email: Andy.Malis@vivacenetworks.com Jack Shaio Vivace Networks, Inc. 2730 Orchard Parkway San Jose, CA 95134 Phone: +1 408 432 7623 Email: Jack.Shaio@vivacenetworks.com