draft-ietf-mpls-soft-preemption-01.txt                    October, 2003

Matthew R. Meyer
Global Crossing

Denver Maddux
Nitrous.net

Jean-Philippe Vasseur
Cisco Systems, Inc.

Curtis Villamizar
Avici Systems

Amir Birjandi
MCI

IETF Internet Draft
Expires: April, 2004
October, 2003

MPLS Traffic Engineering Soft Preemption

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This draft documents MPLS TE Soft Preemption, a suite of protocol
modifications extending the concept of preemption with the goal of
reducing or eliminating traffic disruption of preempted TE LSPs.
Initially, MPLS RSVP-TE was defined supporting only immediate LSP
displacement upon preemption. The use of a preemption pending flag
allows preempted LSPs to be re-routed more gracefully. For the brief
period during which soft preemption is active, reservations (though
not necessarily traffic levels) are in effect over-provisioned until
the LSP can be re-routed. For this reason, the feature is primarily,
but not exclusively, of interest in packet-oriented MPLS networks
with Diff-Serv and TE capabilities.

1. Terminology

CSPF - Constraint-based Shortest Path First.
Hard Preemption - Process whereby one LSP is intrusively displaced by
   a better priority LSP.
LER - Label Edge Router
LSR - Label Switch Router
LSP - An MPLS Label Switched Path
Make Before Break - Technique used to non-intrusively alter the path
   of an LSP. The ingress LER first signals the new path, sharing the
   bandwidth with the primary LSP (to avoid double booking), then
   switches forwarding over to the new path. Finally, the old path
   state is torn down.
Preemption Pending flag - This flag is set on an IPv4 or IPv6 RSVP
   Resv RRO sub-object to signal to the TE LSP ingress LER that the
   TE LSP is about to be preempted and must be re-signaled (in a
   non-disruptive fashion, with make before break) along another
   path. The flag can be set in a Path RRO as well.
Soft Preemption Desired flag - This flag is set in the
   SESSION-ATTRIBUTE object of the RSVP Path message to indicate to
   LSRs along the path that, should the LSP need to be preempted,
   soft preemption should be used if supported.
TE LSP - Traffic Engineering Label Switched Path

2. Motivations

Initially, MPLS RSVP-TE was defined supporting only a method of TE
LSP preemption which immediately tore down TE LSPs, disregarding the
preempted in-transit traffic. This simple but abrupt process nearly
guarantees that preempted traffic will be discarded, if only briefly,
until the RSVP Path Error message reaches and is processed by the
ingress LER and a new forwarding path can be established. In cases of
actual resource contention this might be helpful; however, preemption
is triggered by mere reservation contention, and reservations may not
reflect the current state of forwarding plane contention. The result
is that traffic is often needlessly discarded.
Intrusive or hard preemption may be a requirement to protect traffic
in a network without Diff-Serv, but in a Diff-Serv enabled
architecture one need not rely exclusively upon preemption to enforce
a preference for the most valued traffic, since the marking and
queuing disciplines should already be aligned for those purposes.
Moreover, even in non Diff-Serv aware networks, depending on the TE
LSP sizing rules, reservation contention may not accurately reflect
forwarding plane congestion.

3. Introduction

In an MPLS RSVP-TE enabled network, hard preemption provides no
mechanism to allow preempted TE LSPs to be handled in a
make-before-break fashion: the hard preemption scheme instead uses a
very intrusive method that can cause traffic disruption for a
potentially large number of TE LSPs. The consequences of disruptive
preemption make periodic automated mechanisms like TE LSP dynamic
resizing less palatable when high network stability is sought.

This draft proposes the use of additional signaling and accounting
mechanisms to alert the ingress LER of the pending preemption and to
allow for temporary over-provisioning while the preempted tunnel is
re-routed in a non-disruptive fashion (make-before-break) by the
ingress LER. During the period that the tunnel is being re-routed,
link capacity is over-provisioned on links where soft preemption has
occurred. Optionally, the egress LER may be signaled as well to deal
more efficiently with any simultaneous soft preemptions.

4. RSVP extensions

4.1. SESSION-ATTRIBUTE Flags

To explicitly signal the desire for a TE LSP to benefit from the soft
preemption mechanism (and so not to be hard preempted), the following
flag of the SESSION-ATTRIBUTE object (for both C-Type 1 and C-Type 7)
is defined:

Soft preemption desired: 0x40

4.2.
RRO IPv4/IPv6 Sub-Object Flags

To report that a soft preemption is pending for an LSP, a flag is
defined in the IPv4/IPv6 sub-object carried in the RRO object defined
in RFC 3209. This flag is called the preemption pending (PPend) flag.

A compliant LSR MUST support the RRO object, as defined in RFC 3209.

RRO IPv4 and IPv6 sub-object address

These two sub-objects currently have the following flags defined in
RFC 3209 and [FAST-REROUTE]:

Local protection available: 0x01

Indicates that the link downstream of this node is protected via a
local repair mechanism, which can be either one-to-one or facility
backup.

Local protection in use: 0x02

Indicates that a local repair mechanism is in use to maintain this
tunnel (usually in the face of an outage of the link it was
previously routed over, or an outage of the neighboring node).

Bandwidth protection: 0x04

The PLR will set this when the protected LSP has a backup path which
is guaranteed to provide the desired bandwidth specified in the
FAST_REROUTE object or the bandwidth of the protected LSP, if no
FAST_REROUTE object was included. The PLR may set this whenever the
desired bandwidth is guaranteed; the PLR MUST set this flag when the
desired bandwidth is guaranteed and the "bandwidth protection
desired" flag was set in the SESSION_ATTRIBUTE object. If the
requested bandwidth is not guaranteed, the PLR MUST NOT set this
flag.

Node protection: 0x08

The PLR will set this when the protected LSP has a backup path which
provides protection against a failure of the next LSR along the
protected LSP. The PLR may set this whenever node protection is
provided by the protected LSP's backup path; the PLR MUST set this
flag when node protection is provided and the "node protection
desired" flag was set in the SESSION_ATTRIBUTE object. If node
protection is not provided, the PLR MUST NOT set this flag.
Thus, if a PLR could only set up a link-protection backup path, the
"Local protection available" bit will be set but the "Node
protection" bit will be cleared.

Soft preemption makes use of the Preemption pending flag defined
here:

Preemption pending: 0x10

The preempting node sets this flag if a preemption is pending for the
TE LSP. This indicates to the ingress LER of this LSP that it SHOULD
be re-routed.

4.3. Use of the RRO IPv4/IPv6 Sub-Object in Path message

An LSR MAY use the Preemption pending flag in the IPv4/IPv6 RRO
sub-object carried in the RRO of a Path message to simultaneously
alert downstream LSRs that the LSP was soft preempted upstream. This
information could be used by the downstream LSR to bias future soft
preemption candidates toward LSPs already soft preempted elsewhere in
their path.

5. Mode of operation

   R0-----1G--R1--155--R2      LSP1:        LSP2:
              | \      |
              |  \    155      R0-->R1      R1<--R2
              |   \    |              \     |
             155   1G  R3              V    V
              |     \  |               R5   R4
              |      \ 155
              |       \|
              R4---1G--R5

                     Fig 1.

In the network depicted above in figure 1, consider the following
conditions:

- Reservable BW on R0-R1, R1-R5 and R4-R5 is 1 Gb/sec.
- Reservable BW on R1-R2, R1-R4, R2-R3, R3-R5 is 155 Mb/sec.
  Bandwidths and costs are identical in both directions.
- Each circuit has an IGP metric of 10, and the IGP metric is used by
  CSPF.
- Two TE tunnels are defined:
   - LSP1: 155 Mb/sec, setup/hold priority 0, tunnel path R0-R1-R5.
   - LSP2: 155 Mb/sec, setup/hold priority 7, tunnel path R2-R1-R4.
  Both TE LSPs are signaled with the Soft preemption desired bit of
  their SESSION-ATTRIBUTE object set.
- Circuit R1-R5 fails.
- Soft Preemption is functional.

When the circuit R1-R5 fails, R1 detects the failure and sends an
updated IGP LSA/LSP and a Path Error message to all the ingress LERs
having a TE LSP traversing the failed link (R0 in the example above).
Either form of notification may arrive at the ingress LERs first.

Upon receiving the link failure notification, ingress LER R0 triggers
a TE LSP re-route of LSP1 and re-signals LSP1 along the shortest
available path satisfying the TE LSP constraints: the R0-R1-R4-R5
path. The Resv messages for LSP1 travel in the upstream direction
(from the destination to the ingress LER -- R5 to R0 in this
example). LSP2 is soft preempted at R1, as it has a numerically
higher priority value and both bandwidth reservations cannot be
satisfied on the R1-R4 link.

Instead of sending a Path Tear for LSP2 upon preemption, as with hard
preemption (which would result in an immediate traffic disruption for
LSP2), R1's local BW accounting for LSP2 is zeroed and a Resv message
with the Preemption pending flag set in its RRO is issued for LSP2
upstream toward the ingress LER, R2. Optionally, R1 MAY
simultaneously send a Path message with the Preemption pending flag
set in its RRO, notifying downstream LSRs of LSP2's soft preemption.

If more than one soft preempted LSP has the same ingress LER (egress
LER), these soft preemption Resv (Path) messages MAY be bundled
together (see RFC 2961). The preempting node MUST immediately send a
Resv message with the Preemption pending RRO flag set for each soft
preempted TE LSP. The node MAY use the occurrence of soft preemption
to trigger an immediate IGP update or to influence the scheduling of
an IGP update.

Should a refresh event for LSP2 arrive before LSP2 is re-routed, soft
preempting nodes such as R1 MUST continue to refresh the LSP. Resv
messages with the RRO Preemption pending flag set SHOULD be sent in
reliable mode (RFC 2961).

Upon reception of the Resv with the Preemption pending flag set, the
ingress LER (of LSP2 in this case) MAY update the working copy of the
TE-DB before running CSPF for the new LSP.
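
The midpoint behavior described above for R1 can be sketched in a few
lines of Python. This is only an illustrative model under stated
assumptions, not an implementation of the draft: the class names
(Lsp, Link), the admit() method and the "worst priority first" victim
selection are hypothetical (the draft leaves victim selection to
local policy), and no RSVP messages, refreshes or expiration timers
are modeled. Only the two flag values are taken from this document.

```python
from dataclasses import dataclass, field

# Flag values as given in sections 4.1 and 4.2 of this draft.
SOFT_PREEMPTION_DESIRED = 0x40  # SESSION-ATTRIBUTE flag
PREEMPTION_PENDING = 0x10       # RRO IPv4/IPv6 sub-object flag


@dataclass
class Lsp:
    """Hypothetical per-LSP state kept by a midpoint LSR."""
    name: str
    bandwidth: int          # Mb/sec
    priority: int           # setup/hold priority, 0 is best, 7 is worst
    session_flags: int = 0  # SESSION-ATTRIBUTE flags from the Path message
    rro_flags: int = 0      # flags this node reports in its Resv RRO


@dataclass
class Link:
    """Hypothetical bandwidth accounting for one outgoing link."""
    reservable: int                                  # Mb/sec
    lsps: list = field(default_factory=list)         # counted reservations
    soft_preempted: list = field(default_factory=list)

    def reserved(self):
        # Soft preempted LSPs are excluded: their accounting is zeroed.
        return sum(lsp.bandwidth for lsp in self.lsps)

    def admit(self, new):
        """Admit `new`, preempting worse-priority LSPs when needed.

        Returns (soft, hard): the LSPs that were soft and hard preempted.
        """
        needed = self.reserved() + new.bandwidth - self.reservable
        # Assumed policy: consider numerically higher priorities, worst first.
        candidates = sorted(
            (lsp for lsp in self.lsps if lsp.priority > new.priority),
            key=lambda lsp: lsp.priority,
            reverse=True,
        )
        chosen, freed = [], 0
        for lsp in candidates:
            if freed >= needed:
                break
            chosen.append(lsp)
            freed += lsp.bandwidth
        if freed < needed:
            raise ValueError("not enough preemptable bandwidth")

        soft, hard = [], []
        for lsp in chosen:
            self.lsps.remove(lsp)  # zero the local BW accounting
            if lsp.session_flags & SOFT_PREEMPTION_DESIRED:
                # Keep forwarding state; flag the Resv RRO sent upstream.
                lsp.rro_flags |= PREEMPTION_PENDING
                self.soft_preempted.append(lsp)
                soft.append(lsp)
            else:
                # Hard preemption: state would be removed immediately.
                hard.append(lsp)
        self.lsps.append(new)
        return soft, hard


# The R1-R4 link from figure 1: LSP2 holds it, then LSP1 is re-routed over it.
r1_r4 = Link(reservable=155)
lsp2 = Lsp("LSP2", 155, 7, SOFT_PREEMPTION_DESIRED)
r1_r4.lsps.append(lsp2)
lsp1 = Lsp("LSP1", 155, 0, SOFT_PREEMPTION_DESIRED)
soft, hard = r1_r4.admit(lsp1)
```

In this model the link's accounted reservation stays within the
reservable bandwidth while the soft preempted LSP2 continues to be
forwarded, which is the temporary over-provisioning the text
describes.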
In the case that Diff-Serv [DIFF-MPLS] and TE [RSVP-TE] are deployed
(as opposed to Diff-Serv-aware TE [DS-TE]), receiving preemption
pending may imply to an ingress LER that the available bandwidth for
the affected priority level and greater has been exhausted for the
indicated node interface. An ingress LER MAY choose to reduce or zero
available BW for the implied priority range until more accurate
information is available (i.e. a new IGP TE update is received).

In the case that reservation availability is restored at the point of
preemption (R1), the point of preemption MAY issue a Resv message
with the Preemption pending flag unset to signal restoration to the
ingress LER. This implies that an ingress LER might have delayed or
been unsuccessful in re-signaling.

After the ingress LER has successfully established a new LSP, the old
path MUST be torn down.

As a result of soft preemption, no traffic will be needlessly
black-holed due to mere reservation contention. If loss is to occur,
it will be due only to an actual traffic congestion scenario and
according to the operator's Diff-Serv (if Diff-Serv is deployed) and
queuing scheme.

6. Selection of the preempted TE LSP at a preempting mid-point

When a numerically lower priority TE LSP is signaled that requires
the preemption of a set of numerically higher priority LSPs, the node
where preemption is to occur has to decide on the set of TE LSPs that
are candidates for preemption. This decision is a local decision and
various algorithms can be used, depending on the objective. See
[PREEMPT-EXP].

As already mentioned, soft preemption causes a temporary link
over-provisioning condition while the soft preempted TE LSPs are
re-routed by their respective ingress LERs.
In order to reduce this over-provisioning exposure, a preempting LSR
MAY limit the number of soft preemptable TE LSPs to the subset of TE
LSPs that have explicitly requested soft preemption via signaling,
by setting the Soft preemption desired bit in the SESSION-ATTRIBUTE
of their RSVP Path messages. This way, the preempting LSR could apply
hard preemption to the remaining TE LSPs that have not explicitly
requested soft preemption, sending a Path Error message to their
ingress LER and immediately removing the corresponding local states.
This would help reduce the temporarily elevated over-provisioning
ratio on the links where soft preemption occurs.

Optionally, a midpoint LSR upstream or downstream from a soft
preempting node MAY choose to cache the LSP's soft preempted state.
In the event a local preemption is needed, the relevant priority
level LSPs from the cache are soft preempted first, followed by the
normal soft and hard preemption selection process for the given
priority.

7. Interoperability

Backward compatibility should be assured as long as the
implementation follows the recommendation set forth in RFC 3209.

"The presence of an unrecognized subobject which is not encountered
in a node's ERO processing SHOULD be ignored. It is passed forward
along with the rest of the remaining ERO stack."

An LSR without soft preemption capabilities that follows the
aforementioned recommendation will simply ignore the RRO Preemption
pending flag and treat the Resv message as a regular Resv refresh
message. As a consequence, the soft preempted TE LSP will not be
re-routed with make before break by the ingress LER.

To guard against a situation where bandwidth over-provisioning would
last forever, a local timer (soft preemption expiration timer) MUST
be started on the preempting node upon soft preemption.
When this timer expires, the soft preempted TE LSP will be torn down
and the preempting node SHOULD send a Path Error. This timer MAY be
configurable.

Optionally, an implementation MAY choose to hard preempt TE LSPs for
which the Soft preemption desired bit has not been set.

The current hard preemption scheme can be emulated with a soft
preemption expiration timer set to zero.

8. Management

Both the point of preemption and the ingress LER SHOULD provide some
form of accounting, internally and to the user, of which TE LSPs are
soft preempted and how much capacity is over-provisioned due to soft
preemption.

9. Security Considerations

The practice described in this draft does not raise specific security
issues beyond those of existing TE.

10. Acknowledgment

The authors would like to thank Carol Iturralde and Dave Cooper for
their valuable comments.

11. Intellectual Property

The contributor represents that he has disclosed the existence of any
proprietary or intellectual property rights in the contribution that
are reasonably and personally known to the contributor. The
contributor does not represent that he personally knows of all
potentially pertinent proprietary and intellectual property rights
owned or claimed by the organization he represents (if any) or third
parties.

References

[TE-REQ] Awduche et al, "Requirements for Traffic Engineering over
   MPLS", RFC 2702, September 1999.
[OSPF-TE] Katz, Yeung, "Traffic Engineering Extensions to OSPF",
   draft-katz-yeung-ospf-traffic-09.txt, October 2002.
[ISIS-TE] Smit, Li, "IS-IS extensions for Traffic Engineering",
   draft-ietf-isis-traffic-04.txt, December 2002.
[RSVP-TE] Awduche et al, "RSVP-TE: Extensions to RSVP for LSP
   Tunnels", RFC 3209, December 2001.
[DS-TE] Le Faucheur et al, "Requirements for support of
   Diff-Serv-aware MPLS Traffic Engineering", RFC 3564, July 2003.
[DS-TE-PROT] Le Faucheur et al, "Protocol extensions for support of
   Diff-Serv-aware MPLS Traffic Engineering",
   draft-ietf-tewg-diff-te-proto-05.txt, September 2003.
[FAST-REROUTE] Pan, P. et al., "Fast Reroute Extensions to RSVP-TE
   for LSP Tunnels", Internet Draft,
   draft-ietf-mpls-rsvp-lsp-fastreroute-03.txt, December, 2003.
[REFRESH-REDUCTION] Berger et al, "RSVP Refresh Overhead Reduction
   Extensions", RFC 2961, April 2001.
[PREEMPT-EXP] de Oliveira, JP. Vasseur, L. Chen and C. Scoglio, "LSP
   Preemption Policies for MPLS Traffic Engineering",
   draft-deoliveira-diff-te-preemption-02.txt, October 2003.
[DIFF-MPLS] Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen,
   P., Krishnan, R., Cheval, P. and J. Heinanen, "Multi-Protocol
   Label Switching (MPLS) Support of Differentiated Services",
   RFC 3270, May 2002.

Matthew R. Meyer
Global Crossing
3133 Indian Valley Tr.
Howell, MI 48855
USA
Email: mrm@gblx.net

Denver Maddux
Nitrous.net
1020 SW 35th St
Corvallis, OR 97333
USA
Email: denver@nitrous.net

Jean Philippe Vasseur
Cisco Systems, Inc.
300 Beaver Brook Road
Boxborough, MA 01719
USA
Email: jpv@cisco.com

Curtis Villamizar
Avici Systems Inc.
USA
Email: curtis@avici.com

Amir Birjandi
MCI
22001 Loudoun County Pkwy
Ashburn, VA 20147
USA