< draft-ietf-mpls-recovery-frmwrk-05.txt   draft-ietf-mpls-recovery-frmwrk-06.txt >
MPLS Working Group Vishal Sharma (Metanoia, Inc.) MPLS Working Group Vishal Sharma (Metanoia, Inc.)
Informational Track Fiffi Hellstrand (Nortel Networks) Informational Track Fiffi Hellstrand (Nortel Networks)
Expires: November 2002 (Editors) Expires: January 2003 (Editors)
May 2002 July 2002
Framework for MPLS-based Recovery Framework for MPLS-based Recovery
<draft-ietf-mpls-recovery-frmwrk-05.txt> <draft-ietf-mpls-recovery-frmwrk-06.txt>
Status of this memo Status of this memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts. groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
skipping to change at page 2, line 26 skipping to change at page 2, line 26
4.4. Scope of Recovery..............................................18 4.4. Scope of Recovery..............................................18
4.4.1 Topology.....................................................18 4.4.1 Topology.....................................................18
4.4.1.1 Local Repair................................................18 4.4.1.1 Local Repair................................................18
4.4.1.2 Global Repair...............................................19 4.4.1.2 Global Repair...............................................19
4.4.1.3 Alternate Egress Repair.....................................19 4.4.1.3 Alternate Egress Repair.....................................19
4.4.1.4 Multi-Layer Repair..........................................20 4.4.1.4 Multi-Layer Repair..........................................20
4.4.1.5 Concatenated Protection Domains.............................20 4.4.1.5 Concatenated Protection Domains.............................20
4.4.2 Path Mapping.................................................20 4.4.2 Path Mapping.................................................20
4.4.3 Bypass Tunnels...............................................21 4.4.3 Bypass Tunnels...............................................21
4.4.4 Recovery Granularity.........................................21 4.4.4 Recovery Granularity.........................................21
4.4.4.1 Selective Traffic Recovery..................................21 4.4.4.1 Selective Traffic Recovery..................................22
4.4.4.2 Bundling....................................................22 4.4.4.2 Bundling....................................................22
4.4.5 Recovery Path Resource Use...................................22 4.4.5 Recovery Path Resource Use...................................22
4.5. Fault Detection................................................22 4.5. Fault Detection................................................22
4.6. Fault Notification.............................................23 4.6. Fault Notification.............................................23
4.7. Switch-Over Operation..........................................24 4.7. Switch-Over Operation..........................................24
4.7.1 Recovery Trigger.............................................24 4.7.1 Recovery Trigger.............................................24
4.7.2 Recovery Action..............................................24 4.7.2 Recovery Action..............................................25
4.8. Post Recovery Operation........................................24 4.8. Post Recovery Operation........................................25
4.8.1 Fixed Protection Counterparts................................25 4.8.1 Fixed Protection Counterparts................................25
4.8.1.1 Revertive Mode..............................................25 4.8.1.1 Revertive Mode..............................................25
4.8.1.2 Non-revertive Mode..........................................25 4.8.1.2 Non-revertive Mode..........................................25
4.8.2 Dynamic Protection Counterparts..............................26 4.8.2 Dynamic Protection Counterparts..............................26
4.8.3 Restoration and Notification.................................26 4.8.3 Restoration and Notification.................................26
4.8.4 Reverting to Preferred Path (or Controlled Rearrangement)....27 4.8.4 Reverting to Preferred Path (or Controlled Rearrangement)....27
4.9. Performance....................................................27 4.9. Performance....................................................27
5. MPLS Recovery Features.........................................27 5. MPLS Recovery Features.........................................28
6. Comparison Criteria............................................28 6. Comparison Criteria............................................28
7. Security Considerations........................................30 7. Security Considerations........................................30
8. Intellectual Property Considerations...........................30 8. Intellectual Property Considerations...........................30
9. Acknowledgements...............................................30 9. Acknowledgements...............................................31
10. Editors' Addresses.............................................31 10. Editors' Addresses.............................................31
11. References.....................................................31 11. References.....................................................31
1. Introduction 1. Introduction
This memo describes a framework for MPLS-based recovery. We provide a This memo describes a framework for MPLS-based recovery. We provide a
detailed taxonomy of recovery terminology, and discuss the motivation detailed taxonomy of recovery terminology, and discuss the motivation
for, the objectives of, and the requirements for MPLS-based recovery. for, the objectives of, and the requirements for MPLS-based recovery.
We outline principles for MPLS-based recovery, and also provide We outline principles for MPLS-based recovery, and also provide
comparison criteria that may serve as a basis for comparing and comparison criteria that may serve as a basis for comparing and
skipping to change at page 14, line 17 skipping to change at page 14, line 17
A path group that requires protection. A path group that requires protection.
Protected Traffic Portion (PTP) Protected Traffic Portion (PTP)
The portion of the traffic on an individual path that requires The portion of the traffic on an individual path that requires
protection. For example, code points in the EXP bits of the shim protection. For example, code points in the EXP bits of the shim
header may identify a protected portion. header may identify a protected portion.
Path Switch LSR (PSL) Path Switch LSR (PSL)
The PSL is responsible for switching or replicating the traffic An LSR that is responsible for switching or replicating the traffic
between the working path and the recovery path. between the working path and the recovery path.
Path Merge LSR (PML) Path Merge LSR (PML)
An LSR that is responsible for receiving the recovery path traffic, An LSR that is responsible for receiving the recovery path traffic,
and either merges the traffic back onto the working path, or, if it and either merging the traffic back onto the working path, or, if it
is itself the destination, passes the traffic on to the higher layer is itself the destination, passing the traffic on to the higher layer
protocols. protocols.
Point of Repair (POR)
An LSR that is setup for performing MPLS recovery. In other words, an
LSR that is responsible for effecting the repair of an LSP. The POR,
for example, can be a PSL or a PML, depending on the type of recovery
scheme employed.
Intermediate LSR Intermediate LSR
An LSR on a working or recovery path that is neither a PSL nor a PML An LSR on a working or recovery path that is neither a PSL nor a PML
for that path. for that path.
Bypass Tunnel Bypass Tunnel
A path that serves to back up a set of working paths using the label A path that serves to back up a set of working paths using the label
stacking approach [1]. The working paths and the bypass tunnel must stacking approach [1]. The working paths and the bypass tunnel must
all share the same path switch LSR (PSL) and the path merge LSR all share the same path switch LSR (PSL) and the path merge LSR
skipping to change at page 16, line 16 skipping to change at page 16, line 22
layer. layer.
Link Degraded (LD) Link Degraded (LD)
A lower layer indication to MPLS-based recovery mechanisms that the A lower layer indication to MPLS-based recovery mechanisms that the
link is performing below an acceptable level. link is performing below an acceptable level.
Fault Indication Signal (FIS) Fault Indication Signal (FIS)
A signal that indicates that a fault along a path has occurred. It is A signal that indicates that a fault along a path has occurred. It is
relayed by each intermediate LSR to its upstream or downstream relayed by each intermediate LSR to its upstream or downstream
neighbor, until it reaches an LSR that is setup to perform MPLS neighbor, until it reaches an LSR that is setup to perform MPLS
recovery. The FIS is transmitted periodically by the node/nodes recovery (the POR). The FIS is transmitted periodically by the
closest to the point of failure, for some configurable length of node/nodes closest to the point of failure, for some configurable
time. length of time.
Fault Recovery Signal (FRS) Fault Recovery Signal (FRS)
A signal that indicates a fault along a working path has been A signal that indicates a fault along a working path has been
repaired. Again, like the FIS, it is relayed by each intermediate LSR repaired. Again, like the FIS, it is relayed by each intermediate LSR
to its upstream or downstream neighbor, until it reaches the LSR that to its upstream or downstream neighbor, until it reaches the LSR that
performs recovery of the original path. The FRS is transmitted performs recovery of the original path. The FRS is transmitted
periodically by the node/nodes closest to the point of failure, for periodically by the node/nodes closest to the point of failure, for
some configurable length of time. some configurable length of time.
3.4. Abbreviations 3.4. Abbreviations
FIS: Fault Indication Signal. FIS: Fault Indication Signal.
FRS: Fault Recovery Signal. FRS: Fault Recovery Signal.
LD: Link Degraded. LD: Link Degraded.
LF: Link Failure. LF: Link Failure.
PD: Path Degraded. PD: Path Degraded.
PF: Path Failure. PF: Path Failure.
PML: Path Merge LSR. PML: Path Merge LSR.
PG: Path Group. PG: Path Group.
PPG: Protected Path Group. POR: Point of Repair
PTP: Protected Traffic Portion. PPG: Protected Path Group.
PSL: Path Switch LSR. PTP: Protected Traffic Portion.
PSL: Path Switch LSR.
4. MPLS-based Recovery Principles 4. MPLS-based Recovery Principles
MPLS-based recovery refers to the ability to effect quick and MPLS-based recovery refers to the ability to effect quick and
complete restoration of traffic affected by a fault in an MPLS- complete restoration of traffic affected by a fault in an MPLS-
enabled network. The fault may be detected on the IP layer or in enabled network. The fault may be detected on the IP layer or in
lower layers over which IP traffic is transported. Fastest MPLS lower layers over which IP traffic is transported. Fastest MPLS
recovery is assumed to be achieved with protection switching and may recovery is assumed to be achieved with protection switching and may
be viewed as the MPLS LSR switch completion time that is comparable be viewed as the MPLS LSR switch completion time that is comparable
to, or equivalent to, the 50 ms switch-over completion time of the to, or equivalent to, the 50 ms switch-over completion time of the
skipping to change at page 19, line 31 skipping to change at page 19, line 40
associated with the working path at that node. Once again, the associated with the working path at that node. Once again, the
traffic on the primary path is switched over to the recovery path at traffic on the primary path is switched over to the recovery path at
the upstream LSR that directly connects to the failed node, and the the upstream LSR that directly connects to the failed node, and the
recovery path shares overlapping portions with the working path. recovery path shares overlapping portions with the working path.
4.4.1.2 Global Repair 4.4.1.2 Global Repair
The intent of global repair is to protect against any link or node The intent of global repair is to protect against any link or node
fault on a path or on a segment of a path, with the obvious exception fault on a path or on a segment of a path, with the obvious exception
of the faults occurring at the ingress node of the protected path of the faults occurring at the ingress node of the protected path
segment. In global repair the PSL is usually distant from the failure segment. In global repair, the POR is usually distant from the
and needs to be notified by a FIS. failure and needs to be notified by a FIS.
In global repair also, end-to-end path recovery/restoration applies. In global repair also, end-to-end path recovery/restoration applies.
In many cases, the recovery path can be made completely link and node In many cases, the recovery path can be made completely link and node
disjoint with its working path. This has the advantage of protecting disjoint with its working path. This has the advantage of protecting
against all link and node fault(s) on the working path (end-to-end against all link and node fault(s) on the working path (end-to-end
path or path segment). path or path segment).
However, it may, in some cases, be slower than local repair since the However, it may, in some cases, be slower than local repair since the
fault notification message must now travel to the PSL to trigger the fault notification message must now travel to the POR to trigger the
recovery action. recovery action.
4.4.1.3 Alternate Egress Repair 4.4.1.3 Alternate Egress Repair
It is possible to restore service without specifically recovering the It is possible to restore service without specifically recovering the
faulted path. faulted path.
For example, for best effort IP service it is possible to select a For example, for best effort IP service it is possible to select a
recovery path that has a different egress point from the working path recovery path that has a different egress point from the working path
(i.e., there is no PML). The recovery path egress must simply be a (i.e., there is no PML). The recovery path egress must simply be a
router that is acceptable for forwarding the FEC carried by the router that is acceptable for forwarding the FEC carried by the
skipping to change at page 23, line 41 skipping to change at page 23, line 50
fault, it may be used by the MPLS recovery mechanism. In some cases, fault, it may be used by the MPLS recovery mechanism. In some cases,
using LD indications may provide faster fault detection than using using LD indications may provide faster fault detection than using
only MPLS-based fault detection mechanisms. only MPLS-based fault detection mechanisms.
4.6. Fault Notification 4.6. Fault Notification
MPLS-based recovery relies on rapid and reliable notification of MPLS-based recovery relies on rapid and reliable notification of
faults. Once a fault is detected, the node that detected the fault faults. Once a fault is detected, the node that detected the fault
must determine if the fault is severe enough to require path must determine if the fault is severe enough to require path
recovery. If the node is not capable of initiating direct action recovery. If the node is not capable of initiating direct action
(e.g. as a PSL) the node should send out a notification of the fault (e.g. as a point of repair, POR) the node should send out a
by transmitting a FIS to those of its upstream LSRs that were sending notification of the fault by transmitting a FIS to the POR. This can
traffic on the working path that is affected by the fault. This take several forms:
notification is relayed hop-by-hop by each subsequent LSR to its
upstream neighbor, until it eventually reaches a PSL. A PSL is the (i) control plane messaging: relayed hop-by-hop along the path of the
only LSR that can terminate the FIS and initiate a protection switch failed LSP until a POR is reached.
of the working path to a recovery path.
(ii) user plane messaging: sent to the PML, which may take corrective
action (as a POR for 1+1) or then communicate with a POR (for 1:n) by
any of several means:
- control plane messaging
- user plane return path (either through a bi-directional LSP
or via other means)
Since the FIS is a control message, it should be transmitted with Since the FIS is a control message, it should be transmitted with
high priority to ensure that it propagates rapidly towards the high priority to ensure that it propagates rapidly towards the
affected PSL(s). Depending on how fault notification is configured in affected POR(s). Depending on how fault notification is configured in
the LSRs of an MPLS domain, the FIS could be sent either as a Layer 2 the LSRs of an MPLS domain, the FIS could be sent either as a Layer 2
or Layer 3 packet [3]. The use of a Layer 2-based notification or Layer 3 packet [3]. The use of a Layer 2-based notification
requires a Layer 2 path direct to the PSL. An example of a FIS could requires a Layer 2 path direct to the POR. An example of a FIS could
be the liveness message sent by a downstream LSR to its upstream be the liveness message sent by a downstream LSR to its upstream
neighbor, with an optional fault notification field set or it can be neighbor, with an optional fault notification field set or it can be
implicitly denoted by a teardown message. Alternatively, it could be implicitly denoted by a teardown message. Alternatively, it could be
a separate fault notification packet. The intermediate LSR should a separate fault notification packet. The intermediate LSR should
identify which of its incoming links (upstream LSRs) to propagate the identify which of its incoming links to propagate the FIS on.
FIS on. In the case of 1+1 protection, the FIS should also be sent
downstream to the PML where the recovery action is taken.
4.7. Switch-Over Operation 4.7. Switch-Over Operation
4.7.1 Recovery Trigger 4.7.1 Recovery Trigger
The activation of an MPLS protection switch following the detection The activation of an MPLS protection switch following the detection
or notification of a fault requires a trigger mechanism at the PSL. or notification of a fault requires a trigger mechanism at the PSL.
MPLS protection switching may be initiated due to automatic inputs or MPLS protection switching may be initiated due to automatic inputs or
external commands. The automatic activation of an MPLS protection external commands. The automatic activation of an MPLS protection
switch results from a response to a defect or fault conditions switch results from a response to a defect or fault conditions
detected at the PSL or to fault notifications received at the PSL. It detected at the PSL or to fault notifications received at the PSL. It
is possible that the fault detection and trigger mechanisms may be is possible that the fault detection and trigger mechanisms may be
combined, as is the case when a PF, PD, LF, or LD is detected at a combined, as is the case when a PF, PD, LF, or LD is detected at a
PSL and triggers a protection switch to the recovery path. In most PSL and triggers a protection switch to the recovery path. In most
cases, however, the detection and trigger mechanisms are distinct, cases, however, the detection and trigger mechanisms are distinct,
involving the detection of fault at some intermediate LSR followed by involving the detection of fault at some intermediate LSR followed by
the propagation of a fault notification back to the PSL via the FIS, the propagation of a fault notification to the POR via the FIS, which
which serves as the protection switch trigger at the PSL. MPLS serves as the protection switch trigger at the POR. MPLS protection
protection switching in response to external commands results when switching in response to external commands results when the operator
the operator initiates a protection switch by a command to a PSL (or initiates a protection switch by a command to a POR (or alternatively
alternatively by a configuration command to an intermediate LSR, by a configuration command to an intermediate LSR, which transmits
which transmits the FIS towards the PSL). the FIS towards the POR).
Note that the PF fault applies to hard failures (fiber cuts, Note that the PF fault applies to hard failures (fiber cuts,
transmitter failures, or LSR fabric failures), as does the LF fault, transmitter failures, or LSR fabric failures), as does the LF fault,
with the difference that the LF is a lower layer impairment that may with the difference that the LF is a lower layer impairment that may
be communicated to - MPLS-based recovery mechanisms. The PD (or LD) be communicated to - MPLS-based recovery mechanisms. The PD (or LD)
fault, on the other hand, applies to soft defects (excessive errors fault, on the other hand, applies to soft defects (excessive errors
due to noise on the link, for instance). The PD (or LD) results in a due to noise on the link, for instance). The PD (or LD) results in a
fault declaration only when the percentage of lost packets exceeds a fault declaration only when the percentage of lost packets exceeds a
given threshold, which is provisioned and may be set based on the given threshold, which is provisioned and may be set based on the
service level agreement(s) in effect between a service provider and a service level agreement(s) in effect between a service provider and a
customer. customer.
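Editorial illustration (not part of either draft version): the threshold rule for soft defects described above can be sketched as a simple check. This is a minimal sketch, assuming the loss percentage is computed from packet counters over some measurement interval; the function name, the counter arguments, and the example threshold are hypothetical, since the draft only states that the threshold is provisioned.

```python
def path_degraded(packets_sent, packets_lost, threshold_pct):
    """Return True when a PD (or LD) condition should be declared a fault.

    A fault is declared only when the percentage of lost packets strictly
    exceeds the provisioned threshold (e.g. one derived from an SLA).
    """
    if packets_sent <= 0:
        # No traffic observed in the interval: nothing to evaluate.
        return False
    loss_pct = 100.0 * packets_lost / packets_sent
    return loss_pct > threshold_pct


if __name__ == "__main__":
    # Assumed example: a 1% loss threshold provisioned for the path.
    print(path_degraded(1000, 5, 1.0))   # 0.5% loss, below the threshold
    print(path_degraded(1000, 20, 1.0))  # 2.0% loss, above the threshold
```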
4.7.2 Recovery Action 4.7.2 Recovery Action
After a fault is detected or FIS is received by the PSL, the recovery After a fault is detected or FIS is received by the POR, the recovery
action involves either a rerouting or protection switching operation. action involves either a rerouting or protection switching operation.
In both scenarios, the next hop label forwarding entry for a recovery In both scenarios, the next hop label forwarding entry for a recovery
path is bound to the working path. path is bound to the working path.
4.8. Post Recovery Operation 4.8. Post Recovery Operation
When traffic is flowing on the recovery path decisions can be made to When traffic is flowing on the recovery path decisions can be made to
whether let the traffic remain on the recovery path and consider it whether let the traffic remain on the recovery path and consider it
as a new working path or do a switch to the old or a new working as a new working path or do a switch to the old or a new working
path. This post recovery operation has two styles, one where the path. This post recovery operation has two styles, one where the
skipping to change at page 30, line 46 skipping to change at page 31, line 9
The IETF has been notified of intellectual property rights claimed in The IETF has been notified of intellectual property rights claimed in
regard to some or all of the specification contained in this regard to some or all of the specification contained in this
document. For more information consult the online list of claimed document. For more information consult the online list of claimed
rights. rights.
9. Acknowledgements 9. Acknowledgements
We would like to thank members of the MPLS WG mailing list for their We would like to thank members of the MPLS WG mailing list for their
suggestions on the earlier versions of this draft. In particular, suggestions on the earlier versions of this draft. In particular,
Bora Akyol, Dave Allan, Neil Harrison, and Dave Danenberg whose Bora Akyol, Dave Allan, Dave Danenberg, Sharam Davari, and Neil
suggestions and comments were very helpful in revising the document. Harrison whose suggestions and comments were very helpful in revising
the document.
The editors would like to give very special thanks to Curtis The editors would like to give very special thanks to Curtis
Villamizar for his careful and extremely thorough reading of the Villamizar for his careful and extremely thorough reading of the
document and for taking the time to provide numerous suggestions, document and for taking the time to provide numerous suggestions,
which were very helpful in the last couple of revisions of the which were very helpful in the last couple of revisions of the
document. document.
10. Editors' Addresses 10. Editors' Addresses
Vishal Sharma Fiffi Hellstrand Vishal Sharma Fiffi Hellstrand
Metanoia, Inc. Nortel Networks Metanoia, Inc. Nortel Networks
305 Elan Village Ln., Unit 121 St Eriksgatan 115 1600 Villa Street, Unit 352 St Eriksgatan 115
San Jose, CA 95134 PO Box 6701 Mountain View, CA 94041-1174 PO Box 6701
Phone: (408) 955-0910 113 85 Stockholm, Sweden Phone: (650) 386-6723 113 85 Stockholm, Sweden
v.sharma@ieee.org Phone: +46 8 5088 3687 v.sharma@ieee.org Phone: +46 8 5088 3687
Fiffi@nortelnetworks.com Fiffi@nortelnetworks.com
11. References 11. References
[1] Rosen, E., Viswanathan, A., and Callon, R., "Multiprotocol Label [1] Rosen, E., Viswanathan, A., and Callon, R., "Multiprotocol Label
Switching Architecture", RFC 3031, January 2001. Switching Architecture", RFC 3031, January 2001.
[2] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., McManus, J., [2] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., McManus, J.,
"Requirements for Traffic Engineering Over MPLS", RFC 2702, "Requirements for Traffic Engineering Over MPLS", RFC 2702,
 End of changes. 22 change blocks. 
52 lines changed or deleted 65 lines changed or added
