< draft-ietf-mpls-recovery-frmwrk-04.txt   draft-ietf-mpls-recovery-frmwrk-05.txt >
MPLS Working Group Vishal Sharma (Metanoia, Inc.) MPLS Working Group Vishal Sharma (Metanoia, Inc.)
Informational Track Fiffi Hellstrand (Nortel Networks) Informational Track Fiffi Hellstrand (Nortel Networks)
Expires: November 2002 Ben Mack-Crane (Tellabs) Expires: November 2002 (Editors)
Srinivas Makam
Ken Owens (Erlang Technology)
Changcheng Huang (Carleton University)
Jon Weil (Nortel Networks)
Loa Andersson (Utfors)
Bilel Jamoussi (Nortel Networks)
Brad Cain (Storigen)
Angela Chiu (Celion Networks)
May 2002 May 2002
Framework for MPLS-based Recovery Framework for MPLS-based Recovery
<draft-ietf-mpls-recovery-frmwrk-04.txt> <draft-ietf-mpls-recovery-frmwrk-05.txt>
Status of this memo Status of this memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts. groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
skipping to change at page 1, line 48 skipping to change at page 1, line 40
Multi-protocol label switching (MPLS) integrates the label swapping Multi-protocol label switching (MPLS) integrates the label swapping
forwarding paradigm with network layer routing. To deliver reliable forwarding paradigm with network layer routing. To deliver reliable
service, MPLS requires a set of procedures to provide protection of service, MPLS requires a set of procedures to provide protection of
the traffic carried on different paths. This requires that the label the traffic carried on different paths. This requires that the label
switched routers (LSRs) support fault detection, fault notification, switched routers (LSRs) support fault detection, fault notification,
and fault recovery mechanisms, and that MPLS signaling support the and fault recovery mechanisms, and that MPLS signaling support the
configuration of recovery. With these objectives in mind, this configuration of recovery. With these objectives in mind, this
document specifies a framework for MPLS based recovery. document specifies a framework for MPLS based recovery.
Table of Contents Table of Contents
1. Introduction....................................................2
1. Introduction....................................................3
1.1. Background......................................................3 1.1. Background......................................................3
1.2. Motivation for MPLS-Based Recovery..............................4 1.2. Motivation for MPLS-Based Recovery..............................3
1.3. Objectives/Goals................................................4 1.3. Objectives/Goals................................................4
2. Overview........................................................6 2. Contributing Authors............................................6
2.1. Recovery Models.................................................6 3. Overview........................................................6
2.1.1 Rerouting.....................................................6 3.1. Recovery Models.................................................7
2.1.2 Protection Switching..........................................7 3.1.1 Rerouting.....................................................7
2.2. The Recovery Cycles.............................................7 3.1.2 Protection Switching..........................................7
2.2.1 MPLS Recovery Cycle Model.....................................7 3.2. The Recovery Cycles.............................................8
2.2.2 MPLS Reversion Cycle Model....................................9 3.2.1 MPLS Recovery Cycle Model.....................................8
2.2.3 Dynamic Re-routing Cycle Model...............................10 3.2.2 MPLS Reversion Cycle Model....................................9
2.3. Definitions and Terminology....................................12 3.2.3 Dynamic Re-routing Cycle Model...............................11
2.3.1 General Recovery Terminology.................................12 3.3. Definitions and Terminology....................................12
2.3.2 Failure Terminology..........................................15 3.3.1 General Recovery Terminology.................................13
2.4. Abbreviations..................................................15 3.3.2 Failure Terminology..........................................15
3. MPLS-based Recovery Principles.................................16 3.4. Abbreviations..................................................16
3.1. Configuration of Recovery......................................16 4. MPLS-based Recovery Principles.................................16
3.2. Initiation of Path Setup.......................................16 4.1. Configuration of Recovery......................................17
3.3. Initiation of Resource Allocation..............................17 4.2. Initiation of Path Setup.......................................17
3.4. Scope of Recovery..............................................17 4.3. Initiation of Resource Allocation..............................18
3.4.1 Topology.....................................................17 4.4. Scope of Recovery..............................................18
1.1.1.1 Local Repair................................................18 4.4.1 Topology.....................................................18
1.1.1.2 Global Repair...............................................18 4.4.1.1 Local Repair................................................18
1.1.1.3 Alternate Egress Repair.....................................19 4.4.1.2 Global Repair...............................................19
1.1.1.4 Multi-Layer Repair..........................................19 4.4.1.3 Alternate Egress Repair.....................................19
1.1.1.5 Concatenated Protection Domains.............................19 4.4.1.4 Multi-Layer Repair..........................................20
3.4.2 Path Mapping.................................................19 4.4.1.5 Concatenated Protection Domains.............................20
3.4.3 Bypass Tunnels...............................................20 4.4.2 Path Mapping.................................................20
3.4.4 Recovery Granularity.........................................21 4.4.3 Bypass Tunnels...............................................21
1.1.1.6 Selective Traffic Recovery..................................21 4.4.4 Recovery Granularity.........................................21
1.1.1.7 Bundling....................................................21 4.4.4.1 Selective Traffic Recovery..................................21
3.4.5 Recovery Path Resource Use...................................21 4.4.4.2 Bundling....................................................22
3.5. Fault Detection................................................22 4.4.5 Recovery Path Resource Use...................................22
3.6. Fault Notification.............................................22 4.5. Fault Detection................................................22
3.7. Switch-Over Operation..........................................23 4.6. Fault Notification.............................................23
3.7.1 Recovery Trigger.............................................23 4.7. Switch-Over Operation..........................................24
3.7.2 Recovery Action..............................................24 4.7.1 Recovery Trigger.............................................24
3.8. Post Recovery Operation........................................24 4.7.2 Recovery Action..............................................24
3.8.1 Fixed Protection Counterparts................................24 4.8. Post Recovery Operation........................................24
1.1.1.8 Revertive Mode..............................................24 4.8.1 Fixed Protection Counterparts................................25
1.1.1.9 Non-revertive Mode..........................................24 4.8.1.1 Revertive Mode..............................................25
3.8.2 Dynamic Protection Counterparts..............................25 4.8.1.2 Non-revertive Mode..........................................25
3.8.3 Restoration and Notification.................................25 4.8.2 Dynamic Protection Counterparts..............................26
3.8.4 Reverting to Preferred Path (or Controlled Rearrangement)....26 4.8.3 Restoration and Notification.................................26
3.9. Performance....................................................26 4.8.4 Reverting to Preferred Path (or Controlled Rearrangement)....27
4. MPLS Recovery Features.........................................27 4.9. Performance....................................................27
5. Comparison Criteria............................................27 5. MPLS Recovery Features.........................................27
6. Security Considerations........................................29 6. Comparison Criteria............................................28
7. Intellectual Property Considerations...........................29 7. Security Considerations........................................30
8. Acknowledgements...............................................30 8. Intellectual Property Considerations...........................30
9. Authors' Addresses.............................................30 9. Acknowledgements...............................................30
10. References.....................................................31 10. Editors' Addresses.............................................31
11. References.....................................................31
1. Introduction 1. Introduction
This memo describes a framework for MPLS-based recovery. We provide a This memo describes a framework for MPLS-based recovery. We provide a
detailed taxonomy of recovery terminology, and discuss the motivation detailed taxonomy of recovery terminology, and discuss the motivation
for, the objectives of, and the requirements for MPLS-based recovery. for, the objectives of, and the requirements for MPLS-based recovery.
We outline principles for MPLS-based recovery, and also provide We outline principles for MPLS-based recovery, and also provide
comparison criteria that may serve as a basis for comparing and comparison criteria that may serve as a basis for comparing and
evaluating different recovery schemes. evaluating different recovery schemes.
skipping to change at page 6, line 13 skipping to change at page 6, line 5
desired, the recovery path should meet the resource requirements of, desired, the recovery path should meet the resource requirements of,
and achieve the same performance characteristics as, the working and achieve the same performance characteristics as, the working
path. path.
We observe that some of the above are conflicting goals, and real We observe that some of the above are conflicting goals, and real
deployment will often involve engineering compromises based on a deployment will often involve engineering compromises based on a
variety of factors such as cost, end-user application requirements, variety of factors such as cost, end-user application requirements,
network efficiency, and revenue considerations. Thus, these goals are network efficiency, and revenue considerations. Thus, these goals are
subject to tradeoffs based on the above considerations. subject to tradeoffs based on the above considerations.
2. Overview 2. Contributing Authors
This document was the collective work of several individuals over a
period of two and a half years. The text and content of this document
were contributed by the editors and the co-authors listed below. (The
contact information for the editors appears in Section 10, and is not
repeated below.)
Ben Mack-Crane Srinivas Makam
Tellabs Operations, Inc. Eshernet, Inc.
4951 Indiana Avenue 1712 Ada Ct.
Lisle, IL 60532 Naperville, IL 60540
Phone: (630) 512-7255 Phone: (630) 308-3213
Ben.Mack-Crane@tellabs.com Smakam60540@yahoo.com
Ken Owens Changcheng Huang
Erlang Technology, Inc. Carleton University
345 Marshall Ave., Suite 300 Minto Center, Rm. 3082
St. Louis, MO 63119 1125 Colonial By Drive
Phone: (314) 918-1579 Ottawa, Ont. K1S 5B6 Canada
keno@erlangtech.com Phone: (613) 520-2600 x2477
Changcheng.Huang@sce.carleton.ca
Jon Weil Brad Cain
Nortel Networks Storigen Systems
Harlow Laboratories London Road 650 Suffolk Street
Harlow Essex CM17 9NA, UK Lowell, MA 01854
Phone: +44 (0)1279 403935 Phone: (978) 323-4454
jonweil@nortelnetworks.com bcain@storigen.com
Loa Andersson Bilel Jamoussi
Utfors AB Nortel Networks
Råsundavägen 12, Box 525 3 Federal Street, BL3-03
169 29 Solna, Sweden Billerica, MA 01821, USA
Phone: +46 8 5270 5038 Phone: (978) 288-4506
loa.andersson@utfors.se jamoussi@nortelnetworks.com
Angela Chiu Seyhan Civanlar
Celion Networks, Inc. Lemur Networks, Inc.
One Shiela Drive, Suite 2 135 West 20th Street, 5th Floor
Tinton Falls, NJ 07724 New York, NY 10011
Phone: (732) 345-3441 Phone: (212) 367-7676
angela.chiu@celion.com scivanlar@lemurnetworks.com
3. Overview
There are several options for providing protection of traffic. The There are several options for providing protection of traffic. The
most generic requirement is the specification of whether recovery most generic requirement is the specification of whether recovery
should be via Layer 3 (or IP) rerouting or via MPLS protection should be via Layer 3 (or IP) rerouting or via MPLS protection
switching or rerouting actions. switching or rerouting actions.
Generally network operators aim to provide the fastest and the best Generally network operators aim to provide the fastest and the best
protection mechanism that can be provided at a reasonable cost. The protection mechanism that can be provided at a reasonable cost. The
higher the levels of protection, the more the resources consumed. higher the levels of protection, the more the resources consumed.
Therefore it is expected that network operators will offer a spectrum Therefore it is expected that network operators will offer a spectrum
skipping to change at page 6, line 42 skipping to change at page 7, line 27
real-time applications like Voice over IP (VoIP) may be supported real-time applications like Voice over IP (VoIP) may be supported
using link/node protection together with pre-established, pre- using link/node protection together with pre-established, pre-
reserved path protection. Best effort traffic, on the other hand, may reserved path protection. Best effort traffic, on the other hand, may
use path protection that is established on demand or may simply rely use path protection that is established on demand or may simply rely
on IP re-route or higher layer recovery mechanisms. As another on IP re-route or higher layer recovery mechanisms. As another
example of their range of application, MPLS-based recovery strategies example of their range of application, MPLS-based recovery strategies
may be used to protect traffic not originally flowing on label may be used to protect traffic not originally flowing on label
switched paths, such as IP traffic that is normally routed hop-by- switched paths, such as IP traffic that is normally routed hop-by-
hop, as well as traffic forwarded on label switched paths. hop, as well as traffic forwarded on label switched paths.
2.1. Recovery Models 3.1. Recovery Models
There are two basic models for path recovery: rerouting and There are two basic models for path recovery: rerouting and
protection switching. protection switching.
Protection switching and rerouting, as defined below, may be used Protection switching and rerouting, as defined below, may be used
together. For example, protection switching to a recovery path may together. For example, protection switching to a recovery path may
be used for rapid restoration of connectivity while rerouting be used for rapid restoration of connectivity while rerouting
determines a new optimal network configuration, rearranging paths, as determines a new optimal network configuration, rearranging paths, as
needed, at a later time. needed, at a later time.
2.1.1 Rerouting 3.1.1 Rerouting
Recovery by rerouting is defined as establishing new paths or path Recovery by rerouting is defined as establishing new paths or path
segments on demand for restoring traffic after the occurrence of a segments on demand for restoring traffic after the occurrence of a
fault. The new paths may be based upon fault information, network fault. The new paths may be based upon fault information, network
routing policies, pre-defined configurations and network topology routing policies, pre-defined configurations and network topology
information. Thus, upon detecting a fault, paths or path segments to information. Thus, upon detecting a fault, paths or path segments to
bypass the fault are established using signaling. bypass the fault are established using signaling.
Once the network routing algorithms have converged after a fault, it Once the network routing algorithms have converged after a fault, it
may be preferable, in some cases, to reoptimize the network by may be preferable, in some cases, to reoptimize the network by
performing a reroute based on the current state of the network and performing a reroute based on the current state of the network and
network policies. This is discussed further in Section 3.8. network policies. This is discussed further in Section 3.8.
In terms of the principles defined in section 3, reroute recovery In terms of the principles defined in section 3, reroute recovery
employs paths established-on-demand with resources reserved-on- employs paths established-on-demand with resources reserved-on-
demand. demand.
2.1.2 Protection Switching 3.1.2 Protection Switching
Protection switching recovery mechanisms pre-establish a recovery Protection switching recovery mechanisms pre-establish a recovery
path or path segment, based upon network routing policies, the path or path segment, based upon network routing policies, the
restoration requirements of the traffic on the working path, and restoration requirements of the traffic on the working path, and
administrative considerations. The recovery path may or may not be administrative considerations. The recovery path may or may not be
link and node disjoint with the working path. However if the recovery link and node disjoint with the working path. However if the recovery
path shares sources of failure with the working path, the overall path shares sources of failure with the working path, the overall
reliability of the construct is degraded. When a fault is detected, reliability of the construct is degraded. When a fault is detected,
the protected traffic is switched over to the recovery path(s) and the protected traffic is switched over to the recovery path(s) and
restored. restored.
In terms of the principles in section 3, protection switching employs In terms of the principles in section 3, protection switching employs
pre-established recovery paths, and, if resource reservation is pre-established recovery paths, and, if resource reservation is
required on the recovery path, pre-reserved resources. The various required on the recovery path, pre-reserved resources. The various
sub-types of protection switching are detailed in Section 3.4 of this sub-types of protection switching are detailed in Section 4.4 of this
document. document.
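The disjointness condition above can be illustrated with a small check. The following is a minimal sketch, not part of either draft: it assumes a path is represented as an ordered list of LSR identifiers, and the names `links` and `shared_failure_sources` are hypothetical. The endpoints of the recovery path are excluded from the node comparison on the assumption that they lie on the working path by construction.

```python
from typing import FrozenSet, List, Set, Tuple

# Assumption: a path is an ordered list of LSR identifiers, ingress first.
Path = List[str]


def links(path: Path) -> Set[FrozenSet[str]]:
    """Return the undirected links traversed by a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def shared_failure_sources(
    working: Path, recovery: Path
) -> Tuple[Set[str], Set[FrozenSet[str]]]:
    """Return (shared interior nodes, shared links) of the two paths.

    The first and last nodes of the recovery path are skipped in the node
    comparison, since they are assumed to be the points where the recovery
    path leaves and rejoins the working path.
    """
    shared_nodes = set(working) & set(recovery[1:-1])
    shared_links = links(working) & links(recovery)
    return shared_nodes, shared_links


def is_disjoint(working: Path, recovery: Path) -> bool:
    """True when the recovery path is both link and node disjoint."""
    nodes, shared = shared_failure_sources(working, recovery)
    return not nodes and not shared
```

A recovery path for which `is_disjoint` returns False shares at least one source of failure with the working path, which is the case the text describes as degrading overall reliability.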
2.2. The Recovery Cycles 3.2. The Recovery Cycles
There are three defined recovery cycles: the MPLS Recovery Cycle, the There are three defined recovery cycles: the MPLS Recovery Cycle, the
MPLS Reversion Cycle and the Dynamic Re-routing Cycle. The first MPLS Reversion Cycle and the Dynamic Re-routing Cycle. The first
cycle detects a fault and restores traffic onto MPLS-based recovery cycle detects a fault and restores traffic onto MPLS-based recovery
paths. If the recovery path is non-optimal the cycle may be followed paths. If the recovery path is non-optimal the cycle may be followed
by either of the two latter cycles to achieve an optimized network by either of the two latter cycles to achieve an optimized network
again. The reversion cycle applies to explicitly routed traffic that again. The reversion cycle applies to explicitly routed traffic that
does not rely on any dynamic routing protocols to converge. does not rely on any dynamic routing protocols to converge.
The dynamic re-routing cycle applies for traffic that is forwarded The dynamic re-routing cycle applies for traffic that is forwarded
based on hop-by-hop routing. based on hop-by-hop routing.
2.2.1 MPLS Recovery Cycle Model 3.2.1 MPLS Recovery Cycle Model
The MPLS recovery cycle model is illustrated in Figure 1. The MPLS recovery cycle model is illustrated in Figure 1.
Definitions and a key to abbreviations follow. Definitions and a key to abbreviations follow.
--Network Impairment --Network Impairment
| --Fault Detected | --Fault Detected
| | --Start of Notification | | --Start of Notification
| | | -- Start of Recovery Operation | | | -- Start of Recovery Operation
| | | | --Recovery Operation Complete | | | | --Recovery Operation Complete
| | | | | --Path Traffic Restored | | | | | --Path Traffic Restored
skipping to change at page 9, line 4 skipping to change at page 9, line 38
LSR detecting the fault and the time at which the Path Switch LSR LSR detecting the fault and the time at which the Path Switch LSR
(PSL) begins the recovery operation. This is zero if the PSL detects (PSL) begins the recovery operation. This is zero if the PSL detects
the fault itself or infers a fault from such events as an adjacency the fault itself or infers a fault from such events as an adjacency
failure. failure.
Note: If the PSL detects the fault itself, there still may be a Hold- Note: If the PSL detects the fault itself, there still may be a Hold-
Off Time period between detection and the start of the recovery Off Time period between detection and the start of the recovery
operation. operation.
Recovery Operation Time Recovery Operation Time
The time between the first and last recovery actions. This may The time between the first and last recovery actions. This may
include message exchanges between the PSL and PML to coordinate include message exchanges between the PSL and PML to coordinate
recovery actions. recovery actions.
Traffic Restoration Time Traffic Restoration Time
The time between the last recovery action and the time that the The time between the last recovery action and the time that the
traffic (if present) is completely recovered. This interval is traffic (if present) is completely recovered. This interval is
intended to account for the time required for traffic to once again intended to account for the time required for traffic to once again
arrive at the point in the network that experienced disrupted or arrive at the point in the network that experienced disrupted or
degraded service due to the occurrence of the fault (e.g. the PML). degraded service due to the occurrence of the fault (e.g. the PML).
This time may depend on the location of the fault, the recovery This time may depend on the location of the fault, the recovery
mechanism, and the propagation delay along the recovery path. mechanism, and the propagation delay along the recovery path.
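To make the relationship between the events in Figure 1 and the intervals defined above concrete, the following sketch derives each interval as the difference between consecutive event timestamps. It is an illustration under stated assumptions, not part of either draft: the class and function names are hypothetical, timestamps are assumed to come from a common clock, and the first three interval keys are descriptive labels chosen here rather than the drafts' own terms.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RecoveryCycleEvents:
    """Timestamps of the six events in Figure 1 (any consistent unit)."""
    network_impairment: float
    fault_detected: float
    start_of_notification: float
    start_of_recovery_operation: float
    recovery_operation_complete: float
    path_traffic_restored: float


def cycle_intervals(ev: RecoveryCycleEvents) -> Dict[str, float]:
    """Return the interval between each pair of consecutive events."""
    stamps = [
        ev.network_impairment,
        ev.fault_detected,
        ev.start_of_notification,
        ev.start_of_recovery_operation,
        ev.recovery_operation_complete,
        ev.path_traffic_restored,
    ]
    if any(later < earlier for earlier, later in zip(stamps, stamps[1:])):
        raise ValueError("events must be in non-decreasing time order")
    # The first three keys are illustrative labels, not terms from the drafts.
    names = [
        "impairment_to_detection",
        "detection_to_notification",
        "notification_to_recovery_start",
        "recovery_operation_time",
        "traffic_restoration_time",
    ]
    return {
        name: later - earlier
        for name, earlier, later in zip(names, stamps, stamps[1:])
    }


def total_recovery_time(ev: RecoveryCycleEvents) -> float:
    """Time from the network impairment until path traffic is restored."""
    return ev.path_traffic_restored - ev.network_impairment
```

When the PSL detects the fault itself and no hold-off is configured, the second and third intervals are zero, so the total reduces to detection plus recovery operation plus traffic restoration.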
2.2.2 MPLS Reversion Cycle Model 3.2.2 MPLS Reversion Cycle Model
Protection switching, revertive mode, requires the traffic to be Protection switching, revertive mode, requires the traffic to be
switched back to a preferred path when the fault on that path is switched back to a preferred path when the fault on that path is
cleared. The MPLS reversion cycle model is illustrated in Figure 2. cleared. The MPLS reversion cycle model is illustrated in Figure 2.
Note that the cycle shown below comes after the recovery cycle shown Note that the cycle shown below comes after the recovery cycle shown
in Fig. 1. in Fig. 1.
--Network Impairment Repaired --Network Impairment Repaired
| --Fault Cleared | --Fault Cleared
| | --Path Available | | --Path Available
| | | --Start of Reversion Operation | | | --Start of Reversion Operation
skipping to change at page 10, line 48 skipping to change at page 11, line 32
interval is expected to be quite small since both paths are working interval is expected to be quite small since both paths are working
and care may be taken to limit the traffic disruption (e.g., using and care may be taken to limit the traffic disruption (e.g., using
"make before break" techniques and synchronous switch-over). "make before break" techniques and synchronous switch-over).
In practice, the only interesting times in the reversion cycle are In practice, the only interesting times in the reversion cycle are
the Wait-to-Restore Time and the Traffic Restoration Time (or some the Wait-to-Restore Time and the Traffic Restoration Time (or some
other measure of traffic disruption). Given that both paths are other measure of traffic disruption). Given that both paths are
available, there is no need for rapid operation, and a well- available, there is no need for rapid operation, and a well-
controlled switch-back with minimal disruption is desirable. controlled switch-back with minimal disruption is desirable.
2.2.3 Dynamic Re-routing Cycle Model 3.2.3 Dynamic Re-routing Cycle Model
Dynamic rerouting aims to bring the IP network to a stable state Dynamic rerouting aims to bring the IP network to a stable state
after a network impairment has occurred. A re-optimized network is after a network impairment has occurred. A re-optimized network is
achieved after the routing protocols have converged, and the traffic achieved after the routing protocols have converged, and the traffic
is moved from a recovery path to a (possibly) new working path. The is moved from a recovery path to a (possibly) new working path. The
steps involved in this mode are illustrated in Figure 3. steps involved in this mode are illustrated in Figure 3.
Note that the cycle shown below may be overlaid on the recovery cycle Note that the cycle shown below may be overlaid on the recovery cycle
shown in Fig. 1 or the reversion cycle shown in Fig. 2, or both (in shown in Fig. 1 or the reversion cycle shown in Fig. 2, or both (in
the event that both the recovery cycle and the reversion cycle take the event that both the recovery cycle and the reversion cycle take
skipping to change at page 12, line 20 skipping to change at page 12, line 54
V. The PSL switches over the traffic from the working path to the V. The PSL switches over the traffic from the working path to the
recovery path recovery path
VI. The network enters a semi-stable state VI. The network enters a semi-stable state
VII. Dynamic routing protocols converge after the fault, and a new VII. Dynamic routing protocols converge after the fault, and a new
working path is calculated (based, for example, on some of the working path is calculated (based, for example, on some of the
criteria mentioned in Section 2.1.1). criteria mentioned in Section 2.1.1).
VIII. A new working path is established between the PSL and the PML VIII. A new working path is established between the PSL and the PML
(assumption is that PSL and PML have not changed) (assumption is that PSL and PML have not changed)
IX. Traffic is switched over to the new working path. IX. Traffic is switched over to the new working path.
2.3. Definitions and Terminology 3.3. Definitions and Terminology
This document assumes the terminology given in [1], and, in addition, This document assumes the terminology given in [1], and, in addition,
introduces the following new terms. introduces the following new terms.
2.3.1 General Recovery Terminology 3.3.1 General Recovery Terminology
Rerouting Rerouting
A recovery mechanism in which the recovery path or path segments are A recovery mechanism in which the recovery path or path segments are
created dynamically after the detection of a fault on the working created dynamically after the detection of a fault on the working
path. In other words, a recovery mechanism in which the recovery path path. In other words, a recovery mechanism in which the recovery path
is not pre-established. is not pre-established.
Protection Switching Protection Switching
skipping to change at page 15, line 7 skipping to change at page 15, line 43
Path Continuity Test Path Continuity Test
A test that verifies the integrity and continuity of a path or path A test that verifies the integrity and continuity of a path or path
segment. The details of such a test are beyond the scope of this segment. The details of such a test are beyond the scope of this
draft. (This could be accomplished, for example, by transmitting a draft. (This could be accomplished, for example, by transmitting a
control message along the same links and nodes as the data traffic or control message along the same links and nodes as the data traffic or
similarly could be measured by the absence of traffic and by similarly could be measured by the absence of traffic and by
providing feedback.) providing feedback.)
2.3.2 Failure Terminology 3.3.2 Failure Terminology
Path Failure (PF) Path Failure (PF)
Path failure is a fault detected by MPLS-based recovery mechanisms, Path failure is a fault detected by MPLS-based recovery mechanisms,
which is defined as the failure of the liveness message test or a path which is defined as the failure of the liveness message test or a path
continuity test, which indicates that path connectivity is lost. continuity test, which indicates that path connectivity is lost.
Path Degraded (PD) Path Degraded (PD)
Path degraded is a fault detected by MPLS-based recovery mechanisms Path degraded is a fault detected by MPLS-based recovery mechanisms
that indicates that the quality of the path is unacceptable. that indicates that the quality of the path is unacceptable.
skipping to change at page 15, line 43 skipping to change at page 16, line 28
time. time.
Fault Recovery Signal (FRS) Fault Recovery Signal (FRS)
A signal that indicates a fault along a working path has been A signal that indicates a fault along a working path has been
repaired. Again, like the FIS, it is relayed by each intermediate LSR repaired. Again, like the FIS, it is relayed by each intermediate LSR
to its upstream or downstream neighbor, until it reaches the LSR that to its upstream or downstream neighbor, until it reaches the LSR that
performs recovery of the original path. The FRS is transmitted performs recovery of the original path. The FRS is transmitted
periodically by the node/nodes closest to the point of failure, for periodically by the node/nodes closest to the point of failure, for
some configurable length of time. some configurable length of time.
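The hop-by-hop relay described for the FIS and FRS can be sketched as a walk along the working path from the node that originates the signal to the LSR that performs recovery. This is a minimal illustration, not a signaling implementation: the function name `relay_route` and the list representation of the path are assumptions, and periodic retransmission is not modeled.

```python
from typing import List


def relay_route(path: List[str], origin: str, recovery_lsr: str) -> List[str]:
    """Return the LSRs that handle the signal, in order.

    `path` lists the LSRs of the working path from ingress to egress.
    The signal starts at `origin` and is relayed neighbor to neighbor,
    upstream or downstream as needed, until it reaches `recovery_lsr`.
    """
    start = path.index(origin)        # raises ValueError if not on the path
    end = path.index(recovery_lsr)
    step = 1 if end >= start else -1  # downstream (+1) or upstream (-1)
    return [path[k] for k in range(start, end + step, step)]
```

For a working path A-B-C-D-E with a fault detected at D and recovery performed at A, the signal is relayed upstream through C and B before reaching A.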
2.4. Abbreviations 3.4. Abbreviations
FIS: Fault Indication Signal. FIS: Fault Indication Signal.
FRS: Fault Recovery Signal. FRS: Fault Recovery Signal.
LD: Link Degraded. LD: Link Degraded.
LF: Link Failure. LF: Link Failure.
PD: Path Degraded. PD: Path Degraded.
PF: Path Failure. PF: Path Failure.
PML: Path Merge LSR. PML: Path Merge LSR.
PG: Path Group. PG: Path Group.
PPG: Protected Path Group. PPG: Protected Path Group.
PTP: Protected Traffic Portion. PTP: Protected Traffic Portion.
PSL: Path Switch LSR. PSL: Path Switch LSR.
3. MPLS-based Recovery Principles 4. MPLS-based Recovery Principles
MPLS-based recovery refers to the ability to effect quick and MPLS-based recovery refers to the ability to effect quick and
complete restoration of traffic affected by a fault in an MPLS- complete restoration of traffic affected by a fault in an MPLS-
enabled network. The fault may be detected on the IP layer or in enabled network. The fault may be detected on the IP layer or in
lower layers over which IP traffic is transported. Fastest MPLS lower layers over which IP traffic is transported. Fastest MPLS
recovery is assumed to be achieved with protection switching and may recovery is assumed to be achieved with protection switching and may
be viewed as the MPLS LSR switch completion time that is comparable be viewed as the MPLS LSR switch completion time that is comparable
to, or equivalent to, the 50 ms switch-over completion time of the to, or equivalent to, the 50 ms switch-over completion time of the
SONET layer. This section provides a discussion of the concepts and SONET layer. This section provides a discussion of the concepts and
principles of MPLS-based recovery. The concepts are presented in principles of MPLS-based recovery. The concepts are presented in
terms of atomic or primitive terms that may be combined to specify terms of atomic or primitive terms that may be combined to specify
recovery approaches. We do not make any assumptions about the recovery approaches. We do not make any assumptions about the
underlying layer 1 or layer 2 transport mechanisms or their recovery underlying layer 1 or layer 2 transport mechanisms or their recovery
mechanisms. mechanisms.
3.1. Configuration of Recovery 4.1. Configuration of Recovery
An LSR may support any or all of the following recovery options: An LSR may support any or all of the following recovery options:
Default-recovery (No MPLS-based recovery enabled): Default-recovery (No MPLS-based recovery enabled):
Traffic on the working path is recovered only via Layer 3 or IP Traffic on the working path is recovered only via Layer 3 or IP
rerouting or by some lower layer mechanism such as SONET APS. This rerouting or by some lower layer mechanism such as SONET APS. This
is equivalent to having no MPLS-based recovery. This option may be is equivalent to having no MPLS-based recovery. This option may be
used for low priority traffic or for traffic that is recovered in used for low priority traffic or for traffic that is recovered in
another way (for example load shared traffic on parallel working another way (for example load shared traffic on parallel working
paths may be automatically recovered upon a fault along one of the paths may be automatically recovered upon a fault along one of the
working paths by distributing it among the remaining working paths). working paths by distributing it among the remaining working paths).
Recoverable (MPLS-based recovery enabled): Recoverable (MPLS-based recovery enabled):
This working path is recovered using one or more recovery paths, This working path is recovered using one or more recovery paths,
either via rerouting or via protection switching. either via rerouting or via protection switching.
3.2. Initiation of Path Setup 4.2. Initiation of Path Setup
There are three options for the initiation of the recovery path There are three options for the initiation of the recovery path
setup. The active and recovery paths may be established by using setup. The active and recovery paths may be established by using
either RSVP-TE [4][5] or CR-LDP [6]. either RSVP-TE [4][5] or CR-LDP [6].
Pre-established: Pre-established:
This is the same as the protection switching option. Here a recovery This is the same as the protection switching option. Here a recovery
path(s) is established prior to any failure on the working path. The path(s) is established prior to any failure on the working path. The
path selection can either be determined by an administrative path selection can either be determined by an administrative
skipping to change at page 17, line 21 skipping to change at page 18, line 5
acceptable alternative for carrying the working path traffic. acceptable alternative for carrying the working path traffic.
Variants include the case where an optical path or trail is Variants include the case where an optical path or trail is
configured, but no switches are set. configured, but no switches are set.
Established-on-Demand: Established-on-Demand:
This is the same as the rerouting option. Here, a recovery path is This is the same as the rerouting option. Here, a recovery path is
established after a failure on its working path has been detected and established after a failure on its working path has been detected and
notified to the PSL. notified to the PSL.
3.3. Initiation of Resource Allocation 4.3. Initiation of Resource Allocation
A recovery path may support the same traffic contract as the working A recovery path may support the same traffic contract as the working
path, or it may not. We will distinguish these two situations by path, or it may not. We will distinguish these two situations by
using different additive terms. If the recovery path is capable of using different additive terms. If the recovery path is capable of
replacing the working path without degrading service, it will be replacing the working path without degrading service, it will be
called an equivalent recovery path. If the recovery path lacks the called an equivalent recovery path. If the recovery path lacks the
resources (or resource reservations) to replace the working path resources (or resource reservations) to replace the working path
without degrading service, it will be called a limited recovery path. without degrading service, it will be called a limited recovery path.
Based on this, there are two options for the initiation of resource Based on this, there are two options for the initiation of resource
allocation: allocation:
skipping to change at page 17, line 54 skipping to change at page 18, line 38
This option may apply either to rerouting or to protection switching. This option may apply either to rerouting or to protection switching.
Here a recovery path reserves the required resources after a failure Here a recovery path reserves the required resources after a failure
on the working path has been detected and notified to the PSL and on the working path has been detected and notified to the PSL and
before the traffic on the working path is switched over to the before the traffic on the working path is switched over to the
recovery path. recovery path.
Note that under both the options above, depending on the amount of Note that under both the options above, depending on the amount of
resources reserved on the recovery path, it could either be an resources reserved on the recovery path, it could either be an
equivalent recovery path or a limited recovery path. equivalent recovery path or a limited recovery path.
3.4. Scope of Recovery 4.4. Scope of Recovery
3.4.1 Topology 4.4.1 Topology
1.1.1.1 Local Repair 4.4.1.1 Local Repair
The intent of local repair is to protect against a link or neighbor The intent of local repair is to protect against a link or neighbor
node fault and to minimize the amount of time required for failure node fault and to minimize the amount of time required for failure
propagation. In local repair (also known as local recovery), the node propagation. In local repair (also known as local recovery), the node
immediately upstream of the fault is the one to initiate recovery immediately upstream of the fault is the one to initiate recovery
(either rerouting or protection switching). Local repair can be of (either rerouting or protection switching). Local repair can be of
two types: two types:
Link Recovery/Restoration Link Recovery/Restoration
skipping to change at page 18, line 41 skipping to change at page 19, line 26
Node Recovery/Restoration Node Recovery/Restoration
In this case, the recovery path may be configured to route around a In this case, the recovery path may be configured to route around a
neighbor node deemed to be unreliable. Thus the recovery path is neighbor node deemed to be unreliable. Thus the recovery path is
disjoint from the working path only at a particular node and at links disjoint from the working path only at a particular node and at links
associated with the working path at that node. Once again, the associated with the working path at that node. Once again, the
traffic on the primary path is switched over to the recovery path at traffic on the primary path is switched over to the recovery path at
the upstream LSR that directly connects to the failed node, and the the upstream LSR that directly connects to the failed node, and the
recovery path shares overlapping portions with the working path. recovery path shares overlapping portions with the working path.
1.1.1.2 Global Repair 4.4.1.2 Global Repair
The intent of global repair is to protect against any link or node The intent of global repair is to protect against any link or node
fault on a path or on a segment of a path, with the obvious exception fault on a path or on a segment of a path, with the obvious exception
of the faults occurring at the ingress node of the protected path of the faults occurring at the ingress node of the protected path
segment. In global repair the PSL is usually distant from the failure segment. In global repair the PSL is usually distant from the failure
and needs to be notified by a FIS. and needs to be notified by a FIS.
In global repair also, end-to-end path recovery/restoration applies. In global repair also, end-to-end path recovery/restoration applies.
In many cases, the recovery path can be made completely link and node In many cases, the recovery path can be made completely link and node
disjoint with its working path. This has the advantage of protecting disjoint with its working path. This has the advantage of protecting
against all link and node fault(s) on the working path (end-to-end against all link and node fault(s) on the working path (end-to-end
path or path segment). path or path segment).
However, it may, in some cases, be slower than local repair since the However, it may, in some cases, be slower than local repair since the
fault notification message must now travel to the PSL to trigger the fault notification message must now travel to the PSL to trigger the
recovery action. recovery action.
1.1.1.3 Alternate Egress Repair 4.4.1.3 Alternate Egress Repair
It is possible to restore service without specifically recovering the It is possible to restore service without specifically recovering the
faulted path. faulted path.
For example, for best effort IP service it is possible to select a For example, for best effort IP service it is possible to select a
recovery path that has a different egress point from the working path recovery path that has a different egress point from the working path
(i.e., there is no PML). The recovery path egress must simply be a (i.e., there is no PML). The recovery path egress must simply be a
router that is acceptable for forwarding the FEC carried by the router that is acceptable for forwarding the FEC carried by the
working path (without creating looping). In an engineering context, working path (without creating looping). In an engineering context,
specific alternative FEC/LSP mappings with alternate egresses can be specific alternative FEC/LSP mappings with alternate egresses can be
formed. formed.
This may simplify enhancing the reliability of implicitly constructed This may simplify enhancing the reliability of implicitly constructed
MPLS topologies. A PSL may qualify LSP/FEC bindings as candidate MPLS topologies. A PSL may qualify LSP/FEC bindings as candidate
recovery paths as simply link and node disjoint with the immediate recovery paths as simply link and node disjoint with the immediate
downstream LSR of the working path. downstream LSR of the working path.
1.1.1.4 Multi-Layer Repair 4.4.1.4 Multi-Layer Repair
Multi-layer repair broadens the network designer's tool set for those Multi-layer repair broadens the network designer's tool set for those
cases where multiple network layers can be managed together to cases where multiple network layers can be managed together to
achieve overall network goals. Specific criteria for determining achieve overall network goals. Specific criteria for determining
when multi-layer repair is appropriate are beyond the scope of this when multi-layer repair is appropriate are beyond the scope of this
draft. draft.
1.1.1.5 Concatenated Protection Domains 4.4.1.5 Concatenated Protection Domains
A given service may cross multiple networks and these may employ A given service may cross multiple networks and these may employ
different recovery mechanisms. It is possible to concatenate different recovery mechanisms. It is possible to concatenate
protection domains so that service recovery can be provided end-to- protection domains so that service recovery can be provided end-to-
end. It is considered that the recovery mechanisms in different end. It is considered that the recovery mechanisms in different
domains may operate autonomously, and that multiple points of domains may operate autonomously, and that multiple points of
attachment may be used between domains (to ensure there is no single attachment may be used between domains (to ensure there is no single
point of failure). Alternate egress repair requires management of point of failure). Alternate egress repair requires management of
concatenated domains in that an explicit MPLS point of failure (the concatenated domains in that an explicit MPLS point of failure (the
PML) is by definition excluded. Details of concatenated protection PML) is by definition excluded. Details of concatenated protection
domains are beyond the scope of this draft. domains are beyond the scope of this draft.
3.4.2 Path Mapping 4.4.2 Path Mapping
Path mapping refers to the methods of mapping traffic from a faulty Path mapping refers to the methods of mapping traffic from a faulty
working path on to the recovery path. There are several options for working path on to the recovery path. There are several options for
this, as described below. Note that the options below should be this, as described below. Note that the options below should be
viewed as atomic terms that only describe how the working and viewed as atomic terms that only describe how the working and
protection paths are mapped to each other. The issues of resource protection paths are mapped to each other. The issues of resource
reservation along these paths, and how switchover is actually reservation along these paths, and how switchover is actually
performed lead to the more commonly used composite terms, such as 1+1 performed lead to the more commonly used composite terms, such as 1+1
and 1:1 protection, which were described in Section 2.1. and 1:1 protection, which were described in Section 2.1.
skipping to change at page 20, line 37 skipping to change at page 21, line 23
carry the traffic of a working path based on a certain configurable carry the traffic of a working path based on a certain configurable
load splitting ratio. This is especially useful when no single load splitting ratio. This is especially useful when no single
recovery path can be found that can carry the entire traffic of the recovery path can be found that can carry the entire traffic of the
working path in case of a fault. Split path protection may require working path in case of a fault. Split path protection may require
handshaking between the PSL and the PML(s), and may require the handshaking between the PSL and the PML(s), and may require the
PML(s) to correlate the traffic arriving on multiple recovery paths PML(s) to correlate the traffic arriving on multiple recovery paths
with the working path. Although this is an attractive option, the with the working path. Although this is an attractive option, the
details of split path protection are beyond the scope of this draft, details of split path protection are beyond the scope of this draft,
and are for further study. and are for further study.
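
As an illustration only, a configurable load splitting ratio could be applied as in the sketch below. The function name, the recovery path names, and the use of bandwidth in Mb/s are assumptions made for this example; the draft leaves the details of split path protection for further study.

```python
from typing import Dict


def split_traffic(total_mbps: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Divide working-path traffic among recovery paths by configured weight.

    `weights` maps a (hypothetical) recovery path name to its relative
    share of the load splitting ratio.
    """
    if not weights:
        raise ValueError("at least one recovery path is required")
    if any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive")
    total_weight = sum(weights.values())
    # Each recovery path carries its proportional share of the traffic.
    return {path: total_mbps * w / total_weight for path, w in weights.items()}


# A 3:1 split of 100 Mb/s over two recovery paths.
shares = split_traffic(100.0, {"recovery-A": 3, "recovery-B": 1})
```

With a 3:1 ratio, the first recovery path carries three quarters of the working path traffic and the second carries the remainder.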
3.4.3 Bypass Tunnels 4.4.3 Bypass Tunnels
It may be convenient, in some cases, to create a "bypass tunnel" for It may be convenient, in some cases, to create a "bypass tunnel" for
a PPG between a PSL and PML, thereby allowing multiple recovery paths a PPG between a PSL and PML, thereby allowing multiple recovery paths
to be transparent to intervening LSRs [2]. In this case, one LSP to be transparent to intervening LSRs [2]. In this case, one LSP
(the tunnel) is established between the PSL and PML following an (the tunnel) is established between the PSL and PML following an
acceptable route and a number of recovery paths are supported through acceptable route and a number of recovery paths are supported through
the tunnel via label stacking. A bypass tunnel can be used with any the tunnel via label stacking. A bypass tunnel can be used with any
of the path mapping options discussed in the previous section. of the path mapping options discussed in the previous section.
As with recovery paths, the bypass tunnel may or may not have As with recovery paths, the bypass tunnel may or may not have
resource reservations sufficient to provide recovery without service resource reservations sufficient to provide recovery without service
degradation. It is possible that the bypass tunnel may have degradation. It is possible that the bypass tunnel may have
sufficient resources to recover some number of working paths, but not sufficient resources to recover some number of working paths, but not
all at the same time. If the number of recovery paths carrying all at the same time. If the number of recovery paths carrying
traffic in the tunnel at any given time is restricted, this is traffic in the tunnel at any given time is restricted, this is
similar to the n-to-1 or n-to-m protection cases mentioned in Section similar to the n-to-1 or n-to-m protection cases mentioned in Section
3.4.2. 3.4.2.
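
The label stacking used by a bypass tunnel can be sketched as follows. This is a simplified model under stated assumptions, not a description of any particular implementation: the label stack is a list whose last element is the top label, the label values are arbitrary, and the function names are hypothetical.

```python
from typing import List


def psl_enter_tunnel(stack: List[int], recovery_label: int, tunnel_label: int) -> List[int]:
    """At the PSL: swap the working-path label for the recovery-path label,
    then push the bypass tunnel label on top of it."""
    if not stack:
        raise ValueError("packet carries no label")
    return stack[:-1] + [recovery_label, tunnel_label]


def transit_swap(stack: List[int], next_tunnel_label: int) -> List[int]:
    """At an intervening LSR: only the top (tunnel) label is examined and swapped."""
    if not stack:
        raise ValueError("packet carries no label")
    return stack[:-1] + [next_tunnel_label]


def pml_exit_tunnel(stack: List[int]) -> List[int]:
    """At the PML: pop the tunnel label, exposing the recovery-path label."""
    if not stack:
        raise ValueError("packet carries no label")
    return stack[:-1]


at_psl = psl_enter_tunnel([17], recovery_label=42, tunnel_label=1000)
in_transit = transit_swap(at_psl, next_tunnel_label=1001)
at_pml = pml_exit_tunnel(in_transit)
```

Because intervening LSRs act only on the tunnel label, the recovery path label beneath it stays untouched, which is what makes the individual recovery paths transparent to them.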
3.4.4 Recovery Granularity 4.4.4 Recovery Granularity
Another dimension of recovery considers the amount of traffic Another dimension of recovery considers the amount of traffic
requiring protection. This may range from a fraction of a path to a requiring protection. This may range from a fraction of a path to a
bundle of paths. bundle of paths.
1.1.1.6 Selective Traffic Recovery 4.4.4.1 Selective Traffic Recovery
This option allows for the protection of a fraction of traffic within This option allows for the protection of a fraction of traffic within
the same path. The portion of the traffic on an individual path that the same path. The portion of the traffic on an individual path that
requires protection is called a protected traffic portion (PTP). A requires protection is called a protected traffic portion (PTP). A
single path may carry different classes of traffic, with different single path may carry different classes of traffic, with different
protection requirements. The protected portion of this traffic may be protection requirements. The protected portion of this traffic may be
identified by its class, as for example, via the EXP bits in the MPLS identified by its class, as for example, via the EXP bits in the MPLS
shim header or via the priority bit in the ATM header. shim header or via the priority bit in the ATM header.
1.1.1.7 Bundling 4.4.4.2 Bundling
Bundling is a technique used to group multiple working paths together Bundling is a technique used to group multiple working paths together
in order to recover them simultaneously. The logical bundling of in order to recover them simultaneously. The logical bundling of
multiple working paths requiring protection, each of which is routed multiple working paths requiring protection, each of which is routed
identically between a PSL and a PML, is called a protected path group identically between a PSL and a PML, is called a protected path group
(PPG). When a fault occurs on the working path carrying the PPG, the (PPG). When a fault occurs on the working path carrying the PPG, the
PPG as a whole can be protected either by being switched to a bypass PPG as a whole can be protected either by being switched to a bypass
tunnel or by being switched to a recovery path. tunnel or by being switched to a recovery path.
3.4.5 Recovery Path Resource Use 4.4.5 Recovery Path Resource Use
In the case of pre-reserved recovery paths, there is the question of In the case of pre-reserved recovery paths, there is the question of
what use these resources may be put to when the recovery path is not what use these resources may be put to when the recovery path is not
in use. There are two options: in use. There are two options:
Dedicated-resource: Dedicated-resource:
If the recovery path resources are dedicated, they may not be used If the recovery path resources are dedicated, they may not be used
for anything except carrying the working traffic. For example, in for anything except carrying the working traffic. For example, in
the case of 1+1 protection, the working traffic is always carried on the case of 1+1 protection, the working traffic is always carried on
the recovery path. Even if the recovery path is not always carrying the recovery path. Even if the recovery path is not always carrying
skipping to change at page 22, line 5 skipping to change at page 22, line 42
the reserved resources at other times. Extra traffic is, by the reserved resources at other times. Extra traffic is, by
definition, traffic that can be displaced (without violating service definition, traffic that can be displaced (without violating service
agreements) whenever the recovery path resources are needed for agreements) whenever the recovery path resources are needed for
carrying the working path traffic. carrying the working path traffic.
Shared-resource: Shared-resource:
A shared recovery resource is dedicated for use by multiple primary A shared recovery resource is dedicated for use by multiple primary
resources that (according to SRLGs) are not expected to fail resources that (according to SRLGs) are not expected to fail
simultaneously. simultaneously.
3.5. Fault Detection 4.5. Fault Detection
MPLS recovery is initiated after the detection of either a lower MPLS recovery is initiated after the detection of either a lower
layer fault or a fault at the IP layer or in the operation of MPLS- layer fault or a fault at the IP layer or in the operation of MPLS-
based mechanisms. We consider four classes of impairments: Path based mechanisms. We consider four classes of impairments: Path
Failure, Path Degraded, Link Failure, and Link Degraded. Failure, Path Degraded, Link Failure, and Link Degraded.
Path Failure (PF) is a fault that indicates to an MPLS-based recovery Path Failure (PF) is a fault that indicates to an MPLS-based recovery
scheme that the connectivity of the path is lost. This may be scheme that the connectivity of the path is lost. This may be
detected by a path continuity test between the PSL and PML. Some, detected by a path continuity test between the PSL and PML. Some,
and perhaps the most common, path failures may be detected using a and perhaps the most common, path failures may be detected using a
skipping to change at page 22, line 51 skipping to change at page 23, line 35
provide faster fault detection than using only MPLS-based fault provide faster fault detection than using only MPLS-based fault
detection mechanisms. detection mechanisms.
Link Degraded (LD) is an indication from a lower layer that the link Link Degraded (LD) is an indication from a lower layer that the link
over which the path is carried is performing below an acceptable over which the path is carried is performing below an acceptable
level. If the lower layer supports detection and reporting of this level. If the lower layer supports detection and reporting of this
fault, it may be used by the MPLS recovery mechanism. In some cases, fault, it may be used by the MPLS recovery mechanism. In some cases,
using LD indications may provide faster fault detection than using using LD indications may provide faster fault detection than using
only MPLS-based fault detection mechanisms. only MPLS-based fault detection mechanisms.
3.6. Fault Notification 4.6. Fault Notification
MPLS-based recovery relies on rapid and reliable notification of MPLS-based recovery relies on rapid and reliable notification of
faults. Once a fault is detected, the node that detected the fault faults. Once a fault is detected, the node that detected the fault
must determine if the fault is severe enough to require path must determine if the fault is severe enough to require path
recovery. If the node is not capable of initiating direct action recovery. If the node is not capable of initiating direct action
(e.g. as a PSL) the node should send out a notification of the fault (e.g. as a PSL) the node should send out a notification of the fault
by transmitting a FIS to those of its upstream LSRs that were sending by transmitting a FIS to those of its upstream LSRs that were sending
traffic on the working path that is affected by the fault. This traffic on the working path that is affected by the fault. This
notification is relayed hop-by-hop by each subsequent LSR to its notification is relayed hop-by-hop by each subsequent LSR to its
upstream neighbor, until it eventually reaches a PSL. A PSL is the upstream neighbor, until it eventually reaches a PSL. A PSL is the
skipping to change at page 23, line 25 skipping to change at page 24, line 10
or Layer 3 packet [3]. The use of a Layer 2-based notification or Layer 3 packet [3]. The use of a Layer 2-based notification
requires a Layer 2 path direct to the PSL. An example of a FIS could requires a Layer 2 path direct to the PSL. An example of a FIS could
be the liveness message sent by a downstream LSR to its upstream be the liveness message sent by a downstream LSR to its upstream
neighbor, with an optional fault notification field set or it can be neighbor, with an optional fault notification field set or it can be
implicitly denoted by a teardown message. Alternatively, it could be implicitly denoted by a teardown message. Alternatively, it could be
a separate fault notification packet. The intermediate LSR should a separate fault notification packet. The intermediate LSR should
identify which of its incoming links (upstream LSRs) to propagate the identify which of its incoming links (upstream LSRs) to propagate the
FIS on. In the case of 1+1 protection, the FIS should also be sent FIS on. In the case of 1+1 protection, the FIS should also be sent
downstream to the PML where the recovery action is taken. downstream to the PML where the recovery action is taken.
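
The hop-by-hop relay of a FIS toward a PSL can be sketched as below. The model is an assumption made for illustration: the working path is an ordered list of LSR names from ingress to egress, and the set of LSRs able to act as a PSL is given. Neither the function name nor the data layout comes from this framework.

```python
from typing import List, Optional, Set


def relay_fis(path: List[str], detecting_lsr: str, psls: Set[str]) -> Optional[List[str]]:
    """Return the LSRs visited by the FIS, starting at the detecting LSR and
    moving upstream until the first PSL. Return None if no PSL lies upstream.

    If the detecting LSR is itself a PSL, the result holds only that LSR,
    meaning it can act directly and no FIS needs to be relayed.
    """
    idx = path.index(detecting_lsr)  # raises ValueError if not on the path
    visited: List[str] = []
    # Walk from the detecting LSR back toward the ingress, one hop at a time.
    for lsr in reversed(path[: idx + 1]):
        visited.append(lsr)
        if lsr in psls:
            return visited
    return None


hops = relay_fis(["A", "B", "C", "D", "E"], detecting_lsr="D", psls={"A"})
```

Here the fault is detected at D and the FIS travels through C and B before reaching the PSL at A.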
3.7. Switch-Over Operation 4.7. Switch-Over Operation
3.7.1 Recovery Trigger 4.7.1 Recovery Trigger
The activation of an MPLS protection switch following the detection The activation of an MPLS protection switch following the detection
or notification of a fault requires a trigger mechanism at the PSL. or notification of a fault requires a trigger mechanism at the PSL.
MPLS protection switching may be initiated due to automatic inputs or MPLS protection switching may be initiated due to automatic inputs or
external commands. The automatic activation of an MPLS protection external commands. The automatic activation of an MPLS protection
switch results from a response to a defect or fault condition switch results from a response to a defect or fault condition
detected at the PSL or to fault notifications received at the PSL. It detected at the PSL or to fault notifications received at the PSL. It
is possible that the fault detection and trigger mechanisms may be is possible that the fault detection and trigger mechanisms may be
combined, as is the case when a PF, PD, LF, or LD is detected at a combined, as is the case when a PF, PD, LF, or LD is detected at a
PSL and triggers a protection switch to the recovery path. In most PSL and triggers a protection switch to the recovery path. In most
skipping to change at page 24, line 5 skipping to change at page 24, line 43
transmitter failures, or LSR fabric failures), as does the LF fault, transmitter failures, or LSR fabric failures), as does the LF fault,
with the difference that the LF is a lower layer impairment that may with the difference that the LF is a lower layer impairment that may
be communicated to MPLS-based recovery mechanisms. The PD (or LD) be communicated to MPLS-based recovery mechanisms. The PD (or LD)
fault, on the other hand, applies to soft defects (excessive errors fault, on the other hand, applies to soft defects (excessive errors
due to noise on the link, for instance). The PD (or LD) results in a due to noise on the link, for instance). The PD (or LD) results in a
fault declaration only when the percentage of lost packets exceeds a fault declaration only when the percentage of lost packets exceeds a
given threshold, which is provisioned and may be set based on the given threshold, which is provisioned and may be set based on the
service level agreement(s) in effect between a service provider and a service level agreement(s) in effect between a service provider and a
customer. customer.
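
A minimal sketch of the PD (or LD) fault declaration rule follows. It assumes that packet loss is measured from sent and received packet counts over some interval; the function name, the counters, and the threshold value are hypothetical, and in practice the threshold would be provisioned from the applicable service level agreement.

```python
def is_degraded(sent: int, received: int, loss_threshold_pct: float) -> bool:
    """Declare a PD/LD fault only when the percentage of lost packets
    exceeds the provisioned threshold."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    loss_pct = 100.0 * (sent - received) / sent
    return loss_pct > loss_threshold_pct


# 1% loss against a 0.5% threshold: a fault is declared.
fault = is_degraded(sent=1000, received=990, loss_threshold_pct=0.5)
# 0.1% loss against the same threshold: no fault is declared.
no_fault = is_degraded(sent=1000, received=999, loss_threshold_pct=0.5)
```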
3.7.2 Recovery Action 4.7.2 Recovery Action
After a fault is detected or FIS is received by the PSL, the recovery After a fault is detected or FIS is received by the PSL, the recovery
action involves either a rerouting or protection switching operation. action involves either a rerouting or protection switching operation.
In both scenarios, the next hop label forwarding entry for a recovery In both scenarios, the next hop label forwarding entry for a recovery
path is bound to the working path. path is bound to the working path.
3.8. Post Recovery Operation 4.8. Post Recovery Operation
When traffic is flowing on the recovery path, a decision can be made When traffic is flowing on the recovery path, a decision can be made
whether to let the traffic remain on the recovery path and consider it whether to let the traffic remain on the recovery path and consider it
as a new working path or to switch to the old or a new working as a new working path or to switch to the old or a new working
path. This post recovery operation has two styles, one where the path. This post recovery operation has two styles, one where the
protection counterparts, i.e. the working and recovery path, are protection counterparts, i.e. the working and recovery path, are
fixed or "pinned" to their routes and one in which the PSL or other fixed or "pinned" to their routes and one in which the PSL or other
network entity with real time knowledge of failure dynamically network entity with real time knowledge of failure dynamically
performs re-establishment or controlled rearrangement of the paths performs re-establishment or controlled rearrangement of the paths
comprising the protected service. comprising the protected service.
3.8.1 Fixed Protection Counterparts 4.8.1 Fixed Protection Counterparts
For fixed protection counterparts the PSL will be pre-configured with For fixed protection counterparts the PSL will be pre-configured with
the appropriate behavior to take when the original fixed path is the appropriate behavior to take when the original fixed path is
restored to service. The choices are revertive and non-revertive restored to service. The choices are revertive and non-revertive
mode. The choice will typically depend on the relative costs of the mode. The choice will typically depend on the relative costs of the
working and protection paths, and the tolerance of the service to the working and protection paths, and the tolerance of the service to the
effects of switching paths yet again. These protection modes indicate effects of switching paths yet again. These protection modes indicate
whether or not there is a preferred path for the protected traffic. whether or not there is a preferred path for the protected traffic.
1.1.1.8 Revertive Mode 4.8.1.1 Revertive Mode
If the working path is always the preferred path, this path will be If the working path is always the preferred path, this path will be
used whenever it is available. Thus, in the event of a fault on this used whenever it is available. Thus, in the event of a fault on this
path, its unused resources will not be reclaimed by the network on path, its unused resources will not be reclaimed by the network on
failure. If the working path has a fault, traffic is switched to the failure. If the working path has a fault, traffic is switched to the
recovery path. In the revertive mode of operation, when the recovery path. In the revertive mode of operation, when the
preferred path is restored the traffic is automatically switched back preferred path is restored the traffic is automatically switched back
to it. to it.
There are a number of implications to pinned working and recovery There are a number of implications to pinned working and recovery
paths: paths:
- after a failure has moved traffic to the recovery path, that traffic is - after a failure has moved traffic to the recovery path, that traffic is
unprotected until such time as the path defect in the original unprotected until such time as the path defect in the original
working path is repaired and that path is restored to service. working path is repaired and that path is restored to service.
- after a failure has moved traffic to the recovery path, the resources - after a failure has moved traffic to the recovery path, the resources
associated with the original path remain reserved. associated with the original path remain reserved.
1.1.1.9 Non-revertive Mode 4.8.1.2 Non-revertive Mode
In the non-revertive mode of operation, there is no preferred path or In the non-revertive mode of operation, there is no preferred path or
it may be desirable to minimize further disruption of the service it may be desirable to minimize further disruption of the service
brought on by a revertive switching operation. A switch-back to the brought on by a revertive switching operation. A switch-back to the
original working path is not desired or not possible since the original working path is not desired or not possible since the
original path may no longer exist after the occurrence of a fault on original path may no longer exist after the occurrence of a fault on
that path. that path.
If there is a fault on the working path, traffic is switched to the If there is a fault on the working path, traffic is switched to the
recovery path. When or if the faulty path (the originally working recovery path. When or if the faulty path (the originally working
path) is restored, it may become the recovery path (either by path) is restored, it may become the recovery path (either by
skipping to change at page 25, line 22 skipping to change at page 26, line 7
In the non-revertive mode of operation, the working traffic may or In the non-revertive mode of operation, the working traffic may or
may not be restored to a new optimal working path or to the original may not be restored to a new optimal working path or to the original
working path anyway. This is because it might be useful, in some working path anyway. This is because it might be useful, in some
cases, to either: (a) administratively perform a protection switch cases, to either: (a) administratively perform a protection switch
back to the original working path after gaining further assurances back to the original working path after gaining further assurances
about the integrity of the path, or (b) it may be acceptable to about the integrity of the path, or (b) it may be acceptable to
continue operation on the recovery path, or (c) it may be desirable continue operation on the recovery path, or (c) it may be desirable
to move the traffic to a new optimal working path that is calculated to move the traffic to a new optimal working path that is calculated
based on network topology and network policies. based on network topology and network policies.
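The difference between the two modes for fixed counterparts can be illustrated with a minimal sketch. It is not part of the draft: the class and method names (ProtectedService, on_fault, on_repair) are hypothetical, and the role exchange shown for non-revertive mode is only one of the options the text allows.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    REVERTIVE = "revertive"
    NON_REVERTIVE = "non-revertive"


@dataclass
class ProtectedService:
    """Hypothetical model of a PSL's view of one protected service."""
    mode: Mode
    working: str       # path currently designated as the working path
    recovery: str      # path currently designated as the recovery path
    active: str = ""   # path currently carrying the traffic

    def __post_init__(self):
        # Traffic starts on the working path unless stated otherwise.
        self.active = self.active or self.working

    def on_fault(self, failed_path: str) -> None:
        # A fault on the working path while it carries traffic
        # triggers a switch to the recovery path.
        if failed_path == self.active == self.working:
            self.active = self.recovery

    def on_repair(self, repaired_path: str) -> None:
        # Only a repair of the working path, while traffic is on the
        # recovery path, requires a post-recovery decision.
        if repaired_path != self.working or self.active == self.working:
            return
        if self.mode is Mode.REVERTIVE:
            # The preferred path is back: switch traffic to it.
            self.active = self.working
        else:
            # Traffic stays where it is; the repaired path becomes
            # the recovery path (the two paths exchange roles).
            self.working, self.recovery = self.recovery, self.working
```

In revertive mode a fault followed by a repair returns the traffic to the original path. In non-revertive mode the traffic stays on the former recovery path, which is then treated as the working path.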
3.8.2 Dynamic Protection Counterparts 4.8.2 Dynamic Protection Counterparts
For dynamic protection counterparts when the traffic is switched over For dynamic protection counterparts when the traffic is switched over
to a recovery path, the association between the original working path to a recovery path, the association between the original working path
and the recovery path may no longer exist, since the original path and the recovery path may no longer exist, since the original path
itself may no longer exist after the fault. Instead, when the network itself may no longer exist after the fault. Instead, when the network
reaches a stable state following routing convergence, the recovery reaches a stable state following routing convergence, the recovery
path may be switched over to a different preferred path, chosen either path may be switched over to a different preferred path, chosen either
by optimization based on the new network topology and associated by optimization based on the new network topology and associated
information or from pre-configured information. information or from pre-configured information.
Dynamic protection counterparts assume that upon failure, the PSL or Dynamic protection counterparts assume that upon failure, the PSL or
other network entity will establish new working paths if another other network entity will establish new working paths if another
switch-over is to be performed. switch-over is to be performed.
3.8.3 Restoration and Notification 4.8.3 Restoration and Notification
MPLS restoration deals with returning the working traffic from the MPLS restoration deals with returning the working traffic from the
recovery path to the original or a new working path. Reversion is recovery path to the original or a new working path. Reversion is
performed by the PSL either upon receiving notification, via FRS, performed by the PSL either upon receiving notification, via FRS,
that the working path is repaired, or upon receiving notification that the working path is repaired, or upon receiving notification
that a new working path is established. that a new working path is established.
For fixed counterparts in revertive mode, an LSR that detected the For fixed counterparts in revertive mode, an LSR that detected the
fault on the working path also detects the restoration of the working fault on the working path also detects the restoration of the working
path. If the working path had experienced an LF defect, the LSR path. If the working path had experienced an LF defect, the LSR
skipping to change at page 26, line 23 skipping to change at page 27, line 8
along a recovery path towards a PSL and if the recovery path is an along a recovery path towards a PSL and if the recovery path is an
equivalent working path, it is possible for the working path and its equivalent working path, it is possible for the working path and its
recovery path to exchange roles once the original working path is recovery path to exchange roles once the original working path is
repaired following a fault. This is because, in that case, the repaired following a fault. This is because, in that case, the
recovery path effectively becomes the working path, and the restored recovery path effectively becomes the working path, and the restored
working path functions as a recovery path for the original recovery working path functions as a recovery path for the original recovery
path. This is important, since it affords the benefits of non- path. This is important, since it affords the benefits of non-
revertive switch operation outlined in Section 3.8.1, without leaving revertive switch operation outlined in Section 3.8.1, without leaving
the recovery path unprotected. the recovery path unprotected.
3.8.4 Reverting to Preferred Path (or Controlled Rearrangement) 4.8.4 Reverting to Preferred Path (or Controlled Rearrangement)
In the revertive mode, a "make before break" restoration switching In the revertive mode, a "make before break" restoration switching
can be used, which is less disruptive than performing protection can be used, which is less disruptive than performing protection
switching upon the occurrence of network impairments. This will switching upon the occurrence of network impairments. This will
minimize both packet loss and packet reordering. The controlled minimize both packet loss and packet reordering. The controlled
rearrangement of paths can also be used to satisfy traffic rearrangement of paths can also be used to satisfy traffic
engineering requirements for load balancing across an MPLS domain. engineering requirements for load balancing across an MPLS domain.
3.9. Performance 4.9. Performance
Resource/performance requirements for recovery paths should be Resource/performance requirements for recovery paths should be
specified in terms of the following attributes: specified in terms of the following attributes:
I. Resource class attribute: I. Resource class attribute:
Equivalent Recovery Class: The recovery path has the same resource Equivalent Recovery Class: The recovery path has the same resource
reservations and performance guarantees as the working path. In other reservations and performance guarantees as the working path. In other
words, the recovery path meets the same SLAs as the working path. words, the recovery path meets the same SLAs as the working path.
Limited Recovery Class: The recovery path does not have the same Limited Recovery Class: The recovery path does not have the same
resource reservations and performance guarantees as the working path. resource reservations and performance guarantees as the working path.
skipping to change at page 27, line 5 skipping to change at page 27, line 43
II. Priority Attribute: II. Priority Attribute:
The recovery path has a priority attribute just like the working path The recovery path has a priority attribute just like the working path
(i.e., the priority attribute of the associated traffic trunks). It (i.e., the priority attribute of the associated traffic trunks). It
can have the same priority as the working path or lower priority. can have the same priority as the working path or lower priority.
III. Preemption Attribute: III. Preemption Attribute:
The recovery path can have the same preemption attribute as the The recovery path can have the same preemption attribute as the
working path or a lower one. working path or a lower one.
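The attributes above can be sketched as a small data model. This is an illustration under stated assumptions, not something the draft defines: reservations are reduced to a single bandwidth figure, priority and preemption are modelled as integers where a larger value means "higher", and all names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class RecoveryClass(Enum):
    EQUIVALENT = "equivalent"
    LIMITED = "limited"


@dataclass(frozen=True)
class PathAttributes:
    bandwidth_kbps: int  # assumption: stand-in for resource reservations
    priority: int        # assumption: larger value = higher priority
    preemption: int      # assumption: larger value = higher preemption level


def classify(working: PathAttributes, recovery: PathAttributes) -> RecoveryClass:
    # Equivalent class: the recovery path has the same reservation
    # as the working path; anything else is treated as limited.
    if recovery.bandwidth_kbps == working.bandwidth_kbps:
        return RecoveryClass.EQUIVALENT
    return RecoveryClass.LIMITED


def attributes_allowed(working: PathAttributes, recovery: PathAttributes) -> bool:
    # The recovery path may match the working path's priority and
    # preemption attributes or be lower, but never higher.
    return (recovery.priority <= working.priority
            and recovery.preemption <= working.preemption)
```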
4. MPLS Recovery Features 5. MPLS Recovery Features
The following features are desirable from an operational point of The following features are desirable from an operational point of
view: view:
I. It is desirable that MPLS recovery provides an option to identify I. It is desirable that MPLS recovery provides an option to identify
protection groups (PPGs) and protection portions (PTPs). protection groups (PPGs) and protection portions (PTPs).
II. Each PSL should be capable of performing MPLS recovery upon the II. Each PSL should be capable of performing MPLS recovery upon the
detection of the impairments or upon receipt of notifications of detection of the impairments or upon receipt of notifications of
impairments. impairments.
skipping to change at page 27, line 35 skipping to change at page 28, line 23
original working path after the fault is corrected or a switchover to original working path after the fault is corrected or a switchover to
a new working path, upon the discovery or establishment of a more a new working path, upon the discovery or establishment of a more
optimal working path. optimal working path.
V. The recovery model should take into consideration path merging at V. The recovery model should take into consideration path merging at
intermediate LSRs. If a fault affects the merged segment, all the intermediate LSRs. If a fault affects the merged segment, all the
paths sharing that merged segment should be able to recover. paths sharing that merged segment should be able to recover.
Similarly, if a fault affects a non-merged segment, only the path Similarly, if a fault affects a non-merged segment, only the path
that is affected by the fault should be recovered. that is affected by the fault should be recovered.
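The merging requirement in item V can be illustrated as follows. This is a minimal sketch, assuming each path is represented as a list of link identifiers; the function name and the link naming are hypothetical.

```python
from typing import Dict, List, Set


def paths_to_recover(paths: Dict[str, List[str]], failed_link: str) -> Set[str]:
    """Return the names of all paths that traverse the failed link."""
    return {name for name, links in paths.items() if failed_link in links}


# Two paths that merge at LSR "c" and share the segment c-d-e.
example_paths = {
    "P1": ["a-c", "c-d", "d-e"],
    "P2": ["b-c", "c-d", "d-e"],
}
```

A fault on the merged link "c-d" selects both P1 and P2 for recovery, while a fault on the non-merged link "a-c" selects only P1.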
5. Comparison Criteria 6. Comparison Criteria
Possible criteria to use for comparison of MPLS-based recovery Possible criteria to use for comparison of MPLS-based recovery
schemes are as follows: schemes are as follows:
Recovery Time Recovery Time
We define recovery time as the time required for a recovery path to We define recovery time as the time required for a recovery path to
be activated (and traffic flowing) after a fault. Recovery Time is be activated (and traffic flowing) after a fault. Recovery Time is
the sum of the Fault Detection Time, Hold-off Time, Notification the sum of the Fault Detection Time, Hold-off Time, Notification
Time, Recovery Operation Time, and the Traffic Restoration Time. In Time, Recovery Operation Time, and the Traffic Restoration Time. In
skipping to change at page 29, line 43 skipping to change at page 30, line 29
IV. Percentage of coverage: dependent on a scheme and its IV. Percentage of coverage: dependent on a scheme and its
implementation, a certain percentage of faults may be covered. This implementation, a certain percentage of faults may be covered. This
may be subdivided into percentage of link faults and percentage of may be subdivided into percentage of link faults and percentage of
node faults. node faults.
V. The number of protected paths may affect how fast the total set of V. The number of protected paths may affect how fast the total set of
paths affected by a fault could be recovered. The ratio of protected paths affected by a fault could be recovered. The ratio of protected
paths is n/N, where n is the number of protected paths and N is the total paths is n/N, where n is the number of protected paths and N is the total
number of paths. number of paths.
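Two of the criteria in this section are simple arithmetic: recovery time is the sum of its five component times, and the protection ratio is n/N. A minimal sketch follows. The names are hypothetical and the time unit (milliseconds here) is an assumption, since the draft does not fix one.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTiming:
    """Component times of one recovery event (assumed unit: ms)."""
    fault_detection: float
    hold_off: float
    notification: float
    recovery_operation: float
    traffic_restoration: float

    def recovery_time(self) -> float:
        # Recovery time is the sum of the five components.
        return (self.fault_detection + self.hold_off + self.notification
                + self.recovery_operation + self.traffic_restoration)


def protected_ratio(n_protected: int, n_total: int) -> float:
    """Ratio n/N of protected paths to the total number of paths."""
    if n_total <= 0 or not 0 <= n_protected <= n_total:
        raise ValueError("require N > 0 and 0 <= n <= N")
    return n_protected / n_total
```

For example, component times of 10, 0, 5, 20 and 15 ms give a recovery time of 50 ms, and 30 protected paths out of 120 give a ratio of 0.25.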
6. Security Considerations 7. Security Considerations
The MPLS recovery that is specified herein does not raise any The MPLS recovery that is specified herein does not raise any
security issues that are not already present in the MPLS security issues that are not already present in the MPLS
architecture. architecture.
7. Intellectual Property Considerations 8. Intellectual Property Considerations
The IETF has been notified of intellectual property rights claimed in The IETF has been notified of intellectual property rights claimed in
regard to some or all of the specification contained in this regard to some or all of the specification contained in this
document. For more information consult the online list of claimed document. For more information consult the online list of claimed
rights. rights.
8. Acknowledgements 9. Acknowledgements
We would like to thank members of the MPLS WG mailing list for their We would like to thank members of the MPLS WG mailing list for their
suggestions on the earlier versions of this draft. In particular, we thank suggestions on the earlier versions of this draft. In particular, we thank
Bora Akyol, Dave Allan, Neil Harrison, and Dave Danenberg, whose Bora Akyol, Dave Allan, Neil Harrison, and Dave Danenberg, whose
suggestions and comments were very helpful in revising the document. suggestions and comments were very helpful in revising the document.
The editors would like to give very special thanks to Curtis The editors would like to give very special thanks to Curtis
Villamizar for his careful and extremely thorough reading of the Villamizar for his careful and extremely thorough reading of the
document and for taking the time to provide numerous suggestions, document and for taking the time to provide numerous suggestions,
which were very helpful in our latest revision of the document, and which were very helpful in the last couple of revisions of the
to Seyhan Civanlar, who provided initial input on the rerouting document.
section.
9. Authors' Addresses 10. Editors' Addresses
Vishal Sharma Fiffi Hellstrand Vishal Sharma Fiffi Hellstrand
Metanoia, Inc. Nortel Networks Metanoia, Inc. Nortel Networks
305 Elan Village Ln., Unit 121 St Eriksgatan 115 305 Elan Village Ln., Unit 121 St Eriksgatan 115
San Jose, CA 95134 PO Box 6701 San Jose, CA 95134 PO Box 6701
Phone: (408) 955-0910 113 85 Stockholm, Sweden Phone: (408) 955-0910 113 85 Stockholm, Sweden
v.sharma@ieee.org Phone: +46 8 5088 3687 v.sharma@ieee.org Phone: +46 8 5088 3687
Fiffi@nortelnetworks.com Fiffi@nortelnetworks.com
Ben Mack-Crane Srinivas Makam 11. References
Tellabs Operations, Inc. Smakam60540@yahoo.com
4951 Indiana Avenue
Lisle, IL 60532
Phone: (630) 512-7255
Ben.Mack-Crane@tellabs.com
Ken Owens Changcheng Huang
Erlang Technology, Inc. Carleton University
345 Marshall Ave., Suite 300 Minto Center, Rm. 3082
St. Louis, MO 63119 1125 Colonial By Drive
Phone: (314) 918-1579 Ottawa, Ontario K1S 5B6,
Canada
keno@erlangtech.com Phone: (613) 520-2600 x2477
Changcheng.Huang@sce.carleton.ca
Jon Weil Brad Cain
Nortel Networks Storigen Systems
Harlow Laboratories London Road 650 Suffolk Street
Harlow Essex CM17 9NA, UK Lowell, MA 01854
Phone: +44 (0)1279 403935 Phone: (978) 323-4454
jonweil@nortelnetworks.com bcain@storigen.com
Loa Andersson Bilel Jamoussi
Utfors AB Nortel Networks
Råsundavägen 12, Box 525 3 Federal Street, BL3-03
169 29 Solna, Sweden Billerica, MA 01821, USA
Phone: +46 8 5270 5038 Phone:(978) 288-4506
loa.andersson@utfors.se jamoussi@nortelnetworks.com
Angela Chiu
Celion Networks, Inc.
One Shiela Drive, Suite 2
Tinton Falls, NJ 07724
Phone: (732) 345-3441
angela.chiu@celion.com
10. References
[1] Rosen, E., Viswanathan, A., and Callon, R., "Multiprotocol Label [1] Rosen, E., Viswanathan, A., and Callon, R., "Multiprotocol Label
Switching Architecture", RFC 3031, January 2001. Switching Architecture", RFC 3031, January 2001.
[2] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., McManus, J., [2] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., McManus, J.,
"Requirements for Traffic Engineering Over MPLS", RFC 2702, "Requirements for Traffic Engineering Over MPLS", RFC 2702,
September 1999. September 1999.
[3] Huang, C., Sharma, V., Owens, K., Makam, V. "Building Reliable [3] Huang, C., Sharma, V., Owens, K., Makam, V. "Building Reliable
MPLS Networks Using a Path Protection Mechanism", IEEE Commun. MPLS Networks Using a Path Protection Mechanism", IEEE Commun.
 End of changes. 56 change blocks. 
156 lines changed or deleted 153 lines changed or added
