| < draft-ietf-mpls-recovery-frmwrk-03.txt | draft-ietf-mpls-recovery-frmwrk-04.txt > | |||
|---|---|---|---|---|
| MPLS Working Group Vishal Sharma | MPLS Working Group Vishal Sharma (Metanoia, Inc.) | |||
| Informational Track Metanoia, Inc. | Informational Track Fiffi Hellstrand (Nortel Networks) | |||
| Expires: January 2002 | Expires: November 2002 Ben-Mack Crane (Tellabs) | |||
| Ben-Mack Crane | Srinivas Makam | |||
| Srinivas Makam | Ken Owens (Erlang Technology) | |||
| Tellabs Operations, Inc. | Changcheng Huang (Carleton University) | |||
| Jon Weil (Nortel Networks) | ||||
| Ken Owens | Loa Andersson (Utfors) | |||
| Erlang Technology, Inc. | Bilel Jamoussi (Nortel Networks) | |||
| Brad Cain (Storigen) | ||||
| Changcheng Huang | Angela Chiu (Celion Networks) | |||
| Carleton University | ||||
| Fiffi Hellstrand | ||||
| Jon Weil | ||||
| Loa Andersson | ||||
| Bilel Jamoussi | ||||
| Nortel Networks | ||||
| Brad Cain | ||||
| Cereva Networks | ||||
| Seyhan Civanlar | ||||
| Lemur Networks | ||||
| Angela Chiu | ||||
| Celion Networks, Inc. | ||||
| July 2001 | May 2002 | |||
| Framework for MPLS-based Recovery | Framework for MPLS-based Recovery | |||
| <draft-ietf-mpls-recovery-frmwrk-03.txt> | <draft-ietf-mpls-recovery-frmwrk-04.txt> | |||
| Status of this memo | Status of this memo | |||
| This document is an Internet-Draft and is in full conformance with | This document is an Internet-Draft and is in full conformance with | |||
| all provisions of Section 10 of RFC2026. | all provisions of Section 10 of RFC2026. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that other | Task Force (IETF), its areas, and its working groups. Note that other | |||
| groups may also distribute working documents as Internet-Drafts. | groups may also distribute working documents as Internet-Drafts. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| Abstract | Abstract | |||
| Multi-protocol label switching (MPLS) [1] integrates the label | Multi-protocol label switching (MPLS) integrates the label swapping | |||
| swapping forwarding paradigm with network layer routing. To deliver | forwarding paradigm with network layer routing. To deliver reliable | |||
| reliable service, MPLS requires a set of procedures to provide | service, MPLS requires a set of procedures to provide protection of | |||
| protection of the traffic carried on different paths. This requires | the traffic carried on different paths. This requires that the label | |||
| that the label switched routers (LSRs) support fault detection, fault | switched routers (LSRs) support fault detection, fault notification, | |||
| notification, and fault recovery mechanisms, and that MPLS signaling | and fault recovery mechanisms, and that MPLS signaling support the | |||
| [2], [3], [4], [5], [6], [7] support the configuration of recovery. | configuration of recovery. With these objectives in mind, this | |||
| With these objectives in mind, this document specifies a framework | document specifies a framework for MPLS based recovery. | |||
| for MPLS based recovery. | ||||
| Table of Contents | Table of Contents | |||
| 1. Introduction.....................................................3 | 1. Introduction....................................................3 | |||
| 1.1. Background......................................................3 | 1.1. Background......................................................3 | |||
| 1.2. Motivation for MPLS-Based Recovery..............................4 | 1.2. Motivation for MPLS-Based Recovery..............................4 | |||
| 1.3. Objectives/Goals................................................5 | 1.3. Objectives/Goals................................................4 | |||
| 2. Overview.........................................................6 | 2. Overview........................................................6 | |||
| 2.1. Recovery Models.................................................7 | 2.1. Recovery Models.................................................6 | |||
| 2.1.1 Rerouting.......................................................7 | 2.1.1 Rerouting.....................................................6 | |||
| 2.1.2 Protection Switching............................................7 | 2.1.2 Protection Switching..........................................7 | |||
| 2.2. The Recovery Cycles.............................................8 | 2.2. The Recovery Cycles.............................................7 | |||
| 2.2.1 MPLS Recovery Cycle Model.......................................8 | 2.2.1 MPLS Recovery Cycle Model.....................................7 | |||
| 2.2.2 MPLS Reversion Cycle Model......................................9 | 2.2.2 MPLS Reversion Cycle Model....................................9 | |||
| 2.2.3 Dynamic Re-routing Cycle Model.................................11 | 2.2.3 Dynamic Re-routing Cycle Model...............................10 | |||
| 2.3. Definitions and Terminology....................................12 | 2.3. Definitions and Terminology....................................12 | |||
| 2.3.1 General Recovery Terminology...................................12 | 2.3.1 General Recovery Terminology.................................12 | |||
| 2.3.2 Failure Terminology............................................15 | 2.3.2 Failure Terminology..........................................15 | |||
| 2.4. Abbreviations..................................................16 | 2.4. Abbreviations..................................................15 | |||
| 3. MPLS-based Recovery Principles..................................16 | 3. MPLS-based Recovery Principles.................................16 | |||
| 3.1. Configuration of Recovery......................................16 | 3.1. Configuration of Recovery......................................16 | |||
| 3.2. Initiation of Path Setup.......................................17 | 3.2. Initiation of Path Setup.......................................16 | |||
| 3.3. Initiation of Resource Allocation..............................17 | 3.3. Initiation of Resource Allocation..............................17 | |||
| 3.4. Scope of Recovery..............................................18 | 3.4. Scope of Recovery..............................................17 | |||
| 3.4.1 Topology.......................................................18 | 3.4.1 Topology.....................................................17 | |||
| 3.4.1.1 Local Repair................................................18 | 3.4.1.1 Local Repair................................................18 | |||
| 3.4.1.2 Global Repair...............................................19 | 3.4.1.2 Global Repair...............................................18 | |||
| 3.4.1.3 Alternate Egress Repair.....................................19 | 3.4.1.3 Alternate Egress Repair.....................................19 | |||
| 3.4.1.4 Multi-Layer Repair..........................................19 | 3.4.1.4 Multi-Layer Repair..........................................19 | |||
| 3.4.1.5 Concatenated Protection Domains.............................19 | 3.4.1.5 Concatenated Protection Domains.............................19 | |||
| 3.4.2 Path Mapping...................................................20 | 3.4.2 Path Mapping.................................................19 | |||
| 3.4.3 Bypass Tunnels.................................................21 | 3.4.3 Bypass Tunnels...............................................20 | |||
| 3.4.4 Recovery Granularity...........................................21 | 3.4.4 Recovery Granularity.........................................21 | |||
| 3.4.4.1 Selective Traffic Recovery..................................21 | 3.4.4.1 Selective Traffic Recovery..................................21 | |||
| 3.4.4.2 Bundling....................................................21 | 3.4.4.2 Bundling....................................................21 | |||
| 3.4.5 Recovery Path Resource Use.....................................21 | 3.4.5 Recovery Path Resource Use...................................21 | |||
| 3.5. Fault Detection................................................22 | 3.5. Fault Detection................................................22 | |||
| 3.6. Fault Notification.............................................23 | 3.6. Fault Notification.............................................22 | |||
| 3.7. Switch-Over Operation..........................................23 | 3.7. Switch-Over Operation..........................................23 | |||
| 3.7.1 Recovery Trigger...............................................23 | 3.7.1 Recovery Trigger.............................................23 | |||
| 3.7.2 Recovery Action................................................24 | 3.7.2 Recovery Action..............................................24 | |||
| 3.8. Post Recovery Operation........................................24 | 3.8. Post Recovery Operation........................................24 | |||
| 3.8.1 Fixed Protection Counterparts..................................24 | 3.8.1 Fixed Protection Counterparts................................24 | |||
| 3.8.1.1 Revertive Mode..............................................25 | 3.8.1.1 Revertive Mode..............................................24 | |||
| 3.8.1.2 Non-revertive Mode..........................................25 | 3.8.1.2 Non-revertive Mode..........................................24 | |||
| 3.8.2 Dynamic Protection Counterparts................................25 | 3.8.2 Dynamic Protection Counterparts..............................25 | |||
| 3.8.3 Restoration and Notification...................................26 | 3.8.3 Restoration and Notification.................................25 | |||
| 3.8.4 Reverting to Preferred Path (or Controlled Rearrangement)......26 | 3.8.4 Reverting to Preferred Path (or Controlled Rearrangement)....26 | |||
| 3.9. Performance....................................................27 | 3.9. Performance....................................................26 | |||
| 4. MPLS Recovery Features..........................................27 | 4. MPLS Recovery Features.........................................27 | |||
| 5. Comparison Criteria.............................................28 | 5. Comparison Criteria............................................27 | |||
| 6. Security Considerations.........................................30 | 6. Security Considerations........................................29 | |||
| 7. Intellectual Property Considerations............................30 | 7. Intellectual Property Considerations...........................29 | |||
| 8. Acknowledgements................................................30 | 8. Acknowledgements...............................................30 | |||
| 9. Authors' Addresses..............................................30 | 9. Authors' Addresses.............................................30 | |||
| 10. References......................................................31 | 10. References.....................................................31 | |||
| 1. Introduction | 1. Introduction | |||
| This memo describes a framework for MPLS-based recovery. We provide a | This memo describes a framework for MPLS-based recovery. We provide a | |||
| detailed taxonomy of recovery terminology, and discuss the motivation | detailed taxonomy of recovery terminology, and discuss the motivation | |||
| for, the objectives of, and the requirements for MPLS-based recovery. | for, the objectives of, and the requirements for MPLS-based recovery. | |||
| We outline principles for MPLS-based recovery, and also provide | We outline principles for MPLS-based recovery, and also provide | |||
| comparison criteria that may serve as a basis for comparing and | comparison criteria that may serve as a basis for comparing and | |||
| evaluating different recovery schemes. | evaluating different recovery schemes. | |||
| At points in the document, we provide some thoughts about the | ||||
| operation or viability of certain recovery objectives. These should | ||||
| be viewed as the opinions of the authors, and not the consolidated | ||||
| views of the IETF. | ||||
| 1.1. Background | 1.1. Background | |||
| Network routing deployed today is focussed primarily on connectivity | Network routing deployed today is focused primarily on connectivity, | |||
| and typically supports only one class of service, the best effort | and typically supports only one class of service, the best effort | |||
| class. Multi-protocol label switching, on the other hand, by | class. Multi-protocol label switching [1], on the other hand, by | |||
| integrating forwarding based on label-swapping of a link local label | integrating forwarding based on label-swapping of a link local label | |||
| with network layer routing allows flexibility in the delivery of new | with network layer routing allows flexibility in the delivery of new | |||
| routing services. MPLS allows for using such media specific | routing services. MPLS allows for using such media specific | |||
| forwarding mechanisms as label swapping. This enables more | forwarding mechanisms as label swapping. This enables some | |||
| sophisticated features such as quality-of-service (QoS) and traffic | sophisticated features such as quality-of-service (QoS) and traffic | |||
| engineering [8] to be implemented more effectively. An important | engineering [2] to be implemented more effectively. An important | |||
| component of providing QoS, however, is the ability to transport data | component of providing QoS, however, is the ability to transport data | |||
| reliably and efficiently. Although the current routing algorithms are | reliably and efficiently. Although the current routing algorithms are | |||
| very robust and survivable, the amount of time they take to recover | robust and survivable, the amount of time they take to recover from a | |||
| from a fault can be significant, on the order of several seconds or | fault can be significant, on the order of several seconds or minutes, | |||
| minutes, causing serious disruption of service for some applications | causing disruption of service for some applications in the interim. | |||
| in the interim. This is unacceptable to many organizations that aim | This is unacceptable in situations where the aim is to provide a highly | |||
| to provide a highly reliable service, and thus require recovery times | reliable service, with recovery times that are on the order of | |||
| that are on the order of seconds down to 10's of milliseconds. | seconds down to 10's of milliseconds. | |||
| MPLS recovery may be motivated by the notion that there are inherent | MPLS recovery may be motivated by the notion that there are | |||
| limitations to improving the recovery times of current routing | limitations to improving the recovery times of current routing | |||
| algorithms. Additional improvement not obtainable by other means can | algorithms. Additional improvement can be obtained by augmenting | |||
| be obtained by augmenting these algorithms with MPLS recovery | these algorithms with MPLS recovery mechanisms [3]. Since MPLS is a | |||
| mechanisms. Since MPLS is likely to be the technology of choice in | possible technology of choice in future IP-based transport networks, | |||
| the future IP-based transport network, it is useful that MPLS be able | it is useful that MPLS be able to provide protection and restoration | |||
| to provide protection and restoration of traffic. MPLS may | of traffic. MPLS may facilitate the convergence of network | |||
| facilitate the convergence of network functionality on a common | functionality on a common control and management plane. Further, a | |||
| control and management plane. Further, a protection priority could be | protection priority could be used as a differentiating mechanism for | |||
| used as a differentiating mechanism for premium services that require | premium services that require high reliability. The remainder of this | |||
| high reliability. The remainder of this document provides a framework | document provides a framework for MPLS based recovery. It is focused | |||
| for MPLS based recovery. It is focused at a conceptual level and is | at a conceptual level and is meant to address motivation, objectives | |||
| meant to address motivation, objectives and requirements. Issues of | and requirements. Issues of mechanism, policy, routing plans and | |||
| mechanism, policy, routing plans and characteristics of traffic | characteristics of traffic carried by recovery paths are beyond the | |||
| carried by recovery paths are beyond the scope of this document. | scope of this document. | |||
| 1.2. Motivation for MPLS-Based Recovery | 1.2. Motivation for MPLS-Based Recovery | |||
| MPLS based protection of traffic (called MPLS-based Recovery) is | MPLS based protection of traffic (called MPLS-based Recovery) is | |||
| useful for a number of reasons. The most important is its ability to | useful for a number of reasons. The most important is its ability to | |||
| increase network reliability by enabling a faster response to faults | increase network reliability by enabling a faster response to faults | |||
| than is possible with traditional Layer 3 (or IP layer) approaches | than is possible with traditional Layer 3 (or IP layer) approaches | |||
| alone while still providing the visibility of the network afforded by | alone while still providing the visibility of the network afforded by | |||
| Layer 3. Furthermore, a protection mechanism using MPLS could enable | Layer 3. Furthermore, a protection mechanism using MPLS could enable | |||
| IP traffic to be put directly over WDM optical channels and provide a | IP traffic to be put directly over WDM optical channels and provide a | |||
| recovery option without an intervening SONET layer. This would | recovery option without an intervening SONET layer. This would | |||
| facilitate the construction of IP-over-WDM networks that request fast | facilitate the construction of IP-over-WDM networks that request a | |||
| recovery ability. | fast recovery ability. | |||
| The need for MPLS-based recovery arises because of the following: | The need for MPLS-based recovery arises because of the following: | |||
| I. Layer 3 or IP rerouting may be too slow for a core MPLS network | I. Layer 3 or IP rerouting may be too slow for a core MPLS network | |||
| that needs to support high reliability/availability. | that needs to support recovery times that are smaller than the | |||
| convergence times of IP routing protocols. | ||||
| II. Layer 0 (for example, optical layer) or Layer 1 (for example, | II. Layer 0 (for example, optical layer) or Layer 1 (for example, | |||
| SONET) mechanisms may not be deployed in topologies that meet | SONET) mechanisms may be wasteful use of resources. | |||
| carriers' protection goals. Restoration at these layers may also be | ||||
| wasteful use of resources. | ||||
| III. The granularity at which the lower layers may be able to protect | III. The granularity at which the lower layers may be able to protect | |||
| traffic may be too coarse for traffic that is switched using MPLS- | traffic may be too coarse for traffic that is switched using MPLS- | |||
| based mechanisms. | based mechanisms. | |||
| IV. Layer 0 or Layer 1 mechanisms may have no visibility into higher | IV. Layer 0 or Layer 1 mechanisms may have no visibility into higher | |||
| layer operations. Thus, while they may provide, for example, link | layer operations. Thus, while they may provide, for example, link | |||
| protection, they cannot easily provide node protection or protection | protection, they cannot easily provide node protection or protection | |||
| of traffic transported at layer 3. Further, this may prevent the | of traffic transported at layer 3. Further, this may prevent the | |||
| lower layers from providing fast restoration for traffic that needs | lower layers from providing restoration based on the traffic's needs. | |||
| it, while providing slower restoration (with possibly more optimal | For example, fast restoration for traffic that needs it, and slower | |||
| use of resources) for traffic that does not require fast restoration. | restoration (with possibly more optimal use of resources) for traffic | |||
| In networks where the latter class of traffic is dominant, providing | that does not require fast restoration. In networks where the latter | |||
| fast restoration to all classes of traffic may not be cost effective | class of traffic is dominant, providing fast restoration to all | |||
| from a service provider's perspective. | classes of traffic may not be cost effective from a service | |||
| provider's perspective. | ||||
| V. MPLS has desirable attributes when applied to the purpose of | V. MPLS has desirable attributes when applied to the purpose of | |||
| recovery for connectionless networks. Specifically that an LSP is | recovery for connectionless networks. Specifically that an LSP is | |||
| source routed and a forwarding path for recovery can be "pinned" and | source routed and a forwarding path for recovery can be "pinned" and | |||
| is not affected by transient instability in SPF routing brought on by | is not affected by transient instability in SPF routing brought on by | |||
| failure scenarios. | failure scenarios. | |||
| Furthermore, there is a need for open standards. | ||||
| VI. Establishing interoperability of protection mechanisms between | VI. Establishing interoperability of protection mechanisms between | |||
| routers/LSRs from different vendors in IP or MPLS networks is desired | routers/LSRs from different vendors in IP or MPLS networks is desired | |||
| to enable recovery mechanisms to work in a multivendor environment, | to enable recovery mechanisms to work in a multivendor environment, | |||
| and to enable the transition of certain protected services to an MPLS | and to enable the transition of certain protected services to an MPLS | |||
| core. | core. | |||
| 1.3. Objectives/Goals | 1.3. Objectives/Goals | |||
| The following are some important goals for MPLS-based recovery. | The following are some important goals for MPLS-based recovery. | |||
| Ia. MPLS-based recovery mechanisms may be subject to the traffic | Ia. MPLS-based recovery mechanisms may be subject to the traffic | |||
| engineering goal of optimal use of resources. | engineering goal of optimal use of resources. | |||
| Ib. MPLS based recovery mechanisms should aim to facilitate | Ib. MPLS based recovery mechanisms should aim to facilitate | |||
| restoration times that are sufficiently fast for the end user | restoration times that are sufficiently fast for the end user | |||
| application. That is, that better match the end-user application's | application. That is, that better match the end-user's application | |||
| requirements. In some cases, this may be as short as 10s of | requirements. In some cases, this may be as short as 10s of | |||
| milliseconds. | milliseconds. | |||
| We observe that Ia and Ib are conflicting objectives, and a trade off | We observe that Ia and Ib are conflicting objectives, and a trade off | |||
| exists between them. The optimal choice depends on the end-user | exists between them. The optimal choice depends on the end-user | |||
| application to restoration time and the cost impact of introducing | application's sensitivity to restoration time and the cost impact of | |||
| restoration in the network, as well as the end-user application's | introducing restoration in the network, as well as the end-user | |||
| sensitivity to cost. | application's sensitivity to cost. | |||
| II. MPLS-based recovery should aim to maximize network reliability | II. MPLS-based recovery should aim to maximize network reliability | |||
| and availability. MPLS-based recovery of traffic should aim to | and availability. MPLS-based recovery of traffic should aim to | |||
| minimize the number of single points of failure in the MPLS protected | minimize the number of single points of failure in the MPLS protected | |||
| domain. | domain. | |||
| III. MPLS-based recovery should aim to enhance the reliability of the | III. MPLS-based recovery should aim to enhance the reliability of the | |||
| protected traffic while minimally or predictably degrading the | protected traffic while minimally or predictably degrading the | |||
| traffic carried by the diverted resources. | traffic carried by the diverted resources. | |||
| skipping to change at page 6, line 20 | skipping to change at page 6, line 4 | |||
| of data and packet reordering during recovery operations. (The | of data and packet reordering during recovery operations. (The | |||
| current MPLS specification itself has no explicit requirement on | current MPLS specification itself has no explicit requirement on | |||
| reordering). | reordering). | |||
| VIII. MPLS-based recovery mechanisms should aim to minimize the state | VIII. MPLS-based recovery mechanisms should aim to minimize the state | |||
| overhead incurred for each recovery path maintained. | overhead incurred for each recovery path maintained. | |||
| IX. MPLS-based recovery mechanisms should aim to preserve the | IX. MPLS-based recovery mechanisms should aim to preserve the | |||
| constraints on traffic after switchover, if desired. That is, if | constraints on traffic after switchover, if desired. That is, if | |||
| desired, the recovery path should meet the resource requirements of, | desired, the recovery path should meet the resource requirements of, | |||
| and achieve the same performance characteristics as the working path. | and achieve the same performance characteristics as, the working | |||
| path. | ||||
| We observe that some of the above are conflicting goals, and real | We observe that some of the above are conflicting goals, and real | |||
| deployment will often involve engineering compromises based on a | deployment will often involve engineering compromises based on a | |||
| variety of factors such as cost, end-user application requirements, | variety of factors such as cost, end-user application requirements, | |||
| network efficiency, and revenue considerations. Thus, these goals are | network efficiency, and revenue considerations. Thus, these goals are | |||
| subject to tradeoffs based on the above considerations. | subject to tradeoffs based on the above considerations. | |||
| 2. Overview | 2. Overview | |||
| There are several options for providing protection of traffic using | There are several options for providing protection of traffic. The | |||
| MPLS. The most generic requirement is the specification of whether | most generic requirement is the specification of whether recovery | |||
| recovery should be via Layer 3 (or IP) rerouting or via MPLS | should be via Layer 3 (or IP) rerouting or via MPLS protection | |||
| protection switching or rerouting actions. | switching or rerouting actions. | |||
| Generally network operators aim to provide the fastest and the best | Generally network operators aim to provide the fastest and the best | |||
| protection mechanism that can be provided at a reasonable cost. The | protection mechanism that can be provided at a reasonable cost. The | |||
| higher the level of protection, the more resources are consumed. | higher the levels of protection, the more the resources consumed. | |||
| Therefore it is expected that network operators will offer a spectrum | Therefore it is expected that network operators will offer a spectrum | |||
| of service levels. MPLS-based recovery should give the flexibility to | of service levels. MPLS-based recovery should give the flexibility to | |||
| select the recovery mechanism, choose the granularity at which | select the recovery mechanism, choose the granularity at which | |||
| traffic is protected, and to also choose the specific types of | traffic is protected, and to also choose the specific types of | |||
| traffic that are protected in order to give operators more control | traffic that are protected in order to give operators more control | |||
| over that tradeoff. With MPLS-based recovery, it can be possible to | over that tradeoff. With MPLS-based recovery, it can be possible to | |||
| provide different levels of protection for different classes of | provide different levels of protection for different classes of | |||
| service, based on their service requirements. For example, using | service, based on their service requirements. For example, using | |||
| approaches outlined below, a Virtual Leased Line (VLL) service or | approaches outlined below, a Virtual Leased Line (VLL) service or | |||
| real-time applications like Voice over IP (VoIP) may be supported | real-time applications like Voice over IP (VoIP) may be supported | |||
| using link/node protection together with pre-established, pre- | using link/node protection together with pre-established, pre- | |||
| reserved path protection. Best effort traffic, on the other hand, may | reserved path protection. Best effort traffic, on the other hand, may | |||
| use established-on-demand path protection or simply rely on IP re- | use path protection that is established on demand or may simply rely | |||
| route or higher layer recovery mechanisms. As another example of | on IP re-route or higher layer recovery mechanisms. As another | |||
| their range of application, MPLS-based recovery strategies may be | example of their range of application, MPLS-based recovery strategies | |||
| used to protect traffic not originally flowing on label switched | may be used to protect traffic not originally flowing on label | |||
| paths, such as IP traffic that is normally routed hop-by-hop, as well | switched paths, such as IP traffic that is normally routed hop-by- | |||
| as traffic forwarded on label switched paths. | hop, as well as traffic forwarded on label switched paths. | |||
| 2.1. Recovery Models | 2.1. Recovery Models | |||
| There are two basic models for path recovery: rerouting and | There are two basic models for path recovery: rerouting and | |||
| protection switching. | protection switching. | |||
| Protection switching and rerouting, as defined below, may be used | Protection switching and rerouting, as defined below, may be used | |||
| together. For example, protection switching to a recovery path may | together. For example, protection switching to a recovery path may | |||
| be used for rapid restoration of connectivity while rerouting | be used for rapid restoration of connectivity while rerouting | |||
| determines a new optimal network configuration, rearranging paths, as | determines a new optimal network configuration, rearranging paths, as | |||
| needed, at a later time. | needed, at a later time. | |||
| 2.1.1 Rerouting | 2.1.1 Rerouting | |||
| Recovery by rerouting is defined as establishing new paths or path | Recovery by rerouting is defined as establishing new paths or path | |||
| segments on demand for restoring traffic after the occurrence of a | segments on demand for restoring traffic after the occurrence of a | |||
| fault. The new paths may be based upon fault information, network | fault. The new paths may be based upon fault information, network | |||
| routing policies, pre-defined configurations and network topology | routing policies, pre-defined configurations and network topology | |||
| information. Thus, upon detecting a fault, paths or path segments to | information. Thus, upon detecting a fault, paths or path segments to | |||
| bypass the fault are established using signaling. Reroute mechanisms | bypass the fault are established using signaling. | |||
| are inherently slower than protection switching mechanisms, since | ||||
| more must be done following the detection of a fault. However reroute | ||||
| mechanisms are simpler and more frugal as no resources are committed | ||||
| until after the fault occurs and the location of the fault is known. | ||||
| Once the network routing algorithms have converged after a fault, it | Once the network routing algorithms have converged after a fault, it | |||
| may be preferable, in some cases, to reoptimize the network by | may be preferable, in some cases, to reoptimize the network by | |||
| performing a reroute based on the current state of the network and | performing a reroute based on the current state of the network and | |||
| network policies. This is discussed further in Section 3.8. | network policies. This is discussed further in Section 3.8. | |||
| In terms of the principles defined in section 3, reroute recovery | In terms of the principles defined in section 3, reroute recovery | |||
| employs paths established-on-demand with resources reserved-on- | employs paths established-on-demand with resources reserved-on- | |||
| demand. | demand. | |||
| 2.1.2 Protection Switching | 2.1.2 Protection Switching | |||
| Protection switching recovery mechanisms pre-establish a recovery | Protection switching recovery mechanisms pre-establish a recovery | |||
| path or path segment, based upon network routing policies, the | path or path segment, based upon network routing policies, the | |||
| restoration requirements of the traffic on the working path, and | restoration requirements of the traffic on the working path, and | |||
| administrative considerations. The recovery path may or may not be | administrative considerations. The recovery path may or may not be | |||
| link and node disjoint with the working path[9], [14]. However if the | link and node disjoint with the working path. However if the recovery | |||
| recovery path shares sources of failure with the working path, the | path shares sources of failure with the working path, the overall | |||
| overall reliability of the construct is degraded. When a fault is | reliability of the construct is degraded. When a fault is detected, | |||
| detected, the protected traffic is switched over to the recovery | the protected traffic is switched over to the recovery path(s) and | |||
| path(s) and restored. | restored. | |||
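
As a minimal sketch of the disjointness point above, the following Python checks which nodes and links a recovery path shares with its working path. The path representation (a list of node names from PSL to PML), the function name, and the example topologies are assumptions made for illustration; they do not come from the draft. Links are treated as undirected pairs of adjacent nodes, so parallel links between the same two nodes are not distinguished.

```python
def shared_elements(working, recovery):
    """Return (shared_nodes, shared_links) between two paths.

    Each path is a list of node names from PSL to PML. The endpoints are
    excluded from the node comparison because both paths necessarily
    start at the PSL and end at the PML.
    """
    def links(path):
        # Treat each link as an undirected pair of adjacent nodes.
        return {frozenset(pair) for pair in zip(path, path[1:])}

    shared_nodes = set(working[1:-1]) & set(recovery[1:-1])
    shared_links = links(working) & links(recovery)
    return shared_nodes, shared_links


# Hypothetical example topologies.
working = ["PSL", "A", "B", "PML"]
disjoint_recovery = ["PSL", "C", "D", "PML"]
overlapping_recovery = ["PSL", "A", "D", "PML"]
```

An empty result for both sets means the recovery path is link and node disjoint from the working path in this simplified model. A non-empty result identifies sources of failure that the two paths have in common. Shared risks below the MPLS layer (for example, two links in the same conduit) are not visible to a check of this kind.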
| In terms of the principles in section 3, protection switching employs | In terms of the principles in section 3, protection switching employs | |||
| pre-established recovery paths, and, if resource reservation is | pre-established recovery paths, and, if resource reservation is | |||
| required on the recovery path, pre-reserved resources. The various | required on the recovery path, pre-reserved resources. The various | |||
| sub-types of protection switching are detailed in Section 3.4 of this | sub-types of protection switching are detailed in Section 3.4 of this | |||
| document. | document. | |||
| 2.1.2.1 | ||||
| 2.2. The Recovery Cycles | 2.2. The Recovery Cycles | |||
| There are three defined recovery cycles; the MPLS Recovery Cycle, the | There are three defined recovery cycles: the MPLS Recovery Cycle, the | |||
| MPLS Reversion Cycle and the Dynamic Re-routing Cycle. The first | MPLS Reversion Cycle and the Dynamic Re-routing Cycle. The first | |||
| cycle detects a fault and restores traffic onto MPLS-based recovery | cycle detects a fault and restores traffic onto MPLS-based recovery | |||
| paths. If the recovery path is non-optimal the cycle may be followed | paths. If the recovery path is non-optimal the cycle may be followed | |||
| by any of the two latter to achieve an optimized network again. The | by any of the two latter cycles to achieve an optimized network | |||
| reversion cycle applies for explicitly routed traffic that does | again. The reversion cycle applies for explicitly routed traffic that | |||
| not rely on any dynamic routing protocols to be converged. The | does not rely on any dynamic routing protocols to be converged. | |||
| dynamic re-routing cycle applies for traffic that is forwarded based | The dynamic re-routing cycle applies for traffic that is forwarded | |||
| on hop-by-hop routing. | based on hop-by-hop routing. | |||
| 2.2.1 MPLS Recovery Cycle Model | 2.2.1 MPLS Recovery Cycle Model | |||
| The MPLS recovery cycle model is illustrated in Figure 1. | The MPLS recovery cycle model is illustrated in Figure 1. | |||
| Definitions and a key to abbreviations follow. | Definitions and a key to abbreviations follow. | |||
| --Network Impairment | --Network Impairment | |||
| | --Fault Detected | | --Fault Detected | |||
| | | --Start of Notification | | | --Start of Notification | |||
| | | | -- Start of Recovery Operation | | | | -- Start of Recovery Operation | |||
| | | | | --Recovery Operation Complete | | | | | --Recovery Operation Complete | |||
| | | | | | --Path Traffic Restored | | | | | | --Path Traffic Restored | |||
| skipping to change at page 9, line 25 ¶ | skipping to change at page 9, line 4 ¶ | |||
| LSR detecting the fault and the time at which the Path Switch LSR | LSR detecting the fault and the time at which the Path Switch LSR | |||
| (PSL) begins the recovery operation. This is zero if the PSL detects | (PSL) begins the recovery operation. This is zero if the PSL detects | |||
| the fault itself or infers a fault from such events as an adjacency | the fault itself or infers a fault from such events as an adjacency | |||
| failure. | failure. | |||
| Note: If the PSL detects the fault itself, there still may be a Hold- | Note: If the PSL detects the fault itself, there still may be a Hold- | |||
| Off Time period between detection and the start of the recovery | Off Time period between detection and the start of the recovery | |||
| operation. | operation. | |||
| Recovery Operation Time | Recovery Operation Time | |||
| The time between the first and last recovery actions. This may | The time between the first and last recovery actions. This may | |||
| include message exchanges between the PSL and PML to coordinate | include message exchanges between the PSL and PML to coordinate | |||
| recovery actions. | recovery actions. | |||
| Traffic Restoration Time | Traffic Restoration Time | |||
| The time between the last recovery action and the time that the | The time between the last recovery action and the time that the | |||
| traffic (if present) is completely recovered. This interval is | traffic (if present) is completely recovered. This interval is | |||
| intended to account for the time required for traffic to once again | intended to account for the time required for traffic to once again | |||
| arrive at the point in the network that experienced disrupted or | arrive at the point in the network that experienced disrupted or | |||
| degraded service due to the occurrence of the fault (e.g. the PML). | degraded service due to the occurrence of the fault (e.g. the PML). | |||
| This time may depend on the location of the fault, the recovery | This time may depend on the location of the fault, the recovery | |||
| mechanism, and the propagation delay along the recovery path. | mechanism, and the propagation delay along the recovery path. | |||
| 2.2.2 MPLS Reversion Cycle Model | 2.2.2 MPLS Reversion Cycle Model | |||
| Protection switching, revertive mode, requires the traffic to be | Protection switching, revertive mode, requires the traffic to be | |||
| switched back to a preferred path when the fault on that path is | switched back to a preferred path when the fault on that path is | |||
| cleared. The MPLS reversion cycle model is illustrated in Figure 2. | cleared. The MPLS reversion cycle model is illustrated in Figure 2. | |||
| Note that the cycle shown below comes after the recovery cycle shown | Note that the cycle shown below comes after the recovery cycle shown | |||
| in Fig. 1. | in Fig. 1. | |||
| --Network Impairment Repaired | --Network Impairment Repaired | |||
| | --Fault Cleared | | --Fault Cleared | |||
| | | --Path Available | | | --Path Available | |||
| skipping to change at page 10, line 24 ¶ | skipping to change at page 10, line 4 ¶ | |||
| T10 Reversion Operation Time | T10 Reversion Operation Time | |||
| T11 Traffic Restoration Time | T11 Traffic Restoration Time | |||
| Note that time T6 (not shown above) is the time for which the network | Note that time T6 (not shown above) is the time for which the network | |||
| impairment is not repaired and traffic is flowing on the recovery | impairment is not repaired and traffic is flowing on the recovery | |||
| path. | path. | |||
| Definitions of the reversion cycle times are as follows: | Definitions of the reversion cycle times are as follows: | |||
| Fault Clearing Time | Fault Clearing Time | |||
| The time between the repair of a network impairment and the time that | The time between the repair of a network impairment and the time that | |||
| MPLS-based mechanisms learn that the fault has been cleared. This | MPLS-based mechanisms learn that the fault has been cleared. This | |||
| time may be highly dependent on lower layer protocols. | time may be highly dependent on lower layer protocols. | |||
| Wait-to-Restore Time | Wait-to-Restore Time | |||
| The configured waiting time between the clearing of a fault and MPLS- | The configured waiting time between the clearing of a fault and MPLS- | |||
| based recovery action(s). Waiting time may be needed to ensure the | based recovery action(s). Waiting time may be needed to ensure that | |||
| path is stable and to avoid flapping in cases where a fault is | the path is stable and to avoid flapping in cases where a fault is | |||
| intermittent. The Wait-to-Restore Time may be zero. | intermittent. The Wait-to-Restore Time may be zero. | |||
| Note: The Wait-to-Restore Time may occur after the Notification Time | Note: The Wait-to-Restore Time may occur after the Notification Time | |||
| interval if the PSL is configured to wait. | interval if the PSL is configured to wait. | |||
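
The anti-flapping role of the Wait-to-Restore Time can be sketched as a simple timer that is cancelled whenever the fault recurs. This is an illustrative sketch only: the class name, method names, and the use of caller-supplied timestamps in seconds are assumptions, not part of the draft.

```python
class WaitToRestoreTimer:
    """Allow reversion only after the path has stayed clear for the WTR period."""

    def __init__(self, wtr_seconds):
        self.wtr_seconds = wtr_seconds
        self.cleared_at = None  # time of the most recent fault clearing

    def fault_cleared(self, now):
        # Start (or restart) the waiting period.
        self.cleared_at = now

    def fault_detected(self, now):
        # The fault recurred, so any pending reversion is cancelled.
        self.cleared_at = None

    def may_revert(self, now):
        if self.cleared_at is None:
            return False
        return now - self.cleared_at >= self.wtr_seconds
```

With an intermittent fault, each recurrence cancels the pending reversion, so traffic stays on the recovery path until the preferred path has been continuously clear for the configured period. A Wait-to-Restore Time of zero permits reversion as soon as the fault is cleared.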
| Notification Time | Notification Time | |||
| The time between initiation of an FRS by the LSR clearing the fault | The time between initiation of a fault recovery signal (FRS) by the | |||
| and the time at which the path switch LSR begins the reversion | LSR clearing the fault and the time at which the path switch LSR | |||
| operation. This is zero if the PSL clears the fault itself. | begins the reversion operation. This is zero if the PSL clears the | |||
| | fault itself. | |||
| Note: If the PSL clears the fault itself, there still may be a Wait- | Note: If the PSL clears the fault itself, there still may be a Wait- | |||
| to-Restore Time period between fault clearing and the start of the | to-Restore Time period between fault clearing and the start of the | |||
| reversion operation. | reversion operation. | |||
| Reversion Operation Time | Reversion Operation Time | |||
| The time between the first and last reversion actions. This may | The time between the first and last reversion actions. This may | |||
| include message exchanges between the PSL and PML to coordinate | include message exchanges between the PSL and PML to coordinate | |||
| reversion actions. | reversion actions. | |||
| Traffic Restoration Time | Traffic Restoration Time | |||
| The time between the last reversion action and the time that traffic | The time between the last reversion action and the time that traffic | |||
| (if present) is completely restored on the preferred path. This | (if present) is completely restored on the preferred path. This | |||
| interval is expected to be quite small since both paths are working | interval is expected to be quite small since both paths are working | |||
| and care may be taken to limit the traffic disruption (e.g., using | and care may be taken to limit the traffic disruption (e.g., using | |||
| "make before break" techniques and synchronous switch-over). | "make before break" techniques and synchronous switch-over). | |||
| In practice, the only interesting times in the reversion cycle are | In practice, the only interesting times in the reversion cycle are | |||
| the Wait-to-Restore Time and the Traffic Restoration Time (or some | the Wait-to-Restore Time and the Traffic Restoration Time (or some | |||
| other measure of traffic disruption). Given that both paths are | other measure of traffic disruption). Given that both paths are | |||
| available, there is no need for rapid operation, and a well- | available, there is no need for rapid operation, and a well- | |||
| controlled switch-back with minimal disruption is desirable. | controlled switch-back with minimal disruption is desirable. | |||
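
The reversion cycle intervals defined above can be collected in a small record, as in the sketch below. It assumes the five intervals run back to back between the repair of the impairment and the restoration of traffic on the preferred path, so that their sum approximates the length of the whole cycle. The field names and the millisecond unit are illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class ReversionCycleTimes:
    """Reversion cycle intervals, in milliseconds (illustrative unit)."""
    fault_clearing: int
    wait_to_restore: int
    notification: int
    reversion_operation: int
    traffic_restoration: int

    def total(self):
        # Sum of all intervals; the order of Wait-to-Restore and
        # Notification does not affect the total.
        return (self.fault_clearing + self.wait_to_restore
                + self.notification + self.reversion_operation
                + self.traffic_restoration)

    def dominant_interval(self):
        # Name of the longest interval in the cycle.
        values = vars(self)
        return max(values, key=values.get)
```

With a configured Wait-to-Restore Time much larger than the other intervals, `dominant_interval()` returns `"wait_to_restore"`, which is consistent with the observation that the reversion cycle does not need to operate rapidly.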
| 2.2.3 Dynamic Re-routing Cycle Model | 2.2.3 Dynamic Re-routing Cycle Model | |||
| Dynamic rerouting aims to bring the IP network to a stable state | Dynamic rerouting aims to bring the IP network to a stable state | |||
| after a network impairment has occurred. A re-optimized network is | after a network impairment has occurred. A re-optimized network is | |||
| achieved after the routing protocols have converged, and the traffic | achieved after the routing protocols have converged, and the traffic | |||
| is moved from a recovery path to a (possibly) new working path. The | is moved from a recovery path to a (possibly) new working path. The | |||
| steps involved in this mode are illustrated in Figure 3. | steps involved in this mode are illustrated in Figure 3. | |||
| Note that the cycle shown below may be overlaid on the recovery | Note that the cycle shown below may be overlaid on the recovery cycle | |||
| cycle shown in Fig. 1 or the reversion cycle shown in Fig. 2, or both | shown in Fig. 1 or the reversion cycle shown in Fig. 2, or both (in | |||
| (in the event that both the recovery cycle and the reversion cycle | the event that both the recovery cycle and the reversion cycle take | |||
| take place before the routing protocols converge, and after the | place before the routing protocols converge), and after the | |||
| convergence of the routing protocols it is determined (based on on- | convergence of the routing protocols it is determined (based on on- | |||
| line algorithms or off-line traffic engineering tools, network | line algorithms or off-line traffic engineering tools, network | |||
| configuration, or a variety of other possible criteria) that there is | configuration, or a variety of other possible criteria) that there is | |||
| a better route for the working path). | a better route for the working path. | |||
| --Network Enters a Semi-stable State after an Impairment | --Network Enters a Semi-stable State after an Impairment | |||
| | --Dynamic Routing Protocols Converge | | --Dynamic Routing Protocols Converge | |||
| | | --Initiate Setup of New Working Path between PSL | | | --Initiate Setup of New Working Path between PSL | |||
| | | | and PML | | | | and PML | |||
| | | | --Switchover Operation Complete | | | | --Switchover Operation Complete | |||
| | | | | --Traffic Moved to New Working Path | | | | | --Traffic Moved to New Working Path | |||
| | | | | | | | | | | | | |||
| | | | | | | | | | | | | |||
| v v v v v | v v v v v | |||
| skipping to change at page 12, line 27 ¶ | skipping to change at page 12, line 6 ¶ | |||
| The time between the first and last switchover actions. This may | The time between the first and last switchover actions. This may | |||
| include message exchanges between the PSL and PML to coordinate the | include message exchanges between the PSL and PML to coordinate the | |||
| switchover actions. | switchover actions. | |||
| As an example of the recovery cycle, we present a sequence of events | As an example of the recovery cycle, we present a sequence of events | |||
| that occur after a network impairment occurs and when a protection | that occur after a network impairment occurs and when a protection | |||
| switch is followed by dynamic rerouting. | switch is followed by dynamic rerouting. | |||
| I. Link or path fault occurs | I. Link or path fault occurs | |||
| II. Signaling initiated (FIS) for the fault detected | II. Signaling initiated (FIS) for the detected fault | |||
| III. FIS arrives at the PSL | III. FIS arrives at the PSL | |||
| IV. The PSL initiates a protection switch to a pre-configured | IV. The PSL initiates a protection switch to a pre-configured | |||
| recovery path | recovery path | |||
| V. The PSL switches over the traffic from the working path to the | V. The PSL switches over the traffic from the working path to the | |||
| recovery path | recovery path | |||
| VI. The network enters a semi-stable state | VI. The network enters a semi-stable state | |||
| VII. Dynamic routing protocols converge after the fault, and a new | VII. Dynamic routing protocols converge after the fault, and a new | |||
| working path is calculated (based, for example, on some of the | working path is calculated (based, for example, on some of the | |||
| criteria mentioned earlier in Section 2.1.1). | criteria mentioned in Section 2.1.1). | |||
| VIII. A new working path is established between the PSL and the PML | VIII. A new working path is established between the PSL and the PML | |||
| (assumption is that PSL and PML have not changed) | (assumption is that PSL and PML have not changed) | |||
| IX. Traffic is switched over to the new working path. | IX. Traffic is switched over to the new working path. | |||
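
The example sequence I through IX can be written down as an ordered enumeration, which makes it possible to check whether an observed trace of events follows the same order. This is a sketch only; the event names and the helper function are hypothetical labels for the steps listed above.

```python
from enum import IntEnum


class Event(IntEnum):
    FAULT_OCCURS = 1                  # I
    FIS_INITIATED = 2                 # II
    FIS_AT_PSL = 3                    # III
    PROTECTION_SWITCH_INITIATED = 4   # IV
    TRAFFIC_ON_RECOVERY_PATH = 5      # V
    SEMI_STABLE_STATE = 6             # VI
    ROUTING_CONVERGED = 7             # VII
    NEW_WORKING_PATH_ESTABLISHED = 8  # VIII
    TRAFFIC_ON_NEW_WORKING_PATH = 9   # IX


def follows_example_sequence(trace):
    """True if trace matches the example order from step I, with no gaps."""
    trace = list(trace)
    return trace == list(Event)[:len(trace)]
```

A trace that stops after step V describes a protection switch that has not yet been followed by dynamic rerouting; it is still a valid prefix of the example sequence.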
| 2.3. Definitions and Terminology | 2.3. Definitions and Terminology | |||
| This document assumes the terminology given in [1], and, in addition, | This document assumes the terminology given in [1], and, in addition, | |||
| introduces the following new terms. | introduces the following new terms. | |||
| 2.3.1 General Recovery Terminology | 2.3.1 General Recovery Terminology | |||
| Rerouting | Rerouting | |||
| A recovery mechanism in which the recovery path or path segments are | A recovery mechanism in which the recovery path or path segments are | |||
| created dynamically after the detection of a fault on the working | created dynamically after the detection of a fault on the working | |||
| path. In other words, a recovery mechanism in which the recovery path | path. In other words, a recovery mechanism in which the recovery path | |||
| is not pre-established. | is not pre-established. | |||
| Protection Switching | Protection Switching | |||
| A recovery mechanism in which the recovery path or path segments are | A recovery mechanism in which the recovery path or path segments are | |||
| created prior to the detection of a fault on the working path. In | created prior to the detection of a fault on the working path. In | |||
| other words, a recovery mechanism in which the recovery path is pre- | other words, a recovery mechanism in which the recovery path is pre- | |||
| established. | established. | |||
| Working Path | Working Path | |||
| The protected path that carries traffic before the occurrence of a | The protected path that carries traffic before the occurrence of a | |||
| fault. The working path exists between a PSL and PML. The working | fault. The working path exists between a PSL and PML. The working | |||
| path can be of different kinds; a hop-by-hop routed path, a trunk, a | path can be of different kinds; a hop-by-hop routed path, a trunk, a | |||
| skipping to change at page 15, line 29 ¶ | skipping to change at page 15, line 7 ¶ | |||
| Path Continuity Test | Path Continuity Test | |||
| A test that verifies the integrity and continuity of a path or path | A test that verifies the integrity and continuity of a path or path | |||
| segment. The details of such a test are beyond the scope of this | segment. The details of such a test are beyond the scope of this | |||
| draft. (This could be accomplished, for example, by transmitting a | draft. (This could be accomplished, for example, by transmitting a | |||
| control message along the same links and nodes as the data traffic or | control message along the same links and nodes as the data traffic or | |||
| similarly could be measured by the absence of traffic and by | similarly could be measured by the absence of traffic and by | |||
| providing feedback.) | providing feedback.) | |||
| 2.3.2 Failure Terminology | 2.3.2 Failure Terminology | |||
| Path Failure (PF) | Path Failure (PF) | |||
| Path failure is a fault detected by MPLS-based recovery mechanisms, | Path failure is a fault detected by MPLS-based recovery mechanisms, | |||
| which is defined as the failure of the liveness message test or a path | which is defined as the failure of the liveness message test or a path | |||
| continuity test, which indicates that path connectivity is lost. | continuity test, which indicates that path connectivity is lost. | |||
| Path Degraded (PD) | Path Degraded (PD) | |||
| Path degraded is a fault detected by MPLS-based recovery mechanisms | Path degraded is a fault detected by MPLS-based recovery mechanisms | |||
| that indicates that the quality of the path is unacceptable. | that indicates that the quality of the path is unacceptable. | |||
| skipping to change at page 16, line 9 ¶ | skipping to change at page 15, line 39 ¶ | |||
| relayed by each intermediate LSR to its upstream or downstream | relayed by each intermediate LSR to its upstream or downstream | |||
| neighbor, until it reaches an LSR that is set up to perform MPLS | neighbor, until it reaches an LSR that is set up to perform MPLS | |||
| recovery. The FIS is transmitted periodically by the node/nodes | recovery. The FIS is transmitted periodically by the node/nodes | |||
| closest to the point of failure, for some configurable length of | closest to the point of failure, for some configurable length of | |||
| time. | time. | |||
| Fault Recovery Signal (FRS) | Fault Recovery Signal (FRS) | |||
| A signal that indicates a fault along a working path has been | A signal that indicates a fault along a working path has been | |||
| repaired. Again, like the FIS, it is relayed by each intermediate LSR | repaired. Again, like the FIS, it is relayed by each intermediate LSR | |||
| to its upstream or downstream neighbor, until it reaches the LSR that | to its upstream or downstream neighbor, until it reaches the LSR that | |||
| performs recovery of the original path. . The FRS is transmitted | performs recovery of the original path. The FRS is transmitted | |||
| periodically by the node/nodes closest to the point of failure, for | periodically by the node/nodes closest to the point of failure, for | |||
| some configurable length of time. | some configurable length of time. | |||
| 2.4. Abbreviations | 2.4. Abbreviations | |||
| FIS: Fault Indication Signal. | FIS: Fault Indication Signal. | |||
| FRS: Fault Recovery Signal. | FRS: Fault Recovery Signal. | |||
| LD: Link Degraded. | LD: Link Degraded. | |||
| LF: Link Failure. | LF: Link Failure. | |||
| PD: Path Degraded. | PD: Path Degraded. | |||
| skipping to change at page 17, line 14 ¶ | skipping to change at page 16, line 41 ¶ | |||
| paths may be automatically recovered upon a fault along one of the | paths may be automatically recovered upon a fault along one of the | |||
| working paths by distributing it among the remaining working paths). | working paths by distributing it among the remaining working paths). | |||
| Recoverable (MPLS-based recovery enabled): | Recoverable (MPLS-based recovery enabled): | |||
| This working path is recovered using one or more recovery paths, | This working path is recovered using one or more recovery paths, | |||
| either via rerouting or via protection switching. | either via rerouting or via protection switching. | |||
| 3.2. Initiation of Path Setup | 3.2. Initiation of Path Setup | |||
| There are three options for the initiation of the recovery path | There are three options for the initiation of the recovery path | |||
| setup. | setup. The active and recovery paths may be established by using | |||
| | either RSVP-TE [4][5] or CR-LDP [6]. | |||
| Pre-established: | Pre-established: | |||
| This is the same as the protection switching option. Here a recovery | This is the same as the protection switching option. Here a recovery | |||
| path(s) is established prior to any failure on the working path. The | path(s) is established prior to any failure on the working path. The | |||
| path selection can either be determined by an administrative | path selection can either be determined by an administrative | |||
| centralized tool, or chosen based on some algorithm implemented at | centralized tool, or chosen based on some algorithm implemented at | |||
| the PSL and possibly intermediate nodes. To guard against the | the PSL and possibly intermediate nodes. To guard against the | |||
| situation when the pre-established recovery path fails before or at | situation when the pre-established recovery path fails before or at | |||
| the same time as the working path, the recovery path should have | the same time as the working path, the recovery path should have | |||
| secondary configuration options as explained in Section 3.3 below. | secondary configuration options as explained in Section 3.3 below. | |||
| Pre Qualified: | Pre Qualified: | |||
| A pre-established path need not be created; it may be pre-qualified. | A pre-established path need not be created; it may be pre-qualified. | |||
| A pre-qualified recovery path is not created expressly for protecting | A pre-qualified recovery path is not created expressly for protecting | |||
| the working path, but instead is a path created for other purposes | the working path, but instead is a path created for other purposes | |||
| that is designated as a recovery path after determination that it is | that is designated as a recovery path after determining that it is an | |||
| an acceptable alternative for carrying the working path traffic. | acceptable alternative for carrying the working path traffic. | |||
| Variants include the case where an optical path or trail is | Variants include the case where an optical path or trail is | |||
| configured, but no switches are set. | configured, but no switches are set. | |||
| Established-on-Demand: | Established-on-Demand: | |||
| This is the same as the rerouting option. Here, a recovery path is | This is the same as the rerouting option. Here, a recovery path is | |||
| established after a failure on its working path has been detected and | established after a failure on its working path has been detected and | |||
| notified to the PSL. | notified to the PSL. | |||
| 3.3. Initiation of Resource Allocation | 3.3. Initiation of Resource Allocation | |||
| skipping to change at page 18, line 26 ¶ | skipping to change at page 17, line 56 ¶ | |||
| on the working path has been detected and notified to the PSL and | on the working path has been detected and notified to the PSL and | |||
| before the traffic on the working path is switched over to the | before the traffic on the working path is switched over to the | |||
| recovery path. | recovery path. | |||
| Note that under both the options above, depending on the amount of | Note that under both the options above, depending on the amount of | |||
| resources reserved on the recovery path, it could either be an | resources reserved on the recovery path, it could either be an | |||
| equivalent recovery path or a limited recovery path. | equivalent recovery path or a limited recovery path. | |||
| 3.4. Scope of Recovery | 3.4. Scope of Recovery | |||
| 3.4.1 Topology | 3.4.1 Topology | |||
| 1.1.1.1 Local Repair | ||||
| | 3.4.1.1 Local Repair | |||
| The intent of local repair is to protect against a link or neighbor | The intent of local repair is to protect against a link or neighbor | |||
| node fault and to minimize the amount of time required for failure | node fault and to minimize the amount of time required for failure | |||
| propagation. In local repair (also known as local recovery [10] [9]), | propagation. In local repair (also known as local recovery), the node | |||
| the node immediately upstream of the fault is the one to initiate | immediately upstream of the fault is the one to initiate recovery | |||
| recovery (either rerouting or protection switching). Local repair can | (either rerouting or protection switching). Local repair can be of | |||
| be of two types: | two types: | |||
| Link Recovery/Restoration | Link Recovery/Restoration | |||
| In this case, the recovery path may be configured to route around a | In this case, the recovery path may be configured to route around a | |||
| certain link deemed to be unreliable. If protection switching is | certain link deemed to be unreliable. If protection switching is | |||
| used, several recovery paths may be configured for one working path, | used, several recovery paths may be configured for one working path, | |||
| depending on the specific faulty link that each protects against. | depending on the specific faulty link that each protects against. | |||
| Alternatively, if rerouting is used, upon the occurrence of a fault | Alternatively, if rerouting is used, upon the occurrence of a fault | |||
| on the specified link each path is rebuilt such that it detours | on the specified link, each path is rebuilt such that it detours | |||
| around the faulty link. | around the faulty link. | |||
| In this case, the recovery path need only be disjoint from its | In this case, the recovery path need only be disjoint from its | |||
| working path at a particular link on the working path, and may have | working path at a particular link on the working path, and may have | |||
| overlapping segments with the working path. Traffic on the working | overlapping segments with the working path. Traffic on the working | |||
| path is switched over to an alternate path at the upstream LSR that | path is switched over to an alternate path at the upstream LSR that | |||
| connects to the failed link. This method is potentially the fastest | connects to the failed link. This method is potentially the fastest | |||
| to perform the switchover, and can be effective in situations where | to perform the switchover, and can be effective in situations where | |||
| certain path components are much more unreliable than others. | certain path components are much more unreliable than others. | |||
| Node Recovery/Restoration | Node Recovery/Restoration | |||
| In this case, the recovery path may be configured to route around a | In this case, the recovery path may be configured to route around a | |||
| neighbor node deemed to be unreliable. Thus the recovery path is | neighbor node deemed to be unreliable. Thus the recovery path is | |||
| disjoint from the working path only at a particular node and at links | disjoint from the working path only at a particular node and at links | |||
| associated with the working path at that node. Once again, the | associated with the working path at that node. Once again, the | |||
| traffic on the primary path is switched over to the recovery path at | traffic on the primary path is switched over to the recovery path at | |||
| the upstream LSR that directly connects to the failed node, and the | the upstream LSR that directly connects to the failed node, and the | |||
| recovery path shares overlapping portions with the working path. | recovery path shares overlapping portions with the working path. | |||
| 3.4.1.2 Global Repair | 1.1.1.2 Global Repair | |||
| The intent of global repair is to protect against any link or node | The intent of global repair is to protect against any link or node | |||
| fault on a path or on a segment of a path, with the obvious exception | fault on a path or on a segment of a path, with the obvious exception | |||
| of the faults occurring at the ingress node of the protected path | of the faults occurring at the ingress node of the protected path | |||
| segment. In global repair the PSL is usually distant from the failure | segment. In global repair the PSL is usually distant from the failure | |||
| and needs to be notified by a FIS. | and needs to be notified by a FIS. | |||
| In global repair also end-to end path recovery/restoration applies. | In global repair also, end-to-end path recovery/restoration applies. | |||
| In many cases, the recovery path can be made completely link and node | In many cases, the recovery path can be made completely link and node | |||
| disjoint with its working path. This has the advantage of protecting | disjoint with its working path. This has the advantage of protecting | |||
| against all link and node fault(s) on the working path (end-to-end | against all link and node fault(s) on the working path (end-to-end | |||
| path or path segment). | path or path segment). | |||
| However, it is in some cases slower than local repair since it takes | However, it may, in some cases, be slower than local repair since the | |||
| longer for the fault notification message to get to the PSL to | fault notification message must now travel to the PSL to trigger the | |||
| trigger the recovery action. | recovery action. | |||
| 3.4.1.3 Alternate Egress Repair | 1.1.1.3 Alternate Egress Repair | |||
| It is possible to restore service without specifically recovering the | It is possible to restore service without specifically recovering the | |||
| faulted path. | faulted path. | |||
| For example, for best effort IP service it is possible to select a | For example, for best effort IP service it is possible to select a | |||
| recovery path that has a different egress point from the working path | recovery path that has a different egress point from the working path | |||
| (i.e., there is no PML). The recovery path egress must simply be a | (i.e., there is no PML). The recovery path egress must simply be a | |||
| router that is acceptable for forwarding the FEC carried by the | router that is acceptable for forwarding the FEC carried by the | |||
| working path (without creating looping). In an engineering context, | working path (without creating looping). In an engineering context, | |||
| specific alternative FEC/LSP mappings with alternate egresses can be | specific alternative FEC/LSP mappings with alternate egresses can be | |||
| formed. | formed. | |||
| This may simplify enhancing the reliability of implicitly constructed | This may simplify enhancing the reliability of implicitly constructed | |||
| MPLS topologies. A PSL may qualify LSP/FEC bindings as candidate | MPLS topologies. A PSL may qualify LSP/FEC bindings as candidate | |||
| recovery paths as simply link and node disjoint with the immediate | recovery paths as simply link and node disjoint with the immediate | |||
| downstream LSR of the working path. | downstream LSR of the working path. | |||
| 3.4.1.4 Multi-Layer Repair | 1.1.1.4 Multi-Layer Repair | |||
| Multi-layer repair broadens the network designer's tool set for those | Multi-layer repair broadens the network designer's tool set for those | |||
| cases where multiple network layers can be managed together to | cases where multiple network layers can be managed together to | |||
| achieve overall network goals. Specific criteria for determining | achieve overall network goals. Specific criteria for determining | |||
| when multi-layer repair is appropriate are beyond the scope of this | when multi-layer repair is appropriate are beyond the scope of this | |||
| draft. | draft. | |||
| 3.4.1.5 Concatenated Protection Domains | 1.1.1.5 Concatenated Protection Domains | |||
| A given service may cross multiple networks and these may employ | A given service may cross multiple networks and these may employ | |||
| different recovery mechanisms. It is possible to concatenate | different recovery mechanisms. It is possible to concatenate | |||
| protection domains so that service recovery can be provided end-to- | protection domains so that service recovery can be provided end-to- | |||
| end. It is considered that the recovery mechanisms in different | end. It is considered that the recovery mechanisms in different | |||
| domains may operate autonomously, and that multiple points of | domains may operate autonomously, and that multiple points of | |||
| attachment may be used between domains (to ensure there is no single | attachment may be used between domains (to ensure there is no single | |||
| point of failure). Alternate egress repair requires management of | point of failure). Alternate egress repair requires management of | |||
| concatenated domains in that an explicit MPLS point of failure (the | concatenated domains in that an explicit MPLS point of failure (the | |||
| PML) is by definition excluded. Details of concatenated protection | PML) is by definition excluded. Details of concatenated protection | |||
| domains are beyond the scope of this draft. | domains are beyond the scope of this draft. | |||
| 3.4.2 Path Mapping | 3.4.2 Path Mapping | |||
| Path mapping refers to the methods of mapping traffic from a faulty | Path mapping refers to the methods of mapping traffic from a faulty | |||
| working path on to the recovery path. There are several options for | working path on to the recovery path. There are several options for | |||
| this, as described below. Note that the options below should be | this, as described below. Note that the options below should be | |||
| viewed as atomic terms that only describe how the working and | viewed as atomic terms that only describe how the working and | |||
| protection paths are mapped to each other. The issues of resource | protection paths are mapped to each other. The issues of resource | |||
| reservation along these paths, and how switchover is actually | reservation along these paths, and how switchover is actually | |||
| performed lead to the more commonly used composite terms, such as 1+1 | performed lead to the more commonly used composite terms, such as 1+1 | |||
| and 1:1 protection, which were described in Section 2.1. | and 1:1 protection, which were described in Section 2.1. | |||
| skipping to change at page 20, line 46 ¶ | skipping to change at page 20, line 22 ¶ | |||
| recovery, the details of which are beyond the scope of this draft. | recovery, the details of which are beyond the scope of this draft. | |||
| n-to-m Protection | n-to-m Protection | |||
| In n-to-m protection, up to n working paths are protected using m | In n-to-m protection, up to n working paths are protected using m | |||
| recovery paths. Once again, if the intent is to protect against any | recovery paths. Once again, if the intent is to protect against any | |||
| single fault on any of the n working paths, the n working paths and | single fault on any of the n working paths, the n working paths and | |||
| the m recovery paths should be diversely routed between the same PSL | the m recovery paths should be diversely routed between the same PSL | |||
| and PML. In some cases, handshaking between PSL and PML may be | and PML. In some cases, handshaking between PSL and PML may be | |||
| required to complete the recovery, the details of which are beyond | required to complete the recovery, the details of which are beyond | |||
| the scope of this draft. N-to-m protection is for further study. | the scope of this draft. n-to-m protection is for further study. | |||
| Split Path Protection | Split Path Protection | |||
| In split path protection, multiple recovery paths are allowed to | In split path protection, multiple recovery paths are allowed to | |||
| carry the traffic of a working path based on a certain configurable | carry the traffic of a working path based on a certain configurable | |||
| load splitting ratio. This is especially useful when no single | load splitting ratio. This is especially useful when no single | |||
| recovery path can be found that can carry the entire traffic of the | recovery path can be found that can carry the entire traffic of the | |||
| working path in case of a fault. Split path protection may require | working path in case of a fault. Split path protection may require | |||
| handshaking between the PSL and the PML(s), and may require the | handshaking between the PSL and the PML(s), and may require the | |||
| PML(s) to correlate the traffic arriving on multiple recovery paths | PML(s) to correlate the traffic arriving on multiple recovery paths | |||
| with the working path. Although this is an attractive option, the | with the working path. Although this is an attractive option, the | |||
| details of split path protection are beyond the scope of this draft, | details of split path protection are beyond the scope of this draft, | |||
| and are for further study. | and are for further study. | |||
| 3.4.3 Bypass Tunnels | 3.4.3 Bypass Tunnels | |||
| It may be convenient, in some cases, to create a "bypass tunnel" for | It may be convenient, in some cases, to create a "bypass tunnel" for | |||
| a PPG between a PSL and PML, thereby allowing multiple recovery paths | a PPG between a PSL and PML, thereby allowing multiple recovery paths | |||
| to be transparent to intervening LSRs [8]. In this case, one LSP | to be transparent to intervening LSRs [2]. In this case, one LSP | |||
| (the tunnel) is established between the PSL and PML following an | (the tunnel) is established between the PSL and PML following an | |||
| acceptable route and a number of recovery paths are supported through | acceptable route and a number of recovery paths are supported through | |||
| the tunnel via label stacking. A bypass tunnel can be used with any | the tunnel via label stacking. A bypass tunnel can be used with any | |||
| of the path mapping options discussed in the previous section. | of the path mapping options discussed in the previous section. | |||
| As with recovery paths, the bypass tunnel may or may not have | As with recovery paths, the bypass tunnel may or may not have | |||
| resource reservations sufficient to provide recovery without service | resource reservations sufficient to provide recovery without service | |||
| degradation. It is possible that the bypass tunnel may have | degradation. It is possible that the bypass tunnel may have | |||
| sufficient resources to recover some number of working paths, but not | sufficient resources to recover some number of working paths, but not | |||
| all at the same time. If the number of recovery paths carrying | all at the same time. If the number of recovery paths carrying | |||
| traffic in the tunnel at any given time is restricted, this is | traffic in the tunnel at any given time is restricted, this is | |||
| similar to the 1 to n or m to n protection cases mentioned in Section | similar to the n-to-1 or n-to-m protection cases mentioned in Section | |||
| 3.4.2. | 3.4.2. | |||
| 3.4.4 Recovery Granularity | 3.4.4 Recovery Granularity | |||
| Another dimension of recovery considers the amount of traffic | Another dimension of recovery considers the amount of traffic | |||
| requiring protection. This may range from a fraction of a path to a | requiring protection. This may range from a fraction of a path to a | |||
| bundle of paths. | bundle of paths. | |||
| 3.4.4.1 Selective Traffic Recovery | 1.1.1.6 Selective Traffic Recovery | |||
| This option allows for the protection of a fraction of traffic within | This option allows for the protection of a fraction of traffic within | |||
| the same path. The portion of the traffic on an individual path that | the same path. The portion of the traffic on an individual path that | |||
| requires protection is called a protected traffic portion (PTP). A | requires protection is called a protected traffic portion (PTP). A | |||
| single path may carry different classes of traffic, with different | single path may carry different classes of traffic, with different | |||
| protection requirements. The protected portion of this traffic may be | protection requirements. The protected portion of this traffic may be | |||
| identified by its class, for example, via the EXP bits in the MPLS | identified by its class, for example, via the EXP bits in the MPLS | |||
| shim header or via the priority bit in the ATM header. | shim header or via the priority bit in the ATM header. | |||
| 3.4.4.2 Bundling | 1.1.1.7 Bundling | |||
| Bundling is a technique used to group multiple working paths together | Bundling is a technique used to group multiple working paths together | |||
| in order to recover them simultaneously. The logical bundling of | in order to recover them simultaneously. The logical bundling of | |||
| multiple working paths requiring protection, each of which is routed | multiple working paths requiring protection, each of which is routed | |||
| identically between a PSL and a PML, is called a protected path group | identically between a PSL and a PML, is called a protected path group | |||
| (PPG). When a fault occurs on the working path carrying the PPG, the | (PPG). When a fault occurs on the working path carrying the PPG, the | |||
| PPG as a whole can be protected either by being switched to a bypass | PPG as a whole can be protected either by being switched to a bypass | |||
| tunnel or by being switched to a recovery path. | tunnel or by being switched to a recovery path. | |||
| 3.4.5 Recovery Path Resource Use | 3.4.5 Recovery Path Resource Use | |||
| In the case of pre-reserved recovery paths, there is the question of | In the case of pre-reserved recovery paths, there is the question of | |||
| what use these resources may be put to when the recovery path is not | what use these resources may be put to when the recovery path is not | |||
| in use. There are two options: | in use. There are two options: | |||
| Dedicated-resource: | Dedicated-resource: | |||
| If the recovery path resources are dedicated, they may not be used | If the recovery path resources are dedicated, they may not be used | |||
| for anything except carrying the working traffic. For example, in | for anything except carrying the working traffic. For example, in | |||
| the case of 1+1 protection, the working traffic is always carried on | the case of 1+1 protection, the working traffic is always carried on | |||
| the recovery path. Even if the recovery path is not always carrying | the recovery path. Even if the recovery path is not always carrying | |||
| the working traffic, it may not be possible or desirable to allow | the working traffic, it may not be possible or desirable to allow | |||
| skipping to change at page 22, line 27 ¶ | skipping to change at page 21, line 56 ¶ | |||
| If the recovery path only carries the working traffic when the | If the recovery path only carries the working traffic when the | |||
| working path fails, then it is possible to allow extra traffic to use | working path fails, then it is possible to allow extra traffic to use | |||
| the reserved resources at other times. Extra traffic is, by | the reserved resources at other times. Extra traffic is, by | |||
| definition, traffic that can be displaced (without violating service | definition, traffic that can be displaced (without violating service | |||
| agreements) whenever the recovery path resources are needed for | agreements) whenever the recovery path resources are needed for | |||
| carrying the working path traffic. | carrying the working path traffic. | |||
| Shared-resource: | Shared-resource: | |||
| A shared recovery resource is dedicated for use by multiple primary | A shared recovery resource is dedicated for use by multiple primary | |||
| resources that (according to SRLGs) are not expected to fail | resources that (according to SRLGs) are not expected to fail | |||
| simultaneously. Determining what resources that can be shared can be | simultaneously. | |||
| accomplished by offline analysis or by techniques described in [14]. | ||||
| 3.5. Fault Detection | 3.5. Fault Detection | |||
| MPLS recovery is initiated after the detection of either a lower | MPLS recovery is initiated after the detection of either a lower | |||
| layer fault or a fault at the IP layer or in the operation of MPLS- | layer fault or a fault at the IP layer or in the operation of MPLS- | |||
| based mechanisms. We consider four classes of impairments: Path | based mechanisms. We consider four classes of impairments: Path | |||
| Failure, Path Degraded, Link Failure, and Link Degraded. | Failure, Path Degraded, Link Failure, and Link Degraded. | |||
| Path Failure (PF) is a fault that indicates to an MPLS-based recovery | Path Failure (PF) is a fault that indicates to an MPLS-based recovery | |||
| scheme that the connectivity of the path is lost. This may be | scheme that the connectivity of the path is lost. This may be | |||
| detected by a path continuity test between the PSL and PML. Some, | detected by a path continuity test between the PSL and PML. Some, | |||
| and perhaps the most common, path failures may be detected using a | and perhaps the most common, path failures may be detected using a | |||
| link probing mechanism between neighbor LSRs. An example of a probing | link probing mechanism between neighbor LSRs. An example of a probing | |||
| mechanism is a liveness message that is exchanged periodically along | mechanism is a liveness message that is exchanged periodically along | |||
| the working path between peer LSRs. For either a link probing | the working path between peer LSRs [3]. For either a link probing | |||
| mechanism or path continuity test to be effective, the test message | mechanism or path continuity test to be effective, the test message | |||
| must be guaranteed to follow the same route as the working or | must be guaranteed to follow the same route as the working or | |||
| recovery path, over the segment being tested. In addition, the path | recovery path, over the segment being tested. In addition, the path | |||
| continuity test must take the path merge points into consideration. | continuity test must take the path merge points into consideration. | |||
| In the case of a bi-directional link implemented as two | In the case of a bi-directional link implemented as two | |||
| unidirectional links, path failure could mean that either one or both | unidirectional links, path failure could mean that either one or both | |||
| unidirectional links are damaged. | unidirectional links are damaged. | |||
| Path Degraded (PD) is a fault that indicates to MPLS-based recovery | Path Degraded (PD) is a fault that indicates to MPLS-based recovery | |||
| schemes/mechanisms that the path has connectivity, but that the | schemes/mechanisms that the path has connectivity, but that the | |||
| skipping to change at page 23, line 42 ¶ | skipping to change at page 23, line 15 ¶ | |||
| traffic on the working path that is affected by the fault. This | traffic on the working path that is affected by the fault. This | |||
| notification is relayed hop-by-hop by each subsequent LSR to its | notification is relayed hop-by-hop by each subsequent LSR to its | |||
| upstream neighbor, until it eventually reaches a PSL. A PSL is the | upstream neighbor, until it eventually reaches a PSL. A PSL is the | |||
| only LSR that can terminate the FIS and initiate a protection switch | only LSR that can terminate the FIS and initiate a protection switch | |||
| of the working path to a recovery path. | of the working path to a recovery path. | |||
| Since the FIS is a control message, it should be transmitted with | Since the FIS is a control message, it should be transmitted with | |||
| high priority to ensure that it propagates rapidly towards the | high priority to ensure that it propagates rapidly towards the | |||
| affected PSL(s). Depending on how fault notification is configured in | affected PSL(s). Depending on how fault notification is configured in | |||
| the LSRs of an MPLS domain, the FIS could be sent either as a Layer 2 | the LSRs of an MPLS domain, the FIS could be sent either as a Layer 2 | |||
| or Layer 3 packet [11]. The use of a Layer 2-based notification | or Layer 3 packet [3]. The use of a Layer 2-based notification | |||
| requires a Layer 2 path direct to the PSL. An example of a FIS could | requires a Layer 2 path direct to the PSL. An example of a FIS could | |||
| be the liveness message sent by a downstream LSR to its upstream | be the liveness message sent by a downstream LSR to its upstream | |||
| neighbor, with an optional fault notification field set or it can be | neighbor, with an optional fault notification field set or it can be | |||
| implicitly denoted by a teardown message. Alternatively, it could be | implicitly denoted by a teardown message. Alternatively, it could be | |||
| a separate fault notification packet. The intermediate LSR should | a separate fault notification packet. The intermediate LSR should | |||
| identify which of its incoming links (upstream LSRs) to propagate the | identify which of its incoming links (upstream LSRs) to propagate the | |||
| FIS on. In the case of 1+1 protection, the FIS should also be sent | FIS on. In the case of 1+1 protection, the FIS should also be sent | |||
| downstream to the PML where the recovery action is taken. | downstream to the PML where the recovery action is taken. | |||
| 3.7. Switch-Over Operation | 3.7. Switch-Over Operation | |||
| 3.7.1 Recovery Trigger | 3.7.1 Recovery Trigger | |||
| The activation of an MPLS protection switch following the detection | The activation of an MPLS protection switch following the detection | |||
| or notification of a fault requires a trigger mechanism at the PSL. | or notification of a fault requires a trigger mechanism at the PSL. | |||
| MPLS protection switching may be initiated due to automatic inputs or | MPLS protection switching may be initiated due to automatic inputs or | |||
| external commands. The automatic activation of an MPLS protection | external commands. The automatic activation of an MPLS protection | |||
| switch results from a response to defect or fault conditions | switch results from a response to defect or fault conditions | |||
| detected at the PSL or to fault notifications received at the PSL. It | detected at the PSL or to fault notifications received at the PSL. It | |||
| is possible that the fault detection and trigger mechanisms may be | is possible that the fault detection and trigger mechanisms may be | |||
| combined, as is the case when a PF, PD, LF, or LD is detected at a | combined, as is the case when a PF, PD, LF, or LD is detected at a | |||
| PSL and triggers a protection switch to the recovery path. In most | PSL and triggers a protection switch to the recovery path. In most | |||
| cases, however, the detection and trigger mechanisms are distinct, | cases, however, the detection and trigger mechanisms are distinct, | |||
| involving the detection of a fault at some intermediate LSR followed by | involving the detection of a fault at some intermediate LSR followed by | |||
| the propagation of a fault notification back to the PSL via the FIS, | the propagation of a fault notification back to the PSL via the FIS, | |||
| skipping to change at page 24, line 32 ¶ | skipping to change at page 24, line 5 ¶ | |||
| transmitter failures, or LSR fabric failures), as does the LF fault, | transmitter failures, or LSR fabric failures), as does the LF fault, | |||
| with the difference that the LF is a lower layer impairment that may | with the difference that the LF is a lower layer impairment that may | |||
| be communicated to MPLS-based recovery mechanisms. The PD (or LD) | be communicated to MPLS-based recovery mechanisms. The PD (or LD) | |||
| fault, on the other hand, applies to soft defects (excessive errors | fault, on the other hand, applies to soft defects (excessive errors | |||
| due to noise on the link, for instance). The PD (or LD) results in a | due to noise on the link, for instance). The PD (or LD) results in a | |||
| fault declaration only when the percentage of lost packets exceeds a | fault declaration only when the percentage of lost packets exceeds a | |||
| given threshold, which is provisioned and may be set based on the | given threshold, which is provisioned and may be set based on the | |||
| service level agreement(s) in effect between a service provider and a | service level agreement(s) in effect between a service provider and a | |||
| customer. | customer. | |||
| 3.7.2 Recovery Action | 3.7.2 Recovery Action | |||
| After a fault is detected or FIS is received by the PSL, the recovery | After a fault is detected or FIS is received by the PSL, the recovery | |||
| action involves either a rerouting or protection switching operation. | action involves either a rerouting or protection switching operation. | |||
| In both scenarios, the next hop label forwarding entry for a recovery | In both scenarios, the next hop label forwarding entry for a recovery | |||
| path is bound to the working path. | path is bound to the working path. | |||
| 3.8. Post Recovery Operation | 3.8. Post Recovery Operation | |||
| When traffic is flowing on the recovery path, a decision can be made | When traffic is flowing on the recovery path, a decision can be made | |||
| whether to let the traffic remain on the recovery path and consider it | whether to let the traffic remain on the recovery path and consider it | |||
| as a new working path or to switch to the old or a new working | as a new working path or to switch to the old or a new working | |||
| path. This post recovery operation has two styles, one where the | path. This post recovery operation has two styles, one where the | |||
| protection counterparts, i.e., the working and recovery paths, are | protection counterparts, i.e., the working and recovery paths, are | |||
| fixed or "pinned" to their routes and one in which the PSL or other | fixed or "pinned" to their routes and one in which the PSL or other | |||
| network entity with real time knowledge of failure dynamically | network entity with real time knowledge of failure dynamically | |||
| performs re-establishment or controlled rearrangement of the paths | performs re-establishment or controlled rearrangement of the paths | |||
| comprising the protected service. | comprising the protected service. | |||
| 3.8.1 Fixed Protection Counterparts | 3.8.1 Fixed Protection Counterparts | |||
| For fixed protection counterparts the PSL will be pre-configured with | For fixed protection counterparts the PSL will be pre-configured with | |||
| the appropriate behavior to take when the original fixed path is | the appropriate behavior to take when the original fixed path is | |||
| restored to service. The choices are revertive and non-revertive | restored to service. The choices are revertive and non-revertive | |||
| mode. The choice will typically be dependent on relative costs of the | mode. The choice will typically be dependent on relative costs of the | |||
| working and protection paths, and the tolerance of the service to the | working and protection paths, and the tolerance of the service to the | |||
| effects of switching paths yet again. These protection modes indicate | effects of switching paths yet again. These protection modes indicate | |||
| whether or not there is a preferred path for the protected traffic. | whether or not there is a preferred path for the protected traffic. | |||
| 3.8.1.1 Revertive Mode | 1.1.1.8 Revertive Mode | |||
| If the working path is always the preferred path, this path will be | If the working path is always the preferred path, this path will be | |||
| used whenever it is available. Thus, in the event of a fault on this | used whenever it is available. Thus, in the event of a fault on this | |||
| path, its unused resources will not be reclaimed by the network on | path, its unused resources will not be reclaimed by the network on | |||
| failure. If the working path has a fault, traffic is switched to the | failure. If the working path has a fault, traffic is switched to the | |||
| recovery path. In the revertive mode of operation, when the | recovery path. In the revertive mode of operation, when the | |||
| preferred path is restored the traffic is automatically switched back | preferred path is restored the traffic is automatically switched back | |||
| to it. | to it. | |||
| There are a number of implications to pinned working and recovery | There are a number of implications to pinned working and recovery | |||
| paths: | paths: | |||
| - upon failure, once traffic has moved to the recovery path, it is | - upon failure, once traffic has moved to the recovery path, it is | |||
| unprotected until such time as the path defect in the original | unprotected until such time as the path defect in the original | |||
| working path is repaired and that path restored to service. | working path is repaired and that path restored to service. | |||
| - upon failure, once traffic has moved to the recovery path, the resources | - upon failure, once traffic has moved to the recovery path, the resources | |||
| associated with the original path remain reserved. | associated with the original path remain reserved. | |||
| 3.8.1.2 Non-revertive Mode | 1.1.1.9 Non-revertive Mode | |||
| In the non-revertive mode of operation, there is no preferred path or | In the non-revertive mode of operation, there is no preferred path or | |||
| it may be desirable to minimize further disruption of the service | it may be desirable to minimize further disruption of the service | |||
| brought on by a revertive switching operation. A switch-back to the | brought on by a revertive switching operation. A switch-back to the | |||
| original working path is not desired or not possible since the | original working path is not desired or not possible since the | |||
| original path may no longer exist after the occurrence of a fault on | original path may no longer exist after the occurrence of a fault on | |||
| that path. | that path. | |||
| If there is a fault on the working path, traffic is switched to the | If there is a fault on the working path, traffic is switched to the | |||
| recovery path. When or if the faulty path (the originally working | recovery path. When or if the faulty path (the originally working | |||
| path) is restored, it may become the recovery path (either by | path) is restored, it may become the recovery path (either by | |||
| skipping to change at page 25, line 49 ¶ | skipping to change at page 25, line 22 ¶ | |||
| In the non-revertive mode of operation, the working traffic may or | In the non-revertive mode of operation, the working traffic may or | |||
| may not be restored to a new optimal working path or to the original | may not be restored to a new optimal working path or to the original | |||
| working path anyway. This is because it might be useful, in some | working path anyway. This is because it might be useful, in some | |||
| cases, to either: (a) administratively perform a protection switch | cases, to either: (a) administratively perform a protection switch | |||
| back to the original working path after gaining further assurances | back to the original working path after gaining further assurances | |||
| about the integrity of the path, or (b) it may be acceptable to | about the integrity of the path, or (b) it may be acceptable to | |||
| continue operation on the recovery path, or (c) it may be desirable | continue operation on the recovery path, or (c) it may be desirable | |||
| to move the traffic to a new optimal working path that is calculated | to move the traffic to a new optimal working path that is calculated | |||
| based on network topology and network policies. | based on network topology and network policies. | |||
| 3.8.2 Dynamic Protection Counterparts | 3.8.2 Dynamic Protection Counterparts | |||
| For Dynamic protection counterparts when the traffic is switched over | For dynamic protection counterparts when the traffic is switched over | |||
| to a recovery path, the association between the original working path | to a recovery path, the association between the original working path | |||
| and the recovery path may no longer exist, since the original path | and the recovery path may no longer exist, since the original path | |||
| itself may no longer exist after the fault. Instead, when the network | itself may no longer exist after the fault. Instead, when the network | |||
| reaches a stable state following routing convergence, the recovery | reaches a stable state following routing convergence, the recovery | |||
| path may be switched over to a different preferred path, either | path may be switched over to a different preferred path, either | |||
| through optimization based on the new network topology and associated | through optimization based on the new network topology and associated | |||
| information or based on pre-configured information. | information or based on pre-configured information. | |||
| Dynamic protection counterparts assume that upon failure, the PSL or | Dynamic protection counterparts assume that upon failure, the PSL or | |||
| other network entity will establish new working paths if another | other network entity will establish new working paths if another | |||
| switch-over is to be performed. | switch-over is to be performed. | |||
| 3.8.3 Restoration and Notification | 3.8.3 Restoration and Notification | |||
| MPLS restoration deals with returning the working traffic from the | MPLS restoration deals with returning the working traffic from the | |||
| recovery path to the original or a new working path. Reversion is | recovery path to the original or a new working path. Reversion is | |||
| performed by the PSL either upon receiving notification, via FRS, | performed by the PSL either upon receiving notification, via FRS, | |||
| that the working path is repaired, or upon receiving notification | that the working path is repaired, or upon receiving notification | |||
| that a new working path is established. | that a new working path is established. | |||
| For fixed counterparts in revertive mode, an LSR that detected the | For fixed counterparts in revertive mode, an LSR that detected the | |||
| fault on the working path also detects the restoration of the working | fault on the working path also detects the restoration of the working | |||
| path. If the working path had experienced a LF defect, the LSR | path. If the working path had experienced a LF defect, the LSR | |||
| skipping to change at page 26, line 36 ¶ | skipping to change at page 26, line 9 ¶ | |||
| interface. Alternatively, a lower layer that no longer detects a LF | interface. Alternatively, a lower layer that no longer detects a LF | |||
| defect may inform the MPLS-based recovery mechanisms at the LSR that | defect may inform the MPLS-based recovery mechanisms at the LSR that | |||
| the link to its peer LSR is operational. | the link to its peer LSR is operational. | |||
| The LSR then transmits FRS to its upstream LSR(s) that were | The LSR then transmits FRS to its upstream LSR(s) that were | |||
| transmitting traffic on the working path. At the point the PSL | transmitting traffic on the working path. At the point the PSL | |||
| receives the FRS, it switches the working traffic back to the | receives the FRS, it switches the working traffic back to the | |||
| original working path. | original working path. | |||
| A similar scheme applies to dynamic counterparts, where, e.g., an update of | A similar scheme applies to dynamic counterparts, where, e.g., an update of | |||
| topology and/or network convergence may trigger installation or setup | topology and/or network convergence may trigger installation or setup | |||
| of new working paths and send notification to the PSL to perform a | of new working paths and may send notification to the PSL to perform | |||
| switch over. | a switch over. | |||
| We note that if there is a way to transmit fault information back | We note that if there is a way to transmit fault information back | |||
| along a recovery path towards a PSL and if the recovery path is an | along a recovery path towards a PSL and if the recovery path is an | |||
| equivalent working path, it is possible for the working path and its | equivalent working path, it is possible for the working path and its | |||
| recovery path to exchange roles once the original working path is | recovery path to exchange roles once the original working path is | |||
| repaired following a fault. This is because, in that case, the | repaired following a fault. This is because, in that case, the | |||
| recovery path effectively becomes the working path, and the restored | recovery path effectively becomes the working path, and the restored | |||
| working path functions as a recovery path for the original recovery | working path functions as a recovery path for the original recovery | |||
| path. This is important, since it affords the benefits of non- | path. This is important, since it affords the benefits of non- | |||
| revertive switch operation outlined in Section 3.8.1, without leaving | revertive switch operation outlined in Section 3.8.1, without leaving | |||
| the recovery path unprotected. | the recovery path unprotected. | |||
| 3.8.4 Reverting to Preferred Path (or Controlled Rearrangement) | 3.8.4 Reverting to Preferred Path (or Controlled Rearrangement) | |||
| In the revertive mode, a "make before break" restoration switching | In the revertive mode, a "make before break" restoration switching | |||
| can be used, which is less disruptive than performing protection | can be used, which is less disruptive than performing protection | |||
| switching upon the occurrence of network impairments. This will | switching upon the occurrence of network impairments. This will | |||
| minimize both packet loss and packet reordering. The controlled | minimize both packet loss and packet reordering. The controlled | |||
| rearrangement of paths can also be used to satisfy traffic | rearrangement of paths can also be used to satisfy traffic | |||
| engineering requirements for load balancing across an MPLS domain. | engineering requirements for load balancing across an MPLS domain. | |||
| 3.9. Performance | 3.9. Performance | |||
| skipping to change at page 27, line 38 ¶ | skipping to change at page 27, line 10 ¶ | |||
| III. Preemption Attribute: | III. Preemption Attribute: | |||
| The recovery path can have the same preemption attribute as the | The recovery path can have the same preemption attribute as the | |||
| working path or a lower one. | working path or a lower one. | |||
| 4. MPLS Recovery Features | 4. MPLS Recovery Features | |||
| The following features are desirable from an operational point of | The following features are desirable from an operational point of | |||
| view: | view: | |||
| I. It is highly desirable that MPLS recovery provides an option to | I. It is desirable that MPLS recovery provides an option to identify | |||
| identify protection groups (PPGs) and protection portions (PTPs). | protection groups (PPGs) and protection portions (PTPs). | |||
| II. Each PSL should be capable of performing MPLS recovery upon the | II. Each PSL should be capable of performing MPLS recovery upon the | |||
| detection of the impairments or upon receipt of notifications of | detection of the impairments or upon receipt of notifications of | |||
| impairments. | impairments. | |||
| III. An MPLS recovery method should not preclude manual protection | III. An MPLS recovery method should not preclude manual protection | |||
| switching commands. This implies that it would be possible under | switching commands. This implies that it would be possible under | |||
| administrative commands to transfer traffic from a working path to a | administrative commands to transfer traffic from a working path to a | |||
| recovery path, or to transfer traffic from a recovery path to a | recovery path, or to transfer traffic from a recovery path to a | |||
| working path, once the working path becomes operational following a | working path, once the working path becomes operational following a | |||
| skipping to change at page 29, line 18 ¶ | skipping to change at page 28, line 39 ¶ | |||
| example, a recovery path may take many more hops than the working | example, a recovery path may take many more hops than the working | |||
| path. This may be dependent on the recovery path selection | path. This may be dependent on the recovery path selection | |||
| algorithms. | algorithms. | |||
| Quality of Protection | Quality of Protection | |||
| Recovery schemes can be considered to encompass a spectrum of "packet | Recovery schemes can be considered to encompass a spectrum of "packet | |||
| survivability" which may range from "relative" to "absolute". | survivability" which may range from "relative" to "absolute". | |||
| Relative survivability may mean that the packet is on an equal | Relative survivability may mean that the packet is on an equal | |||
| footing with other traffic of, as an example, the same diff-serv code | footing with other traffic of, as an example, the same diff-serv code | |||
| point (DSCP) in contending for the surviving network resources. | point (DSCP) in contending for the resources of the portion of the | |||
| Absolute survivability may mean that the survivability of the | network that survives the failure. Absolute survivability may mean | |||
| protected traffic has explicit guarantees. | that the survivability of the protected traffic has explicit | |||
| guarantees. | ||||
| Re-ordering | Re-ordering | |||
| Recovery schemes may introduce re-ordering of packets. Also the | Recovery schemes may introduce re-ordering of packets. Also the | |||
| action of putting traffic back on preferred paths might cause packet | action of putting traffic back on preferred paths might cause packet | |||
| re-ordering. | re-ordering. | |||
| State Overhead | State Overhead | |||
| As the number of recovery paths in a protection plan grows, the state | As the number of recovery paths in a protection plan grows, the state | |||
| skipping to change at page 30, line 45 ¶ | skipping to change at page 30, line 15 ¶ | |||
| 8. Acknowledgements | 8. Acknowledgements | |||
| We would like to thank members of the MPLS WG mailing list for their | We would like to thank members of the MPLS WG mailing list for their | |||
| suggestions on the earlier versions of this draft. In particular, we thank | suggestions on the earlier versions of this draft. In particular, we thank | |||
| Bora Akyol, Dave Allan, Neil Harrison, and Dave Danenberg, whose | Bora Akyol, Dave Allan, Neil Harrison, and Dave Danenberg, whose | |||
| suggestions and comments were very helpful in revising the document. | suggestions and comments were very helpful in revising the document. | |||
| The editors would like to give very special thanks to Curtis | The editors would like to give very special thanks to Curtis | |||
| Villamizar for his careful and extremely thorough reading of the | Villamizar for his careful and extremely thorough reading of the | |||
| document and for taking the time to provide numerous suggestions, | document and for taking the time to provide numerous suggestions, | |||
| which were very helpful in our latest revision of the document. | which were very helpful in our latest revision of the document, and | |||
| to Seyhan Civanlar, who provided initial input on the rerouting | ||||
| section. | ||||
| 9. Authors' Addresses | 9. Authors' Addresses | |||
| Vishal Sharma Ben Mack-Crane | Vishal Sharma Fiffi Hellstrand | |||
| Metanoia, Inc. Tellabs Operations, Inc. | Metanoia, Inc. Nortel Networks | |||
| 335 Elan Village Ln., Unit 203 4951 Indiana Avenue | 305 Elan Village Ln., Unit 121 St Eriksgatan 115 | |||
| San Jose, CA 95134 Lisle, IL 60532 | San Jose, CA 95134 PO Box 6701 | |||
| Phone: 408-943-1794 Phone: 630-512-7255 | Phone: (408) 955-0910 113 85 Stockholm, Sweden | |||
| v.sharma@ieee.org Ben.Mack-Crane@tellabs.com | v.sharma@ieee.org Phone: +46 8 5088 3687 | |||
| Srinivas Makam Ken Owens | Fiffi@nortelnetworks.com | |||
| Tellabs Operations, Inc. Erlang Technology, Inc. | ||||
| Lisle, IL 60532 St. Louis, MO 63119 | ||||
| Phone: 630-512-7217 Phone: 314-918-1579 | ||||
| Srinivas.Makam@tellabs.com keno@erlangtech.com | ||||
| Changcheng Huang Fiffi Hellstrand | Ben Mack-Crane Srinivas Makam | |||
| Dept. of Systems & Computer Engg. Nortel Networks | Tellabs Operations, Inc. Smakam60540@yahoo.com | |||
| Carleton University St Eriksgatan 115 | 4951 Indiana Avenue | |||
| Minto Center, Rm. 3082 PO Box 6701 | Lisle, IL 60532 | |||
| 1125 Colonial By Drive 113 85 Stockholm, Sweden | Phone: (630) 512-7255 | |||
| Ottawa, Ontario K1S 5B6, Canada Phone: +46 8 5088 3687 | Ben.Mack-Crane@tellabs.com | |||
| Phone: 613 520-2600 x2477 Fiffi@nortelnetworks.com | ||||
| Changcheng.Huang@sce.carleton.ca | Ken Owens Changcheng Huang | |||
| Erlang Technology, Inc. Carleton University | ||||
| 345 Marshall Ave., Suite 300 Minto Center, Rm. 3082 | ||||
| St. Louis, MO 63119 1125 Colonial By Drive | ||||
| Phone: (314) 918-1579 Ottawa, Ontario K1S 5B6, | ||||
| Canada | ||||
| keno@erlangtech.com Phone: (613) 520-2600 x2477 | ||||
| Changcheng.Huang@sce.carleton.ca | ||||
| Jon Weil Brad Cain | Jon Weil Brad Cain | |||
| Nortel Networks Cereva Networks | Nortel Networks Storigen Systems | |||
| Harlow Laboratories London Road 3 Network Drive | Harlow Laboratories London Road 650 Suffolk Street | |||
| Harlow Essex CM17 9NA, UK Marlborough, MA 01752 | Harlow Essex CM17 9NA, UK Lowell, MA 01854 | |||
| Phone: +44 (0)1279 403935 Phone: 508-787-5000 | Phone: +44 (0)1279 403935 Phone: (978) 323-4454 | |||
| jonweil@nortelnetworks.com bcain@cereva.com | jonweil@nortelnetworks.com bcain@storigen.com | |||
| Loa Andersson Bilel Jamoussi | Loa Andersson Bilel Jamoussi | |||
| Utfors AB Nortel Networks | Utfors AB Nortel Networks | |||
| Råsundavägen 12, Box 525 3 Federal Street, BL3-03 | Råsundavägen 12, Box 525 3 Federal Street, BL3-03 | |||
| 169 29 Solna, Sweden Billerica, MA 01821, USA | 169 29 Solna, Sweden Billerica, MA 01821, USA | |||
| Phone: +46 8 5270 5038 Phone:(978) 288-4506 | Phone: +46 8 5270 5038 Phone:(978) 288-4506 | |||
| loa.andersson@utfors.se jamoussi@nortelnetworks.com | loa.andersson@utfors.se jamoussi@nortelnetworks.com | |||
| Seyhan Civanlar Angela Chiu | Angela Chiu | |||
| Lemur Networks, Inc. Celion Networks, Inc. | Celion Networks, Inc. | |||
| 135 West 20th Street, 5th Floor One Shiela Drive, Suite 2 | One Shiela Drive, Suite 2 | |||
| New York, NY 10011 Tinton Falls, NJ 07724 | Tinton Falls, NJ 07724 | |||
| Phone: 212-367-7676 Phone: (732) 345-3441 | Phone: (732) 345-3441 | |||
| scivanlar@lemurnetworks.com angela.chiu@celion.com | angela.chiu@celion.com | |||
| 10. References | 10. References | |||
| [1] Rosen, E., Viswanathan, A., and Callon, R., "Multiprotocol Label | [1] Rosen, E., Viswanathan, A., and Callon, R., "Multiprotocol Label | |||
| Switching Architecture", RFC 3031, January 2001. | Switching Architecture", RFC 3031, January 2001. | |||
| [2] Andersson, L., Doolan, P., Feldman, N., Fredette, A., Thomas, B., | [2] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., McManus, J., | |||
| "LDP Specification", RFC 3036, January 2001. | ||||
| [3] Awduche, D. Hannan, A., and Xiao, X., "Applicability Statement | ||||
| for Extensions to RSVP for LSP-Tunnels", draft-ietf-mpls-rsvp- | ||||
| tunnel-applicability-02.txt, Work in Progress, April 2001. | ||||
| [4] Jamoussi, B. et al "Constraint-Based LSP Setup using LDP", | ||||
| Internet Draft draft-ietf-mpls-cr-ldp-05.txt, Work in Progress , | ||||
| February 2001. | ||||
| [5] Braden, R., Zhang, L., Berson, S., Herzog, S., "Resource | ||||
| ReSerVation Protocol (RSVP) -- Version 1 Functional | ||||
| Specification", RFC 2205, September 1997. | ||||
| [6] Awduche, D. et al "Extensions to RSVP for LSP Tunnels", Internet | ||||
| Draft, draft-ietf-mpls-rsvp-lsp-tunnel-08.txt, Work in Progress, | ||||
| February 2001. | ||||
| [7] Hellstrand, F., and Andersson, L., "Extensions to RSVP-TE and CR- | ||||
| LDP for setup of pre-established LSP Tunnels," Internet Draft, | ||||
| Work in Progress, draft-hellstrand-mpls-recovery-merge-01.txt, | ||||
| November 2000. | ||||
| [8] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., McManus, J., | ||||
| "Requirements for Traffic Engineering Over MPLS", RFC 2702, | "Requirements for Traffic Engineering Over MPLS", RFC 2702, | |||
| September 1999. | September 1999. | |||
| [9] Kini, S., Lakshman, T. V., Villamizar, C., "Reservation Protocol | [3] Huang, C., Sharma, V., Owens, K., Makam, V., "Building Reliable | |||
| with Traffic Engineering Extensions: Extension for Label Switched | MPLS Networks Using a Path Protection Mechanism", IEEE Commun. | |||
| Path Restoration," Internet Draft, Work in Progress, draft-kini- | Mag., Vol. 40, Issue 3, March 2002, pp. 156-162. | |||
| rsvp-lsp-restoration-00.txt, November 2000. | ||||
| [10] Haskin, D. and Krishnan R., "A Method for Setting an Alternative | [4] Braden, R., Zhang, L., Berson, S., Herzog, S., "Resource | |||
| Label Switched Path to Handle Fast Reroute", Internet Draft draft- | ReSerVation Protocol (RSVP) -- Version 1 Functional | |||
| haskin-mpls-fast-reroute-05.txt, November 2000, Work in progress. | Specification", RFC 2205, September 1997. | |||
| [11] Owens, K., Makam, V., Sharma, V., Mack-Crane, B., and Huang, C., | [5] Awduche, D., et al "RSVP-TE: Extensions to RSVP for LSP Tunnels", | |||
| "A Path Protection/Restoration Mechanism for MPLS Networks", | RFC 3209, December 2001. | |||
| Internet Draft, draft-chang-mpls-path-protection-03.txt, Work in | ||||
| Progress, July 2001. | ||||
| [14] Kini, S., Kodialam, M., Sengupta, S., Villamizar, C., "Shared | [6] Jamoussi, B., et al "Constraint-Based LSP Setup using LDP", RFC | |||
| Backup Label Switched Path Restoration", Internet Draft, draft- | 3212, January 2002. | |||
| kini-restoration-shared-backup-01.txt, Work in Progress May 2001. | ||||
| End of changes. 103 change blocks. | ||||
| 296 lines changed or deleted | 257 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
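
The revertive and non-revertive modes compared above can be illustrated with a small sketch. This is not part of either draft revision. The class `PathSwitchLSR`, its method names, and the path labels are hypothetical, and the role exchange in non-revertive mode is only one of the options the text allows (the restored path "may become the recovery path"). The `on_repair` call stands in for receipt of an FRS at the PSL.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    REVERTIVE = "revertive"
    NON_REVERTIVE = "non-revertive"


@dataclass
class PathSwitchLSR:
    """Hypothetical model of a PSL holding one working/recovery path pair."""
    mode: Mode
    working: str
    recovery: str
    active: str = ""

    def __post_init__(self):
        # Traffic starts on the working path.
        self.active = self.working

    def on_fault(self, path: str) -> None:
        # A fault on the path carrying traffic moves traffic to the other path.
        if path == self.active:
            self.active = self.recovery if path == self.working else self.working

    def on_repair(self, path: str) -> None:
        # Assumption: called when the PSL learns that `path` has been repaired.
        if self.mode is Mode.REVERTIVE:
            # Revertive: the working path is preferred, so switch back to it.
            if path == self.working:
                self.active = self.working
        elif path == self.working and self.active == self.recovery:
            # Non-revertive: traffic stays where it is and the two paths
            # exchange roles, so the repaired path now protects the active one.
            self.working, self.recovery = self.recovery, self.working


revertive = PathSwitchLSR(Mode.REVERTIVE, working="W", recovery="R")
revertive.on_fault("W")
revertive.on_repair("W")

non_revertive = PathSwitchLSR(Mode.NON_REVERTIVE, working="W", recovery="R")
non_revertive.on_fault("W")
non_revertive.on_repair("W")
```

In this sketch the revertive PSL ends with traffic back on "W", while the non-revertive PSL keeps traffic on "R" and treats the repaired "W" as the new recovery path, which avoids a second disruptive switch.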