| < draft-ietf-rtgwg-ipfrr-framework-12.txt | draft-ietf-rtgwg-ipfrr-framework-13.txt > | |||
|---|---|---|---|---|
| Network Working Group M. Shand | Network Working Group M. Shand | |||
| Internet-Draft S. Bryant | Internet-Draft S. Bryant | |||
| Intended status: Informational Cisco Systems | Intended status: Informational Cisco Systems | |||
| Expires: March 22, 2010 September 18, 2009 | Expires: April 26, 2010 October 23, 2009 | |||
| IP Fast Reroute Framework | IP Fast Reroute Framework | |||
| draft-ietf-rtgwg-ipfrr-framework-12 | draft-ietf-rtgwg-ipfrr-framework-13 | |||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted to IETF in full conformance with the | This Internet-Draft is submitted to IETF in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
| Drafts. | Drafts. | |||
| skipping to change at page 1, line 32 ¶ | skipping to change at page 1, line 32 ¶ | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on March 22, 2010. | This Internet-Draft will expire on April 26, 2010. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2009 IETF Trust and the persons identified as the | Copyright (c) 2009 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents in effect on the date of | Provisions Relating to IETF Documents in effect on the date of | |||
| publication of this document (http://trustee.ietf.org/license-info). | publication of this document (http://trustee.ietf.org/license-info). | |||
| Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
| skipping to change at page 2, line 11 ¶ | skipping to change at page 2, line 11 ¶ | |||
| This document provides a framework for the development of IP fast- | This document provides a framework for the development of IP fast- | |||
| reroute mechanisms which provide protection against link or router | reroute mechanisms which provide protection against link or router | |||
| failure by invoking locally determined repair paths. Unlike MPLS | failure by invoking locally determined repair paths. Unlike MPLS | |||
| fast-reroute, the mechanisms are applicable to a network employing | fast-reroute, the mechanisms are applicable to a network employing | |||
| conventional IP routing and forwarding. | conventional IP routing and forwarding. | |||
| Table of Contents | Table of Contents | |||
| 1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 3. Problem Analysis . . . . . . . . . . . . . . . . . . . . . . . 5 | 3. Scope and applicability . . . . . . . . . . . . . . . . . . . 6 | |||
| 4. Mechanisms for IP Fast-reroute . . . . . . . . . . . . . . . . 7 | 4. Problem Analysis . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 4.1. Mechanisms for fast failure detection . . . . . . . . . . 7 | 5. Mechanisms for IP Fast-reroute . . . . . . . . . . . . . . . . 8 | |||
| 4.2. Mechanisms for repair paths . . . . . . . . . . . . . . . 8 | 5.1. Mechanisms for fast failure detection . . . . . . . . . . 8 | |||
| 4.2.1. Scope of repair paths . . . . . . . . . . . . . . . . 9 | 5.2. Mechanisms for repair paths . . . . . . . . . . . . . . . 8 | |||
| 4.2.2. Analysis of repair coverage . . . . . . . . . . . . . 9 | 5.2.1. Scope of repair paths . . . . . . . . . . . . . . . . 9 | |||
| 4.2.3. Link or node repair . . . . . . . . . . . . . . . . . 10 | 5.2.2. Analysis of repair coverage . . . . . . . . . . . . . 10 | |||
| 4.2.4. Maintenance of Repair paths . . . . . . . . . . . . . 11 | 5.2.3. Link or node repair . . . . . . . . . . . . . . . . . 11 | |||
| 4.2.5. Multiple failures and Shared Risk Link Groups . . . . 11 | 5.2.4. Maintenance of Repair paths . . . . . . . . . . . . . 11 | |||
| 4.3. Local Area Networks . . . . . . . . . . . . . . . . . . . 12 | 5.2.5. Local Area Networks . . . . . . . . . . . . . . . . . 12 | |||
| 4.4. Mechanisms for micro-loop prevention . . . . . . . . . . . 12 | 5.2.6. Multiple failures and Shared Risk Link Groups . . . . 12 | |||
| 5. Management Considerations . . . . . . . . . . . . . . . . . . 12 | 5.3. Mechanisms for micro-loop prevention . . . . . . . . . . . 12 | |||
| 6. Scope and applicability . . . . . . . . . . . . . . . . . . . 13 | 6. Management Considerations . . . . . . . . . . . . . . . . . . 13 | |||
| 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 8. Security Considerations . . . . . . . . . . . . . . . . . . . 13 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 14 | |||
| 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 14 | 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 10. Informative References . . . . . . . . . . . . . . . . . . . . 14 | 10. Informative References . . . . . . . . . . . . . . . . . . . . 14 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 15 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 1. Terminology | 1. Terminology | |||
| This section defines words and acronyms used in this draft and other | This section defines words and acronyms used in this draft and other | |||
| drafts discussing IP fast-reroute. | drafts discussing IP fast-reroute. | |||
| D Used to denote the destination router under | D Used to denote the destination router under | |||
| discussion. | discussion. | |||
| Distance_opt(A,B) The distance of the shortest path from A to B. | Distance_opt(A,B) The metric sum of the shortest path from A to B. | |||
| Downstream Path This is a subset of the loop-free alternates | Downstream Path This is a subset of the loop-free alternates | |||
| where the neighbor N meets the following | where the neighbor N meets the following | |||
| condition:- | condition:- | |||
| Distance_opt(N, D) < Distance_opt(S,D) | Distance_opt(N, D) < Distance_opt(S,D) | |||
| E Used to denote the router which is the primary | E Used to denote the router which is the primary | |||
| next-hop neighbor to get from S to the | neighbor to get from S to the destination D. | |||
| destination D. Where there is an ECMP set for the | Where there is an ECMP set for the shortest path | |||
| shortest path from S to D, these are referred to | from S to D, these are referred to as E_1, E_2, | |||
| as E_1, E_2, etc. | etc. | |||
| ECMP Equal cost multi-path: Where, for a particular | ECMP Equal cost multi-path: Where, for a particular | |||
| destination D, multiple primary next-hops are | destination D, multiple primary next-hops are | |||
| used to forward traffic because there exist | used to forward traffic because there exist | |||
| multiple shortest paths from S via different | multiple shortest paths from S via different | |||
| output layer-3 interfaces. | output layer-3 interfaces. | |||
| FIB Forwarding Information Base. The database used | FIB Forwarding Information Base. The database used | |||
| by the packet forwarder to determine what actions | by the packet forwarder to determine what actions | |||
| to perform on a packet. | to perform on a packet. | |||
| IPFRR IP fast-reroute. | IPFRR IP fast-reroute. | |||
| Link(A->B) A link connecting router A to router B. | Link(A->B) A link connecting router A to router B. | |||
| LFA Loop Free Alternate. A neighbor N, that is not a | LFA Loop Free Alternate. A neighbor N, that is not a | |||
| primary next-hop neighbor E, whose shortest path | primary neighbor E, whose shortest path to the | |||
| to the destination D does not go back through the | destination D does not go back through the router | |||
| router S. The neighbor N must meet the following | S. The neighbor N must meet the following | |||
| condition:- | condition:- | |||
| Distance_opt(N, D) < Distance_opt(N, S) + | Distance_opt(N, D) < Distance_opt(N, S) + | |||
| Distance_opt(S, D) | Distance_opt(S, D) | |||
| Loop Free Neighbor A neighbor N_i, which is not the particular | Loop Free Neighbor A neighbor N_i, which is not the particular | |||
| primary neighbor E_k under discussion, and whose | primary neighbor E_k under discussion, and whose | |||
| shortest path to D does not traverse S. For | shortest path to D does not traverse S. For | |||
| example, if there are two primary neighbors E_1 | example, if there are two primary neighbors E_1 | |||
| and E_2, E_1 is a loop-free neighbor with regard | and E_2, E_1 is a loop-free neighbor with regard | |||
| skipping to change at page 4, line 33 ¶ | skipping to change at page 4, line 33 ¶ | |||
| being protected. | being protected. | |||
| N_i The ith neighbor of S. | N_i The ith neighbor of S. | |||
| Primary Neighbor A neighbor N_i of S which is one of the next hops | Primary Neighbor A neighbor N_i of S which is one of the next hops | |||
| for destination D in S's FIB prior to any | for destination D in S's FIB prior to any | |||
| failure. | failure. | |||
| R_i_j The jth neighbor of N_i. | R_i_j The jth neighbor of N_i. | |||
| Repair Path The path used by a repairing node to send traffic | ||||
| that it is unable to send via the normal path | ||||
| owing to a failure. | ||||
| Routing Transition The process whereby routers converge on a new | Routing Transition The process whereby routers converge on a new | |||
| topology. In conventional networks this process | topology. In conventional networks this process | |||
| frequently causes some disruption to packet | frequently causes some disruption to packet | |||
| delivery. | delivery. | |||
| RPF Reverse Path Forwarding. I.e. checking that a | RPF Reverse Path Forwarding. I.e. checking that a | |||
| packet is received over the interface which would | packet is received over the interface which would | |||
| be used to send packets addressed to the source | be used to send packets addressed to the source | |||
| address of the packet. | address of the packet. | |||
| skipping to change at page 5, line 9 ¶ | skipping to change at page 5, line 11 ¶ | |||
| failure of a neighboring router denoted as E, or | failure of a neighboring router denoted as E, or | |||
| of the link between S and E. It is the viewpoint | of the link between S and E. It is the viewpoint | |||
| from which IP fast-reroute is described. | from which IP fast-reroute is described. | |||
| SPF Shortest Path First, e.g. Dijkstra's algorithm. | SPF Shortest Path First, e.g. Dijkstra's algorithm. | |||
| SPT Shortest path tree | SPT Shortest path tree | |||
| Upstream Forwarding Loop | Upstream Forwarding Loop | |||
| A forwarding loop that involves a set of routers, | A forwarding loop that involves a set of routers, | |||
| none of which are directly connected to the link | none of which is directly connected to the link | |||
| that has caused the topology change that | that has caused the topology change that | |||
| triggered a new SPF in any of the routers. | triggered a new SPF in any of the routers. | |||
| 2. Introduction | 2. Introduction | |||
| When a link or node failure occurs in a routed network, there is | When a link or node failure occurs in a routed network, there is | |||
| inevitably a period of disruption to the delivery of traffic until | inevitably a period of disruption to the delivery of traffic until | |||
| the network re-converges on the new topology. Packets for | the network re-converges on the new topology. Packets for | |||
| destinations which were previously reached by traversing the failed | destinations which were previously reached by traversing the failed | |||
| component may be dropped or may suffer looping. Traditionally such | component may be dropped or may suffer looping. Traditionally such | |||
| skipping to change at page 5, line 47 ¶ | skipping to change at page 5, line 49 ¶ | |||
| routers of the failure. In this case, the disruption time can be | routers of the failure. In this case, the disruption time can be | |||
| limited to the small time taken to detect the adjacent failure and | limited to the small time taken to detect the adjacent failure and | |||
| invoke the backup routes. This is analogous to the technique | invoke the backup routes. This is analogous to the technique | |||
| employed by MPLS fast-reroute [RFC4090], but the mechanisms employed | employed by MPLS fast-reroute [RFC4090], but the mechanisms employed | |||
| for the backup routes in pure IP networks are necessarily very | for the backup routes in pure IP networks are necessarily very | |||
| different. | different. | |||
| This document provides a framework for the development of this | This document provides a framework for the development of this | |||
| approach. | approach. | |||
| 3. Problem Analysis | Note that in order to further minimise the impact on user | |||
| applications, it may be necessary to design the network such that | ||||
| backup paths with suitable characteristics, for example capacity | ||||
| and/or delay, are available for the algorithms to select. Such | ||||
| considerations are outside the scope of this document. | ||||
| 3. Scope and applicability | ||||
| The initial scope of this work is in the context of link state IGPs. | ||||
| Link state protocols provide ubiquitous topology information, which | ||||
| facilitates the computation of repair paths. | ||||
| Provision of similar facilities in non-link state IGPs and BGP is a | ||||
| matter for further study, but the correct operation of the repair | ||||
| mechanisms for traffic with a destination outside the IGP domain is | ||||
| an important consideration for solutions based on this framework. | ||||
| Complete protection against multiple unrelated failures is out of | ||||
| scope of this work. | ||||
| 4. Problem Analysis | ||||
| The duration of the packet delivery disruption caused by a | The duration of the packet delivery disruption caused by a | |||
| conventional routing transition is determined by a number of factors: | conventional routing transition is determined by a number of factors: | |||
| 1. The time taken to detect the failure. This may be of the order | 1. The time taken to detect the failure. This may be of the order | |||
| of a few milliseconds when it can be detected at the physical | of a few milliseconds when it can be detected at the physical | |||
| layer, up to several tens of seconds when a routing protocol | layer, up to several tens of seconds when a routing protocol | |||
| Hello is employed. During this period packets will be | Hello is employed. During this period packets will be | |||
| unavoidably lost. | unavoidably lost. | |||
| skipping to change at page 6, line 39 ¶ | skipping to change at page 7, line 14 ¶ | |||
| The disruption will last until the routers adjacent to the failure | The disruption will last until the routers adjacent to the failure | |||
| have completed steps 1 and 2, and then all the routers in the network | have completed steps 1 and 2, and then all the routers in the network | |||
| whose paths are affected by the failure have completed the remaining | whose paths are affected by the failure have completed the remaining | |||
| steps. | steps. | |||
| The initial packet loss is caused by the router(s) adjacent to the | The initial packet loss is caused by the router(s) adjacent to the | |||
| failure continuing to attempt to transmit packets across the failure | failure continuing to attempt to transmit packets across the failure | |||
| until it is detected. This loss is unavoidable, but the detection | until it is detected. This loss is unavoidable, but the detection | |||
| time can be reduced to a few tens of milliseconds as described in | time can be reduced to a few tens of milliseconds as described in | |||
| Section 4.1. | Section 5.1. | |||
| In some topologies subsequent packet loss may be caused by the | In some topologies subsequent packet loss may be caused by the | |||
| "micro-loops" which may form as a result of temporary inconsistencies | "micro-loops" which may form as a result of temporary inconsistencies | |||
| between routers' forwarding tables [I-D.ietf-rtgwg-lf-conv-frmwk]. | between routers' forwarding tables [I-D.ietf-rtgwg-lf-conv-frmwk]. | |||
| These inconsistencies are caused by steps 3, 4 and 5 above and in | These inconsistencies are caused by steps 3, 4 and 5 above and in | |||
| many routers it is step 5 which is both the largest factor and which | many routers it is step 5 which is both the largest factor and which | |||
| has the greatest variance between routers. The large variance arises | has the greatest variance between routers. The large variance arises | |||
| from implementation differences and from the differing impact that a | from implementation differences and from the differing impact that a | |||
| failure has on each individual router. For example, the number of | failure has on each individual router. For example, the number of | |||
| prefixes affected by the failure may vary dramatically from one | prefixes affected by the failure may vary dramatically from one | |||
| router to another. | router to another. | |||
| In order to achieve packet disruption times which are commensurate | In order to reduce packet disruption times to a duration commensurate | |||
| with the failure detection times two mechanisms may be required:- | with the failure detection times, two mechanisms may be required:- | |||
| 1. A mechanism for the router(s) adjacent to the failure to rapidly | a. A mechanism for the router(s) adjacent to the failure to rapidly | |||
| invoke a repair path, which is unaffected by any subsequent re- | invoke a repair path, which is unaffected by any subsequent re- | |||
| convergence. | convergence. | |||
| 2. In topologies that are susceptible to micro-loops, a mechanism to | b. In topologies that are susceptible to micro-loops, a micro-loop | |||
| prevent the effects of any micro-loops during subsequent re- | control mechanism may be required [I-D.ietf-rtgwg-lf-conv-frmwk]. | |||
| convergence. | ||||
| Performing the first task without the second may result in the repair | Performing the first task without the second may result in the repair | |||
| path being starved of traffic and hence being redundant. Performing | path being starved of traffic and hence being redundant. Performing | |||
| the second without the first will result in traffic being discarded | the second without the first will result in traffic being discarded | |||
| by the router(s) adjacent to the failure. | by the router(s) adjacent to the failure. | |||
| Repair paths may always be used in isolation where the failure is | Repair paths may always be used in isolation where the failure is | |||
| short-lived. In this case, the repair paths can be kept in place | short-lived. In this case, the repair paths can be kept in place | |||
| until the failure is repaired, in which case there is no need to | until the failure is repaired, in which case there is no need to | |||
| advertise the failure to other routers. | advertise the failure to other routers. | |||
| skipping to change at page 7, line 34 ¶ | skipping to change at page 8, line 9 ¶ | |||
| Similarly, micro-loop avoidance may be used in isolation to prevent | Similarly, micro-loop avoidance may be used in isolation to prevent | |||
| loops arising from pre-planned management action, in which case the | loops arising from pre-planned management action, in which case the | |||
| link or node being shut down can remain in service for a short time | link or node being shut down can remain in service for a short time | |||
| after its removal has been announced into the network, and hence it | after its removal has been announced into the network, and hence it | |||
| can function as its own "repair path". | can function as its own "repair path". | |||
| Note that micro-loops may also occur when a link or node is restored | Note that micro-loops may also occur when a link or node is restored | |||
| to service and thus a micro-loop avoidance mechanism may be required | to service and thus a micro-loop avoidance mechanism may be required | |||
| for both link up and link down cases. | for both link up and link down cases. | |||
| 4. Mechanisms for IP Fast-reroute | 5. Mechanisms for IP Fast-reroute | |||
| The set of mechanisms required for an effective solution to the | The set of mechanisms required for an effective solution to the | |||
| problem can be broken down into the sub-problems described in this | problem can be broken down into the sub-problems described in this | |||
| section. | section. | |||
| 4.1. Mechanisms for fast failure detection | 5.1. Mechanisms for fast failure detection | |||
| It is critical that the failure detection time is minimized. A | It is critical that the failure detection time is minimized. A | |||
| number of well documented approaches are possible, such as: | number of well documented approaches are possible, such as: | |||
| 1. Physical detection; for example, loss of light. | 1. Physical detection; for example, loss of light. | |||
| 2. Routing protocol independent protocol detection; for example, the | 2. Routing protocol independent protocol detection; for example, the | |||
| Bidirectional Forwarding Detection protocol [I-D.ietf-bfd-base]. | Bidirectional Forwarding Detection protocol [I-D.ietf-bfd-base]. | |||
| 3. Routing protocol detection; for example, use of "fast Hellos". | 3. Routing protocol detection; for example, use of "fast Hellos". | |||
| 4.2. Mechanisms for repair paths | When configuring packet based failure detection mechanisms it is | |||
| important that consideration be given to the likelihood and | ||||
| consequences of false indications of failure. The incidence of false | ||||
| indication of failure may be minimised by appropriately prioritizing | ||||
| the transmission, reception and processing of the packets used to | ||||
| detect link or node failure. Note that this is not an issue that is | ||||
| specific to IPFRR. | ||||
| 5.2. Mechanisms for repair paths | ||||
| Once a failure has been detected by one of the above mechanisms, | Once a failure has been detected by one of the above mechanisms, | |||
| traffic which previously traversed the failure is transmitted over | traffic which previously traversed the failure is transmitted over | |||
| one or more repair paths. The design of the repair paths should be | one or more repair paths. The design of the repair paths should be | |||
| such that they can be pre-calculated in anticipation of each local | such that they can be pre-calculated in anticipation of each local | |||
| failure and made available for invocation with minimal delay. There | failure and made available for invocation with minimal delay. There | |||
| are three basic categories of repair paths: | are three basic categories of repair paths: | |||
| 1. Equal cost multi-paths (ECMP). Where such paths exist, and one | 1. Equal cost multi-paths (ECMP). Where such paths exist, and one | |||
| or more of the alternate paths do not traverse the failure, they | or more of the alternate paths do not traverse the failure, they | |||
| skipping to change at page 8, line 33 ¶ | skipping to change at page 9, line 14 ¶ | |||
| 3. Multi-hop repair paths. When there is no feasible loop free | 3. Multi-hop repair paths. When there is no feasible loop free | |||
| alternate path it may still be possible to locate a router, which | alternate path it may still be possible to locate a router, which | |||
| is more than one hop away from the router adjacent to the | is more than one hop away from the router adjacent to the | |||
| failure, from which traffic will be forwarded to the destination | failure, from which traffic will be forwarded to the destination | |||
| without traversing the failure. | without traversing the failure. | |||
| ECMP and loop free alternate paths (as described in [RFC5286]) offer | ECMP and loop free alternate paths (as described in [RFC5286]) offer | |||
| the simplest repair paths and would normally be used when they are | the simplest repair paths and would normally be used when they are | |||
| available. It is anticipated that around 80% of failures (see | available. It is anticipated that around 80% of failures (see | |||
| Section 4.2.2) can be repaired using these basic methods alone. | Section 5.2.2) can be repaired using these basic methods alone. | |||
| Multi-hop repair paths are more complex, both in the computations | Multi-hop repair paths are more complex, both in the computations | |||
| required to determine their existence, and in the mechanisms required | required to determine their existence, and in the mechanisms required | |||
| to invoke them. They can be further classified as: | to invoke them. They can be further classified as: | |||
| 1. Mechanisms where one or more alternate FIBs are pre-computed in | a. Mechanisms where one or more alternate FIBs are pre-computed in | |||
| all routers and the repaired packet is instructed to be forwarded | all routers and the repaired packet is instructed to be forwarded | |||
| using a "repair FIB" by some method of per packet signaling such | using a "repair FIB" by some method of per packet signaling such | |||
| as detecting a "U-turn" [I-D.atlas-ip-local-protect-uturn], | as detecting a "U-turn" [I-D.atlas-ip-local-protect-uturn], | |||
| [FIFR] or by marking the packet [SIMULA]. | [FIFR] or by marking the packet [SIMULA]. | |||
| 2. Mechanisms functionally equivalent to a loose source route which | b. Mechanisms functionally equivalent to a loose source route which | |||
| is invoked using the normal FIB. These include tunnels | is invoked using the normal FIB. These include tunnels | |||
| [I-D.bryant-ipfrr-tunnels], alternative shortest paths | [I-D.bryant-ipfrr-tunnels], alternative shortest paths | |||
| [I-D.tian-frr-alt-shortest-path] and label based mechanisms. | [I-D.tian-frr-alt-shortest-path] and label based mechanisms. | |||
| 3. Mechanisms employing special addresses or labels which are | c. Mechanisms employing special addresses or labels which are | |||
| installed in the FIBs of all routers with routes pre-computed to | installed in the FIBs of all routers with routes pre-computed to | |||
| avoid certain components of the network. For example | avoid certain components of the network. For example | |||
| [I-D.ietf-rtgwg-ipfrr-notvia-addresses]. | [I-D.ietf-rtgwg-ipfrr-notvia-addresses]. | |||
| In many cases a repair path which reaches two hops away from the | In many cases a repair path which reaches two hops away from the | |||
| router detecting the failure will suffice, and it is anticipated that | router detecting the failure will suffice, and it is anticipated that | |||
| around 98% of failures (see Section 4.2.2) can be repaired by this | around 98% of failures (see Section 5.2.2) can be repaired by this | |||
| method. However, to provide complete repair coverage some use of | method. However, to provide complete repair coverage some use of | |||
| longer multi-hop repair paths is generally necessary. | longer multi-hop repair paths is generally necessary. | |||
| 4.2.1. Scope of repair paths | 5.2.1. Scope of repair paths | |||
| A particular repair path may be valid for all destinations which | A particular repair path may be valid for all destinations which | |||
| require repair or may only be valid for a subset of destinations. If | require repair or may only be valid for a subset of destinations. If | |||
| a repair path is valid for a node immediately downstream of the | a repair path is valid for a node immediately downstream of the | |||
| failure, then it will be valid for all destinations previously | failure, then it will be valid for all destinations previously | |||
| reachable by traversing the failure. However, in cases where such a | reachable by traversing the failure. However, in cases where such a | |||
| repair path is difficult to achieve because it requires a high order | repair path is difficult to achieve because it requires a high order | |||
| multi-hop repair path, it may still be possible to identify lower | multi-hop repair path, it may still be possible to identify lower | |||
| order repair paths (possibly even loop free alternate paths) which | order repair paths (possibly even loop free alternate paths) which | |||
| allow the majority of destinations to be repaired. When IPFRR is | allow the majority of destinations to be repaired. When IPFRR is | |||
| skipping to change at page 9, line 46 ¶ | skipping to change at page 10, line 26 ¶ | |||
| be repaired using only the "basic" repair mechanism, leaving a | be repaired using only the "basic" repair mechanism, leaving a | |||
| smaller subset of the destinations to be repaired using one of the | smaller subset of the destinations to be repaired using one of the | |||
| more complex multi-hop methods. Such a hybrid approach may go some | more complex multi-hop methods. Such a hybrid approach may go some | |||
| way to resolving the conflict between completeness and complexity. | way to resolving the conflict between completeness and complexity. | |||
| The use of repair paths may result in excessive traffic passing over | The use of repair paths may result in excessive traffic passing over | |||
| a link, resulting in congestion discard. This reduces the | a link, resulting in congestion discard. This reduces the | |||
| effectiveness of IPFRR. Mechanisms to influence the distribution of | effectiveness of IPFRR. Mechanisms to influence the distribution of | |||
| repaired traffic to minimize this effect are therefore desirable. | repaired traffic to minimize this effect are therefore desirable. | |||
| 4.2.2. Analysis of repair coverage | 5.2.2. Analysis of repair coverage | |||
| The repair coverage obtained is dependent on the repair strategy and | The repair coverage obtained is dependent on the repair strategy and | |||
| highly dependent on the detailed topology and metrics. Estimates of | highly dependent on the detailed topology and metrics. Estimates of | |||
| the repair coverage quoted in this document are for illustrative | the repair coverage quoted in this document are for illustrative | |||
| purposes only and may not always be achievable. | purposes only and may not always be achievable. | |||
| In some cases the repair strategy will permit the repair of all | In some cases the repair strategy will permit the repair of all | |||
| single link or node failures in the network for all possible | single link or node failures in the network for all possible | |||
| destinations. This can be defined as 100% coverage. However, where | destinations. This can be defined as 100% coverage. However, where | |||
| the coverage is less than 100% it is important for the purposes of | the coverage is less than 100% it is important for the purposes of | |||
| comparisons between different proposed repair strategies to define | comparisons between different proposed repair strategies to define | |||
| what is meant by such a percentage. There are four possibilities: | what is meant by such a percentage. There are four possibilities: | |||
| 1. The percentage of links (or nodes) which can be fully protected | 1. The percentage of links (or nodes) which can be fully protected | |||
| for all destinations. This is appropriate where the requirement | (i.e. for all destinations). This is appropriate where the | |||
| is to protect all traffic, but some percentage of the possible | requirement is to protect all traffic, but some percentage of the | |||
| failures may be identified as being un-protectable. | possible failures may be identified as being un-protectable. | |||
| 2. The percentage of destinations which can be fully protected for | 2. The percentage of destinations which can be protected for all | |||
| all link (or node) failures. This is appropriate where the | link (or node) failures. This is appropriate where the | |||
| requirement is to protect against all possible failures, but some | requirement is to protect against all possible failures, but some | |||
| percentage of destinations may be identified as being un- | percentage of destinations may be identified as being un- | |||
| protectable. | protectable. | |||
| 3. For all destinations (d) and for all failures (f), the percentage | 3. For all destinations (d) and for all failures (f), the percentage | |||
| of the total potential failure cases (d*f) which are protected. | of the total potential failure cases (d*f) which are protected. | |||
| This is appropriate where the requirement is an overall "best | This is appropriate where the requirement is an overall "best | |||
| effort" protection. | effort" protection. | |||
| 4. The percentage of packets normally passing through the network | 4. The percentage of packets normally passing through the network | |||
| that will continue to reach their destination. This requires a | that will continue to reach their destination. This requires a | |||
| traffic matrix for the network as part of the analysis. | traffic matrix for the network as part of the analysis. | |||
| 4.2.3. Link or node repair | 5.2.3. Link or node repair | |||
| A repair path may be computed to protect against failure of an | A repair path may be computed to protect against failure of an | |||
| adjacent link, or failure of an adjacent node. In general, link | adjacent link, or failure of an adjacent node. In general, link | |||
| protection is simpler to achieve. A repair which protects against | protection is simpler to achieve. A repair which protects against | |||
| node failure will also protect against link failure for all | node failure will also protect against link failure for all | |||
| destinations except those for which the adjacent node is a single | destinations except those for which the adjacent node is a single | |||
| point of failure. | point of failure. | |||
| In some cases it may be necessary to distinguish between a link or | In some cases it may be necessary to distinguish between a link or | |||
| node failure in order that the optimal repair strategy is invoked. | node failure in order that the optimal repair strategy is invoked. | |||
| Methods for link/node failure determination may be based on | Methods for link/node failure determination may be based on | |||
| techniques such as BFD [I-D.ietf-bfd-base]. This determination may be | techniques such as BFD [I-D.ietf-bfd-base]. This determination may be | |||
| made prior to invoking any repairs, but this will increase the period | made prior to invoking any repairs, but this will increase the period | |||
| of packet loss following a failure unless the determination can be | of packet loss following a failure unless the determination can be | |||
| performed as part of the failure detection mechanism itself. | performed as part of the failure detection mechanism itself. | |||
| Alternatively, a subsequent determination can be used to optimise an | Alternatively, a subsequent determination can be used to optimise an | |||
| already invoked default strategy. | already invoked default strategy. | |||
| 4.2.4. Maintenance of Repair paths | 5.2.4. Maintenance of Repair paths | |||
| In order to meet the response time goals, it is expected (though not | In order to meet the response time goals, it is expected (though not | |||
| required) that repair paths, and their associated FIB entries, will | required) that repair paths, and their associated FIB entries, will | |||
| be pre-computed and installed ready for invocation when a failure is | be pre-computed and installed ready for invocation when a failure is | |||
| detected. Following invocation the repair paths remain in effect | detected. Following invocation the repair paths remain in effect | |||
| until they are no longer required. This will normally be when the | until they are no longer required. This will normally be when the | |||
| routing protocol has re-converged on the new topology taking into | routing protocol has re-converged on the new topology taking into | |||
| account the failure, and traffic will no longer be using the repair | account the failure, and traffic will no longer be using the repair | |||
| paths. | paths. | |||
| The repair paths have the property that they are unaffected by any | The repair paths have the property that they are unaffected by any | |||
| topology changes resulting from the failure which caused their | topology changes resulting from the failure which caused their | |||
| instantiation. Therefore there is no need to re-compute them during | instantiation. Therefore there is no need to re-compute them during | |||
| the convergence period. They may be affected by an unrelated | the convergence period. They may be affected by an unrelated | |||
| simultaneous topology change, but such events are out of scope of | simultaneous topology change, but such events are out of scope of | |||
| this work (see Section 4.2.5). | this work (see Section 5.2.6). | |||
| Once the routing protocol has re-converged it is necessary for all | Once the routing protocol has re-converged it is necessary for all | |||
| repair paths to take account of the new topology. Various | repair paths to take account of the new topology. Various | |||
| optimizations may permit the efficient identification of repair paths | optimizations may permit the efficient identification of repair paths | |||
| which are unaffected by the change, and hence do not require full re- | which are unaffected by the change, and hence do not require full re- | |||
| computation. Since the new repair paths will not be required until | computation. Since the new repair paths will not be required until | |||
| the next failure occurs, the re-computation may be performed as a | the next failure occurs, the re-computation may be performed as a | |||
| background task and be subject to a hold-down, but excessive delay in | background task and be subject to a hold-down, but excessive delay in | |||
| completing this operation will increase the risk of a new failure | completing this operation will increase the risk of a new failure | |||
| occurring before the repair paths are in place. | occurring before the repair paths are in place. | |||
| 4.2.5. Multiple failures and Shared Risk Link Groups | 5.2.5. Local Area Networks | |||
| Protection against partial or complete failure of LANs is more | ||||
| complex than the point to point case. In general there is a trade- | ||||
| off between the simplicity of the repair and the ability to provide | ||||
| complete and optimal repair coverage. | ||||
| 5.2.6. Multiple failures and Shared Risk Link Groups | ||||
| Complete protection against multiple unrelated failures is out of | Complete protection against multiple unrelated failures is out of | |||
| scope of this work. However, it is important that the occurrence of | scope of this work. However, it is important that the occurrence of | |||
| a second failure while one failure is undergoing repair should not | a second failure while one failure is undergoing repair should not | |||
| result in a level of service which is significantly worse than that | result in a level of service which is significantly worse than that | |||
| which would have been achieved in the absence of any repair strategy. | which would have been achieved in the absence of any repair strategy. | |||
| Shared Risk Link Groups (SRLGs) are an example of multiple related | Shared Risk Link Groups (SRLGs) are an example of multiple related | |||
| failures, and the more complex aspects of their protection are a | failures, and the more complex aspects of their protection are a | |||
| matter for further study. | matter for further study. | |||
| One specific example of an SRLG which is clearly within the scope of | One specific example of an SRLG which is clearly within the scope of | |||
| this work is a node failure. This causes the simultaneous failure of | this work is a node failure. This causes the simultaneous failure of | |||
| multiple links, but their closely defined topological relationship | multiple links, but their closely defined topological relationship | |||
| makes the problem more tractable. | makes the problem more tractable. | |||
| 4.3. Local Area Networks | 5.3. Mechanisms for micro-loop prevention | |||
| Protection against partial or complete failure of LANs is more | ||||
| complex than the point to point case. In general there is a trade- | ||||
| off between the simplicity of the repair and the ability to provide | ||||
| complete and optimal repair coverage. | ||||
| 4.4. Mechanisms for micro-loop prevention | ||||
| Ensuring the absence of micro-loops is important not only because | Ensuring the absence of micro-loops is important not only because | |||
| they can cause packet loss in traffic which is affected by the | they can cause packet loss in traffic which is affected by the | |||
| failure, but because by saturating a link with looping packets they | failure, but because by saturating a link with looping packets they | |||
| can also cause congestion loss of traffic flowing over that link | can also cause congestion loss of traffic flowing over that link | |||
| which would otherwise be unaffected by the failure. | which would otherwise be unaffected by the failure. | |||
| A number of solutions to the problem of micro-loop formation have | A number of solutions to the problem of micro-loop formation have | |||
| been proposed and are summarized in [I-D.ietf-rtgwg-lf-conv-frmwk]. | been proposed and are summarized in [I-D.ietf-rtgwg-lf-conv-frmwk]. | |||
| The following factors are significant in their classification: | The following factors are significant in their classification: | |||
| skipping to change at page 12, line 39 ¶ | skipping to change at page 13, line 16 ¶ | |||
| general). | general). | |||
| 4. Computational complexity (pre-computed or real time). | 4. Computational complexity (pre-computed or real time). | |||
| 5. Applicability to scheduled events. | 5. Applicability to scheduled events. | |||
| 6. Applicability to link/node reinstatement. | 6. Applicability to link/node reinstatement. | |||
| 7. Topological constraints. | 7. Topological constraints. | |||
| 5. Management Considerations | 6. Management Considerations | |||
| While many of the management requirements will be specific to | While many of the management requirements will be specific to | |||
| particular IPFRR solutions, the following general aspects need to be | particular IPFRR solutions, the following general aspects need to be | |||
| addressed: | addressed: | |||
| 1. Configuration | 1. Configuration | |||
| A. Enabling/disabling IPFRR support. | A. Enabling/disabling IPFRR support. | |||
| B. Enabling/disabling protection on a per link/node basis. | B. Enabling/disabling protection on a per link/node basis. | |||
| skipping to change at page 13, line 25 ¶ | skipping to change at page 14, line 5 ¶ | |||
| protected. | protected. | |||
| B. Notification of pre-computed repair paths, and anticipated | B. Notification of pre-computed repair paths, and anticipated | |||
| traffic patterns. | traffic patterns. | |||
| C. Counts of failure detections, protection invocations and | C. Counts of failure detections, protection invocations and | |||
| packets forwarded over repair paths. | packets forwarded over repair paths. | |||
| D. Testing repairs. | D. Testing repairs. | |||
| 6. Scope and applicability | ||||
| The initial scope of this work is in the context of link state IGPs. | ||||
| Link state protocols provide ubiquitous topology information, which | ||||
| facilitates the computation of repair paths. | ||||
| Provision of similar facilities in non-link state IGPs and BGP is a | ||||
| matter for further study, but the correct operation of the repair | ||||
| mechanisms for traffic with a destination outside the IGP domain is | ||||
| an important consideration for solutions based on this framework. | ||||
| 7. IANA Considerations | 7. IANA Considerations | |||
| There are no IANA considerations that arise from this framework | There are no IANA considerations that arise from this framework | |||
| document. | document. | |||
| 8. Security Considerations | 8. Security Considerations | |||
| This framework document does not itself introduce any security | This framework document does not itself introduce any security | |||
| issues, but attention must be paid to the security implications of | issues, but attention must be paid to the security implications of | |||
| any proposed solutions to the problem. | any proposed solutions to the problem. | |||
| skipping to change at page 15, line 7 ¶ | skipping to change at page 15, line 21 ¶ | |||
| February 2009. | February 2009. | |||
| [I-D.ietf-rtgwg-ipfrr-notvia-addresses] | [I-D.ietf-rtgwg-ipfrr-notvia-addresses] | |||
| Shand, M., Bryant, S., and S. Previdi, "IP Fast Reroute | Shand, M., Bryant, S., and S. Previdi, "IP Fast Reroute | |||
| Using Not-via Addresses", | Using Not-via Addresses", | |||
| draft-ietf-rtgwg-ipfrr-notvia-addresses-04 (work in | draft-ietf-rtgwg-ipfrr-notvia-addresses-04 (work in | |||
| progress), July 2009. | progress), July 2009. | |||
| [I-D.ietf-rtgwg-lf-conv-frmwk] | [I-D.ietf-rtgwg-lf-conv-frmwk] | |||
| Shand, M. and S. Bryant, "A Framework for Loop-free | Shand, M. and S. Bryant, "A Framework for Loop-free | |||
| Convergence", draft-ietf-rtgwg-lf-conv-frmwk-05 (work in | Convergence", draft-ietf-rtgwg-lf-conv-frmwk-07 (work in | |||
| progress), June 2009. | progress), October 2009. | |||
| [I-D.tian-frr-alt-shortest-path] | [I-D.tian-frr-alt-shortest-path] | |||
| Tian, A., "Fast Reroute using Alternative Shortest Paths", | Tian, A., "Fast Reroute using Alternative Shortest Paths", | |||
| draft-tian-frr-alt-shortest-path-01 (work in progress), | draft-tian-frr-alt-shortest-path-01 (work in progress), | |||
| July 2004. | July 2004. | |||
| [RFC4090] Pan, P., Swallow, G., and A. Atlas, "Fast Reroute | [RFC4090] Pan, P., Swallow, G., and A. Atlas, "Fast Reroute | |||
| Extensions to RSVP-TE for LSP Tunnels", RFC 4090, | Extensions to RSVP-TE for LSP Tunnels", RFC 4090, | |||
| May 2005. | May 2005. | |||
| End of changes. 35 change blocks. | ||||
| 77 lines changed or deleted | 96 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
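
The first three coverage definitions listed under "Analysis of repair coverage" differ only in how the same set of protected (failure, destination) cases is aggregated. The following is a minimal sketch of that aggregation, not part of either draft. The function name `coverage_metrics`, its arguments, and the representation of protection as a set of (failure, destination) pairs are assumptions made for illustration. The fourth definition (percentage of packets) is omitted because it additionally requires a traffic matrix and routing information.

```python
def coverage_metrics(failures, destinations, protected):
    """Aggregate one set of protected cases in three ways.

    failures:     iterable of link (or node) identifiers that may fail
    destinations: iterable of destination identifiers
    protected:    set of (failure, destination) pairs for which a
                  repair path exists (hypothetical input format)
    """
    failures = list(failures)
    destinations = list(destinations)

    # Definition 1: failures that are protected for every destination.
    full_failures = sum(
        all((f, d) in protected for d in destinations) for f in failures
    )

    # Definition 2: destinations that are protected for every failure.
    full_destinations = sum(
        all((f, d) in protected for f in failures) for d in destinations
    )

    # Definition 3: protected share of all d*f potential failure cases.
    cases = sum((f, d) in protected for f in failures for d in destinations)

    return {
        "failures_fully_protected": 100.0 * full_failures / len(failures),
        "destinations_fully_protected": 100.0 * full_destinations / len(destinations),
        "cases_protected": 100.0 * cases / (len(failures) * len(destinations)),
    }


# Two links, two destinations; only the case (L2, B) has no repair path.
example = coverage_metrics(
    ["L1", "L2"],
    ["A", "B"],
    {("L1", "A"), ("L1", "B"), ("L2", "A")},
)
print(example)
```

In this example a single unprotected case leaves 75% of the d*f cases protected, but only 50% of links and 50% of destinations fully protected. This is why the text asks for the definition to be stated when repair strategies are compared.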