< draft-bryant-ipfrr-tunnels-00.txt   draft-bryant-ipfrr-tunnels-01.txt >
Network Working Group S. Bryant
Internet Draft C. Filsfils
Expiration Date: Apr 2005 S. Previdi
M. Shand
Cisco Systems
INTERNET DRAFT IP Fast-reroute May 2004 Oct 2004
Network Working Group S. Bryant
Internet Draft C. Filsfils
Expiration Date: Nov 2004 S. Previdi
M. Shand
Cisco Systems
May 2004
IP Fast Reroute using tunnels
draft-bryant-ipfrr-tunnels-00.txt
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsolete by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress".
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This draft describes an IP fast re-route mechanism that provides
backup connectivity in the event of a link or router failure. In the
absence of single points of failure and asymmetric costs, the
mechanism provides complete protection against any single failure. If
perfect repair is not possible, the identity of all the unprotected
links and routers is known in advance. The draft also describes the
mechanisms needed to prevent the packet loss caused by loops which
normally occur during the reconvergence of the network following a
failure.
Table of Contents
1. Introduction......................................................5
2. Goals, non-goals, limitations and constraints.....................5
2.1. Goals.........................................................5
2.2. Non-Goals.....................................................6
2.3. Limitations...................................................6
2.4. Constraints...................................................6
3. Repair Paths......................................................7
3.1. Tunnels as Repair Paths.......................................7
3.2. Tunnel Requirements..........................................10
3.2.1. Setup....................................................10
3.2.2. Multipoint...............................................10
3.2.3. Directed forwarding......................................10
3.2.4. Security.................................................10
4. Construction of Repair Paths.....................................10
4.1. Identifying Repair Path Targets..............................10
4.2. Determining Tunneled Repair Paths............................11
4.2.1. Computing Repair Paths...................................12
4.2.2. Extended P-space.........................................13
4.2.3. Downstream Paths.........................................13
4.2.4. Selecting Repair Paths...................................13
4.3. Assigning Traffic to Repair Paths............................14
4.4. When no Repair Path is Possible..............................14
4.4.1. Unreachable Target.......................................15
4.4.2. Asymmetric Link Costs....................................15
4.4.3. Interference Between Potential Node Repair Paths.........15
4.5. Multi-homed Prefixes.........................................18
4.6. Equal Cost Path Splits.......................................19
4.6.1. Equal Cost Path Splits as Link Repair Paths..............19
4.6.2. Equal Cost Path Splits and Node Failure..................20
4.7. LANs and pseudonodes.........................................20
4.7.1. The Link between Routers A and B is a LAN................21
4.7.1.1. Case 1...............................................21
4.7.1.2. Case 2...............................................21
4.7.1.3. Simplified LAN repair................................22
4.7.2. A LAN exists at the release point........................22
4.7.3. A LAN between B and its neighbors........................22
4.7.4. The LAN is a Transit Subnet..............................23
5. Failure Detection and Repair Path Activation.....................23
5.1. Failure Detection............................................23
5.2. Repair Path Activation.......................................23
5.3. Node Failure Detection Mechanism.............................23
6. Loop Free Transition.............................................24
6.1. Incremental Cost Advertisement...............................24
6.2. Single Tunnel Per Router.....................................25
6.3. Distributed Tunnels..........................................25
6.4. Ordered SPFs.................................................26
7. Restoring Failed Components to Service...........................26
8. Implications for Network Management..............................26
9. IPFRR Capability.................................................27
10. Enhancements to routing protocols...............................27
11. IANA considerations.............................................27
12. Security Considerations.........................................27
Terminology
This section defines words, acronyms, and actions used in this draft.
A Frequently used to denote a router that is
the source of a repair path computed in
anticipation of the failure of a neighboring
router denoted as B.
B Frequently used to denote a router whose
anticipated failure is the subject of repair
path computations.
Directed The ability of the repairing router (A) to
forwarding specify the next hop (Q) on exit from a
tunnel end-point (P)
Extended The union of the p-space of the neighbors of
P-space a specific router with respect to a common
component.
Extended p-space does not include the
additional space reachable though directed
forwarding.
FIB Forwarding Information Base. The database
used by the packet forwarder to determine
what actions to perform on a packet
IPFRR IP fast re-route
P The router in P-space to which a packet is
tunneled for repair.
PQ A router that is in both P and Q space and
hence does not need directed forwarding.
P-space P-space is the set of routers reachable from
a specific router without any path
(including equal cost path splits)
transiting a specified component.
For example, the P-space of A, is the set of
routers that A can reach without using B
(router failure case) or the A-B link
failure case).
Q The router in Q space, to which the packet
is directed by router P on exit from the
repair tunnel. Q will always be adjacent to
P, or P itself.
Q-space Q-space is the set of routers from which a
specific router can be reached without any
path (including equal cost path splits)
transiting a specified component.
Routing The process whereby routers converge on a
transition new topology. In conventional networks this
process frequently causes some disruption to
packet delivery.
RPF Reverse Path Forwarding. I.e. checking that
a packet is received over the interface
which would be used to send packets
addressed to the source address of the
packet.
SPF Shortest Path First, e.g. Dijkstra's
algorithm.
SPT Shortest path tree
1. Introduction
When the topology of a network changes (due to link or router
failure, recovery or management action), the routers need to converge
on a common view of the new topology. During this process, referred
to as a routing transition, packet delivery between certain
source/destination pairs may be disrupted. This occurs due to the
time it takes for the topology change to be propagated around the
network plus the time it takes each individual router to determine
and then update the forwarding information base (FIB) for the
affected destinations. During this transition, packets are lost due
to the continuing attempts to use of the failed component, and due to
forwarding loops. Forwarding loops arise due to the inconsistent FIBs
that occur as a result of the difference in time taken by routers to
execute the transition process.
The service failures caused by routing transitions are largely hidden
by higher-level protocols that retransmit the lost data. However new
Internet services are emerging which are more sensitive to the packet
disruption that occurs during a transition. To make the transition
transparent to their users, these services require a short routing
transition. Ideally, routing transitions would be completed in zero
time with no packet loss.
Regardless of how optimally the mechanisms involved have been
designed and implemented, it is inevitable that a routing transition
will take some minimum interval that is greater than zero. The
solution described here uses pre-computed backup routes and
controlled notification of network changes. A set of repair paths
temporarily provides substitute connectivity in place of a link, or
router that has failed. Once the set of repair paths has been
activated, there should be no further packet loss as a result of the
associated failure. To achieve the maximum benefit from repair paths,
they must be activated immediately a failure has been detected, and a
controlled transition to normal operation invoked to prevent packet
loss due to micro-looping. The packet loss attributable to the
failure will then be confined to the unavoidable loss that occurs as
a result of the latency of the failure detection mechanism itself.
The mechanisms described here have been designed for use with any
link-state routing protocol.
2. Goals, non-goals, limitations and constraints
2.1. Goals
The following are the goals of IPFRR:
o Protect against any link or router failure in the network.
o No constraints on the network topology or link costs.
o Never worse than the existing routing convergence mechanism. IP Fast Reroute using tunnels
o Co-existence with non-IP fast-reroute capable routers in the draft-bryant-ipfrr-tunnels-01.txt
network.
2.2. Non-Goals Status of this Memo
The following are non-goals of IPFRR: By submitting this Internet-Draft, we certify that any applicable
patent or other IPR claims of which we are aware have been
disclosed, or will be disclosed, and any of which we become aware
will be disclosed, in accordance with RFC 3668.
o Protection of a single point of failure. Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
o To provide protection in the presence of multiple concurrent Internet-Drafts are draft documents valid for a maximum of six
failures other than those that occur due to the failure of a months and may be updated, replaced, or obsoleted by other
single router. documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than a "work in
progress."
o Shared risk group protection. The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
o Complete fault coverage in networks that make use of The list of Internet-Draft Shadow Directories can be accessed at
asymmetric costs. http://www.ietf.org/shadow.html
2.3. Limitations Abstract
The following limitations apply to IPFRR: This draft describes an IP fast re-route mechanism that provides
backup connectivity in the event of a link or router failure. In
the absence of single points of failure and asymmetric costs, the
mechanism provides complete protection against any single failure.
If perfect repair is not possible, the identity of all the
unprotected links and routers is known in advance.
o Because the mechanisms described here rely on complete Table of Contents
topological information from the link state routing protocol, 1. Introduction......................................................4
they will only work within a single link state flooding 2. Goals, non-goals, limitations and constraints.....................4
domain. 2.1. Goals.........................................................4
2.2. Non-Goals.....................................................5
2.3. Limitations...................................................5
2.4. Constraints...................................................5
3. Repair Paths......................................................6
3.1. Tunnels as Repair Paths.......................................6
3.2. Tunnel Requirements...........................................9
3.2.1. Setup.....................................................9
3.2.2. Multipoint................................................9
3.2.3. Directed forwarding.......................................9
3.2.4. Security..................................................9
4. Construction of Repair Paths.....................................10
4.1. Identifying Repair Path Targets..............................10
4.2. Determining Tunneled Repair Paths............................10
4.2.1. Computing Repair Paths...................................11
4.2.2. Extended F-space.........................................12
4.2.3. Loop-free Alternates.....................................12
4.2.4. Selecting Repair Paths...................................12
4.3. Assigning Traffic to Repair Paths............................13
4.4. When no Repair Path is Possible..............................13
4.4.1. Unreachable Target.......................................14
4.4.2. Asymmetric Link Costs....................................14
4.4.3. Interference Between Potential Node Repair Paths.........14
4.5. Multi-homed Prefixes.........................................17
4.6. LANs and pseudo-nodes........................................18
4.6.1. The Link between Routers S and E is a LAN................19
4.6.1.1. Case 1...............................................19
4.6.1.2. Case 2...............................................19
4.6.1.3. Simplified LAN repair................................20
4.6.2. A LAN exists at the release point........................20
4.6.3. A LAN between E and its neighbors........................20
4.6.4. The LAN is a Transit Subnet..............................21
5. Failure Detection and Repair Path Activation.....................21
5.1. Failure Detection............................................21
5.2. Repair Path Activation.......................................21
5.3. Node Failure Detection Mechanism.............................21
6. Loop Free Transition.............................................22
7. IPFRR Capability.................................................22
8. Enhancements to routing protocols................................23
9. IANA considerations..............................................23
10. Security Considerations.........................................23
o Reverse Path Forwarding (RPF) checks cannot be used in Terminology
conjunction with IPFRR. This is because the use of tunnels
may result in packets arriving over different interfaces than
expected.
2.4. Constraints This draft uses the terms defined in [FRMWK]. This section defines
additional words, acronyms, and actions used in this draft.
The following constraints are assumed: Directed The ability of the repairing router (S) to
forwarding specify the next hop (G) on exit from a
tunnel end-point (F)
o Following a failure, only the routers adjacent to the failure Extended F- The union of the F-space of the neighbors of
have any knowledge of the failure. space a specific router with respect to a common
component.
o There is insufficient time following a failure to compute a Extended F-space does not include the
repair strategy based on knowledge of the specific failure additional space reachable though directed
that has occurred. forwarding.
o Multiple concurrent failures may not be protected. F The router in F-space to which a packet is
tunneled for repair.
3. Repair Paths FG A router that is in both F and G space and
hence does not need directed forwarding.
When a router detects an adjacent failure, it uses a set of repair F-space F-space is the set of routers reachable from
paths in place of the failed component, and continues to use this a specific router without any path
until the completion of the routing transition. Only routers adjacent (including equal cost path splits)
to the failed component are aware of the nature of the failure. Once transiting a specified component.
the routing transition has been completed, the router will have no
further use for the repair paths since all routers in the network
will have revised their forwarding data and the failed link will have
been eliminated from this computation.
Repair paths are pre-computed in anticipation of later failures so For example, the F-space of S, is the set of
they can be promptly activated when a failure is detected. routers that S can reach without using E
(router failure case) or the S-E link
failure case).
Three types of repair path are considered here. G The router in G space, to which the packet
is directed by router F on exit from the
repair tunnel. G will always be adjacent to
F, or F itself.
1. Equal cost path-split. G-space G-space is the set of routers from which a
specific router can be reached without any
path (including equal cost path splits)
transiting a specified component.
Where a link is being used as a member of an equal cost path-split Interference The condition where the network costs are
set for some destination, the other members of the set may be used such that a repairing router cannot tunnel a
to provide an alternative path, provided that they avoid the packet sufficiently far from a failed node
network component being protected. such that it is not attracted back to the
failed node via another of that node's
neighbors.
2. Downstream Path. 1. Introduction
A 'downstream path' is a next hop that will get a packet nearer to When a link or node failure occurs in a routed network, there is
its destination. It does not necessarily represent the shortest inevitably a period of disruption to the delivery of traffic until
path to the destination but has the property that a packet sent on the network re-converges on the new topology. Packets for
it will not loop back because, having traversed this hop, it is destinations which were previously reached by traversing the failed
then closer to its destination. component may be dropped or may suffer looping. Traditionally such
disruptions have lasted for periods of at least several seconds,
and most applications have been constructed to tolerate such a
quality of service.
3. Tunnel. Recent advances in routers have reduced this interval to under a
second for carefully configured networks using link state IGPs.
However, new Internet services are emerging which may be sensitive
to periods of traffic loss which are orders of magnitude shorter
than this.
A tunneled repair path tunnels traffic to some staging point from Addressing these issues is difficult because the distributed nature
which it will travel to its destination using normal forwarding of the network imposes an intrinsic limit on the minimum
without looping back. The repair path can be thought of as convergence time which can be achieved.
providing a virtual link, originating at a router adjacent to a
failure, and diverting traffic around the failure.
3.1. Tunnels as Repair Paths However, there is an alternative approach, which is to compute
backup routes that allow the failure to be repaired locally by the
router(s) detecting the failure without the immediate need to
inform other routers of the failure. In this case, the disruption
time can be limited to the small time taken to detect the adjacent
failure and invoke the backup routes. This is analogous to the
technique employed by MPLS Fast Reroute [MPLSFRR], but the
mechanisms employed for the backup routes in pure IP networks are
necessarily very different.
The repair strategies described in this draft operate on the basis A framework for IP Fast Reroute [IPFRR] provides a summary of the
that if a packet can somehow be sent to the other side of the proposed IPFRR solutions, and a partial solution using equal cost
failure, it will subsequently proceed towards its destination exactly multi-path and loop-free alternate case is described in [BASIC].
as if it had traversed the failed component. See Figure 1.
Repair Path from A to B This draft describes extensions to the basic repair mechanism in
+-----------+ which we propose the use of tunnels to provide additional logical
| | downstream paths. These mechanisms provide almost 100% repair
| v connectivity in practical networks.
---->[A]---//----[B]----->
Figure 1 Simple Link Repair 2. Goals, non-goals, limitations and constraints
Creating a repair path from A to B may require a packet to traverse 2.1. Goals
an unnatural route. If a suitable natural path starts at a neighbor
(i.e. it is a downstream path), then A can force the packet directly
there. If this is not the case, then A must use a tunnel to force the
packet down the repair path. Note that the tunnel does not have to go
from A to B. The tunnel can terminate at any router in the network,
provided that A can be sure that the packet will proceed correctly to
its destination from that router.
A repair path computed for a link failure may not however work The following are the goals of IPFRR:
satisfactorily when the neighboring router has, itself, failed. This
is illustrated in Figure 2.
Repair path from A to B o Protect against any link or router failure in the network.
+-------------------------+
| |
| <------------+
--->[A]---//----[B]----//-----[C]-->
+----------> |
| |
+-------------------------+
Repair Path from C to B
Figure 2 Looping Link Repair when Router Fails o No constraints on the network topology or link costs.
Consider the case of a router B with just two neighbors A and C. When o Never worse than the existing routing convergence
router B fails, both A and C will observe the failure of their local mechanism.
link to B, but will have no immediate knowledge that B itself has
failed. If they were both to attempt to repair traffic around their
local link, they would invoke mutual repairs which would loop.
Since it is not easy for a router to immediately distinguish between o Co-existence with non-IP fast-reroute capable routers in
a link failure and the failure of its neighbor, repair paths are the network.
calculated in anticipation of adjacent router failure. Thus, for each
of its protected links, router A (Figure 3) pre-computes a set of
tunneled repair paths, one for each of the neighbors (C,D,E) of its
neighbor B on the A-B link. The set of destinations that are normally
assigned to link A-B will be assigned to a repair path based on the
neighbor of B through which router B would have forwarded traffic to
them.
Repair AC 2.2. Non-Goals
+----------->[C]
| |
| |
| |
----->[A]----//-----[B]---------[D]
|| | ^
|| | |
|| Repair AE | |
|+---------->[E] |
| |
+-------------------------+
Repair AD
Figure 3: Repair paths in anticipation of a router failure The following are non-goals of IPFRR:
The set of repair paths in Figure 3 will function correctly in the o Protection of a single point of failure.
case of link and router failure. However, in some network topologies
they may not provide a means for traffic to reach router B itself.
This is important in cases where B is a single point of failure and B
is still functional (i.e. the failure was actually a failure of the
A-B link). Hence, in addition to computing repair paths for the
neighbors of its neighbor on a protected link, a router also
calculates a repair path for the neighbor itself. This is illustrated
in Figure 4.
Repair AB o To provide protection in the presence of multiple
+----------------+ concurrent failures other than those that occur due to the
| | failure of a single router.
| Repair AC |
|+---------->[C] |
|| | /
|| | /
|| |/
----->[A]----//-----[B]---------[D]
|| | ^
|| | |
|| Repair AE | |
|+---------->[E] |
| |
+-------------------------+
Repair AD
Figure 4 The full set of A-B repair paths. o Shared risk group protection.
In the event of a failure, the only traffic that is assigned to the o Complete fault coverage in networks that make use of
link repair path (the AB repair) is that traffic which has no other asymmetric costs.
path to its destination except via B. As we have already seen, there
is a danger that traffic assigned to this link repair path may loop
if B has failed, therefore, when the repair paths are invoked, a loop
detection mechanism is used which promptly detects the loop and, upon
detection, withdraws the link (A-B) repair path from service.
3.2. Tunnel Requirements 2.3. Limitations
The specific tunneling mechanism used to provide a repair path is The following limitations apply to IPFRR:
outside the scope of this document. However the following sections
describe the requirements for the tunneling mechanism.
3.2.1. Setup. o Because the mechanisms described here rely on complete
topological information from the link state routing
protocol, they will only work within a single link state
flooding domain.
When a failure is detected, it is necessary to immediately redirect o Reverse Path Forwarding (RPF) checks cannot be used in
traffic to the repair paths. Consequently, the tunnels used must be conjunction with IPFRR. This is because the use of tunnels
provisioned beforehand in anticipation of the failure. IP fast re- may result in packets arriving over different interfaces
route will determine which tunnels it requires. It must therefore be than expected.
possible to establish tunnels automatically, without management
action, and without the need to manually establish context at the
tunnel endpoint.
3.2.2. Multipoint 2.4. Constraints
To reduce the number of tunnel endpoints in the network the tunnels The following constraints are assumed:
should be be multi-point tunnels capable of receiving repair traffic
from any IPFRR router in the network.
3.2.3. Directed forwarding. o Following a failure, only the routers adjacent to the
failure have any knowledge of the failure.
Directed forwarding must be supported such that the router at the o There is insufficient time following a failure to compute a
tunnel endpoint (P) can be directed by the router at the tunnel repair strategy based on knowledge of the specific failure
source (A) to forward the packet directly to a specific neighbor. that has occurred.
Specification of the directed forwarding mechanism is outside the
scope of this document.
3.2.4. Security o Multiple concurrent failures may not be protected.
A lightweight security mechanism should be supported to prevent the 3. Repair Paths
abuse of the repair tunnels by an attacker. This is discussed in more
detail in Section 12.
4. Construction of Repair Paths When a router detects an adjacent failure, it uses a set of repair
paths in place of the failed component, and continues to use this
until the completion of the routing transition. Only routers
adjacent to the failed component are aware of the nature of the
failure. Once the routing transition has been completed, the router
will have no further use for the repair paths since all routers in
the network will have revised their forwarding data and the failed
link will have been eliminated from this computation.
4.1. Identifying Repair Path Targets Repair paths are pre-computed in anticipation of later failures so
they can be promptly activated when a failure is detected.
To establish protection for a link or node it is necessary to Three types of repair path are used to achieve the repair:
determine which neighbors of the neighboring node should be targets
of repair paths. Normally all neighbors will be used as repair path
targets. However, in some topologies, not all neighbors will be
needed as targets because, prior to the failure, no traffic was being
forwarded through them by the repairing router. This can determined
by examining the normal spanning tree computed by the repairing
router.
In addition, the neighboring router B will also be the target of a 1. Equal cost path-split.
repair path for any destinations for which B is a single point of
failure.
4.2. Determining Tunneled Repair Paths 2. Loop-free Alternate.
The objective of each tunneled repair path is to deliver traffic to a 3. Tunnel.
target router when a link is observed to have failed. However, it is
seldom possible to use the target router itself as the tunnel
endpoint because other routers on the repair path, that have not
learned of the failure, will forward traffic addressed to it using
their least cost path which may be via the failed link. This is
illustrated in Figure 5 in which all link costs are one in both
directions. Router A's intended repair path for traffic to D when
link A-B fails is the path W-X-Y-Z-D. However, if router A makes D be
the tunnel endpoint and forwards the packet to router W, router W
will immediately return it to A because its least cost path to D is
A-B-D (cost 3 versus cost 4) and has no knowledge of the failure of
link A-B.
[A]--//--[B]------[D] The operation of equal cost path-split and loop-free alternate is
| | described in [BASIC]. A tunneled repair path tunnels traffic to
| | some staging point from which it will travel to its destination
[W]---[X]---[Y]---[Z] using normal forwarding without looping back. The repair path can
be thought of as providing a virtual link, originating at a router
adjacent to a failure, and diverting traffic around the failure.
This is equivalent to providing a virtual loop-free alternate to
supplement the physical loop-free alternates.
Figure 5. Repair path to target router D. 3.1. Tunnels as Repair Paths
Thus the tunnel endpoint needs to be somewhere on the repair path The repair strategies described in this draft operate on the basis
such that packets addressed to the tunnel end point will not loop that if a packet can somehow be sent to the other side of the
back towards router A. In addition, the release point needs to be failure, it will subsequently proceed towards its destination
somewhere such that when packets are released from the tunnel they exactly as if it had traversed the failed component. See Figure 1.
will flow towards the target router (or their actual destination)
without being attracted back to the failed link. By inspection, in
Figure 5, suitable tunnel endpoints are routers X, Y, and Z.
Note that it is not essential that traffic assigned to a repair path Repair Path from S to E
actually traverse the target router for which the repair path was +-----------+
created. If, for example, in Figure 5, a packet's destination were | |
normally reached via the path A-B-D-Z-?-?-?, once released at any of | v
the possible tunnel endpoints, it would arrive at its destination by ---->[S]---//----[E]----->
the best available route without traversing D.
In general, the properties that are required of tunnel endpoints are: Figure 1 Simple Link Repair
o the end point must be reachable from the tunnel source Creating a repair path from S to E may require a packet to traverse
without traversing the failed link; and an unnatural route. If a suitable natural path starts at a neighbor
(i.e. it is a loop-free alternate), then S can force the packet
directly there. If this is not the case, then S may create one by
using a tunnel to carry the packet to a point in the network where
there is a real loop-free alternate. Note that the tunnel does not
have to go from S to E. The tunnel can terminate at any router in
the network, provided that S can be sure that the packet will
proceed correctly to its destination from that router.
o once released, tunneled packets will proceed towards their A repair path computed for a link failure may not however work
destination without being attracted back over the failed link satisfactorily when the neighboring router has, itself, failed.
or node. This is illustrated in Figure 2.
Provided both of these conditions are met, packets forwarded on the Repair path from S to E
repair path will not loop. +-------------------------+
| |
| <------------+
--->[S]---//----[E]----//-----[S1]-->
+----------> |
| |
+-------------------------+
Repair Path from S1 to E
In some topologies it will not be possible to find a tunnel endpoint Figure 2 Looping Link Repair when Router Fails
that exhibits both the required properties. For example, in Figure 5,
if the cost of link X-Y were increased from one to four in both
directions, there is no longer a viable endpoint within the fragment
of the topology shown.
To solve this problem we introduce the concept of directed forwarding Consider the case of a router E with just two neighbors S and S1.
from the tunnel endpoint. Directed forwarding allows the originator When router E fails, both S and S1 will observe the failure of
of a tunneled packet to instruct that, when it is de-capsulated at their local link to E, but will have no immediate knowledge that E
the end of the tunnel, it be forwarded via a specific adjacency, and itself has failed. If they were both to attempt to repair traffic
not be subjected to the normal forwarding decision process. This around their local link, they would invoke mutual repairs which
effectively allows the tunnel to be extended by one hop. So, for would loop.
example, in Figure 5 with the cost of link X-Y set to four, it would
be possible to select X as the tunnel endpoint with the directive
that X always forward the packets it decapsulates via the
adjacency to Y. Thus, router X is reached from A using normal
forwarding, and directed forwarding is then used to force packets to
router Y, from where D can be reached using normal forwarding.
Provided link costs are symmetrical, it can be proved that it is Since it is not easy for a router to immediately distinguish
always possible to compute a tunneled repair path (possibly using between a link failure and the failure of its neighbor, repair
directed forwarding) around a link failure. paths are calculated in anticipation of adjacent router failure.
Thus, for each of its protected links, router S (Figure 3)
pre-computes a set of tunneled repair paths, one for each of the
neighbors (S1, S2 and S3) of its neighbor E on the S-E link. The
set of destinations that are normally assigned to link S-E will be
assigned to a repair path based on the neighbor of E through which
router E would have forwarded traffic to them.
The tunnel endpoint (P) and the release point (Q) may be coincident, Repair S-S1
or may be separated by at most one hop. +----------->[S1]
| |
| |
| |
----->[S]----//-----[E]---------[S2]
|| | ^
|| | |
||Repair S-S3 | |
|+---------->[S3] |
| |
+-------------------------+
Repair S-S2
4.2.1. Computing Repair Paths Figure 3: Repair paths in anticipation of a router failure
For a router A, determining tunneled repair paths around a The set of repair paths in Figure 3 will function correctly in the
neighboring router B, the set of potential tunnel end points includes case of link and router failure. However, in some network
all the routers that can be reached from A using normal forwarding topologies they may not provide a means for traffic to reach router
without traversing the failed link A-B. This is termed the "P-space" E itself. This is important in cases where E is a single point of
of A with respect to the failure of B. Any router that is on an equal failure and E is still functional (i.e. the failure was actually a
cost path split via the failed link is excluded from this set. failure of the S-E link). Hence, in addition to computing repair
paths for the neighbors of its neighbor on a protected link, a
router also calculates a repair path for the neighbor itself. This
is illustrated in Figure 4.
The resulting set defines all the possible tunnel end points that Repair S-E
could be used in repair paths originating at router A for the failure +----------------+
of link A-B. This set can be obtained by computing a spanning tree | |
rooted at A and excising the subtree reached via the A-B link. | Repair S-S1 |
|+---------->[S1]|
|| | /
|| | /
|| |/
----->[S]----//-----[E]---------[S2]
|| | ^
|| | |
||Repair S-S3 | |
|+---------->[S3] |
| |
+-------------------------+
Repair S-S2
The set of possible release points can be determined by computing the Figure 4 The full set of S-E repair paths.
set of routers that can reach the repair path target without
traversing the failed link. This is termed the "Q-space" of the
target with respect to the failure. The Q-space can be obtained by
computing a reverse spanning tree rooted at the repair path target,
with the subtree which traverses the failed link (or node) excised.
The reverse spanning tree uses the cost towards the root rather than In the event of a failure, the only traffic that is assigned to the
from it and yields the best paths towards the root from other nodes link repair path (the S-E repair) is that traffic which has no
in the network. other path to its destination except via E. As we have already
seen, there is a danger that traffic assigned to this link repair
path may loop if E has failed, therefore, when the repair paths are
invoked, a loop detection mechanism is used which promptly detects
the loop and, upon detection, withdraws the link (S-E) repair path
from service.
The intersection of the target's Q-space with A's P-space includes 3.2. Tunnel Requirements
all the possible release points for any repair path not employing
directed forwarding. Where there is no intersection, but there exist
a pair of routers, P in A's P-space and Q in the target's Q-space,
router P can be used as the tunnel endpoint with directed forwarding
to the release point Q.
4.2.2. Extended P-space There are a number of IP in IP tunnel mechanisms that may be used
to fulfill the requirements of this design. Suitable candidates
include IP-in-IP [RFC1853], GRE [RFC1701] and L2TPv3 [L2TPv3]. The
selection of the specific tunneling mechanism (and any necessary
enhancements) used to provide a repair path is outside the scope of
this document. However the following sections describe the
requirements for the tunneling mechanism.
The description in section 4.2.1 calculated router A's P-space rooted 3.2.1. Setup.
at A itself. However, since router A will only use a repair path when
it has detected the failure of the link A-B, the initial hop of the
repair path need not be subject to A's normal forwarding decision
process. Thus we introduce the concept of extended P-space. Router
A's extended P-space is the union of the P-spaces of each of A's
neighbors. The use of extended P-space may allow router A to repair
to targets that were otherwise unreachable.
4.2.3. Downstream Paths When a failure is detected, it is necessary to immediately redirect
traffic to the repair paths. Consequently, the tunnels used must be
provisioned beforehand in anticipation of the failure. IP fast
re-route will determine which tunnels it requires. It must
therefore be possible to establish tunnels automatically, without
management action, and without the need to manually establish
context at the tunnel endpoint.
Under certain circumstances, the target's Q-space will include a 3.2.2. Multipoint
router that is a neighbor of A. This is traditionally referred to as
a downstream path and has the property that a packet sent on it will
not loop back because, having traversed this hop, it is then closer
to its destination. A trivial example of this is shown in Figure 6.
[A]--//---[B] To reduce the number of tunnel endpoints in the network the tunnels
| | should be multi-point tunnels capable of receiving repair traffic
2 | from any IPFRR router in the network.
| |
[D]-------[C]
Figure 6. A topology that will permit a single-hop release point 3.2.3. Directed forwarding.
When a downstream path exists, no tunneling is required. Directed forwarding must be supported such that the router at the
tunnel endpoint (F) can be directed by the router at the tunnel
source (S) to forward the packet directly to a specific neighbor.
Specification of the directed forwarding mechanism is outside the
scope of this document. Directed forwarding might be provided using
an enhancement to the IP tunneling encapsulation, or it might be
provided through the use of a single MPLS label stack entry
[RFC3032] interposed between the IP tunnel header and the packet
being repaired.
4.2.4. Selecting Repair Paths 3.2.4. Security
The mechanism described in section 4.2 will identify all the possible A lightweight security mechanism should be supported to prevent the
release points that can be used to reach each particular target. (The abuse of the repair tunnels by an attacker. This is discussed in
circumstances when no release points exist are described in more detail in Section 10.
section 4.4.) In a well-connected network there are likely to be
multiple possible release points for each target, and all will work
correctly. For simplicity, one release point per target is chosen.
All will deliver the packets correctly so, arguably, it does not
matter which is chosen. However, one release point may be preferred
over the others on the basis of path cost or some other criteria. It
is an implementation matter as to how the release point is selected.
4.3. Assigning Traffic to Repair Paths 4. Construction of Repair Paths
Once the repair path for each target has been selected, it is 4.1. Identifying Repair Path Targets
necessary to determine which of the destinations normally reached via
the protected link should be assigned to which of the repair paths
when the link fails.
This is achieved by recording which neighbor of B would be used to To establish protection for a link or node it is necessary to
reach each destination reachable over A-B when running the original determine which neighbors of the neighboring node should be targets
SPF. Traffic assignment is then simply a matter of assigning the of repair paths. Normally all neighbors will be used as repair path
traffic which B would have forwarded via each neighbor to the repair targets. However, in some topologies, not all neighbors will be
path which has that neighbor as its target. needed as targets because, prior to the failure, no traffic was
being forwarded through them by the repairing router. This can be
determined by examining the normal shortest path tree (SPT)
computed by the repairing router.
Although the repair paths are calculated based on traffic addressed In addition, the neighboring router E will also be the target of a
to specific targets, it can be proved that the traffic assignment repair path for any destinations for which E is a single point of
algorithm guarantees that the repair path can be used for any traffic failure.
assigned to it.
Where B would normally split the traffic to a particular destination 4.2. Determining Tunneled Repair Paths
via two or more of its neighbors, it is an implementation decision
whether the repaired traffic should be split across the corresponding
set of repair paths.
The repair path to B itself is normally used just for traffic The objective of each tunneled repair path is to deliver traffic to
destined for B and any prefixes advertised by B. However, under some a target router when a link is observed to have failed. However, it
circumstances, it may be impossible to compute a repair path to one is seldom possible to use the target router itself as the tunnel
or more of B's neighbors, for example, because B is a single point of endpoint because other routers on the repair path, that have not
failure. In this case traffic for the destinations served by the learned of the failure, will forward traffic addressed to it using
otherwise irreparable targets is assigned to the repair path with B their least cost path which may be via the failed link. This is
as its target, in the optimistic assumption that router B is still illustrated in Figure 5 in which all link costs are one in both
functioning. If router B is indeed still functioning, this will directions. Router S's intended repair path for traffic to D when
ensure delivery of the traffic. If, however, router B has failed, the link S-E fails is the path W-X-Y-Z-S1. However, if router S makes
traffic on this repair path will loop as previously shown in S1 be the tunnel endpoint and forwards the packet to router W,
section 3.1. The way this is detected, and the course of action when router W will immediately return it to S because its least cost
it is detected, are described in section 5.3. path to S1 is S-E-S1 (cost 3 versus cost 4) and has no knowledge of
the failure of link S-E.
4.4. When no Repair Path is Possible [S]--//--[E]-----[S1]
| |
| |
[W]---[X]---[Y]---[Z]
Under some circumstances, it will not be possible to identify a Figure 5. Repair path to target router S1.
repair path to one or more of the targets. This can occur for the
following reasons:
o The neighboring router that is presumed to have failed Thus the tunnel endpoint needs to be somewhere on the repair path
constitutes a single point of failure in the network. such that packets addressed to the tunnel end point will not loop
back towards router S. In addition, the release point needs to be
somewhere such that when packets are released from the tunnel they
will flow towards the target router (or their actual destination)
without being attracted back to the failed link. By inspection, in
Figure 5, suitable tunnel endpoints are routers X, Y, and Z.
o Severely asymmetric link costs may cause an otherwise viable Note that it is not essential that traffic assigned to a repair
physical repair path to be unusable. path actually traverse the target router for which the repair path
was created. If, for example, in Figure 5, if a packet's
destination were normally reached via the path S-E-S1-Z-?-?-?, once
it was released at any of the possible tunnel endpoints, it would
arrive at its destination by the best available route without
traversing S1.
o Interference may occur between the repair paths of individual In general, the properties that are required of tunnel endpoints
targets. are:
In practice, these cases are unlikely to be encountered frequently. o the end point must be reachable from the tunnel source
Networks that will benefit from the mechanisms described here will without traversing the failed link; and
usually exhibit considerable redundancy and are normally operated
with largely symmetric link costs. Note that a router's inability to
compute a full set of repair paths for one of its links does not
necessarily affect its ability to do so for its other links.
Example topologies illustrating each of the three cases above are o when released, tunneled packets will proceed towards their
described in the following subsections. destination without being attracted back over the failed
link or node.
4.4.1. Unreachable Target Provided both of these conditions are met, packets forwarded on the
repair path will not loop.
If the failure of a neighboring router makes one or more of its In some topologies it will not be possible to find a tunnel
neighbors genuinely unreachable, clearly it will not be possible to endpoint that exhibits both the required properties. For example,
establish a repair path to such targets. Such single points of in Figure 5, if the cost of link X-Y were increased from one to
failure are not expected to be encountered frequently in properly four in both directions, there is no longer a viable endpoint
designed networks, and will probably occur only when the network has within the fragment of the topology shown.
previously suffered other failures that have reduced its
connectivity.
4.4.2. Asymmetric Link Costs To solve this problem we introduce the concept of directed
forwarding from the tunnel endpoint. Directed forwarding allows the
originator of a tunneled packet to instruct that, when it is
decapsulated at the end of the tunnel, it be forwarded via a
specific adjacency, and not be subjected to the normal forwarding
decision process. This effectively allows the tunnel to be extended
by one hop. So, for example, in Figure 5 with the cost of link X-Y
set to four, it would be possible to select X as the tunnel
endpoint with the directive that X always forward the packets it
decapsulates via the adjacency to Y. Thus, router X is reached
from S using normal forwarding, and directed forwarding is then
used to force packets to router Y, from where S1 can be reached
using normal forwarding.
When link costs have been set asymmetrically, it is possible that a Provided link costs are symmetrical, it can be proved that it is
repair path cannot be constructed even using directed forwarding. always possible to compute a tunneled repair path (possibly using
directed forwarding) around a link failure, and that the tunnel
endpoint (F) and the release point (G) will be coincident, or may
be separated by at most one hop.
Although it is trivial to construct a network fragment with this 4.2.1. Computing Repair Paths
property, this should not be regarded as a major problem. Firstly,
asymmetric link costs are seldom used deliberately. And, secondly,
even when an asymmetric link cost prevents one potential repair path
being used, there will normally be other ones available.
4.4.3. Interference Between Potential Node Repair Paths For a router S, determining tunneled repair paths around a
neighboring router E, the set of potential tunnel end points
includes all the routers that can be reached from S using normal
forwarding without traversing the failed link S-E. This is termed
the "F-space" of S with respect to the failure of E. Any router
that is on an equal cost path split via the failed link is excluded
from this set.
Under some circumstances the existence of one neighbor may interfere The resulting set defines all the possible tunnel end points that
with a potential repair path to another. Consider the topology shown could be used in repair paths originating at router S for the
in Figure 7 in which all links have a symmetrical cost of one, with failure of link S-E. This set can be obtained by computing an SPT
the exception of that between H and G, which has a cost of 3. In this rooted at S and excising the sub-tree reached via the S-E link.
example, the fact that router F is a neighbor of B prevents the
discovery of a repair path from router A to router C despite the
existence of an apparently suitable path.
[A]---//---[B]------ [C] The set of possible release points can be determined by computing
| | | the set of routers that can reach the repair path target without
| | | traversing the failed link. This is termed the "G-space" of the
[H]-3-[G]--[F]--[E]--[D] target with respect to the failure. The G-space can be obtained by
computing a reverse shortest path tree (rSPT) rooted at the repair
path target, with the sub-tree which traverses the failed link (or
node) excised. The rSPT uses the cost towards the root rather than
from it and yields the best paths towards the root from other nodes
in the network.
Figure 7. Interference between repair paths The intersection of the target's G-space with S's F-space includes
all the possible release points for any repair path not employing
directed forwarding. Where there is no intersection, but there
exist a pair of routers, F in S's F-space and G in the target's
G-space, router F can be used as the tunnel endpoint with directed
forwarding to the release point G.
A repair path from router A to F can use F itself as the release 4.2.2. Extended F-space
point by employing directed forwarding from G. However, it is not
possible to identify a suitable release point for a repair path to
router C within the topology shown since there is nowhere that
router A can reach that will subsequently forward traffic to router C
except via the forbidden link B-C (F's least cost path to C is
F-B-C). This is because the extended P-space of router A is separated
by more than one hop from the Q-space of router C.
Since the topology shown in Figure 7 will typically form part of a The description in section 4.2.1 calculated router S's F-space
much larger topology, a different, and possibly more circuitous rooted at S itself. However, since router S will only use a repair
repair path from A to C, that does not go via F, may be discovered. path when it has detected the failure of the link S-E, the initial
This is illustrated in Figure 8. In this enhanced topology, a repair hop of the repair path need not be subject to S's normal forwarding
path to C using Y as the release point can be used. decision process. Thus we introduce the concept of extended F-
space. Router S's extended F-space is the union of the F-spaces of
each of S's neighbors. The use of extended F-space may allow router
S to repair to targets that were otherwise unreachable.
[A]---//---[B]-------[C] 4.2.3. Loop-free Alternates
| | |
| | |
[H]-3-[G]--[F]--[E]--[D]
| |
| |
[X]--[Y]--[Z]
Figure 8. Resolving interference in a larger network When a loop-free alternate exists, no tunneling is required.
Note that, in Figure 8, if the traffic for C were assigned to the 4.2.4. Selecting Repair Paths
repair path for F, it would correctly reach C because F would assign
it to its repair path to C. That is, packets from A to C would travel
via two successive tunnels. Consequently, this is referred to as a
"secondary repair path". However, it is not always the case that
interference can be handled in this fashion and it is possible to
create looping repair paths.
One possibility of looping repair paths is illustrated in Figure 9. The mechanisms described above will identify all the possible
All links have a symmetrical cost of one with the exception of HG, release points that can be used to reach each particular target.
which is cost 3 in either direction, and ED and DC which are cost 5 (The circumstances when no release points exist are described in
in the indicated direction and cost 1 in the other. section 4.4.) In a well-connected network there are likely to be
multiple possible release points for each target, and all will work
correctly. For simplicity, one release point per target is chosen.
All will deliver the packets correctly so, arguably, it does not
matter which is chosen. However, one release point may be preferred
over the others on the basis of path cost or some other criteria.
[A]---//---[B]--------[C] It is an implementation matter as to how the release point is
| | |^ selected.
| | |5
[H]-3-[G]--[F]--[E]---[D]
5>
Figure 9 Looping secondary repair paths 4.3. Assigning Traffic to Repair Paths
In this topology, A can establish a repair path to F, but cannot Once the repair path for each target has been selected, it is
establish a repair path to C because of interference. Router A might necessary to determine which of the destinations normally reached
assign traffic intended for C onto its repair path to F expecting it via the protected link should be assigned to which of the repair
to undergo a secondary repair towards C. However, because of the paths when the link fails.
asymmetrical link costs, F is unable to establish a repair path to C.
It is only able to establish a repair path to A. If F, like A,
elected to forward repaired traffic to C using its (only) repair path
to A, similarly expecting a secondary repair to get it to its
destination, traffic for C would loop between A and F. Thus when
interference occurs, the possibility of a secondary repair path
cannot be relied upon to ensure that traffic reaches its destination.
In order to determine the viability of secondary repair paths, it is This is achieved by recording which neighbor of E would be used to
necessary for each router to take into account the repair paths which reach each destination reachable over S-E when running the original
the other neighbors of router B can achieve. These can be computed SPF. Traffic assignment is then simply a matter of assigning the
locally by running the repair path computation algorithms rooted at traffic which E would have forwarded via each neighbor to the
each of those neighbors. It is only necessary to compute the repair repair path which has that neighbor as its target.
paths from the routers to which router A can establish repair paths,
with targets of those routers to which repair paths have not yet been
established.
It is then possible to determine whether all routers can now be Although the repair paths are calculated based on traffic addressed
reached by invoking secondary (or if necessary tertiary, etc.) repair to specific targets, it can be proved that the traffic assignment
paths, and if so, to which primary repair path traffic for each algorithm guarantees that the repair path can be used for any
target should be assigned. traffic assigned to it.
There is another, more subtle, possibility of loops arising when Where E would normally split the traffic to a particular
secondary repair paths are used. This is illustrated in Figure 10, destination via two or more of its neighbors, it is an
where all links are cost 1 with the exception of JI which has a cost implementation decision whether the repaired traffic should be
5 in that direction and cost 1 in the direction IJ. split across the corresponding set of repair paths.
[A]---//---[B]--------[C] The repair path to E itself is normally used just for traffic
| | | destined for E and any prefixes advertised by E. However, under
| | | some circumstances, it may be impossible to compute a repair path
[J] | [D] to one or more of E's neighbors, for example, because E is a single
5| | | point of failure. In this case traffic for the destinations served
v| | | by the otherwise irreparable targets is assigned to the repair path
[I]---[H]--[G]---[F]--[E] with E as its target, in the optimistic assumption that router E is
still functioning. If router E is indeed still functioning, this
will ensure delivery of the traffic. If, however, router E has
failed, the traffic on this repair path will loop as previously
shown in section 3.1. The way this is detected, and the course of
action when it is detected, is described in section 5.3.
Figure 10 Example of an apparently non-looping secondary repair path 4.4. When no Repair Path is Possible
which results in a loop.
Router A has a primary repair path to G (with a release point of I), Under some circumstances, it will not be possible to identify a
and G has a primary repair path to C (with a release point of E). It repair path to one or more of the targets. This can occur for the
would appear that these form a non-looping secondary repair path from following reasons:
A to C. As usual, the primary repair path from A to G has been
computed on the basis of destinations normally reachable through BG.
However, when making use of the secondary repair path, the traffic
inserted in the repair path from A to G will be destined not for one
of the routers normally reachable via BG, but for C. Hence this
repair path is not necessary valid for such traffic, and in this
example it will have a 50% probability of being forwarded back along
the path IJABC, and hence looping.
This problem can in general be avoided by choosing a release point o The neighboring router that is presumed to have failed
for the initial primary repair with the property that traffic for the constitutes a single point of failure in the network.
secondary target (C) is guaranteed to traverse the primary target
(G). This can be achieved by computing the reverse SPF rooted at the
secondary target (C) and examining the sub-tree which traverses the
primary target. It can be proved that in the absence of asymmetric
link costs, such a release point will always exist. Where asymmetric
link costs prevent this, the traffic can be encapsulated to the
intermediate router (G), which may require the use of double
encapsulation. On reaching router G, the traffic for C is
decapsulated and then forwarded in G's primary repair path to C (via
router E, in the example).
4.5. Multi-homed Prefixes o Severely asymmetric link costs may cause an otherwise
viable physical repair path to be unusable.
Up to this point, it has been assumed that any particular prefix is o Interference may occur between the repair paths of
"attached" to exactly one router in the network, and consequently individual targets.
only the routers in the network need be considered when constructing
repair paths, etc. However, in many cases the same prefix will be
attached to two or more routers. Common cases are: -
o The subnet present on a link is advertised from both ends of In practice, these cases are unlikely to be encountered frequently.
the link. Networks that will benefit from the mechanisms described here will
usually exhibit considerable redundancy and are normally operated
with largely symmetric link costs. Note that a router's inability
to compute a full set of repair paths for one of its links does not
necessarily affect its ability to do so for its other links.
o Prefixes are propagated from one routing domain to another by Example topologies illustrating each of the three cases above are
multiple routers. described in the following subsections.
o Prefixes are advertised from multiple routers to provide 4.4.1. Unreachable Target
resilience in the event of the failure of one of the routers.
In general, this causes no particular problems, and the shortest If the failure of a neighboring router makes one or more of its
route to each prefix (and hence which of the routers to which it is neighbors genuinely unreachable, clearly it will not be possible to
attached should be used to reach it) is resolved by the normal SPF establish a repair path to such targets. Such single points of
process. However, in the particular case where one of the instances failure are not expected to be encountered frequently in properly
of a prefix is attached to router B, or to a router for which router designed networks, and will probably occur only when the network
B is a single point of failure, the situation is more complicated. has previously suffered other failures that have reduced its
connectivity.
P 4.4.2. Asymmetric Link Costs
|
|
[A]---//---[B]--------[C]
| | P
| | |
[W]-----[X]----[Y]----[Z]-[G]-[H]-[I]-[J]-[K]-[L]-[M]-[N]
Figure 11 A multi-homed prefix p When link costs have been set asymmetrically, it is possible that a
repair path cannot be constructed even using directed forwarding.
Consider a prefix p, which is attached to router B and some other Although it is trivial to construct a network fragment with this
router N as illustrated in Figure 11. Before the failure of the link property, this should not be regarded as a major problem. Firstly,
A-B, p is reachable from A via A-B. After the failure it cannot be asymmetric link costs are seldom used deliberately. And, secondly,
assumed that B is still reachable. If traffic to p is assigned to a even when an asymmetric link cost prevents one potential repair
link repair path to B (as it would be if p were attached only to B), path being used, there will normally be other ones available.
and router B has failed, then it would loop and subsequently be
dropped. Traffic for p cannot simply be assigned to whatever repair
path would be used for traffic to N, because other routers, which are
not yet aware of any failure, may direct the traffic back towards B,
since the instance of p attached to B is closer.
A solution is to treat p itself as a neighbor of B, and compute a 4.4.3. Interference Between Potential Node Repair Paths
repair path with p as a target. However, although correct, this
solution may be infeasible where there are a very large number of
such prefixes, which would result in an unacceptably large
computational overhead.
Some simplification is possible where there exist a large number of Under some circumstances the existence of one neighbor may
multi-homed prefixes which all share the same connectivity and interfere with a potential repair path to another. Consider the
metrics. These may be treated as a single router and a single repair topology shown in Figure 6, in which all links have a symmetrical
path computed for the entire set of prefixes. cost of one, with the exception of that between H and I, which has
a cost of 3. In this example, the fact that router J is a neighbor
of E prevents the discovery of a repair path from router S to
router S1 despite the existence of an apparently suitable path.
An alternative solution is to tunnel the traffic for a multi-homed [S]---//---[E]-------[S1]
prefix to the router N where it is also attached (see Figure 11). If | | |
this involves a repair path that was already tunneled, then this | | |
requires double encapsulation. [H]-3-[I]--[J]--[K]--[L]
4.6. Equal Cost Path Splits Figure 6. Interference between repair paths
Equal cost path splits may be used as a repair mechanism, but link A repair path from router S to J can use J itself as the release
and node repairs need to be considered separately. point by employing directed forwarding from I. However, it is not
possible to identify a suitable release point for a repair path to
router S1 within the topology shown since there is nowhere that
router S can reach that will subsequently forward traffic to
router S1 except via the forbidden link E-S1 (J's least cost path
to S1 is J-E-S1). This is because the extended F-space of router S
is separated by more than one hop from the G-space of router S1.
4.6.1. Equal Cost Path Splits as Link Repair Paths Since the topology shown in Figure 6 will typically form part of a
much larger topology, a different, and possibly more circuitous
repair path from S to S1, that does not go via J, may be
discovered. This is illustrated in Figure 7. In this enhanced
topology, a repair path to S1 using Y as the release point can be
used.
When a link is used as a member of one or more path-split sets, by [S]---//---[B]-------[S1]
definition, the destinations served could be equally well served by | | |
any other member of the path-split set. Therefore, when the link | | |
fails, any destinations that use the link as a path-split may be [H]-3-[I]--[J]--[K]--[L]
immediately assigned to another member of the set. Clearly, if | |
traffic to some destinations can be repaired using a path split, it | |
should not also be subject to repair by tunneling. Such destinations [X]--[Y]--[Z]
should be identified before performing traffic assignment to tunneled
repair paths.
4.6.2. Equal Cost Path Splits and Node Failure Figure 7. Resolving interference in a larger network
An equal cost path split may traverse the failed node (router B). In Note that, in Figure 6, if the traffic for S1 were assigned to the
this case, the path split may not be an appropriate repair path. repair path for J, it would correctly reach S1 because J would
There are two cases: - assign it to its repair path to S1. That is, packets from S to S1
would travel via two successive tunnels. Consequently, this is
referred to as a "secondary repair path". However, it is not always
the case that interference can be handled in this fashion and it is
possible to create looping repair paths.
o the path split is a parallel link, having router B as a One possibility of looping repair paths is illustrated in Figure 8.
direct neighbor, and All links have a symmetrical cost of one with the exception of H-I,
which is cost 3 in both directions, and K-L and L-S1 which are cost
5 in the indicated direction and cost 1 in the other.
o the path split does not have router B as a direct neighbor, [S]---//---[E]--------[S1]
but the route traverses router B at some point further | | |^
downstream. | | |5
[H]-3-[I]--[J]--[K]---[L]
5>
These are illustrated in Figure 12 and Figure 13 respectively. Figure 8 Looping secondary repair paths
+---//---+ In this topology, S can establish a repair path to J, but cannot
[A] [B]-------[D] establish a repair path to S1 because of interference. Router S
+--------+ might assign traffic intended for S1 onto its repair path to J
expecting it to undergo a secondary repair towards S1. However,
because of the asymmetrical link costs, J is unable to establish a
repair path to S1. It is only able to establish a repair path to S.
If J, like S, elected to forward repaired traffic to S1 using its
(only) repair path to S, similarly expecting a secondary repair to
get it to its destination, traffic for S1 would loop between S
and J. Thus when interference occurs, the possibility of a
secondary repair path cannot be relied upon to ensure that traffic
reaches its destination.
Figure 12 A parallel link path split In order to determine the viability of secondary repair paths, it
is necessary for each router to take into account the repair paths
which the other neighbors of router E can achieve. These can be
computed locally by running the repair path computation algorithms
rooted at each of those neighbors. It is only necessary to compute
the repair paths from the routers to which router S can establish
repair paths, with targets of those routers to which repair paths
have not yet been established.
+-2-//---+ It is then possible to determine whether all routers can now be
[A] [B]-------[D] reached by invoking secondary (or if necessary tertiary, etc.)
+--[C]---+ repair paths, and if so, to which primary repair path traffic for
each target should be assigned.
Figure 13 A path split via an intermediate node There is another, more subtle, possibility of loops arising when
secondary repair paths are used. This is illustrated in Figure 9,
where all links are cost 1 with the exception of L-K which has a
cost 5 in that direction and cost 1 in the direction K-L.
In both cases it must be assumed that router B has failed and some [S]---//---[E]--------[S1]
other repair path, diverse with respect to router B, must be used. | | |
| | |
[L] | [D]
5| | |
v| | |
[K]---[J]--[I]---[H]--[E]
4.7. LANs and pseudonodes Figure 9 Example of an apparently non-looping secondary repair path
which results in a loop.
In link state protocols a LAN is represented by a construct known as Router S has a primary repair path to I (with a release point
a pseudonode in IS-IS and a network LSA in OSPF. of K), and I has a primary repair path to S1 (with a release point
of E). It would appear that these form a non-looping secondary
repair path from S to S1. As usual, the primary repair path from S
to I has been computed on the basis of destinations normally
reachable through E-I. However, when making use of the secondary
repair path, the traffic inserted in the repair path from S to I
will be destined not for one of the routers normally reachable via
E-I, but for S1. Hence this repair path is not necessary valid for
such traffic and in this example it will have a 50% probability of
being forwarded back along the path K-L-S-E-S1, and hence looping.
In order to deal correctly with this representation of LANs, the This problem can in general be avoided by choosing a release point
algorithms described in this draft require certain modifications. for the initial primary repair with the property that traffic for
There are four cases which require consideration. These are described the secondary target (S1) is guaranteed to traverse the primary
in the following subsections. target (I). This can be achieved by computing the rSTF rooted at
the secondary target (S1) and examining the sub-tree which
traverses the primary target. It can be proved that in the absence
of asymmetric link costs, such a release point will always exist.
Where asymmetric link costs prevent this, the traffic can be
encapsulated to the intermediate router (I), which may require the
use of double encapsulation. On reaching router I, the traffic for
S1 is decapsulated and then forwarded in I's primary repair path to
S1 (via router E, in the example).
4.7.1. The Link between Routers A and B is a LAN 4.5. Multi-homed Prefixes
In this case, the link which is being protected is a LAN, and the Up to this point, it has been assumed that any particular prefix is
router B which has potentially failed is reachable over the LAN. This "attached" to exactly one router in the network, and consequently
is illustrated in Figure 14. only the routers in the network need be considered when
constructing repair paths, etc. However, in many cases the same
prefix will be attached to two or more routers. Common cases are:
[A] o The subnet present on a link is advertised from both ends
| of the link.
=====================
| | | |
[B] [C] [D] [E]
Figure 14 The link between routers A and B is a LAN o Prefixes are propagated from one routing domain to another
by multiple routers.
There are two possible failure modes in this case. o Prefixes are advertised from multiple routers to provide
resilience in the event of the failure of one of the
routers.
4.7.1.1. Case 1 In general, this causes no particular problems, and the shortest
route to each prefix (and hence which of the routers to which it is
attached should be used to reach it) is resolved by the normal SPF
process. However, in the particular case where one of the instances
of a prefix is attached to router E, or to a router for which
router E is a single point of failure, the situation is more
complicated.
Router B or its interface to the LAN may have failed independently of P
the rest of the LAN. In this case the remaining routers on the LAN |
(routers C, D and E) will remain reachable from router A. These |
routers do not appear as direct neighbors of router B in the link [S]---//---[E]--------[S1]
state database and are not treated as neighbors of router B for the | | p
purposes of this specification because no traffic from router A would | | |
be directed through router B to any of these routers. However, each [W]-----[X]----[Y]----[Z]-[I]-[J]-[K]-[L]-[M]-[N]
of these neighboring routers will have router B as a neighbor and
they will initiate their own repair paths in the event of the failure
of router B or its LAN interface.
Repair paths are computed with the non-LAN neighbors of B as targets, Figure 10 A multi-homed prefix p
and also B itself (the "link-failure" repair path). Note that since
the remaining neighbors of A on the LAN are assumed to be still
reachable when the link to B has failed, these repair paths may
traverse the LAN.
A separate set of repair paths is required in anticipation of the Consider a prefix p, which is attached to router E and some other
potential failure of each router on the LAN. router N as illustrated in Figure 10. Before the failure of the
link S-E, p is reachable from S via S-E. After the failure it
cannot be assumed that E is still reachable. If traffic to p is
assigned to a link repair path to E (as it would be if p were
attached only to E), and router E has failed, then it would loop
and subsequently be dropped. Traffic for p cannot simply be
assigned to whatever repair path would be used for traffic to N,
because other routers, which are not yet aware of any failure, may
direct the traffic back towards E, since the instance of p attached
to E is closer.
4.7.1.2. Case 2 A solution is to treat p itself as a neighbor of E, and compute a
repair path with p as a target. However, although correct, this
solution may be infeasible where there are a very large number of
such prefixes, which would result in an unacceptably large
computational overhead.
Router A's interface to the LAN may have failed (or the entire LAN Some simplification is possible where there exist a large number of
may have failed). In either event, simultaneous failures will be multi-homed prefixes which all share the same connectivity and
observed from router A to all the remaining routers on the LAN metrics. These may be treated as a single router and a single
(routers B, C, D and E). In this case, the pseudonode itself can be repair path computed for the entire set of prefixes.
treated as the "adjacent" router (i.e. the router normally referred
to as "router B"), and repairs constructed using the normal
mechanisms with all the neighbors of the pseudonode (routers B, C, D
and E) as repair path targets. If one or more of the routers had
failed in addition to the LAN connectivity, treating it as a repair
path target would not be viable, but this would be a case of multiple
simultaneous failures which is out of scope of this specification.
The entire sub-tree over A's LAN interface is the failed component An alternative solution is to tunnel the traffic for a multi-homed
and is excised from the spanning tree when computing A's extended P- prefix to the router N where it is also attached (see Figure 10).
space. For the Q-spaces of the targets, the sub-tree over the LAN If this involves a repair path that was already tunneled, then this
interface of the target is excised. requires double encapsulation.
4.7.1.3. Simplified LAN repair 4.6. LANs and pseudo-nodes
A simpler alternative strategy is to always consider the LAN and all In link state protocols a LAN is represented by a construct known
routers attached to it as failing as a single unit. In this case, a as a pseudo-node in IS-IS and a network LSA in OSPF.
single set of repair paths is computed with targets being the entire
set of non-LAN neighbors of all the routers on the LAN, together with
"link-repair" paths with all the routers on the LAN as targets. Any
failure of one or more LAN adjacencies results in these repair paths
being invoked for all neighbors on the LAN. These repair paths must
not traverse the LAN, and so must be computed by excising the entire
sub-tree reachable over A's LAN interface from A's spanning tree
(i.e. the entire LAN is the failed component). The Q-spaces are
computed as normal, with the LAN neighbors or their interface to the
LAN being excised as appropriate. This is simpler than the approach
proposed above, but will fail to make use of possible repair paths
(or even path splits) over the LAN. In particular, if the only viable
repair paths involve the LAN, it will prevent any repair being
possible.
4.7.2. A LAN exists at the release point In order to deal correctly with this representation of LANs, the
algorithms described in this draft require certain modifications.
There are four cases which require consideration. These are
described in the following subsections.
When computing the viable release points, it may be that one or more 4.6.1. The Link between Routers S and E is a LAN
of the leaf nodes are actually pseudonodes. In this case, the release
point is deemed to be any of the parent nodes on the LAN by which the
pseudonode had been reached, and when computing the extended set of
release points (reachable by directed forwarding), all the remaining
routers on the LAN may be included.
4.7.3. A LAN between B and its neighbors In this case, the link which is being protected is a LAN, and the
router E which has potentially failed is reachable over the LAN.
This is illustrated in Figure 11.
If there is a LAN between router B and one or more of B's neighbors [S]
(other than router A), then rather than treating each of those |
neighbors as a separate target to which a repair path must be =====================
computed, the pseudonode itself can be treated as a single target for | | | |
which a repair path can be computed. If there are other neighbors of [E] [X] [Y] [Z]
B which are directly attached to B, including those which may also be
attached to the LAN, they must still be treated as an individual
repair path target.
Normally a repair path with the pseudonode as its target will have a Figure 11 The link between routers S and E is a LAN
release point before the pseudonode. However it is possible that the
release point would be computed as the pseudonode itself. This will
occur if the reverse spanning tree rooted at the pseudonode includes
no routers other than itself. In this case a single repair with the
pseudonode as target is not possible, and it is necessary to compute
individual repair paths whose target are each of the neighbors of B
on the LAN.
4.7.4. The LAN is a Transit Subnet. There are two possible failure modes in this case.
This is the most common case, where a LAN is traversed by a repair 4.6.1.1. Case 1
path, but is not in any of the special positions described above. In
this case no special treatment is required, and the normal SPF
mechanisms are applicable.
5. Failure Detection and Repair Path Activation Router E or its interface to the LAN may have failed independently
of the rest of the LAN. In this case the remaining routers on the
LAN (routers X, Y and Z) will remain reachable from router S. These
routers do not appear as direct neighbors of router E in the link
state database and are not treated as neighbors of router E for the
purposes of this specification because no traffic from router S
would be directed through router E to any of these routers.
However, each of these neighboring routers will have router E as a
neighbor and they will initiate their own repair paths in the event
of the failure of router E or its LAN interface.
The details of repair path activation are inherently implementation- Repair paths are computed with the non-LAN neighbors of E as
dependent and must be addressed by individual design specifications. targets, and also E itself (the "link-failure" repair path). Note
This section describes the implementation independent aspects of the that since the remaining neighbors of S on the LAN are assumed to
failover to the repair path. be still reachable when the link to E has failed, these repair
paths may traverse the LAN.
5.1. Failure Detection A separate set of repair paths is required in anticipation of the
potential failure of each router on the LAN.
The failure detection mechanism must provide timely detection of the 4.6.1.2. Case 2
failure and activation of the repair paths. The failure detection
mechanisms may be media specific (for example loss of light), or may
be generic (for example BFD). Multiple detection mechanisms may be
used in order to improve detection latency. Note that in the case of
a LAN it may be necessary to monitor connectivity to all of the
adjacent routers on the LAN.
5.2. Repair Path Activation Router S's interface to the LAN may have failed (or the entire LAN
may have failed). In either event, simultaneous failures will be
observed from router S to all the remaining routers on the LAN
(routers E, X, Y and Z). In this case, the pseudo-node itself can
be treated as the "adjacent" router (i.e. the router normally
referred to as "router E"), and repairs constructed using the
normal mechanisms with all the neighbors of the pseudo-node
(routers E, X, Y and Z) as repair path targets. If one or more of
the routers had failed in addition to the LAN connectivity,
treating it as a repair path target would not be viable, but this
would be a case of multiple simultaneous failures which is out of
scope of this specification.
The mechanism used by the router to activate the repair path The entire sub-tree over S's LAN interface is the failed component
following failure will be implementation specific. and is excised from the SPT when computing S's extended F-space.
For the G-spaces of the targets, the sub-tree over the LAN
interface of the target is excised.
An implementation that is capable of withdrawing the repair may delay 4.6.1.3. Simplified LAN repair
the start of network convergence in order to minimize network
disruption in the event that the failure was a transient.
5.3. Node Failure Detection Mechanism A simpler alternative strategy is to always consider the LAN and
all routers attached to it as failing as a single unit. In this
case, a single set of repair paths is computed with targets being
the entire set of non-LAN neighbors of all the routers on the LAN,
together with "link-repair" paths with all the routers on the LAN
as targets. Any failure of one or more LAN adjacencies results in
these repair paths being invoked for all neighbors on the LAN.
These repair paths must not traverse the LAN, and so must be
computed by excising the entire sub-tree reachable over S's LAN
interface from S's SPT (i.e. the entire LAN is the failed
component). The G-spaces are computed as normal, with the LAN
neighbors or their interface to the LAN being excised as
appropriate. This is simpler than the approach proposed above, but
will fail to make use of possible repair paths (or even path
splits) over the LAN. In particular, if the only viable repair
paths involve the LAN, it will prevent any repair being possible.
When router A detects a failure of the A-B link, it will invoke the 4.6.2. A LAN exists at the release point
link repair path from itself to router B. This A-B link repair is
always invoked because even if all other traffic can be re-routed, B
is always a single point of failure to itself. If router B has
failed, the A-B link repair can result in a forwarding loop. A node
failure detection mechanism is therefore needed. A suitable mechanism
might be to run BFD [BFD] between A and B, over the A-B link repair
path.
When the node failure detection mechanism has determined that router When computing the viable release points, it may be that one or
B has failed it withdraws the A-B link repair path. The node failure more of the leaf nodes are actually pseudo-nodes. In this case, the
detection and revocation of the A-B link repair needs to be release point is deemed to be any of the parent nodes on the LAN by
expedited, in order to minimize the duration of collateral damage to which the pseudo-node had been reached, and when computing the
the network cause by packets looping around the A-B link repair path. extended set of release points (reachable by directed forwarding),
all the remaining routers on the LAN may be included.
If B is a single point of failure to some destinations, then 4.6.3. A LAN between E and its neighbors
withdrawing the A-B link repair has no impact on network
connectivity, because those destinations will have been rendered
unreachable by the failure of router B.
If B is not a single point of failure, but traffic to some If there is a LAN between router E and one or more of E's neighbors
destinations is being repaired via the A-B link because of the (other than router S), then rather than treating each of those
inability to provide suitable repair paths, then there are neighbors as a separate target to which a repair path must be
destinations that are rendered temporarily unreachable by IPFRR. The computed, the pseudo-node itself can be treated as a single target
IPFRR loop free convergence mechanism delays normal convergence of for which a repair path can be computed. If there are other
the network. Consideration therefore has to be given to the relative neighbors of E which are directly attached to E, including those
importance of the traffic being protected and the traffic being which may also be attached to the LAN, they must still be treated
black-holed. Depending on the outcome of that consideration, the as an individual repair path target.
IPFRR loop-free strategy may need to be abandoned.
6. Loop Free Transition Normally a repair path with the pseudo-node as its target will have
a release point before the pseudo-node. However it is possible that
the release point would be computed as the pseudo-node itself. This
will occur if the rSPT rooted at the pseudo-node includes no
routers other than itself. In this case a single repair with the
pseudo-node as target is not possible, and it is necessary to
compute individual repair paths whose target are each of the
neighbors of E on the LAN.
Once the repair paths have been activated, data will again be 4.6.4. The LAN is a Transit Subnet.
forwarded correctly. At this stage only the routers directly adjacent
to the failure will be aware of the failure because no routing
information concerning the failure has yet been propagated to other
routers. The network now has to be transitioned to normal operation
using the available components.
During network transition inconsistent state may lead to the This is the most common case, where a LAN is traversed by a repair
formation of micro-loops. During this period, packets may be path, but is not in any of the special positions described above.
prevented from reaching the repair path, may expire due to transiting In this case no special treatment is required, and the normal SPF
an excessive number of hops, may be subject to excessive delay, and mechanisms are applicable.
the resultant congestion may disrupt the passage of other packets
through the network. The use of a loop free transition technique
allows the network to re-converge without packet loss or disruption.
Four loop free transition strategies are described: 5. Failure Detection and Repair Path Activation
o Incremental cost advertisement The details of repair path activation are inherently
implementation-dependent and must be addressed by individual design
specifications. This section describes the implementation
independent aspects of the failover to the repair path.
o Single Tunnel 5.1. Failure Detection
o Distributed Tunnels The failure detection mechanism must provide timely detection of
the failure and activation of the repair paths. The failure
detection mechanisms may be media specific (for example loss of
light), or may be generic (for example BFD). Multiple detection
mechanisms may be used in order to improve detection latency. Note
that in the case of a LAN it may be necessary to monitor
connectivity to all of the adjacent routers on the LAN.
o Ordered SPF 5.2. Repair Path Activation
6.1. Incremental Cost Advertisement The mechanism used by the router to activate the repair path
following failure will be implementation specific.
When a link fails, the cost of the link is normally changed from its An implementation that is capable of withdrawing the repair may
assigned metric to "infinity". However it can be proved that: if the delay the start of network convergence in order to minimize network
link cost is increased in suitable increments, and the network is disruption in the event that the failure was a transient.
allowed to stabilize before the next cost increment is advertised,
then no micro-loops will form.
This approach has the advantage that it requires no change to the 5.3. Node Failure Detection Mechanism
routing protocol, and will work with non-IPFRR capable routers.
However the loop-free transition is slow, particularly if large
metrics are used, and during this time the network is vulnerable to a
second failure.
6.2. Single Tunnel Per Router When router S detects a failure of the S-E link, it will invoke the
link repair path from itself to router S. This S-E link repair is
always invoked because even if all other traffic can be re-routed,
E is always a single point of failure to itself. If router E has
failed, the S-E link repair can result in a forwarding loop. A node
failure detection mechanism is therefore needed. A suitable
mechanism might be to run BFD [BFD] between S and E, over the S-E
link repair path.
When a failure is detected, the routers adjacent to the failure issue When the node failure detection mechanism has determined that
a "covert" announcement of the failure, which is propagated through router E has failed it withdraws the S-E link repair path. The node
the network by all routers, but which is understood only by IPFRR failure detection and revocation of the S-E link repair needs to be
capable routers. These routers each build a tunnel to the closest expedited, in order to minimize the duration of collateral damage
IPFRR router adjacent to the failure. They then determine which of to the network cause by packets looping around the S-E link repair
their traffic would transit the failure and place that traffic in the path.
tunnel. When all of these tunnels are in place, the failure is then
announced as normal. Because the tunnel will be unaffected by the
transition, and because the IPFRR router at the tunnel endpoint will
continue the repair, no traffic will be disrupted by the failure.
When the network has converged, the IPFRR routers can withdraw the
tunnels. The order of tunnel insertion and withdrawal is not
important, provided the tunnels are all in place before the normal
announcement.
This technique has the disadvantage that it requires traffic to be If E is a single point of failure to some destinations, then
tunneled during the transition. withdrawing the S-E link repair has no impact on network
connectivity, because those destinations will have been rendered
unreachable by the failure of router E.
A further disadvantage of this method is that it requires co- If E is not a single point of failure, but traffic to some
operation from all the routers within the routing domain to fully destinations is being repaired via the S-E link because of the
protect the network against micro-loops. However it can be shown that inability to provide suitable repair paths, then there are
micro-loops will be confined to contiguous groups of non-IPFRR destinations that are rendered temporarily unreachable by IPFRR.
capable routers, and will only affect traffic arriving at the network The IPFRR loop free convergence mechanism delays normal convergence
through one of those routers. of the network. Consideration therefore has to be given to the
relative importance of the traffic being protected and the traffic
being black-holed. Depending on the outcome of that consideration,
the IPFRR loop-free strategy may need to be abandoned.
6.3. Distributed Tunnels 6. Loop Free Transition
This is similar to the single tunnel per router approach except that Once the repair paths have been activated, data will again be
all IPFRR capable routers calculate a set of repair paths using the forwarded correctly. At this stage only the routers directly
same algorithms as for traffic that will be affected by the failure. adjacent to the failure will be aware of the failure because no
routing information concerning the failure has yet been propagated
to other routers. The network now has to be transitioned to normal
operation using the available components.
This reduces the load on the tunnel endpoints, but the length of time During network transition inconsistent state may lead to the
taken to calculate the repairs increases the convergence time. formation of micro-loops. During this period, packets may be
prevented from reaching the repair path, may expire due to
transiting an excessive number of hops, may be subject to excessive
delay, and the resultant congestion may disrupt the passage of
other packets through the network. A loop free transition technique
which allows the network to re-converge without packet loss or
disruption is therefore required.
This method suffers from the same disadvantages as the single tunnel A number of suitable loop-free convergence techniques are described
method. in [LVCONV].
6.4. Ordered SPFs 7. IPFRR Capability
Micro loops occur when a router closer to the failed component In the previous sections it has been assumed that all routers in
revises its routes to take account of the failure before a router the network are capable of acting as IPFRR routers, performing such
which is further away. By analyzing the reverse spanning tree over tasks as tunnel termination and directed forwarding. In practice
which traffic is directed to the failed component, it is possible to this is unlikely to be the case, partially because of the
determine a strict ordering which ensures that routers closer to the heterogeneous nature of a practical network, and partially because
root always process the failure after any routers further away, and of the need to progressively deploy such capability. IPFRR
hence micro loops are prevented. therefore needs to support some form of capability announcement,
and the algorithms need to take these capabilities into account
when calculating their path repair strategies. For example, the
ability of routers to function as tunnel end points and perform
directed forwarding will influence the choice of repair path.
However, routers which are simply traversed by repair paths
(tunneled or not) do not need to be IPFRR capable in order to
guarantee correct operation of the repair paths.
When the failure has been announced, each router waits a multiple of 8. Enhancements to routing protocols
some time delay value. The multiple is determined by the router's
position in the reverse spanning tree, and the delay value is chosen
to guarantee that a router can complete its processing within this
time. The convergence time may be reduced by employing a signaling
mechanism to notify the parent when all the children have completed
their processing, and hence when it was safe for the parent to
instantiate its new routes.
The property of this approach is therefore that it imposes a delay It will be seen from the above that a number of enhancements to the
which is bounded by the network diameter although in most cases it appropriate routing protocols are needed to support IPFRR. The
will be much less. following possible enhancements have been identified:
It requires all routers in the domain to operate according to these o The ability to advertise IPFRR capability
procedures, and the presence of non co-operating routers can give
rise to loops for any traffic which traverses them (not just traffic
which is originated through them).
7. Restoring Failed Components to Service o The ability to advertise tunnel endpoint capability
When a neighbor or failed link is restored to service, it will be o The ability to advertise directed forwarding identifiers
detected according to the normal operation of the routing protocols
by the formation of an adjacency. Normally this would result in the
information about the link being included in newly generated routing
information. However, just as in the case with increasing costs, the
sudden decrease in cost from "infinity" to the configured value of
the link cost may give rise to loops. Each of the loop-free
transition mechanism described above has a corresponding mechanism
that can be used to add a link to the network without the formation
of micro-loops.
8. Implications for Network Management o The ability to announce the start of a loop-free
transition, and to abort a loop-free transition.
It will be clear from the above that topology changes introduced by o The ability to signal transition completion status to
management action, such as enabling or disabling a link or router, or neighbors.
changing the cost metric of a link may result in disruption of
traffic due to the formation of micro-loops. It will equally be clear
that the loop-free convergence strategies described above can equally
be applied to the prevention of such micro-loops.
9. IPFRR Capability o The ability to advertise that a link is protected.
In the previous sections it has been assumed that all routers in the Capability advertisement should make use of existing capability
network are capable of acting as IPFRR routers, performing such tasks mechanisms in the routing protocols. The exact set of enhancements
as tunnel termination and directed forwarding. In practice this is will depend on specific IPFRR design choices.
unlikely to be the case, partially because of the heterogeneous
nature of a practical network, and partially because of the need to
progressively deploy such capability. IPFRR therefore needs to
support some form of capability announcement, and the algorithms need
to take these capabilities into account when calculating their path
repair strategies. For example, the ability of routers to function as
tunnel end points and perform directed forwarding will influence the
choice of repair path. However, routers which are simply traversed by
repair paths (tunneled or not) do not need to be IPFRR capable in
order to guarantee correct operation of the repair paths.
10. Enhancements to routing protocols 9. IANA considerations
It will be seen from the above that a number of enhancements to the There are no IANA considerations that arise from this architectural
appropriate routing protocols are needed to support IPFRR. The description of IPFRR. However there will be changes to the IGPs to
following possible enhancements have been identified: support IPFRR in which there will be IANA considerations.
o The ability to advertise IPFRR capability 10. Security Considerations
o The ability to advertise tunnel endpoint capability Changes to the IGPs to support IPFRR do not introduce any
additional security risks.
o The ability to advertise directed forwarding identifiers The security implications of the increased convergence time due to
the loop avoidance strategy depend on the approach to multiple
failures. If the presence of multiple failures results in the
network aborting the loop free strategy, then the convergence time
will be similar to that of a conventional network. On the other
hand, an attacker in a position to disrupt part of a network might
use this to disrupt the repair of a critical path.
o The ability to announce the start of a loop-free transition, The tunnel endpoints need to be secured to prevent their use as a
and to abort a loop-free transition. facility by an attacker. Performance considerations indicate that
tunnels cannot be secured by IPsec [IPSEC]. A system of packet
address policing, both at the tunnel endpoints and at the edges of
the network would prevent an attacker's packet arriving at a tunnel
endpoint and would seem to be the best strategy.
o The ability to signal transition completion status to When a fast re-route is in progress, there may be an unacceptable
neighbors. increase in traffic load over the repair path. Network operators
need to examine the computed repair paths and ensure that they have
sufficient capacity.
o The ability to advertise that a link is protected. Acknowledgments
Capability advertisement should make use of existing capability The authors acknowledge the significant technical contributions
mechanisms in the routing protocols. The exact set of enhancements made to this work by their colleagues: John Harper and Kevin Miles.
will depend on specific IPFRR design choices.
11. IANA considerations Intellectual Property Statement
There are no IANA considerations that arise from this architectural The IETF takes no position regarding the validity or scope of any
description of IPFRR. However there will be changes to the IGPs to Intellectual Property Rights or other rights that might be claimed
support IPFRR in which there will be IANA considerations. to pertain to the implementation or use of the technology described
in this document or the extent to which any license under such
rights might or might not be available; nor does it represent that
it has made any independent effort to identify any such rights.
Information on the procedures with respect to rights in RFC
documents can be found in BCP 78 and BCP 79.
12. Security Considerations Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use
of such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository
at http://www.ietf.org/ipr.
Changes to the IGPs to support IPFRR do not introduce any additional The IETF invites any interested party to bring to its attention any
security risks. copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
The security implications of the increased convergence time due to Full copyright statement
the loop avoidance strategy depend on the approach to multiple Copyright (C) The Internet Society (2004). This document is subject
failures. If the presence of multiple failures results in the network to the rights, licenses and restrictions contained in BCP 78, and
aborting the loop free strategy, then the convergence time will be except as set forth therein, the authors retain all their rights.
similar to that of a conventional network. On the other hand, an
attacker in a position to disrupt part of a network might use this to
disrupt the repair of a critical path.
The tunnel endpoints need to be secured to prevent their use as a This document and the information contained herein are provided on
facility by an attacker. Performance considerations indicate that an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
tunnels cannot be secured by IPsec [IPSEC]. A system of packet REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
address policing, both at the tunnel endpoints and at the edges of THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
the network would prevent an attacker's packet arriving at a tunnel EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
endpoint and would seem to be the best strategy. THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR
ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE.
When a fast re-route is in progress, there may be an unacceptable Normative References
increase in traffic load over the repair path. Network operators need
to examine the computed repair paths and ensure that they have
sufficient capacity.
Acknowledgments There are no normative references.
The authors acknowledge the significant technical contributions made
to this work by their colleagues: John Harper and Kevin Miles.
IPR Disclosure Acknowledgement Informative References
By submitting this Internet-Draft, we certify that any applicable Internet-drafts are works in progress available from
patent or other IPR claims of which we are aware have been disclosed, http://www.ietf.org/internet-drafts/
and any of which we become aware will be disclosed, in accordance
with RFC 3668.
Normative References [BASIC] Alia Atlas, Ed., et al., "Basic Specification
for IP Fast-Reroute: Loop-free Alternates",
<draft-ietf-rtgwg-ipfrr-spec-base-01.txt>,
October 2004, (work in progress).
Internet-drafts are works in progress available from [BFD] Katz, D., and Ward, D., "Bidirectional
http://www.ietf.org/internet-drafts/ Forwarding Detection", <draft-katz-ward-bfd-
01.txt>, August 2003 (work in progress).
Informative References [IPFRR] Shand, M., "IP Fast-reroute Framework",
<draft-ietf-rtgwg-ipfrr-framework-02.txt>,
October 2004, (work in progress).
Internet-drafts are works in progress available from [IPSEC] Kent, S., Atkinson, R., "Security Architecture
http://www.ietf.org/internet-drafts/ for the Internet Protocol", RFC 2401
BFD Katz, D., and Ward, D., "Bidirectional Forwarding [L2TPv3] J. et al., "Layer Two Tunneling Protocol
Detection", draft-katz-ward-bfd-01.txt, August (Version 3)", <draft-ietf-l2tpext-l2tp-base-
2003 (work in progress). 14.txt>, June 2004, (work in progress).
IPSEC Kent, S., Atkinson, R., "Security Architecture [LFCONV] Bryant, S., Shand, M., "A Framework for Loop-
for the Internet Protocol", RFC 2401 free Convergence", <draft-bryant-shand-lf-conv-
frmwk-00.txt>, October 2004,(work in progress).
Authors' Addresses [MPLSFRR] Pan, P. et al, "Fast Reroute Extensions to
RSVP-TE for LSP Tunnels", <draft-ietf-mpls-
rsvp-lsp-fastreroute-05.txt> (work in
progress).
Stewart Bryant [RFC1701] S. Hanks. et al.,RFC-1701,"Generic Routing
Cisco Systems, Encapsulation (GRE)". October 1994.
250, Longwater Avenue,
Green Park,
Reading, RG2 6GB,
United Kingdom. Email: stbryant@cisco.com
Clarence Filsfils [RFC1853] Simpson, W., RFC-1853, "IP in IP Tunneling".
Cisco Systems, October 1995.
De Kleetlaan 6a,
1831 Diegem,
Belgium Email: cfilsfil@cisco.com
Stefano Previdi [RFC3032] Rosen E., et al., RFC-3032, "MPLS Label Stack
Cisco Systems, Encoding", January 2001.
Via Del Serafico 200
00142 Roma,
Italy Email: sprevidi@cisco.com
Mike Shand Authors' Addresses
Cisco Systems,
250, Longwater Avenue,
Green Park,
Reading, RG2 6GB,
United Kingdom. Email: mshand@cisco.com
Full Copyright statement Stewart Bryant
Cisco Systems,
250, Longwater Avenue,
Green Park,
Reading, RG2 6GB,
United Kingdom. Email: stbryant@cisco.com
Copyright (C) The Internet Society (2004). All Rights Reserved. Clarence Filsfils
Cisco Systems,
De Kleetlaan 6a,
1831 Diegem,
Belgium Email: cfilsfil@cisco.com
This document is subject to the rights, licenses and restrictions Stefano Previdi
contained in BCP 78, and except as set forth therein, the authors Cisco Systems,
retain all their rights. Via Del Serafico 200
00142 Roma,
Italy Email: sprevidi@cisco.com
This document and the information contained herein are provided on an Mike Shand
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS Cisco Systems,
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 250, Longwater Avenue,
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, Green Park,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE Reading, RG2 6GB,
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED United Kingdom. Email: mshand@cisco.com
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
 End of changes. 252 change blocks. 
1136 lines changed or deleted 995 lines changed or added

This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/