Network Working Group                                   A. Csaszar (Ed.)
Internet Draft                                                 G. Enyedi
Intended status: Standards Track                             J. Tantsura
Expires: December 6, 2012                                        S. Kini
                                                                Ericsson

                                                                J. Sucec
                                                                  S. Das
                                                               Telcordia

                                                            June 6, 2012


                 IP Fast Re-Route with Fast Notification
                      draft-csaszar-ipfrr-fn-03.txt

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on December 6, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.

Abstract

   This document describes the benefits and main applications of
   sending explicit fast notification (FN) packets to routers in an
   area.
   FN packets are generated and processed in the dataplane, and a
   single FN service can replace existing OAM methods for remote
   failure detection, such as a full mesh of multi-hop BFD sessions.
   The FN service, therefore, decreases network overhead considerably.
   The main application is fast reroute in pure IP and in IP/LDP-MPLS
   networks, called IPFRR-FN.  The detour paths used when IPFRR-FN is
   active are in most cases identical to those used after Interior
   Gateway Protocol (IGP) convergence.  The proposed mechanism can
   address all single link, node, and SRLG failures in an area;
   moreover, it is an efficient solution to protect against BGP ASBR
   failures as well as VPN PE router failures.  IPFRR-FN can be a
   supplemental tool to provide FRR when LFA cannot repair a failure
   case, and it can replace existing ASBR/PE protection mechanisms by
   overcoming their scalability and complexity issues.

Table of Contents

   1. Introduction
   2. Overview of current IPFRR Proposals based on Local Repair
   3. Requirements of an Explicit Failure Signaling Mechanism
   4. Conceptual Operation of IPFRR relying on Fast Notification
      4.1. Preparation Phase
      4.2. Failure Reaction Phase
         4.2.1. Activating Failure Specific Backups
         4.2.2. SRLG Handling
      4.3. Example and Timing
      4.4. Scoping FN Messages with TTL
   5. Operation Details
      5.1. Transport of Fast Notification Messages
      5.2. Message Handling and Encoding
         5.2.1. Failure Identification Message for OSPF
         5.2.2. Failure Identification Message for ISIS
      5.3. Protecting External Prefixes
         5.3.1. Failure on the Intra-Area Path Leading to the ASBR
         5.3.2. Protecting ASBR Failures: BGP-FRR
            5.3.2.1. Primary and Backup ASBR in the Same Area
            5.3.2.2. Primary and Backup ASBR in Different Areas
      5.4. Application to LDP
      5.5. Application to VPN PE Protection
      5.6. Bypassing Legacy Nodes
      5.7. Capability Advertisement
      5.8. Constraining the Dissemination Scope of Fast Notification
           Packets
         5.8.1. Pre-Configured FN TTL Setting
         5.8.2. Advanced FN Scoping
   6. Protection against Replay Attacks
      6.1. Calculating LSDB Digest
   7. Security Considerations
   8. IANA Considerations
   9. References
      9.1. Normative References
      9.2. Informative References
   10. Acknowledgments
   Appendix A. Memory Needs of a Naive Implementation
      A.1. An Example Implementation
      A.2. Estimation of Memory Requirements
      A.3. Estimation of Failover Time
   Appendix B. Impact Scope of Fast Notification
1. Introduction

   Convergence of link-state IGPs, such as OSPF or IS-IS, after a link
   or node failure is known to be relatively slow.  While this may be
   sufficient for many applications, some network SLAs and applications
   require faster reaction to network failures.

   IGP convergence time is composed mainly of:

   1. Failure detection at nodes adjacent to the failure

   2. Advertisement of the topology change

   3. Calculation of new routes

   4. Installing new routes to linecards

   Traditional Hello-based failure detection methods of link-state IGPs
   are relatively slow; hence a new, optimized Hello protocol has been
   standardized [BFD], which can reduce failure detection times to the
   range of 10ms even if no lower layer notices the failure quickly
   (like loss of signal, etc.).

   Even with fast failure detection, reaction times of IGPs may take
   several seconds, and even with a tuned configuration they may take
   at least a couple of hundred milliseconds.

   To decrease fail-over time even further, IPFRR techniques [RFC5714]
   can be introduced.  IPFRR solutions compliant with [RFC5714] target
   fail-over time reduction in steps 2-4 with the following design
   principles:

         IGP                              IPFRR

   2. Advertisement of the      ==>  No explicit advertisement,
      topology change                only local repair

   3. Calculation of new routes ==>  Pre-computation of new
                                     routes

   4. Installing new routes     ==>  Pre-installation of backup
      to linecards                   routes

   Pre-computing means that the way of bypassing a failed resource is
   computed before any failure occurs.  In order to limit complexity,
   IPFRR techniques typically prepare for single link, single node and
   single Shared Risk Link Group (SRLG) failures, which are undoubtedly
   the most common failure types.  The pre-calculated backup routes are
   also downloaded to linecards in preparation for the failure, thus
   sparing the lengthy communication between control plane and data
   plane when a failure happens.

   The principle of local rerouting requires forwarding a packet along
   a detour even if only the immediate neighbors of the failed resource
   know about the failure.  IPFRR methods observing the local rerouting
   principle do not explicitly propagate the failure information.
   Unfortunately, packets on detours must be handled differently from
   normal packets, as otherwise they might get returned to the failed
   resource.  Rephrased: a node not having *any* sort of information
   about the failure may loop the packet back to the node from which it
   was rerouted, simply because its default routing/forwarding
   configuration dictates that.  As an example, see the following
   figure.  Assuming a link failure between A and Dst, A needs to drop
   packets heading to Dst.  If node A forwarded packets to Src, and if
   the latter had absolutely no knowledge of the failure, a loop would
   be formed between Src and A.
          +---+             +---+
          | B |-------------| C |
          +---+             +---+
            /                   \
           /                     \
          /                       \
   +---+            +---+ failure +---+
   |Src|------------| A |----X----|Dst|
   +---+            +---+         +---+
     =========>==============>=========>
                Primary path

   Figure 1 Forwarding inconsistency in case of local repair: the path
   from Src to Dst leads through A

   The basic problem that previous IPFRR solutions struggle to solve
   is, therefore, to provide consistent hop-by-hop routing without
   explicit signaling of the failure.

   To provide protection for all single failure cases in arbitrary
   topologies, the information about the failure must be given in
   *some* way to other nodes.  That is, IPFRR solutions targeting full
   failure coverage need to signal the fact, and to some extent the
   identity, of the failure within the data packet, as no explicit
   signaling is allowed.  Such solutions have turned out to be
   considerably complex and hard or impossible to implement
   practically.  The Loop Free Alternates (LFA) solution [RFC5286] does
   not give the failure information in any way to other routers, and so
   it cannot repair all failure cases, such as the one in Figure 1.

   As discussed in Section 2, solutions that address full failure
   coverage and rely on local repair, i.e. carrying some failure
   information within the data packets, present an overly complex and
   therefore often impractical alternative to LFA.  This draft,
   therefore, suggests that relaxing the local re-routing principle
   with carefully engineered explicit failure signaling is an effective
   approach.

   The idea of using explicit failure notification for IPFRR has been
   proposed before for Remote LFA Paths [RLFAP].  RLFAP sends explicit
   notifications and can limit the radius within which the notification
   is propagated to enhance scalability.  Design, implementation and
   enhancements for the remote LFAP concept are reported in [Hok2007],
   [Hok2008] and [Cev2010].

   This draft attempts to work out in more detail what kind of failure
   dissemination mechanism is required to facilitate remote repair
   efficiently.  Requirements for explicit signaling are given in
   Section 3.  Unlike RLFAP, this draft does not limit the failure
   advertisement radius.  As a result, the detour paths remain stable
   in most cases, since they are identical to those that the IGP will
   calculate after convergence.  Hence, micro-loops will not occur
   after IGP convergence.

   A key contribution of this memo is to recognize that a Fast
   Notification service is not only an enabler for a new IPFRR approach
   but also a replacement for various OAM remote connectivity
   verification procedures such as multi-hop BFD.  These previous
   methods posed considerable overhead to the network: (i) management
   of many OAM sessions; (ii) careful configuration of connectivity
   verification packet intervals so that no false alarm is raised for
   network-internal failures that are handled by other mechanisms; and
   (iii) packet processing overhead, since connectivity verification
   packets have to be transmitted continuously through the network in a
   mesh, even in fault-free conditions.
2. Overview of current IPFRR Proposals based on Local Repair

   The only practically feasible solution so far, Loop Free
   Alternates [RFC5286], offers the simplest resolution of the hop-by-
   hop routing consistency problem: a node performing fail-over may
   only use a next-hop as backup if it is guaranteed that that neighbor
   does not send the packets back.  Such neighbors are called Loop-Free
   Alternates (LFA).  LFAs, however, do not always exist, as shown in
   Figure 1 above, where node A has no LFA with respect to Dst.  While
   it is true that tweaking the network configuration may boost LFA
   failure case coverage considerably [Ret2011], LFAs cannot protect
   all failure cases in arbitrary network topologies.
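   For illustration only (this is not part of the specification), the
   following Python sketch evaluates the loop-free condition of
   [RFC5286], dist(N,D) < dist(N,S) + dist(S,D), on the topology of
   Figure 1 with assumed unit link costs; all names are hypothetical:

      import heapq

      def dist(graph, src):
          """Single-source shortest-path distances (Dijkstra)."""
          d = {src: 0}
          pq = [(0, src)]
          while pq:
              du, u = heapq.heappop(pq)
              if du > d.get(u, float("inf")):
                  continue                      # stale queue entry
              for v, w in graph[u].items():
                  if du + w < d.get(v, float("inf")):
                      d[v] = du + w
                      heapq.heappush(pq, (d[v], v))
          return d

      # Figure 1 topology, unit costs.
      graph = {
          "Src": {"A": 1, "B": 1},
          "A":   {"Src": 1, "Dst": 1},
          "B":   {"Src": 1, "C": 1},
          "C":   {"B": 1, "Dst": 1},
          "Dst": {"A": 1, "C": 1},
      }

      def lfas(graph, s, dest):
          """Neighbours N of s with dist(N,dest) < dist(N,s) +
          dist(s,dest); the destination itself (here also the primary
          next-hop) is excluded."""
          d_s = dist(graph, s)
          out = []
          for n in graph[s]:
              if n == dest:
                  continue
              d_n = dist(graph, n)
              if d_n[dest] < d_n[s] + d_s[dest]:
                  out.append(n)
          return out

      print(lfas(graph, "A", "Dst"))   # [] -> A has no LFA towards Dst

   For node A, the only candidate is Src, and dist(Src,Dst) = 2 is not
   smaller than dist(Src,A) + dist(A,Dst) = 2, confirming the failure
   case of Figure 1.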
   The exact way of adding extra information to data packets and its
   usage for forwarding is the most important property differentiating
   the existing IPFRR proposals.

   Packets can be marked "implicitly", when they are not altered in any
   way but some extra information owned by the router helps decide the
   correct way of forwarding.  Such extra information can be, for
   instance, the direction of the packet, i.e. the incoming interface,
   as in [FIFR].  Such solutions require what is called interface-based
   or interface-specific forwarding.

   Interface-based forwarding significantly changes the well-
   established nature of IP's destination-based forwarding principle,
   where the IP destination address alone determines the next hop.  One
   embodiment would need to download different FIBs for each physical
   or virtual IP interface - not a very compelling idea.  Another
   embodiment would alter the next-hop selection process by adding the
   incoming interface id to the lookup fields, which would impact
   forwarding performance considerably.

   Other solutions mark data packets explicitly.  Some proposals
   suggest using free bits in the IP header [MRC], which unfortunately
   do not exist in the IPv4 header.  Other proposals resort to
   encapsulating re-routed packets with an additional IP header, as in
   e.g. [NotVia], [Eny2009] or [MRT-ARCH].  Encapsulation raises the
   problem of fragmentation and reassembly, which could be a
   performance bottleneck if many packets are sent at MTU size.
   Another significant problem is the additional management complexity
   of the encapsulation addresses, which have their own semantics and
   require cumbersome routing calculations, see e.g. [MRT-ALG].
   Encapsulation in the IP header translates to label stacking in LDP-
   MPLS.  The above mentioned mechanisms either encode the active
   topology ID in a label on the stack or encode the failure point in a
   label, and also require an increasing mesh of targeted LDP sessions
   to acquire a valid label at the detour endpoint, which is another
   level of complexity.

3. Requirements of an Explicit Failure Signaling Mechanism

   All local repair mechanisms touched on above try to avoid explicit
   notification of the failure via signaling, and instead try to
   squeeze some failure-related information into data packets.  This is
   mainly due to the relatively low signaling performance of legacy
   hardware.  Failure notification, therefore, should fulfill the
   following properties to be practically feasible:

   1. The signaling mechanism should be reliable.  The mechanism needs
      to propagate the failure information to all interested nodes even
      in a network where a single link or a node is down.

   2. The mechanism should be fast in the sense that getting the
      notification packet to remote nodes through possibly multiple
      hops should not require (considerably) more processing at each
      hop than plain fast path packet forwarding.

   3. The mechanism should involve simple and efficient processing to
      be feasible for implementation in the dataplane.  This goal
      manifests itself in three ways:

      a. Origination of a notification should be very easy, e.g.
         creating a simple IP packet, the payload of which can be
         filled easily.

      b. When receiving the packet, it should be easy for dataplane
         linecards to recognize, so that processing can commence after
         forwarding.

      c. No complex operations should be required to extract the
         information from the packet needed to activate the correct
         backup routes.

   4. The mechanism should be trustable; that is, it should provide
      means to verify the authenticity of the notifications without a
      significant increase of the processing burden in the dataplane.

   5. Duplication of notification packets should be either strictly
      bounded or handled without significant dataplane processing
      burden.

   These requirements present a trade-off.  A proper balance needs to
   be found that offers good enough authentication and reliability
   while keeping processing complexity sufficiently low to be feasible
   for data plane implementation.  One such solution is proposed in
   [fn-transport], which is the assumed notification protocol in the
   following.

4. Conceptual Operation of IPFRR relying on Fast Notification

   This section outlines the operation of an IPFRR mechanism relying on
   Fast Notification.

4.1. Preparation Phase

   Like any other IPFRR solution, IPFRR-FN requires quick failure
   detection mechanisms in place, such as lower layer upcalls or BFD.
   The FN service needs to be activated and configured so that, once
   triggered by a local failure detection method, FN disseminates the
   information identifying the failure to the area.

   Based on the detailed topology database obtained by a link state
   IGP, the node should pre-calculate alternative paths considering
   *relevant* link or node failures in the area.  Failure specific
   alternative path computation should typically be executed at lower
   priority than other routing processing.  Note that the calculation
   can be done "offline", while the network is intact and the CP has
   few things to do.

   Also note the word *relevant* above: a node does not need to compute
   all the shortest paths with respect to each possible failure; only
   those link failures that lie on the shortest path tree rooted at the
   node need to be taken into consideration.

   To provide protection for Autonomous System Border Router (ASBR)
   failures, the node will need information not only from the IGP but
   also from BGP.  This is described in detail in Section 5.3.

   After calculating the failure specific alternative next-hops, only
   those that represent a change compared to the primary next-hop
   should be pre-installed to the linecards, together with the
   identifier of the failure that triggers the switch-over.  In order
   to preserve scalability, external prefixes are handled through the
   FIB indirection available in most routers already.  Due to
   indirection, backup routes need to be installed only for egress
   routers.  (The resource needs of an example implementation are
   briefly discussed in Appendix A.)
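   A minimal Python sketch of this pre-computation is given below.  It
   is illustrative only: it assumes a link-state graph given as nested
   cost dictionaries, considers only the link failures on the node's
   own shortest path tree, and stores only the next-hops that differ
   from the primary ones.

      import heapq

      def spf(graph, src, removed=frozenset()):
          """Dijkstra from src, skipping failed links given as
          frozenset({u, v}) pairs in `removed`.  Returns (first_hop,
          tree_links): the first hop towards every destination and the
          links of the shortest path tree."""
          dist = {src: 0.0}
          done, first, tree = set(), {}, set()
          pq = [(0.0, src, None, None)]   # (dist, node, first hop, parent)
          while pq:
              d, u, fh, p = heapq.heappop(pq)
              if u in done:
                  continue
              done.add(u)
              if p is not None:
                  first[u] = fh
                  tree.add(frozenset((p, u)))
              for v, w in graph[u].items():
                  if v in done or frozenset((u, v)) in removed:
                      continue
                  nd = d + w
                  if nd < dist.get(v, float("inf")):
                      dist[v] = nd
                      heapq.heappush(pq, (nd, v, v if p is None else fh, u))
          return first, tree

      def precompute_backups(graph, me):
          """Only link failures on our own SPT are relevant; keep only
          failure specific next-hops differing from the primary ones,
          keyed by the failure that activates them (Section 4.1)."""
          primary, tree = spf(graph, me)
          backups = {}
          for link in tree:
              alt, _ = spf(graph, me, removed={link})
              changed = {dst: nh for dst, nh in alt.items()
                         if primary.get(dst) != nh}
              if changed:
                  backups[link] = changed
          return backups

   On the Figure 1 graph, precompute_backups(graph, "A") would record,
   for the failure of link A-Dst, the changed next-hop Src for the
   destinations Dst and C only.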
4.2. Failure Reaction Phase

   The main steps to be taken after a failure are the following:

   1. Quick dataplane failure detection

   2. Send information about the failure using the FN service right
      from the dataplane.

   3. Forward the received notification as defined by the actually used
      FN protocol, such as the one in [fn-transport]

   4. After learning about a local or remote failure, extract the
      failure identifier and activate failure specific backups, if
      needed, directly within the dataplane

   5. Start forwarding data traffic using the updated FIB

   After a node detects the loss of connectivity to another node, it
   should decide whether the failure can be handled locally.  If local
   repair is not possible or not configured, for example because LFA is
   not configured or there are destinations for which no LFA exists,
   the failure should trigger the FN service to disseminate the failure
   description.  For instance, if BFD detects a dataplane failure, it
   not only should invoke routines to notify the control plane but
   should first trigger FN before notifying the CP.

   After receiving the trigger, without any DP-CP communication
   involved, FN constructs a packet and adds the description of the
   failure (described in Section 5.2) to the payload.  The notification
   describes that

   o  a node X has lost connectivity

   o  to a node Z

   o  via a link L.

   The proposed encoding of the IPFRR-FN packet is described in
   Section 5.2.

   The packet is then disseminated by the FN service in the routing
   area.  Note the synergy between BFD and IGP Hellos on the one hand,
   and between FN and IGP link state advertisements on the other: BFD
   is a dataplane-optimized implementation of the routing protocol's
   Hello mechanism, while Fast Notification is a dataplane-optimized
   implementation of the link state advertisement flooding mechanism of
   IGPs.

   At each hop, the recipient node needs to perform a "punt and
   forward".  That is, the FN packet not only needs to be forwarded to
   the FN neighbors as the specific FN mechanism dictates, but a
   replica also needs to be detached and, after forwarding, processed
   by the dataplane card.

4.2.1. Activating Failure Specific Backups

   After the forwarding element has extracted the contents of the
   notification packet, it knows that a node X has lost connectivity to
   a node Z via a link L.  The recipient now needs to decide whether
   the failure was a link or a node failure.  Two approaches can be
   thought of.  Both options are based on the property that
   notifications advance in the network as fast as possible.

   In the first option, the router does not immediately make the
   decision, but instead starts a timer set to fire after a couple of
   milliseconds.  If the failure was a node failure, the node will
   receive further notifications saying that another node Y has lost
   connectivity to node Z through another link M.  That is, if node Z
   is common in multiple notifications, the recipient can conclude that
   it is a node failure and already knows which node it is (Z).  If
   link L is common, then the recipient can decide for a link failure
   (L).  If further inconclusive notifications arrive, then there are
   multiple failures, a case that is out of scope for IPFRR and left
   for regular IGP convergence.  A sketch of this option is given
   below.
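   The following Python sketch (hypothetical, illustrative only)
   implements the first option: notifications are held for a short
   timer, after which a common link implies a link/SRLG failure and a
   common node implies a node failure.  The activate() callback, which
   would swap in the pre-installed failure specific next-hops, is
   assumed and not shown; locking is omitted for brevity.

      import threading

      HOLD_MS = 5   # illustrative; the draft suggests a few milliseconds

      class FailureCorrelator:
          def __init__(self, activate):
              self.activate = activate
              self.pending = []      # (originator X, neighbour Z, link L)
              self.timer = None

          def on_notification(self, x, z, link):
              self.pending.append((x, z, link))
              if self.timer is None:   # first FN: arm the decision timer
                  self.timer = threading.Timer(HOLD_MS / 1000.0,
                                               self.decide)
                  self.timer.start()

          def decide(self):
              nodes = {z for _, z, _ in self.pending}
              links = {l for _, _, l in self.pending}
              if len(links) == 1:
                  self.activate("link", links.pop())   # common link/SRLG
              elif len(nodes) == 1:
                  self.activate("node", nodes.pop())   # common node Z
              else:
                  pass   # inconclusive: multiple failures, left to the IGP
              self.pending, self.timer = [], None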
   After concluding on the exact failure, the data plane element needs
   to check in its pre-installed IPFRR database whether this particular
   failure results in any route changes.  If yes, the linecard replaces
   the next-hops impacted by that failure with their failure specific
   backups, which were pre-installed in the preparation phase.

   In the second option, the first received notification is immediately
   handled as a link failure, hence the router may start replacing its
   next-hops.  In many cases this is a good decision, as it has been
   shown before that most network failures are link failures.  If,
   however, another notification arrives a couple of milliseconds later
   that points to a node failure, the router then needs to start
   replacing its next-hops again.  This may cause a route flap, but due
   to the quick dissemination mechanism the routing inconsistency is
   very short lived and likely takes only a couple of milliseconds.

4.2.2. SRLG Handling

   The above conceptual solution is easily extensible to support pre-
   configured SRLGs.  Namely, if the failed link is part of an SRLG,
   then the disseminated link ID should identify the SRLG itself.  As a
   result, possible notifications describing other link failures of the
   same SRLG will identify the same resource.

   If the control plane knows about SRLGs, it can prepare for failures
   of these, e.g. by calculating a path that avoids all links in that
   SRLG.  SRLG identifiers may have been pre-configured or obtained by
   automated mechanisms such as [RFC4203].

4.3. Example and Timing

   The main message of this section is that high-delay links do not
   represent a problem for IPFRR-FN.  The FN message of course
   propagates more slowly on long-haul links, but the same delay is
   incurred by normal data packets as well.  Packet loss only takes
   place as long as a node forwards traffic to an incorrect or
   inconsistent next-hop.  This may happen in two cases:

   First, as long as the failure is not detected, the node adjacent to
   the failure only has the failed next-hop installed.

   Secondly, when a node (A) selects a new next-hop (B) after detecting
   the failure locally or by receiving an FN, the question is whether
   the routing in the new next-hop (B) is consistent by the time the
   first data packets get from A to B.  The following timeline depicts
   the situation:

   Legend:   X : period with packet loss
             FN forwarding delay: |--|

        |--|--------|
   A:----1XX2XXXXXXXX3-----------------------------------------------
        |----|      |----|
   B:------------4--5-----6XX7---------------------------------------
                    |--|--------|

   (a) Link delay is |----|   FIB update delay is |--------|

        |--|--------|
   A:----1XX2XXXXXXXX3-----------------------------------------------
        |---------------|
                    |---------------|
   B:-----------------------4--5-----6XX7-----------------------------
                            |--|--------|

   (b) Link delay is |---------------|   FIB update delay is |--------|

   Figure 2 Timing of FN and data packet forwarding

   As can be seen above, the outage time is only influenced by the FN
   forwarding delay and the FIB update time.  The link delay is not a
   factor.  Node A forwards the first re-routed packets from time
   instance 3 to node B.  These reach node B at time instance 6.  Node
   B performs incorrect/inconsistent forwarding as long as it tries to
   forward packets back to A that A has already put onto the detour.
   This is the interval between time instances 6 and 7.
4.4. Scoping FN Messages with TTL

   In a large routing area it is often the case that a failure (i.e. a
   topology change) causes next-hop changes only in routers relatively
   close to the failure.  Analysis of certain random topologies and two
   example ISP topologies revealed that, for the particular topologies
   under study, a single link failure event generated routing table
   changes only in routers not more than 2 hops away from the failure
   site [Hok2008].  Based on this analysis, it is anticipated that in
   practice the TTL for failure notification messages can be set to a
   relatively small radius, perhaps as small as 2 or 3 hops.

   A chief benefit of TTL scoping is that it reduces the overhead on
   routers that have no use for the information (i.e. that do not need
   to re-route).  Another benefit (that is particularly important for
   links with scarce capacity) is that it helps to constrain the
   control overhead incurred on network links.  Determining a suitable
   TTL value for each locally originated event, and controlling failure
   notification dissemination in general, is discussed further in
   Section 5.8.

5. Operation Details

5.1. Transport of Fast Notification Messages

   This draft recommends that, out of the several FN delivery options
   defined in [fn-transport], the flooding transport option be
   preferred: it ensures that any event can reach each node from any
   source with any single failure present in the network area, as long
   as this is theoretically possible.  Flooding also ensures that FN
   messages reach each node on the shortest (delay) path, and as a side
   effect failure notifications always reach *each* node *before* re-
   routed data packets could reach that node.  This means that looping
   is minimized.

   [fn-transport] describes that the dataplane flooding procedure
   requires routers to perform duplicate checking before forwarding the
   notifications to other interfaces, to avoid duplicating
   notifications.  [fn-transport] describes that the duplicate check
   can be performed with a simple storage queue, where previously
   received notification packets or their signatures are stored.

   IPFRR-FN enables another duplicate check process, based on its
   internal state machine.  Routers, after receiving a notification but
   before forwarding it to other peers, check the authenticity of the
   message, if authentication is used.  The router may then compare the
   stored event with the event described by the received notification.

   Two variables and a bit describe the currently known failure state:

   o  Suspected failed node ID (denoted by N)

   o  Suspected link/SRLG ID (denoted by S)

   o  Bit indicating the type of the failure, i.e. link/SRLG failure or
      node failure (denoted by T)

   Recall that the incoming notification describes that a node X has
   lost connectivity to a node Z via a link L.
   The state machine can then be described with the following pseudo-
   code:

   //current state:
   //   N: ID of suspected failed node
   //   S: ID of suspected failed link/SRLG
   //   T: bit indicating the type of the failure
   //        T=0 indicates link/SRLG
   //        T=1 indicates node
   //
   Proc notification_received(Node X, Node Z, SRLG L) {
     // node X reports lost connectivity to node Z via link/SRLG L
     if (N == NULL) {
       // this is a new event: store it and forward it
       N=Z;
       S=L;
       T=0; //which is the default anyway
       Forward_notification;
     }
     else if (S == L AND T == 0) {
       // same link or SRLG as before; nothing to do
       Discard_notification;
     }
     else if (N == Z) {
       // the suspected node is common: this is a node failure
       if (T == 0) {
         // it has just turned out to be a node failure
         T=1;
         Forward_notification;
       }
       else {
         // already known to be a node failure; no need to forward
         Discard_notification;
       }
     }
     else {
       // inconclusive: multiple failures, left to IGP convergence
     }
   }

   Figure 3 Pseudo-code of the state machine for FN forwarding

5.2. Message Handling and Encoding

   A failure identifier is needed that unambiguously describes the
   failed resource, consistently among the nodes in the area.  The
   semantics of the identifiers are defined by the IGP used to pre-
   calculate and pre-install the backup forwarding entries, e.g. OSPF
   or ISIS.

   This draft defines a Failure Identification message class.  Members
   of this class represent a routing protocol specific Failure
   Identification message to be carried by the Fast Notification
   transport protocol.  Each message within the Failure Identification
   message class shall contain the following fields, the lengths of
   which are routing protocol specific.  The exact values shall be
   aligned with the WG of the routing protocol:

   o  Originator Router ID: the identifier of the router advertising
      the failure;

   o  Neighbour Router ID: the identifier of the neighbour node to
      which the originator lost connectivity.

   o  Link ID: the identifier of the link through which connectivity
      was lost to the neighbour.  The routing protocol should assign
      the same Link ID to bidirectional, broadcast or multi-access
      links from each access point, consistently.

   o  Sequence Number: [fn-transport] expects applications of the FN
      service that require replay attack protection to create and
      verify a sequence number in FN messages.  This is described in
      Section 6.

   Routers forwarding the FN packets should ensure that Failure
   Identification messages are not lost, e.g. due to congestion.  FN
   packets can be put into a high precedence traffic class (e.g. the
   Network Control class).  If the network environment is known to be
   lossy, the FN sender should repeat the same notification a couple of
   times, like salvo fire.

   After the forwarding element has processed the FN packet and
   extracted the Failure Identification message, it should decide what
   backups need to be activated, if any, as described in Section 4.2.1.
5.2.1. Failure Identification Message for OSPF

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           FN Length           |  FN App Type  |AuType |unused |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Originator Router ID                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Neighbour Router ID                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Link ID                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Sequence Number (cont'd)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   FN Header fields:

   FN Length
      The length of the Failure Identification message for OSPF is 16
      bytes.

   FN App Type
      The exact values are to be assigned by IANA for the Failure
      Identification message class.  For example, FN App Type values
      between 0x0008 and 0x000F could represent Failure Identification
      messages, of which 0x0008 could mean OSPF and 0x0009 ISIS.

   AuType
      IPFRR-FN relies on the authentication options offered by the FN
      transport service.  Cryptographic authentication is recommended.

   Originator Router ID
      If the routing protocol is OSPF, the value can take the OSPF
      Router ID of the advertising router.

   Neighbour Router ID
      The OSPF Router ID of the neighbour router to which connectivity
      was lost.

   Link ID
      If the link is a LAN, the Link ID takes the LSA ID of the Network
      LSA representing it.
      If the link is a point-to-point link, the Link ID can take the
      minimum or the maximum of the two interface IDs.  The requirement
      is that this is done consistently.

   Sequence Number
      This field stores a digest of the LSDB of the routing protocol,
      as described in Section 6.
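   As an illustration only, a Python sketch of serializing this message
   is shown below.  The exact field widths of the first 32-bit word are
   not pinned down by the figure; this sketch assumes FN Length:16
   bits, FN App Type:8 bits, AuType:4 bits, unused:4 bits, and an
   App Type value of 0x08 for OSPF pending IANA assignment.  These
   assumptions, and all names, are the author's, not the protocol's.

      import struct

      FN_APP_TYPE_OSPF = 0x08   # hypothetical, pending IANA assignment
      FN_LENGTH_OSPF = 16       # fixed length stated in Section 5.2.1

      def build_failure_identification(originator_id, neighbour_id,
                                       link_id, lsdb_digest, autype=0):
          """Serialize an OSPF Failure Identification message.  The IDs
          are 32-bit OSPF router/LSA IDs; lsdb_digest fills the 64-bit
          Sequence Number field described in Section 6."""
          word0 = ((FN_LENGTH_OSPF << 16) |
                   (FN_APP_TYPE_OSPF << 8) |
                   ((autype & 0xF) << 4))
          return struct.pack("!IIIIQ", word0, originator_id,
                             neighbour_id, link_id, lsdb_digest)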
5.2.2. Failure Identification Message for ISIS

   TBA.

5.3. Protecting External Prefixes

5.3.1. Failure on the Intra-Area Path Leading to the ASBR

   Installing failure specific backup next-hops for each external
   prefix would be a scalability problem, as the number of these
   prefixes may be one or two orders of magnitude higher than the
   number of intra-area destinations.  To avoid this, it is suggested
   to make use of the indirection already offered by router vendors.

   Indirection means that when a packet needs to be forwarded to an
   external destination, the IP address lookup in the FIB does not
   return a direct result but a pointer to another FIB entry, i.e. to
   the FIB entry of the ASBR.  In LDP/MPLS this means that all prefixes
   reachable through the same ASBR constitute the same FEC.

   As an example, consider that in an area ASBR1 is the primary BGP
   route for prefixes P1, P2, P3 and P4, and ASBR2 is the primary route
   for prefixes P5, P6 and P7.  A FIB arrangement for this scenario
   could be the one shown in the following figure.  Prefixes using the
   same ASBR could be resolved to the same pointer that references the
   next-hop leading to the ASBR.  Prefixes resolved to the same pointer
   are said to be part of the same "prefix group" or FEC.

       FIB lookup            |        FIB lookup
                             |
   ASBR2 ========> NH2       |    ASBR2 ========> NH2 <----+
   ASBR1 ========> NH1       |    ASBR1 ========> NH1 <-+  |
                             |                          |  |
   P1    ========> NH1       |    P1    ========> Ptr1 -+  |
   P2    ========> NH1       |    P2    ========> Ptr1 -+  |
   P3    ========> NH1       |    P3    ========> Ptr1 -+  |
   P4    ========> NH1       |    P4    ========> Ptr1 -+  |
                             |                             |
   P5    ========> NH2       |    P5    ========> Ptr2 ----+
   P6    ========> NH2       |    P6    ========> Ptr2 ----+
   P7    ========> NH2       |    P7    ========> Ptr2 ----+
                             |

   Figure 4 FIB without (left) and with (right) indirection

   If the next-hop to an ASBR changes, it is enough to update the next-
   hop of the ASBR route in the FIB.  In the above example, this means
   that if the next-hop of ASBR1 changes, it is enough to update the
   route entry for ASBR1; due to indirection through pointer Ptr1, this
   updates several prefixes at the same time.
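   The indirection of Figure 4 can be pictured with a shared mutable
   entry, as in the following illustrative Python sketch (names are
   hypothetical):

      class FibGroup:
          """Shared next-hop container: the 'pointer' of Figure 4.  All
          prefixes of one prefix group (FEC) reference one instance."""
          def __init__(self, next_hop):
              self.next_hop = next_hop

      # Right-hand FIB of Figure 4: one group per egress ASBR.
      ptr1 = FibGroup("NH1")        # prefixes whose primary is ASBR1
      ptr2 = FibGroup("NH2")        # prefixes whose primary is ASBR2
      fib = {"P1": ptr1, "P2": ptr1, "P3": ptr1, "P4": ptr1,
             "P5": ptr2, "P6": ptr2, "P7": ptr2}

      # If the next-hop towards ASBR1 changes, one write updates P1..P4:
      ptr1.next_hop = "NH3"
      assert fib["P2"].next_hop == "NH3"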
5.3.2. Protecting ASBR Failures: BGP-FRR

   IPFRR-FN can make use of alternative BGP routes advertised in an AS
   by new extensions of BGP such as [BGPAddPaths], [DiverseBGP] or
   [BGPBestExt].  Using these extensions, for each destination prefix,
   a node may learn a "backup" ASBR besides the primary ASBR learnt
   through normal BGP operation.

5.3.2.1. Primary and Backup ASBR in the Same Area

   If the failed ASBR is inside the area, all nodes within that area
   get notified by FN.  Grouping prefixes into FECs, however, needs to
   be done carefully.  Prefixes now constitute a common group (i.e. are
   resolved to the same pointer) only if *both* their primary AND their
   backup ASBRs are the same.  This is due to the fact that even if two
   prefixes use the same ASBR by default, they may use different ASBRs
   when their common default ASBR fails.

   Considering the previous example, let us assume that the backup ASBR
   of prefixes P1 and P2 is ASBR2 but that the backup ASBR of P3 and P4
   is ASBR3.  Let us further assume that P5 has ASBR3 as its backup
   ASBR, but P6 and P7 have ASBR4 as their backup ASBR.  The resulting
   FIB structure is shown in the following figure:

       FIB lookup
   ASBR4 ========> NH4
   ASBR2 ========> NH2
   ASBR3 ========> NH3
   ASBR1 ========> NH1

   P1    ========> Ptr1 -+-> NH1
   P2    ========> Ptr1 -+

   P3    ========> Ptr2 -+-> NH1
   P4    ========> Ptr2 -+

   P5    ========> Ptr3 ---> NH2

   P6    ========> Ptr4 -+-> NH2
   P7    ========> Ptr4 -+

   Figure 5 Indirect FIB for ASBR protection

   If, for example, ASBR1 goes down, this affects prefixes P1 through
   P4.  In order to set the correct backup routes, the container
   referenced by Ptr1 needs to be updated to NH2 (next-hop of ASBR2)
   but the location referenced by Ptr2 needs to be updated to NH3
   (next-hop of ASBR3).  This means that P1 and P2 may constitute one
   FEC, but P3 and P4 need to be another FEC, so that their backups can
   be set independently.

   Note that the routes towards ASBR2 or ASBR3 may have changed, too.
   For example, if after the failure ASBR3 used a new next-hop NH5,
   then the container referenced by Ptr2 should be updated to NH5.  A
   resulting detour FIB is shown in the following figure.

       FIB lookup
   ASBR4 ========> NH4
   ASBR2 ========> NH2
   ASBR3 ========> NH5
   ASBR1 ========> X

   P1    ========> Ptr1 -+-> NH2
   P2    ========> Ptr1 -+

   P3    ========> Ptr2 -+-> NH5
   P4    ========> Ptr2 -+

   P5    ========> Ptr3 ---> NH2

   P6    ========> Ptr4 -+-> NH2
   P7    ========> Ptr4 -+

   Figure 6 Indirect "detour" FIB in case of ASBR1 failure

   During pre-calculation, the control plane pre-downloaded the failure
   identifier of ASBR1, assigned NH5 as the failure specific backup for
   the routes of ASBR3 and pointer Ptr2, and assigned NH2 as the
   failure specific backup for the route referenced by Ptr1.

5.3.2.2. Primary and Backup ASBR in Different Areas

   By default, the scope of FN messages is limited to a single routing
   area.

   The IPFRR-FN application of FN may, however, need to redistribute
   some specific notifications across areas in a limited manner.

   If an ASBR1 in Area1 goes down and some prefixes need to use ASBR2
   in another Area2, then, besides Area1, routers in Area2 need to know
   about this failure.  Since communication between non-backbone areas
   goes through the backbone area, the backbone may also need the
   information.

   Naturally, if ASBR2 resides in the backbone area, then the FN about
   ASBR1's failure needs to be leaked only into the backbone area.

   Leaking is facilitated by area border routers (ABRs).  During the
   failure preparation phase, the routing engine of an ABR can
   determine that, for an intra-area ASBR, the backup ASBR is in a
   different area to which it is the ABR.  In that case, the routing
   engine installs such intra-area ASBRs in an "FN redistribution list"
   at the dataplane cards.

   The ABR, after receiving FN messages, may conclude in its state
   machine that a node failure has happened.  If the failed node is in
   the redistribution list, the ABR will generate an FN with the
   following data:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               16              |     0x0008    |AuType |unused |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         ABR Router ID                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         ASBR Router ID                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              0x0                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Sequence Number (cont'd)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This message is then distributed into the neighbour area specified
   in the redistribution list as a regular FN message.  A Link ID of
   0x0 specifically signals in the neighbour area that this failure is
   a known node failure of the node specified by the "Neighbour Router
   ID" field (which was set to the failed ASBR's ID).

   ABRs in a non-backbone area need to prepare to redistribute ASBR
   failure notifications from within their area into the backbone area.

   ABRs in the backbone area need to prepare to redistribute an ASBR
   failure notification from the backbone area into the area where a
   backup ASBR resides.
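   For illustration, the redistribution hook at an ABR could look like
   the following hypothetical Python sketch, reusing the message
   builder sketched in Section 5.2.1; the redistribution list and all
   identifiers are assumptions:

      # Maps an intra-area ASBR's router ID to the neighbour area(s)
      # hosting its backup ASBR (illustrative values).
      redistribution_list = {0x0A0000AA: ["Area0"]}

      def on_node_failure_concluded(abr_id, failed_node_id,
                                    lsdb_digest, send_fn):
          """If the failed node is a listed ASBR, leak a node-failure FN
          into the neighbour area with Link ID 0x0, the 'known node
          failure' marker."""
          for area in redistribution_list.get(failed_node_id, []):
              msg = build_failure_identification(abr_id, failed_node_id,
                                                 0x0, lsdb_digest)
              send_fn(area, msg)   # inject as a regular FN in that area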
   Consider the previous example, but now let us assume that the
   current area is Area0, that ASBR2 and ASBR3 reside in Area1
   (reachable through ABR1), and that ASBR4 resides in Area2 (reachable
   through ABR2).  The resulting FIBs are shown in the following
   figures: in case of ASBR2 failure, only Ptr4 needs an update.

       FIB lookup
   ABR1  ========> NH6
   ABR2  ========> NH7

   (ASBR4 ========> NH7)   //may or may not be in the FIB
   (ASBR2 ========> NH6)   //may or may not be in the FIB
   (ASBR3 ========> NH6)   //may or may not be in the FIB
   (ASBR1 ========> NH1)   //may or may not be in the FIB

   P1    ========> Ptr1 -+-> NH1
   P2    ========> Ptr1 -+

   P3    ========> Ptr2 -+-> NH1
   P4    ========> Ptr2 -+

   P5    ========> Ptr3 ---> NH6

   P6    ========> Ptr4 -+-> NH6
   P7    ========> Ptr4 -+

   Figure 7 Indirect FIB for inter-area ASBR protection

       FIB lookup
   ABR1  ========> NH6
   ABR2  ========> NH7

   (ASBR4 ========> NH7)   //may or may not be in the FIB
   (ASBR2 ========> X  )   //may or may not be in the FIB
   (ASBR3 ========> NH6)   //may or may not be in the FIB
   (ASBR1 ========> NH1)   //may or may not be in the FIB

   P1    ========> Ptr1 -+-> NH1
   P2    ========> Ptr1 -+

   P3    ========> Ptr2 -+-> NH1
   P4    ========> Ptr2 -+

   P5    ========> Ptr3 ---> NH6

   P6    ========> Ptr4 -+-> NH7
   P7    ========> Ptr4 -+

   Figure 8 Indirect "detour" FIB for inter-area ASBR protection, ASBR2
   failure

5.4. Application to LDP

   It is possible for LDP traffic to follow paths other than those
   indicated by the IGP.  To do so, it is necessary for LDP to have the
   appropriate labels available for the alternate, so that the
   appropriate out-segments can be installed in the forwarding plane
   before the failure occurs.

   This means that a Label Switching Router (LSR) running LDP must
   distribute its labels for the Forwarding Equivalence Classes (FECs)
   it can provide to all its neighbours, regardless of whether or not
   they are upstream.  Additionally, LDP must be acting in liberal
   label retention mode, so that the labels corresponding to neighbours
   that are not currently the primary neighbour are stored.  Similarly,
   LDP should be in downstream unsolicited mode, so that the labels for
   the FEC are distributed other than along the SPT.

   The above criteria are identical to those defined in [RFC5286].

   In IP, a received FN message may result in rewriting the next-hop in
   the FIB.  If LDP is applied, the label FIB also needs to be updated
   in accordance with the new next-hop; in the LFIB, however, not only
   the outgoing interface needs to be replaced but also the label, with
   the one valid for this non-default next-hop.  The latter is
   available thanks to liberal label retention and unsolicited
   downstream mode.
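   A minimal, illustrative Python sketch of such an LFIB update follows
   (the FEC name, labels and neighbour names are hypothetical):

      # Liberal label retention: for each FEC we keep the label
      # advertised by every neighbour, not just the current next-hop.
      lfib = {
          "FEC-D": {"next_hop": "C", "out_label": 301,
                    "labels": {"C": 301, "G": 507, "A": 112}},
      }

      def reroute_fec(fec, new_next_hop):
          """On an FN-driven next-hop change, rewrite both the next-hop
          and the label valid for that neighbour."""
          entry = lfib[fec]
          entry["next_hop"] = new_next_hop
          entry["out_label"] = entry["labels"][new_next_hop]

      reroute_fec("FEC-D", "G")
      assert lfib["FEC-D"]["out_label"] == 507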
5.5. Application to VPN PE Protection

   Protecting against (egress) PE router failures in VPN scenarios is
   conceptually similar to protecting against ASBR failures for
   Internet traffic.  The difference is that in the ASBR protection
   case core routers are normally aware of external prefixes through
   iBGP, while in VPN cases P routers can only route inside the domain.
   In the VPN case, tunnels running between ingress PE and egress PE
   decrease the burden on P routers.  The task here is to redirect
   traffic to a backup egress PE.

   Egress PE protection effectively calls for an explicit failure
   notification, yet existing proposals try to avoid it.

   [I-D.bashandy-bgp-edge-node-frr] proposes that the P routers
   adjacent to the primary PE maintain the necessary routing state and
   perform the tunnel decaps/re-encaps to the backup PE, thereby
   imposing considerable complexity on P routers.

   [I-D.ietf-pwe3-redundancy] describes a mechanism for pseudowire
   redundancy where PE routers need to run multi-hop BFD sessions to
   detect the loss of a primary egress PE.  This leads to a potentially
   full mesh of multi-hop BFD sessions, which is a considerable
   complexity.  In addition, in some cases the egress PE of the
   secondary PW might need to explicitly set the PW state from standby
   to active.

   FN provides the needed mechanism to actively inform all nodes,
   including PE routers, that a failure has happened, and also
   identifies that it was a node failure.  Furthermore, since both the
   ingress PE and the secondary egress PE are informed, all information
   is available for a proper switch-over - without a full mesh of BFD
   sessions running all the time between PE routers.

5.6. Bypassing Legacy Nodes

   Legacy nodes, while they can neither originate fast notifications
   nor process them, can be assumed to be able to forward the
   notifications.  As [fn-transport] discusses, FN forwarding is based
   on multicast.  It is safe to assume that legacy routers' multicast
   configuration can be set up statically so that they propagate fast
   notifications as needed.

   When calculating failure specific alternative routes, IPFRR-FN
   capable nodes must treat legacy nodes as fixed directed links, since
   legacy nodes do not change packet forwarding in the case of a
   failure.  There are situations when an IPFRR-FN capable node can,
   nevertheless, bypass a non-IPFRR-FN capable node in order to handle
   a remote failure.

   As an example consider the topology depicted in Figure 9, where the
   link between C and D fails.  C cannot locally repair the failure.

   +---+    +---+    +---+    +---+
   | E |----| F |----| G |----| H |
   +---+    +---+    +---+    +---+
     |              /           |
     |             /            |
     |            /             |
   +---+    +---+    +---+    +---+
   | A |----| B |----| C |-X--| D |
   +---+    +---+    +---+    +---+
     >========>=========>
      Traffic from A to D

   Figure 9 Example for bypassing legacy nodes

   First, let us assume that each node is IPFRR-FN capable.  C would
   advertise the failure information using FN.  Each node learns that
   the link between C and D has failed, as a result of which C changes
   its forwarding table to send any traffic destined to D via B.  B
   also makes a change, replacing its default next-hop (C) with G.
   Note that the other nodes do not need to modify their forwarding at
   all.

   Now, let us assume that B is a legacy router not supporting IPFRR-FN
   but statically configured to multicast fast notifications as needed.
   As such, A will receive the notification.  A's pre-calculations have
   been done knowing that B is unable to correct the failure.  Node A,
   therefore, has pre-calculated E as the failure specific next-hop.
   Traffic entering at A and heading to D can thus be repaired.  A
   sketch of the consistency check behind such a pre-calculation is
   given below.
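   The following illustrative Python sketch (names assumed) walks a
   packet across the post-failure forwarding tables, with legacy nodes
   pinned to their unchanged primary tables, and reports whether the
   destination is reached without a loop:

      def detour_reaches(dest, start, repaired, legacy, primary,
                         max_hops=64):
          """Walk a packet from start to dest after a failure.  IPFRR-FN
          capable nodes forward along their failure specific tables
          (`repaired`), legacy nodes keep their unchanged primary tables
          (`primary`); both map node -> {dest: next_hop}.  Returns True
          if dest is reached without looping -- the check behind
          treating legacy nodes as fixed directed links."""
          node, seen = start, set()
          while node != dest:
              if node in seen or len(seen) >= max_hops:
                  return False
              seen.add(node)
              table = primary if node in legacy else repaired
              node = table[node][dest]
          return True

   In Figure 9 with B legacy, a candidate detour from A via B would
   cycle between B and C (B still forwards D-traffic to C, while C
   reroutes it back to B), so A's pre-calculation keeps E as its
   failure specific next-hop instead.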
5.7. Capability Advertisement

   The solution requires nodes to know which other nodes in the area
   are capable of IPFRR-FN.  The most straightforward way to achieve
   this is to rely on the Router Capability TLVs available both in
   OSPF [RFC4970] and in IS-IS [RFC4971].

5.8. Constraining the Dissemination Scope of Fast Notification Packets

   As discussed earlier in Section 4.4, it is desirable to constrain
   the dissemination scope of failure notification messages.  This
   section presents three candidate methods for controlling the scope
   of failure notification: (1) pre-configure the TTL for FN messages
   in routers based on best current practices and related studies of
   available ISP and enterprise network topologies; (2) dynamically
   calculate the minimum TTL value needed to ensure 100% remote LFAP
   coverage; and (3) dynamically calculate the set of neighbours to
   which an FN message should be forwarded, given the identity of the
   link that has failed.

   These candidate dissemination options are mechanisms with different
   levels of optimality and complexity.  The intent here is to present
   some options that will generate further discussion on the trade-offs
   between different FN message scoping methods.

5.8.1. Pre-Configured FN TTL Setting

   As discussed earlier in Section 4.4, studies of various network
   topologies suggest that a fixed TTL setting of 2 hops may be
   sufficient to ensure failure notification coverage for typical OSPF
   area topologies.  Therefore, a potentially simple solution for
   constraining FN message dissemination is for network managers to
   configure their routers with a fixed TTL setting (e.g., TTL=2 hops)
   for FN messages.  This TTL setting can be adjusted by network
   managers to account for implementation-specific details of the
   topology, such as configuring a larger TTL setting for topologies
   containing, say, large ring sub-graph structures.

   In terms of performance trade-offs, pre-configuring the FN TTL,
   since it is fixed at configuration time, incurs no computational
   overhead for the router.  On the other hand, it represents a
   configurable router parameter that network administrators must
   manage.  Furthermore, the fixed, pre-configured FN TTL approach is
   sub-optimal in terms of constraining the FN dissemination, as most
   single link events will not require FN messages to be sent up to TTL
   hops away from the failure site.

5.8.2. Advanced FN Scoping

   While a static pre-configured setting of the FN TTL will likely work
   in practice for a wide range of OSPF area topologies, it has at
   least two weaknesses: (1) there may be certain topologies for which
   the TTL setting happens to be insufficient to provide the needed
   failure coverage; and (2) as discussed above, it tends to result in
   FN being disseminated over a larger radius than needed to facilitate
   re-routing.

   The solution to these drawbacks is for routers to dynamically
   compute the FN TTL radius needed for each of the local links they
   monitor.  Doing so addresses the two weaknesses of a pre-configured
   TTL setting by computing a custom TTL setting for each local link
   that matches exactly the FN message radius for the given topology.
   The drawback, of course, is the additional computation.  However,
   given a quasi-static network topology, this dynamic FN TTL
   computation can be performed infrequently and, therefore, on average
   incurs relatively small computation overhead.  A sketch of such a
   computation follows.
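   For illustration only, the Python sketch below derives the minimum
   TTL for a given link failure as the largest hop distance from the
   failure site to any router whose next-hops change.  It reuses the
   spf() helper sketched in Section 4.1 and assumes a connected area;
   all names are the author's.

      from collections import deque

      def hop_distances(graph, src):
          """Hop counts from src (BFS over the intact topology)."""
          d, q = {src: 0}, deque([src])
          while q:
              u = q.popleft()
              for v in graph[u]:
                  if v not in d:
                      d[v] = d[u] + 1
                      q.append(v)
          return d

      def min_fn_ttl(graph, link):
          """Smallest TTL that still reaches every router whose next-
          hops change when `link` (a frozenset({u, v})) fails."""
          affected = [r for r in graph
                      if spf(graph, r)[0] !=
                         spf(graph, r, removed={link})[0]]
          u, v = tuple(link)
          du, dv = hop_distances(graph, u), hop_distances(graph, v)
          return max((min(du[r], dv[r]) for r in affected), default=0)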
   While a pre-configured TTL eliminates computation overhead at the
   expense of FN dissemination overhead, and dynamic updates of the TTL
   settings achieve better dissemination efficiency by incurring some
   computational complexity, directed FN message forwarding attempts to
   minimize the FN dissemination scope by leveraging additional
   computation power.  Here, rather than computing an FN TTL setting
   for each local link, a network employing directed forwarding has
   each router instance R compute the sets of one-hop neighbours to
   which an FN message must be forwarded for every possible failure
   event in the routing area.  This has the beneficial effect of
   constraining the FN scope to the directions where there are nodes
   that require the FN update, as opposed to disseminating over the
   entire TTL hop radius around a failure site.  The trade-off here, of
   course, is the additional computation complexity incurred and the
   maintenance of forwarding state for each possible failure case.
   Reference [Cev2010] gives an algorithm for finding, for each failure
   event, the direct neighbours to which the notification should be
   forwarded.

6. Protection against Replay Attacks

   To defend against replay attacks, recipients should be able to
   ignore a re-sent recording of a previously sent FN packet.  This
   suggests that some sort of sequence number should be included in the
   FN packet, the verification of which should not need control plane
   involvement.  Since the solution should be simple to implement in
   the dataplane, maintaining and verifying per-source sequence numbers
   is not the best option.

   We propose, therefore, that messages be stamped with a digest of the
   current routing configuration, i.e., a digest of the link state
   database of the link state routing protocol.  The digest has to be
   picked carefully, so that if two LSDBs describe the same
   connectivity information, their digests are identical as well, while
   different LSDBs result in different digest values with high
   probability.

   The conceptual way of handling these digests could be the following:

   o  When the LSDB changes, the IGP re-calculates the digest and
      downloads the new value to the dataplane element(s), in a secure
      way.

   o  When an FN packet is originated, the digest is put into the
      Sequence Number field of the FN message.

   o  Network nodes distribute (forward) the FN packet.

   o  When processing, the dataplane element first performs an
      authentication check of the FN packet, as described in
      [fn-transport].

   o  Finally, before processing the failure notification, the
      dataplane element checks whether its own known LSDB digest is
      identical to the one in the message.

   If, due to a failure event, a node disseminates a failure
   notification with FN, an attacker might capture the whole packet and
   re-send it later.  If it resends the packet after the IGP has re-
   converged on the new topology, the active LSDB digest is different,
   so the packet can be ignored.  If the packet is replayed to a
   recipient who still has the same LSDB digest, then the original
   failure notification was already processed but the IGP has not yet
   finished converging; the IPFRR detour is already active, so the
   replica has no impact.
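   The receive-side flow can be condensed into a few lines of
   illustrative Python (the packet attributes and the authenticate()
   and process() hooks are assumptions):

      def handle_fn(packet, local_digest, authenticate, process):
          """Authenticate first, per [fn-transport], then discard stale
          replicas whose stamped digest no longer matches our own LSDB
          digest (Section 6)."""
          if not authenticate(packet):
              return                  # fails FN transport authentication
          if packet.sequence_number != local_digest:
              return                  # replayed against a different LSDB
          process(packet)             # fresh notification: act on it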
7. Security Considerations

   The IPFRR application of Fast Notification does not raise further
   known security considerations in addition to those already present
   in Fast Notification itself. If an attacker could send false
   Failure Identification Messages or could hinder the transmission of
   legitimate messages, the network would exhibit undesired routing
   behaviour. These issues are, however, to be solved in
   [fn-transport].

   IPFRR-FN relies on the authentication mechanism provided by the
   Fast Notification transport protocol [fn-transport]. The
   specification of the FN transport protocol requires applications to
   protect against replay attacks with application-specific sequence
   numbers. This draft, therefore, describes its own proposed sequence
   number in Section 6.

8. IANA Considerations

   The Failure Identification message types need to be allocated a
   value in the FN App Type field.

   An IPFRR-FN capability needs to be allocated within the Router
   Capability TLVs of both OSPF [RFC4970] and IS-IS [RFC4971].

9. References

9.1. Normative References

   [RFC5286]
      A. Atlas, A. Zinin, "Basic Specification for IP Fast-Reroute:
      Loop-Free Alternates", RFC 5286, 2008.

   [fn-transport]
      W. Lu, S. Kini, A. Csaszar, G. Enyedi, J. Tantsura, A. Tian,
      "Transport of Fast Notification Messages", draft-lu-fn-transport
      (work in progress), 2011.

   [RFC4970]
      A. Lindem et al., "Extensions to OSPF for Advertising Optional
      Router Capabilities", RFC 4970, 2007.

   [RFC4971]
      JP. Vasseur et al., "Intermediate System to Intermediate System
      (IS-IS) Extensions for Advertising Router Information",
      RFC 4971, 2007.

   [RFC4203]
      K. Kompella, Y. Rekhter, "OSPF Extensions in Support of
      Generalized Multi-Protocol Label Switching (GMPLS)", RFC 4203,
      2005.

9.2. Informative References

   [BFD]
      D. Katz, D. Ward, "Bidirectional Forwarding Detection",
      RFC 5880, 2010.

   [RFC5714]
      M. Shand, S. Bryant, "IP Fast Reroute Framework", RFC 5714,
      2010.

   [Cev2010]
      S. Cevher, T. Chen, I. Hokelek, J. Kang, V. Kaul, Y.J. Lin, M.
      Pang, R. Rodoper, S. Samtani, C. Shah, J. Bowcock, G. B. Rucker,
      J. L. Simbol, A. Staikos, "An Integrated Soft Handoff Approach
      in IP Fast Reroute in Wireless Mobile Networks", in Proceedings
      of IEEE COMSNETS, 2011.

   [Eny2009]
      G. Enyedi, P. Szilagyi, G. Retvari, A. Csaszar, "IP Fast
      ReRoute: Lightweight Not-Via without Additional Addresses", IEEE
      INFOCOM Mini-Conference, Rio de Janeiro, Brazil, 2009.

   [FIFR]
      J. Wang, S. Nelakuditi, "IP Fast Reroute with Failure
      Inferencing", in Proceedings of the ACM SIGCOMM Workshop on
      Internet Network Management - The Five-Nines Workshop, 2007.

   [Hok2007]
      I. Hokelek, M. A. Fecko, P. Gurung, S. Samtani, J. Sucec, A.
      Staikos, J. Bowcock, Z. Zhang, "Seamless Soft Handoff in
      Wireless Battlefield Networks Using Local and Remote LFAPs", in
      Proceedings of IEEE MILCOM, 2007.
   [Hok2008]
      I. Hokelek, S. Cevher, M. A. Fecko, P. Gurung, S. Samtani, Z.
      Zhang, A. Staikos, J. Bowcock, "Testbed Implementation of
      Loop-Free Soft Handoff in Wireless Battlefield Networks", in
      Proceedings of the 26th Army Science Conference, December 1-4,
      2008.

   [MRC]
      T. Cicic, A. F. Hansen, A. Kvalbein, M. Hartmann, R. Martin, M.
      Menth, S. Gjessing, O. Lysne, "Relaxed Multiple Routing
      Configurations: IP Fast Reroute for Single and Correlated
      Failures", IEEE Transactions on Network and Service Management,
      available online: http://www3.informatik.uni-
      wuerzburg.de/staff/menth/Publications/papers/Menth08-Sub-4.pdf,
      September 2010.

   [NotVia]
      S. Bryant, M. Shand, S. Previdi, "IP Fast Reroute Using Not-via
      Addresses", draft-ietf-rtgwg-ipfrr-notvia-addresses (work in
      progress), 2010.

   [RLFAP]
      I. Hokelek, M. Fecko, P. Gurung, S. Samtani, S. Cevher, J.
      Sucec, "Loop-Free IP Fast Reroute Using Local and Remote LFAPs",
      draft-hokelek-rlfap-01 (expired), 2008.

   [Ret2011]
      G. Retvari, J. Tapolcai, G. Enyedi, A. Csaszar, "IP Fast
      ReRoute: Loop Free Alternates Revisited", in Proceedings of IEEE
      INFOCOM, 2011.

   [ISISDigest]
      J. Chiabaut, D. Fedyk, "IS-IS Multicast Synchronization Digest",
      available online: http://www.ieee802.org/1/files/public/
      docs2008/aq-fedyk-ISIS-digest-1108-v1.pdf, November 2008.

   [BGPAddPaths]
      D. Walton, A. Retana, E. Chen, J. Scudder, "Advertisement of
      Multiple Paths in BGP", draft-ietf-idr-add-paths (work in
      progress).

   [DiverseBGP]
      R. Raszuk et al., "Distribution of Diverse BGP Paths",
      draft-ietf-grow-diverse-bgp-path-dist (work in progress).

   [BGPBestExt]
      P. Marques, R. Fernando, E. Chen, P. Mohapatra, H. Gredler,
      "Advertisement of the Best External Route in BGP",
      draft-ietf-idr-best-external (work in progress).

   [BRITE]
      O. Heckmann et al., "How to Use Topology Generators to Create
      Realistic Topologies", Technical Report, December 2002.

   [MRT-ARCH]
      A. Atlas et al., "An Architecture for IP/LDP Fast-Reroute Using
      Maximally Redundant Trees",
      draft-ietf-rtgwg-mrt-frr-architecture-01 (work in progress),
      2012.

   [MRT-ALG]
      A. Atlas, G. Enyedi, A. Csaszar, "Algorithms for Computing
      Maximally Redundant Trees for IP/LDP Fast-Reroute",
      draft-enyedi-rtgwg-mrt-frr-algorithm-01 (work in progress),
      2012.

   [I-D.ietf-pwe3-redundancy]
      P. Muley, M. Aissaoui, M. Bocci, "Pseudowire Redundancy",
      draft-ietf-pwe3-redundancy (work in progress), May 2012.

   [I-D.bashandy-bgp-edge-node-frr]
      A. Bashandy, B. Pithawala, K. Patel, "Scalable BGP FRR
      Protection against Edge Node Failure",
      draft-bashandy-bgp-edge-node-frr (work in progress), March 2012.

10. Acknowledgments

   The authors would like to thank Albert Tian, Wenhu Lu, Acee Lindem
   and Ibrahim Hokelek for the continuous discussions and comments on
   the topic, as well as Joel Halpern for his comments and review.

Appendix A. Memory Needs of a Naive Implementation

   Practical experience might suggest that storing and maintaining
   backup next hops for many potential remote failures could overwhelm
   the resources of router linecards.
   This section attempts to provide a calculation describing the
   approximate memory needs in reasonably sized networks for one
   possible implementation.

A.1. An Example Implementation

   Let us suppose that, for exterior destinations, the forwarding
   engine uses recursive lookup or indirection in order to improve
   update time, as described in Section 5.3. We also suppose that the
   concept of "prefix groups" is applied, i.e., there is an internal
   entity for the prefixes using exactly the same primary and backup
   ASBRs, and the next-hop entry for such a prefix points to the next
   hop towards this entity; see, e.g., Figure 7.

   In the sequel, the term "area" refers to an extended area, made up
   of the OSPF or IS-IS area containing the router, with the prefix
   groups added to the area as virtual nodes. Naturally, a prefix
   group is connected to the egress routers (ASBRs) through which it
   can be reached. We only need to react to the failure ID of an ASBR
   for all the prefix groups connected to that ASBR; technically, we
   assume that one of the virtual links of each affected prefix group
   goes down.

   Here we show a simple, naive implementation, which real routers can
   easily improve upon. This implementation uses an array covering all
   the nodes in the area (the "node array" in the sequel), including
   both real routers and virtual nodes representing prefix groups,
   with two pointers and a length field (an integer) per record. One
   of the pointers points to another array (called the "alternative
   array"). This second array is essentially an enumeration of the IDs
   of those failures that influence a shortest path towards that node,
   each paired with an alternative neighbour that can be used when
   such a failure occurs. When a failure is detected (either locally
   or by FN), we can easily find the proper record in all the lists.
   Moreover, since these arrays can be sorted by failure ID, we can
   even use binary search to find the needed record. The length of the
   alternative array is stored in the node array record pointing to
   it.

   In addition, we only need to know which records in the FIB should
   be updated; therefore, a second pointer in the node array points to
   that record.

   +-------+-------+-------+--   --+-------+
   |  r1   |  r2   |  r3   |  ...  |  rk   |
   +-------+-------+-------+--   --+-------+
       |       |       |               |
       |       |       |               |
      \|/     \|/     \|/             \|/
       *       *       *               *
   +-------+-------+-------+--   --+-------+
   | fail1 | fail2 | fail3 |       | failk |
   | alt.1 | alt.2 | alt.3 |  ...  | alt.k |
   +-------+-------+-------+--   --+-------+
   | fail4 |       | fail5 |
   | alt.4 |       | alt.5 |
   +-------+       +-------+
   | fail6 |
   | alt.6 |
   +-------+

   Figure 10 The way of storing alternatives
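   The following minimal sketch mirrors the structure of Figure 10;
   the names NodeRecord and react_to_failure are hypothetical and
   chosen only for illustration.

   from bisect import bisect_left

   class NodeRecord:
       def __init__(self, fib_index, alternatives):
           # alternatives: list of (failure_id, alternative_neighbour)
           # pairs, kept sorted by failure ID so that lookup is
           # O(log n) via binary search.
           self.alternatives = sorted(alternatives)
           self.fib_index = fib_index   # which FIB record to rewrite

   def react_to_failure(node_array, fib, failure_id):
       # On a locally detected or FN-reported failure, switch every
       # affected destination to its precomputed alternative next hop.
       for rec in node_array:
           i = bisect_left(rec.alternatives, (failure_id,))
           if (i < len(rec.alternatives) and
                   rec.alternatives[i][0] == failure_id):
               fib[rec.fib_index] = rec.alternatives[i][1]

   In a real linecard the same layout would of course be realised as
   packed arrays in DRAM, which is what the estimates below assume.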
A.2. Estimation of Memory Requirements

   Now, suppose that there are V nodes in the extended area, the
   network diameter is D, a neighbour descriptor takes X bytes, a
   failure ID takes Y bytes, and a pointer takes Z bytes. We suppose
   that lookups for external prefixes use indirection, so we only need
   to deal with destinations inside the extended area. In this way, if
   there is no ECMP, this data structure takes

      (2*Z + Y)*(V-1) + 2*(X+Y)*D*(V-1)

   bytes altogether. The first term is the memory consumption of the
   node array. The second term is the memory needed by the alternative
   arrays: any path can contain at most D nodes and D links, each
   record needs X+Y bytes, and there are records for all the other
   nodes in the area (V-1 nodes). Observe that this is a very rough
   overestimate, since most of the possible failures influencing a
   path will not change the next hop.

   To put numbers on the memory consumption, suppose that neighbour
   descriptors, failure IDs and pointers each take 4 bytes, there are
   10,000 nodes in the extended area (real routers and virtual nodes
   representing prefix groups combined), and the network diameter is
   20 hops. In this case, the node array needs about 120KB and the
   alternative arrays need about 3.2MB, so altogether 3.4MB if there
   is no ECMP. Observe that the number of external prefixes does not
   matter.

   If, however, there are paths with equal costs, the size of the
   alternative arrays increases. Suppose that there are 10 equal-cost
   paths between ANY two nodes in the network. This would make the
   alternative lists 10 times bigger, so they would need a bit less
   than 32MB. Observe that the node array still needs only about
   120KB, so 32MB is a good overestimate, which is likely acceptable
   for modern linecards with gigabytes of DRAM. Moreover, we stress
   again that this is an extremely rough overestimate, so in reality
   much less memory will suffice. Furthermore, usually only protecting
   external prefixes is needed, in which case we only need to protect
   the paths towards the prefix groups, which further decreases both
   the size of the node array and the number of alternative lists.

A.3. Estimation of Failover Time

   After a failure is detected, either locally or via FN, the nodes
   need to change the entries in their FIBs. Here we give a rough
   estimate to show that the previous implementation can do this in at
   most a few milliseconds.

   We suppose that we have the data structure described in the
   previous section. When a failure happens, we need to decide for
   each node in the node array whether the shortest path towards that
   destination was influenced by the failure. Since the alternative
   list is sorted, we can use binary search, which needs
   ceil(log2(2D)) memory accesses in the worst case. We need one more
   access to get the node array entry and another to rewrite the FIB.

   We assume DDR3 SDRAM with a 64-byte cache line, which means that up
   to 8 entries of the alternative list can be fetched from RAM at a
   time, so the previous formula becomes ceil(log2(D/4)) + 2
   transactions per destination. In this way, for D=20 and V=10,000 we
   need (3+2)*10,000 = 50,000 transactions. If we suppose 10 ECMP
   paths as previously, the alternative lists become 10 times longer
   and we need (6+2)*10,000 = 80,000 transactions.

   We can make a very conservative estimate by assuming a recent DDR3
   SDRAM module that can perform 5 MT/s with completely random access,
   so performing 50,000 or 80,000 transactions takes 10ms or 16ms.
   Keep in mind that we assumed that there is only one memory
   controller, that the result of the search is always obtained with
   the last read, and that all the alternative lists are full.
   Moreover, internal system latencies (e.g., multiple outstanding
   memory requests) were seriously overestimated, since DDR3 SDRAM can
   reach even 6 times this speed with random access.
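   The two estimates above can be reproduced directly; the constants
   below simply mirror the example figures in the text and are not
   normative.

   from math import ceil, log2

   X = Y = Z = 4         # neighbour descriptor, failure ID, pointer
   V, D = 10_000, 20     # nodes in extended area, network diameter
   ECMP = 10

   node_array = (2*Z + Y) * (V - 1)          # ~120KB
   alt_arrays = 2 * (X + Y) * D * (V - 1)    # ~3.2MB without ECMP
   alt_ecmp   = ECMP * alt_arrays            # a bit less than 32MB

   entries_per_line = 64 // (X + Y)          # 8 entries per cache line
   lines = ECMP * 2 * D // entries_per_line  # cache lines per list
   txns  = (ceil(log2(lines)) + 2) * V       # (6+2)*10,000 = 80,000
   print(node_array, alt_arrays, alt_ecmp, txns / 5e6)  # s at 5 MT/s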
Appendix B. Impact Scope of Fast Notification

   The memory and failover time calculations presented in Appendix A
   are based on worst-case estimates: they assume that, in a network
   with a diameter of 20 hops, each failure has a route-changing
   consequence on all routers across the full diameter.

   This section provides experimental results on realistic topologies,
   showing that 100% failure coverage can already be achieved within a
   2-hop radius around the failure.

   We performed the coverage analysis of the fast reroute mechanism
   presented here on realistic topologies, which were generated by the
   BRITE topology generator in bottom-up mode [BRITE]. The coverage
   percentage is defined here as the ratio (expressed as a percentage)
   of the number of usable backup paths protecting primary paths that
   failed because of link failures to the number of all failed primary
   paths.

   The realistic topologies include AT&T and DFN, using pre-determined
   BRITE parameter values from [BRITE], and various random topologies
   with different numbers of nodes and varying network connectivity.
   For example, the numbers of nodes for AT&T and DFN are 154 and 30,
   respectively, while the number of nodes for the other random
   topologies is varied from 20 to 100. The BRITE parameters used in
   our topology generation process are summarised in Figure 11 (see
   [BRITE] for the details of each parameter). In summary, m
   represents the average number of edges per node and is set to
   either 2 or 3. A uniform bandwidth distribution in the range
   100-1024 Mbps is selected, and the link cost is obtained
   deterministically from the link bandwidth (i.e., inversely
   proportional to the link bandwidth, as used by many vendors). Since
   the values of p(add) and beta determine the number of edges in the
   generated topologies, their values are varied to obtain network
   topologies with varying connectivity (e.g., sparse and dense).

   |----------------------------|-----------------------|
   | Mode                       | Bottom up             |
   |----------------------------|-----------------------|
   | Grouping Model             | Random pick           |
   | Model                      | GLP                   |
   | Node Placement             | Random                |
   | Growth Type                | Incremental           |
   | Preferential Connectivity  | On                    |
   | BW Distribution            | Uniform               |
   | Minimum BW (Mbps)          | 100                   |
   | Maximum BW (Mbps)          | 1024                  |
   | m                          | 2-3                   |
   | Number of Nodes (N)        | 20,30,50,100,154      |
   | p(add)                     | 0.01,0.05,0.10,0.42   |
   | beta                       | 0.01,0.05,0.15,0.62   |
   |----------------------------|-----------------------|

   Figure 11 BRITE topology generator parameters

   The coverage percentage of our fast reroute method is reported for
   different network topologies (i.e., different numbers of nodes and
   varying network connectivity) using neighbourhood depths of 0, 1,
   and 2 (i.e., X=0, 1, and 2). For a particular failure, backup
   routes protecting the failed primary paths are calculated only by
   those nodes that are within the selected radius of the failure.
   These nodes are determined by the parameter X as follows: for X=0,
   the two nodes directly connected to the failed link; for X=1, the
   two nodes directly connected to the failed link plus the
   neighbouring nodes adjacent to one of the outgoing links of these
   two nodes; and so on.

   The coverage percentage for a given topology is computed by the
   following formula:

      Coverage Percentage = N_backupsexist * 100 / N_fpp

   where N_backupsexist is the number of source-destination pairs
   whose primary paths failed because of link failures and that have
   backup paths protecting these failed paths, and N_fpp is the number
   of source-destination pairs whose primary paths failed because of
   link failures. Source-destination pairs in which the source and
   destination nodes have no physical connectivity after a failure are
   excluded from N_fpp. Note that the network-wide coverage percentage
   is calculated by averaging the coverage results obtained by
   individually failing every edge of a given network topology.
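   As a sketch, the formula above transcribes directly to code; the
   helper name coverage_percentage is hypothetical, and deciding
   whether a usable backup path exists within the X-hop neighbourhood
   of a failure is the simulation step described in the text.

   def coverage_percentage(backup_exists):
       # backup_exists: one boolean per source-destination pair whose
       # primary path failed and that remains physically connected;
       # True if a usable backup path was found within the radius.
       outcomes = list(backup_exists)
       n_fpp = len(outcomes)            # all failed primary paths
       n_backupsexist = sum(outcomes)   # those with a usable backup
       return 100.0 * n_backupsexist / n_fpp if n_fpp else 100.0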
   Figure 12 shows the coverage percentage results for random
   topologies with different numbers of nodes (N) and different
   network connectivity, and Figure 13 shows these results for the
   AT&T and DFN topologies. In these figures, E_mean represents the
   average number of edges per node for a given topology. Note that
   the average number of edges per node is determined by the
   parameters m, p(add), and beta; we observed that E_mean increases
   when the p(add) and beta values increase. For each configuration,
   the coverage analysis is repeated over 10 topologies generated
   randomly using the same BRITE parameters. E_mean and the coverage
   percentage are obtained by averaging the results of these ten
   experiments.

   |------------|-----|------|------|------|------|
   | Case       |N    |E_mean|X=0   |X=1   |X=2   |
   |------------|-----|------|------|------|------|
   |p(add)=0.01 |20   |3.64  |82.39 |98.85 |100.0 |
   |beta=0.01   |50   |3.86  |82.10 |98.69 |100.0 |
   |            |100  |3.98  |83.21 |98.04 |100.0 |
   |------------|-----|------|------|------|------|
   |p(add)=0.05 |20   |3.70  |85.60 |99.14 |100.0 |
   |beta=0.05   |50   |4.01  |84.17 |99.09 |100.0 |
   |            |100  |4.08  |83.35 |98.01 |100.0 |
   |------------|-----|------|------|------|------|
   |p(add)=0.1  |20   |5.52  |93.24 |100.0 |100.0 |
   |beta=0.15   |50   |6.21  |91.46 |99.87 |100.0 |
   |            |100  |6.39  |91.17 |99.86 |100.0 |
   |------------|-----|------|------|------|------|

   Figure 12 Coverage percentage results for random topologies

   |------------|-----------|------|------|------|------|
   | Case       |N          |E_mean|X=0   |X=1   |X=2   |
   |------------|-----------|------|------|------|------|
   |p(add)=0.42 |154 (AT&T) |6.88  |91.04 |99.81 |100.0 |
   |beta=0.62   |30 (DFN)   |8.32  |93.76 |100.0 |100.0 |
   |------------|-----------|------|------|------|------|

   Figure 13 Coverage percentage results for AT&T and DFN topologies

   There are two main observations from these results:

   1. As the neighbourhood depth (X) increases, the coverage
   percentage increases, and complete coverage is obtained with a low
   neighbourhood depth value (i.e., X=2). This result is significant,
   since the failure notification message needs to be sent only to
   nodes that are two hops away from the point of failure to achieve
   complete coverage. It supports the claim that our method provides
   fast convergence while introducing minimal signalling overhead,
   confined to the two-hop neighbourhood.

   2. Topologies with higher connectivity (i.e., higher E_mean values)
   have better coverage than topologies with lower connectivity (i.e.,
   lower E_mean values).
   This is an intuitive result, since the number of possible alternate
   next hops in dense network topologies is higher than in sparse
   topologies, which increases the likelihood of finding backup paths
   and therefore the coverage percentage.

Authors' Addresses

   Andras Csaszar
   Ericsson
   Irinyi J utca 4-10, Budapest, Hungary, 1117
   Email: Andras.Csaszar@ericsson.com

   Gabor Sandor Enyedi
   Ericsson
   Irinyi J utca 4-10, Budapest, Hungary, 1117
   Email: Gabor.Sandor.Enyedi@ericsson.com

   Jeff Tantsura
   Ericsson
   300 Holger Way, San Jose, CA 95134
   Email: jeff.tantsura@ericsson.com

   Sriganesh Kini
   Ericsson
   300 Holger Way, San Jose, CA 95134
   Email: sriganesh.kini@ericsson.com

   John Sucec
   Telcordia Technologies
   One Telcordia Drive, Piscataway, NJ 08854
   Email: sucecj@telcordia.com

   Subir Das
   Telcordia Technologies
   One Telcordia Drive, Piscataway, NJ 08854
   Email: sdas2@telcordia.com