idnits 2.17.1 

draft-ietf-rtgwg-ordered-fib-12.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 24, 2013) is 3989 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'AAH' is mentioned on line 960, but not defined

  == Missing Reference: 'Hold' is mentioned on line 953, but not defined

  == Missing Reference: 'Q' is mentioned on line 967, but not defined

  == Missing Reference: 'CC' is mentioned on line 964, but not defined

  == Missing Reference: 'AAH-hold' is mentioned on line 965, but not defined

  == Missing Reference: 'IDLE' is mentioned on line 1043, but not defined

  == Missing Reference: 'TX-AAH' is mentioned on line 1050, but not defined


     Summary: 0 errors (**), 0 flaws (~~), 8 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Working Group                                           M. Shand
3	Internet-Draft                                    Individual Contributor
4	Intended status: Informational                                 S. Bryant
5	Expires: November 25, 2013                                    S. Previdi
6	                                                             C. Filsfils
7	                                                           Cisco Systems
8	                                                             P. Francois
9	                                                Institute IMDEA Networks
10	                                                          O. Bonaventure
11	                                        Universite catholique de Louvain
12	                                                            May 24, 2013

14	             Framework for Loop-free convergence using oFIB
15	                    draft-ietf-rtgwg-ordered-fib-12

17	Abstract

19	   This document describes an illustrative framework of a mechanism for
20	   use in conjunction with link state routing protocols which prevents
21	   the transient loops which would otherwise occur during topology
22	   changes.  It does this by correctly sequencing the forwarding
23	   information base (FIB) updates on the routers.

25	   This mechanism can be used in the case of non-urgent (management
26	   action) link or node shutdowns and restarts or link metric changes.
27	   It can also be used in conjunction with a fast re-route mechanism
28	   which converts a sudden link or node failure into a non-urgent
29	   topology change.  This is possible where a complete repair path is
30	   provided for all affected destinations.

32	   After a non-urgent topology change, each router computes a rank that
33	   defines the time at which it can safely update its FIB.  A method for
34	   accelerating this loop-free convergence process by the use of
35	   completion messages is also described.

37	   The technology described in this document has been subject to
38	   extensive simulation using real network topologies and costs, and
39	   pathological convergence behaviour.  However the mechanism described
40	   in this document are purely illustrative of the general approach and
41	   do not constitute a protocol specification.  The document represents
42	   a snapshot of the work of the Routing Area Working Group at the time
43	   of publication and is published as a document of record.  Further
44	   work is needed before implementation or deployment.

46	Status of This Memo
47	   This Internet-Draft is submitted in full conformance with the
48	   provisions of BCP 78 and BCP 79.

50	   Internet-Drafts are working documents of the Internet Engineering
51	   Task Force (IETF).  Note that other groups may also distribute
52	   working documents as Internet-Drafts.  The list of current Internet-
53	   Drafts is at http://datatracker.ietf.org/drafts/current/.

55	   Internet-Drafts are draft documents valid for a maximum of six months
56	   and may be updated, replaced, or obsoleted by other documents at any
57	   time.  It is inappropriate to use Internet-Drafts as reference
58	   material or to cite them other than as "work in progress."

60	   This Internet-Draft will expire on November 25, 2013.

62	Copyright Notice

64	   Copyright (c) 2013 IETF Trust and the persons identified as the
65	   document authors.  All rights reserved.

67	   This document is subject to BCP 78 and the IETF Trust's Legal
68	   Provisions Relating to IETF Documents
69	   (http://trustee.ietf.org/license-info) in effect on the date of
70	   publication of this document.  Please review these documents
71	   carefully, as they describe your rights and restrictions with respect
72	   to this document.  Code Components extracted from this document must
73	   include Simplified BSD License text as described in Section 4.e of
74	   the Trust Legal Provisions and are provided without warranty as
75	   described in the Simplified BSD License.

77	Table of Contents

79	   1.  The Purpose of this Document  . . . . . . . . . . . . . . . .   3
80	   2.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   4
81	   3.  The required FIB update order . . . . . . . . . . . . . . . .   5
82	     3.1.  Single Link Events  . . . . . . . . . . . . . . . . . . .   5
83	       3.1.1.  Link Down / Metric Increase . . . . . . . . . . . . .   5
84	       3.1.2.  Link Up / Metric Decrease . . . . . . . . . . . . . .   7
85	     3.2.  Multi-link events . . . . . . . . . . . . . . . . . . . .   7
86	       3.2.1.  Router Down events  . . . . . . . . . . . . . . . . .   7
87	       3.2.2.  Router Up events  . . . . . . . . . . . . . . . . . .   7
88	       3.2.3.  Linecard Failure/Restoration Events . . . . . . . . .   7
89	   4.  Applying ordered FIB updates  . . . . . . . . . . . . . . . .   8
90	     4.1.  Deducing the topology change  . . . . . . . . . . . . . .   8
91	     4.2.  Deciding if ordered FIB updates applies . . . . . . . . .   8
92	   5.  Computation of the ordering . . . . . . . . . . . . . . . . .   9
93	     5.1.  Link or Router Down or Metric Increase  . . . . . . . . .   9
94	     5.2.  Link or Router Up or Metric Decrease  . . . . . . . . . .  10

96	   6.  Acceleration of Ordered Convergence . . . . . . . . . . . . .  10
97	     6.1.  Construction of the waiting list and notification list  .  11
98	       6.1.1.  Down events . . . . . . . . . . . . . . . . . . . . .  11
99	       6.1.2.  Up Events . . . . . . . . . . . . . . . . . . . . . .  11
100	     6.2.  Format of Completion Messages . . . . . . . . . . . . . .  12
101	   7.  Fall back to Conventional Convergence . . . . . . . . . . . .  12
102	   8.  oFIB state machine  . . . . . . . . . . . . . . . . . . . . .  12
103	     8.1.  OFIB_STABLE . . . . . . . . . . . . . . . . . . . . . . .  13
104	     8.2.  OFIB_HOLDING_DOWN . . . . . . . . . . . . . . . . . . . .  14
105	     8.3.  OFIB_HOLDING_UP . . . . . . . . . . . . . . . . . . . . .  15
106	     8.4.  OFIB_ONGOING  . . . . . . . . . . . . . . . . . . . . . .  16
107	     8.5.  OFIB_ABANDONED  . . . . . . . . . . . . . . . . . . . . .  17
108	   9.  Management Considerations . . . . . . . . . . . . . . . . . .  17
109	   10. IANA considerations . . . . . . . . . . . . . . . . . . . . .  17
110	   11. Security considerations . . . . . . . . . . . . . . . . . . .  17
111	   12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .  17
112	   13. Informative References  . . . . . . . . . . . . . . . . . . .  18
113	   Appendix A.  Candidate Methods of Safely Abandoning Loop-Free
114	                Convergence (AAH)  . . . . . . . . . . . . . . . . .  18
115	     A.1.  Possible Solutions  . . . . . . . . . . . . . . . . . . .  19
116	     A.2.  Hold-down timer only  . . . . . . . . . . . . . . . . . .  19
117	     A.3.  AAH messages  . . . . . . . . . . . . . . . . . . . . . .  20
118	       A.3.1.  Per Router State Machine  . . . . . . . . . . . . . .  20
119	       A.3.2.  Per Neighbor State Machine  . . . . . . . . . . . . .  22
120	   Appendix B.  Synchronisation of Loop Free Timer Values  . . . . .  24
121	     B.1.  Introduction  . . . . . . . . . . . . . . . . . . . . . .  24
122	     B.2.  Required Properties . . . . . . . . . . . . . . . . . . .  24
123	     B.3.  Mechanism . . . . . . . . . . . . . . . . . . . . . . . .  25
124	     B.4.  Security Considerations . . . . . . . . . . . . . . . . .  25
125	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  26

127	1.  The Purpose of this Document

129	   This document describes an illustrative framework of a mechanism for
130	   use in conjunction with link state routing protocols which prevents
131	   the transient loops which would otherwise occur during topology
132	   changes.  It does this by correctly sequencing the forwarding
133	   information base (FIB) updates on the routers.

135	   At the time of publication there is no demand to deploy this
136	   technology, however in view of the subtleties involved in the design
137	   of loop-free convergence routing protocol extensions the Routing Area
138	   Working Group considered it desirable to publish this document to
139	   place on record the design consideration of the ordered FIB
140	   (oFIB)approach.

142	   The mechanisms presented in this document are purely illustrative of
143	   the general approach and do not constitute a protocol specification.

145	   The document represents a snapshot of the work of the working group
146	   at the time of publication and is published as a document of record.
147	   Additional work is needed to specify the necessary routing protocol
148	   extensions necessary to support this IP fast re-route (IPFRR) method
149	   before implementation or deployment.

151	2.  Introduction

153	   With link-state protocols, such as IS-IS [ISO10589] and OSPF
154	   [RFC2328], each time the network topology changes, some routers need
155	   to modify their forwarding information bases (FIBs) to take into
156	   account the new topology.  Each topology change causes a convergence
157	   phase.  During this phase, routers may transiently have inconsistent
158	   FIBs, which may lead to packet loops and losses, even if the
159	   reachability of the destinations is not compromised after the
160	   topology change.  Packet losses and transient loops can also occur in
161	   the case of a link down event implied by a maintenance operation,
162	   even if this operation is predictable and not urgent.  When the link
163	   state change is a metric update and when a new link is brought up in
164	   the network, there is no direct loss of connectivity, but transient
165	   packet loops and loss can still occur.

167	   For example, in Figure 1, if the link between X and Y is shut down by
168	   an operator, packets destined to X can loop between R and Y when Y
169	   has updated its FIB while R has not yet updated its FIB, and packets
170	   destined to Y can loop between X and S if X updates its FIB before S.
171	   According to the current behaviour of ISIS and OSPF, this scenario
172	   will happen most of the time because X and Y are the first routers to
173	   be aware of the failure, so that they will update their FIBs first.

175	                                  1
176	                    X-------------/-------------Y
177	                    |                           |
178	                    |                           |
179	                    |                           |
180	                    |                           |
181	                  1 |                           | 1
182	                    |                           |
183	                    |                           |
184	                    |                           |
185	                    |                           |
186	                    S---------------------------R
187	                                  2

189	                        Figure 1: A simple topology

191	   It should be noted that the loops can occur remotely from the
192	   failure, not just adjacent to it.

194	   [RFC5715] provides an introduction to a number of loop-free
195	   convergence methods and readers unfamiliar with this technology are
196	   recommended to read before studying this document in detail.  Note
197	   that in common with other loop-free convergence methods, oFIB is only
198	   capable of providing loop free convergence in the presence of a
199	   single failure.

201	   The goal of this document is to describe a mechanism which sequences
202	   the router FIB updates to maintain consistency throughout the
203	   network.  By correctly setting the FIB change order, no looping or
204	   packet loss can occur.  This mechanism may be applied to the case of
205	   managed link-state changes, i.e.  link metric change, manual link
206	   down/up, manual router down/up, and managed state changes of a set of
207	   links attached to one router.  It may also be applied to the case
208	   where one or more network elements are protected by a fast re-route
209	   mechanism (FRR) [RFC5714] [RFC4090].  The mechanisms that are used in
210	   the failure case are exactly the same as those used for managed
211	   changes.  For simplicity this document makes no further distinction
212	   between managed and unplanned changes.

214	   It is assumed in the description that follows that all routers in the
215	   routing domain are oFIB capable.  This can be verified in an
216	   operation network by the routers reporting oFIB capability using the
217	   IGP in use.  Where non-oFIB capable routers exist in the network,
218	   normal convergence would be used by all routers.  The operation of
219	   mixed-mode networks is for further study.

221	   The technology described in this document has been subject to
222	   extensive simulation using real network topologies and costs and
223	   pathological convergence behaviour.  A variant of the technology
224	   described here has been experimentally deployed in a production
225	   network.

227	3.  The required FIB update order

229	   This section provides an overview of the required ordering of the FIB
230	   updates.  A more detailed analysis of the rerouting dynamics and
231	   correctness proofs of the mechanism can be found in [refs.PFOB07].

233	3.1.  Single Link Events

235	   For simplicity the correct ordering for single link changes are
236	   described first.  The document then builds on this to demonstrate
237	   that the same principles can be applied to more complex scenarios
238	   such as line card or node changes.

240	3.1.1.  Link Down / Metric Increase
241	   First consider the non-urgent failure of a link (i.e.  where an
242	   operator or a network management system (NMS) shuts down a link
243	   thereby removing it from the currently active topology) or the
244	   increase of a link metric by the operator or NMS . In this case, a
245	   router R must not update its FIB until all other routers that send
246	   traffic via R and the affected link have first updated their FIBs.

248	   The following argument shows that this rule ensures the correct order
249	   of FIB change when the link X->Y is shut down or its metric is
250	   increased.

252	   An "outdated" FIB entry for a destination is defined as being a FIB
253	   entry that still reflects the shortest path(s) in use before the
254	   topology change.  Once a packet reaches a router R that has an
255	   outdated FIB entry for the packet destination, then, provided the
256	   oFIB ordering is respected, the packet will continue to X only
257	   traversing routers that also have an outdated FIB entry for the
258	   destination.  The packet thus reaches X without looping and will be
259	   forwarded to Y via X->Y (or in the case of FRR, the X->Y repair path)
260	   and hence reach its destination.

262	   Since it can be assumed that the original topology was loop-free, Y
263	   will never use the link Y->X to reach the destination and hence the
264	   path(s) between Y and the destination are guaranteed to be unaffected
265	   by the topology change.  It therefore follows that the packet
266	   arriving at Y will reach its destination without looping.

268	   Since it can also be assumed that the new topology is loop-free, by
269	   definition a packet cannot loop while being forwarded exclusively by
270	   routers with an updated FIB entry.

272	   In other words, when the oFIB ordering is respected, if a packet
273	   reaches an outdated router, it can never subsequently reach an
274	   updated router, and cannot loop because from this point on it will
275	   only be forwarded on the consistent path that was used before the
276	   event.  If it does not reach an outdated router, it will only be
277	   forwarded on the loop free path that will be used after the
278	   convergence.

280	   According to the proposed ordering, X will be the last router to
281	   update its FIB.  Once it has updated its FIB, the link X->Y can
282	   actually be shut down (or the repair removed).

284	   If the link X-Y is bidirectional a similar process must be run to
285	   order the FIB update for destinations using the link in the direction
286	   Y->X.  As has already been shown, no packet ever traverses the X-Y
287	   link in both directions, and hence the operation of the two ordering
288	   processes is orthogonal.

290	3.1.2.  Link Up / Metric Decrease

292	   In the case of link up events or metric decreases, a router R must
293	   update its FIB before all other routers that will use R to reach the
294	   affected link.

296	   The following argument shows that this rule ensures the correct order
297	   of FIB change when the link X->Y is brought into service or its
298	   metric is decreased.

300	   Firstly, when a packet reaches a router R that has already updated
301	   its FIB, all the routers on the path from R to X will also have
302	   updated their FIB, so that the packet will reach X and be forwarded
303	   along X->Y, ultimately reaching its destination.

305	   Secondly, a packet cannot loop between routers that have not yet
306	   updated their FIB.  This proves that no packet can loop.

308	3.2.  Multi-link events

310	   The following sections describe the required ordering for single
311	   events which may manifest as multiple link events.  For example, the
312	   failure of a router may be notified to the rest of the network as the
313	   individual failure of all its attached links.  The means of
314	   identifying the event type from the collection of received link
315	   events is described in Section 4.1.

317	3.2.1.  Router Down events

319	   In the case of the non-urgent shut-down of a router, a router R must
320	   not update its FIB until all other routers that send traffic via R
321	   and the affected router have first updated their FIBs.

323	   Using a proof similar to that for link failure, it can be shown that
324	   no loops will occur if this ordering is respected [refs.PFOB07].

326	3.2.2.  Router Up events

328	   In the case of a router being brought into service, a router R must
329	   update its FIB BEFORE all other routers that WILL use R to reach the
330	   affected router.

332	   A proof similar to that for link up, shows that no loops will occur
333	   if this ordering is respected [refs.PFOB07].

335	3.2.3.  Linecard Failure/Restoration Events
336	   The failure of a line card involves the failure of a set of links all
337	   of which have a single node in common, i.e.  the parent router.  The
338	   ordering to be applied is the same as if it were the failure of the
339	   parent router.

341	   In a similar way, the restoration of an entire linecard to service as
342	   a single event can be treated as if the parent router were returning
343	   to service.

345	4.  Applying ordered FIB updates

347	4.1.  Deducing the topology change

349	   As has been described, a single event such as the failure or
350	   restoration of a single link, single router or a linecard may be
351	   notified to the rest of the network as a set of individual link
352	   change events.  It is necessary to deduce from this collection of
353	   link state notifications the type of event that has occurred in the
354	   network and hence the required ordering.

356	   When a link change event is received which impacts the receiving
357	   router's FIB, the routers at the near and far end of the link are
358	   noted.

360	   If all events received within some hold-down period (the time that a
361	   router waits to acquire a set of LSPs which should be processed
362	   together) have a single router in common, then it is assumed that the
363	   change reflects an event (line-card or router change) concerning that
364	   router.

366	   In the case of a link change event, the router at the far end of the
367	   link is deemed to be the common router.

369	   All ordering computations are based on treating the common router as
370	   the root for both link and node events.

372	4.2.  Deciding if ordered FIB updates applies

374	   There are some events (for example, a subsequent failure with
375	   conflicting repair requirements occurring before the ordered FIB
376	   process has completed) that cannot be correctly processed by this
377	   mechanism.  In these cases it is necessary to ensure that convergence
378	   falls back to the conventional mode of operation (see Section 7).

380	   In all cases it is necessary to wait some hold-down period after
381	   receiving the first notification to ensure that all routers have
382	   received the complete set of link state notifications associated with
383	   the single event.

385	   At any time, if a link change notification is received which would
386	   have no effect on the receiving router's FIB, then it may be ignored.

388	   If no other event is received during the hold-down time, the event is
389	   treated as a link event.  Note that the IGP reverse connectivity
390	   check means that only the first failure event, or second up event
391	   have an effect on the FIB.

393	   If an event is received within the hold down period which does NOT
394	   reference the common router (R) then in this version of the
395	   specification normal convergence is invoked immediately (see
396	   Section 7).

398	   Network reconvergence under ordered FIB takes longer than the normal
399	   reconvergence process.  Where the failure is protected by an FRR
400	   mechanism, this additional delay in convergence causes no packet
401	   loss.  When the sudden failure of a link or a set of links that are
402	   not protected using a FRR mechanism occurs this must be processed
403	   using the conventional (faster) mode of operation to minimise packet
404	   loss during re-convergence.

406	   In summary an ordered FIB process is applicable if the set of link
407	   state notifications received between the first event and the hold
408	   down period reference a common router R, and one of the following
409	   assertions is verified :

411	   o  The set of notifications refer to link down events concerning
412	      protected links and metric increase events

414	   o  The set of notifications refer to link up events and metric
415	      decrease events.

417	5.  Computation of the ordering

419	   This section describes how the required ordering is computed.

421	   This computation required the introduction of the concept of a
422	   reverse Shortest Path Tree (rSPT).  The rSPT uses the cost towards
423	   the root rather than from it and yields the best paths towards the
424	   root from other nodes in the network[I-D.bryant-ipfrr-tunnels].

426	5.1.  Link or Router Down or Metric Increase

428	   To respect the proposed ordering, routers compute a rank that will be
429	   used to determine the time at which they are permitted to perform
430	   their FIB update.  In the case of a failure event rooted at router Y
431	   or an increase of the metric of link X->Y, router R computes the rSPT
432	   in the topology before the failure (rSPT_OLD) rooted at Y.  This rSPT
433	   gives the shortest paths to reach Y before the failure.  The branch
434	   of the reverse SPT that is below R corresponds to the set of shortest
435	   paths to R that are used by the routers that reach Y via R.

437	   The rank of router R is defined as the depth (in number of hops) of
438	   this branch.  In the case of Equal Cost Multi-path (ECMP), the
439	   maximum depth of the ECMP path set is used.

441	   Router R is required to update its FIB at time

443	   T0 + H + (rank * MAX_FIB)

445	   where T0 is the arrival time of the link-state packet containing the
446	   topology change, H is the hold-down time and MAX_FIB is a network-
447	   wide constant that reflects the maximum time required to update a FIB
448	   irrespective of the change required.  The value of MAX_FIB is network
449	   specific and its determination is out of the scope of this document.
450	   This value must be agreed by all the routers in the network.  This
451	   agreement can be performed by using a capability TLV as defined in
452	   Appendix B.

454	   All the routers that use R to reach Y will compute a lower rank than
455	   R, and hence the correct order will be respected.  It should be noted
456	   that only the routers that used Y before the event need to compute
457	   their rank.

459	5.2.  Link or Router Up or Metric Decrease

461	   In the case of a link or router up event rooted at Y or a link metric
462	   decrease affecting link Y->W, a router R must have a rank that is
463	   higher than the rank of the routers that it will use to reach Y,
464	   according to the rule described in Section 3.  The rank of R is thus
465	   the number of hops between R and Y in its renewed Shortest Path Tree.
466	   When R has multiple equal cost paths to Y, the rank is the length in
467	   hops of the longest ECMP path to Y.

469	   Router R is required to update its FIB at time

471	   T0 + H + (rank * MAX_FIB)

473	   It should be noted that only the routers that use Y after the event
474	   have to compute a rank, i.e.  only the routers that have Y in their
475	   SPT after the link-state change.

477	6.  Acceleration of Ordered Convergence

479	   The mechanism described above is conservative, and hence may be
480	   relatively slow.  The purpose of this section is to describe a method
481	   of accelerating the controlled convergence in such a way that ordered
482	   loop-free convergence is still guaranteed.

484	   In many cases a router will complete its required FIB changes in a
485	   time much shorter than MAX_FIB and in many other cases, a router will
486	   not have to perform any FIB change at all.

488	   This section describes the use of completion messages to speed up the
489	   convergence by providing a means for a router to inform those routers
490	   waiting for it, that it has completed any required FIB changes.  When
491	   a router has been advised of completion by all the routers for which
492	   it is waiting, it can safely update its own FIB without further
493	   delay.  In most cases this can result in a sub-second re-convergence
494	   time comparable with that of normal convergence.

496	   Routers maintain a waiting list of the neighbours from which a
497	   completion message must be received.  Upon reception of a completion
498	   message from a neighbour, a router removes this neighbour from its
499	   waiting list.  Once its waiting list becomes empty, the router is
500	   allowed to update its FIB immediately even if its ranking timer has
501	   not yet expired.  Once this is done, the router sends a completion
502	   message to the neighbours that are waiting for it to complete.  Those
503	   routers are listed in a list called the Notification List.
504	   Completion messages contain an identification of the event to which
505	   they refer.

507	   Note that, since this is only an optimization, any loss of completion
508	   messages will result in the routers waiting their defined ranking
509	   time and hence the loop-free properties will be preserved.

511	6.1.  Construction of the waiting list and notification list

513	6.1.1.  Down events

515	   Consider a link or node down event rooted at router Y or the cost
516	   increase of the link X->Y.  A router R will compute rSPT_OLD(Y) to
517	   determine its rank.  When doing this, R also computes the set of
518	   neighbours that R uses to reach the failing node or link, and the set
519	   of neighbours that are using R to reach the failing node or link.
520	   The Notification list of R is equal to the former set and the Waiting
521	   list of R is equal to the latter.

523	   Note that R could include all its neighbours except those in the
524	   Waiting list in the Notification list, this has no impact on the
525	   correctness of the protocol, but would be unnecessarily inefficient.

527	6.1.2.  Up Events
528	   Consider a link or node up event rooted at router Y or the cost
529	   decrease of the link Y->X.  A router R will compute its new SPT
530	   (SPT_new(R)).  The Waiting list is the set of next hop routers that R
531	   uses to reach Y in SPT_new(R).

533	   In a simple implementation the notification list of R is all the
534	   neighbours of R excluding those in the Waiting list.  This may be
535	   further optimized by computing rSPT_new(Y) to determine those routers
536	   that are waiting for R to complete.

538	6.2.  Format of Completion Messages

540	   The format of completion messages and means of their delivery is
541	   routing protocol dependent and is outside the scope of this document.

543	   The following information is required:

545	   o  Identity of the sender.

547	   o  List of routing notifications being considered in the associated
548	      FIB change.  Each notification is defined as :

550	         Node ID of the near end of the link

552	         Node ID of the far end of the link

554	         Inclusion or removal of link.

556	         Old Metric

558	         New Metric

560	7.  Fall back to Conventional Convergence

562	   In circumstances where a router detects that it is dealing with
563	   incomplete or inconsistent link state information, or when a further
564	   topology event is received before completion of the current ordered
565	   FIB update process, it may be expedient to abandon the controlled
566	   convergence process.  A number of possible fall back mechanisms are
567	   described in Appendix A.  This mechanism is referred to as
568	   "Abandoning All Hope" (AAH).  The state machine defined in the body
569	   of this document does not make any assumption about which fall back
570	   mechanism will be used.

572	8.  oFIB state machine

574	   This section describes a model of an oFIB state machine which an
575	   implementation must be capable of interworking with.

577	   An oFIB capable router maintains an oFIB state value which can be one
578	   of : OFIB_STABLE, OFIB_HOLDING_DOWN, OFIB_HOLDING_UP, OFIB_ABANDONED,
579	   OFIB_ONGOING.

581	   An oFIB capable router maintains a timer, Hold_down_timer.  An oFIB
582	   capable router is configured with a value referred to as
583	   HOLD_DOWN_DURATION.  This configuration can be performed manually or
584	   using Appendix B.

586	   An oFIB capable router maintains a timer, rank_timer.

588	8.1.  OFIB_STABLE

590	   OFIB_STABLE is the state of a router which is not currently involved
591	   in any convergence process.  This router is ready to process an event
592	   by applying oFIB.

594	   EVENT : Reception of a link-state packet describing an event of the
595	   type link X--Y down or metric increase to be processed using oFIB.

597	   ACTION :

599	      Set state to OFIB_HOLDING_DOWN.

601	      Start Hold_down_timer.

603	      ofib_current_common_set = {X,Y}.

605	      Compute rank with respect to the event, as defined in Section 5.

607	      Store Waiting List and Notification List for X--Y obtained from
608	      the rank computation.

610	   EVENT : Reception of a link-state packet describing an event of the
611	   type link X--Y up or metric decrease which to be processed using
612	   oFIB.

614	   ACTION :

616	      Set state to OFIB_HOLDING_UP.

618	      Start Hold_down_timer.

620	      ofib_current_common_set = {X,Y}.

622	      Compute rank with respect to the event, as defined in Section 5.

624	      Store Waiting List and Notification List for X--Y obtained from
625	      the rank computation.

627	8.2.  OFIB_HOLDING_DOWN

629	   OFIB_HOLDING_DOWN is the state of a router that is collecting a set
630	   of link down or metric increase link-state packets to be processed
631	   together using controlled convergence.

633	   EVENT : Reception of a link-state packet describing an event of the
634	   type link up or metric decrease which in itself can be processed
635	   using oFIB.

637	   ACTION :

639	      Set state to OFIB_ABANDONED.

641	      Reset Hold_down_timer.

643	      Trigger AAH mechanism

645	   EVENT : Reception of a link-state packet describing an event of the
646	   type link A--B down or metric increase which in itself can be
647	   processed using oFIB.

649	   ACTION :

651	      ofib_current_common_set =
652	      intersection(ofib_current_common_set,{A,B}).

654	      If ofib_current_common_set is empty, then there is no longer a
655	      node in common in all the pending link-state changes.

657	         Set state to OFIB_ABANDONED.

659	         Reset Hold_down_timer.

661	         Trigger AAH mechanism.

663	      If ofib_current_common set is not empty, update waiting list and
664	      notification list as defined in Section 5.  Note that in the case
665	      of a single link event, the link-state packet received when the
666	      router is in this state describes the state change of the other
667	      direction of the link, hence no changes will be made to the
668	      waiting and notification lists.

670	   EVENT : Hold_down_timer expires.

672	   ACTION :

674	      Set state to OFIB_ONGOING.

676	      Start rank_timer with computed rank.

678	   EVENT : Reception of a completion message

680	   ACTION : Remove the sender from waiting list associated with the
681	   event identified in the completion message.

683	8.3.  OFIB_HOLDING_UP

685	   OFIB_HOLDING_UP is the state of a router that is collecting a set of
686	   link up or metric decrease link-state packets to be processed
687	   together using controlled convergence.

689	   EVENT : Reception of a link-state packet describing an event of the
690	   type link down or metric increase to be processed using oFIB.

692	   ACTION :

694	      Set state to OFIB_ABANDONED.

696	      Reset Hold_down_timer.

698	      Trigger AAH mechanism.

700	   EVENT : Reception of a link-state packet describing an event of the
701	   type link A--B up or metric decrease to be processed using oFIB.

703	   ACTION :

705	      ofib_current_common_set =
706	      intersection(ofib_current_common_set,{A,B}).

708	      If ofib_current_common_set is empty, then there is no longer a
709	      common node in the set of pending link-state changes.

711	         Set state to OFIB_ABANDONED.

713	         Reset Hold_down_timer.

715	         Trigger AAH mechanism.

717	      If ofib_current_common set is not empty, update waiting list and
718	      notification list as defined in Section 5.  Note that in the case
719	      of a single link event, the link-state packet received when the
720	      router is in this state describes the state change of the other
721	      direction of the link, hence no changes will be made to the
722	      waiting and notification lists.

724	   EVENT : Reception of a completion message

726	   ACTION : Remove the sender from the waiting list associated with the
727	   event identified in the completion message.

729	   EVENT : Hold_down_timer expires.

731	   ACTION :

733	      Set state to OFIB_ONGOING.

735	      Start rank_timer with computed rank.

737	8.4.  OFIB_ONGOING

739	   OFIB_ONGOING is the state of a router that is applying the ordering
740	   mechanism w.r.t.  the set of Link State Packets (LSP) collected when
741	   in OFIB_HOLDING_DOWN or OFIB_HOLDING_UP state.

743	   EVENT : rank_timer expires or waiting list becomes empty.

745	   ACTION :

747	      Perform FIB updates according to the change.

749	      Send completion message to each member of the notification list.

751	      Set State to OFIB_STABLE.

753	   EVENT : Reception of a completion message

755	   ACTION : Remove the sender from the waiting list.

757	   EVENT : Reception of a link-state packet describing a link state
758	   change event.

760	   ACTION :

762	      Set state to OFIB_ABANDONED.

764	      Trigger AAH.

766	      Start Hold_down_timer.

768	8.5.  OFIB_ABANDONED

770	   OFIB_ABANDONED is the state of a router that has fallen back to fast
771	   convergence due to the reception of link-state packets that cannot be
772	   dealt together using oFIB.

774	   EVENT : Reception of a link-state packet describing a link-state
775	   change event.

777	   ACTION : Trigger AAH, reset AAH_Hold_down_timer.

779	   EVENT : AAH_Hold_down_timer expires.

781	   ACTION : Set state to OFIB_STABLE

783	9.  Management Considerations

785	   A system for recording the dynamics of the convergence process needs
786	   to be deployed in order to post hoc diagnose the re-convergence.  The
787	   sensitivity of applications to the any packet re-order introduced by
788	   the delayed convergence process will need to be studied, however both
789	   of these considerations apply to any loop-free convergence method and
790	   are not specific to the ordered FIB method described in this
791	   document.

793	10.  IANA considerations

795	   There are no IANA considerations which arise from this document.  Any
796	   such considerations will be called out in protocol specific documents
797	   defining the modification to any routing protocol that is to be
798	   enhanced to support loop-free convergence using ordered FIB.

800	11.  Security considerations

802	   This document requires only minor modifications to existing routing
803	   protocols and therefore does not add significant additional security
804	   risks.  However a full security analysis would need to be provided
805	   within the protocol specific specifications proposed for deployment.
806	   Additional security considerations are noted in Appendix B.4.

808	12.  Acknowledgments

810	   We would like to thank Jean-Philippe Vasseur and Les Ginsberg for
811	   their useful suggestions and comments.

813	13.  Informative References

815	   [ISO10589]
816	              International Organization for Standardization,
817	              "Intermediate system to Intermediate system intra-domain
818	              routing information exchange protocol for use in
819	              conjunction with the protocol for providing the
820	              connectionless-mode Network Service (ISO 8473)", ISO/IEC
821	              10589:2002, Second Edition, Nov 2002.

823	   [refs.PFOB07]
824	              P. Francois, and O. Bonaventure, "Avoiding transient loops
825	              during IGP convergence in IP Networks", in IEEE/ACM
826	              Transactions on Networking, http://inl.info.ucl.ac.be/
827	              system/files/pfr-obo-ofib-ton.pdf, December 2007.

829	   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.

831	   [RFC4090]  Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
832	              Extensions to RSVP-TE for LSP Tunnels", RFC 4090, May
833	              2005.

835	   [RFC5714]  Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC
836	              5714, January 2010.

838	   [RFC5715]  Shand, M. and S. Bryant, "A Framework for Loop-Free
839	              Convergence", RFC 5715, January 2010.

841	   [I-D.atlas-bryant-shand-lf-timers]
842	              K, A. and S. Bryant, "Synchronisation of Loop Free Timer
843	              Values", draft-atlas-bryant-shand-lf-timers-04 (work in
844	              progress), February 2008.

846	   [I-D.bryant-ipfrr-tunnels]
847	              Bryant, S., Filsfils, C., Previdi, S., and M. Shand, "IP
848	              Fast Reroute using tunnels", draft-bryant-ipfrr-tunnels-03
849	              (work in progress), November 2007.

851	Appendix A.  Candidate Methods of Safely Abandoning Loop-Free
852	             Convergence (AAH)

854	   IPFRR[RFC5714] and loop-free convergence techniques [RFC5715] can
855	   deal with single topology change events, multiple correlated change
856	   events, and in some cases even certain uncorrelated events.  However,
857	   in all cases there are events which cannot be dealt with and the
858	   mechanism needs to quickly revert to normal convergence.  This is
859	   known as "Abandoning All Hope" (AAH).

861	   This appendix describes the outcome of a design study into the AAH
862	   problem, and is included here to trigger discussion on the trade-offs
863	   between complexity and robustness in the AAH solution-space.

865	A.1.  Possible Solutions

867	   Two approaches to this problem have been proposed:

869	   1.  Hold-down timer only.

871	   2.  Synchronization of AAH state using AAH messages.

873	   These are described below.

875	A.2.  Hold-down timer only

877	   The "hold-down timer only" AAH method uses a hold-down to acquire a
878	   set of LSPs which should be processed together.  On expiry of the
879	   local hold-down timer, the router begins processing the batch of LSPs
880	   according to the loop free prevention algorithm.

882	   There are a number of problems with this simple approach.  In some
883	   cases the timer value will be too short to ensure that all the
884	   related events have arrived at all routers (perhaps because there was
885	   some unexpected propagation delay, or one or more of the events are
886	   slow in being detected).  In other cases, a completely unrelated
887	   event may occur after the timer has expired, but before the
888	   processing is complete.  In addition, since the timer is started at
889	   each router on reception of the first LSP announcing a topology
890	   change, the actual starting time is dependant upon the propagation
891	   time of the first LSP.  So, for a subsequent event occurring around
892	   the time of the timer expiry, because of variations in propagation
893	   delay it may reach some routers before the timer expires and others
894	   after it has expired.  In the former case this LSP will be included
895	   in the set of changes to be considered, while in the latter it will
896	   be excluded leasing to serious routing inconsistency.  In such cases
897	   continuing to operate the loop-free convergence protocol may
898	   exacerbate the situation.

900	   The simple approach to this would be to revert to normal convergence
901	   (AAH) whenever an LSP is received after the timer has expired.
902	   However this also has problems for the reasons above and therefore
903	   AAH must be a synchronous operation, i.e.  it is necessary to arrange
904	   that an AAH invoked anywhere in the network causes ALL routers to
905	   AAH.

907	   It is also necessary to consider the means of exiting the AAH state.
908	   Again the simplest method is to use a timer.  However while in AAH
909	   state any topology changes previously received, or which are
910	   subsequently received, should be processed immediately using the
911	   traditional convergence algorithms, i.e.  without invoking controlled
912	   convergence.  If the exit from the AAH state is not correctly
913	   synchronized, a new event may be processed by some routers
914	   immediately (as AAH), while those which have already left AAH state
915	   will treat it as the first of a new batch of changes and attempt
916	   controlled convergence.  Thus both entry and exit from the AAH state
917	   needs to be synchronised.  A method of achieving this is described in
918	   Appendix A.3.

920	A.3.  AAH messages

922	   Like the simple timer AAH method, the "AAH messages" AAH method uses
923	   a hold-down to acquire a set of LSPs which should be processed
924	   together.  On expiry of the local hold-down timer, the router begins
925	   processing the batch of LSPs according to the loop free prevention
926	   algorithm.  This is the same behaviour as the hold-down timer only
927	   method.  However, if any router, having started the loop-free
928	   convergence process receives an LSP which would trigger a topology
929	   change, it locally abandons the controlled convergence process, and
930	   sends an AAH message to all its neighbours.  This eventually triggers
931	   all routers to abandon the controlled convergence.  The routers
932	   remain in AAH state (i.e.  processing topology changes using normal
933	   "fast" convergence), until a period of quiescence has elapsed.  The
934	   exit from AAH state is synchronized by using a two step process.  To
935	   achieve the required synchronization, two additional messages are
936	   required, AAH and AAH ACK.  The AAH message is reliably exchanged
937	   between neighbours using the AAH ACK message.  These could be
938	   implemented as a new message within the routing protocol or carried
939	   in existing routing hello messages.  Two types of state machines are
940	   needed.  A per-router AAH state machine and a per neighbour AAH state
941	   machine(PNSM).  These are described below.

943	A.3.1.  Per Router State Machine

945	   Per Router State Table

947	   +-------------+----------+---------+--------+------------+----------+
948	   | EVENT       |    Q     |   Hold  |   CC   |     AAH    | AAH-hold |
949	   +=============+==========+=========+========+============+==========+
950	   | RX LSP      |  Start   |    -    | TX-AAH |  Re-start  |  TX-AAH  |
951	   | triggering  |hold-down |         | Start  | AAH timer. |   Start  |
952	   | change      |  timer   |         |  AAH   |    [AAH]   |    AAH   |
953	   |             |  [Hold]  |         | timer. |            |   timer. |
954	   |             |          |         | [AAH]  |            |   [AAH]  |
955	   +-------------+----------+---------+--------+------------+----------+
956	   | RX AAH      |  TX-AAH  |  TX-AAH | TX-AAH |    [AAH]   |  TX-AAH  |
957	   |(Neighbour's |Start AAH |  Start  | Start  |            |   Start  |
958	   |  PNSM       |  timer.  |   AAH   |  AAH   |            |    AAH   |
959	   |  processes  |  [AAH]   |  timer  | timer. |            |   timer. |
960	   |  RX AAH.)   |          |  [AAH]  | [AAH]  |            |   [AAH]  |
961	   +-------------+----------+---------+--------+------------+----------+
962	   | Timer       |    -     | Trigger |    -   |    Start   |    [Q]   |
963	   | expiry      |          |   CC.   |        |  AAH-hold  |          |
964	   |             |          |  [CC]   |        |   timer.   |          |
965	   |             |          |         |        | [AAH-hold] |          |
966	   +-------------+----------+---------+--------+------------+----------+
967	   | Controlled  |    -     |    -    |   [Q]  |      -     |     -    |
968	   | convergence |          |         |        |            |          |
969	   | completed   |          |         |        |            |          |
970	   +-------------+----------+---------+--------+------------+----------+
971	    TX-AAH = Send "goto TX-AAH" to all other PNSMs.

973	   Operation of the per-router state machine is as follows:

975	   Operation of this state machine under normal topology change involves
976	   only states: Quiescent (Q), Hold-down (Hold) and Controlled
977	   Convergence (CC).  The remaining states are associated with an AAH
978	   event.

980	   The resting state is Quiescent.  When the router in the Quiescent
981	   state receives an LSP indicating a topology change, which would
982	   normally trigger an SPF, it starts the Hold-down timer and changes
983	   state to Hold-down.  It normally remains in this state, collecting
984	   additional LSPs until the Hold-down timer expires.  Note that all
985	   routers must use a common value for the Hold-down timer.  When the
986	   Hold-down timer expires the router then enters Controlled Convergence
987	   (CC) state and executes the CC mechanism to re-converge the topology.
988	   When the CC process has completed on the router, the router re-enters
989	   the Quiescent state.

991	   If this router receives a topology changing LSP whilst it is in the
992	   CC state, it enters AAH state, and sends a "goto TX-AAH" command to
993	   all per neighbour state machines which causes each per-neighbour
994	   state machine to signal this state change to its neighbour.
995	   Alternatively, if this router receives an AAH message from any of its
996	   neighbours whilst in any state except AAH, it starts the AAH timer
997	   and enters the AAH state.  The per neighbour state machine
998	   corresponding to the neighbour from which the AAH was received
999	   executes the RX AAH action (which causes it to send an AAH ACK),
1000	   while the remainder are sent the "goto TX-AAH" command.  The result
1001	   is that the AAH is acknowledged to the neighbour from which it was
1002	   received and propagated to all other neighbours.  On entering AAH
1003	   state, all CC timers are expired and normal convergence takes place.

1005	   Whilst in the AAH state, LSPs are processed in the traditional
1006	   manner.  Each time an LSP is received, the AAH timer is restarted.
1007	   In an unstable network ALL routers will remain in this state for some
1008	   time and the network will behave in the traditional uncontrolled
1009	   convergence manner.

1011	   When the AAH timer expires, the router enters AAH-hold state and
1012	   starts the AAH hold timer.  The purpose of the AAH-hold state is to
1013	   synchronize the transition of the network from AAH to Quiescent.  The
1014	   additional state ensures that the network cannot contain a mixture of
1015	   routers in both AAH and Quiescent states.  If, whilst in AAH-Hold
1016	   state the router receives a topology changing LSP, it re-enters AAH
1017	   state and commands all per neighbour state machines to "goto TX-AAH".
1018	   If, whilst in AAH-Hold state the router receives an AAH message from
1019	   one of its neighbours, it re-enters the AAH state and commands all
1020	   other per neighbour state machines to "goto TX-AAH".  Note that the
1021	   per-neighbour state machine receiving the AAH message will
1022	   autonomously acknowledge receipt of the AAH message.  Commanding the
1023	   per-neighbour state machine to "goto TX-AAH" is necessary, because
1024	   routers may be in a mixture of Quiescent, Hold-down and AAH-hold
1025	   state, and it is necessary to rendezvous the entire network back to
1026	   AAH state.

1028	   When the AAH Hold timer expires the router changes to state Quiescent
1029	   and is ready for loop free convergence.

1031	A.3.2.  Per Neighbor State Machine

1033	   Per Neighbour State Table

1035	   +----------------------------+--------------+-----------------------+
1036	   | EVENT                      | Idle         | TX-AAH                |
1037	   +============================+==============+=======================+
1038	   | RX AAH                     | Send ACK.    | Send ACK.             |
1039	   |                            |              | Cancel timer.         |
1040	   |                            | [IDLE]       | [IDLE]                |
1041	   +----------------------------+--------------+-----------------------+
1042	   | RX ACK                     | ignore       | Cancel timer.         |
1043	   |                            |              | [IDLE]                |
1044	   +----------------------------+--------------+-----------------------+
1045	   | RX "goto TX-AAH" from      | Send AAH     | ignore                |
1046	   | Router State Machine       | [TX-AAH]     |                       |
1047	   +----------------------------+--------------+-----------------------+
1048	   | Timer expires              | impossible   | Send AAH              |
1049	   |                            |              | Restart timer.        |
1050	   |                            |              | [TX-AAH]              |
1051	   +----------------------------+--------------+-----------------------+

1053	   There is one instance of the per-neighbour state machine(PNSM) for
1054	   each neighbour within the convergence control domain.

1056	   The normal state is IDLE.

1058	   On command ("goto TX-AAH") from the router state machine, the state
1059	   machine enters TX-AAH state, transmits an AAH message to its
1060	   neighbour and starts a timer.

1062	   On receipt of an AAH ACK in state TX-AAH the state machine cancels
1063	   the timer and enters IDLE state.

1065	   In states IDLE, any AAH ACK message received is ignored.

1067	   On expiry of the timer in state TX-AAH the state machine transmits an
1068	   AAH message to the neighbour and restarts the timer.  (The timer
1069	   cannot expire in any other state.)

1071	   In any state, receipt of an AAH causes the state machine to transmit
1072	   an AAH ACK and enter the IDLE state.

1074	   Note that for correct operation the state machine must remain in
1075	   state TX-AAH, until an AAH ACK or an AAH is received, or the state
1076	   machine is deleted.  Deletion of the per neighbour state machine
1077	   occurs when routing determines that the neighbour has gone away, or
1078	   when the interface goes away.

1080	   When routing detects a new neighbour it creates a new instance of the
1081	   per-neighbour state machine in state Idle.  The consequent generation
1082	   of the router's own LSP will then cause the router state machine to
1083	   execute the LSP receipt actions, which will if necessary result in
1084	   the new per-neighbour state machine receiving a "goto TX-AAH" command
1085	   and transitioning to TX-AAH state.

1087	Appendix B.  Synchronisation of Loop Free Timer Values

1089	   The Appendix provided the reader with access to the design
1090	   considerations originally described in
1091	   [I-D.atlas-bryant-shand-lf-timers] .

1093	B.1.  Introduction

1095	   Most of the loop-free convergence mechanisms [RFC5715] require one or
1096	   more convergence delay timers that must have a duration that is
1097	   consistent throughout the routing domain.  This time is the worst
1098	   case time that any router will take to calculate the new topology,
1099	   and to make the necessary changes to the FIB.  The timer is used by
1100	   the routers to know when it is safe to transition between the loop-
1101	   free convergence states.  The time taken by a router to complete each
1102	   phase of the loop-free transition will be dependent on the size of
1103	   the network and the design and implementation of the router.  It can
1104	   therefore be expected that the optimum delay will need to be tuned
1105	   from time to time as the network evolves.  Manual configuration of
1106	   the timer is fraught for two reasons.  Firstly it is always difficult
1107	   to ensure that the correct value is installed in all of the routers.
1108	   Secondly, if any change is introduced into the network that results
1109	   in a need to change the timer, for example, due to a change in
1110	   hardware or software version, then all of the routers need to be
1111	   reconfigured to use the new timer value.  It is therefore desirable
1112	   that a means be provided by which the convergence delay timer can be
1113	   automatically synchronized throughout the network.

1115	B.2.  Required Properties

1117	   The timer synchronization mechanism must have the following
1118	   properties:

1120	   o  The convergence delay time must be consistent amongst all routers
1121	      that are converging on the new topology.

1123	   o  The convergence delay time must be the highest delay required by
1124	      any router in the new topology.

1126	   o  The mechanism must increase the delay when a new router in
1127	      introduced to the network that requires a higher delay than is
1128	      currently in use.

1130	   o  When the router that had the longest delay requirements is removed
1131	      from the topology, the convergence delay timer value must, within
1132	      some reasonable time, be reduced to the longest delay required by
1133	      the remaining routers.

1135	   o  It must be possible for a router to change the convergence delay
1136	      timer value that it requires.

1138	   o  A router which is in multiple routing areas, or is running
1139	      multiple routing protocols may signal a different loop-free
1140	      convergence delay for each area, and for each protocol.

1142	   How a router determines the time that it needs to execute each
1143	   convergence phase is an implementation issue, and outside the scope
1144	   of this specification.  However a router that dynamically determines
1145	   its proposed timer value must do so in such a way that it does not
1146	   cause the synchronized value to continually fluctuate.

1148	B.3.  Mechanism

1150	   The following mechanism is proposed.

1152	   A new information element is introduced into the routing protocol
1153	   that specifies the maximum time (in milliseconds) that the router
1154	   will take to calculate the new topology and to update its FIB as a
1155	   result of any topology change.

1157	   When a topology change occurs, the largest convergence delay time
1158	   required by any router in the new topology is used by the loop-free
1159	   convergence mechanism.

1161	   If a routing protocol message is issued that changes the convergence
1162	   delay timer value, but does not change the topology, the new timer
1163	   value must be taken into consideration during the next loop-free
1164	   transition, but must not instigate a loop-free transition.

1166	   If a routing protocol message is issued that changes the convergence
1167	   timer value and changes the topology, a loop-free transition is
1168	   instigated and the new timer value is taken into consideration.

1170	   The loop-free convergence mechanism should specify the action to be
1171	   taken if a timer change (only) message and a topology change message
1172	   are independently generated during the hold-off time.  A suitable
1173	   action would be to take the same action that would be taken if two
1174	   uncorrelated topology changes occurred in the network.

1176	   All routers that support loop-free convergence must advertise a loop-
1177	   free convergence delay time.  The loop-free convergence mechanism
1178	   must specify the action to be taken if a router does not advertise a
1179	   convergence delay time.

1181	B.4.  Security Considerations
1182	   If an abnormally large timer value is proposed by a router, the there
1183	   is a danger that the loop-free convergence process will take an
1184	   excessive time.  If during that time the routing protocol signals the
1185	   need for another transition, the loop-free transition will be
1186	   abandoned and the default best case (traditional) convergence
1187	   mechanism used.

1189	   It is still undesirable that the routers select a convergence delay
1190	   time that has an excessive value.  The maximum value that can be
1191	   specified in the LSP/LSA is limited through the use of a 16 bit field
1192	   to about 65 seconds.  When sufficient implementation experience is
1193	   gained, an architectural constant will be specified which sets the
1194	   upper limit of the convergence delay timer.

1196	Authors' Addresses

1198	   Mike Shand
1199	   Individual Contributor

1201	   Email: imc.shand@googlemail.com

1203	   Stewart Bryant
1204	   Cisco Systems
1205	   Green Park, 250, Longwater Avenue,
1206	   Reading  RG2 6GB
1207	   UK

1209	   Email: stbryant@cisco.com

1211	   Stefano Previdi
1212	   Cisco Systems
1213	   Via Del Serafico 200
1214	   00142 Roma
1215	   Italy

1217	   Email: sprevidi@cisco.com

1219	   Clarence Filsfils
1220	   Cisco Systems
1221	   Brussels
1222	   Belgium

1224	   Email: cfilsfil@cisco.com
1225	   Pierre Francois
1226	   Institute IMDEA Networks
1227	   Avda. del Mar Mediterraneo, 22
1228	   Leganese  28918
1229	   ES

1231	   Email: pierre.francois@imdea.org

1233	   Olivier Bonaventure
1234	   Universite catholique de Louvain
1235	   Place Ste Barbe, 2
1236	   Louvain-la-Neuve  1348
1237	   BE

1239	   Email: Olivier.Bonaventure@uclouvain.be
1240	   URI:   http://inl.info.ucl.ac.be/