Internet Engineering Task Force                      Gagan L. Choudhury
Internet Draft                                     Vera D. Sapozhnikova
Expires in September, 2003                                         AT&T
Category: Best Current Practice
draft-ietf-ospf-scalability-03.txt                    Anurag S. Maunder
                                                         Sanera Systems

                                                         Vishwas Manral
                                                       Netplane Systems

                                                             March, 2003

                Prioritized Treatment of Specific OSPF
                  Packets and Congestion Avoidance

Status of this Memo

   This document is an Internet-Draft and is in full conformance
   with all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   Distribution of this memo is unlimited.

Abstract

   This document proposes methods that are intended to improve the
   scalability and stability of large networks using the OSPF protocol.
   The methods include processing OSPF Hellos and LSA Acknowledgements
   at a higher priority than other OSPF packets, together with other
   congestion avoidance procedures.  Simulation results in support of
   some of the proposals are given in the appendix sections.

Table of Contents

   1. Motivation
   2. The Proposals
   3. Security Considerations
   4. Acknowledgments
   5. References
   6. Authors' Addresses
   Appendix A. LSA Storm: Causes and Impact
   Appendix B. Simulation Study
   Appendix B.1. The Network Under Simulation
   Appendix B.2. Simulation Results
   Appendix B.3. Observations on Simulation Results
   Appendix C. Other Proposals

1. Motivation

   A large network running the OSPF [Ref1] or OSPF-TE [Ref2] protocol
   may occasionally experience the simultaneous or near-simultaneous
   update of a large number of link-state-advertisement messages, or
   LSAs.  We call this event an LSA storm; it may be initiated by an
   unscheduled failure or by a scheduled maintenance or upgrade event.
   The failure may be hardware, software, or procedural in nature.

   An LSA storm causes high CPU and memory utilization at the node
   processors, causing incoming packets to be delayed or dropped.
   Delayed acknowledgements (beyond the retransmission timer value)
   result in retransmissions, and delayed Hello packets (beyond the
   router-dead interval) result in links being declared down.
   The retransmissions and additional LSA generations result in further
   CPU and memory usage, essentially creating a positive feedback loop
   which, in the extreme case, may drive the network to an unstable
   state.

   The default value of the retransmission timer is 5 seconds, and that
   of the router-dead interval is 40 seconds.  Recently, however, there
   has been a lot of interest in significantly reducing OSPF
   convergence time, and as part of that effort much shorter
   (subsecond) Hello and router-dead intervals have been proposed
   [Ref3].  In such a scenario it becomes more likely that Hello
   packets will be delayed beyond the router-dead interval during a
   network congestion event caused by an LSA storm.

   Appendix A explains LSA storm generation scenarios and their impact
   in more detail, and points out a few real-life examples of
   control-message storm generation.  Appendix B presents a simulation
   study of this phenomenon.

   In order to improve the scalability and stability of networks, we
   propose steps for prioritizing critical OSPF packets and avoiding
   congestion.  The details of the proposals are given in Section 2.
   We also present a simulation study of a subset of the proposals in
   Appendix B and show that they indeed improve the scalability and
   stability of networks using the OSPF protocol.

   Appendix C provides some further proposals with similar goals.

2. The Proposals

   The proposals below are intended to improve the scalability and
   stability of large networks using the OSPF protocol.  During periods
   of network congestion they would reduce retransmissions, avoid an
   interface being declared down due to Hello packets being delayed
   beyond the RouterDeadInterval, and take other congestion avoidance
   steps.

   All or a subset of the proposals may be implemented by a router.
   It is also possible for some routers to implement them fully or
   partially, and for others not to implement them at all.

   (1) Classify all OSPF packets into two classes: a "high priority"
       class comprising OSPF Hello packets and Link State
       Acknowledgement packets, and a "low priority" class comprising
       all other packets.  The classification is accomplished by
       examining the OSPF packet header.  While receiving a packet
       from a neighbor and while transmitting a packet to a neighbor,
       try to process a "high priority" packet ahead of a "low
       priority" packet.

   (2) Reset the Inactivity Timer for an interface whenever any OSPF
       packet is received over that interface (currently this is done
       only for the Hello packet).  OSPF would then declare the
       interface down only if no OSPF packet at all is received over
       that interface for a period equaling or exceeding the
       RouterDeadInterval.

   (3) Use an exponential backoff algorithm for determining the value
       of the LSA retransmission interval (RxmtInterval).  Let R(i)
       represent the RxmtInterval value used during the i-th
       retransmission of an LSA.  Use the following algorithm to
       compute R(i):

          R(1)   = Rmin
          R(i+1) = Min(K*R(i), Rmax) for i >= 1

       where K, Rmin, and Rmax are constants and the function Min(.,.)
       represents the minimum value of its two arguments.  Example
       values for K, Rmin, and Rmax may be 2, 5 seconds, and 40
       seconds respectively.
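       As a non-normative illustration, the following Python sketch
       computes the retransmission schedule of proposal (3); the
       constants are the example values given above.

          def rxmt_interval(i, k=2, r_min=5.0, r_max=40.0):
              """Return R(i), the RxmtInterval (in seconds) for the
              i-th retransmission of an LSA (i >= 1).  The recursion
              R(1) = Rmin, R(i+1) = Min(K*R(i), Rmax) closes to
              R(i) = Min(K**(i-1) * Rmin, Rmax)."""
              return min(k ** (i - 1) * r_min, r_max)

          # With the example constants, successive retransmissions of
          # the same LSA are spaced 5, 10, 20, 40, 40, ... seconds:
          print([rxmt_interval(i) for i in range(1, 7)])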
   (4) Implicit Congestion Detection and Action Based on That: If
       there is control-message congestion at a node, its neighbors do
       not know about it explicitly.  However, they can detect it
       implicitly from the number of unacknowledged LSAs to this node.
       If this number exceeds a certain "high water mark", the rate at
       which LSAs are sent to this node should be reduced.  At a
       future time, if the number of unacknowledged LSAs to this node
       falls below a certain "low water mark", the normal rate of
       sending LSAs to this node should be resumed.  An example value
       for the "high water mark" may be 20 unacknowledged LSAs and
       that for the "low water mark" may be 10 unacknowledged LSAs.
       An example value for the rate on exceeding the "high water
       mark" may be 50% of the normal rate.
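       A minimal sketch of the hysteresis in proposal (4), assuming a
       hypothetical per-neighbor flooding-rate hook; the water marks
       and rate factor are the example values above.

          class NeighborFloodControl:
              """Track unacknowledged LSAs to one neighbor and slow
              the flooding rate between the two water marks."""

              def __init__(self, high_water=20, low_water=10,
                           slow_factor=0.5):
                  self.high_water = high_water
                  self.low_water = low_water
                  self.slow_factor = slow_factor  # fraction of normal
                  self.unacked = 0     # LSAs sent but not yet acked
                  self.throttled = False

              def on_lsa_sent(self):
                  self.unacked += 1
                  if self.unacked > self.high_water:
                      self.throttled = True   # implicitly congested

              def on_ack_received(self):
                  self.unacked = max(0, self.unacked - 1)
                  if self.unacked < self.low_water:
                      self.throttled = False  # resume normal rate

              def current_rate(self, normal_rate):
                  """LSAs/second at which to flood to this neighbor."""
                  if self.throttled:
                      return normal_rate * self.slow_factor
                  return normal_rate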
   (5) Throttling Adjacencies to be Brought Up Simultaneously: If a
       node tries to bring up a large number of adjacencies to its
       neighbors simultaneously, that may cause severe congestion due
       to database synchronization and LSA flooding activities.  It is
       recommended that in such a situation no more than "n"
       adjacencies be brought up simultaneously.  Once a subset of the
       adjacencies has been brought up successfully, newer adjacencies
       may be brought up, as long as the number of adjacencies being
       brought up simultaneously does not exceed "n".  An example
       value for "n" may be 4.
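       The throttling in proposal (5) behaves like a counting
       semaphore over pending adjacencies.  A minimal sketch, with the
       FIFO queueing discipline and the bring-up hook being
       assumptions:

          from collections import deque

          class AdjacencyThrottle:
              """Bring up at most `n` adjacencies simultaneously;
              further neighbors wait in FIFO order."""

              def __init__(self, n=4):
                  self.n = n
                  self.in_progress = set()
                  self.pending = deque()

              def request_bringup(self, neighbor):
                  self.pending.append(neighbor)
                  self._start_next()

              def on_adjacency_full(self, neighbor):
                  """Call when database synchronization with
                  `neighbor` completes (or is abandoned)."""
                  self.in_progress.discard(neighbor)
                  self._start_next()

              def _start_next(self):
                  while self.pending and len(self.in_progress) < self.n:
                      neighbor = self.pending.popleft()
                      self.in_progress.add(neighbor)
                      neighbor.start_database_exchange()  # hypothetical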
3. Security Considerations

   This memo does not create any new security issues for the OSPF
   protocol.  Security considerations for the base OSPF protocol are
   covered in [Ref1].

4. Acknowledgments

   We would like to acknowledge the support of OSPF WG chairs Rohit
   Dube, Acee Lindem, and John Moy.  We also acknowledge Jerry Ash,
   Margaret Chiosi, Elie Francis, Jeff Han, Beth Munson, Roshan Rao,
   Moshe Segal, Mike Wardlow, and Pat Wirth for collaboration and
   encouragement in our scalability improvement efforts for
   Link-State-Protocol based networks.

5. References

   [Ref1]  J. Moy, "OSPF Version 2", RFC 2328, April 1998.

   [Ref2]  D. Katz, D. Yeung, and K. Kompella, "Traffic Engineering
           Extension to OSPF Version 2", Work in Progress.

   [Ref3]  C. Alaettinoglu, V. Jacobson, and H. Yu, "Towards
           Millisecond IGP Convergence", Work in Progress.

   [Ref4]  D. Pappalardo, "AT&T, customers grapple with ATM net
           outage", Network World, February 26, 2001.

   [Ref5]  "AT&T announces cause of frame-relay network outage", AT&T
           Press Release, April 22, 1998.

   [Ref6]  K. Cholewka, "MCI Outage Has Domino Effect", Inter@ctive
           Week, August 20, 1999.

   [Ref7]  M. Jander, "In Qwest Outage, ATM Takes Some Heat", Light
           Reading, April 6, 2001.

   [Ref8]  A. Zinin and M. Shand, "Flooding Optimizations in
           Link-State Routing Protocols", Work in Progress.

   [Ref9]  J. Moy, "Flooding over Parallel Point-to-Point Links",
           Work in Progress.

   [Ref10] P. Pillay-Esnault, "OSPF Refresh and Flooding Reduction in
           Stable Topologies", Work in Progress.

   [Ref11] J. Ash, G. Choudhury, V. Sapozhnikova, M. Sherif, A.
           Maunder, and V. Manral, "Congestion Avoidance & Control
           for OSPF Networks", Work in Progress.

   [Ref12] B. M. Waxman, "Routing of Multipoint Connections", IEEE
           Journal on Selected Areas in Communications,
           6(9):1617-1622, 1988.

6. Authors' Addresses

   Gagan L. Choudhury
   AT&T
   Room D5-3C21
   200 Laurel Avenue
   Middletown, NJ 07748
   USA
   Phone: (732) 420-3721
   EMail: gchoudhury@att.com

   Vera D. Sapozhnikova
   AT&T
   Room C5-2C29
   200 Laurel Avenue
   Middletown, NJ 07748
   USA
   Phone: (732) 420-2653
   EMail: sapozhnikova@att.com

   Anurag S. Maunder
   Sanera Systems
   370 San Aleso Ave., Second Floor
   Sunnyvale, CA 94085
   Phone: (408) 734-6123
   EMail: amaunder@sanera.net

   Vishwas Manral
   NetPlane
   189 Prashasan Nagar, Road Number 72
   Jubilee Hills, Hyderabad
   India
   EMail: Vishwasm@netplane.com

Appendix A. LSA Storm: Causes and Impact

   An LSA storm may be initiated for many reasons.  Here are some
   examples:

   (a) one or more link failures due to fiber cuts,

   (b) one or more node failures for some reason, e.g., a software
       crash or some type of disaster (including a power outage) in
       an office complex hosting many nodes,

   (c) link/node flapping,

   (d) the need to take down and later bring back many nodes during a
       software/hardware upgrade,

   (e) near-synchronization of the once-in-30-minutes refresh
       instants of a subset of LSAs,

   (f) refresh of all LSAs in the system during a change in software
       version,

   (g) injection of a large number of external routes into OSPF due
       to a procedural error.

   In addition to the LSAs generated as a direct result of link/node
   failures, there may be other, indirect LSAs as well.  One example
   in MPLS networks is traffic engineering LSAs generated at other
   links as a result of a significant change in reserved bandwidth,
   caused by the rerouting of Label Switched Paths (LSPs) that went
   down during the link/node failure.

   The LSA storm causes high CPU and memory utilization at the node
   processors, causing incoming packets to be delayed or dropped.
   Delayed acknowledgements (beyond the retransmission timer value)
   result in retransmissions, and delayed Hello packets (beyond the
   Router-Dead interval) result in links being declared down.  A
   trunk-down event causes Router LSA generation by its end-point
   nodes.  If traffic engineering LSAs are used for each link, that
   type of LSA would also be generated by the end-point nodes, and
   potentially elsewhere as well, due to significant changes in
   reserved bandwidths at other links caused by the failure and
   reroute of LSPs originally using the failed trunk.  Eventually,
   when the link recovers, that too triggers additional Router and
   traffic engineering LSAs.

   The retransmissions and additional LSA generations result in
   further CPU and memory usage, essentially causing a positive
   feedback loop.  We define the LSA storm size as the number of LSAs
   in the original storm, not counting any additional LSAs resulting
   from the feedback loop described above.  If the LSA storm is too
   large, the positive feedback loop may be strong enough to sustain
   high CPU and memory utilization at many network nodes
   indefinitely, thereby driving the network to an unstable state.
   In the past, network outage events have been reported in IP and
   ATM networks using link-state protocols such as OSPF, IS-IS, PNNI,
   or some proprietary variants; see, for example, [Ref4]-[Ref7].  In
   many of these examples, large-scale flooding of LSAs or other
   similar control messages (either naturally or triggered by some
   bug or inappropriate procedure) has been partly or fully
   responsible for network instability and outage.

   In Appendix B, we use a simulation model to show that there is a
   certain LSA storm size threshold above which the network may show
   unstable behavior caused by a large number of retransmissions,
   link failures due to missed Hello packets, and subsequent link
   recoveries.  We also show that the LSA storm size causing
   instability may be substantially increased by giving prioritized
   treatment to Hello and LSA Acknowledgment packets and by using an
   exponential backoff algorithm for determining the LSA
   retransmission interval.  Furthermore, if we prioritize Hello
   packets, then even when the network operates somewhat above the
   stability threshold, links are not declared down due to missed
   Hellos.  This implies that even though there is control-plane
   congestion due to many retransmissions, the data plane stays up
   and no new LSAs are generated (besides the ones in the original
   storm and the refreshes).  These observations are the basis of the
   first three proposals in Section 2.

   One might argue that the scalability issue of large networks
   should be solved solely by dividing the network hierarchically
   into multiple areas so that flooding of LSAs remains localized
   within areas.  However, this approach increases network management
   and design complexity and may result in less optimal routing
   between areas.  Also, ASE LSAs are flooded throughout the AS,
   which may be a problem if there are large numbers of them.
   Furthermore, a large number of summary LSAs may need to be flooded
   across areas, and their numbers would increase significantly if
   multiple Area Border Routers are employed for reliability.  Thus
   it is important to allow the network to grow to as large a size as
   possible within a single area.

   Our proposal here is synergistic with a broader set of scalability
   and stability improvement proposals.  [Ref8] and [Ref9] propose
   flooding overhead reductions for the case where more than one
   interface goes to the same neighbor.  [Ref10] proposes a mechanism
   for greatly reducing LSA refreshes in stable topologies.  [Ref11]
   proposes a wide range of congestion control and failure recovery
   mechanisms.

Appendix B. Simulation Study

   The main motivation of this study is to show the network
   congestion and instability caused by large LSA storms, and the
   improvement in stability and scalability that can be achieved by
   following the proposals in this memo.

Appendix B.1. The Network Under Simulation

   We generate a random network over a rectangular grid using a
   modified version of Waxman's algorithm [Ref12] that ensures that
   the network is connected and has a pre-specified number of nodes,
   number of links, maximum number of neighbors per node, and maximum
   number of adjacencies per node.  The rectangular grid resembles
   the continental U.S.A., with a maximum one-way propagation delay
   of 30 ms in the East-West direction and 15 ms in the North-South
   direction.  We consider two different network sizes, as explained
   in Section B.2.
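   For readers unfamiliar with [Ref12], the Python sketch below
   generates a classic Waxman random graph.  The degree cap and the
   repair pass for isolated nodes are assumptions standing in for the
   unspecified modifications mentioned above, not the exact algorithm
   used in this study.

      import math
      import random

      def waxman_graph(n, target_links, alpha=0.4, beta=0.4,
                       max_degree=30):
          """Place n nodes uniformly at random on a 2:1 rectangle and
          add edge (u, v) with probability
          beta * exp(-d(u, v) / (alpha * d_max)) until target_links
          edges exist.  A final pass attaches isolated nodes (full
          connectivity would additionally need a component merge)."""
          pts = [(2.0 * random.random(), random.random())
                 for _ in range(n)]
          d = lambda u, v: math.dist(pts[u], pts[v])
          d_max = max(d(u, v) for u in range(n)
                      for v in range(u + 1, n))
          pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
          random.shuffle(pairs)
          edges, deg = set(), [0] * n
          for u, v in pairs:
              if len(edges) >= target_links:
                  break
              p = beta * math.exp(-d(u, v) / (alpha * d_max))
              if (deg[u] < max_degree and deg[v] < max_degree
                      and random.random() < p):
                  edges.add((u, v))
                  deg[u] += 1
                  deg[v] += 1
          for u in range(n):        # repair pass for isolated nodes
              if deg[u] == 0:
                  v = min((w for w in range(n) if w != u),
                          key=lambda w: d(u, w))
                  edges.add((min(u, v), max(u, v)))
                  deg[u] += 1
                  deg[v] += 1
          return pts, edges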
   The network has a flat, single-area topology.

   Each node is a router and each link is a point-to-point link
   connecting two routers.

   We assume that node CPU and memory (not the link bandwidth) is the
   main bottleneck in the LSA flooding process.  This will typically
   be true for high-speed links (e.g., OC3 or above) and/or links
   where OSPF traffic gets an adequate Quality of Service (QoS)
   compared to other traffic.

   Different Timers:

      LSA refresh interval = 1800 seconds.

      Hello refresh interval = 10 seconds.

      Router-Dead interval = 40 seconds.

      LSA retransmission interval: two values are considered, 10
      seconds and 5 seconds (note that a retransmission is disabled
      on the receipt of either an explicit acknowledgment or a
      duplicate LSA over the same interface, which acts as an
      implicit acknowledgment).

      Minimum time between successive generations of the same LSA =
      5 seconds.

      Minimum time between successive Dijkstra SPF calculations = 1
      second.

   Packing of LSAs: It is assumed that, for any given node, the LSAs
   generated over a 1-second period are packed together to form an
   LSU, but no more than 3 LSAs are packed into one LSU.

   LSU/Ack/Hello Processing Times: All processing times are expressed
   in terms of the parameter T.  Two values of T are considered, 1 ms
   and 0.5 ms.

   In the case of a dedicated processor for processing OSPF packets,
   the processing time reported represents the true processing time.
   If the processor does other work and only a fraction of its
   capacity can be dedicated to OSPF processing, then the processing
   time has to be inflated appropriately to get the effective
   processing time; in that case it is assumed that the inflation
   factor is already taken into account in the reported processing
   time.

   The fixed time to send or receive any LSU, Ack, or Hello packet is
   T.  In addition, a variable processing time is used for LSUs and
   Acks, depending on the number and types of LSAs packed.  No
   variable processing time is used for Hellos.  The variable
   processing time per Router LSA is (0.5 + 0.17L)T, where L is the
   number of adjacencies advertised by the Router LSA.  For other LSA
   types (e.g., an ASE LSA or a "Link" LSA carrying traffic
   engineering information about a link), the variable processing
   time per LSA is 0.5T.

   The variable processing time for an Ack is 25% of that of the
   corresponding LSA.

   It is to be noted that if multiple LSAs are packed into a single
   LSU packet, the fixed processing time is incurred only once, but
   the variable processing time is incurred for every component of
   the packet.

   The processing time values we use are roughly in the same range as
   those observed in an operational network.

   LSU/Ack/Hello Priority: Two non-preemptive priority levels and
   three priority scenarios are considered.  Within each priority
   level, processing is FIFO, with newly arriving lower-priority
   packets being dropped when the lower-priority queue is full.
   Higher-priority packets are never dropped.

      In Priority scenario 1, all LSUs/Acks/Hellos received at a node
      are queued at the lower priority.

      In Priority scenario 2, Hellos received at a node are queued at
      the higher priority, but LSUs/Acks are queued at the lower
      priority.

      In Priority scenario 3, Hellos and Acks received at a node are
      queued at the higher priority, but LSUs are queued at the lower
      priority.

   All packets generated internally at a node (usually triggered by a
   timer) are processed at the higher priority.  This includes the
   initial LSA storm, LSA refreshes, Hello refreshes, LSA
   retransmissions, and new LSA generation after the detection of a
   failure or recovery.
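   A minimal sketch of this two-level, non-preemptive discipline in
   Python, assuming hypothetical packet objects; the low-priority
   buffer size of 2000 packets is the one given below.

      from collections import deque

      def is_high_priority(ospf_type, scenario):
          """OSPF packet type codes: 1 = Hello, 2 = Database
          Description, 3 = LS Request, 4 = LS Update, 5 = LS Ack."""
          if scenario == 2:
              return ospf_type == 1          # Hellos only
          if scenario == 3:
              return ospf_type in (1, 5)     # Hellos and Acks
          return False                       # scenario 1: nothing

      class TwoLevelScheduler:
          def __init__(self, low_buffer=2000):
              self.high = deque()
              self.low = deque()
              self.low_buffer = low_buffer
              self.dropped = 0

          def enqueue(self, packet, ospf_type, scenario):
              if is_high_priority(ospf_type, scenario):
                  self.high.append(packet)   # never dropped
              elif len(self.low) < self.low_buffer:
                  self.low.append(packet)
              else:
                  self.dropped += 1          # tail-drop at low priority

          def next_packet(self):
              """Consulted only between packet service times, so the
              discipline is non-preemptive."""
              if self.high:
                  return self.high.popleft()
              return self.low.popleft() if self.low else None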
   Buffer Size for Incoming LSUs/Acks/Hellos (lower priority): The
   buffer size is assumed to be 2000 packets, where a packet is
   either an Ack, an LSU, or a Hello.

   LSA Refresh: Each LSA is refreshed once every 1800 seconds, and
   the refresh instants of the various LSAs in the LSDB are assumed
   to be uniformly distributed over the 1800-second period, i.e.,
   completely unsynchronized.  If, however, an LSA is generated as
   part of the initial LSA storm, it goes on a new refresh schedule
   of once every 1800 seconds starting from its generation time.

   LSA Storm Generation: As defined earlier, an "LSA storm" is the
   simultaneous or near-simultaneous generation of a large number of
   LSAs.  In the case of only Router and ASE LSAs, we normally assume
   that the number of ASE LSAs in the storm is about 4 times that of
   the Router LSAs, but the ratio is allowed to change if either the
   Router or the ASE LSAs have reached their maximum possible value.
   In the case of only Router and Link LSAs (carrying traffic
   engineering information), we normally assume that the number of
   Link LSAs in the storm is about 4 times that of the Router LSAs,
   but the ratio is allowed to change if either the Router or the
   Link LSAs have reached their maximum possible value.  For any
   given LSA storm, we generate LSAs starting from node index 1 and
   moving upwards, stopping when the correct number of LSAs of each
   type has been generated.  The LSA generation at any given node is
   assumed to start at an instant uniformly distributed between 20
   and 30 seconds from the start of the simulation.  Successive LSA
   generations at a node are assumed to be spaced 400 ms apart.  It
   is to be noted that during the period of observation there are
   other LSAs generated besides the ones in the storm.  These include
   refreshes of LSAs that are not part of the storm, and LSAs
   generated due to possible link failures and subsequent possible
   link recoveries.

   Failure/Recovery of Links: If no Hello is received over a link
   (due to CPU/memory congestion) for longer than the Router-Dead
   interval, the link is declared down.  At a later time, if Hellos
   are again received, the link is declared up.  Whenever a link is
   declared up or down, one Router LSA is generated by each of the
   routers on the two sides of the point-to-point link.  If "Link
   LSAs" carrying traffic engineering information are used, it is
   assumed that each router would also generate a Link LSA.  In this
   case it is also assumed that, due to rerouting of LSPs, three
   other links in the network (selected randomly in the simulation)
   would see a significant change in reserved bandwidth, resulting in
   one Link LSA being generated by the routers on the two ends of
   each such link.

Appendix B.2. Simulation Results

   In this section we study the relative performance of the three
   priority scenarios defined earlier (no priority to Hellos or Acks,
   priority to Hellos only, and priority to both Hellos and Acks)
   over a range of network sizes, LSA retransmission timer values,
   LSA types, processing time values, and Hello/Router-Dead-Interval
   values:

   Network size: Two networks are considered.  Network 1 has 100
   nodes and 1200 links; the maximum number of neighbors per node is
   30 and the maximum number of adjacencies per node is 50 (the same
   neighbor may have more than one adjacency).
   Network 2 has 50 nodes and 600 links; the maximum number of
   neighbors per node is 25 and the maximum number of adjacencies per
   node is 48.  The Dijkstra SPF calculation time is assumed to be
   100 ms for Network 1 and 70 ms for Network 2.

   LSA Type: Each node has 1 Router LSA (a total of 100 for Network 1
   and 50 for Network 2).  There are no Network LSAs, since all links
   are point-to-point links, and no Summary LSAs, since the network
   has only one area.  Regarding other LSA types, we consider two
   situations.  In Situation 1 we assume that there are no ASE LSAs
   and that each link has one "Link" LSA carrying traffic engineering
   information (a total of 2400 for Network 1 and 1200 for Network
   2).  In Situation 2 we assume that there are no "Link" LSAs, that
   half of the nodes are AS-Border nodes, and that each border node
   has 10 ASE LSAs (a total of 500 for Network 1 and 250 for Network
   2).  We identify Situation 1 as "Link LSAs" and Situation 2 as
   "ASE LSAs".

   LSA retransmission timer value: Two values are considered, 10
   seconds and 5 seconds (the default value).

   Processing time values: Processing times for LSU, Ack, and Hello
   packets have been expressed above in terms of a common parameter
   T.  Two values are considered for T: 1 ms and 0.5 ms.

   Hello/Router-Dead-Interval: It is assumed that the Router-Dead
   interval is four times the Hello interval.  In one case the Hello
   interval is 10 seconds and the Router-Dead interval 40 seconds
   (the default values); in the other case the Hello interval is 2
   seconds and the Router-Dead interval 8 seconds.

   Based on the network size, LSA type, timer, and processing time
   values, we define six test cases as follows:

   Case 1: Network 1, Link LSAs, retransmission timer = 10 sec.,
           T = 1 ms, Hello/Router-Dead-Interval = 10/40 sec.

   Case 2: Network 1, ASE LSAs, retransmission timer = 10 sec.,
           T = 1 ms, Hello/Router-Dead-Interval = 10/40 sec.

   Case 3: Network 1, Link LSAs, retransmission timer = 5 sec.,
           T = 1 ms, Hello/Router-Dead-Interval = 10/40 sec.

   Case 4: Network 1, Link LSAs, retransmission timer = 10 sec.,
           T = 0.5 ms, Hello/Router-Dead-Interval = 10/40 sec.

   Case 5: Network 1, Link LSAs, retransmission timer = 10 sec.,
           T = 1 ms, Hello/Router-Dead-Interval = 2/8 sec.

   Case 6: Network 2, Link LSAs, retransmission timer = 10 sec.,
           T = 1 ms, Hello/Router-Dead-Interval = 10/40 sec.

   For each case and for each priority scenario, we study the network
   stability as a function of the size of the LSA storm.  Stability
   is determined by looking at the number of non-converged LSUs as a
   function of time.  An example is shown in Table 1 for Case 1 and
   Priority scenario 1 (no priority to Hellos or Acks).
   ==========|====================================================
             |     Number of Non-Converged LSUs in the Network
   LSA       |                 at Time (in seconds)
   STORM     |====|====|====|====|====|====|====|====|=====
   SIZE      | 10 | 20 | 30 | 35 | 40 | 50 | 60 | 80 | 100
   ==========|====|====|====|====|====|====|====|====|=====
   100       |  0 |  0 | 24 | 29 | 24 |  1 |  0 |  1 |   1
   (Stable)  |    |    |    |    |    |    |    |    |
   ----------|----|----|----|----|----|----|----|----|-----
   140       |  0 |  0 | 35 | 48 | 46 | 27 | 14 |  1 |   1
   (Stable)  |    |    |    |    |    |    |    |    |
   ----------|----|----|----|----|----|----|----|----|-----
   160       |  0 |  0 | 38 | 57 | 55 | 40 | 26 | 65 | 203
   (Unstable)|    |    |    |    |    |    |    |    |
   ==========|====|====|====|====|====|====|====|====|=====

            Table 1: Network Stability vs. LSA Storm Size
                (Case 1, no priority to Hellos/Acks)

   The LSA storm starts a little after 20 seconds, so for some period
   after that the number of non-converged LSUs should stay high and
   then, for a stable network, come back down.  This happens for LSA
   storms of sizes 100 and 140.  With an LSA storm of size 160, the
   number of non-converged LSUs stays high indefinitely, due to
   repeated retransmissions, link failures caused by Hellos being
   missed for more than the Router-Dead interval (which generates
   additional LSAs), and subsequent link recoveries (which again
   generate additional LSAs).  We define the network stability
   threshold as the maximum allowable LSA storm size for which the
   number of non-converged LSUs comes back down to a low level after
   some time.  It turns out that for this example the stability
   threshold is 150.

   The network behavior as a function of the LSA storm size can be
   categorized as follows:

   (1) If the LSA storm is well below the stability threshold, the
       CPU/memory congestion lasts only for a short period; during
       this period there are very few retransmissions, very few
       dropped OSPF packets, and no link failures due to missed
       Hellos.  This type of LSA storm is observed routinely in
       operational networks, and networks recover from it easily.

   (2) If the LSA storm is just below the stability threshold, the
       CPU/memory congestion lasts for a longer period, and during
       this period there may be a considerable number of
       retransmissions and dropped OSPF packets.  If Hello packets
       are not given priority, there may also be some link failures
       due to missed Hellos.  However, the network does eventually
       return to a stable state.  This type of LSA storm may happen
       rarely in operational networks, and networks recover from it
       with some difficulty.

   (3) If the LSA storm is above the stability threshold, the
       CPU/memory congestion may last indefinitely unless some
       special procedure for relieving congestion is followed.
       During this period there are a considerable number of
       retransmissions and dropped OSPF packets.  If Hello packets
       are not given priority, there would also be link failures due
       to missed Hellos.  This type of LSA storm may happen very
       rarely in operational networks, and usually some manual
       procedure, such as taking down adjacencies at heavily
       congested nodes, is needed.

   (4) If Hello packets are given priority, the network stability
       threshold increases, i.e., the network can withstand a larger
       LSA storm.
       Furthermore, even if the network operates at or somewhat above
       this higher stability threshold, Hellos are still not missed,
       and so there are no link failures.  So even if there is
       congestion in the control plane due to increased
       retransmissions, requiring some special procedures for
       congestion reduction, the data plane remains unaffected.

   (5) If both Hello and Acknowledgement packets are given priority,
       the stability threshold increases even further.

   In Table 2 we show the network stability threshold for the six
   different cases and for the three priority scenarios defined
   earlier.

   |========|================|==================|===================|
   |        |          Maximum Allowable LSA Storm Size For         |
   |  Case  |================|==================|===================|
   | Number | No Priority to | Priority to      | Priority to Hello |
   |        | Hello or Ack   | Hello Only       | and Ack           |
   |========|================|==================|===================|
   | Case 1 |      150       |       190        |        250        |
   |--------|----------------|------------------|-------------------|
   | Case 2 |      185       |       215        |        285        |
   |--------|----------------|------------------|-------------------|
   | Case 3 |      115       |       127        |        170        |
   |--------|----------------|------------------|-------------------|
   | Case 4 |      320       |       375        |        580        |
   |--------|----------------|------------------|-------------------|
   | Case 5 |      120       |       175        |        225        |
   |--------|----------------|------------------|-------------------|
   | Case 6 |      185       |       224        |        285        |
   |========|================|==================|===================|

       Table 2: Maximum Allowable LSA Storm for a Stable Network

   We also considered one more scenario, with priority to Hello and
   Ack packets and with a truncated binary exponential backoff of the
   retransmission interval with an upper limit of 40 seconds (for the
   same LSA, each successive retransmission interval is doubled, but
   not beyond 40 seconds).  The maximum allowable LSA storm size for
   this scenario significantly exceeded the numbers given in the
   third column (priority to Hello and Ack).

Appendix B.3. Observations on Simulation Results

   Table 2 shows that in all cases prioritizing Hello packets
   increases the network stability threshold, and that, in addition,
   prioritizing LSA Acknowledgment packets increases the stability
   threshold even further.  The reasons for these observations are as
   follows.  The main sources of sustained CPU/memory congestion (the
   positive feedback loop) following an LSA storm are (1) LSA
   retransmissions and (2) links being declared down due to missed
   Hellos, which in turn causes further LSA generation, with the
   later recovery of those links causing even more LSA generation.
   Prioritizing Hello packets practically eliminates the second
   source of congestion.  Prioritizing Acknowledgements significantly
   reduces the first source of congestion, i.e., LSA retransmissions.
   It is to be noted that retransmissions cannot be completely
   eliminated, for the following reasons.  Firstly, only explicit
   Acknowledgments are prioritized; duplicate LSAs carrying implicit
   Acknowledgments are still served at the lower priority.  Secondly,
   LSAs may get greatly delayed or dropped at the input queue of
   receivers, and then Acknowledgments may never be generated, in
   which case prioritizing Acks would not help.
   Another factor to keep in mind is that, since Hellos and Acks are
   prioritized, LSAs see larger delays and a greater potential for
   being dropped.  However, the simulation results show that, on the
   whole, prioritizing Hellos and LSA Acks is always beneficial and
   significantly improves the network stability threshold.

   As stated in Section B.2, exponential backoff of the LSA
   retransmission interval further increases the network stability
   threshold.

   Our simulation study also showed that in each of the cases, if,
   instead of prioritizing Hello packets, we treat any packet
   received over a link as a surrogate for a Hello packet (an
   implicit Hello), we get about the same stability threshold as with
   prioritizing Hello packets.

Appendix C. Other Proposals

   (1) Explicit Marking: In Section 2 we proposed that OSPF packets
       be classified into "high" and "low" priority classes by
       examining the OSPF packet header.  In some cases (particularly
       at the receiver) this examination may be computationally
       costly.  An alternative would be the use of different TOS
       (DSCP) markings for high- and low-priority OSPF packets
       respectively.  The exact specification of this marking is for
       further study.  (A sketch of this idea is given after this
       list.)

   (2) Other High-Priority OSPF Packets: Besides the packets
       designated as high priority in Section 2, there may be other
       packets that need a high-priority designation.  One example is
       the Database Description (DBD) packet from a slave (during the
       database synchronization process) that is used as an
       acknowledgement.  A second example is an LSA carrying
       intra-area topology change information (this may trigger an
       SPF calculation and rerouting of Label Switched Paths, so fast
       processing of this packet may improve OSPF/LDP convergence
       times).
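   As a non-normative illustration of the explicit marking idea in
   item (1), the Python sketch below sets the TOS/DSCP byte on a raw
   OSPF socket.  The CS6/CS7 code points are example values only,
   since the exact marking is left for further study.

      import socket

      OSPF_PROTO = 89  # OSPF runs directly over IP, protocol 89

      # Example (non-normative) code points; DSCP occupies the upper
      # six bits of the TOS byte.  CS6 (DSCP 48) is commonly used for
      # network control; CS7 (DSCP 56) is shown here merely to
      # distinguish the "high priority" OSPF class.
      TOS_LOW = 48 << 2    # CS6
      TOS_HIGH = 56 << 2   # CS7, hypothetical "high priority" marking

      def open_ospf_socket(high_priority):
          """Open a raw OSPF socket (requires root privileges) whose
          outgoing packets carry the chosen TOS/DSCP marking, so that
          receivers can queue Hellos/Acks ahead of other OSPF packets
          without parsing the OSPF header."""
          s = socket.socket(socket.AF_INET, socket.SOCK_RAW,
                            OSPF_PROTO)
          tos = TOS_HIGH if high_priority else TOS_LOW
          s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
          return s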