idnits 2.17.1 

draft-ietf-ospf-scalability-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1.a on line 16.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 625.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 636.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 647.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 647.

  ** The document claims conformance with section 10 of RFC 2026, but uses
     some RFC 3978/3979 boilerplate.  As RFC 3978/3979 replaces section 10 of
     RFC 2026, you should not claim conformance with it if you have changed to
     using RFC 3978/3979 boilerplate.

  ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure
     Acknowledgement -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** The document seems to lack an RFC 3978 Section 5.4 Reference to BCP 78
     -- however, there's a paragraph with a matching beginning. Boilerplate
     error?

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.

  ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate
     instead of verbatim RFC 3978 boilerplate.  After 6 May 2005, submission
     of drafts without verbatim RFC 3978 boilerplate is not accepted.

     The following non-3978 patterns matched text found in the document. 
     That text should be removed or replaced:

        By submitting this Internet-Draft, each author represents that any
        applicable patent or other IPR claims of which he or she is aware
        have been or will be disclosed, and any of which he or she
        becomes aware will be disclosed, in accordance with Section 6 of
        BCP 79.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack an Authors' Addresses Section.

  ** There are 3 instances of too long lines in the document, the longest one
     being 1 character in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Best Current Practice
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'Ref6-Ref9' is mentioned on line 432, but not defined

  == Unused Reference: 'Ref6' is defined on line 308, but no explicit
     reference was found in the text

  == Unused Reference: 'Ref7' is defined on line 311, but no explicit
     reference was found in the text

  == Unused Reference: 'Ref8' is defined on line 314, but no explicit
     reference was found in the text

  == Unused Reference: 'Ref9' is defined on line 317, but no explicit
     reference was found in the text

  == Unused Reference: 'Ref13' is defined on line 330, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2740 (ref. 'Ref2') (Obsoleted by RFC
     5340)


     Summary: 11 errors (**), 0 flaws (~~), 7 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	   Internet Engineering Task Force           Gagan L. Choudhury, Editor
3	   Internet Draft                                                  AT&T
4	   Expires in June, 2005
5	   Category: Best Current Practice                       December, 2004
6	   draft-ietf-ospf-scalability-09.txt

8	           Prioritized Treatment of Specific OSPF Version 2
9	           Packets and Congestion Avoidance

11	Status of this Memo

13	   By submitting this Internet-Draft, each author represents that any
14	   applicable patent or other IPR claims of which he or she is aware
15	   have been or will be disclosed, and any of which he or she becomes
16	   aware will be disclosed, in accordance with Section 6 of RFC 3668.

18	   This document is an Internet-Draft and is in full conformance
19	   with all provisions of Section 10 of RFC2026.

21	   Internet-Drafts are working documents of the Internet Engineering
22	   Task Force (IETF), its areas, and its working groups.  Note that
23	   other groups may also distribute working documents as Internet-
24	   Drafts.

26	   Internet-Drafts are draft documents valid for a maximum of six
27	   months and may be updated, replaced, or obsoleted by other documents
28	   at any time.  It is inappropriate to use Internet-Drafts as
29	   reference material or to cite them other than as "work in progress."

31	   The list of current Internet-Drafts can be accessed at
32	        http://www.ietf.org/ietf/1id-abstracts.html
33	   The list of Internet-Draft Shadow Directories can be accessed at
34	        http://www.ietf.org/shadow.html.
35	   Distribution of this memo is unlimited.

37	Abstract

39	   This document recommends methods that are intended to improve the
40	   scalability and stability of large networks using OSPF (Open Shortest
41	   Path First) Version 2 protocol.  The methods include processing
42	   OSPF Hellos and LSA (Link State Advertisement) Acknowledgments at a
43	   higher priority compared to other OSPF packets, and other congestion
44	   avoidance procedures.

46	Table of Contents

48	   1. Introduction
49	   2. Recommendations
50	   3. Security Considerations
51	   4. Acknowledgments
52	   5. Normative Reference
53	   6. Informative References
54	   7. Contributing Authors and their Addresses
55	   Appendix A. LSA Storm: Causes and Impact
56	   Appendix B. List of Variables and Values
57	   Appendix C. Other Recommendations and Suggestions

59	1. Introduction

61	   In this document, when we refer to OSPF we mean OSPFv2 [Ref1].
62	   The scalability and stability improvement techniques described here
63	   may also apply to OSPFv3 [Ref2] but that will require further study
64	   and operational experience.

66	   A large network running the OSPF protocol may occasionally
67	   experience the simultaneous or near-simultaneous update of a large
68	   number of link-state advertisements, or LSAs.  This is particularly
69	   true if the OSPF traffic engineering extension [Ref3] is used, which
70	   may significantly increase the number of LSAs in the network.
71	   We call this event an LSA storm; it may be initiated by an
72	   unscheduled failure or a scheduled maintenance event.
73	   The failure may be hardware, software, or procedural in nature.

75	   The LSA storm causes high CPU and memory utilization at the router
76	   causing incoming packets to be delayed or dropped.
77	   Delayed acknowledgments (beyond the retransmission timer value)
78	   result in retransmissions, and delayed Hello packets (beyond the
79	   router-dead interval) result in neighbor adjacencies being declared
80	   down. The retransmissions and additional LSA originations result in
81	   further CPU and memory usage, essentially causing a positive feedback
82	   loop, which, in the extreme case, may drive the network to an
83	   unstable state.

85	   The default value of the retransmission timer is 5 seconds and that
86	   of the router-dead interval is 40 seconds.  However, recently there
87	   has been a lot of interest in significantly reducing OSPF convergence
88	   time. As part of that plan much shorter (sub-second) Hello and
89	   router-dead intervals have been proposed [Ref4].  In such a scenario
90	   it will be more likely for Hello packets to be delayed beyond
91	   the router-dead interval during network congestion
92	   caused by an LSA storm.

94	   In order to improve the scalability and stability of networks we
95	   recommend steps for prioritizing critical OSPF packets and avoiding
96	   congestion. The details of the recommendations are given in Section
97	   2.  A simulation study is reported in [Ref14] that quantifies the
98	   congestion phenomenon and its impact.  It also studies several of the
99	   recommendations and shows that they indeed improve the scalability
100	   and stability of networks using OSPF protocol.  [Ref14] is available
101	   on request by contacting the editor or one of the authors.

103	   Appendix A explains LSA storm scenarios and their impact in more
104	   detail, and points out a few real-life examples of control-
105	   message storms.  Appendix B provides a list of variables used in the
106	   recommendations and their example values.  Appendix C provides
107	   some further recommendations and suggestions with similar goals.

109	2. Recommendations

111	   The Recommendations below are intended to improve the scalability
112	   and stability of large networks using OSPF protocol.  During
113	   periods of network congestion they would reduce retransmissions,
114	   keep an adjacency from being declared down because Hello packets
115	   are delayed beyond the RouterDeadInterval, and take other
116	   congestion avoidance steps.  The recommendations are unordered
117	   except that Recommendation 2 is to be implemented only if
118	   Recommendation 1 is not implemented.

120	   (1) Classify all OSPF packets into two classes: a "high priority"
121	       class comprising OSPF Hello packets and Link State
122	       Acknowledgment packets, and a "low priority" class
123	       comprising all other packets. The classification is
124	       accomplished by examining the OSPF packet header. While
125	       receiving a packet from a neighbor and while transmitting
126	       a packet to a neighbor, try to process a "high priority"
127	       packet ahead of a "low priority" packet.

129	       The prioritized processing while transmitting may cause OSPF
130	       packets from a neighbor to be received out of sequence.
131	       If Cryptographic Authentication (AuType = 2) is used (as
132	       specified in [Ref1]) then successive received valid OSPF packets
133	       from a neighbor need to have a non-decreasing "Cryptographic
134	       sequence number".  To comply with this requirement we recommend
135	       that in case Cryptographic Authentication (AuType = 2) is used
136	       [Ref1], prioritized processing not be done at the transmitter.
137	       This will avoid packets arriving at the receiver out of sequence.
138	       However, after security processing at the receiver (including
139	       sequence number checking) is complete, the OSPF packets may be
140	       kept in a "high-priority" queue or a "low-priority" queue based
141	       on their class and processed accordingly.  The benefit of
142	       prioritized processing is clearly higher in the absence of
143	       Cryptographic Authentication since in that case prioritization
144	       can be implemented both at the transmitter and at the receiver.
145	       However, even with Cryptographic Authentication it will be
146	       beneficial to have prioritization only at the receiver (following
147	       security processing).
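
	       As an illustration only, the receiver-side part of
	       Recommendation 1 might be sketched as follows in Python.  The
	       class PriorityReceiver and its method names are hypothetical
	       and are not taken from any OSPF implementation.  The sketch
	       assumes strict priority between the two queues, and assumes
	       that authentication and sequence number checking have already
	       been completed before enqueue() is called.  The packet type
	       codes are those of the OSPF packet header in [Ref1].

```python
from collections import deque

# OSPFv2 packet types carried in the Type field of the OSPF packet header.
HELLO, DB_DESCRIPTION, LS_REQUEST, LS_UPDATE, LS_ACK = 1, 2, 3, 4, 5
HIGH_PRIORITY_TYPES = {HELLO, LS_ACK}


def is_high_priority(packet: bytes) -> bool:
    """Classify by the Type field, the second octet of the OSPF header."""
    if len(packet) < 2:
        raise ValueError("truncated OSPF header")
    return packet[1] in HIGH_PRIORITY_TYPES


class PriorityReceiver:
    """Hypothetical two-queue receiver; all names are illustrative."""

    def __init__(self):
        self.high = deque()
        self.low = deque()

    def enqueue(self, packet: bytes) -> None:
        # Assumption: security processing has already accepted the packet.
        queue = self.high if is_high_priority(packet) else self.low
        queue.append(packet)

    def next_packet(self):
        # Serve the high-priority queue first, then the low-priority one.
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None
```

	       In this sketch a Hello or Link State Acknowledgment that
	       arrives behind a burst of Link State Update packets is still
	       processed first, which is the effect Recommendation 1 aims for.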

149	   (2) If Recommendation 1 cannot be implemented, then reset the
150	       inactivity timer for an adjacency whenever any OSPF unicast
151	       packet or any OSPF packet sent to AllSPFRouters over a
152	       point-to-point link is received over that adjacency instead of
153	       resetting the inactivity timer only on receipt of the
154	       Hello packet.  OSPF would then declare the adjacency to be down
155	       only if neither OSPF unicast packets nor OSPF packets sent to
156	       AllSPFRouters over a point-to-point link are received over
157	       that adjacency for a period equaling or exceeding the
158	       RouterDeadInterval.  The reason for not recommending this
159	       proposal in conjunction with Recommendation 1 is to avoid
160	       potential undesirable side effects.  One such effect is the
161	       delay in discovering the down status of
162	       an adjacency in a case where no high priority Hello packets are
163	       being received but the inactivity timer is being reset by other
164	       stale packets in the low priority queue.
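
	       A minimal Python sketch of the timer handling in
	       Recommendation 2 is given below, for illustration only.  The
	       Adjacency class, its fields, and the boolean arguments of
	       on_packet() are hypothetical; how a real implementation learns
	       whether a packet was unicast or was sent to AllSPFRouters over
	       a point-to-point link is outside the scope of the sketch.
	       Times are assumed to be in seconds.

```python
class Adjacency:
    """Hypothetical adjacency state; times are in seconds."""

    def __init__(self, router_dead_interval: float, now: float):
        self.router_dead_interval = router_dead_interval
        self.last_heard = now

    def on_packet(self, now: float, is_hello: bool, unicast: bool,
                  all_spf_routers_p2p: bool) -> None:
        # Recommendation 2: besides a Hello, any unicast OSPF packet or any
        # OSPF packet sent to AllSPFRouters over a point-to-point link
        # resets the inactivity timer.
        if is_hello or unicast or all_spf_routers_p2p:
            self.last_heard = now

    def is_down(self, now: float) -> bool:
        # Down once nothing qualifying has been received for a period
        # equaling or exceeding the RouterDeadInterval.
        return now - self.last_heard >= self.router_dead_interval
```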

166	   (3) Use an exponential backoff algorithm for determining the value
167	       of the LSA retransmission interval (RxmtInterval).  Let R(i)
168	       represent the RxmtInterval value used during the i-th
169	       retransmission of an LSA.  Use the following algorithm to
170	       compute R(i)

172	                    R(1) = Rmin
173	                    R(i+1) = Min(K*R(i),Rmax)  for i>=1

175	       where K, Rmin and Rmax are constants and the function
176	       Min(.,.) represents the minimum value of its two arguments.
177	       Example values for K, Rmin and Rmax may be 2, 5 seconds
178	       and 40 seconds respectively.  Note that the example value for
179	       Rmin, the initial retransmission interval, is the same as the
180	       sample value of RxmtInterval in [Ref1].

182	       This recommendation is motivated by the observation that during
183	       a network congestion event caused by control messages, a major
184	       source for sustaining the congestion is the repeated
185	       retransmission of LSAs.  The use of an exponential backoff
186	       algorithm for the LSA retransmission interval reduces the rate
187	       of LSA retransmissions while the network experiences
188	       congestion (during which it is more likely that multiple
189	       retransmissions of the same LSA would happen).  This in turn
190	       helps the network get out of the congested state.
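
	       The recurrence for R(i) can be sketched in Python as follows.
	       The function name rxmt_intervals and its argument names are
	       hypothetical; the defaults are the example values K = 2,
	       Rmin = 5 seconds and Rmax = 40 seconds given above.

```python
def rxmt_intervals(count: int, k: float = 2.0, r_min: float = 5.0,
                   r_max: float = 40.0) -> list:
    """Return [R(1), ..., R(count)] in seconds.

    R(1) = Rmin and R(i+1) = min(K * R(i), Rmax) for i >= 1.
    """
    intervals = []
    r = r_min
    for _ in range(count):
        intervals.append(r)
        r = min(k * r, r_max)
    return intervals
```

	       With the example values, the first five retransmission
	       intervals are 5, 10, 20, 40 and 40 seconds, so the interval
	       stops growing once it reaches Rmax.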

192	   (4) Implicit Congestion Detection and Action Based on That:
193	       If there is control message congestion at a router, its
194	       neighbors do not know about that explicitly.  However, they
195	       can implicitly detect it based on the number of unacknowledged
196	       LSAs to this router.  If this number exceeds a certain "high
197	       water mark" then the rate at which LSAs are sent to this router
198	       should be reduced progressively using an exponential backoff
199	       mechanism but not below a certain minimum rate.  At a future
200	       time, if the number of unacknowledged LSAs to this router falls
201	       below a certain "low water mark" then the rate of sending
202	       LSAs to this router should be increased progressively, again
203	       using an exponential backoff mechanism but not above a certain
204	       maximum rate.  The whole algorithm is given below.  It is to be
205	       noted that this algorithm is to be applied independently to each
206	       neighbor and only for unicast LSAs sent to a neighbor or LSAs
207	       sent to AllSPFRouters over a point-to-point link.

209	       Let,
210	       U(t) = Number of unacknowledged LSAs to neighbor at time t.
211	       H = A high water mark (in units of number of unacknowledged LSAs)
212	       L = A low water mark (in units of number of unacknowledged LSAs)
213	       G(t) = Gap between sending successive LSAs to neighbor at time t.
214	       F = The factor by which the above gap is to be increased during
215	           congestion and decreased after coming out of congestion.
216	       T = Minimum time that has to elapse before the existing gap
217	           is considered for change.
218	       Gmin = Minimum allowed value of gap.
219	       Gmax = Maximum allowed value of gap.

221	       The equation below shows how the gap is to be changed after a
222	       time T has elapsed since the last change:
223	                 _
224	                |
225	                | Min(F*G(t),Gmax)   if U(t+T) > H
226	       G(t+T) = | G(t)               if H >= U(t+T) >= L
227	                | Max(G(t)/F,Gmin)   if U(t+T) < L
228	                |_

230	       Min(.,.) and Max(.,.) represent the minimum and maximum values
231	       of the two arguments respectively.
232	       Example values for the various parameters of the algorithm are
233	       as follows: H = 20, L = 10, F = 2, T = 1 second, Gmin = 20 ms,
234	       Gmax = 1 second.

236	       Recommendations 3 and 4 both slow down LSAs to congested
237	       neighbors based on implicitly detecting the congestion but
238	       they have important differences. Recommendation 3 progressively
239	       slows down successive retransmissions of the same LSA whereas
240	       Recommendation 4 progressively slows down all LSAs (new or
241	       retransmission) to a congested neighbor.
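
	       The gap update of Recommendation 4 might be sketched in Python
	       as shown below.  The function name next_gap and its argument
	       names are hypothetical; the defaults are the example values
	       H = 20, L = 10, F = 2, Gmin = 20 ms and Gmax = 1 second.  The
	       sketch assumes the caller invokes it once per neighbor each
	       time the interval T has elapsed since the last change.

```python
def next_gap(gap: float, unacked: int, high: int = 20, low: int = 10,
             factor: float = 2.0, g_min: float = 0.020,
             g_max: float = 1.0) -> float:
    """Return G(t+T) in seconds, given G(t) and U(t+T) for one neighbor."""
    if unacked > high:
        # Congestion suspected: widen the gap, but not beyond Gmax.
        return min(factor * gap, g_max)
    if unacked < low:
        # Congestion has eased: narrow the gap, but not below Gmin.
        return max(gap / factor, g_min)
    # Between the low and high water marks the gap is left unchanged.
    return gap
```

	       For example, with 25 unacknowledged LSAs a gap of 20 ms becomes
	       40 ms, while with 5 unacknowledged LSAs a gap of 0.5 second
	       becomes 0.25 second.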

243	   (5) Throttling Adjacencies to be Brought Up Simultaneously:
244	       If a router tries to bring up a large number of adjacencies to
245	       its neighbors simultaneously then that may cause severe
246	       congestion due to database synchronization and LSA flooding
247	       activities.  It is recommended that during such a situation
248	       no more than "n" adjacencies should be brought up
249	       simultaneously.  Once a subset of adjacencies have been brought
250	       up successfully, newer adjacencies may be brought up as long as
251	       the number of simultaneous adjacencies being brought up does not
252	       exceed "n". The appropriate value of "n" would depend on the
253	       router processing power, total bandwidth available for control
254	       plane traffic and propagation delay.
255	       The value of "n" should be configurable.

257	       In the presence of throttling, an important issue is the order
258	       in which adjacencies are to be formed.  We recommend a First
259	       Come First Served (FCFS) policy based on the order in which the
260	       request for adjacency formation arrives.  Requests may either be
261	       from neighbors or self-generated. Among the self-generated
262	       requests a priority list may be used to decide the order in which
263	       the requests are to be made.  However, once an adjacency
264	       formation process starts it is not to be preempted except
265	       for unusual circumstances such as errors or time-outs.
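
	       One possible shape for the throttling in Recommendation 5 is
	       sketched below in Python, for illustration only.  The class
	       AdjacencyThrottle and its methods are hypothetical.  The sketch
	       assumes that requests from neighbors and self-generated
	       requests are passed to request() in arrival order, and that
	       finished() is called when an adjacency comes up or when its
	       formation ends because of an error or time-out.

```python
from collections import deque


class AdjacencyThrottle:
    """Hypothetical FCFS limiter for adjacencies being brought up."""

    def __init__(self, n: int):
        self.n = n                  # configurable limit "n"
        self.pending = deque()      # waiting requests, in arrival order
        self.in_progress = set()    # adjacencies currently being formed

    def request(self, neighbor: str) -> None:
        self.pending.append(neighbor)
        self._start_waiting()

    def finished(self, neighbor: str) -> None:
        self.in_progress.discard(neighbor)
        self._start_waiting()

    def _start_waiting(self) -> None:
        # No preemption: a waiting request starts only when a slot is free.
        while self.pending and len(self.in_progress) < self.n:
            self.in_progress.add(self.pending.popleft())
```

	       A priority list for self-generated requests, as mentioned
	       above, would only affect the order in which request() is
	       called; it would not change the limiter itself.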

267	   In some of the Recommendations above we refer to point-to-point links.
268	   Those references should also include cases where a broadcast network
269	   is to be treated as a point-to-point connection from the standpoint of
270	   IP routing [Ref5].

272	3. Security Considerations

274	   This memo does not create any new security issues for the OSPF
275	   protocol.

277	4. Acknowledgments

279	   We would like to acknowledge the support and helpful comments from
280	   OSPF WG chairs Rohit Dube, Acee Lindem, John Moy, Routing Area
281	   directors Alex Zinin and Bill Fenner, and IESG reviewers.  We
282	   acknowledge Vivek Dube,  Mitchell Erblich, Mike Fox, Tony
283	   Przygienda, and Krishna Rao for comments on previous versions of
284	   the draft.  We also acknowledge Margaret Chiosi, Elie Francis,
285	   Jeff Han, Beth Munson, Roshan Rao, Moshe Segal, Mike Wardlow, and
286	   Pat Wirth for collaboration and encouragement in our scalability
287	   improvement efforts for Link-State-Protocol based networks.

289	5. Normative Reference

291	   [Ref1] J. Moy, "OSPF Version 2", RFC 2328, April, 1998.

293	   [Ref2] R. Coltun, D. Ferguson and J. Moy, "OSPF For IPV6",
294	   RFC 2740, December, 1999.

296	6. Informative References

298	   [Ref3] D. Katz, K. Kompella, D. Yeung "Traffic Engineering (TE)
299	   Extensions to OSPF Version 2," RFC 3630, September, 2003.

301	   [Ref4] C. Alaettinoglu, V. Jacobson and H. Yu, "Towards Milli-
302	   second IGP Convergence," Work in Progress.

304	   [Ref5] N. Shen, A. Lindem, J. Yuan, A. Zinin, R. White and S. Previdi,
305	   "Point-to-point operation over LAN in link-state routing protocols,"
306	   Work in Progress.

308	   [Ref6] Pappalardo, D., "AT&T, customers grapple with ATM net
309	   outage," Network World, February 26, 2001.

311	   [Ref7] "AT&T announces cause of frame-relay network outage," AT&T
312	   Press Release, April 22, 1998.

314	   [Ref8] Cholewka, K., "MCI Outage Has Domino Effect," Inter@ctive
315	   Week, August 20, 1999.

317	   [Ref9] Jander, M., "In Qwest Outage, ATM Takes Some Heat," Light
318	   Reading, April 6, 2001.

320	   [Ref10] A. Zinin and M. Shand, "Flooding Optimizations in Link-State
321	   Routing Protocols," Work in Progress.

323	   [Ref11] P. Pillay-Esnault, "OSPF Refresh and flooding reduction in
324	   stable topologies," Work in progress.

326	   [Ref12] G. Ash, G. Choudhury, V. Sapozhnikova, M. Sherif, A.
327	   Maunder, V. Manral, "Congestion Avoidance & Control for OSPF
328	   Networks", Work in Progress.

330	   [Ref13] B. M. Waxman, "Routing of Multipoint Connections," IEEE
331	   Journal on Selected Areas in Communications, 6(9):1617-1622, 1988.

333	   [Ref14] G. Choudhury, G. Ash, V. Manral, A. Maunder and V.
334	   Sapozhnikova, "Prioritized Treatment of Specific OSPF Packets
335	   and Congestion Avoidance: Algorithms and Simulations," AT&T
336	   Technical Report, August, 2003.

338	   [Ref15] K. Nichols, S. Blake, F. Baker and D. Black, "Definition of
339	   the Differentiated Services Field (DS Field) in the IPV4 and IPV6
340	   Headers", RFC 2474, December, 1998.

342	7. Contributing Authors and their Addresses

344	   In addition to the Editor, several people contributed to this
345	   document.  The names and contact information of all authors
346	   are given below.

348	   Gagan L. Choudhury                     Anurag S. Maunder
349	   AT&T                                   Erlang Technology
350	   Room D5-3C21                           2880 Scott Boulevard
351	   200 Laurel Avenue                      Santa Clara, CA 95052
352	   Middletown, NJ, 07748                  USA
353	   USA                                    Phone: (408)420-7617
354	   Phone: (732)420-3721                   email: anuragm@erlangtech.com
355	   email: gchoudhury@att.com

357	   Gerald R. Ash                          Vera D. Sapozhnikova
358	   AT&T                                   AT&T
359	   Room D5-2A01                           Room C5-2C29
360	   200 Laurel Avenue                      200 Laurel Avenue
361	   Middletown, NJ, 07748                  Middletown, NJ, 07748
362	   USA                                    USA
363	   Phone: (732)420-4578                   Phone: (732)420-2653
364	   email: gash@att.com                    email: sapozhnikova@att.com

366	   Vishwas Manral
367	   Sinett Semiconductors,
368	   Infantry Road,
369	   Bangalore 500 081
370	   India
371	   email: vishwas@sinett.com

373	Appendix A. LSA Storm: Causes and Impact

375	   An LSA storm may be initiated for many reasons.  Here
376	   are some examples:

378	   (a) one or more link failures due to fiber cuts,

380	   (b) one or more router failures for some reason, e.g., software
381	       crash or some type of disaster (including power outage)
382	       in an office complex hosting many routers,

384	   (c) Link/router flapping,

386	   (d) requirement of taking down and later bringing back many
387	       routers during a software/hardware upgrade,

389	   (e) near-synchronization of the periodic 1800 second LSA refreshes
390	       of a subset of LSAs,

392	   (f) refresh of all LSAs in the system during a change in software
393	       version,

395	   (g) injecting a large number of external routes to OSPF due to
396	       a procedural error,

398	   (h) Router ID changes causing a large number of LSA re-originations
399	       (possibly LSA purges as well depending on the implementation).

401	   In addition to the LSAs originated as a direct result of link/router
402	   failures, there may be other indirect LSAs as well.  One example in
403	   MPLS networks is traffic engineering LSAs [Ref3] originated at other
404	   links as a result of significant change in reserved bandwidth
405	   resulting from rerouting of Label Switched Paths (LSPs) that went
406	   down during the link/router failure.
407	   The LSA storm causes high CPU and memory utilization at the router
408	   processor causing incoming packets to be delayed or dropped.
409	   Delayed acknowledgments (beyond the retransmission timer value)
410	   result in retransmissions, and delayed Hello packets (beyond the
411	   Router-Dead interval) result in links being declared down.  A
412	   trunk-down event causes Router LSA origination by its end-point
413	   routers.  If traffic engineering LSAs are used for each link then
414	   LSAs of that type would also be originated by the end-point routers
415	   and potentially elsewhere as well due to significant changes in
416	   reserved bandwidths at other links caused by the failure and reroute
417	   of LSPs originally using the failed trunk.  Eventually, when the
418	   link recovers, that would also trigger additional Router LSAs and
419	   traffic engineering LSAs.

421	   The retransmissions and additional LSA originations result in further
422	   CPU and memory usage, essentially causing a positive feedback loop.
423	   We define the LSA storm size as the number of LSAs in the original
424	   storm, not counting any additional LSAs resulting from the
425	   feedback loop described above.  If the LSA storm is too large then
426	   the positive feedback loop mentioned above may be large enough to
427	   indefinitely sustain high CPU and memory utilization at many
428	   routers in the network, thereby driving the network to an unstable
429	   state. In the past, network
430	   outage events have been reported in IP and ATM networks using
431	   link-state protocols such as OSPF, IS-IS, PNNI or some proprietary
432	   variants.  See for example [Ref6-Ref9].  In many of these examples,
433	   large-scale flooding of LSAs or other similar control messages
434	   (either naturally or triggered by some bug or inappropriate
435	   procedure) has been partly or fully responsible for network
436	   instability and outage.

438	   In [Ref14] a simulation model is used to show that there
439	   is a certain LSA storm size threshold above which the
440	   network may show unstable behavior caused by a large number of
441	   retransmissions, link failures due to missed Hello packets and
442	   subsequent link recoveries.  It is also shown
443	   that the LSA storm size causing instability may be substantially
444	   increased by providing prioritized treatment to Hello and LSA
445	   Acknowledgment packets and by using an exponential backoff
446	   algorithm for determining the LSA retransmission interval.
447	   If it is not possible to prioritize Hello packets then resetting
448	   the inactivity timer on receiving any valid OSPF packets can also
449	   provide the same benefit. Furthermore, if we prioritize Hello
450	   packets then even when the network operates somewhat above the
451	   stability threshold, links are not declared down due to missed
452	   Hellos.  This implies that even though there is
453	   control plane congestion due to many retransmissions, the data plane
454	   stays up and no new LSAs are originated (besides the ones in the
455	   original storm and the refreshes).  These observations support
456	   the first three recommendations in Section 2. The authors of this
457	   draft have also done simulations to verify that the other
458	   recommendations in Section 2 help avoid congestion and allow a
459	   graceful exit from a congested state.

461	   One might argue that the scalability issue of large networks should
462	   be solved solely by dividing the network hierarchically into
463	   multiple areas so that flooding of LSAs remains localized within
464	   areas.  However, this approach increases the network management
465	   and design complexity and may result in less optimal routing between
466	   areas. Also, ASE LSAs are flooded throughout the AS and it may be
467	   a problem if there are large numbers of them.  Furthermore,
468	   a large number of summary LSAs may need to be flooded across
469	   areas, and their numbers would increase significantly if
470	   multiple Area Border Routers are employed for the purpose of
471	   reliability. Thus it is important to allow the network to grow
472	   towards as large a size as possible under a single area.

474	   The recommendations in the draft are synergistic with a broader set
475	   of scalability and stability improvement proposals. [Ref10] proposes
476	   flooding overhead reduction in case more than one interface goes to
477	   the same neighbor.  [Ref11] proposes a mechanism for
478	   greatly reducing LSA refreshes in stable topologies.

480	   [Ref12] proposes a wide range of congestion control and failure
481	   recovery mechanisms (some of those ideas are covered in this
482	   draft but [Ref12] has other ideas not covered here).

484	Appendix B. List of Variables and Values

486	   F    = The factor by which the gap between sending successive LSAs to
487	          a neighbor is to be increased during congestion and decreased
488	          after coming out of congestion (used in Recommendation 4).
489	          Example value is 2.

491	   G(t) = Gap between sending successive LSAs to a neighbor at time t
492	          (used in Recommendation 4).

494	   Gmax = Maximum allowed value of gap between sending successive LSAs
495	          to a neighbor (used in Recommendation 4). Example value is 1
496	          second.

498	   Gmin = Minimum allowed value of gap between sending successive LSAs
499	          to a neighbor (used in Recommendation 4). Example value is
500	          20 ms.

502	   H    = A high water mark (in units of number of unacknowledged LSAs).
503	          Exceeding this mark would trigger a potential increase in the
504	          gap between sending successive LSAs to a neighbor
505	          (used in Recommendation 4). Example value is 20.

507	   K    = A multiplicative constant used in increasing the RxmtInterval
508	          value used during successive retransmissions of the same LSA
509	          (used in Recommendation 3). Example value is 2.

511	   L    = A low water mark (in units of number of unacknowledged LSAs).
512	          Dropping below this mark would trigger a potential decrease
513	          in the gap between sending successive LSAs to a neighbor
514	          (used in Recommendation 4). Example value is 10.

516	   n    = Upper limit on the number of adjacencies to be brought up
517	          simultaneously (used in Recommendation 5).

519	   R(i) = RxmtInterval value used during the i-th retransmission of
520	          an LSA (used in Recommendation 3).

522	   Rmax = The maximum allowed value of RxmtInterval (used in
523	          Recommendation 3). Example value is 40 seconds.

525	   Rmin = The minimum allowed value of RxmtInterval (used in
526	          Recommendation 3). Example value is 5 seconds.

528	   T    = Minimum time that has to elapse before the existing gap
529	          between sending successive LSAs to a neighbor
530	          is considered for change (used in Recommendation 4). Example
531	          value is 1 second.

533	   U(t) = Number of unacknowledged LSAs to a neighbor at time t
534	          (used in Recommendation 4).
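
   The following Python sketch illustrates how the variables above might
   fit together.  It assumes that RxmtInterval starts at Rmin and is
   multiplied by K on each retransmission up to Rmax, and that the gap
   G(t) is multiplied or divided by F, within [Gmin, Gmax], each time T
   elapses, depending on how U(t) compares with H and L.  The function
   names are hypothetical and the constants are the example values listed
   above; Recommendations 3 and 4 remain the authoritative description.

```python
# Example values from Appendix B (times in seconds).
K, R_MIN, R_MAX = 2, 5.0, 40.0      # Recommendation 3
F, G_MIN, G_MAX = 2, 0.020, 1.0     # Recommendation 4
H, L = 20, 10                       # water marks, in unacknowledged LSAs


def rxmt_interval(i: int) -> float:
    """R(i): RxmtInterval for the i-th retransmission (i >= 1).

    Assumed rule: R(1) = Rmin, R(i+1) = min(K * R(i), Rmax).
    """
    if i < 1:
        raise ValueError("retransmission index starts at 1")
    r = R_MIN
    for _ in range(i - 1):
        r = min(K * r, R_MAX)
    return r


def next_gap(gap: float, unacked: int) -> float:
    """G(t+T) computed from G(t) and U(t+T), once T has elapsed.

    Assumed rule: widen the gap above the high water mark, narrow it
    below the low water mark, and leave it unchanged in between.
    """
    if unacked > H:
        return min(F * gap, G_MAX)
    if unacked < L:
        return max(gap / F, G_MIN)
    return gap
```

   With the example values, the retransmission interval would run 5, 10,
   20, 40, 40 seconds, and a 20 ms gap would double to 40 ms once more
   than 20 LSAs are unacknowledged.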

536	Appendix C. Other Recommendations and Suggestions

538	   (1) Explicit Marking:  In Section 2 we recommended that OSPF packets
539	       be classified into "high" and "low" priority classes based on
540	       examining the OSPF packet header.  In some cases (particularly
541	       in the receiver) this examination may be computationally
542	       costly.  An alternative would be the
543	       use of different TOS/Precedence field settings for the two
544	       priority classes.  [Ref1] recommends setting the TOS field to 0
545	       and the Precedence field to 6 for all OSPF packets.  We recommend
546	       this same setting for the "low" priority OSPF packets and a
547	       different setting for the "high" priority OSPF packets in order
548	       to be able to classify them separately without having to examine
549	       the OSPF packet header.  Two examples are given below:

551	       Example 1: For "low" priority packets set TOS field to 0 and
552	                  Precedence field to 6, and for "high" priority
553	                  packets set TOS field to 4 and Precedence field to 6.

555	       Example 2: For "low" priority packets set TOS field to 0 and
556	                  Precedence field to 6, and for "high" priority
557	                  packets set TOS field to 0 and Precedence field to 7.

559	       Note that the TOS/Precedence bits have been redefined by
560	       Diffserv (RFC 2474, [Ref15]).  Note also that the different
561	       TOS/Precedence field settings suggested above need only be
562	       agreed upon among the systems on the link.  This
563	       recommendation need not be followed if it is easy
564	       to examine the OSPF packet header and thereby separately
565	       classify "high" and "low" priority packets.
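
       As an illustration only, the Python sketch below shows how the
       two example settings might map onto the IP header octet.  It
       assumes the RFC 1349 layout (3-bit Precedence, 4-bit TOS, one
       must-be-zero bit) and assumes that the TOS values in the
       examples are the 4-bit field values; both the function names
       and that interpretation are assumptions, not part of this draft.

```python
def tos_octet(precedence: int, tos: int) -> int:
    """Build the IP header TOS octet, assuming the RFC 1349 layout:
    bits 0-2 Precedence, bits 3-6 TOS, bit 7 must be zero."""
    if not 0 <= precedence <= 7:
        raise ValueError("Precedence is a 3-bit field")
    if not 0 <= tos <= 15:
        raise ValueError("TOS is a 4-bit field")
    return (precedence << 5) | (tos << 1)


LOW = tos_octet(precedence=6, tos=0)        # both examples, "low" priority
HIGH_EX1 = tos_octet(precedence=6, tos=4)   # Example 1, "high" priority
HIGH_EX2 = tos_octet(precedence=7, tos=0)   # Example 2, "high" priority


def classify(octet: int, high_octet: int) -> str:
    """Classify a received packet from its TOS octet alone, without
    examining the OSPF packet header (hypothetical helper)."""
    return "high" if octet == high_octet else "low"
```

       Under these assumptions a receiver can separate the two classes
       with a single octet comparison, which is the saving that
       explicit marking is meant to provide.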

567	   (2) Further Prioritization of OSPF Packets: Besides the packets
568	       designated as "high" priority in Recommendation 1 of Section 2
569	       there may be a need for further priority separation among the
570	       "low" priority OSPF packets.  We recommend the use of three
571	       priority classes: "high", "medium" and "low". While
572	       receiving a packet from a neighbor and while transmitting
573	       a packet to a neighbor, try to process a "high" priority
574	       packet ahead of "medium" and "low" priority packets and
575	       a "medium" priority packet ahead of "low" priority packets.
576	       The "high" priority packets are as designated in Recommendation
577	       1 of Section 2.  We provide below two candidate examples for
578	       "medium" priority packets.  All OSPF packets not designated
579	       as "high" or "medium" priority are "low" priority.
580	       If Cryptographic Authentication (AuType = 2) is used (as
581	       specified in [Ref1]) then prioritized treatment is to be
582	       provided only at the receiver and after security processing,
583	       but not at the transmitter since that
584	       may cause packets to arrive out of sequence and violate the
585	       requirements of "AuType = 2".

587	       One example of a "medium" priority packet is the
588	       Database Description (DBD) packet from a slave (during the
589	       database synchronization process) that is used as an
590	       acknowledgment.

592	       A second example is an LSA carrying
593	       intra-area topology change information (this may trigger
594	       SPF calculation and rerouting of Label Switched paths and so
595	       fast processing of this packet may improve OSPF/LDP convergence
596	       times). However, if the processing cost of identifying and
597	       separately queueing the LSA in this example is deemed to be high
598	       then the implementer may decide not to do it.
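
       The three-class ordering described above could be sketched in
       Python as follows.  This is a minimal illustration using strict
       priority with FIFO order inside each class; the class and method
       names are hypothetical, and how a packet is assigned to a class
       is left to the caller.

```python
from collections import deque

# Classes in the order in which they are served.
PRIORITIES = ("high", "medium", "low")


class PriorityPacketQueue:
    """Strict-priority queue with one FIFO per priority class."""

    def __init__(self):
        self._queues = {p: deque() for p in PRIORITIES}

    def enqueue(self, packet, priority="low"):
        """Queue a packet; anything not marked otherwise is "low"."""
        self._queues[priority].append(packet)

    def dequeue(self):
        """Return the oldest packet of the highest non-empty class,
        or None when every class is empty."""
        for p in PRIORITIES:
            if self._queues[p]:
                return self._queues[p].popleft()
        return None
```

       When Cryptographic Authentication is in use, such a queue would
       be placed only on the receive side and after security
       processing, so that the order of transmitted packets is left
       unchanged.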

600	   (3) Processing a Large Number of LSA Purges: Occasionally some events
601	       in the network, such as Router ID changes, may result in a large
602	       number of LSA re-originations and LSA purges.  In such a scenario
603	       one may consider processing LSAs in a different order, e.g.,
604	       processing LSA purges ahead of LSA originations.  We, however,
605	       do not recommend out-of-order LSA processing for several reasons.
606	       Firstly, detecting the LSA type ahead of queueing may be
607	       computationally expensive.  Out-of-order processing may also
608	       cause subtle bugs. We do not want to recommend a major change in
609	       the LSA processing paradigm for a relatively rare event such as
610	       Router ID change. However, a Router with a changing ID may flush
611	       the old LSAs gradually without causing a storm.

613	Full copyright statement

615	   Copyright (C) The Internet Society (2004).  This document is subject
616	   to the rights, licenses and restrictions contained in BCP 78 and
617	   except as set forth therein, the authors retain all their rights.

619	   This document and the information contained herein are provided on an
620	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
621	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
622	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
623	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
624	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
625	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

627	Intellectual Property Considerations

629	   The IETF takes no position regarding the validity or scope of any
630	   Intellectual Property Rights or other rights that might be claimed to
631	   pertain to the implementation or use of the technology described in
632	   this document or the extent to which any license under such rights
633	   might or might not be available; nor does it represent that it has
634	   made any independent effort to identify any such rights.  Information
635	   on the procedures with respect to rights in RFC documents can be
636	   found in BCP 78 and BCP 79.

638	   Copies of IPR disclosures made to the IETF Secretariat and any
639	   assurances of licenses to be made available, or the result of an
640	   attempt made to obtain a general license or permission for the use of
641	   such proprietary rights by implementers or users of this
642	   specification can be obtained from the IETF on-line IPR repository at
643	   http://www.ietf.org/ipr.  The IETF invites any interested party to
644	   bring to its attention any copyrights, patents or patent
645	   applications, or other proprietary rights that may cover technology
646	   that may be required to implement this standard.  Please address the
647	   information to the IETF at ietf-ipr@ietf.org.