Internet Engineering Task Force              Gagan L. Choudhury, Editor
Internet Draft                                                     AT&T
Expires in June, 2005
Category: Best Current Practice                          December, 2004
draft-ietf-ospf-scalability-09.txt

            Prioritized Treatment of Specific OSPF Version 2
                    Packets and Congestion Avoidance

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of RFC 3668.

   This document is an Internet-Draft and is in full conformance
   with all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.
   Note that other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   Distribution of this memo is unlimited.

Abstract

   This document recommends methods that are intended to improve the
   scalability and stability of large networks using the OSPF (Open
   Shortest Path First) Version 2 protocol.  The methods include
   processing OSPF Hellos and LSA (Link State Advertisement)
   Acknowledgments at a higher priority than other OSPF packets, and
   other congestion avoidance procedures.

Table of Contents

   1. Introduction
   2. Recommendations
   3. Security Considerations
   4. Acknowledgments
   5. Normative Reference
   6. Informative References
   7. Contributing Authors and their Addresses
   Appendix A. LSA Storm: Causes and Impact
   Appendix B. List of Variables and Values
   Appendix C. Other Recommendations and Suggestions

1. Introduction

   In this document, references to OSPF mean OSPFv2 [Ref1].  The
   scalability and stability improvement techniques described here
   may also apply to OSPFv3 [Ref2], but that will require further study
   and operational experience.

   A large network running the OSPF protocol may occasionally
   experience the simultaneous or near-simultaneous update of a large
   number of link-state advertisements, or LSAs.  This is particularly
   true if the OSPF traffic engineering extension [Ref3] is used, which
   may significantly increase the number of LSAs in the network.  We
   call this event an LSA storm; it may be initiated by an unscheduled
   failure or a scheduled maintenance event.  The failure may be
   hardware, software, or procedural in nature.

   The LSA storm causes high CPU and memory utilization at the router,
   causing incoming packets to be delayed or dropped.  Delayed
   acknowledgments (beyond the retransmission timer value) result in
   retransmissions, and delayed Hello packets (beyond the router-dead
   interval) result in neighbor adjacencies being declared down.  The
   retransmissions and additional LSA originations result in further
   CPU and memory usage, essentially causing a positive feedback loop,
   which, in the extreme case, may drive the network to an unstable
   state.

   The default value of the retransmission timer is 5 seconds and that
   of the router-dead interval is 40 seconds.  However, recently there
   has been a lot of interest in significantly reducing OSPF
   convergence time.  As part of that plan, much shorter (sub-second)
   Hello and router-dead intervals have been proposed [Ref4].  In such
   a scenario it will be more likely for Hello packets to be delayed
   beyond the router-dead interval during network congestion caused by
   an LSA storm.

   In order to improve the scalability and stability of networks, we
   recommend steps for prioritizing critical OSPF packets and avoiding
   congestion.  The details of the recommendations are given in
   Section 2.  A simulation study is reported in [Ref14] that
   quantifies the congestion phenomenon and its impact.
   It also studies several of the recommendations and shows that they
   indeed improve the scalability and stability of networks using the
   OSPF protocol.  [Ref14] is available on request by contacting the
   editor or one of the authors.

   Appendix A explains LSA storm scenarios and their impact in more
   detail, and points out a few real-life examples of control-message
   storms.  Appendix B provides a list of variables used in the
   recommendations and their example values.  Appendix C provides
   some further recommendations and suggestions with similar goals.

2. Recommendations

   The Recommendations below are intended to improve the scalability
   and stability of large networks using the OSPF protocol.  During
   periods of network congestion they would reduce retransmissions,
   prevent an adjacency from being declared down due to Hello packets
   being delayed beyond the RouterDeadInterval, and take other
   congestion avoidance steps.  The recommendations are unordered
   except that Recommendation 2 is to be implemented only if
   Recommendation 1 is not implemented.

   (1) Classify all OSPF packets into two classes: a "high priority"
       class comprising OSPF Hello packets and Link State
       Acknowledgment packets, and a "low priority" class
       comprising all other packets.  The classification is
       accomplished by examining the OSPF packet header.  While
       receiving a packet from a neighbor and while transmitting
       a packet to a neighbor, try to process a "high priority"
       packet ahead of a "low priority" packet.

       The prioritized processing while transmitting may cause OSPF
       packets from a neighbor to be received out of sequence.
       If Cryptographic Authentication (AuType = 2) is used (as
       specified in [Ref1]) then successive received valid OSPF packets
       from a neighbor need to have a non-decreasing "Cryptographic
       sequence number".
       To comply with this requirement we recommend that, in case
       Cryptographic Authentication (AuType = 2) is used [Ref1],
       prioritized processing not be done at the transmitter.
       This will avoid packets arriving at the receiver out of sequence.
       However, after security processing at the receiver (including
       sequence number checking) is complete, the OSPF packets may be
       kept in a "high-priority" queue or a "low-priority" queue based
       on their class and processed accordingly.  The benefit of
       prioritized processing is clearly higher in the absence of
       Cryptographic Authentication since in that case prioritization
       can be implemented both at the transmitter and at the receiver.
       However, even with Cryptographic Authentication it will be
       beneficial to have prioritization only at the receiver (following
       security processing).

   (2) If Recommendation 1 cannot be implemented then reset the
       inactivity timer for an adjacency whenever any OSPF unicast
       packet or any OSPF packet sent to AllSPFRouters over a
       point-to-point link is received over that adjacency, instead of
       resetting the inactivity timer only on receipt of the
       Hello packet.  OSPF would then declare the adjacency to be down
       only if neither OSPF unicast packets nor OSPF packets sent to
       AllSPFRouters over a point-to-point link are received over
       that adjacency for a period equaling or exceeding the
       RouterDeadInterval.  The reason for not recommending this
       proposal in conjunction with Recommendation 1 is to avoid
       potential undesirable side effects.  One such effect is the
       delay in discovering the down status of an adjacency in a case
       where no high priority Hello packets are being received but the
       inactivity timer is being reset by other stale packets in the
       low priority queue.

   (3) Use an exponential backoff algorithm for determining the value
       of the LSA retransmission interval (RxmtInterval).
       Let R(i) represent the RxmtInterval value used during the i-th
       retransmission of an LSA.  Use the following algorithm to
       compute R(i):

          R(1)   = Rmin
          R(i+1) = Min(K*R(i), Rmax)   for i >= 1

       where K, Rmin and Rmax are constants and the function
       Min(.,.) represents the minimum value of its two arguments.
       Example values for K, Rmin and Rmax may be 2, 5 seconds
       and 40 seconds respectively.  Note that the example value for
       Rmin, the initial retransmission interval, is the same as the
       sample value of RxmtInterval in [Ref1].

       This recommendation is motivated by the observation that during
       a network congestion event caused by control messages, a major
       source for sustaining the congestion is the repeated
       retransmission of LSAs.  The use of an exponential backoff
       algorithm for the LSA retransmission interval reduces the rate
       of LSA retransmissions while the network experiences
       congestion (during which it is more likely that multiple
       retransmissions of the same LSA would happen).  This in turn
       helps the network get out of the congested state.

   (4) Implicit Congestion Detection and Action Based on That:
       If there is control message congestion at a router, its
       neighbors do not know about that explicitly.  However, they
       can implicitly detect it based on the number of unacknowledged
       LSAs to this router.  If this number exceeds a certain "high
       water mark" then the rate at which LSAs are sent to this router
       should be reduced progressively using an exponential backoff
       mechanism but not below a certain minimum rate.  At a future
       time, if the number of unacknowledged LSAs to this router falls
       below a certain "low water mark" then the rate of sending
       LSAs to this router should be increased progressively, again
       using an exponential backoff mechanism but not above a certain
       maximum rate.  The whole algorithm is given below.
       It is to be noted that this algorithm is to be applied
       independently to each neighbor and only for unicast LSAs sent to
       a neighbor or LSAs sent to AllSPFRouters over a point-to-point
       link.

       Let,
       U(t) = Number of unacknowledged LSAs to neighbor at time t.
       H    = A high water mark (in units of number of unacknowledged
              LSAs).
       L    = A low water mark (in units of number of unacknowledged
              LSAs).
       G(t) = Gap between sending successive LSAs to neighbor at time t.
       F    = The factor by which the above gap is to be increased
              during congestion and decreased after coming out of
              congestion.
       T    = Minimum time that has to elapse before the existing gap
              is considered for change.
       Gmin = Minimum allowed value of gap.
       Gmax = Maximum allowed value of gap.

       The equation below shows how the gap is to be changed after a
       time T has elapsed since the last change:

                 | Min(F*G(t), Gmax)   if U(t+T) > H
       G(t+T) =  | G(t)                if H >= U(t+T) >= L
                 | Max(G(t)/F, Gmin)   if U(t+T) < L

       Min(.,.) and Max(.,.) represent the minimum and maximum values
       of the two arguments respectively.
       Example values for the various parameters of the algorithm are
       as follows: H = 20, L = 10, F = 2, T = 1 second, Gmin = 20 ms,
       Gmax = 1 second.

       Recommendations 3 and 4 both slow down LSAs to congested
       neighbors based on implicitly detecting the congestion, but
       they have important differences.  Recommendation 3 progressively
       slows down successive retransmissions of the same LSA whereas
       Recommendation 4 progressively slows down all LSAs (new or
       retransmission) to a congested neighbor.

   (5) Throttling Adjacencies to be Brought Up Simultaneously:
       If a router tries to bring up a large number of adjacencies to
       its neighbors simultaneously then that may cause severe
       congestion due to database synchronization and LSA flooding
       activities.
       It is recommended that during such a situation
       no more than "n" adjacencies should be brought up
       simultaneously.  Once a subset of adjacencies has been brought
       up successfully, newer adjacencies may be brought up as long as
       the number of simultaneous adjacencies being brought up does not
       exceed "n".  The appropriate value of "n" would depend on the
       router processing power, total bandwidth available for control
       plane traffic and propagation delay.
       The value of "n" should be configurable.

       In the presence of throttling, an important issue is the order
       in which adjacencies are to be formed.  We recommend a First
       Come First Served (FCFS) policy based on the order in which the
       request for adjacency formation arrives.  Requests may either be
       from neighbors or self-generated.  Among the self-generated
       requests a priority list may be used to decide the order in which
       the requests are to be made.  However, once an adjacency
       formation process starts it is not to be preempted except
       for unusual circumstances such as errors or time-outs.

   In some of the Recommendations above we refer to point-to-point
   links.  Those references should also include cases where a broadcast
   network is to be treated as a point-to-point connection from the
   standpoint of IP routing [Ref5].

3. Security Considerations

   This memo does not create any new security issues for the OSPF
   protocol.

4. Acknowledgments

   We would like to acknowledge the support and helpful comments from
   OSPF WG chairs Rohit Dube, Acee Lindem, John Moy, Routing Area
   directors Alex Zinin and Bill Fenner, and IESG reviewers.  We
   acknowledge Vivek Dube, Mitchell Erblich, Mike Fox, Tony
   Przygienda, and Krishna Rao for comments on previous versions of
   the draft.
   We also acknowledge Margaret Chiosi, Elie Francis,
   Jeff Han, Beth Munson, Roshan Rao, Moshe Segal, Mike Wardlow, and
   Pat Wirth for collaboration and encouragement in our scalability
   improvement efforts for Link-State-Protocol based networks.

5. Normative Reference

   [Ref1] J. Moy, "OSPF Version 2", RFC 2328, April, 1998.

   [Ref2] R. Coltun, D. Ferguson and J. Moy, "OSPF for IPv6",
      RFC 2740, December, 1999.

6. Informative References

   [Ref3] D. Katz, K. Kompella and D. Yeung, "Traffic Engineering (TE)
      Extensions to OSPF Version 2", RFC 3630, September, 2003.

   [Ref4] C. Alaettinoglu, V. Jacobson and H. Yu, "Towards Milli-second
      IGP Convergence", Work in Progress.

   [Ref5] N. Shen, A. Lindem, J. Yuan, A. Zinin, R. White and
      S. Previdi, "Point-to-point operation over LAN in link-state
      routing protocols", Work in Progress.

   [Ref6] Pappalardo, D., "AT&T, customers grapple with ATM net
      outage", Network World, February 26, 2001.

   [Ref7] "AT&T announces cause of frame-relay network outage", AT&T
      Press Release, April 22, 1998.

   [Ref8] Cholewka, K., "MCI Outage Has Domino Effect", Inter@ctive
      Week, August 20, 1999.

   [Ref9] Jander, M., "In Qwest Outage, ATM Takes Some Heat", Light
      Reading, April 6, 2001.

   [Ref10] A. Zinin and M. Shand, "Flooding Optimizations in Link-State
      Routing Protocols", Work in Progress.

   [Ref11] P. Pillay-Esnault, "OSPF Refresh and flooding reduction in
      stable topologies", Work in Progress.

   [Ref12] G. Ash, G. Choudhury, V. Sapozhnikova, M. Sherif,
      A. Maunder, V. Manral, "Congestion Avoidance & Control for OSPF
      Networks", Work in Progress.

   [Ref13] B. M. Waxman, "Routing of Multipoint Connections", IEEE
      Journal on Selected Areas in Communications, 6(9):1617-1622, 1988.

   [Ref14] G. Choudhury, G. Ash, V. Manral, A. Maunder and V.
      Sapozhnikova, "Prioritized Treatment of Specific OSPF Packets
      and Congestion Avoidance: Algorithms and Simulations", AT&T
      Technical Report, August, 2003.

   [Ref15] K. Nichols, S. Blake, F. Baker and D. Black, "Definition of
      the Differentiated Services Field (DS Field) in the IPv4 and IPv6
      Headers", RFC 2474, December, 1998.

7. Contributing Authors and their Addresses

   In addition to the Editor, several people contributed to this
   document.  The names and contact information of all authors
   are given below.

   Gagan L. Choudhury                Anurag S. Maunder
   AT&T                              Erlang Technology
   Room D5-3C21                      2880 Scott Boulevard
   200 Laurel Avenue                 Santa Clara, CA 95052
   Middletown, NJ, 07748             USA
   USA                               Phone: (408)420-7617
   Phone: (732)420-3721              email: anuragm@erlangtech.com
   email: gchoudhury@att.com

   Gerald R. Ash                     Vera D. Sapozhnikova
   AT&T                              AT&T
   Room D5-2A01                      Room C5-2C29
   200 Laurel Avenue                 200 Laurel Avenue
   Middletown, NJ, 07748             Middletown, NJ, 07748
   USA                               USA
   Phone: (732)420-4578              Phone: (732)420-2653
   email: gash@att.com               email: sapozhnikova@att.com

   Vishwas Manral
   Sinett Semiconductors,
   Infantry Road,
   Bangalore 500 081
   India
   email: vishwas@sinett.com

Appendix A. LSA Storm: Causes and Impact

   An LSA storm may be initiated for many reasons.
   Here are some examples:

   (a) one or more link failures due to fiber cuts,

   (b) one or more router failures for some reason, e.g., software
       crash or some type of disaster (including power outage)
       in an office complex hosting many routers,

   (c) link/router flapping,

   (d) requirement of taking down and later bringing back many
       routers during a software/hardware upgrade,

   (e) near-synchronization of the periodic 1800 second LSA refreshes
       of a subset of LSAs,

   (f) refresh of all LSAs in the system during a change in software
       version,

   (g) injecting a large number of external routes to OSPF due to
       a procedural error,

   (h) Router ID changes causing a large number of LSA re-originations
       (possibly LSA purges as well depending on the implementation).

   In addition to the LSAs originated as a direct result of link/router
   failures, there may be other indirect LSAs as well.  One example in
   MPLS networks is traffic engineering LSAs [Ref3] originated at other
   links as a result of significant change in reserved bandwidth
   resulting from rerouting of Label Switched Paths (LSPs) that went
   down during the link/router failure.

   The LSA storm causes high CPU and memory utilization at the router
   processor, causing incoming packets to be delayed or dropped.
   Delayed acknowledgments (beyond the retransmission timer value)
   result in retransmissions, and delayed Hello packets (beyond the
   Router-Dead interval) result in links being declared down.  A
   trunk-down event causes Router LSA origination by its end-point
   routers.  If traffic engineering LSAs are used for each link then
   LSAs of that type would also be originated by the end-point routers
   and potentially elsewhere as well due to significant changes in
   reserved bandwidths at other links caused by the failure and reroute
   of LSPs originally using the failed trunk.
   Eventually, when the link recovers, that would also trigger
   additional Router LSAs and traffic engineering LSAs.

   The retransmissions and additional LSA originations result in further
   CPU and memory usage, essentially causing a positive feedback loop.
   We define the LSA storm size as the number of LSAs in the original
   storm, not counting any additional LSAs resulting from the
   feedback loop described above.  If the LSA storm is too large then
   the positive feedback loop mentioned above may be large enough to
   indefinitely sustain a large CPU and memory utilization at many
   routers in the network, thereby driving the network to an unstable
   state.  In the past, network outage events have been reported in IP
   and ATM networks using link-state protocols such as OSPF, IS-IS,
   PNNI or some proprietary variants.  See for example [Ref6-Ref9].
   In many of these examples, large scale flooding of LSAs or other
   similar control messages (either naturally or triggered by some bug
   or inappropriate procedure) has been partly or fully responsible
   for network instability and outage.

   In [Ref14] a simulation model is used to show that there
   is a certain LSA storm size threshold above which the
   network may show unstable behavior caused by a large number of
   retransmissions, link failures due to missed Hello packets and
   subsequent link recoveries.  It is also shown
   that the LSA storm size causing instability may be substantially
   increased by providing prioritized treatment to Hello and LSA
   Acknowledgment packets and by using an exponential backoff
   algorithm for determining the LSA retransmission interval.
   If it is not possible to prioritize Hello packets then resetting
   the inactivity timer on receiving any valid OSPF packets can also
   provide the same benefit.
   Furthermore, if we prioritize Hello
   packets then even when the network operates somewhat above the
   stability threshold, links are not declared down due to missed
   Hellos.  This implies that even though there is
   control plane congestion due to many retransmissions, the data plane
   stays up and no new LSAs are originated (besides the ones in the
   original storm and the refreshes).  These observations support
   the first three recommendations in Section 2.  The authors of this
   draft have also done simulations to verify that the other
   recommendations in Section 2 help avoid congestion and allow a
   graceful exit from a congested state.

   One might argue that the scalability issue of large networks should
   be solved solely by dividing the network hierarchically into
   multiple areas so that flooding of LSAs remains localized within
   areas.  However, this approach increases the network management
   and design complexity and may result in less optimal routing between
   areas.  Also, ASE LSAs are flooded throughout the AS and it may be
   a problem if there are large numbers of them.  Furthermore,
   a large number of summary LSAs may need to be flooded across
   Areas and their numbers would increase significantly if
   multiple Area Border Routers are employed for the purpose of
   reliability.  Thus it is important to allow the network to grow
   towards as large a size as possible under a single area.

   The recommendations in the draft are synergistic with a broader set
   of scalability and stability improvement proposals.  [Ref10] proposes
   flooding overhead reduction in case more than one interface goes to
   the same neighbor.  [Ref11] proposes a mechanism for
   greatly reducing LSA refreshes in stable topologies.

   [Ref12] proposes a wide range of congestion control and failure
   recovery mechanisms (some of those ideas are covered in this
   draft but [Ref12] has other ideas not covered here).
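
   As an illustration only, the two numeric rules from Section 2 that
   this appendix refers to (the retransmission backoff of
   Recommendation 3 and the per-neighbor gap adjustment of
   Recommendation 4) might be sketched as follows.  The function and
   parameter names (rxmt_intervals, next_gap, and so on) are
   hypothetical and do not come from the draft or from any OSPF
   implementation; the defaults are the example values given in
   Section 2, expressed in seconds.

```python
def rxmt_intervals(count, k=2.0, rmin=5.0, rmax=40.0):
    """Return [R(1), ..., R(count)] in seconds (Recommendation 3).

    R(1) = Rmin and R(i+1) = Min(K*R(i), Rmax) for i >= 1.
    """
    intervals = []
    r = rmin
    for _ in range(count):
        intervals.append(r)
        r = min(k * r, rmax)  # back off, but never beyond Rmax
    return intervals


def next_gap(gap, unacked, high=20, low=10, factor=2.0,
             gmin=0.020, gmax=1.0):
    """Return G(t+T) given G(t) and U(t+T) (Recommendation 4).

    gap     -- current gap G(t) between successive LSAs, in seconds
    unacked -- U(t+T), unacknowledged LSAs to this neighbor
    The caller is assumed to invoke this once per neighbor each time
    the interval T has elapsed since the last change.
    """
    if unacked > high:
        return min(factor * gap, gmax)  # congested: widen the gap
    if unacked < low:
        return max(gap / factor, gmin)  # recovered: narrow the gap
    return gap                          # between the marks: no change


if __name__ == "__main__":
    print(rxmt_intervals(5))
    print(next_gap(0.020, unacked=25))
```

   With the example values, the retransmission interval doubles from
   5 seconds until it is capped at 40 seconds, and the gap doubles
   while more than 20 LSAs are unacknowledged, up to the 1 second cap.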

Appendix B. List of Variables and Values

   F    = The factor by which the gap between sending successive LSAs
          to a neighbor is to be increased during congestion and
          decreased after coming out of congestion (used in
          Recommendation 4).  Example value is 2.

   G(t) = Gap between sending successive LSAs to a neighbor at time t
          (used in Recommendation 4).

   Gmax = Maximum allowed value of gap between sending successive LSAs
          to a neighbor (used in Recommendation 4).  Example value is
          1 second.

   Gmin = Minimum allowed value of gap between sending successive LSAs
          to a neighbor (used in Recommendation 4).  Example value is
          20 ms.

   H    = A high water mark (in units of number of unacknowledged
          LSAs).  Exceeding this mark would trigger a potential
          increase in the gap between sending successive LSAs to a
          neighbor (used in Recommendation 4).  Example value is 20.

   K    = A multiplicative constant used in increasing the RxmtInterval
          value used during successive retransmissions of the same LSA
          (used in Recommendation 3).  Example value is 2.

   L    = A low water mark (in units of number of unacknowledged
          LSAs).  Dropping below this mark would trigger a potential
          decrease in the gap between sending successive LSAs to a
          neighbor (used in Recommendation 4).  Example value is 10.

   n    = Upper limit on the number of adjacencies to be brought up
          simultaneously (used in Recommendation 5).

   R(i) = RxmtInterval value used during the i-th retransmission of
          an LSA (used in Recommendation 3).

   Rmax = The maximum allowed value of RxmtInterval (used in
          Recommendation 3).  Example value is 40 seconds.

   Rmin = The minimum allowed value of RxmtInterval (used in
          Recommendation 3).  Example value is 5 seconds.

   T    = Minimum time that has to elapse before the existing gap
          between sending successive LSAs to a neighbor
          is considered for change (used in Recommendation 4).  Example
          value is 1 second.

   U(t) = Number of unacknowledged LSAs to a neighbor at time t
          (used in Recommendation 4).

Appendix C. Other Recommendations and Suggestions

   (1) Explicit Marking: In Section 2 we recommended that OSPF packets
       be classified into "high" and "low" priority classes based on
       examining the OSPF packet header.  In some cases (particularly
       in the receiver) this examination may be computationally
       costly.  An alternative would be the
       use of different TOS/Precedence field settings for the two
       priority classes.  [Ref1] recommends setting the TOS field to 0
       and the Precedence field to 6 for all OSPF packets.  We recommend
       this same setting for the "low" priority OSPF packets and a
       different setting for the "high" priority OSPF packets in order
       to be able to classify them separately without having to examine
       the OSPF packet header.  Two examples are given below:

       Example 1: For "low" priority packets set TOS field to 0 and
                  Precedence field to 6, and for "high" priority
                  packets set TOS field to 4 and Precedence field to 6.

       Example 2: For "low" priority packets set TOS field to 0 and
                  Precedence field to 6, and for "high" priority
                  packets set TOS field to 0 and Precedence field to 7.

       It is to be noted that the TOS/Precedence bits have been
       redefined by Diffserv (RFC 2474, [Ref15]).  It is also to be
       noted that the different TOS/Precedence field settings suggested
       above only need to be agreed among the systems on the link.
       This recommendation need not be followed if it is easy
       to examine the OSPF packet header and thereby separately
       classify "high" and "low" priority packets.

   (2) Further Prioritization of OSPF Packets: Besides the packets
       designated as "high" priority in Recommendation 1 of Section 2
       there may be a need for further priority separation among the
       "low" priority OSPF packets.
       We recommend the use of three priority classes: "high",
       "medium" and "low". While receiving a packet from a neighbor
       and while transmitting a packet to a neighbor, try to process
       a "high" priority packet ahead of "medium" and "low" priority
       packets and a "medium" priority packet ahead of "low" priority
       packets. The "high" priority packets are as designated in
       Recommendation 1 of Section 2. We provide below two candidate
       examples for "medium" priority packets. All OSPF packets not
       designated as "high" or "medium" priority are "low" priority.
       If Cryptographic Authentication (AuType = 2) is used (as
       specified in [Ref1]) then prioritized treatment is to be
       provided only at the receiver and after security processing,
       but not at the transmitter since that may cause packets to
       arrive out of sequence and violate the requirements of
       "AuType = 2".

       One example of a "medium" priority packet is the Database
       Description (DBD) packet from a slave (during the database
       synchronization process) that is used as an acknowledgment.

       A second example is an LSA carrying intra-area topology change
       information (this may trigger SPF calculation and rerouting of
       Label Switched Paths and so fast processing of this packet may
       improve OSPF/LDP convergence times). However, if the
       processing cost of identifying and separately queueing the LSA
       in this example is deemed to be high then the implementer may
       decide not to do it.

   (3) Processing a Large Number of LSA Purges: Occasionally some
       events in the network, such as Router ID changes, may result
       in a large number of LSA re-originations and LSA purges. In
       such a scenario one may consider processing LSAs in a
       different order, e.g., processing LSA purges ahead of LSA
       originations. We, however, do not recommend out-of-order LSA
       processing for several reasons.
       Firstly, detecting the LSA type ahead of queueing may be
       computationally expensive. Secondly, out-of-order processing
       may cause subtle bugs. Finally, we do not want to recommend a
       major change in the LSA processing paradigm for a relatively
       rare event such as a Router ID change. However, a router with
       a changing ID may flush the old LSAs gradually without causing
       a storm.

Full copyright statement

   Copyright (C) The Internet Society (2004). This document is subject
   to the rights, licenses and restrictions contained in BCP 78 and
   except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property Considerations

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr. The IETF invites any interested party
   to bring to its attention any copyrights, patents or patent
   applications, or other proprietary rights that may cover technology
   that may be required to implement this standard. Please address the
   information to the IETF at ietf-ipr@ietf.org.