idnits 2.17.1

draft-ginsberg-lsr-isis-flooding-scale-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (April 17, 2020) is 1468 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-18) exists of
     draft-ietf-lsr-dynamic-flooding-04

     Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Networking Working Group                                     L. Ginsberg
Internet-Draft                                                 P. Psenak
Intended status: Informational                                 A. Lindem
Expires: October 19, 2020                                  Cisco Systems
                                                           T. Przygienda
                                                                 Juniper
                                                          April 17, 2020

                  IS-IS Flooding Scale Considerations
               draft-ginsberg-lsr-isis-flooding-scale-02

Abstract

   Link State PDU flooding rates in use are much slower than what modern
   networks can support.  The use of IS-IS at larger scale requires
   faster flooding rates to achieve desired convergence goals.
   This document discusses issues associated with increasing flooding
   rates and some recommended practices which allow faster flooding
   rates to be used safely.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 19, 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Historical Behavior
   3.  Flooding Rate and Convergence
     3.1.  Flow Control
     3.2.  Rate of LSP Acknowledgments
     3.3.  Bandwidth Utilization
     3.4.  Packet Prioritization on Receive
   4.  Minimizing LSP Generation
   5.  Redundant Flooding
   6.  Use of Jumbo Frames
   7.  Deployment Considerations
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   Link state IGPs such as Intermediate-System-to-Intermediate-System
   (IS-IS) depend upon having consistent Link State Databases (LSDB) on
   all Intermediate Systems (ISs) in the network in order to provide
   correct forwarding of data packets.  When topology changes occur,
   new/updated Link State PDUs (LSPs) are propagated network-wide.  The
   speed of propagation is a key contributor to convergence time.

   Historically, flooding rates have been conservative - on the order of
   tens of LSPs/second.  This derives from guidance in the base
   specification [ISO10589] and early deployments when both CPU speeds
   and interface speeds were much slower than they are today and the
   scale of an IS-IS area was smaller than it may be today.

   As IS-IS is deployed at greater scale (a larger number of nodes in an
   area and a larger number of neighbors per node), the impact of the
   historic flooding rates becomes more significant.  Consider the
   bringup or failure of a node with 1000 neighbors.  This will result
   in a minimum of 1000 LSP updates.  At a typical LSP flooding rate
   used in many deployments today (33 LSPs/second), it would take 30+
   seconds simply to send the updated LSPs to a given neighbor.
   Depending on the diameter of the network, achieving a consistent LSDB
   on all nodes in the network could easily take a minute (or more).

   Increasing LSP flooding rate therefore becomes an essential element
   of supporting greater network scale.

   The remainder of this document discusses various aspects of protocol
   operation and how they are impacted by increased flooding rate.
   Where appropriate, best practices are defined which enhance an
   implementation's ability to support faster flooding rates.

2.  Historical Behavior

   The base specification for IS-IS [ISO10589] was first published in
   1992 and updated in 2002.  The update made no changes with regard to
   suggested timer values.  Convergence targets at the time were on the
   order of seconds and the specified timer values reflect that.  Here
   are some examples:

      minimumLSPGenerationInterval - This is the minimum time interval
         between generation of Link State PDUs.  A source Intermediate
         system shall wait at least this long before re-generating one
         of its own Link State PDUs.
         The recommended value was 30 seconds.

      minimumLSPTransmissionInterval - This is the amount of time an
         Intermediate system shall wait before further propagating
         another Link State PDU from the same source system.
         The recommended value was 5 seconds.

      partialSNPInterval - This is the amount of time between periodic
         action for transmission of Partial Sequence Number PDUs.
         It shall be less than minimumLSPTransmissionInterval.
         The recommended value was 2 seconds.

   Most relevant to a discussion of LSP flooding rate is the recommended
   interval between the transmission of two different LSPs on a given
   interface.

   For broadcast interfaces, [ISO10589] defined:

      minimumBroadcastLSPTransmissionInterval - the minimum interval
         between PDU arrivals which can be processed by the slowest
         Intermediate System on the LAN.
         The default value was defined as 33 milliseconds.
         NOTE: It was permitted to send multiple LSPs "back-to-back"
         as a burst, but this was limited to 10 LSPs in a one second
         period.

   Although this value was specific to LAN interfaces, it has commonly
   been applied by implementations to all interfaces, though that was
   not the original intent of the base specification.  In fact,
   Section 12.1.2.4.3 states:

      On point-to-point links the peak rate of arrival is limited only
      by the speed of the data link and the other traffic flowing on
      that link.

   Although modern implementations have not strictly adhered to the 33
   millisecond interval, it is commonplace for implementations to limit
   flooding rate to an order of magnitude similar to the 33 ms value.

   In the past 20 years, significant work on achieving faster
   convergence - more specifically sub-second convergence - has resulted
   in implementations modifying a number of the above timers in order to
   support faster signaling of topology changes.  For example,
   minimumLSPGenerationInterval has been modified to support millisecond
   intervals - often with a backoff algorithm applied to prevent LSP
   generation storms in the event of a series of rapid oscillations.

   However, flooding rate has not been fundamentally altered.

3.  Flooding Rate and Convergence

   Convergence involves a number of sequential operations.

   First the topology change needs to be detected.
   This is a local activity occurring only on the node or nodes directly
   connected to the topology change.  The directly connected node(s)
   then must advertise the topology change by updating their LSPs and
   flooding the changed LSPs.  Routers then must process the updated
   LSDB and recalculate paths to affected destinations.  The updated
   paths must then be installed in the forwarding plane.

   Only when all of the steps are completed on all nodes in the network
   has the network completed convergence.

   As the convergence requirement is consistency of LSDBs on all nodes
   in the network, it is fundamental to understand that the goal of
   flooding is to update the LSDB on all nodes in the network "as fast
   as possible".  Controlling the rate of flooding per interface is done
   to address some practical limitations which include:

   o  Fairness to other data and control traffic on the same interface

   o  Limitations on the processing rate of incoming control traffic

   However, intentionally using different flooding rates on different
   interfaces increases the possibility of longer periods of LSDB
   inconsistency, which, in turn, delays network-wide convergence.

   Many implementations provide knobs to control the rate of LSP
   flooding on a per interface basis.  To the extent that this serves as
   a flow control mechanism, this may reduce the number of dropped LSPs
   during high activity bursts and thereby reduce the number of LSP
   retransmissions required.  As LSP retransmission timers are typically
   long (multiple seconds), this may result in shorter convergence times
   than if the LSP burst was uncontrolled.  But if the performance
   characteristics of routers in the network are such that some routers
   consistently accept and process fewer LSPs/second than other routers,
   convergence will be degraded.  Tuning LSP transmission timers on a
   per interface basis will never provide optimal convergence.
   Consistent flooding rates should be used on all interfaces.

3.1.  Flow Control

   In large scale deployments where an increased flooding rate is being
   used, it becomes more likely that a burst of LSPs may temporarily
   overwhelm a receiver.  Normal operation of the Update Process will
   recover from this, but it may well make sense to employ some form of
   flow control.  This will not serve to optimize convergence, but it
   can serve to reduce the number of LSP retransmissions.  As
   retransmissions are deliberately done at a slow rate, the result of
   flow control will be to provide a shorter recovery time from a
   transient condition which prevents a node from handling the targeted
   rate of LSP transmission.  Inability to handle LSP reception at the
   targeted flooding rate should be viewed as an error condition which
   should be reported.  If this condition persists, it indicates that
   the network is provisioned in a way which does not support optimal
   convergence.  Steps need to be taken to resolve this issue.  Such
   steps could include upgrading the routers that demonstrate this
   condition consistently, altering the configuration on the problematic
   routers or altering the position of the problematic routers in the
   network so as to reduce the overall load on those routers, or
   reducing the LSP transmission rate network-wide.

   When flow control is necessary, it can be implemented in a
   straightforward manner based on knowledge of the current flooding
   rate and the number of unacknowledged LSPs which have been sent to a
   neighbor.  Such an algorithm is a local matter and there is no need
   to standardize an algorithm.  One possible algorithm for
   point-to-point interfaces is presented below.

   Calculated parameters/interface
   --------------------------------------------------------------
   CurrentLSPTxMax: Current maximum number of LSPs which can be
                    transmitted/second
   CurrentUackLSP:  Current number of unacknowledged LSPs already
                    transmitted

   Interface independent configurable parameters
   --------------------------------------------------------------
   MaxLSPTx:        Maximum # LSPs transmitted/second/interface
   MinLSPTx:        Minimum # LSPs which may be
                    transmitted/second/interface
   UackSafe:        Safe level of unacknowledged LSP/Interface
                    expressed as a percentage of
                    CurrentLSPTxMax (1-99)
   UpdateBackoff:   Percent backoff when congestion occurs (1-99)
   UpdateIncrement: Percent increment when congestion has cleared

   Algorithm - run once/sec
   -------------------------

   // UackSafe, UpdateBackoff and UpdateIncrement are percentages
   if (CurrentUackLSP > (CurrentLSPTxMax * UackSafe / 100)) {
       // Congestion: reduce the rate by UpdateBackoff percent
       CurrentLSPTxMax =
           max(MinLSPTx, CurrentLSPTxMax * (100 - UpdateBackoff) / 100)
   } else { // CurrentUackLSP is at a safe level
       CurrentLSPTxMax =
           min(MaxLSPTx, CurrentLSPTxMax * (100 + UpdateIncrement) / 100)
   }

3.2.  Rate of LSP Acknowledgments

   On point-to-point networks, PSNP PDUs provide acknowledgments for
   received LSPs.  [ISO10589] suggests that some delay be used when
   sending PSNPs.  This provides some optimization as multiple LSPs can
   be acknowledged in a single PSNP.

   If faster LSP flooding is to be used safely, it is necessary that
   LSPs be acknowledged more promptly as well.  This requires a
   reduction in the delay in sending PSNPs.

   As PSNPs also consume link bandwidth and packet queue space and
   protocol processing time on receipt, the increased sending of PSNPs
   should be taken into account when considering the rate at which LSPs
   can be sent on an interface.

3.3.  Bandwidth Utilization

   Routing protocol traffic has to share bandwidth on a link with other
   control traffic and data traffic.
   During periods of instability, routing protocol traffic will
   increase, but it is still desirable that the maximum bandwidth
   consumption by routing protocol traffic be modest.  This needs to be
   considered when setting IS-IS flooding rates.

   If we assume a maximum size of 1492 bytes for an LSP, here are some
   rough estimates of bandwidth consumption at different flooding rates:

   +--------------+----------------+-------------+
   | LSPs/second  | 100 Mb/s Link  | 1 Gb/s Link |
   +--------------+----------------+-------------+
   | 100          | 1.2 %          | 0.1 %       |
   +--------------+----------------+-------------+
   | 500          | 6.1 %          | 0.6 %       |
   +--------------+----------------+-------------+
   | 1000         | 12.1 %         | 1.2 %       |
   +--------------+----------------+-------------+

3.4.  Packet Prioritization on Receive

   There are three classes of PDUs sent by IS-IS:

   o  Hellos

   o  LSPs

   o  Complete Sequence Number PDUs (CSNPs) and Partial Sequence Number
      PDUs (PSNPs)

   Implementations today may prioritize the reception of Hellos over
   LSPs and SNPs in order to prevent a burst of LSP updates from
   triggering an adjacency timeout which in turn would require
   additional LSPs to be updated.

   SNPs serve to acknowledge or trigger the transmission of specified
   LSPs.  On a point-to-point link, PSNPs acknowledge the receipt of one
   or more LSPs.  Because PSNPs (like all IS-IS PDUs) use TLVs in the
   body, it is possible to acknowledge multiple LSPs using a single
   PSNP.  For this reason, [ISO10589] specifies a delay
   (partialSNPInterval) before sending a PSNP so that the number of
   PSNPs required to be sent is reduced.  On receipt of a PSNP, the set
   of LSPs acknowledged by that PSNP can be marked so that they do not
   need to be retransmitted.
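
   As an illustration only, the following Python sketch shows how
   receipt of a single PSNP might clear several LSPs from a sender's
   retransmission state, and how the resulting count of unacknowledged
   LSPs could feed the once-per-second rate adjustment outlined in
   Section 3.1.  The class, its method names, and the default values are
   hypothetical.  The sketch assumes that UackSafe, UpdateBackoff, and
   UpdateIncrement are whole-number percentages and that UpdateBackoff
   is the amount by which the rate is reduced.  It tracks LSPs by ID
   only and ignores the sequence numbers, lifetimes, and checksums that
   real PSNP entries carry.

```python
from dataclasses import dataclass, field


@dataclass
class P2PFloodState:
    """Hypothetical per-interface flooding state; all names are illustrative."""
    max_lsp_tx: int = 1000       # MaxLSPTx, LSPs/second (assumed value)
    min_lsp_tx: int = 33         # MinLSPTx, LSPs/second (assumed value)
    uack_safe: int = 50          # UackSafe, percent of the current rate
    update_backoff: int = 25     # UpdateBackoff, percent removed on congestion
    update_increment: int = 10   # UpdateIncrement, percent added when clear
    current_lsp_tx_max: int = 1000
    unacked: set = field(default_factory=set)  # LSP IDs awaiting a PSNP

    def on_lsp_sent(self, lsp_id: str) -> None:
        self.unacked.add(lsp_id)

    def on_psnp_received(self, acked_lsp_ids) -> int:
        """Mark the LSPs listed in one PSNP as acknowledged; return how many."""
        acked = self.unacked.intersection(acked_lsp_ids)
        self.unacked -= acked
        return len(acked)

    def update_rate(self) -> int:
        """Once-per-second adjustment, modeled on the Section 3.1 pseudocode."""
        safe_level = self.current_lsp_tx_max * self.uack_safe // 100
        if len(self.unacked) > safe_level:
            reduced = self.current_lsp_tx_max * (100 - self.update_backoff) // 100
            self.current_lsp_tx_max = max(self.min_lsp_tx, reduced)
        else:
            raised = self.current_lsp_tx_max * (100 + self.update_increment) // 100
            self.current_lsp_tx_max = min(self.max_lsp_tx, raised)
        return self.current_lsp_tx_max


state = P2PFloodState()
for n in range(600):                 # a burst of 600 LSPs from 600 sources
    state.on_lsp_sent(f"R{n}.00-00")
print(state.update_rate())           # 750: 600 unacked exceeds the safe level of 500

# PSNPs acknowledging 400 of those LSPs arrive before the next tick.
acked = state.on_psnp_received(f"R{n}.00-00" for n in range(400))
print(acked, state.update_rate())    # 400 825
```

   In this sketch a dropped PSNP simply means on_psnp_received() is
   never called for its entries, so the unacknowledged count stays high
   and the sender both retransmits needlessly and backs off its rate.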

   If a PSNP is dropped on reception, the impact is significant: the set
   of LSPs listed in the PSNP cannot be marked as acknowledged, which
   results in needless retransmissions that may further delay other LSPs
   which have yet to be transmitted.  It may also make it more likely
   that a receiver becomes overwhelmed by LSP transmissions.

   It is therefore recommended that implementations prioritize the
   receipt of SNPs over LSPs.

4.  Minimizing LSP Generation

   In IS-IS the unit of flooding is an LSP.  Each router may generate a
   set of LSPs at each supported level.  Each LSP in the set has an LSP
   number - which is a value from 0-N where N = 255 for the base
   protocol.  (N has been extended to 65535 by [RFC7356].)  Each LSP
   carries network information using defined Type/Length/Value (TLV)
   tuples.  For example, some TLVs carry neighbor information and some
   TLVs carry reachable prefix information.  [ISO10589] strongly
   recommends preserving the association of a given advertisement (such
   as a neighbor) with a specific LSP whenever possible.  This minimizes
   the number of LSPs which need to be regenerated when a topology
   change occurs.  This recommendation becomes even more important as
   the scale of the network increases.

   Consider the following example:

   Node A has 11 neighbors currently in the UP state and is advertising
   them in three LSPs with content as follows:

      A.00-00 contains the following advertisements:
         Neighbor 1
         Neighbor 2
         Neighbor 3
         Neighbor 4
         Neighbor 5
      A.00-01 contains the following advertisements:
         Neighbor 6
         Neighbor 7
         Neighbor 8
         Neighbor 9
         Neighbor 10
      A.00-02 contains the following advertisements:
         Neighbor 11

   Imagine that the adjacency to Neighbor 3 goes down.  There are (at
   least) two ways that A could update its LSPs.

   Method 1: Node A removes the neighbor advertisement for Neighbor 3
   from A.00-00 and sends an update for that LSP.  LSPs A.00-01 and
   A.00-02 are unchanged and so do not have to be flooded.

   Method 2: Node A attempts to reduce the number of LSPs currently
   active and updates the content as follows:

      A.00-00 contains the following advertisements:
         Neighbor 1
         Neighbor 2
         Neighbor 4
         Neighbor 5
         Neighbor 6
      A.00-01 contains the following advertisements:
         Neighbor 7
         Neighbor 8
         Neighbor 9
         Neighbor 10
         Neighbor 11
      A.00-02 becomes empty

   Node A now has to flood all three LSPs.  LSPs #0 and #1 are reflooded
   because their content has changed.  LSP #2 is purged.

   In a large scale network, the impact of using Method #2 becomes
   significant and introduces conditions where a much larger number of
   LSPs need to be flooded than is the case with Method #1.

   In order to operate at scale, implementations need to follow the
   guidance in [ISO10589] and use Method #1 whenever possible.

5.  Redundant Flooding

   Default operation of the Update Process is to flood on all
   interfaces.  In cases where a network is highly meshed, this can
   result in a significant amount of redundant flooding.  Nodes will
   receive multiple copies of each updated LSP.

   There are defined mechanisms which can greatly reduce the redundant
   flooding.  These include:

   o  Mesh Groups ([RFC2973])

   o  Dynamic Flooding ([I-D.ietf-lsr-dynamic-flooding])

6.  Use of Jumbo Frames

   The maximum size of an LSP (LSPBufferSize) is a parameter that needs
   to be set consistently network-wide.  This is because IS-IS does not
   support fragmentation of its PDUs - so in order for network-wide
   flooding of an LSP to be successful all routers must restrict their
   LSP size to a size which can be supported without fragmentation on
   all interfaces on which IS-IS operates.
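
   As a rough illustration of this constraint, the largest LSPBufferSize
   that is safe to configure is bounded by the smallest IS-IS PDU size
   that any participating interface can carry.  The Python sketch below
   is not taken from [ISO10589]; the function name and the sample
   interface inventory are hypothetical.

```python
def max_safe_lsp_buffer_size(max_pdu_size_by_interface: dict) -> int:
    """Largest LSP size that every listed interface can carry unfragmented."""
    if not max_pdu_size_by_interface:
        raise ValueError("no IS-IS interfaces supplied")
    return min(max_pdu_size_by_interface.values())


# Hypothetical inventory: interface -> largest IS-IS PDU (in bytes) that
# the interface can carry without fragmentation.
limits = {
    "r1:eth0": 9000,
    "r1:eth1": 9000,
    "r2:eth0": 1492,   # a single smaller link constrains every router
}
print(max_safe_lsp_buffer_size(limits))  # 1492
```

   The single 1492-byte interface in the sample data keeps every router
   at the default size, even though the other links could carry more.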

   In networks where all interfaces on which IS-IS operates support
   large frames, LSPBufferSize may be set to a larger value than the
   default (1492).  This allows more routing information to be encoded
   in a single LSP, which means that fewer LSPs are generated by each
   node and therefore the number of LSPs which need to be flooded can be
   reduced in some scenarios (e.g., node or interface bringup).

7.  Deployment Considerations

   As noted earlier in this document, it is desired to have consistent
   flooding speeds on all nodes in the network.  Today, this is roughly
   achieved to the extent that current implementations flood at rates
   which are on the order of what is discussed in [ISO10589] (i.e., 33
   LSPs/second).

   As the goal is to introduce an order of magnitude increase in the
   rate of flooding (e.g., 10 times the current flooding rate), a
   network which has a mixture of nodes which support the faster
   flooding speeds and nodes which do not is at greater risk of
   introducing longer periods of LSDB inconsistency in the network -
   which is likely to have a negative impact on convergence and increase
   the occurrence of traffic drops or looping.

   It is recommended that all nodes in the network support increased
   flooding rates before enabling use of the increased flooding rates.

   Note that as the Update Process runs in the context of an area (or
   the L2 sub-domain), enablement can safely be done on a per area basis
   even when nodes in another area do not support the faster flooding
   rates.

8.  IANA Considerations

   This document requires no actions by IANA.

9.  Security Considerations

   Security concerns for IS-IS are addressed in [ISO10589], [RFC5304],
   and [RFC5310].

10.  Acknowledgements

   Thanks to Bruno Decraene for his careful review and insightful
   comments.

11.  References

11.1.
Normative References

   [ISO10589]
              International Organization for Standardization,
              "Intermediate system to Intermediate system intra-domain
              routeing information exchange protocol for use in
              conjunction with the protocol for providing the
              connectionless-mode Network Service (ISO 8473)",
              ISO/IEC 10589:2002, Second Edition, Nov 2002.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2973]  Balay, R., Katz, D., and J. Parker, "IS-IS Mesh Groups",
              RFC 2973, DOI 10.17487/RFC2973, October 2000.

   [RFC5304]  Li, T. and R. Atkinson, "IS-IS Cryptographic
              Authentication", RFC 5304, DOI 10.17487/RFC5304, October
              2008.

   [RFC5310]  Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R.,
              and M. Fanto, "IS-IS Generic Cryptographic
              Authentication", RFC 5310, DOI 10.17487/RFC5310, February
              2009.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

11.2.  Informative References

   [I-D.ietf-lsr-dynamic-flooding]
              Li, T., Psenak, P., Ginsberg, L., Chen, H., Przygienda,
              T., Cooper, D., Jalil, L., and S. Dontula, "Dynamic
              Flooding on Dense Graphs",
              draft-ietf-lsr-dynamic-flooding-04 (work in progress),
              November 2019.

   [RFC7356]  Ginsberg, L., Previdi, S., and Y. Yang, "IS-IS Flooding
              Scope Link State PDUs (LSPs)", RFC 7356,
              DOI 10.17487/RFC7356, September 2014.

Authors' Addresses

   Les Ginsberg
   Cisco Systems
   821 Alder Drive
   Milpitas, CA 95035
   USA

   Email: ginsberg@cisco.com


   Peter Psenak
   Cisco Systems
   Apollo Business Center Mlynske nivy 43
   Bratislava  821 09
   Slovakia

   Email: ppsenak@cisco.com


   Acee Lindem
   Cisco Systems
   301 Midenhall Way
   Cary, NC 27513
   US

   Email: acee@cisco.com


   Tony Przygienda
   Juniper
   1137 Innovation Way
   Sunnyvale, CA
   USA

   Email: prz@juniper.net