Networking Working Group                                     L. Ginsberg
Internet-Draft                                                 P. Psenak
Intended status: Informational                                 A. Lindem
Expires: May 7, 2020                                       Cisco Systems
                                                       November 04, 2019


                  IS-IS Flooding Scale Considerations
               draft-ginsberg-lsr-isis-flooding-scale-00

Abstract

   Link State PDU flooding rates in use are much slower than what modern
   networks can support.  The use of IS-IS at larger scale requires
   faster flooding rates to achieve desired convergence goals.  This
   document discusses issues associated with increasing flooding rates
   and some recommended practices which allow faster flooding rates to
   be used safely.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 7, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Historical Behavior
   3.  Flooding Rate and Convergence
     3.1.  Flow Control
     3.2.  Bandwidth Utilization
     3.3.  Packet Prioritization on Receive
   4.  Minimizing LSP Generation
   5.  Redundant Flooding
   6.  Use of Jumbo Frames
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   Link state IGPs such as Intermediate-System-to-Intermediate-System
   (IS-IS) depend upon having consistent Link State Databases (LSDB) on
   all Intermediate Systems (ISs) in the network in order to provide
   correct forwarding of data packets.  When topology changes occur,
   new/updated Link State PDUs (LSPs) are propagated network-wide.  The
   speed of propagation is a key contributor to convergence time.

   Historically, flooding rates have been conservative - on the order of
   10s of LSPs/second.  This derives from guidance in the base
   specification [ISO10589] and early deployments when both CPU speeds
   and interface speeds were much slower than they are today and the
   scale of an IS-IS area was smaller than it may be today.
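
   As a rough illustration of what a flooding rate on the order of tens
   of LSPs/second implies, the sketch below computes how long it takes
   to send a batch of LSPs to a single neighbor at a fixed rate.  The
   function name, the batch size, and the sample rates are assumptions
   chosen for illustration; they are not values defined by [ISO10589],
   and the calculation ignores retransmissions and propagation across
   multiple hops.

```python
def seconds_to_send(lsp_count: int, lsps_per_second: float) -> float:
    """Time to transmit lsp_count LSPs to one neighbor at a fixed rate."""
    if lsps_per_second <= 0:
        raise ValueError("flooding rate must be positive")
    return lsp_count / lsps_per_second


# Hypothetical batch size and rates, chosen only for illustration.
BATCH = 1000
for rate in (10, 33, 100, 1000):
    print(f"{rate:>5} LSPs/s -> {seconds_to_send(BATCH, rate):6.1f} s")
```

   The time grows linearly with the batch size and inversely with the
   rate, so at a fixed batch size the send time is set by the flooding
   rate alone.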

   As IS-IS is deployed in greater scale (larger number of nodes in an
   area and larger number of neighbors/node) the impact of the historic
   flooding rates becomes more significant.  Consider the bringup or
   failure of a node with 1000 neighbors.  This will result in a minimum
   of 1000 LSP updates.  At a typical LSP flooding rate used in many
   deployments today (33 LSPs/second) it would take 30+ seconds simply
   to send the updated LSPs to a given neighbor.  Depending on the
   diameter of the network, achieving a consistent LSDB on all nodes in
   the network could easily take a minute (or more).

   Increasing LSP flooding rate therefore becomes an essential element
   of supporting greater network scale.

   The remainder of this document discusses various aspects of protocol
   operation and how they are impacted by increased flooding rate.
   Where appropriate, best practices are defined which enhance an
   implementation's ability to support faster flooding rates.

2.  Historical Behavior

   The base specification for IS-IS [ISO10589] was first published in
   1992 and updated in 2002.  The update made no changes with regard to
   suggested timer values.  Convergence targets at the time were on the
   order of seconds and the timer values specified reflect that.  Here
   are some examples:

      minimumLSPGenerationInterval - This is the minimum time interval
         between generation of Link State PDUs.  A source Intermediate
         system shall wait at least this long before re-generating one
         of its own Link State PDUs.
         The recommended value was 30 seconds.

      minimumLSPTransmissionInterval - This is the amount of time an
         Intermediate system shall wait before further propagating
         another Link State PDU from the same source system.
         The recommended value was 5 seconds.

      partialSNPInterval - This is the amount of time between periodic
         action for transmission of Partial Sequence Number PDUs.
         It shall be less than minimumLSPTransmissionInterval.
         The recommended value was 2 seconds.

   Most relevant to a discussion of LSP flooding rate is the recommended
   interval between the transmission of two different LSPs on a given
   interface.

   For broadcast interfaces, [ISO10589] defined:

      minimumBroadcastLSPTransmissionInterval - the minimum interval
         between PDU arrivals which can be processed by the slowest
         Intermediate System on the LAN.
         The default value was defined as 33 milliseconds.
         NOTE: It was permitted to send multiple LSPs "back-to-back"
         as a burst, but this was limited to 10 LSPs in a one second
         period.

   Although this value was specific to LAN interfaces, this has commonly
   been applied by implementations to all interfaces though that was not
   the original intent of the base specification.  In fact
   Section 12.1.2.4.3 states:

      On point-to-point links the peak rate of arrival is limited only
      by the speed of the data link and the other traffic flowing on
      that link.

   Although modern implementations have not strictly adhered to the 33
   millisecond interval, it is commonplace for implementations to limit
   flooding rate to an order of magnitude similar to the 33 ms value.

   In the past 20 years, significant work on achieving faster
   convergence - more specifically sub-second convergence - has resulted
   in implementations modifying a number of the above timers in order to
   support faster signaling of topology changes.  For example,
   minimumLSPGenerationInterval has been modified to support millisecond
   intervals - often with a backoff algorithm applied to prevent LSP
   generation storms in the event of a series of rapid oscillations.

   However, flooding rate has not been fundamentally altered.

3.  Flooding Rate and Convergence

   Convergence involves a number of sequential operations.

   First the topology change needs to be detected.
   This is a local activity occurring only on the node or nodes
   directly connected to the topology change.  The directly connected
   node(s) then must advertise the topology change by updating their
   LSPs and flooding the changed LSPs.  Routers then must process the
   updated LSDB and recalculate paths to affected destinations.  The
   updated paths must then be installed in the forwarding plane.

   Only when all of the steps are completed on all nodes in the network
   has the network completed convergence.

   As the convergence requirement is consistency of LSDBs on all nodes
   in the network, it is fundamental to understand that the flooding
   process is optimal when LSDB changes are propagated throughout the
   network at a consistent rate.  Using different flooding rates on
   different interfaces results in longer periods of LSDB inconsistency.
   Timers defined to control the rate of flooding are NOT per interface
   - they are scoped by the IS-IS area/domain.

   Many implementations provide knobs to control the rate of LSP
   flooding on a per interface basis.  To the extent that this serves as
   a flow control mechanism, this may reduce the number of dropped LSPs
   during high activity bursts and thereby reduce the number of LSP
   retransmissions required.  As LSP retransmission timers are typically
   long (multiple seconds), this may result in shorter convergence times
   than if the LSP burst was uncontrolled.  But if the performance
   characteristics of routers in the network are such that some routers
   consistently accept and process fewer LSPs/second than other routers,
   convergence will be degraded.  Tuning LSP transmission timers on a
   per interface basis will never provide optimal convergence.
   Consistent flooding rates should be used on all interfaces.

3.1.  Flow Control

   In large scale deployments where an increased flooding rate is being
   used, it becomes more likely that a burst of LSPs may temporarily
   overwhelm a receiver.  Normal operation of the Update Process will
   recover from this, but it may well make sense to employ some form of
   flow control.  This will not serve to optimize convergence, but it
   can serve to reduce the number of LSP retransmissions.  As
   retransmissions are deliberately done at a slow rate, the result of
   flow control will be to provide a shorter recovery time from a
   transient condition which causes a node to be unable to handle the
   targeted rate of LSP transmission.  Inability to handle LSP reception
   at the targeted flooding rate should be viewed as an error condition
   which should be reported.  If this condition persists, it indicates
   that the network is provisioned in a way which does not support
   optimal convergence.  Steps need to be taken to resolve this issue.
   Such steps could involve upgrading the routers that demonstrate this
   condition consistently, altering the configuration on the problematic
   routers or altering the position of the problematic routers in the
   network so as to reduce the overall load on those routers, or
   reducing the LSP transmission rate network-wide.

   When flow control is necessary, it can be implemented in a
   straightforward manner based on current transmission queue length.
   Presented below is one possible algorithm for point-to-point
   interfaces.

   MaxLSPTx = maximum # LSPs transmitted/second/interface
   Umax = Maximum Unacknowledged LSP/Interface
   Usafe = Safe level of Unacknowledged LSP/Interface
   U(i) = # of unacknowledged LSPs previously transmitted/interface
   LSPTx(i) = max # LSPs transmitted/second for a given interface

   Initialize LSPTx(i) = MaxLSPTx for all interfaces

   On a given interface when U(i) >= Umax
     Generate a log
     Only retransmissions of unacknowledged LSPs are performed
     Temporarily set LSPTx(i) = MaxLSPTx/2

   For each second U(i) >= Usafe
     further reduce LSPTx(i) = LSPTx(i)/2

   When U(i) <= Usafe
     restore LSPTx(i) = MaxLSPTx
     New LSPs may be transmitted

3.2.  Bandwidth Utilization

   Routing protocol traffic has to share bandwidth on a link with other
   control traffic and data traffic.  During periods of instability,
   routing protocol traffic will increase, but it is still desirable
   that the maximum bandwidth consumption by routing protocol traffic be
   modest.  This needs to be considered when setting IS-IS flooding
   rates.

   If we assume a maximum size of 1492 bytes for an LSP, here are some
   rough estimates of bandwidth consumption at different flooding rates:

   +--------------+----------------+-------------+
   | LSPs/second  | 100 Mb Link    | 1 Gb Link   |
   +--------------+----------------+-------------+
   | 100          | 1.2 %          | 0.1 %       |
   +--------------+----------------+-------------+
   | 500          | 6.1 %          | 0.6 %       |
   +--------------+----------------+-------------+
   | 1000         | 12.1 %         | 1.2 %       |
   +--------------+----------------+-------------+

3.3.  Packet Prioritization on Receive

   There are three classes of PDUs sent by IS-IS:

   o  Hellos

   o  LSPs

   o  Complete Sequence Number PDUs (CSNPs) and Partial Sequence Number
      PDUs (PSNPs)

   Implementations today may prioritize the reception of Hellos over
   LSPs and SNPs in order to prevent a burst of LSP updates from
   triggering an adjacency timeout which in turn would require
   additional LSPs to be updated.

   SNPs serve to acknowledge or trigger the transmission of specified
   LSPs.  On a point-to-point link, PSNPs acknowledge the receipt of one
   or more LSPs.  Because PSNPs (like all IS-IS PDUs) use TLVs in the
   body, it is possible to acknowledge multiple LSPs using a single
   PSNP.  For this reason, [ISO10589] specifies a delay
   (partialSNPInterval) before sending a PSNP so that the number of
   PSNPs required to be sent is reduced.  On receipt of a PSNP, the set
   of LSPs acknowledged by that PSNP can be marked so that they do not
   need to be retransmitted.

   If a PSNP is dropped on receive, this has a significant impact as the
   set of LSPs advertised in the PSNP cannot be marked as acknowledged
   and this results in needless retransmissions which may further delay
   transmission of other LSPs which have yet to be transmitted.  It may
   also make it more likely that a receiver becomes overwhelmed by LSP
   transmissions.

   It is therefore recommended that implementations prioritize the
   receipt of SNPs over LSPs.

4.  Minimizing LSP Generation

   In IS-IS the unit of flooding is an LSP.  Each router may generate a
   set of LSPs at each supported level.  Each LSP in the set has an LSP
   number - which is a value from 0-N where N = 255 for the base
   protocol.  (N has been extended to 65535 by [RFC7356].)  Each LSP
   carries network information using defined Type/Length/Value (TLV)
   tuples.  For example, some TLVs carry neighbor information and some
   TLVs carry reachable prefix information.  [ISO10589] strongly
[ISO10589] strongly 323 recommends preserving the association of a given advertisement (such 324 as a neighbor) with a specific LSP whenever possible. This minimizes 325 the number of LSPs which need to be regenerated when a topology 326 change occurs. This recommendation becomes even more important as 327 the scale of the network increases. 329 Consider the following example; 331 Node A has 11 neighbors currently in the UP state and is advertising 332 them in three LSPs with content as follows: 334 A.00-00 contains the following advertisements 335 Neighbor 1 336 Neighbor 2 337 Neighbor 3 338 Neighbor 4 339 Neighbor 5 340 A.00-01 contains the following advertisements: 341 Neighbor 6 342 Neighbor 7 343 Neighbor 8 344 Neighbor 9 345 Neighbor 10 346 A.00-02 contains the following advertisements 347 Neighbor 11 349 Imagine that the adjacency to Neighbor 3 goes down. There are (at 350 least) two ways that A could update its LSPs. 352 Method 1: A removes the neighbor advertisement for neighbor 3 from 353 A.00-00 and sends an update for that LSP. LSPs #1 and #2 are 354 unchanged and so do not have to be flooded. 356 Method 2: A attempts to reduce the number of LSPs currently active 357 and updates the content as follows: 359 A.00-00 contains the following advertisements 360 Neighbor 1 361 Neighbor 2 362 Neighbor 4 363 Neighbor 5 364 Neighbor 6 365 A.00-01 contains the following advertisements: 366 Neighbor 7 367 Neighbor 8 368 Neighbor 9 369 Neighbor 10 370 Neighbor 11 371 A.00-02 becomes empty 372 Niode A now has to flood all three LSPs. LSPs #0 and #1 are 373 reflooded because their content has changed. LSP #2 is purged. 375 In a large scale network, the impact of using Method #2 becomes 376 significant and introduces conditions where a much larger number of 377 LSPs need to be flooded than is the case with Method #1. 379 In order to operate at scale, implementations need to follow the 380 guidance in [ISO10589] and use Method #1 whenever possible. 382 5. 
Redundant Flooding 384 Default operation of the Update Process is to flood on all 385 interfaces. In cases where a network is highly meshed, this can 386 result in a significant amount of redundant flooding. Nodes will 387 receive multiple copies of each updated LSP. 389 There are defined mechanisms which can greatly reduce the redundant 390 flooding. These include: 392 o Mesh Groups ( [RFC2973] ) 394 o Dynamic Flooding ( [I-D.ietf-lsr-dynamic-flooding] ) 396 6. Use of Jumbo Frames 398 The maximum size of an LSP (LSPBufferSize) is a parameter that needs 399 to be set consistently network wide. This is because IS-IS does not 400 support fragmentation of its PDUs - so in order for network wide 401 flooding of an LSP to be successful all routers must restrict their 402 LSP size to a size which all interfaces on which IS-IS operates can 403 support without fragmentation. 405 In networks where all interfaces on which IS-IS operates support 406 large frames, LSPBufferSize may be set to a larger value than the 407 default (1492). This allows more routing information to be encoded 408 in a single LSP, which means that fewer LSPs are required to make up 409 the LSDB and therefore the number of LSPs which need to be flooded 410 can be reduced in some scenarios (e.g., node or interface bringup). 412 7. IANA Considerations 414 This document requires no actions by IANA. 416 8. Security Considerations 418 Security concerns for IS-IS are addressed in [ISO10589, [RFC5304], 419 and [RFC5310]. 421 9. Acknowledgements 423 None at present.. 425 10. References 427 10.1. Normative References 429 [ISO10589] 430 International Organization for Standardization, 431 "Intermediate system to Intermediate system intra-domain 432 routeing information exchange protocol for use in 433 conjunction with the protocol for providing the 434 connectionless-mode Network Service (ISO 8473)", ISO/ 435 IEC 10589:2002, Second Edition, Nov 2002. 

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2973]  Balay, R., Katz, D., and J. Parker, "IS-IS Mesh Groups",
              RFC 2973, DOI 10.17487/RFC2973, October 2000.

   [RFC5304]  Li, T. and R. Atkinson, "IS-IS Cryptographic
              Authentication", RFC 5304, DOI 10.17487/RFC5304, October
              2008.

   [RFC5310]  Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R.,
              and M. Fanto, "IS-IS Generic Cryptographic
              Authentication", RFC 5310, DOI 10.17487/RFC5310, February
              2009.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

10.2.  Informative References

   [I-D.ietf-lsr-dynamic-flooding]
              Li, T., Psenak, P., Ginsberg, L., Chen, H., Przygienda,
              T., Cooper, D., Jalil, L., and S. Dontula, "Dynamic
              Flooding on Dense Graphs", draft-ietf-lsr-dynamic-
              flooding-03 (work in progress), June 2019.

   [RFC7356]  Ginsberg, L., Previdi, S., and Y. Yang, "IS-IS Flooding
              Scope Link State PDUs (LSPs)", RFC 7356,
              DOI 10.17487/RFC7356, September 2014.

Authors' Addresses

   Les Ginsberg
   Cisco Systems
   821 Alder Drive
   Milpitas, CA  95035
   USA

   Email: ginsberg@cisco.com


   Peter Psenak
   Cisco Systems
   Apollo Business Center Mlynske nivy 43
   Bratislava  821 09
   Slovakia

   Email: ppsenak@cisco.com


   Acee Lindem
   Cisco Systems
   301 Midenhall Way
   Cary, NC  27513
   US

   Email: acee@cisco.com