idnits 2.17.1 draft-ietf-lsr-dynamic-flooding-08.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (December 14, 2020) is 1222 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Possible downref: Non-RFC (?) normative reference: ref. 'ISO10589' Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Engineering Task Force T. Li, Ed. 3 Internet-Draft Arista Networks 4 Intended status: Standards Track P. Psenak, Ed. 5 Expires: June 17, 2021 L. Ginsberg 6 Cisco Systems, Inc. 7 H. Chen 8 Futurewei 9 T. Przygienda 10 Juniper Networks, Inc. 11 D. Cooper 12 CenturyLink 13 L. Jalil 14 Verizon 15 S. Dontula 16 ATT 17 G. Mishra 18 Verizon Inc. 19 December 14, 2020 21 Dynamic Flooding on Dense Graphs 22 draft-ietf-lsr-dynamic-flooding-08 24 Abstract 26 Routing with link state protocols in dense network topologies can 27 result in sub-optimal convergence times due to the overhead 28 associated with flooding. This can be addressed by decreasing the 29 flooding topology so that it is less dense. 31 This document discusses the problem in some depth and an 32 architectural solution. Specific protocol changes for IS-IS, OSPFv2, 33 and OSPFv3 are described in this document. 35 Status of This Memo 37 This Internet-Draft is submitted in full conformance with the 38 provisions of BCP 78 and BCP 79. 40 Internet-Drafts are working documents of the Internet Engineering 41 Task Force (IETF). Note that other groups may also distribute 42 working documents as Internet-Drafts. The list of current Internet- 43 Drafts is at https://datatracker.ietf.org/drafts/current/. 45 Internet-Drafts are draft documents valid for a maximum of six months 46 and may be updated, replaced, or obsoleted by other documents at any 47 time. It is inappropriate to use Internet-Drafts as reference 48 material or to cite them other than as "work in progress." 49 This Internet-Draft will expire on June 17, 2021. 51 Copyright Notice 53 Copyright (c) 2020 IETF Trust and the persons identified as the 54 document authors. All rights reserved. 56 This document is subject to BCP 78 and the IETF Trust's Legal 57 Provisions Relating to IETF Documents 58 (https://trustee.ietf.org/license-info) in effect on the date of 59 publication of this document. Please review these documents 60 carefully, as they describe your rights and restrictions with respect 61 to this document. Code Components extracted from this document must 62 include Simplified BSD License text as described in Section 4.e of 63 the Trust Legal Provisions and are provided without warranty as 64 described in the Simplified BSD License. 66 Table of Contents 68 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 69 1.1. Requirements Language . . . . . . . . . . . . . . . . . . 5 70 2. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 5 71 3. Solution Requirements . . . . . . . . . . . . . . . . . . . . 5 72 4. Dynamic Flooding . . . . . . . . . . . . . . . . . . . . . . 6 73 4.1. Applicability . . . . . . . . . . . . . . . . . . . . . . 7 74 4.2. Leader election . . . . . . . . . . . . . . . . . . . . . 8 75 4.3. Computing the Flooding Topology . . . . . . . . . . . . . 8 76 4.4. Topologies on Complete Bipartite Graphs . . . . . . . . . 9 77 4.4.1. A Minimal Flooding Topology . . . . . . . . . . . . . 9 78 4.4.2. Xia Topologies . . . . . . . . . . . . . . . . . . . 10 79 4.4.3. Optimization . . . . . . . . . . . . . . . . . . . . 11 80 4.5. Encoding the Flooding Topology . . . . . . . . . . . . . 11 81 4.6. Advertising the Local Edges Enabled for Flooding . . . . 11 82 5. Protocol Elements . . . . . . . . . . . . . . . . . . . . . . 12 83 5.1. IS-IS TLVs . . . . . . . . . . . . . . . . . . . . . . . 12 84 5.1.1. IS-IS Area Leader Sub-TLV . . . . . . . . . . . . . . 12 85 5.1.2. IS-IS Dynamic Flooding Sub-TLV . . . . . . . . . . . 13 86 5.1.3. IS-IS Area Node IDs TLV . . . . . . . . . . . . . . . 14 87 5.1.4. IS-IS Flooding Path TLV . . . . . . . . . . . . . . . 15 88 5.1.5. IS-IS Flooding Request TLV . . . . . . . . . . . . . 16 89 5.1.6. IS-IS LEEF Advertisement . . . . . . . . . . . . . . 18 90 5.2. OSPF LSAs and TLVs . . . . . . . . . . . . . . . . . . . 18 91 5.2.1. OSPF Area Leader Sub-TLV . . . . . . . . . . . . . . 18 92 5.2.2. OSPF Dynamic Flooding Sub-TLV . . . . . . . . . . . . 19 93 5.2.3. OSPFv2 Dynamic Flooding Opaque LSA . . . . . . . . . 19 94 5.2.4. OSPFv3 Dynamic Flooding LSA . . . . . . . . . . . . . 21 95 5.2.5. OSPF Area Router ID TLVs . . . . . . . . . . . . . . 22 96 5.2.5.1. OSPFv2 Area Router ID TLV . . . . . . . . . . . . 22 97 5.2.5.2. OSPFv3 Area Router ID TLV . . . . . . . . . . . . 24 98 5.2.6. OSPF Flooding Path TLV . . . . . . . . . . . . . . . 26 99 5.2.7. OSPF Flooding Request Bit . . . . . . . . . . . . . . 27 100 5.2.8. OSPF LEEF Advertisement . . . . . . . . . . . . . . . 28 101 6. Behavioral Specification . . . . . . . . . . . . . . . . . . 29 102 6.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 29 103 6.2. Flooding Topology . . . . . . . . . . . . . . . . . . . . 29 104 6.3. Leader Election . . . . . . . . . . . . . . . . . . . . . 29 105 6.4. Area Leader Responsibilities . . . . . . . . . . . . . . 30 106 6.5. Distributed Flooding Topology Calculation . . . . . . . . 30 107 6.6. Use of LANs in the Flooding Topology . . . . . . . . . . 31 108 6.6.1. Use of LANs in Centralized mode . . . . . . . . . . . 31 109 6.6.2. Use of LANs in Distributed Mode . . . . . . . . . . . 31 110 6.6.2.1. Partial flooding on a LAN in IS-IS . . . . . . . 31 111 6.6.2.2. Partial Flooding on a LAN in OSPF . . . . . . . . 32 112 6.7. Flooding Behavior . . . . . . . . . . . . . . . . . . . . 32 113 6.8. Treatment of Topology Events . . . . . . . . . . . . . . 33 114 6.8.1. Temporary Addition of Link to Flooding Topology . . . 33 115 6.8.2. Local Link Addition . . . . . . . . . . . . . . . . . 34 116 6.8.3. Node Addition . . . . . . . . . . . . . . . . . . . . 35 117 6.8.4. Failures of Link Not on Flooding Topology . . . . . . 35 118 6.8.5. Failures of Link On the Flooding Topology . . . . . . 35 119 6.8.6. Node Deletion . . . . . . . . . . . . . . . . . . . . 36 120 6.8.7. Local Link Addition to the Flooding Topology . . . . 36 121 6.8.8. Local Link Deletion from the Flooding Topology . . . 36 122 6.8.9. Treatment of Disconnected Adjacent Nodes . . . . . . 37 123 6.8.10. Failure of the Area Leader . . . . . . . . . . . . . 37 124 6.8.11. Recovery from Multiple Failures . . . . . . . . . . . 37 125 6.8.12. Rate Limiting Temporary Flooding . . . . . . . . . . 38 126 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 38 127 7.1. IS-IS . . . . . . . . . . . . . . . . . . . . . . . . . . 38 128 7.2. OSPF . . . . . . . . . . . . . . . . . . . . . . . . . . 39 129 7.2.1. OSPF Dynamic Flooding LSA TLVs Registry . . . . . . . 41 130 7.2.2. OSPF Link Attributes Sub-TLV Bit Values Registry . . 41 131 7.3. IGP . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 132 8. Security Considerations . . . . . . . . . . . . . . . . . . . 42 133 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 43 134 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 43 135 10.1. Normative References . . . . . . . . . . . . . . . . . . 43 136 10.2. Informative References . . . . . . . . . . . . . . . . . 45 137 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 45 139 1. Introduction 141 In recent years, there has been increased focus on how to address the 142 dynamic routing of networks that have a bipartite (a.k.a. spine-leaf 143 or leaf-spine), Clos [Clos], or Fat Tree [Leiserson] topology. 144 Conventional Interior Gateway Protocols (IGPs, i.e., IS-IS 146 [ISO10589], OSPFv2 [RFC2328], and OSPFv3 [RFC5340]) under-perform, 147 redundantly flooding information throughout the dense topology, 148 leading to overloaded control plane inputs and thereby creating 149 operational issues. For practical considerations, network architects 150 have resorted to applying unconventional techniques to address the 151 problem, e.g., applying BGP in the data center [RFC7938]. However it 152 is very clear that using an Exterior Gateway Protocol as an IGP is 153 sub-optimal, if only due to the configuration overhead. 155 The primary issue that is demonstrated when conventional mechanisms 156 are applied is the poor reaction of the network to topology changes. 157 Normal link state routing protocols rely on a flooding algorithm for 158 state distribution within an area. In a dense topology, this 159 flooding algorithm is highly redundant, resulting in unnecessary 160 overhead. Each node in the topology receives each link state update 161 multiple times. Ultimately, all of the redundant copies will be 162 discarded, but only after they have reached the control plane and 163 been processed. This creates issues because significant link state 164 database updates can become queued behind many redundant copies of 165 another update. This delays convergence as the link state database 166 does not stabilize promptly. 168 In a real world implementation, the packet queues leading to the 169 control plane are necessarily of finite size, so if the flooding rate 170 exceeds the update processing rate for long enough, the control plane 171 will be obligated to drop incoming updates. If these lost updates 172 are of significance, this will further delay stabilization of the 173 link state database and the convergence of the network. 175 This is not a new problem. Historically, when routing protocols have 176 been deployed in networks where the underlying topology is a complete 177 graph, there have been similar issues. This was more common when the 178 underlying link layer fabric presented the network layer with a full 179 mesh of virtual connections. This was addressed by reducing the 180 flooding topology through IS-IS Mesh Groups [RFC2973], but this 181 approach requires careful configuration of the flooding topology. 183 Thus, the root problem is not limited to massively scalable data 184 centers. It exists with any dense topology at scale. 186 This problem is not entirely surprising. Link state routing 187 protocols were conceived when links were very expensive and 188 topologies were sparse. The fact that those same designs are sub- 189 optimal in a dense topology should not come as a huge surprise. The 190 fundamental premise that was addressed by the original designs was an 191 environment of extreme cost and scarcity. Technology has progressed 192 to the point where links are cheap and common. This represents a 193 complete reversal in the economic fundamentals of network 194 engineering. The original designs are to be commended for continuing 195 to provide correct operation to this point, and optimizations for 196 operation in today's environment are to be expected. 198 1.1. Requirements Language 200 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 201 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 202 document are to be interpreted as described in RFC 2119 [RFC2119]. 204 2. Problem Statement 206 In a dense topology, the flooding algorithm that is the heart of 207 conventional link state routing protocols causes a great deal of 208 redundant messaging. This is exacerbated by scale. While the 209 protocol can survive this combination, the redundant messaging is 210 unnecessary overhead and delays convergence. Thus, the problem is to 211 provide routing in dense, scalable topologies with rapid convergence. 213 3. Solution Requirements 215 A solution to this problem must then meet the following requirements: 217 Requirement 1 Provide a dynamic routing solution. Reachability 218 must be restored after any topology change. 220 Requirement 2 Provide a significant improvement in convergence. 222 Requirement 3 The solution should address a variety of dense 223 topologies. Just addressing a complete bipartite topology such 224 as K5,8 is insufficient. Multi-stage Clos topologies must also 225 be addressed, as well as topologies that are slight variants. 226 Addressing complete graphs is a good demonstration of generality. 228 Requirement 4 There must be no single point of failure. The loss 229 of any link or node should not unduly hinder convergence. 231 Requirement 5 Dense topologies are subgraphs of much larger 232 topologies. Operational efficiency requires that the dense 233 subgraph not operate in a radically different manner than the 234 remainder of the topology. While some operational differences 235 are permissible, they should be minimized. Changes to nodes 236 outside of the dense subgraph are not acceptable. These 237 situations occur when massively scaled data centers are part of 238 an overall larger wide-area network. Having a second protocol 239 operating just on this subgraph would add much more complexity at 240 the edge of the subgraph where the two protocols would have to 241 inter-operate. 243 4. Dynamic Flooding 245 We have observed that the combination of the dense topology and 246 flooding on the physical topology in a scalable network is sub- 247 optimal. However, if we decouple the flooding topology from the 248 physical topology and only flood on a greatly reduced portion of that 249 topology, we can have efficient flooding and retain all of the 250 resilience of existing protocols. A node that supports flooding on 251 the decoupled flooding topology is said to support dynamic flooding. 253 In this idea, the flooding topology is computed within an IGP area 254 with the dense topology either centrally on an elected node, termed 255 the Area Leader, or in a distributed manner on all nodes that are 256 supporting Dynamic Flooding. If the flooding topology is computed 257 centrally, it is encoded into and distributed as part of the normal 258 link state database. We call this the centralized mode of operation. 259 If the flooding topology is computed in a distributed fashion, we 260 call this the distributed mode of operation. Nodes within such an 261 IGP area would only flood on the flooding topology. On links outside 262 of the normal flooding topology, normal database synchronization 263 mechanisms (i.e., OSPF database exchange, IS-IS CSNPs) would apply, 264 but flooding may not. Details are described in Section 6. New link 265 state information that arrives from outside of the flooding topology 266 suggests that the sender has a different or no flooding topology 267 information and that the link state update should be flooded on the 268 flooding topology as well. 270 The flooding topology covers the full set of nodes within the area, 271 but excludes some of the links that standard flooding would employ. 273 Since the flooding topology is computed prior to topology changes, it 274 does not factor into the convergence time and can be done when the 275 topology is stable. The speed of the computation and its 276 distribution, in the case of a centralized mode, is not a significant 277 issue. 279 If a node does not have any flooding topology information when it 280 receives new link state information, it should flood according to 281 standard flooding rules. This situation will occur when the dense 282 topology is first established, but is unlikely to recur. 284 When centralized mode is used and if, during a transient, there are 285 multiple flooding topologies being advertised, then nodes should 286 flood link state updates on all of the flooding topologies. Each 287 node should locally evaluate the election of the Area Leader for the 288 IGP area and first flood on its flooding topology. The rationale 289 behind this is straightforward: if there is a transient and there has 290 been a recent change in Area Leader, then propagating topology 291 information promptly along the most likely flooding topology should 292 be the priority. 294 During transients, it is possible that loops will form in the 295 flooding topology. This is not problematic, as the legacy flooding 296 rules would cause duplicate updates to be ignored. Similarly, during 297 transients, it is possible that the flooding topology may become 298 disconnected. Section 6.8.11 discusses how such conditions are 299 handled. 301 4.1. Applicability 303 In a complete graph, this approach is appealing because it 304 drastically decreases the flooding topology without the manual 305 configuration of mesh groups. By controlling the diameter of the 306 flooding topology, as well as the maximum degree node in the flooding 307 topology, convergence time goals can be met and the stability of the 308 control plane can be assured. 310 Similarly, in a massively scaled data center, where there are many 311 opportunities for redundant flooding, this mechanism ensures that 312 flooding is redundant, with each leaf and spine well connected, while 313 ensuring that no update need make too many hops and that no node 314 shares an undue portion of the flooding effort. 316 In a network where only a portion of the nodes support Dynamic 317 Flooding, the remaining nodes will continue to perform standard 318 flooding. This is not an issue for correctness, as no node can 319 become isolated. 321 Flooding that is initiated by nodes that support Dynamic Flooding 322 will remain within the flooding topology until it reaches a legacy 323 node, which will resume legacy flooding. Standard flooding will be 324 bounded by nodes supporting Dynamic Flooding, which can help limit 325 the propagation of unnecessary flooding. Whether or not the network 326 can remain stable in this condition is unknown and may be very 327 dependent on the number and location of the nodes that support 328 Dynamic Flooding. 330 During incremental deployment of dynamic flooding an area will 331 consist of one or more sets of connected nodes that support dynamic 332 flooding and one or more sets of connected nodes that do not, i.e., 333 nodes that support standard flooding. The flooding topology is the 334 union of these sets of nodes. Each set of nodes that does not 335 support dynamic flooding needs to be part of the flooding topology 336 and such a set of nodes may provide connectivity between two or more 337 sets of nodes that support dynamic flooding. 339 4.2. Leader election 341 A single node within the dense topology is elected as an Area Leader. 343 A generalization of the mechanisms used in existing Designated Router 344 (OSPF) or Designated Intermediate-System (IS-IS) elections suffices. 345 The elected node is known as the Area Leader. 347 In the case of centralized mode, the Area Leader is responsible for 348 computing and distributing the flooding topology. When a new Area 349 Leader is elected and has distributed new flooding topology 350 information, then any prior Area Leaders should withdraw any of their 351 flooding topology information from their link state database entries. 353 In the case of distributed mode, the distributed algorithm advertised 354 by the Area Leader MUST be used by all nodes that participate in 355 Dynamic Flooding. 357 Not every node needs to be a candidate to be Area Leader within an 358 area, as a single candidate is sufficient for correct operation. For 359 redundancy, however, it is strongly RECOMMENDED that there be 360 multiple candidates. 362 4.3. Computing the Flooding Topology 364 There is a great deal of flexibility in how the flooding topology may 365 be computed. For resilience, it needs to at least contain a cycle of 366 all nodes in the dense subgraph. However, additional links could be 367 added to decrease the convergence time. The trade-off between the 368 density of the flooding topology and the convergence time is a matter 369 for further study. The exact algorithm for computing the flooding 370 topology in the case of the centralized computation need not be 371 standardized, as it is not an interoperability issue. Only the 372 encoding of the result needs to be documented. In the case of 373 distributed mode, all nodes in the IGP area need to use the same 374 algorithm to compute the flooding topology. It is possible to use 375 private algorithms to compute flooding topology, so long as all nodes 376 in the IGP area use the same algorithm. 378 While the flooding topology should be a covering cycle, it need not 379 be a Hamiltonian cycle where each node appears only once. In fact, 380 in many relevant topologies this will not be possible e.g., K5,8. 381 This is fortunate, as computing a Hamiltonian cycle is known to be 382 NP-complete. 384 A simple algorithm to compute the topology for a complete bipartite 385 graph is to simply select unvisited nodes on each side of the graph 386 until both sides are completely visited. If the number of nodes on 387 each side of the graph are unequal, then revisiting nodes on the less 388 populated side of the graph will be inevitable. This algorithm can 389 run in O(N) time, so is quite efficient. 391 While a simple cycle is adequate for correctness and resiliency, it 392 may not be optimal for convergence. At scale, a cycle may have a 393 diameter that is half the number of nodes in the graph. This could 394 cause an undue delay in link state update propagation. Therefore it 395 may be useful to have a bound on the diameter of the flooding 396 topology. Introducing more links into the flooding topology would 397 reduce the diameter, but at the trade-off of possibly adding 398 redundant messaging. The optimal trade-off between convergence time 399 and graph diameter is for further study. 401 Similarly, if additional redundancy is added to the flooding 402 topology, specific nodes in that topology may end up with a very high 403 degree. This could result in overloading the control plane of those 404 nodes, resulting in poor convergence. Thus, it may be optimal to 405 have an upper bound on the degree of nodes in the flooding topology. 406 Again, the optimal trade-off between graph diameter, node degree, and 407 convergence time, and topology computation time is for further study. 409 If the leader chooses to include a multi-node broadcast LAN segment 410 as part of the flooding topology, all of the connectivity to that LAN 411 segment should be included as well. Once updates are flooded onto 412 the LAN, they will be received by every attached node. 414 4.4. Topologies on Complete Bipartite Graphs 416 Complete bipartite graph topologies have become popular for data 417 center applications and are commonly called leaf-spine or spine-leaf 418 topologies. In this section, we discuss some flooding topologies 419 that are of particular interest in these networks. 421 4.4.1. A Minimal Flooding Topology 423 We define a Minimal Flooding Topology on a complete bipartite graph 424 as one in which the topology is connected and each node has at least 425 degree two. This is of interest because it guarantees that the 426 flooding topology has no single points of failure. 428 In practice, this implies that every leaf node in the flooding 429 topology will have a degree of two. As there are usually more leaves 430 than spines, the degree of the spines will be higher, but the load on 431 the individual spines can be evenly distributed. 433 This type of flooding topology is also of interest because it scales 434 well. As the number of leaves increases, we can construct flooding 435 topologies that perform well. Specifically, for n spines and m 436 leaves, if m >= n(n/2-1), then there is a flooding topology that has 437 a diameter of four. 439 4.4.2. Xia Topologies 441 We define a Xia Topology on a complete bipartite graph as one in 442 which all spine nodes are bi-connected through leaves with degree 443 two, but the remaining leaves all have degree one and are evenly 444 distributed across the spines. 446 Constructively, we can create a Xia topology by iterating through the 447 spines. Each spine can be connected to the next spine by selecting 448 any unused leaf. Since leaves are connected to all spines, all 449 leaves will have a connection to both the first and second spine and 450 we can therefore choose any leaf without loss of generality. 451 Continuing this iteration across all of the spines, selecting a new 452 leaf at each iteration, will result in a path that connects all 453 spines. Adding one more leaf between the last and first spine will 454 produce a cycle of n spines and n leaves. 456 At this point, m-n leaves remain unconnected. These can be 457 distributed evenly across the remaining spines, connected by a single 458 link. 460 Xia topologies represent a compromise that trades off increased risk 461 and decreased performance for lower flooding amplification. Xia 462 topologies will have a larger diameter. For m spines, the diameter 463 will be m + 2. 465 In a Xia topology, some leaves are singly connected. This represents 466 a risk in that in some failures, convergence may be delayed. 467 However, there may be some alternate behaviors that can be employed 468 to mitigate these risks. If a leaf node sees that its single link on 469 the flooding topology has failed, it can compensate by performing a 470 database synchronization check with a different spine. Similarly, if 471 a leaf determines that its connected spine on the flooding topology 472 has failed, it can compensate by performing a database 473 synchronization check with a different spine. In both of these 474 cases, the synchronization check is intended to ameliorate any delays 475 in link state propagation due to the fragmentation of the flooding 476 topology. 478 The benefit of this topology is that flooding load is easily 479 understood. Each node in the spine cycle will never receive an 480 update more than twice. For m leaves and n spines, a spine never 481 transmits more than (m/n +1) updates. 483 4.4.3. Optimization 485 If two nodes are adjacent on the flooding topology and there are a 486 set of parallel links between them, then any given update MUST be 487 flooded over a single one of those links. Selection of the specific 488 link is implementation specific. 490 4.5. Encoding the Flooding Topology 492 There are a variety of ways that the flooding topology could be 493 encoded efficiently. If the topology was only a cycle, a simple list 494 of the nodes in the topology would suffice. However, this is 495 insufficiently flexible as it would require a slightly different 496 encoding scheme as soon as a single additional link is added. 497 Instead, we choose to encode the flooding topology as a set of 498 intersecting paths, where each path is a set of connected edges. 500 Advertisement of the flooding topology includes support for multi- 501 access LANs. When a LAN is included in the flooding topology, all 502 edges between the LAN and nodes connected to the LAN are assumed to 503 be part of the flooding topology. In order to reduce the size of the 504 flooding topology advertisement, explicit advertisement of these 505 edges is optional. Note that this may result in the possibility of 506 "hidden nodes" existing which are actually part of the flooding 507 topology but which are not explicitly mentioned in the flooding 508 topology advertisements. These hidden nodes can be found by 509 examination of the Link State database where connectivity between a 510 LAN and nodes connected to the LAN is fully specified. 512 Note that while all nodes MUST be part of the advertised flooding 513 topology not all multi-access LANs need to be included. Only those 514 LANs which are part of the flooding topology need to be included in 515 the advertised flooding topology. 517 Other encodings are certainly possible. We have attempted to make a 518 useful trade off between simplicity, generality, and space. 520 4.6. Advertising the Local Edges Enabled for Flooding 522 Correct operation of the flooding topology requires that all nodes 523 which participate in the flooding topology choose local links for 524 flooding which are consistent with the calculated flooding topology. 525 Failure to do so could result in unexpected partition of the flooding 526 topology and/or sub-optimal flooding reduction. As an aid to 527 diagnosing problems when dynamic flooding is in use, this document 528 defines a means of advertising what local edges are enabled for 529 flooding (LEEF). The protocol specific encodings are defined in 530 Sections 5.1.6 and 5.2.8. 532 The following guidelines apply: 534 Advertisement of LEEFs is optional. 536 As the flooding topology is defined by edges (not by links), in 537 cases where parallel adjacencies to the same neighbor exist, the 538 advertisement SHOULD indicate that all such links have been 539 enabled. 541 LEEF advertisements MUST NOT include edges enabled for temporary 542 flooding (Section 6.7). 544 LEEF advertisements MUST NOT be used either when calculating a 545 flooding topology or when determining what links to add 546 temporarily to the flooding topology when the flooding topology is 547 temporarily partitioned. 549 5. Protocol Elements 551 5.1. IS-IS TLVs 553 The following TLVs/sub-TLVs are added to IS-IS: 555 1. A sub-TLV that an IS may inject into its LSP to indicate its 556 preference for becoming Area Leader. 558 2. A sub-TLV that an IS may inject into its LSP to indicate that it 559 supports Dynamic Flooding and the algorithms that it supports for 560 distributed mode, if any. 562 3. A TLV to carry the list of system IDs that compromise the 563 flooding topology for the area. 565 4. A TLV to carry a path which is part of the flooding topology 567 5. A TLV that requests flooding from the adjacent node 569 5.1.1. IS-IS Area Leader Sub-TLV 571 The Area Leader Sub-TLV allows a system to: 573 1. Indicate its eligibility and priority for becoming Area Leader. 575 2. Indicate whether centralized or distributed mode is to be used to 576 compute the flooding topology in the area. 578 3. Indicate the algorithm identifier for the algorithm that is used 579 to compute the flooding topology in distributed mode. 581 Intermediate Systems (nodes) that are not advertising this Sub-TLV 582 are not eligible to become Area Leader. 584 The Area Leader is the node with the numerically highest Area Leader 585 priority in the area. In the event of ties, the node with the 586 numerically highest system ID is the Area Leader. Due to transients 587 during database flooding, different nodes may not agree on the Area 588 Leader. 590 The Area Leader Sub-TLV is advertised as a Sub-TLV of the IS-IS 591 Router Capability TLV-242 that is defined in [RFC7981] and has the 592 following format: 594 0 1 2 3 595 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 596 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 597 | Type | Length | Priority | Algorithm | 598 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 600 Type: TBD1 602 Length: 2 604 Priority: 0-255, unsigned integer 606 Algorithm: a numeric identifier in the range 0-255 that identifies 607 the algorithm used to calculate the flooding topology. The 608 following values are defined: 610 0: Centralized computation by the Area Leader. 612 1-127: Standardized distributed algorithms. Individual values 613 are are to be assigned according to the "Specification 614 Required" policy defined in [RFC8126] (see Section 7.3). 616 128-254: Private distributed algorithms. Individual values are 617 are to be assigned according to the "Private Use" policy 618 defined in [RFC8126] (see Section 7.3). 620 255: Reserved 622 5.1.2. IS-IS Dynamic Flooding Sub-TLV 624 The Dynamic Flooding Sub-TLV allows a system to: 626 1. Indicate that it supports Dynamic Flooding. This is indicated by 627 the advertisement of this Sub-TLV. 629 2. Indicate the set of algorithms that it supports for distributed 630 mode, if any. 632 In incremental deployments, understanding which nodes support Dynamic 633 Flooding can be used to optimize the flooding topology. In 634 distributed mode, knowing the capabilities of the nodes can allow the 635 Area Leader to select the optimal algorithm. 637 The Dynamic Flooding Sub-TLV is advertised as a Sub-TLV of the IS-IS 638 Router Capability TLV (242) [RFC7981] and has the following format: 640 0 1 2 3 641 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 642 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 643 | Type | Length | Algorithm... | 644 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 646 Type: TBD7 648 Length: 0-255; number of Algorithms 650 Algorithm: zero or more numeric identifiers in the range 0-255 651 that identifies the algorithm used to calculate the flooding 652 topology, as described in Section 5.1.1. 654 5.1.3. IS-IS Area Node IDs TLV 656 The IS-IS Area Node IDs TLV is only used in centralized mode. 658 The Area Node IDs TLV is used by the Area Leader to enumerate the 659 Node IDs (System ID + pseudo-node ID) that it has used in computing 660 the area flooding topology. Conceptually, the Area Leader creates a 661 list of node IDs for all nodes in the area (including pseudo-nodes 662 for all LANs in the topology), assigning indices to each node, 663 starting with index 0. 665 Because the space in a single TLV is limited, more than one TLV may 666 be required to encode all of the node IDs in the area. This TLV may 667 be present in multiple LSPs. 669 The format of the Area Node IDs TLV is: 671 0 1 2 3 672 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 673 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 674 | Type | Length | Starting Index | 675 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 676 |L| Reserved | Node IDs ... 677 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 678 Node IDs continued .... 679 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 681 Type: TBD2 683 Length: 3 + ((System ID Length + 1) * (number of node IDs)) 685 Starting index: The index of the first node ID that appears in 686 this TLV. 688 L (Last): This bit is set if the index of the last node ID that 689 appears in this TLV is equal to the last index in the full list of 690 node IDs for the area. 692 Node IDs: A concatenated list of node IDs for the area 694 If there are multiple IS-IS Area Node IDs TLVs with the L bit set 695 advertised by the same node, the TLV which specifies the smaller 696 maximum index is used and the other TLV(s) with L bit set are 697 ignored. TLVs which specify node IDs with indices greater than that 698 specified by the TLV with the L bit set are also ignored. 700 5.1.4. IS-IS Flooding Path TLV 702 IS-IS Flooding Path TLV is only used in centralized mode. 704 The Flooding Path TLV is used to denote a path in the flooding 705 topology. The goal is an efficient encoding of the links of the 706 topology. A single link is a simple case of a path that only covers 707 two nodes. A connected path may be described as a sequence of 708 indices: (I1, I2, I3, ...), denoting a link from the system with 709 index 1 to the system with index 2, a link from the system with index 710 2 to the system with index 3, and so on. 712 If a path exceeds the size that can be stored in a single TLV, then 713 the path may be distributed across multiple TLVs by the replication 714 of a single system index. 716 Complex topologies that are not a single path can be described using 717 multiple TLVs. 719 The Flooding Path TLV contains a list of system indices relative to 720 the systems advertised through the Area Node IDs TLV. At least 2 721 indices must be included in the TLV. Due to the length restriction 722 of TLVs, this TLV can contain at most 126 system indices. 724 The Flooding Path TLV has the format: 726 0 1 2 3 727 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 728 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 729 | Type | Length | Starting Index | 730 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 731 | Index 2 | Additional indices ... 732 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 734 Type: TBD3 736 Length: 2 * (number of indices in the path) 738 Starting index: The index of the first system in the path. 740 Index 2: The index of the next system in the path. 742 Additional indices (optional): A sequence of additional indices to 743 systems along the path. 745 5.1.5. IS-IS Flooding Request TLV 747 The Flooding Request TLV allows a system to request an adjacent node 748 to enable flooding towards it on a specific link in the case where 749 the connection to adjacent node is not part of the existing flooding 750 topology. 752 Nodes that support Dynamic Flooding MAY include the Flooding Request 753 TLV in its IIH PDUs. 755 The Flooding Request TLV has the format: 757 0 1 2 3 758 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 759 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 760 | Type | Length | Levels |R| Scope | 761 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 762 |R| ... | 763 -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 765 Type: TBD9 766 Length: 1 + number of advertised Flooding Scopes 768 Levels - the level(s) for which flooding is requested. Levels are 769 encoded as the circuit type specified in IS-IS [ISO10589] 771 R bit: MUST be 0 and is ignored on receipt. 773 Scope: Flooding Scope for which the flooding is requested as 774 defined by LSP Flooding Scope Identifier Registry defined by 775 [RFC7356]. Inclusion of flooding scopes is optional and is only 776 necessary if [RFC7356] is supported. Multiple flooding scopes MAY 777 be included. 779 Circuit Flooding Scope MUST NOT be sent in the Flooding Request TLV 780 and MUST be ignored if received. 782 When the TLV is received in a level specific LAN-Hello PDU (L1-LAN- 783 IIH or L2-LAN-IIH) only levels which match the PDU type are valid. 784 Levels which do not match the PDU type MUST be ignored on receipt. 786 When the TLV is received in a Point-to-Point Hello (P2P-IIH) only 787 levels which are supported by the established adjacency are valid. 788 Levels which are not supported by the adjacency MUST be ignored on 789 receipt. 791 If flooding was disabled on the received link due to Dynamic 792 Flooding, then flooding MUST be temporarily enabled over the link for 793 the specified Circuit Type(s) and Flooding Scope(s) received in the 794 in the Flooding Request TLV. Flooding MUST be enabled until the 795 Circuit Type or Flooding Scope is no longer advertised in the 796 Flooding Request TLV or the TLV no longer appears in IIH PDUs 797 received on the link. 799 When the flooding is temporarily enabled on the link for any Circuit 800 Type or Flooding Scope due to received Flooding Request TLV, the 801 receiver MUST perform standard database synchronization for the 802 corresponding Circuit Type(s) and Flooding Scope(s) on the link. In 803 the case of IS-IS, this results in setting SRM bit for all related 804 LSPs on the link and sending CSNPs. 806 So long as the Flooding Request TLV is being received flooding MUST 807 NOT be disabled for any of the Circuit Types or Flooding Scopes 808 present in the Flooding Request TLV even if the connection between 809 the neighbors is removed from the flooding topology. Flooding for 810 such Circuit Types or Flooding Scopes MUST continue on the link and 811 be considered as temporarily enabled. 813 5.1.6. IS-IS LEEF Advertisement 815 In support of advertising which edges are currently enabled in the 816 flooding topology, an implementation MAY indicate that a link is part 817 of the flooding topology by advertising a bit value in the Link 818 Attributes sub-TLV defined by [RFC5029]. 820 The following bit value is defined by this document: 822 Local Edge Enabled for Flooding (LEEF) - suggested value 4 (to be 823 assigned by IANA) 825 5.2. OSPF LSAs and TLVs 827 This section defines new LSAs and TLVs for both OSPFv2 and OSPFv3. 829 Following objects are added: 831 1. A TLV that is used to advertise the preference for becoming Area 832 Leader. 834 2. A TLV that is used to indicate the support for Dynamic Flooding 835 and the algorithms that the advertising node supports for 836 distributed mode, if any. 838 3. OSPFv2 Opaque LSA and OSPFv3 LSA to advertise the flooding 839 topology for centralized mode. 841 4. A TLV to carry the list of Router IDs that comprise the flooding 842 topology for the area. 844 5. A TLV to carry a path which is part of the flooding topology. 846 6. The bit in the LLS Type 1 Extended Options and Flags requests 847 flooding from the adjacent node. 849 5.2.1. OSPF Area Leader Sub-TLV 851 The usage of the OSPF Area Leader Sub-TLV is identical to IS-IS and 852 is described in Section 5.1.1. 854 The OSPF Area Leader Sub-TLV is used by both OSPFv2 and OSPFv3. 856 The OSPF Area Leader Sub-TLV is advertised as a top-level TLV of the 857 RI LSA that is defined in [RFC7770] and has the following format: 859 0 1 2 3 860 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 861 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 862 | Type | Length | 863 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 864 | Priority | Algorithm | Reserved | 865 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 867 Type: TBD4 869 Length: 4 octets 871 Priority: 0-255, unsigned integer 873 Algorithm: as defined in Section 5.1.1. 875 5.2.2. OSPF Dynamic Flooding Sub-TLV 877 The usage of the OSPF Dynamic Flooding Sub-TLV is identical to IS-IS 878 and is described in Section 5.1.2. 880 The OSPF Dynamic Flooding Sub-TLV is used by both OSPFv2 and OSPFv3. 882 The OSPF Dynamic Flooding Sub-TLV is advertised as a top-level TLV of 883 the RI LSA that is defined in [RFC7770] and has the following format: 885 0 1 2 3 886 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 887 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 888 | Type | Length | 889 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 890 | Algorithm ... | | 891 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 893 Type: TBD8 895 Length: number of Algorithms 897 Algorithm: as defined in Section 5.1.1. 899 5.2.3. OSPFv2 Dynamic Flooding Opaque LSA 901 The OSPFv2 Dynamic Flooding Opaque LSA is only used in centralized 902 mode. 904 The OSPFv2 Dynamic Flooding Opaque LSA is used to advertise 905 additional data related to the dynamic flooding in OSPFv2. OSPFv2 906 Opaque LSAs are described in [RFC5250]. 908 Multiple OSPFv2 Dynamic Flooding Opaque LSAs can be advertised by an 909 OSPFv2 router. The flooding scope of the OSPFv2 Dynamic Flooding 910 Opaque LSA is area-local. 912 The format of the OSPFv2 Dynamic Flooding Opaque LSA is as follows: 914 0 1 2 3 915 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 916 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 917 | LS age | Options | LS Type | 918 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 919 | TBD5 | Opaque ID | 920 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 921 | Advertising Router | 922 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 923 | LS sequence number | 924 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 925 | LS checksum | Length | 926 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 927 | | 928 +- TLVs -+ 929 | ... | 931 OSPFv2 Dynamic Flooding Opaque LSA 933 The opaque type used by OSPFv2 Dynamic Flooding Opaque LSA is TBD. 934 The opaque type is used to differentiate the various type of OSPFv2 935 Opaque LSAs and is described in section 3 of [RFC5250]. The LS Type 936 is 10. The LSA Length field [RFC2328] represents the total length 937 (in octets) of the Opaque LSA including the LSA header and all TLVs 938 (including padding). 940 The Opaque ID field is an arbitrary value used to maintain multiple 941 Dynamic Flooding Opaque LSAs. For OSPFv2 Dynamic Flooding Opaque 942 LSAs, the Opaque ID has no semantic significance other than to 943 differentiate Dynamic Flooding Opaque LSAs originated by the same 944 OSPFv2 router. 946 The format of the TLVs within the body of the OSPFv2 Dynamic Flooding 947 Opaque LSA is the same as the format used by the Traffic Engineering 948 Extensions to OSPF [RFC3630]. 950 The Length field defines the length of the value portion in octets 951 (thus a TLV with no value portion would have a length of 0). The TLV 952 is padded to 4-octet alignment; padding is not included in the length 953 field (so a 3-octet value would have a length of 3, but the total 954 size of the TLV would be 8 octets). Nested TLVs are also 32-bit 955 aligned. For example, a 1-octet value would have the length field 956 set to 1, and 3 octets of padding would be added to the end of the 957 value portion of the TLV. The padding is composed of zeros. 959 5.2.4. OSPFv3 Dynamic Flooding LSA 961 The OSPFv3 Dynamic Flooding Opaque LSA is only used in centralized 962 mode. 964 The OSPFv3 Dynamic Flooding LSA is used to advertise additional data 965 related to the dynamic flooding in OSPFv3. 967 The OSPFv3 Dynamic Flooding LSA has a function code of TBD. The 968 flooding scope of the OSPFv3 Dynamic Flooding LSA is area-local. The 969 U bit will be set indicating that the OSPFv3 Dynamic Flooding LSA 970 should be flooded even if it is not understood. The Link State ID 971 (LSID) value for this LSA is the Instance ID. OSPFv3 routers MAY 972 advertise multiple Dynamic Flooding Opaque LSAs in each area. 974 The format of the OSPFv3 Dynamic Flooding LSA is as follows: 976 0 1 2 3 977 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 978 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 979 | LS age |1|0|1| TBD6 | 980 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 981 | Link State ID | 982 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 983 | Advertising Router | 984 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 985 | LS sequence number | 986 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 987 | LS checksum | Length | 988 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 989 | | 990 +- TLVs -+ 991 | ... | 993 OSPFv3 Dynamic Flooding LSA 995 5.2.5. OSPF Area Router ID TLVs 997 In OSPF new TLVs are introduced to advertise indeces associated with 998 nodes and Broadcast/NBMA networks. Due to identifier differences 999 between OSPFv2 and OSPFv3 two different TLVs are defined as decribed 1000 in the following sub-sections. 1002 The OSPF Area Router ID TLVs are used by the Area Leader to enumerate 1003 the Router IDs that it has used in computing the flooding topology. 1004 This includes the identifiers associated with Broadcast/NBMA networks 1005 as defined for Network LSAs. Conceptually, the Area Leader creates a 1006 list of Router IDs for all routers in the area, assigning indices to 1007 each router, starting with index 0. 1009 5.2.5.1. OSPFv2 Area Router ID TLV 1011 This TLV is a top level TLV of the OSPFv2 Dynamic Flooding Opaque 1012 LSA. 1014 Because the space in a single OSPFv2 Area Router IDs TLV is limited, 1015 more than one TLV may be required to encode all of the Router IDs in 1016 the area. This TLV may also occur in multiple OSPFv2 Dynamic 1017 Flooding Opaque LSAs so that all Router IDs can be advertised. 1019 Each entry in the OSPFv2 Area Router IDs TLV represents either a node 1020 or a Broadcast/NBMA network identifier. An entry has the following 1021 format: 1023 0 1 2 3 1024 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1025 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1026 | Conn Type | Number of IDs | Reserved | 1027 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1028 | | 1029 +- Originating Router ID/DR Address -+ 1030 | ... | 1032 where 1034 Conn Type - 1 byte 1035 The following values are defined: 1036 1 - Router 1037 2 - Designated Router 1039 Number of IDs - 2 bytes 1041 Reserved - 1 byte 1042 MUST be transmitted as 0 and MUST be ignored on receipt 1044 Originating Router ID/DR Address - (4 * Number of IDs) bytes 1045 As indicated by the ID Type 1047 OSPFv2 Router IDs TLV Entry 1049 The format of the Area Router IDs TLV is: 1051 0 1 2 3 1052 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1053 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1054 | Type | Length | 1055 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1056 | Starting Index |L| Flags | Reserved | 1057 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1058 | | 1059 +- OSPFv2 Router ID TLV Entry -+ 1060 | ... | 1062 OSPFv2 Area Router IDs TLV 1064 TLV Type: 1 1066 TLV Length: 4 + (8 * the number TLV entries) 1067 Starting index: The index of the first Router/Designated Router ID 1068 that appears in this TLV. 1070 L (Last): This bit is set if the index of the last Router/ 1071 Designated ID that appears in this TLV is equal to the last index 1072 in the full list of Rourer IDs for the area. 1074 OSPFv2 Router ID TLV Entries: A concatenated list of Router ID TLV 1075 Entries for the area. 1077 If there are multiple OSPFv2 Area Router ID TLVs with the L bit set 1078 advertised by the same router, the TLV which specifies the smaller 1079 maximum index is used and the other TLV(s) with L bit set are 1080 ignored. TLVs which specify Router IDs with indices greater than 1081 that specified by the TLV with the L bit set are also ignored. 1083 5.2.5.2. OSPFv3 Area Router ID TLV 1085 This TLV is a top level TLV of the OSPFv3 Dynamic Flooding LSA. 1087 Because the space in a single OSPFv3 Area Router ID TLV is limited, 1088 more than one TLV may be required to encode all of the Router IDs in 1089 the area. This TLV may also occur in multiple OSPFv3 Dynamic 1090 Flooding Opaque LSAs so that all Router IDs can be advertised. 1092 Each entry in the OSPFv3 Area Router IDs TLV represents either a 1093 router or a Broadcast/NBMA network identifier. An entry has the 1094 following format: 1096 0 1 2 3 1097 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1098 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1099 | Conn Type | Number of IDs | Reserved | 1100 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1101 | | 1102 +- Originating ID Entry -+ 1103 | ... | 1105 where 1107 Conn Type - 1 byte 1108 The following values are defined: 1109 1 - Router 1110 2 - Designated Router 1112 Number of IDs - 2 bytes 1114 Reserved - 1 byte 1115 MUST be transmitted as 0 and MUST be ignored on receipt 1117 Originating ID Entry takes one of the following forms: 1119 Router: 1120 0 1 2 3 1121 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1122 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1123 | Originating Router ID | 1124 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1126 Length of Originating ID Entry is 4 * Number of IDs) bytes 1128 Designated Router: 1129 0 1 2 3 1130 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1131 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1132 | Originating Router ID | 1133 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1134 | Interface ID | 1135 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1137 Length of Originating ID Entry is (8 * Number of IDs) bytes 1139 OSPFv3 Router ID TLV Entry 1141 The format of the OSPFv3Area Router IDs TLV is: 1143 0 1 2 3 1144 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1145 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1146 | Type | Length | 1147 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1148 | Starting Index |L| Flags | Reserved | 1149 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1150 | | 1151 +- OSPFv3 Router ID TLV Entry -+ 1152 | ... | 1154 OSPFv3 Area Router IDs TLV 1156 TLV Type: 1 1158 TLV Length: 4 + sum of the lengths of all TLV entries 1160 Starting index: The index of the first Router/Designated Router ID 1161 that appears in this TLV. 1163 L (Last): This bit is set if the index of the last Router/ 1164 Designated Router ID that appears in this TLV is equal to the last 1165 index in the full list of Router IDs for the area. 1167 OSPFv3 Router ID TLV Entries: A concatenated list of Router ID TLV 1168 Entries for the area. 1170 If there are multiple OSPFv3 Area Router ID TLVs with the L bit set 1171 advertised by the same router, the TLV which specifies the smaller 1172 maximum index is used and the other TLV(s) with L bit set are 1173 ignored. TLVs which specify Router IDs with indices greater than 1174 that specified by the TLV with the L bit set are also ignored. 1176 5.2.6. OSPF Flooding Path TLV 1178 The OSPF Flooding Path TLV is a top level TLV of the OSPFv2 Dynamic 1179 Flooding Opaque LSAs and OSPFv3 Dynamic Flooding LSA. 1181 The usage of the OSPF Flooding Path TLV is identical to IS-IS and is 1182 described in Section 5.1.4. 1184 The OSPF Flooding Path TLV contains a list of Router ID indices 1185 relative to the Router IDs advertised through the OSPF Area Router 1186 IDs TLV. At least 2 indices must be included in the TLV. 1188 Multiple OSPF Flooding Path TLVs can be advertised in a single OSPFv2 1189 Dynamic Flooding Opaque LSA or OSPFv3 Dynamic Flooding LSA. OSPF 1190 Flooding Path TLVs can also be advertised in multiple OSPFv2 Dynamic 1191 Flooding Opaque LSAs or OSPFv3 Dynamic Flooding LSA, if they all can 1192 not fit in a single LSA. 1194 The Flooding Path TLV has the format: 1196 0 1 2 3 1197 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1198 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1199 | Type | Length | 1200 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1201 | Starting Index | Index 2 | 1202 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1203 | | 1204 +- Additional Indices -+ 1205 | ... | 1207 OSPF Flooding Path TLV 1209 TLV Type: 2 1211 TLV Length: 2 * (number of indices in the path) 1213 Starting index: The index of the first Router ID in the path. 1215 Index 2: The index of the next Router ID in the path. 1217 Additional indices (optional): A sequence of additional indices to 1218 Router IDs along the path. 1220 5.2.7. OSPF Flooding Request Bit 1222 A single new option bit, the Flooding-Request (FR-bit), is defined in 1223 the LLS Type 1 Extended Options and Flags field [RFC2328]. The FR- 1224 bit allows a router to request an adjacent node to enable flooding 1225 towards it on a specific link in the case where the connection to 1226 adjacent node is not part of the current flooding topology. 1228 Nodes that support Dynamic Flooding MAY include FR-bit in its OSPF 1229 LLS Extended Options and Flags TLV. 1231 If FR-bit is signalled for an area for which the flooding on the link 1232 was disabled due to Dynamic Flooding, the flooding MUST be 1233 temporarily enabled over such link and area. Flooding MUST be 1234 enabled until FR-bit is no longer advertised in the OSPF LLS Extended 1235 Options and Flags TLV or the OSPF LLS Extended Options and Flags TLV 1236 no longer appears in the OSPF Hellos. 1238 When the flooding is temporarily enabled on the link for any area due 1239 to received FR-bit in OSPF LLS Extended Options and Flags TLV, the 1240 receiver MUST perform standard database synchronization for the 1241 corresponding area(s) on the link. If the adjacency is already in 1242 the FULL state, mechanism specified in [RFC4811] MUST be used for 1243 database resynchronization. 1245 So long as the FR-bit is being received in the OSPF LLS Extended 1246 Options and Flags TLV for an area, flooding MUST NOT be disabled in 1247 such area even if the connection between the neighbors is removed 1248 from the flooding topology. Flooding for such area MUST continue on 1249 the link and be considered as temporarily enabled. 1251 5.2.8. OSPF LEEF Advertisement 1253 In support of advertising which edges are currently enabled in the 1254 flooding topology, an implementation MAY indicate that a link is part 1255 of the flooding topology. The OSPF Link Attributes Bits TLV is 1256 defined to support this advertisement. 1258 0 1 2 3 1259 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1260 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1261 | Type | Length | 1262 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1263 | Link Attribute Bits | 1264 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1265 | | 1266 +- Additional Link Attribute Bits -+ 1267 | ... | 1269 OSPF Link Attributes Bits TLV 1271 Type: TBD and specific to OSPFv2 and OSPFv3 1273 Length: size of the Link Attribute Bits in bytes. It MUST be a 1274 multiple of 4 bytes. 1276 The following bits are defined: 1278 Bit #0: - Local Edge Enabled for Flooding (LEEF) 1280 OSPF Link-attribute Bits TLV appears as: 1282 1. a sub-TLV of the OSPFv2 Extended Link TLV [RFC7684] 1284 2. a sub-TLV of the OSPFv3 Router-Link TLV [RFC8362] 1286 6. Behavioral Specification 1288 In this section, we specify the detailed behaviors of the nodes 1289 participating in the IGP. 1291 6.1. Terminology 1293 We define some terminology here that is used in the following 1294 sections: 1296 A node is considered reachable if it is part of the connected 1297 network graph. Note that this is independent of any constraints 1298 which may be considered when performing IGP SPT calculation (e.g., 1299 link metrics, OL bit state, etc.). Two-way-connectivity check 1300 MUST be performed before including an edge in the connected 1301 network graph. 1303 Node is connected to the flooding topology, if it has at least one 1304 local link, which is part of the flooding topology. 1306 Node is disconnected from the flooding topology when it is not 1307 connected to the flooding topology. 1309 Current flooding topology - latest version of the flooding 1310 topology received (in case of the centralized mode) or calculated 1311 locally (in case of the distributed mode). 1313 6.2. Flooding Topology 1315 The flooding topology MUST include all reachable nodes in the area. 1317 If a node's reachability changes, the flooding topology MUST be 1318 recalculated. In centralized mode, the Area Leader MUST advertise a 1319 new flooding topology. 1321 If a node becomes disconnected from the current flooding topology but 1322 is still reachable then a new flooding topology MUST be calculated. 1323 In centralized mode the Area Leader MUST advertise the new flooding 1324 topology. 1326 The flooding topology SHOULD be bi-connected. 1328 6.3. Leader Election 1330 Any node that is capable MAY advertise its eligibility to become Area 1331 Leader. 1333 Nodes that are not reachable are not eligible as Area Leader. Nodes 1334 that do not advertise their eligibility to become Area Leader are not 1335 eligible. Amongst the eligible nodes, the node with the numerically 1336 highest priority is the Area Leader. If multiple nodes all have the 1337 highest priority, then the node with the numerically highest system 1338 identifier in the case of IS-IS, or Router-ID in the case of OSPFv2 1339 and OSPFv3 is the Area Leader. 1341 6.4. Area Leader Responsibilities 1343 If the Area Leader operates in centralized mode, it MUST advertise 1344 algorithm 0 in its Area Leader Sub-TLV. In order for Dynamic 1345 Flooding to be enabled it also MUST compute and advertise a flooding 1346 topology for the area. The Area Leader may update the flooding 1347 topology at any time, however, it should not destabilize the network 1348 with undue or overly frequent topology changes. If the Area Leader 1349 operates in centralized mode and needs to advertise a new flooding 1350 topology, it floods the new flooding topology on both the new and old 1351 flooding topologies. 1353 If the Area Leader operates in distributed mode, it MUST advertise a 1354 non-zero algorithm in its Area Leader Sub-TLV. 1356 When the Area Leader advertises algorithm 0 in its Area Leader Sub- 1357 TLV and does not advertise a flooding topology, Dynamic Flooding is 1358 disabled for the area. Note this applies whether the Area Leader 1359 intends to operate in centralized mode or in distributed mode. 1361 Note that once Dynamic Flooding is enabled, disabling it risks 1362 destabilizing the network. 1364 6.5. Distributed Flooding Topology Calculation 1366 If the Area Leader advertises a non-zero algorithm in its Area Leader 1367 Sub-TLV, all nodes in the area that support Dynamic Flooding and the 1368 value of algorithm advertised by the Area Leader MUST compute the 1369 flooding topology based on the Area Leader's advertised algorithm. 1371 Nodes that do not support the value of algorithm advertised by the 1372 Area Leader MUST continue to use standard flooding mechanism as 1373 defined by the protocol. 1375 Nodes that do not support the value of algorithm advertised by the 1376 Area Leader MUST be considered as Dynamic Flooding incapable nodes by 1377 the Area Leader. 1379 If the value of the algorithm advertised by the Area Leader is from 1380 the range 128-254 (private distributed algorithms), it is the 1381 responsibility of the network operator to guarantee that all nodes in 1382 the area have a common understanding of what the given algorithm 1383 value represents. 1385 6.6. Use of LANs in the Flooding Topology 1387 Use of LANs in the flooding topology differs depending on whether the 1388 area is operating in Centralized or Distributed mode. 1390 6.6.1. Use of LANs in Centralized mode 1392 As specified in Section 4.5, when a LAN is advertised as part of the 1393 flooding topology, all nodes connected to the LAN are assumed to be 1394 using the LAN as part of the flooding topology. This assumption is 1395 made to reduce the size of the Flooding Topology advertisement. 1397 6.6.2. Use of LANs in Distributed Mode 1399 In distributed mode, the flooding topology is NOT advertised, 1400 therefore the space consumed to advertise it is not a concern. It is 1401 therefore possible to assign only a subset of the nodes connected to 1402 the LAN to use the LAN as part of the flooding topology. Doing so 1403 may further optimize flooding by reducing the amount of redundant 1404 flooding on a LAN. However, support of flooding only by a subset of 1405 the nodes connected to a LAN requires some modest - but backwards 1406 compatible - changes in the way flooding is performed on a LAN. 1408 6.6.2.1. Partial flooding on a LAN in IS-IS 1410 Designated Intermediate System (DIS) for a LAN MUST use standard 1411 flooding behavior. 1413 Non-DIS nodes whose connection to the LAN is included in the flooding 1414 topology MUST use standard flooding behavior. 1416 Non-DIS nodes whose connection to the LAN is NOT included in the 1417 flooding topology behave as follows: 1419 o Received CSNPs from the DIS are ignored 1421 o PSNPs are NOT originated on the LAN 1423 o LSPs received on the LAN which are newer than the corresponding 1424 LSP present in the LSPDB are retained and flooded on all local 1425 circuits which are part of the flooding topology (i.e., do not 1426 discard newer LSPs simply because they were received on a LAN 1427 which the receiving node is not using for flooding) 1429 o LSPs received on the LAN which are older or same as the 1430 corresponding LSP present in the LSPDB are silently discarded 1432 o LSPs received on links other than the LAN are NOT flooded on the 1433 LAN 1435 NOTE: If any node connected to the LAN requests the enablement of 1436 temporary flooding all nodes revert to standard flooding behavior. 1438 6.6.2.2. Partial Flooding on a LAN in OSPF 1440 Designated Router (DR) and Backup Designated Router (BDR) for LANs 1441 MUST use standard flooding behavior. 1443 Non-DR/BDR nodes whose connection to the LAN is included in the 1444 flooding topology use standard flooding behavior. 1446 Non-DR/BDR nodes whose connection to the LAN is NOT included in the 1447 flooding topology behave as follows: 1449 o LSAs received on the LAN are acknowledged to DR/BDR 1451 o LSAs received on interfaces other than the LAN are NOT flooded on 1452 the LAN 1454 NOTE: If any node connected to the LAN requests the enablement of 1455 temporary flooding all nodes revert to standard flooding behavior. 1457 NOTE: The sending of LSA acks by nodes NOT using the LAN as part of 1458 the flooding topology eliminates the need for changes on the part of 1459 the DR/BDR - which might Include nodes which do not support the 1460 flooding optimizations. 1462 6.7. Flooding Behavior 1464 Nodes that support Dynamic Flooding MUST use the flooding topology 1465 for flooding when possible, and MUST NOT revert to standard flooding 1466 when a valid flooding topology is available. 1468 In some cases a node that supports Dynamic Flooding may need to add a 1469 local link(s) to the flooding topology temporarily, even though the 1470 link(s) is not part of the calculated flooding topology. This is 1471 termed "temporary flooding" and is discussed in Section 6.8.1. 1473 The flooding topology is calculated locally in the case of 1474 distributed mode. In centralized mode the flooding topology is 1475 advertised in the area link state database. Received link state 1476 updates, whether received on a link that is in the flooding topology 1477 or on a link that is not in the flooding topology, MUST be flooded on 1478 all links that are in the flooding topology, except for the link on 1479 which the update was received. 1481 In centralized mode, if multiple flooding topologies are present in 1482 the area link state database, the node SHOULD flood on each of these 1483 topologies. 1485 When the flooding topology changes on a node, either as a result of 1486 the local computation in distributed mode or as a result of the 1487 advertisement from the Area Leader in centralized mode, the node MUST 1488 continue to flood on both the old and new flooding topology for a 1489 limited amount of time. This is required to provide all nodes 1490 sufficient time to migrate to the new flooding topology. 1492 6.8. Treatment of Topology Events 1494 In this section, we explicitly consider a variety of different 1495 topological events in the network and how Dynamic Flooding should 1496 address them. 1498 6.8.1. Temporary Addition of Link to Flooding Topology 1500 In some cases a node that supports Dynamic Flooding may need to add a 1501 local link(s) to the flooding topology temporarily, even though the 1502 link(s) is not part of the calculated flooding topology. We refer to 1503 this as "temporary flooding" on the link. 1505 When temporary flooding is enabled on the link, the flooding needs to 1506 be enabled from both directions on the link. To achieve that, the 1507 following steps MUST be performed: 1509 Link State Database needs to be re-synchronised on the link. This 1510 is done using the standard protocol mechanisms. In the case of 1511 IS-IS, this results in setting SRM bit for all LSPs on the circuit 1512 and sending compete set of CSNPs on it. In OSPF, the mechanism 1513 specified in [RFC4811] is used. 1515 Flooding is enabled locally on the link. 1517 Flooding is requested from the neighbor using the mechanism 1518 specified in section Section 5.1.5 or Section 5.2.7. 1520 The request for temporary flooding is withdrawn on the link when all 1521 of the following conditions are met: 1523 Node itself is connected to the current flooding topology. 1525 Adjacent node is connected to the current flooding topology. 1527 Any change in the flooding topology MUST result in evaluation of the 1528 above conditions for any link on which the temporary flooding was 1529 enabled. 1531 Temporary flooding is stopped on the link when both adjacent nodes 1532 stop requesting temporary flooding on the link. 1534 6.8.2. Local Link Addition 1536 If a local link is added to the topology, the protocol will form a 1537 normal adjacency on the link and update the appropriate link state 1538 advertisements for the nodes on either end of the link. These link 1539 state updates will be flooded on the flooding topology. 1541 In centralized mode, the Area Leader, upon receiving these updates, 1542 may choose to retain the existing flooding topology or may choose to 1543 modify the flooding topology. If it elects to change the flooding 1544 topology, it will update the flooding topology in the link state 1545 database and flood it using the new flooding topology. 1547 In distributed mode, any change in the topology, including the link 1548 addition, MUST trigger the flooding topology recalculation. This is 1549 done to ensure that all nodes converge to the same flooding topology, 1550 regardless of the time of the calculation. 1552 Temporary flooding MUST be enabled on the newly added local link, if 1553 at least one of the following conditions are met: 1555 The node on which the local link was added is not connected to the 1556 current flooding topology. 1558 The new adjacent node is not connected to the current flooding 1559 topology. 1561 Note that in this case there is no need to perform a database 1562 synchronization as part of the enablement of the temporary flooding, 1563 because it has been part of the adjacency bring-up itself. 1565 If multiple local links are added to the topology before the flooding 1566 topology is updated, temporary flooding MUST be enabled on a subset 1567 of these links. 1569 6.8.3. Node Addition 1571 If a node is added to the topology, then at least one link is also 1572 added to the topology. Section 6.8.2 applies. 1574 A node which has a large number of neighbors is at risk for 1575 introducing a local flooding storm if all neighbors are brought up at 1576 once and temporary flooding is enabled on all links simultaneously. 1577 The most robust way to address this is to limit the rate of initial 1578 adjacency formation following bootup. This both reduces unnecessary 1579 redundant flooding as part of initial database synchronization and 1580 minimizes the need for temporary flooding as it allows time for the 1581 new node to be added to the flooding topology after only a small 1582 number of adjacencies have been formed. 1584 In the event a node elects to bring up a large number of adjacencies 1585 simultaneously, a significant amount of redundant flooding may be 1586 introduced as multiple neighbors of the new node enable temporary 1587 flooding to the new node which initially is not part of the flooding 1588 topology. 1590 6.8.4. Failures of Link Not on Flooding Topology 1592 If a link that is not part of the flooding topology fails, then the 1593 adjacent nodes will update their link state advertisements and flood 1594 them on the flooding topology. 1596 In centralized mode, the Area Leader, upon receiving these updates, 1597 may choose to retain the existing flooding topology or may choose to 1598 modify the flooding topology. If it elects to change the flooding 1599 topology, it will update the flooding topology in the link state 1600 database and flood it using the new flooding topology. 1602 In distributed mode, any change in the topology, including the 1603 failure of the link that is not part of the flooding topology MUST 1604 trigger the flooding topology recalculation. This is done to ensure 1605 that all nodes converge to the same flooding topology, regardless of 1606 the time of the calculation. 1608 6.8.5. Failures of Link On the Flooding Topology 1610 If there is a failure on the flooding topology, the adjacent nodes 1611 will update their link state advertisements and flood them. If the 1612 original flooding topology is bi-connected, the flooding topology 1613 should still be connected despite a single failure. 1615 If the failed local link represented the only connection to the 1616 flooding topology on the node where the link failed, the node MUST 1617 enable temporary flooding on a subset of its local links. This 1618 allows the node to send its updated link state advertisement(s) and 1619 also keep receiving link state updates from other nodes in the 1620 network before the new flooding topology is calculated and 1621 distributed (in the case of centralized mode). 1623 In centralized mode, the Area Leader will notice the change in the 1624 flooding topology, recompute the flooding topology, and flood it 1625 using the new flooding topology. 1627 In distributed mode, all nodes supporting dynamic flooding will 1628 notice the change in the topology and recompute the new flooding 1629 topology. 1631 6.8.6. Node Deletion 1633 If a node is deleted from the topology, then at least one link is 1634 also removed from the topology. Section 6.8.4 and Section 6.8.5 1635 apply. 1637 6.8.7. Local Link Addition to the Flooding Topology 1639 If the new flooding topology is received in the case of centralized 1640 mode, or calculated locally in the case of distributed mode and the 1641 local link on the node that was not part of the flooding topology has 1642 been added to the flooding topology, the node MUST: 1644 Re-synchronize the Link State Database over the link. This is 1645 done using the standard protocol mechanisms. In the case of IS- 1646 IS, this results in setting SRM bit for all LSPs on the circuit 1647 and sending a complete set of CSNPs. In OSPF, the mechanism 1648 specified in [RFC4811] is used. 1650 Make the link part of the flooding topology and start flooding 1651 over it 1653 6.8.8. Local Link Deletion from the Flooding Topology 1655 If the new flooding topology is received in the case of centralized 1656 mode, or calculated locally in the case of distributed mode and the 1657 local link on the node that was part of the flooding topology has 1658 been removed from the flooding topology, the node MUST remove the 1659 link from the flooding topology. 1661 The node MUST keep flooding on such link for a limited amount of time 1662 to allow other nodes to migrate to the new flooding topology. 1664 If the removed local link represented the only connection to the 1665 flooding topology on the node, the node MUST enable temporary 1666 flooding on a subset of its local links. This allows the node to 1667 send its updated link state advertisement(s) and also keep receiving 1668 link state updates from other nodes in the network before the new 1669 flooding topology is calculated and distributed (in the case of 1670 centralized mode). 1672 6.8.9. Treatment of Disconnected Adjacent Nodes 1674 Every time there is a change in the flooding topology a node MUST 1675 check if there are any adjacent nodes that are disconnected from the 1676 current flooding topology. Temporary flooding MUST be enabled 1677 towards a subset of the disconnected nodes. 1679 6.8.10. Failure of the Area Leader 1681 The failure of the Area Leader can be detected by observing that it 1682 is no longer reachable. In this case, the Area Leader election 1683 process is repeated and a new Area Leader is elected. 1685 In order to minimize disruption to Dynamic Flooding if the Area 1686 Leader becomes unreachable, the node which has the second highest 1687 priority for becoming Area Leader (including the system identifier/ 1688 Router-ID tie breaker if necessary) SHOULD advertise the same 1689 algorithm in its Area Leader Sub-TLV as the Area Leader and (in 1690 centralized mode) SHOULD advertise a flooding topology. This SHOULD 1691 be done even when the Area Leader is reachable. 1693 In centralized mode, the new Area Leader will compute a new flooding 1694 topology and flood it using the new flooding topology. To minimze 1695 disruption, the new flooding topology SHOULD have as much in common 1696 as possible with the old flooding topology. This will minimize the 1697 risk of over-flooding. 1699 In the distributed mode, the new flooding topology will be calculated 1700 on all nodes that support the algorithm that is advertised by the new 1701 Area Leader. Nodes that do not support the algorithm advertised by 1702 the new Area Leader will no longer participate in Dynamic Flooding 1703 and will revert to standard flooding. 1705 6.8.11. Recovery from Multiple Failures 1707 In the unlikely event of multiple failures on the flooding topology, 1708 it may become partitioned. The nodes that remain active on the edges 1709 of the flooding topology partitions will recognize this and will try 1710 to repair the flooding topology locally by enabling temporary 1711 flooding towards the nodes that they consider disconnected from the 1712 flooding topology until a new flooding topology becomes connected 1713 again. 1715 Nodes where local failure was detected update their own link state 1716 advertisements and flood them on the remainder of the flooding 1717 topology. 1719 In centralized mode, the Area Leader will notice the change in the 1720 flooding topology, recompute the flooding topology, and flood it 1721 using the new flooding topology. 1723 In distributed mode, all nodes that actively participate in Dynamic 1724 Flooding will compute the new flooding topology. 1726 Note that this is very different from the area partition because 1727 there is still a connected network graph between the nodes in the 1728 area. The area may remain connected and forwarding may still be 1729 effective. 1731 6.8.12. Rate Limiting Temporary Flooding 1733 As discussed in the previous sections, there are events which require 1734 the introduction of temporary flooding on edges which are not part of 1735 the current flooding topology. This can occur regardless of whether 1736 the area is operating in centralized mode or distributed mode. 1738 Nodes which decide to enable temporary flooding also have to decide 1739 whether to do so on a subset of the edges which are currently not 1740 part of the flooding topology or on all the edges which are currently 1741 not part of the flooding topology. Doing the former risks a longer 1742 convergence time as it is possible that the initial set of edges 1743 enabled does not fully repair the flooding topology. Doing the 1744 latter risks introducing a flooding storm which destablizes the 1745 network. 1747 It is recommended that a node implement rate limiting on the number 1748 of edges on which it chooses to enable temporary flooding. Initial 1749 values for the number of edges to enable and the rate at which 1750 additional edges may subsequently be enabled is left as an 1751 implementation decision. 1753 7. IANA Considerations 1755 7.1. IS-IS 1757 This document requests the following code points from the "sub-TLVs 1758 for TLV 242" registry (IS-IS Router CAPABILITY TLV). 1760 Type: TBD1 1762 Description: IS-IS Area Leader Sub-TLV 1764 Reference: This document (Section 5.1.1) 1766 Type: TBD7 1768 Description: IS-IS Dynamic Flooding Sub-TLV 1770 Reference: This document (Section 5.1.2) 1772 This document requests that IANA allocate and assign code points from 1773 the "IS-IS TLV Codepoints" registry. One for each of the following 1774 TLVs: 1776 Type: TBD2 1778 Description: IS-IS Area System IDs TLV 1780 Reference: This document (Section 5.1.3) 1782 Type: TBD3 1784 Description: IS-IS Flooding Path TLV 1786 Reference: This document (Section 5.1.4) 1788 Type: TBD9 1790 Description: IS-IS Flooding Request TLV 1792 Reference: This document (Section 5.1.5) 1794 This document requests that IANA allocate a new bit value from the 1795 "link-attribute bit values for sub-TLV 19 of TLV 22" registry. 1797 Local Edge Enabled for Flooding (LEEF) - suggested value 4 (to be 1798 assigned by IANA) 1800 7.2. OSPF 1802 This document requests the following code points from the "OSPF 1803 Router Information (RI) TLVs" registry: 1805 Type: TBD4 1807 Description: OSPF Area Leader Sub-TLV 1808 Reference: This document (Section 5.2.1) 1810 Type: TBD8 1812 Description: OSPF Dynamic Flooding Sub-TLV 1814 Reference: This document (Section 5.2.2) 1816 This document requests the following code point from the "Opaque 1817 Link-State Advertisements (LSA) Option Types" registry: 1819 Type: TBD5 1821 Description: OSPFv2 Dynamic Flooding Opaque LSA 1823 Reference: This document (Section 5.2.3) 1825 This document requests the following code point from the "OSPFv3 LSA 1826 Function Codes" registry: 1828 Type: TBD6 1830 Description: OSPFv3 Dynamic Flooding LSA 1832 Reference: This document (Section 5.2.4) 1834 This document requests a new bit in LLS Type 1 Extended Options and 1835 Flags registry: 1837 Bit Position: TBD10 1839 Description: Flooding Request bit 1841 Reference: This document (Section 5.2.7) 1843 This document requests the following code point from the "OSPFv2 1844 Extended Link TLV Sub-TLVs" registry: 1846 Type: TBD11 1848 Description: OSPFv2 Link Attributes Bits Sub-TLV 1850 Reference: This document (Section 5.2.8) 1852 This document requests the following code point from the "OSPFv3 1853 Extended LSA Sub-TLVs" registry: 1855 Type: TBD12 1856 Description: OSPFv3 Link Attributes Bits Sub-TLV 1858 Reference: This document (Section 5.2.8) 1860 7.2.1. OSPF Dynamic Flooding LSA TLVs Registry 1862 This specification also requests a new registry - "OSPF Dynamic 1863 Flooding LSA TLVs". New values can be allocated via IETF Review or 1864 IESG Approval 1866 The "OSPF Dynamic Flooding LSA TLVs" registry will define top-level 1867 TLVs for the OSPFv2 Dynamic Flooding Opaque LSA and OSPFv3 Dynamic 1868 Flooding LSAs. It should be added to the "Open Shortest Path First 1869 (OSPF) Parameters" registries group. 1871 The following initial values are allocated: 1873 Type: 0 1875 Description: Reserved 1877 Reference: This document 1879 Type: 1 1881 Description: OSPF Area Router IDs TLV 1883 Reference: This document (Section 5.2.5) 1885 Type: 2 1887 Description: OSPF Flooding Path TLV 1889 Reference: This document (Section 5.2.6) 1891 Types in the range 32768-33023 are for experimental use; these will 1892 not be registered with IANA, and MUST NOT be mentioned by RFCs. 1894 Types in the range 33024-65535 are not to be assigned at this time. 1895 Before any assignments can be made in the 33024-65535 range, there 1896 MUST be an IETF specification that specifies IANA Considerations that 1897 covers the range being assigned. 1899 7.2.2. OSPF Link Attributes Sub-TLV Bit Values Registry 1901 This specification also requests a new registry - "OSPF Link 1902 Attributes Sub-TLV Bit Values". New values can be allocated via IETF 1903 Review or IESG Approval 1904 The "OSPF Link Attributes Sub-TLV Bit Values" registry defines Link 1905 Attribute bit values for the OSPFv2 Link Attributes Sub-TLV and 1906 OSPFv3 Link Attributes Sub-TLV. It should be added to the "Open 1907 Shortest Path First (OSPF) Parameters" registries group. 1909 The following initial value is allocated: 1911 Bit Number: 0 1913 Description: Local Edge Enabled for Flooding(LEEF) 1915 Reference: This document (Section 5.2.8) 1917 7.3. IGP 1919 IANA is requested to set up a registry called "IGP Algorithm Type For 1920 Computing Flooding Topology" under an existing "Interior Gateway 1921 Protocol (IGP) Parameters" IANA registries. 1923 Values in this registry come from the range 0-255. 1925 The initial values in the IGP Algorithm Type For Computing Flooding 1926 Topology registry are: 1928 0: Reserved for centralized mode. 1930 1-127: Available for standards action. Individual values are to 1931 be assigned according to the "Specification Required" policy 1932 defined in [RFC8126]. 1934 128-254: Reserved for private use. 1936 255: Reserved. 1938 8. Security Considerations 1940 This document introduces no new security issues. Security of routing 1941 within a domain is already addressed as part of the routing protocols 1942 themselves. This document proposes no changes to those security 1943 architectures. 1945 It is possible that an attacker could become Area Leader and 1946 introduce a flawed flooding algorithm into the network thus 1947 compromising the operation of the protocol. Authentication methods 1948 as describe in [RFC5304] and [RFC5310] for IS-IS, [RFC2328] and 1949 [RFC7474] for OSPFv2 and [RFC5340] and [RFC4552] for OSPFv3 SHOULD be 1950 used to prevent such attack. 1952 9. Acknowledgements 1954 The authors would like to thank Sarah Chen for her contribution to 1955 this work. 1957 The authors would like to thank Zeqing (Fred) Xia, Naiming Shen, Adam 1958 Sweeney and Olufemi Komolafe for their helpful comments. 1960 The authors would like to thank Tom Edsall for initially introducing 1961 them to the problem. 1963 Advertising Local Edges Enabled for Flooding (LEEF) is based on an 1964 idea proposed in [I-D.cc-lsr-flooding-reduction]. We wish to thank 1965 the authors of that draft. 1967 10. References 1969 10.1. Normative References 1971 [ISO10589] 1972 International Organization for Standardization, 1973 "Intermediate System to Intermediate System Intra-Domain 1974 Routing Exchange Protocol for use in Conjunction with the 1975 Protocol for Providing the Connectionless-mode Network 1976 Service (ISO 8473)", ISO/IEC 10589:2002, Nov. 2002. 1978 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1979 Requirement Levels", BCP 14, RFC 2119, 1980 DOI 10.17487/RFC2119, March 1997, 1981 . 1983 [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, 1984 DOI 10.17487/RFC2328, April 1998, 1985 . 1987 [RFC4552] Gupta, M. and N. Melam, "Authentication/Confidentiality 1988 for OSPFv3", RFC 4552, DOI 10.17487/RFC4552, June 2006, 1989 . 1991 [RFC5029] Vasseur, JP. and S. Previdi, "Definition of an IS-IS Link 1992 Attribute Sub-TLV", RFC 5029, DOI 10.17487/RFC5029, 1993 September 2007, . 1995 [RFC5250] Berger, L., Bryskin, I., Zinin, A., and R. Coltun, "The 1996 OSPF Opaque LSA Option", RFC 5250, DOI 10.17487/RFC5250, 1997 July 2008, . 1999 [RFC5304] Li, T. and R. Atkinson, "IS-IS Cryptographic 2000 Authentication", RFC 5304, DOI 10.17487/RFC5304, October 2001 2008, . 2003 [RFC5310] Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R., 2004 and M. Fanto, "IS-IS Generic Cryptographic 2005 Authentication", RFC 5310, DOI 10.17487/RFC5310, February 2006 2009, . 2008 [RFC5340] Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF 2009 for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008, 2010 . 2012 [RFC7356] Ginsberg, L., Previdi, S., and Y. Yang, "IS-IS Flooding 2013 Scope Link State PDUs (LSPs)", RFC 7356, 2014 DOI 10.17487/RFC7356, September 2014, 2015 . 2017 [RFC7474] Bhatia, M., Hartman, S., Zhang, D., and A. Lindem, Ed., 2018 "Security Extension for OSPFv2 When Using Manual Key 2019 Management", RFC 7474, DOI 10.17487/RFC7474, April 2015, 2020 . 2022 [RFC7684] Psenak, P., Gredler, H., Shakir, R., Henderickx, W., 2023 Tantsura, J., and A. Lindem, "OSPFv2 Prefix/Link Attribute 2024 Advertisement", RFC 7684, DOI 10.17487/RFC7684, November 2025 2015, . 2027 [RFC7770] Lindem, A., Ed., Shen, N., Vasseur, JP., Aggarwal, R., and 2028 S. Shaffer, "Extensions to OSPF for Advertising Optional 2029 Router Capabilities", RFC 7770, DOI 10.17487/RFC7770, 2030 February 2016, . 2032 [RFC7981] Ginsberg, L., Previdi, S., and M. Chen, "IS-IS Extensions 2033 for Advertising Router Information", RFC 7981, 2034 DOI 10.17487/RFC7981, October 2016, 2035 . 2037 [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for 2038 Writing an IANA Considerations Section in RFCs", BCP 26, 2039 RFC 8126, DOI 10.17487/RFC8126, June 2017, 2040 . 2042 [RFC8362] Lindem, A., Roy, A., Goethals, D., Reddy Vallem, V., and 2043 F. Baker, "OSPFv3 Link State Advertisement (LSA) 2044 Extensibility", RFC 8362, DOI 10.17487/RFC8362, April 2045 2018, . 2047 10.2. Informative References 2049 [Clos] Clos, C., "A Study of Non-Blocking Switching Networks", 2050 The Bell System Technical Journal Vol. 32(2), DOI 2051 10.1002/j.1538-7305.1953.tb01433.x, March 1953, 2052 . 2054 [I-D.cc-lsr-flooding-reduction] 2055 Chen, H., Toy, M., Yang, Y., Wang, A., Liu, X., Fan, Y., 2056 and L. Liu, "Flooding Topology Computation Algorithm", 2057 draft-cc-lsr-flooding-reduction-09 (work in progress), 2058 June 2020. 2060 [Leiserson] 2061 Leiserson, C., "Fat-Trees: Universal Networks for 2062 Hardware-Efficient Supercomputing", IEEE Transactions on 2063 Computers 34(10):892-901, 1985. 2065 [RFC2973] Balay, R., Katz, D., and J. Parker, "IS-IS Mesh Groups", 2066 RFC 2973, DOI 10.17487/RFC2973, October 2000, 2067 . 2069 [RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering 2070 (TE) Extensions to OSPF Version 2", RFC 3630, 2071 DOI 10.17487/RFC3630, September 2003, 2072 . 2074 [RFC4811] Nguyen, L., Roy, A., and A. Zinin, "OSPF Out-of-Band Link 2075 State Database (LSDB) Resynchronization", RFC 4811, 2076 DOI 10.17487/RFC4811, March 2007, 2077 . 2079 [RFC7938] Lapukhov, P., Premji, A., and J. Mitchell, Ed., "Use of 2080 BGP for Routing in Large-Scale Data Centers", RFC 7938, 2081 DOI 10.17487/RFC7938, August 2016, 2082 . 2084 Authors' Addresses 2086 Tony Li (editor) 2087 Arista Networks 2088 5453 Great America Parkway 2089 Santa Clara, California 95054 2090 USA 2092 Email: tony.li@tony.li 2093 Peter Psenak (editor) 2094 Cisco Systems, Inc. 2095 Eurovea Centre, Central 3 2096 Pribinova Street 10 2097 Bratislava 81109 2098 Slovakia 2100 Email: ppsenak@cisco.com 2102 Les Ginsberg 2103 Cisco Systems, Inc. 2104 510 McCarthy Blvd. 2105 Milpitas, California 95035 2106 USA 2108 Email: ginsberg@cisco.com 2110 Huaimo Chen 2111 Futurewei 2112 Boston, Ma 2113 USA 2115 Email: hchen@futurewei.com 2117 Tony Przygienda 2118 Juniper Networks, Inc. 2119 1194 N. Mathilda Ave 2120 Sunnyvale, California 94089 2121 USA 2123 Email: prz@juniper.net 2125 Dave Cooper 2126 CenturyLink 2127 1025 Eldorado Blvd 2128 Broomfield, Colorado 80021 2129 USA 2131 Email: Dave.Cooper@centurylink.com 2132 Luay Jalil 2133 Verizon 2134 Richardson, Texas 75081 2135 USA 2137 Email: luay.jalil@verizon.com 2139 Srinath Dontula 2140 ATT 2141 200 S Laurel Ave 2142 Middletown, New Jersey 07748 2143 USA 2145 Email: sd947e@att.com 2147 Gyan S. Mishra 2148 Verizon Inc. 2149 13101 Columbia Pike 2150 Silver Spring, Maryland 20904 2151 USA 2153 Phone: 301 502-1347 2154 Email: gyan.s.mishra@verizon.com