idnits 2.17.1 draft-allan-lsr-flooding-algorithm-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (October 2018) is 2020 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) No issues found here. Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 LSR Working Group Dave Allan 2 Internet Draft Ericsson 3 Intended status: Standards Track October 2018 4 Expires: April 2019 6 A Distributed Algorithm for Constrained Flooding of IGP 7 Advertisements 8 draft-allan-lsr-flooding-algorithm-00 10 Abstract 12 This document describes a distributed algorithm that can be applied 13 to the problem of constraining IGP flooding in dense mesh 14 topologies. The flooding topology utilizes two node-diverse spanning 15 trees in order to provide complete coverage in the presence of any 16 single failure while constraining the number of LSAs received by any 17 IGP speaker connected to the flooding topology. 19 Status of this Memo 21 This Internet-Draft is submitted to IETF in full conformance 22 with the provisions of BCP 78 and BCP 79. 24 Internet-Drafts are working documents of the Internet 25 Engineering Task Force (IETF), its areas, and its working 26 groups. Note that other groups may also distribute working 27 documents as Internet-Drafts. 29 Internet-Drafts are draft documents valid for a maximum of six 30 months and may be updated, replaced, or obsoleted by other 31 documents at any time. It is inappropriate to use Internet- 32 Drafts as reference material or to cite them other than as "work 33 in progress". 35 The list of current Internet-Drafts can be accessed at 36 http://www.ietf.org/ietf/1id-abstracts.txt. 38 The list of Internet-Draft Shadow Directories can be accessed at 39 http://www.ietf.org/shadow.html. 41 This Internet-Draft will expire in March 2019. 43 Copyright and License Notice 44 Copyright (c) 2018 IETF Trust and the persons identified as the 45 document authors. All rights reserved. 47 This document is subject to BCP 78 and the IETF Trust's Legal 48 Provisions Relating to IETF Documents 49 (http://trustee.ietf.org/license-info) in effect on the date of 50 publication of this document. Please review these documents 51 carefully, as they describe your rights and restrictions with 52 respect to this document. Code Components extracted from this 53 document must include Simplified BSD License text as described 54 in Section 4.e of the Trust Legal Provisions and are provided 55 without warranty as described in the Simplified BSD License. 57 Table of Contents 59 1. Introduction...................................................3 60 1.1. Authors......................................................3 61 1.2. Requirements Language........................................3 62 2. Conventions used in this document..............................3 63 2.1. Terminology..................................................3 64 3. Solution Overview..............................................4 65 3.1. The Flooding Topology........................................4 66 3.2. Solution Applicability.......................................4 67 3.3. Algorithm....................................................4 68 3.3.1. Algorithm Basics...........................................5 69 3.3.2. Generating Diverse Trees...................................5 70 3.3.3. Desirable Properties Computation Wise......................6 71 4. Applying the Algorithm.........................................6 72 4.1. Tree Generation..............................................6 73 4.2. Illustrating the result......................................6 74 4.3. Interactions between Participating and Non-Participating 75 Nodes.............................................................7 76 4.4. Flooding of LSAs.............................................8 77 4.5. Root Selection...............................................9 78 4.6. Node Additions...............................................9 79 5. Further work..................................................10 80 5.1. Thoughts on Coexistence in the Context of a Larger Network..10 81 5.1.1. Multiple flooding Domains and the Severing of Flooding 82 Domains..........................................................10 83 5.2. Thoughts on Flooding Topology Re-Optimization...............10 84 5.3. Thoughts on Node and Network Initialization.................11 85 5.4. Thoughts on Loop Prevention.................................11 86 5.5. Thoughts on Pathological Failure Scenarios..................11 87 6. Acknowledgements..............................................12 88 7. Security Considerations.......................................12 89 8. IANA Considerations...........................................12 90 9. References....................................................12 91 9.1. Normative References........................................12 92 9.2. Informative References......................................12 93 10. Author's Address.............................................13 95 1. Introduction 97 This memo describes an algorithm suitable for reducing the quantity 98 of IGP flooding in dense mesh networks. The only property that the 99 algorithm is dependent upon is that there are at least two equal and 100 diverse shortest paths between any pair of IGP speakers in order to 101 meet the requirements elucidated in [Li]. The algorithm uses a re- 102 purposing of the tie breaking algorithm used in 802.1aq Shortest Path 103 Bridging as an element of construction of the flooding topology. 104 It is not the intention of this memo to specify a complete solution, 105 but to offer a foundation of an eventual solution. 106 1.1. Authors 108 David Allan 110 1.2. Requirements Language 112 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 113 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 114 document are to be interpreted as described in RFC2119 [RFC2119]. 116 2. Conventions used in this document 118 2.1. Terminology 120 Member Adjacency - An adjacency that has been determined to part of 121 the flooding topology. 123 Member Node - A Participant node that is connected to the flooding 124 topology. 126 Participant Adjacency - An adjacency between two participating nodes. 127 It may be a member adjacency or a non-member adjacency 129 Non-Participant Adjacency - An adjacency where at least one of the 130 two nodes is a not a Participating Node 132 Participating Node - An IGP speaker that has advertised the 133 capability, and hence the intention, to participate in a flooding 134 topology 136 3. Solution Overview 138 3.1. The Flooding Topology 140 A flooding topology is composed of a contiguously connected set of 141 participating nodes. 143 The flooding topology constructed from two diversely rooted spanning 144 trees. A participating node that is connected to the physical 145 topology with a degree of two or greater and has at least two 146 participating adjacencies will be bi-connected to the flooding 147 topology. 149 The resulting flooding topology diameter will typically be two times 150 the depth of the tree hierarchy. The compromise in this approach is 151 that a subset of nodes in the network will not see a reduction of the 152 replication burden from current practice when flooding LSAs as the 153 degree of a subset of nodes in the flooding topology will correspond 154 to the degree of the physical topology. 156 The protocol structure of flooded information is unmodified. A 157 participant node may relay a received LSA onto member links of both 158 spanning trees. Specific forwarding rules prevent undue flooding, the 159 result being that every participant node that is bi-connected to the 160 flooding topology will receive two copies of any flooded LSA in a 161 fault free network. Participating nodes that due to network 162 degradation are only singly connected will receive one copy. The 163 forwarding rules are described in section 4.4. 165 3.2. Solution Applicability 167 This algorithm has been considered in the context of pure bipartite 168 graphs, bipartite graphs modified with the addition of intra-tier 169 adjacencies, and hierarchical variations of the above. Applicability 170 to other network designs is for further study. 172 For all graphs the link costs are assumed to be common for all inter- 173 tier links and common for any intra-tier links. Inter-tier and intra- 174 tier links do not have to have the same cost. 176 3.3. Algorithm 178 The algorithm borrows from 802.1aq for the construction of the 179 spanning trees used in this application. This is described in clause 180 28.5 of [802.1Q]. 182 3.3.1. Algorithm Basics 184 The key component of the 802.1aq employed is the tie breaking 185 algorithm. The original application of the algorithm was to produce a 186 symmetrically congruent mesh of multicast trees and unicast 187 forwarding whereby the path between any two nodes in the network was 188 symmetric in both directions and congruent for both unicast and 189 multicast traffic. 191 For this application the algorithm is used in the generation of two 192 diversely rooted spanning trees that define the flooding topology. 194 As part of tree construction, the algorithm tie breaks between equal 195 cost paths. When a tie is identified as part of a Dijkstra 196 computation, a path-id is constructed for each equal cost path. A 197 path-id is expressed as a lexicographically sorted list of the node- 198 ids in the path. The set of equal cost paths is ranked, and the 199 lowest selected. As an example: 201 Path-id 23-39-44-68-85 is ranked lower than 203 Path-id 23-44-59-63-90 205 When the path-ids are of unequal length, the path-ids with the fewest 206 hops are ranked superior to the longer paths, and tie breaking is 207 applied to select between the shorter path-ids. This is not expected 208 to apply in the general case of the dense graphs this application is 209 targeted at. 211 The node-ids used would be the loopback address of each node, 212 therefore each path-id will be unique. 214 3.3.2. Generating Diverse Trees 216 The algorithm includes the concept of an "algorithm-mask", which is a 217 value XOR'd with the node-ids prior to sorting into path IDs and 218 ranking the paths. This permits the construction of diverse trees in 219 a dense topology. 221 Two algorithm masks are used (zero and -1). When computing two trees 222 from the same root, when there are at least two nodes to choose from 223 at each distance from the root, fully diverse trees will be 224 generated. When computing two trees from diverse roots in a tree 225 architecture, diverse nodes will be selected in each tier in the 226 hierarchy as the relay nodes to the next tier. 228 3.3.3. Desirable Properties Computation Wise 230 The algorithm has the property of permitting the pruning of 231 intermediate state as a Dijkstra progresses as ties can be 232 immediately evaluated, and the all but the selected path removed from 233 further consideration. This is desirable when computing a Dijkstra in 234 a dense graph as all path permutations do not need to be carried 235 forward during computation. This permits the computation to be quite 236 fast. 238 The resulting computational complexity would still be expressed as 239 2N(lnN). 241 4. Applying the Algorithm 243 4.1. Tree Generation 245 Each IGP speaker in the network has knowledge of each of the two 246 spanning tree roots and the algorithm mask associated with each. This 247 memo does not specify how root selection is performed and 248 disseminated through the network, but does discuss selection 249 requirements in section 4.5. 251 Each root has one of the two algorithm masks associated with it. 253 Each participating IGP speaker in the network computes a spanning 254 tree from each the two roots (using the algorithm mask associated 255 with each root) and from that can determine its own role in the 256 flooding topology. The two spanning trees are designated the "low 257 spanning tree" and the "high spanning tree". 259 The spanning trees are a starting point for a redundant topology. 260 Unlike the commonly accepted operation of a spanning tree, in this 261 application the distinction between upstream and downstream 262 adjacencies is important and is an input to how a member node further 263 relays any LSAs received. Upstream member adjacencies are in the 264 direction of a root, and downstream member adjacencies are in the 265 direction away from the root. 267 4.2. Illustrating the result 269 The following diagram illustrates the general layout of the flooding 270 graph constructed using the algorithm as applied to a bi-partite 271 style of tree (no intra tier links): 273 Spine 274 +---+ +---+ +---+ +---+ +---+ 275 | a | | b | | c | | d | o o o | z | 276 +---+ +---+ +---+ +---+ +---+ 277 /|\ /|\ 278 \|/ \|/ 279 +---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+ 280 | i | |ii | |iii| o o o | x | | 1 | | 2 | | 3 | o o o | n | 281 +---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+ 282 /|\ /|\ /|\ /|\ 284 In the example, there are two tiers of switches. The spine (nodes 285 a..z), and the next tier with two groups of nodes (i..x) and (1..n). 286 The algorithm will select the node with the lowest node ID in each 287 tier as the replicating node for the low spanning tree; 'a' and 'i' 288 for the set of nodes connecting the spine and the next tier. The 289 algorithm will select the nodes with highest node ID in the same set 290 of nodes for the high spanning tree; 'z' and 'n' for the same set of 291 nodes. 293 In the flooding topology: 295 - Node 'a' is connected to nodes i..x and 1..n for the low spanning 296 tree. 298 - Node 'z' is connected to the same set of nodes for the high 299 spanning tree. 301 - Node 'i' is connected to nodes 'a'..'z' for the low spanning tree, 302 and 304 - Node 'n' is connected to the same nodes for the high spanning 305 tree. 307 - All other nodes are bi-connected to the flooding topology 309 If there was a further tier added below nodes i..x, then 'i' and 'x' 310 would be selected as the replicating nodes for the low and high 311 spanning tree respectively. This is similarly true for nodes 1..n. 313 4.3. Interactions between Participating and Non-Participating Nodes 315 This solution proposes primarily only nodal behaviors with respect to 316 constraining flooding to member adjacencies. To address the scenario 317 where the participating nodes were a subset of a larger network, it 318 would be necessary to advertise the capability to participate in 319 flood reduction. 321 This would then require that each participating node use this 322 information to be able to identify the set of participating 323 adjacencies and confine the spanning tree computation to the set of 324 participating adjacencies in order to identify local set of member 325 adjacencies. Interactions with non-participant adjacencies would 326 conform to current practice. 328 4.4. Flooding of LSAs 330 The design of the protocol elements that are flooded is unmodified by 331 this solution. Therefore, there is no additional information 332 available to associate a received LSA with a given tree, nor is such 333 information needed; the two spanning trees are not treated as unique 334 entities in the flooding topology. 336 As per current practice, a node does not relay LSAs that it has 337 already seen. 339 A new LSA received from an upstream member adjacency is flooded on: 341 - All downstream member adjacencies exclusive of the adjacency of 342 arrival, irrespective of which tree the adjacencies are part of. 344 - All non-participant adjacencies 346 A new LSA received from a downstream member adjacency is flooded on: 348 - All other member adjacencies exclusive of the adjacency of 349 arrival irrespective of which tree the adjacencies are part of. 351 - All non-participant adjacencies 353 A new LSA received from a member adjacency where upstream and 354 downstream is ambiguous (it is an upstream member on one of the 355 spanning trees and a downstream member on the other), is flooded on: 357 - All other member adjacencies exclusive of the adjacency of 358 arrival irrespective of which adjacency the links are part of. 360 - All Non-Participant adjacencies 362 A new LSA received from a non-member adjacency is flooded on all 363 member adjacency irrespective of which tree the adjacencies are part 364 of (see sections 5.1 and 5.5). 366 4.5. Root Selection 368 The algorithm depends on tie breaking between sets of node IDs to 369 produce diverse paths, therefore it does place some restrictions on 370 root selection. 372 A root SHOULD be selected so that the root"s node-id when XORd with 373 the associated algorithm mask is the lowest ranked node in the local 374 tier in the tree hierarchy. This would be analogous to path-id 375 ranking where the paths were all of length 1. 377 The root MUST NOT be selected such that the node-ID when XORd with 378 the other root"s algorithm mask is the lowest ranked node. This would 379 result in the root also being a transit node for the other spanning 380 tree and produce a scenario whereby a single failure could render 381 both spanning trees incomplete. 383 Roots MUST NOT be directly connected for either of the low or high 384 spanning trees. If the topology does not permit this to be satisfied 385 purely by root selection, then the inter-root adjacency must be 386 pruned from the graph prior to spanning tree computation to ensure 387 that diverse paths between the roots are used. 389 For a true bipartite graph, there are no other restrictions on node 390 selection. 392 For a bipartite graph modified with inter-tier links, the roots MUST 393 be placed in different tiers to ensure a pathological combination of 394 link weights and node-ids does not result in a scenario where a 395 single failure would render the flooding topology incomplete. 397 Other sources of failure may exist that may require an administrative 398 component to root selection. This, for example, would ensure that 399 both roots were not selected from a common shared risk group. 401 See also section 5.5. 403 4.6. Node Additions 405 A participating node that is added to the topology will initially not 406 be served by the flooding topology. A participating node adjacent to 407 that node is required to treat it as a non-participating node until 408 such time as tree re-optimization has completed. At the end of tree 409 optimization, typically two adjacent participating nodes will have 410 member adjacencies with the new node, so the ability to flood LSAs 411 between the new node and the flooding topology will have been 412 uninterrupted during the process. 414 5. Further work 416 5.1. Thoughts on Coexistence in the Context of a Larger Network 418 A node that had a combination of participating and non-participating 419 adjacencies would be required to do the following: 421 - For any new LSA received on a participating adjacency, in addition 422 to the rules for member adjacencies, it would also flood the LSA 423 on all non-participating adjacencies. 425 - For any new LSA received on a non-participating adjacency, it 426 would flood the LSA on all member adjacencies. 428 This is reflected in the forwarding rules described in section 4.4. 430 5.1.1. Multiple flooding Domains and the Severing of Flooding Domains 432 It is possible to envision several scenarios whereby there are sets 433 of participating nodes that are not contiguously connected via 434 participating adjacencies in a given IGP domain. 436 1. A node has been incorrectly configured as a participating node 437 but has no participating adjacencies. 439 2. A participating node or set of nodes has become severed from 440 the flooding topology but is still connected to other nodes in 441 the network. Nodes in this set would still be able to compute a 442 local extension of the flooding topology, but it would only be 443 useful if the set was sufficiently large that a majority of the 444 nodes were not connected to non-participants. 446 3. Procedures are designed to permit more than one flooding 447 topology in an IGP domain. In which case participating nodes 448 would have to be administratively configured to associate with a 449 flooding topology instance. 451 5.2. Thoughts on Flooding Topology Re-Optimization 453 After a topology change, it is desirable that the flooding topology 454 remain stable until the network has stabilized. However a single 455 failure may render one of the spanning trees incomplete, such that a 456 further single failure could make the flooding topology incomplete. 457 Therefore procedures should include re-optimization of the flooding 458 topology after a topology change. In order to maintain complete 459 coverage it would make sense not to recompute the spanning trees 460 simultaneously. 462 One approach that would appear to make sense to separate in time 463 network convergence, re-optimization of the low spanning tree and re- 464 optimization of the high spanning tree. 466 The ideal would be to reoptimize an incomplete tree first, however 467 this would require the participating nodes to maintain a complete map 468 of all member adjacencies so that a common determination of the most 469 degraded spanning tree and hence the order of re-optimization could 470 be made. 472 5.3. Thoughts on Node and Network Initialization 474 A participating node at power up will be not be able to establish 475 member links until it has synchronized with the network and the 476 network is stable in the new topology. This suggests it simply treats 477 power up similarly to how a topology change and network re- 478 optimization is treated. The only difference being that it will flood 479 all LSAs received or originated as per current practice until both 480 spanning trees have stabilized. 482 5.4. Thoughts on Loop Prevention 484 802.1aq included additional mechanisms to prevent looping, a reverse 485 path forwarding check, and digest exchange across adjacencies to 486 ensure IGP synchronization. 488 Routing LSAs are not relayed if they are a duplicate, therefore 489 destructive looping cannot occur and additional mitigation mechanisms 490 are not required. 492 5.5. Thoughts on Pathological Failure Scenarios 494 While in a stable fault free network with sufficient mesh density of 495 the types considered, the flooding topology used by this solution 496 would ensure that no single failure rendered both spanning trees 497 incomplete, it is also useful to consider multiple failure scenarios 498 and if they can be mitigated. 500 Preliminary analysis suggests that in a tree network of sufficient 501 mesh density, the only dual link failure that can render the flooding 502 topology incomplete is if a participant node has failures in both 503 upstream member adjacencies. This can be partially mitigated if the 504 node recognizes this scenario and reverts to flooding on all 505 adjacencies. If the suggested procedures of 5.1.1 above are adopted, 506 surrounding participating nodes that receive the LSA on a non-member 507 adjacency will introduce the LSA into the flooding topology. 509 The pathological scenario is the simultaneous failure of both roots. 510 This does suggest that root selection should place the roots two hops 511 apart so there will be a constituency of participants that would 512 observe a simultaneous failure of both upstream member adjacencies 513 and revert to normal flooding. 515 6. Acknowledgements 517 The author would like to acknowledge Jerome Chiabaut for his original 518 algorithm work that underpins this memo. 520 7. Security Considerations 522 For a future version of this document. 524 8. IANA Considerations 526 This memo requires no IANA allocations 528 9. References 530 9.1. Normative References 532 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 533 Requirement Levels", BCP 14, RFC 2119, March 1997. 535 9.2. Informative References 537 [802.1Q] 802.1Q (2014) IEEE Standard for Local and Metropolitan 538 Area Networks--Media Access Control (MAC) Bridges and 539 Virtual Bridged Local Area Networks 541 [Li] Li, T., Psenak, P., "Dynamic Flooding on Dense Graphs", IETF work 542 in progress, draft-li-dynamic-flooding-05, June 2018 544 10. Author's Address 546 Dave Allan 547 Ericsson 548 2455 Augustine Drive 549 Santa Clara, CA 95054 550 USA 551 Email: david.i.allan@ericsson.com