idnits 2.17.1 

draft-allan-lsr-flooding-algorithm-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 2018) is 2020 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

     No issues found here.

     Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

1	LSR Working Group                                            Dave Allan
2	Internet Draft                                                 Ericsson
3	Intended status: Standards Track                           October 2018
4	Expires: April 2019

6	          A Distributed Algorithm for Constrained Flooding of IGP
7	                              Advertisements
8	                   draft-allan-lsr-flooding-algorithm-00

10	Abstract

12	   This document describes a distributed algorithm that can be applied
13	   to the problem of constraining IGP flooding in dense mesh
14	   topologies. The flooding topology utilizes two node-diverse spanning
15	   trees in order to provide complete coverage in the presence of any
16	   single failure while constraining the number of LSAs received by any
17	   IGP speaker connected to the flooding topology.

19	Status of this Memo

21	   This Internet-Draft is submitted to IETF in full conformance
22	   with the provisions of BCP 78 and BCP 79.

24	   Internet-Drafts are working documents of the Internet
25	   Engineering Task Force (IETF), its areas, and its working
26	   groups.  Note that other groups may also distribute working
27	   documents as Internet-Drafts.

29	   Internet-Drafts are draft documents valid for a maximum of six
30	   months and may be updated, replaced, or obsoleted by other
31	   documents at any time.  It is inappropriate to use Internet-
32	   Drafts as reference material or to cite them other than as "work
33	   in progress".

35	   The list of current Internet-Drafts can be accessed at
36	   http://www.ietf.org/ietf/1id-abstracts.txt.

38	   The list of Internet-Draft Shadow Directories can be accessed at
39	   http://www.ietf.org/shadow.html.

41	   This Internet-Draft will expire in March 2019.

43	Copyright and License Notice
44	   Copyright (c) 2018 IETF Trust and the persons identified as the
45	   document authors.  All rights reserved.

47	   This document is subject to BCP 78 and the IETF Trust's Legal
48	   Provisions Relating to IETF Documents
49	   (http://trustee.ietf.org/license-info) in effect on the date of
50	   publication of this document. Please review these documents
51	   carefully, as they describe your rights and restrictions with
52	   respect to this document. Code Components extracted from this
53	   document must include Simplified BSD License text as described
54	   in Section 4.e of the Trust Legal Provisions and are provided
55	   without warranty as described in the Simplified BSD License.

57	Table of Contents

59	   1. Introduction...................................................3
60	   1.1. Authors......................................................3
61	   1.2. Requirements Language........................................3
62	   2. Conventions used in this document..............................3
63	   2.1. Terminology..................................................3
64	   3. Solution Overview..............................................4
65	   3.1. The Flooding Topology........................................4
66	   3.2. Solution Applicability.......................................4
67	   3.3. Algorithm....................................................4
68	   3.3.1. Algorithm Basics...........................................5
69	   3.3.2. Generating Diverse Trees...................................5
70	   3.3.3. Desirable Properties Computation Wise......................6
71	   4. Applying the Algorithm.........................................6
72	   4.1. Tree Generation..............................................6
73	   4.2. Illustrating the result......................................6
74	   4.3. Interactions between Participating and Non-Participating
75	   Nodes.............................................................7
76	   4.4. Flooding of LSAs.............................................8
77	   4.5. Root Selection...............................................9
78	   4.6. Node Additions...............................................9
79	   5. Further work..................................................10
80	   5.1. Thoughts on Coexistence in the Context of a Larger Network..10
81	   5.1.1. Multiple flooding Domains and the Severing of Flooding
82	   Domains..........................................................10
83	   5.2. Thoughts on Flooding Topology Re-Optimization...............10
84	   5.3. Thoughts on Node and Network Initialization.................11
85	   5.4. Thoughts on Loop Prevention.................................11
86	   5.5. Thoughts on Pathological Failure Scenarios..................11
87	   6. Acknowledgements..............................................12
88	   7. Security Considerations.......................................12
89	   8. IANA Considerations...........................................12
90	   9. References....................................................12
91	   9.1. Normative References........................................12
92	   9.2. Informative References......................................12
93	   10. Author's Address.............................................13

95	1. Introduction

97	   This memo describes an algorithm suitable for reducing the quantity
98	   of IGP flooding in dense mesh networks.  The only property that the
99	   algorithm is dependent upon is that there are at least two equal and
100	   diverse shortest paths between any pair of IGP speakers in order to
101	   meet the requirements elucidated in [Li].  The algorithm uses a re-
102	   purposing of the tie breaking algorithm used in 802.1aq Shortest Path
103	   Bridging as an element of construction of the flooding topology.
104	   It is not the intention of this memo to specify a complete solution,
105	   but to offer a foundation of an eventual solution.
106	1.1. Authors

108	   David Allan

110	1.2. Requirements Language

112	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
113	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
114	   document are to be interpreted as described in RFC2119 [RFC2119].

116	2. Conventions used in this document

118	2.1. Terminology

120	   Member Adjacency - An adjacency that has been determined to part of
121	   the flooding topology.

123	   Member Node - A Participant node that is connected to the flooding
124	   topology.

126	   Participant Adjacency - An adjacency between two participating nodes.
127	   It may be a member adjacency or a non-member adjacency

129	   Non-Participant Adjacency - An adjacency where at least one of the
130	   two nodes is a not a Participating Node

132	   Participating Node - An IGP speaker that has advertised the
133	   capability, and hence the intention, to participate in a flooding
134	   topology

136	3. Solution Overview

138	3.1. The Flooding Topology

140	   A flooding topology is composed of a contiguously connected set of
141	   participating nodes.

143	   The flooding topology constructed from two diversely rooted spanning
144	   trees. A participating node that is connected to the physical
145	   topology with a degree of two or greater and has at least two
146	   participating adjacencies will be bi-connected to the flooding
147	   topology.

149	   The resulting flooding topology diameter will typically be two times
150	   the depth of the tree hierarchy. The compromise in this approach is
151	   that a subset of nodes in the network will not see a reduction of the
152	   replication burden from current practice when flooding LSAs as the
153	   degree of a subset of nodes in the flooding topology will correspond
154	   to the degree of the physical topology.

156	   The protocol structure of flooded information is unmodified. A
157	   participant node may relay a received LSA onto member links of both
158	   spanning trees. Specific forwarding rules prevent undue flooding, the
159	   result being that every participant node that is bi-connected to the
160	   flooding topology will receive two copies of any flooded LSA in a
161	   fault free network. Participating nodes that due to network
162	   degradation are only singly connected will receive one copy. The
163	   forwarding rules are described in section 4.4.

165	3.2. Solution Applicability

167	   This algorithm has been considered in the context of pure bipartite
168	   graphs, bipartite graphs modified with the addition of intra-tier
169	   adjacencies, and hierarchical variations of the above. Applicability
170	   to other network designs is for further study.

172	   For all graphs the link costs are assumed to be common for all inter-
173	   tier links and common for any intra-tier links. Inter-tier and intra-
174	   tier links do not have to have the same cost.

176	3.3. Algorithm

178	   The algorithm borrows from 802.1aq for the construction of the
179	   spanning trees used in this application. This is described in clause
180	   28.5 of [802.1Q].

182	3.3.1. Algorithm Basics

184	   The key component of the 802.1aq employed is the tie breaking
185	   algorithm. The original application of the algorithm was to produce a
186	   symmetrically congruent mesh of multicast trees and unicast
187	   forwarding whereby the path between any two nodes in the network was
188	   symmetric in both directions and congruent for both unicast and
189	   multicast traffic.

191	   For this application the algorithm is used in the generation of two
192	   diversely rooted spanning trees that define the flooding topology.

194	   As part of tree construction, the algorithm tie breaks between equal
195	   cost paths. When a tie is identified as part of a Dijkstra
196	   computation, a path-id is constructed for each equal cost path. A
197	   path-id is expressed as a lexicographically sorted list of the node-
198	   ids in the path. The set of equal cost paths is ranked, and the
199	   lowest selected. As an example:

201	         Path-id 23-39-44-68-85 is ranked lower than

203	         Path-id 23-44-59-63-90

205	   When the path-ids are of unequal length, the path-ids with the fewest
206	   hops are ranked superior to the longer paths, and tie breaking is
207	   applied to select between the shorter path-ids. This is not expected
208	   to apply in the general case of the dense graphs this application is
209	   targeted at.

211	   The node-ids used would be the loopback address of each node,
212	   therefore each path-id will be unique.

214	3.3.2. Generating Diverse Trees

216	   The algorithm includes the concept of an "algorithm-mask", which is a
217	   value XOR'd with the node-ids prior to sorting into path IDs and
218	   ranking the paths. This permits the construction of diverse trees in
219	   a dense topology.

221	   Two algorithm masks are used (zero and -1). When computing two trees
222	   from the same root, when there are at least two nodes to choose from
223	   at each distance from the root, fully diverse trees will be
224	   generated. When computing two trees from diverse roots in a tree
225	   architecture, diverse nodes will be selected in each tier in the
226	   hierarchy as the relay nodes to the next tier.

228	3.3.3. Desirable Properties Computation Wise

230	   The algorithm has the property of permitting the pruning of
231	   intermediate state as a Dijkstra progresses as ties can be
232	   immediately evaluated, and the all but the selected path removed from
233	   further consideration. This is desirable when computing a Dijkstra in
234	   a dense graph as all path permutations do not need to be carried
235	   forward during computation. This permits the computation to be quite
236	   fast.

238	   The resulting computational complexity would still be expressed as
239	   2N(lnN).

241	4. Applying the Algorithm

243	4.1. Tree Generation

245	   Each IGP speaker in the network has knowledge of each of the two
246	   spanning tree roots and the algorithm mask associated with each. This
247	   memo does not specify how root selection is performed and
248	   disseminated through the network, but does discuss selection
249	   requirements in section 4.5.

251	   Each root has one of the two algorithm masks associated with it.

253	   Each participating IGP speaker in the network computes a spanning
254	   tree from each the two roots (using the algorithm mask associated
255	   with each root) and from that can determine its own role in the
256	   flooding topology. The two spanning trees are designated the "low
257	   spanning tree" and the "high spanning tree".

259	   The spanning trees are a starting point for a redundant topology.
260	   Unlike the commonly accepted operation of a spanning tree, in this
261	   application the distinction between upstream and downstream
262	   adjacencies is important and is an input to how a member node further
263	   relays any LSAs received. Upstream member adjacencies are in the
264	   direction of a root, and downstream member adjacencies are in the
265	   direction away from the root.

267	4.2. Illustrating the result

269	   The following diagram illustrates the general layout of the flooding
270	   graph constructed using the algorithm as applied to a bi-partite
271	   style of tree (no intra tier links):

273	                                  Spine
274	                  +---+ +---+ +---+ +---+        +---+
275	                  | a | | b | | c | | d | o o o  | z |
276	                  +---+ +---+ +---+ +---+        +---+
277	                   /|\                            /|\
278	    \|/                                                             \|/
279	   +---+ +---+ +---+       +---+           +---+ +---+ +---+       +---+
280	   | i | |ii | |iii| o o o | x |           | 1 | | 2 | | 3 | o o o | n |
281	   +---+ +---+ +---+       +---+           +---+ +---+ +---+       +---+
282	    /|\                     /|\             /|\                     /|\

284	   In the example, there are two tiers of switches. The spine (nodes
285	   a..z), and the next tier with two groups of nodes (i..x) and (1..n).
286	   The algorithm will select the node with the lowest node ID in each
287	   tier as the replicating node for the low spanning tree; 'a' and 'i'
288	   for the set of nodes connecting the spine and the next tier. The
289	   algorithm will select the nodes with highest node ID in the same set
290	   of nodes for the high spanning tree; 'z' and 'n' for the same set of
291	   nodes.

293	   In the flooding topology:

295	   - Node 'a' is connected to nodes i..x and 1..n for the low spanning
296	      tree.

298	   - Node 'z' is connected to the same set of nodes for the high
299	      spanning tree.

301	   - Node 'i' is connected to nodes 'a'..'z' for the low spanning tree,
302	      and

304	   - Node 'n' is connected to the same nodes for the high spanning
305	      tree.

307	   - All other nodes are bi-connected to the flooding topology

309	   If there was a further tier added below nodes i..x, then 'i' and 'x'
310	   would be selected as the replicating nodes for the low and high
311	   spanning tree respectively. This is similarly true for nodes 1..n.

313	4.3. Interactions between Participating and Non-Participating Nodes

315	   This solution proposes primarily only nodal behaviors with respect to
316	   constraining flooding to member adjacencies. To address the scenario
317	   where the participating nodes were a subset of a larger network, it
318	   would be necessary to advertise the capability to participate in
319	   flood reduction.

321	   This would then require that each participating node use this
322	   information to be able to identify the set of participating
323	   adjacencies and confine the spanning tree computation to the set of
324	   participating adjacencies in order to identify local set of member
325	   adjacencies. Interactions with non-participant adjacencies would
326	   conform to current practice.

328	4.4. Flooding of LSAs

330	   The design of the protocol elements that are flooded is unmodified by
331	   this solution. Therefore, there is no additional information
332	   available to associate a received LSA with a given tree, nor is such
333	   information needed; the two spanning trees are not treated as unique
334	   entities in the flooding topology.

336	   As per current practice, a node does not relay LSAs that it has
337	   already seen.

339	   A new LSA received from an upstream member adjacency is flooded on:

341	      - All downstream member adjacencies exclusive of the adjacency of
342	      arrival, irrespective of which tree the adjacencies are part of.

344	      - All non-participant adjacencies

346	   A new LSA received from a downstream member adjacency is flooded on:

348	      -  All other member adjacencies exclusive of the adjacency of
349	        arrival irrespective of which tree the adjacencies are part of.

351	      -  All non-participant adjacencies

353	   A new LSA received from a member adjacency where upstream and
354	   downstream is ambiguous (it is an upstream member on one of the
355	   spanning trees and a downstream member on the other), is flooded on:

357	      - All other member adjacencies exclusive of the adjacency of
358	      arrival irrespective of which adjacency the links are part of.

360	      - All Non-Participant adjacencies

362	   A new LSA received from a non-member adjacency is flooded on all
363	   member adjacency irrespective of which tree the adjacencies are part
364	   of (see sections 5.1 and 5.5).

366	4.5. Root Selection

368	   The algorithm depends on tie breaking between sets of node IDs to
369	   produce diverse paths, therefore it does place some restrictions on
370	   root selection.

372	   A root SHOULD be selected so that the root"s node-id when XORd with
373	   the associated algorithm mask is the lowest ranked node in the local
374	   tier in the tree hierarchy. This would be analogous to path-id
375	   ranking where the paths were all of length 1.

377	   The root MUST NOT be selected such that the node-ID when XORd with
378	   the other root"s algorithm mask is the lowest ranked node. This would
379	   result in the root also being a transit node for the other spanning
380	   tree and produce a scenario whereby a single failure could render
381	   both spanning trees incomplete.

383	   Roots MUST NOT be directly connected for either of the low or high
384	   spanning trees. If the topology does not permit this to be satisfied
385	   purely by root selection, then the inter-root adjacency must be
386	   pruned from the graph prior to spanning tree computation to ensure
387	   that diverse paths between the roots are used.

389	   For a true bipartite graph, there are no other restrictions on node
390	   selection.

392	   For a bipartite graph modified with inter-tier links, the roots MUST
393	   be placed in different tiers to ensure a pathological combination of
394	   link weights and node-ids does not result in a scenario where a
395	   single failure would render the flooding topology incomplete.

397	   Other sources of failure may exist that may require an administrative
398	   component to root selection. This, for example, would ensure that
399	   both roots were not selected from a common shared risk group.

401	   See also section 5.5.

403	4.6. Node Additions

405	   A participating node that is added to the topology will initially not
406	   be served by the flooding topology. A participating node adjacent to
407	   that node is required to treat it as a non-participating node until
408	   such time as tree re-optimization has completed. At the end of tree
409	   optimization, typically two adjacent participating nodes will have
410	   member adjacencies with the new node, so the ability to flood LSAs
411	   between the new node and the flooding topology will have been
412	   uninterrupted during the process.

414	5. Further work

416	5.1. Thoughts on Coexistence in the Context of a Larger Network

418	   A node that had a combination of participating and non-participating
419	   adjacencies would be required to do the following:

421	   -  For any new LSA received on a participating adjacency, in addition
422	     to the rules for member adjacencies, it would also flood the LSA
423	     on all non-participating adjacencies.

425	   -  For any new LSA received on a non-participating adjacency, it
426	     would flood the LSA on all member adjacencies.

428	   This is reflected in the forwarding rules described in section 4.4.

430	5.1.1. Multiple flooding Domains and the Severing of Flooding Domains

432	   It is possible to envision several scenarios whereby there are sets
433	   of participating nodes that are not contiguously connected via
434	   participating adjacencies in a given IGP domain.

436	     1. A node has been incorrectly configured as a participating node
437	        but has no participating adjacencies.

439	     2. A participating node or set of nodes has become severed from
440	        the flooding topology but is still connected to other nodes in
441	        the network. Nodes in this set would still be able to compute a
442	        local extension of the flooding topology, but it would only be
443	        useful if the set was sufficiently large that a majority of the
444	        nodes were not connected to non-participants.

446	     3. Procedures are designed to permit more than one flooding
447	        topology in an IGP domain. In which case participating nodes
448	        would have to be administratively configured to associate with a
449	        flooding topology instance.

451	5.2. Thoughts on Flooding Topology Re-Optimization

453	   After a topology change, it is desirable that the flooding topology
454	   remain stable until the network has stabilized. However a single
455	   failure may render one of the spanning trees incomplete, such that a
456	   further single failure could make the flooding topology incomplete.
457	   Therefore procedures should include re-optimization of the flooding
458	   topology after a topology change. In order to maintain complete
459	   coverage it would make sense not to recompute the spanning trees
460	   simultaneously.

462	   One approach that would appear to make sense to separate in time
463	   network convergence, re-optimization of the low spanning tree and re-
464	   optimization of the high spanning tree.

466	   The ideal would be to reoptimize an incomplete tree first, however
467	   this would require the participating nodes to maintain a complete map
468	   of all member adjacencies so that a common determination of the most
469	   degraded spanning tree and hence the order of re-optimization could
470	   be made.

472	5.3. Thoughts on Node and Network Initialization

474	   A participating node at power up will be not be able to establish
475	   member links until it has synchronized with the network and the
476	   network is stable in the new topology. This suggests it simply treats
477	   power up similarly to how a topology change and network re-
478	   optimization is treated. The only difference being that it will flood
479	   all LSAs received or originated as per current practice until both
480	   spanning trees have stabilized.

482	5.4. Thoughts on Loop Prevention

484	   802.1aq included additional mechanisms to prevent looping, a reverse
485	   path forwarding check, and digest exchange across adjacencies to
486	   ensure IGP synchronization.

488	   Routing LSAs are not relayed if they are a duplicate, therefore
489	   destructive looping cannot occur and additional mitigation mechanisms
490	   are not required.

492	5.5. Thoughts on Pathological Failure Scenarios

494	   While in a stable fault free network with sufficient mesh density of
495	   the types considered, the flooding topology used by this solution
496	   would ensure that no single failure rendered both spanning trees
497	   incomplete, it is also useful to consider multiple failure scenarios
498	   and if they can be mitigated.

500	   Preliminary analysis suggests that in a tree network of sufficient
501	   mesh density, the only dual link failure that can render the flooding
502	   topology incomplete is if a participant node has failures in both
503	   upstream member adjacencies. This can be partially mitigated if the
504	   node recognizes this scenario and reverts to flooding on all
505	   adjacencies. If the suggested procedures of 5.1.1 above are adopted,
506	   surrounding participating nodes that receive the LSA on a non-member
507	   adjacency will introduce the LSA into the flooding topology.

509	   The pathological scenario is the simultaneous failure of both roots.
510	   This does suggest that root selection should place the roots two hops
511	   apart so there will be a constituency of participants that would
512	   observe a simultaneous failure of both upstream member adjacencies
513	   and revert to normal flooding.

515	6. Acknowledgements

517	   The author would like to acknowledge Jerome Chiabaut for his original
518	   algorithm work that underpins this memo.

520	7. Security Considerations

522	   For a future version of this document.

524	8. IANA Considerations

526	   This memo requires no IANA allocations

528	9. References

530	9.1. Normative References

532	[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
533	          Requirement Levels", BCP 14, RFC 2119, March 1997.

535	9.2. Informative References

537	[802.1Q] 802.1Q (2014) IEEE Standard for Local and Metropolitan
538	           Area Networks--Media Access Control (MAC) Bridges and
539	           Virtual Bridged Local Area Networks

541	[Li]  Li, T., Psenak, P., "Dynamic Flooding on Dense Graphs", IETF work
542	        in progress, draft-li-dynamic-flooding-05, June 2018

544	10. Author's Address

546	   Dave Allan
547	   Ericsson
548	   2455 Augustine Drive
549	   Santa Clara, CA  95054
550	   USA
551	   Email: david.i.allan@ericsson.com