idnits 2.17.1 draft-ietf-mpls-rmr-14.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (February 14, 2021) is 1159 days in the past. Is this intentional? Checking references for intended status: Experimental ---------------------------------------------------------------------------- == Missing Reference: 'Gen' is mentioned on line 247, but not defined == Missing Reference: 'TAR' is mentioned on line 249, but not defined == Missing Reference: 'OPS' is mentioned on line 251, but not defined == Missing Reference: 'SAD' is mentioned on line 253, but not defined == Missing Reference: 'RID' is mentioned on line 557, but not defined == Missing Reference: 'RID 1' is mentioned on line 561, but not defined == Missing Reference: 'RID 2' is mentioned on line 562, but not defined Summary: 0 errors (**), 0 flaws (~~), 8 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 MPLS WG K. Kompella 3 Internet-Draft Juniper Networks, Inc. 4 Intended status: Experimental L. 
Contreras 5 Expires: August 18, 2021 Telefonica 6 February 14, 2021 8 Resilient MPLS Rings 9 draft-ietf-mpls-rmr-14 11 Abstract 13 This document describes the use of the MPLS control and data planes 14 on ring topologies. It describes the special nature of rings, and 15 proceeds to show how MPLS can be effectively used in such topologies. 16 It describes how MPLS rings are configured, auto-discovered and 17 signaled, as well as how the data plane works. Companion documents 18 describe the details of discovery and signaling for specific 19 protocols. 21 Requirements Language 23 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 24 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 25 document are to be interpreted as described in [RFC2119][RFC8174]. 27 This document is classified as an Experimental RFC. The parameters 28 of this experiment have yet to be defined: how long the experiment 29 runs, what criteria determine that the experiment is over -- does the 30 doc then become Standards Track or Historical, etc. A future update 31 will document these parameters. 33 Status of This Memo 35 This Internet-Draft is submitted in full conformance with the 36 provisions of BCP 78 and BCP 79. 38 Internet-Drafts are working documents of the Internet Engineering 39 Task Force (IETF). Note that other groups may also distribute 40 working documents as Internet-Drafts. The list of current Internet- 41 Drafts is at https://datatracker.ietf.org/drafts/current/. 43 Internet-Drafts are draft documents valid for a maximum of six months 44 and may be updated, replaced, or obsoleted by other documents at any 45 time. It is inappropriate to use Internet-Drafts as reference 46 material or to cite them other than as "work in progress." 48 This Internet-Draft will expire on August 18, 2021. 50 Copyright Notice 52 Copyright (c) 2021 IETF Trust and the persons identified as the 53 document authors. All rights reserved. 
55 This document is subject to BCP 78 and the IETF Trust's Legal 56 Provisions Relating to IETF Documents 57 (https://trustee.ietf.org/license-info) in effect on the date of 58 publication of this document. Please review these documents 59 carefully, as they describe your rights and restrictions with respect 60 to this document. Code Components extracted from this document must 61 include Simplified BSD License text as described in Section 4.e of 62 the Trust Legal Provisions and are provided without warranty as 63 described in the Simplified BSD License. 65 Table of Contents 67 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 68 1.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . 3 69 1.2. Changes from -12 in response to reviews . . . . . . . . . 5 70 2. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 6 71 3. Theory of Operation . . . . . . . . . . . . . . . . . . . . . 6 72 3.1. Provisioning . . . . . . . . . . . . . . . . . . . . . . 7 73 3.2. Ring Nodes . . . . . . . . . . . . . . . . . . . . . . . 7 74 3.3. Ring Links and Directions . . . . . . . . . . . . . . . . 8 75 3.3.1. Express Links . . . . . . . . . . . . . . . . . . . . 9 76 3.4. Ring LSPs . . . . . . . . . . . . . . . . . . . . . . . . 9 77 3.5. Installing Primary LFIB Entries . . . . . . . . . . . . . 9 78 3.6. Protection . . . . . . . . . . . . . . . . . . . . . . . 10 79 3.7. Installing FRR LFIB Entries . . . . . . . . . . . . . . . 11 80 4. Autodiscovery . . . . . . . . . . . . . . . . . . . . . . . . 11 81 4.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 11 82 4.2. Ring Announcement Phase . . . . . . . . . . . . . . . . . 13 83 4.3. Mastership Phase . . . . . . . . . . . . . . . . . . . . 14 84 4.4. Ring Identification Phase . . . . . . . . . . . . . . . . 14 85 4.5. Ring Changes . . . . . . . . . . . . . . . . . . . . . . 15 86 5. Ring OAM . . . . . . . . . . . . . . . . . . . . . . . . . . 16 87 6. Advanced Topics . . . . . . . . . . . . . 
. . . . . . . . . . 16 88 6.1. Beyond the Ring . . . . . . . . . . . . . . . . . . . . . 16 89 6.2. Half-rings . . . . . . . . . . . . . . . . . . . . . . . 18 90 6.3. Hub Node Resilience . . . . . . . . . . . . . . . . . . . 18 91 7. Security Considerations . . . . . . . . . . . . . . . . . . . 18 92 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 19 93 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19 94 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 19 95 10.1. Normative References . . . . . . . . . . . . . . . . . . 19 96 10.2. Informative References . . . . . . . . . . . . . . . . . 19 97 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 20 99 1. Introduction 101 Rings are a very common topology either at infrastructure level 102 (e.g., physical ring fiber deployments in Layer 1 networks) or node 103 interconnection structures (e.g., loops created in bridged 104 interconnected infrastructures [IEEE.802.1D_2004]). A ring is the 105 simplest topology offering link and node resilience. Rings are 106 nearly ubiquitous in access and aggregation networks. As MPLS 107 increases its presence in such networks, and takes on a greater role, 108 it is imperative that MPLS handles rings well; this is not the case 109 today. 111 This document describes the special nature of rings, and the special 112 needs of MPLS on rings. It then shows how these needs can be met in 113 several ways, some of which involve extensions to protocols such as 114 IS-IS [RFC5305], OSPF[RFC3630], RSVP-TE [RFC3209] and LDP [RFC5036]. 115 RMR LSPs can also be signaled with IGP [RFC8402]; that will be 116 described in a future document. 118 The intent of this document is to handle rings that "occur 119 naturally". Many access and aggregation networks in metros have 120 their start as a simple ring. They may then grow into more complex 121 topologies, for example, by adding parallel links to the ring, or by 122 adding "express" links. 
The goal here is to discover these rings 123 (with some guidance), and run MPLS over them efficiently. The intent 124 is not to construct rings in a mesh network with the purpose of using 125 them for protection. 127 In some other networking situations (e.g., interconnection of 128 bridges), such rings could create loops that make the network 129 inoperable, so signaling mechanisms (such as the 130 Spanning Tree Protocol) are needed to prevent and eliminate such loops 131 [IEEE.802.1D_2004]. Here, a dual approach is followed: the 132 signaling methods are designed precisely to identify and define rings 133 automatically, so that LSPs adapted to the resulting ring topology 134 can be created efficiently. 136 1.1. Definitions 138 A (directed) graph G = (V, E) consists of a set of vertices (or 139 nodes) V and a set of edges (or links) E. An edge is an ordered pair 140 of nodes (a, b), where a and b are in V. (In this document, the 141 terms node and link will be used instead of vertex and edge.) 143 A ring is a subgraph of G. A ring consists of a subset of n nodes 144 {R_i, 0 <= i < n} of V. The directed edges {(R_i, R_i+1) and (R_i+1, 145 R_i), 0 <= i < n-1} must be a subset of E (note that index arithmetic 146 is done modulo n). We define the direction from node R_i to R_i+1 as 147 "clockwise" (CW) and the reverse direction as "anticlockwise" (AC). 148 As there may be several rings in a graph, we number each ring with a 149 distinct ring ID RID. 151 R0 . . . R1 152 . . 153 R7 R2 154 Anti- | . Ring . | 155 Clockwise | . . | Clockwise 156 v . RID = 17 . v 157 R6 R3 158 . . 159 R5 . . . R4 161 Figure 1: Ring with 8 nodes 163 The following terminology is used for ring LSPs: 165 Ring ID (RID): A non-negative number. When the RID identifies a 166 ring, it must be positive and unique in some scope of a Service 167 Provider's network. An RID of zero, when assigned to a node, 168 indicates that the node must behave in "promiscuous mode" (see 169 Section 3.2). 
A node may belong to multiple rings. 171 Ring node: A member of a ring. Note that a device may belong to 172 several rings. 174 Node index: A logical numbering of nodes in a ring, from zero up to 175 one less than the ring size. Used purely for exposition in this 176 document. 178 Ring master: The ring master initiates the ring identification 179 process. Mastership is indicated in the IGP by a two-bit field. 181 Ring neighbors: Nodes whose indices differ by one (modulo ring 182 size). 184 Ring links: Links that connect ring neighbors. 186 Express links: Links that connect non-neighboring ring nodes. 188 Ring direction: A two-bit field in the IGP indicating the direction 189 of a link. The choices are: 191 UN: 00 undefined link 193 CW: 01 clockwise ring link 194 AC: 10 anticlockwise ring link 196 EX: 11 express link 198 Ring Identification: The process of discovering ring nodes, ring 199 links, link directions, and express links. 201 The following notation is used for ring LSPs: 203 R_k: A ring node with index k. R_k has AC neighbor R_(k-1) and CW 204 neighbor R_(k+1). 206 RL_k: A (unicast) Ring LSP anchored on node R_k. 208 CL_jk: A label allocated by R_j for RL_k in the CW direction. 210 AL_jk: A label allocated by R_j for RL_k in the AC direction. 212 1.2. Changes from -12 in response to reviews 214 [Note to RFC Editor: this (sub-)section to be removed prior to 215 publication.] 
217 Reqts Lang: updated (response to Gen-ART review [Gen]) 219 Section 1: updated "transport networks" to "Layer 1 networks" 220 (response to Transport Area review [TAR]) 222 Sec 1: replaced SPRING with IGP (response to OPS directorate 223 [OPS]) 225 Sec 1: rephrased last sentence [TAR] 227 Sec 2: added para on control plane resilience [TAR] 229 Sec 3.1: typo fixed [Gen] 231 Sec 3.2: added figure, caveats for promiscuous mode (response to 232 Security Area Directorate review [SAD]) 234 Sec 3.5: updated reference [OPS] 236 Sec 3.6: updated text on node protection, TTL [OPS] 238 Sec 4.1: changed Ring Neighbor TLV/flags to Ring Link TLV/flags; 239 changed SPRING to IGP [OPS] 241 Sec 4.1: clean up [Gen] 242 Sec 4.2: updated text on timers T1, T2 [SAD] 244 Sec 4.3, 4.4: rewrote sections on Mastership, Ring Identification 245 Phases for clarity [OPS] 247 Sec 4.5: removed "and" [Gen] 249 Sec 5: updated text on timers [TAR] 251 New Sec 6.1: added text on traffic transiting a ring [OPS] 253 Sec Cons: added text on compromised nodes [SAD] 255 2. Motivation 257 A ring is the simplest topology that offers resilience. This is 258 perhaps the main reason to lay out fiber in a ring. Thus, effective 259 mechanisms for fast failover on rings are needed. Furthermore, there 260 are large numbers of rings. Thus, configuration of rings needs to be 261 as simple as possible. Finally, bandwidth management on access rings 262 is very important, as bandwidth is generally quite constrained here. 264 The goals of this document are to present mechanisms for improved 265 MPLS-based resilience in ring networks (using ideas that are 266 reminiscent of Bidirectional Line Switched Rings), for automatic 267 bring-up of LSPs, better bandwidth management and for auto-hierarchy. 268 These goals can be achieved using extensions to existing IGP and MPLS 269 signaling protocols, using central provisioning, or in other ways. 271 Note that this document addresses data plane resilience. 
Control 272 plane resilience, and robustness of protocol messaging, is managed by 273 the protocols being used here (IS-IS, OSPF, LDP and RSVP-TE) and not 274 described in this document. 276 3. Theory of Operation 278 Say a ring has ring ID RID. The ring is provisioned by choosing one 279 or more ring masters for the ring and assigning them the RID. Other 280 nodes in the ring may also be assigned this RID, or may be configured 281 as "promiscuous". Ring discovery then kicks in. When each ring node 282 knows its CW and AC ring neighbors and its ring links, and all 283 express links have been identified, ring identification is complete. 285 Once ring identification is complete, each node signals one or more 286 ring LSPs RL_i. RL_i, anchored on node R_i, consists of two counter- 287 rotating unicast LSPs that start and end at R_i. A ring LSP is 288 "multipoint": any node R_j can use RL_i to send traffic to R_i; this 289 can be in either the CW or AC directions, or both (i.e., load 290 balanced). Both of these counter-rotating LSPs are "active"; the 291 choice of direction to send traffic to R_i is determined by policy at 292 the node where traffic is injected into the ring. The default policy 293 is to send traffic along the shortest path. Bidirectional 294 connectivity between nodes R_i and R_j is achieved by using two 295 different ring LSPs: R_i uses RL_j to reach R_j, and R_j uses RL_i to 296 reach R_i. 298 3.1. Provisioning 300 The goal here is to provision rings with the absolute minimum 301 configuration. The exposition below aims to achieve that using auto- 302 discovery via a link-state IGP (see Section 4). Of course, auto- 303 discovery can be overridden by configuration. For example, a link 304 that would otherwise be classified by auto-discovery as a ring link 305 might be configured not to be used for ring LSPs. 307 3.2. Ring Nodes 309 Ring nodes have a loopback address, and run a link-state IGP and an 310 MPLS signaling protocol. 
To provision a node as a ring node for ring 311 RID, the node is simply assigned that RID. A node may be part of 312 several rings, and thus may be assigned several ring IDs. 314 To simplify ring provisioning even further, a node N may be made 315 "promiscuous" by being assigned an RID of 0. A promiscuous node 316 listens to RIDs in its IGP neighbors' link-state updates. For every 317 non-zero RID N hears from a neighbor, N joins the corresponding ring 318 by taking on that RID. In many situations, the use of promiscuous 319 mode means that only one or two nodes in a ring need to be 320 provisioned; everything else is auto-discovered. However, this 321 feature should be used with care. Consider the following: 323 R0 . . . R1 324 . . 325 R7 R2 326 Anti- | . Ring . | 327 Clockwise | . . | Clockwise 328 v . RID = 17 . v 329 R6 R3 330 . . 331 R5 . . . R4 332 . . 333 R13 R8 334 Anti- | . Ring . | 335 Clockwise | . . | Clockwise 336 v . RID = 18 . v 337 R12 R9 338 . . 339 R11 . . . R10 341 Two Rings 343 If R3 and R6 are configured with RID 17, R8 and R13 with RID 18, and 344 all other nodes with RID 0, this will end up as two rings with R4 and 345 R5 in both. However, other permutations of RID configurations could 346 easily end up with all nodes being in both rings 17 and 18, whereupon 347 the maximal ring will consist of R0 to R4, R8 to R13, R5 to R7 (and 348 the link from R4 to R5 will be an express link). In cases such as 349 these, one should eschew promiscuous mode in favor of simply 350 configuring all nodes with the appropriate RIDs. 352 A ring node indicates in its IGP updates the ring LSP signaling 353 protocols it supports. This can be LDP and/or RSVP-TE. Ideally, 354 each node should support both. 356 3.3. Ring Links and Directions 358 Ring links must be MPLS-capable. They are by default unnumbered, 359 point-to-point (from the IGP point of view) and "auto-bundled". 
The 360 "auto-bundled" attribute means that parallel links between ring 361 neighbors are considered as a single link, without the need for 362 explicit configuration for bundling (such as a Link Aggregation 363 Group). Note that each component may be advertised separately in the 364 IGP; however, signaling messages and labels across one component link 365 apply to all components. Parallel links between a pair of ring nodes 366 are often the result of having multiple lambdas or fibers between 367 those nodes. RMR is primarily intended for operation at the packet 368 layer; however, parallel links at the lambda or fiber layer may 369 result in parallel links at the packet layer. 371 A ring link is not provisioned as belonging to the ring; it is 372 discovered to belong to ring RID if both its adjacent nodes belong to 373 RID. A ring link's direction (CW or AC) is also discovered; this 374 process is initiated by the ring's ring master. Note that the above 375 two attributes can be overridden by provisioning if needed; it is 376 then up to the provisioning system to maintain consistency across the 377 ring. 379 3.3.1. Express Links 381 Express links are discovered once ring nodes, ring links and 382 directions have been established. As defined earlier, express links 383 are links joining non-neighboring ring nodes; often, this may be the 384 result of optically bypassing ring nodes. 386 3.4. Ring LSPs 388 Ring LSPs are not provisioned. Once a ring node R_i knows its RID, 389 its ring links and directions, it kicks off ring LSP signaling 390 automatically. R_i allocates CW and AC labels for each ring LSP 391 RL_k. R_i also initiates the creation of RL_i. As the signaling 392 propagates around the ring, CW and AC labels are exchanged. When R_i 393 receives CW and AC labels for RL_k from its ring neighbors, primary 394 and fast reroute (FRR) paths for RL_k are installed at R_i. 396 For RSVP-TE LSPs, bandwidths may be signaled in both directions. 
397 However, these are not provisioned either; rather, one does "reverse 398 call admission control". When a service needs to use an LSP, the 399 ring node where the traffic enters the ring attempts to increase the 400 bandwidth on the LSP to the egress. If successful, the service is 401 admitted to the ring. 403 3.5. Installing Primary LFIB Entries 405 In setting up RL_k, a node R_j sends out two labels: CL_jk to R_j-1 406 and AL_jk to R_j+1. R_j also receives two labels: CL_j+1,k from 407 R_j+1, and AL_j-1,k from R_j-1. R_j can now set up the forwarding 408 entries for RL_k. In the CW direction, R_j swaps incoming label 409 CL_jk with CL_j+1,k with next hop R_j+1; these allow R_j to act as 410 LSR for RL_k. R_j also installs an LFIB entry to push CL_j+1,k with 411 next hop R_j+1 to act as ingress for RL_k. Similarly, in the AC 412 direction, R_j swaps incoming label AL_jk with AL_j-1,k with next hop 413 R_j-1 (as LSR), and an entry to push AL_j-1,k with next hop R_j-1 (as 414 ingress). 416 Clearly, R_k does not act as ingress for its own LSPs. However, R_k 417 can send OAM messages, for example, an MPLS ping or traceroute 418 ([RFC8029]), using labels CL_k,k+1 and AL_k-1,k, to test the entire 419 ring LSP anchored at R_k in both directions. Furthermore, if these 420 LSPs use Ultimate Hop Popping, then R_k installs LFIB entries to pop 421 CL_k,k for packets received from R_k-1 and to pop AL_k,k for packets 422 received from R_k+1. 424 3.6. Protection 426 In this scheme, there are no protection LSPs as such -- no node or 427 link bypass LSPs, no standby LSPs, no detours, and no LFA-type 428 protection. Protection is via the "other" direction around the ring, 429 which is why ring LSPs are in counter-rotating pairs. Protection 430 works in the same way for link, node and ring LSP failures. 
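As an illustration of Section 3.5, the primary LFIB entries that R_j installs for RL_k can be sketched in Python as follows. This is a minimal sketch and not part of the specification: the LfibEntry type, the primary_entries function and the textual label names are hypothetical and exist only for exposition; real labels are numeric values learned through signaling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LfibEntry:
    """Hypothetical LFIB entry, for exposition only."""
    op: str                   # "swap" (LSR role) or "push" (ingress role)
    in_label: Optional[str]   # None for push entries
    out_label: str
    next_hop: str


def primary_entries(j: int, k: int, n: int) -> List[LfibEntry]:
    """Primary entries installed by R_j for ring LSP RL_k on an n-node ring."""
    if j % n == k % n:
        # R_k anchors RL_k; the swap/push entries below do not apply to it.
        raise ValueError("R_k does not act as ingress for its own LSP")
    cw = (j + 1) % n  # index of the CW neighbor R_j+1
    ac = (j - 1) % n  # index of the AC neighbor R_j-1
    return [
        # CW direction: swap CL_jk with CL_j+1,k, next hop R_j+1 (as LSR).
        LfibEntry("swap", f"CL_{j},{k}", f"CL_{cw},{k}", f"R{cw}"),
        # CW direction: push CL_j+1,k, next hop R_j+1 (as ingress).
        LfibEntry("push", None, f"CL_{cw},{k}", f"R{cw}"),
        # AC direction: swap AL_jk with AL_j-1,k, next hop R_j-1 (as LSR).
        LfibEntry("swap", f"AL_{j},{k}", f"AL_{ac},{k}", f"R{ac}"),
        # AC direction: push AL_j-1,k, next hop R_j-1 (as ingress).
        LfibEntry("push", None, f"AL_{ac},{k}", f"R{ac}"),
    ]
```

For example, on the 8-node ring of Figure 1, primary_entries(1, 4, 8) gives R1 two CW entries with next hop R2 and two AC entries with next hop R0 for the ring LSP anchored at R4.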
432 If a node R_j detects a failure from R_j+1 -- either all links to 433 R_j+1 fail, or R_j+1 itself fails, R_j switches traffic on all CW 434 ring LSPs to the AC direction using the FRR LFIB entries. If the 435 failure is specific to a single ring LSP, R_j switches traffic just 436 for that LSP. In either case, this switchover can be very fast, as 437 the FRR LFIB entries can be preprogrammed. Fast detection and fast 438 switchover lead to minimal traffic loss. 440 R_j then sends an indication to R_j-1 that the CW direction is not 441 working, so that R_j-1 can similarly switch traffic to the AC 442 direction. For RSVP-TE, this indication can be a PathErr or a 443 Notify; other signaling protocols have similar indications. These 444 indications propagate AC until each traffic source on the ring AC of 445 the failure uses the AC direction. Thus, within a short period, 446 traffic will be flowing in the optimal path, given that there is a 447 failure on the ring. This contrasts with (say) bypass protection, 448 where until the ingress recomputes a new path, traffic will be 449 suboptimal. 451 Note that the failure of a node or a link will not necessarily affect 452 all ring LSPs. Thus, it is important to identify the affected LSPs 453 (and switch them), but to leave the rest alone. 455 One point to note is that when a ring node, say R_j, fails, RL_j is 456 clearly unusable. However, the above protection scheme will cause a 457 traffic loop: R_j-1 detects a failure CW, and protects by sending CW 458 traffic on RL_j back all the way to R_j+1, which in turn sends 459 traffic to R_j-1, etc. There are three proposals to avoid this: 461 1. Each ring node acting as ingress sends traffic with a TTL of at 462 most 2*n, where n is the number of nodes in the ring. 464 2. A ring node sends protected traffic (i.e., traffic switched from 465 CW to AC or vice versa) with TTL just large enough to reach the 466 egress. 468 3. 
A ring node sends protected traffic with a special purpose label 469 below the ring LSP label. A protecting node first checks for the 470 presence of this label; if present, it means that the traffic is 471 looping and MUST be dropped. 473 Approaches 1 and 2 work for traffic that remains on the ring or 474 terminates on a ring node (see Section 6.1); for traffic transiting 475 the ring, playing with TTL may affect forwarding beyond the ring. 476 Approach 3 is the most general and is the one we advocate; however, 477 this will require the allocation and definition of a new special 478 purpose label. 480 3.7. Installing FRR LFIB Entries 482 At the same time that R_j sets up its primary CW and AC LFIB entries, 483 it can also set up the protection forwarding entries for RL_k. In 484 the CW direction, R_j sets up an FRR LFIB entry to swap incoming 485 label CL_jk with AL_j-1,k with next hop R_j-1. In the AC direction, 486 R_j sets up an FRR LFIB entry to swap incoming label AL_jk with 487 CL_j+1,k with next hop R_j+1. Again, R_k does not install FRR LFIB 488 entries in this manner. 490 Say R1 receives label L42 from R2 to reach R4 in the clockwise 491 direction, and receives label L40 from R0 to reach R4 in the anti- 492 clockwise direction. Say R1 also receives label L52 from R2 to reach 493 R5 in the clockwise direction, and receives label L50 from R0 to 494 reach R5 in the anti-clockwise direction. R1 makes the following 495 LFIB entries: 497 +------+--------+-----------+--------+-----------+ 498 | Dest | CW/NH | CW FRR/NH | AC/NH | AC FRR/NH | 499 +------+--------+-----------+--------+-----------+ 500 | ... | | | | | 501 | R4 | L42/R2 | L40/R0 | L40/R0 | L42/R2 | 502 | R5 | L52/R2 | L50/R0 | L50/R0 | L52/R2 | 503 | ... | | | | | 504 +------+--------+-----------+--------+-----------+ 506 R1's LFIB 508 4. Autodiscovery 510 4.1. Overview 512 Auto-discovery proceeds in three phases. The first phase is the 513 announcement phase. 
The second phase is the mastership phase. The 514 third phase is the ring identification phase. 516 S1 517 / \ 518 | R0 . . . R1 R0 has MV = 11 519 | . \ . R1 has MV = 10 520 R7 \________ R2 All other nodes have MV = 00 521 Anti- | . . | 522 clockwise | . Ring . | Clockwise 523 v . RID = 17 . v 524 R6 R3 525 . . 526 R5 . . . R4 527 \ / 528 \ / 529 An 531 Figure 2: Ring with non-ring nodes and links 533 We use three concepts below: 535 ring nodes: all nodes that announce ring node TLVs with a given 536 RID. 538 IGP neighbors: all nodes which are IGP neighbors of a given node. 540 ring neighbors: ring nodes that are IGP neighbors of a given node. 541 Exactly one is the CW neighbor and one is the AC neighbor; all 542 other ring neighbors are express neighbors. 544 In Figure 2, R0 through R7 are ring nodes belonging to ring 17. R0 545 has IGP neighbors R1, R2, R7 and S1. R0 has ring neighbors R1 (CW), 546 R2 (express) and R7 (AC). Autodiscovery aims to identify ring nodes 547 of a given ring, ring neighbors of each ring node, and the CW and AC 548 node for each ring node. 550 The format of an RMR Node Type-Length-Value (TLV) is given below. It 551 consists of information pertaining to the node and optionally, sub- 552 TLVs. A Neighbor sub-TLV contains information pertaining to the 553 node's neighbors. Other sub-TLVs may be defined in the future. 554 Details of the format specific to IS-IS and OSPF will be given in the 555 corresponding IGP documents. 557 [RMR Node Type][RMR Node Length][RID][Node Flags][sub-TLVs] 559 Ring Node TLV Format 561 [My Intf Inx][Rem Intf Inx][RID 1][Flags for RID 1] 562 [RID 2][Flags for RID 2]... 
564 Ring Link Sub-TLV Format 566 0 1 567 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 568 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 569 |MV | SS | SO | MBZ |SU |M| 570 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 571 MV: Mastership Value 572 SS: Supported Signaling Protocols 573 (100 = RSVP-TE; 010 = LDP; 001 = IGP) 574 MBZ: Must be zero 575 SO: Supported OAM Protocols (100 = BFD; 010 = CFM; 001 = EFM) 576 SU: Signaling Protocol to Use (00: none; 01: LDP; 10: RSVP-TE; 577 11: IGP) 578 M : Elected Master (0 = no, 1 = yes) 580 Flags for a Ring Node TLV 582 0 1 583 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 584 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 585 |RD |OAM| MBZ | 586 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 587 RD: Ring Direction (00 = none; 01 = CW; 10 = AC; 11 = express) 588 OAM: OAM Protocol to use (00 = none; 01 = BFD; 10 = CFM; 11 = EFM) 589 MBZ: Must be zero 591 Flags for a Ring Link TLV 593 4.2. Ring Announcement Phase 595 Each node participating in an MPLS ring is assigned an RID; in the 596 example, RID = 17. A node is also provisioned with a mastership 597 value. Each node advertises a ring node TLV for each ring it is 598 participating in, along with the associated flags. It then starts 599 timer T1; this timer is to allow each node time to hear from all 600 other nodes in the ring. [The settings for timers T1 and T2 (below) 601 are particular to the specific IGP used for signaling; they will be 602 discussed in the IGP document that defines the ring node/link TLVs.] 606 A node in promiscuous mode doesn't advertise any ring node TLVs. 607 However, when it hears a ring node TLV from an IGP neighbor, it joins 608 that ring, and sends its own ring node TLV with that RID. 610 The announcement phase allows a ring node to discover other ring 611 nodes in the same ring so that a ring master can be elected. 613 4.3. 
Mastership Phase 615 When timer T1 fires, a node enters the mastership phase. In this 616 phase, each ring node N starts timer T2 and checks if it is master as 617 follows. N examines the MV value of all ring nodes and selects those 618 with the highest MV value. Among these nodes, N finds the node with 619 the lowest loopback address. If that node is N, N declares itself 620 master to the entire ring by readvertising its ring node TLV with the 621 M bit set. 623 When timer T2 fires, each node examines the ring node TLVs from all 624 other nodes in the ring to identify the ring master. There should be 625 exactly one; if not, each node restarts timer T2 and tries again. 627 Barring software bugs or malicious code, the principal reason for 628 multiple nodes setting their M bit is late-arriving ring 629 announcements. Say nodes N1 and N2 have the highest mastership 630 values, and N1 has the lowest loopback address, while N2 has the 631 second lowest loopback address. If N1 makes its ring announcement 632 just as N2's T1 timer fires, both N1 and N2 will think they are the 633 master (since N2 will not have heard N1's announcement in time). 634 However, in the next round, N2 will realize that N1 is indeed the 635 master. In the worst case, the mastership phase will occur as many 636 times as there are nodes in the ring. 638 4.4. Ring Identification Phase 640 R0 . . . R1 <------ 3. Anti-clockwise neighbor 641 . . 642 R7 -------------- R2 <--- 0. Ring Master 643 . Ring / . 644 . / . 1. Maximal ring includes R3 645 . / . 646 R6 / R3 <--- 2. Clockwise neighbor 647 . / . 648 R5 . . . R4 <------ 4. R4 is express neighbor 649 R7 is also an express neighbor of R2 651 Figure 3: Ring Identification 653 When there is exactly one ring master M (here, R2), M enters the Ring 654 Identification Phase. M indicates that it has successfully completed 655 this phase by advertising ring link TLVs. This is the trigger for 656 M's CW neighbor to enter the Ring Identification Phase. 
This phase 657 passes CW until all ring nodes have completed ring identification. 659 The Ring Identification Phase proceeds as follows: 661 1. M identifies all ring nodes for ring RID, i.e., those that have 662 announced ring node TLVs with the ring ID = RID. 664 2. M computes a maximal ring among these nodes. 666 3. Based on that, M picks a CW neighbor and an AC neighbor. 668 4. M then inserts ring link TLVs with ring direction CW for each 669 link to its CW neighbor; M also inserts a ring link TLV with 670 direction AC for each link to its AC neighbor. (Note that there 671 may be multiple links from M to each of its neighbors.) 673 5. Finally, M determines its express links. These are links to IGP 674 neighbors that are ring nodes but neither the CW nor the AC neighbor. 675 M advertises ring link TLVs for express links by setting the link 676 direction to "express link". 678 This process passes on to the CW neighbor X as follows: 680 1. Each node Y listens for ring link TLVs. The set of nodes S 681 consists of those that have announced ring link TLVs. 683 2. If a node Z announces a ring link TLV with Y as the CW neighbor, 684 then Y is next. 686 X follows the same procedure as M with two small changes: 688 1. when X computes a maximal ring, it MUST include all nodes in S. 690 2. X knows its AC neighbor (Z above), and doesn't have to pick it. 692 Here, R2 (the master) knows R0 through R7 are ring nodes (Step 1). 693 R1, R3, R4 and R7 are its ring neighbors. R2 computes a maximal ring 694 (Step 2). It then picks R3 as its CW neighbor and R1 as its AC 695 neighbor (Step 3). Finally, it declares the links to R4 and R7 as 696 express links (Step 5). 698 4.5. Ring Changes 700 The main changes to a ring are: 702 ring link addition; 703 ring link deletion; 705 ring node addition; 707 ring node deletion. 709 The main goal of handling ring changes is (as much as possible) not 710 to perturb existing ring operation. 
   Thus, if the ring master hasn't changed, all of the above changes
   should be local to the point of change.  Link adds just update the
   IGP; signaling should take advantage of the new capacity as soon as
   it learns of it.  Link deletions in the case of parallel links also
   show up as a change in capacity (until the last link in the bundle
   is removed).

   The removal of the last ring link between two nodes, or the removal
   of a ring node, is an event that triggers protection switching.  In a
   simple ring, the result is a broken ring.  However, if a ring has
   express links, then it may be able to converge to a smaller ring with
   protection.

   The addition of a new ring node can also be handled incrementally.

5.  Ring OAM

   Each ring node should advertise in its ring node TLV the OAM
   protocols it supports.  Each ring node is expected to run a
   link-level OAM over each ring link.  This should be an OAM protocol
   that both neighbors agree on.  The default hello time is that of the
   protocol chosen.

   Each ring node also sends OAM messages over each direction of its
   ring LSP.  This is a multi-hop OAM to check LSP liveness; typically,
   BFD would be used for this.  Each node chooses the hello interval,
   the choice of which should be based on the size of the ring (as each
   node would have to send out twice that many hello messages every
   interval) and the desired failure detection time.

6.  Advanced Topics

6.1.  Beyond the Ring

   The discussion above covers traffic that originates and terminates
   on a ring.  However, in many cases, traffic may originate on a ring
   node and terminate at a non-ring node; other traffic may originate
   on a non-ring node and terminate on a ring node; and in yet other
   cases, traffic may transit a ring, i.e., originate on a non-ring
   node, arrive at a ring node, traverse the ring, and leave for a
   non-ring destination.
   This section discusses these cases, and how traffic traversing a
   ring can profit from ring protection.

           N0 ___ R0 . . . R1
             \\ .             .
              R7              R2
              .      Ring      .
              .       17       .
              .                .
              R6              R3 ----- N1
                .            .
                  R5 . . . R4

                       Figure 4: Beyond the Ring

   In all these cases, the "end-to-end" path needs to be either stitched
   with, or overlaid on, the ring path.  The latter approach is
   recommended, using hierarchy in both the control and data planes.  In
   the figure above, traffic from N0 to N1 (both non-ring nodes)
   traverses Ring 17.  If nodes outside Ring 17 use LDP to signal LSPs,
   here's one way to accomplish this: R7 and R3 have a targeted LDP
   session over which they exchange labels.  The following LDP label
   exchanges occur (among others):

   1.  N1 sends an "egress label" L0 for its loopback N1 to R3 and
       inserts a "pop L0 and forward" entry in its LFIB.

   2.  R3 sends a label L1 for N1 to R7 over the targeted LDP session
       and inserts a "swap L1 with L0" entry in its LFIB.

   3.  R7 sends label L2 for N1 to N0 and inserts a "swap L2 with L1"
       entry in its LFIB.

   4.  N0 inserts a "push L2" entry in its LFIB for traffic destined to
       N1.

   In parallel, nodes in Ring 17 exchange labels for traffic within the
   ring.

   To send a packet to N1, N0 pushes label L2.  When this reaches R7, R7
   swaps L2 with L1 and additionally pushes a ring label to reach R3.
   Ring forwarding occurs between R7 and R3.  R3 pops the ring label,
   swaps L1 with L0 and forwards the packet to N1.  If a failure occurs
   on the ring, ring protection kicks in.  A failure of R7, R3 or any
   non-ring node will be dealt with by the non-ring label distribution
   protocol (in this case, LDP).

6.2.  Half-rings

   In some cases, a ring H may be incomplete, either because H is
   permanently missing a link (not just because of a failure), or
   because the link required to complete H is in a different IGP area.
   Either way, the ring discovery algorithm will fail.  We call such a
   ring a "half-ring".  Half-rings are sufficiently common that finding
   a way to deal with them effectively is a useful problem to solve.
   This topic will not be addressed in this document; that task is left
   for a future document.

6.3.  Hub Node Resilience

   Let's call the node(s) that connect a ring to the rest of the network
   "hub node(s)" (usually, there is a pair of hub nodes).  Suppose a
   ring has two hub nodes H1 and H2.  Suppose further that a non-hub
   ring node X wants to send traffic to some node Z outside the ring.
   This could be done, say, by having targeted LDP (T-LDP) sessions from
   H1 and H2 to X advertising LDP reachability to Z via H1 and H2,
   respectively; there would be a two-label stack from X to reach Z.
   Say that to reach Z, X prefers H1; thus, traffic from X to Z will
   first go to H1 via a ring LSP, then to Z via LDP.

   If H1 fails, traffic from X to Z will drop until the T-LDP session
   from H1 to X fails, the IGP reconverges, and H2's label to Z is
   chosen.  Thereafter, traffic will go from X to H2 via a ring LSP,
   then to Z via LDP.  However, this convergence could take a long time.
   Since this is a very common and important situation, it is again a
   useful problem to solve.  However, this topic too will not be
   addressed in this document; that task is left for a future document.

7.  Security Considerations

   This document proposes extensions to IS-IS, OSPF, LDP and RSVP-TE,
   all of which have mechanisms to secure them.  The proposed extensions
   do not in themselves compromise network security when the control
   plane is secured, since manipulation of message content, and even
   misinterpretation of message semantics by the control plane, is then
   avoided.

   A compromised or otherwise misbehaving node can foil the
   autodiscovery process (Section 4), leading to a ring never
   transitioning to a usable state.

8.  Acknowledgments

   Many thanks to Pierre Bichon whose exemplar of self-organizing
   networks and whose urging for ever simpler provisioning led to the
   notion of promiscuous nodes.

9.  IANA Considerations

   There are no requests as yet to IANA for this document.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

10.2.  Informative References

   [IEEE.802.1D_2004]
              IEEE, "IEEE Standard for Local and metropolitan area
              networks: Media Access Control (MAC) Bridges", IEEE
              802.1D-2004, DOI 10.1109/ieeestd.2004.94569, July 2004.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001.

   [RFC3630]  Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering
              (TE) Extensions to OSPF Version 2", RFC 3630,
              DOI 10.17487/RFC3630, September 2003.

   [RFC5036]  Andersson, L., Ed., Minei, I., Ed., and B. Thomas, Ed.,
              "LDP Specification", RFC 5036, DOI 10.17487/RFC5036,
              October 2007.

   [RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
              Engineering", RFC 5305, DOI 10.17487/RFC5305, October
              2008.

   [RFC8029]  Kompella, K., Swallow, G., Pignataro, C., Ed., Kumar, N.,
              Aldrin, S., and M. Chen, "Detecting Multiprotocol Label
              Switched (MPLS) Data-Plane Failures", RFC 8029,
              DOI 10.17487/RFC8029, March 2017.

   [RFC8402]  Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
              Decraene, B., Litkowski, S., and R. Shakir, "Segment
              Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
              July 2018.

Authors' Addresses

   Kireeti Kompella
   Juniper Networks, Inc.
   1133 Innovation Way
   Sunnyvale, CA 94089
   USA

   Email: kireeti.ietf@gmail.com


   Luis M. Contreras
   Telefonica
   Ronda de la Comunicacion
   Sur-3 building, 3rd floor
   Madrid 28050
   Spain

   Email: luismiguel.contrerasmurillo@telefonica.com
   URI:   http://lmcontreras.com