idnits 2.17.1 draft-thubert-rtgwg-arc-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document doesn't use any RFC 2119 keywords, yet seems to have RFC 2119 boilerplate text. -- The document date (January 22, 2015) is 3353 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Outdated reference: A later version (-15) exists of draft-ietf-spring-segment-routing-00 Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 RTGWG P. Thubert, Ed. 3 Internet-Draft Cisco 4 Intended status: Standards Track P. Bellagamba 5 Expires: July 26, 2015 Cisco Systems 6 January 22, 2015 8 Available Routing Constructs 9 draft-thubert-rtgwg-arc-03 11 Abstract 13 This draft introduces the concept of ARC, a two-edged routing 14 construct that forms its own fault isolation and recovery domain. 15 The new paradigm can be leveraged to improve the network utilization 16 and resiliency for unicast and multicast traffic in multiple 17 environments, and is optimized to compute short reroute paths in case 18 of breakages. 20 Requirements Language 22 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 23 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 24 "OPTIONAL" in this document are to be interpreted as described in RFC 25 2119 [RFC2119]. 27 Status of This Memo 29 This Internet-Draft is submitted in full conformance with the 30 provisions of BCP 78 and BCP 79. 32 Internet-Drafts are working documents of the Internet Engineering 33 Task Force (IETF). Note that other groups may also distribute 34 working documents as Internet-Drafts. The list of current Internet- 35 Drafts is at http://datatracker.ietf.org/drafts/current/. 37 Internet-Drafts are draft documents valid for a maximum of six months 38 and may be updated, replaced, or obsoleted by other documents at any 39 time. It is inappropriate to use Internet-Drafts as reference 40 material or to cite them other than as "work in progress." 42 This Internet-Draft will expire on July 26, 2015. 44 Copyright Notice 46 Copyright (c) 2015 IETF Trust and the persons identified as the 47 document authors. All rights reserved. 49 This document is subject to BCP 78 and the IETF Trust's Legal 50 Provisions Relating to IETF Documents 51 (http://trustee.ietf.org/license-info) in effect on the date of 52 publication of this document. Please review these documents 53 carefully, as they describe your rights and restrictions with respect 54 to this document. Code Components extracted from this document must 55 include Simplified BSD License text as described in Section 4.e of 56 the Trust Legal Provisions and are provided without warranty as 57 described in the Simplified BSD License. 59 Table of Contents 61 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 62 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 63 3. ARC Set representations . . . . . . . . . . . . . . . . . . . 7 64 4. Lowest ARC First . . . . . . . . . . . . . . . . . . . . . . 9 65 4.1. Init . . . . . . . . . . . . . . . . . . . . . . . . . . 10 66 4.2. Growing Trees . . . . . . . . . . . . . . . . . . . . . . 10 67 4.3. Being Safe . . . . . . . . . . . . . . . . . . . . . . . 11 68 4.4. Bending An ARC . . . . . . . . . . . . . . . . . . . . . 11 69 4.5. Orienting Links . . . . . . . . . . . . . . . . . . . . . 12 70 4.6. Looping or recursing . . . . . . . . . . . . . . . . . . 13 71 5. Forwarding Along An ARC Set . . . . . . . . . . . . . . . . . 13 72 5.1. Control Plane Recovery . . . . . . . . . . . . . . . . . 14 73 5.2. Data Plane Recovery . . . . . . . . . . . . . . . . . . . 14 74 5.2.1. Label Switched ARCs . . . . . . . . . . . . . . . . . 15 75 5.2.2. Segment Routed ARCs . . . . . . . . . . . . . . . . . 15 76 5.3. Flooding . . . . . . . . . . . . . . . . . . . . . . . . 16 77 6. ARC Signaling . . . . . . . . . . . . . . . . . . . . . . . . 16 78 6.1. Serial ARC Representation . . . . . . . . . . . . . . . . 16 79 6.2. Centralized vs. Distributed computation . . . . . . . . . 16 80 6.3. ARC Topology Injection . . . . . . . . . . . . . . . . . 17 81 6.4. ARC Operations, Administration, and Maintenance . . . . . 17 82 7. Other ARC Operations . . . . . . . . . . . . . . . . . . . . 17 83 7.1. Node-Local vs. ARC-Wide reaction . . . . . . . . . . . . 17 84 7.2. Load Balancing . . . . . . . . . . . . . . . . . . . . . 17 85 7.3. Shared Risk Link Group . . . . . . . . . . . . . . . . . 18 86 7.4. Olympic Rings . . . . . . . . . . . . . . . . . . . . . . 19 87 7.5. Routing Hierarchies . . . . . . . . . . . . . . . . . . . 19 88 8. Manageability . . . . . . . . . . . . . . . . . . . . . . . . 20 89 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 90 10. Security Considerations . . . . . . . . . . . . . . . . . . . 20 91 11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 20 92 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 20 93 12.1. Normative References . . . . . . . . . . . . . . . . . . 20 94 12.2. Informative References . . . . . . . . . . . . . . . . . 20 95 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 21 97 1. Introduction 99 Traditional routing and forwarding uses the concept of path as the 100 basic routing paradigm to get a packet from a source to a destination 101 by following an ordered sequence of arrows between intermediate 102 nodes. In this serial design, a path is broken as soon as a single 103 arrow is, and getting around a breakage can require path re- 104 computation, network re-convergence, and incur delays to till service 105 is restored. 107 Multiple paths can be bound together for instance to form a Directed 108 Acyclic Graph (DAG) to a destination, but that technique can be 109 difficult to balance and cannot provide a full path redundancy even 110 in the case of a biconnected graph. For instance, if the node that 111 is closest to the DAG destination has only one link to that 112 destination, then it does not have a alternate path to get to that 113 destination. 115 It is also possible to compute an alternate routing topology for fast 116 rerouting to a given destination, in which case some signaling, 117 tagging or labeling can be put in place to indicate whether a packet 118 follows the normal path or was rerouted over an alternate topology. 119 Once a packet is rerouted, it is bound to the alternate topology so 120 only one breakage can be handled with loop-free guarantees in most 121 practical situations. 123 This draft introduces the concept of an Available Routing Construct 124 (ARC) as a routing construct made of a bidirectional sequence of 125 nodes and links with 2 outgoing edges, so that, upon a single 126 breakage, each lively node in along ARC can still reach one of the 127 outgoing edges. 129 The routing graph to reach a certain destination is expressed as a 130 cascade of ARCs, each ARC providing its own independent domain of 131 fault isolation and recovery. Unicast traffic may enter an ARC via 132 any node but it may only leave the ARC through one of its two edges. 133 One node along the ARC is designated as the Cursor. In normal 134 unicast operations, the traffic inside an ARC flows away from the 135 Cursor towards an edge. Upon a failure, packets may bounce on the 136 breakage point and flow the other way along the ARC to take the other 137 exit. 139 Aa a result an ARC is resilient to any single failure, and the 140 recovery can be driven either from the data plane or the control 141 plane. A second failure occurring within a same ARC will isolate an 142 ARC segment. This can be further corrected from the control plane by 143 reversing all the incoming Edges in a process that might recurse till 144 an exit is found. When ARC reversal is applied, an ARC topology is 145 resilient to some cases of Shared Risk Link Group (SRLG) failures. 147 Properties of the Maximally Redundant Tree (MRT) and ARC are compared 148 in [I-D.thubert-rtgwg-arc-vs-mrt]. The study shows that the reroute 149 path that ARC derives is generally shorter than the alternate path 150 that MRT computes. This property is largely due to the concept of 151 cursor that delineates the shortest path on both sides of an ARC. 152 Once a rerouted packet passes the cursor of the ARC in which it is 153 rerouted, it should not cross a cursor again unless there is a second 154 breakage later. It results that the packet follows the shortest path 155 for the rest of the way, staying on the right side of each downstream 156 ARC, when MRT would be following all subsequent eyes in the same 157 direction. 159 This draft presents the concept and provides an intuition of how ARCs 160 can simplify the operation and improve the network utilization and 161 resiliency for all sorts of traffic in multiple environments, but 162 defers to further documents to elaborate on the algorithms and 163 optimizations in the different application domains. For instance, 164 ARCs can also be used in datacenters for the purpose of fast-reroute, 165 or within a service provider network to simplify load balancing 166 operations or leverage optimally the ring topologies [RFC5921]. 168 2. Terminology 170 The definition of the constituent parts of the "OAM" term is found in 171 [RFC6291]. 173 The draft uses the following terminology: 175 ARC: Available Routing Construct. An ARC is a loopless ordered set 176 of nodes and links whereby traffic may enter via any node in the 177 ARC but may only leave the ARC through either one of the ARC 178 edges. 180 Comb: An ARC generalization: a Comb is a n-edged loopless set of 181 nodes and links with n >= 2; traffic may enter via any node in the 182 Comb but may only exit the Comb through one of its n edges. A 183 Comb comes with a walk operation that enables to attempt to exit 184 via every edge and to discover when all have been tried. 186 Cursor: A virtual point along an ARC that can be located on a node 187 or on a link between 2 nodes. In normal operations, the traffic 188 along the ARC flows away from its Cursor. If the Cursor is a 189 node, then traffic can be distributed on both sides. The Cursor 190 may be moved to change the way traffic is load balanced along an 191 ARC. It may also be placed at the location of a failure to direct 192 traffic away from that point. 194 ARC Node: A Node that belongs to an ARC. 196 Edge ARC Node: An ARC Node at an edge of its ARC. An Edge ARC Node 197 is a node via wich traffic can exit the ARC. 199 Edge Link: A directed link outgoing from an Edge ARC Node. Traffic 200 can only exit from an ARC via an Edge Link. An Edge Link does not 201 accept traffic into an ARC. 203 Intermediate ARC Node: A node that is not at an edge of an ARC. A 204 Intermediate ARC Node node that can receive traffic and forward 205 traffic between its adjacent nodes. 207 Intermediate Link: A link between two Intermediate ARC Nodes. An 208 Intermediate Link is reversible, meaning that traffic is allowed 209 in both directions though an individual packet is constrained in 210 the way its direction is reversed. For stable links such as wired 211 links, the typical constraint is that the direction of a packet 212 may be reversed at most once along a given ARC. 214 Collapsed ARC: An ARC that is formed of a single node. This node is 215 altogether the Cursor and both Edge Nodes. This implies that the 216 node has at least 2 outgoing links to 2 different Safe Nodes. 218 | 219 | 220 V 221 C+EAN 222 /|\ 223 / | \ 224 | V | 225 V V 227 E: Edge ARC Node -| collapsed in a single node 228 C: Cursor -| 230 Figure 1: Collapsed ARC 232 Infrastructure ARC: An ARC that is formed of more than one node, 233 which also means that the Edge Nodes are different nodes. 235 | \ | | 236 | \ | | | 237 V V V | 238 _->IAN<---->IAN<---->IAN<---->IAN<-_ | 239 / + C \ | 240 / \| 241 V V 242 EAN EAN 243 | /|\ 244 | / | \ 245 | | V | 246 V V V 248 IAN: Intermediate ARC Node -| 249 EAN: Edge ARC Node |- All are Safe Nodes 250 C: Cursor -| 252 Figure 2: Infrastructure ARC 254 DAG: Directed Acyclic Graph. 256 ARC Set (or Cascade): A DAG with ARCs as vertices. In the DAG, an 257 edge between ARC A and ARC B corresponds to a link from an Edge 258 ARC Node in ARC A and an arbitrary ARC Node in ARC B. Note that 259 by definition, an ARC has at least 2 outgoing Edge Links, one per 260 Edge Node, and maybe more if an Edge Node has multiple outgoing 261 Edge Links. All vertices in the DAG have 2 forwarding solutions, 262 even the ARC closest to the destination. 264 Omega: the abstract destination (== root) of an ARC Set. Omega is 265 also referred as a complex destination in that it typically 266 comprises more than one node and/or more than one link on a node. 267 if Omega has a single node, then the plural interfaces on that 268 node are considered as as many virtual node for the sake of the 269 ARC computation algorithm. 271 ARC Height: An arbitrary distance from Omega that is associated to 272 an ARC. The Height of an ARC must be more than the Height of any 273 of the ARCs it terminates into. The order of ARC formation by a 274 given algorithm can be used as a Height whereby an ARC is always 275 strictly higher or lower than another. 277 Buttressing ARC: A split ARC that is merged into another ARC at one 278 edge. An ARC and one or more Buttressing ARCs form a Comb 279 construct that is resilient to additional breakages. A 280 Buttressing ARC may be applied to an ARC or a Comb iff traffic 281 outgoing the Buttressing ARC Edge always reaches in an ARC that is 282 lower than this ARC, or Omega. 284 | \ | | 285 | \ | | | 286 V V V | 287 _->IAN<---->IAN<---->IAN<---->IAN<-_----->IAN<-_ 288 / + C \ | \ 289 / \| \ 290 V V V 291 EAN EAN EAN 292 | /|\ | 293 | / | \ | 294 | | V | | 295 V V V V 297 Figure 3: Comb with Buttressing ARC 299 Safe Node: A node is Safe if there is no single point of failure - 300 apart from the node itself - on its way to Omega. From this 301 definition, a node is Safe if it has at least two non-congruent 302 paths to two different other Safe Nodes. It results that a Safe 303 node that is not Omega has at least two completely disjunct paths 304 to Omega. When an ARC has been successfully constructed, all its 305 nodes become safe with respect to the Omega for which the ARC was 306 constructed. By extension for a collapsed path Omega is deemed to 307 be Safe, that is any node that pertains in Omega is a Safe Node. 309 ?-S: A node N is deemed dependent on a node S or S-dependent 310 (denoted as ?-S) if S is the last single point of failure along 311 N's shortest path to Omega. 313 3. ARC Set representations 315 An ARC Set can be represented in a number of fashions: 317 Graph View: 319 H2<==>H<==>H1<---I--->J1<==>J--->K1<===>K 320 | | | | | 321 | | | | | 322 V V V V V 323 D1<==>D<==>D3 E1<==>E F1<==>F<==>F2 G 324 | | | | | | / \ 325 | | | | | | / \ 326 V V V V V V V V 327 B1<==>B2<==>B3<==>B--->A<==>A1<------C2<==>C<==>C4 328 | | | | 329 | | | | 330 | V V | 331 +--------------------> Omega <-------------------+ 333 Figure 4: Routing Graph View 335 This representation is similar to a classical routing graph with 336 the peculiarity that some Links are marked reversible. An ARC is 337 represented as a sequence of reversible links. The node that 338 holds the Cursor is also indicated somehow. 340 ARC View: 342 +========I========+ 343 | | 344 | +====J====+ 345 | | | 346 +====H====+ | +=====K=====+ 347 | | | | | 348 +====D====+ +====E====+ +====F====+ +====G====+ 349 | | | | | | | | 350 +=========B=========+ | | +=========C=========+ 351 | | | | | | 352 | +======A=======+ | 353 | | | | 354 ------------------------------------------------------------------Omega 356 Figure 5: ARC Representation 358 This ARC representation abstracts a whole ARC as a single vertex. 359 An ARC ends in one or more other ARCs, but it has to be noted that 360 even if both edges of an ARC end in a same other ARC, it ends in 361 fact in 2 different nodes, or Omega. This is turn can be 362 represented as a DAG as described in Paragraph 3. 364 Collapsed DAG view: 366 +====+ +====+ +====+ +====+ 367 | H | <--- | I | ---> | J | ---> | K | 368 | \__ | ___/ | 369 | \ | / | 370 V _| V |_ V 371 +====+ +====+ +====+ +====+ 372 | D | | E | | F | <--- | G | 373 \ \ __/ \__ __/ \__ / / 374 \ \ / \ / \ / / 375 _| _| |_ _| |_ _| |_ |_ 376 +====+ +====+ +====+ 377 | B | ---> | A | <--- | C | 378 | | | | 379 V V V V 380 ------------------------------------------------------------------Omega 382 Figure 6: ARC DAG 384 In the DAG representation, an ARC is abstracted as a vertex and 385 links between ARCs are shown as directed edges. This way, the 386 reversible links are omitted and the graph is simplified. It can 387 be noted that even the vertex closest to Omega has 2 non-congruent 388 forwarding solutions, that is Heir Links to Omega. 390 4. Lowest ARC First 392 The open Lowest ARC First(oLAF) algorithm is presented below in such 393 a way as to help the reader figure how an ARC Set can be obtained but 394 not in a computer-optimized fashion that is left to be determined. 395 oLAF is based on Dijkstra's algorithm for Shortest Path First (SPF) 396 computation, and is designed in such a fashion that the reverse SPF 397 tree towards a destination is conserved and preferred for forwarding 398 along the resulting ARC Set. 400 We make the computation on behalf of Omega, that is an abstraction, 401 but could represent the node or the set of nodes that we want to 402 reach with an ARC Set. If Omega is instantiated as an actual 403 destination node, then that node may be a fine location for an ARC 404 Computing Engine. 406 4.1. Init 408 So we start with an proverbial Initial Set of Nodes that are 409 interconnected by Links, and Omega that is the destination that we 410 want to reach with an ARC Set. 412 If there is no Heir, we're done. If there is a single Heir then the 413 graph is mono-connected, so we restart the computation taking that 414 Heir off the Set of Nodes and making it Omega. 416 Else, if Omega is a single Node, or if Omega is composed of multiple 417 nodes but we are willing to accept that both ends of an ARC terminate 418 in a same node in Omega, then we create virtual Omega nodes, a 419 minimum of two and at most one per Heir, and we make them the new 420 Omega. Note: we need at least two destinations because both ends of 421 an ARC cannot terminate in a same node. 423 Now we can start building an ARC Set towards the resulting Omega. 425 In this process, we create so-called Dependent Sets of nodes, each 426 owned by a Safe Node S, DSet(S). DSet(S) contains nodes that are not 427 determined to be Safe at the current stage of the computation and for 428 which S, the owner Safe Node, is the last single point of failure on 429 the shortest path tree to Omega. It results that a given node can be 430 at most in one DSet, and that a Safe Node belongs to its own DSet. 432 For each node S in Omega we create a DSet(S) in which we place S. 434 4.2. Growing Trees 436 And then the process goes like this: 438 We select the node in the Set of Nodes that is closest to Omega using 439 the cost towards Omega as if we were building a traditional reverse 440 SPF tree and we place the selected node in the same Dependent Set as 441 its parent in the reverse SPF tree. Note that for a Heir, the parent 442 might be a real node in Omega, or a virtual Omega node. 444 If we kept it at that, we would be building subtrees that are hanging 445 off a Safe Node and together would represent the reverse shortest 446 path tree towards Omega, each subtree being grown separately inside 447 DSet(S) where S is the (virtual) Safe node that is the root of the 448 subtree. 450 4.3. Being Safe 452 But once we have placed the selected node in a DSet, we consider its 453 neighbors one by one. If at least one of the neighbors is already in 454 a different DSet than this node, we select the neighbor that provides 455 the shortest alternate path to Omega for the selected node. 457 Doing so, we have isolated two paths: 459 o one along its own shortest path that is contained within its own 460 Dependent Set and that leads to the owner Safe Node of this set. 462 o and one via the selected neighbor, along its own shortest path 463 within the selected neighbor's Dependent Set and that leads to the 464 owner Safe Node of that other set. 466 Because the two sets are different and have no intersection, these 467 paths are non-congruent. And because the two non-congruent paths 468 lead to two different Safe Nodes, this node is Safe. 470 It might happen that: 472 o the selected node's parent is already a Safe Node, in which case 473 the selected node is the Edge AN on its shortest path side. 475 o It might also happen that the selected neighbor is already a Safe 476 Node, in which case selected node is the Edge AN on its alternate 477 side. 479 If both conditions are met for a same AN, then that AN forms a 480 collapsed ARC by itself. 482 4.4. Bending An ARC 484 Now we form an ARC as follows: 486 o A height is attributed to this ARC that must be strictly more than 487 that of the ARCs it terminates into, if any. The order in which 488 the ARCs are built may be used in some cases. 490 o The ARC terminates in the two Safe Nodes that are the owners of 491 the two DSets. The normal behaviour is to make a Edge Link the 492 link to the Safe Node. 494 o If the Safe Node at one end forms a collapsed ARC by itself, it 495 may be absorbed in the ARC in order to build a multi-edged ARC. 497 o If one of the two Safe Nodes pertains in a ARC or a Comb construct 498 that is higher than the other end, then this ARC may be merged at 499 the Safe Node with its original ARC, in order to form a Comb 500 construct whereby this ARC is a Buttressing ARC of the Comb. The 501 resulting Comb conserves the height on the original ARC or Comb 502 that it extends. 504 o The ARC is built by adjoining the two non-congruent paths that we 505 isolated for the selected node. 507 o The selected node is the node farthest from Omega in the resulting 508 ARC, so we make it the Cursor. 510 o The link between the selected node and the selected neighbor would 511 not have been used in a classical reverse SPF tree. Here, we have 512 determined that this link is in fact critical to connect 2 zones 513 of the network (the DSets) that can act as a back up for one 514 another in case of the failure of their respective single points 515 of failure (the Safe Nodes). 517 o Because the ARC can be used in both directions, each AN along the 518 ARC has two non-congruent paths to the Safe Nodes that the ARC 519 terminates into. So it is a Safe Node. We create individual 520 DSets for each AN and we move the AN to its own DSet. 522 4.5. Orienting Links 524 For each ARC Node along the ARC: 526 o any link (there can be zero for a collapsed ARC, one for an Edge 527 AN or two of them for a Intermediate AN) between this AN and a 528 next AN along this ARC is made an Intermediate Link, that is, 529 reversible. The normal direction, away from the Cursor, preserves 530 the shortest path. 532 o If this AN is an Edge AN for this ARC, than all links off this 533 node that terminate in a Safe Node are made Edge Links, that is, 534 outgoing but not reversible. 536 o All the other links left undetermined. 538 The nodes left in the Dependent Sets but the owner Safe Node are 539 still not Safe. They are moved back to the original Set of Nodes to 540 enable forming additional ARCs which might depend on this ARC in the 541 ARC Set. 543 4.6. Looping or recursing 545 We are done processing the particular node we had picked in the 546 original Set of Nodes. If the Set of Nodes as it stands now is not 547 empty, we continue from Section 4.2. 549 If the Set of Nodes went empty, we are done with this pass and we 550 consider the Dependent Sets that we have put together. In a 551 biconnected graph, there should be one set per node and one node per 552 set, denoting that every node is a Safe Node. 554 If some portion of the graph is mono-connected, then each mono- 555 connected portion forms the Dependent Set of the Safe Node that is 556 its single point of failure. In order to be maximally redundant, we 557 need to form the ARCs again, within the Dependent Set. 559 To do so, we remove the Safe Node from the Dependent set and make it 560 Omega. We make the resulting DSet our Set of Nodes and run the 561 algorithm again. 563 This may recurse a number of times if the graph has mono-connected 564 zones within others. 566 5. Forwarding Along An ARC Set 568 Under normal conditions, the traffic flows away from the Cursor of 569 the current ARC and cascades into the next ARC on that side of the 570 Cursor, with the Height of the current ARC decreasing monotonically 571 from ARC to ARC till Omega is reached. 573 The same goes for a generic Comb construct. When Buttressing ARCs 574 are applied on a main ARC or other Buttressing ARCs, the final 575 construct assumes the shape of a tree. The tree may be walked in 576 different manners but the shortest path requires to start going down 577 the current ARC or Buttressing ARC to its Edge. 579 In case of Label forwarding, the same recursive technique is applied 580 and a multiple ARC label path is constructed. Each ARC has is own 581 set of label path per Omega, each ARC Set label path being merged 582 into the lower ARC label set, thus at the interconnection point. At 583 minimum, ARC label path should be built from the Cursor toward each 584 edge, but this would require label path recompilation upon Cursor 585 move, the proposed approach is then to build for the normal flow to 586 an Omega one pair of label path from edge to edge. 588 As this label construct maps the ARC topology with local significant 589 label, the Label Distribution Protocol (LDP) could be reused to 590 announce label association to neighbors on the ARC. 592 Upon a breakage inside an ARC, until a corrective action takes place, 593 some traffic will be lost. The corrective action might be either 594 operated at the control plane or the data plane, if immediate action 595 and near-zero packet loss is required. 597 5.1. Control Plane Recovery 599 Upon a first breakage in an ARC, the Cursor is moved to the breakage 600 point, either a node or a link, so that traffic flows away from the 601 Cursor again. 603 Upon a second breakage within a same ARC, a segment of the ARC is now 604 isolated. Both breakage points become sinks till an additional 605 corrective action, such as modifying the ARC Set, takes place. All 606 incoming links in the isolated segment are blocked , causing the 607 traffic to exit at the other end of the incoming ARCs. 609 Blocking an Edge Link in the incoming ARC may create an isolated 610 segment in the incoming ARC as well if it is a second breakage there 611 too, or if both edges of the incoming ARC terminate in the broken 612 segment. In that case the process recurses and the broken zone can 613 be determined as the collection of the isolated segments. 615 If a segment of an ARC is getting isolated by a dual failure but that 616 ARC segment has incoming Edges then the ARC can be reversed. This 617 reversal is done by reversing of all the incoming Edges, which become 618 outgoing. The segment that was isolated now benefits from multiple 619 exits in a loop free fashion. This process might in turn isolate a 620 segment of an ARC that was incoming and the process recurses and some 621 links flap. If a real exit exits the process will stabilize, but a 622 count to infinity must be put in place to avoid a permanent flapping 623 when a whole ARC Subset is physically isolated. One may consider 624 that this process is in fact the classical link reversal technique, 625 as applied to the DAG of ARCs. 627 5.2. Data Plane Recovery 629 Upon a breakage inside an ARC, it is possible in the data plane to 630 reverse the direction of -to turn- a given packet once along the ARC 631 so the packets exits over the other Edge Link. But in order to avoid 632 loops, it is undesirable to reverse the direction of a given packet a 633 second time. 635 Note that once a given packet leaves an ARC to enter the next, it is 636 free to bounce again in the next ARC. In other Words, the domain 637 that is impacted by a turn is limited to the current ARC itself; the 638 ARC forms the event horizon wherein the notion that a turn happened 639 may cause a loop. 641 So a local strategy must be put in place inside an ARC to allow a 642 given packet to bounce once upon a breakage, and get dropped upon a 643 second breakage. 645 In the case of IP packet forwarding, a packet can be tagged when it 646 bounces inside an ARC, or when it passes the Cursor, for instance by 647 reserving a TOS bit for that purpose. When the packet bounces, the 648 bit is set and when the packet leaves the ARC, the bit is reset and 649 may be used again in the next ARC. In the generic case of a Comb, a 650 strategy must be put in place to walk the structure and drop a packet 651 that tries all the Edges. it attempts to pass the Cursor twice in a 652 same direction, meaning that more than a full walk was already 653 accomplished. 655 5.2.1. Label Switched ARCs 657 In the case of MultiProtocol Label Switching (MPLS) forwarding, the 658 same result can be achieved with Label Switched ARCs (LSARCs), that 659 are composed of either 3 or 4 Labels Switched Paths (LSPs) along the 660 ARC. 662 3-Labels method: In this case we lay a primary LSP from the cursor 663 to the Edge in each direction, and a backup LSP Edge to Edge in 664 each direction. So a node along the way has three labels, one 665 primary and two backup, one in each direction. Should the primary 666 path fail, the packet can be placed along the backup LSP in the 667 other direction. We'll note that this method constraints the 668 location of the Cursor. Should the Cursor move, The primary LSPs 669 have to be recomputed, at a minimum between the old and the new 670 location of the Cursor where the direction is reversed. 672 4-Labels method: In this case we have a primary and a backup LSPs in 673 each direction all of them Edge to Edge, 4 labels total. The 674 labels are independent of the location of the Cursor, so the 675 Cursor can be moved from a node to the next in control plane with 676 no impact on labels. This method consumes an additional label but 677 is more amenable to load balancing techniques and allows each node 678 that inject a packet inside an ARC to make its own decision of the 679 exit edge for a given packet or flow. 681 5.2.2. Segment Routed ARCs 683 In the case of an infrastructure that is capable of Segment Routing 684 (SR) [I-D.ietf-spring-segment-routing], the tag in the packet is in 685 essence a Routing Header (RH) via the cursor. The RH forces routing 686 to the destination all the way back up the broken ARC and then down 687 on the other end. via the cursor of a broken ARC. 689 Upon a breakage, the node detecting the failure reroutes the packet 690 towards the other edge, which means going backwards up to the cursor, 691 and following normal routing from there. 693 The Routing Header may indicate, as consumed, an entry that points on 694 the broken edge, if that is necessary for the cursor to figure out 695 which is the broken edge so as to route towards the other edge. 697 5.3. Flooding 699 ARCs probably apply to both unicast and multicast traffic, as 700 illustrated by [I-D.thubert-rtgwg-arc-bicast]. In particular, ARCs 701 enable a redundant flooding of a packet. The flooded packet is 702 injected at all edges ending in Omega, and from there swims upriver 703 along the reverse ARC direction. The packet is then forwarded from 704 the incoming edge of the ARC to the other edge where it is absorbed. 705 On the way along the ARC, the packet is copied into all the ARCs that 706 terminate in this ARC where the process recommences. 708 Since a packet is finally injected from both edges of any ARC, it 709 should get to all nodes in an ARC even if there is one breakage in 710 that ARC. in normal conditions, at least two copies of the packet 711 circulate in an ARC, one in each direction, and a mechanism should be 712 put in place to make sure that only one copy is injected in an 713 incoming edge. 715 6. ARC Signaling 717 6.1. Serial ARC Representation 719 A single ARC can be serialized as the sets of endpoints at both edges 720 and the ordered list of nodes in the ARC between the edges. Since 721 the endpoints are effectively nodes in downstream ARCs, the set of 722 all serialized ARCs provides a full description of the topology. 724 6.2. Centralized vs. Distributed computation 726 An ARC set can be computed with a slightly altered Shortest Path 727 First algorithm, as further explained in Section 4. It results that 728 any node, or all nodes participating to a Link State protocol, may 729 learn the topology and compute an ARC Set. If all nodes compute the 730 topology on their own and asynchronously, micro-loops will follow 731 till the network converges. 733 It makes more sense to limit the computation of an ARC Set to 734 specific nodes, typically a Path Computation Element (PCE), a Network 735 Management Entity (NME), or nodes in Omega. This is typically what 736 happens in a Software Define Networking (SDN) environment. 738 6.3. ARC Topology Injection 740 Regardless of the central entity that computes the ARC set, the new 741 or updated ARC Set is serialized in a control message and flooded 742 over itself from Omega as described in Section 5.3. 744 The new ARC Set can be used as soon as it is received, in the 745 direction from which it is received, since a path along nodes in that 746 direction exists already, through the nodes that forwarded the 747 control messages. The full ARC redundancy is only available when a 748 control message has been received along an ARC in both directions. 749 In that model, there is no micro-loop. 751 6.4. ARC Operations, Administration, and Maintenance 753 Operations, Administration, and Maintenance (OAM) frames are used 754 within an ARC and flow periodically or asynchronously from an edge to 755 the other. Such frame may carry indications such as a breakage or a 756 congestion, and may be used to control the load balancing, or link 757 reversal operations. 759 7. Other ARC Operations 761 7.1. Node-Local vs. ARC-Wide reaction 763 ARCs enable forwarding plane reactions to breakages. In the simple 764 case of a single breakage in an ARC, the reaction can be immediate to 765 the discovery of the breakage and consists in rerouting the packet 766 towards the other edge across the cursor, as explained in 767 Section 5.2. 769 More complex situations require the coordination of all the nodes 770 along an ARC. For instance, load balancing requires the knowledge of 771 the congestion level at multiple points along the ARC, whereas the 772 solution to the Shared Risk Link Group (SRLG) problem discussed in 773 Section 7.3 requires all incoming edges in an isolated ARC segment to 774 be blocked before they can be returned. For such ARC-Wide 775 coordinated reactions, OAM frames are necessary to enable forwarding 776 plane rapid reactions. 778 7.2. Load Balancing 780 In normal conditions, only the Cursor may distribute its traffic 781 between the two Edge Nodes. If an Edge Node is still congested after 782 the Cursor forwards all its traffic towards the other Edge Node, then 783 the Cursor can be moved towards the congested Edge in order to derive 784 even more traffic towards the other Edge. If both Edges are 785 congested, then a back-pressure can be applied on the incoming ARCs 786 so that they move their own traffic towards their own alternate Edge. 787 The process may recurse. 789 It is expected that control frames similar to those defined for MPLS 790 Fault Management Operations, Administration, and Maintenance (OAM) 791 [RFC6291] will echo from Edge Node to Edge Node provide information 792 such as liveliness and load. In order to establish a control loop 793 between the Edge Nodes and the Cursor, the OAM frame would carry at 794 least a logical information whether: 796 The Edge Node is capable of forwarding data down to the next ARC 798 the load may be increased (e.g. rate below threshold including 799 hysteresis) 801 the load should be decreased (e.g. congestion observed as 802 increased latency or buffer bloat) 804 If the load should be decreased towards of congested Edge Node and 805 the load may be increased towards the other then the Cursor may 806 adjust its balancing of the load, or move Cursor ownership towards 807 the congested Edge if it is already redirecting all the traffic 808 towards the non-congested Edge. 810 If the Cursor is balancing traffic away from the default position due 811 to a past congestion notification and the Edge that was congested now 812 reports that the load may be increased, then the reverse operation 813 can happen and the Cursor may balance the load back to the original 814 position taking the reverse steps as above. 816 If the OAM can not be forwarded due to a link or a node failure, then 817 the last node towards the broken Edge becomes Cursor and echoes the 818 OAM frames advertising that it is an Edge node that is blocked, not 819 capable of forwarding data down to the next ARC. 821 If both Edges are experiencing a congestion then the condition should 822 be reported to the Edge Nodes of all incoming ARCs. Same goes when 823 both Edges are blocked. 825 7.3. Shared Risk Link Group 827 Essentially, the Shared Risk Link Group (SRLG) problem is that a 828 physical breakage may end up breaking more than one apparently 829 unrelated IP links. such a breakage may end up breaking an ARC in 830 more than one place, effectively creating isolated segments. 832 The basic approach to solve that problem is the classical link 833 reversal technique. Since ARCs form a DAG as illustrated in 834 Figure 6, it is possible to return all the incoming edges in an 835 isolated ARC segment so that traffic that circulates inside the 836 segment is actually fed back in incoming ARCs. The incoming ARCs are 837 considered broken on that edge so all the traffic is fed into the 838 other edge. If this causes the incoming ARC to be doubly broken, the 839 process recurses. in that incoming ARC. Over a number of iterations, 840 if there is an exit, it will be found and the traffic will be 841 funneled that way. If there is none, after a certain number of 842 iterations, the process counts to infinity and stops. 844 If the iterations are performed too quickly, the process may cause 845 micro-loops. OAM frames circulating within the broken segment can 846 solve that issue. On the way in, the OAM frame should block all 847 incoming ARCs, which effectively causes the edges to appear broken in 848 incoming ARCs. On the way back, the OAM frame returns the incoming 849 edges to be used the other way. 851 7.4. Olympic Rings 853 By Olympic Ring problem we mean how to optimally reuse the multiple 854 path opportunities that interconnecting 2 rings enable. ARCs can 855 simply be deployed inside a ring A to reach a connected ring B by 856 installing an ARC between adjacent interconnections on the rings. If 857 the rings only connect at one point, there is a single ARC going all 858 the way around the ring, with the cursor at the far side. If there 859 are more than one interconnection, then you always end up with as 860 many ARCs as there are interconnections. 862 A packet being forwarded inside a ring picks the side of the ring 863 that is away from the cursor, taking effectively the shortest path to 864 the next ring. If a hop is broken, then the packet is returned to 865 the other edge of the ARC, which is the adjacent interconnection 866 between the ARCs. 868 Note: There is no need in that model to artificially disable one hop 869 in the ring and re-enable it in case of breakage. 871 7.5. Routing Hierarchies 873 The ARC methods may be used to build and/or leverage routing 874 hierarchies, allowing high availability at multiple hierarchical 875 levels. In one hand, the view of an ARC Set can be simplified by 876 abstracting an ARC as a node in a DAG. The view of the routing 877 topology is thus simplified, as illustrated in Figure 6. 879 In the case of connected rings,abstracting a full ring as a node, 880 ARCs can be applied to a graph of rings, providing another level of 881 redundancy and an abstract end-to-end path computation, ring to ring 882 to ring. ARCs may be used to make that computation resilient as 883 well. 885 8. Manageability 887 This specification describes a generic model. Protocols and 888 management will come later 890 9. IANA Considerations 892 This specification does not require IANA action. 894 10. Security Considerations 896 This specification is not found to introduce new security threat. 898 11. Acknowledgments 900 The authors wishes to thank Dirk Anteunis, Stewart Bryant, IJsbrand 901 Wijnands, George Swallow, Eric Osborne, Clarence Filsfils and Eric 902 Levy-Abegnoli for their participation and continuous support to the 903 work presented here. 905 12. References 907 12.1. Normative References 909 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 910 Requirement Levels", BCP 14, RFC 2119, March 1997. 912 12.2. Informative References 914 [I-D.ietf-spring-segment-routing] 915 Filsfils, C., Previdi, S., Bashandy, A., Decraene, B., 916 Litkowski, S., Horneffer, M., Shakir, R., Tantsura, J., 917 and E. Crabbe, "Segment Routing Architecture", draft-ietf- 918 spring-segment-routing-00 (work in progress), December 919 2014. 921 [I-D.thubert-rtgwg-arc-bicast] 922 Thubert, P. and I. Wijnands, "Applying Available Routing 923 Constructs to bicasting", draft-thubert-rtgwg-arc- 924 bicast-01 (work in progress), October 2013. 926 [I-D.thubert-rtgwg-arc-vs-mrt] 927 Thubert, P., Enyedi, G., and S. Ramasubramanian, 928 "Available Routing Constructs", draft-thubert-rtgwg-arc- 929 vs-mrt-01 (work in progress), January 2014. 931 [RFC5921] Bocci, M., Bryant, S., Frost, D., Levrau, L., and L. 932 Berger, "A Framework for MPLS in Transport Networks", RFC 933 5921, July 2010. 935 [RFC6291] Andersson, L., van Helvoort, H., Bonica, R., Romascanu, 936 D., and S. Mansfield, "Guidelines for the Use of the "OAM" 937 Acronym in the IETF", BCP 161, RFC 6291, June 2011. 939 Authors' Addresses 941 Pascal Thubert (editor) 942 Cisco Systems, Inc 943 Building D 944 45 Allee des Ormes - BP1200 945 MOUGINS - Sophia Antipolis 06254 946 FRANCE 948 Phone: +33 497 23 26 34 949 Email: pthubert@cisco.com 951 Patrice Bellagamba 952 Cisco Systems 953 214 Avenue des fleurs 954 Saint-Raphael 83700 955 FRANCE 957 Phone: +33.6.1998.4346 958 Email: pbellaga@cisco.com