Network Working Group                                            H. Chen
Internet-Draft                                                 Futurewei
Intended status: Standards Track                                  M. Toy
Expires: September 21, 2022                                      Verizon
                                                                 A. Wang
                                                           China Telecom
                                                                  L. Liu
                                                                 Fujitsu
                                                                  X. Liu
                                                          Volta Networks
                                                          March 20, 2022

                   IGP for Network High Availability
                   draft-chen-lsr-ctr-availability-04

Abstract

   This document describes protocol extensions to OSPF and IS-IS for
   improving the reliability or availability of a network controlled by
   a controller cluster.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 21, 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.  Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminologies
   3.  IGP for Controller Cluster Reliability
     3.1.  Overview of Mechanism
     3.2.  Example
   4.  Extensions to IGP
     4.1.  Extensions to OSPF
     4.2.  Extensions to IS-IS
   5.  Recovery Procedure
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   A growing number of networks are controlled by central controllers
   or controller clusters.  Externally, a controller cluster appears as
   a single controller.  Internally, it normally consists of two or
   more controllers working together to control the network, i.e.,
   every network element (NE) in the network.  The reliability or
   availability of a network therefore depends heavily on its
   controller cluster.
   Issues or failures in the controller cluster may greatly impact the
   reliability or availability of the network.

   For a controller cluster comprising two or more controllers (i.e.,
   a primary controller, a secondary controller, and so on), failures
   in the cluster may split the cluster into several separate
   controller groups.  These groups are unaware of one another and may
   be out of synchronization.  Two or more groups may be elected to
   control the network at the same time, which may cause problems such
   as conflicting control of the network.

   This document proposes procedures and extensions to OSPF and IS-IS
   that allow the separated controllers or controller groups to learn
   about each other and thus correctly elect one new primary controller
   or controller group when the cluster is split because of failures in
   the cluster.

2.  Terminologies

   The following terms are used in this document.

   IGP:  Interior Gateway Protocol

   OSPF:  Open Shortest Path First

   IS-IS:  Intermediate System to Intermediate System

   LSA:  Link State Advertisement in OSPF

   LSP:  Link State Protocol Data Unit in IS-IS

   PDU:  Protocol Data Unit

   LS:  Link State, which is an LSA in OSPF or an LSP in IS-IS

   NE:  Network Element

   CE:  Customer Edge

   PE:  Provider Edge

3.  IGP for Controller Cluster Reliability

   This section outlines the mechanism that uses an IGP to improve the
   reliability or availability of a controller cluster, and illustrates
   some details through a simple example.

3.1.  Overview of Mechanism

   When a cluster of controllers is split into several separate groups
   because of failures in the cluster, the live controllers are still
   connected to the network (i.e., to network elements).  Through some
   of these connections, each group can obtain information about the
   other groups.  Based on this information, a new primary controller
   or controller group is correctly elected to control the network.

   Each controller may include an IGP instance that acts as an
   information proxy, called the IGP information proxy or IGP for
   short.  The IGP maintains an IGP adjacency with each of a given
   number of NEs (for example, one NE) in the network.  When an
   adjacency is broken, a new adjacency is created and maintained if
   possible, so that the given number of adjacencies is retained.

   In normal operations, all controllers in the cluster are connected.
   They are the primary controller, which controls the network, the
   secondary controller, and so on.  They have current positions 1, 2,
   and so on, respectively.  The primary controller advertises the
   information about the controllers via its IGP adjacencies, using the
   extensions to IGP defined in Section 4.

   When the cluster is split into several separate groups, each group
   elects an intent primary controller, an intent secondary controller,
   and so on from the group, which have intent positions 1, 2, and so
   on, respectively.  The intent primary controller advertises the
   information about the controllers in the group.

   The information advertised by the (intent) primary controller
   includes its current (intent) position, its old position, its
   priority to become a primary controller, the number of controllers,
   and the IDs of the controllers, ordered according to their (intent)
   positions.  In addition, a flag C is included, which indicates
   whether the advertising controller is controlling the network (i.e.,
   whether it is the current primary controller rather than only an
   intent primary controller).

3.2.  Example

   Figure 1 shows a controller cluster comprising two controllers: the
   primary controller and the secondary controller.  Each controller
   includes an IGP as an information proxy.

     +---------------------------------------------------+
     |                Controller Cluster                 |
     |                                                   |
     |    +------------+               +------------+    |
     |    |Controller A|  Synchronize  |Controller B|    |
     |    |(Primary)   +---------------+(Secondary) |    |
     |    |       [IGP]|               |       [IGP]|    |
     |    ++-----------+               +-----------++    |
     |     |           ^                           |     |
     |     |           |_______________            |     |
     |     |                           |           |     |
     |     |                           v           |     |
     +-----|------------Control Channels-----------|-----+
           |             /         \               |
           |IGP Adj     /           \____          |
            \          /             \   \____     |IGP Adj
             \____    /\  .---.   .---+       \    |
                  \  |  \(     '      |'.---.  |   |
                   \ |---\   Network  |       '+.  |
               NE1  (o    \           |        | ) /
                   (       |          |        o) NE4
                   (       |          |         )
                    (      o NE2      o NE3.-'
                     '                      )
                      '---._.-.            )
                               '---'

              Figure 1: Controller Cluster of 2 Controllers

   The IGP in a controller has one IGP adjacency with one NE in the
   network.  In Figure 1, the IGP in Controller A has an IGP adjacency
   with NE1, and the IGP in Controller B has an IGP adjacency with NE4.

   In normal operations, the IGP of the primary controller originates a
   link state (LS) containing the information about the controllers
   connected to it.  The LS originated by Controller A (Primary) in
   Figure 1 has the following contents:

      C = 1, A's current Position = 1, A's OldPosition = 1, A's
      Priority, NoControllers = 2, A's ID, B's ID

   When failures happen in the cluster, the live controllers act as
   follows:

   If the secondary controller (e.g., B) is alive and the primary
   controller is dead, the secondary controller promotes itself to be
   the new primary controller.  If the primary controller is alive but
   separated from the secondary controller, the secondary controller
   does not promote itself to be a new primary controller.

   If the primary controller (e.g., A) is alive, it continues to be the
   primary controller.

   With the extensions to IGP, the secondary controller can determine
   the status of the primary controller by using its IGP to obtain the
   information about the primary controller.
   The situation in which the primary controller is alive but
   separated from the secondary controller consists of two conditions:
   (a) the connection between the primary controller and the secondary
   controller in the cluster has failed, and (b) both controllers are
   alive.  The secondary controller determines these conditions as
   follows:

   For condition a, when the heartbeat from the primary controller
   stops, the secondary controller knows that the connection between
   the primary controller and the secondary controller has failed.

   For condition b, the secondary controller checks the link state
   database (LSDB) of its IGP to see whether the IGP of the primary
   controller is still connected to some network elements and still
   advertises its LS.  If so, the primary controller is alive;
   otherwise, it is dead.

4.  Extensions to IGP

   This section describes extensions to OSPF and IS-IS.

4.1.  Extensions to OSPF

   A new TLV, called the OSPF Controllers TLV, is defined.  When OSPF
   acts as a proxy of a controller in a cluster, it may advertise the
   information about the controllers, such as the number of controllers
   connected to it (including itself), in its Router Information LSA,
   which contains a Controllers TLV of the following format.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Type (TBD1)          |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Flags    |C|   Position    |  OldPosition  |   Priority    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Reserved                    | NoControllers |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Controller 1 ID                        |
   :                                                               :
   |                        Controller n ID                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 2: OSPF Controllers TLV

   Type:  TBD1 is to be assigned by IANA.

   Length:  It indicates the length of the value portion in octets.

   Flags (8 bits):  One flag bit, the C-bit, is defined.  When set, it
      indicates that the position is the position of the current
      active primary controller.  In this case, C = 1 and Position = 1,
      which together indicate that the controller is the current active
      primary controller controlling the network.

   Position (8 bits):  It indicates the current/intent position of the
      controller in the controller cluster or group.  1: primary
      (first) controller, 2: secondary controller, 3: third controller,
      and so on (i.e., a Position of value n indicates the n-th
      controller in the cluster or group).

   OldPosition (8 bits):  It indicates the old position of the
      controller in the controller cluster before the cluster was
      split.

   Priority (8 bits):  It indicates the priority of the controller to
      be elected as a primary controller.

   Reserved (24 bits):  Reserved field; it must be set to zero on
      transmission and ignored on reception.

   NoControllers (8 bits):  It indicates the number of controllers
      connected to the controller advertising the TLV.

   Controller i ID (32 bits):  It represents the identifier (ID) of
      controller i at position i (i = 1, ..., n) in the cluster or
      group.

   When the information about the controllers changes, the OSPF of a
   primary controller originates an OSPF Router Information Opaque LSA,
   which includes an OSPF Controllers TLV.

4.2.  Extensions to IS-IS

   Similar to OSPF, a new TLV, called the IS-IS Controllers TLV, is
   defined.  When IS-IS acts as a proxy of a controller in a cluster,
   it may advertise the information about the cluster, such as the
   number of controllers connected to it (including itself), in its
   LSP, which contains an IS-IS Controllers TLV of the following
   format.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Type (TBD2)  |    Length     |    Flags    |C|   Position    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  OldPosition  |   Priority    | NoControllers |   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Controller 1 ID                        |
   :                                                               :
   |                        Controller n ID                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 3: IS-IS Controllers TLV

   Type (8 bits):  TBD2 is to be assigned by IANA.

   Length (8 bits):  It indicates the length of the value portion in
      octets.

   All other fields:  The meaning of each of the other fields is the
      same as that of the corresponding field in the OSPF Controllers
      TLV defined above.

   When the information about the controllers changes, the IS-IS of a
   primary controller originates an LSP, which includes an IS-IS
   Controllers TLV.

5.  Recovery Procedure

   This section describes the recovery procedure for a controller
   cluster of n (n > 2) controllers: the primary controller A, the
   secondary controller B, ..., and the n-th controller N.

   When failures happen in the cluster, it may be split into several
   separate groups of controllers.  Under one policy, the group with
   the maximum number of controllers is responsible for controlling the
   network as the primary group of the cluster, and the new primary
   controller, secondary controller, and so on are elected within that
   group.

   For each separated group of controllers, the intent primary
   controller, intent secondary controller, and so on are elected.  The
   intent primary controller of the group advertises the information
   about the group through its IGP.
   The information includes its intent position, its old position, its
   priority to become a primary controller, the number of controllers
   in the group, and the identifiers of the controllers in the group.
   The identifiers are ordered according to the controllers'
   positions: the identifier of the intent primary controller, which
   has position 1, is first; the identifier of the intent secondary
   controller, which has position 2, is second; and so on.  Thus every
   separated group has the information about the other groups and can
   determine which group has the maximum number of controllers.

   In the case of a tie (i.e., two or more groups have the same maximum
   number of controllers), under one policy the group with the highest
   priority controller wins.  Under another policy, the group
   containing the controller with the highest old position (e.g., the
   old primary controller, whose OldPosition is 1) wins.

   The recovery procedure performed by the current or intent primary
   controller of a controller cluster or group is as follows.

   In normal operations, it advertises a Controllers TLV containing:

      C = 1, Position = 1, OldPosition = 1, Primary Controller's
      Priority, NoControllers = n, Primary Controller's ID, Secondary
      Controller's ID, ..., and n-th Controller's ID.

   When failures cause the cluster to split, it advertises a
   Controllers TLV containing:

      C = 0, Position = 1, OldPosition = its old position (1 if it was
      the primary controller before the split), Intent Primary
      Controller's Priority, NoControllers = m (where m is the number
      of controllers in the group to which the intent primary
      controller is connected after the failures), Intent Primary
      Controller's ID, and the IDs of the other controllers connected.

   Then, after a given time, it checks whether its group has been
   elected as the primary group.  If so, it advertises a Controllers
   TLV containing:

      C = 1, Position = 1, OldPosition = its old position, its
      Priority, NoControllers = m, and the IDs of the controllers in
      the group.
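
   The group election described in this section can be sketched as
   follows.  This is a minimal illustration, not part of the protocol
   definition: the names GroupAdvert and elect_primary_group, the
   tie_break parameter, and the numeric controller IDs and Priority
   values are hypothetical.  The sketch assumes that a numerically
   larger Priority means a higher priority, that the "highest old
   position" is the numerically smallest OldPosition value, and that a
   group is represented by the Priority and OldPosition advertised by
   its intent primary controller.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GroupAdvert:
    """Controllers TLV contents advertised by one intent primary controller.

    Hypothetical container; field names mirror the TLV fields.
    """
    position: int              # intent position of the advertiser (1 = primary)
    old_position: int          # position before the cluster was split
    priority: int              # priority to become the primary controller
    controller_ids: List[int]  # IDs ordered by intent position

    @property
    def no_controllers(self) -> int:
        # NoControllers: number of controllers in the group.
        return len(self.controller_ids)


def elect_primary_group(adverts: List[GroupAdvert],
                        tie_break: str = "old_position") -> GroupAdvert:
    """Pick the group that should control the network.

    The group with the most controllers wins. Ties are broken by one of
    the two policies described in the text.
    """
    if not adverts:
        raise ValueError("no group advertisements to compare")
    largest = max(a.no_controllers for a in adverts)
    candidates = [a for a in adverts if a.no_controllers == largest]
    if tie_break == "priority":
        # Assumption: a numerically larger Priority is a higher priority.
        return max(candidates, key=lambda a: a.priority)
    if tie_break == "old_position":
        # Assumption: OldPosition 1 (old primary) ranks highest.
        return min(candidates, key=lambda a: a.old_position)
    raise ValueError(f"unknown tie-break policy: {tie_break}")


if __name__ == "__main__":
    # Two groups of two controllers each; IDs and priorities are made up.
    group1 = GroupAdvert(position=1, old_position=1, priority=10,
                         controller_ids=[0x0A, 0x0C])
    group2 = GroupAdvert(position=1, old_position=2, priority=20,
                         controller_ids=[0x0B, 0x0E])
    winner = elect_primary_group([group1, group2])
    print("primary group led by controller", hex(winner.controller_ids[0]))
```

   Under the old-position policy, a group led by the old primary
   controller (OldPosition = 1) wins a tie against a group of the same
   size led by the old secondary controller.  Under the priority
   policy, the outcome of a tie depends only on the advertised Priority
   values.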

   As an example, suppose that failures split the cluster into two
   separate groups: group 1, consisting of A and C, and group 2,
   consisting of B and N.  Each group elects its intent primary
   controller, intent secondary controller, and so on.  Suppose that
   controllers A and C are elected as the intent primary and secondary
   controllers, respectively, in group 1, and that controllers B and N
   are elected as the intent primary and secondary controllers,
   respectively, in group 2.

   Each of the intent primary controllers A and B advertises the
   information about the controllers in its group.  The information
   advertised by A includes:

      C = 0, Position = 1, OldPosition = 1, A's Priority,
      NoControllers = 2, A's ID, C's ID.

   The information advertised by B includes:

      C = 0, Position = 1, OldPosition = 2, B's Priority,
      NoControllers = 2, B's ID, N's ID.

   Groups 1 and 2 have the same number of controllers, which is 2.
   However, the OldPosition advertised for group 1 (1, the old primary
   controller) ranks higher than that advertised for group 2 (2, the
   old secondary controller).  Under the old-position tie-breaking
   policy, group 1 is therefore elected as the primary group, and the
   intent primary controller A in the primary group becomes the current
   primary controller.  After this determination, the information about
   the controllers in group 1 (i.e., the primary group) changes.  The
   updated information advertised by A includes:

      C = 1, Position = 1, OldPosition = 1, A's Priority,
      NoControllers = 2, A's ID, C's ID.

6.  IANA Considerations

   TBD

7.  Security Considerations

   TBD

8.  Acknowledgements

   TBD

9.  References

9.1.  Normative References

   [ISO10589]
              International Organization for Standardization,
              "Intermediate System to Intermediate System Intra-Domain
              Routing Exchange Protocol for use in Conjunction with the
              Protocol for Providing the Connectionless-mode Network
              Service (ISO 8473)", ISO/IEC 10589:2002, Nov. 2002.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
              DOI 10.17487/RFC2328, April 1998.

   [RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
              Engineering", RFC 5305, DOI 10.17487/RFC5305, October
              2008.

   [RFC5329]  Ishiguro, K., Manral, V., Davey, A., and A. Lindem, Ed.,
              "Traffic Engineering Extensions to OSPF Version 3",
              RFC 5329, DOI 10.17487/RFC5329, September 2008.

9.2.  Informative References

   [RFC4970]  Lindem, A., Ed., Shen, N., Vasseur, JP., Aggarwal, R., and
              S. Shaffer, "Extensions to OSPF for Advertising Optional
              Router Capabilities", RFC 4970, DOI 10.17487/RFC4970, July
              2007.

Authors' Addresses

   Huaimo Chen
   Futurewei
   Boston, MA
   USA

   Email: Huaimo.chen@futurewei.com


   Mehmet Toy
   Verizon
   USA

   Email: mehmet.toy@verizon.com


   Aijun Wang
   China Telecom
   Beiqijia Town, Changping District
   Beijing 102209
   China

   Email: wangaj3@chinatelecom.cn


   Lei Liu
   Fujitsu
   USA

   Email: liulei.kddi@gmail.com


   Xufeng Liu
   Volta Networks
   McLean, VA
   USA

   Email: xufeng.liu.ietf@gmail.com