Network Working Group                                           K. Ogawa
Internet-Draft                                           NTT Corporation
Intended status: Standards Track                              W. M. Wang
Expires: July 21, 2013                     Zhejiang Gongshang University
                                                           E. Haleplidis
                                                    University of Patras
                                                           J. Hadi Salim
                                                       Mojatatu Networks
                                                        January 17, 2013


                   ForCES Intra-NE High Availability
                       draft-ietf-forces-ceha-05

Abstract

   This document discusses CE High Availability within a ForCES NE.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 21, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Definitions
   2.  Introduction
     2.1.  Document Scope
     2.2.  Quantifying Problem Scope
   3.  RFC5810 CE HA Framework
     3.1.  Current CE High Availability Support
       3.1.1.  Cold Standby Interaction with ForCES Protocol
       3.1.2.  Responsibilities for HA
   4.  CE HA Hot Standby
     4.1.  Changes to the FEPO model
     4.2.  FEPO processing
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix I - New FEPO version
   Authors' Addresses

1.  Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

   The following definitions are taken from [RFC3654] and [RFC3746]:

   Logical Functional Block (LFB) -- A template that represents a
   fine-grained, logically separate aspect of FE processing.

   ForCES Protocol -- The protocol used at the Fp reference point in the
   ForCES Framework in [RFC3746].

   ForCES Protocol Layer (ForCES PL) -- A layer in the ForCES
   architecture that embodies the ForCES protocol and the state transfer
   mechanisms as defined in [RFC5810].

   ForCES Protocol Transport Mapping Layer (ForCES TML) -- A layer in
   the ForCES protocol architecture that specifically addresses the
   protocol message transportation issues, such as how the protocol
   messages are mapped to different transport media (like SCTP, IP, TCP,
   UDP, ATM, Ethernet, etc.), and how to achieve and implement
   reliability, security, etc.

2.  Introduction

   Figure 1 illustrates a ForCES NE controlled by a set of redundant CEs
   with CE1 being active and CE2 and CEn-1 being backups.

                          -----------------------------------------
                          |  ForCES Network Element               |
                          |                        +-----------+  |
                          |                        |  CEn-1    |  |
                          |                        |  (Backup) |  |
    -------------- Fc     | +------------+      +------------+ |  |
    | CE Manager |--------+-|    CE1     |------|    CE2     |-+  |
    --------------        | |  (Active)  |  Fr  |  (Backup)  |    |
          |               | +-------+--+-+      +---+---+----+    |
          | Fl            |         |  |       Fp   /   |         |
          |               |         |  +---------+ /    |         |
          |               |       Fp|            |/     |Fp       |
          |               |         |            |      |         |
          |               |         |      Fp   /+--+   |         |
          |               |         |  +-------+    |   |         |
          |               |         |  |            |   |         |
    -------------- Ff     | --------+--+--      ----+---+----+    |
    | FE Manager |--------+-|    FE1     |  Fi  |    FE2     |    |
    --------------        | |            |------|            |    |
                          | --------------      --------------    |
                          |   |  |  |  |          |  |  |  |      |
                          ----+--+--+--+----------+--+--+--+-------
                              |  |  |  |          |  |  |  |
                              |  |  |  |          |  |  |  |
                                 Fi/f                Fi/f

          Fp: CE-FE interface
          Fi: FE-FE interface
          Fr: CE-CE interface
          Fc: Interface between the CE Manager and a CE
          Ff: Interface between the FE Manager and an FE
          Fl: Interface between the CE Manager and the FE Manager
          Fi/f: FE external interface

                     Figure 1: ForCES Architecture

   The ForCES architecture allows FEs to be aware of multiple CEs but
   enforces that only one CE be the master controller.  This is known in
   the industry as 1+N redundancy.  The master CE controls the FEs via
   the ForCES protocol operating in the Fp interface.  If the master CE
   becomes faulty, a backup CE takes over and NE operation continues.
   By definition, the currently documented setup is known as
   cold-standby.  The CE set is static and is passed to the FE by the FE
   Manager (FEM) via the Ff interface and to each CE by the CE Manager
   (CEM) via the Fc interface during the pre-association phase.

   From an FE perspective, the knobs of control for a CE set are defined
   by the FEPO LFB in [RFC5810], Appendix B.
   Section 3.1 of this document details these knobs further.

2.1.  Document Scope

   The reader is assumed to be familiar with the ForCES architecture,
   which is needed to make sense of the changes made here.  This
   document provides minimal background to set the context of the
   discussion in Section 4.

   By current definition, the Fr interface is out of scope for the
   ForCES architecture.  However, it is expected that organizations
   implementing a set of CEs will need to have the CEs communicate with
   each other via the Fr interface in order to achieve the
   synchronization necessary for controlling the FEs.

   The problem scope addressed by this document falls into two areas:

   1.  To describe with more clarity (than [RFC5810]) how the current
       cold-standby approach operates within the NE cluster.

   2.  To describe how to evolve the cold-standby setup to a hot-standby
       redundancy setup so as to improve the failover time and NE
       availability.

2.2.  Quantifying Problem Scope

   NE recovery and availability depend on several time-sensitive
   metrics:

   1.  How fast the CE plane failure is detected by the FE.

   2.  How fast a backup CE becomes operational.

   3.  How fast the FEs associate with the new master CE.

   4.  How fast the FEs recover their state and become operational.

   The design choices currently made in [RFC5810] to meet the above
   goals are driven by a desire for simplicity.

   To quantify the above criteria with the currently prescribed ForCES
   CE setup in [RFC5810]:

   1.  How fast the CE side detects a CE failure is left undefined.  To
       illustrate an extreme scenario, we could have a human operator
       acting as the monitoring entity to detect faulty CEs.  How fast
       such detection happens could be in the range of seconds to days.
       A more active monitor on the Fr interface could improve this
       detection.

   2.  How fast the backup CE becomes operational is also currently out
       of scope.  In the current setup, a backup CE need not be
       operational at all (for example, to save power) and therefore it
       is feasible for a monitoring entity to boot up a backup CE after
       it detects the failure of the master CE.  In Section 4 of this
       document, we suggest that at least one backup CE be online so as
       to improve this metric.

   3.  How fast an FE associates with the new master CE is also
       currently undefined.  The cost of an FE connecting and
       associating adds to the recovery overhead.  As mentioned above,
       we suggest having at least one backup CE online.  In Section 4 we
       propose to zero out the connection and association cost on
       failover by having each FE associate with all online backup CEs
       after associating with the active CE.  Note that if an FE
       pre-associates with backup CEs, then the system will technically
       be operating in hot-standby mode.

   4.  Finally, how fast an FE recovers its state depends on how much NE
       state exists.  By the current ForCES definition, the new master
       CE assumes zero state on the FE and starts from scratch to update
       the FE.  So the larger the state, the longer the recovery.

3.  RFC5810 CE HA Framework

   To achieve CE High Availability, FEs and CEs MUST inter-operate per
   the [RFC5810] definition, which is repeated for contextual reasons in
   Section 3.1.  It should be noted that in this default setup, which
   MUST be implemented by CEs and FEs needing HA, the Fr plane is out of
   scope (and if available is proprietary to an implementation).

3.1.  Current CE High Availability Support

   As mentioned earlier, although there can be multiple redundant CEs,
   only one CE actively controls FEs in a ForCES NE.  In practice there
   may be only one backup CE.  At any moment in time only one master CE
   can control the FEs.  In addition, the FE connects and associates to
   only the master CE.  The FE and the CE PL are aware of the primary
   and one or more secondary CEs.
   This information (primary, secondary CEs) is configured on the FE and
   the CE PLs during pre-association by the FEM and the CEM
   respectively.

   Figure 2 below illustrates the ForCES message sequences that the FE
   uses to recover the connection in the currently defined cold-standby
   scheme.

      FE                 CE Primary          CE Secondary
      |                       |                    |
      | Asso Estb,Caps exchg  |                    |
   1  |<--------------------->|                    |
      |                       |                    |
      |     state update      |                    |
   2  |<--------------------->|                    |
      |                       |                    |
      |                       |                    |
      |                    FAILURE                 |
      |                                            |
      |          Asso Estb,Caps exchange           |
   3  |<------------------------------------------>|
      |                                            |
      |         Event Report (pri CE down)         |
   4  |------------------------------------------->|
      |                                            |
      |                state update                |
   5  |<------------------------------------------>|

                Figure 2: CE Failover for Cold Standby

3.1.1.  Cold Standby Interaction with ForCES Protocol

   High Availability parameterization in an FE is driven by configuring
   the FE Protocol Object (FEPO) LFB.

   The FEPO CEID component identifies the current master CE and the
   component table BackupCEs identifies the backup CEs.  The FEPO FE
   Heartbeat Interval, CE Heartbeat Dead Interval, and CE Heartbeat
   policy help in detecting connectivity problems between an FE and CE.
   The CE Failover policy defines how the FE should react to a detected
   failure.

   Figure 3 illustrates the defined state machine that facilitates
   connection recovery.

   The FE connects to the CE specified in the FEPO CEID component.  If
   it fails to connect to that CE, it moves that CE to the bottom of the
   BackupCEs table and sets its CEID component to the first CE retrieved
   from the BackupCEs table.  The FE then attempts to associate with the
   CE designated as the new primary CE.  The FE continues through this
   procedure until it successfully connects to one of the CEs.
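
   The connection procedure above can be sketched as follows.  This is
   a minimal illustration and not part of the FEPO definition: the
   function names, the try_connect callback, and the max_attempts bound
   (added only so that the sketch terminates) are hypothetical.

```python
from collections import deque


def rotate_to_next_ce(ceid, backup_ces):
    """Move the unreachable CE to the bottom of BackupCEs and promote
    the first backup to CEID. Returns (new_ceid, new_backup_ces)."""
    backups = deque(backup_ces)
    backups.append(ceid)          # failed CE goes to the bottom
    new_ceid = backups.popleft()  # first backup becomes the new CEID
    return new_ceid, list(backups)


def associate(ceid, backup_ces, try_connect, max_attempts):
    """Cycle through the CE set until one CE accepts the connection.

    try_connect is a hypothetical callback returning True on success.
    max_attempts is an assumption of this sketch; the text itself
    places no bound on the number of attempts.
    """
    for _ in range(max_attempts):
        if try_connect(ceid):
            return ceid, backup_ces
        ceid, backup_ces = rotate_to_next_ce(ceid, backup_ces)
    return None, backup_ces
```

   With CEID 1 and BackupCEs [2, 3], a failed attempt to reach CE 1
   leaves CEID set to 2 and BackupCEs set to [3, 1].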

                          FE tries to associate
                               +-->-----+
                               |        |
   (CE issues Teardown ||  +---+--------v----+
    Lost association) &&   | Pre-Association |
   CE failover policy = 0  | (Association    |
       +------------>-->-->|       in        +<----+
       |                   |    progress)    |     |
       |  CE Issues        +--------+--------+     |
       |  Association               |              | CEFTI
       |  Response                  V              | timer
       |         +------------------+              | expires
       |         |                                 ^
       |         V                                 |
     +-+-----------+                        +------+-----+
     |             |                        |    Not     |
     |             | (CE issues Teardown || | Associated |
     |             |  Lost association) &&  |            +->---+
     | Associated  | CE Failover Policy = 1 |(May        | FE  |
     |             |                        | Continue   |try  v
     |             |-------->------->------>| Forwarding)|assn |
     |             |  Start CEFTI timer     |            |-<---+
     |             |                        |            |
     +-------------+                        +------+-----+
         ^                                         |
         |            CE Issues                    V
         |            Association                  |
         |            Setup                        |
         |            (Cancel CEFTI Timer)         |
         +_________________________________________+

              Figure 3: FE State Machine considering HA

   When communication fails between the FE and CE (which can be caused
   by a CE failure or a link failure, but not by an FE-related fault),
   either the TML on the FE notifies the FE PL of the failure, or the
   failure is detected using the Heartbeat (HB) messages exchanged
   between FEs and CEs.  The communication failure, regardless of how it
   is detected, MUST be considered as a loss of association between the
   CE and the corresponding FE.

   If the FE's FEPO CE Failover Policy is configured to mode 0 (the
   default), the FE will immediately transition to the pre-association
   phase.  This means that if association is established again, all FE
   state will need to be re-established.

   If the FE's FEPO CE Failover Policy is configured to mode 1, it
   indicates that the FE is capable of HA restart recovery.  In such a
   case, the FE transitions to the Not Associated state and the CEFTI
   timer [RFC5810] is started.  The FE MAY continue to forward packets
   during this state.  It MAY also recycle through any configured backup
   CEs in a round-robin fashion.
   It first adds its primary CE to the bottom of the BackupCEs table and
   sets its CEID component to the first secondary retrieved from the
   BackupCEs table.  The FE then attempts to associate with the CE
   designated as the new primary CE.  If it fails to re-associate with
   any CE and the CEFTI timer expires, the FE then transitions to the
   pre-association state.

   If the FE, while in the Not Associated state, manages to reconnect to
   a new primary CE before the CEFTI timer expires, it transitions to
   the Associated state.  Once the FE has re-associated, the CE tries to
   synchronize any state that the FE may have lost during the Not
   Associated state.  How the CE re-synchronizes such state is out of
   scope for the current ForCES architecture but would typically consist
   of issuing new configs and queries.

   An explicit message from the primary CE (a Config message setting the
   Primary CE component in the ForCES Protocol object) can also be used
   to change the Primary CE for an FE during normal protocol operation.
   In this case, the FE transitions to the Not Associated state and
   attempts to associate with the new CE.

3.1.2.  Responsibilities for HA

   TML Level:

   1.  The TML controls logical connection availability and failover.

   2.  The TML also controls peer HA management.

   At this level, handling of all lower layers, for example the
   transport level (IP addresses, MAC addresses, etc.) and the failure
   of associated links, is the role of the TML.

   PL Level:
   All other functionality, including configuring the HA behavior during
   setup, the CE IDs used to identify primary and secondary CEs,
   protocol messages used to report CE failure (Event Report), Heartbeat
   messages used to detect association failure, messages to change the
   primary CE (Config), and other HA related operations described in
   Section 3.1, are the PL's responsibility.

   To put the two together, if a path to a primary CE is down, the TML
   would take care of failing over to a backup path, if one is
   available.  If the CE is totally unreachable, then the PL would be
   informed and it would take the appropriate actions described before.

4.  CE HA Hot Standby

   In this section we describe small extensions to the existing scheme
   to enable hot standby HA.  To achieve hot standby HA, we target
   specific goals defined in Section 2.2, namely:

   o  How fast a backup CE becomes operational.

   o  How fast the FEs associate with the new master CE.

   As described in Section 3.1, in the pre-association phase the FEM
   configures the FE to make it aware of all the CEs in the NE.  The FEM
   MUST configure the FE to make it aware of which CE is the master and
   MAY specify any backup CE(s).

4.1.  Changes to the FEPO model

   In order for the above to be achievable, a few changes to the FEPO
   model are needed.  Appendix I contains the XML definition of the new
   version 2 of the FEPO LFB.

   Changes from version 1 of FEPO are:

   1.  Added four new datatypes:

       1.  CEStatusType, an unsigned char that specifies the status of a
           connection with a CE.  Special values are 0 (Disconnected),
           1 (Connected), 2 (Associated), 3 (Lost_Connection) and
           4 (Unreachable).

       2.  HAModeValues, an unsigned char that specifies the selected HA
           mode.  Special values are 0 (No HA Mode), 1 (HA Mode - Cold
           Standby) and 2 (HA Mode - Hot Standby).

       3.  FEHACapab, an unsigned char that specifies the HA
           capabilities of the FE.  Special values are 0 (Graceful
           Restart), 1 (Cold Standby) and 2 (Hot Standby).

       4.  AllCEType, a struct of CE ID and CEStatusType that contains
           connection information for one CE.  Used in the AllCEs array.

   2.  Appended three new components:

       1.  AllCEs, which holds the status of all CEs.  AllCEs is an
           array of AllCEType.

       2.  HAMode, which specifies the currently selected High
           Availability mode.  An unsigned char with three special
           values: 0 (No HA), 1 (Running Cold-Standby) and 2 (Running
           Hot-Standby).

       3.  AcceptBackupGets, which allows the master CE to control
           whether the FE will accept incoming queries from backup CEs.

   3.  Added two new capabilities:

       1.  HACapabilities, a table that defines which HA capabilities
           the FE supports.

       2.  MaximumMultipleCEAssocations, which defines the maximum
           number of associations with CEs that this FE can have.

   4.  Added one additional event, the HAPrimaryCEDown event, which
       reports the last known CEID and the tentative new master CEID.

   Since no component from FEPO v1 has been changed, FEPO v2 retains
   backwards compatibility with CEs that know only version 1.0.  These
   CEs, however, cannot make use of the High Availability options that
   the new FEPO provides.

4.2.  FEPO processing

   The AllCEs table of the FE's FEPO LFB version 2 contains all the CE
   IDs that the FE may connect and associate with.  The ordering of the
   CE IDs in this table defines the priority order in which an FE will
   connect to the CEs.  In the pre-association phase, the first CE ID
   (lowest table index) in the AllCEs table MUST be the first CE ID that
   the FE will attempt to connect and associate with.  If the FE fails
   to connect and associate with the first CE ID, it will attempt to
   connect to the second CE ID and so forth, cycling back to the
   beginning of the list until there is a connection and an association.
   The FE MUST associate with at least one CE.  Upon a successful
   association, the FEPO's CEID component identifies the currently
   associated master CE.

   While it would be much simpler to have the FE not respond to any
   messages from a CE other than the master, it may be useful for the
   backup CEs to be able to query the FE.  Query commands are always
   sent on the high priority channel.
   In order to avoid missing critical configuration or query commands
   from the master CE, all query commands from backup CEs MUST be sent
   on the high priority channel but with the lowest priority, which has
   the value 4.  However, since queries have higher priority than
   heartbeats, if the master CE waits for heartbeat responses while the
   backup CEs flood the FE with queries, the master CE may conclude that
   the FE is down.  Therefore it is prudent to add a mechanism that
   controls whether the FE can respond to query messages from backup
   CEs.  The AcceptBackupGets component, a boolean, is designed for this
   purpose.  If the master CE sets it to true, the FE MUST accept and
   process query commands from backup CEs.  If AcceptBackupGets is
   false, the FE MUST drop query commands from backup CEs.

   Asynchronous events that the master CE has subscribed to, as well as
   heartbeats, are sent to all associated CEs.  Packet redirects
   continue to be sent only to the master CE.  The Heartbeat Interval,
   the CEHB Policy and the FEHB Policy MUST be the same for all CEs.

   Figure 4 illustrates the state machine that facilitates connection
   recovery with High Availability enabled.
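
   The query-handling and fan-out rules of this section can be
   summarized in a short sketch.  The function names and the string
   labels used for message kinds are hypothetical, and the check that
   the master CE has actually subscribed to a given event is assumed to
   happen elsewhere.

```python
def fe_accepts_query(sender_ceid, master_ceid, accept_backup_gets):
    """Return True if the FE should process a query from sender_ceid."""
    if sender_ceid == master_ceid:
        return True               # the master CE is always served
    return accept_backup_gets     # backups only when the master allows it


def recipients(message_kind, master_ceid, associated_ceids):
    """CE IDs that receive an FE-originated message of the given kind.

    The kind labels "event", "heartbeat" and "redirect" are labels
    invented for this sketch, not protocol identifiers.
    """
    if message_kind in ("event", "heartbeat"):
        return list(associated_ceids)   # fan out to every associated CE
    if message_kind == "redirect":
        return [master_ceid]            # packet redirects: master only
    raise ValueError(f"unknown message kind: {message_kind}")
```

   Keeping the AcceptBackupGets check in a single predicate mirrors the
   intent of the component: the master CE can switch off backup queries
   without affecting event or heartbeat delivery.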

                           FE tries to associate
                                +-->-----+
                                |        |
                                ^        v
   (CE issues Teardown ||  +----+--------+---+
    Lost association) &&   | Pre-Association |
   CE failover policy = 0  | (Association    +<-------------------+
       +------------>-->-->|       in        +<-----+             |
       |                   |    progress)    |      |             |
       |  CE Issues        +--------+--------+      |             |
       |  Association               |               |             |
       |  Response                  V      Not Found || CEFTI     |
       |         +------------------+         timer expires       |
       |         |                                  |             |
       |         V                                  ^             |
     +-+-----------+                         +------+------+      |
     |             | (CE issues Teardown ||  |     Not     |      |
     |             |  Lost association) &&   | Associated  |      |
     |             | (CE Failover Policy=1)  |             |    CEFTI
     | Associated  |                         |    (May     |    timer
     |             |                         |  Continue   |   expires
     |             +---------->------->----->| Forwarding) |      |
     |             |   Start CEFTI Timer     |             |      |
     |             |                         | Search for  |      |
     |             |              +--------->|    next     |      |
     |             |              |          | associated  |      |
     |             |              |          |     CE      |      |
     |             |              |          | (HAMode 2)  |      |
     +-------------+              |          +-------------+      |
           ^                      |                 V             |
           |                      |                 |             |
           |                      |             Found CE          |
           |                CEHDI Expires     Send Event of       |
           |                      |            New CE ID.         |
           |                      |         Start CEHDI Timer     |
           |                      |                 |             |
           |                      |                 V             |
           |                      |          +------+------+      |
           |                      ^----------+   Confirm   +------^
           |                                 |    State    |
           |    Received               +---->|             |
           |    different              |     | Wait for CE |
           |    CE ID.                 ^     | to confirm  |
           |    Resend Event           |     |  new CE ID  |
           |    Restart CEHDI Timer    +----<|             |
           |                                 +-----+-------+
           |        Received same CE ID            |
           |    (Cancel CEFTI & CEHDI Timer)       |
           +_______________________________________+

              Figure 4: FE State Machine considering HA

   Once the FE has associated with a master CE, it moves to the
   post-association phase (Associated state).  The master CE MAY also
   instruct the FE to use a different master CE.  It is assumed that the
   master CE will communicate with other CEs within the NE for the
   purpose of synchronization via the CE-CE interface.  The CE-CE
   interface is out of scope for this document.

     FE                    CE#1         CE#2 ... CE#N
      |                      |            |        |
      | Asso Estb,Caps exchg |            |        |
   1  |<-------------------->|            |        |
      |                      |            |        |
      |     state update     |            |        |
   2  |<-------------------->|            |        |
      |                      |            |        |
      |       Asso Estb,Caps exchg        |        |
   3I |<--------------------------------->|        |
     ...                    ...          ...      ...
      |            Asso Estb,Caps exchg            |
   3N |<------------------------------------------>|
      |                      |            |        |
   4  |<-------------------->|            |        |
      .                      .            .        .
   4x |<-------------------->|            |        |
      |                   FAILURE         |        |
      |                      |            |        |
      | Event Report (CE#2 is new master) |        |
   5  |---------------------------------->|------->|
      |                                   |        |
      | Config (Set CEID to CEID of CE#2) |        |
   6  |<----------------------------------|        |
   7  |<--------------------------------->|        |
      .                      .            .        .
   7x |<--------------------------------->|        |
      .                      .            .        .

                Figure 5: CE Failover for Hot Standby

   While in the post-association phase, if the CE Failover Policy is set
   to 1 and HAMode is set to 2 (HotStandby), then the FE, after
   successfully associating with the master CE, MUST attempt to connect
   and associate with all the CEs that it is aware of.  Steps #1 and #2
   of Figure 5 illustrate the FE associating with CE#1 as the master;
   steps #3I to #3N then show the FE associating with the backup CEs
   CE#2 to CE#N.  If the FE fails to connect or associate with some CEs,
   the FE MAY flag them as unreachable to avoid continuous attempts to
   connect.  The FE MAY retry association with unreachable CEs when
   possible.

   When the master CE is, for any reason, considered to be down, the FE
   will try to find the first associated CE from the list of all CEs in
   a round-robin fashion.

   If the FE is unable to find an associated CE in its list of CEs, then
   it will attempt to connect and associate with the first CE in the
   list of all CEs and continue in a round-robin fashion until it
   connects and associates with a CE.
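
   The search for a replacement master can be sketched as follows.  The
   status values are those of CEStatusType in Section 4.1; the class
   and function names are hypothetical, and starting the round-robin
   search at the entry after the failed master is an assumption of this
   sketch, since the text does not fix the starting point.

```python
from enum import IntEnum


class CEStatus(IntEnum):
    """Special values of CEStatusType as listed in Section 4.1."""
    DISCONNECTED = 0
    CONNECTED = 1
    ASSOCIATED = 2
    LOST_CONNECTION = 3
    UNREACHABLE = 4


def next_associated_ce(all_ces, failed_ceid):
    """Return the CE ID of the first associated CE, or None.

    all_ces is a list of (ceid, status) pairs in AllCEs table order.
    The search wraps around the table once (round-robin).
    """
    ids = [ceid for ceid, _ in all_ces]
    start = ids.index(failed_ceid) + 1 if failed_ceid in ids else 0
    for offset in range(len(all_ces)):
        ceid, status = all_ces[(start + offset) % len(all_ces)]
        if ceid != failed_ceid and status == CEStatus.ASSOCIATED:
            return ceid
    return None  # caller falls back to connecting from the top of the list
```

   A return value of None corresponds to the fallback case above, in
   which the FE has to connect and associate again starting from the
   first CE in the list.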

   Once the FE selects the associated CE to use as the new master, the
   FE sends a High Availability Primary CE Changed Event Notification
   (the HAPrimaryCEDown event) to all associated CEs to notify them that
   the primary CE is down, as well as which CE the reporting FE
   considers to be the new master.

   The new master CE MUST configure the CEID component of the FE within
   the time limit defined in the CEHDI Failover Timeout as a
   confirmation that the FE made the right choice.

     FE                    CE#1         CE#2 ... CE#N
      |                      |            |        |
      | Asso Estb,Caps exchg |            |        |
   1  |<-------------------->|            |        |
      |                      |            |        |
      |     state update     |            |        |
   2  |<-------------------->|            |        |
      |                      |            |        |
      |       Asso Estb,Caps exchg        |        |
   3I |<--------------------------------->|        |
      |                      |            |        |
     ...                    ...          ...      ...
      |            Asso Estb,Caps exchg            |
   3N |<------------------------------------------>|
      |                      |            |        |
   4  |<-------------------->|            |        |
      .                      .            .        .
   4x |<-------------------->|            |        |
      |                   FAILURE         |        |
      |                      |            |        |
      | Event Report (CE#2 is new master) |        |
   5  |---------------------------------->|------->|
      |                      |            |        |
      |      CEHDI Failover Timeout       |        |
      |                      |            |        |
      | Event Report (CE#N is new master) |        |
   6  |---------------------------------->|------->|
      |                      |            |        |
      |     Config (Set CEID to CEID of CE#N)      |
   7  |<-------------------------------------------|
   8a |<------------------------------------------>|
      .                      .            .        .
   8x |<------------------------------------------>|

                Figure 6: CE Failover for Hot Standby

   If the FE does not get confirmation within the CEHDI Failover
   Timeout, it picks the next CE on its list and advertises it as the
   new master.  In Figure 6, the FE selects CE#2 as its new master in
   step #5.  The timeout then expires and, in step #6, the FE picks CE#N
   as its new master.  The FE receives confirmation that CE#N is the new
   master in step #7.
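
   The confirmation exchange of Figure 6, together with the Confirm
   state of Figure 4, can be sketched as a small simulation.  All names
   are hypothetical.  The replies mapping stands in for Config messages
   from the CEs, a missing entry models a CEHDI timeout, and
   max_announcements is a stand-in for CEFTI expiry; the candidate list
   is assumed to be non-empty.

```python
def confirm_new_master(candidates, replies, max_announcements):
    """Simulate the Confirm state of the FE.

    candidates: associated CE IDs in AllCEs order (non-empty).
    replies: maps an announced CE ID to the CE ID that a Config message
        writes into the CEID component before CEHDI expires; a missing
        entry models a CEHDI timeout.
    max_announcements: hypothetical stand-in for CEFTI expiry.
    Returns (confirmed master CE ID or None, list of announced CE IDs).
    """
    announced = []
    pos = 0
    proposal = candidates[pos]
    for _ in range(max_announcements):
        announced.append(proposal)        # Event Report: proposed master
        reply = replies.get(proposal)
        if reply == proposal:
            return proposal, announced    # same CE ID: choice confirmed
        if reply is None:
            pos = (pos + 1) % len(candidates)   # CEHDI expired: next CE
            proposal = candidates[pos]
        else:
            proposal = reply              # different CE ID: resend event
            if reply in candidates:
                pos = candidates.index(reply)
    return None, announced                # CEFTI expired: pre-association
```

   With candidates [2, 5] (CE#2 and CE#N) and a reply only for CE 5,
   the sketch announces CE 2, times out, announces CE 5, and returns CE
   5 as the confirmed master, which matches steps #5 to #7 of Figure 6.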

   If the CE that the FE assumed to be the master discovers that it
   should not be the new master CE, then it will configure the CEID with
   the ID of the proper master CE.  How the CEs decide which CE is the
   new master is also out of scope of this document and is assumed to be
   done via a CE-CE communication protocol.  The FE must then associate
   with the new CE.

   If the CEFTI timer expires in either the Not Associated or the
   Confirm state without a new master CE having been confirmed, then the
   FE MUST revert to the pre-association phase.

   In most High Availability architectures there exists the possibility
   of split-brain.  However, since in our setup the FE will never accept
   any configuration messages from any CE other than the master CE, we
   consider the FE to be fenced against data corruption from other CEs
   that consider themselves to be the master.  The split-brain issue
   becomes mostly a CE-CE communication problem, which is considered to
   be out of scope.

   By virtue of having multiple CE associations in place, the FE
   switchover to a new master CE will be much faster than in the
   cold-standby setup.  The overall effect is an improved NE recovery
   time in case of communication failure or faults of the master CE.
   This satisfies the requirement we set out to achieve.

5.  IANA Considerations

   TBA

6.  Security Considerations

   TBA

7.  References

7.1.  Normative References

   [RFC5810]  Doria, A., Hadi Salim, J., Haas, R., Khosravi, H., Wang,
              W., Dong, L., Gopal, R., and J. Halpern, "Forwarding and
              Control Element Separation (ForCES) Protocol
              Specification", RFC 5810, March 2010.

7.2.  Informative References

   [RFC3654]  Khosravi, H. and T. Anderson, "Requirements for Separation
              of IP Control and Forwarding", RFC 3654, November 2003.

   [RFC3746]  Yang, L., Dantu, R., Anderson, T., and R. Gopal,
              "Forwarding and Control Element Separation (ForCES)
              Framework", RFC 3746, April 2004.

   [RFC5812]  Halpern, J. and J.
              Hadi Salim, "Forwarding and Control Element Separation
              (ForCES) Forwarding Element Model", RFC 5812, March 2010.

Appendix I - New FEPO version

   FEHBPolicyValues
      The possible values of FE heartbeat policy
      uchar
         FEHBPolicy0
            The FE heartbeat policy 0
         FEHBPolicy1
            The FE heartbeat policy 1

   HAModeValues
      The possible values of HA modes
      uchar
         NoHA
            The FE is not running in HA mode
         ColdStandby
            The FE is running in HA mode cold Standby
         HotStandby
            The FE is running in HA mode hot Standby

   FERestartPolicyValues
      The possible values of FE restart policy
      uchar
         FERestartPolicy0
            The FE restart policy 0

   CEHBPolicyValues
      The possible values of CE heartbeat policy
      uchar
         CEHBPolicy0
            The CE heartbeat policy 0
         CEHBPolicy1
            The CE heartbeat policy 1

   FEHACapab
      The supported HA features
      uchar
         GracefullRestart
            The FE supports Graceful Restart
         HA
            The FE supports cold-standby mode
         HOtStandBy
            The FE supports hot-standby mode

   CEStatusType
      Status values.  Status for each CE.
      uchar
         Disconnected
            No connection attempt with the CE yet.
         Connected
            The FE has connected with the CE.
         Associated
            The FE has associated with the CE.
         Lost_Connection
            The FE was associated with the CE but lost the connection.
         Unreachable
            The CE is deemed as unreachable by the FE.

   AllCEType
      Table Type for AllCE component.
         CEID
            ID of the CE
            uint32
         CEStatus
            Status of the CE
            CEStatusType

   FEPO
      The FE Protocol Object
      2.0

      CurrentRunningVersion
         Currently running ForCES version
         u8
      FEID
         Unicast FEID
         uint32
      MulticastFEIDs
         the table of all multicast IDs
         uint32
      CEHBPolicy
         The CE Heartbeat Policy
         CEHBPolicyValues
      CEHDI
         The CE Heartbeat Dead Interval in millisecs
         uint32
      FEHBPolicy
         The FE Heartbeat Policy
         FEHBPolicyValues
      FEHI
         The FE Heartbeat Interval in millisecs
         uint32
      CEID
         The Primary CE this FE is associated with
         uint32
      BackupCEs
         The table of all backup CEs other than the primary
         uint32
      CEFailoverPolicy
         The CE Failover Policy
         CEFailoverPolicyValues
      CEFTI
         The CE Failover Timeout Interval in millisecs
         uint32
      FERestartPolicy
         The FE Restart Policy
         FERestartPolicyValues
      LastCEID
         The Primary CE this FE was last associated with
         uint32
      AllCEs
         The table of all CEs.
         AllCEType
      HAMode
         Mode selection for action in HA after loss of master CE
         HAModeValues
      AcceptBackupGets
         If true, the FE will accept and respond to Queries from
         BackupCEs.
         Boolean

      SupportableVersions
         the table of ForCES versions that FE supports
         uchar
      HACapabilities
         the table of HA capabilities the FE supports
         FEHACapab
      MaximumMultipleCEAssocations
         The number of CEs this FE can associate with at the same time
         uint32

      PrimaryCEDown
         The primary CE has changed
         LastCEID
         LastCEID
      HAPrimaryCEDown
         The primary CE has changed
         LastCEID
         CEID
         LastCEID

Authors' Addresses

   Kentaro Ogawa
   NTT Corporation
   3-9-11 Midori-cho
   Musashino-shi, Tokyo  180-8585
   Japan

   Email: ogawa.kentaro@lab.ntt.co.jp


   Weiming Wang
   Zhejiang Gongshang University
   149 Jiaogong Road
   Hangzhou  310035
   P.R.China

   Phone: +86-571-88057712
   Email: wmwang@mail.zjgsu.edu.cn


   Evangelos Haleplidis
   University of Patras
   Patras
   Greece

   Email: ehalep@ece.upatras.gr


   Jamal Hadi Salim
   Mojatatu Networks
   Ottawa, Ontario
   Canada

   Email: hadi@mojatatu.com