idnits 2.17.1 draft-ietf-forces-ceha-10.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([RFC5810]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year (Using the creation date from RFC5810, updated by this document, for RFC5378 checks: 2004-09-30) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (December 10, 2013) is 3782 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Obsolete normative reference: RFC 5226 (Obsoleted by RFC 8126) Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--). 
Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group K. Ogawa 3 Internet-Draft NTT Corporation 4 Updates: 5810 (if approved) W. M. Wang 5 Intended status: Standards Track Zhejiang Gongshang University 6 Expires: June 13, 2014 E. Haleplidis 7 University of Patras 8 J. Hadi Salim 9 Mojatatu Networks 10 December 10, 2013 12 ForCES Intra-NE High Availability 13 draft-ietf-forces-ceha-10 15 Abstract 17 This document discusses Control Element High Availability within a 18 ForCES Network Element. Additionally this document updates [RFC5810] 19 by providing new normative text for the Cold-Standby High 20 availability mechanism. 22 Status of This Memo 24 This Internet-Draft is submitted in full conformance with the 25 provisions of BCP 78 and BCP 79. 27 Internet-Drafts are working documents of the Internet Engineering 28 Task Force (IETF). Note that other groups may also distribute 29 working documents as Internet-Drafts. The list of current Internet- 30 Drafts is at http://datatracker.ietf.org/drafts/current/. 32 Internet-Drafts are draft documents valid for a maximum of six months 33 and may be updated, replaced, or obsoleted by other documents at any 34 time. It is inappropriate to use Internet-Drafts as reference 35 material or to cite them other than as "work in progress." 37 This Internet-Draft will expire on June 13, 2014. 39 Copyright Notice 41 Copyright (c) 2013 IETF Trust and the persons identified as the 42 document authors. All rights reserved. 44 This document is subject to BCP 78 and the IETF Trust's Legal 45 Provisions Relating to IETF Documents 46 (http://trustee.ietf.org/license-info) in effect on the date of 47 publication of this document. Please review these documents 48 carefully, as they describe your rights and restrictions with respect 49 to this document. 
Code Components extracted from this document must 50 include Simplified BSD License text as described in Section 4.e of 51 the Trust Legal Provisions and are provided without warranty as 52 described in the Simplified BSD License. 54 Table of Contents 56 1. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 2 57 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 58 2.1. Document Scope . . . . . . . . . . . . . . . . . . . . . 5 59 2.2. Quantifying Problem Scope . . . . . . . . . . . . . . . . 5 60 3. RFC5810 CE HA Framework . . . . . . . . . . . . . . . . . . . 6 61 3.1. RFC 5810 CE HA Support . . . . . . . . . . . . . . . . . 6 62 3.1.1. Cold Standby Interaction with ForCES Protocol . . . . 7 63 3.1.2. Responsibilities for HA . . . . . . . . . . . . . . . 10 64 4. CE HA Hot Standby . . . . . . . . . . . . . . . . . . . . . . 11 65 4.1. Changes to the FEPO model . . . . . . . . . . . . . . . . 11 66 4.2. FEPO processing . . . . . . . . . . . . . . . . . . . . . 13 67 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 68 6. Security Considerations . . . . . . . . . . . . . . . . . . . 18 69 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 19 70 7.1. Normative References . . . . . . . . . . . . . . . . . . 19 71 7.2. Informative References . . . . . . . . . . . . . . . . . 19 72 Appendix A. New FEPO version . . . . . . . . . . . . . . . . . . 19 73 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 29 75 1. Definitions 77 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 78 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 79 document are to be interpreted as described in [RFC2119]. 81 The following definitions are taken from [RFC3654], [RFC3746] and 82 [RFC5810]. 
   They are repeated here for convenience as needed, but the
   normative definitions are found in the referenced RFCs:

   o  Logical Functional Block (LFB) -- A template that represents a
      fine-grained, logically separate aspect of FE processing.

   o  Forwarding Element (FE) - A logical entity that implements the
      ForCES Protocol. FEs use the underlying hardware to provide
      per-packet processing and handling as directed by a CE via the
      ForCES Protocol.

   o  Control Element (CE) - A logical entity that implements the ForCES
      Protocol and uses it to instruct one or more FEs on how to process
      packets. CEs handle functionality such as the execution of
      control and signaling protocols.

   o  ForCES Network Element (NE) - An entity composed of one or more
      CEs and one or more FEs. An NE usually hides its internal
      organization from external entities and represents a single point
      of management to entities outside the NE.

   o  FE Manager (FEM) - A logical entity that operates in the
      pre-association phase and is responsible for determining to which
      CE(s) an FE should communicate. This process is called CE
      discovery and may involve the FE manager learning the capabilities
      of available CEs.

   o  CE Manager - A logical entity that operates in the pre-association
      phase and is responsible for determining to which FE(s) a CE
      should communicate. This process is called FE discovery and may
      involve the CE manager learning the capabilities of available FEs.

   o  ForCES Protocol -- The protocol used for communication between
      CEs and FEs. This protocol does not apply to CE-to-CE
      communication, FE-to-FE communication, or to communication
      between FE and CE managers. The ForCES protocol is a master-slave
      protocol in which FEs are slaves and CEs are masters.
      This protocol includes both the management of the communication
      channel (e.g., connection establishment, heartbeats) and the
      control messages themselves.

   o  ForCES Protocol Layer (ForCES PL) -- A layer in the ForCES
      protocol architecture that defines the ForCES protocol messages,
      the protocol state transfer scheme, and the ForCES protocol
      architecture itself (including requirements of ForCES TML as shown
      below). Specifications of ForCES PL are defined in [RFC5810].

   o  ForCES Protocol Transport Mapping Layer (ForCES TML) -- A layer in
      the ForCES protocol architecture that specifically addresses the
      protocol message transportation issues, such as how the protocol
      messages are mapped to different transport media (like SCTP, IP,
      TCP, UDP, ATM, Ethernet, etc.), and how to achieve and implement
      reliability, security, etc.

2. Introduction

   Figure 1 illustrates a ForCES NE controlled by a set of redundant CEs
   with CE1 being active and CE2 and CEn being backups.
141 ----------------------------------------- 142 | ForCES Network Element | 143 | +-----------+ | 144 | | CEn | | 145 | | (Backup) | | 146 -------------- Fc | +------------+ +------------+ | | 147 | CE Manager |--------+-| CE1 |------| CE2 |-+ | 148 -------------- | | (Active) | Fr | (Backup) | | 149 | | +-------+--+-+ +---+---+----+ | 150 | Fl | | | Fp / | | 151 | | | +---------+ / | | 152 | | Fp| |/ |Fp | 153 | | | | | | 154 | | | Fp /+--+ | | 155 | | | +-------+ | | | 156 | | | | | | | 157 -------------- Ff | --------+--+-- ----+---+----+ | 158 | FE Manager |--------+-| FE1 | Fi | FE2 | | 159 -------------- | | |------| | | 160 | -------------- -------------- | 161 | | | | | | | | | | 162 ----+--+--+--+----------+--+--+--+------- 163 | | | | | | | | 164 | | | | | | | | 165 Fi/f Fi/f 167 Fp: CE-FE interface 168 Fi: FE-FE interface 169 Fr: CE-CE interface 170 Fc: Interface between the CE Manager and a CE 171 Ff: Interface between the FE Manager and an FE 172 Fl: Interface between the CE Manager and the FE Manager 173 Fi/f: FE external interface 175 Figure 1: ForCES Architecture 177 The ForCES architecture allows FEs to be aware of multiple CEs but 178 enforces that only one CE be the master controller. This is known in 179 the industry as 1+N redundancy. The master CE controls the FEs via 180 the ForCES protocol operating on the Fp interface. If the master CE 181 becomes faulty, i.e. crashes or loses connectivity, a backup CE takes 182 over and NE operation continues. By definition, the current 183 documented setup is known as cold-standby. The set of CEs 184 controlling an FE is static and is passed to the FE by the FE Manager 185 (FEM) via the Ff interface and to each CE by the CE Manager (CEM) in 186 the Fc interface during the pre-association phase. 188 From an FE perspective, the knobs of control for a CE set are defined 189 by the FEPO LFB in [RFC5810], Appendix B. In Section 3.1 of this 190 document we discuss further details of these knobs. 192 2.1. 
Document Scope

   It is assumed that the reader is familiar with the ForCES
   architecture, which is needed to make sense of the changes described
   in this document. This document provides background information to
   set the context of the discussion in Section 4.

   At the time this document is being written, the Fr interface is out
   of scope for the ForCES architecture. However, it is expected that
   organizations implementing a set of CEs will need to have the CEs
   communicate with each other via the Fr interface in order to achieve
   the synchronization necessary for controlling the FEs.

   The problem scope addressed by this document falls into two areas:

   1. To update the description of [RFC5810] with more clarity on how
      the current cold-standby approach operates within the NE cluster.

   2. To describe how to evolve the [RFC5810] cold-standby setup to a
      hot-standby redundancy setup to improve the failover time and NE
      availability.

2.2. Quantifying Problem Scope

   NE recovery and availability depend on several time-sensitive
   metrics:

   1. How fast the CE plane failure is detected by the FE.

   2. How fast a backup CE becomes operational.

   3. How fast the FEs associate with the new master CE.

   4. How fast the FEs recover their state, and become operational.
      Each FE state is the collective state of all its instantiated
      LFBs.

   The design choices made in [RFC5810], as well as in this document,
   to meet the above goals are driven by a desire for simplicity.

   To quantify the above criteria with the currently prescribed ForCES
   CE setup in [RFC5810]:

   1. How fast the FE side detects a CE failure is left undefined. To
      illustrate an extreme scenario, we could have a human operator
      acting as the monitoring entity to detect faulty CEs. How fast
      such detection happens could be in the range of seconds to days.
      A more active monitor on the Fp interface could improve this
      detection.
      Usually, the FE detects a CE failure either through the TML,
      if the Fp interface terminates, or through the ForCES Protocol,
      by utilizing the ForCES heartbeat mechanism.

   2. How fast the backup CE becomes operational is also currently out
      of scope. In the current setup, a backup CE need not be
      operational at all (for example, to save power) and therefore it
      is feasible for a monitoring entity to boot up a backup CE after
      it detects the failure of the master CE. In Section 4 of this
      document, we suggest that at least one backup CE be online so as
      to improve this metric.

   3. How fast an FE associates with the new master CE is also currently
      undefined. The cost of an FE connecting and associating adds to
      the recovery overhead. As mentioned above, we suggest having at
      least one backup CE online. In Section 4, we propose to zero out
      the connection and association cost on failover by having each FE
      associate with all online backup CEs after associating with an
      active/master CE. Note that if an FE pre-associates with at
      least one backup CE, then the system will be technically
      operating in hot-standby mode.

   4. And last: How fast an FE recovers its state depends on how much
      NE state exists. By the current ForCES definition, the new master
      CE assumes zero state on the FE and starts from scratch to update
      the FE. So the larger the state, the longer the recovery.

3. RFC5810 CE HA Framework

   To achieve CE High Availability (HA), FEs and CEs MUST inter-operate
   per the [RFC5810] definition, which is repeated for contextual
   reasons in Section 3.1. It should be noted that in this default
   setup, which MUST be implemented by CEs and FEs requiring HA, the Fr
   plane is out of scope (and if available is proprietary to an
   implementation).

3.1. RFC 5810 CE HA Support

   As mentioned earlier, although there can be multiple redundant CEs,
   only one CE actively controls FEs in a ForCES NE.
   In practice there may be only one backup CE. At any moment in time,
   only one master CE can control an FE. In addition, the FE connects
   and associates with only the master CE. The FE and the CE are aware
   of the primary and one or more secondary CEs. This information
   (primary, secondary CEs) is configured on the FE and the CE during
   pre-association by the FEM and the CEM respectively.

   This section includes a new normative description that updates
   [RFC5810] for the Cold-Standby High Availability mechanism.

   Figure 2 below illustrates the ForCES message sequences that the FE
   uses to recover the connection in the currently defined cold-standby
   scheme.

     FE                      CE Primary           CE Secondary
      |                           |                     |
      | Association Establishment |                     |
      |   Capabilities Exchange   |                     |
    1 |<------------------------->|                     |
      |                           |                     |
      |       State Update        |                     |
    2 |<------------------------->|                     |
      |                           |                     |
      |                           |                     |
      |                        FAILURE                  |
      |                                                 |
      | Association Establishment,Capabilities Exchange |
    3 |<----------------------------------------------->|
      |                                                 |
      |          Event Report (primary CE down)         |
    4 |------------------------------------------------>|
      |                                                 |
      |                  State Update                   |
    5 |<----------------------------------------------->|

                 Figure 2: CE Failover for Cold Standby

3.1.1. Cold Standby Interaction with ForCES Protocol

   HA parameterization in an FE is driven by configuring the FE Protocol
   Object (FEPO) LFB.

   The FEPO CEID component identifies the current master CE and the
   component table BackupCEs identifies the configured backup CEs. The
   FEPO FE Heartbeat Interval, CE Heartbeat Dead Interval, and CE
   Heartbeat policy help in detecting connectivity problems between an
   FE and CE. The CE Failover policy defines how the FE should react to
   a detected failure. The FEObject FEState component [RFC5812] defines
   the operational forwarding status and control.
   The CE can turn off the FE's forwarding operations by setting the
   FEState to AdminDisable and can turn it on by setting it to
   OperEnable. Note: Section 5.1 of [RFC5812] has an erratum that
   describes the FEState as read-only when it should be read-write.

   Figure 3 illustrates the defined state machine that facilitates the
   recovery of connection state.

   The FE connects to the CE specified in the FEPO CEID component. If it
   fails to connect to the defined CE, it moves it to the bottom of
   table BackupCEs and sets its CEID component to be the first CE
   retrieved from table BackupCEs. The FE then attempts to associate
   with the CE designated as the new primary CE. The FE continues
   through this procedure until it successfully connects to one of the
   CEs or until the CE Failover Timeout Interval (CEFTI) expires.

                       FE tries to associate
                             +-->-----+
                             |        |
  (CE changes master ||      |        |
   CE issues Teardown || +---+--------v----+
    Lost association) && | Pre-Association |
  CE failover policy = 0 | (Association    |
     +------------>-->-->|   in            +<----+
     |                   |   progress)     |     |
     |                   |                 |     |
     |                   +--------+--------+     |
     |         CE Association     |              | CEFTI
     |            Response        V              | timer
     |         +------------------+              | expires
     |         |FE issue CEPrimaryDown           ^
     |         V                                 |
   +-+-----------+                        +------+-----+
   |             |  (CE changes master || |    Not     |
   |             |  CE issues Teardown || | Associated |
   |             |   Lost association) && |            +->---+
   | Associated  | CE Failover Policy = 1 |(May        | FE  |
   |             |                        | Continue   | try v
   |             |-------->------->------>| Forwarding)| assn|
   |             |  Start CEFTI timer     |            |-<---+
   |             |                        |            |
   +-------------+                        +-------+----+
        ^                                         |
        |          Successful                     V
        |          Association                    |
        |          Setup                          |
        |          (Cancel CEFTI Timer)           |
        +_________________________________________+
               FE issue CEPrimaryDown event

               Figure 3: FE State Machine considering HA

   There are several events that trigger mastership changes: The master
   CE may issue a mastership
   change (by changing the CEID component) or tear down an existing
   association; and last, connectivity may be lost between the CE and
   FE.

   When communication fails between the FE and CE (which can be caused
   by a CE or link failure but is not FE related), either the TML on
   the FE will notify the FE PL of this failure or it will be detected
   using the heartbeat messages between FEs and CEs. The communication
   failure, regardless of how it is detected, MUST be considered as a
   loss of association between the CE and corresponding FE.

   If the FE's FEPO CE Failover Policy is configured to mode 0 (the
   default), the FE will immediately transition to the pre-association
   phase. This means that if association is later re-established with a
   CE, all FE state will need to be re-created.

   If the FE's FEPO CE Failover Policy is configured to mode 1, this
   indicates that the FE will run in HA restart recovery. In such a
   case, the FE transitions to the Not Associated state and the CEFTI
   timer [RFC5810] is started. The FE may continue to forward packets
   during this state depending upon the value of the CEFailoverPolicy
   component of the FEPO LFB. The FE cycles through any configured
   backup CEs in a round-robin fashion. It first adds its primary CE to
   the bottom of table BackupCEs and sets its CEID component to be the
   first secondary retrieved from table BackupCEs. The FE then attempts
   to associate with the CE designated as the new primary CE. If it
   fails to re-associate with any CE and the CEFTI expires, the FE then
   transitions to the pre-association state and the FE will
   operationally bring down its forwarding path (and set the [RFC5812]
   FEObject FEState component to OperDisable).

   If the FE, while in the Not Associated state, manages to reconnect to
   a new primary CE before the CEFTI expires, it transitions to the
   Associated state.
   Once re-associated, the CE may try to synchronize any state that the
   FE may have lost during disconnection. How the CE re-synchronizes
   such state is out of scope for the current ForCES architecture but
   would typically constitute the issuing of new configs and queries.

   An explicit message (a Config message setting the Primary CE
   component in the ForCES Protocol object) from the primary CE can also
   be used to change the Primary CE for an FE during normal protocol
   operation. In this case, the FE transitions to the Not Associated
   state and attempts to associate with the new CE.

3.1.2. Responsibilities for HA

   TML Level:

   1. The TML controls logical connection availability and failover.

   2. The TML also controls peer HA management.

   At this level, control of all lower layers, for example the transport
   level (such as IP addresses, MAC addresses, etc.) and associated
   links going down, is the role of the TML.

   PL Level:

   All other functionality, including configuring the HA behavior during
   setup, the Control Element IDs (CE IDs) used to identify primary and
   secondary CEs, protocol messages used to report CE failure (Event
   Report), Heartbeat messages used to detect association failure,
   messages to change the primary CE (Config), and other HA-related
   operations described in Section 3.1, are the PL's responsibility.

   To put the two together, if a path to a primary CE is down, the TML
   would help recover from a failure by switching over to a backup path,
   if one is available. If the CE is totally unreachable, then the PL
   would be informed and it would take the appropriate actions described
   before.

4. CE HA Hot Standby

   In this section we describe small extensions to the existing scheme
   to enable hot standby HA. To achieve hot standby HA, we aim to
   improve on the specific goals defined in Section 2.2, namely:

   o  How fast a backup CE becomes operational.

   o  How fast the FEs associate with the new master CE.

   As described in Section 3.1, in the pre-association phase the FEM
   configures the FE to make it aware of all the CEs in the NE. The FEM
   MUST configure the FE to make it aware of which CE is the master and
   MAY specify any backup CE(s).

4.1. Changes to the FEPO model

   In order for the above to be achievable, a few changes to the FEPO
   model are needed. Appendix A contains the XML definition of the new
   version 1.1 of the FEPO LFB.

   Changes from version 1.0 of the FEPO are:

   1. Added four new datatypes:

      1. CEStatusType, an unsigned char to specify the status of a
         connection with a CE. Special values are:

         +  0 (Disconnected) represents that no connection attempt has
            been made with the CE yet

         +  1 (Connected) represents that the FE connection with the
            CE at the TML has completed successfully

         +  2 (Associated) represents that the FE has successfully
            associated with the CE

         +  3 (IsMaster) represents that the FE has associated with
            the CE and that the CE is the master of the FE

         +  4 (LostConnection) represents that the FE was associated
            with the CE at one point but lost the connection

         +  5 (Unreachable) represents that the FE deems this CE
            unreachable, i.e., the FE has tried over a period to
            connect to it but has failed

      2. HAModeValues, an unsigned char to specify the selected HA
         mode. Special values are:

         +  0 (No HA Mode) represents that the FE is not running in HA
            mode

         +  1 (HA Mode - Cold Standby) represents that the FE is in HA
            mode cold standby

         +  2 (HA Mode - Hot Standby) represents that the FE is in HA
            mode hot standby

      3. Statistics, a complex structure, representing the
         communication statistics between the FE and CE.
         The components are:

         +  RecvPackets representing the packet count received from
            the CE

         +  RecvBytes representing the byte count received from the CE

         +  RecvErrPackets representing the erroneous packets received
            from the CE. This component logs badly formatted packets
            as well as good packets sent to the FE by the CE to set
            components whilst that CE is not the master. Erroneous
            packets are dropped (i.e., not responded to).

         +  RecvErrBytes representing the RecvErrPackets byte count
            received from the CE

         +  TxmitPackets representing the packet count transmitted to
            the CE

         +  TxmitErrPackets representing the error packet count
            transmitted to the CE. Typically these would be failures
            due to communication.

         +  TxmitBytes representing the byte count transmitted to the
            CE

         +  TxmitErrBytes representing the byte count of failed
            transmissions to the CE

      4. AllCEType, a complex structure consisting of the CE IDs,
         Statistics, and CEStatusType to reflect connection information
         for one CE. Used in the AllCEs component array.

   2. Appended two new components:

      1. Read-only AllCEs to hold status for all CEs. AllCEs is an
         Array of the AllCEType.

      2. Read-write HAMode of type HAModeValues to carry the HA mode
         used by the FE.

   3. Added one additional Event, PrimaryCEChanged, reporting the new
      master CE ID when there is a mastership change.

   Since no component from FEPO v1.0 has been changed, FEPO v1.1
   retains backwards compatibility with CEs that know only version 1.0.
   These CEs, however, cannot make use of the HA options that the new
   FEPO provides.

4.2. FEPO processing

   The FE's FEPO LFB version 1.1 AllCEs table contains all the CE IDs
   that the FE may connect and associate with. The ordering of the CE
   IDs in this table defines the priority order in which an FE will
   connect to the CEs.
   This table is provisioned initially from the configuration plane
   (FEM). In the pre-association phase, the first CE (lowest table
   index) in the AllCEs table MUST be the first CE that the FE will
   attempt to connect and associate with. If the FE fails to connect
   and associate with the first listed CE, it will attempt to connect
   to the second CE and so forth, cycling back to the beginning of the
   list, until there is a successful association. The FE MUST associate
   with at least one CE. Upon a successful association, a component of
   the FEPO LFB, specifically the CEID component, identifies the
   current associated master CE.

   While it would be much simpler to have the FE not respond to any
   messages from a CE other than the master, in practice it has been
   found to be useful to respond to queries and heartbeats from backup
   CEs. For this reason, we allow backup CEs to issue queries to the
   FE. Configuration messages (SET/DEL) from backup CEs MUST be dropped
   by the FE and logged as received errors.

   Asynchronous events that the master CE has subscribed to, as well as
   heartbeats, are sent to all associated CEs. Packet redirects
   continue to be sent only to the master CE. The Heartbeat Interval,
   the CE Heartbeat Policy (CEHB), and the FE Heartbeat Policy (FEHB)
   are global for all CEs (and changed only by the master CE).

   Figure 4 illustrates the state machine that facilitates connection
   recovery with HA enabled.

                       FE tries to associate
                             +-->-----+
                             |        |
  (CE changes master ||      |        |
   CE issues Teardown || +---+--------v----+
    Lost association) && | Pre-Association |
  CE failover policy = 0 | (Association    |
     +------------>-->-->|   in            +<----+
     |                   |   progress)     |     |
     |                   |                 |     |
     |                   +--------+--------+     |
     |         CE Association     |              | CEFTI
     |            Response        V              | timer
     |         +------------------+              | expires
     |         |FE issue CEPrimaryDown           ^
     |         |FE issue PrimaryCEChanged        ^
     |         V                                 |
   +-+-----------+                        +------+-----+
   |             |  (CE changes master || |    Not     |
   |             |  CE issues Teardown || | Associated |
   |             |   Lost association) && |            +->----------+
   | Associated  | CE Failover Policy = 1 |(May        | find first |
   |             |                        | Continue   | associated v
   |             |-------->------->------>| Forwarding)| CE or retry|
   |             |  Start CEFTI timer     |            | associating|
   |             |                        |            |-<----------+
   |             |                        |            |
   +----+--------+                        +-------+----+
        |                                         |
        ^                                   Found | associated CE
        |                                or newly | associated CE
        |                                         V
        |          (Cancel CEFTI Timer)           |
        +_________________________________________+
               FE issue CEPrimaryDown event
               FE issue PrimaryCEChanged event

               Figure 4: FE State Machine considering HA

   Once the FE has associated with a master CE, it moves to the
   post-association phase (Associated state). It is assumed that the
   master CE will communicate with other CEs within the NE for the
   purpose of synchronization via the CE-CE interface. The CE-CE
   interface is out of scope for this document. An election among the
   CEs may result in a desire to change mastership to a different
   associated CE, at which point the currently assumed master CE will
   instruct the FE to use a different master CE.

     FE                         CE#1         CE#2 ... CE#N
      |                           |            |        |
      | Association Establishment |            |        |
      |   Capabilities Exchange   |            |        |
    1 |<------------------------->|            |        |
      |                           |            |        |
      |       State Update        |            |        |
    2 |<------------------------->|            |        |
      |                           |            |        |
      |       Association Establishment        |        |
      |         Capabilities Exchange          |        |
    3I|<-------------------------------------->|        |
     ...                         ...          ...      ...
      | Association Establishment,Capabilities Exchange |
    3N|<----------------------------------------------->|
      |                           |            |        |
    4 |<------------------------->|            |        |
      .                           .            .        .
    4x|<------------------------->|            |        |
      |                        FAILURE         |        |
      |                           |            |        |
      |    Event Report (LastCEID changed)     |        |
    5 |--------------------------------------->|------->|
      |   Event Report (CE#2 is new master)    |        |
    6 |--------------------------------------->|------->|
      |                                        |        |
    7 |<-------------------------------------->|        |
      .                           .            .        .
    7x|<-------------------------------------->|        |
      .                           .            .        .

                 Figure 5: CE Failover for Hot Standby

   While in the post-association phase, if the CE Failover Policy is set
   to 1 and HAMode is set to 2 (HotStandby), then the FE, after
   successfully associating with the master CE, MUST attempt to connect
   and associate with all the CEs that it is aware of. In Figure 5,
   steps #1 and #2 illustrate the FE associating with CE#1 as the
   master, and steps #3I to #3N illustrate the subsequent association
   with backup CEs CE#2 to CE#N. If the FE fails to connect or
   associate with some CEs, the FE MAY flag them as unreachable to
   avoid continuous attempts to connect. The FE MAY retry to
   reassociate with unreachable CEs when possible.

   When the master CE for any reason is considered to be down, then the
   FE MUST try to find the first associated CE from the list of all CEs
   in a round-robin fashion.
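
   The selection rule above can be sketched as follows. This is a
   minimal illustration and not part of the FEPO definition: the names
   CEStatus, CEEntry, and select_new_master are hypothetical, and
   starting the scan at the entry after the failed master is an
   assumption, since the text only requires a round-robin search. The
   status values are those listed for CEStatusType in Section 4.1.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class CEStatus(IntEnum):
    """CEStatusType values as listed in Section 4.1."""
    DISCONNECTED = 0
    CONNECTED = 1
    ASSOCIATED = 2
    IS_MASTER = 3
    LOST_CONNECTION = 4
    UNREACHABLE = 5


@dataclass
class CEEntry:
    """Hypothetical stand-in for one row of the AllCEs table."""
    ce_id: int
    status: CEStatus


def select_new_master(all_ces: List[CEEntry],
                      failed_index: int) -> Optional[int]:
    """Return the table index of the next Associated CE, or None.

    Assumption: the round-robin scan starts at the entry after the
    failed master and wraps around the table once, skipping the
    failed master itself.
    """
    n = len(all_ces)
    for step in range(1, n):
        idx = (failed_index + step) % n
        if all_ces[idx].status == CEStatus.ASSOCIATED:
            return idx
    return None


if __name__ == "__main__":
    # The master at index 0 has failed; CE 2 was never reachable,
    # CE 3 is already associated as a hot standby.
    table = [
        CEEntry(1, CEStatus.LOST_CONNECTION),
        CEEntry(2, CEStatus.UNREACHABLE),
        CEEntry(3, CEStatus.ASSOCIATED),
    ]
    print(select_new_master(table, 0))
```

   If no entry qualifies (the function returns None), the FE would fall
   back to connecting and associating with CEs in list order until the
   CEFTI timer expires.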

   If the FE is unable to find an associated CE in its list of CEs, then
   it MUST attempt to connect and associate with the first CE in the
   list of all CEs and continue in a round-robin fashion until it
   connects and associates with a CE or the CEFTI timer expires.

   Once the FE selects an associated CE to use as the new master, the FE
   issues a PrimaryCEDown Event Notification to all associated CEs to
   notify them that the last primary CE went down (and what its identity
   was); a second event, PrimaryCEChanged, is sent as well to identify
   which CE the reporting FE considers to be the new master.

   In most HA architectures there exists the possibility of split-brain.
   However, since in our setup the FE will never accept any
   configuration messages from any CE other than the master CE, we
   consider the FE as fenced against data corruption from the other CEs
   that consider themselves as the master. The split-brain issue becomes
   mostly a CE-CE communication problem, which is considered to be out
   of scope.

   By virtue of having multiple CE connections, the FE switchover to a
   new master CE will be considerably faster. The overall effect is an
   improved NE recovery time in case of communication failure or faults
   of the master CE. This satisfies the requirement we set out to
   achieve.

5. IANA Considerations

   Following the policies outlined in "Guidelines for Writing an IANA
   Considerations Section in RFCs" [RFC5226], the Logical Functional
   Block (LFB) Class Names and Class Identifiers namespace is updated.

   A new column, LFB version, is added to the table after the LFB Class
   Name.
   The table now reads as follows:

   +----------------+------------+-----------+-------------+-----------+
   | LFB Class      | LFB Class  | LFB       | Description | Reference |
   | Identifier     | Name       | Version   |             |           |
   +----------------+------------+-----------+-------------+-----------+
   +----------------+------------+-----------+-------------+-----------+

     Logical Functional Block (LFB) Class Names and Class Identifiers

   The same rules apply as defined in [RFC5812], with the addition that
   entries must provide the LFB version as a string.

   Upon publication of this document, all current entries are assigned a
   value of 1.0.

   New versions of an already defined LFB MUST NOT remove the previous
   version entries.

   LFB versions should appear in sequence in the registry.  The table
   SHOULD be sorted, and the sorting should be done by Class ID first
   and then by version.

   This document introduces the FE Protocol Object version 1.1 as
   follows:

   +------------+-----------+---------+--------------------+-----------+
   | LFB Class  | LFB Class | LFB     | Description        | Reference |
   | Identifier | Name      | Version |                    |           |
   +------------+-----------+---------+--------------------+-----------+
   | 2          | FE        | 1.1     | Defines parameters | This      |
   |            | Protocol  |         | for the ForCES     | document  |
   |            | Object    |         | protocol operation |           |
   +------------+-----------+---------+--------------------+-----------+

     Logical Functional Block (LFB) Class Names and Class Identifiers

6.  Security Considerations

   The security considerations defined in Section 9 of [RFC5810] apply
   to securing each CE-FE communication.  Multiple CEs associated with
   the same FE still require the same procedure to be followed on a
   per-association basis.

   It should be noted that since the FE initiates the association with a
   CE, a CE cannot initiate an association with the FE, and such
   messages will be dropped.
   Thus, the FE is secured from rogue CEs that are attempting to
   associate with it.

   CE implementers should keep in mind that, once associated, the FE
   cannot tell whether a CE has been compromised or is malfunctioning as
   long as connectivity is not lost.  Securing the CE is out of scope of
   this document.

   While the CE-CE plane is outside the current scope of ForCES, we
   recognize that it may be subjected to attacks that may affect the
   CE-FE communication.

   The following considerations should be made:

   1.  CEs should use secure communication channels between them for
       coordination and state keeping, at least to prevent malicious
       CEs from connecting.

   2.  The master CE should take into account DoS and DDoS attacks from
       malicious or malfunctioning CEs.

   3.  CEs should take into account the split-brain issue.  There are
       currently two fail-safes in the FE: first, the FE has the CEID
       component that denotes which CE is the master; second, the FE
       does not allow backup CEs to configure the FE.  However, backup
       CEs that consider the master CE to have dropped and themselves
       to be the master should first do a sanity check and query the FE
       CEID component.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

   [RFC5810]  Doria, A., Hadi Salim, J., Haas, R., Khosravi, H., Wang,
              W., Dong, L., Gopal, R., and J. Halpern, "Forwarding and
              Control Element Separation (ForCES) Protocol
              Specification", RFC 5810, March 2010.

   [RFC5812]  Halpern, J. and J. Hadi Salim, "Forwarding and Control
              Element Separation (ForCES) Forwarding Element Model",
              RFC 5812, March 2010.

7.2.  Informative References

   [RFC3654]  Khosravi, H. and T.
              Anderson, "Requirements for Separation of IP Control and
              Forwarding", RFC 3654, November 2003.

   [RFC3746]  Yang, L., Dantu, R., Anderson, T., and R. Gopal,
              "Forwarding and Control Element Separation (ForCES)
              Framework", RFC 3746, April 2004.

Appendix A.  New FEPO version

   The XML has been validated against the schema defined in [RFC5812].

   CEHBPolicyValues
      The possible values of CE heartbeat policy
      uchar
      CEHBPolicy0
         The CE will send heartbeats to the FE every CEHDI timeout if
         no other messages have been sent since.
      CEHBPolicy1
         The CE will not send heartbeats to the FE

   FEHBPolicyValues
      The possible values of FE heartbeat policy
      uchar
      FEHBPolicy0
         The FE will not generate any heartbeats to the CE
      FEHBPolicy1
         The FE generates heartbeats to the CE every FEHI if no other
         messages have been sent to the CE.

   FERestartPolicyValues
      The possible values of FE restart policy
      uchar
      FERestartPolicy0
         The FE restarts its state from scratch

   HAModeValues
      The possible values of HA modes
      uchar
      NoHA
         The FE is not running in HA mode
      ColdStandby
         The FE is running in HA mode cold Standby
      HotStandby
         The FE is running in HA mode hot Standby

   CEFailoverPolicyValues
      The possible values of CE failover policy
      uchar
      CEFailoverPolicy0
         The FE should stop functioning immediately and transition to
         the FE OperDisable state
      CEFailoverPolicy1
         The FE should continue forwarding even without an associated
         CE for CEFTI.  The FE goes to FE OperDisable when the CEFTI
         expires and there is no association.
         Requires graceful restart support.

   FEHACapab
      The supported HA features
      uchar
      GracefullRestart
         The FE supports Graceful Restart
      HA
         The FE supports HA

   CEStatusType
      Status values.  Status for each CE
      uchar
      Disconnected
         No connection attempt with the CE yet
      Connected
         The FE connection with the CE at the TML has been completed
      Associated
         The FE has associated with the CE
      IsMaster
         The CE is the master (and associated)
      LostConnection
         The FE was associated with the CE but lost the connection
      Unreachable
         The CE is deemed as unreachable by the FE

   StatisticsType
      Statistics Definition
      RecvPackets
         Packets Received
         uint64
      RecvErrPackets
         Packets Received from CE with errors
         uint64
      RecvBytes
         Bytes Received from CE
         uint64
      RecvErrBytes
         Bytes Received from CE in Error
         uint64
      TxmitPackets
         Packets Transmitted to CE
         uint64
      TxmitErrPackets
         Packets Transmitted to CE that incurred errors
         uint64
      TxmitBytes
         Bytes Transmitted to CE
         uint64
      TxmitErrBytes
         Bytes Transmitted to CE incurring errors
         uint64

   AllCEType
      Table Type for AllCE component
      CEID
         ID of the CE
         uint32
      Statistics
         Statistics per CE
         StatisticsType
      CEStatus
         Status of the CE
         CEStatusType

   FEPO
      The FE Protocol Object, with new CEHA
      1.1

      CurrentRunningVersion
         Currently running ForCES version
         uchar
      FEID
         Unicast FEID
         uint32
      MulticastFEIDs
         the table of all multicast IDs
         uint32
      CEHBPolicy
         The CE Heartbeat Policy
         CEHBPolicyValues
      CEHDI
         The CE Heartbeat Dead Interval in millisecs
         uint32
      FEHBPolicy
         The FE Heartbeat Policy
         FEHBPolicyValues
      FEHI
         The FE Heartbeat Interval in millisecs
         uint32
      CEID
         The Primary CE this FE is associated with
         uint32
      BackupCEs
         The table of all backup CEs other than the primary
         uint32
      CEFailoverPolicy
         The CE Failover Policy
         CEFailoverPolicyValues
      CEFTI
         The CE Failover Timeout Interval in millisecs
         uint32
      FERestartPolicy
         The FE Restart Policy
         FERestartPolicyValues
      LastCEID
         The Primary CE this FE was last associated with
         uint32
      HAMode
         The HA mode used
         HAModeValues
      AllCEs
         The table of all CEs
         AllCEType

      SupportableVersions
         the table of ForCES versions that FE supports
         uchar
      HACapabilities
         the table of HA capabilities the FE supports
         FEHACapab

      PrimaryCEDown
         The primary CE has changed
         LastCEID
         LastCEID
      PrimaryCEChanged
         A New primary CE has been selected
         CEID
         CEID

Authors' Addresses

   Kentaro Ogawa
   NTT Corporation
   3-9-11 Midori-cho
   Musashino-shi, Tokyo  180-8585
   Japan

   Email: k.ogawa@ntt.com


   Weiming Wang
   Zhejiang Gongshang University
   149 Jiaogong Road
   Hangzhou  310035
   P.R.China

   Phone: +86-571-88057712
   Email: wmwang@mail.zjgsu.edu.cn


   Evangelos Haleplidis
   University of Patras
   Panepistimioupoli Patron
   Patras  26504
   Greece

   Email: ehalep@ece.upatras.gr


   Jamal Hadi Salim
   Mojatatu Networks
   Suite 400, 303 Moodie Dr.
   Ottawa, Ontario  K2H 9R4
   Canada

   Email: hadi@mojatatu.com