Network Working Group                                           K. Ogawa
Internet-Draft                                           NTT Corporation
Updates: 5810 (if approved)                                   W. M. Wang
Intended status: Standards Track           Zhejiang Gongshang University
Expires: May 24, 2014                                      E. Haleplidis
                                                    University of Patras
                                                           J. Hadi Salim
                                                       Mojatatu Networks
                                                       November 20, 2013

                   ForCES Intra-NE High Availability
                       draft-ietf-forces-ceha-09

Abstract

   This document discusses Control Element High Availability within a
   ForCES Network Element.  Additionally, this document updates RFC 5810
   by providing new normative text for the Cold-Standby High
   Availability mechanism.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 24, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the Trust
   Legal Provisions and are provided without warranty as described in
   the Simplified BSD License.

Table of Contents

   1.  Definitions
   2.  Introduction
     2.1.  Document Scope
     2.2.  Quantifying Problem Scope
   3.  RFC5810 CE HA Framework
     3.1.  RFC 5810 CE HA Support
       3.1.1.  Cold Standby Interaction with ForCES Protocol
       3.1.2.  Responsibilities for HA
   4.  CE HA Hot Standby
     4.1.  Changes to the FEPO model
     4.2.  FEPO processing
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  New FEPO version
   Authors' Addresses

1.  Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The following definitions are taken from [RFC3654], [RFC3746] and
   [RFC5810].
   They are repeated here for convenience as needed, but the normative
   definitions are found in the referenced RFCs:

   o  Logical Functional Block (LFB) -- A template that represents a
      fine-grained, logically separate aspect of FE processing.

   o  Forwarding Element (FE) - A logical entity that implements the
      ForCES Protocol.  FEs use the underlying hardware to provide
      per-packet processing and handling as directed by a CE via the
      ForCES Protocol.

   o  Control Element (CE) - A logical entity that implements the ForCES
      Protocol and uses it to instruct one or more FEs on how to process
      packets.  CEs handle functionality such as the execution of
      control and signaling protocols.

   o  ForCES Network Element (NE) - An entity composed of one or more
      CEs and one or more FEs.  An NE usually hides its internal
      organization from external entities and represents a single point
      of management to entities outside the NE.

   o  FE Manager (FEM) - A logical entity that operates in the
      pre-association phase and is responsible for determining to which
      CE(s) an FE should communicate.  This process is called CE
      discovery and may involve the FE manager learning the capabilities
      of available CEs.

   o  CE Manager - A logical entity that operates in the pre-association
      phase and is responsible for determining to which FE(s) a CE
      should communicate.  This process is called FE discovery and may
      involve the CE manager learning the capabilities of available FEs.

   o  ForCES Protocol -- The protocol used for communication between CEs
      and FEs.  This protocol does not apply to CE-to-CE communication,
      FE-to-FE communication, or to communication between FE and CE
      managers.  The ForCES protocol is a master-slave protocol in which
      FEs are slaves and CEs are masters.
      This protocol includes both the management of the communication
      channel (e.g., connection establishment, heartbeats) and the
      control messages themselves.

   o  ForCES Protocol Layer (ForCES PL) -- A layer in the ForCES
      protocol architecture that defines the ForCES protocol messages,
      the protocol state transfer scheme, and the ForCES protocol
      architecture itself (including requirements of ForCES TML as shown
      below).  Specifications of ForCES PL are defined in [RFC5810].

   o  ForCES Protocol Transport Mapping Layer (ForCES TML) -- A layer in
      the ForCES protocol architecture that specifically addresses the
      protocol message transportation issues, such as how the protocol
      messages are mapped to different transport media (like SCTP, IP,
      TCP, UDP, ATM, Ethernet, etc.), and how to achieve and implement
      reliability, security, etc.

2.  Introduction

   Figure 1 illustrates a ForCES NE controlled by a set of redundant CEs
   with CE1 being active and CE2 and CEn being backups.

                      -----------------------------------------
                      |  ForCES Network Element               |
                      |                        +-----------+  |
                      |                        |  CEn      |  |
                      |                        |  (Backup) |  |
--------------   Fc   | +------------+      +------------+ |  |
| CE Manager |--------+-|    CE1     |------|    CE2     |-+  |
--------------        | |  (Active)  |  Fr  |  (Backup)  |    |
      |               | +-------+--+-+      +---+---+----+    |
      | Fl            |         |  |    Fp      /   |         |
      |               |         |  +---------+ /    |         |
      |               |       Fp|            |/     |Fp       |
      |               |         |            |      |         |
      |               |         |     Fp    /+--+   |         |
      |               |         |  +-------+    |   |         |
      |               |         |  |            |   |         |
--------------    Ff  | --------+--+--      ----+---+----+    |
| FE Manager |--------+-|    FE1     |  Fi  |     FE2    |    |
--------------        | |            |------|            |    |
                      | --------------      --------------    |
                      |   |  |  |  |          |  |  |  |      |
                      ----+--+--+--+----------+--+--+--+-------
                          |  |  |  |          |  |  |  |
                          |  |  |  |          |  |  |  |
                             Fi/f                Fi/f

          Fp: CE-FE interface
          Fi: FE-FE interface
          Fr: CE-CE interface
          Fc: Interface between the CE Manager and a CE
          Ff: Interface between the FE Manager and an FE
          Fl: Interface between the CE Manager and the FE Manager
          Fi/f: FE external interface

                     Figure 1: ForCES Architecture

   The ForCES architecture allows FEs to be aware of multiple CEs but
   enforces that only one CE be the master controller.  This is known in
   the industry as 1+N redundancy.  The master CE controls the FEs via
   the ForCES protocol operating on the Fp interface.  If the master CE
   becomes faulty, i.e., it crashes or loses connectivity, a backup CE
   takes over and NE operation continues.  By definition, the currently
   documented setup is known as cold-standby.  The set of CEs
   controlling an FE is static and is passed to the FE by the FE Manager
   (FEM) via the Ff interface and to each CE by the CE Manager (CEM) via
   the Fc interface during the pre-association phase.

   From an FE perspective, the knobs of control for a CE set are defined
   by the FEPO LFB in [RFC5810], Appendix B.  In Section 3.1 of this
   document we discuss further details of these knobs.

2.1.  Document Scope

   It is assumed that the reader is familiar with the ForCES
   architecture, which is needed to make sense of the changes described
   in this document.  This document provides background information to
   set the context of the discussion in Section 4.

   At the time this document is being written, the Fr interface is out
   of scope for the ForCES architecture.  However, it is expected that
   organizations implementing a set of CEs will need to have the CEs
   communicate with each other via the Fr interface in order to achieve
   the synchronization necessary for controlling the FEs.

   The problem scope addressed by this document falls into two areas:

   1.  To update the description of [RFC5810] with more clarity on how
       the current cold-standby approach operates within the NE cluster.

   2.  To describe how to evolve the [RFC5810] cold-standby setup to a
       hot-standby redundancy setup to improve the failover time and NE
       availability.

2.2.  Quantifying Problem Scope

   NE recovery and availability depend on several time-sensitive
   metrics:

   1.  How fast the CE plane failure is detected by the FE.

   2.  How fast a backup CE becomes operational.

   3.  How fast the FEs associate with the new master CE.

   4.  How fast the FEs recover their state, and become operational.
       Each FE state is the collective state of all its instantiated
       LFBs.

   The design choices made in [RFC5810] and in this document to meet
   the above goals are driven by a desire for simplicity.

   To quantify the above criteria with the ForCES CE setup currently
   prescribed in [RFC5810]:

   1.  How fast the FE side detects a CE failure is left undefined.  To
       illustrate an extreme scenario, we could have a human operator
       acting as the monitoring entity to detect faulty CEs.  How fast
       such detection happens could be in the range of seconds to days.
       A more active monitor on the Fp interface could improve this
       detection.
       Usually the FE will detect a CE failure either through the TML,
       if the Fp interface terminates, or through the ForCES Protocol,
       by means of the ForCES heartbeat mechanism.

   2.  How fast the backup CE becomes operational is also currently out
       of scope.  In the current setup, a backup CE need not be
       operational at all (for example, to save power) and therefore it
       is feasible for a monitoring entity to boot up a backup CE after
       it detects the failure of the master CE.  In Section 4 of this
       document we suggest that at least one backup CE be online so as
       to improve this metric.

   3.  How fast an FE associates with the new master CE is also
       currently undefined.  The cost of an FE connecting and
       associating adds to the recovery overhead.  As mentioned above,
       we suggest having at least one backup CE online.  In Section 4 we
       propose to zero out the connection and association cost on
       failover by having each FE associate with all online backup CEs
       after associating with an active/master CE.  Note that if an FE
       pre-associates with at least one backup CE, then the system will
       be technically operating in hot-standby mode.

   4.  And last: How fast an FE recovers its state depends on how much
       NE state exists.  By the current ForCES definition, the new
       master CE assumes zero state on the FE and starts from scratch to
       update the FE.  So the larger the state, the longer the recovery.

3.  RFC5810 CE HA Framework

   To achieve CE High Availability (HA), FEs and CEs MUST interoperate
   as defined in [RFC5810]; that definition is repeated for context in
   Section 3.1.  It should be noted that in this default setup, which
   MUST be implemented by CEs and FEs requiring HA, the Fr plane is out
   of scope (and, if available, is proprietary to an implementation).

3.1.  RFC 5810 CE HA Support

   As mentioned earlier, although there can be multiple redundant CEs,
   only one CE actively controls FEs in a ForCES NE.
   In practice there may be only one backup CE.  At any moment in time,
   only one master CE can control an FE.  In addition, the FE connects
   and associates to only the master CE.  The FE and the CE are aware of
   the primary and one or more secondary CEs.  This information
   (primary, secondary CEs) is configured on the FE and the CE during
   pre-association by the FEM and the CEM respectively.

   This section includes a new normative description that updates
   [RFC5810] for the Cold-Standby High Availability mechanism.

   Figure 2 below illustrates the ForCES message sequences that the FE
   uses to recover the connection in the currently defined cold-standby
   scheme.

       FE                      CE Primary           CE Secondary
        |                           |                     |
        | Association Establishment |                     |
        |   Capabilities Exchange   |                     |
      1 |<------------------------->|                     |
        |                           |                     |
        |       State Update        |                     |
      2 |<------------------------->|                     |
        |                           |                     |
        |                           |                     |
        |                        FAILURE                  |
        |                                                 |
        | Association Establishment,Capabilities Exchange |
      3 |<----------------------------------------------->|
        |                                                 |
        |         Event Report (primary CE down)          |
      4 |------------------------------------------------>|
        |                                                 |
        |                  State Update                   |
      5 |<----------------------------------------------->|

                 Figure 2: CE Failover for Cold Standby

3.1.1.  Cold Standby Interaction with ForCES Protocol

   HA parameterization in an FE is driven by configuring the FE Protocol
   Object (FEPO) LFB.

   The FEPO CEID component identifies the current master CE and the
   component table BackupCEs identifies the configured backup CEs.  The
   FEPO FE Heartbeat Interval, CE Heartbeat Dead Interval, and CE
   Heartbeat policy help in detecting connectivity problems between an
   FE and CE.  The CE Failover policy defines how the FE should react to
   a detected failure.  The FEObject FEState component [RFC5812] defines
   the operational forwarding status and control.
   The CE can turn off the FE's forwarding operations by setting the
   FEState to AdminDisable and can turn it on by setting it to
   OperEnable.  Note: [RFC5812] section 5.1 has an erratum, which
   describes the FEState as read-only when it should be read-write.

   Figure 3 illustrates the defined state machine that facilitates the
   recovery of connection state.

   The FE connects to the CE specified in the FEPO CEID component.  If
   the FE fails to connect to that CE, it moves that CE to the bottom of
   table BackupCEs and sets its CEID component to be the first CE
   retrieved from table BackupCEs.  The FE then attempts to associate
   with the CE designated as the new primary CE.  The FE continues
   through this procedure until it successfully connects to one of the
   CEs or until the CE Failover Timeout Interval (CEFTI) expires.

                        FE tries to associate
                             +-->-----+
                             |        |
  (CE changes master ||      |        |
   CE issues Teardown || +---+--------v----+
    Lost association) && | Pre-Association |
  CE failover policy = 0 | (Association    |
     +------------>-->-->|   in            +<----+
     |                   | progress)       |     |
     |                   |                 |     |
     |                   +--------+--------+     |
     |       CE Association       |              | CEFTI
     |          Response          V              | timer
     |         +------------------+              | expires
     |         |FE issue CEPrimaryDown           ^
     |         V                                 |
   +-+-----------+                        +------+-----+
   |             |  (CE changes master || |    Not     |
   |             |  CE issues Teardown || | Associated |
   |             |  Lost association) &&  |            +->---+
   | Associated  | CE Failover Policy = 1 |(May        | FE  |
   |             |                        | Continue   | try v
   |             |-------->------->------>| Forwarding)| assn|
   |             |   Start CEFTI timer    |            |-<---+
   |             |                        |            |
   +-------------+                        +-------+----+
        ^                                         |
        |            Successful                   V
        |            Association                  |
        |            Setup                        |
        |            (Cancel CEFTI Timer)         |
        +_________________________________________+
                FE issue CEPrimaryDown event

               Figure 3: FE State Machine considering HA

   There are several events that trigger mastership changes: The master
   CE may issue a mastership
   change (by changing the CEID component), or tear down an existing
   association; and last, connectivity may be lost between the CE and
   FE.

   When communication fails between the FE and CE (which can be caused
   by a CE failure or a link failure, but is not FE related), either the
   TML on the FE will notify the FE PL of the failure or the failure
   will be detected using the heartbeat messages between FEs and CEs.
   The communication failure, regardless of how it is detected, MUST be
   considered as a loss of association between the CE and the
   corresponding FE.

   If the FE's FEPO CE Failover Policy is configured to mode 0 (the
   default), the FE will immediately transition to the pre-association
   phase.  This means that if association is later re-established with a
   CE, all FE state will need to be re-created.

   If the FE's FEPO CE Failover Policy is configured to mode 1, it
   indicates that the FE will run in HA restart recovery.  In such a
   case, the FE transitions to the Not Associated state and the CEFTI
   timer [RFC5810] is started.  The FE may continue to forward packets
   during this state depending upon the value of the CEFailoverPolicy
   component of the FEPO LFB.  The FE cycles through any configured
   backup CEs in a round-robin fashion.  It first adds its primary CE to
   the bottom of table BackupCEs and sets its CEID component to be the
   first secondary retrieved from table BackupCEs.  The FE then attempts
   to associate with the CE designated as the new primary CE.  If it
   fails to re-associate with any CE and the CEFTI expires, the FE then
   transitions to the pre-association state and operationally brings
   down its forwarding path (and sets the [RFC5812] FEObject FEState
   component to OperDisable).

   If the FE, while in the Not Associated state, manages to reconnect to
   a new primary CE before the CEFTI expires, it transitions to the
   Associated state.
   Once re-associated, the CE may try to synchronize any state that the
   FE may have lost during disconnection.  How the CE re-synchronizes
   such state is out of scope for the current ForCES architecture but
   would typically consist of issuing new configs and queries.

   An explicit message (a Config message setting the Primary CE
   component in the ForCES Protocol object) from the primary CE can also
   be used to change the Primary CE for an FE during normal protocol
   operation.  In this case, the FE transitions to the Not Associated
   state and attempts to associate with the new CE.

3.1.2.  Responsibilities for HA

   TML Level:

   1.  The TML controls logical connection availability and failover.

   2.  The TML also controls peer HA management.

   At this level, control of all lower layers, for example the transport
   level (such as IP addresses, MAC addresses, etc.) and the handling of
   associated links going down, is the role of the TML.

   PL Level:

   All other functionality, including configuring the HA behavior during
   setup, the Control Element IDs (CE IDs) used to identify primary and
   secondary CEs, protocol messages used to report CE failure (Event
   Report), Heartbeat messages used to detect association failure,
   messages to change the primary CE (Config), and other HA related
   operations described in Section 3.1, are the PL's responsibility.

   To put the two together, if a path to a primary CE is down, the TML
   would help recover from a failure by switching over to a backup path,
   if one is available.  If the CE is totally unreachable then the PL
   would be informed and it would take the appropriate actions described
   before.

4.  CE HA Hot Standby

   In this section we describe small extensions to the existing scheme
   to enable hot standby HA.  To achieve hot standby HA, we aim to
   improve two of the specific metrics defined in Section 2.2, namely:

   o  How fast a backup CE becomes operational.

   o  How fast the FEs associate with the new master CE.

   As described in Section 3.1, in the pre-association phase the FEM
   configures the FE to make it aware of all the CEs in the NE.  The FEM
   MUST configure the FE to make it aware which CE is the master and MAY
   specify any backup CE(s).

4.1.  Changes to the FEPO model

   To achieve the above, a few changes to the FEPO model are needed.
   Appendix A contains the XML definition of the new version 1.1 of the
   FEPO LFB.

   The changes from version 1.0 of the FEPO are:

   1.  Added four new datatypes:

       1.  CEStatusType, an unsigned char that specifies the status of a
           connection with a CE.  Special values are:

           +  0 (Disconnected) represents that no connection attempt has
              been made with the CE yet

           +  1 (Connected) represents that the FE connection with the
              CE at the TML has completed successfully

           +  2 (Associated) represents that the FE has successfully
              associated with the CE

           +  3 (IsMaster) represents that the FE has associated with
              the CE and that the CE is the master of the FE

           +  4 (LostConnection) represents that the FE was associated
              with the CE at one point but lost the connection

           +  5 (Unreachable) represents that the FE deems this CE
              unreachable, i.e., the FE has tried over a period to
              connect to it but has failed.

       2.  HAModeValues, an unsigned char that specifies the selected
           HA mode.  Special values are:

           +  0 (No HA Mode) represents that the FE is not running in HA
              mode

           +  1 (HA Mode - Cold Standby) represents that the FE is in HA
              mode cold standby

           +  2 (HA Mode - Hot Standby) represents that the FE is in HA
              mode hot standby

       3.  Statistics, a complex structure, representing the
           communication statistics between the FE and CE.
           The components are:

           +  RecvPackets representing the packet count received from
              the CE

           +  RecvBytes representing the byte count received from the CE

           +  RecvErrPackets representing the erroneous packets received
              from the CE.  This component logs badly formatted packets
              as well as good packets sent to the FE by the CE to set
              components whilst that CE is not the master.  Erroneous
              packets are dropped (i.e., not responded to).

           +  RecvErrBytes representing the RecvErrPackets byte count
              received from the CE

           +  TxmitPackets representing the packet count transmitted to
              the CE

           +  TxmitErrPackets representing the error packet count
              transmitted to the CE.  Typically these would be failures
              due to communication.

           +  TxmitBytes representing the byte count transmitted to the
              CE

           +  TxmitErrBytes representing the TxmitErrPackets byte count
              transmitted to the CE

       4.  AllCEType, a complex structure comprising the CE ID,
           Statistics and CEStatusType to reflect connection information
           for one CE.  Used in the AllCEs component array.

   2.  Appended two new components:

       1.  Read-only AllCEs to hold status for all CEs.  AllCEs is an
           Array of the AllCEType.

       2.  Read-write HAMode of type HAModeValues to carry the HA mode
           used by the FE.

   3.  Added one additional Event, PrimaryCEChanged, reporting the new
       master CE ID when there is a mastership change.

   Since no component from FEPO v1.0 has been changed, FEPO v1.1
   retains backwards compatibility with CEs that know only version 1.0.
   These CEs, however, cannot make use of the HA options that the new
   FEPO provides.

4.2.  FEPO processing

   The FE's FEPO LFB version 1.1 AllCEs table contains all the CE IDs
   that the FE may connect and associate with.  The ordering of the CE
   IDs in this table defines the priority order in which an FE will
   connect to the CEs.
   This table is provisioned initially from the configuration plane
   (FEM).  In the pre-association phase, the first CE (lowest table
   index) in the AllCEs table MUST be the first CE that the FE will
   attempt to connect and associate with.  If the FE fails to connect
   and associate with the first listed CE, it will attempt to connect to
   the second CE and so forth, cycling back to the beginning of the list
   until there is a successful association.  The FE MUST associate with
   at least one CE.  Upon a successful association, a component of the
   FEPO LFB, specifically the CEID component, identifies the current
   associated master CE.

   While it would be much simpler to have the FE not respond to any
   messages from a CE other than the master, in practice it has been
   found to be useful to respond to queries and heartbeats from backup
   CEs.  For this reason, we allow backup CEs to issue queries to the
   FE.  Configuration messages (SET/DEL) from backup CEs MUST be dropped
   by the FE and logged as received errors.

   Asynchronous events that the master CE has subscribed to, as well as
   heartbeats, are sent to all associated CEs.  Packet redirects
   continue to be sent only to the master CE.  The Heartbeat Interval,
   the CE Heartbeat Policy (CEHB) and the FE Heartbeat Policy (FEHB) are
   global for all CEs (and changed only by the master CE).

   Figure 4 illustrates the state machine that facilitates connection
   recovery with HA enabled.

                        FE tries to associate
                             +-->-----+
                             |        |
  (CE changes master ||      |        |
   CE issues Teardown || +---+--------v----+
    Lost association) && | Pre-Association |
  CE failover policy = 0 | (Association    |
     +------------>-->-->|   in            +<----+
     |                   | progress)       |     |
     |                   |                 |     |
     |                   +--------+--------+     |
     |       CE Association       |              | CEFTI
     |          Response          V              | timer
     |         +------------------+              | expires
     |         |FE issue CEPrimaryDown           ^
     |         |FE issue PrimaryCEChanged        ^
     |         V                                 |
   +-+-----------+                        +------+-----+
   |             |  (CE changes master || |    Not     |
   |             |  CE issues Teardown || | Associated |
   |             |  Lost association) &&  |            +->----------+
   | Associated  | CE Failover Policy = 1 |(May        | find first |
   |             |                        | Continue   | associated v
   |             |-------->------->------>| Forwarding)| CE or retry|
   |             |   Start CEFTI timer    |            | associating|
   |             |                        |            |-<----------+
   |             |                        |            |
   +----+--------+                        +-------+----+
        |                                         |
        ^                                   Found | associated CE
        |                                or newly | associated CE
        |                                         V
        |            (Cancel CEFTI Timer)         |
        +_________________________________________+
                FE issue CEPrimaryDown event
                FE issue PrimaryCEChanged event

               Figure 4: FE State Machine considering HA

   Once the FE has associated with a master CE it moves to the
   post-association phase (Associated state).  It is assumed that the
   master CE will communicate with other CEs within the NE for the
   purpose of synchronization via the CE-CE interface.  The CE-CE
   interface is out of scope for this document.  An election amongst CEs
   may result in a desire to change mastership to a different associated
   CE, at which point the current master CE will instruct the FE to use
   a different master CE.

       FE                         CE#1         CE#2 ... CE#N
        |                           |            |        |
        | Association Establishment |            |        |
        |   Capabilities Exchange   |            |        |
      1 |<------------------------->|            |        |
        |                           |            |        |
        |       State Update        |            |        |
      2 |<------------------------->|            |        |
        |                           |            |        |
        |       Association Establishment        |        |
        |         Capabilities Exchange          |        |
      3I|<-------------------------------------->|        |
       ...                         ...          ...      ...
        | Association Establishment,Capabilities Exchange |
      3N|<----------------------------------------------->|
        |                           |            |        |
      4 |<------------------------->|            |        |
        .                           .            .        .
      4x|<------------------------->|            |        |
        |                        FAILURE         |        |
        |                           |            |        |
        |    Event Report (LastCEID changed)     |        |
      5 |--------------------------------------->|------->|
        |   Event Report (CE#2 is new master)    |        |
      6 |--------------------------------------->|------->|
        |                                        |        |
      7 |<-------------------------------------->|        |
        .                           .            .        .
      7x|<-------------------------------------->|        |
        .                           .            .        .

                 Figure 5: CE Failover for Hot Standby

   While in the post-association phase, if the CE Failover Policy is set
   to 1 and HAMode is set to 2 (HotStandby), then the FE, after
   successfully associating with the master CE, MUST attempt to connect
   and associate with all the CEs that it is aware of.  In Figure 5,
   steps #1 and #2 illustrate the FE associating with CE#1 as the
   master, and steps #3I to #3N illustrate the subsequent association
   with backup CEs CE#2 to CE#N.  If the FE fails to connect or
   associate with some CEs, the FE MAY flag them as unreachable to avoid
   continuous attempts to connect.  The FE MAY retry associating with
   unreachable CEs when possible.

   When the master CE for any reason is considered to be down, then the
   FE MUST try to find the first associated CE from the list of all CEs
   in a round-robin fashion.
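
   The selection rule above can be sketched as follows.  This is a
   minimal illustration and not part of the normative text: the
   CEStatus values mirror the CEStatusType values of Section 4.1, while
   the CEEntry class, the select_new_master function, and the example CE
   IDs are hypothetical names chosen for this sketch.  It also assumes
   that the round-robin scan starts at the entry following the failed
   master and wraps around, which the text does not state explicitly.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class CEStatus(IntEnum):
    """CEStatusType values as listed in Section 4.1."""
    DISCONNECTED = 0
    CONNECTED = 1
    ASSOCIATED = 2
    IS_MASTER = 3
    LOST_CONNECTION = 4
    UNREACHABLE = 5


@dataclass
class CEEntry:
    """One row of the AllCEs table (hypothetical in-memory form)."""
    ce_id: int
    status: CEStatus


def select_new_master(all_ces: List[CEEntry], failed: int) -> Optional[int]:
    """Return the index of the first associated CE after the failed master.

    Assumption: the scan starts at the entry following the failed master
    and wraps around the table once.  Returns None when no backup CE is
    currently associated.
    """
    # The old master is no longer usable; record that in the table.
    all_ces[failed].status = CEStatus.LOST_CONNECTION
    n = len(all_ces)
    for step in range(1, n):
        idx = (failed + step) % n
        if all_ces[idx].status == CEStatus.ASSOCIATED:
            # Promote the first already-associated backup CE.
            all_ces[idx].status = CEStatus.IS_MASTER
            return idx
    return None


# Example table: the master fails, one backup is unreachable and one
# backup is already associated (CE IDs are made up for illustration).
table = [
    CEEntry(0x40000001, CEStatus.IS_MASTER),
    CEEntry(0x40000002, CEStatus.UNREACHABLE),
    CEEntry(0x40000003, CEStatus.ASSOCIATED),
]
new_master = select_new_master(table, failed=0)
print(new_master, table[new_master].status.name)
```

   Because the backup CE is already associated, promoting it involves
   only a table lookup; no connection or association exchange is needed
   at failover time.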

   If the FE is unable to find an associated CE in its list of CEs, then
   it MUST attempt to connect and associate with the first CE in the
   list of all CEs and continue in a round-robin fashion until it
   connects and associates with a CE or the CEFTI timer expires.

   Once the FE selects an associated CE to use as the new master, the FE
   issues a PrimaryCEDown Event Notification to all associated CEs to
   notify them that the last primary CE went down (and what its identity
   was); a second event, PrimaryCEChanged, is sent as well to identify
   which CE the reporting FE considers to be the new master.

   In most HA architectures there exists the possibility of split-brain.
   However, since in our setup the FE will never accept any
   configuration messages from any CE other than the master CE, we
   consider the FE as fenced against data corruption from other CEs that
   consider themselves the master.  The split-brain issue becomes mostly
   a CE-CE communication problem, which is considered to be out of
   scope.

   By virtue of having multiple CE connections, the FE switchover to a
   new master CE will be faster than in the cold-standby case, because
   the FE is already connected to and associated with the backup CEs.
   The overall effect is an improved NE recovery time in case of
   communication failure or faults of the master CE.  This satisfies the
   requirement we set out to achieve.

5.  IANA Considerations

   Following the policies outlined in "Guidelines for Writing an IANA
   Considerations Section in RFCs" [RFC5226], the Logical Functional
   Block (LFB) Class Names and Class Identifiers namespace is updated.

   A new column, LFB version, is added to the table after the LFB Class
   Name.
   The table now reads as follows:

   +----------------+---------+-----------+---------------+------------+
   | LFB Class      | LFB     | LFB       | Description   | Reference  |
   | Identifier     | Class   | Version   |               |            |
   |                | Name    |           |               |            |
   +----------------+---------+-----------+---------------+------------+
   +----------------+---------+-----------+---------------+------------+

     Logical Functional Block (LFB) Class Names and Class Identifiers

   The same rules apply as defined in [RFC5812], with the addition that
   entries must provide the LFB version as a string.

   Upon publication of this document, all current entries are assigned a
   value of 1.0.

   New versions of an already defined LFB MUST NOT remove the previous
   version entries.

   LFB versions should appear in sequence in the registry.  The table
   SHOULD be sorted, and the sorting should be done by Class ID first
   and then by version.

   This document introduces the FE Protocol Object version 1.1 as
   follows:

   +--------------+------------+---------+-----------------+-----------+
   | LFB Class    | LFB Class  | LFB     | Description     | Reference |
   | Identifier   | Name       | Version |                 |           |
   +--------------+------------+---------+-----------------+-----------+
   | 2            | FE         | 1.1     | Defines         | This      |
   |              | Protocol   |         | parameters for  | document  |
   |              | Object     |         | the ForCES      |           |
   |              |            |         | protocol        |           |
   |              |            |         | operation       |           |
   +--------------+------------+---------+-----------------+-----------+

     Logical Functional Block (LFB) Class Names and Class Identifiers

6.  Security Considerations

   The security considerations defined in Section 9 of [RFC5810] apply
   to securing each CE-FE communication.  Multiple CEs associated with
   the same FE still require the same procedure to be followed on a
   per-association basis.

   It should be noted that since the FE initiates the association with a
   CE, a CE cannot initiate an association with the FE, and any such
   message will be dropped.  Thus the FE is secured from rogue or
   malfunctioning CEs.

   While the CE-CE plane is outside the current scope of ForCES, we
   recognize that it may be subjected to attacks that may affect the
   CE-FE communication.

   The following considerations should be made:

   1.  CEs should use secure communication channels between them for
       coordination and state keeping, at least to prevent malicious
       CEs from connecting.

   2.  The master CE should take into account DoS and DDoS attacks from
       malicious or malfunctioning CEs.

   3.  CEs should take into account the split-brain issue.  There are
       currently two fail-safes in the FE: first, the FE has the CEID
       component that denotes which CE is the master; second, the FE
       does not allow backup CEs to configure the FE.  However, backup
       CEs that consider the master CE to be down and themselves to be
       the master should first do a sanity check and query the FE CEID
       component.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

   [RFC5810]  Doria, A., Hadi Salim, J., Haas, R., Khosravi, H., Wang,
              W., Dong, L., Gopal, R., and J. Halpern, "Forwarding and
              Control Element Separation (ForCES) Protocol
              Specification", RFC 5810, March 2010.

   [RFC5812]  Halpern, J. and J. Hadi Salim, "Forwarding and Control
              Element Separation (ForCES) Forwarding Element Model",
              RFC 5812, March 2010.

7.2.  Informative References

   [RFC3654]  Khosravi, H. and T. Anderson, "Requirements for Separation
              of IP Control and Forwarding", RFC 3654, November 2003.

   [RFC3746]  Yang, L., Dantu, R., Anderson, T., and R. Gopal,
              "Forwarding and Control Element Separation (ForCES)
              Framework", RFC 3746, April 2004.

Appendix A.  New FEPO version

   The XML has been validated against the schema defined in [RFC5812].

   CEHBPolicyValues
      The possible values of CE heartbeat policy
      uchar
         CEHBPolicy0
            The CE will send heartbeats to the FE every CEHDI timeout
            if no other messages have been sent since.
         CEHBPolicy1
            The CE will not send heartbeats to the FE

   FEHBPolicyValues
      The possible values of FE heartbeat policy
      uchar
         FEHBPolicy0
            The FE will not generate any heartbeats to the CE
         FEHBPolicy1
            The FE generates heartbeats to the CE every FEHI if no
            other messages have been sent to the CE.

   FERestartPolicyValues
      The possible values of FE restart policy
      uchar
         FERestartPolicy0
            The FE restarts its state from scratch

   HAModeValues
      The possible values of HA modes
      uchar
         NoHA
            The FE is not running in HA mode
         ColdStandby
            The FE is running in HA mode cold Standby
         HotStandby
            The FE is running in HA mode hot Standby

   CEFailoverPolicyValues
      The possible values of CE failover policy
      uchar
         CEFailoverPolicy0
            The FE should stop functioning immediately and transition
            to the FE OperDisable state
         CEFailoverPolicy1
            The FE should continue forwarding even without an
            associated CE for CEFTI.  The FE goes to FE OperDisable
            when the CEFTI expires and there is no association.
            Requires graceful restart support.

   FEHACapab
      The supported HA features
      uchar
         GracefullRestart
            The FE supports Graceful Restart
         HA
            The FE supports HA

   CEStatusType
      Status values.  Status for each CE
      uchar
         Disconnected
            No connection attempt with the CE yet
         Connected
            The FE connection with the CE at the TML has been completed
         Associated
            The FE has associated with the CE
         IsMaster
            The CE is the master (and associated)
         LostConnection
            The FE was associated with the CE but lost the connection
         Unreachable
            The CE is deemed as unreachable by the FE

   StatisticsType
      Statistics Definition
      RecvPackets
         Packets Received
         uint64
      RecvErrPackets
         Packets Received from CE with errors
         uint64
      RecvBytes
         Bytes Received from CE
         uint64
      RecvErrBytes
         Bytes Received from CE in Error
         uint64
      TxmitPackets
         Packets Transmitted to CE
         uint64
      TxmitErrPackets
         Packets Transmitted to CE that incurred errors
         uint64
      TxmitBytes
         Bytes Transmitted to CE
         uint64
      TxmitErrBytes
         Bytes Transmitted to CE incurring errors
         uint64

   AllCEType
      Table Type for AllCE component
      CEID
         ID of the CE
         uint32
      Statistics
         Statistics per CE
         StatisticsType
      CEStatus
         Status of the CE
         CEStatusType

   FEPO
      The FE Protocol Object, with new CEHA
      1.1
      CurrentRunningVersion
         Currently running ForCES version
         uchar
      FEID
         Unicast FEID
         uint32
      MulticastFEIDs
         the table of all multicast IDs
         uint32
      CEHBPolicy
         The CE Heartbeat Policy
         CEHBPolicyValues
      CEHDI
         The CE Heartbeat Dead Interval in millisecs
         uint32
      FEHBPolicy
         The FE Heartbeat Policy
         FEHBPolicyValues
      FEHI
         The FE Heartbeat Interval in millisecs
         uint32
      CEID
         The Primary CE this FE is associated with
         uint32
      BackupCEs
         The table of all backup CEs other than the primary
         uint32
      CEFailoverPolicy
         The CE Failover Policy
         CEFailoverPolicyValues
      CEFTI
         The CE Failover Timeout Interval in millisecs
         uint32
      FERestartPolicy
         The FE Restart Policy
         FERestartPolicyValues
      LastCEID
         The Primary CE this FE was last associated with
         uint32
      HAMode
         The HA mode used
         HAModeValues
      AllCEs
         The table of all CEs
         AllCEType

      SupportableVersions
         the table of ForCES versions that FE supports
         uchar
      HACapabilities
         the table of HA capabilities the FE supports
         FEHACapab

      PrimaryCEDown
         The primary CE has changed
         LastCEID
         LastCEID
      PrimaryCEChanged
         A New primary CE has been selected
         CEID
         CEID

Authors' Addresses

   Kentaro Ogawa
   NTT Corporation
   3-9-11 Midori-cho
   Musashino-shi, Tokyo  180-8585
   Japan

   Email: k.ogawa@ntt.com


   Weiming Wang
   Zhejiang Gongshang University
   149 Jiaogong Road
   Hangzhou  310035
   P.R.China

   Phone: +86-571-88057712
   Email: wmwang@mail.zjgsu.edu.cn


   Evangelos Haleplidis
   University of Patras
   Panepistimioupoli Patron
   Patras  26504
   Greece

   Email: ehalep@ece.upatras.gr


   Jamal Hadi Salim
   Mojatatu Networks
   Suite 400, 303 Moodie Dr.
   Ottawa, Ontario  K2H 9R4
   Canada

   Email: hadi@mojatatu.com