BESS Working Group                                     P. Brissette, Ed.
Internet-Draft                                                A. Sajassi
Intended status: Standards Track                              LA. Burdet
Expires: November 27, 2021                                     S. Thoria
                                                           Cisco Systems
                                                                  B. Wen
                                                                 Comcast
                                                               E. Leyton
                                                        Verizon Wireless
                                                              J. Rabadan
                                                                   Nokia
                                                            May 26, 2021


              EVPN multi-homing port-active load-balancing
                     draft-ietf-bess-evpn-mh-pa-02

Abstract

   The Multi-Chassis Link Aggregation Group (MC-LAG) technology enables
   establishing a logical link-aggregation connection with a redundant
   group of independent nodes. The purpose of multi-chassis LAG is to
   provide a solution to achieve higher network availability, while
   providing different modes of sharing/balancing of traffic. The EVPN
   standard defines EVPN-based MC-LAG with single-active and all-active
   multi-homing load-balancing modes. This draft expands on the
   existing redundancy mechanisms supported by EVPN and introduces
   support for the port-active load-balancing mode. In this document,
   port-active load-balancing mode is also referred to as per-interface
   active/standby.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 27, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document. Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Multi-Chassis Ethernet Bundles
   3.  Port-active load-balancing procedure
   4.  Algorithm to elect per port-active PE
     4.1.  Capability Flag
     4.2.  Modulo-based Designated Forwarder Algorithm
     4.3.  HRW Algorithm
     4.4.  Preferred-DF Algorithm
   5.  Convergence considerations
     5.1.  Primary / Backup per Ethernet-Segment
     5.2.  Backward Compatibility
   6.  Applicability
   7.  Overall Advantages
   8.  Security Considerations
   9.  IANA Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   EVPN, as per [RFC7432], provides all-active per-flow load balancing
   for multi-homing. It also defines a single-active mode with service
   carving, where one of the PEs in the redundancy group is active per
   service.

   While these two multi-homing scenarios are the most widely used in
   data center and service provider access networks, there are scenarios
   where active-standby per-interface multi-homing redundancy is useful
   and required. The main consideration for this mode of redundancy is
   the determinism of traffic forwarding through a specific interface
   rather than statistical per-flow load balancing across multiple PEs
   providing multi-homing. The determinism provided by active-standby
   per interface is also required for certain QoS features to work.
   While using this mode, customers also expect minimal convergence
   time during failures. A new load-balancing mode, port-active
   load-balancing, is therefore defined.

   This draft describes how that new redundancy mode can be supported
   via EVPN.

                          +-----+
                          | PE3 |
                          +-----+
                       +-----------+
                       |  MPLS/IP  |
                       |   CORE    |
                       +-----------+
                     +-----+   +-----+
                     | PE1 |   | PE2 |
                     +-----+   +-----+
                        |         |
                        I1       I2
                         \       /
                          \     /
                           +---+
                           |CE1|
                           +---+

                     Figure 1: MC-LAG Topology

   Figure 1 shows an MC-LAG multi-homing topology where PE1 and PE2 are
   part of the same redundancy group providing multi-homing to CE1 via
   interfaces I1 and I2. Interfaces I1 and I2 are Bundle-Ethernet
   interfaces running LACP. The core, shown as IP or MPLS enabled,
   provides a wide range of L2 and L3 services. MC-LAG multi-homing
   functionality is decoupled from those services in the core and
   focuses on providing multi-homing to the CE. With per-port
   active/standby redundancy, only one of the two interfaces, I1 or I2,
   is in the forwarding state; the other interface is in standby. This
   also implies that all services on the active interface are in active
   mode and all services on the standby interface operate in standby
   mode.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Multi-Chassis Ethernet Bundles

   When a CE is multi-homed to a set of PE nodes using the [802.1AX]
   Link Aggregation Control Protocol (LACP), the PEs must act as if they
   were a single LACP speaker for the Ethernet links to form a bundle
   and operate as a Link Aggregation Group (LAG). To achieve this, the
   PEs connected to the same multi-homed CE must synchronize LACP
   configuration and operational data among them. The Inter-Chassis
   Communication Protocol (ICCP) has been used for that purpose. EVPN
   LAG greatly simplifies that solution. The simplification comes with
   a few assumptions:

   o  A CE device connected to multi-homing PEs may have a single LAG
      with all its links active, i.e., links in the Ethernet Bundle
      operate in all-active load-balancing mode.

   o  The same LACP parameters, such as system ID, port priority, and
      port key, MUST be configured on peering PEs.

   Any deviation from this list is left for future study. Furthermore,
   detection of mis-configuration and mis-wiring across peering PEs is
   also left for further study.

3.  Port-active load-balancing procedure

   The following steps describe the proposed procedure with EVPN LAG to
   support port-active load-balancing mode:

   a.  The Ethernet-Segment Identifier (ESI) MUST be assigned per access
       interface as described in [RFC7432]; it may be auto-derived or
       manually assigned. The access interface MAY be a Layer-2 or
       Layer-3 interface. The usage of an ESI over a Layer-3 interface
       is newly described in this document.

   b.  The Ethernet-Segment (ES) MUST be configured in port-active
       load-balancing mode on peering PEs for the specific access
       interface.

   c.  Peering PEs MAY exchange only the Ethernet-Segment (ES) route
       (Route Type-4) when the ESI is configured on a Layer-3 interface.

   d.  PEs in the redundancy group leverage the DF election defined in
       [RFC8584] to determine which PE keeps the port in active mode and
       which one(s) keep it in standby mode. While the DF election
       defined in [RFC8584] is per [ES, Ethernet Tag] granularity, for
       port-active mode of multi-homing, the DF election is done per
       [ES]. The details of this algorithm are described in Section 4.

   e.  The DF router MUST keep the corresponding access interface in the
       up and forwarding active state for that Ethernet-Segment.

   f.  Non-DF routers MAY bring the access interface attached to them
       for that Ethernet-Segment to the operational down state and keep
       it there. If the interface is running LACP, then the non-DF PE
       MAY instead set the LACP state to OOS (Out of Sync) rather than
       bringing the interface down. This allows for better convergence
       on the standby-to-active transition.

   g.  For EVPN-VPWS service, the usage of the primary/backup bits of
       the EVPN Layer2 attributes extended community [RFC8214] is highly
       recommended to achieve better convergence.

4.  Algorithm to elect per port-active PE

   ES routes for Ethernet-Segments running in port-active load-balancing
   mode are advertised with a new capability in the DF Election Extended
   Community defined in [RFC8584]. Moreover, the ES associated with the
   port leverages the existing single-active procedures and signals the
   single-active bit in the Ethernet A-D per-ES route. Finally, as in
   [RFC7432], the ESI-label-based split-horizon procedures should be
   used to avoid transient echoed packets when Layer-2 circuits are
   involved.

4.1.  Capability Flag

   [RFC8584] defines a DF Election extended community, and a Bitmap
   field to encode "capabilities" to use with the DF election algorithm
   in the DF algorithm field.
   Bitmap (2 octets) is extended by the following value:

                           1 1 1 1 1 1
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |D|A|     |P|                   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 2: Amended Bitmap field in the DF Election Extended Community

   Bit 0:  'Don't Preempt' bit, as explained in
      [I-D.ietf-bess-evpn-pref-df].

   Bit 1:  AC-Influenced DF Election, as explained in [RFC8584].

   Bit 5:  (corresponds to Bit 29 of the DF Election Extended Community
      and is defined by this document): P bit or 'Port Mode' bit (P
      hereafter). It indicates that the DF-Algorithm should be modified
      to consider the port only and not the Ethernet Tags.

4.2.  Modulo-based Designated Forwarder Algorithm

   The default DF Election algorithm, or modulus-based algorithm as in
   [RFC7432] and updated by [RFC8584], is used here at the granularity
   of the ES only. Because the ES-Import RT community inherits only
   bytes 1-6 of the ESI, many deployments differentiate ESIs within
   these bytes only. For the modulo calculation, bytes 3-6 are used to
   determine the designated forwarder using modulo-based DF assignment.

4.3.  HRW Algorithm

   The Highest Random Weight (HRW) algorithm defined in [RFC8584] MAY
   also be used and signaled, modified to operate at the granularity of
   [ES] rather than per [ES, VLAN].

   [RFC8584] describes computing a 32-bit CRC over the concatenation of
   Ethernet Tag and ESI. For port-active load-balancing mode, the
   Ethernet Tag is simply removed from the CRC computation.

4.4.  Preferred-DF Algorithm

   When the new 'Port Mode' capability is signaled, the Preferred-DF
   algorithm [I-D.ietf-bess-evpn-pref-df] is modified to consider the
   port only and not any associated Ethernet Tags. Furthermore, the
   'Port Mode' capability MUST be compatible with the 'DP' (Don't
   Preempt) capability (for non-revertive operation). The AC-DF bit
   MUST be set to zero. When an AC (sub-interface) goes down, it does
   not influence the DF election.

5.  Convergence considerations

   To improve convergence upon failure and recovery when port-active
   load-balancing mode is used, some advanced synchronization between
   peering PEs may be required. Port-active is challenging in the sense
   that the "standby" port is in the down state. It takes some time to
   bring a "standby" port to the up state and settle the network. For
   IRB and L3 services, the ARP / ND cache may be synchronized.
   Moreover, associated VRF tables may also be synchronized. For L2
   services, MAC table synchronization may be considered.

   Finally, for a Bundle-Ethernet interface where LACP is running, the
   ability to set the "standby" port to the "out-of-sync" state, a.k.a.
   "warm-standby", can be leveraged.

5.1.  Primary / Backup per Ethernet-Segment

   The L2 Info Extended Community MAY be advertised in Ethernet A-D per
   ES routes for fast convergence. Only the P and B bits are relevant
   to this specification. When advertised, the L2 Info Extended
   Community SHALL have only the P or B bits set, and all other bits
   must be zero. The MTU must also be zero. A remote PE receiving the
   optional L2 Info Extended Community on Ethernet A-D per ES routes
   SHALL consider only the P and B bits. P and B bits received on
   Ethernet A-D per EVI routes per [RFC8214] are overridden.

5.2.  Backward Compatibility

   Implementations that comply with [RFC7432] or [RFC8214] only (i.e.,
   implementations that predate this specification) will not advertise
   the L2 Info Extended Community in Ethernet A-D per ES routes. That
   means that all remote PEs in the ES will not receive the P and B bits
   per ES and will continue to receive and honour the P and B bits in
   Ethernet A-D per EVI routes. Similarly, an implementation that
   complies with [RFC7432] or [RFC8214] only and that receives an L2
   Info Extended Community will ignore it and will continue to use the
   default path resolution algorithm.

6.  Applicability

   A common deployment is to provide L2 or L3 service on the PEs
   providing multi-homing. The L2 service could be any L2 EVPN service
   such as EVPN VPWS, EVPN [RFC7432], etc. The L3 service could be in a
   VPN context [RFC4364] or in the global routing context. When a PE
   provides first-hop routing, EVPN IRB could also be deployed on the
   PEs. The mechanism defined in this draft is used between the PEs
   providing the L2 and/or L3 service when the requirement is to use
   per-port active.

   A possible alternative to the solution described in this draft is
   MC-LAG with ICCP [RFC7275] active-standby redundancy. However, ICCP
   requires LDP to be enabled as a transport for ICCP messages. There
   are many scenarios where LDP is not required, e.g., deployments with
   VXLAN or SRv6. The solution defined in this draft with EVPN does not
   mandate the use of LDP or ICCP and is independent of the underlay
   encapsulation.

7.  Overall Advantages

   The use of port-active multi-homing brings the following benefits to
   EVPN networks:

   a.  Open-standards-based per-interface single-active redundancy
       mechanism that eliminates the need to run ICCP and LDP.

   b.  Agnostic of underlay technology (MPLS, VXLAN, SRv6) and
       associated services (L2, L3, Bridging, E-LINE, etc.).

   c.  Provides a way to enable deterministic QoS over MC-LAG attachment
       circuits.

   d.  Fully compliant with [RFC7432]; does not require any new protocol
       enhancement to existing EVPN RFCs.

   e.  Can leverage various DF election algorithms, e.g., modulo, HRW,
       etc.

   f.  Replaces the legacy MC-LAG ICCP-based solution and offers the
       following additional benefits:

       *  Efficiently supports 1+N redundancy mode (with EVPN using BGP
          RR), whereas ICCP requires a full mesh of LDP sessions among
          the PEs in the redundancy group.

       *  Fast convergence with mass-withdraw is possible with EVPN;
          there is no equivalent in ICCP.

   g.  Customers want per-interface single-active redundancy but do not
       want to enable LDP (e.g., they may be running VXLAN or SRv6 in
       the network). Currently there is no alternative to this.

8.  Security Considerations

   The same Security Considerations described in [RFC7432] are valid for
   this document.

9.  IANA Considerations

   This document solicits the allocation of the following values:

   o  Bit 5 in the [RFC8584] DF Election Capabilities registry, with
      name "P" (port mode load-balancing) Capability, for port-active
      ES.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

   [RFC8214]  Boutros, S., Sajassi, A., Salam, S., Drake, J., and J.
              Rabadan, "Virtual Private Wire Service Support in Ethernet
              VPN", RFC 8214, DOI 10.17487/RFC8214, August 2017.

   [RFC8584]  Rabadan, J., Ed., Mohanty, S., Ed., Sajassi, A., Drake,
              J., Nagaraj, K., and S. Sathappan, "Framework for Ethernet
              VPN Designated Forwarder Election Extensibility",
              RFC 8584, DOI 10.17487/RFC8584, April 2019.

10.2.  Informative References

   [I-D.ietf-bess-evpn-pref-df]
              Rabadan, J., Sathappan, S., Przygienda, T., Lin, W.,
              Drake, J., Sajassi, A., and satyamoh@cisco.com,
              "Preference-based EVPN DF Election",
              draft-ietf-bess-evpn-pref-df-07 (work in progress), March
              2021.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006.

   [RFC7275]  Martini, L., Salam, S., Sajassi, A., Bocci, M.,
              Matsushima, S., and T. Nadeau, "Inter-Chassis
              Communication Protocol for Layer 2 Virtual Private Network
              (L2VPN) Provider Edge (PE) Redundancy", RFC 7275,
              DOI 10.17487/RFC7275, June 2014.

Authors' Addresses

   Patrice Brissette (editor)
   Cisco Systems
   Ottawa, ON
   Canada

   Email: pbrisset@cisco.com


   Ali Sajassi
   Cisco Systems
   USA

   Email: sajassi@cisco.com


   Luc Andre Burdet
   Cisco Systems
   Canada

   Email: lburdet@cisco.com


   Samir Thoria
   Cisco Systems
   USA

   Email: sthoria@cisco.com


   Bin Wen
   Comcast
   USA

   Email: Bin_Wen@comcast.com


   Edward Leyton
   Verizon Wireless
   USA

   Email: edward.leyton@verizonwireless.com


   Jorge Rabadan
   Nokia
   USA

   Email: jorge.rabadan@nokia.com
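
Illustrative sketch (non-normative)

   The per-ES elections of Sections 4.2 and 4.3 can be sketched in
   Python as follows. This is a minimal illustration under stated
   assumptions, not the authors' implementation. The function names
   modulo_df and hrw_df are hypothetical. The sketch assumes that ESI
   bytes are numbered from 0 (the type octet), that candidate PEs are
   ordered by numerically ascending IP address, that zlib's CRC-32
   polynomial is acceptable, that the CRC input is the ESI followed by
   the PE address, and that HRW ties go to the lowest address.

```python
import ipaddress
import zlib


def _ordered(pe_addrs):
    """Sort candidate PE addresses numerically, lowest first."""
    return sorted(pe_addrs, key=lambda a: int(ipaddress.ip_address(a)))


def _check_esi(esi):
    if len(esi) != 10:
        raise ValueError("ESI must be 10 octets")


def modulo_df(esi, pe_addrs):
    """Per-ES modulo election (sketch of Section 4.2).

    Assumption: bytes 3-6 of the ESI, counted from 0, are read as an
    unsigned big-endian integer and reduced modulo the number of PEs.
    """
    _check_esi(esi)
    ordered = _ordered(pe_addrs)
    value = int.from_bytes(esi[3:7], "big")
    return ordered[value % len(ordered)]


def hrw_df(esi, pe_addrs):
    """Per-ES HRW election (sketch of Section 4.3).

    The Ethernet Tag is left out of the hash input. Assumption: the
    weight is CRC-32 over ESI || PE address, and the highest weight
    wins.
    """
    _check_esi(esi)

    def weight(addr):
        return zlib.crc32(esi + ipaddress.ip_address(addr).packed)

    # max() returns the first maximal item, so a tie resolves to the
    # numerically lowest address in the sorted candidate list.
    return max(_ordered(pe_addrs), key=weight)


if __name__ == "__main__":
    esi = bytes.fromhex("00112233445566778899")
    peers = ["192.0.2.2", "192.0.2.1"]
    print("modulo DF:", modulo_df(esi, peers))
    print("HRW DF:   ", hrw_df(esi, peers))
```

   Because neither function takes an Ethernet Tag, every service on the
   port resolves to the same PE, which is the behaviour the 'Port Mode'
   capability asks for.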