INTERNET-DRAFT                                          Patrice Brissette
Intended Status: Proposed Standard                           Samir Thoria
                                                              Ali Sajassi
                                                            Cisco Systems

Expires: April 25, 2019                                  October 22, 2018


              EVPN multi-homing port-active load-balancing
                   draft-brissette-bess-evpn-mh-pa-02

Abstract

   The Multi-Chassis Link Aggregation Group (MC-LAG) technology enables
   the establishment of a logical port-channel connection with a
   redundant group of independent nodes. The purpose of multi-chassis
   LAG is to provide a solution to achieve higher network availability,
   while providing different modes of sharing/balancing of traffic.
   The EVPN standard defines EVPN-based MC-LAG with single-active and
   all-active multi-homing load-balancing modes. This draft expands on
   the existing redundancy mechanisms supported by EVPN and introduces
   support for a port-active load-balancing mode. In this draft,
   port-active load-balancing mode is also referred to as per-interface
   active/standby.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1  Terminology
   2. Multi-Chassis Ethernet Bundles
   3. Port-active load-balancing procedure
   4. Algorithm to elect per port-active PE
   5. Port-active over Integrated Routing-Bridging Interface
   6. Convergence considerations
   6. Applicability
   7. Overall Advantages
   8  Security Considerations
   9  IANA Considerations
   10. Acknowledgements
   11  References
     11.1  Normative References
     11.2  Informative References
   Authors' Addresses

1  Introduction

   EVPN, as per [RFC7432], provides all-active per-flow load balancing
   for multi-homing. It also defines a single-active mode with service
   carving, where one of the PEs in the redundancy relationship is
   active per service.

   While these two multi-homing scenarios are the most widely used in
   data center and service provider access networks, there are scenarios
   where active-standby per-interface multi-homing redundancy is useful
   and required. The main consideration for this mode of redundancy is
   the determinism of traffic forwarding through a specific interface,
   rather than statistical per-flow load balancing across the multiple
   PEs providing multi-homing. The determinism provided by
   active-standby per interface is also required for certain QoS
   features to work. While using this mode, customers also expect
   minimized convergence during failures.
   A new load-balancing mode, "port-active load-balancing", is
   therefore defined.

   This draft describes how that new redundancy mode can be supported
   via EVPN.

                       +-----+
                       | PE3 |
                       +-----+
                    +-----------+
                    |  MPLS/IP  |
                    |   CORE    |
                    +-----------+
                  +-----+   +-----+
                  | PE1 |   | PE2 |
                  +-----+   +-----+
                     |         |
                     I1       I2
                      \       /
                       \     /
                        +---+
                        |CE1|
                        +---+

                  Figure 1. MC-LAG topology

   Figure 1 shows an MC-LAG multi-homing topology where PE1 and PE2 are
   part of the same redundancy group providing multi-homing to CE1 via
   interfaces I1 and I2. Interfaces I1 and I2 are Bundle-Ethernet
   interfaces running LACP. The core, shown as IP or MPLS enabled,
   provides a wide range of L2 and L3 services. MC-LAG multi-homing
   functionality is decoupled from those services in the core and
   focuses on providing multi-homing to the CE. With per-port
   active/standby redundancy, only one of the two interfaces, I1 or I2,
   is in forwarding state; the other interface is in standby. This
   also implies that all services on the active interface are in active
   mode and all services on the standby interface operate in standby
   mode. When EVPN is used to provide MC-LAG functionality, we refer to
   it as EVLAG in this draft.

1.1  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2. Multi-Chassis Ethernet Bundles

   When a CE is multi-homed to a set of PE nodes using the [802.1AX]
   Link Aggregation Control Protocol (LACP), the PEs must act as if they
   were a single LACP speaker for the Ethernet links to form a bundle,
   and operate as a Link Aggregation Group (LAG). To achieve this, the
   PEs connected to the same multi-homed CE must synchronize LACP
   configuration and operational data among them.
   An ICCP-based protocol has been used for that purpose. EVLAG greatly
   simplifies that solution. Along with the simplification come a few
   assumptions:

   - Links in the Ethernet Bundle MUST operate in all-active load-
   balancing mode

   - The same LACP parameters, such as system id, port priority, etc.,
   MUST be configured on peering PEs

   Any discrepancies from this list are left for future study.
   Furthermore, mis-configuration and mis-wiring detection across
   peering PEs are also left for further study.

3. Port-active load-balancing procedure

   The following steps describe the proposed procedure with EVLAG to
   support port-active load-balancing mode:

   1- An ESI MUST be assigned per access interface as described in
   [RFC7432]; it may be auto-derived or manually assigned. The access
   interface MAY be a Layer-2 or Layer-3 interface.

   2- The Ethernet-Segment MUST be configured in port-active
   load-balancing mode on the peering PEs for the specific interface

   3- Peering PEs MAY exchange only the Ethernet-Segment route (Route
   Type-4)

   4- PEs in the redundancy group leverage the DF election defined in
   [draft-ietf-bess-evpn-df-election-framework] to determine which PE
   keeps the port in active mode and which one(s) keep it in standby
   mode. While the DF election defined in [draft-ietf-bess-evpn-df-
   election-framework] is per <ES, VLAN> granularity, for port-active
   mode of multi-homing the DF election is done per <ES>. The details
   of this algorithm are described in Section 4.

   5- The DF router MUST keep the corresponding access interface in up
   and forwarding active state for that Ethernet-Segment

   6- Non-DF routers MAY bring the peering access interface attached to
   them to operational down state and keep it there. If the interface
   is running LACP, the non-DF PE MAY instead set the LACP state to OOS
   (Out of Sync) rather than bringing the interface down. This allows
   for better convergence on the standby to active transition.

4. Algorithm to elect per port-active PE

   The default DF Election algorithm, or modulus-based algorithm as in
   [RFC7432], is used here also, at the granularity of <ES> only. For
   the modulo calculation, byte 10 of the ESI is used.

   The Highest Random Weight (HRW) algorithm defined in [draft-ietf-
   bess-evpn-df-election-framework] MAY also be used and signaled,
   modified to operate at the granularity of <ES> rather than per
   <ES, VLAN>.

   Let Active(ESI) denote the PE that will be the active PE for the port
   with Ethernet segment identifier ESI. The other PEs in the
   redundancy group will be standby PE(s) for the same port (ES). Ai is
   the address of PEi, and Weight() is a pseudorandom function of ESI
   and Ai. The Wrand() function defined in [draft-ietf-bess-evpn-df-
   election-framework] is used as the Weight() function.

   Active(ESI) = PEi: if Weight(ESI, Ai) >= Weight(ESI, Aj), for all j,
   0 <= i,j < Number of PEs in the redundancy group. In case of a tie,
   choose the PE whose IP address is numerically the least.

5. Port-active over Integrated Routing-Bridging Interface

                       +-----+
                       | PE3 |
                       |(IRB)|
                       | GW3 |
                       +-----+
                    +-----------+
                    |  MPLS/IP  |
                    |   CORE    |
                    +-----------+
                  +-----+   +-----+
                  | GW1 |   | GW2 |
                  |(IRB)|   |(IRB)|
                  | PE1 |   | PE2 |
                  +-----+   +-----+
                     |        |  |
                     I1      I2  I3
                      \      /   |
                       \    /     \
                        +---+    +---+
                        |CE1|    |CE2|
                        +---+    +---+

            Figure 2. EVPN-IRB Port-active load-balancing

   Figure 2 shows a simple network where EVPN-IRB is used for inter-
   subnet connectivity. The IRB interfaces on PE1 and PE2 are configured
   as an anycast gateway (same MAC, same IP). The CE1 device is
   multi-homed to both PE1 and PE2. The Ethernet-segment load-balancing
   mode of the connection from CE1 to the peering PEs can be of any
   type, e.g. all-active, single-active or port-active. The CE2 device
   is connected to a single PE (PE2). It operates as a single-homed
   device via an orphan port I3.
   Finally, port-active load-balancing is applied to the IRB interface
   on the peering PEs (PE1 and PE2). A manual Ethernet-Segment
   Identifier is assigned per IRB interface. ESI auto-generation is
   also possible based on the IRB anycast IP address.

   DF election is performed between the peering PEs over the IRB
   interface (per ESI/EVI). The designated forwarder (DF) IRB interface
   remains in up state. The non-designated forwarder (NDF) IRB
   interface may go into down state. Furthermore, if all access
   interfaces connected to an IRB interface are in down state (failure
   or admin) or in blocked forwarding state (NDF), the IRB interface is
   brought down. An example is interface I3 failing while interface I2
   (in single-active load-balancing mode) is in blocked forwarding
   state.

   In the example where the IRB on PE2 is NDF, all L3 traffic coming
   from PE3 goes via PE1. An IRB interface in down state doesn't attract
   traffic from the core side. CE2 device reachability is provided via
   an L2 subnet stretch between PE1 and PE2. Therefore L3 traffic coming
   from PE3 and destined to CE2 goes via GW1 first, then via an L2
   connection to PE2, and finally via interface I3 to the CE2 device.

   There are many reasons for configuring port-active load-balancing
   mode over an IRB interface:

   - Eases replacement of legacy technology such as VRRP / HSRP

   - Better scalability than legacy protocols

   - Traffic predictability

   - Optimal routing, entirely independent of the load-balancing mode
   configured on any access interfaces

6. Convergence considerations

   To improve convergence upon failure and recovery when port-active
   load-balancing mode is used, some advanced synchronization between
   peering PEs may be required. Port-active is challenging in the sense
   that the "standby" port is in down state. It takes some time to
   bring a "standby" port to up state and settle the network. For IRB
   and L3 services, ARP / MLD cache may be synchronized.
   Moreover, associated VRF tables may also be synchronized. For L2
   services, MAC table synchronization may be considered. Finally,
   using a bundle-Ethernet interface, where LACP is running, is usually
   advisable since it provides the ability to set the "standby" port in
   "out-of-sync" state, aka "warm-standby".

6. Applicability

   A common deployment is to provide L2 or L3 service on the PEs
   providing multi-homing. The services could be any L2 EVPN such as
   EVPN VPWS, EVPN [RFC7432], etc. An L3 service could be in a VPN
   context [RFC4364] or in the global routing context. When a PE
   provides first hop routing, EVPN IRB could also be deployed on the
   PEs. The mechanism defined in this draft is used between the PEs
   providing the L2 or L3 service, when the requirement is to use per
   port active.

   A possible alternative to the solution described in this draft is
   MC-LAG with ICCP [RFC7275] active-standby redundancy. However, ICCP
   requires LDP to be enabled as a transport for ICCP messages. There
   are many scenarios where LDP is not required, e.g. deployments with
   VXLAN or SRv6. The solution defined in this draft with EVPN does not
   mandate the use of LDP or ICCP and is independent of the overlay
   encapsulation.

7. Overall Advantages

   There are many advantages to supporting port-active load-balancing
   mode in EVLAG. Here is a non-exhaustive list:

   - Open standards based per-interface single-active redundancy
   mechanism that eliminates the need to run ICCP and LDP.

   - Agnostic of underlay technology (MPLS, VXLAN, SRv6) and associated
   services (L2, L3, Bridging, E-LINE, etc).

   - Provides a way to enable deterministic QoS over MC-LAG attachment
   circuits

   - Fully compliant with RFC-7432; does not require any new protocol
   enhancement to existing EVPN RFCs.

   - Can leverage various DF election algorithms e.g. modulo, HRW, etc.
   - Replaces the legacy MC-LAG ICCP-based solution, and offers the
   following additional benefits:

      - Efficiently supports 1+N redundancy mode (with EVPN using BGP
      RR), whereas ICCP requires a full mesh of LDP sessions among the
      PEs in the redundancy group

      - Fast convergence with mass-withdraw is possible with EVPN; there
      is no equivalent in ICCP

   - Customers want per-interface single-active redundancy, but don't
   want to enable LDP (e.g. they may be running VXLAN or SRv6 in the
   network). Currently there is no alternative to this.

8  Security Considerations

   The same Security Considerations described in [RFC7432] are valid for
   this document.

9  IANA Considerations

   There are no new IANA considerations in this document.

10. Acknowledgements

   The authors would like to thank Luc Andre Burdet for valuable
   reviews and inputs.

11  References

11.1  Normative References

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

   [RFC7275]  Martini, L., Salam, S., Sajassi, A., Bocci, M.,
              Matsushima, S., and T. Nadeau, "Inter-Chassis
              Communication Protocol for Layer 2 Virtual Private Network
              (L2VPN) Provider Edge (PE) Redundancy", RFC 7275, DOI
              10.17487/RFC7275, June 2014.

11.2  Informative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, DOI
              10.17487/RFC2119, March 1997.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006.

Authors' Addresses

   Patrice Brissette
   Cisco Systems
   EMail: pbrisset@cisco.com

   Samir Thoria
   Cisco Systems
   EMail: sthoria@cisco.com

   Ali Sajassi
   Cisco Systems
   EMail: sajassi@cisco.com
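
Illustrative sketch (non-normative)

   The per-port election of Section 4 can be illustrated with a short
   Python sketch. It is not part of the specification and rests on
   several assumptions: "byte 10 of the ESI" is read as the last octet
   of the 10-octet ESI; PEs are ordered by increasing IP address with
   ordinals starting at 0, as in the default procedure of [RFC7432];
   and stand_in_weight() is a hypothetical placeholder built on
   SHA-256, not the Wrand() function of the DF election framework. All
   function names are invented for this example.

```python
import hashlib
import ipaddress
from typing import Callable, Sequence


def _addr_value(addr: str) -> int:
    """Numeric value of an IPv4 or IPv6 address string."""
    return int(ipaddress.ip_address(addr))


def elect_modulo(esi: bytes, pe_addrs: Sequence[str]) -> str:
    """Modulo-based election at per-ES granularity.

    Assumption: "byte 10" is the last octet of the 10-octet ESI.
    """
    if len(esi) != 10:
        raise ValueError("an ESI is 10 octets long")
    if not pe_addrs:
        raise ValueError("the redundancy group is empty")
    # Order PEs by increasing address; ordinals start at 0.
    ordered = sorted(pe_addrs, key=_addr_value)
    return ordered[esi[9] % len(ordered)]


def stand_in_weight(esi: bytes, addr: str) -> int:
    """Hypothetical placeholder for Weight(ESI, Ai); NOT Wrand()."""
    packed = ipaddress.ip_address(addr).packed
    digest = hashlib.sha256(esi + packed).digest()
    return int.from_bytes(digest[:4], "big")


def elect_hrw(
    esi: bytes,
    pe_addrs: Sequence[str],
    weight: Callable[[bytes, str], int] = stand_in_weight,
) -> str:
    """Highest weight wins; a tie goes to the numerically lowest address."""
    return min(pe_addrs, key=lambda a: (-weight(esi, a), _addr_value(a)))


if __name__ == "__main__":
    esi = bytes.fromhex("00112233445566778803")  # last octet is 0x03
    pes = ["192.0.2.2", "192.0.2.1"]
    print("modulo active PE:", elect_modulo(esi, pes))
    print("HRW active PE:   ", elect_hrw(esi, pes))
```

   Every PE in the redundancy group runs the same computation over the
   same inputs, so all peers reach the same answer without further
   signaling. The elected PE keeps the port active and the others keep
   it in standby. A real implementation would pass the framework's
   Wrand() function as the weight argument.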