idnits 2.17.1

draft-snr-bess-evpn-loop-protect-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 189: '...PEs in a network MUST provide an autom...'
     RFC 2119 keyword, line 195: '...ection mechanism MUST be compatible wi...'
     RFC 2119 keyword, line 200: '...esolution action SHOULD discard the lo...'
     RFC 2119 keyword, line 207: '...esolution action MAY bring down the AC...'
     RFC 2119 keyword, line 215: '...detecting a loop SHOULD log an event, ...'
     (11 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 5, 2018) is 2244 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC2119' is mentioned on line 481, but not defined

  == Missing Reference: 'RFC8174' is mentioned on line 481, but not defined

     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    BESS Workgroup                                           J. Rabadan, Ed.
3    Internet Draft                                              S. Sathappan
4    Intended status: Informational                                K. Nagaraj
5                                                                       Nokia

7                                                                    J. Bueno
8                                                                   J. Crespo
9                                                                  Telefonica

11   Expires: August 9, 2018                                 February 5, 2018

13                       Loop Protection in EVPN networks
14                     draft-snr-bess-evpn-loop-protect-01

16   Abstract

18      Ethernet Virtual Private Networks (EVPN) is becoming the de facto
19      standards-based control plane solution for Data Center and layer-2
20      Service Provider applications. The risk of loops caused by backdoor
21      paths accidentally created within the same broadcast domain is a
22      common concern, especially among Service Providers in large
23      Layer-2 networks. While other layer-2 Ethernet technologies use
24      Spanning Tree based Protocols (xSTP) to provide network-wide loop
25      protection, EVPN has the right tools to detect and protect the
26      network against loops in an efficient and effective way. This
27      document describes a mechanism to provide global loop protection in
28      EVPN networks.

30   Status of this Memo

32      This Internet-Draft is submitted in full conformance with the
33      provisions of BCP 78 and BCP 79.

35      Internet-Drafts are working documents of the Internet Engineering
36      Task Force (IETF), its areas, and its working groups. Note that
37      other groups may also distribute working documents as Internet-
38      Drafts.

40      Internet-Drafts are draft documents valid for a maximum of six months
41      and may be updated, replaced, or obsoleted by other documents at any
42      time. It is inappropriate to use Internet-Drafts as reference
43      material or to cite them other than as "work in progress."

45      The list of current Internet-Drafts can be accessed at
46      http://www.ietf.org/ietf/1id-abstracts.txt

48      The list of Internet-Draft Shadow Directories can be accessed at
49      http://www.ietf.org/shadow.html

51      This Internet-Draft will expire on August 9, 2018.

53   Copyright Notice

55      Copyright (c) 2018 IETF Trust and the persons identified as the
56      document authors. All rights reserved.

58      This document is subject to BCP 78 and the IETF Trust's Legal
59      Provisions Relating to IETF Documents
60      (http://trustee.ietf.org/license-info) in effect on the date of
61      publication of this document. Please review these documents
62      carefully, as they describe your rights and restrictions with respect
63      to this document. Code Components extracted from this document must
64      include Simplified BSD License text as described in Section 4.e of
65      the Trust Legal Provisions and are provided without warranty as
66      described in the Simplified BSD License.

68   Table of Contents

70      1. Introduction
71      2. Terminology
72      3. Loop Protection Requirements in EVPN networks
73      4. Loop Protection Solution for EVPN networks
74        4.1 The RFC7432 EVPN MAC Duplication Mechanism and Loop
75            Protection
76        4.2 Loop Protection Solution
77        4.3 The Black-Hole MAC concept for Loop Protection
78      5. Conclusions
79      6. Conventions used in this document
80      7. Security Considerations
81      8. IANA Considerations
82      9. References
83        9.1 Normative References
84        9.2 Informative References
85      10. Acknowledgments
86      11. Contributors
87      17. Authors' Addresses

89   1. Introduction

91      Ethernet Virtual Private Networks (EVPN) is becoming the de facto
92      standards-based control plane solution for Data Center and layer-2
93      Service Provider applications. The risk of loops caused by backdoor
94      paths accidentally created within the same broadcast domain is a
95      common concern, especially among Service Providers in large
96      Layer-2 networks. While other layer-2 Ethernet technologies use
97      Spanning Tree based Protocols (xSTP) to provide global loop
98      protection, EVPN has the right tools to detect and protect the
99      network against loops in an efficient and effective way. However,
100     [RFC7432] addresses only MAC duplication detection and protection
101     at the control plane, and not all possible loop scenarios.

103     In this document, a backdoor path is defined as a layer-2 connection
104     between two Attachment Circuits (ACs) that, along with the layer-2
105     connectivity in the EVI, creates a loop. We differentiate between a
106     local and a global loop. A local loop is created by a backdoor path
107     within the same physical port or between two Attachment Circuits
108     (ACs) of the same MAC-VRF. A global loop is created by a backdoor
109     path between two ACs of the same EVI on different PEs. This document
110     addresses global loop protection, since it requires interoperability
111     between PEs. Local loop protection is implementation-specific and
112     is not addressed in this specification.

114     Figure 1 shows a typical example of a backdoor path that may be
115     created by mistake in a Service Provider network that uses EVPN to
116     provide E-LAN services. A backdoor path is accidentally created
117     between AC4 and AC5.

119                    M1
120                   +---+
121                   |CE1|---+
122                   +---+   |
123                           |AC1
124                        +-----+
125                        | PE1 |
126                  +-----|     |----+
127                  |     +-----+    |
128                  |                |
129                  |                |
130                  |      EVPN      |
131       M2         |                |      M3
132      +---+    +-----+        +-----+    +---+
133      |CE2|----| PE2 |        | PE3 |----|CE3|
134      +---+ AC2|     |--------|     |AC3 +---+
135               +-----+        +-----+
136                 AC4| backdoor |AC5
137                    +==========+
138                        link

140     Figure 1 Backdoor link example in Service Provider EVPN networks

142     When, for instance, CE1 (in Figure 1) sends Broadcast, Unknown
143     unicast or Multicast (BUM) traffic, the frames will be flooded to PE2
144     and PE3, looped to each other through the backdoor link and flooded
145     back again in the EVPN network, creating an endless loop.

147     Figure 2 illustrates another example of a backdoor path between NVEs
148     in two remote Data Centers.

150                  VXLAN           MPLS          VXLAN
151              <----EVPN---->   <-EVPN-->   <----EVPN---->

153              +------------+   +-------+   +------------+
154              |          +------+     +------+          |
155              |          | DGW1 |     | DGW3 |          |
156              |          +------+     +------+          |
157         +------+          |   |       |   |          +------+
158    TS1--| NVE1 |   DC1    |   |  WAN  |   |    DC2   | NVE2 |--TS2
159     M1  +------+          |   |       |   |          +------+  M2
160            | |          +------+     +------+          | |
161            | |          | DGW2 |     | DGW4 |          | |
162            | |          +------+     +------+          | |
163            | +------------+   +-------+   +------------+ |
164            |                                             |
165            +================backdoor-path================+

167     Figure 2 Backdoor path example in DCI EVPN networks

169     In Figure 2, a backdoor path is accidentally created between NVE1 and
170     NVE2 in two remote Data Centers. BUM traffic generated by TS1 or TS2
171     will cause a layer-2 loop across DC1 and DC2.

173  2. Terminology

175     EVI: EVPN Instance.
176     E-LAN: MEF-based Ethernet Local Area Network service.
177     E-Tree: MEF-based Ethernet Tree service.
178     BUM: Broadcast, Unknown unicast and Multicast traffic.
179     AC: Attachment Circuit.
180     MAC-VRF: MAC Virtual Routing and Forwarding instance. Instantiation
181     of an EVI in a PE.
182     xSTP: Any Spanning Tree based Protocol, e.g. STP, RSTP, MSTP.

184  3. Loop Protection Requirements in EVPN networks

186     The following requirements have been identified for loop protection
187     in EVPN networks:

189     1- The EVPN PEs in a network MUST provide an automatic mechanism for
190        detecting and resolving a loop within the same broadcast domain.
191        In this document 'resolving a loop' refers to an automatic action
192        executed by a PE or group of PEs that stops a frame from being
193        endlessly forwarded back and forth between two PEs.

195     2- The Loop Protection mechanism MUST be compatible with all the
196        procedures described in EVPN [RFC7432]. In particular, it must not
197        interfere with regular EVPN Multi-homing, MAC Mobility and MAC
198        Protection procedures.

200     3- The Loop Resolution action SHOULD discard the looped flows without
201        bringing down the Attachment Circuits (ACs) involved in the
202        created loop. For example, when CE2 sends a broadcast frame (in
203        Figure 1) the Loop Resolution action should discard the looped
204        frames that are forwarded between PE2 and PE3 instead of bringing
205        down any AC in the backdoor path.

207     4- The Loop Resolution action MAY bring down the ACs that are
208        involved in the loop for a given flow instead of only discarding
209        the identified looped frames. This action may impact some unicast
210        flows that are not looped in the EVI, but provides an immediate
211        solution to the loop situation. For example, when a loop (for BUM
212        frames sent from CE1) is detected in PE3, the router may bring
213        down the AC corresponding to the backdoor link.

215     5- A PE detecting a loop SHOULD log an event, warning the operator of
216        the existence of a loop.

218     6- The operator SHOULD be able to configure whether the Loop
219        Resolution action is manually or automatically cleared from a
220        given PE, before the Loop Protection mechanism is restarted.

222     7- The solution MUST be compatible with other implementation-specific
223        procedures that protect the PE against local loops.

225  4. Loop Protection Solution for EVPN networks

227     This document re-uses and enhances the MAC duplication solution
228     specified in EVPN [RFC7432]. Section 4.1 clarifies this baseline EVPN
229     MAC duplication mechanism and describes the required enhancements so
230     that the EVPN network can protect the EVI user against loops.

232  4.1 The RFC7432 EVPN MAC Duplication Mechanism and Loop Protection

234     EVPN [RFC7432] describes a MAC duplication issue and how this anomaly
235     is resolved. In this document, the terms VLAN and broadcast domain
236     are used interchangeably. A VLAN is equivalent to an EVI in case of
237     VLAN-based or VLAN Bundle services, and to a broadcast domain in case
238     of VLAN-Aware Bundle services.

240     As per RFC7432, if a duplicate MAC situation exists in two or more
241     hosts that are part of two different Ethernet Segments within the
242     same VLAN, the traffic originating from these hosts would trigger
243     continuous MAC moves among the PEs attached to them. If no action
244     were taken, the sequence number (in the MAC Mobility extended
245     community attribute) would be incremented by the PEs indefinitely.

247     In order to remedy such a situation, a PE that detects a MAC mobility
248     event via local learning:

250     o Starts an M-second timer. M is configurable, with a default value
251       of M = 180.

253     o If it detects N MAC moves before the timer expires, it concludes
254       that a duplicate-MAC situation has occurred and adds the MAC to a
255       duplicate-MAC list. N is configurable with a default value of N =
256       5.

258     o The PE MUST alert the operator and stop sending and processing any
259       BGP MAC/IP Advertisement routes for that MAC address until a
260       corrective action is taken by the operator.

262     o While a MAC address is on the duplicate-MAC list for the VLAN, the
263       other PEs in the EVI will forward the traffic for the duplicate-MAC
264       address to one of the PEs that advertised it.

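
The detection rule above (N MAC moves within an M-second window) can be sketched as follows. This is a minimal illustration only, not an implementation taken from [RFC7432] or from this document. The class and method names are hypothetical, time is passed in explicitly as seconds, and the sketch assumes that the move that starts the timer counts as the first of the N moves.

```python
from dataclasses import dataclass, field


@dataclass
class DuplicateMacDetector:
    """Per-VLAN MAC duplication detection (illustrative sketch only)."""

    window_seconds: float = 180.0  # M, default value from the text above
    max_moves: int = 5             # N, default value from the text above
    window_start: dict = field(default_factory=dict)  # MAC -> time of first move
    moves: dict = field(default_factory=dict)         # MAC -> moves in window
    duplicates: set = field(default_factory=set)      # the duplicate-MAC list

    def record_move(self, mac: str, now: float) -> bool:
        """Record one locally detected MAC move at time `now` (seconds).

        Returns True only when this move puts `mac` on the duplicate-MAC list.
        Assumption: the move that starts the timer is counted as move 1.
        """
        if mac in self.duplicates:
            # Already listed; it stays listed until the operator clears it.
            return False
        start = self.window_start.get(mac)
        if start is None or now - start >= self.window_seconds:
            # No timer running, or the M-second timer expired: start a new one.
            self.window_start[mac] = now
            self.moves[mac] = 1
        else:
            self.moves[mac] += 1
        if self.moves[mac] >= self.max_moves:
            self.duplicates.add(mac)
            return True
        return False

    def clear(self, mac: str) -> None:
        """Operator corrective action: forget all state kept for `mac`."""
        self.duplicates.discard(mac)
        self.window_start.pop(mac, None)
        self.moves.pop(mac, None)
```

With the default values, five moves of the same MAC at 0, 10, 20, 30 and 40 seconds place it on the duplicate-MAC list at the fifth move, whereas moves spaced 100 seconds apart never do. Alerting the operator and suppressing MAC/IP Advertisement routes for a listed MAC are outside the scope of this sketch.
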
266     In the example of Figure 1, when CE1 sends BUM traffic to the EVI,
267     the EVPN MAC Duplication Mechanism prevents an endless MAC/IP route
268     exchange for M1 between PE1, PE2 and PE3. For instance, when MAC M1
269     moves N times in PE2 within the M-second timer period, PE2 will add
270     M1 to the duplicate-MAC list for the broadcast domain and will stop
271     advertising a MAC/IP route for M1. While this helps the control plane
272     settle, broadcast frames sent by CE1 are still endlessly looped
273     within the broadcast domain through the backdoor link. This may cause
274     unpredictable issues in the CEs connected to the affected EVI.

276  4.2 Loop Protection Solution

278     This document enhances the EVPN MAC Duplication Mechanism by
279     extending it with an optional Loop-protection action that is applied
280     on the duplicate-MAC addresses. This additional mechanism resolves
281     loops created by accidental backdoor links and SHOULD be enabled in
282     all the PEs in the EVI.

284     Figure 3 outlines the Loop Protection solution when a backdoor link
285     exists between two PEs (PE2 and PE3) in the same EVI and broadcast
286     domain. The following assumptions are made:

288     o Loop Protection (this document) is enabled on (at least) PE3.

290     o PEs in the EVI are configured with window M-timer = M seconds and
291       number of moves = N.
292     o PEs are also configured with an R-timer (retry-timer) = R seconds.
293       This timer is explained later.
294     o In this document, a MAC-move refers to a relearn event in the same
295       MAC-VRF, where the same MAC is first learned on an AC and later
296       learned from BGP EVPN. The reverse is also considered a MAC-move.
297       Relearn events between two ACs in the same PE (i.e. local loops) or
298       between two different EVPN endpoints are not considered. To protect
299       the network against local loops, this procedure should be combined
300       with local loop protection mechanisms.

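
The MAC-move definition above can be expressed as a small classification function. The sketch below is illustrative only. The type names, the function name and the endpoint identifiers are hypothetical, and it assumes that each MAC-VRF entry remembers whether the MAC was last learned on a local AC or from BGP EVPN.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Source(Enum):
    AC = "local AC"
    EVPN = "BGP EVPN"


class Learned(NamedTuple):
    source: Source
    endpoint: str  # hypothetical identifier: AC name or remote PE


def is_mac_move(previous: Optional[Learned], current: Learned) -> bool:
    """Return True if relearning a MAC counts as a MAC-move in this document.

    Only AC -> BGP EVPN and BGP EVPN -> AC relearn events count. Relearn
    events between two ACs (local loops) or between two EVPN endpoints
    are not considered.
    """
    if previous is None:
        return False  # first learn of this MAC, nothing has moved
    return previous.source is not current.source
```

For example, a relearn from Learned(Source.AC, "AC4") to Learned(Source.EVPN, "PE3") counts as a MAC-move, whereas a relearn from "AC2" to "AC4" on the same PE does not and is left to local loop protection.
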
302                                +--CE1
303                                |
304                             +-----+
305                    +--------| PE1 |------+
306                    |        +-----+      |
307                    |         EVPN        |
308                    |  SEQx         SEQy  |
309                    |  ---->       <----  |
310                 +-----+               +-----+
311           CE2---+ PE2 |---------------| PE3 |---CE3
312           ----> +-----+               +-----+
313     MAC DA=FF    |        backdoor         |
314         SA=M2    |   +=================+   |
315                  |                         |
316        t=0   x=0 |                         | y=0    t=0
317                  |------M2/SEQ----->       |         |
318                  | <-------M2/SEQ+1--------| y=1     |
319              x=1 |----withdraw--->         |         |
320                  |                         |         |
321              x=2 |------M2/SEQ+2---------> |         |
322                  |       <-----withdraw----| y=2     |
323                  |                         |         |
324                  | <-------M2/SEQ+3--------| y=3     |
325              x=3 |----withdraw--->         |         |
326                                                      |
327               ################################       |
328                              ...                     |
329               ################################       |
330                                                      |
331            x=N-1 |------M2/SEQ+(N-1)-----> |         |
332                  |       <-----withdraw----| y=N-1   V
333                  |                         | y=N   t < M
334                  |               ====================
335                  |           Add M2 to duplicate-MAC list
336                  |             a) Stop BGP advertisements
337                  |             b) Loop-protection action
338                  +

340     Figure 3 MAC Duplication and Loop Protection process

342     In the example of Figure 3, we assume CE2 sends a broadcast frame
343     with MAC SA (Source Address) M2. We also assume PE3 learns M2 via BGP
344     first, and via data path later. Although that is unlikely since data
345     path learning is normally faster than BGP-based learning, it helps
346     understand and generalize the procedure. The procedure will work as
347     long as the PE detects N MAC-moves within M seconds for a given MAC.

349     The following process takes place:

351     T0 - PE2 receives the frame, learns M2 (if not learned before) and
352          initializes counter x and timer t. Counter x stores the number
353          of MAC moves, while t stores the delta time since the first MAC
354          move for M2 occurred. PE2 advertises M2 with the currently
355          stored Sequence Number (SEQ). Also, PE2 does a MAC DA
356          (Destination Address) lookup and, since the MAC DA is a
357          broadcast address, it floods the frame to PE1, PE3 and the AC on
358          the backdoor link. This causes a loop between PE2 and PE3.

360     T1 - PE3 receives the BGP update and learns M2. Counter y and timer t
361          are initialized. Counter y stores the number of moves for M2 and
362          t stores the delta time since y was initialized. PE3 now
363          advertises M2 with SEQ+1. The M2/SEQ+1 route arrives at PE2 and
364          is installed in the MAC-VRF. The advertisement makes PE2
365          withdraw the MAC/IP route for M2 and increment x. Immediately
366          after, PE2 receives the frame again through the backdoor link,
367          relearns M2 locally, increments x and advertises M2 with SEQ+2.

369     T2 - The M2/SEQ+2 route arrives at PE3 and is installed in the
370          MAC-VRF. The advertisement makes PE3 withdraw the MAC/IP route
371          for M2 and increment y. PE3 receives the frame again through the
372          backdoor link, relearns M2 locally, increments y and advertises
373          M2 with SEQ+3. PE2 receives the route, relearns M2 and
374          increments x. PE2 also withdraws the route for M2. Immediately
375          after, PE2 receives the frame through the backdoor link and
376          repeats the process (updates y and withdraws the route).

378     Since the frame (with MAC SA=M2) keeps being learned locally on the
379     backdoor link ACs on PE2 and PE3, the above process is repeated until
380     y reaches number of moves = N.

382     Tr - When y=N, PE3 compares t against the configured window M, and in
383          case t.

499  9.2 Informative References

501  10. Acknowledgments

503  11. Contributors

505  17. Authors' Addresses

507     Jorge Rabadan
508     Nokia
509     777 E. Middlefield Road
510     Mountain View, CA 94043 USA
511     Email: jorge.rabadan@nokia.com

513     Senthil Sathappan
514     Nokia
515     Email: senthil.sathappan@nokia.com

517     Kiran Nagaraj
518     Nokia
519     Email: kiran.nagaraj@nokia.com

521     Julio Bueno
522     Telefonica
523     Email: julio.buenohernandez@telefonica.com

525     Jose Manuel Crespo
526     Telefonica
527     Email: josemanuel.crespogarcia@telefonica.com