BESS WG                                                          Y. Wang
Internet-Draft                                                  Z. Zhang
Intended status: Standards Track                         ZTE Corporation
Expires: 5 March 2022                                   1 September 2021


              ARP/ND Synching And IP Aliasing without IRB
            draft-wang-bess-evpn-arp-nd-synch-without-irb-08

Abstract

   This draft discusses several signalling modes of EVPN Signalled
   L3VPNs.  EVPN Signalled L3VPNs are used to improve L3VPNs for some
   new use cases.  It then discusses which style of RT-5 route can be
   selected for each of these new use cases, and why.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 5 March 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology and Acronyms
   2.  Service Interfaces of L3 EVIs
   3.  IP Discovery Mode
     3.1.  Adjacencies Discovery
       3.1.1.  Spreadable Mode (Faraway Mode)
       3.1.2.  Non-Spreadable Mode (Nearby Mode)
     3.2.  CE-Prefixes Auto-Discovery Modes
       3.2.1.  Distributed A-D Mode
       3.2.2.  Centralized A-D Mode
   4.  Styles of RT-5 Route
     4.1.  L-Style: no Overlay Index
       4.1.1.  RT-5L Advertisement in Distributed A-D Mode
       4.1.2.  RT-5L Advertisement in Centralized A-D Mode
     4.2.  G-Style: GW-IP as Overlay Index
       4.2.1.  RT-5G Advertisement in Distributed A-D Mode
       4.2.2.  RT-5G Advertisement in Centralized A-D Mode
     4.3.  E-Style: ESI as Overlay Index
       4.3.1.  RT-5E in Bump-in-the-wire use case
       4.3.2.  RT-5E Advertisement on Distributed L3 GW
       4.3.3.  RT-5E Advertisement in Centralized A-D mode
     4.4.  M-Style: MAC as Overlay Index
   5.  Centralized RT-5G Advertisement for Distributed L3 Forwarding
     5.1.  CE-side Configurations
     5.2.  Why Centralized A-D mode is used
     5.3.  Basic Control Plane Procedures
       5.3.1.  Centralized CE-BGP
       5.3.2.  RT-2E Advertisement from PE1/PE2 to DGW1
       5.3.3.  RT-5G Advertisement from DGW1 to PE1/PE2
       5.3.4.  RT-2E Advertisement between PE1 and PE2
     5.4.  Mass-Withdraw by EAD/ES Route
     5.5.  If Multiple VLAN-based Service Interface is Used
     5.6.  If VLAN-bundle Service Interface is Used
     5.7.  On the Failure of PE3 Node
     5.8.  For Common CE-prefixes behind R1 and R2
   6.  Load Balancing of Unicast Packets
     6.1.  IP Aliasing using GW-IP
     6.2.  IP Aliasing using ESI
   7.  IANA Considerations
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Explanation for Physical Links of the Use-cases
     A.1.  Failure Detections for P1.2 (or P2.1)
     A.2.  Protection Approaches for N1 (or N2)
       A.2.1.  CCC-Approaches
         A.2.1.1.  CCC Active-Active Protection
         A.2.1.2.  CCC Active-Standby Protection
       A.2.2.  VSI-Approaches
   Appendix B.  Different Understandings on Resolve GW-IP to RT-5
     B.1.  Section 3.2 of I-D.ietf-bess-evpn-prefix-advertisement
     B.2.  How to Interpret Above Paragraphs
     B.3.  Special PEs
     B.4.  GW-IP or a new TLV
   Authors' Addresses

1.  Introduction

   This draft discusses several signalling modes of EVPN Signalled
   L3VPNs.  They are:

   *  Adjacencies Discovery modes: spreadable mode and non-spreadable
      mode.

   *  CE-Prefix Auto-discovery modes: Centralized mode and Distributed
      mode.

   *  Styles of RT-5 routes: RT-5L (L-Style), RT-5G (G-Style), RT-5E
      (E-Style), RT-5M (M-Style).

   These signalling modes can help to improve L3VPNs for some new use
   cases.  This draft then discusses which style of RT-5 route is
   selected for each new use case, and why.

1.1.  Terminology and Acronyms

   Most of the acronyms and terms used in this document come from
   [RFC7432], [I-D.sajassi-bess-evpn-ip-aliasing] and
   [I-D.wang-bess-evpn-ether-tag-id-usage], except for the following:

   *  VRF AC - An Attachment Circuit (AC) that attaches a CE to an
      IP-VRF but is not an IRB interface.

   *  VRF Interface - An IRB interface, a VRF-AC or an IRC interface.
      Note that a VRF interface is bound to the routing space of an
      IP-VRF.

   *  L3 EVI - An EVPN instance spanning the Provider Edge (PE) devices
      participating in that EVPN, which contains VRF ACs and may also
      contain IRB interfaces or IRC interfaces.

   *  IP-AD/EVI - Ethernet Auto-Discovery route per EVI, where the EVI
      is an IP-VRF.  Note that the Ethernet Tag ID of an IP-AD/EVI
      route may be non-zero.

   *  IP-AD/ES - Ethernet Auto-Discovery route per ES, where the EVI
      for one of its route targets is an IP-VRF.

   *  RMAC - Router's MAC, which is signaled in the Router's MAC
      extended community.

   *  ESI Overlay Index - ESI as overlay index.

   *  ET-ID - Ethernet Tag ID.  It is also called ETI for short in this
      document.

   *  RT-2R - A MAC/IP Advertisement Route whose ESI is not zero is
      called an RT-2R in this draft when it is used for IP-VRF
      forwarding.  When it is used for MAC-VRF forwarding, it is not
      called an RT-2R.

   *  RT-5E - An EVPN Prefix Advertisement Route with a non-reserved
      ESI as its overlay index (the E-style RT-5).

   *  IRC - Integrated Routing and Cross-connecting.  An IRC interface
      is the virtual interface connecting an IP-VRF and an EVPN VPWS.

   *  CE-BGP - The BGP session between PE and CE.  Note that a CE-BGP
      route does not have an RD or Route-Target.

   *  CE-Prefix - An IP prefix behind a CE is called that CE's
      CE-Prefix.

   *  RT-5G - An EVPN Prefix Advertisement Route with a zero ESI and a
      non-zero GW-IP (the G-style RT-5).

   *  RT-5L - An EVPN Prefix Advertisement Route with both zero ESI and
      zero GW-IP, but a non-zero EVPN Label (the L-style RT-5).

   *  RT-5M - An EVPN Prefix Advertisement Route with zero ESI, zero
      GW-IP and zero EVPN Label, but a non-zero Router's MAC (the
      M-style RT-5).

   *  EVC - Ethernet Virtual Connection, which is typically
      constructed per basis.

   *  Internal Remote PE: When PEx is called an internal remote PE of
      an EVPN route ERy, it means that PEx is on the ES which is
      identified by ERy's ESI field.  When ERy's SOI is not zero, it
      also means that PEx has been attached to the Ethernet tag which
      is identified by the .

   *  External Remote PE: When PEx is called an external remote PE of
      an EVPN route ERy, it means that PEx is not on the ES which is
      identified by ERy's ESI field.  When ERy's SOI is not zero, PEx
      may also be a PE which has not been attached to the Ethernet tag
      which is identified by the .

   *  CE-Prefix: When an IP prefix can be reached through CEx from PEy,
      that IP prefix is called PEy's CE-prefix behind CEx in this
      draft.  PEy's CE-prefix behind CEx is also called PEy's CE-prefix
      for short.

   *  Common CE-Prefix: When a CE-Prefix can be reached through either
      CEy or CEz from PEy, it is called a common CE-Prefix of CEy and
      CEz, from the viewpoint of PEy.

   *  Exclusive CE-Prefix: When a CE-Prefix of PEy can be reached
      through CEy, and it cannot be reached through other CEs of PEy,
      it is called an exclusive CE-Prefix of CEy, from the viewpoint of
      PEy.

   *  SNGW: Sub-Net-specific Gate Way IP address.  The SNGW of a subnet
      is an IP address which the hosts of that subnet use as the
      nexthop of their default route.

   *  Intermediate subnet: The subnet that connects a PE and a CE of an
      L3 EVI.

   *  Intermediate SNGW: The SNGW of an intermediate subnet.
      It will be the IP address of an IRC interface in this draft.

   *  Intermediate nexthop: The CE's IP address in the intermediate
      subnet.

   *  Overlay nexthop: The CE-Prefix's nexthop IP address which is in
      the address-space of the L3 EVI.

   *  Original Overlay nexthop: The overlay nexthop which is advertised
      by the CE through a PE-CE routing protocol.

   *  L3EVI-Specific EADR - When the uses L3EVI-Specific Ethernet
      Auto-discovery mode, the only Ethernet A-D per EVI route (which
      will be ) of that is called an L3EVI-Specific EADR in this draft.

   *  ACI-Specific EADR - When the uses ACI-Specific Ethernet
      Auto-discovery mode, the Ethernet A-D per EVI routes of that are
      called ACI-Specific EADRs in this draft.

2.  Service Interfaces of L3 EVIs

   A service interface describes how an ES is attached to an L3EVI.
   [I-D.wang-bess-evpn-ether-tag-id-usage] discusses the following
   service interfaces:

   *  Mono VLAN-based Service Interface.
   *  Multiple VLAN-based Service Interface.
   *  Separated Risk VLAN-bundle Service Interface.
   *  Shared Risk VLAN-bundle Service Interface.
   *  IRC Service Interface.

   Different service interfaces require different control-plane
   procedures, so this draft discusses the behavior of RT-5 route
   advertisement for each service interface, especially for RT-5 routes
   with an ESI or a GW-IP as overlay index.

   Note that an ES may be attached to different L3EVIs via different
   VLANs, and multiple ESes can be attached to the same L3EVI instance.
   A service interface is therefore both ESI-specific and EVI-specific.
   When ES1 is attached to EVI1 through a VLAN-bundle Service Interface,
   it may still be attached to EVI2 through a Mono VLAN-based Service
   Interface.  Thus service interfaces of L3EVIs are <ESI, EVI>-specific
   in this draft.

3.  IP Discovery Mode

   IP discovery in L3EVIs includes ARP/ND adjacency discovery and
   CE-prefix discovery.
   Adjacency discovery is done in a distributed manner by each VRF-AC
   using ARP/ND, whereas the CE-prefixes can be discovered in two ways.

3.1.  Adjacencies Discovery

3.1.1.  Spreadable Mode (Faraway Mode)

   In Spreadable Mode, an adjacent MAC/IP can be imported into the
   IP-VRF by both external remote PEs and internal remote PEs.

   In this mode, the RT-2 route of the MAC/IP is used to synchronize
   adjacency information ( mapping) to internal remote PEs, and it is
   also used to advertise the host route to external remote PEs.

   When CEs are hosts, this mode greatly increases the number of EVPN
   routes.

   The spreadable mode is also called Faraway Mode because the external
   remote PEs of the MAC/IP entries can import the RT-2 routes of these
   MAC/IP entries into the IP-VRF.

   The spreadable mode can be used to avoid making a detour when there
   is a straightforward path.  This mode is typically used in EVPN IRB
   scenarios, where different hosts of the same BD may be reached
   through different ESes.  Another example of spreadable mode can be
   found in Section 2.1.2 of [I-D.wz-bess-evpn-vpws-as-vrf-ac], where
   the CEs are a few routers.

3.1.2.  Non-Spreadable Mode (Nearby Mode)

   In Non-Spreadable Mode, an adjacent MAC/IP can only be imported into
   the IP-VRF by internal remote PEs.  In other words, the MAC/IP cannot
   be imported into the IP-VRF by its external remote PEs.

   In this mode, the RT-2 route of the MAC/IP is only used to
   synchronize adjacency information ( mapping) to internal remote PEs.

   In non-spreadable mode, it should be ensured that only the internal
   remote PEs of the MAC/IP entry can import the RT-2 route of that
   entry into the IP-VRF.  That RT-2 route should therefore carry the
   EVI-RT and ES-Import RT only, which is why non-spreadable mode is
   also called nearby mode.
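
   The difference between the two adjacency discovery modes can be
   sketched as an import decision on the receiving PE.  The sketch
   below is illustrative only: the class and function names are
   hypothetical, the MAC address is a made-up example value, and the
   route-target mechanics (EVI-RT, ES-Import RT) are reduced to a
   simple "is this PE on the route's ES" check.

```python
from dataclasses import dataclass
from enum import Enum


class AdjacencyMode(Enum):
    SPREADABLE = "spreadable"          # Faraway Mode
    NON_SPREADABLE = "non-spreadable"  # Nearby Mode


@dataclass(frozen=True)
class Rt2Route:
    ip: str   # host IP of the adjacency ("10.2" is this draft's shorthand)
    mac: str  # host MAC (hypothetical example value)
    esi: str  # Ethernet Segment on which the adjacency was learnt


def imports_into_ip_vrf(route, local_esis, mode):
    """Return True if a remote PE attached to `local_esis` installs the
    adjacency carried by `route` into its IP-VRF."""
    is_internal_remote_pe = route.esi in local_esis
    if mode is AdjacencyMode.SPREADABLE:
        # Internal and external remote PEs both import the route.
        return True
    # Non-spreadable: only PEs on the route's ES import it.
    return is_internal_remote_pe


route = Rt2Route(ip="10.2", mac="00:00:5e:00:53:01", esi="ESI21")
pe2_esis = {"ESI21"}  # PE2 is on ESI21: internal remote PE
pe3_esis = set()      # PE3 is not on ESI21: external remote PE

for mode in AdjacencyMode:
    print(mode.value,
          "PE2:", imports_into_ip_vrf(route, pe2_esis, mode),
          "PE3:", imports_into_ip_vrf(route, pe3_esis, mode))
```

   In a real deployment the same outcome would come from route-target
   filtering rather than from an explicit check like this one.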

   An example of non-spreadable mode can be found in Section 2.1.1 of
   [I-D.wz-bess-evpn-vpws-as-vrf-ac], where the CEs are a large number
   of hosts, and all CEs can be reached through the same VPWS service
   instance.

3.2.  CE-Prefixes Auto-Discovery Modes

   There are two ways to discover the IP prefixes behind a CE (which is
   why these prefixes are called CE-prefixes for short): distributed
   A-D mode and centralized A-D mode.

3.2.1.  Distributed A-D Mode

   The CE-Prefixes inside a DC are discovered by each NVE separately.
   These NVEs then advertise their CE-prefixes to the DC Gateways and to
   the other NVEs of that DC.

   Note that the external-prefixes (which are received from other DCs)
   are discovered by the DC gateways even in distributed A-D mode.
   Distributed A-D mode and Centralized A-D mode only concern how the
   CE-prefixes inside the DC are discovered.

3.2.2.  Centralized A-D Mode

   The CE-Prefixes (behind each NVE) are discovered by the same group of
   DC Gateways.  These DC Gateways then advertise the CE-prefixes to the
   NVEs.

   Whatever the A-D mode is, distributed forwarding behavior is expected
   in this draft.  That is, the communication between two subnets behind
   two NVEs inside the same DC should not be required to pass through
   the DC Gateway.

4.  Styles of RT-5 Route

   When an RT-5 route is used to forward a data packet, the
   label/VNI/SID of that data packet's EVPN header may be obtained from
   one of four different fields of that RT-5.

   In other words, RT-5 routes can be classified into four styles,
   called L-style, G-style, E-style and M-style respectively.

   These styles have different usages and are suitable for different
   scenarios.

4.1.  L-Style: no Overlay Index

   When an L-style RT-5 is used to forward a data packet, the
   label/VNI/SID of that data packet's EVPN header is obtained from the
   MPLS Label field of the RT-5's own NLRI (which is why it is called
   L-Style), and the forwarding path is determined by its own underlay
   next-hop (BGP next hop).

   An L-style RT-5 route is also called an RT-5L in this draft.

   Note that the ESI and GW-IP fields of an RT-5L are both zero;
   otherwise the route is considered to be of another style.

4.1.1.  RT-5L Advertisement in Distributed A-D Mode

   When N1/N2 establish CE-BGP sessions with both PE1 and PE2, it is
   enough for PE1/PE2 to advertise RT-5L routes to DGW1.  There is no
   need for RT-5G or RT-5E advertisement on PE1/PE2 in that use case.

   Note that when N1/N2 establish CE-BGP sessions with both PE1 and PE2,
   the downlink VRF-interface addresses on PE1 and PE2 may be different
   IP addresses of the same subnet.  Otherwise loopback interfaces may
   be used to establish the CE-BGP sessions.

4.1.2.  RT-5L Advertisement in Centralized A-D Mode

   RT-5L routes that a PE advertises only for its own direct subnets can
   be used in both distributed A-D mode and centralized A-D mode.  RT-5L
   routes that a PE advertises for CE-prefixes cannot be used in
   centralized A-D mode, because the data forwarding would then be
   centralized too.  RT-5L routes that a PE (which is a DC Gateway)
   advertises for external-prefixes (which are received from other DCs
   or non-EVPN neighbors) can be used in either centralized A-D mode or
   distributed A-D mode.

4.2.  G-Style: GW-IP as Overlay Index

   When a G-style RT-5 is used to forward a data packet, the
   label/VNI/SID of that data packet's EVPN header is obtained using
   another EVPN route whose IP field (of its NLRI) matches the GW-IP
   field of this RT-5 route's own NLRI (which is why it is called
   G-Style), and the forwarding path is determined by that EVPN route.

   A G-style RT-5 route is also called an RT-5G in this draft.

   RT-5G can be used whether the CE-prefix A-D mode is centralized or
   distributed, and whether the Service Interface is Mono VLAN-based or
   Multiple VLAN-based.  It can therefore serve as a uniform approach to
   advertising CE-prefixes, whatever the EVPN mode is.

   Note that the ESI field of an RT-5G route MUST be zero as per
   [I-D.ietf-bess-evpn-prefix-advertisement].

4.2.1.  RT-5G Advertisement in Distributed A-D Mode

   It follows Section 2.3.2 and Section 6.2 of
   [I-D.wz-bess-evpn-vpws-as-vrf-ac].  Note that these procedures can be
   used with every L3EVI Service Interface, not just with the IRC
   Service Interface.

4.2.2.  RT-5G Advertisement in Centralized A-D Mode

   When a PE (which may be a DC gateway) learns that the overlay next
   hop of CE-prefix prefix1 is IP1, the PE advertises an RT-5G for
   prefix1, and the GW-IP of that RT-5G is set to IP1.

   An example of RT-5G advertisement in centralized A-D mode can be
   found in Section 5.

4.3.  E-Style: ESI as Overlay Index

   When an E-style RT-5 is used to forward a data packet, the
   label/VNI/SID of that data packet's EVPN header is obtained from
   another RT-1 route whose ESI and Ethernet Tag ID match this RT-5
   route's ESI (which is why it is called E-Style) and Supplementary
   Overlay Index (Section 3.3 of [I-D.wang-bess-evpn-ether-tag-id-usage]
   and Section 6.3.3 of [I-D.wz-bess-evpn-vpws-as-vrf-ac]), and the
   forwarding path is determined by that RT-1 route.
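
   The four styles can be told apart from the fields of the RT-5 route
   alone, following the RT-5E/RT-5G/RT-5L/RT-5M definitions in
   Section 1.1.  The following sketch is illustrative only: the
   `Rt5Route` class, the `rt5_style` function and the use of `None`/`0`
   to model all-zero fields are assumptions of this example, not an
   encoding defined by any EVPN specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rt5Route:
    prefix: str
    esi: Optional[str] = None    # None models an all-zero ESI
    gw_ip: Optional[str] = None  # None models a zero GW-IP
    label: int = 0               # 0 models a zero EVPN label
    rmac: Optional[str] = None   # None models an absent Router's MAC


# Style -> the route that supplies the label/VNI/SID of the EVPN header.
RESOLVED_BY = {
    "RT-5E": "RT-1 whose ESI (and Ethernet Tag ID) match",
    "RT-5G": "EVPN route whose IP field matches the GW-IP",
    "RT-5L": "the RT-5 route itself",
    "RT-5M": "RT-2 whose MAC matches the RMAC",
}


def rt5_style(route: Rt5Route) -> str:
    """Classify an RT-5 route into one of the four styles."""
    if route.esi is not None and route.gw_ip is not None:
        # An RT-5G must have a zero ESI; an RT-5E must have a zero GW-IP.
        raise ValueError("ESI and GW-IP must not both be non-zero")
    if route.esi is not None:
        return "RT-5E"
    if route.gw_ip is not None:
        return "RT-5G"
    if route.label != 0:
        return "RT-5L"
    if route.rmac is not None:
        return "RT-5M"
    raise ValueError("no usable overlay index, label or RMAC")


example = Rt5Route(prefix="Prefix1", gw_ip="10.2")
print(rt5_style(example), "->", RESOLVED_BY[rt5_style(example)])
```

   The order of the checks mirrors the definitions: an overlay index
   (ESI or GW-IP) takes precedence over the route's own label, and the
   Router's MAC is only consulted when everything else is zero.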

   An E-style RT-5 route is also called an RT-5E in this draft.

   RT-5E can only be used when the CE-prefix A-D mode is distributed.
   RT-5E can be used with the Mono VLAN-based Service Interface.  When
   RT-5E is used with the Multiple VLAN-based Service Interface or the
   Separated Risk VLAN-bundle Service Interface, the ACI-specific
   Ethernet auto-discovery of [I-D.wang-bess-evpn-ether-tag-id-usage]
   should be followed.

   Note that the GW-IP field of an RT-5E route MUST be zero as per
   [I-D.ietf-bess-evpn-prefix-advertisement].

4.3.1.  RT-5E in Bump-in-the-wire use case

   The RT-5 route that specifies an ESI as overlay index is first
   defined in Section 4.3 of [I-D.ietf-bess-evpn-prefix-advertisement],
   which also defines the Bump-in-the-wire use case (the former RT-5E
   usage).

   It is then discussed in Section 2.4 and Section 3.6.4 of
   [I-D.wang-bess-evpn-ether-tag-id-usage].  The RT-5E routes (the
   latter RT-5E usage) of Section 6 of revision-02
   [I-D.wang-bess-evpn-arp-nd-synch-without-irb-02] and Section 1.3 of
   [I-D.sajassi-bess-evpn-ip-aliasing] differ from the RT-5E routes of
   the Bump-in-the-wire use case in the following respects:

   *  Source MAC - The Ethernet header cannot be absent in the former
      usage even if the data plane is MPLS.  The source MAC MUST be set
      to the MAC address of the IRB interface of BD-10 in the
      Bump-in-the-wire use case.  In the latter usage, the Ethernet
      header can be absent if the data plane is MPLS.

   *  Recursive Resolution - The recursive resolution of the former
      usage is done in the context of a BD, but the recursive
      resolution of the latter usage is done in the context of an
      IP-VRF.

   *  EVPN label - The EVPN label of the corresponding RT-1 per EVI
      route of the former usage is an MPLS label which identifies a BD,
      but the EVPN label of the corresponding RT-1 per EVI route of the
      latter usage is an MPLS label which identifies an IP-VRF.

   *  ESI - The ESI of the former usage is attached to a BD, but the
      ESIs of the latter usage are attached to IP-VRFs.

   The Bump-in-the-wire use case is a special form of the EVPN IRB use
   case, which is why it differs from the non-IRB use cases.

4.3.2.  RT-5E Advertisement on Distributed L3 GW

   Because PE1/PE2 (see Figure 1) can install a synced ARP entry on the
   proper VRF-interface thanks to the RT-2 route of Section 3.1, it is
   not necessary for PE1/PE2 to advertise per-host IP prefixes to remote
   PEs (e.g., PE3) by RT-2 routes.  It is recommended that PE1/PE2
   advertise an RT-5E route per subnet to PE3 instead.  The ESI of these
   RT-5E routes can be set to the ESI of the corresponding VRF
   interface.  If the VRF interface fails, these subnets will converge
   faster on PE3 through the withdrawal of the corresponding IP-AD/EVI
   route.

   Note that N1/N2 may be a host or a router.  When it is a router, the
   subnets advertised by RT-5E routes are the CE-prefixes behind it.
   When N1 and N2 are hosts, those subnets are the intermediate subnets
   (the subnet of N1/N2's own IP address).

   When RT-5E routes are used to advertise direct-subnets, the details
   can be found in Section 4.3.3.  When RT-5E routes are used to
   advertise CE-prefixes, there are two approaches, whose details can be
   found in Section 1.3 of [I-D.sajassi-bess-evpn-ip-aliasing] and
   Section 6.3 of [I-D.wz-bess-evpn-vpws-as-vrf-ac].

4.3.3.  RT-5E Advertisement in Centralized A-D mode

   When the CE-prefixes are discovered by centralized auto-discovery
   approaches, RT-5E routes can be used to advertise the direct-subnets
   of NVE1/NVE2, but they are not used to advertise the CE-Prefixes.

   When the direct-subnets are advertised by RT-5E routes and the
   main-interface of the corresponding ESI fails, mass-withdraw
   procedures can be triggered for these prefixes.  This is the
   advantage of advertising direct-subnets through RT-5E routes instead
   of RT-5L routes.

   Note that an example of the mass-withdraw use case of RT-5E routes
   can be found in Section 5.4, and that it can be used in Distributed
   A-D mode too.

4.4.  M-Style: MAC as Overlay Index

   When an M-style RT-5 is used to forward a data packet, the
   label/VNI/SID of that data packet's EVPN header is obtained using
   another RT-2 route whose MAC field (of its NLRI) matches this RT-5
   route's own RMAC, and the forwarding path is determined by that RT-2
   route.

   An M-style RT-5 route is also called an RT-5M in this draft.

   RT-5M is used in Interfaceful IP-VRF-to-IP-VRF mode and in the
   Bump-in-the-wire use case as per
   [I-D.ietf-bess-evpn-prefix-advertisement].

5.  Centralized RT-5G Advertisement for Distributed L3 Forwarding

   When N1/N2/N3 is a router, it is called R1/R2/R3 in the following
   figure.  Note that Figure 6 only illustrates the physical Ethernet
   links, whereas Figure 1 below illustrates the logical L3 adjacencies
   between PE and CE.  We assume that ESI21 is attached to L3EVI VPNx
   of Section 1.1.2 of [I-D.wang-bess-evpn-ether-tag-id-usage].

                        PE1
                    +----------+
                    | +------+ |    ------>
    R1              | |      | |      RT-1
 +-------+          | | VPNx | |     ESI21
 |       | P1.1     | |      | |      ETI1
 |  ...................(10.9)| |
 |  .    | ESI21    | +------+ |                        DGW1
 |  .    |   +      +----------+                   +-------------+
 |  .    |   |           ^          <----------    |             |
 |  .    |   |           | RT-2      RT-5G         | +---------+ |
 |(10.2) |   |           | 10.2      CE-Prefix1    | |   VPNx  | |
 |  .    |   |           | ESI21     GW-IP=10.2    | |         |....R3
 |  .    |   |           |                         | |(3.3.3.3)| |
 |  .    |   +      +----------+    ------>        | +---------+ |
 |  .    | ESI21    | +------+ |     RT-2R         |   ^         |
 |  ...................(10.9)| |     10.2          |   |         |
 |       | P2.1     | |      | |     ESI21         +---|---------+
 +-------+          | | VPNx | |                       |
     |              | |      | |    ------>            | CE-BGP
     |              | +------+ |      RT-1             | Prefix1
     |              +----------+     ESI21             | NH=10.2
     |                  PE2           ETI1             |
     |                     CE-BGP                      |
     +--------------------->---------------------------+

                Figure 1: Centralized RT-5G Advertisement

   If R1 prefers to establish a single CE-BGP session, it can establish
   the CE-BGP session with the DC GW (e.g., PE3 of Section 1.1.2 of
   [I-D.wang-bess-evpn-ether-tag-id-usage]) instead.  This CE-BGP
   session can be called the centralized CE-BGP session.  However, when
   a centralized CE-BGP session is used, RT-5G routes should be used
   instead.

   Note that the centralized CE-BGP session is only used to discover
   CE-prefixes; a distributed Layer 3 forwarding framework is still
   expected.

5.1.  CE-side Configurations

   Let us assume that CCC Active-Active Protection is used inside PNEC1.
   That is to say, when R1 sends packets to 10.9, these packets will be
   load-balanced between PE1 and PE2.

5.2.  Why Centralized A-D mode is used

   Because of the factors discussed in Section 5.1, the CE-BGP session
   perhaps cannot be established between 10.2 and 10.9.

   There may be other reasons that prevent routing protocol sessions
   from being established between 10.2 and 10.9.

5.3.  Basic Control Plane Procedures

5.3.1.  Centralized CE-BGP

   The CE-BGP session between R1 and DGW1 (when PE3 is a DC GW, it is
   called DGW1) is established between 10.2 and 3.3.3.3.  The IP address
   10.2 is called the uplink interface address of R1 in this document.
   The IP address 3.3.3.3 is called the centralized loopback address of
   VPNx in this document.
   The IP address 10.9 is called the downlink VRF-interface address of
   PE1/PE2 in this document.

   R1 advertises a BGP route for a prefix (say "Prefix1") behind it to
   DGW1 via that CE-BGP session.  The nexthop for Prefix1 is R1's uplink
   interface address (say 10.2).

   Note that the data packets from R1 to the centralized loopback
   address may be routed following the default route on R1.  Thus DGW1
   does not need to use the CE-BGP session to advertise prefixes of VPNx
   to R1.

5.3.2.  RT-2E Advertisement from PE1/PE2 to DGW1

   When PE1 learns the ARP entry of 10.2, it advertises an RT-2R route
   to DGW1.  The ESI value of the RT-2R route is ESI21, which is the ESI
   of PE1's downlink VRF-interface for R1.  The RT-2R route is
   constructed following Section 3.1.  This is a mono VLAN-based service
   interface, thus the ETI1 (Ethernet Tag ID 1) of that RT-2R route can
   be 0.

   Note that in [RFC7432], when the ESI is single-active, MAC forwarding
   only uses the label and the BGP nexthop of the RT-2R route as long as
   they are valid for forwarding.  For RT-5E routes, however, we assume
   that the ESI is always preferred even if the ESI is single-active.
   This follows Table 1 in Section 3.2 of
   [I-D.ietf-bess-evpn-prefix-advertisement].

5.3.3.  RT-5G Advertisement from DGW1 to PE1/PE2

   When DGW1 receives Prefix1 from the CE-BGP session, the nexthop for
   Prefix1 is 10.2, so DGW1 advertises an RT-5G route to PE1/PE2 for
   Prefix1.  The GW-IP value of the RT-5G route for Prefix1 is 10.2.

   Note that DGW1 can load-balance packets for Prefix1 via the IP-AD/EVI
   routes (of ESI21) from PE1/PE2, because ESI21 (which is advertised
   along with the RT-2R of 10.2) is the ESI for Prefix1's GW-IP.

   Note that the centralized loopback address is advertised to PE1/PE2
   by DGW1 via an RT-5L route.  The nexthop of the RT-5L route is DGW1.
   The label of the RT-5L route is VPNx's label on DGW1.
   The RMAC of the RT-5L route is DGW1's MAC when the encapsulation is
   VXLAN.

5.3.4.  RT-2E Advertisement between PE1 and PE2

   The RT-2R route advertisement between PE1 and PE2 is used to sync
   their ARP entries to each other in order to avoid ARP misses.  The
   ESI value of these two RT-2R routes is ESI21.

5.4.  Mass-Withdraw by EAD/ES Route

   In the figure of Section 1.1.2 of
   [I-D.wang-bess-evpn-ether-tag-id-usage], there are two L3EVIs, VPNx
   and VPNy.  Section 5.3 only took VPNx as an example; this section
   considers the two L3EVIs together.

                                     +-----------------------+
     PNEC1                    PE1    |                       |
 +-------------+          +----------+--------+              |
 |             |          |  __(20.9)__(VPNy) |  Withdraw    |
 | Prefix1  "  |   P1     | /                 |  IP-AD/ES    |
 |   /      #===========X==<                  | ----X--->    | DGW1
 | R1_______"  | ESI21    | \__      __       |         +----+----+
 | 10.2     "  |   +      |    (10.9)  (VPNx) |         |         |
 |          "  |   |      +-----------+-------+         |(3.3.3.3)|
 |          "  |   |                  |                 |         |
 | Prefix2  "  |   |                  |                 |(VPNx)---+N3
 |   /      "  |   |          PE2     |                 |         |
 | R2_______"  |   |      +-----------+-------+         |(VPNy)---+N5
 | 20.2     "  |   +      |  __(20.9)__(VPNy) |         |         |
 |          "  | ESI21    | /                 |         +----+----+
 |          #==============<                  |              |
 |          "  |   P2     | \__      __       |              |
 |             |          |    (10.9)  (VPNx) |              |
 +-------------+          +----------+--------+              |
                                     |                       |
                                     +-----------------------+

                 Figure 2: Mono VLAN-based S-I Use Case

   When the physical interface of the downlink VRF-interface (P1) on PE1
   fails (illustrated by the 'X' on P1), PE1 withdraws the IP-AD/ES
   route of ESI21.  DGW1 then re-routes 10.2 for VPNx's CE-Prefix1 and,
   at the same time, re-routes 20.2 for VPNy's CE-Prefix2.  Data packets
   for CE-Prefix1 and CE-Prefix2 are then sent to PE2 instead.

5.5.  If Multiple VLAN-based Service Interface is Used

   Now we assume that ESI21 is attached to L3EVI VPN1 according to
   Section 1.1.3 of [I-D.wang-bess-evpn-ether-tag-id-usage], and that
   CCC Active-Active Protection is used inside PNEC1.
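
   As background for the multi-VLAN case, the mass-withdraw behaviour of
   Section 5.4 can be sketched as a recursive lookup on DGW1: CE-prefix
   to GW-IP (RT-5G), GW-IP to ESI (RT-2R), ESI to the set of PEs.  This
   is a minimal sketch, not the draft's normative procedure.  The
   `DgwView` class and its field names are hypothetical, and requiring
   both an IP-AD/EVI and an IP-AD/ES route from a PE before using it as
   a next hop is an assumption modelled on the aliasing behaviour of
   [RFC7432].

```python
from dataclasses import dataclass, field


@dataclass
class DgwView:
    rt5g: dict = field(default_factory=dict)      # (vpn, prefix) -> GW-IP
    rt2r: dict = field(default_factory=dict)      # (vpn, ip) -> ESI
    ip_ad_evi: set = field(default_factory=set)   # {(vpn, esi, pe)}
    ip_ad_es: set = field(default_factory=set)    # {(esi, pe)}

    def next_hops(self, vpn, prefix):
        """Resolve a CE-prefix to the PEs that can currently reach it."""
        gw_ip = self.rt5g[(vpn, prefix)]   # RT-5G: prefix -> GW-IP
        esi = self.rt2r[(vpn, gw_ip)]      # RT-2R: GW-IP -> ESI
        return sorted(
            pe for (v, e, pe) in self.ip_ad_evi
            if v == vpn and e == esi and (esi, pe) in self.ip_ad_es
        )


dgw1 = DgwView()
dgw1.rt5g = {("VPNx", "Prefix1"): "10.2", ("VPNy", "Prefix2"): "20.2"}
dgw1.rt2r = {("VPNx", "10.2"): "ESI21", ("VPNy", "20.2"): "ESI21"}
dgw1.ip_ad_evi = {(vpn, "ESI21", pe)
                  for vpn in ("VPNx", "VPNy") for pe in ("PE1", "PE2")}
dgw1.ip_ad_es = {("ESI21", "PE1"), ("ESI21", "PE2")}

before = {p: dgw1.next_hops(v, p) for (v, p) in dgw1.rt5g}

# P1 fails on PE1: a single IP-AD/ES withdrawal affects both L3EVIs.
dgw1.ip_ad_es.discard(("ESI21", "PE1"))
after = {p: dgw1.next_hops(v, p) for (v, p) in dgw1.rt5g}

print("before:", before)
print("after: ", after)
```

   The point of the sketch is that one withdrawn route changes the
   resolution result for prefixes in both VPNx and VPNy, without any
   per-prefix update.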
                                    +--------------------------+
      PNEC1                  PE1    |                          |
  +-------------+           +-------+------+                   | DGW1'
  |             |           | X__(20.9)    | ----X---->   +----+----+
  |           " |    P1     | /        \   | Withdraw     |         |
  |           #==============<      (VPN1) | IP-AD/EVI    |(VPN1)---+N6
  |  R1_______" |  ESI21    | \__      /   | ET-ID=2      |         |
  |  10.2     " |    +      |    (10.9)    |              +----+----+
  |           " |    |      +--------+-----+                   |
  |           " |    |               |                         |
  |           " |    |               |                         | DGW1
  |           " |    |       PE2     |                    +----+----+
  |  R2_______" |    |      +--------+-----+              |         |
  |  20.2     " |    +      |  __(20.9)    |              |(3.3.3.3)|
  |           " |  ESI21    | /        \   | Withdraw     |         |
  |           #==============<      (VPN1) | IP-AD/EVI    |(VPN1)---+N3
  |           " |    P2     | \__      /   | ET-ID=1      |         |
  |             |           |  X (10.9)    | ----X---->   +----+----+
  +-------------+           +-------+------+                   |
                                    |                          |
                                    +--------------------------+

               Figure 3: Multiple VLAN-based S-I Use Case

   When physical port P3 (see Figure 6, which illustrates the physical
   links of Figure 3) fails, the CFM session of P2.1 (10.9 of PE2) goes
   down (illustrated by the 'X' inside PE2), while the CFM session of
   P2.2 (20.9 of PE2) remains UP.  Thus only the IP-AD/EVI route (whose
   ET-ID=1) of P2.1 should be withdrawn by PE2.  The IP-AD/EVI route
   (whose ET-ID=2) of P2.2 and the IP-AD/ES route should not be
   withdrawn by PE2.

   Note that if the ET-IDs of these two IP-AD/EVI routes are the same,
   when P2.1 fails, DGW1 will continue to load-balance traffic whose
   DA=20.2 to PE2, because there is still another IP-AD/EVI route (of
   VPN1) whose ESI and ET-ID are the same.  That is why the ACI-specific
   Ethernet auto-discovery mode [I-D.wang-bess-evpn-ether-tag-id-usage]
   should be followed in this case.

   Note that we assume that the ARP entry for 10.2 will be learnt on PE1
   only, and that the ARP entry for 20.2 will be learnt on PE2 only.
   Note also that the two downlink VRF-interfaces P2.1 (to R1) and P2.2
   (to R2) on PE2 are sub-interfaces of the same physical interface P2,
   so they have the same ESI.
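
   The withdrawal behaviour described in Section 5.4 and in this
   section can be sketched as follows.  This is only an illustrative
   model, not a normative procedure: the class and method names are
   hypothetical, and the eligibility rule (a PE is a usable next hop
   for an <ESI, ET-ID> pair only while both its IP-AD/ES route and the
   matching IP-AD/EVI route are present) is an assumption modelled on
   the aliasing behaviour of [RFC7432].

```python
class AdRouteTable:
    """Toy view of the Ethernet A-D routes that DGW1 has received.

    All names here are hypothetical and exist only for illustration.
    """

    def __init__(self):
        self.ad_es = set()    # {(pe, esi)}: IP-AD/ES routes
        self.ad_evi = set()   # {(pe, esi, et_id)}: IP-AD/EVI routes

    def advertise_es(self, pe, esi):
        self.ad_es.add((pe, esi))

    def withdraw_es(self, pe, esi):
        self.ad_es.discard((pe, esi))

    def advertise_evi(self, pe, esi, et_id):
        self.ad_evi.add((pe, esi, et_id))

    def withdraw_evi(self, pe, esi, et_id):
        self.ad_evi.discard((pe, esi, et_id))

    def next_hops(self, esi, et_id):
        # Assumed rule: a PE is eligible only if it still advertises
        # both the IP-AD/ES route and the ET-ID-specific IP-AD/EVI route.
        return sorted(
            pe for (pe, e, t) in self.ad_evi
            if e == esi and t == et_id and (pe, esi) in self.ad_es
        )


table = AdRouteTable()
for pe in ("PE1", "PE2"):
    table.advertise_es(pe, "ESI21")
    table.advertise_evi(pe, "ESI21", 1)   # sub-interface holding 10.9
    table.advertise_evi(pe, "ESI21", 2)   # sub-interface holding 20.9

# P3 fails: PE2 withdraws only its IP-AD/EVI route with ET-ID=1.
table.withdraw_evi("PE2", "ESI21", 1)
after_evi_withdraw = (table.next_hops("ESI21", 1),
                      table.next_hops("ESI21", 2))

# P1 fails: PE1 withdraws its IP-AD/ES route (mass-withdraw).
table.withdraw_es("PE1", "ESI21")
after_es_withdraw = (table.next_hops("ESI21", 1),
                     table.next_hops("ESI21", 2))
```

   In this sketch, withdrawing a single IP-AD/EVI route removes PE2
   only for ET-ID=1, whereas withdrawing the IP-AD/ES route removes PE1
   for every ET-ID of ESI21 at once.  If both sub-interfaces used the
   same ET-ID, the two cases could not be told apart.
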
   ESI21 is attached to L3EVI VPN1 using the multiple VLAN-based
   service interface, thus the mass-withdraw procedures of Section 5.4
   can be used in this case too.

5.6.  If VLAN-bundle Service Interface is Used

   If R1 and R2 can share the same gateway IP address, P2.1 and P2.2 can
   be aggregated into the same subinterface (on which the shared gateway
   IP address is configured).  Although they are aggregated, this does
   not change the fact that they do not share the same risks.  When the
   physical interface P3 (see Figure 6) fails, one of them will fail,
   while the other will continue to work.

   Thus IP-AD/EVI routes with different ET-IDs should be advertised
   separately for P2.1 and P2.2.  That is why
   [I-D.wang-bess-evpn-ether-tag-id-usage] should be followed in this
   case.

5.7.  On the Failure of PE3 Node

   Taking Figure 3 as an example, on the failure of DGW1, PE1/PE2 should
   delay the deletion of the RT-5G route from DGW1.  DGW1 can use a new
   BGP attribute to indicate the delayed-deletion requirement to
   PE1/PE2.  Otherwise the L3 traffic between R1 and R2 will be
   interrupted.  Fortunately, DGW1 will typically have a redundant node
   (DGW1' in Figure 3), and DGW1' can take DGW1's place when DGW1
   fails.
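
   The delayed deletion described above can be sketched as follows.
   This is a minimal illustration under stated assumptions, not a
   specification: the "delayed_deletion" flag stands in for the new BGP
   attribute (whose encoding this document does not define), and the
   "stale" marker and all class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rt5gRoute:
    prefix: str
    gw_ip: str
    originator: str
    delayed_deletion: bool = False  # hypothetical flag from the new attribute
    stale: bool = False             # hypothetical marker: kept after failure


class PeRouteTable:
    """Toy RT-5G table on PE1/PE2 (illustrative names only)."""

    def __init__(self):
        self.routes = {}  # prefix -> Rt5gRoute

    def install(self, route):
        # A fresh route (e.g. from DGW1') replaces any stale entry.
        self.routes[route.prefix] = route

    def on_originator_failure(self, originator):
        for prefix, route in list(self.routes.items()):
            if route.originator != originator:
                continue
            if route.delayed_deletion:
                route.stale = True       # keep forwarding state unchanged
            else:
                del self.routes[prefix]  # ordinary immediate deletion

    def purge_stale(self):
        # Assumed clean-up step, e.g. driven by a local timer.
        for prefix, route in list(self.routes.items()):
            if route.stale:
                del self.routes[prefix]


pe1 = PeRouteTable()
pe1.install(Rt5gRoute("Prefix1", "10.2", "DGW1", delayed_deletion=True))

pe1.on_originator_failure("DGW1")
kept_after_failure = "Prefix1" in pe1.routes
kept_stale = pe1.routes["Prefix1"].stale

# DGW1' takes over and re-advertises the same prefix.
pe1.install(Rt5gRoute("Prefix1", "10.2", "DGW1'", delayed_deletion=True))
refreshed = pe1.routes["Prefix1"]
```

   In this sketch the forwarding entry for Prefix1 survives the failure
   of DGW1 and is refreshed once DGW1' advertises it.  When and how a
   stale entry is finally purged is left open here.
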
   Note that from the viewpoint of R1 and R2, PE1, PE2, DGW1, DGW1' and
   the underlay network between them can together be regarded as the
   following VNF:

   +---------------------------------+
   |                                 |
   |    +----------------------+     |
   |    |      MPU1 (DGW1)     |     |
   |    +----------------------+     |
   |                                 |
   |    +----------------------+     |
   |    |      MPU2 (DGW1')    |     |
   |    +----------------------+     |
   |                                 |
   |    +----------------------+     |
   |    |      LPU1 (PE1)      |----------------R1
   |    +----------------------+     |
   |                                 |
   |    +----------------------+     |
   |    |      LPU2 (PE2)      |----------------R2
   |    +----------------------+     |
   |                                 |
   +---------------------------------+

                   Figure 4: EVPN Instance as a VNF

   R1 and R2 connect to the LPUs of the VNF, and the data packets
   between R1 and R2 pass only through the LPUs, not through the MPUs.
   However, R1/R2 establish their BGP sessions with the MPUs, not with
   the LPUs.  When MPU1 (actually DGW1) fails, the LPUs (actually
   PE1/PE2) will keep the forwarding state unchanged until MPU1 or MPU2
   comes up.  The delayed deletion on PE1/PE2 for DGW1's sake is
   understandable for the same reason.

   Note that for north-bound traffic, the DC GWs also play an LPU role
   in this VNF.

5.8.  For Common CE-prefixes behind R1 and R2

   We can assume that there is a common prefix (say Prefix3) behind both
   R1 and R2.  That is to say, DGW1 can reach Prefix3 through either R1
   or R2.  When R1 advertises Prefix3 to DGW1 over that CE-BGP session,
   10.2 may not be the best choice for Prefix3's BGP next hop.
             EVPN Instance as a VNF
   +---------------------------------+
   |                                 |
   |    +----------------------+     |
   |    |      MPU1 (DGW1)     |<---------<-----+
   |    +----------------------+     |          |
   |                                 |          ^
   |    +----------------------+     |          | CE-BGP
   |    |      MPU2 (DGW1')    |     |          | Prefix3
   |    +----------------------+     |          | NH=7.7.7.7
   |                                 |          |
   |    +----------------------+     |  10.2    |
   |    |      LPU1 (PE1)      |---------------[R1(7.7.7.7)]---+
   |    +----------------------+     |                         |
   |                                 |                      Prefix3
   |    +----------------------+     |  20.2                   |
   |    |      LPU2 (PE2)      |---------------[R2(7.7.7.7)]---+
   |    +----------------------+     |
   |                                 |
   +---------------------------------+

              Figure 5: IP Aliasing of Common CE-Prefixes

   In such a case, we can configure a common anycast loopback address
   (say 7.7.7.7) on R1 and R2.  Then, when R1 advertises Prefix3 to
   DGW1, R1 chooses 7.7.7.7 as the BGP next hop of the advertisement.
   Thus the RT-5G route of Prefix3 from DGW1 will be advertised along
   with GW-IP=7.7.7.7.

   In addition to the common prefixes behind R1 and R2, there will be
   exclusive prefixes particular to R1 or R2, and R1/R2 may be unable to
   distinguish the common prefixes from the exclusive prefixes.  In that
   case R1/R2 just advertise all prefixes behind them to the PEs by
   CE-BGP using the common nexthop (e.g. 10.2), and the PEs can not
   distinguish the common prefixes from the exclusive prefixes either.
   Thus RT-5E routes can not be used even if the distributed CE-prefix
   auto-discovery mode is used, because PE1/PE2 can't advertise
   different ESIs for the common prefixes and the exclusive prefixes.

   The ECMP-Merging approaches of Section 6.2.1 and Section 6.2.2 of
   [I-D.wz-bess-evpn-vpws-as-vrf-ac] can also be used in such cases in
   order to simplify the required recursive resolution.

   If one of the PEs can't resolve the GW-IP (e.g. 7.7.7.7) of a RT-5G
   route to another RT-5 route (e.g.
   the RT-5L route of 7.7.7.7), DGW1 can proxy the recursive resolution
   for other PEs.  When 7.7.7.7 can be resolved to two RT-2 routes of
   10.2 and 20.2, DGW1 can advertise the RT-5G route of the CE-prefix
   along with GW-IP=10.2 or GW-IP=20.2.  Further, DGW1 may advertise two
   RT-5G routes for that CE-prefix, where 10.2 is the GW-IP of one of
   them and 20.2 is the GW-IP of the other.

6.  Load Balancing of Unicast Packets

6.1.  IP Aliasing using GW-IP

   When a RT-5G's GW-IP can be resolved to an ECMP-list of RT-5L routes
   (e.g. Section 6.2.1 of [I-D.wz-bess-evpn-vpws-as-vrf-ac]), we can say
   that the IP aliasing is implemented using GW-IP.

   Note that when the encapsulation is VXLAN, in this case, PE3 will
   encapsulate the RMAC of each path of that ECMP-list.

6.2.  IP Aliasing using ESI

   When a RT-5G's GW-IP can only be resolved to a single RT-2R route
   (e.g. Section 5.3.3, where the RT-5G is a locally-discovered RT-5G),
   but the ESI of that RT-2R route can be resolved to an ECMP-list of
   RT-1 routes, we can say that the IP aliasing is implemented using
   ESI.

   It is similar to [I-D.sajassi-bess-evpn-ip-aliasing] except for a few
   notable exceptions as explained in the following.

   o  How to encapsulate the Destination MAC?

      *  The IP-AD/EVI routes don't have their own RMAC - Note that
         when the encapsulation is VXLAN, PE3 will encapsulate the RMAC
         of the RT-2R route for the corresponding GW-IP address.  The
         RMAC of PE1 MUST have the same value as the RMAC of PE2.  This
         can be achieved by configuration.

      *  The IP-AD/EVI routes have their own RMAC - Note that when the
         encapsulation is VXLAN, PE3 will encapsulate the RMAC of an
         IP-AD/EVI route in that ECMP-list.
         When an IP packet is encapsulated with a VNI label according to
         an IP-AD/EVI route, the packet SHOULD be encapsulated with a
         Destination-MAC according to the RMAC of that IP-AD/EVI route,
         if and only if the IP-AD/EVI route has a RMAC of its own.

   o  How to select the IP-AD/EVI routes?

      When selecting the corresponding IP-AD/EVI routes for a RT-5E
      route, the procedures discussed in Section 3.2 of
      [I-D.wang-bess-evpn-ether-tag-id-usage] should be followed.

7.  IANA Considerations

   No IANA considerations.

8.  Security Considerations

   TBD.

9.  References

9.1.  Normative References

   [I-D.wang-bess-evpn-ether-tag-id-usage]
              Wang, Y., "Ethernet Tag ID Usage Update for Ethernet A-D
              per EVI Route", Work in Progress, Internet-Draft,
              draft-wang-bess-evpn-ether-tag-id-usage-03,
              26 August 2021.

   [I-D.sajassi-bess-evpn-ip-aliasing]
              Sajassi, A., Badoni, G., Warade, P., Pasupula, S., Drake,
              J., and J. Rabadan, "EVPN Support for L3 Fast Convergence
              and Aliasing/Backup Path", Work in Progress,
              Internet-Draft, draft-sajassi-bess-evpn-ip-aliasing-02,
              8 June 2021.

   [I-D.ietf-bess-evpn-prefix-advertisement]
              Rabadan, J., Henderickx, W., Drake, J., Lin, W., and A.
              Sajassi, "IP Prefix Advertisement in EVPN", Work in
              Progress, Internet-Draft,
              draft-ietf-bess-evpn-prefix-advertisement-11,
              18 May 2018.

   [I-D.ietf-bess-evpn-inter-subnet-forwarding]
              Sajassi, A., Salam, S., Thoria, S., Drake, J., and J.
              Rabadan, "Integrated Routing and Bridging in EVPN", Work
              in Progress, Internet-Draft,
              draft-ietf-bess-evpn-inter-subnet-forwarding-15,
              26 July 2021.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432,
              February 2015.

   [RFC8679]  Shen, Y., Jeganathan, M., Decraene, B., Gredler, H.,
              Michel, C., and H.
              Chen, "MPLS Egress Protection Framework", RFC 8679,
              DOI 10.17487/RFC8679, December 2019.

   [I-D.wz-bess-evpn-vpws-as-vrf-ac]
              Wang, Y. and Z. Zhang, "EVPN VPWS as VRF Attachment
              Circuit", Work in Progress, Internet-Draft,
              draft-wz-bess-evpn-vpws-as-vrf-ac-02, 28 August 2021.

   [I-D.sajassi-bess-evpn-ac-aware-bundling]
              Sajassi, A., Brissette, P., Mishra, M. P., Thoria, S.,
              Rabadan, J., and J. Drake, "AC-Aware Bundling Service
              Interface in EVPN", Work in Progress, Internet-Draft,
              draft-sajassi-bess-evpn-ac-aware-bundling-04,
              11 July 2021.

9.2.  Informative References

   [I-D.ietf-idr-tunnel-encaps]
              Patel, K., Velde, G., Sangli, S., and J. Scudder, "The BGP
              Tunnel Encapsulation Attribute", Work in Progress,
              Internet-Draft, draft-ietf-idr-tunnel-encaps-22,
              7 January 2021.

   [I-D.wang-bess-evpn-arp-nd-synch-without-irb-02]
              Wang, Y. and Z. Zhang, "ARP/ND Synching And IP Aliasing
              without IRB", Work in Progress, Internet-Draft,
              draft-wang-bess-evpn-arp-nd-synch-without-irb-02,
              28 November 2019.

Appendix A.
Explanation for Physical Links of the Use-cases

                                      +------------------+
                             PE1      |       P6         |
              L2NE1        +----------+---------+        |
           +----------+    |  __(P1.1)__(VPNx)  |        |
  +---+ P4 |          | P1 | /              \   |        |
  |N1 |-----O==------=======<             (NIz) |   P6   | PE3
  +---+    |  \     / |    | \__      __    /   |   +----+-------+
           |    |  |  |    |    (P1.2)  (VPNy)  |   |            |
           +----|P3|--+    +-----------+--------+   |    (VPNx)--+N3
                |  |                   |            |     /      |
           P3.1 |  | P3.2              | P7         | (NIz)------+N4
                |  |         PE2       |            |     \      |
           +----|P3|--+    +-----------+--------+   |    (VPNy)--+N5
           |     \/   |    |  __(P2.2)__(VPNy)  |   |            |
  +---+    |     /\   |    | /              \   |   +----+-------+
  |N2 |-----O====--=========<             (NIz) |   P8   |
  +---+ P5 |          | P2 | \__      __    /   |        |
           +----------+    |    (P2.1)  (VPNx)  |        |
              L2NE2        +----------+---------+        |
                                      |       P8         |
                                      +------------------+

                  Figure 6: Physical Links Illustrated

   There are three PEs, two L2NEs (Layer 2 Network Elements) and five
   L3NEs (Layer 3 Network Elements) in the above network.  The PEs are
   PE1, PE2 and PE3.  The L2NEs are L2NE1 and L2NE2.  The L3NEs are
   N1/N2/N3/N4/N5.  They are all illustrated in Figure 6.

   There are 9 physical links among these 10 physical devices as
   illustrated in Figure 6.  These physical links are called PLi
   (i=1,2...8).  The two physical ports of the same physical link PLi
   are both called Pi (i=1,2...8).

   As illustrated in Figure 6, some of these physical ports may have
   subinterfaces.  When a subinterface's VLAN ID is j and it is a
   subinterface of physical port Pi, that subinterface is called Pi.j.
   For example, P1.2 is a subinterface of physical port P1 and its VLAN
   ID is 2.

   There are three NIs (Network Instances) among PE1, PE2 and PE3.  They
   are VPNx, VPNy and NIz.  Two subinterfaces, P1.1 and P2.1, are
   attached to VPNx.  Two other subinterfaces, P1.2 and P2.2, are
   attached to VPNy.  N3 is also attached to VPNx, while N5 is also
   attached to VPNy.
   There are two EVCs (Ethernet Virtual Connections) between L2NE1 and
   L2NE2, namely EVC1 and EVC2.  L2NE1's EVC1 instance (which is
   illustrated as the "O" on L2NE1) has three member interfaces, P4,
   P1.1 and P3.1, where P3.1 and P1.1 are of the same protection-group.
   L2NE2's EVC1 instance has two member interfaces, P3.1 and P2.1.
   L2NE2's EVC2 instance (which is illustrated as the "O" on L2NE2) has
   three member interfaces, P5, P2.2 and P3.2, where P3.2 and P2.2 are
   of the same protection-group.  L2NE1's EVC2 instance has two member
   interfaces, P3.2 and P1.2.  L2NE2's EVC1 instance and L2NE1's EVC2
   instance are both CCC (Circuit Cross Connection) local connections.

   VPNx and VPNy are associated with NIz on each PE.

A.1.  Failure Detections for P1.2 (or P2.1)

   There is a CFM session CFM1 between P1.2 of PE1 and L2NE2's P3.2.
   When physical port P3 fails, the CFM session CFM1 will go down.
   There is a CFM session CFM2 between P2.1 of PE2 and L2NE1's P3.1.
   When physical port P3 fails, the CFM session CFM2 will go down.

A.2.  Protection Approaches for N1 (or N2)

A.2.1.  CCC-Approaches

   L2NE1's EVC1 instance and L2NE2's EVC2 instance are both CCC local
   connections too.  In L2NE1's EVC1 instance, P1.1 and P3.1 are of the
   same protection-group PG1.  In L2NE2's EVC2 instance, P2.2 and P3.2
   are of the same protection-group PG2.  In PG1, both P1.1 and P3.1
   will receive data packets.  In PG2, both P2.2 and P3.2 will receive
   data packets.

A.2.1.1.  CCC Active-Active Protection

   L2NE1 (or L2NE2) will load-balance N1's (or N2's) data packets
   between P1.1 and P3.1 (or P2.2 and P3.2).

A.2.1.2.  CCC Active-Standby Protection

   In PG1, P1.1 is the active path and P3.1 is the backup path.  In PG2,
   P2.2 is the active path and P3.2 is the backup path.
   That is to say, L2NE1 (or L2NE2) will not send N1's (or N2's) data
   packets over P3.1 (or P3.2) unless P1.1 (or P2.2) or P1 (or P2) has
   already failed before that data forwarding.

A.2.2.  VSI-Approaches

   L2NE1's EVC2 instance and L2NE2's EVC1 instance are both VSI
   instances in this case.  P1.1, P3.1, P2.2 and P3.2 are all individual
   ACs in these VSIs.

   Note that L2NE2's EVC1 instance and L2NE1's EVC2 instance are still
   both CCC local connections in this case, that there is no PG1 or PG2
   in this case, and that there are no PWs in this case.

Appendix B.  Different Understandings on Resolving GW-IP to RT-5

B.1.  Section 3.2 of I-D.ietf-bess-evpn-prefix-advertisement

   The following bullets appear in Section 3.2 of
   [I-D.ietf-bess-evpn-prefix-advertisement]:

   "RT-5 routes support recursive lookup resolution through the use of
   Overlay Indexes as follows:

   o  ...  It is important to note that recursive resolution of the
      Overlay Index applies upon installation into an IP-VRF, and not
      upon BGP propagation (for instance, on an ASBR).

   ...

   o  In order to enable the recursive lookup resolution at the ingress
      NVE, an NVE that is a possible egress NVE for a given Overlay
      Index must originate a route advertising itself as the BGP next
      hop on the path to the system denoted by the Overlay Index.  For
      instance:

      .  ...

      .  If the RT-5 specifies an ESI as the Overlay Index, recursive
         resolution can only be done if the NVE has received and
         installed an RT-1 (Auto-Discovery per-EVI) route specifying
         that ESI.

      .  If the RT-5 specifies a GW IP address as the Overlay Index,
         recursive resolution can only be done if the NVE has received
         and installed an RT-2 (MAC/IP route) specifying that IP address
         in the IP address field of its NLRI.

      .  ...
      Note that the RT-1 or RT-2 routes needed for the recursive
      resolution may arrive before or after the given RT-5 route.

   o  ..."

B.2.  How to Interpret Above Paragraphs

   Note that the above section can be interpreted as having been
   written on the basis of the following principles:

   *  The following paragraph (say Paragraph 1) specifies how the
      recursive lookup resolution will be done:

      "In order to enable the recursive lookup resolution at the
      ingress NVE, an NVE that is a possible egress NVE for a given
      Overlay Index must originate a route advertising itself as the
      BGP next hop on the path to the system denoted by the Overlay
      Index.  For instance:"

   *  The examples that are introduced by the phrase "For instance:"
      describe some use-cases that follow the above paragraph, with the
      understanding that new use-cases are possible in the future with
      new documents, as long as the rules of the above Paragraph 1 are
      respected.

B.3.  Special PEs

   There may be devices that have interpreted the above Paragraph 2 as
   the following:

   "if the recursive resolution can't find out a RT-2 for that RT-5's
   GW-IP, that RT-5 should not be installed."

   Such behavior of that PE might not be considered to be in accordance
   with Section 3.2 of [I-D.ietf-bess-evpn-prefix-advertisement].  It is
   simply not included in [I-D.ietf-bess-evpn-prefix-advertisement].

B.4.  GW-IP or a new TLV

   No matter how Section 3.2 of
   [I-D.ietf-bess-evpn-prefix-advertisement] is understood, we can now
   assume that the function of the GW-IP field is replaced with a new
   TLV (e.g. the IP-mapping SOI extended community, similar to what has
   been done in Section 6.3 of [I-D.wz-bess-evpn-vpws-as-vrf-ac]).  Then
   we can compare these two implementations and see whether a new TLV
   will bring us any benefits or not.
   Now assume that Figure 5 of this draft is changed to the distributed
   CE-prefixes auto-discovery mode (which is similar to Section 6.3 of
   [I-D.wz-bess-evpn-vpws-as-vrf-ac]).  The comparison is as follows:

   +=====+=================================+=======+=========+=========+
   | No. | Compared Points                 | GW-IP | New TLV | RT-5E   |
   +=====+=================================+=======+=========+=========+
   | 1   | Can non-upgraded                | yes   | yes     | yes     |
   |     | RRs accept it?                  |       |         |         |
   +-----+---------------------------------+-------+---------+---------+
   | 2   | Can non-upgraded                | maybe | no      | no      |
   |     | DGWs* install it?               |       |         |         |
   +-----+---------------------------------+-------+---------+---------+
   | 3   | Should PE1/PE2 be               | yes   | yes     | yes     |
   |     | upgraded?                       |       |         |         |
   +-----+---------------------------------+-------+---------+---------+
   | 4   | Will it confuse                 | no    | no      | no      |
   |     | non-upgraded RRs?               |       |         |         |
   +-----+---------------------------------+-------+---------+---------+
   | 5   | Will it confuse                 | no    | no      | maybe** |
   |     | non-upgraded DGWs?              |       |         |         |
   +-----+---------------------------------+-------+---------+---------+

                    Table 1: GW-IP vs IP-mapping SOI

   Notes:

   *   We can also take Figure 4 of Section 6.3 of
       [I-D.wz-bess-evpn-vpws-as-vrf-ac] as an example; in such a case,
       its PE3 may be a DGW.

   **  If the RT-5E routes of the original Bump-in-the-wire use case
       are advertised along with the route-target of the IP-VRF (thus
       no RTs of the BD-10), when DGW1 receives a RT-5E route and there
       is a SBD IRB in the IP-VRF instance, it may select RT-1 per EVI
       routes for the RT-5E route in the context of that SBD.  This is
       discussed in Section 3.6.4 of
       [I-D.wang-bess-evpn-ether-tag-id-usage].

   We can see from the above table that a new TLV is no better than the
   original GW-IP field.
   Note that when the PEs can not distinguish the common prefixes from
   the exclusive prefixes, only a CE-BGP nexthop based Overlay Index can
   be used for IP aliasing (independent CE-BGP sessions and RT-5L routes
   can also be used as per Section 6.1 of
   [I-D.wz-bess-evpn-vpws-as-vrf-ac], but this is not IP aliasing),
   because the PEs can't advertise different ESIs for the common
   prefixes and the exclusive prefixes.

Authors' Addresses

   Yubao Wang
   ZTE Corporation
   No.68 of Zijinghua Road, Yuhuatai District
   Nanjing
   China

   Email: wang.yubao2@zte.com.cn


   Zheng(Sandy) Zhang
   ZTE Corporation
   No. 50 Software Ave, Yuhuatai District
   Nanjing
   China

   Email: zhang.zheng@zte.com.cn