BESS                                                            J. Heitz
Internet-Draft                                                A. Sajassi
Intended status: Standards Track                                   Cisco
Expires: May 17, 2018                                           J. Drake
                                                                 Juniper
                                                              J. Rabadan
                                                                   Nokia
                                                       November 13, 2017


         Multi-homing and E-Tree in EVPN with Inter-AS Option B
                   draft-heitz-bess-evpn-option-b-01

Abstract

   The BGP speaker that originates an EVPN Ethernet A-D per ES route is
   identified by the next-hop of the route. When the route is
   propagated by an ASBR as an Inter-AS Option B route, the ASBR
   overwrites the next-hop. This document describes a method to
   identify the originator of the route.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 17, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Terminology
   2. Introduction
     2.1. EVPN multi-homing and Inter-AS Option B issue
     2.2. EVPN E-tree and Inter-AS Option B issue
   3. Solution using the Tunnel Encapsulation Attribute
   4. Operation
   5. Procedures at the Imposition PE
     5.1. Primer for subsequent sections
     5.2. OPE exists on all Type 2/5 and EAD Routes
     5.3. Some routes do not contain OPE
     5.4. OPE exists on EAD routes, but not on Type 2/5 routes
   6. Security Considerations
   7. IANA Considerations
   8. Acknowledgements
   9. Appendix
     9.1. Alternative Ways to Signal OPE
       9.1.1. Extended Community holding the IP address
       9.1.2. Large Community holding the BGP Identifier
     9.2. Considerations
   10. Normative References
   Authors' Addresses

1. Terminology

   Inter-AS Option B: This is described in Section 10.b of [RFC4364].

   EAD-per-ES: Ethernet A-D per Ethernet Segment Route.

   EAD-per-EVI: Ethernet A-D per EVPN Instance Route.

   EAD: EVPN Type 1 route: Ethernet Auto-discovery Route. Either an
   EAD-per-ES or an EAD-per-EVI route.

   Type 2/5: either the EVPN Type 2 route: MAC/IP Advertisement Route or
   the EVPN Type 5 route: IP Prefix Route described in
   [I-D.ietf-bess-evpn-prefix-advertisement].

   Mass Withdraw: To withdraw a route from the forwarding table only.
   For example, a MAC route that is mass withdrawn remains in the BGP
   table. The MAC route is required for directing packets with the
   specified MAC destination address to a matching backup or alias
   route. When a MAC route is completely withdrawn, the matching backup
   or alias routes can no longer be used for the given MAC address. The
   withdrawal of an EAD-per-ES route will cause the mass withdrawal of
   associated Type 2/5 routes as well as associated EAD-per-EVI routes.

2. Introduction

   Inter-AS Option B is illustrated in Figure 1.

             CE3
              |
             PE1
            /   \
         CE1     ASBR1---ASBR2---PE3--CE2
            \   /
             PE2

                     Figure 1: Inter-AS Option B

   Traffic flow is from CE2 to CE1 where PE3 is an imposition PE, and
   PE1 and PE2 are disposition PEs. The following sections describe the
   issues that EVPN multi-homing and EVPN E-tree services have in these
   types of scenarios.

2.1. EVPN multi-homing and Inter-AS Option B issue

   In a multi-homing scenario, the router that performs the redundancy
   switchover or the load balancing (e.g. PE3) must know which router
   originated the Ethernet A-D routes. These redundancy functions are
   normally implemented on a PE, but not on an ASBR.

   Quote from [RFC7432]:

      "A remote PE that receives a MAC/IP Advertisement route with a
      non-reserved ESI SHOULD consider the advertised MAC address to be
      reachable via all PEs that have advertised reachability to that
      MAC address's EVI/ES via the combination of an Ethernet A-D per
      EVI route for that EVI/ES (and Ethernet tag, if applicable) AND
      Ethernet A-D per ES routes for that ES."
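
   The quoted rule amounts to a set intersection. The following is a
   minimal Python sketch, not part of any specification. It assumes
   that advertising PEs are identified by the BGP next-hop and that all
   routes already belong to one MAC-VRF. The EadRoute record, the
   reachable_pes and rewrite_next_hop helpers, and the "ESI-1" value
   are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EadRoute:
    """Hypothetical, simplified view of an Ethernet A-D route."""
    kind: str          # "per-ES" or "per-EVI"
    esi: str           # Ethernet Segment Identifier, opaque here
    ethernet_tag: int  # only compared for per-EVI routes
    next_hop: str      # BGP next-hop as seen by the remote PE


def reachable_pes(routes, esi, ethernet_tag):
    """Next-hops that advertised both an EAD-per-EVI and an EAD-per-ES."""
    per_es = {r.next_hop for r in routes
              if r.kind == "per-ES" and r.esi == esi}
    per_evi = {r.next_hop for r in routes
               if r.kind == "per-EVI" and r.esi == esi
               and r.ethernet_tag == ethernet_tag}
    return per_es & per_evi


def rewrite_next_hop(routes, asbr):
    """Model an Option B ASBR that sets itself as next-hop on every route."""
    return [replace(r, next_hop=asbr) for r in routes]


# Routes as PE3 would see them if PE1 and PE2 were in the same AS.
intra_as = [
    EadRoute("per-ES", "ESI-1", 0, "PE1"),
    EadRoute("per-EVI", "ESI-1", 0, "PE1"),
    EadRoute("per-ES", "ESI-1", 0, "PE2"),
    EadRoute("per-EVI", "ESI-1", 0, "PE2"),
]
# The same routes after crossing the AS boundary through ASBR2.
option_b = rewrite_next_hop(intra_as, "ASBR2")

print(sorted(reachable_pes(intra_as, "ESI-1", 0)))
print(sorted(reachable_pes(option_b, "ESI-1", 0)))
```

   With the original next-hops the result contains PE1 and PE2. After
   the rewrite it contains only ASBR2, so the two originators can no
   longer be told apart.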

   In the Intra-AS case, the remote PE identifies the "PEs that have
   advertised reachability" by the next-hops of the Ethernet A-D routes.
   In the Inter-AS option B case, ASBR1 and ASBR2 rewrite the next-hops
   to themselves on all EVPN route advertisements, thus losing the
   identity of the PE that originated an advertisement.

   As a result, PE3 is unable to distinguish an EAD-per-ES route that
   originated at PE1 from one that originated at PE2.

2.2. EVPN E-tree and Inter-AS Option B issue

   As described in [EVPN-Etree], leaf-to-leaf BUM traffic filtering is
   always performed at the disposition PE and based on the Leaf Label.
   The Leaf Label can be downstream allocated (ingress replication) or
   upstream allocated (p2mp tunnels) and is advertised in an EAD-per-ES
   route with ESI-0. As in the multi-homing case, the PEs must identify
   the PE that originated a given EAD-per-ES route, for both cases,
   ingress replication or p2mp tunnels, so that the leaf-to-leaf BUM
   filtering can be successful.

   If ingress-replication is used for BUM traffic, the ingress PE must
   identify the originator of the ESI-0 EAD-per-ES route, program the
   Leaf Label and push it on the stack when sending BUM Leaf traffic to
   the egress PE. However, this identification of the originating PE is
   not possible in Inter-AS option B scenarios where ASBRs rewrite the
   next-hops. For instance, assuming CE2 and CE3 (Figure 1) are
   connected to Leaf ACs, PE1 will advertise a Leaf Label in an EAD-per-
   ES route for ESI-0. When CE2 sends BUM traffic, PE3 will not know
   what Leaf Label to use for sending traffic to PE1.

   Similarly, when PE3 uses non-segmented p2mp tunnels for BUM traffic,
   PE3 will upstream allocate a Leaf Label and advertise it in an EAD-
   per-ES route, so that when sending BUM traffic with a Leaf Label, PE1
   can identify that it is coming from a Leaf and not forward it to CE3.
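
   The ingress-replication case above can be pictured as a table lookup
   at the ingress PE. The following Python sketch is illustrative only.
   It assumes that the ingress PE files each received Leaf Label under
   the BGP next-hop of the ESI-0 EAD-per-ES route, and that PE2 also
   advertises a Leaf Label. The label values 1001 and 1002 and the
   helper names are hypothetical.

```python
from collections import defaultdict


def leaf_labels_by_next_hop(ead_es_routes):
    """Group Leaf Labels from ESI-0 EAD-per-ES routes by BGP next-hop.

    ead_es_routes is an iterable of (next_hop, leaf_label) pairs.
    """
    table = defaultdict(set)
    for next_hop, leaf_label in ead_es_routes:
        table[next_hop].add(leaf_label)
    return dict(table)


def leaf_label_for(table, egress_pe):
    """Return the one Leaf Label filed under egress_pe.

    Returns None when the egress PE is unknown or when more than one
    label is filed under the same key.
    """
    labels = table.get(egress_pe, set())
    return next(iter(labels)) if len(labels) == 1 else None


# Next-hops preserved: every label is filed under its originating PE.
intra_as = leaf_labels_by_next_hop([("PE1", 1001), ("PE2", 1002)])

# Next-hops rewritten by ASBR2: both labels are filed under the ASBR.
option_b = leaf_labels_by_next_hop([("ASBR2", 1001), ("ASBR2", 1002)])

print(leaf_label_for(intra_as, "PE1"))
print(leaf_label_for(option_b, "PE1"))
print(leaf_label_for(option_b, "ASBR2"))
```

   In the second table no entry exists for PE1, and the entry for ASBR2
   holds two labels, so the ingress PE has no basis for choosing the
   label that PE1 expects.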

   In both cases, the current Intra-AS procedures do not allow the
   originator of the EAD-per-ES routes to be identified, and therefore
   egress BUM filtering for leaf-to-leaf traffic is not possible when
   the Leaf ACs are located in different ASes.

3. Solution using the Tunnel Encapsulation Attribute

   The Tunnel Encapsulation Attribute is specified in
   [I-D.ietf-idr-tunnel-encaps]. A new TLV to identify the Originating
   PE is specified here. It is called OPE. The tunnel type for the OPE
   (suggested value 15) is to be assigned by IANA. The OPE MUST contain
   the Remote Endpoint Sub-TLV. The OPE must be able to uniquely
   identify the PE of origin within all ASes that participate in an EVPN
   instance.

   If a BGP speaker, such as a route reflector or an ASBR, is about to
   re-advertise a Type 2/5 or EAD route that does not have an OPE, and
   will change the next-hop of that route, then it MUST add one by
   putting the received next-hop into the Remote Endpoint Sub-TLV of the
   OPE. This will ensure that all originating EVPN routes carry the
   necessary information for imposition PEs to function properly for
   aliasing and mass withdraw.

   Any router that re-advertises a route that contains an OPE may modify
   some TLVs in the Tunnel Encapsulation Attribute. However, it MUST
   keep the OPE unchanged. Examples are ASBR1 and ASBR2 in Figure 1.

4. Operation

   For an inter-AS option B scenario, when a PE receives EVPN route(s)
   with an OPE from an ASBR, the procedures of [RFC7432] work as
   specified, including both aliasing and mass withdraw; i.e., the
   imposition PE (e.g., PE3) can process mass withdraw messages
   (Ethernet A-D per ES route).
   However, if a PE receives EVPN route(s) without an OPE from an ASBR,
   then the mass withdraw function operates in a degenerate mode where
   only the Ethernet A-D per EVI route can be processed (for its
   corresponding MAC-VRF) but not the Ethernet A-D per ES route
   (corresponding to all the impacted MAC-VRFs). The following sections
   detail the procedures associated with OPE processing.

5. Procedures at the Imposition PE

5.1. Primer for subsequent sections

   When routes are being compared, they must exist in the same MAC-VRF
   and have the same non-reserved ESI. In addition, when Type 2/5
   routes and EAD-per-EVI routes are being compared, they must have the
   same Ethernet Tag. Type 2/5 routes with ESI==0 do not use mass
   withdrawal or aliasing.

5.2. OPE exists on all Type 2/5 and EAD Routes

   If all Type 2/5 and EAD routes have an OPE, then "PEs that have
   advertised reachability" can be identified by the OPE and the
   procedures of [RFC7432] can be applied without modification.

5.3. Some routes do not contain OPE

   The routes that have an OPE are handled as per the previous section.
   The routes that do not have an OPE need the following procedures.

   Type 2/5 routes without an OPE and EAD-per-EVI routes without an OPE
   are valid if at least one EAD-per-ES route without an OPE exists with
   the same next-hop. In other words: if multiple EAD-per-ES routes
   with the same next-hop as a Type 2/5 route exist, then the Type 2/5
   route will only be mass withdrawn once all of the EAD-per-ES routes
   are withdrawn.
   This rule is necessary, because a BGP speaker may
   serve dual roles as ASBR and PE.

   [Editorial note: If it is determined that no BGP speakers exist that
   do not normally follow the procedures in this document (Legacy
   speakers) then the following sub sections may be omitted]

   If an EAD-per-EVI route without an OPE is withdrawn, it will mass
   withdraw all Type 2/5 routes without an OPE that have the same next-
   hop and the same RD as the EAD-per-EVI route. This is called mass-
   withdraw per EVI. Note, it is not the absence of the EAD-per-EVI
   route that causes mass-withdrawal, but the actual withdrawal itself.
   If the route was never there to begin with, then no withdrawal took
   place.

   If any entity in the network rewrites an RD, then all entities must
   rewrite the RD in a consistent manner, such that routes with the same
   RD continue to have the same RD and routes with different RDs
   continue to have different RDs. Note that if this condition is
   violated, then other network functions would also break.

5.4. OPE exists on EAD routes, but not on Type 2/5 routes

   If a Type 2/5 route exists without an OPE and an EAD-per-EVI route
   exists with an OPE and it has the same next-hop and the same RD as
   the Type 2/5 route, then the Type 2/5 route shall inherit the OPE
   from the EAD-per-EVI route. Thereafter, Section 5.2 applies.

6. Security Considerations

   TBD

7. IANA Considerations

   A Tunnel Encapsulation Attribute Tunnel Type for the OPE is required.

8. Acknowledgements

   Thanks to Kiran Pillai, Patrice Brissette, Satya Mohanty and Keyur
   Patel for careful review and suggestions.

9. Appendix

9.1. Alternative Ways to Signal OPE

   [Note to RFC editor: This appendix to be removed before publication]

9.1.1. Extended Community holding the IP address

   The Extended Community to use must be transitive and either IPv4
   Specific or IPv6 Specific as described in [RFC5701].
   Thus, if it is
   IPv4 Specific, it will be of type 0x01 and if IPv6 Specific, it will
   be of type 0x00.

   The Extended Community will hold the IP address of the PE that
   originates the EVPN routes.

9.1.2. Large Community holding the BGP Identifier

   A PE can be uniquely identified by its BGP identifier (also called
   Router ID) and its AS number (ASN). A Large Community [RFC8092] can
   be used to carry the BGP identifier and the ASN. A well known Large
   Community needs to be allocated for this. This allocation is for the
   Global Administrator field. The Local Data Part 1 field should carry
   the ASN and the Local Data Part 2 field should carry the BGP
   identifier.

9.2. Considerations

   It may be possible to associate the EAD-per-ES route with the Type
   2/5 route by matching the Administrator Subfield of the RD. However,
   there are too many constraints that need to be met to make this
   method reliable. Basically, the RD was designed to distinguish
   routes, not to identify them. The constraints that need to be met
   are:

   o  The RD MUST be of Type 1. [RFC7432] recommends Type 1, but does
      not mandate it.

   o  The Administrator subfield of the RD MUST be the same for each of
      these routes originated by one PE. [RFC7432] does not require
      this. It just says "The value field comprises an IP address of
      the PE", but does not say that it must be the same IP address for
      all. In an IPv6 only scenario, other ways will be used to assign
      RD.

   o  The Administrator subfield of the RD MUST be unique among all PEs
      participating in the Inter-AS EVPN. This is likely, but not
      guaranteed.

   o  If RDs are rewritten at AS boundaries, then the Administrator
      subfield MUST be rewritten in a consistent way such as to preserve
      the above properties.
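
   The RD-matching idea behind the constraints above can be sketched in
   Python as follows. The sketch relies on the Type 1 RD layout of
   [RFC4364]: a 2-octet Type field with value 1, a 4-octet IPv4
   Administrator subfield and a 2-octet Assigned Number subfield. The
   function names and the addresses are hypothetical. The sketch checks
   only the first two constraints; it cannot detect an Administrator
   value that is shared by two PEs or rewritten inconsistently.

```python
import ipaddress
import struct


def rd_administrator(rd: bytes):
    """Return the IPv4 Administrator subfield of a Type 1 RD, else None."""
    if len(rd) != 8:
        raise ValueError("an RD is 8 octets long")
    # Network byte order: 2-octet type, 4-octet administrator,
    # 2-octet assigned number. Only meaningful when the type is 1.
    rd_type, admin, _assigned = struct.unpack("!HIH", rd)
    if rd_type != 1:
        return None
    return ipaddress.IPv4Address(admin)


def same_originator(rd_a: bytes, rd_b: bytes) -> bool:
    """True only if both RDs are Type 1 with equal Administrator subfields."""
    a, b = rd_administrator(rd_a), rd_administrator(rd_b)
    return a is not None and a == b


def make_type1_rd(ip: str, assigned: int) -> bytes:
    """Build a Type 1 RD for the examples below."""
    return struct.pack("!HIH", 1, int(ipaddress.IPv4Address(ip)), assigned)


ead_es_rd = make_type1_rd("192.0.2.1", 10)    # e.g. an EAD-per-ES route
mac_rd = make_type1_rd("192.0.2.1", 20)       # Type 2 route, same PE
other_rd = make_type1_rd("192.0.2.2", 20)     # Type 2 route, another PE
type0_rd = struct.pack("!HHI", 0, 64500, 20)  # Type 0 RD: ASN plus number

print(same_originator(ead_es_rd, mac_rd))
print(same_originator(ead_es_rd, other_rd))
print(same_originator(ead_es_rd, type0_rd))
```

   Only the first comparison matches. The Type 0 RD shows why the
   method fails as soon as any PE uses an RD type other than Type 1.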

   By allowing a single EAD-per-ES route to validate all EAD-per-EVI
   routes and all Type 2/5 routes, some of those routes may be falsely
   validated. However, that is the best possible outcome without an
   OPE. It is transient until the Type 2/5 route can be withdrawn.

   The possibility of the address space of PE next-hops in one AS
   overlapping that of another AS was raised. In such a case, the IP
   address of a PE in one AS may be the same as the IP address of a
   different PE in another AS. Because an ASBR overwrites next-hops,
   this can work. The OPE contains both the ASN and the IP address of
   the originating PE, so this works too. However, EVPN route types 3
   and 4 contain only the originating router's IP address, but not the
   originating router's ASN. Therefore, EVPN route types 3 and 4 may
   also need an OPE.

   The possibility of making the EAD-per-EVI route mandatory was raised.
   This would make some of the procedures easier, because the RD of the
   EAD-per-EVI route can be matched with the RD of the Type 2/5 route.

10. Normative References

   [I-D.ietf-bess-evpn-prefix-advertisement]
              Rabadan, J., Henderickx, W., Palislamovic, S., and A.
              Isaac, "IP Prefix Advertisement in EVPN",
              draft-ietf-bess-evpn-prefix-advertisement-02 (work in
              progress), September 2015.

   [I-D.ietf-idr-tunnel-encaps]
              Rosen, E., Patel, K., and G. Velde, "The BGP Tunnel
              Encapsulation Attribute", draft-ietf-idr-tunnel-encaps-02
              (work in progress), May 2016.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006.

   [RFC5701]  Rekhter, Y., "IPv6 Address Specific BGP Extended Community
              Attribute", RFC 5701, DOI 10.17487/RFC5701, November 2009.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

   [RFC8092]  Heitz, J., Ed., Snijders, J., Ed., Patel, K., Bagdonas,
              I., and N. Hilliard, "BGP Large Communities Attribute",
              RFC 8092, DOI 10.17487/RFC8092, February 2017.

Authors' Addresses

   Jakob Heitz
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134
   USA

   Email: jheitz@cisco.com


   Ali Sajassi
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134
   USA

   Email: sajassi@cisco.com


   John Drake
   Juniper

   Email: jdrake@juniper.net


   Jorge Rabadan
   Nokia

   Email: jorge.rabadan@nokia.com