BESS Workgroup                                               Ali Sajassi
INTERNET-DRAFT                                               Samer Salam
Intended Status: Standards Track                              Dennis Cai
                                                                   Cisco
                                                            Sami Boutros
                                                                  VMware
                                                              John Drake
                                                                 Juniper

                                                              Luay Jalil
                                                                 Verizon

Expires: October 21, 2016                                 March 21, 2016


          Multi-homed L3VPN Service with Single IP peer to CE
              draft-sajassi-bess-evpn-l3vpn-multihoming-01

Abstract

   This document describes how EVPN can be used to offer a multi-homed
   L3VPN service leveraging EVPN Layer 2 access redundancy. The solution
   offers single IP peering to the Customer Edge (CE) nodes, rapid
   failure detection, minimal fail-over time and a make-before-break
   paradigm for maintenance.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1  Terminology
   2  Requirements
   3  Challenges with L3VPN Multi-homing
   4  Solution
     4.1  Using Pseudowires in Access Network
     4.2  Using EVPN-VPWS in Access Network
   5  Failure Scenarios
     5.1  Pseudowire Failure
     5.2  EVPN VPWS Service Instance Failure
     5.3  PE Node Failure
   6  Security Considerations
   7  IANA Considerations
   8  References
     8.1  Normative References
     8.2  Informative References
   Authors' Addresses

1  Introduction

   [RFC7432] defines EVPN, a solution for multipoint Layer 2 Virtual
   Private Network (L2VPN) services, with advanced multi-homing
   capabilities, using BGP for distributing customer/client MAC address
   reachability information over the core MPLS/IP network.
   [EVPN-IRB] and [EVPN-PREFIX] discuss how EVPN can be used to support
   inter-subnet forwarding among hosts across different IP subnets,
   while maintaining the redundancy capabilities of the original
   solution.

   In this document, we discuss how EVPN can be used to offer a
   multi-homed L3VPN service leveraging its Layer 2 access redundancy.
   The solution offers single IP peering to the Customer Edge (CE)
   nodes, rapid failure detection, minimal fail-over time and a
   make-before-break paradigm for maintenance.

1.1  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2  Requirements

   The network topology in question comprises three domains: the
   customer network, the MPLS access network and the MPLS core network,
   as shown in the figure below.

 Customer   MPLS Access            MPLS         MPLS Access Customer
 Network      Network          Core Network       Network   Network
                             +--------------+
           +-----------+     |              | +-------------+
           |+----+     |   +----+           | |             |
     +--+  ||    |---------|    |           | |             |
     |CE|---|APE1|------\  |SPE1|           | |             |
     +--+  |+----+     | \/+----+           | |             |
           |+----+     | / +----+         +----+       +---+|
     +--+  ||    |------/ \|    |         |    |       |APE|| +--+
     |CE|---|APE2|---------|SPE2|         |SPEr|-------|   |--|CE|
     +--+  |+----+     |   +----+         +----+       +---+| +--+
           |           |     |              | |             |
           +-----------+     |              | +-------------+
                             +--------------+

                      Figure 1: Network Topology

   The customer network connects via Customer Edge (CE) nodes to the
   MPLS Access Network. The MPLS Access Network includes Access PEs
   (A-PEs) and MPLS P nodes (not shown for simplicity). The A-PEs
   provide a Virtual Private Wire Service (VPWS) to the connected CEs
   using Ethernet over MPLS (EoMPLS) pseudowires per [RFC5462]. The
   access pseudowires terminate on the Service PEs (S-PE1, S-PE2,...,
   S-PEr).
   The Service PEs (S-PEs) provide inter-subnet forwarding between the
   CEs, i.e. L3VPN service between them. To provide redundancy,
   pseudowires from a given A-PE can terminate on two or more S-PEs
   forming a Redundancy Group. This provides multi-homed interconnect
   of A-PEs to S-PEs.

   The solution MUST support the following requirements:

   - The S-PEs in a redundancy group must provide single-active
   redundancy to the CEs, i.e. only one S-PE is actively forwarding
   traffic at any given point of time.

   - The S-PEs in a redundancy group must appear as a single IP peer to
   the CE, and a single eBGP session will be established between a given
   CE and its associated S-PEs.

   - In the case of S-PE failure, pseudowire failure or S-PE isolation
   from the access network, the fail-over time should be minimized by
   optimizing both the backup pseudowire establishment and the BGP
   convergence time. This reduces the amount of traffic loss as the
   active path reroutes to one of the backup S-PEs.

   - The active S-PE must be able to quickly detect pseudowire failures
   or its isolation from the access MPLS network by means of a proactive
   monitoring mechanism.

   - For system maintenance, it should be possible to support a
   make-before-break paradigm, where the backup path is in warm standby
   state before a given active S-PE is taken offline for service.

3  Challenges with L3VPN Multi-homing

   The requirements listed in Section 2 above, especially the
   requirement to maintain a single eBGP session between the CE and the
   S-PEs, introduce challenges for standard L3VPN multi-homing
   solutions. In particular, the BGP prefix independent convergence
   (PIC) solution [BGP-PIC] cannot be used here because the backup S-PEs
   have no means of learning the IP prefixes from the CE: recall that
   the CE will only have an active eBGP session with the active S-PE.
   As a result, when the primary S-PE fails, the backup S-PE will have
   no alternate paths to the prefixes advertised by the CE. Therefore,
   with BGP PIC it is not possible to address the fast fail-over
   requirement.

4  Solution

4.1  Using Pseudowires in Access Network

   The solution involves running EVPN on the S-PEs in single-active
   redundancy mode albeit for inter-subnet forwarding (i.e. Layer 3
   forwarding). All pseudowires associated with a given CE are
   considered collectively as a Virtual Ethernet Segment (vES)
   [Virtual-ES] from the EVPN PEs' perspective.

   In the MPLS access network, pseudowire redundancy mechanisms are used
   [RFC6718][RFC6870] in either the Independent mode or the Master/Slave
   mode, with the S-PEs acting as the Master. The EVPN Designated
   Forwarder (DF) election mechanism is used to identify the active and
   standby S-PEs, and the pseudowire Preferential Forwarding Status Bit
   [RFC6870], for the access pseudowires, is derived from the outcome of
   the DF election, as follows:

   - The S-PE that is elected as DF for a given vES MUST advertise
   Active in the Preferential Forwarding Status bit over the pseudowire
   corresponding to the vES.

   - The S-PE that is elected as non-DF for a given vES MUST advertise
   Standby in the Preferential Forwarding Status bit over the pseudowire
   corresponding to the vES.

   On the S-PEs, the pseudowires from the Access PEs are terminated onto
   VRFs, such that all pseudowires within a given redundancy set
   terminate on a single IP endpoint on the S-PEs. To achieve this, the
   S-PEs in a given Redundancy Group are configured with the same
   Anycast IP and MAC addresses on the virtual (sub)interface
   corresponding to the VRF termination point.

   Since the S-PEs are running in EVPN single-active redundancy mode,
   the S-PEs advertise an Ethernet A-D route per vES with the
   single-active flag set per [RFC7432].
   Furthermore, the DF PE sets the Primary bit in the L2 attributes
   extended community and the backup PE sets the Backup bit in that
   extended community. Since only the DF S-PE has its access pseudowire
   in Active state, only that device establishes an eBGP session with
   the CE and receives control and data traffic. The DF S-PE advertises
   the host prefixes that it receives from the CE over the eBGP session
   to other PEs in the EVI using EVPN route type-5, with the proper ESI
   set. Remote PEs learn the host prefixes and associate them with the
   ESI, using the advertising PE as the next-hop for forwarding.

   Other S-PEs in the same Redundancy Group as the advertising PE will
   receive the same EVPN route type-5 advertisement, and will recognize
   the associated ESI as a locally attached vES. This information will
   be used in the case of failure to provide a backup path to the CE. In
   other words, the S-PEs in the same Redundancy Group use the EVPN
   Aliasing procedure to synchronize their IP-VRFs among themselves. It
   is worth noting here that the S-PEs in the Redundancy Group will have
   their ARP caches synchronized through the EVPN route type-2
   advertisements from the DF PE.

4.2  Using EVPN-VPWS in Access Network

   [EVPN-VPWS] can be used instead of pseudowires in the MPLS access
   network. In that case, all EVPN-VPWS service instances associated
   with a given CE are considered collectively as a Virtual Ethernet
   Segment (vES) [Virtual-ES].

   The elected DF S-PE MUST set the Primary bit in the L2 attributes
   extended community associated with the EVPN-VPWS service instance
   Ethernet A-D route, corresponding to the vES. The non-DF S-PEs MUST
   set the Backup bit in the L2 attributes extended community associated
   with the EVPN-VPWS service instance Ethernet A-D route, corresponding
   to the vES.
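
   The mapping from the DF election outcome to the Primary and Backup
   bits can be illustrated with a minimal sketch. It assumes the
   default modulo-based DF election of [RFC7432] (S-PE addresses sorted
   numerically, DF chosen at ordinal "Ethernet Tag mod N"); this
   document does not mandate a particular election algorithm. The
   names Role, elect_df and assign_roles, the addresses and the
   Ethernet Tag value are hypothetical and used for illustration only.

```python
from enum import Enum
from ipaddress import ip_address


class Role(Enum):
    """Redundancy role an S-PE signals for one vES (illustrative)."""
    PRIMARY = "Primary"   # elected DF: sets the Primary bit
    BACKUP = "Backup"     # non-DF: sets the Backup bit


def elect_df(s_pe_addresses, ethernet_tag):
    """Pick the DF by modulo-based service carving.

    Assumption: the default RFC 7432 procedure is used, i.e. the PE
    addresses are sorted numerically and the PE at ordinal
    (ethernet_tag mod N) becomes the DF.
    """
    if not s_pe_addresses:
        raise ValueError("redundancy group is empty")
    # Sort by numeric address value, not by string comparison.
    ordered = sorted(s_pe_addresses, key=ip_address)
    return ordered[ethernet_tag % len(ordered)]


def assign_roles(s_pe_addresses, ethernet_tag):
    """Map every S-PE in the Redundancy Group to the role it signals."""
    df = elect_df(s_pe_addresses, ethernet_tag)
    return {
        pe: Role.PRIMARY if pe == df else Role.BACKUP
        for pe in s_pe_addresses
    }


if __name__ == "__main__":
    # Hypothetical Redundancy Group of three S-PEs sharing one vES.
    group = ["192.0.2.10", "192.0.2.9", "192.0.2.11"]
    for pe, role in assign_roles(group, ethernet_tag=100).items():
        print(pe, role.value)
```

   With pseudowires in the access network (Section 4.1), the same
   outcome would drive the Preferential Forwarding Status bit instead:
   Active for the S-PE holding the Primary role and Standby for the
   others.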

   Just as with the pseudowires described in the previous section, only
   the DF S-PE has its access EVPN-VPWS service instance in Active
   state, and thus establishes an eBGP session with the CE and receives
   control and data traffic. Just as before, the DF S-PE advertises the
   host prefixes that it receives from the CE over the eBGP session to
   other PEs in the EVI using EVPN route type-5, with the proper ESI
   set. Remote PEs learn the host prefixes and associate them with the
   ESI, using the advertising PE as the next-hop for forwarding.

5  Failure Scenarios

5.1  Pseudowire Failure

   The active (DF) S-PE can proactively monitor the health of the
   primary pseudowire by using a pseudowire OAM mechanism such as
   VCCV-BFD. As such, the S-PE can detect the failure of the primary
   pseudowire, and react by withdrawing both the Ethernet Segment route
   and the Ethernet A-D route associated with the vES. Note that the
   S-PE advertises the Ethernet A-D route at per-vES granularity as
   well as the Ethernet A-D route per EVI. The withdrawal of the
   Ethernet Segment route serves as an indication to the backup S-PE
   to go active (i.e. act as a backup DF), and activate its pseudowires
   to the Access PE. The withdrawal of the Ethernet A-D route triggers
   a "mass withdraw" on the remote PEs: these PEs adjust their next-hop
   associated with the prefixes that were originally advertised by the
   failed PE to point to the "backup path" per [RFC7432]. This provides
   relatively fast convergence because only a single message per
   Ethernet Segment is required for the remote PEs to switch over to
   the backup path irrespective of how many prefixes were learnt from
   the CE over the pseudowire.
   Also, note that no synchronization of VRF or ARP tables is required
   between the primary S-PE and its backup S-PE during the fail-over,
   because these tables were populated ahead of time during the
   original EVPN route advertisements.

   As a result of the pseudowire failure, the eBGP session between the
   CE and the original DF PE will time out. This will cause said S-PE to
   start a timer in order to defer withdrawing the EVPN type-5 and
   type-2 routes that it had advertised for the prefixes learnt over
   the session from the CE. As the backup pseudowire to the backup DF
   PE goes active, the eBGP session will be re-established by the CE
   with the backup PE. Since both PEs share the same Anycast IP and MAC
   addresses, the CE does not recognize that it is in communication
   with a different PE.

   To minimize disruption in data forwarding on the CE and the backup
   PE, a non-stop forwarding feature such as BGP Graceful Restart is
   used. Since the end-point IP address has not changed, this eBGP
   session handover between the primary S-PE and the backup S-PE looks
   like an eBGP session flap from the CE's point of view. Thus, the CE
   continues its packet forwarding operation in the data-plane while
   synchronizing its control-plane with the backup S-PE.

5.2  EVPN VPWS Service Instance Failure

   The failure scenario for an EVPN-VPWS service instance is similar to
   the pseudowire failure scenario described in the previous section.
   The failure of an EVPN-VPWS service instance can be detected via OAM
   mechanisms such as VCCV-BFD. Upon such failure detection, the switch
   over procedure to the backup S-PE is the same as the one described
   above.

5.3  PE Node Failure

   In the case of PE node failure, the operation is similar to the steps
   described above, except that the EVPN route withdrawals are performed
   by the Route Reflector instead of the PE.

6  Security Considerations

   TBD.

7  IANA Considerations

   TBD.

8  References

8.1  Normative References

   [RFC7432] Sajassi et al., "Ethernet VPN", RFC 7432, February 2015.

   [EVPN-IRB] Sajassi et al., "Integrated Routing and Bridging in EVPN",
             draft-ietf-bess-evpn-inter-subnet-forwarding-00, work in
             progress, November 2014.

   [EVPN-PREFIX] Rabadan et al., "IP Prefix Advertisement in EVPN",
             draft-ietf-bess-evpn-prefix-advertisement-02, work in
             progress, September 2015.

   [RFC6718] Muley P., et al., "Pseudowire Redundancy", RFC 6718, August
             2012.

   [RFC6870] Muley P., et al., "Pseudowire Preferential Forwarding
             Status Bit", RFC 6870, February 2013.

8.2  Informative References

   [BGP-PIC] Bashandy A. et al., "BGP Prefix Independent Convergence",
             draft-rtgwg-bgp-pic-02.txt, work in progress, October
             2013.

Authors' Addresses

   Ali Sajassi
   Cisco
   EMail: sajassi@cisco.com

   Samer Salam
   Cisco
   EMail: ssalam@cisco.com

   Dennis Cai
   Cisco
   EMail: dcai@cisco.com

   John Drake
   Juniper
   EMail: jdrake@juniper.net

   Luay Jalil
   Verizon
   EMail: luayjalil@gmail.com

   Sami Boutros
   VMware
   EMail: boutros.sami@gmail.com