idnits 2.17.1

draft-boutros-bess-elan-services-over-sr-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (11 October 2021) is 927 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC8402' is defined on line 470, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC8660' is defined on line 475, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC8754' is defined on line 481, but no explicit
     reference was found in the text

  == Unused Reference: 'I-D.ietf-spring-segment-routing-policy' is defined on
     line 488, but no explicit reference was found in the text

  == Unused Reference: 'I-D.voyer-pim-sr-p2mp-policy' is defined on line 496,
     but no explicit reference was found in the text

  == Unused Reference: 'RFC4761' is defined on line 503, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4762' is defined on line 508, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-22) exists of
     draft-ietf-spring-segment-routing-policy-13

     Summary: 0 errors (**), 0 flaws (~~), 9 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    BESS Workgroup                                           S. Boutros, Ed.
3    Internet-Draft                                         S. Sivabalan, Ed.
4    Intended status: Standards Track                                 H. Shah
5    Expires: 14 April 2022                                 Ciena Corporation
6                                                                   J. Uttaro
7                                                                         ATT
8                                                                    D. Voyer
9                                                                 Bell Canada
10                                                                     B. Wen
11                                                                    Comcast
12                                                                   L. Jalil
13                                                                    Verizon
14                                                            11 October 2021

16    A Simplified Scalable ELAN Service Model with Segment Routing Underlay
17                 draft-boutros-bess-elan-services-over-sr-03

19   Abstract

21      This document proposes a new approach for deploying Ethernet LAN
22      (ELAN) services with an objective of achieving high scalability,
23      faster network convergence, and reduced operational complexity.
24      Furthermore, it naturally brings the benefits of All-Active
25      multihoming as well as MAC learning in data-plane.

27   Status of This Memo

29      This Internet-Draft is submitted in full conformance with the
30      provisions of BCP 78 and BCP 79.

32      Internet-Drafts are working documents of the Internet Engineering
33      Task Force (IETF). Note that other groups may also distribute
34      working documents as Internet-Drafts. The list of current Internet-
35      Drafts is at https://datatracker.ietf.org/drafts/current/.

37      Internet-Drafts are draft documents valid for a maximum of six months
38      and may be updated, replaced, or obsoleted by other documents at any
39      time. It is inappropriate to use Internet-Drafts as reference
40      material or to cite them other than as "work in progress."

42      This Internet-Draft will expire on 14 April 2022.

44   Copyright Notice

46      Copyright (c) 2021 IETF Trust and the persons identified as the
47      document authors. All rights reserved.
49      This document is subject to BCP 78 and the IETF Trust's Legal
50      Provisions Relating to IETF Documents (https://trustee.ietf.org/
51      license-info) in effect on the date of publication of this document.
52      Please review these documents carefully, as they describe your rights
53      and restrictions with respect to this document. Code Components
54      extracted from this document must include Simplified BSD License text
55      as described in Section 4.e of the Trust Legal Provisions and are
56      provided without warranty as described in the Simplified BSD License.

58   Table of Contents

60      1. Introduction
61      2. Terminology
62      3. Abbreviations
63      4. Control Plane Behavior
64        4.1. Service discovery
65        4.2. All-Active Service Redundancy
66        4.3. Mass service withdrawal
67        4.4. E-Tree Support
68      5. Data Plane Behavior
69        5.1. Unicast Traffic
70        5.2. BUM Traffic
71        5.3. Data Plane MAC learning
72          5.3.1. Single Home CE
73          5.3.2. Multi-Home CE
74        5.4. ARP suppression
75        5.5. Distributed Anycast Gateway
76        5.6. Multi-pathing
77        5.7. E-Tree Support
78      6. Benefits of ELAN over SR
79      7. Security Considerations
80      8. IANA Considerations
81      9. Acknowledgements
82      10. References
83        10.1. Normative References
84        10.2. Informative References
85      Authors' Addresses

87   1. Introduction

89      Virtual Private LAN Service (VPLS) is based on the Pseudo-Wire (PW)
90      construct which identifies both the service type and the service
91      termination node in both control and data planes. RFCs 4761 and 4762
92      specify mechanisms to signal PWs for VPLS services using BGP and LDP
93      respectively. An ingress Provider Edge (PE) node needs to maintain a
94      PW per VPLS instance for each egress PE node. So, if we assume 10K
95      ELAN instances over a network of 100 PE nodes, each PE node needs to
96      set up and maintain approximately 1M PWs, which can easily become a
97      scalability bottleneck in large-scale deployments.

99      As described in RFC7432, Ethernet Virtual Private Network (EVPN)
100     technology builds ELAN services similar to BGP-based IP-VPN services
101     with additional features such as MAC address learning in the control
102     plane, All-Active multihoming, etc. It eliminates the need for PWs,
103     and hence the scale problem associated with PWs. However, an egress
104     PE node cannot unambiguously identify the ingress PE node in data-plane.
105     As such, EVPN requires control plane mechanisms for MAC advertisement
106     and learning, which increases control plane complexity and overhead.

108     The goal of the proposed approach is to greatly simplify control
109     plane functions and minimize the amount of control plane messages PE
110     nodes have to process. In this version of the document, we assume a
111     Segment Routing (SR) underlay network. A future version of this
112     document will generalize the underlay network to both classical MPLS
113     and SR technologies.
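The PW count quoted above for classical VPLS can be reproduced with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes that every ELAN instance is present on every PE, so that an ingress PE keeps one PW per instance toward each of the other PEs.

```python
def vpls_pw_count(elan_instances: int, pe_nodes: int) -> int:
    """PWs one ingress PE maintains in classical VPLS.

    Assumption (not stated in the draft): every ELAN instance spans all PE
    nodes, so there is one PW per instance toward each remote egress PE.
    """
    if elan_instances < 0 or pe_nodes < 1:
        raise ValueError("need elan_instances >= 0 and pe_nodes >= 1")
    return elan_instances * (pe_nodes - 1)


# 10K ELAN instances over 100 PE nodes, as in the example in the text.
print(vpls_pw_count(10_000, 100))  # 990000, i.e. approximately 1M PWs
```

Under this assumption the per-PE PW state grows with the product of instances and PEs, which is the scaling concern the Introduction raises.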
115     The proposed approach does not require PWs, and hence the control
116     plane complexity and message overhead associated with signaling and
117     maintaining PWs are eliminated.

119     An ELAN instance is uniquely identified by a Segment ID (SID)
120     regardless of the number of service termination points. Such a SID
121     will be referred to as "Service SID" in the rest of the document.
122     The number of states maintained at a PE node is equal to the number
123     of ELAN instances in the corresponding broadcast domain. Referring
124     to the above example, each PE node now needs to maintain states for
125     10K ELAN service instances as opposed to 1M PWs in the case of the
126     classical VPLS model in data and control planes. A node can
127     advertise service SID(s) of the ELAN instance(s) that it hosts via
128     BGP for auto-discovery purposes. A Service SID can be:

130     * MPLS label for SR-MPLS.

132     * uSID (micro SID) for SRv6 representing the network function
133       associated with an ELAN service instance.

135     MAC addresses are learned in data-plane. The source node of a MAC
136     address is identified by its node SID (assigned for regular SR
137     operation) during the MAC learning phase. In the data packets, the
138     node SID of the source is inserted directly below the service SID so
139     that a destination node can uniquely identify the source of the
140     packets in an SR domain.

142     ELAN service instances are advertised such that a service message
143     packs as many ELAN instances hosted by the advertising PE node as
144     possible at the time of advertisement. A possible approach is to use
145     a bit-map in which each bit position represents an ELAN instance, as
146     well as the starting value of Service SID. Using these parameters,
147     an ingress PE node receiving advertisements can learn the ELAN
148     instance(s) hosted by an egress PE node.

150     All-Active multihoming redundancy is supported at the underlay level
151     by making use of SR anycast SID. No overlay mechanism is required
152     for this purpose.
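The bit-map advertisement and the data-plane learning rule described above can be sketched as follows. This is a minimal illustration, not the encoding the draft will specify (the BGP extensions are deferred to a future version). It assumes that bit i of the bit-map stands for Service SID "start + i"; the function names, SID values, and MAC address are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def encode_services(service_sids: Iterable[int]) -> Tuple[int, int]:
    """Pack Service SIDs into (starting Service SID, bit-map).

    Assumption: bit i of the bit-map represents Service SID start + i.
    """
    sids = sorted(set(service_sids))
    if not sids:
        raise ValueError("at least one Service SID is required")
    start = sids[0]
    bitmap = 0
    for sid in sids:
        bitmap |= 1 << (sid - start)
    return start, bitmap


def decode_services(start: int, bitmap: int) -> List[int]:
    """Recover the advertised Service SIDs at a receiving ingress PE."""
    return [start + i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# MAC table keyed by Service SID; each entry maps a MAC to a source node SID.
MacTable = Dict[int, Dict[str, int]]


def learn_mac(table: MacTable, service_sid: int,
              source_node_sid: int, src_mac: str) -> None:
    """Learn src_mac against the node SID carried directly below the Service SID."""
    table.setdefault(service_sid, {})[src_mac] = source_node_sid


start, bitmap = encode_services([16005, 16000, 16002])
print(start, bin(bitmap))              # 16000 0b100101
print(decode_services(start, bitmap))  # [16000, 16002, 16005]

table: MacTable = {}
learn_mac(table, 16002, 103, "00:00:5e:00:53:01")
print(table)  # {16002: {'00:00:5e:00:53:01': 103}}
```

A single (start, bit-map) pair covers many instances in one message, which is the packing goal stated in the text. The per-instance MAC table reflects that the source node SID alone identifies where a MAC address was learned.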
154     Each node is also associated with another SID unique within the
155     broadcast domain that is used to identify incoming Broadcast, Unknown-
156     unicast, and Multicast (BUM) traffic. We call such a SID a BUM SID. If
157     node A wants to send BUM traffic to node B, it needs to use the BUM SID
158     assigned to node B as a destination SID. BUM SIDs can also be
159     advertised via BGP for auto-discovery purposes. In order to send BUM
160     traffic within a broadcast domain, P2MP SR policies can be used.
161     Such policies may or may not be shared by ELAN instances.

163     The proposed solution can also be applicable to the EVPN control
164     plane without compromising its benefits such as All-Active
165     multihoming on access, multipathing in the core, auto-provisioning
166     and auto-discovery, etc. With this approach, the need for
167     advertising EVPN route types 1 through 4 as well as the Split-Horizon
168     (SH) label is eliminated.

170     In the following sections, we will describe the functionalities of
171     the proposed approach in detail.

173  2. Terminology

175     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
176     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
177     document are to be interpreted as described in [RFC2119].

179  3. Abbreviations

181     BUM: Broadcast, unknown-unicast, and multicast.

183     CE: Customer Edge node, e.g., host, router, or switch.

185     ELAN: Ethernet LAN.

187     EVPN: Ethernet VPN.

189     MAC: Media Access Control.

191     MAC-VRF: A Virtual Routing and Forwarding table for Media Access
192     Control (MAC) addresses on a PE.

194     MH: Multi-Home.

196     OAM: Operations, Administration and Maintenance.

198     PE: Provider Edge node.

200     SID: Segment Identifier.

202     SR: Segment Routing.

204     VPLS: Virtual Private LAN Service.

206  4. Control Plane Behavior

208  4.1. Service discovery

210     A node can discover ELAN service instances as well as the associated
211     service SIDs hosted on other nodes via configuration or auto-
212     discovery.
        With the latter, the service SIDs can be advertised using
213     BGP. As mentioned earlier, such an update message will pack information
214     about as many ELAN instances hosted by the advertising PE node as
215     possible to reduce the number of update messages exchanged by PE nodes.

217     Similar to the service SID, an ingress PE node can discover the BUM SID
218     associated with an egress PE node via configuration or auto-
219     discovery.

221     The necessary BGP extensions will be specified in a future version of
222     this document.

224  4.2. All-Active Service Redundancy

226     An anycast SID per Ethernet Segment (ES) can be associated with the
227     PE nodes attached to a Multi-Home (MH) CE. The anycast SIDs will be
228     advertised in BGP by the PE nodes. Based on ES anycast SIDs, ingress
229     PEs receiving updates can discover the redundancy membership and
230     perform DF election. Aliasing/Multipathing can be achieved using the
231     same mechanisms exercised by the SR underlay for forwarding traffic to
232     destinations belonging to an anycast group.

234  4.3. Mass service withdrawal

236     Node failure can be detected via IGP convergence. For faster
237     detection of node failure, a mechanism such as BFD can be deployed. The
238     proposed approach does not require an additional MAC withdrawal
239     mechanism.

241     On PE-CE link failure, the corresponding PE node withdraws the route
242     to the corresponding ES in BGP in order to stop receiving traffic to
243     that ES. In the MH case with an anycast SID, upon detecting a failure
244     on a PE-CE link, a PE node may forward incoming traffic to the impacted
245     ES(s) to other PE node(s) that is/are part of the anycast group until
246     it withdraws routes to the impacted ES(s) for faster convergence.
247     For example, in Figure 1, assuming PE5 and PE6 are part of an anycast
248     group, upon link failure between PE5 and CE5, PE5 can forward the
249     received packets from the core to PE6 until it withdraws the anycast
250     SID associated with the ES(s).

252  4.4. 
E-Tree Support

254     To be covered in the next revision of this document.

256  5. Data Plane Behavior
257                                                  ____ CE3
258                                                 /         ____CE1
259                     -------- PE3 ---------              /
260                    /                            PE1
261                   /                              |  \
262                PE5                               |   \
263                /|                                |    \
264               / |    Service Provider Network    |     \
265            CE5  |                                |     CE2
266               \ |                                |     /
267                \ |                             PE2_/
268                PE6                             /
269               /     -------- PE4 --------
270          CE6___    /               CE4_____/

272          Figure 1: Reference network diagram used for examples below

274  5.1. Unicast Traffic

276     The proposed method requires that a unicast data packet be formed as
277     shown in Figure 2.

279                    +-------------------------------+
280                    |  SID(s) to reach destination  |
281                    +-------------------------------+
282                    |          Service SID          |
283                    +-------------------------------+
284                    |        Source node SID        |
285                    +-------------------------------+
286                    |        Layer-2 Payload        |
287                    +-------------------------------+

289             Figure 2: Data packet format for unicast traffic

291     * SID(s) to reach destination: depends on the intent of the underlay
292       transport:

294       - IGP shortest path: node SID of the destination. The
295         destination can belong to an anycast group.

297       - IGP path with intent: Flex-Algo SID if the destination can be
298         reached using the Flex-Algo SID for a specific intent (e.g.,
299         low latency). The destination can belong to an anycast group.

301       - SR policy (to support fine intent): a SID-list for the SR
302         policy that can be used to reach the destination.

304     * Service SID: The SID that uniquely identifies an ELAN instance in
305       a broadcast domain.

307     * Source node SID: The SID that uniquely identifies the source node.
308       This can be a node SID which may be part of an anycast group.
309       Note that such a SID is allocated as part of SR underlay
310       operation, and the proposed approach does not impose any
311       additional requirement.

313  5.2. BUM Traffic

315     In order to identify incoming BUM traffic, a unique SID (which will be
316     referred to as "BUM SID" in the rest of the document) per PE node is
317     allocated.
        A BUM packet is formatted as shown in Figure 3:

319                    +-------------------------------+
320                    |            BUM SID            |
321                    +-------------------------------+
322                    |          Service SID          |
323                    +-------------------------------+
324                    |        Source node SID        |
325                    +-------------------------------+
326                    |        Layer-2 Payload        |
327                    +-------------------------------+

329               Figure 3: Data packet format for BUM traffic

331     In order to send BUM traffic, a P2MP SR policy may be established
332     from a given node to the rest of the nodes associated with an ELAN
333     instance. If a dedicated P2MP SR policy is used per ELAN instance, a
334     single SID may be used both as the replication SID for the P2MP SR
335     policy and to identify the ELAN instance. With this approach, the
336     number of SIDs imposed on a data packet will be only two. It is possible
337     to use a given P2MP SR policy for multiple ELAN instances, in which case
338     the service SID needs to be inserted in the packet for the egress PE to
339     identify the ELAN instance for the BUM traffic.

341  5.3. Data Plane MAC learning

343     With the proposed approach, MAC addresses can be learned in data-plane
344     using the packets formatted as shown in Figure 4.

346     The source MAC address on the received Layer 2 packet is learned against
347     the source node SID placed directly under the service SID in the
348     data-plane.

350  5.3.1. Single Home CE

352     In Figure 1, node 3 learns a MAC address from CE3 and floods it to
353     all nodes configured with the same service SID. Nodes 1, 2, 4, 5 and
354     6 learn the MAC address as reachable via the source node SID of Node
355     3.

357                     +-----------------------------+
358                     | Tree SID/Broadcast Node SID |
359                     +-----------------------------+
360                     |         Service SID         |
361                     +-----------------------------+
362                     |     Node SID of node 3      |
363                     +-----------------------------+
364                     |       Layer-2 Packet        |
365                     +-----------------------------+

367                 Figure 4: Packet format used for flooding

369  5.3.2. 
Multi-Home CE

371     Referring to Figure 1, let's assume that node 5 learns a MAC address
372     from MH CE5, and floods it to all nodes in data-plane as per the SID
373     stack shown in Figure 5, including node 6. The receiving nodes learn
374     the MAC address as reachable via the anycast SID belonging to node 5
375     and node 6. Node 6 applies SH and hence does not send the packet
376     back to CE5, but treats the MAC address as reachable via CE5, and also
377     floods the address to CE6.

379     The following diagram shows the SID label stack for a Broadcast and
380     Multicast MAC frame sent by a Multi-Home PE. Note the presence of the
381     source SID after the service SID. This combination/order is
382     necessary for the receiver to learn the source MAC address (from the L2
383     packet) associated with the ingress PE (i.e., source node SID).

385                     +-----------------------------+
386                     | Tree SID/Broadcast Node SID |
387                     +-----------------------------+
388                     |         Service SID         |
389                     +-----------------------------+
390                     |       Source Node SID       |
391                     +-----------------------------+
392                     |       Layer-2 Packet        |
393                     +-----------------------------+

395          Figure 5: Data packet format for traffic sent by a MH PE

397  5.4. ARP suppression

399     Gleaning ARP packet requests and replies will be used to learn IP/MAC
400     binding for ARP suppression. ARP replies are unicast; however,
401     flooding ARP replies can allow all nodes to learn the MAC/IP bindings
402     for the destinations too.

404  5.5. Distributed Anycast Gateway

406     Distributed Anycast Gateway (GW) (aka inter-subnet IRB function) can
407     be realized as follows:

409     * All PEs connected to the tenant subnets share the same GW IP/MAC
410       per subnet.

412     * A PE MUST NOT learn its own GW IP/MAC via the tunnels connecting
413       itself to other PE(s).

415     * ARP requests/replies from the tenant subnet are flooded via the
416       ingress PE(s) attached to the subnet to all egress PE(s) attached
417       to the subnet so that egress PE(s) can learn the source MAC/IP
418       address via the ingress PE(s).
420     * ARP replies from tenants will be delivered to the local PE that hosts
421       the GW virtual MAC address. The local PE MUST flood the ARP
422       replies over the tunnel to other PEs. Other PEs, including the PE
423       which originated the ARP request, will learn the IP/MAC
424       association of the tenant from the received ARP reply.

426  5.6. Multi-pathing

428     Packets destined to a MH CE are distributed to the PE nodes attached
429     to the CE for load-balancing purposes. This is achieved implicitly
430     due to the use of anycast SIDs for both ES as well as PE attached to
431     the ES. In our example, traffic destined to CE5 is distributed via
432     PE5 and PE6.

434  5.7. E-Tree Support

436     To be covered in the next revision of this document.

438  6. Benefits of ELAN over SR

440     The proposed approach eliminates the need for establishing and
441     maintaining PWs as with legacy VPLS technology. This yields a
442     significant reduction in control plane overhead. Also, due to MAC
443     learning in data-plane (conversational MAC learning), the proposed
444     approach provides benefits such as fast convergence, fast MAC
445     movement, etc. Finally, using anycast SID, the proposed approach
446     provides All-Active multihoming as well as multipathing and ARP
447     suppression.

449  7. Security Considerations

451     The mechanisms in this document use Segment Routing control plane as
452     defined in Security considerations described in Segment Routing
453     control plane are equally applicable.

455  8. IANA Considerations

457     TBD.

459  9. Acknowledgements

461  10. References

463  10.1. Normative References

465     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
466                Requirement Levels", BCP 14, RFC 2119,
467                DOI 10.17487/RFC2119, March 1997,
468                .

470     [RFC8402]  Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
471                Decraene, B., Litkowski, S., and R. Shakir, "Segment
472                Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
473                July 2018, .
475     [RFC8660]  Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S.,
476                Decraene, B., Litkowski, S., and R. Shakir, "Segment
477                Routing with the MPLS Data Plane", RFC 8660,
478                DOI 10.17487/RFC8660, December 2019,
479                .

481     [RFC8754]  Filsfils, C., Ed., Dukes, D., Ed., Previdi, S., Leddy, J.,
482                Matsushima, S., and D. Voyer, "IPv6 Segment Routing Header
483                (SRH)", RFC 8754, DOI 10.17487/RFC8754, March 2020,
484                .

486  10.2. Informative References

488     [I-D.ietf-spring-segment-routing-policy]
489                Filsfils, C., Talaulikar, K., Voyer, D., Bogdanov, A., and
490                P. Mattes, "Segment Routing Policy Architecture", Work in
491                Progress, Internet-Draft, draft-ietf-spring-segment-
492                routing-policy-13, 28 May 2021,
493                .

496     [I-D.voyer-pim-sr-p2mp-policy]
497                Voyer, D., Filsfils, C., Parekh, R., Bidgoli, H., and Z.
498                Zhang, "Segment Routing Point-to-Multipoint Policy", Work
499                in Progress, Internet-Draft, draft-voyer-pim-sr-p2mp-
500                policy-02, 10 July 2020, .

503     [RFC4761]  Kompella, K., Ed. and Y. Rekhter, Ed., "Virtual Private
504                LAN Service (VPLS) Using BGP for Auto-Discovery and
505                Signaling", RFC 4761, DOI 10.17487/RFC4761, January 2007,
506                .

508     [RFC4762]  Lasserre, M., Ed. and V. Kompella, Ed., "Virtual Private
509                LAN Service (VPLS) Using Label Distribution Protocol (LDP)
510                Signaling", RFC 4762, DOI 10.17487/RFC4762, January 2007,
511                .
513  Authors' Addresses

515     Sami Boutros (editor)
516     Ciena Corporation
517     United States of America

519     Email: sboutros@ciena.com

521     Siva Sivabalan (editor)
522     Ciena Corporation
523     Canada

525     Email: ssivabal@ciena.com
526     Himanshu Shah
527     Ciena Corporation
528     United States of America

530     Email: hshah@ciena.com

532     James Uttaro
533     ATT
534     United States of America

536     Email: ju1738@att.com

538     Daniel Voyer
539     Bell Canada
540     Canada

542     Email: daniel.voyer@bell.ca

544     Bin Wen
545     Comcast
546     United States of America

548     Email: bin_wen@cable.comcast.com

550     Luay Jalil
551     Verizon
552     United States of America

554     Email: luay.jalil@verizon.com