BESS                                                              W. Lin
Internet-Draft                                                  Z. Zhang
Intended status: Standards Track                                J. Drake
Expires: November 9, 2015                         Juniper Networks, Inc.
                                                             May 8, 2015


                 EVPN Inter-subnet Multicast Forwarding
                    draft-lin-bess-evpn-irb-mcast-01

Abstract

   This document describes inter-subnet multicast forwarding procedures
   for Ethernet VPNs (EVPN).

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC2119.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 9, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Solution
     2.1.  IGMP/MLD Snooping Consideration
     2.2.  Receiver sites not connected to a source subnet
     2.3.  Receiver sites without IRB
     2.4.  Multi-homing Support
   3.  Security Considerations
   4.  Acknowledgements
   5.  References
     5.1.  Normative References
     5.2.  Informative References
   Authors' Addresses

1.  Introduction

   EVPN provides an extensible and flexible multi-homing VPN solution
   for intra-subnet connectivity among hosts/VMs over an MPLS/IP
   network. When forwarding among hosts/VMs across different IP subnets
   is required, Integrated Routing and Bridging (IRB) can be used
   [I-D.ietf-bess-evpn-inter-subnet-forwarding].

   A Network Virtualization Edge (NVE) device supporting IRB is called
   an L3 Gateway. In a centralized approach, a centralized gateway
   provides all L3 routing functionality, and even two Tenant Systems
   (TSes) on two subnets connected to the same NVE need to go through
   the central gateway, which is inefficient. In a distributed approach,
   each NVE (or most NVEs) has IRB configured, and inter-subnet traffic
   is routed locally without having to go through a central gateway.

   Inter-subnet multicast forwarding is more complicated and is not
   covered in [I-D.ietf-bess-evpn-inter-subnet-forwarding]. This
   document describes the procedures for inter-subnet multicast
   forwarding.

   For multicast traffic sourced from a TS in subnet 1, EVPN BUM
   (Broadcast, Unknown unicast, and Multicast) forwarding will deliver
   it to all sites, and NVEs with IRB interfaces for the subnet will
   associate the traffic with the corresponding IRB interfaces.
   From an L3 point of view, those NVEs are routers connected to the
   virtual LAN via the IRB interfaces, and the source is locally
   attached. Nothing is different from a traditional LAN, and regular
   IGMP/MLD/PIM procedures apply.

   If a TS is a multicast receiver, it uses IGMP/MLD to signal its
   interest in some multicast flows. One of the gateways is the
   IGMP/MLD querier and sends queries out of the IRB interfaces, which
   are forwarded throughout the subnet following EVPN BUM procedures.
   TSes send IGMP/MLD joins via multicast, which are also forwarded
   throughout the subnet via EVPN BUM procedures. The gateways receive
   the joins via their IRB interfaces. From a layer 3 point of view,
   this again is no different from a traditional LAN.

   On a traditional LAN, only one router can send multicast to the LAN.
   That is either the PIM Designated Router (DR) or the IGMP/MLD querier
   (when PIM is not needed, e.g., when the LAN is a stub network). On
   the source network, PIM is typically needed so that traffic can be
   delivered to other routers. For example, in the case of PIM-SM, the
   DR on the source network encapsulates the initial packets for a
   particular flow in PIM Register messages and sends them to the RP,
   triggering the necessary states for that flow to be built throughout
   the network.

   That also works in the EVPN scenario, although not efficiently.
   Consider the following example, where a tenant has two subnets
   (corresponding to two VLANs realized by two EVPN EVIs) at three
   sites. A multicast source is located at site 1 on VLAN/subnet 1, and
   three receivers are located at site 2 on VLAN/subnet 1 and at sites 1
   and 2 on VLAN/subnet 2. On subnet 1, NVE1 is the PIM DR, while on
   subnet 2, NVE3 is the PIM DR. The connections drawn among the NVEs
   are L3 connections (typically via L3VPN).

   Multicast traffic from the source at site 1 on subnet 1 is forwarded
   to all three sites on VLAN 1 following EVPN procedures.
   Rcvr1 gets the traffic when NVE2 sends it out of its local Attachment
   Circuit (AC). The three gateways for EVI1 also receive the traffic on
   their IRB interfaces to potentially route to other subnets. NVE3 is
   the DR on subnet 2, so it routes the local traffic (from an L3 point
   of view) to subnet 2, while NVE1 and NVE2 are not the DR on subnet 2,
   so they do not. Once traffic gets onto subnet 2, it is forwarded back
   to NVE1 and NVE2 and delivered to rcvr2 and rcvr3 following EVPN
   procedures.

   Notice that both NVE1 and NVE2 receive the multicast traffic from
   subnet 1 on their IRB interfaces for subnet 1, but they do not route
   it to subnet 2, where they are not the DRs. Instead, they wait to
   receive the traffic at L2 from NVE3. For example, for receiver 3,
   which is connected to NVE1 but is on a different IP subnet from the
   multicast source, the multicast traffic from the source has to go
   from NVE1 to NVE3 and then back to NVE1 before it is delivered to
   receiver 3. This is similar to the hair-pinning issue with the
   centralized approach for unicast (forwarding is centralized via the
   DR), even though the distributed approach is being used for unicast
   (in that each NVE supports IRB and routes inter-subnet unicast
   traffic locally).

          site 1    .     site 2      .       site 3
                    .                 .
           src      .     rcvr1       .
            |       .       |         .
      --------------------------------------------  VLAN 1 (EVI1)
            |       .       |         .          |
        IRB1| DR    .   IRB1|         .      IRB1|
          NVE1------------NVE2-----------------NVE3---RP
        IRB2|       .   IRB2|         .      IRB2| DR
            |       .       |         .          |
      --------------------------------------------  VLAN 2 (EVI2)
            |       .       |         .
          rcvr3     .     rcvr2       .
                    .                 .
          site 1    .     site 2      .       site 3

2.  Solution

   This multicast hair-pinning can be avoided if the following
   procedures are followed:

   o  On the IRB interfaces, each gateway forwards multicast traffic as
      long as there are receivers for the traffic, regardless of whether
      it is the DR.

   o  On the IRB interfaces, each gateway sends PIM joins towards the RP
      or source if it has IGMP/MLD group membership, regardless of
      whether it is the DR/querier.

   o  Multicast data traffic sent out of the IRB interfaces is forwarded
      to local ACs only and not to other EVPN sites.

   Essentially, each router on an IRB interface behaves as a DR/querier
   for receivers (but only the true DR behaves as a DR for sources), and
   multicast data traffic from IRB interfaces is limited to local
   receivers.

   Note that link-local multicast traffic (e.g., addressed to 224.0.0.x
   in the case of IPv4), typically used for protocols, is not subject to
   the above procedures and is still forwarded to remote sites following
   EVPN procedures.

   In the above example, when NVE1 gets traffic on its IRB1 interface,
   it will route the traffic out of its IRB2 and deliver it to the local
   rcvr3. It also sends Register messages to the RP, since it is the DR
   on the source network. Both NVE2 and NVE3 will receive the traffic on
   IRB1, but neither sends Register messages to the RP, since neither is
   the DR on the source subnet. NVE2 will route the traffic out of its
   IRB2 and deliver it to the local rcvr2. NVE3 will also route the
   traffic out of IRB2 even though there is no receiver at the local
   site, because the IGMP/MLD joins from rcvr2 and rcvr3 are also
   received by NVE3.

2.1.  IGMP/MLD Snooping Consideration

   In the above example, NVE3 receives IGMP/MLD joins from rcvr2 and
   rcvr3 and will route packets out of IRB2, even though there are no
   receivers at the local site. IGMP/MLD snooping on NVE3 can prevent
   the traffic from actually being sent out of ACs, but at L3 there will
   still be related states and processing/forwarding (e.g., IRB2 will be
   in the downstream interface list for PIM join states and forwarding
   routes).
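
   As an illustration only, the per-gateway behavior from Section 2 and
   the NVE3 case just described can be sketched in Python. This is a
   minimal sketch, not part of the specification: the Gateway class, its
   fields, the action strings, and the actions_for_source_traffic helper
   are hypothetical names introduced for this example, and the model
   assumes no snooping-based suppression, so every gateway learns all
   joins on its IRB interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """Hypothetical model of one NVE with IRB interfaces (illustrative only)."""
    name: str
    dr_on_source_subnet: bool
    # Receivers attached to local ACs, keyed by subnet name (e.g. "VLAN2").
    local_receivers: dict = field(default_factory=dict)


def actions_for_source_traffic(gw, source_subnet, joined_subnets):
    """List what a gateway does with traffic arriving on the IRB interface
    of source_subnet.

    joined_subnets holds the subnets on which the gateway has learned
    IGMP/MLD membership through its IRB interfaces.
    """
    actions = []
    # Only the true DR on the source subnet behaves as a DR for sources.
    if gw.dr_on_source_subnet:
        actions.append("send PIM Register to RP")
    for subnet in sorted(joined_subnets):
        if subnet == source_subnet:
            continue  # intra-subnet delivery is left to EVPN BUM forwarding
        # Every gateway routes for receivers, whether or not it is the DR.
        actions.append(f"route out of IRB for {subnet}")
        # Traffic sent out of an IRB interface goes to local ACs only.
        for rcvr in gw.local_receivers.get(subnet, []):
            actions.append(f"deliver to local {rcvr} on {subnet}")
    return actions


# The example topology: source on VLAN1 at site 1, NVE1 is the DR on VLAN1.
nve1 = Gateway("NVE1", True, {"VLAN2": ["rcvr3"]})
nve2 = Gateway("NVE2", False, {"VLAN1": ["rcvr1"], "VLAN2": ["rcvr2"]})
nve3 = Gateway("NVE3", False)
joined = {"VLAN1", "VLAN2"}

for gw in (nve1, nve2, nve3):
    print(gw.name, actions_for_source_traffic(gw, "VLAN1", joined))
```

   In this sketch NVE3 still gets a "route out of IRB" action for VLAN2
   with no local delivery, which mirrors the residual L3 state noted
   above.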

   To prevent NVE3 from learning those remote receivers at all, IGMP/MLD
   snooping on NVE3 could optionally prevent the joins from remote sites
   from being sent to its IRB interface. With that, in the above example
   NVE3 will not learn of rcvr2 and rcvr3 on IRB2 and will not try to
   route packets out of IRB2 at all.

2.2.  Receiver sites not connected to a source subnet

   In the above example, the source subnet is connected to all NVEs that
   have receiver sites, and there are no receivers outside the EVPN
   network. As a result, PIM is not really needed and each NVE can just
   route multicast traffic locally. In that case, the IGMP/MLD querier
   is responsible for sending traffic to a subnet.

   If there is a receiver subnet connected to an NVE that is not
   connected to the source subnet, then there must exist layer 3
   multicast paths between them. This could be over an L3VPN core (in
   this revision it is assumed that the subnets realized by EVPN are
   stub only and not transit), and normal PIM and MVPN procedures will
   be followed.

   The L3VPN routes can be propagated either per [RFC4364] procedures or
   per EVPN Type 5 procedures
   [I-D.ietf-bess-evpn-prefix-advertisement]. BGP-MVPN [RFC6514]
   requires that the routes used for RPF checking carry two extended
   communities (ECs): the VRF Route Import EC and the Source AS EC. That
   must be applied to EVPN Prefix Advertisement (Type 5) routes as well.

2.3.  Receiver sites without IRB

   It is possible that a particular NVE may not have an IRB interface
   for a particular L2 domain. In that case, for traffic from another L2
   domain, receivers need to receive it from another NVE following EVPN
   procedures. The obvious choice is to receive it from the DR of that
   subnet. Because an NVE does not deliver traffic out of IRBs to remote
   sites with IRB, the DR needs to use a separate provider tunnel to
   deliver traffic only to sites that do not have IRB interfaces.
   The tunnel can be advertised via a separate Multicast Ethernet Tag
   Route, and only the sites without IRBs will join that tunnel. Details
   for that route and procedure will be provided in future revisions.

2.4.  Multi-homing Support

   The solution works equally well in multi-homing situations, though a
   special situation must be considered.

   As shown in the diagram below, both rcvr4 and rcvr5 are active-active
   multi-homed to NVE2 and NVE3. Receiver 4 is on VLAN/subnet 1 and
   receiver 5 is on VLAN/subnet 2.

   When the IRBs on NVE2 and NVE3 forward multicast traffic to their
   locally attached access interface(s) based on EVPN BUM procedures,
   only the Designated Forwarder (DF) for the Ethernet Segment (ES)
   delivers multicast traffic to its multi-homed receiver. Hence no
   duplicate multicast traffic will be forwarded to receiver 4 or
   receiver 5.

   If NVE2 does not have an IRB interface and becomes the DF on the
   multi-homed segments, then rcvr5 will not be able to receive traffic
   from a different layer 2 domain (rcvr4 will, because it is in the
   same layer 2 domain as the source, so traffic does not have to go
   through IRB).

   To handle this situation, this document proposes the addition of a
   new TLV, the IRB Mcast Capable TLV, to the Ethernet Segment route
   from NVEs with IRB configured for the corresponding segment. Those
   without an IRB interface configured will not add the TLV to their
   routes for the segment. In addition to the standard DF election run
   by all NVEs to elect a standard DF, another election is run among
   those NVEs that included the IRB Mcast Capable TLV in their Ethernet
   Segment routes to elect an "IRB DF". If all NVEs on a segment have
   IRB configured, then the two elections should result in the same NVE
   being elected. Otherwise, the elected IRB DF may be different from
   the standard DF. In either case, the IRB DF will be responsible for
   delivering multicast traffic from IRB to the segment.
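
   The two elections described above can be sketched as follows. This is
   a hedged illustration, not a definition: this document does not
   specify the election algorithm for the IRB DF, so the sketch assumes,
   purely for illustration, that both elections reuse the default
   modulo-based DF election of [RFC7432] (candidates ordered by IP
   address, ordinal V mod N for VLAN V). The function names, the shape of
   the es_routes mapping, and the 192.0.2.x addresses are hypothetical.

```python
import ipaddress


def elect_df(candidates, vlan):
    """Default modulo-style election over candidate NVE addresses.

    Candidates are ordered by IP address; the one whose ordinal equals
    vlan mod N wins. Assumes all addresses are of the same IP family.
    """
    if not candidates:
        return None
    ordered = sorted(candidates, key=ipaddress.ip_address)
    return ordered[vlan % len(ordered)]


def elect_standard_and_irb_df(es_routes, vlan):
    """es_routes maps an NVE address to True if that NVE's Ethernet Segment
    route carried the IRB Mcast Capable TLV (hypothetical representation).
    """
    # Standard DF: elected among all NVEs attached to the segment.
    standard_df = elect_df(list(es_routes), vlan)
    # IRB DF: elected only among NVEs that advertised the TLV.
    irb_capable = [nve for nve, capable in es_routes.items() if capable]
    irb_df = elect_df(irb_capable, vlan)
    return standard_df, irb_df


# NVE2 (192.0.2.2) has no IRB for the segment; NVE3 (192.0.2.3) does.
mixed = {"192.0.2.2": False, "192.0.2.3": True}
print(elect_standard_and_irb_df(mixed, vlan=2))

# When every NVE on the segment has IRB, both elections pick the same NVE.
all_irb = {"192.0.2.2": True, "192.0.2.3": True}
print(elect_standard_and_irb_df(all_irb, vlan=2))
```

   With the mixed input, the standard DF and the IRB DF differ, which is
   the case where only the IRB DF delivers routed multicast traffic to
   the segment.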

   In the multi-homing example given below, NVE2 does not have an IRB
   interface, so it does not include the IRB Mcast Capable TLV in its ES
   routes, and NVE3 will be elected as the IRB DF for both ESes.

                    .
           src      .       +-------- rcvr4-----+
            |       .       |         .         |
      --------------------------------------------  VLAN 1 (EVI1)
            |       .       |         .         |
        IRB1| DR    .       |         .     IRB1|
          NVE1------------NVE2-----------------NVE3---RP
        IRB2|       .       |         .     IRB2| DR
            |       .       |         .         |
      --------------------------------------------  VLAN 2 (EVI2)
            |       .       |         .         |
          rcvr3     .       +-------- rcvr5-----+
                    .

3.  Security Considerations

   This document does not introduce new security risks.

4.  Acknowledgements

5.  References

5.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC7432]  Sajassi, A., Aggarwal, R., Bitar, N., Isaac, A., Uttaro,
              J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet
              VPN", RFC 7432, February 2015.

5.2.  Informative References

   [I-D.ietf-bess-evpn-inter-subnet-forwarding]
              Sajassi, A., Salam, S., Thoria, S., Rekhter, Y., Drake,
              J., Yong, L., and L. Dunbar, "Integrated Routing and
              Bridging in EVPN",
              draft-ietf-bess-evpn-inter-subnet-forwarding-00 (work in
              progress), November 2014.

   [I-D.ietf-bess-evpn-prefix-advertisement]
              Rabadan, J., Henderickx, W., Palislamovic, S., Balus, F.,
              and A. Isaac, "IP Prefix Advertisement in EVPN",
              draft-ietf-bess-evpn-prefix-advertisement-01 (work in
              progress), March 2015.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, February 2006.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, February 2012.

Authors' Addresses

   Wen Lin
   Juniper Networks, Inc.

   EMail: wlin@juniper.net


   Zhaohui Zhang
   Juniper Networks, Inc.

   EMail: zzhang@juniper.net


   John Drake
   Juniper Networks, Inc.

   EMail: jdrake@juniper.net