BESS                                                              W. Lin
Internet-Draft                                                  Z. Zhang
Intended status: Standards Track                                J. Drake
Expires: September 14, 2017                       Juniper Networks, Inc.
                                                              J. Rabadan
                                                                   Nokia
                                                              A. Sajassi
                                                           Cisco Systems
                                                          March 13, 2017

                 EVPN Inter-subnet Multicast Forwarding
                    draft-lin-bess-evpn-irb-mcast-03

Abstract

   This document describes inter-subnet multicast forwarding procedures
   for Ethernet VPNs (EVPN). This includes forwarding inside an EVPN
   domain and to/from outside the EVPN domain.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 14, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Background and Terminologies
         1.1.1. Integrated Routing and Bridging
         1.1.2. General Multicast Routing
      1.2. Inter-subnet Multicast in EVPN
   2. EVPN-aware Solution
      2.1. Basic Operations
      2.2. Multi-homing Support
      2.3. Receiver NVEs not connected to a source subnet
         2.3.1. IMET routes advertisement
         2.3.2. Layer 2 Forwarding State
         2.3.3. Layer 3 Forwarding State
      2.4. Selective Multicast
      2.5. Advanced Topics
         2.5.1. Legacy NVEs
         2.5.2. Traffic to/from outside of an EVPN domain
         2.5.3. Integration with MVPN
         2.5.4. When Tenant Routers Are Present
   3. IANA Considerations
   4. Security Considerations
   5. Acknowledgements
   6. References
      6.1. Normative References
      6.2. Informative References
   Appendix A. Integrated Routing and Bridging
   Authors' Addresses

1. Introduction

   EVPN offers an efficient L2 VPN solution with all-active multi-homing
   support for intra-subnet connectivity over an MPLS/IP network. EVPN
   also provides an integrated L2 and L3 service. When forwarding among
   Tenant Systems (TS) across different IP subnets is required,
   Integrated Routing and Bridging (IRB) can be used
   [ietf-bess-evpn-inter-subnet-forwarding].

   A Network Virtualization Endpoint (NVE) device supporting IRB is
   called an L3 Gateway. In a centralized approach, a centralized
   gateway provides all routing functionality, and even two tenant
   systems on two subnets connected to the same NVE need to go through
   the central gateway, which is inefficient.
   In a distributed approach, each NVE has IRB configured, and
   inter-subnet traffic is routed locally without having to go through
   a central gateway.

   Inter-subnet multicast forwarding is more complicated and is not
   covered in [ietf-bess-evpn-inter-subnet-forwarding]. This document
   describes the procedures for inter-subnet multicast forwarding.

1.1. Background and Terminologies

   For each Broadcast Domain (BD, an L2 concept), there is usually a
   subnet (an L3 concept). This document may use subnet and BD
   interchangeably. When inter-subnet forwarding is allowed between
   some subnets of the same tenant on the same NVE, the BDs are
   associated with the same routing instance via IRB interfaces.
   Multiple BDs of the same tenant may be attached to different routing
   instances if inter-subnet forwarding is subject to some restrictions.
   This document assumes that inter-subnet forwarding is allowed by
   default between subnets of the same tenant.

1.1.1. Integrated Routing and Bridging

   Appendix A describes the concept of Integrated Routing and Bridging,
   and IRB interfaces in particular, in more detail.

   An IRB interface is a logical connection between a BD and a routing
   instance. It has two ends: one on the routing instance side and one
   on the BD side. In this document, when we say a packet is
   "routed/sent down an IRB interface", it is from the L3 point of view
   and on the routing instance side (from L3 down to L2). L3 forwarding
   related processing such as TTL handling, fragmentation, and MAC
   address change is done before the packet is put onto the IRB
   interface "wire" and sent to the corresponding BD. From the BD's
   point of view, that packet is received on the BD side of the IRB
   interface and L2 switched out of one or more other L2 interfaces
   (Attachment Circuits or ACs) in the BD.
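
   The ordering implied by "routed down an IRB interface" can be
   illustrated with a minimal sketch. The Packet and BroadcastDomain
   types, the function name, and the rule of dropping a packet whose
   TTL would expire are assumptions made for illustration; they are not
   defined by this document, and fragmentation is omitted.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Packet:
    src_mac: str
    ttl: int
    payload: bytes


@dataclass(frozen=True)
class BroadcastDomain:
    name: str
    irb_mac: str                # MAC address used on the IRB interface
    local_acs: Tuple[str, ...]  # local Attachment Circuits in this BD


def route_down_irb(pkt: Packet, bd: BroadcastDomain) -> List[Tuple[str, Packet]]:
    """Route a packet down bd's IRB interface, then L2 switch it to local ACs."""
    if pkt.ttl <= 1:
        # Assumption: standard IP behavior, the packet is not forwarded.
        return []
    # Routing instance side: L3 processing happens before the IRB "wire".
    routed = replace(pkt, ttl=pkt.ttl - 1, src_mac=bd.irb_mac)
    # BD side: the frame is received on the IRB and switched out of the ACs.
    return [(ac, routed) for ac in bd.local_acs]
```

   The point of the sketch is only the split of responsibilities: the
   routing instance rewrites the packet, and the BD treats the result
   as an ordinary frame received on one of its interfaces.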

   Note that there is one BD in a MAC-VRF with VLAN-based service and
   multiple BDs in a MAC-VRF with VLAN-aware bundle service. Therefore,
   a routing instance for a tenant may have one or more MAC-VRFs
   associated with it, with the IRB interfaces being the ties.

1.1.2. General Multicast Routing

   IP routing is inter-subnet forwarding: traffic received from one
   subnet is routed/forwarded to other subnets. The subnets could be
   traditional networks like LANs or could be broadcast domains
   implemented by EVPN. This section provides a very high level
   description of layer 3 multicast routing and is not specific to EVPN
   at all.

   Multicast routing is based on trees rooted at the source or
   Rendezvous Point (RP). Typically the tree is set up by the PIM
   protocol [RFC7761] following the reverse path from a receiver towards
   the source/RP. On a particular router on the tree, the process to
   determine the upstream interface/neighbor is called the RPF process
   and the upstream interface/neighbor is also called the RPF
   interface/neighbor. The PIM protocol signals the control plane
   state, and corresponding (s,g) or (*,g) forwarding state is
   installed on the routers on the tree. The forwarding state includes
   one (or more, in case of bidirectional trees) expected Incoming
   Interfaces (IIFs) and a list of Outgoing Interfaces (OIFs). The IIF
   is the RPF interface towards the source/RP (IIF is forwarding state
   while RPF is control plane state, but the two terms may be used
   interchangeably in this document), and in case of bidirectional
   trees [RFC5015], the IIFs also include other interfaces where
   traffic is accepted.

   An interface is added to the OIF list if one of the following two
   conditions is met:

   o  There are local receivers on the subnet that the interface is
      connected to, and this router is the PIM Designated Router (DR) or
      IGMP/MLD Querier if PIM is not used.
      In this case the router is referred to as a Last Hop Router
      (LHR).

   o  A PIM join has been received from a downstream router connected by
      this interface.

   The LHR also sends PIM join messages towards its RPF neighbor. This
   establishes the branch of the tree towards the root.

   In case of PIM-SM for ASM (Any Source Multicast), the LHRs send (*,g)
   joins towards the RP, establishing a (*,g) shared tree rooted at the
   RP. On the subnet that a source is connected to, the PIM DR,
   referred to as the First Hop Router (FHR), sends PIM Register
   messages to the RP when it receives initial traffic for a flow. The
   RP then sends an (s,g) PIM join towards the FHR, establishing a
   branch from the RP towards the source. Traffic is initially sent
   from the FHR to the RP following the (s,g) branch, and the RP
   delivers the traffic to all LHRs following the (*,g) shared tree.
   Upon receiving traffic, an LHR optionally sends an (s,g) join towards
   the source, establishing an (s,g) branch between the source and the
   LHR so that traffic can follow a more optimal path.

1.2. Inter-subnet Multicast in EVPN

   For multicast traffic sourced from a TS in subnet 1, EVPN Broadcast,
   Unknown unicast, Multicast (BUM) forwarding based on [RFC7432] will
   deliver it to all sites in subnet 1. When NVEs receive the multicast
   traffic on IRBs for subnet 1, they route the traffic to other subnets
   via their IRB interfaces following multicast routing procedures.
   From an L3 point of view, each NVE has an (IRB) interface to subnet
   1, and hence is attached to the same subnet as the multicast source.
   Nothing is different from a traditional LAN and regular IGMP/MLD/PIM
   procedures kick in.

   If a TS is a multicast receiver, it uses IGMP/MLD to signal its
   interest in some multicast flows. One of the gateways is the
   IGMP/MLD querier for a given subnet.
   It sends queries down the IRB for that subnet, which in turn causes
   the queries to be forwarded throughout the subnet following the EVPN
   BUM procedures. TSs send IGMP/MLD joins via multicast, which are
   also forwarded throughout the subnet via the EVPN BUM procedures.
   The gateways receive the joins via their IRB interfaces. From the
   layer 3 point of view, again it is no different from a traditional
   LAN.

   On a traditional LAN, only one router can send multicast to local
   receivers on the LAN. That is either the PIM Designated Router
   (subject to the PIM Assert procedure) or the IGMP/MLD querier (if PIM
   is not used, e.g., the LAN is a stub network). On the source subnet,
   PIM is typically needed so that traffic can be delivered to other
   subnets via other routers. For example, in case of PIM-SM, the DR on
   the source network encapsulates the initial packets for a particular
   ASM flow in PIM Register messages and unicasts the Register messages
   to the Rendezvous Point (RP) for that flow, triggering the necessary
   state for that flow to be built throughout the network.

   That also works in the EVPN scenario, although not efficiently.
   Consider the example depicted in Figure 1, where a tenant has two
   subnets (subnets 1 and 2) corresponding to two EVPN broadcast domains
   (VLANs 1 and 2) at three sites. With VLAN-based service, each
   broadcast domain has its own EVI. With VLAN-aware bundle service,
   many broadcast domains can belong to the same EVI.

   In Figure 1, a multicast source is located at site 1 on subnet 1, and
   three receivers are located at site 2 on subnet 1 (rcvr1), site 2 on
   subnet 2 (rcvr2), and site 1 on subnet 2 (rcvr3). PIM adjacencies
   are formed among the NVEs on each subnet. On subnet 1, NVE1 is the
   PIM DR while on subnet 2, NVE3 is the PIM DR.

   Multicast traffic from the source at site 1 on subnet 1 is forwarded
   to all three sites on BD 1 following the EVPN BUM procedures.
   Rcvr1 gets the traffic when NVE2 sends it out of its local Attachment
   Circuit (AC). The three gateways for EVI1 also receive the traffic
   on their IRB interfaces for subnet 1 and potentially route it to
   other subnets. NVE3 is the DR on subnet 2, so it routes the local
   traffic (from the L3 point of view) to subnet 2, while NVE1 and NVE2
   are not the DR on subnet 2, so they do not. Once traffic gets onto
   subnet 2, it is forwarded back to NVE1/2 and delivered to rcvr2/3
   following the EVPN BUM procedures.

   Notice that the traffic is sent across the EVPN core multiple times,
   once for each subnet with receivers. Additionally, both NVE1 and
   NVE2 receive the multicast traffic from subnet 1 on their IRB
   interfaces for subnet 1, but they do not route it to subnet 2, where
   they are not the PIM DRs. Instead, they wait to receive the traffic
   at L2 from NVE3. For example, for receiver 3, which is connected to
   NVE1 but is on a different IP subnet from the multicast source, the
   multicast traffic from the source has to go from NVE1 to NVE3 and
   then back to NVE1 before it is delivered to receiver 3. This is
   similar to the hairpinning issue with the centralized approach: the
   inter-subnet multicast forwarding is centralized via the DR, even
   though the distributed approach is being used for unicast (in that
   each NVE supports IRB and routes inter-subnet unicast traffic
   locally).

   site 1    .     site 2       .      site 3
             .                  .
     src     .      rcvr1       .
      |      .        |         .
  --------------------------------------------  BD 1
      |      .        |         .          |
  IRB1| DR   .    IRB1|         .      IRB1|
    NVE1------------NVE2-----------------NVE3---RP
  IRB2|      .    IRB2|         .      IRB2| DR
      |      .        |         .          |
  --------------------------------------------  BD 2
      |      .        |         .
    rcvr3    .      rcvr2       .
             .                  .
   site 1    .     site 2       .      site 3

               Figure 1 - EVPN IRB multicast scenario

2. EVPN-aware Solution

   In the above text, the term "gateway" is from the hosts' point of
   view, referring to a "routing gateway" that provides layer 3
   forwarding. With the distributed approach, every or almost every NVE
   is a gateway, hence in the rest of the document we simply use the
   term NVE instead of gateway.

2.1. Basic Operations

   The multicast forwarding inefficiency described above (hairpinning
   and multiple copies across the core) can be avoided if the following
   Optimized Inter-subnet Multicast (OISM) procedures are followed:

   1. When a routing instance on an NVE receives multicast traffic on
      one of its IRB interfaces, it routes the traffic down any other
      IRB interfaces that attach to subnets that have receivers for the
      traffic, regardless of whether the NVE is the DR for those IRB
      interfaces.

   2. For ASM multicast traffic sourced from a local AC, if PIM runs on
      the corresponding IRB interface, the NVE behaves as if it were
      the DR on the IRB interface and performs PIM Registering
      procedures.

   3. When an NVE receives Membership Reports from one of its ACs and
      PIM runs on the corresponding IRB interface, it sends PIM joins
      towards the RP or source regardless of whether it is the
      DR/querier.

   4. Multicast data traffic received by a BD on its IRB interface
      (i.e., multicast data traffic routed down the IRB interface) is
      L2 switched out of that BD's local ACs only and not forwarded to
      other NVEs. Note that link local multicast traffic (e.g.,
      addressed to 224.0.0.x in case of IPv4) is not subject to the
      above procedures. It is still forwarded to remote NVEs in the
      same subnet following EVPN procedures and not routed into other
      subnets.

   The above procedures are for routing traffic from the source subnet
   to other subnets. In the source subnet itself, traffic is L2
   switched according to EVPN procedures.
   It is assumed that each NVE of the tenant can receive the L2 switched
   traffic in the source subnet. If there are NVEs not attached to
   every subnet (so that an NVE cannot receive L2 switched traffic in a
   source subnet that it is not connected to), then a Supplemental BD
   (SBD, Section 2.3) is needed to L2 switch the traffic from the source
   NVE to NVEs not attached to the source subnet. In that SBD,
   multicast data traffic received on its IRB interface is forwarded to
   other NVEs, as an exception to rule 4.

   That exception is needed for the situations discussed in
   Section 2.5.2 and Section 2.5.4.

   In the example in Figure 1, when NVE1's routing instance receives
   traffic on its IRB1 interface, it routes the traffic down its IRB2
   for delivery to local rcvr3. It also sends Register messages to the
   RP since the source is local. Both NVE2 and NVE3 receive the
   traffic on IRB1, but neither sends Register messages to the RP, since
   the source is not local. NVE2 routes the traffic down its IRB2 and
   delivers it to local rcvr2. NVE3 also routes the traffic down IRB2
   even though there is no receiver at the local site, because the
   IGMP/MLD joins from rcvr2/3 are also received by NVE3.

   Essentially, each NVE behaves as a DR/querier on an IRB interface for
   local senders and receivers, and multicast data traffic routed down
   IRB interfaces is limited to local receivers.

   If EVPN is only used to provide DC overlay service but not transit
   service (i.e., simulating a transit LAN connecting tenant routers)
   for a tenant, then there is no need to run the PIM protocol, and
   rules 2 and 3 above do not apply. Otherwise, the additional
   procedures in Section 2.5.4 are needed.

2.2. Multi-homing Support

   The solution works as described when there are multi-homed Ethernet
   segments.

   As shown in Figure 2, both rcvr4 and rcvr5 are all-active multi-homed
   to NVE2 and NVE3.
   Receiver 4 is on subnet BD 1 and receiver 5 is on BD 2. When the
   IRBs on NVE2 and NVE3 forward multicast traffic to their locally
   attached access interfaces based on the EVPN BUM procedures, only the
   DF for the ES delivers multicast traffic to its multi-homed receiver.
   Hence no duplicated multicast traffic is forwarded to receiver 4 or
   receiver 5.

             .
     src     .        +--------- rcvr4-----+
      |      .        |         .          |
  --------------------------------------------  BD 1 (EVI1)
      |      .        |         .          |
  IRB1| DR   .    IRB1|         .      IRB1|
    NVE1------------NVE2-----------------NVE3---RP
  IRB2|      .    IRB2|         .      IRB2| DR
      |      .        |         .          |
  --------------------------------------------  BD 2 (EVI2)
      |      .        |         .          |
    rcvr3    .        +--------- rcvr5-----+
             .

            Figure 2 - EVPN IRB multicast and multi-homing

   For traffic sourced from a multi-homed ES, existing split-horizon
   procedures work as is, because vanilla EVPN forwarding is used for
   intra-subnet traffic.

2.3. Receiver NVEs not connected to a source subnet

   The procedures of this document require that an inter-subnet
   multicast packet is carried across the core as an intra-subnet
   frame. However, consider the case where, for a given tenant, (a)
   NVE-1 attaches to subnet-1, (b) NVE-2 attaches to subnet-2 but not to
   subnet-1, and (c) a receiver in subnet-2 needs to receive multicast
   packets that are sourced in subnet-1. Since NVE-1 sends the packets
   across the core as intra-subnet multicasts, how does NVE-2 receive
   the packets?

   One possible solution would be to configure subnet-1 on NVE-2. On
   NVE-2, subnet-1 would have an IRB interface attaching it to the
   routing instance, but subnet-1 would have no ACs. Then NVE-2 would
   receive the intra-subnet multicast traffic of subnet-1, and the
   procedures already discussed would cause the traffic to be forwarded
   to NVE-2's local ACs for subnet-2.

   However, if a given tenant has many subnets, only a few of which
   attach to any given NVE, it is undesirable to have to configure all
   those subnets on all those NVEs. To avoid this, we introduce the
   notion of a "Supplemental Broadcast Domain" (SBD). Each NVE has a
   single SBD (per tenant) configured. The SBD has no ACs, just an IRB
   interface. The purpose of the SBD on a given NVE is to receive
   (over the core) the intra-subnet multicasts of all subnets that are
   not attached to that NVE. Additionally, traffic routed down the SBD
   IRB interface is sent across the core to remote NVEs. This is an
   exception to rule 4 in Section 2.1, and is explained in
   Section 2.5.2 and Section 2.5.4.

   Thus in the above example, when NVE-1 sends a multicast packet from
   subnet-1 to other NVEs, NVE-2 receives the packet on the SBD. Note
   that, in the example, NVE-1 does not have to send any extra copies
   of the packet across the core. It just sends what it would normally
   send. If an NVE receiving the packet is attached to subnet-1, it
   associates the packet with subnet-1; if an NVE receiving the packet
   is not attached to subnet-1, it associates the packet with the SBD.

   Subsequent sections explain how the NVEs construct the necessary EVPN
   routes to make this happen.

2.3.1. IMET routes advertisement

   The SBD is a separate broadcast domain present on all the NVEs of the
   tenant. It has a corresponding IRB interface but no ACs. With
   VLAN-based service, the SBD is in its own EVI. With VLAN-aware
   bundle service, the SBD is just an additional BD in the EVI. The SBD
   uses a Route Target that allows its routes to be imported by all the
   NVEs of the tenant and associated with the SBD. In case of
   VLAN-aware bundle service, the Route Target may be the same as or
   different from the Route Targets for other BDs in the same EVI.
   In this document, when we say a route is originated for/in the SBD,
   it means that the RD of the route is set to the RD of the originating
   NVE's MAC-VRF for the SBD, the Route Target is set to that of the
   SBD, and the Tag ID is set to 0 in case of VLAN-based service or to
   the Tag ID for the SBD in case of VLAN-aware bundle service.

   The rules of IMET route advertisement can be summarized as follows:

   o  When IR, BIER, or RSVP-TE P2MP is being used for inclusive
      tunnels, each NVE originates an IMET route in the SBD. In case of
      IR, the MPLS Label field in the IMET route's PMSI Tunnel Attribute
      (PTA) is a downstream allocated label for the SBD.

   o  When PIM, BIER, or mLDP/RSVP-TE P2MP is being used for inclusive
      tunnels, the IMET route that an NVE originates for a subnet
      carries the RT for the subnet and the RT for the SBD.

   o  In case of BIER, or if tunnel aggregation (a single tunnel used
      for more than one broadcast domain) is used for mLDP/RSVP-TE
      P2MP, the IMET route for the source subnet carries an upstream
      allocated label in the PMSI Tunnel Attribute. The label is
      different for each source subnet.

   With the above rules, IMET routes are advertised in both the SBD and
   source subnets if IR, BIER, or RSVP-TE P2MP tunnels are used. IMET
   routes are only advertised in the source subnet in case of PIM/mLDP
   P2MP tunnels.

2.3.2. Layer 2 Forwarding State

   In case of IR, when a source NVE builds its L2 forwarding state for a
   BD, it finds all the remote NVEs that need to receive traffic by
   finding the IMET routes for the SBD. The IMET routes for the SBD are
   those in the MAC-VRF for the SBD (in case of VLAN-based service) or
   those in the MAC-VRF for the SBD and with the SBD's Tag ID (in case
   of VLAN-aware bundle service).
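
   For the IR case, the selection of remote NVEs described above can be
   sketched as follows. The ImetRoute record, its field names, and the
   function names are hypothetical simplifications of an IMET route as
   held in a local MAC-VRF; they are assumptions made for illustration
   and are not defined by this document.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ImetRoute:
    originator: str  # NVE that originated the route
    mac_vrf: str     # local MAC-VRF the route was imported into
    tag_id: int      # Ethernet Tag ID carried in the route
    label: int       # label from the route's PMSI Tunnel Attribute


def sbd_imet_routes(routes: Iterable[ImetRoute], sbd_mac_vrf: str,
                    vlan_aware: bool, sbd_tag_id: int = 0) -> List[ImetRoute]:
    """Return the IMET routes that belong to the SBD."""
    selected = []
    for route in routes:
        if route.mac_vrf != sbd_mac_vrf:
            continue
        # VLAN-aware bundle: the MAC-VRF holds several BDs, so the SBD's
        # Tag ID must match as well. VLAN-based: the MAC-VRF is enough.
        if vlan_aware and route.tag_id != sbd_tag_id:
            continue
        selected.append(route)
    return selected


def ir_leaves(routes: Iterable[ImetRoute], sbd_mac_vrf: str, vlan_aware: bool,
              sbd_tag_id: int = 0, local_nve: Optional[str] = None) -> List[str]:
    """Remote NVEs that need a copy of the source NVE's BUM traffic."""
    found = {r.originator
             for r in sbd_imet_routes(routes, sbd_mac_vrf, vlan_aware, sbd_tag_id)}
    found.discard(local_nve)  # the source NVE does not replicate to itself
    return sorted(found)
```

   Keeping the SBD filter in its own function reflects that the same
   set of routes is consulted for every source BD on the NVE.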

   If a remote NVE (learnt via the IMET route for the SBD) also
   advertises an IMET route for the source subnet, the label in that
   route is used. Otherwise, the label in the IMET route for the SBD is
   used. Thus when a packet is transmitted to an NVE attached to the
   source subnet, it carries the label that that NVE assigned to the
   source subnet. When a packet is transmitted to an NVE that is not
   attached to the source subnet, it carries the label that that NVE
   assigned to the SBD.

   In case of RSVP-TE P2MP, the source NVE establishes a P2MP tunnel to
   all remote NVEs found through the SBD's IMET routes and advertises
   the tunnel in the IMET route for the source subnet. If tunnel
   aggregation is not used, a remote NVE attached to the source subnet
   binds the incoming tunnel branch to the source subnet, and a remote
   NVE that is not attached to the source subnet binds the incoming
   tunnel branch to the SBD.

   In case of PIM/mLDP, a remote NVE joins the tunnel advertised in the
   IMET route for a source subnet. If tunnel aggregation is not used, a
   remote NVE attached to the source subnet binds the incoming tunnel
   branch to the source subnet, and a remote NVE that is not attached to
   the source subnet binds the incoming tunnel branch to the SBD.

   In case of BIER, or if tunnel aggregation is used for mLDP/RSVP-TE
   P2MP, a remote NVE binds the upstream allocated label in the IMET
   route for a source subnet to that subnet if it is present on the NVE.
   Otherwise it binds the label to the SBD.

   With the forwarding state set up as above, the incoming traffic from
   a remote NVE is either associated with the source subnet or with the
   SBD. In the former case, traffic is forwarded at L2 to local
   receivers in the same source subnet, and split-horizon procedures for
   multi-homing work as is. In the latter case, the traffic appears to
   the receiving NVE as if it were sourced from the SBD.
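
   The label choice for IR and the binding rule shared by the other
   tunnel types can be condensed into two small functions. This is a
   sketch under stated assumptions: the dictionaries keyed by NVE name,
   the "SBD" string, and both function names are hypothetical, and
   tunnel signaling itself is not modeled.

```python
from typing import Dict, Set


def ir_labels(sbd_labels: Dict[str, int],
              subnet_labels: Dict[str, int]) -> Dict[str, int]:
    """Pick the label to use toward each remote NVE for one source subnet.

    sbd_labels:    NVE -> label from that NVE's IMET route for the SBD
    subnet_labels: NVE -> label from that NVE's IMET route for the source
                   subnet (present only for NVEs attached to that subnet)
    """
    # Prefer the source subnet's label; fall back to the SBD's label.
    return {nve: subnet_labels.get(nve, sbd_label)
            for nve, sbd_label in sbd_labels.items()}


def bind_incoming(source_subnet: str, local_subnets: Set[str]) -> str:
    """BD that a receiving NVE binds an incoming tunnel branch (or an
    upstream allocated label) for source_subnet to."""
    return source_subnet if source_subnet in local_subnets else "SBD"
```

   For example, with SBD labels {NVE2: 200, NVE3: 300} and a source
   subnet label advertised only by NVE2 (210), traffic toward NVE2
   carries 210 and traffic toward NVE3 carries 300.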

   The incoming traffic from a remote NVE is also associated with the
   IRB interface in either the source subnet or the SBD and routed down
   other IRB interfaces for local receivers in other subnets, according
   to the matching Layer 3 forwarding state described in the following
   section.

2.3.3. Layer 3 Forwarding State

   When an NVE's routing instance receives IGMP/MLD joins on IRB
   interfaces, corresponding (C-S,C-G) or (C-*,C-G) L3 forwarding
   entries are created/updated. The OIF list includes IRB interfaces
   that have corresponding (C-S,C-G) or (C-*,C-G) IGMP/MLD state built
   from relevant IGMP/MLD joins. An OIF is removed when the
   corresponding IGMP/MLD state is removed from the interface, and the
   (C-S,C-G) or (C-*,C-G) L3 forwarding state is removed when all of its
   OIFs are removed.

   For (C-S,C-G) L3 forwarding entries, the IIF is set to the source
   subnet's IRB interface if the source subnet is present on the NVE.
   If the source subnet is not present on the NVE, the IIF is set to the
   SBD's IRB interface.

   For (C-*,C-G) forwarding entries, the RPF interfaces include all IRB
   interfaces, as the traffic can arrive in the SBD or in any subnet to
   which the NVE is attached. Note that a particular packet arrives
   only once, and is associated with either the source subnet or the
   SBD.

2.4. Selective Multicast

   For intra-subnet selective multicast,
   [I-D.sajassi-bess-evpn-igmp-mld-proxy] specifies the procedures of
   SMET routes. If an NVE has local receivers for (C-*,C-G) traffic in
   subnet X, since the sources could be in any of the other subnets that
   are present on the NVE, it would need to advertise the (C-*,C-G) SMET
   routes in each of those source subnets to pull traffic.
   To avoid the duplication, the SBD is used even if every subnet is
   connected to every NVE of a tenant, and SMET routes are advertised as
   follows:

   o  If there are tenant routers (Section 2.5.4), SMET routes are
      originated per [I-D.sajassi-bess-evpn-igmp-mld-proxy] in the
      subnet where the state is originally learnt. This allows NVEs in
      the same subnet to convert SMET routes back to IGMP/MLD messages
      on ACs.

   o  Additionally, a corresponding SMET route is originated for the
      SBD, with the v1/v2/v3 flag bits cleared, with one exception
      described below.

   Note that (C-S,C-G) SMET routes, even though they would not need to
   be advertised in every source subnet as in the (C-*,C-G) case, are
   also advertised in the SBD. The reason is that a receiver for a
   (C-S,C-G) flow may be attached to an NVE that is not connected to the
   source subnet, so the SMET route needs to be advertised in the SBD
   anyway in that case. For consistency in all situations, all SMET
   routes are advertised in the SBD.

   The one exception is that a (C-S,C-G) SMET route with the IE
   (include/exclude) bit set may be suppressed in the SBD, according to
   the IGMP/MLD state merged from all subnets. For example, if a
   particular source is excluded in one subnet but not in another, the
   SMET route is not originated for the SBD. This can be viewed as the
   IGMP/MLD state in subnets being proxied into the SBD, just as the
   IGMP/MLD state on ACs is proxied to other ACs in the same subnet.

   The SMET routes in the SBD trigger IGMP/MLD state on the SBD's IRB
   interfaces. Note that for L3 multicast forwarding state, the SBD IRB
   interface is not added to the Outgoing Interface (OIF) list when the
   RPF interface is one or more IRB interfaces (i.e., traffic is sourced
   from a BD), even with the IGMP/MLD state on the SBD IRB interface.
The reason is that traffic from that BD is already L2 583 switched to all NVEs.

585 [I-D.sajassi-bess-evpn-igmp-mld-proxy] assumes selective forwarding 586 is always used with IR or BIER for all flows. The SMET route allows 587 other NVEs to identify which NVEs need to receive traffic for a 588 particular (C-S,C-G) or (C-*,C-G). With SBD, a source NVE builds the 589 corresponding forwarding state using the same procedure as in the 590 inclusive tunnel case, except that it checks the corresponding SMET 591 route in the SBD to determine if a remote NVE needs to receive the 592 traffic.

594 For other tunnel types, or if selective forwarding is only used for 595 some of the flows, S-PMSI A-D routes are needed as specified in 596 [I-D.ietf-bess-evpn-bum-procedure-updates]. A source NVE advertises 597 S-PMSI A-D routes to announce the tunnels used for certain flows, 598 and receiving NVEs either join the announced PIM/mLDP tunnel or 599 respond with Leaf A-D routes if the Leaf Information Requested flag 600 is set in the S-PMSI A-D route's PTA (so that the source NVE can 601 include them as tunnel leaves). As in the inclusive tunnel case, the 602 S-PMSI A-D routes additionally carry the RT for the SBD so that all 603 NVEs of the tenant will import them. A receiving NVE binds the 604 announced tunnel to the subnet that the route is for if that 605 subnet is present on the NVE, or to the SBD otherwise.

607 2.5. Advanced Topics

609 2.5.1. Legacy NVEs

611 It is possible that an NVE may not support the OISM procedures. For 612 example, it may not have IRB interfaces for some of its BDs, or its 613 software could not be upgraded to support OISM. To indicate OISM 614 support, an NVE that supports the procedures in this document 615 includes the Multicast Flags Extended Community in its IMET routes 616 and sets a new flag bit (OISM bit, to be assigned by IANA) in the EC.

618 Suppose a multicast source is attached to NVE 1 in subnet 1.
Subnet 619 1 is not present on NVE 2, which does not support OISM, and NVE 2 has 620 some receivers in its subnet 2. In this case, the receivers need to 621 receive traffic in subnet 2 from NVE 1. For that, the OISM NVEs run 622 PIM over the subnet for which not all NVEs support OISM, and the 623 elected PIM DR uses a separate provider tunnel to forward traffic 624 (that is routed down the DR's IRB interface for the subnet) only to 625 NVEs that do not support OISM.

627 If the PIM DR uses IR to forward BUM traffic in the subnet, the 628 special tunnel's leaves include the NVEs that do not set the OISM 629 bit in the above-mentioned EC.

631 If the PIM DR uses P2MP tunnels, the special tunnel is advertised in 632 an EVPN S-PMSI A-D route per 633 [I-D.ietf-bess-evpn-bum-procedure-updates]. The route carries an 634 EVPN Non-OISM Extended Community, indicating that a receiving NVE 635 attached to the BD identified in the route should join the advertised 636 tunnel only if it does not support OISM.

638 The routes could either be (C-*,C-*) wildcard S-PMSI A-D routes 639 if an inclusive tunnel is used (but only for all sites without IRBs), 640 or individual (C-S,C-G)/(C-*,C-G)/(C-S,C-*) S-PMSI A-D routes if 641 selective tunnels are used. They are advertised for each BD to 642 deliver multicast traffic routed down the IRB interface for the BD to 643 remote sites that do not have IRBs for the BD. If the same 644 (C-S,C-G)/(C-*,C-G)/(C-S,C-*)/(C-*,C-*) S-PMSI A-D routes are also 645 advertised without the EVPN Non-OISM EC (to deliver intra-subnet 646 traffic), then different RDs MUST be used for the two routes.

648 The EVPN Non-OISM Extended Community is a new EVPN extended 649 community. EVPN extended communities are transitive extended 650 communities with a Type field of 6.
The subtype of this new EVPN 651 extended community will be assigned by IANA, with the following 652 8-octet encoding:

654    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
655   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
656   | Type=0x06     | Sub-Type TBD  |          Reserved=0           |
657   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
658   |                          Reserved=0                           |
659   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

661 For multicast sources attached to a Non-OISM NVE, if the source 662 subnet is present on all NVEs, then traffic will be L2 switched to 663 all NVEs in the source subnet and then forwarded appropriately. For 664 simplicity, this document requires that all subnets on a Non-OISM NVE 665 are configured on all NVEs, even if there would be no ACs on some 666 NVEs for those subnets.

668 2.5.2. Traffic to/from outside of an EVPN domain

670 For traffic coming in/out of an EVPN domain, EVPN Gateways (GWs) are 671 used. They are NVEs that also participate in the SBD for each 672 tenant, and may be connected to some subnets. This document assumes 673 that the GWs run PIM on their external tenant interfaces, or act as 674 MVPN PEs for external connection (and the IRB interfaces are VRF 675 interfaces in the IPVPN). The subnets in the EVPN domain appear as 676 stub networks connected to the PIM/MVPN domain. This section 677 describes the procedures that are common for both PIM and MVPN as 678 external connection, while the next section focuses on procedures 679 specific to MVPN.

681 If there are multiple GWs for the same EVPN domain, then the GWs need 682 to run PIM on the IRB interfaces for the subnets and the SBD, so that 683 a DR can be elected for each subnet/SBD, and act as FHR/LHR on the 684 subnets/SBD.
In other words, traffic inside the EVPN domain follows 685 the procedures described in previous sections, while traffic to/from 686 outside the EVPN domain needs to additionally follow existing PIM/MVPN 687 procedures.

689 For traffic going out of the EVPN domain, the IRB interface of the 690 source subnet or SBD is the RPF interface on the GW, depending on 691 whether the source subnet is present on the GW. In case of PIM-SM, 692 one of the EVPN GWs is the PIM DR on a connected source subnet or on 693 the SBD and acts as the First Hop Router (e.g., handling PIM register 694 procedures for ASM). For that, the SBD IRB needs to be configured to 695 treat incoming packets as if the sources were on a local subnet (in 696 this case the SBD).

698 When selective forwarding is used in the EVPN domain, for the EVPN GW 699 to receive all traffic (before it learns possible external receivers) 700 for the purpose of FHR procedures, it MUST advertise a (C-*,C-*) SMET 701 route in the SBD, indicating to other NVEs that it needs to receive 702 all traffic. Later the EVPN GW may receive (C-S,C-G) prunes from the 703 external network. At that time, it MAY advertise a (C-S,C-G) SMET 704 route with the Exclude Group type bit and IGMPv3 bit in the Flags 705 field set, signaling to other NVEs that the particular (C-S,C-G) 706 traffic is not needed.

708 For traffic coming into an EVPN domain, the IRB interfaces for 709 connected subnets are included in the OIF list for the L3 multicast 710 forwarding route, if the subnets have corresponding local IGMP/MLD 711 state. The IRB interface of the SBD may also be added as an outgoing 712 interface so that remote NVEs can receive the traffic and route to 713 their connected subnets. Note that in this case, data traffic sent 714 down the SBD IRB interface is forwarded to remote NVEs (this is an 715 exception to the behavior in Section 2).
The SBD IRB interface is 716 added only if the GW has corresponding SMET routes (as described in 717 Section 2.4) received from other NVEs in the SBD. Corresponding PIM 718 join/prune messages or BGP-MVPN routes will be triggered/withdrawn as 719 a result.

721 For (C-*,C-G) L3 forwarding state, Section 2.3.3 states that all IRB 722 interfaces are included in the RPF interface list. Section 2.4 723 states that the SBD IRB interface is not added to the OIF list if the RPF 724 interfaces include one or more IRB interfaces. That is to prevent 725 routing internal traffic into the SBD at layer 3 (because the source 726 NVE already L2 switches the traffic to all NVEs). This means that 727 traffic coming into the EVPN domain cannot use the (C-*,C-G) 728 forwarding state (it would not be routed down the IRB interface for 729 the SBD to reach remote NVEs because that IRB is not in the OIF 730 list). For this to work, the interface or MVPN tunnel connecting 731 towards the C-RP is not added as an IIF of the (C-*,C-G) forwarding 732 state (even though a PIM join is sent out of that interface), so 733 initial traffic for an externally sourced flow will match the 734 (C-*,C-G) forwarding state and trigger IIF Mismatch notifications 735 (since the incoming interface does not match any of the IIFs), 736 causing the EVPN GW to install (C-S,C-G) state with the external 737 interface (or MVPN provider tunnel) being the RPF interface and the IRB 738 interface included in the OIF list.

740 2.5.2.1. A Variation of External Connection

742 If a tenant's external connection can be via a vlan (instead of 743 MVPN), and there are no sources like C-S1/2/5 as described in 744 Section 2.5.3, then the following variation can be used.

746 The external vlan connection becomes an AC in the SBD. The tenant 747 external router becomes the PIM FHR and LHR for the EVPN domain that 748 is treated as a stub network.
The previous EVPN GWs are no longer 749 gateways and are referred to as edge NVEs in this section. An AE 750 bundle can be used to connect to multiple edge NVEs - the bundle 751 terminates either on the external router or on a switch between the 752 edge NVEs and the external router, as depicted in the following 753 picture. From the edge NVEs' point of view, the external PIM router 754 is a TS on a multihomed ES.

756           vlan3                          vlan4
757     TS3--------NVE3                  NVE4---------TS4
758                          (EVPN)
759           vlan1                          vlan2
760     TS1------Edge NVE1            Edge NVE2------TS2
761                     \                 /
762                      \SBD AC   SBD AC/
763                       \             /
764                        \           /
765                         \AE bundle/
766                          \       /
767                     external PIM router

769 PIM is not running on any of the NVEs. IGMP/MLD state inside the 770 EVPN domain is proxied to the external vlan and triggers 771 corresponding multicast state on the external router. Externally 772 sourced traffic is routed to the vlan as a result, and is L2 switched 773 by the edge NVEs to other NVEs via the SBD. All receivers in the 774 EVPN domain receive the traffic that is routed to them by their 775 attached NVEs (the IIF is the SBD IRB and the OIFs are the IRBs for 776 the subnets that the receivers are on).

778 For traffic sourced from inside the EVPN domain to reach external 779 receivers, the edge NVEs still need to advertise a (C-*,C-*) SMET 780 route in the SBD to pull all traffic and L2 switch it to the external 781 router, which will register towards the RP. The external router may 782 prune back a particular flow by sending appropriate IGMP/MLD 783 messages, triggering corresponding SMET routes on the edge NVEs so 784 that the source NVEs will stop sending traffic towards the edge NVEs.

786 2.5.3. Integration with MVPN

788 When a tenant needs to connect its EVPN subnets to external networks 789 via L3VPN, instead of running both EVPN and L3VPN on each NVE, this 790 document recommends that L3VPN (hence MVPN) only extends to the EVPN 791 GWs, and only EVPN runs inside the EVPN domain.
EVPN GWs run both 792 EVPN and L3VPN/MVPN, as depicted in the following diagram.

794            C-s1                                   ---+
795             |                                        |
796            PE1             PE2                       |
797                                                      | WAN; L3VPN
798               (L3VPN/MVPN)     _C-s5                 |
799                               /                      |
800   C-s2 --- GW1             GW2 --- r1             ---+ GWs; EVPN+L3VPN
801            /|\             /|\                       |
802           / | \   (EVPN)  / | \                      |
803          /  |  \         /  |  \                     |
804        bd1 bd2   SBD       bd2 bd3                   | DC; EVPN only
805        /    |                                        |
806     C-s3   C-s4                                      |
807              (more NVEs omitted)                  ---+

809 GW1/2 run both EVPN and L3VPN. They may advertise routes learnt from 810 PE1/PE2 (e.g., C-s1), routes to locally attached non-EVPN 811 destinations (e.g., C-s2/s5), or just a default route into the EVPN 812 domain as EVPN type-5 routes. For destinations inside the EVPN 813 domain (including EVPN and non-EVPN, e.g., C-s2/3/4/5), the GWs may 814 advertise subnet prefix L3VPN routes towards outside the EVPN domain, 815 or optionally advertise host IPVPN routes when they're learnt via EVPN 816 type-2 routes. The L3VPN routes are all advertised with Source AS 817 and VRF Route Import ECs [RFC6514] for MVPN purposes.

819 Using the GW2 example, when it determines the RPF interface/neighbor or 820 MVPN UMH for various sources, it follows these rules:

822 o If the source (e.g., C-s5) is reachable on a local non-IRB 823 interface, use that interface as the RPF interface. Or,

825 o If the source (e.g., C-s4) is on a local BD, use the IRB for that 826 local subnet as the RPF interface. Or,

828 o If the route to the source (e.g., C-s2/s3) is learnt via EVPN 829 type-2/5 routes, use the SBD IRB as the RPF interface. Or,

831 o If the route to the source (e.g., C-s1/s2) has a VRF Import RT EC, 832 then use MVPN procedure for UMH selection and use the MVPN 833 provider tunnel as the RPF interface.

835 Notice that for C-s2, GW2 may either use the SBD IRB or the MVPN 836 provider tunnel as the RPF interface, depending on whether the IPVPN 837 route or EVPN type-5 route is selected as the active route.

839 Also notice that for C-s4, if GW1/2 only advertise the subnet prefix 840 into L3VPN, then PE1/2 may pick GW2 as the UMH.
It will still work 841 as GW2 will get the traffic in bd2 as well. However, it would be 842 more optimal if GW1 is picked as the UMH as C-s4 is directly attached 843 to GW1. To achieve this optimization, when GW2 receives the 844 C-multicast route for (C-s4,C-g) from PE1/2, it may optionally 845 advertise a C-multicast route to GW1 where C-s4 is directly attached. 846 This will trigger a (C-s4,C-g) Source Active route, which PE1/2 may 847 optionally use to influence their UMH selection such that GW1 is 848 chosen as their UMH for C-s4.

850 2.5.4. When Tenant Routers Are Present

852 It is possible that an EVPN broadcast domain is providing transit 853 service for a tenant's larger network and there are tenant routers 854 attached to the subnet, running routing protocols like PIM. In that 855 case, traffic routed by an upstream NVE to the subnet via the IRB 856 interface may be expected on a downstream tenant router. However, 857 since multicast data traffic sent down the IRB interfaces is 858 forwarded to local ACs only and not to other EVPN sites according to 859 rule 4 in Section 2, additional procedures are needed to handle this 860 situation with tenant routers. In particular, NVEs connecting to 861 tenant routers or traffic sources need to run PIM on the IRB 862 interface for the transit subnet and the SBD.

864 Consider the following situation:

866      S1                          S2
867        \ N1                      / N2
868         CE1a                  CE2b
869           \ vlan1             / vlan1
870            NVE1 ------------ NVE2 ---- CE2a -- receiver
871           / N3                              N4
872         S3

874 CE1a, CE2a/b are three CE routers on vlan1 that is implemented by 875 EVPN. The CEs and NVE1/2 run PIM and are PIM neighbors on 876 vlan1. CE2a has a receiver on network N4 for multicast traffic from 877 S1/2/3 on network N1/2/3 respectively.

879 CE2a sends PIM joins to CE1a/CE2b/NVE1 on vlan1 for the three sources 880 respectively and they all route traffic accordingly onto vlan1.
881 Traffic from S1/2 will reach CE2a because NVE1/2 receive the L2 882 traffic on their ACs and forward across the core following EVPN 883 procedures. Traffic from S3 is routed into vlan1 by NVE1 via the IRB 884 interface, and per rule 4 in Section 2 the traffic will not be sent 885 across the core. Thus, according to the procedures specified so far, 886 the traffic from S3 will never be received by NVE2 or CE2a.

888 To solve this problem, NVE2 needs to know that CE2a sent a PIM join 889 to another NVE in vlan1 and needs to pull traffic via the SBD, where 890 the traffic via IRB is not blocked on the core side. Because the PIM 891 protocol already requires a router to process join/prune messages 892 that it receives on an interface even if it is not the intended RPF 893 neighbor (for the purpose of join suppression and prune overriding), 894 NVE2 can realize that the upstream router in the join message is 895 another NVE rather than a CE router (this only requires the NVEs to keep 896 track of whether a neighbor is an NVE for the subnet). In that case, it 897 treats that join/prune as for itself. Correspondingly, its PIM 898 upstream state machine will choose one of the NVEs as the RPF 899 neighbor. Between this local NVE and the chosen RPF neighbor there 900 could be multiple subnets including the SBD, but the SBD IRB interface 901 is explicitly chosen as the RPF interface. The corresponding join/prune 902 is sent over the SBD IRB interface (optionally the join/prune 903 could be replaced with SMET routes) and the upstream NVE will route 904 traffic through the SBD. This NVE then routes traffic further 905 downstream to CE routers.

907 Similarly, if an NVE needs to send PIM join/prune messages due to its 908 local IGMP/MLD state changes, the RPF interface is always explicitly 909 set to the SBD IRB.
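
The handling of overheard joins described above can be sketched as follows. This is only an illustration under stated assumptions, not part of the specified procedures: the PimJoin record, the classify_join function, the interface name "irb-sbd", and the returned labels are all hypothetical, and the sketch assumes the NVE already tracks which PIM neighbors on each subnet IRB are NVEs.

```python
from dataclasses import dataclass

SBD_IRB = "irb-sbd"  # hypothetical name for the SBD's IRB interface


@dataclass(frozen=True)
class PimJoin:
    source: str
    group: str
    upstream_neighbor: str  # neighbor listed as upstream in the join
    received_on: str        # subnet IRB interface the join was heard on


def classify_join(join, local_id, nve_neighbors):
    """Decide how an NVE treats a PIM join heard on a subnet IRB.

    nve_neighbors maps an IRB interface name to the set of PIM
    neighbors on that subnet that are known to be NVEs.
    Returns (label, rpf_interface); rpf_interface is only set when
    the NVE adopts the join and pulls the traffic via the SBD.
    """
    if join.upstream_neighbor == local_id:
        # Addressed to this NVE: ordinary PIM processing applies.
        return ("own-join", None)
    if join.upstream_neighbor in nve_neighbors.get(join.received_on, set()):
        # Upstream is another NVE: treat the join as if it were for
        # this NVE and explicitly use the SBD IRB as RPF interface.
        return ("treat-as-own", SBD_IRB)
    # Upstream is a tenant (CE) router: the join is only used for
    # join suppression and prune overriding, as in regular PIM.
    return ("overheard", None)
```

In the example topology, NVE2 overhearing CE2a's join for S3 (upstream neighbor NVE1) would yield ("treat-as-own", "irb-sbd"), while CE2a's join for S1 (upstream neighbor CE1a) would yield ("overheard", None).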
911 Note that, if CE2a chooses NVE1 or NVE2 instead of CE1a as its RPF 912 neighbor for S1, then both CE1a and NVE2 will send traffic to vlan1 913 (NVE1 receives a join from NVE2 on the SBD and sends a join to CE1a on 914 vlan1. NVE1 receives traffic from CE1a on vlan1 and routes it to the SBD. 915 NVE2 receives traffic on the SBD and routes it to local receivers on vlan1). 916 The PIM assert procedure kicks in but only on NVE2, as CE1a does not 917 receive traffic from NVE2. To address this, an NVE must track all 918 the RPF neighbors and not add an IRB interface to the OIF list if it 919 received a corresponding PIM join on the IRB, in which a tenant 920 router is listed as the upstream neighbor. That tenant router will 921 deliver traffic to the subnet, and the traffic will be forwarded 922 through the core as it is not routed down the IRB but received on an 923 AC.

925 With PIM-ASM, if the DR on a source subnet is a tenant router, it 926 will handle the registering procedures for PIM-ASM. As a result, the 927 NVE at the same site as the tenant router/DR MUST NOT handle registering 928 procedures as described in Section 2.

930 3. IANA Considerations

932 This document requests the following IANA assignments:

934 o A "Non-OISM" Sub-Type in "EVPN Extended Community Sub-Types" 935 registry for the EVPN Non-OISM Extended Community.

937 o An "Optimized Inter-subnet Multicast" bit (OISM) in the Multicast 938 Flags extended community defined in 939 [I-D.sajassi-bess-evpn-igmp-mld-proxy].

941 4. Security Considerations

943 To be updated.

945 5. Acknowledgements

947 The authors thank Eric Rosen for his detailed review, valuable 948 comments/suggestions and some suggested text. The authors also 949 thank Vikram Nagarajan and Princy Elizabeth for their contribution 950 of the external connection variation (Section 2.5.2.1). The 951 authors also benefited tremendously from the discussions with Aldrin 952 Isaac on EVPN multicast optimizations.

954 6. References

956 6.1.
Normative References

958 [I-D.ietf-bess-evpn-bum-procedure-updates] 959 Zhang, Z., Lin, W., Rabadan, J., and K. Patel, "Updates on 960 EVPN BUM Procedures", draft-ietf-bess-evpn-bum-procedure- 961 updates-01 (work in progress), December 2016.

963 [I-D.sajassi-bess-evpn-igmp-mld-proxy] 964 Sajassi, A., Thoria, S., Patel, K., Yeung, D., Drake, J., 965 and W. Lin, "IGMP and MLD Proxy for EVPN", draft-sajassi- 966 bess-evpn-igmp-mld-proxy-01 (work in progress), October 967 2016.

969 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 970 Requirement Levels", BCP 14, RFC 2119, 971 DOI 10.17487/RFC2119, March 1997.

974 [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., 975 Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based 976 Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February 977 2015.

979 6.2. Informative References

981 [I-D.ietf-bess-evpn-inter-subnet-forwarding] 982 Sajassi, A., Salam, S., Thoria, S., Drake, J., Rabadan, 983 J., and L. Yong, "Integrated Routing and Bridging in 984 EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03 985 (work in progress), February 2017.

987 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 988 Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 989 2006.

991 [RFC5015] Handley, M., Kouvelas, I., Speakman, T., and L. Vicisano, 992 "Bidirectional Protocol Independent Multicast (BIDIR- 993 PIM)", RFC 5015, DOI 10.17487/RFC5015, October 2007.

996 [RFC7761] Fenner, B., Handley, M., Holbrook, H., Kouvelas, I., 997 Parekh, R., Zhang, Z., and L. Zheng, "Protocol Independent 998 Multicast - Sparse Mode (PIM-SM): Protocol Specification 999 (Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761, March 1000 2016.

1002 Appendix A. Integrated Routing and Bridging

1004 Consider a traditional router that only does routing and has no L2 1005 switching (also referred to as "bridging") capabilities.
It has two 1006 interfaces lan1 and lan2 connecting to LAN 1 and LAN 2 respectively. 1007 The two LANs are realized by two switches respectively, with hosts 1008 and the router attached:

1010          +-------+        +--------+        +-------+
1011          |       |    lan1|        |lan2    |       |
1012  H1 -----+switch1+--------+ router +--------+switch2+------H2
1013          |       |        |        |        |       |
1014          +-------+        +--------+        +-------+
1015     |____________________|        |_____________________|
1016              LAN1                           LAN2

1018 Interfaces lan1 and lan2 are two physical interfaces with IP 1019 configuration and functionality. With that they may also be referred 1020 to as IP interfaces (on top of layer 2 interfaces). H1 has a default 1021 gateway configured, which is the router's IP address on interface 1022 lan1. For H1 to send an IP packet destined to H2, it uses the 1023 router's mac address (learnt via ARP resolution for the gateway) for 1024 lan1 as the destination mac address. The router receives the packet 1025 from the switch and associates it with the IP interface lan1 because 1026 the destination mac address matches. An IP lookup is done and the 1027 packet is sent out of interface lan2, with H2's mac address (again 1028 learnt via ARP resolution) as the destination mac address and the 1029 router's mac address on lan2 as the source mac address. TTL is 1030 decremented and fragmentation may be done during this forwarding 1031 process. This process may be referred to as "routing a packet". For 1032 comparison, when switch1 sends the packet that it receives from H1 to 1033 the router, it is "bridging or L2 switching a packet". There is no 1034 TTL decrement or fragmentation for L2 switching, and there is no source/ 1035 destination mac address change.

1037 If H1 sends an IP multicast packet, the multicast destination mac 1038 address and IPv4/6 Ethertype cause the router to associate the packet 1039 with the IP interface lan1, and the router may route it out of other IP 1040 interfaces as appropriate, following multicast routing rules.
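
The difference between "routing a packet" and "bridging a packet" described earlier in this appendix can be sketched as follows. This is a minimal illustration only: the Frame record, the bridge and route functions, and the mac address labels are hypothetical, fragmentation is omitted, and dropping a packet whose TTL would reach zero is ordinary IP router behavior rather than something this document specifies.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    ttl: int       # IP TTL (or IPv6 hop limit) of the carried packet
    payload: str


def bridge(frame):
    """L2 switching: mac addresses and TTL are left untouched."""
    return frame


def route(frame, egress_router_mac, next_hop_mac):
    """L3 routing: rewrite both mac addresses and decrement the TTL.

    Returns None when the TTL would reach zero (packet dropped).
    """
    if frame.ttl <= 1:
        return None
    return replace(
        frame,
        src_mac=egress_router_mac,  # router's mac address on lan2
        dst_mac=next_hop_mac,       # H2's mac address, learnt via ARP
        ttl=frame.ttl - 1,
    )
```

For a frame sent by H1 with the router's lan1 mac address as destination and a TTL of 64, bridge() returns the frame unchanged, while route() returns a frame carrying the router's lan2 mac address as source, H2's mac address as destination, and a TTL of 63.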
1042 Now consider that the router itself supports both routing and 1043 bridging. The above picture then becomes the following:

1045          +------------------------------------------+
1046          |   Integrated Router and Bridge/Switch    |
1048          +-------+        +--------+        +-------+
1049          |       |    IRB1|   L3   |IRB2    |       |
1050  H1 -----+  BD1  +--------+routing +--------+  BD2  +------H2
1051          |       |        |instance|        |       |
1052          +-------+        +--------+        +-------+
1053     |____________________|        |______________________|
1054              LAN1                           LAN2

1056 The router now includes a routing instance (which could be the 1057 default/master instance, a Virtual Router routing instance, a VRF, or 1058 a VRF Lite, depending on which vendor's nomenclature one is familiar 1059 with) and two broadcast domains (BDs) that provide bridging 1060 functionalities. Instead of two physical interfaces connecting to two 1061 physical switches, there are two logical interfaces connecting the 1062 routing instance to the two BDs.

1064 Because the device now provides both routing and bridging 1065 functionalities, it becomes an Integrated Router and Bridge(/Switch), 1066 and the two logical interfaces are referred to as IRB interfaces, or 1067 sometimes simply IRBs. For each BD that needs routing functionality, 1068 there is one IRB interface connecting the BD to a particular routing 1069 instance.

1071 Other than that the logical IRB interfaces replace physical 1072 interfaces, the way the packets are forwarded does not change.

1074 Authors' Addresses

1076 Wen Lin 1077 Juniper Networks, Inc. 1079 EMail: wlin@juniper.net

1081 Zhaohui Zhang 1082 Juniper Networks, Inc. 1084 EMail: zzhang@juniper.net

1086 John Drake 1087 Juniper Networks, Inc. 1089 EMail: jdrake@juniper.net

1090 Jorge Rabadan 1091 Nokia 1093 EMail: jorge.rabadan@nokia.com

1095 Ali Sajassi 1096 Cisco Systems 1098 EMail: sajassi@cisco.com