BESS Workgroup                                           J. Rabadan, Ed.
Internet Draft                                              S. Sathappan
Intended status: Standards Track                                   Nokia

                                                                  W. Lin
                                                                 Juniper

                                                              M. Katiyar
                                                          Versa Networks

                                                              A. Sajassi
                                                                   Cisco

Expires: April 22, 2019                                 October 19, 2018


            Optimized Ingress Replication solution for EVPN
                  draft-ietf-bess-evpn-optimized-ir-06

Abstract

   Network Virtualization Overlay (NVO) networks using EVPN as control
   plane may use Ingress Replication (IR) or PIM (Protocol Independent
   Multicast) based trees to convey the overlay BUM traffic. PIM
   provides an efficient solution to avoid sending multiple copies of
   the same packet over the same physical link, however it may not
   always be deployed in the NVO core network. IR avoids the dependency
   on PIM in the NVO network core. While IR provides a simple multicast
   transport, some NVO networks with demanding multicast applications
   require a more efficient solution without PIM in the core. This
   document describes a solution to optimize the efficiency of IR in NVO
   networks.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on April 22, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology and Conventions
   3. Solution requirements
   4. EVPN BGP Attributes for optimized-IR
   5. Non-selective Assisted-Replication (AR) Solution Description
     5.1. Non-selective AR-REPLICATOR procedures
     5.2. Non-selective AR-LEAF procedures
     5.3. RNVE procedures
     5.4. Forwarding behavior in non-selective AR EVIs
       5.4.1. Broadcast and Multicast forwarding behavior
         5.4.1.1. Non-selective AR-REPLICATOR BM forwarding
         5.4.1.2. Non-selective AR-LEAF BM forwarding
         5.4.1.3. RNVE BM forwarding
       5.4.2. Unknown unicast forwarding behavior
         5.4.2.1. Non-selective AR-REPLICATOR/LEAF Unknown unicast
                  forwarding
         5.4.2.2. RNVE Unknown unicast forwarding
   6. Selective Assisted-Replication (AR) Solution Description
     6.1. Selective AR-REPLICATOR procedures
     6.2. Selective AR-LEAF procedures
     6.3. Forwarding behavior in selective AR EVIs
       6.3.1. Selective AR-REPLICATOR BM forwarding
       6.3.2. Selective AR-LEAF BM forwarding
   7. Pruned-Flood-Lists (PFL)
     7.1. A PFL example
   8. AR Procedures for single-IP AR-REPLICATORS
   9. AR Procedures and EVPN All-Active Multi-homing Split-Horizon
     9.1. Ethernet Segments on AR-LEAF nodes
     9.2. Ethernet Segments on AR-REPLICATOR nodes
   10. Benefits of the optimized-IR solution
   11. Security Considerations
   12. IANA Considerations
   13. References
     13.1 Normative References
     13.2 Informative References
   14. Contributors
   15. Acknowledgments
   16. Authors' Addresses

1. Introduction

   Ethernet Virtual Private Networks (EVPN) may be used as the control
   plane for a Network Virtualization Overlay (NVO) network. Network
   Virtualization Edge (NVE) devices and Provider Edges (PEs) that are
   part of the same EVPN Instance (EVI) use Ingress Replication (IR) or
   PIM-based trees to transport the tenant's BUM traffic. In NVO
   networks where PIM-based trees cannot be used, IR is the only option.
   Examples of these situations are NVO networks where the core nodes
   don't support PIM or the network operator does not want to run PIM in
   the core.

   In some use-cases, the amount of replication for BUM (Broadcast,
   Unknown unicast and Multicast) traffic is kept under control on the
   NVEs due to the following fairly common assumptions:

   a) Broadcast is greatly reduced due to the proxy ARP (Address
      Resolution Protocol) and proxy ND (Neighbor Discovery)
      capabilities supported by EVPN on the NVEs. Some NVEs can even
      provide Dynamic Host Configuration Protocol (DHCP) server
      functions for the attached Tenant Systems (TS), reducing the
      broadcast even further.

   b) Unknown unicast traffic is greatly reduced in virtualized NVO
      networks where all the MAC and IP addresses are learnt in the
      control plane.

   c) Multicast applications are not used.

   If the above assumptions are true for a given NVO network, then IR
   provides a simple solution for multi-destination traffic. However,
   assumption c) above is not always true, and multicast applications
   are required in many use-cases.

   When the multicast sources are attached to NVEs residing in
   hypervisors or low-performance-replication TORs (Top Of the Rack
   switches), the ingress replication of a large amount of multicast
   traffic to a significant number of remote NVEs/PEs can seriously
   degrade the performance of the NVE and impact the application.

   This document describes a solution that makes use of two IR
   optimizations:

   i) Assisted-Replication (AR)
   ii) Pruned-Flood-Lists (PFL)

   Both optimizations may be used together or independently so that the
   performance and efficiency of the network to transport multicast can
   be improved. Both solutions require some extensions to [RFC7432] that
   are described in section 4.

   Section 3 lists the requirements of the combined optimized-IR
   solution, whereas sections 5 and 6 describe the Assisted-Replication
   (AR) solution, and section 7 the Pruned-Flood-Lists (PFL) solution.

2. Terminology and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   The following terminology is used throughout the document:

   AC: Attachment Circuit

   Regular-IR: Refers to Regular Ingress Replication, where the source
               NVE/PE sends a copy to each remote NVE/PE that is part
               of the EVI.

   AR-IP: IP address owned by the AR-REPLICATOR and used to
          differentiate the ingress traffic that must follow the AR
          procedures.

   IR-IP: IP address used for Ingress Replication as in [RFC7432].

   AR-VNI: VNI advertised by the AR-REPLICATOR along with the
           Replicator-AR route. It is used to identify the ingress
           packets that must follow AR procedures ONLY in the Single-IP
           AR-REPLICATOR case.

   IR-VNI: VNI advertised along with the RT-3 for IR.

   AR forwarding mode: for an AR-LEAF, it means sending an AC BM packet
                       to a single AR-REPLICATOR with tunnel destination
                       IP AR-IP. For an AR-REPLICATOR, it means sending
                       a BM packet to a selected subset of, or all, the
                       overlay tunnels when the packet was previously
                       received from an overlay tunnel.

   IR forwarding mode: it refers to the Ingress Replication behavior
                       explained in [RFC7432]. It means sending an AC BM
                       packet copy to each remote PE/NVE in the EVI and
                       sending an overlay BM packet only to the ACs and
                       not to other overlay tunnels.

   PTA: PMSI Tunnel Attribute

   RT-3: EVPN Route Type 3, Inclusive Multicast Ethernet Tag route

   RT-11: EVPN Route Type 11, Leaf Auto-Discovery (AD) route

   VXLAN: Virtual Extensible LAN

   GRE: Generic Routing Encapsulation

   NVGRE: Network Virtualization using Generic Routing Encapsulation

   GENEVE: Generic Network Virtualization Encapsulation

   NVO: Network Virtualization Overlay

   NVE: Network Virtualization Edge

   VNI: VXLAN Network Identifier

   EVI: EVPN Instance. An EVPN instance spanning the Provider Edge (PE)
        devices participating in that EVPN

3. Solution requirements

   The IR optimization solution specified in this document (optimized-IR
   hereafter) meets the following requirements:

   a) The solution provides an IR optimization for BM (Broadcast and
      Multicast) traffic, while preserving the packet order for unicast
      applications, i.e., known and unknown unicast traffic should
      follow the same path.

   b) The solution is compatible with [RFC7432] and [RFC8365] and has no
      impact on the EVPN procedures for BM traffic. In particular, the
      solution supports the following EVPN functions:

      o All-active multi-homing, including the split-horizon and
        Designated Forwarder (DF) functions.

      o Single-active multi-homing, including the DF function.

      o Handling of multi-destination traffic and processing of
        broadcast and multicast as per [RFC7432].

   c) The solution is backwards compatible with existing NVEs using a
      non-optimized version of IR. A given EVI can have NVEs/PEs
      supporting regular-IR and optimized-IR.

   d) The solution is independent of the NVO specific data plane
      encapsulation and the virtual identifiers being used, e.g., VXLAN
      VNIs, NVGRE VSIDs or MPLS labels, as long as the tunnel is
      IP-based.

4. EVPN BGP Attributes for optimized-IR

   This solution extends the [RFC7432] Inclusive Multicast Ethernet Tag
   routes and attributes so that an NVE/PE can signal its optimized-IR
   capabilities.

   The general formats of the Inclusive Multicast Ethernet Tag route
   (RT-3) and its PMSI Tunnel Attribute (PTA), as used in [RFC7432], are
   shown below:

                  +---------------------------------+
                  |  RD (8 octets)                  |
                  +---------------------------------+
                  |  Ethernet Tag ID (4 octets)     |
                  +---------------------------------+
                  |  IP Address Length (1 octet)    |
                  +---------------------------------+
                  |  Originating Router's IP Addr   |
                  |  (4 or 16 octets)               |
                  +---------------------------------+

                  +---------------------------------+
                  |  Flags (1 octet)                |
                  +---------------------------------+
                  |  Tunnel Type (1 octet)          |
                  +---------------------------------+
                  |  MPLS Label (3 octets)          |
                  +---------------------------------+
                  |  Tunnel Identifier (variable)   |
                  +---------------------------------+

   The Flags field is defined as follows:

                           0 1 2 3 4 5  6 7
                          +-+-+-+-+-+--+-+-+
                          |rsvd | T |BM|U|L|
                          +-+-+-+-+-+--+-+-+

   Where a new type field (for AR) and two new flags (for PFL signaling)
   are defined:

   - T is the AR Type field (2 bits) that defines the AR role of the
     advertising router:

     + 00 (decimal 0) = RNVE (non-AR support)

     + 01 (decimal 1) = AR-REPLICATOR

     + 10 (decimal 2) = AR-LEAF

     + 11 (decimal 3) = RESERVED

   - The PFL (Pruned-Flood-Lists) flags define the desired behavior of
     the advertising router for the different types of traffic:

     + BM= Broadcast and Multicast (BM) flag. BM=1 means "prune-me" from
       the BM flooding list. BM=0 means regular behavior.

     + U= Unknown flag. U=1 means "prune-me" from the Unknown flooding
       list. U=0 means regular behavior.

   - Flag L is an existing flag defined in [RFC6514] (L=Leaf Information
     Required) and it will be used only in the Selective AR Solution.
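
   As an illustration only, the Flags octet layout shown above can be
   decoded as in the following Python sketch. It assumes the usual IETF
   bit numbering, where bit 0 in the diagram is the most significant bit
   of the octet (so L is the least significant bit, as in [RFC6514]).
   The names ARType, PTAFlags, decode_pta_flags and encode_pta_flags are
   hypothetical and are not defined by this document.

```python
from dataclasses import dataclass
from enum import IntEnum


class ARType(IntEnum):
    """AR Type (T) field values as listed in this section."""
    RNVE = 0           # 00: non-AR support
    AR_REPLICATOR = 1  # 01
    AR_LEAF = 2        # 10
    RESERVED = 3       # 11


@dataclass(frozen=True)
class PTAFlags:
    ar_type: ARType
    bm_prune: bool            # BM=1: "prune-me" from the BM flooding list
    u_prune: bool             # U=1: "prune-me" from the Unknown flooding list
    leaf_info_required: bool  # L flag from RFC 6514


def decode_pta_flags(octet: int) -> PTAFlags:
    """Decode one Flags octet, assuming bit 0 of the diagram is the MSB."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("Flags field is a single octet")
    return PTAFlags(
        ar_type=ARType((octet >> 3) & 0b11),  # diagram bits 3-4
        bm_prune=bool(octet & 0x04),          # diagram bit 5
        u_prune=bool(octet & 0x02),           # diagram bit 6
        leaf_info_required=bool(octet & 0x01),  # diagram bit 7
    )


def encode_pta_flags(flags: PTAFlags) -> int:
    """Inverse of decode_pta_flags; reserved bits are sent as zero."""
    return (
        (int(flags.ar_type) << 3)
        | (int(flags.bm_prune) << 2)
        | (int(flags.u_prune) << 1)
        | int(flags.leaf_info_required)
    )
```

   Under these assumptions, an octet of 0x09 would decode as T=01
   (AR-REPLICATOR) with L=1, which is the combination a selective
   Replicator-AR route would carry.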

   Please refer to section 12 for the IANA considerations related to the
   PTA flags.

   In this document, the above RT-3 and PTA can be used in two different
   modes for the same EVI/Ethernet Tag:

   o Regular-IR route: in this route, Originating Router's IP Address,
     Tunnel Type (0x06), MPLS Label, Tunnel Identifier and Flags MUST be
     used as described in [RFC7432]. The Originating Router's IP Address
     and Tunnel Identifier are set to an IP address that is referred to
     as IR-IP in this document.

   o Replicator-AR route: this route is used by the AR-REPLICATOR to
     advertise its AR capabilities, with the fields set as follows.

     + Originating Router's IP Address as well as the Tunnel Identifier
       are set to the same routable IP address, referred to as AR-IP,
       which SHOULD be different from the IR-IP for a given PE/NVE.

     + Tunnel Type = Assisted-Replication (AR). Section 12 provides the
       allocated type value.

     + T (AR role type) = 01 (AR-REPLICATOR).

     + L (Leaf Information Required) = 0 (for non-selective AR) or 1
       (for selective AR).

   In addition, this document also uses the Leaf-AD route (RT-11)
   defined in [EVPN-BUM] in case the selective AR mode is used. The
   Leaf-AD route MAY be used by the AR-LEAF in response to a
   Replicator-AR route (with the L flag set) to advertise its desire to
   receive the multicast traffic from a specific AR-REPLICATOR. It is
   only used for selective AR and its fields are set as follows:

     + Originating Router's IP Address is set to the advertising IR-IP
       (same IP used by the AR-LEAF in regular-IR routes).

     + Route Key is the "Route Type Specific" NLRI of the Replicator-AR
       route for which this Leaf-AD route is generated.

     + The AR-LEAF constructs an IP-address-specific route-target as
       indicated in [EVPN-BUM], by placing the IP address carried in the
       Next Hop field of the received Replicator-AR route in the Global
       Administrator field of the Community, with the Local
       Administrator field of this Community set to 0. Note that the
       same IP-address-specific import route-target is auto-configured
       by the AR-REPLICATOR that sent the Replicator-AR route, in order
       to control the acceptance of the Leaf-AD routes.

     + The Leaf-AD route MUST include the PMSI Tunnel attribute with the
       Tunnel Type set to AR, the AR type set to AR-LEAF and the Tunnel
       Identifier set to the IR-IP of the advertising AR-LEAF. The PMSI
       Tunnel attribute MUST carry a downstream-assigned MPLS label that
       is used by the AR-REPLICATOR to send traffic to the AR-LEAF.

   Each AR-enabled node MUST understand and process the AR type field in
   the PTA (Flags field) of the routes, and MUST signal the
   corresponding type (1 or 2) according to its administrative choice.

   Each node that is part of the EVI MAY understand and process the BM/U
   flags. Note that these BM/U flags may be used to optimize the
   delivery of multi-destination traffic and their use SHOULD be an
   administrative choice, independent of the AR role.

   Non-optimized-IR nodes will be unaware of the new PMSI attribute flag
   definition as well as the new Tunnel Type (AR), i.e., they will
   ignore the information contained in the flags field for any RT-3 and
   will ignore the RT-3 routes with an unknown Tunnel Type (type AR in
   this case).

5. Non-selective Assisted-Replication (AR) Solution Description

   The following figure illustrates an example NVO network where the
   non-selective AR function is enabled. Three different roles are
   defined for a given EVI: AR-REPLICATOR, AR-LEAF and RNVE (Regular
   NVE).
   The solution is called "non-selective" because the chosen
   AR-REPLICATOR for a given flow MUST replicate the multicast traffic
   to 'all' the NVE/PEs in the EVI except for the source NVE/PE.

                               (       )
                             (_   WAN   _)
                         +---(_         _)----+
                         |     (_     _)      |
                  PE1    |               PE2  |
                  +------+----+          +----+------+
             TS1--+  (EVI-1)  |          |  (EVI-1)  +--TS2
                  |REPLICATOR |          |REPLICATOR |
                  +--------+--+          +--+--------+
                           |                |
                        +--+----------------+--+
                        |                      |
                        |                      |
                   +----+ VXLAN/nvGRE/MPLSoGRE +----+
                   |    |      IP Fabric       |    |
                   |    |                      |    |
         NVE1      |    +-----------+----------+    |   NVE3
         Hypervisor|      TOR       | NVE2          |Hypervisor
         +---------+-+        +-----+-----+       +-+---------+
         |  (EVI-1)  |        |  (EVI-1)  |       |  (EVI-1)  |
         |   LEAF    |        |   RNVE    |       |   LEAF    |
         +--+-----+--+        +--+-----+--+       +--+-----+--+
            |     |              |     |             |     |
           VM11  VM12           TS3   TS4           VM31  VM32

                     Figure 1 Optimized-IR scenario

5.1. Non-selective AR-REPLICATOR procedures

   An AR-REPLICATOR is defined as an NVE/PE capable of replicating
   ingress BM (Broadcast and Multicast) traffic received on an overlay
   tunnel to other overlay tunnels and local Attachment Circuits (ACs).
   The AR-REPLICATOR signals its role in the control plane and
   understands where the other roles (AR-LEAF nodes, RNVEs and other
   AR-REPLICATORs) are located. A given AR-enabled EVI service may have
   zero, one or more AR-REPLICATORs. In our example in figure 1, PE1 and
   PE2 are defined as AR-REPLICATORs. The following considerations apply
   to the AR-REPLICATOR role:

   a) The AR-REPLICATOR role SHOULD be an administrative choice in any
      NVE/PE that is part of an AR-enabled EVI. This administrative
      option to enable AR-REPLICATOR capabilities MAY be implemented as
      a system level option as opposed to as a per-MAC-VRF option.

   b) An AR-REPLICATOR MUST advertise a Replicator-AR route and MAY
      advertise a Regular-IR route. The AR-REPLICATOR MUST NOT generate
      a Regular-IR route if it does not have local attachment circuits
      (AC).
      If the Regular-IR route is advertised, the AR Type field MAY be
      set to AR-REPLICATOR.

   c) The Replicator-AR and Regular-IR routes will be generated
      according to section 4. The AR-IP and IR-IP used by the
      Replicator-AR will be different routable IP addresses.

   d) When a node defined as AR-REPLICATOR receives a packet on an
      overlay tunnel, it will do a tunnel destination IP lookup and
      apply the following procedures:

      o If the destination IP is the AR-REPLICATOR IR-IP Address, the
        node will process the packet normally as in [RFC7432].

      o If the destination IP is the AR-REPLICATOR AR-IP Address, the
        node MUST replicate the packet to local ACs and overlay
        tunnels (excluding the overlay tunnel to the source of the
        packet). When replicating to remote AR-REPLICATORs, the tunnel
        destination IP will be an IR-IP. That will be an indication
        for the remote AR-REPLICATOR that it MUST NOT replicate to
        overlay tunnels. The tunnel source IP used by the
        AR-REPLICATOR MUST be its IR-IP.

5.2. Non-selective AR-LEAF procedures

   AR-LEAF is defined as an NVE/PE that - given its poor replication
   performance - sends all the BM traffic to an AR-REPLICATOR that can
   replicate the traffic further on its behalf. It MAY signal its
   AR-LEAF capability in the control plane and understands where the
   other roles are located (AR-REPLICATOR and RNVEs). A given service
   can have zero, one or more AR-LEAF nodes. Figure 1 shows NVE1 and
   NVE3 (both residing in hypervisors) acting as AR-LEAF. The following
   considerations apply to the AR-LEAF role:

   a) The AR-LEAF role SHOULD be an administrative choice in any NVE/PE
      that is part of an AR-enabled EVI. This administrative option to
      enable AR-LEAF capabilities MAY be implemented as a system level
      option as opposed to as a per-MAC-VRF option.

   b) In this non-selective AR solution, the AR-LEAF MUST advertise a
      single Regular-IR inclusive multicast route as in [RFC7432]. The
      AR-LEAF SHOULD set the AR Type field to AR-LEAF. Note that
      although this flag does not make any difference for the egress
      nodes when creating an EVPN destination to the AR-LEAF, the use
      of this flag is RECOMMENDED for ease of operation and
      troubleshooting of the EVI.

   c) In a service where there are no AR-REPLICATORs, the AR-LEAF MUST
      use regular ingress replication. This will happen when a new
      update from the last former AR-REPLICATOR is received and contains
      a non-REPLICATOR AR type, or when the AR-LEAF detects that the
      last AR-REPLICATOR is down (next-hop tracking in the IGP or any
      other detection mechanism). Ingress replication MUST use the
      forwarding information given by the remote Regular-IR Inclusive
      Multicast Routes as described in [RFC7432].

   d) In a service where there are one or more AR-REPLICATORs (based on
      the received Replicator-AR routes for the EVI), the AR-LEAF can
      locally select which AR-REPLICATOR it sends the BM traffic to:

      o A single AR-REPLICATOR MAY be selected for all the BM packets
        received on the AR-LEAF attachment circuits (ACs) for a given
        EVI. This selection is a local decision and it does not have
        to match the selection made by other AR-LEAF nodes within the
        same EVI.

      o An AR-LEAF MAY select more than one AR-REPLICATOR and do
        either per-flow or per-EVI load balancing.

      o In case of a failure on the selected AR-REPLICATOR, another
        AR-REPLICATOR will be selected.

      o When an AR-REPLICATOR is selected, the AR-LEAF MUST send all
        the BM packets to that AR-REPLICATOR using the forwarding
        information given by the Replicator-AR route for the chosen
        AR-REPLICATOR, with tunnel type = 0x0A (AR tunnel).
        The underlay destination IP address MUST be the AR-IP advertised
        by the AR-REPLICATOR in the Replicator-AR route.

      o AR-LEAF nodes SHALL send service-level BM control plane
        packets following regular IR procedures. An example would be
        IGMP, MLD or PIM multicast packets. The AR-REPLICATORs MUST
        NOT replicate these control plane packets to other overlay
        tunnels since they will use the regular IR-IP Address.

   e) The use of an AR-REPLICATOR-activation-timer (in seconds) on the
      AR-LEAF nodes is RECOMMENDED. Upon receiving a new Replicator-AR
      route where the AR-REPLICATOR is selected, the AR-LEAF will run a
      timer before programming the new AR-REPLICATOR. This will give the
      AR-REPLICATOR some time to program the AR-LEAF nodes before the
      AR-LEAF sends BM traffic.

5.3. RNVE procedures

   RNVE (Regular Network Virtualization Edge node) is defined as an
   NVE/PE without AR-REPLICATOR or AR-LEAF capabilities that does IR as
   described in [RFC7432]. The RNVE does not signal any AR role and is
   unaware of the AR-REPLICATOR/LEAF roles in the EVI. The RNVE will
   ignore the Flags in the Regular-IR routes and will ignore the
   Replicator-AR routes (due to an unknown tunnel type in the PTA) and
   the Leaf-AD routes (due to the IP-address-specific route-target).

   This role provides EVPN with the backwards compatibility required in
   optimized-IR EVIs. Figure 1 shows NVE2 as RNVE.

5.4. Forwarding behavior in non-selective AR EVIs

   In AR EVIs, BM (Broadcast and Multicast) traffic between two NVEs may
   follow a different path than unicast traffic. This solution
   recommends the replication of BM through the AR-REPLICATOR node,
   whereas unknown/known unicast will be delivered directly from the
   source node to the destination node without being replicated by any
   intermediate node.
   Unknown unicast SHALL follow the same path as known unicast traffic
   in order to avoid packet reordering for unicast applications and
   simplify the control and data plane procedures. Section 5.4.1
   describes the expected forwarding behavior for BM traffic in nodes
   acting as AR-REPLICATOR, AR-LEAF and RNVE. Section 5.4.2 describes
   the forwarding behavior for unknown unicast traffic.

   Note that known unicast forwarding is not impacted by this solution.

5.4.1. Broadcast and Multicast forwarding behavior

   The expected behavior per role is described in this section.

5.4.1.1. Non-selective AR-REPLICATOR BM forwarding

   The AR-REPLICATORs will build a flooding list composed of ACs and
   overlay tunnels to remote nodes in the EVI. Some of those overlay
   tunnels MAY be flagged as non-BM receivers based on the BM flag
   received from the remote nodes in the EVI.

   o When an AR-REPLICATOR receives a BM packet on an AC, it will
     forward the BM packet to its flooding list (including local ACs and
     remote NVE/PEs), skipping the non-BM overlay tunnels.

   o When an AR-REPLICATOR receives a BM packet on an overlay tunnel, it
     will check the destination IP of the underlay IP header and:

     - If the destination IP matches its AR-IP, the AR-REPLICATOR will
       forward the BM packet to its flooding list (ACs and overlay
       tunnels) excluding the non-BM overlay tunnels. The AR-REPLICATOR
       will do source squelching to ensure the traffic is not sent back
       to the originating AR-LEAF.

     - If the destination IP matches its IR-IP, the AR-REPLICATOR will
       skip all the overlay tunnels from the flooding list, i.e., it
       will only replicate to local ACs. This is the regular IR
       behavior described in [RFC7432].

5.4.1.2. Non-selective AR-LEAF BM forwarding

   The AR-LEAF nodes will build two flood-lists:

   1) Flood-list #1 - composed of ACs and an AR-REPLICATOR-set of
      overlay tunnels.
      The AR-REPLICATOR-set is defined as one or more overlay tunnels
      to the AR-IP Addresses of the remote AR-REPLICATOR(s) in the EVI.
      The selection of more than one AR-REPLICATOR is described in
      section 5.2 and it is a local AR-LEAF decision.

   2) Flood-list #2 - composed of ACs and overlay tunnels to the
      remote IR-IP Addresses.

   When an AR-LEAF receives a BM packet on an AC, it will check the
   AR-REPLICATOR-set:

   o If the AR-REPLICATOR-set is empty, the AR-LEAF will send the packet
     to flood-list #2.

   o If the AR-REPLICATOR-set is NOT empty, the AR-LEAF will send the
     packet to flood-list #1, where only one of the overlay tunnels of
     the AR-REPLICATOR-set is used.

   When an AR-LEAF receives a BM packet on an overlay tunnel, it will
   forward the BM packet to its local ACs and never to an overlay
   tunnel. This is the regular IR behavior described in [RFC7432].

5.4.1.3. RNVE BM forwarding

   The RNVE is completely unaware of the AR-REPLICATORs, AR-LEAF nodes
   and BM/U flags (that information is ignored). Its forwarding behavior
   is the regular IR behavior described in [RFC7432]. Any regular non-AR
   node is fully compatible with the RNVE role described in this
   document.

5.4.2. Unknown unicast forwarding behavior

   The expected behavior is described in this section.

5.4.2.1. Non-selective AR-REPLICATOR/LEAF Unknown unicast forwarding

   While the forwarding behavior in AR-REPLICATORs and AR-LEAF nodes is
   different for BM traffic, as far as Unknown unicast traffic
   forwarding is concerned, AR-LEAF nodes behave exactly in the same way
   as AR-REPLICATORs do.

   The AR-REPLICATOR/LEAF nodes will build a flood-list composed of ACs
   and overlay tunnels to the IR-IP Addresses of the remote nodes in the
   EVI. Some of those overlay tunnels MAY be flagged as non-U (Unknown
   unicast) receivers based on the U flag received from the remote nodes
   in the EVI.
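
   A minimal Python sketch of how an AR-REPLICATOR or AR-LEAF might
   derive the unknown unicast replication targets from this flood-list
   is given below. It is illustrative only: RemoteNode and
   unknown_unicast_targets are hypothetical names, the addresses are
   documentation examples, and skipping the ingress AC is an assumption
   based on usual bridging behavior rather than something stated in this
   section.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RemoteNode:
    ir_ip: str            # IR-IP learned from the remote Regular-IR route
    u_flag: bool = False  # U=1 in the PTA Flags: "prune-me" for unknown unicast


def unknown_unicast_targets(
    acs: List[str],
    remotes: List[RemoteNode],
    ingress_ac: Optional[str] = None,
    from_overlay: bool = False,
) -> List[Tuple[str, str]]:
    """Return (kind, identifier) pairs to which the packet is replicated."""
    # Local ACs are always candidates; the ingress AC is skipped (assumption).
    targets = [("ac", ac) for ac in acs if ac != ingress_ac]
    if not from_overlay:
        # Packet came from an AC: add overlay tunnels to the remote IR-IPs,
        # skipping the tunnels flagged as non-U receivers.
        targets += [("tunnel", r.ir_ip) for r in remotes if not r.u_flag]
    # Packet came from an overlay tunnel: local ACs only (regular IR behavior).
    return targets
```

   For instance, with remote nodes 192.0.2.1 (U=0) and 192.0.2.2 (U=1),
   a packet received on an AC would be sent to the other local ACs and
   to the tunnel towards 192.0.2.1 only.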

   o When an AR-REPLICATOR/LEAF receives an unknown packet on an AC, it
     will forward the unknown packet to its flood-list, skipping the
     non-U overlay tunnels.

   o When an AR-REPLICATOR/LEAF receives an unknown packet on an overlay
     tunnel, it will forward the unknown packet to its local ACs and
     never to an overlay tunnel. This is the regular IR behavior
     described in [RFC7432].

5.4.2.2. RNVE Unknown unicast forwarding

   As described for BM traffic, the RNVE is completely unaware of the
   REPLICATORs, LEAF nodes and BM/U flags (that information is ignored).
   Its forwarding behavior is the regular IR behavior described in
   [RFC7432], also for Unknown unicast traffic. Any regular non-AR node
   is fully compatible with the RNVE role described in this document.

6. Selective Assisted-Replication (AR) Solution Description

   Figure 1 is also used to describe the selective AR solution; however,
   in this section we consider NVE2 as one more AR-LEAF for EVI-1. The
   solution is called "selective" because a given AR-REPLICATOR MUST
   replicate the BM traffic to only the AR-LEAF that requested the
   replication (as opposed to all the AR-LEAF nodes) and MAY replicate
   the BM traffic to the RNVEs. The same AR roles defined in section 5
   are used here; however, the procedures are slightly different.

   The following sub-sections describe the differences in the procedures
   of AR-REPLICATOR/LEAFs compared to the non-selective AR solution.
   There is no change on the RNVEs.

6.1. Selective AR-REPLICATOR procedures

   In our example in figure 1, PE1 and PE2 are defined as Selective
   AR-REPLICATORs. The following considerations apply to the Selective
   AR-REPLICATOR role:

   a) The Selective AR-REPLICATOR capability SHOULD be an administrative
      choice in any NVE/PE that is part of an AR-enabled EVI, as the AR
      role itself.
      This administrative option MAY be implemented as a system level
      option as opposed to as a per-MAC-VRF option.

   b) Each AR-REPLICATOR will build a list of AR-REPLICATOR, AR-LEAF and
      RNVE nodes (AR-LEAF nodes that sent only a regular-IR route are
      accounted as RNVEs by the AR-REPLICATOR). In spite of the
      'Selective' administrative option, an AR-REPLICATOR MUST NOT
      behave as a Selective AR-REPLICATOR if at least one of the
      AR-REPLICATORs has the L flag NOT set. If at least one
      AR-REPLICATOR sends a Replicator-AR route with L=0 (in the EVI
      context), the rest of the AR-REPLICATORs will fall back to
      non-selective AR mode.

   c) The Selective AR-REPLICATOR MUST follow the procedures described
      in section 5.1, except for the following differences:

      o The Replicator-AR route MUST include L=1 (Leaf Information
        Required). This flag is used by the AR-REPLICATORs to advertise
        their 'selective' AR-REPLICATOR capabilities. In addition, the
        AR-REPLICATOR auto-configures its IP-address-specific import
        route-target as described in section 4.

      o The AR-REPLICATOR will build a 'selective' AR-LEAF-set with
        the list of nodes that requested replication to its own AR-IP.
        For instance, assuming NVE1 and NVE2 advertise a Leaf-AD route
        with PE1's IP-address-specific route-target and NVE3
        advertises a Leaf-AD route with PE2's IP-address-specific
        route-target, PE1 MUST only add NVE1/NVE2 to its selective
        AR-LEAF-set for EVI-1, and exclude NVE3.

      o When a node defined and operating as Selective AR-REPLICATOR
        receives a packet on an overlay tunnel, it will do a tunnel
        destination IP lookup and if the destination IP is the
        AR-REPLICATOR AR-IP Address, the node MUST replicate the packet
        to:

        + local ACs
        + overlay tunnels in the Selective AR-LEAF-set (excluding the
          overlay tunnel to the source AR-LEAF).
        + overlay tunnels to the RNVEs if the tunnel source IP is the
          IR-IP of an AR-LEAF (in any other case, the AR-REPLICATOR
          MUST NOT replicate the BM traffic to remote RNVEs). In other
          words, the first-hop selective AR-REPLICATOR will replicate
          to all the RNVEs.
        + overlay tunnels to the remote Selective AR-REPLICATORs if
          the tunnel source IP is an IR-IP of its own AR-LEAF-set (in
          any other case, the AR-REPLICATOR MUST NOT replicate the BM
          traffic to remote AR-REPLICATORs), where the tunnel
          destination IP is the AR-IP of the remote Selective
          AR-REPLICATOR. The AR-IP in the tunnel destination IP will be
          an indication for the remote Selective AR-REPLICATOR that the
          packet needs further replication to its AR-LEAFs.

6.2. Selective AR-LEAF procedures

   A Selective AR-LEAF chooses a single Selective AR-REPLICATOR per EVI
   and:

   o Sends all the EVI BM traffic to that AR-REPLICATOR and
   o Expects to receive the BM traffic for a given EVI from the same
     AR-REPLICATOR.

   In the example of Figure 1, we consider NVE1/NVE2/NVE3 as Selective
   AR-LEAFs. NVE1 selects PE1 as its Selective AR-REPLICATOR. If that is
   so, NVE1 will send all its BM traffic for EVI-1 to PE1. If other
   AR-LEAF/REPLICATORs send BM traffic, NVE1 will receive that traffic
   from PE1. These are the differences in the behavior of a Selective
   AR-LEAF compared to a non-selective AR-LEAF:

   a) The AR-LEAF role selective capability SHOULD be an administrative
      choice in any NVE/PE that is part of an AR-enabled EVI. This
      administrative option to enable AR-LEAF capabilities MAY be
      implemented as a system level option as opposed to a per-MAC-VRF
      option.

   b) The AR-LEAF MAY advertise a Regular-IR route if there are RNVEs in
      the EVI. The Selective AR-LEAF MUST advertise a Leaf-AD route
      after receiving a Replicator-AR route with L=1.
      It is recommended that the Selective AR-LEAF wait for a timer t
      before sending the Leaf-AD route, so that the AR-LEAF receives
      all the Replicator-AR routes for the EVI.

   c) In a service where there is more than one Selective AR-REPLICATOR,
      the Selective AR-LEAF MUST locally select a single Selective
      AR-REPLICATOR for the EVI. Once selected:

      o The Selective AR-LEAF will send a Leaf-AD route including the
        Route-key and IP-address-specific route-target of the selected
        AR-REPLICATOR.

      o The Selective AR-LEAF will send all the BM packets received on
        the attachment circuits (ACs) for a given EVI to that
        AR-REPLICATOR.

      o In case of a failure on the selected AR-REPLICATOR, another
        AR-REPLICATOR will be selected and a new Leaf-AD update will
        be issued for the new AR-REPLICATOR. This new route will
        update the selective list in the new Selective AR-REPLICATOR.
        In case of failure on the active Selective AR-REPLICATOR, it
        is recommended that the Selective AR-LEAF revert to IR
        behavior for a timer t to speed up the convergence. When the
        timer expires, the Selective AR-LEAF will resume its AR mode
        with the new Selective AR-REPLICATOR.

   All the AR-LEAFs in an EVI are expected to be configured as either
   selective or non-selective. A mix of selective and non-selective
   AR-LEAFs SHOULD NOT coexist in the same EVI. In case there is a
   non-selective AR-LEAF, its BM traffic sent to a selective
   AR-REPLICATOR will not be replicated to other AR-LEAFs that are not
   in its Selective AR-LEAF-set.

6.3. Forwarding behavior in selective AR EVIs

   This section describes the differences of the selective AR forwarding
   mode compared to the non-selective mode. Compared to section 5.4,
   there are no changes for the forwarding behavior in RNVEs or for
   unknown unicast traffic.

6.3.1. Selective AR-REPLICATOR BM forwarding

   The Selective AR-REPLICATORs will build two flood-lists:

   1) Flood-list #1 - composed of ACs and overlay tunnels to the
      remote nodes in the EVI, always using the IR-IPs in the tunnel
      destination IP addresses. Some of those overlay tunnels MAY be
      flagged as non-BM receivers based on the BM flag received from
      the remote nodes in the EVI.

   2) Flood-list #2 - composed of ACs, a Selective AR-LEAF-set and a
      Selective AR-REPLICATOR-set, where:

      o The Selective AR-LEAF-set is composed of the overlay tunnels
        to the AR-LEAFs that advertise a Leaf-AD route for the local
        AR-REPLICATOR. This set is updated with every Leaf-AD route
        received/withdrawn from a new AR-LEAF.

      o The Selective AR-REPLICATOR-set is composed of the overlay
        tunnels to all the AR-REPLICATORs that send a Replicator-AR
        route with L=1. The AR-IP addresses are used as tunnel
        destination IP.

   When a Selective AR-REPLICATOR receives a BM packet on an AC, it will
   forward the BM packet to its flood-list #1, skipping the non-BM
   overlay tunnels.

   When a Selective AR-REPLICATOR receives a BM packet on an overlay
   tunnel, it will check the destination and source IPs of the underlay
   IP header and:

   - If the destination IP matches its AR-IP and the source IP
     matches an IP of its own Selective AR-LEAF-set, the
     AR-REPLICATOR will forward the BM packet to its flood-list #2, as
     long as the list of AR-REPLICATORs for the EVI matches the
     Selective AR-REPLICATOR-set. If the Selective AR-REPLICATOR-set
     does not match the list of AR-REPLICATORs, the node reverts to
     non-selective mode and flood-list #1 is used.

   - If the destination IP matches its AR-IP and the source IP does
     not match any IP of its Selective AR-LEAF-set, the AR-REPLICATOR
     will forward the BM packet to flood-list #2 but skip the
     AR-REPLICATOR-set.
   - If the destination IP matches its IR-IP, the AR-REPLICATOR will
     use flood-list #1 but MUST skip all the overlay tunnels from the
     flooding list, i.e. it will only replicate to local ACs. This is
     the regular-IR behavior described in [RFC7432].

   In any case, non-BM overlay tunnels are excluded from flood-lists
   and, also, source squelching is always done in order to ensure the
   traffic is not sent back to the originating source. If the
   encapsulation is MPLSoGRE (or MPLSoUDP) and the EVI label is not the
   bottom of the stack, the AR-REPLICATOR MUST copy the rest of the
   labels when forwarding them to the egress overlay tunnels.

6.3.2. Selective AR-LEAF BM forwarding

   The Selective AR-LEAF nodes will build two flood-lists:

   1) Flood-list #1 - composed of ACs and the overlay tunnel to the
      selected AR-REPLICATOR (using the AR-IP as the tunnel
      destination IP).

   2) Flood-list #2 - composed of ACs and overlay tunnels to the
      remote IR-IP Addresses.

   When an AR-LEAF receives a BM packet on an AC, it will check if there
   is any selected AR-REPLICATOR. If there is, flood-list #1 will be
   used. Otherwise, flood-list #2 will be used.

   When an AR-LEAF receives a BM packet on an overlay tunnel, it will
   forward the BM packet to its local ACs and never to an overlay
   tunnel. This is the regular IR behavior described in [RFC7432].

7. Pruned-Flood-Lists (PFL)

   In addition to AR, the second optimization supported by this solution
   is the ability for all the EVI nodes to signal Pruned-Flood-Lists
   (PFL). As described in section 3, an EVPN node can signal a given
   value for the BM and U PFL flags in the IR Inclusive Multicast
   Routes, where:

   + BM = Broadcast and Multicast (BM) flag. BM=1 means "prune-me" from
     the BM flood-list. BM=0 means regular behavior.

   + U = Unknown flag. U=1 means "prune-me" from the Unknown flood-list.
     U=0 means regular behavior.
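
   As an illustration only, the effect of the two flags on a node that
   chooses to honor them can be sketched as follows. The RemoteNode
   type, the build_flood_lists function and the node names are
   hypothetical and are not defined by this document; the sketch only
   covers overlay tunnels, not local ACs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteNode:
    """Hypothetical record of a remote EVI node and its signaled PFL flags."""
    name: str
    bm: int  # BM flag: 1 = "prune-me" from the BM flood-list, 0 = regular
    u: int   # U flag: 1 = "prune-me" from the Unknown flood-list, 0 = regular


def build_flood_lists(nodes):
    """Return (bm_list, unknown_list) of node names, assuming the local
    node decides to honor every received PFL flag."""
    bm_list = [n.name for n in nodes if n.bm == 0]
    unknown_list = [n.name for n in nodes if n.u == 0]
    return bm_list, unknown_list


# Hypothetical remote nodes: node-c asks to be pruned from both lists,
# node-d only from the BM flood-list.
nodes = [
    RemoteNode("node-a", bm=0, u=0),
    RemoteNode("node-b", bm=0, u=0),
    RemoteNode("node-c", bm=1, u=1),
    RemoteNode("node-d", bm=1, u=0),
]
bm_list, unknown_list = build_flood_lists(nodes)
print("BM flood-list:", bm_list)
print("Unknown flood-list:", unknown_list)
```

   Keeping one list per traffic type mirrors the two independent flags:
   a node can ask to be pruned from BM flooding while still receiving
   unknown unicast, as node-d does in this sketch.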
   The ability to signal these PFL flags is an administrative choice.
   Upon receiving a non-zero PFL flag, a node MAY decide to honor the
   PFL flag and remove the sender from the corresponding flood-list. A
   given EVI node receiving BUM traffic on an overlay tunnel MUST
   replicate the traffic normally, regardless of the signaled PFL
   flags.

   This optimization MAY be used along with the AR solution.

7.1. A PFL example

   In order to illustrate the use of the solution described in this
   document, we will assume that EVI-1 in figure 1 is optimized-IR
   enabled and:

   o PE1 and PE2 are administratively configured as AR-REPLICATORs, due
     to their high-performance replication capabilities. PE1 and PE2
     will send a Replicator-AR route with BM/U flags = 00.

   o NVE1 and NVE3 are administratively configured as AR-LEAF nodes, due
     to their low-performance software-based replication capabilities.
     They will advertise a Regular-IR route with type AR-LEAF. Assuming
     both NVEs advertise all the attached VMs in EVPN as soon as they
     come up and don't have any VMs interested in multicast
     applications, they will be configured to signal BM/U flags = 11 for
     EVI-1.

   o NVE2 is optimized-IR unaware; therefore it takes on the RNVE role
     in EVI-1.

   Based on the above assumptions the following forwarding behavior will
   take place:

   (1) Any BM packets sent from VM11 will be sent to VM12 and PE1. PE1
       will further forward the BM packets to TS1, WAN link, PE2 and
       NVE2, but not to NVE3. PE2 and NVE2 will replicate the BM packets
       to their local ACs, but NVE3 is spared from unnecessarily
       replicating those BM packets to VM31 and VM32.

   (2) Any BM packets received on PE2 from the WAN will be sent to PE1
       and NVE2, but not to NVE1 and NVE3, sparing the two hypervisors
       from replicating unnecessarily to their local VMs. PE1 and NVE2
       will replicate to their local ACs only.
   (3) Any Unknown unicast packet sent from VM31 will be forwarded by
       NVE3 to NVE2, PE1 and PE2 but not NVE1. The solution avoids the
       unnecessary replication to NVE1, since the destination of the
       unknown traffic cannot be at NVE1.

   (4) Any Unknown unicast packet sent from TS1 will be forwarded by PE1
       to the WAN link, PE2 and NVE2 but not to NVE1 and NVE3, since the
       target of the unknown traffic cannot be at those NVEs.

8. AR Procedures for single-IP AR-REPLICATORS

   The procedures explained in sections 5 (Non-selective AR) and 6
   (Selective AR) assume that the AR-REPLICATOR can use two local
   routable IP addresses to terminate and originate NVO tunnels, i.e.
   IR-IP and AR-IP addresses. This is usually the case for PE-based
   AR-REPLICATOR nodes.

   In some cases, the AR-REPLICATOR node does not support more than one
   IP address to terminate and originate NVO tunnels, i.e. the IR-IP and
   AR-IP are the same IP address. This may be the case in some
   software-based or low-end AR-REPLICATOR nodes. If this is the case,
   the procedures in sections 5 and 6 must be modified in the following
   way:

   o The Replicator-AR routes generated by the AR-REPLICATOR use an
     AR-IP that will match its IR-IP. In order to differentiate the data
     plane packets that need to use IR from the packets that must use AR
     forwarding mode, the Replicator-AR route must advertise a different
     VNI/VSID than the one used by the Regular-IR route. For instance,
     the AR-REPLICATOR will advertise AR-VNI along with the
     Replicator-AR route and IR-VNI along with the Regular-IR route.
     Since both routes have the same key, different RDs are needed for
     both routes.

   o An AR-REPLICATOR will perform IR or AR forwarding mode for the
     incoming Overlay packets based on an ingress VNI lookup, as opposed
     to the tunnel IP DA lookup described in sections 5 and 6.
     Note that, when replicating to remote AR-REPLICATOR nodes, the use
     of the IR-VNI or AR-VNI advertised by the egress node will
     determine the IR or AR forwarding mode at the subsequent
     AR-REPLICATOR.

   The rest of the procedures will follow what is described in sections
   5 and 6.

9. AR Procedures and EVPN All-Active Multi-homing Split-Horizon

   This section extends the procedures for the cases where AR-LEAF nodes
   or AR-REPLICATOR nodes are attached to the same Ethernet Segment
   in the Broadcast Domain. The case where one (or more) AR-LEAF node(s)
   and one (or more) AR-REPLICATOR node(s) are attached to the same
   Ethernet Segment is out of scope.

9.1. Ethernet Segments on AR-LEAF nodes

   If VXLAN or NVGRE is used, and if the Split-horizon is based on the
   tunnel IP SA and "Local-Bias" as described in [RFC8365], the
   Split-horizon check will not work if there is an Ethernet-Segment
   shared between two AR-LEAF nodes, and the AR-REPLICATOR changes the
   tunnel IP SA of the packets with its own AR-IP.

   In order to be compatible with the IP SA split-horizon check, the
   AR-REPLICATOR MAY keep the original received tunnel IP SA when
   replicating packets to a remote AR-LEAF or RNVE. This will allow DF
   (Designated Forwarder) AR-LEAF nodes to apply Split-horizon check
   procedures for BM packets, before sending them to the local
   Ethernet-Segment. Even if the AR-LEAF's IP SA is preserved when
   replicating to AR-LEAFs or RNVEs, the AR-REPLICATOR MUST always use
   its IR-IP as IP SA when replicating to other AR-REPLICATORs.

   When EVPN is used for MPLS over GRE (or UDP), the ESI-label based
   split-horizon procedure as in [RFC7432] will not work for multi-homed
   Ethernet-Segments defined on AR-LEAF nodes. "Local-Bias" is
   recommended in this case, as in the case of VXLAN or NVGRE explained
   above.
   The "Local-Bias" and tunnel IP SA preservation mechanisms
   provide the required split-horizon behavior in non-selective or
   selective AR.

   Note that if the AR-REPLICATOR implementation keeps the received
   tunnel IP SA, the use of uRPF (unicast Reverse Path Forwarding)
   checks in the IP fabric based on the tunnel IP SA MUST be disabled.

9.2. Ethernet Segments on AR-REPLICATOR nodes

   Ethernet Segments associated to one or more AR-REPLICATOR nodes
   SHOULD follow "Local-Bias" procedures for EVPN all-active
   multi-homing, as follows:

   o For BUM traffic received on a local AR-REPLICATOR's AC,
     "Local-Bias" procedures as in [RFC8365] SHOULD be followed.

   o For BUM traffic received on an AR-REPLICATOR overlay tunnel with
     AR-IP as the IP DA, "Local-Bias" SHOULD also be followed. That is,
     traffic received with AR-IP as IP DA will be treated as though it
     had been received on a local AC that is part of the ES and will be
     forwarded to all local ESes, irrespective of their DF or NDF state.

   o BUM traffic received on an AR-REPLICATOR overlay tunnel with IR-IP
     as the IP DA will follow regular [RFC8365] "Local-Bias" rules and
     will not be forwarded to local ESes that are shared with the
     AR-LEAF or AR-REPLICATOR originating the traffic.

10. Benefits of the optimized-IR solution

   A solution for the optimization of Ingress Replication in EVPN is
   described in this document (optimized-IR). The solution brings the
   following benefits:

   o Optimizes the multicast forwarding in low-performance NVEs, by
     relaying the replication to high-performance NVEs (AR-REPLICATORs)
     while preserving the packet ordering for unicast applications.

   o Reduces the flooded traffic in NVO networks where some NVEs do not
     need broadcast/multicast and/or unknown unicast traffic.
   o It is fully compatible with existing EVPN implementations and EVPN
     functions for NVO overlay tunnels. Optimized-IR NVEs and regular
     NVEs can even be part of the same EVI.

   o It does not require any PIM-based tree in the NVO core of the
     network.

11. Security Considerations

   This section will be added in future versions.

12. IANA Considerations

   IANA has allocated the following Border Gateway Protocol (BGP)
   Parameters:

   1) Allocation in the P-Multicast Service Interface Tunnel (PMSI
      Tunnel) Tunnel Types registry:

      Value    Meaning                          Reference
      0x0A     Assisted-Replication Tunnel      [This document]

   2) Allocations in the P-Multicast Service Interface (PMSI) Tunnel
      Attribute Flags registry:

      Value    Name                             Reference
      3-4      Assisted-Replication Type (T)    [This document]
      5        Broadcast and Multicast (BM)     [This document]
      6        Unknown (U)                      [This document]

13. References

13.1 Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

   [EVPN-BUM] Zhang et al., "Updates on EVPN BUM Procedures",
              draft-ietf-bess-evpn-bum-procedure-updates-04.txt, work in
              progress, June 2018.

13.2 Informative References

   [RFC8365]  Sajassi et al., "A Network Virtualization Overlay Solution
              Using Ethernet VPN (EVPN)", RFC 8365, March 2018.
14. Contributors

   In addition to the names on the front page, the following co-authors
   also contributed to this document:

   Wim Henderickx
   Nokia

   Kiran Nagaraj
   Nokia

   Ravi Shekhar
   Juniper Networks

   Nischal Sheth
   Juniper Networks

   Aldrin Isaac
   Juniper

   Mudassir Tufail
   Citibank

15. Acknowledgments

   The authors would like to thank Neil Hart, David Motz, Dai Truong,
   Thomas Morin, Jeffrey Zhang and Shankar Murthy for their valuable
   feedback and contributions.

16. Authors' Addresses

   Jorge Rabadan (Editor)
   Nokia
   777 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: jorge.rabadan@nokia.com

   Senthil Sathappan
   Nokia
   Email: senthil.sathappan@nokia.com

   Mukul Katiyar
   Versa Networks
   Email: mukul@versa-networks.com

   Wen Lin
   Juniper Networks
   Email: wlin@juniper.net

   Ali Sajassi
   Cisco
   Email: sajassi@cisco.com