L2VPN Workgroup                                               J. Rabadan
Internet Draft                                              S. Sathappan
Intended status: Standards Track                           W. Henderickx
                                                          Alcatel-Lucent

                                             R. Shekhar
                                             N. Sheth         M. Katiyar
                                             W. Lin       Nuage Networks
                                             Juniper

Expires: January 5, 2015                                    July 4, 2014


             Optimized Ingress Replication solution for EVPN
                draft-rabadan-l2vpn-evpn-optimized-ir-00


Abstract

   Network Virtualization Overlay (NVO) networks using EVPN as control
   plane may use ingress replication (IR) or PIM-based trees to convey
   the overlay multicast traffic. PIM provides an efficient solution to
   avoid sending multiple copies of the same packet over the same
   physical link; however, it may not always be deployed in the NVO
   core network. IR avoids the dependency on PIM in the NVO network
   core. While IR provides a simple multicast transport, some NVO
   networks with demanding multicast applications require a more
   efficient solution without PIM in the core. This document describes
   a solution to optimize the efficiency of IR in NVO networks.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on January 5, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Problem Statement
   2. Solution requirements
   3. EVPN BGP Attributes for optimized-IR
   4. Assisted-Replication (AR) Solution Description
      4.1. AR roles and control plane
         4.1.1. AR-REPLICATOR procedures
         4.1.2. AR-LEAF procedures
         4.1.3. RNVE procedures
      4.2. Multi-destination traffic forwarding behavior in AR EVIs
         4.2.1. Broadcast and Multicast forwarding behavior
            4.2.1.1. REPLICATOR BM forwarding
            4.2.1.2. LEAF BM forwarding
            4.2.1.3. RNVE BM forwarding
         4.2.2. Unknown unicast forwarding behavior
            4.2.2.1. REPLICATOR/LEAF Unknown unicast forwarding
            4.2.2.2. RNVE Unknown unicast forwarding
   5. Pruned-Flood-Lists (PFL)
   6. An example use-case
   5. Benefits of the optimized-IR solution
   6. Conventions used in this document
   7. Security Considerations
   8. IANA Considerations
   8. References
   9. Acknowledgments
   10. Authors' Addresses

1. Problem Statement

   EVPN may be used as the control plane for a Network Virtualization
   Overlay (NVO) network. Network Virtualization Edge (NVE) devices and
   PEs that are part of the same EVI use Ingress Replication (IR) or
   PIM-based trees to transport the tenant's multicast traffic. In NVO
   networks where PIM-based trees cannot be used, IR is the only
   alternative. Examples of these situations are NVO networks where the
   core nodes do not support PIM or the network operator does not want
   to run PIM in the core.

   In some use-cases, the amount of replication for BUM (Broadcast,
   Unknown unicast and Multicast) traffic is kept under control on the
   NVEs due to the following fairly common assumptions:

   a) Broadcast is greatly reduced due to the proxy-ARP and proxy-ND
      capabilities supported by EVPN on the NVEs. Some NVEs can even
      provide DHCP-server functions for the attached Tenant Systems
      (TS), reducing the broadcast even further.

   b) Unknown unicast traffic is greatly reduced in virtualized NVO
      networks where all the MAC and IP addresses are learnt in the
      control plane.

   c) Multicast applications are not used.

   If the above assumptions are true for a given NVO network, then IR
   provides a simple solution for multi-destination traffic. However,
   assumption c) above is not always true and multicast applications
   are required in many use-cases.

   When the multicast sources are attached to NVEs residing in
   hypervisors or low-performance-replication TORs, the ingress
   replication of large amounts of multicast traffic to a significant
   number of remote NVEs/PEs can seriously degrade the performance of
   the NVE and impact the application.

   This document describes a solution that makes use of two IR
   optimizations:

   i)  Assisted-Replication (AR)
   ii) Pruned-Flood-Lists (PFL)

   Both optimizations may be used together or independently to improve
   the performance and efficiency with which the network transports
   multicast. Both solutions require some extensions to [EVPN] that
   are described in section 3.

   Section 2 lists the requirements of the combined optimized-IR
   solution, whereas section 4 describes the Assisted-Replication (AR)
   solution and section 5 the Pruned-Flood-Lists (PFL) solution.

2. Solution requirements

   The IR optimization solution (optimized-IR hereafter) MUST meet the
   following requirements:

   a) The solution MUST provide an IR optimization for BM (Broadcast and
      Multicast) traffic, while preserving the packet order for unicast
      applications, i.e. known and unknown unicast traffic SHALL follow
      the same path.

   b) The solution MUST be compatible with [EVPN] and [EVPN-OVERLAY] and
      not have any impact on the EVPN procedures for BM traffic. In
      particular, the solution MUST support the following EVPN
      functions:

      o All-active multi-homing, including the split-horizon and
        Designated Forwarder (DF) functions.

      o Single-active multi-homing, including the DF function.

      o Handling of multi-destination traffic and processing of
        broadcast and multicast as per [EVPN].

   c) The solution MUST be backwards compatible with existing NVEs using
      a non-optimized version of IR. A given EVI can have NVEs/PEs
      supporting regular-IR and optimized-IR.

   d) The solution MUST be independent of the NVO specific data plane
      encapsulation and the virtual identifiers being used, e.g. VXLAN
      VNIs, NVGRE VSIDs or MPLS labels.

3. EVPN BGP Attributes for optimized-IR

   This solution proposes some changes to the [EVPN] inclusive multicast
   routes and attributes so that an NVE/PE can signal its optimized-IR
   capabilities.

   The formats of the Inclusive Multicast Ethernet Tag route and of its
   PMSI Tunnel attribute, as used in EVPN, are shown below:

               +---------------------------------+
               |      RD (8 octets)              |
               +---------------------------------+
               |  Ethernet Tag ID (4 octets)     |
               +---------------------------------+
               |  IP Address Length (1 octet)    |
               +---------------------------------+
               |   Originating Router's IP Addr  |
               |          (4 or 16 octets)       |
               +---------------------------------+

               +---------------------------------+
               |  Flags (1 octet)                |
               +---------------------------------+
               |  Tunnel Type (1 octet)          |
               +---------------------------------+
               |  MPLS Label (3 octets)          |
               +---------------------------------+
               |  Tunnel Identifier (variable)   |
               +---------------------------------+

   Where:

   o Originating Router's IP Address, Tunnel Type (0x06), MPLS Label and
     Tunnel Identifier MUST be used as described in [EVPN] for
     non-optimized-IR behavior.

   o A different Originating Router's IP Address, a new Tunnel Type
     (TBD), MPLS Label and Tunnel Identifier may be used for
     Assisted-Replication (AR).

   o The Flags field is defined as follows:

                       0 1 2 3 4 5  6 7
                      +-+-+-+-+-+--+-+-+
                      |rsved| T |BM|U|L|
                      +-+-+-+-+-+--+-+-+

     Where a new type field (for AR) and two new flags (for PFL
     signaling) are defined:

     - T is the AR Type field (2 bits):

       + 00 (decimal 0) = RNVE (non-AR support)

       + 01 (decimal 1) = AR REPLICATOR

       + 10 (decimal 2) = AR LEAF

     - New PFL (Pruned-Flood-Lists) flags:

       + BM = Broadcast and Multicast (BM) flag. BM=1 means "prune-me"
         from the BM flooding list. BM=0 means regular behavior.

       + U = Unknown flag. U=1 means "prune-me" from the Unknown
         flooding list. U=0 means regular behavior.

     - Flag L is an existing flag defined in RFC6514 (L = Leaf
       Information Required) and it has no use in this solution.
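
   As an illustration of the Flags layout above, the following is a
   minimal Python sketch of how an implementation might pack and unpack
   the octet. It assumes the usual IETF convention that bit 0 in the
   diagram is the most significant bit, so that T occupies bits 3-4, BM
   bit 5, U bit 6 and L bit 7. The names ARType, PmsiFlags, encode_flags
   and decode_flags are hypothetical and are not defined by this
   document or by [EVPN].

```python
from enum import IntEnum
from typing import NamedTuple


class ARType(IntEnum):
    """AR Type (T) field values listed in section 3."""
    RNVE = 0        # 00: non-AR support
    REPLICATOR = 1  # 01: AR REPLICATOR
    LEAF = 2        # 10: AR LEAF


class PmsiFlags(NamedTuple):
    """Decoded view of the PMSI Tunnel attribute Flags octet (hypothetical)."""
    ar_type: ARType
    bm: bool         # BM=1: "prune-me" from the BM flooding list
    u: bool          # U=1: "prune-me" from the Unknown flooding list
    l: bool = False  # L flag from RFC6514, unused by this solution


def encode_flags(flags: PmsiFlags) -> int:
    """Pack the fields into one octet; the three reserved bits are sent as 0.

    Assumption: bit 0 of the diagram is the most significant bit.
    """
    return (
        (int(flags.ar_type) << 3)
        | (int(flags.bm) << 2)
        | (int(flags.u) << 1)
        | int(flags.l)
    )


def decode_flags(octet: int) -> PmsiFlags:
    """Unpack a Flags octet, ignoring the reserved bits.

    T = 3 is not defined in section 3, so ARType(3) raises ValueError here;
    how a real implementation treats that value is left open.
    """
    return PmsiFlags(
        ar_type=ARType((octet >> 3) & 0b11),
        bm=bool(octet & 0b100),
        u=bool(octet & 0b010),
        l=bool(octet & 0b001),
    )


# A LEAF that asks to be pruned from both flooding lists.
leaf_octet = encode_flags(PmsiFlags(ARType.LEAF, bm=True, u=True))
# A REPLICATOR with regular BM/U behavior.
replicator_octet = encode_flags(PmsiFlags(ARType.REPLICATOR, bm=False, u=False))
```

   Under these assumptions a node that does not implement the
   extensions simply never inspects the T, BM and U bits, which is
   consistent with the backwards compatibility requirement in
   section 2.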

   Each AR-enabled EVI node MUST understand and process the AR type
   field in the PMSI attribute (Flags field) and MUST signal the
   corresponding type (1 or 2) according to its administrative choice.

   Each EVI node MAY understand and process the BM/U flags. Note that
   these BM/U flags may be used to optimize the delivery of
   multi-destination traffic and their use SHOULD be an administrative
   choice, regardless of the AR settings.

   The T field and BM/U flags MAY be used individually or together, i.e.
   a given PMSI attribute may only convey the AR type information, or
   only the BM/U flags, or both pieces of information at the same time.

   Non-optimized-IR nodes will be unaware of the new PMSI attribute flag
   definition, i.e. they will ignore the information contained in the
   flags field.

4. Assisted-Replication (AR) Solution Description

   The following figure illustrates an example NVO network where the AR
   function is enabled. This scenario will be used to describe the
   solution throughout the rest of the document.

                         (       )
                        (_  WAN  _)
                   +---(_         _)----+
                   |     (_     _)      |
              PE1  |               PE2  |
            +------+----+          +----+------+
       TS1--+  (EVI-1)  |          |  (EVI-1)  +--TS2
            |REPLICATOR |          |REPLICATOR |
            +--------+--+          +--+--------+
                     |                |
                  +--+----------------+--+
                  |                      |
                  |                      |
             +----+ VXLAN/nvGRE/MPLSoGRE +----+
             |    |      IP Fabric       |    |
             |    |                      |    |
       NVE1  |    +-----------+----------+    |  NVE3
   Hypervisor|           TOR  | NVE2          |Hypervisor
   +---------+-+        +-----+-----+       +-+---------+
   |  (EVI-1)  |        |  (EVI-1)  |       |  (EVI-1)  |
   |   LEAF    |        |   RNVE    |       |   LEAF    |
   +--+-----+--+        +--+-----+--+       +--+-----+--+
      |     |              |     |             |     |
     VM11  VM12           TS3   TS4           VM31  VM32

                  Figure 1 Optimized-IR scenario

4.1. AR roles and control plane

   The solution defines three different roles in an AR EVI service:

   a) AR-REPLICATOR (REPLICATOR)
   b) AR-LEAF (LEAF)
   c) Regular NVE (RNVE)

4.1.1. AR-REPLICATOR procedures

   REPLICATOR is defined as an NVE/PE capable of replicating ingress BM
   (Broadcast and Multicast) traffic received on an overlay tunnel to
   other overlay tunnels and local Attachment Circuits (ACs). The
   REPLICATOR signals its REPLICATOR role in the control plane and
   understands where the other roles (LEAF nodes, RNVEs and other
   REPLICATORs) are located. A given AR EVI service may have zero, one
   or more REPLICATORs. In our example in figure 1, PE1 and PE2 are
   defined as REPLICATORs. The following considerations apply to the
   REPLICATOR role:

   a) The AR-REPLICATOR role SHOULD be an administrative choice in any
      NVE/PE that is part of an AR EVI. This administrative option to
      enable REPLICATOR capabilities MAY be implemented as a system
      level option as opposed to a per-EVI option.

   b) An AR-REPLICATOR MUST advertise an AR inclusive multicast route
      and MAY advertise an IR inclusive multicast route.

   c) An IR Inclusive Multicast Route is an Inclusive Multicast Route as
      defined in [EVPN]. It MUST NOT be generated by the AR REPLICATOR
      if the REPLICATOR does not have local attachment circuits (ACs).

   d) An AR Inclusive Multicast Route MUST be generated by the AR
      REPLICATOR and it is composed of:

      o AR Originating Router's IP Address, which is different from
        the IR IP address used in the IR Inclusive Multicast Route.

      o T = 1 (AR REPLICATOR)

      o Tunnel type = TBD (AR tunnel)

      o Tunnel Identifier MUST contain the same value as the AR
        Originating Router's IP Address.

      o The rest of the route fields are used as per [EVPN].

   e) When a node defined as REPLICATOR receives a packet from an
      overlay tunnel, it will do a tunnel destination IP lookup and
      apply the following procedures:

      o If the destination IP is the IR Originating Router's IP
        Address, the node will process the packet normally as in
        [EVPN].

      o If the destination IP is the AR Originating Router's IP
        Address, the node MUST replicate the packet to local ACs and
        overlay tunnels (excluding the overlay tunnel to the source of
        the packet). Selective replication to only interested AR-LEAF
        nodes will be added in a future revision of this document.

4.1.2. AR-LEAF procedures

   LEAF is defined as an NVE/PE that - given its poor replication
   performance - sends all the BM traffic to a REPLICATOR that can
   replicate the traffic further on its behalf. It signals its LEAF
   capability in the control plane and understands where the other roles
   are located (REPLICATOR and RNVEs). A given service can have zero,
   one or more LEAF nodes. Figure 1 shows NVE1 and NVE3 (both residing
   in hypervisors) acting as LEAF. The following considerations apply to
   the LEAF role:

   a) The AR-LEAF role SHOULD be an administrative choice in any NVE/PE
      that is part of an AR EVI. This administrative option to enable
      LEAF capabilities MAY be implemented as a system level option as
      opposed to a per-EVI option.

   b) An AR-LEAF MUST advertise a single inclusive multicast route where
      the AR type is set to T = 2 (AR LEAF) and the rest of the fields
      follow [EVPN].

   c) In a service where there are no REPLICATORs, the LEAF MUST use
      regular ingress replication. This will happen when a new update
      from the last remaining REPLICATOR is received and contains a
      non-REPLICATOR AR type, or when the LEAF detects that the last
      REPLICATOR is down (next-hop tracking in the IGP or any other
      detection mechanism). Ingress replication MUST use the forwarding
      information given by the IR Inclusive Multicast Routes as
      described in [EVPN].

   d) In a service where there are one or more REPLICATORs, the LEAF
      can locally select which REPLICATOR it sends the BM traffic to:

      o A single REPLICATOR may be selected for all the BM packets
        received on LEAF attachment circuits (ACs). This selection is
        a local decision and it does not have to match the selection
        made by other LEAF nodes within the same service.

      o A LEAF may select more than one REPLICATOR and do either
        per-flow or per-service load balancing.

      o In case of a failure on the selected REPLICATOR, another
        REPLICATOR will be selected.

      o When a REPLICATOR is selected, the LEAF MUST send all the BM
        packets to that REPLICATOR using the forwarding information
        given by the AR Inclusive Multicast Route previously sent by
        the REPLICATOR, with tunnel type = TBD (AR tunnel). The
        underlay destination IP address MUST be the AR Originating
        Router's IP Address signaled by the REPLICATOR for the AR
        tunnel type.

      o LEAF nodes SHALL send service-level BM control plane packets
        following regular IR procedures. An example would be IGMP, MLD
        or PIM multicast packets. The REPLICATORs MUST NOT replicate
        these control plane packets to other overlay tunnels, since
        the packets are sent to the regular (IR) Originating Router's
        IP Address.

4.1.3. RNVE procedures

   RNVE (Regular Network Virtualization Edge node) is defined as an
   NVE/PE without REPLICATOR or LEAF capabilities that does IR as
   described in [EVPN]. The RNVE does not signal any special role and is
   unaware of the REPLICATOR/LEAF roles in the EVI. The RNVE will ignore
   AR Inclusive Multicast Routes (due to an unknown tunnel type in the
   PMSI attribute).

   This role provides EVPN with the backwards compatibility required in
   optimized-IR EVIs. Figure 1 shows NVE2 as RNVE.

4.2. Multi-destination traffic forwarding behavior in AR EVIs

   In AR EVIs, BM (Broadcast and Multicast) traffic between two NVEs may
   follow a different path than unicast traffic. This solution proposes
   the replication of BM through the REPLICATOR node, whereas
   unknown/known unicast will be delivered directly from the source node
   to the destination node without being replicated by any intermediate
   node. Unknown unicast SHALL follow the same path as known unicast
   traffic in order to avoid packet reordering for unicast applications
   and simplify the control and data plane procedures. Section 4.2.1
   describes the expected forwarding behavior for BM traffic in nodes
   acting as REPLICATOR, LEAF and RNVE. Section 4.2.2 describes the
   forwarding behavior for unknown unicast traffic.

   Note that known unicast forwarding is not impacted by this solution.

4.2.1. Broadcast and Multicast forwarding behavior

   The expected behavior per role is described in this section.

4.2.1.1. REPLICATOR BM forwarding

   The REPLICATORs will build a flooding list composed of ACs and
   overlay tunnels to remote nodes in the EVI. Some of those overlay
   tunnels MAY be flagged as non-BM receivers based on the BM flag
   received from the remote nodes in the EVI. The REPLICATOR will also
   build a list of remote REPLICATORs, LEAF nodes and RNVEs for the EVI.

   o When a REPLICATOR receives a BM packet on an AC, it will forward
     the BM packet to its flooding list (including local ACs and remote
     NVE/PEs), skipping the non-BM overlay tunnels.

   o When a REPLICATOR receives a BM packet on an overlay tunnel, it
     will check the destination IP of the underlay IP header and:

     - If the destination IP matches its AR Originating Router IP, the
       REPLICATOR will forward the BM packet to its flooding list (ACs
       and overlay tunnels) excluding the non-BM overlay tunnels.
       The REPLICATOR will do source squelching to ensure the traffic
       is not sent back to the originating LEAF. If the overlay
       encapsulation is MPLS and the EVI label is not the bottom of the
       stack, the REPLICATOR MUST copy the rest of the labels and
       forward them to the egress overlay tunnels.

     - If the destination IP matches its IR Originating Router IP, the
       REPLICATOR will skip all the overlay tunnels from the flooding
       list, i.e. it will only replicate to local ACs. This is the
       regular IR behavior described in [EVPN].

4.2.1.2. LEAF BM forwarding

   The LEAF nodes will build two flood-lists:

   1) Flood-list #1 - composed of ACs and a REPLICATOR-set of overlay
      tunnels. The REPLICATOR-set is defined as one or more overlay
      tunnels to the AR Originating Router's IP Addresses of the
      remote REPLICATOR(s) in the EVI. The selection of more than one
      REPLICATOR is described in section 4.1.2 and it is a local LEAF
      decision.

   2) Flood-list #2 - composed of ACs and overlay tunnels to the
      remote IR Originating Router's IP Addresses.

   When a LEAF receives a BM packet on an AC, it will check the
   REPLICATOR-set:

   o If the REPLICATOR-set is empty, the LEAF will send the packet to
     flood-list #2.

   o If the REPLICATOR-set is NOT empty, the LEAF will send the packet
     to flood-list #1.

   When a LEAF receives a BM packet on an overlay tunnel, it will
   forward the BM packet to its local ACs and never to an overlay
   tunnel. This is the regular IR behavior described in [EVPN].

4.2.1.3. RNVE BM forwarding

   The RNVE is completely unaware of the REPLICATORs, LEAF nodes and
   BM/U flags (that information is ignored). Its forwarding behavior is
   the regular IR behavior described in [EVPN]. Any regular non-AR node
   is fully compatible with the RNVE role described in this document.

4.2.2. Unknown unicast forwarding behavior

   The expected behavior is described in this section.

4.2.2.1. REPLICATOR/LEAF Unknown unicast forwarding

   While the forwarding behavior in REPLICATORs and LEAF nodes is
   different for BM traffic, as far as Unknown unicast traffic
   forwarding is concerned, LEAF nodes behave in exactly the same way
   as REPLICATORs do.

   The REPLICATOR/LEAF nodes will build a flood-list composed of ACs and
   overlay tunnels to the IR Originating Router's IP Addresses of the
   remote nodes in the EVI. Some of those overlay tunnels MAY be flagged
   as non-U (Unknown unicast) receivers based on the U flag received
   from the remote nodes in the EVI.

   o When a REPLICATOR/LEAF receives an unknown packet on an AC, it will
     forward the unknown packet to its flood-list, skipping the non-U
     overlay tunnels.

   o When a REPLICATOR/LEAF receives an unknown packet on an overlay
     tunnel, it will forward the unknown packet to its local ACs and
     never to an overlay tunnel. This is the regular IR behavior
     described in [EVPN].

4.2.2.2. RNVE Unknown unicast forwarding

   As described for BM traffic, the RNVE is completely unaware of the
   REPLICATORs, LEAF nodes and BM/U flags (that information is ignored).
   Its forwarding behavior is the regular IR behavior described in
   [EVPN], also for Unknown unicast traffic. Any regular non-AR node is
   fully compatible with the RNVE role described in this document.

5. Pruned-Flood-Lists (PFL)

   The second optimization supported by this solution is the ability for
   all the EVI nodes to signal Pruned-Flood-Lists (PFL). As described in
   section 3, an EVPN node can signal a given value for the BM and U PFL
   flags in the IR Inclusive Multicast Routes, where:

   + BM = Broadcast and Multicast (BM) flag. BM=1 means "prune-me" from
     the BM flood-list. BM=0 means regular behavior.

   + U = Unknown flag. U=1 means "prune-me" from the Unknown
     flood-list. U=0 means regular behavior.

   The ability to signal these PFL flags is an administrative choice.
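
   The unknown unicast flood-list of section 4.2.2.1, combined with the
   U flag described above, can be sketched as follows. This is only an
   illustration under stated assumptions: the RemoteNode class, the
   unknown_unicast_flood function and the honor_pfl switch are
   hypothetical names, the 192.0.2.x addresses are documentation
   addresses chosen for the example, and excluding the ingress AC from
   the output is ordinary bridging behavior that is assumed here rather
   than stated in this document.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RemoteNode:
    """A remote EVI member, as learned from its IR Inclusive Multicast Route."""
    name: str
    ir_ip: str              # IR Originating Router's IP Address
    bm_prune: bool = False  # BM flag received (not used for unknown unicast)
    u_prune: bool = False   # U flag received (True means "prune-me")


def unknown_unicast_flood(
    local_acs: List[str],
    remotes: List[RemoteNode],
    ingress_ac: Optional[str],
    honor_pfl: bool = True,
) -> Tuple[List[str], List[str]]:
    """Return (egress ACs, egress overlay tunnel destination IPs).

    ingress_ac is the AC the packet arrived on, or None when the packet
    arrived on an overlay tunnel.
    """
    # Assumption: the frame is never sent back out of the AC it came from.
    egress_acs = [ac for ac in local_acs if ac != ingress_ac]
    if ingress_ac is None:
        # Received on an overlay tunnel: local ACs only, never a tunnel.
        return egress_acs, []
    # Received on an AC: flood to the IR addresses, skipping non-U receivers
    # when the node has chosen to honor the U flag.
    tunnels = [r.ir_ip for r in remotes if not (honor_pfl and r.u_prune)]
    return egress_acs, tunnels


# View from NVE3 in Figure 1, assuming NVE1 has signaled BM=1 and U=1.
remotes_of_nve3 = [
    RemoteNode("NVE1", "192.0.2.1", bm_prune=True, u_prune=True),
    RemoteNode("NVE2", "192.0.2.2"),
    RemoteNode("PE1", "192.0.2.11"),
    RemoteNode("PE2", "192.0.2.12"),
]
from_ac = unknown_unicast_flood(["VM31", "VM32"], remotes_of_nve3, "VM31")
from_tunnel = unknown_unicast_flood(["VM31", "VM32"], remotes_of_nve3, None)
```

   With these inputs, an unknown unicast frame from VM31 is flooded to
   VM32 and to the tunnels towards NVE2, PE1 and PE2, while a frame
   arriving on an overlay tunnel reaches only VM31 and VM32. The same
   filtering pattern would apply to the BM flood-list using the BM flag.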

   Upon receiving a non-zero PFL flag, a node MAY decide to honor the
   PFL flag and remove the sender from the corresponding flood-list. A
   given EVI node receiving BUM traffic on an overlay tunnel MUST
   replicate the traffic normally, regardless of the signaled PFL
   flags.

   This optimization MAY be used along with the AR solution.

6. An example use-case

   In order to illustrate the use of the solution described in this
   document, we will assume that EVI-1 in figure 1 is optimized-IR
   enabled and:

   o PE1 and PE2 are administratively configured as REPLICATORs, due to
     their high-performance replication capabilities. PE1 and PE2 will
     signal AR type = 1 and BM/U flags = 00.

   o NVE1 and NVE3 are administratively configured as LEAF nodes, due to
     their low-performance software-based replication capabilities. They
     will signal AR type = 2. Assuming both NVEs advertise all the
     attached VMs in EVPN as soon as they come up and do not have any
     VMs interested in multicast applications, they will be configured
     to signal BM/U flags = 11 for EVI-1.

   o NVE2 is optimized-IR unaware; therefore it takes on the RNVE role
     in EVI-1.

   Based on the above assumptions, the following forwarding behavior
   will take place:

   (1) Any BM packets sent from VM11 will be sent to VM12 and PE1. PE1
       will then forward the BM packets to TS1, the WAN link, PE2 and
       NVE2, but not to NVE3. PE2 and NVE2 will replicate the BM packets
       to their local ACs, while NVE3 is spared from unnecessarily
       replicating those BM packets to VM31 and VM32.

   (2) Any BM packets received on PE2 from the WAN will be sent to PE1
       and NVE2, but not to NVE1 and NVE3, sparing the two hypervisors
       from replicating unnecessarily to their local VMs. PE1 and NVE2
       will replicate to their local ACs only.

   (3) Any Unknown unicast packet sent from VM31 will be forwarded by
       NVE3 to NVE2, PE1 and PE2 but not NVE1.
       The solution avoids the unnecessary replication to NVE1, since
       the destination of the unknown traffic cannot be at NVE1.

   (4) Any Unknown unicast packet sent from TS1 will be forwarded by PE1
       to the WAN link, PE2 and NVE2 but not to NVE1 and NVE3, since the
       target of the unknown traffic cannot be at those NVEs.

5. Benefits of the optimized-IR solution

   A solution for the optimization of Ingress Replication in EVPN is
   described in this document (optimized-IR). The solution brings the
   following benefits:

   o Optimizes the multicast forwarding in low-performance NVEs, by
     delegating the replication to high-performance NVEs (REPLICATORs),
     while preserving the packet ordering for unicast applications.

   o Reduces the flooded traffic in NVO networks where some NVEs do not
     need broadcast/multicast and/or unknown unicast traffic.

   o It is fully compatible with existing EVPN implementations and EVPN
     functions for NVO overlay tunnels. Optimized-IR NVEs and regular
     NVEs can even be part of the same EVI.

   o It does not require any PIM-based tree in the NVO core of the
     network.

6. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to be
   interpreted as carrying RFC-2119 significance.

   In this document, the characters ">>" preceding one or more indented
   lines indicate a compliance requirement statement using the key
   words listed above. This convention aids reviewers in quickly
   identifying or finding the explicit compliance requirements of this
   RFC.

7. Security Considerations

   This section will be added in future versions.

8. IANA Considerations

8. References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [EVPN] Sajassi et al., "BGP MPLS Based Ethernet VPN",
          draft-ietf-l2vpn-evpn-07.txt, work in progress, May, 2014

   [EVPN-OVERLAY] Sajassi-Drake et al., "A Network Virtualization
                  Overlay Solution using EVPN",
                  draft-sd-l2vpn-evpn-overlay-02.txt, work in progress,
                  October, 2013

9. Acknowledgments

   The authors would like to thank Neil Hart and David Motz for their
   valuable feedback and contributions.

10. Authors' Addresses

   Jorge Rabadan
   Alcatel-Lucent
   777 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: jorge.rabadan@alcatel-lucent.com

   Senthil Sathappan
   Alcatel-Lucent
   Email: senthil.sathappan@alcatel-lucent.com

   Mukul Katiyar
   Nuage Networks
   Email:

   Wim Henderickx
   Alcatel-Lucent
   Email: wim.henderickx@alcatel-lucent.com

   Ravi Shekhar
   Juniper Networks
   Email: rshekhar@juniper.net

   Nischal Sheth
   Juniper Networks
   Email: nsheth@juniper.net

   Wen Lin
   Juniper Networks
   Email: wlin@juniper.net