idnits 2.17.1 

draft-ietf-l2vpn-evpn-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 46 instances of lines with control characters in the document.

  ** The abstract seems to contain references ([EVPN-REQ]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 1398 has weird spacing: '...ntinues  forwa...'

  -- The document date (September 12, 2014) is 3507 days in the past.  Is
     this intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'EVPN-REQ' is mentioned on line 432, but not defined

  == Missing Reference: 'VPLS-MCAST' is mentioned on line 2075, but not
     defined

  == Missing Reference: 'RFC5925' is mentioned on line 2206, but not defined

  == Unused Reference: 'RFC7209' is defined on line 2291, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC7117' is defined on line 2294, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-07) exists of
     draft-ietf-l2vpn-evpn-req-04

  == Outdated reference: A later version (-16) exists of
     draft-ietf-l2vpn-vpls-mcast-14


     Summary: 2 errors (**), 0 flaws (~~), 9 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                    A. Sajassi, Ed.
3	INTERNET-DRAFT                                                     Cisco
4	Category: Standards Track
5	                                                             R. Aggarwal
6	J. Drake                                                          Arktan
7	Juniper Networks
8	                                                                N. Bitar
9	W. Henderickx                                                    Verizon
10	Alcatel-Lucent
11	                                                            Aldrin Isaac
12	                                                               Bloomberg

14	                                                               J. Uttaro
15	                                                                    AT&T

17	Expires: March 12, 2015                               September 12, 2014

19	                      BGP MPLS Based Ethernet VPN
20	                        draft-ietf-l2vpn-evpn-08

22	Status of this Memo

24	   This Internet-Draft is submitted to IETF in full conformance with the
25	   provisions of BCP 78 and BCP 79.

27	   Internet-Drafts are working documents of the Internet Engineering
28	   Task Force (IETF), its areas, and its working groups.  Note that
29	   other groups may also distribute working documents as
30	   Internet-Drafts.

32	   Internet-Drafts are draft documents valid for a maximum of six months
33	   and may be updated, replaced, or obsoleted by other documents at any
34	   time.  It is inappropriate to use Internet-Drafts as reference
35	   material or to cite them other than as "work in progress."

37	   The list of current Internet-Drafts can be accessed at
38	   http://www.ietf.org/1id-abstracts.html

40	   The list of Internet-Draft Shadow Directories can be accessed at
41	   http://www.ietf.org/shadow.html

43	Copyright and License Notice

45	   Copyright (c) 2013 IETF Trust and the persons identified as the
46	   document authors. All rights reserved.

48	   This document is subject to BCP 78 and the IETF Trust's Legal
49	   Provisions Relating to IETF Documents
50	   (http://trustee.ietf.org/license-info) in effect on the date of
51	   publication of this document. Please review these documents
52	   carefully, as they describe your rights and restrictions with respect
53	   to this document. Code Components extracted from this document must
54	   include Simplified BSD License text as described in Section 4.e of
55	   the Trust Legal Provisions and are provided without warranty as
56	   described in the Simplified BSD License.

58	Abstract

60	   This document describes procedures for BGP MPLS based Ethernet VPNs
61	   (EVPN). The procedures described here are intended to meet the
62	   requirements specified in [EVPN-REQ].

64	Table of Contents

66	   1. Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  5
67	   2. Specification of requirements . . . . . . . . . . . . . . . . .  5
68	   3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . .  5
69	   4. BGP MPLS Based EVPN Overview  . . . . . . . . . . . . . . . . .  6
70	   5. Ethernet Segment  . . . . . . . . . . . . . . . . . . . . . . .  7
71	   6. Ethernet Tag ID . . . . . . . . . . . . . . . . . . . . . . . . 10
72	     6.1 VLAN Based Service Interface . . . . . . . . . . . . . . . . 11
73	     6.2 VLAN Bundle Service Interface  . . . . . . . . . . . . . . . 11
74	       6.2.1 Port Based Service Interface . . . . . . . . . . . . . . 11
75	     6.3 VLAN Aware Bundle Service Interface  . . . . . . . . . . . . 11
76	       6.3.1 Port Based VLAN Aware Service Interface  . . . . . . . . 12
77	   7. BGP EVPN NLRI . . . . . . . . . . . . . . . . . . . . . . . . . 12
78	     7.1. Ethernet Auto-Discovery Route . . . . . . . . . . . . . . . 13
79	     7.2.  MAC/IP Advertisement Route . . . . . . . . . . . . . . . . 13
80	     7.3. Inclusive Multicast Ethernet Tag Route  . . . . . . . . . . 14
81	     7.4 Ethernet Segment Route . . . . . . . . . . . . . . . . . . . 15
82	     7.5 ESI Label Extended Community . . . . . . . . . . . . . . . . 15
83	     7.6 ES-Import Route Target . . . . . . . . . . . . . . . . . . . 16
84	     7.7 MAC Mobility Extended Community  . . . . . . . . . . . . . . 16
85	     7.8 Default Gateway Extended Community . . . . . . . . . . . . . 17
86	     7.9 Route Distinguisher Assignment per EVI . . . . . . . . . . . 17
87	     7.10 Route Targets . . . . . . . . . . . . . . . . . . . . . . . 17
88	       7.10.1 Auto-Derivation from the Ethernet Tag ID  . . . . . . . 17
89	   8. Multi-homing Functions  . . . . . . . . . . . . . . . . . . . . 18
90	     8.1 Multi-homed Ethernet Segment Auto-Discovery  . . . . . . . . 18
91	       8.1.1 Constructing the Ethernet Segment Route  . . . . . . . . 18
92	     8.2 Fast Convergence . . . . . . . . . . . . . . . . . . . . . . 18
93	       8.2.1 Constructing the Ethernet A-D per Ethernet Segment
94	             (ES) Route . . . . . . . . . . . . . . . . . . . . . . . 19
95	         8.2.1.1. Ethernet A-D Route Targets  . . . . . . . . . . . . 20

97	     8.3 Split Horizon  . . . . . . . . . . . . . . . . . . . . . . . 20
98	       8.3.1 ESI Label Assignment . . . . . . . . . . . . . . . . . . 21
99	         8.3.1.1 Ingress Replication  . . . . . . . . . . . . . . . . 21
100	         8.3.1.2. P2MP MPLS LSPs  . . . . . . . . . . . . . . . . . . 22
101	     8.4 Aliasing and Backup-Path . . . . . . . . . . . . . . . . . . 23
102	       8.4.1 Constructing the Ethernet A-D per EVPN Instance (EVI)
103	             Route  . . . . . . . . . . . . . . . . . . . . . . . . . 24
104	     8.5 Designated Forwarder Election  . . . . . . . . . . . . . . . 25
105	     8.6. Interoperability with Single-homing PEs . . . . . . . . . . 27
106	   9. Determining Reachability to Unicast MAC Addresses . . . . . . . 27
107	     9.1. Local Learning  . . . . . . . . . . . . . . . . . . . . . . 27
108	     9.2. Remote learning . . . . . . . . . . . . . . . . . . . . . . 28
109	       9.2.1. Constructing the BGP EVPN MAC/IP Address
110	              Advertisement . . . . . . . . . . . . . . . . . . . . . 28
111	       9.2.2 Route Resolution . . . . . . . . . . . . . . . . . . . . 30
112	   10. ARP and ND . . . . . . . . . . . . . . . . . . . . . . . . . . 31
113	     10.1 Default Gateway . . . . . . . . . . . . . . . . . . . . . . 32
114	   11. Handling of Multi-Destination Traffic  . . . . . . . . . . . . 33
115	     11.1. Construction of the Inclusive Multicast Ethernet Tag
116	           Route  . . . . . . . . . . . . . . . . . . . . . . . . . . 34
117	     11.2. P-Tunnel Identification  . . . . . . . . . . . . . . . . . 34
118	   12. Processing of Unknown Unicast Packets  . . . . . . . . . . . . 35
119	     12.1. Ingress Replication  . . . . . . . . . . . . . . . . . . . 36
120	     12.2. P2MP MPLS LSPs . . . . . . . . . . . . . . . . . . . . . . 36
121	   13. Forwarding Unicast Packets . . . . . . . . . . . . . . . . . . 36
122	     13.1. Forwarding packets received from a CE  . . . . . . . . . . 37
123	     13.2. Forwarding packets received from a remote PE . . . . . . . 38
124	       13.2.1. Unknown Unicast Forwarding . . . . . . . . . . . . . . 38
125	       13.2.2. Known Unicast Forwarding . . . . . . . . . . . . . . . 38
126	   14. Load Balancing of Unicast Frames . . . . . . . . . . . . . . . 38
127	     14.1. Load balancing of traffic from a PE to remote CEs  . . . . 38
128	       14.1.1 Single-Active Redundancy Mode . . . . . . . . . . . . . 39
129	       14.1.2 All-Active Redundancy Mode  . . . . . . . . . . . . . . 39
130	     14.2. Load balancing of traffic between a PE and a local CE  . . 41
131	       14.2.1. Data plane learning  . . . . . . . . . . . . . . . . . 41
132	       14.2.2. Control plane learning . . . . . . . . . . . . . . . . 41
133	   15. MAC Mobility . . . . . . . . . . . . . . . . . . . . . . . . . 41
134	     15.1. MAC Duplication Issue  . . . . . . . . . . . . . . . . . . 43
135	     15.2. Sticky MAC addresses . . . . . . . . . . . . . . . . . . . 44
136	   16. Multicast & Broadcast  . . . . . . . . . . . . . . . . . . . . 44
137	     16.1. Ingress Replication  . . . . . . . . . . . . . . . . . . . 44
138	     16.2. P2MP LSPs  . . . . . . . . . . . . . . . . . . . . . . . . 44
139	       16.2.1. Inclusive Trees  . . . . . . . . . . . . . . . . . . . 45
140	   17. Convergence  . . . . . . . . . . . . . . . . . . . . . . . . . 45
141	     17.1. Transit Link and Node Failures between PEs . . . . . . . . 45
142	     17.2. PE Failures  . . . . . . . . . . . . . . . . . . . . . . . 45
143	     17.3. PE to CE Network Failures  . . . . . . . . . . . . . . . . 46
144	   18. Frame Ordering . . . . . . . . . . . . . . . . . . . . . . . . 46
145	   19. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 47
146	   20. Security Considerations  . . . . . . . . . . . . . . . . . . . 47
147	   21. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 48
148	   22.  IANA Considerations . . . . . . . . . . . . . . . . . . . . . 49
149	   23. References . . . . . . . . . . . . . . . . . . . . . . . . . . 49
150	     23.1 Normative References  . . . . . . . . . . . . . . . . . . . 49
151	     23.2 Informative References  . . . . . . . . . . . . . . . . . . 50
152	   24. Author's Address . . . . . . . . . . . . . . . . . . . . . . . 50

154	1. Introduction

156	   This document describes procedures for BGP MPLS based Ethernet VPNs
157	   (EVPN). The procedures described here are intended to meet the
158	   requirements specified in [EVPN-REQ].  Please refer to [EVPN-REQ] for
159	   the detailed requirements and motivation. EVPN requires extensions to
160	   existing IP/MPLS protocols as described in this document. In addition
161	   to these extensions EVPN uses several building blocks from existing
162	   MPLS technologies.

164	2. Specification of requirements

166	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
167	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
168	   document are to be interpreted as described in [RFC2119].

170	3. Terminology

172	   Broadcast Domain: in a bridged network, it corresponds to a Virtual
173	   LAN (VLAN); where a VLAN is typically represented by a single VLAN ID
174	   (VID), but can be represented by several VIDs.

176	   Bridge Domain: An instantiation of a broadcast domain on a bridge
177	   node

179	   CE: Customer Edge device, e.g., a host, router, or switch

181	   EVI:  An EVPN instance spanning across the PEs participating in that
182	   EVPN

184	   MAC-VRF:  A Virtual Routing and Forwarding table for MAC addresses on
185	   a PE for an EVI

187	   Ethernet Segment Identifier (ESI):  If a CE is multi-homed to two or
188	   more PEs, the set of Ethernet links that attaches the CE to the PEs
189	   is an 'Ethernet segment'.   Ethernet segments MUST have a unique non-
190	   zero identifier, the 'Ethernet Segment Identifier'.

192	   Ethernet Tag:  An Ethernet Tag identifies a particular broadcast
193	   domain, e.g., a VLAN.  An EVPN instance consists of one or more
194	   broadcast domains. Ethernet tag(s) are assigned to the broadcast
195	   domains of a given EVPN instance by the provider of that EVPN, and
196	   each PE in that EVPN instance performs a mapping between broadcast
197	   domain identifier(s) understood by each of its attached CEs and the
198	   corresponding Ethernet tag.

200	   LACP: Link Aggregation Control Protocol

202	   MP2MP: Multipoint to Multipoint

204	   P2MP: Point to Multipoint

206	   P2P: Point to Point

208	   Single-Active Redundancy Mode: When only a single PE, among a group
209	   of PEs attached to an Ethernet segment, is allowed to forward traffic
210	   to/from that Ethernet Segment, then the Ethernet segment is defined
211	   to be operating in Single-Active redundancy mode.

213	   All-Active Redundancy Mode: When all PEs attached to an Ethernet
214	   segment are allowed to forward traffic to/from that Ethernet Segment,
215	   then the Ethernet segment is defined to be operating in All-Active
216	   redundancy mode.

218	4. BGP MPLS Based EVPN Overview

220	   This section provides an overview of EVPN. An EVPN instance comprises
221	   CEs that are connected to PEs that form the edge of the MPLS
222	   infrastructure. A CE may be a host, a router or a switch. The PEs
223	   provide virtual Layer 2 bridged connectivity between the CEs. There
224	   may be multiple EVPN instances in the provider's network.

226	   The PEs may be connected by an MPLS LSP infrastructure which provides
227	   the benefits of MPLS technology such as fast-reroute, resiliency,
228	   etc.  The PEs may also be connected by an IP infrastructure in which
229	   case IP/GRE tunneling or other IP tunneling can be used between the
230	   PEs. The detailed procedures in this version of this document are
231	   specified only for MPLS LSPs as the tunneling technology. However
232	   these procedures are designed to be extensible to IP tunneling as the
233	   Packet Switched Network (PSN) tunneling technology.

235	   In an EVPN, MAC learning between PEs occurs not in the data plane (as
236	   happens with traditional bridging in VPLS [RFC4761] or [RFC4762]) but
237	   in the control plane. Control plane learning offers greater control
238	   over the MAC learning process, such as restricting who learns what,
239	   and the ability to apply policies.  Furthermore, the control plane
240	   chosen for advertising MAC reachability information is multi-protocol
241	   (MP) BGP (similar to IP VPNs (RFC 4364)). This provides flexibility
242	   and the ability to preserve the "virtualization" or isolation of
243	   groups of interacting agents (hosts, servers, virtual machines) from
244	   each other. In EVPN, PEs advertise the MAC addresses learned from the
245	   CEs that are connected to them, along with an MPLS label, to other
246	   PEs in the control plane using MP-BGP. Control plane learning enables
247	   load balancing of traffic to and from CEs that are multi-homed to
248	   multiple PEs. This is in addition to load balancing across the MPLS
249	   core via multiple LSPs between the same pair of PEs.  In other words
250	   it allows CEs to connect to multiple active points of attachment. It
251	   also improves convergence times in the event of certain network
252	   failures.

254	   However, learning between PEs and CEs is done by the method best
255	   suited to the CE: data plane learning, IEEE 802.1x, LLDP, 802.1aq,
256	   ARP, management plane or other protocols.

258	   It is a local decision as to whether the Layer 2 forwarding table on
259	   a PE is populated with all the MAC destination addresses known to the
260	   control plane, or whether the PE implements a cache based scheme. For
261	   instance the MAC forwarding table may be populated only with the MAC
262	   destinations of the active flows transiting a specific PE.

264	   The policy attributes of EVPN are very similar to those of IP-VPN. An
265	   EVPN instance requires a Route Distinguisher (RD) which is unique per
266	   PE and one or more globally unique Route-Targets (RTs). A CE attaches
267	   to a MAC-VRF on a PE, on an Ethernet interface which may be
268	   configured for one or more Ethernet Tags, e.g., VLAN IDs. Some
269	   deployment scenarios guarantee uniqueness of VLAN IDs across EVPN
270	   instances: all points of attachment for a given EVPN instance use the
271	   same VLAN ID, and no other EVPN instance uses this VLAN ID.  This
272	   document refers to this case as a "Unique VLAN EVPN" and describes
273	   simplified procedures to optimize for it.

275	5. Ethernet Segment

277	   If a CE is multi-homed to two or more PEs, the set of Ethernet links
278	   constitutes an "Ethernet Segment". An Ethernet segment may appear to
279	   the CE as a Link Aggregation Group (LAG).  Ethernet segments have an
280	   identifier, called the "Ethernet Segment Identifier" (ESI) which is
281	   encoded as a ten-octet integer in line format with the most
282	   significant octet sent first.  The following two ESI values are
283	   reserved:

285	      - ESI 0 denotes a single-homed CE.

287	      - ESI {0xFF} (repeated 10 times) is known as MAX-ESI and is
288	        reserved.
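
As a minimal sketch of the encoding rule and the two reserved values above,
the following Python fragment treats the ESI as a ten-octet big-endian
integer. The names ESI_ZERO, MAX_ESI, esi_from_int and is_reserved_esi are
hypothetical and chosen only for illustration; they do not come from this
document.

```python
ESI_LEN = 10  # the ESI is a ten-octet value

ESI_ZERO = bytes(ESI_LEN)      # ESI 0: denotes a single-homed CE
MAX_ESI = b"\xff" * ESI_LEN    # 0xFF repeated 10 times: reserved


def esi_from_int(value: int) -> bytes:
    """Encode an integer as a ten-octet ESI, most significant octet first."""
    return value.to_bytes(ESI_LEN, "big")


def is_reserved_esi(esi: bytes) -> bool:
    """Return True for the two reserved values (ESI 0 and MAX-ESI)."""
    if len(esi) != ESI_LEN:
        raise ValueError("an ESI must be exactly 10 octets")
    return esi in (ESI_ZERO, MAX_ESI)
```

A multi-homed Ethernet segment would be expected to carry a value for which
is_reserved_esi() returns False.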

290	   In general, an Ethernet segment SHOULD have a non-reserved ESI that
291	   is unique network wide (i.e., across all EVPN instances on all the
292	   PEs). If the CE(s) constituting an Ethernet Segment is (are) managed
293	   by the network operator, then ESI uniqueness should be guaranteed;
294	   however, if the CE(s) is (are) not managed, then the operator MUST
295	   configure a network-wide unique ESI for that Ethernet Segment.  This
296	   is required to enable auto-discovery of Ethernet Segments and DF
297	   election.

299	   In a network with managed and not-managed CEs, the ESI has the
300	   following format:

302	         +---+---+---+---+---+---+---+---+---+---+
303	         | T |          ESI Value                |
304	         +---+---+---+---+---+---+---+---+---+---+

306	   Where:

308	   T (ESI Type) is a 1-octet field (most significant octet) that
309	   specifies the format of the remaining nine octets (ESI Value). The
310	   following 6 ESI types can be used:

312	   - Type 0 (T=0x00) - This type indicates an arbitrary nine-octet ESI
313	   value, which is managed and configured by the operator.

315	   - Type 1 (T=0x01) - When IEEE 802.1AX LACP is used between the PEs
316	   and CEs, this ESI type indicates an auto-generated ESI value
317	   determined from LACP by concatenating the following parameters:

319	   	+ CE LACP six-octet System MAC address. The CE LACP System MAC
320	   	  address MUST be encoded in the high order six octets of the ESI
321	   	  Value field.

323	   	+ CE LACP two-octet Port Key. The CE LACP port key MUST be
324	   	  encoded in the two octets next to the System MAC address.

326	   	+ The remaining octet will be set to 0x00.

328	   	As far as the CE is concerned, it would treat the multiple PEs
329	   	that it is connected to as the same switch. This allows the CE
330	   	to aggregate links that are attached to different PEs in the
331	   	same bundle.

333	   	This mechanism could be used only if it produces ESIs that satisfy
334	   	the uniqueness requirement specified above.
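
The Type 1 layout described above could be assembled roughly as follows. This
is only a sketch: the function name esi_type1 is hypothetical, and the choice
of network byte order (most significant octet first) for the Port Key is an
assumption, since the text above does not state the byte order of that field.
The same one-octet type, six-octet MAC, two-octet value, zero-octet pattern
also applies to Type 2, with the Root Bridge MAC address and Priority in
place of the LACP parameters.

```python
def esi_type1(ce_system_mac: bytes, ce_port_key: int) -> bytes:
    """Build a Type 1 ESI from the CE's LACP System MAC and Port Key."""
    if len(ce_system_mac) != 6:
        raise ValueError("the LACP System MAC address must be 6 octets")
    if not 0 <= ce_port_key <= 0xFFFF:
        raise ValueError("the LACP Port Key must fit in 2 octets")
    return (
        b"\x01"                          # T = 0x01
        + ce_system_mac                  # high order six octets of ESI Value
        + ce_port_key.to_bytes(2, "big") # assumption: network byte order
        + b"\x00"                        # remaining octet set to 0x00
    )
```

For example, a CE with System MAC 02:00:00:00:00:aa and Port Key 0x0012 would
yield the ten octets 01 02 00 00 00 00 aa 00 12 00.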

336	   - Type 2 (T=0x02) - This type is used in the case of indirectly
337	   connected hosts via a bridged LAN between the CEs and the PEs. The
338	   ESI Value is auto-generated and determined based on the Layer 2
339	   bridge protocol as follows: If MST is used in the bridged LAN then
340	   the value of the ESI is derived by listening to BPDUs on the Ethernet
341	   segment. To achieve this the PE is not required to run MST. However
342	   the PE must learn the Root Bridge MAC address and Bridge Priority of
343	   the root of the Internal Spanning Tree (IST) by listening to the
344	   BPDUs. The ESI Value is constructed as follows:

346	   	+ Root Bridge six-octet MAC address. The Root Bridge MAC
347	   	  address MUST be encoded in the high order six octets of the
348	   	  ESI Value field.

350	   	+ Root Bridge two-octet Priority. The CE Root Bridge Priority
351	   	  MUST be encoded in the two octets next to the Root Bridge
352	   	  MAC address.

354	   	+ The remaining octet will be set to 0x00.

356	   	This mechanism could be used only if it produces ESIs that
357	   	satisfy the uniqueness requirement specified above.

359	   - Type 3 (T=0x03) - This type indicates a MAC-based ESI Value that
360	   can be auto-generated or configured by the operator. The ESI Value is
361	   constructed as follows:

363	   	+ System MAC address (six octets). The PE MAC address MUST
364	   	  be encoded in the high order six octets of the ESI Value field.

366	   	+ Local Discriminator value (three octets). The Local
367	   	  Discriminator MUST be encoded in the low order three octets
368	   	  of the ESI Value.

370	   	This mechanism could be used only if it produces ESIs that
371	   	satisfy the uniqueness requirement specified above.
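
A Type 3 ESI differs from Types 1 and 2 in that all nine octets of the ESI
Value are used (six for the MAC address, three for the discriminator). A
minimal sketch, with the hypothetical name esi_type3 and with the Local
Discriminator assumed to be encoded most significant octet first:

```python
def esi_type3(system_mac: bytes, local_discriminator: int) -> bytes:
    """Build a Type 3 (MAC-based) ESI."""
    if len(system_mac) != 6:
        raise ValueError("the System MAC address must be 6 octets")
    if not 0 <= local_discriminator <= 0xFFFFFF:
        raise ValueError("the Local Discriminator must fit in 3 octets")
    return (
        b"\x03"                                  # T = 0x03
        + system_mac                             # high order six octets
        + local_discriminator.to_bytes(3, "big") # low order three octets
    )
```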

373	   - Type 4 (T=0x04) - This type indicates a router-ID ESI Value that
374	   can be auto-generated or configured by the operator. The ESI Value is
375	   constructed as follows:

377	   	+ Router ID (four octets). The system router ID MUST be encoded in
378	   	  the high order four octets of the ESI Value field.

380	   	+ Local Discriminator value (four octets). The Local
381	   	  Discriminator MUST be encoded in the four octets next to the
382	   	  Router ID.

384	   	+ The low order octet of the ESI Value will be set to 0x00.

386	   	This mechanism could be used only if it produces ESIs that
387	   	satisfy the uniqueness requirement specified above.

389	   - Type 5 (T=0x05) - This type indicates an AS-based ESI Value that
390	   can be auto-generated or configured by the operator. The ESI Value is
391	   constructed as follows:

393	   	+ AS number (four octets). This is an AS number owned by the
394	   	 system and MUST be encoded in the high order four octets of the
395	   	 ESI Value field. If a two-octet AS number is used, the high order
396	   	 extra two octets will be 0x0000.

398	   	+ Local Discriminator value (four octets). The Local Discriminator
399	   	  MUST be encoded in the four octets next to the AS number.

401	   	+ The low order octet of the ESI Value will be set to 0x00.

403	   	This mechanism could be used only if it produces ESIs that satisfy
404	   	the uniqueness requirement specified above.
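
Types 4 and 5 share one layout: a four-octet identifier, a four-octet Local
Discriminator, and a trailing 0x00 octet. The sketch below illustrates both.
The function names are hypothetical, the router ID is assumed to be supplied
in dotted-quad form, and the discriminator is assumed to be encoded most
significant octet first.

```python
import ipaddress


def _four_plus_four(esi_type: int, ident: bytes, discriminator: int) -> bytes:
    """Shared layout: type, 4-octet identifier, 4-octet discriminator, 0x00."""
    if not 0 <= discriminator <= 0xFFFFFFFF:
        raise ValueError("the Local Discriminator must fit in 4 octets")
    return bytes([esi_type]) + ident + discriminator.to_bytes(4, "big") + b"\x00"


def esi_type4(router_id: str, discriminator: int) -> bytes:
    """Build a Type 4 (router-ID based) ESI."""
    return _four_plus_four(0x04, ipaddress.IPv4Address(router_id).packed,
                           discriminator)


def esi_type5(as_number: int, discriminator: int) -> bytes:
    """Build a Type 5 (AS-based) ESI.

    A two-octet AS number is encoded in four octets, which leaves the
    high order two octets as 0x0000.
    """
    if not 0 <= as_number <= 0xFFFFFFFF:
        raise ValueError("the AS number must fit in 4 octets")
    return _four_plus_four(0x05, as_number.to_bytes(4, "big"), discriminator)
```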

406	6. Ethernet Tag ID

408	   An Ethernet Tag ID is a 32-bit field containing either a 12-bit or a
409	   24-bit identifier that identifies a particular broadcast domain
410	   (e.g., a VLAN) in an EVPN Instance.  The 12-bit identifier is called
411	   VLAN ID (VID). An EVPN Instance consists of one or more broadcast
412	   domains (one or more VLANs). VLANs are assigned to a given EVPN
413	   Instance by the provider of the EVPN service. A given VLAN can itself
414	   be represented by multiple VLAN IDs (VIDs). In such cases, the PEs
415	   participating in that VLAN for a given EVPN instance are responsible
416	   for performing VLAN ID translation to/from locally attached CE
417	   devices.

419	   If a VLAN is represented by a single VID across all PE devices
420	   participating in that VLAN for that EVPN instance, then there is no
421	   need for VID translation at the PEs. Furthermore, some deployment
422	   scenarios guarantee uniqueness of VIDs across all EVPN instances;
423	   all points of attachment for a given EVPN instance use the same VID
424	   and no other EVPN instances use that VID.  This allows the RT(s) for
425	   each EVPN instance to be derived automatically from the corresponding
426	   VID, as described in section 7.10.1.

428	   The following subsections discuss the relationship between broadcast
429	   domains (e.g., VLANs), Ethernet Tag IDs (e.g., VIDs), and MAC-VRFs as
430	   well as the setting of the Ethernet Tag ID, in the various EVPN BGP
431	   routes (defined in section 7), for the different types of service
432	   interfaces described in [EVPN-REQ].

434	   The following value of Ethernet Tag ID is reserved:

436	      - Ethernet Tag ID {0xFFFFFFFF} is known as MAX-ET

438	6.1 VLAN Based Service Interface

440	   With this service interface, an EVPN instance consists of only a
441	   single broadcast domain (e.g., a single VLAN). Therefore, there is a
442	   one to one mapping between a VID on this interface and a MAC-VRF.
443	   Since a MAC-VRF corresponds to a single VLAN, it consists of a single
444	   bridge domain corresponding to that VLAN. If the VLAN is represented
445	   by multiple VIDs (e.g., a different VID per Ethernet Segment per PE),
446	   then each PE needs to perform VID translation for frames destined to
447	   its Ethernet Segment(s). In such scenarios, the Ethernet frames
448	   transported over MPLS/IP network SHOULD remain tagged with the
449	   originating VID and a VID translation MUST be supported in the data
450	   path and MUST be performed on the disposition PE. The Ethernet Tag ID
451	   in all EVPN routes MUST be set to 0.

453	6.2 VLAN Bundle Service Interface

455	   With this service interface, an EVPN instance corresponds to several
456	   broadcast domains (e.g., several VLANs); however, only a single
457	   bridge domain is maintained per MAC-VRF which means multiple VLANs
458	   share the same bridge domain. This implies MAC addresses MUST be
459	   unique across different VLANs for this service to work. In other
460	   words, there is a many-to-one mapping between VLANs and a MAC-VRF,
461	   and the MAC-VRF consists of a single bridge domain. Furthermore, a
462	   single VLAN must be represented by a single VID - i.e., no VID
463	   translation is allowed for this service interface type. The MPLS
464	   encapsulated frames MUST remain tagged with the originating VID. Tag
465	   translation is NOT permitted. The Ethernet Tag ID in all EVPN routes
466	   MUST be set to 0.

468	6.2.1 Port Based Service Interface

470	   This service interface is a special case of the VLAN Bundle service
471	   interface, where all of the VLANs on the port are part of the same
472	   service and map to the same bundle. The procedures are identical to
473	   those described in section 6.2.

475	6.3 VLAN Aware Bundle Service Interface

477	   With this service interface, an EVPN instance consists of several
478	   broadcast domains (e.g., several VLANs) with each VLAN having its own
479	   bridge domain - i.e., multiple bridge domains (one per VLAN) are
480	   maintained by a single MAC-VRF corresponding to the EVPN instance. In
481	   the case where a single VLAN is represented by different VIDs on
482	   different CEs and thus VID translation is required, a normalized
483	   Ethernet Tag ID (VID) MUST be carried in the MPLS encapsulated frames
484	   and an Ethernet Tag ID translation function MUST be supported in the
485	   data path. This translation MUST be performed in data path on both
486	   the imposition as well as the disposition PEs (translating to
487	   normalized Ethernet Tag ID on imposition PE and translating to local
488	   Ethernet Tag ID on disposition PE). The Ethernet Tag ID in all EVPN
489	   routes MUST be set to the normalized value assigned by the EVPN
490	   provider.
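
The rules in sections 6.1 through 6.3 for the Ethernet Tag ID carried in
EVPN routes can be summarized in a short sketch. The enum and function names
are hypothetical. Rejecting the reserved MAX-ET value as a normalized tag is
an assumption of this sketch, not a rule stated above.

```python
from enum import Enum
from typing import Optional

MAX_ET = 0xFFFFFFFF  # reserved Ethernet Tag ID value


class ServiceInterface(Enum):
    VLAN_BASED = 1          # section 6.1
    VLAN_BUNDLE = 2         # section 6.2, including port based (6.2.1)
    VLAN_AWARE_BUNDLE = 3   # section 6.3, including port based (6.3.1)


def route_ethernet_tag_id(service: ServiceInterface,
                          normalized_tag: Optional[int] = None) -> int:
    """Return the Ethernet Tag ID to place in EVPN routes."""
    if service in (ServiceInterface.VLAN_BASED, ServiceInterface.VLAN_BUNDLE):
        return 0  # MUST be set to 0 for these service interfaces
    if normalized_tag is None:
        raise ValueError("VLAN aware bundle requires a normalized tag")
    if not 0 <= normalized_tag < MAX_ET:  # assumption: MAX-ET is not usable
        raise ValueError("normalized tag out of range")
    return normalized_tag
```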

492	6.3.1 Port Based VLAN Aware Service Interface

494	   This service interface is a special case of the VLAN Aware Bundle
495	   service interface, where all of the VLANs on the port are part of the
496	   same service and are mapped to a single bundle but without any VID
497	   translation. The procedures are a subset of those described in
498	   section 6.3.

500	7. BGP EVPN NLRI

502	   This document defines a new BGP NLRI, called the EVPN NLRI.

504	   Following is the format of the EVPN NLRI:

506	                   +-----------------------------------+
507	                   |    Route Type (1 octet)           |
508	                   +-----------------------------------+
509	                   |     Length (1 octet)              |
510	                   +-----------------------------------+
511	                   | Route Type specific (variable)    |
512	                   +-----------------------------------+

514	   The Route Type field defines encoding of the rest of the EVPN NLRI
515	   (Route Type specific EVPN NLRI).

517	   The Length field indicates the length in octets of the Route Type
518	   specific field of EVPN NLRI.

520	   This document defines the following Route Types:

522	        + 1 - Ethernet Auto-Discovery (A-D) route
523	        + 2 - MAC/IP advertisement route
524	        + 3 - Inclusive Multicast Ethernet Tag Route
525	        + 4 - Ethernet Segment Route

527	   The detailed encoding and procedures for these route types are
528	   described in subsequent sections.
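
The type/length/value framing of the EVPN NLRI shown above can be
illustrated with a small encoder and decoder. This is a sketch only: the
function names are hypothetical, and the decoder's choice to return unknown
route types to the caller rather than reject them is an assumption.

```python
ROUTE_TYPES = {
    1: "Ethernet Auto-Discovery (A-D) route",
    2: "MAC/IP advertisement route",
    3: "Inclusive Multicast Ethernet Tag Route",
    4: "Ethernet Segment Route",
}


def encode_evpn_nlri(route_type: int, body: bytes) -> bytes:
    """Prefix a route type specific body with Route Type and Length."""
    if route_type not in ROUTE_TYPES:
        raise ValueError("route type not defined in this document")
    if len(body) > 255:
        raise ValueError("Length is a single octet")
    return bytes([route_type, len(body)]) + body


def decode_evpn_nlri(data: bytes) -> list:
    """Split a byte string into (route_type, body) pairs."""
    routes = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 2:
            raise ValueError("truncated NLRI header")
        route_type, length = data[offset], data[offset + 1]
        body = data[offset + 2:offset + 2 + length]
        if len(body) != length:
            raise ValueError("truncated NLRI body")
        routes.append((route_type, body))
        offset += 2 + length
    return routes
```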

530	   The EVPN NLRI is carried in BGP [RFC4271] using BGP Multiprotocol
531	   Extensions [RFC4760] with an AFI of 25 (L2VPN) and a SAFI of 70
532	   (EVPN). The NLRI field in the MP_REACH_NLRI/MP_UNREACH_NLRI attribute
533	   contains the EVPN NLRI (encoded as specified above).

535	   In order for two BGP speakers to exchange labeled EVPN NLRI, they
536	   must use BGP Capabilities Advertisement to ensure that they both are
537	   capable of properly processing such NLRI. This is done as specified
538	   in [RFC4760], by using capability code 1 (multiprotocol BGP) with an
539	   AFI of 25 (L2VPN) and a SAFI of 70 (EVPN).
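
As an illustration of the capability exchange described above, the sketch
below builds the multiprotocol capability for AFI 25 and SAFI 70 using the
layout from [RFC4760] (two-octet AFI, one reserved octet, one-octet SAFI)
inside a code/length/value capability. The function names, and the
representation of each side's capabilities as a collection of encoded byte
strings, are assumptions made for the example.

```python
import struct

AFI_L2VPN = 25
SAFI_EVPN = 70
CAP_MULTIPROTOCOL = 1  # capability code 1 (multiprotocol BGP)


def mp_capability(afi: int, safi: int) -> bytes:
    """Encode one multiprotocol capability (code, length, value)."""
    value = struct.pack("!HBB", afi, 0, safi)  # AFI, reserved, SAFI
    return struct.pack("!BB", CAP_MULTIPROTOCOL, len(value)) + value


def both_support_evpn(local_caps, peer_caps) -> bool:
    """True only if both speakers advertised the L2VPN/EVPN capability."""
    evpn = mp_capability(AFI_L2VPN, SAFI_EVPN)
    return evpn in local_caps and evpn in peer_caps
```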

541	7.1. Ethernet Auto-Discovery Route

543	   An Ethernet A-D route type specific EVPN NLRI consists of the
544	   following:

546	                   +---------------------------------------+
547	                   | Route Distinguisher (RD)   (8 octets) |
548	                   +---------------------------------------+
549	                   |Ethernet Segment Identifier (10 octets)|
550	                   +---------------------------------------+
551	                   |  Ethernet Tag ID (4 octets)           |
552	                   +---------------------------------------+
553	                   |  MPLS Label (3 octets)                |
554	                   +---------------------------------------+

556	   For the purpose of BGP route key processing, only the Ethernet
557	   Segment Identifier and the Ethernet Tag ID are considered to be part
558	   of the prefix in the NLRI.   The MPLS Label field is to be treated as
559	   a route attribute as opposed to being part of the route.
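
A rough sketch of the Ethernet A-D route body and of the route key rule
stated above follows. The function names are hypothetical. Placing the label
value in the high order 20 bits of the three-octet MPLS Label field is an
assumption of this sketch, as the text above does not specify the bit
layout. The key function follows the sentence above literally and uses only
the Ethernet Segment Identifier and the Ethernet Tag ID.

```python
import struct


def encode_label(label: int) -> bytes:
    """Encode a 20-bit label into 3 octets (assumed: high order 20 bits)."""
    if not 0 <= label < (1 << 20):
        raise ValueError("an MPLS label is a 20-bit value")
    return (label << 4).to_bytes(3, "big")


def encode_ethernet_ad(rd: bytes, esi: bytes, eth_tag: int, label: int) -> bytes:
    """RD (8) + ESI (10) + Ethernet Tag ID (4) + MPLS Label (3) = 25 octets."""
    if len(rd) != 8 or len(esi) != 10:
        raise ValueError("RD must be 8 octets and ESI 10 octets")
    return rd + esi + struct.pack("!I", eth_tag) + encode_label(label)


def ethernet_ad_route_key(body: bytes) -> bytes:
    """Return the ESI and Ethernet Tag ID; the label is a route attribute."""
    if len(body) != 25:
        raise ValueError("unexpected Ethernet A-D route length")
    return body[8:22]
```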

561	   For procedures and usage of this route please see section 8.2 "Fast
562	   Convergence" and section 8.4 "Aliasing".
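
   A minimal sketch of this layout and of the route key rule follows.
   It assumes that the 20-bit label value occupies the high-order bits
   of the 3-octet MPLS Label field, which this section does not state;
   all function names are hypothetical.

```python
import struct

def pack_label(label: int) -> bytes:
    # Assumption: 20-bit label in the high-order bits of the 3-octet field.
    return (label << 4).to_bytes(3, "big")

def pack_ethernet_ad(rd: bytes, esi: bytes, eth_tag: int, label: int) -> bytes:
    """RD (8) + ESI (10) + Ethernet Tag ID (4) + MPLS Label (3) = 25 octets."""
    if len(rd) != 8 or len(esi) != 10:
        raise ValueError("RD must be 8 octets and ESI 10 octets")
    return rd + esi + struct.pack("!I", eth_tag) + pack_label(label)

def ethernet_ad_key(nlri: bytes) -> bytes:
    # Following the text literally: only the ESI and the Ethernet Tag ID
    # form the prefix; the MPLS Label is treated as an attribute.
    return nlri[8:22]
```

   Two such NLRIs that differ only in the MPLS Label yield the same
   key, so the second is treated as an update of the first.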

564	7.2.  MAC/IP Advertisement Route

566	   A MAC/IP advertisement route type specific EVPN NLRI consists of the
567	   following:

569	                   +---------------------------------------+
570	                   |      RD   (8 octets)                  |
571	                   +---------------------------------------+
572	                   |Ethernet Segment Identifier (10 octets)|
573	                   +---------------------------------------+
574	                   |  Ethernet Tag ID (4 octets)           |
575	                   +---------------------------------------+
576	                   |  MAC Address Length (1 octet)         |
577	                   +---------------------------------------+
578	                   |  MAC Address (6 octets)               |
579	                   +---------------------------------------+
580	                   |  IP Address Length (1 octet)          |
581	                   +---------------------------------------+
582	                   |  IP Address (0 or 4 or 16 octets)     |
583	                   +---------------------------------------+
584	                   |  MPLS Label1 (3 octets)               |
585	                   +---------------------------------------+
586	                   |  MPLS Label2 (0 or 3 octets)          |
587	                   +---------------------------------------+

589	   For the purpose of BGP route key processing, only the Ethernet Tag
590	   ID, MAC Address Length, MAC Address, IP Address Length, and IP
591	   Address fields are considered to be part of the prefix in the
592	   NLRI. The Ethernet Segment Identifier and MPLS Label1 and MPLS Label2
593	   fields are to be treated as route attributes as opposed to being part
594	   of the "route". The IP address length is in bits.

596	   For procedures and usage of this route please see section 9
597	   "Determining Reachability to Unicast MAC Addresses" and section 14
598	   "Load Balancing of Unicast Packets".
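
   The variable-length fields of this route can be sketched as below.
   The sketch assumes that the MAC Address Length is expressed in bits
   (48), like the IP Address Length, and that labels sit in the
   high-order 20 bits of each 3-octet field; neither assumption is
   stated in this section, and the function name is hypothetical.

```python
import ipaddress
import struct

def pack_mac_ip(rd, esi, eth_tag, mac, ip=None, label1=0, label2=None):
    """Encode a MAC/IP advertisement NLRI; ip is a string or None."""
    if len(rd) != 8 or len(esi) != 10 or len(mac) != 6:
        raise ValueError("unexpected field length")
    ip_bytes = ipaddress.ip_address(ip).packed if ip else b""
    out = rd + esi + struct.pack("!I", eth_tag)
    out += bytes([len(mac) * 8]) + mac            # assumed: length in bits
    out += bytes([len(ip_bytes) * 8]) + ip_bytes  # 0, 32 or 128 bits
    out += (label1 << 4).to_bytes(3, "big")
    if label2 is not None:                        # MPLS Label2 is optional
        out += (label2 << 4).to_bytes(3, "big")
    return out
```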

600	7.3. Inclusive Multicast Ethernet Tag Route

602	   An Inclusive Multicast Ethernet Tag route type specific EVPN NLRI
603	   consists of the following:

605	                   +---------------------------------------+
606	                   |      RD   (8 octets)                  |
607	                   +---------------------------------------+
608	                   |  Ethernet Tag ID (4 octets)           |
609	                   +---------------------------------------+
610	                   |  IP Address Length (1 octet)          |
611	                   +---------------------------------------+
612	                   |   Originating Router's IP Addr        |
613	                   |          (4 or 16 octets)             |
614	                   +---------------------------------------+

616	   For procedures and usage of this route please see section 11
617	   "Handling of Multi-Destination Traffic", section 13 "Processing of
618	   Unknown Unicast Traffic" and section 16 "Multicast". The IP address
619	   length is in bits. For the purpose of BGP route key processing, only
620	   the Ethernet Tag ID, IP Address Length, and Originating Router's IP
621	   Address fields are considered to be part of the prefix in the NLRI.

623	7.4 Ethernet Segment Route

625	   An Ethernet Segment route type specific EVPN NLRI consists of the
626	   following:

628	                   +---------------------------------------+
629	                   |      RD   (8 octets)                  |
630	                   +---------------------------------------+
631	                   |Ethernet Segment Identifier (10 octets)|
632	                   +---------------------------------------+
633	                   |  IP Address Length (1 octet)          |
634	                   +---------------------------------------+
635	                   |   Originating Router's IP Addr        |
636	                   |          (4 or 16 octets)             |
637	                   +---------------------------------------+

639	   For procedures and usage of this route please see section 8.5
640	   "Designated Forwarder Election". The IP address length is in bits.
641	   For the purpose of BGP route key processing, only the Ethernet
642	   Segment ID, IP Address Length, and Originating Router's IP Address
643	   fields are considered to be part of the prefix in the NLRI.

645	7.5 ESI Label Extended Community

647	   This extended community is a new transitive extended community with
648	   the Type field of 0x06 and the Sub-Type of 0x01. It may be
649	   advertised along with Ethernet Auto-Discovery routes and it enables
650	   split-horizon procedures for multi-homed sites as described in
651	   section 8.3 "Split Horizon". The ESI Label is used by the advertising
652	   PE to represent an ES, and by other PEs connected to the same
653	   multi-homed Ethernet Segment for split-horizon filtering.

655	   Each ESI Label Extended Community is encoded as an 8-octet value as
656	   follows:

658	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
659	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
660	       | Type=0x06     | Sub-Type=0x01 | Flags(1 Octet)|  Reserved=0   |
661	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
662	       | Reserved = 0  |          ESI Label                            |
663	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

665	   The low order bit of the flags octet is defined as the "Single-
666	   Active" bit.  A value of 0 means that the multi-homed site is
667	   operating in All-Active redundancy mode and a value of 1 means that
668	   the multi-homed site is operating in Single-Active redundancy mode.
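
   For illustration, the 8-octet value can be built and inspected as
   sketched below. The placement of the 20-bit label in the high-order
   bits of the 3-octet ESI Label field is an assumption, and the
   function names are hypothetical.

```python
def pack_esi_label_ec(esi_label: int, single_active: bool) -> bytes:
    """Type, Sub-Type, Flags, two reserved octets, 3-octet ESI Label."""
    flags = 0x01 if single_active else 0x00   # low order bit: Single-Active
    # Assumption: label value in the high-order 20 bits of the label field.
    return bytes([0x06, 0x01, flags, 0, 0]) + (esi_label << 4).to_bytes(3, "big")

def is_single_active(ec: bytes) -> bool:
    """Return True when the Single-Active bit of the flags octet is set."""
    return bool(ec[2] & 0x01)
```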

670	7.6 ES-Import Route Target

672	   This is a new transitive Route Target extended community carried with
673	   the Ethernet Segment route. When used, it enables all the PEs
674	   connected to the same multi-homed site to import the Ethernet Segment
675	   routes. The value is derived automatically from the ESI by encoding
676	   the high order 6-octet portion of the 9-octet ESI Value in the ES-
677	   Import Route Target. For ESI types 1, 2, and 3, the high order 6
678	   octets of the ESI Value incorporate a MAC address; encoding them in
679	   this RT and using the RT with the RT Constrain feature enables proper
680	   route-target filtering.  The format of this extended community is as
681	   follows:

683	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
684	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
685	      | Type=0x06     | Sub-Type=0x02 |          ES-Import            |
686	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
687	      |                     ES-Import Cont'd                          |
688	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

690	   This document expands the definition of the Route Target extended
691	   community to allow the value of high order octet (Type field) to be
692	   0x06 (in addition to the values specified in rfc4360). The value of
693	   low order octet (Sub-Type field) of 0x02 indicates that this extended
694	   community is of type "Route Target". The new value for Type field of
695	   0x06 indicates that the structure of this RT is a six-octet value
696	   (e.g., a MAC address). A BGP speaker that implements RT-Constrain
697	   [RFC4684] MUST apply the RT Constraint procedures to the ES-import RT
698	   as well.

700	   For procedures and usage of this attribute, please see section 8.1
701	   "Multi-homed Ethernet Segment Auto-Discovery".
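
   The automatic derivation described in this section can be sketched
   as follows. The sketch assumes that the 10-octet ESI consists of a
   1-octet type followed by the 9-octet ESI Value; the function name is
   hypothetical.

```python
def es_import_rt(esi: bytes) -> bytes:
    """Derive the 8-octet ES-Import RT from a 10-octet ESI."""
    if len(esi) != 10:
        raise ValueError("ESI must be 10 octets")
    esi_value = esi[1:]   # assumed: octet 0 is the ESI type
    # Type 0x06, Sub-Type 0x02, then the high order 6 octets of the value.
    return bytes([0x06, 0x02]) + esi_value[:6]

# Example: an ESI whose value starts with the MAC address 00:aa:bb:cc:dd:ee.
example_esi = bytes([0x01]) + bytes.fromhex("00aabbccddee") + bytes(3)
print(es_import_rt(example_esi).hex())  # 060200aabbccddee
```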

703	7.7 MAC Mobility Extended Community

705	   This extended community is a new transitive extended community with
706	   the Type field of 0x06 and the Sub-Type of 0x00. It may be advertised
707	   along with MAC Advertisement routes. The procedures for using this
708	   Extended Community are described in section 15 "MAC Mobility".

710	   The MAC Mobility Extended Community is encoded as an 8-octet value as
711	   follows:

713	   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
714	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
715	   | Type=0x06     | Sub-Type=0x00 |Flags(1 octet) |  Reserved=0   |
716	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
717	   |                       Sequence Number                         |
718	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

720	   The low order bit of the flags octet is defined as the
721	   "Sticky/static" flag and may be set to 1. A value of 1 means that the
722	   MAC address is static and cannot move. The sequence number is used to
723	   ensure that PEs retain the correct MAC advertisement route when
724	   multiple updates occur for the same MAC address.
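
   A small sketch of this encoding follows, using only the field
   layout shown above. The function names are hypothetical.

```python
import struct

def pack_mac_mobility_ec(sequence: int, sticky: bool = False) -> bytes:
    """Type 0x06, Sub-Type 0x00, Flags, Reserved, 32-bit Sequence Number."""
    flags = 0x01 if sticky else 0x00   # low order bit: Sticky/static
    return struct.pack("!BBBBI", 0x06, 0x00, flags, 0, sequence)

def parse_mac_mobility_ec(ec: bytes):
    """Return (sequence number, sticky flag) from an 8-octet value."""
    ec_type, sub_type, flags, _reserved, sequence = struct.unpack("!BBBBI", ec)
    if (ec_type, sub_type) != (0x06, 0x00):
        raise ValueError("not a MAC Mobility extended community")
    return sequence, bool(flags & 0x01)
```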

726	7.8 Default Gateway Extended Community

728	   The Default Gateway community is an Extended Community of an Opaque
729	   Type (see section 3.3 of rfc4360). It is a transitive community,
730	   which means that the first octet is 0x03. The value of the second
731	   octet (Sub-Type) is 0x0d (Default Gateway) as assigned by IANA. The
732	   Value field of this community is reserved (set to 0 by the senders,
733	   ignored by the receivers).

735	7.9 Route Distinguisher Assignment per EVI

737	   Route Distinguisher (RD) MUST be set to the RD of the EVI that is
738	   advertising the NLRI. An RD MUST be assigned for a given EVI on a PE.
739	   This RD MUST be unique across all EVIs on a PE. It is RECOMMENDED to
740	   use the Type 1 RD [RFC4364]. The value field comprises an IP address
741	   of the PE (typically, the loopback address) followed by a number
742	   unique to the PE.  This number may be generated by the PE. Or in the
743	   Unique VLAN EVPN case, the low order 12 bits may be the 12 bit VLAN
744	   ID, with the remaining high order 4 bits set to 0.
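
   The recommended RD form can be sketched as below, assuming the
   Type 1 layout of [RFC4364]: a 2-octet type of 1, a 4-octet IPv4
   address, and a 2-octet number. The function names are hypothetical.

```python
import ipaddress
import struct

def type1_rd(pe_ip: str, number: int) -> bytes:
    """Build an 8-octet Type 1 RD from a PE IPv4 address and a 16-bit number."""
    return struct.pack("!H4sH", 1, ipaddress.IPv4Address(pe_ip).packed, number)

def rd_for_unique_vlan(pe_ip: str, vlan_id: int) -> bytes:
    """Unique VLAN EVPN case: low 12 bits carry the VLAN ID, high 4 bits are 0."""
    if not 0 <= vlan_id < 4096:
        raise ValueError("VLAN ID must fit in 12 bits")
    return type1_rd(pe_ip, vlan_id & 0x0FFF)

print(rd_for_unique_vlan("192.0.2.1", 100).hex())  # 0001c00002010064
```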

746	7.10 Route Targets

748	   The EVPN route MAY carry one or more Route Target (RT) attributes.
749	   RTs may be configured (as in IP VPNs), or may be derived
750	   automatically.

752	   If a PE uses RT-Constrain, the PE advertises all such RTs using RT
753	   Constraints per [RFC4684]. The use of RT Constrain allows each
754	   Ethernet A-D route to reach only those PEs that are configured to
755	   import at least one RT from the set of RTs carried in the EVPN route.

757	7.10.1 Auto-Derivation from the Ethernet Tag ID

759	   For the "Unique VLAN EVPN" scenario, it is highly desirable to auto-
760	   derive the RT from the Ethernet Tag ID (VLAN ID) for that EVPN
761	   instance. The following is the procedure for performing such auto-
762	   derivation.

764	        +    The Global Administrator field of the RT MUST be set
765	             to the Autonomous System (AS) number that the PE is
766	             associated with.

768	        +    The 12-bit VLAN ID MUST be encoded in the lowest 12 bits of
769	             the Local Administrator field.
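
   The two rules above can be sketched as follows. The sketch assumes
   a 2-octet AS number and the corresponding Route Target layout
   (Type 0x00, Sub-Type 0x02, 2-octet Global Administrator, 4-octet
   Local Administrator), and it sets the remaining Local Administrator
   bits to 0, which the rules above do not specify. The function name
   is hypothetical.

```python
import struct

def auto_derive_rt(as_number: int, vlan_id: int) -> bytes:
    """Derive an 8-octet RT from the AS number and a 12-bit VLAN ID."""
    if not 0 <= as_number <= 0xFFFF:
        raise ValueError("this sketch handles 2-octet AS numbers only")
    if not 0 <= vlan_id < 4096:
        raise ValueError("VLAN ID must fit in 12 bits")
    local_admin = vlan_id & 0x0FFF   # lowest 12 bits; other bits assumed 0
    return struct.pack("!BBHI", 0x00, 0x02, as_number, local_admin)

print(auto_derive_rt(65000, 100).hex())  # 0002fde800000064
```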

771	8. Multi-homing Functions

773	   This section discusses the functions, procedures and associated BGP
774	   routes used to support multi-homing in EVPN. This covers both multi-
775	   homed device (MHD) as well as multi-homed network (MHN) scenarios.

777	8.1 Multi-homed Ethernet Segment Auto-Discovery

779	   PEs connected to the same Ethernet segment can automatically discover
780	   each other with minimal to no configuration through the exchange of
781	   the Ethernet Segment route.

783	8.1.1 Constructing the Ethernet Segment Route

785	   The Route-Distinguisher (RD) MUST be a Type 1 RD [RFC4364]. The value
786	   field comprises an IP address of the PE (typically, the loopback
787	   address) followed by 0's.

789	   The Ethernet Segment Identifier (ESI) MUST be set to the ten octet
790	   value described in section 5.

792	   The BGP advertisement that advertises the Ethernet Segment route MUST
793	   also carry an ES-Import route target, as defined in section 7.6.

795	   The Ethernet Segment Route filtering MUST be done such that the
796	   Ethernet Segment Route is imported only by the PEs that are multi-
797	   homed to the same Ethernet Segment. To that end, each PE that is
798	   connected to a particular Ethernet segment constructs an import
799	   filtering rule to import a route that carries the ES-Import extended
800	   community, constructed from the ESI.

802	8.2 Fast Convergence

804	   In EVPN, MAC address reachability is learnt via the BGP control-plane
805	   over the MPLS network. As such, in the absence of any fast protection
806	   mechanism, the network convergence time is a function of the number
807	   of MAC Advertisement routes that must be withdrawn by the PE
808	   encountering a failure. For highly scaled environments, this scheme
809	   yields slow convergence.

811	   To alleviate this, EVPN defines a mechanism to efficiently and
812	   quickly signal, to remote PE nodes, the need to update their
813	   forwarding tables upon the occurrence of a failure in connectivity to
814	   an Ethernet segment. This is done by having each PE advertise a set
815	   of Ethernet A-D per Ethernet segment (per ES) routes for each locally
816	   attached Ethernet segment (refer to section 8.2.1 below for details
817	   on how this route is constructed). Upon a failure in connectivity to
818	   the attached segment, the PE withdraws the corresponding Ethernet A-D
819	   route. This triggers all PEs that receive the withdrawal to update
820	   their next-hop adjacencies for all MAC addresses associated with the
821	   Ethernet segment in question. If no other PE had advertised an
822	   Ethernet A-D route for the same segment, then the PE that received
823	   the withdrawal simply invalidates the MAC entries for that segment.
824	   Otherwise, the PE updates the next-hop adjacencies to point to the
825	   backup PE(s).
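
   The reaction of a remote PE to such a withdrawal can be modeled as
   in the sketch below. The data structures (a map from ESI to the PEs
   with a live Ethernet A-D per ES route, and a MAC table keyed by MAC
   address) are hypothetical and serve only to illustrate the rule.

```python
def withdraw_ad_per_es(ad_routes, mac_table, esi, withdrawing_pe):
    """ad_routes: esi -> set of PEs; mac_table: mac -> {"esi", "next_hops"}."""
    remaining = ad_routes.get(esi, set())
    remaining.discard(withdrawing_pe)
    for mac in list(mac_table):
        entry = mac_table[mac]
        if entry["esi"] != esi:
            continue                             # other segments are untouched
        if remaining:
            entry["next_hops"] = set(remaining)  # point to the backup PE(s)
        else:
            del mac_table[mac]                   # no PE left: invalidate entry
```

   A single withdrawal therefore updates every MAC entry for the
   segment, independent of how many MAC Advertisement routes exist.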

827	8.2.1 Constructing the Ethernet A-D per Ethernet Segment (ES) Route

829	   This section describes the procedures used to construct the Ethernet
830	   A-D per ES route, which is used for fast convergence (as discussed
831	   above) and for advertising the ESI label used for split-horizon
832	   filtering (as discussed in section 8.3). Support of this route is
833	   REQUIRED.

835	   The Route-Distinguisher (RD) MUST be a Type 1 RD [RFC4364]. The value
836	   field comprises an IP address of the PE (typically, the loopback
837	   address) followed by a number unique to the PE.

839	   The Ethernet Segment Identifier MUST be a ten octet entity as
840	   described in section "Ethernet Segment". The Ethernet A-D route is
841	   not needed when the Segment Identifier is set to 0 (e.g., single-
842	   homed scenarios).

844	   The Ethernet Tag ID MUST be set to MAX-ET.

846	   The MPLS label in the NLRI MUST be set to 0.

848	   The "ESI Label Extended Community" MUST be included in the route. If
849	   All-Active redundancy mode is desired, then the "Single-Active" bit
850	   in the flags of the ESI Label Extended Community MUST be set to 0 and
851	   the MPLS label in that extended community MUST be set to a valid MPLS
852	   label value. The MPLS label in this Extended Community is referred to
853	   as the ESI label and MUST have the same value in each Ethernet A-D
854	   per ES route advertised for the ES. This label MUST be a downstream
855	   assigned MPLS label if the advertising PE is using ingress
856	   replication for receiving multicast, broadcast or unknown unicast
857	   traffic from other PEs. If the advertising PE is using P2MP MPLS LSPs
858	   for sending multicast, broadcast or unknown unicast traffic, then
859	   this label MUST be an upstream assigned MPLS label. The usage of this
860	   label is described in section 8.3.

862	   If Single-Active redundancy mode is desired, then the "Single-Active"
863	   bit in the flags of the ESI Label Extended Community MUST be set to 1
864	   and the ESI label SHOULD be set to a valid MPLS label value.

866	8.2.1.1. Ethernet A-D Route Targets

868	   Each Ethernet A-D per ES route MUST carry one or more Route Target
869	   (RT) attributes. The set of Ethernet A-D routes per ES MUST carry the
870	   entire set of RTs for all the EVPN instances to which the Ethernet
871	   Segment belongs.

873	8.3 Split Horizon

875	   Consider a CE that is multi-homed to two or more PEs on an Ethernet
876	   segment ES1 operating in All-Active redundancy mode. If the CE sends
877	   a broadcast, unknown unicast, or multicast (BUM) packet to one of the
878	   non-Designated Forwarder (non-DF) PEs, say PE1, then PE1 will forward
879	   that packet to all or a subset of the other PEs in that EVPN instance
880	   including the Designated Forwarder (DF) PE for that Ethernet segment.
881	   In this case the DF PE that the CE is multi-homed to MUST drop the
882	   packet and not forward it back to the CE. This filtering is referred
883	   to as "split horizon" filtering in this document.

885	   When a set of PEs operates in Single-Active redundancy mode, the use
886	   of this split-horizon filtering mechanism is highly recommended
887	   because it prevents transient loops at the time of failure or
888	   recovery impacting the Ethernet Segment, e.g., when two PEs both
889	   think they are the DF for that segment before DF election settles.

891	   In order to achieve this split horizon function, every BUM packet
892	   originating from a non-DF PE is encapsulated with an MPLS label that
893	   identifies the Ethernet segment of origin (i.e. the segment from
894	   which the frame entered the EVPN network). This label is referred to
895	   as the ESI label, and MUST be distributed by all PEs when operating
896	   in All-Active redundancy mode using a set of Ethernet A-D per ES
897	   routes per section 8.2.1 above. The ESI label SHOULD be distributed
898	   by all PEs when operating in Single-Active redundancy mode using a
899	   set of Ethernet A-D per ES route. This route is imported by the PEs
900	   connected to the Ethernet Segment and also by the PEs that have at
901	   least one EVPN instance in common with the Ethernet Segment in the
902	   route. As described in section 8.2.1, the route MUST carry an ESI
903	   Label Extended Community with a valid ESI label. The disposition PE
904	   relies on the value of the ESI label to determine whether or not a
905	   BUM frame is allowed to egress a specific Ethernet segment.

907	8.3.1 ESI Label Assignment

909	   The following subsections describe the assignment procedures for the
910	   ESI label, which differ depending on the type of tunnels being used
911	   to deliver multi-destination packets in the EVPN network.

913	8.3.1.1 Ingress Replication

915	   Each PE attached to a given ES that is operating in All-Active or
916	   Single-Active redundancy mode and that uses ingress replication to
917	   receive BUM traffic advertises a downstream assigned ESI label in the
918	   set of Ethernet A-D per ES routes for that ES.  This label MUST be
919	   programmed in the platform label space by the advertising PE and the
920	   forwarding entry for this label must result in NOT forwarding packets
921	   received with this label onto the Ethernet segment for which the
922	   label was distributed.

924	   The rules for the inclusion of the ESI label in a BUM packet by the
925	   ingress PE operating in All-Active redundancy mode are as follows:

927	   A non-DF ingress PE MUST include the ESI label distributed by the DF
928	   egress PE in the copy of a BUM packet sent to it.

930	   An ingress PE (DF or non-DF) SHOULD include the ESI label distributed
931	   by each non-DF egress PE in the copy of a BUM packet sent to it.

933	   The rules for the inclusion of the ESI label in a BUM packet by the
934	   ingress PE operating in Single-Active redundancy mode are as follows:

936	   An ingress DF PE SHOULD include the ESI label distributed by the
937	   egress PE in the copy of a BUM packet sent to it.

939	   In both All-Active and Single-Active redundancy mode, an ingress PE
940	   MUST NOT include an ESI label in the copy of a BUM packet sent to an
941	   egress PE that is not attached to the ES through which the BUM packet
942	   entered the EVI.
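
   The inclusion rules above can be summarized by the following
   sketch. The function and its argument names are hypothetical; it
   returns None for combinations that the rules above do not cover.

```python
def esi_label_rule(mode, ingress_is_df, egress_is_df, egress_on_same_es):
    """Return "MUST", "SHOULD", "MUST NOT" or None for one BUM packet copy."""
    if not egress_on_same_es:
        return "MUST NOT"        # egress PE is not attached to the source ES
    if mode == "all-active":
        if egress_is_df and not ingress_is_df:
            return "MUST"        # non-DF ingress sending to the DF egress
        if not egress_is_df:
            return "SHOULD"      # any ingress sending to a non-DF egress
    elif mode == "single-active" and ingress_is_df:
        return "SHOULD"          # DF ingress in Single-Active mode
    return None                  # not addressed by the rules above
```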

944	   As an example, consider PE1 and PE2 that are multi-homed to CE1 on
945	   ES1 and operating in All-Active multi-homing mode. Further consider
946	   that PE1 is using P2P or MP2P LSPs to send packets to PE2. Consider
947	   that PE1 is the non-DF for VLAN1 and PE2 is the DF for VLAN1, and PE1
948	   receives a BUM packet from CE1 on VLAN1 on ES1. In this scenario, PE2
949	   distributes an Inclusive Multicast Ethernet Tag route for VLAN1
950	   corresponding to an EVPN instance. So, when PE1 sends a BUM packet,
951	   that it receives from CE1, it MUST first push onto the MPLS label
952	   stack the ESI label that PE2 has distributed for ES1. It MUST then
953	   push on the MPLS label distributed by PE2 in the Inclusive Multicast
954	   Ethernet Tag route for VLAN1. The resulting packet is further
955	   encapsulated in the P2P or MP2P LSP label stack required to transmit
956	   the packet to PE2.  When PE2 receives this packet, it determines the
957	   set of ESIs to replicate the packet to from the top MPLS label, after
958	   any P2P or MP2P LSP labels have been removed. If the next label is
959	   the ESI label assigned by PE2 for ES1, then PE2 MUST NOT forward the
960	   packet onto ES1. If the next label is an ESI label which has not been
961	   assigned by PE2, then PE2 MUST drop the packet. It should be noted
962	   that in this scenario, if PE2 receives a BUM packet for VLAN1 from
963	   CE1, then it SHOULD encapsulate the packet with an ESI label received
964	   from PE1 when sending it to PE1 in order to avoid any transient loop
965	   during a failure scenario impacting ES1 (e.g., port or link failure).
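
   The label operations in this example can be sketched as below. A
   label stack is modeled as a list whose first element is the top of
   the stack; the label values, table shapes and function names are
   hypothetical.

```python
def pe1_encapsulate(esi_label, imet_label, transport_labels):
    """Push the ESI label first, then the Inclusive Multicast label,
    then the P2P/MP2P transport labels. Returns the stack, top first."""
    stack = [esi_label]
    stack = [imet_label] + stack
    return list(transport_labels) + stack

def pe2_egress_segments(stack, imet_to_segments, own_esi_labels):
    """stack: labels left after the transport labels were removed.
    own_esi_labels maps ESI labels assigned by this PE to their segment."""
    segments = set(imet_to_segments[stack[0]])   # top label selects the ESIs
    if len(stack) > 1:
        esi_label = stack[1]
        if esi_label not in own_esi_labels:
            return set()                         # unknown ESI label: drop
        segments.discard(own_esi_labels[esi_label])  # split horizon
    return segments
```

   With transport label 100, Inclusive Multicast label 200 and ESI
   label 300 for ES1, PE1 sends the stack [100, 200, 300], and PE2
   floods the packet to its segments for VLAN1 other than ES1.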

967	8.3.1.2. P2MP MPLS LSPs

969	   The non-DF PEs attached to a given ES that is operating in All-Active
970	   redundancy mode and that use P2MP LSPs to send BUM traffic advertise
971	   an upstream assigned ESI label in the set of Ethernet A-D per ES
972	   routes for that ES. This label is upstream assigned by the PE that
973	   advertises the route. This label MUST be programmed by the other PEs,
974	   that are connected to the ESI advertised in the route, in the context
975	   label space for the advertising PE. Further the forwarding entry for
976	   this label must result in NOT forwarding packets received with this
977	   label onto the Ethernet segment that the label was distributed for.
978	   This label MUST also be programmed by the other PEs, that import the
979	   route but are not connected to the ESI advertised in the route, in
980	   the context label space for the advertising PE. Further the
981	   forwarding entry for this label must be a POP with no other
982	   associated action.

984	   The DF PE attached to a given ES that is operating in Single-Active
985	   redundancy mode and that uses P2MP LSPs to send BUM traffic should
986	   advertise an upstream assigned ESI label in the set of Ethernet A-D
987	   per ES routes for that ES, as described in the paragraph above.

989	   As an example, consider PE1 and PE2 that are multi-homed to CE1 on
990	   ES1 and operating in All-Active multi-homing mode. Also consider PE3
991	   belongs to one of the EVPN instances of ES1.  Further, assume that
992	   PE1, which is the non-DF, uses P2MP MPLS LSPs to send BUM packets.
993	   When PE1 sends a BUM packet, that it receives from CE1, it MUST first
994	   push onto the MPLS label stack the ESI label that it has assigned for
995	   the ESI that the packet was received on. The resulting packet is
996	   further encapsulated in the P2MP MPLS label stack necessary to
997	   transmit the packet to the other PEs. Penultimate hop popping MUST be
998	   disabled on the P2MP LSPs used in the MPLS transport infrastructure
999	   for EVPN. When PE2 receives this packet, it de-capsulates the top
1000	   MPLS label and forwards the packet using the context label space
1001	   determined by the top label. If the next label is the ESI label
1002	   assigned by PE1 to ES1, then PE2 MUST NOT forward the packet onto
1003	   ES1. When PE3 receives this packet, it de-capsulates the top MPLS
1004	   label and forwards the packet using the context label space
1005	   determined by the top label. If the next label is the ESI label
1006	   assigned by PE1 to ES1 and PE3 is not connected to ES1, then PE3 MUST
1007	   pop the label and flood the packet over all local ESIs in that EVPN
1008	   instance. It should be noted that when PE2 sends a BUM frame over a
1009	   P2MP LSP, it should encapsulate the frame with an ESI label even
1010	   though it is the DF for that VLAN in order to avoid any transient
1011	   loop during a failure scenario impacting ES1 (e.g., port or link
1012	   failure).

1014	8.4 Aliasing and Backup-Path

1016	   In the case where a CE is multi-homed to multiple PE nodes, using a
1017	   LAG with All-Active redundancy, it is possible that only a single PE
1018	   learns a set of the MAC addresses associated with traffic transmitted
1019	   by the CE. This leads to a situation where remote PE nodes receive
1020	   MAC advertisement routes, for these addresses, from a single PE even
1021	   though multiple PEs are connected to the multi-homed segment. As a
1022	   result, the remote PEs are not able to effectively load-balance
1023	   traffic among the PE nodes connected to the multi-homed Ethernet
1024	   segment. This could be the case, for example, when the PEs perform
1025	   data-plane learning on the access, and the load-balancing function on
1026	   the CE hashes traffic from a given source MAC address to a single PE.
1027	   Another scenario where this occurs is when the PEs rely on control
1028	   plane learning on the access (e.g. using ARP), since ARP traffic will
1029	   be hashed to a single link in the LAG.

1031	   To address this issue, EVPN introduces the concept of 'Aliasing'
1032	   which is the ability of a PE to signal that it has reachability to an
1033	   EVPN instance on a given ES even when it has learnt no MAC addresses
1034	   from that EVI/ES. The Ethernet A-D per EVI route is used for this
1035	   purpose. A remote PE that receives a MAC advertisement route with
1036	   non-reserved ESI SHOULD consider the advertised MAC address to be
1037	   reachable via all PEs that have advertised reachability to that MAC
1038	   address' EVI/ES via the combination of an Ethernet A-D per EVI route
1039	   for that EVI/ES (and Ethernet Tag if applicable) AND Ethernet A-D per
1040	   ES routes for that ES with the 'Single-Active' bit in the flags of
1041	   the ESI Label Extended Community set to 0.

1043	   Note that the Ethernet A-D per EVI route may be received by a remote
1044	   PE before it receives the set of Ethernet A-D per ES routes.

1046	   Therefore, in order to handle corner cases and race conditions, the
1047	   Ethernet A-D per EVI route MUST NOT be used for traffic forwarding by
1048	   a remote PE until it also receives the associated set of Ethernet A-D
1049	   per ES routes.
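
   The resulting selection of next hops at a remote PE can be sketched
   as follows. The route tables are hypothetical simplifications: a
   set of received Ethernet A-D per EVI routes and a map of received
   Ethernet A-D per ES routes to their 'Single-Active' bit.

```python
def aliasing_next_hops(esi, evi, ad_per_evi, ad_per_es):
    """ad_per_evi: set of (pe, esi, evi) tuples.
    ad_per_es: dict (pe, esi) -> True if the Single-Active bit is 1."""
    return {
        pe
        for (pe, route_esi, route_evi) in ad_per_evi
        if route_esi == esi
        and route_evi == evi
        # The per-ES routes must be present, with Single-Active set to 0.
        and ad_per_es.get((pe, esi)) is False
    }
```

   A PE whose Ethernet A-D per ES routes have not yet arrived is left
   out of the result, which matches the restriction stated above.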

1051	   Backup-path is a closely related function, but it is used in Single-
1052	   Active redundancy mode.  In this case a PE also advertises that it
1053	   has reachability to a given EVI/ES using the same combination of
1054	   Ethernet A-D per EVI route and Ethernet A-D per ES route as above,
1055	   but with the 'Single-Active' bit in the flags of the ESI Label
1056	   Extended Community set to 1.  A remote PE that receives a MAC
1057	   advertisement route with non-reserved ESI SHOULD consider the
1058	   advertised MAC address to be reachable via any PE that has advertised
1059	   this combination of Ethernet A-D routes and it SHOULD install a
1060	   backup-path for that MAC address.

1062	8.4.1 Constructing the Ethernet A-D per EVPN Instance (EVI) Route

1064	   This section describes the procedures used to construct the Ethernet
1065	   A-D per EVPN Instance (EVI) route, which is used for aliasing (as
1066	   discussed above). Support of this route is OPTIONAL.

1068	   Route-Distinguisher (RD) MUST be set to the RD of the EVI that is
1069	   advertising the NLRI per section 7.9.

1071	   The Ethernet Segment Identifier MUST be a ten octet entity as
1072	   described in section "Ethernet Segment Identifier". The Ethernet A-D
1073	   route is not needed when the Segment Identifier is set to 0.

1075	   The Ethernet Tag ID is the identifier of an Ethernet Tag on the
1076	   Ethernet segment. This value may be a 12 bit VLAN ID, in which case
1077	   the low order 12 bits are set to the VLAN ID and the high order 20
1078	   bits are set to 0. Or it may be another Ethernet Tag used by the
1079	   EVPN.  It MAY be set to the default Ethernet Tag on the Ethernet
1080	   segment or to the value 0.

1082	   Note that the above allows the Ethernet A-D route to be advertised
1083	   with one of the following granularities:

1085	      + One Ethernet A-D route for a given <ESI, Ethernet Tag ID> tuple
1086	        per EVI. This is applicable when the PE uses MPLS-based
1087	        disposition.

1089	      + One Ethernet A-D route per <ESI, EVI> (where the Ethernet
1090	        Tag ID is set to 0). This is applicable when the PE uses
1091	        MAC-based disposition, or when the PE uses MPLS-based
1092	        disposition when no VLAN translation is required.

1094	   The usage of the MPLS label is described in the section on "Load
1095	   Balancing of Unicast Packets".

1097	   The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
1098	   be set to the IPv4 or IPv6 address of the advertising PE.

1100	   The Ethernet A-D route MUST carry one or more Route Target (RT)
1101	   attributes per section 7.10.

1103	8.5 Designated Forwarder Election

1105	   Consider a CE that is a host or a router that is multi-homed directly
1106	   to more than one PE in an EVPN instance on a given Ethernet segment.
1107	   One or more Ethernet Tags may be configured on the Ethernet segment.
1108	   In this scenario only one of the PEs, referred to as the Designated
1109	   Forwarder (DF), is responsible for certain actions:

1111	        -   Sending multicast and broadcast traffic, on a given Ethernet
1112	            Tag on a particular Ethernet segment, to the CE.

1114	        -   Flooding unknown unicast traffic (i.e. traffic for
1115	            which a PE does not know the destination MAC address),
1116	            on a given Ethernet Tag on a particular Ethernet segment
1117	            to the CE, if the environment requires flooding of
1118	            unknown unicast traffic.

1120	   Note that this behavior, which allows selecting a DF at the
1121	   granularity of <ESI, EVI> for multicast, broadcast and unknown
1122	   unicast traffic, is the default behavior in this specification.

1124	   Note that a CE always sends packets belonging to a specific flow
1125	   using a single link towards a PE. For instance, if the CE is a host
1126	   then, as mentioned earlier, the host treats the multiple links that
1127	   it uses to reach the PEs as a Link Aggregation Group (LAG). The CE
1128	   employs a local hashing function to map traffic flows onto links in
1129	   the LAG.

1131	   If a bridged network is multi-homed to more than one PE in an EVPN
1132	   network via switches, then the support of All-Active redundancy mode
1133	   requires the bridged network to be connected to two or more PEs using
1134	   a LAG.

1136	   If a bridged network does not connect to the PEs using LAG, then only
1137	   one of the links between the switched bridged network and the PEs
1138	   must be the active link for a given EVPN instance. In this case, the
1139	   set of Ethernet A-D per ES routes advertised by each PE MUST have the
1140	   'Single-Active' bit in the flags of the ESI Label Extended Community
1141	   set to 1.

1143	   The default procedure for DF election at the granularity of <ESI,
1144	   EVI> is referred to as "service carving". With service carving, it is
1145	   possible to elect multiple DFs per Ethernet Segment (one per EVI) in
1146	   order to perform load-balancing of multi-destination traffic destined
1147	   to a given Segment. The load-balancing procedures carve up the EVI
1148	   space among the PE nodes evenly, in such a way that every PE is the
1149	   DF for a disjoint set of EVIs. The procedure for service carving is
1150	   as follows:

1152	   1. When a PE discovers the ESI of the attached Ethernet Segment, it
1153	   advertises an Ethernet Segment route with the associated ES-Import
1154	   extended community attribute.

1156	   2. The PE then starts a timer (default value = 3 seconds) to allow
1157	   the reception of Ethernet Segment routes from other PE nodes
1158	   connected to the same Ethernet Segment. This timer value MUST be the
1159	   same across all PEs connected to the same Ethernet Segment.

1161	   3. When the timer expires, each PE builds an ordered list of the IP
1162	   addresses of all the PE nodes connected to the Ethernet Segment
1163	   (including itself), in increasing numeric value. Each IP address in
1164	   this list is extracted from the "Originating Router's IP address"
1165	   field of the advertised Ethernet Segment route. Every PE is then
1166	   given an ordinal indicating its position in the ordered list,
1167	   starting with 0 as the ordinal for the PE with the numerically lowest
1168	   IP address. The ordinals are used to determine which PE node will be
1169	   the DF for a given EVPN instance on the Ethernet Segment using the
1170	   following rule: Assuming a redundancy group of N PE nodes, the PE
1171	   with ordinal i is the DF for an EVPN instance with an associated
1172	   Ethernet Tag value V when (V mod N) = i. In the case where multiple
1173	   Ethernet Tags are associated with a single EVPN instance, then the
1174	   numerically lowest Ethernet Tag value in that EVPN instance MUST be
1175	   used in the modulo function.

1177	   It should be noted that using the "Originating Router's IP address"
1178	   field in the Ethernet Segment route to get the PE IP address needed
1179	   for the ordered list allows a CE to be multi-homed across different
1180	   ASes if such a need ever arises.

1182	   4. The PE that is elected as a DF for a given EVPN instance will
1183	   unblock traffic for the Ethernet Tags associated with that EVPN
1184	   instance. Note that the DF PE unblocks multi-destination traffic in
1185	   the egress direction towards the Segment. All non-DF PEs continue to
1186	   drop multi-destination traffic (for the associated EVPN instances) in
1187	   the egress direction towards the Segment.
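
The service carving rule in step 3 can be sketched as follows. This is a minimal illustration, not a normative implementation: the function name elect_df and its arguments are hypothetical, and the sketch assumes that the Ethernet Segment routes from all PEs on the segment have already been collected when the timer expires.

```python
import ipaddress


def elect_df(pe_addresses, ethernet_tags):
    """Return the IP address of the DF for one EVPN instance on one segment.

    pe_addresses:  addresses taken from the Ethernet Segment routes of all
                   PEs on the segment, including the local PE (hypothetical
                   input; how they are collected is not shown).
    ethernet_tags: Ethernet Tag values associated with the EVPN instance.
    """
    # Order the PEs by increasing numeric value of their IP address.
    ordered = sorted({ipaddress.ip_address(a) for a in pe_addresses}, key=int)
    n = len(ordered)
    # With several Ethernet Tags, the numerically lowest one is used.
    v = min(ethernet_tags)
    # The PE with ordinal i is the DF when (V mod N) = i.
    return str(ordered[v % n])


if __name__ == "__main__":
    pes = ["192.0.2.3", "192.0.2.1", "192.0.2.2"]
    print(elect_df(pes, [100]))
```

Because every PE builds the same ordered list and applies the same modulo rule, each PE reaches the same result independently, provided that all of them have received the same set of Ethernet Segment routes.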

1189	   In the case of link or port failure, the affected PE withdraws its
1190	   Ethernet Segment route. This will re-trigger the service carving
1191	   procedures on all the PEs in the RG. For PE node failure, or upon PE
1192	   commissioning or decommissioning, the PEs re-trigger the service
1193	   carving. In case of a Single-Active multi-homing, when a service
1194	   moves from one PE in the RG to another PE as a result of re-carving,
1195	   the PE, which ends up being the elected DF for the service, SHOULD
1196	   trigger a MAC address flush notification towards the associated
1197	   Ethernet Segment. This can be done, for example, using the IEEE
1198	   802.1ak MVRP 'new' declaration.

1200	8.6. Interoperability with Single-homing PEs

1202	   Let's refer to PEs that only support single-homed CE devices as
1203	   single-homing PEs. For single-homing PEs, all the above multi-homing
1204	   procedures can be omitted; however, to allow for single-homing PEs to
1205	   fully inter-operate with multi-homing PEs, some of the multi-homing
1206	   procedures described above SHOULD be supported even by single-homing
1207	   PEs:

1209	   - procedures related to processing Ethernet A-D route for the purpose
1210	   of Fast Convergence (8.2 Fast Convergence), to let single-homing PEs
1211	   benefit from fast convergence

1213	   - procedures related to processing Ethernet A-D route for the purpose
1214	   of Aliasing (8.4 Aliasing and Backup-path), to let single-homing PEs
1215	   benefit from load balancing

1217	   - procedures related to processing Ethernet A-D route for the purpose
1218	   of Backup-path (8.4 Aliasing and Backup-path), to let single-homing
1219	   PEs benefit from the corresponding convergence improvement

1221	9. Determining Reachability to Unicast MAC Addresses

1223	   PEs forward packets that they receive based on the destination MAC
1224	   address. This implies that PEs must be able to learn how to reach a
1225	   given destination unicast MAC address.

1227	   There are two components to MAC address learning, "local learning"
1228	   and "remote learning":

1230	9.1. Local Learning

1232	   A particular PE must be able to learn the MAC addresses from the CEs
1233	   that are connected to it. This is referred to as local learning.

1235	   The PEs in a particular EVPN instance MUST support local data plane
1236	   learning using standard IEEE Ethernet learning procedures. A PE must
1237	   be capable of learning MAC addresses in the data plane when it
1238	   receives packets such as the following from the CE network:

1240	        - DHCP requests

1242	        - ARP request for its own MAC.

1244	        - ARP request for a peer.

1246	   Alternatively PEs MAY learn the MAC addresses of the CEs in the
1247	   control plane or via management plane integration between the PEs and
1248	   the CEs.

1250	   There are applications where a MAC address that is reachable via a
1251	   given PE on a locally attached Segment (e.g. with ESI X) may move
1252	   such that it becomes reachable via another PE on another Segment
1253	   (e.g. with ESI Y).  This is referred to as "MAC Mobility".
1254	   Procedures to support this are described in section "MAC Mobility".

1256	9.2. Remote learning

1258	   A particular PE must be able to determine how to send traffic to MAC
1259	   addresses that belong to or are behind CEs connected to other PEs
1260	   i.e. to remote CEs or hosts behind remote CEs. We call such MAC
1261	   addresses "remote" MAC addresses.

1263	   This document requires a PE to learn remote MAC addresses in the
1264	   control plane. In order to achieve this, each PE advertises the MAC
1265	   addresses it learns from its locally attached CEs in the control
1266	   plane, to all the other PEs in that EVPN instance, using MP-BGP and
1267	   specifically the MAC Advertisement route.

1269	9.2.1. Constructing the BGP EVPN MAC/IP Address Advertisement

1271	   BGP is extended to advertise these MAC addresses using the MAC/IP
1272	   Advertisement route type in the EVPN NLRI.

1274	   The RD MUST be the RD of the EVI that is advertising the NLRI. The
1275	   procedures for setting the RD for a given EVI are described in
1276	   section 7.9.

1278	   The Ethernet Segment Identifier is set to the ten octet ESI described
1279	   in section "Ethernet Segment".

1281	   The Ethernet Tag ID may be zero or may represent a valid Ethernet Tag
1282	   ID.  This field may be non-zero when there are multiple bridge
1283	   domains in the MAC-VRF (i.e., the PE needs to perform qualified
1284	   learning for the VLANs in that MAC-VRF).

1286	   When the Ethernet Tag ID in the NLRI is set to a non-zero value,
1287	   for a particular bridge domain, then this Ethernet Tag ID may either
1288	   be the CE's Ethernet tag value (e.g., CE VLAN ID) or the EVPN
1289	   provider's Ethernet tag value (e.g., provider VLAN ID). The latter
1290	   would be the case if the CE Ethernet tags (e.g., CE VLAN ID) for a
1291	   particular bridge domain are different on different CEs.

1293	   The MAC address length field is in bits and it is set to 48. MAC
1294	   address length values other than 48 bits are outside the scope of
1295	   this document. The encoding of a MAC address MUST be the 6-octet MAC
1296	   address specified by [802.1D-ORIG] [802.1D-REV].

1298	   The IP Address Field is optional. By default, the IP Address Length
1299	   field is set to 0 and the IP address field is omitted from the route.
1300	   When a valid IP address needs to be advertised, it is then encoded in
1301	   this route. When an IP address is present, the IP Address Length
1302	   field is in bits and it is set to 32 or 128 bits. Other IP Address
1303	   Length values are outside the scope of this document. The encoding of
1304	   an IP address MUST be either 4 octets for IPv4 or 16 octets for IPv6.
1305	   The length field of EVPN NLRI (which is in octets and is described in
1306	   section 7) is sufficient to determine whether an IP address is
1307	   encoded in this route and if so, whether the encoded IP address is
1308	   IPv4 or IPv6.
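
As an illustration of why the NLRI length alone is sufficient, the following sketch maps a length in octets to the address family. The fixed field widths used here (8-octet RD, 10-octet ESI, 4-octet Ethernet Tag ID, 1-octet length fields, 6-octet MAC address, 3-octet MPLS labels) are assumptions about the route layout given in section 7 and should be checked against it. The function name is hypothetical.

```python
# Octets that are always present (assumed layout): RD, ESI, Ethernet Tag ID,
# MAC Address Length, MAC Address, IP Address Length, MPLS label1.
FIXED_OCTETS = 8 + 10 + 4 + 1 + 6 + 1 + 3


def ip_family_from_nlri_length(length):
    """Return (family, has_label2) for a MAC/IP Advertisement NLRI length.

    family is None when no IP address is encoded, else "IPv4" or "IPv6".
    """
    for label2_octets in (0, 3):
        for family, ip_octets in ((None, 0), ("IPv4", 4), ("IPv6", 16)):
            if length == FIXED_OCTETS + ip_octets + label2_octets:
                return family, label2_octets == 3
    raise ValueError("length does not match any supported encoding")


if __name__ == "__main__":
    for octets in (33, 37, 49, 36, 40, 52):
        print(octets, ip_family_from_nlri_length(octets))
```

Under these assumed widths, the six valid combinations give six distinct lengths, so no two encodings can be confused.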

1310	   The MPLS label1 field is encoded as 3 octets, where the high-order 20
1311	   bits contain the label value. The MPLS label1 MUST be downstream
1312	   assigned and it is associated with the MAC address being advertised
1313	   by the advertising PE. The advertising PE uses this label when it
1314	   receives an MPLS-encapsulated packet to perform forwarding based on
1315	   the destination MAC address toward the CE. The forwarding procedures
1316	   are specified in sections 13 and 14.
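
The 3-octet label encoding can be sketched as below. The helper names are hypothetical, and the sketch assumes that the low-order 4 bits of the field are sent as zero and ignored on receipt, which this section does not state.

```python
def encode_label(label):
    """Place a 20-bit label value in the high-order 20 bits of 3 octets."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label does not fit in 20 bits")
    # Assumption: the remaining low-order 4 bits are set to zero.
    return (label << 4).to_bytes(3, "big")


def decode_label(field):
    """Extract the 20-bit label value from a 3-octet label field."""
    if len(field) != 3:
        raise ValueError("label field must be exactly 3 octets")
    return int.from_bytes(field, "big") >> 4


if __name__ == "__main__":
    print(encode_label(16).hex(), decode_label(encode_label(16)))
```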

1318	   A PE may advertise the same single EVPN label for all MAC addresses
1319	   in a given EVI. This label assignment is referred to as a per EVI
1320	   label assignment. Alternatively, a PE may advertise a unique EVPN
1321	   label per <ESI, Ethernet Tag> combination. This label assignment is
1322	   referred to as a per <ESI, Ethernet Tag> label assignment. As a third
1323	   option, a PE may advertise a unique EVPN label per MAC address. This
1324	   label assignment is referred to as a per MAC label assignment. All of
1325	   these label assignment methods have their tradeoffs. The choice of a
1326	   particular label assignment methodology is purely local to the PE
1327	   that originates the route.

1329	   Per EVI label assignment requires the least number of EVPN labels,
1330	   but requires a MAC lookup in addition to an MPLS lookup on an egress
1331	   PE for forwarding. On the other hand, a unique label per <ESI,
1332	   Ethernet Tag> or a unique label per MAC allows an egress PE to
1333	   forward a packet that it receives from another PE, to the connected
1334	   CE, after looking up only the MPLS labels without having to perform a
1335	   MAC lookup. This includes the capability to perform appropriate VLAN
1336	   ID translation on egress to the CE.

1338	   The MPLS label2 field is an optional field and if it is present, then
1339	   it is encoded as 3 octets, where the high-order 20 bits contain the
1340	   label value.

1342	   The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
1343	   be set to the IPv4 or IPv6 address of the advertising PE.

1345	   The BGP advertisement for the MAC advertisement route MUST also carry
1346	   one or more Route Target (RT) attributes.  RTs may be configured (as
1347	   in IP VPNs), or may be derived automatically from the Ethernet Tag
1348	   ID, in the Unique VLAN case, as described in section 7.10.1.

1350	   It is to be noted that this document does not require PEs to create
1351	   forwarding state for remote MACs when they are learnt in the control
1352	   plane. When this forwarding state is actually created is a local
1353	   implementation matter.

1355	9.2.2. Route Resolution

1357	   If the Ethernet Segment Identifier field in a received MAC
1358	   Advertisement route is set to the reserved ESI value of 0 or MAX-ESI,
1359	   then if the receiving PE decides to install forwarding state for the
1360	   associated MAC address, it MUST be based on the MAC Advertisement
1361	   route alone.

1363	   If the Ethernet Segment Identifier field in a received MAC
1364	   Advertisement route is set to a non-reserved ESI, and the receiving
1365	   PE is locally attached to the same ESI, then the PE does not alter
1366	   its forwarding state based on the received route. This ensures that
1367	   local routes are preferred to remote routes.

1369	   If the Ethernet Segment Identifier field in a received MAC
1370	   Advertisement route is set to a non-reserved ESI, then if the
1371	   receiving PE decides to install forwarding state for the associated
1372	   MAC address, it MUST be when both the MAC Advertisement route AND the
1373	   associated set of Ethernet A-D per ES routes have been received. The
1374	   dependency of MAC route installation on Ethernet A-D per ES routes
1375	   ensures that MAC routes don't get accidentally installed during a
1376	   mass withdraw period.

1378	   To illustrate this with an example, consider two PEs (PE1 and PE2)
1379	   connected to a multi-homed Ethernet Segment ES1. All-Active
1380	   redundancy mode is assumed. A given MAC address M1 is learnt by PE1
1381	   but not PE2. On PE3, the following states may arise:

1383	   T1- When the MAC Advertisement Route from PE1 and the set of Ethernet
1384	   A-D per ES routes and Ethernet A-D per EVI routes from PE1 and PE2
1385	   are received, PE3 can forward traffic destined to M1 to both PE1 and
1386	   PE2.

1388	   T2- If after T1, PE1 withdraws its set of Ethernet A-D per ES routes,
1389	   then PE3 forwards traffic destined to M1 to PE2 only.

1391	   T2'- If after T1, PE2 withdraws its set of Ethernet A-D per ES
1392	   routes, then PE3 forwards traffic destined to M1 to PE1 only.

1394	   T2''- If after T1, PE1 withdraws its MAC Advertisement route, then
1395	   PE3 treats traffic to M1 as unknown unicast.

1397	   T3- PE2 also advertises a MAC route for M1 and then PE1 withdraws its
1398	   MAC route for M1.  PE3 continues forwarding traffic destined to M1
1399	   to both PE1 and PE2. In other words, despite M1 withdrawal by PE1,
1400	   PE3 forwards the traffic destined to M1 to both PE1 and PE2. This is
1401	   because a flow from the CE, resulting in M1 traffic getting hashed to
1402	   PE1, can get terminated, resulting in M1 being aged out on PE1;
1403	   however, M1 can still be reachable via both PE1 and PE2.
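
The states in this example can be reproduced with a small sketch of the resolution rule as seen from PE3. It is a simplification under stated assumptions: All-Active mode, Ethernet A-D per ES and per EVI routes folded into a single set per PE, and the locally attached ESI case left out. The function name and argument names are hypothetical.

```python
# The reserved ESI values, written as 10-octet integers: 0 and MAX-ESI.
RESERVED_ESIS = {0, (1 << 80) - 1}


def resolve_next_hops(esi, mac_route_pes, ad_route_pes):
    """Return the remote PEs usable for a MAC; empty means unknown unicast.

    mac_route_pes: PEs whose MAC Advertisement route for the MAC is present.
    ad_route_pes:  PEs whose Ethernet A-D routes for the ESI are present.
    """
    if not mac_route_pes:
        return set()                  # no MAC route at all: unknown unicast
    if esi in RESERVED_ESIS:
        return set(mac_route_pes)     # resolved on the MAC route alone
    # Non-reserved ESI: any PE advertising the segment can be used.
    return set(ad_route_pes)


if __name__ == "__main__":
    es1 = 0x1
    print("T1  ", resolve_next_hops(es1, {"PE1"}, {"PE1", "PE2"}))
    print("T2  ", resolve_next_hops(es1, {"PE1"}, {"PE2"}))
    print("T2' ", resolve_next_hops(es1, {"PE1"}, {"PE1"}))
    print("T2''", resolve_next_hops(es1, set(), {"PE1", "PE2"}))
    print("T3  ", resolve_next_hops(es1, {"PE2"}, {"PE1", "PE2"}))
```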

1405	10. ARP and ND

1407	   The IP address field in the MAC advertisement route may optionally
1408	   carry one of the IP addresses associated with the MAC address. This
1409	   provides an option which can be used to minimize the flooding of ARP
1410	   or Neighbor Discovery (ND) messages over the MPLS network and to
1411	   remote CEs. This option also minimizes ARP (or ND) message processing
1412	   on end-stations/hosts connected to the EVPN network. A PE may learn
1413	   the IP address associated with a MAC address in the control or
1414	   management plane between the CE and the PE. Or, it may learn this
1415	   binding by snooping certain messages to or from a CE. When a PE
1416	   learns the IP address associated with a MAC address, of a locally
1417	   connected CE, it may advertise this address to other PEs by including
1418	   it in the MAC Advertisement route. The IP Address may be an IPv4
1419	   address encoded using four octets, or an IPv6 address encoded using
1420	   sixteen octets. For ARP and ND purposes, the IP Address length field
1421	   MUST be set to 32 for an IPv4 address or to 128 for an IPv6 address.

1423	   If there are multiple IP addresses associated with a MAC address,
1424	   then multiple MAC advertisement routes MUST be generated, one for
1425	   each IP address. For instance, this may be the case when there are
1426	   both an IPv4 and an IPv6 address associated with the same MAC address
1427	   for dual-IP stack scenarios. When the IP address is dissociated from
1428	   the MAC address, then the MAC advertisement route with that
1429	   particular IP address MUST be withdrawn.

1431	   Note that a MAC-only route can be advertised along with but
1432	   independent from MAC/IP route for scenarios where the MAC learning
1433	   over access network/node is done in data-plane and independent from
1434	   ARP snooping that generates MAC/IP route. In such scenarios when the
1435	   ARP entry times out and causes the MAC/IP to be withdrawn, then the
1436	   MAC information will not be lost. In scenarios where host MAC/IP is
1437	   learned via management or control plane, the sender PE may generate
1438	   and advertise only the MAC/IP route. If the receiving PE receives
1439	   both the MAC-only route and the MAC/IP route, then when it receives a
1440	   withdraw message for the MAC/IP route, it MUST delete the
1441	   corresponding entry from the ARP table but not the MAC entry from the
1442	   MAC-VRF table unless it receives a withdraw message for MAC-only
1443	   route.
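
The interaction between MAC-only and MAC/IP routes at a receiving PE can be sketched as follows. The class and method names are hypothetical. The sketch also assumes that a MAC entry remains in the MAC-VRF for as long as at least one route carrying that MAC is still present, which goes slightly beyond what the paragraph above states.

```python
class RemoteMacState:
    """Routes a receiving PE holds from one remote PE (illustrative only)."""

    def __init__(self):
        self.mac_only_routes = set()  # MACs advertised in MAC-only routes
        self.mac_ip_routes = {}       # IP address -> MAC, one route per IP

    def receive_mac_only(self, mac):
        self.mac_only_routes.add(mac)

    def receive_mac_ip(self, mac, ip):
        self.mac_ip_routes[ip] = mac

    def withdraw_mac_ip(self, ip):
        # Removes the ARP binding; the MAC-only route is not touched.
        self.mac_ip_routes.pop(ip, None)

    def withdraw_mac_only(self, mac):
        self.mac_only_routes.discard(mac)

    def arp_table(self):
        return dict(self.mac_ip_routes)

    def mac_vrf(self):
        # Assumption: the MAC stays while any route still carries it.
        return self.mac_only_routes | set(self.mac_ip_routes.values())


if __name__ == "__main__":
    state = RemoteMacState()
    state.receive_mac_only("00:11:22:33:44:55")
    state.receive_mac_ip("00:11:22:33:44:55", "192.0.2.10")
    state.withdraw_mac_ip("192.0.2.10")
    print(state.arp_table(), state.mac_vrf())
```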

1445	   When a PE receives an ARP request for an IP address from a CE, and if
1446	   the PE has the MAC address binding for that IP address, the PE SHOULD
1447	   perform ARP proxy by responding to the ARP request.

1449	10.1. Default Gateway

1451	   When a PE needs to perform inter-subnet forwarding where each subnet
1452	   is represented by a different broadcast domain (e.g., different VLAN)
1453	   the inter-subnet forwarding is performed at layer 3 and the PE that
1454	   performs such function is called the default gateway for the EVPN
1455	   instance. In this case when the PE receives an ARP Request for the IP
1456	   address configured as the default gateway address, the PE originates
1457	   an ARP Reply.

1459	   Each PE that acts as a default gateway for a given EVPN instance MAY
1460	   advertise in the EVPN control plane its default gateway MAC address
1461	   using the MAC/IP advertisement route, and indicates that such route
1462	   is associated with the default gateway.  This is accomplished by
1463	   requiring the route to carry the Default Gateway extended community
1464	   defined in [Section 7.8 Default Gateway Extended Community]. The ESI
1465	   field is set to zero when advertising the MAC route with the Default
1466	   Gateway extended community.

1468	   The IP address field of the MAC/IP advertisement route is set to the
1469	   default GW IP address for that subnet (e.g., EVPN instance). For a
1470	   given subnet (e.g., VLAN or EVPN instance), the default GW IP address
1471	   is the same across all the participant PEs. The inclusion of this IP
1472	   address enables the receiving PE to check its configured default GW
1473	   IP address against the one received in the MAC/IP advertisement route
1474	   for that subnet (or EVPN instance) and if there is a discrepancy,
1475	   then the PE SHOULD notify the operator and log an error message.

1477	   Unless it is known a priori (by means outside of this document) that
1478	   all PEs of a given EVPN instance act as a default gateway for that
1479	   EVPN instance, the MPLS label MUST be set to a valid downstream
1480	   assigned label.

1482	   Furthermore, even if all PEs of a given EVPN instance do act as a
1483	   default gateway for that EVPN instance, but only some, but not all,
1484	   of these PEs have sufficient (routing) information to provide inter-
1485	   subnet routing for all the inter-subnet traffic originated within the
1486	   subnet associated with the EVPN instance, then when such PE
1487	   advertises in the EVPN control plane its default gateway MAC address
1488	   using the MAC advertisement route, and indicates that such route is
1489	   associated with the default gateway, the route MUST carry a valid
1490	   downstream assigned label.

1492	   If all PEs of a given EVPN instance act as a default gateway for that
1493	   EVPN instance, and the same default gateway MAC address is used
1494	   across all gateway devices, then no such advertisement is needed.
1495	   However, if each default gateway uses a different MAC address, then
1496	   each default gateway needs to be aware of other gateways' MAC
1497	   addresses and thus the need for such advertisement. This is called
1498	   MAC address aliasing since a single default GW can be represented by
1499	   multiple MAC addresses.

1501	   Each PE that receives this route and imports it as per procedures
1502	   specified in this document follows the procedures in this section
1503	   when replying to ARP Requests that it receives.

1505	   Each PE that acts as a default gateway for a given EVPN instance that
1506	   receives this route and imports it as per procedures specified in
1507	   this document MUST create MAC forwarding state that enables it to
1508	   apply IP forwarding to the packets destined to the MAC address
1509	   carried in the route.

1511	11. Handling of Multi-Destination Traffic

1513	   Procedures are required for a given PE to send broadcast or multicast
1514	   traffic, received from a CE encapsulated in a given Ethernet Tag
1515	   (VLAN) in an EVPN instance, to all the other PEs that span that
1516	   Ethernet Tag (VLAN) in that EVPN instance. In certain scenarios,
1517	   described in section "Processing of Unknown Unicast Packets", a given
1518	   PE may also need to flood unknown unicast traffic to other PEs.

1520	   The PEs in a particular EVPN instance may use ingress replication,
1521	   P2MP LSPs or MP2MP LSPs to send unknown unicast, broadcast or
1522	   multicast traffic to other PEs.

1524	   Each PE MUST advertise an "Inclusive Multicast Ethernet Tag Route" to
1525	   enable the above. The following subsection provides the procedures to
1526	   construct the Inclusive Multicast Ethernet Tag route. Subsequent
1527	   subsections describe in further detail its usage.

1529	11.1. Construction of the Inclusive Multicast Ethernet Tag Route

1531	   The RD MUST be the RD of the EVI that is advertising the NLRI. The
1532	   procedures for setting the RD for a given EVPN instance on a PE are
1533	   described in section 7.9.

1535	   The Ethernet Tag ID is the identifier of the Ethernet Tag. It may be
1536	   set to 0 or to a valid Ethernet Tag value.

1538	   The Originating Router's IP address MUST be set to an IP address of
1539	   the PE.  This address SHOULD be common for all the EVIs on the PE
1540	   (e.g., this address may be the PE's loopback address). The IP Address
1541	   Length field is in bits.

1543	   The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
1544	   be set to the same IP address as the one carried in the Originating
1545	   Router's IP Address field.

1547	   The BGP advertisement for the Inclusive Multicast Ethernet Tag route
1548	   MUST also carry one or more Route Target (RT) attributes. The
1549	   assignment of RTs described in the section 7.10 MUST be followed.

1551	11.2. P-Tunnel Identification

1553	   In order to identify the P-Tunnel used for sending broadcast, unknown
1554	   unicast or multicast traffic, the Inclusive Multicast Ethernet Tag
1555	   route MUST carry a "PMSI Tunnel Attribute" as specified in [BGP
1556	   MVPN].

1558	   Depending on the technology used for the P-tunnel for the EVPN
1559	   instance on the PE, the PMSI Tunnel attribute of the Inclusive
1560	   Multicast Ethernet Tag route is constructed as follows.

1562	        + If the PE that originates the advertisement uses a
1563	          P-Multicast tree for the P-tunnel for EVPN, the PMSI
1564	          Tunnel attribute MUST contain the identity of the tree
1565	          (note that the PE could create the identity of the
1566	          tree prior to the actual instantiation of the tree).

1568	        + A PE that uses a P-Multicast tree for the P-tunnel MAY
1569	          aggregate two or more EVPN instances (EVIs) present
1570	          on the PE onto the same tree. In this case, in addition
1571	          to carrying the identity of the tree, the PMSI Tunnel
1572	          attribute MUST carry an MPLS upstream assigned label which
1573	          the PE has bound uniquely to the EVI associated with this
1574	          update (as determined by its RTs).

1576	          If the PE has already advertised Inclusive Multicast
1577	          Ethernet Tag routes for two or more EVIs that it now
1578	          desires to aggregate, then the PE MUST re-advertise
1579	          those routes. The re-advertised routes MUST be the same
1580	          as the original ones, except for the PMSI Tunnel attribute
1581	          and the label carried in that attribute.

1583	        + If the PE that originates the advertisement uses ingress
1584	          replication for the P-tunnel for EVPN, the route MUST
1585	          include the PMSI Tunnel attribute with the Tunnel Type set to
1586	          Ingress Replication and Tunnel Identifier set to a routable
1587	          address of the PE. The PMSI Tunnel attribute MUST carry a
1588	          downstream assigned MPLS label. This label is used to
1589	          demultiplex the broadcast, multicast or unknown unicast EVPN
1590	          traffic received over a MP2P tunnel by the PE.

1592	        + The Leaf Information Required flag of the PMSI Tunnel
1593	          attribute MUST be set to zero, and MUST be ignored on receipt.

1595	12. Processing of Unknown Unicast Packets

1597	   The procedures in this document do not require the PEs to flood
1598	   unknown unicast traffic to other PEs. If PEs learn CE MAC addresses
1599	   via a control plane protocol, the PEs can then distribute MAC
1600	   addresses via BGP, and all unicast MAC addresses will be learnt prior
1601	   to traffic to those destinations.

1603	   However, if a destination MAC address of a received packet is not
1604	   known by the PE, the PE may have to flood the packet. When flooding,
1605	   one must take into account "split horizon forwarding" as follows: The
1606	   principles behind the following procedures are borrowed from the
1607	   split horizon forwarding rules in VPLS solutions [RFC4761] and
1608	   [RFC4762].  When a PE capable of flooding (say PEx) receives a frame
1609	   with an unknown destination MAC address, it floods it. If the frame
1610	   arrived from an attached CE, PEx must send a copy of that frame on
1611	   every Ethernet Segment (belonging to that EVI) for which it is the
1612	   DF, other than the Ethernet Segment on which it received the frame.
1613	   In addition, the PE must flood the frame to all other PEs
1614	   participating in that EVPN instance. If, on the other hand, the frame
1615	   arrived from another PE (say PEy), PEx must send a copy of the packet
1616	   on each Ethernet Segment (belonging to that EVI) for which it is the
1617	   DF. PEx MUST NOT send the frame to other PEs, since PEy would have
1618	   already done so. Split horizon forwarding rules apply to unknown MAC
1619	   addresses.
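
The split horizon rules above can be summarized in a short sketch. The function and argument names are hypothetical; df_segments stands for the Ethernet Segments of the EVI for which PEx is currently the DF.

```python
def flood_targets(source, df_segments, remote_pes, ingress_segment=None):
    """Return (segments, pes) to which PEx copies an unknown unicast frame.

    source:          "ce" if the frame came from an attached CE, "pe" if it
                     came from another PE.
    df_segments:     Ethernet Segments of the EVI for which PEx is the DF.
    remote_pes:      other PEs participating in the EVPN instance.
    ingress_segment: segment the frame arrived on, when source is "ce".
    """
    if source == "ce":
        # Every DF segment except the arrival segment, plus all other PEs.
        segments = {s for s in df_segments if s != ingress_segment}
        return segments, set(remote_pes)
    if source == "pe":
        # Local DF segments only; never back towards other PEs.
        return set(df_segments), set()
    raise ValueError("source must be 'ce' or 'pe'")


if __name__ == "__main__":
    print(flood_targets("ce", {"ES1", "ES2"}, {"PE2", "PE3"}, "ES1"))
    print(flood_targets("pe", {"ES1", "ES2"}, {"PE2", "PE3"}))
```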

1621	   Whether or not to flood packets to unknown destination MAC addresses
1622	   should be an administrative choice, depending on how learning happens
1623	   between CEs and PEs.

1625	   The PEs in a particular EVPN instance may use ingress replication
1626	   using RSVP-TE P2P LSPs or LDP MP2P LSPs for sending unknown unicast
1627	   traffic to other PEs. Or they may use RSVP-TE P2MP or LDP P2MP for
1628	   sending such traffic to other PEs.

1630	12.1. Ingress Replication

1632	   If ingress replication is in use, the P-Tunnel attribute, carried in
1633	   the Inclusive Multicast Ethernet Tag routes for the EVPN instance,
1634	   specifies the downstream label that the other PEs can use to send
1635	   unknown unicast, multicast or broadcast traffic for that EVPN
1636	   instance to this particular PE.

1638	   The PE that receives a packet with this particular MPLS label MUST
1639	   treat the packet as a broadcast, multicast or unknown unicast packet.
1640	   Further if the MAC address is a unicast MAC address, the PE MUST
1641	   treat the packet as an unknown unicast packet.

1643	12.2. P2MP MPLS LSPs

1645	   The procedures for using P2MP LSPs are very similar to VPLS
1646	   procedures [VPLS-MCAST]. The P-Tunnel attribute used by a PE for
1647	   sending unknown unicast, broadcast or multicast traffic for a
1648	   particular EVPN instance is advertised in the Inclusive Multicast
1649	   Ethernet Tag route as described in section "Handling of Multi-
1650	   Destination Traffic".

1652	   The P-Tunnel attribute specifies the P2MP LSP identifier. This is the
1653	   equivalent of an Inclusive tree in [VPLS-MCAST]. Note that multiple
1654	   Ethernet Tags, which may be in different EVPN instances, may use the
1655	   same P2MP LSP, using upstream labels [VPLS-MCAST]. This is the
1656	   equivalent of an Aggregate Inclusive tree in [VPLS-MCAST]. When P2MP
1657	   LSPs are used for flooding unknown unicast traffic, packet re-
1658	   ordering is possible.

1660	   The PE that receives a packet on the P2MP LSP specified in the PMSI
1661	   Tunnel Attribute MUST treat the packet as a broadcast, multicast or
1662	   unknown unicast packet. Further if the MAC address is a unicast MAC
1663	   address, the PE MUST treat the packet as an unknown unicast packet.

1665	13. Forwarding Unicast Packets
1666	   This section describes procedures for forwarding unicast packets by
1667	   PEs, where such packets are received from either directly connected
1668	   CEs, or from some other PEs.

1670	13.1. Forwarding packets received from a CE

1672	   When a PE receives a packet from a CE, on a given Ethernet Tag ID, it
1673	   must first look up the source MAC address of the packet. In certain
1674	   environments that enable MAC security, the source MAC address MAY be
1675	   used to validate the host identity and determine that traffic from
1676	   the host can be allowed into the network. Source MAC lookup MAY also
1677	   be used for local MAC address learning.

1679	   If the PE decides to forward the packet, the destination MAC address
1680	   of the packet must be looked up. If the PE has received MAC address
1681	   advertisements for this destination MAC address from one or more
1682	   other PEs or learned it from locally connected CEs, it is considered
1683	   as a known MAC address. Otherwise, the MAC address is considered as
1684	   an unknown MAC address.

1686	   For known MAC addresses the PE forwards this packet to one of the
1687	   remote PEs or to a locally attached CE. When forwarding to a remote
1688	   PE, the packet is encapsulated in the EVPN MPLS label advertised by
1689	   the remote PE, for that MAC address, and in the MPLS LSP label stack
1690	   to reach the remote PE.

1692	   If the MAC address is unknown and if the administrative policy on the
1693	   PE requires flooding of unknown unicast traffic then:

1695	   - The PE MUST flood the packet to other PEs. The PE MUST first
1696	   encapsulate the packet in the ESI MPLS label as described in section
1697	   8.3. If ingress replication is used, the packet MUST be replicated to
1698	   each remote PE with the VPN label being an MPLS label determined as
1699	   follows: This is the MPLS label advertised by the remote PE in a PMSI
1700	   Tunnel Attribute in the Inclusive Multicast Ethernet Tag route for an
1701	   <EVPN instance, Ethernet Tag> combination. The Ethernet Tag in the
1702	   route may be the same as the Ethernet Tag associated with the
1703	   interface on which the ingress PE receives the packet. If P2MP LSPs
1704	   are being used the packet MUST be sent on the P2MP LSP that the PE is
1705	   the root of for the Ethernet Tag in the EVPN instance. If the same
1706	   P2MP LSP is used for all Ethernet Tags, then all the PEs in the EVPN
1707	   instance MUST be the leaves of the P2MP LSP. If a distinct P2MP LSP
1708	   is used for a given Ethernet Tag in the EVPN instance, then only the
1709	   PEs in the Ethernet Tag MUST be the leaves of the P2MP LSP. The
1710	   packet MUST be encapsulated in the P2MP LSP label stack.

1712	   If the MAC address is unknown and the administrative policy on the
1713	   PE does not allow flooding of unknown unicast traffic then:

1715	   - The PE MUST drop the packet.
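
   As an illustration only, the decision described in this section can
   be sketched in Python as follows. The names MacEntry and
   forward_from_ce, and the flat dictionary used as a MAC table, are
   hypothetical and not defined by this document. Label stack
   construction and the ESI label are left out of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class MacEntry:
    """Location of a known MAC address (hypothetical structure)."""
    local_port: Optional[str] = None   # set if learned from a local CE
    remote_pe: Optional[str] = None    # set if learned from a MAC advertisement
    evpn_label: Optional[int] = None   # EVPN MPLS label advertised by remote_pe


def forward_from_ce(dst_mac: str,
                    mac_table: Dict[str, MacEntry],
                    flood_unknown_unicast: bool,
                    remote_pes: List[str]) -> Tuple[str, list]:
    """Return (action, targets) for a unicast packet received from a CE."""
    entry = mac_table.get(dst_mac)
    if entry is not None:
        # Known MAC: forward to the local CE or to the advertising PE.
        if entry.local_port is not None:
            return ("forward_local", [entry.local_port])
        return ("forward_remote", [(entry.remote_pe, entry.evpn_label)])
    if flood_unknown_unicast:
        # Unknown MAC and policy requires flooding.
        return ("flood", list(remote_pes))
    # Unknown MAC and flooding is not allowed.
    return ("drop", [])
```

   A caller would consult the returned action to decide whether to
   encapsulate, replicate, or discard the packet.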

1717	13.2. Forwarding packets received from a remote PE

1719	   This section describes the procedures for forwarding known and
1720	   unknown unicast packets received from a remote PE.

1722	13.2.1. Unknown Unicast Forwarding

1724	   When a PE receives an MPLS packet from a remote PE then, after
1725	   processing the MPLS label stack, if the top MPLS label ends up being
1726	   a P2MP LSP label associated with an EVPN instance or in case of
1727	   ingress replication the downstream label advertised in the P-Tunnel
1728	   attribute, and after performing the split horizon procedures
1729	   described in section 8.3:

1731	   - If the PE is the designated forwarder of BUM traffic on a
1732	   particular set of ESIs for the Ethernet Tag, the default behavior is
1733	   for the PE to flood the packet on these ESIs. In other words, the
1734	   default behavior is for the PE to assume that for BUM traffic, it is
1735	   not required to perform a destination MAC address lookup. As an
1736	   option, the PE may perform a destination MAC lookup to flood the
1737	   packet to only a subset of the CE interfaces in the Ethernet Tag. For
1738	   instance the PE may decide to not flood a BUM packet on certain
1739	   Ethernet segments even if it is the DF on the Ethernet segment, based
1740	   on administrative policy.

1742	   - If the PE is not the designated forwarder on any of the ESIs for
1743	   the Ethernet Tag, the default behavior is for it to drop the packet.
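
   The default behavior above amounts to a filter over the local
   Ethernet segments. The following Python sketch is illustrative;
   the function name bum_flood_targets and the policy_blocked argument
   are assumptions made for the example, and split horizon filtering is
   assumed to have been applied already.

```python
from typing import Iterable, List, Optional, Set


def bum_flood_targets(local_esis: Iterable[str],
                      df_esis: Set[str],
                      policy_blocked: Optional[Set[str]] = None) -> List[str]:
    """ESIs on which a BUM packet received from a remote PE is flooded.

    The packet is flooded only on ESIs for which this PE is the DF.
    policy_blocked models the optional administrative policy that
    suppresses flooding on some segments. An empty result means drop.
    """
    blocked = policy_blocked or set()
    return [esi for esi in local_esis
            if esi in df_esis and esi not in blocked]
```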

1745	13.2.2. Known Unicast Forwarding

1747	   If the top MPLS label ends up being an EVPN label that was advertised
1748	   in the unicast MAC advertisements, then the PE either forwards the
1749	   packet based on CE next-hop forwarding information associated with
1750	   the label or does a destination MAC address lookup to forward the
1751	   packet to a CE.

1753	14. Load Balancing of Unicast Frames

1755	   This section specifies the load balancing procedures for sending
1756	   known unicast frames to a multi-homed CE.

1758	14.1. Load balancing of traffic from a PE to remote CEs

1760	   Whenever a remote PE imports a MAC advertisement for a given <ESI,
1761	   Ethernet Tag> in an EVI, it MUST examine all imported Ethernet A-D
1762	   routes for that ESI in order to determine the load-balancing
1763	   characteristics of the Ethernet segment.

1765	14.1.1 Single-Active Redundancy Mode

1767	   For a given ES, if the remote PE has imported the set of Ethernet A-D
1768	   per ES routes from at least one PE, where the "Single-Active" flag in
1769	   the ESI Label Extended Community is set, then the remote PE MUST
1770	   deduce that the ES is operating in Single-Active redundancy mode. As
1771	   such, the MAC address will be reachable only via the PE announcing
1772	   the associated MAC Advertisement route - this is referred to as the
1773	   primary PE. The other PEs advertising the set of Ethernet A-D per ES
1774	   routes for the same ES provide backup paths for that ES, in case the
1775	   primary PE encounters a failure, and are referred to as backup PEs.
1776	   It should be noted that the primary PE for a given <ES, EVI> is the
1777	   DF for that <ES, EVI>.

1779	   If the primary PE encounters a failure, it MAY withdraw its set of
1780	   Ethernet A-D per ES routes for the affected ES prior to withdrawing
1781	   its set of MAC Advertisement routes.

1783	   If there is only one backup PE for a given ES, the remote PE MAY use
1784	   the primary PE's withdrawal of its set of Ethernet A-D per ES routes
1785	   as a trigger to update its forwarding entries, for the associated MAC
1786	   addresses, to point towards the backup PE. As the backup PE starts
1787	   learning the MAC addresses over its attached ES, it will start
1788	   sending MAC Advertisement routes while the failed PE withdraws its
1789	   routes. This mechanism minimizes the flooding of traffic during fail-
1790	   over events.

1792	   If there is more than one backup PE for a given ES, the remote PE
1793	   MUST use the primary PE's withdrawal of its set of Ethernet A-D per
1794	   ES routes as a trigger to start flooding traffic for the associated
1795	   MAC addresses (as long as flooding of unknown unicast is
1796	   administratively allowed), as it is not possible to select a single
1797	   backup PE.
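
   The two fail-over cases above can be summarized with a short Python
   sketch. The function name and the returned action strings are
   hypothetical. The final branch covers a case that the text above
   does not address and is marked as an assumption.

```python
from typing import List, Optional, Tuple


def on_primary_withdrawal(backup_pes: List[str],
                          flooding_allowed: bool) -> Tuple[str, Optional[str]]:
    """Reaction of a remote PE when the primary PE withdraws its set of
    Ethernet A-D per ES routes for a Single-Active ES.

    Returns (action, next_hop_pe); next_hop_pe is set only for "repoint".
    """
    if len(backup_pes) == 1:
        # MAY: point the affected MAC entries at the only backup PE.
        return ("repoint", backup_pes[0])
    if len(backup_pes) > 1 and flooding_allowed:
        # MUST: a single backup PE cannot be selected, so flood.
        return ("flood", None)
    # Not covered by the text above (assumption): no special action.
    return ("none", None)
```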

1799	14.1.2 All-Active Redundancy Mode

1801	   For a given ES, if the remote PE has imported the set of Ethernet A-D
1802	   per ES routes from one or more PEs and none of them have the "Single-
1803	   Active" flag in the ESI Label Extended Community set, then the remote
1804	   PE MUST deduce that the ES is operating in All-Active redundancy
1805	   mode.  A remote PE that receives a MAC advertisement route with non-
1806	   reserved ESI SHOULD consider the advertised MAC address to be
1807	   reachable via all PEs that have advertised reachability to that MAC
1808	   address' EVI/ES via the combination of an Ethernet A-D per EVI route
1809	   for that EVI/ES (and Ethernet Tag if applicable) AND an Ethernet A-D
1810	   per ES route for that ES.  The remote PE MUST use received MAC
1811	   Advertisement routes and Ethernet A-D per EVI/per ES routes to
1812	   construct the set of next-hops for the advertised MAC address.

1814	   Each next-hop comprises an MPLS label stack that is to be used by the
1815	   egress PE to forward the packet. This label stack is determined as
1816	   follows:

1818	   - If the next-hop is constructed as a result of a MAC route then this
1819	   label stack MUST be used. However, if the MAC route doesn't exist for
1820	   that PE, then the next-hop and MPLS label stack are constructed as a
1821	   result of the Ethernet A-D routes. Note that the following
1822	   description applies to determining the label stack for a particular
1823	   next-hop to reach a given PE, from which the remote PE has received
1824	   and imported Ethernet A-D routes that have the matching ESI and
1825	   Ethernet Tag as the one present in the MAC advertisement. The
1826	   Ethernet A-D routes mentioned in the following description refer to
1827	   the ones imported from this given PE.

1829	   - If a set of Ethernet A-D per ES routes for that ES AND an Ethernet
1830	   A-D route per EVI exist, only then the label from that latter route
1831	   must be used.

1833	   The following example explains the above.

1835	   Consider a CE (CE1) that is dual-homed to two PEs (PE1 and PE2) on a
1836	   LAG interface (ES1), and is sending packets with source MAC address
1837	   MAC1 on VLAN1 (mapped to EVI1). A remote PE, say PE3, is able to
1838	   learn that MAC1 is reachable via PE1 and PE2. Both PE1 and PE2 may
1839	   advertise MAC1 in BGP if they receive packets with MAC1 from CE1. If
1840	   this is not the case, and if MAC1 is advertised only by PE1, PE3
1841	   still considers MAC1 as reachable via both PE1 and PE2 as both PE1
1842	   and PE2 advertise a set of Ethernet A-D per ES routes for ES1 as well
1843	   as an Ethernet A-D per EVI route for <EVI1, ES1>.

1845	   The MPLS label stack to send the packets to PE1 is the MPLS LSP stack
1846	   to get to PE1 (at top of the stack) followed by the EVPN label
1847	   advertised by PE1 for CE1's MAC.

1849	   The MPLS label stack to send packets to PE2 is the MPLS LSP stack to
1850	   get to PE2 (at top of the stack) followed by the MPLS label in the
1851	   Ethernet A-D route advertised by PE2 for <ES1, VLAN1>, if PE2 has not
1852	   advertised MAC1 in BGP.

1854	   We will refer to these label stacks as MPLS next-hops.

1856	   The remote PE (PE3) can now load balance the traffic it receives from
1857	   its CEs, destined for CE1, between PE1 and PE2.  PE3 may use N-Tuple
1858	   flow information to hash traffic into one of the MPLS next-hops for
1859	   load balancing of IP traffic. Alternatively PE3 may rely on the
1860	   source MAC addresses for load balancing.

1862	   Note that once PE3 decides to send a particular packet to PE1 or PE2
1863	   it can pick one out of multiple possible paths to reach the
1864	   particular remote PE using regular MPLS procedures. For instance, if
1865	   the tunneling technology is based on RSVP-TE LSPs, and PE3 decides to
1866	   send a particular packet to PE1, then PE3 can choose from multiple
1867	   RSVP-TE LSPs that have PE1 as their destination.

1869	   When PE1 or PE2 receives the packet destined for CE1 from PE3, if the
1870	   packet is a known unicast packet, it is forwarded to CE1.  If it is a
1871	   BUM packet then only one of PE1 or PE2 must forward the packet to the
1872	   CE. Which of PE1 or PE2 forwards this packet to the CE is determined
1873	   based on which of the two is the DF.
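
   The construction of MPLS next-hops in All-Active mode and the
   hash-based selection among them can be sketched in Python as shown
   below. This is a simplified illustration: the function names are
   hypothetical, only the EVPN label of each next-hop is modeled (the
   MPLS LSP stack is omitted), a PE that advertised the MAC route is
   assumed to be a valid next-hop, and CRC-32 is used merely as an
   example hash.

```python
import zlib
from typing import Dict, List, Set, Tuple

NextHop = Tuple[str, int]  # (egress PE, EVPN label); LSP stack omitted


def build_next_hops(mac_labels: Dict[str, int],
                    per_es_pes: Set[str],
                    per_evi_labels: Dict[str, int]) -> List[NextHop]:
    """Build the next-hop set for one MAC address on an All-Active ES.

    mac_labels:     PE -> label from that PE's MAC Advertisement route
    per_es_pes:     PEs with an imported Ethernet A-D per ES route
    per_evi_labels: PE -> label from that PE's Ethernet A-D per EVI route
    """
    hops: Dict[str, int] = {}
    # Without a MAC route, a PE qualifies only if BOTH A-D routes exist;
    # the label then comes from the per EVI route.
    for pe, label in per_evi_labels.items():
        if pe in per_es_pes:
            hops[pe] = label
    # A label learned from a MAC route takes precedence.
    hops.update(mac_labels)
    return sorted(hops.items())


def pick_next_hop(hops: List[NextHop], flow: Tuple) -> NextHop:
    """Hash flow information (for example an N-tuple) onto one next-hop."""
    key = repr(flow).encode()
    return hops[zlib.crc32(key) % len(hops)]
```

   With the example above, PE3 would obtain one next-hop for PE1 using
   the label from the MAC route and one for PE2 using the label from
   the Ethernet A-D per EVI route.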

1875	14.2. Load balancing of traffic between a PE and a local CE

1877	   A CE may be configured with more than one interface connected to
1878	   different PEs or the same PE for load balancing, using a technology
1879	   such as LAG. The PE(s) and the CE can load balance traffic onto these
1880	   interfaces using one of the following mechanisms.

1882	14.2.1. Data plane learning

1884	   Consider that the PEs perform data plane learning for local MAC
1885	   addresses learned from local CEs. This enables the PE(s) to learn a
1886	   particular MAC address and associate it with one or more interfaces,
1887	   if the technology between the PE and the CE supports multi-pathing.
1888	   The PEs can now load balance traffic destined to that MAC address on
1889	   the multiple interfaces.

1891	   Whether the CE can load balance traffic that it generates on the
1892	   multiple interfaces is dependent on the CE implementation.

1894	14.2.2. Control plane learning

1896	   The CE can be a host that advertises the same MAC address using a
1897	   control protocol on all interfaces. This enables the PE(s) to learn
1898	   the host's MAC address and associate it with all interfaces. The PEs
1899	   can now load balance traffic destined to the host on all these
1900	   interfaces. The host can also load balance the traffic it generates
1901	   onto these interfaces and the PE that receives the traffic employs
1902	   EVPN forwarding procedures to forward the traffic.

1904	15. MAC Mobility

1905	   It is possible for a given host or end-station (as defined by its MAC
1906	   address) to move from one Ethernet segment to another;  this is
1907	   referred to as 'MAC Mobility' or 'MAC move' and it is different from
1908	   the multi-homing situation in which a given MAC address is reachable
1909	   via multiple PEs for the same Ethernet segment.  In a MAC move, there
1910	   would be two sets of MAC Advertisement routes, one set with the new
1911	   Ethernet segment and one set with the previous Ethernet segment, and
1912	   the MAC address would appear to be reachable via each of these
1913	   segments.

1915	   In order to allow all of the PEs in the EVPN instance to correctly
1916	   determine the current location of the MAC address, all advertisements
1917	   of it being reachable via the previous Ethernet segment MUST be
1918	   withdrawn by the PEs, for the previous Ethernet segment, that had
1919	   advertised it.

1921	   If local learning is performed using the data plane, these PEs will
1922	   not be able to detect that the MAC address has moved to another
1923	   Ethernet segment and the receipt of MAC Advertisement routes, with
1924	   the MAC Mobility extended community attribute, from other PEs serves
1925	   as the trigger for these PEs to withdraw their advertisements.  If
1926	   local learning is performed using the control or management planes,
1927	   these interactions serve as the trigger for these PEs to withdraw
1928	   their advertisements.

1930	   In a situation where there are multiple moves of a given MAC,
1931	   possibly between the same two Ethernet segments, there may be
1932	   multiple withdrawals and re-advertisements.  In order to ensure that
1933	   all PEs in the EVPN instance receive all of these correctly through
1934	   the intervening BGP infrastructure, it is necessary to introduce a
1935	   sequence number into the MAC Mobility extended community attribute.

1937	   An implementation MUST handle sequence number wrap-around so that
1938	   mobility events are processed correctly.

1940	   Every MAC mobility event for a given MAC address will contain a
1941	   sequence number that is set using the following rules:

1943	   - A PE advertising a MAC address for the first time advertises it
1944	   with no MAC Mobility extended community attribute.

1946	   - A PE detecting a locally attached MAC address for which it had
1947	   previously received a MAC Advertisement route with a different
1948	   Ethernet segment identifier advertises the MAC address in a MAC
1949	   Advertisement route tagged with a MAC Mobility extended community
1950	   attribute with a sequence number one greater than the sequence number
1951	   in the MAC mobility attribute of the received MAC Advertisement
1952	   route. In the case of the first mobility event for a given MAC
1953	   address, where the received MAC Advertisement route does not carry a
1954	   MAC Mobility attribute, the value of the sequence number in the
1955	   received route is assumed to be 0 for purpose of this processing.

1957	   - A PE detecting a locally attached MAC address for which it had
1958	   previously received a MAC Advertisement route with the same non-zero
1959	   Ethernet segment identifier advertises it with:
1960	      i.  no MAC Mobility extended community attribute, if the received
1961	      route did not carry said attribute.

1963	      ii. a MAC Mobility extended community attribute with the sequence
1964	      number equal to the highest of the sequence number(s) in the
1965	      received MAC Advertisement route(s), if the received route(s) is
1966	      (are) tagged with a MAC Mobility extended community attribute.

1968	   - A PE detecting a locally attached MAC address for which it had
1969	   previously received a MAC Advertisement route with the same zero
1970	   Ethernet segment identifier (single-homed scenarios) advertises it
1971	   with a MAC Mobility extended community attribute with the sequence
1972	   number set properly. In case of single-homed scenarios, there is no
1973	   need for ESI comparison. The reason ESI comparison is done for
1974	   multi-homing is to prevent false detection of MAC move among the
1975	   PEs attached to the same multi-homed site.

1977	   A PE receiving a MAC Advertisement route for a MAC address with a
1978	   different Ethernet segment identifier and a higher sequence number
1979	   than that which it had previously advertised withdraws its MAC
1980	   Advertisement route. If two (or more) PEs advertise the same MAC
1981	   address with the same sequence number but different Ethernet segment
1982	   identifiers, a PE that receives these routes selects the route
1983	   advertised by the PE with the lowest IP address as the best route. If
1984	   the PE is the originator of the MAC route and it receives the same MAC
1985	   address with the same sequence number that it generated, it will
1986	   compare its own IP address with the IP address of the remote PE and
1987	   will select the lowest IP. If its own route is not the best one, it
1988	   will withdraw the route.
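
   The sequence number rules and the tie-break above can be sketched in
   Python as follows. The function names are hypothetical. Two
   assumptions are made: the single-homed case (ESI of zero) is treated
   as a move that increments the sequence number, which is one reading
   of "set properly", and sequence number wrap-around is ignored.

```python
import ipaddress
from typing import List, Optional, Tuple


def mobility_seq(local_esi: int,
                 received_esi: int,
                 received_seqs: List[Optional[int]]) -> Optional[int]:
    """Sequence number to advertise for a locally detected MAC address.

    received_seqs holds the sequence number of each previously received
    route for this MAC (None if the route had no MAC Mobility attribute).
    Returns None when no MAC Mobility attribute is to be attached.
    """
    tagged = [s for s in received_seqs if s is not None]
    highest = max(tagged) if tagged else None
    if received_esi == local_esi and local_esi != 0:
        # Same multi-homed segment: not a move, mirror what was received.
        return highest
    # Different ESI, or single-homed (assumption): treat as a move.
    return (highest if highest is not None else 0) + 1


def best_route(routes: List[Tuple[int, str]]) -> Tuple[int, str]:
    """Pick the best of several (sequence number, PE IP address) routes.

    The highest sequence number wins; equal sequence numbers are
    resolved in favor of the lowest IP address.
    """
    return min(routes, key=lambda r: (-r[0], ipaddress.ip_address(r[1])))
```

   A PE whose own route is not the one returned by best_route would
   withdraw its MAC Advertisement route.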

1990	15.1. MAC Duplication Issue

1992	   A situation may arise where the same MAC address is learned by
1993	   different PEs in the same VLAN because of two (or more) hosts being
1994	   mis-configured with the same (duplicate) MAC address. In such a
1995	   situation, the traffic originating from these hosts would trigger
1996	   continuous MAC moves among the PEs attached to these hosts. It is
1997	   important to recognize such a situation and avoid incrementing the
1998	   sequence number (in the MAC Mobility attribute) to infinity. In order
1999	   to remedy such a situation, a PE that detects a MAC mobility event by
2000	   way of local learning starts an M-second timer (default value of M =
2001	   180) and if it detects N MAC moves before the timer expires (default
2002	   value for N = 5), it concludes that a duplicate MAC situation has
2003	   occurred. The PE MUST alert the operator and stop sending and
2004	   processing any BGP MAC Advertisement routes for that MAC address until
2005	   a corrective action is taken by the operator. The values of M and N
2006	   MUST be configurable to allow for flexibility in operator control.
2007	   Note that the other PEs in the EVPN instance will forward the
2008	   traffic for the duplicate MAC address to one of the PEs advertising
2009	   the duplicate MAC address.
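
   A minimal Python sketch of the detection logic for one MAC address
   is given below. The class name and its methods are hypothetical.
   The sketch assumes that the move which starts the timer counts as
   the first of the N moves, and that time is supplied by the caller in
   seconds.

```python
from typing import Optional


class DuplicateMacDetector:
    """Per-MAC duplicate detection with configurable M and N."""

    def __init__(self, m_seconds: float = 180.0, n_moves: int = 5) -> None:
        self.m_seconds = m_seconds      # timer duration M
        self.n_moves = n_moves          # move threshold N
        self.timer_start: Optional[float] = None
        self.count = 0
        self.duplicate = False

    def record_move(self, now: float) -> bool:
        """Record a locally detected MAC move; return True if duplicate."""
        if self.duplicate:
            return True                 # frozen until the operator acts
        expired = (self.timer_start is not None
                   and now - self.timer_start >= self.m_seconds)
        if self.timer_start is None or expired:
            self.timer_start = now      # (re)start the M-second timer
            self.count = 0
        self.count += 1
        if self.count >= self.n_moves:
            self.duplicate = True       # alert operator, stop MAC routes
        return self.duplicate

    def operator_clear(self) -> None:
        """Corrective action taken by the operator."""
        self.timer_start = None
        self.count = 0
        self.duplicate = False
```

   Five moves within 180 seconds trip the detector with the default
   values, whereas moves spaced 100 seconds apart never do.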

2011	15.2. Sticky MAC addresses

2013	   There are scenarios in which it is desired to configure some MAC
2014	   addresses as static so that they are not subjected to MAC move. In
2015	   such scenarios, these MAC addresses are advertised with a MAC Mobility
2016	   Extended Community where the static flag is set to 1 and the sequence
2017	   number is set to zero. If a PE receives such advertisements and later
2018	   learns the same MAC address(es) via local learning, then the PE MUST
2019	   alert the operator.
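
   For illustration, a sticky MAC advertisement could carry an extended
   community built as in the Python sketch below. The Type (0x06) and
   Sub-Type (0x00) values are those listed in the IANA Considerations
   section. The octet layout used here (one Flags octet whose low-order
   bit is the static flag, one reserved octet, and a 4-octet sequence
   number) is an assumption for this sketch and should be checked
   against the section that defines the attribute. The function names
   are hypothetical.

```python
import struct

EVPN_TYPE = 0x06
MAC_MOBILITY_SUBTYPE = 0x00
STATIC_FLAG = 0x01  # assumed: low-order bit of the Flags octet


def mac_mobility_community(seq: int = 0, static: bool = False) -> bytes:
    """Encode an 8-octet MAC Mobility extended community (assumed layout)."""
    flags = STATIC_FLAG if static else 0
    return struct.pack("!BBBBI", EVPN_TYPE, MAC_MOBILITY_SUBTYPE,
                       flags, 0, seq)


def must_alert_on_local_learn(received_community: bytes) -> bool:
    """True if a MAC received with this community is sticky, so that
    learning it locally requires alerting the operator."""
    _type, _sub, flags, _rsvd, _seq = struct.unpack("!BBBBI",
                                                    received_community)
    return bool(flags & STATIC_FLAG)
```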

2021	16. Multicast & Broadcast

2023	   The PEs in a particular EVPN instance may use ingress replication or
2024	   P2MP LSPs to send multicast traffic to other PEs.

2026	16.1. Ingress Replication

2028	   The PEs may use ingress replication for flooding BUM traffic as
2029	   described in section "Handling of Multi-Destination Traffic". A given
2030	   broadcast packet must be sent to all the remote PEs. However a given
2031	   multicast packet for a multicast flow may be sent to only a subset of
2032	   the PEs. Specifically a given multicast flow may be sent to only
2033	   those PEs that have receivers that are interested in the multicast
2034	   flow. Determining which of the PEs have receivers for a given
2035	   multicast flow is done using explicit tracking described below.

2037	16.2. P2MP LSPs

2039	   A PE may use an "Inclusive" tree for sending a BUM packet. This
2040	   terminology is borrowed from [VPLS-MCAST].

2042	   A variety of transport technologies may be used in the SP network.
2043	   For inclusive P-Multicast trees, these transport technologies include
2044	   point-to-multipoint LSPs created by RSVP-TE or mLDP.

2046	16.2.1. Inclusive Trees

2048	   An Inclusive Tree allows the use of a single multicast distribution
2049	   tree, referred to as an Inclusive P-Multicast tree, in the SP network
2050	   to carry all the multicast traffic from a specified set of EVPN
2051	   instances on a given PE. A particular P-Multicast tree can be set up
2052	   to carry the traffic originated by sites belonging to a single EVPN
2053	   instance, or to carry the traffic originated by sites belonging to
2054	   several EVPN instances. The ability to carry the traffic of more than
2055	   one EVPN instance on the same tree is termed 'Aggregation' and the
2056	   tree is called an Aggregate Inclusive P-Multicast tree or Aggregate
2057	   Inclusive tree for short. The Aggregate Inclusive tree needs to
2058	   include every PE that is a member of any of the EVPN instances that
2059	   are using the tree. This implies that a PE may receive BUM traffic
2060	   even if it doesn't have any receivers that are interested in
2061	   receiving that traffic.

2063	   An Inclusive or Aggregate Inclusive tree as defined in this document
2064	   is a P2MP tree.  A P2MP tree is used to carry traffic only for EVPN
2065	   CEs that are connected to the PE that is the root of the tree.

2067	   The procedures for signaling an Inclusive tree are the same as those
2068	   in [VPLS-MCAST] with the VPLS-AD route replaced with the Inclusive
2069	   Multicast Ethernet Tag route. The P-Tunnel attribute [VPLS-MCAST] for
2070	   an Inclusive tree is advertised with the Inclusive Multicast Ethernet
2071	   Tag route as described in section "Handling of Multi-Destination
2072	   Traffic". Note that for an Aggregate Inclusive tree, a PE can
2073	   "aggregate" multiple EVPN instances on the same P2MP LSP using
2074	   upstream labels. The procedures for aggregation are the same as those
2075	   described in [VPLS-MCAST], with VPLS A-D routes replaced by EVPN
2076	   Inclusive Multicast Ethernet Tag routes.

2078	17. Convergence

2080	   This section describes failure recovery from different types of
2081	   network failures.

2083	17.1. Transit Link and Node Failures between PEs

2085	   The use of existing MPLS Fast-Reroute mechanisms can provide failure
2086	   recovery in the order of 50ms, in the event of transit link and node
2087	   failures in the infrastructure that connects the PEs.

2089	17.2. PE Failures

2091	   Consider a host CE1 that is dual homed to PE1 and PE2. If PE1 fails,
2092	   a remote PE, PE3, can discover this based on the failure of the BGP
2093	   session.  This failure detection can be in the sub-second range if
2094	   BFD is used to detect BGP session failure. PE3 can update its
2095	   forwarding state to start sending all traffic for CE1 to only PE2.

2097	17.3. PE to CE Network Failures

2099	   If the connectivity between the multi-homed CE and one of the PEs
2100	   that it is attached to fails, the PE MUST withdraw the set of
2101	   Ethernet A-D per ES routes that had been previously advertised for
2102	   that ES. When the MAC entry on the PE ages out, the PE MUST withdraw
2103	   the MAC address from BGP. Note that to aid convergence, the Ethernet
2104	   A-D per EVI routes MAY be withdrawn before the MAC routes. This
2105	   enables the remote PEs to remove the MPLS next-hop to this particular
2106	   PE from the set of MPLS next-hops that can be used to forward traffic
2107	   to the CE.

2109	   When an Ethernet Tag is decommissioned on an Ethernet segment, then
2110	   the PE MUST withdraw the Ethernet A-D per EVI route(s) announced for
2111	   the <ESI, Ethernet Tags> that are impacted by the decommissioning. In
2112	   addition, the PE MUST also withdraw the MAC advertisement routes that
2113	   are impacted by the decommissioning.

2115	   The Ethernet A-D per ES routes should be used by an implementation to
2116	   optimize the withdrawal of MAC advertisement routes. When a PE
2117	   receives a withdrawal of a particular Ethernet A-D route from a PE it
2118	   SHOULD consider all the MAC advertisement routes, that are learned
2119	   from the same ESI as in the Ethernet A-D route, from the advertising
2120	   PE, as having been withdrawn. This optimizes the network convergence
2121	   times in the event of PE to CE failures.
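
   The withdrawal optimization in the last paragraph can be sketched in
   Python as follows. The data structure (a mapping from MAC address to
   the set of (PE, ESI) paths learned for it) and the function name are
   hypothetical and chosen only for this example.

```python
from typing import Dict, Set, Tuple

Path = Tuple[str, str]  # (advertising PE, ESI)


def apply_ad_withdrawal(mac_paths: Dict[str, Set[Path]],
                        pe: str, esi: str) -> Dict[str, Set[Path]]:
    """Treat every MAC route learned from (pe, esi) as withdrawn.

    Called when the Ethernet A-D route for esi is withdrawn by pe,
    without waiting for the individual MAC route withdrawals.
    MAC addresses left with no path are removed from the result.
    """
    updated: Dict[str, Set[Path]] = {}
    for mac, paths in mac_paths.items():
        remaining = {p for p in paths if p != (pe, esi)}
        if remaining:
            updated[mac] = remaining
    return updated
```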

2123	18. Frame Ordering

2125	   In a MAC address, if the value of the 1st nibble (bits 8 through 5)
2126	   of the most significant octet of the destination MAC address (which
2127	   follows the last MPLS label) happens to be 0x4 or 0x6, then the
2128	   Ethernet frame can be misinterpreted as an IPv4 or IPv6 packet by
2129	   intermediate P nodes performing ECMP based on deep packet inspection,
2130	   thus resulting in load balancing packets belonging to the same flow
2131	   on different ECMP paths and subjecting them to different delays.
2132	   Therefore, packets belonging to the same flow can arrive at the
2133	   destination out of order. This out of order delivery can happen
2134	   during steady state in the absence of any failures, resulting in
2135	   a significant impact on the network operation.

2137	   In order to avoid any such mis-ordering, the following rules are
2138	   applied:

2140	   - If a network uses deep packet inspection for its ECMP, then the
2141	   "Preferred PW MPLS Control Word" per [RFC4385] SHOULD be used with
2142	   the value of 0 (e.g., a 4-octet field with value of zero) when
2143	   sending EVPN encapsulated packets over an MP2P LSP.

2145	   - If a network uses Entropy label [RFC6790], then the control word
2146	   SHOULD NOT be used when sending EVPN encapsulated packets over an
2147	   MP2P LSP.

2149	   - When sending EVPN encapsulated packets over a P2MP LSP or P2P LSP,
2150	   then the control word SHOULD NOT be used.
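
   The nibble check and the three rules above can be sketched in Python
   as follows. The function names and the string values used for the
   LSP type are hypothetical. Where a network uses both deep packet
   inspection and entropy labels, the sketch assumes that the entropy
   label rule takes precedence, which the rules above do not state.

```python
def looks_like_ip(dst_mac: bytes) -> bool:
    """True if the first nibble of the destination MAC address is 0x4 or
    0x6, so that a P node could mistake the frame for IPv4 or IPv6."""
    first_nibble = dst_mac[0] >> 4
    return first_nibble in (0x4, 0x6)


def should_use_control_word(lsp_type: str,
                            dpi_ecmp: bool,
                            entropy_label: bool) -> bool:
    """Apply the three rules; lsp_type is "MP2P", "P2MP" or "P2P"."""
    if lsp_type != "MP2P":
        return False        # P2MP or P2P LSP: SHOULD NOT be used
    if entropy_label:
        return False        # entropy label in use: SHOULD NOT be used
    return dpi_ecmp         # deep packet inspection ECMP: SHOULD be used


def encapsulate(frame: bytes, use_cw: bool) -> bytes:
    """Prepend a 4-octet control word with value 0 when requested."""
    return (b"\x00\x00\x00\x00" + frame) if use_cw else frame
```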

2152	19. Acknowledgements

2154	   Special thanks to Yakov Rekhter for reviewing this draft several
2155	   times and providing valuable comments and for his very engaging
2156	   discussions on several topics of this draft that helped shape this
2157	   document. We would also like to thank Pedro Marques, Kaushik Ghosh,
2158	   Nischal Sheth, Robert Raszuk, Amit Shukla, and Nadeem Mohammed for
2159	   discussions that helped shape this document. We would also like to
2160	   thank Han Nguyen for his comments and support of this work. We would
2161	   also like to thank Steve Kensil and Reshad Rahman for their reviews.
2162	   We would like to thank Jorge Rabadan for his contribution to section
2163	   5 of this draft. We would like to thank Thomas Morin for his review
2164	   of this draft and his contribution of section 8.6. Many thanks to
2165	   Jakob Heitz for his help to improve several sections of this draft.

2167	   We would also like to thank Clarence Filsfils, Dennis Cai, Quaizar
2168	   Vohra, Kireeti Kompella, and Apurva Mehta for their contributions to
2169	   this document.

2171	   Last but not least, special thanks to Giles Heron (our WG chair) for
2172	   his detailed review of this document in preparation for WG LC and
2173	   making many valuable suggestions.

2175	20. Security Considerations

2177	   Security considerations discussed in [RFC4761] and [RFC4762] apply to
2178	   this document for MAC learning in data-plane over an Attachment
2179	   Circuit (AC) and for flooding of unknown unicast and ARP messages
2180	   over the MPLS/IP core. Security considerations discussed in [RFC4364]
2181	   apply to this document for MAC learning in control-plane over the
2182	   MPLS/IP core. This section describes additional considerations.

2184	   As mentioned in [RFC4761], there are two aspects to achieving data
2185	   privacy and protecting against denial-of-service attacks in a VPN:
2187	   securing the control plane and protecting the forwarding path.
2188	   Compromise of the control plane could result in a PE sending customer
2189	   data belonging to some EVPN to another EVPN, or black-holing EVPN
2190	   customer data, or even sending it to an eavesdropper; none of which
2191	   are acceptable from a data privacy point of view.  In addition,
2192	   compromise of the control plane could provide opportunities for
2193	   unauthorized EVPN data usage (e.g., exploiting traffic
2194	   replication within a multicast tree to amplify a
2195	   denial-of-service attack based on sending large amounts of
2196	   traffic).

2198	   The mechanisms in this document use BGP for the control plane. Hence,
2199	   techniques such as in [RFC5925] help authenticate BGP messages,
2200	   making it harder to spoof updates (which can be used to divert EVPN
2201	   traffic to the wrong EVPN instance) or withdrawals (denial-of-service
2202	   attacks).  In the multi-AS methods (b) and (c), this also means
2203	   protecting the inter-AS BGP sessions, between the ASBRs, the PEs, or
2204	   the Route Reflectors.

2206	   Note that [RFC5925] will not help in keeping MPLS labels private --
2207	   knowing the labels, one can eavesdrop on EVPN traffic. However, this
2208	   requires access to the data path within an SP network, which is
2209	   assumed to be composed of trusted nodes/links.

2211	   One of the requirements for protecting the data plane is that the
2212	   MPLS labels be accepted only from valid interfaces. For a PE, valid
2213	   interfaces comprise links from other routers in the PE's own AS.  For
2214	   an ASBR, valid interfaces comprise links from other routers in the
2215	   ASBR's own AS, and links from other ASBRs in ASes that have instances
2216	   of a given EVPN.  It is especially important in the case of multi-AS
2217	   EVPN instances that one accept EVPN packets only from valid
2218	   interfaces.

2220	   It is also important to help limit malicious traffic into a network
2221	   for an imposter MAC address. The mechanism described in section 15.1
2222	   shows how duplicate MAC addresses can be detected and continuous false
2223	   MAC mobility can be prevented. The mechanism described in section
2224	   15.2 shows how MAC addresses can be pinned to a given Ethernet
2225	   Segment, such that if they appear behind any other Ethernet Segments,
2226	   the traffic for those MAC addresses is prevented from entering the
2227	   EVPN network from the other Ethernet Segments.

2229	21. Contributors

2231	   In addition to the authors listed on the front page, the following
2232	   individuals have also helped to shape this document:

2234	      Keyur Patel
2235	      Samer Salam
2236	      Sami Boutros
2237	      Cisco

2239	      Yakov Rekhter
2240	      Ravi Shekhar
2241	      Juniper Networks

2243	      Florin Balus
2244	      Nuage Networks

2246	22.  IANA Considerations

2248	   This document defines a new NLRI, called "EVPN", to be carried in BGP
2249	   using multiprotocol extensions.  This NLRI uses the existing AFI of
2250	   25 (L2VPN).  IANA has assigned it a SAFI value of 70.

2252	   IANA allocated a new transitive extended community Type of 0x06 and
2253	   Sub-Type of 0x00 for EVPN MAC Mobility Extended Community.

2255	   IANA allocated a new transitive extended community Type of 0x06 and
2256	   Sub-Type of 0x01 for EVPN ESI Label Extended Community.

2258	   IANA allocated a new transitive extended community Type of 0x06 and
2259	   Sub-Type of 0x02 for EVPN ES-Import Route Target.

2261	   For EVPN NLRI (with AFI=25, SAFI = 70), the following route types are
2262	   requested from IANA: 1 - Ethernet Auto-Discovery (A-D) route;
2263	   2 - MAC/IP advertisement route; 3 - Inclusive Multicast Ethernet
2264	   Tag Route; 4 - Ethernet Segment Route.

2266	23. References

2268	23.1 Normative References

2270	   [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP VPNs", February 2006.

2272	   [RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service
2273	              (VPLS) Using BGP for Auto-Discovery and Signaling", RFC
2274	              4761, January 2007.

2276	   [RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service
2277	              (VPLS) Using Label Distribution Protocol (LDP) Signaling",
2278	              RFC 4762, January 2007.

2280	   [RFC4271] Y. Rekhter et al., "A Border Gateway Protocol 4 (BGP-4)",
2281	              RFC 4271, January 2006.

2283	   [RFC4760] T. Bates et al., "Multiprotocol Extensions for BGP-4", RFC
2284	              4760, January 2007.

2286	23.2 Informative References

2288	   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
2289	              Requirement Levels", BCP 14, RFC 2119, March 1997.

2291	   [RFC7209] A. Sajassi, R. Aggarwal et al., "Requirements for Ethernet
2292	              VPN", draft-ietf-l2vpn-evpn-req-04.txt, July 2013.

2294	   [RFC7117] R. Aggarwal et al., "Multicast in VPLS", draft-ietf-l2vpn-
2295	              vpls-mcast-14.txt, July 2013.

2297	   [RFC4684] P. Marques et al., "Constrained Route Distribution for
2298	              Border Gateway Protocol/MultiProtocol Label Switching
2299	              (BGP/MPLS) Internet Protocol (IP) Virtual Private Networks
2300	              (VPNs)", RFC 4684, November 2006.

2302	   [RFC6790] K. Kompella et al., "The Use of Entropy Labels in MPLS
2303	              Forwarding", RFC 6790, November 2012.

2305	   [RFC4385] S. Bryant et al., "PWE3 Control Word for Use over an MPLS
2306	              PSN", RFC 4385, February 2006.

2308	24. Authors' Addresses

2310	      Ali Sajassi
2311	      Cisco
2312	      Email: sajassi@cisco.com

2314	      Rahul Aggarwal
2315	      Email: raggarwa_1@yahoo.com

2317	      Nabil Bitar
2318	      Verizon Communications
2319	      Email: nabil.n.bitar@verizon.com

2321	      Aldrin Isaac
2322	      Bloomberg
2323	      Email: aisaac71@bloomberg.net

2325	      James Uttaro
2326	      AT&T
2327	      Email: uttaro@att.com

2329	      John Drake
2330	      Juniper Networks
2331	      Email: jdrake@juniper.net

2333	      Wim Henderickx
2334	      Alcatel-Lucent
2335	      Email: wim.henderickx@alcatel-lucent.com