idnits 2.17.1 

draft-ietf-bess-evpn-usage-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (August 28, 2017) is 2432 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'PE-IP' is mentioned on line 407, but not defined

  == Missing Reference: 'AS' is mentioned on line 413, but not defined

  == Unused Reference: 'RFC4364' is defined on line 1339, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC7117' is defined on line 1348, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-15) exists of
     draft-ietf-bess-evpn-inter-subnet-forwarding-03


     Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	BESS Workgroup                                           J. Rabadan, Ed.
3	Internet Draft                                           S. Palislamovic
4	                                                           W. Henderickx
5	Intended status: Informational                                     Nokia

7	                                                              A. Sajassi
8	                                                                   Cisco

10	                                                               J. Uttaro
11	                                                                    AT&T

13	Expires: March 1, 2018                                   August 28, 2017

15	         Usage and applicability of BGP MPLS based Ethernet VPN
16	                     draft-ietf-bess-evpn-usage-06

18	Abstract

20	   This document discusses the usage and applicability of BGP MPLS based
21	   Ethernet VPN (EVPN) in a simple and fairly common deployment
22	   scenario. The different EVPN procedures are explained on the example
23	   scenario, analyzing the benefits and trade-offs of each option. This
24	   document is intended to provide a simplified guide for the deployment
25	   of EVPN networks.

27	Status of this Memo

28	   This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

30	   Internet-Drafts are working documents of the Internet Engineering
31	   Task Force (IETF), its areas, and its working groups.  Note that
32	   other groups may also distribute working documents as Internet-
33	   Drafts.

35	   Internet-Drafts are draft documents valid for a maximum of six months
36	   and may be updated, replaced, or obsoleted by other documents at any
37	   time.  It is inappropriate to use Internet-Drafts as reference
38	   material or to cite them other than as "work in progress."
39	   The list of current Internet-Drafts can be accessed at
40	   http://www.ietf.org/ietf/1id-abstracts.txt

42	   The list of Internet-Draft Shadow Directories can be accessed at
43	   http://www.ietf.org/shadow.html

45	   This Internet-Draft will expire on March 1, 2018.

47	Copyright Notice

49	   Copyright (c) 2017 IETF Trust and the persons identified as the
50	   document authors. All rights reserved.

52	   This document is subject to BCP 78 and the IETF Trust's Legal
53	   Provisions Relating to IETF Documents
54	   (http://trustee.ietf.org/license-info) in effect on the date of
55	   publication of this document. Please review these documents
56	   carefully, as they describe your rights and restrictions with respect
57	   to this document. Code Components extracted from this document must
58	   include Simplified BSD License text as described in Section 4.e of
59	   the Trust Legal Provisions and are provided without warranty as
60	   described in the Simplified BSD License.

62	Table of Contents

64	   1. Introduction
65	   2. Terminology
66	   3. Use-case scenario description and requirements
67	     3.1. Service Requirements
68	     3.2. Why EVPN is chosen to address this use-case
69	   4. Provisioning Model
70	     4.1. Common provisioning tasks
71	       4.1.1. Non-service specific parameters
72	       4.1.2. Service specific parameters
73	     4.2. Service interface dependent provisioning tasks
74	       4.2.1. VLAN-based service interface EVI
75	       4.2.2. VLAN-bundle service interface EVI
76	       4.2.3. VLAN-aware bundling service interface EVI
77	   5. BGP EVPN NLRI usage
78	   6. MAC-based forwarding model use-case
79	     6.1. EVPN Network Startup procedures
80	     6.2. VLAN-based service procedures
81	       6.2.1. Service startup procedures
82	       6.2.2. Packet walkthrough
83	     6.3. VLAN-bundle service procedures
84	       6.3.1. Service startup procedures
85	       6.3.2. Packet Walkthrough
86	     6.4. VLAN-aware bundling service procedures
87	       6.4.1. Service startup procedures
88	       6.4.2. Packet Walkthrough
89	   7. MPLS-based forwarding model use-case
90	     7.1. Impact of MPLS-based forwarding on the EVPN network
91	          startup
92	     7.2. Impact of MPLS-based forwarding on the VLAN-based service
93	          procedures
94	     7.3. Impact of MPLS-based forwarding on the VLAN-bundle
95	          service procedures
96	     7.4. Impact of MPLS-based forwarding on the VLAN-aware service
97	          procedures
98	   8. Comparison between MAC-based and MPLS-based Egress Forwarding
99	      Models
100	   9. Traffic flow optimization
101	     9.1. Control Plane Procedures
102	       9.1.1. MAC learning options
103	       9.1.2. Proxy-ARP/ND
104	       9.1.3. Unknown Unicast flooding suppression
105	       9.1.4. Optimization of Inter-subnet forwarding
106	     9.2. Packet Walkthrough Examples
107	       9.2.1. Proxy-ARP example for CE2 to CE3 traffic
108	       9.2.2. Flood suppression example for CE1 to CE3 traffic
109	       9.2.3. Optimization of inter-subnet forwarding example for
110	              CE3 to CE2 traffic
111	   10. Security Considerations
112	   11. IANA Considerations
113	   12. References
114	     12.1. Normative References
115	     12.2. Informative References
116	   13. Acknowledgments
117	   14. Contributors
118	   15. Authors' Addresses

120	1. Introduction

122	   This document complements [RFC7432] by discussing the applicability
123	   of the technology in a simple and fairly common deployment scenario,
124	   which is described in section 3.

126	   After describing the topology and requirements of the use-case
127	   scenario, section 4 will describe the provisioning model.

129	   Once the provisioning model is analyzed, sections 5, 6 and 7 will
130	   describe the control plane and data plane procedures in the example
131	   scenario, for the two potential disposition/forwarding models:
132	   MAC-based and MPLS-based models. While both models can interoperate
133	   in the same network, each one has different trade-offs that are
134	   analyzed in section 8.

136	   Finally, EVPN provides some potential traffic flow optimization tools
137	   that are also described in section 9, in the context of the example
138	   scenario.

140	2. Terminology

142	   The following terminology is used:

144	   o VID: VLAN Identifier.

146	   o CE: Customer Edge device.

148	   o EVI: EVPN Instance.

150	   o MAC-VRF: A Virtual Routing and Forwarding table for Media Access
151	     Control (MAC) addresses on a PE.

153	   o Ethernet Segment (ES): set of links through which a customer site
154	     (CE) is connected to one or more PEs. Each ES is identified by an
155	     Ethernet Segment Identifier (ESI) in the control plane.

157	   o CE-VIDs refer to the VLAN tag identifiers being used at CE1, CE2
158	     and CE3 to tag customer traffic sent to the Service Provider EVPN
159	     network.

161	   o CE1-MAC, CE2-MAC and CE3-MAC refer to source MAC addresses "behind"
162	     each CE respectively. Those MAC addresses can belong to the CEs
163	     themselves or to devices connected to the CEs.

165	   o CE1-IP, CE2-IP and CE3-IP refer to IP addresses associated with
166	     the above MAC addresses.

168	   o LACP: Link Aggregation Control Protocol.

170	   o RD: Route Distinguisher.

172	   o RT: Route Target.

174	3. Use-case scenario description and requirements

176	   Figure 1 depicts the scenario that will be referenced throughout the
177	   rest of the document.

179	                            +--------------+
180	                            |              |
181	          +----+     +----+ |              | +----+   +----+
182	          | CE1|-----|    | |              | |    |---| CE3|
183	          +----+    /| PE1| |   IP/MPLS    | | PE3|   +----+
184	                   / +----+ |   Network    | +----+
185	                  /         |              |
186	                 /   +----+ |              |
187	          +----+/    |    | |              |
188	          | CE2|-----| PE2| |              |
189	          +----+     +----+ |              |
190	                            +--------------+

192	                     Figure 1 EVPN use-case scenario

194	   There are three PEs and three CEs considered in this example: PE1,
195	   PE2, PE3, as well as CE1, CE2 and CE3. Broadcast Domains must be
196	   extended among the three CEs.

198	3.1. Service Requirements

200	   The following service requirements are assumed in this scenario:

202	   o Redundancy requirements:

204	     - CE2 requires multi-homing connectivity to PE1 and PE2, not only
205	       for redundancy purposes, but also for adding more
206	       upstream/downstream connectivity bandwidth to/from the network.

208	     - Fast convergence. E.g.: if the link between CE2 and PE1 goes
209	       down, a fast convergence mechanism must be supported so that PE3
210	       can immediately send the traffic to PE2, irrespective of the
211	       number of affected services and MAC addresses.

213	   o Service interface requirements:

215	     - The service definition must be flexible in terms of CE-VID-to-
216	       broadcast-domain assignment in the core.

218	     - The following three EVI services are required in this example:

220	       EVI100 - It uses VLAN-based service interfaces in the three CEs
221	       with a 1:1 VLAN-to-EVI mapping. The CE-VIDs at the three CEs can
222	       be the same, e.g.: VID 100, or different at each CE, e.g.: VID
223	       101 in CE1, VID 102 in CE2 and VID 103 in CE3. A single broadcast
224	       domain needs to be created for EVI100 in any case; therefore CE-
225	       VIDs will require translation at the egress PEs if they are not
226	       consistent across the three CEs. The case when the same CE-VID is
227	       used across the three CEs for EVI100 is referred to in [RFC7432] as
228	       the "Unique VLAN" EVPN case. This term will be used throughout
229	       this document too.

231	       EVI200 - It uses VLAN-bundle service interfaces in CE1, CE2 and
232	       CE3, based on an N:1 VLAN-to-EVI mapping. The operator needs to
233	       pre-configure a range of CE-VIDs and its mapping to the EVI, and
234	       this mapping should be consistent in all the PEs (no translation
235	       is supported). A single broadcast domain is created for the
236	       customer. The customer is responsible for keeping the separation
237	       between users in different CE-VIDs.

239	       EVI300 - It uses VLAN-aware bundling service interfaces in CE1,
240	       CE2 and CE3. As in the EVI200 case, an N:1 VLAN-to-EVI mapping is
241	       created at the ingress PEs; however, in this case a separate
242	       broadcast domain is required per CE-VID. The CE-VIDs can be
243	       different (hence CE-VID translation is required).

245	   NOTE: in section 4.2.1, only EVI100 is used as an example of
246	   VLAN-based service provisioning. In sections 6.2 and 7.2, 4k
247	   VLAN-based EVIs (EVI1 to EVI4k) are used so that the impact of MAC
248	   vs. MPLS disposition models in the control plane can be evaluated. In
249	   the same way, EVI200 and EVI300 will be described with a 4k:1 mapping
250	   (CE-VIDs-to-EVI mapping) in sections 6.3, 6.4, 7.3 and 7.4.

252	   o BUM (Broadcast, Unknown unicast, Multicast) optimization
253	     requirements:

255	     - The solution must support ingress replication or P2MP MPLS LSPs
256	       on a per EVI service.

258	     - For example, we can use ingress replication for EVI100 and
259	       EVI200, assuming those EVIs will not carry much BUM traffic.
260	       Conversely, if EVI300 is expected to carry a significant
261	       amount of multicast traffic, P2MP MPLS LSPs can be used for this
262	       service.

264	     - The benefit of ingress replication compared to P2MP LSPs is that
265	       the core routers will not need to maintain any multicast states.

267	3.2. Why EVPN is chosen to address this use-case

269	   VPLS solutions based on [RFC4761], [RFC4762] and [RFC6074] cannot
270	   meet the requirements in section 3, whereas EVPN can.

272	   For example:

274	   o If CE2 has a single CE-VID (or a few CE-VIDs) the current VPLS
275	     multi-homing solutions (based on load-balancing per CE-VID or
276	     service) do not provide the optimized link utilization required in
277	     this example. EVPN provides the flow-based load-balancing
278	     multi-homing solution required in this scenario to optimize the
279	     upstream/downstream link utilization between CE2 and PE1-PE2.

281	   o Also, EVPN provides a fast convergence solution that is independent
282	     of the CE-VIDs in the multi-homed PEs. Upon failure on the link
283	     between CE2 and PE1, PE3 can immediately send the traffic to PE2,
284	     based on a single notification message being sent by PE1. This is
285	     not possible with VPLS solutions.

287	   o With regard to service interfaces and mapping to broadcast domains,
288	     while VPLS might meet the requirements for EVI100 and EVI200, the
289	     VLAN-aware bundling service interfaces required by EVI300 are not
290	     supported by the current VPLS tools.

292	   The rest of the document will describe how EVPN can be used to meet
293	   the service requirements described in section 3, and even optimize
294	   the network further by:

296	   o Providing the user with an option to reduce (and even suppress) the
297	     ARP-flooding.

299	   o Supporting ARP termination and inter-subnet-forwarding.

301	4. Provisioning Model

303	   One of the requirements stated in [RFC7209] is the ease of
304	   provisioning. BGP parameters and service context parameters should be
305	   auto-provisioned so that the addition of a new MAC-VRF to the EVI
306	   requires a minimum number of single-sided provisioning touches.
307	   However, this is only possible in a limited number of cases. This
308	   section describes the provisioning tasks required for the services
309	   described in section 3, i.e. EVI100 (VLAN-based service interfaces),
310	   EVI200 (VLAN-bundle service interfaces) and EVI300 (VLAN-aware
311	   bundling service interfaces).

313	4.1. Common provisioning tasks

315	   Regardless of the service interface type (VLAN-based, VLAN-bundle or
316	   VLAN-aware), the following sub-sections describe the parameters to be
317	   provisioned in the three PEs.

319	4.1.1. Non-service specific parameters

320	   The multi-homing function in EVPN requires the provisioning of
321	   certain parameters which are not service-specific and that are shared
322	   by all the MAC-VRFs in the node using the multi-homing capabilities.
323	   In our use-case, these parameters are only provisioned or auto-
324	   derived in PE1 and PE2, and are listed below:

326	   o Ethernet Segment Identifier (ESI): only the ESI associated to CE2
327	     needs to be considered in our example. Single-homed CEs such as CE1
328	     and CE3 do not require the provisioning of an ESI (the ESI will be
329	     coded as zero in the BGP NLRIs). In our example, a LAG is used
330	     between CE2 and PE1-PE2 (since all-active multi-homing is a
331	     requirement); therefore, the ESI can be auto-derived from the LACP
332	     information as described in [RFC7432]. Note that the ESI must be
333	     unique across all the PEs in the network; therefore, the
334	     auto-provisioning of the ESI is only recommended in case the CEs
335	     are managed by the Operator. Otherwise, the ESI should be manually
336	     provisioned (type 0 as in [RFC7432]) in order to avoid potential
337	     conflicts.

339	   o ES-Import Route Target (ES-Import RT): this is the RT that will be
340	     sent by PE1 and PE2, along with the ES route. Regardless of how the
341	     ESI is provisioned in PE1 and PE2, the ES-Import RT must always be
342	     auto-derived from the 6-byte MAC address portion of the ESI value.

344	   o Ethernet Segment Route Distinguisher (ES RD): this is the RD to be
345	     encoded in the ES route and Ethernet Auto-Discovery (A-D) route to
346	     be sent by PE1 and PE2 for the CE2 ESI. This RD should always be
347	     auto-derived from the PE IP address, as described in [RFC7432].

349	   o Multi-homing type: the user must be able to provision the
350	     multi-homing type to be used in the network. In our use-case, the
351	     multi-homing type will be set to all-active for the CE2 ESI. This
352	     piece of information is encoded in the ESI Label extended community
353	     flags and sent by PE1 and PE2 along with the Ethernet A-D route for
354	     the CE2 ESI.

356	   In our use-case, besides the above parameters, the same LACP
357	   parameters will be configured in PE1 and PE2 for the ESI, so that CE2
358	   can send different flows to PE1 and PE2 for the same CE-VID as though
359	   they were forming a single system from the CE2 perspective.
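
	   As an illustration of the auto-derivation described above, the
	   following Python sketch builds a type 1 ESI from LACP
	   information and derives the ES-Import RT from its MAC address
	   portion. It is only a sketch under stated assumptions: the
	   octet layouts are the ones defined in [RFC7432], while the
	   function names, the LACP system MAC and the port key are
	   hypothetical and chosen for the example.

```python
import struct


def esi_type1(lacp_system_mac: str, lacp_port_key: int) -> bytes:
    """Build a 10-octet type 1 ESI (auto-derived from LACP)."""
    mac = bytes.fromhex(lacp_system_mac.replace(":", ""))
    if len(mac) != 6:
        raise ValueError("LACP system MAC must be 6 octets")
    # Type (1 octet) + CE LACP system MAC (6) + CE LACP port key (2) + 0x00
    return bytes([0x01]) + mac + struct.pack("!H", lacp_port_key) + b"\x00"


def es_import_rt(esi: bytes) -> bytes:
    """Derive the 8-octet ES-Import RT extended community from an ESI."""
    if len(esi) != 10:
        raise ValueError("an ESI is 10 octets long")
    mac_portion = esi[1:7]  # the 6-byte MAC address portion of the ESI value
    # Type 0x06, Sub-Type 0x02, followed by the 6-octet ES-Import value
    return bytes([0x06, 0x02]) + mac_portion


# Hypothetical LACP values seen by PE1 and PE2 on the LAG towards CE2.
esi12 = esi_type1("00:11:22:33:44:55", 0x000C)
rt12 = es_import_rt(esi12)
```

	   Since PE1 and PE2 receive the same LACP system MAC and port
	   key from CE2, both derive the same ESI and therefore the same
	   ES-Import RT, which is what lets them import the ES routes of
	   each other.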

361	4.1.2. Service specific parameters

363	   The following parameters must be provisioned in PE1, PE2 and PE3 per
364	   EVI service:

366	   o EVI identifier: global identifier per EVI that is shared by all the
367	     PEs part of the EVI, i.e. PE1, PE2 and PE3 will be provisioned with
368	     EVI100, 200 and 300. The EVI identifier can be associated to (or be
369	     the same value as) the EVI default Ethernet Tag (4-byte default
370	     broadcast domain identifier for the EVI). The Ethernet Tag is
371	     different from zero in the EVPN BGP routes only if the service
372	     interface type (of the source PE) is VLAN-aware Bundle.

374	   o EVI Route Distinguisher (EVI RD): This RD is a unique value across
375	     all the MAC-VRFs in a PE. Auto-derivation of this RD might be
376	     possible depending on the service interface type being used in the
377	     EVI. Section 4.2 discusses the specifics of each service interface
378	     type.

380	   o EVI Route Target(s) (EVI RT): one or more RTs can be provisioned
381	     per MAC-VRF. The RT(s) imported and exported can be equal or
382	     different, just as the RT(s) in IP-VPNs. Auto-derivation of the
383	     RT(s) might be possible depending on the service interface type
384	     being used in the EVI. Section 4.2 discusses the specifics of each
385	     service interface type.

387	   o CE-VID and port/LAG binding to EVI identifier or Ethernet Tag: see
388	     section 4.2.

390	4.2. Service interface dependent provisioning tasks

392	   Depending on the service interface type being used in the EVI, a
393	   specific CE-VID binding must be provisioned.

395	4.2.1. VLAN-based service interface EVI

397	   In our use-case, EVI100 is a VLAN-based service interface EVI.

399	   EVI100 can be a "unique-VLAN" service if the CE-VID being used for
400	   this service in CE1, CE2 and CE3 is identical, e.g. VID 100. In that
401	   case, the VID 100 binding must be provisioned in PE1, PE2 and PE3 for
402	   EVI100 and the associated port or LAG. The MAC-VRF RD and RT can be
403	   auto-derived from the CE-VID:

405	   o The auto-derived MAC-VRF RD will be a Type 1 RD, as recommended in
406	     [RFC7432], and it will be composed of [PE-IP]:[zero-padded-VID];
407	     where [PE-IP] is the IP address of the PE (a loopback address) and
408	     [zero-padded-VID] is a 2-byte value where the low order 12 bits are
409	     the VID (VID 100 in our example) and the high order 4 bits are
410	     zero.

412	   o The auto-derived MAC-VRF RT will be composed of [AS]:[zero-padded-
413	     VID]; where [AS] is the Autonomous System that the PE belongs to
414	     and [zero-padded-VID] is a 2 or 4-byte value where the low order 12
415	     bits are the VID (VID 100 in our example) and the high order bits
416	     are zero. Note that auto-deriving the RT implies supporting a basic
417	     any-to-any topology in the EVI and using the same import and export
418	     RT in the EVI.
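
	   The two derivations above can be sketched in Python as
	   follows. This is a hedged illustration, not a normative
	   encoding: the Type 1 RD layout is the one from [RFC4364], the
	   RT is encoded as a two-octet or four-octet AS-specific Route
	   Target extended community, and the function names, the
	   loopback address 192.0.2.1 and the AS numbers are
	   hypothetical.

```python
import ipaddress
import struct


def zero_padded_vid(vid: int) -> int:
    """Return the VID in the low-order 12 bits, high-order bits zero."""
    if not 1 <= vid <= 4094:
        raise ValueError("VID must be in the range 1-4094")
    return vid & 0x0FFF


def mac_vrf_rd(pe_ip: str, vid: int) -> bytes:
    """Type 1 RD: 2-octet type, 4-octet PE IP, 2-octet zero-padded VID."""
    ip = ipaddress.IPv4Address(pe_ip).packed
    return struct.pack("!H4sH", 1, ip, zero_padded_vid(vid))


def mac_vrf_rt(asn: int, vid: int) -> bytes:
    """Route Target [AS]:[zero-padded-VID] as an extended community."""
    if asn < 65536:
        # 2-byte AS, so the zero-padded VID is a 4-byte value
        return struct.pack("!BBHI", 0x00, 0x02, asn, zero_padded_vid(vid))
    # 4-byte AS, so the zero-padded VID is a 2-byte value
    return struct.pack("!BBIH", 0x02, 0x02, asn, zero_padded_vid(vid))


rd = mac_vrf_rd("192.0.2.1", 100)   # 192.0.2.1:100
rt = mac_vrf_rt(64500, 100)         # 64500:100
```

	   Because both values depend only on the PE address, the AS and
	   the CE-VID, no per-service RD or RT has to be typed in by the
	   operator when the same CE-VID is used on the three CEs.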

420	   If EVI100 is not a "unique-VLAN" instance, each individual CE-VID
421	   must be configured in each PE, and MAC-VRF RDs and RTs cannot be
422	   auto-derived; hence, they must be provisioned by the user.

424	4.2.2. VLAN-bundle service interface EVI

426	   Assuming EVI200 is a VLAN-bundle service interface EVI, and VIDs
427	   200-250 are assigned to EVI200, the CE-VID bundle 200-250 must be
428	   provisioned on PE1, PE2 and PE3. Note that this model does not allow
429	   CE-VID translation and the CEs must use the same CE-VIDs for EVI200.
430	   No auto-derived EVI RDs or EVI RTs are possible.

432	4.2.3. VLAN-aware bundling service interface EVI

434	   If EVI300 is a VLAN-aware bundling service interface EVI, CE-VID
435	   binding to EVI300 does not have to match on the three PEs (only on
436	   PE1 and PE2, since they are part of the same ES). E.g.: PE1 and PE2
437	   CE-VID binding to EVI300 can be set to the range 300-310 and PE3 to
438	   321-330. Note that each individual CE-VID will be assigned to a
439	   different broadcast domain, represented by an Ethernet Tag in the
440	   control plane.

442	   Therefore, besides the CE-VID bundle range bound to EVI300 in each
443	   PE, associations between each individual CE-VID and the corresponding
444	   EVPN Ethernet Tag must be provisioned by the user. No auto-derived
445	   EVI RDs/RTs are possible.
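
	   The three CE-VID binding models of this section can be
	   contrasted with a short Python sketch. It is an illustrative
	   model only: the class and function names are hypothetical, and
	   the Ethernet Tag values 1 and 2 used for EVI300 are assumed,
	   since the actual values are provisioned by the user. The VID
	   values are the ones used in the examples above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class EviBinding:
    """CE-VID binding of one EVI on one PE port or LAG."""
    evi: int
    vid_to_tag: Dict[int, int]  # CE-VID -> Ethernet Tag used in BGP routes


def vlan_based(evi: int, ce_vid: int) -> EviBinding:
    # 1:1 mapping; the Ethernet Tag is zero in the EVPN routes.
    return EviBinding(evi, {ce_vid: 0})


def vlan_bundle(evi: int, first_vid: int, last_vid: int) -> EviBinding:
    # N:1 mapping into a single broadcast domain; the Ethernet Tag is zero.
    return EviBinding(evi, {v: 0 for v in range(first_vid, last_vid + 1)})


def vlan_aware(evi: int, vid_to_tag: Dict[int, int]) -> EviBinding:
    # N:1 mapping, one broadcast domain (non-zero Ethernet Tag) per CE-VID.
    if any(tag == 0 for tag in vid_to_tag.values()):
        raise ValueError("VLAN-aware bundling needs non-zero Ethernet Tags")
    return EviBinding(evi, dict(vid_to_tag))


def classify(bindings: List[EviBinding], ce_vid: int) -> Optional[Tuple[int, int]]:
    """Return (EVI, Ethernet Tag) for a received CE-VID, or None."""
    for binding in bindings:
        if ce_vid in binding.vid_to_tag:
            return binding.evi, binding.vid_to_tag[ce_vid]
    return None


# Hypothetical per-PE provisioning; PE2 would mirror PE1 (same ES).
PE1 = [vlan_based(100, 100), vlan_bundle(200, 200, 250),
       vlan_aware(300, {300: 1, 301: 2})]
PE3 = [vlan_based(100, 100), vlan_bundle(200, 200, 250),
       vlan_aware(300, {321: 1, 322: 2})]
```

	   In this sketch, CE-VID 300 at PE1 and CE-VID 321 at PE3 map to
	   the same Ethernet Tag and hence to the same broadcast domain,
	   which is the CE-VID translation that the VLAN-bundle model
	   cannot provide.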

447	5. BGP EVPN NLRI usage

449	   [RFC7432] defines four different route types and four different
450	   extended communities. However, not all the PEs in an EVPN network
451	   must generate and process all the different routes and extended
452	   communities. The following table shows the routes that must be
453	   exported and imported in the use-case described in this document.
454	   "Export", in this context, means that the PE must be capable of
455	   generating and exporting a given route, assuming there are no BGP
456	   policies to prevent it. In the same way, "Import" means the PE must
457	   be capable of importing and processing a given route, assuming the
458	   right RTs and policies. "N/A" means neither import nor export actions
459	   are required.

461	   +-------------------+---------------+---------------+
462	   | BGP EVPN routes   | PE1-PE2       | PE3           |
463	   +-------------------+---------------+---------------+
464	   | ES                | Export/import | N/A           |
465	   | A-D per ESI       | Export/import | Import        |
466	   | A-D per EVI       | Export/import | Import        |
467	   | MAC               | Export/import | Export/import |
468	   | Inclusive mcast   | Export/import | Export/import |
469	   +-------------------+---------------+---------------+

471	   PE3 is only required to export MAC and Inclusive multicast routes and
472	   be able to import and process A-D routes, as well as MAC and
473	   Inclusive multicast routes. If PE3 did not support importing and
474	   processing A-D routes per ESI and per EVI, fast convergence and
475	   aliasing functions (respectively) would not be possible in this
476	   use-case.
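
	   The dependency between PE3's import capabilities and the
	   functions available in the use-case can be written down as a
	   small Python sketch. The dictionary and function names are
	   hypothetical; the content only restates the table and the
	   paragraph above.

```python
# Routes each PE must be able to export/import in this use-case.
ROUTE_ROLES = {
    "ES":              {"PE1-PE2": {"export", "import"}, "PE3": set()},
    "A-D per ESI":     {"PE1-PE2": {"export", "import"}, "PE3": {"import"}},
    "A-D per EVI":     {"PE1-PE2": {"export", "import"}, "PE3": {"import"}},
    "MAC":             {"PE1-PE2": {"export", "import"},
                        "PE3": {"export", "import"}},
    "Inclusive mcast": {"PE1-PE2": {"export", "import"},
                        "PE3": {"export", "import"}},
}

# Function that depends on PE3 importing a given A-D route type.
FUNCTION_BY_ROUTE = {
    "A-D per ESI": "fast convergence",
    "A-D per EVI": "aliasing",
}


def lost_functions(supported_imports):
    """List the functions PE3 loses for the routes it cannot import."""
    return sorted(func for route, func in FUNCTION_BY_ROUTE.items()
                  if route not in supported_imports)
```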

478	6. MAC-based forwarding model use-case

480	   This section describes how the BGP EVPN routes are exported and
481	   imported by the PEs in our use-case, as well as how traffic is
482	   forwarded assuming that PE1, PE2 and PE3 support a MAC-based
483	   forwarding model. In order to compare the control and data plane
484	   impact in the two forwarding models (MAC-based and MPLS-based) and
485	   different service types, we will assume that CE1, CE2 and CE3 need to
486	   exchange traffic for up to 4k CE-VIDs.

488	6.1. EVPN Network Startup procedures

490	   Before any EVI is provisioned in the network, the following
491	   procedures are required:

493	   o Infrastructure setup: the proper MPLS infrastructure must be set up
494	     among PE1, PE2 and PE3 so that the EVPN services can make use of
495	     P2P and P2MP LSPs. In addition to the MPLS transport, PE1 and PE2
496	     must be properly configured with the same LACP configuration to
497	     CE2. Details are provided in [RFC7432]. Once the LAG is properly
498	     set up, the ESI for the CE2 Ethernet Segment, e.g. ESI12, can be
499	     auto-generated by PE1 and PE2 from the LACP information exchanged
500	     with CE2 (ESI type 1), as discussed in section 4.1. Alternatively,
501	     the ESI can also be manually provisioned on PE1 and PE2 (ESI type
502	     0). PE1 and PE2 will auto-configure a BGP policy that will import
503	     any ES route matching the auto-derived ES-import RT for ESI12.

505	   o Ethernet Segment route exchange and DF election: PE1 and PE2 will
506	     advertise a BGP Ethernet Segment route for ESI12, where the ES RD
507	     and ES-Import RT will be auto-generated as discussed in section
508	     4.1.1. PE1 and PE2 will import the ES routes of each other and will
509	     run the DF election algorithm for the EVIs existing at this point
510	     (if any). PE3 will simply discard the route. Note that the DF
511	     election algorithm can support service carving, so that the
512	     downstream BUM traffic from the network to CE2 can be load-balanced
513	     across PE1 and PE2 on a per-service basis.
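
	   As an illustration of service carving, the following Python
	   sketch implements the default DF election procedure of
	   [RFC7432]: the PEs attached to the Ethernet Segment are
	   ordered by increasing IP address, and the PE whose ordinal
	   equals (V mod N) is the DF for VLAN V. The function name and
	   the PE addresses are hypothetical, and other DF election
	   algorithms may be used in a real deployment.

```python
import ipaddress
from typing import List


def elect_df(pe_ips: List[str], vlan_id: int) -> str:
    """Return the IP address of the DF for <ES, VLAN> (modulo-based)."""
    if not pe_ips:
        raise ValueError("at least one PE must be attached to the ES")
    # Ordered list of PE IP addresses, in increasing numeric value.
    ordered = sorted(pe_ips, key=lambda ip: int(ipaddress.ip_address(ip)))
    # The PE with ordinal i is the DF when (V mod N) == i.
    return ordered[vlan_id % len(ordered)]


# Hypothetical loopback addresses for PE1 and PE2 on ESI12.
PE1_IP, PE2_IP = "192.0.2.1", "192.0.2.2"
df_per_vlan = {v: elect_df([PE2_IP, PE1_IP], v) for v in range(1, 5)}
```

	   With two PEs in the redundancy group, even VLANs select PE1
	   and odd VLANs select PE2 in this example, so the downstream
	   BUM traffic towards CE2 is shared between both PEs.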

515	   At the end of this process, the network infrastructure is ready to
516	   start deploying EVPN services. PE1 and PE2 are aware of the existence
517	   of a shared Ethernet Segment, i.e. ESI12.

519	6.2. VLAN-based service procedures

521	   Assuming that the EVPN network must carry traffic among CE1, CE2 and
522	   CE3 for up to 4k CE-VIDs, the Service Provider can decide to
523	   implement VLAN-based service interface EVIs to accomplish it. In this
524	   case, each CE-VID will be individually mapped to a different EVI.
525	   While this means a total number of 4k MAC-VRFs is required per PE,
526	   the advantages of this approach are the auto-provisioning of most of
527	   the service parameters if no VLAN translation is needed (see section
528	   4.2.1) and great control over each individual customer broadcast
529	   domain. We assume in this section that the range of EVIs from 1 to 4k
530	   is provisioned in the network.

532	6.2.1. Service startup procedures

534	   As soon as the EVIs are created in PE1, PE2 and PE3, the following
535	   control plane actions are carried out:

537	   o Flooding tree setup per EVI (4k routes): Each PE will send one
538	     Inclusive Multicast Ethernet Tag route per EVI (up to 4k routes per
539	     PE) so that the flooding tree per EVI can be set up. Note that
540	     ingress replication or P2MP LSPs can optionally be signaled in the
541	     PMSI Tunnel attribute and the corresponding tree be created.

543	   o Ethernet A-D routes per ESI (a set of routes for ESI12): A set of
544	     A-D routes with a total list of 4k RTs (one per EVI) for ESI12 will
545	     be issued from PE1 and PE2 (it has to be a set of routes so that
546	     the total number of RTs can be conveyed). As per [RFC7432], each
547	     Ethernet A-D route per ESI is differentiated from the other routes
548	     in the set by a different Route Distinguisher (ES RD). This set
549	     will also include ESI Label extended communities with the active-
550	     standby flag set to zero (all-active multi-homing type) and an ESI
551	     Label different from zero (used for split-horizon functions). These
552	     routes will be imported by the three PEs, since the RTs match the
553	     EVI RTs locally configured. The A-D routes per ESI will be used for
554	     fast convergence and split-horizon functions, as discussed in
555	     [RFC7432].

557	   o Ethernet A-D routes per EVI (4k routes): An A-D route per EVI will
558	     be sent by PE1 and PE2 for ESI12. Each individual route includes
559	     the corresponding EVI RT and an MPLS label to be used by PE3 for
560	     the aliasing function. These routes will be imported by the three
561	     PEs.
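
	   The control plane load of these startup procedures can be
	   estimated with a short Python sketch. The figures are
	   assumptions made for the example: "4k" is taken as 4096 EVIs,
	   and the number of RTs that fit in a single A-D route per ESI
	   (500 here) is a hypothetical, implementation-dependent value.

```python
import math
from typing import Dict


def startup_route_counts(num_evis: int, rts_per_route: int,
                         num_pes: int = 3, num_mh_pes: int = 2) -> Dict[str, int]:
    """Count the EVPN routes advertised network-wide at service startup."""
    return {
        # One Inclusive Multicast Ethernet Tag route per EVI and per PE.
        "inclusive_multicast": num_pes * num_evis,
        # Enough A-D per ESI routes to convey one RT per EVI, sent by
        # each multi-homed PE (PE1 and PE2) for ESI12.
        "ad_per_esi": num_mh_pes * math.ceil(num_evis / rts_per_route),
        # One A-D per EVI route per EVI, sent by each multi-homed PE.
        "ad_per_evi": num_mh_pes * num_evis,
    }


counts = startup_route_counts(num_evis=4096, rts_per_route=500)
```

	   Under these assumptions the A-D routes per ESI are a small
	   fraction of the total, whereas the Inclusive Multicast and A-D
	   per EVI routes grow linearly with the number of EVIs.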

563	6.2.2. Packet walkthrough

565	   Once the services are set up, the traffic can start flowing. Assuming
566	   there are no MAC addresses learned yet and that MAC learning at the
567	   access is performed in the data plane in our use-case, this is the
568	   process followed upon receiving frames from each CE (example for
569	   EVI1).

571	   (1) BUM frame example from CE1:

573	   a) An ARP-request with CE-VID=1 is issued from source MAC CE1-MAC
574	      (MAC address coming from CE1 or from a device connected to CE1) to
575	      find the MAC address of CE3-IP.

577	   b) Based on the CE-VID, the frame is identified to be forwarded in
578	      the MAC-VRF-1 (EVI1) context. The source MAC is looked up in the
579	      MAC FIB and the sender's CE1-IP in the proxy-ARP table, both
580	      within the MAC-VRF-1 (EVI1) context. If CE1-MAC/CE1-IP are unknown
581	      in both tables, three actions are carried out (assuming the source
582	      MAC is accepted by PE1):

584	      (1) Forwarding state is added for CE1-MAC associated to the
585	          corresponding port and CE-VID,

587	      (2) the ARP-request is snooped and the tuple CE1-MAC/CE1-IP is
588	          added to the proxy-ARP table and

590	      (3) a BGP MAC advertisement route is triggered from PE1 containing
591	          the EVI1 RD and RT, ESI=0, Ethernet-Tag=0 and CE1-MAC/CE1-IP
592	          along with an MPLS label assigned to MAC-VRF-1 from the PE1
593	          label space. Note that depending on the implementation, the
594	          MAC FIB and proxy-ARP learning processes can independently
595	          send two BGP MAC advertisements instead of one (one containing
596	          only the CE1-MAC and another one containing CE1-MAC/CE1-IP).

598	      Since we assume a MAC forwarding model, a label per MAC-VRF is
599	      normally allocated and signaled by the three PEs for MAC
600	      advertisement routes. Based on the RT, the route is imported by
601	      PE2 and PE3 and the forwarding state plus ARP entry are added to
602	      their MAC-VRF-1 context. From this moment on, any ARP request from
603	      CE2 or CE3 destined to CE1-IP can be answered directly by PE1, PE2
604	      or PE3, and ARP flooding for CE1-IP is not needed in the core.

606	   c) Since the ARP frame is a broadcast frame, it is forwarded by PE1
607	      using the Inclusive multicast tree for EVI1 (CE-VID=1 tag should
608	      be kept if translation is required). Depending on the type of
609	      tree, the label stack may vary. E.g. assuming ingress replication,
610	      the packet is replicated to PE2 and PE3 with the downstream
611	      allocated labels and the P2P LSP transport labels. No other labels
612	      are added to the stack.

614	   d) Assuming PE1 is the DF for EVI1 on ESI12, the frame is locally
615	      replicated to CE2.

617	   e) The MPLS-encapsulated frame gets to PE2 and PE3. Since PE2 is non-
618	      DF for EVI1 on ESI12, and there is no other CE connected to PE2,
619	      the frame is discarded. At PE3, the frame is de-encapsulated, CE-
620	      VID translated if needed and forwarded to CE3.

622	   Any other type of BUM frame from CE1 would follow the same
623	   procedures. BUM frames from CE3 would follow the same procedures too.
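
   The three learning actions in step b) can be sketched as follows.
   This is a minimal illustration, not an implementation defined by
   [RFC7432]: the MacVrf class, the dictionary-based route and the
   RD/RT/label values are hypothetical, and the sketch treats an entry
   as new whenever it is missing from either table.

```python
from dataclasses import dataclass, field


@dataclass
class MacVrf:
    rd: str
    rt: str
    label: int  # one MPLS label per MAC-VRF (MAC-based forwarding model)
    mac_fib: dict = field(default_factory=dict)    # MAC -> (port, CE-VID)
    proxy_arp: dict = field(default_factory=dict)  # IP -> MAC


def learn_from_arp_request(vrf, src_mac, sender_ip, port, ce_vid, esi=0):
    """Apply actions (1)-(3); return the route to advertise, or None."""
    known = src_mac in vrf.mac_fib and vrf.proxy_arp.get(sender_ip) == src_mac
    if known:
        return None
    vrf.mac_fib[src_mac] = (port, ce_vid)   # (1) forwarding state
    vrf.proxy_arp[sender_ip] = src_mac      # (2) snooped MAC/IP binding
    return {                                # (3) BGP MAC advertisement
        "rd": vrf.rd, "rt": vrf.rt, "esi": esi, "ethernet_tag": 0,
        "mac": src_mac, "ip": sender_ip, "label": vrf.label,
    }


# Hypothetical values for MAC-VRF-1 on PE1.
vrf1 = MacVrf(rd="PE1:1", rt="target:EVI1", label=1001)
route = learn_from_arp_request(vrf1, "CE1-MAC", "CE1-IP",
                               port="port-to-CE1", ce_vid=1)
```

   A second ARP-request from the same CE1-MAC/CE1-IP pair returns None,
   i.e. no new route is triggered.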

625	   (2) BUM frame example from CE2:

627	   a) An ARP-request with CE-VID=1 is issued from source MAC CE2-MAC to
628	      find the MAC address of CE3-IP.

630	   b) CE2 will hash the frame and will forward it to e.g. PE2. Based on
631	      the CE-VID, the frame is identified to be forwarded in the EVI1
632	      context. A source MAC lookup is done in the MAC FIB and the
633	      sender's CE2-IP in the proxy-ARP table within the MAC-VRF-1
634	      context. If both are unknown, three actions are carried out
635	      (assuming the source MAC is accepted by PE2):

637	      (1) Forwarding state is added for CE2-MAC associated to the
638	          corresponding LAG/ESI and CE-VID,

640	      (2) the ARP-request is snooped and the tuple CE2-MAC/CE2-IP is
641	          added to the proxy-ARP table and

643	      (3) a BGP MAC advertisement route is triggered from PE2 containing
644	          the EVI1 RD and RT, ESI=12, Ethernet-Tag=0 and CE2-MAC/CE2-IP
645	          along with an MPLS label assigned from the PE2 label space
646	          (one label per MAC-VRF). Again, depending on the
647	          implementation, the MAC FIB and proxy-ARP learning processes
648	          can independently send two BGP MAC advertisements instead of
649	          one.

651	      Note that, since PE3 is not part of ESI12, it will install
652	      forwarding state for CE2-MAC as long as the A-D routes for ESI12
653	      are also active on PE3. On the contrary, PE1 is part of ESI12,
654	      therefore PE1 will not modify the forwarding state for CE2-MAC if
655	      it has previously learnt CE2-MAC locally attached to ESI12.

657	      Otherwise it will add forwarding state for CE2-MAC associated to
658	      the local ESI12 port.

660	   c) Assuming PE2 does not have the ARP information for CE3-IP yet, and
661	      since the ARP is a broadcast frame and PE2 is the non-DF for EVI1
662	      on ESI12, the frame is forwarded by PE2 in the Inclusive multicast
663	      tree for EVI1, adding the ESI label for ESI12 at the bottom of the
664	      stack. The ESI label has been previously allocated and signaled by
665	      the A-D routes for ESI12. Note that, as per [RFC7432], if the
666	      result of the CE2 hashing is different and the frame is sent to
667	      PE1, PE1 should add the ESI label too (PE1 is the DF for EVI1 on
668	      ESI12).

670	   d) The MPLS-encapsulated frame gets to PE1 and PE3. PE1
671	      de-encapsulates the Inclusive multicast tree label(s) and based on
672	      the ESI label at the bottom of the stack, it decides to not
673	      forward the frame to the ESI12. It will pop the ESI label and will
674	      replicate it to CE1 though, since CE1 is not part of the ESI
675	      identified by the ESI label. At PE3, the Inclusive multicast tree
676	      label is popped and the frame forwarded to CE3. If a P2MP LSP is
677	      used as Inclusive multicast tree for EVI1, PE3 will find an ESI
678	      label after popping the P2MP LSP label. The ESI label will simply
679	      be popped, since CE3 is not part of ESI12.
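
   The egress decisions in the two BUM examples above (DF filtering and
   ESI-label split-horizon) can be sketched as a single function. This
   is an illustrative model only: the function name, the attachment
   circuit names and the representation of an ESI as a string are
   assumptions made for the example.

```python
def local_bum_targets(attachment_circuits, df_esis, source_esi=None):
    """Return the local ACs that receive a BUM frame arriving from the core.

    attachment_circuits: dict of AC name -> ESI (None if single-homed).
    df_esis: set of ESIs for which this PE is the DF.
    source_esi: ESI identified by the received ESI label, if any.
    """
    targets = []
    for ac, esi in attachment_circuits.items():
        if esi is not None and esi == source_esi:
            continue  # split horizon: do not send back to the source ES
        if esi is not None and esi not in df_esis:
            continue  # only the DF forwards BUM to a multi-homed ES
        targets.append(ac)
    return targets


# Hypothetical port names for the three PEs of the use-case.
pe1 = {"to-CE1": None, "lag-to-CE2": "ESI12"}
pe2 = {"lag-to-CE2": "ESI12"}
pe3 = {"to-CE3": None}

# BUM frame from CE2, sent by PE2 with the ESI label for ESI12:
from_ce2_at_pe1 = local_bum_targets(pe1, {"ESI12"}, source_esi="ESI12")
from_ce2_at_pe3 = local_bum_targets(pe3, set(), source_esi="ESI12")
# BUM frame from CE1, sent by PE1 without ESI label, arriving at PE2:
from_ce1_at_pe2 = local_bum_targets(pe2, set())
```

   With PE1 as the DF for EVI1 on ESI12, the first call yields only the
   port to CE1, the second yields the port to CE3, and the third yields
   no ports, matching the walkthrough.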

681	   (3) Unicast frame example from CE3 to CE1:

683	   a) A unicast frame with CE-VID=1 is issued from source MAC CE3-MAC
684	      and destination MAC CE1-MAC (we assume PE3 has previously resolved
685	      an ARP request from CE3 to find the MAC of CE1-IP, and has added
686	      CE3-MAC/CE3-IP to its proxy-ARP table).

688	   b) Based on the CE-VID, the frame is identified to be forwarded in
689	      the EVI1 context. A source MAC lookup is done in the MAC FIB
690	      within the MAC-VRF-1 context and this time, since we assume CE3-
691	      MAC is known, no further actions are carried out as a result of
692	      the source lookup. A destination MAC lookup is performed next and
693	      the label stack associated to the MAC CE1-MAC is found (including
694	      the label associated to MAC-VRF-1 in PE1 and the P2P LSP label to
695	      get to PE1). The unicast frame is then encapsulated and forwarded
696	      to PE1.

698	   c) At PE1, the packet is identified to be part of EVI1 and a
699	      destination MAC lookup is performed in the MAC-VRF-1 context. The
700	      labels are popped and the frame forwarded to CE1 with CE-VID=1.

702	      Unicast frames from CE1 to CE3 or from CE2 to CE3 follow the same
703	      procedures described above.

705	   (4) Unicast frame example from CE3 to CE2:

707	   a) A unicast frame with CE-VID=1 is issued from source MAC CE3-MAC
708	      and destination MAC CE2-MAC (we assume PE3 has previously resolved
709	      an ARP request from CE3 to find the MAC of CE2-IP).

711	   b) Based on the CE-VID, the frame is identified to be forwarded in
712	      the MAC-VRF-1 context. We assume CE3-MAC is known. A destination
713	      MAC lookup is performed next and PE3 finds CE2-MAC associated to
714	      PE2 on ESI12, an Ethernet Segment for which PE3 has two active A-D
715	      routes per ESI (from PE1 and PE2) and two active A-D routes for
716	      EVI1 (from PE1 and PE2). Based on a hashing function for the
717	      frame, PE3 may decide to forward the frame using the label stack
718	      associated to PE2 (label received from the MAC advertisement
719	      route) or the label stack associated to PE1 (label received from
720	      the A-D route per EVI for EVI1). Either way, the frame is
721	      encapsulated and sent to the remote PE.

723	   c) At PE2 (or PE1), the packet is identified to be part of EVI1 based
724	      on the bottom label, and a destination MAC lookup is performed. At
725	      either PE (PE2 or PE1), the FIB lookup yields a local ESI12 port
726	      to which the frame is sent.

728	   Unicast frames from CE1 to CE2 follow the same procedures.
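
   The aliasing decision taken by PE3 in example (4) can be sketched as
   below. The sketch assumes a CRC32 hash over an opaque flow key purely
   for illustration; the actual hashing function is implementation
   specific, and the label values and function names are hypothetical.

```python
import zlib


def remote_next_hops(mac_route, ad_per_esi_pes, ad_per_evi_labels):
    """Build the (PE, service label) candidates for a MAC on a remote ES.

    mac_route: the MAC advertisement route (advertising PE and its label).
    ad_per_esi_pes: PEs with an active A-D route per ESI for that ES.
    ad_per_evi_labels: PE -> label from its A-D route per EVI.
    """
    hops = []
    for pe in sorted(ad_per_esi_pes):
        if pe == mac_route["pe"]:
            hops.append((pe, mac_route["label"]))      # label from MAC route
        elif pe in ad_per_evi_labels:
            hops.append((pe, ad_per_evi_labels[pe]))   # aliasing label
    return hops


def pick_next_hop(hops, flow_key):
    """Hash the flow key (bytes) onto one candidate; None if no candidate."""
    if not hops:
        return None
    return hops[zlib.crc32(flow_key) % len(hops)]


mac_route = {"mac": "CE2-MAC", "pe": "PE2", "esi": "ESI12", "label": 1002}
hops = remote_next_hops(mac_route, {"PE1", "PE2"}, {"PE1": 2001, "PE2": 2002})
chosen = pick_next_hop(hops, b"CE3-MAC>CE2-MAC")
```

   If PE1 withdraws its A-D route per ESI, the candidate list shrinks to
   PE2 alone, which is the behavior the A-D routes per ESI are meant to
   provide for fast convergence.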

730	6.3. VLAN-bundle service procedures

732	   Instead of using VLAN-based interfaces, the Operator can choose to
733	   implement VLAN-bundle interfaces to carry the traffic for the 4k CE-
734	   VIDs among CE1, CE2 and CE3. If that is the case, the 4k CE-VIDs can
735	   be mapped to the same EVI, e.g. EVI200, at each PE. The main
736	   advantage of this approach is the low control plane overhead (reduced
737	   number of routes and labels) and ease of provisioning, at the
738	   expense of no control over the customer broadcast domains, i.e. a
739	   single inclusive multicast tree for all the CE-VIDs and no CE-VID
740	   translation in the Provider network.

742	6.3.1. Service startup procedures

744	   As soon as the EVI200 is created in PE1, PE2 and PE3, the following
745	   control plane actions are carried out:

747	   o Flooding tree setup per EVI (one route): Each PE will send one
748	      Inclusive Multicast Ethernet Tag route per EVI (hence only one
749	      route per PE) so that the flooding tree per EVI can be set up.
750	      Note that ingress replication or P2MP LSPs can optionally be
751	      signaled in the PMSI Tunnel attribute and the corresponding tree
752	      be created.

754	   o Ethernet A-D routes per ESI (one route for ESI12): A single A-D
755	      route for ESI12 will be issued from PE1 and PE2. This route will
756	      include a single RT (RT for EVI200), an ESI Label extended
757	      community with the active-standby flag set to zero (all-active
758	      multi-homing type) and an ESI Label different from zero (used by
759	      the non-DF for split-horizon functions). This route will be
760	      imported by the three PEs, since the RT matches the EVI200 RT
761	      locally configured. The A-D routes per ESI will be used for fast
762	      convergence and split-horizon functions, as described in
763	      [RFC7432].

765	   o Ethernet A-D routes per EVI (one route): An A-D route (EVI200) will
766	      be sent by PE1 and PE2 for ESI12. This route includes the EVI200
767	      RT and an MPLS label to be used by PE3 for the aliasing function.
768	      This route will be imported by the three PEs.

770	6.3.2. Packet Walkthrough

772	   The packet walkthrough for the VLAN-bundle case is similar to the one
773	   described for EVI1 in the VLAN-based case except for the way the
774	   CE-VID is handled by the ingress PE and the egress PE:

776	   o No VLAN translation is allowed and the CE-VIDs are kept untouched
777	      from CE to CE, i.e. the ingress CE-VID must be kept at the
778	      imposition PE and at the disposition PE.

780	   o The frame is identified to be forwarded in the MAC-VRF-200 context
781	      as long as its CE-VID belongs to the VLAN-bundle defined in the
782	      PE1/PE2/PE3 port to CE1/CE2/CE3. Our example is a special VLAN-
783	      bundle case, since the entire CE-VID range is defined in the
784	      ports, therefore any CE-VID would be part of EVI200.

786	   Please refer to section 6.2.2 for more information about the control
787	   plane and forwarding plane interaction for BUM and unicast traffic
788	   from the different CEs.

790	6.4. VLAN-aware bundling service procedures

792	   The last potential service type analyzed in this document is
793	   VLAN-aware bundling. When this type of service interface is used to
794	   carry the 4k CE-VIDs among CE1, CE2 and CE3, all the CE-VIDs will be
795	   mapped to the same EVI, e.g. EVI300. The difference, compared to the
796	   VLAN-bundle service type in the previous section, is that each
797	   incoming CE-VID will also be mapped to a different "normalized"
798	   Ethernet-Tag in addition to EVI300. If no translation is required,
799	   the Ethernet-tag will match the CE-VID. Otherwise a translation
800	   between CE-VID and Ethernet-tag will be needed at the imposition PE
801	   and at the disposition PE. The main advantage of this approach is the
802	   ability to control customer broadcast domains while providing a
803	   single EVI to the customer.

805	6.4.1. Service startup procedures

807	   As soon as the EVI300 is created in PE1, PE2 and PE3, the following
808	   control plane actions are carried out:

810	   o Flooding tree setup per EVI per Ethernet-Tag (4k routes): Each PE
811	      will send one Inclusive Multicast Ethernet Tag route per EVI and
812	      per Ethernet-Tag (hence 4k routes per PE) so that the flooding
813	      tree per customer broadcast domain can be set up. Note that
814	      ingress replication or P2MP LSPs can optionally be signaled in the
815	      PMSI Tunnel attribute and the corresponding tree be created. In
816	      the described use-case, since all the CE-VIDs and Ethernet-Tags
817	      are defined on the three PEs, multicast tree aggregation might
818	      make sense in order to save forwarding states.

820	   o Ethernet A-D routes per ESI (one route for ESI12): A single A-D
821	      route for ESI12 will be issued from PE1 and PE2. This route will
822	      include a single RT (RT for EVI300), an ESI Label extended
823	      community with the active-standby flag set to zero (all-active
824	      multi-homing type) and an ESI Label different from zero (used by
825	      the non-DF for split-horizon functions). This route will be
826	      imported by the three PEs, since the RT matches the EVI300 RT
827	      locally configured. The A-D routes per ESI will be used for fast
828	      convergence and split-horizon functions, as described in
829	      [RFC7432].

831	   o Ethernet A-D routes per EVI: a single A-D route (EVI300) may be
832	      sent by PE1 and PE2 for ESI12, in case no CE-VID translation is
833	      required. This route includes the EVI300 RT and an MPLS label to
834	      be used by PE3 for the aliasing function. This route will be
835	      imported by the three PEs. Note that if CE-VID translation is
836	      required, an A-D per EVI route is required per Ethernet-Tag (4k).
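
   The startup route counts for the three service interface types
   (sections 6.2.1, 6.3.1 and 6.4.1) can be compared with a small
   helper. This is a sketch under assumptions: "4k" is taken as 4096
   for illustration, the counts are per PE attached to ESI12 (PE1 or
   PE2), and None stands for "a set of routes" whose size depends on
   how many RTs fit in each route.

```python
N_VIDS = 4096  # "4k" CE-VIDs; the exact figure is an assumption


def startup_routes_per_pe(service, ce_vid_translation=True, n=N_VIDS):
    """Routes sent at service startup by a PE attached to ESI12."""
    if service == "vlan-based":
        # One EVI per CE-VID; A-D per ESI is a set of routes (size varies).
        return {"imet": n, "ad_per_esi": None, "ad_per_evi": n}
    if service == "vlan-bundle":
        # All CE-VIDs in a single EVI, no Ethernet-Tag.
        return {"imet": 1, "ad_per_esi": 1, "ad_per_evi": 1}
    if service == "vlan-aware":
        # Single EVI, one broadcast domain per Ethernet-Tag.
        return {"imet": n, "ad_per_esi": 1,
                "ad_per_evi": n if ce_vid_translation else 1}
    raise ValueError(f"unknown service type: {service}")


bundle = startup_routes_per_pe("vlan-bundle")
aware = startup_routes_per_pe("vlan-aware", ce_vid_translation=False)
```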

838	6.4.2. Packet Walkthrough

840	   The packet walkthrough for the VLAN-aware case is similar to the one
841	   described before. Compared to the other two cases, VLAN-aware
842	   services allow for CE-VID translation and for an N:1 CE-VID to EVI
843	   mapping. Neither of the other two service interfaces supports both
844	   at once. Some differences compared to the packet walkthrough
845	   described in section 6.2.2 are:

847	   o At the ingress PE, the frames are identified to be forwarded in the
848	      EVI300 context as long as their CE-VID belongs to the range
849	      defined in the PE port to the CE. In addition, CE-VID=x is mapped
850	      to a "normalized" Ethernet-Tag=y at the MAC-VRF-300 (where x and y
851	      might be equal if no translation is needed). Qualified learning is
852	      now required (a different Bridge Table is allocated within MAC-
853	      VRF-300 for each Ethernet-Tag). Potentially the same MAC could be
854	      learned in two different Ethernet-Tag Bridge Tables of the same
855	      MAC-VRF.

857	   o Any new locally learned MAC on the MAC-VRF-300/Ethernet-Tag=y
858	      interface is advertised by the ingress PE in a MAC advertisement
859	      route, using now the Ethernet-Tag field (Ethernet-Tag=y) so that
860	      the remote PE learns the MAC associated to the MAC-VRF-
861	      300/Ethernet-Tag=y FIB. Note that the Ethernet-Tag field is not
862	      used in advertisements of MACs learned on VLAN-based or VLAN-
863	      bundle service interfaces.

865	   o At the ingress PE, BUM frames are sent to the corresponding
866	      flooding tree for the particular Ethernet-Tag they are mapped to.
867	      Each individual Ethernet-Tag can have a different flooding tree
868	      within the same EVI300. For instance, Ethernet-Tag=y can use
869	      ingress replication to get to the remote PEs whereas Ethernet-
870	      Tag=z can use a P2MP LSP.

872	   o At the egress PE, Ethernet-Tag=y, for a given broadcast domain
873	      within MAC-VRF-300, can be translated to egress CE-VID=x. That is
874	      not possible for VLAN-bundle interfaces. It is possible for VLAN-
875	      based interfaces, but it requires a separate MAC-VRF per CE-VID.
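
   The CE-VID normalization and qualified learning described in the
   list above can be sketched as follows. The class name, the example
   CE-VID to Ethernet-Tag mapping and the port names are hypothetical;
   the sketch only illustrates one Bridge Table per Ethernet-Tag within
   the same MAC-VRF.

```python
class VlanAwareMacVrf:
    def __init__(self, vid_to_tag):
        self.vid_to_tag = dict(vid_to_tag)          # CE-VID=x -> Ethernet-Tag=y
        self.tag_to_vid = {t: v for v, t in self.vid_to_tag.items()}
        # Qualified learning: one Bridge Table per Ethernet-Tag.
        self.bridge_tables = {t: {} for t in self.tag_to_vid}

    def learn_local(self, ce_vid, mac, port):
        """Learn a local MAC; return the fields for the MAC advertisement."""
        tag = self.vid_to_tag[ce_vid]
        self.bridge_tables[tag][mac] = port
        return {"ethernet_tag": tag, "mac": mac}

    def egress_ce_vid(self, ethernet_tag):
        """Translate the normalized Ethernet-Tag back to the local CE-VID."""
        return self.tag_to_vid[ethernet_tag]


# Hypothetical mapping: CE-VID 10 -> Ethernet-Tag 100, 20 -> 200.
vrf300 = VlanAwareMacVrf({10: 100, 20: 200})
r1 = vrf300.learn_local(10, "MAC-A", "port-1")
r2 = vrf300.learn_local(20, "MAC-A", "port-2")
```

   The same MAC ends up in two different Bridge Tables and is
   advertised twice, once per Ethernet-Tag.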

877	7. MPLS-based forwarding model use-case

879	   EVPN supports an alternative forwarding model, usually referred to as
880	   MPLS-based forwarding or disposition model as opposed to the
881	   MAC-based forwarding or disposition model described in section 6.
882	   Using the MPLS-based forwarding model instead of the MAC-based model
883	   might have an impact on:

885	   o The number of forwarding states required.

887	   o The FIB where the forwarding states are handled: MAC FIB or MPLS
888	      LFIB.

890	   The MPLS-based forwarding model avoids the destination MAC lookup at
891	   the egress PE MAC FIB, at the expense of increasing the number of
892	   next-hop forwarding states at the egress MPLS LFIB. This also has an
893	   impact on the control plane and the label allocation model, since an
894	   MPLS-based disposition PE must send as many routes and labels as
895	   required next-hops in the egress MAC-VRF. This concept is equivalent
896	   to the forwarding models supported in IP-VPNs at the egress PE, where
897	   an IP lookup in the IP-VPN FIB might be necessary or not depending on
898	   the available next-hop forwarding states in the LFIB.

900	   The following sub-sections highlight the impact on the control and
901	   data plane procedures described in section 6 when an MPLS-based
902	   forwarding model is used.

904	   Note that both forwarding models are compatible and interoperable in
905	   the same network. The implementation of either model in each PE is a
906	   decision local to the PE node.

908	7.1. Impact of MPLS-based forwarding on the EVPN network startup

910	   The MPLS-based forwarding model has no impact on the procedures
911	   explained in section 6.1.

913	7.2. Impact of MPLS-based forwarding on the VLAN-based service
914	   procedures

916	   Compared to the MAC-based forwarding model, the MPLS-based forwarding
917	   model has no impact in terms of number of routes, when all the
918	   service interfaces are VLAN-based. The differences for the use-case
919	   described in this document are summarized in the following list:

921	   o Flooding tree setup per EVI (4k routes per PE): no impact compared
922	     to the MAC-based model.

924	   o Ethernet A-D routes per ESI (one set of routes for ESI12 per PE):
925	     no impact compared to the MAC-based model.

927	   o Ethernet A-D routes per EVI (4k routes per PE/ESI): no impact
928	     compared to the MAC-based model.

930	   o MAC-advertisement routes: instead of allocating and advertising the
931	     same MPLS label for all the new MACs locally learnt on the same
932	     MAC-VRF, a different label must be advertised per CE next-hop or
933	     MAC so that no MAC FIB lookup is needed at the egress PE. In
934	     general, this means that a different label at least per CE must be
935	     advertised, although the PE can decide to implement a label per MAC
936	     if more granularity (hence less scalability) is required in terms
937	     of forwarding states. E.g. if CE2 sends traffic from two different
938	     MACs to PE1, CE2-MAC1 and CE2-MAC2, the same MPLS label=x can be
939	     re-used for both MAC advertisements since they both share the same
940	     source ESI12. It is up to the PE1 implementation to use a different
941	     label per individual MAC within the same Ethernet Segment (even if
942	     only one label per ESI is enough).

944	   o PE1, PE2 and PE3 will not add forwarding states to the MAC FIB upon
945	     learning new local CE MAC addresses on the data plane, but will
946	     rather add forwarding states to the MPLS LFIB.
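
   The label allocation options discussed above can be contrasted with
   a small allocator. This is a sketch only: the class, the mode names
   and the starting label value are hypothetical, and the "CE next-hop"
   is identified here by an opaque string such as the local ESI or
   port.

```python
import itertools


class LabelAllocator:
    """Allocate service labels per MAC-VRF, per CE next-hop or per MAC."""

    def __init__(self, mode, first_label=1000):
        self.mode = mode  # "per-vrf" (MAC-based), "per-ce" or "per-mac"
        self._next = itertools.count(first_label)
        self._labels = {}

    def label_for(self, mac_vrf, ce_next_hop, mac):
        key = {
            "per-vrf": (mac_vrf,),
            "per-ce": (mac_vrf, ce_next_hop),
            "per-mac": (mac_vrf, ce_next_hop, mac),
        }[self.mode]
        if key not in self._labels:
            self._labels[key] = next(self._next)
        return self._labels[key]


per_ce = LabelAllocator("per-ce")
x1 = per_ce.label_for("MAC-VRF-1", "ESI12", "CE2-MAC1")
x2 = per_ce.label_for("MAC-VRF-1", "ESI12", "CE2-MAC2")
y = per_ce.label_for("MAC-VRF-1", "port-to-CE1", "CE1-MAC")
```

   In "per-ce" mode CE2-MAC1 and CE2-MAC2 share one label because they
   share ESI12, while CE1-MAC gets a different one; "per-vrf" mode
   corresponds to the MAC-based model of section 6.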

948	7.3. Impact of MPLS-based forwarding on the VLAN-bundle service
949	     procedures

951	   Compared to the MAC-based forwarding model, the MPLS-based forwarding
952	   model has no impact in terms of number of routes when all the service
953	   interfaces are VLAN-bundle type. The differences for the use-case
954	   described in this document are summarized in the following list:

956	   o Flooding tree setup per EVI (one route): no impact compared to the
957	     MAC-based model.

959	   o Ethernet A-D routes per ESI (one route for ESI12 per PE): no impact
960	     compared to the MAC-based model.

962	   o Ethernet A-D routes per EVI (one route per PE/ESI): no impact
963	     compared to the MAC-based model since no VLAN translation is
964	     required.

966	   o MAC-advertisement routes: instead of allocating and advertising the
967	     same MPLS label for all the new MACs locally learnt on the same
968	     MAC-VRF, a different label must be advertised per CE next-hop or
969	     MAC so that no MAC FIB lookup is needed at the egress PE. In
970	     general, this means that a different label at least per CE must be
971	     advertised, although the PE can decide to implement a label per MAC
972	     if more granularity (hence less scalability) is required in terms
973	     of forwarding states. It is up to the PE1 implementation to use a
974	     different label per individual MAC within the same Ethernet Segment
975	     (even if only one label per ESI is enough).

977	   o PE1, PE2 and PE3 will not add forwarding states to the MAC FIB upon
978	     learning new local CE MAC addresses on the data plane, but will
979	     rather add forwarding states to the MPLS LFIB.

981	7.4. Impact of MPLS-based forwarding on the VLAN-aware service
982	     procedures

984	   Compared to the MAC-based forwarding model, the MPLS-based forwarding
985	   model has no impact in terms of number of A-D routes when all the
986	   service interfaces are VLAN-aware bundle type. The differences for
987	   the use-case described in this document are summarized in the
988	   following list:

990	   o Flooding tree setup per EVI (4k routes per PE): no impact compared
991	     to the MAC-based model.

993	   o Ethernet A-D routes per ESI (one route for ESI12 per PE): no impact
994	     compared to the MAC-based model.

996	   o Ethernet A-D routes per EVI (1 route per ESI or 4k routes per
997	     PE/ESI): PE1 and PE2 may send one route per ESI if no CE-VID
998	     translation is needed. However, 4k routes are normally sent for
999	     EVI300, one per <ESI, Ethernet-Tag ID> tuple. This allows the
1000	     egress PE to find all the forwarding information in the MPLS LFIB
1001	     and even support Ethernet-Tag to CE-VID translation at the egress.

1003	   o MAC-advertisement routes: instead of allocating and advertising the
1004	     same MPLS label for all the new MACs locally learnt on the same
1005	     MAC-VRF, a different label must be advertised per CE next-hop or
1006	     MAC so that no MAC FIB lookup is needed at the egress PE. In
1007	     general, this means that a different label at least per CE must be
1008	     advertised, although the PE can decide to implement a label per MAC
1009	     if more granularity (hence less scalability) is required in terms
1010	     of forwarding states. It is up to the PE1 implementation to use a
1011	     different label per individual MAC within the same Ethernet
1012	     Segment. Note that the Ethernet-Tag will be set to a non-zero value
1013	     for the MAC-advertisement routes. The same MAC address can be
1014	     announced with different Ethernet-Tag values; the advertising PE
1015	     then installs two different forwarding states in the MPLS LFIB.

1017	   o PE1, PE2 and PE3 will not add forwarding states to the MAC FIB upon
1018	     learning new local CE MAC addresses on the data plane, but will
1019	     rather add forwarding states to the MPLS LFIB.

1021	8. Comparison between MAC-based and MPLS-based Egress Forwarding Models

1023	   Both forwarding models are possible in a network deployment and each
1024	   one has its own trade-offs.

1026	   Both forwarding models can save A-D routes per EVI when VLAN-aware
1027	   bundling services are deployed and no CE-VID translation is required.
1028	   While this saves a significant number of routes, customers normally
1029	   require CE-VID translation, hence we assume an A-D per EVI route per
1030	   <ESI, Ethernet-Tag> is needed.

1032	   The MAC-based model saves a significant number of MPLS labels
1033	   compared to the MPLS-based forwarding model. All the MACs and A-D
1034	   routes for the same EVI can signal the same MPLS label, saving labels
1035	   from the local PE space. A MAC FIB lookup at the egress PE is
1036	   required in order to do so.

1038	   The MPLS-based forwarding model can save forwarding states at the
1039	   egress PEs if labels per next hop CE (as opposed to per MAC) are
1040	   implemented. No egress MAC lookup is required. Also, a different
1041	   label per next-hop CE per MAC-VRF is consumed, as opposed to a single
1042	   label per MAC-VRF.

1044	   The following table summarizes the implementation details of both
1045	   models.

1047	    +-----------------------------+----------------+----------------+
1048	    |  4k CE-VID VLANs            | MAC-based      | MPLS-based     |
1049	    |                             | Model          | Model          |
1050	    +-----------------------------+----------------+----------------+
1051	    | MPLS labels consumed        | 1 per MAC-VRF  | 1 per CE/EVI   |
1052	    | Egress PE Forwarding states | 1 per MAC      | 1 per next-hop |
1053	    | Egress PE Lookups           | 2 (MPLS+MAC)   | 1 (MPLS)       |
1054	    +-----------------------------+----------------+----------------+

1056	   The egress forwarding model is an implementation local to the egress
1057	   PE and is independent of the model supported on the rest of the PEs,
1058	   i.e. in our use-case, PE1, PE2 and PE3 could have either egress
1059	   forwarding model without any dependencies.

1061	9. Traffic flow optimization

1063	   In addition to the procedures described across sections 3 through 8,
1064	   EVPN [RFC7432] procedures allow for optimized traffic handling in
1065	   order to minimize unnecessary flooding across the entire
1066	   infrastructure. Optimization is provided through specific ARP
1067	   termination and the ability to block unknown unicast flooding.
1068	   Additionally, EVPN procedures allow for intelligent, close to the
1069	   source, inter-subnet forwarding and solve the commonly known sub-
1070	   optimal routing problem. Besides traffic efficiency, ingress-based
1071	   inter-subnet forwarding also optimizes packet forwarding rules and
1072	   implementation at the egress nodes. Details of these procedures are
1073	   outlined in sections 9.1 and 9.2.

1075	9.1. Control Plane Procedures

1077	9.1.1. MAC learning options

1079	   The fundamental premise of [RFC7432] is the notion of a different
1080	   approach to MAC address learning compared to traditional IEEE 802.1
1081	   bridge learning methods; specifically EVPN differentiates between
1082	   data and control plane driven learning mechanisms.

1084	   Data driven learning implies that there is no separate communication
1085	   channel used to advertise and propagate MAC addresses. Rather, MAC
1086	   addresses are learned through IEEE defined bridge-learning procedures
1087	   as well as by snooping on DHCP and ARP requests. As different MAC
1088	   addresses show up on different ports, the L2 FIB is populated with
1089	   the appropriate MAC addresses.

1091	   Control plane driven learning implies a communication channel that
1092	   could be either a control-plane protocol or a management-plane
1093	   mechanism. In the context of EVPN, two different learning procedures
1094	   are defined, i.e. local and remote procedures:

1096	   o  Local learning defines the procedures used for learning the MAC
1097	      addresses of network elements locally connected to a MAC-VRF.
1098	      Local learning could be implemented through all three learning
1099	      procedures: control plane, management plane as well as data plane.
1100	      However, the expectation is that for most of the use cases, local
1101	      learning through data plane should be sufficient.

1103	   o  Remote learning defines the procedures used for learning MAC
1104	      addresses of network elements remotely connected to a MAC-VRF,
1105	      i.e. far-end PEs. Remote learning procedures defined in [RFC7432]
1106	      advocate using only control plane learning; specifically BGP.
1107	      Through the use of BGP EVPN NLRIs, the remote PE has the
1108	      capability of advertising all the MAC addresses present in its
1109	      local FIB.

1111	9.1.2. Proxy-ARP/ND

1113	   In EVPN, MAC addresses are advertised via the MAC/IP Advertisement
1114	   Route, as discussed in [RFC7432]. Optionally an IP address can be
1115	   advertised along with the MAC address advertisement. However, there
1116	   are certain rules put in place in terms of IP address usage: if the
1117	   MAC/IP Route contains an IP address, this particular IP address
1118	   correlates directly with the advertised MAC address. Such
1119	   advertisement allows us to build a proxy-ARP/ND table populated with
1120	   the IP<->MAC bindings received from all the remote nodes.

1122	   Furthermore, based on these bindings, a local MAC-VRF can now provide
1123	   Proxy-ARP/ND functionality for all ARP requests and ND solicitations
1124	   directed to the IP address pool learned through BGP. Therefore, the
1125	   amount of unnecessary L2 flooding, ARP/ND requests/solicitations in
1126	   this case, can be further reduced by the introduction of Proxy-ARP/ND
1127	   functionality across all EVI MAC-VRFs.
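
   The proxy-ARP/ND behavior described above can be sketched with a
   simple table keyed by IP address. The function names and the return
   values are hypothetical; the sketch only shows that a MAC/IP route
   carrying an IP address populates the table and that a hit avoids
   flooding.

```python
def install_mac_ip_route(proxy_table, mac, ip=None):
    """Record the IP<->MAC binding of a received MAC/IP route, if any."""
    if ip is not None:  # the IP address is optional in the route
        proxy_table[ip] = mac


def handle_request(proxy_table, target_ip):
    """Answer an ARP request or ND solicitation locally when possible."""
    mac = proxy_table.get(target_ip)
    if mac is not None:
        return ("reply", mac)   # answered by the local PE, no flooding
    return ("flood", None)      # unknown binding: regular BUM handling


table = {}
install_mac_ip_route(table, "CE3-MAC", "CE3-IP")
install_mac_ip_route(table, "CE1-MAC")  # MAC-only route, no binding
known = handle_request(table, "CE3-IP")
unknown = handle_request(table, "CE1-IP")
```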

1129	9.1.3. Unknown Unicast flooding suppression

1131	   Given that all locally learned MAC addresses are advertised through
1132	   BGP to all remote PEs, suppressing flooding of any Unknown Unicast
1133	   traffic towards the remote PEs is a feasible network optimization.

1135	   The use case assumes that any network device that appears on a
1136	   remote MAC-VRF will somehow signal its presence to the network.
1137	   This signaling can be done through e.g. gratuitous ARPs.
1138	   Once the remote PE acknowledges the presence of the node in the MAC-
1139	   VRF, it will do two things: install its MAC address in its local FIB
1140	   and advertise this MAC address to all other BGP speakers via EVPN
1141	   NLRI. Therefore, we can assume that any active MAC address is
1142	   propagated and learnt through the entire EVI. Given that MAC
1143	   addresses become pre-populated - once nodes are alive on the network
1144	   - there is no need to flood any unknown unicast towards the remote
1145	   PEs. If the owner of a given destination MAC is active, the BGP route
1146	   will be present in the local RIB and FIB, assuming that the BGP
1147	   import policies are successfully applied; otherwise, the owner of
1148	   such destination MAC is not present on the network.

1150	   It is worth noting that unless: a) control or management plane
1151	   learning is performed through the entire EVI or b) all the EVI-
1152	   attached devices signal their presence when they come up (GARPs or
1153	   similar), unknown unicast flooding must be enabled.
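
   The conditions above can be expressed as a small decision sketch. It
   models only the core-facing behavior for a unicast frame; the
   function names and return values are assumptions made for the
   example.

```python
def suppression_is_safe(control_or_mgmt_learning_everywhere,
                        all_devices_announce):
    """Flooding may be suppressed only if condition a) or b) holds."""
    return control_or_mgmt_learning_everywhere or all_devices_announce


def core_action(mac_fib, dst_mac, suppress_unknown_unicast):
    """Decide how a unicast frame is handled towards the remote PEs."""
    next_hop = mac_fib.get(dst_mac)
    if next_hop is not None:
        return ("unicast", next_hop)   # MAC learned through BGP
    if suppress_unknown_unicast:
        return ("suppress", None)      # owner assumed not present
    return ("flood", None)             # unknown unicast flooding enabled


fib = {"CE1-MAC": "PE1"}
suppress = suppression_is_safe(False, True)  # all devices send GARPs
hit = core_action(fib, "CE1-MAC", suppress)
miss = core_action(fib, "CE9-MAC", suppress)
```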

1155	9.1.4. Optimization of Inter-subnet forwarding

1157	   In a scenario in which both L2 and L3 services are needed over the
1158	   same physical topology, some interaction between EVPN and IP-VPN is
1159	   required. A common way of stitching the two service planes is through
1160	   the use of an IRB interface, which allows for traffic to be either
1161	   routed or bridged depending on its destination MAC address. If the
1162	   destination MAC address is the one of the IRB interface, traffic
1163	   needs to be passed through a routing module and potentially be either
1164	   routed to a remote PE or forwarded to a local subnet. If the
1165	   destination MAC address is not the one of the IRB, the MAC-VRF
1166	   follows standard bridging procedures.
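
   The route-or-bridge decision described above can be sketched as
   follows. This is a minimal illustration, not a definitive
   implementation: the MacVrf class, its fields and the classify_frame
   function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MacVrf:
    """Hypothetical, simplified MAC-VRF with one attached IRB interface."""
    name: str
    irb_mac: str
    # Assumed FIB shape: destination MAC -> next hop (unused in this sketch).
    fib: Dict[str, str] = field(default_factory=dict)


def classify_frame(mac_vrf: MacVrf, dst_mac: str) -> str:
    """Return "route" if the frame is addressed to the IRB MAC, else "bridge"."""
    if dst_mac.lower() == mac_vrf.irb_mac.lower():
        # Frame is addressed to the IRB: hand it to the routing module.
        return "route"
    # Any other destination follows standard bridging in the MAC-VRF.
    return "bridge"
```

   A frame classified as "route" may still end up on a local subnet or
   be sent to a remote PE; that second decision is outside this sketch.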

1168	   A typical example of EVPN inter-subnet forwarding would be a scenario
1169	   in which multiple IP subnets are part of a single or multiple EVIs,
1170	   and they all belong to a single IP-VPN. In such topologies, it is
1171	   desired that inter-subnet traffic can be efficiently routed without
1172	   any tromboning effects in the network. Due to the overlapping
1173	   physical and service topology in such scenarios, all inter-subnet
1174	   connectivity will be locally routed through the IRB interface.

1176	   In addition to optimizing the traffic patterns in the network, local
1177	   inter-subnet forwarding also greatly reduces the amount of
1178	   processing needed to cross the subnets. Through EVPN MAC
1179	   advertisements, the local PE learns the real destination MAC address
1180	   associated with the remote IP address and the inter-subnet forwarding
1181	   can happen locally. When the packet is received at the egress PE, it
1182	   is directly mapped to an egress MAC-VRF, bypassing any egress IP-VPN
1183	   processing.

1185	   Please refer to [EVPN-INTERSUBNET] for more information about the IP
1186	   inter-subnet forwarding procedures in EVPN.

1188	9.2. Packet Walkthrough Examples

1190	   Assuming that the services are set up according to Figure 1 in
1191	   section 3, the following flow optimization processes will take place
1192	   in terms of creating, receiving and forwarding packets across the network.

1194	9.2.1. Proxy-ARP example for CE2 to CE3 traffic

1196	   Using Figure 1 in section 3, consider EVI 400 residing on PE1, PE2
1197	   and PE3 connecting CE2 and CE3 networks. Also, consider that PE1 and
1198	   PE2 are part of the all-active multi-homing ES for CE2, and that PE2
1199	   is elected designated-forwarder for EVI 400. We assume that all the
1200	   PEs implement the proxy-ARP functionality in the MAC-VRF-400 context.

1202	   In this scenario, PE3 will not only advertise the MAC addresses
1203	   through the EVPN MAC Advertisement Route but also IP addresses of
1204	   individual hosts, i.e. /32 prefixes, behind CE3. Upon receiving the
1205	   EVPN routes, PE1 and PE2 will install the MAC addresses in the MAC-
1206	   VRF-400 FIB and based on the associated received IP addresses, PE1
1207	   and PE2 can now build a proxy-ARP table within the context of MAC-
1208	   VRF-400.

1210	   From the forwarding perspective, when a node behind CE2 sends a frame
1211	   destined to a node behind CE3, it will first send an ARP request to
1212	   e.g. PE2 (based on the result of the CE2 hashing). Assuming that PE2
1213	   has populated its proxy-ARP table for all active nodes behind the
1214	   CE3, and that the IP address in the ARP message matches the entry in
1215	   the table, PE2 will respond to the ARP request with the actual MAC
1216	   address on behalf of the node behind CE3.

1218	   Once the nodes behind CE2 learn the actual MAC address of the nodes
1219	   behind CE3, all the MAC-to-MAC communications between the two
1220	   networks will be unicast.
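
   The proxy-ARP behavior in this example can be sketched as follows.
   This is a minimal illustration under stated assumptions: the
   ProxyArpTable class and its method names are hypothetical, and the
   handling of a table miss is not defined here, so the sketch simply
   returns None in that case.

```python
from typing import Dict, Optional


class ProxyArpTable:
    """Hypothetical per-MAC-VRF proxy-ARP table keyed by host IP address."""

    def __init__(self) -> None:
        self._ip_to_mac: Dict[str, str] = {}

    def on_mac_ip_route(self, mac: str, ip: Optional[str]) -> None:
        """Record the IP-to-MAC binding carried by a received EVPN route."""
        if ip is not None:  # a route without an IP adds no ARP information
            self._ip_to_mac[ip] = mac

    def on_route_withdrawn(self, ip: str) -> None:
        """Forget a binding when the corresponding route is withdrawn."""
        self._ip_to_mac.pop(ip, None)

    def answer_request(self, target_ip: str) -> Optional[str]:
        """Return the MAC to place in the ARP reply, or None on a miss."""
        return self._ip_to_mac.get(target_ip)
```

   In the scenario above, PE2 would populate such a table from the
   routes advertised by PE3 and consult it when the ARP request from
   the node behind CE2 arrives.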

1222	9.2.2. Flood suppression example for CE1 to CE3 traffic

1224	   Using Figure 1 in section 3, consider EVI 500 residing on PE1 and PE3
1225	   connecting CE1 and CE3 networks. Consider that both PE1 and PE3 have
1226	   disabled unknown unicast flooding for this specific EVI context. Once
1227	   the network devices behind CE3 come online, PE3 will learn their MAC
1228	   addresses and create local FIB entries for these devices. Note that
1229	   local FIB entries could also be created through either a control or
1230	   management plane between PE and CE. Consequently, PE3 will
1231	   automatically create EVPN Type 2 MAC Advertisement Routes and
1232	   advertise all locally learned MAC addresses. The routes will also
1233	   include the corresponding MPLS label.

1235	   Given that PE1 automatically learns and installs all MAC addresses
1236	   behind CE3, its MAC-VRF FIB will already be pre-populated with the
1237	   respective next-hops and label assignments associated with the MAC
1238	   addresses behind CE3. As such, as soon as the traffic sent by CE1 to
1239	   nodes behind CE3 is received into the context of EVI 500, PE1 will
1240	   push the MPLS Label(s) onto the original Ethernet frame and send the
1241	   packet to the MPLS network. As usual, once PE3 receives this packet,
1242	   and depending on the forwarding model, PE3 will either do a next-hop
1243	   lookup in the EVI 500 context, or will just forward the traffic
1244	   directly to the CE3. In the case that PE1 MAC-VRF-500 does not have a
1245	   MAC entry for a specific destination that CE1 is trying to reach, PE1
1246	   will drop the frame since unknown unicast flooding is disabled.

1248	   Based on the assumption that all the MAC entries behind the CEs are
1249	   pre-populated through gratuitous-ARP and/or DHCP requests, if one
1250	   specific MAC entry is not present in the MAC-VRF-500 FIB on PE1, the
1251	   owner of that MAC is not alive on the network behind the CE3, hence
1252	   the traffic can be dropped at PE1 instead of being flooded and
1253	   consuming network bandwidth.
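
   The forwarding decision at PE1 in this example can be sketched as
   follows. This is a minimal illustration, not a definitive
   implementation: the FIB layout, the function name and the
   flood_unknown flag are assumptions made for this example, and MAC
   keys are assumed to be stored in lower case.

```python
from typing import Dict, Optional, Tuple

# Hypothetical FIB shape: destination MAC -> (remote PE next hop, MPLS label).
Fib = Dict[str, Tuple[str, int]]


def forward_known_or_drop(
    fib: Fib, dst_mac: str, flood_unknown: bool = False
) -> Tuple[str, Optional[str], Optional[int]]:
    """Decide what the ingress PE does with a unicast frame."""
    entry = fib.get(dst_mac.lower())
    if entry is not None:
        # Known MAC: push the advertised label and send towards the next hop.
        next_hop, label = entry
        return ("forward", next_hop, label)
    if flood_unknown:
        # Flooding left enabled: unknown unicast is flooded as usual.
        return ("flood", None, None)
    # Flooding suppressed: the owner of the MAC is assumed to be absent.
    return ("drop", None, None)
```

   With flood_unknown left at False, a miss in the FIB results in a
   drop, which matches the behavior described for MAC-VRF-500 on PE1.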

1255	9.2.3. Optimization of inter-subnet forwarding example for CE3 to CE2
1256	   traffic

1258	   Using Figure 1 in section 3, consider that there is an IP-VPN 666
1259	   context residing on PE1, PE2 and PE3 which connects CE1, CE2 and CE3
1260	   into a single IP-VPN domain. Also consider that there are two EVIs
1261	   present on the PEs, EVI 600 and EVI 60. Each IP subnet is associated
1262	   with a different MAC-VRF context. Thus there is a single subnet,
1263	   subnet 600, between CE1 and CE3 that is established through EVI 600.
1264	   Similarly, there is another subnet, subnet 60, between CE2 and CE3
1265	   that is established through EVI 60. Since both subnets are part of
1266	   the same IP VPN, there is a mapping of each EVI (or individual
1267	   subnet) to a local IRB interface on the three PEs.

1269	   If a node behind CE2 wants to communicate with a node on the same
1270	   subnet sitting behind CE3, the communication flow will follow the
1271	   standard EVPN procedures, i.e. a FIB lookup within PE1 (or PE2)
1272	   and the addition of the corresponding EVPN label to the MPLS label
1273	   stack (downstream label allocation from PE3 for EVI 60).

1275	   When it comes to crossing the subnet boundaries, the ingress PE
1276	   implements local inter-subnet forwarding. For example, when a node
1277	   behind CE2 (EVI 60) sends a packet to a node behind CE1 (EVI 600),
1278	   the destination IP address will be in subnet 600, but the destination
1279	   MAC address will be the address of the source node's default gateway,
1280	   which in this case will be an IRB interface on PE1 (connecting EVI 60
1281	   to IP-VPN 666). Once PE1 sees the traffic destined to its own MAC
1282	   address, it will route the packet to EVI 600, i.e. it will change the
1283	   source MAC address to the one of the IRB interface in EVI 600 and
1284	   change the destination MAC address to the address belonging to the
1285	   node behind CE1, which is already populated in the MAC-VRF-600 FIB,
1286	   either through data or control plane learning.

1288	   An important optimization to be noted is the local inter-subnet
1289	   forwarding in lieu of IP VPN routing. If the node from subnet 60
1290	   (behind CE2) is sending a packet to the remote end node on subnet 600
1291	   (behind CE3), the mechanism in place still honors the local inter-
1292	   subnet (inter-EVI) forwarding.

1294	   In our use-case, therefore, when a node on subnet 60 behind CE2 sends
1295	   traffic to the node on subnet 600 behind CE3, the destination MAC
1296	   address is the PE1 MAC-VRF-60 IRB MAC address. However, once the
1297	   traffic locally crosses EVIs, to EVI 600, via the IRB interface on
1298	   PE1, the source MAC address is changed to that of the IRB interface
1299	   and the destination MAC address is changed to the one advertised by
1300	   PE3 via EVPN and already installed in MAC-VRF-600. The rest of the
1301	   forwarding through PE1 is using the MAC-VRF-600 forwarding context
1302	   and label space.
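
   The MAC rewrite performed by the ingress PE when traffic crosses
   EVIs can be sketched as follows. This is a minimal illustration
   under stated assumptions: the Frame class, the function name, the
   IP-to-MAC table and the FIB layout are all hypothetical and only
   model the steps described above.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical, simplified Ethernet frame carrying an IP packet."""
    src_mac: str
    dst_mac: str
    dst_ip: str


def route_between_evis(
    frame: Frame,
    ingress_irb_mac: str,
    egress_irb_mac: str,
    ip_to_mac: Dict[str, str],
    egress_fib: Dict[str, Tuple[str, int]],
) -> Optional[Tuple[Frame, Tuple[str, int]]]:
    """Rewrite the MAC addresses of a frame routed locally between two EVIs."""
    if frame.dst_mac != ingress_irb_mac:
        return None  # not addressed to the IRB: bridged, not routed
    dst_mac = ip_to_mac.get(frame.dst_ip)
    if dst_mac is None or dst_mac not in egress_fib:
        return None  # destination unknown in the egress MAC-VRF
    # Source becomes the egress IRB MAC, destination the real host MAC.
    rewritten = replace(frame, src_mac=egress_irb_mac, dst_mac=dst_mac)
    # The remaining forwarding uses the egress MAC-VRF entry (next hop, label).
    return rewritten, egress_fib[dst_mac]
```

   The returned FIB entry stands for the MAC-VRF-600 forwarding context
   and label used for the rest of the path towards PE3.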

1304	   Another very relevant optimization is due to the fact that traffic
1305	   between PEs is forwarded through EVPN, rather than through IP-VPN. In
1306	   the example described above for traffic from EVI 60 on CE2 to EVI 600
1307	   on CE3, there is no need for IP-VPN processing on the egress PE3.
1308	   Traffic is forwarded either to the EVI 600 context in PE3 for further
1309	   MAC lookup and next-hop processing, or directly to the node behind
1310	   CE3, depending on the egress forwarding model being used.

1312	10. Security Considerations

1314	   Please refer to the "Security Considerations" section in [RFC7432].

1316	11. IANA Considerations

1318	   No new IANA considerations are needed.

1320	12. References

1322	12.1. Normative References

1324	   [RFC4761] Kompella, K., Ed., and Y. Rekhter, Ed., "Virtual Private
1325	   LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling",
1326	   RFC 4761, DOI 10.17487/RFC4761, January 2007, <http://www.rfc-
1327	   editor.org/info/rfc4761>.

1329	   [RFC4762] Lasserre, M., Ed., and V. Kompella, Ed., "Virtual Private
1330	   LAN Service (VPLS) Using Label Distribution Protocol (LDP)
1331	   Signaling", RFC 4762, DOI 10.17487/RFC4762, January 2007,
1332	   <http://www.rfc-editor.org/info/rfc4762>.

1334	   [RFC6074] Rosen, E., Davie, B., Radoaca, V., and W. Luo,
1335	   "Provisioning, Auto-Discovery, and Signaling in Layer 2 Virtual
1336	   Private Networks (L2VPNs)", RFC 6074, DOI 10.17487/RFC6074, January
1337	   2011, <http://www.rfc-editor.org/info/rfc6074>.

1339	   [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
1340	   Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 2006,
1341	   <http://www.rfc-editor.org/info/rfc4364>.

1343	   [RFC7209] Sajassi, A., Aggarwal, R., Uttaro, J., Bitar, N.,
1344	   Henderickx, W., and A. Isaac, "Requirements for Ethernet VPN (EVPN)",
1345	   RFC 7209, DOI 10.17487/RFC7209, May 2014, <http://www.rfc-
1346	   editor.org/info/rfc7209>.

1348	   [RFC7117] Aggarwal, R., Ed., Kamite, Y., Fang, L., Rekhter, Y., and
1349	   C. Kodeboniya, "Multicast in Virtual Private LAN Service (VPLS)",
1350	   RFC 7117, DOI 10.17487/RFC7117, February 2014, <http://www.rfc-
1351	   editor.org/info/rfc7117>.

1353	   [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
1354	   Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet
1355	   VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015, <http://www.rfc-
1356	   editor.org/info/rfc7432>.

1358	12.2. Informative References

1360	   [EVPN-INTERSUBNET] Sajassi et al., "IP Inter-subnet forwarding in
1361	   EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03.txt

1363	13. Acknowledgments

1365	   The authors want to thank Giles Heron for his detailed review of the
1366	   document. We also thank Stefan Plug and Eric Wunan for their
1367	   comments.

1369	14. Contributors
1370	   In addition to the authors listed on the front page, the following
1371	   co-authors have also contributed to this document:

1373	   Florin Balus
1374	   Keyur Patel
1375	   Aldrin Isaac
1376	   Truman Boyes

1378	15. Authors' Addresses

1380	   Jorge Rabadan
1381	   Nokia
1382	   777 E. Middlefield Road
1383	   Mountain View, CA 94043 USA
1384	   Email: jorge.rabadan@nokia.com

1386	   Senad Palislamovic
1387	   Nokia
1388	   Email: senad.palislamovic@nokia.com

1390	   Wim Henderickx
1391	   Nokia
1392	   Email: wim.henderickx@nokia.com

1394	   Ali Sajassi
1395	   Cisco
1396	   Email: sajassi@cisco.com

1398	   James Uttaro
1399	   AT&T
1400	   Email: uttaro@att.com