idnits 2.17.1 

draft-ietf-bess-evpn-usage-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([EVPN]), which it shouldn't. 
     Please replace those with straight textual mentions of the documents in
     question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 13, 2014) is 3451 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'VPLS-MCAST' is mentioned on line 240, but not defined

  == Missing Reference: 'RFC4761' is mentioned on line 248, but not defined

  == Missing Reference: 'RFC4762' is mentioned on line 248, but not defined

  == Missing Reference: 'RFC6074' is mentioned on line 248, but not defined

  == Missing Reference: 'RFC7209' is mentioned on line 260, but not defined

  == Missing Reference: 'PE-IP' is mentioned on line 364, but not defined

  == Missing Reference: 'AS' is mentioned on line 370, but not defined

  == Missing Reference: 'RFC2119' is mentioned on line 1278, but not defined

  == Unused Reference: 'RFC709' is defined on line 1317, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-11) exists of
     draft-ietf-l2vpn-evpn-10


     Summary: 1 error (**), 0 flaws (~~), 11 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	L2VPN Workgroup                                               J. Rabadan
3	Internet Draft                                           S. Palislamovic
4	                                                           W. Henderickx
5	Intended status: Informational                                  F. Balus
6	                                                          Alcatel-Lucent

8	J. Uttaro                                                       K. Patel
9	AT&T                                                          A. Sajassi
10	                                                                   Cisco

12	A. Isaac
13	T. Boyes
14	Bloomberg

16	Expires: May 17, 2015                                  November 13, 2014

18	         Usage and applicability of BGP MPLS based Ethernet VPN
19	                   draft-ietf-bess-evpn-usage-00.txt

21	Abstract

23	   This document discusses the usage and applicability of BGP MPLS based
24	   Ethernet VPN (EVPN) in a simple and fairly common deployment
25	   scenario. The different EVPN procedures will be explained on the
26	   example scenario, analyzing the benefits and trade-offs of each
27	   option. Along with [EVPN], this document is intended to provide a
28	   simplified guide for the deployment of EVPN in Service Provider
29	   networks.

31	Status of this Memo

33	   This Internet-Draft is submitted in full conformance with the
34	   provisions of BCP 78 and BCP 79.

36	   Internet-Drafts are working documents of the Internet Engineering
37	   Task Force (IETF), its areas, and its working groups.  Note that
38	   other groups may also distribute working documents as Internet-
39	   Drafts.

41	   Internet-Drafts are draft documents valid for a maximum of six months
42	   and may be updated, replaced, or obsoleted by other documents at any
43	   time.  It is inappropriate to use Internet-Drafts as reference
44	   material or to cite them other than as "work in progress."
45	   The list of current Internet-Drafts can be accessed at
46	   http://www.ietf.org/ietf/1id-abstracts.txt

48	   The list of Internet-Draft Shadow Directories can be accessed at
49	   http://www.ietf.org/shadow.html

51	   This Internet-Draft will expire on May 17, 2015.

53	Copyright Notice

55	   Copyright (c) 2014 IETF Trust and the persons identified as the
56	   document authors. All rights reserved.

58	   This document is subject to BCP 78 and the IETF Trust's Legal
59	   Provisions Relating to IETF Documents
60	   (http://trustee.ietf.org/license-info) in effect on the date of
61	   publication of this document. Please review these documents
62	   carefully, as they describe your rights and restrictions with respect
63	   to this document. Code Components extracted from this document must
64	   include Simplified BSD License text as described in Section 4.e of
65	   the Trust Legal Provisions and are provided without warranty as
66	   described in the Simplified BSD License.

68	Table of Contents

70	   1. Introduction
71	   2. Use-case scenario description
72	   3. Provisioning Model
73	     3.1. Common provisioning tasks
74	       3.1.1. Non-service specific parameters
75	       3.1.2. Service specific parameters
76	     3.2. Service interface dependent provisioning tasks
77	       3.2.1. VLAN-based service interface EVI
78	       3.2.2. VLAN-bundle service interface EVI
79	       3.2.3. VLAN-aware bundling service interface EVI
80	   4. BGP EVPN NLRI usage
81	   5. MAC-based forwarding model use-case
82	     5.1. EVPN Network Startup procedures
83	     5.2. VLAN-based service procedures
84	       5.2.1. Service startup procedures
85	       5.2.2. Packet walkthrough
86	     5.3. VLAN-bundle service procedures
87	       5.3.1. Service startup procedures
88	       5.3.2. Packet Walkthrough
89	     5.4. VLAN-aware bundling service procedures
90	       5.4.1. Service startup procedures
91	       5.4.2. Packet Walkthrough
92	   6. MPLS-based forwarding model use-case
93	     6.1. Impact of MPLS-based forwarding on the EVPN network
94	          startup
95	     6.2. Impact of MPLS-based forwarding on the VLAN-based service
96	          procedures
97	     6.3. Impact of MPLS-based forwarding on the VLAN-bundle
98	          service procedures
99	     6.4. Impact of MPLS-based forwarding on the VLAN-aware service
100	          procedures
101	   7. Comparison between MAC-based and MPLS-based forwarding models
102	   8. Traffic flow optimization
103	     8.1. Control Plane Procedures
104	       8.1.1. MAC learning options
105	       8.1.2. Proxy-ARP/ND
106	       8.1.3. Unknown Unicast flooding suppression
107	       8.1.4. Optimization of Inter-subnet forwarding
108	     8.2. Packet Walkthrough Examples
109	       8.2.1. Proxy-ARP example for CE2 to CE3 traffic
110	       8.2.2. Flood suppression example for CE1 to CE3 traffic
111	       8.2.3. Optimization of inter-subnet forwarding example for
112	              CE3 to CE2 traffic
113	   9. Conventions used in this document
114	   10. Security Considerations
115	   11. IANA Considerations
116	   12. References
117	     12.1. Normative References
118	     12.2. Informative References
119	   13. Acknowledgments
120	   14. Authors' Addresses

122	1. Introduction

124	   This document complements [EVPN] by discussing the applicability of
125	   the technology in a simple and fairly common deployment scenario,
126	   which is described in section 2.

128	   After describing the topology of the use-case scenario and the
129	   characteristics of the service to be deployed, section 3 will
130	   describe the provisioning model, comparing the EVPN procedures with
131	   the provisioning tasks required for other VPN technologies, such as
132	   VPLS or IP-VPN.

134	   Once the provisioning model is analyzed, sections 4, 5 and 6 will
135	   describe the control plane and data plane procedures in the example
136	   scenario, for the two potential disposition/forwarding models: MAC-
137	   based and MPLS-based models. While both models can interoperate in
138	   the same network, each one has different trade-offs that are analyzed
139	   in section 7.

141	   Finally, EVPN provides some potential traffic flow optimization tools
142	   that are also described in section 8, in the context of the example
143	   scenario.

145	2. Use-case scenario description

147	   The following figure depicts the scenario that will be referenced
148	   throughout the rest of the document.

150	                            +--------------+
151	                            |              |
152	          +----+     +----+ |              | +----+   +----+
153	          | CE1|-----|    | |              | |    |---| CE3|
154	          +----+    /| PE1| |   IP/MPLS    | | PE3|   +----+
155	                   / +----+ |   Network    | +----+
156	                  /         |              |
157	                 /   +----+ |              |
158	          +----+/    |    | |              |
159	          | CE2|-----| PE2| |              |
160	          +----+     +----+ |              |
161	                            +--------------+

163	                     Figure 1 EVPN use-case scenario

165	   There are three PEs and three CEs considered in this example: PE1,
166	   PE2, PE3, as well as CE1, CE2 and CE3. Layer-2 traffic must be
167	   extended among the three CEs. The following service requirements are
168	   assumed in this scenario:

170	   o Redundancy requirements: CE1 and CE3 are single-homed to PE1 and
171	     PE3 respectively. CE2 requires multi-homing connectivity to PE1 and
172	     PE2, not only for redundancy purposes, but also for adding more
173	     upstream/downstream connectivity bandwidth to/from the network. If
174	     CE2 has a single CE-VID (or a few CE-VIDs) the current VPLS
175	     multi-homing solutions (based on load-balancing per CE-VID or
176	     service) do not provide the optimized link utilization required in
177	     this example. Another redundancy requirement that must be met is
178	     fast convergence. For example, if the link between CE2 and PE1
179	     goes down, a fast convergence mechanism must be supported so that
180	     PE3 can immediately send the traffic to PE2, irrespective of the
181	     number of affected services and MAC addresses. EVPN provides the
182	     flow-based load-balancing multi-homing solution required in this
183	     scenario to optimize the upstream/downstream link utilization
184	     between CE2 and PE1-PE2. EVPN also provides a fast convergence
185	     solution so that PE3 can immediately send the traffic to PE2 upon
186	     failure on the link between CE2 and PE1.

188	   o Service interface requirements: service definition must be flexible
189	     in terms of CE-VID-to-broadcast-domain assignment and service
190	     contexts in the core. The following three services are required in
191	     this example:

193	     EVI100 - It will use VLAN-based service interfaces in the three CEs
194	     with a 1:1 mapping (VLAN-to-EVI). The CE-VIDs at the three CEs can
195	     be the same, e.g.: VID 100, or different at each CE, e.g.: VID 101
196	     in CE1, VID 102 in CE2 and VID 103 in CE3. A single broadcast
197	     domain needs to be created for EVI100 in any case; therefore CE-
198	     VIDs will require translation at the egress PEs if they are not
199	     consistent across the three CEs. The case when the same CE-VID is
200	     used across the three CEs for EVI100 is referred to in [EVPN] as
201	     the "Unique VLAN" EVPN case. This term is also used throughout
202	     this document.

204	     EVI200 - It will use VLAN-bundle service interfaces in CE1, CE2 and
205	     CE3, based on an N:1 VLAN-to-EVI mapping. In this case, the service
206	     provider just needs to assign a pre-configured number of CE-VIDs on
207	     the ingress PE to EVI200, and send the customer frames with the
208	     original CE-VIDs. The Service Provider will build a single
209	     broadcast domain for the customer. The customer will be responsible
210	     for the CE-VID handling.

212	     EVI300 - It will use VLAN-aware bundling service interfaces in CE1,
213	     CE2 and CE3. At the ingress PE, an N:1 VLAN-to-EVI mapping will be
214	     done; however, as opposed to EVI200, a separate core broadcast
215	     domain is required per CE-VID. In addition, the CE-VIDs can be
216	     different (hence CE-VID translation is required). Note that,
217	     while the requirements stated for EVI100 and EVI200 might be met
218	     with the current VPLS solutions, the VLAN-aware bundling service
219	     interfaces required by EVI300 are not supported by the current VPLS
220	     tools.

222	   NOTE: in section 3.2.1, only EVI100 is used as an example of
223	   VLAN-based service provisioning. In sections 5.2 and 6.2, 4k
224	   VLAN-based EVIs (EVI1 to EVI4k) are used so that the impact of MAC
225	   vs. MPLS disposition models in the control plane can be evaluated. In
226	   the same way, EVI200 and EVI300 will be described with a 4k:1 mapping
227	   (CE-VIDs-to-EVI mapping) in sections 5.3, 5.4, 6.3 and 6.4.

229	   o BUM (Broadcast, Unknown unicast, Multicast) optimization
230	     requirements: The solution must be able to support ingress
231	     replication, P2MP MPLS LSPs and MP2MP MPLS LSPs and the user must
232	     be able to decide what kind of provider tree will be used by each
233	     EVI service. For example, if we assume that EVI100 and EVI200 will
234	     not carry much BUM traffic, we can use ingress replication for
235	     those service instances. The benefit is that the core will not need
236	     to maintain any state for the multicast trees associated to EVI100
237	     and EVI200. Conversely, if EVI300 is expected to carry a
238	     significant amount of multicast traffic, P2MP MPLS LSPs or MP2MP
239	     LSPs can be used for this service. Note that ingress replication
240	     and P2MP LSPs are supported by VPLS solutions (see [VPLS-MCAST]);
241	     however, VPLS solutions do not support MP2MP LSPs, since the source
242	     of the tree must be identified for the data plane MAC learning, and
243	     that identification is challenging when using MP2MP LSPs. Since
244	     EVPN uses the control plane for MAC learning, any type of provider
245	     multicast tree is supported in the core.

247	   As already outlined above, the current VPLS solutions, based on
248	   [RFC4761][RFC4762][RFC6074], cannot meet the full set of
249	   requirements above and therefore a new solution is needed. The rest
250	   of the document will describe how EVPN can be used to meet those
251	   service requirements and even optimize the network further by:

253	   o Providing the user with an option to reduce (and even suppress) the
254	     ARP-flooding.

256	   o Supporting ARP termination for inter-subnet forwarding.

258	3. Provisioning Model

260	   One of the requirements stated in [RFC7209] is the ease of
261	   provisioning. BGP parameters and service context parameters should be
262	   auto-provisioned so that the addition of a new MAC-VRF to the EVI
263	   requires a minimum number of single-sided provisioning touches.
264	   However, this is only possible in a limited number of cases. This
265	   section describes the provisioning tasks required for the services
266	   described in section 2, i.e. EVI100 (VLAN-based service interfaces),
267	   EVI200 (VLAN-bundle service interfaces) and EVI300 (VLAN-aware
268	   bundling service interfaces).

270	3.1. Common provisioning tasks

272	   Regardless of the service interface type (VLAN-based, VLAN-bundle or
273	   VLAN-aware), the following sub-sections describe the parameters to be
274	   provisioned in the three PEs.

276	3.1.1. Non-service specific parameters

278	   The multi-homing function in EVPN requires the provisioning of
279	   certain parameters that are not service-specific and are shared by
280	   all the MAC-VRFs in the node using the multi-homing capabilities.
281	   In our use-case, these parameters are only provisioned in PE1 and
282	   PE2, and are listed below:

284	   o Ethernet Segment Identifier (ESI): only the ESI associated to CE2
285	     needs to be considered in our example. Single-homed CEs such as CE1
286	     and CE3 do not require the provisioning of an ESI (the ESI will be
287	     coded as zero in the BGP NLRIs). In our example, a LAG is used
288	     between CE2 and PE1-PE2 (since all-active multi-homing is a
289	     requirement); therefore, the ESI can be auto-derived from the LACP
290	     information as described in [EVPN]. Note that the ESI MUST be
291	     unique across all the PEs in the network; therefore, the
292	     auto-provisioning of the ESI is only recommended in case the CEs
293	     are managed by the Service Provider. Otherwise, the ESI should be
294	     manually provisioned (type 0 as in [EVPN]) in order to avoid
295	     potential conflicts.

297	   o ES-Import Route Target (ES-Import RT): this is the RT that will be
298	     sent by PE1 and PE2, along with the ES route. Regardless of how the
299	     ESI is provisioned in PE1 and PE2, the ES-Import RT must always be
300	     auto-derived from the 6-byte MAC address portion of the ESI value.

302	   o Ethernet Segment Route Distinguisher (ES RD): this is the RD to be
303	     encoded in the ES route and Ethernet Auto-Discovery (A-D) route to
304	     be sent by PE1 and PE2 for the CE2 ESI. This RD should always be
305	     auto-derived from the PE IP address, as described in [EVPN].

307	   o Multi-homing type: the user must be able to provision the
308	     multi-homing type to be used in the network. In our use-case, the
309	     multi-homing type will be set to all-active for the CE2 ESI. This
310	     piece of information is encoded in the ESI Label extended community
311	     flags and sent by PE1 and PE2 along with the Ethernet A-D route for
312	     the CE2 ESI.
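
   The auto-derivation described in the list above can be sketched as
   follows. This is only an illustration: the 10-byte ESI layout (type
   octet, 6-byte CE LACP system MAC, 2-byte port key, trailing zero
   octet) is an assumption taken from the type 1 ESI format in [EVPN],
   and the function names and sample values are hypothetical.

```python
def build_type1_esi(lacp_system_mac: str, lacp_port_key: int) -> bytes:
    """Build a 10-byte ESI: type (1) | system MAC (6) | port key (2) | 0x00.

    Assumed layout of an LACP-derived (type 1) ESI; not a normative encoder.
    """
    mac = bytes.fromhex(lacp_system_mac.replace(":", ""))
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    if not 0 <= lacp_port_key <= 0xFFFF:
        raise ValueError("port key must fit in 2 bytes")
    return bytes([0x01]) + mac + lacp_port_key.to_bytes(2, "big") + b"\x00"


def es_import_rt_mac(esi: bytes) -> str:
    """Return the 6-byte MAC portion of the ESI value used for the ES-Import RT."""
    if len(esi) != 10:
        raise ValueError("ESI must be 10 bytes")
    # Skip the type octet; the next 6 bytes are the MAC address portion.
    return ":".join(f"{b:02x}" for b in esi[1:7])


def is_single_homed(esi: bytes) -> bool:
    """Single-homed CEs (CE1, CE3) carry an all-zero ESI in the BGP NLRIs."""
    return esi == bytes(10)


# Hypothetical LACP values learned from CE2 on both PE1 and PE2.
esi12 = build_type1_esi("00:11:22:33:44:55", 12)
```

   Because PE1 and PE2 see the same LACP system MAC and port key from
   CE2, both would compute the same ESI and therefore the same
   ES-Import RT in this sketch.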

314	   In our use-case, besides the above parameters, the same LACP
315	   parameters will be configured in PE1 and PE2 for the ESI, so that CE2
316	   can send different flows to PE1 and PE2 for the same CE-VID as though
317	   they were forming a single system from the CE2 perspective.

319	3.1.2. Service specific parameters

321	   The following parameters must be provisioned in PE1, PE2 and PE3 per
322	   EVI service:

324	   o EVI identifier: global identifier per EVI that is shared by all the
325	     PEs part of the EVI, i.e. PE1, PE2 and PE3 will be provisioned with
326	     EVI100, 200 and 300. The EVI identifier can be associated to (or be
327	     the same value as) the EVI default Ethernet Tag (4-byte default
328	     broadcast domain identifier for the EVI). The Ethernet Tag is
329	     different from zero in the EVPN BGP routes only if the service
330	     interface type (of the source PE) is VLAN-aware.

332	   o EVI Route Distinguisher (EVI RD): This RD is a unique value across
333	     all the MAC-VRFs in a PE. Auto-derivation of this RD might be
334	     possible depending on the service interface type being used in the
335	     EVI. The next section discusses the specifics of each service
336	     interface type.

338	   o EVI Route Target(s) (EVI RT): one or more RTs can be provisioned
339	     per MAC-VRF. The RT(s) imported and exported can be equal or
340	     different, just as the RT(s) in IP-VPNs. Auto-derivation of the
341	     RT(s) might be possible depending on the service interface type
342	     being used in the EVI. The next section discusses the specifics of
343	     each service interface type.

345	   o CE-VID and port/LAG binding to EVI identifier or Ethernet Tag: see
346	     section 3.2.

348	3.2. Service interface dependent provisioning tasks

350	   Depending on the service interface type being used in the EVI, a
351	   specific CE-VID binding provisioning must be specified.

353	3.2.1. VLAN-based service interface EVI

355	   In our use-case, EVI100 is a VLAN-based service interface EVI.

357	   EVI100 can be a "unique-VLAN" EVPN if the CE-VID being used for this
358	   service in CE1, CE2 and CE3 is equal, e.g. VID 100. In that case, the
359	   VID 100 binding must be provisioned in PE1, PE2 and PE3 for EVI100
360	   and the associated port or LAG. The MAC-VRF RD and RT can be auto-
361	   derived from the CE-VID:

363	   o The auto-derived MAC-VRF RD will be a Type 1 RD, as recommended in
364	     [EVPN], and it will be composed of [PE-IP]:[zero-padded-VID];
365	     where PE-IP is the IP address of the PE (a loopback address) and
366	     [zero-padded-VID] is a 2-byte value where the low order 12 bits are
367	     the VID (VID 100 in our example) and the high order 4 bits are
368	     zero.

370	   o The auto-derived MAC-VRF RT will be composed of [AS]:[zero-padded-
371	     VID]; where AS is the Autonomous System that the PE belongs to and
372	     [zero-padded-VID] is a 4-byte value where the low order 12 bits are
373	     the VID (VID 100 in our example) and the high order 20 bits are
374	     zero. Note that auto-deriving the RT implies supporting a basic
375	     any-to-any topology in the EVI and using the same import and export
376	     RT in the EVI.
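
   A minimal sketch of the RD and RT auto-derivation described above,
   for the "unique-VLAN" case. The function names, the loopback address
   and the AS number are hypothetical; the byte encoding assumes the
   usual Type 1 RD layout (2-byte type, 4-byte IPv4 address, 2-byte
   assigned number).

```python
import ipaddress


def zero_padded_vid(vid: int) -> int:
    """Return the VID in the low-order 12 bits, with the high-order bits zero."""
    if not 0 <= vid <= 0x0FFF:
        raise ValueError("VID must fit in 12 bits")
    return vid


def derive_rd(pe_ip: str, vid: int) -> str:
    """Textual Type 1 RD: [PE-IP]:[zero-padded-VID]."""
    ipaddress.IPv4Address(pe_ip)  # raises ValueError on a malformed address
    return f"{pe_ip}:{zero_padded_vid(vid)}"


def derive_rd_bytes(pe_ip: str, vid: int) -> bytes:
    """8-byte Type 1 RD: type (2 bytes) | IPv4 address (4) | 2-byte VID field."""
    return (
        (1).to_bytes(2, "big")
        + ipaddress.IPv4Address(pe_ip).packed
        + zero_padded_vid(vid).to_bytes(2, "big")
    )


def derive_rt(asn: int, vid: int) -> str:
    """Textual RT: [AS]:[zero-padded-VID], used as both import and export RT."""
    return f"{asn}:{zero_padded_vid(vid)}"


# Hypothetical PE1 loopback and AS number, with VID 100 for EVI100.
rd = derive_rd("192.0.2.1", 100)
rt = derive_rt(64500, 100)
```

   Since the RD includes the PE address it differs on each PE, whereas
   the RT depends only on the AS and the VID and is therefore the same
   on PE1, PE2 and PE3.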

378	   If EVI100 is not a "unique-VLAN" EVPN, each individual CE-VID must be
379	   configured in each PE, and MAC-VRF RDs and RTs cannot be auto-
380	   derived, hence they must be provisioned by the user.

382	3.2.2. VLAN-bundle service interface EVI

384	   Assuming EVI200 is a VLAN-bundle service interface EVI, and VIDs
385	   200-250 are assigned to EVI200, the CE-VID bundle 200-250 must be
386	   provisioned on PE1, PE2 and PE3. Note that this model does not allow
387	   CE-VID translation and the CEs must use the same CE-VIDs for EVI200.
388	   No auto-derived EVI RDs or EVI RTs are possible.

390	3.2.3. VLAN-aware bundling service interface EVI

392	   If EVI300 is a VLAN-aware bundling service interface EVI, CE-VID
393	   binding to EVI300 does not have to match on the three PEs (only on
394	   PE1 and PE2, since they are part of the same ES). E.g.: PE1 and PE2
395	   CE-VID binding to EVI300 can be set to the range 300-310 and PE3 to
396	   321-330. Note that each individual CE-VID will be assigned to a core
397	   broadcast domain, i.e. Ethernet Tag, which will be encoded in the BGP
398	   EVPN routes.

400	   Therefore, besides the CE-VID bundle range bound to EVI300 in each
401	   PE, associations between each individual CE-VID and the EVPN Ethernet
402	   Tag must be provisioned by the user. No auto-derived EVI RDs/RTs are
403	   possible.
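
   The three service interface types can be compared with a small
   classification sketch. The CE-VID ranges are the ones used in this
   section; the Ethernet Tag values assigned to EVI300 and all names in
   the code are hypothetical.

```python
from typing import Dict, Optional, Tuple

# VLAN-based (EVI100): one CE-VID per EVI, Ethernet Tag 0 in the routes.
VLAN_BASED: Dict[int, str] = {100: "EVI100"}

# VLAN-bundle (EVI200): CE-VIDs 200-250 share one broadcast domain.
VLAN_BUNDLE: Dict[int, str] = {vid: "EVI200" for vid in range(200, 251)}

# VLAN-aware bundling (EVI300): each CE-VID maps to its own Ethernet Tag,
# and the CE-VID ranges may differ per PE (hypothetical tag numbering).
VLAN_AWARE: Dict[str, Dict[int, int]] = {
    "PE1": {vid: vid - 299 for vid in range(300, 311)},
    "PE2": {vid: vid - 299 for vid in range(300, 311)},
    "PE3": {vid: vid - 320 for vid in range(321, 331)},
}


def classify(pe: str, ce_vid: int) -> Optional[Tuple[str, int]]:
    """Return (EVI, Ethernet Tag) for a frame received on `pe` with `ce_vid`."""
    if ce_vid in VLAN_BASED:
        return (VLAN_BASED[ce_vid], 0)
    if ce_vid in VLAN_BUNDLE:
        return (VLAN_BUNDLE[ce_vid], 0)
    tag = VLAN_AWARE[pe].get(ce_vid)
    if tag is not None:
        return ("EVI300", tag)
    return None  # CE-VID not bound to any EVI on this PE
```

   In this sketch, CE-VID 300 on PE1 and CE-VID 321 on PE3 resolve to
   the same Ethernet Tag, which is what allows CE-VID translation in
   the VLAN-aware bundling model.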

405	4. BGP EVPN NLRI usage

407	   [EVPN] defines four different types of routes and four different
408	   extended communities advertised along with the different routes.
409	   However, not all the PEs in a network must generate and process all
410	   the different routes and extended communities. The following table
411	   shows the routes that must be exported and imported in the use-case
412	   described in this document. "Export", in this context, means that the
413	   PE must be capable of generating and exporting a given route,
414	   assuming there are no BGP policies to prevent it. In the same way,
415	   "Import" means the PE must be capable of importing and processing a
416	   given route, assuming the right RTs and policies. "N/A" means neither
417	   import nor export actions are required.

419	   +-------------------+---------------+---------------+
420	   | BGP EVPN routes   | PE1-PE2       | PE3           |
421	   +-------------------+---------------+---------------+
422	   | ES                | Export/import | N/A           |
423	   | A-D per ESI       | Export/import | Import        |
424	   | A-D per EVI       | Export/import | Import        |
425	   | MAC               | Export/import | Export/import |
426	   | Inclusive mcast   | Export/import | Export/import |
427	   +-------------------+---------------+---------------+

429	   PE3 is only required to export MAC and Inclusive multicast routes and
430	   be able to import and process A-D routes, as well as MAC and
431	   Inclusive multicast routes. If PE3 did not support importing and
432	   processing A-D routes per ESI and per EVI, fast convergence and
433	   aliasing functions (respectively) would not be possible in this
434	   use-case.
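
   The table above can also be expressed as a lookup structure, for
   example to check which functions are lost when a PE cannot import a
   given route. This is a sketch only; the dictionary and function
   names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List

EXPORT_IMPORT: FrozenSet[str] = frozenset({"export", "import"})
IMPORT_ONLY: FrozenSet[str] = frozenset({"import"})
NOT_APPLICABLE: FrozenSet[str] = frozenset()

# Required actions per route and per PE group, as in the table.
ROUTE_ACTIONS: Dict[str, Dict[str, FrozenSet[str]]] = {
    "ES": {"PE1-PE2": EXPORT_IMPORT, "PE3": NOT_APPLICABLE},
    "A-D per ESI": {"PE1-PE2": EXPORT_IMPORT, "PE3": IMPORT_ONLY},
    "A-D per EVI": {"PE1-PE2": EXPORT_IMPORT, "PE3": IMPORT_ONLY},
    "MAC": {"PE1-PE2": EXPORT_IMPORT, "PE3": EXPORT_IMPORT},
    "Inclusive mcast": {"PE1-PE2": EXPORT_IMPORT, "PE3": EXPORT_IMPORT},
}

# Functions that depend on importing each A-D route.
LOST_WITHOUT_IMPORT: Dict[str, str] = {
    "A-D per ESI": "fast convergence",
    "A-D per EVI": "aliasing",
}


def is_required(pe: str, route: str, action: str) -> bool:
    """True if `pe` must support `action` ("export" or "import") for `route`."""
    return action in ROUTE_ACTIONS[route][pe]


def functions_lost(pe: str, unsupported_imports: Iterable[str]) -> List[str]:
    """List the functions unavailable if `pe` cannot import the given routes."""
    return sorted(
        LOST_WITHOUT_IMPORT[route]
        for route in unsupported_imports
        if route in LOST_WITHOUT_IMPORT and is_required(pe, route, "import")
    )
```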

436	5. MAC-based forwarding model use-case

438	   This section describes how the BGP EVPN routes are exported and
439	   imported by the PEs in our use-case, as well as how traffic is
440	   forwarded assuming that PE1, PE2 and PE3 support a MAC-based
441	   forwarding model. In order to compare the control and data plane
442	   impact in the two forwarding models (MAC-based and MPLS-based) and
443	   different service types, we will assume that CE1, CE2 and CE3 need to
444	   exchange traffic for up to 4k CE-VIDs.

446	5.1. EVPN Network Startup procedures

448	   Before any EVI is provisioned in the network, the following
449	   procedures are required:

451	   o Infrastructure setup: the proper MPLS infrastructure must be set up
452	     among PE1, PE2 and PE3 so that the EVPN services can make use of
453	     P2P, P2MP and/or MP2MP LSPs. In addition to the MPLS transport, PE1
454	     and PE2 must be properly configured with the same LACP
455	     configuration to CE2. Details are provided in [EVPN]. Once the LAG
456	     is properly set up, the ESI for the CE2 Ethernet Segment, e.g.
457	     ESI12, can be auto-generated by PE1 and PE2 from the LACP
458	     information exchanged with CE2 (ESI type 1), as discussed in
459	     section 3.1. Alternatively, the ESI can also be manually
460	     provisioned on PE1 and PE2 (ESI type 0). PE1 and PE2 will auto-
461	     configure a BGP policy that will import any ES route matching the
462	     auto-derived ES-import RT for ESI12.

464	   o Ethernet Segment route exchange and DF election: PE1 and PE2 will
465	     advertise a BGP Ethernet Segment route for ESI12, where the ESI RD
466	     and ES-Import RT will be auto-generated as discussed in section
467	     3.1.1. PE1 and PE2 will import the ES routes of each other and will
468	     run the DF election algorithm for any existing EVI (if any, at this
469	     point). PE3 will simply discard the route. Note that the DF
470	     election algorithm can support service carving, so that the
471	     downstream BUM traffic from the network to CE2 can be load-balanced
472	     across PE1 and PE2 on a per-service basis.
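
   The service carving mentioned above can be illustrated with a short
   sketch. It assumes the modulo-based default procedure of [EVPN]
   (candidate PEs ordered by IP address, DF chosen by the service
   identifier modulo the number of PEs); the PE addresses and the
   function name are hypothetical.

```python
import ipaddress
from collections import Counter
from typing import Iterable


def elect_df(pe_ips: Iterable[str], service_id: int) -> str:
    """Return the IP address of the DF for `service_id` on a shared ES.

    Assumed rule: sort the PEs by increasing IP address and pick the one
    at index (service_id mod N), where N is the number of PEs on the ES.
    """
    ordered = sorted(pe_ips, key=lambda ip: int(ipaddress.ip_address(ip)))
    if not ordered:
        raise ValueError("at least one PE is required")
    return ordered[service_id % len(ordered)]


# Hypothetical loopback addresses of the two PEs attached to ESI12.
esi12_pes = ["192.0.2.2", "192.0.2.1"]

# Count how many of 4000 services each PE is DF for.
load = Counter(elect_df(esi12_pes, evi) for evi in range(1, 4001))
```

   With two PEs on the segment, the services are split evenly, so the
   downstream BUM traffic towards CE2 is shared between both PEs on a
   per-service basis.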

474	   At the end of this process, the network infrastructure is ready to
475	   start deploying EVPN services. PE1 and PE2 are aware of the existence
476	   of a shared Ethernet Segment, i.e. ESI12.

478	5.2. VLAN-based service procedures

480	   Assuming that the EVPN network must carry traffic among CE1, CE2 and
481	   CE3 for up to 4k CE-VIDs, the Service Provider can decide to
482	   implement VLAN-based service interface EVIs to accomplish it. In this
483	   case, each CE-VID will be individually mapped to a different EVI.
484	   While this means a total number of 4k MAC-VRFs is required per PE,
485	   the advantages of this approach are the auto-provisioning of most of
486	   the service parameters if no VLAN translation is needed (see section
487	   3.2.1) and great control over each individual customer broadcast
488	   domain. We assume in this section that the range of EVIs from 1 to 4k
489	   is provisioned in the network.

491	5.2.1. Service startup procedures

493	   As soon as the EVIs are created in PE1, PE2 and PE3, the following
494	   control plane actions are carried out:

496	   o Flooding tree setup per EVI (4k routes): Each PE will send one
497	     Inclusive Multicast Ethernet Tag route per EVI (up to 4k routes per
498	     PE) so that the flooding tree per EVI can be set up. Note that
499	     ingress replication, P2MP LSPs or MP2MP LSPs can optionally be
500	     signaled in the PMSI Tunnel attribute and the corresponding tree
501	     created.

503	   o Ethernet A-D routes per ESI (a set of routes for ESI12): A set of
504	     A-D routes with a list of 4k RTs (one per EVI) for ESI12 will be
505	     issued from PE1 and PE2 (it has to be a set of routes so that the
506	     total number of RTs can be conveyed). This set will also include
507	     ESI Label extended communities with the active-standby flag set to
508	     zero (all-active multi-homing type) and an ESI Label different from
509	     zero (used for split-horizon functions). These routes will be
510	     imported by the three PEs, since the RTs match the EVI RTs locally
511	     configured. The A-D routes per ESI will be used for fast
512	     convergence and split-horizon functions, as discussed in [EVPN].

514	   o Ethernet A-D routes per EVI (4k routes): An A-D route per EVI will
515	     be sent by PE1 and PE2 for ESI12. Each individual route includes
516	     the corresponding EVI RT and an MPLS label to be used by PE3 for
517	     the aliasing function. These routes will be imported by the three
518	     PEs.
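
   The control plane cost of the VLAN-based model at service startup
   can be estimated with a rough sketch like the one below. The number
   of RTs that fit in a single A-D per ESI route is not stated in this
   document, so `rts_per_route` is a purely hypothetical parameter, as
   are the function and key names.

```python
import math
from typing import Dict

MULTI_HOMED_PES = {"PE1", "PE2"}  # PEs attached to ESI12
ALL_PES = ("PE1", "PE2", "PE3")


def startup_route_counts(num_evis: int, rts_per_route: int) -> Dict[str, Dict[str, int]]:
    """Estimate the routes each PE advertises when `num_evis` EVIs start up."""
    counts: Dict[str, Dict[str, int]] = {}
    for pe in ALL_PES:
        multi_homed = pe in MULTI_HOMED_PES
        counts[pe] = {
            # One Inclusive Multicast Ethernet Tag route per EVI on every PE.
            "inclusive_mcast": num_evis,
            # A set of A-D per ESI routes, sized to carry one RT per EVI.
            "ad_per_esi": math.ceil(num_evis / rts_per_route) if multi_homed else 0,
            # One A-D per EVI route for ESI12 on each multi-homed PE.
            "ad_per_evi": num_evis if multi_homed else 0,
        }
    return counts


# Example with 4096 EVIs and an assumed 512 RTs per A-D per ESI route.
counts = startup_route_counts(4096, 512)
```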

520	5.2.2. Packet walkthrough

522	   Once the services are set up, the traffic can start flowing. Assuming
523	   there are no MAC addresses learnt yet and that MAC learning at the
524	   access is performed in the data plane in our use-case, this is the
525	   process followed upon receiving frames from each CE (example for
526	   EVI1).

528	   (1) BUM frame example from CE1:

530	   a) An ARP-request with CE-VID=1 is issued from source MAC CE1-MAC
531	      (MAC address coming from CE1 or from a device connected to CE1) to
532	      find the MAC address of CE3-IP.

534	   b) Based on the CE-VID, the frame is identified to be forwarded in
535	      the MAC-VRF-1 (EVI1) context. A source MAC lookup is done in the
536	      MAC FIB and the sender's CE1-IP in the proxy-ARP table within the
537	      MAC-VRF-1 (EVI1) context. If CE1-MAC/CE1-IP are unknown in both
538	      tables, three actions are carried out (assuming the source MAC is
539	      accepted by PE1): (1) a forwarding state is added for CE1-MAC
540	      associated to the corresponding port and CE-VID, (2) the ARP-
541	      request is snooped and the tuple CE1-MAC/CE1-IP is added to the
542	      proxy-ARP table and (3) a BGP MAC advertisement route is triggered
543	      from PE1 containing the EVI1 RD and RT, ESI=0, Ethernet-Tag=0 and
544	      CE1-MAC/CE1-IP along with an MPLS label assigned to MAC-VRF-1 from
545	      the PE1 label space. Note that depending on the implementation,
546	      the MAC FIB and proxy-ARP learning processes can independently
547	      send two BGP MAC advertisements instead of one (one containing
548	      only the CE1-MAC and another one containing CE1-MAC/CE1-IP).

550	      Since we assume a MAC forwarding model, a label per MAC-VRF is
551	      normally allocated and signaled by the three PEs for MAC
552	      advertisement routes. Based on the RT, the route is imported by
553	      PE2 and PE3 and the forwarding state plus ARP entry are added to
554	      their MAC-VRF-1 context. From this moment on, any ARP request from
555	      CE2 or CE3 destined to CE1-IP can be answered directly by PE1, PE2
556	      or PE3, and ARP flooding for CE1-IP is not needed in the core.

558	   c) Since the ARP frame is a broadcast frame, it is forwarded by PE1
559	      using the Inclusive multicast tree for EVI1 (CE-VID=1 should be
560	      kept if translation is required). Depending on the type of tree,
561	      the label stack may vary. E.g. assuming ingress replication, the
562	      packet is replicated to PE2 and PE3 with the downstream allocated
563	      labels and the P2P LSP transport labels. No other labels are added
564	      to the stack.

566	   d) Assuming PE1 is the DF for EVI1 on ESI12, the frame is locally
567	      replicated to CE2.

569	   e) The MPLS-encapsulated frame gets to PE2 and PE3. Since PE2 is non-
570	      DF for EVI1 on ESI12, and there is no other CE connected to PE2,
571	      the frame is discarded. At PE3, the frame is de-encapsulated, CE-
572	      VID translated if needed and replicated to CE3.

574	   Any other type of BUM frame from CE1 would follow the same
575	   procedures. BUM frames from CE3 would follow the same procedures too.
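
The three actions of step (1)(b) can be sketched as follows. This is a
minimal illustration only, not an implementation defined by this document
or by [EVPN]: the class names, field names and the sample RD/RT/label
values are hypothetical, and the sketch learns whenever either table
lacks the entry, which is a simplification of the "unknown in both
tables" condition above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MacAdvertisement:
    """Hypothetical, simplified view of a BGP MAC advertisement route."""
    rd: str
    rt: str
    esi: int
    ethernet_tag: int
    mac: str
    ip: str
    label: int


@dataclass
class MacVrf:
    """Hypothetical MAC-VRF with a MAC FIB and a proxy-ARP table."""
    rd: str
    rt: str
    label: int  # one label per MAC-VRF (MAC-based forwarding model)
    mac_fib: dict = field(default_factory=dict)    # MAC -> (port, CE-VID)
    proxy_arp: dict = field(default_factory=dict)  # IP -> MAC

    def learn_from_arp_request(self, src_mac, sender_ip, port, ce_vid, esi=0):
        """Return the route to advertise, or None if nothing new was learnt."""
        known_mac = src_mac in self.mac_fib
        known_ip = self.proxy_arp.get(sender_ip) == src_mac
        if known_mac and known_ip:
            return None
        # Simplification: learn whenever either entry is missing.
        self.mac_fib[src_mac] = (port, ce_vid)   # action (1): forwarding state
        self.proxy_arp[sender_ip] = src_mac      # action (2): snooped ARP tuple
        # action (3): MAC advertisement with Ethernet-Tag=0
        return MacAdvertisement(self.rd, self.rt, esi, 0,
                                src_mac, sender_ip, self.label)
```

For the CE1 example, calling learn_from_arp_request("CE1-MAC", "CE1-IP",
port, 1) on the PE1 MAC-VRF would yield a route with ESI=0 and
Ethernet-Tag=0; a second identical ARP request yields no new route.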

577	   (2) BUM frame example from CE2:

579	   a) An ARP-request with CE-VID=1 is issued from source MAC CE2-MAC to
580	      find the MAC address of CE3-IP.

582	   b) CE2 will hash the frame and will forward it to e.g. PE2. Based on
583	      the CE-VID, the frame is identified to be forwarded in the EVI1
584	      context. A source MAC lookup is done in the MAC FIB and the
585	      sender's CE2-IP in the proxy-ARP table within the MAC-VRF-1
586	      context. If both are unknown, three actions are carried out
587	      (assuming the source MAC is accepted by PE2): (1) a forwarding
588	      state is added for CE2-MAC associated to the corresponding LAG/ESI
589	      and CE-VID, (2) the ARP-request is snooped and the tuple CE2-
590	      MAC/CE2-IP is added to the proxy-ARP table and (3) a BGP MAC
591	      advertisement route is triggered from PE2 containing the EVI1 RD
592	      and RT, ESI=12, Ethernet-Tag=0 and CE2-MAC/CE2-IP along with an
593	      MPLS label assigned from the PE2 label space (one label per MAC-
594	      VRF). Again, depending on the implementation, the MAC FIB and
595	      proxy-ARP learning processes can independently send two BGP MAC
596	      advertisements instead of one.

598	      Note that, since PE3 is not part of ESI12, it will install a
599	      forwarding state for CE2-MAC as long as the A-D routes for ESI12
600	      are also active on PE3. On the contrary, PE1 is part of ESI12,
601	      therefore PE1 will not modify the forwarding state for CE2-MAC if
602	      it has previously learnt CE2-MAC locally attached to ESI12.

604	      Otherwise it will add forwarding state for CE2-MAC associated to
605	      the local ESI12 port.

607	   c) Assuming PE2 does not have the ARP information for CE3-IP yet, and
608	      since the ARP is a broadcast frame and PE2 is the non-DF for EVI1
609	      on ESI12, the frame is forwarded by PE2 in the Inclusive multicast
610	      tree for EVI1, adding the ESI label for ESI12 at the bottom of the
611	      stack. The ESI label has been previously allocated and signaled by
612	      the A-D routes for ESI12. Note that, as per [EVPN], if the result
613	      of the CE2 hashing is different and the frame is sent to PE1, PE1
614	      SHOULD add the ESI label too (PE1 is the DF for EVI1 on ESI12).

616	   d) The MPLS-encapsulated frame gets to PE1 and PE3. PE1
617	      de-encapsulates the Inclusive multicast tree label(s) and, based
618	      on the ESI label at the bottom of the stack, decides not to
619	      forward the frame to ESI12. It will pop the ESI label and will
620	      still replicate the frame to CE1, since CE1 is not part of the ESI
621	      identified by the ESI label. At PE3, the Inclusive multicast tree
622	      label is popped and the frame forwarded to CE3. If a P2MP LSP is
623	      used as Inclusive multicast tree for EVI1, PE3 will find an ESI
624	      label after popping the P2MP LSP label. The ESI label will simply
625	      be ignored and popped, since CE3 is not part of ESI12.
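
The egress decisions in the two BUM examples (DF filtering and
split-horizon based on the ESI label) can be summarized in a short
sketch. It is illustrative only: the function name and the data layout
are hypothetical, and the ESI label is modeled directly as the ESI
identifier it maps to rather than as an MPLS label value.

```python
def egress_bum_targets(local_attachments, esi_label, df_for):
    """Return the local CEs that receive a BUM frame arriving from the core.

    local_attachments: dict CE name -> ESI id (0 for a single-homed CE).
    esi_label: ESI id identified by the ESI label in the frame, or None
               if the frame carries no ESI label.
    df_for: set of ESI ids for which this PE is the DF for the EVI.
    """
    targets = []
    for ce, esi in local_attachments.items():
        if esi == 0:
            targets.append(ce)   # single-homed CE: always replicate
        elif esi_label is not None and esi == esi_label:
            continue             # split-horizon: frame came from this ES
        elif esi in df_for:
            targets.append(ce)   # multi-homed CE: only the DF forwards BUM
    return sorted(targets)
```

With the use-case values, PE2 (non-DF for ESI12) returns an empty list
for the frame from CE1, while PE1 returns only CE1 for the frame that
PE2 sent with the ESI12 label.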

627	   (3) Unicast frame example from CE3 to CE1:

629	   a) A unicast frame with CE-VID=1 is issued from source MAC CE3-MAC
630	      and destination MAC CE1-MAC (we assume PE3 has previously resolved
631	      an ARP request from CE3 to find the MAC of CE1-IP, and has added
632	      CE3-MAC/CE3-IP to its proxy-ARP table).

634	   b) Based on the CE-VID, the frame is identified to be forwarded in
635	      the EVI1 context. A source MAC lookup is done in the MAC FIB
636	      within the MAC-VRF-1 context and this time, since we assume CE3-
637	      MAC is known, no further actions are carried out as a result of
638	      the source lookup. A destination MAC lookup is performed next and
639	      the label stack associated to the MAC CE1-MAC is found (including
640	      the label associated to MAC-VRF-1 in PE1 and the P2P LSP label to
641	      get to PE1). The unicast frame is then encapsulated and forwarded
642	      to PE1.

644	   c) At PE1, the packet is identified to be part of EVI1 and a
645	      destination MAC lookup is performed in the MAC-VRF-1 context. The
646	      labels are popped and the frame forwarded to CE1 with CE-VID=1.

648	      Unicast frames from CE1 to CE3 or from CE2 to CE3 follow the same
649	      procedures described above.

651	   (4) Unicast frame example from CE3 to CE2:

653	   a) A unicast frame with CE-VID=1 is issued from source MAC CE3-MAC
654	      and destination MAC CE2-MAC (we assume PE3 has previously resolved
655	      an ARP request from CE3 to find the MAC of CE2-IP).

657	   b) Based on the CE-VID, the frame is identified to be forwarded in
658	      the MAC-VRF-1 context. We assume CE3-MAC is known. A destination
659	      MAC lookup is performed next and PE3 finds CE2-MAC associated to
660	      PE2 on ESI12, an Ethernet Segment for which PE3 has two active A-D
661	      routes per ESI (from PE1 and PE2) and two active A-D routes for
662	      EVI1 (from PE1 and PE2). Based on a hashing function for the
663	      frame, PE3 may decide to forward the frame using the label stack
664	      associated to PE2 (label received from the MAC advertisement
665	      route) or the label stack associated to PE1 (label received from
666	      the A-D route per EVI for EVI1). Either way, the frame is
667	      encapsulated and sent to the remote PE.

669	   c) At PE2 (or PE1), the packet is identified to be part of EVI1 based
670	      on the bottom label, and a destination MAC lookup is performed. At
671	      either PE (PE2 or PE1), the FIB lookup yields a local ESI12 port
672	      to which the frame is sent.

674	   Unicast frames from CE1 to CE2 follow the same procedures. Aliasing
675	   is possible in this case too, since ESI12 is local to PE1 and load
676	   balancing through PE1 and PE2 may happen.
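
The aliasing behavior at PE3 in example (4) can be sketched as below.
This is a hedged illustration, not a normative procedure: the function
names, the tuple layout and the label values are hypothetical, and
CRC32 stands in for whatever hashing function an implementation uses.

```python
import zlib


def candidate_next_hops(mac_route, ad_per_esi, ad_per_evi):
    """Build the set of usable next hops for a remote MAC.

    mac_route: (advertising PE, ESI id, label from the MAC route).
    ad_per_esi: set of PEs with an active A-D route per ESI for that ESI.
    ad_per_evi: dict PE -> aliasing label from its A-D route per EVI.
    """
    pe, esi, label = mac_route
    if esi == 0:
        return {pe: label}            # single-homed: no aliasing
    hops = {}
    if pe in ad_per_esi:
        hops[pe] = label              # label from the MAC advertisement
    for other, alias_label in ad_per_evi.items():
        if other in ad_per_esi and other not in hops:
            hops[other] = alias_label  # label from the A-D route per EVI
    return hops


def pick_next_hop(hops, flow_key):
    """Pick one PE per flow; returns None when no next hop is usable."""
    if not hops:
        return None
    ordered = sorted(hops)
    return ordered[zlib.crc32(flow_key) % len(ordered)]
```

For CE2-MAC advertised by PE2 on ESI12, the candidates are PE2 (MAC
route label) and PE1 (A-D per EVI label); frames of the same flow
always map to the same PE.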

678	5.3. VLAN-bundle service procedures

680	   Instead of using VLAN-based interfaces, the Service Provider can
681	   choose to implement VLAN-bundle interfaces to carry the traffic for
682	   the 4k CE-VIDs among CE1, CE2 and CE3. If that is the case, the 4k
683	   CE-VIDs can be mapped to the same EVI, e.g. EVI200, at each PE. The
684	   main advantage of this approach is the low control plane overhead
685	   (reduced number of routes and labels) and ease of provisioning,
686	   at the expense of no control over the customer broadcast domains,
687	   i.e. a single inclusive multicast tree for all the CE-VIDs and no CE-
688	   VID translation in the Provider network.

690	5.3.1. Service startup procedures

692	   As soon as the EVI200 is created in PE1, PE2 and PE3, the following
693	   control plane actions are carried out:

695	   o Flooding tree setup per EVI (one route): Each PE will send one
696	      Inclusive Multicast Ethernet Tag route per EVI (hence only one
697	      route per PE) so that the flooding tree per EVI can be set up.
698	      Note that ingress replication, P2MP LSPs or MP2MP LSPs can
699	      optionally be signaled in the PMSI Tunnel attribute and the
700	      corresponding tree be created.

702	   o Ethernet A-D routes per ESI (one route for ESI12): A single A-D
703	      route for ESI12 will be issued from PE1 and PE2. This route will
704	      include a single RT (RT for EVI200), an ESI Label extended
705	      community with the active-standby flag set to zero (all-active
706	      multi-homing type) and an ESI Label different from zero (used by
707	      the non-DF for split-horizon functions). This route will be
708	      imported by the three PEs, since the RT matches the EVI200 RT
709	      locally configured. The A-D routes per ESI will be used for fast
710	      convergence and split-horizon functions, as described in [EVPN].

712	   o Ethernet A-D routes per EVI (one route): An A-D route (EVI200) will
713	      be sent by PE1 and PE2 for ESI12. This route includes the EVI200
714	      RT and an MPLS label to be used by PE3 for the aliasing function.
715	      This route will be imported by the three PEs.

717	5.3.2. Packet Walkthrough

719	   The packet walkthrough for the VLAN-bundle case is similar to the one
720	   described for EVI1 in the VLAN-based case except for the way the
721	   CE-VID is handled by the ingress PE and the egress PE:

723	   o No VLAN translation is allowed and the CE-VIDs are kept untouched
724	      from CE to CE, i.e. the ingress CE-VID MUST be kept at the
725	      imposition PE and at the disposition PE.

727	   o The frame is identified to be forwarded in the MAC-VRF-200 context
728	      as long as its CE-VID belongs to the VLAN-bundle defined in the
729	      PE1/PE2/PE3 port to CE1/CE2/CE3. Our example is a special VLAN-
730	      bundle case, since the entire CE-VID range is defined in the
731	      ports, therefore any CE-VID would be part of EVI200.

733	   Please refer to section 5.2.2 for more information about the control
734	   plane and forwarding plane interaction for BUM and unicast traffic
735	   from the different CEs.

737	5.4. VLAN-aware bundling service procedures

739	   The last potential service type analyzed in this document is
740	   VLAN-aware bundling. When this type of service interface is used to
741	   carry the 4k CE-VIDs among CE1, CE2 and CE3, all the CE-VIDs will be
742	   mapped to the same EVI, e.g. EVI300. The difference, compared to the
743	   VLAN-bundle service type in the previous section, is that each
744	   incoming CE-VID will also be mapped to a different "normalized"
745	   Ethernet-Tag in addition to EVI300. If no translation is required,
746	   the Ethernet-tag will match the CE-VID. Otherwise a translation
747	   between CE-VID and Ethernet-tag will be needed at the imposition PE
748	   and at the disposition PE. The main advantage of this approach is the
749	   ability to control customer broadcast domains while providing a
750	   single EVI to the customer.

752	5.4.1. Service startup procedures

754	   As soon as the EVI300 is created in PE1, PE2 and PE3, the following
755	   control plane actions are carried out:

757	   o Flooding tree setup per EVI per Ethernet-Tag (4k routes): Each PE
758	      will send one Inclusive Multicast Ethernet Tag route per EVI and
759	      per Ethernet-Tag (hence 4k routes per PE) so that the flooding
760	      tree per customer broadcast domain can be set up. Note that
761	      ingress replication, P2MP LSPs or MP2MP LSPs can optionally be
762	      signaled in the PMSI Tunnel attribute and the corresponding tree
763	      be created. In the described use-case, since all the CE-VIDs and
764	      Ethernet-Tags are defined on the three PEs, multicast tree
765	      aggregation might make sense in order to save forwarding states.

767	   o Ethernet A-D routes per ESI (one route for ESI12): A single A-D
768	      route for ESI12 will be issued from PE1 and PE2. This route will
769	      include a single RT (RT for EVI300), an ESI Label extended
770	      community with the active-standby flag set to zero (all-active
771	      multi-homing type) and an ESI Label different from zero (used by
772	      the non-DF for split-horizon functions). This route will be
773	      imported by the three PEs, since the RT matches the EVI300 RT
774	      locally configured. The A-D routes per ESI will be used for fast
775	      convergence and split-horizon functions, as described in [EVPN].

777	   o Ethernet A-D routes per EVI (one route): An A-D route (EVI300) will
778	      be sent by PE1 and PE2 for ESI12. This route includes the EVI300
779	      RT and an MPLS label to be used by PE3 for the aliasing function.
780	      This route will be imported by the three PEs.

782	5.4.2. Packet Walkthrough

784	   The packet walkthrough for the VLAN-aware case is similar to the one
785	   described before. Compared to the other two cases, VLAN-aware
786	   services allow for CE-VID translation and for an N:1 CE-VID to EVI
787	   mapping. Neither of the other two service interfaces supports both
788	   at once. Note that this model requires qualified learning on the
789	   MAC FIBs. Some differences compared to the packet walkthrough
790	   described in section 5.2.2 are:

792	   o At the ingress PE, the frames are identified to be forwarded in the
793	      EVI300 context as long as their CE-VID belongs to the range
794	      defined in the PE port to the CE. In addition, CE-VID=x is mapped
795	      to a "normalized" Ethernet-Tag=y at the MAC-VRF-300 (where x and y
796	      might be equal if no translation is needed). Qualified learning is
797	      now required (a different FIB space is allocated within MAC-VRF-
798	      300 for each Ethernet-Tag). Potentially the same MAC could be
799	      learnt in two different Ethernet-Tag bridge domains of the same
800	      MAC-VRF.

802	   o Any new locally learnt MAC on the MAC-VRF-300/Ethernet-Tag=y
803	      interface is advertised by the ingress PE in a MAC advertisement
804	      route, using now the Ethernet-Tag field (Ethernet-Tag=y) so that
805	      the remote PE learns the MAC associated to the MAC-VRF-
806	      300/Ethernet-Tag=y FIB. Note that the Ethernet-Tag field is not
807	      used in advertisements of MACs learnt on VLAN-based or VLAN-bundle
808	      service interfaces.

810	   o At the ingress PE, BUM frames are sent to the corresponding
811	      flooding tree for the particular Ethernet-Tag they are mapped to.
812	      Each individual Ethernet-Tag can have a different flooding tree
813	      within the same EVI300. For instance, Ethernet-Tag=y can use
814	      ingress replication to get to the remote PEs whereas Ethernet-
815	      Tag=z can use a P2MP LSP.

817	   o At the egress PE, Ethernet-Tag=y, for a given broadcast domain
818	      within MAC-VRF-300, can be translated to egress CE-VID=x. That is
819	      not possible for VLAN-bundle interfaces. It is possible for VLAN-
820	      based interfaces, but it requires a separate EVI per CE-VID.
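
Qualified learning and CE-VID normalization for the VLAN-aware case can
be sketched as follows. This is an illustration under stated
assumptions: the class and method names are hypothetical, and the
CE-VID to Ethernet-Tag mapping is assumed to be one-to-one on a given
port so that it can be inverted at the egress.

```python
class VlanAwareMacVrf:
    """Hypothetical MAC-VRF with one FIB space per Ethernet-Tag."""

    def __init__(self, vid_to_tag):
        # Ingress mapping: CE-VID x -> "normalized" Ethernet-Tag y.
        self.vid_to_tag = dict(vid_to_tag)
        # Egress mapping: Ethernet-Tag y -> CE-VID x (assumed one-to-one).
        self.tag_to_vid = {tag: vid for vid, tag in self.vid_to_tag.items()}
        # Qualified learning: the key is (Ethernet-Tag, MAC), not MAC alone.
        self.fib = {}

    def learn(self, ce_vid, mac, port):
        """Learn a local MAC; return the Ethernet-Tag to advertise with it."""
        tag = self.vid_to_tag[ce_vid]
        self.fib[(tag, mac)] = port
        return tag

    def lookup(self, tag, mac):
        """Return the port for a MAC in one bridge domain, or None."""
        return self.fib.get((tag, mac))

    def egress_ce_vid(self, tag):
        """Translate an Ethernet-Tag back to the egress CE-VID."""
        return self.tag_to_vid[tag]
```

Because the key includes the Ethernet-Tag, the same MAC can be learnt
on two different ports in two bridge domains of the same MAC-VRF
without one entry overwriting the other.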

822	6. MPLS-based forwarding model use-case

824	   EVPN supports an alternative forwarding model, usually referred to as
825	   MPLS-based forwarding or disposition model as opposed to the
826	   MAC-based forwarding or disposition model described in section 5.
827	   Using the MPLS-based forwarding model instead of the MAC-based model
828	   might have an impact on:

830	   o The number of forwarding states required

832	   o The FIB where the forwarding states are handled: MAC FIB or MPLS
833	      LFIB.

835	   The MPLS-based forwarding model avoids the destination MAC lookup at
836	   the egress PE MAC FIB, at the expense of increasing the number of
837	   next-hop forwarding states at the egress MPLS LFIB. This also has an
838	   impact on the control plane and the label allocation model, since an
839	   MPLS-based disposition PE MUST send as many routes and labels as
840	   required next-hops in the egress MAC-VRF. This concept is equivalent
841	   to the forwarding models supported in IP-VPNs at the egress PE, where
842	   an IP lookup in the IP-VPN FIB might be necessary or not depending on
843	   the available next-hop forwarding states in the LFIB.

845	   The following sub-sections highlight the impact on the control and
846	   data plane procedures described in section 5 when an MPLS-based
847	   forwarding model is used.

849	   Note that both forwarding models are compatible and interoperable in
850	   the same network. The implementation of either model in each PE is a
851	   decision local to the PE node.

853	6.1. Impact of MPLS-based forwarding on the EVPN network startup

855	   The MPLS-based forwarding model has no impact on the procedures
856	   explained in section 5.1.

858	6.2. Impact of MPLS-based forwarding on the VLAN-based service
859	   procedures

861	   Compared to the MAC-based forwarding model, the MPLS-based forwarding
862	   model has no impact in terms of number of routes, when all the
863	   service interfaces are VLAN-based. The differences for the use-case
864	   described in this document are summarized in the following list:

866	   o Flooding tree setup per EVI (4k routes per PE): no impact compared
867	      to the MAC-based model.

869	   o Ethernet A-D routes per ESI (one set of routes for ESI12 per PE):
870	      no impact compared to the MAC-based model.

872	   o Ethernet A-D routes per EVI (4k routes per PE/ESI): no impact
873	      compared to the MAC-based model.

875	   o MAC-advertisement routes: instead of allocating and advertising the
876	      same MPLS label for all the new MACs locally learnt on the same
877	      MAC-VRF, a different label MUST be advertised per CE next-hop or
878	      MAC so that no MAC FIB lookup is needed at the egress PE. In
879	      general, this means that a different label at least per CE must be
880	      advertised, although the PE can decide to implement a label per
881	      MAC if more granularity (hence less scalability) is required in
882	      terms of forwarding states. E.g. if CE2 sends traffic from two
883	      different MACs to PE1, CE2-MAC1 and CE2-MAC2, the same MPLS
884	      label=x can be re-used for both MAC advertisements since they both
885	      share the same source ESI12. It is up to the PE1 implementation to
886	      use a different label per individual MAC within the same Ethernet
887	      Segment (even if only one label per ESI is enough).

889	   o PE1, PE2 and PE3 will not add forwarding states to the MAC FIB upon
890	      learning new local CE MAC addresses on the data plane, but will
891	      rather add forwarding states to the MPLS LFIB.

893	6.3. Impact of MPLS-based forwarding on the VLAN-bundle service
894	      procedures

896	   Compared to the MAC-based forwarding model, the MPLS-based forwarding
897	   model has no impact in terms of number of routes when all the service
898	   interfaces are VLAN-bundle type. The differences for the use-case
899	   described in this document are summarized in the following list:

901	   o Flooding tree setup per EVI (one route): no impact compared to the
902	      MAC-based model.

904	   o Ethernet A-D routes per ESI (one route for ESI12 per PE): no impact
905	      compared to the MAC-based model.

907	   o Ethernet A-D routes per EVI (one route per PE/ESI): no impact
908	      compared to the MAC-based model since no VLAN translation is
909	      required.

911	   o MAC-advertisement routes: instead of allocating and advertising the
912	      same MPLS label for all the new MACs locally learnt on the same
913	      MAC-VRF, a different label MUST be advertised per CE next-hop or
914	      MAC so that no MAC FIB lookup is needed at the egress PE. In
915	      general, this means that a different label at least per CE must be
916	      advertised, although the PE can decide to implement a label per
917	      MAC if more granularity (hence less scalability) is required in
918	      terms of forwarding states. It is up to the PE1 implementation to
919	      use a different label per individual MAC within the same Ethernet
920	      Segment (even if only one label per ESI is enough).

922	   o PE1, PE2 and PE3 will not add forwarding states to the MAC FIB upon
923	      learning new local CE MAC addresses on the data plane, but will
924	      rather add forwarding states to the MPLS LFIB.

926	6.4. Impact of MPLS-based forwarding on the VLAN-aware service
927	      procedures

929	   Compared to the MAC-based forwarding model, the MPLS-based forwarding
930	   model definitely has an impact in terms of number of A-D routes
931	   when all the service interfaces are VLAN-aware bundle type. The
932	   differences for the use-case described in this document are
933	   summarized in the following list:

935	   o Flooding tree setup per EVI (4k routes per PE): no impact compared
936	      to the MAC-based model.

938	   o Ethernet A-D routes per ESI (one route for ESI12 per PE): no impact
939	      compared to the MAC-based model.

941	   o Ethernet A-D routes per EVI (4k routes per PE/ESI): PE1 and PE2
942	      will send 4k routes for EVI300, one per <ESI, Ethernet-Tag ID>
943	      tuple. This will allow the egress PE to find out all the
944	      forwarding information in the MPLS LFIB and even support Ethernet-
945	      Tag to CE-VID translation at the egress. The MAC-based forwarding
946	      model would allow the PEs to send a single route per PE/ESI for
947	      EVI300, since the packet with the embedded Ethernet-Tag would be
948	      used to perform a MAC lookup and find out the egress CE-VID.

950	   o MAC-advertisement routes: instead of allocating and advertising the
951	      same MPLS label for all the new MACs locally learnt on the same
952	      MAC-VRF, a different label MUST be advertised per CE next-hop or
953	      MAC so that no MAC FIB lookup is needed at the egress PE. In
954	      general, this means that a different label at least per CE must be
955	      advertised, although the PE can decide to implement a label per
956	      MAC if more granularity (hence less scalability) is required in
957	      terms of forwarding states. It is up to the PE1 implementation to
958	      use a different label per individual MAC within the same Ethernet
959	      Segment. Note that, in this model, the Ethernet-Tag will be set to
960	      a non-zero value for the MAC-advertisement routes. The same MAC
961	      address can be announced with different Ethernet-Tag values. This
962	      will make the advertising PE install two different forwarding
963	      states in the MPLS LFIB.

965	   o PE1, PE2 and PE3 will not add forwarding states to the MAC FIB upon
966	      learning new local CE MAC addresses on the data plane, but will
967	      rather add forwarding states to the MPLS LFIB.

969	7. Comparison between MAC-based and MPLS-based forwarding models

971	   Both forwarding models are possible in a network deployment and each
972	   one has its own trade-offs.

974	   The MAC-based forwarding model can save A-D routes per EVI when VLAN-
975	   aware bundling services are deployed and therefore reduce the control
976	   plane overhead. This model also saves a significant amount of MPLS
977	   labels compared to the MPLS-based forwarding model. All the MACs and
978	   A-D routes for the same EVI can signal the same MPLS label, saving
979	   labels from the local PE space. A MAC FIB lookup at the egress PE is
980	   required in order to do so.

982	   The MPLS-based forwarding model can save forwarding states at the
983	   egress PEs if labels per next hop CE (as opposed to per MAC) are
984	   implemented. No egress MAC lookup is required. An A-D route per <EVI,
985	   Ethernet-Tag> is required for VLAN-aware services, as opposed to an
986	   A-D route per EVI. Also, a different label per next-hop CE per MAC-
987	   VRF is consumed, as opposed to a single label per MAC-VRF.

989	   The following table summarizes the implementation details of both
990	   models for the VLAN-aware bundling service type.

992	    +-----------------------------+----------------+----------------+
993	    |  4k CE-VID VLANs            | MAC-based      | MPLS-based     |
994	    |                             | Model          | Model          |
995	    +-----------------------------+----------------+----------------+
996	    | A-D routes/EVI              | 1 per ESI/EVI  | 4k per ESI/EVI |
997	    | MPLS labels consumed        | 1 per MAC-VRF  | 1 per CE/EVI   |
998	    | Egress PE Forwarding states | 1 per MAC      | 1 per next-hop |
999	    | Egress PE Lookups           | 2 (MPLS+MAC)   | 1 (MPLS)       |
1000	    +-----------------------------+----------------+----------------+

1002	   The egress forwarding model is an implementation local to the egress
1003	   PE and is independent of the model supported on the rest of the PEs,
1004	   i.e. in our use-case, PE1, PE2 and PE3 could have either egress
1005	   forwarding model without any dependencies.

1007	8. Traffic flow optimization

1009	   In addition to the procedures described across sections 1 through 7,
1010	   EVPN [EVPN] procedures allow for optimized traffic handling in order
1011	   to minimize unnecessary flooding across the entire infrastructure.
1012	   Optimization is provided through specific ARP termination and the
1013	   ability to block unknown unicast flooding. Additionally, EVPN
1014	   procedures allow for intelligent, close to the source, inter-subnet
1015	   forwarding and solve the commonly known sub-optimal routing problem.
1016	   Besides the traffic efficiency, ingress based inter-subnet forwarding
1017	   also optimizes packet forwarding rules and implementation at the
1018	   egress nodes. Details of these procedures are outlined in
1019	   sections 8.1 and 8.2.

1021	8.1. Control Plane Procedures

1023	8.1.1. MAC learning options

1025	   The fundamental premise of [EVPN] is the notion of a different
1026	   approach to MAC address learning compared to traditional IEEE 802.1
1027	   bridge learning methods; specifically EVPN differentiates between
1028	   data and control plane driven learning mechanisms.

1030	   Data driven learning implies that there is no separate communication
1031	   channel used to advertise and propagate MAC addresses. Rather, MAC
1032	   addresses are learned through IEEE defined bridge-learning procedures
1033	   as well as by snooping on DHCP and ARP requests. As different MAC
1034	   addresses show up on different ports, the L2 FIB is populated with
1035	   the appropriate MAC addresses.

1037	   Control plane driven learning implies a communication channel that
1038	   could be either a control-plane protocol or a management-plane
1039	   mechanism. In the context of EVPN, two different learning procedures
1040	   are defined, i.e. local and remote procedures:

1042	   o  Local learning defines the procedures used for learning the MAC
1043	      addresses of network elements locally connected to a MAC-VRF.
1044	      Local learning could be implemented through all three learning
1045	      procedures: control plane, management plane as well as data plane.
1046	      However, the expectation is that for most of the use cases, local
1047	      learning through data plane should be sufficient.

1049	   o  Remote learning defines the procedures used for learning MAC
1050	      addresses of network elements remotely connected to a MAC-VRF,
1051	      i.e. far-end PEs. Remote learning procedures defined in [EVPN]
1052	      advocate using only control plane learning; specifically BGP.
1053	      Through the use of BGP EVPN NLRIs, the remote PE has the
1054	      capability of advertising all the MAC addresses present in its
1055	      local FIB.

1057	8.1.2. Proxy-ARP/ND

1059	   In EVPN, MAC addresses are advertised via the MAC/IP Advertisement
1060	   Route, as discussed in [EVPN]. Optionally an IP address can be
1061	   advertised along with the MAC address announcement. However, there
1062	   are certain rules put in place in terms of IP address usage: if the
1063	   MAC Advertisement Route contains an IP address, and the IP Address
1064	   Length is 32 bits (or 128 in the IPv6 case), this particular IP
1065	   address correlates directly with the advertised MAC address. Such
1066	   advertisement allows us to build a proxy-ARP/ND table populated with
1067	   the IP<>MAC bindings received from all the remote nodes.

1069	   Furthermore, based on these bindings, a local MAC-VRF can now provide
1070	   Proxy-ARP/ND functionality for all ARP requests and ND solicitations
1071	   directed to the IP address pool learned through BGP. Therefore, the
1072	   amount of unnecessary L2 flooding, ARP/ND requests/solicitations in
1073	   this case, can be further reduced by the introduction of Proxy-ARP/ND
1074	   functionality across all EVI MAC-VRFs.
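
A minimal sketch of how a PE might populate and use such a table is
shown below. It is illustrative only: the function names are
hypothetical, the table is a plain dictionary, and the addresses in the
usage note are documentation values, not values from this use-case.

```python
def install_binding(table, mac, ip, ip_len_bits):
    """Install an IP<>MAC binding from a received MAC advertisement.

    Only host addresses are installed: IP Address Length of 32 bits
    (IPv4) or 128 bits (IPv6). Returns True if the binding was added.
    """
    if ip is not None and ip_len_bits in (32, 128):
        table[ip] = mac
        return True
    return False


def proxy_arp_reply(table, target_ip):
    """Return the MAC to answer with, or None if the PE cannot reply."""
    return table.get(target_ip)
```

For instance, after install_binding(table, "00:00:5e:00:53:01",
"192.0.2.1", 32), an ARP request for 192.0.2.1 can be answered locally;
a request for an address with no binding returns None and is handled as
a regular broadcast frame.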

1076	8.1.3. Unknown Unicast flooding suppression

1078	   Given that all locally learned MAC addresses are advertised through
1079	   BGP to all remote PEs, suppressing flooding of any Unknown Unicast
1080	   traffic towards the remote PEs is a feasible network optimization.

1082	   The use case assumes that any network device that appears on a
1083	   remote MAC-VRF will somehow signal its presence to the network.
1084	   This signaling can be done through e.g. gratuitous ARPs.

1086	   Once the remote PE acknowledges the presence of the node in the MAC-
1087	   VRF, it will do two things: install its MAC address in its local FIB
1088	   and advertise this MAC address to all other BGP speakers via EVPN
1089	   NLRI. Therefore, we can assume that any active MAC address is
1090	   propagated and learnt through the entire EVI. Given that MAC
1091	   addresses become pre-populated - once nodes are alive on the network
1092	   - there is no need to flood any unknown unicast towards the remote
1093	   PEs. If the owner of a given destination MAC is active, the BGP route
1094	   will be present in the local RIB and FIB, assuming that the BGP
1095	   import policies are successfully applied; otherwise, the owner of
1096	   such destination MAC is not present on the network.

1098	   It is worth noting that unless: a) control or management plane
1099	   learning is performed through the entire EVI or b) all the EVI-
1100	   attached devices signal their presence when they come up (GARPs or
1101	   similar), unknown unicast flooding MUST be enabled.
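
The resulting forwarding decision for a unicast frame can be sketched
as follows. This is a simplified illustration with hypothetical names;
it treats flooding suppression as a per-MAC-VRF setting, which is an
assumption of the sketch and not something this document specifies.

```python
def handle_unicast(fib, dst_mac, flooding_enabled):
    """Decide what to do with a unicast frame at the ingress PE.

    fib: dict MAC -> next hop, populated locally and from BGP routes.
    flooding_enabled: True when unknown unicast flooding towards the
                      remote PEs must stay enabled.
    """
    if dst_mac in fib:
        return ("forward", fib[dst_mac])
    if flooding_enabled:
        return ("flood", None)   # unknown unicast is flooded
    return ("drop", None)        # owner of the MAC is assumed absent
```

The "drop" outcome is only safe when one of the two conditions above
holds, i.e. when every active MAC is known through the control or
management plane or has signaled its presence.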

1103	8.1.4. Optimization of Inter-subnet forwarding

1105	   In a scenario in which both L2 and L3 services are needed over the
1106	   same physical topology, some interaction between EVPN and IP-VPN is
1107	   required. A common way of stitching the two service planes is through
1108	   the use of an IRB interface, which allows for traffic to be either
1109	   routed or bridged depending on its destination MAC address. If the
1110	   destination MAC address is the one of the IRB interface, traffic
1111	   needs to be passed through a routing module and potentially be either
1112	   routed to a remote PE or forwarded to a local subnet. If the
1113	   destination MAC address is not the one of the IRB, the MAC-VRF
1114	   follows standard bridging procedures.
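
   The route-or-bridge decision described above can be sketched as
   follows. This is a minimal illustration, not an implementation
   mandated by this document: the MacVrf class, the classify_frame
   function and the MAC address values are hypothetical names chosen
   for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MacVrf:
    """Hypothetical MAC-VRF with one attached IRB interface."""
    name: str
    irb_mac: str
    fib: dict = field(default_factory=dict)  # destination MAC -> next hop


def classify_frame(mac_vrf: MacVrf, dst_mac: str) -> str:
    """Return 'route' when the frame targets the IRB MAC, else 'bridge'."""
    # Frames addressed to the IRB interface are handed to the routing
    # module; everything else follows standard bridging procedures.
    if dst_mac.lower() == mac_vrf.irb_mac.lower():
        return "route"
    return "bridge"


# Example values are assumptions made for illustration only.
vrf = MacVrf(name="MAC-VRF-60", irb_mac="00:00:5e:00:53:01")
print(classify_frame(vrf, "00:00:5E:00:53:01"))  # route
print(classify_frame(vrf, "00:00:5e:00:53:aa"))  # bridge
```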

1116	   A typical example of EVPN inter-subnet forwarding would be a scenario
1117	   in which multiple IP subnets are part of a single or multiple EVIs,
1118	   and they all belong to a single IP-VPN. In such topologies, it is
1119	   desired that inter-subnet traffic can be efficiently routed without
1120	   any tromboning effects in the network. Due to the overlapping
1121	   physical and service topology in such scenarios, all inter-subnet
1122	   connectivity will be locally routed through the IRB interface.

1124	   In addition to optimizing the traffic patterns in the network, local
1125	   inter-subnet forwarding also greatly reduces the amount of
1126	   processing needed to cross the subnets. Through EVPN MAC
1127	   advertisements, the local PE learns the real destination MAC address
1128	   associated with the remote IP address and the inter-subnet forwarding
1129	   can happen locally. When the packet is received at the egress PE, it
1130	   is directly mapped to an egress MAC-VRF, bypassing any egress IP-VPN
1131	   processing.

1133	   Please refer to [EVPN-INTERSUBNET] for more information about the IP
1134	   inter-subnet forwarding procedures in EVPN.

1136	8.2. Packet Walkthrough Examples

1138	   Assuming that the services are set up according to figure 1 in section
1139	   2, the following flow optimization processes will take place in terms
1140	   of creating, receiving and forwarding packets across the network.

1142	8.2.1. Proxy-ARP example for CE2 to CE3 traffic

1144	   Using figure 1 in section 2, consider EVI 400 residing on PE1, PE2
1145	   and PE3 connecting CE2 and CE3 networks. Also, consider that PE1 and
1146	   PE2 are part of the all-active multi-homing ES for CE2, and that PE2
1147	   is elected designated-forwarder for EVI 400. We assume that all the
1148	   PEs implement the proxy-ARP functionality in the MAC-VRF-400 context.

1150	   In this scenario, PE3 will not only advertise the MAC addresses
1151	   through the EVPN MAC Advertisement Route but also IP addresses of
1152	   individual hosts, i.e. /32 prefixes, behind CE3. Upon receiving the
1153	   EVPN routes, PE1 and PE2 will install the MAC addresses in the MAC-
1154	   VRF-400 FIB and based on the associated received IP addresses, PE1
1155	   and PE2 can now build a proxy-ARP table within the context of MAC-
1156	   VRF-400.

1158	   From the forwarding perspective, when a node behind CE2 sends a frame
1159	   destined to a node behind CE3, it will first send an ARP request to
1160	   e.g. PE2 (based on the result of the CE2 hashing). Assuming that PE2
1161	   has populated its proxy-ARP table for all active nodes behind the
1162	   CE3, and that the IP address in the ARP message matches the entry in
1163	   the table, PE2 will respond to the ARP request with the actual MAC
1164	   address on behalf of the node behind CE3.

1166	   Once the nodes behind CE2 learn the actual MAC address of the nodes
1167	   behind CE3, all the MAC-to-MAC communications between the two
1168	   networks will be unicast.
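
   The proxy-ARP behavior in this example can be sketched as follows.
   This is a simplified illustration under stated assumptions: the
   ProxyArpTable class, its method names and the address values are
   hypothetical, and the handling of a table miss (returning None) is
   a choice made for the sketch, not something specified by the text
   above.

```python
from typing import Optional


class ProxyArpTable:
    """Hypothetical per-MAC-VRF proxy-ARP table (IP address -> MAC)."""

    def __init__(self) -> None:
        self._entries = {}  # IP address string -> MAC address string

    def learn_from_evpn_route(self, mac: str, ip: Optional[str]) -> None:
        # Only routes that carry an IP address alongside the MAC can
        # populate the proxy-ARP table.
        if ip is not None:
            self._entries[ip] = mac

    def reply_for(self, target_ip: str) -> Optional[str]:
        # MAC to place in the ARP reply, or None when no entry matches.
        return self._entries.get(target_ip)


# PE2 receives a MAC/IP advertisement from PE3 for a host behind CE3
# (assumed example values).
table = ProxyArpTable()
table.learn_from_evpn_route(mac="00:00:5e:00:53:03", ip="192.0.2.30")

# A node behind CE2 sends an ARP request for 192.0.2.30; PE2 answers
# on behalf of the node behind CE3.
print(table.reply_for("192.0.2.30"))  # 00:00:5e:00:53:03
print(table.reply_for("192.0.2.99"))  # None
```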

1170	8.2.2. Flood suppression example for CE1 to CE3 traffic

1172	   Using figure 1 in section 2, consider EVI 500 residing on PE1 and PE3
1173	   connecting CE1 and CE3 networks. Consider that both PE1 and PE3 have
1174	   disabled unknown unicast flooding for this specific EVI context. Once
1175	   the network devices behind CE3 come online, PE3 will learn their MAC
1176	   addresses and create local FIB entries for these devices. Note that
1177	   local FIB entries could also be created through either a control or
1178	   management plane between PE and CE. Consequently, PE3 will
1179	   automatically create EVPN Type 2 MAC Advertisement Routes and
1180	   advertise all locally learned MAC addresses. The routes will also
1181	   include the corresponding MPLS label.

1183	   Given that PE1 automatically learns and installs all MAC addresses
1184	   behind CE3, its MAC-VRF FIB will already be pre-populated with the
1185	   respective next-hops and label assignments associated with the MAC
1186	   addresses behind CE3. As such, as soon as the traffic sent by CE1 to
1187	   nodes behind CE3 is received into the context of EVI 500, PE1 will
1188	   push the MPLS Label(s) onto the original Ethernet frame and send the
1189	   packet to the MPLS network. As usual, once PE3 receives this packet,
1190	   and depending on the forwarding model, PE3 will either do a next-hop
1191	   lookup in the EVI 500 context, or will just forward the traffic
1192	   directly to the CE3. In the case that PE1 MAC-VRF-500 does not have a
1193	   MAC entry for a specific destination that CE1 is trying to reach, PE1
1194	   will drop the frame since unknown unicast flooding is disabled.

1196	   Based on the assumption that all the MAC entries behind the CEs are
1197	   pre-populated through gratuitous-ARP and/or DHCP requests, if one
1198	   specific MAC entry is not present in the MAC-VRF-500 FIB on PE1, the
1199	   owner of that MAC is not alive on the network behind CE3, hence
1200	   the traffic can be dropped at PE1 instead of being flooded and
1201	   consuming network bandwidth.
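
   The forwarding decision at PE1 in this example can be sketched as
   follows. This is a minimal illustration only: the forward_unicast
   function, the FIB layout (destination MAC mapped to a next hop and
   a label) and the label value are assumptions made for the sketch.

```python
def forward_unicast(fib, dst_mac, flood_unknown_unicast=False):
    """Decide what to do with a unicast frame in a MAC-VRF context.

    fib maps a destination MAC to a (next_hop, mpls_label) tuple.
    Returns a (action, next_hop, mpls_label) tuple.
    """
    entry = fib.get(dst_mac)
    if entry is not None:
        next_hop, label = entry
        return ("forward", next_hop, label)
    if flood_unknown_unicast:
        return ("flood", None, None)
    # Flooding disabled: an unknown destination is assumed not to be
    # alive on the network, so the frame is dropped at the ingress PE.
    return ("drop", None, None)


# MAC-VRF-500 FIB on PE1, pre-populated from the routes advertised by
# PE3 (hypothetical MAC and label values).
mac_vrf_500 = {"00:00:5e:00:53:03": ("PE3", 500003)}

print(forward_unicast(mac_vrf_500, "00:00:5e:00:53:03"))
# ('forward', 'PE3', 500003)
print(forward_unicast(mac_vrf_500, "00:00:5e:00:53:ff"))
# ('drop', None, None)
```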

1203	8.2.3. Optimization of inter-subnet forwarding example for CE3 to CE2
1204	   traffic

1206	   Using figure 1 in section 2, consider that there is an IP-VPN 666
1207	   context residing on PE1, PE2 and PE3 which connects CE1, CE2 and CE3
1208	   into a single IP-VPN domain. Also consider that there are two EVIs
1209	   present on the PEs, EVI 600 and EVI 60. Each IP subnet is associated
1210	   to a different MAC-VRF context. Thus there is a single subnet, subnet
1211	   600, between CE1 and CE3 that is established through EVI 600.
1212	   Similarly, there is another subnet, subnet 60, between CE2 and CE3
1213	   that is established through EVI 60. Since both subnets are part of
1214	   the same IP VPN, there is a mapping of each EVI (or individual
1215	   subnet) to a local IRB interface on the three PEs.

1217	   If a node behind CE2 wants to communicate with a node on the same
1218	   subnet sitting behind CE3, the communication flow will follow the
1219	   standard EVPN procedures, i.e. FIB lookup within the PE1 (or PE2)
1220	   after adding the corresponding EVPN label to the MPLS label stack
1221	   (downstream label allocation from PE3 for EVI 60).

1223	   When it comes to crossing the subnet boundaries, the ingress PE
1224	   implements local inter-subnet forwarding. For example, when a node
1225	   behind CE2 (EVI 60) sends a packet to a node behind CE1 (EVI 600) the
1226	   destination IP address will be in the subnet 600, but the destination
1227	   MAC address will be the address of source node's default gateway,
1228	   which in this case will be an IRB interface on PE1 (connecting EVI 60
1229	   to IP-VPN 666). Once PE1 sees the traffic destined to its own MAC
1230	   address, it will route the packet to EVI 600, i.e. it will change the
1231	   source MAC address to the one of the IRB interface in EVI 600 and
1232	   change the destination MAC address to the address belonging to the
1233	   node behind CE1, which is already populated in the MAC-VRF-600 FIB,
1234	   either through data or control plane learning.
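
   The MAC rewrite performed by the ingress PE when crossing EVIs can
   be sketched as follows. This is a simplified illustration under
   stated assumptions: the Frame class, the route_between_evis
   function, the subnet prefix, the addresses and the IP-to-MAC
   binding table (standing in for whatever ARP or EVPN-learned state
   the PE holds for EVI 600) are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    dst_ip: str


def route_between_evis(frame, ingress_irb_mac, egress_irb_mac,
                       egress_subnet, ip_to_mac):
    """Return the rewritten frame, or None if it is not routed here."""
    # Only frames addressed to the ingress IRB MAC are routed.
    if frame.dst_mac != ingress_irb_mac:
        return None
    # The destination IP must belong to the egress subnet.
    dst = ipaddress.ip_address(frame.dst_ip)
    if dst not in ipaddress.ip_network(egress_subnet):
        return None
    new_dst_mac = ip_to_mac.get(frame.dst_ip)
    if new_dst_mac is None:
        return None
    # Source MAC becomes the egress IRB MAC; destination MAC becomes
    # the MAC of the target node in the egress EVI.
    return replace(frame, src_mac=egress_irb_mac, dst_mac=new_dst_mac)


# Assumed example values for EVI 60 -> EVI 600 on PE1.
IRB_60 = "00:00:5e:00:53:60"
IRB_600 = "00:00:5e:00:53:61"
bindings_600 = {"198.51.100.10": "00:00:5e:00:53:0a"}

frame_in = Frame(src_mac="00:00:5e:00:53:02", dst_mac=IRB_60,
                 dst_ip="198.51.100.10")
frame_out = route_between_evis(frame_in, IRB_60, IRB_600,
                               "198.51.100.0/24", bindings_600)
print(frame_out.src_mac)  # 00:00:5e:00:53:61
print(frame_out.dst_mac)  # 00:00:5e:00:53:0a
```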

1236	   An important optimization to be noted is the local inter-subnet
1237	   forwarding in lieu of IP VPN routing. If the node from subnet 60
1238	   (behind CE2) is sending a packet to the remote end node on subnet 600
1239	   (behind CE3), the mechanism in place still honors the local inter-
1240	   subnet (inter-EVI) forwarding.

1242	   In our use-case, therefore, when a node on subnet 60 behind CE2
1243	   sends traffic to the node on subnet 600 behind CE3, the destination
1244	   MAC address is the PE1 MAC-VRF-60 IRB MAC address. However, once the
1245	   traffic locally crosses EVIs, to EVI 600, via the IRB interface on
1246	   PE1, the source MAC address is changed to that of the IRB interface
1247	   and the destination MAC address is changed to the one advertised by
1248	   PE3 via EVPN and already installed in MAC-VRF-600. The rest of the
1249	   forwarding through PE1 is using the MAC-VRF-600 forwarding context
1250	   and label space.

1252	   Another very relevant optimization is due to the fact that traffic
1253	   between PEs is forwarded through EVPN, rather than through IP-VPN. In
1254	   the example described above for traffic from EVI 60 on CE2 to EVI 600
1255	   on CE3, there is no need for IP-VPN processing on the egress PE3.
1256	   Traffic is forwarded either to the EVI 600 context in PE3 for further
1257	   MAC lookup and next-hop processing, or directly to the node behind
1258	   CE3, depending on the egress forwarding model being used.

1260	9. Conventions used in this document

1262	   In the examples, the following conventions are used:

1264	   o CE-VIDs refer to the VLAN tag identifiers being used at CE1, CE2
1265	      and CE3 to tag customer traffic sent to the Service Provider
1266	      EVPN network

1268	   o CE1-MAC, CE2-MAC and CE3-MAC refer to source MAC addresses "behind"
1269	      each CE respectively. Those MAC addresses can belong to the CEs
1270	      themselves or to devices connected to the CEs.

1272	   o CE1-IP, CE2-IP and CE3-IP refer to IP addresses associated to the
1273	      above MAC addresses.

1275	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
1276	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
1277	   document are to be interpreted as described in RFC-2119 [RFC2119].

1279	   In this document, these words will appear with that interpretation
1280	   only when in ALL CAPS. Lower case uses of these words are not to be
1281	   interpreted as carrying RFC-2119 significance.

1283	10. Security Considerations

1285	11. IANA Considerations

1287	12. References

1289	12.1. Normative References

1291	   [RFC4761]Kompella, K., Ed., and Y. Rekhter, Ed., "Virtual Private LAN
1292	   Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761,
1293	   January 2007, <http://www.rfc-editor.org/info/rfc4761>.

1295	   [RFC4762]Lasserre, M., Ed., and V. Kompella, Ed., "Virtual Private
1296	   LAN Service (VPLS) Using Label Distribution Protocol (LDP)
1297	   Signaling", RFC 4762, January 2007, <http://www.rfc-
1298	   editor.org/info/rfc4762>.

1300	   [RFC6074]Rosen, E., Davie, B., Radoaca, V., and W. Luo,
1301	   "Provisioning, Auto-Discovery, and Signaling in Layer 2 Virtual
1302	   Private Networks (L2VPNs)", RFC 6074, January 2011, <http://www.rfc-
1303	   editor.org/info/rfc6074>.

1305	   [RFC4364]Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
1306	   Networks (VPNs)", RFC 4364, February 2006, <http://www.rfc-
1307	   editor.org/info/rfc4364>.

1309	   [RFC7209]Sajassi, A., Aggarwal, R., Uttaro, J., Bitar, N.,
1310	   Henderickx, W., and A. Isaac, "Requirements for Ethernet VPN (EVPN)",
1311	   RFC 7209, May 2014, <http://www.rfc-editor.org/info/rfc7209>.

1313	   [RFC7117]Aggarwal, R., Ed., Kamite, Y., Fang, L., Rekhter, Y., and C.
1314	   Kodeboniya, "Multicast in Virtual Private LAN Service (VPLS)",
1315	   RFC 7117, February 2014, <http://www.rfc-editor.org/info/rfc7117>.

1317	   [RFC709] A. Sajassi, R. Aggarwal et al., "Requirements for Ethernet
1318	   VPN", RFC7209, May 2014

1320	12.2. Informative References

1322	   [EVPN] Sajassi et al., "BGP MPLS Based Ethernet VPN", draft-ietf-
1323	   l2vpn-evpn-10.txt, work in progress, October, 2014

1325	   [EVPN-INTERSUBNET] Sajassi et al., "IP Inter-subnet forwarding in
1326	   EVPN", draft-sajassi-l2vpn-evpn-inter-subnet-forwarding-05.txt

1328	13. Acknowledgments

1330	   The authors want to thank Giles Heron for his detailed review of the
1331	   document. We also thank Stefan Plug for his comments.

1333	   This document was prepared using 2-Word-v2.0.template.dot.

1335	14. Authors' Addresses

1337	   Jorge Rabadan
1338	   Alcatel-Lucent
1339	   777 E. Middlefield Road
1340	   Mountain View, CA 94043 USA
1341	   Email: jorge.rabadan@alcatel-lucent.com

1343	   Senad Palislamovic
1344	   Alcatel-Lucent
1345	   Email: senad.palislamovic@alcatel-lucent.com

1347	   Wim Henderickx
1348	   Alcatel-Lucent
1349	   Email: wim.henderickx@alcatel-lucent.be

1351	   Florin Balus
1352	   Alcatel-Lucent
1353	   Email: Florin.Balus@alcatel-lucent.com

1355	   Keyur Patel
1356	   Cisco
1357	   Email: keyupate@cisco.com

1359	   Ali Sajassi
1360	   Cisco
1361	   Email: sajassi@cisco.com
1362	   James Uttaro
1363	   AT&T
1364	   Email: uttaro@att.com

1366	   Aldrin Isaac
1367	   Bloomberg
1368	   Email: aisaac71@bloomberg.net

1370	   Truman Boyes
1371	   Bloomberg
1372	   Email: tboyes@bloomberg.net