| < draft-ietf-bess-evpn-pref-df-06.txt | draft-ietf-bess-evpn-pref-df-07.txt > | |||
|---|---|---|---|---|
| BESS Workgroup J. Rabadan, Ed. | BESS Workgroup J. Rabadan, Ed. | |||
| Internet-Draft S. Sathappan | Internet-Draft S. Sathappan | |||
| Intended status: Standards Track Nokia | Intended status: Standards Track Nokia | |||
| Expires: December 21, 2020 T. Przygienda | Expires: September 13, 2021 T. Przygienda | |||
| W. Lin | W. Lin | |||
| J. Drake | J. Drake | |||
| Juniper Networks | Juniper Networks | |||
| A. Sajassi | A. Sajassi | |||
| S. Mohanty | S. Mohanty | |||
| Cisco Systems | Cisco Systems | |||
| June 19, 2020 | March 12, 2021 | |||
| Preference-based EVPN DF Election | Preference-based EVPN DF Election | |||
| draft-ietf-bess-evpn-pref-df-06 | draft-ietf-bess-evpn-pref-df-07 | |||
| Abstract | Abstract | |||
| The Designated Forwarder (DF) in Ethernet Virtual Private Networks | The Designated Forwarder (DF) in Ethernet Virtual Private Networks | |||
| (EVPN) is defined as the PE responsible for sending Broadcast, | (EVPN) is defined as the PE responsible for sending Broadcast, | |||
| Unknown unicast and Multicast traffic (BUM) to a multi-homed device/ | Unknown unicast and Multicast traffic (BUM) to a multi-homed device/ | |||
| network in the case of an all-active multi-homing Ethernet Segment | network in the case of an all-active multi-homing Ethernet Segment | |||
| (ES), or BUM and unicast in the case of single-active multi-homing. | (ES), or BUM and unicast in the case of single-active multi-homing. | |||
| The DF is selected out of a candidate list of PEs that advertise the | The DF is selected out of a candidate list of PEs that advertise the | |||
| same Ethernet Segment Identifier (ESI) to the EVPN network, according | same Ethernet Segment Identifier (ESI) to the EVPN network, according | |||
| to the Default DF Election algorithm. | to the Default DF Election algorithm. While the Default Algorithm | |||
| provides an efficient and automated way of selecting the DF across | ||||
| While the Default Algorithm provides an efficient and automated way | different Ethernet Tags in the ES, there are some use cases where a | |||
| of selecting the DF across different Ethernet Tags in the ES, there | more 'deterministic' and user-controlled method is required. At the | |||
| are some use-cases where a more 'deterministic' and user-controlled | same time, Service Providers require an easy way to force an on- | |||
| method is required. At the same time, Service Providers require an | demand DF switchover in order to carry out some maintenance tasks on | |||
| easy way to force an on-demand DF switchover in order to carry out | the existing DF or control whether a new active PE can preempt the | |||
| some maintenance tasks on the existing DF or control whether a new | existing DF PE. | |||
| active PE can preempt the existing DF PE. | ||||
| This document proposes an extension to the Default DF election | This document proposes a DF Election algorithm that meets the | |||
| procedures so that the above requirements can be met. | requirements of determinism and operation control. | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on December 21, 2020. | This Internet-Draft will expire on September 13, 2021. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2021 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 1.1. Problem Statement . . . . . . . . . . . . . . . . . . . . 3 | 1.1. Problem Statement . . . . . . . . . . . . . . . . . . . . 3 | |||
| 1.2. Solution requirements . . . . . . . . . . . . . . . . . . 3 | 1.2. Solution requirements . . . . . . . . . . . . . . . . . . 3 | |||
| 2. Requirements Language and Terminology . . . . . . . . . . . . 4 | 2. Requirements Language and Terminology . . . . . . . . . . . . 4 | |||
| 3. EVPN BGP Attributes Extensions . . . . . . . . . . . . . . . 5 | 3. EVPN BGP Attributes Extensions . . . . . . . . . . . . . . . 5 | |||
| 4. Solution description . . . . . . . . . . . . . . . . . . . . 6 | 4. Solution description . . . . . . . . . . . . . . . . . . . . 6 | |||
| 4.1. Use of the Preference algorithm . . . . . . . . . . . . . 7 | 4.1. Use of the Highest-Preference Algorithm . . . . . . . . . 7 | |||
| 4.2. Use of the Preference algorithm in [RFC7432] Ethernet | 4.2. Use of the Lowest-Preference Algorithm . . . . . . . . . 9 | |||
| Segments . . . . . . . . . . . . . . . . . . . . . . . . 9 | 4.3. Use of the Highest-Preference algorithm in [RFC7432] | |||
| 4.3. The Non-Revertive Capability . . . . . . . . . . . . . . 10 | Ethernet Segments . . . . . . . . . . . . . . . . . . . . 9 | |||
| 5. Security Considerations . . . . . . . . . . . . . . . . . . . 13 | 4.4. The Non-Revertive Capability . . . . . . . . . . . . . . 10 | |||
| 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 | 5. Security Considerations . . . . . . . . . . . . . . . . . . . 14 | |||
| 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 13 | 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 13 | 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 14 | 8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 9.1. Normative References . . . . . . . . . . . . . . . . . . 14 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 9.2. Informative References . . . . . . . . . . . . . . . . . 14 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 15 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 15 | 9.2. Informative References . . . . . . . . . . . . . . . . . 16 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 16 | ||||
| 1. Introduction | 1. Introduction | |||
| 1.1. Problem Statement | ||||
| [RFC7432] defines the Designated Forwarder (DF) in (PBB-)EVPN | 1.1. Problem Statement | |||
| networks as the PE responsible for sending broadcast, multicast and | ||||
| unknown unicast traffic (BUM) to a multi-homed device/network in the | ||||
| case of an all-active multi-homing ES or BUM and unicast traffic to a | ||||
| multi-homed device or network in case of single-active multi-homing. | ||||
| The DF is selected out of a candidate list of PEs that advertise the | [RFC7432] defines the Designated Forwarder (DF) in EVPN networks as | |||
| the PE responsible for sending broadcast, multicast and unknown | ||||
| unicast traffic (BUM) to a multi-homed device/network in the case of | ||||
| an all-active multi-homing ES or BUM and unicast traffic to a multi- | ||||
| homed device or network in case of single-active multi-homing. The | ||||
| DF is selected out of a candidate list of PEs that advertise the | ||||
| Ethernet Segment Identifier (ESI) to the EVPN network and according | Ethernet Segment Identifier (ESI) to the EVPN network and according | |||
| to the DF Election Algorithm, or DF Alg as per [RFC8584]. | to the DF Election Algorithm, or DF Alg as per [RFC8584]. | |||
| While the Default DF Alg [RFC7432] or HRW [RFC8584] provide an | While the Default DF Alg [RFC7432] or HRW [RFC8584] provide an | |||
| efficient and automated way of selecting the DF across different | efficient and automated way of selecting the DF across different | |||
| Ethernet Tags in the ES, there are some use-cases where a more | Ethernet Tags in the ES, there are some use-cases where a more | |||
| 'deterministic' and user-controlled method is required. At the same | 'deterministic' and user-controlled method is required. At the same | |||
| time, Service Providers require an easy way to force an on-demand DF | time, Service Providers require an easy way to force an on-demand DF | |||
| switchover in order to carry out some maintenance tasks on the | switchover in order to carry out some maintenance tasks on the | |||
| existing DF or control whether a new active PE can preempt the | existing DF or control whether a new active PE can preempt the | |||
| skipping to change at page 5, line 41 | skipping to change at page 5, line 41 | |||
| Where the following fields are defined as follows: | Where the following fields are defined as follows: | |||
| o DF Alg can have the following values: | o DF Alg can have the following values: | |||
| - Alg 0 - Default DF Election algorithm, or modulus-based | - Alg 0 - Default DF Election algorithm, or modulus-based | |||
| algorithm as per [RFC7432]. | algorithm as per [RFC7432]. | |||
| - Alg 1 - HRW algorithm as per [RFC8584]. | - Alg 1 - HRW algorithm as per [RFC8584]. | |||
| - Alg 2 - Preference algorithm (this document). | - Alg 2 - Highest-Preference algorithm (this document). | |||
| - Alg TBD - Lowest-Preference algorithm (this document). TBD | ||||
| will be replaced by the allocated value at the time of | ||||
| publication. | ||||
| o Bitmap (2 octets) can have the following values: | o Bitmap (2 octets) can have the following values: | |||
| 1 1 1 1 1 1 | 1 1 1 1 1 1 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| |D|A| | | |D|A| | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Figure 2: Bitmap field in the DF Election Extended Community | Figure 2: Bitmap field in the DF Election Extended Community | |||
| - Bit 0 (corresponds to Bit 24 of the DF Election Extended | - Bit 0 (corresponds to Bit 24 of the DF Election Extended | |||
| Community and it is defined by this document): D bit or 'Don't | Community and it is defined by this document): D bit or 'Don't | |||
| Preempt' bit (DP hereafter), determines if the PE advertising | Preempt' bit (DP hereafter), determines if the PE advertising | |||
| the ES route requests the remote PEs in the ES not to preempt | the ES route requests the remote PEs in the ES not to preempt | |||
| it as DF. The default value is DP=0, which is compatible with | it as DF. The default value is DP=0, which is compatible with | |||
| the 'preempt' or 'revertive' behavior in the Default DF Alg | the 'preempt' or 'revertive' behavior in the Default DF Alg | |||
| [RFC7432]. The DP bit SHOULD be ignored if the DF Alg is | [RFC7432]. The DP capability is supported by Alg 2 and Alg | |||
| different than 2. | TBD, and MAY be used with DF Alg 0 or 1. The procedures of the | |||
| DP capability for DF Alg 0 or 1 are out of the scope of this | ||||
| document. | ||||
| - Bit 1: AC-DF or AC-Influenced DF Election, as explained in | - Bit 1: AC-DF or AC-Influenced DF Election, as explained in | |||
| [RFC8584]. When set to 1, it indicates the desire to use AC- | [RFC8584]. When set to 1, it indicates the desire to use AC- | |||
| Influenced DF Election with the rest of the PEs in the ES. The | Influenced DF Election with the rest of the PEs in the ES. The | |||
| AC-DF capability bit MAY be set along with the DP capability | AC-DF capability bit MAY be set along with the DP capability | |||
| and DF Alg 2. | and DF Alg 2 or Alg TBD. | |||
| o DF Preference (defined in this document): defines a 2-octet value | o DF Preference (defined in this document): defines a 2-octet value | |||
| that indicates the PE preference to become the DF in the ES. The | that indicates the PE preference to become the DF in the ES. The | |||
| allowed values are within the range 0-65535, and the default value | allowed values are within the range 0-65535, and the default value | |||
| MUST be 32767. This value is the midpoint in the allowed | MUST be 32767. This value is the midpoint in the allowed | |||
| Preference range of values, which gives the operator the | Preference range of values, which gives the operator the | |||
| flexibility of choosing a significant number of values, above or | flexibility of choosing a significant number of values, above or | |||
| below the default Preference. The DF Preference field is specific | below the default Preference. The DF Preference field is specific | |||
| to DF Alg 2 and does not represent any Preference value for other | to DF Alg 2 and DF Alg TBD, and does not represent any Preference | |||
| Algs. If the DF Alg is different than Alg 2, these two octets can | value for other Algs. If the DF Alg is different than Alg 2 or | |||
| be encoded differently. | Alg TBD, these two octets can be encoded differently. | |||
| 4. Solution description | 4. Solution description | |||
| Figure 3 illustrates an example that will be used in the description | Figure 3 illustrates an example that will be used in the description | |||
| of the solution. | of the solution. | |||
| EVPN network | EVPN network | |||
| +-------------------+ | +-------------------+ | |||
| | +-------+ ENNI Aggregation | | +-------+ ENNI Aggregation | |||
| | <---ESI1,500 | PE1 | /\ +----Network---+ | | <---ESI1,500 | PE1 | /\ +----Network---+ | |||
| | <-----ESI2,100 | |===||=== | | | <-----ESI2,100 | |===||=== | | |||
| | | |===||== \ vES1 | +----+ | | | |===||== \ vES1 | +----+ | |||
| +-----+ | | \/ |\----------------+CE1 | | +-----+ | | \/ |\----------------+CE1 | | |||
| CE3--+ PE4 | +-------+ | \ ------------+ | | CE3--+ PE4 | +-------+ | \ ------------+ | | |||
| +-----+ | | \ / | +----+ | +-----+ | | \ / | +----+ | |||
| | | | X | | | | | X | | |||
| | <---ESI1,255 +-----+============ \ | | | <---ESI1,255 +-----+============ \ | | |||
| | <-----ESI2,200 | PE2 |========== \ vES2 | +----+ | | <-----ESI2,200 | PE2 |========== \ vES2 | +----+ | |||
| | +-----+ | \ ----------+CE2 | | | +-----+ | \ ----------+CE2 | | |||
| | | | --------------| | | | | | --------------+ | | |||
| | +-----+ ----------------------+ | | | +-----+ ----------------------+ | | |||
| | <-----ESI2,300 | PE3 +--/ | | +----+ | | <-----ESI2,300 | PE3 +--/ | | +----+ | |||
| | +-----+ +--------------+ | | +-----+ +--------------+ | |||
| --------------------+ | --------------------+ | |||
| Figure 3: Preference-based DF Election | Figure 3: Preference-based DF Election | |||
| Figure 3 shows three PEs that are connecting EVCs coming from the | Figure 3 shows three PEs that are connecting EVCs coming from the | |||
| Aggregation Network to their EVIs in the EVPN network. CE1 is | Aggregation Network to their EVIs in the EVPN network. CE1 is | |||
| connected to vES1 - that spans PE1 and PE2 - and CE2 is connected to | connected to vES1 - that spans PE1 and PE2 - and CE2 is connected to | |||
| vES2, that is defined in PE1, PE2 and PE3. | vES2, that is defined in PE1, PE2 and PE3. | |||
| If the algorithm chosen for vES1 and vES2 is Alg 2, i.e., Preference- | If the algorithm chosen for vES1 and vES2 is Alg 2 or Alg TBD, i.e., | |||
| based, the PEs may become DF irrespective of their IP address and | Highest-Preference or Lowest-Preference, the PEs may become DF | |||
| based on an administrative Preference value. The following sections | irrespective of their IP address and based on an administrative | |||
| provide some examples of the procedures and how they are applied in | Preference value. The following sections provide some examples of | |||
| the use-case of Figure 3. | the procedures and how they are applied in the use-case of Figure 3. | |||
| 4.1. Use of the Preference algorithm | 4.1. Use of the Highest-Preference Algorithm | |||
| Assuming the operator wants to control - in a flexible way - what PE | Assuming the operator wants to control - in a flexible way - what PE | |||
| becomes the DF for a given vES and the order in which the PEs become | becomes the DF for a given vES and the order in which the PEs become | |||
| DF in case of multiple failures, the following procedure may be used: | DF in case of multiple failures, the following procedure may be used: | |||
| a. vES1 and vES2 are now configurable with three optional parameters | a. vES1 and vES2 are now configurable with three optional parameters | |||
| that are signaled in the DF Election extended community. These | that are signaled in the DF Election extended community. These | |||
| parameters are the Preference, Preemption option (or "Don't | parameters are the Preference, Preemption option (or "Don't | |||
| Preempt Me" option) and DF Alg. We will represent these | Preempt Me" option) and DF Alg. We will represent these | |||
| parameters as (Pref,DP,Alg). Let's assume vES1 is configured as | parameters as (Pref,DP,Alg). Let's assume vES1 is configured as | |||
| (500,0,Pref) in PE1, and (255,0,Pref) in PE2. vES2 is configured | (500,0,Highest-Pref) in PE1, and (255,0,Highest-Pref) in PE2. | |||
| as (100,0,Pref), (200,0,Pref) and (300,0,Pref) in PE1, PE2 and | vES2 is configured as (100,0,Highest-Pref), (200,0,Highest-Pref) | |||
| PE3 respectively. | and (300,0,Highest-Pref) in PE1, PE2 and PE3 respectively. | |||
| b. The PEs will advertise an ES route for each vES, including the 3 | b. The PEs will advertise an ES route for each vES, including the 3 | |||
| parameters in the DF Election Extended Community. | parameters in the DF Election Extended Community. | |||
| c. According to [RFC8584], each PE will run the DF election | c. According to [RFC8584], each PE will run the DF election | |||
| algorithm upon expiration of the DF Wait timer. In this case, | algorithm upon expiration of the DF Wait timer. In this case, | |||
| each PE runs the Preference-based DF Alg for each ES as follows: | each PE runs the Highest-Preference DF Alg for each ES as | |||
| follows: | ||||
| - The PE will check the DF Alg value in each ES route, and | - The PE will check the DF Alg value in each ES route, and | |||
| assuming all the ES routes are consistent in this DF Alg and | assuming all the ES routes are consistent in this DF Alg and | |||
| the value is 2 (Preference-based), the PE will run the | the value is 2 (Highest-Preference), the PE will run the | |||
| procedure in this section. Otherwise, the procedure will fall | procedure in this section. Otherwise, the procedure will fall | |||
| back to [RFC7432] Default Alg. | back to [RFC7432] Default Alg. | |||
| - In this Preference-based Alg, each PE builds a list of | - In this Highest-Preference Alg, each PE builds a list of | |||
| candidate PEs, ordered by Preference. E.g. PE1 will build a | candidate PEs, ordered by Preference. E.g. PE1 will build a | |||
| list of candidate PEs for vES1 ordered by the Preference, from | list of candidate PEs for vES1 ordered by the Preference, from | |||
| high to low: PE1>PE2. Hence PE1 will become the DF for vES1. | high to low: PE1>PE2. Hence PE1 will become the DF for vES1. | |||
| In the same way, PE3 becomes the DF for vES2. | In the same way, PE3 becomes the DF for vES2. | |||
| d. Note that, by default, the Highest-Preference is chosen for each | d. Assuming some maintenance tasks had to be executed on, e.g., PE3, | |||
| ES or vES, however the ES configuration can be changed to the | the operator could set vES2's Preference to e.g., 50 so that PE2 | |||
| Lowest-Preference algorithm as long as this option is consistent | is forced to take over as DF for vES2 (irrespective of the DP | |||
| in all the PEs in the ES. E.g. vES1 could have been explicitly | capability). Once the maintenance task on PE3 is over, the | |||
| configured as Alg Preference-based with Lowest-Preference, in | operator could decide to leave the existing preference or | |||
| which case, PE2 would have been the DF. | configure the old preference back. | |||
| e. Assuming some maintenance tasks had to be executed on PE3, the | ||||
| operator could set vES2's Preference to e.g., 50 so that PE2 is | ||||
| forced to take over as DF for vES2 (irrespective of the DP | ||||
| capability). Once the maintenance on PE3 is over, the operator | ||||
| could decide to leave the existing preference or configure the | ||||
| old preference back. | ||||
| f. In case of equal Preference in two or more PEs in the ES, the DP | e. In case of equal Preference in two or more PEs in the ES, the DP | |||
| bit and the lowest IP of the candidate PEs are used as tie- | bit and the lowest IP of the candidate PEs are used as tie- | |||
| breakers. After selecting the PEs with the highest Preference | breakers. After selecting the PEs with the highest Preference | |||
| value, an implementation MUST first select the PE advertising the | value, an implementation MUST first select the PE advertising the | |||
| DP bit set, and then select the PE with the lowest IP address (if | DP bit set, and then select the PE with the lowest IP address (if | |||
| the DP bit selection does not yield a unique candidate). The | the DP bit selection does not yield a unique candidate). The | |||
| PE's IP address is the address used in the candidate list and it | PE's IP address is the address used in the candidate list and it | |||
| is derived from the Originating Router's IP address of the ES | is derived from the Originating Router's IP address of the ES | |||
| route. Some examples of the use of the DP bit and IP address | route. Some examples of the use of the DP bit and IP address | |||
| tie-breakers follow: | tie-breakers follow: | |||
| - If vES1 parameters were (500,0,Pref) in PE1 and (500,1,Pref) | - If vES1 parameters were (500,0,Highest-Pref) in PE1 and | |||
| in PE2, PE2 would be elected due to the DP bit. | (500,1,Highest-Pref) in PE2, PE2 would be elected due to the | |||
| DP bit. | ||||
| - If vES1 parameters were (500,0,Pref) in PE1 and (500,0,Pref) | - If vES1 parameters were (500,0,Highest-Pref) in PE1 and | |||
| in PE2, PE1 would be elected, assuming PE1's IP address is | (500,0,Highest-Pref) in PE2, PE1 would be elected, assuming | |||
| lower than PE2's. | PE1's IP address is lower than PE2's. | |||
| g. The Preference is an administrative option that MUST be | f. The Preference is an administrative option that MUST be | |||
| configured on a per-ES basis from the management plane, but MAY | configured on a per-ES basis from the management plane, but MAY | |||
| also be dynamically changed based on the use of local policies. | also be dynamically changed based on the use of local policies. | |||
| For instance, on PE1, ES1's Preference can be lowered from 500 to | For instance, on PE1, ES1's Preference can be lowered from 500 to | |||
| 100 in case the bandwidth on the ENNI port is decreased by 50% | 100 in case the bandwidth on the ENNI port is decreased by 50% | |||
| (that could happen if e.g. the 2-port LAG between PE1 and the | (that could happen if e.g. the 2-port LAG between PE1 and the | |||
| Aggregation Network loses one port). Policies MAY also trigger | Aggregation Network loses one port). Policies MAY also trigger | |||
| dynamic Preference changes based on the PE's bandwidth | dynamic Preference changes based on the PE's bandwidth | |||
| availability in the core, specific ports going operationally | availability in the core, specific ports going operationally | |||
| down, etc. The definition of the actual local policies is out of | down, etc. The definition of the actual local policies is out of | |||
| scope of this document. The default Preference value is 32767. | scope of this document. The default Preference value is 32767. | |||
| The Preference Alg MAY be used along with the AC-DF capability. | The Highest-Preference Alg MAY be used along with the AC-DF | |||
| Assuming all the PEs in the ES are configured consistently with | capability. Assuming all the PEs in the ES are configured | |||
| Preference Alg and AC-DF capability, a given PE in the ES is not | consistently with Highest-Preference Alg and AC-DF capability, a | |||
| considered as candidate for DF Election until its corresponding | given PE in the ES is not considered as candidate for DF Election | |||
| Ethernet A-D per ES and Ethernet A-D per EVI routes are received, | until its corresponding Ethernet A-D per ES and Ethernet A-D per EVI | |||
| as described in [RFC8584]. | routes are received, as described in [RFC8584]. | |||
| The procedures in this document can be used in [RFC7432] based ES or | The procedures in this document can be used in [RFC7432] based ES or | |||
| vES as in [I-D.ietf-bess-evpn-virtual-eth-segment], and including | vES as in [I-D.ietf-bess-evpn-virtual-eth-segment], and including | |||
| EVPN networks as in [RFC8214], [RFC7623] or [RFC8365]. | EVPN networks as in [RFC8214], [RFC7623] or [RFC8365]. | |||
| 4.2. Use of the Preference algorithm in [RFC7432] Ethernet Segments | 4.2. Use of the Lowest-Preference Algorithm | |||
| While the Preference-based DF Alg described in Section 4.1 is | In addition to the Highest-Preference Alg described in Section 4.1, | |||
| typically used in virtual ES scenarios where there is normally an | this document defines the Lowest-Preference Alg. In this case, and | |||
| individual Ethernet Tag per vES, the existing [RFC7432] definition of | using the example of vES1 in Figure 3, if the Lowest-Preference Alg | |||
| ES allows potentially up to thousands of Ethernet Tags on the same | is configured in all the PEs in the ES, PE2 will be the DF due to its | |||
| ES. If this is the case, and the operator still wants to control who | lower Preference. | |||
| the DF is for a given Ethernet Tag, the use of the Preference-based | ||||
| DF Alg can also provide some level of load balancing. | ||||
| In this type of scenarios, the ES is configured with an | All the procedures described in Section 4.1 apply to the Lowest- | |||
| administrative Preference value, but then a range of Ethernet Tags | Preference Alg, only replacing the Highest-Preference tie-breaker | |||
| can be defined to use the Highest-Preference or the Lowest-Preference | with the Lowest-Preference tie-breaker. The Highest-Preference and | |||
| depending on the desired behavior. With this option, the PE will | Lowest-Preference Algs are different Algs, therefore if two PEs | |||
| build a list of candidate PEs ordered by Preference, however the DF | configured for Highest-Preference and Lowest-Preference respectively, | |||
| for a given Ethernet Tag will be determined by the local | are attached to the same ES, the operational DF Election Alg will | |||
| configuration. | fall back to the Default Alg. | |||
| 4.3. Use of the Highest-Preference algorithm in [RFC7432] Ethernet | ||||
| Segments | ||||
| While the Highest-Preference (or Lowest-Preference for that matter) | ||||
| DF Alg described in Section 4.1 is typically used in virtual ES | ||||
| scenarios where there is normally an individual Ethernet Tag per vES, | ||||
| the existing [RFC7432] definition of an ES allows potentially up to | ||||
| thousands of Ethernet Tags on the same ES. If this is the case and the | ||||
| Highest-Preference (or Lowest-Preference) Alg is configured in all | ||||
| the PEs of the ES, the same PE will be the elected DF for all the | ||||
| Ethernet Tags of the ES. A potential way to achieve a more granular | ||||
| load balancing is described below. | ||||
| The ES is configured with an administrative Preference value and | ||||
| E.g., Highest-Preference Alg, but then a range of Ethernet Tags can | ||||
| be defined to use the Lowest-Preference depending on the desired | ||||
| behavior. With this option, the PE will build a list of candidate | ||||
| PEs ordered by Preference, however the DF for a given Ethernet Tag | ||||
| will be determined by the local configuration. | ||||
| For instance: | For instance: | |||
| o Assuming ES3 is defined in PE1 and PE2, PE1 may be configured as | o Assuming ES3 is defined in PE1 and PE2, PE1 may be configured as | |||
| (500,0,Preference) for ES3 and PE2 as (100,0,Preference). | (500,0,Highest-Preference) for ES3 and PE2 as (100,0,Highest- | |||
| Preference). | ||||
| o In addition, assuming VLAN-based service interfaces, the PEs will | o In addition, assuming VLAN-based service interfaces and that the | |||
| be configured with (Ethernet Tag-range,high_or_low), E.g., | PEs are attached to all Ethernet Tags in the range 1-4000, both | |||
| (1-2000,high) and (2001-4000, low). | PE1 and PE2 will be configured with (Ethernet Tag-range,low), | |||
| E.g., (2001-4000, low). | ||||
| o This will result in PE1 being DF for Ethernet Tags 1-2000 and PE2 | o This will result in PE1 being DF for Ethernet Tags 1-2000 (since | |||
| being DF for Ethernet Tags 2001-4000. | they use the default Highest-Preference Alg) and PE2 being DF for | |||
| Ethernet Tags 2001-4000, due to the local policy overriding the | ||||
| Highest-Preference Alg. | ||||
| For Ethernet Segments attached to three or more PEs, any other logic | For Ethernet Segments attached to three or more PEs, any other logic | |||
| that provides a fair distribution of the DF function among the PEs is | that provides a fair distribution of the DF function among the PEs is | |||
| valid, as long as that logic is consistent in all the PEs in the ES. | valid, as long as that logic is consistent in all the PEs in the ES. | |||
| It is important to note that, when a local policy overrides the | ||||
| Highest-Preference or Lowest-Preference signaled by all the PEs in | ||||
| the ES, this local policy MUST be consistent in all the PEs of the | ||||
| ES. If the local policy is inconsistent for a given Ethernet Tag in | ||||
| the ES, black-holes or packet duplication may occur on that Ethernet | ||||
| Tag. | ||||
| 4.3. The Non-Revertive Capability | 4.4. The Non-Revertive Capability | |||
| As discussed in Section 1.2 (d), a capability to NOT preempt the | As discussed in Section 1.2 (d), a capability to NOT preempt the | |||
| existing DF for a given Ethernet Tag is required and therefore added | existing DF for a given Ethernet Tag is required and therefore added | |||
| to the DF Election extended community. This option will allow a non- | to the DF Election extended community. This option will allow a non- | |||
| revertive behavior in the DF election. | revertive behavior in the DF election. | |||
| Note that, when a given PE in an ES is taken down for maintenance | Note that, when a given PE in an ES is taken down for maintenance | |||
| operations, before bringing it back, the Preference may be changed in | operations, before bringing it back, the Preference may be changed in | |||
| order to provide a non-revertive behavior. The DP bit and the | order to provide a non-revertive behavior. The DP bit and the | |||
| mechanism explained in this section will be used for those cases when | mechanism explained in this section will be used for those cases when | |||
| a former DF comes back up without any controlled maintenance | a former DF comes back up without any controlled maintenance | |||
| operation, and the non-revertive option is desired in order to avoid | operation, and the non-revertive option is desired in order to avoid | |||
| service impact. | service impact. | |||
| In Figure 3, we assume that based on the Highest-Pref, PE3 is the DF | In Figure 3, we assume that based on the Highest-Preference Alg, PE3 | |||
| for ESI2. | is the DF for ESI2. | |||
| If PE3 has a link, EVC or node failure, PE2 would take over as DF. | If PE3 has a link, EVC or node failure, PE2 would take over as DF. | |||
| If/when PE3 comes back up again, PE3 will take over, causing some | If/when PE3 comes back up again, PE3 will take over, causing some | |||
| unnecessary packet loss in the ES. | unnecessary packet loss in the ES. | |||
| The following procedure avoids preemption upon failure recovery | The following procedure avoids preemption upon failure recovery | |||
| (please refer to Figure 1): | (please refer to Figure 3). The procedure supports a non-revertive | |||
| mode that can be used along with: | ||||
| 1. A new "Don't Preempt Me" capability is defined on a per-PE/per-ES | o Highest-Preference Alg | |||
| o Highest-Preference Alg, where a local policy overrides the | ||||
| Highest-Preference tie-breaker for a range of Ethernet Tags | ||||
| o Lowest-Preference Alg | ||||
| The procedure is described assuming Highest-Preference Alg in the ES, | ||||
| where local policy overrides the tie-breaker for a given Ethernet | ||||
| Tag, since this is the most complex case. The other two cases above | ||||
| are a sub-set of this one and the differences will be explained | ||||
| later. | ||||
| 1. A "Don't Preempt Me" capability is defined on a per-PE/per-ES | ||||
| basis, as described in Section 3. If "Don't Preempt Me" is | basis, as described in Section 3. If "Don't Preempt Me" is | |||
| disabled (default behavior), the advertised DP bit will be 0. If | disabled (default behavior), the advertised DP bit will be 0. If | |||
| "Don't Preempt Me" is enabled, the ES route will be advertised | "Don't Preempt Me" is enabled, the ES route will be advertised | |||
| with DP=1 ("Don't Preempt Me"). All the PEs in an ES SHOULD be | with DP=1 ("Don't Preempt Me"). All the PEs in an ES SHOULD be | |||
| consistent in their configuration of the DP capability, however | consistent in their configuration of the DP capability, however | |||
| this document do not enforce the consistency across all the PEs. | this document does not enforce the consistency across all the | |||
| In case of inconsistency in the support of the DP capability in | PEs. In case of inconsistency in the support of the DP | |||
| the PEs of the same ES, non-revertive behavior is not guaranteed. | capability in the PEs of the same ES, non-revertive behavior is | |||
| not guaranteed. However, PEs supporting this capability will | ||||
| still attempt this procedure. | ||||
| 2. Assuming we want to avoid 'preemption' in all the PEs in the ES, | 2. We assume we want to avoid 'preemption' in all the PEs in the ES, | |||
| the three PEs are configured with the "Don't Preempt Me" | the three PEs are configured with the "Don't Preempt Me" | |||
| capability. In this example, we assume ESI2 is configured as | capability. In this example, we assume ESI2 is configured as | |||
| 'DP=enabled' in the three PEs. | 'DP=enabled' in the three PEs. | |||
| 3. Assuming Ethernet Tag-1 uses Highest-Pref in vES2 and Ethernet | 3. We also assume vES2 is attached to Ethernet Tag-1 and Ethernet | |||
| Tag-2 uses Lowest-Pref, when vES2 is enabled in the three PEs, | Tag-2. vES2 uses Highest-Preference as DF Alg and a local policy | |||
| the PEs will exchange the ES routes and select PE3 as DF for | is configured in the three PEs to use Lowest-Preference for | |||
| Ethernet Tag-1 (due to the Highest-Pref type), and PE1 as DF for | Ethernet Tag-2. When vES2 is enabled in the three PEs, the PEs | |||
| Ethernet Tag-2 (due to the Lowest-Pref). | will exchange the ES routes and select PE3 as DF for Ethernet | |||
| Tag-1 (due to the Highest-Preference), and PE1 as DF for Ethernet | ||||
| Tag-2 (due to the Lowest-Preference). | ||||
| 4. If PE3's vES2 goes down (due to EVC failure - detected by OAM, or | 4. If PE3's vES2 goes down (due to EVC failure - detected by OAM, or | |||
| port failure or node failure), PE2 will become the DF for | port failure or node failure), PE2 will become the DF for | |||
| Ethernet Tag-1. No changes will occur for Ethernet Tag-2. | Ethernet Tag-1. No changes will occur for Ethernet Tag-2. | |||
| 5. When PE3's vES2 comes back up, PE3 will start a boot-timer (if | 5. When PE3's vES2 comes back up, PE3 will start a boot-timer (if | |||
| booting up) or hold-timer (if the port or EVC recovers). That | booting up) or hold-timer (if the port or EVC recovers). That | |||
| timer will allow some time for PE3 to receive the ES routes from | timer will allow some time for PE3 to receive the ES routes from | |||
| PE1 and PE2. This timer is applied between the INIT and the | PE1 and PE2. This timer is applied between the INIT and the | |||
| DF_WAIT states in the DF Election Finite State Machine described | DF_WAIT states in the DF Election Finite State Machine described | |||
| skipping to change at page 13, line 5 ¶ | skipping to change at page 13, line 44 ¶ | |||
| not before, the PE will then compare its operational 'in-use' | not before, the PE will then compare its operational 'in-use' | |||
| Pref with its administrative Pref. If different, the PE will | Pref with its administrative Pref. If different, the PE will | |||
| send an ES route update with its administrative Pref and DP | send an ES route update with its administrative Pref and DP | |||
| values. In the example, PE3 will be the new Highest-PE, | values. In the example, PE3 will be the new Highest-PE, | |||
| therefore it will send an ES route update with | therefore it will send an ES route update with | |||
| (Pref,DP)=(300,1). | (Pref,DP)=(300,1). | |||
| - After running the DF Election, PE3 will become the new DF for | - After running the DF Election, PE3 will become the new DF for | |||
| Ethernet Tag-1. No changes will occur for Ethernet Tag-2. | Ethernet Tag-1. No changes will occur for Ethernet Tag-2. | |||
| If the ES uses Highest-Preference Alg (for all the Ethernet Tags, no | ||||
| local policy), the PEs only need to select the "Highest-PE" as the | ||||
| "reference-PE" (i.e., no need to select the "Lowest-PE"). If the ES | ||||
| uses Lowest-Preference Alg for all the Ethernet Tags, the PEs only | ||||
| need to select the "Lowest-PE" as the "reference-PE". The rest of | ||||
| the procedure remains the same. | ||||
| Note that, irrespective of the DP bit, when a PE or ES comes back and | Note that, irrespective of the DP bit, when a PE or ES comes back and | |||
| the PE advertises a DF Election Alg different than 2 (Preference | the PE advertises a DF Election Alg different than the one configured | |||
| algorithm), the rest of the PEs in the ES MUST fall back to the | in the rest of the PEs in the ES, all the PEs in the ES MUST fall | |||
| Default [RFC7432] Alg. | back to the Default [RFC7432] Alg. | |||
| This document does not modify the use of the P and B bits in the | This document does not modify the use of the P and B bits in the | |||
| Ethernet A-D per EVI routes [RFC8214] advertised by the PEs in the ES | Ethernet A-D per EVI routes [RFC8214] advertised by the PEs in the ES | |||
| after running the DF Election, irrespective of the revertive or non- | after running the DF Election, irrespective of the revertive or non- | |||
| revertive behavior in the PE. | revertive behavior in the PE. | |||
| 5. Security Considerations | 5. Security Considerations | |||
| This document describes a DF Election Algorithm that provides | This document describes a DF Election Algorithm that provides | |||
| absolute control (by configuration) over what PE is the DF for a | absolute control (by configuration) over what PE is the DF for a | |||
| skipping to change at page 13, line 30 ¶ | skipping to change at page 14, line 27 ¶ | |||
| a malicious user that gets access to the configuration of a PE in the | a malicious user that gets access to the configuration of a PE in the | |||
| ES may change the behavior of the network. In other DF Algs such as | ES may change the behavior of the network. In other DF Algs such as | |||
| HRW, the DF Election is more automated and cannot be determined by | HRW, the DF Election is more automated and cannot be determined by | |||
| configuration. | configuration. | |||
| The non-revertive capability described in this document may be seen | The non-revertive capability described in this document may be seen | |||
| as a security improvement over the regular EVPN revertive DF | as a security improvement over the regular EVPN revertive DF | |||
| Election: an intentional link (or node) "flapping" on a PE will only | Election: an intentional link (or node) "flapping" on a PE will only | |||
| cause service disruption once, when the PE goes to NDF state. | cause service disruption once, when the PE goes to NDF state. | |||
| The document also describes how a local policy can override the | ||||
| Highest-Preference Alg for a range of Ethernet Tags in the ES. If | ||||
| the local policy is not consistent across all PEs in the ES and there | ||||
| is an Ethernet Tag that ends up with an inconsistent use of Highest- | ||||
| Preference or Lowest-Preference in different PEs, black-holing or | ||||
| packet duplication may occur for that Ethernet Tag. | ||||
| 6. IANA Considerations | 6. IANA Considerations | |||
| This document solicits the allocation of the following values: | This document solicits the allocation of the following values: | |||
| o DF Alg = 2 in the [RFC8584] "DF Alg" registry, with name | o DF Alg = 2 in the [RFC8584] "DF Alg" registry, with name "Highest- | |||
| "Preference Algorithm". | Preference Algorithm". | |||
| o DF Alg = TBD in the same "DF Alg" registry, with name "Lowest- | ||||
| Preference Algorithm". | ||||
| o Bit 0 in the [RFC8584] DF Election Capabilities registry, with | o Bit 0 in the [RFC8584] DF Election Capabilities registry, with | |||
| name "D (Don't Preempt) Capability" for Non-revertive ES. | name "D (Don't Preempt) Capability" for Non-revertive ES. | |||
| 7. Acknowledgments | 7. Acknowledgments | |||
| The authors would like to thank Kishore Tiruveedhula for his review | The authors would like to thank Kishore Tiruveedhula for his review | |||
| and comments. | and comments. Also thank you to Luc Andre Burdet and Stephane | |||
| Litkowski for their thorough review and suggestions for a new DF Alg | ||||
| for lowest-preference. | ||||
| 8. Contributors | 8. Contributors | |||
| In addition to the authors listed, the following individuals also | In addition to the authors listed, the following individuals also | |||
| contributed to this document: | contributed to this document: | |||
| Kiran Nagaraj, Nokia | Kiran Nagaraj, Nokia | |||
| Vinod Prabhu, Nokia | Vinod Prabhu, Nokia | |||
| Selvakumar Sivaraj, Juniper | Selvakumar Sivaraj, Juniper | |||
| Sami Boutros, VMWare | Sami Boutros, VMWare | |||
| 9. References | 9. References | |||
| 9.1. Normative References | 9.1. Normative References | |||
| [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., | [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., | |||
| Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based | Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based | |||
| End of changes. 50 change blocks. | ||||
| 124 lines changed or deleted | 190 lines changed or added | |||
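The ES3 example in the local-policy section of the -07 text can be sketched in a few lines of Python. This is a minimal illustration, not the draft's normative procedure: the names `ES3_PREFS`, `LOW_RANGES` and `elect_df` are hypothetical, Preference values are assumed to be distinct, and any tie-breaking between equal Preferences is deliberately not modeled.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate list for ES3: PE name -> administrative Preference,
# taken from the example values (500 for PE1, 100 for PE2).
ES3_PREFS: Dict[str, int] = {"PE1": 500, "PE2": 100}

# Hypothetical local policy: inclusive Ethernet Tag ranges that override the
# signaled Highest-Preference Alg and use Lowest-Preference instead.
LOW_RANGES: List[Tuple[int, int]] = [(2001, 4000)]


def elect_df(prefs: Dict[str, int], ethernet_tag: int,
             low_ranges: List[Tuple[int, int]]) -> str:
    """Return the DF for one Ethernet Tag.

    Assumes all Preference values are distinct; tie-breaking is not modeled.
    """
    use_lowest = any(lo <= ethernet_tag <= hi for lo, hi in low_ranges)
    pick = min if use_lowest else max
    return pick(prefs, key=prefs.__getitem__)


if __name__ == "__main__":
    for tag in (1, 2000, 2001, 4000):
        print(tag, elect_df(ES3_PREFS, tag, LOW_RANGES))
```

With these inputs the sketch selects PE1 for Ethernet Tags 1-2000 and PE2 for Ethernet Tags 2001-4000, matching the outcome described in the example.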
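The -07 text requires the local policy to be consistent in all the PEs of the ES and warns that black-holes or packet duplication may otherwise occur. The sketch below illustrates why, under simplifying assumptions: each PE runs the same election on the same Preference list but with its own locally configured tag ranges, and a PE forwards only if it elects itself. The helper names (`elect_df`, `classify`) and the per-PE policy dictionary are hypothetical and are not part of the draft.

```python
from typing import Dict, List, Tuple

Ranges = List[Tuple[int, int]]


def elect_df(prefs: Dict[str, int], tag: int, low_ranges: Ranges) -> str:
    """DF for one Ethernet Tag as seen by a PE with the given local policy.

    Assumes distinct Preference values; tie-breaking is not modeled.
    """
    use_lowest = any(lo <= tag <= hi for lo, hi in low_ranges)
    pick = min if use_lowest else max
    return pick(prefs, key=prefs.__getitem__)


def classify(prefs: Dict[str, int], tag: int,
             policy_per_pe: Dict[str, Ranges]) -> str:
    """Compare every PE's local view of the DF for one Ethernet Tag."""
    # A PE acts as DF only when its own election result is itself.
    self_elected = [pe for pe, ranges in policy_per_pe.items()
                    if elect_df(prefs, tag, ranges) == pe]
    if not self_elected:
        return "black-hole"      # no PE believes it is the DF
    if len(self_elected) > 1:
        return "duplication"     # more than one PE believes it is the DF
    return "consistent"


PREFS = {"PE1": 500, "PE2": 100}   # example values from the ES3 case
LOW = [(2001, 4000)]               # Lowest-Preference override range

if __name__ == "__main__":
    print(classify(PREFS, 3000, {"PE1": LOW, "PE2": LOW}))
    print(classify(PREFS, 3000, {"PE1": LOW, "PE2": []}))
    print(classify(PREFS, 3000, {"PE1": [], "PE2": LOW}))
```

When only PE1 carries the override, each PE expects the other to be DF for Ethernet Tag 3000 and nobody forwards; when only PE2 carries it, both PEs elect themselves. A check of this kind could be run offline against planned configurations, since the draft itself does not signal the local policy between PEs.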