| < draft-rabadan-bess-evpn-pref-df-01.txt | draft-rabadan-bess-evpn-pref-df-02.txt > | |||
|---|---|---|---|---|
| BESS Workgroup J. Rabadan, Ed. | BESS Workgroup J. Rabadan, Ed. | |||
| Internet Draft S. Sathappan | Internet Draft S. Sathappan | |||
| Intended status: Standards Track Nokia | Intended status: Standards Track Nokia | |||
| S. Boutros T. Przygienda | S. Boutros T. Przygienda | |||
| VMware Ericsson | VMware W. Lin | |||
| W. Lin | ||||
| J. Drake | J. Drake | |||
| Juniper Networks | Juniper Networks | |||
| A. Sajassi | A. Sajassi | |||
| S. Mohanty | S. Mohanty | |||
| Cisco Systems | Cisco Systems | |||
| Expires: November 28, 2016 May 27, 2016 | Expires: June 22, 2017 December 19, 2016 | |||
| Preference-based EVPN DF Election | Preference-based EVPN DF Election | |||
| draft-rabadan-bess-evpn-pref-df-01 | draft-rabadan-bess-evpn-pref-df-02 | |||
| Abstract | Abstract | |||
| RFC7432 defines the Designated Forwarder (DF) in (PBB-)EVPN networks | RFC7432 defines the Designated Forwarder (DF) in (PBB-)EVPN networks | |||
| as the PE responsible for sending broadcast, multicast and unknown | as the PE responsible for sending broadcast, multicast and unknown | |||
| unicast traffic (BUM) to a multi-homed device/network in the case of | unicast traffic (BUM) to a multi-homed device/network in the case of | |||
| an all-active multi-homing ES, or BUM and unicast in the case of | an all-active multi-homing ES, or BUM and unicast in the case of | |||
| single-active multi-homing. | single-active multi-homing. | |||
| The DF is selected out of a candidate list of PEs that advertise the | The DF is selected out of a candidate list of PEs that advertise the | |||
| skipping to change at page 2, line 4 ¶ | skipping to change at page 1, line 45 ¶ | |||
| some use-cases where a more 'deterministic' and user-controlled | some use-cases where a more 'deterministic' and user-controlled | |||
| method is required. At the same time, Service Providers require an | method is required. At the same time, Service Providers require an | |||
| easy way to force an on-demand DF switchover in order to carry out | easy way to force an on-demand DF switchover in order to carry out | |||
| some maintenance tasks on the existing DF or control whether a new | some maintenance tasks on the existing DF or control whether a new | |||
| active PE can preempt the existing DF PE. | active PE can preempt the existing DF PE. | |||
| This document proposes an extension to the current RFC7432 DF | This document proposes an extension to the current RFC7432 DF | |||
| election procedures so that the above requirements can be met. | election procedures so that the above requirements can be met. | |||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
| Drafts. | Drafts. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html | http://www.ietf.org/shadow.html | |||
| This Internet-Draft will expire on November 28, 2016. | This Internet-Draft will expire on June 22, 2017. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2016 IETF Trust and the persons identified as the | Copyright (c) 2016 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 3, line 4 ¶ | skipping to change at page 2, line 49 ¶ | |||
| Table of Contents | Table of Contents | |||
| 1. Problem Statement . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Problem Statement . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 2. Solution requirements . . . . . . . . . . . . . . . . . . . . . 3 | 2. Solution requirements . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 3. EVPN BGP Attributes for Deterministic DF Election . . . . . . . 4 | 3. EVPN BGP Attributes for Deterministic DF Election . . . . . . . 4 | |||
| 4. Solution description . . . . . . . . . . . . . . . . . . . . . 5 | 4. Solution description . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 4.1 Use of the Preference algorithm . . . . . . . . . . . . . . 5 | 4.1 Use of the Preference algorithm . . . . . . . . . . . . . . 5 | |||
| 4.2 Use of the Preference algorithm in RFC7432 | 4.2 Use of the Preference algorithm in RFC7432 | |||
| Ethernet-Segments . . . . . . . . . . . . . . . . . . . . . 7 | Ethernet-Segments . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 4.3 The Non-Revertive option . . . . . . . . . . . . . . . . . . 7 | 4.3 The Non-Revertive option . . . . . . . . . . . . . . . . . . 7 | |||
| | 5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
| 5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . 9 | 11. Conventions used in this document . . . . . . . . . . . . . . 10 | |||
| 11. Conventions used in this document . . . . . . . . . . . . . . 9 | ||||
| 12. Security Considerations . . . . . . . . . . . . . . . . . . . 10 | 12. Security Considerations . . . . . . . . . . . . . . . . . . . 10 | |||
| 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10 | 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 15. References . . . . . . . . . . . . . . . . . . . . . . . . . . 10 | 15. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 15.1 Normative References . . . . . . . . . . . . . . . . . . . 10 | 15.1 Normative References . . . . . . . . . . . . . . . . . . . 11 | |||
| 15.2 Informative References . . . . . . . . . . . . . . . . . . 10 | 15.2 Informative References . . . . . . . . . . . . . . . . . . 11 | |||
| 16. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 10 | 16. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 17. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 10 | 17. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 17. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 11 | 17. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 1. Problem Statement | 1. Problem Statement | |||
| RFC7432 defines the Designated Forwarder (DF) in (PBB-)EVPN networks | RFC7432 defines the Designated Forwarder (DF) in (PBB-)EVPN networks | |||
| as the PE responsible for sending broadcast, multicast and unknown | as the PE responsible for sending broadcast, multicast and unknown | |||
| unicast traffic (BUM) to a multi-homed device/network in the case of | unicast traffic (BUM) to a multi-homed device/network in the case of | |||
| an all-active multi-homing ES or BUM and unicast traffic to a multi- | an all-active multi-homing ES or BUM and unicast traffic to a multi- | |||
| homed device or network in case of single-active multi-homing. | homed device or network in case of single-active multi-homing. | |||
| skipping to change at page 8, line 27 ¶ | skipping to change at page 8, line 27 ¶ | |||
| for ESI2. | for ESI2. | |||
| If PE3 has a link, EVC or node failure, PE2 would take over as DF. | If PE3 has a link, EVC or node failure, PE2 would take over as DF. | |||
| If/when PE3 comes back up again, PE3 will take over, causing some | If/when PE3 comes back up again, PE3 will take over, causing some | |||
| unnecessary packet loss in the ES. | unnecessary packet loss in the ES. | |||
| The following procedure avoids preemption upon failure recovery | The following procedure avoids preemption upon failure recovery | |||
| (please refer to Figure 1): | (please refer to Figure 1): | |||
| 1) A new "Don't Preempt Me" parameter is defined on a per-PE per-ES | 1) A new "Don't Preempt Me" parameter is defined on a per-PE per-ES | |||
| basis. If "Don't Preempt Me" is disabled (default behavior) the | basis, as described in section 3. If "Don't Preempt Me" is | |||
| advertised DP bit will be 0. If "Don't Preempt Me" is enabled on a | disabled (default behavior) the advertised DP bit will be 0. If | |||
| PE, the ES route will be advertised with DP=1 ("Don't Preempt Me") | "Don't Preempt Me" is enabled, the ES route will be advertised | |||
| once the DF timer is expired. | with DP=1 ("Don't Preempt Me"). | |||
| 2) Assuming we want to avoid 'preemption', the three PEs are | 2) Assuming we want to avoid 'preemption', the three PEs are | |||
| configured with the "Don't Preempt Me" option. Note that each PE | configured with the "Don't Preempt Me" option. Note that each PE | |||
| individually MAY be configured with different preemption value. In | individually MAY be configured with different preemption value. In | |||
| this example, we assume ESI2 is configured as 'DP=disabled' in PE1 | this example, we assume ESI2 is configured as 'DP=enabled' in the | |||
| but 'DP=enabled' in PE2 and PE3. | three PEs. | |||
| 3) When ES2 is enabled in the three PEs, and after the DF timer, the | 3) Assuming EVI1 uses Highest-Pref in vES2 and EVI2 uses Lowest-Pref, | |||
| PEs (due to the Highest-Pref type) select PE3 as DF for EVI1. Only | when vES2 is enabled in the three PEs, the PEs will exchange the | |||
| after the timer and the DF election, the PEs will check the 'DP' | ES routes and select PE3 as DF for EVI1 (due to the Highest-Pref | |||
| configuration and since it is enabled on PE2 and PE3, these two | type), and PE1 as DF for EVI2 (due to the Lowest-Pref). | |||
| PEs will send an ES route update, now with DP=1. This update will | ||||
| not cause any change in the existing DFs since there is no change | ||||
| in the Preference value. | ||||
| 4) If PE3's vES2 goes down (due to EVC failure - detected by OAM, or | 4) If PE3's vES2 goes down (due to EVC failure - detected by OAM, or | |||
| port failure or node failure), PE2 will become the DF for | port failure or node failure), PE2 will become the DF for EVI1. No | |||
| ESI2/EVI1. | changes will occur for EVI2. | |||
| 5) When PE3's vES2 comes back up, PE3 will start a boot-timer (if | 5) When PE3's vES2 comes back up, PE3 will start a boot-timer (if | |||
| booting up) or hold-timer (if the port or EVC recovers). That | booting up) or hold-timer (if the port or EVC recovers). That | |||
| timer will allow some time for PE3 to receive the ES routes from | timer will allow some time for PE3 to receive the ES routes from | |||
| PE1 and PE2. PE3 will then check its own | PE1 and PE2. PE3 will then: | |||
| [Pref,DP,type]=[300,1,Pref] and if its Pref is higher than any of | ||||
| the other PE's Pref, then PE3 will send the ES route with an 'in- | ||||
| use' Preference equal to the highest received Preference. In this | ||||
| case, since PE2 advertised [Pref,DP,type]=[200,1,Pref], PE3 will | ||||
| then send [200,0,Pref]. The 'in-use' Preference being sent upon | ||||
| recovery of a node is always equal to the highest existing | ||||
| Preference, irrespective of the EVI/ISIDs and their choice of | ||||
| highest or lowest preference algorithm. | ||||
| Note that, a PE will always send DP=0 the first time it advertises | o Select two "reference-PEs" among the ES routes in the vES, the | |||
| an ES route after the ES becomes active, and irrespective of the | "Highest-PE" and the "Lowest-PE": | |||
| configuration. Also a PE will always send DP=0 as long as the | ||||
| advertised Pref is the 'in-use' Pref (as opposed to the 'admin' | ||||
| Pref). | ||||
| This ES route update sent by PE3 (with [200,0,Pref]) will not | - The Highest-PE is the PE with higher Preference, using the DP | |||
| cause any changes in the DF election and PE2 will continue being | bit first (with DP=1 being better) and, after that, the lower | |||
| DF. This is because the DP bit will be used as a tie-breaker in | PE-IP address as tie-breakers. PE3 will select PE2 as Highest- | |||
| the DF election. That is, if a PE has two candidate PEs with the | PE over PE1, since, when comparing [Pref,DP,PE-IP], | |||
| same Pref, it will pick up the one with DP=1. | [200,1,PE2-IP] wins over [100,1,PE1-IP]. | |||
| 6) Only in case of PE2's failure, PE3 will resend the ES route with | - The Lowest-PE is the PE with lower Preference, using the DP | |||
| the admin Pref (as opposed to the 'in-use' Pref) and the DP bit | bit first (with DP=1 being better) and, after that, the lower | |||
| that corresponds to its configuration. In this case, PE3 will | PE-IP address as tie-breakers. PE3 will select PE1 as Lowest- | |||
| become DF for EVI1 again (assuming it wins the DF election to | PE over PE2, since [100,1,PE1-IP] wins over [200,1,PE2-IP]. | |||
| PE1). | ||||
| | - Note that if there were only one remote PE in the ES, Lowest | |||
| | and Highest PE would be the same PE. | |||
| | o Check its own administrative Pref and compares it with the one | |||
| | of the Highest-PE and Lowest-PE that have DP=1 in their ES | |||
| | routes. Depending on this comparison PE3 will send the ES route | |||
| | with a [Pref,DP] that may be different from its administrative | |||
| | [Pref,DP]: | |||
| | - If PE3's Pref value is higher than the Highest-PE's, PE3 will | |||
| | send the ES route with an 'in-use' operational Pref equal to | |||
| | the Highest-PE's and DP=0. | |||
| | - If PE3's Pref value is lower than the Lowest-PE's, PE3 will | |||
| | send the ES route with an 'in-use' operational Preference | |||
| | equal to the Lowest-PE's and DP=0. | |||
| | - If PE3's Pref value is neither higher nor lower than the | |||
| | Highest-PE's or the Lowest-PE's respectively, PE3 will send | |||
| | the ES route with its administrative [Pref,DP]=[300,1]. | |||
| | - In this example, PE3's administrative Pref=300 is higher than | |||
| | the Highest-PE with DP=1, that is, PE2 (Pref=200). Hence PE3 | |||
| | will inherit PE2's preference and send the ES route with an | |||
| | operational 'in-use' [Pref,DP]=[200,0]. | |||
| | Note that, a PE will always send DP=0 as long as the advertised | |||
| | Pref is the 'in-use' operational Pref (as opposed to the | |||
| | 'administrative' Pref). | |||
| | This ES route update sent by PE3 (with [200,0,PE3-IP]) will not | |||
| | cause any DF switchover for any EVI/ISID. PE2 will continue being | |||
| | DF for EVI1. This is because the DP bit will be used as a tie- | |||
| | breaker in the DF election. That is, if a PE has two candidate PEs | |||
| | with the same Pref, it will pick up the one with DP=1. There are | |||
| | no DF changes for EVI2 either. | |||
| | 6) Subsequently, if PE2 fails, upon receiving PE2's ES route | |||
| | withdrawal, PE3 and PE1 will go through the process described in | |||
| | (5) to select new Highest and Lowest-PEs (considering their own | |||
| | active ES route) and then they will run the DF Election. | |||
| | o If a PE selects itself as new Highest or Lowest-PE and it was | |||
| | not before, the PE will then compare its operational 'in-use' | |||
| | Pref with its administrative Pref. If different, the PE will | |||
| | send an ES route update with its administrative Pref and DP | |||
| | values. In the example, PE3 will be the new Highest-PE, | |||
| | therefore it will send an ES route update with | |||
| | [Pref,DP]=[300,1]. | |||
| | o After running the DF Election, PE3 will become the new DF for | |||
| | EVI1. No changes will occur for EVI2. | |||
| 5. Conclusions | 5. Conclusions | |||
| Service Providers are seeking for options where the DF election can | Service Providers are seeking for options where the DF election can | |||
| be controlled by the user in a deterministic way and with a non- | be controlled by the user in a deterministic way and with a non- | |||
| revertive behavior. This document defines the use of a Preference | revertive behavior. This document defines the use of a Preference | |||
| algorithm that can be configured and used in a flexible manner to | algorithm that can be configured and used in a flexible manner to | |||
| achieve those objectives. | achieve those objectives. | |||
| 11. Conventions used in this document | 11. Conventions used in this document | |||
| skipping to change at page 10, line 11 ¶ | skipping to change at page 11, line 4 ¶ | |||
| In this document, these words will appear with that interpretation | In this document, these words will appear with that interpretation | |||
| only when in ALL CAPS. Lower case uses of these words are not to be | only when in ALL CAPS. Lower case uses of these words are not to be | |||
| interpreted as carrying RFC-2119 significance. | interpreted as carrying RFC-2119 significance. | |||
| In this document, the characters ">>" preceding an indented line(s) | In this document, the characters ">>" preceding an indented line(s) | |||
| indicates a compliance requirement statement using the key words | indicates a compliance requirement statement using the key words | |||
| listed above. This convention aids reviewers in quickly identifying | listed above. This convention aids reviewers in quickly identifying | |||
| or finding the explicit compliance requirements of this RFC. | or finding the explicit compliance requirements of this RFC. | |||
| 12. Security Considerations | 12. Security Considerations | |||
| This section will be added in future versions. | This section will be added in future versions. | |||
| 13. IANA Considerations | 13. IANA Considerations | |||
| This document solicits the allocation of DF type = 2 in the registry | This document solicits the allocation of DF type = 2 in the registry | |||
| created by [vES] for the DF type field. | created by [EVPN-HRW-DF] for the DF type field. | |||
| 15. References | 15. References | |||
| 15.1 Normative References | 15.1 Normative References | |||
| [RFC7432]Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., | [RFC7432]Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., | |||
| Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet | Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet | |||
| VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015, <http://www.rfc- | VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015, <http://www.rfc- | |||
| editor.org/info/rfc7432>. | editor.org/info/rfc7432>. | |||
| skipping to change at page 11, line 12 ¶ | skipping to change at page 12, line 4 ¶ | |||
| Vinod Prabhu, Nokia | Vinod Prabhu, Nokia | |||
| Selvakumar Sivaraj, Juniper | Selvakumar Sivaraj, Juniper | |||
| 17. Authors' Addresses | 17. Authors' Addresses | |||
| Jorge Rabadan | Jorge Rabadan | |||
| Nokia | Nokia | |||
| 777 E. Middlefield Road | 777 E. Middlefield Road | |||
| Mountain View, CA 94043 USA | Mountain View, CA 94043 USA | |||
| Email: jorge.rabadan@nokia.com | Email: jorge.rabadan@nokia.com | |||
| Senthil Sathappan | Senthil Sathappan | |||
| Alcatel-Lucent | Alcatel-Lucent | |||
| Email: senthil.sathappan@nokia.com | Email: senthil.sathappan@nokia.com | |||
| Tony Przygienda | Tony Przygienda | |||
| Ericsson | Juniper Networks, Inc. | |||
| Email: antoni.przygienda@ericsson.com | Email: prz@juniper.net | |||
| John Drake | John Drake | |||
| Juniper Networks, Inc. | Juniper Networks, Inc. | |||
| Email: jdrake@juniper.net | Email: jdrake@juniper.net | |||
| Wen Lin | Wen Lin | |||
| Juniper Networks, Inc. | Juniper Networks, Inc. | |||
| Email: wlin@juniper.net | Email: wlin@juniper.net | |||
| Ali Sajassi | Ali Sajassi | |||
| End of changes. 19 change blocks. | ||||
| 59 lines changed or deleted | 93 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
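
The recovery procedure added in step 5 of the -02 text (selecting the Highest-PE and Lowest-PE, then deciding which [Pref,DP] to advertise) can be sketched in Python as follows. This is only an illustrative reading of the draft text, not code taken from either revision. The names `EsRoute`, `highest_pe`, `lowest_pe` and `recovery_advertisement`, and the example IP addresses, are hypothetical. Two assumptions are made: a reference PE only triggers the 'in-use' Pref when its own ES route carries DP=1, and the final DF comparison reuses the same [Pref,DP,PE-IP] ordering that the draft gives for the reference PEs.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import List, Tuple


@dataclass(frozen=True)
class EsRoute:
    """One received ES route, reduced to the fields compared as [Pref,DP,PE-IP]."""
    pe_ip: str
    pref: int
    dp: int  # 1 = "Don't Preempt Me" set, 0 = not set


def highest_pe(routes: List[EsRoute]) -> EsRoute:
    # Higher Pref wins; then DP=1 is better; then the lower PE-IP.
    return min(routes, key=lambda r: (-r.pref, -r.dp, ip_address(r.pe_ip)))


def lowest_pe(routes: List[EsRoute]) -> EsRoute:
    # Lower Pref wins; then DP=1 is better; then the lower PE-IP.
    return min(routes, key=lambda r: (r.pref, -r.dp, ip_address(r.pe_ip)))


def recovery_advertisement(admin_pref: int, admin_dp: int,
                           remote: List[EsRoute]) -> Tuple[int, int]:
    """Return the [Pref,DP] a recovering PE advertises after its timer expires."""
    if not remote:
        return admin_pref, admin_dp
    hi, lo = highest_pe(remote), lowest_pe(remote)
    # Assumption: only a reference PE advertising DP=1 is protected.
    if hi.dp == 1 and admin_pref > hi.pref:
        return hi.pref, 0   # 'in-use' Pref inherited from the Highest-PE
    if lo.dp == 1 and admin_pref < lo.pref:
        return lo.pref, 0   # 'in-use' Pref inherited from the Lowest-PE
    return admin_pref, admin_dp


# Example from the text: PE1=[100,1], PE2=[200,1], PE3 admin=[300,1].
pe1 = EsRoute("192.0.2.1", 100, 1)
pe2 = EsRoute("192.0.2.2", 200, 1)
pref, dp = recovery_advertisement(300, 1, [pe1, pe2])
pe3 = EsRoute("192.0.2.3", pref, dp)
print(pref, dp)                            # PE3 advertises the in-use values
print(highest_pe([pe1, pe2, pe3]).pe_ip)   # winner for a Highest-Pref EVI
```

With these inputs PE3 advertises [200,0], and PE2 still wins the Highest-Pref comparison because DP=1 breaks the tie at Pref=200, which matches the non-revertive outcome described in step 5.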