BESS Working Group                                            Yisong Liu
Internet Draft                                                M. McBride
Intended status: Standards Track                     Huawei Technologies
Expires: June 10, 2019                                 December 10, 2018


     Multicast DF Election for EVPN Based on bandwidth or quantity
          draft-liu-bess-evpn-mcast-bw-quantity-df-election-00

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on June 10, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Abstract

   Ethernet Virtual Private Network (EVPN, RFC7432) is becoming
   prevalent in Data Centers, Data Center Interconnect (DCI) and
   Service Provider VPN applications. When a CE is multi-homed to
   multiple PEs, with links in an EVPN instance on a given Ethernet
   Segment, in all-active redundancy mode, [RFC7432] describes a basic
   mechanism to elect a Designated Forwarder (DF), and
   [I-D.ietf-bess-evpn-df-election-framework] improves the basic DF
   election with the Highest Random Weight (HRW) algorithm.
   [I-D.ietf-bess-evpn-per-mcast-flow-df-election] enhances the HRW
   algorithm for multicast flows so that DF election is performed at
   the granularity of (ESI, VLAN, Mcast flow). This document specifies
   a new algorithm, based on multicast bandwidth utilization and
   multicast state quantity, for electing a DF for multicast flows.

Table of Contents

   1. Introduction
      1.1. Requirements Language
      1.2. Terminology
   2. Solution
      2.1. DF Election Based on Bandwidth
      2.2. DF Election Based on State Quantity
      2.3. Inconsistent Timing between Multi-homed PEs
      2.4. Increase or Decrease of Multi-homed PEs
         2.4.1. Decrease of Multi-homed PEs
         2.4.2. Increase of Multi-homed PEs
   3. BGP Encoding
      3.1. DF Election Extended Community
      3.2. Multicast DF Extended Community
   4. Security Considerations
   5. IANA Considerations
   6. References
      6.1. Normative References
      6.2. Informative References
   7. Acknowledgments

1. Introduction

   Ethernet Virtual Private Network (EVPN [RFC7432]) solutions are
   becoming prevalent in Data Centers, Data Center Interconnect (DCI)
   and Service Provider VPN applications. When a CE is multi-homed to
   multiple PEs, with links in an EVPN instance on a given Ethernet
   Segment (ES), in all-active redundancy mode, [RFC7432] defines the
   Designated Forwarder (DF) as the node that is responsible for
   forwarding multicast flows.

   [RFC7432] specifies the basic method of DF election. The PEs
   attached to the same ES are sorted in ascending order of EVPN peer
   IP address to form an ordered PE list. For each VLAN, the VLAN
   value is taken modulo the number of PEs, and the PE whose position
   in the list equals the result is elected as the primary DF of that
   VLAN. The other PEs are standby.

   [I-D.ietf-bess-evpn-df-election-framework] defines an extended
   community for DF election that can be extended to support
   different DF election algorithms and that is used by the PEs in a
   redundancy group to reach a consensus as to which DF election
   procedure is desired. A PE can notify the other participating PEs
   in a redundancy group about its DF election algorithm by signaling
   a DF election extended community along with the ES route. That
   document also improves the basic DF election with the HRW
   algorithm.
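
   As a non-normative illustration of the default [RFC7432] procedure
   summarized above, the following Python sketch sorts the PEs by IP
   address and selects the one at position (VLAN mod N). The function
   name default_df and the example addresses are hypothetical and are
   used only for illustration. All addresses are assumed to belong to
   the same address family.

```python
import ipaddress


def default_df(pe_addresses, vlan_id):
    """Return the DF of a VLAN using the modulo rule summarized above."""
    # Sort the PEs in ascending numeric order of their IP address.
    ordered = sorted(pe_addresses, key=ipaddress.ip_address)
    # The PE at position (VLAN mod N) in the ordered list is the DF.
    return ordered[vlan_id % len(ordered)]


# Hypothetical redundancy group of three PEs on one ES.
pes = ["192.0.2.3", "192.0.2.1", "192.0.2.2"]
for vlan in (10, 11, 12):
    print(vlan, default_df(pes, vlan))
```

   The result depends only on the VLAN and on the number of PEs, so
   neither link capacity nor flow bandwidth influences the outcome.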

   [I-D.ietf-bess-evpn-per-mcast-flow-df-election] proposes a method
   for DF election that enhances the HRW algorithm by adding the
   source and group addresses of the multicast flow as hash inputs,
   and by extending the DF election extended community with types 4
   and 5 for (S,G) and (*,G) multicast flows. The PE with the largest
   weight is selected as the DF of the multicast flow.

   However, none of the current DF election algorithms considers the
   relationship between the bandwidth of the multicast flows and the
   capacity of the links from different PEs to the same CE device.
   Because multicast flows differ in bandwidth usage, this may result
   in severely unbalanced bandwidth utilization across those links.
   This document specifies a new algorithm for multicast flow DF
   election based on multicast bandwidth or multicast state quantity
   and extends the existing extended community defined in
   [I-D.ietf-bess-evpn-df-election-framework].

1.1. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.2. Terminology

   CE: Customer Edge equipment

   PE: Provider Edge device

   EVPN: Ethernet Virtual Private Network

   Ethernet Segment (ES): When a customer site (device or network) is
   connected to one or more PEs via a set of Ethernet links, then that
   set of links is referred to as an 'Ethernet segment'.

   IGMP: Internet Group Management Protocol

   MLD: Multicast Listener Discovery

   PIM: Protocol Independent Multicast

2. Solution

   The DF election calculation takes into account the bandwidth
   weight of each multi-homed link of the PEs and the bandwidth
   occupied by the multicast flows. Two scenarios are distinguished:

   * The bandwidth value of each multicast flow is known. For each
   multi-homed link, the ratio of the current multicast flow
   bandwidth to the link bandwidth weight is calculated, and the link
   with the smallest ratio is elected as the DF of the new multicast
   flow.

   * The bandwidth value of the multicast flows is not known. For
   each multi-homed link, the ratio of the current multicast flow
   state quantity to the link bandwidth weight is calculated, and the
   link with the smallest ratio is elected as the DF of the new
   multicast flow.

   If multiple PEs have the same calculated ratio, the DF is elected
   according to the maximum link bandwidth weight or the maximum IP
   address of the EVPN peer.

   [I-D.ietf-idr-link-bandwidth] defines the link bandwidth extended
   community. It can be reused to advertise the link bandwidth value
   of the local ES to the other multi-homed PEs, so that each PE can
   calculate the bandwidth weight ratio of the links of the ES in
   advance.

2.1. DF Election Based on Bandwidth

   Each PE obtains the link bandwidth values of the other multi-homed
   PEs in the same EVPN instance on a given ES from the link
   bandwidth extended community and calculates the link bandwidth
   weight ratio, for example W1:W2:...:Wn for n multi-homed PEs.

   When the CE sends an IGMP or PIM join to one of the PEs, for
   example PE1, PE1 advertises the join to PE2, PE3, ... and PEn
   using the EVPN IGMP/PIM Join Synch route defined in
   [I-D.ietf-bess-evpn-igmp-mld-proxy] and
   [I-D.skr-bess-evpn-pim-proxy]. If PE2, PE3, ... or PEn receives an
   IGMP or PIM join, the procedure is the same.

   Each PE calculates the ratio of the current multicast flow
   bandwidth to the link bandwidth weight. The PE among PE1, PE2, ...
   and PEn that has the smallest ratio is elected as the DF of the
   new multicast flow. When more than one PE has the same smallest
   ratio, the PE with the maximum link bandwidth weight or the
   maximum EVPN peer IP address is elected as the DF.

2.2. DF Election Based on State Quantity

   The procedure is almost the same as described in Section 2.1. The
   only difference is that, because the bandwidth value of the
   multicast flows is not available, each PE calculates the ratio of
   the current number of multicast states, instead of the bandwidth,
   to the link bandwidth weight.

2.3. Inconsistent Timing between Multi-homed PEs

   For a given multicast join, only one of the multi-homed PEs
   receives the multicast join message and advertises the EVPN Join
   Synch route (Type 7). The other PEs need to install the new
   multicast join state according to the received Synch route.

   Because the PEs do not process the joins for the same multicast
   groups at the same time, they may elect different DFs. For
   example:

   * Join packets for multicast groups G1, G2, and G3 are sent from
   the CE to PE1, PE2 and PE3 respectively.

   * PE1 calculates the DF of G1, PE2 calculates the DF of G2, and
   PE3 calculates the DF of G3. At this moment no PE has yet received
   an EVPN Join Synch route from the others.

   * PE1, PE2 and PE3 each select a link on the same ES to the CE
   using the algorithm described in Section 2.1 or 2.2, and the same
   DF may be elected for G1, G2, and G3.

   * After receiving the EVPN Join Synch route sent by PE2, PE1 may
   calculate the DF of G2 as PE3, which is inconsistent with the
   calculation result of PE2.
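
   The race described above can be reproduced with a short,
   non-normative Python sketch of the election rule of Sections 2.1
   and 2.2. The names PeLink, elect_df and fresh_view, the weights
   1:1:2, and the flow bandwidth of 10 units are hypothetical. The
   tie-break order (larger weight first, then larger IP address) is
   an assumption, since the text permits either criterion. The
   particular PEs elected differ from the prose example; only the
   inconsistency itself is illustrated.

```python
from dataclasses import dataclass
from fractions import Fraction
import ipaddress


@dataclass
class PeLink:
    ip: str        # EVPN peer IP address of the PE
    weight: int    # link bandwidth weight of the PE on this ES (non-zero)
    load: int = 0  # multicast bandwidth for which this PE is currently DF


def elect_df(links):
    """Return the PE with the smallest load/weight ratio.

    Assumed tie-break: larger weight first, then larger IP address.
    """
    def key(link):
        return (Fraction(link.load, link.weight),
                -link.weight,
                -int(ipaddress.ip_address(link.ip)))
    return min(links, key=key)


def fresh_view():
    """One PE's local view of the ES, with weights 1:1:2 and no flows."""
    return [PeLink("192.0.2.1", 1),
            PeLink("192.0.2.2", 1),
            PeLink("192.0.2.3", 2)]


# PE1 and PE2 each handle their own join before any Synch route arrives.
pe1_view, pe2_view = fresh_view(), fresh_view()

df_g1 = elect_df(pe1_view)         # PE1 elects the DF of G1 (10 units)
df_g1.load += 10

df_g2_at_pe2 = elect_df(pe2_view)  # PE2 elects the DF of G2, unaware of G1
df_g2_at_pe2.load += 10

# If the Synch route for G2 carried no DF, PE1 would recompute the DF of
# G2 on its own view, which already accounts for G1.
df_g2_at_pe1 = elect_df(pe1_view)

print("G1 at PE1:", df_g1.ip)
print("G2 at PE2:", df_g2_at_pe2.ip)
print("G2 at PE1:", df_g2_at_pe1.ip)
```

   In this run PE2 elects 192.0.2.3 for G2 while PE1 arrives at
   192.0.2.2 for the same flow. For the state quantity variant of
   Section 2.2, load would count multicast states and grow by one per
   flow instead of by the flow bandwidth.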

   The DF calculation results of the PEs are then inconsistent, which
   may result in duplicate traffic or traffic interruption for the
   same multicast flow state. Therefore, EVPN Join Synch routes need
   to carry the elected DF in the route advertisement, in an extended
   community called the Multicast DF Extended Community, which keeps
   the DF information for a given multicast flow state consistent
   between PEs. The net effect is that the PE that receives the
   multicast join packet performs the DF election calculation and
   notifies the other PEs on the same ES of the result.

2.4. Increase or Decrease of Multi-homed PEs

2.4.1. Decrease of Multi-homed PEs

   When one of the multi-homed PEs on the same ES fails or is shut
   down for maintenance, the other PEs have already received the
   Synch routes of all the multicast flows. The multicast flows whose
   DF was the failed PE need to be reassigned a DF in a specific
   order (for example, ascending order of group and source address).
   The DF election calculation, based on multicast flow bandwidth or
   on the number of multicast states, is performed by one designated
   PE among the remaining multi-homed PEs. This PE can be selected
   according to the link bandwidth weight value or the IP address of
   the EVPN peer. The designated PE needs to advertise the DF
   election result of each multicast flow that belonged to the failed
   PE to the other multi-homed PEs on the same ES, using the EVPN
   Join Synch route carrying the Multicast DF Extended Community.

   If a new multicast join is received during the above calculation,
   the DF election calculation for the new multicast flow is still
   performed by the PE that receives the multicast join packet.
   Similarly, that PE needs to advertise the DF information to the
   other multi-homed PEs on the same ES using the EVPN Join Synch
   route carrying the Multicast DF Extended Community.

2.4.2. Increase of Multi-homed PEs

   When a multi-homed PE is added to the same ES, active adjustment
   of the existing multicast flows is not required. The DF of each
   subsequent new multicast flow is elected according to the
   algorithm of this document. New multicast flows are preferentially
   assigned to the new PE, and eventually the multicast flows on the
   PEs of the same ES become approximately balanced.

   If active adjustment is required, the ratios can be calculated
   using the algorithm described in Sections 2.1 and 2.2. At each
   step, multicast entries are migrated to the new PE from the
   existing multi-homed PE whose ratio is the largest. The multicast
   entries are migrated in descending order of multicast flow
   bandwidth, or in ascending order of group and source address,
   until the ratio of the new PE is greater than the smallest ratio
   among the other multi-homed PEs.

   The calculation for the active adjustment is again performed by
   one designated PE among the multi-homed PEs. This PE can be
   selected according to the link bandwidth weight value or the IP
   address of the EVPN peer.

   After the new PE starts, while the multicast entries of the other
   multi-homed PEs are still being synchronized, a join for an
   existing multicast flow may be received on the new PE. To prevent
   the new PE from treating this join as a new multicast join,
   recalculating the DF, and notifying the other PEs on the same ES,
   a timer needs to be started that suppresses synchronization from
   the new PE to the existing PEs. The timer value should be
   configurable.

3. BGP Encoding

3.1. DF Election Extended Community

   [I-D.ietf-bess-evpn-df-election-framework] defines an extended
   community that is used by multi-homed PEs to reach a consensus as
   to which DF election procedure is desired. A PE can notify the
   other participating PEs of its DF election capability by signaling
   a DF election extended community along with the Ethernet Segment
   route (Type 4). This document extends that extended community by
   defining two new DF type values.

   o  DF type (1 octet) - Encodes the DF election algorithm value
      (between 0 and 255) that the advertising PE desires to use for
      the ES.

      *  Type TBD: DF election based on the bandwidth of multicast
         flows (detailed in this document)

      *  Type TBD+1: DF election based on the quantity of multicast
         flow states (detailed in this document)

3.2. Multicast DF Extended Community

   This document defines a new extended community, carried in the
   EVPN Type 7 route, to notify the other multi-homed PEs of the
   elected DF of a given multicast flow. The new extended community
   is called the Multicast DF Extended Community and is a transitive
   extended community. The type is to be assigned. It carries the DF
   selected for a given (S,G) or (*,G) multicast flow. The role of
   this extended community is described in Sections 2.3 and 2.4.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Type=TBD    | Sub-Type=TBD  |   Reserved    |   DF Length   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   DF IP Address (Variable)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4. Security Considerations

   TBD

   For general EVPN Security Considerations, see [RFC7432].

5. IANA Considerations

   TBD

6. References

6.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
             Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
             Ethernet VPN", RFC 7432, February 2015.

   [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
             2119 Key Words", BCP 14, RFC 8174, May 2017.

   [I-D.ietf-bess-evpn-df-election-framework]
             Rabadan, J., Ed., Mohanty, S., Ed., Sajassi, A., Drake,
             J., Nagaraj, K., and S. Sathappan, "Framework for EVPN
             Designated Forwarder Election Extensibility",
             draft-ietf-bess-evpn-df-election-framework-06 (work in
             progress), December 2018.

   [I-D.ietf-bess-evpn-per-mcast-flow-df-election]
             Sajassi, A., Mishra, M., Thoria, S., Rabadan, J., and J.
             Drake, "Per multicast flow Designated Forwarder Election
             for EVPN",
             draft-ietf-bess-evpn-per-mcast-flow-df-election-00 (work
             in progress), September 2018.

   [I-D.ietf-idr-link-bandwidth]
             Mohapatra, P. and R. Fernando, "BGP Link Bandwidth
             Extended Community", draft-ietf-idr-link-bandwidth-07
             (expired), March 2018.

   [I-D.ietf-bess-evpn-igmp-mld-proxy]
             Sajassi, A., Thoria, S., Patel, K., Yeung, D., Drake, J.,
             and W. Lin, "IGMP and MLD Proxy for EVPN",
             draft-ietf-bess-evpn-igmp-mld-proxy-02 (work in
             progress), June 2018.

   [I-D.skr-bess-evpn-pim-proxy]
             Rabadan, J., Ed., Kotalwar, J., Sathappan, S., Zhang, Z.,
             and A. Sajassi, "PIM Proxy in EVPN Networks",
             draft-skr-bess-evpn-pim-proxy-01 (expired), October 2017.

6.2. Informative References

   TBD

7. Acknowledgments

   The authors would like to thank the following for their valuable
   contributions to this document:

   TBD

Authors' Addresses

   Yisong Liu
   Huawei Technologies
   Huawei Bld., No.156 Beiqing Rd.
   Beijing 100095
   China

   Email: liuyisong@huawei.com

   Mike McBride
   Huawei Technologies
   2330 Central Expressway
   Santa Clara, CA 95055
   USA

   Email: Michael.mcbride@huawei.com