idnits 2.17.1

draft-ietf-grow-bgp-session-culling-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 3, 2017) is 2461 days in the past. Is this
     intentional?

  Checking references for intended status: Best Current Practice
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 309

  == Outdated reference: A later version (-13) exists of
     draft-ietf-grow-bgp-gshut-09

  == Outdated reference: A later version (-20) exists of
     draft-ietf-rtgwg-bgp-pic-05

     Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Global Routing Operations                                    W. Hargrave
Internet-Draft                                                     LONAP
Intended status: Best Current Practice                       M. Griswold
Expires: January 4, 2018                                             20C
                                                             J. Snijders
                                                                     NTT
                                                             N. Hilliard
                                                                    INEX
                                                            July 3, 2017

  Mitigating Negative Impact of Maintenance through BGP Session Culling
                 draft-ietf-grow-bgp-session-culling-02

Abstract

   This document outlines an approach to mitigate negative impact on
   networks resulting from maintenance activities. It includes guidance
   for both IP networks and Internet Exchange Points (IXPs). The
   approach is to ensure BGP-4 sessions affected by the maintenance are
   forcefully torn down before the actual maintenance activities
   commence.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 4, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements Language
   3.  BGP Session Culling
     3.1.  Voluntary BGP Session Teardown Recommendations
       3.1.1.  Maintenance Considerations
     3.2.  Involuntary BGP Session Teardown Recommendations
       3.2.1.  Packet Filter Considerations
       3.2.2.  Hardware Considerations
     3.3.  Procedural Considerations
   4.  Acknowledgments
   5.  Security Considerations
   6.  IANA Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Example packet filters
     A.1.  Cisco IOS, IOS XR & Arista EOS Firewall Example
           Configuration
     A.2.  Nokia SR OS Filter Example Configuration
   Authors' Addresses

1.  Introduction

   BGP Session Culling is the practice of ensuring BGP sessions are
   forcefully torn down before maintenance activities on a lower layer
   network commence, which otherwise would affect the flow of data
   between the BGP speakers.

   BGP Session Culling ensures that lower layer network maintenance
   activities cause the minimum possible amount of disruption, by
   causing BGP speakers to converge preemptively and gracefully onto
   alternative paths while the lower layer network's forwarding plane
   remains fully operational.

   The grace period required for a successful application of BGP Session
   Culling is the sum of the time needed to detect the loss of the BGP
   session and the time required for the BGP speaker to converge onto
   alternative paths. The first value is governed by the BGP Hold Timer
   (section 6.5 of [RFC4271]), commonly between 90 and 180 seconds. The
   second value is implementation specific, but could be as much as 15
   minutes when a router with a slow control-plane is receiving a full
   set of Internet routes.

   Throughout this document the "Caretaker" is defined to be the
   operator of the lower layer network, while "Operators" directly
   administer the BGP speakers. Operators and Caretakers implementing
   BGP Session Culling are encouraged to avoid using a fixed grace
   period, but instead monitor forwarding plane activity while the
   culling is taking place and consider it complete once traffic levels
   have dropped to a minimum (Section 3.3).

2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  BGP Session Culling

   From the viewpoint of the IP network operator, there are two types of
   BGP Session Culling:

   Voluntary BGP Session Teardown:  The operator initiates the teardown
      of the potentially affected BGP session by issuing an
      Administrative Shutdown.

   Involuntary BGP Session Teardown:  The caretaker of the lower layer
      network disrupts BGP control-plane traffic in the upper layer,
      causing the BGP Hold Timers of the affected BGP session to expire,
      subsequently triggering rerouting of end user traffic.

3.1.  Voluntary BGP Session Teardown Recommendations

   Before an operator commences activities which can cause disruption to
   the flow of data through the lower layer network, the operator can
   reduce loss of traffic by issuing an Administrative Shutdown to all
   BGP sessions running across the lower layer network and waiting a few
   minutes for data-plane traffic to subside.

   While architectures exist to facilitate quick network reconvergence
   (such as BGP PIC [I-D.ietf-rtgwg-bgp-pic]), an operator cannot assume
   the remote side has such capabilities. As such, a grace period
   between the Administrative Shutdown and the impacting maintenance
   activities is warranted.

   After the maintenance activities have concluded, the operator is
   expected to restore the BGP sessions to their original Administrative
   state.

3.1.1.  Maintenance Considerations

   Initiators of the Administrative Shutdown could consider using
   Graceful Shutdown [I-D.ietf-grow-bgp-gshut] to facilitate smooth
   drainage of traffic prior to session teardown, and the Shutdown
   Communication [I-D.ietf-idr-shutdown] to inform the remote side of
   the nature and duration of the maintenance activities.

3.2.  Involuntary BGP Session Teardown Recommendations

   In the case where multilateral interconnection between BGP speakers
   is facilitated through a switched layer-2 fabric, such as commonly
   seen at Internet Exchange Points (IXPs), different operational
   considerations can apply.

   Operational experience shows many network operators are unable to
   carry out the Voluntary BGP Session Teardown recommendations, because
   of the operational cost and risk of co-ordinating the two
   configuration changes required. This has an adverse effect on
   Internet performance.

   In the absence of notifications from the lower layer (e.g. Ethernet
   link down) consistent with the planned maintenance activities in a
   densely meshed multi-node layer-2 fabric, the caretaker of the fabric
   could opt to cull BGP sessions on behalf of the stakeholders
   connected to the fabric.

   Such culling of control-plane traffic will pre-empt the loss of
   end-user traffic, by causing the expiration of BGP Hold Timers ahead
   of the moment where the expiration would occur without intervention
   from the fabric's caretaker.

   In this scenario, BGP Session Culling is accomplished through the
   application of a combined layer-3 and layer-4 packet filter deployed
   in the switched fabric itself.

3.2.1.  Packet Filter Considerations

   The following considerations apply to the packet filter design:

   o  The packet filter MUST only affect BGP traffic specific to the
      layer-2 fabric, i.e. forming part of the control plane of the
      system described, rather than multihop BGP traffic which merely
      transits

   o  The packet filter MUST only affect BGP, i.e. TCP/179

   o  The packet filter SHOULD make provision for the bidirectional
      nature of BGP, i.e. that sessions may be established in either
      direction

   o  The packet filter MUST affect all relevant AFIs

   Appendix A contains examples of correct packet filters for various
   platforms.

3.2.2.  Hardware Considerations

   Not all hardware is capable of deploying layer 3 / layer 4 filters on
   layer 2 ports, and even on platforms which support the feature,
   documented limitations may exist or hardware resource allocation
   failures may occur during filter deployment which may cause
   unexpected results.
   These problems may include:

   o  Platform inability to apply layer 3/4 filters on ports which
      already have layer 2 filters applied

   o  Layer 3/4 filters supported for IPv4 but not for IPv6

   o  Layer 3/4 filters supported on physical ports, but not on 802.3ad
      Link Aggregate ports

   o  Failure of the operator to apply filters to all 802.3ad Link
      Aggregate ports

   o  Limitations in ACL hardware mechanisms causing filters not to be
      applied

   o  Fragmentation of ACL lookup memory causing transient ACL
      application problems which are resolved after ACL removal /
      reapplication

   o  Temporary service loss during hardware programming

   o  Reduction in hardware ACL capacity if the platform enables
      lossless ACL application

   It is advisable for the operator to be aware of the limitations of
   their hardware, and to thoroughly test all complicated configurations
   in advance to ensure that problems don't occur during production
   deployments.

3.3.  Procedural Considerations

   The caretaker of the lower layer can monitor data-plane traffic (e.g.
   interface counters) and carry out the maintenance without impact to
   traffic once session culling is complete.

   It is recommended that the packet filters are only deployed for the
   duration of the maintenance and immediately removed after the
   maintenance. To prevent unnecessary troubleshooting, it is
   RECOMMENDED that caretakers notify the affected operators before the
   maintenance takes place, and make it explicit that the Involuntary
   BGP Session Culling methodology will be applied.

4.  Acknowledgments

   The authors would like to thank the following people for their
   contributions to this document: Saku Ytti, Greg Hankins, James
   Bensley, Wolfgang Tremmel, Daniel Roesen, Bruno Decraene, and Tore
   Anderson.

5.  Security Considerations

   There are no security considerations.

6.  IANA Considerations

   This document has no actions for IANA.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

7.2.  Informative References

   [I-D.ietf-grow-bgp-gshut]
              Francois, P., Decraene, B., Pelsser, C., Patel, K., and C.
              Filsfils, "Graceful BGP session shutdown",
              draft-ietf-grow-bgp-gshut-09 (work in progress), June 2017.

   [I-D.ietf-idr-shutdown]
              Snijders, J., Heitz, J., and J. Scudder, "BGP
              Administrative Shutdown Communication",
              draft-ietf-idr-shutdown-10 (work in progress), June 2017.

   [I-D.ietf-rtgwg-bgp-pic]
              Bashandy, A., Filsfils, C., and P. Mohapatra, "BGP Prefix
              Independent Convergence", draft-ietf-rtgwg-bgp-pic-05
              (work in progress), May 2017.

7.3.  URIs

   [1] https://github.com/bgp/bgp-session-culling-config-examples

Appendix A.  Example packet filters

   Example packet filters for "Involuntary BGP Session Teardown" at an
   IXP with LAN prefixes 192.0.2.0/24 and 2001:db8:2::/64.

   A repository of configuration examples for a number of assorted
   platforms can be found at
   github.com/bgp/bgp-session-culling-config-examples [1].

A.1.  Cisco IOS, IOS XR & Arista EOS Firewall Example Configuration

   ipv6 access-list acl-ipv6-permit-all-except-bgp
      10 deny tcp 2001:db8:2::/64 eq bgp 2001:db8:2::/64
      20 deny tcp 2001:db8:2::/64 2001:db8:2::/64 eq bgp
      30 permit ipv6 any any
   !
   ip access-list acl-ipv4-permit-all-except-bgp
      10 deny tcp 192.0.2.0/24 eq bgp 192.0.2.0/24
      20 deny tcp 192.0.2.0/24 192.0.2.0/24 eq bgp
      30 permit ip any any
   !
   interface Ethernet33
      description IXP Participant Affected by Maintenance
      ip access-group acl-ipv4-permit-all-except-bgp in
      ipv6 access-group acl-ipv6-permit-all-except-bgp in
   !

A.2.  Nokia SR OS Filter Example Configuration

   ip-filter 10 create
       filter-name "ACL IPv4 Permit All Except BGP"
       default-action forward
       entry 10 create
           match protocol tcp
               dst-ip 192.0.2.0/24
               src-ip 192.0.2.0/24
               port eq 179
           exit
           action
               drop
           exit
       exit
   exit

   ipv6-filter 10 create
       filter-name "ACL IPv6 Permit All Except BGP"
       default-action forward
       entry 10 create
           match next-header tcp
               dst-ip 2001:db8:2::/64
               src-ip 2001:db8:2::/64
               port eq 179
           exit
           action
               drop
           exit
       exit
   exit

   interface "port-1/1/1"
       description "IXP Participant Affected by Maintenance"
       ingress
           filter ip 10
           filter ipv6 10
       exit
   exit

Authors' Addresses

   Will Hargrave
   LONAP Ltd
   5 Fleet Place
   London EC4M 7RD
   United Kingdom

   Email: will@lonap.net

   Matt Griswold
   20C
   1658 Milwaukee Ave # 100-4506
   Chicago, IL 60647
   United States of America

   Email: grizz@20c.com

   Job Snijders
   NTT Communications
   Theodorus Majofskistraat 100
   Amsterdam 1065 SZ
   The Netherlands

   Email: job@ntt.net

   Nick Hilliard
   INEX
   4027 Kingswood Road
   Dublin 24
   Ireland

   Email: nick@inex.ie
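
   As an informal illustration of the packet filter considerations in
   Section 3.2.1, the following Python sketch generates access-list
   lines in the style of Appendix A.1 from a peering LAN prefix. The
   function name culling_acl is hypothetical and used only for this
   illustration; the ACL names and the IOS-style syntax are assumed to
   follow Appendix A.1, and the output would need to be adapted and
   tested for any particular platform (Section 3.2.2).

```python
import ipaddress


def culling_acl(lan_prefix):
    """Return IOS-style ACL lines (hypothetical helper) that drop only
    BGP (TCP/179) exchanged between addresses inside the given peering
    LAN prefix, and permit all other traffic."""
    net = ipaddress.ip_network(lan_prefix)  # raises ValueError on bad input
    prefix = str(net)
    # "ip" for IPv4 and "ipv6" for IPv6, mirroring Appendix A.1
    family = "ip" if net.version == 4 else "ipv6"
    name = "acl-ipv{}-permit-all-except-bgp".format(net.version)
    return [
        "{} access-list {}".format(family, name),
        # BGP sessions may be established in either direction, so port
        # 179 is matched both as source port and as destination port.
        " 10 deny tcp {0} eq bgp {0}".format(prefix),
        " 20 deny tcp {0} {0} eq bgp".format(prefix),
        # Everything else is permitted, including multihop BGP whose
        # endpoints lie outside the peering LAN prefix.
        " 30 permit {} any any".format(family),
    ]


if __name__ == "__main__":
    # One ACL per address family in use on the peering LAN
    for lan in ("192.0.2.0/24", "2001:db8:2::/64"):
        print("\n".join(culling_acl(lan)))
        print("!")
```

   Calling the function once per address family helps cover the
   "all relevant AFIs" consideration, and restricting both source and
   destination to the LAN prefix keeps transit traffic out of scope.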