Network Working Group                                           R. White
Internet-Draft                                                  Linkedin
Intended status: Informational                                 A. Retana
Expires: October 5, 2016                             Cisco Systems, Inc.
                                                                S. Hares
                                                                  Huawei
                                                           April 4, 2016


                    Filtering of Overlapping Routes
                 draft-white-grow-overlapping-routes-04

Abstract

   This document proposes an optional mechanism to remove a prefix when
   it overlaps with a functionally equivalent shorter prefix. The
   proposed mechanism does not require any changes to the BGP protocol.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 5, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Requirements Language
   3. Overlapping Route Filtering Mechanism
     3.1. Marking Overlapping Routes
     3.2. Preferring Marked Routes
       3.2.1. Using a Cost Community
       3.2.2. Using the Local Preference
     3.3. Handling Marked Routes Within the AS
     3.4. Handling Marked Routes at the Outbound Edge
   4. Examples of Filtering Overlapping Routes
     4.1. IPv4 Example
     4.2. IPv6 Example
   5. Operational Considerations
     5.1. Advantages to the Service Provider
     5.2. Implications for Router Processing
     5.3. Implications for Convergence Time
   6. Security Considerations
   7. IANA Considerations
   8. Acknowledgements
   9. References
     9.1. Normative References
     9.2. Informative References
   Appendix A. Change Log
     A.1. Changes between the -00 and -01 versions
     A.2. Changes between the -01 and -02 versions
     A.3. Changes between the -02 and -03 versions
     A.4. Changes between the -03 and -04 versions
   Authors' Addresses

1. Introduction

   One cause of the growth of the global Internet's default free zone
   table size is overlapping routes injected into the routing system to
   steer traffic among various entry points into a network. Because
   padding AS Path lengths can only steer inbound traffic in a very
   small set of cases, and other mechanisms used to steer traffic to a
   particular inbound point are ineffective when multiple upstream
   providers are in use, advertising longer prefixes is often the only
   possible way for an AS to steer traffic into specific entry points
   along its edge.

   These longer prefix routes, called overlapping routes in this
   document, are often advertised along with a shorter prefix route,
   called a covering route, in order to ensure connectivity in the case
   of link or device failures. Overlapping routes not only add to the
   load on routers in the Internet core by expanding the table size;
   they may also be less stable than the covering routes they are
   paired with.

   Given the importance of an autonomous system's ability to steer
   traffic into specific entry points, simply removing the longer
   prefixes in a longer prefix (overlapping)/shorter prefix (covering)
   pair of routes isn't a viable solution.

   This document proposes an optional mechanism to remove overlapping
   routes that are no longer useful for steering traffic towards a
   specific entry point in a particular AS. Removing these routes would
   reduce both the size of the global table and its instability,
   without removing any capabilities or increasing the average path
   length.

   The mechanism proposed is simple to implement, requiring no changes
   to BGP [RFC4271] either in packet format or in the decision process.
   The removal described in this document is akin to filtering, not to
   route aggregation.

   The intent of the mechanism is for it to be used based on local
   decisions and policies, not in an Internet-wide fashion. It is
   assumed that network operators using this mechanism have an incentive
   to do so.

2. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3. Overlapping Route Filtering Mechanism

   The handling of overlapping prefixes received from an external peer
   can be broken down into four parts: marking overlapping routes,
   preferring marked routes, handling marked routes within the AS, and
   handling marked routes at the AS exit point.

   The initial step in successfully filtering overlapping routes is to
   identify and mark them. This document proposes the use of a BGP
   community called BOUNDED for that purpose. Because the operation
   suggested takes place inside an Autonomous System (AS), any locally
   assigned community can be used.

   The term BOUNDED is used to refer to a locally assigned community
   used to mark overlapping routes, and to these marked routes as well.

3.1. Marking Overlapping Routes

   As each prefix is received by a BGP speaker from an external peer, it
   is evaluated in the light of other prefixes already received. If two
   prefixes overlap in space (such as 192.0.2.0/24 and 192.0.2.128/25,
   or 2001:DB8::/32 and 2001:DB8:1::/48), the longer prefix SHOULD be
   marked as BOUNDED if it fully overlaps the covering prefix and it is
   the best path to the destination.

   An overlapping prefix is said to fully overlap the corresponding
   covering prefix if both have identical AS_PATH attributes (both in
   length and contents) and the same NEXT_HOP.

3.2. Preferring Marked Routes

   Since the same overlapping route may be received at several peering
   points along the edge of the AS, and the covering route may not be
   present at each of these points, BOUNDED routes SHOULD be preferred
   over unmarked routes for overlapping routes to be properly handled.
   A router which marks an overlapping route should also use one of the
   two mechanisms described here to ensure the marked route is preferred
   throughout the AS.

   Only one of the methods described in this section SHOULD be deployed
   in any given AS.

3.2.1. Using a Cost Community

   The recommended method for preferring BOUNDED routes is to use a Cost
   Community [I-D.ietf-idr-custom-decision] with the Point of Insertion
   set to ABSOLUTE_VALUE. This mechanism leaves all existing local
   policy controls in place within the AS.

   If this method is used, only the BOUNDED routes need to be tagged
   using a lower than default Cost, as routes without a Cost Community
   are considered to have the default value.

3.2.2. Using the Local Preference

   An alternate mechanism which may be used to prefer BOUNDED routes is
   to set their Local Preference to some number higher than the normal
   standard policy settings for a particular prefix. It is not
   important that any particular BOUNDED route win over any other one,
   so simply adding a small amount to the normal Local Preference, as
   dictated by local policy, ensures that a BOUNDED route always wins
   over an unmarked route and that only these routes reach the outbound
   edge of the AS.

3.3. Handling Marked Routes Within the AS

   Routers within the AS MAY omit routes marked with the BOUNDED
   community from their local RIB. This optional step will reduce
   local RIB and forwarding table usage and volatility within the AS.

3.4. Handling Marked Routes at the Outbound Edge

   If local policy dictates, routes marked with the BOUNDED community
   SHOULD NOT be advertised to external peers. If they are advertised,
   they MAY then be marked with the NO_EXPORT community.

4. Examples of Filtering Overlapping Routes

   Assume the following configuration of autonomous systems:

                     (     )
             /-------( AS2 )--------\
     (     ) /       (     )        \ (     )     (     )
     ( AS1 )                          ( AS4 )-----( AS5 )
     (     ) \       (     )        / (     )     (     )
             \-------( AS3 )--------/
                     (     )

   This network is used in both of the following examples.

4.1. IPv4 Example

   o AS1 is advertising 192.0.2.128/25 to both AS2 and AS3.

   o AS2 is advertising both 192.0.2.128/25 and 192.0.2.0/24 into AS4.

   o AS3 is advertising 192.0.2.128/25 into AS4.

   o Each BGP connection (session) is handled by a separate router
     within each AS (for instance, AS4 peers with AS2 and AS3 on
     separate routers).

   When the router in AS4 peering with AS2 receives both the
   192.0.2.128/25 and the 192.0.2.0/24 prefixes, it will mark
   192.0.2.128/25 as BOUNDED, and set a Cost Community (as described in
   Section 3.2.1) so the marked overlapping route is preferred over
   unmarked routes within AS4.

   The border router between AS4 and AS3 will receive the longer prefix
   from AS3, and the preferred BOUNDED overlapping route through iBGP.
   It will prefer the marked route, so the unmarked route towards
   192.0.2.128/25 will not be advertised throughout AS4.

   If the link between AS1 and AS2 fails, the longer length prefix will
   be withdrawn from AS2, and thus the peering point between AS2 and AS4
   will no longer have an overlapping set of prefixes. Within AS4, the
   border router which peers with AS2 will cease advertising the
   192.0.2.128/25 prefix, which allows the AS3/AS4 border router to
   begin advertising it into AS4, and through AS4 into AS5, restoring
   connectivity to AS1.

4.2. IPv6 Example

   o AS1 is advertising 2001:DB8:1::/48 to both AS2 and AS3.

   o AS2 is advertising both 2001:DB8:1::/48 and 2001:DB8::/32 into
     AS4.

   o AS3 is advertising 2001:DB8:1::/48 into AS4.

   o Each BGP connection (session) is handled by a separate router
     within each AS (for instance, AS4 peers with AS2 and AS3 on
     separate routers).

   When the router in AS4 peering with AS2 receives both the
   2001:DB8:1::/48 and 2001:DB8::/32 prefixes, it will mark
   2001:DB8:1::/48 as BOUNDED, and set a Cost Community (as described in
   Section 3.2.1) so the marked overlapping route is preferred over
   unmarked routes within AS4.
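
   The marking decision made by the AS4 router peering with AS2 can be
   sketched roughly as follows. This is an illustration only, not part
   of the proposed mechanism: the Route class, its field names, and the
   use of the string "BOUNDED" as a community value are hypothetical,
   and the sketch assumes every route passed in has already been
   selected as the best path. It checks only the conditions given in
   Section 3.1 (containment, identical AS_PATH, same NEXT_HOP) and
   applies to IPv4 and IPv6 prefixes alike.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Route:
    """Hypothetical, simplified view of a route received from a peer."""
    prefix: str
    as_path: tuple
    next_hop: str
    communities: set = field(default_factory=set)


def fully_overlaps(longer: Route, covering: Route) -> bool:
    """Return True if `longer` lies inside `covering` and both carry
    an identical AS_PATH and the same NEXT_HOP (Section 3.1)."""
    long_net = ipaddress.ip_network(longer.prefix)
    cover_net = ipaddress.ip_network(covering.prefix)
    if long_net.version != cover_net.version:
        return False  # never compare IPv4 against IPv6
    return (
        long_net.prefixlen > cover_net.prefixlen
        and long_net.subnet_of(cover_net)
        and longer.as_path == covering.as_path
        and longer.next_hop == covering.next_hop
    )


def mark_bounded(routes):
    """Add the locally assigned BOUNDED community to each route that
    fully overlaps another route in the same list."""
    for candidate in routes:
        if any(fully_overlaps(candidate, other) for other in routes):
            candidate.communities.add("BOUNDED")
    return routes


if __name__ == "__main__":
    # AS_PATH values and the next hop below are made up for the example.
    received_from_as2 = [
        Route("2001:db8::/32", (2, 1), "2001:db8:ffff::2"),
        Route("2001:db8:1::/48", (2, 1), "2001:db8:ffff::2"),
    ]
    for route in mark_bounded(received_from_as2):
        print(route.prefix, sorted(route.communities))
```

   Under these assumptions only the /48 received from AS2 is marked.
   The same /48 received from AS3 arrives at a different border router
   with a different AS_PATH and no covering route, so it stays unmarked.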

   The border router between AS4 and AS3 will receive the longer prefix
   from AS3, and the preferred BOUNDED overlapping route through iBGP.
   It will prefer the marked route, so the unmarked route towards
   2001:DB8:1::/48 will not be advertised throughout AS4.

   If the link between AS1 and AS2 fails, the longer length prefix will
   be withdrawn from AS2, and thus the peering point between AS2 and AS4
   will no longer have an overlapping set of prefixes. Within AS4, the
   border router which peers with AS2 will cease advertising the
   2001:DB8:1::/48 prefix, which allows the AS3/AS4 border router to
   begin advertising it into AS4, and through AS4 into AS5, restoring
   connectivity to AS1.

5. Operational Considerations

   The intent of the mechanism described in this document is for it to
   be used based on local policies, not in an Internet-wide fashion. It
   is assumed that network operators using this mechanism have an
   incentive to do so.

   The practice of filtering exists today on the Internet. While there
   may be local benefits to applying manual filters and/or the mechanism
   specified in this document, the operator should be aware of the
   impact it may have on neighboring autonomous systems' policies
   [I-D.cardona-filtering-threats].

   The benefits and implications associated with this proposal are
   discussed in the sections below. The text references the sample
   network in Section 4.

5.1. Advantages to the Service Provider

   In each of the examples, AS4 reduces the number of prefixes
   advertised to transit peering autonomous systems by the number of
   longer prefixes that overlap with aggregates of those prefixes, so
   that AS5 receives fewer total routes and has a more stable routing
   table. While one copy of the prefix continues to be carried through
   the autonomous system, this entry can be removed from the local
   forwarding table.

5.2. Implications for Router Processing

   This proposal requires a BGP speaker to perform an additional check
   on receiving a route, checking the route against existing routes for
   overlapping coverage of a set of reachable destinations. This
   additional work, in terms of processing requirements, should be
   easily offset by the overall savings in processing through the
   reduction of the forwarding table size, and the additional stability
   in the routing table due to the removal of longer length prefixes.

5.3. Implications for Convergence Time

   If the route through the AS providing the covering route is lost,
   the overlapping route must propagate into the autonomous systems
   which had formerly received only the covering route. This behavior
   increases convergence time and may create situations in which
   reachability is temporarily compromised. Unlike the case where
   manual filters are used, normal BGP behavior should restore
   reachability without changes to the router configuration.

6. Security Considerations

   This document presents a mechanism for an autonomous system to mark
   and filter overlapping prefixes. Note that the result of this
   operation is akin to the implementation of local route filtering at
   an AS boundary. As such, this document doesn't introduce any new
   security risks.

7. IANA Considerations

   This document has no IANA actions.

8. Acknowledgements

   Cengiz Alaentinoglu, Daniel Walton, David Ball, Ted Hardie, Jeff
   Hass, Barry Greene, Bill Herrin and Robert Raszuk gave valuable
   comments on this document.

9. References

9.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2. Informative References

   [I-D.cardona-filtering-threats]
              Cardona, C. and P. Francois, "Making BGP filtering a
              habit: Impact on policies",
              draft-cardona-filtering-threats-02 (work in progress),
              July 2013.

   [I-D.ietf-idr-custom-decision]
              Retana, A. and R. White, "BGP Custom Decision Process",
              draft-ietf-idr-custom-decision-04 (work in progress),
              November 2013.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

Appendix A. Change Log

A.1. Changes between the -00 and -01 versions

   o Updated authors' contact information.

   o Changed intended status to Informational.

   o General editorial changes.

   o Clarified the intent of the draft in several places.

   o Clarified when a route should be marked (3.1).

   o Edited the operational considerations section.

   o Updated ACKs.

A.2. Changes between the -01 and -02 versions

   o Updated authors' contact information.

   o General editorial changes.

   o Refined the text about marking routes.

A.3. Changes between the -02 and -03 versions

   o Updated authors' contact information.

   o Added IPv6 examples.

   o Minor editorial changes.

A.4. Changes between the -03 and -04 versions

   o Updated authors' contact information.

Authors' Addresses

   Russ White
   Linkedin

   Email: russ@riw.us


   Alvaro Retana
   Cisco Systems, Inc.
   7025 Kit Creek Rd.
   Research Triangle Park, NC 27709
   USA

   Email: aretana@cisco.com


   Susan Hares
   Huawei

   Email: shares@ndzh.com