Network Working Group                                          T. Hardie
Internet Draft                                                  R. White
Expiration Date: December 2003                                 June 2003
File Name: draft-grow-bounded-longest-match-04.txt

                   Bounding Longest Match Considered
                draft-grow-bounded-longest-match-04.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet Drafts are working documents of the Internet Engineering
   Task Force (IETF), its Areas, and its Working Groups. Note that other
   groups may also distribute working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months. Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time. It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a "working
   draft" or "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   Some ASes currently use length-based filters to manage the size of
   the routing table they use and propagate. This draft explores an
   alternative to length-based filters which allows for more automatic
   configuration and which provides for better redundancy.

   Rather than use a filter, this draft proposes a method of modifying
   the BGP longest match algorithm by setting a bound on the prefix
   lengths eligible for preference. A bound would operate on long
   prefixes when covering route announcements are available; in certain
   circumstances it would cause a router to prefer an aggregate over a
   more specific route announcement.

1. Motivation

   Modifying longest match would limit the rate of growth in the routing
   table seen by many BGP speakers. The current rate of growth and the
   time to convergence represent threats to the stability of the
   Internet.
   In the short term, the IETF is considering efforts to curb these
   threats while new routing paradigms that attack the fundamental
   limitations of path vector protocols are developed and deployed.

   A number of the practical efforts to limit the rate of growth of the
   routing table have focused on filter policies, arguing that
   aggressive filtering will return the Internet to a state in which
   provider aggregates are a majority of the routes in the routing
   table [3]. This draft proposes an approach along those same lines,
   but using a bound on the longest match algorithm rather than a filter
   policy. The authors believe that this approach can produce a similar
   (though not identical) effect while retaining full reachability and
   allowing multi-homing non-transit networks to achieve the main goals
   which have motivated their becoming independent ASes.

2. Proposed Enhancements

   Two enhancements are proposed by this draft: a new community, and a
   new way of handling overlapping prefixes received from an external
   peer.

   As each prefix is received by a BGP speaker from an external peer, it
   would be evaluated in the light of other prefixes already received.
   If two prefixes overlap in space (such as 192.168.0.0/16 and
   192.168.1.0/24), the longer prefix would be marked with the NO_EXPORT
   community, and its local preference would be set to a very high
   number so that it would always win in any best path computations
   within the autonomous system. The longer prefix may also be marked
   with a new community, NO_INSTALL.

2.1. The NO_INSTALL Community

   An optional optimization to bounding longer prefixes by marking them
   with a high Local Preference and the NO_EXPORT community is to also
   mark them with a new, non-transitive, optional community, NO_INSTALL.
   The effect of this community would be for any BGP speaker receiving a
   prefix with this community set to treat the prefix normally in the
   BGP bestpath computation, and to forward bestpaths marked as
   NO_INSTALL to iBGP peers, but to simply not install such prefixes in
   the local routing table.

   This would result in saving some small amount of memory for each
   prefix not installed in the RIB, and in the local forwarding tables
   built from the RIB. If there are enough prefixes thus marked, the
   memory and computation savings could be significant. BGP speakers
   which receive a prefix marked with NO_INSTALL, and do not understand
   this community, may simply ignore the community.

2.2. Example of Bounding the Longer Prefix

   Assume the following configuration of autonomous systems:

                    (     )
            /-------( AS2 )--------\
   (     ) /        (     )         ( R1     )     (     )
   ( AS1 )                          ( AS4  R2)-----( AS5 )
   (     ) \        (     )         ( R3     )     (     )
            \-------( AS3 )--------/
                    (     )

   o  AS1 is advertising 192.168.1.0/24 to both AS2 and AS3.

   o  AS2 is advertising both 192.168.1.0/24 and 192.168.0.0/16 into
      AS4.

   o  AS3 is advertising 192.168.1.0/24 into AS4.

   When R1 receives both the 192.168.1.0/24 and the 192.168.0.0/16
   prefixes, it will mark the 192.168.1.0/24 as NO_EXPORT, set the local
   preference to a high value, as described in Section 2.3, "Setting the
   Local Preference", below, and then propagate this through AS4.

   R3 will receive the longer prefix from AS3, and the iBGP prefix with
   the high local preference with NO_EXPORT set. Given it does not see
   the overlapping prefix, it will compare the default (lower) local
   preference of the externally learned route with the higher local
   preference set by the AS2/AS4 border router, and will not advertise
   the 192.168.1.0/24 prefix into AS4 at all.

   The R3 border router may also, on detecting the overlap, mark the
   longer prefix with the NO_INSTALL community.
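
   The marking step that R1 performs in this example can be sketched in
   Python as follows. This is an illustrative sketch only, not part of
   the proposal: the function names mark_overlaps and
   bounded_local_pref, the dictionary layout, and the router ID 10.0.0.1
   are assumptions made for the example. The local preference rule is
   the one from the section Setting the Local Preference (router ID with
   the high-order bit set), and only IPv4 prefixes are assumed.

```python
import ipaddress

NO_EXPORT = "NO_EXPORT"    # well-known community (RFC 1997)
NO_INSTALL = "NO_INSTALL"  # optional community proposed in Section 2.1


def bounded_local_pref(router_id: str) -> int:
    """Router ID as a 32-bit number with the high-order bit set."""
    return int(ipaddress.IPv4Address(router_id)) | 0x80000000


def mark_overlaps(ebgp_prefixes, router_id, use_no_install=False):
    """Mark every prefix that is covered by a shorter prefix in the set.

    ebgp_prefixes: IPv4 prefixes learned from external peers (strings).
    Returns {prefix: {"communities": set, "local_pref": int or None}};
    local_pref None means the route keeps its default local preference.
    """
    nets = [ipaddress.ip_network(p) for p in ebgp_prefixes]
    marked = {}
    for net in nets:
        # A prefix is "bounded" when a strictly shorter prefix covers it.
        covered = any(
            other.prefixlen < net.prefixlen and net.subnet_of(other)
            for other in nets
        )
        attrs = {"communities": set(), "local_pref": None}
        if covered:
            attrs["communities"].add(NO_EXPORT)
            if use_no_install:
                attrs["communities"].add(NO_INSTALL)
            attrs["local_pref"] = bounded_local_pref(router_id)
        marked[str(net)] = attrs
    return marked


if __name__ == "__main__":
    # R1's view in the example: both prefixes arrive from AS2.
    r1 = mark_overlaps(["192.168.1.0/24", "192.168.0.0/16"], "10.0.0.1")
    for prefix, attrs in r1.items():
        print(prefix, attrs)
```

   Under the same assumptions, R3 learns only 192.168.1.0/24 from AS3,
   sees no overlap among its externally learned prefixes, and so would
   leave that route unmarked.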

   If the link between AS1 and AS2 fails, the longer length prefix will
   be withdrawn from AS2, and thus the peering point between AS2 and AS4
   will no longer have an overlapping set of prefixes. Within AS4, the
   border router which peers with AS2 will cease advertising the
   192.168.1.0/24 prefix, which allows the AS3/AS4 border router to
   begin advertising it into AS4, and through AS4 into AS5, restoring
   connectivity to AS1.

2.3. Setting the Local Preference

   Since there could be multiple points at which an autonomous system
   may receive the same pair of overlapping prefixes, there must be some
   way to ensure that one of the longer prefixes wins in the BGP
   decision algorithm consistently. In practice, this means that each
   BGP speaker which receives an overlapping set of routes should set
   the local preference on the set of longer prefixes so there won't be
   two longer prefixes with matching local preferences.

   The easiest way to ensure this within an autonomous system is to set
   the local preference for longer prefixes based on some unique number
   assigned to each BGP speaker. Given the router ID and the local
   preference are both 32 bit numbers, an ideal solution appears to be
   to simply set the local preference to the router ID of the BGP
   speaker. The primary problem with this is that in some cases, the
   router ID of the device may be lower than some standard Local
   Preference, perhaps even lower than a standard Local Preference used
   by default throughout a network.

   To alleviate this problem, the local preference of longer prefixes
   which overlap with shorter prefixes should be set to the router ID of
   the BGP speaker, and then the high order bit of the Local Preference
   should be set. The resulting value is at least 2^31 (2,147,483,648),
   so it is guaranteed to be above 64,000.

2.4. Implications for Load Sharing

   Since the goal of this proposal is to reduce the number of paths
   stored within local tables, and to reduce the amount of information
   passed through to neighboring autonomous systems, the implementation
   of this draft as described above would have a negative impact on the
   ability to load share between multiple paths to the same destination.

3. An Alternative Implementation Using ADD PATH

   An implementation which supports [ADD-PATH] could optionally use this
   capability to block the propagation of overlapping prefixes into
   neighboring autonomous systems, and preserve local load sharing.

   o  Any router receiving a pair of overlapped routes from its
      external peers would mark the longer prefix with the NO_EXPORT
      community, and propagate the overlapped prefix using the
      technique described in [ADD-PATH].

   o  Any router receiving a pair of overlapped routes, with the
      longer prefix learned from an external peer, and the shorter
      prefix learned from an internal peer, would mark the longer
      prefix with the NO_EXPORT community, and propagate the prefix
      normally to its internal peers.

4. Benefits and Risks

   The benefits and risks associated with this proposal are discussed in
   the sections below.

4.1. Advantages to the Service Provider

   AS4, in each of the situations, reduces the number of prefixes
   carried through the autonomous system by the number of longer
   prefixes that overlap with aggregates of those prefixes. While one
   copy of the prefix continues to be carried through the autonomous
   system, this entry can be marked with the optional NO_INSTALL
   community, so it is not placed in the forwarding table, nor is it
   propagated outside the autonomous system.

   AS5 receives one prefix instead of two (or possibly more).

4.2. Advantages to the Customer

   In this case, the customer is represented as AS1.
   The customer will continue to receive some amount of traffic over
   both peering sessions, and dual homing through two Service Providers
   is still effective. If the customer's primary link fails, the
   alternate link through AS3 will take over receiving all inbound
   traffic automatically. With most other schemes presented to this
   point, the customer loses all benefit of dual-homing into the
   Internet, unless both connections are through one Service Provider.

4.3. Advantages to the Internet

   Beyond the second AS hop, aggregation is preserved in all cases.
   While this would not reduce the backbone routing table by the
   dramatic amounts that other methods might, the advantages to the
   community are great, and at greatly reduced risk to customers.

4.4. Implications for Router processing

   This proposal clearly adds to the work which needs to be done during
   overall BGP processing. Because a check needs to be done for both
   covered and covering routes, some part of this work is required for
   routes of lengths on either side of the bound. Should this become
   common, however, the rate of growth in the number of routes should be
   smaller and a balance should be struck between the extra processing
   per route and the number of routes.

4.5. Implications for Traffic engineering

   The implementation of a bound risks magnifying or removing the effect
   of certain widely deployed traffic engineering methods. If, for
   example, an AS chose to prepend its own AS number to an announcement
   in order to alter the preference for that route, a BGP neighbor using
   a bounded longest match might now see that route as eligible for
   discard in favor of an aggregate. While it is fairly easy to code
   around that particular problem, to avoid this class of problems it
   might be preferable to allow the bound to be applied to specific AS
   Sets as well as to all BGP neighbors.

4.6. Implications for Propagation delay and increased convergence time.

   If the route to the AS providing the route to the aggregate should be
   lost, the more-specific must propagate into the ASes which had
   formerly heard only the aggregate. This increases convergence time
   and may create situations in which reachability is temporarily
   compromised. Unlike the filter case, however, normal BGP behavior
   should restore reachability without changes to the router
   configuration. There is also a risk that during a pathological event
   the increased processing required by this change will degrade
   propagation times. This depends on both the speed of specific
   implementations and the character of the topology.

5. Security Considerations

   This document presumes that the implementation of bounded longest
   match is a knob inside a router config. Since the use of the knob
   affects route announcements not originating within the router's AS or
   its direct neighbors, the new behavior may result in surprises to the
   announcing AS. It is possible that this behavior might be considered
   a denial of service or mistaken for a denial of service by systems
   designed to detect black-holing on behalf of the origin AS.

6. Acknowledgements

   Cengiz Alaentinoglu, Alvaro Retana, Daniel Walton, Danny McPherson,
   and Barry Greene gave valuable comments on this draft. A number of
   colleagues also gave the author valuable comments on the white board
   markings that gave rise to this paper; among them are Lane Patterson,
   Ian Cooper, Gerd Besch, Bill Norton, Diarmuid Flynn, and Sean
   Donelan.

7. References

   [1] Huston, Geoff. http://www.telstra.net/ops/bgp/index.html

   [2] Ahuja, Abha.
       http://www.merit.edu/~ahuja/ptomaine-bof/ahuja-ietf-ptomaine/index.htm

   [3] Bush, Randy. Plenary, IETF 51.
       Eventually at: http://www.ietf.org/proceedings/01aug/

   [ADD-PATH]
       Walton, D, et al, "Advertisement of Multiple Paths in BGP,"
       draft-walton-bgp-add-paths-00.txt

8. Authors' Addresses

   Ted Hardie
   Ted.Hardie@nominum.com

   Russ White
   Cisco Systems, Inc.
   7025 Kit Creek Rd.
   Research Triangle Park, NC 27709
   EMail: riw@cisco.com