Network Working Group                                         T. Li, Ed.
Internet-Draft                                           Tropos Networks
Expires: December 8, 2006                               R. Fernando, Ed.
                                                            Amoora, Inc.
                                                           J. Abley, Ed.
                                                                 Afilias
                                                            June 6, 2006


                    The AS_PATHLIMIT Path Attribute
                   draft-ietf-idr-as-pathlimit-02.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 8, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This document describes the 'AS path limit' (AS_PATHLIMIT) path
   attribute for BGP. This is an optional, transitive path attribute
   that is designed to help limit the distribution of routing
   information in the Internet.

   By default, prefixes advertised into the BGP graph are distributed
   freely, and if not blocked by policy will propagate globally. This
   is harmful to the scalability of the routing subsystem since
   information that only has a local effect on routing will cause state
   creation throughout the default-free zone. This attribute can be
   attached to a particular path to limit its scope to a subset of the
   Internet.

Table of Contents

   1.  Requirements notation
   2.  Introduction
   3.  Inter-Domain Traffic Engineering
     3.1.  Traffic Engineering on a Diet
     3.2.  AS_PATHLIMIT as Control
   4.  Anycast Service Distribution
   5.  The AS_PATHLIMIT Attribute
     5.1.  Operations
     5.2.  Proxy Control
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Introduction

   A prefix that is injected into BGP [RFC1771] will propagate
   throughout the graph of all BGP speakers unless it is explicitly
   blocked by policy configuration. This behavior is necessary for the
   correct operation of BGP, but has some unfortunate interactions with
   current operational procedures. Currently, it is beneficial in some
   cases to inject longer prefixes into BGP to control the flow of
   traffic headed towards a particular destination. These longer
   prefixes may be advertised in addition to an aggregate, even when the
   aggregate advertisement is sufficient for basic reachability. This
   particular application is known as "inter-domain traffic engineering"
   and is a well-known phenomenon that is contributing to growth in the
   size of the global routing table [RFC3221]. The mechanism proposed
   here allows the propagation of those longer prefixes to be limited,
   allowing some traffic engineering problems to be solved without such
   global implications.

   Another application of this mechanism is concerned with the
   distribution of services across the Internet using anycast. Allowing
   an anycast address advertisement to be limited to a subset of ASes in
   the network can help control the scope of the anycast service area.

3.  Inter-Domain Traffic Engineering

   To perform traffic engineering, a multi-homed site advertises its
   prefix to all of its neighbours and then also advertises more
   specific prefixes to a subset of its neighbours. The longest match
   lookup algorithm then causes traffic for the more specific prefixes
   to prefer the subset of neighbours with the more specific prefix.

   Figure 1 shows an example of traffic engineering and its impact on
   the network. The multi-homed site (A) has a primary provider (C) and
   a secondary provider (B). It has a prefix, Y, that provides
   reachability to all of A, and advertises this to both B and C. In
   addition, due to the internal topology of end-site A, it wishes that
   all incoming traffic to subset X of its site enter through provider
   B. To accomplish this, A advertises the more specific prefix, X, to
   provider B. Longest match again causes traffic to prefer X over Y if
   the destination of the traffic is within X.

   Assuming that there are no policy boundaries involved, BGP will
   propagate both of these prefixes X and Y throughout the entire
   AS-level topology. This includes distant providers such as H, F and
   G. Unfortunately, this adds to the amount of overhead in the routing
   subsystem. The problem to be solved is to reduce this overhead and
   thereby improve the scalability of the routing of the Internet.

    ,--------------.   ,--------------.   ,--------------.
    | Tier 2       +---+ Tier 2       +---+ Tier 3       |
    | Provider H   |   | Provider E   |   | Provider F   |
    `--------------'   `-+---------+--+   `--------------'
                        /          |
                       /           |
   ,------------------+---.   ,----+---------.   ,-------------.
   | Tier 1               +---+ Tier 1       |   | Tier 1      |
   | Primary Provider C   |   | Provider D   +---+ Provider G  |
   `--------+-----------+-'   `-------+------'   `-------------'
            |            \            |
            |Y            \           |
   ,--------+------.     ,-+----------+-----------.
   | Multi-homed   +-----+ Tier 2                 |
   | site A        |Y,X  | Secondary Provider B   |
   `---------------'     `------------------------'

   The longer prefix X traverses a core and then coincides with the
   less-specific, covering prefix Y.

                                Figure 1

3.1.  Traffic Engineering on a Diet

   What is needed is one or more mechanisms that an AS can use to
   distribute its more specific routing information to a subset of the
   network that exceeds its immediate neighbouring ASes and yet is also
   significantly less than the global BGP graph. The solution space for
   this is unbounded, as the limits that a source AS may wish to apply
   to its more specific routes could be a fairly complicated
   manifestation of its routing policies. One can imagine a policy that
   restricts more specifics to only those ASes that have prime AS
   numbers, for example.

   We already have one mechanism for performing this type of function.
   The BGP NO_EXPORT community string attribute [RFC1997] can be
   attached to more specific prefixes. This will cause the more
   specifics not to be advertised past the immediate neighbouring AS.
   This is effective at helping to prevent more specific prefixes from
   becoming global, but it is extremely limited in that the more
   specific prefixes can only propagate to adjacent ASes.

   Some ASes have created a further mechanism wherein a prefix that is
   given a particular community will have NO_EXPORT attached to that
   prefix when the prefix is propagated to a specific AS. This is not a
   generally deployed mechanism, but is used by some ASes as another
   means of scope control.

   Referring again to our example, A can advertise X with NO_EXPORT to
   provider B. However, this will cause provider B not to advertise X to
   the remainder of the network, and providers C, D, and G will not have
   the longer prefixes and will thus send all of A's traffic via
   provider C. This is not what A hoped to accomplish with advertising a
   longer prefix and demonstrates why this NO_EXPORT mechanism is not
   sufficiently flexible.

   Instead of attempting to provide an infinitely flexible and
   complicated mechanism for controlling the distribution of prefixes,
   we propose a single, coarse, scope control mechanism. This coarse
   mechanism will provide a limited amount of control at a very low cost
   and address most of the evils associated with performing traffic
   engineering through route distribution.

   We observe that traffic engineering via longer prefixes is only
   effective when the longer prefixes have a different next hop from the
   less specific prefix. Thus, past the point where the next hops
   become identical, the longer prefixes provide no value whatsoever.
   We also observe that most traffic ends up traversing a subset of the
   network operated by a relatively small number of large
   market-dominant providers, joined by settlement-free interconnects.
   If one looks one AS hop past this subset of the network, it is likely
   that the longer prefixes and the site aggregate are using the same
   next hop, and thus the longer prefixes have stopped providing value.

   We can see this clearly in our example. Provider F sees that both
   prefix X and prefix Y will lead all traffic through provider E. There
   is no point in F carrying and propagating the more specific prefix X.
   Similarly, providers G and H need not carry prefix X.

3.2.  AS_PATHLIMIT as Control

   To accomplish this, we propose to add information that will limit the
   radius of propagation of more specific prefixes. If we attach a
   count of the ASes that may be traversed by the more specific prefix,
   we gain much of the control that we hope to achieve. We propose the
   creation of a new path attribute that will carry an upper bound on
   the number of ASes found in the AS_PATH attribute. This new path
   attribute will be called the 'path limit' or AS_PATHLIMIT. For
   example, if prefix X is advertised with path limit 1, then only
   provider B has the information and we get an effect that is identical
   to NO_EXPORT. If prefix X is advertised with path limit 2, then only
   B, C and D will carry it. This is an interesting compromise as
   traffic for X will now flow consistently through provider B, as
   desired.

   However, this is not identical to fully distributing X. Consider, for
   example, that provider E in this circumstance will not receive prefix
   X and is likely to prefer provider C for all A destinations. This
   causes traffic for X to flow from E to C to B. If provider E did have
   prefix X, it may choose to prefer provider D instead, resulting in a
   different path. This second result can be achieved by increasing the
   path limit to 3, but this has the unfortunate effect that provider G
   would also receive prefix X.

   Thus, AS_PATHLIMIT is an extremely lightweight mechanism, and
   achieves a great deal of control. It is easy to imagine more
   complicated control mechanisms, such as IDRP [IDRP] distribution
   lists, but we currently feel that the complexity of such a mechanism
   is simply not warranted.

4.  Anycast Service Distribution

   A growing number of services are being distributed using anycast, by
   advertising a route which covers one or more addresses for a service
   which is provided autonomously at multiple locations.

   For some services, it is useful to restrict the peak possible service
   load, to avoid overloading local connectivity or service
   infrastructure capabilities; it may be a better failure mode for
   service to be retained only for a small community of surrounding
   networks than for a single node to fail under a global load of
   queries.

   Although to some degree this policy can be accomplished through
   negotiation and judicious use of NO_EXPORT without AS_PATHLIMIT, the
   AS_PATHLIMIT attribute provides a more flexible and reliable
   mechanism.

5.  The AS_PATHLIMIT Attribute

   The AS_PATHLIMIT attribute is a transitive optional BGP path
   attribute, with Type Code XXXX. The AS_PATHLIMIT attribute has a
   fixed length of 5 octets. The first octet is an unsigned number that
   is the upper bound on the number of ASes in the AS_PATH attribute of
   the associated paths. The second through fifth octets are the AS
   number of the AS that attached the AS_PATHLIMIT attribute to the
   NLRI.

5.1.  Operations

   A BGP speaker attaching the AS_PATHLIMIT attribute to an NLRI MUST
   encode its AS number in the second through fifth octets. The
   encoding is described in [4B AS]. This information is intended to
   aid debugging in the case where the AS_PATHLIMIT attribute is added
   by an AS other than the originator of the NLRI.

   A BGP speaker sending a route with an associated AS_PATHLIMIT
   attribute to an EBGP neighbour MUST examine the value of the
   attribute and the associated AS_PATH to be advertised. If the number
   of ASes found in the AS_PATH exceeds the AS_PATHLIMIT value, then the
   route should not be sent.

   For the purposes of this attribute, private AS numbers [RFC1930] and
   confederation AS members [RFC3065] found in the AS_PATH are not
   counted. AS numbers found within an AS_SET are not counted
   individually; an entire AS_SET is counted as a single AS. Each
   instance of an AS number that appears multiple times in an AS_PATH is
   counted.

   A BGP speaker receiving a route with an associated AS_PATHLIMIT
   attribute from an EBGP neighbour MUST examine the value of the
   attribute. If the number of ASes in the AS_PATH exceeds the value of
   the AS_PATHLIMIT attribute, then the route MUST be ignored without
   further processing.
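
   As an illustration only, the attribute encoding and the counting
   rules above can be sketched in Python. This is a minimal sketch
   under stated assumptions, not a reference implementation. The names
   Seg, encode_pathlimit, decode_pathlimit, count_ases and within_limit
   are hypothetical. The AS number is assumed to be carried as a
   4-octet unsigned integer in network byte order, the private AS range
   is assumed to be 64512 through 65535, and an AS_SET is assumed to
   count as one AS regardless of its members. Only the 5-octet
   attribute value is built, since the Type Code is not yet assigned.

```python
import struct
from enum import Enum, auto


class Seg(Enum):
    """AS_PATH segment kinds (labels only, not wire-format codes)."""
    AS_SEQUENCE = auto()
    AS_SET = auto()
    AS_CONFED_SEQUENCE = auto()
    AS_CONFED_SET = auto()


# Assumption: private-use AS block, 64512 through 65535.
PRIVATE_AS = range(64512, 65536)


def encode_pathlimit(limit, asn):
    """Build the 5-octet value: 1-octet limit, then 4-octet AS number.

    Assumes network byte order; struct raises struct.error if either
    field is out of range.
    """
    return struct.pack("!BI", limit, asn)


def decode_pathlimit(value):
    """Return (limit, asn) from a 5-octet attribute value."""
    if len(value) != 5:
        raise ValueError("AS_PATHLIMIT value must be exactly 5 octets")
    return struct.unpack("!BI", value)


def count_ases(as_path):
    """Count ASes in a list of (Seg, [asn, ...]) segments.

    Confederation segments and private AS numbers are skipped, an
    AS_SET counts as one AS, and repeated AS numbers in a sequence
    are each counted.
    """
    total = 0
    for kind, asns in as_path:
        if kind in (Seg.AS_CONFED_SEQUENCE, Seg.AS_CONFED_SET):
            continue
        if kind is Seg.AS_SET:
            total += 1
        else:
            total += sum(1 for asn in asns if asn not in PRIVATE_AS)
    return total


def within_limit(as_path, limit):
    """True if the counted AS_PATH length does not exceed the limit."""
    return count_ases(as_path) <= limit
```

   With made-up AS numbers for B and A in Figure 1, the path [B, A]
   that B would advertise counts as 2 ASes, so within_limit returns
   False for a path limit of 1 and True for a path limit of 2, in line
   with the example in Section 3.2.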

   When a BGP speaker propagates a route with an associated AS_PATHLIMIT
   attribute, which it has learned from another BGP speaker's UPDATE
   message, it MUST NOT modify the route's AS_PATHLIMIT attribute.

   BGP requires that a BGP speaker that advertises a less specific
   prefix, but not a more specific prefix that it is using, must
   advertise the less specific prefix with the ATOMIC_AGGREGATE
   attribute. BGP speakers that do not advertise a more specific prefix
   based on the AS_PATHLIMIT must comply with this rule and advertise
   the less specific prefixes with the ATOMIC_AGGREGATE attribute. To
   help ensure compliance with this, sites that choose to advertise the
   AS_PATHLIMIT path attribute should advertise the ATOMIC_AGGREGATE
   attribute on all less specific covering prefixes.

5.2.  Proxy Control

   An AS may attach the AS_PATHLIMIT attribute to a route that it has
   received from another AS. This is a form of proxy aggregation and
   may result in routing behaviors that the origin of the route did not
   intend. Further, if the overlapping prefixes are not advertised with
   the ATOMIC_AGGREGATE attribute, adding the AS_PATHLIMIT attribute may
   cause defective implementations to advertise incorrect paths. Before
   adding the AS_PATHLIMIT attribute an AS must carefully consider the
   risks and consequences outlined here.

6.  Security Considerations

   This new BGP attribute creates no new security issues. For it to be
   used, it must be attached to a BGP route. If the router is forging a
   route, then this attribute limits the extent of the damage caused by
   the forgery. If a router attaches this attribute to a route, then it
   could just as easily have used normal policy mechanisms to filter out
   the route.

7.  IANA Considerations

   IANA is hereby requested to allocate a code point from the BGP path
   attribute Type Code space for the AS_PATHLIMIT path attribute.
   Please replace 'XXXX' in the text above with the newly allocated code
   point value.

8.  Acknowledgements

   The editors would like to acknowledge that they are not the original
   initiators of this concept. Over the years, many similar proposals
   have come our way, and we had hoped that self-discipline would cause
   this type of mechanism to be unnecessary. We were overly optimistic.

   The names of those who originally proposed this are now lost to the
   mists of time. This should rightfully be their document. We would
   like to thank them for the opportunity to steward their concept to
   fruition.

9.  References

   [4B AS]    Vohra, Q. and E. Chen, "BGP support for Four-octet AS
              Number Space", Sept. 2005.

   [IDRP]     ISO/IEC, "Information Processing Systems -
              Telecommunications and Information Exchange between
              Systems - Protocol for Exchange of Inter-domain Routeing
              Information among Intermediate Systems to Support
              Forwarding of ISO 8473 PDUs", IS 10747, 1993.

   [RFC1771]  Rekhter, Y. and T. Li, "A Border Gateway Protocol 4
              (BGP-4)", RFC 1771, March 1995.

   [RFC1930]  Hawkinson, J. and T. Bates, "Guidelines for creation,
              selection, and registration of an Autonomous System (AS)",
              BCP 6, RFC 1930, March 1996.

   [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
              Communities Attribute", RFC 1997, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3065]  Traina, P., McPherson, D., and J. Scudder, "Autonomous
              System Confederations for BGP", RFC 3065, February 2001.

   [RFC3221]  Huston, G., "Commentary on Inter-Domain Routing in the
              Internet", RFC 3221, December 2001.

Authors' Addresses

   T. Li (editor)
   Tropos Networks
   555 Del Rey Ave.
   Sunnyvale, CA 94085

   Phone: +1 408 470 7381
   Email: tony.li@tony.li


   R. Fernando (editor)
   Amoora, Inc.
   1463 Cedarmeadow Ct.
   San Jose, CA 95131
   US

   Email: rex_f@yahoo.com


   J. Abley (editor)
   Afilias Canada, Inc.
   4141 Yonge Street, Suite 204
   Toronto, ON M2P 2A8
   CA

   Phone: +1 416 673 4176
   Email: jabley@ca.afilias.info

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2006). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.