idnits 2.17.1

draft-ietf-idr-as-pathlimit-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 18.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 447.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 458.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 465.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 471.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (January 4, 2007) is 6322 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-13) exists of
     draft-ietf-idr-as4bytes-12

  ** Obsolete normative reference: RFC 3065 (Obsoleted by RFC 5065)

     Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                         T. Li, Ed.
3    Internet-Draft                                       Cisco Systems, Inc.
4    Intended status: Standards Track                        R. Fernando, Ed.
5    Expires: July 8, 2007                             Juniper Networks, Inc.
6                                                               J. Abley, Ed.
7                                                                     Afilias
8                                                             January 4, 2007

10                       The AS_PATHLIMIT Path Attribute
11                        draft-ietf-idr-as-pathlimit-03

13   Status of this Memo

15      By submitting this Internet-Draft, each author represents that any
16      applicable patent or other IPR claims of which he or she is aware
17      have been or will be disclosed, and any of which he or she becomes
18      aware will be disclosed, in accordance with Section 6 of BCP 79.

20      Internet-Drafts are working documents of the Internet Engineering
21      Task Force (IETF), its areas, and its working groups. Note that
22      other groups may also distribute working documents as Internet-
23      Drafts.

25      Internet-Drafts are draft documents valid for a maximum of six months
26      and may be updated, replaced, or obsoleted by other documents at any
27      time. It is inappropriate to use Internet-Drafts as reference
28      material or to cite them other than as "work in progress."

30      The list of current Internet-Drafts can be accessed at
31      http://www.ietf.org/ietf/1id-abstracts.txt.

33      The list of Internet-Draft Shadow Directories can be accessed at
34      http://www.ietf.org/shadow.html.

36      This Internet-Draft will expire on July 8, 2007.

38   Copyright Notice

40      Copyright (C) The IETF Trust (2007).

42   Abstract

44      This document describes the 'AS path limit' (AS_PATHLIMIT) path
45      attribute for BGP. This is an optional, transitive path attribute
46      that is designed to help limit the distribution of routing
47      information in the Internet.

49      By default, prefixes advertised into the BGP graph are distributed
50      freely, and if not blocked by policy will propagate globally. This
51      is harmful to the scalability of the routing subsystem since
52      information that only has a local effect on routing will cause state
53      creation throughout the default-free zone. This attribute can be
54      attached to a particular path to limit its scope to a subset of the
55      Internet.

57   Table of Contents

59      1.  Requirements notation
60      2.  Introduction
61      3.  Inter-Domain Traffic Engineering
62        3.1.  Traffic Engineering on a Diet
63        3.2.  AS_PATHLIMIT as Control
64      4.  Anycast Service Distribution
65      5.  The AS_PATHLIMIT Attribute
66        5.1.  Operations
67        5.2.  Proxy Control
68      6.  Security Considerations
69      7.  IANA Considerations
70      8.  Acknowledgements
71      9.  References
72        9.1.  Normative References
73        9.2.  Informative References
74      Authors' Addresses
75      Intellectual Property and Copyright Statements

77   1.  Requirements notation

79      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
80      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
81      document are to be interpreted as described in [RFC2119].

83   2.  Introduction

85      A prefix that is injected into BGP [RFC4271] will propagate
86      throughout the graph of all BGP speakers unless it is explicitly
87      blocked by policy configuration. This behavior is necessary for the
88      correct operation of BGP, but has some unfortunate interactions with
89      current operational procedures. Currently, it is beneficial in some
90      cases to inject longer prefixes into BGP to control the flow of
91      traffic headed towards a particular destination. These longer
92      prefixes may be advertised in addition to an aggregate, even when the
93      aggregate advertisement is sufficient for basic reachability. This
94      particular application is known as "inter-domain traffic engineering"
95      and is a well-known phenomenon that is contributing to growth in the
96      size of the global routing table [RFC3221]. The mechanism proposed
97      here allows the propagation of those longer prefixes to be limited,
98      allowing some traffic engineering problems to be solved without such
99      global implications.

101     Another application of this mechanism is concerned with the
102     distribution of services across the Internet using anycast. Allowing
103     an anycast address advertisement to be limited to a subset of ASes in
104     the network can help control the scope of the anycast service area.

106  3.  Inter-Domain Traffic Engineering

108     To perform traffic engineering, a multi-homed site advertises its
109     prefix to all of its neighbours and then also advertises more
110     specific prefixes to a subset of its neighbours. The longest match
111     lookup algorithm then causes traffic for the more specific prefixes
112     to prefer the subset of neighbours with the more specific prefix.
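
The longest-match behaviour described above can be sketched in a few
lines of Python. This is only an illustration and not part of the draft:
the two prefixes are hypothetical values from documentation address
space, and the names ROUTES and longest_match are invented for the
example.

```python
import ipaddress

# Hypothetical prefixes (documentation address space); the draft gives no
# concrete values for the aggregate or the more specific prefix.
ROUTES = {
    ipaddress.ip_network("192.0.2.0/24"): "aggregate, learned from all neighbours",
    ipaddress.ip_network("192.0.2.128/25"): "more specific, learned from a subset",
}


def longest_match(destination, routes=ROUTES):
    """Return (prefix, description) for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    # Keep every prefix that contains the destination address.
    candidates = [net for net in routes if addr in net]
    if not candidates:
        return None
    # The longest prefix length wins, so the more specific route is preferred.
    best = max(candidates, key=lambda net: net.prefixlen)
    return best, routes[best]
```

Under these assumptions, a destination such as 192.0.2.200 selects the
/25 entry, while 192.0.2.10 falls back to the /24 aggregate.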

114     Figure 1 shows an example of traffic engineering and its impact on
115     the network. The multi-homed site (A) has a primary provider (C) and
116     a secondary provider (B). It has a prefix, Y, that provides
117     reachability to all of A, and advertises this to both B and C. In
118     addition, due to the internal topology of end-site A, it wishes that
119     all incoming traffic to subset X of its site enter through provider
120     B. To accomplish this, A advertises the more specific prefix, X, to
121     provider B. Longest match again causes traffic to prefer X over Y if
122     the destination of the traffic is within X.

124     Assuming that there are no policy boundaries involved, BGP will
125     propagate both of these prefixes X and Y throughout the entire AS-
126     level topology. This includes distant providers such as H, F and G.
127     Unfortunately, this adds to the amount of overhead in the routing
128     subsystem. The problem to be solved is to reduce this overhead and
129     thereby improve the scalability of the routing of the Internet.

131      ,--------------.   ,--------------.   ,--------------.
132      |    Tier 2    +---+    Tier 2    +---+    Tier 3    |
133      |  Provider H  |   |  Provider E  |   |  Provider F  |
134      `--------------'   `-+---------+--+   `--------------'
135                          /          |
136                         /           |
137     ,------------------+---.   ,----+---------.   ,-------------.
138     |        Tier 1        +---+    Tier 1    |   |    Tier 1   |
139     |  Primary Provider C  |   |  Provider D  +---+  Provider G |
140     `--------+-----------+-'   `-------+------'   `-------------'
141              |            \            |
142              |Y            \           |
143     ,--------+------.     ,-+----------+-----------.
144     |  Multi-homed  +-----+         Tier 2         |
145     |    site A     |Y,X  |  Secondary Provider B  |
146     `---------------'     `------------------------'

148     The longer prefix X traverses a core and then coincides with the less-
149     specific, covering prefix Y.

151                                 Figure 1

153  3.1.  Traffic Engineering on a Diet

155     What is needed is one or more mechanisms that an AS can use to
156     distribute its more specific routing information to a subset of the
157     network that exceeds its immediate neighbouring ASes and yet is also
158     significantly less than the global BGP graph. The solution space for
159     this is unbounded, as the limits that a source AS may wish to apply
160     to its more specific routes could be a fairly complicated
161     manifestation of its routing policies. One can imagine a policy that
162     restricts more specifics to ASes that only have prime AS numbers, for
163     example.

165     We already have one mechanism for performing this type of function.
166     The BGP NO_EXPORT community string attribute [RFC1997] can be
167     attached to more specific prefixes. This will cause the more
168     specifics not to be advertised past the immediate neighbouring AS.
169     This is effective at helping to prevent more specific prefixes from
170     becoming global, but it is extremely limited in that the more
171     specific prefixes can only propagate to adjacent ASes.

173     Some ASes have created a further mechanism wherein a prefix that is
174     given a particular community will have NO_EXPORT attached to that
175     prefix when the prefix is propagated to a specific AS. This is not a
176     generally deployed mechanism, but is used by some ASes as another
177     means of scope control.

179     Referring again to our example, A can advertise X with NO_EXPORT to
180     provider B. However, this will cause provider B not to advertise X to
181     the remainder of the network, and providers C, D, and G will not have
182     the longer prefixes and will thus send all of A's traffic via
183     provider C. This is not what A hoped to accomplish with advertising a
184     longer prefix and demonstrates why this NO_EXPORT mechanism is not
185     sufficiently flexible.

187     Instead of attempting to provide an infinitely flexible and
188     complicated mechanism for controlling the distribution of prefixes,
189     we propose a single, coarse, scope control mechanism. This coarse
190     mechanism will provide a limited amount of control at a very low cost
191     and address most of the evils associated with performing traffic
192     engineering through route distribution.

194     We observe that traffic engineering via longer prefixes is only
195     effective when the longer prefixes have a different next hop from the
196     less specific prefix. Thus, past the point where the next hops
197     become identical, the longer prefixes provide no value whatsoever.
198     We also observe that most traffic ends up traversing a subset of the
199     network operated by a relatively small number of large market-
200     dominant providers, joined by settlement-free interconnects. If one
201     looks one AS hop past this subset of the network, it is likely that
202     the longer prefixes and the site aggregate are using the same next
203     hop, and thus the longer prefixes have stopped providing value.

205     We can see this clearly in our example. Provider F sees that both
206     prefix X and prefix Y will lead all traffic through provider E. There
207     is no point in F carrying and propagating the more specific prefix X.
208     Similarly, providers G and H need not carry prefix X.

210  3.2.  AS_PATHLIMIT as Control

212     To accomplish this, we propose to add information that will limit the
213     radius of propagation of more specific prefixes. If we attach a
214     count of the ASes that may be traversed by the more specific prefix,
215     we gain much of the control that we hope to achieve. We propose the
216     creation of a new path attribute that will carry an upper bound on
217     the number of ASes found in the AS_PATH attribute. This new path
218     attribute will be called the 'path limit' or AS_PATHLIMIT. For
219     example, if prefix X is advertised with path limit 1, then only
220     provider B has the information and we get an effect that is identical
221     to NO_EXPORT. If prefix X is advertised with path limit 2, then only
222     B, C and D will carry it. This is an interesting compromise as
223     traffic for X will now flow consistently through provider B, as
224     desired.

226     However, this is not identical to fully distributing X. Consider, for
227     example, that provider E in this circumstance will not receive prefix
228     X and is likely to prefer provider C for all A destinations. This
229     causes traffic for X to flow from E to C to B. If provider E did have
230     prefix X, it may choose to prefer provider D instead, resulting in a
231     different path. This second result can be achieved by increasing the
232     path limit to 3, but this has the unfortunate effect that provider G
233     would also receive prefix X.

235     Thus, AS_PATHLIMIT is an extremely lightweight mechanism, and
236     achieves a great deal of control. It is easy to imagine more
237     complicated control mechanisms, such as IDRP [ISO.10747.1993]
238     distribution lists, but we currently feel that the complexity of such
239     a mechanism is simply not warranted.

241  4.  Anycast Service Distribution

243     A growing number of services are being distributed using anycast, by
244     advertising a route which covers one or more addresses for a service
245     which is provided autonomously at multiple locations.

247     For some services, it is useful to restrict the peak possible service
248     load, to avoid overloading local connectivity or service
249     infrastructure capabilities; it may be a better failure mode for
250     service to be retained only for a small community of surrounding
251     networks than for a single node to fail under a global load of
252     queries.

254     Although to some degree this policy can be accomplished through
255     negotiation and judicious use of NO_EXPORT without AS_PATHLIMIT, the
256     AS_PATHLIMIT attribute provides a more flexible and reliable
257     mechanism.

259  5.  The AS_PATHLIMIT Attribute

261     The AS_PATHLIMIT attribute is a transitive optional BGP path
262     attribute, with Type Code 21. The AS_PATHLIMIT attribute has a fixed
263     length of 5 octets. The first octet is an unsigned number that is
264     the upper bound on the number of ASes in the AS_PATH attribute of the
265     associated paths. One octet suffices because the TTL field of the IP
266     header ensures that only one octet's worth of ASes can ever be
267     traversed. The second through fifth octets are the AS number of the
268     AS that attached the AS_PATHLIMIT attribute to the NLRI.

270  5.1.  Operations

272     A BGP speaker attaching the AS_PATHLIMIT attribute to an NLRI MUST
273     encode its AS number in the second through fifth octets. The encoding
274     is described in [I-D.ietf-idr-as4bytes]. This information is
275     intended to aid debugging in the case where the AS_PATHLIMIT
276     attribute is added by an AS other than the originator of the NLRI.

278     A BGP speaker sending a route with an associated AS_PATHLIMIT
279     attribute to an EBGP neighbour MUST examine the value of the
280     attribute and the associated AS_PATH to be advertised. If the number
281     of ASes found in the AS_PATH exceeds the AS_PATHLIMIT value, then the
282     route SHOULD NOT be sent.

284     For the purposes of this attribute, private AS numbers [RFC1930] and
285     confederation AS members [RFC3065] found in the AS_PATH are not
286     counted. AS numbers found within an AS_SET are not counted and an
287     entire AS_SET is counted as a single AS. Each instance of an AS
288     number that appears multiple times in an AS_PATH is counted.
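
The attribute layout, the counting rules and the sending rule described
above can be sketched as follows. This is a minimal Python illustration
under stated assumptions, not a reference implementation. The AS_PATH is
modelled as a list of (segment type, AS list) pairs, all function and
constant names are invented for the example, the four-octet AS number is
assumed to be an unsigned integer in network byte order, and the private
AS range is taken to be 64512 through 65535 as reserved in [RFC1930].
An AS_SET is counted as one AS regardless of its contents.

```python
import struct

AS_PATHLIMIT_TYPE_CODE = 21          # Type Code given in Section 5
PRIVATE_ASNS = range(64512, 65536)   # private-use block reserved in RFC 1930


def encode_as_pathlimit(limit, asn):
    """Pack the 5-octet value: 1-octet limit followed by a 4-octet AS number."""
    # "!BI": network byte order, unsigned 8-bit, unsigned 32-bit (assumed layout).
    return struct.pack("!BI", limit, asn)


def decode_as_pathlimit(value):
    """Unpack a 5-octet attribute value into a (limit, asn) tuple."""
    if len(value) != 5:
        raise ValueError("AS_PATHLIMIT value must be exactly 5 octets")
    return struct.unpack("!BI", value)


def count_ases(as_path):
    """Count ASes in an AS_PATH using the rules of Section 5.1."""
    total = 0
    for segment_type, asns in as_path:
        if segment_type in ("CONFED_SEQUENCE", "CONFED_SET"):
            continue                  # confederation members are not counted
        if segment_type == "SET":
            total += 1                # an entire AS_SET counts as a single AS
        else:
            # Every instance counts, including repeats; private ASNs do not.
            total += sum(1 for asn in asns if asn not in PRIVATE_ASNS)
    return total


def may_send_to_ebgp(limit, as_path_to_advertise):
    """True if the route stays within its path limit when advertised."""
    return count_ases(as_path_to_advertise) <= limit


# Hypothetical AS numbers from the documentation range for site A and provider B.
SITE_A, PROVIDER_B = 64496, 64497
value = encode_as_pathlimit(1, SITE_A)
path_from_b = [("SEQUENCE", [PROVIDER_B, SITE_A])]  # AS_PATH B would advertise
```

With these assumptions, a path limit of 1 attached by site A lets A send
the route to provider B, but may_send_to_ebgp(1, path_from_b) is false,
so B would not pass the route on to its own EBGP neighbours.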

290     If the AS_PATHLIMIT attribute is attached to a prefix by a private
291     AS, then when the prefix is advertised outside of the parent AS, the
292     AS number contained in the AS_PATHLIMIT attribute should be replaced
293     by the AS number of the parent AS.

295     Similarly, if the AS_PATHLIMIT attribute is attached to a prefix by a
296     member of a confederation, then when the prefix is advertised outside
297     of the confederation boundary, the AS number of the
298     confederation member inside of the AS_PATHLIMIT attribute should be
299     replaced by the confederation's AS number.

301     A BGP speaker receiving a route with an associated AS_PATHLIMIT
302     attribute from an EBGP neighbour MUST examine the value of the
303     attribute. If the number of ASes in the AS_PATH exceeds the value of
304     the AS_PATHLIMIT attribute, then the route MUST be ignored without
305     further processing.

307     When a BGP speaker propagates a route with an associated AS_PATHLIMIT
308     attribute, which it has learned from another BGP speaker's UPDATE
309     message, it MUST NOT modify the route's AS_PATHLIMIT attribute. It
310     may remove the AS_PATHLIMIT in its entirety. It may also attach a
311     new AS_PATHLIMIT attribute that encodes its own AS number.

313     To ensure loop prevention, BGP requires that all aggregate routes
314     with AS paths that omit any AS number from the AS_PATHs being
315     aggregated be originated with the ATOMIC_AGGREGATE attribute. To
316     help ensure compliance with this, sites that choose to advertise the
317     AS_PATHLIMIT path attribute SHOULD advertise the ATOMIC_AGGREGATE on
318     all less specific covering prefixes as well as the more specific
319     prefixes.

321  5.2.  Proxy Control

323     An AS may attach the AS_PATHLIMIT attribute to a route that it has
324     received from another AS. This is a form of proxy aggregation and
325     may result in routing behaviors that the origin of the route did not
326     intend. Further, if the overlapping prefixes are not advertised with
327     the ATOMIC_AGGREGATE attribute, adding the AS_PATHLIMIT attribute may
328     cause defective implementations to advertise incorrect paths. Before
329     adding the AS_PATHLIMIT attribute an AS must carefully consider the
330     risks and consequences outlined here.

332  6.  Security Considerations

334     This new BGP attribute creates no new security issues. For it to be
335     used, it must be attached to a BGP route. If a router is forging a
336     route, then this attribute limits the extent of the damage caused by
337     the forgery. This may be used by attackers to limit the scope and
338     thus the visibility of their attacks. Presently, the same approach
339     can be applied with the use of the NO_EXPORT community, but just as
340     the AS_PATHLIMIT attribute gives network operators more granularity
341     in the distribution of prefixes, it also gives attackers more
342     granularity in their attacks. If a router fraudulently attaches the
343     AS_PATHLIMIT attribute to a route, then it could just as easily
344     have used normal policy mechanisms to filter out the route
345     completely. Thus, the AS_PATHLIMIT attribute does not enable new
346     attacks, but it does give an attacker the ability to create more
347     subtle attacks that only affect a subset of the entire network.

349  7.  IANA Considerations

351     This document has no actions for IANA. IANA has already allocated a
352     code point for the AS_PATHLIMIT attribute under the Early IANA
353     Allocation process.

355  8.  Acknowledgements

357     The editors would like to acknowledge that they are not the original
358     initiators of this concept. Over the years, many similar proposals
359     have come our way, and we had hoped that self-discipline would cause
360     this type of mechanism to be unnecessary. We were overly optimistic.

362     The names of those who originally proposed this are now lost to the
363     mists of time. This should rightfully be their document. We would
364     like to thank them for the opportunity to steward their concept to
365     fruition.

367  9.  References

369  9.1.  Normative References

371     [I-D.ietf-idr-as4bytes]
372                Vohra, Q. and E. Chen, "BGP Support for Four-octet AS
373                Number Space", draft-ietf-idr-as4bytes-12 (work in
374                progress), November 2005.

376     [RFC1930]  Hawkinson, J. and T. Bates, "Guidelines for creation,
377                selection, and registration of an Autonomous System (AS)",
378                BCP 6, RFC 1930, March 1996.

380     [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
381                Communities Attribute", RFC 1997, August 1996.

383     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
384                Requirement Levels", BCP 14, RFC 2119, March 1997.

386     [RFC3065]  Traina, P., McPherson, D., and J. Scudder, "Autonomous
387                System Confederations for BGP", RFC 3065, February 2001.

389     [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
390                Protocol 4 (BGP-4)", RFC 4271, January 2006.

392  9.2.  Informative References

394     [ISO.10747.1993]
395                International Organization for Standardization,
396                "Information Processing Systems - Telecommunications and
397                Information Exchange between Systems - Protocol for
398                Exchange of Inter-domain Routeing Information among
399                Intermediate Systems to Support Forwarding of ISO 8473
400                PDUs", ISO Standard 10747, 1993.

402     [RFC3221]  Huston, G., "Commentary on Inter-Domain Routing in the
403                Internet", RFC 3221, December 2001.

405  Authors' Addresses

407     T. Li (editor)
408     Cisco Systems, Inc.
409     425 East Tasman Drive
410     San Jose, CA 95134

412     Phone: +1 408 525 1254
413     Email: tli@cisco.com

415     R. Fernando (editor)
416     Juniper Networks, Inc.
417     1194 North Mathilda Avenue
418     Sunnyvale, CA 94089-1206
419     US

421     Phone: +1 888 586 4737
422     Email: rex@juniper.net

424     J. Abley (editor)
425     Afilias Canada, Inc.
426     4141 Yonge Street, Suite 204
427     Toronto, ON M2P 2A8
428     CA

430     Phone: +1 416 673 4176
431     Email: jabley@ca.afilias.info

433  Full Copyright Statement

435     Copyright (C) The IETF Trust (2007).

437     This document is subject to the rights, licenses and restrictions
438     contained in BCP 78, and except as set forth therein, the authors
439     retain all their rights.

441     This document and the information contained herein are provided on an
442     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
443     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
444     THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
445     OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
446     THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
447     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

449  Intellectual Property

451     The IETF takes no position regarding the validity or scope of any
452     Intellectual Property Rights or other rights that might be claimed to
453     pertain to the implementation or use of the technology described in
454     this document or the extent to which any license under such rights
455     might or might not be available; nor does it represent that it has
456     made any independent effort to identify any such rights. Information
457     on the procedures with respect to rights in RFC documents can be
458     found in BCP 78 and BCP 79.

460     Copies of IPR disclosures made to the IETF Secretariat and any
461     assurances of licenses to be made available, or the result of an
462     attempt made to obtain a general license or permission for the use of
463     such proprietary rights by implementers or users of this
464     specification can be obtained from the IETF on-line IPR repository at
465     http://www.ietf.org/ipr.

467     The IETF invites any interested party to bring to its attention any
468     copyrights, patents or patent applications, or other proprietary
469     rights that may cover technology that may be required to implement
470     this standard. Please address the information to the IETF at
471     ietf-ipr@ietf.org.

473  Acknowledgment

475     Funding for the RFC Editor function is provided by the IETF
476     Administrative Support Activity (IASA).