Internet Engineering Task Force                   Olivier Bonaventure
INTERNET DRAFT                                    FUNDP
                                                  Stefaan De Cnodder
                                                  Alcatel
                                                  Jeffrey Haas
                                                  NextHop
                                                  Bruno Quoitin
                                                  FUNDP
                                                  Russ White
                                                  Cisco
                                                  February, 2002
                                                  Expires August, 2002

             Controlling the redistribution of BGP routes

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This document proposes the redistribution extended community. This
   new extended community allows a router to influence how a specific
   route should be redistributed towards a specified set of eBGP
   speakers. Several types of redistribution communities are proposed.
   The first type may be used to indicate that a specific route should
   not be announced to the specified set of eBGP speakers. The second
   type may be used to indicate that the attached route should only be
   announced with the NO_EXPORT community to the specified set of eBGP
   speakers and the third type may be used to indicate that the attached
   route should be prepended n times when announced to the specified set
   of eBGP speakers.

1 Introduction

   In today's commercial Internet, many ISPs need to have some control
   on their interdomain traffic. In the outgoing direction, this control
   can be obtained by configuring the BGP routers of the ISP to favor
   some routes over others by using the LOCAL-PREF attribute. However,
   due to the asymmetry of Internet traffic, most ISPs also need to
   control their incoming traffic.

      +---------------+
      |               |
      |     AS22      |
      |               |
      +---------------+
             ||
      +---------------+               +---------------+
      |  13.0.0.0/8   |               |     AS21      |
      |  12.0.0.0/8   |===============|               |
      |     AS20      |               +---------------+
      +---------------+
             ||
      +---------------+
      |               |
      |     AS10      |
      |               |
      +---------------+
         Figure 1: Simple interdomain topology

   In the incoming direction, the only way to influence the traffic flow
   is to control the redistribution of its routes. Several methods exist
   and are used in practice [Hal97,QuB02]. In this case, the ISP needs
   to influence the redistribution and the selection of its own routes
   by remote ISPs. Since the default configuration of many BGP routers
   is to select the route with the smallest AS path length, a common
   technique is to artificially increase the length of the AS path for
   some announced routes. For example, in figure 1, if AS20 wanted to
   indicate that it prefers to receive its traffic towards subnet
   13.0.0.0/8 through its link with AS22, then it would announce this
   prefix as usual on this link to AS22 and announce a prefix with the
   AS20:AS20:AS20:AS20 path to AS21 and AS10. If AS10 and AS21 rely only
   on the AS path length to select the best BGP route, they will prefer
   the shorter route received by AS22. This requires a manual
   configuration of the BGP routers, but path prepending is very
   frequently used today on the Internet [Hus01]. In some cases, the
   configuration burden can be reduced by using the BGP communities
   attribute.

   Recently, several large ISPs have gone one step further by defining
   BGP communities that allow their customers to influence the
   redistribution of their routes. For example, in figure 1, AS20 could
   configure its BGP routers to always prepend four times AS20 when they
   announce via eBGP a route received from one of AS20's customers with
   a special community attribute.
   For this, AS20 needs to publish the
   specific BGP communities that it supports and its customers need to
   configure their routers appropriately. If AS20 needs to define a new
   BGP community or change an existing one, it must inform all its
   customers, who will then have to update the configuration of their
   routers. A more detailed survey of the utilization of the BGP
   community attribute by ISPs may be found in [QuB02]. This survey
   reveals the following:

   - Many different AS define their own BGP community values
     to allow their customers/peers to indicate that a
     particular route should not be propagated towards a specific AS,
     towards the routers attached to a specific IX, or towards AS
     within a given geographical area (e.g. a European AS could want
     to prohibit a route from being announced to US peers).
   - Many AS define their own BGP community values
     to allow their peers or customers to indicate that an
     announced route should be prepended when announced towards a
     specific AS, IX or set of AS.
   - Several AS define their own BGP community attribute to indicate
     that a given route should only be redistributed towards a
     specified AS.

   Furthermore, this survey also reveals that some AS have difficulty
   providing all these facilities while still relying on their
   assigned set of BGP community values. For example, some AS have
   chosen to reuse several BGP community values corresponding to the
   private AS space (i.e. community values 64512:00 - 65534:65535) to be
   able to define structured communities that allow their customers to
   influence the redistribution of their routes and some of these
   community values appear in BGP tables on the global Internet.

   Although the survey shows that these BGP communities are widely used
   today to provide such facilities, this is far from the best solution.
   Requiring each AS to select its own values for the BGP communities
   and to document these values in the routing registries is not very
   efficient because it forces the BGP routers to be configured manually
   based on information found in these registries or in peering
   agreements.

   In this document, we define a new type of BGP extended community. By
   using a set of BGP extended communities with a precise syntax, we
   support most of the current utilizations of the BGP community without
   relying unnecessarily on manual configuration of the BGP routers. We
   believe that reducing the manual configuration of these routers would
   be very useful for the stability and the performance of the global
   Internet.

2 Controlled redistribution of BGP routes

   This document defines a method to allow a BGP speaker to influence
   how its peers will redistribute its own routes. For this, the BGP
   speaker may define for each announced route a redistribution policy
   that controls how this route will be redistributed. This is done by
   defining a set of allowed or requested operations and a list of BGP
   speakers. The list of BGP speakers can be specified by listing either
   the BGP speakers that are covered by the redistribution policy or
   those that are not covered by this policy. The current version of
   this document supports the following operations:

   - the attached route should not be announced to the specified BGP
     speakers

   - the attached route should only be announced to the specified BGP
     speakers

   - the attached route should be announced with the NO_EXPORT
     attribute to the specified BGP speakers

   - the attached route should be prepended n times when announced
     to the specified BGP speakers

   The redistribution policies are encoded in a special type of
   extended community attribute called the redistribution community.
   If a redistribution policy applies to a long list of BGP speakers,
   then it will be encoded in several redistribution communities.

2.1 The redistribution community

   The extended communities attribute is defined in [RTR01]. This
   attribute allows a BGP router to attach a set of extended
   communities to an UPDATE message. Each extended community value is
   encoded as an eight octets quantity with a two octets type field and
   a 6 octets value field. Several types of extended community values
   are defined in [RTR01]. This document proposes a new well-known
   extended community: the redistribution community.

   The redistribution community is composed of a one octet type field
   (regular type). It is encoded as defined in [RTR01]. The high-order
   bit is cleared (type assigned by IANA). Since the redistribution
   community is used for signalling purposes between two AS's, the
   bit6 is set meaning that the extended community is non-transitive
   across ASes. This is important to ensure that communities used to
   affect the redistribution of routes by a given AS are not
   unnecessarily distributed over the entire Internet as it is often
   the case today [QuB02]. The remaining 6 lower-order bits are to be
   defined by IANA (TBDTBD notation in figure 1).

    1 octet  1 octet        6 octets
   +--------+--------+---------------------+
   |01TBDTBD| Action | BGP_Speakers_Filter |
   +--------+--------+---------------------+
   Figure 1 : Encoding of the redistribution community

   The remaining 7 octets of the redistribution community indicate how
   a router will advertise the received route to its peers. This
   requires two pieces of information: a filter to select a subset of
   BGP speakers and an action that indicates how the attached route
   should be redistributed to the selected peers. The high-order octet
   indicates the action to be taken and the 6 remaining octets define
   the filter.
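
   The three-field layout above (one type octet, one Action octet and
   a six octet BGP_Speakers_Filter) can be sketched as follows. This
   is only an illustration under stated assumptions: the function
   names are hypothetical, the Action and filter fields are treated as
   opaque values, and the type octet 0x44 is merely the sample value
   used in the configuration of Appendix 1, since the 6 low-order type
   bits have not been assigned by IANA.

```python
import struct

# Assumption: 0x44 is only the sample type octet from Appendix 1.
# The 6 low-order bits of the real type are to be assigned by IANA.
SAMPLE_TYPE_OCTET = 0x44


def pack_community(action: int, speakers_filter: bytes,
                   type_octet: int = SAMPLE_TYPE_OCTET) -> bytes:
    """Build the 8-octet community: type, Action, BGP_Speakers_Filter."""
    if type_octet & 0x80:
        raise ValueError("high-order bit must be cleared (IANA type)")
    if not type_octet & 0x40:
        raise ValueError("bit6 must be set (non-transitive across ASes)")
    if not 0 <= action <= 0xFF:
        raise ValueError("Action is a single octet")
    if len(speakers_filter) != 6:
        raise ValueError("BGP_Speakers_Filter is exactly 6 octets")
    # Network byte order: 1 octet, 1 octet, 6 raw octets.
    return struct.pack("!BB6s", type_octet, action, speakers_filter)


def unpack_community(raw: bytes):
    """Split an 8-octet community into (type, action, filter)."""
    if len(raw) != 8:
        raise ValueError("an extended community is exactly 8 octets")
    return struct.unpack("!BB6s", raw)
```

   For instance, packing Action 0x01 with the filter octets
   81 00 00 00 00 01 yields the value 0x4401810000000001 that appears
   in the sample configuration of Appendix 1.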

   The Action octet is encoded as follows:

   - The high and the second order bits (Bit7 and Bit6) are reserved
     and set to zero in this document

   - Bit5-3 are the Action type

   - Bit2-0 are the Action parameters

   Action types

   This document defines three types of actions (values 000b - 010b).
   Values 011b-111b are to be assigned by IANA.

   - 000b Prepend. This action means that the AS number of the
     announcing router should be prepended when announcing the attached
     route to the BGP speakers covered by the redistribution policy.
     The action parameter indicates how many times the AS number should
     be prepended.

   - 001b No_Export. This action means that the NO_EXPORT community
     should be inserted when announcing the attached route to the BGP
     speakers covered by the redistribution policy. This action type
     does not require a parameter. The action parameter should be set
     to zero by the sender and ignored by the receiver.

   - 010b Do not announce. This action means that the route should not
     be announced to the BGP speakers covered by the redistribution
     policy. This action type does not require a parameter. The action
     parameter should be set to zero by the sender and ignored by the
     receiver.

   The BGP Speakers Filter

   The BGP_Speakers_Filter field is used to specify the eBGP speakers
   that will be affected by the specified action. It is composed of a
   one octet type field and a five octets value field.

   +--------+--------------------------------------+
   |  Type  | BGP_Speakers_Filter Value (5 octets) |
   +--------+--------------------------------------+
   Figure 2 : Encoding of the BGP_Speakers_Filter field

   There are two methods to specify the affected eBGP speakers.
   The first method is to
   explicitly list all those speakers inside the BGP_Speakers_Filter
   field of redistribution communities. In this case, the high order
   bit of the type field of the BGP_Speakers_Filter field is set to 1.
   The second method is to explicitly list only the eBGP speakers that
   will not be affected by the specified action. In this case, the
   high order bit of the BGP_Speakers_Filter type field shall be set
   to 0. The 7 low order bits of the BGP_Speakers_Filter type field
   are used to indicate the type of BGP speakers included in the five
   low order octets of the BGP_Speakers_Filter field. This document
   defines four types of BGP_Speakers_Filters (values 0x01-0x04).
   Value 0x00 is reserved and values 0x05-0x3f are to be assigned by
   IANA. Values 0x40-0x7f are vendor specific.

   BGP_Speakers_Filter types

   - The BGP_Speakers_Filter value contains a two octets AS number
     (type 0x01)

   - The BGP_Speakers_Filter value contains two two octets AS numbers
     (type 0x02)

   - The BGP_Speakers_Filter value contains a CIDR prefix/length pair
     (type 0x03)

   - The BGP_Speakers_Filter value contains a four octets AS number
     (type 0x04)

   The BGP_Speakers_Filter value shall be encoded as follows. If this
   field contains a two octet AS number, the AS number shall be placed
   in the two low order octets. The three high order octets shall be
   set to zero upon transmission and ignored upon reception.

   +---------------------------+
   |  Must be Zero (3 octets)  |
   +---------------------------+
   |   AS number (2 octets)    |
   +---------------------------+
   Figure 3 : BGP speakers filter containing a single two octets
              AS number

   If the BGP_Speakers_Filter value contains two two octets AS
   numbers, one of the AS numbers should be placed in the two low order
   octets.
   The other AS number should be placed in the next two higher
   order octets and the remaining high order octet shall be set to zero
   upon transmission and ignored upon reception.

   +---------------------------+
   |  Must be Zero (1 octet)   |
   +---------------------------+
   |  AS number A (2 octets)   |
   +---------------------------+
   |  AS number B (2 octets)   |
   +---------------------------+
   Figure 4 : BGP speakers filter containing two distinct two octets
              AS numbers

   If the BGP_Speakers_Filter value contains a four octet AS number,
   the AS number shall be placed in the four low order octets. The
   high order octet shall be set to zero upon transmission and ignored
   upon reception.

   +---------------------------+
   |  Must be Zero (1 octet)   |
   +---------------------------+
   |   AS number (4 octets)    |
   +---------------------------+
   Figure 5 : BGP speakers filter containing a single four octets
              AS number

   If the BGP_Speakers_Filter value contains a CIDR prefix/length
   pair, it should be encoded as shown below:

   +---------------------------+
   |     Length (1 octet)      |
   +---------------------------+
   |     Prefix (4 octets)     |
   +---------------------------+
   Figure 6 : BGP speakers filter containing a CIDR prefix/length pair

   The Length field indicates the length in bits of the IP address
   prefix. A length of zero indicates a prefix that matches all IP
   addresses. The Prefix field contains IP address prefixes followed
   by enough trailing bits with a value of zero to make the end of the
   field fall on a four octets boundary.

2.2 Utilization of the redistribution communities

   A router may, depending on its policy, add any number of
   redistribution communities to a route originated by itself or
   received from another BGP speaker with iBGP or eBGP.
   When a router attaches one
   or several redistribution communities to a route, it must ensure
   that no two of the included redistribution communities conflict.
   This is necessary to ensure that the redistribution communities
   will be processed in a deterministic manner by the remote
   peer. When several redistribution communities contain the same
   action type and parameter, then all the BGP speakers filters of
   those communities must have the same high order bit in the
   BGP_Speakers_Filter type. A BGP router that receives a route
   containing invalid redistribution communities for a given action
   type and parameter should ignore all the redistribution communities
   concerning this action type and parameter.

   In practice, it can be expected that only the originator of the
   route will attach the redistribution communities as this is an
   attempt of the route originator to do some form of inter-domain
   traffic engineering. In practice, it can also be expected that
   most utilizations of the redistribution communities will only
   require attaching a small number of those communities to a given
   route.

2.3 Operations

   The redistribution communities defined in this document only affect
   the redistribution of the associated route to eBGP peers. The
   redistribution communities do not affect the redistribution of
   routes via iBGP or between the sub-ASs of a confederation.

   When a router receives a route with redistribution communities, it
   should apply the operations specified by these communities when
   redistributing the route to eBGP peers. Since the redistribution
   communities defined by this document are non-transitive, a router
   will remove the received redistribution communities when
   redistributing the route to eBGP peers. Of course, nothing prevents
   this router from adding its own redistribution communities to this
   route before redistributing it.
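
   The two receive-side rules stated so far, namely ignoring
   conflicting communities (section 2.2) and removing the
   non-transitive communities before redistribution to eBGP peers,
   can be sketched as follows. This is a minimal sketch and not a
   normative procedure: the function names are hypothetical, each
   community is assumed to be an 8-octet byte string laid out as in
   section 2.1, and the input to drop_conflicting is assumed to
   contain redistribution communities only.

```python
from collections import defaultdict


def drop_conflicting(communities):
    """Discard each (action type, parameter) group with mixed filters.

    Octet 1 is the Action octet; octet 2 is the BGP_Speakers_Filter
    type, whose high order bit selects the include/exclude method.
    """
    groups = defaultdict(list)
    for raw in communities:
        # Low 6 bits of the Action octet: Bit5-3 (type), Bit2-0 (param).
        groups[raw[1] & 0x3F].append(raw)

    valid = []
    for members in groups.values():
        modes = {raw[2] & 0x80 for raw in members}
        if len(modes) == 1:  # same high order bit in every filter
            valid.extend(members)
    return valid


def strip_non_transitive(ext_communities):
    """Keep only communities whose type octet has bit6 cleared."""
    return [raw for raw in ext_communities if not raw[0] & 0x40]
```

   Grouping on the low six bits of the Action octet keeps the check
   per action type and parameter, so a conflict among "Prepend 1"
   communities does not invalidate a "Do not announce" community
   attached to the same route.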

   A router should apply the policies defined by the redistribution
   communities to the routes that it has selected for advertisement
   from its Adj-RIB-OUT based on its own policy. A route that
   contains redistribution communities should be processed as follows.

   First, the BGP speaker should build for each action type and
   parameter contained in the redistribution communities attached to
   the route a list of the target BGP speakers contained in the
   BGP_Speakers_Filters for this action type. In the remainder of this
   section, we will use the wording "a BGP speaker P is affected by
   action type x with a given parameter" to indicate that either of
   the following is true:

   - P appears inside one of the BGP_Speakers_Filter of the
     redistribution communities with action x and the high order bit of
     the BGP_Speakers_Filter type is set to one

   - P does not appear inside any of the BGP_Speakers_Filter of the
     redistribution communities with action x and the high order bit of
     the BGP_Speakers_Filter type is set to zero

   Then, when a route is about to be redistributed to peer P, the
   router first checks if this peer is affected by action type "Do not
   announce". If this is the case, the route is not announced to this
   peer. Otherwise, the router checks the other action types as
   follows.

   - If peer P is affected by action type "No export" then the
     well-known community NO_EXPORT is attached to the route.

   - If peer P is affected by one or more actions of type "Prepend",
     then the AS-Path of the route shall be prepended n times where n
     is the smallest parameter of the matched "Prepend" actions.

   Then the route is announced to peer P.

3 IANA considerations

   This document requests the assignment of a new BGP extended
   communities type field from IANA.
   In addition, this document proposes
   that IANA maintains the action types and the BGP speakers filter
   types values defined in section 2.

4 Security considerations

   This extension to BGP does not change the underlying security
   issues of the extended community attribute.

5 Conclusion

   This document has proposed a new type of extended communities
   called the redistribution communities. These redistribution
   communities can be used by a BGP router to influence the
   redistribution of some of its routes by its peers. Three types of
   redistribution communities have been proposed. The first type may be
   used to indicate that a specific route should not be announced to
   the specified set of eBGP speakers. The second type may be used to
   indicate that the attached route should only be announced with the
   NO_EXPORT community to the specified set of eBGP speakers and the
   third type may be used to indicate that the attached route should
   be prepended n times when announced to the specified set of eBGP
   speakers.

Acknowledgements

   This work was partially funded by the European Commission, within
   the ATRIUM IST project. We would like to thank Bart Peirens and
   Alvaro Retana for their comments.

References

   [Hal97] B. Halabi. Internet Routing Architectures. Cisco Press,
   1997.

   [Hus01] G. Huston. AS1221 BGP table statistics. Available from
   http://www.telstra.net/ops/bgp/, 2001.

   [QuB02] B. Quoitin, O. Bonaventure. A survey of the utilization
   of the BGP community attribute. Internet draft,
   draft-quoitin-bgp-comm-survey-00.txt, work in progress, February
   2002.

   [Quo02] B. Quoitin. An implementation of the BGP redistribution
   communities in zebra. Technical report Infonet-TR-2002-03, February
   2002, to appear.

   [RTR01] S. Sangli, D. Tappan, and Y. Rekhter. BGP extended
   communities attribute.
   Internet draft, draft-ietf-idr-bgp-ext-communities-01.txt,
   work in progress, August 2001.

Authors' Addresses

   Olivier Bonaventure, Bruno Quoitin
   Infonet group (FUNDP)
   Rue Grandgagnage 21, B-5000 Namur, Belgium
   Email: Olivier.Bonaventure@info.fundp.ac.be,
          Bruno.Quoitin@info.fundp.ac.be
   URL  : http://www.infonet.fundp.ac.be

   Stefaan De Cnodder
   Alcatel
   Francis Wellesplein 1
   B-2018 Antwerp, Belgium
   Email: stefaan.de_cnodder@alcatel.be

   Jeffrey Haas
   NextHop Technologies
   Email: jhaas@nexthop.com

   Russ White
   Cisco Systems
   Email: ruwhite@cisco.com

Appendix 1 Simple example

   The redistribution communities defined in this document can be
   used in two different ways. A first possible solution would be
   to rely on the existing support for the extended communities in
   BGP implementations and to manually configure the redistribution
   communities defined in this document. This solution could be
   used today by ISPs to support the redistribution communities (or a
   subset of those communities) defined in this document instead of
   defining special community values in their community space and
   advertising them in the routing registries.

   To illustrate a possible configuration with an existing BGP
   implementation supporting the extended communities, we use a syntax
   similar to the syntax used by zebra. Let us assume that one router
   from AS3 has two peerings: one peering with AS2 and one peering with
   AS1. The configuration below shows how AS3's router could be
   configured to support the redistribution communities defined in this
   document. In the configuration in figure A, we show each extended
   community in hex format for readability reasons and only consider a
   subset of the redistribution communities.
   Figure A shows how AS3 would configure its
   routers to accept requests that a route announced to AS1 be
   prepended n times before being announced and that a specific
   route not be announced to AS2.

   router bgp 3
    neighbor 172.17.1.1 remote-as 1
    neighbor 172.17.1.1 route-map prepend_as1 out
    neighbor 172.17.1.2 remote-as 2
    neighbor 172.17.1.2 route-map do_not_announce_as2 out
   ! Extended community list
   ! --------------------------
   ! action "prepend x times"
   ! filter "include AS1"
   !
   ip extcommunity-list 1 permit 0x4401810000000001
   ip extcommunity-list 2 permit 0x4402810000000001
   ip extcommunity-list 3 permit 0x4403810000000001
   ip extcommunity-list 4 permit 0x4404810000000001
   !
   ! Route-maps
   ! --------------------------
   ! action "prepend x times"
   ! filter "include AS1"
   !
   route-map prepend_as1 permit 10
    match extcommunity 1
    set as-path prepend 1
   !
   route-map prepend_as1 permit 20
    match extcommunity 2
    set as-path prepend 2
   !
   route-map prepend_as1 permit 30
    match extcommunity 3
    set as-path prepend 3
   !
   route-map prepend_as1 permit 40
    match extcommunity 4
    set as-path prepend 4
   !
   ! Extended community list
   ! --------------------------
   ! action "do not announce"
   ! filter "include AS2"
   !
   ip extcommunity-list 5 permit 0x4410810000000002
   !
   route-map do_not_announce_as2 deny 10
    match extcommunity 5
   !
   Figure A : Sample configuration

   For a router with a small number of peers, such a manual
   configuration of the redistribution communities is possible.
   However, if the router has many peers, the required configuration
   file may become very large, especially if one wants to fully support
   all the redistribution communities defined in this document.
   In this case, a better
   solution is to rely on direct support for the redistribution
   communities inside the BGP implementation itself as discussed in
   [Quo02]. With a BGP implementation directly supporting the
   redistribution communities, a few lines of configuration will be
   sufficient to enable or disable some or all of the redistribution
   communities for each peer.
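
   As a rough illustration of what such direct support could look
   like, the per-peer procedure of section 2.3 can be sketched as
   follows. This is not the implementation described in [Quo02]; it is
   a minimal sketch under stated assumptions: all names are
   hypothetical, communities are given as 64-bit integers in the hex
   notation of figure A, the type octet is not checked, and only the
   two octets AS number filter (type 0x01) is matched.

```python
from collections import defaultdict

PREPEND, NO_EXPORT, DO_NOT_ANNOUNCE = 0b000, 0b001, 0b010
FILTER_TWO_OCTET_AS = 0x01  # the only filter type handled in this sketch


def decode(community: int):
    """Split a 64-bit redistribution community into its fields."""
    raw = community.to_bytes(8, "big")
    action_type = (raw[1] >> 3) & 0b111      # Bit5-3 of the Action octet
    parameter = raw[1] & 0b111               # Bit2-0 of the Action octet
    include = bool(raw[2] & 0x80)            # high order bit of filter type
    filter_type = raw[2] & 0x7F
    value = int.from_bytes(raw[3:8], "big")  # 5-octet filter value
    return action_type, parameter, include, filter_type, value


def matches(filter_type: int, value: int, peer_as: int) -> bool:
    """True if the filter lists the peer (two octets AS filters only)."""
    return filter_type == FILTER_TWO_OCTET_AS and (value & 0xFFFF) == peer_as


def export_decision(communities, peer_as):
    """Return (announce, add_no_export, prepend_count) for one eBGP peer."""
    groups = defaultdict(list)
    for c in communities:
        action_type, parameter, include, ftype, value = decode(c)
        groups[(action_type, parameter)].append((include, ftype, value))

    affected = set()
    for key, filters in groups.items():
        include_bits = {inc for inc, _, _ in filters}
        if len(include_bits) != 1:
            continue  # conflicting communities: ignore the group (2.2)
        listed = any(matches(t, v, peer_as) for _, t, v in filters)
        # Include list: affected if listed. Exclude list: if not listed.
        if listed == include_bits.pop():
            affected.add(key)

    if any(a == DO_NOT_ANNOUNCE for a, _ in affected):
        return False, False, 0
    no_export = any(a == NO_EXPORT for a, _ in affected)
    prepends = [p for a, p in affected if a == PREPEND]
    # The smallest parameter among the matched Prepend actions wins.
    return True, no_export, min(prepends, default=0)
```

   With the communities 0x4402810000000001 and 0x4410810000000002
   attached to a route, this sketch announces the route to AS1 with
   two prepends and does not announce it to AS2, which mirrors the
   intent of the route-maps in figure A.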