Network Working Group                                  Pradosh Mohapatra
Internet Draft                                              Rex Fernando
Intended Status: Proposed Standard                         Eric C. Rosen
Expires: June 12, 2012                               Cisco Systems, Inc.

                                                            James Uttaro
                                                                    AT&T

                                                       December 12, 2011


              The Accumulated IGP Metric Attribute for BGP

                       draft-ietf-idr-aigp-07.txt


Abstract

   Routing protocols that have been designed to run within a single
   administrative domain ("IGPs") generally do so by assigning a metric
   to each link, and then choosing as the installed path between two
   nodes the path for which the total distance (sum of the metric of
   each link along the path) is minimized. BGP, designed to provide
   routing over a large number of independent administrative domains
   ("autonomous systems"), does not make its path selection decisions
   through the use of a metric. It is generally recognized that any
   attempt to do so would incur significant scalability problems, as
   well as inter-administration coordination problems. However, there
   are deployments in which a single administration runs several
   contiguous BGP networks. In such cases, it can be desirable, within
   that single administrative domain, for BGP to select paths based on a
   metric, just as an IGP would do. The purpose of this document is to
   provide a specification for doing so.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright and License Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

    1          Specification of requirements
    2          Introduction
    3          AIGP Attribute
    3.1        Applicability Restrictions and Cautions
    3.2        Restrictions on Sending/Receiving
    3.3        Creating and Modifying the AIGP Attribute
    3.3.1      Originating the AIGP Attribute
    3.3.2      Modifications by the Originator
    3.3.3      Modifications by a Non-Originator
    4          Decision Process
    4.1        When a Route has an AIGP Attribute
    4.2        When the Route to the Next Hop has an AIGP attribute
    5          Deployment Considerations
    6          IANA Considerations
    7          Security Considerations
    8          Acknowledgments
    9          Authors' Addresses
   10          Normative References
   11          Informative References

1. Specification of requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Introduction

   There are many routing protocols that have been designed to run
   within a single administrative domain. These are known collectively
   as "Interior Gateway Protocols" (IGPs). Typically, each link is
   assigned a particular "metric" value. The path between two nodes can
   then be assigned a "distance", which is the sum of the metrics of all
   the links that belong to that path. An IGP selects the "shortest"
   (minimal distance) path between any two nodes, perhaps subject to the
   constraint that if the IGP provides multiple "areas", it may prefer
   the shortest path within an area to a path that traverses more than
   one area. Typically the administration of the network has some
   routing policy which can be approximated by selecting shortest paths
   in this way.

   BGP, as distinguished from the IGPs, was designed to run over an
   arbitrarily large number of administrative domains ("autonomous
   systems", or "ASes") with limited coordination among the various
   administrations. BGP does not make its path selection decisions
   based on a metric; there is no such thing as an "inter-AS metric".
   There are two fundamental reasons for this:

     - The distance between two nodes in a common administrative domain
       may change at any time due to events occurring in that domain.
       These changes are not propagated around the Internet unless they
       actually cause the border routers of the domain to select routes
       with different BGP attributes for some set of address prefixes.
       This accords with a fundamental principle of scaling, viz., that
       changes with only local significance must not have global
       effects.
       If local changes in distance were always propagated
       around the Internet, this principle would be violated.

     - A basic principle of inter-domain routing is that the different
       administrative domains may have their own policies, which do not
       have to be revealed to other domains, and which certainly do not
       have to be agreed to by other domains. Yet the use of an inter-AS
       metric in the Internet would have exactly these effects.

   There are, however, deployments in which a single administration runs
   a network which has been sub-divided into multiple, contiguous ASes,
   each running BGP. There are several reasons why a single
   administrative domain may be broken into several ASes (which, in this
   case, are not really "autonomous"). It may be that the existing IGPs
   do not scale well in the particular environment; it may be that a
   more generalized topology is desired than could be obtained by use of
   a single IGP domain; it may be that a more finely grained routing
   policy is desired than can be supported by an IGP. In such
   deployments, it can be useful to allow BGP to make its routing
   decisions based on the IGP metric, so that BGP chooses the "shortest"
   path between two nodes, even if the nodes are in two different ASes
   within that same administrative domain. We will refer to the set of
   ASes in a common administrative domain as an "AIGP Administrative
   Domain".

   There are in fact some implementations that already do something like
   this, using BGP's MULTI_EXIT_DISC (MED) attribute to carry a value
   based on IGP metrics. However, that does not really provide IGP-like
   "shortest path" routing, as the BGP decision process gives priority
   to other factors, such as the AS_PATH length. Also, the standard
   procedures for use of the MED do not ensure that the IGP metric is
   properly accumulated so that it covers all the links along the path.

   In this document, we define a new optional, non-transitive BGP
   attribute, called the "Accumulated IGP Metric Attribute", or "AIGP
   attribute", and specify the procedures for using it.

   The specified procedures prevent the AIGP attribute from "leaking
   out" past an AIGP administrative domain boundary into the Internet.

   The specified procedures also ensure that the value in the AIGP
   attribute has been accumulated all along the path from the
   destination, i.e., that the AIGP attribute does not appear when there
   are "gaps" along the path where the IGP metric is unknown.

3. AIGP Attribute

   The AIGP Attribute is an optional non-transitive BGP Path Attribute.
   The attribute type code for the AIGP Attribute is 26.

   The value field of the AIGP Attribute is defined here to be a set of
   elements encoded as "Type/Length/Value" (i.e., a set of "TLVs").
   Each such TLV is encoded as shown in Figure 1.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     Type      |         Length                |               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
    ~                                                               ~
    |                           Value                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+..........................

                                AIGP TLV
                                Figure 1

     - Type: A single octet encoding the TLV Type. Only type 1, "AIGP
       TLV", is defined in this document.

     - Length: Two octets encoding the length in octets of the TLV,
       including the type and length fields. The length is encoded as an
       unsigned binary integer. (Note that the minimum length is 3,
       indicating that no value field is present.)

     - Value: A field containing zero or more octets.

   This document defines only a single such TLV, the "AIGP TLV". The
   AIGP TLV is encoded as follows:

     - Type: 1

     - Length: 11

     - Accumulated IGP Metric.

       The value field of the AIGP TLV is always 8 octets long.
       IGP metrics are frequently expressed as 4-octet values, and this
       ensures that the AIGP attribute can be used to hold the sum of an
       arbitrary number of 4-octet values.

3.1. Applicability Restrictions and Cautions

   This document only considers the use of the AIGP attribute in
   networks where each router uses tunneling of some sort to deliver a
   packet to its BGP next hop. Use of the AIGP attribute in networks
   that do not use tunneling is outside the scope of this document.

   If a Route Reflector supports the AIGP attribute, but some of its
   clients do not, then the routing choices that result may not all
   reflect the intended routing policy.

3.2. Restrictions on Sending/Receiving

   An implementation that supports the AIGP attribute MUST support a
   per-session configuration item, AIGP_SESSION, that indicates whether
   the attribute is enabled or disabled for use on that session.

     - The default value of AIGP_SESSION, for EBGP sessions, MUST be
       "disabled".

     - The default value of AIGP_SESSION, for IBGP and
       confederation-EBGP sessions, MUST be "enabled".

   The AIGP attribute MUST NOT be sent on any BGP session for which
   AIGP_SESSION is disabled.

   If an AIGP attribute is received on a BGP session for which
   AIGP_SESSION is disabled, the attribute MUST be treated exactly as if
   it were an unrecognized non-transitive attribute. That is, "it MUST
   be quietly ignored and not passed along to other BGP peers" (see
   [BGP], section 5).

3.3. Creating and Modifying the AIGP Attribute

3.3.1. Originating the AIGP Attribute

   An implementation that supports the AIGP attribute MUST support a
   configuration item, AIGP_ORIGINATE, that enables or disables its
   creation and attachment to routes. The default value of
   AIGP_ORIGINATE MUST be "disabled".
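
   As an informal illustration only, the AIGP TLV layout of section 3
   and the AIGP_SESSION defaults of section 3.2 might be sketched in
   Python as follows. The names SessionKind, aigp_session_enabled,
   encode_aigp_tlv, decode_aigp_tlv, aigp_to_send, and aigp_on_receive,
   and the "configured" override parameter, are assumptions made for
   this sketch and are not defined by this document. The sketch covers
   only the TLV inside the attribute's value field, not the BGP path
   attribute header.

```python
import struct
from enum import Enum
from typing import Optional

AIGP_TLV_TYPE = 1      # "AIGP TLV", the only TLV type defined in section 3
AIGP_TLV_LENGTH = 11   # 1 (type) + 2 (length) + 8 (value) octets


class SessionKind(Enum):
    EBGP = "ebgp"
    IBGP = "ibgp"
    CONFED_EBGP = "confederation-ebgp"


def aigp_session_enabled(kind: SessionKind,
                         configured: Optional[bool] = None) -> bool:
    """Effective AIGP_SESSION value for one session.

    `configured` is a hypothetical per-session override; None selects the
    section 3.2 default (disabled for EBGP, enabled otherwise).
    """
    if configured is not None:
        return configured
    return kind is not SessionKind.EBGP


def encode_aigp_tlv(metric: int) -> bytes:
    """Encode type 1, length 11, and an 8-octet metric in network order."""
    if not 0 <= metric < 2 ** 64:
        raise ValueError("metric must fit in 8 octets")
    return struct.pack("!BHQ", AIGP_TLV_TYPE, AIGP_TLV_LENGTH, metric)


def decode_aigp_tlv(data: bytes) -> int:
    """Decode a single AIGP TLV and return the accumulated metric."""
    if len(data) != AIGP_TLV_LENGTH:
        raise ValueError("an AIGP TLV is exactly 11 octets long")
    tlv_type, length, metric = struct.unpack("!BHQ", data)
    if tlv_type != AIGP_TLV_TYPE or length != AIGP_TLV_LENGTH:
        raise ValueError("not a well-formed AIGP TLV")
    return metric


def aigp_to_send(kind: SessionKind, metric: int,
                 configured: Optional[bool] = None) -> Optional[bytes]:
    """TLV to send, or None because AIGP_SESSION is disabled."""
    if not aigp_session_enabled(kind, configured):
        return None
    return encode_aigp_tlv(metric)


def aigp_on_receive(kind: SessionKind, tlv: bytes,
                    configured: Optional[bool] = None) -> Optional[int]:
    """Received metric, or None when the attribute is quietly ignored."""
    if not aigp_session_enabled(kind, configured):
        return None
    return decode_aigp_tlv(tlv)
```

   With these assumptions, aigp_to_send(SessionKind.EBGP, 10) yields
   nothing to send unless the operator has explicitly enabled
   AIGP_SESSION on that session.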

   A BGP speaker MUST NOT add the AIGP attribute to any route whose path
   leads outside the "AIGP administrative domain" to which the BGP
   speaker belongs. It may be added only to routes that satisfy one of
   the following conditions:

     - The route is a static route that is being redistributed into BGP.

     - The route is an IGP route that is being redistributed into BGP.

     - The route is an IBGP-learned route whose AS_PATH attribute is
       empty.

     - The route is an EBGP-learned route whose AS_PATH contains only
       ASes that are in the same AIGP Administrative Domain as the BGP
       speaker.

   A BGP speaker MUST NOT add the AIGP attribute to any route for which
   it has not set itself as the next hop.

   It SHOULD be possible to set AIGP_ORIGINATE to "enabled for the
   routes of a particular IGP that are redistributed into BGP" (where "a
   particular IGP" might be "OSPF" or "ISIS"). Other policies
   determining when and whether to originate an AIGP attribute are also
   possible, depending on the needs of a particular deployment scenario.

   When originating an AIGP attribute for a BGP route to address prefix
   P, the value of the attribute is set according to policy. There are
   a number of useful policies, some of which are in the following list:

     - When a BGP speaker R is redistributing into BGP an IGP route to
       address prefix P, the IGP will have computed a "distance" from R
       to P. This distance MAY be assigned as the value of the AIGP
       attribute.

     - A BGP speaker R may be redistributing into BGP a static route to
       address prefix P, for which a "distance" from R to P has been
       configured. This distance MAY be assigned as the value of the
       AIGP attribute.

     - A BGP speaker R may have received and installed a BGP-learned
       route to prefix P, with next hop N. Or it may be redistributing
       a static route to P, with next hop N.
       The "distance" from R to N
       MAY be assigned as the value of the AIGP attribute of the route
       to P.

          * If R has an IGP route to N, the IGP-computed distance from R
            to N MAY be used.

          * If R has a BGP route to N, and an AIGP attribute value has
            been computed for that route (see section 3.3.3), that value
            MAY be used as the AIGP attribute value of the route to P.

3.3.2. Modifications by the Originator

   If BGP speaker R is the originator of the AIGP attribute of prefix P,
   and at some point the "distance" from R to P changes, R SHOULD issue
   a new BGP update containing the new value of the AIGP attribute.
   (Here we use the term "distance" to refer to whatever value the
   originator assigns to the AIGP attribute, however it is computed; see
   section 3.3.1.) However, if the difference between the new distance
   and the distance advertised in the AIGP attribute is less than a
   configurable threshold, the update MAY be suppressed.

3.3.3. Modifications by a Non-Originator

   Suppose a BGP speaker R1 receives a route with an AIGP attribute
   whose value is A, and a Next Hop whose value is R2. Suppose also
   that R1 is about to redistribute that route on a BGP session that is
   enabled for sending/receiving the attribute.

   If R1 does not change the Next Hop of the route, then R1 MUST NOT
   change the AIGP attribute value of the route.

   If R1 changes the Next Hop of the route from R2 to R1, and if R1's
   route to R2 is an IGP-learned route, or a static route that does not
   require recursive next hop resolution, then R1 MUST increase the
   value of the AIGP attribute by adding to A the distance from R1 to
   R2. This distance is either the IGP-computed distance from R1 to R2,
   or some value determined by policy. However, A MUST be increased by
   a non-zero amount.
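
   The two rules above (no change when the Next Hop is unchanged, and a
   non-zero increase when R1 sets itself as Next Hop over a
   non-recursive route) can be sketched as follows. This is a minimal
   illustration; the function name and its parameters are assumptions
   of the sketch, and "distance" stands for whichever IGP-computed or
   policy-determined value R1 uses for its route to R2.

```python
def aigp_after_redistribution(a: int, next_hop_changed: bool,
                              distance: int) -> int:
    """AIGP value R1 advertises for a route received with AIGP value `a`.

    Covers only the non-recursive case: R1's route to R2 is IGP-learned
    or a static route that needs no recursive next hop resolution.
    """
    if not next_hop_changed:
        # Next Hop left as R2: the AIGP value must stay as received.
        return a
    if distance <= 0:
        # The increase has to be non-zero, so reject a zero distance.
        raise ValueError("distance from R1 to R2 must be non-zero")
    return a + distance
```

   For example, a route received with A = 100 over a path whose distance
   from R1 to R2 is 15 would be advertised with the value 115 once R1
   sets itself as Next Hop.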

   Note that if R1 and R2 above are EBGP neighbors, and there is a
   direct link between them on which no IGP is running, then when R1
   changes the next hop of a route from R2 to R1, the AIGP metric value
   MUST be increased by a non-zero amount. The amount of the increase
   SHOULD be such that it is properly comparable to the IGP metrics.
   E.g., if the IGP metric is a function of latency, then the amount of
   the increase should be a function of the latency from R1 to R2.

   If R1 changes the Next Hop of the route from R2 to R1, and if R1's
   route to R2 is a BGP-learned route, or a static route that requires
   recursive next hop resolution, then the AIGP attribute value needs to
   be increased in several steps, according to the following procedure.
   (Note that this procedure is ONLY used when recursive next hop
   resolution is needed.)

     1. Let Xattr be the new AIGP attribute value.

     2. Initialize Xattr to A.

     3. Let XNH be the next hop currently being resolved. Set XNH to
        R2.

     4. Find the route to XNH.

     5. If the route to XNH does not require recursive next hop
        resolution, get the distance D from R1 to XNH. (Note that this
        condition cannot be satisfied the first time through this
        procedure.) If D is above a configurable threshold, set the
        AIGP attribute value to Xattr+D. If D is below a configurable
        threshold, set the AIGP attribute value to Xattr. In either
        case, exit this procedure.

     6. If the route to XNH is a BGP-learned route, and the route does
        NOT have an AIGP attribute, then exit this procedure and do not
        pass on any AIGP attribute.

     7. If the route to XNH is a BGP-learned route, and the route has
        an AIGP attribute value of Y, then set Xattr=Xattr+Y, and set
        XNH to the next hop of this route. (The intention here is that
        Y is the AIGP value of the route as it was received by R1,
        without having been modified by R1.)

     8. Go to step 4.
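
   The procedure above can be sketched in Python as shown below. This
   is an illustration under stated assumptions, not a definitive
   implementation. The Route class, the "rib" dictionary keyed by next
   hop, and the "max_depth" guard are hypothetical. The sketch models
   only BGP-learned routes as requiring recursive resolution (recursive
   static routes are not modeled), treats a missing route like a gap,
   and treats a distance D equal to the threshold as not above it,
   a case the text leaves open.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Route:
    """R1's installed route to some next hop (hypothetical model)."""
    next_hop: str
    bgp_learned: bool            # True: recursive resolution is needed
    aigp: Optional[int] = None   # AIGP value as received by R1, if any
    distance: int = 0            # IGP or configured distance from R1


def recursive_aigp(a: int, r2: str, rib: Dict[str, Route],
                   threshold: int = 0,
                   max_depth: int = 32) -> Optional[int]:
    """Return the new AIGP value, or None if no AIGP is to be passed on."""
    xattr = a                        # steps 1 and 2
    xnh = r2                         # step 3
    for _ in range(max_depth):
        route = rib.get(xnh)         # step 4
        if route is None:
            return None              # assumption: unresolvable next hop
        if not route.bgp_learned:    # step 5
            d = route.distance
            return xattr + d if d > threshold else xattr
        if route.aigp is None:       # step 6
            return None
        xattr += route.aigp          # step 7
        xnh = route.next_hop
        # step 8: continue with step 4 for the new XNH
    raise RuntimeError("next hop resolution did not terminate")
```

   As a usage example, suppose A is 100, R1's route to R2 is BGP-learned
   with a received AIGP value of 20 and next hop R3, and R1 reaches R3
   through the IGP at distance 5. With a threshold of 0 the sketch
   returns 100 + 20 + 5 = 125.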

   The AIGP value of a given route depends on (a) the AIGP values of all
   the next hops that are recursively resolved during this procedure,
   and (b) the IGP distance to any next hop that is not recursively
   resolved. Any change due to (a) in any of these values MUST trigger
   a new AIGP computation for that route. Whether a change due to (b)
   triggers a new AIGP computation depends upon whether the change in
   IGP distance exceeds a configurable threshold.

   If the AIGP attribute is carried across several ASes, each with its
   own IGP domain, it is clear that these procedures are unlikely to
   give a sensible result if the IGPs are different (e.g., some OSPF and
   some IS-IS), or if the meaning of the metrics is different in the
   different IGPs (e.g., if the metric represents bandwidth in some IGP
   domains but represents latency in others). These procedures also are
   unlikely to give a sensible result if the metric assigned to inter-AS
   BGP links (on which no IGP is running) or to static routes is not
   comparable to the IGP metrics. All such cases are outside the scope
   of the current document.

4. Decision Process

   Support for the AIGP attribute involves several modifications to the
   tie breaking procedures of the BGP "phase 2" decision described in
   [BGP], section 9.1.2.2. These modifications are described below in
   sections 4.1 and 4.2.

   In some cases, the BGP decision process may install a route without
   executing any tie breaking procedures. This may happen, e.g., if
   only one route to a given prefix has the highest degree of preference
   (as defined in [BGP] section 9.1.1). In this case, the AIGP
   attribute is not considered.

   In other cases, some routes may be eliminated before the tie breaking
   procedures are invoked, e.g., routes with AS_PATH attributes
   indicating a loop, or routes with unresolvable next hops.
   In these cases, the AIGP attributes of the eliminated routes are not
   considered.

4.1. When a Route has an AIGP Attribute

   Assuming that the BGP decision process invokes the tie breaking
   procedures, the procedures in this section MUST be executed BEFORE
   any of the tie breaking procedures described in [BGP] section 9.1.2.2
   are executed.

   If any routes have an AIGP attribute, remove from consideration all
   routes that do not have an AIGP attribute.

   If router R is considering route T, where T has an AIGP attribute,

     - then R must compute the value A, defined as follows: set A to the
       sum of (a) T's AIGP attribute value and (b) the IGP distance from
       R to T's next hop.

     - remove from consideration all routes that are not tied for the
       lowest value of A.

4.2. When the Route to the Next Hop has an AIGP attribute

   Suppose that a given router R1 is comparing two BGP-learned routes,
   such that either:

     - the two routes have equal AIGP attribute values, or else

     - neither of the two routes has an AIGP attribute.

   The BGP decision process as specified in [BGP] makes use, in its tie
   breaker procedures, of "interior cost", defined as follows:

      "interior cost of a route is determined by calculating the
      metric to the NEXT_HOP for the route using the Routing
      Table."

   Suppose route T has a next hop of N. We modify the notion of the
   "interior cost" from node R1 to node N as follows:

     - Let R2 be the next hop of the route to N, after all recursive
       resolution of the next hop is done. Let m be the IGP distance
       (or in the case of a static route, the configured distance) from
       R1 to R2.

     - If the installed route to N has an AIGP attribute, set A to the
       AIGP value of the route to N, computing the AIGP value of the
       route according to the procedure of section 3.3.3.

     - If the installed route to N does not have an AIGP value, set A to
       0.

     - The "interior cost" of route T is the quantity A+m.

5. Deployment Considerations

   Using the AIGP attribute to achieve a desired routing policy will be
   more effective if each BGP speaker can use it to choose from among
   multiple routes. Thus it is highly recommended that the procedures of
   [BESTEXT] and [ADDPATH] be used in conjunction with the AIGP
   Attribute.

   If a Route Reflector does not pass all paths to its clients, then it
   will tend to pass the paths for which the IGP distance from the Route
   Reflector itself to the next hop is smallest. This may result in a
   non-optimal choice by the clients.

6. IANA Considerations

   IANA has assigned the codepoint 26 in the "BGP Path Attributes"
   registry to the AIGP attribute.

   IANA shall create a registry for "BGP AIGP Attribute Types". The
   type field consists of a single octet, with possible values from 0 to
   255. The allocation policy for this field is to be "Standards Action
   with Early Allocation". Type 1 should be defined as "AIGP", and
   should refer to this document.

7. Security Considerations

   The spurious introduction, through error or malfeasance, of an AIGP
   attribute could result in the selection of paths other than those
   desired.

   Improper configuration on both ends of an EBGP connection could
   result in an AIGP attribute being passed from one service provider to
   another. This would likely result in an unsound selection of paths.

8. Acknowledgments

   The authors would like to thank Waqas Alam, Rajiv Asati, Clarence
   Filsfils, Robert Raszuk, Yakov Rekhter, Samir Saad, John Scudder, and
   Shyam Sethuram for their input.

9. Authors' Addresses

   Rex Fernando
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA 95134
   Email: rex@cisco.com

   Pradosh Mohapatra
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA 95134
   Email: pmohapat@cisco.com

   Eric C. Rosen
   Cisco Systems, Inc.
   1414 Massachusetts Avenue
   Boxborough, MA 01719
   Email: erosen@cisco.com

   James Uttaro
   AT&T
   200 S. Laurel Avenue
   Middletown, NJ 07748
   Email: uttaro@att.com

10. Normative References

   [BGP] "A Border Gateway Protocol 4 (BGP-4)", Y. Rekhter, T. Li, S.
   Hares, RFC 4271, January 2006.

11. Informative References

   [ADDPATH] "Fast Connectivity Restoration Using BGP Add-Path", P.
   Mohapatra, R. Fernando, C. Filsfils, R. Raszuk,
   draft-pmohapat-idr-fast-conn-restore-02.txt, October 2011.

   [BESTEXT] "Advertisement of the Best External Route in BGP", P.
   Marques, R. Fernando, E. Chen, P. Mohapatra, H. Gredler,
   draft-ietf-idr-best-external-04.txt, April 2011.

   [RFC2119] "Key words for use in RFCs to Indicate Requirement
   Levels", S. Bradner, BCP 14, RFC 2119, March 1997.
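
   For readers who find code easier to follow than prose, the
   tie-breaking step of section 4.1 can be sketched in Python as
   follows. This is an informal illustration only. The Candidate class
   and the "igp_distance" mapping (IGP distance from router R to each
   next hop) are assumptions of the sketch, and the routes passed in
   are assumed to have already survived the earlier phases of the
   decision process.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Candidate:
    """A route still under consideration at router R (hypothetical model)."""
    name: str
    next_hop: str
    aigp: Optional[int] = None   # None: the route has no AIGP attribute


def aigp_tie_break(routes: List[Candidate],
                   igp_distance: Dict[str, int]) -> List[Candidate]:
    """Apply the section 4.1 step and return the surviving routes."""
    with_aigp = [r for r in routes if r.aigp is not None]
    if not with_aigp:
        # No route carries AIGP, so this step removes nothing.
        return list(routes)

    def a_value(route: Candidate) -> int:
        # A = AIGP attribute value + IGP distance from R to the next hop.
        return route.aigp + igp_distance[route.next_hop]

    lowest = min(a_value(r) for r in with_aigp)
    # Keep only the routes tied for the lowest value of A.
    return [r for r in with_aigp if a_value(r) == lowest]
```

   The surviving routes would then be handed to the remaining tie
   breaking procedures of [BGP] section 9.1.2.2, which this sketch does
   not model.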