Network Working Group                                  Pradosh Mohapatra
Internet Draft                                       Cisco Systems, Inc.
Intended Status: Proposed Standard
Expires: August 9, 2009                                     Rex Fernando
                                                  Juniper Networks, Inc.

                                                           Eric C. Rosen
                                                     Cisco Systems, Inc.

                                                            James Uttaro
                                                                    AT&T

                                                        February 9, 2009


             The Accumulated IGP Metric Attribute for BGP

                      draft-rosen-idr-aigp-00.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright and License Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Abstract

   Routing protocols that have been designed to run within a single
   administrative domain ("IGPs") generally do so by assigning a metric
   to each link, and then choosing as the installed path between two
   nodes the path for which the total distance (sum of the metric of
   each link along the path) is minimized.  BGP, designed to provide
   routing over a large number of independent administrative domains
   ("autonomous systems"), does not make its path selection decisions
   through the use of a metric.
   It is generally recognized that any
   attempt to do so would incur significant scalability problems, as
   well as inter-administration coordination problems.  However, there
   are deployments in which a single administration runs several
   contiguous BGP networks.  In such cases, it can be desirable, within
   that single administrative domain, for BGP to select paths based on a
   metric, just as an IGP would do.  The purpose of this document is to
   provide a specification for doing so.

Table of Contents

    1          Specification of requirements
    2          Introduction
    3          AIGP Attribute
    3.1        Applicability Restrictions and Cautions
    3.2        Restrictions on Sending/Receiving
    3.3        Creating and Modifying the AIGP Attribute
    3.3.1      Originating the AIGP Attribute
    3.3.2      Modifications by the Originator
    3.3.3      Modifications by a Non-Originator
    4          Decision Process
    4.1        When a Route has an AIGP Attribute
    4.2        When the Route to the Next Hop has an AIGP attribute
    5          Deployment Considerations
    6          IANA Considerations
    7          Security Considerations
    8          Acknowledgments
    9          Authors' Addresses
   10          Normative References
   11          Informative References

1. Specification of requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Introduction

   There are many routing protocols that have been designed to run
   within a single administrative domain.  These are known collectively
   as "Interior Gateway Protocols" (IGPs).  Typically, each link is
   assigned a particular "metric" value.  The path between two nodes can
   then be assigned a "distance", which is the sum of the metrics of all
   the links that belong to that path.  An IGP selects the "shortest"
   (minimal distance) path between any two nodes, perhaps subject to the
   constraint that if the IGP provides multiple "areas", it may prefer
   the shortest path within an area to a path that traverses more than
   one area.  Typically the administration of the network has some
   routing policy which can be approximated by selecting shortest paths
   in this way.

   BGP, as distinguished from the IGPs, was designed to run over an
   arbitrarily large number of administrative domains ("autonomous
   systems", or "ASes") with limited coordination among the various
   administrations.  BGP does not make its path selection decisions
   based on a metric; there is no such thing as an "inter-AS metric".
   There are two fundamental reasons for this:

     - The distance between two nodes in a common administrative domain
       may change at any time due to events occurring in that domain.
       These changes are not propagated around the Internet unless they
       actually cause the border routers of the domain to select routes
       with different BGP attributes for some set of address prefixes.
       This accords with a fundamental principle of scaling, viz., that
       changes with only local significance must not have global
       effects.  If local changes in distance were always propagated
       around the Internet, this principle would be violated.

     - A basic principle of inter-domain routing is that the different
       administrative domains may have their own policies, which do not
       have to be revealed to other domains, and which certainly do not
       have to be agreed to by other domains.  Yet the use of an
       inter-AS metric in the Internet would have exactly these effects.

   There are, however, deployments in which a single administration runs
   a network which has been sub-divided into multiple, contiguous ASes,
   each running BGP.  There are several reasons why a single
   administrative domain may be broken into several ASes (which, in this
   case, are not really "autonomous").  It may be that the existing IGPs
   do not scale well in the particular environment; it may be that a
   more generalized topology is desired than could be obtained by use of
   a single IGP domain; it may be that a more finely grained routing
   policy is desired than can be supported by an IGP.  In such
   deployments, it can be useful to allow BGP to make its routing
   decisions based on the IGP metric, so that BGP chooses the "shortest"
   path between two nodes, even if the nodes are in two different ASes
   within that same administrative domain.

   There are in fact some implementations which already do something
   like this, using the MULTI_EXIT_DISC (MED) attribute to carry the IGP
   metric.  However, that doesn't really provide IGP-like "shortest
   path" routing, as the BGP decision process gives priority to other
   factors, such as LOCAL_PREF and AS_PATH length.  Also, the standard
   procedures for use of the MED do not ensure that the IGP metric is
   properly accumulated so that it covers all the links along the path.

   In this document, we define a new optional, non-transitive BGP
   attribute, called the "Accumulated IGP Metric Attribute", or "AIGP
   attribute", and specify the procedures for using it.

   The specified procedures prevent the AIGP attribute from "leaking
   out" past the administrative domain boundaries into the Internet.

   The specified procedures also ensure that the value in the AIGP
   attribute has been accumulated all along the path from the
   destination, i.e., that the AIGP attribute does not appear when there
   are "gaps" along the path where the IGP metric is unknown.

3. AIGP Attribute

   The AIGP Attribute is an optional non-transitive BGP Path Attribute.
   The attribute type code for the AIGP Attribute is to be assigned by
   IANA.  The value field of the AIGP Attribute is defined here to be a
   set of TLVs (elements encoded as "Type/Length/Value").  However, this
   document defines only a single such TLV, the AIGP TLV, that contains
   the Accumulated IGP Metric.  The AIGP TLV is encoded as shown in
   Figure 1.  An AIGP Attribute MUST NOT contain more than one AIGP TLV.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type=1    |             Length            |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
   ~                                                               ~
   |                    Accumulated IGP Metric                     |
   |                                               +-+-+-+-+-+-+-+-+
   |                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                AIGP TLV
                                Figure 1

   - Type: A single octet encoding the AIGP Attribute Type.  Only type
     1 is defined in this document.

   - Length: Two octets encoding the length in octets of the TLV,
     including the type and length fields.  The length is encoded as an
     unsigned binary integer.

     The length of the AIGP TLV is always 11.

   - Accumulated IGP Metric: For a type 1 AIGP attribute, the value
     field is always 8 octets long.  IGP metrics are frequently
     expressed as 4-octet values, and this ensures that the AIGP
     attribute can be used to hold the sum of an arbitrary number of
     4-octet values.

3.1. Applicability Restrictions and Cautions

   This document only considers the use of the AIGP attribute in
   networks where each router uses tunneling of some sort to deliver a
   packet to its BGP next hop.  Use of the AIGP attribute in networks
   that do not use tunneling is outside the scope of this document.

   If a Route Reflector supports the AIGP attribute, but some of its
   clients do not, then the routing choices that result may not all
   reflect the intended routing policy.

3.2. Restrictions on Sending/Receiving

   An implementation that supports the AIGP attribute MUST support a
   per-session configuration item, AIGP_SESSION, that indicates whether
   the attribute is enabled or disabled for use on that session.

     - The default value of AIGP_SESSION, for EBGP sessions, MUST be
       "disabled".

     - The default value of AIGP_SESSION, for IBGP and confederation-
       EBGP sessions, MUST be "enabled".

   The AIGP attribute MUST NOT be sent on any BGP session for which
   AIGP_SESSION is disabled.

   If an AIGP attribute is received on a BGP session for which
   AIGP_SESSION is disabled, the attribute MUST be treated exactly as if
   it were an unrecognized non-transitive attribute.  That is, "it MUST
   be quietly ignored and not passed along to other BGP peers" (see
   [BGP], section 5).

3.3. Creating and Modifying the AIGP Attribute

3.3.1. Originating the AIGP Attribute

   A BGP speaker MUST NOT add the AIGP attribute to any route whose path
   leads outside the AS to which the BGP speaker belongs.  It may be
   added only to routes that satisfy one of the following conditions:

     - The route is a static route that is being redistributed into BGP

     - The route is an IGP route that is being redistributed into BGP

     - The route is an IBGP-learned route whose AS_PATH attribute is
       empty.
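
   The origination conditions above, together with the AIGP TLV layout
   of Figure 1, can be sketched as follows.  This is only an
   illustration: the Route class, its "source" labels, and the function
   names are hypothetical, and network byte order is assumed for the
   Length and metric fields, as is usual for BGP attributes.

```python
import struct
from dataclasses import dataclass

AIGP_TLV_TYPE = 1     # the only TLV type defined in this document
AIGP_TLV_LENGTH = 11  # 1 (type) + 2 (length) + 8 (metric) octets


@dataclass
class Route:
    # Hypothetical labels for how the speaker obtained the route:
    # "static" or "igp" (being redistributed into BGP), "ibgp", "ebgp".
    source: str
    as_path: tuple = ()  # AS numbers carried in AS_PATH, if any


def may_originate_aigp(route: Route) -> bool:
    """Return True if the route meets one of the three conditions."""
    if route.source in ("static", "igp"):
        return True
    return route.source == "ibgp" and len(route.as_path) == 0


def encode_aigp_tlv(metric: int) -> bytes:
    """Encode a type 1 AIGP TLV (assumes network byte order)."""
    if not 0 <= metric < 2 ** 64:
        raise ValueError("metric must fit in 8 octets")
    # "!" selects network byte order with no padding: B=1, H=2, Q=8 octets.
    return struct.pack("!BHQ", AIGP_TLV_TYPE, AIGP_TLV_LENGTH, metric)
```

   For example, encode_aigp_tlv(10) yields an 11-octet value whose
   first octet is the type and whose last eight octets hold the metric.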

   An implementation that supports the AIGP attribute MUST support a
   configuration item, AIGP_ORIGINATE, that enables or disables its
   creation and attachment to routes.  The default value of
   AIGP_ORIGINATE MUST be "disabled".

   It SHOULD be possible to set AIGP_ORIGINATE to "enabled for the
   routes of a particular IGP that are redistributed into BGP" (where "a
   particular IGP" might be "OSPF" or "IS-IS").

   When a BGP speaker R learns a route to address prefix P from an IGP,
   the IGP will have computed a "distance" from R to P.  The value
   assigned to the AIGP attribute is either the IGP-computed distance,
   or some other value determined by policy.

   In the case of a static route whose next hop matches a BGP route that
   has an AIGP attribute, the static route MAY inherit the AIGP
   attribute value of that BGP route.

3.3.2. Modifications by the Originator

   If BGP speaker R is the originator of the AIGP attribute of prefix P,
   and at some point the distance from R to P changes, R SHOULD issue a
   new BGP update containing the new value of the AIGP attribute.
   However, if the difference between the new distance and the distance
   advertised in the AIGP attribute is less than a configurable
   threshold, the update MAY be suppressed.

3.3.3. Modifications by a Non-Originator

   Suppose a BGP speaker R1 receives a route with an AIGP attribute
   whose value is A, and a Next Hop whose value is R2.  Suppose also
   that R1 is about to redistribute that route on a BGP session that is
   enabled for sending/receiving the attribute.

   If R1 does not change the Next Hop of the route, then R1 MUST NOT
   change the AIGP attribute value of the route.
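
   The handling of the AIGP value by a non-originator, covering the
   unchanged Next Hop case above and the cases specified in the
   remainder of this section, can be sketched as follows.  This is a
   minimal illustration under stated assumptions, not a definitive
   implementation: the NextHopRoute model, the "rib" dictionary, the
   function names, and the max_depth loop guard are all hypothetical.
   The text does not say what happens when D equals the threshold; the
   sketch treats that case as "not above".

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NextHopRoute:
    """R1's route to some next hop (hypothetical routing table entry)."""
    recursive: bool                 # BGP-learned, or static needing recursion
    distance: int = 0               # IGP or policy distance (non-recursive)
    aigp: Optional[int] = None      # AIGP value as received by R1, if any
    next_hop: Optional[str] = None  # next hop of this route (recursive)


def accumulate_recursive(a: int, r2: str, rib: Dict[str, NextHopRoute],
                         threshold: int = 0,
                         max_depth: int = 16) -> Optional[int]:
    """Steps 1-8; returns None when no AIGP attribute is to be passed on."""
    xattr = a                        # steps 1 and 2
    xnh = r2                         # step 3
    for _ in range(max_depth):       # assumption: bound the recursion
        route = rib[xnh]             # step 4
        if not route.recursive:      # step 5
            d = route.distance
            return xattr + d if d > threshold else xattr
        if route.aigp is None:       # step 6
            return None
        xattr += route.aigp          # step 7
        xnh = route.next_hop
    return None                      # assumption: give up on a loop


def aigp_to_advertise(a: int, next_hop_changed: bool, r2: str,
                      rib: Dict[str, NextHopRoute],
                      threshold: int = 0) -> Optional[int]:
    """AIGP value R1 attaches when redistributing the route."""
    if not next_hop_changed:
        return a                     # value MUST NOT change
    route = rib[r2]
    if not route.recursive:
        if route.distance <= 0:
            raise ValueError("A MUST be increased by a non-zero amount")
        return a + route.distance
    return accumulate_recursive(a, r2, rib, threshold)
```

   With A = 100, a BGP-learned route to R2 carrying an AIGP value of 20
   and next hop R3, and an IGP distance of 5 from R1 to R3, the sketch
   returns 125.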

   If R1 changes the Next Hop of the route from R2 to R1, and if R1's
   route to R2 is an IGP-learned route, or a static route that does not
   require recursive next hop resolution, then R1 must increase the
   value of the AIGP attribute by adding to A the distance from R1 to
   R2.  This distance is either the IGP-computed distance from R1 to R2,
   or some value determined by policy.  However, A MUST be increased by
   a non-zero amount.

   Note that if R1 and R2 above are EBGP neighbors, and there is a
   direct link between them on which no IGP is running, then when R1
   changes the next hop of a route from R2 to R1, the AIGP metric value
   MUST be increased by a non-zero amount.  The amount of the increase
   SHOULD be such that it is properly comparable to the IGP metrics.
   E.g., if the IGP metric is a function of latency, then the amount of
   the increase should be a function of the latency from R1 to R2.

   If R1 changes the Next Hop of the route from R2 to R1, and if R1's
   route to R2 is a BGP-learned route, or a static route that requires
   recursive next hop resolution, then the AIGP attribute value needs to
   be increased in several steps:

      1. Let Xattr be the new AIGP attribute value.

      2. Initialize Xattr to A.

      3. Set the XNH to R2.

      4. Find the route to XNH.

      5. If the route to XNH does not require recursive next hop
         resolution, get the distance D from R1 to XNH.  If D is above a
         configurable threshold, set the AIGP attribute value to
         Xattr+D.  If D is below a configurable threshold, set the AIGP
         attribute value to Xattr.  In either case, exit this procedure.

      6. If the route to XNH is a BGP-learned route, and the route does
         NOT have an AIGP attribute, then exit this procedure and do not
         pass on any AIGP attribute.

      7. If the route to XNH is a BGP-learned route, and the route has
         an AIGP attribute value of Y, then set Xattr=Xattr+Y, and set
         XNH to the next hop of this route.  (The intention here is that
         Y is the AIGP value of the route as it was received by R1,
         without having been modified by R1.)

      8. Go to step 4.

   The AIGP value of a given route depends on (a) the AIGP values of all
   the next hops that are recursively resolved during this procedure,
   and (b) the IGP distance to any next hop that is not recursively
   resolved.  Any change due to (a) in any of these values MUST trigger
   a new AIGP computation for that route.  Whether a change due to (b)
   triggers a new AIGP computation depends upon whether the change in
   IGP distance exceeds a configurable threshold.

   Note that the overall shortest path may not be selected if the next
   hop has to be recursively resolved more than once.

   If the AIGP attribute is carried across several ASes, each with its
   own IGP domain, it is clear that these procedures are unlikely to
   give a sensible result if the IGPs are different (e.g., some OSPF and
   some IS-IS), or if the meaning of the metrics is different in the
   different IGPs (e.g., if the metric represents bandwidth in some IGP
   domains but represents latency in others).  These procedures also are
   unlikely to give a sensible result if the metric assigned to inter-AS
   BGP links (on which no IGP is running) or to static routes is not
   comparable to the IGP metrics.  All such cases are outside the scope
   of the current document.

4. Decision Process

4.1. When a Route has an AIGP Attribute

   Use of the AIGP attribute involves several modifications to the BGP
   decision process.

   The procedures defined in this section MUST be executed BEFORE the
   LOCAL_PREF comparison step in the BGP decision process.

   When comparing two routes, one of which has an AIGP attribute and one
   of which does not, the route with the AIGP attribute MUST be
   considered to be the preferable route.
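
   The preference rule of this section, both the case above and the
   comparison of two routes that each carry an AIGP attribute, can be
   sketched as follows.  The function name, its arguments, and the
   return convention (1, 2, or 0) are hypothetical and chosen only for
   illustration; a result of 0 means that this step does not
   discriminate and the rest of the decision process applies.

```python
from typing import Optional


def aigp_preference(t1_aigp: Optional[int], t1_igp_distance: int,
                    t2_aigp: Optional[int], t2_igp_distance: int) -> int:
    """Compare routes T1 and T2 at router R.

    tN_aigp is the AIGP attribute value of TN (None if absent) and
    tN_igp_distance is the IGP distance from R to TN's next hop.
    Returns 1 if T1 is preferred, 2 if T2 is preferred, 0 otherwise.
    """
    if t1_aigp is None and t2_aigp is None:
        return 0  # neither route has the attribute; this step is skipped
    if t2_aigp is None:
        return 1  # only T1 carries the attribute
    if t1_aigp is None:
        return 2  # only T2 carries the attribute
    a1 = t1_aigp + t1_igp_distance
    a2 = t2_aigp + t2_igp_distance
    if a1 < a2:
        return 1
    if a2 < a1:
        return 2
    return 0      # equally preferable; normal tie-breaking follows
```

   Note that a route with a larger AIGP value can still win if its next
   hop is sufficiently closer to R: aigp_preference(100, 5, 90, 20)
   prefers T1, since 105 is less than 110.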

   When a given router R is comparing two routes, T1 and T2, each of
   which has an AIGP attribute, the preferred route is selected
   according to the following rule:

     - Set A1 to the sum of (a) T1's AIGP attribute value and (b) the
       IGP distance from R to T1's next hop.

     - Set A2 to the sum of (a) T2's AIGP attribute value and (b) the
       IGP distance from R to T2's next hop.

     - If A1 is less than A2, select T1.

     - If A2 is less than A1, select T2.

     - If A1 is equal to A2, T1 and T2 are equally preferable.

   In all other respects, the decision process is unchanged.  In
   particular, the tie-breaking rules for equally preferable paths
   remain unchanged, and the AS_PATH continues to be used to prevent
   consideration of routes that traverse an AS more than once.

4.2. When the Route to the Next Hop has an AIGP attribute

   Suppose that a given router R1 is comparing two routes, neither of
   which has an AIGP attribute.  The BGP decision process as specified
   in [BGP] makes use, in its tie breaker procedures, of "interior
   cost", defined as follows:

      "interior cost of a route is determined by calculating the metric
      to the NEXT_HOP for the route using the Routing Table."

   Suppose route T has a next hop of N.  We modify the notion of the
   "interior cost" from node R1 to node N as follows:

     - If the route to N has an AIGP attribute, set A to the AIGP value
       of the route to N, computing the AIGP value of the route
       according to the procedure of section 3.3.3.  (This will have
       been computed at the time the route to N was installed.)

     - If the route to N does not have an AIGP value, set A to 0.  (This
       can only be the case if there is no route to N that does have an
       AIGP value.)

     - Let R2 be the next hop of the route to N, after all recursive
       resolution of the next hop is done.  Let m be the IGP distance
       (or in the case of a static route, the configured distance) from
       R1 to R2.

     - The "interior cost" of route T is the quantity A+m.

5. Deployment Considerations

   Using the AIGP attribute to achieve a desired routing policy will be
   more effective if each BGP speaker can use it to choose from among
   multiple routes.  Thus it is highly recommended that the procedures
   of [BESTEXT] and [ADDPATH] be used in conjunction with the AIGP
   Attribute.

   If a Route Reflector does not pass all paths to its clients, then it
   will tend to pass the paths for which the IGP distance from the Route
   Reflector itself to the next hop is smallest.  This may result in a
   non-optimal choice by the clients.

6. IANA Considerations

   IANA shall assign a codepoint for the AIGP attribute.  This codepoint
   will come from the "BGP Path Attributes" registry.

   IANA shall create a registry for "BGP AIGP Attribute Types".  Type 1
   should be defined as "AIGP", and should refer to this document.

7. Security Considerations

   The spurious introduction, through error or malfeasance, of an AIGP
   attribute could result in the selection of paths other than those
   desired.

   Improper configuration on both ends of an EBGP connection could
   result in an AIGP attribute being passed from one service provider to
   another.  This would likely result in an unsound selection of paths.

8. Acknowledgments

   The authors would like to thank Rajiv Asati, Clarence Filsfils,
   Robert Raszuk, Yakov Rekhter, Samir Saad, and John Scudder for their
   input.

9. Authors' Addresses

   Rex Fernando
   Juniper Networks
   1194 N. Mathilda Ave
   Sunnyvale, CA 94089
   USA
   Email: rex@juniper.net

   Pradosh Mohapatra
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA 95134
   Email: pmohapat@cisco.com

   Eric C. Rosen
   Cisco Systems, Inc.
   1414 Massachusetts Avenue
   Boxborough, MA, 01719
   Email: erosen@cisco.com

   James Uttaro
   AT&T
   200 S. Laurel Avenue
   Middletown, NJ 07748
   Email: uttaro@att.com

10. Normative References

   [BGP] "A Border Gateway Protocol 4 (BGP-4)", Y. Rekhter, T. Li, S.
   Hares, RFC 4271, January 2006.

11. Informative References

   [ADDPATH] "Fast Connectivity Restoration Using BGP Add-Path", P.
   Mohapatra, R. Fernando, C. Filsfils, R. Raszuk,
   draft-pmohapat-idr-fast-conn-restore-00.txt, September 2008.

   [BESTEXT] "Advertisement of the Best-External Route to IBGP", P.
   Marques, R. Fernando, E. Chen, P. Mohapatra,
   draft-marques-idr-best-external-00.txt, July 2008.

   [RFC2119] "Key words for use in RFCs to Indicate Requirement
   Levels", S. Bradner, RFC 2119, March 1997.