idnits 2.17.1

draft-marques-idr-best-external-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 18.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 389.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 400.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 407.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 413.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (July 27, 2008) is 5752 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 1771 (Obsoleted by RFC 4271)

  ** Downref: Normative reference to an Informational RFC: RFC 3345

     Summary: 3 errors (**), 0 flaws (~~), 2 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                         P. Marques
Internet-Draft                                               R. Fernando
Intended status: Standards Track                        Juniper Networks
Expires: January 28, 2009                                        E. Chen
                                                            P. Mohapatra
                                                           Cisco Systems
                                                           July 27, 2008

            Advertisement of the best-external route to IBGP
                 draft-marques-idr-best-external-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 28, 2009.

Abstract

   This document makes the case for, and provides the rules for, a
   border router to advertise its best external route towards its IBGP
   peers when its overall best is a route received from an IBGP peer.

   The best external route may be different from the overall best route
   installed in the Loc-RIB.  Advertising the best-external route (when
   different from the overall best route) into an IBGP mesh helps in
   speeding up routing convergence, has positive effects in reducing
   inter-domain churn and in some limited scenarios could help avoid
   permanent IBGP route oscillation.

   The document also extends this mechanism to route reflectors and
   confederation border routers to advertise a best route that is
   external to the cluster/domain.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Consistency between routing and forwarding
   3.  Algorithm for selection of best-external route
   4.  Route Reflection
   5.  Confederations
   6.  Applications
     6.1.  Fast Connectivity Restoration
     6.2.  Inter-Domain Churn Reduction
     6.3.  Reducing Persistent IBGP oscillation
   7.  Acknowledgments
   8.  IANA Considerations
   9.  Security Considerations
   10. Normative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   The term best-external route describes the most preferred route among
   the routes received by a router from its EBGP peers.  The
   best-external route might differ from the overall best route
   installed in the Loc-RIB when the overall best route happens to be an
   internal route.  Advertising the best-external route, when different
   from the overall best, introduces additional information into an IBGP
   mesh which may be of value for several purposes including:

   o  Faster restoration of connectivity, by providing additional paths
      that may be used to fail over in case the primary path becomes
      invalid or is withdrawn.

   o  Reducing inter-domain churn and traffic blackholing due to the
      readily available alternate path.

   o  Reducing the potential for situations of permanent IBGP route
      oscillation, as discussed for some scenarios in [RFC3345].

   o  Improving selection of lower MED routes from the same neighboring
      AS.

   In current networks, BGP is typically deployed in topologies that
   include the use of route reflectors [RFC4456] and/or confederations
   [RFC5065].  It is straightforward to extend the concept of "external"
   route to a cluster or confederation sub-AS.  A route is considered
   "external" if it has not been received from the cluster/sub-AS which
   is being considered for advertisement.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Consistency between routing and forwarding

   The BGP protocol, as defined in [RFC1771], specifies that a BGP
   speaker shall advertise to its internal peers the route with the
   highest degree of preference among routes to the same destination
   received from external neighbors.

   This section discusses problems with the approach described in
   [RFC1771], and the next section offers an alternative algorithm to
   select a best external route which can be advertised to an IBGP mesh.

   The internal update advertisement rules contained in the original
   BGP-4 specification [RFC1771] can lead to situations where traffic is
   forwarded through a route other than the route advertised by BGP.

   Inconsistencies between forwarding and routing are highly
   undesirable.  Service providers use BGP with the dual objective of
   learning reachability information and expressing policy over network
   resources.  The latter assumes that forwarding follows routing
   information.

   Consider the autonomous system presented in Figure 1, where r1 ... r4
   are members of a single IBGP mesh and routes a, b, and c are received
   from external peers.

                     AS 1 (c)
                        |
                     +----+           +----+
                     | r1 |...........| r2 |
                     +----+           +----+
                        .
                        .
                        .
                        .
                        .
                        .
                     +----+           +----+
                     | r3 |...........| r4 | --- ebgp --- AS X
                     +----+           +----+
                     /    \
                    /      \
               AS 1 (a)  AS 2 (b)

                    Figure 1: Inconsistency in Routing

                     Path    AS    MED    rtr_id
                     a        1     10       1
                     b        2      5      10
                     c        1      5       5

                      Figure 2: Path Attribute Table

   Following the rules as specified in [RFC1771], router r3 will select
   path (b) received from AS 2 as its overall best to install in the
   Loc-RIB, since path (b) is preferable to path (c), the lowest MED
   route from AS 1.  However, for the purposes of Internal Update route
   selection, it will ignore the presence of path (c), and elect (a) as
   its advertisement, via the router-id tie-breaking rule.

   In this scenario, router r4 will receive (c) from r1 and (a) from r3.
   It will pick the lowest MED route (c) and advertise it out via EBGP
   to AS X.
   However, at this point routing is inconsistent with forwarding, as
   traffic received from AS X will be forwarded towards AS 2, while the
   EBGP advertisement is being made for an AS 1 path.

   Routing policies are typically specified in terms of neighboring
   ASes.  In the situation above, assuming that AS 1 is a network for
   which this AS provides transit services while AS 2 and AS X are peer
   networks, one can easily see how the inconsistency between routing
   and forwarding would lead to transit being inadvertently provided
   between AS X and AS 2.  This could lead to persistent forwarding
   loops.

   Inconsistency between routing and forwarding may happen whenever a
   BGP speaker chooses to advertise an external route into IBGP that is
   different from the overall best route and its overall best is
   external.

3.  Algorithm for selection of best-external route

   Given that the intent in advertising an external route, when the
   overall best for the same destination is an internal route, is to
   provide additional information into the IBGP mesh in which a router
   is participating, it is desirable to take into account the routes
   received from interior neighbors in the selection process.

   We propose a route selection algorithm that establishes a global
   order among routes and selects the same overall best route as the one
   currently specified in [RFC4271].

   In order to achieve this we need to introduce the concept of a route
   group.  A route group is a set of routes to the same destination
   received from the same neighboring AS and which are equal in terms of
   route selection prior to the MED comparison step.

   Routes are ordered within a group via MED or subsequent route
   selection rules.

   The order of all routes for the same destination is determined by the
   order of the best route in each group.

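   The ordering rules above can be sketched in Python.  This is an
   illustrative sketch only, not part of the specification: the Route
   class and the order_routes function are hypothetical names, and the
   sketch assumes that all routes are already equal on every selection
   step preceding the MED comparison, so that a route group reduces to
   "same neighboring AS" and ties between group bests fall to the
   lowest router id.  The sample table uses the values of Figure 3.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Route:
    """Hypothetical route record; fields mirror the figure columns."""
    name: str         # label used in the figures, e.g. "a"
    neighbor_as: int  # neighboring AS the route was received from
    med: int
    router_id: int


def order_routes(routes):
    """Return routes from most to least preferred using route groups.

    Assumption: all routes tie on every step before MED, so a group is
    simply the set of routes from the same neighboring AS.
    """
    by_as = sorted(routes, key=lambda r: r.neighbor_as)
    groups = []
    for _, members in groupby(by_as, key=lambda r: r.neighbor_as):
        # Within a group: lowest MED first, then lowest router id.
        groups.append(sorted(members, key=lambda r: (r.med, r.router_id)))
    # Across groups: MED is not compared between different ASes, so the
    # group bests are ranked by a later tie-breaker (router id here).
    groups.sort(key=lambda g: g[0].router_id)
    # Each group's members follow their group best in the global order.
    return [r for g in groups for r in g]


ROUTES = [
    Route("a", 1, 10, 10),
    Route("b", 2, 5, 1),
    Route("c", 1, 5, 5),
    Route("d", 2, 20, 20),
    Route("e", 2, 30, 30),
    Route("f", 3, 10, 20),
]

print(" < ".join(r.name for r in order_routes(ROUTES)))
```

   Under these assumptions the sketch prints b < d < e < c < a < f.  A
   real implementation would also have to apply the remaining steps of
   the decision process (for example EBGP versus IBGP and IGP cost)
   when comparing group bests.
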
   As an example, the following set of received routes:

                     Path    AS    MED    rtr_id
                     a        1     10      10
                     b        2      5       1
                     c        1      5       5
                     d        2     20      20
                     e        2     30      30
                     f        3     10      20

                    Figure 3: Path Attribute Table - 2

   Would yield the following order (from the most to the least
   preferred):

   b < d < e < c < a < f

   In this example, comparison of the best route within each group
   provides the sequence (b < c < f).  The remaining routes are ordered
   in relation to their respective group best.

   The route to be advertised to the IBGP mesh or a given cluster/sub-AS
   is selected by choosing the most preferred route that is external to
   that particular domain.  Note that whenever the overall best route is
   external it will automatically be selected by this algorithm.

4.  Route Reflection

   A route reflector that chooses to implement this algorithm will
   advertise to its non-client IBGP peers the most preferred path
   received from its clients.  This is referred to as the best
   intra-cluster route.  It will advertise to its client peers the most
   preferred path received from a neighbor outside the cluster.  This is
   referred to as the best inter-cluster route.

   In order for a reflector to be able to advertise the best of its
   inter-cluster routes into a cluster it is necessary that
   client-to-client reflection be disabled, since its advertisement may
   otherwise

5.  Confederations

   When a BGP speaker is configured as a confederation border router, it
   shall consider the best-external route as follows:

   o  When advertising into its sub-AS, it should select the most
      preferred route not received from within its sub-AS.

   o  When advertising into confed EBGP, it should select the most
      preferred route not received from the neighboring sub-AS.

6.  Applications

6.1.  Fast Connectivity Restoration

   When two exits are available to reach a particular destination and
   one is preferred over the other, the availability of an alternate
   path provides fast connectivity restoration when the primary path
   fails.

   Restoration can be quick since the alternate path is already at hand.
   The border router could precompute the backup route and preinstall it
   in the FIB, ready to be switched to when the primary goes away.  Note
   that this requires the border router that is the backup to also
   preinstall the secondary path and switch to it on failure.

6.2.  Inter-Domain Churn Reduction

   Within an AS, the unavailability of a backup path leads to a border
   router sending a withdrawal upstream when the primary fails.  This
   leads to inter-domain churn and packet loss for the time the network
   takes to converge to the alternate path.  Having the alternate path
   available reduces the churn and eliminates packet loss.

6.3.  Reducing Persistent IBGP oscillation

   Advertising the best-external route according to the algorithm
   described in this document will reduce the possibility of route
   oscillation by introducing additional information into the IBGP
   system.

   For a permanent oscillation condition to occur, it is necessary that
   a circular dependency between paths occurs such that the selection of
   a new best path by a router, in response to a received IBGP
   advertisement, causes the withdrawal of information that another
   router depends on in order to generate the original event.

   In vanilla BGP, when only the best overall route is advertised, as in
   most implementations, oscillation can occur whenever there are 2 or
   more clusters/sub-ASes such that at least one cluster has more than
   one path that can potentially contribute to the dependency.

7.  Acknowledgments

   This document greatly benefits from the comments of Yakov Rekhter,
   John Scudder and Jenny Yuan.

8.  IANA Considerations

   This document has no actions for IANA.

9.  Security Considerations

   There are no additional security risks introduced by this design.

10.  Normative References

   [RFC1771]  Rekhter, Y. and T. Li, "A Border Gateway Protocol 4
              (BGP-4)", RFC 1771, March 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3345]  McPherson, D., Gill, V., Walton, D., and A. Retana,
              "Border Gateway Protocol (BGP) Persistent Route
              Oscillation Condition", RFC 3345, August 2002.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
              Reflection: An Alternative to Full Mesh Internal BGP
              (IBGP)", RFC 4456, April 2006.

   [RFC5065]  Traina, P., McPherson, D., and J. Scudder, "Autonomous
              System Confederations for BGP", RFC 5065, August 2007.

Authors' Addresses

   Pedro Marques
   Juniper Networks
   1194 N. Mathilda Ave
   Sunnyvale, CA  94089
   USA

   Phone:
   Email: roque@juniper.net

   Rex Fernando
   Juniper Networks
   1194 N. Mathilda Ave
   Sunnyvale, CA  94089
   USA

   Phone:
   Email: rex@juniper.net

   Enke Chen
   Cisco Systems
   170 W. Tasman Drive
   San Jose, CA  95134
   USA

   Phone:
   Email: enkechen@cisco.com

   Pradosh Mohapatra
   Cisco Systems
   170 W. Tasman Drive
   San Jose, CA  95134
   USA

   Phone:
   Email: pmohapat@cisco.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.