idnits 2.17.1

draft-ietf-idr-bgp-optimal-route-reflection-28.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document. If these are example addresses, they should be changed.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (June 17, 2021) is 1037 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Obsolete informational reference (is this intentional?): RFC 7752
     (Obsoleted by RFC 9552)

     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

IDR Working Group                                         R. Raszuk, Ed.
Internet-Draft                                   NTT Network Innovations
Intended status: Standards Track                        B. Decraene, Ed.
Expires: December 19, 2021                                        Orange
                                                               C. Cassar

                                                                 E. Aman

                                                                 K. Wang
                                                        Juniper Networks
                                                           June 17, 2021

                 BGP Optimal Route Reflection (BGP ORR)
             draft-ietf-idr-bgp-optimal-route-reflection-28

Abstract

   This document defines an extension to BGP route reflectors. On route
   reflectors, BGP route selection is modified in order to choose the
   best route from the standpoint of their clients, rather than from the
   standpoint of the route reflectors. Depending on the scaling and
   precision requirements, route selection can be specific for one
   client, common for a set of clients or common for all clients of a
   route reflector. This solution is particularly applicable in
   deployments using centralized route reflectors, where choosing the
   best route based on the route reflector's IGP location is suboptimal.
   This facilitates, for example, best exit point policy (hot potato
   routing).

   The solution relies upon all route reflectors learning all paths
   which are eligible for consideration. BGP Route Selection is
   performed in the route reflectors based on the IGP cost from
   configured locations in the link state IGP.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 19, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Modifications to BGP Route Selection
     3.1.  Route Selection from a different IGP location
       3.1.1.  Restriction when BGP next hop is a BGP route
     3.2.  Multiple Route Selections
   4.  Deployment Considerations
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgments
   8.  Contributors
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   There are three types of BGP deployments within Autonomous Systems
   today: full mesh, confederations and route reflection. BGP route
   reflection [RFC4456] is the most popular way to distribute BGP routes
   between BGP speakers belonging to the same Autonomous System.
   However, in some situations, this method suffers from non-optimal
   path selection.

   [RFC4456] asserts that, because the IGP cost to a given point in the
   network will vary across routers, "the route reflection approach may
   not yield the same route selection result as that of the full
   Internal BGP (IBGP) mesh approach." One practical implication of
   this fact is that the deployment of route reflection may thwart the
   ability to achieve hot potato routing. Hot potato routing attempts
   to direct traffic to the closest Autonomous System (AS) exit point in
   cases where no higher priority policy dictates otherwise. As a
   consequence of the route reflection method, the choice of exit point
   for a route reflector and its clients will be the exit point that is
   optimal for the route reflector - not necessarily the one that is
   optimal for its clients.

   Section 11 of [RFC4456] describes a deployment approach and a set of
   constraints which, if satisfied, would result in the deployment of
   route reflection yielding the same results as the IBGP full mesh
   approach. This deployment approach makes route reflection compatible
   with the application of hot potato routing policy. In accordance
   with these design rules, route reflectors have often been deployed in
   the forwarding path and carefully placed on the Point of Presence
   (POP) to core boundaries.

   The evolving model of intra-domain network design has enabled
   deployments of route reflectors outside the forwarding path.
   Initially, this model was only employed for new services, e.g., IP
   VPNs [RFC4364]; however, it has been gradually extended to other BGP
   services, including the IPv4 and IPv6 Internet. In such
   environments, hot potato routing policy remains desirable.

   Route reflectors outside the forwarding path can be placed on the POP
   to core boundaries, but they are often placed in arbitrary locations
   in the core of large networks.

   Such deployments suffer from a critical drawback in the context of
   BGP Route Selection: A route reflector with knowledge of multiple
   paths for a given route will typically pick its best path and only
   advertise that best path to its clients. If the best path for a
   route is selected on the basis of an IGP tie-break, the path
   advertised will be the exit point closest to the route reflector.
   However, the clients are in a different place in the network topology
   than the route reflector. In networks where the route reflectors are
   not in the forwarding path, this difference will be even more acute.

   In addition, there are deployment scenarios where service providers
   want to have more control in choosing the exit points for clients
   based on other factors, such as traffic type, traffic load, etc.
   This further complicates the issue and makes it less likely for the
   route reflector to select the best path from the client's
   perspective. It follows that the best path chosen by the route
   reflector is not necessarily the same as the path which would have
   been chosen by the client if the client had considered the same set
   of candidate paths as the route reflector.

2.  Terminology

   This memo makes use of the terms defined in [RFC4271] and [RFC4456].

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Modifications to BGP Route Selection

   The core of this solution is the ability for an operator to specify
   the IGP location for which the route reflector calculates interior
   cost for the NEXT_HOP.
   The IGP location is defined as a node in the IGP topology. It is
   identified by an IP address of this node (e.g., a loopback address)
   and may be configured per route reflector, per set of clients, or
   per client. Such configuration will allow the route reflector to
   select and distribute to a given set of clients routes with the
   shortest distance to the next hops from the position of the selected
   IGP location. This provides for freedom of route reflector physical
   location, and allows transient or permanent migration of this network
   control plane function to an arbitrary location with no impact on IP
   transit.

   The choice of specific granularity (route reflector, set of clients,
   or client) is configured by the network operator. An implementation
   is considered compliant with this document if it supports at least
   one such grouping category.

   For purposes of route selection, the perspective of a client can
   differ from that of a route reflector or another client in two
   distinct ways:

   o  it has a different position in the IGP topology,

   o  it can have a different routing policy.

   These factors correspond to the issues described earlier.

   This document defines, for BGP Route Reflectors [RFC4456], two
   changes to the BGP Route Selection algorithm:

   o  The first change, introduced in Section 3.1, is related to the IGP
      cost to the BGP Next Hop in the BGP decision process. The change
      consists of using the IGP cost from a different IGP location than
      the route reflector itself.

   o  The second change, introduced in Section 3.2, is to extend the
      granularity of the BGP decision process, to allow for running
      multiple decision processes using different perspectives or
      policies.

   A significant advantage of these approaches is that the route
   reflector clients do not need to be modified.

3.1.  Route Selection from a different IGP location

   In this approach, "optimal" refers to the decision where the interior
   cost of a route is determined during step e) of [RFC4271] section
   9.1.2.2 "Breaking Ties (Phase 2)". It does not apply to path
   selection preference based on other policy steps and provisions.

   In addition to the change specified in [RFC4456] section 9, [RFC4271]
   section 9.1.2.2 is modified as follows.

   The following text in step e):

      e) Remove from consideration any routes with less-preferred
      interior cost. The interior cost of a route is determined by
      calculating the metric to the NEXT_HOP for the route using the
      Routing Table.

   ...is replaced by this new text:

      e) Remove from consideration any routes with less-preferred
      interior cost. The interior cost of a route is determined by
      calculating the metric from the selected IGP location to the
      NEXT_HOP for the route using the shortest IGP path tree rooted at
      the selected IGP location.

   In order to be able to compute the shortest path tree rooted at the
   selected IGP locations, knowledge of the IGP topology for the
   area/level that includes each of those locations is needed. This
   knowledge can be gained with the use of a link-state IGP such as
   IS-IS [ISO10589] or OSPF [RFC2328] [RFC5340] or via BGP-LS [RFC7752].
   When specifying the logical location of a route reflector for a group
   of clients, one or more backup IGP locations SHOULD be allowed to be
   specified for redundancy. Further deployment considerations are
   discussed in Section 4.

3.1.1.  Restriction when BGP next hop is a BGP route

   In situations where the BGP next hop is a BGP route itself, the IGP
   metric of a route used for its resolution SHOULD be the final IGP
   cost to reach such next hop.
   Implementations which cannot inform BGP of the final IGP metric to a
   recursive next hop MUST treat such paths as least preferred during
   next hop metric comparison. However, such paths MUST still be
   considered valid for BGP Phase 2 Route Selection.

3.2.  Multiple Route Selections

   A BGP route reflector as per [RFC4456] runs a single BGP Decision
   Process. Optimal route reflection may require multiple BGP Decision
   Processes or subsets of the Decision Process in order to consider
   different IGP locations or BGP policies for different sets of
   clients. This is very similar to what is defined in [RFC7947]
   section 2.3.2.1.

   If the required routing optimization is limited to the IGP cost to
   the BGP Next-Hop, only step e) and subsequent steps as defined in
   [RFC4271] section 9.1.2.2 need to be run multiple times.

   If the routing optimization requires the use of different BGP
   policies for different sets of clients, a larger part of the decision
   process needs to be run multiple times, up to the whole decision
   process as defined in section 9.1 of [RFC4271]. This is, for example,
   the case when there is a need to use different policies to compute
   different degrees of preference during Phase 1. This is needed for
   use cases involving traffic engineering or dedicating certain exit
   points for certain clients. In the latter case, the user may specify
   and apply a general policy on the route reflector for a set of
   clients. Regular path selection, including IGP perspective for a set
   of clients as per Section 3.1, is then applied to the candidate paths
   to select the final paths to advertise to the clients.

   A route reflector can implement either or both of the modifications
   in order to allow it to choose the best path for its clients that the
   clients themselves would have chosen given the same set of candidate
   paths.

4.  Deployment Considerations

   BGP Optimal Route Reflection provides a model for integrating the
   client perspective into the BGP Route Selection decision function for
   route reflectors. More specifically, the choice of BGP path takes
   into account either the IGP cost between the client and the NEXT_HOP
   (rather than the IGP cost from the route reflector to the NEXT_HOP)
   or other user configured policies.

   The achievement of optimal routing between clients of different
   clusters relies upon all route reflectors learning all paths that are
   eligible for consideration. In order to satisfy this requirement,
   BGP add-path [RFC7911] needs to be deployed between route reflectors.

   This solution can be deployed in traditional hop-by-hop forwarding
   networks as well as in end-to-end tunneled environments. To avoid
   routing loops in networks with multiple route reflectors and
   hop-by-hop forwarding without encapsulation, it is essential that the
   network topology be carefully considered in designing a route
   reflection topology (see also Section 11 of [RFC4456]).

   As discussed in section 11 of [RFC4456], the IGP locations of BGP
   route reflectors are important and have routing implications. This
   equally applies to the choice of the IGP locations configured on
   optimal route reflectors. If a backup location is provided, it is
   used when the primary IGP location disappears from the IGP (i.e.,
   fails). Just like the failure of an RR [RFC4456], it may result in
   changing the paths selected and advertised to the clients and in
   general the post-failure paths are expected to be less optimal. This
   is dependent on the IGP topologies and the IGP distance between the
   primary and the backup IGP locations: the smaller the distance the
   smaller the potential impact.

   After selecting suitable IGP locations, an operator may let one or
   multiple route reflectors handle route selection for all of them.
   The operator may alternatively deploy one or multiple route
   reflectors for each IGP location or create any design in between.
   This choice may depend on the operational model (centralized vs. per
   region), the acceptable blast radius in case of failure, the
   acceptable number of IBGP sessions for the mesh between the route
   reflectors, and the performance and configuration granularity of the
   equipment.

   With this approach, an ISP can effect a hot potato routing policy
   even if route reflection has been moved out of the forwarding plane,
   and hop-by-hop forwarding has been replaced by end-to-end MPLS or IP
   encapsulation. Compared with a deployment of ADD-PATH on all
   routers, BGP Optimal Route Reflection (ORR) reduces the amount of
   state which needs to be pushed to the edge of the network in order to
   perform hot potato routing.

   Modifying the IGP location of BGP ORR does not interfere with
   policies enforced before IGP tie-breaking (step e) of [RFC4271]
   section 9.1.2.2) in the BGP Decision Process.

   Calculating routes for different IGP locations requires multiple
   Shortest Path First (SPF) calculations and multiple (subsets of) BGP
   Decision Processes, which requires more computing resources. This
   document allows for different granularities such as one Decision
   Process per route reflector, per set of clients or per client. A
   finer granularity may translate into more optimal hot potato routing
   at the cost of more computing power. Choosing to configure an IGP
   location per client has the highest precision as each client can be
   associated with its ideal (own) IGP location. However, doing so may
   have an impact on the performance (as explained above). Using an IGP
   location per set of clients implies a loss of precision, but reduces
   the impact on the performance of the route reflector.
   Similarly, if an IGP location is selected for the whole routing
   instance, the lowest precision is achieved, but the performance
   impact is minimal. In the last mode of operation, both the precision
   and the performance metrics are equal to those of route reflection as
   described in [RFC4456] without the ORR extension. The ability to run
   fine-grained computations depends on the platform/hardware deployed,
   the number of clients, the number of BGP routes and the size of the
   IGP topology. In essence, sizing considerations are similar to those
   for deployments of BGP route reflectors.

5.  Security Considerations

   This extension provides a new metric value using additional
   information for computing routes for BGP route reflectors. While any
   improperly used metric value could impact the resiliency of the
   network, this extension does not change the underlying security
   issues inherent in the existing IBGP per [RFC4456].

   This document does not introduce requirements for any new protection
   measures.

6.  IANA Considerations

   This document does not request any IANA allocations.

7.  Acknowledgments

   The authors would like to thank Keyur Patel, Eric Rosen, Clarence
   Filsfils, Uli Bornhauser, Russ White, Jakob Heitz, Mike Shand, Jon
   Mitchell, John Scudder, Jeff Haas, Martin Djernaes, Daniele
   Ceccarelli, Kieran Milne, Job Snijders, Randy Bush, Alvaro Retana,
   Francesca Palombini, Benjamin Kaduk, Zaheduzzaman Sarker, Lars
   Eggert, Murray Kucherawy, Tom Petch and Nick Hilliard for their
   valuable input.

8.  Contributors

   The following persons substantially contributed to the current format
   of the document:

   Stephane Litkowski
   Cisco System

   slitkows.ietf@gmail.com

   Adam Chappell
   GTT Communications, Inc.
   Aspira Business Centre
   Bucharova 2928/14a
   158 00 Prague 13 Stodulky
   Czech Republic

   adam.chappell@gtt.net

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
              Reflection: An Alternative to Full Mesh Internal BGP
              (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006.

   [RFC7911]  Walton, D., Retana, A., Chen, E., and J. Scudder,
              "Advertisement of Multiple Paths in BGP", RFC 7911,
              DOI 10.17487/RFC7911, July 2016.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

9.2.  Informative References

   [ISO10589]
              International Organization for Standardization,
              "Intermediate system to Intermediate system intra-domain
              routeing information exchange protocol for use in
              conjunction with the protocol for providing the
              connectionless-mode Network Service (ISO 8473)",
              ISO/IEC 10589:2002, Second Edition, Nov 2002.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
              DOI 10.17487/RFC2328, April 1998.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006.

   [RFC5340]  Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF
              for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008.

   [RFC7752]  Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
              S. Ray, "North-Bound Distribution of Link-State and
              Traffic Engineering (TE) Information Using BGP", RFC 7752,
              DOI 10.17487/RFC7752, March 2016.

   [RFC7947]  Jasinska, E., Hilliard, N., Raszuk, R., and N. Bakker,
              "Internet Exchange BGP Route Server", RFC 7947,
              DOI 10.17487/RFC7947, September 2016.
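
   As an informal illustration only (it is not part of the
   specification), the modified step e) of Section 3.1 can be sketched
   in Python as follows. The sketch assumes a topology given as a
   dictionary of per-link IGP metrics and identifies next hops by node
   name rather than by IP address. The function names spf_costs,
   select_igp_location and step_e, the node names, and the choice to
   raise an error when no configured location is present are all
   hypothetical; the backup handling follows the primary-then-backup
   order described in Section 4.

```python
import heapq
from math import inf


def spf_costs(topology, root):
    """Dijkstra SPF: IGP cost from `root` to every reachable node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, inf):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, metric in topology.get(node, {}).items():
            new_cost = cost + metric
            if new_cost < dist.get(neighbor, inf):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


def select_igp_location(topology, locations):
    """Return the first configured location still present in the IGP.

    `locations` lists the primary location first, then any backups.
    Raising when none is present is an assumption of this sketch.
    """
    for location in locations:
        if location in topology:
            return location
    raise ValueError("no configured IGP location is present in the IGP")


def step_e(paths, topology, locations):
    """Keep the paths whose NEXT_HOP is closest to the selected location."""
    if not paths:
        return []
    root = select_igp_location(topology, locations)
    costs = spf_costs(topology, root)
    # Unreachable next hops get an infinite cost (least preferred).
    best = min(costs.get(p["next_hop"], inf) for p in paths)
    return [p for p in paths if costs.get(p["next_hop"], inf) == best]


# Hypothetical topology: node -> {neighbor: IGP metric}.
topology = {
    "RR": {"ASBR1": 1, "P": 5},
    "ASBR1": {"RR": 1, "P": 5},
    "P": {"RR": 5, "ASBR1": 5, "PE1": 1, "ASBR2": 2},
    "PE1": {"P": 1, "ASBR2": 1},
    "ASBR2": {"P": 2, "PE1": 1},
}

# Two candidate paths for the same prefix that tie before step e).
paths = [
    {"prefix": "203.0.113.0/24", "next_hop": "ASBR1"},
    {"prefix": "203.0.113.0/24", "next_hop": "ASBR2"},
]

print(step_e(paths, topology, ["RR"]))          # route reflector's own view
print(step_e(paths, topology, ["PE1"]))         # client PE1's view
print(step_e(paths, topology, ["PE9", "PE1"]))  # primary absent, backup used
```

   In this topology, the route reflector's own position favors the exit
   at ASBR1 (cost 1 versus 7), while rooting the shortest path tree at
   the client PE1 favors ASBR2 (cost 1 versus 6), which is the
   difference that the selected IGP location is meant to capture.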

Authors' Addresses

   Robert Raszuk (editor)
   NTT Network Innovations

   Email: robert@raszuk.net


   Bruno Decraene (editor)
   Orange

   Email: bruno.decraene@orange.com


   Christian Cassar

   Email: cassar.christian@gmail.com


   Erik Aman

   Email: erik.aman@aman.se


   Kevin Wang
   Juniper Networks
   10 Technology Park Drive
   Westford, MA 01886
   USA

   Email: kfwang@juniper.net