idnits 2.17.1

draft-ietf-idr-route-reflect-v2-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-19) according to
     https://trustee.ietf.org/license-info :

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

          This Internet-Draft is submitted in full conformance with the
          provisions of BCP 78 and BCP 79.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

          Copyright (c) 2024 IETF Trust and the persons identified as the
          document authors. All rights reserved.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

          This document is subject to BCP 78 and the IETF Trust's Legal
          Provisions Relating to IETF Documents
          (https://trustee.ietf.org/license-info) in effect on the date of
          publication of this document. Please review these documents
          carefully, as they describe your rights and restrictions with
          respect to this document. Code Components extracted from this
          document must include Simplified BSD License text as described in
          Section 4.e of the Trust Legal Provisions and are provided without
          warranty as described in the Simplified BSD License.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 8
     longer pages, the longest (page 2) being 59 lines

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([2,3], [1]), which it
     shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 35 has weird spacing: '...as been well...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you can
     ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (April 1999) is 9136 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 1771 (ref. '1') (Obsoleted by RFC 4271)

  ** Obsolete normative reference: RFC 1863 (ref. '2') (Obsoleted by RFC 4223)

  ** Obsolete normative reference: RFC 1965 (ref. '3') (Obsoleted by RFC 3065)

     Summary: 12 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    INTERNET-DRAFT                                              Tony Bates
2                                                              Ravi Chandra
3                                                                 Enke Chen
4                                                             Cisco Systems
5                                                                April 1999

7                              BGP Route Reflection
8                        An alternative to full mesh IBGP
9

11   Status of this Memo

13      This document is an Internet Draft. Internet Drafts are working
14      documents of the Internet Engineering Task Force (IETF), its Areas,
15      and its Working Groups. Note that other groups may also distribute
16      working documents as Internet Drafts.

18      Internet Drafts are draft documents valid for a maximum of six
19      months. Internet Drafts may be updated, replaced, or obsoleted by
20      other documents at any time. It is not appropriate to use Internet
21      Drafts as reference material or to cite them other than as a "working
22      draft" or "work in progress".

24      Please check the I-D abstract listing contained in each Internet
25      Draft directory to learn the current status of this or any other
26      Internet Draft.

28   Abstract

30      The Border Gateway Protocol [1] is an inter-autonomous system routing
31      protocol designed for TCP/IP internets. Currently in the Internet, BGP
32      deployments are configured such that all BGP speakers within a
33      single AS must be fully meshed so that any external routing
34      information must be re-distributed to all other routers within that
35      AS. This represents a serious scaling problem that has been well
36      documented with several alternatives proposed [2,3].

38      This document describes the use and design of a method known as
39      "Route Reflection" to alleviate the need for "full mesh" IBGP.

41   1. Introduction

43      Currently in the Internet, BGP deployments are configured such that
44      all BGP speakers within a single AS must be fully meshed and any
45      external routing information must be re-distributed to all other
46      routers within that AS. For n BGP speakers within an AS, this
47      requires maintaining n*(n-1)/2 unique IBGP sessions. This "full
48      mesh" requirement clearly does not scale when there are a large
49      number of IBGP speakers each exchanging a large volume of routing
50      information, as is common in many of today's Internet networks.

52      This scaling problem has been well documented and a number of
53      proposals have been made to alleviate this [2,3]. This document
54      describes another alternative for alleviating the need for a "full
55      mesh", known as "Route Reflection". This approach allows a BGP
56      speaker (known as "Route Reflector") to advertise IBGP learned routes
57      to certain IBGP peers. It represents a change in the commonly
58      understood concept of IBGP, and the addition of two new optional
59      non-transitive BGP attributes to prevent loops in routing updates.

61   2. Design Criteria

63      Route Reflection was designed to satisfy the following criteria.

65         o Simplicity

67           Any alternative must be simple both to configure and
68           to understand.

70         o Easy Transition

72           It must be possible to transition from a full mesh
73           configuration without the need to change either topology
74           or AS. This is an unfortunate management overhead of the
75           technique proposed in [3].

77         o Compatibility

79           It must be possible for non-compliant IBGP peers
80           to continue to be part of the original AS or domain
81           without any loss of BGP routing information.

83      These criteria were motivated by operational experience with a very
84      large and topology-rich network with many external connections.

86   3. Route Reflection
87      The basic idea of Route Reflection is very simple. Let us consider
88      the simple example depicted in Figure 1 below.

90               +-------+        +-------+
91               |       |  IBGP  |       |
92               | RTR-A |--------| RTR-B |
93               |       |        |       |
94               +-------+        +-------+
95                     \            /
96                 IBGP \   ASX    / IBGP
97                       \        /
98                        +-------+
99                        |       |
100                       | RTR-C |
101                       |       |
102                       +-------+

104                  Figure 1: Full Mesh IBGP

106     In ASX there are three IBGP speakers (routers RTR-A, RTR-B and
107     RTR-C). With the existing BGP model, if RTR-A receives an external
108     route and it is selected as the best path, it must advertise the
109     external route to both RTR-B and RTR-C. RTR-B and RTR-C (as IBGP
110     speakers) will not re-advertise these IBGP learned routes to other
111     IBGP speakers.

113     If this rule is relaxed and RTR-C is allowed to advertise IBGP
114     learned routes to IBGP peers, then it could re-advertise (or reflect)
115     the IBGP routes learned from RTR-A to RTR-B and vice versa. This
116     would eliminate the need for the IBGP session between RTR-A and RTR-B
117     as shown in Figure 2 below.

119              +-------+        +-------+
120              |       |        |       |
121              | RTR-A |        | RTR-B |
122              |       |        |       |
123              +-------+        +-------+
124                    \            /
125                IBGP \   ASX    / IBGP
126                      \        /
127                       +-------+
128                       |       |
129                       | RTR-C |
130                       |       |
131                       +-------+

133                  Figure 2: Route Reflection IBGP

135     The Route Reflection scheme is based upon this basic principle.

137  4. Terminology and Concepts

139     We use the term "Route Reflection" to describe the operation of a BGP
140     speaker advertising an IBGP learned route to another IBGP peer. Such
141     a BGP speaker is said to be a "Route Reflector" (RR), and such a
142     route is said to be a reflected route.

144     The internal peers of a RR are divided into two groups:

146             1) Client Peers

148             2) Non-Client Peers

150     A RR reflects routes between these groups, and may reflect routes
151     among client peers. A RR along with its client peers forms a Cluster.
152     The Non-Client peers must be fully meshed but the Client peers need
153     not be fully meshed. Figure 3 depicts a simple example outlining the
154     basic RR components using the terminology noted above.

156              / - - - - - - - - - - - - - -
157              |           Cluster           |
158                +-------+        +-------+
159              | |       |        |       |  |
160                | RTR-A |        | RTR-B |
161              | |Client |        |Client |  |
162                +-------+        +-------+
163              |      \             /        |
164                IBGP  \           / IBGP
165              |        \         /          |
166                        +-------+
167              |         |       |           |
168                        | RTR-C |
169              |         |  RR   |           |
170                        +-------+
171              |           /   \             |
172               - - - - - /- - -\- - - - - - /
173                   IBGP /       \ IBGP
174               +-------+         +-------+
175               | RTR-D |  IBGP   | RTR-E |
176               |  Non- |---------|  Non- |
177               |Client |         |Client |
178               +-------+         +-------+

180                  Figure 3: RR Components

182  5. Operation

184     When a RR receives a route from an IBGP peer, it selects the best
185     path based on its path selection rule. After the best path is
186     selected, it must do the following depending on the type of the peer
187     it is receiving the best path from:

189        1) A Route from a Non-Client IBGP peer

191           Reflect to all the Clients.

193        2) A Route from a Client peer

195           Reflect to all the Non-Client peers and also to the
196           Client peers. (Hence the Client peers are not required
197           to be fully meshed.)

199     An Autonomous System could have many RRs. A RR treats other RRs just
200     like any other internal BGP speakers. A RR could be configured to
201     have other RRs in a Client group or Non-Client group.

203     In a simple configuration the backbone could be divided into many
204     clusters. Each RR would be configured with other RRs as Non-Client
205     peers (thus all the RRs will be fully meshed). The Clients will be
206     configured to maintain IBGP sessions only with the RR in their
207     cluster. Due to route reflection, all the IBGP speakers will receive
208     reflected routing information.

210     It is possible in an Autonomous System to have BGP speakers that do
211     not understand the concept of Route Reflectors (let us call them
212     conventional BGP speakers). The Route Reflector scheme allows such
213     conventional BGP speakers to co-exist. Conventional BGP speakers
214     could be either members of a Non-Client group or a Client group. This
215     allows for an easy and gradual migration from the current IBGP model
216     to the Route Reflection model. One could start creating clusters by
217     configuring a single router as the designated RR and configuring
218     other RRs and their clients as normal IBGP peers. Additional clusters
219     can be created gradually.

221  6. Redundant RRs

223     Usually a cluster of clients will have a single RR. In that case, the
224     cluster will be identified by the ROUTER_ID of the RR. However, this
225     represents a single point of failure. To make it possible to have
226     multiple RRs in the same cluster, all RRs in the same cluster can be
227     configured with a 4-byte CLUSTER_ID so that an RR can discard routes
228     from other RRs in the same cluster.

230  7. Avoiding Routing Information Loops

232     When a route is reflected, it is possible through misconfiguration
233     to form route re-distribution loops. The Route Reflection method
234     defines the following attributes to detect and avoid routing
235     information loops:

237     ORIGINATOR_ID

239     ORIGINATOR_ID is a new optional, non-transitive BGP attribute of Type
240     code 9. This attribute is 4 bytes long and it will be created by a RR
241     in reflecting a route. This attribute will carry the ROUTER_ID of
242     the originator of the route in the local AS. A BGP speaker should not
243     create an ORIGINATOR_ID attribute if one already exists. A router
244     should ignore a route received with its ROUTER_ID as the
245     ORIGINATOR_ID.

247     CLUSTER_LIST

249     CLUSTER_LIST is a new optional, non-transitive BGP attribute of Type
250     code 10. It is a sequence of CLUSTER_ID values representing the
251     reflection path that the route has passed. It is encoded as follows:

253      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
254     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
255     |  Attr. Flags  |Attr. Type Code|    Length     | value ...
256     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

258     Where Length is the number of octets.
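
As an illustration only, the attribute layout above can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the draft: the function names are hypothetical, identifiers are written as dotted quads for readability (the text only says they are 4 bytes), the flags octet is taken to be 0x80 (optional bit set, transitive bit clear), and only the one-octet Length form shown in the figure is handled.

```python
import ipaddress
import struct

# Assumption: flags octet with only the "optional" bit set (non-transitive).
FLAG_OPTIONAL = 0x80
TYPE_ORIGINATOR_ID = 9   # Type code given in the text
TYPE_CLUSTER_LIST = 10   # Type code given in the text


def _id_to_bytes(dotted):
    """Convert a 4-byte identifier written as a dotted quad to 4 octets."""
    return ipaddress.IPv4Address(dotted).packed


def encode_originator_id(router_id):
    """Flags, Type Code, Length, then the 4-octet ROUTER_ID."""
    value = _id_to_bytes(router_id)
    return struct.pack("!BBB", FLAG_OPTIONAL, TYPE_ORIGINATOR_ID, len(value)) + value


def encode_cluster_list(cluster_ids):
    """Flags, Type Code, Length, then a sequence of 4-octet CLUSTER_ID values."""
    value = b"".join(_id_to_bytes(c) for c in cluster_ids)
    if len(value) > 255:
        raise ValueError("this sketch handles only the one-octet Length form")
    return struct.pack("!BBB", FLAG_OPTIONAL, TYPE_CLUSTER_LIST, len(value)) + value


def decode_cluster_list(attr):
    """Return the CLUSTER_ID values carried in an encoded CLUSTER_LIST."""
    flags, type_code, length = struct.unpack("!BBB", attr[:3])
    if type_code != TYPE_CLUSTER_LIST or length % 4 or len(attr) != 3 + length:
        raise ValueError("not a well-formed CLUSTER_LIST")
    value = attr[3:]
    return [str(ipaddress.IPv4Address(value[i:i + 4])) for i in range(0, length, 4)]
```

Length counts only the octets of the value field, so a CLUSTER_LIST carrying two CLUSTER_ID values has Length 8.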

260     When a RR reflects a route, it must prepend the local CLUSTER_ID to
261     the CLUSTER_LIST. If the CLUSTER_LIST is empty, it must create a new
262     one. Using this attribute an RR can identify if the routing
263     information is looped back to the same cluster due to
264     misconfiguration. If the local CLUSTER_ID is found in the
265     CLUSTER_LIST, the advertisement received will be ignored.

267  8. Implementation Considerations

269     Care should be taken to make sure that none of the BGP path
270     attributes defined above can be modified through configuration when
271     exchanging internal routing information between RRs and Clients and
272     Non-Clients. Their modification could potentially result in routing
273     loops.

275     In addition, when a RR reflects a route, it should not modify the
276     following path attributes: NEXT_HOP, AS_PATH, LOCAL_PREF, and MED.
277     Their modification could potentially result in routing loops.

279  9. Configuration and Deployment Considerations

281     The BGP protocol provides no way for a Client to identify itself
282     dynamically as a Client of an RR. The simplest way to achieve this
283     is by manual configuration.

285     One of the key components of the route reflection approach in
286     addressing the scaling issue is that the RR summarizes routing
287     information and only reflects its best path.

289     Both MEDs and IGP metrics may impact the BGP route selection.
290     Because MEDs are not always comparable and the IGP metric may differ
291     for each router, with certain route reflection topologies the route
292     reflection approach may not yield the same route selection result as
293     that of the full IBGP mesh approach. A way to make route selection
294     the same as it would be with the full IBGP mesh approach is to make
295     sure that route reflectors are never forced to perform the BGP route
296     selection based on IGP metrics which are significantly different from
297     the IGP metrics of their clients, or based on incomparable MEDs. The
298     former can be achieved by configuring the intra-cluster IGP metrics
299     to be better than the inter-cluster IGP metrics, and maintaining full
300     mesh within the cluster. The latter can be achieved by:

302        o  setting the local preference of a route at the border router
303           to reflect the MED values.

305        o  or by making sure the AS-path lengths from different ASs are
306           different when the AS-path length is used as a route
307           selection criterion.

309        o  or by configuring community-based policies that the
310           reflector can use to decide on the best route.

312     One could argue though that the latter requirement is overly
313     restrictive, and perhaps impractical in some cases. One could
314     further argue that as long as there are no routing loops, there are
315     no compelling reasons to force route selection with route reflectors
316     to be the same as it would be with the full IBGP mesh approach.

318     To prevent routing loops and maintain a consistent routing view, it is
319     essential that the network topology be carefully considered in
320     designing a route reflection topology. In general, the route
321     reflection topology should be congruent with the network topology when
322     there exist multiple paths for a prefix. One commonly used approach
323     is the POP-based reflection, in which each POP maintains its own
324     route reflectors serving clients in the POP, and all route reflectors
325     are fully meshed. In addition, clients of the reflectors in each POP
326     are often fully meshed for the purpose of optimal intra-POP routing,
327     and the intra-POP IGP metrics are configured to be better than the
328     inter-POP IGP metrics.

330  10. Security

332     This extension to BGP does not change the underlying security issues
333     inherent in the existing IBGP.

335  11. Acknowledgments

337     The authors would like to thank Dennis Ferguson, John Scudder, Paul
338     Traina and Tony Li for the many discussions resulting in this work.
339     This idea was developed from an earlier discussion between Tony Li
340     and Dimitri Haskin.

342     In addition, the authors would like to acknowledge valuable review
343     and suggestions from Yakov Rekhter on this document, and helpful
344     comments from Tony Li, Rohit Dube, and John Scudder on Section 9.

346  12. References

348     [1] Rekhter, Y., and Li, T., "A Border Gateway Protocol 4 (BGP-4)",
349         RFC 1771, March 1995.

351     [2] Haskin, D., "A BGP/IDRP Route Server alternative to a full mesh
352         routing", RFC 1863, October 1995.

354     [3] Traina, P., "Limited Autonomous System Confederations for BGP",
355         RFC 1965, June 1996.

357  13. Authors' Addresses

359     Tony Bates
360     Cisco Systems
361     170 West Tasman Drive

363     email: tbates@cisco.com

365     Ravishanker Chandrasekeran
366     (Ravi Chandra)
367     Cisco Systems
368     170 West Tasman Drive
369     San Jose, CA 95134

371     email: rchandra@cisco.com

373     Enke Chen
374     Cisco Systems
375     170 West Tasman Drive
376     San Jose, CA 95134

378     email: enkechen@cisco.com