idnits 2.17.1

draft-ietf-idr-route-reflect-v2-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 10 pages

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([2,3], [1]), which it
     shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 37 has weird spacing: '...as been well...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.
     If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal
     Provisions document at https://trustee.ietf.org/license-info for more
     information.)

  -- The document date (September 1999) is 8989 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 1771 (ref. '1') (Obsoleted by RFC 4271)

  ** Obsolete normative reference: RFC 1863 (ref. '2') (Obsoleted by RFC 4223)

  ** Obsolete normative reference: RFC 1965 (ref. '3') (Obsoleted by RFC 3065)

  ** Obsolete normative reference: RFC 1966 (ref. '4') (Obsoleted by RFC 4456)

  ** Obsolete normative reference: RFC 2385 (ref. '5') (Obsoleted by RFC 5925)

  Summary: 11 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

  Run idnits with the --verbose option for more detailed information about
  the items above.

--------------------------------------------------------------------------------

  1 INTERNET-DRAFT                                              Tony Bates
  2                                                           Ravi Chandra
  3                                                              Enke Chen
  4                                                          Cisco Systems
  5                                                         September 1999

  7                        BGP Route Reflection -
  8                   An Alternative to Full Mesh IBGP
  9

 11 Status of this Memo

 13    This document is an Internet-Draft and is in full conformance with
 14    all provisions of Section 10 of RFC2026. Internet-Drafts are working
 15    documents of the Internet Engineering Task Force (IETF), its areas,
 16    and its working groups. Note that other groups may also distribute
 17    working documents as Internet-Drafts.

 19    Internet-Drafts are draft documents valid for a maximum of six months
 20    and may be updated, replaced, or obsoleted by other documents at any
 21    time. It is inappropriate to use Internet-Drafts as reference
 22    material or to cite them other than as "work in progress."

 24    The list of current Internet-Drafts can be accessed at
 25    http://www.ietf.org/ietf/1id-abstracts.txt

 27    The list of Internet-Draft Shadow Directories can be accessed at
 28    http://www.ietf.org/shadow.html.

 30 Abstract

 32    The Border Gateway Protocol [1] is an inter-autonomous system routing
 33    protocol designed for TCP/IP internets. Currently in the Internet BGP
 34    deployments are configured such that all BGP speakers within a
 35    single AS must be fully meshed so that any external routing
 36    information must be re-distributed to all other routers within that
 37    AS. This represents a serious scaling problem that has been well
 38    documented with several alternatives proposed [2,3].

 40    This document describes the use and design of a method known as
 41    "Route Reflection" to alleviate the need for "full mesh" IBGP.

 43 1. Introduction

 45    Currently in the Internet, BGP deployments are configured such that
 46    all BGP speakers within a single AS must be fully meshed and any
 47    external routing information must be re-distributed to all other
 48    routers within that AS. For n BGP speakers within an AS, this
 49    requires maintaining n*(n-1)/2 unique IBGP sessions. This "full
 50    mesh" requirement clearly does not scale when there are a large
 51    number of IBGP speakers each exchanging a large volume of routing
 52    information, as is common in many of today's Internet networks.

 54    This scaling problem has been well documented and a number of
 55    proposals have been made to alleviate this [2,3]. This document
 56    represents another alternative in alleviating the need for a "full
 57    mesh" and is known as "Route Reflection". This approach allows a BGP
 58    speaker (known as "Route Reflector") to advertise IBGP learned routes
 59    to certain IBGP peers. It represents a change in the commonly
 60    understood concept of IBGP, and the addition of two new optional
 61    non-transitive BGP attributes to prevent loops in routing updates.
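
As a rough illustration of the n*(n-1)/2 growth noted in the Introduction, the sketch below compares the session count of a full mesh with that of a single route reflector serving every other speaker. The function names are hypothetical, and the n-1 figure assumes the simplest topology (one RR, all other speakers as its clients); real deployments with several clusters fall somewhere in between.

```python
def full_mesh_sessions(n: int) -> int:
    """IBGP sessions needed to fully mesh n speakers: n*(n-1)/2."""
    return n * (n - 1) // 2


def single_rr_sessions(n: int) -> int:
    """Sessions when one speaker reflects for the other n-1.

    Assumption: a single RR with every other speaker as a client.
    """
    return max(n - 1, 0)


for n in (3, 10, 100):
    # Columns: speakers, full-mesh sessions, single-RR sessions
    print(n, full_mesh_sessions(n), single_rr_sessions(n))
# prints:
# 3 3 2
# 10 45 9
# 100 4950 99
```

The n = 3 row corresponds to the three-router example used in Section 3: three sessions in the full mesh, two once one router reflects.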

 63    This document is a revision of RFC1966 [4], and it includes editorial
 64    changes, clarifications and corrections based on the deployment
 65    experience with route reflection.

 67 2. Design Criteria

 69    Route Reflection was designed to satisfy the following criteria.

 71       o Simplicity

 73         Any alternative must be simple both to configure and to
 74         understand.

 76       o Easy Transition

 78         It must be possible to transition from a full mesh
 79         configuration without the need to change either topology
 80         or AS. This is an unfortunate management overhead of the
 81         technique proposed in [3].

 83       o Compatibility

 85         It must be possible for non-compliant IBGP peers
 86         to continue to be part of the original AS or domain
 87         without any loss of BGP routing information.

 89    These criteria were motivated by operational experiences of a very
 90    large and topology-rich network with many external connections.

 92 3. Route Reflection

 94    The basic idea of Route Reflection is very simple. Let us consider
 95    the simple example depicted in Figure 1 below.

 97             +-------+        +-------+
 98             |       |  IBGP  |       |
 99             | RTR-A |--------| RTR-B |
100             |       |        |       |
101             +-------+        +-------+
102                   \            /
103               IBGP \   ASX    / IBGP
104                     \        /
105                      +-------+
106                      |       |
107                      | RTR-C |
108                      |       |
109                      +-------+

111                 Figure 1: Full Mesh IBGP

113    In ASX there are three IBGP speakers (routers RTR-A, RTR-B and
114    RTR-C). With the existing BGP model, if RTR-A receives an external
115    route and it is selected as the best path it must advertise the
116    external route to both RTR-B and RTR-C. RTR-B and RTR-C (as IBGP
117    speakers) will not re-advertise these IBGP learned routes to other
118    IBGP speakers.

120    If this rule is relaxed and RTR-C is allowed to advertise IBGP
121    learned routes to IBGP peers, then it could re-advertise (or reflect)
122    the IBGP routes learned from RTR-A to RTR-B and vice versa. This
123    would eliminate the need for the IBGP session between RTR-A and RTR-B
124    as shown in Figure 2 below.

126             +-------+        +-------+
127             |       |        |       |
128             | RTR-A |        | RTR-B |
129             |       |        |       |
130             +-------+        +-------+
131                   \            /
132               IBGP \   ASX    / IBGP
133                     \        /
134                      +-------+
135                      |       |
136                      | RTR-C |
137                      |       |
138                      +-------+

140              Figure 2: Route Reflection IBGP

142    The Route Reflection scheme is based upon this basic principle.

144 4. Terminology and Concepts

146    We use the term "Route Reflection" to describe the operation of a BGP
147    speaker advertising an IBGP learned route to another IBGP peer. Such
148    a BGP speaker is said to be a "Route Reflector" (RR), and such a
149    route is said to be a reflected route.

151    The internal peers of a RR are divided into two groups:

153            1) Client Peers

155            2) Non-Client Peers

157    A RR reflects routes between these groups, and may reflect routes
158    among client peers. A RR along with its client peers forms a Cluster.
159    The Non-Client peers must be fully meshed but the Client peers need
160    not be fully meshed. Figure 3 depicts a simple example outlining the
161    basic RR components using the terminology noted above.

163             / - - - - - - - - - - - - - -
164             |           Cluster           |
165               +-------+        +-------+
166             | |       |        |       |  |
167               | RTR-A |        | RTR-B |
168             | |Client |        |Client |  |
169               +-------+        +-------+
170             |      \             /        |
171               IBGP  \           / IBGP
172             |        \         /          |
173                       +-------+
174             |         |       |           |
175                       | RTR-C |
176             |         |  RR   |           |
177                       +-------+
178             |           /   \             |
179              - - - - - /- - -\- - - - - - /
180                  IBGP /       \ IBGP
181              +-------+         +-------+
182              | RTR-D |  IBGP   | RTR-E |
183              |  Non- |---------|  Non- |
184              |Client |         |Client |
185              +-------+         +-------+

187                 Figure 3: RR Components

189 5. Operation

191    When a RR receives a route from an IBGP peer, it selects the best
192    path based on its path selection rule. After the best path is
193    selected, it must do the following depending on the type of the peer
194    it is receiving the best path from:

196       1) A Route from a Non-Client IBGP peer

198          Reflect to all the Clients.

200       2) A Route from a Client peer

202          Reflect to all the Non-Client peers and also to the
203          Client peers.
             (Hence the Client peers are not required
204          to be fully meshed.)

206    An Autonomous System could have many RRs. A RR treats other RRs just
207    like any other internal BGP speakers. A RR could be configured to
208    have other RRs in a Client group or Non-Client group.

210    In a simple configuration the backbone could be divided into many
211    clusters. Each RR would be configured with other RRs as Non-Client
212    peers (thus all the RRs will be fully meshed). The Clients will be
213    configured to maintain IBGP sessions only with the RR in their
214    cluster. Due to route reflection, all the IBGP speakers will receive
215    reflected routing information.

217    It is possible in an Autonomous System to have BGP speakers that do
218    not understand the concept of Route-Reflectors (let us call them
219    conventional BGP speakers). The Route-Reflector Scheme allows such
220    conventional BGP speakers to co-exist. Conventional BGP speakers
221    could be either members of a Non-Client group or a Client group. This
222    allows for an easy and gradual migration from the current IBGP model
223    to the Route Reflection model. One could start creating clusters by
224    configuring a single router as the designated RR and configuring
225    other RRs and their clients as normal IBGP peers. Additional clusters
226    can be created gradually.

228 6. Redundant RRs

230    Usually a cluster of clients will have a single RR. In that case, the
231    cluster will be identified by the ROUTER_ID of the RR. However, this
232    represents a single point of failure. To make it possible to have
233    multiple RRs in the same cluster, all RRs in the same cluster can be
234    configured with a 4-byte CLUSTER_ID so that an RR can discard routes
235    from other RRs in the same cluster.

237 7. Avoiding Routing Information Loops

239    When a route is reflected, it is possible through mis-configuration
240    to form route re-distribution loops.
       The Route Reflection method
241    defines the following attributes to detect and avoid routing
242    information loops:

244    ORIGINATOR_ID

246    ORIGINATOR_ID is a new optional, non-transitive BGP attribute of Type
247    code 9. This attribute is 4 bytes long and it will be created by a RR
248    in reflecting a route. This attribute will carry the ROUTER_ID of
249    the originator of the route in the local AS. A BGP speaker should not
250    create an ORIGINATOR_ID attribute if one already exists. A router
251    which recognizes the ORIGINATOR_ID attribute should ignore a route
252    received with its ROUTER_ID as the ORIGINATOR_ID.

254    CLUSTER_LIST

256    CLUSTER_LIST is a new optional, non-transitive BGP attribute of Type
257    code 10. It is a sequence of CLUSTER_ID values representing the
258    reflection path that the route has passed. It is encoded as follows:

260         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
261        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
262        |  Attr. Flags  |Attr. Type Code|   Length      | value ...
263        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

265    Where Length is the number of octets.

267    When a RR reflects a route, it must prepend the local CLUSTER_ID to
268    the CLUSTER_LIST. If the CLUSTER_LIST is empty, it must create a new
269    one. Using this attribute an RR can identify if the routing
270    information is looped back to the same cluster due to
271    mis-configuration. If the local CLUSTER_ID is found in the
272    CLUSTER_LIST, the advertisement received should be ignored.

274 8. Implementation Considerations

276    Care should be taken to make sure that none of the BGP path
277    attributes defined above can be modified through configuration when
278    exchanging internal routing information between RRs and Clients and
279    Non-Clients. Their modification could potentially result in routing
280    loops.
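
The reflection rules of Section 5 and the ORIGINATOR_ID and CLUSTER_LIST handling of Section 7 can be sketched as follows. This is a minimal illustration, not an implementation of any particular router: the Route class and all function names are hypothetical, identifiers are written as dotted quads purely for readability, the sender is excluded from the reflection targets as an assumption, ORIGINATOR_ID is assumed to be set to the ROUTER_ID of the IBGP peer the route was learned from when absent, and the encoder assumes the one-octet Length form (flags 0x80 for optional, non-transitive).

```python
import ipaddress
import struct
from dataclasses import dataclass, replace
from typing import Optional, Set, Tuple

OPTIONAL_NON_TRANSITIVE = 0x80  # optional bit set, transitive bit clear
CLUSTER_LIST_TYPE = 10          # Type code 10 per Section 7


@dataclass(frozen=True)
class Route:
    """Hypothetical route record carrying only the two loop attributes."""
    prefix: str
    originator_id: Optional[str] = None   # ORIGINATOR_ID, absent until reflected
    cluster_list: Tuple[str, ...] = ()    # CLUSTER_LIST, most recent cluster first


def reflect_targets(sender: str, clients: Set[str], non_clients: Set[str]) -> Set[str]:
    """Section 5: choose the peers that receive a reflected best path."""
    if sender in clients:
        # From a Client: Non-Clients and Clients (sender excluded by assumption).
        return (clients | non_clients) - {sender}
    # From a Non-Client: all Clients.
    return set(clients)


def reflect(route: Route, sender_router_id: str, cluster_id: str) -> Route:
    """Section 7: keep an existing ORIGINATOR_ID, prepend the local CLUSTER_ID."""
    originator = route.originator_id or sender_router_id
    return replace(route, originator_id=originator,
                   cluster_list=(cluster_id,) + route.cluster_list)


def accept(route: Route, router_id: str, cluster_id: str) -> bool:
    """Section 7: ignore routes carrying our own ROUTER_ID or CLUSTER_ID."""
    if route.originator_id == router_id:
        return False
    return cluster_id not in route.cluster_list


def encode_cluster_list(cluster_list: Tuple[str, ...]) -> bytes:
    """Encode flags, type code, Length in octets, then 4-byte CLUSTER_IDs."""
    value = b"".join(ipaddress.IPv4Address(c).packed for c in cluster_list)
    return struct.pack("!BBB", OPTIONAL_NON_TRANSITIVE, CLUSTER_LIST_TYPE,
                       len(value)) + value


# Topology of Figure 3, with made-up ROUTER_IDs.
clients, non_clients = {"RTR-A", "RTR-B"}, {"RTR-D", "RTR-E"}
reflected = reflect(Route("192.0.2.0/24"), "10.0.0.1", "10.0.0.3")
print(sorted(reflect_targets("RTR-A", clients, non_clients)))
print(reflected.cluster_list, accept(reflected, "10.0.0.3", "10.0.0.3"))
```

In this sketch a second RR configured with the same CLUSTER_ID would reject the reflected route, which is the behaviour Section 6 relies on for redundant RRs.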

282    In addition, when a RR reflects a route, it should not modify the
283    following path attributes: NEXT_HOP, AS_PATH, LOCAL_PREF, and MED.
284    Their modification could potentially result in routing loops.

286 9. Configuration and Deployment Considerations

288    The BGP protocol provides no way for a Client to identify itself
289    dynamically as a Client of an RR. The simplest way to achieve this
290    is by manual configuration.

292    One of the key components of the route reflection approach in
293    addressing the scaling issue is that the RR summarizes routing
294    information and only reflects its best path.

296    Both MEDs and IGP metrics may impact the BGP route selection.
297    Because MEDs are not always comparable and the IGP metric may differ
298    for each router, with certain route reflection topologies the route
299    reflection approach may not yield the same route selection result as
300    that of the full IBGP mesh approach. A way to make route selection
301    the same as it would be with the full IBGP mesh approach is to make
302    sure that route reflectors are never forced to perform the BGP route
303    selection based on IGP metrics which are significantly different from
304    the IGP metrics of their clients, or based on incomparable MEDs. The
305    former can be achieved by configuring the intra-cluster IGP metrics
306    to be better than the inter-cluster IGP metrics, and maintaining a
307    full mesh within the cluster. The latter can be achieved by:

309       o  setting the local preference of a route at the border router
310          to reflect the MED values.

312       o  or by making sure the AS-path lengths from different ASs are
313          different when the AS-path length is used as a route
314          selection criterion.

316       o  or by configuring community-based policies with which the
317          reflector can decide on the best route.

319    One could argue though that the latter requirement is overly
320    restrictive, and perhaps impractical in some cases.
       One could
321    further argue that as long as there are no routing loops, there are
322    no compelling reasons to force route selection with route reflectors
323    to be the same as it would be with the full IBGP mesh approach.

325    To prevent routing loops and maintain a consistent routing view, it
326    is essential that the network topology be carefully considered in
327    designing a route reflection topology. In general, the route
328    reflection topology should be congruent with the network topology
329    when there exist multiple paths for a prefix. One commonly used
330    approach is the POP-based reflection, in which each POP maintains its
331    own route reflectors serving clients in the POP, and all route
332    reflectors are fully meshed. In addition, clients of the reflectors
333    in each POP are often fully meshed for the purpose of optimal
334    intra-POP routing, and the intra-POP IGP metrics are configured to be
335    better than the inter-POP IGP metrics.

337 10. Security

339    This extension to BGP does not change the underlying security issues
340    inherent in the existing IBGP [5].

342 11. Acknowledgments
343    The authors would like to thank Dennis Ferguson, John Scudder, Paul
344    Traina and Tony Li for the many discussions resulting in this work.
345    This idea was developed from an earlier discussion between Tony Li
346    and Dimitri Haskin.

348    In addition, the authors would like to acknowledge valuable review
349    and suggestions from Yakov Rekhter on this document, and helpful
350    comments from Tony Li, Rohit Dube, and John Scudder on Section 9, and
351    from Bruce Cole.

353 12. References

355    [1] Rekhter, Y., and Li, T., "A Border Gateway Protocol 4 (BGP-4)",
356        RFC1771, March 1995.

358    [2] Haskin, D., "A BGP/IDRP Route Server alternative to a full mesh
359        routing", RFC1863, October 1995.

361    [3] Traina, P., "Limited Autonomous System Confederations for BGP",
362        RFC1965, June 1996.

364    [4] Bates, T., and Chandra, R., "BGP Route Reflection An alternative
365        to full mesh IBGP", RFC1966, June 1996.

367    [5] Heffernan, A., "Protection of BGP Sessions via the TCP MD5
368        Signature Option", RFC2385, August 1998.

370 13. Authors' Addresses

372    Tony Bates
373    Cisco Systems
374    170 West Tasman Drive

376    email: tbates@cisco.com

378    Ravishanker Chandrasekeran
379    (Ravi Chandra)
380    Cisco Systems
381    170 West Tasman Drive
382    San Jose, CA 95134

384    email: rchandra@cisco.com

386    Enke Chen
387    Cisco Systems
388    170 West Tasman Drive
389    San Jose, CA 95134

391    email: enkechen@cisco.com