idnits 2.17.1 draft-ietf-idr-bgp-analysis-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1.a on line 19. -- Found old boilerplate from RFC 3978, Section 5.5 on line 747. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 724. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 731. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 737. ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line 753), which is fine, but *also* found old RFC 2026, Section 10.4C, paragraph 1 text on line 44. ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced: This document is an Internet-Draft and is subject to all provisions of Section 3 of RFC 3667. 
By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard == It seems as if not all pages are separated by form feeds - found 0 form feeds but 20 pages Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 2 instances of too long lines in the document, the longest one being 3 characters in excess of 72. ** The abstract seems to contain references ([RFC1264], [RFC1774]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Line 456 has weird spacing: '... AS4 is a mul...' == Line 530 has weird spacing: '...tion we ident...' == Line 605 has weird spacing: '...prevent inser...' == Line 610 has weird spacing: '...ructure withi...' == Line 611 has weird spacing: '...for its routi...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) 
-- The document date (December 2004) is 7071 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC 2385' is mentioned on line 583, but not defined ** Obsolete undefined reference: RFC 2385 (Obsoleted by RFC 5925) == Missing Reference: 'RFC 3562' is mentioned on line 584, but not defined == Missing Reference: 'RFC 2406' is mentioned on line 588, but not defined ** Obsolete undefined reference: RFC 2406 (Obsoleted by RFC 4303, RFC 4305) == Missing Reference: 'BGP-VUL' is mentioned on line 607, but not defined == Unused Reference: 'RFC2385' is defined on line 639, but no explicit reference was found in the text == Unused Reference: 'RFC3562' is defined on line 651, but no explicit reference was found in the text == Unused Reference: 'RFC1771' is defined on line 686, but no explicit reference was found in the text == Unused Reference: 'RFC2406' is defined on line 700, but no explicit reference was found in the text -- No information found for draft-ietf-idr-bgp4-2 - is the name correct? -- Possible downref: Normative reference to a draft: ref. 
'BGP4' ** Obsolete normative reference: RFC 1519 (Obsoleted by RFC 4632) ** Obsolete normative reference: RFC 2385 (Obsoleted by RFC 5925) ** Obsolete normative reference: RFC 2842 (Obsoleted by RFC 3392) ** Downref: Normative reference to an Informational RFC: RFC 3345 ** Downref: Normative reference to an Informational RFC: RFC 3562 ** Obsolete normative reference: RFC 3682 (Obsoleted by RFC 5082) == Outdated reference: A later version (-02) exists of draft-white-sobgp-architecture-00 -- Possible downref: Normative reference to a draft: ref. 'SOBGP' -- Possible downref: Non-RFC (?) normative reference: ref. 'SBGP' -- Obsolete informational reference (is this intentional?): RFC 1105 (Obsoleted by RFC 1163) -- Duplicate reference: RFC1105, mentioned in 'RFC1163', was also mentioned in 'RFC1105'. -- Obsolete informational reference (is this intentional?): RFC 1105 (ref. 'RFC1163') (Obsoleted by RFC 1163) -- Obsolete informational reference (is this intentional?): RFC 1264 (Obsoleted by RFC 4794) -- Duplicate reference: RFC1105, mentioned in 'RFC1267', was also mentioned in 'RFC1163'. -- Obsolete informational reference (is this intentional?): RFC 1105 (ref. 'RFC1267') (Obsoleted by RFC 1163) -- Obsolete informational reference (is this intentional?): RFC 1771 (Obsoleted by RFC 4271) -- Obsolete informational reference (is this intentional?): RFC 2406 (Obsoleted by RFC 4303, RFC 4305) Summary: 16 errors (**), 0 flaws (~~), 17 warnings (==), 20 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 INTERNET-DRAFT D. Meyer 3 draft-ietf-idr-bgp-analysis-07.txt K. Patel 4 Category Informational 5 Expires: June 2005 December 2004 7 BGP-4 Protocol Analysis 8 10 Status of this Memo 12 Status of this Memo 14 This document is an Internet-Draft and is subject to all 15 provisions of section 3 of RFC 3667. 
By submitting this 16 Internet-Draft, each author represents that any applicable patent 17 or other IPR claims of which he or she is aware have been or will 18 be disclosed, and any of which he or she become aware will be 19 disclosed, in accordance with RFC 3668. 21 Internet-Drafts are working documents of the Internet 22 Engineering Task Force (IETF), its areas, and its working 23 groups. Note that other groups may also distribute working 24 documents as Internet-Drafts. 26 Internet-Drafts are draft documents valid for a maximum of six 27 months and may be updated, replaced, or obsoleted by other 28 documents at any time. It is inappropriate to use 29 Internet-Drafts as reference material or to cite them other 30 than as "work in progress." 32 The list of current Internet-Drafts can be accessed at 33 http://www.ietf.org/1id-abstracts.txt. 35 The list of Internet-Draft Shadow Directories can be accessed 36 at http://www.ietf.org/shadow.html. 38 This document is a product of the IDR Working Group 39 WG. Comments should be addressed to the authors, or the 40 mailing list at idr@ietf.org. 42 Copyright Notice 44 Copyright (C) The Internet Society (2004). All Rights Reserved. 46 Abstract 48 The purpose of this report is to document how the requirements for 49 advancing a routing protocol from Draft Standard to full Standard 50 have been satisfied by Border Gateway Protocol version 4 (BGP-4). 52 This report satisfies the requirement for "the second report", as 53 described in Section 6.0 of [RFC1264]. In order to fulfill the 54 requirement, this report augments [RFC1774] and summarizes the key 55 features of BGP-4 protocol, and analyzes the protocol with respect 56 to scaling and performance. 58 Table of Contents 60 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 61 2. Key Features and algorithms of the BGP protocol. . . . . . . . 4 62 2.1. Key Features. . . . . . . . . . . . . . . . . . . . . . . . 4 63 2.2. BGP Algorithms. . . . . . . . . . . . . . 
. . . . . . . . . 5 64 2.3. BGP Finite State Machine (FSM). . . . . . . . . . . . . . . 6 65 3. BGP Capabilities . . . . . . . . . . . . . . . . . . . . . . . 7 66 4. BGP Persistent Peer Oscillations . . . . . . . . . . . . . . . 7 67 5. Implementation Guidelines. . . . . . . . . . . . . . . . . . . 7 68 6. BGP Performance characteristics and Scalability. . . . . . . . 8 69 6.1. Link bandwidth and CPU utilization. . . . . . . . . . . . . 8 70 6.1.1. CPU utilization. . . . . . . . . . . . . . . . . . . . . 9 71 6.1.2. Memory requirements. . . . . . . . . . . . . . . . . . . 10 72 7. BGP Policy Expressiveness and its Implications . . . . . . . . 11 73 7.1. Existence of Unique Stable Routings . . . . . . . . . . . . 12 74 7.2. Existence of Stable Routings. . . . . . . . . . . . . . . . 13 75 8. Applicability. . . . . . . . . . . . . . . . . . . . . . . . . 14 76 9. Acknowledgments. . . . . . . . . . . . . . . . . . . . . . . . 15 77 10. Security Considerations . . . . . . . . . . . . . . . . . . . 16 78 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 79 12. References. . . . . . . . . . . . . . . . . . . . . . . . . . 17 80 12.1. Normative References . . . . . . . . . . . . . . . . . . . 17 81 12.2. Informative References . . . . . . . . . . . . . . . . . . 18 82 13. Author's Addresses. . . . . . . . . . . . . . . . . . . . . . 19 84 1. Introduction 86 BGP-4 is an inter-autonomous system routing protocol designed for 87 TCP/IP internets. Version 1 of the BGP protocol was published in 88 [RFC1105]. Since then, BGP versions 2, 3, and 4 have been developed. 89 Version 2 was documented in [RFC1163]. Version 3 is documented in 90 [RFC1267]. Version 4 is documented in [BGP4] (version 4 of BGP 91 will hereafter be referred to as BGP). The changes between versions 92 are explained in Appendix A of [BGP4]. Possible applications of BGP 93 in the Internet are documented in [RFC1772]. 
95 BGP introduced support for Classless Inter-Domain Routing 96 [RFC1519]. Since earlier versions of BGP lacked the support 97 for CIDR, they are considered obsolete and unusable in 98 today's Internet. 100 2. Key Features and algorithms of the BGP protocol 102 This section summarizes the key features and algorithms of the 103 BGP protocol. BGP is an inter-autonomous system routing 104 protocol; it is designed to be used between multiple 105 autonomous systems. BGP assumes that routing within an 106 autonomous system is done by an intra-autonomous system 107 routing protocol. BGP also assumes that data packets are 108 routed from source towards destination independent of the 109 source. BGP does not make any assumptions about 110 intra-autonomous system routing protocols deployed within the 111 various autonomous systems. Specifically, BGP does not require 112 all autonomous systems to run the same intra-autonomous system 113 routing protocol (i.e., interior gateway protocol or IGP). 115 Finally, note that BGP is a real inter-autonomous system 116 routing protocol, and as such it imposes no constraints on the 117 underlying interconnect topology of the autonomous 118 systems. The information exchanged via BGP is sufficient to 119 construct a graph of autonomous systems connectivity from 120 which routing loops may be pruned and many routing policy 121 decisions at the autonomous system level may be enforced. 123 2.1. Key Features 125 The key features of the protocol are the notion of path 126 attributes and aggregation of Network Layer Reachability 127 Information (NLRI). 129 Path attributes provide BGP with flexibility and 130 extensibility. Path attributes are partitioned into well-known 131 and optional. The provision for optional attributes allows 132 experimentation that may involve a group of BGP routers 133 without affecting the rest of the Internet. 
New optional 134 attributes can be added to the protocol in much the same way 135 that new options are added to, say, the Telnet protocol [RFC854]. 137 One of the most important path attributes is the Autonomous 138 System Path, or AS_PATH. As the reachability information 139 traverses the Internet, this (AS_PATH) information is 140 augmented by the list of autonomous systems that have been 141 traversed thus far, forming the AS_PATH. The AS_PATH allows 142 straightforward suppression of the looping of routing 143 information. In addition, the AS_PATH serves as a powerful and 144 versatile mechanism for policy-based routing. 146 BGP enhances the AS_PATH attribute to include sets of autonomous 147 systems as well as lists via the AS_SET path segment type. This extended 148 format allows generated aggregate routes to carry path information 149 from the more specific routes used to generate the aggregate. It 150 should be noted, however, that as of this writing, AS_SETs are 151 rarely used in the Internet [ROUTEVIEWS]. 153 2.2. BGP Algorithms 155 BGP uses an algorithm that is neither a pure distance vector 156 algorithm nor a pure link state algorithm. It is instead a 157 modified distance vector algorithm referred to as a "Path 158 Vector" algorithm that uses path information to avoid 159 traditional distance vector problems. Each route within BGP 160 pairs a destination with path information to that 161 destination. Path information (also known as AS_PATH 162 information) is stored within the AS_PATH attribute in 163 BGP. The path information assists BGP in detecting AS loops, 164 thereby allowing BGP speakers to select loop-free routes. 166 BGP uses an incremental update strategy in order to conserve 167 bandwidth and processing power. That is, after initial 168 exchange of complete routing information, a pair of BGP 169 routers exchanges only changes to that information. 
Such an 170 incremental update design requires reliable transport between 171 a pair of BGP routers to function correctly. BGP solves this 172 problem by using TCP for reliable transport. 174 In addition to incremental updates, BGP has added the concept 175 of route aggregation so that information about groups of 176 destinations 177 that use hierarchical address assignment (e.g., CIDR) may be 178 aggregated and sent as a single Network Layer Reachability 179 Information (NLRI) entry. 181 Finally, note that BGP is a self-contained protocol. That is, 182 BGP specifies how routing information is exchanged both 183 between BGP speakers in different autonomous systems, and 184 between BGP speakers within a single autonomous system. 186 2.3. BGP Finite State Machine (FSM) 188 The BGP FSM is a set of rules that are applied to a BGP 189 speaker's set of configured peers for the BGP operation. A BGP 190 implementation requires that a BGP speaker connect to and 191 listen on TCP port 179 to accept any new BGP connections 192 from its peers. The BGP Finite State Machine, or FSM, must be 193 initiated and maintained for each new incoming and outgoing 194 peer connection. However, in steady state operation, there 195 will be only one BGP FSM per connection per peer. 197 There may be a short period during which a BGP peer may have 198 separate incoming and outgoing connections resulting in the 199 creation of two different BGP FSMs relating to a peer (instead 200 of one). This can be resolved by following the BGP connection 201 collision rules defined in the [BGP4] specification. 203 The BGP FSM has the following states associated with each of its 204 peers: 206 IDLE: State when BGP peer refuses any incoming 207 connections. 209 CONNECT: State in which BGP peer is waiting for 210 its TCP connection to be completed. 212 ACTIVE: State in which BGP peer is trying to acquire a 213 peer by listening for and accepting a TCP connection. 
215 OPENSENT: BGP peer is waiting for OPEN message from its 216 peer. 218 OPENCONFIRM: BGP peer is waiting for KEEPALIVE or NOTIFICATION 219 message from its peer. 221 ESTABLISHED: BGP peer connection is established and exchanges 222 UPDATE, NOTIFICATION, and KEEPALIVE messages with 223 its peer. 225 There are a number of BGP events that operate on the 226 above-mentioned states of the BGP FSM for BGP peers. Support of 227 these BGP events is either mandatory or optional. They are 228 triggered by the protocol logic as part of BGP or by 229 operator intervention via a configuration interface to the BGP 230 protocol. 232 These BGP events are of the following types: Optional events 233 linked to Optional Session attributes, Administrative Events, 234 Timer Events, TCP Connection based Events, and BGP 235 Message-based Events. Both the FSM and the BGP events are 236 explained in detail in [BGP4]. 238 3. BGP Capabilities 240 The BGP Capability mechanism [RFC2842] provides an easy and flexible 241 way to introduce new features within the protocol. In particular, the 242 BGP capability mechanism allows a BGP speaker to advertise to its 243 peers during startup various optional features supported by the 244 speaker (and receive similar information from the peers). This allows 245 the base BGP protocol to contain only essential functionality, while 246 at the same time providing a flexible mechanism for signaling 247 protocol extensions. 249 4. BGP Persistent Peer Oscillations 251 Whenever a BGP speaker detects an error in any peer connection, it 252 shuts down the peer and changes its FSM state to IDLE. A BGP speaker 253 requires a Start event to re-initiate its idle peer connection. If 254 the error remains persistent and the BGP speaker generates a Start event 255 automatically, then it may result in persistent peer flapping. 
256 However, although peer oscillation is found to be widespread in BGP 257 implementations, methods for preventing persistent peer oscillations 258 are outside the scope of the base BGP protocol specification. 260 5. Implementation Guidelines 262 A robust BGP implementation is work conserving. This means that if 263 the number of prefixes is bounded, arbitrarily high levels of route 264 change can be tolerated with bounded impact on route convergence for 265 occasional changes in generally stable routes. 267 A robust implementation of BGP should have the following 268 characteristics: 269 1. It is able to operate in almost arbitrarily high levels 270 of route flap without losing peerings (failing to send 271 keepalives) or losing other protocol adjacencies as a 272 result of BGP load. 274 2. Instability of a subset of routes should not affect the 275 route advertisements or forwarding associated with the set 276 of stable routes. 278 3. High levels of instability and peers of different CPU speed 279 or load resulting in faster or slower processing of routes 280 should not cause instability and should have a bounded 281 impact on the convergence time for generally stable routes. 283 Numerous robust BGP implementations exist. Producing a robust 284 implementation is not a trivial matter but is clearly achievable. 286 6. BGP Performance characteristics and Scalability 288 In this section, we provide "order of magnitude" answers to the 289 questions of how much link bandwidth, router memory and router CPU 290 cycles the BGP protocol will consume under normal conditions. In 291 particular, we will address the scalability of BGP and its 292 limitations. 294 6.1. Link bandwidth and CPU utilization 296 Immediately after the initial BGP connection setup, BGP peers 297 exchange the complete set of routing information. 
If we denote the total 298 number of routes in the Internet by N, the total path attributes (for 299 all N routes) received from a peer as A, and assume that the networks 300 are uniformly distributed among the autonomous systems, then the 301 worst case amount of bandwidth consumed during the initial exchange 302 between a pair of BGP speakers (P) is 303 BW = O((N + A) * P) 305 BGP-4 was created specifically to reduce the size of the set 306 of NLRI entries which have to be carried and exchanged by 307 border routers. [RFC1519] defines the 308 provider-based aggregation scheme in use in 309 today's Internet. 311 Due to the advantages of advertising a few large aggregate blocks 312 instead of many smaller class-based individual networks, it is 313 difficult to estimate the actual reduction in bandwidth and 314 processing that BGP-4 has provided over BGP-3. If we simply 315 enumerate all aggregate blocks into their individual class-based 316 networks, we would not take into account "dead" space that has been 317 reserved for future expansion. The best metric for determining the 318 success of BGP's aggregation is to sample the number of NLRI entries in 319 the globally connected Internet today and compare it to projected 320 growth rates before BGP was deployed. 322 At the time of this writing, the full set of exterior routes carried 323 by BGP is approximately 134,000 network entries [ROUTEVIEWS]. 325 6.1.1. CPU utilization 327 An important and fundamental feature of BGP is that BGP's CPU 328 utilization depends only on the stability of its network which 329 relates to BGP in terms of BGP UPDATE message announcements. If the 330 BGP network is stable (all the BGP routers within its network are in 331 the steady state), then the only link bandwidth and router CPU cycles 332 consumed by BGP are due to the exchange of the BGP KEEPALIVE 333 messages. The KEEPALIVE messages are exchanged only between peers. 
334 The suggested frequency of the exchange is 30 seconds. The KEEPALIVE 335 messages are quite short (19 octets), and require virtually no 336 processing. As a result, the bandwidth consumed by the KEEPALIVE 337 messages is about 5 bits/sec. Operational experience confirms that 338 the overhead (in terms of bandwidth and CPU) associated with the 339 KEEPALIVE messages should be viewed as negligible. 341 During periods of network instability, BGP routers within the network 342 are generating routing updates that are exchanged using the BGP 343 UPDATE messages. The greatest overhead per UPDATE message occurs 344 when each UPDATE message contains only a single network. It should be 345 pointed out that in practice routing changes exhibit strong locality 346 with respect to the route attributes. That is, routes that change 347 are likely to have common route attributes. In this case, multiple 348 networks can be grouped into a single UPDATE message, thus 349 significantly reducing the amount of bandwidth required (see also 350 Appendix F.1 of [BGP4]). 352 6.1.2. Memory requirements 354 To quantify the worst case memory requirements for BGP, we denote the 355 total number of networks in the Internet by N, the mean AS distance 356 of the Internet by M (distance at the level of an autonomous system, 357 expressed in terms of the number of autonomous systems), the total 358 number of unique AS paths as A. Then the worst case memory 359 requirements (MR) can be expressed as 361 MR = O(N + (M * A)) 363 Since a mean AS distance M is a slow moving function of the 364 interconnectivity ("meshiness") of the Internet, for all practical 365 purposes the worst case router memory requirements are on the order 366 of the total number of networks in the Internet times the number of 367 peers the local system is peering with. We expect that the total 368 number of networks in the Internet will grow much faster than the 369 average number of peers per router. 
As a result, BGP's memory 370 scaling properties are linearly related to the total number of 371 networks in the Internet. 373 The following table illustrates typical memory requirements of a 374 router running BGP. We denote the average number of routes advertised by 375 each peer as N, the total number of unique AS paths as A, the mean AS 376 distance of the Internet as M (distance at the level of an autonomous 377 system, expressed in terms of the number of autonomous systems), 378 the number of bytes required to store a network as R, and the number of 379 bytes required to store one AS in an AS path as P. It is assumed 380 that each network is encoded as four bytes, each AS is encoded as two 381 bytes, and each network is reachable via some fraction of all of the 382 peers (# BGP peers/per net, denoted S). For purposes of the estimates here, we 383 will calculate MR = (((N * R) + (M * A) * P) * S).
385 # Networks  Mean AS Distance  # AS's  # BGP peers/per net  Memory Req
386    (N)            (M)          (A)           (S)              (MR)
387 ----------  ----------------  ------  -------------------  --------------
388    100,000                20   3,000                   20      10,400,000
389    100,000                20  15,000                   20      20,000,000
390    120,000                10  15,000                  100      78,000,000
391    140,000                15  20,000                  100     116,000,000
393 In analyzing BGP's memory requirements, we focus on the size of the 394 BGP RIB table (ignoring implementation details). In particular, we 395 derive upper bounds for the size of the BGP RIB table. For example, 396 at the time of this writing, the BGP RIB tables of a typical backbone 397 router carry on the order of 120,000 entries. Given this number, one 398 might ask whether it would be possible to have a functional router 399 with a table that will have 1,000,000 entries. Clearly the answer to 400 this question is more related to how BGP is implemented. A robust BGP 401 implementation with a reasonable CPU and memory should not have 402 issues scaling to such limits. 404 7. 
BGP Policy Expressiveness and its Implications 406 BGP is unique among deployed IP routing protocols in that routing is 407 determined using semantically rich routing policies. Although 408 routing policies are usually the first thing that comes to a network 409 operator's mind concerning BGP, it is important to note that the 410 languages and techniques for specifying BGP routing policies are not 411 actually a part of the protocol specification (see [RFC2622] for an 412 example of such a policy language). In addition, the BGP 413 specification contains few restrictions, either explicitly or 414 implicitly, on routing policy languages. These languages have 415 typically been developed by vendors and have evolved through 416 interactions with network engineers in an environment lacking 417 vendor-independent standards. 419 The complexity of typical BGP configurations, at least in provider 420 networks, has been steadily increasing. Router vendors typically 421 provide hundreds of special commands for use in the configuration of 422 BGP, and this command set is continually expanding. For example, BGP 423 communities [RFC1997] allow policy writers to selectively attach tags 424 to routes and use these to signal policy information to other 425 BGP-speaking routers. Many providers allow customers, and 426 sometimes peers, to send communities that determine the scope 427 and preference of their routes. These developments have 428 increasingly given the task of writing BGP configurations 429 aspects associated with open-ended programming. This has 430 allowed network operators to encode complex policies in order 431 to address many unforeseen situations, and has opened the door 432 for a great deal of creativity and experimentation in routing 433 policies. This policy flexibility is one of the main reasons 434 why BGP is so well suited to the commercial environment of the 435 current Internet. 
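As an illustration of the community-based signaling described above, the following Python sketch shows how a provider's import policy might map a customer-supplied community to a lower local preference. It is a minimal sketch under stated assumptions, not any vendor's policy language: the community value (65000, 80), the preference values, and the names Route, apply_import_policy, and best_route are all hypothetical, and the route selection shown is a simplified subset of the decision process in [BGP4].

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# Hypothetical values chosen for illustration only; a real provider
# publishes its own community scheme and preference levels.
DEPREF_COMMUNITY: Tuple[int, int] = (65000, 80)
DEFAULT_LOCAL_PREF = 100
BACKUP_LOCAL_PREF = 80


@dataclass
class Route:
    prefix: str
    as_path: List[int]
    communities: Set[Tuple[int, int]] = field(default_factory=set)
    local_pref: int = DEFAULT_LOCAL_PREF


def apply_import_policy(route: Route) -> Route:
    """Return a copy of the route whose local_pref reflects its communities."""
    if DEPREF_COMMUNITY in route.communities:
        pref = BACKUP_LOCAL_PREF
    else:
        pref = DEFAULT_LOCAL_PREF
    return Route(route.prefix, list(route.as_path), set(route.communities), pref)


def best_route(routes: List[Route]) -> Route:
    """Simplified selection: highest local_pref, then shortest AS path."""
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))


if __name__ == "__main__":
    # Two routes to the same documentation prefix; the second carries the
    # hypothetical "lower my preference" community.
    primary = apply_import_policy(Route("192.0.2.0/24", [65001, 65002, 65004]))
    backup = apply_import_policy(
        Route("192.0.2.0/24", [65003, 65004], {DEPREF_COMMUNITY})
    )
    print(best_route([primary, backup]).as_path)
```

Note that the tagged route loses even though its AS path is shorter, because local preference is compared first. This is the kind of locally defined rule whose interaction with other ASes' policies is not visible from any single configuration.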
437 However, this rich policy expressiveness has come with a cost 438 that is often not recognized. In particular, it is possible 439 to construct locally defined routing policies that can lead to 440 unexpected global routing anomalies such as (unintended) 441 non-determinism and to protocol divergence. If the 442 interacting policies causing such anomalies are defined in 443 different autonomous systems, then these problems can be very 444 difficult to debug and correct. In the following sections, we 445 describe two such cases relating to the existence (or lack 446 thereof) of stable routings. 448 7.1. Existence of Unique Stable Routings 450 One can easily construct sets of policies for which BGP can not 451 guarantee that stable routings are unique. This can be illustrated 452 by the following simple example. Consider the example of four 453 Autonomous Systems, AS1, AS2, AS3, and AS4. AS1 and AS2 are peers, 454 and they provide transit for AS3 and AS4 respectively. Suppose 455 further that AS3 provides transit for AS4 (in this case AS3 is a 456 customer of AS1, and AS4 is a multihomed customer of both AS3 and 457 AS2). AS4 may want to use the link to AS3 as a "backup" link, and 458 sends AS3 a community value that AS3 has configured to lower the 459 preference of AS4's routes to a level below that of its upstream 460 provider, AS1. The intended "backup routing" to AS4 is illustrated 461 here:
463    AS1 ------> AS2
464    /|\          |
465     |           |
466     |           |
467     |          \|/
468    AS3 ------- AS4
470 That is, the AS3-AS4 link is intended to be used only when the 471 AS2-AS4 link is down (for outbound traffic, AS4 simply gives routes 472 from AS2 a higher local preference). This is a common scenario in 473 today's Internet. 
But note that this configuration has another 474 stable solution:
476    AS1 ------- AS2
477     |           |
478     |           |
479     |           |
480    \|/         \|/
481    AS3 ------> AS4
483 In this case, AS3 does not translate the "depref my route" community 484 received from AS4 into a "depref my route" community for AS1, and so 485 if AS1 hears the route to AS4 that transits AS3 it will prefer that 486 route (since AS3 is a customer). This state could be reached, for 487 example, by starting in the "correct" backup routing shown first, 488 bringing down the AS2-AS4 BGP session, and then bringing it back up. 489 In general, BGP has no way to prefer the "intended" solution over the 490 anomalous one, and which is picked will depend on the unpredictable 491 order of BGP messages. 493 While this example is relatively simple, many operators may fail to 494 recognize that the true source of the problem is that the BGP 495 policies of ASes can interact in unexpected ways, and that these 496 interactions can result in multiple stable routings. One can imagine 497 that the interactions could be much more complex in the real 498 Internet. We suspect that such anomalies will only become more 499 common as BGP continues to evolve with richer policy expressiveness. 500 For example, extended communities provide an even more flexible means 501 of signaling information within and between autonomous systems than 502 is possible with [RFC1997] communities. At the same time, 503 applications of communities by network operators are evolving to 504 address complex issues of inter-domain traffic engineering. 506 7.2. Existence of Stable Routings 508 One can also construct a set of policies for which BGP can not 509 guarantee that a stable routing exists (or worse, that a stable 510 routing will ever be found). For example, [RFC3345] documents 511 several scenarios that lead to route oscillations associated 512 with the use of the Multi-Exit Discriminator or MED, 513 attribute. 
Route 514 oscillation will happen in BGP when a set of policies has no 515 solution. That is, when there is no stable routing that satisfies 516 the constraints imposed by policy, then BGP has no choice but to keep 517 trying. In addition, BGP configurations can have a stable routing, 518 yet the protocol may not be able to find it; BGP can "get trapped" 519 down a blind alley that has no solution. 521 Protocol divergence is not, however, a problem associated solely with 522 use of the MED attribute. This potential exists in BGP even without 523 the use of the MED attribute. Hence, like the unintended 524 nondeterminism described in the previous section, this type of 525 protocol divergence is an unintended consequence of the unconstrained 526 nature of BGP policy languages. 528 8. Applicability 530 In this section we identify the environments for which BGP is well 531 suited, and the environments for which it is not suitable. This question 532 is partially answered in Section 2 of BGP [BGP4], which states: 534 "To characterize the set of policy decisions that can be 535 enforced using BGP, one must focus on the rule that an AS 536 advertises to its neighbor ASs only those routes that it 537 itself uses. This rule reflects the "hop-by-hop" routing 538 paradigm generally used throughout the current Internet. 539 Note that some policies cannot be supported by the 540 "hop-by-hop" routing paradigm and thus require techniques 541 such as source routing to enforce. For example, BGP does 542 not enable one AS to send traffic to a neighbor AS 543 intending that the traffic take a different route from 544 that taken by traffic originating in the neighbor AS. On 545 the other hand, BGP can support any policy conforming to 546 the "hop-by-hop" routing paradigm. 
Since the current 547 Internet uses only the "hop-by-hop" routing paradigm and 548 since BGP can support any policy that conforms to that 549 paradigm, BGP is highly applicable as an inter-AS routing 550 protocol for the current Internet."

552 One of the important points here is that BGP contains 553 only the functionality that is essential, while at the same time 554 providing a flexible mechanism within the protocol that allows us to 555 extend its functionality. For example, BGP capabilities provide an 556 easy and flexible way to introduce new features within the protocol. 557 Finally, since BGP was designed with flexibility and 558 extensibility in mind, new and/or evolving requirements can be 559 addressed via existing mechanisms.

561 To summarize, BGP is well suited as an inter-autonomous system 562 routing protocol for any internet that is based on IP [RFC791] 563 as the internet protocol and on the "hop-by-hop" routing paradigm.

565 9. Acknowledgments

567 We would like to thank Paul Traina for authoring previous 568 versions of this document. Elwyn Davies, Tim Griffin, Randy 569 Presuhn, Curtis Villamizar, and Atanu Ghosh also provided many 570 insightful comments on earlier versions of this document.

572 10. Security Considerations

574 BGP provides flexible mechanisms with varying levels of complexity 575 for security purposes. BGP sessions are authenticated using BGP 576 session addresses and the assigned AS number. Since BGP 577 sessions use TCP (and IP) for reliable transport, BGP sessions 578 are further authenticated and secured by any authentication 579 and security mechanisms used by TCP and IP.

581 BGP uses the TCP MD5 option for validating data and protecting against 582 spoofing of TCP segments exchanged over its sessions. The use 583 of the TCP MD5 option for BGP is described at length in [RFC2385]. 584 TCP MD5 key management is discussed in [RFC3562].
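As a rough illustration of what the TCP MD5 option protects, the following Python sketch computes the 16-byte digest over the fields that, as we read [RFC2385], are covered: the TCP pseudo-header, the fixed TCP header with options excluded and the checksum treated as zero, the segment data, and the shared key. The function names, the IPv4-only handling, and the sample addresses, ports, and key are assumptions made for illustration; this is not taken from any BGP implementation.

```python
import hashlib
import ipaddress
import struct

TCP_PROTOCOL_NUMBER = 6
FIXED_TCP_HEADER_LEN = 20


def tcp_md5_digest(src_ip: str, dst_ip: str, segment: bytes, key: bytes) -> bytes:
    """Return the MD5 digest for one IPv4 TCP segment (illustrative sketch).

    `segment` is the complete TCP segment: header, options, and data.
    """
    if len(segment) < FIXED_TCP_HEADER_LEN:
        raise ValueError("segment shorter than a TCP header")
    header_len = (segment[12] >> 4) * 4  # data offset is counted in 32-bit words
    if not FIXED_TCP_HEADER_LEN <= header_len <= len(segment):
        raise ValueError("invalid TCP data offset")

    # 1. Pseudo-header: source, destination, zero-padded protocol, segment length.
    pseudo_header = (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + struct.pack("!BBH", 0, TCP_PROTOCOL_NUMBER, len(segment))
    )
    # 2. Fixed TCP header without options, with the checksum field zeroed.
    fixed_header = segment[:16] + b"\x00\x00" + segment[18:FIXED_TCP_HEADER_LEN]
    # 3. Segment data, if any.
    data = segment[header_len:]
    # 4. The key shared by both BGP speakers.
    return hashlib.md5(pseudo_header + fixed_header + data + key).digest()


def sample_segment(checksum: int) -> bytes:
    """Build a hypothetical 20-byte TCP header (no options) plus 5 data bytes."""
    header = struct.pack(
        "!HHIIBBHHH", 40000, 179, 1, 0, 5 << 4, 0x18, 65535, checksum, 0
    )
    return header + b"hello"


if __name__ == "__main__":
    digest = tcp_md5_digest("192.0.2.1", "192.0.2.2", sample_segment(0), b"secret")
    print(digest.hex())
```

A receiver holding the same key recomputes the digest and drops the segment on mismatch, so a spoofed segment is rejected unless the attacker also knows the key. Because the checksum field is zeroed before hashing, the digest does not depend on the checksum value carried in the segment.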
BGP data 585 encryption is provided by the IPsec mechanism, which encrypts the IP 586 payload data (including TCP and BGP data). The IPsec mechanism can 587 be used in both transport mode and tunnel mode. The 588 IPsec mechanism is described in [RFC2406]. Neither the TCP MD5 option 589 nor the IPsec mechanism is widely deployed 590 for BGP in today's Internet, and hence it is difficult to gauge their 591 real performance impact when used with BGP. However, since both 592 mechanisms are TCP- and IP-based security mechanisms, the link 593 bandwidth, CPU utilization, and router memory consumed by BGP 594 when using them would be the same as for any other TCP- and IP-based protocol.

596 BGP uses the IP TTL value to protect its External BGP (EBGP) sessions 597 from TCP- (or IP-) based CPU-intensive attacks. It is a simple 598 mechanism that filters BGP (TCP) segments 599 using the IP TTL value carried in the IP header of BGP (TCP) 600 segments exchanged over EBGP sessions. The BGP TTL mechanism 601 is described in [RFC3682]. Use of [RFC3682] impacts performance in 602 a way similar to using ACL policies for BGP.

604 Such flexible TCP- and IP-based security mechanisms allow BGP to 605 prevent insertion, deletion, or modification of BGP data, snooping of 606 the data, session stealing, etc. However, BGP is vulnerable to the 607 same security attacks that are present in TCP. [BGP_VULN] 608 discusses BGP security vulnerabilities in depth. At the time 609 of this writing, several efforts are underway to create and 610 define an appropriate security infrastructure within the BGP 611 protocol to provide authentication and security for its routing 612 information; these include [SBGP] and [SOBGP].

614 11. IANA Considerations

616 This document presents an analysis of the BGP protocol and hence 617 presents no new IANA considerations.

619 12. References

621 12.1.
Normative References

623 [BGP4] Rekhter, Y., Li, T., and S. Hares, Editors, "A 624 Border Gateway Protocol 4 (BGP-4)", 625 draft-ietf-idr-bgp4-2.txt. Work in progress.

627 [RFC1519] Fuller, V., Li, T., Yu, J., and K. Varadhan, 628 "Classless Inter-Domain Routing (CIDR): an Address 629 Assignment and Aggregation Strategy", RFC 1519, 630 September 1993.

632 [RFC791] "INTERNET PROTOCOL", DARPA INTERNET PROGRAM 633 PROTOCOL SPECIFICATION, RFC 791, September 634 1981.

636 [RFC1997] Chandra, R. and T. Li, "BGP Communities 637 Attribute", RFC 1997, August 1996.

639 [RFC2385] Heffernan, A., "Protection of BGP Sessions via 640 the TCP MD5 Signature Option", RFC 2385, 641 August 1998.

643 [RFC2842] Chandra, R. and J. Scudder, "Capabilities 644 Advertisement with BGP-4", RFC 2842, May 2000.

646 [RFC3345] McPherson, D., Gill, V., Walton, D., and 647 A. Retana, "Border Gateway Protocol (BGP) Persistent 648 Route Oscillation Condition", RFC 3345, 649 August 2002.

651 [RFC3562] Leech, M., "Key Management Considerations for 652 the TCP MD5 Signature Option", RFC 3562, 653 July 2003.

655 [RFC3682] Gill, V., Heasley, J., and D. Meyer, "The 656 Generalized TTL Security Mechanism (GTSM)", RFC 657 3682, February 2004.

658 [SOBGP] White, R., "Architecture and Deployment 659 Considerations for Secure Origin BGP (soBGP)", 660 draft-white-sobgp-architecture-00.txt. Work in 661 progress.

663 [BGP_VULN] Murphy, S., "BGP Security Vulnerabilities Analysis", 664 draft-ietf-idr-bgp-vuln-01.txt. Work in progress.

666 [SBGP] Lynn, C., Mikkelson, J., and K. Seo, "Secure BGP (S-BGP)", 667 Internet-Draft. Work in progress.

669 12.2. Informative References

671 [RFC854] Postel, J. and J. Reynolds, "TELNET PROTOCOL 672 SPECIFICATION", RFC 854, May 1983.

674 [RFC1105] Lougheed, K. and Y. Rekhter, "Border Gateway 675 Protocol (BGP)", RFC 1105, June 1989.

677 [RFC1163] Lougheed, K. and Y. Rekhter, "Border Gateway 678 Protocol (BGP)", RFC 1163, June 1990.
680 [RFC1264] Hinden, R., "Internet Routing Protocol 681 Standardization Criteria", RFC 1264, October 1991.

683 [RFC1267] Lougheed, K. and Y. Rekhter, "Border Gateway 684 Protocol 3 (BGP-3)", RFC 1267, October 1991.

686 [RFC1771] Rekhter, Y. and T. Li, "A Border Gateway 687 Protocol 4 (BGP-4)", RFC 1771, March 1995.

689 [RFC1772] Rekhter, Y. and P. Gross, Editors, "Application 690 of the Border Gateway Protocol in the Internet", 691 RFC 1772, March 1995.

693 [RFC1774] Traina, P., "BGP-4 Protocol Analysis", 694 RFC 1774, March 1995.

696 [RFC2622] Alaettinoglu, C., et al., "Routing Policy 697 Specification Language (RPSL)", RFC 2622, June 698 1999.

700 [RFC2406] Kent, S. and R. Atkinson, "IP Encapsulating Security 701 Payload (ESP)", RFC 2406, November 1998.

703 [ROUTEVIEWS] Meyer, D., "The Route Views Project", 704 http://www.routeviews.org

706 13. Authors' Addresses

708 David Meyer
709 Email: dmm@1-4-5.net

711 Keyur Patel
712 Cisco Systems
713 Email: keyupate@cisco.com

715 Intellectual Property Statement

717 The IETF takes no position regarding the validity or scope of any 718 Intellectual Property Rights or other rights that might be claimed to 719 pertain to the implementation or use of the technology described in 720 this document or the extent to which any license under such rights 721 might or might not be available; nor does it represent that it has 722 made any independent effort to identify any such rights. Information 723 on the procedures with respect to rights in RFC documents can be 724 found in BCP 78 and BCP 79.

726 Copies of IPR disclosures made to the IETF Secretariat and any 727 assurances of licenses to be made available, or the result of an 728 attempt made to obtain a general license or permission for the use of 729 such proprietary rights by implementers or users of this 730 specification can be obtained from the IETF on-line IPR repository at 731 http://www.ietf.org/ipr.
733 The IETF invites any interested party to bring to its attention any 734 copyrights, patents or patent applications, or other proprietary 735 rights that may cover technology that may be required to implement 736 this standard. Please address the information to the IETF at 737 ietf-ipr@ietf.org.

739 Disclaimer of Validity

741 This document and the information contained herein are provided on an 742 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 743 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 744 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 745 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 746 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 747 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

749 Copyright Statement

751 Copyright (C) The Internet Society (2004). This document is subject 752 to the rights, licenses and restrictions contained in BCP 78, and 753 except as set forth therein, the authors retain all their rights.

755 Acknowledgment

757 Funding for the RFC Editor function is currently provided by the 758 Internet Society.