Network Working Group                                          E. Davies
Internet-Draft                                                Consultant
Expires: August 23, 2007                                        A. Doria
                                                                     LTU
                                                       February 19, 2007

                Analysis of IDR Requirements and History
                    draft-irtf-routing-history-05.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 23, 2007.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document analyses the current state of inter-domain routing
   (IDR) with respect to RFC 1126 and other IDR requirements and design
   efforts.  It is the companion document to "Requirements for Inter-
   Domain Routing" [I-D.irtf-routing-reqs], which is a discussion of
   requirements for a future routing architecture and future routing
   protocols.  Publication of this document is in accordance with the
   consensus of the active contributors to the IRTF's Routing Research
   Group.
   [Note to RFC Editor: Please replace the reference in the abstract
   with a non-reference quoting the RFC number of the companion
   document when it is allocated, i.e., '(RFC xxxx)', and remove this
   note.]

Table of Contents

   1.  Provenance of this Document
   2.  Introduction
     2.1.  Background
   3.  Historical Perspective
     3.1.  The Legacy of RFC1126
       3.1.1.  "General Requirements"
       3.1.2.  "Functional Requirements"
       3.1.3.  "Non-Goals"
     3.2.  ISO OSI IDRP, BGP and the Development of Policy Routing
     3.3.  Nimrod Requirements
     3.4.  PNNI
   4.  Recent Research Work
     4.1.  Developments in Internet Connectivity
     4.2.  DARPA NewArch Project
       4.2.1.  Defending the End-to-End Principle
   5.  Existing Problems of BGP and the Current Inter-/Intra-Domain
       Architecture
     5.1.  BGP and Auto-aggregation
     5.2.  Convergence and Recovery Issues
     5.3.  Non-locality of Effects of Instability and Misconfiguration
     5.4.  Multihoming Issues
     5.5.  AS-number Exhaustion
     5.6.  Partitioned AS's
     5.7.  Load Sharing
     5.8.  Hold-down Issues
     5.9.
          Interaction between Inter-domain Routing and Intra-domain
          Routing
     5.10. Policy Issues
     5.11. Security Issues
     5.12. Support of MPLS and VPNs
     5.13. IPv4 / IPv6 Ships in the Night
     5.14. Existing Tools to Support Effective Deployment of
           Inter-Domain Routing
       5.14.1. Routing Policy Specification Language RPSL (RFC 2622,
               RFC 2650) and RIPE NCC Database (RIPE 157)
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgments
   9.  References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Provenance of this Document

   In 2001, the IRTF Routing Research Group (IRTF RRG) chairs, Abha
   Ahuja and Sean Doran, decided to establish a sub-group to look at
   requirements for inter-domain routing (IDR).  A group of well-known
   routing experts was assembled to develop requirements for a new
   routing architecture.  Their mandate was to approach the problem
   starting from a blank sheet.  This group was free to take any
   approach, including a revolutionary approach, in developing
   requirements for solving the problems they saw in inter-domain
   routing.

   Simultaneously, an independent effort was started in Sweden with a
   similar goal.
   A team, calling itself Babylon, representing vendors, service
   providers, and academia, assembled to understand the history of
   inter-domain routing, to research the problems seen by the service
   providers, and to develop a proposal of requirements for a follow-on
   to the current routing architecture.  This group took an
   evolutionary approach, starting from the current routing
   architecture and practice.  In other words, the group limited itself
   to developing an evolutionary strategy.  The Babylon group was later
   folded into the IRTF RRG as Sub-Group B.

   This document, which was a part of Sub-Group B's output, provides a
   snapshot of the state of Inter-Domain Routing (IDR) at the time of
   original writing (2001), with some minor updates to take into
   account developments since that date, bringing it up to date in
   2006.  The development of the new set of requirements is then
   motivated by an analysis of the problems that IDR has been
   encountering in the recent past.  This document is intended as a
   counterpart to the Routing Requirements document, which captures the
   requirements for future domain routing systems as developed
   separately by the IRTF RRG Sub-Groups A and B
   [I-D.irtf-routing-reqs].

2.  Introduction

   It is generally accepted that there are major shortcomings in the
   inter-domain routing of the Internet today and that these may result
   in severe routing problems within an unspecified period of time.
   Remedying these shortcomings will require extensive research to tie
   down the exact failure modes that lead to them and to identify the
   best techniques to remedy the situation.

   Changes in the nature and quality of the services that users want
   from the Internet are difficult to provide within the current
   framework, as they impose requirements never foreseen by the
   original architects of the Internet routing system.
   The kind of radical changes that have to be accommodated are
   epitomized by the advent of IPv6 and the application of IP
   mechanisms to private commercial networks that offer specific
   service guarantees beyond the best-effort services of the public
   Internet.  Major changes to the inter-domain routing system are
   inevitable if it is to provide an efficient underpinning for the
   radically changed and increasingly commercially based networks that
   rely on the IP protocol suite.

   Current practice stresses the need to separate the concerns of the
   control plane and the forwarding plane in a router.  This document
   follows that practice, but we still use the term 'routing' as a
   global portmanteau to cover all aspects of the system.

   This document provides a historical perspective on the current state
   of domain routing in Section 3 by revisiting the previous IETF
   requirements document intended to steer the development of a future
   routing system.  These requirements, which informed the design of
   the Border Gateway Protocol (BGP) in 1989, are contained in RFC 1126,
   "Goals and Functional Requirements for Inter-Autonomous System
   Routing" [RFC1126].

   Section 3 also looks at some other work on requirements for domain
   routing that was carried out before and after RFC 1126 was
   published.  This work fleshes out the historical perspective and
   provides some additional insights into alternative approaches that
   may be instructive when building a new set of requirements.

   The motivation for change and the inspiration for some of the
   requirements for new routing architectures derive from the problems
   attributable to the current domain routing system that are being
   experienced in the Internet today.  These are discussed in
   Section 5.

2.1.  Background

   Today's Internet uses an addressing and routing structure that has
   developed in an ad hoc, more or less upwards-compatible fashion.
It 185 has progressed from handling a non-commercial Internet with a single 186 administrative domain to a solution that is just about controlling 187 today's multi-domain, federated Internet, carrying traffic between 188 the networks of commercial, governmental and not-for-profit 189 participants. As well as directing traffic to its intended end- 190 point, inter-domain routing mechanisms are expected to implement a 191 host of domain specific routing policies for competing, communicating 192 domains. The result is not ideal, particularly as regards inter- 193 domain routing mechanisms, but it does a pretty fair job at its 194 primary goal of providing any-to-any connectivity to many millions of 195 computers. 197 Based on a large body of anecdotal evidence, but also on a growing 198 body of experimental evidence [Labovitz02] and analytic work on the 199 stability of BGP under certain policy specifications [Griffin99], the 200 main Internet inter-domain routing protocol, BGP version 4 (BGP-4), 201 appears to have a number of problems that need to be resolved. 202 Additionally, the hierarchical nature of the inter-domain routing 203 problem appears to be changing as the connectivity between domains 204 becomes increasingly meshed [RFC3221] which alters some of the 205 scaling and structuring assumptions on which BGP-4 is built. Patches 206 and fix-ups may relieve some of these problems but others may require 207 a new architecture and new protocols. 209 3. Historical Perspective 211 3.1. The Legacy of RFC1126 213 RFC 1126 [RFC1126] outlined a set of requirements that were intended 214 to guide the development of BGP. 216 Editors' Note: When this document was reviewed by Yakov Rekhter, 217 one of the designers of BGP, his view was that "While some people 218 expected a set of requirements outlined in RFC1126 to guide the 219 development of BGP, in reality the development of BGP happened 220 completely independently of RFC1126. 
In other words, from the 221 point of view of the development of BGP, RFC1126 turned out to be 222 totally irrelevant." On the other hand, it appears that BGP as 223 currently implemented has met a large proportion of these 224 requirements, especially for unicast traffic. 226 While the network is demonstrably different from what it was in 1989, 227 both as to structure and size, many of the same requirements remain. 228 As a first step in setting requirements for the future, we need to 229 understand the requirements that were originally set for the current 230 protocols. And in charting a future architecture we must first be 231 sure to do no harm. This means a future domain routing system has to 232 support as its base requirement, the level of function that is 233 available today. 235 The following sections each relate to a requirement, or non- 236 requirement listed in RFC1126. In fact the section names are direct 237 quotes from the document. The discussion of these requirements 238 covers the following areas: 240 Explanation: Optional interpretation for today's audience of 241 the original intent of the requirement 243 Relevance: Is the requirement of RFC1126 still relevant, and 244 to what degree? Should it be understood 245 differently in today's environment? 247 Current practice: How well is the requirement met by current 248 protocols and practice? 250 3.1.1. "General Requirements" 252 3.1.1.1. "Route to Destination" 254 Timely routing to all reachable destinations, including multihoming 255 and multicast. 257 Relevance: Valid, but requirements for multihoming need 258 further discussion and elucidation. The 259 requirement should include multiple source 260 multicast routing. 262 Current practice: Multihoming is not efficient and the proposed 263 inter-domain multicast protocol BGMP [RFC3913] is 264 an add-on to BGP following many of the same 265 strategies but not integrated into the BGP 266 framework . 
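   The multihoming inefficiency can be illustrated with a toy sketch
   (the prefixes below are hypothetical documentation placeholders, not
   routes taken from this document): a customer prefix reachable via
   two providers cannot be covered by the second provider's aggregate,
   so it must occupy its own slot in the global routing table.

```python
import ipaddress

# Toy illustration (hypothetical documentation prefixes, not real routes):
# a multihomed customer's prefix cannot be aggregated by its second provider.

provider_a = ipaddress.ip_network("192.0.2.0/24")     # provider A's aggregate
provider_b = ipaddress.ip_network("198.51.100.0/24")  # provider B's aggregate
customer   = ipaddress.ip_network("192.0.2.128/25")   # customer block cut from A

# Single-homed under A, the customer is covered by A's aggregate, so no
# extra entry is needed in the global table:
print(customer.subnet_of(provider_a))   # True

# Multihomed, the same prefix must also be announced via B, whose aggregate
# cannot cover it, so it occupies its own slot in every default-free table:
print(customer.subnet_of(provider_b))   # False
```

   This is one reason why, in practice, each multihomed site tends to
   add at least one unaggregatable prefix to the default-free routing
   table.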
      Editors' Note: Multicast routing has moved on again since this
      was originally written.  By 2006, BGMP had been effectively
      superseded.  Multicast routing now uses Multiprotocol BGP
      [RFC2858], the Multicast Source Discovery Protocol (MSDP)
      [RFC3618] and Protocol Independent Multicast - Sparse Mode
      (PIM-SM) [RFC2362], [RFC4601], especially the Source Specific
      Multicast (SSM) subset.

3.1.1.2.  "Routing is Assured"

   This requires that a user be notified, within a reasonable period
   after the attempt, of any inability to provide a requested service.

   Relevance:         Valid.

   Current practice:  There are ICMP messages for this, but in many
                      cases they are not used, either because of fears
                      about creating message storms or uncertainty
                      about whether the end system can do anything
                      useful with the resulting information.  IPv6
                      implementations may be able to make better use of
                      the information, as they may have alternative
                      addresses that could be used to exploit an
                      alternative routing.

3.1.1.3.  "Large System"

   The architecture was designed to accommodate the growth of the
   Internet.

   Relevance:         Valid.  Properties of Internet topology might be
                      an issue for future scalability (topology varies
                      from very sparse to quite dense at present).
                      Instead of setting growth in a time-scale,
                      indefinite growth should be accommodated.  On the
                      other hand, such growth has to be accommodated
                      without making the protocols too expensive -
                      trade-offs may be necessary.

   Current practice:  Scalability of the current protocols will not be
                      sufficient under the current rate of growth.
                      There are problems with BGP convergence for large
                      dense topologies, problems with routing
                      information propagation between routers in
                      transit domains, limited support for hierarchy,
                      etc.

3.1.1.4.
"Autonomous Operation" 316 This requirement encapsulates the need for administative domains 317 ("Autonomous Systems" - AS) to be able to operate autonomously as 318 regards setting routing policy: 320 Relevance: Valid. There may need to be additional 321 requirements for adjusting policy decisions to the 322 global functionality and for avoiding 323 contradictory policies. This would decrease the 324 possibility of unstable routing behavior. 326 There is a need for handling various degrees of 327 trust in autonomous operations, ranging from no 328 trust (e.g., between separate ISPs) to very high 329 trust where the domains have a common goal of 330 optimizing their mutual policies. 332 Policies for intra domain operations should in 333 some cases be revealed, using suitable 334 abstractions. 336 Current practice: Policy management is in the control of network 337 managers, as required, but there is little support 338 for handling policies at an abstract level for a 339 domain. 341 Cooperating administrative entities decide about 342 the extent of cooperation independently. Lack of 343 coordination combined with global range of effects 344 results in occasional melt-down of Internet 345 routing. 347 3.1.1.5. "Distributed System" 349 The routing environment is a distributed system. The distributed 350 routing environment supports redundancy and diversity of nodes and 351 links. Both data and operations are distributed. 353 Relevance: Valid. RFC1126 is very clear that we should not 354 be using centralized solutions, but maybe we need 355 a discussion on trade-offs between common 356 knowledge and distribution (i.e., to allow for 357 uniform policy routing, e.g., GSM systems are in a 358 sense centralized, but with hierarchies) 360 Current practice: Routing is very distributed, but lacking abilities 361 to consider optimization over several hops or 362 domains. 364 3.1.1.6. 
"Provide A Credible Environment" 366 Routing mechanism information must be integral and secure (credible 367 data, reliable operation). Security from unwanted modification and 368 influence is required. 370 Relevance: Valid. 372 Current practice: BGP provides a limited mechanism for 373 authentication and security of peering sessions, 374 but this does not guarantee the authenticity or 375 validity of the routing information that is 376 exchanged. 378 There are certainly security problems with current 379 practice. The Routing Protocol Security 380 Requirements (rpsec) working group has been 381 struggling to agree on a set of requirements for 382 BGP security since early 2002. 384 Editors' note: Proposals for authenticating BGP 385 routing information using certificates were 386 under development by the Secure Inter-Domain 387 Routing (sidr) working group in 2006. 389 3.1.1.7. "Be A Managed Entity" 391 Requires that a manager should get enough information on a state of 392 network so that s/he could make informed decisions. 394 Relevance: The requirement is reasonable, but we might need 395 to be more specific on what information should be 396 available, e.g., to prevent routing oscillations. 398 Current practice: All policies are determined locally, where they 399 may appear reasonable but there is limited global 400 coordination through the routing policy databases 401 operated by the Internet registries (AfriNIC, 402 APNIC, ARIN, LACNIC, RIPE, etc.). 404 Operators are not required to register their 405 policies; even when policies are registered, it is 406 difficult to check that the actual policies in use 407 match the declared policies and therefore a 408 manager cannot guarantee to make a globally 409 consistent decision. 411 3.1.1.8. "Minimize Required Resources" 413 Relevance: Valid, however, the paragraph states that 414 assumptions on significant upgrades shouldn't be 415 made. 
                      Although this is reasonable, a new architecture
                      should perhaps be prepared to exploit upgrades
                      when they occur.

   Current practice:  Most bandwidth is consumed by the exchange of
                      Network Layer Reachability Information (NLRI).
                      Usage of processing cycles (CPU) depends on the
                      stability of the Internet.  Both phenomena are
                      local in nature, so there are no scaling problems
                      with bandwidth and CPU usage.  Instability of
                      routing increases the consumption of resources in
                      any case.  The number of networks in the Internet
                      dominates memory requirements - this is a scaling
                      problem.

3.1.2.  "Functional Requirements"

3.1.2.1.  "Route Synthesis Requirements"

3.1.2.1.1.  "Route around failures dynamically"

   Relevance:         Valid.  Should perhaps be stronger: only
                      providing a best-effort attempt may not be enough
                      if real-time services are to be provided for.
                      Detection and repair of failures may need to be
                      faster than 100 ms if they are not to be noticed
                      by end-users.

   Current practice:  Latency of fail-over is too high: sometimes
                      minutes or longer.

3.1.2.1.2.  "Provide loop free paths"

   Relevance:         Valid.  Loops should occur only with negligible
                      probability and duration.

   Current practice:  Both link-state intra-domain routing and BGP
                      inter-domain routing (if correctly configured)
                      are forwarding-loop free after having converged.
                      However, convergence time for BGP can be very
                      long, and poorly designed routing policies may
                      result in a number of BGP speakers engaging in a
                      cyclic pattern of advertisements and withdrawals
                      that never converges to a stable result
                      [RFC3345].  Perhaps this is one context in which
                      the need for global convergence needs to be
                      reviewed.

3.1.2.1.3.  "Know when a path or destination is unavailable"

   Relevance:         Valid to some extent, but there is a trade-off
                      between aggregation and immediate knowledge of
                      reachability.
                      The requirement is that routing tables contain
                      enough information to determine that a
                      destination is unknown or that a path cannot be
                      constructed to reach it.

   Current practice:  Knowledge about lost reachability propagates
                      slowly through the network, due to slow
                      convergence for route withdrawals.

3.1.2.1.4.  "Provide paths sensitive to administrative policies"

   Relevance:         Valid.  Policy control of routing is of
                      increasing importance as the Internet has turned
                      into a business.

   Current practice:  Supported to some extent.  Policies can only be
                      applied locally in an AS, not globally; policy
                      information supplied by an AS has only a very
                      small probability of affecting policies in other
                      AS's.  Furthermore, only static policies are
                      supported; between fully static policies and
                      policies that depend on highly volatile, fast-
                      moving events, there is a middle ground of more
                      slowly changing conditions that routing could
                      usefully be aware of.  Lastly, there is no
                      support for policies based on anything other
                      than route properties (such as AS-origin,
                      AS-path, destination prefix, MED-values, etc.).

                      Editors' Note: Subsequent to the original issue
                      of this document, mechanisms that acknowledge the
                      business relationships of operators have been
                      developed, such as the NOPEER community attribute
                      [RFC3765].  However, the level of usage of this
                      attribute is apparently not very great.

3.1.2.1.5.  "Provide paths sensitive to user policies"

   Relevance:         Valid to some extent, as they may conflict with
                      the policies of the network administrator.  It is
                      likely that this requirement will be met by means
                      of different bit transport services offered by an
                      operator, but at the cost of adequate
                      provisioning, authentication and policing when
                      utilizing the service.

   Current practice:  Not supported in normal routing.  Can be
                      accomplished to some extent with loose source
                      routing, resulting in inefficient forwarding in
                      the routers.
                      The various attempts to introduce Quality of
                      Service (QoS) - e.g., Integrated Services and
                      Differentiated Services (DiffServ) - can also be
                      seen as means to support this requirement, but
                      they have met with limited success in terms of
                      providing alternate routes, as opposed to
                      providing improved service on the standard route.

                      Editors' Note: From the standpoint of a later
                      time, it would probably be more appropriate to
                      say "total failure" rather than "limited
                      success".

3.1.2.1.6.  "Provide paths which characterize user quality-of-service
            requirements"

   Relevance:         Valid to some extent, as they may conflict with
                      the policies of the operator.  It is likely that
                      this requirement will be met by means of
                      different bit transport services offered by an
                      operator, but at the cost of adequate
                      provisioning, authentication and policing when
                      utilizing the service.  It has become clear that
                      offering to provide a particular QoS to any
                      arbitrary destination from a particular source is
                      generally impossible: QoS, except in very 'soft'
                      forms such as overall long-term average packet
                      delay, is generally associated with
                      connection-oriented routing.

   Current practice:  Creating routes with specified QoS is not
                      generally possible at present.

3.1.2.1.7.  "Provide autonomy between inter- and intra-autonomous
            system route synthesis"

   Relevance:         Inter- and intra-domain routing should stay
                      independent, but one should notice that this to
                      some extent contradicts the previous three
                      requirements.  There is a trade-off between
                      abstraction and optimality.

   Current practice:  Inter-domain routing is performed independently
                      of intra-domain routing.  Intra-domain routing
                      is, however, especially in transit domains, very
                      interrelated with inter-domain routing.

3.1.2.2.  "Forwarding Requirements"

3.1.2.2.1.
"Decouple inter- and intra-autonomous system forwarding 564 decisions" 566 Relevance: Valid. 568 Current practice: As explained in Section 3.1.2.1.7, intra-domain 569 forwarding in transit domains is dependent on 570 inter-domain forwarding decisions. 572 3.1.2.2.2. "Do not forward datagrams deemed administratively 573 inappropriate" 575 Relevance: Valid, and increasingly important in the context 576 of enforcing policies correctly expressed through 577 routing advertisements but flouted by rogue peers 578 which send traffic for which a route has not been 579 advertised. On the other hand, packets that have 580 been misrouted due to transient routing problems 581 perhaps should be forwarded to reach the 582 destination, although along an unexpected path. 584 Current practice: At stub domains there is packet filtering, e.g., 585 to catch source address spoofing on outgoing 586 traffic or to filter out unwanted incoming 587 traffic. Filtering can in particular reject 588 traffic (such as unauthorized transit traffic) 589 that has been sent to a domain even when it has 590 not advertised a route for such traffic on a given 591 interface. The growing class of 'middle boxes' 592 (midboxes, e.g., Network Address Translators - 593 NATs) is quite likely to apply administrative 594 rules that will prevent forwarding of packets. 595 Note that security policies may deliberately hide 596 administrative denials. In the backbone, 597 intentional packet dropping based on policies is 598 not common. 600 3.1.2.2.3. "Do not forward datagrams to failed resources" 602 Relevance: Unclear, although it is clearly desirable to 603 minimise waste of forwarding resources by 604 discarding datagrams which cannot be delivered at 605 the earliest opportunity. There is a trade-off 606 between scalability and keeping track of 607 unreachable resources. Equipment closest to a 608 failed node has the highest motivation to keep 609 track of failures so that waste can be minimised. 
   Current practice:  Routing protocols use both internal adjacency
      management sub-protocols (e.g., Hello protocols) and
      information from equipment and lower-layer link watchdogs to
      keep track of failures in routers and connecting links.
      Failures will eventually result in the routing protocol
      reconfiguring the routing to avoid (if possible) a failed
      resource, but this is generally very slow (30s or more).  In
      the meantime datagrams may well be forwarded to failed
      resources.  In general terms, end hosts and some non-router
      midboxes do not participate in these notifications, and
      failures of such boxes will not affect the routing system.

3.1.2.2.4.  "Forward datagram according to its characteristics"

   Relevance:  Valid.  This is necessary in enabling differentiation
      in the network, based on QoS, precedence, policy or security.

   Current practice:  Ingress and egress filtering can be done based
      on policy.  Some networks discriminate on the basis of
      requested QoS.

3.1.2.3.  "Information Requirements"

3.1.2.3.1.  "Provide a distributed and descriptive information base"

   Relevance:  Valid; however, hierarchical information bases might
      provide more possibilities.

   Current practice:  The information base is distributed, but it is
      unclear whether it supports all necessary routing
      functionality.

3.1.2.3.2.  "Determine resource availability"

   Relevance:  Valid.  It should be possible for resource
      availability and levels of resource availability to be
      determined.  This prevents needing to discover unavailability
      through failure.  Resource location and discovery is arguably
      a separate concern that could be addressed outside the core
      routing requirements.

   Current practice:  Resource availability is predominantly handled
      outside of the routing system.

3.1.2.3.3.  "Restrain transmission utilization"

   Relevance:  Valid.
      However, certain requirements in the control plane, such as
      fast detection of faults, may be worth the consumption of more
      resources.  Similarly, simplicity of implementation may make it
      cheaper to 'back haul' traffic to central locations to minimise
      the cost of routing if bandwidth is cheaper than processing.

   Current practice:  BGP messages probably do not ordinarily consume
      excessive resources, but might during erroneous conditions.  In
      the data plane, the near-universal adoption of shortest path
      protocols could be considered to result in minimization of
      transmission utilization.

3.1.2.3.4.  "Allow limited information exchange"

   Relevance:  Valid.  But perhaps routing could be improved if
      certain information could be available either globally or at
      least for a wider defined locality.

   Current practice:  Policies are used to determine which
      reachability information is exported.

3.1.2.4.  "Environmental Requirements"

3.1.2.4.1.  "Support a packet-switching environment"

   Relevance:  Valid, but the routing system should, perhaps, not be
      limited to this exclusively.

   Current practice:  Supported.

3.1.2.4.2.  "Accommodate a connection-less oriented user transport
            service"

   Relevance:  Valid, but the routing system should, perhaps, not be
      limited to this exclusively.

   Current practice:  Accommodated.

3.1.2.4.3.  "Accommodate 10K autonomous systems and 100K networks"

   Relevance:  No longer valid.  Needs to be increased potentially
      indefinitely.  It is extremely difficult to foresee the future
      size expansion of the Internet, so the Utopian solution would
      be to achieve an Internet whose architecture is scale
      invariant.  Regrettably, this may not be achievable without
      introducing undesirable complexity, and a suitable trade-off
      between complexity and scalability is likely to be necessary.

   Current practice:  Supported but perhaps reaching its limit.
      Since the original version of this document was written in
      2001, the number of ASs advertised has grown from around 8000
      to 20000, and almost 35000 AS numbers have been allocated by
      the regional registries [Huston05].  If this growth continues,
      the original 16-bit AS space in BGP-4 will be exhausted in less
      than 5 years.  Planning for an extended AS space is now an
      urgent requirement.

3.1.2.4.4.  "Allow for arbitrary interconnection of autonomous
            systems"

   Relevance:  Valid.  However, perhaps not all interconnections
      should be accessible globally.

   Current practice:  BGP-4 allows for arbitrary interconnections.

3.1.2.5.  "General Objectives"

3.1.2.5.1.  "Provide routing services in a timely manner"

   Relevance:  Valid, as stated before.  The more complex a service
      is, the longer it should be allowed to take, but the
      implementation of services requiring (say) NP-complete
      calculation should be avoided.

   Current practice:  More or less, with the exception of convergence
      and fault robustness.

3.1.2.5.2.  "Minimize constraints on systems with limited resources"

   Relevance:  Valid.

   Current practice:  Systems with limited resources are typically
      stub domains that advertise very little information.

3.1.2.5.3.  "Minimize impact of dissimilarities between autonomous
            systems"

   Relevance:  Important.  This requirement is critical to a future
      architecture.  In a domain routing environment where the
      internal properties of domains may differ radically, it will be
      important to be sure that these dissimilarities are minimized
      at the borders.

   Current practice:  For the most part this capability is not really
      required in today's networks, since the intra-domain attributes
      are broadly similar across domains.

3.1.2.5.4.
"Accommodate the addressing schemes and protocol mechanisms 766 of the autonomous systems" 768 Relevance: Important, probably more so than when RFC1126 was 769 originally developed because of the potential 770 deployment of IPv6, wider usage of MPLS and the 771 increasing usage of VPNs. 773 Current practice: Only one global addressing scheme is supported in 774 most autonomous systems but the availability of 775 IPv6 services is steadily increasing. Some global 776 backbones support IPv6 routing and forwarding. 778 3.1.2.5.5. "Must be implementable by network vendors" 780 Relevance: Valid, but note that what can be implemented today 781 is different from what was possible when RFC1126 782 was written: a future domain routing architecture 783 should not be unreasonably constrained by past 784 limitations. 786 Current practice: BGP was implemented and meets a large proportion 787 of the original requirements. 789 3.1.3. "Non-Goals" 791 RFC1126 also included a section discussing non-goals. To what extent 792 are these still non-goals? Does the fact that they were non-goals 793 adversely affect today's IDR system? 795 3.1.3.1. "Ubiquity" 797 The authors of RFC 1126 were explicitly saying that IP and its inter- 798 domain routing system need not be deployed in every AS, and a 799 participant should not necessarily expect to be able to reach a given 800 AS, possibly because of routing policies. In a sense this 'non-goal' 801 has effectively been achieved by the Internet and IP protocols. This 802 requirement reflects a different world view where there was serious 803 competition for network protocols, which is really no longer the 804 case. Ubiquitous deployment of inter-domain routing in particular 805 has been achieved and must not be undone by any proposed future 806 domain routing architecture. 
On the other hand:

   o  ubiquitous connectivity cannot be reached in a policy-sensitive
      environment and should not be an aim,

      *  Editor's Note: It has been pointed out that this statement
         could be interpreted as being contrary to the Internet
         mission of providing universal connectivity.  The fact that
         limits to connectivity will be added as operational
         requirements in a policy-sensitive environment should not
         imply that a future domain routing architecture contains
         intrinsic limits on connectivity.

   o  it must not be required that the same routing mechanisms are
      used throughout, provided that they can interoperate
      appropriately,

   o  the information needed to control routing in a part of the
      network should not necessarily be ubiquitously available, and
      it must be possible for an operator to hide commercially
      sensitive information that is not needed outside a domain,

   o  the introduction of IPv6 reintroduces an element of diversity
      into the world of network protocols, but the similarities of
      IPv4 and IPv6 as regards routing and forwarding make this event
      less likely to drive an immediate diversification in routing
      systems.  The potential for further growth in the size of the
      network enabled by IPv6 is very likely to require changes in
      the future: whether this results in the replacement of one de
      facto ubiquitous system with another remains to be seen, but
      cannot be a requirement - it will have to interoperate with BGP
      during the transition.

   Relevance:  De facto essential for a future domain routing
      architecture, but what is required is ubiquity of the routing
      system rather than ubiquity of connectivity, and it must be
      capable of a gradual takeover through interoperation with the
      existing system.

   Current practice:  De facto ubiquity achieved.

3.1.3.2.
"Congestion control" 843 Relevance: It is not clear if this non-goal was to be applied 844 to routing or forwarding. It is definitely a non- 845 goal to adapt the choice of route when there is 846 transient congestion. However, to add support for 847 congestion avoidance (e.g., Explicit Congestion 848 Notification (ECN) and ICMP messages) in the 849 forwarding process would be a useful addition. 850 There is also extensive work going on in traffic 851 engineering which should result in congestion 852 avoidance through routing as well as in 853 forwarding. 855 Current practice: Some ICMP messages (e.g., source quench) exist to 856 deal with congestion control but these are not 857 generally used as they either make the problem 858 worse or there is no mechanism to reflect the 859 message into the application which is providing 860 the source. 862 3.1.3.3. "Load splitting" 864 Relevance: This should neither be a non-goal, nor an explicit 865 goal. It might be desirable in some cases and 866 should be considered as an optional architectural 867 feature. 869 Current practice: Can be implemented by exporting different prefixes 870 on different links, but this requires manual 871 configuration and does not consider actual load. 873 Editors' Note: This configuration is carried 874 out extensively as of 2006 and has been a 875 significant factor in routing table bloat. If 876 this need is a real operational requirement, as 877 it seems to be for multihomed or otherwise 878 richly connected sites, it will be necessary to 879 reclassify this as a real and important goal. 881 3.1.3.4. "Maximizing the utilization of resources" 882 Relevance: Valid. Cost-efficiency should be striven for; 883 maximizing resource utilization does not always 884 lead to greatest cost-efficiency. 886 Current practice: Not currently part of the system, though often a 887 'hacked in' feature done with manual 888 configuration. 890 3.1.3.5. 
"Schedule to deadline service" 892 This non-goal was put in place to ensure that the IDR did not have to 893 meet real time deadline goals such as might apply to Constant Bit 894 Rate (CBR) real time services in ATM. 896 Relevance: The hard form of deadline services is still a non- 897 goal for the future domain routing architecture 898 but overall delay bounds are much more of the 899 essence than was the case when RFC1126 was 900 written. 902 Current practice: Service providers are now offering overall 903 probabilistic delay bounds on traffic contracts. 904 To implement these contracts there is a 905 requirement for a rather looser form of delay 906 sensitive routing. 908 3.1.3.6. "Non-interference policies of resource utilization" 910 The requirement in RFC1126 is somewhat opaque, but appears to imply 911 that what we would today call QoS routing is a non-goal and that 912 routing would not seek to control the elastic characteristics of 913 Internet traffic whereby a TCP connection can seek to utilize all the 914 spare bandwidth on a route, possibly to the detriment of other 915 connections sharing the route or crossing it. 916 Relevance: Open Issue. It is not clear whether dynamic QoS 917 routing can or should be implemented. Such a 918 system would seek to control the admission and 919 routing of traffic depending on current or recent 920 resource utilization. This would be particularly 921 problematic where traffic crosses an ownership 922 boundary because of the need for potentially 923 commercially sensitive information to be made 924 available outside the ownership boundary. 926 Current practice: Routing does not consider dynamic resource 927 availability. Forwarding can support service 928 differentiation. 930 3.2. 
ISO OSI IDRP, BGP and the Development of Policy Routing

During the decade before the widespread success of the World Wide
Web, ISO was developing the communications architecture and protocol
suite Open Systems Interconnection (OSI).  For a considerable part of
this time OSI was seen as a possible competitor for, and even a
replacement for, the IP suite as the basis for the Internet.  The
technical developments of the two protocols were quite heavily
interrelated, with each providing ideas and even components that were
adapted into the other suite.

During the early stages of the development of OSI, the IP suite was
still mainly in use on the ARPANET and the relatively small-scale
first-phase NSFnet.  This was effectively a single administrative
domain with a simple tree-structured network in a three-level
hierarchy connected to a single logical exchange point (the NSFnet
backbone).  In the second half of the 1980s the NSFNET was starting
on the growth and transformation that would lead to today's Internet.
It was becoming clear that the backbone routing protocol, the
Exterior Gateway Protocol (EGP) [RFC0904], was not going to cope even
with the limited expansion being planned.  EGP is an "all informed"
protocol which needed to know the identities of all gateways, and
this was no longer reasonable.  With the increasing complexity of the
NSFnet and the linkage of the NSFnet network to other networks, there
was a desire for policy-based routing which would allow
administrators to manage the flow of packets between networks.  The
first version of the Border Gateway Protocol (BGP-1) [RFC1105] was
developed as a replacement for EGP with policy capabilities - a
stopgap EGP version 3 had been created as an interim measure while
BGP was developed.
BGP was designed to work on a hierarchically structured network, such
as the original NSFNET, but could also work on networks that were at
least partially non-hierarchical, where there were links between ASs
at the same level in the hierarchy (we would now call these 'peering
arrangements'), although the protocol made a distinction between
different kinds of links (links are classified as upwards, downwards
or sideways).  ASs themselves were a 'fix' for the complexity that
developed in the three-tier structure of the NSFnet.

Meanwhile the OSI architects, led by Lyman Chapin, were developing a
much more general architecture for large-scale networks.  They had
recognized that no one node, especially an end-system (host), could
or should attempt to remember routes from "here" to "anywhere" - this
sounds obvious today but was not so obvious 20 years ago.  They were
also considering hierarchical networks with independently
administered domains - a model already well entrenched in the public
switched telephone network.  This led to a vision of a network with
multiple independent administrative domains with an arbitrary
interconnection graph and a hierarchy of routing functionality.  This
architecture was fairly well established by 1987 [Tsuchiya87].  The
architecture initially envisaged a three-level routing functionality
hierarchy in which each layer had significantly different
characteristics:

1.  *End-system to Intermediate system routing (host to router)*, in
    which the principal functions are discovery and redirection.

2.  *Intra-domain intermediate system to intermediate system routing
    (router to router)*, in which "best" routes between end-systems
    in a single administrative domain are computed and used.  A
    single algorithm and routing protocol would be used throughout
    any one domain.

3.
*Inter-domain intermediate-system to intermediate system routing
    (router to router)*, in which routes between routing domains
    within administrative domains are computed (routing is considered
    separately between administrative domains and routing domains).

Level 3 of this hierarchy was still somewhat fuzzy.  Tsuchiya says:

   The last two components, Inter-Domain and Inter-Administration
   routing, are less clear-cut.  It is not obvious what should be
   standardized with respect to these two components of routing.  For
   example, for Inter-Domain routing, what can be expected from the
   Domains?  By asking Domains to provide some kind of external
   behavior, we limit their autonomy.  If we expect nothing of their
   external behavior, then routing functionality will be minimal.

   Across administrations, it is not known how much trust there will
   be.  In fact, the definition of trust itself can only be
   determined by the two or more administrations involved.

   Fundamentally, the problem with Inter-Domain and
   Inter-Administration routing is that autonomy and mistrust are
   both antithetical to routing.  Accomplishing either will involve a
   number of tradeoffs which will require more knowledge about the
   environments within which they will operate.

Further refinement of the model occurred over the next couple of
years, and a more fully formed view is given by Huitema and Dabbous
in 1989 [Huitema90].  By this stage work on the original IS-IS link
state protocol, originated by the Digital Equipment Corporation
(DEC), was fairly advanced and was close to becoming a Draft
International Standard.  IS-IS is of course a major component of
intra-domain routing today and inspired the development of the Open
Shortest Path First (OSPF) family.  However, Huitema and Dabbous were
not able to give any indication of protocol work for Level 3.
There are hints of possible use of centralized route servers.

In the meantime, the NSFnet consortium and the IETF had been
struggling with the rapid growth of the NSFnet.  It had been clear
since fairly early on that EGP was not suitable for handling the
expanding network, and the race was on to find a replacement.  There
had been some intent to include a metric in EGP to facilitate routing
decisions, but no agreement could be reached on how to define the
metric.  The lack of trust was seen as one of the main reasons that
EGP could not establish a globally acceptable routing metric: again
this seems to be a clearly futile aim from this distance in time!
Consequently EGP became effectively a rudimentary path-vector
protocol which linked gateways with Autonomous Systems.  It was
totally reliant on the tree-structured network to avoid routing
loops, and the all-informed nature of EGP meant that update packets
became very large.  BGP version 1 [RFC1105] was standardized in 1989
but had been in development for some time before this and had already
seen action in production networks prior to standardization.  BGP was
the first real path-vector routing protocol and was intended to
relieve some of the scaling problems as well as providing
policy-based routing.  Routes were described as paths along a
'vector' of ASs without any associated cost metric.  This way of
describing routes was explicitly intended to allow detection of
routing loops.  It was assumed that the intra-domain routing system
was loop-free, with the implication that the total routing system
would be loop-free if there were no loops in the AS path.  Note that
there were no theoretical underpinnings for this work, and it traded
guaranteed convergence for freedom from routing loops.

Also, the NSFnet was a government-funded research and education
network.
Commercial companies which were partners in some of the projects were
using the NSFnet for their research activities, but it was becoming
clear that these companies also needed networks for commercial
traffic.  NSFnet had put in place "acceptable use" policies which
were intended to limit the use of the network.  However, there was
little or no technology to support the legal framework.

Practical experience, IETF and IAB discussion (centred in the
Internet Architecture Task Force) and the OSI theoretical work were
by now coming to the same conclusions:

   o  Networks were going to be composed out of multiple
      administrative domains (the federated network),

   o  The connections between these domains would be an arbitrary
      graph and certainly not a tree,

   o  The administrative domains would wish to establish distinctive,
      independent routing policies through the graph of Autonomous
      Systems, and

   o  Administrative Domains would have a degree of distrust of each
      other which would mean that policies would remain opaque.

These views were reflected by Susan Hares' (Merit) contribution to
the Internet Architecture (INARC) workshop in 1989, summarized in the
report of the workshop [INARC89]:

   The rich interconnectivity within the Internet causes routing
   problems today.  However, the presenter believes the problem is
   not the high degree of interconnection, but the routing protocols
   and models upon which these protocols are based.  Rich
   interconnectivity can provide redundancy which can help keep
   packets moving even through periods of outages.  Our model of
   interdomain routing needs to change.  The model of autonomous
   confederations and autonomous systems [RFC0975] no longer fits the
   reality of many regional networks.  The ISO models of
   administrative domains and routing domains better fit the current
   Internet's routing structure.
   With the first NSFNET backbone, NSF assumed that the Internet
   would be used as a production network for research traffic.  We
   cannot stop these networks for a month and install all new routing
   protocols.  The Internet will need to evolve its changes to
   networking protocols while still continuing to serve its users.
   This reality colors how plans are made to change routing
   protocols.

It is also interesting to note that the difficulties of organising a
transition were recognized at this stage and have not been seriously
explored or resolved since.

Policies would primarily be concerned with controlling which traffic
should be allowed to transit a domain (to satisfy commercial
constraints or acceptable use policies), thereby controlling which
traffic uses the resources of the domain.  The solution adopted by
both the IETF and OSI was a form of distance vector hop-by-hop
routing with explicit policy terms.  The reasoning for this choice
can be found in Breslau and Estrin's 1990 paper [Breslau90]
(implicitly - because some other alternatives are given, such as a
link state with policy suggestion which, with hindsight, would have
even greater problems than BGP on a global-scale network).
Traditional distance vector protocols exchanged routing information
in the form of a destination and a metric.  The new protocols
explicitly associated policy expressions with the route by including
either a list of the source ASs that are permitted to use the route
described in the routing update, and/or a list of all ASs traversed
along the advertised route.

Parallel protocol developments were already in progress by the time
this paper was published: BGP version 2 [RFC1163] in the IETF and the
Inter-Domain Routing Protocol (IDRP) [ISO10747], which would be the
Level 3 routing protocol for the OSI architecture.
IDRP was developed under the aegis of the ANSI X3S3.3 working group
led by Lyman Chapin and Charles Kunzinger.  The two protocols were
very similar in basic design, but IDRP has some extra features, some
of which have been incorporated into later versions of BGP; others
may yet be so, and still others may be seen to be inappropriate.
Breslau and Estrin summarize the design of IDRP as follows:

   IDRP attempts to solve the looping and convergence problems
   inherent in distance vector routing by including full AD
   [Administrative Domain - essentially the equivalent of what are
   now called ASs] path information in routing updates.  Each routing
   update includes the set of ADs that must be traversed in order to
   reach the specified destination.  In this way, routes that contain
   AD loops can be avoided.

   IDRP updates also contain additional information relevant to
   policy constraints.  For instance, these updates can specify what
   other ADs are allowed to receive the information described in the
   update.  In this way, IDRP is able to express source-specific
   policies.  The IDRP protocol also provides the structure for the
   addition of other types of policy-related information in routing
   updates.  For example, User Class Identifiers (UCI) could also be
   included as policy attributes in routing updates.

   Using the policy route attributes, IDRP provides the framework for
   expressing more fine-grained policy in routing decisions.
   However, because it uses hop-by-hop distance vector routing, it
   only allows a single route to each destination per-QOS to be
   advertised.  As the policy attributes associated with routes
   become more fine-grained, advertised routes will be applicable to
   fewer sources.
   This implies a need for multiple routes to be advertised for each
   destination in order to increase the probability that sources have
   acceptable routes available to them.  This effectively replicates
   the routing table per forwarding entity for each QoS, UCI, source
   combination that might appear in a packet.  Consequently, we claim
   that this approach does not scale well as policies become more
   fine-grained, i.e., source- or UCI-specific policies.

Over the next three or four years, successive versions of BGP (BGP-2
[RFC1163], BGP-3 [RFC1267] and BGP-4 [RFC1771]) were deployed to cope
with the growing and by now commercialized Internet.  From BGP-2
onwards, BGP made no assumptions about an overall structure of
interconnections, allowing it to cope with today's dense web of
interconnections between ASs.  BGP version 4 was developed to handle
the change from classful to classless addressing.  For most of this
time IDRP was being developed in parallel, and both protocols were
implemented in the Merit gatedaemon routing protocol suite.  During
this time there was a movement within the IETF which saw BGP as a
stopgap measure to be used until the more sophisticated IDRP could be
adapted to run over IP instead of the OSI connectionless protocol
CLNP.  However, unlike its intra-domain counterpart IS-IS, which has
stood the test of time, and indeed proved to be more flexible than
OSPF, IDRP was ultimately not adopted by the market.  By the time the
NSFnet backbone was decommissioned in 1995, BGP-4 was the
inter-domain routing protocol of choice and OSI's star was already
beginning to wane.  IDRP is now little remembered.

A more complete account of the capabilities of IDRP can be found in
chapter 14 of David Piscitello and Lyman Chapin's book 'Open Systems
Networking: TCP/IP and OSI', which is now readable on the Internet
[Chapin94].
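The path-based loop suppression shared by BGP (AS_PATH) and IDRP (AD
path) can be sketched in a few lines.  This is an illustrative sketch
only, not protocol machinery; the function names are invented for the
example and real implementations carry many more attributes:

```python
# Sketch of path-vector loop suppression: a speaker rejects any
# route whose path already contains its own number, and prepends
# its own number when re-advertising.  Function names are invented
# for this illustration.

def accept_update(my_as: int, as_path: list[int]) -> bool:
    """Reject a route whose path already contains this AS: accepting
    it could only re-import a route that has looped back."""
    return my_as not in as_path

def propagate(my_as: int, as_path: list[int]) -> list[int]:
    """On re-advertisement, prepend this AS, so the path records
    every AS the route has traversed."""
    return [my_as] + as_path

# AS 100 hears a route that has passed through ASs 200 and 300.
path = [200, 300]
assert accept_update(100, path)      # no loop: acceptable
advertised = propagate(100, path)    # path becomes [100, 200, 300]
# If the advertisement ever returns to AS 200, it is discarded.
assert not accept_update(200, advertised)
```

As the text notes, this check guarantees loop-free paths without
guaranteeing convergence: nothing in the rule itself bounds how long
speakers may keep exchanging alternative paths.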
IDRP also contained quite extensive means for securing routing
exchanges, much of it based on X.509 certificates for each router and
public/private key encryption of routing updates.

Some of the capabilities of IDRP which might yet appear in a future
version of BGP include the ability to manage routes with explicit QoS
classes, and the concept of domain confederations (somewhat different
from the confederation mechanism in today's BGP) as an extra level in
the hierarchy of routing.

3.3.  Nimrod Requirements

Nimrod, as expressed by Noel Chiappa in his early document, "A New IP
Routing and Addressing Architecture" [Chiappa91], and later in the
NIMROD Working Group documents [RFC1753] and [RFC1992], established a
number of requirements that need to be considered by any new routing
architecture.  The Nimrod requirements took RFC1126 as a starting
point and went further.

The goals of Nimrod, quoted from [RFC1992], were as follows:

1.  To support a dynamic internetwork of _arbitrary size_ (our
    emphasis) by providing mechanisms to control the amount of
    routing information that must be known throughout an
    internetwork.

2.  To provide service-specific routing in the presence of multiple
    constraints imposed by service providers and users.

3.  To admit incremental deployment throughout an internetwork.

It is certain that these goals should be considered requirements for
any new domain routing architecture.

   o  As discussed in other sections of this document, the amount of
      information needed to maintain the routing system is growing at
      a rate that does not scale.  And yet, as the services and
      constraints upon those services grow, there is a need for more
      information to be maintained by the routing system.  One of the
      key terms in the first requirement is 'control'.
      While increasing amounts of information need to be known and
      maintained in the Internet, the amounts and kinds of
      information that are distributed can be controlled.  This goal
      should be reflected in the requirements for the future domain
      architecture.

   o  If anything, the demand for specific services in the Internet
      has grown since 1996, when the Nimrod architecture was
      published.  Additionally, the kinds of constraints that service
      providers need to impose upon their networks, and that services
      need to impose upon the routing, have also increased.  Any
      changes made to the network in the last half-decade have not
      significantly improved this situation.

   o  The ability to incrementally deploy any new routing
      architecture within the Internet is still an absolute
      necessity.  It is impossible to imagine that a new routing
      architecture could supplant the current architecture on a flag
      day.

At one point in time Nimrod, with its addressing and routing
architectures, was seen as a candidate for IPng.  History shows that
it was not accepted as the IPng, having been ruled out of the
selection process by the IESG in 1994 on the grounds that it was 'too
much of a research effort' [RFC1752], although input for the
requirements of IPng was explicitly solicited from Chiappa [RFC1753].
Instead IPv6 has been put forth as the IPng.  Without entering a
discussion of the relative merits of IPv6 versus Nimrod, it is
apparent that IPv6, while it may solve many problems, does not solve
the critical routing problems in the Internet today.  In fact, in
some sense it exacerbates them by adding a requirement for support of
two Internet protocols and their respective addressing methods.  In
many ways the addition of IPv6 to the mix of methods in today's
Internet only points to the fact that the goals, as set forth by the
Nimrod team, remain as necessary goals.
1259 There is another sense in which study of Nimrod and its architecture 1260 may be important to deriving a future domain routing architecture. 1261 Nimrod can be said to have two derivatives: 1263 o Multi-Protocol Label Switching (MPLS), in that it took the notion 1264 of forwarding along well-known paths. 1265 o Private Network-Node Interface (PNNI), in that it took the notion 1266 of abstracting topological information and using that information 1267 to create connections for traffic. 1269 It is important to note that, whilst MPLS and PNNI borrowed ideas 1270 from Nimrod, neither of them can be said to be an implementation of 1271 this architecture. 1273 3.4. PNNI 1275 The Private Network-Node Interface (PNNI) routing protocol was 1276 developed under the ATM Forum's auspices as a hierarchical route 1277 determination protocol for ATM, a connection-oriented architecture. 1278 It is reputed to have developed several of its methods from a study 1279 of the Nimrod architecture. What can be gained from an analysis of 1280 what did and did not succeed in PNNI? 1282 The PNNI protocol includes the assumption that all peer groups are 1283 willing to cooperate, and that the entire network is under the same 1284 top-level administration. Are there limitations that stem from this 'world 1285 node' presupposition? As discussed in [RFC3221], the Internet is no 1286 longer a clean hierarchy and there is a lot of resistance to having 1287 any sort of 'ultimate authority' controlling or even brokering 1288 communication. 1290 PNNI is the first deployed example of a routing protocol that uses 1291 abstract map exchange (as opposed to distance vector or link state 1292 mechanisms) for inter-domain routing information exchange. One 1293 consequence of this is that domains need not all use the same 1294 mechanism for map creation. What were the results of this 1295 abstraction and source-based route calculation mechanism?
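The two ideas that PNNI is usually credited with here - topology abstraction into peer groups and source-based route calculation - can be sketched as follows. This is a minimal Python illustration only; the switch names, links, peer-group labels and hop-count metric are assumptions made for the example, not PNNI's actual data structures or algorithms.

```python
from collections import deque

# Hypothetical flat topology: five switches in three peer groups.
# All names and links below are illustrative assumptions.
links = [("a1", "a2"), ("a2", "b1"), ("b1", "b2"), ("b2", "c1"), ("a1", "c1")]
peer_groups = {"A": {"a1", "a2"}, "B": {"b1", "b2"}, "C": {"c1"}}

def abstract(links, groups):
    """Collapse each peer group to one logical node, the way a higher-level
    peer group would see it, keeping one logical link per group pair."""
    node_of = {n: g for g, members in groups.items() for n in members}
    logical = {tuple(sorted((node_of[u], node_of[v])))
               for u, v in links if node_of[u] != node_of[v]}
    return sorted(logical)

def source_route(edges, src, dst):
    """Hop-count BFS over the abstract map: the ingress computes the whole
    logical path up front, as a source-routed connection setup would."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in adj.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(abstract(links, peer_groups))                          # [('A', 'B'), ('A', 'C'), ('B', 'C')]
print(source_route(abstract(links, peer_groups), "A", "C"))  # ['A', 'C']
```

The point of the sketch is that the route is computed over the abstract map, so how each group built its internal map is invisible to the source - which is also why domains need not share a map-creation mechanism.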
1297 Since the authors of this document do not have experience running a 1298 PNNI network, the comments above are from a theoretical perspective. 1299 Further research on these issues based on operational experience is 1300 required. 1302 4. Recent Research Work 1304 4.1. Developments in Internet Connectivity 1306 The work commissioned from Geoff Huston by the Internet Architecture 1307 Board [RFC3221] draws a number of conclusions from analysis of BGP 1308 routing tables and routing registry databases: 1310 o The connectivity between provider ASs is becoming more like a 1311 dense mesh than the tree structure that was commonly assumed 1312 a couple of years ago. This has been driven by the 1313 increasing amounts charged for peering and transit traffic by 1314 global service providers. Local direct peering and Internet 1315 exchanges are becoming steadily more common as the cost of local 1316 fibre connections drops. 1317 o End user sites are increasingly resorting to multi-homing onto two 1318 or more service providers as a way of improving resiliency. This 1319 has a knock-on effect of spectacularly fast depletion of the 1320 available pool of AS numbers as end user sites require public AS 1321 numbers to become multi-homed, and a corresponding increase in the 1322 number of prefixes advertised in BGP. 1323 o Multi-homed sites are using advertisement of longer prefixes in 1324 BGP as a means of traffic engineering to spread load across their 1325 multiple external connections, with further impact on the size of 1326 the BGP tables. 1327 o Operational practices are not uniform, and in some cases lack of 1328 knowledge or training is leading to instability and/or excessive 1329 advertisement of routes by incorrectly configured BGP speakers. 1330 o All these factors are quickly negating the advantages in limiting 1331 the expansion of BGP routing tables that were gained by the 1332 introduction of CIDR and consequent prefix aggregation in BGP.
It 1333 is also now impossible for IPv6 to realize the world view in which 1334 the default free zone would be limited to perhaps 10,000 prefixes. 1335 o The typical 'width' of the Internet in AS hops is now around five, 1336 and much less in many cases. 1338 These conclusions have a considerable impact on the requirements for 1339 the future domain routing architecture: 1340 o Topological hierarchy (e.g. mandating a tree-structured 1341 connectivity) cannot be relied upon to deliver scalability of a 1342 large Internet routing system 1343 o Aggregation cannot be relied upon to constrain the size of routing 1344 tables for an all-informed routing system 1346 4.2. DARPA NewArch Project 1348 DARPA funded a project to think about a new architecture for a future 1349 generation Internet, called NewArch. 1350 Work started in the first half of 2000 and the main project finished 1351 in 2003 [NewArch03]. 1353 The project's main conclusion is that, as the Internet becomes 1354 mainstream infrastructure, fewer and fewer of the requirements are 1355 truly global but may apply with different force or not at all in 1356 certain parts of the network. This (it is claimed) makes the 1357 compilation of a single, ordered list of requirements deeply 1358 problematic. Instead we may have to produce multiple requirement 1359 sets with support for differing requirement importance at different 1360 times and in different places. This 'meta-requirement' significantly 1361 impacts architectural design.
1363 Potential new technical requirements identified so far include: 1364 o Commercial environment concerns such as richer inter-provider 1365 policy controls and support for a variety of payment models 1366 o Trustworthiness 1367 o Ubiquitous mobility 1368 o Policy-driven self-organisation ('deep auto configuration') 1369 o Extreme short-time-scale resource variability 1370 o Capacity allocation mechanisms 1371 o Speed, propagation delay and Delay/BandWidth Product issues 1373 Non-technical or political 'requirements' include: 1374 o Legal and Policy drivers such as 1375 * Privacy and free/anonymous speech 1376 * Intellectual property concerns 1377 * Encryption export controls 1378 * Law enforcement surveillance regulations 1379 * Charging and taxation issues 1380 o Reconciling national variations and consistent operation in a 1381 world wide infrastructure 1383 The conclusions of the work are now summarized in the final report. 1385 4.2.1. Defending the End-to-End Principle 1387 One of the participants in the DARPA NewArch work (Dave Clark), with one 1388 of his associates, has also published a very interesting paper 1389 analyzing the impact of some of the new requirements identified in 1390 NewArch (see Section 4.2) on the end-to-end principle that has guided 1391 the development of the Internet to date [Blumenthal01]. Their 1392 primary conclusion is that the loss of trust between the users at the 1393 ends of an end-to-end path has the most fundamental effect on the Internet. 1394 This is clear in the context of the routing system, where operators 1395 are unwilling to reveal the inner workings of their networks for 1396 commercial reasons. Similarly, trusted third parties and their 1397 avatars (mainly mid-boxes of one sort or another) have a major impact 1398 on the end-to-end principles and the routing mechanisms that went 1399 with them.
Overall, the end to end principles should be defended so 1400 far as is possible - some changes are already too deeply embedded to 1401 make it possible to go back to full trust and openness - at least 1402 partly as a means of staving off the day when the network will ossify 1403 into an unchangeable form and function (much as the telephone network 1404 has done). The hope is that by that time a new Internet will appear 1405 to offer a context for unfettered innovation. 1407 5. Existing problems of BGP and the current Inter-/Intra-Domain 1408 Architecture 1410 Although most of the people who have to work with BGP today believe 1411 it to be a useful, working protocol, discussions have brought to 1412 light a number of areas where BGP or the relationship between BGP and 1413 the intra-domain routing protocols in use today could be improved. 1414 BGP-4 has been and continues to be extended since it was originally 1415 introduced in [RFC1771], and the protocol as deployed has been 1416 documented in [RFC4271]. This section is, to a large extent, a wish 1417 list for the future domain routing architecture based on those areas 1418 where BGP is seen to be lacking, rather than simply a list of 1419 problems with BGP. The shortcomings of today's inter-domain routing 1420 system have also been extensively surveyed in 'Architectural 1421 Requirements for Inter-Domain Routing in the Internet' [RFC3221], 1422 particularly with respect to its stability and the problems produced 1423 by explosions in the size of the Internet. 1425 5.1. BGP and Auto-aggregation 1427 The stability, and later linear growth, of the number of routing 1428 objects (prefixes) that was achieved by the introduction of CIDR 1429 around 1994 has now once again been replaced by near- 1430 exponential growth in the number of routing objects.
The granularity of 1431 many of the objects advertised in the default free zone is very small 1432 (prefix length of 22 or longer); this granularity appears to be a by- 1433 product of attempts to perform precision traffic engineering related 1434 to increasing levels of multi-homing. At present there is no 1435 mechanism in BGP that would allow an AS to aggregate such prefixes 1436 without advance knowledge of their existence, even if it were possible 1437 to deduce automatically that they could be aggregated. Achieving 1438 satisfactory auto-aggregation would also significantly reduce the 1439 non-locality problems associated with instability in peripheral ASs. 1441 On the other hand, it may be that alterations to the connectivity of 1442 the net, as described in [RFC3221] and Section 2.5.1, will limit the 1443 usefulness of auto-aggregation. 1445 5.2. Convergence and Recovery Issues 1447 BGP today is a stable protocol under most circumstances, but this has 1448 been achieved at the expense of making the convergence time of the 1449 inter-domain routing system very slow under some conditions. This 1450 has a detrimental effect on the recovery of the network from 1451 failures. 1453 The timers that control the behavior of BGP are typically set to 1454 values in the region of several tens of seconds to a few minutes, 1455 which constrains the responsiveness of BGP to failure conditions. 1457 In the early days of deployment of BGP, poor network stability and 1458 router software problems led to storms of withdrawals closely 1459 followed by re-advertisements of many prefixes. To control the load 1460 on routing software imposed by these 'route flaps', route flap 1461 damping was introduced into BGP. Most operators have now implemented 1462 a degree of route flap damping in their deployments of BGP. This 1463 restricts the number of times that the routing tables will be rebuilt 1464 even if a route is going up and down very frequently.
Unfortunately, 1465 the effect of route flap damping is exponential in its behavior, which 1466 can result in some parts of the Internet being inaccessible for hours 1467 at a time. 1469 There is evidence ([RFC3221] and our own measurements [Jiang02]) that 1470 in today's network route flap is disproportionately associated with 1471 the fine-grained prefixes (length 22 or longer) associated with traffic 1472 engineering at the periphery of the network. Auto-aggregation, as 1473 previously discussed, would tend to mask such instability and prevent 1474 it from being propagated across the whole network. Another question that 1475 needs to be studied is the continuing need for an architecture that 1476 requires global convergence. Some of our studies (unpublished) show 1477 that, in some localities at least, the network never actually reaches 1478 stability; i.e., it never really globally converges. Can a global, 1479 and beyond, network be designed with a requirement for global 1480 convergence? 1482 5.3. Non-locality of Effects of Instability and Misconfiguration 1484 There have been a number of instances, some of which are well 1485 documented, of a mistake in BGP configuration in a single peripheral 1486 AS propagating across the whole Internet and resulting in misrouting 1487 of most of the traffic in the Internet. 1489 Similarly, route flap in a single peripheral AS can require route 1490 table recalculation across the entire Internet. 1492 This non-locality of effects is highly undesirable, and it would be a 1493 considerable improvement if such effects were naturally limited to a 1494 small area of the network around the problem. This is another 1495 argument for an architecture that does not require global 1496 convergence. 1498 5.4. Multihoming Issues 1500 As discussed previously, the increasing use of multi-homing as a 1501 robustness technique by peripheral networks requires that multiple 1502 routes have to be advertised for such domains.
These routes must not 1503 be aggregated close in to the multi-homed domain, as this would defeat 1504 the traffic engineering implied by multi-homing, and currently cannot 1505 be aggregated further away from the multi-homed domain due to the 1506 lack of auto-aggregation capabilities. Consequently, the default 1507 free zone routing table is growing exponentially, as it was before 1508 CIDR. 1510 The longest prefix match routing technique introduced by CIDR, and 1511 implemented in BGP-4, when combined with provider address allocation 1512 is an obstacle to effective multi-homing if load sharing across the 1513 multiple links is required: If an AS has been allocated its addresses 1514 from an upstream provider, the upstream provider can aggregate those 1515 addresses with those of other customers and need only advertise a 1516 single prefix for a range of customers. But, if the customer AS is 1517 also connected to another provider, the second provider is not able 1518 to aggregate the customer addresses because they are not taken from 1519 its allocation, and will therefore have to announce a more specific 1520 route to the customer AS. The longest match rule will then direct 1521 all traffic through the second provider, which is not what is required. 1523 Example: 1525 \ / 1526 AS1 AS2 1527 \ / 1528 AS3 1530 Figure 1: Address Aggregation 1532 AS3 has received its addresses from AS1, which means AS1 can 1533 aggregate. But if AS3 wants its traffic to be seen equally both 1534 ways, AS3 is forced to announce both the aggregate and the more 1535 specific route to AS2. 1537 This problem has induced many ASs to apply for their own address 1538 allocation even though they could have been allocated from an 1539 upstream provider, further exacerbating the default free zone route 1540 table size explosion. This problem also interferes with the desire 1541 of many providers in the default free zone to route only prefixes 1542 that are equal to or shorter than 20 or 19 bits.
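The longest match behaviour described above can be sketched in a few lines of Python. The prefixes and AS names below are illustrative assumptions loosely following Figure 1 (documentation address space, not a real routing table):

```python
import ipaddress

# Hypothetical RIB: AS1 aggregates its customers (including AS3) into
# one /22, while AS2 must announce AS3's block as a more-specific /24.
rib = {
    ipaddress.ip_network("198.51.100.0/22"): "AS1",  # provider aggregate
    ipaddress.ip_network("198.51.100.0/24"): "AS2",  # more-specific via AS2
}

def lookup(dst):
    """Longest-prefix match: of all covering routes, take the longest mask."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in rib if addr in net]
    return rib[max(matches, key=lambda net: net.prefixlen)]

# Every address in AS3's block follows the /24 via AS2 -- no load sharing.
print(lookup("198.51.100.7"))   # AS2
# Other customers, covered only by the aggregate, still go via AS1.
print(lookup("198.51.101.9"))   # AS1
```

As the sketch shows, the /24 wins deterministically wherever it is visible, which is why AS3 must also leak the more specific route via AS1 if it wants any traffic on that link.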
1544 Note that some problems which are referred to as multihoming issues 1545 are not, and should not be, solvable through the routing system (e.g., 1546 where a TCP load distributor is needed), and multihoming is not a 1547 panacea for the general problem of robustness in a routing system 1548 [I-D.berkowitz-multirqmt]. 1550 Editors' Note: A more recent analysis of multihoming can be found 1551 in [RFC4116]. 1553 5.5. AS-number exhaustion 1555 The domain identifier or AS-number is a 16-bit number. When this 1556 paper was originally written in 2001, allocation of AS-numbers was 1557 increasing at 51% a year [RFC3221] and exhaustion by 2005 was predicted. 1558 According to some recent work again by Huston [Huston05], the rate of 1559 increase dropped off after the business downturn, but as of July 2005, 1560 well over half the available AS numbers (39000 out of 64510) had been 1561 allocated by IANA and around 20000 were visible in the global BGP 1562 routing tables. A year later these figures had grown to 42000 (April 1563 2006) and 23000 (August 2006) respectively, and the rate of allocation 1564 is currently about 3500 per year. Depending on the curve-fitting 1565 model used to predict when exhaustion will occur, the pool will run 1566 out somewhere between 2010 and 2013. There appear to be other 1567 factors at work in this rate of increase beyond an increase in the 1568 number of ISPs in business, although there is a fair degree of 1569 correlation between these numbers. AS numbers are now used for a 1570 number of purposes beyond that of identifying large routing domains: 1571 multihomed sites acquire an AS number in order to express routing 1572 preferences to their various providers, and AS numbers are used as part 1573 of the addressing mechanism for MPLS/BGP-based virtual private 1574 networks (VPNs) [RFC2547]. The IETF has had a proposal under 1575 development for over four years to increase the available range of 1576 AS-numbers to 32 bits [I-D.ietf-idr-as4bytes].
Much of the slowness 1577 in development is due to the deployment challenge during transition. 1578 Because of the difficulties of transition, deployment needs to start 1579 well in advance of actual exhaustion so that the network as a whole 1580 is ready for the new capability when it is needed. This implies that 1581 standardisation needs to be complete and implementations available 1582 well in advance of expected exhaustion, so that deployment of 1583 upgrades that can handle the longer AS numbers can start 1584 around 2008, giving a reasonable expectation that the change has been 1585 rolled out across a large fraction of the Internet by the time 1586 exhaustion occurs. 1588 5.6. Partitioned ASs 1590 Tricks with discontinuous ASs are used by operators, for example, to 1591 implement anycast. Discontinuous ASs may also come into being by 1592 chance if a multi-homed domain becomes partitioned as a result of a 1593 fault, and part of the domain can access the Internet through each 1594 connection. It may be desirable to make support for this kind of 1595 situation more transparent than it is at present. 1597 5.7. Load Sharing 1599 Load splitting or sharing was not a goal of the original designers of 1600 BGP, and it is now a problem for today's network designers and 1601 managers. Trying to fool BGP into load sharing between several links 1602 is a constantly recurring exercise for most operators today. 1604 5.8. Hold down issues 1606 As with the interval between 'hello' messages in OSPF, the typical 1607 size and defined granularity (seconds to tens of seconds) of the 1608 'keep-alive' time negotiated at start-up for each BGP connection 1609 constrains the responsiveness of BGP to link failures. 1611 The recommended values and the available lower limit for this timer 1612 were set to limit the overhead caused by keep-alive messages when 1613 link bandwidths were typically much lower than today.
Analysis and 1614 experiment ([I-D.alaettinoglu-isis-convergence], [I-D.sandiick-flip] 1615 & [RFC4204]) indicate that faster links could sustain a much higher 1616 rate of keep-alive messages without significantly impacting normal 1617 data traffic. This would improve responsiveness to link and node 1618 failures, but with a corresponding increase in the risk of 1619 instability if the error characteristics of the link are not taken 1620 properly into account when setting the keep-alive interval. 1622 Editors' Note: A 'fast' liveness protocol has been standardized as 1623 [I-D.ietf-bfd-base]. 1625 An additional problem with the hold-down mechanism in BGP is the 1626 amount of information that has to be exchanged to re-establish the 1627 database of route advertisements on each side of the link when it is 1628 re-established after a failure. Currently any failure, however brief, 1629 forces a full exchange, which could perhaps be constrained by 1630 retaining some state across limited-time failures and using revision 1631 control, transaction and replication techniques to resynchronise the 1632 databases. Various techniques have been implemented to try to reduce 1633 this problem, but they have not yet been standardised. 1635 5.9. Interaction between Inter-domain Routing and Intra-domain Routing 1637 Today, many operators' backbone routers run both I-BGP and an intra- 1638 domain protocol to maintain the routes that reach between the borders 1639 of the domain. Exporting routes from BGP into the intra-domain 1640 protocol in use and bringing them back up to BGP is not recommended 1641 [RFC2791], but it is still necessary for all backbone routers to run 1642 both protocols. BGP is used to find the egress point and the intra- 1643 domain protocol to find the path (next hop router) to the egress 1644 point across the domain.
This is not only a management problem but 1645 may also create other problems: 1646 o BGP is a distance vector protocol, as compared with most intra- 1647 domain protocols, which are link state protocols, and as such it 1648 is not optimised for convergence speed, although distance vector 1649 protocols generally require less processing power. Incidentally, more efficient 1650 distance vector algorithms are available, such as [Xu97]. 1651 o The metrics used in BGP and the intra-domain protocol are rarely 1652 comparable or combinable. Whilst there are arguments that the 1653 optimizations inside a domain may be different from those for end- 1654 to-end paths, there are occasions, such as calculating the 1655 'topologically nearest' server, when computable or combinable 1656 metrics would be of assistance. 1657 o The policies that can be implemented using BGP are designed for 1658 control of traffic exchange between operators, not for controlling 1659 paths within a domain. Policies for BGP are most conveniently 1660 expressed in the Routing Policy Specification Language (RPSL) [RFC2622], and 1661 this could be extended, if thought desirable, to include additional 1662 policy information. 1663 o If the NEXT HOP destination for a set of BGP routes becomes 1664 inaccessible because of intra-domain protocol problems, the routes 1665 using the vanished next hop have to be invalidated at the next 1666 available UPDATE. Subsequently, if the next hop route reappears, 1667 this would normally lead to the BGP speaker requesting a full 1668 table from its neighbour(s). Current implementations may attempt 1669 to circumvent the effects of intra-domain protocol route flap by 1670 caching the invalid routes for a period in case the next hop is 1671 restored through the 'graceful restart' mechanism. 1673 * Editors' Note: This was standardized as [I-D.ietf-idr-restart].
1675 o Synchronization between intra-domain and inter-domain routing 1676 information is a problem as long as we use different protocols for 1677 intra-domain and inter-domain routing, which will most probably be 1678 the case even in the future because of the differing requirements 1679 in the two situations. Some sort of synchronization between those 1680 two protocols would be useful. In the RFC 'IS-IS Transient 1681 Blackhole Avoidance' [RFC3277], the intra-domain protocol side of 1682 the story is covered (there is an equivalent discussion for OSPF). 1683 o Synchronizing in BGP means waiting for the intra-domain protocol 1684 to know about the same networks as the inter-domain protocol, 1685 which can take a significant period of time and slows down the 1686 convergence of BGP by adding the intra-domain protocol convergence 1687 time into each cycle. In general, operators no longer attempt full 1688 synchronization in order to avoid this problem (in general, 1689 redistributing the entire BGP routing feed into the local intra- 1690 domain protocol is unnecessary and undesirable, but where a domain 1691 has multiple exits to peers and other non-customer networks, 1692 changes in BGP routing that affect the exit taken by traffic 1693 require corresponding re-routing in the intra-domain routing). 1695 5.10. Policy Issues 1697 There are several classes of issues with current BGP policy: 1698 o Policy is installed in an ad-hoc manner in each autonomous system. 1699 There isn't a method for ensuring that the policy installed in one 1700 router is coherent with policies installed in other routers. 1701 o As described in Griffin [Griffin99] and in McPherson [RFC3345], it 1702 is possible to create policies for ASs, and instantiate them in 1703 routers, that will cause BGP to fail to converge in certain types 1704 of topology. 1705 o There is no available network model for describing policy in a 1706 coherent manner.
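The convergence failure identified by Griffin can be seen in a deliberately simplified synchronous model. The three-AS preference table and the simultaneous-update rule below are illustrative assumptions in the spirit of the "bad gadget" example from [Griffin99], not a BGP implementation:

```python
# Three ASs (1, 2, 3) all reach destination AS 0 directly, but each
# most prefers any loop-free route heard from another AS (an assumed,
# simplified policy for illustration).
PREFER = {1: 3, 2: 1, 3: 2}  # AS i most prefers the path via PREFER[i]

def step(paths):
    """One synchronous round: every AS re-selects its best path at once."""
    new = {}
    for asn, via in PREFER.items():
        neigh_path = paths[via]
        if asn not in neigh_path:      # preferred path is loop-free: take it
            new[asn] = [asn] + neigh_path
        else:                          # otherwise fall back to the direct route
            new[asn] = [asn, 0]
    return new

paths = {1: [1, 0], 2: [2, 0], 3: [3, 0]}  # everyone starts on the direct route
seen = []
for _ in range(6):
    paths = step(paths)
    seen.append(paths)

# The system cycles with period 3 and never settles on a stable assignment.
print(seen[-1] == seen[-4], seen[-1] != seen[-2])  # True True
```

Each round every AS's preferred choice invalidates someone else's, so the route assignments chase each other forever; no timer tuning fixes this, because the divergence is a property of the policies themselves.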
1708 Policy management is extremely complex and mostly done without the 1709 aid of any automated procedures. The extreme complexity means that a 1710 highly qualified specialist is required for policy management of 1711 border routers. The training of these specialists is quite lengthy 1712 and needs to involve long periods of hands-on experience. There is, 1713 therefore, a shortage of qualified staff for installing and 1714 maintaining the routing policies. Because of the overall complexity 1715 of BGP, policy management tends to be only a relatively small topic 1716 within a complete BGP training course, and specialised policy 1717 management training courses are not generally available. 1719 5.11. Security Issues 1721 While many of the issues with BGP security have been traced either to 1722 implementation issues or to operational issues, BGP is vulnerable to 1723 Distributed Denial of Service (DDoS) attacks. Additionally, routers 1724 can be used as unwitting forwarders in DDoS attacks on other systems. 1726 Though DDoS attacks can be fought in a variety of ways, mostly by 1727 filtering methods, it takes constant vigilance. There is nothing 1728 in the current architecture or in the protocols that serves to 1729 protect the forwarders from these attacks. 1731 Editors' Note: Since the original draft was written, the issue of 1732 inter-domain routing security has been studied in much greater 1733 depth. The rpsec working group has gone into the security issues 1734 in great detail [RFC4593] and readers should refer to that work to 1735 understand the security issues. 1737 5.12. Support of MPLS and VPNs 1739 Recently, BGP has been modified to function as a signaling protocol 1740 for MPLS and for VPNs [RFC2547]. Some people see this over-loading 1741 of the BGP protocol as a boon, whilst others see it as a problem.
1742 While it was certainly convenient as a vehicle for vendors to deliver 1743 extra functionality to their products, it has exacerbated some of 1744 the performance and complexity issues of BGP. Two important problems 1745 are the additional state that must be retained and refreshed to 1746 support VPN (Virtual Private Network) tunnels, and the fact that BGP does not 1747 provide end-to-end notification, making it difficult to confirm that 1748 all necessary state has been installed or updated. 1750 It is an open question whether VPN signaling protocols should remain 1751 separate from the route determination protocols. 1753 5.13. IPv4 / IPv6 Ships in the Night 1755 The fact that service providers need to maintain two completely 1756 separate networks, one for IPv4 and one for IPv6, has been a real 1757 hindrance to the introduction of IPv6. When IPv6 does get widely 1758 deployed, it will do so without causing the disappearance of IPv4. 1759 This means that unless something is done, service providers would 1760 need to maintain the two networks in relative perpetuity. 1762 It is possible to use a single set of BGP speakers with multiprotocol 1763 extensions [RFC2858] to exchange information about both IPv4 and IPv6 1764 routes between domains, but the use of TCP as the transport protocol 1765 for the information exchange results in an asymmetry when choosing to 1766 use one of TCP over IPv4 or TCP over IPv6. Successful information 1767 exchange confirms one of IPv4 or IPv6 reachability between the 1768 speakers but not the other, making it possible that reachability is 1769 being advertised for a protocol for which it is not present. 1771 Also, current implementations do not allow a route to be advertised 1772 for both IPv4 and IPv6 in the same UPDATE message, because it is not 1773 possible to explicitly link the reachability information for an 1774 address family to the corresponding next hop information.
This could 1775 be improved, but currently results in independent UPDATEs being 1776 exchanged for each address family. 1778 5.14. Existing Tools to Support Effective Deployment of Inter-Domain 1779 Routing 1781 The tools available to network operators to assist in configuring and 1782 maintaining effective inter-domain routing in line with their defined 1783 policies are limited, and almost entirely passive. 1785 o There are no tools to facilitate the planning of the routing of a 1786 domain (either intra- or inter-domain); there are a limited number 1787 of display tools that will visualize the routing once it has been 1788 configured. 1790 o There are no tools to assist in converting business policy 1791 specifications into the RPSL language; there are limited tools to 1792 convert the RPSL into BGP commands and to check, post-facto, that 1793 the proposed policies are consistent with the policies in adjacent 1794 domains (always provided that these have been revealed and 1795 accurately documented). 1796 o There are no tools to monitor BGP route changes in real time and 1797 warn the operator about policy inconsistencies and/or 1798 instabilities. 1800 The following section summarises the tools that are available to 1801 assist with the use of RPSL. Note they are all batch-mode tools used 1802 off-line from a real network. These tools will provide checks for 1803 skilled inter-domain routing configurers, but limited assistance for 1804 the novice. 1806 5.14.1. Routing Policy Specification Language RPSL (RFC 2622, 2650) and 1807 RIPE NCC Database (RIPE 157) 1809 The Routing Policy Specification Language (RPSL) [RFC2622] enables a 1810 network operator to describe the routes, routers, and autonomous systems 1811 (ASs) that are connected to the local AS. 1813 Using the RPSL language (see [RFC2650]), a distributed database is 1814 created to describe routing policies in the Internet as described by 1815 each AS independently.
The database can be used to check the 1816 consistency of routing policies stored in the database. 1818 Tools exist ([IRRToolSet]) that can be applied to the database to 1819 answer requests of the following kinds: 1820 o Flag when two neighboring network operators specify conflicting or 1821 inconsistent routing information exchanges with each other, and 1822 also detect global inconsistencies where possible; 1823 o Extract all AS-paths between two networks that are allowed by 1824 routing policy from the routing policy database; display the 1825 connectivity a given network has according to current policies. 1827 The database queries enable a partial static solution to the 1828 convergence problem. They analyze the routing policies of a very limited 1829 part of the Internet and verify that they do not contain conflicts that 1830 could lead to protocol divergence. The static analysis of 1831 convergence of the entire system has exponential time complexity, so 1832 approximation algorithms would have to be used. 1834 The toolset also allows router configurations to be generated from 1835 RPSL specifications. 1837 Editors' Note: The "Internet Routing Registry Toolset" was 1838 originally developed by the University of Southern California's 1839 Information Sciences Institute (ISI) between 1997 and 2001 as the 1840 "Routing Arbiter ToolSet" (RAToolSet) project. The toolset is no 1841 longer developed by ISI but is used worldwide, so after a period 1842 of improvement by RIPE NCC it has now been transferred to the 1843 Internet Systems Consortium (ISC) for ongoing maintenance as a 1844 public resource. 1846 6. Security Considerations 1848 As this is an informational draft on the history of requirements in 1849 IDR and on the problems facing the current Internet IDR architecture, 1850 it does not as such create any security problems.
   On the other hand, some of the problems with today's Internet routing
   architecture do create security problems, and these have been
   discussed in the text above.

7.  IANA Considerations

   This document does not request any actions by IANA.

   RFC Editor: Please remove this section before publication.

8.  Acknowledgments

   This draft is derived from work originally produced by Babylon.
   Babylon was a loose association of individuals from academia, service
   providers, and vendors whose goal was to discuss issues in Internet
   routing with the intention of finding solutions for those problems.

   The individual members who contributed materially to this draft are:
   Anders Bergsten, Howard Berkowitz, Malin Carlzon, Lenka Carr
   Motyckova, Elwyn Davies, Avri Doria, Pierre Fransson, Yong Jiang,
   Dmitri Krioukov, Tove Madsen, Olle Pers, and Olov Schelen.

   Thanks also go to the members of Babylon and others who did
   substantial reviews of this material.  Specifically, we would like to
   acknowledge the helpful comments and suggestions of the following
   individuals: Loa Andersson, Tomas Ahlstrom, Erik Aman, Thomas
   Eriksson, Niklas Borg, Nigel Bragg, Thomas Chmara, Krister Edlund,
   Owe Grafford, Torbjorn Lundberg, Jasminko Mulahusic, Florian-Daniel
   Otel, Bernhard Stockman, Tom Worster, and Roberto Zamparo.

   In addition, the authors are indebted to the folks who wrote all the
   references we consulted in putting this document together.  This
   includes not only the authors of the references explicitly listed
   below, but also those who contributed to the mailing lists we have
   been participating in for years.

   Finally, it is the editors who are responsible for any lack of
   clarity, any errors, glaring omissions, or misunderstandings.

9.  References

   [Blumenthal01]
              Blumenthal, M. and D.
Clark, "Rethinking the design of the
              Internet: The end-to-end arguments vs. the brave new
              world", May 2001.

   [Breslau90]
              Breslau, L. and D. Estrin, "An Architecture for Network-
              Layer Routing in OSI", Proceedings of the ACM Symposium on
              Communications Architectures & Protocols, 1990.

   [Chapin94]
              Piscitello, D. and A. Chapin, "Open Systems Networking:
              TCP/IP & OSI", Addison-Wesley (copyright assigned to the
              authors), 1994.

   [Chiappa91]
              Chiappa, N., "A New IP Routing and Addressing
              Architecture", Internet Draft
              draft-chiappa-routing-01.txt, 1991.

   [Griffin99]
              Griffin, T. and G. Wilfong, "An Analysis of BGP
              Convergence Properties", Proceedings of ACM SIGCOMM '99,
              1999.

   [Huitema90]
              Huitema, C. and W. Dabbous, "Routeing protocols
              development in the OSI architecture", Proceedings of
              ISCIS V, Turkey, 1990.

   [Huston05]
              Huston, G., "Exploring Autonomous System Numbers", The ISP
              Column, August 2005.

   [I-D.alaettinoglu-isis-convergence]
              Alaettinoglu, C., Jacobson, V., and H. Yu, "Towards Milli-
              Second IGP Convergence",
              draft-alaettinoglu-isis-convergence-00 (work in progress),
              November 2000.

   [I-D.berkowitz-multirqmt]
              Berkowitz, H. and D. Krioukov, "To Be Multihomed:
              Requirements and Definitions",
              draft-berkowitz-multirqmt-02 (work in progress), 2002.

   [I-D.ietf-bfd-base]
              Katz, D. and D. Ward, "Bidirectional Forwarding
              Detection", draft-ietf-bfd-base-05 (work in progress),
              June 2006.

   [I-D.ietf-idr-as4bytes]
              Vohra, Q. and E. Chen, "BGP Support for Four-octet AS
              Number Space", draft-ietf-idr-as4bytes-13 (work in
              progress), February 2007.

   [I-D.ietf-idr-restart]
              Sangli, S., "Graceful Restart Mechanism for BGP",
              draft-ietf-idr-restart-13 (work in progress), July 2006.
   [I-D.irtf-routing-reqs]
              Doria, A., "Requirements for Inter-Domain Routing",
              draft-irtf-routing-reqs-07 (work in progress),
              January 2007.

   [I-D.sandiick-flip]
              Sandick, H., Squire, M., Cain, B., Duncan, I., and B.
              Haberman, "Fast LIveness Protocol (FLIP)",
              draft-sandiick-flip-00 (work in progress), February 2000.

   [INARC89]  Mills, D., Ed. and M. Davis, Ed., "Internet Architecture
              Workshop: Future of the Internet System Architecture and
              TCP/IP Protocols - Report", Internet Architecture Task
              Force (INARC), 1990.

   [IRRToolSet]
              Internet Systems Consortium, "Internet Routing Registry
              Toolset Project", IRRToolSet web site, 2006.

   [ISO10747] ISO/IEC, "Protocol for Exchange of Inter-Domain Routeing
              Information among Intermediate Systems to support
              Forwarding of ISO 8473 PDUs", International Standard
              10747, 1993.

   [Jiang02]  Jiang, Y., Doria, A., Olsson, D., and F. Pettersson,
              "Inter-domain Routing Stability Measurement", 2002.

   [Labovitz02]
              Labovitz, C., Ahuja, A., Farnam, J., and A. Bose,
              "Experimental Measurement of Delayed Convergence", NANOG,
              2002.

   [NewArch03]
              Clark, D., Sollins, K., Wroclawski, J., Katabi, D., Kulik,
              J., Yang, X., Braden, R., Faber, T., Falk, A., Pingali,
              V., Handley, M., and N. Chiappa, "New Arch: Future
              Generation Internet Architecture", December 2003.

   [RFC0904]  Mills, D., "Exterior Gateway Protocol formal
              specification", RFC 904, April 1984.

   [RFC0975]  Mills, D., "Autonomous confederations", RFC 975,
              February 1986.

   [RFC1105]  Lougheed, K. and Y. Rekhter, "Border Gateway Protocol
              (BGP)", RFC 1105, June 1989.

   [RFC1126]  Little, M., "Goals and functional requirements for inter-
              autonomous system routing", RFC 1126, October 1989.

   [RFC1163]  Lougheed, K. and Y. Rekhter, "Border Gateway Protocol
              (BGP)", RFC 1163, June 1990.
   [RFC1267]  Lougheed, K. and Y. Rekhter, "Border Gateway Protocol 3
              (BGP-3)", RFC 1267, October 1991.

   [RFC1752]  Bradner, S. and A. Mankin, "The Recommendation for the IP
              Next Generation Protocol", RFC 1752, January 1995.

   [RFC1753]  Chiappa, J., "IPng Technical Requirements Of the Nimrod
              Routing and Addressing Architecture", RFC 1753,
              December 1994.

   [RFC1771]  Rekhter, Y. and T. Li, "A Border Gateway Protocol 4
              (BGP-4)", RFC 1771, March 1995.

   [RFC1992]  Castineyra, I., Chiappa, N., and M. Steenstrup, "The
              Nimrod Routing Architecture", RFC 1992, August 1996.

   [RFC2362]  Estrin, D., Farinacci, D., Helmy, A., Thaler, D., Deering,
              S., Handley, M., and V. Jacobson, "Protocol Independent
              Multicast-Sparse Mode (PIM-SM): Protocol Specification",
              RFC 2362, June 1998.

   [RFC2547]  Rosen, E. and Y. Rekhter, "BGP/MPLS VPNs", RFC 2547,
              March 1999.

   [RFC2622]  Alaettinoglu, C., Villamizar, C., Gerich, E., Kessens, D.,
              Meyer, D., Bates, T., Karrenberg, D., and M. Terpstra,
              "Routing Policy Specification Language (RPSL)", RFC 2622,
              June 1999.

   [RFC2650]  Meyer, D., Schmitz, J., Orange, C., Prior, M., and C.
              Alaettinoglu, "Using RPSL in Practice", RFC 2650,
              August 1999.

   [RFC2791]  Yu, J., "Scalable Routing Design Principles", RFC 2791,
              July 2000.

   [RFC2858]  Bates, T., Rekhter, Y., Chandra, R., and D. Katz,
              "Multiprotocol Extensions for BGP-4", RFC 2858, June 2000.

   [RFC3221]  Huston, G., "Commentary on Inter-Domain Routing in the
              Internet", RFC 3221, December 2001.

   [RFC3277]  McPherson, D., "Intermediate System to Intermediate System
              (IS-IS) Transient Blackhole Avoidance", RFC 3277,
              April 2002.

   [RFC3345]  McPherson, D., Gill, V., Walton, D., and A. Retana,
              "Border Gateway Protocol (BGP) Persistent Route
              Oscillation Condition", RFC 3345, August 2002.

   [RFC3618]  Fenner, B. and D.
Meyer, "Multicast Source Discovery
              Protocol (MSDP)", RFC 3618, October 2003.

   [RFC3765]  Huston, G., "NOPEER Community for Border Gateway Protocol
              (BGP) Route Scope Control", RFC 3765, April 2004.

   [RFC3913]  Thaler, D., "Border Gateway Multicast Protocol (BGMP):
              Protocol Specification", RFC 3913, September 2004.

   [RFC4116]  Abley, J., Lindqvist, K., Davies, E., Black, B., and V.
              Gill, "IPv4 Multihoming Practices and Limitations",
              RFC 4116, July 2005.

   [RFC4204]  Lang, J., "Link Management Protocol (LMP)", RFC 4204,
              October 2005.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4593]  Barbir, A., Murphy, S., and Y. Yang, "Generic Threats to
              Routing Protocols", RFC 4593, October 2006.

   [RFC4601]  Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
              "Protocol Independent Multicast - Sparse Mode (PIM-SM):
              Protocol Specification (Revised)", RFC 4601, August 2006.

   [Tsuchiya87]
              Tsuchiya, P., "An Architecture for Network-Layer Routing
              in OSI", Proceedings of the ACM Workshop on Frontiers in
              Computer Communications Technology, 1987.

   [Xu97]     Xu, Z., Dai, S., and J. Garcia-Luna-Aceves, "A More
              Efficient Distance Vector Routing Algorithm", Proc. IEEE
              MILCOM 97, Monterey, California, November 1997.

Authors' Addresses

   Elwyn B. Davies
   Consultant
   Soham, Cambs
   UK

   Phone: +44 7889 488 335
   Email: elwynd@dial.pipex.com

   Avri Doria
   LTU
   Lulea  971 87
   Sweden

   Phone: +1 401 663 5024
   Email: avri@acm.org

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.
   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).