idnits 2.17.1
draft-irtf-routing-history-10.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
** The abstract seems to contain references ([I-D.irtf-routing-reqs]),
which it shouldn't. Please replace those with straight textual mentions
of the documents in question.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may
have content which was first submitted before 10 November 2008. If you
have contacted all the original authors and they are all willing to grant
the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
this comment. If not, you may need to add the pre-RFC5378 disclaimer.
(See the Legal Provisions document at
https://trustee.ietf.org/license-info for more information.)
-- The document date (February 16, 2009) is 5540 days in the past. Is this
intentional?
Checking references for intended status: Historic
----------------------------------------------------------------------------
== Outdated reference: A later version (-11) exists of
draft-ietf-bfd-base-09
== Outdated reference: A later version (-11) exists of
draft-irtf-routing-reqs-10
-- Obsolete informational reference (is this intentional?): RFC 1105
(Obsoleted by RFC 1163)
-- Obsolete informational reference (is this intentional?): RFC 1163
(Obsoleted by RFC 1267)
-- Obsolete informational reference (is this intentional?): RFC 1771
(Obsoleted by RFC 4271)
-- Obsolete informational reference (is this intentional?): RFC 2362
(Obsoleted by RFC 4601, RFC 5059)
-- Obsolete informational reference (is this intentional?): RFC 4601
(Obsoleted by RFC 7761)
-- Obsolete informational reference (is this intentional?): RFC 4893
(Obsoleted by RFC 6793)
Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 8 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Network Working Group E. Davies
3 Internet-Draft Folly Consulting
4 Intended status: Historic A. Doria
5 Expires: August 20, 2009 LTU
6 February 16, 2009
8 Analysis of Inter-Domain Routing Requirements and History
9 draft-irtf-routing-history-10.txt
11 Status of this Memo
13 This Internet-Draft is submitted to IETF in full conformance with the
14 provisions of BCP 78 and BCP 79.
16 Internet-Drafts are working documents of the Internet Engineering
17 Task Force (IETF), its areas, and its working groups. Note that
18 other groups may also distribute working documents as Internet-
19 Drafts.
21 Internet-Drafts are draft documents valid for a maximum of six months
22 and may be updated, replaced, or obsoleted by other documents at any
23 time. It is inappropriate to use Internet-Drafts as reference
24 material or to cite them other than as "work in progress."
26 The list of current Internet-Drafts can be accessed at
27 http://www.ietf.org/ietf/1id-abstracts.txt.
29 The list of Internet-Draft Shadow Directories can be accessed at
30 http://www.ietf.org/shadow.html.
32 This Internet-Draft will expire on August 20, 2009.
34 Copyright Notice
36 Copyright (c) 2009 IETF Trust and the persons identified as the
37 document authors. All rights reserved.
39 This document is subject to BCP 78 and the IETF Trust's Legal
40 Provisions Relating to IETF Documents
41 (http://trustee.ietf.org/license-info) in effect on the date of
42 publication of this document. Please review these documents
43 carefully, as they describe your rights and restrictions with respect
44 to this document.
46 Abstract
48 This document analyses the state of the Internet domain-based routing
49 system, concentrating on Inter-Domain Routing (IDR) and also
50 considering the relationship between inter-domain and intra-domain
51 routing. The analysis is carried out with respect to RFC 1126 and
52 other IDR requirements and design efforts looking at the routing
53 system as it appeared to be in 2001 with editorial additions
54 reflecting developments up to 2006. It is the companion document to
55 "A Set of Possible Requirements for a Future Routing Architecture"
56 [I-D.irtf-routing-reqs], which is a discussion of requirements for
57 the future routing architecture, addressing systems developments and
58 future routing protocols. This document summarizes discussions held
59 several years ago by members of the IRTF Routing Research Group (IRTF
60 RRG) and other interested parties. The document is published with
61 the support of the IRTF RRG as a record of the work completed at that
62 time, but with the understanding that it does not necessarily
63 represent either the latest technical understanding or the technical
64 consensus of the research group at the date of publication.
65 [Note to RFC Editor: Please replace the reference in the abstract
66 with a non-reference quoting the RFC number of the companion
67 document when it is allocated, i.e., '(RFC xxxx)' and remove this
68 note.]
70 Table of Contents
72 1. Provenance of this Document . . . . . . . . . . . . . . . . . 4
73 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5
74 2.1. Background . . . . . . . . . . . . . . . . . . . . . . . . 7
75 3. Historical Perspective . . . . . . . . . . . . . . . . . . . . 7
76 3.1. The Legacy of RFC1126 . . . . . . . . . . . . . . . . . . 7
77 3.1.1. "General Requirements" . . . . . . . . . . . . . . . . 8
78 3.1.2. "Functional Requirements" . . . . . . . . . . . . . . 13
79 3.1.3. "Non-Goals" . . . . . . . . . . . . . . . . . . . . . 21
80 3.2. ISO OSI IDRP, BGP and the Development of Policy Routing . 24
81 3.3. Nimrod Requirements . . . . . . . . . . . . . . . . . . . 30
82 3.4. PNNI . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
83 4. Recent Research Work . . . . . . . . . . . . . . . . . . . . . 32
84 4.1. Developments in Internet Connectivity . . . . . . . . . . 32
85 4.2. DARPA NewArch Project . . . . . . . . . . . . . . . . . . 33
86 4.2.1. Defending the End-to-End Principle . . . . . . . . . . 34
87 5. Existing problems of BGP and the current
88 Inter-/Intra-Domain Architecture . . . . . . . . . . . . . . . 34
89 5.1. BGP and Auto-aggregation . . . . . . . . . . . . . . . . . 34
90 5.2. Convergence and Recovery Issues . . . . . . . . . . . . . 35
91 5.3. Non-locality of Effects of Instability and
92 Misconfiguration . . . . . . . . . . . . . . . . . . . . . 36
93 5.4. Multihoming Issues . . . . . . . . . . . . . . . . . . . . 36
94 5.5. AS-number exhaustion . . . . . . . . . . . . . . . . . . . 37
95 5.6. Partitioned ASs . . . . . . . . . . . . . . . . . . . . . 38
96 5.7. Load Sharing . . . . . . . . . . . . . . . . . . . . . . . 38
97 5.8. Hold down issues . . . . . . . . . . . . . . . . . . . . . 38
98 5.9. Interaction between Inter-Domain Routing and
99 Intra-Domain Routing . . . . . . . . . . . . . . . . . . . 39
100 5.10. Policy Issues . . . . . . . . . . . . . . . . . . . . . . 40
101 5.11. Security Issues . . . . . . . . . . . . . . . . . . . . . 41
102 5.12. Support of MPLS and VPNs . . . . . . . . . . . . . . . . . 41
103 5.13. IPv4 / IPv6 Ships in the Night . . . . . . . . . . . . . . 42
104 5.14. Existing Tools to Support Effective Deployment of
105 Inter-Domain Routing . . . . . . . . . . . . . . . . . . . 42
106 5.14.1. Routing Policy Specification Language RPSL (RFC
107 2622, 2650) and RIPE NCC Database (RIPE 157) . . . . . 43
108 6. Security Considerations . . . . . . . . . . . . . . . . . . . 44
109 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 44
110 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 44
111 9. Informative References . . . . . . . . . . . . . . . . . . . . 45
112 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 49
114 1. Provenance of this Document
116 In 2001, the IRTF Routing Research Group (IRTF RRG) chairs, Abha
117 Ahuja and Sean Doran, decided to establish a sub-group to look at
118 requirements for inter-domain routing (IDR). A group of well known
119 routing experts was assembled to develop requirements for a new
120 routing architecture. Their mandate was to approach the problem
121 starting from a blank sheet. This group was free to take any
122 approach, including a revolutionary approach, in developing
123 requirements for solving the problems they saw in inter-domain
124 routing. Their eventual approach documented requirements for a
125 complete future routing and addressing architecture rather than just
126 the requirements for IDR.
128 Simultaneously, an independent effort was started in Sweden with a
129 similar goal. A team, calling itself Babylon, with participation
130 from vendors, service providers, and academia, assembled to
131 understand the history of inter-domain routing, to research the
132 problems seen by the service providers, and to develop a proposal of
133 requirements for a follow-on to the current routing architecture.
134 This group's mandate required an evolutionary approach starting from
135 the current routing architecture and practice. In other words, the group
136 limited itself to developing an evolutionary strategy and
137 consequently assumed that the architecture would probably remain
138 domain-based. The Babylon group was later folded into the IRTF RRG
139 as Sub-group B to distinguish it from the original RRG Sub-group A.
141 This document, which was a part of Sub-group B's output, provides a
142 snapshot of the current state of Inter-Domain Routing (IDR) at the
143 time of original writing (2001) with some minor updates to take into
144 account developments since that date, bringing it up to date in 2006.
145 The development of the new requirements set is then motivated by an
146 analysis of the problems that IDR has been encountering in the recent
147 past. This document is intended as a counterpart to the Routing
148 Requirements document ("A Set of Possible Requirements for a Future
149 Routing Architecture") which documents the requirements for future
150 routing systems as captured separately by the IRTF RRG Sub-groups A
151 and B [I-D.irtf-routing-reqs].
153 The IRTF RRG supported publication of this document as a historical
154 record of the work completed on the understanding that it does not
155 necessarily represent either the latest technical understanding or
156 the technical consensus of the research group at the time of
157 publication. The document has had substantial review by members of
158 the Babylon team, members of the IRTF RRG and others over the years.
160 2. Introduction
162 For the greater part of its existence the Internet has used a domain-
163 oriented routing system whereby the routers and other nodes making up
164 the infrastructure are partitioned into a set of administrative
165 domains, primarily along ownership lines. Individual routing domains
166 (also known as Autonomous Systems (ASs)), which may be a subset of an
167 administrative domain, are made up of a finite, connected set of nodes
168 (at least in normal operation). Each routing domain is subject to a
169 coherent set of routing and other policies managed by a single
170 administrative authority. The domains are interlinked to form the
171 greater Internet producing a very large network: in practice, we have
172 to treat this network as if it were infinite in extent as there is no
173 central knowledge about the whole network of domains. An early
174 presentation of the concept of routing domains can be found in Paul
175 Francis' OSI routing architecture paper from 1987 [Tsuchiya87] (Paul
176 Francis was formerly known as Paul Tsuchiya).
178 The domain concept and domain-oriented routing have become so
179 fundamental to Internet routing thinking that it is generally taken
180 as an axiom these days and not even defined again (cf.,
181 [NewArch03]). The issues discussed in the present document
182 notwithstanding, it has proved to be a robust and successful
183 architectural concept that brings with it the possibility of using
184 different routing mechanisms and protocols within the domains (intra-
185 domain) and between the domains (inter-domain). This is an
186 attractive division, because intra-domain protocols can exploit the
187 well-known finite scope of the domain and the mutual trust engendered
188 by shared ownership to give a high degree of control to the domain
189 administrators, whereas inter-domain routing lives in an essentially
190 infinite region featuring a climate of distrust built on a multitude
191 of competitive commercial agreements and driven by less-than-fully
192 public policies from each component domain. Of course, like any
193 other assumption that has been around for a very long time, the
194 domain concept should be reevaluated to make sure that it is still
195 helping!
197 It is generally accepted that there are major shortcomings in the
198 inter-domain routing of the Internet today and that these may result
199 in severe routing problems within an unspecified period of time.
200 Remedying these shortcomings will require extensive research to tie
201 down the exact failure modes that lead to these shortcomings and
202 identify the best techniques to remedy the situation. By comparison,
203 intra-domain routing works satisfactorily, and issues with intra-
204 domain routing are mainly associated with the interface between
205 intra- and inter-domain routing.
207 Reviewer's Note: Even in 2001, there was a wide difference of
208 opinion across the community regarding the shortcomings of
209 interdomain routing. In the years between writing and
210 publication, further analysis, changes in operational practice,
211 alterations to the demands made on inter-domain routing,
212 modifications made to BGP and a recognition of the difficulty of
213 finding a replacement may have altered the views of some members
214 of the community.
216 Changes in the nature and quality of the services that users want
217 from the Internet are difficult to provide within the current
218 framework, as they impose requirements never foreseen by the original
219 architects of the Internet routing system.
221 The kind of radical changes that have to be accommodated are
222 epitomized by the advent of IPv6 and the application of IP mechanisms
223 to private commercial networks that offer specific service guarantees
224 beyond the best-effort services of the public Internet. Major
225 changes to the inter-domain routing system are inevitable to provide
226 an efficient underpinning for the radically changed and increasingly
227 commercially-based networks that rely on the IP protocol suite.
229 Current practice stresses the need to separate the concerns of the
230 control plane and the forwarding plane in a router: This document
231 will follow this practice, but we still use the term 'routing' as a
232 global portmanteau to cover all aspects of the system.
234 This document provides a historical perspective on the current state
235 of inter-domain routing and its relationship to intra-domain routing
236 in Section 3 by revisiting the previous IETF requirements document
237 intended to steer the development of a future routing system. These
238 requirements, which informed the design of the Border Gateway
239 Protocol (BGP) in 1989, are contained in RFC1126 - "Goals and
240 Functional Requirements for Inter-Autonomous System Routing"
241 [RFC1126].
243 Section 3 also looks at some other work on requirements for domain-
244 based routing that was carried out before and after RFC1126 was
245 published. This work fleshes out the historical perspective and
246 provides some additional insights into alternative approaches which
247 may be instructive when building a new set of requirements.
249 The motivation for change and the inspiration for some of the
250 requirements for new routing architectures derive from the problems
251 attributable to the current domain-based routing system that are
252 being experienced in the Internet today. These will be discussed in
253 Section 5.
255 2.1. Background
257 Today's Internet uses an addressing and routing structure that has
258 developed in an ad hoc, more or less upwards-compatible fashion. The
259 structure has progressed from supporting a non-commercial Internet
260 with a single administrative domain to a solution that is able to
261 control today's multi-domain, federated Internet, carrying traffic
262 between the networks of commercial, governmental and not-for-profit
263 participants. This is not achieved without a great deal of 24/7
264 vigilance and operational activity by network operators: Internet
265 routing often appears to be running close to the limits of stability.
266 As well as directing traffic to its intended end-point, inter-domain
267 routing mechanisms are expected to implement a host of domain-
268 specific routing policies for competing, communicating domains. The
269 result is not ideal, particularly as regards inter-domain routing
270 mechanisms, but it does a pretty fair job at its primary goal of
271 providing any-to-any connectivity to many millions of computers.
273 Based on a large body of anecdotal evidence, but also on a growing
274 body of experimental evidence [Labovitz02] and analytic work on the
275 stability of BGP under certain policy specifications [Griffin99], the
276 main Internet inter-domain routing protocol, BGP version 4 (BGP-4),
277 appears to have a number of problems. These problems are discussed
278 in more detail in Section 5. Additionally, the hierarchical nature
279 of the inter-domain routing problem appears to be changing as the
280 connectivity between domains becomes increasingly meshed [RFC3221]
281 which alters some of the scaling and structuring assumptions on which
282 BGP-4 is built. Patches and fix-ups may relieve some of these
283 problems but others may require a new architecture and new protocols.
285 3. Historical Perspective
287 3.1. The Legacy of RFC1126
289 RFC 1126 [RFC1126] outlined a set of requirements that were intended
290 to guide the development of BGP.
292 Editors' Note: When this document was reviewed by Yakov Rekhter,
293 one of the designers of BGP, his view was that "While some people
294 expected a set of requirements outlined in RFC1126 to guide the
295 development of BGP, in reality the development of BGP happened
296 completely independently of RFC1126. In other words, from the
297 point of view of the development of BGP, RFC1126 turned out to be
298 totally irrelevant." On the other hand, it appears that BGP as
299 currently implemented has met a large proportion of these
300 requirements, especially for unicast traffic.
302 While the network is demonstrably different from what it was in 1989,
303 having
304 o moved from single to multiple administrative control,
305 o increased in size by several orders of magnitude, and
306 o migrated from a fairly tree like connectivity graph to a meshier
307 style,
308 many of the same requirements remain. As a first step in setting
309 requirements for the future, we need to understand the requirements
310 that were originally set for the current protocols. And in charting
311 a future architecture we must first be sure to do no harm. This
312 means a future domain-based routing system has to support, as its base
313 requirement, the level of function that is available today.
315 The following sections each relate to a requirement, or non-
316 requirement listed in RFC1126. In fact the section names are direct
317 quotes from the document. The discussion of these requirements
318 covers the following areas:
320 Explanation: Optional interpretation for today's audience of
321 the original intent of the requirement
323 Relevance: Is the requirement of RFC1126 still relevant, and
324 to what degree? Should it be understood
325 differently in today's environment?
327 Current practice: How well is the requirement met by current
328 protocols and practice?
330 3.1.1. "General Requirements"
332 3.1.1.1. "Route to Destination"
334 Timely routing to all reachable destinations, including multihoming
335 and multicast.
337 Relevance: Valid, but requirements for multihoming need
338 further discussion and elucidation. The
339 requirement should include multiple source
340 multicast routing.
342 Current practice: Multihoming is not efficient and the proposed
343 inter-domain multicast protocol BGMP [RFC3913] is
344 an add-on to BGP following many of the same
345 strategies but not integrated into the BGP
346 framework.
348 Editors' Note: Multicast routing has moved on
349 again since this was originally written. By
350 2006 BGMP had been effectively superseded.
351 Multicast routing now uses Multiprotocol BGP
352 [RFC4760], the Multicast Source Discovery
353 Protocol (MSDP) [RFC3618] and Protocol
354 Independent Multicast - Sparse Mode (PIM-SM)
355 [RFC2362], [RFC4601], especially the Source
356 Specific Multicast (SSM) subset.
358 3.1.1.2. "Routing is Assured"
360 This requires that a user be notified, within a reasonable time after
361 persistent attempts, of the inability to provide a service.
363 Relevance: Valid
365 Current practice: There are ICMP messages for this, but in many
366 cases they are not used, either because of fears
367 about creating message storms or uncertainty about
368 whether the end system can do anything useful with
369 the resulting information. IPv6 implementations
370 may be able to make better use of the information
371 as they may have alternative addresses that could
372 be used to select an alternative route.
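The ICMP messages referred to here are the Destination Unreachable messages of RFC 792. As an illustration of what an end system receives, the sketch below decodes only the fixed ICMPv4 header; the function and table names are our own invention, and checksum validation is deliberately omitted.

```python
import struct

# ICMP type 3 (Destination Unreachable) codes, per RFC 792.
UNREACHABLE_CODES = {
    0: "net unreachable",
    1: "host unreachable",
    2: "protocol unreachable",
    3: "port unreachable",
}

def parse_icmp_unreachable(packet: bytes):
    """Return a textual reason if `packet` starts with an ICMPv4
    Destination Unreachable header, else None."""
    if len(packet) < 8:
        return None
    icmp_type, icmp_code, _checksum = struct.unpack("!BBH", packet[:4])
    if icmp_type != 3:                  # 3 = Destination Unreachable
        return None
    return UNREACHABLE_CODES.get(icmp_code, "code %d" % icmp_code)

# Hand-built header: type 3, code 1, zero checksum, 4 unused bytes.
example = struct.pack("!BBHI", 3, 1, 0, 0)
print(parse_icmp_unreachable(example))  # prints: host unreachable
```

Whether the receiving host can act on such a message - for example by switching to an alternative source address, as suggested above for IPv6 - is exactly the open question the text raises.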
374 3.1.1.3. "Large System"
376 The architecture was designed to accommodate the growth of the
377 Internet.
379 Relevance: Valid. Properties of Internet topology might be
380 an issue for future scalability (topology varies
381 from very sparse to quite dense at present).
382 Instead of setting out to accommodate growth in a
383 specific time period, indefinite growth should be
384 accommodated. On the other hand, such growth has
385 to be accommodated without making the protocols
386 too expensive - trade-offs may be necessary.
388 Current practice: Scalability of the current protocols will not be
389 sufficient under the current rate of growth.
390 There are problems with BGP convergence for large
391 dense topologies, problems with the slow speed of
392 routing information propagation between routers in
393 transit domains through the intra-domain protocol
394 for example when a failure requires traffic to be
395 redirected to an alternative exit point from the
396 domain (see Section 5.9), limited support for
397 hierarchy, etc.
399 3.1.1.4. "Autonomous Operation"
401 This requirement encapsulates the need for administrative domains
402 ("Autonomous Systems" - AS) to be able to operate autonomously as
403 regards setting routing policy:
405 Relevance: Valid. There may need to be additional
406 requirements for adjusting policy decisions to the
407 global functionality and for avoiding
408 contradictory policies. This would decrease the
409 possibility of unstable routing behavior.
411 There is a need for handling various degrees of
412 trust in autonomous operations, ranging from no
413 trust (e.g., between separate ISPs) to very high
414 trust where the domains have a common goal of
415 optimizing their mutual policies.
417 Policies for intra-domain operations should in
418 some cases be revealed, using suitable
419 abstractions.
421 Current practice: Policy management is in the control of network
422 managers, as required, but there is little support
423 for handling policies at an abstract level for a
424 domain.
426 Cooperating administrative entities decide about
427 the extent of cooperation independently. This can
428 lead to inconsistent, and potentially incompatible
429 routing policies being applied in notionally
430 cooperating domains. As discussed in
431 Section 5.2, Section 5.3, and Section 5.10,
432 lack of coordination combined with the
433 global range of effects of BGP policies results in
434 occasional disruption of Internet routing over an
435 area far wider than the domains that are not
436 cooperating effectively.
438 3.1.1.5. "Distributed System"
440 The routing environment is a distributed system. The distributed
441 routing environment supports redundancy and diversity of nodes and
442 links. Both the controlling rule sets, which implement the routing
443 policies, and the places where operational control is applied,
444 through decisions on path selection, are distributed (primarily in
445 the routers).
447 Relevance: Valid. RFC1126 is very clear that we should not
448 be using centralized solutions, but maybe we need
449 a discussion on trade-offs between common
450 knowledge and distribution (i.e., to allow for
451 uniform policy routing, e.g., GSM systems are in a
452 sense centralized, but with hierarchies).
454 Current practice: Routing is very distributed, but lacking the
455 ability to consider optimization over several hops
456 or domains.
457 Editors' Note: Also coordinating the
458 implementation of a set of routing policies
459 across a large domain with many routers running
460 BGP is difficult. The policies have to be
461 turned into BGP rules and applied individually
462 to each router, giving opportunities for
463 mismatch and error.
465 3.1.1.6. "Provide A Credible Environment"
467 The routing environment and services should be based upon mechanisms
468 and information that exhibit both integrity and security. That is
469 the routers should always be working with credible data derived
470 through the reliable operation of protocols. Security from unwanted
471 modification and influence is required.
473 Relevance: Valid.
475 Current practice: BGP provides a limited mechanism for
476 authentication and security of peering sessions,
477 but this does not guarantee the authenticity or
478 validity of the routing information that is
479 exchanged.
481 There are certainly security problems with current
482 practice. The Routing Protocol Security
483 Requirements (rpsec) working group has been
484 struggling to agree on a set of requirements for
485 BGP security since early 2002.
487 Editors' note: Proposals for authenticating BGP
488 routing information using certificates were
489 under development by the Secure Inter-Domain
490 Routing (sidr) working group from 2006 through
491 2008.
493 3.1.1.7. "Be A Managed Entity"
495 Requires that the routing system provides adequate information on the
496 state of the network to allow resource, problem and fault management
497 to be carried out effectively and expeditiously. The system must
498 also provide controls that allow managers to use this information to
499 make informed decisions and use it to control the operation of the
500 routing system.
502 Relevance: The requirement is reasonable, but we might need
503 to be more specific on what information should be
504 available, e.g., to prevent routing oscillations.
506 Current practice: All policies are determined locally, where they
507 may appear reasonable, but there is limited global
508 coordination through the routing policy databases
509 operated by the Internet registries (AfriNIC,
510 APNIC, ARIN, LACNIC, RIPE, etc.).
512 Operators are not required to register their
513 policies; even when policies are registered, it is
514 difficult to check that the actual policies in use
515 in other domains match the declared policies.
516 Therefore, a manager cannot guarantee to design
517 and implement policies that will interoperate with
518 those of other domains to provide stable routing.
519 Editors' note: Operators report that management
520 of BGP-based routing remains a function that
521 needs highly-skilled operators and continual
522 attention.
524 3.1.1.8. "Minimize Required Resources"
526 Relevance: Valid, however, the paragraph states that
527 assumptions on significant upgrades shouldn't be
528 made. Although this is reasonable, a new
529 architecture should perhaps be prepared to use
530 upgrades when they occur.
532 Current practice: Most bandwidth is consumed by the exchange of the
533 Network Layer Reachability Information (NLRI).
534 Consumption of processing cycles (CPU usage)
535 depends on the stability of the
536 Internet. Both phenomena are local in nature, so
537 there are no scaling problems with bandwidth and
538 CPU usage. Instability of routing increases the
539 consumption of resources in any case. The number
540 of networks in the Internet dominates memory
541 requirements - this is a scaling problem.
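The memory-scaling point can be made concrete with a back-of-envelope estimate. All of the figures below (prefix count, paths per prefix, bytes per stored path) are illustrative assumptions of ours, not measurements from this document:

```python
def rib_memory_bytes(prefixes: int, paths_per_prefix: float,
                     bytes_per_path: int) -> int:
    """Rough RIB memory estimate: one stored entry per (prefix, path).
    Memory grows linearly with the number of networks announced,
    which is the scaling problem the text identifies."""
    return int(prefixes * paths_per_prefix * bytes_per_path)

# Illustrative early-2000s numbers: ~100k prefixes, ~4 paths each,
# ~100 bytes of path attributes per stored entry.
estimate = rib_memory_bytes(100_000, 4, 100)
print(f"{estimate / 2**20:.0f} MiB")   # prints: 38 MiB
```

Doubling the number of announced networks doubles the estimate, whereas bandwidth and CPU costs remain dominated by local churn, which is why only memory is singled out here.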
543 3.1.2. "Functional Requirements"
545 3.1.2.1. "Route Synthesis Requirements"
547 3.1.2.1.1. "Route around failures dynamically"
549 Relevance: Valid. Should perhaps be stronger. Only
550 providing a best-effort attempt may not be enough
551 if real-time services are to be provided for.
552 Detection of failures may need to be faster than
553 100ms to avoid being noticed by end-users.
555 Current practice: Latency of fail-over is too high; sometimes
556 minutes or longer.
558 3.1.2.1.2. "Provide loop free paths"
560 Relevance: Valid. Loops should occur only with negligible
561 probability and duration.
563 Current practice: Both link-state intra-domain routing and BGP
564 inter-domain routing (if correctly configured) are
565 forwarding-loop free after having converged.
566 However, convergence time for BGP can be very long
567 and poorly designed routing policies may result in
568 a number of BGP speakers engaging in a cyclic
569 pattern of advertisements and withdrawals which
570 never converges to a stable result [RFC3345].
571 Part of the reason for long convergence times is
572 the non-locality of the effects of changes in BGP
573 advertisements (see Section 5.3). Modifying the
574 inter-domain routing protocol to make the effects
575 of changes less global, and convergence a more
576 local condition might improve performance,
577 assuming a suitable modification could be
578 developed.
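For the inter-domain case, loop freedom rests on a simple path-vector rule rather than on convergence alone: a BGP speaker rejects any route whose AS_PATH already contains its own AS number (RFC 4271). A minimal sketch of that rule, with function names of our own choosing:

```python
def accept_route(local_as: int, as_path: list) -> bool:
    """Path-vector loop prevention: a BGP speaker discards any
    advertisement whose AS_PATH already contains its own AS number,
    since accepting it would create a forwarding loop."""
    return local_as not in as_path

def advertise(local_as: int, as_path: list) -> list:
    """When re-advertising an accepted route, the speaker prepends
    its own AS number to the AS_PATH."""
    return [local_as] + as_path

# AS 64500 receives a route that has traversed AS 64501 and AS 64502:
path = [64501, 64502]
print(accept_route(64500, path))       # True: no loop, accept
path = advertise(64500, path)          # now [64500, 64501, 64502]
print(accept_route(64500, path))       # False: own AS present, reject
```

Note that this rule only prevents loops through a single AS; it does nothing to bound convergence time or to rule out the policy-induced oscillations of [RFC3345] discussed above.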
580 3.1.2.1.3. "Know when a path or destination is unavailable"
582 Relevance: Valid to some extent, but there is a trade-off
583 between aggregation and immediate knowledge of
584 reachability. It requires that routing tables
585 contain enough information to determine that the
586 destination is unknown or a path cannot be
587 constructed to reach it.
589 Current practice: Knowledge about lost reachability propagates
590 slowly through the networks due to slow
591 convergence for route withdrawals.
593 3.1.2.1.4. "Provide paths sensitive to administrative policies"
595 Relevance: Valid. Policy control of routing has become
596 increasingly important as the Internet has turned
597 into a business.
599 Current practice: Supported to some extent. Policies can only be
600 applied locally in an AS and not globally. Policy
601 information supplied has a very small probability
602 of affecting policies in other ASs. Furthermore,
603 only static policies are supported; between static
604 policies and policies dependent upon rapidly
605 changing (volatile) events, there exist events of
606 which routing should be aware. Lastly, there is
607 no support for policies other than route-
608 properties (such as AS-origin, AS-path,
609 destination prefix, MED-values etc).
611 Editors' note: Subsequent to the original issue
612 of this document mechanisms which acknowledge
613 the business relationships of operators have
614 been developed such as the NOPEER community
615 attribute [RFC3765]. However, the level of
616 usage of this attribute is apparently not very
617 great.
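As an illustration of the purely local, route-property-based policies described above, the following sketch shows an export decision driven only by a route's own attributes, including the NOPEER community of [RFC3765]. This is hypothetical code, not any real BGP implementation; the AS numbers and data layout are invented for the example.

```python
# Hypothetical sketch of the kind of local, static policy BGP supports:
# routes are accepted or rejected purely on their own attributes
# (prefix, AS path, communities), with no visibility into the policies
# of other ASs.

NOPEER = "65535:65284"  # well-known NOPEER community value from RFC 3765

def export_route(route, neighbour_is_peer):
    """Decide whether to advertise `route` to a neighbour (illustrative only)."""
    # Do not send routes marked NOPEER to bilateral peers.
    if neighbour_is_peer and NOPEER in route["communities"]:
        return False
    # Reject routes whose AS path traverses an AS we refuse to use for transit
    # (64512 is a private AS number, used here purely as an example).
    if 64512 in route["as_path"]:
        return False
    return True

route = {"prefix": "192.0.2.0/24",
         "as_path": [64496, 64497],
         "communities": [NOPEER]}
print(export_route(route, neighbour_is_peer=True))   # False: withheld from peers
print(export_route(route, neighbour_is_peer=False))  # True: sent to customers
```

Note that the decision is entirely local: the neighbour receiving (or not receiving) the route learns nothing about why the policy fired, which is exactly the opacity the text describes.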
619 3.1.2.1.5. "Provide paths sensitive to user policies"
621 Relevance: Valid to some extent, as they may conflict with
622 the policies of the network administrator. It is
623 likely that this requirement will be met by means
624 of different bit transport services offered by an
625 operator, but at the cost of adequate
626 provisioning, authentication and policing when
627 utilizing the service.
629 Current practice: Not supported in normal routing. Can be
630 accomplished to some extent with loose source
631 routing, resulting in inefficient forwarding in
632 the routers. The various attempts to introduce
633 Quality of Service (QoS - e.g., Integrated
634 Services and Differentiated Services (DiffServ))
635 can also be seen as means to support this
636 requirement but they have met with limited success
637 in terms of providing alternate routes as opposed
638 to providing improved service on the standard
639 route.
640 Editor's Note: From the standpoint of a later
641 time, it would probably be more appropriate to
642 say "total failure" rather than "limited
643 success".
645 3.1.2.1.6. "Provide paths which characterize user quality-of-service
646 requirements"
648 Relevance: Valid to some extent, as they may conflict with
649 the policies of the operator. It is likely that
650 this requirement will be met by means of different
651 bit transport services offered by an operator, but
652 at the cost of adequate provisioning,
653 authentication and policing when utilizing the
654 service. It has become clear that offering to
655 provide a particular QoS to any arbitrary
656 destination from a particular source is generally
657 impossible: QoS, except in very 'soft' forms such
658 as overall long-term average packet delay, is
659 generally associated with connection-oriented
660 routing.
662 Current practice: Creating routes with specified QoS is not
663 generally possible at present.
665 3.1.2.1.7. "Provide autonomy between inter- and intra-autonomous system
666 route synthesis"
668 Relevance: Inter- and intra-domain routing should stay
669 independent, but one should notice that this to
670 some extent contradicts the previous three
671 requirements. There is a trade-off between
672 abstraction and optimality.
674 Current practice: Inter-domain routing is performed independently of
675 intra-domain routing. Intra-domain routing is,
676 however, especially in transit domains, closely
677 interrelated with inter-domain routing.
679 3.1.2.2. "Forwarding Requirements"
681 3.1.2.2.1. "Decouple inter- and intra-autonomous system forwarding
682 decisions"
684 Relevance: Valid.
686 Current practice: As explained in Section 3.1.2.1.7, intra-domain
687 forwarding in transit domains is dependent on
688 inter-domain forwarding decisions.
690 3.1.2.2.2. "Do not forward datagrams deemed administratively
691 inappropriate"
693 Relevance: Valid, and increasingly important in the context
694 of enforcing policies correctly expressed through
695 routing advertisements but flouted by rogue peers
696 which send traffic for which a route has not been
697 advertised. On the other hand, packets that have
698 been misrouted due to transient routing problems
699 perhaps should be forwarded to reach the
700 destination, although along an unexpected path.
702 Current practice: At stub domains (i.e., domains that do not provide
703 any transit service for any other domains but that
704 connect directly to one or more transit domains)
705 there is packet filtering, e.g., to catch source
706 address spoofing on outgoing traffic or to filter
707 out unwanted incoming traffic. Filtering can in
708 particular reject traffic (such as unauthorized
709 transit traffic) that has been sent to a domain
710 even when it has not advertised a route for such
711 traffic on a given interface. The growing class
712 of 'middle boxes' (midboxes, e.g., Network Address
713 Translators - NATs) is quite likely to apply
714 administrative rules that will prevent forwarding
715 of packets. Note that security policies may
716 deliberately hide administrative denials. In the
717 backbone, intentional packet dropping based on
718 policies is not common.
720 3.1.2.2.3. "Do not forward datagrams to failed resources"
722 Relevance: Unclear, although it is clearly desirable to
723 minimise waste of forwarding resources by
724 discarding datagrams which cannot be delivered at
725 the earliest opportunity. There is a trade-off
726 between scalability and keeping track of
727 unreachable resources. The requirement
728 effectively imposes a requirement on adjacent
729 nodes to monitor for failures and take steps to
730 cause rerouting at the earliest opportunity if a
731 failure is detected. However, packets that are
732 already in flight or queued on a failed link
733 cannot generally be rescued.
735 Current practice: Routing protocols use both internal adjacency
736 management sub-protocols (e.g. Hello protocols)
737 and information from equipment and lower layer
738 link watchdogs to keep track of failures in
739 routers and connecting links. Failures will
740 eventually result in the routing protocol
741 reconfiguring the routing to avoid (if possible) a
742 failed resource, but this is generally very slow
743 (30s or more). In the meantime datagrams may well
744 be forwarded to failed resources. In general
745 terms, end hosts and some non-router middle boxes
746 do not participate in these notifications and
747 failures of such boxes will not affect the routing
748 system.
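The Hello/hold-timer mechanism the text describes can be sketched as follows. The timer values are illustrative only and do not correspond to any particular protocol; the point is that a neighbour is declared failed only after the hold time expires, which is why reaction to failures is slow relative to packet forwarding.

```python
# Illustrative sketch (not any specific routing protocol) of adjacency
# management via periodic Hello packets and a hold timer.

HELLO_INTERVAL = 10.0   # seconds between Hello packets (example value)
HOLD_TIME = 30.0        # neighbour declared down after this much silence

def neighbour_state(seconds_since_last_hello):
    """Return the adjacency state given the time since the last Hello heard."""
    return "up" if seconds_since_last_hello < HOLD_TIME else "down"

print(neighbour_state(5.0))    # "up": recent Hello, adjacency maintained
print(neighbour_state(35.0))   # "down": hold time expired, rerouting can begin
```

During the window between an actual failure and hold-timer expiry, datagrams continue to be forwarded toward the failed resource, which is the behaviour the "Current practice" paragraph above notes.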
750 3.1.2.2.4. "Forward datagram according to its characteristics"
752 Relevance: Valid. This is necessary in enabling
753 differentiation in the network, based on QoS,
754 precedence, policy or security.
756 Current practice: Ingress and egress filtering can be done based on
757 policy. Some networks discriminate on the basis
758 of requested QoS.
760 3.1.2.3. "Information Requirements"
762 3.1.2.3.1. "Provide a distributed and descriptive information base"
764 Relevance: Valid, however an alternative arrangement of
765 information bases, possibly with an element of
766 centralization for the domain (as mentioned in
767 Section 3.1.1.5) might offer some advantages
768 through the ability to optimize across the domain
769 and respond more quickly to changes and failures.
771 Current practice: The information base is distributed, but it is
772 unclear whether it supports all necessary routing
773 functionality.
775 3.1.2.3.2. "Determine resource availability"
776 Relevance: Valid. It should be possible to determine the
777 availability and levels of availability of any
778 resource (such as bandwidth) needed to carry out
779 routing. This avoids having to discover
780 unavailability through failure. Resource location
781 and discovery is arguably a separate concern that
782 could be addressed outside the core routing
783 requirements.
785 Current practice: Resource availability is predominantly handled
786 outside of the routing system.
788 3.1.2.3.3. "Restrain transmission utilization"
790 Relevance: Valid. However certain requirements in the
791 control plane, such as fast detection of faults
792 may be worth consumption of more resources.
793 Similarly, simplicity of implementation may make
794 it cheaper to 'back haul' traffic to central
795 locations to minimise the cost of routing if
796 bandwidth is cheaper than processing.
798 Current practice: BGP messages probably do not ordinarily consume
799 excessive resources, but might during erroneous
800 conditions. In the data plane, the near universal
801 adoption of shortest path protocols could be
802 considered to result in minimization of
803 transmission utilization.
805 3.1.2.3.4. "Allow limited information exchange"
807 Relevance: Valid. But perhaps routing could be improved if
808 certain information (especially policies) could be
809 available either globally or at least for a wider
810 defined locality.
811 Editors' note: Limited information exchange
812 would be potentially compatible with a more
813 local form of convergence than BGP tries to
814 achieve today. Limited information exchange is
815 potentially incompatible with global
816 convergence.
817 Current practice: Policies are used to determine which reachability
818 information is exported but neighbors receiving
819 the information are not generally aware of the
820 policies that resulted in this export.
822 3.1.2.4. "Environmental Requirements"
824 3.1.2.4.1. "Support a packet-switching environment"
826 Relevance: Valid, but the routing system should, perhaps,
827 not be limited to this exclusively.
829 Current practice: Supported.
831 3.1.2.4.2. "Accommodate a connection-less oriented user transport
832 service"
834 Relevance: Valid, but the routing system should, perhaps,
835 not be limited to this exclusively.
837 Current practice: Accommodated.
839 3.1.2.4.3. "Accommodate 10K autonomous systems and 100K networks"
841 Relevance: No longer valid. Needs to be increased
842 potentially indefinitely. It is extremely
843 difficult to foresee the future size expansion of
844 the Internet so that the Utopian solution would be
845 to achieve an Internet whose architecture is scale
846 invariant. Regrettably, this may not be
847 achievable without introducing undesirable
848 complexity and a suitable trade-off between
849 complexity and scalability is likely to be
850 necessary.
852 Current Practice: Supported but perhaps reaching its limit. Since
853 the original version of this document was written
854 in 2001, the number of ASs advertised has grown
855 from around 8000 to 20000, and almost 35000 AS
856 numbers have been allocated by the regional
857 registries [Huston05]. If this growth continues
858 the original 16 bit AS space in BGP-4 will be
859 exhausted in less than 5 years. Planning for an
860 extended AS space is now an urgent requirement.
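The exhaustion argument above can be checked with simple arithmetic. The allocation rate below is an assumption chosen for illustration (the text cites roughly 35000 allocations from [Huston05] but gives no rate); under that assumption the remaining 16-bit pool indeed lasts less than 5 years.

```python
# Back-of-envelope check of 16-bit AS number exhaustion as described in
# the text. The yearly allocation rate is an assumed, illustrative figure.

as16_space = 2 ** 16            # 65536 numbers in the original 16-bit AS space
allocated = 35000               # approximate allocations cited from [Huston05]
yearly_rate = 6500              # assumed allocation rate per year (illustrative)

years_left = (as16_space - allocated) / yearly_rate
print(round(years_left, 1))     # about 4.7 years at this assumed rate

as32_space = 2 ** 32            # the extended 4-byte AS number space
print(as32_space // as16_space) # 65536 times larger than the 16-bit space
```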
862 3.1.2.4.4. "Allow for arbitrary interconnection of autonomous systems"
864 Relevance: Valid. However perhaps not all interconnections
865 should be accessible globally.
867 Current practice: BGP-4 allows for arbitrary interconnections.
869 3.1.2.5. "General Objectives"
871 3.1.2.5.1. "Provide routing services in a timely manner"
873 Relevance: Valid, as stated before. It might be acceptable
874 for a more complex service to take longer to
875 deliver, but it still has to meet the
876 application's requirements - routing has to be at
877 the service of the end-to-end principle.
878 Editors' note: Delays in setting up connections
879 due to network functions such as NAT boxes are
880 becoming increasingly problematic. The routing
881 system should try to keep any routing delay to
882 a minimum.
884 Current practice: More or less, with the exception of convergence
885 and fault robustness.
887 3.1.2.5.2. "Minimize constraints on systems with limited resources"
889 Relevance: Valid.
891 Current practice: Systems with limited resources are typically stub
892 domains that advertise very little information.
894 3.1.2.5.3. "Minimize impact of dissimilarities between autonomous
895 systems"
897 Relevance: Important. This requirement is critical to a
898 future architecture. In a domain-based routing
899 environment where the internal properties of
900 domains may differ radically, it will be important
901 to be sure that these dissimilarities are
902 minimized at the borders.
903 Current practice: For the most part this capability is not really
904 required in today's networks since the intra-
905 domain attributes are broadly similar across
906 domains.
908 3.1.2.5.4. "Accommodate the addressing schemes and protocol mechanisms
909 of the autonomous systems"
911 Relevance: Important, probably more so than when RFC1126 was
912 originally developed because of the potential
913 deployment of IPv6, wider usage of MPLS and the
914 increasing usage of VPNs.
916 Current practice: Only one global addressing scheme is supported in
917 most autonomous systems but the availability of
918 IPv6 services is steadily increasing. Some global
919 backbones support IPv6 routing and forwarding.
921 3.1.2.5.5. "Must be implementable by network vendors"
923 Relevance: Valid, but note that what can be implemented today
924 is different from what was possible when RFC1126
925 was written: a future domain-based routing
926 architecture should not be unreasonably
927 constrained by past limitations.
929 Current practice: BGP was implemented and meets a large proportion
930 of the original requirements.
932 3.1.3. "Non-Goals"
934 RFC1126 also included a section discussing non-goals. This section
935 discusses the extent to which these are still non-goals. It also
936 considers whether the fact that they were non-goals adversely affects
937 today's IDR system.
939 3.1.3.1. "Ubiquity"
941 The authors of RFC 1126 were explicitly saying that IP and its inter-
942 domain routing system need not be deployed in every AS, and a
943 participant should not necessarily expect to be able to reach a given
944 AS, possibly because of routing policies. In a sense this 'non-goal'
945 has effectively been achieved by the Internet and IP protocols. This
946 requirement reflects a different world view where there was serious
947 competition for network protocols, which is really no longer the
948 case. Ubiquitous deployment of inter-domain routing in particular
949 has been achieved and must not be undone by any proposed future
950 domain-based routing architecture. On the other hand:
951 o ubiquitous connectivity cannot be reached in a policy sensitive
952 environment and should not be an aim,
953 * Editor's Note: It has been pointed out that this statement
954 could be interpreted as being contrary to the Internet mission
955 of providing universal connectivity. The fact that limits to
956 connectivity will be added as operational requirements in a
957 policy sensitive environment should not imply that a future
958 domain-based routing architecture contains intrinsic limits on
959 connectivity.
960 o it must not be required that the same routing mechanisms are used
961 throughout provided that they can interoperate appropriately
962 o the information needed to control routing in a part of the network
963 should not necessarily be ubiquitously available and it must be
964 possible for an operator to hide commercially sensitive
965 information that is not needed outside a domain.
966 o the introduction of IPv6 reintroduces an element of diversity into
967 the world of network protocols but the similarities of IPv4 and
968 IPv6 as regards routing and forwarding make this event less likely
969 to drive an immediate diversification in routing systems. The
970 potential for further growth in the size of the network enabled by
971 IPv6 is very likely to require changes in the future: whether this
972 results in the replacement of one de facto ubiquitous system with
973 another remains to be seen but cannot be a requirement - it will
974 have to interoperate with BGP during the transition.
976 Relevance: De facto essential for a future domain-based
977 routing architecture, but what is required is
978 ubiquity of the routing system rather than
979 ubiquity of connectivity and it must be capable of
980 a gradual takeover through interoperation with the
981 existing system.
983 Current practice: De facto ubiquity achieved.
985 3.1.3.2. "Congestion control"
987 Relevance: It is not clear if this non-goal was to be applied
988 to routing or forwarding. It is definitely a non-
989 goal to adapt the choice of route when there is
990 transient congestion. However, to add support for
991 congestion avoidance (e.g., Explicit Congestion
992 Notification (ECN) and ICMP messages) in the
993 forwarding process would be a useful addition.
994 There is also extensive work going on in traffic
995 engineering which should result in congestion
996 avoidance through routing as well as in
997 forwarding.
999 Current practice: Some ICMP messages (e.g., source quench) exist to
1000 deal with congestion control but these are not
1001 generally used as they either make the problem
1002 worse or there is no mechanism to reflect the
1003 message into the application which is providing
1004 the source.
1006 3.1.3.3. "Load splitting"
1008 Relevance: This should neither be a non-goal, nor an explicit
1009 goal. It might be desirable in some cases and
1010 should be considered as an optional architectural
1011 feature.
1013 Current practice: Can be implemented by exporting different prefixes
1014 on different links, but this requires manual
1015 configuration and does not consider actual load.
1017 Editors' Note: This configuration is carried
1018 out extensively as of 2006 and has been a
1019 significant factor in routing table bloat. If
1020 this need is a real operational requirement, as
1021 it seems to be for multihomed or otherwise
1022 richly connected sites, it will be necessary to
1023 reclassify this as a real and important goal.
1025 3.1.3.4. "Maximizing the utilization of resources"
1027 Relevance: Valid. Cost-efficiency should be striven for; we
1028 note that maximizing resource utilization does not
1029 always lead to greatest cost-efficiency.
1031 Current practice: Not currently part of the system, though often a
1032 'hacked in' feature done with manual
1033 configuration.
1035 3.1.3.5. "Schedule to deadline service"
1037 This non-goal was put in place to ensure that the IDR did not have to
1038 meet real time deadline goals such as might apply to Constant Bit
1039 Rate (CBR) real time services in ATM.
1041 Relevance: The hard form of deadline services is still a non-
1042 goal for the future domain-based routing
1043 architecture but overall delay bounds are much
1044 more of the essence than was the case when RFC1126
1045 was written.
1047 Current practice: Service providers are now offering overall
1048 probabilistic delay bounds on traffic contracts.
1049 To implement these contracts there is a
1050 requirement for a rather looser form of delay
1051 sensitive routing.
1053 3.1.3.6. "Non-interference policies of resource utilization"
1055 The requirement in RFC1126 is somewhat opaque, but appears to imply
1056 that what we would today call QoS routing is a non-goal and that
1057 routing would not seek to control the elastic characteristics of
1058 Internet traffic whereby a TCP connection can seek to utilize all the
1059 spare bandwidth on a route, possibly to the detriment of other
1060 connections sharing the route or crossing it.
1062 Relevance: Open Issue. It is not clear whether dynamic QoS
1063 routing can or should be implemented. Such a
1064 system would seek to control the admission and
1065 routing of traffic depending on current or recent
1066 resource utilization. This would be particularly
1067 problematic where traffic crosses an ownership
1068 boundary because of the need for potentially
1069 commercially sensitive information to be made
1070 available outside the ownership boundary.
1072 Current practice: Routing does not consider dynamic resource
1073 availability. Forwarding can support service
1074 differentiation.
1076 3.2. ISO OSI IDRP, BGP and the Development of Policy Routing
1078 During the decade before the widespread success of the World Wide
1079 Web, ISO was developing the communications architecture and protocol
1080 suite Open Systems Interconnection (OSI). For a considerable part of
1081 this time OSI was seen as a possible competitor for and even a
1082 replacement for the IP suite as the basis for the Internet. The
1083 technical developments of the two protocols were quite heavily
1084 interrelated with each providing ideas and even components that were
1085 adapted into the other suite.
1087 During the early stages of the development of OSI, the IP suite was
1088 still mainly in use on the ARPANET and the relatively small scale
1089 first phase NSFnet. This was effectively a single administrative
1090 domain with a simple tree structured network in a three level
1091 hierarchy connected to a single logical exchange point (the NSFnet
1092 backbone). In the second half of the 1980s the NSFNET was starting
1093 on the growth and transformation that would lead to today's Internet.
1094 It was becoming clear that the backbone routing protocol, the
1095 Exterior Gateway Protocol (EGP) [RFC0904], was not going to cope even
1096 with the limited expansion being planned. EGP is an "all informed"
1097 protocol which needed to know the identities of all gateways and this
1098 was no longer reasonable. With the increasing complexity of the
1099 NSFnet and the linkage of the NSFnet network to other networks there
1100 was a desire for policy-based routing which would allow
1101 administrators to manage the flow of packets between networks. The
1102 first version of the Border Gateway Protocol (BGP-1) [RFC1105] was
1103 developed as a replacement for EGP with policy capabilities - a
1104 stopgap EGP version 3 had been created as an interim measure while
1105 BGP was developed. BGP was designed to work on a hierarchically
1106 structured network, such as the original NSFNET, but could also work
1107 on networks that were at least partially non-hierarchical where there
1108 were links between ASs at the same level in the hierarchy (we would
1109 now call these 'peering arrangements') although the protocol made a
1110 distinction between different kinds of links (links are classified as
1111 upwards, downwards or sideways). ASs themselves were a 'fix' for the
1112 complexity that developed in the three tier structure of the NSFnet.
1114 Meanwhile the OSI architects, led by Lyman Chapin, were developing a
1115 much more general architecture for large scale networks. They had
1116 recognized that no one node, especially an end-system (host), could
1117 or should attempt to remember routes from "here" to "anywhere" - this
1118 sounds obvious today but was not so obvious 20 years ago. They were
1119 also considering hierarchical networks with independently
1120 administered domains - a model already well entrenched in the public
1121 switched telephone network. This led to a vision of a network with
1122 multiple independent administrative domains with an arbitrary
1123 interconnection graph and a hierarchy of routing functionality. This
1124 architecture was fairly well established by 1987 [Tsuchiya87]. The
1125 architecture initially envisaged a three level routing functionality
1126 hierarchy in which each layer had significantly different
1127 characteristics:
1129 1. *End-system to Intermediate system routing (host to router)*, in
1130 which the principal functions are discovery and redirection.
1132 2. *Intra-domain intermediate system to intermediate system routing
1133 (router to router)*, in which "best" routes between end-systems
1134 in a single administrative domain are computed and used. A
1135 single algorithm and routing protocol would be used throughout
1136 any one domain.
1138 3. *Inter-domain intermediate-system to intermediate system routing
1139 (router to router)*, in which routes between routing domains
1140 within administrative domains are computed (routing is considered
1141 separately between administrative domains and routing domains).
1143 Level 3 of this hierarchy was still somewhat fuzzy. Tsuchiya says:
1145 The last two components, Inter-Domain and Inter-Administration
1146 routing, are less clear-cut. It is not obvious what should be
1147 standardized with respect to these two components of routing. For
1148 example, for Inter-Domain routing, what can be expected from the
1149 Domains? By asking Domains to provide some kind of external
1150 behavior, we limit their autonomy. If we expect nothing of their
1151 external behavior, then routing functionality will be minimal.
1153 Across administrations, it is not known how much trust there will
1154 be. In fact, the definition of trust itself can only be
1155 determined by the two or more administrations involved.
1157 Fundamentally, the problem with Inter-Domain and Inter-
1158 Administration routing is that autonomy and mistrust are both
1159 antithetical to routing. Accomplishing either will involve a
1160 number of tradeoffs which will require more knowledge about the
1161 environments within which they will operate.
1163 Further refinement of the model occurred over the next couple of
1164 years and a more fully formed view is given by Huitema and Dabbous in
1165 1989 [Huitema90]. By this stage work on the original IS-IS link
1166 state protocol, originated by the Digital Equipment Corporation
1167 (DEC), was fairly advanced and was close to becoming a Draft
1168 International Standard. IS-IS is of course a major component of
1169 intra-domain routing today and inspired the development of the Open
1170 Shortest Path First (OSPF) family. However, Huitema and Dabbous were
1171 not able to give any indication of protocol work for Level 3. There
1172 are hints of possible use of centralized route servers.
1174 In the meantime, the NSFnet consortium and the IETF had been
1175 struggling with the rapid growth of the NSFnet. It had been clear
1176 since fairly early on that EGP was not suitable for handling the
1177 expanding network and the race was on to find a replacement. There
1178 had been some intent to include a metric in EGP to facilitate routing
1179 decisions, but no agreement could be reached on how to define the
1180 metric. The lack of trust was seen as one of the main reasons that
1181 EGP could not establish a globally acceptable routing metric: from
1182 this distance in time, that aim seems clearly futile!
1183 Consequently EGP became effectively a rudimentary path-vector
1184 protocol which linked gateways with Autonomous Systems. It was
1185 totally reliant on the tree structured network to avoid routing loops
1186 and the all informed nature of EGP meant that update packets became
1187 very large. BGP version 1 [RFC1105] was standardized in 1989 but had
1188 been in development for some time before this and had already seen
1189 action in production networks prior to standardization. BGP was the
1190 first real path-vector routing protocol and was intended to relieve
1191 some of the scaling problems as well as providing policy-based
1192 routing. Routes were described as paths along a 'vector' of ASs
1193 without any associated cost metric. This way of describing routes
1194 was explicitly intended to allow detection of routing loops. It was
1195 assumed that the intra-domain routing system was loop-free with the
1196 implication that the total routing system would be loop-free if there
1197 were no loops in the AS path. Note that there were no theoretical
1198 underpinnings for this work and it achieved freedom from routing
1199 loops at the expense of guaranteed convergence.
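The AS-path loop-detection rule described above can be sketched in a few lines. This is a minimal illustration of the path-vector principle, not BGP itself; the AS numbers are private-range examples.

```python
# Minimal sketch of path-vector loop detection: a BGP-style speaker
# rejects any advertisement whose AS path already contains its own AS
# number, which keeps AS-level paths loop-free (though it does not, by
# itself, guarantee convergence).

def accept_and_extend(my_as, as_path):
    """Reject looping paths; otherwise prepend our AS before re-advertising."""
    if my_as in as_path:
        return None                 # loop detected: discard the route
    return [my_as] + as_path        # safe: extend the vector of ASs

print(accept_and_extend(64500, [64501, 64502]))  # [64500, 64501, 64502]
print(accept_and_extend(64500, [64501, 64500]))  # None: our AS already present
```

Note there is no cost metric in the path at all, as the text observes: the list of ASs serves only to describe and police the route, not to rank it.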
1201 Also the NSFnet was a government funded research and education
1202 network. Commercial companies which were partners in some of the
1203 projects were using the NSFnet for their research activities but it
1204 was becoming clear that these companies also needed networks for
1205 commercial traffic. NSFnet had put in place "acceptable use"
1206 policies which were intended to limit the use of the network.
1207 However there was little or no technology to support the legal
1208 framework.
1210 Practical experience, IETF IAB discussion (centred in the Internet
1211 Architecture Task Force) and the OSI theoretical work were by now
1212 coming to the same conclusions:
1213 o Networks were going to be composed out of multiple administrative
1214 domains (the federated network),
1215 o The connections between these domains would be an arbitrary graph
1216 and certainly not a tree,
1217 o The administrative domains would wish to establish distinctive,
1218 independent routing policies through the graph of Autonomous
1219 Systems, and
1220 o Administrative Domains would have a degree of distrust of each
1221 other which would mean that policies would remain opaque.
1223 These views were reflected by Susan Hares' (working for Merit
1224 Networks at that time) contribution to the Internet Architecture
1225 (INARC) workshop in 1989, summarized in the report of the workshop
1226 [INARC89]:
1228 The rich interconnectivity within the Internet causes routing
1229 problems today. However, the presenter believes the problem is
1230 not the high degree of interconnection, but the routing protocols
1231 and models upon which these protocols are based. Rich
1232 interconnectivity can provide redundancy which can help packets
1233 moving even through periods of outages. Our model of interdomain
1234 routing needs to change. The model of autonomous confederations
1235 and autonomous systems [RFC0975] no longer fits the reality of
1236 many regional networks. The ISO models of administrative domain
1237 and routing domains better fit the current Internet's routing
1238 structure.
1240 With the first NSFNET backbone, NSF assumed that the Internet
1241 would be used as a production network for research traffic. We
1242 cannot stop these networks for a month and install all new routing
1243 protocols. The Internet will need to evolve its changes to
1244 networking protocols while still continuing to serve its users.
1246 This reality colors how plans are made to change routing
1247 protocols.
1249 It is also interesting to note that the difficulties of organising a
1250 transition were recognized at this stage and have not been seriously
1251 explored or resolved since.
1253 Policies would primarily be interested in controlling which traffic
1254 should be allowed to transit a domain (to satisfy commercial
1255 constraints or acceptable use policies) thereby controlling which
1256 traffic uses the resources of the domain. The solution adopted by
1257 both the IETF and OSI was a form of distance vector hop-by-hop
1258 routing with explicit policy terms. The reasoning for this choice
1259 can be found in Breslau and Estrin's 1990 paper [Breslau90]
1260 (implicitly - because some alternatives are given, such as a
1261 suggested link-state-with-policy approach which, with hindsight,
1262 would have even greater problems than BGP on a global-scale network).
1263 Traditional distance vector protocols exchanged routing information
1264 in the form of a destination and a metric. The new protocols
1265 explicitly associated policy expressions with the route by including
1266 a list of the source ASs that are permitted to use the route
1267 described in the routing update, and/or a list of all ASs traversed
1268 along the advertised route.
1270 Parallel protocol developments were already in progress by the time
1271 this paper was published: BGP version 2 [RFC1163] in the IETF and the
1272 Inter-Domain Routing Protocol (IDRP) [ISO10747] which would be the
1273 Level 3 routing protocol for the OSI architecture. IDRP was
1274 developed under the aegis of the ANSI X3S3.3 working group led by
1275 Lyman Chapin and Charles Kunzinger. The two protocols were very
1276 similar in basic design but IDRP has some extra features, some of
1277 which have been incorporated into later versions of BGP; others may
1278 yet be so and still others may be seen to be inappropriate. Breslau
1279 and Estrin summarize the design of IDRP as follows:
1281 IDRP attempts to solve the looping and convergence problems
1282 inherent in distance vector routing by including full AD
1283 [Administrative Domain - essentially the equivalent of what are
1284 now called ASs] path information in routing updates. Each routing
1285 update includes the set of ADs that must be traversed in order to
1286 reach the specified destination. In this way, routes that contain
1287 AD loops can be avoided.
1289 IDRP updates also contain additional information relevant to
1290 policy constraints. For instance, these updates can specify what
1291 other ADs are allowed to receive the information described in the
1292 update. In this way, IDRP is able to express source specific
1293 policies. The IDRP protocol also provides the structure for the
1294 addition of other types of policy related information in routing
1295 updates. For example, User Class Identifiers (UCI) could also be
1296 included as policy attributes in routing updates.
1298 Using the policy route attributes IDRP provides the framework for
1299 expressing more fine grained policy in routing decisions.
1300 However, because it uses hop-by-hop distance vector routing, it
1301 only allows a single route to each destination per-QOS to be
1302 advertised. As the policy attributes associated with routes
1303 become more fine grained, advertised routes will be applicable to
1304 fewer sources. This implies a need for multiple routes to be
1305 advertised for each destination in order to increase the
1306 probability that sources have acceptable routes available to them.
1307 This effectively replicates the routing table per forwarding
1308 entity for each QoS, UCI, source combination that might appear in
1309 a packet. Consequently, we claim that this approach does not
1310 scale well as policies become more fine grained, i.e., source or
1311 UCI specific policies.
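The state replication that Breslau and Estrin object to above can be made concrete with a back-of-the-envelope calculation (all figures below are hypothetical):

```python
# If every destination must carry a separate route per (QoS class, UCI,
# source group) combination, forwarding state multiplies accordingly.
destinations  = 100_000    # hypothetical number of advertised destinations
qos_classes   = 4
ucis          = 8          # user class identifiers
source_groups = 16         # distinct source-specific policies

routes_plain  = destinations
routes_policy = destinations * qos_classes * ucis * source_groups
print(routes_policy // routes_plain)   # 512-fold replication of state
```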
1313 Over the next three or four years successive versions of BGP (BGP-2
1314 [RFC1163], BGP-3 [RFC1267] and BGP-4 [RFC1771]) were deployed to cope
1315 with the growing and by now commercialized Internet. From BGP-2
1316 onwards, BGP made no assumptions about an overall structure of
1317 interconnections, allowing it to cope with today's dense web of
1318 interconnections between ASs. BGP version 4 was developed to handle
1319 the change from classful to classless addressing. For most of this
1320 time IDRP was being developed in parallel, and both protocols were
1321 implemented in the Merit gatedaemon routing protocol suite. During
1322 this time there was a movement within the IETF which saw BGP as a
1323 stopgap measure to be used until the more sophisticated IDRP could be
1324 adapted to run over IP instead of the OSI connectionless protocol
1325 CLNP. However, unlike its intra-domain counterpart IS-IS, which has
1326 stood the test of time and indeed proved to be more flexible than
1327 OSPF, IDRP was ultimately not adopted by the market. By the time the
1328 NSFnet backbone was decommissioned in 1995, BGP-4 was the inter-
1329 domain routing protocol of choice and OSI's star was already
1330 beginning to wane. IDRP is now little remembered.
1332 A more complete account of the capabilities of IDRP can be found in
1333 chapter 14 of David Piscitello and Lyman Chapin's book 'Open Systems
1334 Networking: TCP/IP and OSI' which is now readable on the Internet
1335 [Chapin94].
1337 IDRP also contained quite extensive machinery for securing routing
1338 exchanges, much of it based on X.509 certificates for each router and
1339 public/private key encryption of routing updates.
1341 Some of the capabilities of IDRP which might yet appear in a future
1342 version of BGP include the ability to manage routes with explicit QoS
1343 classes, and the concept of domain confederations (somewhat different
1344 from the confederation mechanism in today's BGP) as an extra level in
1345 the hierarchy of routing.
1347 3.3. Nimrod Requirements
1349 Nimrod as expressed by Noel Chiappa in his early document, "A New IP
1350 Routing and Addressing Architecture" [Chiappa91] and later in the
1351 NIMROD Working Group documents [RFC1753] and [RFC1992] established a
1352 number of requirements that need to be considered by any new routing
1353 architecture. The Nimrod requirements took RFC1126 as a starting
1354 point and went further.
1356 The three goals of Nimrod, quoted from [RFC1992], were as follows:
1357 1. To support a dynamic internetwork of _arbitrary size_ (our
1358 emphasis) by providing mechanisms to control the amount of
1359 routing information that must be known throughout an
1360 internetwork.
1361 2. To provide service-specific routing in the presence of multiple
1362 constraints imposed by service providers and users.
1363 3. To admit incremental deployment throughout an internetwork.
1365 It is certain that these goals should be considered requirements for
1366 any new domain-based routing architecture.
1367 o As discussed in other sections of this document the rate of growth
1368 of the amount of information needed to maintain the routing system
1369 is such that the system may not be able to scale up as the
1370 Internet expands as foreseen. And yet, as the services and
1371 constraints upon those services grow, there is a need for more
1372 information to be maintained by the routing system. One of the
1373 key terms in the first requirement is 'control'. While
1374 increasing amounts of information need to be known and maintained
1375 in the Internet, the amounts and kinds of information that are
1376 distributed can be controlled. This goal should be reflected in
1377 the requirements for the future domain-based architecture.
1378 o If anything, the demand for specific services in the Internet has
1379 grown since 1996 when the Nimrod architecture was published.
1380 Additionally the kinds of constraints that service providers need
1381 to impose upon their networks and that services need to impose
1382 upon the routing have also increased. Any changes made to the
1383 network in the last half-decade have not significantly improved
1384 this situation.
1385 o The ability to incrementally deploy any new routing architecture
1386 within the Internet is still an absolute necessity. It is
1387 impossible to imagine that a new routing architecture could
1388 supplant the current architecture on a flag day.
1390 At one point in time Nimrod, with its addressing and routing
1391 architectures, was seen as a candidate for IPng. History shows that
1392 it was not accepted as the IPng, having been ruled out of the
1393 selection process by the IESG in 1994 on the grounds that it was 'too
1394 much of a research effort' [RFC1752], although input for the
1395 requirements of IPng was explicitly solicited from Chiappa [RFC1753].
1396 Instead, IPv6 has been put forth as the IPng. Without entering a
1397 discussion of the relative merits of IPv6 versus Nimrod, it is
1398 apparent that IPv6, while it may solve many problems, does not solve
1399 the critical routing problems in the Internet today. In fact in some
1400 sense it exacerbates them by adding a requirement for support of two
1401 Internet protocols and their respective addressing methods. In many
1402 ways the addition of IPv6 to the mix of methods in today's Internet
1403 only points to the fact that the goals, as set forth by the Nimrod
1404 team, remain as necessary goals.
1406 There is another sense in which study of Nimrod and its architecture
1407 may be important to deriving a future domain-based routing
1408 architecture. Nimrod can be said to have two derivatives:
1409 o Multi-Protocol Label Switching (MPLS) in that it took the notion
1410 of forwarding along well known paths
1411 o Private Network-Node Interface (PNNI) in that it took the notion
1412 of abstracting topological information and using that information
1413 to create connections for traffic.
1415 It is important to note that, whilst MPLS and PNNI borrowed ideas
1416 from Nimrod, neither of them can be said to be an implementation of
1417 this architecture.
1419 3.4. PNNI
1421 The Private Network-Node Interface (PNNI) routing protocol was
1422 developed under the ATM Forum's auspices as a hierarchical route
1423 determination protocol for ATM, a connection oriented architecture.
1424 It is reputed to have developed several of its methods from a study
1425 of the Nimrod architecture. What can be gained from an analysis of
1426 what did and did not succeed in PNNI?
1428 The PNNI protocol includes the assumption that all peer groups are
1429 willing to cooperate, and that the entire network is under a single
1430 top-level administration. Are there limitations that stem from this 'world
1431 node' presupposition? As discussed in [RFC3221], the Internet is no
1432 longer a clean hierarchy and there is a lot of resistance to having
1433 any sort of 'ultimate authority' controlling or even brokering
1434 communication.
1436 PNNI is the first deployed example of a routing protocol that uses
1437 abstract map exchange (as opposed to distance vector or link state
1438 mechanisms) for inter-domain routing information exchange. One
1439 consequence of this is that domains need not all use the same
1440 mechanism for map creation. What were the results of this
1441 abstraction and source based route calculation mechanism?
1443 Since the authors of this document do not have experience running a
1444 PNNI network, the comments above are from a theoretical perspective.
1445 Further research on these issues based on operational experience is
1446 required.
1448 4. Recent Research Work
1450 4.1. Developments in Internet Connectivity
1452 The work commissioned from Geoff Huston by the Internet Architecture
1453 Board [RFC3221] draws a number of conclusions from analysis of BGP
1454 routing tables and routing registry databases:
1455 o The connectivity between provider ASs is becoming more like a
1456 dense mesh than the tree structure that was commonly assumed
1457 a couple of years ago. This has been driven by the
1458 increasing amounts charged for peering and transit traffic by
1459 global service providers. Local direct peering and Internet
1460 exchanges are becoming steadily more common as the cost of local
1461 fibre connections drops.
1462 o End user sites are increasingly resorting to multi-homing onto two
1463 or more service providers as a way of improving resiliency. This
1464 has the knock-on effect of spectacularly fast depletion of the
1465 available pool of AS numbers, as end user sites require public AS
1466 numbers to become multi-homed, and a corresponding increase in the
1467 number of prefixes advertised in BGP.
1468 o Multi-homed sites are using advertisement of longer prefixes in
1469 BGP as a means of traffic engineering to load spread across their
1470 multiple external connections with further impact on the size of
1471 the BGP tables.
1472 o Operational practices are not uniform, and in some cases lack of
1473 knowledge or training is leading to instability and/or excessive
1474 advertisement of routes by incorrectly configured BGP speakers.
1475 o All these factors are quickly negating the advantages in limiting
1476 the expansion of BGP routing tables that were gained by the
1477 introduction of CIDR and consequent prefix aggregation in BGP. It
1478 is also now impossible for IPv6 to realize the world view in which
1479 the default free zone would be limited to perhaps 10,000 prefixes.
1480 o The typical 'width' of the Internet in AS hops is now around five,
1481 and much less in many cases.
1483 These conclusions have a considerable impact on the requirements for
1484 the future domain-based routing architecture:
1486 o Topological hierarchy (e.g. mandating a tree structured
1487 connectivity) cannot be relied upon to deliver scalability of a
1488 large Internet routing system
1489 o Aggregation cannot be relied upon to constrain the size of routing
1490 tables for an all-informed routing system
1492 4.2. DARPA NewArch Project
1494 DARPA funded a project to think about a new architecture for future
1495 generation Internet, called NewArch.
1496 Work started in the first half of 2000 and the main project finished
1497 in 2003 [NewArch03].
1499 The project's main conclusion is that, as the Internet becomes
1500 mainstream infrastructure, fewer and fewer of the requirements are
1501 truly global; many apply with different force, or not at all, in
1502 certain parts of the network. This (it is claimed) makes the
1503 compilation of a single, ordered list of requirements deeply
1504 problematic. Instead we may have to produce multiple requirement
1505 sets with support for differing requirement importance at different
1506 times and in different places. This 'meta-requirement' significantly
1507 impacts architectural design.
1509 Potential new technical requirements identified so far include:
1510 o Commercial environment concerns such as richer inter-provider
1511 policy controls and support for a variety of payment models
1512 o Trustworthiness
1513 o Ubiquitous mobility
1514 o Policy driven self-organisation ('deep auto configuration')
1515 o Extreme short-timescale resource variability
1516 o Capacity allocation mechanisms
1517 o Speed, propagation delay and delay/bandwidth product issues
1519 Non-technical or political 'requirements' include:
1520 o Legal and Policy drivers such as
1521 * Privacy and free/anonymous speech
1522 * Intellectual property concerns
1523 * Encryption export controls
1524 * Law enforcement surveillance regulations
1525 * Charging and taxation issues
1526 o Reconciling national variations and consistent operation in a
1527 world wide infrastructure
1529 The conclusions of the work are now summarized in the final report
1530 [NewArch03].
1532 4.2.1. Defending the End-to-End Principle
1534 One of the participants in the DARPA NewArch work (Dave Clark),
1535 together with one of his associates, published a very interesting paper
1536 analyzing the impact of some of the new requirements identified in
1537 NewArch (see Section 4.2) on the end-to-end principle that has guided
1538 the development of the Internet to date [Blumenthal01]. Their
1539 primary conclusion is that the loss of trust between the users at the
1540 ends of end to end has the most fundamental effect on the Internet.
1541 This is clear in the context of the routing system, where operators
1542 are unwilling to reveal the inner workings of their networks for
1543 commercial reasons. Similarly, trusted third parties and their
1544 avatars (mainly mid-boxes of one sort or another) have a major impact
1545 on the end-to-end principles and the routing mechanisms that went
1546 with them. Overall, the end-to-end principles should be defended so
1547 far as is possible - some changes are already too deeply embedded to
1548 make it possible to go back to full trust and openness - at least
1549 partly as a means of staving off the day when the network will ossify
1550 into an unchangeable form and function (much as the telephone network
1551 has done). The hope is that by that time a new Internet will appear
1552 to offer a context for unfettered innovation.
1554 5. Existing problems of BGP and the current Inter-/Intra-Domain
1555 Architecture
1557 Although most of the people who have to work with BGP today believe
1558 it to be a useful, working protocol, discussions have brought to
1559 light a number of areas where BGP or the relationship between BGP and
1560 the intra-domain routing protocols in use today could be improved.
1561 BGP-4 has been and continues to be extended since it was originally
1562 introduced in [RFC1771] and the protocol as deployed has been
1563 documented in [RFC4271]. This section is, to a large extent, a wish
1564 list for the future domain-based routing architecture based on those
1565 areas where BGP is seen to be lacking, rather than simply a list of
1566 problems with BGP. The shortcomings of today's inter-domain routing
1567 system have also been extensively surveyed in 'Architectural
1568 Requirements for Inter-Domain Routing in the Internet' [RFC3221],
1569 particularly with respect to its stability and the problems produced
1570 by explosions in the size of the Internet.
1572 5.1. BGP and Auto-aggregation
1574 The initial stability, followed by linear growth in the number
1575 of routing objects (prefixes), that was achieved by the introduction
1576 of CIDR around 1994 has now once again been replaced by near-
1577 exponential growth in the number of routing objects. The granularity of
1578 many of the objects advertised in the default free zone is very small
1579 (prefix length of 22 or longer); this granularity appears to be a by-
1580 product of attempts to perform precision traffic engineering related
1581 to increasing levels of multi-homing. At present there is no
1582 mechanism in BGP that would allow an AS to aggregate such prefixes
1583 without advance knowledge of their existence, even if it was possible
1584 to deduce automatically that they could be aggregated. Achieving
1585 satisfactory auto-aggregation would also significantly reduce the
1586 non-locality problems associated with instability in peripheral ASs.
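The kind of aggregation such a mechanism would have to perform is mechanically simple once the covering prefix is known, as the Python standard library illustrates (a sketch of the address arithmetic only, not of any BGP mechanism; prefixes are illustrative):

```python
import ipaddress

# Four adjacent /24s -- e.g. fine-grained advertisements from the edge --
# collapse into one covering /22.  The arithmetic is trivial; the missing
# piece in BGP is a way for an AS to discover, without advance
# configuration, that the specifics belong together.
specifics = [ipaddress.ip_network("198.51.%d.0/24" % i)
             for i in range(100, 104)]
aggregate = list(ipaddress.collapse_addresses(specifics))
print(aggregate)   # [IPv4Network('198.51.100.0/22')]
```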
1588 On the other hand, it may be that alterations to the connectivity of
1589 the net as described in [RFC3221] and Section 2.5.1 may limit the
1590 usefulness of auto-aggregation.
1592 5.2. Convergence and Recovery Issues
1594 BGP today is a stable protocol under most circumstances but this has
1595 been achieved at the expense of making the convergence time of the
1596 inter-domain routing system very slow under some conditions. This
1597 has a detrimental effect on the recovery of the network from
1598 failures.
1600 The timers that control the behavior of BGP are typically set to
1601 values in the region of several tens of seconds to a few minutes,
1602 which constrains the responsiveness of BGP to failure conditions.
1604 In the early days of deployment of BGP, poor network stability and
1605 router software problems led to storms of withdrawals closely
1606 followed by re-advertisements of many prefixes. To control the load
1607 on routing software imposed by these "route flaps", route flap
1608 damping was introduced into BGP. Most operators have now implemented
1609 a degree of route flap damping in their deployments of BGP. This
1610 restricts the number of times that the routing tables will be rebuilt
1611 even if a route is going up and down very frequently. Unfortunately,
1612 route flap damping responds to multiple flaps by increasing the route
1613 suppression time exponentially, which can result in some parts of the
1614 Internet being unreachable for hours at a time.
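The damping behaviour described above can be sketched as a penalty counter with exponential decay, in the spirit of RFC 2439 (the thresholds and figures here are illustrative, not the standardised defaults):

```python
import math

# Route flap damping: each flap adds a fixed penalty; the penalty decays
# exponentially; the route is suppressed while the penalty exceeds a
# cutoff and released only when it has decayed below a reuse threshold.
PENALTY_PER_FLAP = 1000.0
SUPPRESS_ABOVE   = 2000.0
REUSE_BELOW      = 750.0
HALF_LIFE        = 900.0                 # seconds

penalty = 3 * PENALTY_PER_FLAP           # three flaps in quick succession
suppressed = penalty > SUPPRESS_ABOVE    # route withheld from peers

# Seconds until the penalty decays below the reuse threshold; every
# additional flap adds roughly one half-life to the suppression time.
wait = HALF_LIFE * math.log2(penalty / REUSE_BELOW)
print(suppressed, wait)                  # True 1800.0
```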
1616 There is evidence ([RFC3221] and measurements by some of the Sub-
1617 group B members [Jiang02]) that in today's network route flap is
1618 disproportionately associated with the fine-grained prefixes (length
1619 22 or longer) used for traffic engineering at the periphery of
1620 the network. Auto-aggregation as previously discussed would tend to
1621 mask such instability and prevent it being propagated across the
1622 whole network. Another question that needs to be studied is the
1623 continuing need for an architecture that requires global convergence.
1624 Some of our studies (unpublished) show that, in some localities at
1625 least, the network never actually reaches stability; i.e., it never
1626 really globally converges. Can a network of global scale, and
1627 beyond, really be designed around a requirement of global convergence?
1629 5.3. Non-locality of Effects of Instability and Misconfiguration
1631 There have been a number of instances, some of which are well
1632 documented, of a mistake in BGP configuration in a single peripheral
1633 AS propagating across the whole Internet and resulting in misrouting
1634 of most of the traffic in the Internet.
1636 Similarly, a single route flap in a single peripheral AS can require
1637 route table recalculation across the entire Internet.
1639 This non-locality of effects is highly undesirable, and it would be a
1640 considerable improvement if such effects were naturally limited to a
1641 small area of the network around the problem. This is another
1642 argument for an architecture that does not require global
1643 convergence.
1645 5.4. Multihoming Issues
1647 As discussed previously, the increasing use of multi-homing as a
1648 robustness technique by peripheral networks requires that multiple
1649 routes have to be advertised for such domains. These routes must not
1650 be aggregated close in to the multi-homed domain as this would defeat
1651 the traffic engineering implied by multi-homing, and currently cannot
1652 be aggregated further away from the multi-homed domain due to the
1653 lack of auto-aggregation capabilities. Consequently, the default
1654 free zone routing table is growing exponentially, as it was before
1655 CIDR.
1657 The longest prefix match routing technique introduced by CIDR, and
1658 implemented in BGP-4, when combined with provider address allocation
1659 is an obstacle to effective multi-homing if load sharing across the
1660 multiple links is required. If an AS has been allocated its
1661 addresses from an upstream provider, the upstream provider can
1662 aggregate those addresses with those of other customers and need only
1663 advertise a single prefix for a range of customers. But, if the
1664 customer AS is also connected to another provider, the second
1665 provider is not able to aggregate the customer addresses because they
1666 are not taken from its allocation, and will therefore have to
1667 announce a more specific route to the customer AS. The longest match
1668 rule will then direct all traffic through the second provider, which
1669 is not the desired behavior.
1671 Example:
1673 \ /
1674 AS1 AS2
1675 \ /
1676 AS3
1678 Figure 1: Address Aggregation
1680 In Figure 1 AS3 has received its addresses from AS1, which means AS1
1681 can aggregate. But if AS3 wants its traffic to be seen equally both
1682 ways, AS3 is forced to announce both the aggregate and the more
1683 specific route to AS2.
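The interaction between longest-match forwarding and the AS1/AS2/AS3 topology of Figure 1 can be reproduced in a few lines (a sketch; the prefixes and AS labels are illustrative):

```python
import ipaddress

# AS1 advertises only the customer aggregate; AS2 must advertise the
# more specific prefix covering AS3.  Longest match then steers all of
# AS3's traffic via AS2, defeating the intended load sharing.
routes = {
    ipaddress.ip_network("192.0.2.0/24"):   "via AS1 (aggregate)",
    ipaddress.ip_network("192.0.2.128/25"): "via AS2 (more specific)",
}

def lookup(addr):
    addr = ipaddress.ip_address(addr)
    matches = [net for net in routes if addr in net]
    return routes[max(matches, key=lambda net: net.prefixlen)]

print(lookup("192.0.2.200"))   # via AS2 (more specific)
print(lookup("192.0.2.10"))    # via AS1 (aggregate)
```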
1685 This problem has induced many ASs to apply for their own address
1686 allocation even though they could have been allocated addresses from
1687 an upstream provider, further exacerbating the default free zone route
1688 table size explosion. This problem also interferes with the desire
1689 of many providers in the default free zone to route only prefixes
1690 that are equal to or shorter than 20 or 19 bits.
1692 Note that some problems which are referred to as multihoming issues
1693 are not, and should not be, solvable through the routing system
1694 (e.g., where a TCP load distributor is needed), and multihoming is
1695 not a panacea for the general problem of robustness in a routing
1696 system [I-D.berkowitz-multireq].
1698 Editors' Note: A more recent analysis of multihoming can be found
1699 in [RFC4116].
1701 5.5. AS-number exhaustion
1703 The domain identifier or AS-number is a 16-bit number. When this
1704 paper was originally written in 2001, allocation of AS-numbers was
1705 increasing 51% a year [RFC3221] and exhaustion by 2005 was predicted.
1706 According to some recent work again by Huston [Huston05], the rate of
1707 increase dropped off after the business downturn but as of July 2005,
1708 well over half the available AS numbers (39000 out of 64510) had been
1709 allocated by IANA and around 20000 were visible in the global BGP
1710 routing tables. A year later these figures had grown to 42000 (April
1711 2006) and 23000 (August 2006) respectively and the rate of allocation
1712 is currently about 3500 per year. Depending on the curve fitting
1713 model used to predict when exhaustion will occur, the pool will run
1714 out somewhere between 2010 and 2013. There appear to be other
1715 factors at work in this rate of increase beyond an increase in the
1716 number of ISPs in business, although there is a fair degree of
1717 correlation between these numbers. AS numbers are now used for a
1718 number of purposes beyond that of identifying large routing domains:
1719 multihomed sites acquire an AS number in order to express routing
1720 preferences to their various providers, and AS numbers are used as
1721 part of the addressing mechanism for MPLS/BGP-based virtual private
1722 networks (VPNs) [RFC4364]. The IETF has had a proposal under
1723 development for over four years to increase the available range of
1724 AS-numbers to 32 bits [RFC4893]. Much of the slowness in development
1725 is due to the deployment challenge during transition. Because of the
1726 difficulties of transition, deployment needs to start well in advance
1727 of actual exhaustion so that the network as a whole is ready for the
1728 new capability when it is needed. This implies that standardisation
1729 needs to be complete, and implementations available, well in
1730 advance of expected exhaustion; deployment of upgrades that
1731 can handle the longer AS numbers should be starting around 2008 to
1732 give a reasonable expectation that the change will have been rolled
1733 out across a large fraction of the Internet by the time exhaustion
1734 occurs.
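The exhaustion window quoted above can be checked against the figures in this section with a straight-line extrapolation (a crude model; the cited predictions use more careful curve fitting):

```python
# Linear extrapolation from the April 2006 data point in the text:
# 42,000 of the 64,510 assignable 16-bit AS numbers allocated, with
# allocations running at roughly 3,500 per year.
pool_size = 64510
allocated = 42000
rate      = 3500                      # allocations per year

exhaustion = 2006 + (pool_size - allocated) / rate
print(round(exhaustion, 1))           # 2012.4, inside the 2010-2013 window
```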
1735 Editors' Note: The RIRs are planning to move to assignment of the
1736 longer AS numbers by default on 1 January 2009, but there are
1737 concerns that significant numbers of routers will not have been
1738 upgraded by then.
1740 5.6. Partitioned ASs
1742 Tricks with discontinuous ASs are used by operators, for example, to
1743 implement anycast. Discontinuous ASs may also come into being by
1744 chance if a multi-homed domain becomes partitioned as a result of a
1745 fault and part of the domain can access the Internet through each
1746 connection. It may be desirable to make support for this kind of
1747 situation more transparent than it is at present.
1749 5.7. Load Sharing
1751 Load splitting or sharing was not a goal of the original designers of
1752 BGP and it is now a problem for today's network designers and
1753 managers. Trying to fool BGP into load sharing between several links
1754 is a constantly recurring exercise for most operators today.
1756 5.8. Hold down issues
1758 As with the interval between 'hello' messages in OSPF, the typical
1759 size and defined granularity (seconds to tens of seconds) of the
1760 'keep-alive' time negotiated at start-up for each BGP connection
1761 constrains the responsiveness of BGP to link failures.
1763 The recommended values and the available lower limit for this timer
1764 were set to limit the overhead caused by keep-alive messages when
1765 link bandwidths were typically much lower than today. Analysis and
1766 experiment ([I-D.alaettinoglu-isis-convergence], [I-D.sandiick-flip]
1767 and [RFC4204]) indicate that faster links could sustain a much higher
1768 rate of keep-alive messages without significantly impacting normal
1769 data traffic. This would improve responsiveness to link and node
1770 failures but with a corresponding increase in the risk of
1771 instability, if the error characteristics of the link are not taken
1772 properly into account when setting the keep-alive interval.
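The bandwidth argument for faster keep-alives is easy to quantify (a rough calculation; 19 bytes is the minimum BGP message size, while the TCP/IP framing overhead is ignored and the 10 Gbit/s link speed is an assumption):

```python
# Bandwidth cost of BGP KEEPALIVEs at the classic 60-second interval
# versus one per second, as a fraction of a fast link's capacity.
MSG_BITS = 19 * 8
LINK_BPS = 10e9

for interval in (60.0, 1.0):
    fraction = (MSG_BITS / interval) / LINK_BPS
    print("every %4.0fs: %.1e of link capacity" % (interval, fraction))
```

Even at one message per second the overhead is a tiny fraction of the link, which is why the risk lies in link error characteristics rather than in bandwidth.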
1774 Editors' Note: A 'fast' liveness protocol has been standardized as
1775 [I-D.ietf-bfd-base].
1777 An additional problem with the hold-down mechanism in BGP is the
1778 amount of information that has to be exchanged to re-establish the
1779 database of route advertisements on each side of the link when it is
1780 re-established after a failure. Currently any failure, however
1781 brief, forces a full exchange, which could perhaps be constrained by
1782 retaining some state across limited-time failures and using revision
1783 control, transaction and replication techniques to resynchronise the
1784 databases. Various techniques have been implemented to try to reduce
1785 this problem but they have not yet been standardised.
1787 5.9. Interaction between Inter-Domain Routing and Intra-Domain Routing
1789 Today, many operators' backbone routers run both I-BGP and an intra-
1790 domain protocol to maintain the routes that reach between the borders
1791 of the domain. Exporting routes from BGP into the intra-domain
1792 protocol in use and bringing them back up to BGP is not recommended
1793 [RFC2791], but it is still necessary for all backbone routers to run
1794 both protocols. BGP is used to find the egress point and intra-
1795 domain protocol to find the path (next hop router) to the egress
1796 point across the domain. This is not only a management problem but
1797 may also create other problems:
1798 o BGP is a path vector protocol (i.e., a protocol that uses distance
1799 metrics possibly overridden by policy metrics), whereas most
1800 intra-domain protocols are link state protocols. As such, BGP is
1801 not optimised for convergence speed although distance vector
1802 algorithms generally require less processing power. Incidentally,
1803 more efficient distance vector algorithms are available such as
1804 [Xu97].
1805 o The metrics used in BGP and the intra-domain protocol are rarely
1806 comparable or combinable. Whilst there are arguments that the
1807 optimizations inside a domain may be different from those for end-
1808 to-end paths, there are occasions, such as calculating the
1809 'topologically nearest' server, when computable or combinable
1810 metrics would be of assistance.
1812 o The policies that can be implemented using BGP are designed for
1813 control of traffic exchange between operators, not for controlling
1814 paths within a domain. Policies for BGP are most conveniently
1815 expressed in Routing Policy Support Language (RPSL) [RFC2622] and
1816 this could be extended if thought desirable to include additional
1817 policy information.
1818 o If the NEXT HOP destination for a set of BGP routes becomes
1819 inaccessible because of intra-domain protocol problems, the routes
1820 using the vanished next hop have to be invalidated at the next
1821 available UPDATE. Subsequently, if the next hop route reappears,
1822 this would normally lead to the BGP speaker requesting a full
1823 table from its neighbour(s). Current implementations may attempt
1824 to circumvent the effects of intra-domain protocol route flap by
1825 caching the invalid routes for a period in case the next hop is
1826 restored through the 'graceful restart' mechanism.
1828 * Editors' Note: This was standardized as [RFC4724].
1830 o Synchronization between intra-domain and inter-domain routing
1831 information is a problem as long as we use different protocols for
1832 intra-domain and inter-domain routing, which will most probably be
1833 the case even in the future because of the differing requirements
1834 in the two situations. Some sort of synchronization between those
1835 two protocols would be useful. In the RFC 'IS-IS Transient
1836 Blackhole Avoidance' [RFC3277], the intra-domain protocol side of
1837 the story is covered (there is an equivalent discussion for OSPF).
1838 o Synchronizing in BGP means waiting for the intra-domain protocol
1839 to know about the same networks as the inter-domain protocol,
1840 which can take a significant period of time and slows down the
1841 convergence of BGP by adding the intra-domain protocol convergence
1842 time into each cycle. In general, operators no longer attempt full
1843 synchronization in order to avoid this problem (in general,
1844 redistributing the entire BGP routing feed into the local intra-
1845 domain protocol is unnecessary and undesirable but where a domain
1846 has multiple exits to peers and other non-customer networks,
1847 changes in BGP routing that affect the exit taken by traffic
1848 require corresponding re-routing in the intra-domain routing).
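The division of labour described above, where BGP selects the egress and the IGP supplies the path to it, amounts to a recursive lookup (an illustrative sketch; the tables and router names are invented):

```python
# BGP supplies prefix -> egress router (NEXT_HOP); the IGP supplies the
# path across the domain to that egress.  If the IGP loses the egress,
# every BGP route through it becomes unusable.
bgp_rib = {"198.51.100.0/24": "egress-B"}
igp_rib = {"egress-B": "core-2"}      # IGP next hop toward the egress

def forward(prefix):
    egress = bgp_rib.get(prefix)
    return igp_rib.get(egress)        # None => route must be invalidated

print(forward("198.51.100.0/24"))     # core-2
del igp_rib["egress-B"]               # intra-domain failure
print(forward("198.51.100.0/24"))     # None
```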
1850 5.10. Policy Issues
1852 There are several classes of issues with current BGP policy:
1853 o Policy is installed in an ad-hoc manner in each autonomous system.
1854 There isn't a method for ensuring that the policy installed in one
1855 router is coherent with policies installed in other routers.
1856 o As described by Griffin [Griffin99] and McPherson [RFC3345], it
1857 is possible to create policies for ASs, and instantiate them in
1858 routers, that will cause BGP to fail to converge in certain types
1859 of topology.
1861 o There is no available network model for describing policy in a
1862 coherent manner.
1864 Policy management is extremely complex and mostly done without the
1865 aid of any automated procedures. The extreme complexity means that a
1866 highly qualified specialist is required for policy management of
1867 border routers. The training of these specialists is quite lengthy
1868 and needs to involve long periods of hands-on experience. There is,
1869 therefore, a shortage of qualified staff for installing and
1870 maintaining the routing policies. Because of the overall complexity
1871 of BGP, policy management tends to be only a relatively small topic
1872 within a complete BGP training course and specialised policy
1873 management training courses are not generally available.
1875 5.11. Security Issues
1877 While many of the issues with BGP security have been traced either to
1878 implementation issues or to operational issues, BGP is vulnerable to
1879 Distributed Denial of Service (DDoS) attacks. Additionally, routers
1880 can be used as unwitting forwarders in DDoS attacks on other systems.
1882 Though DDoS attacks can be fought in a variety of ways, mostly
1883 through filtering, doing so takes constant vigilance. There is
1884 nothing in the current architecture or in the protocols that
1885 serves to protect the forwarders from these attacks.
1887 Editors' Note: Since the original draft was written, the issue of
1888 inter-domain routing security has been studied in much greater
1889 depth. The rpsec working group has gone into the security issues
1890 in great detail [RFC4593] and readers should refer to that work to
1891 understand the security issues.
1893 5.12. Support of MPLS and VPNs
1895 Recently BGP has been modified to function as a signaling protocol
1896 for MPLS and for VPNs [RFC4364]. Some people see this over-loading
1897 of the BGP protocol as a boon whilst others see it as a problem.
1898 While it was certainly convenient as a vehicle for vendors to
1899 deliver extra functionality to their products, it has exacerbated
1900 some of the performance and complexity issues of BGP. Two important
1901 problems are the additional state that must be retained and
1902 refreshed to support VPN (Virtual Private Network) tunnels, and the
1903 fact that BGP does not provide end-to-end notification, making it
1904 difficult to confirm that all necessary state has been installed or updated.
1906 It is an open question whether VPN signaling protocols should remain
1907 separate from the route determination protocols.
1909 5.13. IPv4 / IPv6 Ships in the Night
1911 The fact that service providers need to maintain two completely
1912 separate networks, one for IPv4 and one for IPv6, has been a real
1913 hindrance to the introduction of IPv6. When IPv6 does get widely
1914 deployed it will do so without causing the disappearance of IPv4.
1915 This means that unless something is done, service providers would
1916 need to maintain the two networks in perpetuity (at least on the
1917 foreshortened timescale which the Internet world uses).
1919 It is possible to use a single set of BGP speakers with multiprotocol
1920 extensions [RFC4760] to exchange information about both IPv4 and IPv6
1921 routes between domains, but the use of TCP as the transport protocol
1922 for the information exchange results in an asymmetry: the session
1923 runs over either TCP over IPv4 or TCP over IPv6. A successful
1924 exchange therefore confirms reachability between the speakers for
1925 one of IPv4 or IPv6 but not the other, so reachability may be
1926 advertised for a protocol for which it is not present.
1928 Also, current implementations do not allow a route to be advertised
1929 for both IPv4 and IPv6 in the same UPDATE message, because it is not
1930 possible to explicitly link the reachability information for an
1931 address family to the corresponding next hop information. This could
1932 be improved, but currently results in independent UPDATEs being
1933 exchanged for each address family.
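The limitation described above can be pictured with a toy model of the multiprotocol reachability attribute of [RFC4760]. The Python types below are purely illustrative (not a wire-format implementation, and the field names are this sketch's own), but they show the structural point: each MP_REACH_NLRI attribute binds exactly one next hop to the prefixes of a single (AFI, SAFI), so the same logical route is sent as two independent UPDATEs.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MPReachNLRI:
    """Illustrative stand-in for the RFC 4760 MP_REACH_NLRI attribute:
    one next hop bound to one (AFI, SAFI) group of prefixes."""
    afi: int          # 1 = IPv4, 2 = IPv6
    safi: int         # 1 = unicast
    next_hop: str
    nlri: List[str]   # prefixes reachable via next_hop

@dataclass
class Update:
    """Toy UPDATE carrying a single multiprotocol reachability
    attribute; there is no way to attach a second per-family
    next hop in the same message."""
    mp_reach: MPReachNLRI

# Advertising "the same" route for both families therefore takes two
# independent UPDATEs (documentation prefixes used below):
v4 = Update(MPReachNLRI(1, 1, "192.0.2.1", ["198.51.100.0/24"]))
v6 = Update(MPReachNLRI(2, 1, "2001:db8::1", ["2001:db8:100::/48"]))
```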
1935 5.14. Existing Tools to Support Effective Deployment of Inter-Domain
1936 Routing
1938 The tools available to network operators to assist in configuring and
1939 maintaining effective inter-domain routing in line with their defined
1940 policies are limited, and almost entirely passive.
1942 o There are no tools to facilitate the planning of the routing of a
1943 domain (either intra- or inter-domain); there are a limited number
1944 of display tools that will visualize the routing once it has been
1945 configured.
1946 o There are no tools to assist in converting business policy
1947 specifications into the Routing Policy Specification Language
1948 (RPSL) (see Section 5.14.1); there are limited tools to
1949 convert RPSL into BGP commands and to check, post facto, that
1950 the proposed policies are consistent with the policies in adjacent
1951 domains (always provided that these have been revealed and
1952 accurately documented).
1953 o There are no tools to monitor BGP route changes in real time and
1954 warn the operator about policy inconsistencies and/or
1955 instabilities.
1957 The following section summarises the tools that are available to
1958 assist with the use of RPSL. Note that they are all batch-mode
1959 tools used off-line from a real network. These tools provide
1960 checks for skilled inter-domain routing configurers but limited
1961 assistance for the novice.
1963 5.14.1. Routing Policy Specification Language RPSL (RFC 2622, 2650) and
1964 RIPE NCC Database (RIPE 157)
1966 Routing Policy Specification Language (RPSL) [RFC2622] enables a
1967 network operator to describe the routes, routers, and autonomous
1968 systems (ASs) that are connected to the local AS.
1970 Using the RPSL language (see [RFC2650]), a distributed database is
1971 created that records routing policies in the Internet, as
1972 specified by each AS independently. The database can be used to
1973 check the consistency of the routing policies stored in it.
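For illustration, a minimal aut-num object of the kind held in such a database might look as follows (the AS numbers are from the documentation range, and the names and maintainer handles are hypothetical):

```
aut-num:    AS64500
as-name:    EXAMPLE-AS
descr:      Hypothetical example network
import:     from AS64510 accept ANY
export:     to AS64510 announce AS64500
import:     from AS64511 accept AS64511
export:     to AS64511 announce AS64500
admin-c:    EX1-EXAMPLE
tech-c:     EX1-EXAMPLE
mnt-by:     MAINT-EXAMPLE
source:     EXAMPLE
```

The paired import/export attributes are what the consistency-checking tools compare against the corresponding objects registered by the neighbouring ASes.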
1975 Tools exist ([IRRToolSet]) that can use the database to (among
1976 other things):
1977 o Flag when two neighboring network operators specify conflicting or
1978 inconsistent routing information exchanges with each other and
1979 also detect global inconsistencies where possible;
1980 o Extract all AS-paths between two networks that are allowed by
1981 routing policy from the routing policy database; display the
1982 connectivity a given network has according to current policies.
1984 The database queries enable a partial static solution to the
1985 convergence problem. They analyze the routing policies of a very
1986 limited part of the Internet and verify that they do not contain
1987 conflicts that could lead to protocol divergence. Static analysis
1988 of the convergence of the entire system has exponential time
1989 complexity, so approximation algorithms would have to be used.
1991 The toolset also allows router configurations to be generated from
1992 RPSL specifications.
1994 Editors' Note: The "Internet Routing Registry Toolset" was
1995 originally developed by the University of Southern California's
1996 Information Sciences Institute (ISI) between 1997 and 2001 as the
1997 "Routing Arbiter ToolSet" (RAToolSet) project. The toolset is no
1998 longer developed by ISI but is used worldwide, so after a period
1999 of improvement by RIPE NCC it has now been transferred to the
2000 Internet Systems Consortium (ISC) for ongoing maintenance as a
2001 public resource.
2003 6. Security Considerations
2005 As this is an informational draft on the history of requirements in
2006 IDR and on the problems facing the current Internet IDR architecture,
2007 it does not as such create any security problems. On the other hand,
2008 some of the problems with today's Internet routing architecture do
2009 create security problems and these have been discussed in the text
2010 above.
2012 7. IANA Considerations
2014 This document does not request any actions by IANA.
2016 RFC Editor: Please remove this section before publication.
2018 8. Acknowledgments
2020 The draft is derived from work originally produced by Babylon.
2021 Babylon was a loose association of individuals from academia, service
2022 providers and vendors whose goal was to discuss issues in Internet
2023 routing with the intention of finding solutions for those problems.
2025 The individual members who contributed materially to this draft are:
2026 Anders Bergsten, Howard Berkowitz, Malin Carlzon, Lenka Carr
2027 Motyckova, Elwyn Davies, Avri Doria, Pierre Fransson, Yong Jiang,
2028 Dmitri Krioukov, Tove Madsen, Olle Pers, and Olov Schelen.
2030 Thanks also go to the members of Babylon and others who did
2031 substantial reviews of this material. Specifically we would like to
2032 acknowledge the helpful comments and suggestions of the following
2033 individuals: Loa Andersson, Tomas Ahlstrom, Erik Aman, Thomas
2034 Eriksson, Niklas Borg, Nigel Bragg, Thomas Chmara, Krister Edlund,
2035 Owe Grafford, Susan Hares, Torbjorn Lundberg, David McGrew, Jasminko
2036 Mulahusic, Florian-Daniel Otel, Bernhard Stockman, Tom Worster, and
2037 Roberto Zamparo.
2039 In addition, the authors are indebted to the folks who wrote all the
2040 references we have consulted in putting this paper together. This
2041 includes not only the references explicitly listed below, but also
2042 those who contributed to the mailing lists we have been participating
2043 in for years.
2045 Finally, it is the editors who are responsible for any lack of
2046 clarity, any errors, glaring omissions or misunderstandings.
2048 9. Informative References
2050 [Blumenthal01]
2051 Blumenthal, M. and D. Clark, "Rethinking the design of the
2052 Internet: The end to end arguments vs. the brave new
2053 world", May 2001.
2056 [Breslau90]
2057 Breslau, L. and D. Estrin, "An Architecture for Network-
2058 Layer Routing in OSI", Proceedings of the ACM symposium on
2059 Communications architectures & protocols , 1990.
2061 [Chapin94]
2062 Piscitello, D. and A. Chapin, "Open Systems Networking:
2063 TCP/IP & OSI", Addison-Wesley (copyright assigned to
2064 authors), 1994.
2066 [Chiappa91]
2067 Chiappa, N., "A New IP Routing and Addressing
2068 Architecture", draft-chiappa-routing-01.txt (work in
2069 progress), 1991.
2072 [Griffin99]
2073 Griffin, T. and G. Wilfong, "An Analysis of BGP
2074 Convergence Properties", Association for Computing
2075 Machinery Proceedings of SIGCOMM '99, 1999.
2077 [Huitema90]
2078 Huitema, C. and W. Dabbous, "Routeing protocols
2079 development in the OSI architecture", Proceedings of
2080 ISCIS V Turkey, 1990.
2082 [Huston05]
2083 Huston, G., "Exploring Autonomous System Numbers", The ISP
2084 Column, August 2005.
2087 [I-D.alaettinoglu-isis-convergence]
2088 Alaettinoglu, C., Jacobson, V., and H. Yu, "Towards Milli-
2089 Second IGP Convergence",
2090 draft-alaettinoglu-isis-convergence-00 (work in progress),
2091 Nov 2000.
2093 [I-D.berkowitz-multireq]
2094 Berkowitz, H. and D. Krioukov, "To Be Multihomed:
2095 Requirements and Definitions", draft-berkowitz-multireq-02
2096 (work in progress), 2001.
2098 [I-D.ietf-bfd-base]
2099 Katz, D. and D. Ward, "Bidirectional Forwarding
2100 Detection", draft-ietf-bfd-base-09 (work in progress),
2101 February 2009.
2103 [I-D.irtf-routing-reqs]
2104 Doria, A., Davies, E., and F. Kastenholz, "A Set of
2105 Possible Requirements for a Future Routing Architecture",
2106 draft-irtf-routing-reqs-10 (work in progress),
2107 January 2009.
2109 [I-D.sandiick-flip]
2110 Sandick, H., Squire, M., Cain, B., Duncan, I., and B.
2111 Haberman, "Fast LIveness Protocol (FLIP)",
2112 draft-sandiick-flip-00 (work in progress), Feb 2000.
2114 [INARC89] Mills, D., Ed. and M. Davis, Ed., "Internet Architecture
2115 Workshop: Future of the Internet System Architecture and
2116 TCP/IP Protocols - Report", Internet Architecture Task
2117 Force (INARC), 1990.
2120 [IRRToolSet]
2121 Internet Systems Consortium, "Internet Routing Registry
2122 Toolset Project", IRR Tool Set website, 2006.
2125 [ISO10747]
2126 ISO/IEC, "Protocol for Exchange of Inter-Domain Routeing
2127 Information among Intermediate Systems to support
2128 Forwarding of ISO 8473 PDUs", International Standard
2129 10747 , 1993.
2131 [Jiang02] Jiang, Y., Doria, A., Olsson, D., and F. Pettersson,
2132 "Inter-domain Routing Stability Measurement", 2002.
2135 [Labovitz02]
2136 Labovitz, C., Ahuja, A., Jahanian, F., and A. Bose,
2137 "Experimental Measurement of Delayed Convergence", NANOG,
2138 2002.
2140 [NewArch03]
2141 Clark, D., Sollins, K., Wroclawski, J., Katabi, D., Kulik,
2142 J., Yang, X., Braden, R., Faber, T., Falk, A., Pingali,
2143 V., Handley, M., and N. Chiappa, "New Arch: Future
2144 Generation Internet Architecture", December 2003.
2147 [RFC0904] Mills, D., "Exterior Gateway Protocol formal
2148 specification", RFC 904, April 1984.
2150 [RFC0975] Mills, D., "Autonomous confederations", RFC 975,
2151 February 1986.
2153 [RFC1105] Lougheed, K. and Y. Rekhter, "Border Gateway Protocol
2154 (BGP)", RFC 1105, June 1989.
2156 [RFC1126] Little, M., "Goals and functional requirements for inter-
2157 autonomous system routing", RFC 1126, October 1989.
2159 [RFC1163] Lougheed, K. and Y. Rekhter, "Border Gateway Protocol
2160 (BGP)", RFC 1163, June 1990.
2162 [RFC1267] Lougheed, K. and Y. Rekhter, "Border Gateway Protocol 3
2163 (BGP-3)", RFC 1267, October 1991.
2165 [RFC1752] Bradner, S. and A. Mankin, "The Recommendation for the IP
2166 Next Generation Protocol", RFC 1752, January 1995.
2168 [RFC1753] Chiappa, N., "IPng Technical Requirements Of the Nimrod
2169 Routing and Addressing Architecture", RFC 1753,
2170 December 1994.
2172 [RFC1771] Rekhter, Y. and T. Li, "A Border Gateway Protocol 4
2173 (BGP-4)", RFC 1771, March 1995.
2175 [RFC1992] Castineyra, I., Chiappa, N., and M. Steenstrup, "The
2176 Nimrod Routing Architecture", RFC 1992, August 1996.
2178 [RFC2362] Estrin, D., Farinacci, D., Helmy, A., Thaler, D., Deering,
2179 S., Handley, M., and V. Jacobson, "Protocol Independent
2180 Multicast-Sparse Mode (PIM-SM): Protocol Specification",
2181 RFC 2362, June 1998.
2183 [RFC2622] Alaettinoglu, C., Villamizar, C., Gerich, E., Kessens, D.,
2184 Meyer, D., Bates, T., Karrenberg, D., and M. Terpstra,
2185 "Routing Policy Specification Language (RPSL)", RFC 2622,
2186 June 1999.
2188 [RFC2650] Meyer, D., Schmitz, J., Orange, C., Prior, M., and C.
2189 Alaettinoglu, "Using RPSL in Practice", RFC 2650,
2190 August 1999.
2192 [RFC2791] Yu, J., "Scalable Routing Design Principles", RFC 2791,
2193 July 2000.
2195 [RFC3221] Huston, G., "Commentary on Inter-Domain Routing in the
2196 Internet", RFC 3221, December 2001.
2198 [RFC3277] McPherson, D., "Intermediate System to Intermediate System
2199 (IS-IS) Transient Blackhole Avoidance", RFC 3277,
2200 April 2002.
2202 [RFC3345] McPherson, D., Gill, V., Walton, D., and A. Retana,
2203 "Border Gateway Protocol (BGP) Persistent Route
2204 Oscillation Condition", RFC 3345, August 2002.
2206 [RFC3618] Fenner, B. and D. Meyer, "Multicast Source Discovery
2207 Protocol (MSDP)", RFC 3618, October 2003.
2209 [RFC3765] Huston, G., "NOPEER Community for Border Gateway Protocol
2210 (BGP) Route Scope Control", RFC 3765, April 2004.
2212 [RFC3913] Thaler, D., "Border Gateway Multicast Protocol (BGMP):
2213 Protocol Specification", RFC 3913, September 2004.
2215 [RFC4116] Abley, J., Lindqvist, K., Davies, E., Black, B., and V.
2216 Gill, "IPv4 Multihoming Practices and Limitations",
2217 RFC 4116, July 2005.
2219 [RFC4204] Lang, J., "Link Management Protocol (LMP)", RFC 4204,
2220 October 2005.
2222 [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
2223 Protocol 4 (BGP-4)", RFC 4271, January 2006.
2225 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
2226 Networks (VPNs)", RFC 4364, February 2006.
2228 [RFC4593] Barbir, A., Murphy, S., and Y. Yang, "Generic Threats to
2229 Routing Protocols", RFC 4593, October 2006.
2231 [RFC4601] Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
2232 "Protocol Independent Multicast - Sparse Mode (PIM-SM):
2233 Protocol Specification (Revised)", RFC 4601, August 2006.
2235 [RFC4724] Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y.
2236 Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724,
2237 January 2007.
2239 [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
2240 "Multiprotocol Extensions for BGP-4", RFC 4760,
2241 January 2007.
2243 [RFC4893] Vohra, Q. and E. Chen, "BGP Support for Four-octet AS
2244 Number Space", RFC 4893, May 2007.
2246 [Tsuchiya87]
2247 Tsuchiya, P., "An Architecture for Network-Layer Routing
2248 in OSI", Proceedings of the ACM workshop on Frontiers in
2249 computer communications technology , 1987.
2251 [Xu97] Xu, Z., Dai, S., and J. Garcia-Luna-Aceves, "A More
2252 Efficient Distance Vector Routing Algorithm", Proc. IEEE
2253 MILCOM 97, Monterey, California, Nov 1997.
2257 Authors' Addresses
2259 Elwyn B. Davies
2260 Folly Consulting
2261 Soham, Cambs
2262 UK
2264 Phone: +44 7889 488 335
2265 Email: elwynd@dial.pipex.com
2267 Avri Doria
2268 LTU
2269 Lulea, 971 87
2270 Sweden
2272 Phone: +1 401 663 5024
2273 Email: avri@acm.org