idnits 2.17.1
draft-irtf-routing-history-05.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
** It looks like you're using RFC 3978 boilerplate. You should update this
to the boilerplate described in the IETF Trust License Policy document
(see https://trustee.ietf.org/license-info), which is required now.
-- Found old boilerplate from RFC 3978, Section 5.1 on line 16.
-- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
line 2131.
-- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2142.
-- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2149.
-- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2155.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
== No 'Intended status' indicated for this document; assuming Proposed
Standard
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
** The document seems to lack separate sections for Informative/Normative
References. All references will be assumed normative when checking for
downward references.
** The abstract seems to contain references ([I-D.irtf-routing-reqs]),
which it shouldn't. Please replace those with straight textual mentions
of the documents in question.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust Copyright Line does not match the
current year
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may
have content which was first submitted before 10 November 2008. If you
have contacted all the original authors and they are all willing to grant
the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
this comment. If not, you may need to add the pre-RFC5378 disclaimer.
(See the Legal Provisions document at
https://trustee.ietf.org/license-info for more information.)
-- The document date (February 19, 2007) is 6275 days in the past. Is this
intentional?
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------
(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)
-- Possible downref: Non-RFC (?) normative reference: ref. 'Blumenthal01'
-- Possible downref: Non-RFC (?) normative reference: ref. 'Breslau90'
-- Possible downref: Non-RFC (?) normative reference: ref. 'Chapin94'
-- Possible downref: Normative reference to a draft: ref. 'Chiappa91'
-- Possible downref: Non-RFC (?) normative reference: ref. 'Griffin99'
-- Possible downref: Non-RFC (?) normative reference: ref. 'Huitema90'
-- Possible downref: Non-RFC (?) normative reference: ref. 'Huston05'
-- Possible downref: Normative reference to a draft: ref.
'I-D.alaettinoglu-isis-convergence'
-- No information found for draft-berkowitz-multirqmt - is the name correct?
-- Possible downref: Normative reference to a draft: ref.
'I-D.berkowitz-multirqmt'
== Outdated reference: A later version (-11) exists of
draft-ietf-bfd-base-05
== Outdated reference: A later version (-11) exists of
draft-irtf-routing-reqs-07
** Downref: Normative reference to an Historic draft:
draft-irtf-routing-reqs (ref. 'I-D.irtf-routing-reqs')
-- Possible downref: Normative reference to a draft: ref.
'I-D.sandiick-flip'
-- Possible downref: Non-RFC (?) normative reference: ref. 'INARC89'
-- Possible downref: Non-RFC (?) normative reference: ref. 'IRRToolSet'
-- Possible downref: Non-RFC (?) normative reference: ref. 'ISO10747'
-- Possible downref: Non-RFC (?) normative reference: ref. 'Jiang02'
-- Possible downref: Non-RFC (?) normative reference: ref. 'Labovitz02'
-- Possible downref: Non-RFC (?) normative reference: ref. 'NewArch03'
** Downref: Normative reference to an Historic RFC: RFC 904
** Downref: Normative reference to an Unknown state RFC: RFC 975
** Obsolete normative reference: RFC 1105 (Obsoleted by RFC 1163)
** Downref: Normative reference to an Unknown state RFC: RFC 1126
** Obsolete normative reference: RFC 1163 (Obsoleted by RFC 1267)
** Downref: Normative reference to an Historic RFC: RFC 1267
** Downref: Normative reference to an Informational RFC: RFC 1753
** Obsolete normative reference: RFC 1771 (Obsoleted by RFC 4271)
** Downref: Normative reference to an Informational RFC: RFC 1992
** Obsolete normative reference: RFC 2362 (Obsoleted by RFC 4601, RFC 5059)
** Obsolete normative reference: RFC 2547 (Obsoleted by RFC 4364)
** Downref: Normative reference to an Informational RFC: RFC 2650
** Downref: Normative reference to an Informational RFC: RFC 2791
** Obsolete normative reference: RFC 2858 (Obsoleted by RFC 4760)
** Downref: Normative reference to an Informational RFC: RFC 3221
** Downref: Normative reference to an Informational RFC: RFC 3277
** Downref: Normative reference to an Informational RFC: RFC 3345
** Downref: Normative reference to an Experimental RFC: RFC 3618
** Downref: Normative reference to an Informational RFC: RFC 3765
** Downref: Normative reference to an Historic RFC: RFC 3913
** Downref: Normative reference to an Informational RFC: RFC 4116
** Downref: Normative reference to an Informational RFC: RFC 4593
** Obsolete normative reference: RFC 4601 (Obsoleted by RFC 7761)
-- Possible downref: Non-RFC (?) normative reference: ref. 'Tsuchiya87'
-- Possible downref: Non-RFC (?) normative reference: ref. 'Xu97'
Summary: 27 errors (**), 0 flaws (~~), 4 warnings (==), 26 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Network Working Group E. Davies
3 Internet-Draft Consultant
4 Expires: August 23, 2007 A. Doria
5 LTU
6 February 19, 2007
8 Analysis of IDR requirements and History
9 draft-irtf-routing-history-05.txt
11 Status of this Memo
13 By submitting this Internet-Draft, each author represents that any
14 applicable patent or other IPR claims of which he or she is aware
15 have been or will be disclosed, and any of which he or she becomes
16 aware will be disclosed, in accordance with Section 6 of BCP 79.
18 Internet-Drafts are working documents of the Internet Engineering
19 Task Force (IETF), its areas, and its working groups. Note that
20 other groups may also distribute working documents as Internet-
21 Drafts.
23 Internet-Drafts are draft documents valid for a maximum of six months
24 and may be updated, replaced, or obsoleted by other documents at any
25 time. It is inappropriate to use Internet-Drafts as reference
26 material or to cite them other than as "work in progress."
28 The list of current Internet-Drafts can be accessed at
29 http://www.ietf.org/ietf/1id-abstracts.txt.
31 The list of Internet-Draft Shadow Directories can be accessed at
32 http://www.ietf.org/shadow.html.
34 This Internet-Draft will expire on August 23, 2007.
36 Copyright Notice
38 Copyright (C) The IETF Trust (2007).
40 Abstract
42 This document analyses the current state of IDR routing with respect
43 to RFC1126 and other IDR requirements and design efforts. It is the
44 companion document to "Requirements for Inter-Domain Routing"
45 [I-D.irtf-routing-reqs], which is a discussion of requirements for
46 the future routing architecture and future routing protocols.
47 Publication of this document is in accordance with the consensus of
48 the active contributors the IRTF's Routing Research Group.
50 [Note to RFC Editor: Please replace the reference in the abstract
51 with a non-reference quoting the RFC number of the companion
52 document when it is allocated, i.e., '(RFC xxxx)' and remove this
53 note.]
55 Table of Contents
57 1. Provenance of this Document . . . . . . . . . . . . . . . . . 4
58 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
59 2.1. Background . . . . . . . . . . . . . . . . . . . . . . . . 5
60 3. Historical Perspective . . . . . . . . . . . . . . . . . . . . 6
61 3.1. The Legacy of RFC1126 . . . . . . . . . . . . . . . . . . 6
62 3.1.1. "General Requirements" . . . . . . . . . . . . . . . . 7
63 3.1.2. "Functional Requirements" . . . . . . . . . . . . . . 11
64 3.1.3. "Non-Goals" . . . . . . . . . . . . . . . . . . . . . 18
65 3.2. ISO OSI IDRP, BGP and the Development of Policy Routing . 22
66 3.3. Nimrod Requirements . . . . . . . . . . . . . . . . . . . 27
67 3.4. PNNI . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
68 4. Recent Research Work . . . . . . . . . . . . . . . . . . . . . 29
69 4.1. Developments in Internet Connectivity . . . . . . . . . . 29
70 4.2. DARPA NewArch Project . . . . . . . . . . . . . . . . . . 30
71 4.2.1. Defending the End-to-End Principle . . . . . . . . . . 31
72 5. Existing problems of BGP and the current
73 Inter-/Intra-Domain Architecture . . . . . . . . . . . . . . . 32
74 5.1. BGP and Auto-aggregation . . . . . . . . . . . . . . . . . 32
75 5.2. Convergence and Recovery Issues . . . . . . . . . . . . . 32
76 5.3. Non-locality of Effects of Instability and
77 Misconfiguration . . . . . . . . . . . . . . . . . . . . . 33
78 5.4. Multihoming Issues . . . . . . . . . . . . . . . . . . . . 33
79 5.5. AS-number exhaustion . . . . . . . . . . . . . . . . . . . 35
80 5.6. Partitioned AS's . . . . . . . . . . . . . . . . . . . . . 35
81 5.7. Load Sharing . . . . . . . . . . . . . . . . . . . . . . . 36
82 5.8. Hold down issues . . . . . . . . . . . . . . . . . . . . . 36
83 5.9. Interaction between Inter domain routing and intra
84 domain routing . . . . . . . . . . . . . . . . . . . . . . 36
85 5.10. Policy Issues . . . . . . . . . . . . . . . . . . . . . . 38
86 5.11. Security Issues . . . . . . . . . . . . . . . . . . . . . 38
87 5.12. Support of MPLS and VPNS . . . . . . . . . . . . . . . . . 38
88 5.13. IPv4 / IPv6 Ships in the Night . . . . . . . . . . . . . . 39
89 5.14. Existing Tools to Support Effective Deployment of
90 Inter-Domain Routing . . . . . . . . . . . . . . . . . . . 39
91 5.14.1. Routing Policy Specification Language RPSL (RFC
92 2622, 2650) and RIPE NCC Database (RIPE 157) . . . . . 40
93 6. Security Considerations . . . . . . . . . . . . . . . . . . . 41
94 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 41
95 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 41
96 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 42
97 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 46
98 Intellectual Property and Copyright Statements . . . . . . . . . . 47
100 1. Provenance of this Document
102 In 2001, the IRTF Routing Research Group (IRTF RRG) chairs, Abha
103 Ahuja and Sean Doran, decided to establish a sub-group to look at
104 requirements for inter-domain routing (IDR). A group of well known
105 routing experts was assembled to develop requirements for a new
106 routing architecture. Their mandate was to approach the problem
107 starting from a blank sheet. This group was free to take any
108 approach, including a revolutionary approach, in developing
109 requirements for solving the problems they saw in inter-domain
110 routing.
112 Simultaneously, an independent effort was started in Sweden with a
113 similar goal. A team, calling itself Babylon, representing vendors,
114 service providers, and academia, assembled to understand the history
115 of inter-domain routing, to research the problems seen by the service
116 providers, and to develop a proposal of requirements for a follow-on
117 to the current routing architecture. This group's approach required
118 an evolutionary approach starting from current routing architecture
119 and practice. In other words the group limited itself to developing
120 an evolutionary strategy. The Babylon group was later folded into
121 the IRTF RRG as Sub-Group B.
123 This document, which was a part of Sub-group B's output, provides a
124 snapshot of the current state of Inter-Domain Routing (IDR) at the
125 time of original writing (2001) with some minor updates to take into
126 account developments since that date, bringing it up to date in 2006.
127 The development of the new requirments set is then motivated by an
128 analysis of the problems that IDR has been encountering in the recent
129 past. This document is intended as a counterpart to the Routing
130 Requirements document which captures the requirements for future
131 domain routing systems as captured separately by the IRTF RRG Sub-
132 groups A and B [I-D.irtf-routing-reqs].
134 2. Introduction
136 It is generally accepted that there are major shortcomings in the
137 inter-domain routing of the Internet today and that these may result
138 in severe routing problems within an unspecified period of time.
139 Remedying these shortcomings will require extensive research to tie
140 down the exact failure modes that lead to these shortcomings and
141 identify the best techniques to remedy the situation.
143 Changes in the nature and quality of the services that users want
144 from the Internet are difficult to provide within the current
145 framework, as they impose requirements never foreseen by the original
146 architects of the Internet routing system.
148 The kind of radical changes that have to be accommodated are
149 epitomized by the advent of IPv6 and the application of IP mechanisms
150 to private commercial networks that offer specific service guarantees
151 beyond the best-effort services of the public Internet. Major
152 changes to the inter-domain routing system are inevitable to provide
153 an efficient underpinning for the radically changed and increasingly
154 commercially-based networks that rely on the IP protocol suite.
156 Current practice stresses the need to separate the concerns of the
157 control plane in a router and the forwarding plane: This document
158 will follow this practice, but we still use the term 'routing' as a
159 global portmanteau to cover all aspects of the system.
161 This document provides a historical perspective on the current state
162 of domain routing in Section 3 by revisiting the previous IETF
163 requirements document intended to steer the development of a future
164 routing system. These requirements, which informed the design of the
165 Border Gateway Protocol (BGP) in 1989, are contained in RFC1126 -
166 "Goals and Functional Requirements for Inter-Autonomous System
167 Routing" [RFC1126].
169 Section 3 also looks at some other work on requirements for domain
170 routing that was carried out before and after RFC1126 was published.
171 This work fleshes out the historical perspective and provides some
172 additional insights into alternative approaches which may be
173 instructive when building a new set of requirements.
175 The motivation for change and the inspiration for some of the
176 requirements for new routing architectures derive from the problems
177 attributable to the current domain routing system that are being
178 experienced in the Internet today. These will be discussed in
179 Section 5.
181 2.1. Background
183 Today's Internet uses an addressing and routing structure that has
184 developed in an ad hoc, more or less upwards-compatible fashion. It
185 has progressed from handling a non-commercial Internet with a single
186 administrative domain to a solution that is just about controlling
187 today's multi-domain, federated Internet, carrying traffic between
188 the networks of commercial, governmental and not-for-profit
189 participants. As well as directing traffic to its intended end-
190 point, inter-domain routing mechanisms are expected to implement a
191 host of domain specific routing policies for competing, communicating
192 domains. The result is not ideal, particularly as regards inter-
193 domain routing mechanisms, but it does a pretty fair job at its
194 primary goal of providing any-to-any connectivity to many millions of
195 computers.
197 Based on a large body of anecdotal evidence, but also on a growing
198 body of experimental evidence [Labovitz02] and analytic work on the
199 stability of BGP under certain policy specifications [Griffin99], the
200 main Internet inter-domain routing protocol, BGP version 4 (BGP-4),
201 appears to have a number of problems that need to be resolved.
202 Additionally, the hierarchical nature of the inter-domain routing
203 problem appears to be changing as the connectivity between domains
204 becomes increasingly meshed [RFC3221] which alters some of the
205 scaling and structuring assumptions on which BGP-4 is built. Patches
206 and fix-ups may relieve some of these problems but others may require
207 a new architecture and new protocols.
209 3. Historical Perspective
211 3.1. The Legacy of RFC1126
213 RFC 1126 [RFC1126] outlined a set of requirements that were intended
214 to guide the development of BGP.
216 Editors' Note: When this document was reviewed by Yakov Rekhter,
217 one of the designers of BGP, his view was that "While some people
218 expected a set of requirements outlined in RFC1126 to guide the
219 development of BGP, in reality the development of BGP happened
220 completely independently of RFC1126. In other words, from the
221 point of view of the development of BGP, RFC1126 turned out to be
222 totally irrelevant." On the other hand, it appears that BGP as
223 currently implemented has met a large proportion of these
224 requirements, especially for unicast traffic.
226 While the network is demonstrably different from what it was in 1989,
227 both as to structure and size, many of the same requirements remain.
228 As a first step in setting requirements for the future, we need to
229 understand the requirements that were originally set for the current
230 protocols. And in charting a future architecture we must first be
231 sure to do no harm. This means a future domain routing system has to
232 support as its base requirement, the level of function that is
233 available today.
235 The following sections each relate to a requirement, or non-
236 requirement listed in RFC1126. In fact the section names are direct
237 quotes from the document. The discussion of these requirements
238 covers the following areas:
240 Explanation: Optional interpretation for today's audience of
241 the original intent of the requirement
243 Relevance: Is the requirement of RFC1126 still relevant, and
244 to what degree? Should it be understood
245 differently in today's environment?
247 Current practice: How well is the requirement met by current
248 protocols and practice?
250 3.1.1. "General Requirements"
252 3.1.1.1. "Route to Destination"
254 Timely routing to all reachable destinations, including multihoming
255 and multicast.
257 Relevance: Valid, but requirements for multihoming need
258 further discussion and elucidation. The
259 requirement should include multiple source
260 multicast routing.
262 Current practice: Multihoming is not efficient and the proposed
263 inter-domain multicast protocol BGMP [RFC3913] is
264 an add-on to BGP following many of the same
265 strategies but not integrated into the BGP
266 framework .
268 Editors' Note: Multicast routing has moved on
269 again since this was originally written. By
270 2006 BGMP had been effectively superseded.
271 Multicast routing now uses Multiprotocol BGP
272 [RFC2858], the Multicast Source Discovery
273 Protocol (MSDP) [RFC3618] and Protocol
274 Independent Multicast - Sparse Mode (PIM-SM)
275 [RFC2362], [RFC4601], especially the Source
276 Specific Multicast (SSM) subset.
278 3.1.1.2. "Routing is Assured"
280 This requires that a user be notified within a reasonable time period
281 of attempts, about inability to provide a service.
283 Relevance: Valid
284 Current practice: There are ICMP messages for this, but in many
285 cases they are not used, either because of fears
286 about creating message storms or uncertainty about
287 whether the end system can do anything useful with
288 the resulting information. IPv6 implementations
289 may be able to make better use of the information
290 as they may have alternative addresses that could
291 be used to exploit an alternative routing.
293 3.1.1.3. "Large System"
295 The architecture was designed to accommodate the growth of the
296 Internet.
298 Relevance: Valid. Properties of Internet topology might be
299 an issue for future scalability (topology varies
300 from very sparse to quite dense at present).
301 Instead of setting growth in a time-scale,
302 indefinite growth should be accommodated. On the
303 other hand, such growth has to be accommodated
304 without making the protocols too expensive -
305 trade-offs may be necessary.
307 Current practice: Scalability of the current protocols will not be
308 sufficient under the current rate of growth.
309 There are problems with BGP convergence for large
310 dense topologies, problems with routing
311 information propagation between routers in transit
312 domains, limited support for hierarchy, etc.
314 3.1.1.4. "Autonomous Operation"
316 This requirement encapsulates the need for administative domains
317 ("Autonomous Systems" - AS) to be able to operate autonomously as
318 regards setting routing policy:
320 Relevance: Valid. There may need to be additional
321 requirements for adjusting policy decisions to the
322 global functionality and for avoiding
323 contradictory policies. This would decrease the
324 possibility of unstable routing behavior.
326 There is a need for handling various degrees of
327 trust in autonomous operations, ranging from no
328 trust (e.g., between separate ISPs) to very high
329 trust where the domains have a common goal of
330 optimizing their mutual policies.
332 Policies for intra domain operations should in
333 some cases be revealed, using suitable
334 abstractions.
336 Current practice: Policy management is in the control of network
337 managers, as required, but there is little support
338 for handling policies at an abstract level for a
339 domain.
341 Cooperating administrative entities decide about
342 the extent of cooperation independently. Lack of
343 coordination combined with global range of effects
344 results in occasional melt-down of Internet
345 routing.
347 3.1.1.5. "Distributed System"
349 The routing environment is a distributed system. The distributed
350 routing environment supports redundancy and diversity of nodes and
351 links. Both data and operations are distributed.
353 Relevance: Valid. RFC1126 is very clear that we should not
354 be using centralized solutions, but maybe we need
355 a discussion on trade-offs between common
356 knowledge and distribution (i.e., to allow for
357 uniform policy routing, e.g., GSM systems are in a
358 sense centralized, but with hierarchies)
360 Current practice: Routing is very distributed, but lacking abilities
361 to consider optimization over several hops or
362 domains.
364 3.1.1.6. "Provide A Credible Environment"
366 Routing mechanism information must be integral and secure (credible
367 data, reliable operation). Security from unwanted modification and
368 influence is required.
370 Relevance: Valid.
372 Current practice: BGP provides a limited mechanism for
373 authentication and security of peering sessions,
374 but this does not guarantee the authenticity or
375 validity of the routing information that is
376 exchanged.
378 There are certainly security problems with current
379 practice. The Routing Protocol Security
380 Requirements (rpsec) working group has been
381 struggling to agree on a set of requirements for
382 BGP security since early 2002.
384 Editors' note: Proposals for authenticating BGP
385 routing information using certificates were
386 under development by the Secure Inter-Domain
387 Routing (sidr) working group in 2006.
389 3.1.1.7. "Be A Managed Entity"
391 Requires that a manager should get enough information on a state of
392 network so that s/he could make informed decisions.
394 Relevance: The requirement is reasonable, but we might need
395 to be more specific on what information should be
396 available, e.g., to prevent routing oscillations.
398 Current practice: All policies are determined locally, where they
399 may appear reasonable but there is limited global
400 coordination through the routing policy databases
401 operated by the Internet registries (AfriNIC,
402 APNIC, ARIN, LACNIC, RIPE, etc.).
404 Operators are not required to register their
405 policies; even when policies are registered, it is
406 difficult to check that the actual policies in use
407 match the declared policies and therefore a
408 manager cannot guarantee to make a globally
409 consistent decision.
411 3.1.1.8. "Minimize Required Resources"
413 Relevance: Valid, however, the paragraph states that
414 assumptions on significant upgrades shouldn't be
415 made. Although this is reasonable, a new
416 architecture should perhaps be prepared to use
417 upgrades when they occur.
419 Current practice: Most bandwidth is consumed by the exchange of the
420 Network Layer Reachability Information (NLRI).
421 Usage of processing cycles ("Central Processor
422 Usage" - CPU) depends on the stability of the
423 Internet. Both phenomena have a local nature, so
424 there are not scaling problems with bandwidth and
425 CPU usage. Instability of routing increases the
426 consumption of resources in any case. The number
427 of networks in the Internet dominates memory
428 requirements - this is a scaling problem.
430 3.1.2. "Functional Requirements"
432 3.1.2.1. "Route Synthesis Requirements"
434 3.1.2.1.1. "Route around failures dynamically"
436 Relevance: Valid. Should perhaps be stronger. Only
437 providing a best-effort attempt may not be enough
438 if real-time services are to be provided for.
439 Detections may need to be faster than 100ms to
440 avoid being noticed by end-users.
442 Current practice: Latency of fail-over is too high; sometimes
443 minutes or longer.
445 3.1.2.1.2. "Provide loop free paths"
447 Relevance: Valid. Loops should occur only with negligible
448 probability and duration.
450 Current practice: Both link-state intra domain routing and BGP
451 inter-domain routing (if correctly configured) are
452 forwarding-loop free after having converged.
453 However, convergence time for BGP can be very long
454 and poorly designed routing policies may result in
455 a number of BGP speakers engaging in a cyclic
456 pattern of advertisements and withdrawals which
457 never converges to a stable result [RFC3345].
458 Perhaps this is one context in which the need for
459 global convergence needs to be reviewed.
461 3.1.2.1.3. "Know when a path or destination is unavailable"
463 Relevance: Valid to some extent, but there is a trade-off
464 between aggregation and immediate knowledge of
465 reachability. It requires that routing tables
466 contain enough information to determine that the
467 destination is unknown or a path cannot be
468 constructed to reach it.
470 Current practice: Knowledge about lost reachability propagates
471 slowly through the networks due to slow
472 convergence for route withdrawals.
474 3.1.2.1.4. "Provide paths sensitive to administrative policies"
476 Relevance: Valid. Policy control of routing is of
477 increasingly importance as the Internet has turned
478 into a business.
480 Current practice: Supported to some extent. Policies can only be
481 applied locally in an AS and not globally. Policy
482 information supplied has a very small probability
483 of affecting policies in other AS's. Furthermore,
484 only static policies are supported; between static
485 policies and policies dependent upon volatile
486 events of great celerity there should exist events
487 that routing should be aware of. Lastly, there is
488 no support for policies other than route-
489 properties (such as AS-origin, AS-path,
490 destination prefix, MED-values etc).
492 Editors' note: Subsequent to the original issue
493 of this document mechanisms which acknowledge
494 the business relationships of operators have
495 been developed such as the NOPEER community
496 attribute [RFC3765]. However the level of
497 usage of this attribute is apparently not very
498 great.
500 3.1.2.1.5. "Provide paths sensitive to user policies"
502 Relevance: Valid to some extent, as they may conflict with
503 the policies of the network administrator. It is
504 likely that this requirement will be met by means
505 of different bit transport services offered by an
506 operator, but at the cost of adequate
507 provisioning, authentication and policing when
508 utilizing the service.
510 Current practice: Not supported in normal routing. Can be
511 accomplished to some extent with loose source
512 routing, resulting in inefficient forwarding in
513 the routers. The various attempts to introduce
514 Quality of Service (QoS - e.g., Integrated
515 Services and Differentiated Services (DiffServ))
516 can also be seen as means to support this
517 requirement but they have met with limited success
518 in terms of providing alternate routes as opposed
519 to providing improved service on the standard
520 route.
522 Editor's Note: From the standpoint of a later
523 time, it would probably be more appropriate to
524 say "total faiure" rather than "limited
525 success".
527 3.1.2.1.6. "Provide paths which characterize user quality-of-service
528 requirements"
530 Relevance: Valid to some extent, as they may conflict with
531 the policies of the operator. It is likely that
532 this requirement will be met by means of different
533 bit transport services offered by an operator, but
534 at the cost of adequate provisioning,
535 authentication and policing when utilizing the
536 service. It has become clear that offering to
537 provide a particular QoS to any arbitrary
538 destination from a particular source is generally
539 impossible: QoS except in very 'soft' forms such
540 as overall long term average packet delay, is
541 generally associated with connection oriented
542 routing.
544 Current practice: Creating routes with specified QoS is not
545 generally possible at present.
547 3.1.2.1.7. "Provide autonomy between inter- and intra-autonomous system
548 route synthesis"
550 Relevance: Inter- and intra-domain routing should stay
551 independent, but one should notice that this to
552 some extent contradicts the previous three
553 requirements. There is a trade-off between
554 abstraction and optimality.
556 Current practice: Inter-domain routing is performed independently of
557 intra-domain routing. Intra-domain routing is
558 however, especially in transit domains, very
559 interrelated with inter-domain routing.
561 3.1.2.2. "Forwarding Requirements"
563 3.1.2.2.1. "Decouple inter- and intra-autonomous system forwarding
564 decisions"
566 Relevance: Valid.
568 Current practice: As explained in Section 3.1.2.1.7, intra-domain
569 forwarding in transit domains is dependent on
570 inter-domain forwarding decisions.
572 3.1.2.2.2. "Do not forward datagrams deemed administratively
573 inappropriate"
575 Relevance: Valid, and increasingly important in the context
576 of enforcing policies correctly expressed through
577 routing advertisements but flouted by rogue peers
578 which send traffic for which a route has not been
579 advertised. On the other hand, packets that have
580 been misrouted due to transient routing problems
581 perhaps should be forwarded to reach the
582 destination, although along an unexpected path.
584 Current practice: At stub domains there is packet filtering, e.g.,
585 to catch source address spoofing on outgoing
586 traffic or to filter out unwanted incoming
587 traffic. Filtering can in particular reject
588 traffic (such as unauthorized transit traffic)
589 that has been sent to a domain even when it has
590 not advertised a route for such traffic on a given
591 interface. The growing class of 'middle boxes'
592 (midboxes, e.g., Network Address Translators -
593 NATs) is quite likely to apply administrative
594 rules that will prevent forwarding of packets.
595 Note that security policies may deliberately hide
596 administrative denials. In the backbone,
597 intentional packet dropping based on policies is
598 not common.
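The address-based part of such filtering can be sketched in a few lines. This is a minimal illustration of egress anti-spoofing at a hypothetical stub domain, using Python's standard ipaddress module; the prefixes are invented documentation addresses, not any real deployment's configuration.

```python
import ipaddress

# Prefixes allocated to the (hypothetical) stub domain.
LOCAL_PREFIXES = [ipaddress.ip_network("192.0.2.0/24"),
                  ipaddress.ip_network("198.51.100.0/24")]

def egress_permitted(src_ip):
    """Anti-spoofing check: forward an outgoing packet only if its
    source address belongs to one of the domain's own prefixes."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in LOCAL_PREFIXES)

print(egress_permitted("192.0.2.17"))    # True: legitimate local source
print(egress_permitted("203.0.113.5"))   # False: spoofed or foreign source
```

The same membership test, applied to destination rather than source addresses on an incoming interface, is the basis of the unauthorized-transit filtering mentioned above.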
600 3.1.2.2.3. "Do not forward datagrams to failed resources"
602 Relevance: Unclear, although it is clearly desirable to
603 minimise waste of forwarding resources by
604 discarding datagrams which cannot be delivered at
605 the earliest opportunity. There is a trade-off
606 between scalability and keeping track of
607 unreachable resources. Equipment closest to a
608 failed node has the highest motivation to keep
609 track of failures so that waste can be minimised.
611 Current practice: Routing protocols use both internal adjacency
612 management sub-protocols (e.g. Hello protocols)
613 and information from equipment and lower layer
614 link watchdogs to keep track of failures in
615 routers and connecting links. Failures will
616 eventually result in the routing protocol
617 reconfiguring the routing to avoid (if possible) a
618 failed resource, but this is generally very slow
619 (30s or more). In the meantime datagrams may well
620 be forwarded to failed resources. In general
621 terms, end hosts and some non-router midboxes do
622 not participate in these notifications and
623 failures of such boxes will not affect the routing
624 system.
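The Hello-based liveness tracking described above amounts to a hold timer. The following is a minimal sketch with assumed interval values (the 40s dead interval echoes the 30s-plus reconvergence delay noted in the text); real protocols add full adjacency state machines and link-layer watchdog input.

```python
HELLO_INTERVAL = 10.0   # assumed seconds between Hello messages
DEAD_INTERVAL = 40.0    # assumed silence before declaring failure

class Adjacency:
    """Minimal model of Hello-based liveness tracking between routers."""
    def __init__(self, now):
        self.last_hello = now

    def receive_hello(self, now):
        self.last_hello = now

    def is_alive(self, now):
        # The neighbour is presumed up while Hellos keep arriving
        # within the dead interval.
        return (now - self.last_hello) < DEAD_INTERVAL

adj = Adjacency(now=0.0)
adj.receive_hello(now=10.0)
print(adj.is_alive(now=30.0))   # True: last Hello only 20s ago
print(adj.is_alive(now=55.0))   # False: 45s of silence
```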
626 3.1.2.2.4. "Forward datagram according to its characteristics"
628 Relevance: Valid. This is necessary to enable
629 differentiation in the network, based on QoS,
630 precedence, policy or security.
632 Current practice: Ingress and egress filtering can be done based on
633 policy. Some networks discriminate on the basis
634 of requested QoS.
636 3.1.2.3. "Information Requirements"
638 3.1.2.3.1. "Provide a distributed and descriptive information base"
640 Relevance: Valid; however, hierarchical information bases
641 might provide more possibilities.
643 Current practice: The information base is distributed, but it is
644 unclear whether it supports all necessary routing
645 functionality.
647 3.1.2.3.2. "Determine resource availability"
649 Relevance: Valid. It should be possible for resource
650 availability and levels of resource availability
651 to be determined, which avoids having to
652 discover unavailability through failure. Resource
653 location and discovery is arguably a separate
654 concern that could be addressed outside the core
655 routing requirements.
657 Current practice: Resource availability is predominantly handled
658 outside of the routing system.
660 3.1.2.3.3. "Restrain transmission utilization"
662 Relevance: Valid. However, certain control plane
663 requirements, such as fast detection of faults,
664 may justify the consumption of more resources.
665 Similarly, simplicity of implementation may make
666 it cheaper to 'back haul' traffic to central
667 locations to minimise the cost of routing if
668 bandwidth is cheaper than processing.
670 Current practice: BGP messages probably do not ordinarily consume
671 excessive resources, but might during erroneous
672 conditions. In the data plane, the near universal
673 adoption of shortest path protocols could be
674 considered to result in minimization of
675 transmission utilization.
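The shortest-path computation underlying the link-state IGPs in near universal use (OSPF, IS-IS) is Dijkstra's algorithm. A minimal sketch over an invented four-router topology, with link costs standing in for the configured metrics:

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra's algorithm, as used in link-state routing. `graph`
    maps each node to a list of (neighbour, link_cost) pairs."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neigh, cost in graph[node]:
            nd = d + cost
            if nd < dist.get(neigh, float("inf")):
                dist[neigh] = nd
                heapq.heappush(heap, (nd, neigh))
    return dist

# Hypothetical four-router topology with symmetric link costs.
topo = {"A": [("B", 1), ("C", 4)],
        "B": [("A", 1), ("C", 2), ("D", 5)],
        "C": [("A", 4), ("B", 2), ("D", 1)],
        "D": [("B", 5), ("C", 1)]}
print(shortest_paths(topo, "A"))   # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```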
677 3.1.2.3.4. "Allow limited information exchange"
679 Relevance: Valid. But perhaps routing could be improved if
680 certain information could be available either
681 globally or at least for a wider defined locality.
683 Current practice: Policies are used to determine which reachability
684 information is exported.
686 3.1.2.4. "Environmental Requirements"
688 3.1.2.4.1. "Support a packet-switching environment"
690 Relevance: Valid, but the routing system should
691 perhaps not be limited to this exclusively.
693 Current practice: Supported.
695 3.1.2.4.2. "Accommodate a connection-less oriented user transport
696 service"
698 Relevance: Valid, but the routing system should
699 perhaps not be limited to this exclusively.
701 Current practice: Accommodated.
703 3.1.2.4.3. "Accommodate 10K autonomous systems and 100K networks"
705 Relevance: No longer valid. Needs to be increased
706 potentially indefinitely. It is extremely
707 difficult to foresee the future size expansion of
708 the Internet so that the Utopian solution would be
709 to achieve an Internet whose architecture is scale
710 invariant. Regrettably, this may not be
711 achievable without introducing undesirable
712 complexity, and a suitable trade-off between
713 complexity and scalability is likely to be
714 necessary.
716 Current practice: Supported but perhaps reaching its limit. Since
717 the original version of this document was written
718 in 2001, the number of ASs advertised has grown
719 from around 8000 to 20000, and almost 35000 AS
720 numbers have been allocated by the regional
721 registries [Huston05]. If this growth continues
722 the original 16 bit AS space in BGP-4 will be
723 exhausted in less than 5 years. Planning for an
724 extended AS space is now an urgent requirement.
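The exhaustion estimate above can be reproduced with simple arithmetic. The annual allocation rate used below is an illustrative assumption chosen to match the trend the text describes, not a figure taken from [Huston05]:

```python
# Figures from the text: by 2005, almost 35000 of the 2^16 AS numbers
# had been allocated by the regional registries.
AS16_SPACE = 2 ** 16            # 65536 numbers in the original AS space
allocated_2005 = 35000
assumed_rate_per_year = 6500    # hypothetical allocations per year

remaining = AS16_SPACE - allocated_2005
years_to_exhaustion = remaining / assumed_rate_per_year
print(remaining)                       # 30536 numbers left
print(round(years_to_exhaustion, 1))   # roughly 4.7 years at this rate
```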
726 3.1.2.4.4. "Allow for arbitrary interconnection of autonomous systems"
728 Relevance: Valid. However, perhaps not all interconnections
729 should be accessible globally.
731 Current practice: BGP-4 allows for arbitrary interconnections.
733 3.1.2.5. "General Objectives"
735 3.1.2.5.1. "Provide routing services in a timely manner"
737 Relevance: Valid, as stated before. The more complex a
738 service is, the longer it should be allowed to
739 take, but the implementation of services requiring
740 (say) NP-complete calculation should be avoided.
742 Current practice: More or less, with the exception of convergence
743 and fault robustness.
745 3.1.2.5.2. "Minimize constraints on systems with limited resources"
746 Relevance: Valid.
748 Current practice: Systems with limited resources are typically stub
749 domains that advertise very little information.
751 3.1.2.5.3. "Minimize impact of dissimilarities between autonomous
752 systems"
754 Relevance: Important. This requirement is critical to a
755 future architecture. In a domain routing
756 environment where the internal properties of
757 domains may differ radically, it will be important
758 to be sure that these dissimilarities are
759 minimized at the borders.
760 Current practice: For the most part this capability is not really
761 required in today's networks since the intra-
762 domain attributes are broadly similar across
763 domains.
765 3.1.2.5.4. "Accommodate the addressing schemes and protocol mechanisms
766 of the autonomous systems"
768 Relevance: Important, probably more so than when RFC1126 was
769 originally developed because of the potential
770 deployment of IPv6, wider usage of MPLS and the
771 increasing usage of VPNs.
773 Current practice: Only one global addressing scheme is supported in
774 most autonomous systems but the availability of
775 IPv6 services is steadily increasing. Some global
776 backbones support IPv6 routing and forwarding.
778 3.1.2.5.5. "Must be implementable by network vendors"
780 Relevance: Valid, but note that what can be implemented today
781 is different from what was possible when RFC1126
782 was written: a future domain routing architecture
783 should not be unreasonably constrained by past
784 limitations.
786 Current practice: BGP was implemented and meets a large proportion
787 of the original requirements.
789 3.1.3. "Non-Goals"
791 RFC1126 also included a section discussing non-goals. To what extent
792 are these still non-goals? Does the fact that they were non-goals
793 adversely affect today's IDR system?
795 3.1.3.1. "Ubiquity"
797 The authors of RFC 1126 were explicitly saying that IP and its inter-
798 domain routing system need not be deployed in every AS, and a
799 participant should not necessarily expect to be able to reach a given
800 AS, possibly because of routing policies. In a sense this 'non-goal'
801 has effectively been achieved by the Internet and IP protocols. This
802 requirement reflects a different world view where there was serious
803 competition for network protocols, which is really no longer the
804 case. Ubiquitous deployment of inter-domain routing in particular
805 has been achieved and must not be undone by any proposed future
806 domain routing architecture. On the other hand:
807 o ubiquitous connectivity cannot be reached in a policy sensitive
808 environment and should not be an aim,
809 * Editor's Note: It has been pointed out that this statement
810 could be interpreted as being contrary to the Internet mission
811 of providing universal connectivity. The fact that limits to
812 connectivity will be added as operational requirements in a
813 policy sensitive environment should not imply that a future
814 domain routing architecture contains intrinsic limits on
815 connectivity.
816 o it must not be required that the same routing mechanisms are used
817 throughout, provided that they can interoperate appropriately,
818 o the information needed to control routing in a part of the network
819 should not necessarily be ubiquitously available and it must be
820 possible for an operator to hide commercially sensitive
821 information that is not needed outside a domain.
822 o the introduction of IPv6 reintroduces an element of diversity into
823 the world of network protocols but the similarities of IPv4 and
824 IPv6 as regards routing and forwarding make this event less likely
825 to drive an immediate diversification in routing systems. The
826 potential for further growth in the size of the network enabled by
827 IPv6 is very likely to require changes in the future: whether this
828 results in the replacement of one de facto ubiquitous system with
829 another remains to be seen but cannot be a requirement - it will
830 have to interoperate with BGP during the transition.
832 Relevance: De facto essential for a future domain routing
833 architecture, but what is required is ubiquity of
834 the routing system rather than ubiquity of
835 connectivity and it must be capable of a gradual
836 takeover through interoperation with the existing
837 system.
839 Current practice: De facto ubiquity achieved.
841 3.1.3.2. "Congestion control"
843 Relevance: It is not clear if this non-goal was to be applied
844 to routing or forwarding. It is definitely a non-
845 goal to adapt the choice of route when there is
846 transient congestion. However, to add support for
847 congestion avoidance (e.g., Explicit Congestion
848 Notification (ECN) and ICMP messages) in the
849 forwarding process would be a useful addition.
850 There is also extensive work going on in traffic
851 engineering which should result in congestion
852 avoidance through routing as well as in
853 forwarding.
855 Current practice: Some ICMP messages (e.g., source quench) exist to
856 deal with congestion control but these are not
857 generally used as they either make the problem
858 worse or there is no mechanism to reflect the
859 message into the application which is providing
860 the source.
862 3.1.3.3. "Load splitting"
864 Relevance: This should neither be a non-goal, nor an explicit
865 goal. It might be desirable in some cases and
866 should be considered as an optional architectural
867 feature.
869 Current practice: Can be implemented by exporting different prefixes
870 on different links, but this requires manual
871 configuration and does not consider actual load.
873 Editors' Note: This configuration is carried
874 out extensively as of 2006 and has been a
875 significant factor in routing table bloat. If
876 this need is a real operational requirement, as
877 it seems to be for multihomed or otherwise
878 richly connected sites, it will be necessary to
879 reclassify this as a real and important goal.
881 3.1.3.4. "Maximizing the utilization of resources"
882 Relevance: Valid. Cost-efficiency should be the aim;
883 maximizing resource utilization does not always
884 lead to the greatest cost-efficiency.
886 Current practice: Not currently part of the system, though often
887 achieved as a 'hacked-in' feature through manual
888 configuration.
890 3.1.3.5. "Schedule to deadline service"
892 This non-goal was put in place to ensure that the IDR did not have to
893 meet real time deadline goals such as might apply to Constant Bit
894 Rate (CBR) real time services in ATM.
896 Relevance: The hard form of deadline services is still a non-
897 goal for the future domain routing architecture
898 but overall delay bounds are much more pressing
899 now than was the case when RFC1126 was
900 written.
902 Current practice: Service providers are now offering overall
903 probabilistic delay bounds on traffic contracts.
904 To implement these contracts there is a
905 requirement for a rather looser form of delay
906 sensitive routing.
908 3.1.3.6. "Non-interference policies of resource utilization"
910 The requirement in RFC1126 is somewhat opaque, but appears to imply
911 that what we would today call QoS routing is a non-goal and that
912 routing would not seek to control the elastic characteristics of
913 Internet traffic whereby a TCP connection can seek to utilize all the
914 spare bandwidth on a route, possibly to the detriment of other
915 connections sharing the route or crossing it.
916 Relevance: Open Issue. It is not clear whether dynamic QoS
917 routing can or should be implemented. Such a
918 system would seek to control the admission and
919 routing of traffic depending on current or recent
920 resource utilization. This would be particularly
921 problematic where traffic crosses an ownership
922 boundary because of the need for potentially
923 commercially sensitive information to be made
924 available outside the ownership boundary.
926 Current practice: Routing does not consider dynamic resource
927 availability. Forwarding can support service
928 differentiation.
930 3.2. ISO OSI IDRP, BGP and the Development of Policy Routing
932 During the decade before the widespread success of the World Wide
933 Web, ISO was developing the communications architecture and protocol
934 suite Open Systems Interconnection (OSI). For a considerable part of
935 this time OSI was seen as a possible competitor for, and even a
936 replacement for, the IP suite as the basis for the Internet. The
937 technical developments of the two protocol suites were quite heavily
938 interrelated, with each providing ideas and even components that were
939 adapted into the other suite.
941 During the early stages of the development of OSI, the IP suite was
942 still mainly in use on the ARPANET and the relatively small scale
943 first phase NSFnet. This was effectively a single administrative
944 domain with a simple tree structured network in a three level
945 hierarchy connected to a single logical exchange point (the NSFnet
946 backbone). In the second half of the 1980s the NSFNET was starting
947 on the growth and transformation that would lead to today's Internet.
948 It was becoming clear that the backbone routing protocol, the
949 Exterior Gateway Protocol (EGP) [RFC0904], was not going to cope even
950 with the limited expansion being planned. EGP is an "all informed"
951 protocol which needed to know the identities of all gateways and this
952 was no longer reasonable. With the increasing complexity of the
953 NSFnet and the linkage of the NSFnet network to other networks there
954 was a desire for policy-based routing which would allow
955 administrators to manage the flow of packets between networks. The
956 first version of the Border Gateway Protocol (BGP-1) [RFC1105] was
957 developed as a replacement for EGP with policy capabilities - a
958 stopgap EGP version 3 had been created as an interim measure while
959 BGP was developed. BGP was designed to work on a hierarchically
960 structured network, such as the original NSFNET, but could also work
961 on networks that were at least partially non-hierarchical where there
962 were links between ASs at the same level in the hierarchy (we would
963 now call these 'peering arrangements') although the protocol made a
964 distinction between different kinds of links (links are classified as
965 upwards, downwards or sideways). ASs themselves were a 'fix' for the
966 complexity that developed in the three tier structure of the NSFnet.
968 Meanwhile the OSI architects, led by Lyman Chapin, were developing a
969 much more general architecture for large scale networks. They had
970 recognized that no single node, especially an end-system (host),
971 could or should attempt to remember routes from "here" to "anywhere"
972 - this sounds obvious today but was not 20 years ago. They were
973 also considering hierarchical networks with independently
974 administered domains - a model already well entrenched in the public
975 switched telephone network. This led to a vision of a network with
976 multiple independent administrative domains with an arbitrary
977 interconnection graph and a hierarchy of routing functionality. This
978 architecture was fairly well established by 1987 [Tsuchiya87]. The
979 architecture initially envisaged a three level routing functionality
980 hierarchy in which each layer had significantly different
981 characteristics:
983 1. *End-system to Intermediate system routing (host to router)*, in
984 which the principal functions are discovery and redirection.
986 2. *Intra-domain intermediate system to intermediate system routing
987 (router to router)*, in which "best" routes between end-systems
988 in a single administrative domain are computed and used. A
989 single algorithm and routing protocol would be used throughout
990 any one domain.
992 3. *Inter-domain intermediate-system to intermediate system routing
993 (router to router)*, in which routes between routing domains
994 within administrative domains are computed (routing is considered
995 separately between administrative domains and routing domains).
997 Level 3 of this hierarchy was still somewhat fuzzy. Tsuchiya says:
999 The last two components, Inter-Domain and Inter-Administration
1000 routing, are less clear-cut. It is not obvious what should be
1001 standardized with respect to these two components of routing. For
1002 example, for Inter-Domain routing, what can be expected from the
1003 Domains? By asking Domains to provide some kind of external
1004 behavior, we limit their autonomy. If we expect nothing of their
1005 external behavior, then routing functionality will be minimal.
1007 Across administrations, it is not known how much trust there will
1008 be. In fact, the definition of trust itself can only be
1009 determined by the two or more administrations involved.
1011 Fundamentally, the problem with Inter-Domain and Inter-
1012 Administration routing is that autonomy and mistrust are both
1013 antithetical to routing. Accomplishing either will involve a
1014 number of tradeoffs which will require more knowledge about the
1015 environments within which they will operate.
1017 Further refinement of the model occurred over the next couple of
1018 years and a more fully formed view is given by Huitema and Dabbous in
1019 1989 [Huitema90]. By this stage work on the original IS-IS link
1020 state protocol, originated by the Digital Equipment Corporation
1021 (DEC), was fairly advanced and was close to becoming a Draft
1022 International Standard. IS-IS is of course a major component of
1023 intra-domain routing today and inspired the development of the Open
1024 Shortest Path First (OSPF) family. However, Huitema and Dabbous were
1025 not able to give any indication of protocol work for Level 3. There
1026 are hints of possible use of centralized route servers.
1028 In the meantime, the NSFnet consortium and the IETF had been
1029 struggling with the rapid growth of the NSFnet. It had been clear
1030 since fairly early on that EGP was not suitable for handling the
1031 expanding network and the race was on to find a replacement. There
1032 had been some intent to include a metric in EGP to facilitate routing
1033 decisions, but no agreement could be reached on how to define the
1034 metric. The lack of trust was seen as one of the main reasons that
1035 EGP could not establish a globally acceptable routing metric:
1036 again, from this distance in time, this seems a clearly futile aim!
1037 Consequently EGP became effectively a rudimentary path-vector
1038 protocol which linked gateways with Autonomous Systems. It was
1039 totally reliant on the tree structured network to avoid routing loops
1040 and the all informed nature of EGP meant that update packets became
1041 very large. BGP version 1 [RFC1105] was standardized in 1989 but had
1042 been in development for some time before this and had already seen
1043 action in production networks prior to standardization. BGP was the
1044 first real path-vector routing protocol and was intended to relieve
1045 some of the scaling problems as well as providing policy-based
1046 routing. Routes were described as paths along a 'vector' of ASs
1047 without any associated cost metric. This way of describing routes
1048 was explicitly intended to allow detection of routing loops. It was
1049 assumed that the intra-domain routing system was loop-free with the
1050 implication that the total routing system would be loop-free if there
1051 were no loops in the AS path. Note that there were no theoretical
1052 underpinnings for this work: it gained freedom from routing loops
1053 at the cost of guaranteed convergence.
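The AS-path loop-avoidance rule described above can be sketched as follows. LOCAL_AS and the paths are invented example values from the private-use range; real BGP carries the path in the AS_PATH attribute.

```python
LOCAL_AS = 64500   # this router's AS number (example value)

def accept_update(as_path):
    """Path-vector loop check: reject any route whose AS path already
    contains the local AS, since such an advertisement has looped
    back to its origin."""
    return LOCAL_AS not in as_path

def export_update(as_path):
    """On re-advertisement, prepend the local AS to the path so that
    any later loop is detectable by the same check."""
    return [LOCAL_AS] + as_path

print(accept_update([64496, 64497, 64498]))   # True: loop-free path
print(accept_update([64496, 64500, 64498]))   # False: our AS in path
print(export_update([64496, 64497]))          # [64500, 64496, 64497]
```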
1055 Also the NSFnet was a government funded research and education
1056 network. Commercial companies which were partners in some of the
1057 projects were using the NSFnet for their research activities but it
1058 was becoming clear that these companies also needed networks for
1059 commercial traffic. NSFnet had put in place "acceptable use"
1060 policies which were intended to limit the use of the network.
1061 However, there was little or no technology to support the legal
1062 framework.
1064 Practical experience, IETF IAB discussion (centred in the Internet
1065 Architecture Task Force) and the OSI theoretical work were by now
1066 coming to the same conclusions:
1067 o Networks were going to be composed out of multiple administrative
1068 domains (the federated network),
1070 o The connections between these domains would be an arbitrary graph
1071 and certainly not a tree,
1072 o The administrative domains would wish to establish distinctive,
1073 independent routing policies through the graph of Autonomous
1074 Systems, and
1075 o Administrative Domains would have a degree of distrust of each
1076 other which would mean that policies would remain opaque.
1078 These views were reflected by Susan Hares' (Merit) contribution to
1079 the Internet Architecture (INARC) workshop in 1989, summarized in the
1080 report of the workshop [INARC89]:
1082 The rich interconnectivity within the Internet causes routing
1083 problems today. However, the presenter believes the problem is
1084 not the high degree of interconnection, but the routing protocols
1085 and models upon which these protocols are based. Rich
1086 interconnectivity can provide redundancy which can help packets
1087 moving even through periods of outages. Our model of interdomain
1088 routing needs to change. The model of autonomous confederations
1089 and autonomous systems [RFC0975] no longer fits the reality of
1090 many regional networks. The ISO models of administrative domain
1091 and routing domains better fit the current Internet's routing
1092 structure.
1094 With the first NSFNET backbone, NSF assumed that the Internet
1095 would be used as a production network for research traffic. We
1096 cannot stop these networks for a month and install all new routing
1097 protocols. The Internet will need to evolve its changes to
1098 networking protocols while still continuing to serve its users.
1099 This reality colors how plans are made to change routing
1100 protocols.
1102 It is also interesting to note that the difficulties of organising a
1103 transition were recognized at this stage and have not been seriously
1104 explored or resolved since.
1106 Policies would primarily be interested in controlling which traffic
1107 should be allowed to transit a domain (to satisfy commercial
1108 constraints or acceptable use policies) thereby controlling which
1109 traffic uses the resources of the domain. The solution adopted by
1110 both the IETF and OSI was a form of distance vector hop-by-hop
1111 routing with explicit policy terms. The reasoning for this choice
1112 can be found in Breslau and Estrin's 1990 paper [Breslau90]
1113 (implicitly, because some other alternatives are given, such as a
1114 link-state-with-policy suggestion which, with hindsight, would have
1115 had even greater problems than BGP on a global-scale network).
1116 Traditional distance vector protocols exchanged routing information
1117 in the form of a destination and a metric. The new protocols
1118 explicitly associated policy expressions with the route by including
1119 a list of the source ASs that are permitted to use the route
1120 described in the routing update, a list of all ASs traversed along
1121 the advertised route, or both.
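A routing update carrying such policy expressions might be modelled as below. All names, prefixes and AS numbers are hypothetical illustrations of the idea, not an encoding used by BGP or IDRP.

```python
from dataclasses import dataclass

@dataclass
class PolicyRoute:
    """Hypothetical policy-annotated route of the kind described in
    the text: a destination, the AS path traversed, and the source
    ASs permitted to use the route."""
    prefix: str
    as_path: tuple
    permitted_sources: frozenset

def usable_by(route, source_as):
    """A source AS may use the route only if the attached policy
    expression lists it."""
    return source_as in route.permitted_sources

route = PolicyRoute("203.0.113.0/24", (64496, 64497),
                    frozenset({64498, 64499}))
print(usable_by(route, 64498))   # True: listed as a permitted source
print(usable_by(route, 64501))   # False: excluded by the policy
```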
1123 Parallel protocol developments were already in progress by the time
1124 this paper was published: BGP version 2 [RFC1163] in the IETF and the
1125 Inter-Domain Routing Protocol (IDRP) [ISO10747] which would be the
1126 Level 3 routing protocol for the OSI architecture. IDRP was
1127 developed under the aegis of the ANSI X3S3.3 working group led by
1128 Lyman Chapin and Charles Kunzinger. The two protocols were very
1129 similar in basic design but IDRP has some extra features, some of
1130 which have been incorporated into later versions of BGP; others may
1131 yet be so and still others may be seen to be inappropriate. Breslau
1132 and Estrin summarize the design of IDRP as follows:
1134 IDRP attempts to solve the looping and convergence problems
1135 inherent in distance vector routing by including full AD
1136 [Administrative Domain - essentially the equivalent of what are
1137 now called ASs] path information in routing updates. Each routing
1138 update includes the set of ADs that must be traversed in order to
1139 reach the specified destination. In this way, routes that contain
1140 AD loops can be avoided.
1142 IDRP updates also contain additional information relevant to
1143 policy constraints. For instance, these updates can specify what
1144 other ADs are allowed to receive the information described in the
1145 update. In this way, IDRP is able to express source specific
1146 policies. The IDRP protocol also provides the structure for the
1147 addition of other types of policy related information in routing
1148 updates. For example, User Class Identifiers (UCI) could also be
1149 included as policy attributes in routing updates.
1151 Using the policy route attributes IDRP provides the framework for
1152 expressing more fine grained policy in routing decisions.
1153 However, because it uses hop-by-hop distance vector routing, it
1154 only allows a single route to each destination per-QOS to be
1155 advertised. As the policy attributes associated with routes
1156 become more fine grained, advertised routes will be applicable to
1157 fewer sources. This implies a need for multiple routes to be
1158 advertised for each destination in order to increase the
1159 probability that sources have acceptable routes available to them.
1160 This effectively replicates the routing table per forwarding
1161 entity for each QoS, UCI, source combination that might appear in
1162 a packet. Consequently, we claim that this approach does not
1163 scale well as policies become more fine grained, i.e., source or
1164 UCI specific policies.
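The combinatorial growth that Breslau and Estrin describe can be illustrated numerically. The figures below are invented for illustration: with hop-by-hop routing, one route per destination must be advertised for each (QoS, UCI, source) combination that might appear in a packet.

```python
# Invented illustrative figures for fine-grained policy routing.
destinations = 100_000
qos_classes = 4      # distinct QoS classes
user_classes = 8     # distinct User Class Identifiers (UCIs)
source_groups = 50   # distinct source-specific policies

routes_plain = destinations
routes_fine_grained = (destinations * qos_classes
                       * user_classes * source_groups)
print(routes_plain)          # 100000 routes with one route per destination
print(routes_fine_grained)   # 160000000 routes with per-combination state
```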
1166 Over the next three or four years successive versions of BGP (BGP-2
1167 [RFC1163], BGP-3 [RFC1267] and BGP-4 [RFC1771]) were deployed to cope
1168 with the growing and by now commercialized Internet. From BGP-2
1169 onwards, BGP made no assumptions about an overall structure of
1170 interconnections, allowing it to cope with today's dense web of
1171 interconnections between ASs. BGP version 4 was developed to handle
1172 the change from classful to classless addressing. For most of this
1173 time IDRP was being developed in parallel, and both protocols were
1174 implemented in the Merit gatedaemon routing protocol suite. During
1175 this time there was a movement within the IETF which saw BGP as a
1176 stopgap measure to be used until the more sophisticated IDRP could be
1177 adapted to run over IP instead of the OSI connectionless protocol
1178 CLNP. However, unlike its intra-domain counterpart IS-IS which has
1179 stood the test of time, and indeed proved to be more flexible than
1180 OSPF, IDRP was ultimately not adopted by the market. By the time the
1181 NSFnet backbone was decommissioned in 1995, BGP-4 was the inter-
1182 domain routing protocol of choice and OSI's star was already
1183 beginning to wane. IDRP is now little remembered.
1185 A more complete account of the capabilities of IDRP can be found in
1186 chapter 14 of David Piscitello and Lyman Chapin's book 'Open Systems
1187 Networking: TCP/IP and OSI' which is now readable on the Internet
1188 [Chapin94].
1190 IDRP also contained quite extensive means for securing routing
1191 exchanges, much of it based on X.509 certificates for each router
1192 and public/private key encryption of routing updates.
1194 Some of the capabilities of IDRP which might yet appear in a future
1195 version of BGP include the ability to manage routes with explicit QoS
1196 classes, and the concept of domain confederations (somewhat different
1197 from the confederation mechanism in today's BGP) as an extra level in
1198 the hierarchy of routing.
1200 3.3. Nimrod Requirements
1202 Nimrod as expressed by Noel Chiappa in his early document, "A New IP
1203 Routing and Addressing Architecture" [Chiappa91] and later in the
1204 NIMROD Working Group documents [RFC1753] and [RFC1992] established a
1205 number of requirements that need to be considered by any new routing
1206 architecture. The Nimrod requirements took RFC1126 as a starting
1207 point and went further.
1209 The goals of Nimrod, quoted from [RFC1992], were as follows:
1210 1. To support a dynamic internetwork of _arbitrary size_ (our
1211 emphasis) by providing mechanisms to control the amount of
1212 routing information that must be known throughout an
1213 internetwork.
1215 2. To provide service-specific routing in the presence of multiple
1216 constraints imposed by service providers and users.
1217 3. To admit incremental deployment throughout an internetwork.
1219 It is certain that these goals should be considered requirements for
1220 any new domain routing architecture.
1221 o As discussed in other sections of this document the amount of
1222 information needed to maintain the routing system is growing at a
1223 rate that does not scale. And yet, as the services and
1224 constraints upon those services grow there is a need for more
1225 information to be maintained by the routing system. One of the
1226 key terms in the first requirement is 'control'. While
1227 increasing amounts of information need to be known and maintained
1228 in the Internet, the amounts and kinds of information that are
1229 distributed can be controlled. This goal should be reflected in
1230 the requirements for the future domain routing architecture.
1231 o If anything, the demand for specific services in the Internet has
1232 grown since 1996 when the Nimrod architecture was published.
1233 Additionally the kinds of constraints that service providers need
1234 to impose upon their networks and that services need to impose
1235 upon the routing have also increased. Any changes made to the
1236 network in the last half-decade have not significantly improved
1237 this situation.
1238 o The ability to incrementally deploy any new routing architecture
1239 within the Internet is still an absolute necessity. It is
1240 impossible to imagine that a new routing architecture could
1241 supplant the current architecture on a flag day.
1243 At one point in time Nimrod, with its addressing and routing
1244 architectures, was seen as a candidate for IPng. History shows that
1245 it was not accepted as the IPng, having been ruled out of the
1246 selection process by the IESG in 1994 on the grounds that it was 'too
1247 much of a research effort' [RFC1752], although input for the
1248 requirements of IPng was explicitly solicited from Chiappa [RFC1753].
1249 Instead IPv6 has been put forth as the IPng. Without entering a
1250 discussion of the relative merits of IPv6 versus Nimrod, it is
1251 apparent that IPv6, while it may solve many problems, does not solve
1252 the critical routing problems in the Internet today. In fact in some
1253 sense it exacerbates them by adding a requirement for support of two
1254 Internet protocols and their respective addressing methods. In many
1255 ways the addition of IPv6 to the mix of methods in today's Internet
1256 only points to the fact that the goals, as set forth by the Nimrod
1257 team, remain as necessary goals.
1259 There is another sense in which study of Nimrod and its architecture
1260 may be important to deriving a future domain routing architecture.
1261 Nimrod can be said to have two derivatives:
1263 o Multi-Protocol Label Switching (MPLS) in that it took the notion
1264 of forwarding along well known paths
1265 o Private Network-Node Interface (PNNI) in that it took the notion
1266 of abstracting topological information and using that information
1267 to create connections for traffic.
1269 It is important to note that, whilst MPLS and PNNI borrowed ideas
1270 from Nimrod, neither of them can be said to be an implementation of
1271 this architecture.
1273 3.4. PNNI
1275 The Private Network-Node Interface (PNNI) routing protocol was
1276 developed under the ATM Forum's auspices as a hierarchical route
1277 determination protocol for ATM, a connection oriented architecture.
1278 It is reputed to have developed several of its methods from a study
1279 of the Nimrod architecture. What can be gained from an analysis of
1280 what did and did not succeed in PNNI?
1282 The PNNI protocol includes the assumption that all peer groups are
1283 willing to cooperate, and that the entire network is under the same
1284 top-level administration. Are there limitations that stem from this 'world
1285 node' presupposition? As discussed in [RFC3221], the Internet is no
1286 longer a clean hierarchy and there is a lot of resistance to having
1287 any sort of 'ultimate authority' controlling or even brokering
1288 communication.
1290 PNNI is the first deployed example of a routing protocol that uses
1291 abstract map exchange (as opposed to distance vector or link state
1292 mechanisms) for inter-domain routing information exchange. One
1293 consequence of this is that domains need not all use the same
1294 mechanism for map creation. What were the results of this
1295 abstraction and source based route calculation mechanism?
1297 Since the authors of this document do not have experience running a
1298 PNNI network, the comments above are from a theoretical perspective.
1299 Further research on these issues based on operational experience is
1300 required.
1302 4. Recent Research Work
1304 4.1. Developments in Internet Connectivity
1306 The work commissioned from Geoff Huston by the Internet Architecture
1307 Board [RFC3221] draws a number of conclusions from analysis of BGP
1308 routing tables and routing registry databases:
1310 o The connectivity between provider ASs is becoming more like a
1311 dense mesh than the tree structure that was commonly assumed
1312 a couple of years ago. This has been driven by the
1313 increasing amounts charged for peering and transit traffic by
1314 global service providers. Local direct peering and Internet
1315 exchanges are becoming steadily more common as the cost of local
1316 fibre connections drops.
1317 o End user sites are increasingly resorting to multi-homing onto two
1318 or more service providers as a way of improving resiliency. This
1319 has a knock-on effect of spectacularly fast depletion of the
1320 available pool of AS numbers as end user sites require public AS
1321 numbers to become multi-homed, and a corresponding increase in
1322 the number of prefixes advertised in BGP.
1323 o Multi-homed sites are using advertisement of longer prefixes in
1324 BGP as a means of traffic engineering to load spread across their
1325 multiple external connections with further impact on the size of
1326 the BGP tables.
1327 o Operational practices are not uniform, and in some cases lack of
1328 knowledge or training is leading to instability and/or excessive
1329 advertisement of routes by incorrectly configured BGP speakers.
1330 o All these factors are quickly negating the advantages in limiting
1331 the expansion of BGP routing tables that were gained by the
1332 introduction of CIDR and consequent prefix aggregation in BGP. It
1333 is also now impossible for IPv6 to realize the world view in which
1334 the default free zone would be limited to perhaps 10,000 prefixes.
1335 o The typical 'width' of the Internet in AS hops is now around five,
1336 and much less in many cases.
1338 These conclusions have a considerable impact on the requirements for
1339 the future domain routing architecture:
1340 o Topological hierarchy (e.g. mandating a tree structured
1341 connectivity) cannot be relied upon to deliver scalability of a
1342 large Internet routing system
1343 o Aggregation cannot be relied upon to constrain the size of routing
1344 tables for an all-informed routing system
1346 4.2. DARPA NewArch Project
1348 DARPA funded a project to think about a new architecture for future
1349 generation Internet, called NewArch.
1350 Work started in the first half of 2000 and the main project finished
1351 in 2003 [NewArch03].
1353 The main conclusion is that, as the Internet becomes
1354 mainstream infrastructure, fewer and fewer of the requirements are
1355 truly global but may apply with different force or not at all in
1356 certain parts of the network. This (it is claimed) makes the
1357 compilation of a single, ordered list of requirements deeply
1358 problematic. Instead we may have to produce multiple requirement
1359 sets with support for differing requirement importance at different
1360 times and in different places. This 'meta-requirement' significantly
1361 impacts architectural design.
1363 Potential new technical requirements identified so far include:
1364 o Commercial environment concerns such as richer inter-provider
1365 policy controls and support for a variety of payment models
1366 o Trustworthiness
1367 o Ubiquitous mobility
1368 o Policy driven self-organisation ('deep auto configuration')
1369 o Extreme short-time-scale resource variability
1370 o Capacity allocation mechanisms
1371 o Speed, propagation delay and Delay/BandWidth Product issues
1373 Non-technical or political 'requirements' include:
1374 o Legal and Policy drivers such as
1375 * Privacy and free/anonymous speech
1376 * Intellectual property concerns
1377 * Encryption export controls
1378 * Law enforcement surveillance regulations
1379 * Charging and taxation issues
1380 o Reconciling national variations and consistent operation in a
1381 world wide infrastructure
1383 The conclusions of the work are now summarized in the final report.
1385 4.2.1. Defending the End-to-End Principle
1387 One of the participants in DARPA NewArch work (Dave Clark) with one
1388 of his associates has also published a very interesting paper
1389 analyzing the impact of some of the new requirements identified in
1390 NewArch (see Section 4.2) on the end-to-end principle that has guided
1391 the development of the Internet to date [Blumenthal01]. Their
1392 primary conclusion is that the loss of trust between the users at the
1393 ends of the end-to-end path has the most fundamental effect on the Internet.
1394 This is clear in the context of the routing system, where operators
1395 are unwilling to reveal the inner workings of their networks for
1396 commercial reasons. Similarly, trusted third parties and their
1397 avatars (mainly mid-boxes of one sort or another) have a major impact
1398 on the end-to-end principles and the routing mechanisms that went
1399 with them. Overall, the end to end principles should be defended so
1400 far as is possible - some changes are already too deeply embedded to
1401 make it possible to go back to full trust and openness - at least
1402 partly as a means of staving off the day when the network will ossify
1403 into an unchangeable form and function (much as the telephone network
1404 has done). The hope is that by that time a new Internet will appear
1405 to offer a context for unfettered innovation.
1407 5. Existing problems of BGP and the current Inter-/Intra-Domain
1408 Architecture
1410 Although most of the people who have to work with BGP today believe
1411 it to be a useful, working protocol, discussions have brought to
1412 light a number of areas where BGP or the relationship between BGP and
1413 the intra-domain routing protocols in use today could be improved.
1414 BGP-4 has been and continues to be extended since it was originally
1415 introduced in [RFC1771] and the protocol as deployed has been
1416 documented in [RFC4271]. This section is, to a large extent, a wish
1417 list for the future domain routing architecture based on those areas
1418 where BGP is seen to be lacking, rather than simply a list of
1419 problems with BGP. The shortcomings of today's inter-domain routing
1420 system have also been extensively surveyed in 'Architectural
1421 Requirements for Inter-Domain Routing in the Internet' [RFC3221],
1422 particularly with respect to its stability and the problems produced
1423 by explosions in the size of the Internet.
1425 5.1. BGP and Auto-aggregation
1427 The stability, and later linear growth rate, of the number of
1428 routing objects (prefixes) that was achieved by the introduction of
1429 CIDR around 1994 has now once again been replaced by near-
1430 exponential growth in the number of routing objects. The granularity of
1431 many of the objects advertised in the default free zone is very small
1432 (prefix length of 22 or longer): This granularity appears to be a by-
1433 product of attempts to perform precision traffic engineering related
1434 to increasing levels of multi-homing. At present there is no
1435 mechanism in BGP that would allow an AS to aggregate such prefixes
1436 without advance knowledge of their existence, even if it was possible
1437 to deduce automatically that they could be aggregated. Achieving
1438 satisfactory auto-aggregation would also significantly reduce the
1439 non-locality problems associated with instability in peripheral ASs.
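The aggregation step itself is mechanical once the candidate prefixes are known, as this sketch shows (the addresses are invented for illustration); the missing piece in BGP is a mechanism for discovering, without advance knowledge, that the more specifics exist and may safely be collapsed:

```python
import ipaddress

# What auto-aggregation would do: collapse contiguous more-specific
# prefixes into a single covering aggregate before re-advertising.
specifics = [
    ipaddress.ip_network("198.51.100.0/23"),
    ipaddress.ip_network("198.51.102.0/23"),
]
aggregated = list(ipaddress.collapse_addresses(specifics))
print(aggregated)   # [IPv4Network('198.51.100.0/22')]
```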
1441 On the other hand, it may be that alterations to the connectivity of
1442 the net as described in [RFC3221] and Section 2.5.1 may limit the
1443 usefulness of auto-aggregation.
1445 5.2. Convergence and Recovery Issues
1447 BGP today is a stable protocol under most circumstances but this has
1448 been achieved at the expense of making the convergence time of the
1449 inter-domain routing system very slow under some conditions. This
1450 has a detrimental effect on the recovery of the network from
1451 failures.
1453 The timers that control the behavior of BGP are typically set to
1454 values in the region of several tens of seconds to a few minutes,
1455 which constrains the responsiveness of BGP to failure conditions.
1457 In the early days of deployment of BGP, poor network stability and
1458 router software problems led to storms of withdrawals closely
1459 followed by re-advertisements of many prefixes. To control the load
1460 on routing software imposed by these 'route flaps', route flap
1461 damping was introduced into BGP. Most operators have now implemented
1462 a degree of route flap damping in their deployments of BGP. This
1463 restricts the number of times that the routing tables will be rebuilt
1464 even if a route is going up and down very frequently. Unfortunately,
1465 the effect of route flap damping is exponential in its behavior, which
1466 can result in some parts of the Internet being inaccessible for hours
1467 at a time.
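The penalty-based damping mechanism behaves roughly as in this sketch (a minimal illustration in the spirit of RFC 2439; the class and the parameter values are typical defaults assumed here, not taken from this document):

```python
import math

PENALTY_PER_FLAP = 1000.0   # added on each withdrawal/re-advertisement
SUPPRESS_LIMIT = 2000.0     # suppress the route above this penalty
REUSE_LIMIT = 750.0         # reuse the route below this penalty
HALF_LIFE = 900.0           # penalty half-life in seconds

def decayed(penalty, elapsed):
    """Penalty decays exponentially with the configured half-life."""
    return penalty * math.exp(-math.log(2) * elapsed / HALF_LIFE)

class DampedRoute:
    def __init__(self):
        self.penalty = 0.0
        self.suppressed = False

    def flap(self):
        self.penalty += PENALTY_PER_FLAP
        if self.penalty >= SUPPRESS_LIMIT:
            self.suppressed = True

    def tick(self, elapsed):
        self.penalty = decayed(self.penalty, elapsed)
        if self.suppressed and self.penalty < REUSE_LIMIT:
            self.suppressed = False

r = DampedRoute()
r.flap(); r.flap()       # two flaps in quick succession
print(r.suppressed)      # True: the route is suppressed
r.tick(3 * HALF_LIFE)    # three half-lives later the penalty is ~250
print(r.suppressed)      # False: the route is usable again
```

The exponential decay is what makes a persistently flapping route unreachable for long periods: each further flap tops the penalty back up, so the suppression interval grows with the flap history.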
1469 There is evidence ([RFC3221] and our own measurements [Jiang02]) that
1470 in today's network route flap is disproportionately associated with
1471 the fine grain prefixes (length 22 or longer) associated with traffic
1472 engineering at the periphery of the network. Auto-aggregation as
1473 previously discussed would tend to mask such instability and prevent
1474 it being propagated across the whole network. Another question that
1475 needs to be studied is the continuing need for an architecture that
1476 requires global convergence. Some of our studies (unpublished) show
1477 that, in some localities at least, the network never actually reaches
1478 stability; i.e., it never really globally converges. Can a network
1479 of global scale, or beyond, be designed around a requirement for
1480 global convergence?
1482 5.3. Non-locality of Effects of Instability and Misconfiguration
1484 There have been a number of instances, some of which are well
1485 documented, of a mistake in the BGP configuration of a single peripheral
1486 AS propagating across the whole Internet and resulting in misrouting
1487 of most of the traffic in the Internet.
1489 Similarly, route flap in a single peripheral AS can require route
1490 table recalculation across the entire Internet.
1492 This non-locality of effects is highly undesirable, and it would be a
1493 considerable improvement if such effects were naturally limited to a
1494 small area of the network around the problem. This is another
1495 argument for an architecture that does not require global
1496 convergence.
1498 5.4. Multihoming Issues
1500 As discussed previously, the increasing use of multi-homing as a
1501 robustness technique by peripheral networks requires that multiple
1502 routes have to be advertised for such domains. These routes must not
1503 be aggregated close in to the multi-homed domain as this would defeat
1504 the traffic engineering implied by multi-homing and currently cannot
1505 be aggregated further away from the multi-homed domain due to the
1506 lack of auto-aggregation capabilities. Consequently, the default
1507 free zone routing table is growing exponentially, as it was before
1508 CIDR.
1510 The longest prefix match routing technique introduced by CIDR, and
1511 implemented in BGP-4, when combined with provider address allocation
1512 is an obstacle to effective multi-homing if load sharing across the
1513 multiple links is required: If an AS has been allocated its addresses
1514 from an upstream provider, the upstream provider can aggregate those
1515 addresses with those of other customers and need only advertise a
1516 single prefix for a range of customers. But, if the customer AS is
1517 also connected to another provider, the second provider is not able
1518 to aggregate the customer addresses because they are not taken from
1519 his allocation, and will therefore have to announce a more specific
1520 route to the customer AS. The longest match rule will then direct
1521 all traffic through the second provider, which is not what is required.
1523 Example:
1525 \ /
1526 AS1 AS2
1527 \ /
1528 AS3
1530 Figure 1: Address Aggregation
1532 AS3 has received its addresses from AS1, which means AS1 can
1533 aggregate. But if AS3 wants its traffic to be seen equally both
1534 ways, AS3 is forced to announce both the aggregate and the more
1535 specific route to AS2.
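The longest-match behaviour in this example can be seen in a few lines of code (the prefixes are invented for illustration, with 192.0.2.0/24 standing in for AS1's aggregate and 192.0.2.128/25 for AS3's more specific route via AS2):

```python
import ipaddress

# AS1 advertises only its aggregate; AS2 must announce the more
# specific route for AS3's addresses, so longest-prefix match sends
# all of AS3's traffic via AS2.
routes = {
    ipaddress.ip_network("192.0.2.0/24"): "via AS1 (aggregate)",
    ipaddress.ip_network("192.0.2.128/25"): "via AS2 (more specific)",
}

def lookup(dest, table):
    """Longest-prefix match: the most specific covering prefix wins."""
    matches = [p for p in table if dest in p]
    return table[max(matches, key=lambda p: p.prefixlen)]

print(lookup(ipaddress.ip_address("192.0.2.200"), routes))
# prints "via AS2 (more specific)": AS1's link carries no AS3 traffic
```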
1537 This problem has induced many ASs to apply for their own address
1538 allocation even though they could have been allocated from an
1539 upstream provider further exacerbating the default free zone route
1540 table size explosion. This problem also interferes with the desire
1541 of many providers in the default free zone to route only prefixes
1542 that are equal to or shorter than 20 or 19 bits.
1544 Note that some problems which are referred to as multihoming issues
1545 are not, and should not be, solvable through the routing system (e.g.,
1546 where a TCP load distributor is needed), and multihoming is not a
1547 panacea for the general problem of robustness in a routing system
1548 [I-D.berkowitz-multirqmt].
1550 Editors' Note: A more recent analysis of multihoming can be found
1551 in [RFC4116].
1553 5.5. AS-number exhaustion
1555 The domain identifier or AS-number is a 16-bit number. When this
1556 paper was originally written in 2001, allocation of AS-numbers was
1557 increasing at 51% a year [RFC3221], and exhaustion by 2005 was predicted.
1558 According to some recent work again by Huston [Huston05], the rate of
1559 increase dropped off after the business downturn but as of July 2005,
1560 well over half the available AS numbers (39000 out of 64510) had been
1561 allocated by IANA and around 20000 were visible in the global BGP
1562 routing tables. A year later these figures had grown to 42000 (April
1563 2006) and 23000 (August 2006) respectively and the rate of allocation
1564 is currently about 3500 per year. Depending on the curve fitting
1565 model used to predict when exhaustion will occur, the pool will run
1566 out somewhere between 2010 and 2013. There appear to be other
1567 factors at work in this rate of increase beyond an increase in the
1568 number of ISPs in business, although there is a fair degree of
1569 correlation between these numbers. AS numbers are now used for a
1570 number of purposes beyond that of identifying large routing domains:
1571 multihomed sites acquire an AS number in order to express routing
1572 preferences to their various providers, and AS numbers are used as part
1573 of the addressing mechanism for MPLS/BGP-based virtual private
1574 networks (VPNs) [RFC2547]. The IETF has had a proposal under
1575 development for over four years to increase the available range of
1576 AS-numbers to 32 bits [I-D.ietf-idr-as4bytes]. Much of the slowness
1577 in development is due to the deployment challenge during transition.
1578 Because of the difficulties of transition, deployment needs to start
1579 well in advance of actual exhaustion so that the network as a whole
1580 is ready for the new capability when it is needed. This implies that
1581 standardisation needs to be complete and implementations available
1582 well in advance of expected exhaustion, so that deployment of
1583 upgrades that can handle the longer AS numbers should be starting
1584 around 2008 to give a reasonable expectation that the change has been
1585 rolled out across a large fraction of the Internet by the time
1586 exhaustion occurs.
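The dates quoted above follow from a simple linear extrapolation of the allocation figures (a back-of-the-envelope sketch; the actual predictions used more sophisticated curve fitting):

```python
# Linear extrapolation of the AS number figures quoted in the text.
POOL = 64510          # usable 16-bit AS numbers
ALLOCATED = 42000     # allocated by IANA as of April 2006
RATE = 3500           # approximate allocations per year

years_left = (POOL - ALLOCATED) / RATE
exhaustion_year = 2006 + years_left
print(round(exhaustion_year, 1))   # ~2012.4, within the 2010-2013 window
```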
1588 5.6. Partitioned AS's
1590 Tricks with discontinuous ASs are used by operators, for example, to
1591 implement anycast. Discontinuous ASs may also come into being by
1592 chance if a multi-homed domain becomes partitioned as a result of a
1593 fault and part of the domain can access the Internet through each
1594 connection. It may be desirable to make support for this kind of
1595 situation more transparent than it is at present.
1597 5.7. Load Sharing
1599 Load splitting or sharing was not a goal of the original designers of
1600 BGP and it is now a problem for today's network designers and
1601 managers. Trying to fool BGP into load sharing between several links
1602 is a constantly recurring exercise for most operators today.
1604 5.8. Hold down issues
1606 As with the interval between 'hello' messages in OSPF, the typical
1607 size and defined granularity (seconds to tens of seconds) of the
1608 'keep-alive' time negotiated at start-up for each BGP connection
1609 constrains the responsiveness of BGP to link failures.
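For reference, the hold time rule from RFC 4271 that bounds this responsiveness can be sketched as follows (the values are illustrative):

```python
# BGP peers negotiate the hold time as the minimum of the two values
# proposed in their OPEN messages; keep-alives are typically sent at
# one third of the negotiated hold time.
def negotiate(local_hold, peer_hold):
    hold = min(local_hold, peer_hold)
    keepalive = hold // 3
    return hold, keepalive

print(negotiate(90, 180))   # (90, 30): failure detection takes ~90s
```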
1611 The recommended values and the available lower limit for this timer
1612 were set to limit the overhead caused by keep-alive messages when
1613 link bandwidths were typically much lower than today. Analysis and
1614 experiment ([I-D.alaettinoglu-isis-convergence], [I-D.sandiick-flip]
1615 & [RFC4204]) indicate that faster links could sustain a much higher
1616 rate of keep-alive messages without significantly impacting normal
1617 data traffic. This would improve responsiveness to link and node
1618 failures but with a corresponding increase in the risk of
1619 instability, if the error characteristics of the link are not taken
1620 properly into account when setting the keep-alive interval.
1622 Editors' Note: A 'fast' liveness protocol has been standardized as
1623 [I-D.ietf-bfd-base].
1625 An additional problem with the hold-down mechanism in BGP is the
1626 amount of information that has to be exchanged to re-establish the
1627 database of route advertisements on each side of the link when it is
1628 re-established after a failure. Currently any failure, however brief,
1629 forces a full exchange, which could perhaps be constrained by
1630 retaining some state across limited time failures and using revision
1631 control, transaction and replication techniques to resynchronise the
1632 databases. Various techniques have been implemented to try to reduce
1633 this problem but they have not yet been standardised.
1635 5.9. Interaction between Inter-domain Routing and Intra-domain Routing
1637 Today, many operators' backbone routers run both I-BGP and an intra-
1638 domain protocol to maintain the routes that reach between the borders
1639 of the domain. Exporting routes from BGP into the intra-domain
1640 protocol in use and bringing them back up to BGP is not recommended
1641 [RFC2791], but it is still necessary for all backbone routers to run
1642 both protocols. BGP is used to find the egress point and intra-
1643 domain protocol to find the path (next hop router) to the egress
1644 point across the domain. This is not only a management problem but
1645 may also create other problems:
1646 o BGP is a distance vector protocol, as compared with most intra-
1647 domain protocols, which are link state protocols, and as such it
1648 is not optimised for convergence speed, although distance vector
1649 protocols generally require less processing power. Incidentally,
1650 more efficient distance vector algorithms are available, such as [Xu97].
1651 o The metrics used in BGP and the intra-domain protocol are rarely
1652 comparable or combinable. Whilst there are arguments that the
1653 optimizations inside a domain may be different from those for end-
1654 to-end paths, there are occasions, such as calculating the
1655 'topologically nearest' server, when comparable or combinable
1656 metrics would be of assistance.
1657 o The policies that can be implemented using BGP are designed for
1658 control of traffic exchange between operators, not for controlling
1659 paths within a domain. Policies for BGP are most conveniently
1660 expressed in Routing Policy Support Language (RPSL) [RFC2622] and
1661 this could be extended if thought desirable to include additional
1662 policy information.
1663 o If the NEXT HOP destination for a set of BGP routes becomes
1664 inaccessible because of intra-domain protocol problems, the routes
1665 using the vanished next hop have to be invalidated at the next
1666 available UPDATE. Subsequently, if the next hop route reappears,
1667 this would normally lead to the BGP speaker requesting a full
1668 table from its neighbour(s). Current implementations may attempt
1669 to circumvent the effects of intra-domain protocol route flap by
1670 caching the invalid routes for a period in case the next hop is
1671 restored through the 'graceful restart' mechanism.
1673 * Editors' Note: This was standardized as [I-D.ietf-idr-restart].
1675 o Synchronization between intra-domain and inter-domain routing
1676 information is a problem as long as we use different protocols for
1677 intra-domain and inter-domain routing, which will most probably be
1678 the case even in the future because of the differing requirements
1679 in the two situations. Some sort of synchronization between those
1680 two protocols would be useful. In the RFC 'IS-IS Transient
1681 Blackhole Avoidance' [RFC3277], the intra-domain protocol side of
1682 the story is covered (there is an equivalent discussion for OSPF).
1683 o Synchronizing in BGP means waiting for the intra-domain protocol
1684 to know about the same networks as the inter-domain protocol,
1685 which can take a significant period of time and slows down the
1686 convergence of BGP by adding the intra-domain protocol convergence
1687 time into each cycle. Operators generally no longer attempt full
1688 synchronization in order to avoid this problem (in general,
1689 redistributing the entire BGP routing feed into the local intra-
1690 domain protocol is unnecessary and undesirable but where a domain
1691 has multiple exits to peers and other non-customer networks,
1692 changes in BGP routing that affect the exit taken by traffic
1693 require corresponding re-routing in the intra-domain routing).
1695 5.10. Policy Issues
1697 There are several classes of issues with current BGP policy:
1698 o Policy is installed in an ad-hoc manner in each autonomous system.
1699 There isn't a method for ensuring that the policy installed in one
1700 router is coherent with policies installed in other routers.
1701 o As described in Griffin [Griffin99] and in McPherson [RFC3345] it
1702 is possible to create policies for ASs, and instantiate them in
1703 routers, that will cause BGP to fail to converge in certain types
1704 of topology.
1705 o There is no available network model for describing policy in a
1706 coherent manner.
1708 Policy management is extremely complex and mostly done without the
1709 aid of any automated procedures. The extreme complexity means that a
1710 highly qualified specialist is required for policy management of
1711 border routers. The training of these specialists is quite lengthy
1712 and needs to involve long periods of hands-on experience. There is,
1713 therefore, a shortage of qualified staff for installing and
1714 maintaining the routing policies. Because of the overall complexity
1715 of BGP, policy management tends to be only a relatively small topic
1716 within a complete BGP training course and specialised policy
1717 management training courses are not generally available.
1719 5.11. Security Issues
1721 While many of the issues with BGP security have been traced either to
1722 implementation issues or to operational issues, BGP is vulnerable to
1723 Distributed Denial of Service (DDoS) attacks. Additionally routers
1724 can be used as unwitting forwarders in DDoS attacks on other systems.
1726 Though DDoS attacks can be fought in a variety of ways, mostly by
1727 filtering methods, it takes constant vigilance. There is nothing
1728 in the current architecture or in the protocols that serves to
1729 protect the forwarders from these attacks.
1731 Editors' Note: Since the original draft was written, the issue of
1732 inter-domain routing security has been studied in much greater
1733 depth. The rpsec working group has gone into the security issues
1734 in great detail [RFC4593] and readers should refer to that work to
1735 understand the security issues.
1737 5.12. Support of MPLS and VPNS
1739 Recently BGP has been modified to function as a signaling protocol
1740 for MPLS and for VPNs [RFC2547]. Some people see this over-loading
1741 of the BGP protocol as a boon whilst others see it as a problem.
1742 While it was certainly convenient as a vehicle for vendors to deliver
1743 extra functionality to their products, it has exacerbated some of
1744 the performance and complexity issues of BGP. Two important problems
1745 are the additional state that must be retained and refreshed to
1746 support VPN (Virtual Private Network) tunnels, and the fact that BGP
1747 does not provide end-to-end notification, making it difficult to
1748 confirm that all necessary state has been installed or updated.
1750 It is an open question whether VPN signaling protocols should remain
1751 separate from the route determination protocols.
1753 5.13. IPv4 / IPv6 Ships in the Night
1755 The fact that service providers need to maintain two completely
1756 separate networks, one for IPv4 and one for IPv6, has been a real
1757 hindrance to the introduction of IPv6. When IPv6 does get widely
1758 deployed it will do so without causing the disappearance of IPv4.
1759 This means that unless something is done, service providers would
1760 need to maintain the two networks in relative perpetuity.
1762 It is possible to use a single set of BGP speakers with multiprotocol
1763 extensions [RFC2858] to exchange information about both IPv4 and IPv6
1764 routes between domains, but the use of TCP as the transport protocol
1765 for the information exchange results in an asymmetry when choosing to
1766 use one of TCP over IPv4 or TCP over IPv6. Successful information
1767 exchange confirms one of IPv4 or IPv6 reachability between the
1768 speakers but not the other, making it possible that reachability is
1769 being advertised for a protocol for which it is not present.
1771 Also, current implementations do not allow a route to be advertised
1772 for both IPv4 and IPv6 in the same UPDATE message, because it is not
1773 possible to explicitly link the reachability information for an
1774 address family to the corresponding next hop information. This could
1775 be improved, but currently results in independent UPDATEs being
1776 exchanged for each address family.
1778 5.14. Existing Tools to Support Effective Deployment of Inter-Domain
1779 Routing
1781 The tools available to network operators to assist in configuring and
1782 maintaining effective inter-domain routing in line with their defined
1783 policies are limited, and almost entirely passive.
1785 o There are no tools to facilitate the planning of the routing of a
1786 domain (either intra- or inter-domain); there are a limited number
1787 of display tools that will visualize the routing once it has been
1788 configured
1790 o There are no tools to assist in converting business policy
1791 specifications into the RPSL language; there are limited tools to
1792 convert the RPSL into BGP commands and to check, post-facto, that
1793 the proposed policies are consistent with the policies in adjacent
1794 domains (always provided that these have been revealed and
1795 accurately documented).
1796 o There are no tools to monitor BGP route changes in real time and
1797 warn the operator about policy inconsistencies and/or
1798 instabilities.
1800 The following section summarises the tools that are available to
1801 assist with the use of RPSL. Note they are all batch mode tools used
1802 off-line from a real network. These tools will provide checks for
1803 skilled inter-domain routing configurers but limited assistance for
1804 the novice.
1806 5.14.1. Routing Policy Specification Language RPSL (RFC 2622, 2650) and
1807 RIPE NCC Database (RIPE 157)
1809 Routing Policy Specification Language (RPSL) [RFC2622] enables a
1810 network operator to describe the routes, routers, and autonomous
1811 systems (ASs) that are connected to the local AS.
1813 Using the RPSL language (see [RFC2650]), a distributed database is
1814 created to describe routing policies in the Internet as described by
1815 each AS independently. The database can be used to check the
1816 consistency of routing policies stored in the database.
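As a flavour of the language, a minimal RPSL aut-num object might look like the following (the AS numbers, contacts, and maintainer names are invented for illustration):

```
aut-num:    AS64500
as-name:    EXAMPLE-AS
descr:      Example network operator
import:     from AS64501 accept ANY
export:     to AS64501 announce AS64500
admin-c:    EX1-EXAMPLE
tech-c:     EX1-EXAMPLE
mnt-by:     MAINT-EXAMPLE
source:     EXAMPLE
```

The import and export attributes are the policy core: each states, per neighbouring AS, which routes are accepted inbound and which are announced outbound.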
1818 Tools exist ([IRRToolSet]) that can be applied on the database to
1819 answer queries such as the following:
1820 o Flag when two neighboring network operators specify conflicting or
1821 inconsistent routing information exchanges with each other and
1822 also detect global inconsistencies where possible;
1823 o Extract all AS-paths between two networks that are allowed by
1824 routing policy from the routing policy database; display the
1825 connectivity a given network has according to current policies.
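The kind of pairwise check described in the first item above can be
sketched in a few lines. This is a simplified model with fictitious
AS numbers and prefixes, not the actual IRRToolSet implementation:
each AS registers, per neighbor, the route sets it exports to and
accepts from that neighbor, and the checker flags routes exported by
one side that the other side does not accept.

```python
# Simplified pairwise policy-consistency check (hypothetical data
# model and fictitious AS numbers/prefixes; not the IRRToolSet code).

def find_inconsistencies(policies):
    """policies maps AS number -> {neighbor AS -> {"export": prefixes
    announced to that neighbor, "import": prefixes accepted from it}}.
    Returns (asn, peer, routes) triples where 'asn' exports routes
    that 'peer' is not configured to accept."""
    problems = []
    for asn, neighbors in policies.items():
        for peer, pol in neighbors.items():
            peer_pol = policies.get(peer, {}).get(asn)
            if peer_pol is None:
                continue  # peer's policy not registered; cannot check
            # Routes this AS exports that the peer does not accept
            dropped = pol["export"] - peer_pol["import"]
            if dropped:
                problems.append((asn, peer, dropped))
    return problems

policies = {
    64500: {64501: {"export": {"192.0.2.0/24", "198.51.100.0/24"},
                    "import": {"203.0.113.0/24"}}},
    64501: {64500: {"export": {"203.0.113.0/24"},
                    # AS64501 does not accept 198.51.100.0/24
                    "import": {"192.0.2.0/24"}}},
}

print(find_inconsistencies(policies))
# -> [(64500, 64501, {'198.51.100.0/24'})]
```

As the text notes, such checks are inherently partial: they can only
examine policies that have been registered, and only pairwise or over
a small part of the topology.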
The database queries enable a partial, static solution to the
convergence problem. They analyze the routing policies of a very
limited part of the Internet and verify that they do not contain
conflicts that could lead to protocol divergence. Static analysis of
the convergence of the entire system has exponential time complexity,
so approximation algorithms would have to be used.
1834 The toolset also allows router configurations to be generated from
1835 RPSL specifications.
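As a rough sketch of this generation step (illustrative output only;
real generators such as the toolset's RtConfig produce vendor-specific
configuration from the registered objects, and all names and addresses
here are fictitious), an import/export pair might be rendered into BGP
commands along these lines:

```
router bgp 64500
 neighbor 192.0.2.1 remote-as 64501
 neighbor 192.0.2.1 prefix-list AS64501-IN in
 neighbor 192.0.2.1 prefix-list AS64500-OUT out
!
ip prefix-list AS64501-IN permit 203.0.113.0/24
ip prefix-list AS64500-OUT permit 198.51.100.0/24
```

Generating configuration from the registry in this way keeps the
router state consistent with the published policy, but only for
operators who keep their registry objects up to date.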
1837 Editors' Note: The "Internet Routing Registry Toolset" was
1838 originally developed by the University of Southern California's
1839 Information Sciences Institute (ISI) between 1997 and 2001 as the
1840 "Routing Arbiter ToolSet" (RAToolSet) project. The toolset is no
1841 longer developed by ISI but is used worldwide, so after a period
1842 of improvement by RIPE NCC it has now been transferred to the
1843 Internet Systems Consortium (ISC) for ongoing maintenance as a
1844 public resource.
1846 6. Security Considerations
As this is an informational document on the history of requirements
for inter-domain routing and on the problems facing the current
Internet IDR architecture, it does not in itself create any security
problems. On the other hand, some of the problems with today's
Internet routing architecture do create security problems, and these
have been discussed in the text above.
1855 7. IANA Considerations
1857 This document does not request any actions by IANA.
1859 RFC Editor: Please remove this section before publication.
1861 8. Acknowledgments
1863 The draft is derived from work originally produced by Babylon.
1864 Babylon was a loose association of individuals from academia, service
1865 providers and vendors whose goal was to discuss issues in Internet
1866 routing with the intention of finding solutions for those problems.
1868 The individual members who contributed materially to this draft are:
1869 Anders Bergsten, Howard Berkowitz, Malin Carlzon, Lenka Carr
1870 Motyckova, Elwyn Davies, Avri Doria, Pierre Fransson, Yong Jiang,
1871 Dmitri Krioukov, Tove Madsen, Olle Pers, and Olov Schelen.
1873 Thanks also go to the members of Babylon and others who did
1874 substantial reviews of this material. Specifically we would like to
1875 acknowledge the helpful comments and suggestions of the following
1876 individuals: Loa Andersson, Tomas Ahlstrom, Erik Aman, Thomas
1877 Eriksson, Niklas Borg, Nigel Bragg, Thomas Chmara, Krister Edlund,
1878 Owe Grafford, Torbjorn Lundberg, Jasminko Mulahusic, Florian-Daniel
1879 Otel, Bernhard Stockman, Tom Worster, Roberto Zamparo.
1881 In addition, the authors are indebted to the folks who wrote all the
1882 references we have consulted in putting this paper together. This
1883 includes not only the references explicitly listed below, but also
1884 those who contributed to the mailing lists we have been participating
1885 in for years.
1887 Finally, it is the editors who are responsible for any lack of
1888 clarity, any errors, glaring omissions or misunderstandings.
1890 9. References
[Blumenthal01]
Blumenthal, M. and D. Clark, "Rethinking the design of the
Internet: The end-to-end arguments vs. the brave new
world", May 2001.
1898 [Breslau90]
1899 Breslau, L. and D. Estrin, "An Architecture for Network-
1900 Layer Routing in OSI", Proceedings of the ACM symposium on
1901 Communications architectures & protocols , 1990.
[Chapin94]
Piscitello, D. and A. Chapin, "Open Systems Networking:
TCP/IP & OSI", Addison-Wesley (copyright assigned to the
authors), 1994.
[Chiappa91]
Chiappa, N., "A New IP Routing and Addressing
Architecture", Internet
Draft draft-chiappa-routing-01.txt, 1991.
1914 [Griffin99]
1915 Griffin, T. and G. Wilfong, "An Analysis of BGP
1916 Convergence Properties", Association for Computing
1917 Machinery Proceedings of SIGCOMM '99, 1999.
1919 [Huitema90]
1920 Huitema, C. and W. Dabbous, "Routeing protocols
1921 development in the OSI architecture", Proceedings of
1922 ISCIS V Turkey, 1990.
[Huston05]
Huston, G., "Exploring Autonomous System Numbers", The ISP
Column, August 2005.
1929 [I-D.alaettinoglu-isis-convergence]
1930 Alaettinoglu, C., Jacobson, V., and H. Yu, "Towards Milli-
1931 Second IGP Convergence",
1932 draft-alaettinoglu-isis-convergence-00 (work in progress),
1933 Nov 2000.
1935 [I-D.berkowitz-multirqmt]
1936 Berkowitz, H. and D. Krioukov, "To Be Multihomed:
1937 Requirements and Definitions",
1938 draft-berkowitz-multirqmt-02 (work in progress), 2002.
1940 [I-D.ietf-bfd-base]
1941 Katz, D. and D. Ward, "Bidirectional Forwarding
1942 Detection", draft-ietf-bfd-base-05 (work in progress),
1943 June 2006.
1945 [I-D.ietf-idr-as4bytes]
1946 Vohra, Q. and E. Chen, "BGP Support for Four-octet AS
1947 Number Space", draft-ietf-idr-as4bytes-13 (work in
1948 progress), February 2007.
1950 [I-D.ietf-idr-restart]
1951 Sangli, S., "Graceful Restart Mechanism for BGP",
1952 draft-ietf-idr-restart-13 (work in progress), July 2006.
1954 [I-D.irtf-routing-reqs]
1955 Doria, A., "Requirements for Inter-Domain Routing",
1956 draft-irtf-routing-reqs-07 (work in progress),
1957 January 2007.
1959 [I-D.sandiick-flip]
1960 Sandick, H., Squire, M., Cain, B., Duncan, I., and B.
1961 Haberman, "Fast LIveness Protocol (FLIP)",
1962 draft-sandiick-flip-00 (work in progress), Feb 2000.
[INARC89] Mills, D., Ed. and M. Davis, Ed., "Internet Architecture
Workshop: Future of the Internet System Architecture and
TCP/IP Protocols - Report", Internet Architecture Task
Force (INARC), 1990.
[IRRToolSet]
Internet Systems Consortium, "Internet Routing Registry
Toolset Project", IRR Tool Set website, 2006.
[ISO10747]
ISO/IEC, "Protocol for Exchange of Inter-Domain Routeing
Information among Intermediate Systems to support
Forwarding of ISO 8473 PDUs", International Standard
10747, 1993.
[Jiang02] Jiang, Y., Doria, A., Olsson, D., and F. Pettersson,
"Inter-domain Routing Stability Measurement", 2002.
[Labovitz02]
Labovitz, C., Ahuja, A., Jahanian, F., and A. Bose,
"Experimental Measurement of Delayed Convergence", NANOG,
2002.
[NewArch03]
Clark, D., Sollins, K., Wroclawski, J., Katabi, D., Kulik,
J., Yang, X., Braden, R., Faber, T., Falk, A., Pingali,
V., Handley, M., and N. Chiappa, "New Arch: Future
Generation Internet Architecture", December 2003.
1997 [RFC0904] Mills, D., "Exterior Gateway Protocol formal
1998 specification", RFC 904, April 1984.
2000 [RFC0975] Mills, D., "Autonomous confederations", RFC 975,
2001 February 1986.
[RFC1105] Lougheed, K. and Y. Rekhter, "Border Gateway Protocol
(BGP)", RFC 1105, June 1989.
2006 [RFC1126] Little, M., "Goals and functional requirements for inter-
2007 autonomous system routing", RFC 1126, October 1989.
2009 [RFC1163] Lougheed, K. and Y. Rekhter, "Border Gateway Protocol
2010 (BGP)", RFC 1163, June 1990.
2012 [RFC1267] Lougheed, K. and Y. Rekhter, "Border Gateway Protocol 3
2013 (BGP-3)", RFC 1267, October 1991.
2015 [RFC1752] Bradner, S. and A. Mankin, "The Recommendation for the IP
2016 Next Generation Protocol", RFC 1752, January 1995.
2018 [RFC1753] Chiappa, J., "IPng Technical Requirements Of the Nimrod
2019 Routing and Addressing Architecture", RFC 1753,
2020 December 1994.
2022 [RFC1771] Rekhter, Y. and T. Li, "A Border Gateway Protocol 4
2023 (BGP-4)", RFC 1771, March 1995.
2025 [RFC1992] Castineyra, I., Chiappa, N., and M. Steenstrup, "The
2026 Nimrod Routing Architecture", RFC 1992, August 1996.
2028 [RFC2362] Estrin, D., Farinacci, D., Helmy, A., Thaler, D., Deering,
2029 S., Handley, M., and V. Jacobson, "Protocol Independent
2030 Multicast-Sparse Mode (PIM-SM): Protocol Specification",
2031 RFC 2362, June 1998.
2033 [RFC2547] Rosen, E. and Y. Rekhter, "BGP/MPLS VPNs", RFC 2547,
2034 March 1999.
2036 [RFC2622] Alaettinoglu, C., Villamizar, C., Gerich, E., Kessens, D.,
2037 Meyer, D., Bates, T., Karrenberg, D., and M. Terpstra,
2038 "Routing Policy Specification Language (RPSL)", RFC 2622,
2039 June 1999.
2041 [RFC2650] Meyer, D., Schmitz, J., Orange, C., Prior, M., and C.
2042 Alaettinoglu, "Using RPSL in Practice", RFC 2650,
2043 August 1999.
2045 [RFC2791] Yu, J., "Scalable Routing Design Principles", RFC 2791,
2046 July 2000.
2048 [RFC2858] Bates, T., Rekhter, Y., Chandra, R., and D. Katz,
2049 "Multiprotocol Extensions for BGP-4", RFC 2858, June 2000.
2051 [RFC3221] Huston, G., "Commentary on Inter-Domain Routing in the
2052 Internet", RFC 3221, December 2001.
2054 [RFC3277] McPherson, D., "Intermediate System to Intermediate System
2055 (IS-IS) Transient Blackhole Avoidance", RFC 3277,
2056 April 2002.
2058 [RFC3345] McPherson, D., Gill, V., Walton, D., and A. Retana,
2059 "Border Gateway Protocol (BGP) Persistent Route
2060 Oscillation Condition", RFC 3345, August 2002.
2062 [RFC3618] Fenner, B. and D. Meyer, "Multicast Source Discovery
2063 Protocol (MSDP)", RFC 3618, October 2003.
2065 [RFC3765] Huston, G., "NOPEER Community for Border Gateway Protocol
2066 (BGP) Route Scope Control", RFC 3765, April 2004.
2068 [RFC3913] Thaler, D., "Border Gateway Multicast Protocol (BGMP):
2069 Protocol Specification", RFC 3913, September 2004.
2071 [RFC4116] Abley, J., Lindqvist, K., Davies, E., Black, B., and V.
2072 Gill, "IPv4 Multihoming Practices and Limitations",
2073 RFC 4116, July 2005.
2075 [RFC4204] Lang, J., "Link Management Protocol (LMP)", RFC 4204,
2076 October 2005.
2078 [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
2079 Protocol 4 (BGP-4)", RFC 4271, January 2006.
2081 [RFC4593] Barbir, A., Murphy, S., and Y. Yang, "Generic Threats to
2082 Routing Protocols", RFC 4593, October 2006.
2084 [RFC4601] Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
2085 "Protocol Independent Multicast - Sparse Mode (PIM-SM):
2086 Protocol Specification (Revised)", RFC 4601, August 2006.
2088 [Tsuchiya87]
2089 Tsuchiya, P., "An Architecture for Network-Layer Routing
2090 in OSI", Proceedings of the ACM workshop on Frontiers in
2091 computer communications technology , 1987.
[Xu97] Xu, Z., Dai, S., and J. Garcia-Luna-Aceves, "A More
Efficient Distance Vector Routing Algorithm", Proc. IEEE
MILCOM 97, Monterey, California, Nov 1997.
2099 Authors' Addresses
2101 Elwyn B. Davies
2102 Consultant
2103 Soham, Cambs
2104 UK
2106 Phone: +44 7889 488 335
2107 Email: elwynd@dial.pipex.com
2109 Avri Doria
2110 LTU
2111 Lulea, 971 87
2112 Sweden
2114 Phone: +1 401 663 5024
2115 Email: avri@acm.org
2117 Full Copyright Statement
2119 Copyright (C) The IETF Trust (2007).
2121 This document is subject to the rights, licenses and restrictions
2122 contained in BCP 78, and except as set forth therein, the authors
2123 retain all their rights.
2125 This document and the information contained herein are provided on an
2126 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2127 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
2128 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
2129 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
2130 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
2131 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
2133 Intellectual Property
2135 The IETF takes no position regarding the validity or scope of any
2136 Intellectual Property Rights or other rights that might be claimed to
2137 pertain to the implementation or use of the technology described in
2138 this document or the extent to which any license under such rights
2139 might or might not be available; nor does it represent that it has
2140 made any independent effort to identify any such rights. Information
2141 on the procedures with respect to rights in RFC documents can be
2142 found in BCP 78 and BCP 79.
2144 Copies of IPR disclosures made to the IETF Secretariat and any
2145 assurances of licenses to be made available, or the result of an
2146 attempt made to obtain a general license or permission for the use of
2147 such proprietary rights by implementers or users of this
2148 specification can be obtained from the IETF on-line IPR repository at
2149 http://www.ietf.org/ipr.
2151 The IETF invites any interested party to bring to its attention any
2152 copyrights, patents or patent applications, or other proprietary
2153 rights that may cover technology that may be required to implement
2154 this standard. Please address the information to the IETF at
2155 ietf-ipr@ietf.org.
2157 Acknowledgment
2159 Funding for the RFC Editor function is provided by the IETF
2160 Administrative Support Activity (IASA).