idnits 2.17.1

draft-narten-radir-problem-statement-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 744.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 755.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 762.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 768.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (July 26, 2007) is 6119 days in the past. Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'RAWS-REPORT' is mentioned on line 80, but not
     defined

  == Missing Reference: 'RFC4116' is mentioned on line 371, but not defined

  == Missing Reference: 'CIDR4' is mentioned on line 504, but not defined

  == Unused Reference: 'RFC2002' is defined on line 712, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC3963' is defined on line 715, but no explicit
     reference was found in the text

  -- Obsolete informational reference (is this intentional?): RFC 2002
     (Obsoleted by RFC 3220)


     Summary: 2 errors (**), 0 flaws (~~), 6 warnings (==), 8 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                         T. Narten
3    Internet-Draft                                                      IBM
4    Intended status: Informational                            July 26, 2007
5    Expires: January 27, 2008

7                  Routing and Addressing Problem Statement
8                 draft-narten-radir-problem-statement-00.txt

10   Status of this Memo

12      By submitting this Internet-Draft, each author represents that any
13      applicable patent or other IPR claims of which he or she is aware
14      have been or will be disclosed, and any of which he or she becomes
15      aware will be disclosed, in accordance with Section 6 of BCP 79.

17      Internet-Drafts are working documents of the Internet Engineering
18      Task Force (IETF), its areas, and its working groups. Note that
19      other groups may also distribute working documents as Internet-
20      Drafts.

22      Internet-Drafts are draft documents valid for a maximum of six months
23      and may be updated, replaced, or obsoleted by other documents at any
24      time. It is inappropriate to use Internet-Drafts as reference
25      material or to cite them other than as "work in progress."

27      The list of current Internet-Drafts can be accessed at
28      http://www.ietf.org/ietf/1id-abstracts.txt.

30      The list of Internet-Draft Shadow Directories can be accessed at
31      http://www.ietf.org/shadow.html.

33      This Internet-Draft will expire on January 27, 2008.

35   Copyright Notice

37      Copyright (C) The IETF Trust (2007).

39   Abstract

41      Problem statement for the route scaling problem.

43   Table of Contents

45      1. Introduction
46      2. Terms and Definitions
47      3. Background
48        3.1. Technical Aspects
49        3.2. Business Aspects
50        3.3. Alignment of Incentives
51        3.4. Table Growth Targets
52      4. Pressures on Routing Table Size
53        4.1. Traffic Engineering
54        4.2. Multihoming
55        4.3. End Site Renumbering
56        4.4. Acquisitions and Mergers
57        4.5. RIR Address Allocation Policies
58        4.6. Dual Stack Pressure on the Routing Table
59        4.7. Internal Customer Routes
60        4.8. IPv4 Address Exhaustion
61      5. Pressures on Path Computation Load
62        5.1. Interconnection Richness
63        5.2. Multihoming
64        5.3. Traffic Engineering
65        5.4. Questionable Operational Practices?
66          5.4.1. Rapid shuffling of prefixes
67          5.4.2. Anti-Route Hijacking
68          5.4.3. Operational Ignorance
69        5.5. RIR Policy
70      6. Problem Statement
71      7. Security Considerations
72      8. Acknowledgments
73      9. Informative References
74      Author's Address
75      Intellectual Property and Copyright Statements

77   1. Introduction

79      Prompted in part by the recent IAB workshop on Routing & Addressing
80      [RAWS-REPORT], there has been a renewed focus on the problem of
81      routing scalability within the Internet. The issue itself is not
82      new, with discussions dating back at least 10-15 years [GSE,
83      ROAD,...].

85      This document attempts to define the "problem", with the aim of
86      describing the essential aspects so that the community has a way of
87      evaluating whether proposed solutions actually address or impact the
88      underlying problem or "pain points" in a significant manner.

90   2. Terms and Definitions

92      Default Free Zone (DFZ): That part of the Internet where routers
93         maintain full routing tables. Many routers maintain only partial
94         routes, having explicit routes for "local" destinations plus a
95         "default" for everything else. For such routers, building and
96         maintaining routing tables is relatively simple because the amount
97         of information learned and maintained can be small. In contrast,
98         routers in the DFZ maintain complete information about all
99         reachable destinations, which currently number in the hundreds of
100        thousands of entries.

102     Routing Information Base (RIB): The data structures a router
103        maintains that hold the information about destinations (i.e.,
104        prefixes) and paths to those destinations.
           The amount of state
105        information maintained is dependent on a number of factors,
106        including the number of individual prefixes, the number of BGP
107        peers, the number of distinct paths, etc. The RIB may also
108        include information about unused ("backup") paths for a given
109        prefix as well as the active path(s) used for forwarding. The RIB
110        is typically held in general-purpose memory, which has
111        different (and lower) cost properties than the specialized
112        hardware typically used to maintain the FIB.

114     Forwarding Information Base (FIB): The actual table used while
115        making forwarding decisions for individual packets. The FIB is a
116        compact, optimized subset of the RIB, containing only the
117        information needed to actually forward individual packets, i.e.,
118        mapping a packet's destination address to an outgoing interface
119        and next-hop. The FIB only stores information about paths
120        actually used for forwarding; it typically does not store
121        information about backup paths. The FIB is typically constructed
122        from specialized hardware components, which have different cost
123        properties compared to the hardware typically used to maintain the
124        RIB.

126     Traffic Engineering (TE): In this document, "traffic engineering"
127        refers to the current practice of inbound, inter-AS traffic
128        engineering. This is accomplished by placing more specific routes
129        in the routing table and/or increasing the frequency of routing
130        updates in order to control inbound traffic at the boundary of an
131        Autonomous System (AS).

133     Provider Aggregatable (PA) address space: Address space that an end
134        site obtains from an ISP's address block. The main benefit of PA
135        address space is that reachability to all of a provider's
136        customers can be achieved by advertising a single "provider
137        aggregate" address prefix into the DFZ, rather than needing to
138        announce individual prefixes for each customer. An important
139        disadvantage is a requirement that the customer return those
140        addresses (and renumber) when changing providers.

142     Provider Independent (PI) address space: Address space that an end
143        site obtains directly from a Regional Internet Registry (RIR) for
144        numbering its devices. The main advantage (for the end site) is
145        that it does not have to return those addresses (and renumber its
146        site) upon changing providers. However, PI address blocks are not
147        aggregatable and thus each individual PI assignment results in an
148        individual prefix being injected into the DFZ.

150  3. Background

152     Within the DFZ, both the size of the RIB and FIB and the overall
153     update rate have historically increased at a greater than linear
154     rate. Specifically:

156     o  The number of individual prefixes that are propagated into the DFZ
157        is increasing at a super-linear rate. The reasons behind this
158        increase are varied and discussed below. Because each individual
159        prefix requires resources to process, any increase in the number
160        of prefixes requires a corresponding increase in resources. Each
161        individual prefix that appears in routing updates requires state
162        in the RIB (and possibly the FIB) and consumes processing
163        resources when updates related to the prefix are received.

165     o  The overall rate of routing updates is increasing, requiring
166        routers to process updates at an increased rate or converge more
167        slowly if they cannot. The rate increase is driven by a number of
168        factors (discussed below). It should be noted that the overall
169        routing update rate is dependent on two factors: the number of
170        individual prefixes and the mean per-prefix update rate. While it
171        is clear that the overall number of prefixes is increasing super-
172        linearly, further study is needed to determine whether the mean
173        per-prefix update rate is increasing as well [need reference].
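
        The dependency noted in the second bullet, where the overall
        update rate is the product of the number of prefixes and the mean
        per-prefix update rate, can be sketched as below. This is only an
        illustration: the function name and every input value are
        hypothetical assumptions, not measurements taken from this
        document.

```python
def overall_update_rate(prefix_count: int, mean_updates_per_prefix: float) -> float:
    """Overall update rate = number of prefixes x mean per-prefix update rate."""
    return prefix_count * mean_updates_per_prefix


# Hypothetical inputs (assumptions, not data from this document).
baseline = overall_update_rate(200_000, 2.0)

# Twice as many prefixes with unchanged per-prefix behavior: load doubles.
more_prefixes = overall_update_rate(400_000, 2.0)

# If the per-prefix rate also grows, the two factors compound.
both_grow = overall_update_rate(400_000, 3.0)

print(baseline, more_prefixes, both_grow)
```

        The sketch shows why the open question about the mean per-prefix
        rate matters: prefix growth alone raises the load, and any growth
        in the per-prefix rate multiplies it further.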

175     This super-linear growth presents a scalability challenge for current
176     and/or future routers. There are two aspects to the challenge. The
177     first one is purely technical: can we build routers (i.e., hardware &
178     software) actually capable of handling the load, both today and going
179     forward? The second challenge is one of economics: is the cost of
180     developing, building and deploying such routers economically
181     sustainable, given current and realistic business models that govern
182     how ISPs operate as businesses?

184     Finally, the scalability challenge is aggravated by the lack of any
185     limiting architectural upper-bound on the growth rate and a weakening
186     of traditional social constraints on the growth rate that have helped
187     restrain growth so far. Going forward, there is considerable
188     uncertainty whether future growth rates will continue to be
189     sufficiently constrained so that the routing load can be handled by
190     routers available at that time.

192  3.1. Technical Aspects

194     The technical challenge of building routers relates to the resources
195     needed to process a larger and increasingly dynamic amount of routing
196     information. More specifically, routers must maintain an increasing
197     amount of associated state information in the RIB, they must be
198     capable of populating a growing FIB, they must perform forwarding
199     lookups at line rates (while accessing the FIB) and they must be able
200     to initialize the RIB and FIB at boot time. Moreover, this activity
201     must take place within acceptable time frames (i.e., the overall
202     system must converge and stabilize within an acceptable time period).
203     Finally, the hardware needed to achieve this cannot have unreasonable
204     power consumption or cooling demands.

206  3.2. Business Aspects

208     Even if it is technically possible to build routers capable of
209     meeting the technical and operational requirements, it is also
210     necessary that the overall cost to build, maintain and deploy such
211     equipment meet reasonable business expectations. ISPs, after all,
212     are run as businesses. As such, they must be able to plan, develop
213     and construct viable business plans that provide an acceptable return
214     on investment (i.e., one acceptable to investors).

216     While the IETF does not (and cannot) concern itself with business
217     models or the profitability of the ISP community, the cost of running
218     the routing subsystem as a whole is directly influenced by the
219     routing architecture of the Internet, which clearly is the IETF's
220     business. Further, because cost implications are part of each and
221     every engineering decision, controlling or limiting the overall cost
222     of running the routing subsystem (through architectural decisions) is
223     part of the IETF's fundamental charter. Consequently, having the
224     IETF continue with an architectural model that places unbounded cost
225     requirements on critical infrastructure represents an undue risk to
226     the future of the Internet as a whole.

228     One aspect of planning concerns the assumptions made about the
229     expected usable lifetime of purchased equipment. Businesses
230     typically expect that once deployed, equipment can remain in use for
231     some projected amount of time (e.g., 3-5 years). Upgrading equipment
232     earlier than planned is more easily justified (as an unplanned
233     expense) when a new business opportunity is enabled as a result of an
234     upgrade. For example, an upgrade might be justified by an ability to
235     support increased traffic or an increase in the number of customer
236     connections, etc., where the upgrade can translate into increased
237     revenue.
        In contrast, it is more difficult to justify unplanned
238     upgrades in the absence of corresponding customer benefit (and
239     revenue) to cover the upgrade cost. It is generally desired that
240     deployed equipment remain usable over its planned lifetime. An
241     increase in the resources required to support larger or more dynamic
242     routing tables is viewed as a sort of "unfunded mandate", in that
243     customers do not expect to have to pay more just to continue
244     retaining the same level of service as before, i.e., having all
245     destinations be reachable as was the case in the past. This
246     undermining of planning is particularly problematic when the increase
247     in routing demand originates external to the ISP, and the ISP has no
248     way to control or limit it (e.g., the increased demand comes from
249     being part of the DFZ).

251     From a business perspective, it is desirable to maintain or increase
252     the useful lifespan of routing equipment, by improving the scaling
253     properties of the routing and addressing system.

255  3.3. Alignment of Incentives

257     Today's growth pattern is influenced by the scaling properties of the
258     current system. If the system had better scaling properties, we
259     would be able to support and enable more widespread usage of certain
260     applications such as multihoming and traffic engineering. Currently
261     the system does not allow everyone to multihome, as there are some
262     barriers to multihoming due to operational practices that try to
263     strike a balance between the amount of multihoming and preservation
264     of routing slots. It is desirable that the routing and addressing
265     system exert the least possible back pressure on end user
266     applications and deployment scenarios, to enable the broadest
267     possible use of the Internet.

269     One aspect of the current architecture is a misalignment of cost and
270     benefit. Injecting individual prefixes into the DFZ creates a small
271     amount of "pain" for those routers that are part of the DFZ. Each
272     individual prefix has a small cost, but the aggregate sum of all
273     prefixes is significant, and leads to the core problem at hand.
274     Those that inject prefixes into the DFZ do not generally pay the cost
275     associated with the individual prefix -- it is carried by the routers
276     in the DFZ. But the originator of the prefix receives the benefit.
277     Hence, there is misalignment of incentives between those receiving
278     the benefit and those bearing the cost of providing the benefit.
279     Consequently, incentives are not aligned properly to produce a
280     natural balance between the cost and benefit of maintaining routing
281     tables.

283  3.4. Table Growth Targets

285     A precise target for the rate of table size or routing update
286     increase that should reasonably be supported going forward is
287     difficult to state in quantitative terms. One target might simply be
288     to keep growth at a stable but manageable rate, so that
289     the increased router functionality can roughly be covered by
290     improvements in technology (e.g., increased processor speeds,
291     reductions in component costs, etc.).

293     However, it is highly desirable to significantly bring down (or even
294     reverse) the growth rate in order to meet user expectations for
295     specific services. As discussed below, there are numerous pressures
296     to deaggregate routes. These pressures come from users seeking
297     specific, tangible service improvements that provide "business-
298     critical" value. Today, some of those services simply cannot be
299     supported to the degree that future demand can reasonably be
300     expected because of the negative implications on DFZ table growth.
301     Hence, valuable services are available to some, but not all potential
302     customers. As the need for such services becomes increasingly
303     important, it will be difficult to deny such services to large
304     numbers of users, especially when some "lucky" sites are able to use
305     the service and others are not.

307  4. Pressures on Routing Table Size

309     There are a number of factors behind the increase in the quantity of
310     prefixes appearing in the DFZ. From a theoretical perspective, the
311     number of prefixes in the DFZ can be minimized through aggressive
312     aggregation [RFC4632]. In practice, strict adherence to the CIDR
313     principles is difficult.

315  4.1. Traffic Engineering

317     Traffic engineering (TE) is the act of arranging for certain Internet
318     traffic to use or avoid certain network paths (that is, TE attempts
319     to place traffic where capacity exists, or where some set of
320     parameters of the path is more favorable to the traffic being placed
321     there).

323     Outbound TE is typically accomplished by using internal IGP metrics
324     to choose the shortest exit for two equally good BGP paths.
325     Adjustment of IGP metrics controls how much traffic flows over
326     different internal paths to specific exit points for two equally good
327     BGP paths. Additional traffic can be moved by applying some policy
328     to depreference or filter certain routes from specific BGP peers.
329     Because outbound TE is achieved via a site's own IGP, outbound TE
330     does not impact routing outside of a site.

332     Inbound TE is performed by announcing a more specific route along the
333     preferred path that "catches" the desired traffic and channels it
334     away from the path it would take otherwise (i.e., via a larger
335     aggregate). At the BGP level, if the address range requiring TE is a
336     portion of a larger address aggregate, network operators implementing
337     TE are forced to de-aggregate otherwise aggregatable prefixes in
338     order to steer the traffic of the particular address range to
339     specific paths.
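
        The way a more specific route "catches" traffic that would
        otherwise follow the larger aggregate comes from longest-prefix
        matching. The sketch below is a minimal illustration, assuming a
        toy table held in a Python dictionary; the prefixes come from
        documentation address ranges, and the path labels and the
        function name are hypothetical.

```python
import ipaddress


def longest_match(table, destination):
    """Return the next hop of the most specific prefix covering destination."""
    addr = ipaddress.ip_address(destination)
    candidates = [(net, hop) for net, hop in table.items() if addr in net]
    if not candidates:
        return None  # no covering route in this toy table
    _, hop = max(candidates, key=lambda item: item[0].prefixlen)
    return hop


# Hypothetical table: one aggregate plus one de-aggregated more specific
# announced along a different (preferred) path for inbound TE.
table = {
    ipaddress.ip_network("198.51.100.0/24"): "path-A",    # aggregate
    ipaddress.ip_network("198.51.100.128/25"): "path-B",  # more specific
}

print(longest_match(table, "198.51.100.10"))   # covered only by the aggregate
print(longest_match(table, "198.51.100.200"))  # caught by the more specific
```

        Both entries must be carried by every router that needs to honor
        the steering, which is why inbound TE adds prefixes to the
        routing table.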

341     TE is performed by both ISPs and customer networks, for three primary
342     reasons:

344     o  First, to match traffic with network capacity, or to spread the
345        traffic load across multiple links (frequently referred to as
346        "load balancing").

348     o  Second, to reduce costs by shifting traffic to lower cost paths or
349        by balancing the incoming and outgoing traffic volume to maintain
350        appropriate peering relations.

352     o  Finally, TE is sometimes deployed to enforce certain forms of
353        policy (e.g., government traffic may not be permitted to transit
354        through other countries).

356     TE impacts route scaling in two ways. First, inbound TE can result
357     in additional prefixes being advertised into the DFZ. Second,
358     network operators usually achieve traffic engineering by "tweaking"
359     the processing of routing protocols to achieve desired results, e.g.,
360     by sending updates at an increased rate. In addition, some devices
361     attempt to automatically find better paths and then advertise those
362     preferences through BGP, though the extent to which such tools are in
363     use and contributing to the routing load is unknown.

365     In today's highly competitive environment, providers require TE to
366     maintain good performance and low cost in their networks.

368  4.2. Multihoming

370     Multihoming refers generically to the case in which a site is served
371     by more than one ISP [RFC4116]. Multihoming is used to provide
372     backup paths (i.e., to remove single points of failure), to achieve
373     load-sharing, and to achieve policy or performance objectives (e.g.,
374     to use lower latency or higher bandwidth paths). Multihoming may
375     also be a requirement due to contract or law.

377     Multihoming can be accomplished using either PI or PA address space.
378     A multihomed site advertises its site prefix into the routing system
379     of each of its providers. For PI space, the site's PI space is used,
380     and the prefix is propagated throughout the DFZ. For PA space, the
381     PA site prefix may (or may not) be propagated throughout the DFZ,
382     with the details depending on what type of multihoming is sought.

384     If the site uses PA space, the PA site prefix allocated from one of
385     its providers (which we'll call the Primary Provider) is used. The
386     PA site prefix will be aggregatable by the Primary Provider but not
387     the others. To achieve the same level of multihoming as described in
388     the case with PI addresses above, the PA site prefix will need to be
389     injected into the routing system of all of its ISPs, and throughout
390     the DFZ. In addition, because of the longest-match forwarding rule,
391     the Primary Provider must also advertise and propagate the individual
392     PA site prefix; otherwise, the path via the Primary Provider (as
393     advertised via the aggregate) will never be selected due to the
394     longest match rule. For the type of multihoming described here,
395     where the PA site prefix is propagated throughout the DFZ, the use of
396     PI vs. PA space has no impact on the load placed on the routing
397     system. The increased load is due entirely to the need to propagate
398     the site's individual prefix into the DFZ.

400     The demand for multihoming is increasing [XXX do we have data to
401     cite?]. The increase in multihoming demand is due to the increased
402     reliance on the Internet for mission- and business-critical
403     applications (where businesses require 7x24 availability for their
404     services) and the general decrease in cost of Internet connectivity.

406  4.3. End Site Renumbering

408     It is generally considered painful and costly to renumber a site,
409     with the cost proportional to the size and complexity of the network
410     and, most importantly, to the degree that addresses are stored in
411     places that are difficult in practice to update. When using PA
412     space, a site must renumber when changing providers.
        Larger sites
413     object to this cost and view the requirement to renumber as akin to
414     being held "hostage" to the provider from which PA space was
415     obtained. Consequently, many sites desire PI space. Having PI space
416     provides independence from any one provider and makes it easier to
417     switch providers (for whatever reason). However, each individual PI
418     prefix must be propagated throughout the DFZ and adds to the DFZ
419     routing load.

421     It should be noted that while larger sites may also want to
422     multihome, the cost of renumbering drives some sites to seek PI
423     space, even though they do not multihome.

425  4.4. Acquisitions and Mergers

427     Acquisitions and mergers take place for business reasons, which
428     usually have little to do with the network topologies of the impacted
429     organizations. When a business sells off part of itself, the assets
430     may include networks, attached devices, etc. A company that
431     purchases or merges with other organizations may quickly find that
432     its network assets are numbered out of many different and
433     unaggregatable address blocks. Consequently, individual
434     organizations may find themselves unable to announce a single prefix
435     for all of their networks without renumbering a significant portion
436     of those networks.

438     Likewise, selling off part of a business may involve selling part of
439     a network as well, resulting in the fragmentation of one address
440     block into two (or more) smaller blocks. Because the resultant
441     blocks belong to different companies, they can no longer be
442     advertised by a single aggregate and the resultant fragments may need
443     to be advertised individually into the DFZ.

445  4.5. RIR Address Allocation Policies

447     ISPs and multihoming end sites obtain address space from RIRs. As an
448     entity grows, it needs additional address space and requests more
449     from its RIR. In order to be able to obtain additional address space
450     that can be aggregated with the previously-allocated address space,
451     the RIR must keep a reserve of space that the requester can grow into
452     in the future. But any reserved address space cannot be used for any
453     other purpose. Hence, there is an inherent conflict between holding
454     address space in reserve to allow for the future growth of an
455     existing allocation and using address space efficiently. In IPv4,
456     there has been a heavy emphasis on conserving address space and
457     obtaining efficient utilization. Consequently, insufficient space
458     has been held in reserve to allow for the growth of all sites and
459     some allocations have had to be made from discontiguous address
460     blocks. For IPv6, a greater emphasis has been placed on aggregation.

462  4.6. Dual Stack Pressure on the Routing Table

464     The recommended IPv6 deployment model is dual-stack, where IPv4 and
465     IPv6 are run in parallel across the same links. This has two
466     implications for routing. First, although alternative scenarios are
467     possible, it seems likely that many routers will be supporting both
468     IPv4 and IPv6 simultaneously and will thus be managing both IPv4 and
469     IPv6 routing tables within a single router. Second, for sites
470     connected via both IPv4 and IPv6, both IPv4 and IPv6 prefixes will
471     need to be propagated into the routing system. Consequently, dual-
472     stack routers will maintain both an IPv4 and IPv6 route to reach the
473     same destination.

475     It is possible to make some simple estimates on the approximate size
476     of the IPv6 tables that would be needed if all sites reachable via
477     IPv4 today were also reachable via IPv6. In theory, each autonomous
478     system (AS) needs only a single aggregate route. This provides a
479     lower bound on the size of the fully-realized IPv6 routing table.
480     (As of July 2007, [CIDR4] states there are 25,836 active ASes in the
481     routing system.)
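
        The lower bound above, and the extrapolation worked through later
        in this section, reduce to simple arithmetic on the July 2007
        [CIDR4] figures quoted in the text. The sketch below only restates
        that arithmetic; the variable names are hypothetical, and the
        assumption that IPv6 would carry the same number of more specifics
        as IPv4 is the document's own.

```python
# Figures quoted in this section from [CIDR4], as of July 2007.
active_ases = 25_836           # one IPv6 aggregate per active AS (lower bound)
ipv4_routes = 229_789          # routes in the IPv4 Internet table
ipv4_cidr_aggregates = 150_018 # what those routes could be aggregated down to

# More specifics: routes that exist beyond the achievable aggregates.
more_specifics = ipv4_routes - ipv4_cidr_aggregates

# Assumption stated in the text: IPv6 carries the same number of more
# specifics as IPv4, on top of one aggregate per active AS.
ipv6_estimate = active_ases + more_specifics

print(more_specifics, ipv6_estimate)
```

        Changing the assumed number of more specifics changes the estimate
        directly, so the result is best read as a rough projection rather
        than a prediction.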

483     A single IPv6 aggregate will not allow for inbound traffic
484     engineering. End sites will need to advertise a number of smaller
485     prefixes into the DFZ if they desire to gain finer-grained control
486     over their IPv6 inbound traffic. This will increase the size of the
487     IPv6 routing table beyond the lower bound discussed above. There is
488     reason to expect the IPv6 routing table will be smaller than the
489     current IPv4 table, however, because the larger initial assignments
490     to end sites will minimize the de-aggregation that occurs when a site
491     must go back to its upstream address provider or RIR and receive a
492     second, non-contiguous assignment.

494     It is possible to extrapolate what the size of the IPv6 Internet
495     routing table would be if widespread IPv6 adoption occurred, from the
496     current IPv4 Internet routing table. Each active AS (25,836) would
497     require at least one aggregate. In addition, the IPv6 Internet table
498     would also carry more specific prefixes for traffic engineering.
499     Assume that the IPv6 Internet table will carry the same number of
500     more specifics as the IPv4 Internet table. In this case one can take
501     the number of IPv4 Internet routes and subtract the number of CIDR
502     aggregates that they could easily be aggregated down to. As of July
503     2007, the 229,789 routes can be easily aggregated down to 150,018
504     CIDR aggregates [CIDR4]. That difference yields 79,771 extra more
505     specific prefixes. Thus if each active AS (25,836) required one
506     aggregate, and an additional 79,771 more specifics were required,
507     then the IPv6 Internet table would contain 105,607 prefixes.

509  4.7. Internal Customer Routes

511     In addition to the Internet routing table, networks must also carry
512     their internal routing table. Internal routes are defined as more
513     specific routes that are not advertised to the DFZ.
This primarily 514 consists of prefixes that are a more specific of a provider aggregate 515 (PA) and are assigned to a single homed customer. The DFZ need only 516 carry the PA aggregate in order to deliver traffic to the provider. 517 However, the provider's routers require the more specific route to 518 deliver traffic to the end site. 520 This could also consist of more specific prefixes advertised by 521 multi-homed customers with the no-export community. This is useful 522 when the fine grained control of traffic to be influenced can be 523 contained to the neighboring network. 525 For a large ISP, the internal IPv4 table can be between 50,000 and 526 150,000 routes. During the dot com boom some ISPs had more internal 527 prefixes than there were in the Internet table. Thus the size of the 528 internal routing table can have significant impact on the scalability 529 and should not be discounted. 531 4.8. IPv4 Address Exhaustion 533 The IANA and RIR free pool of IPv4 addresses will be exhausted within 534 a few years. As the free pool shrinks, the size of the remaining 535 unused blocks will also shrink and unused blocks previously held in 536 reserve for expansion of existing allocations or otherwise not used 537 due to their smaller size will be allocated for use. Consequently, 538 as the community looks to use use every piece of available address 539 space (no matter how small) there will be an increasing pressure to 540 advertise additional prefixes in the DFZ. 542 5. Pressures on Path Computation Load 544 This section describes a number of trends and pressures that are 545 contributing to the overall load of computing Internet paths. The 546 previous section described pressures that are increasing the size of 547 the routing table. Even if the size could be bounded, the amount of 548 work needed to maintain paths for a given set of prefixes appears to 549 be increasing. 551 5.1. 
Interconnection Richness 553 The degree of interconnectedness between ASes has increased in recent 554 years. That is, the Internet as a whole is becoming "flatter" with an 555 increasing number of possible paths interconnecting sites [ref? gih 556 has observed this]. As the number of possible paths increases, the 557 amount of computation needed to find a best path also increases. 558 This computation comes into effect whenever a change in path 559 characteristics occurs, whether from a new path becoming available, 560 an existing path failing, or a change in the attributes associated 561 with a potential path. Thus, even if the total number of prefixes 562 were to stay constant, an increase in the interconnection richness 563 implies an increase in the resources needed to maintain routing 564 tables. 566 5.2. Multihoming 568 Multihoming places pressure on the routing system in two ways. 569 First, an individual prefix for a multihomed site (whether PI or PA) 570 must be propagated into the routing system, so that other sites can 571 find a good path to the site. Even if the site's prefix comes out of 572 a PA block, an individual prefix for the site needs to be advertised 573 so that the most desirable path to the site can be chosen when the 574 path through the aggregate is sub-optimal. Second, a multihomed 575 site will be connected to the Internet in more than one place, 576 increasing the overall level of interconnection richness. If an 577 outage occurs on any of the circuits connecting the site to the 578 Internet, those changes will be propagated into the routing system. 579 In contrast, a singly-homed site numbered out of a Provider Aggregate 580 places no additional routing load in the DFZ as the details of the 581 connectivity status to the site are kept internal to the provider to 582 which it connects. 584 5.3. Traffic Engineering 586 The mechanisms used to achieve multihoming and inbound Traffic 587 Engineering are the same.
In both cases, a specific prefix is 588 advertised into the routing system to "catch" traffic and route it 589 over a different path than it would otherwise be carried. When 590 multihoming, the specific prefix is one that differs from that of its 591 ISP or is a more-specific of the ISP's PA. Traffic Engineering is 592 achieved by taking one prefix and dividing it into a number of smaller 593 and more-specific ones, and advertising them in order to gain finer- 594 grained control over the paths used to carry traffic covered by those 595 prefixes. 597 Traffic Engineering increases the number of prefixes carried in the 598 routing system. In addition, when a circuit fails (or the routing 599 attributes associated with the circuit change), additional load is 600 placed on the routing system by having multiple prefixes potentially 601 impacted by the change, as opposed to just one. 603 5.4. Questionable Operational Practices? 605 Some operators are believed to engage in operational practices that 606 increase the load on the routing system. 608 5.4.1. Rapid shuffling of prefixes 610 Some networks try to assert fine-grained control of inbound traffic 611 by modifying route announcements frequently in order to migrate 612 traffic to less loaded links quickly. The goal of this is to achieve 613 higher utilization of multiple links. In addition, some route 614 selection devices actively measure link or path utilization and 615 attempt to optimize inbound traffic by withholding or depreferencing 616 certain prefixes in their advertisements. In short, any system that 617 actively measures load and modifies route advertisements in real time 618 increases the load on the routing system, as any change in what is 619 advertised must ripple through the entire routing system. 621 5.4.2.
Anti-Route Hijacking 623 In order to reduce the threat of accidental (or intentional) 624 hijacking of its address space by an unauthorized third party, some 625 sites advertise their space as a set of smaller prefixes rather than 626 as one aggregate. That way, if someone else advertises a path for 627 the larger aggregate (or a small piece of the aggregate), it will be 628 ignored in favor of the more specific announcements. This increases 629 both the number of prefixes advertised, and the number of updates. 631 5.4.3. Operational Ignorance 633 It is believed that some undesirable practices result from operator 634 ignorance, where the operator is unaware of what they are doing and 635 the impact that has on the DFZ. 637 The default behavior of most BGP configurations is to automatically 638 propagate all learned routes. That is, one must take explicit 639 configuration steps to prevent the automatic propagation of learned 640 routes. In addition, it is often significant work to figure out how 641 to (safely) aggregate routes (and which ones to aggregate) in order 642 to reduce the number of advertisements propagated elsewhere. While 643 vendors could provide additional configuration "knobs" to reduce 644 leakage, the implementation of additional features increases 645 complexity and some operators may fear that the new configuration 646 will break their existing routing setup. Finally, leaking routes 647 unnecessarily does not generally harm those with the 648 misconfiguration; hence, there is less motivation to address the 649 problem. 651 5.5. RIR Policy 653 RIR address policy has a direct impact on the routing load because 654 address policy determines who is eligible for a PI assignment (which 655 impacts how many are given out in practice) and the size of the 656 assignment (which impacts how much address space can be aggregated 657 within a single assignment).
If PI assignments for end sites did not 658 exist, then those end sites would not advertise their own prefix 659 directly into the global routing system; instead their address block 660 would be covered by their provider's aggregate. That said, RIRs have 661 adopted PI policies in response to community demand, for reasons 662 described elsewhere (e.g., to support multihoming and to avoid the 663 need to renumber). In short, RIR policy can be seen as a symptom 664 rather than a root cause. 666 6. Problem Statement 668 There is a need for an approach to routing and addressing that: 670 1. Reduces the growth rate of the DFZ routing load, where the 671 routing load is dependent on: 673 A. The number of individual prefixes in the DFZ. 675 B. The update rate associated with those prefixes. 677 2. Allows any end site wishing to multihome to do so. 679 3. Supports ISP and enterprise TE needs. 681 4. Allows end sites to switch providers while minimizing 682 configuration changes to internal end site devices. 684 5. Provides meaningful benefits to the parties who bear the costs of 685 deploying and maintaining the technology. 687 The problem statement in this document has purposefully been scoped 688 to focus on the growth of the routing update function of the DFZ. 689 Other problems that may seem related but do not directly impact the 690 route scaling problem are not considered to be "in scope" at this 691 time. For example, Mobile IP [[RFC2002], RFC3775] and NEMO 692 [[RFC3963]] place no pressures on the routing system. They are 693 layered on top of existing IP, using tunneling to forward packets via 694 a care-of address. Hence, "improving" these technologies (e.g., by 695 having them leverage a solution to the multihoming problem), while a 696 laudable goal, is not considered a part of this problem statement. 698 7. Security Considerations 700 None. 702 8.
Acknowledgments 704 The initial version of this document was produced by the Routing and 705 Addressing Directorate (http://www.ietf.org/IESG/content/radir.html). 706 The membership of the directorate at that time included Marla 707 Azinger, Vince Fuller, Vijay Gill, Thomas Narten, Erik Nordmark, 708 Jason Schiller, Peter Schoenmaker, and John Scudder. 710 9. Informative References 712 [RFC2002] Perkins, C., "IP Mobility Support", RFC 2002, 713 October 1996. 715 [RFC3963] Devarapalli, V., Wakikawa, R., Petrescu, A., and P. 716 Thubert, "Network Mobility (NEMO) Basic Support Protocol", 717 RFC 3963, January 2005. 719 [RFC4632] Fuller, V. and T. Li, "Classless Inter-domain Routing 720 (CIDR): The Internet Address Assignment and Aggregation 721 Plan", BCP 122, RFC 4632, August 2006. 723 Author's Address 725 Thomas Narten 726 IBM 728 Email: narten@us.ibm.com 730 Full Copyright Statement 732 Copyright (C) The IETF Trust (2007). 734 This document is subject to the rights, licenses and restrictions 735 contained in BCP 78, and except as set forth therein, the authors 736 retain all their rights. 738 This document and the information contained herein are provided on an 739 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 740 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 741 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 742 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 743 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 744 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
746 Intellectual Property 748 The IETF takes no position regarding the validity or scope of any 749 Intellectual Property Rights or other rights that might be claimed to 750 pertain to the implementation or use of the technology described in 751 this document or the extent to which any license under such rights 752 might or might not be available; nor does it represent that it has 753 made any independent effort to identify any such rights. Information 754 on the procedures with respect to rights in RFC documents can be 755 found in BCP 78 and BCP 79. 757 Copies of IPR disclosures made to the IETF Secretariat and any 758 assurances of licenses to be made available, or the result of an 759 attempt made to obtain a general license or permission for the use of 760 such proprietary rights by implementers or users of this 761 specification can be obtained from the IETF on-line IPR repository at 762 http://www.ietf.org/ipr. 764 The IETF invites any interested party to bring to its attention any 765 copyrights, patents or patent applications, or other proprietary 766 rights that may cover technology that may be required to implement 767 this standard. Please address the information to the IETF at 768 ietf-ipr@ietf.org. 770 Acknowledgment 772 Funding for the RFC Editor function is provided by the IETF 773 Administrative Support Activity (IASA).