Network Working Group                                         P. Francis
Internet-Draft                                                   MPI-SWS
Intended status: Informational                                     X. Xu
Expires: September 9, 2010                                        Huawei
                                                              H. Ballani
                                                              Cornell U.
                                                                  D. Jen
                                                                    UCLA
                                                               R. Raszuk
                                                                   Cisco
                                                                L. Zhang
                                                                    UCLA
                                                           March 8, 2010


                FIB Suppression with Virtual Aggregation
                       draft-ietf-grow-va-02.txt

Abstract

   The continued growth in the Default Free Routing Table (DFRT)
   stresses the global routing system in a number of ways. One of the
   most costly stresses is FIB size: ISPs often must upgrade router
   hardware simply because the FIB has run out of space, and router
   vendors must design routers that have adequate FIB capacity. FIB
   suppression is an approach to relieving stress on the FIB by NOT
   loading selected RIB entries into the FIB. Virtual Aggregation (VA)
   allows ISPs to shrink the FIBs of any and all routers, easily by an
   order of magnitude with negligible increase in path length and load.
   FIB suppression can be deployed autonomously by an ISP (cooperation
   between ISPs is not required), and can co-exist with legacy routers
   in the ISP.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 9, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Table of Contents

   1.  Introduction
     1.1.  Scope of this Document
     1.2.  Requirements notation
     1.3.  Terminology
     1.4.  Temporary Sections
       1.4.1.  Document revisions
   2.  Overview of Virtual Aggregation (VA)
     2.1.  Mix of legacy and VA routers
     2.2.  Summary of Tunnels and Paths
   3.  Specification of VA
     3.1.  VA Operation
       3.1.1.  Legacy Routers
       3.1.2.  Advertising and Handling Virtual Prefixes (VP)
       3.1.3.  Border VA Routers
       3.1.4.  Advertising and Handling Sub-Prefixes
       3.1.5.  Suppressing FIB Sub-prefix Routes
     3.2.  New Configuration
   4.  Usage of Tunnels
     4.1.  MPLS tunnels
     4.2.  Usage of Inner Label
   5.  IANA Considerations
   6.  Security Considerations
     6.1.  Properly Configured VA
     6.2.  Mis-configured VA
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   ISPs today manage constant DFRT growth in a number of ways. One way,
   of course, is for ISPs to upgrade their router hardware before DFRT
   growth outstrips the size of the FIB. This is too expensive for many
   ISPs. They would prefer to extend the lifetime of routers whose FIBs
   can no longer hold the full DFRT.

   A common approach taken by lower-tier ISPs is to default route to
   their providers. Routes to customers and peer ISPs are maintained,
   but everything else defaults to the provider. This approach has
   several disadvantages. First, packets to Internet destinations may
   take longer-than-necessary AS paths. This problem can be mitigated
   through careful configuration of partial defaults, but this can
   require substantial configuration overhead. A second problem with
   defaulting to providers is that the ISP is no longer able to provide
   the full DFRT to its customers. Finally, provider defaults prevent
   the ISP from detecting martian packets. As a result, the ISP
   transmits packets that could otherwise have been dropped over its
   expensive provider links.

   An alternative is for the ISP to maintain full routes in its core
   routers, but to filter routes from edge routers that do not require a
   full DFRT. These edge routers can then default route to the core
   routers.
   This is often possible with edge routers that interface to
   customer networks. The problem with this approach is that it cannot
   be used for all edge routers. For instance, it cannot be used for
   routers that connect to transit providers. It of course also does
   not help in cases where core routers themselves have inadequate FIB
   capacity.

   FIB suppression is an approach to shrinking FIB size that requires no
   changes to BGP, no changes to packet forwarding mechanisms in
   routers, and relatively minor changes to control mechanisms in
   routers and configuration of those mechanisms. The core idea behind
   FIB suppression is to run BGP as normal, and in particular to not
   shrink the RIB, but rather to not load certain RIB entries into the
   FIB. This approach minimizes changes to routers, and in particular
   is simpler than more general routing architectures that try to shrink
   both RIB and FIB. With FIB suppression, there are no changes to BGP
   per se. The BGP decision process does not change. The selected
   AS-path does not change, and except on rare occasion the exit router
   does not change. ISPs can deploy FIB suppression autonomously and
   with no coordination with neighboring ASes.

   This document describes an approach to FIB suppression called
   "Virtual Aggregation" (VA). VA operates by organizing the IP (v4 or
   v6) address space into Virtual Prefixes (VP), and using tunnels to
   aggregate the (regular) sub-prefixes within each VP. The decrease in
   FIB size can be dramatic, easily 5x or 10x with only a slight path
   length and router load increase [nsdi09]. The VPs can be organized
   such that all routers in an ISP see FIB size decrease, or in such a
   way that "core" routers keep the full FIB, and "edge" routers have
   almost no FIB (i.e. by defining a VP of 0/0).
   This "core-edge" style of VA deployment is much simpler than a
   "full" VA deployment, whereby multiple VPs are defined, and any
   router, core or otherwise, can have reduced FIB size. This simpler
   "core-edge" style of deployment is specified in a separate draft in
   order to make it more easily understandable
   [I-D.ietf-grow-simple-va].

   VA has the following characteristics:
   o  it is robust to router failure,
   o  it allows for traffic engineering,
   o  it allows for existing inter-domain routing policies,
   o  it operates in a predictable manner and is therefore possible to
      test, debug, and reason about performance (e.g. to establish
      SLAs),
   o  it can be safely installed, tested, and started up,
   o  it can be configured and reconfigured without service
      interruption,
   o  it can be incrementally deployed, and in particular can be
      operated in an AS with a mix of VA-capable and legacy routers,
   o  it accommodates existing security mechanisms such as unicast
      Reverse Path Forwarding (uRPF) ingress filtering and DoS defense,
   o  it does not introduce significant new security vulnerabilities.

1.1.  Scope of this Document

   The scope of this document is limited to intra-domain VA operation,
   that is, the case where a single ISP autonomously operates VA
   internally without any coordination with neighboring ISPs.

   Note that this document assumes that the VA "domain" (i.e. the unit
   of autonomy) is the AS (that is, different ASes run VA independently
   and without coordination). For the remainder of this document, the
   terms ISP, AS, and domain are used interchangeably.

   This document applies equally to IPv4 and IPv6.

   VA may operate with a mix of upgraded routers and legacy routers.
   There are no topological restrictions placed on the mix of routers.
   In order to avoid loops between upgraded and legacy routers, packets
   are always tunneled by the VA routers to the BGP NEXT_HOPs of the
   matched BGP routes. If a given local ASBR is a legacy router, it
   must be able to terminate tunnels.

1.2.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

1.3.  Terminology

   Aggregation Point Router (APR):  An Aggregation Point Router (APR) is
      a router that aggregates a Virtual Prefix (VP) by installing
      routes (into the FIB) for all of the sub-prefixes within the VP.
      APRs advertise the VP to other routers with BGP. For each
      sub-prefix within the VP, APRs have a tunnel from themselves to
      the remote ASBR (Autonomous System Border Router) where packets
      for that prefix should be delivered.
   Install and Suppress:  The terms "install" and "suppress" are used to
      describe whether a RIB entry has been loaded or not loaded into
      the FIB. In other words, the phrase "install a route" means
      "install a route into the FIB", and the phrase "suppress a route"
      means "do not install a route into the FIB".
   Legacy Router:  A router that does not run VA, and has no knowledge
      of VA. Legacy routers, however, must be able to terminate tunnels
      when they are local ASBRs.
   non-APR Router:  In discussing VPs, it is often necessary to
      distinguish between routers that are APRs for that VP, and routers
      that are not APRs for that VP (but of course may be APRs for other
      VPs not under discussion). In these cases, the term "APR" is
      taken to mean "a VA router that is an APR for the given VP", and
      the term "non-APR" is taken to mean "a VA router that is not an
      APR for the given VP". The term non-APR router is not used to
      refer to legacy routers.
   Popular Prefix:  A Popular Prefix is a sub-prefix that is installed
      in a router in addition to the sub-prefixes it holds by virtue of
      being an Aggregation Point Router. The Popular Prefix allows
      packets to follow the shortest path. Note that different routers
      do not need to have the same set of Popular Prefixes.
   Routing Information Base (RIB):  The term RIB is used loosely in
      this document to refer either to the Loc-RIB (as used in
      [RFC4271]), or to the combined Adj-RIBs-In, the Loc-RIB, and the
      Adj-RIBs-Out.
   Sub-Prefix:  A regular (physically aggregatable) prefix. These are
      equivalent to the prefixes that would normally comprise the DFRT
      in the absence of VA. A VA router will contain a sub-prefix entry
      either because the sub-prefix falls within a Virtual Prefix for
      which the router is an APR, or because the sub-prefix is installed
      as a Popular Prefix. Legacy routers hold the same sub-prefixes
      that they hold today.
   Tunnel:  This draft specifies the use of MPLS Label Switched Paths
      (LSP), and of MPLS inner labels tunneled over either LSPs or IP
      headers. Other types of tunnels may be used, but are not
      specified here. This document generically uses the term tunnel to
      refer to any of these tunnel types.
   VA router:  A router that operates Virtual Aggregation according to
      this document.
   Virtual Prefix (VP):  A Virtual Prefix (VP) is a prefix used to
      aggregate its contained regular prefixes (sub-prefixes). The
      sub-prefixes in a VP are not physically aggregatable, and so they
      are aggregated at APRs through the use of tunnels.
   VP-List:  A list of defined VPs. All routers must agree on the
      contents of this list (which is statically configured into every
      VA router).

1.4.  Temporary Sections

   This section contains temporary information, and will be removed in
   the final version.

1.4.1.  Document revisions

   This document was previously published as both
   draft-francis-idr-intra-va-01.txt and draft-francis-intra-va-01.txt.

1.4.1.1.  Revisions from the 01 version of draft-ietf-grow-va

   The specification of how to use tunnels has been incorporated
   directly into this draft. Formerly the specifications were provided
   in separate drafts ([I-D.ietf-grow-va-mpls], and
   [I-D.ietf-grow-va-mpls-innerlabel]). The tunneling types specified
   in [I-D.ietf-grow-va-gre] are not included in this draft.

   The simpler "core-edge" style of deployment has been removed from
   this draft and specified in a stand-alone draft
   [I-D.ietf-grow-simple-va] to simplify its understanding for those
   interested in only that style of deployment.

   Added text about usage of uRPF (strict and loose).

   Added text about flapping APR failure scenario.

1.4.1.2.  Revisions from the 00 version of draft-ietf-grow-va

   Removed the notion that FIB suppression can be done by suppressing
   entries from the Routing Table (as defined in Section 3.2 of
   [RFC4271]), an idea that was introduced in the second version of the
   draft. Suppressing from the Routing Table breaks PIM-SM, which
   relies on the contents of the unicast Routing Table to produce its
   forwarding table.

1.4.1.3.  Revisions from the 00 version (of
          draft-francis-intra-va-00.txt)

   Added additional authors (Jen, Raszuk, Zhang), to reflect primary
   contributors moving forward. In addition, a number of minor
   clarifications were made.

1.4.1.4.  Revisions from the 01 version (of
          draft-francis-idr-intra-va-01.txt)

   1.  Changed file name from draft-francis-idr-intra-va to
       draft-francis-intra-va.
   2.  Restructured the document to make the edge suppression mode a
       specific sub-case of VA rather than a separate mode of operation.
       This includes modifying the title of the draft.
   3.  Removed MPLS tunneling details so that specific tunneling
       approaches can be described in separate documents.

1.4.1.5.  Revisions from 00 version

   o  Changed intended document type from STD to BCP, as per advice from
      Dublin IDR meeting.
   o  Cleaned up the MPLS language, and specified that the full-address
      routes to remote ASBRs must be imported into OSPF (Section 3.1.3).
      As per Daniel Ginsburg's email
      http://www.ietf.org/mail-archive/web/idr/current/msg02933.html.
   o  Clarified that legacy routers must run MPLS. As per Daniel
      Ginsburg's email
      http://www.ietf.org/mail-archive/web/idr/current/msg02935.html.
   o  Fixed LOCAL_PREF bug. As per Daniel Ginsburg's email
      http://www.ietf.org/mail-archive/web/idr/current/msg02940.html.
   o  Removed the need for the extended communities attribute on VP
      routes, and added the requirement that all VA routers be
      statically configured with the complete list of VPs. As per
      Daniel Ginsburg's emails
      http://www.ietf.org/mail-archive/web/idr/current/msg02940.html and
      http://www.ietf.org/mail-archive/web/idr/current/msg02958.html.
      In addition, the procedure for adding, deleting, splitting, and
      merging VPs was added. As part of this, the possibility of having
      overlapping VPs was added.
   o  Added the special case of a core-edge topology with default routes
      to the edge as suggested by Robert Raszuk in email
      http://www.ietf.org/mail-archive/web/idr/current/msg02948.html.
      Note that this altered the structure and even title of the
      document.
   o  Clarified that FIB suppression can be achieved by not loading
      entries into the Routing Table, as suggested by Rajiv Asati in
      email
      http://www.ietf.org/mail-archive/web/idr/current/msg03019.html.

2.  Overview of Virtual Aggregation (VA)

   For descriptive simplicity, this section starts by describing VA
   assuming that there are no legacy routers in the domain.
   Section 2.1 describes the additional functions required by VA
   routers to accommodate legacy routers.

   A key concept behind VA is to operate BGP as normal, and in
   particular to populate the RIB with the full DFRT, but to suppress
   many or most prefixes from being loaded into the FIB. By populating
   the RIB as normal, we avoid any changes to BGP, and changes to router
   operation are relatively minor. The basic idea behind VA is quite
   simple. The address space is partitioned into large prefixes,
   larger than any aggregatable prefix in use today. These prefixes are
   called Virtual Prefixes (VP). Different VPs do not need to be the
   same size. They may be a mix of /6, /7, /8 (for IPv4), and so on.
   Indeed, an ISP can define a single /0 VP, and use it for a core/edge
   type of configuration [I-D.ietf-grow-simple-va]. That is, the core
   routers would maintain full FIBs, and edge routers could maintain
   default routes to the core routers, and suppress as much of the FIB
   as they wish. Each ISP can independently select the size of its VPs.

   VPs are not themselves topologically aggregatable. VA makes the VPs
   aggregatable through the use of tunnels, as follows. Associated with
   each VP are one or more "Aggregation Point Routers" (APR). An APR
   (for a given VP) is a router that installs routes for all
   sub-prefixes (i.e. real physically aggregatable prefixes) within the
   VP. Note that an APR is not a special router per se; it is an
   otherwise normal router that is configured to operate as an APR. By
   "install routes" here, we mean:

   1.  The route for each of the sub-prefixes is loaded into the FIB,
       and
   2.  there is a tunnel from the APR to the BGP NEXT_HOP for the route.

   The APR originates a BGP route to the VP. This route is distributed
   within the domain, but not outside the domain.
   With this structure in place, a packet transiting the ISP goes from
   the ingress router to the APR (usually via a tunnel), and then from
   the APR to the BGP NEXT_HOP router via a tunnel. VA can operate with
   MPLS LSPs, or with MPLS inner labels over LSPs or IP headers.
   Section 4 specifies the usage of tunnels.

   The BGP NEXT_HOP can be either the local ASBR or the remote ASBR. In
   the former case, an inner label is used to tunnel packets
   (Section 4.2). In either case, all tunnel headers are stripped by
   the local ASBR before the packet is delivered to the remote ASBR. In
   other words, the remote ASBR sees a normal IP packet, and is
   completely unaware of the existence of VA in the neighboring ISP.
   Note that legacy ASBRs MUST set themselves as the BGP NEXT_HOP.

   Note that the AS-path is not affected at all by VA. This means among
   other things that AS-level policies are not affected by VA. The
   packet may not, however, follow the shortest path within the ISP
   (where shortest path is defined here as the path that would have been
   taken if VA were not operating), because the APR may not be on the
   shortest path between the ingress and egress routers. When this
   happens, the packet experiences additional latency and creates extra
   load (by virtue of taking more hops than it otherwise would have).
   Note also that, with VA, a packet may occasionally take a different
   exit point than it otherwise would have.

   VA can avoid traversing the APR for selected routes by installing
   these routes in non-APR routers. In other words, even if an ingress
   router is not an APR for a given sub-prefix, it MAY install that
   sub-prefix into its FIB. Packets in this case are tunneled directly
   from the ingress to the BGP NEXT_HOP. These extra routes are called
   "Popular Prefixes", and are typically installed for policy reasons
   (e.g. customer routes are always installed), or for sub-prefixes that
   carry a high volume of traffic (Section 3.1.5.1). Different routers
   MAY have different Popular Prefixes. As such, an ISP MAY assign
   Popular Prefixes per router, per POP, or uniformly across the ISP. A
   given router MAY have zero Popular Prefixes, or the majority of its
   FIB MAY consist of Popular Prefixes. The effectiveness of Popular
   Prefixes in reducing traffic load relies on the fact that traffic
   volumes follow something like a power-law distribution: for example,
   90% of traffic is destined to 10% of the destinations. Internet
   traffic measurement studies over the years have consistently shown
   that traffic patterns follow this distribution, though there is no
   guarantee that they always will.

   Note that for routing to work properly, every packet must sooner or
   later reach a router that has installed a sub-prefix route that
   matches the packet. This would obviously be the case for a given
   sub-prefix if every router has installed a route for that sub-prefix
   (which of course is the situation in the absence of VA). If this is
   not the case, then there MUST be at least one Aggregation Point
   Router (APR) for the sub-prefix's Virtual Prefix (VP). Ideally,
   every POP contains at least two APRs for every Virtual Prefix. By
   having APRs in every POP, the latency imposed by routing to the APR
   is minimal (the extra hop is within the POP). By having more than
   one APR, there is a redundant APR should one fail. In practice it is
   often not possible to have an APR for every VP in every POP. This is
   because some POPs may have only one or a few routers, and therefore
   there may not be enough cumulative FIB space in the POP to hold
   every sub-prefix. Note that any router ("edge", "core", etc.) MAY
   be an APR.
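
   The installation rule and the APR coverage requirement described
   above can be sketched as follows. This is an illustrative Python
   sketch, not part of the specification: the function names, router
   names, and example prefixes are hypothetical, and the choice to
   always install a sub-prefix that falls within no VP is an assumption
   made here so that such a prefix remains reachable.

```python
import ipaddress

def should_install(sub_prefix, vp_list, apr_vps, popular_prefixes):
    """Decide whether a VA router loads sub_prefix into its FIB."""
    covering = [vp for vp in vp_list if sub_prefix.subnet_of(vp)]
    if not covering:
        # Assumption: a sub-prefix outside every VP is always installed.
        return True
    if any(vp in apr_vps for vp in covering):
        # The router is an APR for a VP that contains this sub-prefix.
        return True
    # Otherwise install only if it was chosen as a Popular Prefix.
    return sub_prefix in popular_prefixes

def vps_without_apr(vp_list, apr_vps_by_router):
    """Return the VPs for which no router in the domain is an APR."""
    covered = set().union(*apr_vps_by_router.values())
    return [vp for vp in vp_list if vp not in covered]

# Hypothetical example data (not taken from the draft).
net = ipaddress.ip_network
VP_LIST = [net("20.0.0.0/7"), net("22.0.0.0/7")]
APR_VPS_BY_ROUTER = {"r1": {net("20.0.0.0/7")}, "r2": set()}
POPULAR = {net("22.1.0.0/16")}

if __name__ == "__main__":
    for p in ("20.5.0.0/16", "22.1.0.0/16", "22.9.0.0/16"):
        print(p, should_install(net(p), VP_LIST,
                                APR_VPS_BY_ROUTER["r1"], POPULAR))
    print(vps_without_apr(VP_LIST, APR_VPS_BY_ROUTER))
```

   In this example data no router is an APR for 22.0.0.0/7, so a
   suppressed sub-prefix such as 22.9.0.0/16 would be unreachable. A
   check of this kind flags the violation of the "at least one APR per
   VP" requirement before any routes are suppressed.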

   It is important that both the contents of BGP RIBs, as well as the
   contents of the Routing Table (as defined in Section 3.2 of
   [RFC4271]) not be modified by VA (other than the introduction of
   routes to VPs). This is because PIM-SM [RFC4601] relies on the
   contents of the Routing Table to build its own trees and forwarding
   table. Therefore, FIB suppression MUST take place between the
   Routing Table and the actual FIB(s).

2.1.  Mix of legacy and VA routers

   It is important that an ISP be able to operate with a mix of "VA
   routers" (routers upgraded to operate VA as described in this
   document) and "legacy routers". This allows ISPs to deploy VA in an
   incremental fashion and to continue to use routers that for whatever
   reason cannot be upgraded. This document allows such a mix, and
   indeed places no topological restrictions on that mix. It does,
   however, require that legacy routers (and VA routers for that matter)
   are able to forward already-tunneled packets, are able to serve as
   tunnel endpoints, and are able to participate in distribution of
   tunnel information required to establish themselves as tunnel
   endpoints. (This is listed as Requirement R5 in the companion
   tunneling documents.) Depending on the tunnel type, legacy routers
   MAY also be able to initiate tunneled packets, though this is an
   OPTIONAL requirement. (This is listed as Requirement R4 in the
   companion tunneling documents.) Legacy routers MUST use their own
   address as the BGP NEXT_HOP, and MUST FIB-install routes for which
   they are the BGP NEXT_HOP.

2.2.  Summary of Tunnels and Paths

   To summarize, the following tunnels are created:

   1.  From all VA routers to all BGP NEXT_HOP addresses (where the BGP
       NEXT_HOP address is either an APR, a local ASBR, or the remote
       ASBR neighbor of a VA router). Note that this is listed as
       Requirement R3 in the companion tunneling documents.
   2.  Optionally, from all legacy routers to all BGP NEXT_HOP
       addresses.

   There are a number of possible paths that packets may take through an
   ISP, summarized in the following diagram. Here, "VA" is a VA router,
   "LR" is a legacy router, the symbol "==>" represents a tunneled
   packet (through zero or more routers), "-->" represents an untunneled
   packet, and "(pop)" represents stripping the tunnel header. The
   symbol "::>" represents the portion of the path where although the
   tunnel is targeted to the receiving node, the outer header has been
   stripped. (Note that the remote ASBR may actually be a legacy router
   or a VA router; it doesn't matter (and isn't known) to the ISP.)

                                           Egress
                                           Router
         Ingress   Some        APR         (Local      Remote
         Router    Router      Router      ASBR)       ASBR
         -------   ------      ------      ------      --------
   1.    VA===================>VA=========>VA(pop)::::>LR

   2.    VA===================>VA=========>LR--------->LR

   3.    VA===============================>VA(pop)::::>LR

   4.    VA===============================>LR--------->LR

         (The following two exist in the case where legacy routers
          can initiate tunneled packets.)

   5.    LR===============================>VA(pop)::::>LR

   6.    LR===============================>LR--------->LR

         (The following two exist in the case where legacy routers
          cannot initiate tunneled packets.)

   7.    LR------->VA (remaining paths as in 1 to 4 above)

   8.    LR------->LR--------------------->LR--------->LR

   The first and second paths represent the case where the ingress
   router does not have a Popular Prefix for the destination, and MUST
   tunnel the packet to an APR. The third and fourth paths represent
   the case where the ingress router does have a Popular Prefix for the
   destination, and so tunnels the packet directly to the egress.
   The fifth and sixth paths are similar to the third and fourth paths
   respectively, but where the ingress is a legacy router that can
   initiate tunneled packets, and effectively has the Popular Prefix by
   virtue of holding the entire DFRT. (Note that some ISPs have only
   partial RIBs in their customer-facing edge routers, and default route
   to a router that holds the full DFRT. This case is not shown here,
   but works perfectly well.) Finally, paths 7 and 8 represent the case
   where legacy routers cannot initiate a tunneled packet.

   VA prevents the routing loops that might otherwise occur when VA
   routers and legacy routers are mixed. The trick is avoiding the case
   where a legacy router is forwarding packets towards the BGP NEXT_HOP,
   while a VA router is forwarding packets towards the APR, with each
   router thinking that the other is on the shortest path to its
   respective target.

   In the first four types of path, the loop is avoided because tunnels
   are used all the way to the egress. As a result, there is never an
   opportunity for a legacy router to try to route based on the
   destination address unless the legacy router is the egress, in which
   case it forwards the packet to the remote ASBR.

   In the 5th and 6th cases, the ingress is a legacy router, but this
   router can initiate tunnels and has the full FIB, and so simply
   tunnels the packet to the egress router.

   In the 7th and 8th cases, the legacy ingress cannot initiate tunnels,
   and so forwards the packet hop-by-hop towards the BGP NEXT_HOP. The
   packet will work its way towards the egress router, and will either
   progress through a series of legacy routers (in which case the IGP
   prevents loops), or it will eventually reach a VA router, after which
   it will take tunnels as in the 1st and 2nd cases.

3.  Specification of VA

   This section describes in detail how to operate VA.
   It starts with a brief discussion of requirements, followed by a
   specification of router support for VA.

3.1.  VA Operation

   In this section, the detailed operation of VA is specified.

3.1.1.  Legacy Routers

   VA can operate with a mix of VA and legacy routers. To prevent the
   types of loops described in Section 2.2, however, legacy routers MUST
   satisfy the following requirements:

   1.  When forwarding externally-received routes over iBGP, the BGP
       NEXT_HOP attribute MUST be set to the legacy router itself.
   2.  Legacy routers MUST be able to detunnel packets addressed to
       themselves at the BGP NEXT_HOP address. They MUST also be able
       to convey the tunnel information needed by other routers to
       initiate tunneled packets to them. This is listed as
       "Requirement R1" in the companion tunneling documents. If a
       legacy router cannot detunnel and convey tunnel parameters, then
       the AS cannot use VA.
   3.  Legacy routers MUST be able to forward all tunneled packets.
   4.  Every legacy router MUST hold its complete FIB. (Note, of
       course, that this FIB does not necessarily need to contain the
       full DFRT. This might be the case, for instance, if the router
       is an edge router that defaults to a core router.)

   As long as legacy routers participate in tunneling as described
   above, there are no topological restrictions on the legacy routers.
   They may be freely mixed with VA routers without the possibility of
   forming sustained loops (Section 2.2).

3.1.2.  Advertising and Handling Virtual Prefixes (VP)

3.1.2.1.  Distinguishing VPs from Sub-prefixes

   VA routers MUST be able to distinguish VPs from sub-prefixes. This
   is primarily in order to know which routes to install. In
   particular, non-APR routers MUST know which prefixes are VPs before
   they receive routes for those VPs, for instance when they first boot
   up.
   This is in order to avoid the situation where they unnecessarily
   start filling their FIBs with routes that they ultimately don't need
   to install (Section 3.1.5). This leads to the following requirement:

   It MUST be possible to statically configure the complete list of VPs
   into all VA routers. This list is known as the VP-List.

3.1.2.2.  Limitations on Virtual Prefixes

   From the point of view of best-match routing semantics, VPs are
   treated identically to any other prefix. In other words, if the
   longest matching prefix is a VP, then the packet is routed towards
   the VP. If a packet matching a VP reaches an Aggregation Point
   Router (APR) for that VP, and the APR does not have a better matching
   route, then the packet is discarded by the APR (just as a router that
   originates any prefix will discard a packet that does not have a
   better match).

   The overall semantics of VPs, however, are slightly different from
   those of real prefixes. Without VA, when a router originates a route
   for a (real) prefix, the expectation is that the addresses within the
   prefix are within the originating AS (or a customer of the AS). For
   VPs, this is not the case. APRs originate VPs whose sub-prefixes
   exist in different ASes. Because of this, it is important that VPs
   not be advertised across AS boundaries.

   It is up to individual domains to define their own VPs. VPs MUST be
   "larger" (span a larger address space) than any real sub-prefix. If
   a VP is smaller than a real prefix, then packets that match the real
   prefix will nevertheless be routed to an APR owning the VP, at which
   point the packet will be dropped if it does not match a sub-prefix
   within the VP (Section 6).

   (Note that, in principle, there are cases where a VP could be smaller
   than a real prefix. This is where the egress router to the real
   prefix is a VA router.
   In this case, the APR could theoretically tunnel the packet to the
   appropriate remote ASBR, which would then forward the packet
   correctly. On the other hand, if the egress router is a legacy
   router, then the APR could not tunnel matching packets to the egress.
   This is because the egress would view the VP as a better match, and
   would loop the packet back to the APR. For this reason we require
   that VPs be larger than any real prefixes, and that APRs never
   install prefixes larger than a VP in their FIBs.)

   It is valid for a VP to be a subset of another VP. For example, 20/7
   and 20/8 can both be VPs. In fact, this capability is necessary for
   "splitting" a VP without temporarily increasing the FIB size in any
   router (Section 3.1.2.5).

3.1.2.3.  Aggregation Point Routers (APR)

   Any router MAY be configured as an Aggregation Point Router (APR) for
   one or more Virtual Prefixes (VP). For each VP for which a router is
   an APR, the router does the following:

   1.  The APR MUST originate a BGP route to the VP [RFC4271]. In this
       route, the NLRI are all of the VPs for which the router is an
       APR. This is true even for VPs that are a subset of another VP.
       The ORIGIN is set to INCOMPLETE (value 2), the AS number of the
       APR's AS is used in the AS_PATH, and the BGP NEXT_HOP is set to
       the address of the APR. The ATOMIC_AGGREGATE and AGGREGATOR
       attributes are not included.
   2.  The APR MUST attach a NO_EXPORT Communities Attribute [RFC1997]
       to the route.
   3.  The APR MUST be able to detunnel packets addressed to itself at
       its BGP NEXT_HOP address. It MUST also be able to convey the
       tunnel information needed by other routers to initiate tunneled
       packets to it (Requirement R1).
   4.  If a packet is received at the APR whose best match route is the
       VP (i.e. it matches the VP but not any sub-prefixes within the
       VP), then the packet MUST be discarded (see Section 3.1.2.2).
       This can be accomplished by never installing a prefix larger than
       the VP into the FIB, or by installing the VP as a route to
       /dev/null.

3.1.2.3.1.  Selecting APRs

   An ISP is free to select APRs however it chooses. The details of
   this are outside the scope of this document. Nevertheless, a few
   comments are made here. In general, APRs should be selected such
   that the distance to the nearest APR for any VP is small, ideally
   within the same POP. Depending on the number of routers in a POP,
   and the sizes of the FIBs in the routers relative to the DFRT size,
   it may not be possible for all VPs to be represented in a given POP.
   In addition, there should be multiple APRs for each VP, again ideally
   in each POP, so that the failure of one does not unduly disrupt
   traffic.

   Note that, although VPs MUST be larger than real prefixes, there is
   intentionally no mechanism designed to automatically ensure that this
   is the case. Such a mechanism would be dangerous. For instance, if
   an ISP somewhere advertised a very large prefix (a /4, say), then
   this would cause APRs to throw out all VPs that are smaller than
   this. For this reason, VPs MUST be set through static configuration
   only.

3.1.2.4.  Non-APR Routers

   A non-APR router MUST install at least the following routes:

   1.  Routes to VPs (identifiable using the VP-List).
   2.  Routes to all sub-prefixes that are not covered by any VP in the
       VP-List.

   If the non-APR has a tunnel to the BGP NEXT_HOP of any such route, it
   MUST use the tunnel to forward packets to the BGP NEXT_HOP.

   When an APR fails, routers must select another APR to send packets to
   (if there is one). This happens, however, through normal internal
   BGP convergence mechanisms.

3.1.2.5.  Adding and deleting VPs

   An ISP may from time to time wish to reconfigure its VP-List. There
   are a number of reasons for this.
   For instance, early in its deployment an ISP may configure one or a
   small number of VPs in order to test VA. As the ISP gets more
   confident with VA, it may increase the number of VPs. Or, an ISP may
   start with a small number of large VPs (e.g. /4's or even one /0),
   and over time move to a larger number of smaller VPs in order to save
   even more FIB space. In this case, the ISP will need to "split" a
   VP. Finally, since the address space is not uniformly populated with
   prefixes, the ISP may want to change the size of VPs in order to
   balance FIB size across routers. This can involve both splitting and
   merging VPs. Of course, an ISP must be able to modify its VP-List
   without 1) interrupting service to any destinations, or 2)
   temporarily increasing the size of any FIB (i.e. the FIB size during
   the change must be no bigger than the larger of its size before and
   its size after the change).

   Adding a VP is straightforward. The first step is to configure the
   APRs for the VP. This causes the APRs to originate routes for the
   VP. Non-APR routers will install this route according to the rules
   in Section 3.1.2.4 even though they do not yet recognize that the
   prefix is a VP. Subsequently the VP is added to the VP-List of
   non-APR routers. The non-APR routers can then start suppressing the
   sub-prefixes with no loss of service.

   To delete a VP, the process is reversed. First, the VP is removed
   from the VP-Lists of non-APRs. This causes the non-APRs to install
   the sub-prefixes. After all sub-prefixes have been installed, the VP
   may be removed from the APRs.

   In many cases, it is desirable to split a VP. For instance, consider
   the case where two routers, Ra and Rb, are APRs for the same VP. It
   would be possible to shrink the FIB in both routers by splitting the
   VP into two VPs (e.g. split one /6 into two /7's), and assigning
   each router to one of the VPs.
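
   As a non-normative illustration of the /6-to-/7 example, the sketch
   below uses Python's standard ipaddress module to halve a VP and to
   show that best-match semantics select the smaller VP when it overlaps
   the larger one. The helper names (split_vp, best_match), the choice
   of 20.0.0.0/6 as the VP, and the assignment of halves to Ra and Rb
   are assumptions made for the example only; they are not defined by
   this document.

```python
import ipaddress


def split_vp(vp):
    """Split a VP into its two halves, each one bit longer than the parent."""
    return list(ipaddress.ip_network(vp).subnets(prefixlen_diff=1))


def best_match(address, prefixes):
    """Return the longest prefix containing the address, or None."""
    addr = ipaddress.ip_address(address)
    covering = [p for p in prefixes if addr in p]
    return max(covering, key=lambda p: p.prefixlen, default=None)


# Hypothetical VP; the document only says "one /6 into two /7's".
parent = ipaddress.ip_network("20.0.0.0/6")
lower, upper = split_vp("20.0.0.0/6")

# Hypothetical assignment: Ra keeps the lower half, Rb the upper half.
apr_for = {lower: "Ra", upper: "Rb"}

# While the VPs overlap, the VP-List holds the parent and both halves.
vp_list = [parent, lower, upper]

chosen = best_match("22.1.2.3", vp_list)
print(lower, upper, chosen, apr_for[chosen])
```

   Because each half is a longer match than the parent, traffic is
   drawn to the APR of the matching half while the parent VP is still
   configured.
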
   While this could in theory be done by first deleting the larger VP,
   and then adding the smaller VPs, doing so would temporarily increase
   the FIB size in non-APRs, which may not have adequate space for such
   an increase. For this reason, we allow overlapping VPs.

   To split a VP, first the two smaller VPs are added to the VP-Lists of
   all non-APR routers (in addition to the larger superset VP). Next,
   the smaller VPs are added to the selected APRs (which may or may not
   be APRs for the larger VP). Because the smaller VPs are a better
   match than the larger VP, this will cause the non-APR routers to
   forward packets to the APRs for the smaller VPs. Next, the larger VP
   can be removed from the VP-Lists of all non-APR routers. Finally,
   the larger VP can be removed from its APRs.

   To merge two VPs, the new larger VP is configured in all non-APRs.
   This has no effect on FIB size or APR selection, since the smaller
   VPs are better matches. Next, the larger VP is configured in its
   selected APRs. Next, the smaller VPs are deleted from all non-APRs.
   Finally, the smaller VPs are deleted from their corresponding APRs.

3.1.3.  Border VA Routers

   A VA router that is an ASBR MUST do the following:

   1.  When forwarding externally-received routes over iBGP, if a tunnel
       with an inner label is used, the ASBR MUST set the BGP NEXT_HOP
       attribute to itself. Otherwise, the BGP NEXT_HOP attribute is
       left unchanged.
   2.  The ASBR MUST establish tunnels as described in Section 4.
   3.  The ASBR MUST detunnel the packet before forwarding the packet to
       the remote ASBR. In other words, the remote ASBR receives a
       normal untunneled packet identical to the packet it would receive
       without VA.
   4.  The ASBR MUST be able to forward the packet without a FIB lookup.
       In other words, the tunnel information itself contains all the
       information needed by the border router to know which remote ASBR
       should receive the packet.

3.1.4.  Advertising and Handling Sub-Prefixes

   Sub-prefixes are advertised and handled by BGP as normal. VA does
   not affect this behavior. The only difference in the handling of
   sub-prefixes is that they might not be installed in the FIB, as
   described in Section 3.1.5.

   In those cases where the route is installed, packets forwarded to
   prefixes external to the AS MUST be transmitted via the tunnel
   established as described in Section 3.1.3.

3.1.5.  Suppressing FIB Sub-prefix Routes

   Any route not for a known VP (i.e. not in the VP-List) is taken to be
   a sub-prefix. The following rules are used to determine if a
   sub-prefix route can be suppressed.

   1.  A VA router MUST NOT FIB-install a sub-prefix route for which
       there is no tunnel to the BGP NEXT_HOP address. This is to
       prevent a loop whereby the APR forwards the packet hop-by-hop
       towards the next hop, but a router on the path that has
       FIB-suppressed the sub-prefix forwards it back to the APR. If
       there is an alternate route to the sub-prefix for which there is
       a tunnel, then that route SHOULD be selected, even if it is less
       attractive according to the normal BGP best path selection
       algorithm.
   2.  If the router is an APR, a route for every sub-prefix within the
       VP MUST be FIB-installed (subject to the above limitation that
       there be a tunnel).
   3.  If a non-APR router has a sub-prefix route that does not fall
       within any VP (as determined by the VP-List), then the route MUST
       be installed. This may occur because the ISP hasn't defined a VP
       covering that prefix, for instance during an incremental
       deployment buildup.
   4.  If an ASBR is using strict uRPF to do ingress filtering, then it
       MUST install routes for which the remote ASBR is the BGP NEXT_HOP
       [RFC2827]. Note that only an APR may do loose uRPF filtering,
       and then only for routes to sub-prefixes within its VPs.
   5.  All other sub-prefix routes MAY be suppressed. Such "optional"
       sub-prefixes that are nevertheless installed are referred to as
       Popular Prefixes. Note, however, that whether or not to install
       a given sub-prefix SHOULD NOT be based on whether or not there is
       an active route to a VP in the VP-List. This avoids the
       situation whereby, during BGP initialization, the router receives
       some sub-prefix routes before receiving the corresponding VP
       route, with the result that it installs routes in its FIB that it
       will only remove a short time later, possibly even overflowing
       its FIB.

3.1.5.1.  Selecting Popular Prefixes

   Individual routers MAY independently choose which sub-prefixes are
   Popular Prefixes. There is no need for different routers to install
   the same sub-prefixes. There is therefore significant leeway as to
   how routers select Popular Prefixes. As a general rule, routers
   should fill the FIB as much as possible, because the cost of doing so
   is relatively small, and more FIB entries lead to fewer packets
   taking a longer path. Broadly speaking, an ISP may choose to fill
   the FIB by making routers APRs for as many VPs as possible, or by
   assigning relatively few APRs and instead filling the FIB with
   Popular Prefixes. Several basic approaches to selecting Popular
   Prefixes are outlined here. Router vendors are free to implement
   whatever approaches they want.

   1.  Policy-based: The simplest approach for network administrators is
       to have broad policies that routers use to determine which
       sub-prefixes are designated as popular.
       An obvious policy would be a "customer routes" policy, whereby
       all customer routes are installed (as identified for instance by
       appropriate community attribute tags). Another policy would be
       for a router to install prefixes originated by specific ASes.
       For instance, two ISPs could mutually agree to install each
       other's originated prefixes. A third policy might be to install
       prefixes with the shortest AS-path.
   2.  Static list: Another approach would be to configure static lists
       of specific prefixes to install. For instance, prefixes
       associated with an SLA might be configured. Or, a list of
       prefixes for the most popular websites might be installed.
   3.  High-volume prefixes: By installing high-volume prefixes as
       Popular Prefixes, the latency and load associated with the longer
       path required by VA is minimized. One approach would be for an
       ISP to measure its traffic volume over time (days or a few
       weeks), and statically configure high-volume prefixes as Popular
       Prefixes. There is strong evidence that prefixes that are
       high-volume tend to remain high-volume over multi-day or
       multi-week timeframes (though not necessarily at short timeframes
       like minutes or seconds). High-volume prefixes MAY also be
       installed dynamically. In other words, a router measures its own
       traffic volumes, and installs and removes Popular Prefixes in
       response to short term traffic load. The downside of this
       approach is that it complicates debugging network problems. If
       packets are being dropped somewhere in the network, it is more
       difficult to find out where if the selected path can change
       dynamically.

3.2.  New Configuration

   VA places new configuration requirements on ISP administrators.
   Namely, the administrator must:

   1.  Select VPs, and configure the VP-List into all VA routers.
       As a general rule, having a larger number of relatively small VPs
       gives administrators the most flexibility in terms of filling
       available FIB with sub-prefixes, and in terms of balancing load
       across routers. Once an administrator has selected a VP-List, it
       is just as easy to configure routers with a large list as a small
       list. We can expect network operator groups like NANOG to
       compile good VP-Lists that ISPs can then adopt. A good list
       would be one where the number of VPs is relatively large, say 100
       or so (noting again that each VP must be larger than any real
       prefix), and the number of sub-prefixes within each VP is roughly
       the same.
   2.  Select and configure APRs. There are three primary
       considerations here. First, there must be enough APRs to handle
       reasonable APR failure scenarios. Second, APR assignment should
       not result in router overload. Third, particularly long paths
       should be avoided. Ideally there should be two APRs for each VP
       within each PoP, but this may not be possible for small PoPs.
       Failing this, there should be at least two APRs in each
       geographical region, so as to minimize path length increase.
       Routers should have the appropriate counters to allow
       administrators to know the volume of APR traffic each router is
       handling so as to adjust load by adding or removing APR
       assignments.
   3.  Select and configure Popular Prefixes or Popular Prefix policies.
       There are two general goals here. The first is to minimize load
       overall by minimizing the number of packets that take longer
       paths. The second is to ensure that specific selected prefixes
       don't have overly long paths. These goals must be weighed
       against the administrative overhead of configuring potentially
       thousands of Popular Prefixes. As one example, a small ISP may
       wish to keep it simple by doing nothing more than indicating that
       customer routes should be installed.
       In this case, the administrator could otherwise assign as many
       APRs as possible while leaving enough FIB space for customer
       routes. As another example, a large ISP could build a management
       system that takes into consideration the traffic matrix, customer
       SLAs, robustness requirements, FIB sizes, topology, and router
       capacity, and periodically automatically computes APR and Popular
       Prefix assignments.

4.  Usage of Tunnels

4.1.  MPLS tunnels

   VA utilizes a straightforward application of MPLS. The tunnels are
   MPLS Label Switched Paths (LSP), and are signaled using either the
   Label Distribution Protocol (LDP) [RFC5036] or RSVP-TE [RFC3209].
   Both VA and legacy routers MUST participate in this signaling.

   APRs and ASBRs initiate tunnels. In both cases, Downstream
   Unsolicited tunnels are initiated to all IGP neighbors with the full
   BGP NEXT_HOP address as the Forwarding Equivalence Class (FEC). In
   the case of APRs, the BGP NEXT_HOP is the APR's own address. In the
   case of legacy ASBRs, the BGP NEXT_HOP is the ASBR's own address. In
   the case of VA ASBRs, the BGP NEXT_HOP is that of the remote ASBR.

   Existing Penultimate Hop Popping (PHP) mechanisms in the data plane
   can be used for forwarding packets to remote ASBRs.

4.2.  Usage of Inner Label

   Besides using a separate LSP to identify the remote ASBR as described
   above, it is also possible to use an inner label to identify the
   remote ASBR. Either an outer label or an IP tunnel identifies the
   local ASBR.

   When a local ASBR advertises a route into iBGP, it sets the NEXT_HOP
   to itself, and assigns a label to the route. This label is used as
   the inner label, and identifies the remote ASBR from which the route
   was received [RFC3107].

   The presence of the inner label in the iBGP update acts as the signal
   to the receiving router that an inner label MUST be used in packets
   tunneled to the NEXT_HOP address.
   If there is an LSP established targeted to the NEXT_HOP address, then
   it is used to tunnel the packet to the NEXT_HOP address. Otherwise,
   an IP header addressed to the NEXT_HOP address is used.

5.  IANA Considerations

   There are no IANA considerations.

6.  Security Considerations

   We consider the security implications of VA under two scenarios, one
   where VA is configured and operated correctly, and one where it is
   mis-configured. A cornerstone of VA operation is that the basic
   behavior of BGP doesn't change, especially inter-domain. Among other
   things, this makes it easier to reason about security.

6.1.  Properly Configured VA

   If VA is configured and operated properly, then the external behavior
   of an AS does not change. The same upstream ASes are selected, and
   the same prefixes and AS-paths are advertised. Therefore, a properly
   configured VA domain has no security impact on other domains.

   If another ISP starts advertising a prefix that is larger than a
   given VP, this prefix will be ignored by APRs that have a VP that
   falls within the larger prefix (Section 3.1.2.3). As a result,
   packets that might otherwise have been routed to the new larger
   prefix will be dropped at the APRs. Note that the trend in the
   Internet is towards large prefixes being broken up into smaller ones,
   not the reverse. Therefore, such a larger prefix is likely to be
   invalid. If it is determined without a doubt that the larger prefix
   is valid, then the ISP will have to reconfigure its VPs.

   VA does not change an ISP's ability to do ingress filtering using
   strict uRPF (Section 3.1.5).

   Regarding DoS attacks, there are two issues that need to be
   considered. First, does VA result in new types of DoS attacks?
   Second, does VA make it more difficult to deploy DoS defense systems?
   Regarding the first issue, one possibility is that an attacker
   targets a given router by flooding the network with traffic to
   prefixes that are not popular, and for which that router is an APR.
   This would cause a disproportionate amount of traffic to be forwarded
   to the APR(s). While it is up to individual ISPs to decide if this
   attack is a concern, it does not strike the authors that this attack
   is likely to significantly worsen the DoS problem.

   Many DoS defense systems use dynamically established Routing Table
   entries to divert victims' traffic into LSPs that carry the traffic
   to scrubbers. This mechanism works with VA: it simply overrides
   whatever route is in place. This mechanism works equally well with
   APRs and non-APRs.

6.2.  Mis-configured VA

   VA introduces the possibility that a VP is advertised outside of an
   AS. This in fact should be a low probability event, but it is
   considered here nonetheless.

   If an AS leaks a large VP (i.e. larger than any real prefixes), then
   the impact is minimal. Smaller prefixes will be preferred because of
   best-match semantics, and so the only impact is that packets that
   otherwise have no matching routes will be sent to the misbehaving AS
   and dropped there. If an AS leaks a small VP (i.e. smaller than a
   real prefix), then packets matching that VP will be hijacked by the
   misbehaving AS and dropped. This can happen with or without VA, and
   so doesn't represent a new security problem per se.

7.  Acknowledgements

   The authors would like to acknowledge the efforts of Xinyang Zhang
   and Jia Wang, who worked on CRIO (Core Router Integrated Overlay), an
   early inter-domain variant of FIB suppression, and the efforts of
   Hitesh Ballani and Tuan Cao, who worked on the configuration-only
   variant of VA that works with legacy routers.
   We would also like to thank Scott Brim, Daniel Ginsburg, and Rajiv
   Asati for their helpful comments. In particular, Daniel's comments
   significantly simplified the spec (eliminating the need for a new
   External Communities Attribute).

8.  References

8.1.  Normative References

   [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
              Communities Attribute", RFC 1997, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2827]  Ferguson, P. and D. Senie, "Network Ingress Filtering:
              Defeating Denial of Service Attacks which employ IP Source
              Address Spoofing", BCP 38, RFC 2827, May 2000.

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
              BGP-4", RFC 3107, May 2001.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, December 2001.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4601]  Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
              "Protocol Independent Multicast - Sparse Mode (PIM-SM):
              Protocol Specification (Revised)", RFC 4601, August 2006.

   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
              Specification", RFC 5036, October 2007.

8.2.  Informative References

   [I-D.ietf-grow-simple-va]
              Francis, P., Xu, X., Ballani, H., Raszuk, R., and L.
              Zhang, "Simple Virtual Aggregation (S-VA)",
              draft-ietf-grow-simple-va-00 (work in progress),
              March 2010.

   [I-D.ietf-grow-va-gre]
              Francis, P., Raszuk, R., and X. Xu, "GRE and IP-in-IP
              Tunnels for Virtual Aggregation",
              draft-ietf-grow-va-gre-00 (work in progress), July 2009.

   [I-D.ietf-grow-va-mpls]
              Francis, P. and X.
              Xu, "MPLS Tunnels for Virtual Aggregation",
              draft-ietf-grow-va-mpls-00 (work in progress), May 2009.

   [I-D.ietf-grow-va-mpls-innerlabel]
              Xu, X. and P. Francis, "Proposal to use an inner MPLS
              label to identify the remote ASBR VA",
              draft-ietf-grow-va-mpls-innerlabel-00 (work in progress),
              September 2009.

   [nsdi09]   Ballani, H., Francis, P., Cao, T., and J. Wang, "Making
              Routers Last Longer with ViAggre", ACM Usenix NSDI 2009,
              http://www.usenix.org/events/nsdi09/tech/full_papers/
              ballani/ballani.pdf, April 2009.

Authors' Addresses

   Paul Francis
   Max Planck Institute for Software Systems
   Gottlieb-Daimler-Strasse
   Kaiserslautern 67633
   Germany

   Phone: +49 631 930 39600
   Email: francis@mpi-sws.org

   Xiaohu Xu
   Huawei Technologies
   No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District
   Beijing, Beijing 100085
   P.R.China

   Phone: +86 10 82836073
   Email: xuxh@huawei.com

   Hitesh Ballani
   Cornell University
   4130 Upson Hall
   Ithaca, NY 14853
   US

   Phone: +1 607 279 6780
   Email: hitesh@cs.cornell.edu

   Dan Jen
   UCLA
   4805 Boelter Hall
   Los Angeles, CA 90095
   US

   Phone:
   Email: jenster@cs.ucla.edu

   Robert Raszuk
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA 95134
   USA

   Phone:
   Email: raszuk@cisco.com

   Lixia Zhang
   UCLA
   3713 Boelter Hall
   Los Angeles, CA 90095
   US

   Phone:
   Email: lixia@cs.ucla.edu
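
   As a non-normative illustration of the FIB suppression rules in
   Section 3.1.5, the following sketch classifies a single IPv4 route.
   The function name classify_route, its parameters, and the returned
   strings are hypothetical and exist only for this example. The sketch
   also assumes that the tunnel requirement of rule 1 is checked before
   every other rule, which is one reading of the text rather than a
   stated ordering.

```python
import ipaddress


def classify_route(prefix, vp_list, apr_vps, has_tunnel,
                   strict_urpf_peer=False):
    """Classify one IPv4 route per the rules sketched from Section 3.1.5.

    prefix           -- the route's prefix, e.g. "25.1.0.0/16"
    vp_list          -- the statically configured VP-List
    apr_vps          -- VPs for which this router is an APR
    has_tunnel       -- True if a tunnel to the BGP NEXT_HOP exists
    strict_urpf_peer -- True if this router is an ASBR doing strict uRPF
                        and the NEXT_HOP is the remote ASBR (rule 4)
    """
    net = ipaddress.ip_network(prefix)
    vps = [ipaddress.ip_network(v) for v in vp_list]
    apr_set = {ipaddress.ip_network(v) for v in apr_vps}

    if net in vps:
        return "vp"                      # not a sub-prefix at all
    if not has_tunnel:
        return "must-not-install"        # rule 1 (assumed to apply first)
    covering = [v for v in vps if net.subnet_of(v)]
    if not covering:
        return "must-install"            # rule 3: no VP covers it
    if any(v in apr_set for v in covering):
        return "must-install"            # rule 2: router is APR for the VP
    if strict_urpf_peer:
        return "must-install"            # rule 4: strict uRPF at the ASBR
    return "may-suppress"                # rule 5: Popular Prefix candidate


# Hypothetical configuration: two VPs, this router is APR for the first.
vp_list = ["20.0.0.0/7", "24.0.0.0/6"]
apr_vps = ["20.0.0.0/7"]
print(classify_route("25.1.0.0/16", vp_list, apr_vps, has_tunnel=True))
```

   A route classified as "may-suppress" that a router installs anyway
   is a Popular Prefix in the terminology of Section 3.1.5.
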