Network Working Group                                         P. Francis
Internet-Draft                                                   MPI-SWS
Intended status: Informational                                     X. Xu
Expires: August 26, 2011                                          Huawei
                                                              H. Ballani
                                                              Cornell U.
                                                                  D. Jen
                                                                    UCLA
                                                               R. Raszuk
                                                                   Cisco
                                                                L. Zhang
                                                                    UCLA
                                                       February 22, 2011

                FIB Suppression with Virtual Aggregation
                       draft-ietf-grow-va-04.txt

Abstract

   The continued growth in the Default Free Routing Table (DFRT)
   stresses the global routing system in a number of ways. One of the
   most costly stresses is FIB size: ISPs often must upgrade router
   hardware simply because the FIB has run out of space, and router
   vendors must design routers that have adequate FIB capacity. FIB
   suppression is an approach to relieving stress on the FIB by NOT
   loading selected RIB entries into the FIB. Virtual Aggregation (VA)
   allows ISPs to shrink the FIBs of any and all routers, easily by an
   order of magnitude with negligible increase in path length and load.
   FIB suppression can be deployed autonomously by an ISP (cooperation
   between ISPs is not required), and can co-exist with legacy routers
   in the ISP. There are no changes from the 03 version.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 26, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Scope of this Document
      1.2. Requirements notation
      1.3. Terminology
   2. Overview of Virtual Aggregation (VA)
      2.1. Mix of legacy and VA routers
      2.2. Summary of Tunnels and Paths
   3. Specification of VA
      3.1. VA Operation
         3.1.1. Legacy Routers
         3.1.2. Advertising and Handling Virtual Prefixes (VP)
         3.1.3. Border VA Routers
         3.1.4. Advertising and Handling Sub-Prefixes
         3.1.5. Suppressing FIB Sub-prefix Routes
      3.2. New Configuration
   4. Usage of Tunnels
      4.1. MPLS tunnels
      4.2. Usage of Inner Label
   5. IANA Considerations
   6. Security Considerations
      6.1. Properly Configured VA
      6.2. Mis-configured VA
   7. Acknowledgements
   8. References
      8.1. Normative References
      8.2. Informative References
   Authors' Addresses

1. Introduction

   ISPs today manage constant DFRT growth in a number of ways. One way,
   of course, is for ISPs to upgrade their router hardware before DFRT
   growth outstrips the size of the FIB. This is too expensive for many
   ISPs. They would prefer to extend the lifetime of routers whose FIBs
   can no longer hold the full DFRT.

   A common approach taken by lower-tier ISPs is to default route to
   their providers. Routes to customers and peer ISPs are maintained,
   but everything else defaults to the provider. This approach has
   several disadvantages. First, packets to Internet destinations may
   take longer-than-necessary AS paths. This problem can be mitigated
   through careful configuration of partial defaults, but this can
   require substantial configuration overhead. A second problem with
   defaulting to providers is that the ISP is no longer able to provide
   the full DFRT to its customers. Finally, defaulting to providers
   prevents the ISP from detecting martian packets. As a result, the
   ISP transmits packets over its expensive provider links that it
   could otherwise have dropped.

   An alternative is for the ISP to maintain full routes in its core
   routers, but to filter routes from edge routers that do not require a
   full DFRT. These edge routers can then default route to the core
   routers. This is often possible with edge routers that interface to
   customer networks.
   The problem with this approach is that it cannot
   be used for all edge routers. For instance, it cannot be used for
   routers that connect to transits. It of course also does not help in
   cases where core routers themselves have inadequate FIB capacity.

   FIB Suppression is an approach to shrinking FIB size that requires no
   changes to BGP, no changes to packet forwarding mechanisms in
   routers, and relatively minor changes to control mechanisms in
   routers and configuration of those mechanisms. The core idea behind
   FIB suppression is to run BGP as normal, and in particular to not
   shrink the RIB, but rather to not load certain RIB entries into the
   FIB. This approach minimizes changes to routers, and in particular
   is simpler than more general routing architectures that try to shrink
   both RIB and FIB. With FIB suppression, there are no changes to BGP
   per se. The BGP decision process does not change. The selected
   AS-path does not change, and except on rare occasion the exit router
   does not change. ISPs can deploy FIB suppression autonomously and
   with no coordination with neighboring ASes.

   This document describes an approach to FIB suppression called
   "Virtual Aggregation" (VA). VA operates by organizing the IP (v4 or
   v6) address space into Virtual Prefixes (VP), and using tunnels to
   aggregate the (regular) sub-prefixes within each VP. The decrease in
   FIB size can be dramatic, easily 5x or 10x with only a slight path
   length and router load increase [nsdi09]. The VPs can be organized
   such that all routers in an ISP see FIB size decrease, or in such a
   way that "core" routers keep the full FIB, and "edge" routers have
   almost no FIB (i.e. by defining a VP of 0/0). This "core-edge" style
   of VA deployment is much simpler than a "full" VA deployment, whereby
   multiple VPs are defined, and any router, core or otherwise, can have
   reduced FIB size.
   This simpler "core-edge" style of deployment is
   specified in a separate draft in order to make it more easily
   understandable [I-D.ietf-grow-simple-va].

   VA has the following characteristics:

   o  it is robust to router failure,
   o  it allows for traffic engineering,
   o  it allows for existing inter-domain routing policies,
   o  it operates in a predictable manner and is therefore possible to
      test, debug, and reason about performance (i.e. establish SLAs),
   o  it can be safely installed, tested, and started up,
   o  it can be configured and reconfigured without service
      interruption,
   o  it can be incrementally deployed, and in particular can be
      operated in an AS with a mix of VA-capable and legacy routers,
   o  it accommodates existing security mechanisms such as unicast
      Reverse Path Forwarding (uRPF) ingress filtering and DoS defense,
   o  it does not introduce significant new security vulnerabilities.

1.1. Scope of this Document

   The scope of this document is limited to Intra-domain VA operation,
   that is, the case where a single ISP autonomously operates VA
   internally without any coordination with neighboring ISPs.

   Note that this document assumes that the VA "domain" (i.e. the unit
   of autonomy) is the AS (that is, different ASes run VA independently
   and without coordination). For the remainder of this document, the
   terms ISP, AS, and domain are used interchangeably.

   This document applies equally to IPv4 and IPv6.

   VA may operate with a mix of upgraded routers and legacy routers.
   There are no topological restrictions placed on the mix of routers.
   In order to avoid loops between upgraded and legacy routers, packets
   are always tunneled by the VA routers to the BGP NEXT_HOPs of the
   matched BGP routes. If a given local ASBR is a legacy router, it
   must be able to terminate tunnels.

1.2. Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

1.3. Terminology

   Aggregation Point Router (APR): An Aggregation Point Router (APR) is
      a router that aggregates a Virtual Prefix (VP) by installing
      routes (into the FIB) for all of the sub-prefixes within the VP.
      APRs advertise the VP to other routers with BGP. For each
      sub-prefix within the VP, APRs have a tunnel from themselves to
      the remote ASBR (Autonomous System Border Router) where packets
      for that prefix should be delivered.
   Install and Suppress: The terms "install" and "suppress" are used to
      describe whether a RIB entry has been loaded or not loaded into
      the FIB. In other words, the phrase "install a route" means
      "install a route into the FIB", and the phrase "suppress a route"
      means "do not install a route into the FIB".
   Legacy Router: A router that does not run VA, and has no knowledge
      of VA. Legacy routers, however, must be able to terminate tunnels
      when they are local ASBRs.
   non-APR Router: In discussing VPs, it is often necessary to
      distinguish between routers that are APRs for that VP, and routers
      that are not APRs for that VP (but of course may be APRs for other
      VPs not under discussion). In these cases, the term "APR" is
      taken to mean "a VA router that is an APR for the given VP", and
      the term "non-APR" is taken to mean "a VA router that is not an
      APR for the given VP". The term non-APR router is not used to
      refer to legacy routers.
   Popular Prefix: A Popular Prefix is a sub-prefix that is installed
      in a router in addition to the sub-prefixes it holds by virtue of
      being an Aggregation Point Router. The Popular Prefix allows
      packets to follow the shortest path.
      Note that different routers
      do not need to have the same set of Popular Prefixes.
   Routing Information Base (RIB): The term RIB is used loosely
      in this document to refer either to the Loc-RIB (as used in
      [RFC4271]), or to the combined Adj-RIBs-In, the Loc-RIB, and the
      Adj-RIBs-Out.
   Sub-Prefix: A regular (physically aggregatable) prefix. These are
      equivalent to the prefixes that would normally comprise the DFRT
      in the absence of VA. A VA router will contain a sub-prefix entry
      either because the sub-prefix falls within a Virtual Prefix for
      which the router is an APR, or because the sub-prefix is installed
      as a Popular Prefix. Legacy routers hold the same sub-prefixes
      that they hold today.
   Tunnel: This draft specifies the use of MPLS Label Switched Paths
      (LSP), and of MPLS inner labels tunneled over either LSPs or IP
      headers. Other types of tunnels may be used, but are not
      specified here. This document generically uses the term tunnel to
      refer to any of these tunnel types.
   VA router: A router that operates Virtual Aggregation according to
      this document.
   Virtual Prefix (VP): A Virtual Prefix (VP) is a prefix used to
      aggregate its contained regular prefixes (sub-prefixes). The
      sub-prefixes in a VP are not physically aggregatable, and so
      they are aggregated at APRs through the use of tunnels.
   VP-List: A list of defined VPs. All routers must agree on the
      contents of this list (which is statically configured into every
      VA router).

2. Overview of Virtual Aggregation (VA)

   For descriptive simplicity, this section starts by describing VA
   assuming that there are no legacy routers in the domain. Section 2.1
   overviews the additional functions required by VA routers to
   accommodate legacy routers.
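
   As an informal illustration of the terms defined in Section 1.3, the
   following Python sketch groups a few sub-prefixes under the VPs that
   contain them. The VP-List, the sub-prefixes, and the function name
   group_by_vp are hypothetical and chosen only for illustration;
   nothing in the sketch is specified by this document.

```python
import ipaddress
from collections import defaultdict

# Hypothetical VP-List and RIB contents, chosen only for illustration.
VP_LIST = ["20.0.0.0/6", "24.0.0.0/6"]
RIB_SUB_PREFIXES = ["20.1.0.0/16", "22.9.0.0/16", "25.0.0.0/8", "192.0.2.0/24"]


def group_by_vp(sub_prefixes, vp_list):
    """Map each VP to the sub-prefixes it contains.

    Sub-prefixes that fall within no VP are collected under the key None.
    """
    vps = [ipaddress.ip_network(vp) for vp in vp_list]
    groups = defaultdict(list)
    for text in sub_prefixes:
        net = ipaddress.ip_network(text)
        covering = [vp for vp in vps
                    if vp.version == net.version and net.subnet_of(vp)]
        # If VPs are nested, pick the most specific covering VP.
        best = max(covering, key=lambda vp: vp.prefixlen, default=None)
        key = str(best) if best is not None else None
        groups[key].append(text)
    return dict(groups)


groups = group_by_vp(RIB_SUB_PREFIXES, VP_LIST)
for vp, subs in groups.items():
    print(vp, subs)
```

   In this sketch, an APR for a given VP would install the sub-prefixes
   grouped under that VP, while the sub-prefixes under the key None fall
   within no VP and so cannot be reached through any APR.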

   A key concept behind VA is to operate BGP as normal, and in
   particular to populate the RIB with the full DFRT, but to suppress
   many or most prefixes from being loaded into the FIB. By populating
   the RIB as normal, we avoid any changes to BGP, and changes to router
   operation are relatively minor. The basic idea behind VA is quite
   simple. The address space is partitioned into large prefixes,
   larger than any aggregatable prefix in use today. These prefixes are
   called Virtual Prefixes (VP). Different VPs do not need to be the
   same size. They may be a mix of /6, /7, /8 (for IPv4), and so on.
   Indeed, an ISP can define a single /0 VP, and use it for a core/edge
   type of configuration [I-D.ietf-grow-simple-va]. That is, the core
   routers would maintain full FIBs, and edge routers could maintain
   default routes to the core routers, and suppress as much of the FIB
   as they wish. Each ISP can independently select the size of its VPs.

   VPs are not themselves topologically aggregatable. VA makes the VPs
   aggregatable through the use of tunnels, as follows. Associated with
   each VP are one or more "Aggregation Point Routers" (APR). An APR
   (for a given VP) is a router that installs routes for all
   sub-prefixes (i.e. real physically aggregatable prefixes) within the
   VP. Note that an APR is not a special router per se; it is an
   otherwise normal router that is configured to operate as an APR. By
   "install routes" here, we mean:

   1. The route for each of the sub-prefixes is loaded into the FIB,
      and
   2. there is a tunnel from the APR to the BGP NEXT_HOP for the route.

   The APR originates a BGP route to the VP. This route is distributed
   within the domain, but not outside the domain. With this structure
   in place, a packet transiting the ISP goes from the ingress router to
   the APR (usually via a tunnel), and then from the APR to the BGP
   NEXT_HOP router via a tunnel.
   VA can operate with MPLS LSPs, or with
   MPLS inner labels over LSPs or IP headers. Section 4 specifies the
   usage of tunnels.

   The BGP NEXT_HOP can be either the local ASBR or the remote ASBR. In
   the former case, an inner label is used to tunnel packets
   (Section 4.2). In either case, all tunnel headers are stripped by
   the local ASBR before the packet is delivered to the remote ASBR. In
   other words, the remote ASBR sees a normal IP packet, and is
   completely unaware of the existence of VA in the neighboring ISP.
   Note that legacy ASBRs MUST set themselves as the BGP NEXT_HOP.

   Note that the AS-path is not affected at all by VA. This means among
   other things that AS-level policies are not affected by VA. The
   packet may not, however, follow the shortest path within the ISP
   (where shortest path is defined here as the path that would have been
   taken if VA were not operating), because the APR may not be on the
   shortest path between the ingress and egress routers. When this
   happens, the packet experiences additional latency and creates extra
   load (by virtue of taking more hops than it otherwise would have).
   Note also that, with VA, a packet may occasionally take a different
   exit point than it otherwise would have.

   VA can avoid traversing the APR for selected routes by installing
   these routes in non-APR routers. In other words, even if an ingress
   router is not an APR for a given sub-prefix, it MAY install that
   sub-prefix into its FIB. Packets in this case are tunneled directly
   from the ingress to the BGP NEXT_HOP. These extra routes are called
   "Popular Prefixes", and are typically installed for policy reasons
   (e.g. customer routes are always installed), or for sub-prefixes that
   carry a high volume of traffic (Section 3.1.5.1). Different routers
   MAY have different Popular Prefixes.
   As such, an ISP MAY assign
   Popular Prefixes per router, per POP, or uniformly across the ISP. A
   given router MAY have zero Popular Prefixes, or the majority of its
   FIB MAY consist of Popular Prefixes. The effectiveness of Popular
   Prefixes in reducing traffic load relies on the fact that traffic
   volumes follow something like a power-law distribution: i.e. that 90%
   of traffic is destined to 10% of the destinations. Internet traffic
   measurement studies over the years have consistently shown that
   traffic patterns follow this distribution, though there is no
   guarantee that they always will.

   Note that for routing to work properly, every packet must sooner or
   later reach a router that has installed a sub-prefix route that
   matches the packet. This would obviously be the case for a given
   sub-prefix if every router has installed a route for that sub-prefix
   (which of course is the situation in the absence of VA). If this is
   not the case, then there MUST be at least one Aggregation Point
   Router (APR) for the sub-prefix's Virtual Prefix (VP). Ideally,
   every POP contains at least two APRs for every Virtual Prefix. By
   having APRs in every POP, the latency imposed by routing to the APR
   is minimal (the extra hop is within the POP). By having more than
   one APR, there is a redundant APR should one fail. In practice it is
   often not possible to have an APR for every VP in every POP. This is
   because some POPs may have only one or a few routers, and therefore
   there may not be enough cumulative FIB space in the POP to hold
   every sub-prefix. Note that any router ("edge", "core", etc.) MAY
   be an APR.

   It is important that both the contents of BGP RIBs, as well as the
   contents of the Routing Table (as defined in Section 3.2 of
   [RFC4271]) not be modified by VA (other than the introduction of
   routes to VPs).
   This is because PIM-SM [RFC4601] relies on the
   contents of the Routing Table to build its own trees and forwarding
   table. Therefore, FIB suppression MUST take place between the
   Routing Table and the actual FIB(s).

2.1. Mix of legacy and VA routers

   It is important that an ISP be able to operate with a mix of "VA
   routers" (routers upgraded to operate VA as described in this
   document) and "legacy routers". This allows ISPs to deploy VA in an
   incremental fashion and to continue to use routers that for whatever
   reason cannot be upgraded. This document allows such a mix, and
   indeed places no topological restrictions on that mix. It does,
   however, require that legacy routers (and VA routers for that matter)
   be able to forward already-tunneled packets, be able to serve as
   tunnel endpoints, and be able to participate in distribution of
   tunnel information required to establish themselves as tunnel
   endpoints. (This is listed as Requirement R5 in the companion
   tunneling documents.) Depending on the tunnel type, legacy routers
   MAY also be able to initiate tunneled packets, though this is an
   OPTIONAL requirement. (This is listed as Requirement R4 in the
   companion tunneling documents.) Legacy routers MUST use their own
   address as the BGP NEXT_HOP, and MUST FIB-install routes for which
   they are the BGP NEXT_HOP.

2.2. Summary of Tunnels and Paths

   To summarize, the following tunnels are created:

   1. From all VA routers to all BGP NEXT_HOP addresses (where the BGP
      NEXT_HOP address is either an APR, a local ASBR, or the remote
      ASBR neighbor of a VA router). Note that this is listed as
      Requirement R3 in the companion tunneling documents.
   2. Optionally, from all legacy routers to all BGP NEXT_HOP
      addresses.

   There are a number of possible paths that packets may take through an
   ISP, summarized in the following diagram.
   Here, "VA" is a VA router,
   "LR" is a legacy router, the symbol "==>" represents a tunneled
   packet (through zero or more routers), "-->" represents an untunneled
   packet, and "(pop)" represents stripping the tunnel header. The
   symbol "::>" represents the portion of the path where although the
   tunnel is targeted to the receiving node, the outer header has been
   stripped. (Note that the remote ASBR may actually be a legacy router
   or a VA router; it doesn't matter (and isn't known) to the ISP.)

                                             Egress
                                             Router
      Ingress      Some       APR            (Local     Remote
      Router       Router     Router         ASBR)      ASBR
      -------      ------     ------         ------     ------
   1.   VA===================>VA============>VA(pop)::::>LR

   2.   VA===================>VA============>LR--------->LR

   3.   VA==================================>VA(pop)::::>LR

   4.   VA==================================>LR--------->LR

      (The following two exist in the case where legacy routers
      can initiate tunneled packets.)

   5.   LR==================================>VA(pop)::::>LR

   6.   LR==================================>LR--------->LR

      (The following two exist in the case where legacy routers
      cannot initiate tunneled packets.)

   7.   LR-------->VA (remaining paths as in 1 to 4 above)

   8.   LR-------->LR----------------------->LR--------->LR

   The first and second paths represent the case where the ingress
   router does not have a Popular Prefix for the destination, and MUST
   tunnel the packet to an APR. The third and fourth paths represent
   the case where the ingress router does have a Popular Prefix for the
   destination, and so tunnels the packet directly to the egress. The
   fifth and sixth paths are similar to the third and fourth paths
   respectively, but where the ingress is a legacy router that can
   initiate tunneled packets, and effectively has the Popular Prefix by
   virtue of holding the entire DFRT.
   (Note that some ISPs have only
   partial RIBs in their customer-facing edge routers, and default route
   to a router that holds the full DFRT. This case is not shown here,
   but works perfectly well.) Finally, paths 7 and 8 represent the case
   where legacy routers cannot initiate a tunneled packet.

   VA prevents the routing loops that might otherwise occur when VA
   routers and legacy routers are mixed. The trick is avoiding the case
   where a legacy router is forwarding packets towards the BGP NEXT_HOP,
   while a VA router is forwarding packets towards the APR, with each
   router thinking that the other is on the shortest path to their
   respective targets.

   In the first four types of path, the loop is avoided because tunnels
   are used all the way to the egress. As a result, there is never an
   opportunity for a legacy router to try to route based on the
   destination address unless the legacy router is the egress, in which
   case it forwards the packet to the remote ASBR.

   In the 5th and 6th cases, the ingress is a legacy router, but this
   router can initiate tunnels and has the full FIB, and so simply
   tunnels the packet to the egress router.

   In the 7th and 8th cases, the legacy ingress cannot initiate tunnels,
   and so forwards the packet hop-by-hop towards the BGP NEXT_HOP. The
   packet will work its way towards the egress router, and will either
   progress through a series of legacy routers (in which case the IGP
   prevents loops), or it will eventually reach a VA router, after which
   it will take tunnels as in the 1st and 2nd cases.

3. Specification of VA

   This section describes in detail how to operate VA. It starts with a
   brief discussion of requirements, followed by a specification of
   router support for VA.

3.1. VA Operation

   In this section, the detailed operation of VA is specified.

3.1.1. Legacy Routers

   VA can operate with a mix of VA and legacy routers.
   To prevent the
   types of loops described in Section 2.2, however, legacy routers MUST
   satisfy the following requirements:

   1. When forwarding externally-received routes over iBGP, the BGP
      NEXT_HOP attribute MUST be set to the legacy router itself.
   2. Legacy routers MUST be able to detunnel packets addressed to
      themselves at the BGP NEXT_HOP address. They MUST also be able
      to convey the tunnel information needed by other routers to
      initiate tunneled packets to them. This is listed as
      "Requirement R1" in the companion tunneling documents. If a
      legacy router cannot detunnel and convey tunnel parameters, then
      the AS cannot use VA.
   3. Legacy routers MUST be able to forward all tunneled packets.
   4. Every legacy router MUST hold its complete FIB. (Note, of
      course, that this FIB does not necessarily need to contain the
      full DFRT. This might be the case, for instance, if the router
      is an edge router that defaults to a core router.)

   As long as legacy routers participate in tunneling as described
   above, there are no topological restrictions on the legacy routers.
   They may be freely mixed with VA routers without the possibility of
   forming sustained loops (Section 2.2).

3.1.2. Advertising and Handling Virtual Prefixes (VP)

3.1.2.1. Distinguishing VPs from Sub-prefixes

   VA routers MUST be able to distinguish VPs from sub-prefixes. This
   is primarily in order to know which routes to install. In
   particular, non-APR routers MUST know which prefixes are VPs before
   they receive routes for those VPs, for instance when they first boot
   up. This is in order to avoid the situation where they unnecessarily
   start filling their FIBs with routes that they ultimately don't need
   to install (Section 3.1.5). This leads to the following requirement:

   It MUST be possible to statically configure the complete list of VPs
   into all VA routers. This list is known as the VP-List.
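
   The role of the VP-List can be sketched in Python as follows. This
   is a minimal illustration, not part of the specification: the
   function name classify_route, the returned labels, and the example
   prefixes are all hypothetical. It shows how a router holding a
   statically configured VP-List could tell, as each route arrives,
   whether the prefix is a VP, a sub-prefix covered by a VP (and
   therefore a candidate for suppression on a router that is not an APR
   for that VP), or a sub-prefix that no VP covers.

```python
import ipaddress


def classify_route(prefix, vp_list):
    """Classify a received prefix against a statically configured VP-List.

    Returns one of "vp", "covered-sub-prefix", or "uncovered-sub-prefix".
    The labels are illustrative names, not terms defined by the draft.
    """
    net = ipaddress.ip_network(prefix)
    vps = [ipaddress.ip_network(vp) for vp in vp_list]
    if net in vps:
        return "vp"
    for vp in vps:
        if vp.version == net.version and net.subnet_of(vp):
            # Covered by a VP: a non-APR for that VP need not install it.
            return "covered-sub-prefix"
    # No VP covers this prefix, so no APR can stand in for it.
    return "uncovered-sub-prefix"


# Hypothetical VP-List, including a VP nested inside another VP.
VP_LIST = ["20.0.0.0/7", "20.0.0.0/8"]

for route in ["20.0.0.0/8", "21.4.0.0/16", "192.0.2.0/24"]:
    print(route, classify_route(route, VP_LIST))
```

   Because the list is configured before any BGP routes arrive, the
   classification does not depend on the order in which VP routes and
   sub-prefix routes are received.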

3.1.2.2. Limitations on Virtual Prefixes

   From the point of view of best-match routing semantics, VPs are
   treated identically to any other prefix. In other words, if the
   longest matching prefix is a VP, then the packet is routed towards
   the VP. If a packet matching a VP reaches an Aggregation Point
   Router (APR) for that VP, and the APR does not have a better matching
   route, then the packet is discarded by the APR (just as a router that
   originates any prefix will discard a packet that does not have a
   better match).

   The overall semantics of VPs, however, are slightly different from
   those of real prefixes. Without VA, when a router originates a route
   for a (real) prefix, the expectation is that the addresses within the
   prefix are within the originating AS (or a customer of the AS). For
   VPs, this is not the case. APRs originate VPs whose sub-prefixes
   exist in different ASes. Because of this, it is important that VPs
   not be advertised across AS boundaries.

   It is up to individual domains to define their own VPs. VPs MUST be
   "larger" (span a larger address space) than any real sub-prefix. If
   a VP is smaller than a real prefix, then packets that match the real
   prefix will nevertheless be routed to an APR owning the VP, at which
   point the packet will be dropped if it does not match a sub-prefix
   within the VP (Section 6).

   (Note that, in principle there are cases where a VP could be smaller
   than a real prefix. This is where the egress router to the real
   prefix is a VA router. In this case, the APR could theoretically
   tunnel the packet to the appropriate remote ASBR, which would then
   forward the packet correctly. On the other hand, if the egress
   router is a legacy router, then the APR could not tunnel matching
   packets to the egress. This is because the egress would view the VP
   as a better match, and would loop the packet back to the APR.
   For
   this reason we require that VPs be larger than any real prefixes, and
   that APRs never install prefixes larger than a VP in their FIBs.)

   It is valid for a VP to be a subset of another VP. For example, 20/7
   and 20/8 can both be VPs. In fact, this capability is necessary for
   "splitting" a VP without temporarily increasing the FIB size in any
   router (Section 3.1.2.5).

3.1.2.3. Aggregation Point Routers (APR)

   Any router MAY be configured as an Aggregation Point Router (APR) for
   one or more Virtual Prefixes (VP). For each VP for which a router is
   an APR, the router does the following:

   1. The APR MUST originate a BGP route to the VP [RFC4271]. In this
      route, the NLRI are all of the VPs for which the router is an
      APR. This is true even for VPs that are a subset of another VP.
      The ORIGIN is set to INCOMPLETE (value 2), the AS number of the
      APR's AS is used in the AS_PATH, and the BGP NEXT_HOP is set to
      the address of the APR. The ATOMIC_AGGREGATE and AGGREGATOR
      attributes are not included.
   2. The APR MUST attach a NO_EXPORT Communities Attribute [RFC1997]
      to the route.
   3. The APR MUST be able to detunnel packets addressed to itself at
      its BGP NEXT_HOP address. It MUST also be able to convey the
      tunnel information needed by other routers to initiate tunneled
      packets to it (Requirement R1).
   4. If a packet is received at the APR whose best match route is the
      VP (i.e. it matches the VP but not any sub-prefixes within the
      VP), then the packet MUST be discarded (see Section 3.1.2.2).
      This can be accomplished by never installing a prefix larger than
      the VP into the FIB, or by installing the VP as a route to
      /dev/null.

3.1.2.3.1. Selecting APRs

   An ISP is free to select APRs however it chooses. The details of
   this are outside the scope of this document. Nevertheless, a few
   comments are made here.
   In general, APRs should be selected such that the distance to the
   nearest APR for any VP is small, ideally within the same POP.
   Depending on the number of routers in a POP, and the sizes of the
   FIBs in the routers relative to the DFRT size, it may not be possible
   for all VPs to be represented in a given POP. In addition, there
   should be multiple APRs for each VP, again ideally in each POP, so
   that the failure of one does not unduly disrupt traffic.

   Note that, although VPs MUST be larger than real prefixes, there is
   intentionally no mechanism designed to automatically ensure that this
   is the case. Such a mechanism would be dangerous. For instance, if
   an ISP somewhere advertised a very large prefix (a /4, say), then
   this would cause APRs to throw out all VPs that are smaller than
   this. For this reason, VPs MUST be set through static configuration
   only.

3.1.2.4. Non-APR Routers

   A non-APR router MUST install at least the following routes:

   1. Routes to VPs (identifiable using the VP-List).

   2. Routes to all sub-prefixes that are not covered by any VP in the
      VP-List.

   If the non-APR has a tunnel to the BGP NEXT_HOP of any such route, it
   MUST use the tunnel to forward packets to the BGP NEXT_HOP.

   When an APR fails, routers must select another APR to send packets to
   (if there is one). This happens, however, through normal internal
   BGP convergence mechanisms.

3.1.2.5. Adding and deleting VPs

   An ISP may from time to time wish to reconfigure its VP-List. There
   are a number of reasons for this. For instance, early in its
   deployment an ISP may configure one or a small number of VPs in order
   to test VA. As the ISP gets more confident with VA, it may increase
   the number of VPs. Or, an ISP may start with a small number of large
   VPs (e.g., /4s or even one /0), and over time move to more, smaller
   VPs in order to save even more FIB.
   In this case, the ISP will need to "split" a VP. Finally, since the
   address space is not uniformly populated with prefixes, the ISP may
   want to change the size of VPs in order to balance FIB size across
   routers. This can involve both splitting and merging VPs. Of
   course, an ISP must be able to modify its VP-List without 1)
   interrupting service to any destinations, or 2) temporarily
   increasing the size of any FIB (i.e., the FIB size during the change
   must be no bigger than its size either before or after the change).

   Adding a VP is straightforward. The first step is to configure the
   APRs for the VP. This causes the APRs to originate routes for the
   VP. Non-APR routers will install this route according to the rules
   in Section 3.1.2.4 even though they do not yet recognize that the
   prefix is a VP. Subsequently the VP is added to the VP-List of
   non-APR routers. The non-APR routers can then start suppressing the
   sub-prefixes with no loss of service.

   To delete a VP, the process is reversed. First, the VP is removed
   from the VP-Lists of non-APRs. This causes the non-APRs to install
   the sub-prefixes. After all sub-prefixes have been installed, the VP
   may be removed from the APRs.

   In many cases, it is desirable to split a VP. For instance, consider
   the case where two routers, Ra and Rb, are APRs for the same prefix.
   It would be possible to shrink the FIB in both routers by splitting
   the VP into two VPs (e.g., splitting one /6 into two /7s), and
   assigning each router to one of the VPs. While this could in theory
   be done by first deleting the larger VP, and then adding the smaller
   VPs, doing so would temporarily increase the FIB size in non-APRs,
   which may not have adequate space for such an increase. For this
   reason, we allow overlapping VPs.
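
   The effect of overlapping VPs on a non-APR router can be sketched as
   follows. This is a minimal illustration, not part of the
   specification: the prefixes, the helper names longest_match and
   installed_sub_prefixes, and the route set are all hypothetical, and
   tunnels, APR state, and BGP are ignored. It shows that removing the
   larger VP first exposes every sub-prefix, whereas keeping the larger
   VP alongside the smaller ones leaves the sub-prefixes suppressed
   while best-match semantics select the smaller VP.

```python
import ipaddress

def longest_match(addr, fib):
    """Return the most specific prefix in fib that contains addr, or None."""
    matches = [p for p in fib if addr in p]
    return max(matches, key=lambda p: p.prefixlen, default=None)

def installed_sub_prefixes(routes, vp_list):
    """Sub-prefix routes a non-APR router must install: those covered by no VP."""
    return [r for r in routes if not any(r.subnet_of(vp) for vp in vp_list)]

net = ipaddress.ip_network

# Hypothetical VP-List entries: one /6 being split into two /7s.
big_vp = net("20.0.0.0/6")
small_vps = [net("20.0.0.0/7"), net("22.0.0.0/7")]

# Hypothetical real routes: four inside the /6, one outside every VP.
routes = [net("20.1.0.0/16"), net("21.2.0.0/16"), net("22.3.0.0/16"),
          net("23.4.0.0/16"), net("8.8.8.0/24")]

before = installed_sub_prefixes(routes, [big_vp])
gap = installed_sub_prefixes(routes, [])  # larger VP deleted before the smaller ones exist
overlap = installed_sub_prefixes(routes, [big_vp] + small_vps)

# With all three VP routes present, the smaller VP is the better match.
chosen = longest_match(ipaddress.ip_address("22.3.0.1"), [big_vp] + small_vps)

print(len(before), len(gap), len(overlap), chosen)
```

   In this toy route set, deleting the larger VP first forces all five
   sub-prefix routes into the FIB, while the overlapping configuration
   installs only the one route that no VP covers.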

   To split a VP, first the two smaller VPs are added to the VP-Lists of
   all non-APR routers (in addition to the larger superset VP). Next,
   the smaller VPs are added to the selected APRs (which may or may not
   be APRs for the larger VP). Because the smaller VPs are a better
   match than the larger VP, this will cause the non-APR routers to
   forward packets to the APRs for the smaller VPs. Next, the larger VP
   can be removed from the VP-Lists of all non-APR routers. Finally,
   the larger VP can be removed from its APRs.

   To merge two VPs, the new larger VP is configured in all non-APRs.
   This has no effect on FIB size or APR selection, since the smaller
   VPs are better matches. Next the larger VP is configured in its
   selected APRs. Next the smaller VPs are deleted from all non-APRs.
   Finally, the smaller VPs are deleted from their corresponding APRs.

3.1.3. Border VA Routers

   A VA router that is an ASBR MUST do the following:

   1. When forwarding externally-received routes over iBGP, if a tunnel
      with an inner label is used, the ASBR MUST set the BGP NEXT_HOP
      attribute to itself. Otherwise, the BGP NEXT_HOP attribute is
      left unchanged.
   2. The ASBR MUST establish tunnels as described in Section 4.
   3. The ASBR MUST detunnel the packet before forwarding the packet to
      the remote ASBR. In other words, the remote ASBR receives a
      normal untunneled packet identical to the packet it would receive
      without VA.
   4. The ASBR MUST be able to forward the packet without a FIB lookup.
      In other words, the tunnel information itself contains all the
      information needed by the border router to know which remote ASBR
      should receive the packet.

3.1.4. Advertising and Handling Sub-Prefixes

   Sub-prefixes are advertised and handled by BGP as normal. VA does
   not affect this behavior.
   The only difference in the handling of sub-prefixes is that they
   might not be installed in the FIB, as described in Section 3.1.5.

   In those cases where the route is installed, packets forwarded to
   prefixes external to the AS MUST be transmitted via the tunnel
   established as described in Section 3.1.3.

3.1.5. Suppressing FIB Sub-prefix Routes

   Any route not for a known VP (i.e., not in the VP-List) is taken to
   be a sub-prefix. The following rules are used to determine if a
   sub-prefix route can be suppressed.

   1. A VA router MUST NOT FIB-install a sub-prefix route for which
      there is no tunnel to the BGP NEXT_HOP address. This is to
      prevent a loop whereby the APR forwards the packet hop-by-hop
      towards the next hop, but a router on the path that has
      FIB-suppressed the sub-prefix forwards it back to the APR. If
      there is an alternate route to the sub-prefix for which there is
      a tunnel, then that route SHOULD be selected, even if it is less
      attractive according to the normal BGP best path selection
      algorithm.
   2. If the router is an APR, a route for every sub-prefix within the
      VP MUST be FIB-installed (subject to the above limitation that
      there be a tunnel).
   3. If a non-APR router has a sub-prefix route that does not fall
      within any VP (as determined by the VP-List), then the route MUST
      be installed. This may occur because the ISP hasn't defined a VP
      covering that prefix, for instance during an incremental
      deployment buildup.
   4. If an ASBR is using strict uRPF to do ingress filtering, then it
      MUST install routes for which the remote ASBR is the BGP NEXT_HOP
      [RFC2827]. Note that only an APR may do loose uRPF filtering, and
      then only for routes to sub-prefixes within its VPs.
   5. All other sub-prefix routes MAY be suppressed. Such "optional"
      sub-prefixes that are nevertheless installed are referred to as
      Popular Prefixes.
      Note, however, that whether or not to install a given sub-prefix
      SHOULD NOT be based on whether or not there is an active route to
      a VP in the VP-List. This avoids the situation whereby, during
      BGP initialization, the router receives some sub-prefix routes
      before receiving the corresponding VP route, with the result that
      it installs routes in its FIB that it will only remove a short
      time later, possibly even overflowing its FIB.

3.1.5.1. Selecting Popular Prefixes

   Individual routers MAY independently choose which sub-prefixes are
   Popular Prefixes. There is no need for different routers to install
   the same sub-prefixes. There is therefore significant leeway as to
   how routers select Popular Prefixes. As a general rule, routers
   should fill the FIB as much as possible, because the cost of doing so
   is relatively small, and more FIB entries lead to fewer packets
   taking a longer path. Broadly speaking, an ISP may choose to fill
   the FIB by making routers APRs for as many VPs as possible, or by
   assigning relatively few APRs and instead filling the FIB with
   Popular Prefixes. Several basic approaches to selecting Popular
   Prefixes are outlined here. Router vendors are free to implement
   whatever approaches they want.

   1. Policy-based: The simplest approach for network administrators is
      to have broad policies that routers use to determine which
      sub-prefixes are designated as popular. An obvious policy would
      be a "customer routes" policy, whereby all customer routes are
      installed (as identified for instance by appropriate community
      attribute tags). Another policy would be for a router to install
      prefixes originated by specific ASes. For instance, two ISPs
      could mutually agree to install each other's originated prefixes.
      A third policy might be to install prefixes with the shortest
      AS-path.
   2. Static list: Another approach would be to configure static lists
      of specific prefixes to install. For instance, prefixes
      associated with an SLA might be configured. Or, a list of
      prefixes for the most popular websites might be installed.
   3. High-volume prefixes: By installing high-volume prefixes as
      Popular Prefixes, the latency and load associated with the longer
      path required by VA is minimized. One approach would be for an
      ISP to measure its traffic volume over time (days or a few
      weeks), and statically configure high-volume prefixes as Popular
      Prefixes. There is strong evidence that prefixes that are
      high-volume tend to remain high-volume over multi-day or
      multi-week timeframes (though not necessarily at short timeframes
      like minutes or seconds). High-volume prefixes MAY also be
      installed dynamically. In other words, a router measures its own
      traffic volumes, and installs and removes Popular Prefixes in
      response to short term traffic load. The downside of this
      approach is that it complicates debugging network problems. If
      packets are being dropped somewhere in the network, it is more
      difficult to find out where if the selected path can change
      dynamically.

3.2. New Configuration

   VA places new configuration requirements on ISP administrators.
   Namely, the administrator must:

   1. Select VPs, and configure the VP-List into all VA routers. As a
      general rule, having a larger number of relatively small prefixes
      gives administrators the most flexibility in terms of filling
      available FIB with sub-prefixes, and in terms of balancing load
      across routers. Once an administrator has selected a VP-List, it
      is just as easy to configure routers with a large list as a small
      list. We can expect network operator groups like NANOG to
      compile good VP-Lists that ISPs can then adopt.
      A good list would be one where the number of VPs is relatively
      large, say 100 or so (noting again that each VP must be larger
      than any real prefix), and the number of sub-prefixes within each
      VP is roughly the same.
   2. Select and configure APRs. There are three primary
      considerations here. First, there must be enough APRs to handle
      reasonable APR failure scenarios. Second, APR assignment should
      not result in router overload. Third, particularly long paths
      should be avoided. Ideally there should be two APRs for each VP
      within each PoP, but this may not be possible for small PoPs.
      Failing this, there should be at least two APRs in each
      geographical region, so as to minimize path length increase.
      Routers should have the appropriate counters to allow
      administrators to know the volume of APR traffic each router is
      handling so as to adjust load by adding or removing APR
      assignments.
   3. Select and configure Popular Prefixes or Popular Prefix policies.
      There are two general goals here. The first is to minimize load
      overall by minimizing the number of packets that take longer
      paths. The second is to ensure that specific selected prefixes
      don't have overly long paths. These goals must be weighed
      against the administrative overhead of configuring potentially
      thousands of Popular Prefixes. As one example, a small ISP may
      wish to keep it simple by doing nothing more than indicating that
      customer routes should be installed. In this case, the
      administrator could otherwise assign as many APRs as possible
      while leaving enough FIB space for customer routes. As another
      example, a large ISP could build a management system that takes
      into consideration the traffic matrix, customer SLAs, robustness
      requirements, FIB sizes, topology, and router capacity, and
      periodically automatically computes APR and Popular Prefix
      assignments.

4. Usage of Tunnels

4.1. MPLS tunnels

   VA utilizes a straightforward application of MPLS. The tunnels are
   MPLS Label Switched Paths (LSP), and are signaled using either the
   Label Distribution Protocol (LDP) [RFC5036] or RSVP-TE [RFC3209].
   Both VA and legacy routers MUST participate in this signaling.

   APRs and ASBRs initiate tunnels. In both cases, Downstream
   Unsolicited tunnels are initiated to all IGP neighbors with the full
   BGP NEXT_HOP address as the Forwarding Equivalence Class (FEC). In
   the case of APRs, the BGP NEXT_HOP is the APR's own address. In the
   case of legacy ASBRs, the BGP NEXT_HOP is the ASBR's own address. In
   the case of VA ASBRs, the BGP NEXT_HOP is that of the remote ASBR.

   Existing Penultimate Hop Popping (PHP) mechanisms in the data plane
   can be used for forwarding packets to remote ASBRs.

4.2. Usage of Inner Label

   Besides using a separate LSP to identify the remote ASBR as described
   above, it is also possible to use an inner label to identify the
   remote ASBR. Either an outer label or an IP tunnel identifies the
   local ASBR.

   When a local ASBR advertises a route into iBGP, it sets the NEXT_HOP
   to itself, and assigns a label to the route. This label is used as
   the inner label, and identifies the remote ASBR from which the route
   was received [RFC3107].

   The presence of the inner label in the iBGP update acts as the signal
   to the receiving router that an inner label MUST be used in packets
   tunneled to the NEXT_HOP address. If there is an LSP established
   targeted to the NEXT_HOP address, then it is used to tunnel the
   packet to the NEXT_HOP address. Otherwise, an IP header addressed to
   the NEXT_HOP address is used.

5. IANA Considerations

   There are no IANA considerations.

6. Security Considerations

   We consider the security implications of VA under two scenarios, one
   where VA is configured and operated correctly, and one where it is
   mis-configured.
   A cornerstone of VA operation is that the basic behavior of BGP
   doesn't change, especially inter-domain. Among other things, this
   makes it easier to reason about security.

6.1. Properly Configured VA

   If VA is configured and operated properly, then the external behavior
   of an AS does not change. The same upstream ASes are selected, and
   the same prefixes and AS-paths are advertised. Therefore, a properly
   configured VA domain has no security impact on other domains.

   If another ISP starts advertising a prefix that is larger than a
   given VP, this prefix will be ignored by APRs that have a VP that
   falls within the larger prefix (Section 3.1.2.3). As a result,
   packets that might otherwise have been routed to the new larger
   prefix will be dropped at the APRs. Note that the trend in the
   Internet is towards large prefixes being broken up into smaller ones,
   not the reverse. Therefore, such a larger prefix is likely to be
   invalid. If it is determined without a doubt that the larger prefix
   is valid, then the ISP will have to reconfigure its VPs.

   VA does not change an ISP's ability to do ingress filtering using
   strict uRPF (Section 3.1.5).

   Regarding DoS attacks, there are two issues that need to be
   considered. First, does VA result in new types of DoS attacks?
   Second, does VA make it more difficult to deploy DoS defense systems?
   Regarding the first issue, one possibility is that an attacker
   targets a given router by flooding the network with traffic to
   prefixes that are not popular, and for which that router is an APR.
   This would cause a disproportionate amount of traffic to be forwarded
   to the APR(s). While it is up to individual ISPs to decide if this
   attack is a concern, it does not strike the authors that this attack
   is likely to significantly worsen the DoS problem.

   Regarding the second issue, many DoS defense systems use dynamically
   established Routing Table entries to divert victims' traffic into
   LSPs that carry the traffic to scrubbers. This mechanism works with
   VA: it simply overrides whatever route is in place. This mechanism
   works equally well with APRs and non-APRs.

6.2. Mis-configured VA

   VA introduces the possibility that a VP is advertised outside of an
   AS. This in fact should be a low probability event, but it is
   considered here nonetheless.

   If an AS leaks a large VP (i.e., larger than any real prefixes), then
   the impact is minimal. Smaller prefixes will be preferred because of
   best-match semantics, and so the only impact is that packets that
   otherwise have no matching routes will be sent to the misbehaving AS
   and dropped there. If an AS leaks a small VP (i.e., smaller than a
   real prefix), then packets to that AS will be hijacked by the
   misbehaving AS and dropped. This can happen with or without VA, and
   so doesn't represent a new security problem per se.

7. Acknowledgements

   The authors would like to acknowledge the efforts of Xinyang Zhang
   and Jia Wang, who worked on CRIO (Core Router Integrated Overlay), an
   early inter-domain variant of FIB suppression, and the efforts of
   Hitesh Ballani and Tuan Cao, who worked on the configuration-only
   variant of VA that works with legacy routers. We would also like to
   thank Scott Brim, Daniel Ginsburg, and Rajiv Asati for their helpful
   comments. In particular, Daniel's comments significantly simplified
   the spec (eliminating the need for a new External Communities
   Attribute).

8. References

8.1. Normative References

   [RFC1997]  Chandra, R., Traina, P., and T. Li, "BGP Communities
              Attribute", RFC 1997, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2827]  Ferguson, P. and D. Senie, "Network Ingress Filtering:
              Defeating Denial of Service Attacks which employ IP Source
              Address Spoofing", BCP 38, RFC 2827, May 2000.

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
              BGP-4", RFC 3107, May 2001.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, December 2001.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4601]  Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
              "Protocol Independent Multicast - Sparse Mode (PIM-SM):
              Protocol Specification (Revised)", RFC 4601, August 2006.

   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
              Specification", RFC 5036, October 2007.

8.2. Informative References

   [I-D.ietf-grow-simple-va]
              Francis, P., Xu, X., Ballani, H., Raszuk, R., and L.
              Zhang, "Simple Virtual Aggregation (S-VA)",
              draft-ietf-grow-simple-va-00 (work in progress),
              March 2010.

   [I-D.ietf-grow-va-gre]
              Francis, P., Raszuk, R., and X. Xu, "GRE and IP-in-IP
              Tunnels for Virtual Aggregation",
              draft-ietf-grow-va-gre-00 (work in progress), July 2009.

   [I-D.ietf-grow-va-mpls]
              Francis, P. and X. Xu, "MPLS Tunnels for Virtual
              Aggregation", draft-ietf-grow-va-mpls-00 (work in
              progress), May 2009.

   [I-D.ietf-grow-va-mpls-innerlabel]
              Xu, X. and P. Francis, "Proposal to use an inner MPLS
              label to identify the remote ASBR VA",
              draft-ietf-grow-va-mpls-innerlabel-00 (work in progress),
              September 2009.

   [nsdi09]   Ballani, H., Francis, P., Cao, T., and J. Wang, "Making
              Routers Last Longer with ViAggre", ACM Usenix NSDI 2009,
              http://www.usenix.org/events/nsdi09/tech/full_papers/
              ballani/ballani.pdf, April 2009.

Authors' Addresses

   Paul Francis
   Max Planck Institute for Software Systems
   Gottlieb-Daimler-Strasse
   Kaiserslautern 67633
   Germany

   Phone: +49 631 930 39600
   Email: francis@mpi-sws.org

   Xiaohu Xu
   Huawei Technologies
   No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District
   Beijing, Beijing 100085
   P.R.China

   Phone: +86 10 82836073
   Email: xuxh@huawei.com

   Hitesh Ballani
   Cornell University
   4130 Upson Hall
   Ithaca, NY 14853
   US

   Phone: +1 607 279 6780
   Email: hitesh@cs.cornell.edu

   Dan Jen
   UCLA
   4805 Boelter Hall
   Los Angeles, CA 90095
   US

   Phone:
   Email: jenster@cs.ucla.edu

   Robert Raszuk
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA 95134
   USA

   Phone:
   Email: raszuk@cisco.com

   Lixia Zhang
   UCLA
   3713 Boelter Hall
   Los Angeles, CA 90095
   US

   Phone:
   Email: lixia@cs.ucla.edu