Network Working Group                                         P. Francis
Internet-Draft                                                   MPI-SWS
Intended status: Informational                                     X. Xu
Expires: April 21, 2010                                           Huawei
                                                              H. Ballani
                                                              Cornell U.
                                                                  D. Jen
                                                                    UCLA
                                                               R. Raszuk
                                                                    Self
                                                                L. Zhang
                                                                    UCLA
                                                        October 18, 2009


         Proposal for Auto-Configuration in Virtual Aggregation
                     draft-ietf-grow-va-auto-00.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 21, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   Virtual Aggregation as specified in [I-D.ietf-grow-va] requires a
   certain amount of configuration, namely virtual prefixes (VP), a VP
   list, type of tunnel, and popular prefixes. This draft proposes
   optional approaches to auto-configuration of popular prefixes and the
   VP list, and discusses the pros and cons of each.
   If these proposals are accepted, they will be incorporated into
   [I-D.ietf-grow-va].

Table of Contents

   1. Introduction
     1.1. Requirements notation
   2. Syntax for the tags
   3. Config of Popular Prefixes
     3.1. Operation of the should-install tag
       3.1.1. Sending the should-install tag
       3.1.2. Receiving the should-install tag
     3.2. Discussion
   4. Config of the VP list
     4.1. VP-route tag
     4.2. Can suppress tag
   5. IANA Considerations
   6. Security Considerations
   7. References
     7.1. Normative References
     7.2. Informative References
   Authors' Addresses

1. Introduction

   Virtual Aggregation as specified in [I-D.ietf-grow-va] requires a
   certain amount of configuration, namely:

   1. Each Aggregation Point Router (APR) must be configured with the
      VPs for which it is an APR.
   2. Every router must be configured with the VP list (a list of all
      VPs). This allows the router to know which prefixes can and
      cannot be FIB-suppressed.
   3. Every router should be configured with a list of prefixes that
      should be FIB-installed (for instance because they have large
      traffic volumes).
   4. Every router should be configured as to the tunnel type.

   Of these four items, the first and last cannot be automated.
   Both, however, represent a relatively small amount of configuration.
   The second and third are more significant, and this draft proposes
   mechanisms for partially or fully automating them. If any of these
   proposals are accepted, they will be incorporated into the main VA
   draft. In any event, they would be considered optional. The
   manually configured VP-list would still be mandatory, though an ISP
   could choose not to use it if one of the options described here is
   available ([I-D.ietf-grow-va]).

   All of the approaches described in this draft involve tagging routes
   with a standard extended communities attribute. There are three such
   tags: the "should-install" tag, the "VP-route" tag, and the
   "can-suppress" tag. The should-install tag automates the
   configuration of those popular prefixes that are popular by virtue
   of having high traffic volume. The VP-route and can-suppress tags
   represent two alternatives for the VP-list. Note that usage of the
   should-install tag (popular prefixes config) is completely
   orthogonal to usage of either the VP-route or can-suppress tag
   (replacement for VP-list config).

1.1. Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Syntax for the tags

   All three tags can be conveyed with an Extended Communities Attribute
   [RFC4360], using type values to be assigned by IANA. For all three
   tags, the Transitive Bit MUST be set to value 1 (the community is
   non-transitive across ASes).

3. Config of Popular Prefixes

   Broadly speaking, a Popular Prefix is any prefix that does not have
   to be FIB-installed, but should nevertheless be FIB-installed.
   For instance:

      Prefixes for customer networks should be installed (so that
      traffic to customers does not incur the extra delay associated
      with the detour through an APR). Since customer routes are in any
      event tagged with a community attribute for routing policy
      reasons, the decision to FIB-install them is entirely local and
      requires no standardization.

      If an ASBR chooses its external peer as a next-hop for a given
      prefix, then it should FIB-install that prefix.

   Prefixes to which there is a large volume of traffic should also be
   FIB-installed. This is to reduce the additional load that results
   from the extra hop(s) that packets must take on the APR detour.
   Installing these prefixes is not trivial. The volume of traffic must
   be measured, the high-volume prefixes identified, and routers
   configured to FIB-install these prefixes. Furthermore, the router
   where the prefix must be FIB-installed is typically different from
   the router where the high volume is measured. Normally, the highest
   volume for any given prefix will be seen at the egress routers for
   that prefix. However, the ingress router is where FIB installation
   should take place.

   The proposal is to identify high-volume prefixes at ASBRs and RRs
   (routers that forward iBGP updates), and to tag routes to these
   prefixes with a community attribute that effectively means "should
   FIB-install". How to identify high-volume prefixes is a local
   matter, but one way would be by examining netflow records from the
   router. In principle, however, a router could internally detect
   high-volume prefixes. Identification of high-volume prefixes need
   only be done for either:

   1. Outgoing traffic on ASBRs peering with non-customer networks
      (peers or transits).
   2. Route Reflectors, probably limited to traffic that is routed
      towards the edge.

   Either way, the set of routers where this identification must take
   place is limited.

3.1. Operation of the should-install tag

3.1.1. Sending the should-install tag

   For routers implementing this optional feature, it must be possible
   to configure a router to attach the community attribute (the
   "should-install tag") to routes for a given prefix. In practice,
   this may be automatically done by the system that receives and
   analyzes netflow records, or it may be done manually by a network
   administrator. Once so configured, the router must attach the
   should-install tag to BGP updates containing the prefix. The update
   may be generated immediately after the configuration takes place, or
   it may be put off until the next time the update is normally
   transmitted.

   If the configuration is removed, the router must not attach the
   should-install tag to subsequent updates containing the prefix. An
   update without the should-install tag may be generated immediately
   after the configuration is removed, or it may be put off until the
   next time the update is normally transmitted.

3.1.2. Receiving the should-install tag

   If the best-path route to a given prefix (one that does not
   otherwise have to be FIB-installed) has the should-install tag, then
   the router locally decides whether or not to FIB-install the prefix.
   If there is no room in the FIB for a new prefix, the router may
   choose to remove an existing FIB entry (for instance, the oldest
   entry) to make room for the new entry.

3.2. Discussion

   The time-frame over which should-install tags are attached and
   removed should be quite long, at least hours if not days. Evidence
   shows that high-volume prefixes tend to stay high-volume on average
   over long periods of time (days or even weeks) [nsdi09].
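
   As an illustration of why a long time-frame helps, the following is a
   minimal Python sketch of how an external analysis system might choose
   which prefixes to configure with the should-install tag. The draft
   leaves identification of high-volume prefixes as a local matter, so
   everything here is an assumption made for illustration: the function
   names, the per-hour byte counts, the window length, and the "top N by
   total bytes" rule are all hypothetical.

```python
from collections import defaultdict


def select_should_install(hourly_bytes, window_hours, top_n):
    """Return the prefixes to tag, ranked by total bytes over the window.

    hourly_bytes is a list of dicts (oldest first), each mapping a
    prefix string to the bytes counted toward it during one hour.
    """
    totals = defaultdict(int)
    # Only the most recent window_hours samples contribute.
    for sample in hourly_bytes[-window_hours:]:
        for prefix, nbytes in sample.items():
            totals[prefix] += nbytes
    # Sort by descending volume; break ties by prefix string.
    ranked = sorted(totals, key=lambda p: (-totals[p], p))
    return set(ranked[:top_n])


def tag_changes(currently_tagged, desired):
    """Return (to_attach, to_remove) so configured tags match desired."""
    return desired - currently_tagged, currently_tagged - desired


# Hypothetical per-hour byte counts; the last hour has a brief spike to C.
A, B, C = "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"
samples = [
    {A: 100, B: 80, C: 5},
    {A: 100, B: 80, C: 5},
    {A: 100, B: 10, C: 90},
]

long_window = select_should_install(samples, window_hours=3, top_n=2)
short_window = select_should_install(samples, window_hours=1, top_n=2)
print(sorted(long_window))   # the spike to C does not displace B
print(sorted(short_window))  # a one-hour window reacts to the spike
print(tag_changes({A, B}, long_window))
```

   With the longer window, the brief spike does not change the tagged
   set, so no BGP updates need to be generated; with the one-hour window
   the tag would move from one prefix to another and back, which is the
   kind of churn a long time-frame avoids.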

   There are a number of limited scenarios whereby a should-install tag
   is not successfully conveyed to all routers in an AS. This does not
   result in non-delivery of packets, only inefficiencies.

   Consider the case where an AS is using Route Reflectors (RRs), and is
   using ASBRs to transmit should-install tags. Imagine two ASBRs, BR1
   and BR2, that advertise routes to some prefix P. Further, both BR1
   and BR2 are clients of the same RR. Assume that there is a high
   volume of traffic to prefix P at BR1 but not at BR2. As a result,
   BR1 attaches the should-install tag and BR2 does not. If the RR for
   any reason prefers the route via BR2 over BR1, then the
   should-install tag will not be passed on by the RR. (Although note
   that a likely outcome of this is that BR2 will start to see high
   volumes of traffic to P, and eventually will set the should-install
   tag.)

   Next consider the same topology as above (BR1 and BR2 both clients of
   the same RR), but now assume that it is the RR that is used to
   transmit should-install tags. Assume that the RR detects a high
   volume of traffic to prefix P and attaches the should-install tag
   for routes to P. Assume that both BR1 and BR2 choose their
   respective external peers as the next hop to P, and of course
   advertise this next hop to the RR. The RR selects and advertises a
   best path, say via BR1. When the RR advertises this best path to
   BR2, BR2 ignores it and so does not FIB-install the route. The end
   result here is that packets detour through an APR and then are
   tunneled back to the ASBR. (Though as mentioned earlier in this
   section, prefixes where the next hop is an external peer should be
   FIB-installed as a matter of local policy.)

4. Config of the VP list

   As the current VA specification stands, routers have to know which
   prefixes they must FIB-install and which they need not FIB-install.
   The VP-list tells them this: they must FIB-install routes to VPs,
   and they need not FIB-install routes to prefixes that fall within
   VPs for which they are not an APR. The same VP-list must be
   installed in every router (though it is not a problem that they
   differ for brief periods during modification of the VP-list).
   Configuration of the VP-list is not nearly as hard as configuration
   of popular prefixes, but it is nevertheless a significant task that
   we would just as soon do without.

   There are two basic approaches to automating this configuration. One
   is to have APRs tag the routes to VPs that they originate, and let
   routers effectively reconstruct the VP-list from these tags. This
   approach has the advantage that no configuration whatsoever is
   required to solve the problem.

   The other is to have ASBRs tag the routes that need not be installed.
   This can be done by configuring a list of one or more "VP-ranges" in
   the ASBRs. This is simpler than the current configured VP-list
   approach in two regards. First, fewer routers need to be configured
   (only ASBRs interfacing with peer and provider, i.e., non-customer,
   networks). Second, the VP-range is simpler than the VP-list. In
   most cases, once an ISP is past its initial VA roll-out phase, the
   VP-range would consist of a single 0/0 entry.

   These two approaches are discussed in the following sections.

4.1. VP-route tag

   Routers that receive a route with the communities attribute
   indicating the VP-route tag must FIB-install the associated prefix
   (VP). They may FIB-suppress any sub-prefixes that fall within the
   VP.

   Prefixes that do not fall within any known VP must be FIB-installed.
   During BGP initialization (i.e., before the End-of-RIB marker is
   received [RFC4724]), however, the full set of VPs is not yet known.
   Therefore, what routers do with prefixes that do not fall within any
   known VP during initialization is a local matter.

   There are two basic strategies, install by default and suppress by
   default. Each has pros and cons, though the latter is generally
   preferred. With install by default, some prefixes will be installed
   only to be removed later (when the parent VP is learned). This can
   ultimately slow down convergence, since it takes time to modify the
   FIB. Also, this could result in the FIB filling up with entries.

   The problem with suppress by default is that entries that ultimately
   will be installed are not immediately installed. Instead, they are
   installed only after the End-of-RIB marker. This approach, however,
   does avoid the pitfalls of install by default, and ultimately could
   converge faster because FIB churn is avoided. There are also several
   mitigating factors that should make suppress by default work well in
   practice. First, if the router uses Graceful Restart [RFC4724], then
   in any event forwarding can continue to take place even when the BGP
   session is restarted. Second, the router can have a policy whereby
   prefixes with a should-install tag are automatically installed. In
   this way, high-volume prefixes are installed and so most traffic will
   in fact be forwarded by the End-of-RIB. Finally, if the router has a
   policy that customer prefixes are always installed, then flows
   between customers are also correctly forwarded by the End-of-RIB.

   Another issue with the VP-route tag is what to do if all APRs for a
   given VP stop operating (i.e., crash) and so all VP routes are
   withdrawn. Strictly speaking, the router would immediately start
   installing the sub-prefixes within that VP. This could lead to the
   FIB filling up.
   Also, if the APR is thrashing (going up and down), then all routers
   in the AS could end up repeatedly adding and removing the same set
   of prefixes.

   How to deal with this is a local matter. There are two questions the
   router must answer:

   1. How should hysteresis be applied to the (implicit) VP list to
      avoid FIB churn?
   2. How are FIB entries prioritized in the case where the FIB is
      full?

   Regarding VP list hysteresis, perhaps the simplest thing to do is to
   use standard route flap damping on the VP routes [RFC2439].
   Alternatively, the router could simply not install sub-prefixes for a
   recently known VP for some period of time (minutes) after the VP
   route was withdrawn, or only install sub-prefixes slowly (to
   minimize the impact of churn).

   Regarding FIB entry prioritization, routers must in any event install
   VP routes and sub-prefixes within the VPs for which the router is an
   APR. If the FIB does not have room for at least these entries, then
   VA has simply been configured incorrectly in the AS, and the
   administrator must fix this. Beyond these necessary FIB entries,
   prioritization is a local matter. A reasonable prioritization,
   however, is the following: 1) customer routes, 2) routes with the
   should-install tag, 3) routes for sub-prefixes of recently withdrawn
   VPs, 4) other.

4.2. Can suppress tag

   With this approach, some set of ASBRs are configured with a "VP
   range". This is the range of IP addresses that is covered by all
   VPs. In a mature deployment of VA, the range would amount to all IP
   addresses, in which case the VP range is simply 0/0. Early in VA
   deployment, when an ISP is still in the testing or roll-out phase,
   the VP range would consist of multiple entries. At a minimum, the
   set of ASBRs so configured are those with peers in peer or transit
   ASes.
   If the AS has a policy that customer routes are always
   FIB-installed, then it is not necessary to configure routers that
   connect to customer ASes.

   VP-range configured ASBRs must tag any route whose prefix falls
   within the VP range with a "can-suppress" tag, with the following
   exceptions:

   1. Routers must never tag a VP route with can-suppress.
   2. If the ISP has a policy of FIB-installing customer routes, then
      routes received from customers should not be tagged with
      can-suppress.

   A router receiving a route with a can-suppress tag first determines
   if it must FIB-install the prefix. It would have to do this for
   instance if the prefix falls within a VP for which it is an APR. If
   the router does not have to install the prefix, then it may suppress
   the prefix at its own discretion.

   When the can-suppress approach is used, then routers must FIB-install
   any prefixes not tagged as can-suppress. The primary reason for this
   is so that VP routes are always installed.

   Note that in the case where all VP routes for a given VP are
   withdrawn, routers would not be able to FIB-install the (now
   unreachable) sub-prefixes. This is because, with the can-suppress
   approach, routers do not actually know which routes are VPs.

5. IANA Considerations

   IANA must assign type values for the Extended Communities Attributes
   that convey the tags.

6. Security Considerations

   As of this writing, there are no known new security threats
   introduced by this draft.

7. References

7.1. Normative References

   [I-D.ietf-grow-va]
              Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and
              L. Zhang, "FIB Suppression with Virtual Aggregation",
              draft-ietf-grow-va-00 (work in progress), May 2009.

   [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
              Communities Attribute", RFC 1997, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, February 2006.

   [RFC4724]  Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y.
              Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724,
              January 2007.

7.2. Informative References

   [RFC2439]  Villamizar, C., Chandra, R., and R. Govindan, "BGP Route
              Flap Damping", RFC 2439, November 1998.

   [nsdi09]   Ballani, H., Francis, P., Cao, T., and J. Wang, "Making
              Routers Last Longer with ViAggre", ACM Usenix NSDI 2009,
              http://www.usenix.org/events/nsdi09/tech/full_papers/
              ballani/ballani.pdf, April 2009.

Authors' Addresses

   Paul Francis
   Max Planck Institute for Software Systems
   Gottlieb-Daimler-Strasse
   Kaiserslautern 67633
   Germany

   Phone: +49 631 930 39600
   Email: francis@mpi-sws.org


   Xiaohu Xu
   Huawei Technologies
   No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District
   Beijing, Beijing 100085
   P.R.China

   Phone: +86 10 82836073
   Email: xuxh@huawei.com


   Hitesh Ballani
   Cornell University
   4130 Upson Hall
   Ithaca, NY 14853
   US

   Phone: +1 607 279 6780
   Email: hitesh@cs.cornell.edu


   Dan Jen
   UCLA
   4805 Boelter Hall
   Los Angeles, CA 90095
   US

   Phone:
   Email: jenster@cs.ucla.edu


   Robert Raszuk
   Self

   Phone:
   Email: robert@raszuk.net


   Lixia Zhang
   UCLA
   3713 Boelter Hall
   Los Angeles, CA 90095
   US

   Phone:
   Email: lixia@cs.ucla.edu