idnits 2.17.1

draft-ietf-grow-simple-va-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 2 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (May 29, 2012) is 4349 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

GROW Working Group                                             R. Raszuk
Internet-Draft                                                   NTT MCL
Intended status: Informational                                  J. Heitz
Expires: November 30, 2012                                      Ericsson
                                                                   A. Lo
                                                                  Arista
                                                                L. Zhang
                                                                    UCLA
                                                                   X. Xu
                                                                  Huawei
                                                            May 29, 2012

                   Simple Virtual Aggregation (S-VA)
                    draft-ietf-grow-simple-va-07.txt

Abstract

   The continued growth in the Default Free Routing Table (DFRT)
   stresses the global routing system in a number of ways.
   One of the most costly stresses is FIB size: ISPs often must upgrade
   router hardware simply because the FIB has run out of space, and
   router vendors must design routers that have adequate FIB capacity.

   FIB suppression is an approach to relieving stress on the FIB by NOT
   loading selected RIB entries into the FIB.  Simple Virtual
   Aggregation (S-VA) is a simple form of Virtual Aggregation (VA) that
   allows any and all edge routers to shrink their RIB and FIB
   requirements substantially and therefore increase their useful
   lifetime.

   S-VA does not increase FIB requirements for core routers.  S-VA is
   extremely easy to configure, considerably more so than the various
   tricks done today to extend the life of edge routers.  S-VA can be
   deployed autonomously by an ISP (cooperation between ISPs is not
   required), and can co-exist with legacy routers in the ISP.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 30, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Scope of this Document
     1.2.  Requirements notation
     1.3.  Terminology
   2.  Operation of S-VA
   3.  Deployment considerations
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Authors' Addresses

1.  Introduction

   ISPs today manage constant DFRT growth in a number of ways.  One way,
   of course, is for ISPs to upgrade their router hardware before DFRT
   growth outstrips the size of the FIB.  This is too expensive for many
   ISPs.  They would prefer to extend the lifetime of routers whose FIBs
   can no longer hold the full DFRT.

   A common approach taken by lower-tier ISPs is to default route to
   their providers.  Routes to customers and peer ISPs are maintained,
   but everything else defaults to the provider.  This approach has
   several disadvantages.  First, packets to Internet destinations may
   take longer-than-necessary AS paths.
   This problem can be mitigated through careful configuration of
   partial defaults, but this can require substantial configuration
   overhead.  A second problem with defaulting to providers is that the
   ISP is no longer able to provide the full DFRT to its customers.
   Finally, provider defaults prevent the ISP from being able to detect
   martian packets.  As a result, the ISP transmits packets that could
   otherwise have been dropped over its expensive provider links.

   An alternative is for the ISP to maintain full routes in its core
   routers, but to filter routes from edge routers that do not require a
   full DFRT.  These edge routers can then default route to the core or
   exit routers.  This is often possible with edge routers that
   interface to customer networks.  The problem with this approach is
   that it cannot be used for all edge routers.  For instance, it cannot
   be used for routers that connect to transits.  It should also not be
   used for routers that connect to customers which wish to receive the
   full DFRT.

   This draft describes a very simple technique, called Simple Virtual
   Aggregation (S-VA), that allows any and all edge routers to have
   substantially reduced FIB requirements even while still advertising
   and receiving the full DFRT over BGP.  The basic idea is as follows.
   Core routers in the ISP maintain the full DFRT in the FIB and RIB.
   Edge routers maintain the full DFRT in the BGP protocol RIB, but
   suppress certain routes from being installed in the RIB and FIB
   tables.  Edge routers install a default route to core routers, to
   ABRs located at the POP-to-core boundary, or to ASBRs.

   S-VA requires no changes to BGP and no changes to any choice of
   forwarding mechanisms in routers.  Configuration is extremely simple:
   S-VA must be enabled on the edge router which needs to save its RIB
   and FIB space.
   At the same time, the operator must inject into its intra-domain
   routing a new prefix, hereafter called the virtual aggregate
   (VA-prefix), which will be used as the aggregate forwarding reference
   by the edge routers performing S-VA.  Everything else is automatic.
   ISPs can deploy FIB suppression autonomously and with no coordination
   with neighbor ASes.

1.1.  Scope of this Document

   The scope of this document is limited to intra-domain S-VA operation,
   in other words, the case where a single ISP autonomously operates
   S-VA internally without any coordination with neighboring ISPs.

   Note that this document assumes that the S-VA "domain" (i.e., the
   unit of autonomy) is the AS (that is, different ASes run S-VA
   independently and without coordination).  For the remainder of this
   document, the terms ISP, AS, and domain are used interchangeably.

   This document applies equally to IPv4 and IPv6, for both unicast and
   multicast address families.

   S-VA may operate with a mix of upgraded routers and legacy routers.
   There are no topological restrictions placed on the mix of routers.
   S-VA functionality is local to the router on which it is enabled and
   routing correctness is guaranteed.

   Note that S-VA is a greatly simplified variant of "full VA"
   [I-D.ietf-grow-va].  With full VA, all routers (core or otherwise)
   can have reduced FIBs.  However, full VA requires substantial new
   configuration and operational complexity compared to S-VA.  Full VA
   also requires the use of MPLS LSPs between all routers.  Note that
   S-VA was formerly specified in [I-D.ietf-grow-va].  It has been moved
   to this separate draft to simplify its understanding.

1.2.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

1.3.  Terminology

   RIB/FIB-Installing Router (FIR):  A router that does not suppress
      any routes, and advertises itself as a default route for 0/0.
      Typically a core router, POP-to-core boundary router, or an ASBR
      would be configured as an FIR.

   RIB/FIB-Suppressing Router (FSR):  An S-VA router that installs a
      route to 0/0, and may suppress other routes.  Typically an edge
      router would be configured as an FSR.

   Install and Suppress:  The terms "install" and "suppress" are used to
      describe whether a protocol local RIB entry has been loaded or not
      loaded into the global RIB and FIB.  In other words, the phrase
      "install a route" means "install a route into the global RIB and
      FIB", and the phrase "suppress a route" means "do not install a
      route from BGP into the global RIB and FIB".

   Legacy Router:  A router that does not run S-VA, and has no knowledge
      of S-VA.

   Global Routing Information Base (RIB):  The term global RIB is used
      to indicate the router's main routing information base.  That RIB
      is normally used to populate the FIB tables of the router.  Note
      that, unless FIB compression is used, the global RIB and FIB
      tables are in sync.

   Local/Protocol Routing Information Base (loc-RIB):  The term local
      RIB is used to indicate the protocol's table where the product of
      SPF or BGP best path selection is kept before being installed in
      the global RIB.  For example, in some protocol implementations the
      BGP loc-RIB can be further divided into the Adj-RIBs-In, the
      Loc-RIB, and the Adj-RIBs-Out.

2.  Operation of S-VA

   There are three types of routers in S-VA: FIB-Installing routers
   (FIR), FIB-Suppressing routers (FSR), and optionally legacy routers.
   While any router can be an FIR or an FSR (there are no topology
   constraints), the simplest form of deployment is for AS border or
   POP border routers to be configured as FIRs, and for customer-facing
   edge routers, in the AS or in the POP respectively, to be configured
   as FSRs.

   FIRs must originate a default BGP route to NLRI 0/0 [RFC4271].  The
   ORIGIN is set to INCOMPLETE (value 2) and the BGP NEXT_HOP is set to
   match the other BGP routes which are also advertised by said FIR.
   The ATOMIC_AGGREGATE and AGGREGATOR attributes are not included.  The
   FIR MUST attach a NO_EXPORT Community Attribute [RFC1997] to the
   default route.

   FIRs should not FIB-suppress any routes.  They may, however, still
   use some form of local FIB compression algorithm if deemed necessary.

   FSRs must detect the VA-prefix 0/0 and install it in the loc-RIB,
   RIB, and FIB.  Following that, an FSR may suppress any more specific
   routes which carry the same next hop as the VA-prefix.  To guarantee
   semantic correctness, an FSR should by default also be able to detect
   the installation of a route with a non-matching next hop and
   reinstall all the more specific routes which were previously eligible
   for suppression, so that forwarding remains correct.

   Generally, any more specific route which carries the same next hop as
   the VA-prefix 0/0 is eligible for suppression.  However, if there is
   at least one less specific prefix (e.g., 1.0.0.0/8) whose next hop
   differs from that of the VA-prefix 0/0, then the more specific
   prefixes it covers (e.g., 1.1.1.0/24), which would otherwise be
   subject to suppression, are no longer eligible for suppression.
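
   The eligibility rule above can be illustrated with a minimal Python
   sketch.  It is not part of the S-VA specification: the function name
   eligible_for_suppression, the dictionary-based loc_rib, and the
   192.0.2.x next hops are hypothetical, and the sketch assumes IPv4
   only with a single (non-multipath) next hop per route.

```python
from ipaddress import ip_network

VA_PREFIX = ip_network("0.0.0.0/0")


def eligible_for_suppression(prefix, loc_rib, va_next_hop):
    """Return True if `prefix` may be suppressed under the rule above.

    `loc_rib` is a hypothetical mapping of prefix string -> next-hop
    string holding the BGP best paths known to the FSR.
    """
    net = ip_network(prefix)
    # The VA-prefix itself must stay installed, and only routes sharing
    # its next hop are candidates at all.
    if net == VA_PREFIX or loc_rib[prefix] != va_next_hop:
        return False
    for other, other_next_hop in loc_rib.items():
        other_net = ip_network(other)
        if other_net in (VA_PREFIX, net):
            continue
        # A covering, less specific route with a different next hop would
        # capture the traffic if `prefix` were removed from the FIB.
        if net.subnet_of(other_net) and other_next_hop != va_next_hop:
            return False
    return True


va_next_hop = "192.0.2.1"  # assumed next hop of the VA-prefix 0/0
loc_rib = {
    "0.0.0.0/0": "192.0.2.1",        # VA-prefix, always installed
    "1.0.0.0/8": "192.0.2.2",        # less specific, different next hop
    "1.1.1.0/24": "192.0.2.1",       # covered by 1.0.0.0/8: keep installed
    "198.51.100.0/24": "192.0.2.1",  # same next hop, no conflicting cover
    "203.0.113.0/24": "192.0.2.3",   # different next hop: keep installed
}
suppressed = [p for p in loc_rib
              if eligible_for_suppression(p, loc_rib, va_next_hop)]
# suppressed == ["198.51.100.0/24"]
```

   The sketch follows the text literally by treating any covering
   prefix with a different next hop as blocking suppression.  A real
   implementation would likely use a prefix trie rather than a linear
   scan, and would re-evaluate eligibility whenever a covering route is
   installed or withdrawn.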

   Similarly, when IBGP multipath is enabled and multiple VA-prefixes
   are detected which are multipath candidates under the given network
   conditions, only those more specific prefixes are subject to
   suppression which have a set of next hops identical to the multipath
   set of the VA-prefixes.

   We illustrate the expected behavior in the figure below.  This figure
   shows an autonomous system with an FIR, FIR1, and an FSR, FSR1.  FSR1
   is an ASBR and is connected to two remote ASBRs, EP1 and EP2.

      +------------------------------------------+
      |           Autonomous System              |   +----+
      |                                          |   |EP1 |
      |                                      /---+---|    |
      |    To  ----\ +----+          +----+ /    |   +----+
      |   Other     \|FIR1|----------|FSR1|/     |
      |  Routers    /|    |          |    |\     |
      |        ----/ +----+          +----+ \    |   +----+
      |                                      \---+---|EP2 |
      |                                          |   |    |
      |                                          |   +----+
      +------------------------------------------+

   Suppose that FSR1 has been enabled to perform S-VA.  Originally it
   receives all routes from FIR1 (doing next hop self) as well as from
   its directly connected EBGP peers EP1 and EP2.  FIR1 will now
   advertise a VA-prefix 0/0 with the next hop set to itself.  That will
   trigger detection of this prefix on FSR1 and suppression of all
   routes which have the same next hop as the VA-prefix and which
   otherwise would be installed in the RIB and FIB.  Note, however, that
   FSR1 will not suppress any EBGP routes received from its peers EP1
   and EP2, because their next hops differ from the one assigned to the
   VA-prefix.

3.  Deployment considerations

   The simplest deployment model of S-VA is its use within the POP.  In
   such a model, the POP-to-core boundary routers (usually RRs in the
   data path) would act as FIRs and would inject the VA-prefix 0/0 to
   all of their clients within the POP.  In this model of operation,
   such ABRs have full routing knowledge, and the client-to-ABR distance
   is negligible compared with the client to intra-domain exit distance.

   Therefore, under the above intra-POP S-VA deployment model, clients
   can be configured so that, even when ABR-to-ABR advertisements are
   not symmetric, there is no need to monitor whether a less specific
   unsuppressed route would cover a suppressed one.  Thus, in this
   particular deployment model, there is no need to detect and reinstall
   previously suppressed routes.

   Another deployment consideration applies to networks which utilize
   route reflection.  When IBGP multipath is enabled, special care must
   be taken that both outbound prefixes and VA-prefixes pass via said
   route reflectors to their clients.

   In order to address the above aspects, the following solutions could
   be considered:

   - Use of intra-POP S-VA.
   - Full mesh: Small or medium size networks where S-VA can be deployed
     are normally fully meshed and do not use route reflection.  It
     also needs to be pointed out that some large networks are fully
     meshed today.
   - Use of add-paths: The new add-paths BGP encoding allows more than
     one overall best path to be distributed from the RR to each client.
   - Alternate advertisement of the VA-prefix: The VA-prefix does not
     need to be advertised in BGP.  BGP suppression will happen as long
     as S-VA is configured with the next hop(s) and the implementation
     verifies that such a VA-prefix is installed in the RIB and FIB.

   In some deployment scenarios BGP routes could be used to resolve
   other BGP routes, a process commonly called double or multi-level BGP
   recursion.  If such recursion involves a specific route resolution
   policy, special care must be taken to either automatically or
   manually exclude routes matching the given policy from suppression.

   Route resolution over a default route is a special case.  Most
   network operating systems can be configured by the operator to enable
   route resolution over default route(s).
   In S-VA all default routes are intra-domain routes and their
   objective is to shift the full lookup from the edge router to a more
   powerful POP-to-core boundary router or exit ASBR.  In those cases
   S-VA should be configured in concert with the global configuration
   regarding resolution via default route.  In the event of actually
   using the default for next hop resolution, the worst-case scenario is
   that packets may be forwarded one more hop and then dropped if a more
   specific destination route is not found there.

   Operators are advised to keep this effect in mind when choosing a
   policy for use of the default route for next hop resolution.

   Selected BGP routes in the RIB may be redistributed to other
   protocols.  If they no longer exist in the RIB, they will not be
   redistributed.  This is especially important when conditional
   redistribution takes place based on the length of the prefix,
   community value, etc.  In those cases where a redistribution policy
   is in place, S-VA code should refrain from suppressing prefixes
   matching such policy.

   In the case where the operator injects a default at the POP-to-core
   boundary into the POP, or alternatively when an intra-domain default
   route is injected into the autonomous system by a set of ASBRs
   peering with their upstreams, special care needs to be taken to make
   sure that any aggregate subnet is advertised only from the BGP
   speakers which inject the default route and therefore attract traffic
   to non-existent destinations.  This completely mitigates a potential
   forwarding issue which is not specific to S-VA but applies to the
   general use of default routes.

4.  IANA Considerations

   There are no IANA considerations.

5.  Security Considerations

   The authors are not aware of any new security considerations due to
   S-VA.

6.  Acknowledgements

   The concept for Virtual Aggregation comes from Paul Francis.
   In this document, the authors only simplified some aspects of its
   behavior to allow simpler adoption by some operators.

   The authors would like to thank Clarence Filsfils and Nick Hilliard
   for their valuable input.

7.  References

7.1.  Normative References

   [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
              Communities Attribute", RFC 1997, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

7.2.  Informative References

   [I-D.ietf-grow-va]
              Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and
              L. Zhang, "FIB Suppression with Virtual Aggregation",
              draft-ietf-grow-va-06 (work in progress), December 2011.

Authors' Addresses

   Robert Raszuk
   NTT MCL
   101 S Ellsworth Avenue Suite 350
   San Mateo, CA 94401
   US

   Email: robert@raszuk.net

   Jakob Heitz
   Ericsson
   300 Holger Way
   San Jose, CA 95135
   USA

   Email: jakob.heitz@ericsson.com

   Alton Lo
   Arista Networks
   5470 Great America Parkway
   Santa Clara, CA 95054
   USA

   Email: altonlo@aristanetworks.com

   Lixia Zhang
   UCLA
   3713 Boelter Hall
   Los Angeles, CA 90095
   US

   Email: lixia@cs.ucla.edu

   Xiaohu Xu
   Huawei Technologies
   No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District
   Beijing, Beijing 100085
   P.R.China

   Phone: +86 10 82836073
   Email: xuxh@huawei.com