Network Working Group                                         P. Francis
Internet-Draft                                                   MPI-SWS
Intended status: Informational                                     X. Xu
Expires: September 2, 2010                                        Huawei
                                                              H. Ballani
                                                              Cornell U.
                                                               R. Raszuk
                                                                   Cisco
                                                                L. Zhang
                                                                    UCLA
                                                           March 1, 2010

                   Simple Virtual Aggregation (S-VA)
                    draft-ietf-grow-simple-va-00.txt

Abstract

   The continued growth in the Default Free Routing Table (DFRT)
   stresses the global routing system in a number of ways. One of the
   most costly stresses is FIB size: ISPs often must upgrade router
   hardware simply because the FIB has run out of space, and router
   vendors must design routers that have adequate FIB. FIB suppression
   is an approach to relieving stress on the FIB by NOT loading selected
   RIB entries into the FIB. Simple Virtual Aggregation (S-VA) is a
   simple form of Virtual Aggregation (VA) that allows any and all edge
   routers to shrink their FIB requirements substantially and therefore
   increase their useful lifetime. S-VA does not change FIB
   requirements for core routers. S-VA is extremely easy to
   configure, considerably more so than the various tricks done today
   to extend the life of edge routers. S-VA can be deployed
   autonomously by an ISP (cooperation between ISPs is not required),
   and can co-exist with legacy routers in the ISP.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 2, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Table of Contents

   1.  Introduction
     1.1.  Scope of this Document
     1.2.  Requirements notation
     1.3.  Terminology
     1.4.  Temporary Sections
       1.4.1.  Document revisions
   2.  Operation of S-VA
     2.1.  Tunnels
     2.2.  Legacy Routers
   3.  IANA Considerations
   4.  Security Considerations
   5.  Acknowledgements
   6.  Normative References
   Authors' Addresses

1.  Introduction

   ISPs today manage constant DFRT growth in a number of ways. One way,
   of course, is for ISPs to upgrade their router hardware before DFRT
   growth outstrips the size of the FIB. This is too expensive for many
   ISPs.
   They would prefer to extend the lifetime of routers whose FIBs
   can no longer hold the full DFRT.

   A common approach taken by lower-tier ISPs is to default route to
   their providers. Routes to customers and peer ISPs are maintained,
   but everything else defaults to the provider. This approach has
   several disadvantages. First, packets to Internet destinations may
   take longer-than-necessary AS paths. This problem can be mitigated
   through careful configuration of partial defaults, but this can
   require substantial configuration overhead. A second problem with
   defaulting to providers is that the ISP is no longer able to provide
   the full DFRT to its customers. Finally, provider defaults prevent
   the ISP from detecting martian packets. As a result, the
   ISP transmits packets that could otherwise have been dropped over its
   expensive provider links. Simple Virtual Aggregation (S-VA) solves
   these problems because the full DFRT is used by core routers.

   An alternative is for the ISP to maintain full routes in its core
   routers, but to filter routes from edge routers that do not require a
   full DFRT. These edge routers can then default route to the core
   routers. This is often possible with edge routers that interface to
   customer networks. The problem with this approach is that it cannot
   be used for all edge routers. For instance, it cannot be used for
   routers that connect to transits. It also should not be used for
   routers that connect to customers that wish to receive the full
   DFRT.

   This draft describes a very simple technique, called Simple Virtual
   Aggregation (S-VA), that allows any and all edge routers to have
   substantially reduced FIB requirements even while still advertising
   and receiving the full DFRT over BGP. The basic idea is as follows.
   Core routers in the ISP maintain the full DFRT in the FIB and RIB.
   Edge routers maintain the full DFRT in the RIB, but suppress certain
   routes from the FIB. Edge routers install a default route to core
   routers. Label Switched Paths (LSPs) are used to transmit packets
   from a core router, through the edge router, to the Next Hop remote
   Autonomous System Border Router (ASBR). ASBRs strip the tunnel
   header (MPLS or IP) before forwarding tunneled packets to the remote
   ASBR (in much the same way MPLS Penultimate Hop Popping (PHP) strips
   the LSP header before forwarding packets to the tunnel target).

   S-VA requires no changes to BGP and no changes to MPLS forwarding
   mechanisms in routers. Configuration is extremely simple: S-VA must
   be enabled, and routers must be told whether they are FIB-suppressing
   routers or not. Everything else is automatic. ISPs can deploy FIB
   suppression autonomously and with no coordination with neighbor ASes.

1.1.  Scope of this Document

   The scope of this document is limited to Intra-domain S-VA operation.
   In other words, it covers the case where a single ISP autonomously
   operates S-VA internally without any coordination with neighboring
   ISPs.

   Note that this document assumes that the S-VA "domain" (i.e., the unit
   of autonomy) is the AS (that is, different ASes run S-VA
   independently and without coordination). For the remainder of this
   document, the terms ISP, AS, and domain are used interchangeably.

   This document applies equally to IPv4 and IPv6.

   S-VA may operate with a mix of upgraded routers and legacy routers.
   There are no topological restrictions placed on the mix of routers.
   In order to avoid loops between upgraded and legacy routers, however,
   legacy routers must be able to terminate tunnels.

   Note that S-VA is a greatly simplified variant of "full VA"
   [I-D.ietf-grow-va]. With full VA, all routers (core or otherwise)
   can have reduced FIBs.
   However, full VA requires substantial new
   configuration and operational complexity compared to S-VA. Note that
   S-VA was formerly specified in [I-D.ietf-grow-va]. It has been moved
   to this separate draft to make it easier to understand.

1.2.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

1.3.  Terminology

   FIB-Installing Router (FIR):  An S-VA router that does not suppress
      any routes, and advertises itself as a default route for 0/0.
      Typically a core router or route reflector would be configured as
      an FIR.

   FIB-Suppressing Router (FSR):  An S-VA router that installs a route
      to 0/0, and may suppress other routes. Typically an edge router
      would be configured as an FSR.

   Install and Suppress:  The terms "install" and "suppress" are used to
      describe whether a RIB entry has been loaded or not loaded into
      the FIB. In other words, the phrase "install a route" means
      "install a route into the FIB", and the phrase "suppress a route"
      means "do not install a route into the FIB".

   Legacy Router:  A router that does not run S-VA, and has no knowledge
      of S-VA.

   Routing Information Base (RIB):  The term RIB is used loosely
      in this document to refer either to the Loc-RIB (as used in
      [RFC4271]), or to the combined Adj-RIBs-In, the Loc-RIB, and the
      Adj-RIBs-Out.

1.4.  Temporary Sections

   This section contains temporary information, and will be removed in
   the final version.

1.4.1.  Document revisions

   Note that S-VA was formerly specified in [I-D.ietf-grow-va]. No new
   functionality is specified in this draft.

2.  Operation of S-VA

   There are three types of routers in S-VA: FIB-Installing routers
   (FIR), FIB-Suppressing routers (FSR), and optionally legacy routers.
   While any router can be an FIR or an FSR (there are no topology
   constraints), the simplest form of deployment is for border routers
   to be configured as FSRs, and for non-border routers (for
   instance the routers used as route reflectors) to be configured as
   FIRs. S-VA, however, does not mandate this deployment.

   FIRs must originate a BGP route to NLRI 0/0 [RFC4271]. The ORIGIN is
   set to INCOMPLETE (value 2), the AS number of the FIR's AS is used in
   the AS_PATH, and the BGP NEXT_HOP is set to the router's own address.
   The ATOMIC_AGGREGATE and AGGREGATOR attributes are not included. The
   FIR MUST attach a NO_EXPORT Communities Attribute [RFC1997] to the
   route.

   FIRs must not FIB-suppress any routes.

   FSRs must FIB-install a route to 0/0. A packet transmitted to an
   FIR (i.e., forwarded on the basis of a 0/0 FIB lookup) must be
   tunneled. This is to prevent loops that would otherwise occur when a
   packet transits multiple FSRs on the way to the core, some of which
   have FIB-installed the route for the destination, and others of which
   have not. FSRs may FIB-install any other routes. They should install
   any routes for which their eBGP neighbor is the NEXT_HOP. There are a
   couple of reasons for this, which are illustrated by the figure
   below. This figure shows an autonomous system with an FIR (FIR1) and
   an FSR (FSR1). FSR1 is an ASBR and is connected to two remote ASBRs,
   EP1 and EP2.

       +------------------------------------------+
       |           Autonomous System              |   +----+
       |                                          |   |EP1 |
       |                                      /---+---|    |
       |  To    ----\ +----+          +----+ /    |   +----+
       | Other       \|FIR1|----------|FSR1|/     |
       |Routers      /|    |          |    |\     |
       |        ----/ +----+          +----+ \    |   +----+
       |                                      \---+---|EP2 |
       |                                          |   |    |
       |                                          |   +----+
       +------------------------------------------+

   Suppose that FSR1 does not FIB-install routes for which EP1 and EP2
   are next hops.
   In this case, when EP2 sends a packet to FSR1 for
   which the next hop is EP1, FSR1 will first tunnel the packet to FIR1,
   which will tunnel it right back to FSR1. This trombone routing is
   avoided if local ASBRs FIB-install routes where their neighbor remote
   ASBRs are the BGP NEXT_HOP.

   In addition, FSR1 cannot filter source addresses using strict unicast
   Reverse Path Forwarding (uRPF) unless it FIB-installs the routes
   learned from the remote ASBR. Note, however, that FSRs cannot do
   loose uRPF. Rather, this must be done by FIRs.

   The above observations lead to the following rules: FSRs that are
   ASBRs should FIB-install all routes for which an eBGP neighbor is the
   BGP NEXT_HOP. FSRs that are ASBRs must FIB-install any routes that
   are used for uRPF.

2.1.  Tunnels

   S-VA works with both MPLS and IP-in-IP tunnels. A packet may need up
   to two tunnels to traverse an AS that runs S-VA. The first tunnel
   runs from an FSR to an FIR (for the 0/0 default). This is called the
   default tunnel. The second tunnel
   targets the remote ASBR which is the BGP NEXT_HOP, although the
   tunnel header is stripped by the local ASBR before transmitting to
   the remote ASBR. This is the exit tunnel. The start of the exit
   tunnel is an ingress local ASBR in the case where the ingress local
   ASBR has FIB-installed the associated route. Otherwise, the start of
   the exit tunnel is an FIR.

   The target address of the default tunnel is always the FIR. If MPLS
   is used, the FIRs must initiate LSPs to themselves using the
   Label Distribution Protocol (LDP) [RFC5036]. RSVP-TE [RFC3209] may
   also be used.

   If IP-in-IP tunnels are used, then the BGP Encapsulation Extended
   Community (BGPencap-Attribute) [RFC5512] is used to convey the
   ability to accept tunnels at the target address (the BGP NEXT_HOP).

   For the exit tunnels, again either MPLS or IP-in-IP can be used.
   In the case of IP-in-IP, the inner label defined in [RFC4023] and
   signaled in BGP with [RFC3107] is used by the local ASBR to identify
   the remote ASBR which is the BGP NEXT_HOP for the packet.
   Specifically, when a local ASBR, which can be either an FSR or an FIR,
   advertises an eBGP-received route into iBGP, it sets the BGP NEXT_HOP
   to itself. It assigns a label to the route. This label is used as
   the inner label in packets tunneled to the local ASBR, and is used to
   identify the remote ASBR from which the route was received. When
   receiving a packet with this label, the local ASBR strips off the
   label, and forwards the native packet to the remote ASBR indicated by
   the label.

   In the case of MPLS, the inner label may or may not be used. If it
   is used, then an LSP is established to the IP address of the local
   ASBR as described above for FIRs. The local ASBR sets the BGP
   NEXT_HOP to itself (the same address that serves as the FEC in the
   LSP). The inner label is established as described in the previous
   paragraph for IP-in-IP tunnels, but with the encapsulation defined in
   [RFC3032].

   If the inner label is not used, then the local ASBR must initiate a
   Downstream Unsolicited LSP for each remote ASBR. The FEC for the LSP
   is the remote ASBR address that is used in the BGP NEXT_HOP field.
   When a packet is received on one of these LSPs, the local ASBR strips
   the MPLS header, and forwards the packet to the remote ASBR indicated
   by the label.

2.2.  Legacy Routers

   S-VA may be operated with a mix of legacy and S-VA-upgraded routers.
   The legacy routers, however, must be able to forward tunneled
   packets. In the case of MPLS tunnels, this means that they must
   fully participate in MPLS signaling. If a legacy router is an ASBR,
   then it must also initiate tunnels to itself and be able to detunnel
   packets (without the inner label).

3.  IANA Considerations

   There are no IANA considerations.

4.  Security Considerations

   The authors are not aware of any new security considerations due to
   S-VA.

5.  Acknowledgements

   The concept for S-VA comes from Robert Raszuk.

6.  Normative References

   [I-D.ietf-grow-va]
              Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and
              L. Zhang, "FIB Suppression with Virtual Aggregation",
              draft-ietf-grow-va-00 (work in progress), May 2009.

   [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
              Communities Attribute", RFC 1997, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, January 2001.

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
              BGP-4", RFC 3107, May 2001.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, December 2001.

   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
              MPLS in IP or Generic Routing Encapsulation (GRE)",
              RFC 4023, March 2005.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
              Specification", RFC 5036, October 2007.

   [RFC5512]  Mohapatra, P. and E. Rosen, "BGP Encapsulation SAFI and
              BGP Tunnel Encapsulation Attribute", RFC 5512, April 2009.
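
Appendix A.  Illustrative Sketch of the FSR Rules

   The FSR rules of Section 2 (install 0/0, install routes whose BGP
   NEXT_HOP is an eBGP neighbor, and tunnel anything that matches only
   the default) can be sketched in Python roughly as follows. This is
   a minimal, non-normative illustration and not part of S-VA itself.
   The names RibEntry, build_fsr_fib, and lookup, the "tunnel" and
   "forward" action labels, the dictionary model of the FIB, and all
   addresses (documentation ranges) are assumptions made for this
   example. The sketch handles IPv4 only and ignores the uRPF rule.

```python
import ipaddress
from dataclasses import dataclass

# The 0/0 default that every FSR must FIB-install.
DEFAULT = ipaddress.ip_network("0.0.0.0/0")


@dataclass(frozen=True)
class RibEntry:
    """Hypothetical, simplified RIB entry: a prefix and its BGP NEXT_HOP."""
    prefix: ipaddress.IPv4Network
    next_hop: str


def build_fsr_fib(rib, ebgp_neighbors, fir_address):
    """Build an FSR's FIB as {prefix: (action, target)}.

    The default route always points at the FIR and is marked "tunnel",
    because packets sent on the basis of a 0/0 lookup must be tunneled.
    Routes whose NEXT_HOP is one of this router's eBGP neighbors are
    installed; every other RIB entry is suppressed (left out of the FIB).
    """
    fib = {DEFAULT: ("tunnel", fir_address)}
    for entry in rib:
        if entry.next_hop in ebgp_neighbors:
            fib[entry.prefix] = ("forward", entry.next_hop)
    return fib


def lookup(fib, destination):
    """Longest-prefix match of an IPv4 destination against the FIB."""
    addr = ipaddress.ip_address(destination)
    best = max((p for p in fib if addr in p), key=lambda p: p.prefixlen)
    return fib[best]


if __name__ == "__main__":
    # Hypothetical topology: FIR1 is 10.0.0.1, EP1 is 198.51.100.1.
    rib = [
        RibEntry(ipaddress.ip_network("192.0.2.0/24"), "198.51.100.1"),
        RibEntry(ipaddress.ip_network("203.0.113.0/24"), "10.0.0.9"),
    ]
    fib = build_fsr_fib(rib, {"198.51.100.1"}, "10.0.0.1")
    print(lookup(fib, "192.0.2.7"))    # route via the eBGP neighbor
    print(lookup(fib, "203.0.113.5"))  # suppressed route, default tunnel
```

   In this sketch the route learned from the eBGP neighbor is forwarded
   natively, which avoids the trombone path through the FIR, while the
   suppressed route falls through to the default tunnel.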

Authors' Addresses

   Paul Francis
   Max Planck Institute for Software Systems
   Gottlieb-Daimler-Strasse
   Kaiserslautern 67633
   Germany

   Phone: +49 631 930 39600
   Email: francis@mpi-sws.org

   Xiaohu Xu
   Huawei Technologies
   No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District
   Beijing, Beijing 100085
   P.R.China

   Phone: +86 10 82836073
   Email: xuxh@huawei.com

   Hitesh Ballani
   Cornell University
   4130 Upson Hall
   Ithaca, NY 14853
   US

   Phone: +1 607 279 6780
   Email: hitesh@cs.cornell.edu

   Robert Raszuk
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA 95134
   USA

   Phone:
   Email: raszuk@cisco.com

   Lixia Zhang
   UCLA
   3713 Boelter Hall
   Los Angeles, CA 90095
   US

   Phone:
   Email: lixia@cs.ucla.edu