Network Working Group                                         P. Francis
Internet-Draft                                                   MPI-SWS
Intended status: Informational                                     X. Xu
Expires: September 3, 2011                                        Huawei
                                                              H. Ballani
                                                              Cornell U.
                                                               R. Raszuk
                                                                   Cisco
                                                                L. Zhang
                                                                    UCLA
                                                           March 2, 2011


                   Simple Virtual Aggregation (S-VA)
                    draft-ietf-grow-simple-va-02.txt

Abstract

   The continued growth in the Default Free Routing Table (DFRT)
   stresses the global routing system in a number of ways.
   One of the most costly stresses is FIB size: ISPs often must upgrade
   router hardware simply because the FIB has run out of space, and
   router vendors must design routers that have adequate FIB. FIB
   suppression is an approach to relieving stress on the FIB by NOT
   loading selected RIB entries into the FIB. Simple Virtual
   Aggregation (S-VA) is a simple form of Virtual Aggregation (VA) that
   allows any and all edge routers to shrink their FIB requirements
   substantially and therefore increase their useful lifetime. S-VA
   does not change FIB requirements for core routers. S-VA is
   extremely easy to configure, considerably more so than the various
   tricks done today to extend the life of edge routers. S-VA can be
   deployed autonomously by an ISP (cooperation between ISPs is not
   required), and can co-exist with legacy routers in the ISP. There
   are no changes from the 01 version to this version.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 3, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document. Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Scope of this Document
      1.2. Requirements notation
      1.3. Terminology
   2. Operation of S-VA
      2.1. Tunnels
      2.2. Legacy Routers
   3. IANA Considerations
   4. Security Considerations
   5. Acknowledgements
   6. Normative References
   Authors' Addresses

1. Introduction

   ISPs today manage constant DFRT growth in a number of ways. One way,
   of course, is for ISPs to upgrade their router hardware before DFRT
   growth outstrips the size of the FIB. This is too expensive for many
   ISPs. They would prefer to extend the lifetime of routers whose FIBs
   can no longer hold the full DFRT.

   A common approach taken by lower-tier ISPs is to default route to
   their providers. Routes to customers and peer ISPs are maintained,
   but everything else defaults to the provider. This approach has
   several disadvantages. First, packets to Internet destinations may
   take longer-than-necessary AS paths.
   This problem can be mitigated through careful configuration of
   partial defaults, but this can require substantial configuration
   overhead. A second problem with defaulting to providers is that the
   ISP is no longer able to provide the full DFRT to its customers.
   Finally, defaulting to providers prevents the ISP from detecting
   martian packets. As a result, the ISP transmits packets that could
   otherwise have been dropped over its expensive provider links.
   Simple Virtual Aggregation (S-VA) solves these problems because the
   full DFRT is used by core routers.

   An alternative is for the ISP to maintain full routes in its core
   routers, but to filter routes from edge routers that do not require a
   full DFRT. These edge routers can then default route to the core
   routers. This is often possible with edge routers that interface to
   customer networks. The problem with this approach is that it cannot
   be used for all edge routers. For instance, it cannot be used for
   routers that connect to transits. It should also not be used for
   routers that connect to customers which wish to receive the full
   DFRT.

   This draft describes a very simple technique, called Simple Virtual
   Aggregation (S-VA), that allows any and all edge routers to have
   substantially reduced FIB requirements even while still advertising
   and receiving the full DFRT over BGP. The basic idea is as follows.
   Core routers in the ISP maintain the full DFRT in the FIB and RIB.
   Edge routers maintain the full DFRT in the RIB, but suppress certain
   routes from the FIB. Edge routers install a default route to core
   routers. Label Switched Paths (LSPs) are used to transmit packets
   from a core router, through the edge router, to the Next Hop remote
   Autonomous System Border Router (ASBR).
   ASBRs strip the tunnel header (MPLS or IP) before forwarding
   tunneled packets to the remote ASBR (in much the same way MPLS
   Penultimate Hop Popping (PHP) strips the LSP header before
   forwarding packets to the tunnel target).

   S-VA requires no changes to BGP and no changes to MPLS forwarding
   mechanisms in routers. Configuration is extremely simple: S-VA must
   be enabled, and routers must be told whether they are FIB-suppressing
   routers or not. Everything else is automatic. ISPs can deploy FIB
   suppression autonomously and with no coordination with neighbor ASes.

1.1. Scope of this Document

   The scope of this document is limited to Intra-domain S-VA operation.
   In other words, it covers the case where a single ISP autonomously
   operates S-VA internally without any coordination with neighboring
   ISPs.

   Note that this document assumes that the S-VA "domain" (i.e., the
   unit of autonomy) is the AS (that is, different ASes run S-VA
   independently and without coordination). For the remainder of this
   document, the terms ISP, AS, and domain are used interchangeably.

   This document applies equally to IPv4 and IPv6.

   S-VA may operate with a mix of upgraded routers and legacy routers.
   There are no topological restrictions placed on the mix of routers.
   In order to avoid loops between upgraded and legacy routers, however,
   legacy routers must be able to terminate tunnels.

   Note that S-VA is a greatly simplified variant of "full VA"
   [I-D.ietf-grow-va]. With full VA, all routers (core or otherwise)
   can have reduced FIBs. However, full VA requires substantial new
   configuration and operational complexity compared to S-VA. Note that
   S-VA was formerly specified in [I-D.ietf-grow-va]. It has been moved
   to this separate draft to simplify its understanding.

1.2. Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

1.3. Terminology

   FIB-Installing Router (FIR): An S-VA router that does not suppress
      any routes, and advertises itself as the next hop for a default
      route to 0/0. Typically a core router or route reflector would
      be configured as an FIR.

   FIB-Suppressing Router (FSR): An S-VA router that installs a route
      to 0/0, and may suppress other routes. Typically an edge router
      would be configured as an FSR.

   Install and Suppress: The terms "install" and "suppress" are used to
      describe whether a RIB entry has been loaded or not loaded into
      the FIB. In other words, the phrase "install a route" means
      "install a route into the FIB", and the phrase "suppress a route"
      means "do not install a route into the FIB".

   Legacy Router: A router that does not run S-VA, and has no knowledge
      of S-VA.

   Routing Information Base (RIB): The term RIB is used loosely in this
      document to refer either to the Loc-RIB (as used in [RFC4271]),
      or to the combined Adj-RIBs-In, the Loc-RIB, and the
      Adj-RIBs-Out.

2. Operation of S-VA

   There are three types of routers in S-VA: FIB-Installing routers
   (FIR), FIB-Suppressing routers (FSR), and optionally legacy routers.
   While any router can be an FIR or an FSR (there are no topology
   constraints), the simplest form of deployment is for border routers
   to be configured as FSRs, and for non-border routers (for instance
   the routers used as route reflectors) to be configured as FIRs.
   S-VA, however, does not mandate this deployment per se.

   FIRs must originate a BGP route to NLRI 0/0 [RFC4271].
   The ORIGIN is set to INCOMPLETE (value 2), the AS number of the
   FIR's AS is used in the AS_PATH, and the BGP NEXT_HOP is set to the
   router's own address. The ATOMIC_AGGREGATE and AGGREGATOR
   attributes are not included. The FIR MUST attach a NO_EXPORT
   Communities Attribute [RFC1997] to the route.

   FIRs must not FIB-suppress any routes.

   FSRs must FIB-install a route to 0/0. When transmitting a packet to
   a FIR (i.e., based on a 0/0 FIB lookup), the packet must be tunneled.
   This is to prevent loops that would otherwise occur when a packet
   transits multiple FSRs on the way to the core, some of which have
   FIB-installed the route for the destination, and others of which have
   not. FSRs may FIB-install any other routes. They should install any
   routes for which their eBGP neighbor is the NEXT_HOP. There are a
   couple of reasons for this, which can be illustrated with the figure
   below. This figure shows an autonomous system with a FIR (FIR1) and
   an FSR (FSR1). FSR1 is an ASBR and is connected to two remote ASBRs,
   EP1 and EP2.

   +------------------------------------------+
   |            Autonomous System             |   +----+
   |                                          |   |EP1 |
   |                                     /----+---|    |
   | To  ----\ +----+            +----+ /     |   +----+
   |Other     \|FIR1|------------|FSR1|/      |
   |Routers   /|    |            |    |\      |
   |     ----/ +----+            +----+ \     |   +----+
   |                                     \----+---|EP2 |
   |                                          |   |    |
   |                                          |   +----+
   +------------------------------------------+

   Suppose that FSR1 does not FIB-install routes for which EP1 and EP2
   are next hops. In this case, when EP2 sends a packet to FSR1 for
   which the next hop is EP1, FSR1 will first tunnel the packet to FIR1,
   which will tunnel it right back to FSR1. This trombone routing is
   avoided if local ASBRs FIB-install routes where their neighbor remote
   ASBRs are the BGP NEXT_HOP.
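
   The install decision described above can be sketched in a few lines
   of Python. This is only an illustration under stated assumptions,
   not part of the S-VA specification: the addresses, the RibEntry
   type, and the build_fsr_fib and lookup helpers are all hypothetical,
   and the policy shown is the recommended minimum (an FSR may
   FIB-install additional routes).

```python
import ipaddress
from dataclasses import dataclass

# All addresses and names below are hypothetical illustration values.
FIR1 = "10.0.0.1"   # address of the FIB-Installing Router
EP1 = "10.1.1.1"    # remote ASBR reached over eBGP
EP2 = "10.1.2.1"    # remote ASBR reached over eBGP


@dataclass(frozen=True)
class RibEntry:
    prefix: str     # destination prefix held in the RIB
    next_hop: str   # BGP NEXT_HOP of the selected path


def build_fsr_fib(rib, ebgp_neighbors, fir_address):
    """Pick the RIB entries an FSR loads into its FIB.

    The 0/0 route is always installed and points at a tunnel to the FIR.
    Routes whose NEXT_HOP is an attached eBGP neighbor are installed;
    every other RIB entry is suppressed in this sketch.
    """
    fib = {ipaddress.ip_network("0.0.0.0/0"): ("tunnel", fir_address)}
    for entry in rib:
        if entry.next_hop in ebgp_neighbors:
            network = ipaddress.ip_network(entry.prefix)
            fib[network] = ("forward", entry.next_hop)
    return fib


def lookup(fib, destination):
    """Longest-prefix match over the FIB; 0/0 guarantees a match for IPv4."""
    address = ipaddress.ip_address(destination)
    matches = [network for network in fib if address in network]
    best = max(matches, key=lambda network: network.prefixlen)
    return fib[best]


# RIB of FSR1 from the figure: two routes learned from its eBGP
# neighbors and one route that exits through some other local ASBR.
rib = [
    RibEntry("198.51.100.0/24", EP1),
    RibEntry("203.0.113.0/24", EP2),
    RibEntry("192.0.2.0/24", "10.0.0.9"),
]
fsr1_fib = build_fsr_fib(rib, {EP1, EP2}, FIR1)
```

   With this FIB, a packet arriving from EP2 for a destination in
   198.51.100.0/24 matches the installed route and is forwarded
   straight to EP1, so it never takes the detour through FIR1. A packet
   for 192.0.2.0/24 matches only 0/0 and is tunneled to FIR1.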

   In addition, FSR1 cannot filter source addresses using strict unicast
   Reverse Path Forwarding (uRPF) unless it FIB-installs the routes
   learned from the remote ASBR. Note, however, that FSRs cannot do
   loose uRPF. Rather, this must be done by FIRs.

   The above observations lead to the following rules: FSRs that are
   ASBRs should FIB-install all routes for which the neighbor is the BGP
   NEXT_HOP. FSRs that are ASBRs must FIB-install any routes that are
   used for uRPF.

2.1. Tunnels

   S-VA works with both MPLS and IP-in-IP tunnels. Up to two tunnels
   may be required for a packet to traverse an AS with S-VA. The first
   tunnel is that from an FSR to a FIR (for the 0/0 default). This is
   called the default tunnel. The second tunnel targets the remote ASBR
   which is the BGP NEXT_HOP, although the tunnel header is stripped by
   the local ASBR before transmitting to the remote ASBR. This is the
   exit tunnel. The start of the exit tunnel is an ingress local ASBR
   in the case where the ingress local ASBR has FIB-installed the
   associated route. Otherwise, the start of the exit tunnel is a FIR.

   The target address of the default tunnel is always the FIR. If MPLS
   is used, the FIRs must initiate LSPs to themselves using the Label
   Distribution Protocol (LDP) [RFC5036]. RSVP-TE [RFC3209] may also be
   used.

   If IP-in-IP tunnels are used, then the BGP Encapsulation Extended
   Community (BGPencap-Attribute) ([RFC5512]) is used to convey the
   ability to accept tunnels at the target address (the BGP NEXT_HOP).

   For the exit tunnels, again either MPLS or IP-in-IP can be used. In
   the case of IP-in-IP, the inner label defined in [RFC4023] and
   signaled in BGP with [RFC3107] is used by the local ASBR to identify
   the remote ASBR which is the BGP NEXT_HOP for the packet.
   Specifically, when a local ASBR, which can be either an FSR or a FIR,
   advertises an eBGP-received route into iBGP, it sets the BGP NEXT_HOP
   as itself. It assigns a label to the route. This label is used as
   the inner label in packets tunneled to the local ASBR, and is used to
   identify the remote ASBR from which the route was received. When
   receiving a packet with this label, the local ASBR strips off the
   label, and forwards the native packet to the remote ASBR indicated by
   the label.

   In the case of MPLS, the inner label may or may not be used. If it
   is used, then an LSP is established to the IP address of the local
   ASBR as described above for FIRs. The BGP NEXT_HOP is set to be
   itself (the same address that serves as the FEC in the LSP). The
   inner label is established as described in the previous paragraph for
   IP-in-IP tunnels, but with the encapsulation defined in [RFC3032].

   If the inner label is not used, then the local ASBR must initiate a
   Downstream Unsolicited LSP for each remote ASBR. The FEC for the LSP
   is the remote ASBR address that is used in the BGP NEXT_HOP field.
   When a packet is received on one of these LSPs, the local ASBR strips
   the MPLS header, and forwards the packet to the remote ASBR indicated
   by the label.

2.2. Legacy Routers

   S-VA may be operated with a mix of legacy and S-VA-upgraded routers.
   The legacy routers, however, must be able to forward tunneled
   packets. In the case of MPLS tunnels, this means that they must
   fully participate in MPLS signaling. If a legacy router is an ASBR,
   then it must also initiate tunnels to itself and be able to detunnel
   packets (without the inner label).

3. IANA Considerations

   There are no IANA considerations.

4. Security Considerations

   The authors are not aware of any new security considerations due to
   S-VA.

5. Acknowledgements

   The concept for S-VA comes from Robert Raszuk.

6. Normative References

   [I-D.ietf-grow-va]
              Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and
              L. Zhang, "FIB Suppression with Virtual Aggregation",
              draft-ietf-grow-va-00 (work in progress), May 2009.

   [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
              Communities Attribute", RFC 1997, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, January 2001.

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
              BGP-4", RFC 3107, May 2001.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, December 2001.

   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
              MPLS in IP or Generic Routing Encapsulation (GRE)",
              RFC 4023, March 2005.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
              Specification", RFC 5036, October 2007.

   [RFC5512]  Mohapatra, P. and E. Rosen, "BGP Encapsulation SAFI and
              BGP Tunnel Encapsulation Attribute", RFC 5512, April 2009.

Authors' Addresses

   Paul Francis
   Max Planck Institute for Software Systems
   Gottlieb-Daimler-Strasse
   Kaiserslautern 67633
   Germany

   Phone: +49 631 930 39600
   Email: francis@mpi-sws.org


   Xiaohu Xu
   Huawei Technologies
   No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District
   Beijing, Beijing 100085
   P.R.China

   Phone: +86 10 82836073
   Email: xuxh@huawei.com


   Hitesh Ballani
   Cornell University
   4130 Upson Hall
   Ithaca, NY 14853
   US

   Phone: +1 607 279 6780
   Email: hitesh@cs.cornell.edu


   Robert Raszuk
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA 95134
   USA

   Phone:
   Email: raszuk@cisco.com


   Lixia Zhang
   UCLA
   3713 Boelter Hall
   Los Angeles, CA 90095
   US

   Phone:
   Email: lixia@cs.ucla.edu