Independent Submission                                       M. Stenberg
Internet-Draft                                                  O. Troan
Expires: March 24, 2007                              cisco Systems, Inc.
                                                      September 20, 2006


      IPv6 Prefix Delegation routing state maintenance approaches
                 draft-stenberg-pd-route-maintenance-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 24, 2007.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   The maintenance of Prefix Delegation (PD) routing state is an issue
   that people have discussed in the IETF DHC WG, and there have been
   drafts on the topic.  However, as the pros and cons of the different
   routing state maintenance solutions have not been examined
   thoroughly, this text attempts to shed some light on both the actual
   problem and the various alternative solutions.

1.  Introduction

   A prefix delegation deployment consists of Requesting Routers (RR),
   Delegating Routers (DR) and possibly a backend provisioning system
   (see Figure 1).  The delegated prefix has to be routed in the
   network.
   This document explores various alternatives for how the route for
   the delegated prefix can be injected in the network, and how the
   routing state can be maintained.

   /~~~~~~~~~\
   | Network |
   \~~~~~~~~~/
        |
        |-----------------------------------------
        |                                         \
        | +-----------------------------+-------------------+
        | | Backend provisioning system | DR 4 (integrated) |
        | +-----------------------------+-------------------+
        |    |                         |          |
        | +------+                  +------+   +------+
        |-| DR 1 |                  | DR 3 |   | RR 3 |
        \ +------+                  +------+   +------+
         --- | --------------------/   |
          +------+                  +------+
          | DR 2 |                  | RR 2 |
          +------+                  +------+
             |
          +------+
          | RR 1 |
          +------+

          Figure 1: Possible prefix delegation deployment cases.

   Prefix delegation is a stateful protocol.  The RR needs to maintain
   state so that it can sub-delegate prefixes to downstream links.  The
   RR maintains soft-state which can be recovered by redoing the prefix
   request (for example, using Dynamic Host Configuration Protocol for
   IPv6 (DHCPv6) [1] with the Prefix Delegation options defined in [2]).

   If the DR should do route injection on behalf of the RR, it needs to
   maintain state.  The backend provisioning system must maintain a list
   of prefixes delegated, as a prefix delegation is a long-lived entity
   (lifetime of a customer relationship, as in months or years).  The
   backend system and DR might run on the same router.  This document
   focuses on the case where the backend system and DR are separate and
   the DR has little or no persistent storage.  Therefore the DR 4 case
   (in Figure 1) is not covered here, as it is trivial - the backend has
   nonvolatile storage for the prefixes, and it can re-inject the routes
   when the integrated DR 4 restarts.

   The DR's routing state that needs to be maintained can be divided
   into two distinct categories: local routing state (that is, a local
   RIB entry containing the next-hop and the interface the assigned
   prefix is connected to), and global (AS-wide) routing state which
   requires advertising the route via a routing protocol.

   Advertising a route per delegation from the DR can be avoided if
   there is an aggregate prefix covering the delegation.  This requires
   stringent address allocation procedures and prohibits an RR from
   moving to a different DR.

2.  Different approaches for maintaining routing state

   As any router (or the backend system for that matter) can go offline
   and come back up later, it is necessary for the system to recover
   from these intermittent failures.

   The problem is how to delegate responsibility for route maintenance
   to one (or more) of the three components of the system, and let it
   take care of keeping the required routing state in place for the
   RR's prefix.

2.1.  Backend provisioning system responsible for routing state

   Considering that the backend provisioning system is the only
   component in the system that actually requires a significant amount
   of nonvolatile storage, from a data system point of view it would be
   ideal to have the backend provisioning system responsible for
   maintaining the routing state as well.

   It would mean that the backend provisioning system should, when a DR
   restarts, (securely) re-inject the local or global routing state into
   that DR.

   In practice, this is infeasible:

   o  There is no standard way of detecting when the DR is restarted.

   o  Redundancy of DRs, or of the links between DR and backend system,
      makes it difficult for the backend system to judge the state of
      the DR accurately without significant extra configuration data
      about the deployed configuration.

   o  None of the current routing protocols are suitable for altering a
      remote router's local routing information, and therefore some
      protocol development would be in order for this approach to be
      usable.  There are also security implications with this solution.

   o  Lack of scalability; the benefit of having a 'backend'
      provisioning system disappears as it will need to take care of
      maintaining the routes of every one of its DRs.

   o  The backend may lack the information to identify the DR to the
      routing system.  With multiple DRs, if the delegation protocol
      does not contain everything needed to re-inject the route later on
      to the specific DR, this approach does not work.  For example,
      DHCPv6 does not uniquely identify the relays.  There is a further
      problem if interim DRs do not have addresses that the backend
      provisioning system can reach.  Not all DRs may have global
      unicast addresses, and this is problematic especially in
      configurations spanning multiple administrative domains.

   Considered as the responsible component, the backend provisioning
   system is clearly NOT the way to go.  That leaves the DR and RR
   components.

2.2.  Delegating router responsible for routing state

   As the DR is part of the local routing infrastructure, placing the
   responsibility for routing state in the DR seems sensible.  With that
   design decision, the next problem is _when_ the routing information
   is updated after the DR restarts:

2.2.1.  Approach 1: On-demand lease query

   In the on-demand lease query case as defined in [3], the routing
   state maintenance problem is assumed to be local, and therefore the
   DR will receive packets both from the network at large and from the
   RR even after a loss of local state caused by a restart.

   When traffic arrives at the DR, either from the RR or from the
   network, for a prefix without local routing information, the DR will
   perform a lease query, acquire the allocated prefix, and update the
   routing information appropriately.

   This approach, while simple to specify, has some major issues:

   o  It depends on the aggregated prefix to cause the inbound traffic
      to wind up in the DR.  This assumption may not be valid, depending
      on the address assignment policies of the organization.
      Geographical or network topological hierarchical address
      assignment at large seems to be a failure, and it is unclear if
      all deployments can really implement this.

   o  It requires the incoming traffic both from the RR and the network,
      for which no route exists, to trigger the lease query.  This has
      two negative side effects: it requires support from the fast path
      hardware in the DR, and potentially causes a large number of
      spurious requests to the backend provisioning system (up to the
      desired rate that is considered harmful to the system).

   o  It requires simulated ordering of the unordered transaction
      stream, to ensure that the routing state is maintained correctly.
      The DR cannot be argued to be particularly stateless anymore.

2.2.2.  Approach 2: Anticipatory lease query

   Anticipatory, or bulk, lease query solves the routing state problem
   by requesting ALL prefix information from the backend provisioning
   system at DR restart time.  There are two different ways to do
   this.  The first is asynchronous: the old state is fetched while
   delegation requests are being handled, which requires a
   synchronization algorithm between the bulk data retrieved from the
   backend system and the requests served during the retrieval.  For
   synchronization, some sort of ordering of the transaction stream is
   needed.
   The second alternative is synchronous: the bulk query is performed
   first, and only then are the RRs' requests handled.

   Bulk query has several advantages over the on-demand case:

   o  No need for triggering based on either inbound or outbound traffic
      for the prefix.

   o  If the DR handles the query synchronously, we can avoid the
      ordering of the transaction stream and the associated complexity
      arising from it.

   o  Given a reasonable TCP transport scheme, the transfer of the state
      is more efficient than the on-demand case in terms of total number
      of packets.

   o  Does not require changes to fast path hardware, as no new triggers
      are needed from the traffic.  Instead, simple additional code in
      the system initialization is enough.

   Unfortunately, it also has some disadvantages:

   o  It causes more uneven load on the backend provisioning system than
      the on-demand case.  If the prefix is not being actively used at
      the time, it will not cause traffic in the on-demand case, but it
      will in the bulk case.

   o  Synchronization is non-trivial if the DR serves RR requests during
      the bulk retrieval of the data.

   o  Doesn't work very well with virtual interfaces - it is hard to
      retrieve state at boot time if the interfaces themselves come up
      only later, and because of their transient nature, mapping a DUID
      to an individual customer is difficult.

2.2.3.  Approach 3: Persistent storage

   It is possible for the DR to store the route information to be
   injected either locally, or on some adjacent storage node.  The clear
   advantage of this is the lack of traffic on the wire.

   Unfortunately, it also has some problems: the data may be outdated
   due to lack of synchronization, and the management overhead when,
   for example, customers move around would be significant.

   However, in most deployment scenarios persistent storage at or near
   all routers is not desirable or possible in the first place, so this
   is listed simply for the sake of completeness.

2.3.  Requesting router responsible for routing state

   The most interested party in the routing state of the given prefix is
   the RR itself; therefore, giving it the responsibility for
   maintaining its routing state seemed to be an idea worth
   considering.

   Due to operators' wariness of systems not under their direct control,
   even with the RR responsible for maintenance of the state, the real
   route injection should be handled by the DR.

   The nice thing about some of the RR-oriented solutions is that they
   can be deployed without any changes to the rest of the
   infrastructure.

2.3.1.  Approach 1: Layer-2 detection of link state

   If the RR implementation gets notifications about the state of the
   link layer, it can detect the network link going down and coming
   back up; performing reconfiguration to ensure that the routing state
   is still up seems like a trivial solution in this case.

   This solution can be the best one when operating over
   connection-oriented media (PPPoE, L2TP) but it doesn't work on, say,
   Ethernet without a direct connection between the RR and DR.

2.3.2.  Approach 2: Keepalive

   If the RR doesn't have an L2 way of detecting that the DR has been
   restarted, it can maintain a keepalive mechanism using, for example,
   Bidirectional Forwarding Detection (BFD) [4] to send self-addressed
   echo packets to the DR and wait for their replies.  Implementations
   SHOULD do this only if there is no traffic from the network within a
   desired period of time - see the definition of forward progress
   detection in IPv6 Neighbor Unreachability Detection (NUD) [5] as a
   way to send keepalives only when truly necessary.
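
   The keepalive decision described above can be illustrated with a
   minimal sketch.  It is not part of any specification: the names
   KeepaliveState, on_inbound_traffic and tick, and all timer values,
   are hypothetical, and the actual sending of echo packets is left
   out.  The sketch assumes that any packet received via the DR counts
   as forward progress.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeepaliveState:
    """Hypothetical per-DR keepalive bookkeeping kept by the RR."""
    idle_threshold: float          # seconds without inbound traffic before probing
    retry_interval: float          # seconds to wait for an echo before re-sending
    max_retries: int = 3           # re-sends allowed before giving up
    last_progress: float = 0.0     # time of the last inbound packet
    last_probe: Optional[float] = None
    retries: int = 0


def on_inbound_traffic(state: KeepaliveState, now: float) -> None:
    """Any inbound packet (including an echo reply) is forward progress."""
    state.last_progress = now
    state.last_probe = None
    state.retries = 0


def tick(state: KeepaliveState, now: float) -> str:
    """Return 'idle', 'probe', 'wait' or 'reconfigure' for the current time."""
    if now - state.last_progress < state.idle_threshold:
        return "idle"              # recent traffic: no keepalive needed
    if state.last_probe is None:
        state.last_probe = now
        return "probe"             # first echo after the idle period
    if now - state.last_probe < state.retry_interval:
        return "wait"              # a reply may still be on its way
    if state.retries >= state.max_retries:
        return "reconfigure"       # assume the DR lost its state
    state.retries += 1
    state.last_probe = now
    return "probe"                 # re-send the echo
```

   In this sketch the caller is expected to send an echo packet when
   tick returns "probe" and to redo the prefix request when it returns
   "reconfigure".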

   Assuming sub-second round-trips (a reasonable assumption in most
   modern network environments), the dominant factor in determining the
   keepalive timeout is the recovery speed of the DR (by orders of
   magnitude), as it can take from some seconds (hot standby) to minutes
   (non-HA restart, or cold standby with huge configurations).  The
   initial keepalive timer should be some fraction of the highest delay
   in the system, that is, the DR recovery time.  The subsequent
   retries, if no reply is received within a reasonable timeframe,
   should be calculated based on the link delay and jitter, to ensure
   that the reply is unlikely to be coming back by the time the
   keepalive message is re-sent.

   As far as overhead is concerned, assuming the cold standby/restart
   takes minute(s), with a keepalive every 60 seconds, for example, the
   QoS would remain roughly the same as with shorter intervals (as the
   DR going down would cause an interruption in the routing in the order
   of minute(s) in any case).  This interval would cause an overhead of
   1/60, or about 0.017, pps per RR, and it is unlikely to be the straw
   that breaks the camel's back for the DR.  With any traffic, even NUD
   packets should outnumber the keepalive traffic.

   As far as resource utilization is concerned, this solution involves
   only the routing plane of the RR, the data link between RR and DR,
   and the fast path of the DR, which bounces the packets back.
   Therefore it can be argued to be a fairly lightweight general-purpose
   solution.

2.3.3.  Approach 3: Short lifetimes

   The current best practice for maintaining the routing state is to set
   short configuration lifetimes (DHCP T1/T2 values).  It causes extra
   traffic and load on the whole DHCP infrastructure, because during
   every reconfiguration, even with the DR constantly up and running,
   the backend system is queried.  The transaction involves all three
   components.
   Due to that, every RR will cause constant load on the backend system
   itself over time, so the solution simply does not scale well.

2.3.4.  Approach 4: Routing protocol to the requesting router

   The final RR-based approach consists of the RR actually running a
   routing protocol; this way, the RR can simply advertise the prefix
   as it receives it, and everything just works.  Or not, as the case
   may be.

   The downside is the security, or complete lack of it.  The DR
   accepting arbitrary RR-advertised prefixes (assuming no state at the
   DR) should be acceptable only if DR and RR are within the same
   administrative domain.  For that case, this is probably the cleanest
   solution of all.

   If administrative boundaries are crossed, the DR will not take the
   prefix advertisement at face value.  The DR will have the extra
   overhead of checking with the backend provisioning system for AAA
   purposes before actually doing anything with the prefix.  This can
   imply a look-up for validity using the prefix and the interface the
   advertisement came from, including the DUID or some other identifier
   within the route advertisement message, or using some real AAA
   mechanism if the routing protocol supports one.  If minimal changes
   to the routing protocol implementation are desired, it is also
   possible to ignore the advertisement itself, and just trigger an
   on-demand lease query, thereby using the routing protocol just as an
   alternative keepalive mechanism, with most of the logic placed in
   the DR instead of the RR.

                        /~~~~~~~~~~\
                        |  Network |
                        \~~~~~~~~~~/
                            |   |
                  ---------/     \---------
                 /                         \
      +---------------------+   +---------------------+
      | Delegating router 1 |   | Delegating router 2 |
      +---------------------+   +---------------------+
                  \                       /
                   -----------+-----------
                              |
                   +---------------------+
                   |  Requesting router  |
                   +---------------------+

                     Figure 2: Multihomed deployment.
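
   The validity look-up described before Figure 2 could, as one
   possibility, take the following shape.  This is only a sketch under
   stated assumptions: the Delegation record, the
   validate_advertisement function and the in-memory record list are
   hypothetical stand-ins for whatever the backend provisioning system
   returns, and the prefixes are documentation prefixes.

```python
import ipaddress
from typing import Iterable, NamedTuple, Optional


class Delegation(NamedTuple):
    """Hypothetical delegation record as returned by the backend."""
    prefix: ipaddress.IPv6Network
    interface: str
    duid: str


def validate_advertisement(records: Iterable[Delegation],
                           prefix_text: str,
                           interface: str,
                           duid: Optional[str] = None) -> bool:
    """Accept an RR-advertised prefix only if the backend delegated exactly
    that prefix on the interface the advertisement arrived on.  If the
    advertisement carried a DUID, it has to match the record as well."""
    advertised = ipaddress.IPv6Network(prefix_text)
    for record in records:
        if record.prefix == advertised and record.interface == interface:
            return duid is None or record.duid == duid
    return False


# Example data standing in for a backend query result.
RECORDS = [
    Delegation(ipaddress.IPv6Network("2001:db8:1::/48"), "eth0", "duid-a"),
]
```

   A DR that does not want to trust the advertisement at all can skip
   this check and treat the advertisement purely as a trigger for an
   on-demand lease query.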

   There is also a case where the RR HAS to run its own routing
   protocol; in a multihomed situation like that of Figure 2, with the
   same routable prefix advertised via two different DRs, there is no
   other practical way to get the system working.  Of course, static
   route configuration is always an alternative but it is seldom
   desirable.

   The routing-protocol-based solutions all require a significant level
   of trust between RR and DR; regrettably the current routing protocols
   are not designed with AAA (or security for that matter) in mind, and
   therefore when crossing administrative boundaries, the alternatives
   are either using them as-is as a hint that something needs to be
   done, or significantly extending the protocols in the AAA direction.

   Adding extra complexity to the DR's routing protocol implementation
   or configuration is not desirable in general.

   Finally, the current prefix delegation solution (DHCPv6 PD) does not
   provide the information about which routing protocol to use, and
   there is no routing protocol auto-negotiation protocol.  Therefore
   the auto-configuration of the RR with an arbitrary routing protocol
   cannot be done currently.

3.  Security Considerations

   The backend-oriented solution detailed in Section 2.1 implies a
   significant level of trust between the DR and the backend
   provisioning system.  The system's configuration is simpler if the
   backend provisioning system can inject arbitrary routes to the DR,
   but allowing injection of routes for only specific sub-prefixes of a
   specific prefix is a considerably more secure solution.
   Unfortunately it requires advance configuration of the prefix(es)
   involved.

   The delegating router-based solutions detailed in Section 2.2 do not
   have any security issues, assuming the delegation protocol itself is
   secured, or can be assumed to be used only within a trusted network.
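
   The restricted injection policy mentioned above for the
   backend-oriented solution can be sketched as a simple containment
   check.  The function name injection_allowed and the configured
   aggregate are hypothetical; the sketch assumes the DR has been
   configured in advance with the prefix(es) under which the backend
   may inject routes.

```python
import ipaddress
from typing import Iterable


def injection_allowed(route_text: str,
                      allowed_aggregates: Iterable[str]) -> bool:
    """Return True only if the route is one of the pre-configured
    aggregates or a sub-prefix of one of them."""
    route = ipaddress.IPv6Network(route_text)
    return any(
        route.subnet_of(ipaddress.IPv6Network(aggregate))
        for aggregate in allowed_aggregates
    )


# Example configuration using a documentation prefix.
ALLOWED = ["2001:db8:1::/48"]
```

   With this configuration a /56 inside 2001:db8:1::/48 is accepted,
   while a default route or a prefix outside the aggregate is refused.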

   The requesting router-based solutions in Section 2.3, even if
   incorrectly implemented, at most cause extra load on the DR.  As
   noted in Section 2.3.4, even when running a routing protocol from the
   RR, ideally the DR should consider the advertisements only a hint at
   best if the RR is not part of the same administrative domain.  This
   may not be ideal if the routing protocol information should be
   propagated as-is onward, as in the multihoming cases.  Unfortunately,
   those cases also most likely cross administrative boundaries (the
   requesting router being part of one domain, and connected to
   delegating routers in most likely more than one).  The providers
   will most likely not trust the routing protocol to be used as-is at
   the delegating routers, and the complexity of those routers will
   increase due to the required AAA/policy checks.  This is a potential
   security risk in a critical part of the network infrastructure.

4.  IANA Considerations

   As this document is informational in nature and only summarizes
   current best practices, it does not require action from IANA.

5.  Summary

   The backend provisioning system should not be assigned the
   responsibility for the maintenance of the route.  As seen in
   Section 2.1, that approach has significant obstacles without any
   clear benefits.

   If the link layer state can be used to detect the (potential) restart
   of the delegating router, the requesting router-based simple
   reconfiguration described in Section 2.3.1 seems to be the best
   choice.

   When link layer state is not available, there is no clear 'best'
   solution.
   The tradeoff seems to be between increasing the complexity of the
   delegation protocol and the delegating router/backend system (as
   described in the lease query cases in Section 2.2.1 and
   Section 2.2.2), decreasing the scalability of the system
   significantly by using low lifetimes for configuration (as described
   in Section 2.3.3), or accepting the small overhead of the keepalive
   (as described in Section 2.3.2).

   Only in multihoming cases, given some extensions to the current
   prefix delegation protocol, should a routing protocol on the
   requesting router be considered, as described in Section 2.3.4.  The
   multihoming solution itself is challenging to do securely, as noted
   in Section 3, due to the lack of AAA support in routing protocols.

6.  References

   [1]  Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., and M.
        Carney, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)",
        RFC 3315, July 2003.

   [2]  Troan, O. and R. Droms, "IPv6 Prefix Options for Dynamic Host
        Configuration Protocol (DHCP) version 6", RFC 3633,
        December 2003.

   [3]  Brzozowski, J., "DHCPv6 Leasequery",
        draft-ietf-dhc-dhcvp6-leasequery-00 (work in progress),
        August 2006.

   [4]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection",
        draft-ietf-bfd-base-05 (work in progress), June 2006.

   [5]  Narten, T., Nordmark, E., and W. Simpson, "Neighbor Discovery
        for IP Version 6 (IPv6)", RFC 2461, December 1998.

Appendix A.  Acknowledgements

   Thanks to Bernie Volz for feedback during writing of the document.

Authors' Addresses

   Markus Stenberg
   cisco Systems, Inc.
   Shinjuku Mitsui Building, 2-1-1, Nishi-Shinjuku
   Shinjuku-Ku, Tokyo-to  1630409
   JP

   Email: mstenber@cisco.com


   Ole Troan
   cisco Systems, Inc.
   Shinjuku Mitsui Building, 2-1-1, Nishi-Shinjuku
   Shinjuku-Ku, Tokyo-to  1630409
   JP

   Email: ot@cisco.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2006).
   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.