Internet Engineering Task Force                           Francis Dupont
INTERNET DRAFT                                                 GIE DYADE
Expires in 6 months                                     February 3, 1998

     Multihomed routing domain issues for IPv6 aggregatable scheme

Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its
   areas, and its working groups.  Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   To view the entire list of current Internet-Drafts, please check
   the "1id-abstracts.txt" listing contained in the Internet-Drafts
   Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
   (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
   Coast), or ftp.isi.edu (US West Coast).

   Distribution of this memo is unlimited.

Abstract

   This document presents some issues for multihomed routing domains
   using the aggregatable addressing and routing scheme.  A routing
   domain is multihomed when it uses two or more upper-level
   providers.  Most of these issues are not specific to IPv6 but are
   consequences of the addressing and routing scheme.

1. Introduction

   The aggregatable addressing and routing scheme [AGGR] defines an
   IPv6 aggregatable global unicast address format for use in the
   Internet and the associated routing.

   The address assignment and allocation mechanism is fully
   hierarchical: a prefix of a given level (i.e. of a given length)
   denotes all the destinations in the prefix, i.e. it aggregates them.
   The customers of an Internet service provider are in its prefix (as
   a consequence a multihomed routing domain has several prefixes).

   The routing is standard datagram routing, hop by hop, on the
   destination address only (as in IPv4).  But it is prefix routing,
   i.e. forwarding decisions are based on a "longest prefix match"
   algorithm on arbitrary bit boundaries without any knowledge of the
   internal structure of addresses.

   When there are two routes for the same prefix with the same length,
   the best one is chosen by the inter-domain routing protocol [BGP]
   according to:

   o  policy rules;

   o  shortest path, the path being the list of routing domains
      to cross;

   o  protocol metric.

   The aggregation idea is a bet that in most cases a single-homed
   Internet service provider at a given level needs to know (i.e. to
   have routes to) only:

   o  its upper provider (i.e. a shorter prefix, used as a default)
      if it is not a top-level provider;

   o  its customers (i.e. longer routes in its prefix);

   o  some routes to other customers of its upper provider (i.e.
      sibling prefixes, at the same level).

   With addresses this gives (with P1:P2/x for the concatenation of
   prefixes P1 and P2 with the length x):

   o  T/t for the upper provider;

   o  T:P/t+p for the provider itself;

   o  T:P1/t+p1, T:P2/t+p2, ..., T:Pn/t+pn for siblings;

   o  T:P:C1/t+p+c1, T:P:C2/t+p+c2, ..., T:P:Cn/t+p+cn for customers.

   The routing information for siblings is needed only by top-level
   providers.  For any other provider it is only an optimization
   (i.e. a backdoor) because any destination not in its own prefix,
   siblings included, is reachable through the upper provider.

   The usual routing exchanges for P at prefix T:P/t+p are:

   o  from the upper provider, the route to T/t, which can be used as
      a default (i.e. <>/0);

   o  from a customer, the route to T:P:C/t+p+c;

   o  from a sibling, the route to T:Q/t+q;

   o  to anybody, the route for T:P/t+p (and nothing else).
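
   The prefix concatenation and the "longest prefix match" rule
   described above can be illustrated with a minimal sketch.  The
   documentation prefix 2001:db8::/32, the bit lengths (p = c = 8),
   the next-hop labels and the helper names concat() and
   longest_match() are assumptions made for the example; they come
   neither from [AGGR] nor from any router implementation.

```python
import ipaddress


def concat(parent, extra_bits, value):
    """Return parent:value, i.e. P1:P2/x with x = len(P1) + extra_bits."""
    if not 0 <= value < (1 << extra_bits):
        raise ValueError("value does not fit in extra_bits")
    shift = 128 - parent.prefixlen - extra_bits
    base = int(parent.network_address) | (value << shift)
    return ipaddress.IPv6Network((base, parent.prefixlen + extra_bits))


def longest_match(table, destination):
    """Return the next hop of the longest prefix containing destination."""
    dest = ipaddress.IPv6Address(destination)
    best = None
    for prefix, next_hop in table.items():
        if dest in prefix and (best is None or prefix.prefixlen > best[0].prefixlen):
            best = (prefix, next_hop)
    return best[1] if best else None


# Hypothetical values: T/t = 2001:db8::/32, p = 8, c = 8.
T = ipaddress.IPv6Network("2001:db8::/32")
P = concat(T, 8, 0x0A)    # T:P/t+p, the provider itself
Q = concat(T, 8, 0x0B)    # T:Q/t+q, a sibling
C1 = concat(P, 8, 0x01)   # T:P:C1/t+p+c1, a customer
C2 = concat(P, 8, 0x02)   # T:P:C2/t+p+c2, another customer

# What provider P knows: its upper provider, one sibling, its customers.
table = {T: "upper provider", Q: "sibling Q",
         C1: "customer C1", C2: "customer C2"}

print(P, C1)                                       # 2001:db8:a00::/40 2001:db8:a01::/48
print(longest_match(table, "2001:db8:a01::1"))     # customer C1
print(longest_match(table, "2001:db8:b42::1"))     # sibling Q
print(longest_match(table, "2001:db8:ff00::1"))    # upper provider
```

   The lookup never inspects the T, P or C fields themselves: only
   the prefix lengths decide, which is what "without any knowledge of
   the internal structure of addresses" means.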

   The scheme, with arrows for route (and traffic) exchange, is:

                        +-----+
   Upper Level          |  T  |
                        +-----+
                          | ^
                      T/t | | T:P/t+p
                          V |
                       +-------+                   +-----+
                       |       |------ T:P/t+p --->|     |
   Siblings            |   P   |                   |  Q  |
                       |       |<---- T:Q/t+q -----|     |
                       +-------+                   +-----+
                        ^ | ^ |
                        | | | |
                        | | | +-------- T:P/t+p ----+
                        | | |                       |
                        | | +---- T:P:Cn/t+p+cn --+ |
                        | |                       | |
          T:P:C1/t+p+c1 | |                       | |
                        | | T:P/t+p               | |
                        | V                       | V
                      +-----+                   +-----+
                      |     |                   |     |
   Customers          | C 1 |                   | C n |
                      |     |                   |     |
                      +-----+                   +-----+

   Aggregation shows in the fact that each provider announces only the
   route to its own "aggregated" prefix and masks routes to longer
   prefixes.  Upper levels should not know the details of lower
   levels; this transparency property should be kept.

   A top-level provider has no upper provider (i.e. no default) and
   must exchange routes with all the other top-level providers (i.e.
   full routing with its siblings is mandatory).  In order to avoid
   routing table explosion, the length of top-level prefixes is
   bounded (therefore the number of top-level providers is bounded
   too).

2. Multihomed Routing Domains

   A multihomed routing domain has more than one provider, so it has
   more than one prefix (usually one prefix per provider).

   There are several reasons to be multihomed:

   o  the "two coasts" case, where the routing domain is split into
      sub-domains in different locations, each sub-domain using a
      local provider:

              +-----+                +-----+
              |     |                |     |
              | T w |                | T e |
              |     |                |     |
              +-----+                +-----+
                ^ |                    ^ |
                | |                    | |
      +---------|-|--------------------|-|--------+
      |  S      | V                    | V        |
      |       +-----+                +-----+      |
      |       |     |--------------->|     |      |
      |       | S w |                | S e |      |
      |       |     |<---------------|     |      |
      |       +-----+                +-----+      |
      |                                           |
      +-------------------------------------------+

      But in fact this comes down to two routing domains with a
      backdoor between them.
      The extra routes can be hidden and there is no further issue.

   o  reliable service: to be able to use another provider in case of
      a connectivity problem.  Of course the purpose is to confine
      trouble to the case where all the providers fail (and NOT when
      at least one fails!).

         +-----+                +-----+
         |     |                |     |
         | T 1 |                | T 2 |
         |     |                |     |
         +-----+                +-----+
           ^ |                    ^ |
           | |                    | |
           | +--------+  +--------+ |
           |          |  |          |
           +--------+ |  | +--------+
                    | |  | |
                    | V  | V
                   +--------+
                   |        |
                   |   S    |
                   |        |
                   +--------+

      A given host of such a routing domain may (and should, if
      reliable connectivity is needed) have two different addresses,
      one for each prefix (T1:S1:H in T1:S1/t1+s1 and T2:S2:H in
      T2:S2/t2+s2).

      This document mainly covers this case.

3. The Transparency Issue

   If a domain prefix is announced at an upper level, it has to be
   announced to this whole level.

            ^ A/x                  ^ B/x and A:S/x+y
            |                      |
         +-----+                +-----+
         |     |                |     |
         |  A  |                |  B  |
         |     |                |     |
         +-----+                +-----+
           ^ |                    ^ |
           | |                    | |
           | +--------+  +--------+ |
           |          |  |          |
           +--------+ |  | +--------+
                    | |  | |
                    | V  | V
                   +--------+
                   |        |
                   |   S    |
                   |        |
                   +--------+

   If the provider B tries to announce the prefix A:S/x+y in order to
   be able to route the traffic for S with both prefixes A:S/x+y and
   B:S/x+y, then B will attract all the traffic for S, because the
   prefix A:S/x+y is longer than the prefix A/x (x+y > x) and so is a
   better match...

   In this case the only solution is that both A and B announce routes
   to the prefixes A:S/x+y and B:S/x+y, which breaks the transparency
   property and obviously does not scale.

   The [MULT] document proposes that B announce the prefix A:S/x+y
   only when the path through A (and therefore the announcement by A)
   is not available.
   This makes the transparency problem less severe, but a route for a
   long prefix is liable to be caught by filtering or flap damping
   mechanisms and should be avoided.

   A second solution proposed by [MULT] is to use tunnels in order to
   keep connectivity even when a path is not available:

                 ttttttttttttttttttttt
                 t                   t
         +-----+ t                +-----+
         |     | t                |     |
         |  A  | t                |  B  |
         |     | t                |     |
         +-----+ t                +-----+
           ^ |   tttttttt     X     ^ |
           | |          t      X    | |
           | +--------+ t  +----X---+ |
           |          | t  |     X    |
           +--------+ | t  | +----X---+
                    | | t  | |     X
                    | V t  | V      X
                   +----------+
                   |    S     |
                   +----------+

   This uses a hairy configuration of EBGP and is limited by the
   tunnel technology.  We shall try to explore other kinds of
   solutions.

4. Upper Level Routing

   At upper levels the structure looks like:

                   +--------+
                   |  NLAx  |
                   +--------+
                     |     |
                    /       \
                   /         \
                  /           \
            +-------+       +-------+
            | NLAy1 |       | NLAy2 |
            +-------+       +-------+
                |               |
                .               .
                .               .
                .               .
                |               |
            +-------+       +-------+
            | NLAz1 |       | NLAz2 |
            +-------+       +-------+
                  \           /
                   \         /
                    \       /
                     |     |
                   +--------+
                   |   S    |
                   +--------+

   For optimal routing, S should have routes for every NLAi1 and
   NLAi2 up to NLAx, the first common upper provider.  For
   destinations outside the diagram either provider (NLAz1 or NLAz2)
   can be used; the choice of provider is usually governed by internal
   policy rules.

   The source address selection for S nodes should be coherent with
   the upper level routing and with the policy in order to avoid
   asymmetrical routing.  There is no current proposal for source
   address selection (nor for the dual problem, destination address
   selection) but a selection service should:

   o  be synchronized with (external) routing, i.e.
      there should be an interaction between border routers and the
      service;

   o  be used by applications, which may have more information;

   o  be used at the same time as DNS resolution, which makes
      destination address selection easy to integrate in the service,
      i.e. the list of addresses returned by the resolver can be
      converted into a partially ordered list of source / destination
      address pairs.

5. Mutual Backup

   There is a case where the transparency property is kept, and where
   routing is as reliable as possible and optimal in almost all cases.

            ^ A/x and B/x          ^ B/x and A/x
            |                      |
         +-----+                +-----+
         |     |------ A/x ---->|     |
         |  A  |                |  B  |
         |     |<------ B/x ----|     |
         +-----+                +-----+
           ^ |            B:S/x+y ^ |
           | |            A:S/x+y | |
           | +-- A/x -+  +--------+ |
           |          |  |          |
           +--------+ |  | +- B/x --+
            A:S/x+y | |  | |
            B:S/x+y | V  | V
                   +--------+
                   |        |
                   |   S    |
                   |        |
                   +--------+

   For a provider T at an upper level, or at the same level as
   providers A and B, the routes for the prefix A/x are not
   equivalent: the prefix A/x announced by A is direct (one element
   (A) in the path) and the prefix A/x announced by B is indirect (two
   elements (B and A) in the path).  Traffic for A will therefore go
   to A directly.  The same applies to B.

   The prefix A:S/x+y is longer (i.e. better) than the prefix A/x, so
   at A all the traffic for S goes directly to S; the same holds at B.

   If the path through A is not available then all the traffic for S,
   including the traffic to or from addresses in the prefix A:S/x+y,
   will go through B.

   This case supposes a mutual backup agreement between A and B, which
   is possible if A and B are not in competition, for instance if A is
   a mission provider and B a geographical one.  But it is a real
   constraint...

   This still works if the announcements between A and B do not carry
   the full prefixes (but they should include (i.e. be shorter than)
   the prefix *:S/x+y).
   The backup will then work only for a part of A and B (with a black
   hole in case of failure for customers not covered by the backup
   agreement).  Unfortunately this does not work in more complex
   cases:

      ^ A/x and B/x           ^ B/x, A/x and C/x       ^ C/x and B/x
      |                       |                        |
   +-----+                +--------+                +-----+
   |     |--- A:S/x+y --->|        |--- B:R/x+y --->|     |
   |  A  |                |   B    |                |  C  |
   |     |<--- B:S/x+y ---|        |<--- C:R/x+y ---|     |
   +-----+                +--------+                +-----+
     ^ |            B:S/x+y ^ | ^ |            C:R/x+y ^ |
     | |            A:S/x+y | | | |            B:R/x+y | |
     | +-- A/x -+  +--------+ | | +-- B/x -+  +--------+ |
     |          |  |          | |          |  |          |
     +--------+ |  | +- B/x --+ +--------+ |  | +- C/x --+
      A:S/x+y | |  | |           B:R/x+y | |  | |
      B:S/x+y | V  | V           C:R/x+y | V  | V
             +--------+                 +--------+
             |        |                 |        |
             |   S    |                 |   R    |
             |        |                 |        |
             +--------+                 +--------+

   The backup is not transitive in this case: if something goes wrong
   on the B path for S, the traffic can try to cross C, which knows
   nothing about S and will drop the packets...

6. Broken Bit

   Consider the standard multihomed case when a link is broken:

         +-----+                +-----+
         |     |                |     |
         |  A  |                |  B  |
         |     |                |     |
         +-----+                +-----+
           ^ |             X      ^ |
           | |              X     | |
           | +--------+  +---X----+ |
           |          |  |    X     |
           +--------+ |  | +---X----+
                    | |  | |    X
                    | V  | V     X
                   +--------+
                   |        |
                   |   S    |
                   |        |
                   +--------+

   If we look inside the routing domain S:

             +-----+                     +-----+
             |     |                     |     |
             |  A  |                     |  B  |
             |     |                     |     |
             +-----+                     +-----+
         +---+ ^ |                  X      ^ |
         | X | | |                   X     | |
         +---+ | +--------+       +---X----+ |
               |          |       |    X     |
               +--------+ |       | +---X----+
                        | |       | |    X
                        | V       | V     X
                      +-----+   +-----+
                  +---| BRA |---| BRB |---+
                  |   +-----+   +-----+   |
                  |      |         |      |
                  |  -------------------  |
                  |           |           |
                  |         +---+         |
                  |         | R |         |
                  |         +---+         |
                  |           |           |
                  |        -------        |
                  |           |           |
                  |         +---+         |
                  |         | H |         |
                  | S       +---+         |
                  |                       |
                  +-----------------------+

   The host H has two addresses, A:S:H and B:S:H, and the path through
   B is broken.
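
   As a minimal sketch of what "two addresses, A:S:H and B:S:H" means
   in practice, the following builds one address per provider prefix
   for the same site and host.  The prefixes 2001:db8:a::/48 and
   2001:db8:b::/48, the 16-bit site field, the identifiers 0x51 and
   0x1 and the helper names are all assumptions chosen for
   illustration.

```python
import ipaddress

# Hypothetical provider prefixes A/x and B/x (x = 48).
PROVIDERS = {
    "A": ipaddress.IPv6Network("2001:db8:a::/48"),
    "B": ipaddress.IPv6Network("2001:db8:b::/48"),
}


def host_address(provider, site_bits, site_id, interface_id):
    """Build provider:site:host (e.g. A:S:H) as an IPv6 address."""
    host_bits = 128 - provider.prefixlen - site_bits
    if host_bits < 0 or not 0 <= site_id < (1 << site_bits):
        raise ValueError("site part does not fit")
    if not 0 <= interface_id < (1 << host_bits):
        raise ValueError("interface identifier does not fit")
    value = (int(provider.network_address)
             | (site_id << host_bits)
             | interface_id)
    return ipaddress.IPv6Address(value)


def provider_of(address):
    """Name the provider whose prefix contains address, if any."""
    for name, prefix in PROVIDERS.items():
        if address in prefix:
            return name
    return None


# Host H of site S gets one address in each provider prefix (y = 16).
h = {name: host_address(prefix, 16, 0x51, 0x1)
     for name, prefix in PROVIDERS.items()}

print(h["A"], h["B"])        # 2001:db8:a:51::1 2001:db8:b:51::1
print(provider_of(h["B"]))   # B
```

   Because routing is on the destination prefix only, traffic sent to
   the second address must come in through B, which is why that
   address stops working when the path through B is broken.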

   An external host X will use A:S:H because B:S:H does not work.  The
   DNS will return both addresses, so applications should try all of
   them (on 4.4BSD-derived Unixes we have found only one standard
   application that tries only the first returned address).  One
   could try to play on the address order in the DNS, but the DNS
   caching mechanism makes this difficult (and it is not necessary).
   In conclusion, new connections from X to H will work.

   For new connections from H to X the problem is to force H to choose
   the right source address (A:S:H).  The proposal is to add a "broken
   bit" to the prefix information in router advertisements in order to
   inform nodes that addresses in a given prefix should not be used.
   The border router BRB knows there is a problem and should send this
   information to all the routers of S, using for instance the router
   renumbering protocol.

   The last case, existing connections (i.e. established before the
   failure) between H (using B:S:H) and X, is dealt with in the next
   section.

7. Use Of Mobility Mechanisms

   The idea is to use some mechanisms of IPv6 mobility [MOB] (home
   address and binding update, but neither a home agent nor, in fact,
   true mobility) in order to make critical connections resilient to
   provider failures.
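
   Both the source address choice of Section 6 and the mechanism
   described in this section depend on H learning, per prefix, that
   the broken bit is set.  The sketch below shows one way a host
   might filter its candidate source addresses.  This document does
   not define an encoding for the broken bit, so the PrefixInfo class,
   its "broken" field and the usable_sources() helper are purely
   hypothetical, as are the example prefixes.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class PrefixInfo:
    """Prefix information as a host might remember it (hypothetical)."""
    prefix: ipaddress.IPv6Network
    broken: bool = False   # stands in for the proposed "broken bit"


def usable_sources(addresses, advertised):
    """Keep only the addresses that are not inside a broken prefix."""
    broken = [info.prefix for info in advertised if info.broken]
    return [addr for addr in addresses
            if not any(addr in prefix for prefix in broken)]


# H has one address per provider prefix (A:S:H and B:S:H).
a_s_h = ipaddress.IPv6Address("2001:db8:a:51::1")
b_s_h = ipaddress.IPv6Address("2001:db8:b:51::1")

# The path through B has failed, so the B:S prefix carries the bit.
advertised = [
    PrefixInfo(ipaddress.IPv6Network("2001:db8:a:51::/64")),
    PrefixInfo(ipaddress.IPv6Network("2001:db8:b:51::/64"), broken=True),
]

print(usable_sources([a_s_h, b_s_h], advertised))
```

   With the bit set on the B:S prefix only A:S:H remains, so new
   connections from H pick a source address that X can still answer.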

                               +---+
            aaaaaaaaaaaaaaaaaaa| X |
            a                  +---+
            a                     b
            a                     b
         +-----+                  b +-----+
         |     |                  b |     |
         |  A  |                  bb|  B  |
         |     |                    |     |
         +-----+                    +-----+
           ^ |                 X      ^ |
           | |                  X     | |
           | +--------+      +---X----+ |
           |          |      |    X     |
           +--------+ |      | +---X----+
                    | |      | |    X
                    | V      | V     X
                  +--------------+
                  |   a      b   |
                  |   a      b   |
                  |   a    +---+ |
                  |   aaaaa| H | |
                  |        +---+ |
                  |      S       |
                  +--------------+

   There is a connection between H and X (using addresses B:S:H and X)
   with a security association for authentication (necessary for
   mobility, and not a real constraint for a critical connection
   because it is easy to disrupt an unauthenticated connection, for
   instance with junk TCP RST packets).

   After the path in use (through B) fails, the broken bit is set in
   the prefix information for B:S in router advertisements, so H is
   informed of the problem.

   H puts a home address destination option carrying B:S:H in each
   packet for X so that it can use A:S:H as the source address: for
   every router the source is in A's prefix, and only X replaces the
   source address with B:S:H before looking up the PCB (protocol
   control block) of the connection.

   H sends a binding update with A:S:H as the care-of address to X in
   a packet with an Authentication Header.  X receives and processes
   it, sends a binding acknowledgement, and from then on uses a
   routing header with A:S:H as the (first) destination and B:S:H as
   the final destination.

   Summary:

   o  packets from H to X:
         source = A:S:H
         destination = X
         home-address = B:S:H
         binding-update (in first packets, should be acknowledged):
            care-of = A:S:H

   o  packets from X to H:
         source = X
         destination = A:S:H
         routing-header: one address = B:S:H

   While X must implement the full mobile correspondent node
   operation, H must implement only the binding management (no
   movement detection, no new care-of address acquisition, no
   operation with a home agent).
   In fact H does not move; it only changes its address choice.

8. Security Considerations

   Better reliability of Internet connectivity can only improve
   security.  Critical connections should be authenticated, and
   binding updates must be carried in authenticated packets (see [MOB]
   for the discussion).  IPSEC is mandatory for compliant IPv6
   implementations.

9. Acknowledgements

   All these ideas were discussed or found at the 40th IETF meeting in
   Washington during lunch-time 6bone BOFs.  The transparency issue
   was well known (and was presented by Ben Crosby).  The mutual
   backup scheme was built by the author for a regional/organization
   dual-homing at a G6 meeting.  The non-transitive issue was
   presented by Alain Durand.  The repurposing of mobility mechanisms
   emerged from a discussion between the author and Matt Crawford, who
   proposed the broken bit.  The author would like to acknowledge the
   input of the G6, the 6bone and the RIPE communities.

10. References

   [AGGR] Hinden, R., O'Dell, M. and Deering, S., "An IPv6
          Aggregatable Global Unicast Address Format", Internet
          Draft, January 1998.

   [BGP]  Rekhter, Y. and Li, T., "A Border Gateway Protocol 4
          (BGP-4)", RFC 1771, cisco Systems, March 1995.

   [MULT] Bates, T. and Rekhter, Y., "Scalable Support for Multi-homed
          Multi-provider Connectivity", RFC 2260, cisco Systems,
          January 1998.

   [MOB]  Johnson, D. B. and Perkins, C., "Mobility Support in IPv6",
          Internet Draft, November 1997.

11. Author's Address

   Francis Dupont
   GIE DYADE
   INRIA Rocquencourt
   Domaine de Voluceau
   B.P. 105
   78153 Le Chesnay CEDEX
   FRANCE

   Fax: +33 1 39 63 55 66
   EMail: Francis.Dupont@inria.fr

Expires in 6 months (August 8, 1998)