idnits 2.17.1

draft-ietf-rtgwg-enterprise-pa-multihoming-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 3, 2019) is 1752 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 6106 (Obsoleted by RFC 8106)

  == Outdated reference: A later version (-11) exists of
     draft-ietf-intarea-provisioning-domains-05

  -- Obsolete informational reference (is this intentional?): RFC 4941
     (Obsoleted by RFC 8981)

  -- Obsolete informational reference (is this intentional?): RFC 6434
     (Obsoleted by RFC 8504)

  -- Obsolete informational reference (is this intentional?): RFC 6824
     (Obsoleted by RFC 8684)

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Routing Working Group                                           F. Baker
Internet-Draft
Intended status: Informational                                 C. Bowers
Expires: January 4, 2020                                Juniper Networks
                                                              J. Linkova
                                                                  Google
                                                            July 3, 2019

 Enterprise Multihoming using Provider-Assigned IPv6 Addresses without
         Network Prefix Translation: Requirements and Solutions
             draft-ietf-rtgwg-enterprise-pa-multihoming-10

Abstract

   Connecting an enterprise site to multiple ISPs over IPv6 using
   provider-assigned addresses is difficult without the use of some form
   of Network Address Translation (NAT). Much has been written on this
   topic over the last 10 to 15 years, but it still remains a problem
   without a clearly defined or widely implemented solution. Any
   multihoming solution without NAT requires hosts at the site to have
   addresses from each ISP and to select the egress ISP by selecting a
   source address for outgoing packets. It also requires routers at the
   site to take into account those source addresses when forwarding
   packets out towards the ISPs.

   This document examines currently available mechanisms for providing a
   solution to this problem for a broad range of enterprise topologies.
   It covers the behavior of routers to forward traffic taking into
   account source address, and it covers the behavior of hosts to select
   appropriate default source addresses. It also covers any possible
   role that routers might play in providing information to hosts to
   help them select appropriate source addresses. In the process of
   exploring potential solutions, this document also makes explicit
   requirements for how the solution would be expected to behave from
   the perspective of an enterprise site network administrator.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 4, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements Language
   3.  Terminology
   4.  Enterprise Multihoming Use Cases
     4.1.  Simple ISP Connectivity with Connected SERs
     4.2.  Simple ISP Connectivity Where SERs Are Not Directly
           Connected
     4.3.  Enterprise Network Operator Expectations
     4.4.  More complex ISP connectivity
     4.5.  ISPs and Provider-Assigned Prefixes
     4.6.  Simplified Topologies
   5.  Generating Source-Prefix-Scoped Forwarding Tables
   6.  Mechanisms For Hosts To Choose Good Source Addresses In A
       Multihomed Site
     6.1.  Source Address Selection Algorithm on Hosts
     6.2.  Selecting Source Address When Both Uplinks Are Working
       6.2.1.  Distributing Address Selection Policy Table with
               DHCPv6
       6.2.2.  Controlling Source Address Selection With Router
               Advertisements
       6.2.3.  Controlling Source Address Selection With ICMPv6
       6.2.4.  Summary of Methods For Controlling Source Address
               Selection To Implement Routing Policy
     6.3.  Selecting Source Address When One Uplink Has Failed
       6.3.1.  Controlling Source Address Selection With DHCPv6
       6.3.2.  Controlling Source Address Selection With Router
               Advertisements
       6.3.3.  Controlling Source Address Selection With ICMPv6
       6.3.4.  Summary Of Methods For Controlling Source Address
               Selection On The Failure Of An Uplink
     6.4.  Selecting Source Address Upon Failed Uplink Recovery
       6.4.1.  Controlling Source Address Selection With DHCPv6
       6.4.2.  Controlling Source Address Selection With Router
               Advertisements
       6.4.3.  Controlling Source Address Selection With ICMP
       6.4.4.  Summary Of Methods For Controlling Source Address
               Selection Upon Failed Uplink Recovery
     6.5.  Selecting Source Address When All Uplinks Failed
       6.5.1.  Controlling Source Address Selection With DHCPv6
       6.5.2.  Controlling Source Address Selection With Router
               Advertisements
       6.5.3.  Controlling Source Address Selection With ICMPv6
       6.5.4.  Summary Of Methods For Controlling Source Address
               Selection When All Uplinks Failed
     6.6.  Summary Of Methods For Controlling Source Address
           Selection
     6.7.  Solution Limitations
       6.7.1.  Connections Preservation
     6.8.  Other Configuration Parameters
       6.8.1.  DNS Configuration
   7.  Deployment Considerations
     7.1.  Deploying SADR Domain
     7.2.  Hosts-Related Considerations
   8.  Other Solutions
     8.1.  Shim6
     8.2.  IPv6-to-IPv6 Network Prefix Translation
     8.3.  Multipath Transport
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1. Introduction

   Site multihoming, the connection of a subscriber network to multiple
   upstream networks using redundant uplinks, is a common enterprise
   architecture for improving the reliability of its Internet
   connectivity. If the site uses provider-independent (PI) addresses,
   all traffic originating from the enterprise can use source addresses
   from the PI address space. Site multihoming with PI addresses is
   commonly used with both IPv4 and IPv6, and does not present any new
   technical challenges.

   It may be desirable for an enterprise site to connect to multiple
   ISPs using provider-assigned (PA) addresses, instead of PI addresses.
   Multihoming with provider-assigned addresses is typically less
   expensive for the enterprise than using provider-independent
   addresses, because it requires neither obtaining and maintaining PI
   address space nor running BGP between the enterprise and the ISPs
   (for small/medium networks, running BGP might be not just undesirable
   but impossible, especially if residential-type ISP connections are
   used). PA multihoming is also a practice that should be facilitated
   and encouraged because it does not add to the size of the Internet
   routing table, whereas PI multihoming does. Note that PA is also
   used to mean "provider-aggregatable". In this document we assume
   that provider-assigned addresses are always provider-aggregatable.

   With PA multihoming, for each ISP connection, the site is assigned a
   prefix from within an address block allocated to that ISP by its
   National or Regional Internet Registry. In the simple case of two
   ISPs (ISP-A and ISP-B), the site will have two different prefixes
   assigned to it (prefix-A and prefix-B). This arrangement is
   problematic. First, packets with the "wrong" source address may be
   dropped by one of the ISPs. In order to limit denial of service
   attacks using spoofed source addresses, BCP 38 [RFC2827] recommends
   that ISPs filter traffic from customer sites to only allow traffic
   with a source address that has been assigned by that ISP. So a
   packet sent from a multihomed site on the uplink to ISP-B with a
   source address in prefix-A may be dropped by ISP-B.

   However, even if ISP-B does not implement BCP 38 or ISP-B adds
   prefix-A to its list of allowed source addresses on the uplink from
   the multihomed site, two-way communication may still fail.
   If the packet with source address in prefix-A was sent to ISP-B
   because the uplink to ISP-A failed, then if ISP-B does not drop the
   packet and the packet reaches its destination somewhere on the
   Internet, the return packet will be sent back with a destination
   address in prefix-A. The return packet will be routed over the
   Internet to ISP-A, but it will not be delivered to the multihomed
   site because the site uplink with ISP-A has failed. Two-way
   communication would require some arrangement for ISP-B to advertise
   prefix-A when the uplink to ISP-A fails.

   Note that the same may be true with a provider that does not
   implement BCP 38, if its upstream provider does, or has no
   corresponding route to deliver the ingress traffic to the multihomed
   site. The issue is not that the immediate provider implements
   ingress filtering; it is that someone upstream does (so egress
   traffic is blocked), or lacks a route (causing blackholing of the
   ingress traffic).

   Another issue with asymmetric traffic flow (when the egress traffic
   leaves the site via one ISP but the return traffic enters the site
   via another uplink) is related to stateful firewalls/middleboxes.
   Keeping state in that case might be problematic, even impossible.

   With IPv4, this problem is commonly solved by using [RFC1918] private
   address space within the multi-homed site and Network Address
   Translation (NAT) or Network Address/Port Translation (NAPT) on the
   uplinks to the ISPs. However, one of the goals of IPv6 is to
   eliminate the need for and the use of NAT or NAPT. Therefore,
   requiring the use of NAT or NAPT for an enterprise site to multihome
   with provider-assigned addresses is not an attractive solution.

   [RFC6296] describes a translation solution specifically tailored to
   meet the requirements of multi-homing with provider-assigned IPv6
   addresses.
   With the IPv6-to-IPv6 Network Prefix Translation (NPTv6) solution,
   within the site an enterprise can use Unique Local Addresses
   [RFC4193] or the prefix assigned by one of the ISPs. As
   traffic leaves the site on an uplink to an ISP, the source address
   gets translated to an address within the prefix assigned by the ISP
   on that uplink in a predictable and reversible manner. [RFC6296] is
   currently classified as Experimental, and it has been implemented by
   several vendors. See Section 8.2 for more discussion of NPTv6.

   This document defines routing requirements for enterprise
   multihoming. It focuses on the following general class of solutions.

   Each host at the enterprise has multiple addresses, at least one from
   each ISP-assigned prefix. Each host, as discussed in Section 6.1 and
   [RFC6724], is responsible for choosing the source address applied to
   each packet it sends. A host is expected to be able to respond
   dynamically to the failure of an uplink to a given ISP by no longer
   sending packets with the source address corresponding to that ISP.
   Potential mechanisms for the communication of changes in the network
   to the host are Neighbor Discovery Router Advertisements ([RFC4861]),
   DHCPv6 ([RFC8415]), and ICMPv6 ([RFC4443]).

   The routers in the enterprise network are responsible for ensuring
   that packets are delivered to the "correct" ISP uplink based on
   source address. This requires that at least some routers in the site
   network are able to take into account the source address of a packet
   when deciding how to route it. That is, some routers must be capable
   of some form of Source Address Dependent Routing (SADR), if only as
   described in Section 4.3 of [RFC3704]. At a minimum, the routers
   connected to the ISP uplinks (the site exit routers or SERs) must be
   capable of Source Address Dependent Routing.
   Expanding the connected domain of routers capable of SADR from the
   site exit routers deeper into the site network will generally result
   in more efficient routing of traffic with external destinations.

   This document is organized as follows. Section 4 looks in more
   detail at the enterprise networking environments in which this
   solution is expected to operate. The discussion of Section 4 uses
   the concepts of source-prefix-scoped routing advertisements and
   forwarding tables but does not provide a detailed description of how
   source-prefix-scoped routing advertisements are used to generate
   source-prefix-scoped forwarding tables. Instead, this detailed
   description is provided in Section 5. Section 6 discusses existing
   and proposed mechanisms for hosts to select the default source
   address applied to packets. It also discusses the requirements for
   routing that are needed to support these enterprise network scenarios
   and the mechanisms by which hosts are expected to select source
   addresses for new connections dynamically based on network state.
   Section 7 discusses deployment considerations, while Section 8
   discusses other solutions.

2. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3. Terminology

   PA (provider-assigned or provider-aggregatable) address space: a
   block of IP addresses assigned by a Regional Internet Registry (RIR)
   to a Local Internet Registry (LIR), used to create allocations to end
   sites. Can be aggregated and present in the routing table as one
   route.

   PI (provider-independent) address space: a block of IP addresses
   assigned by a Regional Internet Registry (RIR) directly to an end
   site or end customer.

   ISP: Internet Service Provider.

   LIR (Local Internet Registry): an organisation (usually an ISP or an
   enterprise/academic) which receives an IP address allocation from its
   Regional Internet Registry and then assigns parts of that allocation
   to its customers.

   RIR (Regional Internet Registry): an organization which manages the
   Internet number resources (such as IP addresses and AS numbers)
   within a geographical region of the world.

   SADR (Source Address Dependent Routing): Routing which takes into
   account the source address of a packet in addition to the packet
   destination address.

   SADR domain: a routing domain where some (or all) routers exchange
   source-dependent routing information.

   Source-Prefix-Scoped Routing/Forwarding Table: a routing (or
   forwarding) table which contains routing (or forwarding) information
   which is applicable to packets with source addresses from the
   specific prefix only.

   Unscoped Routing/Forwarding Table: a routing (or forwarding) table
   which can be used to route/forward packets with any source addresses.

   SER (Site Edge Router): a router which connects the site to an ISP
   (terminates an ISP uplink).

   LLA (Link-Local Address): IPv6 Unicast Address from fe80::/10 prefix
   ([RFC4291]).

   ULA (Unique Local IPv6 Unicast Address): IPv6 unicast addresses from
   FC00::/7 prefix. They are globally unique and intended for local
   communications ([RFC4193]).

   GUA (Global Unicast Address): globally routable IPv6 addresses of the
   global scope ([RFC4291]).

   SLAAC (IPv6 Stateless Address Autoconfiguration): a stateless process
   of configuring network stack on IPv6 hosts ([RFC4862]).

   RA (Router Advertisement): a message sent by an IPv6 router to
   advertise its presence to hosts together with various network-related
   parameters required for hosts to perform SLAAC ([RFC4861]).

   PIO (Prefix Information Option): a part of RA message containing
   information about IPv6 prefixes which could be used by hosts to
   generate global IPv6 addresses ([RFC4862]).

   RIO (Route Information Option): a part of RA message containing
   information about more specific IPv6 prefixes reachable via the
   advertising router ([RFC4191]).

4. Enterprise Multihoming Use Cases

4.1. Simple ISP Connectivity with Connected SERs

   We start by looking at a scenario in which a site has connections to
   two ISPs, as shown in Figure 1. The site is assigned the prefix
   2001:db8:0:a000::/52 by ISP-A and prefix 2001:db8:0:b000::/52 by
   ISP-B. We consider three hosts in the site. H31 and H32 are on a LAN
   that has been assigned subnets 2001:db8:0:a010::/64 and
   2001:db8:0:b010::/64. H31 has been assigned the addresses
   2001:db8:0:a010::31 and 2001:db8:0:b010::31. H32 has been assigned
   2001:db8:0:a010::32 and 2001:db8:0:b010::32. H41 is on a different
   subnet that has been assigned 2001:db8:0:a020::/64 and
   2001:db8:0:b020::/64.

                              2001:db8:0:1234::101    H101
                                                        |
                                                        |
   2001:db8:0:a010::31                              --------
   2001:db8:0:b010::31                 ,-----.     /        \
            +--+   +--+       +----+ ,'       `.  :          :
        +---|R1|---|R4|---+---|SERa|-+  ISP-A  +--+--        :
   H31--+   +--+   +--+   |   +----+ `.       ,'  :          :
        |                 |            `-----'    : Internet :
        |                 |                       :          :
        |                 |                       :          :
        |                 |                       :          :
        |                 |            ,-----.    :          :
   H32--+   +--+          |   +----+ ,'       `.  :          :
        +---|R2|----------+---|SERb|-+  ISP-B  +--+--        :
            +--+          |   +----+ `.       ,'  :          :
                          |            `-----'    :          :
                          |                       :          :
            +--+  +--+  +--+                       \        /
   H41------|R3|--|R5|--|R6|                        --------
            +--+  +--+  +--+

   2001:db8:0:a020::41
   2001:db8:0:b020::41

          Figure 1: Simple ISP Connectivity With Connected SERs

   We refer to a router that connects the site to an ISP as a site edge
   router (SER). Several other routers provide connectivity among the
   internal hosts (H31, H32, and H41), as well as connecting the
   internal hosts to the Internet through SERa and SERb. In this
   example SERa and SERb share a direct connection to each other. In
   Section 4.2, we consider a scenario where this is not the case.

   For the moment, we assume that the hosts are able to make good
   choices about which source addresses to use through some mechanism
   that doesn't involve the routers in the site network. Here, we focus
   on the primary task of the routed site network, which is to get
   packets efficiently to their destinations, while sending a packet to
   the ISP that assigned the prefix that matches the source address of
   the packet. In Section 6, we examine what role the routed network
   may play in helping hosts make good choices about source addresses
   for packets.

   With this solution, routers will need some form of Source Address
   Dependent Routing, which will be new functionality. It would be
   useful if an enterprise site does not need to upgrade all routers to
   support the new SADR functionality in order to support PA multi-
   homing. We consider whether this is possible and what the tradeoffs
   are of not having all routers in the site support SADR functionality.

   In the topology in Figure 1, it is possible to support PA multihoming
   with only SERa and SERb being capable of SADR. The other routers can
   continue to forward based only on destination address, and exchange
   routes that only consider destination address.
   In this scenario, SERa and SERb communicate source-scoped routing
   information across their shared connection. When SERa receives a
   packet with a source address matching prefix 2001:db8:0:b000::/52, it
   forwards the packet to SERb, which forwards it on the uplink to
   ISP-B. The analogous behaviour holds for traffic that SERb receives
   with a source address matching prefix 2001:db8:0:a000::/52.

   In Figure 1, when only SERa and SERb are capable of source address
   dependent routing, PA multi-homing will work. However, the paths
   over which the packets are sent will generally not be the shortest
   paths. The forwarding paths will generally be more efficient as more
   routers are capable of SADR. For example, if R4, R2, and R6 are
   upgraded to support SADR, then they can exchange source-scoped routes
   with SERa and SERb. They will then know to send traffic with a
   source address matching prefix 2001:db8:0:b000::/52 directly to SERb,
   without sending it to SERa first.

4.2. Simple ISP Connectivity Where SERs Are Not Directly Connected

   In Figure 2, we modify the topology slightly by inserting R7, so that
   SERa and SERb are no longer directly connected. With this topology,
   it is not enough to just enable SADR routing on SERa and SERb to
   support PA multi-homing. There are two solutions to enable PA
   multihoming in this topology.

                              2001:db8:0:1234::101    H101
                                                        |
                                                        |
   2001:db8:0:a010::31                              --------
   2001:db8:0:b010::31                 ,-----.     /        \
            +--+   +--+       +----+ ,'       `.  :          :
        +---|R1|---|R4|---+---|SERa|-+  ISP-A  +--+--        :
   H31--+   +--+   +--+   |   +----+ `.       ,'  :          :
        |                 |            `-----'    : Internet :
        |               +--+                      :          :
        |               |R7|                      :          :
        |               +--+                      :          :
        |                 |            ,-----.    :          :
   H32--+   +--+          |   +----+ ,'       `.  :          :
        +---|R2|----------+---|SERb|-+  ISP-B  +--+--        :
            +--+          |   +----+ `.       ,'  :          :
                          |            `-----'    :          :
                          |                       :          :
            +--+  +--+  +--+                       \        /
   H41------|R3|--|R5|--|R6|                        --------
            +--+  +--+  +--+                            |
                                                        |
   2001:db8:0:a020::41          2001:db8:0:5678::501  H501
   2001:db8:0:b020::41

      Figure 2: Simple ISP Connectivity Where SERs Are Not Directly
                                Connected

   One option is to effectively modify the topology by creating a
   logical tunnel between SERa and SERb, using GRE ([RFC7676]) for
   example. Although SERa and SERb are not directly connected
   physically in this topology, they can be directly connected logically
   by a tunnel.

   The other option is to enable SADR functionality on R7. In this way,
   R7 will exchange source-scoped routes with SERa and SERb, making the
   three routers act as a single SADR domain. This illustrates the
   basic principle that the minimum requirement for the routed site
   network to support PA multi-homing is having all of the site exit
   routers be part of a connected SADR domain. Extending the connected
   SADR domain beyond that point can produce more efficient forwarding
   paths.

4.3. Enterprise Network Operator Expectations

   Before considering a more complex scenario, let's look in more detail
   at the reasonably simple multihoming scenario in Figure 2 to
   understand what can reasonably be expected from this solution. As a
   general guiding principle, we assume an enterprise network operator
   will expect a multihomed network to behave as closely as possible to
   a single-homed network. So a solution that meets those
   expectations where possible is a good thing.

   For traffic between internal hosts and traffic from outside the site
   to internal hosts, an enterprise network operator would expect there
   to be no visible change in the path taken by this traffic, since this
   traffic does not need to be routed in a way that depends on source
   address.
   It is also reasonable to expect that internal hosts should
   be able to communicate with each other using either of their source
   addresses without restriction. For example, H31 should be able to
   communicate with H41 using a packet with S=2001:db8:0:a010::31,
   D=2001:db8:0:b020::41, regardless of the state of uplink to ISP-B.

   These goals can be accomplished by having all of the routers in the
   network continue to originate normal unscoped destination routes for
   their connected networks. If we can arrange for these unscoped
   destination routes to be used for forwarding this traffic, then we
   will have accomplished the goal of keeping forwarding of traffic
   destined for internal hosts unaffected by the multihoming solution.

   For traffic destined for external hosts, it is reasonable to expect
   traffic with a source address from the prefix assigned by ISP-A to
   follow the path that the traffic would follow if there were no
   connection to ISP-B. This can be accomplished by having SERa
   originate a source-scoped route of the form (S=2001:db8:0:a000::/52,
   D=::/0). If all of the routers in the site support SADR, then the
   path of traffic exiting via ISP-A can match that expectation. If
   some routers don't support SADR, then it is reasonable to expect that
   the path for traffic exiting via ISP-A may be different within the
   site. This is a tradeoff that the enterprise network operator may
   decide to make.

   It is important to understand how this multihoming solution behaves
   when an uplink to one of the ISPs fails. To simplify this
   discussion, we assume that all routers in the site support SADR. We
   start by looking at how the network operates when the uplinks to
   both ISP-A and ISP-B are functioning properly.
   SERa originates a source-scoped route of the form
   (S=2001:db8:0:a000::/52, D=::/0), and SERb originates a source-scoped
   route of the form (S=2001:db8:0:b000::/52, D=::/0). These routes are
   distributed through the routers in the site, and they establish
   within the routers two sets of forwarding paths for traffic leaving
   the site. One set of forwarding paths is for packets with source
   address in 2001:db8:0:a000::/52. The other set of forwarding paths
   is for packets with source address in 2001:db8:0:b000::/52. The
   normal destination routes which are not scoped to these two source
   prefixes play no role in the forwarding. Whether a packet exits the
   site via SERa or via SERb is completely determined by the source
   address applied to the packet by the host. So for example, when host
   H31 sends a packet to host H101 with (S=2001:db8:0:a010::31,
   D=2001:db8:0:1234::101), the packet will only be sent out the link
   from SERa to ISP-A.

   Now consider what happens when the uplink from SERa to ISP-A fails.
   The only way for the packets from H31 to reach H101 is for H31 to
   start using the source address for ISP-B. H31 needs to send the
   following packet: (S=2001:db8:0:b010::31, D=2001:db8:0:1234::101).

   This behavior is very different from the behavior that occurs with
   site multihoming using PI addresses or with PA addresses using NAT.
   In these other multi-homing solutions, hosts do not need to react to
   network failures several hops away in order to regain Internet
   access. Instead, a host can be largely unaware of the failure of an
   uplink to an ISP. When multihoming with PA addresses and NAT,
   existing sessions generally need to be re-established after a failure
   since the external host will receive packets from the internal host
   with a new source address. However, new sessions can be established
   without any action on the part of the hosts.
   Multihoming with PA addresses and NAT has created the expectation of
   a fairly quick and simple recovery from network failures.
   Alternatives should be evaluated in terms of the speed and complexity
   of the recovery mechanism.

   Another example where the behavior of this multihoming solution
   differs significantly from that of multihoming with PI addresses or
   with PA addresses using NAT is in the ability of the enterprise
   network operator to route traffic over different ISPs based on
   destination address. We still consider the fairly simple network of
   Figure 2 and assume that uplinks to both ISPs are functioning.
   Assume that the site is multihomed using PA addresses and NAT, and
   that SERa and SERb each originate a normal destination route for
   D=::/0, with the route origination dependent on the state of the
   uplink to the respective ISP.

   Now suppose it is observed that an important application running
   between internal hosts and external host H101 experiences much better
   performance when the traffic passes through ISP-A (perhaps because
   ISP-A provides lower latency to H101). When multihoming this site
   with PI addresses or with PA addresses and NAT, the enterprise
   network operator can configure SERa to originate into the site
   network a normal destination route for D=2001:db8:0:1234::/64 (the
   destination prefix to reach H101) that depends on the state of the
   uplink to ISP-A. When the link to ISP-A is functioning, the
   destination route D=2001:db8:0:1234::/64 will be originated by SERa,
   so traffic from all hosts will use ISP-A to reach H101 based on the
   longest destination prefix match in the route lookup.

   Implementing the same routing policy is more difficult with the PA
   multihoming solution described in this document since it doesn't use
   NAT. By design, the only way to control where a packet exits this
   network is by setting the source address of the packet.
Since the 582 network cannot modify the source address without NAT, the host must 583 set it. To implement this routing policy, each host needs to use the 584 source address from the prefix assigned by ISP-A to send traffic 585 destined for H101. Mechanisms have been proposed to allow hosts to 586 choose the source address for packets in a fine-grained manner. We 587 will discuss these proposals in Section 6. However, interacting with 588 host operating systems in some manner to ensure a particular source 589 address is chosen for a particular destination prefix is not what an 590 enterprise network administrator would expect to have to do to 591 implement this routing policy. 593 4.4. More complex ISP connectivity 595 The previous sections considered two variations of a simple 596 multihoming scenario where the site is connected to two ISPs offering 597 only Internet connectivity. It is likely that many actual enterprise 598 multihoming scenarios will be similar to this simple example. 599 However, there are more complex multihoming scenarios that we would 600 like this solution to address as well. 602 It is fairly common for an ISP to offer a service in addition to 603 Internet access over the same uplink. Two variations of this are 604 reflected in Figure 3. In addition to Internet access, ISP-A offers 605 a service which requires the site to access host H51 at 606 2001:db8:0:5555::51. The site has a single physical and logical 607 connection with ISP-A, and ISP-A only allows access to H51 over that 608 connection. So when H32 needs to access the service at H51 it needs 609 to send packets with (S=2001:db8:0:a010::32, D=2001:db8:0:5555::51) 610 and those packets need to be forwarded out the link from SERa to ISP-A. 612 2001:db8:0:1234::101 H101 613 | 614 | 615 2001:db8:0:a010::31 -------- 616 2001:db8:0:b010::31 ,-----. / \ 617 +--+ +--+ +----+ ,' `. : : 618 +---|R1|---|R4|---+---|SERa|-+ ISP-A +--+-- : 619 H31--+ +--+ +--+ | +----+ `.
,' : : 620 | | `-----' : Internet : 621 | | | : : 622 | | H51 : : 623 | | 2001:db8:0:5555::51 : : 624 | +--+ : : 625 | |R7| : : 626 | +--+ : : 627 | | : : 628 | | ,-----. : : 629 H32--+ +--+ | +-----+ ,' `. : : 630 +---|R2|-----+----+--|SERb1|-+ ISP-B +--+-- : 631 +--+ | +-----+ `. ,' : : 632 +--+ `--|--' : : 633 2001:db8:0:a010::32 |R8| | \ / 634 +--+ ,--|--. -------- 635 | +-----+ ,' `. | 636 +-------|SERb2|-+ ISP-B | | 637 | +-----+ `. ,' H501 638 | `-----' 2001:db8:0:5678 639 | | ::501 640 +--+ +--+ H61 641 H41------|R3|--|R5| 2001:db8:0:6666::61 642 +--+ +--+ 644 2001:db8:0:a020::41 645 2001:db8:0:b020::41 647 Figure 3: Internet access and services offered by ISP-A and ISP-B 649 ISP-B illustrates a variation on this scenario. In addition to 650 Internet access, ISP-B also offers a service which requires the site 651 to access host H61. The site has two connections to two different 652 parts of ISP-B (shown as SERb1 and SERb2 in Figure 3). ISP-B expects 653 Internet traffic to use the uplink from SERb1, while it expects 654 traffic destined for the service at H61 to use the uplink from SERb2. 655 For either uplink, ISP-B expects the ingress traffic to have a source 656 address matching the prefix it assigned to the site, 657 2001:db8:0:b000::/52. 659 As discussed before, we rely completely on the internal host to set 660 the source address of the packet properly. In the case of a packet 661 sent by H31 to access the service in ISP-B at H61, we expect the 662 packet to have the following addresses: (S=2001:db8:0:b010::31, 663 D=2001:db8:0:6666::61). The routed network has two potential ways of 664 distributing routes so that this packet exits the site on the uplink 665 at SERb2. 667 We could just rely on normal destination routes, without using 668 source-prefix scoped routes. If we have SERb2 originate a normal 669 unscoped destination route for D=2001:db8:0:6666::/64, the packets 670 from H31 to H61 will exit the site at SERb2 as desired. 
We should 671 not have to worry about SERa needing to originate the same route, 672 because ISP-B should choose a globally unique prefix for the service 673 at H61. 675 The alternative is to have SERb2 originate a source-prefix-scoped 676 destination route of the form (S=2001:db8:0:b000::/52, 677 D=2001:db8:0:6666::/64). From a forwarding point of view, the use of 678 the source-prefix-scoped destination route would result in traffic 679 with source addresses corresponding only to ISP-B being sent to 680 SERb2. In contrast, the use of the unscoped destination route would 681 result in traffic with source addresses corresponding to ISP-A and 682 ISP-B being sent to SERb2, as long as the destination address matches 683 the destination prefix. It seems like either forwarding behavior 684 would be acceptable. 686 However, from the point of view of the enterprise network 687 administrator trying to configure, maintain, and troubleshoot this 688 multihoming solution, it seems much clearer to have SERb2 originate 689 the source-prefix-scoped destination route corresponding to the service 690 offered by ISP-B. In this way, all of the traffic leaving the site 691 is determined by the source-prefix-scoped routes, and all of the 692 traffic within the site or arriving from external hosts is determined 693 by the unscoped destination routes. Therefore, for this multihoming 694 solution we choose to originate source-prefix-scoped routes for all 695 traffic leaving the site. 697 4.5. ISPs and Provider-Assigned Prefixes 699 While we expect that most site multihoming involves connecting to 700 only two ISPs, this solution allows for connections to an arbitrary 701 number of ISPs to be supported.
However, when evaluating scalable 702 implementations of the solution, it would be reasonable to assume 703 that the maximum number of ISPs that a site would connect to is five 704 (topologies with two redundant routers each having two uplinks to 705 different ISPs plus a tunnel to a head office acting as a fifth one are 706 not unheard of). 708 It is also useful to note that the prefixes assigned to the site by 709 different ISPs will not overlap. This must be the case, since the 710 provider-assigned addresses have to be globally unique. 712 4.6. Simplified Topologies 714 The topologies of many enterprise sites using this multihoming 715 solution may in practice be simpler than the examples that we have 716 used. The topology in Figure 1 could be further simplified by having 717 all hosts directly connected to the LAN connecting the two site exit 718 routers, SERa and SERb. The topology could also be simplified by 719 having the uplinks to ISP-A and ISP-B both connected to the same site 720 exit router. However, it is the aim of this document to provide a 721 solution that applies to as broad a range of enterprise site network 722 topologies as possible, so this document focuses on providing a solution to the 723 more general case. The simplified cases will also be supported by 724 this solution, and there may even be optimizations that can be made 725 for simplified cases. This solution, however, needs to support more 726 complex topologies. 728 We are starting with the basic assumption that enterprise site 729 networks can be quite complex from a routing perspective. However, 730 even a complex site network can be multihomed to different ISPs with 731 PA addresses using IPv4 and NAT. It is not reasonable to expect an 732 enterprise network operator to change the routing topology of the 733 site in order to deploy IPv6. 735 5.
Generating Source-Prefix-Scoped Forwarding Tables 737 So far we have described in general terms how the routers in this 738 solution that are capable of Source Address Dependent Routing will 739 forward traffic using both normal unscoped destination routes and 740 source-prefix-scoped destination routes. Here we give a precise 741 method for generating a source-prefix-scoped forwarding table on a 742 router that supports SADR. 744 1. Compute the next-hops for the source-prefix-scoped destination 745 prefixes using only routers in the connected SADR domain. These 746 are the initial source-prefix-scoped forwarding table entries. 748 2. Compute the next-hops for the unscoped destination prefixes using 749 all routers in the IGP. This is the unscoped forwarding table. 751 3. For a given source-prefix-scoped forwarding table T (scoped to 752 source prefix P), consider a source-prefix-scoped forwarding 753 table T', whose source prefix P' contains P. We call T the more 754 specific source-prefix-scoped forwarding table, and T' the less 755 specific source-prefix-scoped forwarding table. We select 756 entries in the less specific source-prefix-scoped forwarding 757 table to augment the more specific source-prefix-scoped 758 forwarding table based on the following rules. If a destination 759 prefix of an entry in the less specific source-prefix-scoped 760 forwarding table exactly matches the destination prefix of an 761 existing entry in the more specific source-prefix-scoped 762 forwarding table (including destination prefix length), then do 763 not add the entry to the more specific source-prefix-scoped 764 forwarding table. If the destination prefix does NOT match an 765 existing entry, then add the entry to the more specific source- 766 prefix-scoped forwarding table. 
As the unscoped forwarding table 767 is considered to be scoped to ::/0, this process will propagate 768 routes from the unscoped forwarding table to the more specific 769 source-prefix-scoped forwarding table. If there exist multiple 770 source-prefix-scoped forwarding tables whose source prefixes 771 contain P, these source-prefix-scoped forwarding tables should be 772 processed in order from most specific to least specific. 774 The forwarding tables produced by this process are used in the 775 following way to forward packets. 777 1. Select the most specific (longest prefix match) source-prefix- 778 scoped forwarding table that matches the source address of the 779 packet (again, the unscoped forwarding table is considered to be 780 scoped to ::/0). 782 2. Look up the destination address of the packet in the selected 783 forwarding table to determine the next-hop for the packet. 785 The following example illustrates how this process is used to create 786 a forwarding table for each provider-assigned source prefix. We 787 consider the multihomed site network in Figure 3. Initially we 788 assume that all of the routers in the site network support SADR. 789 Figure 4 shows the routes that are originated by the routers in the 790 site network. 
792 Routes originated by SERa: 793 (S=2001:db8:0:a000::/52, D=2001:db8:0:5555::/64) 794 (S=2001:db8:0:a000::/52, D=::/0) 795 (D=2001:db8:0:5555::/64) 796 (D=::/0) 798 Routes originated by SERb1: 799 (S=2001:db8:0:b000::/52, D=::/0) 800 (D=::/0) 802 Routes originated by SERb2: 803 (S=2001:db8:0:b000::/52, D=2001:db8:0:6666::/64) 804 (D=2001:db8:0:6666::/64) 806 Routes originated by R1: 807 (D=2001:db8:0:a010::/64) 808 (D=2001:db8:0:b010::/64) 810 Routes originated by R2: 811 (D=2001:db8:0:a010::/64) 812 (D=2001:db8:0:b010::/64) 814 Routes originated by R3: 815 (D=2001:db8:0:a020::/64) 816 (D=2001:db8:0:b020::/64) 818 Figure 4: Routes Originated by Routers in the Site Network 820 Each SER originates destination routes which are scoped to the source 821 prefix assigned by the ISP that the SER connects to. Note that the 822 SERs also originate the corresponding unscoped destination route. 823 This is not needed when all of the routers in the site support SADR. 824 However, it is required when some routers do not support SADR. This 825 will be discussed in more detail later. 827 We focus on how R8 constructs its source-prefix-scoped forwarding 828 tables from these route advertisements. R8 computes the next hops 829 for destination routes which are scoped to the source prefix 830 2001:db8:0:a000::/52. The results are shown in the first table in 831 Figure 5. (In this example, the next hops are computed assuming that 832 all links have the same metric.) Then, R8 computes the next hops for 833 destination routes which are scoped to the source prefix 834 2001:db8:0:b000::/52. The results are shown in the second table in 835 Figure 5. Finally, R8 computes the next hops for the unscoped 836 destination prefixes. The results are shown in the third table in 837 Figure 5.
839 forwarding entries scoped to 840 source prefix = 2001:db8:0:a000::/52 841 ============================================ 842 D=2001:db8:0:5555::/64 NH=R7 843 D=::/0 NH=R7 845 forwarding entries scoped to 846 source prefix = 2001:db8:0:b000::/52 847 ============================================ 848 D=2001:db8:0:6666::/64 NH=SERb2 849 D=::/0 NH=SERb1 851 unscoped forwarding entries 852 ============================================ 853 D=2001:db8:0:a010::/64 NH=R2 854 D=2001:db8:0:b010::/64 NH=R2 855 D=2001:db8:0:a020::/64 NH=R5 856 D=2001:db8:0:b020::/64 NH=R5 857 D=2001:db8:0:5555::/64 NH=R7 858 D=2001:db8:0:6666::/64 NH=SERb2 859 D=::/0 NH=SERb1 861 Figure 5: Forwarding Entries Computed at R8 863 The final step is for R8 to augment the more specific source-prefix- 864 scoped forwarding tables with entries from less specific source- 865 prefix-scoped forwarding tables. The unscoped forwarding table is 866 considered as being scoped to ::/0, so both 2001:db8:0:a000::/52 and 867 2001:db8:0:b000::/52 are more specific prefixes of ::/0. Therefore, 868 entries in the unscoped forwarding table will be evaluated to be 869 added to these two more specific source-prefix-scoped forwarding 870 tables. If a forwarding entry from the less specific source-prefix- 871 scoped forwarding table has the exact same destination prefix 872 (including destination prefix length) as the forwarding entry from 873 the more specific source-prefix-scoped forwarding table, then the 874 existing forwarding entry in the more specific source-prefix-scoped 875 forwarding table wins. 877 As an example of how the source scoped forwarding entries are 878 augmented, we consider how the two entries in the first table in 879 Figure 5 (the table for source prefix = 2001:db8:0:a000::/52) are 880 augmented with entries from the third table in Figure 5 (the table of 881 unscoped or scoped to ::/0 forwarding entries).
The first four 882 unscoped forwarding entries (D=2001:db8:0:a010::/64, 883 D=2001:db8:0:b010::/64, D=2001:db8:0:a020::/64, and 884 D=2001:db8:0:b020::/64) are not an exact match for any of the 885 existing entries in the forwarding table for source prefix 886 2001:db8:0:a000::/52. Therefore, these four entries are added to the 887 final forwarding table for source prefix 2001:db8:0:a000::/52. The 888 result of adding these entries is reflected in the first four entries 889 of the first table in Figure 6. 891 The next less specific prefix scoped (scope is ::/0) forwarding table entry 892 is for D=2001:db8:0:5555::/64. This entry is an exact match for the 893 existing entry in the forwarding table for the more specific source 894 prefix 2001:db8:0:a000::/52. Therefore, we do not replace the 895 existing entry with the entry from the unscoped forwarding table. 896 This is reflected in the fifth entry in the first table in Figure 6. 897 (Note that since both scoped and unscoped entries have R7 as the next 898 hop, the result of applying this rule is not visible.) 900 The next less specific prefix scoped (scope is ::/0) forwarding table 901 entry is for D=2001:db8:0:6666::/64. This entry is not an exact 902 match for any existing entries in the forwarding table for source 903 prefix 2001:db8:0:a000::/52. Therefore, we add this entry. This is 904 reflected in the sixth entry in the first table in Figure 6. 906 The next less specific prefix scoped (scope is ::/0) forwarding table 907 entry is for D=::/0. This entry is an exact match for the existing 908 entry in the forwarding table for the more specific source prefix 909 2001:db8:0:a000::/52. Therefore, we do not overwrite the existing 910 source-prefix-scoped entry, as can be seen in the last entry in the 911 first table in Figure 6.
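The augmentation walked through above, together with the two-step lookup described earlier in this section, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a description of any router implementation: the names `tables`, `augment`, and `lookup` are hypothetical, the input data is the set of R8 entries from Figure 5 (with destination prefixes written in full "::/64" form), and exact-match detection relies on destination prefixes being written identically as dictionary keys.

```python
import ipaddress

UNSCOPED = "::/0"  # the unscoped table is treated as scoped to ::/0

# R8 entries before augmentation (Figure 5):
# source prefix -> {destination prefix -> next hop}
tables = {
    "2001:db8:0:a000::/52": {
        "2001:db8:0:5555::/64": "R7",
        "::/0": "R7",
    },
    "2001:db8:0:b000::/52": {
        "2001:db8:0:6666::/64": "SERb2",
        "::/0": "SERb1",
    },
    UNSCOPED: {
        "2001:db8:0:a010::/64": "R2",
        "2001:db8:0:b010::/64": "R2",
        "2001:db8:0:a020::/64": "R5",
        "2001:db8:0:b020::/64": "R5",
        "2001:db8:0:5555::/64": "R7",
        "2001:db8:0:6666::/64": "SERb2",
        "::/0": "SERb1",
    },
}


def augment(tables):
    """Add entries from less specific tables unless the destination prefix
    already exists (exact match) in the more specific table."""
    nets = {src: ipaddress.ip_network(src) for src in tables}
    result = {}
    for src, entries in tables.items():
        merged = dict(entries)
        # Tables whose source prefix contains this one, most specific first.
        parents = sorted(
            (p for p in tables if p != src and nets[src].subnet_of(nets[p])),
            key=lambda p: nets[p].prefixlen,
            reverse=True,
        )
        for parent in parents:
            for dst, next_hop in tables[parent].items():
                merged.setdefault(dst, next_hop)  # existing entry wins
        result[src] = merged
    return result


def lookup(final, src, dst):
    """Step 1: pick the table by longest match on the source address.
    Step 2: longest match on the destination address within that table."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    table_key = max(
        (k for k in final if s in ipaddress.ip_network(k)),
        key=lambda k: ipaddress.ip_network(k).prefixlen,
    )
    best = max(
        (k for k in final[table_key] if d in ipaddress.ip_network(k)),
        key=lambda k: ipaddress.ip_network(k).prefixlen,
    )
    return final[table_key][best]


final = augment(tables)
print(lookup(final, "2001:db8:0:a010::31", "2001:db8:0:1234::101"))
print(lookup(final, "2001:db8:0:b010::31", "2001:db8:0:1234::101"))
```

With the Figure 5 data, the table scoped to 2001:db8:0:a000::/52 ends up with seven entries and keeps NH=R7 for D=::/0, while a packet sourced from the ISP-B prefix follows the default route toward SERb1.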
913 if source address matches 2001:db8:0:a000::/52 914 then use this forwarding table 915 ============================================ 916 D=2001:db8:0:a010::/64 NH=R2 917 D=2001:db8:0:b010::/64 NH=R2 918 D=2001:db8:0:a020::/64 NH=R5 919 D=2001:db8:0:b020::/64 NH=R5 920 D=2001:db8:0:5555::/64 NH=R7 921 D=2001:db8:0:6666::/64 NH=SERb2 922 D=::/0 NH=R7 924 else if source address matches 2001:db8:0:b000::/52 925 then use this forwarding table 926 ============================================ 927 D=2001:db8:0:a010::/64 NH=R2 928 D=2001:db8:0:b010::/64 NH=R2 929 D=2001:db8:0:a020::/64 NH=R5 930 D=2001:db8:0:b020::/64 NH=R5 931 D=2001:db8:0:5555::/64 NH=R7 932 D=2001:db8:0:6666::/64 NH=SERb2 933 D=::/0 NH=SERb1 935 else if source address matches ::/0 use this forwarding table 936 ============================================ 937 D=2001:db8:0:a010::/64 NH=R2 938 D=2001:db8:0:b010::/64 NH=R2 939 D=2001:db8:0:a020::/64 NH=R5 940 D=2001:db8:0:b020::/64 NH=R5 941 D=2001:db8:0:5555::/64 NH=R7 942 D=2001:db8:0:6666::/64 NH=SERb2 943 D=::/0 NH=SERb1 945 Figure 6: Complete Forwarding Tables Computed at R8 947 The forwarding tables produced by this process at R8 have the desired 948 properties. A packet with a source address in 2001:db8:0:a000::/52 949 will be forwarded based on the first table in Figure 6. If the 950 packet is destined for the Internet at large or the service at 951 D=2001:db8:0:5555::/64, it will be sent to R7 in the direction of SERa. 952 If the packet is destined for an internal host, then the first four 953 entries will send it to R2 or R5 as expected. Note that if this 954 packet has a destination address corresponding to the service offered 955 by ISP-B (D=2001:db8:0:6666::/64), then it will get forwarded to 956 SERb2. It will be dropped by SERb2 or by ISP-B, since the packet has 957 a source address that was not assigned by ISP-B. However, this is 958 expected behavior.
In order to use the service offered by ISP-B, the 959 host needs to originate the packet with a source address assigned by 960 ISP-B. 962 In this example, a packet with a source address that doesn't match 963 2001:db8:0:a000::/52 or 2001:db8:0:b000::/52 must have originated 964 from an external host. Such a packet will use the unscoped 965 forwarding table (the last table in Figure 6). These packets will 966 flow exactly as they would in the absence of multihoming. 968 We can also modify this example to illustrate how it supports 969 deployments where not all routers in the site support SADR. 970 Continuing with the topology shown in Figure 3, suppose that R3 and 971 R5 do not support SADR. Instead they are only capable of 972 understanding unscoped route advertisements. The SADR routers in the 973 network will still originate the routes shown in Figure 4. However, 974 R3 and R5 will only understand the unscoped routes as shown in 975 Figure 7. 977 Routes originated by SERa: 978 (D=2001:db8:0:5555::/64) 979 (D=::/0) 981 Routes originated by SERb1: 982 (D=::/0) 984 Routes originated by SERb2: 985 (D=2001:db8:0:6666::/64) 987 Routes originated by R1: 988 (D=2001:db8:0:a010::/64) 989 (D=2001:db8:0:b010::/64) 991 Routes originated by R2: 992 (D=2001:db8:0:a010::/64) 993 (D=2001:db8:0:b010::/64) 995 Routes originated by R3: 996 (D=2001:db8:0:a020::/64) 997 (D=2001:db8:0:b020::/64) 999 Figure 7: Route Advertisements Understood by Routers that do not 1000 Support SADR 1002 With these unscoped route advertisements, R5 will produce the 1003 forwarding table shown in Figure 8.
1005 forwarding table 1006 ============================================ 1007 D=2001:db8:0:a010::/64 NH=R8 1008 D=2001:db8:0:b010::/64 NH=R8 1009 D=2001:db8:0:a020::/64 NH=R3 1010 D=2001:db8:0:b020::/64 NH=R3 1011 D=2001:db8:0:5555::/64 NH=R8 1012 D=2001:db8:0:6666::/64 NH=SERb2 1013 D=::/0 NH=R8 1015 Figure 8: Forwarding Table For R5, Which Doesn't Understand Source- 1016 Prefix-Scoped Routes 1018 As all SERs belong to the SADR domain, any traffic that needs to exit 1019 the site will eventually hit a SADR-capable router. To prevent 1020 routing loops involving SADR-capable and non-SADR-capable routers, 1021 traffic that enters the SADR-capable domain does not leave the domain 1022 until it exits the site. Therefore all SADR-capable routers within 1023 the domain MUST be logically connected. 1025 Note that the mechanism described here for converting source-prefix- 1026 scoped destination prefix routing advertisements into forwarding 1027 state is somewhat different from that proposed in 1028 [I-D.ietf-rtgwg-dst-src-routing]. The method described in the 1029 current document is functionally equivalent, but it is based on 1030 application of existing mechanisms for the described scenarios. 1032 6. Mechanisms For Hosts To Choose Good Source Addresses In A Multihomed 1033 Site 1035 Until this point, we have made the assumption that hosts are able to 1036 choose the correct source address using some unspecified mechanism. 1037 This has allowed us to just focus on what the routers in a multihomed 1038 site network need to do in order to forward packets to the correct 1039 ISP based on source address. Now we look at possible mechanisms for 1040 hosts to choose the correct source address. We also look at what 1041 role, if any, the routers may play in providing information that 1042 helps hosts to choose source addresses. 1044 It should be noted that this section discusses how hosts could select 1045 the default source address for new connections.
Any connection which 1046 already exists on a host is bound to a specific source address, 1047 which cannot be changed. Section 6.7 discusses the connection 1048 preservation issue in more detail. 1050 Any host that needs to be able to send traffic using the uplinks to a 1051 given ISP is expected to be configured with an address from the 1052 prefix assigned by that ISP. The host will control which ISP is used 1053 for its traffic by selecting one of the addresses configured on the 1054 host as the source address for outgoing traffic. It is the 1055 responsibility of the site network to ensure that a packet with the 1056 source address from an ISP is sent on an uplink to that ISP. 1058 If all of the ISP uplinks are working, the choice of source address 1059 by the host may be driven by the desire to load share across ISP 1060 uplinks, or it may be driven by the desire to take advantage of 1061 certain properties of a particular uplink or ISP (if some information 1062 about various path properties has been made available to the host 1063 somehow; see [I-D.ietf-intarea-provisioning-domains] as an example). 1064 If any of the ISP uplinks is not working, then the choice of source 1065 address by the host can cause packets to get dropped. 1067 How a host should make good decisions about source address selection 1068 in a multihomed site is not a solved problem. We do not attempt to 1069 solve this problem in this document. Instead we discuss the current 1070 state of affairs with respect to standardized solutions and 1071 implementation of those solutions. We also look at proposed 1072 solutions for this problem. 1074 An external host initiating communication with a host internal to a 1075 PA multihomed site will need to know multiple addresses for that host 1076 in order to communicate with it using different ISPs to the 1077 multihomed site (knowing just one address would undermine all 1078 benefits of redundant connectivity provided by multihoming).
These 1079 addresses are typically learned through DNS. (For simplicity, we 1080 assume that the external host is single-homed.) The external host 1081 chooses the ISP that will be used at the remote multihomed site by 1082 setting the destination address on the packets it transmits. For a 1083 session originated from an external host to an internal host, the 1084 choice of source address used by the internal host is simple. The 1085 internal host has no choice but to use the destination address in the 1086 received packet as the source address of the transmitted packet. 1088 For a session originated by a host inside the multi-homed site, the 1089 decision of what source address to select is more complicated. We 1090 consider three main methods for hosts to get information about the 1091 network. The two proactive methods are Neighbor Discovery Router 1092 Advertisements and DHCPv6. The one reactive method we consider is 1093 ICMPv6. Note that we are explicitly excluding the possibility of 1094 having hosts participate in or even listen directly to routing 1095 protocol advertisements. 1097 First we look at how a host is currently expected to select the 1098 default source and destination addresses to be used for a new 1099 connection. 1101 6.1. Source Address Selection Algorithm on Hosts 1103 [RFC6724] defines the algorithms that hosts are expected to use to 1104 select source and destination addresses for packets. It defines an 1105 algorithm for selecting a source address and a separate algorithm for 1106 selecting a destination address. Both of these algorithms depend on 1107 a policy table. [RFC6724] defines a default policy which produces 1108 certain behavior. 1110 The rules in the two algorithms in [RFC6724] depend on many different 1111 properties of addresses. 
While these are needed for understanding 1112 how a host should choose addresses in an arbitrary environment, most 1113 of the rules are not relevant for understanding how a host should 1114 choose among multiple source addresses in a multihomed environment when 1115 sending a packet to a remote host. Returning to the example in 1116 Figure 3, we look at what the default algorithms in [RFC6724] say 1117 about the source address that internal host H31 should use to send 1118 traffic to external host H101, somewhere on the Internet. 1120 There is no choice to be made with respect to destination address. 1121 H31 needs to send a packet with D=2001:db8:0:1234::101 in order to 1122 reach H101. So H31 has to choose between using 1123 S=2001:db8:0:a010::31 or S=2001:db8:0:b010::31 as the source address 1124 for this packet. We go through the rules for source address 1125 selection in Section 5 of [RFC6724]. 1127 Rule 1 (Prefer same address) is not useful to break the tie between 1128 source addresses, because neither of the candidate source addresses 1129 equals the destination address. 1131 Rule 2 (Prefer appropriate scope) is also not used in this scenario, 1132 because both source addresses and the destination address have global 1133 scope. 1135 Rule 3 (Avoid deprecated addresses) applies to an address that has 1136 been autoconfigured by a host using stateless address 1137 autoconfiguration as defined in [RFC4862]. An address autoconfigured 1138 by a host has a preferred lifetime and a valid lifetime. The address 1139 is preferred until the preferred lifetime expires, after which it 1140 becomes deprecated. A deprecated address is not used if there is a 1141 preferred address of the appropriate scope available. When the valid 1142 lifetime expires, the address cannot be used at all.
The preferred 1143 and valid lifetimes for an autoconfigured address are set based on 1144 the corresponding lifetimes in the Prefix Information Option in 1145 Neighbor Discovery Router Advertisements. So a possible tool to 1146 control source address selection in this scenario would be to make a host 1147 deprecate an address by having the routers on that link, R1 and 1148 R2 in Figure 3, send a Router Advertisement message containing a 1149 Prefix Information Option for the source prefix to be discouraged (or 1150 prohibited) with the preferred lifetime set to zero. This is a 1151 rather blunt tool, because it discourages or prohibits the use of 1152 that source prefix for all destinations. However, it may be useful 1153 in some scenarios. For example, if all uplinks to a particular ISP 1154 fail, it is desirable to prevent hosts from using source addresses 1155 from that ISP address space. 1157 Rule 4 (Avoid home addresses) does not apply here because we are not 1158 considering Mobile IP. 1160 Rule 5 (Prefer outgoing interface) is not useful in this scenario, 1161 because both source addresses are assigned to the same interface. 1163 Rule 5.5 (Prefer addresses in a prefix advertised by the next-hop) is 1164 not useful in the scenario when both R1 and R2 advertise both 1165 source prefixes. However, this rule may potentially allow a host to 1166 select the correct source prefix by selecting a next-hop. The most 1167 obvious way would be to have R1 advertise itself as a default 1168 router and send a PIO for 2001:db8:0:a010::/64, while R2 advertises 1169 itself as a default router and sends a PIO for 2001:db8:0:b010::/64. 1170 We'll discuss later how Rule 5.5 can be used to influence 1171 source address selection in single-router topologies (e.g. when H41 is 1172 sending traffic using R3 as a default gateway).
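The Rule 5.5 arrangement described above, where each first-hop router advertises only one of the two prefixes, can be illustrated with a short Python sketch. It is a simplified illustration rather than the full [RFC6724] algorithm: the mapping `ADVERTISED_BY` and the function `prefer_by_next_hop` are hypothetical names, and the sketch assumes the host has already selected a next-hop and remembers which router sent which PIO.

```python
import ipaddress

# Assumed state on the host: which default router advertised which prefix
# in a PIO (R1 advertises the ISP-A prefix, R2 the ISP-B prefix).
ADVERTISED_BY = {
    "R1": [ipaddress.ip_network("2001:db8:0:a010::/64")],
    "R2": [ipaddress.ip_network("2001:db8:0:b010::/64")],
}


def prefer_by_next_hop(candidates, next_hop):
    """Keep the candidate source addresses that fall within a prefix
    advertised by the chosen next-hop. If none qualifies, the rule does
    not break the tie and all candidates are returned unchanged."""
    prefixes = ADVERTISED_BY.get(next_hop, [])
    preferred = [
        addr for addr in candidates
        if any(ipaddress.ip_address(addr) in prefix for prefix in prefixes)
    ]
    return preferred or list(candidates)


H31_ADDRESSES = ["2001:db8:0:a010::31", "2001:db8:0:b010::31"]
print(prefer_by_next_hop(H31_ADDRESSES, "R1"))
print(prefer_by_next_hop(H31_ADDRESSES, "R2"))
```

Under these assumptions, choosing R1 as the next-hop leaves only the ISP-A address, and choosing R2 leaves only the ISP-B address. If both routers advertised both prefixes, the filter would return both candidates and the rule would not help.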
1174 Rule 6 (Prefer matching label) refers to the Label value determined 1175 for each source and destination prefix as a result of applying the 1176 policy table to the prefix. With the default policy table defined in 1177 Section 2.1 of [RFC6724], Label(2001:db8:0:a010::31) = 1, 1178 Label(2001:db8:0:b010::31) = 1, and Label(2001:db8:0:1234::101) = 1, since all three addresses match only the ::/0 entry of that table. 1179 So with the default policy, Rule 6 does not break the tie. However, 1180 the algorithms in [RFC6724] are defined in such a way that non- 1181 default address selection policy tables can be used. [RFC7078] 1182 defines a way to distribute a non-default address selection policy 1183 table to hosts using DHCPv6. So even though the application of Rule 1184 6 to this scenario using the default policy table is not useful, Rule 1185 6 may still be a useful tool. 1187 Rule 7 (Prefer temporary addresses) has to do with the technique 1188 described in [RFC4941] to periodically randomize the interface 1189 portion of an IPv6 address that has been generated using stateless 1190 address autoconfiguration. In general, if H31 were using this 1191 technique, it would use it for both source addresses, for example 1192 creating temporary addresses 2001:db8:0:a010:2839:9938:ab58:830f and 1193 2001:db8:0:b010:4838:f483:8384:3208, in addition to 1194 2001:db8:0:a010::31 and 2001:db8:0:b010::31. So this rule would 1195 prefer the two temporary addresses, but it would not break the tie 1196 between the two source prefixes from ISP-A and ISP-B. 1198 Rule 8 (Use longest matching prefix) dictates that between two 1199 candidate source addresses the one which has the longest common prefix 1200 with the destination address is preferred. For example, if H31 were 1201 selecting the source address for sending packets to H101, this rule 1202 would not be a tie breaker, as for both candidate source addresses 1203 2001:db8:0:a010::31 and 2001:db8:0:b010::31 the common prefix length 1204 with the destination is 48.
However, if H31 were selecting the source 1205 address for sending packets to H41 at address 2001:db8:0:a020::41, then 1206 this rule would result in using 2001:db8:0:a010::31 as a source 1207 (2001:db8:0:a010::31 and 2001:db8:0:a020::41 share the common prefix 1208 2001:db8:0:a000::/58, while for 2001:db8:0:b010::31 and 1209 2001:db8:0:a020::41 the common prefix is 2001:db8:0:a000::/51). 1210 Therefore Rule 8 might be useful for selecting the correct source 1211 address in some but not all scenarios (for example if ISP-B services 1212 belong to 2001:db8:0:b000::/59 then H31 would always use 1213 2001:db8:0:b010::31 to access those destinations). 1215 So we can see that of the 8 source address selection rules from 1216 [RFC6724], four actually apply to our basic site multihoming 1217 scenario. The rules that are relevant to this scenario are 1218 summarized below. 1220 o Rule 3: Avoid deprecated addresses. 1222 o Rule 5.5: Prefer addresses in a prefix advertised by the next-hop. 1224 o Rule 6: Prefer matching label. 1226 o Rule 8: Use longest matching prefix. 1228 The two methods that we discuss for controlling the source address 1229 selection through the four relevant rules above are SLAAC Router 1230 Advertisement messages and DHCPv6. 1232 We also consider a possible role for ICMPv6 for getting traffic- 1233 driven feedback from the network. With the source address selection 1234 algorithm discussed above, the goal is to choose the correct source 1235 address on the first try, before any traffic is sent. However, 1236 another strategy is to choose a source address, send the packet, get 1237 feedback from the network about whether or not the source address is 1238 correct, and try another source address if it is not. 1240 We consider four scenarios where a host needs to select the correct 1241 source address. The first is when both uplinks are working. The 1242 second is when one uplink has failed.
The third one is a situation 1243 when one failed uplink has recovered. The last one is failure of 1244 both (all) uplinks. 1246 It should be noted that [RFC6724] only defines the behavior of IPv6 1247 hosts to select default addresses that applications and upper-layer 1248 protocols can use. Applications and upper-layer protocols can make 1249 their own choices on selecting source addresses. The mechanism 1250 proposed in this document attempts to ensure that the subset of 1251 source addresses available for applications and upper-layer protocols 1252 is selected with the up-to-date network state in mind. The rest of 1253 the document discusses various aspects of the default source address 1254 selection defined in [RFC6724], calling it for the sake of brevity 1255 "the source address selection". 1257 6.2. Selecting Source Address When Both Uplinks Are Working 1259 Again we return to the topology in Figure 3. Suppose that the site 1260 administrator wants to implement a policy by which all hosts need to 1261 use ISP-A to reach H101 at D=2001:db8:0:1234::101. So for example, 1262 H31 needs to select S=2001:db8:0:a010::31. 1264 6.2.1. Distributing Address Selection Policy Table with DHCPv6 1266 This policy can be implemented by using DHCPv6 to distribute an 1267 address selection policy table that assigns the same label to 1268 destination address that match 2001:db8:0:1234::/64 as it does to 1269 source addresses that match 2001:db8:0:a000::/52. The following two 1270 entries accomplish this. 1272 Prefix Precedence Label 1273 2001:db8:0:1234::/64 50 33 1274 2001:db8:0:a000::/52 50 33 1276 Figure 9: Policy table entries to implement a routing policy 1278 This requires that the hosts implement [RFC6724], the basic source 1279 and destination address framework, along with [RFC7078], the DHCPv6 1280 extension for distributing a non-default policy table. Note that it 1281 does NOT require that the hosts use DHCPv6 for address assignment. 
1282 The hosts could still use stateless address autoconfiguration for 1283 address configuration, while using DHCPv6 only for policy table 1284 distribution (see [RFC8415]). However, this method has a number of 1285 disadvantages: 1287 o DHCPv6 support is not a mandatory requirement for IPv6 hosts 1288 ([RFC6434]), so this method might not work for all devices. 1290 o Network administrators are required to explicitly configure the 1291 desired network access policies on DHCPv6 servers. While it might 1292 be feasible in the scenario of a single multihomed network, such 1293 an approach might have some scalability issues, especially if the 1294 centralized DHCPv6 solution is deployed to serve a large number of 1295 multihomed sites. 1297 6.2.2. Controlling Source Address Selection With Router Advertisements 1299 Neighbor Discovery currently has two mechanisms to communicate prefix 1300 information to hosts. The base specification for Neighbor Discovery 1301 (see [RFC4861]) defines the Prefix Information Option (PIO) in the 1302 Router Advertisement (RA) message. When a host receives a PIO with 1303 the A-flag set, it can use the prefix in the PIO as a source prefix 1304 from which it assigns itself an IP address using stateless address 1305 autoconfiguration (SLAAC) procedures described in [RFC4862]. In the 1306 example of Figure 3, if the site network is using SLAAC, we would 1307 expect both R1 and R2 to send RA messages with PIOs for both source 1308 prefixes 2001:db8:0:a010::/64 and 2001:db8:0:b010::/64 with the 1309 A-flag set. H31 would then use the SLAAC procedure to configure 1310 itself with the addresses 2001:db8:0:a010::31 and 2001:db8:0:b010::31. 1312 Whereas a host learns about source prefixes from PIOs, hosts 1313 can learn about a destination prefix from a Router Advertisement 1314 containing a Route Information Option (RIO), as specified in [RFC4191].
1315 The destination prefixes in RIOs are intended to allow a host to 1316 choose the router that it uses as its first hop to reach a particular 1317 destination prefix. 1319 As currently standardized, neither PIO nor RIO options contained in 1320 Neighbor Discovery Router Advertisements can communicate the 1321 information needed to implement the desired routing policy. PIOs 1322 communicate source prefixes, and RIOs communicate destination 1323 prefixes. However, there is currently no standardized way to 1324 directly associate a particular destination prefix with a particular 1325 source prefix. 1327 [I-D.pfister-6man-sadr-ra] proposes a Source Address Dependent Route 1328 Information option for Neighbor Discovery Router Advertisements which 1329 would associate a source prefix with a destination prefix. The 1330 details of [I-D.pfister-6man-sadr-ra] might need tweaking to address 1331 this use case. However, in order to be able to use Neighbor 1332 Discovery Router Advertisements to implement this routing policy, an 1333 extension that allows R1 and R2 to explicitly communicate to H31 an 1334 association between S=2001:db8:0:a000::/52 and D=2001:db8:0:1234::/64 1335 would be needed. 1337 However, Rule 5.5 of the default source address selection algorithm 1338 (discussed in Section 6.1 above), together with default router 1339 preference (specified in [RFC4191]) and RIO, can be used to influence 1340 source address selection on a host as described below. Let's look 1341 at source address selection on the host H41. It receives RAs from R3 1342 with PIOs for 2001:db8:0:a020::/64 and 2001:db8:0:b020::/64. At that 1343 point all traffic would use the same next-hop (R3 link-local address) 1344 so Rule 5.5 does not apply. Now let's assume that R3 supports SADR 1345 and has two scoped forwarding tables, one scoped to 1346 S=2001:db8:0:a000::/52 and another scoped to S=2001:db8:0:b000::/52.
1347 Suppose R3 generates two different link-local addresses for its interface 1348 facing H41 (one for each scoped forwarding table, LLA_A and LLA_B) 1349 and starts sending two different RAs: one is sent from LLA_A and 1350 includes a PIO for 2001:db8:0:a020::/64, and the other is sent from LLA_B and 1351 includes a PIO for 2001:db8:0:b020::/64. It is then possible to 1352 influence H41 source address selection for destinations which follow 1353 the default route by setting default router preference in RAs. If it 1354 is desired that H41 reaches H101 (or any destinations in the 1355 Internet) via ISP-A, then RAs sent from LLA_A should have default 1356 router preference set to 01 (high priority), while RAs sent from 1357 LLA_B should have preference set to 11 (low). Then LLA_A would be 1358 chosen as a next-hop for H101 and therefore (as per Rule 5.5) 1359 2001:db8:0:a020::41 would be selected as the source address. If, at 1360 the same time, it is desired that H61 is accessible via ISP-B then R3 1361 should include a RIO for 2001:db8:0:6666::/64 in its RA sent from 1362 LLA_B. H41 would choose LLA_B as a next-hop for all traffic to H61 1363 and then as per Rule 5.5, 2001:db8:0:b020::41 would be selected as a 1364 source address. 1366 If in the above mentioned scenario it is desirable that all Internet 1367 traffic leaves the network via ISP-A and the link to ISP-B is used 1368 for accessing ISP-B services only (not as ISP-A link backup), then 1369 RAs sent by R3 from LLA_B should have Router Lifetime set to 0 and 1370 should include RIOs for ISP-B address space. This would instruct H41 1371 to use LLA_A for all Internet traffic but use LLA_B as a next-hop 1372 while sending traffic to ISP-B addresses. 1374 The description of the mechanism above assumes SADR support by the 1375 first-hop routers as well as SERs. However, a first-hop router can 1376 still provide a less flexible version of this mechanism even without 1377 implementing SADR.
This could be done by providing configuration 1378 knobs on the first-hop router that allow it to generate different 1379 link-local addresses and to send individual RAs for each prefix. 1381 The mechanism described above relies on Rule 5.5 of the default 1382 source address selection algorithm defined in [RFC6724]. [RFC8028] 1383 states that "A host SHOULD select default routers for each prefix it 1384 is assigned an address in". It also recommends that hosts should 1385 implement Rule 5.5 of [RFC6724]. Hosts following the 1386 recommendations specified in [RFC8028] therefore should be able to 1387 benefit from the solution described in this document. No standards 1388 need to be updated with regard to host behavior. 1390 6.2.3. Controlling Source Address Selection With ICMPv6 1392 We now discuss how one might use ICMPv6 to implement the routing 1393 policy to send traffic destined for H101 out the uplink to ISP-A, 1394 even when uplinks to both ISPs are working. If H31 started sending 1395 traffic to H101 with S=2001:db8:0:b010::31 and 1396 D=2001:db8:0:1234::101, it would be routed through SERb1 and out the 1397 uplink to ISP-B. SERb1 could recognize that this traffic is not 1398 following the desired routing policy and react by sending an ICMPv6 1399 message back to H31. 1401 In this example, we could arrange things so that SERb1 drops the 1402 packet with S=2001:db8:0:b010::31 and D=2001:db8:0:1234::101, and 1403 then sends to H31 an ICMPv6 Destination Unreachable message with Code 1404 5 (Source address failed ingress/egress policy). When H31 receives 1405 this message, it would then be expected to try another source address 1406 to reach the destination. In this example, H31 would then send a 1407 packet with S=2001:db8:0:a010::31 and D=2001:db8:0:1234::101, which 1408 will reach SERa and be forwarded out the uplink to ISP-A.
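The retry behavior just described can be sketched as follows. This is only an illustration under stated assumptions, not a host implementation: the names `network_response` and `send_with_retry` are hypothetical, SERb1's policy is modeled as a plain prefix check, and the ICMPv6 exchange is reduced to a return value.

```python
import ipaddress

# Policy from the example: traffic for H101's prefix must leave via ISP-A,
# so the modeled SERb1 rejects it when the source is from ISP-B space.
ISP_B_SOURCES = ipaddress.ip_network("2001:db8:0:b000::/52")
POLICY_DESTINATIONS = ipaddress.ip_network("2001:db8:0:1234::/64")

CODE_5 = "destination-unreachable-code-5"  # hypothetical stand-in for the ICMPv6 error


def network_response(src, dst):
    """Return CODE_5 if the modeled SERb1 would drop the packet, else None."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    if s in ISP_B_SOURCES and d in POLICY_DESTINATIONS:
        return CODE_5
    return None


def send_with_retry(candidates, dst):
    """Try each candidate source in order, moving on when Code 5 comes back."""
    attempts = []
    for src in candidates:
        attempts.append(src)
        if network_response(src, dst) is None:
            return src, attempts
    return None, attempts


chosen, attempts = send_with_retry(
    ["2001:db8:0:b010::31", "2001:db8:0:a010::31"],
    "2001:db8:0:1234::101",
)
print(chosen, len(attempts))
```

In this sketch H31 needs two attempts before settling on the ISP-A source address. With more candidate addresses on the interface, the number of attempts grows accordingly.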
1410 However, we would also want it to be the case that SERb1 does not 1411 enforce this routing policy when the uplink from SERa to ISP-A has 1412 failed. This could be accomplished by having SERa originate a 1413 source-prefix-scoped route for (S=2001:db8:0:a000::/52, 1414 D=2001:db8:0:1234::/64) and have SERb1 monitor the presence of that 1415 route. If that route is not present (because SERa has stopped 1416 originating it), then SERb1 will not enforce the routing policy, and 1417 it will forward packets with S=2001:db8:0:b010::31 and 1418 D=2001:db8:0:1234::101 out its uplink to ISP-B. 1420 We can also use this source-prefix-scoped route originated by SERa to 1421 communicate the desired routing policy to SERb1. We can define an 1422 EXCLUSIVE flag to be advertised together with the IGP route for 1423 (S=2001:db8:0:a000::/52, D=2001:db8:0:1234::/64). This would allow 1424 SERa to communicate to SERb that SERb should reject traffic for 1425 D=2001:db8:0:1234::/64 and respond with an ICMPv6 Destination 1426 Unreachable Code 5 message, as long as the route for 1427 (S=2001:db8:0:a000::/52, D=2001:db8:0:1234::/64) is present. The 1428 definition of an EXCLUSIVE flag for SADR advertisements in IGPs would 1429 require future standardization work. 1431 Finally, if we are willing to extend ICMPv6 to support this solution, 1432 then we could create a mechanism for SERb1 to tell the host what 1433 source address it should be using to successfully forward packets 1434 that meet the policy. In its current form, when SERb1 sends an 1435 ICMPv6 Destination Unreachable Code 5 message, it is basically 1436 saying, "This source address is wrong. Try another source address." 1437 In the absence of a clear indication which address to try next, the 1438 host will iterate over all addresses assigned to the interface (e.g. 1439 various privacy addresses) which would lead to significant delays and 1440 degraded user experience. 
It would be better if the ICMPv6 1441 message could say, "This source address is wrong. Instead use a 1442 source address in S=2001:db8:0:a000::/52." 1444 However, using ICMPv6 for signaling source address information back to 1445 hosts introduces new challenges. Most routers currently have 1446 software or hardware limits on generating ICMP messages. A site 1447 administrator deploying a solution that relies on the SERs generating 1448 ICMP messages could try to improve the performance of SERs for 1449 generating ICMP messages. However, in a large network, it is still 1450 likely that ICMP message generation limits will be reached. As a 1451 result, hosts would not receive ICMPv6 messages back, which in turn leads to 1452 traffic blackholing and poor user experience. To improve the 1453 scalability of ICMPv6-based signaling, hosts SHOULD cache the 1454 preferred source address (or prefix) for the given destination (which 1455 in turn might cause issues in case of the corresponding ISP uplink 1456 failure - see Section 6.3). In addition, the same source prefix 1457 SHOULD be used for other destinations in the same /64 as the original 1458 destination address. The source prefix to the destination mapping 1459 SHOULD have a specific lifetime. Expiration of the lifetime SHOULD 1460 trigger the source address selection algorithm again. 1462 Using ICMPv6 Destination Unreachable Messages with Code 5 to 1463 influence source address selection introduces some security 1464 challenges which are discussed in Section 10. 1466 As currently standardized in [RFC4443], the ICMPv6 Destination 1467 Unreachable Message with Code 5 would allow for the iterative 1468 approach to retransmitting packets using different source addresses. 1469 As currently defined, the ICMPv6 message does not provide a mechanism 1470 to communicate information about which source prefix should be used 1471 for a retransmitted packet.
The current document does not define 1472 such a mechanism but it might be a useful extension to define in a 1473 different document. However, this approach has some security 1474 implications, such as the ability for an attacker to send spoofed 1475 ICMPv6 messages to signal an invalid/unreachable source prefix, causing 1476 a DoS-type attack. 1478 6.2.4. Summary of Methods For Controlling Source Address Selection To 1479 Implement Routing Policy 1481 So to summarize this section, we have looked at three methods for 1482 implementing a simple routing policy where all traffic for a given 1483 destination on the Internet needs to use a particular ISP, even when 1484 the uplinks to both ISPs are working. 1486 The default source address selection policy cannot distinguish 1487 between the source addresses needed to enforce this policy, so a non- 1488 default policy table associating source and destination 1489 prefixes using Label values would need to be installed on each host. 1490 A mechanism exists for DHCPv6 to distribute a non-default policy 1491 table but such a solution would heavily rely on DHCPv6 support by the host 1492 operating system. Moreover, there is no mechanism to translate 1493 desired routing/traffic engineering policies into policy tables on 1494 DHCPv6 servers. Therefore using DHCPv6 for controlling the address 1495 selection policy table is not recommended and SHOULD NOT be used. 1497 At the same time Router Advertisements provide a reliable mechanism 1498 to influence the source address selection process via PIO, RIO and 1499 default router preferences. As all those options have been 1500 standardized by the IETF and are supported by various operating systems, 1501 no changes are required on hosts. First-hop routers in the 1502 enterprise network need to be capable of sending different RAs for 1503 different SLAAC prefixes (either based on scoped forwarding tables or 1504 based on pre-configured policies).
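The interaction of PIO, RIO, default router preference and Rule 5.5 summarized above can be illustrated with a short sketch based on the H41 example of Section 6.2.2. It is a simplified model, not a Neighbor Discovery implementation: the `RouterAdvert` type and both function names are hypothetical, RIO routes are assumed to carry medium preference, and next-hop choice is reduced to longest-prefix match with the [RFC4191] preference as tie-breaker.

```python
import ipaddress
from dataclasses import dataclass, field

# RFC 4191 preference values: 01 = high, 00 = medium, 11 = low.
PREF = {"high": 1, "medium": 0, "low": -1}


@dataclass
class RouterAdvert:
    lla: str                          # link-local source address of the RA
    pio: list                         # prefixes carried in PIOs
    default_pref: str = "medium"      # default router preference
    router_lifetime: int = 1800       # 0 means "not a default router"
    rio: list = field(default_factory=list)  # prefixes carried in RIOs


def pick_next_hop(ras, dst):
    """Longest-prefix match over RIOs and default routes, then preference."""
    dst = ipaddress.ip_address(dst)
    best_key, best_ra = None, None
    for ra in ras:
        routes = [(ipaddress.ip_network(p), PREF["medium"]) for p in ra.rio]
        if ra.router_lifetime > 0:
            routes.append((ipaddress.ip_network("::/0"), PREF[ra.default_pref]))
        for net, pref in routes:
            if dst in net:
                key = (net.prefixlen, pref)
                if best_key is None or key > best_key:
                    best_key, best_ra = key, ra
    return best_ra


def pick_source(ras, host_addrs, dst):
    """Rule 5.5: prefer an address in a prefix advertised by the next-hop."""
    ra = pick_next_hop(ras, dst)
    if ra is None:
        return None
    for addr in host_addrs:
        if any(ipaddress.ip_address(addr) in ipaddress.ip_network(p) for p in ra.pio):
            return addr
    return None


ras = [
    RouterAdvert("LLA_A", ["2001:db8:0:a020::/64"], default_pref="high"),
    RouterAdvert("LLA_B", ["2001:db8:0:b020::/64"], default_pref="low",
                 rio=["2001:db8:0:6666::/64"]),
]
h41 = ["2001:db8:0:a020::41", "2001:db8:0:b020::41"]
print(pick_source(ras, h41, "2001:db8:0:1234::101"))  # traffic to H101
print(pick_source(ras, h41, "2001:db8:0:6666::61"))   # traffic to H61
```

Setting `router_lifetime=0` on the LLA_B advertisement gives the same two results in this model, which corresponds to the variant where the ISP-B link is used only for ISP-B services.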
1506 SERs can enforce the routing policy by sending ICMPv6 Destination 1507 Unreachable messages with Code 5 (Source address failed ingress/ 1508 egress policy) for traffic that is being sent with the wrong source 1509 address. The policy distribution could be automated by defining an 1510 EXCLUSIVE flag for the source-prefix-scoped route which can be set on 1511 the SER that originates the route. As ICMPv6 message generation can 1512 be rate-limited on routers, it SHOULD NOT be used as the only 1513 mechanism to influence source address selection on hosts. While 1514 hosts SHOULD select the correct source address for a given 1515 destination the network SHOULD signal any source address issues back 1516 to hosts using ICMPv6 error messages. 1518 6.3. Selecting Source Address When One Uplink Has Failed 1520 Now we discuss if DHCPv6, Neighbor Discovery Router Advertisements, 1521 and ICMPv6 can help a host choose the right source address when an 1522 uplink to one of the ISPs has failed. Again we look at the scenario 1523 in Figure 3. This time we look at traffic from H31 destined for 1524 external host H501 at D=2001:db8:0:5678::501. We initially assume 1525 that the uplink from SERa to ISP-A is working and that the uplink 1526 from SERb1 to ISP-B is working. 1528 We assume there is no particular routing policy desired, so H31 is 1529 free to send packets with S=2001:db8:0:a010::31 or 1530 S=2001:db8:0:b010::31 and have them delivered to H501. For this 1531 example, we assume that H31 has chosen S=2001:db8:0:b010::31 so that 1532 the packets exit via SERb to ISP-B. Now we see what happens when the 1533 link from SERb1 to ISP-B fails. How should H31 learn that it needs 1534 to start sending the packet to H501 with S=2001:db8:0:a010::31 in 1535 order to start using the uplink to ISP-A? We need to do this in a 1536 way that doesn't prevent H31 from still sending packets with 1537 S=2001:db8:0:b010::31 in order to reach H61 at D=2001:db8:0:6666::61. 1539 6.3.1. 
Controlling Source Address Selection With DHCPv6 1541 For this example we assume that the site network in Figure 3 has a 1542 centralized DHCP server and all routers act as DHCP relay agents. We 1543 assume that both of the addresses assigned to H31 were assigned via 1544 DHCP. 1546 We could try to have the DHCP server monitor the state of the uplink 1547 from SERb1 to ISP-B in some manner and then tell H31 that it can no 1548 longer use S=2001:db8:0:b010::31 by setting its valid lifetime to 1549 zero. The DHCP server could initiate this process by sending a 1550 Reconfigure Message to H31 as described in Section 18.3 of [RFC8415]. 1551 Or the DHCP server can assign addresses with short lifetimes in order 1552 to force clients to renew them often. 1554 This approach would prevent H31 from using S=2001:db8:0:b010::31 to 1555 reach a host on the Internet. However, it would also prevent H31 1556 from using S=2001:db8:0:b010::31 to reach H61 at 1557 D=2001:db8:0:6666::61, which is not desirable. 1559 Another potential approach is to have the DHCP server monitor the 1560 uplink from SERb1 to ISP-B and control the choice of source address 1561 on H31 by updating its address selection policy table via the 1562 mechanism in [RFC7078]. The DHCP server could initiate this process 1563 by sending a Reconfigure Message to H31. Note that [RFC8415] 1564 requires that the Reconfigure Message use DHCP authentication. DHCP 1565 authentication could be avoided by using short address lifetimes to 1566 force clients to send Renew messages to the server often. If the 1567 host is not obtaining its IP addresses from the DHCP server, then it 1568 would need to use the Information Refresh Time option defined in 1569 [RFC8415]. 1571 If the following policy table can be installed on H31 after the 1572 failure of the uplink from SERb1, then the desired routing behavior 1573 should be achieved based on source and destination prefix being 1574 matched with label values.
1576 Prefix Precedence Label 1577 ::/0 50 44 1578 2001:db8:0:a000::/52 50 44 1579 2001:db8:0:6666::/64 50 55 1580 2001:db8:0:b000::/52 50 55 1582 Figure 10: Policy Table Needed On Failure Of Uplink From SERb1 1584 The described solution has a number of significant drawbacks, some of 1585 them already discussed in Section 6.2.1. 1587 o DHCPv6 support is not required for an IPv6 host and there are 1588 operating systems which do not support DHCPv6. Besides that, it 1589 does not appear that [RFC7078] has been widely implemented on host 1590 operating systems. 1592 o [RFC7078] does not clearly specify this kind of dynamic use case 1593 where address selection policy needs to be updated quickly in 1594 response to the failure of a link. In a large network it would 1595 present scalability issues as many hosts need to be reconfigured 1596 in a very short period of time. 1598 o Updating DHCPv6 server configuration each time an ISP uplink 1599 changes its state introduces some scalability issues, especially 1600 for mid- to large-scale distributed enterprise networks. In addition 1601 to that, the policy table needs to be manually configured by 1602 administrators, which makes that solution prone to human error. 1604 o No mechanism exists for making DHCPv6 servers aware of 1605 topology/routing changes in the network. In general, having DHCPv6 1606 servers monitor network-related events sounds like a bad idea, 1607 as completely new functionality beyond the scope of the DHCPv6 role is 1608 required. 1610 6.3.2. Controlling Source Address Selection With Router Advertisements 1612 The same mechanism as discussed in Section 6.2.2 can be used to 1613 control the source address selection in the case of an uplink 1614 failure. If a particular prefix should not be used as a source for 1615 any destinations, then the router needs to send an RA with the Preferred 1616 Lifetime field for that prefix set to 0.
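The effect of a zero Preferred Lifetime on the host can be sketched as follows. This is a simplified model that assumes the host tracks one preferred lifetime per address and applies Rule 3 (Avoid deprecated addresses) as a filter. The function names and the lifetime value of 14400 seconds are illustrative only.

```python
import ipaddress


def apply_pio(addrs, prefix, preferred_lifetime):
    """Update the preferred lifetime of every address inside the PIO prefix."""
    net = ipaddress.ip_network(prefix)
    for addr in addrs:
        if ipaddress.ip_address(addr) in net:
            addrs[addr] = preferred_lifetime


def candidate_sources(addrs):
    """Rule 3: non-deprecated addresses win; deprecated ones remain a fallback."""
    preferred = [a for a, lifetime in addrs.items() if lifetime > 0]
    return preferred or list(addrs)


# H41's addresses with an assumed preferred lifetime of 14400 seconds each.
h41 = {"2001:db8:0:a020::41": 14400, "2001:db8:0:b020::41": 14400}

# The router withdraws the ISP-B prefix by advertising Preferred Lifetime 0.
apply_pio(h41, "2001:db8:0:b020::/64", 0)
print(candidate_sources(h41))
```

The deprecated address stays configured, so existing connections that use it are not torn down by this step. It is simply no longer chosen for new traffic while a preferred address exists.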
1618 Let's consider a scenario when all uplinks are operational and H41 1619 receives two different RAs from R3: one from LLA_A with a PIO for 1620 2001:db8:0:a020::/64 and default router preference set to 11 (low), and 1621 another one from LLA_B with a PIO for 2001:db8:0:b020::/64, default 1622 router preference set to 01 (high) and a RIO for 2001:db8:0:6666::/64. 1624 As a result H41 is using 2001:db8:0:b020::41 as a source address for 1625 all Internet traffic and those packets are sent by SERs to ISP-B. If 1626 the SERb1 uplink to ISP-B fails, the desired behavior is that H41 stops 1627 using 2001:db8:0:b020::41 as a source address for all destinations 1628 but H61. To achieve that, R3 should react to the SERb1 uplink failure 1629 (which could be detected as the disappearance of the scoped route (S=2001:db8:0:b000::/52, 1630 D=::/0)) by withdrawing itself as a default router. R3 1631 sends a new RA from LLA_B with the Router Lifetime value set to 0 (which 1632 means that it should not be used as a default router). That RA still 1633 contains a PIO for 2001:db8:0:b020::/64 (for SLAAC purposes) and a RIO 1634 for 2001:db8:0:6666::/64 so H41 can reach H61 using LLA_B as a next- 1635 hop and 2001:db8:0:b020::41 as a source address. For all traffic 1636 following the default route, LLA_A will be used as a next-hop and 1637 2001:db8:0:a020::41 as a source address. 1639 If all uplinks to ISP-B have failed and therefore source addresses 1640 from ISP-B address space should not be used at all, the forwarding 1641 table scoped to S=2001:db8:0:b000::/52 contains no entries. Hosts can 1642 be instructed to stop using source addresses from that block by 1643 sending RAs containing a PIO with Preferred Lifetime set to 0. 1645 6.3.3. Controlling Source Address Selection With ICMPv6 1647 Now we look at how ICMPv6 messages can provide information back to 1648 H31. We assume again that at the time of the failure H31 is sending 1649 packets to H501 using (S=2001:db8:0:b010::31, 1650 D=2001:db8:0:5678::501).
When the uplink from SERb1 to ISP-B fails, 1651 SERb1 would stop originating its source-prefix-scoped route for the 1652 default destination (S=2001:db8:0:b000::/52, D=::/0) as well as its 1653 unscoped default destination route. With these routes no longer in 1654 the IGP, traffic with (S=2001:db8:0:b010::31, D=2001:db8:0:5678::501) 1655 would end up at SERa based on the unscoped default destination route 1656 being originated by SERa. Since that traffic has the wrong source 1657 address to be forwarded to ISP-A, SERa would drop it and send a 1658 Destination Unreachable message with Code 5 (Source address failed 1659 ingress/egress policy) back to H31. H31 would then know to use 1660 another source address for that destination and would try with 1661 (S=2001:db8:0:a010::31, D=2001:db8:0:5678::501). This would be 1662 forwarded to SERa based on the source-prefix-scoped default 1663 destination route still being originated by SERa, and SERa would 1664 forward it to ISP-A. As discussed above, if we are willing to extend 1665 ICMPv6, SERa can even tell H31 what source address it should use to 1666 reach that destination. The expected host behaviour has been 1667 discussed in Section 6.2.3. Using ICMPv6 would have the same 1668 scalability/rate limiting issues discussed in Section 6.2.3. ISP-B 1669 uplink failure immediately makes source addresses from 1670 2001:db8:0:b000::/52 unsuitable for external communication and might 1671 trigger a large number of ICMPv6 packets being sent to hosts in that 1672 subnet. 1674 6.3.4. Summary Of Methods For Controlling Source Address Selection On 1675 The Failure Of An Uplink 1677 It appears that DHCPv6 is not particularly well suited to quickly 1678 changing the source address used by a host in the event of the 1679 failure of an uplink, which eliminates DHCPv6 from the list of 1680 potential solutions. 
On the other hand, Router Advertisements 1681 provide a reliable mechanism to dynamically provide hosts with a 1682 list of valid prefixes to use as source addresses as well as to prevent 1683 particular prefixes from being used. While no additional new features are 1684 required to be implemented on hosts, routers need to be able to send 1685 RAs based on the state of scoped forwarding table entries and to 1686 react to network topology changes by sending RAs with particular 1687 parameters set. 1689 The use of ICMPv6 Destination Unreachable messages generated by the 1690 SERs (or any SADR-capable routers) seems to have the potential 1691 to provide a support mechanism together with RAs to signal source 1692 address selection errors back to hosts; however, scalability issues 1693 may arise in large networks in case of a sudden topology change. 1694 Therefore it is highly desirable that hosts are able to select the 1695 correct source address in case of an uplink failure, with ICMPv6 being 1696 an additional mechanism to signal unexpected failures back to hosts. 1698 The current behavior of different host operating systems when 1699 receiving an ICMPv6 Destination Unreachable message with Code 5 (Source 1700 address failed ingress/egress policy) is not clear to the authors. 1701 Information from implementers, users, and testing would be quite 1702 helpful in evaluating this approach. 1704 6.4. Selecting Source Address Upon Failed Uplink Recovery 1706 The next logical step is to look at the scenario when a failed uplink 1707 on SERb1 to ISP-B is coming back up, so hosts can start using source 1708 addresses belonging to 2001:db8:0:b000::/52 again. 1710 6.4.1. Controlling Source Address Selection With DHCPv6 1712 The mechanism to use DHCPv6 to instruct the hosts (H31 in our 1713 example) to start using prefixes from ISP-B space (e.g.
1714 S=2001:db8:0:b010::31 for H31) to reach hosts on the Internet is 1715 quite similar to the one discussed in Section 6.3.1 and shares the same 1716 drawbacks. 1718 6.4.2. Controlling Source Address Selection With Router Advertisements 1720 Let's look at the scenario discussed in Section 6.3.2. If the 1721 uplink(s) failure caused the complete withdrawal of prefixes from the 1722 2001:db8:0:b000::/52 address space by setting the Preferred Lifetime 1723 value to 0, then the recovery of the link should just trigger a new RA 1724 being sent with a non-zero Preferred Lifetime. In another scenario 1725 discussed in Section 6.3.2, the SERb1 uplink to ISP-B failure leads 1726 to disappearance of the (S=2001:db8:0:b000::/52, D=::/0) entry from 1727 the forwarding table scoped to S=2001:db8:0:b000::/52 and, in turn, 1728 causes R3 to send RAs from LLA_B with Router Lifetime set to 0. The 1729 recovery of the SERb1 uplink to ISP-B leads to the 1730 (S=2001:db8:0:b000::/52, D=::/0) scoped forwarding entry re- 1731 appearing and instructs R3 that it should advertise itself as a 1732 default router for the ISP-B address space domain (send RAs from LLA_B 1733 with non-zero Router Lifetime). 1735 6.4.3. Controlling Source Address Selection With ICMPv6 1737 It looks like ICMPv6 provides rather limited functionality to 1738 signal back to hosts that particular source addresses have become 1739 valid again. Unless the changes in the uplink state affect a particular 1740 (S,D) pair, hosts can keep using the same source address even after 1741 an ISP uplink has come back up. For example, after the uplink from 1742 SERb1 to ISP-B had failed, H31 received an ICMPv6 Code 5 message (as 1743 described in Section 6.3.3) and presumably started using 1744 (S=2001:db8:0:a010::31, D=2001:db8:0:5678::501) to reach H501. Now 1745 when the SERb1 uplink comes back up, the packets with that (S,D) pair 1746 are still routed to SERa and sent to the Internet.
Therefore H31 is 1747 not informed that it should stop using 2001:db8:0:a010::31 and start 1748 using 2001:db8:0:b010::31 again. Unless SERa has a policy configured 1749 to drop packets (S=2001:db8:0:a010::31, D=2001:db8:0:5678::501) and 1750 send ICMPv6 back if the SERb1 uplink to ISP-B is up, H31 will be unaware 1751 of the network topology change and keep using S=2001:db8:0:a010::31 1752 for Internet destinations, including H501. 1754 One possible option is to use a scoped route with the EXCLUSIVE 1755 flag as described in Section 6.2.3. SERa uplink recovery would 1756 cause the (S=2001:db8:0:a000::/52, D=2001:db8:0:1234::/64) route to 1757 reappear in the routing table. In the absence of that route, packets 1758 to H101 were sent to ISP-B (as the ISP-A uplink was down) with 1759 source addresses from 2001:db8:0:b000::/52. When the route re- 1760 appears, SERb1 would reject those packets and send ICMPv6 back as 1761 discussed in Section 6.2.3. Practically it might lead to scalability 1762 issues which have been already discussed in Section 6.2.3 and 1763 Section 6.3.3. 1765 6.4.4. Summary Of Methods For Controlling Source Address Selection Upon 1766 Failed Uplink Recovery 1768 Once again DHCPv6 does not look like a reasonable choice to manipulate the 1769 source address selection process on a host in the case of network 1770 topology changes. Using Router Advertisements provides a flexible 1771 mechanism to dynamically react to network topology changes (if 1772 routers are able to use routing changes as a trigger for sending out 1773 RAs with specific parameters). ICMPv6 could be considered as a 1774 supporting mechanism to signal an incorrect source address back to hosts 1775 but should not be considered as the only mechanism to control the 1776 address selection in multihomed environments. 1778 6.5. Selecting Source Address When All Uplinks Failed 1780 One particularly tricky case is a scenario when all uplinks have 1781 failed.
In that case there is no valid source address to be used for 1782 any external destinations while it might be desirable to have intra- 1783 site connectivity. 1785 6.5.1. Controlling Source Address Selection With DHCPv6 1787 From the DHCPv6 perspective, failure of both uplinks should be treated as two 1788 independent failures and processed as described in Section 6.3.1. At 1789 this stage it is quite obvious that it would result in a quite 1790 complicated policy table which needs to be explicitly configured by 1791 administrators and therefore seems to be impractical. 1793 6.5.2. Controlling Source Address Selection With Router Advertisements 1795 As discussed in Section 6.3.2 an uplink failure causes the scoped 1796 default entry to disappear from the scoped forwarding table and 1797 triggers RAs with zero Router Lifetime. Complete disappearance of 1798 all scoped entries for a given source prefix would cause the prefix 1799 to be withdrawn from hosts by setting the Preferred Lifetime value to 1800 zero in the PIO. If all uplinks (SERa, SERb1 and SERb2) fail, hosts 1801 either lose their default routers and/or have no global IPv6 1802 addresses to use as a source. (Note that 'uplink failure' might mean 1803 'IPv6 connectivity failure with IPv4 still being reachable', in which 1804 case hosts might fall back to IPv4 if there is IPv4 connectivity to 1805 destinations). As a result, intra-site connectivity is broken. One 1806 possible way to solve this is to use ULAs. 1808 In this approach, all hosts have ULA addresses assigned in addition to GUAs, and these are used 1809 for intra-site communication even if there is no GUA assigned to a 1810 host. To avoid accidental leaking of packets with ULA sources, SADR- 1811 capable routers SHOULD have a scoped forwarding table for ULA sources 1812 for internal routes but MUST NOT have an entry for D=::/0 in that 1813 table.
In the absence of (S=ULA_Prefix; D=::/0), first-hop routers will send dedicated RAs from a unique link-local source LLA_ULA with a PIO from the ULA address space, an RIO for the ULA prefix, and Router Lifetime set to zero. This behaviour is consistent with the situation when SERb1 has lost the uplink to ISP-B (so there is no Internet connectivity from 2001:db8:0:b000::/52 sources) but those sources can be used to reach some specific destinations. In the case of ULAs, there is no Internet connectivity from ULA sources, but they can be used to reach other ULA destinations. Note that ULA usage could be particularly useful if all ISPs assign prefixes via DHCP-PD. In the absence of ULAs, upon failure of all uplinks hosts would lose all their GUAs upon prefix lifetime expiration, which again makes intra-site communication impossible.

It should be noted that Rule 5.5 (prefer a prefix advertised by the selected next-hop) takes precedence over Rule 6 (prefer matching label, which ensures that GUA source addresses are preferred over ULAs for GUA destinations). Therefore if ULAs are used, the network administrator needs to ensure that, while the site has Internet connectivity, hosts do not select a router which advertises ULA prefixes as their default router.

6.5.3. Controlling Source Address Selection With ICMPv6

If all uplinks fail, all SERs will drop outgoing IPv6 traffic and respond with an ICMPv6 error message. In a large network where many hosts are trying to reach Internet destinations, this means that SERs need to generate an ICMPv6 error for every packet they receive from hosts, which presents the same scalability issues discussed in Section 6.3.3.

6.5.4.
Summary Of Methods For Controlling Source Address Selection When All Uplinks Failed

Again, combining SADR with Router Advertisements seems to be the most flexible and scalable way to control the source address selection on hosts.

6.6. Summary Of Methods For Controlling Source Address Selection

To summarize the scenarios and options discussed above:

While DHCPv6 allows administrators to manipulate source address selection policy tables, this method has a number of significant disadvantages which eliminate DHCPv6 from the list of potential solutions:

1. It requires hosts to support DHCPv6 and its extension (RFC7078).

2. The DHCPv6 server needs to monitor network state and detect routing changes.

3. The use of policy tables requires manual configuration and might be extremely complicated, especially in the case of a distributed network where a large number of remote sites are served by centralized DHCPv6 servers.

4. Network topology/routing policy changes could trigger simultaneous re-configuration of a large number of hosts, which presents serious scalability issues.

The use of Router Advertisements to influence the source address selection on hosts seems to be the most reliable, flexible and scalable solution. It has the following benefits:

1. no new (non-standard) functionality needs to be implemented on hosts (except for [RFC4191] RIO support, which at the time of this writing is not widely implemented);

2. no changes in RA format;

3. routers can react to routing table changes by sending RAs, which minimizes the failover time in the case of network topology changes;

4. information required for source address selection is broadcast to all affected hosts in case of a topology change event, which improves the scalability of the solution (compared to DHCPv6 reconfiguration or ICMPv6 error messages).
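The router behaviour summarized above (an RA with zero Router Lifetime when a scoped default route disappears, and a PIO with zero Preferred Lifetime when every scoped entry for a source prefix disappears, as described in Section 6.5.2) can be sketched as follows. This is only an illustration under stated assumptions: the function name ra_parameters, the dictionary representation of the scoped forwarding table, the lifetime constants, the /64 destination prefixes and the third source prefix 2001:db8:0:c000::/52 are hypothetical and do not come from this document.

```python
import ipaddress

# Illustrative lifetimes in seconds (assumed values, not prescribed by the draft).
DEFAULT_ROUTER_LIFETIME = 1800
DEFAULT_PREFERRED_LIFETIME = 14400


def ra_parameters(scoped_table):
    """Derive per-source-prefix RA parameters from a scoped forwarding table.

    scoped_table maps a source prefix (str) to the list of destination
    prefixes (str) currently present in that source scope.
    """
    default = ipaddress.ip_network("::/0")
    params = {}
    for src_prefix, destinations in scoped_table.items():
        dests = [ipaddress.ip_network(d) for d in destinations]
        has_default = default in dests
        params[src_prefix] = {
            # No scoped default route: stop acting as a default router.
            "router_lifetime": DEFAULT_ROUTER_LIFETIME if has_default else 0,
            # No scoped entries at all: deprecate the prefix in the PIO.
            "preferred_lifetime": DEFAULT_PREFERRED_LIFETIME if dests else 0,
            # Remaining specific routes are candidates for RIOs.
            "rio": [str(d) for d in dests if d != default],
        }
    return params


if __name__ == "__main__":
    table = {
        "2001:db8:0:a000::/52": ["::/0", "2001:db8:0:5555::/64"],
        "2001:db8:0:b000::/52": ["2001:db8:0:6666::/64"],  # ISP-B uplink down
        "2001:db8:0:c000::/52": [],  # hypothetical scope with no routes left
    }
    for prefix, values in ra_parameters(table).items():
        print(prefix, values)
```

A router would re-run such a computation whenever its scoped forwarding table changes and send a new RA for each scope whose parameters differ from the ones last advertised.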
To fully benefit from the RA-based solution, first-hop routers need to implement SADR, belong to the SADR domain and be able to send dedicated RAs per scoped forwarding table as discussed above, reacting to network changes by sending new RAs. It should be noted that the proposed solution would work even if first-hop routers are not SADR-capable but are still able to send individual RAs for each ISP prefix and react to topology changes as discussed above (e.g. via configuration knobs).

The RA-based solution relies heavily on hosts correctly implementing the default address selection algorithm as defined in [RFC6724]. While the basic (and most common) multihoming scenario (two or more Internet uplinks, no 'walled gardens') would work for any host supporting the minimal implementation of [RFC6724], more complex use cases (such as "walled garden" and other scenarios when some ISP resources can only be reached from that ISP address space) require that hosts support Rule 5.5 of the default address selection algorithm. There is some evidence that not all host OSes have that rule implemented currently. However it should be noted that [RFC8028] states that Rule 5.5 should be implemented.

The ICMPv6 Code 5 error message SHOULD be used to complement the RA-based solution to signal incorrect source address selection back to hosts, but it SHOULD NOT be considered as a stand-alone solution. To prevent scenarios when hosts in multihomed environments incorrectly identify onlink/offlink destinations, hosts SHOULD treat ICMPv6 Redirects as discussed in [RFC8028].

6.7. Solution Limitations

6.7.1. Connections Preservation

The proposed solution is not designed to preserve connection state in case of an uplink failure.
When all uplinks to an ISP go down, all transport connections established to/from that ISP address space will be interrupted (unless the transport protocol has specific multihoming support). That behaviour is similar to the scenario of IPv4 multihoming with NAT when an uplink failure causes all connections to be NATed to completely different public IPv4 addresses. While it does sound suboptimal, it is determined by the nature of PA address space: if all uplinks to the particular ISP have failed, there is no path for the ingress traffic to reach the site and the egress traffic is supposed to be dropped by the BCP38 [RFC2827] ingress filters. The only potential way to overcome this limitation would be running BGP with all ISPs and advertising all site prefixes to all uplinks, a solution which shares all drawbacks of using PI address space without having its benefits. Networks willing and capable of running BGP and using PI are out of scope of this document.

It should be noted that in case of IPv4 NAT-based multihoming uplink recovery could cause connection interruptions as well (unless packet forwarding is integrated with existing NAT session tracking so the egress interface for the existing sessions is not changed). However the proposed solution has the benefit of preserving the existing sessions during/after the failed uplink restoration. Unlike the uplink failure event, which causes all addresses from the affected prefix to be deprecated, the recovery would just add new preferred addresses to a host without making any addresses unavailable. Therefore connections established to/from those addresses do not have to be interrupted.

While it's desirable for active connections to survive ISP failover events, for sites using PA address space such events affect the reachability of IP addresses assigned to hosts.
Unless the transport (or even higher-level protocols) is capable of surviving the host renumbering, the active connections will be broken. The proposed solution focuses on minimizing the impact of failover for new connections and for multipath-aware protocols.

6.8. Other Configuration Parameters

6.8.1. DNS Configuration

In a multihomed environment each ISP might provide its own list of DNS servers. For example, in the topology shown in Figure 3, ISP-A might provide recursive DNS server H51 2001:db8:0:5555::51, while ISP-B might provide H61 2001:db8:0:6666::61 as a recursive DNS server. [RFC8106] defines IPv6 Router Advertisement options to allow IPv6 routers to advertise a list of DNS recursive server addresses and a DNS Search List to IPv6 hosts. Using RDNSS together with 'scoped' RAs as described above would allow a first-hop router (R3 in Figure 3) to send DNS server addresses and search lists provided by each ISP (or the corporate DNS server addresses if the enterprise is running its own DNS servers; as discussed below, the DNS split-horizon problem is too hard to solve without running a local DNS server).

As discussed in Section 6.5.2, failure of all ISP uplinks would cause deprecation of all addresses assigned to a host from the address space of all ISPs. If any intra-site IPv6 connectivity is still desirable (most likely to be the case for any mid/large scale network), then ULAs should be used as discussed in Section 6.5.2. In such a scenario, the enterprise network should run its own recursive DNS server(s) and provide its ULA addresses to hosts via RDNSS in RAs sent for the ULA-scoped forwarding table as described in Section 6.5.2.
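The per-scope DNS information described above can be pictured as one RA description per scope, each carrying its own RDNSS list. The sketch below is an assumption-laden illustration rather than anything defined in this document: the names Scope and scoped_ra_options, the link-local addresses, the Router Lifetime value, and the ULA prefix and ULA DNS server address are all hypothetical. It assumes the ISP uplinks are up, and it adds one design choice of its own, rejecting a ULA scope that advertises a non-ULA DNS server, since such a server would be unreachable once all uplinks have failed.

```python
import ipaddress
from dataclasses import dataclass
from typing import List

ULA_SPACE = ipaddress.ip_network("fc00::/7")  # unique local range [RFC4193]


@dataclass(frozen=True)
class Scope:
    name: str          # label for the scope (hypothetical)
    lla: str           # link-local source address used for this scope's RAs
    prefix: str        # prefix carried in the PIO
    rdnss: List[str]   # recursive DNS servers advertised with this scope


def scoped_ra_options(scopes):
    """Return one RA description per scope, each with its own RDNSS list."""
    ras = []
    for scope in scopes:
        is_ula = ipaddress.ip_network(scope.prefix).subnet_of(ULA_SPACE)
        servers = [ipaddress.ip_address(s) for s in scope.rdnss]
        if is_ula and not all(s in ULA_SPACE for s in servers):
            raise ValueError(f"{scope.name}: ULA scope must advertise ULA DNS servers")
        ras.append({
            "scope": scope.name,
            "source": scope.lla,
            "pio": scope.prefix,
            # ULA-scoped RAs never offer a default route (Section 6.5.2);
            # 1800 assumes the ISP uplink for that scope is currently up.
            "router_lifetime": 0 if is_ula else 1800,
            "rdnss": [str(s) for s in servers],
        })
    return ras


if __name__ == "__main__":
    scopes = [
        Scope("ISP-A", "fe80::a", "2001:db8:0:a010::/64", ["2001:db8:0:5555::51"]),
        Scope("ISP-B", "fe80::b", "2001:db8:0:b010::/64", ["2001:db8:0:6666::61"]),
        Scope("ULA", "fe80::f", "fd00:1234:5678:10::/64", ["fd00:1234:5678::53"]),
    ]
    for ra in scoped_ra_options(scopes):
        print(ra)
```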
There are some scenarios when the final outcome of the name resolution might be different depending on:

o which DNS server is used;

o which source address the client uses to send a DNS query to the server (DNS split horizon).

There is currently no way to instruct a host to use a particular DNS server out of the configured servers list for resolving a particular name. Therefore it does not seem feasible to solve the problem of DNS server selection on the host (it should be noted that this particular issue is protocol-agnostic and happens for IPv4 as well). In such a scenario it is recommended that the enterprise runs its own local recursive DNS server.

To influence host source address selection for packets sent to a particular DNS server the following requirements must be met:

o the host supports RIO as defined in [RFC4191];

o the routers send RIO for routes to DNS server addresses.

For example, if it is desirable that host H31 reaches the ISP-A DNS server H51 2001:db8:0:5555::51 using its source address 2001:db8:0:a010::31, then both R1 and R2 should send the RIO containing the route to 2001:db8:0:5555::51 (or a covering route) in their 'scoped' RAs, containing LLA_A as the default router address and the PIO for the SLAAC prefix 2001:db8:0:a010::/64. In that case the host H31 (if it supports Rule 5.5) would select LLA_A as a next-hop and then choose 2001:db8:0:a010::31 as the source address for packets to the DNS server.

It should be noted that [RFC6106] explicitly prohibits using DNS information if the RA router Lifetime expired: "An RDNSS address or a DNSSL domain name MUST be used only as long as both the RA router Lifetime (advertised by a Router Advertisement message) and the corresponding option Lifetime have not expired.".
Therefore hosts might ignore RDNSS information provided in ULA-scoped RAs as those RAs would have router lifetime set to 0. However the updated version of RFC6106 ([RFC8106]) has that requirement removed.

As discussed above, the DNS split-horizon problem and selecting the correct DNS server in a multihomed environment is not an easy one to solve. The proper solution would require hosts to support the concept of multiple Provisioning Domains (PvD, a set of configuration information associated with a network, [RFC7556]).

7. Deployment Considerations

The solution described in this document requires certain mechanisms to be supported by the network infrastructure and hosts. It requires some routers in the enterprise site to support some form of Source Address Dependent Routing (SADR). It also requires hosts to be able to learn when the uplink to an ISP changes its state so the corresponding source addresses should (or should not) be used. Ongoing work to create mechanisms to accomplish this is discussed in this document, but those mechanisms are still a work in progress.

7.1. Deploying SADR Domain

The proposed solution does not prescribe particular details regarding deploying an SADR domain within a multihomed enterprise network. However the following guidelines could be applied:

o The SADR domain is usually limited by the multihomed site border.

o The minimal deployable scenario requires enabling SADR on all SERs and including them into a single SADR domain.

o As discussed in Section 4.2, extending the connected SADR domain beyond that point down to the first-hop routers can produce more efficient forwarding paths and allow the network to fully benefit from SADR. It would also simplify the operation of the SADR domain.

o During the incremental SADR domain expansion from the SERs down towards first-hop routers it's important to ensure that at any moment all SADR-capable routers within the domain are logically connected (see Section 5).

7.2. Hosts-Related Considerations

The solution discussed in this document relies on the default address selection algorithm ([RFC6724]) Rule 5.5. While [RFC6724] considers this rule as optional, the recent [RFC8028] states that "A host SHOULD select default routers for each prefix it is assigned an address in". It also recommends that hosts should implement Rule 5.5 of [RFC6724]. Therefore while RFC8028-compliant hosts already have a mechanism to learn about ISP uplink state changes and select the source addresses accordingly, many hosts do not support such a mechanism yet.

It should be noted that a multihomed enterprise network utilizing multiple ISP prefixes can be considered a typical multiple provisioning domain (mPVD) scenario, as described in [RFC7556]. This document defines a way for the network to provide the PVD information to hosts indirectly, using the existing mechanisms. At the same time [I-D.ietf-intarea-provisioning-domains] goes one step further and describes a comprehensive mechanism for hosts to discover the whole set of configuration information associated with different PVD/ISPs. [I-D.ietf-intarea-provisioning-domains] complements this document in terms of enabling hosts to learn about ISP uplink states and select the corresponding source addresses.

8. Other Solutions

8.1. Shim6

The Shim6 working group specified the Shim6 protocol [RFC5533] which allows a host at a multihomed site to communicate with an external host and exchange information about possible source and destination address pairs that they can use to communicate.
It also specified the REAP protocol [RFC5534] to detect failures in the path between working address pairs and find new working address pairs. A fundamental requirement for Shim6 is that both internal and external hosts need to support Shim6. That is, both the host internal to the multihomed site and the host external to the multihomed site need to support Shim6 in order for there to be any benefit for the internal host to run Shim6. The Shim6 protocol specification was published in 2009, but it has not been widely implemented. Therefore Shim6 is not considered as a viable solution for enterprise multihoming.

8.2. IPv6-to-IPv6 Network Prefix Translation

IPv6-to-IPv6 Network Prefix Translation (NPTv6) [RFC6296] is not the focus of this document. NPTv6 suffers from the same fundamental issue as any other address translation approach: it breaks end-to-end connectivity. Therefore NPTv6 is not considered a desirable solution, and this document intentionally focuses on solving the enterprise multihoming problem without any form of address translation.

With increasing interest and ongoing work in bringing path awareness to transport and application layer protocols, hosts might be able to determine the properties of the various network paths and choose among paths available to them. As selecting the correct source address is one of the possible mechanisms path-aware hosts may utilize, address translation negatively affects hosts' path awareness, which makes NPTv6 an even more undesirable solution.

8.3.
Multipath Transport

Using multipath transport (such as MPTCP [RFC6824] or multipath capabilities in QUIC) might solve the problems discussed in Section 6 since it would allow hosts to use multiple source addresses for a single connection and switch between source addresses when a particular address becomes unavailable or a new address gets assigned to the host interface. Therefore if all hosts in the enterprise network are only using multipath transport for all connections, the signaling solution described in Section 6 might not be needed (it should be noted that Source Address Dependent Routing would still be required to deliver packets to the correct uplinks). At the time this document was written, multipath transport alone could not be considered a solution for the problem of selecting the source address in a multihomed environment. There is a significant number of hosts which do not use multipath transport currently and it seems unlikely that the situation is going to change in any foreseeable future (even if new releases of operating systems get multipath protocol support there will be a long tail of legacy hosts). The solution for enterprise multihoming needs to work for the least common denominator: hosts without multipath transport support. In addition, not all protocols are using multipath transport. While multipath transport would complement the solution described in Section 6, it could not be considered as a sole solution to the problem of source address selection in multihomed environments.

On the other hand, PA-based multihoming could provide additional benefits for multipath protocols, should those protocols be deployed in the network. Multipath protocols could leverage source address selection to achieve maximum path diversity (and potentially improved performance).

Therefore deploying multipath protocols could not be considered as an alternative to the approach proposed in this document. Instead both solutions complement each other, so deploying multipath protocols in a PA-based multihomed network proves mutually beneficial.

9. IANA Considerations

This memo asks the IANA for no new parameters.

10. Security Considerations

Section 6.2.3 discusses a mechanism for controlling source address selection on hosts using ICMPv6 messages. Using ICMPv6 to influence source address selection allows an attacker to exhaust the list of candidate source addresses on the host by sending spoofed ICMPv6 Code 5 for all prefixes known on the network (therefore preventing a victim from establishing communication with the destination host). Another possible attack vector is using ICMPv6 Destination Unreachable Messages with Code 5 to steer the egress traffic towards a particular ISP (for example if the attacker has the ability to do traffic sniffing or a man-in-the-middle attack in that ISP network).

To prevent those attacks hosts SHOULD verify that the original packet header included in the ICMPv6 error message was actually sent by the host (to ensure that the ICMPv6 message was triggered by a packet sent by the host).

As ICMPv6 Destination Unreachable Messages with Code 5 could be originated by any SADR-capable router within the domain (or even come from the Internet), GTSM ([RFC5082]) cannot be applied. Filtering such ICMPv6 messages at the site border cannot be recommended as it would break the legitimate end-to-end error signalling mechanism ICMPv6 is designed for.

The security considerations of using stateless address autoconfiguration are discussed in [RFC4862].

11. Acknowledgements

The original outline was suggested by Ole Troan.

The authors would like to thank the following people (in alphabetical order) for their review and feedback: Olivier Bonaventure, Deborah Brungard, Brian E Carpenter, Lorenzo Colitti, Roman Danyliw, Benjamin Kaduk, Suresh Krishnan, Mirja Kuhlewind, David Lamparter, Nicolai Leymann, Acee Lindem, Philip Matthews, Robert Raszuk, Alvaro Retana, Dave Thaler, Michael Tuxen, Martin Vigoureux, Eric Vyncke, Magnus Westerlund.

12. References

12.1. Normative References

[RFC1918] Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G., and E. Lear, "Address Allocation for Private Internets", BCP 5, RFC 1918, DOI 10.17487/RFC1918, February 1996.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.

[RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing", BCP 38, RFC 2827, DOI 10.17487/RFC2827, May 2000.

[RFC4191] Draves, R. and D. Thaler, "Default Router Preferences and More-Specific Routes", RFC 4191, DOI 10.17487/RFC4191, November 2005.

[RFC4193] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005.

[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, DOI 10.17487/RFC4291, February 2006.

[RFC4443] Conta, A., Deering, S., and M. Gupta, Ed., "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", STD 89, RFC 4443, DOI 10.17487/RFC4443, March 2006.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, DOI 10.17487/RFC4861, September 2007.

[RFC4862] Thomson, S., Narten, T., and T.
Jinmei, "IPv6 Stateless Address Autoconfiguration", RFC 4862, DOI 10.17487/RFC4862, September 2007.

[RFC6106] Jeong, J., Park, S., Beloeil, L., and S. Madanapalli, "IPv6 Router Advertisement Options for DNS Configuration", RFC 6106, DOI 10.17487/RFC6106, November 2010.

[RFC6296] Wasserman, M. and F. Baker, "IPv6-to-IPv6 Network Prefix Translation", RFC 6296, DOI 10.17487/RFC6296, June 2011.

[RFC6724] Thaler, D., Ed., Draves, R., Matsumoto, A., and T. Chown, "Default Address Selection for Internet Protocol Version 6 (IPv6)", RFC 6724, DOI 10.17487/RFC6724, September 2012.

[RFC7078] Matsumoto, A., Fujisaki, T., and T. Chown, "Distributing Address Selection Policy Using DHCPv6", RFC 7078, DOI 10.17487/RFC7078, January 2014.

[RFC7556] Anipko, D., Ed., "Multiple Provisioning Domain Architecture", RFC 7556, DOI 10.17487/RFC7556, June 2015.

[RFC8028] Baker, F. and B. Carpenter, "First-Hop Router Selection by Hosts in a Multi-Prefix Network", RFC 8028, DOI 10.17487/RFC8028, November 2016.

[RFC8106] Jeong, J., Park, S., Beloeil, L., and S. Madanapalli, "IPv6 Router Advertisement Options for DNS Configuration", RFC 8106, DOI 10.17487/RFC8106, March 2017.

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

[RFC8415] Mrugalski, T., Siodelski, M., Volz, B., Yourtchenko, A., Richardson, M., Jiang, S., Lemon, T., and T. Winters, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)", RFC 8415, DOI 10.17487/RFC8415, November 2018.

12.2. Informative References

[I-D.ietf-intarea-provisioning-domains] Pfister, P., Vyncke, E., Pauly, T., Schinazi, D., and W.
Shao, "Discovering Provisioning Domain Names and Data", draft-ietf-intarea-provisioning-domains-05 (work in progress), June 2019.

[I-D.ietf-rtgwg-dst-src-routing] Lamparter, D. and A. Smirnov, "Destination/Source Routing", draft-ietf-rtgwg-dst-src-routing-07 (work in progress), March 2019.

[I-D.pfister-6man-sadr-ra] Pfister, P., "Source Address Dependent Route Information Option for Router Advertisements", draft-pfister-6man-sadr-ra-01 (work in progress), June 2015.

[RFC3704] Baker, F. and P. Savola, "Ingress Filtering for Multihomed Networks", BCP 84, RFC 3704, DOI 10.17487/RFC3704, March 2004.

[RFC4941] Narten, T., Draves, R., and S. Krishnan, "Privacy Extensions for Stateless Address Autoconfiguration in IPv6", RFC 4941, DOI 10.17487/RFC4941, September 2007.

[RFC5082] Gill, V., Heasley, J., Meyer, D., Savola, P., Ed., and C. Pignataro, "The Generalized TTL Security Mechanism (GTSM)", RFC 5082, DOI 10.17487/RFC5082, October 2007.

[RFC5533] Nordmark, E. and M. Bagnulo, "Shim6: Level 3 Multihoming Shim Protocol for IPv6", RFC 5533, DOI 10.17487/RFC5533, June 2009.

[RFC5534] Arkko, J. and I. van Beijnum, "Failure Detection and Locator Pair Exploration Protocol for IPv6 Multihoming", RFC 5534, DOI 10.17487/RFC5534, June 2009.

[RFC6434] Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node Requirements", RFC 6434, DOI 10.17487/RFC6434, December 2011.

[RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, "TCP Extensions for Multipath Operation with Multiple Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013.

[RFC7676] Pignataro, C., Bonica, R., and S. Krishnan, "IPv6 Support for Generic Routing Encapsulation (GRE)", RFC 7676, DOI 10.17487/RFC7676, October 2015.

Authors' Addresses

Fred Baker
Santa Barbara, California 93117
USA

Email: FredBaker.IETF@gmail.com

Chris Bowers
Juniper Networks
Sunnyvale, California 94089
USA

Email: cbowers@juniper.net

Jen Linkova
Google
1 Darling Island Rd
Pyrmont, NSW 2009
AU

Email: furry@google.com