idnits 2.17.1

draft-ietf-rtgwg-enterprise-pa-multihoming-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 1, 2019) is 1760 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 6106 (Obsoleted by RFC 8106)

  == Outdated reference: A later version (-11) exists of
     draft-ietf-intarea-provisioning-domains-05

  -- Obsolete informational reference (is this intentional?): RFC 4941
     (Obsoleted by RFC 8981)

  -- Obsolete informational reference (is this intentional?): RFC 6434
     (Obsoleted by RFC 8504)

  -- Obsolete informational reference (is this intentional?): RFC 6824
     (Obsoleted by RFC 8684)

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Routing Working Group                                           F. Baker
Internet-Draft
Intended status: Informational                                 C. Bowers
Expires: January 2, 2020                                Juniper Networks
                                                                      J.
                                                                 Linkova
                                                                  Google
                                                            July 1, 2019

 Enterprise Multihoming using Provider-Assigned IPv6 Addresses without
        Network Prefix Translation: Requirements and Solutions
             draft-ietf-rtgwg-enterprise-pa-multihoming-09

Abstract

   Connecting an enterprise site to multiple ISPs over IPv6 using
   provider-assigned addresses is difficult without the use of some form
   of Network Address Translation (NAT). Much has been written on this
   topic over the last 10 to 15 years, but it still remains a problem
   without a clearly defined or widely implemented solution. Any
   multihoming solution without NAT requires hosts at the site to have
   addresses from each ISP and to select the egress ISP by selecting a
   source address for outgoing packets. It also requires routers at the
   site to take into account those source addresses when forwarding
   packets out towards the ISPs.

   This document examines currently available mechanisms for providing a
   solution to this problem for a broad range of enterprise topologies.
   It covers the behavior of routers to forward traffic taking into
   account source address, and it covers the behavior of hosts to select
   appropriate source addresses. It also covers any possible role that
   routers might play in providing information to hosts to help them
   select appropriate source addresses. In the process of exploring
   potential solutions, this document also makes explicit requirements
   for how the solution would be expected to behave from the perspective
   of an enterprise site network administrator.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 2, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements Language
   3.  Terminology
   4.  Enterprise Multihoming Use Cases
     4.1.  Simple ISP Connectivity with Connected SERs
     4.2.  Simple ISP Connectivity Where SERs Are Not Directly
           Connected
     4.3.  Enterprise Network Operator Expectations
     4.4.  More complex ISP connectivity
     4.5.  ISPs and Provider-Assigned Prefixes
     4.6.  Simplified Topologies
   5.  Generating Source-Prefix-Scoped Forwarding Tables
   6.  Mechanisms For Hosts To Choose Good Source Addresses In A
       Multihomed Site
     6.1.
           Source Address Selection Algorithm on Hosts
     6.2.  Selecting Source Address When Both Uplinks Are Working
       6.2.1.  Distributing Address Selection Policy Table with
               DHCPv6
       6.2.2.  Controlling Source Address Selection With Router
               Advertisements
       6.2.3.  Controlling Source Address Selection With ICMPv6
       6.2.4.  Summary of Methods For Controlling Source Address
               Selection To Implement Routing Policy
     6.3.  Selecting Source Address When One Uplink Has Failed
       6.3.1.  Controlling Source Address Selection With DHCPv6
       6.3.2.  Controlling Source Address Selection With Router
               Advertisements
       6.3.3.  Controlling Source Address Selection With ICMPv6
       6.3.4.  Summary Of Methods For Controlling Source Address
               Selection On The Failure Of An Uplink
     6.4.  Selecting Source Address Upon Failed Uplink Recovery
       6.4.1.  Controlling Source Address Selection With DHCPv6
       6.4.2.  Controlling Source Address Selection With Router
               Advertisements
       6.4.3.  Controlling Source Address Selection With ICMP
       6.4.4.  Summary Of Methods For Controlling Source Address
               Selection Upon Failed Uplink Recovery
     6.5.  Selecting Source Address When All Uplinks Failed
       6.5.1.  Controlling Source Address Selection With DHCPv6
       6.5.2.  Controlling Source Address Selection With Router
               Advertisements
       6.5.3.  Controlling Source Address Selection With ICMPv6
       6.5.4.  Summary Of Methods For Controlling Source Address
               Selection When All Uplinks Failed
     6.6.  Summary Of Methods For Controlling Source Address
           Selection
     6.7.  Solution Limitations
       6.7.1.  Connections Preservation
     6.8.  Other Configuration Parameters
       6.8.1.  DNS Configuration
   7.  Deployment Considerations
     7.1.  Deploying SADR Domain
     7.2.  Hosts-Related Considerations
   8.  Other Solutions
     8.1.  Shim6
     8.2.  IPv6-to-IPv6 Network Prefix Translation
     8.3.  Multipath Transport
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   Site multihoming, the connection of a subscriber network to multiple
   upstream networks using redundant uplinks, is a common enterprise
   architecture for improving the reliability of its Internet
   connectivity. If the site uses provider-independent (PI) addresses,
   all traffic originating from the enterprise can use source addresses
   from the PI address space. Site multihoming with PI addresses is
   commonly used with both IPv4 and IPv6, and does not present any new
   technical challenges.

   It may be desirable for an enterprise site to connect to multiple
   ISPs using provider-assigned (PA) addresses, instead of PI addresses.
   Multihoming with provider-assigned addresses is typically less
   expensive for the enterprise than using provider-independent
   addresses, as it requires neither obtaining and maintaining PI
   address space nor running BGP between the enterprise and the ISPs
   (for small and medium networks, running BGP might be not only
   undesirable but impossible, especially if residential-type ISP
   connections are used). PA multihoming is also a practice that should
   be facilitated and encouraged because it does not add to the size of
   the Internet routing table, whereas PI multihoming does. Note that
   PA is also used to mean "provider-aggregatable". In this document we
   assume that provider-assigned addresses are always
   provider-aggregatable.

   With PA multihoming, for each ISP connection, the site is assigned a
   prefix from within an address block allocated to that ISP by its
   National or Regional Internet Registry. In the simple case of two
   ISPs (ISP-A and ISP-B), the site will have two different prefixes
   assigned to it (prefix-A and prefix-B). This arrangement is
   problematic. First, packets with the "wrong" source address may be
   dropped by one of the ISPs. In order to limit denial of service
   attacks using spoofed source addresses, BCP 38 [RFC2827] recommends
   that ISPs filter traffic from customer sites to only allow traffic
   with a source address that has been assigned by that ISP. So a
   packet sent from a multihomed site on the uplink to ISP-B with a
   source address in prefix-A may be dropped by ISP-B.

   However, even if ISP-B does not implement BCP 38 or ISP-B adds
   prefix-A to its list of allowed source addresses on the uplink from
   the multihomed site, two-way communication may still fail.
   If the packet with source address in prefix-A was sent to ISP-B
   because the uplink to ISP-A failed, then if ISP-B does not drop the
   packet and the packet reaches its destination somewhere on the
   Internet, the return packet will be sent back with a destination
   address in prefix-A. The return packet will be routed over the
   Internet to ISP-A, but it will not be delivered to the multihomed
   site because the site uplink with ISP-A has failed. Two-way
   communication would require some arrangement for ISP-B to advertise
   prefix-A when the uplink to ISP-A fails.

   Note that the same may be true with a provider that does not
   implement BCP 38, if its upstream provider does, or has no
   corresponding route to deliver the ingress traffic to the multihomed
   site. The issue is not that the immediate provider implements
   ingress filtering; it is that someone upstream does (so egress
   traffic is blocked), or lacks a route (causing blackholing of the
   ingress traffic).

   Another issue with asymmetric traffic flow (when the egress traffic
   leaves the site via one ISP but the return traffic enters the site
   via another uplink) is related to stateful firewalls/middleboxes.
   Keeping state in that case might be problematic, even impossible.

   With IPv4, this problem is commonly solved by using [RFC1918] private
   address space within the multihomed site and Network Address
   Translation (NAT) or Network Address/Port Translation (NAPT) on the
   uplinks to the ISPs. However, one of the goals of IPv6 is to
   eliminate the need for and the use of NAT or NAPT. Therefore,
   requiring the use of NAT or NAPT for an enterprise site to multihome
   with provider-assigned addresses is not an attractive solution.

   [RFC6296] describes a translation solution specifically tailored to
   meet the requirements of multihoming with provider-assigned IPv6
   addresses.
   With the IPv6-to-IPv6 Network Prefix Translation (NPTv6)
   solution, within the site an enterprise can use Unique Local
   Addresses [RFC4193] or the prefix assigned by one of the ISPs. As
   traffic leaves the site on an uplink to an ISP, the source address
   gets translated to an address within the prefix assigned by the ISP
   on that uplink in a predictable and reversible manner. [RFC6296] is
   currently classified as Experimental, and it has been implemented by
   several vendors. See Section 8.2 for more discussion of NPTv6.

   This document defines routing requirements for enterprise multihoming
   and focuses on the following general class of solutions.

   Each host at the enterprise has multiple addresses, at least one from
   each ISP-assigned prefix. Each host, as discussed in Section 6.1 and
   [RFC6724], is responsible for choosing the source address applied to
   each packet it sends. A host is expected to be able to respond
   dynamically to the failure of an uplink to a given ISP by no longer
   sending packets with the source address corresponding to that ISP.
   Potential mechanisms for the communication of changes in the network
   to the host are Neighbor Discovery Router Advertisements ([RFC4861]),
   DHCPv6 ([RFC8415]), and ICMPv6 ([RFC4443]).

   The routers in the enterprise network are responsible for ensuring
   that packets are delivered to the "correct" ISP uplink based on
   source address. This requires that at least some routers in the site
   network are able to take into account the source address of a packet
   when deciding how to route it. That is, some routers must be capable
   of some form of Source Address Dependent Routing (SADR), if only as
   described in Section 4.3 of [RFC3704]. At a minimum, the routers
   connected to the ISP uplinks (the site edge routers or SERs) must be
   capable of Source Address Dependent Routing.
   Expanding the connected domain of routers capable of SADR from the
   site edge routers deeper into the site network will generally result
   in more efficient routing of traffic with external destinations.

   This document is organized as follows. Section 4 looks in more
   detail at the enterprise networking environments in which this
   solution is expected to operate. The discussion of Section 4 uses
   the concepts of source-prefix-scoped routing advertisements and
   forwarding tables without describing in detail how source-prefix-
   scoped routing advertisements are used to generate source-prefix-
   scoped forwarding tables. Instead, this detailed description is
   provided in Section 5. Section 6 discusses existing and proposed
   mechanisms for hosts to select the source address applied to packets.
   It also discusses the requirements for routing that are needed to
   support these enterprise network scenarios and the mechanisms by
   which hosts are expected to select source addresses dynamically based
   on network state. Section 7 discusses deployment considerations,
   while Section 8 discusses other solutions.

2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Terminology

   PA (provider-assigned or provider-aggregatable) address space: a
   block of IP addresses assigned by a Regional Internet Registry (RIR)
   to a Local Internet Registry (LIR), used to create allocations to end
   sites. Can be aggregated and present in the routing table as one
   route.

   PI (provider-independent) address space: a block of IP addresses
   assigned by a Regional Internet Registry (RIR) directly to an end
   site/end customer.

   ISP: Internet Service Provider.

   LIR (Local Internet Registry): an organization (usually an ISP or an
   enterprise or academic institution) that receives an IP address
   allocation from its Regional Internet Registry and then assigns
   parts of that allocation to its customers.

   RIR (Regional Internet Registry): an organization which manages the
   Internet number resources (such as IP addresses and AS numbers)
   within a geographical region of the world.

   SADR (Source Address Dependent Routing): Routing which takes into
   account the source address of a packet in addition to the packet
   destination address.

   SADR domain: a routing domain where some (or all) routers exchange
   source-dependent routing information.

   Source-Prefix-Scoped Routing/Forwarding Table: a routing (or
   forwarding) table which contains routing (or forwarding) information
   which is applicable only to packets with source addresses from the
   specific prefix.

   Unscoped Routing/Forwarding Table: a routing (or forwarding) table
   which can be used to route/forward packets with any source address.

   SER (Site Edge Router): a router which connects the site to an ISP
   (terminates an ISP uplink).

   LLA (Link-Local Address): an IPv6 unicast address from the fe80::/10
   prefix ([RFC4291]).

   ULA (Unique Local IPv6 Unicast Address): IPv6 unicast addresses from
   the fc00::/7 prefix. They are globally unique and intended for local
   communications ([RFC4193]).

   GUA (Global Unicast Address): globally routable IPv6 addresses of the
   global scope ([RFC4291]).

   SLAAC (IPv6 Stateless Address Autoconfiguration): a stateless process
   of configuring the network stack on IPv6 hosts ([RFC4862]).

   RA (Router Advertisement): a message sent by an IPv6 router to
   advertise its presence to hosts together with various network-related
   parameters required for hosts to perform SLAAC ([RFC4861]).

   PIO (Prefix Information Option): a part of an RA message containing
   information about IPv6 prefixes which could be used by hosts to
   generate global IPv6 addresses ([RFC4862]).

   RIO (Route Information Option): a part of an RA message containing
   information about more specific IPv6 prefixes reachable via the
   advertising router ([RFC4191]).

4.  Enterprise Multihoming Use Cases

4.1.  Simple ISP Connectivity with Connected SERs

   We start by looking at a scenario in which a site has connections to
   two ISPs, as shown in Figure 1. The site is assigned the prefix
   2001:db8:0:a000::/52 by ISP-A and prefix 2001:db8:0:b000::/52 by
   ISP-B. We consider three hosts in the site. H31 and H32 are on a LAN
   that has been assigned subnets 2001:db8:0:a010::/64 and
   2001:db8:0:b010::/64. H31 has been assigned the addresses
   2001:db8:0:a010::31 and 2001:db8:0:b010::31. H32 has been assigned
   2001:db8:0:a010::32 and 2001:db8:0:b010::32. H41 is on a different
   subnet that has been assigned 2001:db8:0:a020::/64 and
   2001:db8:0:b020::/64.

                              2001:db8:0:1234::101    H101
                                                       |
                                                       |
   2001:db8:0:a010::31                              --------
   2001:db8:0:b010::31                 ,-----.     /        \
            +--+   +--+       +----+ ,'       `.  :          :
        +---|R1|---|R4|---+---|SERa|-+  ISP-A  +--+--        :
   H31--+   +--+   +--+   |   +----+ `.       ,'  :          :
        |                 |            `-----'    : Internet :
        |                 |                       :          :
        |                 |                       :          :
        |                 |                       :          :
        |                 |            ,-----.    :          :
   H32--+   +--+          |   +----+ ,'       `.  :          :
        +---|R2|----------+---|SERb|-+  ISP-B  +--+--        :
            +--+          |   +----+ `.       ,'  :          :
                          |            `-----'    :          :
                          |                       :          :
            +--+  +--+  +--+                       \        /
   H41------|R3|--|R5|--|R6|                        --------
            +--+  +--+  +--+

   2001:db8:0:a020::41
   2001:db8:0:b020::41

          Figure 1: Simple ISP Connectivity With Connected SERs

   We refer to a router that connects the site to an ISP as a site edge
   router (SER). Several other routers provide connectivity among the
   internal hosts (H31, H32, and H41), as well as connecting the
   internal hosts to the Internet through SERa and SERb.
   In this example SERa and SERb share a direct connection to each
   other. In Section 4.2, we consider a scenario where this is not the
   case.

   For the moment, we assume that the hosts are able to make good
   choices about which source addresses to use through some mechanism
   that doesn't involve the routers in the site network. Here, we focus
   on the primary task of the routed site network, which is to get
   packets efficiently to their destinations, while sending a packet to
   the ISP that assigned the prefix that matches the source address of
   the packet. In Section 6, we examine what role the routed network
   may play in helping hosts make good choices about source addresses
   for packets.

   With this solution, routers will need some form of Source Address
   Dependent Routing, which will be new functionality. It would be
   useful if an enterprise site did not need to upgrade all routers to
   support the new SADR functionality in order to support PA
   multihoming. We consider whether this is possible and what the
   tradeoffs are of not having all routers in the site support SADR
   functionality.

   In the topology in Figure 1, it is possible to support PA multihoming
   with only SERa and SERb being capable of SADR. The other routers can
   continue to forward based only on destination address, and exchange
   routes that only consider destination address. In this scenario,
   SERa and SERb communicate source-scoped routing information across
   their shared connection. When SERa receives a packet with a source
   address matching prefix 2001:db8:0:b000::/52, it forwards the packet
   to SERb, which forwards it on the uplink to ISP-B. The analogous
   behaviour holds for traffic that SERb receives with a source address
   matching prefix 2001:db8:0:a000::/52.

   In Figure 1, when only SERa and SERb are capable of source address
   dependent routing, PA multihoming will work.
   However, the paths over which the packets are sent will generally
   not be the shortest paths. The forwarding paths will generally be
   more efficient as more routers are capable of SADR. For example, if
   R4, R2, and R6 are upgraded to support SADR, then they can exchange
   source-scoped routes with SERa and SERb. They will then know to send
   traffic with a source address matching prefix 2001:db8:0:b000::/52
   directly to SERb, without sending it to SERa first.

4.2.  Simple ISP Connectivity Where SERs Are Not Directly Connected

   In Figure 2, we modify the topology slightly by inserting R7, so that
   SERa and SERb are no longer directly connected. With this topology,
   it is not enough to just enable SADR on SERa and SERb to support PA
   multihoming. There are two solutions to enable PA multihoming in
   this topology.

                              2001:db8:0:1234::101    H101
                                                       |
                                                       |
   2001:db8:0:a010::31                              --------
   2001:db8:0:b010::31                 ,-----.     /        \
            +--+   +--+       +----+ ,'       `.  :          :
        +---|R1|---|R4|---+---|SERa|-+  ISP-A  +--+--        :
   H31--+   +--+   +--+   |   +----+ `.       ,'  :          :
        |                 |            `-----'    : Internet :
        |                +--+                     :          :
        |                |R7|                     :          :
        |                +--+                     :          :
        |                 |            ,-----.    :          :
   H32--+   +--+          |   +----+ ,'       `.  :          :
        +---|R2|----------+---|SERb|-+  ISP-B  +--+--        :
            +--+          |   +----+ `.       ,'  :          :
                          |            `-----'    :          :
                          |                       :          :
            +--+  +--+  +--+                       \        /
   H41------|R3|--|R5|--|R6|                        --------
            +--+  +--+  +--+                           |
                                                       |
   2001:db8:0:a020::41        2001:db8:0:5678::501    H501
   2001:db8:0:b020::41

      Figure 2: Simple ISP Connectivity Where SERs Are Not Directly
                                Connected

   One option is to effectively modify the topology by creating a
   logical tunnel between SERa and SERb, using GRE ([RFC7676]) for
   example. Although SERa and SERb are not directly connected
   physically in this topology, they can be directly connected logically
   by a tunnel.

   The other option is to enable SADR functionality on R7.
   In this way, R7 will exchange source-scoped routes with SERa and
   SERb, making the three routers act as a single SADR domain. This
   illustrates the basic principle that the minimum requirement for the
   routed site network to support PA multihoming is having all of the
   site edge routers be part of a connected SADR domain. Extending the
   connected SADR domain beyond that point can produce more efficient
   forwarding paths.

4.3.  Enterprise Network Operator Expectations

   Before considering a more complex scenario, let's look in more detail
   at the reasonably simple multihoming scenario in Figure 2 to
   understand what can reasonably be expected from this solution. As a
   general guiding principle, we assume an enterprise network operator
   will expect a multihomed network to behave as closely as possible to
   a single-homed network. A solution that meets those expectations
   wherever possible is therefore preferable.

   For traffic between internal hosts and traffic from outside the site
   to internal hosts, an enterprise network operator would expect there
   to be no visible change in the path taken by this traffic, since this
   traffic does not need to be routed in a way that depends on source
   address. It is also reasonable to expect that internal hosts should
   be able to communicate with each other using either of their source
   addresses without restriction. For example, H31 should be able to
   communicate with H41 using a packet with S=2001:db8:0:a010::31,
   D=2001:db8:0:b020::41, regardless of the state of the uplink to
   ISP-B.

   These goals can be accomplished by having all of the routers in the
   network continue to originate normal unscoped destination routes for
   their connected networks.
   If we can arrange things so that these unscoped destination routes
   get used for forwarding this traffic, then we will have accomplished
   the goal of keeping forwarding of traffic destined for internal hosts
   unaffected by the multihoming solution.

   For traffic destined for external hosts, it is reasonable to expect
   traffic with a source address from the prefix assigned by ISP-A to
   follow the path that it would follow if there were no connection to
   ISP-B. This can be accomplished by having SERa originate a
   source-scoped route of the form (S=2001:db8:0:a000::/52, D=::/0).
   If all of the routers in the site support SADR, then the path of
   traffic exiting via ISP-A can match that expectation. If some
   routers don't support SADR, then it is reasonable to expect that the
   path for traffic exiting via ISP-A may be different within the site.
   This is a tradeoff that the enterprise network operator may decide
   to make.

   It is important to understand how this multihoming solution behaves
   when an uplink to one of the ISPs fails. To simplify this
   discussion, we assume that all routers in the site support SADR. We
   start by looking at how the network operates when the uplinks to
   both ISP-A and ISP-B are functioning properly. SERa originates a
   source-scoped route of the form (S=2001:db8:0:a000::/52, D=::/0), and
   SERb originates a source-scoped route of the form
   (S=2001:db8:0:b000::/52, D=::/0). These routes are distributed
   through the routers in the site, and they establish within the
   routers two sets of forwarding paths for traffic leaving the site.
   One set of forwarding paths is for packets with source address in
   2001:db8:0:a000::/52. The other set of forwarding paths is for
   packets with source address in 2001:db8:0:b000::/52. The normal
   destination routes which are not scoped to these two source prefixes
   play no role in the forwarding.
   Whether a packet exits the site via SERa or via SERb is completely
   determined by the source address applied to the packet by the host.
   So for example, when host H31 sends a packet to host H101 with
   (S=2001:db8:0:a010::31, D=2001:db8:0:1234::101), the packet will
   only be sent out the link from SERa to ISP-A.

   Now consider what happens when the uplink from SERa to ISP-A fails.
   The only way for the packets from H31 to reach H101 is for H31 to
   start using the source address for ISP-B. H31 needs to send the
   following packet: (S=2001:db8:0:b010::31, D=2001:db8:0:1234::101).

   This behavior is very different from the behavior that occurs with
   site multihoming using PI addresses or with PA addresses using NAT.
   In these other multihoming solutions, hosts do not need to react to
   network failures several hops away in order to regain Internet
   access. Instead, a host can be largely unaware of the failure of an
   uplink to an ISP. When multihoming with PA addresses and NAT,
   existing sessions generally need to be re-established after a failure
   since the external host will receive packets from the internal host
   with a new source address. However, new sessions can be established
   without any action on the part of the hosts. Multihoming with PA
   addresses and NAT has created the expectation of a fairly quick and
   simple recovery from network failures. Alternatives should be
   evaluated in terms of the speed and complexity of the recovery
   mechanism.

   Another example where the behavior of this multihoming solution
   differs significantly from that of multihoming with PI addresses or
   with PA addresses using NAT is in the ability of the enterprise
   network operator to route traffic over different ISPs based on
   destination address. We still consider the fairly simple network of
   Figure 2 and assume that uplinks to both ISPs are functioning.
   Assume that the site is multihomed using PA addresses and NAT, and
   that SERa and SERb each originate a normal destination route for
   D=::/0, with the route origination dependent on the state of the
   uplink to the respective ISP.

   Now suppose it is observed that an important application running
   between internal hosts and external host H101 experiences much better
   performance when the traffic passes through ISP-A (perhaps because
   ISP-A provides lower latency to H101). When multihoming this site
   with PI addresses or with PA addresses and NAT, the enterprise
   network operator can configure SERa to originate into the site
   network a normal destination route for D=2001:db8:0:1234::/64 (the
   destination prefix to reach H101) that depends on the state of the
   uplink to ISP-A. When the link to ISP-A is functioning, the
   destination route D=2001:db8:0:1234::/64 will be originated by SERa,
   so traffic from all hosts will use ISP-A to reach H101 based on the
   longest destination prefix match in the route lookup.

   Implementing the same routing policy is more difficult with the PA
   multihoming solution described in this document since it doesn't use
   NAT. By design, the only way to control where a packet exits this
   network is by setting the source address of the packet. Since the
   network cannot modify the source address without NAT, the host must
   set it. To implement this routing policy, each host needs to use the
   source address from the prefix assigned by ISP-A to send traffic
   destined for H101. Mechanisms have been proposed to allow hosts to
   choose the source address for packets in a fine-grained manner. We
   will discuss these proposals in Section 6.
   However, interacting with
   host operating systems in some manner to ensure a particular source
   address is chosen for a particular destination prefix is not what an
   enterprise network administrator would expect to have to do to
   implement this routing policy.

4.4. More complex ISP connectivity

   The previous sections considered two variations of a simple
   multihoming scenario where the site is connected to two ISPs offering
   only Internet connectivity. It is likely that many actual enterprise
   multihoming scenarios will be similar to this simple example.
   However, there are more complex multihoming scenarios that we would
   like this solution to address as well.

   It is fairly common for an ISP to offer a service in addition to
   Internet access over the same uplink. Two variations of this are
   reflected in Figure 3. In addition to Internet access, ISP-A offers
   a service which requires the site to access host H51 at
   2001:db8:0:5555::51. The site has a single physical and logical
   connection with ISP-A, and ISP-A only allows access to H51 over that
   connection. So when H32 needs to access the service at H51 it needs
   to send packets with (S=2001:db8:0:a010::32, D=2001:db8:0:5555::51)
   and those packets need to be forwarded out the link from SERa to
   ISP-A.

                                      2001:db8:0:1234::101   H101
                                                              |
                                                              |
   2001:db8:0:a010::31                                     --------
   2001:db8:0:b010::31                       ,-----.      /        \
                 +--+   +--+       +----+  ,'       `.   :          :
             +---|R1|---|R4|---+---|SERa|-+   ISP-A   +--+--        :
        H31--+   +--+   +--+   |   +----+  `.       ,'   :          :
             |                 |             `-----'     : Internet :
             |                 |                |        :          :
             |                 |               H51       :          :
             |                 |     2001:db8:0:5555::51 :          :
             |               +--+                        :          :
             |               |R7|                        :          :
             |               +--+                        :          :
             |                 |                         :          :
             |                 |             ,-----.     :          :
        H32--+   +--+          |  +-----+  ,'       `.   :          :
             +---|R2|-----+----+--|SERb1|-+   ISP-B   +--+--        :
                 +--+     |       +-----+  `.       ,'   :          :
                        +--+                 `--|--'     :          :
   2001:db8:0:a010::32  |R8|                    |         \        /
                        +--+                 ,--|--.       --------
                          |       +-----+  ,'       `.        |
                          +-------|SERb2|-+   ISP-B   |       |
                          |       +-----+  `.       ,'       H501
                          |                  `-----'   2001:db8:0:5678
                          |                     |                ::501
                  +--+  +--+                   H61
         H41------|R3|--|R5|           2001:db8:0:6666::61
                  +--+  +--+

   2001:db8:0:a020::41
   2001:db8:0:b020::41

     Figure 3: Internet access and services offered by ISP-A and ISP-B

   ISP-B illustrates a variation on this scenario. In addition to
   Internet access, ISP-B also offers a service which requires the site
   to access host H61. The site has two connections to two different
   parts of ISP-B (shown as SERb1 and SERb2 in Figure 3). ISP-B expects
   Internet traffic to use the uplink from SERb1, while it expects
   traffic destined for the service at H61 to use the uplink from SERb2.
   For either uplink, ISP-B expects the ingress traffic to have a source
   address matching the prefix it assigned to the site,
   2001:db8:0:b000::/52.

   As discussed before, we rely completely on the internal host to set
   the source address of the packet properly. In the case of a packet
   sent by H31 to access the service in ISP-B at H61, we expect the
   packet to have the following addresses: (S=2001:db8:0:b010::31,
   D=2001:db8:0:6666::61). The routed network has two potential ways of
   distributing routes so that this packet exits the site on the uplink
   at SERb2.

   We could just rely on normal destination routes, without using
   source-prefix-scoped routes. If we have SERb2 originate a normal
   unscoped destination route for D=2001:db8:0:6666::/64, the packets
   from H31 to H61 will exit the site at SERb2 as desired. We should
   not have to worry about SERa needing to originate the same route,
   because ISP-B should choose a globally unique prefix for the service
   at H61.

   The alternative is to have SERb2 originate a source-prefix-scoped
   destination route of the form (S=2001:db8:0:b000::/52,
   D=2001:db8:0:6666::/64).
   From a forwarding point of view, the use of
   the source-prefix-scoped destination route would result in traffic
   with source addresses corresponding only to ISP-B being sent to
   SERb2. In contrast, the use of the unscoped destination route would
   result in traffic with source addresses corresponding to ISP-A and
   ISP-B being sent to SERb2, as long as the destination address matches
   the destination prefix. It seems like either forwarding behavior
   would be acceptable.

   However, from the point of view of the enterprise network
   administrator trying to configure, maintain, and troubleshoot this
   multihoming solution, it seems much clearer to have SERb2 originate
   the source-prefix-scoped destination route corresponding to the
   service offered by ISP-B. In this way, all of the traffic leaving
   the site is determined by the source-prefix-scoped routes, and all of
   the traffic within the site or arriving from external hosts is
   determined by the unscoped destination routes. Therefore, for this
   multihoming solution we choose to originate source-prefix-scoped
   routes for all traffic leaving the site.

4.5. ISPs and Provider-Assigned Prefixes

   While we expect that most site multihoming involves connecting to
   only two ISPs, this solution allows for connections to an arbitrary
   number of ISPs to be supported. However, when evaluating scalable
   implementations of the solution, it would be reasonable to assume
   that the maximum number of ISPs that a site would connect to is five
   (topologies with two redundant routers each having two uplinks to
   different ISPs plus a tunnel to a head office acting as a fifth one
   are not unheard of).

   It is also useful to note that the prefixes assigned to the site by
   different ISPs will not overlap. This must be the case, since the
   provider-assigned addresses have to be globally unique.

4.6. Simplified Topologies

   The topologies of many enterprise sites using this multihoming
   solution may in practice be simpler than the examples that we have
   used. The topology in Figure 1 could be further simplified by having
   all hosts directly connected to the LAN connecting the two site exit
   routers, SERa and SERb. The topology could also be simplified by
   having the uplinks to ISP-A and ISP-B both connected to the same site
   exit router. However, it is the aim of this document to provide a
   solution that applies to as broad a range of enterprise site network
   topologies as possible, so this document focuses on providing a
   solution to the more general case. The simplified cases will also be
   supported by this solution, and there may even be optimizations that
   can be made for simplified cases. This solution, however, needs to
   support more complex topologies.

   We are starting with the basic assumption that enterprise site
   networks can be quite complex from a routing perspective. However,
   even a complex site network can be multihomed to different ISPs with
   PA addresses using IPv4 and NAT. It is not reasonable to expect an
   enterprise network operator to change the routing topology of the
   site in order to deploy IPv6.

5. Generating Source-Prefix-Scoped Forwarding Tables

   So far we have described in general terms how the routers in this
   solution that are capable of Source Address Dependent Routing (SADR)
   will forward traffic using both normal unscoped destination routes
   and source-prefix-scoped destination routes. Here we give a precise
   method for generating a source-prefix-scoped forwarding table on a
   router that supports SADR.

   1. Compute the next-hops for the source-prefix-scoped destination
      prefixes using only routers in the connected SADR domain. These
      are the initial source-prefix-scoped forwarding table entries.

   2. Compute the next-hops for the unscoped destination prefixes using
      all routers in the IGP. This is the unscoped forwarding table.

   3. Augment each more specific source-prefix-scoped forwarding table
      with the entries of each less specific source-prefix-scoped
      forwarding table, based on the following rule. If the destination
      prefix of the less specific source-prefix-scoped forwarding entry
      exactly matches the destination prefix of an existing more
      specific source-prefix-scoped forwarding entry (including
      destination prefix length), then do not add the less specific
      source-prefix-scoped forwarding entry. If the destination prefix
      does NOT match an existing entry, then add the entry to the more
      specific source-prefix-scoped forwarding table. As the unscoped
      forwarding table is considered to be scoped to ::/0, this process
      starts with propagating routes from the unscoped forwarding table
      to source-prefix-scoped forwarding tables and then continues with
      propagating routes to more-specific-source-prefix-scoped
      forwarding tables should they exist.

   The forwarding tables produced by this process are used in the
   following way to forward packets.

   1. Select the most specific (longest prefix match) source-prefix-
      scoped forwarding table that matches the source address of the
      packet (again, the unscoped forwarding table is considered to be
      scoped to ::/0).

   2. Look up the destination address of the packet in the selected
      forwarding table to determine the next-hop for the packet.

   The following example illustrates how this process is used to create
   a forwarding table for each provider-assigned source prefix. We
   consider the multihomed site network in Figure 3. Initially we
   assume that all of the routers in the site network support SADR.
   Figure 4 shows the routes that are originated by the routers in the
   site network.
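   Before walking through the example, the augmentation rule (step 3)
   and the two-step lookup described above can be sketched in Python.
   This is a minimal illustration under stated assumptions, not a router
   implementation: the names build_sadr_tables and lookup and the
   dictionary layout are invented for this sketch, and the sample data
   are simply the per-scope next hops that R8 computes in the example
   that follows (the output of steps 1 and 2).

```python
import ipaddress

def build_sadr_tables(scoped, unscoped):
    """Step 3: copy entries from less specific source scopes into more
    specific ones unless the destination prefix is already present.

    scoped:   {source_prefix: {destination_prefix: next_hop}}
    unscoped: {destination_prefix: next_hop}, treated as scope ::/0
    (Hypothetical data layout chosen for this sketch.)
    """
    net = ipaddress.ip_network
    tables = {net("::/0"): {net(d): nh for d, nh in unscoped.items()}}
    for src, entries in scoped.items():
        tables[net(src)] = {net(d): nh for d, nh in entries.items()}
    # Fill tables from least to most specific source prefix so that
    # inherited entries propagate down a chain of nested scopes.
    for more in sorted(tables, key=lambda p: p.prefixlen):
        parents = [p for p in tables
                   if p.prefixlen < more.prefixlen and more.subnet_of(p)]
        # Closest enclosing scope first, so its entries take precedence.
        for less in sorted(parents, key=lambda p: p.prefixlen, reverse=True):
            for dst, nh in tables[less].items():
                tables[more].setdefault(dst, nh)  # existing entry wins
    return tables

def lookup(tables, src, dst):
    """Pick the table by longest source match, then the next hop by
    longest destination match within that table."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    scope = max((p for p in tables if s in p), key=lambda p: p.prefixlen)
    matches = [p for p in tables[scope] if d in p]
    if not matches:
        return None
    return tables[scope][max(matches, key=lambda p: p.prefixlen)]

# Next hops as computed at R8 before augmentation.
scoped = {
    "2001:db8:0:a000::/52": {"2001:db8:0:5555::/64": "R7", "::/0": "R7"},
    "2001:db8:0:b000::/52": {"2001:db8:0:6666::/64": "SERb2",
                             "::/0": "SERb1"},
}
unscoped = {
    "2001:db8:0:a010::/64": "R2", "2001:db8:0:b010::/64": "R2",
    "2001:db8:0:a020::/64": "R5", "2001:db8:0:b020::/64": "R5",
    "2001:db8:0:5555::/64": "R7", "2001:db8:0:6666::/64": "SERb2",
    "::/0": "SERb1",
}
tables = build_sadr_tables(scoped, unscoped)
```

   With this data, lookup(tables, "2001:db8:0:a010::31",
   "2001:db8:0:1234::101") selects the table scoped to
   2001:db8:0:a000::/52 and returns R7, while the same destination with
   source 2001:db8:0:b010::31 returns SERb1.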
   Routes originated by SERa:
      (S=2001:db8:0:a000::/52, D=2001:db8:0:5555::/64)
      (S=2001:db8:0:a000::/52, D=::/0)
      (D=2001:db8:0:5555::/64)
      (D=::/0)

   Routes originated by SERb1:
      (S=2001:db8:0:b000::/52, D=::/0)
      (D=::/0)

   Routes originated by SERb2:
      (S=2001:db8:0:b000::/52, D=2001:db8:0:6666::/64)
      (D=2001:db8:0:6666::/64)

   Routes originated by R1:
      (D=2001:db8:0:a010::/64)
      (D=2001:db8:0:b010::/64)

   Routes originated by R2:
      (D=2001:db8:0:a010::/64)
      (D=2001:db8:0:b010::/64)

   Routes originated by R3:
      (D=2001:db8:0:a020::/64)
      (D=2001:db8:0:b020::/64)

        Figure 4: Routes Originated by Routers in the Site Network

   Each SER originates destination routes which are scoped to the source
   prefix assigned by the ISP that the SER connects to. Note that the
   SERs also originate the corresponding unscoped destination route.
   This is not needed when all of the routers in the site support SADR.
   However, it is required when some routers do not support SADR. This
   will be discussed in more detail later.

   We focus on how R8 constructs its source-prefix-scoped forwarding
   tables from these route advertisements. R8 computes the next hops
   for destination routes which are scoped to the source prefix
   2001:db8:0:a000::/52. The results are shown in the first table in
   Figure 5. (In this example, the next hops are computed assuming that
   all links have the same metric.) Then, R8 computes the next hops for
   destination routes which are scoped to the source prefix
   2001:db8:0:b000::/52. The results are shown in the second table in
   Figure 5. Finally, R8 computes the next hops for the unscoped
   destination prefixes. The results are shown in the third table in
   Figure 5.
      forwarding entries scoped to
      source prefix = 2001:db8:0:a000::/52
      ============================================
      D=2001:db8:0:5555::/64      NH=R7
      D=::/0                      NH=R7

      forwarding entries scoped to
      source prefix = 2001:db8:0:b000::/52
      ============================================
      D=2001:db8:0:6666::/64      NH=SERb2
      D=::/0                      NH=SERb1

      unscoped forwarding entries
      ============================================
      D=2001:db8:0:a010::/64      NH=R2
      D=2001:db8:0:b010::/64      NH=R2
      D=2001:db8:0:a020::/64      NH=R5
      D=2001:db8:0:b020::/64      NH=R5
      D=2001:db8:0:5555::/64      NH=R7
      D=2001:db8:0:6666::/64      NH=SERb2
      D=::/0                      NH=SERb1

               Figure 5: Forwarding Entries Computed at R8

   The final step is for R8 to augment the more specific source-prefix-
   scoped forwarding tables with entries from the less specific ones.
   As the unscoped forwarding table is considered to be scoped to ::/0,
   and both 2001:db8:0:a000::/52 and 2001:db8:0:b000::/52 are more
   specific prefixes of ::/0, both source-prefix-scoped tables need to
   be augmented with entries from the unscoped (scoped to ::/0)
   forwarding table. If a less specific scoped forwarding entry has the
   exact same destination prefix as a more specific source-prefix-scoped
   forwarding entry (including destination prefix length), then the more
   specific source-prefix-scoped forwarding entry wins.

   As an example of how the source-prefix-scoped forwarding entries are
   augmented, we consider how the two entries in the first table in
   Figure 5 (the table for source prefix = 2001:db8:0:a000::/52) are
   augmented with entries from the third table in Figure 5 (the table of
   unscoped or scoped to ::/0 forwarding entries).
   The first four
   unscoped forwarding entries (D=2001:db8:0:a010::/64,
   D=2001:db8:0:b010::/64, D=2001:db8:0:a020::/64, and
   D=2001:db8:0:b020::/64) are not an exact match for any of the
   existing entries in the forwarding table for source prefix
   2001:db8:0:a000::/52. Therefore, these four entries are added to the
   final forwarding table for source prefix 2001:db8:0:a000::/52. The
   result of adding these entries is reflected in the first four entries
   in the first table in Figure 6.

   The next less specific scoped (scope is ::/0) forwarding table entry
   is for D=2001:db8:0:5555::/64. This entry is an exact match for the
   existing entry in the forwarding table for the more specific source
   prefix 2001:db8:0:a000::/52. Therefore, we do not replace the
   existing entry with the entry from the unscoped forwarding table.
   This is reflected in the fifth entry in the first table in Figure 6.
   (Note that since both scoped and unscoped entries have R7 as the next
   hop, the result of applying this rule is not visible.)

   The next less specific scoped (scope is ::/0) forwarding table entry
   is for D=2001:db8:0:6666::/64. This entry is not an exact match for
   any existing entries in the forwarding table for source prefix
   2001:db8:0:a000::/52. Therefore, we add this entry. This is
   reflected in the sixth entry in the first table in Figure 6.

   The next less specific scoped (scope is ::/0) forwarding table entry
   is for D=::/0. This entry is an exact match for the existing entry
   in the forwarding table for the more specific source prefix
   2001:db8:0:a000::/52. Therefore, we do not overwrite the existing
   source-prefix-scoped entry, as can be seen in the last entry in the
   first table in Figure 6.
      if source address matches 2001:db8:0:a000::/52
         then use this forwarding table
      ============================================
      D=2001:db8:0:a010::/64      NH=R2
      D=2001:db8:0:b010::/64      NH=R2
      D=2001:db8:0:a020::/64      NH=R5
      D=2001:db8:0:b020::/64      NH=R5
      D=2001:db8:0:5555::/64      NH=R7
      D=2001:db8:0:6666::/64      NH=SERb2
      D=::/0                      NH=R7

      else if source address matches 2001:db8:0:b000::/52
         then use this forwarding table
      ============================================
      D=2001:db8:0:a010::/64      NH=R2
      D=2001:db8:0:b010::/64      NH=R2
      D=2001:db8:0:a020::/64      NH=R5
      D=2001:db8:0:b020::/64      NH=R5
      D=2001:db8:0:5555::/64      NH=R7
      D=2001:db8:0:6666::/64      NH=SERb2
      D=::/0                      NH=SERb1

      else if source address matches ::/0 use this forwarding table
      ============================================
      D=2001:db8:0:a010::/64      NH=R2
      D=2001:db8:0:b010::/64      NH=R2
      D=2001:db8:0:a020::/64      NH=R5
      D=2001:db8:0:b020::/64      NH=R5
      D=2001:db8:0:5555::/64      NH=R7
      D=2001:db8:0:6666::/64      NH=SERb2
      D=::/0                      NH=SERb1

           Figure 6: Complete Forwarding Tables Computed at R8

   The forwarding tables produced by this process at R8 have the desired
   properties. A packet with a source address in 2001:db8:0:a000::/52
   will be forwarded based on the first table in Figure 6. If the
   packet is destined for the Internet at large or the service at
   D=2001:db8:0:5555::/64, it will be sent to R7 in the direction of
   SERa. If the packet is destined for an internal host, then the first
   four entries will send it to R2 or R5 as expected. Note that if this
   packet has a destination address corresponding to the service offered
   by ISP-B (D=2001:db8:0:6666::/64), then it will get forwarded to
   SERb2. It will be dropped by SERb2 or by ISP-B, since the packet has
   a source address that was not assigned by ISP-B. However, this is
   expected behavior.
   In order to use the service offered by ISP-B, the
   host needs to originate the packet with a source address assigned by
   ISP-B.

   In this example, a packet with a source address that doesn't match
   2001:db8:0:a000::/52 or 2001:db8:0:b000::/52 must have originated
   from an external host. Such a packet will use the unscoped
   forwarding table (the last table in Figure 6). These packets will
   flow exactly as they would in the absence of multihoming.

   We can also modify this example to illustrate how it supports
   deployments where not all routers in the site support SADR.
   Continuing with the topology shown in Figure 3, suppose that R3 and
   R5 do not support SADR. Instead they are only capable of
   understanding unscoped route advertisements. The SADR routers in the
   network will still originate the routes shown in Figure 4. However,
   R3 and R5 will only understand the unscoped routes as shown in
   Figure 7.

   Routes originated by SERa:
      (D=2001:db8:0:5555::/64)
      (D=::/0)

   Routes originated by SERb1:
      (D=::/0)

   Routes originated by SERb2:
      (D=2001:db8:0:6666::/64)

   Routes originated by R1:
      (D=2001:db8:0:a010::/64)
      (D=2001:db8:0:b010::/64)

   Routes originated by R2:
      (D=2001:db8:0:a010::/64)
      (D=2001:db8:0:b010::/64)

   Routes originated by R3:
      (D=2001:db8:0:a020::/64)
      (D=2001:db8:0:b020::/64)

     Figure 7: Route Advertisements Understood by Routers That Do Not
                               Support SADR

   With these unscoped route advertisements, R5 will produce the
   forwarding table shown in Figure 8.
      forwarding table
      ============================================
      D=2001:db8:0:a010::/64      NH=R8
      D=2001:db8:0:b010::/64      NH=R8
      D=2001:db8:0:a020::/64      NH=R3
      D=2001:db8:0:b020::/64      NH=R3
      D=2001:db8:0:5555::/64      NH=R8
      D=2001:db8:0:6666::/64      NH=SERb2
      D=::/0                      NH=R8

    Figure 8: Forwarding Table For R5, Which Doesn't Understand Source-
                           Prefix-Scoped Routes

   As all SERs belong to the SADR domain, any traffic that needs to exit
   the site will eventually hit a SADR-capable router. To prevent
   routing loops involving SADR-capable and non-SADR-capable routers,
   traffic that enters the SADR-capable domain does not leave the domain
   until it exits the site. Therefore all SADR-capable routers within
   the domain MUST be logically connected.

   Note that the mechanism described here for converting source-prefix-
   scoped destination prefix routing advertisements into forwarding
   state is somewhat different from that proposed in
   [I-D.ietf-rtgwg-dst-src-routing]. The method described in the
   current document is functionally equivalent, but it is based on
   application of existing mechanisms for the described scenarios.

6. Mechanisms For Hosts To Choose Good Source Addresses In A Multihomed
   Site

   Until this point, we have made the assumption that hosts are able to
   choose the correct source address using some unspecified mechanism.
   This has allowed us to just focus on what the routers in a multihomed
   site network need to do in order to forward packets to the correct
   ISP based on source address. Now we look at possible mechanisms for
   hosts to choose the correct source address. We also look at what
   role, if any, the routers may play in providing information that
   helps hosts to choose source addresses.

   It should be noted that this section discusses how hosts could select
   the source address for new connections.
   Any connection which already
   exists on a host is bound to a specific source address which cannot
   be changed. Section 6.7 discusses the connection preservation issue
   in more detail.

   Any host that needs to be able to send traffic using the uplinks to a
   given ISP is expected to be configured with an address from the
   prefix assigned by that ISP. The host will control which ISP is used
   for its traffic by selecting one of the addresses configured on the
   host as the source address for outgoing traffic. It is the
   responsibility of the site network to ensure that a packet with the
   source address from an ISP is sent on an uplink to that ISP.

   If all of the ISP uplinks are working, the choice of source address
   by the host may be driven by the desire to load share across ISP
   uplinks, or it may be driven by the desire to take advantage of
   certain properties of a particular uplink or ISP (if some information
   about various path properties has been made available to the host
   somehow; see [I-D.ietf-intarea-provisioning-domains] as an example).
   If any of the ISP uplinks is not working, then the choice of source
   address by the host can cause packets to get dropped.

   How a host should make good decisions about source address selection
   in a multihomed site is not a solved problem. We do not attempt to
   solve this problem in this document. Instead we discuss the current
   state of affairs with respect to standardized solutions and
   implementation of those solutions. We also look at proposed
   solutions for this problem.

   An external host initiating communication with a host internal to a
   PA multihomed site will need to know multiple addresses for that host
   in order to communicate with it using different ISPs to the
   multihomed site (knowing just one address would undermine all
   benefits of redundant connectivity provided by multihoming).
   These
   addresses are typically learned through DNS. (For simplicity, we
   assume that the external host is single-homed.) The external host
   chooses the ISP that will be used at the remote multihomed site by
   setting the destination address on the packets it transmits. For a
   session originated from an external host to an internal host, the
   choice of source address used by the internal host is simple. The
   internal host has no choice but to use the destination address in the
   received packet as the source address of the transmitted packet.

   For a session originated by a host inside the multihomed site, the
   decision of what source address to select is more complicated. We
   consider three main methods for hosts to get information about the
   network. The two proactive methods are Neighbor Discovery Router
   Advertisements and DHCPv6. The one reactive method we consider is
   ICMPv6. Note that we are explicitly excluding the possibility of
   having hosts participate in or even listen directly to routing
   protocol advertisements.

   First we look at how a host is currently expected to select the
   source and destination address with which it sends a packet for a new
   connection.

6.1. Source Address Selection Algorithm on Hosts

   [RFC6724] defines the algorithms that hosts are expected to use to
   select source and destination addresses for packets. It defines an
   algorithm for selecting a source address and a separate algorithm for
   selecting a destination address. Both of these algorithms depend on
   a policy table. [RFC6724] defines a default policy which produces
   certain behavior.

   The rules in the two algorithms in [RFC6724] depend on many different
   properties of addresses.
   While these are needed for understanding
   how a host should choose addresses in an arbitrary environment, most
   of the rules are not relevant for understanding how a host should
   choose among multiple source addresses in a multihomed environment
   when sending a packet to a remote host. Returning to the example in
   Figure 3, we look at what the default algorithms in [RFC6724] say
   about the source address that internal host H31 should use to send
   traffic to external host H101, somewhere on the Internet.

   There is no choice to be made with respect to destination address.
   H31 needs to send a packet with D=2001:db8:0:1234::101 in order to
   reach H101. So H31 has to choose between using
   S=2001:db8:0:a010::31 or S=2001:db8:0:b010::31 as the source address
   for this packet. We go through the rules for source address
   selection in Section 5 of [RFC6724].

   Rule 1 (Prefer same address) is not useful to break the tie between
   source addresses, because neither of the candidate source addresses
   equals the destination address.

   Rule 2 (Prefer appropriate scope) is also not used in this scenario,
   because both source addresses and the destination address have global
   scope.

   Rule 3 (Avoid deprecated addresses) applies to an address that has
   been autoconfigured by a host using stateless address
   autoconfiguration as defined in [RFC4862]. An address autoconfigured
   by a host has a preferred lifetime and a valid lifetime. The address
   is preferred until the preferred lifetime expires, after which it
   becomes deprecated. A deprecated address is not used if there is a
   preferred address of the appropriate scope available. When the valid
   lifetime expires, the address cannot be used at all.
   The preferred
   and valid lifetimes for an autoconfigured address are set based on
   the corresponding lifetimes in the Prefix Information Option in
   Neighbor Discovery Router Advertisements. So a possible tool to
   control source address selection in this scenario would be to make a
   host's address deprecated by having the routers on that link, R1 and
   R2 in Figure 3, send a Router Advertisement message containing a
   Prefix Information Option for the source prefix to be discouraged (or
   prohibited) with the preferred lifetime set to zero. This is a
   rather blunt tool, because it discourages or prohibits the use of
   that source prefix for all destinations. However, it may be useful
   in some scenarios. For example, if all uplinks to a particular ISP
   fail, it is desirable to prevent hosts from using source addresses
   from that ISP's address space.

   Rule 4 (Avoid home addresses) does not apply here because we are not
   considering Mobile IP.

   Rule 5 (Prefer outgoing interface) is not useful in this scenario,
   because both source addresses are assigned to the same interface.

   Rule 5.5 (Prefer addresses in a prefix advertised by the next-hop) is
   not useful in the scenario where both R1 and R2 advertise both
   source prefixes. However, this rule may potentially allow a host to
   select the correct source prefix by selecting a next-hop. The most
   obvious way would be to have R1 advertise itself as a default
   router and send a PIO for 2001:db8:0:a010::/64, while R2 advertises
   itself as a default router and sends a PIO for 2001:db8:0:b010::/64.
   We'll discuss later how Rule 5.5 can be used to influence source
   address selection in single-router topologies (e.g. when H41 is
   sending traffic using R3 as a default gateway).
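   The effect of Rules 3 and 5.5 on H31's two candidate addresses can be
   sketched as follows. This is a simplified illustration only:
   [RFC6724] defines its rules as pairwise comparisons between
   candidates, whereas this sketch models each rule as a filter over the
   candidate list. The function names, the lifetime values, and the
   per-router prefix maps are assumptions made for the example.

```python
import ipaddress

def rule_3(candidates, preferred_lifetime):
    """Avoid deprecated addresses: keep addresses whose preferred
    lifetime is non-zero, unless that would leave no candidate."""
    preferred = [a for a in candidates if preferred_lifetime.get(a, 0) > 0]
    return preferred or list(candidates)

def rule_5_5(candidates, next_hop, advertised):
    """Prefer addresses in a prefix advertised by the chosen next hop.
    If no candidate qualifies, the rule does not break the tie."""
    prefixes = [ipaddress.ip_network(p) for p in advertised.get(next_hop, [])]
    matching = [a for a in candidates
                if any(ipaddress.ip_address(a) in p for p in prefixes)]
    return matching or list(candidates)

candidates = ["2001:db8:0:a010::31", "2001:db8:0:b010::31"]

# Hypothetical case: each router sends a PIO for one prefix only.
split = {"R1": ["2001:db8:0:a010::/64"],
         "R2": ["2001:db8:0:b010::/64"]}
# Case from the text: both routers advertise both prefixes.
both = {"R1": ["2001:db8:0:a010::/64", "2001:db8:0:b010::/64"],
        "R2": ["2001:db8:0:a010::/64", "2001:db8:0:b010::/64"]}

via_r1 = rule_5_5(candidates, "R1", split)   # only the ISP-A address
via_r2 = rule_5_5(candidates, "R2", split)   # only the ISP-B address
no_tiebreak = rule_5_5(candidates, "R1", both)

# Hypothetical lifetimes in seconds after the ISP-A prefix has been
# advertised with a preferred lifetime of zero.
lifetimes = {"2001:db8:0:a010::31": 0, "2001:db8:0:b010::31": 1800}
after_rule_3 = rule_3(candidates, lifetimes)  # only the ISP-B address
```

   When both routers advertise both prefixes, rule_5_5 returns both
   candidates unchanged, which matches the observation above that the
   rule is not useful in that configuration.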
   Rule 6 (Prefer matching label) refers to the Label value determined
   for each source and destination prefix as a result of applying the
   policy table to the prefix. With the default policy table defined in
   Section 2.1 of [RFC6724], Label(2001:db8:0:a010::31) = 5,
   Label(2001:db8:0:b010::31) = 5, and Label(2001:db8:0:1234::101) = 5.
   So with the default policy, Rule 6 does not break the tie. However,
   the algorithms in [RFC6724] are defined in such a way that non-
   default address selection policy tables can be used. [RFC7078]
   defines a way to distribute a non-default address selection policy
   table to hosts using DHCPv6. So even though the application of Rule
   6 to this scenario using the default policy table is not useful, Rule
   6 may still be a useful tool.

   Rule 7 (Prefer temporary addresses) has to do with the technique
   described in [RFC4941] to periodically randomize the interface
   portion of an IPv6 address that has been generated using stateless
   address autoconfiguration. In general, if H31 were using this
   technique, it would use it for both source addresses, for example
   creating temporary addresses 2001:db8:0:a010:2839:9938:ab58:830f and
   2001:db8:0:b010:4838:f483:8384:3208, in addition to
   2001:db8:0:a010::31 and 2001:db8:0:b010::31. So this rule would
   prefer the two temporary addresses, but it would not break the tie
   between the two source prefixes from ISP-A and ISP-B.

   Rule 8 (Use longest matching prefix) dictates that, between two
   candidate source addresses, the one that has the longest common
   prefix length with the destination address is preferred. For
   example, if H31 were selecting the source address for sending packets
   to H101, this rule would not be a tie breaker, as for both candidate
   source addresses 2001:db8:0:a010::31 and 2001:db8:0:b010::31 the
   common prefix length with the destination is 48.
   However, if H31 were selecting the source
   address for sending packets to H41 at address 2001:db8:0:a020::41,
   then this rule would result in using 2001:db8:0:a010::31 as the
   source (2001:db8:0:a010::31 and 2001:db8:0:a020::41 share the common
   prefix 2001:db8:0:a000::/58, while for 2001:db8:0:b010::31 and
   2001:db8:0:a020::41 the common prefix is 2001:db8:0:a000::/51).
   Therefore Rule 8 might be useful for selecting the correct source
   address in some but not all scenarios (for example if ISP-B services
   belong to 2001:db8:0:b000::/59 then H31 would always use
   2001:db8:0:b010::31 to access those destinations).

   So we can see that of the source address selection rules from
   [RFC6724], four actually apply to our basic site multihoming
   scenario. The rules that are relevant to this scenario are
   summarized below.

   o  Rule 3: Avoid deprecated addresses.

   o  Rule 5.5: Prefer addresses in a prefix advertised by the next-hop.

   o  Rule 6: Prefer matching label.

   o  Rule 8: Use longest matching prefix.

   The two methods that we discuss for controlling the source address
   selection through the four relevant rules above are SLAAC Router
   Advertisement messages and DHCPv6.

   We also consider a possible role for ICMPv6 for getting traffic-
   driven feedback from the network. With the source address selection
   algorithm discussed above, the goal is to choose the correct source
   address on the first try, before any traffic is sent. However,
   another strategy is to choose a source address, send the packet, get
   feedback from the network about whether or not the source address is
   correct, and try another source address if it is not.

   We consider four scenarios where a host needs to select the correct
   source address. The first is when both uplinks are working. The
   second is when one uplink has failed.
   The third one is a situation
   when one failed uplink has recovered. The last one is failure of
   both (all) uplinks.

   It should be noted that [RFC6724] defines the default behaviour for
   IPv6 hosts. The applications and upper-layer protocols can make
   their own choices on selecting source addresses. However, the
   mechanism proposed in this document attempts to ensure that the
   subset of source addresses available for applications and upper-layer
   protocols is selected with the up-to-date network state in mind.

6.2. Selecting Source Address When Both Uplinks Are Working

   Again we return to the topology in Figure 3. Suppose that the site
   administrator wants to implement a policy by which all hosts need to
   use ISP-A to reach H101 at D=2001:db8:0:1234::101. So for example,
   H31 needs to select S=2001:db8:0:a010::31.

6.2.1. Distributing Address Selection Policy Table with DHCPv6

   This policy can be implemented by using DHCPv6 to distribute an
   address selection policy table that assigns the same label to
   destination addresses that match 2001:db8:0:1234::/64 as it does to
   source addresses that match 2001:db8:0:a000::/52. The following two
   entries accomplish this.

      Prefix                  Precedence   Label
      2001:db8:0:1234::/64        50         33
      2001:db8:0:a000::/52        50         33

       Figure 9: Policy table entries to implement a routing policy

   This requires that the hosts implement [RFC6724], the basic source
   and destination address framework, along with [RFC7078], the DHCPv6
   extension for distributing a non-default policy table. Note that it
   does NOT require that the hosts use DHCPv6 for address assignment.
   The hosts could still use stateless address autoconfiguration for
   address configuration, while using DHCPv6 only for policy table
   distribution (see [RFC8415]).
However, this method has a number of disadvantages:

o  DHCPv6 support is not a mandatory requirement for IPv6 hosts ([RFC6434]), so this method might not work for all devices.

o  Network administrators are required to explicitly configure the desired network access policies on DHCPv6 servers. While this might be feasible in the scenario of a single multihomed network, such an approach might have scalability issues, especially if a centralized DHCPv6 solution is deployed to serve a large number of multihomed sites.

6.2.2.  Controlling Source Address Selection With Router Advertisements

Neighbor Discovery currently has two mechanisms to communicate prefix information to hosts. The base specification for Neighbor Discovery (see [RFC4861]) defines the Prefix Information Option (PIO) in the Router Advertisement (RA) message. When a host receives a PIO with the A-flag set, it can use the prefix in the PIO as the source prefix from which it assigns itself an IP address using the stateless address autoconfiguration (SLAAC) procedures described in [RFC4862]. In the example of Figure 3, if the site network is using SLAAC, we would expect both R1 and R2 to send RA messages with PIOs for both source prefixes 2001:db8:0:a010::/64 and 2001:db8:0:b010::/64 with the A-flag set. H31 would then use the SLAAC procedure to configure itself with 2001:db8:0:a010::31 and 2001:db8:0:b010::31.

Whereas a host learns about source prefixes from PIOs, it can learn about a destination prefix from a Router Advertisement containing a Route Information Option (RIO), as specified in [RFC4191]. The destination prefixes in RIOs are intended to allow a host to choose the router that it uses as its first hop to reach a particular destination prefix.
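As an illustration of the PIO handling described above, the following minimal sketch derives H31's two addresses from the two advertised source prefixes. It is not an implementation of [RFC4862]: the function name slaac_addresses and the (prefix, A-flag) tuple layout are assumptions made for this example, and the interface identifier ::31 is simply taken from the addressing used in Figure 3.

```python
import ipaddress

def slaac_addresses(pios, interface_id):
    """Form one address per PIO that has the A-flag set (illustrative only)."""
    addresses = []
    for prefix, a_flag in pios:
        net = ipaddress.IPv6Network(prefix)
        # Only /64 prefixes with the A-flag set are used in this sketch.
        if a_flag and net.prefixlen == 64:
            addresses.append(net[interface_id])
    return addresses

# PIOs that R1 and R2 would advertise in the Figure 3 example.
pios = [("2001:db8:0:a010::/64", True), ("2001:db8:0:b010::/64", True)]
h31 = [str(a) for a in slaac_addresses(pios, 0x31)]
print(h31)
```

A PIO without the A-flag is skipped, so a router can advertise an on-link prefix without causing hosts to configure an address from it.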

As currently standardized, neither the PIO nor the RIO contained in Neighbor Discovery Router Advertisements can communicate the information needed to implement the desired routing policy. PIOs communicate source prefixes, and RIOs communicate destination prefixes. However, there is currently no standardized way to directly associate a particular destination prefix with a particular source prefix.

[I-D.pfister-6man-sadr-ra] proposes a Source Address Dependent Route Information option for Neighbor Discovery Router Advertisements which would associate a source prefix with a destination prefix. The details of [I-D.pfister-6man-sadr-ra] might need tweaking to address this use case. However, in order to be able to use Neighbor Discovery Router Advertisements to implement this routing policy, an extension that allows R1 and R2 to explicitly communicate to H31 an association between S=2001:db8:0:a000::/52 and D=2001:db8:0:1234::/64 would be needed.

However, Rule 5.5 of the source address selection algorithm (discussed in Section 6.1 above), together with default router preference (specified in [RFC4191]) and RIO, can be used to influence source address selection on a host as described below. Let's look at source address selection on the host H41. It receives RAs from R3 with PIOs for 2001:db8:0:a020::/64 and 2001:db8:0:b020::/64. At that point all traffic would use the same next-hop (R3's link-local address), so Rule 5.5 does not apply. Now let's assume that R3 supports SADR and has two scoped forwarding tables, one scoped to S=2001:db8:0:a000::/52 and another scoped to S=2001:db8:0:b000::/52.
Suppose R3 generates two different link-local addresses for its interface facing H41 (one for each scoped forwarding table, LLA_A and LLA_B) and starts sending two different RAs: one is sent from LLA_A and includes a PIO for 2001:db8:0:a020::/64, and the other is sent from LLA_B and includes a PIO for 2001:db8:0:b020::/64. Now it is possible to influence H41's source address selection for destinations which follow the default route by setting the default router preference in RAs. If it is desired that H41 reaches H101 (or any destination in the Internet) via ISP-A, then RAs sent from LLA_A should have the default router preference set to 01 (high), while RAs sent from LLA_B should have the preference set to 11 (low). Then LLA_A would be chosen as a next-hop for H101 and therefore (as per Rule 5.5) 2001:db8:0:a020::41 would be selected as the source address. If, at the same time, it is desired that H61 is accessible via ISP-B, then R3 should include a RIO for 2001:db8:0:6666::/64 in its RA sent from LLA_B. H41 would choose LLA_B as a next-hop for all traffic to H61 and then, as per Rule 5.5, 2001:db8:0:b020::41 would be selected as a source address.

If in the above-mentioned scenario it is desirable that all Internet traffic leaves the network via ISP-A and the link to ISP-B is used for accessing ISP-B services only (not as a backup for the ISP-A link), then RAs sent by R3 from LLA_B should have Router Lifetime set to 0 and should include RIOs for the ISP-B address space. This would instruct H41 to use LLA_A for all Internet traffic but use LLA_B as a next-hop while sending traffic to ISP-B addresses.

The description of the mechanism above assumes SADR support by the first-hop routers as well as SERs. However, a first-hop router can still provide a less flexible version of this mechanism even without implementing SADR.
This could be done by providing configuration knobs on the first-hop router that allow it to generate different link-local addresses and to send individual RAs for each prefix.

The mechanism described above relies on Rule 5.5 of the default source address selection algorithm defined in [RFC6724]. [RFC8028] states that "A host SHOULD select default routers for each prefix it is assigned an address in". It also recommends that hosts should implement Rule 5.5 of [RFC6724]. Hosts following the recommendations specified in [RFC8028] therefore should be able to benefit from the solution described in this document. No standards need to be updated in regards to host behavior.

6.2.3.  Controlling Source Address Selection With ICMPv6

We now discuss how one might use ICMPv6 to implement the routing policy to send traffic destined for H101 out the uplink to ISP-A, even when uplinks to both ISPs are working. If H31 started sending traffic to H101 with S=2001:db8:0:b010::31 and D=2001:db8:0:1234::101, it would be routed through SERb1 and out the uplink to ISP-B. SERb1 could recognize that this traffic is not following the desired routing policy and react by sending an ICMPv6 message back to H31.

In this example, we could arrange things so that SERb1 drops the packet with S=2001:db8:0:b010::31 and D=2001:db8:0:1234::101, and then sends to H31 an ICMPv6 Destination Unreachable message with Code 5 (Source address failed ingress/egress policy). When H31 receives this packet, it would then be expected to try another source address to reach the destination. In this example, H31 would then send a packet with S=2001:db8:0:a010::31 and D=2001:db8:0:1234::101, which will reach SERa and be forwarded out the uplink to ISP-A.
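The iterative behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: network_accepts stands in for the whole site network (a False result models SERb1 dropping the packet and returning Destination Unreachable, Code 5), and send_with_retry is a hypothetical name for the expected host reaction, not an API of any operating system.

```python
import ipaddress

ISP_A_SOURCES = ipaddress.IPv6Network("2001:db8:0:a000::/52")
POLICY_DESTINATIONS = ipaddress.IPv6Network("2001:db8:0:1234::/64")

def network_accepts(src, dst):
    """Toy model of the site policy: H101's prefix must be reached via ISP-A."""
    src_addr = ipaddress.IPv6Address(src)
    dst_addr = ipaddress.IPv6Address(dst)
    if dst_addr in POLICY_DESTINATIONS and src_addr not in ISP_A_SOURCES:
        return False  # models a drop plus ICMPv6 Destination Unreachable, Code 5
    return True

def send_with_retry(candidates, dst):
    """Try candidate sources in order; return (accepted source, rejected list)."""
    rejected = []
    for src in candidates:
        if network_accepts(src, dst):
            return src, rejected
        rejected.append(src)
    return None, rejected

chosen, rejected = send_with_retry(
    ["2001:db8:0:b010::31", "2001:db8:0:a010::31"],
    "2001:db8:0:1234::101",
)
print(chosen, rejected)
```

Each rejected candidate costs at least one round trip to the SER, which is why the number of addresses a host has to iterate over matters.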

However, we would also want it to be the case that SERb1 does not enforce this routing policy when the uplink from SERa to ISP-A has failed. This could be accomplished by having SERa originate a source-prefix-scoped route for (S=2001:db8:0:a000::/52, D=2001:db8:0:1234::/64) and have SERb1 monitor the presence of that route. If that route is not present (because SERa has stopped originating it), then SERb1 will not enforce the routing policy, and it will forward packets with S=2001:db8:0:b010::31 and D=2001:db8:0:1234::101 out its uplink to ISP-B.

We can also use this source-prefix-scoped route originated by SERa to communicate the desired routing policy to SERb1. We can define an EXCLUSIVE flag to be advertised together with the IGP route for (S=2001:db8:0:a000::/52, D=2001:db8:0:1234::/64). This would allow SERa to communicate to SERb1 that SERb1 should reject traffic for D=2001:db8:0:1234::/64 and respond with an ICMPv6 Destination Unreachable Code 5 message, as long as the route for (S=2001:db8:0:a000::/52, D=2001:db8:0:1234::/64) is present.

Finally, if we are willing to extend ICMPv6 to support this solution, then we could create a mechanism for SERb1 to tell the host what source address it should be using to successfully forward packets that meet the policy. In its current form, when SERb1 sends an ICMPv6 Destination Unreachable Code 5 message, it is basically saying, "This source address is wrong. Try another source address." In the absence of a clear indication of which address to try next, the host will iterate over all addresses assigned to the interface (e.g. various privacy addresses), which would lead to significant delays and a degraded user experience. It would be better if the ICMPv6 message could say, "This source address is wrong. Instead use a source address in S=2001:db8:0:a000::/52."
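The enforcement logic discussed above might be sketched as follows. Everything here is hypothetical: neither the EXCLUSIVE flag nor an ICMPv6 source-prefix hint is defined in any standard, and ScopedRoute and serb1_decision are names invented for this illustration. The sketch only shows how the presence or absence of SERa's scoped route would change SERb1's decision.

```python
import ipaddress
from dataclasses import dataclass

@dataclass(frozen=True)
class ScopedRoute:
    src: ipaddress.IPv6Network
    dst: ipaddress.IPv6Network
    exclusive: bool = False  # hypothetical EXCLUSIVE flag, not standardized

def serb1_decision(src, dst, igp_routes):
    """Return (action, hinted source prefix) for a packet arriving at SERb1."""
    src_addr = ipaddress.IPv6Address(src)
    dst_addr = ipaddress.IPv6Address(dst)
    for route in igp_routes:
        # Reject only while the EXCLUSIVE route from another SER is present.
        if route.exclusive and dst_addr in route.dst and src_addr not in route.src:
            return "reject-code-5", route.src  # the hint is a hypothetical extension
    return "forward", None

route_from_sera = ScopedRoute(
    ipaddress.IPv6Network("2001:db8:0:a000::/52"),
    ipaddress.IPv6Network("2001:db8:0:1234::/64"),
    exclusive=True,
)

# Both uplinks working: SERa originates the route, so SERb1 rejects.
both_up = serb1_decision("2001:db8:0:b010::31", "2001:db8:0:1234::101", [route_from_sera])
# ISP-A uplink failed: SERa has withdrawn the route, so SERb1 forwards.
isp_a_down = serb1_decision("2001:db8:0:b010::31", "2001:db8:0:1234::101", [])
print(both_up, isp_a_down)
```

Tying the policy to the route's presence means that no separate signalling is needed to relax the policy when the ISP-A uplink fails.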

However, using ICMPv6 for signaling source address information back to hosts introduces new challenges. Most routers currently have software or hardware limits on generating ICMP messages. A site administrator deploying a solution that relies on the SERs generating ICMP messages could try to improve the performance of SERs for generating ICMP messages. However, in a large network, it is still likely that ICMP message generation limits will be reached. As a result, hosts would not receive ICMPv6 messages back, which in turn leads to traffic blackholing and poor user experience. To improve the scalability of ICMPv6-based signaling, hosts SHOULD cache the preferred source address (or prefix) for the given destination (which in turn might cause issues if the corresponding ISP uplink fails; see Section 6.3). In addition, the same source prefix SHOULD be used for other destinations in the same /64 as the original destination address. The source prefix to destination mapping SHOULD have a specific lifetime. Expiration of the lifetime SHOULD trigger the source address selection algorithm again.

Using ICMPv6 Destination Unreachable messages with Code 5 to influence source address selection allows an attacker to exhaust the list of candidate source addresses on the host by sending spoofed ICMPv6 Code 5 messages for all prefixes known on the network (therefore preventing a victim from establishing communication with the destination host). To protect against an attack of this kind, hosts SHOULD verify that the original packet header included in the ICMPv6 error message was actually sent by the host.

As currently standardized in [RFC4443], the ICMPv6 Destination Unreachable message with Code 5 would allow for the iterative approach to retransmitting packets using different source addresses.
As currently defined, the ICMPv6 message does not provide a mechanism to communicate information about which source prefix should be used for a retransmitted packet. The current document does not define such a mechanism, but it might be a useful extension to define in a different document. However, this approach has some security implications, such as the ability for an attacker to send spoofed ICMPv6 messages signaling an invalid or unreachable source prefix, causing a DoS-type attack.

6.2.4.  Summary of Methods For Controlling Source Address Selection To Implement Routing Policy

So to summarize this section, we have looked at three methods for implementing a simple routing policy where all traffic for a given destination on the Internet needs to use a particular ISP, even when the uplinks to both ISPs are working.

The default source address selection policy cannot distinguish between the source addresses needed to enforce this policy, so a non-default policy table associating source and destination prefixes using Label values would need to be installed on each host. A mechanism exists for DHCPv6 to distribute a non-default policy table, but such a solution would rely heavily on DHCPv6 support by the host operating system. Moreover, there is no mechanism to translate desired routing/traffic engineering policies into policy tables on DHCPv6 servers. Therefore, using DHCPv6 for controlling the address selection policy table is not recommended and SHOULD NOT be used.

At the same time, Router Advertisements provide a reliable mechanism to influence the source address selection process via PIO, RIO and default router preferences. As all those options have been standardized by the IETF and are supported by various operating systems, no changes are required on hosts.
First-hop routers in the enterprise network need to be capable of sending different RAs for different SLAAC prefixes (either based on scoped forwarding tables or based on pre-configured policies).

SERs can enforce the routing policy by sending ICMPv6 Destination Unreachable messages with Code 5 (Source address failed ingress/egress policy) for traffic that is being sent with the wrong source address. The policy distribution can be automated by defining an EXCLUSIVE flag for the source-prefix-scoped route which can be set on the SER that originates the route. As ICMPv6 message generation can be rate-limited on routers, it SHOULD NOT be used as the only mechanism to influence source address selection on hosts. While hosts SHOULD select the correct source address for a given destination, the network SHOULD signal any source address issues back to hosts using ICMPv6 error messages.

6.3.  Selecting Source Address When One Uplink Has Failed

Now we discuss whether DHCPv6, Neighbor Discovery Router Advertisements, and ICMPv6 can help a host choose the right source address when an uplink to one of the ISPs has failed. Again we look at the scenario in Figure 3. This time we look at traffic from H31 destined for external host H501 at D=2001:db8:0:5678::501. We initially assume that the uplink from SERa to ISP-A is working and that the uplink from SERb1 to ISP-B is working.

We assume there is no particular routing policy desired, so H31 is free to send packets with S=2001:db8:0:a010::31 or S=2001:db8:0:b010::31 and have them delivered to H501. For this example, we assume that H31 has chosen S=2001:db8:0:b010::31 so that the packets exit via SERb1 to ISP-B. Now we see what happens when the link from SERb1 to ISP-B fails.
How should H31 learn that it needs to start sending packets to H501 with S=2001:db8:0:a010::31 in order to start using the uplink to ISP-A? We need to do this in a way that doesn't prevent H31 from still sending packets with S=2001:db8:0:b010::31 in order to reach H61 at D=2001:db8:0:6666::61.

6.3.1.  Controlling Source Address Selection With DHCPv6

For this example we assume that the site network in Figure 3 has a centralized DHCP server and all routers act as DHCP relay agents. We assume that both of the addresses assigned to H31 were assigned via DHCP.

We could try to have the DHCP server monitor the state of the uplink from SERb1 to ISP-B in some manner and then tell H31 that it can no longer use S=2001:db8:0:b010::31 by setting its valid lifetime to zero. The DHCP server could initiate this process by sending a Reconfigure message to H31 as described in Section 18.3 of [RFC8415]. Alternatively, the DHCP server can assign addresses with short lifetimes in order to force clients to renew them often.

This approach would prevent H31 from using S=2001:db8:0:b010::31 to reach a host on the Internet. However, it would also prevent H31 from using S=2001:db8:0:b010::31 to reach H61 at D=2001:db8:0:6666::61, which is not desirable.

Another potential approach is to have the DHCP server monitor the uplink from SERb1 to ISP-B and control the choice of source address on H31 by updating its address selection policy table via the mechanism in [RFC7078]. The DHCP server could initiate this process by sending a Reconfigure message to H31. Note that [RFC8415] requires that the Reconfigure message use DHCP authentication. DHCP authentication could be avoided by using short address lifetimes to force clients to send Renew messages to the server often.
If the host is not obtaining its IP addresses from the DHCP server, then it would need to use the Information Refresh Time option defined in [RFC8415].

If the following policy table can be installed on H31 after the failure of the uplink from SERb1, then the desired routing behavior should be achieved based on source and destination prefixes being matched with label values.

   Prefix                  Precedence  Label
   ::/0                    50          44
   2001:db8:0:a000::/52    50          44
   2001:db8:0:6666::/64    50          55
   2001:db8:0:b000::/52    50          55

   Figure 10: Policy Table Needed On Failure Of Uplink From SERb1

The described solution has a number of significant drawbacks, some of them already discussed in Section 6.2.1.

o  DHCPv6 support is not required for an IPv6 host and there are operating systems which do not support DHCPv6. Besides that, it does not appear that [RFC7078] has been widely implemented on host operating systems.

o  [RFC7078] does not clearly specify this kind of dynamic use case, where the address selection policy needs to be updated quickly in response to the failure of a link. In a large network it would present scalability issues, as many hosts need to be reconfigured in a very short period of time.

o  Updating the DHCPv6 server configuration each time an ISP uplink changes its state introduces some scalability issues, especially for mid-size and large distributed enterprise networks. In addition to that, the policy table needs to be manually configured by administrators, which makes that solution prone to human error.

o  No mechanism exists for making DHCPv6 servers aware of topology or routing changes in the network. In general, having DHCPv6 servers monitor network-related events sounds like a bad idea, as it requires completely new functionality beyond the scope of the DHCPv6 role.

6.3.2.  Controlling Source Address Selection With Router Advertisements

The same mechanism as discussed in Section 6.2.2 can be used to control the source address selection in the case of an uplink failure. If a particular prefix should not be used as a source for any destination, then the router needs to send an RA with the Preferred Lifetime field for that prefix set to 0.

Let's consider a scenario when all uplinks are operational and H41 receives two different RAs from R3: one from LLA_A with a PIO for 2001:db8:0:a020::/64 and default router preference set to 11 (low), and another one from LLA_B with a PIO for 2001:db8:0:b020::/64, default router preference set to 01 (high) and a RIO for 2001:db8:0:6666::/64. As a result, H41 is using 2001:db8:0:b020::41 as a source address for all Internet traffic, and those packets are sent by SERs to ISP-B. If the SERb1 uplink to ISP-B fails, the desired behavior is that H41 stops using 2001:db8:0:b020::41 as a source address for all destinations but H61. To achieve that, R3 should react to the SERb1 uplink failure (which could be detected as the disappearance of the scoped route (S=2001:db8:0:b000::/52, D=::/0)) by withdrawing itself as a default router. R3 sends a new RA from LLA_B with the Router Lifetime value set to 0 (which means that it should not be used as a default router). That RA still contains a PIO for 2001:db8:0:b020::/64 (for SLAAC purposes) and a RIO for 2001:db8:0:6666::/64, so H41 can reach H61 using LLA_B as a next-hop and 2001:db8:0:b020::41 as a source address. For all traffic following the default route, LLA_A will be used as a next-hop and 2001:db8:0:a020::41 as a source address.

If all uplinks to ISP-B have failed and therefore source addresses from the ISP-B address space should not be used at all, the forwarding table scoped to S=2001:db8:0:b000::/52 contains no entries.
Hosts can be instructed to stop using source addresses from that block by sending RAs containing a PIO with Preferred Lifetime set to 0.

6.3.3.  Controlling Source Address Selection With ICMPv6

Now we look at how ICMPv6 messages can provide information back to H31. We assume again that at the time of the failure H31 is sending packets to H501 using (S=2001:db8:0:b010::31, D=2001:db8:0:5678::501). When the uplink from SERb1 to ISP-B fails, SERb1 would stop originating its source-prefix-scoped route for the default destination (S=2001:db8:0:b000::/52, D=::/0) as well as its unscoped default destination route. With these routes no longer in the IGP, traffic with (S=2001:db8:0:b010::31, D=2001:db8:0:5678::501) would end up at SERa based on the unscoped default destination route being originated by SERa. Since that traffic has the wrong source address to be forwarded to ISP-A, SERa would drop it and send a Destination Unreachable message with Code 5 (Source address failed ingress/egress policy) back to H31. H31 would then know to use another source address for that destination and would try with (S=2001:db8:0:a010::31, D=2001:db8:0:5678::501). This would be forwarded to SERa based on the source-prefix-scoped default destination route still being originated by SERa, and SERa would forward it to ISP-A. As discussed above, if we are willing to extend ICMPv6, SERa can even tell H31 what source address it should use to reach that destination. The expected host behaviour has been discussed in Section 6.2.3. Using ICMPv6 would have the same scalability and rate-limiting issues discussed in Section 6.2.3. An ISP-B uplink failure immediately makes source addresses from 2001:db8:0:b000::/52 unsuitable for external communication and might trigger a large number of ICMPv6 packets being sent to hosts in that subnet.

6.3.4.  Summary Of Methods For Controlling Source Address Selection On The Failure Of An Uplink

It appears that DHCPv6 is not particularly well suited to quickly changing the source address used by a host in the event of the failure of an uplink, which eliminates DHCPv6 from the list of potential solutions. On the other hand, Router Advertisements provide a reliable mechanism to dynamically provide hosts with a list of valid prefixes to use as source addresses, as well as to prevent particular prefixes from being used. While no additional new features are required to be implemented on hosts, routers need to be able to send RAs based on the state of scoped forwarding table entries and to react to network topology changes by sending RAs with particular parameters set.

The use of ICMPv6 Destination Unreachable messages generated by the SERs (or any SADR-capable routers) seems to have the potential to provide a support mechanism together with RAs to signal source address selection errors back to hosts. However, scalability issues may arise in large networks in case of a sudden topology change. Therefore, it is highly desirable that hosts are able to select the correct source address in case of an uplink failure, with ICMPv6 being an additional mechanism to signal unexpected failures back to hosts.

The current behavior of different host operating systems when receiving an ICMPv6 Destination Unreachable message with Code 5 (Source address failed ingress/egress policy) is not clear to the authors. Information from implementers, users, and testing would be quite helpful in evaluating this approach.

6.4.  Selecting Source Address Upon Failed Uplink Recovery

The next logical step is to look at the scenario when a failed uplink on SERb1 to ISP-B is coming back up, so hosts can start using source addresses belonging to 2001:db8:0:b000::/52 again.

6.4.1.  Controlling Source Address Selection With DHCPv6

The mechanism to use DHCPv6 to instruct the hosts (H31 in our example) to start using prefixes from ISP-B space (e.g. S=2001:db8:0:b010::31 for H31) to reach hosts on the Internet is quite similar to the one discussed in Section 6.3.1 and shares the same drawbacks.

6.4.2.  Controlling Source Address Selection With Router Advertisements

Let's look at the scenario discussed in Section 6.3.2. If the failure of the uplink(s) caused the complete withdrawal of prefixes from the 2001:db8:0:b000::/52 address space by setting the Preferred Lifetime value to 0, then the recovery of the link should just trigger a new RA being sent with a non-zero Preferred Lifetime. In another scenario discussed in Section 6.3.2, the failure of the SERb1 uplink to ISP-B leads to the disappearance of the (S=2001:db8:0:b000::/52, D=::/0) entry from the forwarding table scoped to S=2001:db8:0:b000::/52 and, in turn, causes R3 to send RAs from LLA_B with Router Lifetime set to 0. The recovery of the SERb1 uplink to ISP-B leads to the reappearance of the (S=2001:db8:0:b000::/52, D=::/0) scoped forwarding entry and instructs R3 that it should advertise itself as a default router for the ISP-B address space domain (send RAs from LLA_B with a non-zero Router Lifetime).

6.4.3.  Controlling Source Address Selection With ICMPv6

It looks like ICMPv6 provides rather limited functionality to signal back to hosts that particular source addresses have become valid again. Unless the change in the uplink state affects a particular (S,D) pair, hosts can keep using the same source address even after an ISP uplink has come back up. For example, after the uplink from SERb1 to ISP-B had failed, H31 received an ICMPv6 Code 5 message (as described in Section 6.3.3) and presumably started using (S=2001:db8:0:a010::31, D=2001:db8:0:5678::501) to reach H501.
Now when the SERb1 uplink comes back up, the packets with that (S,D) pair are still routed to SERa and sent to the Internet. Therefore H31 is not informed that it should stop using 2001:db8:0:a010::31 and start using 2001:db8:0:b010::31 again. Unless SERa has a policy configured to drop packets with (S=2001:db8:0:a010::31, D=2001:db8:0:5678::501) and send ICMPv6 back if the SERb1 uplink to ISP-B is up, H31 will be unaware of the network topology change and keep using S=2001:db8:0:a010::31 for Internet destinations, including H501.

One possible option is to use a scoped route with the EXCLUSIVE flag as described in Section 6.2.3. SERa uplink recovery would cause the (S=2001:db8:0:a000::/52, D=2001:db8:0:1234::/64) route to reappear in the routing table. In the absence of that route, packets to H101 were sent to ISP-B (as the ISP-A uplink was down) with source addresses from 2001:db8:0:b000::/52. When the route reappears, SERb1 would reject those packets and send ICMPv6 back as discussed in Section 6.2.3. Practically, it might lead to the scalability issues which have already been discussed in Section 6.2.3 and Section 6.3.3.

6.4.4.  Summary Of Methods For Controlling Source Address Selection Upon Failed Uplink Recovery

Once again, DHCPv6 does not look like a reasonable choice to manipulate the source address selection process on a host in the case of network topology changes. Router Advertisements provide a flexible mechanism to dynamically react to network topology changes (if routers are able to use routing changes as a trigger for sending out RAs with specific parameters). ICMPv6 could be considered as a supporting mechanism to signal an incorrect source address back to hosts but should not be considered as the only mechanism to control the address selection in multihomed environments.

6.5.  Selecting Source Address When All Uplinks Failed

One particularly tricky case is a scenario when all uplinks have failed. In that case there is no valid source address to be used for any external destinations, while it might be desirable to have intra-site connectivity.

6.5.1.  Controlling Source Address Selection With DHCPv6

From the DHCPv6 perspective, the failure of the uplinks should be treated as two independent failures and processed as described in Section 6.3.1. At this stage it is quite obvious that it would result in a quite complicated policy table which needs to be explicitly configured by administrators and therefore seems to be impractical.

6.5.2.  Controlling Source Address Selection With Router Advertisements

As discussed in Section 6.3.2, an uplink failure causes the scoped default entry to disappear from the scoped forwarding table and triggers RAs with zero Router Lifetime. Complete disappearance of all scoped entries for a given source prefix would cause the prefix to be withdrawn from hosts by setting the Preferred Lifetime value to zero in the PIO. If all uplinks (SERa, SERb1 and SERb2) fail, hosts lose their default routers and/or have no global IPv6 addresses to use as a source. (Note that 'uplink failure' might mean 'IPv6 connectivity failure with IPv4 still being reachable', in which case hosts might fall back to IPv4 if there is IPv4 connectivity to destinations.) As a result, intra-site connectivity is broken. One possible way to solve this is to use ULAs.

In that approach, all hosts have ULAs assigned in addition to GUAs and use them for intra-site communication even if there is no GUA assigned to a host. To avoid accidental leaking of packets with ULA sources, SADR-capable routers SHOULD have a scoped forwarding table for ULA sources for internal routes but MUST NOT have an entry for D=::/0 in that table.
In the absence of (S=ULA_Prefix; D=::/0) first-hop routers 1802 will send dedicated RAs from a unique link-local source LLA_ULA with 1803 PIO from ULA address space, RIO for the ULA prefix and Router 1804 Lifetime set to zero. The behaviour is consistent with the situation 1805 when SERb1 lost the uplink to ISP-B (so there is no Internet 1806 connectivity from 2001:db8:0:b000::/52 sources) but those sources can 1807 be used to reach some specific destinations. In the case of ULA 1808 there is no Internet connectivity from ULA sources but they can be 1809 used to reach another ULA destinations. Note that ULA usage could be 1810 particularly useful if all ISPs assign prefixes via DHCP-PD. In the 1811 absence of ULAs upon the all uplinks failure hosts would lost all 1812 their GUAs upon prefix lifetime expiration which again makes intra- 1813 site communication impossible. 1815 It should be noted that the Rule 5.5 (prefer a prefix advertised by 1816 the selected next-hop) takes precedence over the Rule 6 (prefer 1817 matching label, which ensures that GUA source addresses are preferred 1818 over ULAs for GUA destinations). Therefore if ULAs are used, the 1819 network administrator needs to ensure that while the site has an 1820 Internet connectivity, hosts do not select a router which advertises 1821 ULA prefixes as their default router. 1823 6.5.3. Controlling Source Address Selection With ICMPv6 1825 In case of all uplinks failure all SERs will drop outgoing IPv6 1826 traffic and respond with ICMPv6 error message. In the large network 1827 when many hosts are trying to reach Internet destinations it means 1828 that SERs need to generate an ICMPv6 error to every packet they 1829 receive from hosts which presents the same scalability issues 1830 discussed in Section 6.3.3 1832 6.5.4. 
Summary Of Methods For Controlling Source Address Selection When
All Uplinks Failed

   Again, combining SADR with Router Advertisements seems to be the
   most flexible and scalable way to control source address selection
   on hosts.

6.6.  Summary Of Methods For Controlling Source Address Selection

   To summarize the scenarios and options discussed above:

   While DHCPv6 allows administrators to manipulate source address
   selection policy tables, this method has a number of significant
   disadvantages which eliminate DHCPv6 from the list of potential
   solutions:

   1.  It requires hosts to support DHCPv6 and its extension (RFC7078).

   2.  The DHCPv6 server needs to monitor network state and detect
       routing changes.

   3.  The use of policy tables requires manual configuration and might
       be extremely complicated, especially in the case of a
       distributed network where a large number of remote sites are
       served by centralized DHCPv6 servers.

   4.  Network topology/routing policy changes could trigger
       simultaneous re-configuration of a large number of hosts, which
       presents serious scalability issues.

   The use of Router Advertisements to influence source address
   selection on hosts seems to be the most reliable, flexible and
   scalable solution.  It has the following benefits:

   1.  no new (non-standard) functionality needs to be implemented on
       hosts (except for [RFC4191] support);

   2.  no changes in RA format;

   3.  routers can react to routing table changes by sending RAs, which
       would minimize the failover time in the case of network topology
       changes;

   4.  information required for source address selection is broadcast
       to all affected hosts in case of a topology change event, which
       improves the scalability of the solution (compared to DHCPv6
       reconfiguration or ICMPv6 error messages).
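
   As an illustration only, the router behaviour summarized above (and
   discussed in Section 6.5.2) could be sketched as follows.  The
   sketch derives the parameters of one dedicated RA from the contents
   of one scoped forwarding table: Router Lifetime is zero when the
   table has no D=::/0 entry, and the PIO Preferred Lifetime is zero
   when the table has no entries at all.  The names ScopedRA and
   build_scoped_ra, the non-zero lifetime values and the ULA prefixes
   used in the usage note are assumptions made for this example; they
   are not defined by this document.

```python
from dataclasses import dataclass
from ipaddress import IPv6Network

DEFAULT_ROUTE = IPv6Network("::/0")


@dataclass
class ScopedRA:
    """Parameters of one dedicated RA (hypothetical structure)."""
    pio_prefix: IPv6Network   # prefix advertised in the PIO
    router_lifetime: int      # seconds; 0 means "not a default router"
    preferred_lifetime: int   # seconds; 0 means "addresses deprecated"
    rio_prefixes: list        # more-specific routes advertised in RIOs


def build_scoped_ra(pio_prefix, scoped_routes,
                    router_lifetime=1800, preferred_lifetime=604800):
    """Derive RA parameters from one scoped forwarding table.

    scoped_routes lists the destination prefixes currently present in
    the scoped forwarding table for the given source prefix.  The
    non-zero lifetimes are illustrative values only.
    """
    routes = [IPv6Network(r) for r in scoped_routes]
    has_default = DEFAULT_ROUTE in routes
    return ScopedRA(
        pio_prefix=IPv6Network(pio_prefix),
        # No scoped default entry: advertise zero Router Lifetime.
        router_lifetime=router_lifetime if has_default else 0,
        # No scoped entries at all: deprecate the prefix on hosts.
        preferred_lifetime=preferred_lifetime if routes else 0,
        # Remaining specific destinations are signalled via RIOs.
        rio_prefixes=[r for r in routes if r != DEFAULT_ROUTE],
    )
```

   For example, build_scoped_ra("2001:db8:0:a010::/64", []) models the
   case where all scoped entries for the ISP-A prefix have disappeared
   and yields zero for both lifetimes, while a ULA-scoped table such as
   build_scoped_ra("fd00:0:0:10::/64", ["fd00::/48"]) (hypothetical
   prefixes) yields zero Router Lifetime with a non-zero Preferred
   Lifetime, since that table never contains D=::/0.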

   To fully benefit from the RA-based solution, first-hop routers need
   to implement SADR and be able to send dedicated RAs per scoped
   forwarding table as discussed above, reacting to network changes by
   sending new RAs.  It should be noted that the proposed solution
   would work even if first-hop routers are not SADR-capable but are
   still able to send individual RAs for each ISP prefix and react to
   topology changes as discussed above (e.g. via configuration knobs).

   The RA-based solution relies heavily on hosts correctly implementing
   the default address selection algorithm as defined in [RFC6724].
   While the basic (and most common) multihoming scenario (two or more
   Internet uplinks, no 'walled gardens') would work for any host
   supporting the minimal implementation of [RFC6724], more complex use
   cases (such as 'walled gardens' and other scenarios where some ISP
   resources can only be reached from that ISP's address space) require
   that hosts support Rule 5.5 of the default address selection
   algorithm.  There is some evidence that not all host OSes have that
   rule implemented currently.  However, it should be noted that
   [RFC8028] states that Rule 5.5 should be implemented.

   The ICMPv6 Code 5 error message SHOULD be used to complement the
   RA-based solution to signal incorrect source address selection back
   to hosts, but it SHOULD NOT be considered as a stand-alone solution.
   To prevent scenarios where hosts in multihomed environments
   incorrectly identify onlink/offlink destinations, hosts SHOULD treat
   ICMPv6 Redirects as discussed in [RFC8028].

6.7.  Solution Limitations

6.7.1.  Connections Preservation

   The proposed solution is not designed to preserve connection state
   in case of an uplink failure.
   When all uplinks to an ISP go down, all transport connections
   established to/from that ISP's address space will be interrupted
   (unless the transport protocol has specific multihoming support).
   That behaviour is similar to the scenario of IPv4 multihoming with
   NAT, when an uplink failure causes all connections to be NATed to
   completely different public IPv4 addresses.  While it does sound
   suboptimal, it is determined by the nature of PA address space: if
   all uplinks to a particular ISP have failed, there is no path for
   the ingress traffic to reach the site, and the egress traffic is
   supposed to be dropped by the BCP38 [RFC2827] ingress filters.  The
   only potential way to overcome this limitation would be running BGP
   with all ISPs and advertising all site prefixes to all uplinks - a
   solution which shares all the drawbacks of using PI address space
   without having its benefits.  Networks willing and capable of
   running BGP and using PI are out of scope of this document.

   It should be noted that in the case of IPv4 NAT-based multihoming,
   uplink recovery could cause connection interruptions as well (unless
   packet forwarding is integrated with existing NAT session tracking
   so that the egress interface for existing sessions is not changed).
   However, the proposed solution has the benefit of preserving
   existing sessions during and after the restoration of a failed
   uplink.  Unlike the uplink failure event, which causes all addresses
   from the affected prefix to be deprecated, the recovery would just
   add new preferred addresses to a host without making any addresses
   unavailable.  Therefore connections established to/from those
   addresses do not have to be interrupted.

   While it's desirable for active connections to survive ISP failover
   events, for sites using PA address space such events affect the
   reachability of IP addresses assigned to hosts.
   Unless the transport (or even higher-level) protocols are capable of
   surviving host renumbering, the active connections will be broken.
   The proposed solution focuses on minimizing the impact of failover
   for new connections and for multipath-aware protocols.

6.8.  Other Configuration Parameters

6.8.1.  DNS Configuration

   In a multihomed environment each ISP might provide its own list of
   DNS servers.  For example, in the topology shown in Figure 3, ISP-A
   might provide recursive DNS server H51 2001:db8:0:5555::51, while
   ISP-B might provide H61 2001:db8:0:6666::61 as a recursive DNS
   server.  [RFC8106] defines IPv6 Router Advertisement options to
   allow IPv6 routers to advertise a list of DNS recursive server
   addresses and a DNS Search List to IPv6 hosts.  Using RDNSS together
   with 'scoped' RAs as described above would allow a first-hop router
   (R3 in Figure 3) to send DNS server addresses and search lists
   provided by each ISP (or the corporate DNS server addresses if the
   enterprise is running its own DNS servers - as discussed below, the
   DNS split-horizon problem is too hard to solve without running a
   local DNS server).

   As discussed in Section 6.5.2, failure of all ISP uplinks would
   cause deprecation of all addresses assigned to a host from the
   address space of all ISPs.  If any intra-site IPv6 connectivity is
   still desirable (most likely to be the case for any mid/large scale
   network), then ULAs should be used as discussed in Section 6.5.2.
   In such a scenario, the enterprise network should run its own
   recursive DNS server(s) and provide their ULA addresses to hosts via
   RDNSS in RAs sent for the ULA-scoped forwarding table as described
   in Section 6.5.2.
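
   As an illustration only, the following sketch shows how a host
   supporting RIO [RFC4191] and Rule 5.5 of [RFC6724] might pick a
   next-hop and a source address for packets sent to a recursive DNS
   server learned from 'scoped' RAs.  It is a simplification under
   stated assumptions: the route preference field of [RFC4191] and all
   other rules of [RFC6724] are ignored, and the names ReceivedRA,
   select_next_hop and select_source, as well as the ULA prefixes and
   addresses in the usage note, are hypothetical and not defined by
   this document.

```python
from dataclasses import dataclass, field
from ipaddress import IPv6Address, IPv6Network


@dataclass
class ReceivedRA:
    """What the host remembers from one RA (hypothetical structure)."""
    router: str               # sender's link-local address, e.g. "LLA_A"
    router_lifetime: int      # 0 means "not a default router"
    pio: list                 # IPv6Network prefixes from PIOs
    rio: list = field(default_factory=list)  # IPv6Network routes from RIOs


def select_next_hop(ras, dst):
    """Longest-prefix match over RIOs and default routers."""
    dst = IPv6Address(dst)
    candidates = []
    for ra in ras:
        if ra.router_lifetime > 0:
            candidates.append((0, ra))            # default route
        for route in ra.rio:
            if dst in route:
                candidates.append((route.prefixlen, ra))
    if not candidates:
        return None                               # no route to dst
    return max(candidates, key=lambda c: c[0])[1]


def select_source(ras, host_addresses, dst):
    """Rule 5.5 only: prefer an address from a prefix advertised by
    the selected next-hop.  Returns (router, source_address)."""
    next_hop = select_next_hop(ras, dst)
    if next_hop is None:
        return None, None
    for addr in host_addresses:
        if any(IPv6Address(addr) in prefix for prefix in next_hop.pio):
            return next_hop.router, addr
    # A real host would continue with the remaining RFC 6724 rules.
    return next_hop.router, None
```

   With an RA from LLA_A carrying a PIO for 2001:db8:0:a010::/64 and an
   RIO covering 2001:db8:0:5555::51, a host holding 2001:db8:0:a010::31
   would, in this sketch, select LLA_A and that address to reach the
   ISP-A DNS server.  With an additional ULA-scoped RA (Router Lifetime
   zero, hypothetical PIO fd00:0:0:10::/64 and RIO fd00::/48), a local
   recursive DNS server addressed from the ULA prefix would be reached
   via LLA_ULA using the host's ULA.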

   There are some scenarios where the final outcome of the name
   resolution might be different depending on:

   o  which DNS server is used;

   o  which source address the client uses to send a DNS query to the
      server (DNS split horizon).

   There is currently no way to instruct a host to use a particular DNS
   server out of the configured server list for resolving a particular
   name.  Therefore it does not seem feasible to solve the problem of
   DNS server selection on the host (it should be noted that this
   particular issue is protocol-agnostic and happens for IPv4 as well).
   In such a scenario it is recommended that the enterprise run its own
   local recursive DNS server.

   To influence host source address selection for packets sent to a
   particular DNS server, the following requirements must be met:

   o  the host supports RIO as defined in [RFC4191];

   o  the routers send RIOs for routes to DNS server addresses.

   For example, if it is desirable that host H31 reaches the ISP-A DNS
   server H51 2001:db8:0:5555::51 using its source address
   2001:db8:0:a010::31, then both R1 and R2 should send the RIO
   containing the route to 2001:db8:0:5555::51 (or a covering route) in
   their 'scoped' RAs, containing LLA_A as the default router address
   and the PIO for the SLAAC prefix 2001:db8:0:a010::/64.  In that case
   the host H31 (if it supports Rule 5.5) would select LLA_A as a
   next-hop and then choose 2001:db8:0:a010::31 as the source address
   for packets to the DNS server.

   It should be noted that [RFC6106] explicitly prohibits using DNS
   information if the RA router Lifetime expired: "An RDNSS address or
   a DNSSL domain name MUST be used only as long as both the RA router
   Lifetime (advertised by a Router Advertisement message) and the
   corresponding option Lifetime have not expired.".
   Therefore hosts might ignore RDNSS information provided in
   ULA-scoped RAs, as those RAs would have router lifetime set to 0.
   However, the updated version of RFC6106 ([RFC8106]) has that
   requirement removed.

   As discussed above, the DNS split-horizon problem and selecting the
   correct DNS server in a multihomed environment are not easy to
   solve.  The proper solution would require hosts to support the
   concept of multiple Provisioning Domains (PvD, a set of
   configuration information associated with a network, [RFC7556]).

7.  Deployment Considerations

   The solution described in this document requires certain mechanisms
   to be supported by the network infrastructure and hosts.  It
   requires some routers in the enterprise site to support some form of
   Source Address Dependent Routing (SADR).  It also requires hosts to
   be able to learn when the uplink to an ISP changes its state so that
   the corresponding source addresses should (or should not) be used.
   Ongoing work to create mechanisms to accomplish this is discussed in
   this document, but those mechanisms are still a work in progress.

7.1.  Deploying SADR Domain

   The proposed solution does not prescribe particular details
   regarding deploying an SADR domain within a multihomed enterprise
   network.  However, the following guidelines could be applied:

   o  The SADR domain is usually limited by the multihomed site border.

   o  The minimal deployable scenario requires enabling SADR on all
      SERs and including them in a single SADR domain.

   o  As discussed in Section 4.2, extending the connected SADR domain
      beyond that point down to the first-hop routers can produce more
      efficient forwarding paths and allow the network to fully benefit
      from SADR.  It would also simplify the operation of the SADR
      domain.

7.2.
Hosts-Related Considerations

   The solution discussed in this document relies on Rule 5.5 of the
   default address selection algorithm ([RFC6724]).  While [RFC6724]
   considers this rule optional, the more recent [RFC8028] states that
   "A host SHOULD select default routers for each prefix it is assigned
   an address in".  It also recommends that hosts implement Rule 5.5 of
   [RFC6724].  Therefore, while RFC8028-compliant hosts already have a
   mechanism to learn about ISP uplink state changes and to select
   source addresses accordingly, many hosts do not support such a
   mechanism yet.

   It should be noted that a multihomed enterprise network utilizing
   multiple ISP prefixes can be considered a typical multiple
   provisioning domain (mPVD) scenario, as described in [RFC7556].
   This document defines a way for the network to provide the PVD
   information to hosts indirectly, using existing mechanisms.  At the
   same time, [I-D.ietf-intarea-provisioning-domains] goes one step
   further and describes a comprehensive mechanism for hosts to
   discover the whole set of configuration information associated with
   different PVDs/ISPs.  [I-D.ietf-intarea-provisioning-domains]
   complements this document in terms of enabling hosts to learn about
   ISP uplink states and to select the corresponding source addresses.

8.  Other Solutions

8.1.  Shim6

   The Shim6 working group specified the Shim6 protocol [RFC5533],
   which allows a host at a multihomed site to communicate with an
   external host and exchange information about possible source and
   destination address pairs that they can use to communicate.  It also
   specified the REAP protocol [RFC5534] to detect failures in the path
   between working address pairs and find new working address pairs.  A
   fundamental requirement for Shim6 is that both internal and external
   hosts need to support Shim6.
   That is, both the host internal to the multihomed site and the host
   external to the multihomed site need to support Shim6 in order for
   there to be any benefit for the internal host to run Shim6.  The
   Shim6 protocol specification was published in 2009, but it has not
   been widely implemented.  Therefore Shim6 is not considered a viable
   solution for enterprise multihoming.

8.2.  IPv6-to-IPv6 Network Prefix Translation

   IPv6-to-IPv6 Network Prefix Translation (NPTv6) [RFC6296] is not the
   focus of this document.  NPTv6 suffers from the same fundamental
   issue as any other address translation approach: it breaks
   end-to-end connectivity.  Therefore NPTv6 is not considered a
   desirable solution, and this document intentionally focuses on
   solving the enterprise multihoming problem without any form of
   address translation.

   With increasing interest and ongoing work in bringing path awareness
   to transport and application layer protocols, hosts might be able to
   determine the properties of the various network paths and choose
   among the paths available to them.  As selecting the correct source
   address is one of the possible mechanisms path-aware hosts may
   utilize, address translation negatively affects hosts' path
   awareness, which makes NPTv6 an even more undesirable solution.

8.3.  Multipath Transport

   Using multipath transport (such as MPTCP [RFC6824] or multipath
   capabilities in QUIC) might solve the problems discussed in
   Section 6, since it would allow hosts to use multiple source
   addresses for a single connection and switch between source
   addresses when a particular address becomes unavailable or a new
   address gets assigned to the host interface.
   Therefore if all hosts in the enterprise network are only using
   multipath transport for all connections, the signaling solution
   described in Section 6 might not be needed (it should be noted that
   Source Address Dependent Routing would still be required to deliver
   packets to the correct uplinks).  At the time this document was
   written, multipath transport alone could not be considered a
   solution for the problem of selecting the source address in a
   multihomed environment.  There is a significant number of hosts
   which do not use multipath transport currently, and it seems
   unlikely that the situation is going to change in the foreseeable
   future (even if new releases of operating systems get multipath
   protocol support, there will be a long tail of legacy hosts).  The
   solution for enterprise multihoming needs to work for the least
   common denominator: hosts without multipath transport support.  In
   addition, not all protocols are using multipath transport.  While
   multipath transport would complement the solution described in
   Section 6, it could not be considered the sole solution to the
   problem of source address selection in multihomed environments.

9.  IANA Considerations

   This memo asks the IANA for no new parameters.

10.  Security Considerations

   Section 6.2.3 discusses a mechanism for controlling source address
   selection on hosts using ICMPv6 messages.  It describes how an
   attacker could exploit this mechanism by sending spoofed ICMPv6
   messages.  It recommends that a given host verify that the original
   packet header included in the ICMPv6 error message was actually sent
   by the host itself.

   The security considerations of using stateless address
   autoconfiguration are discussed in [RFC4862].

11.  Acknowledgements

   The original outline was suggested by Ole Troan.

   The authors would like to thank the following people (in
   alphabetical order) for their review and feedback: Olivier
   Bonaventure, Deborah Brungard, Brian E Carpenter, Lorenzo Colitti,
   Roman Danyliw, Benjamin Kaduk, Suresh Krishnan, Mirja Kuhlewind,
   David Lamparter, Nicolai Leymann, Acee Lindem, Philip Matthews,
   Robert Raszuk, Alvaro Retana, Dave Thaler, Michael Tuxen, Martin
   Vigoureux, Eric Vyncke, Magnus Westerlund.

12.  References

12.1.  Normative References

   [RFC1918]  Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G.,
              and E. Lear, "Address Allocation for Private Internets",
              BCP 5, RFC 1918, DOI 10.17487/RFC1918, February 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2827]  Ferguson, P. and D. Senie, "Network Ingress Filtering:
              Defeating Denial of Service Attacks which employ IP
              Source Address Spoofing", BCP 38, RFC 2827,
              DOI 10.17487/RFC2827, May 2000.

   [RFC4191]  Draves, R. and D. Thaler, "Default Router Preferences and
              More-Specific Routes", RFC 4191, DOI 10.17487/RFC4191,
              November 2005.

   [RFC4193]  Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
              Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005.

   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
              Architecture", RFC 4291, DOI 10.17487/RFC4291, February
              2006.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC4861]  Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
              "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
              DOI 10.17487/RFC4861, September 2007.

   [RFC4862]  Thomson, S., Narten, T., and T.
              Jinmei, "IPv6 Stateless Address Autoconfiguration",
              RFC 4862, DOI 10.17487/RFC4862, September 2007.

   [RFC6106]  Jeong, J., Park, S., Beloeil, L., and S. Madanapalli,
              "IPv6 Router Advertisement Options for DNS
              Configuration", RFC 6106, DOI 10.17487/RFC6106, November
              2010.

   [RFC6296]  Wasserman, M. and F. Baker, "IPv6-to-IPv6 Network Prefix
              Translation", RFC 6296, DOI 10.17487/RFC6296, June 2011.

   [RFC6724]  Thaler, D., Ed., Draves, R., Matsumoto, A., and T. Chown,
              "Default Address Selection for Internet Protocol Version
              6 (IPv6)", RFC 6724, DOI 10.17487/RFC6724, September
              2012.

   [RFC7078]  Matsumoto, A., Fujisaki, T., and T. Chown, "Distributing
              Address Selection Policy Using DHCPv6", RFC 7078,
              DOI 10.17487/RFC7078, January 2014.

   [RFC7556]  Anipko, D., Ed., "Multiple Provisioning Domain
              Architecture", RFC 7556, DOI 10.17487/RFC7556, June 2015.

   [RFC8028]  Baker, F. and B. Carpenter, "First-Hop Router Selection
              by Hosts in a Multi-Prefix Network", RFC 8028,
              DOI 10.17487/RFC8028, November 2016.

   [RFC8106]  Jeong, J., Park, S., Beloeil, L., and S. Madanapalli,
              "IPv6 Router Advertisement Options for DNS
              Configuration", RFC 8106, DOI 10.17487/RFC8106, March
              2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8415]  Mrugalski, T., Siodelski, M., Volz, B., Yourtchenko, A.,
              Richardson, M., Jiang, S., Lemon, T., and T. Winters,
              "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)",
              RFC 8415, DOI 10.17487/RFC8415, November 2018.

12.2.  Informative References

   [I-D.ietf-intarea-provisioning-domains]
              Pfister, P., Vyncke, E., Pauly, T., Schinazi, D., and W.
              Shao, "Discovering Provisioning Domain Names and Data",
              draft-ietf-intarea-provisioning-domains-05 (work in
              progress), June 2019.

   [I-D.ietf-rtgwg-dst-src-routing]
              Lamparter, D. and A. Smirnov, "Destination/Source
              Routing", draft-ietf-rtgwg-dst-src-routing-07 (work in
              progress), March 2019.

   [I-D.pfister-6man-sadr-ra]
              Pfister, P., "Source Address Dependent Route Information
              Option for Router Advertisements",
              draft-pfister-6man-sadr-ra-01 (work in progress), June
              2015.

   [RFC3704]  Baker, F. and P. Savola, "Ingress Filtering for
              Multihomed Networks", BCP 84, RFC 3704,
              DOI 10.17487/RFC3704, March 2004.

   [RFC4941]  Narten, T., Draves, R., and S. Krishnan, "Privacy
              Extensions for Stateless Address Autoconfiguration in
              IPv6", RFC 4941, DOI 10.17487/RFC4941, September 2007.

   [RFC5533]  Nordmark, E. and M. Bagnulo, "Shim6: Level 3 Multihoming
              Shim Protocol for IPv6", RFC 5533, DOI 10.17487/RFC5533,
              June 2009.

   [RFC5534]  Arkko, J. and I. van Beijnum, "Failure Detection and
              Locator Pair Exploration Protocol for IPv6 Multihoming",
              RFC 5534, DOI 10.17487/RFC5534, June 2009.

   [RFC6434]  Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node
              Requirements", RFC 6434, DOI 10.17487/RFC6434, December
              2011.

   [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
              "TCP Extensions for Multipath Operation with Multiple
              Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013.

   [RFC7676]  Pignataro, C., Bonica, R., and S. Krishnan, "IPv6 Support
              for Generic Routing Encapsulation (GRE)", RFC 7676,
              DOI 10.17487/RFC7676, October 2015.

Authors' Addresses

   Fred Baker
   Santa Barbara, California  93117
   USA

   Email: FredBaker.IETF@gmail.com


   Chris Bowers
   Juniper Networks
   Sunnyvale, California  94089
   USA

   Email: cbowers@juniper.net


   Jen Linkova
   Google
   1 Darling Island Rd
   Pyrmont, NSW  2009
   AU

   Email: furry@google.com