idnits 2.17.1

draft-ietf-rtgwg-enterprise-pa-multihoming-11.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 3, 2019) is 1731 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 6106 (Obsoleted by RFC 8106)

  == Outdated reference: A later version (-11) exists of
     draft-ietf-intarea-provisioning-domains-05

  -- Obsolete informational reference (is this intentional?): RFC 4941
     (Obsoleted by RFC 8981)

  -- Obsolete informational reference (is this intentional?): RFC 6434
     (Obsoleted by RFC 8504)

  -- Obsolete informational reference (is this intentional?): RFC 6824
     (Obsoleted by RFC 8684)

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Routing Working Group                                           F. Baker
Internet-Draft
Intended status: Informational                                 C. Bowers
Expires: January 4, 2020                                Juniper Networks
                                                              J. Linkova
                                                                  Google
                                                            July 3, 2019

 Enterprise Multihoming using Provider-Assigned IPv6 Addresses without
        Network Prefix Translation: Requirements and Solutions
             draft-ietf-rtgwg-enterprise-pa-multihoming-11

Abstract

   Connecting an enterprise site to multiple ISPs over IPv6 using
   provider-assigned addresses is difficult without the use of some form
   of Network Address Translation (NAT).  Much has been written on this
   topic over the last 10 to 15 years, but it still remains a problem
   without a clearly defined or widely implemented solution.  Any
   multihoming solution without NAT requires hosts at the site to have
   addresses from each ISP and to select the egress ISP by selecting a
   source address for outgoing packets.  It also requires routers at the
   site to take into account those source addresses when forwarding
   packets out towards the ISPs.

   This document examines currently available mechanisms for providing a
   solution to this problem for a broad range of enterprise topologies.
   It covers the behavior of routers to forward traffic taking into
   account source address, and it covers the behavior of hosts to select
   appropriate default source addresses.  It also covers any possible
   role that routers might play in providing information to hosts to
   help them select appropriate source addresses.  In the process of
   exploring potential solutions, this document also makes explicit
   requirements for how the solution would be expected to behave from
   the perspective of an enterprise site network administrator.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 4, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements Language
   3.  Terminology
   4.  Enterprise Multihoming Use Cases
     4.1.  Simple ISP Connectivity with Connected SERs
     4.2.  Simple ISP Connectivity Where SERs Are Not Directly
           Connected
     4.3.  Enterprise Network Operator Expectations
     4.4.  More complex ISP connectivity
     4.5.  ISPs and Provider-Assigned Prefixes
     4.6.  Simplified Topologies
   5.  Generating Source-Prefix-Scoped Forwarding Tables
   6.  Mechanisms For Hosts To Choose Good Source Addresses In A
       Multihomed Site
     6.1.  Source Address Selection Algorithm on Hosts
     6.2.  Selecting Source Address When Both Uplinks Are Working
       6.2.1.  Distributing Address Selection Policy Table with
               DHCPv6
       6.2.2.  Controlling Source Address Selection With Router
               Advertisements
       6.2.3.  Controlling Source Address Selection With ICMPv6
       6.2.4.  Summary of Methods For Controlling Source Address
               Selection To Implement Routing Policy
     6.3.  Selecting Source Address When One Uplink Has Failed
       6.3.1.  Controlling Source Address Selection With DHCPv6
       6.3.2.  Controlling Source Address Selection With Router
               Advertisements
       6.3.3.  Controlling Source Address Selection With ICMPv6
       6.3.4.  Summary Of Methods For Controlling Source Address
               Selection On The Failure Of An Uplink
     6.4.  Selecting Source Address Upon Failed Uplink Recovery
       6.4.1.  Controlling Source Address Selection With DHCPv6
       6.4.2.  Controlling Source Address Selection With Router
               Advertisements
       6.4.3.  Controlling Source Address Selection With ICMP
       6.4.4.  Summary Of Methods For Controlling Source Address
               Selection Upon Failed Uplink Recovery
     6.5.  Selecting Source Address When All Uplinks Failed
       6.5.1.  Controlling Source Address Selection With DHCPv6
       6.5.2.  Controlling Source Address Selection With Router
               Advertisements
       6.5.3.  Controlling Source Address Selection With ICMPv6
       6.5.4.  Summary Of Methods For Controlling Source Address
               Selection When All Uplinks Failed
     6.6.  Summary Of Methods For Controlling Source Address
           Selection
     6.7.  Solution Limitations
       6.7.1.  Connections Preservation
     6.8.  Other Configuration Parameters
       6.8.1.  DNS Configuration
   7.  Deployment Considerations
     7.1.  Deploying SADR Domain
     7.2.  Hosts-Related Considerations
   8.  Other Solutions
     8.1.  Shim6
     8.2.  IPv6-to-IPv6 Network Prefix Translation
     8.3.  Multipath Transport
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   Site multihoming, the connection of a subscriber network to multiple
   upstream networks using redundant uplinks, is a common enterprise
   architecture for improving the reliability of its Internet
   connectivity.  If the site uses provider-independent (PI) addresses,
   all traffic originating from the enterprise can use source addresses
   from the PI address space.  Site multihoming with PI addresses is
   commonly used with both IPv4 and IPv6, and does not present any new
   technical challenges.

   It may be desirable for an enterprise site to connect to multiple
   ISPs using provider-assigned (PA) addresses, instead of PI addresses.
   Multihoming with provider-assigned addresses is typically less
   expensive for the enterprise relative to using provider-independent
   addresses, as it requires neither obtaining and maintaining PI
   address space nor running BGP between the enterprise and the ISPs
   (for small/medium networks, running BGP might be not only undesirable
   but impossible, especially if residential-type ISP connections are
   used).  PA multihoming is also a practice that should be facilitated
   and encouraged because it does not add to the size of the Internet
   routing table, whereas PI multihoming does.  Note that PA is also
   used to mean "provider-aggregatable".  In this document we assume
   that provider-assigned addresses are always provider-aggregatable.

   With PA multihoming, for each ISP connection, the site is assigned a
   prefix from within an address block allocated to that ISP by its
   National or Regional Internet Registry.  In the simple case of two
   ISPs (ISP-A and ISP-B), the site will have two different prefixes
   assigned to it (prefix-A and prefix-B).  This arrangement is
   problematic.  First, packets with the "wrong" source address may be
   dropped by one of the ISPs.  In order to limit denial of service
   attacks using spoofed source addresses, BCP38 [RFC2827] recommends
   that ISPs filter traffic from customer sites to only allow traffic
   with a source address that has been assigned by that ISP.  So a
   packet sent from a multihomed site on the uplink to ISP-B with a
   source address in prefix-A may be dropped by ISP-B.

   However, even if ISP-B does not implement BCP38 or ISP-B adds
   prefix-A to its list of allowed source addresses on the uplink from
   the multihomed site, two-way communication may still fail.
   If the packet with source address in prefix-A was sent to ISP-B
   because the uplink to ISP-A failed, then if ISP-B does not drop the
   packet and the packet reaches its destination somewhere on the
   Internet, the return packet will be sent back with a destination
   address in prefix-A.  The return packet will be routed over the
   Internet to ISP-A, but it will not be delivered to the multihomed
   site because the site uplink with ISP-A has failed.  Two-way
   communication would require some arrangement for ISP-B to advertise
   prefix-A when the uplink to ISP-A fails.

   Note that the same may be true with a provider that does not
   implement BCP 38, if its upstream provider does, or has no
   corresponding route to deliver the ingress traffic to the multihomed
   site.  The issue is not that the immediate provider implements
   ingress filtering; it is that someone upstream does (so egress
   traffic is blocked), or lacks a route (causing blackholing of the
   ingress traffic).

   Another issue with asymmetric traffic flow (when the egress traffic
   leaves the site via one ISP but the return traffic enters the site
   via another uplink) is related to stateful firewalls/middleboxes.
   Keeping state in that case might be problematic, even impossible.

   With IPv4, this problem is commonly solved by using [RFC1918] private
   address space within the multi-homed site and Network Address
   Translation (NAT) or Network Address/Port Translation (NAPT) on the
   uplinks to the ISPs.  However, one of the goals of IPv6 is to
   eliminate the need for and the use of NAT or NAPT.  Therefore,
   requiring the use of NAT or NAPT for an enterprise site to multihome
   with provider-assigned addresses is not an attractive solution.

   [RFC6296] describes a translation solution specifically tailored to
   meet the requirements of multi-homing with provider-assigned IPv6
   addresses.
   With the IPv6-to-IPv6 Network Prefix Translation (NPTv6) solution,
   within the site an enterprise can use Unique Local Addresses
   [RFC4193] or the prefix assigned by one of the ISPs.  As traffic
   leaves the site on an uplink to an ISP, the source address gets
   translated to an address within the prefix assigned by the ISP on
   that uplink in a predictable and reversible manner.  [RFC6296] is
   currently classified as Experimental, and it has been implemented by
   several vendors.  See Section 8.2 for more discussion of NPTv6.

   This document defines routing requirements for enterprise
   multihoming.  This document focuses on the following general class of
   solutions.

   Each host at the enterprise has multiple addresses, at least one from
   each ISP-assigned prefix.  Each host, as discussed in Section 6.1 and
   [RFC6724], is responsible for choosing the source address applied to
   each packet it sends.  A host is expected to be able to respond
   dynamically to the failure of an uplink to a given ISP by no longer
   sending packets with the source address corresponding to that ISP.
   Potential mechanisms for the communication of changes in the network
   to the host are Neighbor Discovery Router Advertisements ([RFC4861]),
   DHCPv6 ([RFC8415]), and ICMPv6 ([RFC4443]).

   The routers in the enterprise network are responsible for ensuring
   that packets are delivered to the "correct" ISP uplink based on
   source address.  This requires that at least some routers in the site
   network are able to take into account the source address of a packet
   when deciding how to route it.  That is, some routers must be capable
   of some form of Source Address Dependent Routing (SADR), if only as
   described in Section 4.3 of [RFC3704].  At a minimum, the routers
   connected to the ISP uplinks (the site exit routers or SERs) must be
   capable of Source Address Dependent Routing.
   Expanding the connected domain of routers capable of SADR from the
   site exit routers deeper into the site network will generally result
   in more efficient routing of traffic with external destinations.

   This document is organized as follows.  Section 4 looks in more
   detail at the enterprise networking environments in which this
   solution is expected to operate.  The discussion of Section 4 uses
   the concepts of source-prefix-scoped routing advertisements and
   forwarding tables and provides a description of how source-prefix-
   scoped routing advertisements are used to generate source-prefix-
   scoped forwarding tables.  A detailed description is provided in
   Section 5.  Section 6 discusses existing and proposed mechanisms for
   hosts to select the default source address to be used by
   applications.  It also discusses the requirements for routing that
   are needed to support these enterprise network scenarios and the
   mechanisms by which hosts are expected to update default source
   addresses based on network state.  Section 7 discusses deployment
   considerations, while Section 8 discusses other solutions.

2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Terminology

   PA (provider-assigned or provider-aggregatable) address space: a
   block of IP addresses assigned by a Regional Internet Registry (RIR)
   to a Local Internet Registry (LIR), used to create allocations to end
   sites.  It can be aggregated and presented in the routing table as
   one route.

   PI (provider-independent) address space: a block of IP addresses
   assigned by a Regional Internet Registry (RIR) directly to an end
   site/end customer.

   ISP: Internet Service Provider.

   LIR (Local Internet Registry): an organisation (usually an ISP or an
   enterprise/academic) which receives an IP address allocation from its
   Regional Internet Registry and then assigns parts of that allocation
   to its customers.

   RIR (Regional Internet Registry): an organization which manages the
   Internet number resources (such as IP addresses and AS numbers)
   within a geographical region of the world.

   SADR (Source Address Dependent Routing): Routing which takes into
   account the source address of a packet in addition to the packet
   destination address.

   SADR domain: a routing domain where some (or all) routers exchange
   source-dependent routing information.

   Source-Prefix-Scoped Routing/Forwarding Table: a routing (or
   forwarding) table which contains routing (or forwarding) information
   which is applicable to packets with source addresses from the
   specific prefix only.

   Unscoped Routing/Forwarding Table: a routing (or forwarding) table
   which can be used to route/forward packets with any source addresses.

   SER (Site Edge Router): a router which connects the site to an ISP
   (terminates an ISP uplink).

   LLA (Link-Local Address): IPv6 Unicast Address from fe80::/10 prefix
   ([RFC4291]).

   ULA (Unique Local IPv6 Unicast Address): IPv6 unicast addresses from
   fc00::/7 prefix.  They are globally unique and intended for local
   communications ([RFC4193]).

   GUA (Global Unicast Address): globally routable IPv6 addresses of the
   global scope ([RFC4291]).

   SLAAC (IPv6 Stateless Address Autoconfiguration): a stateless process
   of configuring network stack on IPv6 hosts ([RFC4862]).

   RA (Router Advertisement): a message sent by an IPv6 router to
   advertise its presence to hosts together with various network-related
   parameters required for hosts to perform SLAAC ([RFC4861]).

   PIO (Prefix Information Option): a part of RA message containing
   information about IPv6 prefixes which could be used by hosts to
   generate global IPv6 addresses ([RFC4862]).

   RIO (Route Information Option): a part of RA message containing
   information about more specific IPv6 prefixes reachable via the
   advertising router ([RFC4191]).

4.  Enterprise Multihoming Use Cases

4.1.  Simple ISP Connectivity with Connected SERs

   We start by looking at a scenario in which a site has connections to
   two ISPs, as shown in Figure 1.  The site is assigned the prefix
   2001:db8:0:a000::/52 by ISP-A and prefix 2001:db8:0:b000::/52 by
   ISP-B.  We consider three hosts in the site.  H31 and H32 are on a
   LAN that has been assigned subnets 2001:db8:0:a010::/64 and
   2001:db8:0:b010::/64.  H31 has been assigned the addresses
   2001:db8:0:a010::31 and 2001:db8:0:b010::31.  H32 has been assigned
   2001:db8:0:a010::32 and 2001:db8:0:b010::32.  H41 is on a different
   subnet that has been assigned 2001:db8:0:a020::/64 and
   2001:db8:0:b020::/64.

                              2001:db8:0:1234::101    H101
                                                       |
                                                       |
2001:db8:0:a010::31                                 --------
2001:db8:0:b010::31                    ,-----.     /        \
            +--+   +--+       +----+  ,'     `.   :          :
        +---|R1|---|R4|---+---|SERa|-+  ISP-A  +--+--        :
   H31--+   +--+   +--+   |   +----+  `.     ,'   :          :
        |                 |            `-----'    : Internet :
        |                 |                       :          :
        |                 |                       :          :
        |                 |                       :          :
        |                 |            ,-----.    :          :
   H32--+   +--+          |   +----+  ,'     `.   :          :
        +---|R2|----------+---|SERb|-+  ISP-B  +--+--        :
            +--+          |   +----+  `.     ,'   :          :
                          |            `-----'    :          :
                          |                       :          :
            +--+  +--+  +--+                       \        /
   H41------|R3|--|R5|--|R6|                        --------
            +--+  +--+  +--+

2001:db8:0:a020::41
2001:db8:0:b020::41

          Figure 1: Simple ISP Connectivity With Connected SERs

   We refer to a router that connects the site to an ISP as a site edge
   router (SER).  Several other routers provide connectivity among the
   internal hosts (H31, H32, and H41), as well as connecting the
   internal hosts to the Internet through SERa and SERb.
   In this example SERa and SERb share a direct connection to each
   other.  In Section 4.2, we consider a scenario where this is not the
   case.

   For the moment, we assume that the hosts are able to make good
   choices about which source addresses to use through some mechanism
   that doesn't involve the routers in the site network.  Here, we focus
   on the primary task of the routed site network, which is to get
   packets efficiently to their destinations, while sending a packet to
   the ISP that assigned the prefix that matches the source address of
   the packet.  In Section 6, we examine what role the routed network
   may play in helping hosts make good choices about source addresses
   for packets.

   With this solution, routers will need some form of Source Address
   Dependent Routing, which will be new functionality.  It would be
   useful if an enterprise site does not need to upgrade all routers to
   support the new SADR functionality in order to support PA multi-
   homing.  We consider whether this is possible and what the tradeoffs
   are of not having all routers in the site support SADR functionality.

   In the topology in Figure 1, it is possible to support PA multihoming
   with only SERa and SERb being capable of SADR.  The other routers can
   continue to forward based only on destination address, and exchange
   routes that only consider destination address.  In this scenario,
   SERa and SERb communicate source-scoped routing information across
   their shared connection.  When SERa receives a packet with a source
   address matching prefix 2001:db8:0:b000::/52, it forwards the packet
   to SERb, which forwards it on the uplink to ISP-B.  The analogous
   behaviour holds for traffic that SERb receives with a source address
   matching prefix 2001:db8:0:a000::/52.

   In Figure 1, when only SERa and SERb are capable of source address
   dependent routing, PA multi-homing will work.
   However, the paths over which the packets are sent will generally not
   be the shortest paths.  The forwarding paths will generally be more
   efficient as more routers are capable of SADR.  For example, if R4,
   R2, and R6 are upgraded to support SADR, then they can exchange
   source-scoped routes with SERa and SERb.  They will then know to send
   traffic with a source address matching prefix 2001:db8:0:b000::/52
   directly to SERb, without sending it to SERa first.

4.2.  Simple ISP Connectivity Where SERs Are Not Directly Connected

   In Figure 2, we modify the topology slightly by inserting R7, so that
   SERa and SERb are no longer directly connected.  With this topology,
   it is not enough to just enable SADR routing on SERa and SERb to
   support PA multi-homing.  There are two solutions to enable PA
   multihoming in this topology.

                              2001:db8:0:1234::101    H101
                                                       |
                                                       |
2001:db8:0:a010::31                                 --------
2001:db8:0:b010::31                    ,-----.     /        \
            +--+   +--+       +----+  ,'     `.   :          :
        +---|R1|---|R4|---+---|SERa|-+  ISP-A  +--+--        :
   H31--+   +--+   +--+   |   +----+  `.     ,'   :          :
        |                 |            `-----'    : Internet :
        |               +--+                      :          :
        |               |R7|                      :          :
        |               +--+                      :          :
        |                 |            ,-----.    :          :
   H32--+   +--+          |   +----+  ,'     `.   :          :
        +---|R2|----------+---|SERb|-+  ISP-B  +--+--        :
            +--+          |   +----+  `.     ,'   :          :
                          |            `-----'    :          :
                          |                       :          :
            +--+  +--+  +--+                       \        /
   H41------|R3|--|R5|--|R6|                        --------
            +--+  +--+  +--+                           |
                                                       |
2001:db8:0:a020::41           2001:db8:0:5678::501    H501
2001:db8:0:b020::41

      Figure 2: Simple ISP Connectivity Where SERs Are Not Directly
                                Connected

   One option is to effectively modify the topology by creating a
   logical tunnel between SERa and SERb, using GRE ([RFC7676]) for
   example.  Although SERa and SERb are not directly connected
   physically in this topology, they can be directly connected logically
   by a tunnel.

   The other option is to enable SADR functionality on R7.
   In this way, R7 will exchange source-scoped routes with SERa and
   SERb, making the three routers act as a single SADR domain.  This
   illustrates the basic principle that the minimum requirement for the
   routed site network to support PA multi-homing is having all of the
   site exit routers be part of a connected SADR domain.  Extending the
   connected SADR domain beyond that point can produce more efficient
   forwarding paths.

4.3.  Enterprise Network Operator Expectations

   Before considering a more complex scenario, let's look in more detail
   at the reasonably simple multihoming scenario in Figure 2 to
   understand what can reasonably be expected from this solution.  As a
   general guiding principle, we assume an enterprise network operator
   will expect a multihomed network to behave as closely as possible to
   a single-homed network.  So a solution that meets those expectations
   where possible is a good thing.

   For traffic between internal hosts and traffic from outside the site
   to internal hosts, an enterprise network operator would expect there
   to be no visible change in the path taken by this traffic, since this
   traffic does not need to be routed in a way that depends on source
   address.  It is also reasonable to expect that internal hosts should
   be able to communicate with each other using either of their source
   addresses without restriction.  For example, H31 should be able to
   communicate with H41 using a packet with S=2001:db8:0:a010::31,
   D=2001:db8:0:b020::41, regardless of the state of the uplink to
   ISP-B.

   These goals can be accomplished by having all of the routers in the
   network continue to originate normal unscoped destination routes for
   their connected networks.
   If we can arrange for these unscoped destination routes to be used
   for forwarding this traffic, then we will have accomplished the goal
   of keeping forwarding of traffic destined for internal hosts
   unaffected by the multihoming solution.

   For traffic destined for external hosts, it is reasonable to expect
   that traffic with a source address from the prefix assigned by ISP-A
   will follow the path that the traffic would follow if there were no
   connection to ISP-B.  This can be accomplished by having SERa
   originate a source-scoped route of the form (S=2001:db8:0:a000::/52,
   D=::/0).  If all of the routers in the site support SADR, then the
   path of traffic exiting via ISP-A can match that expectation.  If
   some routers don't support SADR, then it is reasonable to expect that
   the path for traffic exiting via ISP-A may be different within the
   site.  This is a tradeoff that the enterprise network operator may
   decide to make.

   It is important to understand how this multihoming solution behaves
   when an uplink to one of the ISPs fails.  To simplify this
   discussion, we assume that all routers in the site support SADR.  We
   first start by looking at how the network operates when the uplinks
   to both ISP-A and ISP-B are functioning properly.  SERa originates a
   source-scoped route of the form (S=2001:db8:0:a000::/52, D=::/0), and
   SERb originates a source-scoped route of the form
   (S=2001:db8:0:b000::/52, D=::/0).  These routes are distributed
   through the routers in the site, and they establish within the
   routers two sets of forwarding paths for traffic leaving the site.
   One set of forwarding paths is for packets with source address in
   2001:db8:0:a000::/52.  The other set of forwarding paths is for
   packets with source address in 2001:db8:0:b000::/52.  The normal
   destination routes which are not scoped to these two source prefixes
   play no role in the forwarding.
   Whether a packet exits the site via SERa or via SERb is completely
   determined by the source address applied to the packet by the host.
   So for example, when host H31 sends a packet to host H101 with
   (S=2001:db8:0:a010::31, D=2001:db8:0:1234::101), the packet will only
   be sent out the link from SERa to ISP-A.

   Now consider what happens when the uplink from SERa to ISP-A fails.
   The only way for the packets from H31 to reach H101 is for H31 to
   start using the source address for ISP-B.  H31 needs to send the
   following packet: (S=2001:db8:0:b010::31, D=2001:db8:0:1234::101).

   This behavior is very different from the behavior that occurs with
   site multihoming using PI addresses or with PA addresses using NAT.
   In these other multi-homing solutions, hosts do not need to react to
   network failures several hops away in order to regain Internet
   access.  Instead, a host can be largely unaware of the failure of an
   uplink to an ISP.  When multihoming with PA addresses and NAT,
   existing sessions generally need to be re-established after a failure
   since the external host will receive packets from the internal host
   with a new source address.  However, new sessions can be established
   without any action on the part of the hosts.  Multihoming with PA
   addresses and NAT has created the expectation of a fairly quick and
   simple recovery from network failures.  Alternatives should be
   evaluated in terms of the speed and complexity of the recovery
   mechanism.

   Another example where the behavior of this multihoming solution
   differs significantly from that of multihoming with PI addresses or
   with PA addresses using NAT is in the ability of the enterprise
   network operator to route traffic over different ISPs based on
   destination address.  We still consider the fairly simple network of
   Figure 2 and assume that uplinks to both ISPs are functioning.
   Assume that the site is multihomed using PA addresses and NAT, and
   that SERa and SERb each originate a normal destination route for
   D=::/0, with the route origination dependent on the state of the
   uplink to the respective ISP.

   Now suppose it is observed that an important application running
   between internal hosts and external host H101 experiences much better
   performance when the traffic passes through ISP-A (perhaps because
   ISP-A provides lower latency to H101).  When multihoming this site
   with PI addresses or with PA addresses and NAT, the enterprise
   network operator can configure SERa to originate into the site
   network a normal destination route for D=2001:db8:0:1234::/64 (the
   destination prefix to reach H101) that depends on the state of the
   uplink to ISP-A.  When the link to ISP-A is functioning, the
   destination route D=2001:db8:0:1234::/64 will be originated by SERa,
   so traffic from all hosts will use ISP-A to reach H101 based on the
   longest destination prefix match in the route lookup.

   Implementing the same routing policy is more difficult with the PA
   multihoming solution described in this document since it doesn't use
   NAT.  By design, the only way to control where a packet exits this
   network is by setting the source address of the packet.  Since the
   network cannot modify the source address without NAT, the host must
   set it.  To implement this routing policy, each host needs to use the
   source address from the prefix assigned by ISP-A to send traffic
   destined for H101.  Mechanisms have been proposed to allow hosts to
   choose the source address for packets in a fine grained manner.  We
   will discuss these proposals in Section 6.
   However, interacting with
   host operating systems in some manner to ensure a particular source
   address is chosen for a particular destination prefix is not what an
   enterprise network administrator would expect to have to do to
   implement this routing policy.

4.4. More complex ISP connectivity

   The previous sections considered two variations of a simple
   multihoming scenario where the site is connected to two ISPs offering
   only Internet connectivity. It is likely that many actual enterprise
   multihoming scenarios will be similar to this simple example.
   However, there are more complex multihoming scenarios that we would
   like this solution to address as well.

   It is fairly common for an ISP to offer a service in addition to
   Internet access over the same uplink. Two variations of this are
   reflected in Figure 3. In addition to Internet access, ISP-A offers
   a service which requires the site to access host H51 at
   2001:db8:0:5555::51. The site has a single physical and logical
   connection with ISP-A, and ISP-A only allows access to H51 over that
   connection. So when H32 needs to access the service at H51 it needs
   to send packets with (S=2001:db8:0:a010::32, D=2001:db8:0:5555::51)
   and those packets need to be forwarded out the link from SERa to
   ISP-A.

                                     2001:db8:0:1234::101    H101
                                                              |
                                                              |
   2001:db8:0:a010::31                                     --------
   2001:db8:0:b010::31                      ,-----.       /        \
               +--+   +--+       +----+  ,'         `.   :          :
           +---|R1|---|R4|---+---|SERa|-+    ISP-A    +--+--        :
      H31--+   +--+   +--+   |   +----+  `.         ,'   :          :
           |                 |              `-----'      : Internet :
           |                 |                 |         :          :
           |                 |                H51        :          :
           |                 |       2001:db8:0:5555::51 :          :
           |                +--+                         :          :
           |                |R7|                         :          :
           |                +--+                         :          :
           |                 |                           :          :
           |                 |              ,-----.      :          :
      H32--+   +--+          |  +-----+  ,'         `.   :          :
           +---|R2|-----+----+--|SERb1|-+    ISP-B    +--+--        :
               +--+     |       +-----+  `.         ,'   :          :
                       +--+                 `--|--'      :          :
   2001:db8:0:a010::32 |R8|                    |          \        /
                       +--+                 ,--|--.        --------
                        |       +-----+  ,'         `.        |
                        +-------|SERb2|-+    ISP-B    |       |
                        |       +-----+  `.         ,'      H501
                        |                   `-----'    2001:db8:0:5678
                        |                      |            ::501
                 +--+  +--+                   H61
        H41------|R3|--|R5|          2001:db8:0:6666::61
                 +--+  +--+

   2001:db8:0:a020::41
   2001:db8:0:b020::41

     Figure 3: Internet access and services offered by ISP-A and ISP-B

   ISP-B illustrates a variation on this scenario. In addition to
   Internet access, ISP-B also offers a service which requires the site
   to access host H61. The site has two connections to two different
   parts of ISP-B (shown as SERb1 and SERb2 in Figure 3). ISP-B expects
   Internet traffic to use the uplink from SERb1, while it expects
   traffic destined for the service at H61 to use the uplink from SERb2.
   For either uplink, ISP-B expects the ingress traffic to have a source
   address matching the prefix it assigned to the site,
   2001:db8:0:b000::/52.

   As discussed before, we rely completely on the internal host to set
   the source address of the packet properly. In the case of a packet
   sent by H31 to access the service in ISP-B at H61, we expect the
   packet to have the following addresses: (S=2001:db8:0:b010::31,
   D=2001:db8:0:6666::61). The routed network has two potential ways of
   distributing routes so that this packet exits the site on the uplink
   at SERb2.

   We could just rely on normal destination routes, without using
   source-prefix-scoped routes. If we have SERb2 originate a normal
   unscoped destination route for D=2001:db8:0:6666::/64, the packets
   from H31 to H61 will exit the site at SERb2 as desired. We should
   not have to worry about SERa needing to originate the same route,
   because ISP-B should choose a globally unique prefix for the service
   at H61.

   The alternative is to have SERb2 originate a source-prefix-scoped
   destination route of the form (S=2001:db8:0:b000::/52,
   D=2001:db8:0:6666::/64).
   From a forwarding point of view, the use of
   the source-prefix-scoped destination route would result in traffic
   with source addresses corresponding only to ISP-B being sent to
   SERb2. In contrast, the use of the unscoped destination route would
   result in traffic with source addresses corresponding to ISP-A and
   ISP-B being sent to SERb2, as long as the destination address matches
   the destination prefix. It seems like either forwarding behavior
   would be acceptable.

   However, from the point of view of the enterprise network
   administrator trying to configure, maintain, and troubleshoot this
   multihoming solution, it seems much clearer to have SERb2 originate
   the source-prefix-scoped destination route corresponding to the
   service offered by ISP-B. In this way, all of the traffic leaving
   the site is determined by the source-prefix-scoped routes, and all of
   the traffic within the site or arriving from external hosts is
   determined by the unscoped destination routes. Therefore, for this
   multihoming solution we choose to originate source-prefix-scoped
   routes for all traffic leaving the site.

4.5. ISPs and Provider-Assigned Prefixes

   While we expect that most site multihoming involves connecting to
   only two ISPs, this solution allows for connections to an arbitrary
   number of ISPs to be supported. However, when evaluating scalable
   implementations of the solution, it would be reasonable to assume
   that the maximum number of ISPs that a site would connect to is five
   (topologies with two redundant routers, each having two uplinks to
   different ISPs, plus a tunnel to a head office acting as a fifth one
   are not unheard of).

   It is also useful to note that the prefixes assigned to the site by
   different ISPs will not overlap. This must be the case, since the
   provider-assigned addresses have to be globally unique.

4.6. Simplified Topologies

   The topologies of many enterprise sites using this multihoming
   solution may in practice be simpler than the examples that we have
   used. The topology in Figure 1 could be further simplified by having
   all hosts directly connected to the LAN connecting the two site exit
   routers, SERa and SERb. The topology could also be simplified by
   having the uplinks to ISP-A and ISP-B both connected to the same site
   exit router. However, it is the aim of this document to provide a
   solution that applies to as broad a range of enterprise site network
   topologies as possible, so this document focuses on providing a
   solution to the more general case. The simplified cases will also be
   supported by this solution, and there may even be optimizations that
   can be made for simplified cases. This solution, however, needs to
   support more complex topologies.

   We are starting with the basic assumption that enterprise site
   networks can be quite complex from a routing perspective. However,
   even a complex site network can be multihomed to different ISPs with
   PA addresses using IPv4 and NAT. It is not reasonable to expect an
   enterprise network operator to change the routing topology of the
   site in order to deploy IPv6.

5. Generating Source-Prefix-Scoped Forwarding Tables

   So far we have described in general terms how the routers in this
   solution that are capable of Source Address Dependent Routing (SADR)
   will forward traffic using both normal unscoped destination routes
   and source-prefix-scoped destination routes. Here we give a precise
   method for generating a source-prefix-scoped forwarding table on a
   router that supports SADR.

   1. Compute the next-hops for the source-prefix-scoped destination
      prefixes using only routers in the connected SADR domain. These
      are the initial source-prefix-scoped forwarding table entries.

   2. Compute the next-hops for the unscoped destination prefixes using
      all routers in the IGP. This is the unscoped forwarding table.

   3. For a given source-prefix-scoped forwarding table T (scoped to
      source prefix P), consider a source-prefix-scoped forwarding
      table T', whose source prefix P' contains P. We call T the more
      specific source-prefix-scoped forwarding table, and T' the less
      specific source-prefix-scoped forwarding table. We select
      entries in the less specific source-prefix-scoped forwarding
      table to augment the more specific source-prefix-scoped
      forwarding table based on the following rules. If a destination
      prefix of an entry in the less specific source-prefix-scoped
      forwarding table exactly matches the destination prefix of an
      existing entry in the more specific source-prefix-scoped
      forwarding table (including destination prefix length), then do
      not add the entry to the more specific source-prefix-scoped
      forwarding table. If the destination prefix does NOT match an
      existing entry, then add the entry to the more specific
      source-prefix-scoped forwarding table. As the unscoped
      forwarding table is considered to be scoped to ::/0, this process
      will propagate routes from the unscoped forwarding table to the
      more specific source-prefix-scoped forwarding table. If there
      exist multiple source-prefix-scoped forwarding tables whose
      source prefixes contain P, these source-prefix-scoped forwarding
      tables should be processed in order from most specific to least
      specific.

   The forwarding tables produced by this process are used in the
   following way to forward packets.

   1. Select the most specific (longest prefix match)
      source-prefix-scoped forwarding table that matches the source
      address of the packet (again, the unscoped forwarding table is
      considered to be scoped to ::/0).

   2. Look up the destination address of the packet in the selected
      forwarding table to determine the next-hop for the packet.

   The following example illustrates how this process is used to create
   a forwarding table for each provider-assigned source prefix. We
   consider the multihomed site network in Figure 3. Initially we
   assume that all of the routers in the site network support SADR.
   Figure 4 shows the routes that are originated by the routers in the
   site network.

   Routes originated by SERa:
   (S=2001:db8:0:a000::/52, D=2001:db8:0:5555::/64)
   (S=2001:db8:0:a000::/52, D=::/0)
   (D=2001:db8:0:5555::/64)
   (D=::/0)

   Routes originated by SERb1:
   (S=2001:db8:0:b000::/52, D=::/0)
   (D=::/0)

   Routes originated by SERb2:
   (S=2001:db8:0:b000::/52, D=2001:db8:0:6666::/64)
   (D=2001:db8:0:6666::/64)

   Routes originated by R1:
   (D=2001:db8:0:a010::/64)
   (D=2001:db8:0:b010::/64)

   Routes originated by R2:
   (D=2001:db8:0:a010::/64)
   (D=2001:db8:0:b010::/64)

   Routes originated by R3:
   (D=2001:db8:0:a020::/64)
   (D=2001:db8:0:b020::/64)

        Figure 4: Routes Originated by Routers in the Site Network

   Each SER originates destination routes which are scoped to the source
   prefix assigned by the ISP that the SER connects to. Note that the
   SERs also originate the corresponding unscoped destination route.
   This is not needed when all of the routers in the site support SADR.
   However, it is required when some routers do not support SADR. This
   will be discussed in more detail later.

   We focus on how R8 constructs its source-prefix-scoped forwarding
   tables from these route advertisements. R8 computes the next hops
   for destination routes which are scoped to the source prefix
   2001:db8:0:a000::/52. The results are shown in the first table in
   Figure 5. (In this example, the next hops are computed assuming that
   all links have the same metric.)
   Then, R8 computes the next hops for
   destination routes which are scoped to the source prefix
   2001:db8:0:b000::/52. The results are shown in the second table in
   Figure 5. Finally, R8 computes the next hops for the unscoped
   destination prefixes. The results are shown in the third table in
   Figure 5.

   forwarding entries scoped to
   source prefix = 2001:db8:0:a000::/52
   ============================================
   D=2001:db8:0:5555::/64      NH=R7
   D=::/0                      NH=R7

   forwarding entries scoped to
   source prefix = 2001:db8:0:b000::/52
   ============================================
   D=2001:db8:0:6666::/64      NH=SERb2
   D=::/0                      NH=SERb1

   unscoped forwarding entries
   ============================================
   D=2001:db8:0:a010::/64      NH=R2
   D=2001:db8:0:b010::/64      NH=R2
   D=2001:db8:0:a020::/64      NH=R5
   D=2001:db8:0:b020::/64      NH=R5
   D=2001:db8:0:5555::/64      NH=R7
   D=2001:db8:0:6666::/64      NH=SERb2
   D=::/0                      NH=SERb1

               Figure 5: Forwarding Entries Computed at R8

   The final step is for R8 to augment the more specific
   source-prefix-scoped forwarding tables with entries from less
   specific source-prefix-scoped forwarding tables. The unscoped
   forwarding table is considered as being scoped to ::/0, so both
   2001:db8:0:a000::/52 and 2001:db8:0:b000::/52 are more specific
   prefixes of ::/0. Therefore, entries in the unscoped forwarding
   table will be evaluated to be added to these two more specific
   source-prefix-scoped forwarding tables. If a forwarding entry from
   the less specific source-prefix-scoped forwarding table has the exact
   same destination prefix (including destination prefix length) as the
   forwarding entry from the more specific source-prefix-scoped
   forwarding table, then the existing forwarding entry in the more
   specific source-prefix-scoped forwarding table wins.
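
   The augmentation rule and the two-step lookup described in this
   section can be sketched in a few lines of Python. This is only an
   illustrative sketch under stated assumptions, not part of the
   solution itself: the function names augment and lookup and the
   dictionary layout are hypothetical choices made for this example,
   and the table contents are the R8 entries from Figure 5.

```python
import ipaddress

def augment(tables):
    """Return the complete forwarding tables.

    tables maps a source prefix (string) to {destination prefix: next hop}.
    The unscoped table is keyed by "::/0". Destination prefixes are compared
    as strings, so this sketch assumes they are written consistently.
    """
    nets = {p: ipaddress.ip_network(p) for p in tables}
    result = {}
    for p, entries in tables.items():
        merged = dict(entries)
        # Less specific tables whose source prefix contains p,
        # processed from most specific to least specific.
        containing = sorted(
            (q for q in tables if q != p and nets[p].subnet_of(nets[q])),
            key=lambda q: nets[q].prefixlen,
            reverse=True,
        )
        for q in containing:
            for dest, next_hop in tables[q].items():
                # An existing entry with the exact same destination prefix wins.
                merged.setdefault(dest, next_hop)
        result[p] = merged
    return result

def lookup(tables, src, dst):
    """Pick the most specific table for src, then longest-match dst in it."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    table_key = max(
        (p for p in tables if src_ip in ipaddress.ip_network(p)),
        key=lambda p: ipaddress.ip_network(p).prefixlen,
    )
    table = tables[table_key]
    best = max(
        (d for d in table if dst_ip in ipaddress.ip_network(d)),
        key=lambda d: ipaddress.ip_network(d).prefixlen,
    )
    return table[best]

# Initial tables at R8, copied from Figure 5.
r8 = {
    "2001:db8:0:a000::/52": {"2001:db8:0:5555::/64": "R7", "::/0": "R7"},
    "2001:db8:0:b000::/52": {"2001:db8:0:6666::/64": "SERb2", "::/0": "SERb1"},
    "::/0": {
        "2001:db8:0:a010::/64": "R2",
        "2001:db8:0:b010::/64": "R2",
        "2001:db8:0:a020::/64": "R5",
        "2001:db8:0:b020::/64": "R5",
        "2001:db8:0:5555::/64": "R7",
        "2001:db8:0:6666::/64": "SERb2",
        "::/0": "SERb1",
    },
}

final = augment(r8)
# H31 using its ISP-A address to reach H101 is sent towards SERa via R7.
next_hop = lookup(final, "2001:db8:0:a010::31", "2001:db8:0:1234::101")
```

   With these inputs, each scoped table ends up with seven entries, and
   the same destination resolves to a different next hop depending only
   on which source address the host selected.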

   As an example of how the source-prefix-scoped forwarding entries are
   augmented, we consider how the two entries in the first table in
   Figure 5 (the table for source prefix = 2001:db8:0:a000::/52) are
   augmented with entries from the third table in Figure 5 (the table of
   unscoped forwarding entries, i.e., entries scoped to ::/0). The
   first four unscoped forwarding entries (D=2001:db8:0:a010::/64,
   D=2001:db8:0:b010::/64, D=2001:db8:0:a020::/64, and
   D=2001:db8:0:b020::/64) are not an exact match for any of the
   existing entries in the forwarding table for source prefix
   2001:db8:0:a000::/52. Therefore, these four entries are added to the
   final forwarding table for source prefix 2001:db8:0:a000::/52. The
   result of adding these entries is reflected in the first four entries
   in the first table in Figure 6.

   The next entry in the less specific (scoped to ::/0) forwarding table
   is for D=2001:db8:0:5555::/64. This entry is an exact match for the
   existing entry in the forwarding table for the more specific source
   prefix 2001:db8:0:a000::/52. Therefore, we do not replace the
   existing entry with the entry from the unscoped forwarding table.
   This is reflected in the fifth entry in the first table in Figure 6.
   (Note that since both scoped and unscoped entries have R7 as the next
   hop, the result of applying this rule is not visible.)

   The next entry in the less specific (scoped to ::/0) forwarding table
   is for D=2001:db8:0:6666::/64. This entry is not an exact match for
   any existing entry in the forwarding table for source prefix
   2001:db8:0:a000::/52. Therefore, we add this entry. This is
   reflected in the sixth entry in the first table in Figure 6.

   The last entry in the less specific (scoped to ::/0) forwarding table
   is for D=::/0. This entry is an exact match for the existing entry
   in the forwarding table for the more specific source prefix
   2001:db8:0:a000::/52.
   Therefore, we do not overwrite the existing
   source-prefix-scoped entry, as can be seen in the last entry in the
   first table in Figure 6.

   if source address matches 2001:db8:0:a000::/52
   then use this forwarding table
   ============================================
   D=2001:db8:0:a010::/64      NH=R2
   D=2001:db8:0:b010::/64      NH=R2
   D=2001:db8:0:a020::/64      NH=R5
   D=2001:db8:0:b020::/64      NH=R5
   D=2001:db8:0:5555::/64      NH=R7
   D=2001:db8:0:6666::/64      NH=SERb2
   D=::/0                      NH=R7

   else if source address matches 2001:db8:0:b000::/52
   then use this forwarding table
   ============================================
   D=2001:db8:0:a010::/64      NH=R2
   D=2001:db8:0:b010::/64      NH=R2
   D=2001:db8:0:a020::/64      NH=R5
   D=2001:db8:0:b020::/64      NH=R5
   D=2001:db8:0:5555::/64      NH=R7
   D=2001:db8:0:6666::/64      NH=SERb2
   D=::/0                      NH=SERb1

   else if source address matches ::/0 use this forwarding table
   ============================================
   D=2001:db8:0:a010::/64      NH=R2
   D=2001:db8:0:b010::/64      NH=R2
   D=2001:db8:0:a020::/64      NH=R5
   D=2001:db8:0:b020::/64      NH=R5
   D=2001:db8:0:5555::/64      NH=R7
   D=2001:db8:0:6666::/64      NH=SERb2
   D=::/0                      NH=SERb1

           Figure 6: Complete Forwarding Tables Computed at R8

   The forwarding tables produced by this process at R8 have the desired
   properties. A packet with a source address in 2001:db8:0:a000::/52
   will be forwarded based on the first table in Figure 6. If the
   packet is destined for the Internet at large or the service at
   D=2001:db8:0:5555::/64, it will be sent to R7 in the direction of
   SERa. If the packet is destined for an internal host, then the first
   four entries will send it to R2 or R5 as expected. Note that if this
   packet has a destination address corresponding to the service offered
   by ISP-B (D=2001:db8:0:6666::/64), then it will get forwarded to
   SERb2.
   It will be dropped by SERb2 or by ISP-B, since the packet has
   a source address that was not assigned by ISP-B. However, this is
   expected behavior. In order to use the service offered by ISP-B, the
   host needs to originate the packet with a source address assigned by
   ISP-B.

   In this example, a packet with a source address that doesn't match
   2001:db8:0:a000::/52 or 2001:db8:0:b000::/52 must have originated
   from an external host. Such a packet will use the unscoped
   forwarding table (the last table in Figure 6). These packets will
   flow exactly as they would in the absence of multihoming.

   We can also modify this example to illustrate how it supports
   deployments where not all routers in the site support SADR.
   Continuing with the topology shown in Figure 3, suppose that R3 and
   R5 do not support SADR. Instead they are only capable of
   understanding unscoped route advertisements. The SADR routers in the
   network will still originate the routes shown in Figure 4. However,
   R3 and R5 will only understand the unscoped routes as shown in
   Figure 7.

   Routes originated by SERa:
   (D=2001:db8:0:5555::/64)
   (D=::/0)

   Routes originated by SERb1:
   (D=::/0)

   Routes originated by SERb2:
   (D=2001:db8:0:6666::/64)

   Routes originated by R1:
   (D=2001:db8:0:a010::/64)
   (D=2001:db8:0:b010::/64)

   Routes originated by R2:
   (D=2001:db8:0:a010::/64)
   (D=2001:db8:0:b010::/64)

   Routes originated by R3:
   (D=2001:db8:0:a020::/64)
   (D=2001:db8:0:b020::/64)

     Figure 7: Route Advertisements Understood by Routers That Do Not
                               Support SADR

   With these unscoped route advertisements, R5 will produce the
   forwarding table shown in Figure 8.

   forwarding table
   ============================================
   D=2001:db8:0:a010::/64      NH=R8
   D=2001:db8:0:b010::/64      NH=R8
   D=2001:db8:0:a020::/64      NH=R3
   D=2001:db8:0:b020::/64      NH=R3
   D=2001:db8:0:5555::/64      NH=R8
   D=2001:db8:0:6666::/64      NH=SERb2
   D=::/0                      NH=R8

      Figure 8: Forwarding Table For R5, Which Doesn't Understand
                       Source-Prefix-Scoped Routes

   As all SERs belong to the SADR domain, any traffic that needs to exit
   the site will eventually hit a SADR-capable router. To prevent
   routing loops involving SADR-capable and non-SADR-capable routers,
   traffic that enters the SADR-capable domain does not leave the domain
   until it exits the site. Therefore all SADR-capable routers within
   the domain MUST be logically connected.

   Note that the mechanism described here for converting
   source-prefix-scoped destination prefix routing advertisements into
   forwarding state is somewhat different from that proposed in
   [I-D.ietf-rtgwg-dst-src-routing]. The method described in the
   current document is functionally equivalent, but it is based on
   application of existing mechanisms for the described scenarios.

6. Mechanisms For Hosts To Choose Good Source Addresses In A Multihomed
   Site

   Until this point, we have made the assumption that hosts are able to
   choose the correct source address using some unspecified mechanism.
   This has allowed us to just focus on what the routers in a multihomed
   site network need to do in order to forward packets to the correct
   ISP based on source address. Now we look at possible mechanisms for
   hosts to choose the correct source address. We also look at what
   role, if any, the routers may play in providing information that
   helps hosts to choose source addresses.

   It should be noted that this section discusses how hosts can select
   the default source address for new connections.
   Any connection that
   already exists on a host is bound to a specific source address,
   which cannot be changed. Section 6.7 discusses the connection
   preservation issue in more detail.

   Any host that needs to be able to send traffic using the uplinks to a
   given ISP is expected to be configured with an address from the
   prefix assigned by that ISP. The host will control which ISP is used
   for its traffic by selecting one of the addresses configured on the
   host as the source address for outgoing traffic. It is the
   responsibility of the site network to ensure that a packet with the
   source address from an ISP is sent on an uplink to that ISP.

   If all of the ISP uplinks are working, the choice of source address
   by the host may be driven by the desire to load share across ISP
   uplinks, or it may be driven by the desire to take advantage of
   certain properties of a particular uplink or ISP (if some information
   about various path properties has been made available to the host
   somehow; see [I-D.ietf-intarea-provisioning-domains] as an example).
   If any of the ISP uplinks is not working, then the choice of source
   address by the host can cause packets to get dropped.

   How a host should make good decisions about source address selection
   in a multihomed site is not a solved problem. We do not attempt to
   solve this problem in this document. Instead we discuss the current
   state of affairs with respect to standardized solutions and
   implementation of those solutions. We also look at proposed
   solutions for this problem.

   An external host initiating communication with a host internal to a
   PA multihomed site will need to know multiple addresses for that host
   in order to communicate with it using different ISPs to the
   multihomed site (knowing just one address would undermine all
   benefits of redundant connectivity provided by multihoming).
   These
   addresses are typically learned through DNS. (For simplicity, we
   assume that the external host is single-homed.) The external host
   chooses the ISP that will be used at the remote multihomed site by
   setting the destination address on the packets it transmits. For a
   session originated from an external host to an internal host, the
   choice of source address used by the internal host is simple. The
   internal host has no choice but to use the destination address in the
   received packet as the source address of the transmitted packet.

   For a session originated by a host inside the multihomed site, the
   decision of what source address to select is more complicated. We
   consider three main methods for hosts to get information about the
   network. The two proactive methods are Neighbor Discovery Router
   Advertisements and DHCPv6. The one reactive method we consider is
   ICMPv6. Note that we are explicitly excluding the possibility of
   having hosts participate in or even listen directly to routing
   protocol advertisements.

   First we look at how a host is currently expected to select the
   default source and destination addresses to be used for a new
   connection.

6.1. Source Address Selection Algorithm on Hosts

   [RFC6724] defines the algorithms that hosts are expected to use to
   select source and destination addresses for packets. It defines an
   algorithm for selecting a source address and a separate algorithm for
   selecting a destination address. Both of these algorithms depend on
   a policy table. [RFC6724] defines a default policy which produces
   certain behavior.

   The rules in the two algorithms in [RFC6724] depend on many different
   properties of addresses.
   While these are needed for understanding
   how a host should choose addresses in an arbitrary environment, most
   of the rules are not relevant for understanding how a host should
   choose among multiple source addresses in a multihomed environment
   when sending a packet to a remote host. Returning to the example in
   Figure 3, we look at what the default algorithms in [RFC6724] say
   about the source address that internal host H31 should use to send
   traffic to external host H101, somewhere on the Internet.

   There is no choice to be made with respect to destination address.
   H31 needs to send a packet with D=2001:db8:0:1234::101 in order to
   reach H101. So H31 has to choose between using
   S=2001:db8:0:a010::31 or S=2001:db8:0:b010::31 as the source address
   for this packet. We go through the rules for source address
   selection in Section 5 of [RFC6724].

   Rule 1 (Prefer same address) is not useful to break the tie between
   source addresses, because neither of the candidate source addresses
   equals the destination address.

   Rule 2 (Prefer appropriate scope) is also not used in this scenario,
   because both source addresses and the destination address have global
   scope.

   Rule 3 (Avoid deprecated addresses) applies to an address that has
   been autoconfigured by a host using stateless address
   autoconfiguration as defined in [RFC4862]. An address autoconfigured
   by a host has a preferred lifetime and a valid lifetime. The address
   is preferred until the preferred lifetime expires, after which it
   becomes deprecated. A deprecated address is not used if there is a
   preferred address of the appropriate scope available. When the valid
   lifetime expires, the address cannot be used at all.
   The preferred
   and valid lifetimes for an autoconfigured address are set based on
   the corresponding lifetimes in the Prefix Information Option in
   Neighbor Discovery Router Advertisements. So a possible tool to
   control source address selection in this scenario would be to make a
   host deprecate an address by having the routers on that link, R1 and
   R2 in Figure 3, send a Router Advertisement message containing a
   Prefix Information Option for the source prefix to be discouraged (or
   prohibited) with the preferred lifetime set to zero. This is a
   rather blunt tool, because it discourages or prohibits the use of
   that source prefix for all destinations. However, it may be useful
   in some scenarios. For example, if all uplinks to a particular ISP
   fail, it is desirable to prevent hosts from using source addresses
   from that ISP address space.

   Rule 4 (Avoid home addresses) does not apply here because we are not
   considering Mobile IP.

   Rule 5 (Prefer outgoing interface) is not useful in this scenario,
   because both source addresses are assigned to the same interface.

   Rule 5.5 (Prefer addresses in a prefix advertised by the next-hop) is
   not useful in the scenario where both R1 and R2 advertise both
   source prefixes. However, this rule may potentially allow a host to
   select the correct source prefix by selecting a next-hop. The most
   obvious way would be to have R1 advertise itself as a default
   router and send a PIO for 2001:db8:0:a010::/64, while R2 advertises
   itself as a default router and sends a PIO for 2001:db8:0:b010::/64.
   We'll discuss later how Rule 5.5 can be used to influence source
   address selection in single-router topologies (e.g., when H41 is
   sending traffic using R3 as a default gateway).
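
   As a rough illustration of how Rule 3 and Rule 5.5 interact, the
   following Python sketch orders candidate source addresses using only
   these two rules. The Candidate class, the prefer function, and the
   advertised map (router name to the PIO prefixes it sent) are
   hypothetical names introduced for this example; a real host applies
   all of the [RFC6724] rules in order and tracks lifetimes itself.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class Candidate:
    address: str
    # True once the preferred lifetime has expired (Rule 3).
    deprecated: bool = False

def prefer(candidates, next_hop, advertised):
    """Order candidates by Rule 3, then Rule 5.5 (other rules omitted).

    advertised is an assumed map: router name -> list of prefixes that
    router announced in Prefix Information Options.
    """
    hop_prefixes = [ipaddress.ip_network(p) for p in advertised.get(next_hop, [])]

    def key(candidate):
        addr = ipaddress.ip_address(candidate.address)
        from_next_hop = any(addr in net for net in hop_prefixes)
        # False sorts before True: non-deprecated addresses first,
        # then addresses in a prefix advertised by the chosen next hop.
        return (candidate.deprecated, not from_next_hop)

    # sorted() is stable, so a tie leaves the original order untouched.
    return sorted(candidates, key=key)

h31 = [Candidate("2001:db8:0:a010::31"), Candidate("2001:db8:0:b010::31")]

# Each router advertises only one ISP's prefix, as suggested for Rule 5.5.
split = {"R1": ["2001:db8:0:a010::/64"], "R2": ["2001:db8:0:b010::/64"]}
via_r2 = prefer(h31, "R2", split)[0].address

# The ISP-A prefix was advertised with a preferred lifetime of zero.
h31_a_down = [Candidate("2001:db8:0:a010::31", deprecated=True),
              Candidate("2001:db8:0:b010::31")]
via_r1_a_down = prefer(h31_a_down, "R1", split)[0].address
```

   In this sketch a deprecated address loses even when it belongs to a
   prefix advertised by the chosen next hop, because Rule 3 is evaluated
   before Rule 5.5.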

   Rule 6 (Prefer matching label) refers to the Label value determined
   for each source and destination prefix as a result of applying the
   policy table to the prefix. With the default policy table defined in
   Section 2.1 of [RFC6724], all three addresses match only the ::/0
   entry, so Label(2001:db8:0:a010::31) = 1,
   Label(2001:db8:0:b010::31) = 1, and Label(2001:db8:0:1234::101) = 1.
   So with the default policy, Rule 6 does not break the tie. However,
   the algorithms in [RFC6724] are defined in such a way that
   non-default address selection policy tables can be used. [RFC7078]
   defines a way to distribute a non-default address selection policy
   table to hosts using DHCPv6. So even though the application of Rule
   6 to this scenario using the default policy table is not useful, Rule
   6 may still be a useful tool.

   Rule 7 (Prefer temporary addresses) has to do with the technique
   described in [RFC4941] to periodically randomize the interface
   portion of an IPv6 address that has been generated using stateless
   address autoconfiguration. In general, if H31 were using this
   technique, it would use it for both source addresses, for example
   creating temporary addresses 2001:db8:0:a010:2839:9938:ab58:830f and
   2001:db8:0:b010:4838:f483:8384:3208, in addition to
   2001:db8:0:a010::31 and 2001:db8:0:b010::31. So this rule would
   prefer the two temporary addresses, but it would not break the tie
   between the two source prefixes from ISP-A and ISP-B.

   Rule 8 (Use longest matching prefix) dictates that between two
   candidate source addresses, the one which has the longest common
   prefix length with the destination address is preferred. For
   example, if H31 were selecting the source address for sending packets
   to H101, this rule would not be a tie breaker, as for both candidate
   source addresses 2001:db8:0:a010::31 and 2001:db8:0:b010::31 the
   common prefix length with the destination is 48.
   However, if H31 were selecting the source
   address for sending packets to H41 at address 2001:db8:0:a020::41,
   then this rule would result in using 2001:db8:0:a010::31 as a source
   (2001:db8:0:a010::31 and 2001:db8:0:a020::41 share the common prefix
   2001:db8:0:a000::/58, while for 2001:db8:0:b010::31 and
   2001:db8:0:a020::41 the common prefix is 2001:db8:0:a000::/51).
   Therefore Rule 8 might be useful for selecting the correct source
   address in some but not all scenarios (for example, if ISP-B services
   belong to 2001:db8:0:b000::/59, then H31 would always use
   2001:db8:0:b010::31 to access those destinations).

   So we can see that of the 8 source address selection rules from
   [RFC6724], four actually apply to our basic site multihoming
   scenario. The rules that are relevant to this scenario are
   summarized below.

   o Rule 3: Avoid deprecated addresses.

   o Rule 5.5: Prefer addresses in a prefix advertised by the next-hop.

   o Rule 6: Prefer matching label.

   o Rule 8: Use longest matching prefix.

   The two methods that we discuss for controlling the source address
   selection through the four relevant rules above are SLAAC Router
   Advertisement messages and DHCPv6.

   We also consider a possible role for ICMPv6 for getting
   traffic-driven feedback from the network. With the source address
   selection algorithm discussed above, the goal is to choose the
   correct source address on the first try, before any traffic is sent.
   However, another strategy is to choose a source address, send the
   packet, get feedback from the network about whether or not the source
   address is correct, and try another source address if it is not.

   We consider four scenarios where a host needs to select the correct
   source address. The first is when both uplinks are working. The
   second is when one uplink has failed.
The third is 1242 when a failed uplink has recovered. The last is the failure of 1243 both (all) uplinks. 1245 It should be noted that [RFC6724] only defines the behavior of IPv6 1246 hosts to select default addresses that applications and upper-layer 1247 protocols can use. Applications and upper-layer protocols can make 1248 their own choices on selecting source addresses. The mechanism 1249 proposed in this document attempts to ensure that the subset of 1250 source addresses available for applications and upper-layer protocols 1251 is selected with the up-to-date network state in mind. The rest of 1252 the document discusses various aspects of the default source address 1253 selection defined in [RFC6724], calling it for the sake of brevity 1254 "the source address selection". 1256 6.2. Selecting Source Address When Both Uplinks Are Working 1258 Again we return to the topology in Figure 3. Suppose that the site 1259 administrator wants to implement a policy by which all hosts need to 1260 use ISP-A to reach H101 at D=2001:db8:0:1234::101. So for example, 1261 H31 needs to select S=2001:db8:0:a010::31. 1263 6.2.1. Distributing Address Selection Policy Table with DHCPv6 1265 This policy can be implemented by using DHCPv6 to distribute an 1266 address selection policy table that assigns the same label to 1267 destination addresses that match 2001:db8:0:1234::/64 as it does to 1268 source addresses that match 2001:db8:0:a000::/52. The following two 1269 entries accomplish this. 1271 Prefix Precedence Label 1272 2001:db8:0:1234::/64 50 33 1273 2001:db8:0:a000::/52 50 33 1275 Figure 9: Policy table entries to implement a routing policy 1277 This requires that the hosts implement [RFC6724], the basic source 1278 and destination address selection framework, along with [RFC7078], the DHCPv6 1279 extension for distributing a non-default policy table. Note that it 1280 does NOT require that the hosts use DHCPv6 for address assignment.
1281 The hosts could still use stateless address autoconfiguration for 1282 address configuration, while using DHCPv6 only for policy table 1283 distribution (see [RFC8415]). However, this method has a number of 1284 disadvantages: 1286 o DHCPv6 support is not a mandatory requirement for IPv6 hosts 1287 ([RFC6434]), so this method might not work for all devices. 1289 o Network administrators are required to explicitly configure the 1290 desired network access policies on DHCPv6 servers. While it might 1291 be feasible in the scenario of a single multihomed network, such 1292 an approach might have some scalability issues, especially if the 1293 centralized DHCPv6 solution is deployed to serve a large number of 1294 multihomed sites. 1296 6.2.2. Controlling Source Address Selection With Router Advertisements 1298 Neighbor Discovery currently has two mechanisms to communicate prefix 1299 information to hosts. The base specification for Neighbor Discovery 1300 (see [RFC4861]) defines the Prefix Information Option (PIO) in the 1301 Router Advertisement (RA) message. When a host receives a PIO with 1302 the A-flag set, it can use the prefix in the PIO as the source prefix 1303 from which it assigns itself an IP address using stateless address 1304 autoconfiguration (SLAAC) procedures described in [RFC4862]. In the 1305 example of Figure 3, if the site network is using SLAAC, we would 1306 expect both R1 and R2 to send RA messages with PIOs for both source 1307 prefixes 2001:db8:0:a010::/64 and 2001:db8:0:b010::/64 with the 1308 A-flag set. H31 would then use the SLAAC procedure to configure 1309 itself with the addresses 2001:db8:0:a010::31 and 2001:db8:0:b010::31. 1311 Whereas a host learns about source prefixes from PIOs, hosts 1312 can learn about a destination prefix from a Router Advertisement 1313 containing a Route Information Option (RIO), as specified in [RFC4191].
1314 The destination prefixes in RIOs are intended to allow a host to 1315 choose the router that it uses as its first hop to reach a particular 1316 destination prefix. 1318 As currently standardized, neither PIO nor RIO options contained in 1319 Neighbor Discovery Router Advertisements can communicate the 1320 information needed to implement the desired routing policy. PIOs 1321 communicate source prefixes, and RIOs communicate destination 1322 prefixes. However, there is currently no standardized way to 1323 directly associate a particular destination prefix with a particular 1324 source prefix. 1326 [I-D.pfister-6man-sadr-ra] proposes a Source Address Dependent Route 1327 Information option for Neighbor Discovery Router Advertisements which 1328 would associate a source prefix with a destination prefix. The 1329 details of [I-D.pfister-6man-sadr-ra] might need tweaking to address 1330 this use case. However, in order to be able to use Neighbor 1331 Discovery Router Advertisements to implement this routing policy, an 1332 extension that allows R1 and R2 to explicitly communicate to H31 an 1333 association between S=2001:db8:0:a000::/52 and D=2001:db8:0:1234::/64 1334 would be needed. 1336 However, Rule 5.5 of the default source address selection algorithm 1337 (discussed in Section 6.1 above), together with default router 1338 preference (specified in [RFC4191]) and RIO can be used to influence 1339 source address selection on a host as described below. Let's look 1340 at source address selection on the host H41. It receives RAs from R3 1341 with PIOs for 2001:db8:0:a020::/64 and 2001:db8:0:b020::/64. At that 1342 point all traffic would use the same next-hop (R3 link-local address) 1343 so Rule 5.5 does not apply. Now let's assume that R3 supports SADR 1344 and has two scoped forwarding tables, one scoped to 1345 S=2001:db8:0:a000::/52 and another scoped to S=2001:db8:0:b000::/52.
1346 Suppose R3 generates two different link-local addresses for its interface 1347 facing H41 (one for each scoped forwarding table, LLA_A and LLA_B) 1348 and starts sending two different RAs: one is sent from LLA_A and 1349 includes a PIO for 2001:db8:0:a020::/64, another is sent from LLA_B and 1350 includes a PIO for 2001:db8:0:b020::/64. Now it is possible to 1351 influence H41 source address selection for destinations which follow 1352 the default route by setting default router preference in RAs. If it 1353 is desired that H41 reaches H101 (or any destinations in the 1354 Internet) via ISP-A, then RAs sent from LLA_A should have default 1355 router preference set to 01 (high priority), while RAs sent from 1356 LLA_B should have preference set to 11 (low). Then LLA_A would be 1357 chosen as a next-hop for H101 and therefore (as per Rule 5.5) 1358 2001:db8:0:a020::41 would be selected as the source address. If, at 1359 the same time, it is desired that H61 is accessible via ISP-B then R3 1360 should include a RIO for 2001:db8:0:6666::/64 in its RA sent from 1361 LLA_B. H41 would choose LLA_B as a next-hop for all traffic to H61 1362 and then as per Rule 5.5, 2001:db8:0:b020::41 would be selected as a 1363 source address. 1365 If in the above mentioned scenario it is desirable that all Internet 1366 traffic leaves the network via ISP-A and the link to ISP-B is used 1367 for accessing ISP-B services only (not as ISP-A link backup), then 1368 RAs sent by R3 from LLA_B should have Router Lifetime set to 0 and 1369 should include RIOs for ISP-B address space. It would instruct H41 1370 to use LLA_A for all Internet traffic but use LLA_B as a next-hop 1371 while sending traffic to ISP-B addresses. 1373 The description of the mechanism above assumes SADR support by the 1374 first-hop routers as well as SERs. However, a first-hop router can 1375 still provide a less flexible version of this mechanism even without 1376 implementing SADR.
This could be done by providing configuration 1377 knobs on the first-hop router that allow it to generate different 1378 link-local addresses and to send individual RAs for each prefix. 1380 The mechanism described above relies on Rule 5.5 of the default 1381 source address selection algorithm defined in [RFC6724]. [RFC8028] 1382 states that "A host SHOULD select default routers for each prefix it 1383 is assigned an address in". It also recommends that hosts should 1384 implement Rule 5.5. of [RFC6724]. Hosts following the 1385 recommendations specified in [RFC8028] therefore should be able to 1386 benefit from the solution described in this document. No standards 1387 need to be updated in regards to host behavior. 1389 6.2.3. Controlling Source Address Selection With ICMPv6 1391 We now discuss how one might use ICMPv6 to implement the routing 1392 policy to send traffic destined for H101 out the uplink to ISP-A, 1393 even when uplinks to both ISPs are working. If H31 started sending 1394 traffic to H101 with S=2001:db8:0:b010::31 and 1395 D=2001:db8:0:1234::101, it would be routed through SER-b1 and out the 1396 uplink to ISP-B. SERb1 could recognize that this traffic is not 1397 following the desired routing policy and react by sending an ICMPv6 1398 message back to H31. 1400 In this example, we could arrange things so that SERb1 drops the 1401 packet with S=2001:db8:0:b010::31 and D=2001:db8:0:1234::101, and 1402 then sends to H31 an ICMPv6 Destination Unreachable message with Code 1403 5 (Source address failed ingress/egress policy). When H31 receives 1404 this packet, it would then be expected to try another source address 1405 to reach the destination. In this example, H31 would then send a 1406 packet with S=2001:db8:0:a010::31 and D=2001:db8:0:1234::101, which 1407 will reach SERa and be forwarded out the uplink to ISP-A. 
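The retry behavior described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names POLICY, simulate_ser, and pick_source are hypothetical, the SER is simulated in-process instead of being observed through real ICMPv6 messages, and actual host reactions to a Code 5 message are implementation-specific.

```python
import ipaddress

# ICMPv6 Destination Unreachable is type 1; code 5 is
# "Source address failed ingress/egress policy" (RFC 4443).
DEST_UNREACH, SRC_FAILED_POLICY = 1, 5

# Hypothetical site policy: destinations in the key prefix must be
# reached with a source address from the value prefix (ISP-A).
POLICY = {
    ipaddress.ip_network("2001:db8:0:1234::/64"):
        ipaddress.ip_network("2001:db8:0:a000::/52"),
}

def simulate_ser(src, dst):
    """Stand-in for the network: return an ICMPv6 (type, code) or None."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for dst_prefix, required_src in POLICY.items():
        if d in dst_prefix and s not in required_src:
            return (DEST_UNREACH, SRC_FAILED_POLICY)
    return None  # packet forwarded, no error returned

def pick_source(candidates, dst, send=simulate_ser):
    """Try candidates in order; move on when a Code 5 error comes back."""
    for src in candidates:
        if send(src, dst) != (DEST_UNREACH, SRC_FAILED_POLICY):
            return src
    return None  # every candidate was rejected

# H31 tries its ISP-B address first, is rejected, then uses ISP-A.
chosen = pick_source(
    ["2001:db8:0:b010::31", "2001:db8:0:a010::31"],
    "2001:db8:0:1234::101",
)
```

In this sketch every rejected candidate costs one round trip, which is the iteration delay that a more specific ICMPv6 hint would avoid.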
1409 However, we would also want it to be the case that SERb1 does not 1410 enforce this routing policy when the uplink from SERa to ISP-A has 1411 failed. This could be accomplished by having SERa originate a 1412 source-prefix-scoped route for (S=2001:db8:0:a000::/52, 1413 D=2001:db8:0:1234::/64) and have SERb1 monitor the presence of that 1414 route. If that route is not present (because SERa has stopped 1415 originating it), then SERb1 will not enforce the routing policy, and 1416 it will forward packets with S=2001:db8:0:b010::31 and 1417 D=2001:db8:0:1234::101 out its uplink to ISP-B. 1419 We can also use this source-prefix-scoped route originated by SERa to 1420 communicate the desired routing policy to SERb1. We can define an 1421 EXCLUSIVE flag to be advertised together with the IGP route for 1422 (S=2001:db8:0:a000::/52, D=2001:db8:0:1234::/64). This would allow 1423 SERa to communicate to SERb that SERb should reject traffic for 1424 D=2001:db8:0:1234::/64 and respond with an ICMPv6 Destination 1425 Unreachable Code 5 message, as long as the route for 1426 (S=2001:db8:0:a000::/52, D=2001:db8:0:1234::/64) is present. The 1427 definition of an EXCLUSIVE flag for SADR advertisements in IGPs would 1428 require future standardization work. 1430 Finally, if we are willing to extend ICMPv6 to support this solution, 1431 then we could create a mechanism for SERb1 to tell the host what 1432 source address it should be using to successfully forward packets 1433 that meet the policy. In its current form, when SERb1 sends an 1434 ICMPv6 Destination Unreachable Code 5 message, it is basically 1435 saying, "This source address is wrong. Try another source address." 1436 In the absence of a clear indication which address to try next, the 1437 host will iterate over all addresses assigned to the interface (e.g. 1438 various privacy addresses) which would lead to significant delays and 1439 degraded user experience. 
It would be better if the ICMPv6 1440 message could say, "This source address is wrong. Instead use a 1441 source address in S=2001:db8:0:a000::/52." 1443 However, using ICMPv6 for signaling source address information back to 1444 hosts introduces new challenges. Most routers currently have 1445 software or hardware limits on generating ICMP messages. A site 1446 administrator deploying a solution that relies on the SERs generating 1447 ICMP messages could try to improve the performance of SERs for 1448 generating ICMP messages. However, in a large network, it is still 1449 likely that ICMP message generation limits will be reached. As a 1450 result, hosts would not receive ICMPv6 messages back, which in turn leads to 1451 traffic blackholing and poor user experience. To improve the 1452 scalability of ICMPv6-based signaling, hosts SHOULD cache the 1453 preferred source address (or prefix) for the given destination (which 1454 in turn might cause issues in case of failure of the corresponding ISP 1455 uplink; see Section 6.3). In addition, the same source prefix 1456 SHOULD be used for other destinations in the same /64 as the original 1457 destination address. The source prefix to the destination mapping 1458 SHOULD have a specific lifetime. Expiration of the lifetime SHOULD 1459 trigger the source address selection algorithm again. 1461 Using ICMPv6 Destination Unreachable Messages with Code 5 to 1462 influence source address selection introduces some security 1463 challenges which are discussed in Section 10. 1465 As currently standardized in [RFC4443], the ICMPv6 Destination 1466 Unreachable Message with Code 5 would allow for the iterative 1467 approach to retransmitting packets using different source addresses. 1468 As currently defined, the ICMPv6 message does not provide a mechanism 1469 to communicate information about which source prefix should be used 1470 for a retransmitted packet.
The current document does not define 1471 such a mechanism but it might be a useful extension to define in a 1472 different document. However, this approach has some security 1473 implications, such as the ability of an attacker to send spoofed 1474 ICMPv6 messages that signal an invalid/unreachable source prefix, causing a 1475 DoS-type attack. 1477 6.2.4. Summary of Methods For Controlling Source Address Selection To 1478 Implement Routing Policy 1480 So to summarize this section, we have looked at three methods for 1481 implementing a simple routing policy where all traffic for a given 1482 destination on the Internet needs to use a particular ISP, even when 1483 the uplinks to both ISPs are working. 1485 The default source address selection policy cannot distinguish 1486 between the source addresses needed to enforce this policy, so a non- 1487 default policy table associating source and destination 1488 prefixes using Label values would need to be installed on each host. 1489 A mechanism exists for DHCPv6 to distribute a non-default policy 1490 table but such a solution would rely heavily on DHCPv6 support by host 1491 operating systems. Moreover, there is no mechanism to translate 1492 desired routing/traffic engineering policies into policy tables on 1493 DHCPv6 servers. Therefore using DHCPv6 for controlling address 1494 selection policy table is not recommended and SHOULD NOT be used. 1496 At the same time, Router Advertisements provide a reliable mechanism 1497 to influence the source address selection process via PIO, RIO and 1498 default router preferences. As all those options have been 1499 standardized by the IETF and are supported by various operating systems, 1500 no changes are required on hosts. First-hop routers in the 1501 enterprise network need to be capable of sending different RAs for 1502 different SLAAC prefixes (either based on scoped forwarding tables or 1503 based on pre-configured policies).
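The interaction between per-prefix RAs, default router preference, RIOs, and Rule 5.5 described in Section 6.2.2 can be illustrated with the following Python sketch. It is a simplified model, not a host implementation: the RA class and the functions next_hop and select are hypothetical names, the preference carried inside each RIO is ignored (only prefix length is compared), and the Router Lifetime value of 1800 is just an illustrative non-zero value.

```python
import ipaddress
from dataclasses import dataclass, field

# Default router preference values from RFC 4191, ranked for comparison.
PREF_RANK = {"high": 2, "medium": 1, "low": 0}

@dataclass
class RA:
    lla: str                      # link-local source address of the RA
    pio: list                     # prefixes advertised in PIOs
    pref: str = "medium"          # default router preference
    router_lifetime: int = 1800   # 0 means "not a default router"
    rio: list = field(default_factory=list)  # prefixes advertised in RIOs

def next_hop(ras, dst):
    """Pick the RA sender used as next-hop for dst."""
    d = ipaddress.ip_address(dst)
    # A matching RIO wins over the default route; most specific first.
    matches = [(ipaddress.ip_network(p).prefixlen, ra)
               for ra in ras for p in ra.rio
               if d in ipaddress.ip_network(p)]
    if matches:
        return max(matches, key=lambda m: m[0])[1]
    defaults = [ra for ra in ras if ra.router_lifetime > 0]
    if not defaults:
        return None
    return max(defaults, key=lambda ra: PREF_RANK[ra.pref])

def select(ras, host_addrs, dst):
    """Rule 5.5: prefer a source in a prefix advertised by the next-hop."""
    nh = next_hop(ras, dst)
    if nh is None:
        return None, None
    nets = [ipaddress.ip_network(p) for p in nh.pio]
    for addr in host_addrs:
        if any(ipaddress.ip_address(addr) in n for n in nets):
            return nh.lla, addr
    return nh.lla, None

# H41 as seen in Section 6.2.2: ISP-A preferred, H61 reached via ISP-B.
ras = [
    RA("LLA_A", ["2001:db8:0:a020::/64"], pref="high"),
    RA("LLA_B", ["2001:db8:0:b020::/64"], pref="low",
       rio=["2001:db8:0:6666::/64"]),
]
h41 = ["2001:db8:0:a020::41", "2001:db8:0:b020::41"]
to_h101 = select(ras, h41, "2001:db8:0:1234::101")
to_h61 = select(ras, h41, "2001:db8:0:6666::61")
```

Setting router_lifetime to 0 on the LLA_B entry leaves the RIO match intact, which corresponds to the case where ISP-B is used only for ISP-B services.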
1505 SERs can enforce the routing policy by sending ICMPv6 Destination 1506 Unreachable messages with Code 5 (Source address failed ingress/ 1507 egress policy) for traffic that is being sent with the wrong source 1508 address. The policy distribution could be automated by defining an 1509 EXCLUSIVE flag for the source-prefix-scoped route which can be set on 1510 the SER that originates the route. As ICMPv6 message generation can 1511 be rate-limited on routers, it SHOULD NOT be used as the only 1512 mechanism to influence source address selection on hosts. While 1513 hosts SHOULD select the correct source address for a given 1514 destination the network SHOULD signal any source address issues back 1515 to hosts using ICMPv6 error messages. 1517 6.3. Selecting Source Address When One Uplink Has Failed 1519 Now we discuss if DHCPv6, Neighbor Discovery Router Advertisements, 1520 and ICMPv6 can help a host choose the right source address when an 1521 uplink to one of the ISPs has failed. Again we look at the scenario 1522 in Figure 3. This time we look at traffic from H31 destined for 1523 external host H501 at D=2001:db8:0:5678::501. We initially assume 1524 that the uplink from SERa to ISP-A is working and that the uplink 1525 from SERb1 to ISP-B is working. 1527 We assume there is no particular routing policy desired, so H31 is 1528 free to send packets with S=2001:db8:0:a010::31 or 1529 S=2001:db8:0:b010::31 and have them delivered to H501. For this 1530 example, we assume that H31 has chosen S=2001:db8:0:b010::31 so that 1531 the packets exit via SERb to ISP-B. Now we see what happens when the 1532 link from SERb1 to ISP-B fails. How should H31 learn that it needs 1533 to start sending the packet to H501 with S=2001:db8:0:a010::31 in 1534 order to start using the uplink to ISP-A? We need to do this in a 1535 way that doesn't prevent H31 from still sending packets with 1536 S=2001:db8:0:b010::31 in order to reach H61 at D=2001:db8:0:6666::61. 1538 6.3.1. 
Controlling Source Address Selection With DHCPv6 1540 For this example we assume that the site network in Figure 3 has a 1541 centralized DHCP server and all routers act as DHCP relay agents. We 1542 assume that both of the addresses assigned to H31 were assigned via 1543 DHCP. 1545 We could try to have the DHCP server monitor the state of the uplink 1546 from SERb1 to ISP-B in some manner and then tell H31 that it can no 1547 longer use S=2001:db8:0:b010::31 by setting its valid lifetime to 1548 zero. The DHCP server could initiate this process by sending a 1549 Reconfigure Message to H31 as described in Section 18.3 of [RFC8415]. 1550 Alternatively, the DHCP server can assign addresses with short lifetimes in order 1551 to force clients to renew them often. 1553 This approach would prevent H31 from using S=2001:db8:0:b010::31 to 1554 reach a host on the Internet. However, it would also prevent H31 1555 from using S=2001:db8:0:b010::31 to reach H61 at 1556 D=2001:db8:0:6666::61, which is not desirable. 1558 Another potential approach is to have the DHCP server monitor the 1559 uplink from SERb1 to ISP-B and control the choice of source address 1560 on H31 by updating its address selection policy table via the 1561 mechanism in [RFC7078]. The DHCP server could initiate this process 1562 by sending a Reconfigure Message to H31. Note that [RFC8415] 1563 requires that Reconfigure Messages use DHCP authentication. DHCP 1564 authentication could be avoided by using short address lifetimes to 1565 force clients to send Renew messages to the server often. If the 1566 host is not obtaining its IP addresses from the DHCP server, then it 1567 would need to use the Information Refresh Time option defined in 1568 [RFC8415]. 1570 If the following policy table can be installed on H31 after the 1571 failure of the uplink from SERb1, then the desired routing behavior 1572 should be achieved based on source and destination prefix being 1573 matched with label values.
1575 Prefix Precedence Label 1576 ::/0 50 44 1577 2001:db8:0:a000::/52 50 44 1578 2001:db8:0:6666::/64 50 55 1579 2001:db8:0:b000::/52 50 55 1581 Figure 10: Policy Table Needed On Failure Of Uplink From SERb1 1583 The described solution has a number of significant drawbacks, some of 1584 them already discussed in Section 6.2.1. 1586 o DHCPv6 support is not required for an IPv6 host and there are 1587 operating systems which do not support DHCPv6. Besides that, it 1588 does not appear that [RFC7078] has been widely implemented on host 1589 operating systems. 1591 o [RFC7078] does not clearly specify this kind of a dynamic use case 1592 where address selection policy needs to be updated quickly in 1593 response to the failure of a link. In a large network it would 1594 present scalability issues as many hosts need to be reconfigured 1595 in very short period of time. 1597 o Updating DHCPv6 server configuration each time an ISP uplink 1598 changes its state introduces some scalability issues, especially 1599 for mid/large distributed scale enterprise networks. In addition 1600 to that, the policy table needs to be manually configured by 1601 administrators which makes that solution prone to human error. 1603 o No mechanism exists for making DHCPv6 servers aware of network 1604 topology/routing changes in the network. In general DHCPv6 1605 servers monitoring network-related events sounds like a bad idea 1606 as completely new functionality beyond the scope of DHCPv6 role is 1607 required. 1609 6.3.2. Controlling Source Address Selection With Router Advertisements 1611 The same mechanism as discussed in Section 6.2.2 can be used to 1612 control the source address selection in the case of an uplink 1613 failure. If a particular prefix should not be used as a source for 1614 any destinations, then the router needs to send RA with Preferred 1615 Lifetime field for that prefix set to 0. 
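The host-side effect of a PIO with Preferred Lifetime set to 0 can be sketched in Python as follows. The sketch assumes that an address whose preferred lifetime is 0 is treated as deprecated and is then avoided under Rule 3; the function names apply_pio and candidate_sources, and the lifetime value 14400, are hypothetical and chosen only for illustration.

```python
import ipaddress

def apply_pio(addresses, prefix, preferred_lifetime):
    """Update the preferred lifetime of addresses inside an advertised prefix.

    addresses maps each address string to its preferred lifetime (seconds).
    """
    net = ipaddress.ip_network(prefix)
    return {
        addr: (preferred_lifetime if ipaddress.ip_address(addr) in net else pl)
        for addr, pl in addresses.items()
    }

def candidate_sources(addresses):
    """Rule 3 as a preference: avoid deprecated addresses when possible."""
    preferred = [addr for addr, pl in addresses.items() if pl > 0]
    # Deprecated addresses are only a last resort, not forbidden.
    return preferred or list(addresses)

# H41 holds one address per ISP; the router then deprecates the ISP-B prefix.
h41 = {"2001:db8:0:a020::41": 14400, "2001:db8:0:b020::41": 14400}
h41 = apply_pio(h41, "2001:db8:0:b020::/64", 0)
usable = candidate_sources(h41)
```

Because the deprecated address stays configured, the sketch does not model what happens to connections that were already using it.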
1617 Let's consider a scenario when all uplinks are operational and H41 1618 receives two different RAs from R3: one from LLA_A with PIO for 1619 2001:db8:0:a020::/64, default router preference set to 11 (low) and 1620 another one from LLA_B with PIO for 2001:db8:0:b020::/64, default 1621 router preference set to 01 (high) and RIO for 2001:db8:0:6666::/64. 1623 As a result H41 is using 2001:db8:0:b020::41 as a source address for 1624 all Internet traffic and those packets are sent by SERs to ISP-B. If 1625 the SERb1 uplink to ISP-B fails, the desired behavior is that H41 stops 1626 using 2001:db8:0:b020::41 as a source address for all destinations 1627 but H61. To achieve that, R3 should react to the SERb1 uplink failure 1628 (which could be detected as the scoped route (S=2001:db8:0:b000::/52, 1629 D=::/0) disappearance) by withdrawing itself as a default router. R3 1630 sends a new RA from LLA_B with Router Lifetime value set to 0 (which 1631 means that it should not be used as a default router). That RA still 1632 contains PIO for 2001:db8:0:b020::/64 (for SLAAC purposes) and RIO 1633 for 2001:db8:0:6666::/64 so H41 can reach H61 using LLA_B as a next- 1634 hop and 2001:db8:0:b020::41 as a source address. For all traffic 1635 following the default route, LLA_A will be used as a next-hop and 1636 2001:db8:0:a020::41 as a source address. 1638 If all uplinks to ISP-B have failed and therefore source addresses 1639 from ISP-B address space should not be used at all, the forwarding 1640 table scoped to S=2001:db8:0:b000::/52 contains no entries. Hosts can 1641 be instructed to stop using source addresses from that block by 1642 sending RAs containing PIO with Preferred Lifetime set to 0. 1644 6.3.3. Controlling Source Address Selection With ICMPv6 1646 Now we look at how ICMPv6 messages can provide information back to 1647 H31. We assume again that at the time of the failure H31 is sending 1648 packets to H501 using (S=2001:db8:0:b010::31, 1649 D=2001:db8:0:5678::501).
When the uplink from SERb1 to ISP-B fails, 1650 SERb1 would stop originating its source-prefix-scoped route for the 1651 default destination (S=2001:db8:0:b000::/52, D=::/0) as well as its 1652 unscoped default destination route. With these routes no longer in 1653 the IGP, traffic with (S=2001:db8:0:b010::31, D=2001:db8:0:5678::501) 1654 would end up at SERa based on the unscoped default destination route 1655 being originated by SERa. Since that traffic has the wrong source 1656 address to be forwarded to ISP-A, SERa would drop it and send a 1657 Destination Unreachable message with Code 5 (Source address failed 1658 ingress/egress policy) back to H31. H31 would then know to use 1659 another source address for that destination and would try with 1660 (S=2001:db8:0:a010::31, D=2001:db8:0:5678::501). This would be 1661 forwarded to SERa based on the source-prefix-scoped default 1662 destination route still being originated by SERa, and SERa would 1663 forward it to ISP-A. As discussed above, if we are willing to extend 1664 ICMPv6, SERa can even tell H31 what source address it should use to 1665 reach that destination. The expected host behaviour has been 1666 discussed in Section 6.2.3. Using ICMPv6 would have the same 1667 scalability/rate limiting issues discussed in Section 6.2.3. ISP-B 1668 uplink failure immediately makes source addresses from 1669 2001:db8:0:b000::/52 unsuitable for external communication and might 1670 trigger a large number of ICMPv6 packets being sent to hosts in that 1671 subnet. 1673 6.3.4. Summary Of Methods For Controlling Source Address Selection On 1674 The Failure Of An Uplink 1676 It appears that DHCPv6 is not particularly well suited to quickly 1677 changing the source address used by a host in the event of the 1678 failure of an uplink, which eliminates DHCPv6 from the list of 1679 potential solutions. 
On the other hand, Router Advertisements 1680 provide a reliable mechanism to dynamically provide hosts with a 1681 list of valid prefixes to use as source addresses as well as to prevent 1682 particular prefixes from being used. While no additional new features are 1683 required to be implemented on hosts, routers need to be able to send 1684 RAs based on the state of scoped forwarding table entries and to 1685 react to network topology changes by sending RAs with particular 1686 parameters set. 1688 The use of ICMPv6 Destination Unreachable messages generated by the 1689 SER (or any SADR-capable) routers appears to have the potential 1690 to provide a support mechanism together with RAs to signal source 1691 address selection errors back to hosts. However, scalability issues 1692 may arise in large networks in case of sudden topology change. 1693 Therefore it is highly desirable that hosts are able to select the 1694 correct source address in case of uplink failure, with ICMPv6 being 1695 an additional mechanism to signal unexpected failures back to hosts. 1697 The current behavior of different host operating systems when 1698 receiving an ICMPv6 Destination Unreachable message with Code 5 (Source 1699 address failed ingress/egress policy) is not clear to the authors. 1700 Information from implementers, users, and testing would be quite 1701 helpful in evaluating this approach. 1703 6.4. Selecting Source Address Upon Failed Uplink Recovery 1705 The next logical step is to look at the scenario when a failed uplink 1706 on SERb1 to ISP-B is coming back up, so hosts can start using source 1707 addresses belonging to 2001:db8:0:b000::/52 again. 1709 6.4.1. Controlling Source Address Selection With DHCPv6 1711 The mechanism to use DHCPv6 to instruct the hosts (H31 in our 1712 example) to start using prefixes from ISP-B space (e.g.
1713 S=2001:db8:0:b010::31 for H31) to reach hosts on the Internet is 1714 quite similar to the one discussed in Section 6.3.1 and shares the same 1715 drawbacks. 1717 6.4.2. Controlling Source Address Selection With Router Advertisements 1719 Let's look at the scenario discussed in Section 6.3.2. If the 1720 uplink(s) failure caused the complete withdrawal of prefixes from 1721 2001:db8:0:b000::/52 address space by setting Preferred Lifetime 1722 value to 0, then the recovery of the link should just trigger a new RA 1723 being sent with non-zero Preferred Lifetime. In another scenario 1724 discussed in Section 6.3.2, the SERb1 uplink to ISP-B failure led 1725 to the disappearance of the (S=2001:db8:0:b000::/52, D=::/0) entry from 1726 the forwarding table scoped to S=2001:db8:0:b000::/52 and, in turn, 1727 caused R3 to send RAs from LLA_B with Router Lifetime set to 0. The 1728 recovery of the SERb1 uplink to ISP-B leads to 1729 (S=2001:db8:0:b000::/52, D=::/0) scoped forwarding entry re- 1730 appearance and instructs R3 that it should advertise itself as a 1731 default router for ISP-B address space domain (send RAs from LLA_B 1732 with non-zero Router Lifetime). 1734 6.4.3. Controlling Source Address Selection With ICMPv6 1736 It looks like ICMPv6 provides rather limited functionality to 1737 signal back to hosts that particular source addresses have become 1738 valid again. Unless the change in the uplink state affects a particular 1739 (S,D) pair, hosts can keep using the same source address even after 1740 an ISP uplink has come back up. For example, after the uplink from 1741 SERb1 to ISP-B had failed, H31 received an ICMPv6 Code 5 message (as 1742 described in Section 6.3.3) and presumably started using 1743 (S=2001:db8:0:a010::31, D=2001:db8:0:5678::501) to reach H501. Now 1744 when the SERb1 uplink comes back up, the packets with that (S,D) pair 1745 are still routed to SERa and sent to the Internet.
Therefore H31 is 1746 not informed that it should stop using 2001:db8:0:a010::31 and start 1747 using 2001:db8:0:b010::31 again. Unless SERa has a policy configured 1748 to drop packets (S=2001:db8:0:a010::31, D=2001:db8:0:5678::501) and 1749 send ICMPv6 back if the SERb1 uplink to ISP-B is up, H31 will be unaware 1750 of the network topology change and keep using S=2001:db8:0:a010::31 1751 for Internet destinations, including H501. 1753 One possible option is to use a scoped route with the EXCLUSIVE 1754 flag as described in Section 6.2.3. SERa uplink recovery would 1755 cause the (S=2001:db8:0:a000::/52, D=2001:db8:0:1234::/64) route to 1756 reappear in the routing table. In the absence of that route, packets 1757 to H101 were sent to ISP-B (as the ISP-A uplink was down) with 1758 source addresses from 2001:db8:0:b000::/52. When the route re- 1759 appears, SERb1 would reject those packets and send ICMPv6 back as 1760 discussed in Section 6.2.3. In practice, it might lead to scalability 1761 issues which have been already discussed in Section 6.2.3 and 1762 Section 6.3.3. 1764 6.4.4. Summary Of Methods For Controlling Source Address Selection Upon 1765 Failed Uplink Recovery 1767 Once again DHCPv6 does not look like a reasonable choice to manipulate 1768 the source address selection process on a host in the case of network 1769 topology changes. Using Router Advertisements provides a flexible 1770 mechanism to dynamically react to network topology changes (if 1771 routers are able to use routing changes as a trigger for sending out 1772 RAs with specific parameters). ICMPv6 could be considered as a 1773 supporting mechanism to signal an incorrect source address back to hosts 1774 but should not be considered as the only mechanism to control the 1775 address selection in multihomed environments. 1777 6.5. Selecting Source Address When All Uplinks Failed 1779 One particularly tricky case is a scenario when all uplinks have 1780 failed.
In that case there is no valid source address to be used for 1781 any external destinations while it might be desirable to have intra- 1782 site connectivity. 1784 6.5.1. Controlling Source Address Selection With DHCPv6 1786 From the DHCPv6 perspective, the failure of all uplinks should be treated as two 1787 independent failures and processed as described in Section 6.3.1. At 1788 this stage it is quite obvious that it would result in a quite 1789 complicated policy table which needs to be explicitly configured by 1790 administrators and therefore seems to be impractical. 1792 6.5.2. Controlling Source Address Selection With Router Advertisements 1794 As discussed in Section 6.3.2, an uplink failure causes the scoped 1795 default entry to disappear from the scoped forwarding table and 1796 triggers RAs with zero Router Lifetime. Complete disappearance of 1797 all scoped entries for a given source prefix would cause the prefix 1798 to be withdrawn from hosts by setting the Preferred Lifetime value to 1799 zero in the PIO. If all uplinks (SERa, SERb1 and SERb2) fail, hosts 1800 lose their default routers and/or have no global IPv6 1801 addresses to use as a source. (Note that 'uplink failure' might mean 1802 'IPv6 connectivity failure with IPv4 still being reachable', in which 1803 case hosts might fall back to IPv4 if there is IPv4 connectivity to 1804 destinations). As a result, intra-site connectivity is broken. One 1805 possible way to solve this is to use ULAs. 1807 All hosts have ULAs assigned in addition to GUAs, and these are used 1808 for intra-site communication even if there is no GUA assigned to a 1809 host. To avoid accidental leaking of packets with ULA sources, SADR- 1810 capable routers SHOULD have a scoped forwarding table for ULA sources 1811 for internal routes but MUST NOT have an entry for D=::/0 in that 1812 table.
   In the absence of (S=ULA_Prefix; D=::/0), first-hop routers will
   send dedicated RAs from a unique link-local source LLA_ULA with a
   PIO from the ULA address space, an RIO for the ULA prefix and Router
   Lifetime set to zero.  This behaviour is consistent with the
   situation when SERb1 has lost the uplink to ISP-B (so there is no
   Internet connectivity from 2001:db8:0:b000::/52 sources) but those
   sources can still be used to reach some specific destinations.  In
   the case of ULAs, there is no Internet connectivity from ULA sources
   but they can be used to reach other ULA destinations.  Note that ULA
   usage could be particularly useful if all ISPs assign prefixes via
   DHCP-PD.  In the absence of ULAs, upon failure of all uplinks hosts
   would lose all their GUAs upon prefix lifetime expiration, which
   again makes intra-site communication impossible.

   It should be noted that Rule 5.5 (prefer a prefix advertised by the
   selected next-hop) takes precedence over Rule 6 (prefer matching
   label, which ensures that GUA source addresses are preferred over
   ULAs for GUA destinations).  Therefore, if ULAs are used, the
   network administrator needs to ensure that, while the site has
   Internet connectivity, hosts do not select a router which advertises
   ULA prefixes as their default router.

6.5.3.  Controlling Source Address Selection With ICMPv6

   If all uplinks fail, all SERs will drop outgoing IPv6 traffic and
   respond with ICMPv6 error messages.  In a large network where many
   hosts are trying to reach Internet destinations, this means that
   SERs need to generate an ICMPv6 error for every packet they receive
   from hosts, which presents the same scalability issues discussed in
   Section 6.3.3.

6.5.4.  Summary Of Methods For Controlling Source Address Selection When
        All Uplinks Failed

   Again, combining SADR with Router Advertisements seems to be the
   most flexible and scalable way to control the source address
   selection on hosts.

6.6.  Summary Of Methods For Controlling Source Address Selection

   To summarize the scenarios and options discussed above:

   While DHCPv6 allows administrators to manipulate source address
   selection policy tables, this method has a number of significant
   disadvantages which eliminate DHCPv6 from the list of potential
   solutions:

   1.  It requires hosts to support DHCPv6 and its extension (RFC7078);

   2.  The DHCPv6 server needs to monitor network state and detect
       routing changes.

   3.  The use of policy tables requires manual configuration and might
       be extremely complicated, especially in the case of a
       distributed network where a large number of remote sites are
       served by centralized DHCPv6 servers.

   4.  Network topology/routing policy changes could trigger
       simultaneous reconfiguration of a large number of hosts, which
       presents serious scalability issues.

   The use of Router Advertisements to influence the source address
   selection on hosts seems to be the most reliable, flexible and
   scalable solution.  It has the following benefits:

   1.  no new (non-standard) functionality needs to be implemented on
       hosts (except for [RFC4191] RIO support, which remains at the
       time of this writing not widely implemented);

   2.  no changes in RA format;

   3.  routers can react to routing table changes by sending RAs, which
       would minimize the failover time in the case of network topology
       changes;

   4.  information required for source address selection is broadcast
       to all affected hosts in case of a topology change event, which
       improves the scalability of the solution (compared to DHCPv6
       reconfiguration or ICMPv6 error messages).
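
   The RA behaviour summarized above (and described in Section 6.5.2)
   can be illustrated with a minimal sketch.  It assumes a first-hop
   router that keeps one list of destination prefixes per scoped
   forwarding table.  The names ScopedRA and build_scoped_ra, the
   lifetime constants and the example RIO prefix are hypothetical and
   chosen only for illustration; this is not a definitive
   implementation.

```python
from dataclasses import dataclass, field
from ipaddress import IPv6Network
from typing import List

DEFAULT_ROUTE = IPv6Network("::/0")

# Assumed, illustrative lifetimes in seconds; the document prescribes none.
ROUTER_LIFETIME = 1800
PREFERRED_LIFETIME = 604800


@dataclass
class ScopedRA:
    """Parameters of one dedicated RA (hypothetical structure)."""
    pio_prefix: IPv6Network
    router_lifetime: int
    preferred_lifetime: int
    rio_prefixes: List[IPv6Network] = field(default_factory=list)


def build_scoped_ra(pio_prefix: str, scoped_routes: List[str]) -> ScopedRA:
    """Derive RA parameters from the destinations in one scoped table."""
    routes = [IPv6Network(r) for r in scoped_routes]
    has_default = DEFAULT_ROUTE in routes
    return ScopedRA(
        pio_prefix=IPv6Network(pio_prefix),
        # No scoped default route: advertise zero Router Lifetime.
        router_lifetime=ROUTER_LIFETIME if has_default else 0,
        # No scoped entries at all: deprecate the prefix on hosts.
        preferred_lifetime=PREFERRED_LIFETIME if routes else 0,
        # Remaining specific destinations are announced as RIOs.
        rio_prefixes=[r for r in routes if r != DEFAULT_ROUTE],
    )


# ISP-B uplink down, one specific ISP-B destination still reachable
# (the RIO prefix here is an assumed example).
ra = build_scoped_ra("2001:db8:0:b010::/64", ["2001:db8:0:6666::/64"])
print(ra.router_lifetime, ra.preferred_lifetime, ra.rio_prefixes)
```

   In this sketch a router would call build_scoped_ra again whenever
   the contents of a scoped forwarding table change and send the
   resulting RA from the link-local address dedicated to that scope.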

   To fully benefit from the RA-based solution, first-hop routers need
   to implement SADR, belong to the SADR domain and be able to send
   dedicated RAs per scoped forwarding table as discussed above,
   reacting to network changes by sending new RAs.  It should be noted
   that the proposed solution would work even if first-hop routers are
   not SADR-capable but are still able to send individual RAs for each
   ISP prefix and react to topology changes as discussed above (e.g.
   via configuration knobs).

   The RA-based solution relies heavily on hosts correctly implementing
   the default address selection algorithm as defined in [RFC6724].
   While the basic (and most common) multihoming scenario (two or more
   Internet uplinks, no "walled gardens") would work for any host
   supporting the minimal implementation of [RFC6724], more complex use
   cases (such as "walled gardens" and other scenarios when some ISP
   resources can only be reached from that ISP address space) require
   that hosts support Rule 5.5 of the default address selection
   algorithm.  There is some evidence that not all host OSes have that
   rule implemented currently.  However, it should be noted that
   [RFC8028] states that Rule 5.5 should be implemented.

   ICMPv6 Code 5 error messages SHOULD be used to complement the
   RA-based solution to signal incorrect source address selection back
   to hosts, but they SHOULD NOT be considered as a stand-alone
   solution.  To prevent scenarios when hosts in multihomed
   environments incorrectly identify onlink/offlink destinations, hosts
   SHOULD treat ICMPv6 Redirects as discussed in [RFC8028].

6.7.  Solution Limitations

6.7.1.  Connections Preservation

   The proposed solution is not designed to preserve connection state
   in case of an uplink failure.
   When all uplinks to an ISP go down, all transport connections
   established to/from that ISP address space will be interrupted
   (unless the transport protocol has specific multihoming support).
   That behaviour is similar to the scenario of IPv4 multihoming with
   NAT, when an uplink failure causes all connections to be NATed to
   completely different public IPv4 addresses.  While it does sound
   suboptimal, it is determined by the nature of PA address space: if
   all uplinks to a particular ISP have failed, there is no path for
   the ingress traffic to reach the site and the egress traffic is
   supposed to be dropped by the BCP38 [RFC2827] ingress filters.  The
   only potential way to overcome this limitation would be running BGP
   with all ISPs and advertising all site prefixes to all uplinks - a
   solution which shares all the drawbacks of using PI address space
   without having its benefits.  Networks willing and capable of
   running BGP and using PI are out of scope of this document.

   It should be noted that in the case of IPv4 NAT-based multihoming,
   uplink recovery could cause connection interruptions as well (unless
   packet forwarding is integrated with existing NAT session tracking
   so the egress interface for the existing sessions is not changed).
   However, the proposed solution has the benefit of preserving the
   existing sessions during/after the failed uplink restoration.
   Unlike the uplink failure event, which causes all addresses from the
   affected prefix to be deprecated, the recovery would just add new
   preferred addresses to a host without making any addresses
   unavailable.  Therefore connections established to/from those
   addresses do not have to be interrupted.

   While it's desirable for active connections to survive ISP failover
   events, for sites using PA address space such events affect the
   reachability of IP addresses assigned to hosts.
   Unless the transport (or even higher-level) protocols are capable of
   surviving host renumbering, the active connections will be broken.
   The proposed solution focuses on minimizing the impact of failover
   for new connections and for multipath-aware protocols.

6.8.  Other Configuration Parameters

6.8.1.  DNS Configuration

   In a multihomed environment each ISP might provide its own list of
   DNS servers.  For example, in the topology shown in Figure 3, ISP-A
   might provide recursive DNS server H51 2001:db8:0:5555::51, while
   ISP-B might provide H61 2001:db8:0:6666::61 as a recursive DNS
   server.  [RFC8106] defines IPv6 Router Advertisement options to
   allow IPv6 routers to advertise a list of DNS recursive server
   addresses and a DNS Search List to IPv6 hosts.  Using RDNSS together
   with 'scoped' RAs as described above would allow a first-hop router
   (R3 in Figure 3) to send DNS server addresses and search lists
   provided by each ISP (or the corporate DNS server addresses if the
   enterprise is running its own DNS servers; as discussed below, the
   DNS split-horizon problem is too hard to solve without running a
   local DNS server).

   As discussed in Section 6.5.2, failure of all ISP uplinks would
   cause deprecation of all addresses assigned to a host from the
   address space of all ISPs.  If any intra-site IPv6 connectivity is
   still desirable (most likely to be the case for any mid/large scale
   network), then ULAs should be used as discussed in Section 6.5.2.
   In such a scenario, the enterprise network should run its own
   recursive DNS server(s) and provide their ULA addresses to hosts via
   RDNSS in RAs sent for the ULA-scoped forwarding table as described
   in Section 6.5.2.
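
   As an illustration of the RDNSS information a router would place in
   such an RA, the following minimal sketch encodes one RDNSS option
   using the layout defined in [RFC8106] (Type 25, Length in units of
   8 octets, 16 reserved bits, a 32-bit Lifetime, then the server
   addresses).  The function name encode_rdnss_option, the ULA server
   address and the lifetime value are assumptions made for the
   example; they do not come from this document.

```python
import struct
from ipaddress import IPv6Address
from typing import List

RDNSS_OPTION_TYPE = 25  # Recursive DNS Server option type (RFC 8106)


def encode_rdnss_option(servers: List[str], lifetime: int) -> bytes:
    """Encode one RDNSS option carrying the given server addresses."""
    addrs = [IPv6Address(s) for s in servers]
    if not addrs:
        raise ValueError("an RDNSS option needs at least one address")
    # Length is counted in 8-octet units: 1 for the fixed header plus
    # 2 for every 16-octet IPv6 address.
    length = 1 + 2 * len(addrs)
    header = struct.pack("!BBHI", RDNSS_OPTION_TYPE, length, 0, lifetime)
    return header + b"".join(a.packed for a in addrs)


# Hypothetical ULA address of an enterprise recursive DNS server and an
# assumed lifetime of 3600 seconds.
option = encode_rdnss_option(["fd00:db8:1::53"], 3600)
print(option.hex())
```

   A router following the approach above would include an option of
   this kind only in the RA sent for the ULA-scoped forwarding table,
   and the ISP-provided server addresses in the RAs scoped to the
   corresponding ISP prefixes.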

   There are some scenarios when the final outcome of the name
   resolution might be different depending on:

   o  which DNS server is used;

   o  which source address the client uses to send a DNS query to the
      server (DNS split horizon).

   There is currently no way to instruct a host to use a particular DNS
   server out of the configured server list for resolving a particular
   name.  Therefore it does not seem feasible to solve the problem of
   DNS server selection on the host (it should be noted that this
   particular issue is protocol-agnostic and happens for IPv4 as well).
   In such a scenario it is recommended that the enterprise run its own
   local recursive DNS server.

   To influence host source address selection for packets sent to a
   particular DNS server, the following requirements must be met:

   o  the host supports RIO as defined in [RFC4191];

   o  the routers send RIO for routes to DNS server addresses.

   For example, if it is desirable that host H31 reaches the ISP-A DNS
   server H51 2001:db8:0:5555::51 using its source address
   2001:db8:0:a010::31, then both R1 and R2 should send an RIO
   containing the route to 2001:db8:0:5555::51 (or a covering route) in
   their 'scoped' RAs, containing LLA_A as the default router address
   and the PIO for the SLAAC prefix 2001:db8:0:a010::/64.  In that case
   host H31 (if it supports Rule 5.5) would select LLA_A as a next-hop
   and then choose 2001:db8:0:a010::31 as the source address for
   packets to the DNS server.

   It should be noted that [RFC6106] explicitly prohibits using DNS
   information if the RA router Lifetime has expired: "An RDNSS address
   or a DNSSL domain name MUST be used only as long as both the RA
   router Lifetime (advertised by a Router Advertisement message) and
   the corresponding option Lifetime have not expired."
   Therefore hosts might ignore RDNSS information provided in
   ULA-scoped RAs, as those RAs would have Router Lifetime set to 0.
   However, the updated version of RFC6106 ([RFC8106]) has removed that
   requirement.

   As discussed above, the DNS split-horizon problem and the selection
   of the correct DNS server in a multihomed environment are not easy
   to solve.  The proper solution would require hosts to support the
   concept of multiple Provisioning Domains (PvD, a set of
   configuration information associated with a network, [RFC7556]).

7.  Deployment Considerations

   The solution described in this document requires certain mechanisms
   to be supported by the network infrastructure and hosts.  It
   requires some routers in the enterprise site to support some form of
   Source Address Dependent Routing (SADR).  It also requires hosts to
   be able to learn when the uplink to an ISP changes its state so that
   the corresponding source addresses should (or should not) be used.
   Ongoing work to create mechanisms to accomplish this is discussed in
   this document, but those mechanisms are still a work in progress.

7.1.  Deploying SADR Domain

   The proposed solution does not prescribe particular details
   regarding deploying an SADR domain within a multihomed enterprise
   network.  However, the following guidelines could be applied:

   o  The SADR domain is usually limited by the multihomed site border.

   o  The minimal deployable scenario requires enabling SADR on all
      SERs and including them in a single SADR domain.

   o  As discussed in Section 4.2, extending the connected SADR domain
      beyond that point down to the first-hop routers can produce more
      efficient forwarding paths and allow the network to fully benefit
      from SADR.  It would also simplify the operation of the SADR
      domain.

   o  During the incremental SADR domain expansion from the SERs down
      towards first-hop routers, it's important to ensure that at any
      moment in time all SADR-capable routers within the domain are
      logically connected (see Section 5).

7.2.  Hosts-Related Considerations

   The solution discussed in this document relies on the default
   address selection algorithm ([RFC6724]) Rule 5.5.  While [RFC6724]
   considers this rule as optional, the recent [RFC8028] states that "A
   host SHOULD select default routers for each prefix it is assigned an
   address in".  It also recommends that hosts should implement Rule
   5.5 of [RFC6724].  Therefore, while RFC8028-compliant hosts already
   have a mechanism to learn about ISP uplink state changes and to
   select the source addresses accordingly, many hosts do not support
   such a mechanism yet.

   It should be noted that a multihomed enterprise network utilizing
   multiple ISP prefixes can be considered as a typical multiple
   provisioning domain (mPVD) scenario, as described in [RFC7556].
   This document defines a way for the network to provide the PVD
   information to hosts indirectly, using the existing mechanisms.  At
   the same time [I-D.ietf-intarea-provisioning-domains] goes one step
   further and describes a comprehensive mechanism for hosts to
   discover the whole set of configuration information associated with
   different PVD/ISPs.  [I-D.ietf-intarea-provisioning-domains]
   complements this document by enabling hosts to learn about ISP
   uplink states and select the corresponding source addresses.

8.  Other Solutions

8.1.  Shim6

   The Shim6 working group specified the Shim6 protocol [RFC5533] which
   allows a host at a multihomed site to communicate with an external
   host and exchange information about possible source and destination
   address pairs that they can use to communicate.
   It also specified the REAP protocol [RFC5534] to detect failures in
   the path between working address pairs and find new working address
   pairs.  A fundamental requirement for Shim6 is that both internal
   and external hosts need to support Shim6.  That is, both the host
   internal to the multihomed site and the host external to the
   multihomed site need to support Shim6 in order for there to be any
   benefit for the internal host to run Shim6.  The Shim6 protocol
   specification was published in 2009, but it has not been widely
   implemented.  Therefore Shim6 is not considered a viable solution
   for enterprise multihoming.

8.2.  IPv6-to-IPv6 Network Prefix Translation

   IPv6-to-IPv6 Network Prefix Translation (NPTv6) [RFC6296] is not the
   focus of this document.  NPTv6 suffers from the same fundamental
   issue as any other address translation approach: it breaks
   end-to-end connectivity.  Therefore NPTv6 is not considered a
   desirable solution, and this document intentionally focuses on
   solving the enterprise multihoming problem without any form of
   address translation.

   With increasing interest and ongoing work in bringing path awareness
   to transport and application layer protocols, hosts might be able to
   determine the properties of the various network paths and choose
   among the paths available to them.  As selecting the correct source
   address is one of the possible mechanisms path-aware hosts may
   utilize, address translation negatively affects hosts' path
   awareness, which makes NPTv6 an even more undesirable solution.

8.3.  Multipath Transport

   Using multipath transport (such as MPTCP [RFC6824] or multipath
   capabilities in QUIC) might solve the problems discussed in
   Section 6, since it would allow hosts to use multiple source
   addresses for a single connection and switch between source
   addresses when a particular address becomes unavailable or a new
   address gets assigned to the host interface.  Therefore, if all
   hosts in the enterprise network are only using multipath transport
   for all connections, the signaling solution described in Section 6
   might not be needed (it should be noted that Source Address
   Dependent Routing would still be required to deliver packets to the
   correct uplinks).  At the time this document was written, multipath
   transport alone could not be considered a solution for the problem
   of selecting the source address in a multihomed environment.  There
   is a significant number of hosts which do not use multipath
   transport currently, and it seems unlikely that the situation is
   going to change in any foreseeable future (even if new releases of
   operating systems get multipath protocol support, there will be a
   long tail of legacy hosts).  The solution for enterprise multihoming
   needs to work for the least common denominator: hosts without
   multipath transport support.  In addition, not all protocols are
   using multipath transport.  While multipath transport would
   complement the solution described in Section 6, it could not be
   considered as a sole solution to the problem of source address
   selection in multihomed environments.

   On the other hand, PA-based multihoming could provide additional
   benefits for multipath protocols, should those protocols be deployed
   in the network.  Multipath protocols could leverage source address
   selection to achieve maximum path diversity (and potentially
   improved performance).

   Therefore deploying multipath protocols could not be considered as
   an alternative to the approach proposed in this document.  Instead
   both solutions complement each other, so deploying multipath
   protocols in a PA-based multihomed network proves mutually
   beneficial.

9.  IANA Considerations

   This memo asks the IANA for no new parameters.

10.  Security Considerations

   Section 6.2.3 discusses a mechanism for controlling source address
   selection on hosts using ICMPv6 messages.  Using ICMPv6 to influence
   source address selection allows an attacker to exhaust the list of
   candidate source addresses on the host by sending spoofed ICMPv6
   Code 5 messages for all prefixes known on the network (therefore
   preventing a victim from establishing communication with the
   destination host).  Another possible attack vector is using ICMPv6
   Destination Unreachable Messages with Code 5 to steer the egress
   traffic towards a particular ISP (for example, if the attacker has
   the ability to perform traffic sniffing or a man-in-the-middle
   attack in that ISP network).

   To prevent those attacks, hosts SHOULD verify that the original
   packet header included in the ICMPv6 error message was actually sent
   by the host (to ensure that the ICMPv6 message was triggered by a
   packet sent by the host).

   As ICMPv6 Destination Unreachable Messages with Code 5 could be
   originated by any SADR-capable router within the domain (or even
   come from the Internet), GTSM ([RFC5082]) cannot be applied.
   Filtering such ICMPv6 messages at the site border cannot be
   recommended, as it would break the legitimate end-to-end error
   signalling mechanism ICMPv6 is designed for.

   The security considerations of using stateless address
   autoconfiguration are discussed in [RFC4862].

11.  Acknowledgements

   The original outline was suggested by Ole Troan.

   The authors would like to thank the following people (in
   alphabetical order) for their review and feedback: Olivier
   Bonaventure, Deborah Brungard, Brian E Carpenter, Lorenzo Colitti,
   Roman Danyliw, Benjamin Kaduk, Suresh Krishnan, Mirja Kuhlewind,
   David Lamparter, Nicolai Leymann, Acee Lindem, Philip Matthews,
   Robert Raszuk, Alvaro Retana, Dave Thaler, Michael Tuxen, Martin
   Vigoureux, Eric Vyncke, Magnus Westerlund.

12.  References

12.1.  Normative References

   [RFC1918]  Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G.,
              and E. Lear, "Address Allocation for Private Internets",
              BCP 5, RFC 1918, DOI 10.17487/RFC1918, February 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2827]  Ferguson, P. and D. Senie, "Network Ingress Filtering:
              Defeating Denial of Service Attacks which employ IP Source
              Address Spoofing", BCP 38, RFC 2827, DOI 10.17487/RFC2827,
              May 2000.

   [RFC4191]  Draves, R. and D. Thaler, "Default Router Preferences and
              More-Specific Routes", RFC 4191, DOI 10.17487/RFC4191,
              November 2005.

   [RFC4193]  Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
              Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005.

   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
              Architecture", RFC 4291, DOI 10.17487/RFC4291, February
              2006.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC4861]  Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
              "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
              DOI 10.17487/RFC4861, September 2007.

   [RFC4862]  Thomson, S., Narten, T., and T.
              Jinmei, "IPv6 Stateless Address Autoconfiguration",
              RFC 4862, DOI 10.17487/RFC4862, September 2007.

   [RFC6106]  Jeong, J., Park, S., Beloeil, L., and S. Madanapalli,
              "IPv6 Router Advertisement Options for DNS Configuration",
              RFC 6106, DOI 10.17487/RFC6106, November 2010.

   [RFC6296]  Wasserman, M. and F. Baker, "IPv6-to-IPv6 Network Prefix
              Translation", RFC 6296, DOI 10.17487/RFC6296, June 2011.

   [RFC6724]  Thaler, D., Ed., Draves, R., Matsumoto, A., and T. Chown,
              "Default Address Selection for Internet Protocol Version 6
              (IPv6)", RFC 6724, DOI 10.17487/RFC6724, September 2012.

   [RFC7078]  Matsumoto, A., Fujisaki, T., and T. Chown, "Distributing
              Address Selection Policy Using DHCPv6", RFC 7078,
              DOI 10.17487/RFC7078, January 2014.

   [RFC7556]  Anipko, D., Ed., "Multiple Provisioning Domain
              Architecture", RFC 7556, DOI 10.17487/RFC7556, June 2015.

   [RFC8028]  Baker, F. and B. Carpenter, "First-Hop Router Selection by
              Hosts in a Multi-Prefix Network", RFC 8028,
              DOI 10.17487/RFC8028, November 2016.

   [RFC8106]  Jeong, J., Park, S., Beloeil, L., and S. Madanapalli,
              "IPv6 Router Advertisement Options for DNS Configuration",
              RFC 8106, DOI 10.17487/RFC8106, March 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8415]  Mrugalski, T., Siodelski, M., Volz, B., Yourtchenko, A.,
              Richardson, M., Jiang, S., Lemon, T., and T. Winters,
              "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)",
              RFC 8415, DOI 10.17487/RFC8415, November 2018.

12.2.  Informative References

   [I-D.ietf-intarea-provisioning-domains]
              Pfister, P., Vyncke, E., Pauly, T., Schinazi, D., and W.
              Shao, "Discovering Provisioning Domain Names and Data",
              draft-ietf-intarea-provisioning-domains-05 (work in
              progress), June 2019.

   [I-D.ietf-rtgwg-dst-src-routing]
              Lamparter, D. and A. Smirnov, "Destination/Source
              Routing", draft-ietf-rtgwg-dst-src-routing-07 (work in
              progress), March 2019.

   [I-D.pfister-6man-sadr-ra]
              Pfister, P., "Source Address Dependent Route Information
              Option for Router Advertisements",
              draft-pfister-6man-sadr-ra-01 (work in progress), June
              2015.

   [RFC3704]  Baker, F. and P. Savola, "Ingress Filtering for Multihomed
              Networks", BCP 84, RFC 3704, DOI 10.17487/RFC3704, March
              2004.

   [RFC4941]  Narten, T., Draves, R., and S. Krishnan, "Privacy
              Extensions for Stateless Address Autoconfiguration in
              IPv6", RFC 4941, DOI 10.17487/RFC4941, September 2007.

   [RFC5082]  Gill, V., Heasley, J., Meyer, D., Savola, P., Ed., and C.
              Pignataro, "The Generalized TTL Security Mechanism
              (GTSM)", RFC 5082, DOI 10.17487/RFC5082, October 2007.

   [RFC5533]  Nordmark, E. and M. Bagnulo, "Shim6: Level 3 Multihoming
              Shim Protocol for IPv6", RFC 5533, DOI 10.17487/RFC5533,
              June 2009.

   [RFC5534]  Arkko, J. and I. van Beijnum, "Failure Detection and
              Locator Pair Exploration Protocol for IPv6 Multihoming",
              RFC 5534, DOI 10.17487/RFC5534, June 2009.

   [RFC6434]  Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node
              Requirements", RFC 6434, DOI 10.17487/RFC6434, December
              2011.

   [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
              "TCP Extensions for Multipath Operation with Multiple
              Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013.

   [RFC7676]  Pignataro, C., Bonica, R., and S. Krishnan, "IPv6 Support
              for Generic Routing Encapsulation (GRE)", RFC 7676,
              DOI 10.17487/RFC7676, October 2015.

Authors' Addresses

   Fred Baker
   Santa Barbara, California  93117
   USA

   Email: FredBaker.IETF@gmail.com


   Chris Bowers
   Juniper Networks
   Sunnyvale, California  94089
   USA

   Email: cbowers@juniper.net


   Jen Linkova
   Google
   1 Darling Island Rd
   Pyrmont, NSW  2009
   AU

   Email: furry@google.com