Network Working Group                                           J. Arkko
Internet-Draft                                                  Ericsson
Expires: April 11, 2006                                  October 8, 2005


    Failure Detection and Locator Pair Exploration Design for IPv6
                              Multihoming
                 draft-ietf-shim6-failure-detection-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 11, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This draft discusses the issues of detecting failures in a currently
   used address pair between two hosts and picking a new address pair to
   be used when a failure occurs.  The draft also discusses the roles of
   a multihoming protocol versus network attachment functions at IP and
   link layers.

Table of Contents

   1.  Introduction
   2.  Requirements language
   3.  Related Work
   4.  Definitions
       4.1.  Available Addresses
       4.2.  Locally Operational Addresses
       4.3.  Operational Address Pairs
       4.4.  Primary Address Pair
       4.5.  Miscellaneous
   5.  Architectural Considerations
   6.  Solution
       6.1.  State Machines
       6.2.  Failure Detection
       6.3.  Alternative Locator Pair Exploration
             6.3.1.  Exploration Order
             6.3.2.  Exploration Protocol
   7.  Security Considerations
   8.  References
       8.1.  Normative References
       8.2.  Informative References
   Appendix A.  Contributors
   Appendix B.  Acknowledgements
   Author's Address
   Intellectual Property and Copyright Statements

1.  Introduction

   The SHIM6 working group is extending IPv6 to support multihoming.
   The focus of the group is to look at an IP layer (or layer 3.5)
   mechanism that hides multihoming from applications [23].  This
   mechanism needs to detect when a switch to another address or
   addresses becomes necessary.  We call this failure detection.

   This draft discusses what requirements such a component of the SHIM6
   protocol has, and how these requirements can be met.  The draft is
   structured as follows: Section 3 discusses what kind of solutions
   have been used in other similar protocols.  Section 4 defines a set
   of useful terms and discusses them, and Section 5 discusses the
   architectural implications of failure detection designs.  Finally,
   Section 6 describes one possible solution involving a mechanism to
   detect failures and an exploration protocol for working address
   pairs.

   For the purposes of this draft, we consider an address to be
   synonymous with a locator.  There may be other, higher level
   identifiers such as security associations, FQDNs, CGA public keys,
   HBA bindings, or HITs that tie the different locators used by a node
   together.

2.  Requirements language

   In this document, the key words "MAY", "MUST", "MUST NOT",
   "OPTIONAL", "RECOMMENDED", "SHOULD", and "SHOULD NOT" are to be
   interpreted as described in [1].

3.  Related Work

   Another SHIM6 document [10] discusses what kind of mechanisms can be
   used to detect whether the peer is still reachable at the currently
   used address.
   Two proposed mechanisms, Correspondent Unreachability Detection
   (CUD) and Forced Bidirectional Communication (FBD), are presented.
   CUD is based on getting positive feedback from upper layers, and on
   IPv6 NUD-like probing if there is no feedback.  FBD is based on
   forcing bidirectional communication by adding keepalive messages
   when there is no other payload traffic.

   In SCTP [11], the addresses of the endpoints are learned in the
   connection setup phase, either by listing them explicitly or by
   giving a DNS name that points to them.  In order to provide a
   failover mechanism between multihomed hosts, SCTP has the following
   functions:

   o  One of the peer's addresses is selected as the primary address by
      the application running on top of SCTP.  All data packets are sent
      to this address until there is a reason to choose another address,
      such as the failure of the primary address.

   o  Testing the reachability of the peer endpoint's addresses.  This
      is done either by observing the data packets sent to the peer or
      via a periodic heartbeat when there are no data packets to send.

      Each time a data packet retransmission is initiated (or when a
      heartbeat is not answered within the estimated round-trip time),
      an error counter is incremented.  When a configured error limit
      is reached, the particular destination address is marked as
      inactive.  The reception of an acknowledgement or heartbeat
      response clears the counter.

   o  Retransmission: When retransmitting, the endpoint attempts to
      pick the source-destination pair that is most "divergent" from
      the original source-destination pair to which the packet was
      transmitted.  Rules for such selection are, however, left as
      implementation decisions in SCTP.

   SCTP does not define how local knowledge (such as information learned
   from the link layer) should be used.
   SCTP also has no mechanism to deal with dynamic changes to the set
   of available addresses, although mechanisms for that are being
   developed [18].

   The MOBIKE protocol is currently being specified [16] [15].  This
   protocol operates in a mixed IPv4/IPv6 environment, and typically has
   to work through NATs.  The current design assumes that it needs to
   work only in symmetric connectivity scenarios.

   Some of the issues that have been discussed in the MOBIKE design
   phase include the following:

   o  Single address vs. multiple peer addresses.  A simple approach is
      to have the peers be aware of just the current address of the
      other side instead of all possible ones.  Assuming that one of the
      peers will request the other to start sending to a new address,
      this works well.  However, this approach is unable to deal with
      problems that affect both nodes.  For instance, two nodes
      connected by two separate point-to-point links will be unable to
      switch to the other link if a failure occurs on the first one.

   o  Addresses vs. address pairs.  Are tests and current paths tied to
      individual peer addresses, or to pairs of peer and own addresses
      (paths)?  It seems that some failure scenarios require the use of
      a path rather than a single address.  A network failure may make
      it impossible to communicate between a particular pair of
      addresses, even if those addresses have some other connectivity.

   o  Where the connectivity information comes from.  Does it come from
      the local stack (such as interface up/down, router
      advertisement), from reception of ESP packets, from IKEv2
      keepalives, or through some MOBIKE-defined mechanism?

   The mobility and multihoming specification for the HIP protocol [14]
   leaves the determination of when address updates are sent to a local
   policy, but suggests the use of local information and ICMP error
   messages.

   Network attachment procedures are also relevant for multihoming.
   The IPv6 and MIP6 working groups have standardized mechanisms to
   learn about networks that a node has attached to.  Basic IPv6
   Neighbor Discovery was, however, designed primarily for static
   situations.  The fully dynamic detection procedure has turned out to
   be relatively complex for mobile hosts, and it was not fully
   anticipated at the time IPv6 Neighbor Discovery or DHCP were being
   designed.  As a result, enhanced or optimized mechanisms are being
   designed in the DHC and DNA working groups [6] [7].

   ICE [17], STUN [12], and TURN [24] are also related mechanisms.  They
   are primarily used for NAT detection and communication through NATs
   in IPv4 environments, for applications such as voice over IP.  STUN
   uses a server in the Internet to discover the presence and type of
   NATs and the client's public IP addresses and ports.  TURN makes it
   possible to receive incoming connections in hosts behind NATs.  ICE
   makes use of these protocols in a peer-to-peer cooperative fashion,
   allowing participants to discover, create and verify mutual
   connectivity, and then use this connectivity for multimedia streams.
   While these mechanisms are not designed for dynamic and failure
   situations, they have many of the same requirements for the
   exploration of connectivity, as well as the requirement to deal with
   middleboxes.

   Related work in the IPv6 area includes RFC 3484 [5], which defines
   source and destination address selection rules for IPv6 in situations
   where multiple candidate address pairs exist.  RFC 3484 considers
   only a static situation, however, and does not take into account the
   effect of failures.  In the MULTI6 working group, [22] considers how
   applications can best re-initiate connections after failures.
   This work differs from the shim-layer approach selected for further
   development in the working group with respect to the timing of the
   address selection.  In the shim-layer approach, failure detection
   and the selection of new addresses can happen at any time, while
   [22] considers only the case when an application re-establishes
   connections.

4.  Definitions

   This section defines terms useful in discussing the failure detection
   problem space.

4.1.  Available Addresses

   SHIM6 nodes need to be aware of what addresses they themselves have.
   If a node loses the address it is currently using for communications,
   another address must replace this address.  And if a node loses an
   address that the node's peer knows about, the peer must be informed.
   Similarly, when a node acquires a new address it may generally wish
   the peer to know about it.

   Definition.  Available address.  An address is said to be available
   if the following conditions are fulfilled:

   o  The address has been assigned to an interface of the node.

   o  If the address is an IPv6 address, we additionally require that
      (a) the address is valid in the sense of RFC 2461 [2], and that
      (b) the address is not tentative in the sense of RFC 2462 [3].  In
      other words, the address assignment is complete so that
      communications can be started.

      Note that this explicitly allows an address to be optimistic in
      the sense of [8], even though implementations are probably better
      off using other addresses as long as there is an alternative.

   o  The address is a global unicast address, a unique local address
      [9], or an unambiguous IPv6 link-local or IPv4 RFC 1918 address.
      That is, it is not an IPv6 site-local address.  Where IPv6
      link-local or RFC 1918 addresses are used, their use needs to be
      unambiguous.
      The precise meaning of ambiguous has not been defined yet, but
      one approach is to require that at most one link-local address be
      used per node within the same connection between two peers.

         Note: Given the RFC 3484 [5] rules for preferring the smallest
         scope, it is likely that many IPv6 flows will at least start
         with link-local addresses.

   o  The address and interface are acceptable for use according to a
      local policy.

   Available addresses are discovered and monitored through mechanisms
   outside the scope of SHIM6 (and HIP or MOBIKE).  These mechanisms
   include IPv6 Neighbor Discovery and Address Autoconfiguration [2]
   [3], DHCP [4], enhanced network detection mechanisms designed by the
   DNA working group, and corresponding IPv4 mechanisms, such as [6].

4.2.  Locally Operational Addresses

   Two different granularity levels are needed for failure detection.
   The coarser granularity is for individual addresses:

   Definition.  Locally Operational Address.  An available address is
   said to be locally operational when its use is known to be possible
   locally: the interface is up, a relevant default router (if
   applicable) is known to be reachable, and no other local information
   points to the address being unusable.

   Locally operational addresses are discovered and monitored through
   mechanisms outside SHIM6 (and HIP or MOBIKE).  These mechanisms
   include IPv6 Neighbor Discovery [2], corresponding IPv4 mechanisms,
   and link layer specific mechanisms.

   It is also possible for hosts to learn about routing failures for a
   particular selected source prefix.  Protocols for distributing this
   information are being designed [19] [22], and further development of
   such protocols would be possible.  Potential approaches include
   overloading information in current IPv6 Router Advertisements or
   adding some new information to them.
   Similarly, hosts could learn information from servers that query the
   BGP routing tables.

4.3.  Operational Address Pairs

   The existence of locally operational addresses is not, however, a
   guarantee that communications can be established with the peer.  A
   failure in the routing infrastructure can prevent the sent packets
   from reaching their destination.  For this reason we need the
   definition of a second level of granularity, for pairs of addresses:

   Definition.  Bidirectionally operational address pair.  A pair of
   locally operational addresses is said to be a bidirectionally
   operational address pair if and only if bidirectional connectivity
   can be shown between the addresses.  That is, a packet sent with one
   of the addresses in the source field and the other in the
   destination field reaches the destination, and vice versa.

   Unfortunately, there are scenarios where bidirectionally operational
   address pairs do not exist.  For instance, ingress filtering or
   network failures may result in one address pair being operational in
   one direction while another one is operational from the other
   direction.  The following definition captures this general situation:

   Definition.  Unidirectionally operational address pair.  A pair of
   locally operational addresses is said to be a unidirectionally
   operational address pair if and only if packets sent with the first
   address as the source and the second address as the destination can
   be shown to reach the destination.

   Both types of operational pairs are discovered and monitored through
   the following mechanisms:

   o  Positive feedback from upper layer protocols.  For instance, TCP
      can indicate to the IP layer that it is making progress.  This is
      similar to how IPv6 Neighbor Unreachability Detection can in some
      cases be avoided when upper layers provide information about
      bidirectional connectivity [2].
      In the case of unidirectional connectivity, the upper layer
      protocol responses come back using another address pair, but show
      that the messages sent using the first address pair have been
      received.

   o  Negative feedback from upper layer protocols.  It is conceivable
      that upper layer protocols give an indication of a problem to the
      SHIM6 layer.  For instance, TCP could indicate that there is
      either congestion or lack of connectivity in the path because it
      is not getting ACKs.

   o  Explicit reachability tests, such as keepalives or probes added
      when there is only unidirectional payload traffic [10].

   o  ICMP error messages.  Given the ease of spoofing ICMP messages,
      one should be careful not to trust these blindly, however.  Our
      suggestion is to use ICMP error messages only as a hint to perform
      an explicit reachability test, but not as a reason to disrupt
      ongoing communications without other indications of problems.  The
      situation may be different when certain verifications of the ICMP
      messages are being performed [21].  These verifications can ensure
      that (practically) only on-path attackers can spoof the messages.
      Such verifications are not possible for all transport protocols,
      however.

   Note that some protocols, such as HIP [14] and MOBIKE [16], perform a
   return routability test of an address before it is taken into use.
   The purpose of this test is to ensure that fraudulent peers do not
   trick others into redirecting traffic streams onto innocent victims
   [26].  Such tests can at the same time work as a means to ensure that
   an address pair is operational.  Note, however, that some advanced
   optimizations attempt to postpone the reachability tests so that they
   do not increase movement-related latency [25].

4.4.  Primary Address Pair

   Unlike SCTP, which has a specific congestion avoidance design
   suitable for multihoming, IP-layer solutions need to avoid sending
   packets concurrently over multiple paths; TCP behaves rather poorly
   in such circumstances.  For this reason it is necessary to choose a
   particular pair of addresses as the primary address pair, which is
   used until problems occur, at least for the same session.

   A primary address pair need not be operational at all times.  If
   there is no traffic to send, we may not know if the primary address
   pair is operational.  Nevertheless, it makes sense to assume that the
   address pair that worked some time ago continues to work for new
   communications as well.

4.5.  Miscellaneous

   Addresses can become deprecated [2].  When other operational
   addresses exist, nodes generally wish to move their communications
   away from the deprecated addresses.

   Similarly, IPv6 source address selection [5] may guide the selection
   of a particular source address - destination address pair.

5.  Architectural Considerations

   Architecturally, a number of questions arise.  One simple question
   is whether there needs to be communication between a multihoming
   solution residing at the IP layer and upper layer protocols.  Upon
   changing to a new address pair, the transport layer protocol SHOULD
   be notified so that it can perform a slow start, or some other form
   of adaptation to the possibly changed conditions.  This is
   necessary, for instance, when switching from a high-bandwidth LAN
   interface to a low-bandwidth cellular interface.  (Note that this
   notification cannot be done in protocol designs where the end points
   are not the final hosts, such as where a gateway is used.)

   A more fundamental question is which protocols should be responsible
   for which parts of the problem.
   It seems clear that no multihoming solution should take on the task
   of lower layers and other IP functions for discovering its own
   addresses or testing local connectivity.  Protocols such as DHCP or
   Neighbor and Router Discovery do this already.

   But it is less clear which protocol(s) should discover end-to-end
   connectivity problems or recover from them.  One answer is that this
   is clearly within the domain of the multihoming protocol.  When the
   multihoming protocol performs testing and failure detection of the
   used path and switches to a new path if necessary, the transport and
   application protocols can work unchanged.

   On the other hand, one could argue that transport and application
   protocols would have more knowledge about the situation, and have a
   better ability to decide when a move is required.  For instance, they
   know what the required throughput and congestion status are.  Also,
   it would be unfortunate if both the IP layer and the
   transport/application layer took action for the same problem, for
   instance by switching to a new address at the IP layer and
   throttling back due to "congestion" at the transport layer.

   One can also envision that applications would be able to tell the IP
   or transport layer that the current connection is unsatisfactory and
   an exploration for a better one would be desirable.  This would
   require an API to be developed, however.

   Generally speaking, we can divide information that a host has into
   three categories: local information from "lower layers" such as IPv6
   Neighbor Discovery; transit and congestion condition information
   from either the multihoming protocol itself or from transport layer
   protocols and (where available) ECN; and application layer policies
   that dictate what the requirements are for acceptable connections.

   The division of work is largely left as an open issue as far as this
   document is concerned, but our description works from the point of
   view of a multihoming protocol at the IP layer.  We also note that in
   the CELP proposal [20], IP, transport, and application layer
   entities could all share their connectivity status in a common
   information pool.  This may also be a useful approach.

   Finally, the last architectural question is about the difference
   between mobility and multihoming.  Given our definitions above,
   there is no fundamental difference with respect to how the
   multihoming/mobility protocol learns the addresses it has available.
   However, a practical difference is that in a multihoming scenario
   there are alternative addresses, whereas in mobility, changes to a
   new address are forced because the old address is no longer
   available.  Interestingly, with the exception of MOBIKE, existing
   mobility protocols do not employ any failure detection mechanisms of
   their own, and rely solely on link layer and neighbor discovery
   mechanisms.

6.  Solution

   We need to keep track of the host's own available addresses,
   operational addresses, and operational address pairs, and to explore
   for other operational pairs when a failure occurs.  We will first
   describe two general state machines that illustrate the overall
   process, and then discuss the details of the reachability tests
   needed for ensuring operational status, and the exploration protocol.

6.1.  State Machines

   Addresses can be in the AVAILABLE and OPERATIONAL states.  The state
   transitions relating to this are shown in Figure 1.

                      +--------------+
    Address becomes   |              |
    available         |              |
    ----------------->|              |
                      |  AVAILABLE   |
    <-----------------|              |
    Address is no     |              |
    longer available  |              |
                      +--------------+
                          |     / \
               Address    |      |   Address
               becomes    |      |   is no longer
           operational    |      |   operational
                          |      |
                         \ /     |
                      +--------------+
                      |              |
    Address is no     |              |
    longer available  |              |
    <-----------------| OPERATIONAL  |
                      |              |
                      |              |
                      |              |
                      +--------------+

                   Figure 1.  Address state machine.

   When an address becomes operational, it SHOULD be reported as a new
   address to the peer.  Similarly, when an address is no longer
   operational or available, the peer SHOULD be informed.

   In addition, a particular address can be either preferred or
   deprecated.  This is not shown in the state machine.

   Another state machine describes address pair selection.  A node runs
   the address pair selection state machine to choose the currently used
   primary address pair, the one which is used for sending outgoing
   packets.  A node runs one of these state machines towards each
   different peer, tracking the known address pairs and their status.
   Each peer also has its own state machine for talking back to the
   node.  There is no guarantee that the same address pairs (in reverse
   order) have the same state; for instance, the lack of a
   bidirectionally operational pair would result in a different state
   on each side.

   The state machine can be in the NO PRIMARY, TESTING PRIMARY, and
   PRIMARY OPERATIONAL states.  The chosen address pair is known to be
   operational in the PRIMARY OPERATIONAL state, and is either
   unverified or non-operational in the other states.
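
   As an illustration only, the address state machine of Figure 1 and
   the reporting rule above can be sketched in Python as follows.  This
   is a minimal sketch, not part of the design: the UNKNOWN state (an
   address that is not, or no longer, available), the event names, and
   the "add"/"delete" notification labels are assumptions introduced
   for the example.  The sketch also assumes that the peer is told
   about an address only while it is operational.

```python
from enum import Enum


class AddressState(Enum):
    UNKNOWN = "unknown"          # assumed label: not (or no longer) available
    AVAILABLE = "available"
    OPERATIONAL = "operational"


# Transitions read off Figure 1. The event names are illustrative.
TRANSITIONS = {
    (AddressState.UNKNOWN, "becomes_available"): AddressState.AVAILABLE,
    (AddressState.AVAILABLE, "no_longer_available"): AddressState.UNKNOWN,
    (AddressState.AVAILABLE, "becomes_operational"): AddressState.OPERATIONAL,
    (AddressState.OPERATIONAL, "no_longer_operational"): AddressState.AVAILABLE,
    (AddressState.OPERATIONAL, "no_longer_available"): AddressState.UNKNOWN,
}


def step(state, event):
    """Apply one event; return (new_state, notification for the peer or None)."""
    new_state = TRANSITIONS.get((state, event))
    if new_state is None:
        raise ValueError(f"event {event!r} is not valid in state {state.name}")
    if new_state is AddressState.OPERATIONAL:
        notification = "add"      # report the address as new to the peer
    elif state is AddressState.OPERATIONAL:
        notification = "delete"   # peer learns the address is no longer usable
    else:
        notification = None       # the peer never knew about this address
    return new_state, notification


state, note = step(AddressState.UNKNOWN, "becomes_available")
state, note = step(state, "becomes_operational")   # note is now "add"
```

   Keeping the transitions in a table makes it easy to check them
   against the figure, and any event that the figure does not allow in
   a given state is rejected rather than silently ignored.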

   Figure 2 shows the pair selection state machine:

                          +----------------+
                          |                |
                          |                |
                          |                |
                          |                |
                          |       NO       |
                          |    PRIMARY     |
                          |                |
                    +-----|                |<---------------+
                    |     |                |                |
                    |     +----------------+                |
                    |          / \   / \                    |
              Add   |           |     |                     |
              pair: |  Delete   |     | Test      Delete    |
              Send  |  pair &   |     | fail &    pair &    |
              test  |  Last     |     | Last      Last      |
                    |           |     |                     |
                    |     +----------------+                |
                    |     |                |                |
                    +---->|                |<----+          |
                          |                |     | Test     |
   Connect: Send test     |                |     | fail &   |
    --------------------->|    TESTING     |     | !Last    |
                          |    PRIMARY     |+----+          |
           +------------->|                |                |
           |              |                |<----+          |
           |        +---->|                |     |          |
           |        |     +----------------+     |          |
    Policy |  ICMP  |          |     |           |          |
    change | Timer: |   ULP    |     | Test      | Delete   |
           |  Send  | feedback:|     | OK:       | pair &   |
           |  test  |   Reset  |     | Reset     | !Last    |
           |        |   timer  |     | timer     |          |
           |        |         \ /   \ /          |          |
           |        |     +----------------+     |          |
           |        +-----|                |     |          |
           |              |                |-----+          |
           +--------------|                |                |
                          |                |                |
                    +-----|  OPERATIONAL   |                |
     ULP feedback:  |     |    PRIMARY     |                |
     Reset timer    |     |                |----------------+
                    +---->|                |
                          |                |
                          +----------------+

                Figure 2.  Pair selection state machine.

   The notation used in Figure 2 is explained below:

   Connect

      An event representing the desire of the application to send a
      packet to a new peer, or an indication from a peer wishing to
      connect to us.

   Test OK

      An event representing a successful completion of the reachability
      test.

   Test fail

      An event representing failure to complete the reachability test.

   ULP feedback

      An event representing positive indication from an upper layer
      protocol that the packets we have sent to the peer are getting
      through.

   ICMP

      An event representing the reception of an ICMP error message.

   Timer

      An event representing timer elapsing.

   Add pair

      An event representing the addition of a new possible address pair,
      either through learning a new local address or being told of a new
      remote address.
Note that this does not usually result in any 600 immediate action, unless we are currently lacking an operational 601 primary pair. 603 Delete pair 605 An event representing the deletion of the currently chosen primary 606 address pair, or learning that one of the addresses in the pair 607 is no longer operational. 609 Policy change 611 An event representing the desire of the local or remote end to 612 change to a different address pair, despite the current one being 613 operational. This can be due to the availability of a 614 higher-bandwidth connection, cost, or other issues. 616 Last 618 A condition that tells whether or not the currently chosen primary 619 pair is the only known address pair. 621 Send test 623 An action to initiate the reachability test for a particular pair. 624 This test is typically embedded in the SHIM6 connection setup 625 exchange when run initially, and is run as a separate exchange later. 627 Note that due to potentially asymmetric connectivity, both sides 628 have to perform their own tests, and make their own primary pair 629 selections. 631 Reset timer 633 An action to reset a timer so that it will send an event after a 634 specified time. 636 The state machines also assume an underlying multihoming signaling 637 capability, consisting of the following abstract message exchanges: 639 Open 641 Establishes a connection between the peers. May also exchange 642 locator sets and test reachability at the same time. 644 Test 646 Verifies reachability using a specific address pair. 648 Add 650 Informs the peer about new locators. 652 Delete 654 Informs the peer about losing some locators. 656 Note that the above state machine leaves open how specific address 657 pairs are chosen or how the tests are actually performed. These 658 issues are discussed in the following sections. We have also, on 659 purpose, decided to avoid attaching functional labels such as 660 "backup" to other address pairs beyond the primary pair.
It is our 661 belief that a general design does not need these labels. 663 6.2. Failure Detection 665 This process consists of three tasks: 667 o Tracking local information from lower and upper layers. For 668 instance, when the link layer reports that we have no connection, 669 we know there is a failure. 671 o Performing a reachability process as described in [10] for 672 ensuring that there is reachability when the local information 673 says there should be. 675 o Following commands from the peer regarding the availability of 676 addresses. 678 6.3. Alternative Locator Pair Exploration 680 6.3.1. Exploration Order 682 The pair selection state machine assumes an ability to pick primary 683 and alternative address pairs. 685 This process results in a combinatorial explosion when there are many 686 addresses on both sides. Do both sides track all possible 687 combinations of addresses? If a failure occurs, shall all 688 combinations be tested before giving up? Are such tests performed in 689 parallel or in sequence, and what kind of backoff procedures should 690 be applied? 692 Our suggestion is that nodes MUST first consult RFC 3484 [5] Section 693 4 rules to determine which combinations of addresses are legal from a 694 local point of view, as this reduces the search space. RFC 3484 also 695 provides a priority ordering among different address pairs, which 696 may make the search faster. Nodes SHOULD also use local information, 697 such as known quality of service parameters or interface types, to 698 determine which addresses are preferred over others, and try pairs 699 containing such addresses first. In some cases we can also learn the 700 peer's preferences through the multihoming protocol. 702 Discussion note 1: It may also be possible to simulate preferences 703 by choosing to not tell the peer about some (non-preferred) 704 addresses. 706 Discussion note 2: The preferences may either be learned 707 dynamically or be configured.
It is believed, however, that 708 dynamic learning based purely on the SHIM6 protocol is too hard 709 and not a task this layer should perform. Solutions where multiple 710 protocols share their information in a common pool of locators 711 could, however, provide this information from transport protocols 712 [20]. 714 The reception of packets from the peer with a given address pair is a 715 good hint that the address pair works, particularly when these 716 packets are authenticated multihoming protocol packets. However, the 717 reception of these packets alone is an insufficient reason to switch 718 to a new address, as in a unidirectional connectivity case the 719 return path may not work. 721 One suggested implementation strategy is to record the 722 reachability test result (an on/off value) and weight this by the 723 age of the information. This allows recently tested address pairs to 724 be chosen before old ones. 726 Out of the set of possible candidate address pairs, nodes SHOULD 727 attempt a test through all of them, but MUST do this sequentially and 728 using an exponential back-off procedure. 730 This sequential process is necessary in order to avoid a "signaling 731 storm" when an outage occurs (particularly for a complete site). 732 However, it also limits the number of addresses that can in practice 733 be used for multihoming, considering that transport and application 734 layer protocols will fail if the switch to a new address pair takes 735 too long. For instance, we can assume that an initial timeout value 736 is 0.1 seconds and there are four addresses on both sides. Going 737 through all sixteen address pairs and doubling the timeout value at 738 every trial would take roughly 6550 seconds in total, with the final trial alone waiting about 3277 seconds!
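The cost of sequential probing with exponential back-off can be estimated with a short calculation. The following Python sketch is illustrative only: the function name and parameters are hypothetical and not part of SHIM6, and it assumes that every trial waits out its full timeout before the next pair is tried.

```python
def total_probe_time(initial_timeout, pair_count, factor=2.0):
    """Return (total wait, last timeout) for sequential probing.

    Assumption: each of the pair_count trials fails and waits for its
    full timeout, and the timeout is multiplied by factor after each trial.
    """
    timeout = initial_timeout
    total = 0.0
    last = 0.0
    for _ in range(pair_count):
        total += timeout   # wait out this trial
        last = timeout     # remember the timeout used for this trial
        timeout *= factor  # back off before the next pair
    return total, last


if __name__ == "__main__":
    total, last = total_probe_time(0.1, 4 * 4)
    print(f"total wait: {total:.1f} s, final trial: {last:.1f} s")
```

With a 0.1 second initial timeout and sixteen pairs, the sum is 0.1 * (2^16 - 1), or about 6553.5 seconds, and the sixteenth trial alone waits 0.1 * 2^15, or about 3276.8 seconds. A smaller back-off factor or a cap on the timeout would shorten this, at the price of more signaling during an outage.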
740 Finally, as has been noted in the context of MOBIKE, the existence of 741 NATs can require that peers continuously monitor the operational 742 status of address pairs, as otherwise NAT state related to a 743 particular communication is lost, and the peer on the outer side of 744 the NAT can no longer reach the peer inside the NAT. 746 6.3.2. Exploration Protocol 748 The exploration for a working address pair is not easy, as 749 unidirectional reachability needs to be considered. This is because 750 the test of a single pair may not yield working paths for 751 both the request and response packets. The following protocol could 752 be used to avoid this problem: 754 Peer A Peer B 755 | | 756 | Poll 1 (src=A1, dst=B1) | 757 |-------------------------------------------->| 758 | | 759 | Poll 2 (src=B1, dst=A1) OK: 1 | 760 | X------------------------------------| 761 | | 762 | Poll 3 (src=A2, dst=B1) | 763 |------------------------------X | 764 | | 765 | Poll 4 (src=B2, dst=A1) OK: 1 | 766 |<--------------------------------------------| 767 | | 768 | Poll 5 (src=A1, dst=B1) OK: 4 | 769 |-------------------------------------------->| 770 | | 772 When B receives the first Poll message, it records that it has 773 received it. The Poll message from B, however, is lost, so A tries 774 again with another pair. This is lost too, but B continues its own 775 testing process by sending its second Poll message, which is received 776 by A. The messages carry identifiers and a list of the identifiers of 777 messages that the sender had itself successfully received 778 earlier. 780 At the end of the example, A and B know that they have a working 781 path from A to B using (A1, B1) and from B to A using (B2, A1).
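The example exchange can be replayed in a few lines of Python. This is a minimal sketch, not a proposed message format: the Poll and Peer classes, their field names, and the run_example helper are all hypothetical, and lost messages are modelled simply by never delivering them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (source address, destination address)


@dataclass(frozen=True)
class Poll:
    ident: int              # identifier of this Poll
    src: str
    dst: str
    seen: Tuple[int, ...]   # identifiers of Polls the sender has received


@dataclass
class Peer:
    sent: Dict[int, Poll] = field(default_factory=dict)
    received: List[int] = field(default_factory=list)
    outbound: Optional[Pair] = None  # pair confirmed for our own sending
    inbound: Optional[Pair] = None   # pair on which we first heard the peer

    def send(self, ident: int, src: str, dst: str) -> Poll:
        # Every Poll reports the identifiers received so far ("OK: n").
        poll = Poll(ident, src, dst, tuple(self.received))
        self.sent[ident] = poll
        return poll

    def receive(self, poll: Poll) -> None:
        self.received.append(poll.ident)
        if self.inbound is None:
            self.inbound = (poll.src, poll.dst)
        # If the peer reports one of our Polls, that pair works outbound.
        for ident in poll.seen:
            if self.outbound is None and ident in self.sent:
                mine = self.sent[ident]
                self.outbound = (mine.src, mine.dst)


def run_example() -> Tuple[Peer, Peer]:
    a, b = Peer(), Peer()
    b.receive(a.send(1, "A1", "B1"))  # Poll 1 arrives at B
    b.send(2, "B1", "A1")             # Poll 2 is lost
    a.send(3, "A2", "B1")             # Poll 3 is lost
    a.receive(b.send(4, "B2", "A1"))  # Poll 4 arrives at A, reporting 1
    b.receive(a.send(5, "A1", "B1"))  # Poll 5 arrives at B, reporting 4
    return a, b
```

After run_example(), peer A holds (A1, B1) as its confirmed outbound pair and peer B holds (B2, A1), which matches the outcome described for the example. The sketch deliberately leaves out timers, retransmission, and authentication.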
783 More generally, when A decides that it needs to test for 784 connectivity, it will send a series of Poll messages, in sequence, 785 until it gets a Poll message from B indicating that (a) B has 786 received one of A's Poll messages and, obviously, (b) that B's Poll 787 message is getting through. B uses the same algorithm, but starts 788 the process from the reception of the first Poll message from A. 790 Note that this protocol can be implemented in different ways. One 791 approach is to rely on data packets, such as TCP payload packets and 792 acknowledgements. This method has the benefit that it likely passes 793 easily through firewalls and other middleboxes. One exception 794 is stateful firewalls that wish to know what happened "earlier" 795 in the connection, but it seems that such firewalls are fundamentally 796 incompatible with multihoming anyway. One drawback of this method 797 is, however, that the number of available payload packets may not 798 match the need in a situation where many address pairs need to be 799 explored. 801 Another approach is to have a completely separate protocol for the 802 exploration. This would need to be explicitly allowed in firewalls 803 before it could be used. On the other hand, it would then be very 804 clear to firewall administrators what they are letting through. 806 7. Security Considerations 808 Attackers may spoof various indications from lower layers and the 809 network in an effort to confuse the peers about which addresses are 810 or are not working. For example, attackers may spoof ICMP error 811 messages in an effort to cause the parties to move their traffic 812 elsewhere or even to disconnect. Attackers may also spoof 813 information related to network attachments, router discovery, and 814 address assignments in an effort to make the parties believe they 815 have Internet connectivity when in reality they do not.
817 This may cause use of non-preferred addresses or even denial-of- 818 service. 820 SHIM6 does not provide any protection of its own for indications from 821 other parts of the protocol stack. However, MOBIKE is resistant to 822 incorrect information from these sources in the sense that it 823 provides its own security for both the signaling of addressing 824 information as well as actual payload data transmission. Denial-of- 825 service vulnerabilities remain, however. Some aspects of these 826 vulnerabilities can be mitigated through the use of techniques 827 specific to the other parts of the stack, such as properly dealing 828 with ICMP errors [21], link layer security, or the use of [13] to 829 protect IPv6 Router and Neighbor Discovery. 831 8. References 833 8.1. Normative References 835 [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement 836 Levels", BCP 14, RFC 2119, March 1997. 838 [2] Narten, T., Nordmark, E., and W. Simpson, "Neighbor Discovery 839 for IP Version 6 (IPv6)", RFC 2461, December 1998. 841 [3] Thomson, S. and T. Narten, "IPv6 Stateless Address 842 Autoconfiguration", RFC 2462, December 1998. 844 [4] Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., and M. 845 Carney, "Dynamic Host Configuration Protocol for IPv6 846 (DHCPv6)", RFC 3315, July 2003. 848 [5] Draves, R., "Default Address Selection for Internet Protocol 849 version 6 (IPv6)", RFC 3484, February 2003. 851 [6] Aboba, B., "Detection of Network Attachment (DNA) in IPv4", 852 draft-ietf-dhc-dna-ipv4-08 (work in progress), July 2004. 854 [7] Choi, J., "Detecting Network Attachment in IPv6 Goals", 855 draft-ietf-dna-goals-00 (work in progress), June 2004. 857 [8] Moore, N., "Optimistic Duplicate Address Detection for IPv6", 858 draft-ietf-ipv6-optimistic-dad-01 (work in progress), 859 June 2004. 861 [9] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast 862 Addresses", draft-ietf-ipv6-unique-local-addr-05 (work in 863 progress), June 2004. 
865 [10] Beijnum, I., "Shim6 Reachability Detection", 866 draft-ietf-shim6-reach-detect-00 (work in progress), July 2005. 868 8.2. Informative References 870 [11] Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer, 871 H., Taylor, T., Rytina, I., Kalla, M., Zhang, L., and V. 872 Paxson, "Stream Control Transmission Protocol", RFC 2960, 873 October 2000. 875 [12] Rosenberg, J., Weinberger, J., Huitema, C., and R. Mahy, "STUN 876 - Simple Traversal of User Datagram Protocol (UDP) Through 877 Network Address Translators (NATs)", RFC 3489, March 2003. 879 [13] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure 880 Neighbor Discovery (SEND)", RFC 3971, March 2005. 882 [14] Nikander, P., "End-Host Mobility and Multi-Homing with Host 883 Identity Protocol", draft-ietf-hip-mm-00 (work in progress), 884 October 2004. 886 [15] Kivinen, T., "Design of the MOBIKE protocol", 887 draft-ietf-mobike-design-00 (work in progress), June 2004. 889 [16] Eronen, P., "IKEv2 Mobility and Multihoming Protocol (MOBIKE)", 890 draft-ietf-mobike-protocol-03 (work in progress), 891 September 2005. 893 [17] Rosenberg, J., "Interactive Connectivity Establishment (ICE): A 894 Methodology for Network Address Translator (NAT) Traversal for 895 Multimedia Session Establishment Protocols", 896 draft-ietf-mmusic-ice-02 (work in progress), July 2004. 898 [18] Stewart, R., "Stream Control Transmission Protocol (SCTP) 899 Dynamic Address Reconfiguration", 900 draft-ietf-tsvwg-addip-sctp-10 (work in progress), 901 January 2005. 903 [19] Bagnulo, M., "Address selection in multihomed environments", 904 draft-bagnulo-shim6-addr-selection-00 (work in progress), 905 October 2005. 907 [20] Crocker, D., "Framework for Common Endpoint Locator Pools", 908 draft-crocker-celp-00 (work in progress), February 2004. 910 [21] Gont, F., "ICMP attacks against TCP", 911 draft-gont-tcpm-icmp-attacks-00 (work in progress), 912 August 2004. 
914 [22] Huitema, C., "Address selection in multihomed environments", 915 draft-huitema-multi6-addr-selection-00 (work in progress), 916 October 2004. 918 [23] Nordmark, E., "Level 3 multihoming shim protocol", 919 draft-ietf-shim6-proto-00 (work in progress), October 2005. 921 [24] Rosenberg, J., "Traversal Using Relay NAT (TURN)", 922 draft-rosenberg-midcom-turn-05 (work in progress), July 2004. 924 [25] Vogt, C., Arkko, J., Bless, R., Doll, M., and T. Kuefner, 925 "Credit-Based Authorization for Mobile IPv6 Early Binding 926 Updates", draft-vogt-mipv6-credit-based-authorization-00 (work 927 in progress), May 2004. 929 [26] Aura, T., Roe, M., and J. Arkko, "Security of Internet Location 930 Management", In Proceedings of the 18th Annual Computer 931 Security Applications Conference, Las Vegas, Nevada, USA., 932 December 2002. 934 Appendix A. Contributors 936 This draft attempts to summarize the thoughts and unpublished 937 contributions of many people, including the MULTI6 WG design team 938 members Marcelo Bagnulo Braun, Iljitsch van Beijnum, Erik Nordmark, 939 Geoff Huston, Margaret Wasserman, and Jukka Ylitalo, the MOBIKE WG 940 contributors Pasi Eronen, Tero Kivinen, Francis Dupont, Spencer 941 Dawkins, and James Kempf, and my colleague Pekka Nikander at 942 Ericsson. This draft is also in debt to work done in the context of 943 SCTP [11]. 945 The protocol design in Section 6.3.2 is due to Erik, Marcelo, and 946 Iljitsch. 948 Appendix B. Acknowledgements 950 The author would also like to thank Christian Huitema, Pekka Savola, 951 and Hannes Tschofenig for interesting discussions in this problem 952 space, and for their comments on earlier versions of this draft. 
954 Author's Address 956 Jari Arkko 957 Ericsson 958 Jorvas 02420 959 Finland 961 Email: jari.arkko@ericsson.com 963 Intellectual Property Statement 965 The IETF takes no position regarding the validity or scope of any 966 Intellectual Property Rights or other rights that might be claimed to 967 pertain to the implementation or use of the technology described in 968 this document or the extent to which any license under such rights 969 might or might not be available; nor does it represent that it has 970 made any independent effort to identify any such rights. Information 971 on the procedures with respect to rights in RFC documents can be 972 found in BCP 78 and BCP 79. 974 Copies of IPR disclosures made to the IETF Secretariat and any 975 assurances of licenses to be made available, or the result of an 976 attempt made to obtain a general license or permission for the use of 977 such proprietary rights by implementers or users of this 978 specification can be obtained from the IETF on-line IPR repository at 979 http://www.ietf.org/ipr. 981 The IETF invites any interested party to bring to its attention any 982 copyrights, patents or patent applications, or other proprietary 983 rights that may cover technology that may be required to implement 984 this standard. Please address the information to the IETF at 985 ietf-ipr@ietf.org. 987 Disclaimer of Validity 989 This document and the information contained herein are provided on an 990 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 991 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 992 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 993 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 994 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 995 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 997 Copyright Statement 999 Copyright (C) The Internet Society (2005). 
This document is subject 1000 to the rights, licenses and restrictions contained in BCP 78, and 1001 except as set forth therein, the authors retain all their rights. 1003 Acknowledgment 1005 Funding for the RFC Editor function is currently provided by the 1006 Internet Society.