Network Working Group                                           J. Arkko
Internet-Draft                                                  Ericsson
Expires: July 2, 2005                                       January 2005


           Failure Detection and Locator Selection in Multi6
                 draft-ietf-multi6-failure-detection-00

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of section 3 of RFC 3667. By submitting this Internet-Draft, each
   author represents that any applicable patent or other IPR claims of
   which he or she is aware have been or will be disclosed, and any of
   which he or she become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 2, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This draft discusses locator pair selection and failure detection
   mechanisms for the IPv6 multihoming feature being developed in the
   Multi6 working group. Elements of this document may also be useful
   for developing the details of the MOBIKE or HIP multihoming
   mechanisms. The draft also discusses the roles of a multihoming
   protocol versus network attachment functions at IP and link layers.

Table of Contents

   1.  Introduction
   2.  Related Work
   3.  Definitions
      3.1  Available Addresses
      3.2  Locally Operational Addresses
      3.3  Operational Address Pairs
      3.4  Primary Address Pair
      3.5  Miscellaneous
   4.  Architectural Considerations
   5.  An Approach
      5.1  State Machine for Addresses
      5.2  State Machine for Address Pair Selection
      5.3  Pair Selection Algorithm
      5.4  Protocol for Testing Unidirectional Reachability
   6.  References
      6.1  Normative References
      6.2  Informative References
       Author's Address
   A.  Contributors
   B.  Acknowledgements
       Intellectual Property and Copyright Statements

1.  Introduction

   The Multi6 working group is extending IPv6 to support multihoming. A
   number of possible approaches exist in this space, but the current
   focus of the group is to look at an IP layer (or layer 3.5) mechanism
   that hides multihoming from applications. Different variants of the
   IP layer mechanism have been suggested in [23, 24, 25, 28] and other
   references.

   All these mechanisms have a common need to detect when a switch to
   another address or addresses becomes necessary. We call this failure
   detection, because the multi6 protocol works primarily as a failover
   rather than a load balancing scheme.

   This draft discusses the requirements that such a component of the
   Multi6 protocol has, and how these requirements can be met. The
   draft is structured as follows: Section 2 discusses the solutions
   used in other similar protocols. Section 3 defines a set of useful
   terms and discusses them, and Section 4 discusses the architectural
   implications of multihoming at the IP layer. Finally, Section 5
   describes one possible solution involving two state machines, a
   failure testing protocol, and an address pair selection algorithm.

   For the purposes of this draft, we consider an address to be
   synonymous with a locator. There may be other, higher level
   identifiers such as security associations, FQDNs, CGA public keys, or
   HITs that tie the different locators used by a node together.

2.  Related Work

   In SCTP [10], the addresses of the endpoints are learned in the
   connection setup phase, either by listing them explicitly or by
   giving a DNS name that points to them. In order to provide a
   failover mechanism between multihomed hosts, SCTP has the following
   functions:

   o  One of the peer's addresses is selected as the primary address by
      the application running on top of SCTP. All data packets are sent
      to this address until there is a reason to choose another address,
      such as the failure of the primary address.

   o  Testing the reachability of the peer endpoint's addresses. This
      is done either by observing the data packets sent to the peer or
      via a periodic heartbeat when there are no data packets to send.

      Each time data packet retransmission is initiated (or when a
      heartbeat is not answered within the estimated round-trip time) an
      error counter is incremented. When a configured error limit is
      reached, the particular destination address is marked as inactive.
      The reception of an acknowledgement or heartbeat response clears
      the counter.

   o  Retransmission: When retransmitting, the endpoint attempts to pick
      the most "divergent" source-destination pair from the original
      source-destination pair to which the packet was transmitted.
      Rules for such selection are, however, left as implementation
      decisions in SCTP.

   SCTP does not define how local knowledge (such as information learned
   from the link layer) should be used. SCTP also has no mechanism to
   deal with dynamic changes to the set of available addresses, although
   mechanisms for that are being developed [15].

   The MOBIKE protocol is currently being designed, and some proposals
   for the protocol exist [17, 18, 19, 22]. No official decision about
   the protocol has been made yet, but there has been a lot of
   discussion around the failure detection mechanisms in the context of
   MOBIKE, and reference [13] records some of the current thoughts of
   the WG on this issue.

   Some of the issues that have been discussed include the following:

   o  Single address vs. multiple peer addresses. A simple approach is
      to have the peers be aware of just the current address of the
      other side instead of all possible ones. Assuming that one of the
      peers will request the other to start sending to a new address,
      this works well. However, this approach is unable to deal with
      problems that affect both nodes. For instance, two nodes
      connected by two separate point-to-point links will be unable to
      switch to the other link if a failure occurs on the first one.

   o  Addresses vs. address pairs. Are tests and current paths
      individual peer addresses, or pairs of peer and own addresses
      (paths)? It seems that some failure scenarios require the use of
      a path rather than a single address. A network failure may make
      it impossible to communicate between a particular pair of
      addresses, even if those addresses have some other connectivity.

   o  Where the connectivity information comes from.
      Does it come from the local stack (such as interface up/down,
      router advertisement), from reception of ESP packets, from IKEv2
      keepalives, or through some MOBIKE-defined mechanism?

   The mobility and multihoming specification for the HIP protocol [12]
   leaves the determination of when address updates are sent to a local
   policy, but suggests the use of local information and ICMP error
   messages.

   Network attachment procedures are also relevant for multihoming. The
   IPv6 and MIP6 working groups have standardized mechanisms to learn
   about networks that a node has attached to. Basic IPv6 Neighbor
   Discovery was, however, designed primarily for static situations.
   The fully dynamic detection procedure has turned out to be a
   relatively complex procedure for mobile hosts, and it was not fully
   anticipated at the time IPv6 Neighbor Discovery or DHCP were being
   designed. As a result, enhanced or optimized mechanisms are being
   designed in the DHC and DNA working groups [6, 7].

   ICE [14], STUN [11], and TURN [26] are also related mechanisms. They
   are primarily used for NAT detection and communication through NATs
   in IPv4 environments, for applications such as voice over IP. STUN
   uses a server in the Internet to discover the presence and type of
   NATs and the client's public IP addresses and ports. TURN makes it
   possible to receive incoming connections in hosts behind NATs. ICE
   makes use of these protocols in a peer-to-peer cooperative fashion,
   allowing participants to discover, create and verify mutual
   connectivity, and then use this connectivity for multimedia streams.
   While these mechanisms are not designed for dynamic and failure
   situations, they have many of the same requirements for the
   exploration of connectivity, as well as the requirement to deal with
   middleboxes.

   Related work in the IPv6 area includes RFC 3484 [5], which defines
   source and destination address selection rules for IPv6 in situations
   where multiple candidate address pairs exist. RFC 3484 considers
   only a static situation, however, and does not take into account the
   effect of failures. Within the MULTI6 working group, [21] considers
   how applications can best re-initiate connections after failures.
   This work differs from the shim-layer approach selected for further
   development in the working group with respect to the timing of the
   address selection. In the shim-layer approach, failure detection and
   the selection of new addresses can happen at any time, while [21]
   considers only the case when an application re-establishes
   connections.

3.  Definitions

   This section defines terms useful in discussing the failure detection
   problem space.

3.1  Available Addresses

   Multi6 nodes need to be aware of what addresses they themselves have.
   If a node loses the address it is currently using for communications,
   another address must replace this address. And if a node loses an
   address that the node's peer knows about, the peer must be informed.
   Similarly, when a node acquires a new address it may generally wish
   the peer to know about it.

   Definition. Available address. An address is said to be available
   if the following conditions are fulfilled:

   o  The address has been assigned to an interface of the node.

   o  If the address is an IPv6 address, we additionally require that
      (a) the address is valid in the sense of RFC 2461 [2], and that
      (b) the address is not tentative in the sense of RFC 2462 [3]. In
      other words, the address assignment is complete so that
      communications can be started.

      Note that this explicitly allows an address to be optimistic in
      the sense of [8], even though implementations are probably better
      off using other addresses as long as there is an alternative.

   o  The address is a global unicast address, a unique local address
      [9], or an unambiguous IPv6 link-local or IPv4 RFC 1918 address.
      That is, it is not an IPv6 site-local address. Where IPv6
      link-local or RFC 1918 addresses are used, their use needs to be
      unambiguous. The precise meaning of ambiguous has not been
      defined yet, but one approach is to require that at most one
      link-local address be used per node within the same connection
      between two peers.

         Note: Given the RFC 3484 [5] rules for preferring the smallest
         scope, it is likely that many IPv6 flows will at least start
         with link-local addresses.

   o  The address and interface are acceptable for use according to a
      local policy.

   Available addresses are discovered and monitored through mechanisms
   outside the scope of MULTI6 (and HIP or MOBIKE). These mechanisms
   include IPv6 Neighbor Discovery and Address Autoconfiguration [2, 3],
   DHCP [4], enhanced network detection mechanisms being designed by the
   DNA working group, and corresponding IPv4 mechanisms, such as [6].

3.2  Locally Operational Addresses

   Two different granularity levels are needed for failure detection.
   The coarser granularity is for individual addresses:

   Definition. Locally Operational Address. An available address is
   said to be locally operational when its use is known to be possible
   locally: the interface is up and a relevant default router (if
   applicable) is known to be reachable.

   Locally operational addresses are discovered and monitored through
   mechanisms outside MULTI6 (and HIP or MOBIKE). These mechanisms
   include IPv6 Neighbor Discovery [2], corresponding IPv4 mechanisms,
   and link layer specific mechanisms.
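
   As an illustration only, the two definitions above can be read as
   predicates over a per-address record. The sketch below is not part
   of any Multi6 specification. The Address record and all of its
   field names are hypothetical, the flags are assumed to be filled in
   by the external mechanisms mentioned above (Neighbor Discovery,
   DHCP, link layer indications), and treating IPv4 link-local
   addresses like IPv6 link-local ones is an additional assumption.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# RFC 1918 private IPv4 ranges.
RFC1918 = [ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


@dataclass
class Address:
    """Hypothetical per-address record; every field name is an assumption."""
    addr: str
    assigned: bool = True           # assigned to an interface of the node
    valid: bool = True              # IPv6 only: valid in the sense of RFC 2461
    tentative: bool = False         # IPv6 only: tentative in the sense of RFC 2462
    unambiguous: bool = False       # link-local / RFC 1918 use judged unambiguous
    policy_ok: bool = True          # acceptable according to local policy
    interface_up: bool = False
    router_needed: bool = True      # models "if applicable" in the definition
    router_reachable: bool = False


def is_available(a: Address) -> bool:
    """Check the 'available address' conditions of Section 3.1."""
    ip = ip_address(a.addr)
    if not (a.assigned and a.policy_ok):
        return False
    if ip.is_multicast or ip.is_loopback or ip.is_unspecified:
        return False  # not a unicast address usable between peers
    if ip.version == 6 and (not a.valid or a.tentative or ip.is_site_local):
        return False
    # Limited-scope addresses are acceptable only when unambiguous.
    # (Assumption: IPv4 link-local is treated like IPv6 link-local.)
    limited = ip.is_link_local or (ip.version == 4 and any(ip in n for n in RFC1918))
    return a.unambiguous if limited else True


def is_locally_operational(a: Address) -> bool:
    """Check the 'locally operational' condition of Section 3.2."""
    router_ok = a.router_reachable or not a.router_needed
    return is_available(a) and a.interface_up and router_ok
```

   Note that an optimistic address in the sense of [8] passes this
   check, as the definition allows; a policy that prefers other
   addresses would be layered on top.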

   Theoretically, it is also possible for hosts to learn about routing
   failures for a particular selected source prefix, even if no protocol
   exists today to distribute this information in a convenient manner.
   The development of such protocols would be possible, however. One
   approach is overloading information in current IPv6 Router
   Advertisements (see [21]) or adding some new information in them.
   Similarly, hosts could learn information from servers that query the
   BGP routing tables [21].

3.3  Operational Address Pairs

   The existence of locally operational addresses is not, however, a
   guarantee that communications can be established with the peer. A
   failure in the routing infrastructure can prevent the sent packets
   from reaching their destination. For this reason we need the
   definition of a second level of granularity, for pairs of addresses:

   Definition. Bidirectionally operational address pair. A pair of
   locally operational addresses is said to be an operational address
   pair, iff bidirectional connectivity can be shown between the
   addresses. That is, a packet sent with one of the addresses in the
   source field and the other in the destination field reaches the
   destination, and vice versa.

   Unfortunately, there are scenarios where bidirectionally operational
   address pairs do not exist. For instance, ingress filtering or
   network failures may result in one address pair being operational in
   one direction while another one is operational from the other
   direction. The following definition captures this general situation:

   Definition. Unidirectionally operational address pair. A pair of
   locally operational addresses is said to be a unidirectionally
   operational address pair, iff packets sent with the first address as
   the source and the second address as the destination can be shown to
   reach the destination.
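
   The relationship between the two definitions can be made concrete
   with a small sketch. It assumes, purely for illustration, that a
   node keeps a set of (source, destination) tuples for which delivery
   has been shown; the function names and the address labels A1, A2,
   B1, and B2 are hypothetical.

```python
def unidirectionally_operational(delivered, src, dst):
    """True if packets from src to dst have been shown to arrive."""
    return (src, dst) in delivered


def bidirectionally_operational(delivered, a, b):
    """True if delivery has been shown in both directions between a and b."""
    return (a, b) in delivered and (b, a) in delivered


# Scenario from the text: filtering or failures leave one pair working
# in each direction, so no bidirectionally operational pair exists.
delivered = {("A1", "B1"), ("B2", "A2")}
forward_ok = unidirectionally_operational(delivered, "A1", "B1")
reverse_ok = unidirectionally_operational(delivered, "B2", "A2")
any_bidirectional = any(
    bidirectionally_operational(delivered, a, b) for (a, b) in delivered
)
```

   Here forward_ok and reverse_ok are both true while any_bidirectional
   is false, so communication is still possible provided each side
   sends on its own working pair.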

   Both types of operational pairs are discovered and monitored through
   the following mechanisms:

   o  Positive feedback from upper layer protocols. For instance, TCP
      can indicate to the IP layer that it is making progress. This is
      similar to how IPv6 Neighbor Unreachability Detection can in some
      cases be avoided when upper layers provide information about
      bidirectional connectivity [2]. In the case of unidirectional
      connectivity, the upper layer protocol responses come back using
      another address pair, but show that the messages sent using the
      first address pair have been received.

   o  Negative feedback from upper layer protocols. It is conceivable
      that upper layer protocols give an indication of a problem to the
      MULTI6 layer. For instance, TCP could indicate that there is
      either congestion or lack of connectivity in the path because it
      is not getting ACKs.

   o  Explicit reachability tests. For instance, the IKEv2 keepalive
      mechanism can be used to test that the current pair of addresses
      is operational.

   o  ICMP error messages. Given the ease of spoofing ICMP messages,
      one should be careful not to trust these blindly, however. Our
      suggestion is to use ICMP error messages only as a hint to perform
      an explicit reachability test, but not as a reason to disrupt
      ongoing communications without other indications of problems. The
      situation may be different when certain verifications of the ICMP
      messages are being performed [20]. These verifications can ensure
      that (practically) only on-path attackers can spoof the messages.
      Such verifications are not possible for all transport protocols,
      however.

   Note that some protocols, such as HIP [12], perform a return
   routability test of an address before it is taken into use. The
   purpose of this test is to ensure that fraudulent peers do not trick
   others into redirecting traffic streams onto innocent victims [29].
   Such tests can at the same time work as a means to ensure that an
   address pair is operational. Note, however, that some advanced
   optimizations attempt to postpone the reachability tests so that they
   do not increase movement-related latency [27].

3.4  Primary Address Pair

   In contrast to SCTP, which has a specific congestion avoidance design
   suitable for multihoming, IP-layer solutions need to avoid sending
   packets concurrently over multiple paths; TCP behaves rather poorly
   in such circumstances. For this reason it is necessary to choose a
   particular pair of addresses as the primary address pair, which is
   used until problems occur, at least for the same session.

   A primary address pair need not be operational at all times. If
   there is no traffic to send, we may not know if the primary address
   pair is operational. Nevertheless, it makes sense to assume that an
   address pair that worked some time ago continues to work for new
   communications as well.

3.5  Miscellaneous

   Addresses can become deprecated [2]. When other operational
   addresses exist, nodes generally wish to move their communications
   away from the deprecated addresses.

   Similarly, IPv6 source address selection [5] may guide the selection
   of a particular source address - destination address pair.

4.  Architectural Considerations

   Architecturally, a number of questions arise. One simple question
   is whether there needs to be communication between a multihoming
   solution residing at the IP layer and upper layer protocols. Upon
   changing to a new address pair, the transport layer protocol SHOULD
   be notified so that it can perform a slow start, or some other form
   of adaptation to the possibly changed conditions. This is necessary,
   for instance, when switching from a high-bandwidth LAN interface to a
   low-bandwidth cellular interface.
   (Note that this notification cannot be done in protocol designs where
   the end points are not the final hosts, such as where a gateway is
   used.)

   A more fundamental question is which protocols should be responsible
   for which parts of the problem. It seems clear that no multihoming
   solution should take on the task of lower layers and other IP
   functions for discovering its own addresses or testing local
   connectivity. Protocols such as DHCP or Neighbor and Router
   Discovery do this already.

   But it is less clear which protocol(s) should discover end-to-end
   connectivity problems or recover from them. One answer is that this
   is clearly within the domain of the multihoming protocol. If the
   multihoming protocol tests the path in use, detects failures, and
   switches to a new path when necessary, the transport and application
   protocols can work unchanged.

   On the other hand, one could argue that transport and application
   protocols would have more knowledge about the situation, and have a
   better ability to decide when a move is required. For instance, they
   know what the required throughput and congestion status are. Also,
   it would be unfortunate if both the IP layer and the
   transport/application layer took action for the same problem, for
   instance by switching to a new address at the IP layer and throttling
   back due to "congestion" at the transport layer.

   One can also envision that applications would be able to tell the IP
   or transport layer that the current connection is unsatisfactory and
   that an exploration for a better one would be desirable. This would
   require an API to be developed, however.

   Generally speaking, we can divide the information that a host has
   into three categories: local information from "lower layers" such as
   IPv6 Neighbor Discovery; transit and congestion condition information
   from either the multihoming protocol itself or from transport layer
   protocols and (where available) ECN; and application layer policies
   that dictate what the requirements are for acceptable connections.

   The division of work is largely left as an open issue as far as this
   document is concerned, but our description works from the point of
   view of a multihoming protocol at the IP layer. We also note that in
   the CELP proposal [16], IP, transport, and application layer entities
   could all share their connectivity status in a common information
   pool. This may also be a useful approach.

   Finally, the last architectural question is about the difference
   between mobility and multihoming. Given our definitions above, there
   is no fundamental difference with respect to how the
   multihoming/mobility protocol learns the addresses it has available.
   However, a practical difference is that in a multihoming scenario
   there are alternative addresses, whereas in mobility, changes to a
   new address are forced because the old address is no longer
   available.

5.  An Approach

   One suggested approach consists of a mechanism for keeping track of
   the host's own available addresses, operational addresses, and
   operational address pairs.

5.1  State Machine for Addresses

   An address can be in either the AVAILABLE or the OPERATIONAL state.
   The state transitions relating to this are shown in Figure 1.

                       +--------------+
     Address becomes   |              |
     available         |              |
     ----------------->|              |
                       |  AVAILABLE   |
     <-----------------|              |
     Address is no     |              |
     longer available  |              |
                       +--------------+
                           |     / \
              Address      |      |    Address
              becomes      |      |    is no longer
              operational  |      |    operational
                           |      |
                          \ /     |
                       +--------------+
                       |              |
     Address is no     |              |
     longer available  |              |
     <-----------------| OPERATIONAL  |
                       |              |
                       |              |
                       |              |
                       +--------------+

                   Figure 1. Address state machine.

   When an address becomes operational, it SHOULD be reported as a new
   address to the peer. Similarly, when an address is no longer
   operational or available, the peer SHOULD be informed.

   In addition, a particular address can be either preferred or
   deprecated. This is not shown in the state machine.

5.2  State Machine for Address Pair Selection

   A node runs the address pair selection state machine to choose the
   currently used primary address pair, the one which is used for
   sending outgoing packets. A node runs one of these state machines
   towards each different peer, tracking the known address pairs and
   their status. Each peer also has its own state machine for talking
   back to the node; there is no guarantee that the same address pairs
   (in reverse order) have the same state. The lack of a
   bidirectionally operational pair would result in different states on
   the two sides, for instance.

   The state machine can be in the NO PRIMARY, TESTING PRIMARY, and
   PRIMARY OPERATIONAL states. The chosen address pair is known to be
   operational in the PRIMARY OPERATIONAL state, and is either
   unverified or non-operational in the other states.

   Figure 2 shows the state machine:

                          +----------------+
                          |                |
                          |                |
                          |                |
                          |                |
                          |       NO       |
                          |    PRIMARY     |
                          |                |
                    +-----|                |<---------------+
                    |     |                |                |
                    |     +----------------+                |
                    |        / \     / \                    |
             Add    |         |       |                     |
             pair:  | Delete  |       | Test      Delete    |
             Send   | pair &  |       | fail &    pair &    |
             test   | Last    |       | Last      Last      |
                    |         |       |                     |
                    |     +----------------+                |
                    |     |                |                |
                    +---->|                |<----+          |
                          |                |     | Test     |
   Connect: Send test     |                |     | fail &   |
   ---------------------->|    TESTING     |     | !Last    |
                          |    PRIMARY     |-----+          |
           +------------->|                |                |
           |              |                |<----+          |
           |        +---->|                |     |          |
           |        |     +----------------+     |          |
   Policy  | ICMP   |           |    |           |          |
   change  | Timer: | ULP       |    | Test      | Delete   |
           | Send   | feedback: |    | OK:       | pair &   |
           | test   | Reset     |    | Reset     | !Last    |
           |        | timer     |    | timer     |          |
           |        |          \ /  \ /          |          |
           |        |     +----------------+     |          |
           |        +-----|                |     |          |
           |              |                |-----+          |
           +--------------|                |                |
                          |                |                |
                    +-----|  OPERATIONAL   |                |
     ULP feedback:  |     |    PRIMARY     |                |
     Reset timer    |     |                |----------------+
                    +---->|                |
                          |                |
                          +----------------+

                Figure 2. Pair selection state machine.

   The notation used in Figure 2 is explained below:

   Connect

      An event representing the desire of the application to send a
      packet to a new peer, or an indication from a peer wishing to
      connect to us.

   Test OK

      An event representing a successful completion of the reachability
      test.

   Test fail

      An event representing failure to complete the reachability test.

   ULP feedback

      An event representing positive indication from an upper layer
      protocol that the packets we have sent to the peer are getting
      through.

   ICMP

      An event representing the reception of an ICMP error message.

   Timer

      An event representing the timer elapsing.

   Add pair

      An event representing the addition of a new possible address pair,
      either through learning a new local address or being told of a new
      remote address.
Note that this does not usually result in any 585 immediate action, unless we are currently lacking an operational 586 primary pair. 588 Delete pair 590 An event representing the deletion of the currently chosen primary 591 address pair. 593 Policy change 595 An event representing the desire of the local or remote end to 596 change to a different address pair, despite the current one being 597 operational. This can be due to the availability of a 598 higher-bandwidth connection, cost, or other issues. 600 Last 602 A condition that tells whether or not the currently chosen primary 603 pair is the only known address pair. 605 Send test 607 An action to initiate the reachability test for a particular pair. 608 This test is typically embedded in the Multi6 connection setup 609 exchange when run initially, and is a separate exchange later. 611 Note that due to potentially asymmetric connectivity, both sides 612 have to perform their own tests, and make their own primary pair 613 selections. 615 Reset timer 617 An action to reset a timer so that it will send an event after a 618 specified time. 620 The state machine also assumes an underlying multihoming signaling 621 capability, consisting of the following abstract message exchanges: 623 Open 625 Establishes a connection between the peers. May also exchange 626 locator sets and test reachability at the same time. 628 Test 630 Verifies reachability using a specific address pair. 632 Add 634 Informs the peer about new locators. 636 Delete 638 Informs the peer about losing some locators. 640 Note that the above state machine leaves open how specific address 641 pairs are chosen, as this will be discussed in the next section. We 642 have also, on purpose, decided to avoid attaching functional labels 643 such as "backup" to other address pairs beyond the primary pair. It 644 is our belief that a general design does not need these labels.
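As an illustration only, the pair selection state machine can be sketched as a transition function. This is one reading of Figure 2, not a normative definition. The names State, Event, step, SEND_TEST and RESET_TIMER are hypothetical. Three transitions are assumptions rather than something the figure states explicitly: Connect is handled in the NO PRIMARY state, Policy change triggers a test of the newly preferred pair, and a Delete pair event in TESTING PRIMARY with other pairs remaining restarts the test on another pair.

```python
from enum import Enum, auto


class State(Enum):
    NO_PRIMARY = auto()
    TESTING_PRIMARY = auto()
    PRIMARY_OPERATIONAL = auto()


class Event(Enum):
    CONNECT = auto()
    ADD_PAIR = auto()
    DELETE_PAIR = auto()
    TEST_OK = auto()
    TEST_FAIL = auto()
    ULP_FEEDBACK = auto()
    ICMP = auto()
    TIMER = auto()
    POLICY_CHANGE = auto()


# Hypothetical action labels; a real implementation would call into the
# multihoming signaling layer instead of returning strings.
SEND_TEST = "send test"
RESET_TIMER = "reset timer"


def step(state, event, last=False):
    """Return (next_state, action) for one event.

    `last` is the "Last" condition: the current primary pair is the only
    known pair. The action is None when nothing needs to be done.
    """
    if state is State.NO_PRIMARY:
        # Assumption: Connect is treated like Add pair in this state.
        if event in (Event.CONNECT, Event.ADD_PAIR):
            return State.TESTING_PRIMARY, SEND_TEST
    elif state is State.TESTING_PRIMARY:
        if event in (Event.TEST_OK, Event.ULP_FEEDBACK):
            return State.PRIMARY_OPERATIONAL, RESET_TIMER
        if event in (Event.TEST_FAIL, Event.DELETE_PAIR):
            if last:
                return State.NO_PRIMARY, None
            # Assumption: move on to another pair and test it.
            return State.TESTING_PRIMARY, SEND_TEST
    elif state is State.PRIMARY_OPERATIONAL:
        if event is Event.ULP_FEEDBACK:
            return State.PRIMARY_OPERATIONAL, RESET_TIMER
        if event in (Event.ICMP, Event.TIMER, Event.POLICY_CHANGE):
            return State.TESTING_PRIMARY, SEND_TEST
        if event is Event.DELETE_PAIR:
            if last:
                return State.NO_PRIMARY, None
            return State.TESTING_PRIMARY, SEND_TEST
    # Any other combination, such as Add pair while a primary pair is
    # operational, causes no immediate action.
    return state, None
```

For example, step(State.NO_PRIMARY, Event.ADD_PAIR) moves to TESTING PRIMARY and requests a test, while step(State.PRIMARY_OPERATIONAL, Event.ADD_PAIR) leaves the state unchanged, matching the note that adding a pair usually causes no immediate action. Which pair is tested next is deliberately left out, since pair choice is the subject of the next section.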
646 5.3 Pair Selection Algorithm 648 The pair selection state machine assumes an ability to pick primary 649 and alternative address pairs. 651 This process results in a combinatorial explosion when there are many 652 addresses on both sides. Do both sides track all possible 653 combinations of addresses? If a failure occurs, shall all 654 combinations be tested before giving up? Are such tests performed in 655 parallel or in sequence, and what kind of backoff procedures should 656 be applied? 658 Our suggestion is that nodes MUST first consult RFC 3484 [5] Section 659 4 rules to determine what combinations of addresses are legal from a 660 local point of view, as this reduces the search space. RFC 3484 also 661 provides a priority ordering among different address pairs, making 662 the search possibly faster. Nodes SHOULD also use local information, 663 such as known quality of service parameters or interface types to 664 determine what addresses are preferred over others, and try pairs 665 containing such addresses first. In some cases we can also learn the 666 peer's preferences through the multihoming protocol [12]. 668 Discussion note 1: It may also be possible to simulate preferences 669 by choosing to not tell the peer about some (non-preferred) 670 addresses. 672 Discussion note 2: The preferences may either be learned 673 dynamically or be configured. It is believed, however, that 674 dynamic learning based purely on the MULTI6 protocol is too hard 675 and not the task this layer should do. Solutions where multiple 676 protocols share their information in a common pool of locators 677 could provide this information from transport protocols, however 678 [16]. 680 The reception of packets from the peer with a given address pair is a 681 good hint that the address pair works, particularly when these 682 packets are authenticated multihoming protocol packets. 
However, the 683 reception of these packets alone is an insufficient reason to switch 684 to a new address, as in a unidirectional connectivity case the 685 return path may not work. 687 One suggested implementation strategy is to record the 688 reachability test result (an on/off value) and multiply this by the 689 age of the information. This allows recently tested address pairs to 690 be chosen before old ones. 692 Out of the set of possible candidate address pairs, nodes SHOULD 693 attempt a test through all of them, but MUST do this sequentially 694 (based on an implementation-dependent priority order) and using an 695 exponential back-off procedure. 697 This sequential process is necessary in order to avoid a "signaling 698 storm" when an outage occurs (particularly for a complete site). 699 However, it also limits the number of addresses that can in practice 700 be used for multihoming, considering that transport and application 701 layer protocols will fail if the switch to a new address pair takes 702 too long. For instance, we can assume that an initial timeout value 703 is 0.1 seconds and there are four addresses on both sides. Going 704 through all sixteen address pairs and doubling the timeout value at 705 every trial would take roughly 6550 seconds in total (0.1 * (2^16 - 1)); the sixteenth timeout alone is about 3277 seconds! 707 Finally, as has been noted in the context of MOBIKE, the existence of 708 NATs can require that peers continuously monitor the operational 709 status of address pairs, as otherwise NAT state related to a 710 particular communication is lost, and the peer on the outer side of 711 the NAT can no longer reach the peer inside the NAT. 713 5.4 Protocol for Testing Unidirectional Reachability 715 Testing for reachability is not easy in an environment where 716 unidirectional reachability is a possibility. This is because the 717 test of a single pair may not yield working paths for both 718 the request and response packets.
The following protocol could be 719 used to avoid this problem: 721 Peer A Peer B 722 | | 723 | Poll 1 (src=A1, dst=B1) | 724 |-------------------------------------------->| 725 | | 726 | Poll 2 (src=B1, dst=A1) OK: 1 | 727 | X------------------------------------| 728 | | 729 | Poll 3 (src=A2, dst=B1) | 730 |------------------------------X | 731 | | 732 | Poll 4 (src=B2, dst=A1) OK: 1 | 733 |<--------------------------------------------| 734 | | 735 | Poll 5 (src=A1, dst=B1) OK: 4 | 736 |-------------------------------------------->| 737 | | 739 When B receives the first Poll message, it records that it has 740 received it. The Poll message from B, however, is lost, so A tries 741 again with another pair. This is lost too, but B continues its own 742 testing process by sending its second Poll message, which is received 743 by A. The messages carry identifiers, and a list of the identifiers of 744 the messages that the sender had itself successfully received 745 earlier. 747 At the end of the example, A and B know that they have a working 748 path from A to B using (A1, B1) and from B to A using (B2, A1). 750 More generally, when A decides that it needs to test for 751 connectivity, it will initiate a set of Poll messages, in sequence, 752 until it gets a Poll message from B indicating that (a) B has 753 received one of A's Poll messages and, obviously, (b) that B's Poll 754 message is getting through. B uses the same algorithm, but starts 755 the process from the reception of the first Poll message from A. 757 Note that this protocol can be implemented in different ways. One 758 approach is to rely on data packets, such as TCP payload packets and 759 acknowledgements. This method has the benefit that it likely passes 760 easily through firewalls and other middleboxes.
One exception to 761 this is stateful firewalls that wish to know what happened "earlier" 762 in the connection, but it seems that such firewalls are fundamentally 763 incompatible with multi-homing anyway. One drawback of this method 764 is, however, that the number of available payload packets may not 765 match the need in a situation where a lot of address pairs need to be 766 explored. 768 Another approach is to have a completely separate protocol for the 769 exploration. This would need to be explicitly allowed in firewalls 770 before it could be used. On the other hand, it would then be very 771 clear to the firewall administrators what they are letting through. 773 6. References 775 6.1 Normative References 777 [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement 778 Levels", BCP 14, RFC 2119, March 1997. 780 [2] Narten, T., Nordmark, E. and W. Simpson, "Neighbor Discovery for 781 IP Version 6 (IPv6)", RFC 2461, December 1998. 783 [3] Thomson, S. and T. Narten, "IPv6 Stateless Address 784 Autoconfiguration", RFC 2462, December 1998. 786 [4] Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C. and M. 787 Carney, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)", 788 RFC 3315, July 2003. 790 [5] Draves, R., "Default Address Selection for Internet Protocol 791 version 6 (IPv6)", RFC 3484, February 2003. 793 [6] Aboba, B., "Detection of Network Attachment (DNA) in IPv4", 794 draft-ietf-dhc-dna-ipv4-08 (work in progress), July 2004. 796 [7] Choi, J., "Detecting Network Attachment in IPv6 Goals", 797 draft-ietf-dna-goals-00 (work in progress), June 2004. 799 [8] Moore, N., "Optimistic Duplicate Address Detection for IPv6", 800 draft-ietf-ipv6-optimistic-dad-01 (work in progress), June 2004. 802 [9] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast 803 Addresses", draft-ietf-ipv6-unique-local-addr-05 (work in 804 progress), June 2004.
806 6.2 Informative References 808 [10] Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer, 809 H., Taylor, T., Rytina, I., Kalla, M., Zhang, L. and V. Paxson, 810 "Stream Control Transmission Protocol", RFC 2960, October 2000. 812 [11] Rosenberg, J., Weinberger, J., Huitema, C. and R. Mahy, "STUN - 813 Simple Traversal of User Datagram Protocol (UDP) Through 814 Network Address Translators (NATs)", RFC 3489, March 2003. 816 [12] Nikander, P., "End-Host Mobility and Multi-Homing with Host 817 Identity Protocol", draft-ietf-hip-mm-00 (work in progress), 818 October 2004. 820 [13] Kivinen, T., "Design of the MOBIKE protocol", 821 draft-ietf-mobike-design-00 (work in progress), June 2004. 823 [14] Rosenberg, J., "Interactive Connectivity Establishment (ICE): A 824 Methodology for Network Address Translator (NAT) Traversal for 825 Multimedia Session Establishment Protocols", 826 draft-ietf-mmusic-ice-02 (work in progress), July 2004. 828 [15] Stewart, R., "Stream Control Transmission Protocol (SCTP) 829 Dynamic Address Reconfiguration", 830 draft-ietf-tsvwg-addip-sctp-10 (work in progress), January 831 2005. 833 [16] Crocker, D., "Framework for Common Endpoint Locator Pools", 834 draft-crocker-celp-00 (work in progress), February 2004. 836 [17] Dupont, F., "Address Management for IKE version 2", 837 draft-dupont-ikev2-addrmgmt-05 (work in progress), June 2004. 839 [18] Eronen, P., "Mobility Protocol Options for IKEv2 (MOPO-IKE)", 840 draft-eronen-mobike-mopo-00 (work in progress), July 2004. 842 [19] Eronen, P. and H. Tschofenig, "Simple Mobility and Multihoming 843 Extensions for IKEv2 (SMOBIKE)", draft-eronen-mobike-simple-00 844 (work in progress), March 2004. 846 [20] Gont, F., "ICMP attacks against TCP", 847 draft-gont-tcpm-icmp-attacks-00 (work in progress), August 848 2004. 850 [21] Huitema, C., "Address selection in multihomed environments", 851 draft-huitema-multi6-addr-selection-00 (work in progress), 852 October 2004. 
854 [22] Kivinen, T., "MOBIKE protocol", 855 draft-kivinen-mobike-protocol-00 (work in progress), March 856 2004. 858 [23] Nordmark, E., "Multihoming without IP Identifiers", 859 draft-nordmark-multi6-noid-02 (work in progress), July 2004. 861 [24] Nordmark, E., "Multihoming using 64-bit Crypto-based IDs", 862 draft-nordmark-multi6-cb64-00 (work in progress), November 863 2003. 865 [25] Nordmark, E., "Strong Identity Multihoming using 128 bit 866 Identifiers (SIM/CBID128)", draft-nordmark-multi6-sim-01 (work 867 in progress), October 2003. 869 [26] Rosenberg, J., "Traversal Using Relay NAT (TURN)", 870 draft-rosenberg-midcom-turn-05 (work in progress), July 2004. 872 [27] Vogt, C., Arkko, J., Bless, R., Doll, M. and T. Kuefner, 873 "Credit-Based Authorization for Mobile IPv6 Early Binding 874 Updates", draft-vogt-mipv6-credit-based-authorization-00 (work 875 in progress), May 2004. 877 [28] Ylitalo, J., "Weak Identifier Multihoming Protocol (WIMP)", 878 draft-ylitalo-multi6-wimp-01 (work in progress), July 2004. 880 [29] Aura, T., Roe, M. and J. Arkko, "Security of Internet Location 881 Management", In Proceedings of the 18th Annual Computer 882 Security Applications Conference, Las Vegas, Nevada, USA., 883 December 2002. 885 Author's Address 887 Jari Arkko 888 Ericsson 889 Jorvas 02420 890 Finland 892 EMail: jari.arkko@ericsson.com 894 Appendix A. Contributors 896 This draft attempts to summarize the thoughts and unpublished 897 contributions of many people, including the MULTI6 WG design team 898 members Marcelo Bagnulo Braun, Iljitsch van Beijnum, Erik Nordmark, 899 Geoff Huston, Margaret Wasserman, and Jukka Ylitalo, the MOBIKE WG 900 contributors Pasi Eronen, Tero Kivinen, Francis Dupont, Spencer 901 Dawkins, and James Kempf, and my colleague Pekka Nikander at 902 Ericsson. This draft is also in debt to work done in the context of 903 SCTP [10]. 905 The protocol design in Section 5.4 is due to Erik, Marcelo, and 906 Iljitsch. 908 Appendix B. 
Acknowledgements 910 The author would also like to thank Christian Huitema, Pekka Savola, 911 and Hannes Tschofenig for interesting discussions in this problem 912 space, and for their comments on earlier versions of this draft. 914 Intellectual Property Statement 916 The IETF takes no position regarding the validity or scope of any 917 Intellectual Property Rights or other rights that might be claimed to 918 pertain to the implementation or use of the technology described in 919 this document or the extent to which any license under such rights 920 might or might not be available; nor does it represent that it has 921 made any independent effort to identify any such rights. Information 922 on the procedures with respect to rights in RFC documents can be 923 found in BCP 78 and BCP 79. 925 Copies of IPR disclosures made to the IETF Secretariat and any 926 assurances of licenses to be made available, or the result of an 927 attempt made to obtain a general license or permission for the use of 928 such proprietary rights by implementers or users of this 929 specification can be obtained from the IETF on-line IPR repository at 930 http://www.ietf.org/ipr. 932 The IETF invites any interested party to bring to its attention any 933 copyrights, patents or patent applications, or other proprietary 934 rights that may cover technology that may be required to implement 935 this standard. Please address the information to the IETF at 936 ietf-ipr@ietf.org. 938 Disclaimer of Validity 940 This document and the information contained herein are provided on an 941 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 942 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 943 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 944 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 945 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 946 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
948 Copyright Statement 950 Copyright (C) The Internet Society (2005). This document is subject 951 to the rights, licenses and restrictions contained in BCP 78, and 952 except as set forth therein, the authors retain all their rights. 954 Acknowledgment 956 Funding for the RFC Editor function is currently provided by the 957 Internet Society.