idnits 2.17.1 draft-ietf-shim6-failure-detection-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 14. -- Found old boilerplate from RFC 3978, Section 5.5 on line 975. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 952. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 959. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 965. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. 
(The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (January 2005) is 7041 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: '1' is defined on line 799, but no explicit reference was found in the text == Unused Reference: '19' is defined on line 865, but no explicit reference was found in the text == Unused Reference: '20' is defined on line 868, but no explicit reference was found in the text == Unused Reference: '21' is defined on line 871, but no explicit reference was found in the text == Unused Reference: '24' is defined on line 883, but no explicit reference was found in the text == Unused Reference: '25' is defined on line 887, but no explicit reference was found in the text == Unused Reference: '26' is defined on line 890, but no explicit reference was found in the text == Unused Reference: '27' is defined on line 894, but no explicit reference was found in the text == Unused Reference: '30' is defined on line 906, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2461 (ref. '2') (Obsoleted by RFC 4861) ** Obsolete normative reference: RFC 2462 (ref. '3') (Obsoleted by RFC 4862) ** Obsolete normative reference: RFC 3315 (ref. 
'4') (Obsoleted by RFC 8415) ** Obsolete normative reference: RFC 3484 (ref. '5') (Obsoleted by RFC 6724) == Outdated reference: A later version (-18) exists of draft-ietf-dhc-dna-ipv4-08 == Outdated reference: A later version (-04) exists of draft-ietf-dna-goals-00 ** Downref: Normative reference to an Informational draft: draft-ietf-dna-goals (ref. '7') == Outdated reference: A later version (-07) exists of draft-ietf-ipv6-optimistic-dad-01 == Outdated reference: A later version (-09) exists of draft-ietf-ipv6-unique-local-addr-05 -- Obsolete informational reference (is this intentional?): RFC 2960 (ref. '10') (Obsoleted by RFC 4960) -- Obsolete informational reference (is this intentional?): RFC 3489 (ref. '11') (Obsoleted by RFC 5389) == Outdated reference: A later version (-05) exists of draft-ietf-hip-mm-00 == Outdated reference: A later version (-08) exists of draft-ietf-mobike-design-00 == Outdated reference: A later version (-08) exists of draft-ietf-mobike-protocol-00 == Outdated reference: A later version (-19) exists of draft-ietf-mmusic-ice-02 == Outdated reference: A later version (-22) exists of draft-ietf-tsvwg-addip-sctp-10 == Outdated reference: A later version (-08) exists of draft-dupont-ikev2-addrmgmt-05 == Outdated reference: A later version (-02) exists of draft-eronen-mobike-mopo-00 == Outdated reference: A later version (-05) exists of draft-gont-tcpm-icmp-attacks-00 == Outdated reference: A later version (-08) exists of draft-rosenberg-midcom-turn-05 Summary: 9 errors (**), 0 flaws (~~), 25 warnings (==), 9 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group J. 
Arkko 3 Internet-Draft Ericsson 4 Expires: July 5, 2005 January 2005 6 Failure Detection and Locator Selection Design Considerations 7 draft-ietf-shim6-failure-detection-00 9 Status of this Memo 11 By submitting this Internet-Draft, each author represents that any 12 applicable patent or other IPR claims of which he or she is aware 13 have been or will be disclosed, and any of which he or she becomes 14 aware will be disclosed, in accordance with Section 6 of BCP 79. 16 Internet-Drafts are working documents of the Internet Engineering 17 Task Force (IETF), its areas, and its working groups. Note that 18 other groups may also distribute working documents as Internet- 19 Drafts. 21 Internet-Drafts are draft documents valid for a maximum of six months 22 and may be updated, replaced, or obsoleted by other documents at any 23 time. It is inappropriate to use Internet-Drafts as reference 24 material or to cite them other than as "work in progress." 26 The list of current Internet-Drafts can be accessed at 27 http://www.ietf.org/ietf/1id-abstracts.txt. 29 The list of Internet-Draft Shadow Directories can be accessed at 30 http://www.ietf.org/shadow.html. 32 This Internet-Draft will expire on July 5, 2005. 34 Copyright Notice 36 Copyright (C) The Internet Society (2005). 38 Abstract 40 This draft discusses locator pair selection and failure detection 41 mechanisms for the IPv6 multihoming feature being developed in the 42 SHIM6 working group. Elements of this document may also be useful 43 for developing the details of the MOBIKE or HIP multihoming 44 mechanisms. The draft also discusses the roles of a multihoming 45 protocol versus network attachment functions at IP and link layers. 47 Table of Contents 49 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 50 2. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 4 51 3. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 7 52 3.1 Available Addresses . . . . . . . . . . . . . . . . . 
7 53 3.2 Locally Operational Addresses . . . . . . . . . . . . 8 54 3.3 Operational Address Pairs . . . . . . . . . . . . . . 8 55 3.4 Primary Address Pair . . . . . . . . . . . . . . . . . 10 56 3.5 Miscellaneous . . . . . . . . . . . . . . . . . . . . 10 57 4. Architectural Considerations . . . . . . . . . . . . . . . . . 11 58 5. An Approach . . . . . . . . . . . . . . . . . . . . . . . . . 13 59 5.1 State Machine for Addresses . . . . . . . . . . . . . 13 60 5.2 State Machine for Address Pair Selection . . . . . . . 14 61 5.3 Pair Selection Algorithm . . . . . . . . . . . . . . . 18 62 5.4 Protocol for Testing Unidirectional Reachability . . . 19 63 6. Security Considerations . . . . . . . . . . . . . . . . . . . 22 64 7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23 65 7.1 Normative References . . . . . . . . . . . . . . . . . 23 66 7.2 Informative References . . . . . . . . . . . . . . . . 23 67 Author's Address . . . . . . . . . . . . . . . . . . . . . . . 25 68 A. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 26 69 B. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 27 70 Intellectual Property and Copyright Statements . . . . . . . . 28 72 1. Introduction 74 The SHIM6 working group is extending IPv6 to support multihoming. A 75 number of possible approaches exist in this space, but the current 76 focus of the group is to look at an IP layer (or layer 3.5) mechanism 77 that hides multihoming from applications. This mechanism needs to 78 detect when a switch to another address or addresses becomes 79 necessary. We call this failure detection, because the SHIM6 80 protocol works primarily as a failover rather than a load balancing 81 scheme. 83 This draft discusses what requirements such a component of the SHIM6 84 protocol has, and how these requirements can be achieved. The draft 85 is structured as follows: Section 2 discusses what kind of solutions 86 have been used in other similar protocols. 
Section 3 defines a set 87 of useful terms and discusses them, and Section 4 discusses the 88 architectural implications of multihoming at the IP layer. Finally, 89 Section 5 describes one possible solution involving two state 90 machines, a failure testing protocol, and an address pair selection 91 algorithm. 93 For the purposes of this draft, we consider an address to be 94 synonymous with a locator. There may be other, higher level 95 identifiers such as security associations, FQDNs, CGA public keys, or 96 HITs that tie the different locators used by a node together. 98 2. Related Work 100 In SCTP [10], the addresses of the endpoints are learned in the 101 connection setup phase either through listing them explicitly or via 102 giving a DNS name that points to them. In order to provide a 103 failover mechanism between multihomed hosts, SCTP has the following 104 functions: 106 o One of the peer's addresses is selected as the primary address by 107 the application running on top of SCTP. All data packets are sent 108 to this address until there is a reason to choose another address, 109 such as the failure of the primary address. 111 o Testing the reachability of the peer endpoint's addresses. This 112 is done either via observing the data packets sent to the peer or 113 via a periodic heartbeat when there are no data packets to send. 115 Each time data packet retransmission is initiated (or when a 116 heartbeat is not answered within the estimated round-trip time) an 117 error counter is incremented. When a configured error limit is 118 reached, the particular destination address is marked as inactive. 119 The reception of an acknowledgement or heartbeat response clears 120 the counter. 122 o Retransmission: When retransmitting, the endpoint attempts to pick the 123 most "divergent" source-destination pair from the original source- 124 destination pair to which the packet was transmitted. 
Rules for 125 such selection are, however, left as implementation decisions in 126 SCTP. 128 SCTP does not define how local knowledge (such as information learned 129 from the link layer) should be used. SCTP also has no mechanism to 130 deal with dynamic changes to the set of available addresses, although 131 mechanisms for that are being developed [17]. 133 The MOBIKE protocol is currently being designed [15] [14]. This 134 protocol operates in a mixed IPv4/IPv6 environment, and typically has 135 to work through NATs. The current design is assumed to need to work 136 only in symmetric connectivity scenarios. 138 Some of the issues that have been discussed in the MOBIKE design 139 phase include the following: 141 o Single address vs. multiple peer addresses. A simple approach is 142 to have the peers be aware of just the current address of the 143 other side instead of all possible ones. Assuming that one of the 144 peers will request the other to start sending to a new address, 145 this works well. However, this approach is unable to deal with 146 problems that affect both nodes. For instance, two nodes 147 connected by two separate point-to-point links will be unable to 148 switch to the other link if a failure occurs on the first one. 150 o Addresses vs. address pairs. Are tests and current paths 151 individual peer addresses, or pairs of peer and own addresses 152 (paths)? It seems that some failure scenarios require the use of 153 a path rather than a single address. A network failure may make 154 it impossible to communicate between a particular pair of 155 addresses, even if those addresses have some other connectivity. 157 o Where the connectivity information comes from. Does it come from 158 the local stack (such as interface up/down, router advertisement), 159 from reception of ESP packets, from IKEv2 keepalives, or through 160 some MOBIKE-defined mechanism? 
162 The mobility and multihoming specification for the HIP protocol [13] 163 leaves the determination of when address updates are sent to a local 164 policy, but suggests the use of local information and ICMP error 165 messages. 167 Network attachment procedures are also relevant for multihoming. The 168 IPv6 and MIP6 working groups have standardized mechanisms to learn 169 about networks that a node has attached to. Basic IPv6 Neighbor 170 Discovery was, however, designed primarily for static situations. 171 The fully dynamic detection procedure has turned out to be a 172 relatively complex procedure for mobile hosts, and it was not fully 173 anticipated at the time IPv6 Neighbor Discovery or DHCP were being 174 designed. As a result, enhanced or optimized mechanisms are being 175 designed in the DHC and DNA working groups [6] [7]. 177 ICE [16], STUN [11], and TURN [28] are also related mechanisms. They 178 are primarily used for NAT detection and communication through NATs 179 in IPv4 environments, for applications such as voice over IP. STUN 180 uses a server in the Internet to discover the presence and type of 181 NATs and the client's public IP addresses and ports. TURN makes it 182 possible to receive incoming connections in hosts behind NATs. ICE 183 makes use of these protocols in a peer-to-peer cooperative fashion, 184 allowing participants to discover, create and verify mutual 185 connectivity, and then use this connectivity for multimedia streams. 186 While these mechanisms are not designed for dynamic and failure 187 situations, they have many of the same requirements for the 188 exploration of connectivity, as well as the requirement to deal with 189 middleboxes. 191 Related work in the IPv6 area includes RFC 3484 [5] which defines 192 source and destination address selection rules for IPv6 in situations 193 where multiple candidate address pairs exist. 
RFC 3484 considers 194 only a static situation, however, and does not take into account the 195 effect of failures. In the MULTI6 working group, [23] considers how 196 applications can re-initiate connections after failures in the best 197 way. This work differs from the shim-layer approach selected for 198 further development in the working group with respect to the timing 199 of the address selection. In the shim-layer approach failure 200 detection and the selection of new addresses happens at any time, 201 while [23] considers only the case when an application re-establishes 202 connections. 204 3. Definitions 206 This section defines terms useful in discussing the failure detection 207 problem space. 209 3.1 Available Addresses 211 SHIM6 nodes need to be aware of what addresses they themselves have. 212 If a node loses the address it is currently using for communications, 213 another address must replace this address. And if a node loses an 214 address that the node's peer knows about, the peer must be informed. 215 Similarly, when a node acquires a new address it may generally wish 216 the peer to know about it. 218 Definition. Available address. An address is said to be available 219 if the following conditions are fulfilled: 221 o The address has been assigned to an interface of the node. 223 o If the address is an IPv6 address, we additionally require that 224 (a) the address is valid in the sense of RFC 2461 [2], and that 225 (b) the address is not tentative in the sense of RFC 2462 [3]. In 226 other words, the address assignment is complete so that 227 communications can be started. 229 Note this explicitly allows an address to be optimistic in the 230 sense of [8] even though implementations are probably better off 231 using other addresses as long as there is an alternative. 233 o The address is a global unicast address, a unique local address [9], or an 234 unambiguous IPv6 link-local or IPv4 RFC 1918 address. That is, it 235 is not an IPv6 site-local address. 
Where IPv6 link-local or RFC 236 1918 addresses are used, their use needs to be unambiguous. The 237 precise meaning of ambiguous has not been defined yet, but one 238 approach is requiring that at most one link-local address be used 239 per node within the same connection between two peers. 241 Note: Given the RFC 3484 [5] rules for preferring the smallest scope, 242 it is likely that many IPv6 flows will, at least initially, use 243 link-local addresses. 245 o The address and interface are acceptable for use according to a 246 local policy. 248 Available addresses are discovered and monitored through mechanisms 249 outside the scope of SHIM6 (and HIP or MOBIKE). These mechanisms 250 include IPv6 Neighbor Discovery and Address Autoconfiguration [2] 251 [3], DHCP [4], enhanced network detection mechanisms developed by the 252 DNA working group, and corresponding IPv4 mechanisms, such as [6]. 254 3.2 Locally Operational Addresses 256 Two different granularity levels are needed for failure detection. 257 The coarser granularity is for individual addresses: 259 Definition. Locally Operational Address. An available address is 260 said to be locally operational when its use is known to be possible 261 locally: the interface is up and a relevant default router (if 262 applicable) is known to be reachable. 264 Locally operational addresses are discovered and monitored through 265 mechanisms outside SHIM6 (and HIP or MOBIKE). These mechanisms 266 include IPv6 Neighbor Discovery [2], corresponding IPv4 mechanisms, 267 and link layer specific mechanisms. 269 Theoretically, it is also possible for hosts to learn about routing 270 failures for a particular selected source prefix, even if no protocol 271 exists today to distribute this information in a convenient manner. 272 The development of such protocols would be possible, however. One 273 approach is overloading information in current IPv6 Router 274 Advertisements (see [23]) or adding some new information in them. 
275 Similarly, hosts could learn information from servers that query the 276 BGP routing tables [23]. 278 3.3 Operational Address Pairs 280 The existence of locally operational addresses is not, however, a 281 guarantee that communications can be established with the peer. A 282 failure in the routing infrastructure can prevent the sent packets 283 from reaching their destination. For this reason we need the 284 definition of a second level of granularity, for pairs of addresses: 286 Definition. Bidirectionally operational address pair. A pair of 287 locally operational addresses is said to be a bidirectionally operational address 288 pair, iff bidirectional connectivity can be shown between the 289 addresses. That is, a packet sent with one of the addresses in the 290 source field and the other in the destination field reaches the 291 destination, and vice versa. 293 Unfortunately, there are scenarios where bidirectionally operational 294 address pairs do not exist. For instance, ingress filtering or 295 network failures may result in one address pair being operational in 296 one direction while another one is operational from the other 297 direction. The following definition captures this general situation: 299 Definition. Unidirectionally operational address pair. A pair of 300 locally operational addresses is said to be a unidirectionally 301 operational address pair, iff packets sent with the first address as 302 the source and the second address as the destination can be shown to 303 reach the destination. 305 Both types of operational pairs are discovered and monitored through 306 the following mechanisms: 308 o Positive feedback from upper layer protocols. For instance, TCP 309 can indicate to the IP layer that it is making progress. This is 310 similar to how IPv6 Neighbor Unreachability Detection can in some 311 cases be avoided when upper layers provide information about 312 bidirectional connectivity [2]. 
In the case of unidirectional 313 connectivity, the upper layer protocol responses come back using 314 another address pair, but show that the messages sent using the 315 first address pair have been received. 317 o Negative feedback from upper layer protocols. It is conceivable 318 that upper layer protocols give an indication of a problem to the 319 SHIM6 layer. For instance, TCP could indicate that there's either 320 congestion or lack of connectivity in the path because it is not 321 getting ACKs. 323 o Explicit reachability tests. For instance, the IKEv2 keepalive 324 mechanism can be used to test that the current pair of addresses 325 is operational. 327 o ICMP error messages. Given the ease of spoofing ICMP messages, 328 one should be careful not to trust these blindly, however. Our 329 suggestion is to use ICMP error messages only as a hint to perform 330 an explicit reachability test, but not as a reason to disrupt 331 ongoing communications without other indications of problems. The 332 situation may be different when certain verifications of the ICMP 333 messages are being performed [22]. These verifications can ensure 334 that (practically) only on-path attackers can spoof the messages. 335 Such verifications are not possible for all transport protocols, 336 however. 338 Note that some protocols, such as HIP [13], perform a return 339 routability test of an address before it is taken into use. The 340 purpose of this test is to ensure that fraudulent peers do not trick 341 others into redirecting traffic streams onto innocent victims [31]. 342 Such tests can at the same time work as a means to ensure that an 343 address pair is operational. Note, however, that some advanced 344 optimizations attempt to postpone the reachability tests so that they 345 do not increase movement-related latency [29]. 
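The four sources of information listed in Section 3.3 can be sketched as a small decision function. This is only an illustration, not part of the draft: the names `Evidence`, `Action`, and `react` are hypothetical, and treating negative upper layer feedback as a trigger for an explicit test (rather than as proof of failure) is an assumption made here because such feedback may only indicate congestion.

```python
from enum import Enum, auto


class Evidence(Enum):
    """Hypothetical labels for the information sources in Section 3.3."""
    ULP_POSITIVE = auto()   # e.g. TCP reports that it is making progress
    ULP_NEGATIVE = auto()   # e.g. TCP reports missing ACKs
    TEST_OK = auto()        # explicit reachability test succeeded
    TEST_FAIL = auto()      # explicit reachability test failed
    ICMP_ERROR = auto()     # ICMP error message received


class Action(Enum):
    """Hypothetical reactions of the multihoming layer."""
    MARK_OPERATIONAL = auto()
    MARK_FAILED = auto()
    RUN_TEST = auto()


def react(evidence: Evidence) -> Action:
    """Map one piece of evidence about an address pair to a reaction."""
    if evidence in (Evidence.ULP_POSITIVE, Evidence.TEST_OK):
        return Action.MARK_OPERATIONAL
    if evidence is Evidence.TEST_FAIL:
        return Action.MARK_FAILED
    # ICMP errors are easy to spoof, so they are only a hint to run an
    # explicit test. Negative ULP feedback is handled the same way here
    # (an assumption of this sketch, since it may only mean congestion).
    return Action.RUN_TEST
```

The point of the sketch is that only an explicit test failure marks a pair as failed; weaker signals merely schedule a test.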
347 3.4 Primary Address Pair 349 Contrary to SCTP, which has a specific congestion avoidance design 350 suitable for multi-homing, IP-layer solutions need to avoid sending 351 packets concurrently over multiple paths; TCP behaves rather poorly 352 in such circumstances. For this reason it is necessary to choose a 353 particular pair of addresses as the primary address pair which is 354 used until problems occur, at least for the same session. 356 A primary address pair need not be operational at all times. If 357 there is no traffic to send, we may not know if the primary address 358 pair is operational. Nevertheless, it makes sense to assume that the 359 address pair that worked some time ago continues to work for new 360 communications as well. 362 3.5 Miscellaneous 364 Addresses can become deprecated [2]. When other operational 365 addresses exist, nodes generally wish to move their communications 366 away from the deprecated addresses. 368 Similarly, IPv6 source address selection [5] may guide the selection 369 of a particular source address - destination address pair. 371 4. Architectural Considerations 373 Architecturally, a number of questions arise. One simple question 374 is whether there needs to be communication between a multihoming 375 solution residing at the IP layer and upper layer protocols. Upon 376 changing to a new address pair, the transport layer protocol SHOULD be 377 notified so that it can perform a slow start, or some other form of 378 adaptation to the possibly changed conditions. This is necessary, 379 for instance, when switching from a high-bandwidth LAN interface to a 380 low-bandwidth cellular interface. (Note that this notification cannot 381 be done in protocol designs where the end points are not the 382 final hosts, such as where a gateway is used.) 384 A more fundamental question is which protocols should be responsible 385 for which parts of the problem. 
It seems clear that no multihoming 386 solution should take on the task of lower layers and other IP 387 functions for discovering its own addresses or testing local 388 connectivity. Protocols such as DHCP or Neighbor and Router 389 Discovery do this already. 391 But it is less clear which protocol(s) should discover end-to-end 392 connectivity problems or recover from them. One answer is that this 393 is clearly within the domain of the multihoming protocol. By performing 394 testing and failure detection of the used path and switching to a new 395 path if necessary, the transport and application protocols can work 396 unchanged. 398 On the other hand, one could argue that transport and application 399 protocols would have more knowledge about the situation, and have a 400 better ability to decide when a move is required. For instance, they 401 know what the required throughput and congestion status are. Also, it 402 would be unfortunate if both the IP layer and transport/application 403 layer took action for the same problem, for instance by switching to 404 a new address at the IP layer and throttling back due to "congestion" 405 at the transport layer. 407 One can also envision that applications would be able to tell the IP 408 or transport layer that the current connection is unsatisfactory and 409 an exploration for a better one would be desirable. This would 410 require an API to be developed, however. 412 Generally speaking, we can divide information that a host has into 413 three categories: local information from "lower layers" such as IPv6 414 Neighbor Discovery, transit and congestion condition information 415 either from the multihoming protocol itself or from transport layer 416 protocols and (where available) ECN, and application layer policies 417 that dictate what the requirements are for acceptable connections. 
419 The division of work is largely left as an open issue as far as this 420 document is concerned, but our description works from a point of view 421 of a multihoming protocol at the IP layer. We also note that in the 422 CELP proposal [18], IP, transport, and application layer 423 entities could all share their connectivity status in a common 424 information pool. This may also be a useful approach. 426 Finally, the last architectural question is about the difference 427 between mobility and multihoming. Given our definitions above, 428 there's no fundamental difference with respect to how the 429 multihoming/mobility protocol learns the addresses it has available. 430 However, a practical difference is that in a multihoming scenario 431 there are alternative addresses, whereas in mobility changes to a new 432 address are forced due to the old address no longer being available. 434 5. An Approach 436 One suggested approach consists of a mechanism for keeping track of 437 the host's own available addresses, operational addresses, and 438 operational address pairs. 440 5.1 State Machine for Addresses 442 Addresses can be in the AVAILABLE and OPERATIONAL states. The state 443 transitions relating to this are shown in Figure 1.
445                    +--------------+
446  Address becomes   |              |
447  available         |              |
448  ----------------->|              |
449                    |  AVAILABLE   |
450  <-----------------|              |
451  Address is no     |              |
452  longer available  |              |
453                    +--------------+
454                        |      / \
455               Address  |       |  Address
456               becomes  |       |  is no longer
457           operational  |       |  operational
458                        |       |
459                       \ /      |
460                    +--------------+
461                    |              |
462  Address is no     |              |
463  longer available  |              |
464  <-----------------| OPERATIONAL  |
465                    |              |
466                    |              |
467                    |              |
468                    +--------------+
470 Figure 1. Address state machine. 472 When an address becomes operational, it SHOULD be reported as a new 473 address to the peer. Similarly, when an address is no longer 474 operational or available, the peer SHOULD be informed. 
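The address state machine of Figure 1 can be sketched as a transition table. This is a minimal illustration under stated assumptions: the `UNAVAILABLE` state is a hypothetical name for the implicit state outside the diagram, the event strings and the `"add"`/`"delete"` notification labels are invented for this sketch, and it assumes the peer only needs to be told about an address once that address has been operational.

```python
from enum import Enum


class AddrState(Enum):
    UNAVAILABLE = "unavailable"    # hypothetical: implicit state outside Figure 1
    AVAILABLE = "available"
    OPERATIONAL = "operational"


# (current state, event) -> (next state, notification for the peer or None)
TRANSITIONS = {
    (AddrState.UNAVAILABLE, "becomes_available"): (AddrState.AVAILABLE, None),
    (AddrState.AVAILABLE, "no_longer_available"): (AddrState.UNAVAILABLE, None),
    # A newly operational address SHOULD be reported to the peer.
    (AddrState.AVAILABLE, "becomes_operational"): (AddrState.OPERATIONAL, "add"),
    # Losing operational status or availability SHOULD be reported as well.
    (AddrState.OPERATIONAL, "no_longer_operational"): (AddrState.AVAILABLE, "delete"),
    (AddrState.OPERATIONAL, "no_longer_available"): (AddrState.UNAVAILABLE, "delete"),
}


def step(state: AddrState, event: str):
    """Return (next_state, notification); reject events Figure 1 does not show."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.name}") from None
```

For example, `step(AddrState.AVAILABLE, "becomes_operational")` yields the `OPERATIONAL` state together with an `"add"` notification.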
476 In addition, a particular address can be either preferred or 477 deprecated. This is not shown in the state machine. 479 5.2 State Machine for Address Pair Selection 481 A node runs the address pair selection state machine to choose the 482 currently used primary address pair, the one which is used for 483 sending outgoing packets. A node runs one of these state machines 484 towards each different peer, tracking the known address pairs and 485 their status. Each peer also has its own state machine for talking 486 back to the node; there is no guarantee that the same address pairs 487 (in reverse order) have the same state; the lack of a bidirectionally 488 operational pair would result in a different state on each side, for 489 instance. 491 The state machine can be in the NO PRIMARY, TESTING PRIMARY, and 492 OPERATIONAL PRIMARY states. The chosen address pair is known to be 493 operational in the OPERATIONAL PRIMARY state, and is either 494 unverified or non-operational in the other states. 
496 Figure 2 shows the state machine: 498 +----------------+ 499 | | 500 | | 501 | | 502 | | 503 | NO | 504 | PRIMARY | 505 | | 506 +-----| |<---------------+ 507 | | | | 508 | +----------------+ | 509 | / \ / \ | 510 Add | | | | 511 pair: | Delete | | Test Delete | 512 Send | pair & | | fail & pair & | 513 test | Last | | Last Last | 514 | | | | 515 | +----------------+ | 516 | | | | 517 +---->| |<----+ | 518 | | | Test | 519 Connect: Send test | | | fail & | 520 --------------------->| TESTING | | !Last | 521 | PRIMARY |+----+ | 522 +------------->| | | 523 | | |<----+ | 524 | +---->| | | | 525 | | +----------------+ | | 526 Policy | ICMP | | | | | | 527 change | Timer: | ULP | | Test | Delete | 528 | Send | feedback:| | OK: | pair & | 529 | test | Reset | | Reset | !Last | 530 | | timer | | timer | | 531 | | \ / \ / | | 532 | | +----------------+ | | 533 | +-----| | | | 534 | | |-----+ | 535 +--------------| | | 536 | | | 537 +-----| OPERATIONAL | | 538 ULP feedback: | | PRIMARY | | 539 Reset timer | | |----------------+ 540 +---->| | 541 | | 542 +----------------+ 544 Figure 2. Pair selection state machine. 546 The notation used in Figure 2 is explained below: 548 Connect 550 An event representing the desire of the application to send a 551 packet to a new peer, or an indication from a peer wishing to 552 connect to us. 554 Test OK 556 An event representing a successful completion of the reachability 557 test. 559 Test fail 561 An event representing failure to complete the reachability test. 563 ULP feedback 565 An event representing positive indication from an upper layer 566 protocol that the packets we have sent to the peer are getting 567 through. 569 ICMP 571 An event representing the reception of an ICMP error message. 573 Timer 575 An event representing timer elapsing. 577 Add pair 579 An event representing the addition of a new possible address pair, 580 either through learning a new local address or being told of a new 581 remote address. 
Note that this does not usually result in any 582 immediate action, unless we are currently lacking an operational 583 primary pair. 585 Delete pair 587 An event representing the deletion of the currently chosen primary 588 address pair. 590 Policy change 592 An event representing the desire of the local or remote end to 593 change to a different address pair, despite the current one being 594 operational. This can be due to the availability of a higher- 595 bandwidth connection, cost, or other issues. 597 Last 599 A condition that tells whether or not the currently chosen primary 600 pair is the only known address pair. 602 Send test 604 An action to initiate the reachability test for a particular pair. 605 This test is typically embedded in the SHIM6 connection setup 606 exchange when run initially, and is a separate exchange later. 608 Note that due to potentially asymmetric connectivity, both sides 609 have to perform their own tests, and make their own primary pair 610 selections. 612 Reset timer 614 An action to reset a timer so that it will send an event after a 615 specified time. 617 The state machine also assumes an underlying multihoming signaling 618 capability, consisting of the following abstract message exchanges: 620 Open 622 Establishes a connection between the peers. May also exchange 623 locator sets and test reachability at the same time. 625 Test 627 Verifies reachability using a specific address pair. 629 Add 631 Informs the peer about new locators. 633 Delete 635 Informs the peer about losing some locators. 637 Note that the above state machine leaves open how specific address 638 pairs are chosen, as this will be discussed in the next section. We 639 have also, on purpose, decided to avoid attaching functional labels 640 such as "backup" to other address pairs beyond the primary pair. It 641 is our belief that a general design does not need these labels. 
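One reading of the pair selection state machine in Figure 2 can be sketched as a single transition function. This is an interpretation, not a definitive implementation: the function name `step`, the event strings, and the action labels are hypothetical, and the "Connect" event is assumed here to start from NO PRIMARY. An action is returned only where the figure labels one; a real implementation would presumably also need to choose and test a new pair on the transitions where the figure shows no action.

```python
from enum import Enum


class PairState(Enum):
    NO_PRIMARY = 1
    TESTING_PRIMARY = 2
    OPERATIONAL_PRIMARY = 3


def step(state: PairState, event: str, last: bool = False):
    """Return (next_state, action) for one event.

    `last` is the "Last" condition: the chosen primary pair is the only
    known address pair. Events not drawn in Figure 2 leave the state as is.
    """
    if state is PairState.NO_PRIMARY and event in ("connect", "add_pair"):
        return PairState.TESTING_PRIMARY, "send_test"

    if state is PairState.TESTING_PRIMARY:
        if event in ("test_ok", "ulp_feedback"):
            return PairState.OPERATIONAL_PRIMARY, "reset_timer"
        if event == "test_fail":
            # Give up only when no other pair is left to try.
            return (PairState.NO_PRIMARY if last else PairState.TESTING_PRIMARY), None
        if event == "delete_pair" and last:
            return PairState.NO_PRIMARY, None

    if state is PairState.OPERATIONAL_PRIMARY:
        if event == "ulp_feedback":
            return PairState.OPERATIONAL_PRIMARY, "reset_timer"
        if event in ("icmp", "timer"):
            # ICMP errors and timer expiry only trigger a new test.
            return PairState.TESTING_PRIMARY, "send_test"
        if event == "policy_change":
            return PairState.TESTING_PRIMARY, None
        if event == "delete_pair":
            return (PairState.NO_PRIMARY if last else PairState.TESTING_PRIMARY), None

    return state, None
```

Note that "Add pair" causes no immediate action outside NO PRIMARY, which matches the description of that event above.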

643 5.3  Pair Selection Algorithm

645   The pair selection state machine assumes an ability to pick primary
646   and alternative address pairs.

648   This process results in a combinatorial explosion when there are many
649   addresses on both sides.  Do both sides track all possible
650   combinations of addresses?  If a failure occurs, shall all
651   combinations be tested before giving up?  Are such tests performed in
652   parallel or in sequence, and what kind of backoff procedures should
653   be applied?

655   Our suggestion is that nodes MUST first consult the rules in Section
656   4 of RFC 3484 [5] to determine which combinations of addresses are
657   legal from a local point of view, as this reduces the search space.
658   RFC 3484 also provides a priority ordering among different address
659   pairs, possibly making the search faster.  Nodes SHOULD also use
660   local information, such as known quality of service parameters or
661   interface types, to determine which addresses are preferred over
662   others, and try pairs containing such addresses first.  In some cases
663   we can also learn the peer's preferences through the multihoming
      protocol [13].

665      Discussion note 1: It may also be possible to simulate preferences
666      by choosing not to tell the peer about some (non-preferred)
667      addresses.

669      Discussion note 2: The preferences may either be learned
670      dynamically or be configured.  It is believed, however, that
671      dynamic learning based purely on the SHIM6 protocol is too hard
672      and not a task this layer should perform.  However, solutions
673      where multiple protocols share their information in a common pool
674      of locators could provide this information from transport
675      protocols [18].

677   The reception of packets from the peer with a given address pair is a
678   good hint that the address pair works, particularly when these
679   packets are authenticated multihoming protocol packets.
      However, the
680   reception of these packets alone is an insufficient reason to switch
681   to a new address, as in a unidirectional connectivity case the
682   return path may not work.

684   One suggested implementation strategy is to record the
685   reachability test result (an on/off value) and multiply this by the
686   age of the information.  This allows recently tested address pairs to
687   be chosen before old ones.

689   Out of the set of possible candidate address pairs, nodes SHOULD
690   attempt a test through all of them, but MUST do this sequentially
691   (based on an implementation-dependent priority order) and using an
692   exponential back-off procedure.

694   This sequential process is necessary in order to avoid a "signaling
695   storm" when an outage occurs (particularly for a complete site).
696   However, it also limits the number of addresses that can in practice
697   be used for multihoming, considering that transport and application
698   layer protocols will fail if the switch to a new address pair takes
699   too long.  For instance, we can assume that the initial timeout value
700   is 0.1 seconds and there are four addresses on both sides.  Going
701   through all sixteen address pairs and doubling the timeout value at
702   every trial would take about 6550 seconds (0.1 * (2^16 - 1))!

704   Finally, as has been noted in the context of MOBIKE, the existence of
705   NATs can require that peers continuously monitor the operational
706   status of address pairs, as otherwise NAT state related to a
707   particular communication is lost, and the peer on the outer side of
708   the NAT can no longer reach the peer inside the NAT.

710 5.4  Protocol for Testing Unidirectional Reachability

712   Testing for reachability is not easy in an environment where
713   unidirectional reachability is a possibility.  This is because the
714   test of a single pair may not yield working paths for both the
715   request and the response packets.
      The following protocol could be
716   used to avoid this problem:

718      Peer A                                        Peer B
719         |                                             |
720         |  Poll 1 (src=A1, dst=B1)                    |
721         |-------------------------------------------->|
722         |                                             |
723         |  Poll 2 (src=B1, dst=A1)      OK: 1         |
724         |        X------------------------------------|
725         |                                             |
726         |  Poll 3 (src=A2, dst=B1)                    |
727         |------------------------------X              |
728         |                                             |
729         |  Poll 4 (src=B2, dst=A1)      OK: 1         |
730         |<--------------------------------------------|
731         |                                             |
732         |  Poll 5 (src=A1, dst=B1)      OK: 4         |
733         |-------------------------------------------->|
734         |                                             |

736   When B receives the first Poll message, it records that it has
737   received it.  The Poll message from B, however, is lost, so A tries
738   again with another pair.  This is lost too, but B continues its own
739   testing process by sending its second Poll message, which is received
740   by A.  The messages carry identifiers, and a list of the identifiers
741   of the messages that the sender had itself successfully received
742   earlier.

744   At the end of the example case, A and B know that they have a working
745   path from A to B using (A1, B1) and from B to A using (B2, A1).

747   More generally, when A decides that it needs to test for
748   connectivity, it will initiate a set of Poll messages, in sequence,
749   until it gets a Poll message from B indicating (a) that B has
750   received one of A's Poll messages and, obviously, (b) that B's Poll
751   message is getting through.  B uses the same algorithm, but starts
752   the process from the reception of the first Poll message from A.

754   Note that this protocol can be implemented in different ways.  One
755   approach is to rely on data packets, such as TCP payload packets and
756   acknowledgements.  This method has the benefit that it likely passes
757   easily through firewalls and other middleboxes.
      One exception to
758   this is stateful firewalls that wish to know what happened "earlier"
759   in the connection, but it seems that such firewalls are fundamentally
760   incompatible with multi-homing anyway.  One drawback of this method
761   is, however, that the number of available payload packets may not
762   match the need in a situation where a lot of address pairs need to be
763   explored.

765   Another approach is to have a completely separate protocol for the
766   exploration.  This would need to be explicitly allowed in firewalls
767   before it could be used.  On the other hand, it would then be very
768   clear to the firewall administrators what they are letting through.

770 6.  Security Considerations

772   Attackers may spoof various indications from lower layers and the
773   network in an effort to confuse the peers about which addresses are
774   or are not working.  For example, attackers may spoof ICMP error
775   messages in an effort to cause the parties to move their traffic
776   elsewhere or even to disconnect.  Attackers may also spoof
777   information related to network attachments, router discovery, and
778   address assignments in an effort to make the parties believe they
779   have Internet connectivity when in reality they do not.

781   This may cause use of non-preferred addresses or even denial-of-
782   service.

784   SHIM6 does not provide any protection of its own for indications from
785   other parts of the protocol stack.  However, MOBIKE is resistant to
786   incorrect information from these sources in the sense that it
787   provides its own security for both the signaling of addressing
788   information and the actual payload data transmission.  Denial-of-
789   service vulnerabilities remain, however.
      Some aspects of these
790   vulnerabilities can be mitigated through the use of techniques
791   specific to the other parts of the stack, such as properly dealing
792   with ICMP errors [22], link layer security, or the use of SEND [12]
793   to protect IPv6 Router and Neighbor Discovery.

795 7.  References

797 7.1  Normative References

799   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
800        Levels", BCP 14, RFC 2119, March 1997.

802   [2]  Narten, T., Nordmark, E., and W. Simpson, "Neighbor Discovery
803        for IP Version 6 (IPv6)", RFC 2461, December 1998.

805   [3]  Thomson, S. and T. Narten, "IPv6 Stateless Address
806        Autoconfiguration", RFC 2462, December 1998.

808   [4]  Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., and M.
809        Carney, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)",
810        RFC 3315, July 2003.

812   [5]  Draves, R., "Default Address Selection for Internet Protocol
813        version 6 (IPv6)", RFC 3484, February 2003.

815   [6]  Aboba, B., "Detection of Network Attachment (DNA) in IPv4",
816        draft-ietf-dhc-dna-ipv4-08 (work in progress), July 2004.

818   [7]  Choi, J., "Detecting Network Attachment in IPv6 Goals",
819        draft-ietf-dna-goals-00 (work in progress), June 2004.

821   [8]  Moore, N., "Optimistic Duplicate Address Detection for IPv6",
822        draft-ietf-ipv6-optimistic-dad-01 (work in progress), June 2004.

824   [9]  Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
825        Addresses", draft-ietf-ipv6-unique-local-addr-05 (work in
826        progress), June 2004.

828 7.2  Informative References

830   [10] Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer,
831        H., Taylor, T., Rytina, I., Kalla, M., Zhang, L., and V.
832        Paxson, "Stream Control Transmission Protocol", RFC 2960,
833        October 2000.

835   [11] Rosenberg, J., Weinberger, J., Huitema, C., and R. Mahy, "STUN
836        - Simple Traversal of User Datagram Protocol (UDP) Through
837        Network Address Translators (NATs)", RFC 3489, March 2003.

839   [12] Arkko, J., Kempf, J., Zill, B., and P.
           Nikander, "SEcure
840        Neighbor Discovery (SEND)", RFC 3971, March 2005.

842   [13] Nikander, P., "End-Host Mobility and Multi-Homing with Host
843        Identity Protocol", draft-ietf-hip-mm-00 (work in progress),
844        October 2004.

846   [14] Kivinen, T., "Design of the MOBIKE protocol",
847        draft-ietf-mobike-design-00 (work in progress), June 2004.

849   [15] Eronen, P., "IKEv2 Mobility and Multihoming Protocol (MOBIKE)",
850        draft-ietf-mobike-protocol-00 (work in progress), June 2005.

852   [16] Rosenberg, J., "Interactive Connectivity Establishment (ICE): A
853        Methodology for Network Address Translator (NAT) Traversal for
854        Multimedia Session Establishment Protocols",
855        draft-ietf-mmusic-ice-02 (work in progress), July 2004.

857   [17] Stewart, R., "Stream Control Transmission Protocol (SCTP)
858        Dynamic Address Reconfiguration",
859        draft-ietf-tsvwg-addip-sctp-10 (work in progress),
860        January 2005.

862   [18] Crocker, D., "Framework for Common Endpoint Locator Pools",
863        draft-crocker-celp-00 (work in progress), February 2004.

865   [19] Dupont, F., "Address Management for IKE version 2",
866        draft-dupont-ikev2-addrmgmt-05 (work in progress), June 2004.

868   [20] Eronen, P., "Mobility Protocol Options for IKEv2 (MOPO-IKE)",
869        draft-eronen-mobike-mopo-00 (work in progress), July 2004.

871   [21] Eronen, P. and H. Tschofenig, "Simple Mobility and Multihoming
872        Extensions for IKEv2 (SMOBIKE)", draft-eronen-mobike-simple-00
873        (work in progress), March 2004.

875   [22] Gont, F., "ICMP attacks against TCP",
876        draft-gont-tcpm-icmp-attacks-00 (work in progress),
877        August 2004.

879   [23] Huitema, C., "Address selection in multihomed environments",
880        draft-huitema-multi6-addr-selection-00 (work in progress),
881        October 2004.

883   [24] Kivinen, T., "MOBIKE protocol",
884        draft-kivinen-mobike-protocol-00 (work in progress),
885        March 2004.

887   [25] Nordmark, E., "Multihoming without IP Identifiers",
888        draft-nordmark-multi6-noid-02 (work in progress), July 2004.

890   [26] Nordmark, E., "Multihoming using 64-bit Crypto-based IDs",
891        draft-nordmark-multi6-cb64-00 (work in progress),
892        November 2003.

894   [27] Nordmark, E., "Strong Identity Multihoming using 128 bit
895        Identifiers (SIM/CBID128)", draft-nordmark-multi6-sim-01 (work
896        in progress), October 2003.

898   [28] Rosenberg, J., "Traversal Using Relay NAT (TURN)",
899        draft-rosenberg-midcom-turn-05 (work in progress), July 2004.

901   [29] Vogt, C., Arkko, J., Bless, R., Doll, M., and T. Kuefner,
902        "Credit-Based Authorization for Mobile IPv6 Early Binding
903        Updates", draft-vogt-mipv6-credit-based-authorization-00 (work
904        in progress), May 2004.

906   [30] Ylitalo, J., "Weak Identifier Multihoming Protocol (WIMP)",
907        draft-ylitalo-multi6-wimp-01 (work in progress), July 2004.

909   [31] Aura, T., Roe, M., and J. Arkko, "Security of Internet Location
910        Management", In Proceedings of the 18th Annual Computer
911        Security Applications Conference, Las Vegas, Nevada, USA,
912        December 2002.

914 Author's Address

916   Jari Arkko
917   Ericsson
918   Jorvas  02420
919   Finland

921   Email: jari.arkko@ericsson.com

923 Appendix A.  Contributors

925   This draft attempts to summarize the thoughts and unpublished
926   contributions of many people, including the MULTI6 WG design team
927   members Marcelo Bagnulo Braun, Iljitsch van Beijnum, Erik Nordmark,
928   Geoff Huston, Margaret Wasserman, and Jukka Ylitalo, the MOBIKE WG
929   contributors Pasi Eronen, Tero Kivinen, Francis Dupont, Spencer
930   Dawkins, and James Kempf, and my colleague Pekka Nikander at
931   Ericsson.  This draft is also in debt to work done in the context of
932   SCTP [10].

934   The protocol design in Section 5.4 is due to Erik, Marcelo, and
935   Iljitsch.

937 Appendix B.  Acknowledgements

939   The author would also like to thank Christian Huitema, Pekka Savola,
940   and Hannes Tschofenig for interesting discussions in this problem
941   space, and for their comments on earlier versions of this draft.

943 Intellectual Property Statement

945   The IETF takes no position regarding the validity or scope of any
946   Intellectual Property Rights or other rights that might be claimed to
947   pertain to the implementation or use of the technology described in
948   this document or the extent to which any license under such rights
949   might or might not be available; nor does it represent that it has
950   made any independent effort to identify any such rights.  Information
951   on the procedures with respect to rights in RFC documents can be
952   found in BCP 78 and BCP 79.

954   Copies of IPR disclosures made to the IETF Secretariat and any
955   assurances of licenses to be made available, or the result of an
956   attempt made to obtain a general license or permission for the use of
957   such proprietary rights by implementers or users of this
958   specification can be obtained from the IETF on-line IPR repository at
959   http://www.ietf.org/ipr.

961   The IETF invites any interested party to bring to its attention any
962   copyrights, patents or patent applications, or other proprietary
963   rights that may cover technology that may be required to implement
964   this standard.  Please address the information to the IETF at
965   ietf-ipr@ietf.org.

967 Disclaimer of Validity

969   This document and the information contained herein are provided on an
970   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
971   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
972   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
973   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
974   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
975   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

977 Copyright Statement

979   Copyright (C) The Internet Society (2005).  This document is subject
980   to the rights, licenses and restrictions contained in BCP 78, and
981   except as set forth therein, the authors retain all their rights.

983 Acknowledgment

985   Funding for the RFC Editor function is currently provided by the
986   Internet Society.