Network Working Group                                           J. Arkko
Internet-Draft                                                  Ericsson
Intended status: Standards Track                          I. van Beijnum
Expires: December 26, 2008                                IMDEA Networks
                                                           June 24, 2008


    Failure Detection and Locator Pair Exploration Protocol for IPv6
                              Multihoming
                 draft-ietf-shim6-failure-detection-13

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 26, 2008.

Abstract

   This document specifies how the level 3 multihoming shim protocol
   (SHIM6) detects failures between two communicating hosts.  It also
   specifies an exploration protocol for switching to another pair of
   interfaces and/or addresses between the same hosts if a failure
   occurs and an operational pair can be found.

Table of Contents

   1.  Introduction
   2.  Requirements language
   3.  Definitions
     3.1.  Available Addresses
     3.2.  Locally Operational Addresses
     3.3.  Operational Address Pairs
     3.4.  Primary Address Pair
     3.5.  Current Address Pair
   4.  Protocol Overview
     4.1.  Failure Detection
     4.2.  Full Reachability Exploration
     4.3.  Exploration Order
   5.  Protocol Definition
     5.1.  Keepalive Message
     5.2.  Probe Message
     5.3.  Keepalive Timeout Option Format
   6.  Behaviour
     6.1.  Incoming payload packet
     6.2.  Outgoing payload packet
     6.3.  Keepalive timeout
     6.4.  Send timeout
     6.5.  Retransmission
     6.6.  Reception of the Keepalive message
     6.7.  Reception of the Probe message State=Exploring
     6.8.  Reception of the Probe message State=InboundOk
     6.9.  Reception of the Probe message State=Operational
     6.10. Graphical Representation of the State Machine
   7.  Protocol Constants
   8.  Security Considerations
   9.  Operational Considerations
   10. IANA Considerations
   11. References
     11.1. Normative References
     11.2. Informative References
   Appendix A.  Example Protocol Runs
   Appendix B.  Contributors
   Appendix C.  Acknowledgements
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   The SHIM6 protocol [I-D.ietf-shim6-proto] extends IPv6 to support
   multihoming.  It is an IP layer mechanism that hides multihoming from
   applications.  A part of the SHIM6 solution involves detecting when a
   currently used pair of addresses (or interfaces) between two
   communicating hosts has failed, and picking another pair when this
   occurs.
   We call the former failure detection, and the latter locator
   pair exploration.

   This document specifies the mechanisms and protocol messages to
   achieve both failure detection and locator pair exploration.  This
   part of the SHIM6 protocol is called the REAchability Protocol
   (REAP).

   Failure detection is made as lightweight as possible.  Data traffic
   in both directions is observed, and in the case where there is no
   traffic because the communication is idle, failure detection is also
   idle and doesn't generate any packets.  When data traffic is flowing
   in both directions, there is no need to send failure detection
   packets, either.  Only when there is traffic in one direction does
   the failure detection mechanism generate keepalives in the other
   direction.  As a result, whenever there is outgoing traffic and no
   incoming return traffic or keepalives, there must be a failure, at
   which point the locator pair exploration is performed to find a
   working address pair for each direction.

   The document is structured as follows: Section 3 defines a set of
   useful terms, Section 4 gives an overview of REAP, and Sections 5
   and 6 specify the message formats and behaviour in detail.
   Section 8 discusses the security considerations of REAP.

   In this specification, we consider an address to be synonymous with a
   locator.  Other parts of the SHIM6 protocol ensure that the different
   locators used by a node actually belong together.  That is, REAP is
   not responsible for ensuring that it ends up with a legitimate
   locator.

   REAP has been designed to be used with SHIM6, and is therefore
   tailored to an environment where it runs on hosts, uses widely
   varying types of paths and is unaware of application context.  As a
   result, REAP attempts to be as self-configuring and unobtrusive as
   possible.
   In particular, it avoids sending any packets except where
   absolutely required and employs exponential back-off to avoid
   congestion.  The downside is that it cannot offer the same
   granularity of detecting problems as mechanisms that have more
   application context and ability to negotiate or configure parameters.
   Future versions of this specification may consider extensions with
   such capabilities, for instance through inheriting some mechanisms
   from Bidirectional Forwarding Detection (BFD) protocol
   [I-D.ietf-bfd-base].

2.  Requirements language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Definitions

   This section defines terms useful for discussing failure detection
   and locator pair exploration.

3.1.  Available Addresses

   SHIM6 nodes need to be aware of what addresses they themselves have.
   If a node loses the address it is currently using for communications,
   another address must replace this address.  And if a node loses an
   address that the node's peer knows about, the peer must be informed.
   Similarly, when a node acquires a new address it may generally wish
   the peer to know about it.

   Definition.  Available address - an address is said to be available
   if all the following conditions are fulfilled:

   o  The address has been assigned to an interface of the node.

   o  The valid lifetime of the prefix (RFC 4861 [RFC4861] Section
      4.6.2) associated with the address has not expired.

   o  The address is not tentative in the sense of RFC 4862 [RFC4862].
      In other words, the address assignment is complete so that
      communications can be started.

      Note that this explicitly allows an address to be optimistic in
      the sense of Optimistic DAD [RFC4429] even though implementations
      may prefer using other addresses as long as there is an
      alternative.

   o  The address is a global unicast or unique local address [RFC4193].
      That is, it is not an IPv6 site-local or link-local address.

      With link-local addresses, the nodes would be unable to determine
      on which link the given address is usable.

   o  The address and interface are acceptable for use according to a
      local policy.

   Available addresses are discovered and monitored through mechanisms
   outside the scope of SHIM6.  SHIM6 implementations MUST be able to
   employ information provided by IPv6 Neighbor Discovery [RFC4861],
   Address Autoconfiguration [RFC4862], and DHCP [RFC3315] (when DHCP is
   implemented).  This information includes the availability of a new
   address and status changes of existing addresses (such as when an
   address becomes invalid).

3.2.  Locally Operational Addresses

   Two different granularity levels are needed for failure detection.
   The coarser granularity is for individual addresses:

   Definition.  Locally Operational Address - an available address is
   said to be locally operational when its use is known to be possible
   locally: the interface is up, a default router (if needed) suitable
   for this address is known to be reachable, and no other local
   information points to the address being unusable.

   Locally operational addresses are discovered and monitored through
   mechanisms outside the SHIM6 protocol.  SHIM6 implementations MUST be
   able to employ information provided by Neighbor Unreachability
   Detection [RFC4861].  Implementations MAY also employ additional,
   link layer specific mechanisms.
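
   The two definitions above (available address, locally operational
   address) can be read as nested predicates.  The Python sketch below
   is only an illustration under stated assumptions: the LocalAddress
   record and all of its field names are hypothetical, and a real
   implementation would obtain this state from Neighbor Discovery,
   address autoconfiguration, DHCP, and local policy rather than from
   a hand-filled record.

```python
import ipaddress
from dataclasses import dataclass

# Unique local addresses (RFC 4193) live in fc00::/7.
ULA = ipaddress.ip_network("fc00::/7")


@dataclass
class LocalAddress:
    """Hypothetical record of what a node knows about one of its addresses."""
    address: str
    assigned: bool = True              # assigned to an interface of the node
    valid_lifetime_left: float = 1.0   # seconds of valid lifetime remaining
    tentative: bool = False            # tentative in the RFC 4862 sense
    policy_ok: bool = True             # acceptable according to local policy
    interface_up: bool = True
    router_reachable: bool = True      # suitable default router reachable


def scope_ok(address: str) -> bool:
    """Accept unique local and global unicast; reject link-/site-local etc."""
    ip = ipaddress.IPv6Address(address)
    if ip in ULA:
        return True
    return not (ip.is_link_local or ip.is_site_local or ip.is_multicast
                or ip.is_loopback or ip.is_unspecified)


def is_available(a: LocalAddress) -> bool:
    """All conditions of the 'available address' definition must hold."""
    return (a.assigned and a.valid_lifetime_left > 0 and not a.tentative
            and scope_ok(a.address) and a.policy_ok)


def is_locally_operational(a: LocalAddress) -> bool:
    """An available address whose local use is known to be possible."""
    return is_available(a) and a.interface_up and a.router_reachable
```

   For example, is_available(LocalAddress("fe80::1")) is false because
   of the link-local scope, while a global address on an interface that
   is down is available but not locally operational.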

      Note 1: A part of the problem in ensuring that an address is
      operational is making sure that after a change in link layer
      connectivity we are still connected to the same IP subnet.
      Mechanisms such as DNA CPL [I-D.ietf-dna-cpl] or DNAv6
      [I-D.ietf-dna-protocol] can be used to ensure this.

      Note 2: In theory, it would also be possible for hosts to learn
      about routing failures for a particular selected source prefix, if
      only suitable protocols for this purpose existed.  Some proposals
      in this space have been made, see, for instance
      [I-D.bagnulo-shim6-addr-selection] and
      [I-D.huitema-multi6-addr-selection], but none have been
      standardized to date.

3.3.  Operational Address Pairs

   The existence of locally operational addresses is not, however, a
   guarantee that communications can be established with the peer.  A
   failure in the routing infrastructure can prevent packets from
   reaching their destination.  For this reason we need the definition
   of a second level of granularity, for pairs of addresses:

   Definition.  Bidirectionally operational address pair - a pair of
   locally operational addresses is said to be an operational address
   pair when bidirectional connectivity can be shown between the
   addresses.  That is, a packet sent with one of the addresses in the
   source field and the other in the destination field reaches the
   destination, and vice versa.

   Unfortunately, there are scenarios where bidirectionally operational
   address pairs do not exist.  For instance, ingress filtering or
   network failures may result in one address pair being operational in
   one direction while another one is operational from the other
   direction.  The following definition captures this general situation:

   Definition.  Unidirectionally operational address pair - a pair of
   locally operational addresses is said to be a unidirectionally
   operational address pair when packets sent with the first address as
   the source and the second address as the destination reach the
   destination.

   SHIM6 implementations MUST support the discovery of operational
   address pairs through the use of explicit reachability tests and
   Forced Bidirectional Communication (FBD), described later in this
   specification.  Future extensions of SHIM6 may specify additional
   mechanisms.  Some ideas for such mechanisms are listed below, but not
   fully specified in this document:

   o  Positive feedback from upper layer protocols.  For instance, TCP
      can indicate to the IP layer that it is making progress.  This is
      similar to how IPv6 Neighbor Unreachability Detection can in some
      cases be avoided when upper layers provide information about
      bidirectional connectivity [RFC4861].

      In the case of unidirectional connectivity, the upper layer
      protocol responses come back using another address pair, but show
      that the messages sent using the first address pair have been
      received.

   o  Negative feedback from upper layer protocols.  It is conceivable
      that upper layer protocols give an indication of a problem to the
      multihoming layer.  For instance, TCP could indicate that there's
      either congestion or lack of connectivity in the path because it
      is not getting ACKs.

   o  ICMP error messages.  Given the ease of spoofing ICMP messages,
      one should be careful to not trust these blindly, however.  One
      approach would be to use ICMP error messages only as a hint to
      perform an explicit reachability test or move an address pair to a
      lower place in the list of address pairs to be probed, but not as
      a reason to disrupt ongoing communications without other
      indications of problems.
      The situation may be different when
      certain verifications of the ICMP messages are being performed, as
      explained by Gont in [I-D.ietf-tcpm-icmp-attacks].  These
      verifications can ensure that (practically) only on-path attackers
      can spoof the messages.

3.4.  Primary Address Pair

   The primary address pair consists of the addresses that upper layer
   protocols use in their interaction with the SHIM6 layer.  Use of the
   primary address pair means that the communication is compatible with
   regular non-SHIM6 communication and no context ID needs to be
   present.

3.5.  Current Address Pair

   SHIM6 needs to avoid sending packets which belong to the same
   transport connection concurrently over multiple paths.  This is
   because congestion control in commonly used transport protocols is
   based upon a notion of a single path.  While routing can introduce
   path changes as well and transport protocols have means to deal with
   this, frequent changes will cause problems.  Effective congestion
   control over multiple paths is considered a research topic at the
   time this specification is written.  SHIM6 does not attempt to employ
   multiple paths simultaneously.

      Note: SCTP and future multipath transport protocols are likely to
      require interaction with SHIM6, at least to ensure that they do
      not employ SHIM6 unexpectedly.

   For these reasons it is necessary to choose a particular pair of
   addresses as the current address pair which is used until problems
   occur, at least for the same session.

      It is theoretically possible to support multiple current address
      pairs for different transport sessions or SHIM6 contexts.
      However, this is not supported in this version of the SHIM6
      protocol.

   A current address pair need not be operational at all times.  If
   there is no traffic to send, we may not know if the primary address
   pair is operational.
   Nevertheless, it makes sense to assume that the
   address pair that worked previously continues to be operational for
   new communications as well.

4.  Protocol Overview

   This section discusses the design of the reachability detection and
   full reachability exploration mechanisms, and gives an overview of
   the REAP protocol.

   Exploring the full set of communication options between two hosts
   that both have two or more addresses is an expensive operation as the
   number of combinations to be explored increases very quickly with the
   number of addresses.  For instance, with two addresses on both sides,
   there are four possible address pairs.  Since we can't assume that
   reachability in one direction automatically means reachability for
   the complement pair in the other direction, the total number of
   two-way combinations is eight.  (Combinations = nA * nB * 2.)

   An important observation in multihoming is that failures are
   relatively infrequent, so that an operational pair that worked a few
   seconds ago is very likely to be still operational.  So it makes
   sense to have a light-weight protocol that confirms existing
   reachability, and only invoke heavier exploration when there is a
   suspected failure.

4.1.  Failure Detection

   Failure detection consists of three parts: tracking local
   information, tracking remote peer status, and finally verifying
   reachability.  Tracking local information consists of using, for
   instance, reachability information about the local router as an
   input.  Nodes SHOULD employ techniques listed in Section 3.1 and
   Section 3.2 to track the local situation.  It is also necessary to
   track remote address information from the peer.  For instance, if the
   peer's currently used address is no longer in use, a mechanism to
   relay that information is needed.  The Update Request message in the
   SHIM6 protocol is used for this purpose [I-D.ietf-shim6-proto].
   Finally, when the local and remote information indicates that
   communication should be possible and there are upper layer packets to
   be sent, reachability verification is necessary to ensure that the
   peers actually have an operational address pair.

   A technique called Forced Bidirectional Detection (FBD, originally
   defined in an earlier SHIM6 document [I-D.ietf-shim6-reach-detect])
   is employed for the reachability verification.  Reachability for the
   currently used address pair in a SHIM6 context is determined by
   making sure that whenever there is data traffic in one direction,
   there is also traffic in the other direction.  This can be data
   traffic as well, but also transport layer acknowledgments or a REAP
   reachability keepalive if there is no other traffic.  This way, it is
   no longer possible to have traffic in only one direction, so whenever
   there is data traffic going out, but there are no return packets,
   there must be a failure, so the full exploration mechanism is
   started.

   A more detailed description of the current pair reachability
   evaluation mechanism:

   1.  To prevent the other side from concluding there is a reachability
       failure, it's necessary for a host implementing the failure
       detection mechanism to generate periodic keepalives when there is
       no other traffic.

       FBD works by generating REAP keepalives if the node is receiving
       packets from its peer but not sending any of its own.  The
       keepalives are sent at certain intervals so that the other side
       knows there is a reachability problem when it doesn't receive any
       incoming packets for its Send Timeout period.  The host
       communicates its Send Timeout value to the peer as a Keepalive
       Timeout Option (Section 5.3) in the I2, I2bis, R2, or UPDATE
       messages.  The peer then maps this value to its Keepalive Timeout
       value.

       The interval after which keepalives are sent is named Keepalive
       Interval.
       The RECOMMENDED approach is sending keepalives at
       one-half to one-third of the Keepalive Timeout interval, so that
       multiple keepalives are generated and have time to reach the
       correspondent before it times out.

   2.  Whenever outgoing data packets are generated, a timer is started
       to reflect the requirement that the peer should generate return
       traffic from data packets.  The timeout value is set to the value
       of Send Timeout.

       For the purposes of this specification, "data packet" refers to
       any packet that is part of a SHIM6 context, including both upper
       layer protocol packets and SHIM6 protocol messages except those
       defined in this specification.

   3.  Whenever incoming data packets are received, the timer associated
       with the return traffic from the peer is stopped, and another
       timer is started to reflect the requirement for this node to
       generate return traffic.  This timeout value is set to the value
       of Keepalive Timeout.

       These two timers are mutually exclusive.  In other words, either
       the node is expecting to see traffic from the peer based on the
       traffic that the node sent earlier or the node is expecting to
       respond to the peer based on the traffic that the peer sent
       earlier (or the node is in an idle state).

   4.  The reception of a REAP keepalive packet leads to stopping the
       timer associated with the return traffic from the peer.

   5.  Keepalive Interval seconds after the last data packet has been
       received for a context, and if no other packet has been sent
       within this context since the data packet has been received, a
       REAP keepalive packet is generated for the context in question
       and transmitted to the correspondent.  A host may send the
       keepalive sooner than Keepalive Interval seconds if
       implementation considerations warrant this, but should take care
       to avoid sending keepalives at an excessive rate.
       REAP keepalive
       packets SHOULD continue to be sent at the Keepalive Interval
       until either a data packet in the SHIM6 context has been received
       from the peer or the Keepalive Timeout expires.  Keepalives are
       not sent at all if data was sent within the Keepalive Interval.
       A recommended value range for Keepalive Interval is specified in
       Section 7.  The actual value SHOULD be randomized in order to
       prevent synchronization.

   6.  Send Timeout seconds after the transmission of a data packet with
       no return traffic on this context, a full reachability
       exploration is started.

   Section 7 provides some suggested defaults for these timeout values.
   Experience from the deployment of the SHIM6 protocol is needed in
   order to determine what values are most suitable.

4.2.  Full Reachability Exploration

   As explained in previous sections, the currently used address pair
   may become invalid either through one of the addresses becoming
   unavailable or nonoperational, or the pair itself being declared
   nonoperational.  An exploration process attempts to find another
   operational pair so that communications can resume.

   What makes this process hard is the requirement to support
   unidirectionally operational address pairs.  It is insufficient to
   probe address pairs by a simple request-response protocol.
   Instead, the party that first detects the problem starts a process
   where it tries each of the different address pairs in turn by sending
   a message to its peer.  These messages carry information about the
   state of connectivity between the peers, such as whether the sender
   has seen any traffic from the peer recently.  When the peer receives
   a message that indicates a problem, it assists the process by
   starting its own parallel exploration to the other direction, again
   sending information about the recently received payload traffic or
   signaling messages.
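
   The need for two parallel explorations can be illustrated with a
   heavily simplified Python sketch.  It is not the REAP state machine:
   it ignores timers, back-off, nonces, and the Probe message format,
   and the function name explore, the delivered set, and the address
   labels are all hypothetical names introduced for illustration.  The
   sketch assumes one probe per side per round.

```python
from itertools import product


def explore(a_addrs, b_addrs, delivered, max_rounds=20):
    """Toy model of two-sided exploration over unidirectional pairs.

    delivered is the set of (source, destination) pairs on which a packet
    actually arrives.  Returns (pair usable from A to B, pair usable from
    B to A), or None if no such combination is found in max_rounds.
    """
    a_pairs = list(product(a_addrs, b_addrs))  # A's candidates, in order
    b_pairs = list(product(b_addrs, a_addrs))  # B's candidates, in order
    heard_by_b = None   # first pair on which B received a probe from A
    b_sent = 0          # number of probes B has sent so far

    for rnd in range(max_rounds):
        # A keeps trying its candidate pairs in turn.
        probe = a_pairs[rnd % len(a_pairs)]
        if heard_by_b is None and probe in delivered:
            heard_by_b = probe  # B learns that this A-to-B pair works

        # Once B has heard from A, it explores in the other direction,
        # reporting in each probe which pair it received A's probe on.
        if heard_by_b is not None:
            reply = b_pairs[b_sent % len(b_pairs)]
            b_sent += 1
            if reply in delivered:
                # A receives B's report: both directions are now known.
                return heard_by_b, reply
    return None


# No pair works in both directions, yet exploration still succeeds.
result = explore(["A1", "A2"], ["B1", "B2"],
                 delivered={("A2", "B1"), ("B2", "A1")})
```

   In this run a request-response test on any single pair would fail,
   because the reply on the reversed pair is never delivered; the
   two-sided process finds one pair per direction instead.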

   Specifically, when A decides that it needs to explore for an
   alternative address pair to B, it will initiate a set of Probe
   messages, in sequence, until it gets a Probe message from B
   indicating that (a) B has received one of A's messages and,
   obviously, (b) that B's Probe message gets back to A.  B uses the same
   algorithm, but starts the process from the reception of the first
   Probe message from A.

   Upon changing to a new address pair, the network path traversed most
   likely has changed, so that the ULP SHOULD be informed.  This can be
   a signal for the ULP to adapt due to the change in path so that, for
   example, TCP could initiate a slow start procedure, although it's
   likely that the circumstances that led to the selection of a new path
   already caused enough packet loss to trigger slow start.

   REAP is designed to support failure recovery even in the case of
   having only unidirectionally operational address pairs.  However, due
   to security concerns discussed in Section 8, the exploration process
   can typically be run only for a session that has already been
   established.  Specifically, while REAP would in theory be capable of
   exploration even during connection establishment, its use within the
   SHIM6 protocol does not allow this.

4.3.  Exploration Order

   The exploration process assumes an ability to choose address pairs
   for testing.  An overview of the choosing process used by REAP is as
   follows:

   o  As an input to start the process, the node has knowledge of its
      own addresses and has been told via SHIM6 protocol messages what
      the addresses of the peer are.  A list of possible pairs of
      addresses can be constructed by combining the two pieces of
      information.

   o  By employing standard IPv6 address selection rules, the list is
      pruned by removing combinations that are inappropriate, such as
      attempting to use a link local address when contacting a peer that
      uses a global unicast address.

   o  Similarly, standard IPv6 address selection rules provide a basic
      priority order for the pairs.

   o  Local preferences may be applied for some additional tuning of the
      order in the list.  The mechanisms for local preference settings
      are not specified, but can involve, for instance, configuration
      that sets the preference for using one interface over another.

   o  As a result, the node has a prioritized list of address pairs to
      try.  However, the list may still be long, as there may be a
      combinatorial explosion when there are many addresses on both
      sides.  REAP employs these pairs sequentially, however, and uses a
      back-off procedure to avoid a "signaling storm".  This ensures
      that the exploration process is relatively conservative or "safe".
      The tradeoff is that finding a working path may take time if there
      are many addresses on both sides.

   In more detail, the process is as follows.  Nodes first consult the
   RFC 3484 default address selection rules [RFC3484] to determine what
   combinations of addresses are allowed from a local point of view, as
   this reduces the search space.  RFC 3484 also provides a priority
   ordering among different address pairs, making the search possibly
   faster.  (Additional mechanisms may be defined in the future for
   arriving at an initial ordering of address pairs before testing
   starts [I-D.ietf-shim6-locator-pair-selection].)  Nodes may also use
   local information, such as known quality of service parameters or
   interface types to determine what addresses are preferred over
   others, and try pairs containing such addresses first.  The SHIM6
   protocol also carries preference information in its messages.
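
   The list construction described above can be sketched in Python as
   follows.  This is a minimal illustration, not an implementation of
   the RFC 3484 rules: the only pruning rule shown is the scope
   mismatch mentioned in the text, and the function name
   candidate_pairs and its optional preference callback are
   hypothetical stand-ins for address selection and local policy.

```python
import ipaddress
from itertools import product


def candidate_pairs(local, peer, preference=None):
    """Build an ordered list of (source, destination) address pairs.

    local and peer are lists of IPv6 address strings.  preference is an
    optional callable mapping a pair to a sort key (lower is tried first);
    it stands in for address selection ordering and local preferences.
    """
    pairs = []
    for src, dst in product(local, peer):
        s = ipaddress.IPv6Address(src)
        d = ipaddress.IPv6Address(dst)
        # Prune inappropriate combinations, e.g. a link-local source
        # towards a peer address that is not link-local.
        if s.is_link_local != d.is_link_local:
            continue
        pairs.append((src, dst))
    if preference is not None:
        # sorted() is stable, so pairs with equal keys keep their order.
        pairs = sorted(pairs, key=preference)
    return pairs


local = ["2001:db8:1::1", "fe80::1"]
peer = ["2001:db8:2::1", "2001:db8:3::1"]
plain = candidate_pairs(local, peer)
# Hypothetical local preference: try the 2001:db8:3::/48 peer first.
tuned = candidate_pairs(
    local, peer,
    preference=lambda p: 0 if p[1].startswith("2001:db8:3:") else 1)
```

   With two addresses on each side and no pruning, the list has four
   entries per direction, which matches the nA * nB * 2 count given in
   Section 4.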

   Out of the set of possible candidate address pairs, nodes SHOULD
   attempt to test through all of them until an operational pair is
   found, retrying the process as necessary.  However, all nodes
   MUST perform this process sequentially and with exponential back-off.
   This sequential process is necessary in order to avoid a "signaling
   storm" when an outage occurs (particularly for a complete site).
   However, it also limits the number of addresses that can in practice
   be used for multihoming, considering that transport and application
   layer protocols will fail if the switch to a new address pair takes
   too long.

   Section 7 suggests default values for the timers associated with the
   exploration process.  The value Initial Probe Timeout (0.5 seconds)
   specifies the interval between initial attempts to send probes;
   Number of Initial Probes (4) specifies how many initial probes can be
   sent before the exponential backoff procedure needs to be employed.
   This process increases the time between every probe if there is no
   response.  Typically, each increase doubles the time but this
   specification does not mandate a particular increase.

      Note: The rationale for sending four packets at a fixed rate
      before the exponential backoff is employed is to avoid having to
      send these packets excessively fast.  Without this, having 0.5
      seconds between the third and fourth probe means that the time
      between the first and second probe would have to be 0.125 seconds,
      which gives very little time for a reply to the first packet to
      arrive.  Also, this means that the first four packets are sent
      within 0.875 seconds rather than 2 seconds, increasing the
      potential for congestion if a large number of shim contexts need
      to send probes at the same time after a failure.

   Finally, Max Probe Timeout (60 seconds) specifies a limit beyond
   which the probe interval may not grow.
If the exploration process 575 reaches this interval, it will continue sending at this rate until a 576 suitable response is triggered or the SHIM6 context is garbage 577 collected, because upper layer protocols using the SHIM6 context in 578 question are no longer attempting to send packets. Reaching the Max 579 Probe Timeout may also serve as a hint to the garbage collection 580 process that the context is no longer usable. 582 5. Protocol Definition 584 5.1. Keepalive Message 586 The format of the keepalive message is as follows: 588 0 1 2 3 589 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 590 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 591 | Next Header | Hdr Ext Len |0| Type = 66 | Reserved1 |0| 592 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 593 | Checksum |R| | 594 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 595 | Receiver Context Tag | 596 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 597 | Reserved2 | 598 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 599 | | 600 + Options + 601 | | 602 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 604 Next Header, Hdr Ext Len, 0, 0, Checksum 606 These are as specified in Section 5.3 of the SHIM6 protocol 607 description [I-D.ietf-shim6-proto]. 609 Type 611 This field identifies the Keepalive message and MUST be set to 66 612 (Keepalive). 614 Reserved1 616 This is a 7-bit field reserved for future use. It is set to zero 617 on transmit, and MUST be ignored on receipt. 619 R 621 This is a 1-bit field reserved for future use. It is set to zero 622 on transmit, and MUST be ignored on receipt. 624 Receiver Context Tag 626 This is a 47-bit field for the Context Tag the receiver has 627 allocated for the context. 629 Reserved2 631 This is a 32-bit field reserved for future use. It is set to zero 632 on transmit, and MUST be ignored on receipt. 
634 Options 636 This MAY contain one or more SHIM6 options. The inclusion of 637 such options is not necessary, however, as there are currently 638 no defined options that are useful in a Keepalive message. These 639 options are provided only for future extensibility reasons. 641 A valid message conforms to the format above, has a Receiver Context 642 Tag that matches a context known by the receiver, is a valid shim 643 control message as defined in Section 12.2 of the SHIM6 protocol 644 description [I-D.ietf-shim6-proto], and its shim context state is 645 ESTABLISHED. The receiver processes a valid message by inspecting 646 its options, and executing any actions specified for such options. 648 The processing rules for this message are given in more detail in 649 Section 6. 651 5.2. Probe Message 653 This message performs REAP exploration. Its format is as follows: 655 0 1 2 3 656 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 657 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 658 | Next Header | Hdr Ext Len |0| Type = 67 | Reserved |0| 659 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 660 | Checksum |R| | 661 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 662 | Receiver Context Tag | 663 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 664 | Precvd| Psent |Sta| Reserved2 | 665 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 666 | | 667 + First probe sent + 668 | | 669 + Source address + 670 | | 671 + + 672 | | 673 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 674 | | 675 + First probe sent + 676 | | 677 + Destination address + 678 | | 679 + + 680 | | 681 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 682 | First probe nonce | 683 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 684 | First probe data | 685 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 686 / / 687 / Nth probe sent / 688 | | 689 +
Source address + 690 | | 691 + + 692 | | 693 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 694 | | 695 + Nth probe sent + 696 | | 697 + Destination address + 698 | | 699 + + 700 | | 701 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 702 | Nth probe nonce | 703 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 704 | Nth probe data | 705 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 706 | | 707 + First probe received + 708 | | 709 + Source address + 710 | | 711 + + 712 | | 713 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 714 | | 715 + First probe received + 716 | | 717 + Destination address + 718 | | 719 + + 720 | | 721 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 722 | First probe nonce | 723 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 724 | First probe data | 725 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 726 | | 727 + Nth probe received + 728 | | 729 + Source address + 730 | | 731 + + 732 | | 733 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 734 | | 735 + Nth probe received + 736 | | 737 + Destination address + 738 | | 739 + + 740 | | 741 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 742 | Nth probe nonce | 743 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 744 | Nth probe data | 745 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 746 | | 747 + Options + 748 | | 749 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 750 | | 751 + Options + 752 | | 753 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 755 Next Header, Hdr Ext Len, 0, 0, Checksum 757 These are as specified in Section 5.3 of the SHIM6 protocol 758 description [I-D.ietf-shim6-proto]. 760 Type 762 This field identifies the Probe message and MUST be set to 67 763 (Probe). 765 Reserved 767 This is a 7-bit field reserved for future use. 
It is set to zero 768 on transmit, and MUST be ignored on receipt. 770 R 772 This is a 1-bit field reserved for future use. It is set to zero 773 on transmit, and MUST be ignored on receipt. 775 Receiver Context Tag 777 This is a 47-bit field for the Context Tag the receiver has 778 allocated for the context. 780 Psent 782 This is a 4-bit field that indicates the number of sent probes 783 included in this probe message. The first set of probe fields 784 pertains to the current message and MUST be present, so the 785 minimum value for this field is 1. Additional sent probe fields 786 are copies of the same fields sent in (recent) earlier probes and 787 may be included or omitted as per any logic employed by the 788 implementation. 790 Precvd 792 This is a 4-bit field that indicates the number of received probes 793 included in this probe message. Received probe fields are copies 794 of the same fields in earlier received probes that arrived since 795 the last transition to state Exploring. When a sender is in state 796 InboundOk it MUST include copies of the fields of at least one of 797 the inbound probes. A sender MAY include additional sets of these 798 received probe fields in any state as per any logic employed by 799 the implementation. 801 The fields probe source, probe destination, probe nonce and probe 802 data may be repeated, depending on the value of Psent and 803 Precvd. 805 Sta (State) 807 This 2-bit State field is used to inform the peer about the state 808 of the sender. It has three legal values: 810 0 (Operational) implies that the sender both (a) believes it has 811 no problem communicating and (b) believes that the recipient also 812 has no problem communicating. 814 1 (Exploring) implies that the sender has a problem communicating 815 with the recipient, e.g., it has not seen any traffic from the 816 recipient even when it expected some.
818 2 (InboundOk) implies that the sender believes it has no problem 819 communicating, i.e., it at least sees packets from the recipient, 820 but that the recipient either has a problem or has not yet 821 confirmed to the sender that the problem has been solved. 823 Reserved2 825 MUST be set to 0 upon transmission and MUST be ignored upon 826 reception. 828 Probe source 830 This 128-bit field contains the source IPv6 address used to send 831 the probe. 833 Probe destination 835 This 128-bit field contains the destination IPv6 address used to 836 send the probe. 838 Probe nonce 840 This is a 32-bit field that is initialized by the sender with a 841 value that allows it to determine which sent probes a received 842 probe correlates with. It is highly RECOMMENDED that the nonce 843 field is at least moderately hard to guess so that even on-path 844 attackers can't deduce the next nonce value that will be used. 845 This value SHOULD be generated using a random number generator 846 that is known to have good randomness properties as outlined in 847 RFC 4086 [RFC4086]. 849 Probe data 851 This is a 32-bit field with no fixed meaning. The probe data 852 field is copied back with no changes. Future flags may define a 853 use for this field. 855 Options 857 For future extensions. 859 5.3. Keepalive Timeout Option Format 861 Either side of a SHIM6 context can notify the peer of the value that 862 it would prefer the peer to use as its Keepalive Timeout value. If 863 the host is using a non-default Send Timeout value, it SHOULD 864 communicate this value as a Keepalive Timeout value to the peer in 865 the below option. This option MAY be sent in the I2, I2bis, R2, or 866 UPDATE messages. The option SHOULD only need to be sent once in a 867 given shim6 association. If a host receives this option it SHOULD 868 update its Keepalive Timeout value for the correspondent. 
870 0 1 2 3 871 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 872 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 873 | Type = 10 |0| Length = 4 | 874 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 875 | Reserved | Keepalive Timeout | 876 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 878 Fields: 880 Type 882 This field identifies the option and MUST be set to 10 (Keepalive 883 Timeout). 885 Length 887 This field MUST be set as specified in Section 5.14 of the SHIM6 888 protocol description [I-D.ietf-shim6-proto]. That is, it is set 889 to 4. 891 Reserved 893 16-bit field reserved for future use. Set to zero upon transmit 894 and MUST be ignored upon receipt. 896 Keepalive Timeout 898 Value in seconds corresponding to suggested Keepalive Timeout 899 value for the peer. 901 6. Behaviour 903 The required behaviour of REAP nodes is specified below in the form 904 of a state machine. The externally observable behaviour of an 905 implementation MUST conform to this state machine, but there is no 906 requirement that the implementation actually employs a state machine. 907 Intermixed with the following description we also provide a state 908 machine description in a tabular form. That form is only 909 informational, however. 911 On a given context with a given peer, the node can be in one of three 912 states: Operational, Exploring, or InboundOK. In the Operational 913 state the underlying address pairs are assumed to be operational. In 914 the Exploring state this node has observed a problem and has 915 currently not seen any traffic from the peer. Finally, in the 916 InboundOK state this node sees traffic from the peer, but the peer may 917 not yet see any traffic from this node so that the exploration 918 process needs to continue. 920 The node also maintains the Send timer (Send Timeout seconds) and 921 Keepalive timer (Keepalive Timeout seconds).
The Send timer reflects 922 the requirement that when this node sends a payload packet there 923 should be some return traffic (either payload packets or Keepalive 924 messages) within Send Timeout seconds. The Keepalive timer reflects 925 the requirement that when this node receives a payload packet there 926 should be a similar response towards the peer. The Keepalive timer is 927 only used within the Operational state, and the Send timer in the 928 Operational and InboundOK states. No timer is running in the 929 Exploring state. As explained in Section 4.1, the two timers are 930 mutually exclusive. That is, either the Keepalive timer is running 931 or the Send timer is running (or no timer is running). 933 Note that Appendix A gives some examples of typical protocol runs to 934 illustrate the behaviour. 936 6.1. Incoming payload packet 938 Upon the reception of a payload packet in the Operational state, the 939 node starts the Keepalive timer if it is not yet running, and stops 940 the Send timer if it was running. 942 If the node is in the Exploring state it transitions to the InboundOK 943 state, sends a Probe message, and starts the Send timer. It fills 944 the Psent and corresponding Probe source address, Probe destination 945 address, Probe nonce, and Probe data fields with information about 946 recent Probe messages that have not yet been reported as seen by the 947 peer. It also fills the Precvd and corresponding Probe source 948 address, Probe destination address, Probe nonce, and Probe data 949 fields with information about recent Probe messages it has seen from 950 the peer. When sending a Probe message, the State field MUST be set 951 to a value that matches the conceptual state of the sender after 952 sending the Probe. In this case the node therefore sets the Sta 953 field to 2 (InboundOk). The IP source and destination addresses 954 for sending the Probe message are selected as discussed in 955 Section 4.3.
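The way the Psent, Precvd, and Sta fields and the probe reports are filled can be sketched as follows. This is a minimal illustration that assumes the layout drawn in Section 5.2: a 32-bit word holding Precvd (4 bits), Psent (4 bits), Sta (2 bits), and Reserved2 (22 bits), followed by the sent and then the received probe reports of 40 octets each. The common SHIM6 header fields (Next Header through Receiver Context Tag) and the Options are left out, and all function names are hypothetical.

```python
import ipaddress
import secrets
import struct

STA_OPERATIONAL, STA_EXPLORING, STA_INBOUND_OK = 0, 1, 2


def new_nonce():
    """32-bit probe nonce drawn from the operating system's CSPRNG."""
    return secrets.randbits(32)


def pack_entry(src, dst, nonce, data=0):
    """One probe report: 128-bit source, 128-bit destination, nonce, data."""
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + struct.pack("!II", nonce, data))


def pack_probe_body(sent, received, sta):
    """Pack the Precvd/Psent/Sta word followed by all probe reports.

    sent and received are lists of (src, dst, nonce, data) tuples; the
    first sent entry describes the Probe being transmitted right now.
    """
    if not 1 <= len(sent) <= 15:
        raise ValueError("Psent is a 4-bit field with a minimum value of 1")
    if len(received) > 15:
        raise ValueError("Precvd is a 4-bit field")
    if sta not in (STA_OPERATIONAL, STA_EXPLORING, STA_INBOUND_OK):
        raise ValueError("Sta has three legal values: 0, 1, and 2")
    # Reserved2 occupies the low 22 bits and is sent as zero.
    word = (len(received) << 28) | (len(sent) << 24) | (sta << 22)
    body = struct.pack("!I", word)
    for entry in list(sent) + list(received):
        body += pack_entry(*entry)
    return body


if __name__ == "__main__":
    sent = [("2001:db8:1::1", "2001:db8:2::1", new_nonce(), 0)]
    rcvd = [("2001:db8:2::1", "2001:db8:1::1", new_nonce(), 0)]
    body = pack_probe_body(sent, rcvd, STA_INBOUND_OK)
    print(len(body), body[:4].hex())
```

Drawing the nonce from the secrets module is one way to follow the RFC 4086 recommendation; any generator with comparable properties would serve.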
957 In the InboundOK state the node stops the Send timer if it was 958 running, but does not do anything else. 960 The reception of SHIM6 control messages other than the Keepalive and 961 Probe messages is treated in the same way as the reception of payload packets. 963 While the Keepalive timer is running, the node SHOULD send Keepalive 964 messages to the peer with an interval of Keepalive Interval seconds. 965 Conceptually, a separate timer is used to distinguish between the 966 interval between Keepalive messages and the overall Keepalive Timeout 967 interval. However, this separate timer is not modelled in the 968 tabular or graphical state machines. When sent, the Keepalive 969 message is constructed as described in Section 5.1. It is sent using 970 the current address pair. 972 Operational Exploring InboundOk 973 ------------------------------------------------------------- 974 STOP Send; SEND Probe InboundOk; STOP Send 975 START Keepalive START Send; 976 GOTO InboundOk 978 6.2. Outgoing payload packet 980 Upon sending a payload packet in the Operational state, the node 981 stops the Keepalive timer if it was running and starts the Send timer 982 if it was not running. In the Exploring state there is no effect, 983 and in the InboundOK state the node simply starts the Send timer if 984 it was not yet running. (The sending of SHIM6 control messages is 985 again treated similarly here.) 987 Operational Exploring InboundOk 988 ----------------------------------------------------------- 989 START Send; - START Send 990 STOP Keepalive 992 6.3. Keepalive timeout 994 Upon a timeout on the Keepalive timer, the node sends one last 995 Keepalive message. This can only happen in the Operational state. 997 The Keepalive message is constructed as described in Section 5.1. It 998 is sent using the current address pair. 1000 Operational Exploring InboundOk 1001 ----------------------------------------------------------- 1002 SEND Keepalive - - 1004 6.4.
Send timeout 1006 Upon a timeout on the Send timer, the node enters the Exploring state 1007 and sends a Probe message. The Probe message is constructed as 1008 explained in Section 6.1, except that the Sta field is set to 1 1009 (Exploring). 1011 Operational Exploring InboundOk 1012 ----------------------------------------------------------- 1013 SEND Probe Exploring; - SEND Probe Exploring; 1014 GOTO Exploring GOTO Exploring 1016 6.5. Retransmission 1018 While in the Exploring state the node keeps retransmitting its Probe 1019 messages to different (or same) addresses as defined in Section 4.3. 1020 A similar process is employed in the InboundOk state, except that 1021 upon such retransmission the Send timer is started if it was not 1022 running already. 1024 The Probe messages are constructed as explained in Section 6.1, 1025 except that the Sta field is set to 1 (Exploring) or 2 (InboundOk), 1026 depending on which state the sender is in. 1028 Operational Exploring InboundOk 1029 ---------------------------------------------------------- 1030 - SEND Probe Exploring SEND Probe InboundOk 1031 START Send 1033 6.6. Reception of the Keepalive message 1035 Upon the reception of a Keepalive message in the Operational state, 1036 the node stops the Send timer, if it was running. If the node is in 1037 the Exploring state it transitions to the InboundOK state, sends a 1038 Probe message, and starts the Send timer. The Probe message is 1039 constructed as explained in Section 6.1. 1041 In the InboundOK state the Send timer is stopped, if it was running. 1043 Operational Exploring InboundOk 1044 ----------------------------------------------------------- 1045 STOP Send SEND Probe InboundOk; STOP Send 1046 START Send; 1047 GOTO InboundOk 1049 6.7. 
Reception of the Probe message State=Exploring 1051 Upon receiving a Probe with State set to Exploring, the node enters 1052 the InboundOK state, sends a Probe as described in Section 6.1, stops 1053 the Keepalive timer if it was running, and restarts the Send timer. 1055 Operational Exploring InboundOk 1056 ----------------------------------------------------------- 1057 SEND Probe InboundOk; SEND Probe InboundOk; SEND Probe 1058 STOP Keepalive; START Send; InboundOk; 1059 RESTART Send; GOTO InboundOk RESTART Send 1060 GOTO InboundOk 1062 6.8. Reception of the Probe message State=InboundOk 1064 Upon the reception of a Probe message with State set to InboundOk, 1065 the node sends a Probe message, restarts the Send timer, stops the 1066 Keepalive timer if it was running, and transitions to the Operational 1067 state. A new current address pair is chosen for the connection, based 1068 on the reports of received probes in the message just 1069 received. If no received probes have been reported, the current 1070 address pair is unchanged. 1072 The Probe message is constructed as explained in Section 6.1, except 1073 that the Sta field is set to 0 (Operational). 1075 Operational Exploring InboundOk 1076 ------------------------------------------------------------- 1077 SEND Probe Operational; SEND Probe Operational; SEND Probe 1078 RESTART Send; RESTART Send; Operational; 1079 STOP Keepalive GOTO Operational RESTART Send; 1080 GOTO Operational 1082 6.9. Reception of the Probe message State=Operational 1084 Upon the reception of a Probe message with State set to Operational, 1085 the node stops the Send timer if it was running, starts the Keepalive 1086 timer if it was not yet running, and transitions to the Operational 1087 state. The Probe message is constructed as explained in Section 6.1, 1088 except that the Sta field is set to 0 (Operational).
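The reactions to a received Probe in Sections 6.7 to 6.9 can be condensed into a small sketch. It follows the prose, in which the next state depends only on the Sta value carried in the received Probe; the per-state columns of the tables differ in minor timer details that are not modelled here. The action strings merely label what a real implementation would do, and the function name is hypothetical.

```python
from enum import Enum


class State(Enum):
    OPERATIONAL = 0
    EXPLORING = 1
    INBOUND_OK = 2


def on_probe_received(peer_sta):
    """Return (next_state, actions) for a Probe whose Sta field is peer_sta."""
    if peer_sta is State.EXPLORING:
        # The peer cannot hear us yet: report that its probe arrived.
        return State.INBOUND_OK, [
            "SEND Probe InboundOk", "STOP Keepalive", "RESTART Send"]
    if peer_sta is State.INBOUND_OK:
        # The peer hears us and we hear the peer: confirm and finish.
        return State.OPERATIONAL, [
            "SEND Probe Operational", "RESTART Send", "STOP Keepalive"]
    if peer_sta is State.OPERATIONAL:
        # Both sides are satisfied; no further Probe is sent.
        return State.OPERATIONAL, ["STOP Send", "START Keepalive"]
    raise ValueError("unknown Sta value")


if __name__ == "__main__":
    for sta in State:
        next_state, actions = on_probe_received(sta)
        print(sta.name, "->", next_state.name, actions)
```

Keeping the transition function free of side effects, as in this sketch, makes it straightforward to compare against the informational tables one row at a time.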
1090 Note: This terminates the exploration process when both parties 1091 are happy and know that their peer is happy as well. 1093 Operational Exploring InboundOk 1094 ----------------------------------------------------------- 1095 STOP Send STOP Send; STOP Send; 1096 START Keepalive START Keepalive START Keepalive 1097 GOTO Operational GOTO Operational 1099 The reachability detection and exploration process has no effect on 1100 payload communications until new operational address pairs have 1101 actually been confirmed. Prior to that the payload packets continue 1102 to be sent to the previously used addresses. 1104 6.10. Graphical Representation of the State Machine 1106 In the PDF version of this specification, an informational drawing 1107 illustrates the state machine. Where the text and the drawing 1108 differ, the text takes precedence. 1110 7. Protocol Constants 1112 The following protocol constants are defined: 1114 Send Timeout 15 seconds 1115 Keepalive Interval X seconds, where X is 1116 one third to one half of 1117 the Keepalive Timeout value 1118 (see Section 4.1) 1119 Initial Probe Timeout 0.5 seconds 1120 Number of Initial Probes 4 probes 1121 Max Probe Timeout 60 seconds 1123 Alternate values of the Send Timeout may be selected by a host and 1124 communicated to the peer in the Keepalive Timeout Option. A very 1125 small value of Send Timeout may affect the ability to exchange 1126 keepalives over a path that has a long roundtrip delay. Similarly, 1127 it may cause SHIM6 to react to temporary failures more often than 1128 necessary. As a result, it is RECOMMENDED that an alternate Send 1129 Timeout value not be under 10 seconds. Choosing a higher value than 1130 the one recommended above is also possible, but there is a 1131 relationship between Send Timeout and the ability of REAP to discover 1132 and correct errors in the communication path.
In any case, in order 1133 for SHIM6 to be useful, it should detect and repair communication 1134 problems well before upper layers give up. For this reason, it is 1135 RECOMMENDED that Send Timeout be at most 100 seconds (default TCP R2 1136 timeout [RFC1122]). 1138 Note that it is not expected that the Send Timeout or other values 1139 need to be estimated based on experienced roundtrip times. 1140 Signaling exchanges are performed based on exponential backoff. 1141 The keepalive processes send packets only in the relatively rare 1142 condition that all traffic is unidirectional. Finally, because 1143 Send Timeout is far greater than usual roundtrip times, it merely 1144 divides the traffic into periods that SHIM6 looks at to decide 1145 whether to act. 1147 8. Security Considerations 1149 Attackers may spoof various indications from lower layers and the 1150 network in an effort to confuse the peers about which addresses are 1151 or are not operational. For example, attackers may spoof ICMP error 1152 messages in an effort to cause the parties to move their traffic 1153 elsewhere or even to disconnect. Attackers may also spoof 1154 information related to network attachments, router discovery, and 1155 address assignments in an effort to make the parties believe they 1156 have Internet connectivity when in reality they do not. 1158 This may cause use of non-preferred addresses or even denial-of- 1159 service. 1161 This protocol does not provide any protection of its own for 1162 indications from other parts of the protocol stack. Unprotected 1163 indications SHOULD NOT be taken as a proof of connectivity problems. 1164 However, REAP has weak resistance against incorrect information even 1165 from unprotected indications in the sense that it performs its own 1166 tests prior to picking a new address pair. Denial-of-service 1167 vulnerabilities remain, however, as do vulnerabilities against on- 1168 path attackers.
1170 Some aspects of these vulnerabilities can be mitigated through the 1171 use of techniques specific to the other parts of the stack, such as 1172 properly dealing with ICMP errors [I-D.ietf-tcpm-icmp-attacks], link 1173 layer security, or the use of SEND [RFC3971] to protect IPv6 Router 1174 and Neighbor Discovery. 1176 Other parts of the SHIM6 protocol ensure that the set of addresses we 1177 are switching between actually belong together. REAP itself provides 1178 no such assurances. Similarly, REAP provides some protection against 1179 third party flooding attacks [AURA02]; when REAP is run its Probe 1180 nonces can be used as a return routability check that the claimed 1181 address is indeed willing to receive traffic. However, this needs to 1182 be complemented with another mechanism to ensure that the claimed 1183 address is also the correct host. SHIM6 does this by performing 1184 binding of all operations to context tags. 1186 The keepalive mechanism in this specification is vulnerable to 1187 spoofing. On-path attackers that can see a SHIM6 context tag can 1188 send spoofed Keepalive messages once per Send Timeout interval, to 1189 prevent two SHIM6 nodes from sending Keepalives themselves. This 1190 vulnerability is only relevant to nodes involved in a one-way 1191 communication. The result of the attack is that the nodes enter the 1192 exploration phase needlessly, but they should be able to confirm 1193 connectivity unless, of course, the attacker is able to prevent the 1194 exploration phase from completing. Off-path attackers may not be 1195 able to generate spoofed results, given that the context tags are 47- 1196 bit random numbers. 1198 To protect against spoofed keepalive packets, a host implementing 1199 both shim6 and IPsec MAY ignore incoming REAP keepalives if it has 1200 good reason to assume that the other side will be sending IPsec- 1201 protected return traffic.
I.e., if a host is sending TCP data, it 1202 can reasonably expect to receive TCP ACKs in return. If no IPsec- 1203 protected ACKs come back but unprotected keepalives do, this could be 1204 the result from an attacker trying to hide broken connectivity. 1214 The exploration phase is vulnerable to attackers that are on the 1215 path. Off-path attackers would find it hard to guess either the 1216 context tag or the correct probe identifiers. Given that IPsec 1217 operates above the shim layer, it is not possible to protect the 1218 exploration phase against on-path attackers. This is similar to the 1219 ability to protect other Shim6 control exchanges. There are 1220 mechanisms in place to prevent the redirection of communications to 1221 wrong addresses, but on-path attackers can cause denial-of-service, 1222 move communications to less-preferred address pairs, and so on. 1224 Finally, the exploration itself can cause a number of packets to be 1225 sent. As a result it may be used as a tool for packet amplification 1226 in flooding attacks. In order to prevent this it is required that 1227 the protocol employing REAP has built-in mechanisms to prevent this. 1228 For instance, in SHIM6 contexts are created only after a relatively 1229 large number of packets has been exchanged, a cost which reduces the 1230 attractiveness of using SHIM6 and REAP for amplification attacks. 1231 However, such protections are typically not present at connection 1232 establishment time.
When exploration would be needed for connection 1233 establishment to succeed, its usage would result in an amplification 1234 vulnerability. As a result, SHIM6 does not support the use of REAP 1235 in connection establishment stage. 1237 9. Operational Considerations 1239 When there are no failures, the failure detection mechanism (and 1240 SHIM6 in general) are light-weight: keepalives are not sent when a 1241 SHIM6 context is idle or when there is traffic in both directions. 1242 So in normal TCP or TCP-like operation, there would only be one or 1243 two keepalives when a session transitions from active to idle. 1245 Only when there are failures, there is significant failure detection 1246 traffic, and then especially in the case where a link goes down that 1247 is shared by many active sessions and by multiple hosts. When this 1248 happens, one keepalive is sent and then a series of probes. This 1249 happens per active (traffic generating) context, which will all 1250 timeout within 10 seconds after the failure. This makes the peak 1251 traffic that SHIM6 generates after a failure around one packet per 1252 second per context. Presumably, the sessions that run over those 1253 contexts were sending at least that much traffic and most likely 1254 more, but if the backup path is significantly lower bandwidth than 1255 the failed path, this could lead to temporary congestion. 1257 However, note that in the case of multihoming using BGP, if the 1258 failover is fast enough that TCP doesn't go into slow start, the 1259 full data traffic that flows over the failed path is switched over 1260 to the backup path, and if this backup path is of a lower 1261 capacity, there will be even more congestion in that case. 
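To make the per-context probe rate after a failure concrete, the pacing implied by the Section 7 constants can be sketched as follows. This is one reading of the text, not a mandated schedule: it assumes the first four probes are spaced by Initial Probe Timeout, that the gap doubles after each later probe (the specification notes that doubling is typical but does not require it), and that the gap is capped at Max Probe Timeout. The function names are hypothetical.

```python
def probe_gaps(count, initial=0.5, initial_probes=4,
               max_timeout=60.0, factor=2.0):
    """Return the waiting time in seconds after each of `count` probes."""
    gaps = []
    gap = initial
    for k in range(1, count + 1):
        # Once the initial probes have been sent, grow the gap, capped at
        # Max Probe Timeout.
        if k >= initial_probes:
            gap = min(gap * factor, max_timeout)
        gaps.append(gap)
    return gaps


def probe_send_times(count, **constants):
    """Return send times (seconds since the first probe) for `count` probes."""
    times = [0.0]
    for gap in probe_gaps(count - 1, **constants):
        times.append(times[-1] + gap)
    return times


if __name__ == "__main__":
    for number, t in enumerate(probe_send_times(12), start=1):
        print(f"probe {number:2d} at {t:6.1f} s")
```

Under these assumptions the gap reaches the 60-second cap after the tenth probe, which corresponds to the steady rate of one probe per context per minute.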
1263 Although the failure detection probing does not perform congestion 1264 control as such, the exponential backoff makes sure that the number 1265 of packets sent quickly goes down and eventually reaches one per 1266 context per minute, which should be sufficiently conservative even on 1267 the lowest bandwidth links. 1269 Section 7 specifies a number of protocol parameters. Possible tuning 1270 of these parameters and others that are not mandated in this 1271 specification may affect these properties. It is expected that 1272 further revisions of this specification will provide additional 1273 information after sufficient deployment experience has been obtained 1274 from different environments. 1276 Implementations may provide means to monitor their performance and 1277 send alarms about problems. Their standardization is, however, 1278 a subject for future specifications. In general, SHIM6 is most 1279 applicable for small sites and hosts, and it is expected that 1280 monitoring requirements on such deployments are relatively modest. 1281 In any case, where the host is associated with a management system, 1282 it is RECOMMENDED that detected failures and failover events are 1283 reported via asynchronous notifications to the management system. 1284 Similarly, where logging mechanisms are available on the host, these 1285 events should be recorded in event logs. 1287 SHIM6 uses the same header for both signaling and the encapsulation 1288 of data packets after a rehoming event. This way, fate is shared 1289 between the two types of packets, so the situation where reachability 1290 probes or keepalives can be transmitted successfully, but data 1291 packets cannot, is largely avoided: either all SHIM6 packets make it 1292 through, so SHIM6 functions as intended, or none do, and no SHIM6 1293 state is negotiated.
Even in the situation where some packets make 1294 it through and others do not, SHIM6 will generally either work as 1295 intended or provide a service that is no worse than in the absence of 1296 SHIM6, apart from the possible generation of a small amount of 1297 signaling traffic. 1299 Sometimes data packets and possibly data packets encapsulated in the 1300 SHIM6 header do not make it through, but signaling and keepalives do. 1301 This situation can occur when there is a path MTU discovery black 1302 hole on one of the paths. If only large packets are sent at some 1303 point, then reachability exploration will be turned on and REAP will 1304 likely select another path, which may or may not be affected by the 1305 PMTUD black hole. 1307 10. IANA Considerations 1309 No IANA actions are required. The number assignments necessary for 1310 the messages defined in this document appear together with all the 1311 other IANA assignments in the main SHIM6 specification 1312 [I-D.ietf-shim6-proto]. 1314 11. References 1316 11.1. Normative References 1318 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1319 Requirement Levels", BCP 14, RFC 2119, March 1997. 1321 [RFC3315] Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., 1322 and M. Carney, "Dynamic Host Configuration Protocol for 1323 IPv6 (DHCPv6)", RFC 3315, July 2003. 1325 [RFC3484] Draves, R., "Default Address Selection for Internet 1326 Protocol version 6 (IPv6)", RFC 3484, February 2003. 1328 [RFC4086] Eastlake, D., Schiller, J., and S. Crocker, "Randomness 1329 Requirements for Security", BCP 106, RFC 4086, June 2005. 1331 [RFC4193] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast 1332 Addresses", RFC 4193, October 2005. 1334 [RFC4429] Moore, N., "Optimistic Duplicate Address Detection (DAD) 1335 for IPv6", RFC 4429, April 2006. 1337 [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 1338 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 1339 September 2007.

1341     [RFC4862]  Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
1342                Address Autoconfiguration", RFC 4862, September 2007.

1344  11.2. Informative References

1346     [AURA02]   Aura, T., Roe, M., and J. Arkko, "Security of Internet
1347                Location Management", In Proceedings of the 18th Annual
1348                Computer Security Applications Conference, Las Vegas,
1349                Nevada, USA, December 2002.

1351     [I-D.bagnulo-shim6-addr-selection]
1352                Bagnulo, M., "Address selection in multihomed
1353                environments", draft-bagnulo-shim6-addr-selection-00 (work
1354                in progress), October 2005.

1356     [I-D.huitema-multi6-addr-selection]
1357                Huitema, C., "Address selection in multihomed
1358                environments", draft-huitema-multi6-addr-selection-00
1359                (work in progress), October 2004.

1361     [I-D.ietf-bfd-base]
1362                Katz, D. and D. Ward, "Bidirectional Forwarding
1363                Detection", draft-ietf-bfd-base-08 (work in progress),
1364                March 2008.

1366     [I-D.ietf-dna-cpl]
1367                Nordmark, E. and J. Choi, "DNA with unmodified routers:
1368                Prefix list based approach", draft-ietf-dna-cpl-02 (work
1369                in progress), January 2006.

1371     [I-D.ietf-dna-protocol]
1372                Narayanan, S., Kempf, J., Nordmark, E., Pentland, B.,
1373                Choi, J., Daley, G., and N. Montavont, "Detecting Network
1374                Attachment in IPv6 Networks (DNAv6)",
1375                draft-ietf-dna-protocol-07 (work in progress),
1376                February 2008.

1378     [I-D.ietf-hip-mm]
1379                Henderson, T., "End-Host Mobility and Multihoming with the
1380                Host Identity Protocol", draft-ietf-hip-mm-05 (work in
1381                progress), March 2007.

1383     [I-D.ietf-shim6-locator-pair-selection]
1384                Bagnulo, M., "Default Locator-pair selection algorithm for
1385                the SHIM6 protocol",
1386                draft-ietf-shim6-locator-pair-selection-03 (work in
1387                progress), February 2008.

1389     [I-D.ietf-shim6-proto]
1390                Nordmark, E. and M. Bagnulo, "Shim6: Level 3 Multihoming
1391                Shim Protocol for IPv6", draft-ietf-shim6-proto-10 (work
1392                in progress), February 2008.

1394     [I-D.ietf-shim6-reach-detect]
1395                Beijnum, I., "Shim6 Reachability Detection",
1396                draft-ietf-shim6-reach-detect-01 (work in progress),
1397                October 2005.

1399     [I-D.ietf-tcpm-icmp-attacks]
1400                Gont, F., "ICMP attacks against TCP",
1401                draft-ietf-tcpm-icmp-attacks-03 (work in progress),
1402                March 2008.

1404     [RFC1122]  Braden, R., "Requirements for Internet Hosts -
1405                Communication Layers", STD 3, RFC 1122, October 1989.

1407     [RFC3971]  Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure
1408                Neighbor Discovery (SEND)", RFC 3971, March 2005.

1410     [RFC4960]  Stewart, R., "Stream Control Transmission Protocol",
1411                RFC 4960, September 2007.

1413  Appendix A. Example Protocol Runs

1415     This appendix has examples of REAP protocol runs in typical
1416     scenarios. We start with the simplest scenario of two hosts, A and
1417     B, that have a SHIM6 connection with each other but are not currently
1418     sending any data. As neither side sends anything, they also do not
1419     expect anything back, so there are no messages at all:

1421     EXAMPLE 1: No communications

1423      Peer A                                        Peer B
1424        |                                             |
1425        |                                             |
1426        |                                             |
1427        |                                             |
1428        |                                             |
1429        |                                             |
1430        |                                             |
1431        |                                             |

1433     Our second example involves an active connection with bidirectional
1434     payload packet flows. Here the reception of data from the peer is
1435     taken as an indication of reachability, so again there are no extra
1436     packets:

1438     EXAMPLE 2: Bidirectional communications

1440      Peer A                                        Peer B
1441        |                                             |
1442        |              payload packet                 |
1443        |-------------------------------------------->|
1444        |                                             |
1445        |              payload packet                 |
1446        |<--------------------------------------------|
1447        |                                             |
1448        |              payload packet                 |
1449        |-------------------------------------------->|
1450        |                                             |
1451        |                                             |

1453     The third example is the first one that involves an actual REAP
1454     message. Here the hosts communicate in just one direction, so REAP
1455     messages are needed to indicate to the peer that sends payload
1456     packets that its packets are getting through:

1458     EXAMPLE 3: Unidirectional communications

1460      Peer A                                        Peer B
1461        |                                             |
1462        |              payload packet                 |
1463        |-------------------------------------------->|
1464        |                                             |
1465        |              payload packet                 |
1466        |-------------------------------------------->|
1467        |                                             |
1468        |              payload packet                 |
1469        |-------------------------------------------->|
1470        |                                             |
1471        |              Keepalive id=p                 |
1472        |<--------------------------------------------|
1473        |                                             |
1474        |              payload packet                 |
1475        |-------------------------------------------->|
1476        |                                             |
1477        |                                             |

1479     The next example involves a failure scenario. Here A has address A
1480     and B has addresses B1 and B2. The currently used address pairs are
1481     (A, B1) and (B1, A). All connections via B1 become broken, which
1482     leads to an exploration process:

1484     EXAMPLE 4: Failure scenario

1486      Peer A                                        Peer B
1487        |                                             |
1488     State:                                           | State:
1489     Operational                                      | Operational
1490        |            (A,B1) payload packet            |
1491        |-------------------------------------------->|
1492        |                                             |
1493        |            (B1,A) payload packet            |
1494        |<--------------------------------------------| At time T1
1495        |                                             | path A<->B1
1496        |            (A,B1) payload packet            | becomes
1497        |----------------------------------------/    | broken
1498        |                                             |
1499        |            (B1,A) payload packet            |
1500        |   /-----------------------------------------|
1501        |                                             |
1502        |            (A,B1) payload packet            |
1503        |----------------------------------------/    |
1504        |                                             |
1505        |            (B1,A) payload packet            |
1506        |   /-----------------------------------------|
1507        |                                             |
1508        |            (A,B1) payload packet            |
1509        |----------------------------------------/    |
1510        |                                             |
1511        |                                             | Send Timeout
1512        |                                             | seconds after
1513        |                                             | T1, B happens to
1514        |                                             | see the problem
1515        |            (B1,A) Probe id=p,               | first and sends a
1516        |             state=exploring                 | complaint that
1517        |   /-----------------------------------------| it is not
1518        |                                             | receiving anything
1519        |                                             | State:
1520        |                                             | Exploring
1521        |                                             |
1522        |            (B2,A) Probe id=q,               |
1523        |             state=exploring                 | But it's lost,
1524        |<--------------------------------------------| retransmission
1525        |                                             | uses another pair
1526   A realizes                                         |
1527   that it needs                                      |
1528   to start the                                       |
1529   exploration. It                                    |
1530   picks B2 as the                                    |
1531   most likely candidate,                             |
1532   as it appeared in the                              |
1533   Probe                                              |
1534   State: InboundOk                                   |
1535        |                                             |
1536        |            (A,B2) Probe id=r,               |
1537        |             state=inboundok,                |
1538        |             received probe q                | This one gets
1539        |-------------------------------------------->| through.
1540        |                                             | State:
1541        |                                             | Operational
1542        |                                             |
1543        |                                             |
1544        |            (B2,A) Probe id=s,               |
1545        |             state=operational,              | B now knows
1546        |             received probe r                | that A has no
1547        |<--------------------------------------------| problem receiving
1548        |                                             | its packets
1549   State: Operational                                 |
1550        |                                             |
1551        |            (A,B2) payload packet            |
1552        |-------------------------------------------->| Payload packets
1553        |                                             | flow again
1554        |            (B2,A) payload packet            |
1555        |<--------------------------------------------|

1557     The next example shows a failure that affects the current locator
1558     pair in one direction only. A has addresses A1 and A2, and B has
1559     addresses B1 and B2. The current communication is between A1 and B1,
1560     but A's packets no longer reach B using this pair.
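
The state changes traced in Example 4 above, and again in Example 5 below, can be summarized in a small transition table. The following Python sketch is illustrative only and is not part of this specification. It covers just the transitions visible in those two examples. The event names (send_timeout, retransmit_timeout, probe_exploring, probe_inboundok, probe_operational) and the helper names (TRANSITIONS, step, replay) are assumptions made for this sketch. Timers, probe identifiers, and locator pair selection are left out.

```python
from enum import Enum


class State(Enum):
    OPERATIONAL = "operational"
    EXPLORING = "exploring"
    INBOUND_OK = "inboundok"


# (current state, event) -> (next state, state carried in the Probe that is
# sent in response, or None if nothing is sent).
# Hypothetical event names; only transitions seen in Examples 4 and 5.
TRANSITIONS = {
    # Nothing received for Send Timeout seconds: start exploring.
    (State.OPERATIONAL, "send_timeout"): (State.EXPLORING, State.EXPLORING),
    # No answer to a Probe yet: retransmit, possibly on another pair.
    (State.EXPLORING, "retransmit_timeout"): (State.EXPLORING, State.EXPLORING),
    # The peer complains that it receives nothing, but we can hear it.
    (State.OPERATIONAL, "probe_exploring"): (State.INBOUND_OK, State.INBOUND_OK),
    (State.INBOUND_OK, "probe_exploring"): (State.INBOUND_OK, State.INBOUND_OK),
    # The peer reports that it received one of our Probes.
    (State.EXPLORING, "probe_inboundok"): (State.OPERATIONAL, State.OPERATIONAL),
    # The peer confirms that both directions work.
    (State.INBOUND_OK, "probe_operational"): (State.OPERATIONAL, None),
}


def step(state, event):
    """Return (next_state, probe_state_to_send) for a single event."""
    return TRANSITIONS[(state, event)]


def replay(start, events):
    """Apply events in order; return the final state and the Probe states sent."""
    state = start
    sent = []
    for event in events:
        state, probe = step(state, event)
        if probe is not None:
            sent.append(probe.value)
    return state, sent


# Peer B in Example 4: send timeout, one retransmission, then A's Probe arrives.
b_state, b_sent = replay(
    State.OPERATIONAL,
    ["send_timeout", "retransmit_timeout", "probe_inboundok"],
)

# Peer A in Example 4: B's Probe arrives, then B's confirmation.
a_state, a_sent = replay(
    State.OPERATIONAL,
    ["probe_exploring", "probe_operational"],
)
```

Replaying Example 4 for B returns it to the Operational state after it has sent two Probes with state=exploring and one with state=operational. A returns to Operational after sending a single Probe with state=inboundok.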

1562     EXAMPLE 5: One-way failure

1564      Peer A                                        Peer B
1565        |                                             |
1566     State:                                           | State:
1567     Operational                                      | Operational
1568        |                                             |
1569        |           (A1,B1) payload packet            |
1570        |-------------------------------------------->|
1571        |                                             |
1572        |           (B1,A1) payload packet            |
1573        |<--------------------------------------------|
1574        |                                             |
1575        |           (A1,B1) payload packet            | At time T1
1576        |----------------------------------------/    | path A1->B1
1577        |                                             | becomes
1578        |                                             | broken
1579        |           (B1,A1) payload packet            |
1580        |<--------------------------------------------|
1581        |                                             |
1582        |           (A1,B1) payload packet            |
1583        |----------------------------------------/    |
1584        |                                             |
1585        |           (B1,A1) payload packet            |
1586        |<--------------------------------------------|
1587        |                                             |
1588        |           (A1,B1) payload packet            |
1589        |----------------------------------------/    |
1590        |                                             |
1591        |                                             | Send Timeout
1592        |                                             | seconds after
1593        |                                             | T1, B notices
1594        |                                             | the problem and
1595        |            (B1,A1) Probe id=p,              | sends a
1596        |             state=exploring                 | complaint that
1597        |<--------------------------------------------| it is not
1598        |                                             | receiving anything
1599   A responds                                         | State: Exploring
1600   State: InboundOk                                   |
1601        |                                             |
1602        |            (A1,B1) Probe id=q,              |
1603        |             state=inboundok,                |
1604        |             received probe p                |
1605        |----------------------------------------/    | But A's response
1606        |                                             | is lost
1607        |            (B2,A2) Probe id=r,              |
1608        |             state=exploring                 | Next try different
1609        |<--------------------------------------------| locator pair
1610        |                                             |
1611        |            (A2,B2) Probe id=s,              |
1612        |             state=inboundok,                |
1613        |             received probes p, r            | This one gets
1614        |-------------------------------------------->| through
1615        |                                             | State: Operational
1616        |                                             |
1617        |                                             | B now knows
1618        |                                             | that A has no
1619        |            (B2,A2) Probe id=t,              | problem receiving
1620        |             state=operational,              | its packets, and
1621        |             received probe s                | that A's probe
1622        |<--------------------------------------------| gets to B. It
1623        |                                             | sends a
1624   State: Operational                                 | confirmation to A
1625        |                                             |
1626        |           (A2,B2) payload packet            |
1627        |-------------------------------------------->| Payload packets
1628        |                                             | flow again
1629        |           (B1,A1) payload packet            |
1630        |<--------------------------------------------|

1632  Appendix B. Contributors

1634     This draft attempts to summarize the thoughts and unpublished
1635     contributions of many people, including the MULTI6 WG design team
1636     members Marcelo Bagnulo Braun, Erik Nordmark, Geoff Huston, Kurtis
1637     Lindqvist, Margaret Wasserman, and Jukka Ylitalo, the MOBIKE WG
1638     contributors Pasi Eronen, Tero Kivinen, Francis Dupont, Spencer
1639     Dawkins, and James Kempf, and HIP WG contributors such as Pekka
1640     Nikander. This draft is also indebted to work done in the context
1641     of SCTP [RFC4960] and the HIP multihoming and mobility extension
1642     [I-D.ietf-hip-mm].

1644  Appendix C. Acknowledgements

1646     The authors would also like to thank Christian Huitema, Pekka Savola,
1647     John Loughney, Sam Xia, Hannes Tschofenig, Sebastian Barre, Thomas
1648     Henderson, Matthijs Mekking, Deguang Le, Eric Gray, Dan Romascanu,
1649     Stephen Kent, Alberto Garcia, Bernard Aboba, Lars Eggert, Dave Ward,
1650     and Tim Polk for interesting discussions in this problem space, and
1651     for review of this specification.

1653  Authors' Addresses

1655     Jari Arkko
1656     Ericsson
1657     Jorvas 02420
1658     Finland

1660     Email: jari.arkko@ericsson.com

1662     Iljitsch van Beijnum
1663     IMDEA Networks
1664     Avda. del Mar Mediterraneo, 22
1665     Leganes, Madrid 28918
1666     Spain

1668     Email: iljitsch@muada.com

1670  Full Copyright Statement

1672     Copyright (C) The IETF Trust (2008).

1674     This document is subject to the rights, licenses and restrictions
1675     contained in BCP 78, and except as set forth therein, the authors
1676     retain all their rights.

1678     This document and the information contained herein are provided on an
1679     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1680     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
1681     THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
1682     OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
1683     THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1684     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1686  Intellectual Property

1688     The IETF takes no position regarding the validity or scope of any
1689     Intellectual Property Rights or other rights that might be claimed to
1690     pertain to the implementation or use of the technology described in
1691     this document or the extent to which any license under such rights
1692     might or might not be available; nor does it represent that it has
1693     made any independent effort to identify any such rights. Information
1694     on the procedures with respect to rights in RFC documents can be
1695     found in BCP 78 and BCP 79.

1697     Copies of IPR disclosures made to the IETF Secretariat and any
1698     assurances of licenses to be made available, or the result of an
1699     attempt made to obtain a general license or permission for the use of
1700     such proprietary rights by implementers or users of this
1701     specification can be obtained from the IETF on-line IPR repository at
1702     http://www.ietf.org/ipr.

1704     The IETF invites any interested party to bring to its attention any
1705     copyrights, patents or patent applications, or other proprietary
1706     rights that may cover technology that may be required to implement
1707     this standard. Please address the information to the IETF at
1708     ietf-ipr@ietf.org.