Network Working Group                                           J. Arkko
Internet-Draft                                                  Ericsson
Intended status: Standards Track                          I. van Beijnum
Expires: August 16, 2008                                  IMDEA Networks
                                                       February 13, 2008


    Failure Detection and Locator Pair Exploration Protocol for IPv6
                              Multihoming
                 draft-ietf-shim6-failure-detection-11

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 16, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   This document specifies how the level 3 multihoming shim protocol
   (SHIM6) detects failures between two communicating hosts. It also
   specifies an exploration protocol for switching to another pair of
   interfaces and/or addresses between the same hosts if a failure
   occurs and an operational pair can be found.

Table of Contents

   1.  Introduction
   2.  Requirements language
   3.  Definitions
     3.1.  Available Addresses
     3.2.  Locally Operational Addresses
     3.3.  Operational Address Pairs
     3.4.  Primary Address Pair
     3.5.  Current Address Pair
   4.  Protocol Overview
     4.1.  Failure Detection
     4.2.  Full Reachability Exploration
     4.3.  Exploration Order
   5.  Protocol Definition
     5.1.  Keepalive Message
     5.2.  Probe Message
     5.3.  Keepalive Timeout Option Format
   6.  Behaviour
     6.1.  Incoming payload packet
     6.2.  Outgoing payload packet
     6.3.  Keepalive timeout
     6.4.  Send timeout
     6.5.  Retransmission
     6.6.  Reception of the Keepalive message
     6.7.  Reception of the Probe message State=Exploring
     6.8.  Reception of the Probe message State=InboundOk
     6.9.  Reception of the Probe message State=Operational
     6.10. Graphical Representation of the State Machine
   7.  Protocol Constants
   8.  Security Considerations
   9.  Operational Considerations
   10. IANA Considerations
   11. References
     11.1. Normative References
     11.2. Informative References
   Appendix A.  Example Protocol Runs
   Appendix B.  Contributors
   Appendix C.  Acknowledgements
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   The SHIM6 protocol [I-D.ietf-shim6-proto] extends IPv6 to support
   multihoming. It is an IP layer mechanism that hides multihoming from
   applications.
   A part of the SHIM6 solution involves detecting when a
   currently used pair of addresses (or interfaces) between two
   communicating hosts has failed, and picking another pair when this
   occurs. We call the former failure detection, and the latter locator
   pair exploration.

   This document specifies the mechanisms and protocol messages to
   achieve both failure detection and locator pair exploration. This
   part of the SHIM6 protocol is called the REAchability Protocol
   (REAP).

   Failure detection is made as lightweight as possible. Data traffic
   in both directions is observed, and in the case where there is no
   traffic because the communication is idle, failure detection is also
   idle and doesn't generate any packets. When data traffic is flowing
   in both directions, there is no need to send failure detection
   packets, either. Only when there is traffic in one direction does
   the failure detection mechanism generate keepalives in the other
   direction. As a result, whenever there is outgoing traffic and no
   incoming return traffic or keepalives, there must be a failure, at
   which point the locator pair exploration is performed to find a
   working address pair for each direction.

   The document is structured as follows: Section 3 defines a set of
   useful terms, Section 4 gives an overview of REAP, and Sections 5
   and 6 specify the message formats and behaviour in detail. Section 8
   discusses the security considerations of REAP.

   In this specification, we consider an address to be synonymous with a
   locator. Other parts of the SHIM6 protocol ensure that the different
   locators used by a node actually belong together. That is, REAP is
   not responsible for ensuring that it ends up with a legitimate
   locator.

2.  Requirements language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Definitions

   This section defines terms useful for discussing failure detection
   and locator pair exploration.

3.1.  Available Addresses

   SHIM6 nodes need to be aware of what addresses they themselves have.
   If a node loses the address it is currently using for communications,
   another address must replace this address. And if a node loses an
   address that the node's peer knows about, the peer must be informed.
   Similarly, when a node acquires a new address it may generally wish
   the peer to know about it.

   Definition. Available address - an address is said to be available
   if all the following conditions are fulfilled:

   o  The address has been assigned to an interface of the node.

   o  The valid lifetime of the prefix (RFC 4861 [RFC4861] Section
      4.6.2) associated with the address has not expired.

   o  The address is not tentative in the sense of RFC 4862 [RFC4862].
      In other words, the address assignment is complete so that
      communications can be started.

      Note that this explicitly allows an address to be optimistic in
      the sense of Optimistic DAD [RFC4429] even though implementations
      may prefer using other addresses as long as there is an
      alternative.

   o  The address is a global unicast or unique local address [RFC4193].
      That is, it is not an IPv6 site-local or link-local address.

      With link-local addresses, the nodes would be unable to determine
      on which link the given address is usable.

   o  The address and interface are acceptable for use according to a
      local policy.

   Available addresses are discovered and monitored through mechanisms
   outside the scope of SHIM6.
   SHIM6 implementations MUST be able to
   employ information provided by IPv6 Neighbor Discovery [RFC4861],
   Address Autoconfiguration [RFC4862], and DHCP [RFC3315] (when DHCP is
   implemented). This information includes the availability of a new
   address and status changes of existing addresses (such as when an
   address becomes invalid).

3.2.  Locally Operational Addresses

   Two different granularity levels are needed for failure detection.
   The coarser granularity is for individual addresses:

   Definition. Locally Operational Address - an available address is
   said to be locally operational when its use is known to be possible
   locally: the interface is up, a default router (if needed) suitable
   for this address is known to be reachable, and no other local
   information points to the address being unusable.

   Locally operational addresses are discovered and monitored through
   mechanisms outside the SHIM6 protocol. SHIM6 implementations MUST be
   able to employ information provided by Neighbor Unreachability
   Detection [RFC4861]. Implementations MAY also employ additional,
   link layer specific mechanisms.

      Note 1: A part of the problem in ensuring that an address is
      operational is making sure that after a change in link layer
      connectivity we are still connected to the same IP subnet.
      Mechanisms such as DNA CPL [I-D.ietf-dna-cpl] or DNAv6
      [I-D.ietf-dna-protocol] can be used to ensure this.

      Note 2: In theory, it would also be possible for hosts to learn
      about routing failures for a particular selected source prefix, if
      only suitable protocols for this purpose existed. Some proposals
      in this space have been made, see, for instance
      [I-D.bagnulo-shim6-addr-selection] and
      [I-D.huitema-multi6-addr-selection], but none have been
      standardized to date.

3.3.  Operational Address Pairs

   The existence of locally operational addresses is not, however, a
   guarantee that communications can be established with the peer. A
   failure in the routing infrastructure can prevent packets from
   reaching their destination. For this reason we need the definition
   of a second level of granularity, for pairs of addresses:

   Definition. Bidirectionally operational address pair - a pair of
   locally operational addresses is said to be an operational address
   pair when bidirectional connectivity can be shown between the
   addresses. That is, a packet sent with one of the addresses in the
   source field and the other in the destination field reaches the
   destination, and vice versa.

   Unfortunately, there are scenarios where bidirectionally operational
   address pairs do not exist. For instance, ingress filtering or
   network failures may result in one address pair being operational in
   one direction while another one is operational from the other
   direction. The following definition captures this general situation:

   Definition. Unidirectionally operational address pair - a pair of
   locally operational addresses is said to be a unidirectionally
   operational address pair when packets sent with the first address as
   the source and the second address as the destination reach the
   destination.

   SHIM6 implementations MUST support the discovery of operational
   address pairs through the use of explicit reachability tests and
   Forced Bidirectional Detection (FBD), described later in this
   specification. In addition, implementations MAY employ additional
   mechanisms. Some ideas for such mechanisms are listed below, but not
   fully specified in this document:

   o  Positive feedback from upper layer protocols. For instance, TCP
      can indicate to the IP layer that it is making progress.
      This is similar to how IPv6 Neighbor Unreachability Detection can
      in some cases be avoided when upper layers provide information
      about bidirectional connectivity [RFC4861].

      In the case of unidirectional connectivity, the upper layer
      protocol responses come back using another address pair, but show
      that the messages sent using the first address pair have been
      received.

   o  Negative feedback from upper layer protocols. It is conceivable
      that upper layer protocols give an indication of a problem to the
      multihoming layer. For instance, TCP could indicate that there's
      either congestion or lack of connectivity in the path because it
      is not getting ACKs.

   o  ICMP error messages. Given the ease of spoofing ICMP messages,
      one should be careful not to trust these blindly, however. Our
      suggestion is to use ICMP error messages only as a hint to perform
      an explicit reachability test or move an address pair to a lower
      place in the list of address pairs to be probed, but not as a
      reason to disrupt ongoing communications without other indications
      of problems. The situation may be different when certain
      verifications of the ICMP messages are being performed, as
      explained by Gont in [I-D.ietf-tcpm-icmp-attacks]. These
      verifications can ensure that (practically) only on-path attackers
      can spoof the messages.

3.4.  Primary Address Pair

   The primary address pair consists of the addresses that upper layer
   protocols use in their interaction with the SHIM6 layer. Use of the
   primary address pair means that the communication is compatible with
   regular non-SHIM6 communication and no context ID needs to be
   present.

3.5.  Current Address Pair

   SHIM6 needs to avoid sending packets which belong to the same
   transport connection concurrently over multiple paths.
   This is because congestion control in commonly used transport
   protocols is based upon a notion of a single path. While routing
   can introduce path changes as well and transport protocols have
   means to deal with this, frequent changes will cause problems.
   Effective congestion control over multiple paths is considered a
   research topic at the time this specification is written. SHIM6
   does not attempt to employ multiple paths simultaneously.

      Note: SCTP and future multipath transport protocols are likely to
      require interaction with SHIM6, at least to ensure that they do
      not employ SHIM6 unexpectedly.

   For these reasons it is necessary to choose a particular pair of
   addresses as the current address pair which is used until problems
   occur, at least for the same session.

      It is theoretically possible to support multiple current address
      pairs for different transport sessions or SHIM6 contexts.
      However, this is not supported in this version of the SHIM6
      protocol.

   A current address pair need not be operational at all times. If
   there is no traffic to send, we may not know if the primary address
   pair is operational. Nevertheless, it makes sense to assume that the
   address pair that worked previously continues to be operational for
   new communications as well.

4.  Protocol Overview

   This section discusses the design of the reachability detection and
   full reachability exploration mechanisms, and gives an overview of
   the REAP protocol.

   Exploring the full set of communication options between two hosts
   that both have two or more addresses is an expensive operation as the
   number of combinations to be explored increases very quickly with the
   number of addresses. For instance, with two addresses on both sides,
   there are four possible address pairs.
   Since we can't assume that reachability in one direction
   automatically means reachability for the complement pair in the
   other direction, the total number of two-way combinations is eight.
   (Combinations = nA * nB * 2.)

   An important observation in multihoming is that failures are
   relatively infrequent, so that an operational pair that worked a few
   seconds ago is very likely to be still operational. So it makes
   sense to have a lightweight protocol that confirms existing
   reachability, and only invoke heavier exploration when there is a
   suspected failure.

4.1.  Failure Detection

   Failure detection consists of three parts: tracking local
   information, tracking remote peer status, and finally verifying
   reachability. Tracking local information consists of using, for
   instance, reachability information about the local router as an
   input. Nodes SHOULD employ techniques listed in Section 3.1 and
   Section 3.2 to track the local situation. It is also necessary to
   track remote address information from the peer. For instance, if the
   peer's currently used address is no longer in use, a mechanism to
   relay that information is needed. The Update Request message in the
   SHIM6 protocol is used for this purpose [I-D.ietf-shim6-proto].
   Finally, when the local and remote information indicates that
   communication should be possible and there are upper layer packets to
   be sent, reachability verification is necessary to ensure that the
   peers actually have an operational address pair.

   A technique called Forced Bidirectional Detection (FBD, originally
   defined in an earlier SHIM6 document [I-D.ietf-shim6-reach-detect])
   is employed for the reachability verification. Reachability for the
   currently used address pair in a SHIM6 context is determined by
   making sure that whenever there is data traffic in one direction,
   there is also traffic in the other direction.
   This can be data traffic as well, but also transport layer
   acknowledgments or a REAP reachability keepalive if there is no
   other traffic. This way, it is no longer possible to have traffic in
   only one direction; whenever there is data traffic going out but
   there are no return packets, there must be a failure, and the full
   exploration mechanism is started.

   A more detailed description of the current pair reachability
   evaluation mechanism:

   1.  To prevent the other side from concluding there is a
       reachability failure, it's necessary for a host implementing the
       failure detection mechanism to generate periodic keepalives when
       there is no other traffic.

       FBD works by generating REAP keepalives if the node is receiving
       packets from its peer but not sending any of its own. The
       keepalives are sent at certain intervals so that the other side
       knows there is a reachability problem when it doesn't receive any
       incoming packets for its Send Timeout period. The host
       communicates its Send Timeout value to the peer as a Keepalive
       Timeout Option (Section 5.3) in the I2, I2bis, R2, or UPDATE
       messages. The peer then maps this value to its Keepalive Timeout
       value.

       The interval after which keepalives are sent is named Keepalive
       Interval. The RECOMMENDED approach is sending keepalives at
       one-half to one-third of the Keepalive Timeout interval, so that
       multiple keepalives are generated and have time to reach the
       correspondent before it times out.

   2.  Whenever outgoing data packets are generated, a timer is started
       to reflect the requirement that the peer should generate return
       traffic in response to data packets. The timeout value is set to
       the value of Send Timeout.

       For the purposes of this specification, "data packet" refers to
       any packet that is part of a SHIM6 context, including both upper
       layer protocol packets and SHIM6 protocol messages except those
       defined in this specification.

   3.  Whenever incoming data packets are received, the timer associated
       with the return traffic from the peer is stopped, and another
       timer is started to reflect the requirement for this node to
       generate return traffic. This timeout value is set to the value
       of Keepalive Timeout.

       These two timers are mutually exclusive. In other words, either
       the node is expecting to see traffic from the peer based on the
       traffic that the node sent earlier or the node is expecting to
       respond to the peer based on the traffic that the peer sent
       earlier (or the node is in an idle state).

   4.  The reception of a REAP keepalive packet leads to stopping the
       timer associated with the return traffic from the peer.

   5.  Keepalive Interval seconds after the last data packet has been
       received for a context, and if no other packet has been sent
       within this context since the data packet has been received, a
       REAP keepalive packet is generated for the context in question
       and transmitted to the correspondent. A host may send the
       keepalive sooner than Keepalive Interval seconds if
       implementation considerations warrant this, but should take care
       to avoid sending keepalives at an excessive rate. REAP keepalive
       packets SHOULD continue to be sent at the Keepalive Interval
       until either a data packet in the SHIM6 context has been received
       from the peer or the Keepalive Timeout expires. Keepalives are
       not sent at all if data was sent within the Keepalive Interval.

   6.  Send Timeout seconds after the transmission of a data packet with
       no return traffic on this context, a full reachability
       exploration is started.

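   The six steps above can be sketched as a small per-context timer
   object. This is an illustration only, not part of the protocol
   definition: the class name FbdTimers, its method names, the string
   results of poll(), and the numeric defaults are all hypothetical.
   The sketch also assumes that a running send timer is not restarted
   by further outgoing packets, which the overview above does not
   state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FbdTimers:
    """Illustrative per-context timer state for FBD (not normative)."""

    send_timeout: float = 10.0       # placeholder value, in seconds
    keepalive_timeout: float = 9.0   # placeholder; mapped from the peer's Send Timeout
    send_deadline: Optional[float] = None       # when to give up on return traffic
    keepalive_deadline: Optional[float] = None  # when to stop sending keepalives
    next_keepalive: Optional[float] = None      # when the next keepalive is due

    @property
    def keepalive_interval(self) -> float:
        # One-third of Keepalive Timeout, the low end of the recommended range.
        return self.keepalive_timeout / 3

    def on_data_sent(self, now: float) -> None:
        # Step 2: expect return traffic from the peer. The two timers are
        # mutually exclusive, so sending data cancels pending keepalives.
        self.keepalive_deadline = None
        self.next_keepalive = None
        if self.send_deadline is None:  # assumption: keep an already running timer
            self.send_deadline = now + self.send_timeout

    def on_data_received(self, now: float) -> None:
        # Step 3: stop the send timer and start the keepalive timer.
        self.send_deadline = None
        self.keepalive_deadline = now + self.keepalive_timeout
        self.next_keepalive = now + self.keepalive_interval

    def on_keepalive_received(self) -> None:
        # Step 4: a keepalive counts as return traffic.
        self.send_deadline = None

    def poll(self, now: float) -> str:
        """Return "explore", "keepalive", or "idle" for the current time."""
        # Step 6: no return traffic within Send Timeout.
        if self.send_deadline is not None and now >= self.send_deadline:
            self.send_deadline = None
            return "explore"
        # Step 5: send keepalives until Keepalive Timeout expires.
        if self.next_keepalive is not None:
            if now >= self.keepalive_deadline:
                self.keepalive_deadline = None
                self.next_keepalive = None
            elif now >= self.next_keepalive:
                self.next_keepalive += self.keepalive_interval
                return "keepalive"
        return "idle"
```

   A caller would invoke the on_* methods as packets are sent or
   received and call poll() periodically with a monotonic clock value;
   "explore" corresponds to starting the full reachability exploration
   described in Section 4.2.
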
   Section 7 provides some suggested defaults for these timeout values.
   Experience from the deployment of the SHIM6 protocol is needed in
   order to determine what values are most suitable.

4.2.  Full Reachability Exploration

   As explained in previous sections, the currently used address pair
   may become invalid either through one of the addresses becoming
   unavailable or nonoperational, or the pair itself being declared
   nonoperational. An exploration process attempts to find another
   operational pair so that communications can resume.

   What makes this process hard is the requirement to support
   unidirectionally operational address pairs. It is insufficient to
   probe address pairs by a simple request - response protocol.
   Instead, the party that first detects the problem starts a process
   where it tries each of the different address pairs in turn by sending
   a message to its peer. These messages carry information about the
   state of connectivity between the peers, such as whether the sender
   has seen any traffic from the peer recently. When the peer receives
   a message that indicates a problem, it assists the process by
   starting its own parallel exploration to the other direction, again
   sending information about the recently received payload traffic or
   signaling messages.

   Specifically, when A decides that it needs to explore for an
   alternative address pair to B, it will initiate a set of Probe
   messages, in sequence, until it gets a Probe message from B
   indicating that (a) B has received one of A's messages and,
   obviously, (b) that B's Probe message gets back to A. B uses the same
   algorithm, but starts the process from the reception of the first
   Probe message from A.

   Upon changing to a new address pair, the network path traversed most
   likely has changed, so the upper layer protocol (ULP) SHOULD be
   informed.
   This can be a signal for the ULP to adapt due to the change in path
   so that, for example, TCP could initiate a slow start procedure,
   although it's likely that the circumstances that led to the
   selection of a new path already caused enough packet loss to trigger
   slow start.

   REAP is designed to support failure recovery even in the case of
   having only unidirectionally operational address pairs. However, due
   to security concerns discussed in Section 8, the exploration process
   can typically be run only for a session that has already been
   established. Specifically, while REAP would in theory be capable of
   exploration even during connection establishment, its use within the
   SHIM6 protocol does not allow this.

4.3.  Exploration Order

   The exploration process assumes an ability to choose address pairs
   for testing, in some sequence. This process may result in a
   combinatorial explosion when there are many addresses on both sides,
   but a back-off procedure is employed to avoid a "signaling storm".

   Nodes first consult the RFC 3484 default address selection rules
   [RFC3484] to determine what combinations of addresses are allowed
   from a local point of view, as this reduces the search space. RFC
   3484 also provides a priority ordering among different address pairs,
   making the search possibly faster. (Additional mechanisms may be
   defined in the future for arriving at an initial ordering of address
   pairs before testing starts [I-D.ietf-shim6-locator-pair-selection].)
   Nodes may also use local information, such as known quality of
   service parameters or interface types to determine what addresses are
   preferred over others, and try pairs containing such addresses first.
   The SHIM6 protocol also carries preference information in its
   messages.

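   As a rough illustration of ordering candidate pairs by preference,
   the sketch below enumerates every (local, peer) combination and
   sorts it by a numeric score. It does not implement the RFC 3484
   rules: the function name candidate_pairs and the two preference
   dictionaries are hypothetical stand-ins for whatever policy, local
   information, and peer-supplied preferences an implementation has,
   and the addresses come from the IPv6 documentation prefix.

```python
from itertools import product


def candidate_pairs(local_addrs, peer_addrs, local_pref, peer_pref):
    """Return all (local, peer) address pairs, most preferred first.

    local_pref and peer_pref map an address to a number where larger
    means more preferred; addresses without an entry score 0. This
    scoring scheme is an assumption made for this sketch.
    """
    pairs = list(product(local_addrs, peer_addrs))
    # sorted() is stable, so equally scored pairs keep enumeration order.
    return sorted(
        pairs,
        key=lambda p: -(local_pref.get(p[0], 0) + peer_pref.get(p[1], 0)),
    )


local = ["2001:db8:a::1", "2001:db8:b::1"]
peer = ["2001:db8:c::1", "2001:db8:d::1"]
order = candidate_pairs(
    local,
    peer,
    local_pref={"2001:db8:b::1": 10},
    peer_pref={"2001:db8:d::1": 5},
)
for src, dst in order:
    print(src, "->", dst)
```

   With two addresses on each side the list holds four pairs; pairs
   containing the preferred addresses are placed first and would be
   probed first.
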
   Out of the set of possible candidate address pairs, nodes SHOULD
   attempt to test all of them until an operational pair is found,
   retrying the process as necessary. However, all nodes MUST perform
   this process sequentially and with exponential back-off. This
   sequential process is necessary in order to avoid a "signaling
   storm" when an outage occurs (particularly for a complete site).

   However, it also limits the number of addresses that can in practice
   be used for multihoming, considering that transport and application
   layer protocols will fail if the switch to a new address pair takes
   too long.

   Section 7 suggests default values for the timers associated with the
   exploration process. The value Initial Probe Timeout (0.5 seconds)
   specifies the interval between initial attempts to send probes;
   Number of Initial Probes (4) specifies how many initial probes can be
   sent before the exponential backoff procedure needs to be employed.
   This process increases the time between every probe if there is no
   response. Typically, each increase doubles the time but this
   specification does not mandate a particular increase.

      Note: The rationale for sending four packets at a fixed rate
      before the exponential backoff is employed is to avoid having to
      send these packets excessively fast. Without this, having 0.5
      seconds between the third and fourth probe means that the time
      between the first and second probe would have to be 0.125 seconds,
      which gives very little time for a reply to the first packet to
      arrive. Also, this means that the first four packets are sent
      within 0.875 seconds rather than 2 seconds, increasing the
      potential for congestion if a large number of shim contexts need
      to send probes at the same time after a failure.

   Finally, Max Probe Timeout (60 seconds) specifies a limit beyond
   which the probe interval may not grow.
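
   The interaction of the three constants can be sketched as a
   generator of waiting times. This is one possible reading, not the
   normative behaviour: it assumes that each of the first four probes
   is followed by a 0.5 second wait, that the wait then doubles after
   every further probe (doubling is typical but not mandated), and
   that it is capped at 60 seconds. The function and parameter names
   are hypothetical.

```python
from itertools import islice


def probe_intervals(initial_timeout=0.5, initial_probes=4,
                    max_timeout=60.0, factor=2.0):
    """Yield the wait, in seconds, after each successive probe."""
    interval = initial_timeout
    sent = 0
    while True:
        yield interval
        sent += 1
        # Fixed-rate phase is over once initial_probes have been sent;
        # from then on grow the interval, but never beyond max_timeout.
        if sent >= initial_probes:
            interval = min(interval * factor, max_timeout)


schedule = list(islice(probe_intervals(), 12))
print(schedule)
```

   Under these assumptions the first four waits total 2 seconds, and
   the interval reaches the 60 second cap after the eleventh probe.
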
   If the exploration process reaches the Max Probe Timeout interval,
   it will continue sending at this rate until a suitable response is
   triggered or the SHIM6 context is garbage collected, because upper
   layer protocols using the SHIM6 context in question are no longer
   attempting to send packets. Reaching the Max Probe Timeout may also
   serve as a hint to the garbage collection process that the context
   is no longer usable.

5.  Protocol Definition

5.1.  Keepalive Message

   The format of the keepalive message is as follows:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Next Header  |  Hdr Ext Len  |0|  Type = 66  |  Reserved1  |0|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           Checksum            |R|                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                             |
    |                    Receiver Context Tag                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                           Reserved2                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    +                            Options                            +
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Next Header, Hdr Ext Len, 0, 0, Checksum

      These are as specified in Section 5.3 of the SHIM6 protocol
      description [I-D.ietf-shim6-proto].

   Type

      This field identifies the Keepalive message and MUST be set to 66
      (Keepalive).

   Reserved1

      This is a 7-bit field reserved for future use. It is set to zero
      on transmit, and MUST be ignored on receipt.

   R

      This is a 1-bit field reserved for future use. It is set to zero
      on transmit, and MUST be ignored on receipt.

   Receiver Context Tag

      This is a 47-bit field for the Context Tag the receiver has
      allocated for the context.

   Reserved2

      This is a 32-bit field reserved for future use. It is set to zero
      on transmit, and MUST be ignored on receipt.

595 Options 597 This MAY contain one or more SHIM6 options. The inclusion of 598 such options is not necessary, however, as there are currently 599 no defined options that are useful in a Keepalive message. These 600 options are provided only for future extensibility reasons. 602 A valid message conforms to the format above, has a Receiver Context 603 Tag that matches a context known by the receiver, is a valid shim 604 control message as defined in Section 12.2 of the SHIM6 protocol 605 description [I-D.ietf-shim6-proto], and its shim context state is 606 ESTABLISHED. The receiver processes a valid message by inspecting 607 its options, and executing any actions specified for such options. 609 The processing rules for this message are given in more detail in 610 Section 6. 612 5.2. Probe Message 614 This message performs REAP exploration. Its format is as follows: 616 0 1 2 3 617 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 618 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 619 | Next Header | Hdr Ext Len |0| Type = 67 | Reserved |0| 620 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 621 | Checksum |R| | 622 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 623 | Receiver Context Tag | 624 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 625 | Precvd| Psent |Sta| Reserved2 | 626 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 627 | | 628 + First probe sent + 629 | | 630 + Source address + 631 | | 632 + + 633 | | 634 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 635 | | 636 + First probe sent + 637 | | 638 + Destination address + 639 | | 640 + + 641 | | 642 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 643 | First probe nonce | 644 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 645 | First probe data | 646 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 647 / / 648 / Nth probe sent / 649 | | 650 +
Source address + 651 | | 652 + + 653 | | 654 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 655 | | 656 + Nth probe sent + 657 | | 658 + Destination address + 659 | | 660 + + 661 | | 662 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 663 | Nth probe nonce | 664 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 665 | Nth probe data | 666 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 667 | | 668 + First probe received + 669 | | 670 + Source address + 671 | | 672 + + 673 | | 674 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 675 | | 676 + First probe received + 677 | | 678 + Destination address + 679 | | 680 + + 681 | | 682 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 683 | First probe nonce | 684 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 685 | First probe data | 686 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 687 | | 688 + Nth probe received + 689 | | 690 + Source address + 691 | | 692 + + 693 | | 694 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 695 | | 696 + Nth probe received + 697 | | 698 + Destination address + 699 | | 700 + + 701 | | 702 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 703 | Nth probe nonce | 704 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 705 | Nth probe data | 706 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 707 | | 708 + Options + 709 | | 710 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 711 | | 712 + Options + 713 | | 714 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 716 Next Header, Hdr Ext Len, 0, 0, Checksum 718 These are as specified in Section 5.3 of the SHIM6 protocol 719 description [I-D.ietf-shim6-proto]. 721 Type 723 This field identifies the Probe message and MUST be set to 67 724 (Probe). 726 Reserved 728 This is a 7-bit field reserved for future use. 
It is set to zero 729 on transmit, and MUST be ignored on receipt. 731 R 733 This is a 1-bit field reserved for future use. It is set to zero 734 on transmit, and MUST be ignored on receipt. 736 Receiver Context Tag 738 This is a 47-bit field for the Context Tag the receiver has 739 allocated for the context. 741 Psent 743 This is a 4-bit field that indicates the number of sent probes 744 included in this probe message. The first set of probe fields 745 pertains to the current message and MUST be present, so the 746 minimum value for this field is 1. Additional sent probe fields 747 are copies of the same fields sent in (recent) earlier probes and 748 may be included or omitted as per any logic employed by the 749 implementation. 751 Precvd 753 This is a 4-bit field that indicates the number of received probes 754 included in this probe message. Received probe fields are copies 755 of the same fields in earlier received probes that arrived since 756 the last transition from state Operational to state Exploring. 757 When a sender is in state InboundOk it MUST include copies of the 758 fields of at least one of the inbound probes. A sender MAY 759 include additional sets of these received probe fields in any 760 state as per any logic employed by the implementation. 762 The fields probe source, probe destination, probe nonce and probe 763 data may be repeated, depending on the values of Psent and 764 Precvd. 766 Sta (State) 768 This 2-bit State field is used to inform the peer about the state 769 of the sender. It has three legal values: 771 0 (Operational) implies that the sender both (a) believes it has 772 no problem communicating and (b) believes that the recipient also 773 has no problem communicating. 775 1 (Exploring) implies that the sender has a problem communicating 776 with the recipient, e.g., it has not seen any traffic from the 777 recipient even when it expected some.
779 2 (InboundOk) implies that the sender believes it has no problem 780 communicating, i.e., it at least sees packets from the recipient, 781 but that the recipient either has a problem or has not yet 782 confirmed to the sender that the problem has been solved. 784 Reserved2 786 MUST be set to 0 upon transmission and MUST be ignored upon 787 reception. 789 Probe source 791 This 128-bit field contains the source IPv6 address used to send 792 the probe. 794 Probe destination 796 This 128-bit field contains the destination IPv6 address used to 797 send the probe. 799 Probe nonce 801 This is a 32-bit field that is initialized by the sender with a 802 value that allows it to determine which sent probes a received 803 probe correlates with. It is highly RECOMMENDED that the nonce 804 field is at least moderately hard to guess so that even on-path 805 attackers can't deduce the next nonce value that will be used. 806 This value SHOULD be generated using a random number generator 807 that is known to have good randomness properties as outlined in 808 RFC 4086 [RFC4086]. 810 Probe data 812 This is a 32-bit field with no fixed meaning. The probe data 813 field is copied back with no changes. Future flags may define a 814 use for this field. 816 Options 818 For future extensions. 820 5.3. Keepalive Timeout Option Format 822 Either side of a SHIM6 context can notify the peer of the value that 823 it would prefer the peer to use as its Keepalive Timeout value. If 824 the host is using a non-default Send Timeout value, it SHOULD 825 communicate this value as a Keepalive Timeout value to the peer in 826 the below option. This option MAY be sent in the I2, I2bis, R2, or 827 UPDATE messages. The option SHOULD only need to be sent once in a 828 given shim6 association. If a host receives this option it SHOULD 829 update its Keepalive Timeout value for the correspondent. 
831 0 1 2 3 832 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 833 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 834 | Type = 10 |0| Length = 4 | 835 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 836 + Reserved | Keepalive Timeout | 837 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 839 Fields: 841 Type 843 This field identifies the option and MUST be set to 10 (Keepalive 844 Timeout). 846 Length 848 This field MUST be set as specified in Section 5.14 of the SHIM6 849 protocol description [I-D.ietf-shim6-proto]. That is, it is set 850 to 4. 852 Reserved 854 16-bit field reserved for future use. Set to zero upon transmit 855 and MUST be ignored upon receipt. 857 Keepalive Timeout 859 Value in seconds corresponding to suggested Keepalive Timeout 860 value for the peer. 862 6. Behaviour 864 The required behaviour of REAP nodes is specified below in the form 865 of a state machine. The externally observable behaviour of an 866 implementation MUST conform to this state machine, but there is no 867 requirement that the implementation actually employs a state machine. 868 Intermixed with the following description we also provide a state 869 machine description in a tabular form. That form is only 870 informational, however. 872 On a given context with a given peer, the node can be in one of three 873 states: Operational, Exploring, or InboundOK. In the Operational 874 state the underlying address pairs are assumed to be operational. In 875 the Exploring state this node has observed a problem and has 876 currently not seen any traffic from the peer. Finally, in the 877 InboundOK state this node sees traffic from the peer, but peer may 878 not yet see any traffic from this node so that the exploration 879 process needs to continue. 881 The node maintains also the Send timer (Send Timeout seconds) and 882 Keepalive timer (Keepalive Timeout seconds). 
The Send timer reflects 883 the requirement that when this node sends a payload packet there 884 should be some return traffic (either payload packets or Keepalive 885 messages) within Send Timeout seconds. The Keepalive timer reflects 886 the requirement that when this node receives a payload packet there 887 should be a similar response towards the peer. The Keepalive timer is 888 only used within the Operational state, and the Send timer in the 889 Operational and InboundOK states. No timer is running in the 890 Exploring state. As explained in Section 4.1, the two timers are 891 mutually exclusive. That is, either the Keepalive timer is running 892 or the Send timer is running (or no timer is running). 894 Note that Appendix A gives some examples of typical protocol runs to 895 illustrate the behaviour. 897 6.1. Incoming payload packet 899 Upon the reception of a payload packet in the Operational state, the 900 node starts the Keepalive timer if it is not yet running, and stops 901 the Send timer if it was running. 903 If the node is in the Exploring state it transitions to the InboundOK 904 state, sends a Probe message, and starts the Send timer. It fills 905 the Psent and corresponding Probe source address, Probe destination 906 address, Probe nonce, and Probe data fields with information about 907 recent Probe messages that have not yet been reported as seen by the 908 peer. It also fills the Precvd and corresponding Probe source 909 address, Probe destination address, Probe nonce, and Probe data 910 fields with information about recent Probe messages it has seen from 911 the peer. When sending a Probe message, the State field MUST be set 912 to a value that matches the conceptual state of the sender after 913 sending the Probe. In this case the node therefore sets the Sta 914 field to 2 (InboundOk). The IP source and destination addresses 915 for sending the Probe message are selected as discussed in 916 Section 4.3.
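The Probe construction described above can be sketched as follows. This is only an illustrative sketch, not part of the specification: it packs the Precvd/Psent/Sta word and the per-probe report fields using the widths given in Section 5.2. The function names (pack_report, pack_probe_body, new_nonce) are hypothetical, the Reserved2 width of 22 bits is an assumption derived from the 32-bit word, and the common SHIM6 header, checksum, and context tag are left out.

```python
import ipaddress
import secrets
import struct

# Sta values as listed in Section 5.2.
STA_OPERATIONAL, STA_EXPLORING, STA_INBOUND_OK = 0, 1, 2


def pack_report(src, dst, nonce, data=0):
    """Pack one probe report: 128-bit source, 128-bit destination,
    32-bit nonce and 32-bit data (40 bytes in total)."""
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + struct.pack("!II", nonce, data))


def pack_probe_body(sta, sent, received):
    """Pack the Precvd/Psent/Sta word, then sent reports, then received reports.

    `sent` and `received` are sequences of (src, dst, nonce, data) tuples.
    Assumption: Reserved2 occupies the remaining 22 bits and is sent as zero.
    """
    if sta not in (STA_OPERATIONAL, STA_EXPLORING, STA_INBOUND_OK):
        raise ValueError("Sta has three legal values: 0, 1, 2")
    if not 1 <= len(sent) <= 15:
        raise ValueError("Psent is a 4-bit field with a minimum value of 1")
    if len(received) > 15:
        raise ValueError("Precvd is a 4-bit field")
    if sta == STA_INBOUND_OK and not received:
        raise ValueError("an InboundOk sender must report a received probe")
    word = (len(received) << 28) | (len(sent) << 24) | (sta << 22)
    body = struct.pack("!I", word)
    for report in list(sent) + list(received):
        body += pack_report(*report)
    return body


def new_nonce():
    """Draw a 32-bit nonce from the operating system's secure generator."""
    return secrets.randbits(32)
```

The nonce comes from Python's secrets module as one way to follow the RFC 4086 recommendation; any generator with comparable properties would serve the same purpose.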
918 In the InboundOK state the node stops the Send timer if it was 919 running, but does not do anything else. 921 The reception of SHIM6 control messages other than the Keepalive and 922 Probe messages is treated similarly to that of payload packets. 924 While the Keepalive timer is running, the node SHOULD send Keepalive 925 messages to the peer with an interval of Keepalive Interval seconds. 926 Conceptually, a separate timer is used to distinguish between the 927 interval between Keepalive messages and the overall Keepalive Timeout 928 interval. However, this separate timer is not modelled in the 929 tabular or graphical state machines. When sent, the Keepalive 930 message is constructed as described in Section 5.1. It is sent using 931 the current address pair. 933 Operational Exploring InboundOk 934 ------------------------------------------------------------- 935 STOP Send; SEND Probe InboundOk; STOP Send 936 START Keepalive START Send; 937 GOTO InboundOk 939 6.2. Outgoing payload packet 941 Upon sending a payload packet in the Operational state, the node 942 stops the Keepalive timer if it was running and starts the Send timer 943 if it was not running. In the Exploring state there is no effect, 944 and in the InboundOK state the node simply starts the Send timer if 945 it was not yet running. (The sending of SHIM6 control messages is 946 again treated similarly here.) 948 Operational Exploring InboundOk 949 ----------------------------------------------------------- 950 START Send; - START Send 951 STOP Keepalive 953 6.3. Keepalive timeout 955 Upon a timeout on the Keepalive timer, the node sends one last 956 Keepalive message. This can only happen in the Operational state. 958 The Keepalive message is constructed as described in Section 5.1. It 959 is sent using the current address pair. 961 Operational Exploring InboundOk 962 ----------------------------------------------------------- 963 SEND Keepalive - - 965 6.4.
Send timeout 967 Upon a timeout on the Send timer, the node enters the Exploring state 968 and sends a Probe message. The Probe message is constructed as 969 explained in Section 6.1, except that the Sta field is set to 1 970 (Exploring). 972 Operational Exploring InboundOk 973 ----------------------------------------------------------- 974 SEND Probe Exploring; - SEND Probe Exploring; 975 GOTO Exploring GOTO Exploring 977 6.5. Retransmission 979 While in the Exploring state the node keeps retransmitting its Probe 980 messages to different (or same) addresses as defined in Section 4.3. 981 A similar process is employed in the InboundOk state, except that 982 upon such retransmission the Send timer is started if it was not 983 running already. 985 The Probe messages are constructed as explained in Section 6.1, 986 except that the Sta field is set to 1 (Exploring) or 2 (InboundOk), 987 depending on which state the sender is in. 989 Operational Exploring InboundOk 990 ---------------------------------------------------------- 991 - SEND Probe Exploring SEND Probe InboundOk 992 START Send 994 6.6. Reception of the Keepalive message 996 Upon the reception of a Keepalive message in the Operational state, 997 the node stops the Send timer, if it was running. If the node is in 998 the Exploring state it transitions to the InboundOK state, sends a 999 Probe message, and starts the Send timer. The Probe message is 1000 constructed as explained in Section 6.1. 1002 In the InboundOK state the Send timer is stopped, if it was running. 1004 Operational Exploring InboundOk 1005 ----------------------------------------------------------- 1006 STOP Send SEND Probe InboundOk; STOP Send 1007 START Send; 1008 GOTO InboundOk 1010 6.7. 
Reception of the Probe message State=Exploring 1012 Upon receiving a Probe with State set to Exploring, the node enters 1013 the InboundOK state, sends a Probe as described in Section 6.1, stops 1014 the Keepalive timer if it was running, and restarts the Send timer. 1016 Operational Exploring InboundOk 1017 ----------------------------------------------------------- 1018 SEND Probe InboundOk; SEND Probe InboundOk; SEND Probe 1019 STOP Keepalive; START Send; InboundOk; 1020 RESTART Send; GOTO InboundOk RESTART Send 1021 GOTO InboundOk 1023 6.8. Reception of the Probe message State=InboundOk 1025 Upon the reception of a Probe message with State set to InboundOk, 1026 the node sends a Probe message, restarts the Send timer, stops the 1027 Keepalive timer if it was running, and transitions to the Operational 1028 state. New current address pair is chosen for the connection, based 1029 on the reports of received probes in the message that we just 1030 received. If no received probes have been reported, the current 1031 address pair is unchanged. 1033 The Probe message is constructed as explained in Section 6.1, except 1034 that the Sta field is set to 0 (Operational). 1036 Operational Exploring InboundOk 1037 ------------------------------------------------------------- 1038 SEND Probe Operational; SEND Probe Operational; SEND Probe 1039 RESTART Send; RESTART Send; Operational; 1040 STOP Keepalive GOTO Operational RESTART Send; 1041 GOTO Operational 1043 6.9. Reception of the Probe message State=Operational 1045 Upon the reception of a Probe message with State set to Operational, 1046 the node stops the Send timer if it was running, starts the Keepalive 1047 timer if it was not yet running, and transitions to the Operational 1048 state. The Probe message is constructed as explained in Section 6.1, 1049 except that the Sta field is set to 0 (Operational). 
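The three Probe reception cases in Sections 6.7 to 6.9 can be sketched as a small transition function. This is a hedged illustration rather than a reference implementation: the Context class and its attribute names are hypothetical, timers are modelled as simple running/stopped flags, outgoing Probes are only logged by their Sta value, and the address pair selection of Section 6.8 is omitted.

```python
from dataclasses import dataclass, field

OPERATIONAL, EXPLORING, INBOUND_OK = "Operational", "Exploring", "InboundOk"


@dataclass
class Context:
    state: str = OPERATIONAL
    send_timer: bool = False        # True while the Send timer runs
    keepalive_timer: bool = False   # True while the Keepalive timer runs
    probes_sent: list = field(default_factory=list)  # Sta of each Probe sent

    def on_probe(self, peer_state):
        """Apply the rules for a received Probe carrying `peer_state`."""
        if peer_state == EXPLORING:        # Section 6.7
            self.probes_sent.append(INBOUND_OK)
            self.keepalive_timer = False
            self.send_timer = True         # (re)start the Send timer
            self.state = INBOUND_OK
        elif peer_state == INBOUND_OK:     # Section 6.8
            self.probes_sent.append(OPERATIONAL)
            self.keepalive_timer = False
            self.send_timer = True         # (re)start the Send timer
            self.state = OPERATIONAL
        elif peer_state == OPERATIONAL:    # Section 6.9
            self.send_timer = False
            self.keepalive_timer = True
            self.state = OPERATIONAL
        else:
            raise ValueError(f"unknown Sta value: {peer_state!r}")
```

Feeding each side's last logged Probe state to the other side reproduces the Exploring, InboundOk, Operational handshake: the exploring node's Probe moves its peer to InboundOk, the peer's reply moves the exploring node back to Operational, and the final Operational Probe does the same for the peer.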
1051 Note: This terminates the exploration process when both parties 1052 are happy and know that their peer is happy as well. 1054 Operational Exploring InboundOk 1055 ----------------------------------------------------------- 1056 STOP Send STOP Send; STOP Send; 1057 START Keepalive START Keepalive START Keepalive 1058 GOTO Operational GOTO Operational 1060 The reachability detection and exploration process has no effect on 1061 payload communications until new operational address pairs have 1062 actually been confirmed. Prior to that the payload packets continue 1063 to be sent to the previously used addresses. 1065 6.10. Graphical Representation of the State Machine 1067 In the PDF version of this specification, an informational drawing 1068 illustrates the state machine. Where the text and the drawing 1069 differ, the text takes precedence. 1071 7. Protocol Constants 1073 The following protocol constants are defined: 1075 Send Timeout 15 seconds 1076 Keepalive Interval X seconds, where X is 1077 one third to one half of 1078 the Keepalive Timeout value 1079 (see Section 4.1) 1080 Initial Probe Timeout 0.5 seconds 1081 Number of Initial Probes 4 probes 1082 Max Probe Timeout 60 seconds 1084 Alternate values of the Send Timeout may be selected by a host and 1085 communicated to the peer in the Keepalive Timeout Option. A very 1086 small value of Send Timeout may affect the ability to exchange 1087 keepalives over a path that has a long roundtrip delay. Similarly, 1088 it may cause SHIM6 to react to temporary failures more often than 1089 necessary. As a result, it is RECOMMENDED that an alternate Send 1090 Timeout value not be under 10 seconds. Choosing a higher value than 1091 the one recommended above is also possible, but there is a 1092 relationship between Send Timeout and the ability of REAP to discover 1093 and correct errors in the communication path.
In any case, in order 1094 for SHIM6 to be useful, it should detect and repair communication 1095 problems well before upper layers give up. For this reason, it is 1096 RECOMMENDED that Send Timeout be at most 100 seconds (default TCP R2 1097 timeout [RFC1122]). 1099 Note that it is not expected that the Send Timeout or other values 1100 need to be estimated based on experienced roundtrip times. 1101 Signaling exchanges are performed based on exponential backoff. 1102 The keepalive processes send packets only in the relatively 1103 rare condition that all traffic is unidirectional. Finally, 1104 because Send Timeout is far greater than usual roundtrip times, it 1105 merely divides the traffic into periods that SHIM6 looks at to 1106 decide whether to act. 1108 8. Security Considerations 1110 Attackers may spoof various indications from lower layers and the 1111 network in an effort to confuse the peers about which addresses are 1112 or are not operational. For example, attackers may spoof ICMP error 1113 messages in an effort to cause the parties to move their traffic 1114 elsewhere or even to disconnect. Attackers may also spoof 1115 information related to network attachments, router discovery, and 1116 address assignments in an effort to make the parties believe they 1117 have Internet connectivity when in reality they do not. 1119 This may cause use of non-preferred addresses or even denial-of- 1120 service. 1122 This protocol does not provide any protection of its own for 1123 indications from other parts of the protocol stack. Unprotected 1124 indications SHOULD NOT be taken as a proof of connectivity problems. 1125 However, REAP has weak resistance against incorrect information even 1126 from unprotected indications in the sense that it performs its own 1127 tests prior to picking a new address pair. Denial-of-service 1128 vulnerabilities remain, however, as do vulnerabilities against 1129 on-path attackers.
1131 Some aspects of these vulnerabilities can be mitigated through the 1132 use of techniques specific to the other parts of the stack, such as 1133 properly dealing with ICMP errors [I-D.ietf-tcpm-icmp-attacks], link 1134 layer security, or the use of SEND [RFC3971] to protect IPv6 Router 1135 and Neighbor Discovery. 1137 Other parts of the SHIM6 protocol ensure that the set of addresses we 1138 are switching between actually belong together. REAP itself provides 1139 no such assurances. Similarly, REAP provides some protection against 1140 third party flooding attacks [AURA02]; when REAP is run its Probe 1141 nonces can be used as a return routability check that the claimed 1142 address is indeed willing to receive traffic. However, this needs to 1143 be complemented with another mechanism to ensure that the claimed 1144 address is also the correct host. SHIM6 does this by binding 1145 all operations to context tags. 1147 The keepalive mechanism in this specification is vulnerable to 1148 spoofing. On-path attackers that can see a SHIM6 context tag can 1149 send spoofed Keepalive messages once per Send Timeout interval, to 1150 prevent two SHIM6 nodes from sending Keepalives themselves. This 1151 vulnerability is only relevant to nodes involved in a one-way 1152 communication. The result of the attack is that the nodes enter the 1153 exploration phase needlessly, but they should be able to confirm 1154 connectivity unless, of course, the attacker is able to prevent the 1155 exploration phase from completing. Off-path attackers may not be 1156 able to generate spoofed results, given that the context tags are 47- 1157 bit random numbers. 1159 To protect against spoofed keepalive packets, a host implementing 1160 both SHIM6 and IPsec MAY ignore incoming REAP keepalives if it has 1161 good reason to assume that the other side will be sending IPsec- 1162 protected return traffic.
For example, if a host is sending TCP data, it 1163 can reasonably expect to receive TCP ACKs in return. If no IPsec- 1164 protected ACKs come back but unprotected keepalives do, this could be 1165 the result of an attacker trying to hide broken connectivity. 1167 The exploration phase is vulnerable to attackers that are on the 1168 path. Off-path attackers would find it hard to guess either the 1169 context tag or the correct probe identifiers. Given that IPsec 1170 operates above the shim layer, it is not possible to protect the 1171 exploration phase against on-path attackers. This is similar to the 1172 ability to protect other SHIM6 control exchanges. There are 1173 mechanisms in place to prevent the redirection of communications to 1174 wrong addresses, but on-path attackers can cause denial-of-service, 1175 move communications to less-preferred address pairs, and so on. 1177 Finally, the exploration itself can cause a number of packets to be 1178 sent. As a result it may be used as a tool for packet amplification 1179 in flooding attacks. To prevent this, the protocol employing REAP 1180 is required to have built-in mechanisms against such amplification. 1181 For instance, in SHIM6, contexts are created only after a relatively 1182 large number of packets has been exchanged, a cost which reduces the 1183 attractiveness of using SHIM6 and REAP for amplification attacks. 1184 However, such protections are typically not present at connection 1185 establishment time. When exploration would be needed for connection 1186 establishment to succeed, its usage would result in an amplification 1187 vulnerability. As a result, SHIM6 does not support the use of REAP 1188 in the connection establishment stage. 1190 9. Operational Considerations 1192 When there are no failures, the failure detection mechanism (and 1193 SHIM6 in general) is lightweight: keepalives are not sent when a 1194 SHIM6 context is idle or when there is traffic in both directions.
1195 So in normal TCP or TCP-like operation, there would only be one or 1196 two keepalives when a session transitions from active to idle. 1198 Only when there are failures, there is significant failure detection 1199 traffic, and then especially in the case where a link goes down that 1200 is shared by many active sessions and by multiple hosts. When this 1201 happens, one keepalive is sent and then a series of probes. This 1202 happens per active (traffic generating) context, which will all 1203 timeout within 10 seconds after the failure. This makes the peak 1204 traffic that SHIM6 generates after a failure around one packet per 1205 second per context. Presumably, the sessions that run over those 1206 contexts were sending at least that much traffic and most likely 1207 more, but if the backup path is significantly lower bandwidth than 1208 the failed path, this could lead to temporary congestion. 1210 However, note that in the case of multihoming using BGP, if the 1211 failover is fast enough that TCP doesn't go into slow start, the 1212 full data traffic that flows over the failed path is switched over 1213 to the backup path, and if this backup path is of a lower 1214 capacity, there will be even more congestion in that case. 1216 Although the failure detection probing does not perform congestion 1217 control as such, the exponential backoff makes sure that the number 1218 of packets sent quickly goes down and eventually reaches one per 1219 context per minute, which should be sufficiently conservative even on 1220 the lowest bandwidth links. 1222 Section 7 specifies a number of protocol parameters. Possible tuning 1223 of these parameters and others that are not mandated in this 1224 specification may affect these properties. It is expected that 1225 further revisions of this specification provide additional 1226 information after sufficient deployment experience has been obtained 1227 from different environments. 
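The way the probe interval grows towards one probe per context per minute can be illustrated with the Section 7 constants. The sketch below is an assumption-laden example, not normative behaviour: it doubles the interval after the initial probes, which the specification describes as typical but does not mandate, and the generator name probe_timeouts is hypothetical.

```python
from itertools import islice

INITIAL_PROBE_TIMEOUT = 0.5    # seconds, Section 7
NUMBER_OF_INITIAL_PROBES = 4   # probes, Section 7
MAX_PROBE_TIMEOUT = 60.0       # seconds, Section 7


def probe_timeouts(factor=2.0):
    """Yield the wait that follows each successive unanswered probe.

    Assumption: the interval is multiplied by `factor` (2.0 here) once
    the initial probes have been sent, and is capped at Max Probe Timeout.
    """
    timeout = INITIAL_PROBE_TIMEOUT
    for _ in range(NUMBER_OF_INITIAL_PROBES):
        yield timeout
    while True:
        timeout = min(timeout * factor, MAX_PROBE_TIMEOUT)
        yield timeout


# Waits after the first twelve probes of one exploration run.
schedule = list(islice(probe_timeouts(), 12))
```

Under these assumptions the interval reaches the 60 second ceiling after the eleventh probe and stays there, which corresponds to the rate of one packet per context per minute mentioned above.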
1229 Implementations may provide means to monitor their performance and 1230 send alarms about problems. Their standardization is, however, 1231 the subject of future specifications. In general, SHIM6 is most 1232 applicable for small sites and hosts, and it is expected that 1233 monitoring requirements on such deployments are relatively modest. 1234 In any case, where the host is associated with a management system, 1235 it is RECOMMENDED that detected failures and failover events are 1236 reported via asynchronous notifications to the management system. 1237 Similarly, where logging mechanisms are available on the host, these 1238 events should be recorded in event logs. 1240 SHIM6 uses the same header for both signaling and the encapsulation 1241 of data packets after a rehoming event. This way, fate is shared 1242 between the two types of packets, so the situation where reachability 1243 probes or keepalives can be transmitted successfully, but data 1244 packets cannot, is largely avoided: either all SHIM6 packets make it 1245 through, so SHIM6 functions as intended, or none do, and no SHIM6 1246 state is negotiated. Even in the situation where some packets make 1247 it through and others do not, SHIM6 will generally either work as 1248 intended or provide a service that is no worse than in the absence of 1249 SHIM6, apart from the possible generation of a small amount of 1250 signaling traffic. 1252 Sometimes data packets (and possibly data packets encapsulated in the SHIM6 1253 header) do not make it through, but signaling and keepalives do. This 1254 situation can occur when there is a path MTU discovery black hole on 1255 one of the paths. If only large packets are sent at some point, then 1256 reachability exploration will be turned on and REAP will likely 1257 select another path, which may or may not be affected by the PMTUD 1258 black hole. 1260 10. IANA Considerations 1262 No IANA actions are required.
The number assignments necessary for 1263 the messages defined in this document appear together with all the 1264 other IANA assignments in the main SHIM6 specification 1265 [I-D.ietf-shim6-proto]. 1267 11. References 1269 11.1. Normative References 1271 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1272 Requirement Levels", BCP 14, RFC 2119, March 1997. 1274 [RFC3315] Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., 1275 and M. Carney, "Dynamic Host Configuration Protocol for 1276 IPv6 (DHCPv6)", RFC 3315, July 2003. 1278 [RFC3484] Draves, R., "Default Address Selection for Internet 1279 Protocol version 6 (IPv6)", RFC 3484, February 2003. 1281 [RFC4086] Eastlake, D., Schiller, J., and S. Crocker, "Randomness 1282 Requirements for Security", BCP 106, RFC 4086, June 2005. 1284 [RFC4193] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast 1285 Addresses", RFC 4193, October 2005. 1287 [RFC4429] Moore, N., "Optimistic Duplicate Address Detection (DAD) 1288 for IPv6", RFC 4429, April 2006. 1290 [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 1291 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 1292 September 2007. 1294 [RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless 1295 Address Autoconfiguration", RFC 4862, September 2007. 1297 11.2. Informative References 1299 [AURA02] Aura, T., Roe, M., and J. Arkko, "Security of Internet 1300 Location Management", In Proceedings of the 18th Annual 1301 Computer Security Applications Conference, Las Vegas, 1302 Nevada, USA., December 2002. 1304 [I-D.bagnulo-shim6-addr-selection] 1305 Bagnulo, M., "Address selection in multihomed 1306 environments", draft-bagnulo-shim6-addr-selection-00 (work 1307 in progress), October 2005. 1309 [I-D.huitema-multi6-addr-selection] 1310 Huitema, C., "Address selection in multihomed 1311 environments", draft-huitema-multi6-addr-selection-00 1312 (work in progress), October 2004. 1314 [I-D.ietf-dna-cpl] 1315 Nordmark, E. and J. 
Choi, "DNA with unmodified routers: 1316 Prefix list based approach", draft-ietf-dna-cpl-02 (work 1317 in progress), January 2006. 1319 [I-D.ietf-dna-protocol] 1320 Kempf, J., "Detecting Network Attachment in IPv6 Networks 1321 (DNAv6)", draft-ietf-dna-protocol-06 (work in progress), 1322 June 2007. 1324 [I-D.ietf-hip-mm] 1325 Henderson, T., "End-Host Mobility and Multihoming with the 1326 Host Identity Protocol", draft-ietf-hip-mm-05 (work in 1327 progress), March 2007. 1329 [I-D.ietf-shim6-locator-pair-selection] 1330 Bagnulo, M., "Default Locator-pair selection algorithm for 1331 the SHIM6 protocol", 1332 draft-ietf-shim6-locator-pair-selection-02 (work in 1333 progress), July 2007. 1335 [I-D.ietf-shim6-proto] 1336 Bagnulo, M. and E. Nordmark, "Shim6: Level 3 Multihoming 1337 Shim Protocol for IPv6", draft-ietf-shim6-proto-09 (work 1338 in progress), November 2007. 1340 [I-D.ietf-shim6-reach-detect] 1341 Beijnum, I., "Shim6 Reachability Detection", 1342 draft-ietf-shim6-reach-detect-01 (work in progress), 1343 October 2005. 1345 [I-D.ietf-tcpm-icmp-attacks] 1346 Gont, F., "ICMP attacks against TCP", 1347 draft-ietf-tcpm-icmp-attacks-02 (work in progress), 1348 May 2007. 1350 [RFC1122] Braden, R., "Requirements for Internet Hosts - 1351 Communication Layers", STD 3, RFC 1122, October 1989. 1353 [RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure 1354 Neighbor Discovery (SEND)", RFC 3971, March 2005. 1356 [RFC4960] Stewart, R., "Stream Control Transmission Protocol", 1357 RFC 4960, September 2007. 1359 Appendix A. Example Protocol Runs 1361 This appendix has examples of REAP protocol runs in typical 1362 scenarios. We start with the simplest scenario of two hosts, A and 1363 B, that have a SHIM6 connection with each other but are not currently 1364 sending any data. 
         As neither side sends anything, they also do not
1365     expect anything back, so there are no messages at all:

1367     EXAMPLE 1: No communications

1369        Peer A                                        Peer B
1370          |                                             |
1371          |                                             |
1372          |                                             |
1373          |                                             |
1374          |                                             |
1375          |                                             |
1376          |                                             |
1377          |                                             |

1379     Our second example involves an active connection with bidirectional
1380     payload packet flows.  Here the reception of data from the peer is
1381     taken as an indication of reachability, so again there are no extra
1382     packets:

1384     EXAMPLE 2: Bidirectional communications

1386        Peer A                                        Peer B
1387          |                                             |
1388          |               payload packet                |
1389          |-------------------------------------------->|
1390          |                                             |
1391          |               payload packet                |
1392          |<--------------------------------------------|
1393          |                                             |
1394          |               payload packet                |
1395          |-------------------------------------------->|
1396          |                                             |
1397          |                                             |

1399     The third example is the first one that involves an actual REAP
1400     message.  Here the hosts communicate in just one direction, so REAP
1401     messages are needed to indicate to the peer that sends payload
1402     packets that its packets are getting through:

1404     EXAMPLE 3: Unidirectional communications

1406        Peer A                                        Peer B
1407          |                                             |
1408          |               payload packet                |
1409          |-------------------------------------------->|
1410          |                                             |
1411          |               payload packet                |
1412          |-------------------------------------------->|
1413          |                                             |
1414          |               payload packet                |
1415          |-------------------------------------------->|
1416          |                                             |
1417          |               Keepalive id=p                |
1418          |<--------------------------------------------|
1419          |                                             |
1420          |               payload packet                |
1421          |-------------------------------------------->|
1422          |                                             |
1423          |                                             |

1425     The next example involves a failure scenario.  Here A has address A
1426     and B has addresses B1 and B2.  The currently used address pairs are
1427     (A, B1) and (B1, A).
         All connections via B1 become broken, which
1428     leads to an exploration process:

1430     EXAMPLE 4: Failure scenario

1432        Peer A                                        Peer B
1433          |                                             |
1434     State:                                             | State:
1435     Operational                                        | Operational
1436          |            (A,B1) payload packet            |
1437          |-------------------------------------------->|
1438          |                                             |
1439          |            (B1,A) payload packet            |
1440          |<--------------------------------------------| At time T1
1441          |                                             | path A<->B1
1442          |            (A,B1) payload packet            | becomes
1443          |----------------------------------------/    | broken
1444          |                                             |
1445          |            (B1,A) payload packet            |
1446          |   /-----------------------------------------|
1447          |                                             |
1448          |            (A,B1) payload packet            |
1449          |----------------------------------------/    |
1450          |                                             |
1451          |            (B1,A) payload packet            |
1452          |   /-----------------------------------------|
1453          |                                             |
1454          |            (A,B1) payload packet            |
1455          |----------------------------------------/    |
1456          |                                             |
1457          |                                             | Send Timeout
1458          |                                             | seconds after
1459          |                                             | T1, B happens to
1460          |                                             | see the problem
1461          |            (B1,A) Probe id=p,               | first and sends a
1462          |                   state=exploring           | complaint that
1463          |   /-----------------------------------------| it is not
1464          |                                             | receiving anything
1465          |                                             | State:
1466          |                                             | Exploring
1467          |                                             |
1468          |            (B2,A) Probe id=q,               |
1469          |                   state=exploring           | But it's lost,
1470          |<--------------------------------------------| retransmission
1471          |                                             | uses another pair
1472     A realizes                                         |
1473     that it needs                                      |
1474     to start the                                       |
1475     exploration. It                                    |
1476     picks B2 as the                                    |
1477     most likely candidate,                             |
1478     as it appeared in the                              |
1479     Probe                                              |
1480     State: InboundOk                                   |
1481          |                                             |
1482          |            (A,B2) Probe id=r,               |
1483          |                   state=inboundok,          |
1484          |                   received probe q          | This one gets
1485          |-------------------------------------------->| through.
1486          |                                             | State:
1487          |                                             | Operational
1488          |                                             |
1489          |                                             |
1490          |            (B2,A) Probe id=s,               |
1491          |                   state=operational,        | B now knows
1492          |                   received probe r          | that A has no
1493          |<--------------------------------------------| problem receiving
1494          |                                             | its packets
1495     State: Operational                                 |
1496          |                                             |
1497          |            (A,B2) payload packet            |
1498          |-------------------------------------------->| Payload packets
1499          |                                             | flow again
1500          |            (B2,A) payload packet            |
1501          |<--------------------------------------------|

1503     The next example shows a failure of the current locator pair in one
1504     direction only.  A has addresses A1 and A2, and B has
1505     addresses B1 and B2.  The current communication is between A1 and B1,
1506     but A's packets no longer reach B using this pair.

1508     EXAMPLE 5: One-way failure

1510        Peer A                                        Peer B
1511          |                                             |
1512     State:                                             | State:
1513     Operational                                        | Operational
1514          |                                             |
1515          |            (A1,B1) payload packet           |
1516          |-------------------------------------------->|
1517          |                                             |
1518          |            (B1,A1) payload packet           |
1519          |<--------------------------------------------|
1520          |                                             |
1521          |            (A1,B1) payload packet           | At time T1
1522          |----------------------------------------/    | path A1->B1
1523          |                                             | becomes
1524          |                                             | broken
1525          |            (B1,A1) payload packet           |
1526          |<--------------------------------------------|
1527          |                                             |
1528          |            (A1,B1) payload packet           |
1529          |----------------------------------------/    |
1530          |                                             |
1531          |            (B1,A1) payload packet           |
1532          |<--------------------------------------------|
1533          |                                             |
1534          |            (A1,B1) payload packet           |
1535          |----------------------------------------/    |
1536          |                                             |
1537          |                                             | Send Timeout
1538          |                                             | seconds after
1539          |                                             | T1, B notices
1540          |                                             | the problem and
1541          |            (B1,A1) Probe id=p,              | sends a
1542          |                    state=exploring          | complaint that
1543          |<--------------------------------------------| it is not
1544          |                                             | receiving anything
1545     A responds                                         | State: Exploring
1546     State: InboundOk                                   |
1547          |                                             |
1548          |            (A1,B1) Probe id=q,              |
1549          |                    state=inboundok,         |
1550          |                    received
                                            probe p         |
1551          |----------------------------------------/    | But A's response
1552          |                                             | is lost
1553          |            (B2,A2) Probe id=r,              |
1554          |                    state=exploring          | Next try different
1555          |<--------------------------------------------| locator pair
1556          |                                             |
1557          |            (A2,B2) Probe id=s,              |
1558          |                    state=inboundok,         |
1559          |                    received probes p, r     | This one gets
1560          |-------------------------------------------->| through
1561          |                                             | State: Operational
1562          |                                             |
1563          |                                             | B now knows
1564          |                                             | that A has no
1565          |            (B2,A2) Probe id=t,              | problem receiving
1566          |                    state=operational,       | its packets, and
1567          |                    received probe s         | that A's probe
1568          |<--------------------------------------------| gets to B. It
1569          |                                             | sends a
1570     State: Operational                                 | confirmation to A
1571          |                                             |
1572          |            (A2,B2) payload packet           |
1573          |-------------------------------------------->| Payload packets
1574          |                                             | flow again
1575          |            (B1,A1) payload packet           |
1576          |<--------------------------------------------|

1578  Appendix B.  Contributors

1580     This draft attempts to summarize the thoughts and unpublished
1581     contributions of many people, including the MULTI6 WG design team
1582     members Marcelo Bagnulo Braun, Erik Nordmark, Geoff Huston, Kurtis
1583     Lindqvist, Margaret Wasserman, and Jukka Ylitalo, the MOBIKE WG
1584     contributors Pasi Eronen, Tero Kivinen, Francis Dupont, Spencer
1585     Dawkins, and James Kempf, and HIP WG contributors such as Pekka
1586     Nikander.  This draft is also indebted to work done in the context of
1587     SCTP [RFC4960] and the HIP multihoming and mobility extension
1588     [I-D.ietf-hip-mm].

1590  Appendix C.  Acknowledgements

1592     The authors would also like to thank Christian Huitema, Pekka Savola,
1593     John Loughney, Sam Xia, Hannes Tschofenig, Sebastian Barre, Thomas
1594     Henderson, Matthijs Mekking, Deguang Le, Eric Gray, Dan Romascanu,
1595     Stephen Kent, Alberto Garcia, Bernard Aboba, Lars Eggert, and Tim
1596     Polk for interesting discussions in this problem space, and for
1597     review of this specification.
1599  Authors' Addresses

1601     Jari Arkko
1602     Ericsson
1603     Jorvas  02420
1604     Finland

1606     Email: jari.arkko@ericsson.com

1608     Iljitsch van Beijnum
1609     IMDEA Networks
1610     Avda. del Mar Mediterraneo, 22
1611     Leganes, Madrid  28918
1612     Spain

1614     Email: iljitsch@muada.com

1616  Full Copyright Statement

1618     Copyright (C) The IETF Trust (2008).

1620     This document is subject to the rights, licenses and restrictions
1621     contained in BCP 78, and except as set forth therein, the authors
1622     retain all their rights.

1624     This document and the information contained herein are provided on an
1625     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1626     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
1627     THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
1628     OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
1629     THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1630     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1632  Intellectual Property

1634     The IETF takes no position regarding the validity or scope of any
1635     Intellectual Property Rights or other rights that might be claimed to
1636     pertain to the implementation or use of the technology described in
1637     this document or the extent to which any license under such rights
1638     might or might not be available; nor does it represent that it has
1639     made any independent effort to identify any such rights.  Information
1640     on the procedures with respect to rights in RFC documents can be
1641     found in BCP 78 and BCP 79.

1643     Copies of IPR disclosures made to the IETF Secretariat and any
1644     assurances of licenses to be made available, or the result of an
1645     attempt made to obtain a general license or permission for the use of
1646     such proprietary rights by implementers or users of this
1647     specification can be obtained from the IETF on-line IPR repository at
1648     http://www.ietf.org/ipr.
1650     The IETF invites any interested party to bring to its attention any
1651     copyrights, patents or patent applications, or other proprietary
1652     rights that may cover technology that may be required to implement
1653     this standard.  Please address the information to the IETF at
1654     ietf-ipr@ietf.org.

1656  Acknowledgment

1658     Funding for the RFC Editor function is provided by the IETF
1659     Administrative Support Activity (IASA).