idnits 2.17.1 

draft-ietf-shim6-failure-detection-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 995.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 972.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 979.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 985.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 8, 2005) is 6775 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 2461 (ref. '2') (Obsoleted by RFC 4861)

  ** Obsolete normative reference: RFC 2462 (ref. '3') (Obsoleted by RFC 4862)

  ** Obsolete normative reference: RFC 3315 (ref. '4') (Obsoleted by RFC 8415)

  ** Obsolete normative reference: RFC 3484 (ref. '5') (Obsoleted by RFC 6724)

  == Outdated reference: A later version (-18) exists of
     draft-ietf-dhc-dna-ipv4-08

  == Outdated reference: A later version (-04) exists of
     draft-ietf-dna-goals-00

  ** Downref: Normative reference to an Informational draft:
     draft-ietf-dna-goals (ref. '7')

  == Outdated reference: A later version (-07) exists of
     draft-ietf-ipv6-optimistic-dad-01

  == Outdated reference: A later version (-09) exists of
     draft-ietf-ipv6-unique-local-addr-05

  == Outdated reference: A later version (-01) exists of
     draft-ietf-shim6-reach-detect-00

  -- Possible downref: Normative reference to a draft: ref. '10' 

  -- Obsolete informational reference (is this intentional?): RFC 2960 (ref.
     '11') (Obsoleted by RFC 4960)

  -- Obsolete informational reference (is this intentional?): RFC 3489 (ref.
     '12') (Obsoleted by RFC 5389)

  == Outdated reference: A later version (-05) exists of draft-ietf-hip-mm-00

  == Outdated reference: A later version (-08) exists of
     draft-ietf-mobike-design-00

  == Outdated reference: A later version (-08) exists of
     draft-ietf-mobike-protocol-03

  == Outdated reference: A later version (-19) exists of
     draft-ietf-mmusic-ice-02

  == Outdated reference: A later version (-22) exists of
     draft-ietf-tsvwg-addip-sctp-10

  == Outdated reference: A later version (-05) exists of
     draft-gont-tcpm-icmp-attacks-00

  == Outdated reference: A later version (-12) exists of
     draft-ietf-shim6-proto-00

  == Outdated reference: A later version (-08) exists of
     draft-rosenberg-midcom-turn-05


     Summary: 9 errors (**), 0 flaws (~~), 16 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                           J. Arkko
3	Internet-Draft                                                  Ericsson
4	Expires: April 11, 2006                                  October 8, 2005

6	     Failure Detection and Locator Pair Exploration Design for IPv6
7	                              Multihoming
8	                 draft-ietf-shim6-failure-detection-01

10	Status of this Memo

12	   By submitting this Internet-Draft, each author represents that any
13	   applicable patent or other IPR claims of which he or she is aware
14	   have been or will be disclosed, and any of which he or she becomes
15	   aware will be disclosed, in accordance with Section 6 of BCP 79.

17	   Internet-Drafts are working documents of the Internet Engineering
18	   Task Force (IETF), its areas, and its working groups.  Note that
19	   other groups may also distribute working documents as Internet-
20	   Drafts.

22	   Internet-Drafts are draft documents valid for a maximum of six months
23	   and may be updated, replaced, or obsoleted by other documents at any
24	   time.  It is inappropriate to use Internet-Drafts as reference
25	   material or to cite them other than as "work in progress."

27	   The list of current Internet-Drafts can be accessed at
28	   http://www.ietf.org/ietf/1id-abstracts.txt.

30	   The list of Internet-Draft Shadow Directories can be accessed at
31	   http://www.ietf.org/shadow.html.

33	   This Internet-Draft will expire on April 11, 2006.

35	Copyright Notice

37	   Copyright (C) The Internet Society (2005).

39	Abstract

41	   This draft discusses the issues of detecting failures in a currently
42	   used address pair between two hosts and picking a new address pair to
43	   be used when a failure occurs.  The draft also discusses the roles of
44	   a multihoming protocol versus network attachment functions at IP and
45	   link layers.

47	Table of Contents

49	   1.  Introduction
50	   2.  Requirements language
51	   3.  Related Work
52	   4.  Definitions
53	         4.1.  Available Addresses
54	         4.2.  Locally Operational Addresses
55	         4.3.  Operational Address Pairs
56	         4.4.  Primary Address Pair
57	         4.5.  Miscellaneous
58	   5.  Architectural Considerations
59	   6.  Solution
60	         6.1.  State Machines
61	         6.2.  Failure Detection
62	         6.3.  Alternative Locator Pair Exploration
63	               6.3.1.  Exploration Order
64	               6.3.2.  Exploration Protocol
65	   7.  Security Considerations
66	   8.  References
67	         8.1.  Normative References
68	         8.2.  Informative References
69	   Appendix A.  Contributors
70	   Appendix B.  Acknowledgements
71	   Author's Address
72	   Intellectual Property and Copyright Statements

74	1.  Introduction

76	   The SHIM6 working group is extending IPv6 to support multihoming.
77	   The focus of the group is to look at an IP layer (or layer 3.5)
78	   mechanism that hides multihoming from applications [23].  This
79	   mechanism needs to detect when a switch to another address or
80	   addresses becomes necessary.  We call this failure detection.

82	   This draft discusses what requirements such a component of the SHIM6
83	   protocol has, and how these requirements can be achieved.  The draft
84	   is structured as follows: Section 3 discusses what kind of solutions
85	   have been used in other similar protocols.  Section 4 defines a set
86	   of useful terms and discusses them, and Section 5 discusses the
87	   architectural implications of failure detection designs.  Finally,
88	   Section 6 describes one possible solution involving a mechanism to
89	   detect failures and an exploration protocol for working address
90	   pairs.

92	   For the purposes of this draft, we consider an address to be
93	   synonymous with a locator.  There may be other, higher level
94	   identifiers such as security associations, FQDNs, CGA public keys,
95	   HBA bindings, or HITs that tie the different locators used by a node
96	   together.

98	2.  Requirements language

100	   In this document, the key words "MAY", "MUST", "MUST NOT", "OPTIONAL",
101	   "RECOMMENDED", "SHOULD", and "SHOULD NOT" are to be interpreted as
102	   described in [1].

104	3.  Related Work

106	   Another SHIM6 document [10] discusses what kind of mechanisms can be
107	   used to detect whether the peer is still reachable at the currently
108	   used address.  Two proposed mechanisms, Correspondent Unreachability
109	   Detection (CUD) and Forced Bidirectional Communication (FBD) are
110	   presented.  CUD is based on getting upper layer positive feedback,
111	   and IPv6 NUD-like probing if there is no feedback.  FBD is based on
112	   forcing bidirectional communication by adding keepalive messages when
113	   there is no other, payload traffic.

115	   In SCTP [11], the addresses of the endpoints are learned in the
116	   connection setup phase either through listing them explicitly or via
117	   giving a DNS name that points to them.  In order to provide a
118	   failover mechanism between multihomed hosts, SCTP has the following
119	   functions:

121	   o  One of the peer's addresses is selected as the primary address by
122	      the application running on top of SCTP.  All data packets are sent
123	      to this address until there is a reason to choose another address,
124	      such as the failure of the primary address.

126	   o  Testing the reachability of the peer endpoint's addresses.  This
127	      is done either by observing the data packets sent to the peer or
128	      via a periodic heartbeat when there are no data packets to send.

130	      Each time data packet retransmission is initiated (or when a
131	      heartbeat is not answered within the estimated round-trip time) an
132	      error counter is incremented.  When a configured error limit is
133	      reached, the particular destination address is marked as inactive.
134	      The reception of an acknowledgement or heartbeat response clears
135	      the counter.

137	   o  Retransmission: When retransmitting, the endpoint attempts to
138	      pick the most "divergent" source-destination pair from the
139	      original source-destination pair to which the packet was
140	      transmitted.  Rules for such selection are, however, left as
141	      implementation decisions in SCTP.
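
   The error counting described above can be sketched as follows.  This
   is only an illustration, not SCTP's actual implementation: the class
   name, the default limit of 5, and the decision to mark a destination
   active again when a response arrives are assumptions of the sketch.

```python
from dataclasses import dataclass


@dataclass
class DestinationState:
    """Per-destination error bookkeeping (hypothetical names)."""
    error_limit: int = 5   # assumed value; the text only says "configured"
    errors: int = 0
    active: bool = True

    def on_timeout(self) -> None:
        """A data retransmission started or a heartbeat went unanswered."""
        self.errors += 1
        if self.errors >= self.error_limit:
            self.active = False   # destination address marked inactive

    def on_response(self) -> None:
        """An acknowledgement or heartbeat response clears the counter."""
        self.errors = 0
        self.active = True        # assumption: a response also reactivates
```

   A sender would keep one such record per peer address and stop using
   an address as a destination once it has been marked inactive.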

143	   SCTP does not define how local knowledge (such as information learned
144	   from the link layer) should be used.  SCTP also has no mechanism to
145	   deal with dynamic changes to the set of available addresses, although
146	   mechanisms for that are being developed [18].

148	   The MOBIKE protocol is currently being specified [16] [15].  This
149	   protocol operates in a mixed IPv4/IPv6 environment, and typically has
150	   to work through NATs.  The current design is assumed to need to work
151	   only in symmetric connectivity scenarios.

153	   Some of the issues that have been discussed in the MOBIKE design
154	   phase include the following:

156	   o  Single address vs. multiple peer addresses.  A simple approach is
157	      to have the peers be aware of just the current address of the
158	      other side instead of all possible ones.  Assuming that one of the
159	      peers will request the other to start sending to a new address
160	      this works well.  However, this approach is unable to deal with
161	      problems that affect both nodes.  For instance, two nodes
162	      connected by two separate point-to-point links will be unable to
163	      switch to the other link if a failure occurs on the first one.

165	   o  Addresses vs. address pairs.  Are tests and current paths
166	      individual peer addresses, or pairs of peer and own addresses
167	      (paths)?  It seems that some failure scenarios require the use of
168	      a path rather than a single address.  A network failure may make
169	      it impossible to communicate between a particular pair of
170	      addresses, even if those addresses have some other connectivity.

172	   o  Where the connectivity information comes from.  Does it come from
173	      the local stack (such as interface up/down, router advertisement),
174	      from reception of ESP packets, from IKEv2 keepalives, or through
175	      some MOBIKE-defined mechanism?

177	   The mobility and multihoming specification for the HIP protocol [14]
178	   leaves the determination of when address updates are sent to a local
179	   policy, but suggests the use of local information and ICMP error
180	   messages.

182	   Network attachment procedures are also relevant for multihoming.  The
183	   IPv6 and MIP6 working groups have standardized mechanisms to learn
184	   about networks that a node has attached to.  Basic IPv6 Neighbor
185	   Discovery was, however, designed primarily for static situations.
186	   The fully dynamic detection procedure has turned out to be a
187	   relatively complex procedure for mobile hosts, and it was not fully
188	   anticipated at the time IPv6 Neighbor Discovery or DHCP were being
189	   designed.  As a result, enhanced or optimized mechanisms are being
190	   designed in the DHC and DNA working groups [6] [7].

192	   ICE [17], STUN [12], and TURN [24] are also related mechanisms.  They
193	   are primarily used for NAT detection and communication through NATs
194	   in IPv4 environments, for applications such as voice over IP.  STUN
195	   uses a server in the Internet to discover the presence and type of
196	   NATs and the client's public IP addresses and ports.  TURN makes it
197	   possible to receive incoming connections in hosts behind NATs.  ICE
198	   makes use of these protocols in peer-to-peer cooperative fashion,
199	   allowing participants to discover, create and verify mutual
200	   connectivity, and then use this connectivity for multimedia streams.
201	   While these mechanisms are not designed for dynamic and failure
202	   situations, they have many of the same requirements for the
203	   exploration of connectivity, as well as the requirement to deal with
204	   middleboxes.

206	   Related work in the IPv6 area includes RFC 3484 [5] which defines
207	   source and destination address selection rules for IPv6 in situations
208	   where multiple candidate address pairs exist.  RFC 3484 considers
209	   only a static situation, however, and does not take into account the
210	   effect of failures.  In the MULTI6 working group, [22] considers how
211	   applications can re-initiate connections after failures in the best
212	   way.  This work differs from the shim-layer approach selected for
213	   further development in the working group with respect to the timing
214	   of the address selection.  In the shim-layer approach, failure
215	   detection and the selection of new addresses happen at any time,
216	   while [22] considers only the case when an application re-establishes
217	   connections.

219	4.  Definitions

221	   This section defines terms useful in discussing the failure detection
222	   problem space.

224	4.1.  Available Addresses

226	   SHIM6 nodes need to be aware of what addresses they themselves have.
227	   If a node loses the address it is currently using for communications,
228	   another address must replace this address.  And if a node loses an
229	   address that the node's peer knows about, the peer must be informed.
230	   Similarly, when a node acquires a new address it may generally wish
231	   the peer to know about it.

233	   Definition.  Available address.  An address is said to be available
234	   if the following conditions are fulfilled:

236	   o  The address has been assigned to an interface of the node.

238	   o  If the address is an IPv6 address, we additionally require that
239	      (a) the address is valid in the sense of RFC 2461 [2], and that
240	      (b) the address is not tentative in the sense of RFC 2462 [3].  In
241	      other words, the address assignment is complete so that
242	      communications can be started.

244	      Note this explicitly allows an address to be optimistic in the
245	      sense of [8] even though implementations are probably better off
246	      using other addresses as long as there is an alternative.

248	   o  The address is a global unicast, unique local address [9], or an
249	      unambiguous IPv6 link-local or IPv4 RFC 1918 address.  That is, it
250	      is not an IPv6 site-local address.  Where IPv6 link-local or RFC
251	      1918 addresses are used, their use needs to be unambiguous.  The
252	      precise meaning of ambiguous has not been defined yet, but one
253	      approach is requiring that at most one link-local address be used
254	      per node within the same connection between two peers.

256	         Note: Given RFC 3484 [5] rules for preferring smallest scope,
257	         it is likely that many IPv6 flows will, at least initially,
258	         use link-local addresses.

260	   o  The address and interface are acceptable for use according to a
261	      local policy.
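
   The conditions above can be read as a single predicate.  The sketch
   below is one possible reading, not a normative algorithm: the
   AddressInfo fields are hypothetical inputs that a real stack would
   obtain from Neighbor Discovery, DHCP, and local policy, and the
   "unambiguous" flag is simply supplied by the caller because its
   precise meaning is left open above.  Treating IPv4 link-local
   addresses like IPv6 link-local ones is also an assumption.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# RFC 1918 private IPv4 ranges.
RFC1918 = [ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


@dataclass
class AddressInfo:
    address: str
    assigned: bool = True      # assigned to an interface of the node
    valid: bool = True         # IPv6: valid in the sense of [2]
    tentative: bool = False    # IPv6: tentative in the sense of [3]
    unambiguous: bool = True   # caller's verdict for limited-scope addresses
    policy_ok: bool = True     # address and interface accepted by local policy


def is_available(info: AddressInfo) -> bool:
    addr = ip_address(info.address)
    if not (info.assigned and info.policy_ok):
        return False
    if addr.is_multicast or addr.is_loopback or addr.is_unspecified:
        return False           # not a unicast address usable towards a peer
    if addr.version == 6:
        if not info.valid or info.tentative:
            return False       # optimistic addresses are not excluded here
        if addr.is_site_local:
            return False       # IPv6 site-local addresses are never available
    limited_scope = addr.is_link_local or (
        addr.version == 4 and any(addr in net for net in RFC1918))
    return info.unambiguous if limited_scope else True
```

   For example, is_available(AddressInfo("fec0::1")) is False because
   the address is site-local, while a unique local address such as
   fd12:3456::1 passes the scope check.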

263	   Available addresses are discovered and monitored through mechanisms
264	   outside the scope of SHIM6 (and HIP or MOBIKE).  These mechanisms
265	   include IPv6 Neighbor Discovery and Address Autoconfiguration [2]
266	   [3], DHCP [4], enhanced network detection mechanisms designed by the
267	   DNA working group, and corresponding IPv4 mechanisms, such as [6].

269	4.2.  Locally Operational Addresses

271	   Two different granularity levels are needed for failure detection.
272	   The coarser granularity is for individual addresses:

274	   Definition.  Locally Operational Address.  An available address is
275	   said to be locally operational when its use is known to be possible
276	   locally: the interface is up, a relevant default router (if
277	   applicable) is known to be reachable, and no other local information
278	   points to the address being unusable.
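
   As a minimal sketch, the definition can be written as a predicate
   over locally known facts.  The parameter names are hypothetical;
   router_reachable is None when no default router applies to the
   address.

```python
from typing import Optional


def is_locally_operational(available: bool,
                           interface_up: bool,
                           router_reachable: Optional[bool],
                           other_local_problem: bool = False) -> bool:
    """Return True when local information says the address can be used."""
    if not (available and interface_up):
        return False
    if router_reachable is False:
        # A relevant default router exists but is not known to be reachable.
        return False
    return not other_local_problem
```

   Note that this says nothing about whether the peer can actually be
   reached; that is the subject of the next section.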

280	   Locally operational addresses are discovered and monitored through
281	   mechanisms outside SHIM6 (and HIP or MOBIKE).  These mechanisms
282	   include IPv6 Neighbor Discovery [2], corresponding IPv4 mechanisms,
283	   and link layer specific mechanisms.

285	   It is also possible for hosts to learn about routing failures for a
286	   particular selected source prefix.  Protocols for distributing this
287	   information are still being designed [19] [22]; none is in place
288	   yet, but their development is feasible.  Potential approaches include
289	   overloading information in current IPv6 Router Advertisements or
290	   adding some new information to them.  Similarly, hosts could learn
291	   information from servers that query the BGP routing tables.

293	4.3.  Operational Address Pairs

295	   The existence of locally operational addresses is not, however, a
296	   guarantee that communications can be established with the peer.  A
297	   failure in the routing infrastructure can prevent the sent packets
298	   from reaching their destination.  For this reason we need the
299	   definition of a second level of granularity, for pairs of addresses:

301	   Definition.  Bidirectionally operational address pair.  A pair of
302	   locally operational addresses is said to be a bidirectionally
303	   operational address pair, iff bidirectional connectivity can be shown
304	   between the addresses.  That is, a packet sent with one of the
305	   addresses in the source field and the other in the destination field
306	   reaches the destination, and vice versa.

308	   Unfortunately, there are scenarios where bidirectionally operational
309	   address pairs do not exist.  For instance, ingress filtering or
310	   network failures may result in one address pair being operational in
311	   one direction while another one is operational from the other
312	   direction.  The following definition captures this general situation:

314	   Definition.  Unidirectionally operational address pair.  A pair of
315	   locally operational addresses is said to be a unidirectionally
316	   operational address pair, iff packets sent with the first address as
317	   the source and the second address as the destination can be shown to
318	   reach the destination.
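
   The two definitions can be illustrated with a small classifier.  It
   assumes, purely for illustration, that the node has a set of
   (source, destination) pairs for which delivery has been shown; how
   that evidence is gathered is discussed below.  All names are
   hypothetical.

```python
from enum import Enum
from typing import Set, Tuple


class PairStatus(Enum):
    NOT_OPERATIONAL = "not operational"
    UNIDIRECTIONAL = "unidirectionally operational"
    BIDIRECTIONAL = "bidirectionally operational"


def classify(src: str, dst: str, delivered: Set[Tuple[str, str]]) -> PairStatus:
    """Classify the ordered pair (src, dst) from observed deliveries."""
    forward = (src, dst) in delivered   # packets src -> dst reach dst
    reverse = (dst, src) in delivered   # packets dst -> src reach src
    if forward and reverse:
        return PairStatus.BIDIRECTIONAL
    if forward:
        return PairStatus.UNIDIRECTIONAL
    return PairStatus.NOT_OPERATIONAL
```

   With delivered = {("A1", "B1"), ("B2", "A1")}, the pair (A1, B1) is
   only unidirectionally operational: replies have to come back on a
   different pair, as in the ingress filtering example above.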

320	   Both types of operational pairs are discovered and monitored through
321	   the following mechanisms:

323	   o  Positive feedback from upper layer protocols.  For instance, TCP
324	      can indicate to the IP layer that it is making progress.  This is
325	      similar to how IPv6 Neighbor Unreachability Detection can in some
326	      cases be avoided when upper layers provide information about
327	      bidirectional connectivity [2].  In the case of unidirectional
328	      connectivity, the upper layer protocol responses come back using
329	      another address pair, but show that the messages sent using the
330	      first address pair have been received.

332	   o  Negative feedback from upper layer protocols.  It is conceivable
333	      that upper layer protocols give an indication of a problem to the
334	      SHIM6 layer.  For instance, TCP could indicate that there's either
335	      congestion or lack of connectivity in the path because it is not
336	      getting ACKs.

338	   o  Explicit reachability tests, such as keepalives or probes added
339	      when there's only unidirectional payload traffic [10].

341	   o  ICMP error messages.  Given the ease of spoofing ICMP messages,
342	      one should be careful to not trust these blindly, however.  Our
343	      suggestion is to use ICMP error messages only as a hint to perform
344	      an explicit reachability test, but not as a reason to disrupt
345	      ongoing communications without other indications of problems.  The
346	      situation may be different when certain verifications of the ICMP
347	      messages are being performed [21].  These verifications can ensure
348	      that (practically) only on-path attackers can spoof the messages.
349	      Such verifications are not possible for all transport protocols,
350	      however.
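
   One way to combine these inputs is a simple mapping from signal to
   reaction, sketched below.  Only the treatment of ICMP errors (a hint
   that triggers an explicit test) follows a suggestion made in the
   list above; the reactions to negative upper layer feedback and to
   probe timeouts, and all names, are assumptions of this sketch.

```python
from enum import Enum, auto


class Signal(Enum):
    UPPER_LAYER_POSITIVE = auto()
    UPPER_LAYER_NEGATIVE = auto()
    PROBE_ANSWERED = auto()
    PROBE_TIMED_OUT = auto()
    ICMP_ERROR = auto()


class Action(Enum):
    MARK_OPERATIONAL = auto()
    SEND_PROBE = auto()
    START_EXPLORATION = auto()


POLICY = {
    Signal.UPPER_LAYER_POSITIVE: Action.MARK_OPERATIONAL,
    Signal.PROBE_ANSWERED: Action.MARK_OPERATIONAL,
    # ICMP errors are easy to spoof: use them only as a hint to test.
    Signal.ICMP_ERROR: Action.SEND_PROBE,
    # Assumption: negative feedback may be congestion, so verify first.
    Signal.UPPER_LAYER_NEGATIVE: Action.SEND_PROBE,
    # Assumption: an unanswered explicit test is treated as a failure.
    Signal.PROBE_TIMED_OUT: Action.START_EXPLORATION,
}


def react(signal: Signal) -> Action:
    """Return the reaction this sketch associates with a signal."""
    return POLICY[signal]
```

   The point of the table is that an ICMP error alone never disrupts
   ongoing communications; it only causes a reachability test.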

352	   Note that some protocols, such as HIP [14] and MOBIKE [16], perform a
353	   return routability test of an address before it is taken into use.
354	   The purpose of this test is to ensure that fraudulent peers do not
355	   trick others into redirecting traffic streams onto innocent victims
356	   [26].  Such tests can at the same time work as a means to ensure that
357	   an address pair is operational.  Note, however, that some advanced
358	   optimizations attempt to postpone the reachability tests so that they
359	   do not increase movement-related latency [25].

361	4.4.  Primary Address Pair

363	   Unlike SCTP, which has a specific congestion avoidance design
364	   suitable for multihoming, IP-layer solutions need to avoid sending
365	   packets concurrently over multiple paths; TCP behaves rather poorly
366	   in such circumstances.  For this reason it is necessary to choose a
367	   particular pair of addresses as the primary address pair which is
368	   used until problems occur, at least for the same session.

370	   A primary address pair need not be operational at all times.  If
371	   there is no traffic to send, we may not know if the primary address
372	   pair is operational.  Nevertheless, it makes sense to assume that the
373	   address pair that worked some time ago continues to work for new
374	   communications as well.

376	4.5.  Miscellaneous

378	   Addresses can become deprecated [2].  When other operational
379	   addresses exist, nodes generally wish to move their communications
380	   away from the deprecated addresses.

382	   Similarly, IPv6 source address selection [5] may guide the selection
383	   of a particular source address - destination address pair.

385	5.  Architectural Considerations

387	   Architecturally, a number of questions arise.  One simple question
388	   is whether there needs to be communication between a multihoming
389	   solution residing at the IP layer and upper layer protocols.  Upon
390	   changing to a new address pair, the transport layer protocol SHOULD
391	   be notified so that it can perform a slow start, or some other form
392	   of adaptation to the possibly changed conditions.  This is
393	   necessary, for instance, when switching from a high-bandwidth LAN
394	   interface to a low-bandwidth cellular interface.  (Note that this
395	   notification cannot be done in protocol designs where the end
396	   points are not the final hosts, such as where a gateway is used.)

398	   A more fundamental question is which protocols should be responsible
399	   for which parts of the problem.  It seems clear that no multihoming
400	   solution should take on the task of lower layers and other IP
401	   functions for discovering its own addresses or testing local
402	   connectivity.  Protocols such as DHCP or Neighbor and Router
403	   Discovery do this already.

405	   But it is less clear which protocol(s) should discover end-to-end
406	   connectivity problems or recover from them.  One answer is that this
407	   is clearly within the domain of the multihoming protocol.  If that
408	   protocol tests the used path, detects failures, and switches to a
409	   new path when necessary, the transport and application protocols can
410	   work unchanged.

412	   On the other hand, one could argue that transport and application
413	   protocols would have more knowledge about the situation, and have a
414	   better ability to decide when a move is required.  For instance, they
415	   know what the required throughput and congestion status is.  Also, it
416	   would be unfortunate if both the IP layer and transport/application
417	   layer took action for the same problem, for instance by switching to
418	   a new address at the IP layer and throttling back due to "congestion"
419	   at the transport layer.

421	   One can also envision that applications would be able to tell the IP
422	   or transport layer that the current connection is unsatisfactory and
423	   an exploration for a better one would be desirable.  This would
424	   require an API to be developed, however.

426	   Generally speaking, we can divide information that a host has into
427	   three categories: local information from "lower layers" such as IPv6
428	   Neighbor Discovery; transit and congestion condition information
429	   either from the multihoming protocol itself or from transport layer
430	   protocols and (where available) ECN; and application layer policies
431	   that dictate what the requirements are for acceptable connections.

433	   The division of work is largely left as an open issue as far as this
434	   document is concerned, but our description works from a point of view
435	   of a multihoming protocol at the IP layer.  We also note that in the
436	   CELP proposal [20], IP, transport, and application layer
437	   entities could share their connectivity status in a common
438	   information pool.  This may also be a useful approach.

440	   Finally, the last architectural question is about the difference
441	   between mobility and multihoming.  Given our definitions above,
442	   there's no fundamental difference with respect to how the
443	   multihoming/mobility protocol learns the addresses it has available.
444	   However, a practical difference is that in a multihoming scenario
445	   there are alternative addresses, whereas in mobility changes to a new
446	   address are forced due to the old address no longer being available.
447	   Interestingly, with the exception of MOBIKE, existing mobility
448	   protocols do not employ any failure detection mechanisms of their
449	   own, and rely solely on link layer and neighbor discovery mechanisms.

451	6.  Solution

453	   We need to keep track of the host's own available addresses,
454	   operational addresses, and operational address pairs, and to explore
455	   for other operational pairs when a failure occurs.  We will first
456	   describe two general state machines that illustrate the overall
457	   process, and then discuss the details of the reachability tests
458	   needed for ensuring operational status, and the exploration protocol.

460	6.1.  State Machines

462	   Addresses can be in the AVAILABLE and OPERATIONAL states.  The state
463	   transitions relating to this are shown in Figure 1.

465	                     +--------------+
466	     Address becomes |              |
467	     available       |              |
468	   ----------------->|              |
469	                     |  AVAILABLE   |
470	   <-----------------|              |
471	     Address is no   |              |
472	    longer available |              |
473	                     +--------------+
474	                        |       / \
475	                Address |        | Address
476	                becomes |        | is no longer
477	            operational |        | operational
478	                        |        |
479	                       \ /       |
480	                     +--------------+
481	                     |              |
482	     Address is no   |              |
483	    longer available |              |
484	   <-----------------| OPERATIONAL  |
485	                     |              |
486	                     |              |
487	                     |              |
488	                     +--------------+

490	          Figure 1. Address state machine.

492	   When an address becomes operational, it SHOULD be reported as a new
493	   address to the peer.  Similarly, when an address is no longer
494	   operational or available, the peer SHOULD be informed.

496	   In addition, a particular address can be either preferred or
497	   deprecated.  This is not shown in the state machine.
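
   The address state machine of Figure 1, together with the reporting
   recommendations above, can be sketched as a transition table.  The
   UNAVAILABLE state stands for the implicit state outside the two
   boxes in the figure.  The names, the choice to ignore events that do
   not apply in the current state, and the literal reading that the
   peer is informed on every loss of availability are assumptions.

```python
from enum import Enum
from typing import Dict, Tuple


class AddrState(Enum):
    UNAVAILABLE = "unavailable"   # implicit: outside the boxes in Figure 1
    AVAILABLE = "available"
    OPERATIONAL = "operational"


class Event(Enum):
    BECOMES_AVAILABLE = "address becomes available"
    NO_LONGER_AVAILABLE = "address is no longer available"
    BECOMES_OPERATIONAL = "address becomes operational"
    NO_LONGER_OPERATIONAL = "address is no longer operational"


# (state, event) -> (next state, whether the peer SHOULD be informed)
TRANSITIONS: Dict[Tuple[AddrState, Event], Tuple[AddrState, bool]] = {
    (AddrState.UNAVAILABLE, Event.BECOMES_AVAILABLE): (AddrState.AVAILABLE, False),
    (AddrState.AVAILABLE, Event.NO_LONGER_AVAILABLE): (AddrState.UNAVAILABLE, True),
    (AddrState.AVAILABLE, Event.BECOMES_OPERATIONAL): (AddrState.OPERATIONAL, True),
    (AddrState.OPERATIONAL, Event.NO_LONGER_OPERATIONAL): (AddrState.AVAILABLE, True),
    (AddrState.OPERATIONAL, Event.NO_LONGER_AVAILABLE): (AddrState.UNAVAILABLE, True),
}


def step(state: AddrState, event: Event) -> Tuple[AddrState, bool]:
    """Apply one event; events not drawn in Figure 1 leave the state as is."""
    return TRANSITIONS.get((state, event), (state, False))
```

   A node would run one instance per local address and send an update
   to the peer whenever the second element of the result is True.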

499	   Another state machine describes address pair selection.  A node runs
500	   the address pair selection state machine to choose the currently used
501	   primary address pair, the one which is used for sending outgoing
502	   packets.  A node runs one of these state machines towards each
503	   different peer, tracking the known address pairs and their status.
504	   Each peer also has its own state machine for talking back to the
505	   node; there is no guarantee that the same address pairs (in reverse
506	   order) have the same state; lack of a bidirectionally operational
507	   pair would result in a different state on each side, for instance.

509	   The state machine can be in the NO PRIMARY, TESTING PRIMARY, and
510	   OPERATIONAL PRIMARY states.  The chosen address pair is known to be
511	   operational in the OPERATIONAL PRIMARY state, and is either
512	   unverified or non-operational in the other states.

514	   Figure 2 shows the state machine:

516	                         +----------------+
517	                         |                |
518	                         |                |
519	                         |                |
520	                         |                |
521	                         |       NO       |
522	                         |     PRIMARY    |
523	                         |                |
524	                   +-----|                |<---------------+
525	                   |     |                |                |
526	                   |     +----------------+                |
527	                   |         / \    / \                    |
528	               Add |          |      |                     |
529	             pair: |   Delete |      | Test         Delete |
530	              Send |   pair & |      | fail &       pair & |
531	              test |     Last |      | Last           Last |
532	                   |          |      |                     |
533	                   |     +----------------+                |
534	                   |     |                |                |
535	                   +---->|                |<----+          |
536	                         |                |     | Test     |
537	    Connect: Send test   |                |     | fail &   |
538	   --------------------->|     TESTING    |     | !Last    |
539	                         |     PRIMARY    |+----+          |
540	          +------------->|                |                |
541	          |              |                |<----+          |
542	          |        +---->|                |     |          |
543	          |        |     +----------------+     |          |
544	   Policy | ICMP | |          |      |          |          |
545	   change | Timer: |      ULP |      | Test     | Delete   |
546	          |   Send | feedback:|      | OK:      | pair &   |
547	          |   test |    Reset |      | Reset    | !Last    |
548	          |        |    timer |      | timer    |          |
549	          |        |         \ /    \ /         |          |
550	          |        |     +----------------+     |          |
551	          |        +-----|                |     |          |
552	          |              |                |-----+          |
553	          +--------------|                |                |
554	                         |                |                |
555	                   +-----|   OPERATIONAL  |                |
556	     ULP feedback: |     |     PRIMARY    |                |
557	       Reset timer |     |                |----------------+
558	                   +---->|                |
559	                         |                |
560	                         +----------------+

562	          Figure 2. Pair selection state machine.

564	   The notation used in Figure 2 is explained below:

566	   Connect

568	      An event representing the desire of the application to send a
569	      packet to a new peer, or an indication from a peer wishing to
570	      connect to us.

572	   Test OK

574	      An event representing a successful completion of the reachability
575	      test.

577	   Test fail

579	      An event representing failure to complete the reachability test.

581	   ULP feedback

583	      An event representing positive indication from an upper layer
584	      protocol that the packets we have sent to the peer are getting
585	      through.

587	   ICMP

589	      An event representing the reception of an ICMP error message.

591	   Timer

593	      An event representing the expiry of a timer.

595	   Add pair

597	      An event representing the addition of a new possible address pair,
598	      either through learning a new local address or being told of a new
599	      remote address.  Note that this does not usually result in any
600	      immediate action, unless we are currently lacking an operational
601	      primary pair.

603	   Delete pair

605	      An event representing the deletion of the currently chosen primary
606	      address pair, or learning that one of the addresses in the pair is
607	      no longer operational.

609	   Policy change

611	      An event representing the desire of the local or remote end to
612	      change to a different address pair, despite the current one being
613	      operational.  This can be due to the availability of a higher-
614	      bandwidth connection, cost, or other issues.

616	   Last

618	      A condition that tells whether or not the currently chosen primary
619	      pair is the only known address pair.

621	   Send test

623	      An action to initiate the reachability test for a particular pair.
624	      This test is typically embedded in the SHIM6 connection setup
625	      exchange when run initially, and a separate exchange later.

627	      Note that due to potentially asymmetric connectivity, both sides
628	      have to perform their own tests, and make their own primary pair
629	      selections.

631	   Reset timer

633	      An action to reset a timer so that it will send an event after a
634	      specified time.

636	   The state machines also assume an underlying multihoming signaling
637	   capability, consisting of the following abstract message exchanges:

639	   Open

641	      Establishes a connection between the peers.  May also exchange
642	      locator sets and test reachability at the same time.

644	   Test

646	      Verifies reachability using a specific address pair.

648	   Add

650	      Informs the peer about new locators.

652	   Delete

654	      Informs the peer about losing some locators.

656	   Note that the above state machine leaves open how specific address
657	   pairs are chosen or how the tests are actually performed.  These
658	   issues will be discussed in the next sections.  We have also, on
659	   purpose, decided to avoid attaching functional labels such as
660	   "backup" to other address pairs beyond the primary pair.  It is our
661	   belief that a general design does not need these labels.
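
	   The arrows of Figure 2 can also be written down as a transition
	   table.  The following is a minimal sketch that encodes only what
	   the figure draws; the event strings, the PairState names, and the
	   next_state() helper are hypothetical, and accepting Connect in the
	   NO PRIMARY state is an assumption, since the figure shows that
	   arrow arriving from outside the machine.

```python
from enum import Enum

class PairState(Enum):
    NO_PRIMARY = "no primary"
    TESTING = "testing primary"
    OPERATIONAL = "primary pair operational"

S = PairState
ANY = None  # the arrow applies whatever the value of the Last condition

# (state, event, Last) -> (next state, actions); one row per arrow.
ARROWS = {
    (S.NO_PRIMARY, "add_pair", ANY): (S.TESTING, ("send_test",)),
    # Assumption: Connect is accepted while there is no primary pair.
    (S.NO_PRIMARY, "connect", ANY): (S.TESTING, ("send_test",)),
    (S.TESTING, "test_ok", ANY): (S.OPERATIONAL, ("reset_timer",)),
    (S.TESTING, "ulp_feedback", ANY): (S.OPERATIONAL, ("reset_timer",)),
    (S.TESTING, "test_fail", True): (S.NO_PRIMARY, ()),
    (S.TESTING, "test_fail", False): (S.TESTING, ()),
    (S.TESTING, "delete_pair", True): (S.NO_PRIMARY, ()),
    (S.OPERATIONAL, "ulp_feedback", ANY): (S.OPERATIONAL, ("reset_timer",)),
    (S.OPERATIONAL, "icmp", ANY): (S.TESTING, ("send_test",)),
    (S.OPERATIONAL, "timer", ANY): (S.TESTING, ("send_test",)),
    (S.OPERATIONAL, "policy_change", ANY): (S.TESTING, ()),
    (S.OPERATIONAL, "delete_pair", False): (S.TESTING, ()),
    (S.OPERATIONAL, "delete_pair", True): (S.NO_PRIMARY, ()),
}

def next_state(state, event, last=False):
    """Return (new_state, actions); `last` is the Last condition."""
    for key in ((state, event, bool(last)), (state, event, ANY)):
        if key in ARROWS:
            return ARROWS[key]
    return state, ()  # event not drawn for this state: no change
```

	   Each node would run one such machine per peer.  Actions are listed
	   only where the figure names them, so an arrow such as "Policy
	   change" returns no action here.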

663	6.2.  Failure Detection

665	   This process consists of three tasks:

667	   o  Tracking local information from lower and upper layers.  For
668	      instance, when the link layer reports that we have no connection,
669	      we know there is a failure.

671	   o  Performing a reachability process as described in [10] for
672	      ensuring that there is reachability when the local information
673	      says there should be.

675	   o  Following commands from the peer regarding the availability of
676	      addresses.

678	6.3.  Alternative Locator Pair Exploration

680	6.3.1.  Exploration Order

682	   The pair selection state machine assumes an ability to pick primary
683	   and alternative address pairs.

685	   This process results in a combinatorial explosion when there are many
686	   addresses on both sides.  Do both sides track all possible
687	   combinations of addresses?  If a failure occurs, shall all
688	   combinations be tested before giving up?  Are such tests performed in
689	   parallel or in sequence, and what kind of backoff procedures should
690	   be applied?

692	   Our suggestion is that nodes MUST first consult RFC 3484 [5] Section
693	   4 rules to determine what combinations of addresses are legal from a
694	   local point of view, as this reduces the search space.  RFC 3484 also
695	   provides a priority ordering among different address pairs, making
696	   the search possibly faster.  Nodes SHOULD also use local information,
697	   such as known quality of service parameters or interface types, to
698	   determine what addresses are preferred over others, and try pairs
699	   containing such addresses first.  In some cases we can also learn the
700	   peer's preferences through the multihoming protocol.

702	      Discussion note 1: It may also be possible to simulate preferences
703	      by choosing to not tell the peer about some (non-preferred)
704	      addresses.

706	      Discussion note 2: The preferences may either be learned
707	      dynamically or be configured.  It is believed, however, that
708	      dynamic learning based purely on the SHIM6 protocol is too hard
709	      and not the task this layer should do.  Solutions where multiple
710	      protocols share their information in a common pool of locators
711	      could provide this information from transport protocols, however
712	      [20].

714	   The reception of packets from the peer with a given address pair is a
715	   good hint that the address pair works, particularly when these
716	   packets are authenticated multihoming protocol packets.  However, the
717	   reception of these packets alone is an insufficient reason to switch
718	   to a new address, as in a unidirectional connectivity case the
719	   return path may not work.

721	   One suggested implementation strategy is to record the
722	   reachability test result (an on/off value) and weight it by the age
723	   of the information, so that newer results count for more.  This
724	   allows recently tested address pairs to be chosen before old ones.
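
	   One possible reading of this strategy is sketched below.  The
	   text does not say how the age factor is scaled, so the decay
	   1 / (1 + age) is purely an assumption chosen so that newer results
	   rank first; the function names and record layout are hypothetical.

```python
def score(test_ok, age_seconds):
    """On/off test result scaled by an assumed freshness factor."""
    freshness = 1.0 / (1.0 + age_seconds)  # assumption: decays with age
    return (1.0 if test_ok else 0.0) * freshness

def rank_pairs(records):
    """Order address pairs, best candidate first.

    records: iterable of (pair, test_ok, age_seconds) tuples.
    """
    ordered = sorted(records, key=lambda r: score(r[1], r[2]), reverse=True)
    return [pair for pair, _ok, _age in ordered]

candidates = [
    (("A1", "B1"), True, 300.0),   # worked, but tested five minutes ago
    (("A2", "B1"), True, 5.0),     # worked very recently
    (("A1", "B2"), False, 1.0),    # failed very recently
]
order = rank_pairs(candidates)
```

	   With these inputs the recently verified pair (A2, B1) is tried
	   first, and the pair that failed its last test is tried last.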

726	   Out of the set of possible candidate address pairs, nodes SHOULD
727	   attempt a test through all of them, but MUST do this sequentially and
728	   using an exponential back-off procedure.

730	   This sequential process is necessary in order to avoid a "signaling
731	   storm" when an outage occurs (particularly for a complete site).
732	   However, it also limits the number of addresses that can in practice
733	   be used for multihoming, considering that transport and application
734	   layer protocols will fail if the switch to a new address pair takes
735	   too long.  For instance, we can assume that an initial timeout value
736	   is 0.1 seconds and there are four addresses on each side.  Going
737	   through all sixteen address pairs and doubling the timeout value at
738	   every trial would take roughly 6550 seconds (0.1 * (2^16 - 1))!
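
	   The arithmetic behind this example can be checked with a short
	   sketch.  It assumes that every trial waits for its full timeout
	   before the next pair is tried and that the timeout doubles after
	   each trial; the function name is hypothetical.

```python
def exploration_time(initial_timeout, local_addrs, remote_addrs):
    """Return (total_wait, last_timeout) for a sequential search."""
    pairs = local_addrs * remote_addrs
    # Timeout used for trial i, doubling at every trial.
    timeouts = [initial_timeout * 2 ** i for i in range(pairs)]
    return sum(timeouts), timeouts[-1]

total, last = exploration_time(0.1, 4, 4)
# total is about 6553.5 seconds, that is 0.1 * (2**16 - 1);
# the final trial alone waits about 3276.8 seconds.
```

	   Under these assumptions the sum over all sixteen trials is about
	   6553.5 seconds, of which the last trial alone accounts for about
	   3276.8 seconds.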

740	   Finally, as has been noted in the context of MOBIKE, the existence of
741	   NATs can require that peers continuously monitor the operational
742	   status of address pairs, as otherwise NAT state related to a
743	   particular communication is lost, and the peer on the outer side of
744	   the NAT can no longer reach the peer inside the NAT.

746	6.3.2.  Exploration Protocol

748	   The exploration for a working address pair is not easy, as
749	   unidirectional reachability needs to be considered.  This is because
750	   the test of a single pair may not result in working paths to send
751	   both the request and response packets.  The following protocol could
752	   be used to avoid this problem:

754	    Peer A                                        Peer B
755	      |                                             |
756	      |  Poll 1 (src=A1, dst=B1)                    |
757	      |-------------------------------------------->|
758	      |                                             |
759	      |               Poll 2 (src=B1, dst=A1) OK: 1 |
760	      |        X------------------------------------|
761	      |                                             |
762	      |  Poll 3 (src=A2, dst=B1)                    |
763	      |------------------------------X              |
764	      |                                             |
765	      |          Poll 4 (src=B2, dst=A1) OK: 1      |
766	      |<--------------------------------------------|
767	      |                                             |
768	      |  Poll 5 (src=A1, dst=B1) OK: 4              |
769	      |-------------------------------------------->|
770	      |                                             |

772	   When B receives the first Poll message, it records that it has
773	   received it.  The Poll message from B, however, is lost, so A tries
774	   again with another pair.  This is lost too, but B continues its own
775	   testing process by sending its second Poll message, which is received
776	   by A.  Each message carries an identifier, and a list of the
777	   identifiers of messages that the sender had itself successfully
778	   received earlier.

780	   At the end of the example case, A and B know that they have a working
781	   path from A to B using (A1, B1) and from B to A using (B2, A1).

783	   More generally, when A decides that it needs to test for
784	   connectivity, it will initiate a set of Poll messages, in sequence,
785	   until it gets a Poll message from B indicating that (a) B has
786	   received one of A's Poll messages and, obviously, (b) that B's Poll
787	   message is getting through.  B uses the same algorithm, but starts
788	   the process from the reception of the first Poll message from A.
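
	   The example exchange can be replayed with a small sketch.  The
	   Poll and Peer classes and their fields are hypothetical; only the
	   idea that a Poll carries its own identifier plus the identifiers
	   of Polls the sender has received is taken from the text.

```python
from dataclasses import dataclass, field

@dataclass
class Poll:
    ident: int   # identifier carried by this message
    src: str
    dst: str
    ok: tuple    # identifiers of Polls the sender had received so far

@dataclass
class Peer:
    sent: dict = field(default_factory=dict)  # ident -> (src, dst) used
    seen: list = field(default_factory=list)  # idents received so far
    outbound: tuple = None  # pair confirmed for our sending direction
    inbound: tuple = None   # pair on which the peer's Poll reached us

    def send(self, ident, src, dst):
        self.sent[ident] = (src, dst)
        return Poll(ident, src, dst, tuple(self.seen))

    def receive(self, poll):
        self.seen.append(poll.ident)
        self.inbound = (poll.src, poll.dst)
        # An echoed identifier shows which of our own Polls got through.
        for ident in poll.ok:
            if ident in self.sent:
                self.outbound = self.sent[ident]

a, b = Peer(), Peer()
b.receive(a.send(1, "A1", "B1"))   # Poll 1 arrives at B
b.send(2, "B1", "A1")              # Poll 2 is lost
a.send(3, "A2", "B1")              # Poll 3 is lost
a.receive(b.send(4, "B2", "A1"))   # Poll 4 arrives at A, carrying OK: 1
b.receive(a.send(5, "A1", "B1"))   # Poll 5 arrives at B, carrying OK: 4
```

	   After the last message, a.outbound is (A1, B1) and b.outbound is
	   (B2, A1), matching the outcome described for the example.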

790	   Note that this protocol can be implemented in different ways.  One
791	   approach is to rely on data packets, such as TCP payload packets and
792	   acknowledgements.  This method has the benefit that it likely passes
793	   easily through firewalls and other middleboxes.  One exception is
794	   stateful firewalls that wish to know what happened "earlier"
795	   in the connection, but it seems that such firewalls are fundamentally
796	   incompatible with multihoming anyway.  One drawback of this method
797	   is, however, that the number of available payload packets may not
798	   match the need in a situation where a lot of address pairs need to be
799	   explored.

801	   Another approach is to have a completely separate protocol for the
802	   exploration.  This would need to be explicitly allowed in firewalls
803	   before it could be used.  On the other hand, then it would be very
804	   clear for the firewall administrators what they are letting through.

806	7.  Security Considerations

808	   Attackers may spoof various indications from lower layers and the
809	   network in an effort to confuse the peers about which addresses are
810	   or are not working.  For example, attackers may spoof ICMP error
811	   messages in an effort to cause the parties to move their traffic
812	   elsewhere or even to disconnect.  Attackers may also spoof
813	   information related to network attachments, router discovery, and
814	   address assignments in an effort to make the parties believe they
815	   have Internet connectivity when in reality they do not.

817	   This may cause use of non-preferred addresses or even denial-of-
818	   service.

820	   SHIM6 does not provide any protection of its own for indications from
821	   other parts of the protocol stack.  However, MOBIKE is resistant to
822	   incorrect information from these sources in the sense that it
823	   provides its own security for both the signaling of addressing
824	   information as well as actual payload data transmission.  Denial-of-
825	   service vulnerabilities remain, however.  Some aspects of these
826	   vulnerabilities can be mitigated through the use of techniques
827	   specific to the other parts of the stack, such as properly dealing
828	   with ICMP errors [21], link layer security, or the use of [13] to
829	   protect IPv6 Router and Neighbor Discovery.

831	8.  References

833	8.1.  Normative References

835	   [1]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
836	         Levels", BCP 14, RFC 2119, March 1997.

838	   [2]   Narten, T., Nordmark, E., and W. Simpson, "Neighbor Discovery
839	         for IP Version 6 (IPv6)", RFC 2461, December 1998.

841	   [3]   Thomson, S. and T. Narten, "IPv6 Stateless Address
842	         Autoconfiguration", RFC 2462, December 1998.

844	   [4]   Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., and M.
845	         Carney, "Dynamic Host Configuration Protocol for IPv6
846	         (DHCPv6)", RFC 3315, July 2003.

848	   [5]   Draves, R., "Default Address Selection for Internet Protocol
849	         version 6 (IPv6)", RFC 3484, February 2003.

851	   [6]   Aboba, B., "Detection of Network Attachment (DNA) in IPv4",
852	         draft-ietf-dhc-dna-ipv4-08 (work in progress), July 2004.

854	   [7]   Choi, J., "Detecting Network Attachment in IPv6 Goals",
855	         draft-ietf-dna-goals-00 (work in progress), June 2004.

857	   [8]   Moore, N., "Optimistic Duplicate Address Detection for IPv6",
858	         draft-ietf-ipv6-optimistic-dad-01 (work in progress),
859	         June 2004.

861	   [9]   Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
862	         Addresses", draft-ietf-ipv6-unique-local-addr-05 (work in
863	         progress), June 2004.

865	   [10]  Beijnum, I., "Shim6 Reachability Detection",
866	         draft-ietf-shim6-reach-detect-00 (work in progress), July 2005.

868	8.2.  Informative References

870	   [11]  Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer,
871	         H., Taylor, T., Rytina, I., Kalla, M., Zhang, L., and V.
872	         Paxson, "Stream Control Transmission Protocol", RFC 2960,
873	         October 2000.

875	   [12]  Rosenberg, J., Weinberger, J., Huitema, C., and R. Mahy, "STUN
876	         - Simple Traversal of User Datagram Protocol (UDP) Through
877	         Network Address Translators (NATs)", RFC 3489, March 2003.

879	   [13]  Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure
880	         Neighbor Discovery (SEND)", RFC 3971, March 2005.

882	   [14]  Nikander, P., "End-Host Mobility and Multi-Homing with Host
883	         Identity Protocol", draft-ietf-hip-mm-00 (work in progress),
884	         October 2004.

886	   [15]  Kivinen, T., "Design of the MOBIKE protocol",
887	         draft-ietf-mobike-design-00 (work in progress), June 2004.

889	   [16]  Eronen, P., "IKEv2 Mobility and Multihoming Protocol (MOBIKE)",
890	         draft-ietf-mobike-protocol-03 (work in progress),
891	         September 2005.

893	   [17]  Rosenberg, J., "Interactive Connectivity Establishment (ICE): A
894	         Methodology for Network Address Translator (NAT) Traversal for
895	         Multimedia Session Establishment Protocols",
896	         draft-ietf-mmusic-ice-02 (work in progress), July 2004.

898	   [18]  Stewart, R., "Stream Control Transmission Protocol (SCTP)
899	         Dynamic Address Reconfiguration",
900	         draft-ietf-tsvwg-addip-sctp-10 (work in progress),
901	         January 2005.

903	   [19]  Bagnulo, M., "Address selection in multihomed environments",
904	         draft-bagnulo-shim6-addr-selection-00 (work in progress),
905	         October 2005.

907	   [20]  Crocker, D., "Framework for Common Endpoint Locator Pools",
908	         draft-crocker-celp-00 (work in progress), February 2004.

910	   [21]  Gont, F., "ICMP attacks against TCP",
911	         draft-gont-tcpm-icmp-attacks-00 (work in progress),
912	         August 2004.

914	   [22]  Huitema, C., "Address selection in multihomed environments",
915	         draft-huitema-multi6-addr-selection-00 (work in progress),
916	         October 2004.

918	   [23]  Nordmark, E., "Level 3 multihoming shim protocol",
919	         draft-ietf-shim6-proto-00 (work in progress), October 2005.

921	   [24]  Rosenberg, J., "Traversal Using Relay NAT (TURN)",
922	         draft-rosenberg-midcom-turn-05 (work in progress), July 2004.

924	   [25]  Vogt, C., Arkko, J., Bless, R., Doll, M., and T. Kuefner,
925	         "Credit-Based Authorization for Mobile IPv6 Early Binding
926	         Updates", draft-vogt-mipv6-credit-based-authorization-00 (work
927	         in progress), May 2004.

929	   [26]  Aura, T., Roe, M., and J. Arkko, "Security of Internet Location
930	         Management", In Proceedings of the 18th Annual Computer
931	         Security Applications Conference, Las Vegas, Nevada, USA.,
932	         December 2002.

934	Appendix A.  Contributors

936	   This draft attempts to summarize the thoughts and unpublished
937	   contributions of many people, including the MULTI6 WG design team
938	   members Marcelo Bagnulo Braun, Iljitsch van Beijnum, Erik Nordmark,
939	   Geoff Huston, Margaret Wasserman, and Jukka Ylitalo, the MOBIKE WG
940	   contributors Pasi Eronen, Tero Kivinen, Francis Dupont, Spencer
941	   Dawkins, and James Kempf, and my colleague Pekka Nikander at
942	   Ericsson.  This draft is also indebted to work done in the context of
943	   SCTP [11].

945	   The protocol design in Section 6.3.2 is due to Erik, Marcelo, and
946	   Iljitsch.

948	Appendix B.  Acknowledgements

950	   The author would also like to thank Christian Huitema, Pekka Savola,
951	   and Hannes Tschofenig for interesting discussions in this problem
952	   space, and for their comments on earlier versions of this draft.

954	Author's Address

956	   Jari Arkko
957	   Ericsson
958	   Jorvas  02420
959	   Finland

961	   Email: jari.arkko@ericsson.com

963	Intellectual Property Statement

965	   The IETF takes no position regarding the validity or scope of any
966	   Intellectual Property Rights or other rights that might be claimed to
967	   pertain to the implementation or use of the technology described in
968	   this document or the extent to which any license under such rights
969	   might or might not be available; nor does it represent that it has
970	   made any independent effort to identify any such rights.  Information
971	   on the procedures with respect to rights in RFC documents can be
972	   found in BCP 78 and BCP 79.

974	   Copies of IPR disclosures made to the IETF Secretariat and any
975	   assurances of licenses to be made available, or the result of an
976	   attempt made to obtain a general license or permission for the use of
977	   such proprietary rights by implementers or users of this
978	   specification can be obtained from the IETF on-line IPR repository at
979	   http://www.ietf.org/ipr.

981	   The IETF invites any interested party to bring to its attention any
982	   copyrights, patents or patent applications, or other proprietary
983	   rights that may cover technology that may be required to implement
984	   this standard.  Please address the information to the IETF at
985	   ietf-ipr@ietf.org.

987	Disclaimer of Validity

989	   This document and the information contained herein are provided on an
990	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
991	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
992	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
993	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
994	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
995	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

997	Copyright Statement

999	   Copyright (C) The Internet Society (2005).  This document is subject
1000	   to the rights, licenses and restrictions contained in BCP 78, and
1001	   except as set forth therein, the authors retain all their rights.

1003	Acknowledgment

1005	   Funding for the RFC Editor function is currently provided by the
1006	   Internet Society.