idnits 2.17.1 

draft-ietf-shim6-failure-detection-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 14.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 975.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 952.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 959.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 965.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (January 2005) is 7041 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: '1' is defined on line 799, but no explicit reference
     was found in the text

  == Unused Reference: '19' is defined on line 865, but no explicit reference
     was found in the text

  == Unused Reference: '20' is defined on line 868, but no explicit reference
     was found in the text

  == Unused Reference: '21' is defined on line 871, but no explicit reference
     was found in the text

  == Unused Reference: '24' is defined on line 883, but no explicit reference
     was found in the text

  == Unused Reference: '25' is defined on line 887, but no explicit reference
     was found in the text

  == Unused Reference: '26' is defined on line 890, but no explicit reference
     was found in the text

  == Unused Reference: '27' is defined on line 894, but no explicit reference
     was found in the text

  == Unused Reference: '30' is defined on line 906, but no explicit reference
     was found in the text

  ** Obsolete normative reference: RFC 2461 (ref. '2') (Obsoleted by RFC 4861)

  ** Obsolete normative reference: RFC 2462 (ref. '3') (Obsoleted by RFC 4862)

  ** Obsolete normative reference: RFC 3315 (ref. '4') (Obsoleted by RFC 8415)

  ** Obsolete normative reference: RFC 3484 (ref. '5') (Obsoleted by RFC 6724)

  == Outdated reference: A later version (-18) exists of
     draft-ietf-dhc-dna-ipv4-08

  == Outdated reference: A later version (-04) exists of
     draft-ietf-dna-goals-00

  ** Downref: Normative reference to an Informational draft:
     draft-ietf-dna-goals (ref. '7')

  == Outdated reference: A later version (-07) exists of
     draft-ietf-ipv6-optimistic-dad-01

  == Outdated reference: A later version (-09) exists of
     draft-ietf-ipv6-unique-local-addr-05

  -- Obsolete informational reference (is this intentional?): RFC 2960 (ref.
     '10') (Obsoleted by RFC 4960)

  -- Obsolete informational reference (is this intentional?): RFC 3489 (ref.
     '11') (Obsoleted by RFC 5389)

  == Outdated reference: A later version (-05) exists of draft-ietf-hip-mm-00

  == Outdated reference: A later version (-08) exists of
     draft-ietf-mobike-design-00

  == Outdated reference: A later version (-08) exists of
     draft-ietf-mobike-protocol-00

  == Outdated reference: A later version (-19) exists of
     draft-ietf-mmusic-ice-02

  == Outdated reference: A later version (-22) exists of
     draft-ietf-tsvwg-addip-sctp-10

  == Outdated reference: A later version (-08) exists of
     draft-dupont-ikev2-addrmgmt-05

  == Outdated reference: A later version (-02) exists of
     draft-eronen-mobike-mopo-00

  == Outdated reference: A later version (-05) exists of
     draft-gont-tcpm-icmp-attacks-00

  == Outdated reference: A later version (-08) exists of
     draft-rosenberg-midcom-turn-05


     Summary: 9 errors (**), 0 flaws (~~), 25 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                           J. Arkko
3	Internet-Draft                                                  Ericsson
4	Expires: July 5, 2005                                       January 2005

6	     Failure Detection and Locator Selection Design Considerations
7	                 draft-ietf-shim6-failure-detection-00

9	Status of this Memo

11	   By submitting this Internet-Draft, each author represents that any
12	   applicable patent or other IPR claims of which he or she is aware
13	   have been or will be disclosed, and any of which he or she becomes
14	   aware will be disclosed, in accordance with Section 6 of BCP 79.

16	   Internet-Drafts are working documents of the Internet Engineering
17	   Task Force (IETF), its areas, and its working groups.  Note that
18	   other groups may also distribute working documents as Internet-
19	   Drafts.

21	   Internet-Drafts are draft documents valid for a maximum of six months
22	   and may be updated, replaced, or obsoleted by other documents at any
23	   time.  It is inappropriate to use Internet-Drafts as reference
24	   material or to cite them other than as "work in progress."

26	   The list of current Internet-Drafts can be accessed at
27	   http://www.ietf.org/ietf/1id-abstracts.txt.

29	   The list of Internet-Draft Shadow Directories can be accessed at
30	   http://www.ietf.org/shadow.html.

32	   This Internet-Draft will expire on July 5, 2005.

34	Copyright Notice

36	   Copyright (C) The Internet Society (2005).

38	Abstract

40	   This draft discusses locator pair selection and failure detection
41	   mechanisms for the IPv6 multihoming feature being developed in the
42	   SHIM6 working group.  Elements of this document may also be useful
43	   for developing the details of the MOBIKE or HIP multihoming
44	   mechanisms.  The draft also discusses the roles of a multihoming
45	   protocol versus network attachment functions at IP and link layers.

47	Table of Contents

49	   1.  Introduction
50	   2.  Related Work
51	   3.  Definitions
52	         3.1   Available Addresses
53	         3.2   Locally Operational Addresses
54	         3.3   Operational Address Pairs
55	         3.4   Primary Address Pair
56	         3.5   Miscellaneous
57	   4.  Architectural Considerations
58	   5.  An Approach
59	         5.1   State Machine for Addresses
60	         5.2   State Machine for Address Pair Selection
61	         5.3   Pair Selection Algorithm
62	         5.4   Protocol for Testing Unidirectional Reachability
63	   6.  Security Considerations
64	   7.  References
65	         7.1   Normative References
66	         7.2   Informative References
67	       Author's Address
68	   A.  Contributors
69	   B.  Acknowledgements
70	       Intellectual Property and Copyright Statements

72	1.  Introduction

74	   The SHIM6 working group is extending IPv6 to support multihoming.  A
75	   number of possible approaches exist in this space, but the current
76	   focus of the group is to look at an IP layer (or layer 3.5) mechanism
77	   that hides multihoming from applications.  This mechanism needs to
78	   detect when a switch to another address or addresses becomes
79	   necessary.  We call this failure detection, because the SHIM6
80	   protocol works primarily as a failover rather than a load balancing
81	   scheme.

83	   This draft discusses what requirements such a component of the SHIM6
84	   protocol has, and how these requirements can be achieved.  The draft
85	   is structured as follows: Section 2 discusses what kind of solutions
86	   have been used in other similar protocols.  Section 3 defines a set
87	   of useful terms and discusses them, and Section 4 discusses the
88	   architectural implications of multihoming at the IP layer.  Finally,
89	   Section 5 describes one possible solution involving two state
90	   machines, a failure testing protocol, and an address pair selection
91	   algorithm.

93	   For the purposes of this draft, we consider an address to be
94	   synonymous with a locator.  There may be other, higher level
95	   identifiers such as security associations, FQDNs, CGA public keys, or
96	   HITs that tie the different locators used by a node together.

98	2.  Related Work

100	   In SCTP [10], the addresses of the endpoints are learned in the
101	   connection setup phase either through listing them explicitly or via
102	   giving a DNS name that points to them.  In order to provide a
103	   failover mechanism between multihomed hosts, SCTP has the following
104	   functions:

106	   o  One of the peer's addresses is selected as the primary address by
107	      the application running on top of SCTP.  All data packets are sent
108	      to this address until there is a reason to choose another address,
109	      such as the failure of the primary address.

111	   o  Testing the reachability of the peer endpoint's addresses.  This
112	      is done either by observing the data packets sent to the peer or
113	      via a periodic heartbeat when there are no data packets to send.

115	      Each time data packet retransmission is initiated (or when a
116	      heartbeat is not answered within the estimated round-trip time) an
117	      error counter is incremented.  When a configured error limit is
118	      reached, the particular destination address is marked as inactive.
119	      The reception of an acknowledgement or heartbeat response clears
120	      the counter.

122	   o  Retransmission: When retransmitting, the endpoint attempts to
123	      pick the source-destination pair most "divergent" from the
124	      original source-destination pair to which the packet was
125	      transmitted.  Rules for such selection are, however, left as
126	      implementation decisions in SCTP.
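
The error-counting behaviour described in the list above can be sketched as follows. This is only an illustration under stated assumptions, not SCTP's normative algorithm: the class name, the method names, and the default limit of 5 are hypothetical, and the sketch follows the wording above in marking a destination inactive once the limit is reached. How an inactive destination becomes active again is not described above and is left out.

```python
from dataclasses import dataclass


@dataclass
class DestinationState:
    """Per-destination reachability bookkeeping (hypothetical names)."""

    error_limit: int = 5   # assumed value; the real limit is configured
    error_count: int = 0
    active: bool = True

    def on_timeout(self) -> None:
        """A data retransmission started or a heartbeat went unanswered."""
        self.error_count += 1
        if self.error_count >= self.error_limit:
            # The configured error limit has been reached.
            self.active = False

    def on_ack(self) -> None:
        """An acknowledgement or heartbeat response was received."""
        self.error_count = 0


dest = DestinationState(error_limit=3)
dest.on_timeout()
dest.on_ack()            # counter cleared, destination still active
for _ in range(3):
    dest.on_timeout()    # third consecutive error reaches the limit
print(dest.active)
```

A sender following this sketch would stop using a destination whose `active` flag is false and pick another address for subsequent packets.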

128	   SCTP does not define how local knowledge (such as information learned
129	   from the link layer) should be used.  SCTP also has no mechanism to
130	   deal with dynamic changes to the set of available addresses, although
131	   mechanisms for that are being developed [17].

133	   The MOBIKE protocol is currently being designed [15] [14].  This
134	   protocol operates in a mixed IPv4/IPv6 environment, and typically has
135	   to work through NATs.  The current design is assumed to need to work
136	   only in symmetric connectivity scenarios.

138	   Some of the issues that have been discussed in the MOBIKE design
139	   phase include the following:

141	   o  Single address vs. multiple peer addresses.  A simple approach is
142	      to have the peers be aware of just the current address of the
143	      other side instead of all possible ones.  Assuming that one of the
144	      peers will request the other to start sending to a new address
145	      this works well.  However, this approach is unable to deal with
146	      problems that affect both nodes.  For instance, two nodes
147	      connected by two separate point-to-point links will be unable to
148	      switch to the other link if a failure occurs on the first one.

150	   o  Addresses vs. address pairs.  Are tests and current paths
151	      individual peer addresses, or pairs of peer and own addresses
152	      (paths)?  It seems that some failure scenarios require the use of
153	      a path rather than a single address.  A network failure may make
154	      it impossible to communicate between a particular pair of
155	      addresses, even if those addresses have some other connectivity.

157	   o  Where the connectivity information comes from.  Does it come from
158	      local stack (such as interface up/down, router advertisement),
159	      from reception of ESP packets, from IKEv2 keepalives, or through
160	      some MOBIKE-defined mechanism?

162	   The mobility and multihoming specification for the HIP protocol [13]
163	   leaves the determination of when address updates are sent to a local
164	   policy, but suggests the use of local information and ICMP error
165	   messages.

167	   Network attachment procedures are also relevant for multihoming.  The
168	   IPv6 and MIP6 working groups have standardized mechanisms to learn
169	   about networks that a node has attached to.  Basic IPv6 Neighbor
170	   Discovery was, however, designed primarily for static situations.
171	   The fully dynamic detection procedure has turned out to be a
172	   relatively complex procedure for mobile hosts, and it was not fully
173	   anticipated at the time IPv6 Neighbor Discovery or DHCP were being
174	   designed.  As a result, enhanced or optimized mechanisms are being
175	   designed in the DHC and DNA working groups [6] [7].

177	   ICE [16], STUN [11], and TURN [28] are also related mechanisms.  They
178	   are primarily used for NAT detection and communication through NATs
179	   in IPv4 environments, for applications such as voice over IP.  STUN
180	   uses a server in the Internet to discover the presence and type of
181	   NATs and the client's public IP addresses and ports.  TURN makes it
182	   possible to receive incoming connections in hosts behind NATs.  ICE
183	   makes use of these protocols in peer-to-peer cooperative fashion,
184	   allowing participants to discover, create and verify mutual
185	   connectivity, and then use this connectivity for multimedia streams.
186	   While these mechanisms are not designed for dynamic and failure
187	   situations, they have many of the same requirements for the
188	   exploration of connectivity, as well as the requirement to deal with
189	   middleboxes.

191	   Related work in the IPv6 area includes RFC 3484 [5] which defines
192	   source and destination address selection rules for IPv6 in situations
193	   where multiple candidate address pairs exist.  RFC 3484 considers
194	   only a static situation, however, and does not take into account the
195	   effect of failures.  In the MULTI6 working group, [23] considers how
196	   applications can best re-initiate connections after failures.
197	   This work differs from the shim-layer approach selected for
198	   further development in the working group with respect to the timing
199	   of the address selection.  In the shim-layer approach failure
200	   detection and the selection of new addresses happens at any time,
201	   while [23] considers only the case when an application re-establishes
202	   connections.

204	3.  Definitions

206	   This section defines terms useful in discussing the failure detection
207	   problem space.

209	3.1  Available Addresses

211	   SHIM6 nodes need to be aware of what addresses they themselves have.
212	   If a node loses the address it is currently using for communications,
213	   another address must replace this address.  And if a node loses an
214	   address that the node's peer knows about, the peer must be informed.
215	   Similarly, when a node acquires a new address it may generally wish
216	   the peer to know about it.

218	   Definition.  Available address.  An address is said to be available
219	   if the following conditions are fulfilled:

221	   o  The address has been assigned to an interface of the node.

223	   o  If the address is an IPv6 address, we additionally require that
224	      (a) the address is valid in the sense of RFC 2461 [2], and that
225	      (b) the address is not tentative in the sense of RFC 2462 [3].  In
226	      other words, the address assignment is complete so that
227	      communications can be started.

229	      Note this explicitly allows an address to be optimistic in the
230	      sense of [8] even though implementations are probably better off
231	      using other addresses as long as there is an alternative.

233	   o  The address is a global unicast, unique local address [9], or an
234	      unambiguous IPv6 link-local or IPv4 RFC 1918 address.  That is, it
235	      is not an IPv6 site-local address.  Where IPv6 link-local or RFC
236	      1918 addresses are used, their use needs to be unambiguous.  The
237	      precise meaning of ambiguous has not been defined yet, but one
238	      approach is requiring that at most one link-local address be used
239	      per node within the same connection between two peers.

241	         Note: Given the RFC 3484 [5] rules for preferring the smallest
242	         scope, it is likely that many IPv6 flows will at least start
243	         with link-local addresses.

245	   o  The address and interface are acceptable for use according to a
246	      local policy.
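
The conditions in the list above can be read as a simple predicate. The sketch below is one possible reading, not a definitive implementation: the `CandidateAddress` record and its field names are hypothetical, the `unambiguous` flag stands in for the not-yet-defined ambiguity rule, and rejecting multicast, loopback, and unspecified addresses is an added assumption. Treating IPv4 link-local addresses like IPv6 link-local ones is also an assumption.

```python
import ipaddress
from dataclasses import dataclass

# The three RFC 1918 private IPv4 ranges.
RFC1918_NETWORKS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


@dataclass
class CandidateAddress:
    """What a node knows about one of its own addresses (hypothetical)."""

    address: str
    assigned: bool              # assigned to an interface of the node
    valid: bool = True          # IPv6 only: valid in the RFC 2461 sense
    tentative: bool = False     # IPv6 only: tentative in the RFC 2462 sense
    unambiguous: bool = False   # placeholder for the undefined ambiguity rule
    policy_ok: bool = True      # address and interface allowed by local policy


def is_available(c: CandidateAddress) -> bool:
    ip = ipaddress.ip_address(c.address)
    if not (c.assigned and c.policy_ok):
        return False
    # Assumption: these special-purpose addresses are never candidates.
    if ip.is_multicast or ip.is_loopback or ip.is_unspecified:
        return False
    if ip.version == 6:
        if not c.valid or c.tentative:
            return False
        if ip.is_site_local:
            return False
    rfc1918 = ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)
    if ip.is_link_local or rfc1918:
        # Limited-scope addresses qualify only when their use is unambiguous.
        return c.unambiguous
    return True


print(is_available(CandidateAddress("2001:db8::1", assigned=True)))
print(is_available(CandidateAddress("fec0::1", assigned=True)))
```

Optimistic addresses are not modelled separately here; under the definition above they count as available as long as they are not tentative.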

248	   Available addresses are discovered and monitored through mechanisms
249	   outside the scope of SHIM6 (and HIP or MOBIKE).  These mechanisms
250	   include IPv6 Neighbor Discovery and Address Autoconfiguration [2]
251	   [3], DHCP [4], enhanced network detection mechanisms developed by the
252	   DNA working group, and corresponding IPv4 mechanisms, such as [6].

254	3.2  Locally Operational Addresses

256	   Two different granularity levels are needed for failure detection.
257	   The coarser granularity is for individual addresses:

259	   Definition.  Locally Operational Address.  An available address is
260	   said to be locally operational when its use is known to be possible
261	   locally: the interface is up and a relevant default router (if
262	   applicable) is known to be reachable.

264	   Locally operational addresses are discovered and monitored through
265	   mechanisms outside SHIM6 (and HIP or MOBIKE).  These mechanisms
266	   include IPv6 Neighbor Discovery [2], corresponding IPv4 mechanisms,
267	   and link layer specific mechanisms.

269	   Theoretically, it is also possible for hosts to learn about routing
270	   failures for a particular selected source prefix, even if no protocol
271	   exists today to distribute this information in a convenient manner.
272	   The development of such protocols would be possible, however.  One
273	   approach is overloading information in current IPv6 Router
274	   Advertisements (see [23]) or adding some new information in them.
275	   Similarly, hosts could learn information from servers that query the
276	   BGP routing tables [23].

278	3.3  Operational Address Pairs

280	   The existence of locally operational addresses is not, however, a
281	   guarantee that communications can be established with the peer.  A
282	   failure in the routing infrastructure can prevent the sent packets
283	   from reaching their destination.  For this reason we need the
284	   definition of a second level of granularity, for pairs of addresses:

286	   Definition.  Bidirectionally operational address pair.  A pair of
287	   locally operational addresses is said to be a bidirectionally
288	   operational address pair, iff bidirectional connectivity can be
289	   shown between the addresses.  That is, a packet sent with one of the
290	   addresses in the source field and the other in the destination field
291	   reaches the destination, and vice versa.

293	   Unfortunately, there are scenarios where bidirectionally operational
294	   address pairs do not exist.  For instance, ingress filtering or
295	   network failures may result in one address pair being operational in
296	   one direction while another one is operational from the other
297	   direction.  The following definition captures this general situation:

299	   Definition.  Unidirectionally operational address pair.  A pair of
300	   locally operational addresses is said to be a unidirectionally
301	   operational address pair, iff packets sent with the first address as
302	   the source and the second address as the destination can be shown to
303	   reach the destination.
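
The difference between the two definitions can be illustrated with a small sketch. It assumes that a node has somehow collected a set of (source, destination) combinations for which delivery has been shown; the set, the function names, and the address labels A1, B1, and B2 are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (source address, destination address)


def is_unidirectionally_operational(pair: Pair, delivered: Set[Pair]) -> bool:
    """Packets from pair[0] to pair[1] have been shown to arrive."""
    return pair in delivered


def is_bidirectionally_operational(pair: Pair, delivered: Set[Pair]) -> bool:
    """Delivery has been shown in both directions between the two addresses."""
    src, dst = pair
    return (src, dst) in delivered and (dst, src) in delivered


# A1 -> B1 works, but the peer can only answer from B2 (for instance
# because of ingress filtering), so no bidirectional pair exists.
delivered = {("A1", "B1"), ("B2", "A1")}
print(is_unidirectionally_operational(("A1", "B1"), delivered))
print(is_bidirectionally_operational(("A1", "B1"), delivered))
```

In this example each direction has a working pair, yet neither pair is bidirectionally operational, which is the situation the second definition is meant to capture.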

305	   Both types of operational pairs are discovered and monitored through
306	   the following mechanisms:

308	   o  Positive feedback from upper layer protocols.  For instance, TCP
309	      can indicate to the IP layer that it is making progress.  This is
310	      similar to how IPv6 Neighbor Unreachability Detection can in some
311	      cases be avoided when upper layers provide information about
312	      bidirectional connectivity [2].  In the case of unidirectional
313	      connectivity, the upper layer protocol responses come back using
314	      another address pair, but show that the messages sent using the
315	      first address pair have been received.

317	   o  Negative feedback from upper layer protocols.  It is conceivable
318	      that upper layer protocols give an indication of a problem to the
319	      SHIM6 layer.  For instance, TCP could indicate that there's either
320	      congestion or lack of connectivity in the path because it is not
321	      getting ACKs.

323	   o  Explicit reachability tests.  For instance, the IKEv2 keepalive
324	      mechanism can be used to test that the current pair of addresses
325	      is operational.

327	   o  ICMP error messages.  Given the ease of spoofing ICMP messages,
328	      one should be careful to not trust these blindly, however.  Our
329	      suggestion is to use ICMP error messages only as a hint to perform
330	      an explicit reachability test, but not as a reason to disrupt
331	      ongoing communications without other indications of problems.  The
332	      situation may be different when certain verifications of the ICMP
333	      messages are being performed [22].  These verifications can ensure
334	      that (practically) only on-path attackers can spoof the messages.
335	      Such verifications are not possible for all transport protocols,
336	      however.
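
One way to combine the inputs listed above is a small dispatch function. The sketch below is only an illustration: the `Signal` and `Action` names are hypothetical, and treating negative upper layer feedback as a trigger for an explicit test (rather than as proof of failure) is an assumption, made because such feedback may indicate congestion rather than lack of connectivity. ICMP errors are handled as hints only, as suggested above.

```python
from enum import Enum, auto


class Signal(Enum):
    UPPER_LAYER_POSITIVE = auto()   # e.g. TCP reports that it makes progress
    UPPER_LAYER_NEGATIVE = auto()   # e.g. TCP is not getting ACKs
    TEST_SUCCEEDED = auto()         # explicit reachability test answered
    TEST_FAILED = auto()            # explicit reachability test timed out
    ICMP_ERROR = auto()             # unverified ICMP error message


class Action(Enum):
    MARK_OPERATIONAL = auto()
    MARK_NOT_OPERATIONAL = auto()
    SEND_TEST = auto()


def react(signal: Signal) -> Action:
    """Map an input about the current address pair to the next step."""
    if signal in (Signal.UPPER_LAYER_POSITIVE, Signal.TEST_SUCCEEDED):
        return Action.MARK_OPERATIONAL
    if signal is Signal.TEST_FAILED:
        return Action.MARK_NOT_OPERATIONAL
    # ICMP errors and negative upper layer feedback are only hints:
    # verify with an explicit test before disrupting communications.
    return Action.SEND_TEST


print(react(Signal.ICMP_ERROR).name)
```

A design that performs the ICMP verifications mentioned above could reasonably map verified ICMP errors to a stronger action; that case is not modelled here.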

338	   Note that some protocols, such as HIP [13], perform a return
339	   routability test of an address before it is taken into use.  The
340	   purpose of this test is to ensure that fraudulent peers do not trick
341	   others into redirecting traffic streams onto innocent victims [31].
342	   Such tests can at the same time work as a means to ensure that an
343	   address pair is operational.  Note, however, that some advanced
344	   optimizations attempt to postpone the reachability tests so that they
345	   do not increase movement-related latency [29].

347	3.4  Primary Address Pair

349	   Unlike SCTP, which has a specific congestion avoidance design
350	   suitable for multi-homing, IP-layer solutions need to avoid sending
351	   packets concurrently over multiple paths; TCP behaves rather poorly
352	   in such circumstances.  For this reason it is necessary to choose a
353	   particular pair of addresses as the primary address pair which is
354	   used until problems occur, at least for the same session.

356	   A primary address pair need not be operational at all times.  If
357	   there is no traffic to send, we may not know if the primary address
358	   pair is operational.  Nevertheless, it makes sense to assume that the
359	   address pair that worked some time ago continues to work for new
360	   communications as well.

362	3.5  Miscellaneous

364	   Addresses can become deprecated [2].  When other operational
365	   addresses exist, nodes generally wish to move their communications
366	   away from the deprecated addresses.

368	   Similarly, IPv6 source address selection [5] may guide the selection
369	   of a particular source address - destination address pair.

371	4.  Architectural Considerations

373	   Architecturally, a number of questions arise.  One simple question
374	   is whether there needs to be communication between a multihoming
375	   solution residing at the IP layer and upper layer protocols.  Upon
376	   changing to a new address pair, the transport layer protocol SHOULD be
377	   notified so that it can perform a slow start, or some other form of
378	   adaptation to the possibly changed conditions.  This is necessary,
379	   for instance, when switching from a high-bandwidth LAN interface to a
380	   low-bandwidth cellular interface.  (Note that this notification
381	   cannot be done in protocol designs where the end points are not the
382	   final hosts, such as where a gateway is used.)

384	   A more fundamental question is which protocols should be responsible
385	   for which parts of the problem.  It seems clear that no multihoming
386	   solution should take on the task of lower layers and other IP
387	   functions for discovering its own addresses or testing local
388	   connectivity.  Protocols such as DHCP or Neighbor and Router
389	   Discovery do this already.

391	   But it is less clear which protocol(s) should discover end-to-end
392	   connectivity problems or recover from them.  One answer is that this
393	   is clearly within the domain of the multihoming protocol.  By performing
394	   testing and failure detection of the used path and switching to a new
395	   path if necessary, the transport and application protocols can work
396	   unchanged.

398	   On the other hand, one could argue that transport and application
399	   protocols would have more knowledge about the situation, and have a
400	   better ability to decide when a move is required.  For instance, they
401	   know what the required throughput and congestion status is.  Also, it
402	   would be unfortunate if both the IP layer and transport/application
403	   layer took action for the same problem, for instance by switching to
404	   a new address at the IP layer and throttling back due to "congestion"
405	   at the transport layer.

407	   One can also envision that applications would be able to tell the IP
408	   or transport layer that the current connection is unsatisfactory and
409	   an exploration for a better one would be desirable.  This would
410	   require an API to be developed, however.

412	   Generally speaking, we can divide information that a host has into
413	   three categories: local information from "lower layers" such as IPv6
414	   Neighbor Discovery, transit and congestion condition information
415	   either from the multihoming protocol itself or from transport layer
416	   protocols and (where available) ECN, and application layer policies
417	   that dictate what the requirements are for acceptable connections.

419	   The division of work is largely left as an open issue as far as this
420	   document is concerned, but our description works from a point of view
421	   of a multihoming protocol at the IP layer.  We also note that in the
422	   CELP proposal [18], IP, transport, and application layer
423	   entities could all share their connectivity status in a common
424	   information pool.  This may also be a useful approach.

426	   Finally, the last architectural question is about the difference
427	   between mobility and multihoming.  Given our definitions above,
428	   there's no fundamental difference with respect to how the
429	   multihoming/mobility protocol learns the addresses it has available.
430	   However, a practical difference is that in a multihoming scenario
431	   there are alternative addresses, whereas in mobility changes to a new
432	   address are forced due to the old address no longer being available.

434	5.  An Approach

436	   One suggested approach consists of a mechanism for keeping track of
437	   the host's own available addresses, operational addresses, and
438	   operational address pairs.

440	5.1  State Machine for Addresses

442	   Addresses can be in the AVAILABLE and OPERATIONAL states.  The state
443	   transitions relating to this are shown in Figure 1.

445	                     +--------------+
446	     Address becomes |              |
447	     available       |              |
448	   ----------------->|              |
449	                     |  AVAILABLE   |
450	   <-----------------|              |
451	     Address is no   |              |
452	    longer available |              |
453	                     +--------------+
454	                        |       / \
455	                Address |        | Address
456	                becomes |        | is no longer
457	            operational |        | operational
458	                        |        |
459	                       \ /       |
460	                     +--------------+
461	                     |              |
462	     Address is no   |              |
463	    longer available |              |
464	   <-----------------| OPERATIONAL  |
465	                     |              |
466	                     |              |
467	                     |              |
468	                     +--------------+

470	          Figure 1. Address state machine.

472	   When an address becomes operational, it SHOULD be reported as a new
473	   address to the peer.  Similarly, when an address is no longer
474	   operational or available, the peer SHOULD be informed.

476	   In addition, a particular address can be either preferred or
477	   deprecated.  This is not shown in the state machine.
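
The transitions of Figure 1 and the reporting rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `AddressTracker` class, its method names, and the `notify_peer` callback with its "add" and "remove" labels are hypothetical. The sketch also assumes that the peer only needs to hear about the loss of an address it was previously told about, that is, one that had reached the OPERATIONAL state.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class AddrState(Enum):
    AVAILABLE = "AVAILABLE"
    OPERATIONAL = "OPERATIONAL"


class AddressTracker:
    """Tracks the state of the node's own addresses (hypothetical API)."""

    def __init__(self, notify_peer: Callable[[str, str], None]) -> None:
        self._states: Dict[str, AddrState] = {}
        self._notify = notify_peer

    def state(self, addr: str) -> Optional[AddrState]:
        """Return the state of addr, or None if it is not available."""
        return self._states.get(addr)

    def became_available(self, addr: str) -> None:
        self._states.setdefault(addr, AddrState.AVAILABLE)

    def became_operational(self, addr: str) -> None:
        if self._states.get(addr) is AddrState.AVAILABLE:
            self._states[addr] = AddrState.OPERATIONAL
            self._notify("add", addr)       # report as a new address

    def no_longer_operational(self, addr: str) -> None:
        if self._states.get(addr) is AddrState.OPERATIONAL:
            self._states[addr] = AddrState.AVAILABLE
            self._notify("remove", addr)

    def no_longer_available(self, addr: str) -> None:
        previous = self._states.pop(addr, None)
        if previous is AddrState.OPERATIONAL:
            # Assumption: the peer only knew about operational addresses.
            self._notify("remove", addr)


events = []
tracker = AddressTracker(lambda kind, addr: events.append((kind, addr)))
tracker.became_available("2001:db8::1")
tracker.became_operational("2001:db8::1")
tracker.no_longer_available("2001:db8::1")
print(events)
```

The preferred or deprecated status of an address is not modelled, matching the figure.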

479	5.2  State Machine for Address Pair Selection

481	   A node runs the address pair selection state machine to choose the
482	   currently used primary address pair, the one which is used for
483	   sending outgoing packets.  A node runs one of these state machines
484	   towards each different peer, tracking the known address pairs and
485	   their status.  Each peer also has its own state machine for talking
486	   back to the node, and there is no guarantee that the same address
487	   pairs (in reverse order) have the same state.  For instance, the
488	   lack of a bidirectionally operational pair would result in a
489	   different state on each side.

491	   The state machine can be in the NO PRIMARY, TESTING PRIMARY, and
492	   OPERATIONAL PRIMARY states.  The chosen address pair is known to be
493	   operational in the OPERATIONAL PRIMARY state, and is either
494	   unverified or non-operational in the other states.

496	   Figure 2 shows the state machine:

498	                         +----------------+
499	                         |                |
500	                         |                |
501	                         |                |
502	                         |                |
503	                         |       NO       |
504	                         |     PRIMARY    |
505	                         |                |
506	                   +-----|                |<---------------+
507	                   |     |                |                |
508	                   |     +----------------+                |
509	                   |         / \    / \                    |
510	               Add |          |      |                     |
511	             pair: |   Delete |      | Test         Delete |
512	              Send |   pair & |      | fail &       pair & |
513	              test |     Last |      | Last           Last |
514	                   |          |      |                     |
515	                   |     +----------------+                |
516	                   |     |                |                |
517	                   +---->|                |<----+          |
518	                         |                |     | Test     |
519	    Connect: Send test   |                |     | fail &   |
520	   --------------------->|     TESTING    |     | !Last    |
521	                         |     PRIMARY    |+----+          |
522	          +------------->|                |                |
523	          |              |                |<----+          |
524	          |        +---->|                |     |          |
525	          |        |     +----------------+     |          |
526	   Policy | ICMP | |          |      |          |          |
527	   change | Timer: |      ULP |      | Test     | Delete   |
528	          |   Send | feedback:|      | OK:      | pair &   |
529	          |   test |    Reset |      | Reset    | !Last    |
530	          |        |    timer |      | timer    |          |
531	          |        |         \ /    \ /         |          |
532	          |        |     +----------------+     |          |
533	          |        +-----|                |     |          |
534	          |              |                |-----+          |
535	          +--------------|                |                |
536	                         |                |                |
537	                   +-----|   OPERATIONAL  |                |
538	     ULP feedback: |     |     PRIMARY    |                |
539	       Reset timer |     |                |----------------+
540	                   +---->|                |
541	                         |                |
542	                         +----------------+

544	          Figure 2. Pair selection state machine.

546	   The notation used in Figure 2 is explained below:

548	   Connect

550	      An event representing the desire of the application to send a
551	      packet to a new peer, or an indication from a peer wishing to
552	      connect to us.

554	   Test OK

556	      An event representing a successful completion of the reachability
557	      test.

559	   Test fail

561	      An event representing failure to complete the reachability test.

563	   ULP feedback

565	      An event representing positive indication from an upper layer
566	      protocol that the packets we have sent to the peer are getting
567	      through.

569	   ICMP

571	      An event representing the reception of an ICMP error message.

573	   Timer

575	      An event representing the expiry of the timer.

577	   Add pair

579	      An event representing the addition of a new possible address pair,
580	      either through learning a new local address or being told of a new
581	      remote address.  Note that this does not usually result in any
582	      immediate action, unless we are currently lacking an operational
583	      primary pair.

585	   Delete pair

587	      An event representing the deletion of the currently chosen primary
588	      address pair.

590	   Policy change

592	      An event representing the desire of the local or remote end to
593	      change to a different address pair, despite the current one being
594	      operational.  This can be due to the availability of a higher-
595	      bandwidth connection, cost, or other issues.

597	   Last

599	      A condition that tells whether or not the currently chosen primary
600	      pair is the only known address pair.

602	   Send test

604	      An action to initiate the reachability test for a particular pair.
605	      This test is typically embedded in the SHIM6 connection setup
606	      exchange when run initially, and a separate exchange later.

608	      Note that due to potentially asymmetric connectivity, both sides
609	      have to perform their own tests, and make their own primary pair
610	      selections.

612	   Reset timer

614	      An action to reset a timer so that it will send an event after a
615	      specified time.

617	   The state machines also assume an underlying multihoming signaling
618	   capability, consisting of the following abstract message exchanges:

620	   Open

622	      Establishes a connection between the peers.  May also exchange
623	      locator sets and test reachability at the same time.

625	   Test

627	      Verifies reachability using a specific address pair.

629	   Add

631	      Informs the peer about new locators.

633	   Delete

635	      Informs the peer about losing some locators.

637	   Note that the above state machine leaves open how specific address
638	   pairs are chosen, as this will be discussed in the next section.  We
639	   have also, on purpose, decided to avoid attaching functional labels
640	   such as "backup" to other address pairs beyond the primary pair.  It
641	   is our belief that a general design does not need these labels.
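
	   The transitions drawn in Figure 2 can be summarized as a lookup
	   table.  The following is a minimal sketch under our own
	   assumptions: the START pseudo-state (the origin of the Connect
	   arrow), the step() function, and the choice to ignore events
	   that the figure does not draw for a given state are illustrative
	   and are not specified by this document.  State and event labels
	   are taken from the figure.

```python
START = "START"                      # assumed pseudo-state before Connect
NO_PRIMARY = "NO PRIMARY"
TESTING = "TESTING PRIMARY"
OPERATIONAL = "OPERATIONAL PRIMARY"

# (state, event, Last) -> (next state, action).
# Last is None where the figure attaches no Last/!Last condition.
TABLE = {
    (START, "Connect", None): (TESTING, "Send test"),
    (NO_PRIMARY, "Add pair", None): (TESTING, "Send test"),
    (TESTING, "Delete pair", True): (NO_PRIMARY, None),
    (TESTING, "Test fail", True): (NO_PRIMARY, None),
    (TESTING, "Test fail", False): (TESTING, None),
    (TESTING, "Test OK", None): (OPERATIONAL, "Reset timer"),
    (TESTING, "ULP feedback", None): (OPERATIONAL, "Reset timer"),
    (OPERATIONAL, "ULP feedback", None): (OPERATIONAL, "Reset timer"),
    (OPERATIONAL, "ICMP", None): (TESTING, "Send test"),
    (OPERATIONAL, "Timer", None): (TESTING, "Send test"),
    (OPERATIONAL, "Policy change", None): (TESTING, None),
    (OPERATIONAL, "Delete pair", False): (TESTING, None),
    (OPERATIONAL, "Delete pair", True): (NO_PRIMARY, None),
}


def step(state, event, last=False):
    """Return (next_state, action); action is None if the figure shows none.

    `last` is the Last condition: True when the current primary pair is
    the only known address pair.
    """
    for key in ((state, event, last), (state, event, None)):
        if key in TABLE:
            return TABLE[key]
    # Assumption: an event not drawn for this state leaves it unchanged.
    return state, None
```

	   For example, step(TESTING, "Test fail", last=True) yields
	   (NO_PRIMARY, None), matching the "Test fail & Last" arrow.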

643	5.3  Pair Selection Algorithm

645	   The pair selection state machine assumes an ability to pick primary
646	   and alternative address pairs.

648	   This process results in a combinatorial explosion when there are many
649	   addresses on both sides.  Do both sides track all possible
650	   combinations of addresses?  If a failure occurs, shall all
651	   combinations be tested before giving up?  Are such tests performed in
652	   parallel or in sequence, and what kind of backoff procedures should
653	   be applied?

655	   Our suggestion is that nodes MUST first consult RFC 3484 [5] Section
656	   4 rules to determine what combinations of addresses are legal from a
657	   local point of view, as this reduces the search space.  RFC 3484 also
658	   provides a priority ordering among different address pairs, making
659	   the search possibly faster.  Nodes SHOULD also use local information,
660	   such as known quality of service parameters or interface types to
661	   determine what addresses are preferred over others, and try pairs
662	   containing such addresses first.  In some cases we can also learn the
663	   peer's preferences through the multihoming protocol [13].

665	      Discussion note 1: It may also be possible to simulate preferences
666	      by choosing to not tell the peer about some (non-preferred)
667	      addresses.

669	      Discussion note 2: The preferences may either be learned
670	      dynamically or be configured.  It is believed, however, that
671	      dynamic learning based purely on the SHIM6 protocol is too hard
672	      and is not a task for this layer.  Solutions where multiple
673	      protocols share their information in a common pool of locators
674	      could, however, provide this information from transport protocols
675	      [18].

677	   The reception of packets from the peer with a given address pair is a
678	   good hint that the address pair works, particularly when these
679	   packets are authenticated multihoming protocol packets.  However, the
680	   reception of these packets alone is an insufficient reason to switch
681	   to a new address, as in a unidirectional connectivity case the
682	   return path may not work.

684	   One suggested implementation strategy is to record the reachability
685	   test result (an on/off value) and weight it by the age of the
686	   information, so that recently tested address pairs are chosen before
687	   old ones.
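
	   How the age enters the calculation is not fixed by this
	   document.  One possible reading, sketched below, lets the weight
	   of a successful test decay with age so that fresh results rank
	   first.  The exponential decay, the half_life parameter, and the
	   function names are assumptions made for this example.

```python
def freshness_score(test_ok, tested_at, now, half_life=60.0):
    """Score one pair: the on/off result scaled down as the result ages.

    half_life (seconds) is a hypothetical tuning knob, not a value
    taken from this document.
    """
    age = max(0.0, now - tested_at)
    result = 1.0 if test_ok else 0.0
    return result * 0.5 ** (age / half_life)


def rank_pairs(records, now):
    """Order pairs by descending score.

    records maps (local, remote) -> (test_ok, tested_at).
    """
    return sorted(
        records,
        key=lambda pair: freshness_score(*records[pair], now=now),
        reverse=True,
    )
```

	   With this weighting, a pair that passed its test ten seconds ago
	   is tried before one that passed a hundred seconds ago, and both
	   are tried before a pair whose last test failed.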

689	   Out of the set of possible candidate address pairs, nodes SHOULD
690	   attempt a test through all of them, but MUST do this sequentially
691	   (based on an implementation-dependent priority order) and using an
692	   exponential back-off procedure.
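
	   A sequential search with exponential back-off could look roughly
	   as follows.  This is a sketch only: the probe callable, which is
	   assumed to run one reachability test and return True on success,
	   as well as the default timeout and doubling factor, are
	   illustrative choices rather than values mandated here.

```python
def probe_pairs_sequentially(pairs, probe, initial_timeout=0.1, factor=2.0):
    """Try candidate pairs one at a time, in the given priority order.

    probe(pair, timeout) is a caller-supplied (hypothetical) function
    that tests one pair and waits at most `timeout` seconds for the
    result.  The timeout grows by `factor` after every failed trial.
    Returns the first pair whose test succeeds, or None.
    """
    timeout = initial_timeout
    for pair in pairs:
        if probe(pair, timeout):
            return pair
        timeout *= factor        # back off before the next trial
    return None
```

	   Only one test is outstanding at any time, which is what keeps a
	   site-wide outage from triggering a burst of parallel probes.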

694	   This sequential process is necessary in order to avoid a "signaling
695	   storm" when an outage occurs (particularly for a complete site).
696	   However, it also limits the number of addresses that can in practice
697	   be used for multihoming, considering that transport and application
698	   layer protocols will fail if the switch to a new address pair takes
699	   too long.  For instance, assume that the initial timeout value is
700	   0.1 seconds and that there are four addresses on both sides.  Going
701	   through all sixteen address pairs, doubling the timeout at every
702	   trial, would take about 6550 seconds in total!
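
	   The arithmetic behind this example can be checked with a few
	   lines.  The sketch assumes that every trial waits for its full
	   timeout before the next one starts; the function and variable
	   names are ours.  Under that assumption the sixteenth trial alone
	   waits about 3277 seconds, and the sixteen timeouts together add
	   up to about 6554 seconds.

```python
def backoff_schedule(initial_timeout, trials, factor=2.0):
    """Timeout used for each trial when it is multiplied after every one."""
    return [initial_timeout * factor ** i for i in range(trials)]


# Four addresses on each side give 4 * 4 = 16 candidate pairs.
schedule = backoff_schedule(0.1, 4 * 4)
last_wait = schedule[-1]       # timeout of the final trial
total_wait = sum(schedule)     # time spent if every trial times out

print(f"last trial: {last_wait:.1f} s, all trials: {total_wait:.1f} s")
```

	   Either way the delay is far beyond what transport and
	   application protocols tolerate, which is the point of the
	   example.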

704	   Finally, as has been noted in the context of MOBIKE, the existence of
705	   NATs can require that peers continuously monitor the operational
706	   status of address pairs, as otherwise NAT state related to a
707	   particular communication is lost, and the peer on the outer side of
708	   the NAT can no longer reach the peer inside the NAT.

710	5.4  Protocol for Testing Unidirectional Reachability

712	   Testing for reachability is not easy in an environment where
713	   unidirectional reachability is a possibility.  This is because the
714	   test of a single pair may not yield working paths for both the
715	   request and response packets.  The following protocol could be
716	   used to avoid this problem:

718	    Peer A                                        Peer B
719	      |                                             |
720	      |  Poll 1 (src=A1, dst=B1)                    |
721	      |-------------------------------------------->|
722	      |                                             |
723	      |               Poll 2 (src=B1, dst=A1) OK: 1 |
724	      |        X------------------------------------|
725	      |                                             |
726	      |  Poll 3 (src=A2, dst=B1)                    |
727	      |------------------------------X              |
728	      |                                             |
729	      |          Poll 4 (src=B2, dst=A1) OK: 1      |
730	      |<--------------------------------------------|
731	      |                                             |
732	      |  Poll 5 (src=A1, dst=B1) OK: 4              |
733	      |-------------------------------------------->|
734	      |                                             |

736	   When B receives the first Poll message, it records that it has
737	   received it.  B's reply (Poll 2) is lost, however, so A tries again
738	   with another pair (Poll 3).  This is lost too, but B continues its
739	   own testing process by sending its second Poll message (Poll 4),
740	   which is received by A.  The messages carry identifiers, and a list
741	   of the identifiers of messages that the sender had itself
742	   successfully received earlier.

744	   At the end of the example, A and B know that they have a working
745	   path from A to B using (A1, B1) and from B to A using (B2, A1).

747	   More generally, when A decides that it needs to test for
748	   connectivity, it will initiate a set of Poll messages, in sequence,
749	   until it gets a Poll message from B indicating that (a) B has
750	   received one of A's Poll messages and, obviously, (b) that B's Poll
751	   message is getting through.  B uses the same algorithm, but starts
752	   the process from the reception of the first Poll message from A.
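
	   The exchange in the diagram can be replayed with a small
	   simulation.  This is a sketch under stated assumptions, not a
	   definition of the protocol: the Side class, the explore()
	   function, the strict alternation of senders, and the modelling
	   of the network as a set of working (src, dst) pairs are all
	   hypothetical.  Each Poll carries its own identifier and the
	   identifiers of the Polls its sender has received so far.

```python
from itertools import count


class Side:
    def __init__(self, candidates):
        self.candidates = candidates  # (src, dst) pairs in trial order
        self.next_idx = 0
        self.sent = {}                # poll id -> pair it was sent on
        self.received = []            # ids of Polls received from the peer
        self.confirmed = None         # outgoing pair known to work

    def make_poll(self, poll_id):
        # Reuse a confirmed pair; otherwise move to the next candidate.
        if self.confirmed is not None:
            pair = self.confirmed
        else:
            pair = self.candidates[self.next_idx % len(self.candidates)]
            self.next_idx += 1
        self.sent[poll_id] = pair
        return {"id": poll_id, "pair": pair, "ok": list(self.received)}

    def on_poll(self, poll):
        self.received.append(poll["id"])
        # An acknowledged id tells us which of our own pairs got through.
        for acked in poll["ok"]:
            if acked in self.sent and self.confirmed is None:
                self.confirmed = self.sent[acked]


def explore(working, a, b, max_polls=20):
    """Alternate Polls between a and b; return (id, pair, delivered) log."""
    ids = count(1)
    sender, receiver = a, b
    log = []
    for _ in range(max_polls):
        if a.confirmed and b.confirmed:
            break
        # B starts only after it has received a first Poll from A.
        if sender is b and not b.received:
            sender, receiver = receiver, sender
            continue
        poll = sender.make_poll(next(ids))
        delivered = poll["pair"] in working
        if delivered:
            receiver.on_poll(poll)
        log.append((poll["id"], poll["pair"], delivered))
        sender, receiver = receiver, sender
    return log
```

	   Running explore() with working paths (A1, B1) and (B2, A1), and
	   with A trying (A1, B1) then (A2, B1) while B tries (B1, A1) then
	   (B2, A1), reproduces the five Polls of the diagram: A ends with
	   (A1, B1) confirmed and B with (B2, A1).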

754	   Note that this protocol can be implemented in different ways.  One
755	   approach is to rely on data packets, such as TCP payload packets and
756	   acknowledgements.  This method has the benefit that it likely passes
757	   easily through firewalls and other middleboxes.  One exception is
758	   stateful firewalls that wish to know what happened "earlier" in the
759	   connection, but it seems that such firewalls are fundamentally
760	   incompatible with multi-homing anyway.  One drawback of this method
761	   is, however, that the number of available payload packets may not
762	   match the need in a situation where a lot of address pairs need to be
763	   explored.

765	   Another approach is to have a completely separate protocol for the
766	   exploration.  This would need to be explicitly allowed in firewalls
767	   before it could be used.  On the other hand, it would then be very
768	   clear to firewall administrators what they are letting through.

770	6.  Security Considerations

772	   Attackers may spoof various indications from lower layers and the
773	   network in an effort to confuse the peers about which addresses are
774	   or are not working.  For example, attackers may spoof ICMP error
775	   messages in an effort to cause the parties to move their traffic
776	   elsewhere or even to disconnect.  Attackers may also spoof
777	   information related to network attachments, router discovery, and
778	   address assignments in an effort to make the parties believe they
779	   have Internet connectivity when in reality they do not.

781	   This may cause use of non-preferred addresses or even denial-of-
782	   service.

784	   SHIM6 does not provide any protection of its own for indications from
785	   other parts of the protocol stack.  However, MOBIKE is resistant to
786	   incorrect information from these sources in the sense that it
787	   provides its own security for both the signaling of addressing
788	   information as well as actual payload data transmission.  Denial-of-
789	   service vulnerabilities remain, however.  Some aspects of these
790	   vulnerabilities can be mitigated through the use of techniques
791	   specific to the other parts of the stack, such as properly dealing
792	   with ICMP errors [22], link layer security, or the use of [12] to
793	   protect IPv6 Router and Neighbor Discovery.

795	7.  References

797	7.1  Normative References

799	   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
800	        Levels", BCP 14, RFC 2119, March 1997.

802	   [2]  Narten, T., Nordmark, E., and W. Simpson, "Neighbor Discovery
803	        for IP Version 6 (IPv6)", RFC 2461, December 1998.

805	   [3]  Thomson, S. and T. Narten, "IPv6 Stateless Address
806	        Autoconfiguration", RFC 2462, December 1998.

808	   [4]  Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., and M.
809	        Carney, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)",
810	        RFC 3315, July 2003.

812	   [5]  Draves, R., "Default Address Selection for Internet Protocol
813	        version 6 (IPv6)", RFC 3484, February 2003.

815	   [6]  Aboba, B., "Detection of Network Attachment (DNA) in IPv4",
816	        draft-ietf-dhc-dna-ipv4-08 (work in progress), July 2004.

818	   [7]  Choi, J., "Detecting Network Attachment in IPv6 Goals",
819	        draft-ietf-dna-goals-00 (work in progress), June 2004.

821	   [8]  Moore, N., "Optimistic Duplicate Address Detection for IPv6",
822	        draft-ietf-ipv6-optimistic-dad-01 (work in progress), June 2004.

824	   [9]  Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
825	        Addresses", draft-ietf-ipv6-unique-local-addr-05 (work in
826	        progress), June 2004.

828	7.2  Informative References

830	   [10]  Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer,
831	         H., Taylor, T., Rytina, I., Kalla, M., Zhang, L., and V.
832	         Paxson, "Stream Control Transmission Protocol", RFC 2960,
833	         October 2000.

835	   [11]  Rosenberg, J., Weinberger, J., Huitema, C., and R. Mahy, "STUN
836	         - Simple Traversal of User Datagram Protocol (UDP) Through
837	         Network Address Translators (NATs)", RFC 3489, March 2003.

839	   [12]  Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure
840	         Neighbor Discovery (SEND)", RFC 3971, March 2005.

842	   [13]  Nikander, P., "End-Host Mobility and Multi-Homing with Host
843	         Identity Protocol", draft-ietf-hip-mm-00 (work in progress),
844	         October 2004.

846	   [14]  Kivinen, T., "Design of the MOBIKE protocol",
847	         draft-ietf-mobike-design-00 (work in progress), June 2004.

849	   [15]  Eronen, P., "IKEv2 Mobility and Multihoming Protocol (MOBIKE)",
850	         draft-ietf-mobike-protocol-00 (work in progress), June 2005.

852	   [16]  Rosenberg, J., "Interactive Connectivity Establishment (ICE): A
853	         Methodology for Network Address Translator (NAT) Traversal for
854	         Multimedia Session Establishment Protocols",
855	         draft-ietf-mmusic-ice-02 (work in progress), July 2004.

857	   [17]  Stewart, R., "Stream Control Transmission Protocol (SCTP)
858	         Dynamic Address Reconfiguration",
859	         draft-ietf-tsvwg-addip-sctp-10 (work in progress),
860	         January 2005.

862	   [18]  Crocker, D., "Framework for Common Endpoint Locator Pools",
863	         draft-crocker-celp-00 (work in progress), February 2004.

865	   [19]  Dupont, F., "Address Management for IKE version 2",
866	         draft-dupont-ikev2-addrmgmt-05 (work in progress), June 2004.

868	   [20]  Eronen, P., "Mobility Protocol Options for IKEv2 (MOPO-IKE)",
869	         draft-eronen-mobike-mopo-00 (work in progress), July 2004.

871	   [21]  Eronen, P. and H. Tschofenig, "Simple Mobility and Multihoming
872	         Extensions for IKEv2 (SMOBIKE)", draft-eronen-mobike-simple-00
873	         (work in progress), March 2004.

875	   [22]  Gont, F., "ICMP attacks against TCP",
876	         draft-gont-tcpm-icmp-attacks-00 (work in progress),
877	         August 2004.

879	   [23]  Huitema, C., "Address selection in multihomed environments",
880	         draft-huitema-multi6-addr-selection-00 (work in progress),
881	         October 2004.

883	   [24]  Kivinen, T., "MOBIKE protocol",
884	         draft-kivinen-mobike-protocol-00 (work in progress),
885	         March 2004.

887	   [25]  Nordmark, E., "Multihoming without IP Identifiers",
888	         draft-nordmark-multi6-noid-02 (work in progress), July 2004.

890	   [26]  Nordmark, E., "Multihoming using 64-bit Crypto-based IDs",
891	         draft-nordmark-multi6-cb64-00 (work in progress),
892	         November 2003.

894	   [27]  Nordmark, E., "Strong Identity Multihoming using 128 bit
895	         Identifiers (SIM/CBID128)", draft-nordmark-multi6-sim-01 (work
896	         in progress), October 2003.

898	   [28]  Rosenberg, J., "Traversal Using Relay NAT (TURN)",
899	         draft-rosenberg-midcom-turn-05 (work in progress), July 2004.

901	   [29]  Vogt, C., Arkko, J., Bless, R., Doll, M., and T. Kuefner,
902	         "Credit-Based Authorization for Mobile IPv6 Early Binding
903	         Updates", draft-vogt-mipv6-credit-based-authorization-00 (work
904	         in progress), May 2004.

906	   [30]  Ylitalo, J., "Weak Identifier Multihoming Protocol (WIMP)",
907	         draft-ylitalo-multi6-wimp-01 (work in progress), July 2004.

909	   [31]  Aura, T., Roe, M., and J. Arkko, "Security of Internet Location
910	         Management", In Proceedings of the 18th Annual Computer
911	         Security Applications Conference, Las Vegas, Nevada, USA.,
912	         December 2002.

914	Author's Address

916	   Jari Arkko
917	   Ericsson
918	   Jorvas  02420
919	   Finland

921	   Email: jari.arkko@ericsson.com

923	Appendix A.  Contributors

925	   This draft attempts to summarize the thoughts and unpublished
926	   contributions of many people, including the MULTI6 WG design team
927	   members Marcelo Bagnulo Braun, Iljitsch van Beijnum, Erik Nordmark,
928	   Geoff Huston, Margaret Wasserman, and Jukka Ylitalo, the MOBIKE WG
929	   contributors Pasi Eronen, Tero Kivinen, Francis Dupont, Spencer
930	   Dawkins, and James Kempf, and my colleague Pekka Nikander at
931	   Ericsson.  This draft is also indebted to work done in the context of
932	   SCTP [10].

934	   The protocol design in Section 5.4 is due to Erik, Marcelo, and
935	   Iljitsch.

937	Appendix B.  Acknowledgements

939	   The author would also like to thank Christian Huitema, Pekka Savola,
940	   and Hannes Tschofenig for interesting discussions in this problem
941	   space, and for their comments on earlier versions of this draft.

943	Intellectual Property Statement

945	   The IETF takes no position regarding the validity or scope of any
946	   Intellectual Property Rights or other rights that might be claimed to
947	   pertain to the implementation or use of the technology described in
948	   this document or the extent to which any license under such rights
949	   might or might not be available; nor does it represent that it has
950	   made any independent effort to identify any such rights.  Information
951	   on the procedures with respect to rights in RFC documents can be
952	   found in BCP 78 and BCP 79.

954	   Copies of IPR disclosures made to the IETF Secretariat and any
955	   assurances of licenses to be made available, or the result of an
956	   attempt made to obtain a general license or permission for the use of
957	   such proprietary rights by implementers or users of this
958	   specification can be obtained from the IETF on-line IPR repository at
959	   http://www.ietf.org/ipr.

961	   The IETF invites any interested party to bring to its attention any
962	   copyrights, patents or patent applications, or other proprietary
963	   rights that may cover technology that may be required to implement
964	   this standard.  Please address the information to the IETF at
965	   ietf-ipr@ietf.org.

967	Disclaimer of Validity

969	   This document and the information contained herein are provided on an
970	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
971	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
972	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
973	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
974	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
975	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

977	Copyright Statement

979	   Copyright (C) The Internet Society (2005).  This document is subject
980	   to the rights, licenses and restrictions contained in BCP 78, and
981	   except as set forth therein, the authors retain all their rights.

983	Acknowledgment

985	   Funding for the RFC Editor function is currently provided by the
986	   Internet Society.