idnits 2.17.1 

draft-worley-sip-happy-earballs-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC6555], [RFC3263]), which
     it shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 682 has weird spacing: '...ordered    uno...'

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document date (March 2, 2017) is 2605 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC4213' is defined on line 1973, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 6555 (Obsoleted by RFC 8305)

  -- Obsolete informational reference (is this intentional?): RFC 5245
     (Obsoleted by RFC 8445, RFC 8839)


     Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	SIPCORE                                                        D. Worley
3	Internet-Draft                                                   Ariadne
4	Intended status: Standards Track                           March 2, 2017
5	Expires: September 3, 2017

7	              Happy Earballs: Success with Dual-Stack SIP
8	                   draft-worley-sip-happy-earballs-01

10	Abstract

12	   TBD: The Session Initiation Protocol (SIP) supports multiple
13	   transports running both over IPv4 and IPv6 protocols.  In more and
14	   more cases, a SIP user agent (UA) is connected to network interfaces
15	   with multiple address families.  In these cases sending a message
16	   from a dual stack client to a dual stack server may suffer from the
17	   issues described in [RFC6555] ("Happy Eyeballs"): the UA attempts to
18	   send the message using IPv6, but IPv6 connectivity is not working to
19	   the server.  This can cause significant delays in the process of
20	   sending the message to the server.  This negatively affects the
21	   user's experience.

23	   TBD: This document builds on [RFC6555] by modifying the procedures
24	   specified in [RFC3263] and related specifications to require that a
25	   client ensure that communication targets are accessible before
26	   sending messages to them, to allow a client to contact targets out of
27	   the order required by other specifications, and to require a client
28	   to properly distribute the message load among targets over time.

30	Status of This Memo

32	   This Internet-Draft is submitted in full conformance with the
33	   provisions of BCP 78 and BCP 79.

35	   Internet-Drafts are working documents of the Internet Engineering
36	   Task Force (IETF).  Note that other groups may also distribute
37	   working documents as Internet-Drafts.  The list of current Internet-
38	   Drafts is at http://datatracker.ietf.org/drafts/current/.

40	   Internet-Drafts are draft documents valid for a maximum of six months
41	   and may be updated, replaced, or obsoleted by other documents at any
42	   time.  It is inappropriate to use Internet-Drafts as reference
43	   material or to cite them other than as "work in progress."

45	   This Internet-Draft will expire on September 3, 2017.

47	Copyright Notice

49	   Copyright (c) 2017 IETF Trust and the persons identified as the
50	   document authors.  All rights reserved.

52	   This document is subject to BCP 78 and the IETF Trust's Legal
53	   Provisions Relating to IETF Documents
54	   (http://trustee.ietf.org/license-info) in effect on the date of
55	   publication of this document.  Please review these documents
56	   carefully, as they describe your rights and restrictions with respect
57	   to this document.  Code Components extracted from this document must
58	   include Simplified BSD License text as described in Section 4.e of
59	   the Trust Legal Provisions and are provided without warranty as
60	   described in the Simplified BSD License.

62	Table of Contents

64	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
65	   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   5
66	   3.  Structure of This Document  . . . . . . . . . . . . . . . . .   8
67	     3.1.  Scope of Applicability  . . . . . . . . . . . . . . . . .   9
68	   4.  Baseline Procedures . . . . . . . . . . . . . . . . . . . . .  10
69	     4.1.  Target Ordering . . . . . . . . . . . . . . . . . . . . .  12
70	     4.2.  Priority Nodes  . . . . . . . . . . . . . . . . . . . . .  13
71	     4.3.  Unordered Nodes . . . . . . . . . . . . . . . . . . . . .  14
72	     4.4.  Combinations of Prioritization and Unordered Nodes  . . .  14
73	     4.5.  Load-Balancing Nodes  . . . . . . . . . . . . . . . . . .  17
74	   5.  Primary Procedure Modifications . . . . . . . . . . . . . . .  18
75	     5.1.  Permitted to Reorder Targets  . . . . . . . . . . . . . .  18
76	     5.2.  Must Preserve Traffic Distribution  . . . . . . . . . . .  18
77	   6.  Additional Requirements . . . . . . . . . . . . . . . . . . .  19
78	     6.1.  Address Family Preference . . . . . . . . . . . . . . . .  19
79	     6.2.  Address Selection . . . . . . . . . . . . . . . . . . . .  20
80	     6.3.  Vias  . . . . . . . . . . . . . . . . . . . . . . . . . .  21
81	     6.4.  DNS Caching . . . . . . . . . . . . . . . . . . . . . . .  21
82	     6.5.  Unused Flows  . . . . . . . . . . . . . . . . . . . . . .  21
83	     6.6.  Debugging and Troubleshooting . . . . . . . . . . . . . .  22
84	   7.  Consequences of the New Requirements  . . . . . . . . . . . .  22
85	   8.  Generic Algorithm . . . . . . . . . . . . . . . . . . . . . .  26
86	     8.1.  Policies and Implementor Latitude . . . . . . . . . . . .  27
87	     8.2.  Examples  . . . . . . . . . . . . . . . . . . . . . . . .  28
88	       8.2.1.  Two Unordered Targets, Both Cached  . . . . . . . . .  28
89	       8.2.2.  Two Unordered Targets, Both Cached, One Slow  . . . .  29
90	       8.2.3.  Two Unordered Targets, One Cached . . . . . . . . . .  30
91	       8.2.4.  Two Unordered Targets, Neither Cached . . . . . . . .  31
92	       8.2.5.  Three Unordered Targets, One Cached . . . . . . . . .  31
93	       8.2.6.  Two Prioritized Targets, Both Cached  . . . . . . . .  32
94	       8.2.7.  Two Prioritized Targets, the Second Cached  . . . . .  33
95	       8.2.8.  Two Prioritized Targets, the First Cached . . . . . .  34
96	       8.2.9.  Two Prioritized Targets, Neither Cached . . . . . . .  34
97	       8.2.10. Three Targets . . . . . . . . . . . . . . . . . . . .  35
98	   9.  Using a Simplified Data Structure . . . . . . . . . . . . . .  36
99	   10. Security Considerations . . . . . . . . . . . . . . . . . . .  39
100	   11. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  39
101	   12. History . . . . . . . . . . . . . . . . . . . . . . . . . . .  40
102	     12.1.  Changes from draft-worley-sip-happy-earballs-00 to
103	            draft-worley-sip-happy-earballs-01 . . . . . . . . . . .  40
104	     12.2.  Changes from draft-worley-sip-he-connection-01 to draft-
105	            worley-sip-happy-earballs-00 . . . . . . . . . . . . . .  40
106	     12.3.  Changes from draft-worley-sip-he-connection-00 to draft-
107	            worley-sip-he-connection-01  . . . . . . . . . . . . . .  40
108	     12.4.  Changes from draft-johansson-sip-he-connection-01 to
109	            draft-worley-sip-he-connection-00  . . . . . . . . . . .  40
110	   13. References  . . . . . . . . . . . . . . . . . . . . . . . . .  41
111	     13.1.  Normative References . . . . . . . . . . . . . . . . . .  41
112	     13.2.  Informative References . . . . . . . . . . . . . . . . .  42
113	   Appendix A.  Implementing Load Balancing  . . . . . . . . . . . .  43
114	   Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . .  43
115	   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  44

117	1.  Introduction

119	                            Earballs -- n., another word for ears.
120	                             Made famous by the animated American TV
121	                             spy comedy, "Archer".

123	                             "Ow, my earballs!" -- Cheryl Tunt, "Archer"

125	                             -- from "Urban Dictionary"

127	   The Session Initiation Protocol (SIP) [RFC3261] and the documents
128	   that extended it provide support for both IPv4 and IPv6.  However,
129	   this support has problems with environments that are characteristic
130	   of the transitional migratory phase from IPv4 to IPv6 networks:
131	   During this phase, many server and client implementations run on
132	   dual-stack hosts.  In such environments, a dual-stack host will
133	   likely suffer greater connection delay, and by extension an inferior
134	   user experience, than an IPv4-only host.

136	   The difficulty stems from the reality that a device cannot predict
137	   whether apparent IPv6 connectivity to another device is usable; both
138	   devices may have IPv6 addresses and yet some transit network between
139	   the two may not transport IPv6.  SIP requires a device that transmits
140	   a request to one destination address (e.g., the apparently useful
141	   IPv6 address) to wait for a response for a substantial period
142	   (usually 32 seconds) before transmitting the request to another
143	   destination address (the less-preferred IPv4 address).  The result is
144	   that apparent IPv6 connectivity that is not functional can cause
145	   substantial delays in processing SIP requests.  Especially when the
146	   requests are call setups (INVITE requests) this creates a very poor
147	   user experience.

149	   The overall procedure for transmitting a SIP message is as follows:
150	   The specification of where a message is to be sent is called a
151	   "goal"; in the common case, the goal is the host-port part of a SIP
152	   URI.  The SIP client uses specified algorithms (particularly
153	   [RFC3263]) to make DNS lookups to determine a prioritized set of
154	   "targets" (basically, destination IP address family, IP address,
155	   transport protocol, and port).  Under these specifications, the
156	   client contacts the targets in order until one is contacted
157	   successfully.  In order to contact a target, the client establishes a
158	   transport connection (if necessary), sends the message using the
159	   transport (possibly resending the message several times), and then
160	   (for requests) waits for a response (either provisional or final).
161	   The process ends successfully if the client receives a response.  The
162	   process ends unsuccessfully if the client receives a permanent error
163	   from the transport layer or if a SIP timer (Timer B or Timer F in
164	   [RFC3261]) expires.  Timeouts generally default to 32 seconds.  If
165	   the user has to wait for even one timeout, this will seriously
166	   degrade the user experience.  Thus, it is desirable to minimize the
167	   number of times the client has timeouts when sending requests.

169	   If the target list contains both IPv6 addresses and IPv4 addresses,
170	   this procedure can degrade the user's experience in common
171	   situations.  Typically, this problem arises when the client has an
172	   IPv6 interface, the server's preferred address is an IPv6 address,
173	   but the transit networks between the client and server do not carry
174	   IPv6.  This can cause the client to attempt to send a SIP request to
175	   the IPv6 target and then wait for a timeout before it continues with
176	   an IPv4 target.  This problem parallels a problem that was widely
177	   seen in web browsers.  The web browser problem was cured by
178	   specifying that web browsers should use a "Happy Eyeballs" algorithm
179	   [RFC6555] to determine the order in which to contact target
180	   addresses.

182	   This document builds on the concepts of [RFC6555] by amending these
183	   procedures, so that targets derived from a goal may be contacted in a
184	   different order than is specified by the previous specifications.  By
185	   analogy with the name "Happy Eyeballs" for similar algorithms in web
186	   browsers, we label these algorithms "Happy Earballs" [UD].

188	   Conceptually, the Happy Earballs algorithm is composed of a number of
189	   interlocking components:

191	   a.  For a given message, the set of targets is organized into a
192	       directed acyclic graph (or partial order) specifying which
193	       targets must be contacted before other targets.

195	   b.  Targets which are unresponsive, or which respond too slowly
196	       relative to other targets in the set, are moved to the end of the
197	       order.

199	   c.  Despite this change, the relative proportions of traffic sent to
200	       responsive targets remains as specified by the baseline
201	       specifications.

203	   d.  The client caches the known response times of targets.

205	   e.  The client does not send the message to a target before knowing
206	       the target's response time (unless this is the last target in the
207	       set).  If needed, the client sends a "probe" transmission (which
208	       has no semantic effect) to determine the target's response time.

210	   These components are assembled into an event-driven algorithm for
211	   sending the message to targets and, when necessary, sending probes to
212	   targets.  Implementors have flexibility in choosing among targets
213	   that are not prioritized, and choosing when to send probes to
214	   targets.  Implementors also have flexibility defining when one target
215	   is considered "too slow" compared to another target.  (TBD: Is that
216	   true?)  Nonetheless, all variants of this algorithm have the
217	   properties of distributing traffic among responsive targets as
218	   specified by the baseline specifications and not sending messages to
219	   non-responsive targets (unless all responsive targets have failed).

221	   This document does not address the similar problem for media (RTP and
222	   RTCP) packets.  Problems with media connectivity can be addressed
223	   using ICE [RFC5245], ALTC [RFC6947], or probing using RTP packets.

225	2.  Terminology

227	   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
228	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
229	   document are to be interpreted as described in RFC 2119 [RFC2119].

231	   baseline:  the prior specifications of the behavior of a client for
232	      sending a message to a goal.  The baseline specifications are
233	      modified by this document.

235	   cache:  (verb) to store temporarily information regarding the
236	      response time of a target so as to accelerate future message
237	      transmission; (noun) the collection of information so stored

239	   client:  the device that must send a message

241	   CRTT(T):  for a target T, the cached round-trip response time of T
242	      (if any is cached).  There is a special value of "infinity" if T
243	      did not respond at all.  Collectively, these values are called
244	      "CRTT() values".  Contrasted with RTT(T), which is the actual
245	      round-trip response time (at some given moment).

247	   dual-stack:  narrowly, a device with both IPv4 and IPv6 interfaces;
248	      broadly, a device with more than one interface, including those
249	      with interfaces that use two or more address families

251	   flow:  a group of transmissions to a target which are considered
252	      related by a transport protocol.  For connection-oriented
253	      protocols, is the data carried by a connection.  For
254	      connectionless protocols, is all messages sent to a particular
255	      target (5-tuple).  For protocols with security associations, is
256	      all messages sent within a particular security association.

258	   goal:  the identification of a particular server.  E.g., a URI, a
259	      TSAP, a via-para, or information provided by the context of the
260	      message.

262	   graph:  a directed, acyclic graph whose nodes are a set of targets
263	      and whose arcs are requirements that one target must be sent to
264	      before another target is sent to.  (Mathematically, the graph
265	      defines a partial order on the targets.)  Graph arcs are drawn
266	      without arrowheads, and are to be read as the leftmost endpoint
267	      must be sent to before the rightmost endpoint.  A graph is always
268	      relative to the sending of a particular message and may be
269	      modified during the processing of the message.  A graph may be
270	      augmented with dummy nodes that do not represent targets in order
271	      to reduce the number of arcs that must be drawn to express the
272	      requirements.  (Abstractly, dummy nodes may be considered targets
273	      to which sends immediately fail.)

275	   initial:  a target which has no target prioritized before it
276	      (considered relative to all targets in the target set, or some
277	      subset or rearrangement of the target set, depending on the
278	      context)

280	   Limit(t):  a function converting one time value into another.  If
281	      RTT(T1) > Limit(RTT(T0)), then target T1 responds "too slowly"
282	      relative to the response time of target T0, and T1 is considered
283	      non-responsive.  Depends on two parameters, "m" and "f".
284	      Limit(infinity) is considered to be infinity.

286	   normal:  a target which is not slow (relative to a particular goal)
287	   NSAP:  "Network Service Access Point", the identification of a
288	      network interface, which comprises an address family and an
289	      address

291	   priority:  the situation where the client must send the message to a
292	      first target before it may send the message to a second target.
293	      The prioritization of a set of targets necessarily forms a partial
294	      order and can be represented by a directed acyclic graph.

296	   probe:  a transport operation that attempts to determine if a target
297	      is responsive, without transmitting a message.  Since a probe does
298	      not send a message, if the transmission fails, it does not commit
299	      the client to waiting a timeout period before sending the message
300	      to another target.

302	   quick:  a target whose cached RTT() is less than Limit(0), and thus
303	      is never slow for any goal

305	   RTT(T):  for a target T, the actual round-trip response time of T.
306	      There is a special value of "infinity" if T does not respond at
307	      all.  Collectively, these values are called "RTT() values".
308	      Contrasted with CRTT(T), which is the RTT(T) measurement recorded
309	      in the cache.

311	   responsive:  a property of a target T0 relative to the set of targets
312	      for a goal: the response time of the target T0 is sufficiently
313	      short when compared with the response time of other targets;
314	      RTT(T0) <= RTT(T1) for any T1 in the set of targets.  See TBD
315	      Section 5 for discussion.

317	   send:  (verb and noun) to attempt transmission of a (possibly
318	      modified copy of) a message to a target.  Contrasted with
319	      "successfully send", which is when the message is received by the
320	      server or when the client detects that the message is received by
321	      the server.  Does not include "probe" transport operations.

323	   server:  the (conceptual) device to which a message is to be sent.
324	      May consist of multiple physical devices listening on multiple
325	      network interfaces.

327	   slow:  a target T0 which appears to be "too slow" based on the cached
328	      RTT() information, i.e., there is another target T1 of the goal
329	      for which RTT(T0) > Limit(RTT(T1)) using the cached RTT() values.
330	      This includes the case where RTT(T0) is infinity, i.e., T0 does
331	      not respond at all.  Opposite of "normal".

333	   target:  the complete specification of a transport to be used to send
334	      a message from the client to the server.  A target is commonly
335	      conceptualized as "protocol/address/port" (which is a TSAP), but
336	      the target also includes the TSAP that will be used as the source
337	      of the communication, and so "5-tuple" is more accurate.  In many
338	      cases, the source TSAP is determined by the destination TSAP, so
339	      it is not mentioned.  Perforce, the transport protocol and address
340	      family of the source and destination TSAPs are the same.

342	   timeout:  (noun) the period of time after sending a message which a
343	      client must wait before it is permitted to send the message to
344	      another target without receiving positive indication of the
345	      failure of the first transmission.  For sending requests, either
346	      Timer B or Timer F [RFC3261].  Refers either to the abstract time
347	      interval or a particular instance after a particular send
348	      operation.  (verb) the event when the timeout period has expired.

350	   TSAP:  "Transport Service Access Point", the identification of an
351	      endpoint of a transport flow, which usually comprises a transport
352	      protocol, an NSAP (or network address), and a port number

354	   traffic:  the messages for a particular goal that are successfully
355	      sent to a particular target or set of targets; the number of such
356	      messages sent over a period of time; or the fraction of such
357	      messages relative to all messages for the goal

359	3.  Structure of This Document

361	   This document assumes that the context of the message provides a
362	   "goal", which is the specification of the device or collection of
363	   devices which are the server, and that there are existing "baseline"
364	   specifications which translate the goal into a set of "targets", and
365	   in what order(s) the client may send (possibly modified copies of)
366	   the message to the targets, until one of the send operations is
367	   successful.

369	   Section 4 introduces a model of how the baseline specifications
370	   operate: in every instance, they produce a partial order or directed
371	   acyclic graph of targets specifying which targets must be sent to
372	   before sending to which other targets.  This section also describes
373	   how "load-balanced" targets (SRV records with equal priority) are be
374	   handled within this model.

376	   Section 5describes the major modifications of the procedures for how
377	   a client sends a message to a server.

379	   This document relaxes the requirements on the client regarding the
380	   order(s) in which the message may be sent to the targets, that is, it
381	   permits additional orders, so that the client is less likely to have
382	   to wait for a timeout.  On the whole, when network connectivity is
383	   imperfect, this allows clients to transmit the messages to servers
384	   more quickly than they would using the unmodified baseline
385	   specifications.

387	   However, this document also places additional restrictions on the
388	   client's sending behavior to ensure that the overall traffic
389	   distribution among the targets converges over time to the
390	   distribution that would have resulted from obeying the baseline
391	   specifications.

393	   Section 6 requires certain additional behaviors that ensure that the
394	   use of IPv6 is not disadvantaged in mixed IPv4/IPv6 networks as well
395	   as a number of miscellaneous requirements to optimize the behavior of
396	   clients.

398	   Section 7 discusses some consequences of the new requirements,
399	   including what new orders of targets are permitted, what behaviors
400	   minimize the time needed to successfully send a message, techniques
401	   for probing a target (that is, determining if it is responsive
402	   without sending the message, and thus possibly committing to waiting
403	   for a timeout period), and suitable approaches for caching
404	   information about targets.

406	   These consequences, when assembled with the overarching goal of
407	   minimizing the time to successfully send the message, produce a
408	   generic algorithm, which is succinctly described in event-driven
409	   terms in Section 8.

411	   Section 8.2 applies the generic algorithm to a number of simple
412	   cases, including how it replicates "Happy Eyeballs" behaviors in
413	   analogous situations.

415	   Section 9 constructs a simplified algorithm (which is a subset of the
416	   generic algorithm) that replaces the directed acyclic graph with a
417	   list of lists.

419	3.1.  Scope of Applicability

421	   This document modifies any SIP target selection processes that are
422	   defined now or may be defined in the future, excepting those that
423	   explicitly exempt themselves.

425	   This document does not affect communications specified to be carried
426	   only by a single WebSocket transport, as in those contexts there is
427	   only one transport target (the WebSocket connection), and hence the
428	   target selection process is trivial.

430	   Similarly, this document does not affect sending a message when it is
431	   specified to be carried by a particular existing connection, as when
432	   a client sends a response to a request that it received via a
433	   connection-oriented protocol, and that connection still exists.

435	   A client MUST NOT consider the set of the target URIs of a "forking"
436	   operation to be a single goal to which the processes of this document
437	   apply.  Instead, the modifications MUST be applied to each of those
438	   URIs as separate goals.  This is because the decision of whether to
439	   send a request to a later forking target may be affected by the SIP
440	   response to an earlier transmission ([RFC3261] section 16).  A
441	   forking proxy may, as part of its policy, apply some or all of these
442	   procedures to a forking operation.  This may be useful when sending a
443	   request to a UAS that has registered multiple addresses for the same
444	   AoR (as may be discerned by examining the +sip.instance media feature
445	   tags in the registrations [RFC5626]).  However, any such processing
446	   is outside the scope of this document.

448	4.  Baseline Procedures

450	   The situation that this document addresses is when a SIP device is
451	   required to send a message (which may be either a request or a
452	   response).  This document uses the term "client" for the device which
453	   must send the message.  The client is given a "goal", which is the
454	   specification of the "server", which is the (possibly composite)
455	   device to which the message is to be sent.  (Both of these usages are
456	   broader than the usage in [RFC3261].)  The purpose of client is to
457	   successfully send the message to the server.

459	   (Note that in the case of a request, when the message is sent to a
460	   target, a Via header field will be added to the message, and that the
461	   added Via header field will be different for each target.  If the
462	   client is a UAC, it may effectively use the SIP signaling path as a
463	   connectivity check for the media path by varying the SDP between
464	   requests sent to different targets.  This document considers all of
465	   these versions of the message to be copies of the original message to
466	   be sent.)

468	   If the message is a request, the goal is usually the hostport part of
469	   the URI in either the first Route header field or the request-line.
470	   If the message is a response, the goal can be specified by the first
471	   via-param in the first Via header field.  If the message is to be
472	   sent to an outbound proxy as specified by a DHCP option ([RFC3319] or
473	   [RFC3361]), then the goal is the ordered list of addresses or domain
474	   names provided by the DHCP option.  In other situations, the goal may
475	   be specified by other means.

477	   Baseline specifications (e.g., [RFC3263], [RFC3319], [RFC3361],
478	   [RFC6724], [RFC7984]) prescribe the construction of a set "targets"
479	   which are potential transport destinations to which the message can
480	   be sent.  Which specifications apply is determined by the context of
481	   the message.  Targets are commonly conceptualized as
482	   protocol/address/port combinations, but in general they are the pairs
483	   of source and destination TSAPs that provide the full specification
484	   of a transport flow.

486	   For example, the sending of an initial REGISTER message can involve
487	   six steps of expanding the goal into a list of targets:

489	      Selecting one of the SIP domain names from the list provided by a
490	      DHCP option [RFC3319] and [RFC3361].

492	      NAPTR translation from a SIP domain name to a server domain name
493	      [RFC3263].

495	      Selecting a transport protocol (e.g., UDP or TCP), if there is no
496	      transport parameter in the URI.

498	      SRV translation of a server domain name to server interface names
499	      [RFC2782].

501	      A/AAAA translation of a server interface name to server addresses.

503	      Selecting a source address to use to communicate with a server
504	      address [RFC6724].

506	   The process of deriving a set of targets from a goal can be
507	   conceptualized as constructing a tree, with the root node being the
508	   goal and the leaf nodes being the targets (whether or not an
509	   implementation constructs such a representation).  Each non-leaf node
510	   is expanded into zero or more child nodes by the application of the
511	   appropriate baseline specification.

513	   For a particular node, the relevant baseline specification may
514	   prescribe relationships between the traffic volume sent to the
515	   subsets of targets that are descended from its children.  E.g., a
516	   standard may prescribe prioritization, such that if any target
517	   descended from a higher-priority child is responsive, no traffic
518	   should be sent to any target descended from a lower-priority child.
519	   (SRV records and DHCP options can specify prioritization.)
520	   Similarly, a standard may prescribe load balancing, such that if
521	   there are responsive targets descended from two children, the ratio
522	   of traffic to the two subsets targets descended from the two children
523	   must be a particular (non-zero positive) number.  (SRV records can
524	   specify load balancing.)  Alternatively, a node may place no
525	   restrictions on the traffic to the subsets of targets descended from
526	   its children.

528	   As always, the construction of the tree and the traffic restrictions
529	   incorporated into it may be modified by the local policy.  In this
530	   document, we assume that all modifications are made to the tree that
531	   summarizes the requirements of the baseline specifications.  This
532	   makes it easier to determine the the interaction of local policy with
533	   the procedure modifications of this document.  And this assumption
534	   does not limit the generality of what local policy may do, since the
535	   local policy can remove any ordering restrictions from the tree, thus
536	   permitting almost any behavior by the modified procedures.

538	   The targets as generated by the baseline specifications MAY be
539	   subsetted by deleting any targets that the client cannot access for
540	   reasons such as the client does not implement the protocol, or it
541	   does not have a network interface that supports the protocol, or it
542	   does not have a network interface that can communicate with the
543	   address.  Removing these targets at an early stage of processing does
544	   not affect the on-the-wire behavior of either the baseline processes
545	   or the modified processes, since sends to such targets fail
546	   immediately.

548	   What constitutes failure of a send depends on the situation, and may
549	   be a transport protocol failure, the absence of a timely 100 Trying
550	   response, or a 503 response ([RFC3261] section 21.5.4 and [RFC3263]
551	   section 4.3).  For any particular message, either the message is
552	   successfully sent to exactly one target or the overall sending
553	   process fails.

555	   In the worst-case situation, the process may require waiting for one
556	   or more transaction timeouts (e.g., Timer B or Timer F in [RFC3261])
557	   before successfully transmitting the message to a target.  As the
558	   timeouts are typically 32 seconds, such a wait severely impacts the
559	   user experience.

561	4.1.  Target Ordering

563	   The baseline specifications assume that the client will effectively
564	   generate an order in which to contact the targets, then the client
565	   will sequence through the list, sending the message to each target
566	   until one of the sends is deemed to be successful.  (Each send may
567	   include retransmissions of the message.)  This is because at any
568	   stage, the client's next action is determined only by the goal and
569	   the fact that sending to the previous targets has failed -- the first
570	   target in the order is the target that the client will choose first
571	   (which depends only on the goal), the second target in the order is
572	   the target that the client will choose if and when the send to the
573	   first target fails (which depends only on the goal and the identity
574	   of the first target), etc.

576	   (In mathematical terms, the target order is a total ordering of the
577	   targets that is compatible with the partial ordering of the targets
578	   specified by the traffic restrictions.)

580	   The order must be compatible with the traffic restrictions imposed by
581	   the specifications on the targets, and the traffic restrictions are
582	   coded into the nodes of the target-derivation tree.  The following
583	   subsections discuss how the target order reflects these constraints
584	   for every non-leaf node type.

586	4.2.  Priority Nodes

588	   If a node specifies that its children are prioritized, all of the
589	   targets descended from a higher-priority child must precede all of
590	   the targets descended from a lower-priority child.  Suppose the tree
591	   of targets has one interior node that specifies prioritization of
592	   three targets.  This can result from these DNS records:

594	       _sip._udp.example.com.    SRV    1 1 5060 sip1.example.com
595	       _sip._udp.example.com.    SRV    2 1 5060 sip2.example.com
596	       _sip._udp.example.com.    SRV    3 1 5060 sip3.example.com
597	       sip1.example.com.         A      192.0.2.1
598	       sip2.example.com.         A      192.0.2.2
599	       sip3.example.com.         A      192.0.2.3

601	   We show the tree with the targets from left to right from highest
602	   priority to lowest priority:

604	            |
605	       -priority--
606	       |    |    |
607	       A    B    C

609	   We can then represent the traffic restrictions in a (directed,
610	   acyclic) graph.  In our graphs, the arcs are to be read from left to
611	   right, requiring sending to the left endpoint before sending to the
612	   right endpoint.  In this case, sending to A must be done before
613	   sending to B, and sending to B before sending to C.  We can add dummy
614	   "start" and "finish" nodes, represented by "*", for convenience in
615	   later analysis.  We can imagine them to be targets to which sending
616	   fails immediately.

618	       *----A----B----C----*

620	   There is only one allowed target order:

622	       A B C

624	4.3.  Unordered Nodes

626	   If a node places no traffic restrictions on its children, there is no
627	   collective relationship between the targets descended from its
628	   children.  The targets descended from different children may be
629	   appear in any order, and can even be interleaved.  We show the tree
630	   of a simple example:

632	       sip.example.com.         A      192.0.2.1
633	       sip.example.com.         A      192.0.2.2
634	       sip.example.com.         A      192.0.2.3

636	            |
637	       -unordered-
638	       |    |    |
639	       A    B    C

641	   We then represent the lack of traffic restrictions by a graph which
642	   has no lines between targets.  We add dummy start and finish nodes to
643	   clarify that all three targets are part of one graph:

645	         A
646	        / \
647	       *-B-*
648	        \ /
649	         C

651	   There are six allowed target orders:

653	       A B C

655	       A C B

657	       B A C

659	       B C A

661	       C A B

663	       C B A

665	4.4.  Combinations of Prioritization and Unordered Nodes

667	   A prioritized pair of hosts each with an unordered pair of targets
668	   can be specified with these DNS records:

670	       _sip._udp.example.com.    SRV    1 1 5060 sip1.example.com
671	       _sip._udp.example.com.    SRV    2 1 5060 sip2.example.com
672	       sip1.example.com.         A      192.0.2.1
673	       sip1.example.com.         AAAA   2001:DB8::1
674	       sip2.example.com.         A      192.0.2.2
675	       sip2.example.com.         AAAA   2001:DB8::2

677	   These records result in this tree:

679	                 |
680	           ---priority---
681	           |            |
682	       unordered    unordered
683	       |       |    |       |
684	       A       B    C       D

686	   From this tree, we generate this graph.  We can simplify the graph by
687	   adding a dummy target which must follow both A and B and must precede
688	   both C and D, in order to avoid writing four arcs A-C, A-D, B-C, and
689	   B-D.

691	         A   C
692	        / \ / \
693	       *   *   *
694	        \ / \ /
695	         B   D

697	   There are four allowed target orders:

699	       A B C D

701	       A B D C

703	       B A C D

705	       B A D C

707	   Now consider the situation of a client that has two IPv4 interfaces,
708	   192.0.2.129 and 192.0.2.193 (on two separate /26 subnets), with the
709	   first prioritized before the second.  If the client is configured to
710	   attempt to use every interface for every destination address (rather
711	   than being restricted by the source address selection rules
712	   Section 6.2), every destination address generates two targets, one
713	   with source address 192.0.2.129 and one with source address
714	   192.0.2.193, with the first prioritized before the second.

716	   In this situation, the following DNS records result in an unordered
717	   pair of hosts, each with a prioritized pair of targets.

719	       sip.example.com.         A      192.0.2.1
720	       sip.example.com.         A      192.0.2.65

722	   This situation results in this tree (where we now have to pay
723	   attention to the source addresses in the targets):

725	                                  |
726	                   -----------unordered-----------
727	                   |                             |
728	            ----priority----              ----priority----
729	            |              |              |              |
730	            A              B              C              D
731	       192.0.2.129    192.0.2.193    192.0.2.129    192.0.2.193
732	       192.0.2.1      192.0.2.1      192.0.2.65     192.0.2.65

734	   and this graph:

736	         A----B
737	        /      \
738	       *        *
739	        \      /
740	         C----D

742	   There are six allowed target orders:

744	       A B C D

746	       A C B D

748	       C A B D

750	       A C D B

752	       C A D B

754	       C D A B

756	   When the client is allowed to select any of the available source
757	   addresses for a send, each source address (combined with the
758	   destination address) generates a separate target Section 6.2.  Of the
759	   targets, the one selected by the default source address selection
760	   rules is preferred, and the remainder are unordered.  If there is
761	   only one destination address, it results in a tree with this form:

763	            |
764	       --priority--
765	       |          |
766	       |      -unordered-
767	       |      |    |    |
768	       A      B    C    D

770	   with this graph:

772	                   B
773	                  / \
774	       *----A----*-C-*
775	                  \ /
776	                   D

778	   There are eight allowed target orders:

780	       A B C D

782	       A B D C

784	       A C B D

786	       A C D B

788	       A D B C

790	       A D C B

792	4.5.  Load-Balancing Nodes

794	   The practical difficulty with load-balancing nodes is to ensure that
795	   the right proportion of traffic is sent to the descendants of each
796	   child node without having to maintain long-term records of the amount
797	   of traffic that has been sent to each child's descendants.  A simple
798	   way to accomplish this is to generate a new priority ordering of the
799	   children for each new message to be processed, with the orderings
800	   randomized to provide the correct traffic distribution among the
801	   children.  A randomization algorithm that achieves the correct
802	   traffic distribution is described in Appendix A.

804	   With this method, load-balancing nodes are processed for each message
805	   instance by generating a suitably randomized ordering of the
806	   children, and then processing them in the same way as the children of
807	   a priority node.

809	5.  Primary Procedure Modifications

811	   The following modifications are specified for all baseline
812	   specifications:

814	5.1.  Permitted to Reorder Targets

816	   A client MAY send the message to the targets in an order that is not
817	   permitted by the baseline specifications.

819	5.2.  Must Preserve Traffic Distribution

821	   To state the next requirement, we must define what it means to say
822	   that a target is "responsive".  Intuitively, a target is responsive
823	   if its response time to a message is not "too much" longer than the
824	   response time of any other target for the goal.  Note that the
825	   responsiveness of a target is always defined relative to the set of
826	   targets for a particular goal; hence, a target may be responsive for
827	   one goal at the same time that it is not responsive for another goal.

829	   We define RTT(T) to mean the round-trip response time of a target,
830	   the time it takes to receive confirmation of the receipt of a message
831	   sent to the target.  If the target does not respond at all, we
832	   consider RTT(T) to be "infinity", which is larger than any number.
833	   Note that RTT(T) is a fact about reality at some instant, not a
834	   measure of the client's current knowledge about reality at some
835	   instant.

837	   We define a function "Limit" that converts one time value into
838	   another: if RTT(T1) > Limit(RTT(T0)), then target T1 responds "too
839	   slowly" relative to the response time of target T0, and T1 is
840	   considered non-responsive.  We define Limit(t) = m*t + f, where "m"
841	   and "f" are parameters defined below.

843	   The parameter "m" limits the range of response times that we will
844	   allow among targets we consider responsive.  We set m to be 2.  (TBD:
845	   Is this a good choice?)

847	   The parameter "f" is the length of time that we consider to be
848	   insignificant when comparing the response time of targets.  We set f
849	   to be 2*T1, where T1 is the value of that name used in the procedures
850	   of [RFC3261].  That value is commonly the round-trip time estimate of
851	   the relevant network, and defaults to 500ms.  (TBD: Is this a good
852	   choice?)

854	   The result is that Limit(t) is "twice t, plus a little more to
855	   account for the inherent delays in the network".

857	   A target T0 is defined to be "responsive" if

859	      its response time, RTT(T0), is less than the relevant timeout
860	      (often Timer B or Timer F), and

862	      for any other target T1 of the goal, RTT(T0) < Limit(RTT(T1)).

864	   TBD: The following wording must be set so that the client can move
865	   non-responsive targets to any place in the order (or at least, any
866	   later place in the order) without violating this condition.  It is
867	   not clear this has been accomplished yet.  Later: I think we've
868	   accomplished this now.

870	   We are now ready to specify the major new constraint on the client's
871	   behavior: The client's procedures MUST, over time, distribute the
872	   traffic for any particular goal among the responsive targets in the
873	   same proportions as are required by the baseline specifications.
874	   Specifically,

876	      If the set of targets for a particular goal does not change over a
877	      period of time,

879	      for any two subsets of targets, both of which contain at least one
880	      target which is responsive for the entire period of time, and

882	      if the baseline specifications prescribe a proportion between the
883	      traffic to the two subsets, then

885	      the proportion between the traffic to the responsive subsets of
886	      the two subsets MUST converge to the proportion specified by the
887	      baseline specifications.

889	   Note that this is a "God's-eye view" requirement; the client has no
890	   direct way to determine whether it is satisfying the requirement
891	   because it cannot measure RTT() for every possible target at every
892	   moment.  Nonetheless, the client's procedures must ensure that the
893	   requirement is satisfied under all circumstances.

895	6.  Additional Requirements

897	6.1.  Address Family Preference

899	   Unless overridden by user configuration or by network configuration:
900	   If the host has a policy of preferring one address family, the client
901	   MUST prefer it.  If the host's policy is unknown or not obtainable,
902	   the client MUST prefer IPv6 over IPv4.  Thus, usually clients must
903	   give preference to IPv6 over IPv4.

905	   Such a preference MUST be implemented in the following way: Consider
906	   the "initial" targets, which are the targets which the baseline
907	   specifications do not prioritize after any other targets.  The client
908	   must additionally prioritize the initial targets which are of the
909	   preferred address family before the other initial targets.

911	   TBD: Is this sufficient?  We don't require address family preference
912	   to affect non-initial targets.  Alternatively, if the server has a
913	   lot of IPv6 addresses, none of which are responsive, the only way to
914	   quickly send to an IPv4 address is to send probes to all of the
915	   initial IPv6 addresses and one (almost-initial) IPv4 addresses.  This
916	   is recommended in the probing heuristics, but might require a lot of
917	   probes.  This parallels the advice in RFC 6555 section 5.4 "web site
918	   operators with native IPv6 connectivity SHOULD NOT offer multiple
919	   AAAA records".

921	6.2.  Address Selection

923	   Clients SHOULD provide a mechanism by which the address selection
924	   configuration [RFC6724] can be customized for the client
925	   independently of any other application.

927	   Clients SHOULD implement the destination address selection mechanism
928	   specified in [RFC6724].  Note that this mechanism provides a priority
929	   order among the set of A/AAAA records for a single server host name,
930	   whereas [RFC3263] assumes that such sets of A/AAAA records are
931	   unordered.

933	   Clients SHOULD implement rule 5.5 of section 5 of [RFC6724],
934	   preferring to use a source address with a prefix assigned by the
935	   selected next-hop.  This requires that the IPv6 stack remembers which
936	   next-hops advertised which prefixes.

938	   Clients SHOULD by default use the source address selection mechanism
939	   specified in [RFC6724], which chooses one source TSAP for any
940	   particular destination TSAP.

942	   Clients SHOULD also be configurable to use an alternative mechanism,
943	   in which for any destination TSAP, targets are generated for each
944	   source TSAP that could possibly communicate with the destination
945	   TSAP, with the source TSAP selected by [RFC6724] prioritized over the
946	   other source TSAPs and the other source TSAPs being unordered among
947	   themselves.

949	   The alternative policy is useful in situations where the source
950	   address selection table prioritizes an interface which does not
951	   forward SIP traffic to the destination address.  (For an example,
952	   when the source address selection table routes almost all
953	   destinations to an organizational VPN which has restricted
954	   connectivity.)

956	6.3.  Vias

958	   A client MUST provide a via-param in the first Via header field in
959	   requests that properly routes from the server to the client,
960	   regardless of the presence of NATs in the transportation path.  This
961	   is necessary even when the request is sent via a connection-oriented
962	   protocol, because the connection may be terminated before the
963	   response is sent back to the client and the server may need to
964	   reestablish a connection.  The client SHOULD provide the "rport"
965	   parameter on the via-param [RFC3581].

967	   Additionally, to assist tracing and diagnosis, a client SHOULD
968	   provide the source TSAP that it used in the via-param.  TBD: Is this
969	   too strict?  Is it actually useful?

971	6.4.  DNS Caching

973	   The information a client uses to determine a target set must be up-
974	   to-date.  In particular, the construction of a target set MUST NOT
975	   utilize information from DNS whose TTL has expired.  However, once a
976	   target set has been constructed, it may be retained until the message
977	   has either been sent or failed, without regard to the TTL of the
978	   information used to construct it.

980	   TBD: Should we allow a client to cache target set computations
981	   somewhat longer than the TTL to minimize disruption and DNS traffic?
982	   Phone calls typically take 3 minutes, so we could allow 5 minutes
983	   grace and thus ensure that target sets rarely have to be recomputed
984	   during a call.

986	6.5.  Unused Flows

988	   A flow that is created as a probe but not subsequently used (either
989	   to send the message or to maintain a SIP Outbound flow) SHOULD be
990	   terminated unless there is reason to believe that it will be used in
991	   the near future.  This includes flows that are connection-oriented
992	   protocols as well as non-connection-oriented flows with security
993	   associations.  Minimizing the number of unused connections reduces
994	   the load on the server and on stateful middleboxes.  Also, if the
995	   abandoned connection is IPv4, this reduces IPv4 address sharing
996	   contention in any intermediate NATs.

998	6.6.  Debugging and Troubleshooting

1000	   Happy Earballs is aimed at ensuring a reliable user experience
1001	   regardless of connectivity problems affecting any single transport.
1002	   However, this naturally means that applications employing these
1003	   techniques are by default less useful for diagnosing issues with a
1004	   particular address family or interface.  To assist in that regard, an
1005	   implementation MAY provide a mechanism to disable its Happy Earballs
1006	   behavior via a user setting, and SHOULD provide data useful for
1007	   debugging (e.g., a log or way to review current preferences).

1009	7.  Consequences of the New Requirements

1011	   In this section we explore some of the consequences of these
1012	   requirements and describe possible approaches for designing clients
1013	   that satisfy the modified requirements and provide shorter
1014	   transmission latency.

1016	   A client may send the message to the targets in an order that is not
1017	   permitted by the baseline specifications, but it may not omit any
1018	   targets from its ordering.  Thus, the client is required to send it
1019	   to all the targets before it may declare failure of the send process.
1020	   (TBD: Would it be better to 408 the message faster?)  For example, if
1021	   the client has cached information which indicates that a target is
1022	   unreachable, the client may move that target to the end of the order,
1023	   but if sending to all other targets is unsuccessful, the client must
1024	   send to that target before declaring failure.

1026	   A client may cache measured RTT() values for targets and use this
1027	   information to optimize target orderings; the cached value of RTT(T)
1028	   for a target T we name CRTT(T).  Because a single target may appear
1029	   in the target set for multiple goals, the client should cache RTT()
1030	   for targets (rather than judgments of responsiveness), and then when
1031	   sending to a goal use CRTT() to determine whether a target T is
1032	   responsive relative to the other targets of that particular goal.

1034	   A client may determine a target T1 to be "slow" (relative to a given
1035	   goal) if its CRTT(T1) is greater than Limit(CRTT(T2)) for some other
1036	   target T2 in the goal's target set.  A target that is not slow is
1037	   labeled "normal".  Note that a target being slow is determined by the
1038	   client via a combination of the information in the cache and the
1039	   state of the network at the moments that the cached information was
1040	   recorded.  As it were, the client thinks a slow target is non-
1041	   responsive, but the target may or may not actually be non-responsive
1042	   at that moment, depending on whether the CRTT() values agree with
1043	   current reality, which is the RTT() values.

1045	   If there is be an upper bound on the length of time that a client
1046	   retains CRTT() values, then the client may assume that any slow
1047	   target is non-responsive, in that it may place the target after the
1048	   normal targets in the order.  For this reason, the client is required
1049	   to have an upper bound on the length of time that CRTT() values
1050	   remain in the cache, after which they are either deleted or replaced
1051	   by values based on more recent observations of RTT().

1053	   The client may act on its CRTT() values in this way because in doing
1054	   so, it will not violate the traffic distribution requirement: If a
1055	   target is responsive over a long period, the eventual delete/refresh
1056	   of cached values from before that period ensures that the client will
1057	   eventually see the target as normal.  (2) If a target is not
1058	   continuously responsive over a long period, the traffic distribution
1059	   requirement places no restriction on whether the client sends traffic
1060	   to it or not, and repositioning it after all normal targets does not
1061	   affect the traffic distribution among the normal targets.

1063	   Implementations have flexibility regarding the lifetime of cache
1064	   entries.  The only absolute restriction is that as long as the
1065	   configuration is stable, there is a maximum lifetime that CRTT()
1066	   values are kept before being deleted or refreshed.  The advice in
1067	   [RFC6555] suggests that the upper bound on the lifetime of cache
1068	   entries should on the order of 10 minutes.

1070	   The length of time that different CRTT() values are cached may differ
1071	   from each other.  If the RTT() observation is from a periodic event
1072	   (e.g., registration or NAT binding refresh), the cache lifetime
1073	   should be similar to but longer than the period of the event, as the
1074	   event period is likely to be configured based on knowledge of the
1075	   interval over which the network is likely to remain stable.

1077	   When the state of a source address is changed, or the state of the
1078	   interface it is assigned to changes, or when the network it is
1079	   connected to is re-initialized, cached RTT() values for targets with
1080	   that source address should be deleted.  Interfaces can determine
1081	   network re-initialization by a variety of mechanisms (e.g., [RFC4436]
1082	   and [RFC6059]).

1084	   When a client processes a message, the ordering of targets that it
1085	   sends to must be an ordering permitted by the baseline
1086	   specifications, with the exception that slow targets may be moved
1087	   after all normal targets.

1089	   Since the client's goal is to deliver the message as quickly as
1090	   possible, a client should always move slow targets to the end of the
1091	   order, after the normal targets.  Note that if a probe transmission
1092	   during message processing discovers a target to be slow, the target
1093	   can be moved at that time to after all normal targets.

1095	   A client obtains RTT(T) for a target T whenever it sends to the
1096	   target.  But it can also obtain RTT(T) by a probe, which is any
1097	   transmission to T which requires a response but does not involve
1098	   sending the message (and hence does not commit the client to possibly
1099	   waiting for a timeout before sending to another target).  A client
1100	   may send probes to several targets simultaneously.

1102	   Probe operations include:

1104	   o  establishing a transport connection to the target (without sending
1105	      the message as data on the connection)

1107	   o  sending a keep-alive, as specified by the transport protocol

1109	   o  sending a CR-LF-CR-LF keep-alive on a SIP Outbound flow [RFC5626]

1111	   o  sending a STUN keep-alive message on a SIP Outbound flow [RFC5626]

1113	   o  sending an OPTIONS request with "Max-Forwards: 0".

1115	   (Note that a probe using an OPTIONS request can be used with any
1116	   protocol.  If the OPTIONS reaches the target, the target is required
1117	   to respond with either a 200 or 483 response [RFC3261], without
1118	   forwarding it to another entity, unless the target demands
1119	   authentication, which would be indicated by an immediate 401 or 407
1120	   response.  Conveniently, a server can respond to such a request
1121	   statelessly, so such requests are low-overhead.  (Although the SIP
1122	   Outbound keep-alive methods have even lower overhead.))

1124	   Similarly, if a client has a connection to a target T, and the
1125	   connection has been idle for long enough, the client will not have a
1126	   CRTT(T) for T, reflecting the fact that the connection may have
1127	   failed without the client's knowledge.  The client can refresh
1128	   CRTT(T) by performing a suitable probe operation within the
1129	   connection.

1131	   The client should use a flow or connection that is established to a
1132	   target instead of establishing a new flow or connection to that
1133	   target for sending either a probe or a message.

1135	   If the client initiates a probe of a target T, it may be able to
1136	   decide that T is slow without waiting to determine the actual value
1137	   of RTT(T), which may take as long as the timeout period: the true
1138	   value of RTT(T) is always at least as large as the elapsed time since
1139	   the probe was sent, and the determination of slowness depends on
1140	   whether RTT(T) exceeds Limit(CRTT(Tfastest)) (where Tfastest is the
1141	   target with smallest CRTT() value of any target in the target set).

1143	   If the client establishes a connection to a target without
1144	   simultaneously sending the message, the connection establishment is a
1145	   probe of the target, and after initiating the connection but before
1146	   sending the message, if that probe reveals that the target is slow,
1147	   the client may move that target to the end of the ordering, and
1148	   utilize another target.

1150	   In order to minimize the chance that the client must wait for a
1151	   timeout before sending to another target, the client may send probes
1152	   to targets, and the RTT() values revealed by those probes can change
1153	   what target the client will send to next.  Because of this, the
1154	   client's procedures do not simply convert the tree of targets into an
1155	   ordering of the targets, which the client then follows -- Information
1156	   discovered during the sequence of sends can affect the order targets
1157	   are sent to.

1159	   There is little value sending a probe to a target unless all targets
1160	   of higher priority (1) have been sent the message (and failed), (2)
1161	   have been sent a probe, or (3) have a CRTT() value.

1163	   A target T0 is "quick" if its CRTT(T0) is less than Limit(0), that
1164	   is, if T0 will be normal regardless of the CRTT() values of any other
1165	   target.  If an allowed next target has a cached RTT() value, and it
1166	   is "quick", then it is never slow for any goal.

1168	   If the client has reason to believe that it will soon be asked to
1169	   send a message to a goal with a target T, and if CRTT(T) does not
1170	   exist or is likely to expire before then, the client may decide to
1171	   preemptively refresh the cached value by probing T.

1173	   Similarly, a client may be in a situation where it has advance notice
1174	   that it is likely to need to send a message to a particular target,
1175	   for instance, if the user of a UA begins dialing an outgoing call
1176	   which will be routed through a particular outgoing proxy.  In such a
1177	   situation, the client should consider preemptively probing the
1178	   target.

1180	   Note that the use of probes increases the non-message traffic to the
1181	   targets, and thus has a cost.  A client minimizes the expected
1182	   transmission time by initially probing all of the targets, but that
1183	   strategy maximizes the additional traffic.  A client should weigh the
1184	   tradeoff between improved user experience and increased traffic.  In
1185	   particular, the client should be aware of which messages require
1186	   rapid service for good user experience (e.g., INVITE and BYE) and
1187	   which do not (e.g., REGISTER and re-SUBSCRIBE).

1189	   A client should avoid sending to a target which does not have a
1190	   CRTT() value (unless it is the last remaining target), because the
1191	   target might be non-responsive forcing the client to wait for a
1192	   timeout.  Instead, the client should probe the target before sending
1193	   the message to it.

1195	8.  Generic Algorithm

1197	   Based on these observations, we can construct an event-driven
1198	   algorithm that satisfies the revised requirements and avoids sending
1199	   the message to slow targets.  This algorithm has a number of places
1200	   where the implementor has latitude.

1202	   The primary state of the algorithm is a graph whose nodes are the
1203	   targets which have not yet been sent to and failed.

1205	   o  Each target is annotated with its CRTT() value, or the absence
1206	      thereof.

1208	   o  Each target has an indicator of whether a probe has been sent to
1209	      it, and if so, when the probe was sent.

1211	   o  The graph is initialized with the graph of targets as constructed
1212	      from the target construction tree.

1214	   o  The initial graph is assumed for convenience to have a unique
1215	      rightmost (final) target named Fin, which may be a dummy.  Targets
1216	      determined to be slow will be moved to after target Fin.

1218	   The algorithm uses some data derived from the graph.

1220	   o  S, which is Limit(CRTT(Tfastest)), where Tfastest is the target
1221	      (that is or was in the graph) with the smallest CRTT().  Thus, any
1222	      target T with CRTT(T) > S is slow.

1224	   o  The set of initial targets, targets which (at this time) have no
1225	      targets prioritized before them, and thus could be chosen as the
1226	      next target to send to.

1228	   The algorithm can be most easily described in an event-driven manner,
1229	   where particular situations trigger particular actions:

1231	   1.   When a target T is slow, that is, CRTT(T) > S and it is before
1232	        Fin, it is removed from its position in the graph and moved to a
1233	        new position, after Fin.  (Note that additional arcs may need to
1234	        be inserted where T used to be; if A is before T and T is before
1235	        B, A remains before B.)  (Note that a target can be discovered
1236	        to be slow because a probe has been outstanding longer than S
1237	        without responding, even if S is less than the timeout period.)

1239	   2.   When there is no outstanding send, the client may send to any
1240	        initial target whose CRTT() is known.

1242	   3.   When there is no outstanding send and only one target in the
1243	        graph, the client may send to it.

1245	   4.   When there is an initial dummy target, it may be removed from
1246	        the graph.

1248	   5.   When a target T has no CRTT() and all targets that precede T
1249	        either have a CRTT() or have a probe outstanding, the client may
1250	        send a probe to T.

1252	   6.   When a probe or a send succeeds, fails, or times out, the client
1253	        records CRTT() for that target.

1255	   7.   When the startup of a protocol (e.g., TCP SYN) probes the target
1256	        without sending the message, the client should perform the
1257	        startup as a probe, and send the message as a separate action.

1259	   8.   When a send to a target succeeds, sending to the goal has
1260	        succeeded (and the algorithm terminates).

1262	   9.   When a send to a target fails, the target is removed from the
1263	        graph.

1265	   10.  When the graph is empty (has no nodes), sending to the goal has
1266	        failed (and the algorithm terminates).

1268	8.1.  Policies and Implementor Latitude

1270	   The generic algorithm allows the implementor a considerable degree of
1271	   latitude in making the allowed choices.

1273	   o  If multiple initial targets have a CRTT(), the client can choose
1274	      which to send to, e.g., the one with the smallest CRTT().

1276	   o  The client chooses which targets to probe.  The client should
1277	      balance the value of information to be obtained by probing targets
1278	      with the cost of doing so.  The naive aggressive strategy is to
1279	      probe all initial targets that do not have a CRTT().

1281	   o  When a probe is sent and a response is not received for a time
1282	      period, the client may consider it "lost" and send another probe.

1284	      This time period will likely be longer than the typical network
1285	      RTT but significantly less than the send timeout period.

1287	   o  The client may choose to probe targets to exercise the maximum
1288	      diversity of network paths, covering the range of interfaces and
1289	      address families in the target set.  E.g., a non-initial target
1290	      that provides diversity may be chosen to be probed (which may
1291	      require first probing all of its preceding targets).

1293	   o  If all initial targets with CRTT() have "large" values, the client
1294	      may choose not to send to any of them, and instead send probes to
1295	      other targets which might respond faster.  Note that in this
1296	      situation, probing non-initial targets might be useful; if such a
1297	      probe responds quickly, it might cause the initial targets to
1298	      become slow and to be moved to the end, causing the probed target
1299	      to become initial (and thus available for sending).

1301	   o  The client may add prioritizations between targets which are not
1302	      required to have them due to the baseline specifications (as long
1303	      as they are consistent with the prior prioritizations, i.e., do
1304	      not specify both that A is before B and that B is before A).
1305	      Doing this judiciously may allow the algorithm to use a simpler
1306	      data structure for recording prioritization Section 9

1308	8.2.  Examples

1310	   This section shows examples of the generic algorithm (with naive
1311	   choices for the various "policies") in situations involving various
1312	   combinations of targets with particular properties.

1314	8.2.1.  Two Unordered Targets, Both Cached

1316	   Suppose there are two unordered targets, both of which have cached
1317	   RTT() values less than S.

1319	         A (cached)
1320	        /          \
1321	       *            Fin
1322	        \          /
1323	         B (cached)

1325	   First, the initial dummy node is removed.

1327	         A (cached)
1328	                   \
1329	                    Fin
1330	                   /
1331	         B (cached)

1333	   The client chooses the target with the smallest CRTT(), which we
1334	   assume is A, and sends to it.

1336	   If the send succeeds, the algorithm terminates successfully.

1338	   In the unlikely case that sending to A fails, it is deleted from the
1339	   graph.  (Sending to A will likely succeed, because there is a current
1340	   cached record of successful communication.)

1342	                    Fin
1343	                   /
1344	         B (cached)

1346	   The client chooses B to send to.

1348	   If the send succeeds, the algorithm terminates successfully.

1350	   If sending to B fails, it is deleted from the graph.

1352	                    Fin

1354	   The initial dummy node Fin is deleted from the graph.  The graph
1355	   becomes empty, and the algorithm terminates unsuccessfully.

1357	8.2.2.  Two Unordered Targets, Both Cached, One Slow

1359	   Suppose there are two unordered targets, both of which have cached
1360	   RTT() values, but the value for target A is greater than S.

1362	         A (cached)
1363	        /          \
1364	       *            Fin
1365	        \          /
1366	         B (cached)

1368	   First, the initial dummy node is removed.

1370	         A (cached)
1371	                   \
1372	                    Fin
1373	                   /
1374	         B (cached)

1376	   Target A is examined and found to be slow, so it is moved after Fin.

1378	                    Fin----A (slow)
1379	                   /
1380	         B (cached)

1382	   The client chooses B to send to.

1384	   If the send succeeds, the algorithm terminates successfully.

1386	   If sending to B fails, it is deleted from the graph.

1388	                    Fin----A (slow)

1390	   The initial dummy node is deleted from the graph.

1392	                           A (slow)

1394	   The client sends to A.

1396	8.2.3.  Two Unordered Targets, One Cached

1398	   Suppose there are two unordered targets, only one of which has a
1399	   CRTT():

1401	         A (cached)
1402	        /          \
1403	       *            Fin
1404	        \          /
1405	         ----B-----

1407	   The initial dummy node is deleted from the graph.

1409	         A (cached)
1410	                   \
1411	                    Fin
1412	                   /
1413	             B-----

1415	   Only target A is initial and has a CRTT(), so the client sends to it.

1417	   Since B is initial and has no CRTT(), the client sends a probe to it.

1419	   If the send to A succeeds, the algorithm terminates successfully.

1421	   If the send to A fails, it is deleted from the graph.

1423	                    Fin
1424	                   /
1425	             B-----

1427	   Since B is the only remaining (non-dummy) target, the client sends to
1428	   it.

1430	8.2.4.  Two Unordered Targets, Neither Cached

1432	   Suppose there are two unordered targets, neither of which have CRTT()
1433	   values:

1435	         A
1436	        / \
1437	       *   Fin
1438	        \ /
1439	         B

1441	   The initial dummy node is deleted from the graph.

1443	         A
1444	          \
1445	           Fin
1446	          /
1447	         B

1449	   The client cannot send to either target.  The client sends probes to
1450	   both targets.  When one of the probes returns, let us say it is the
1451	   probe to A, the graph indicates that one target is cached:

1453	         A (cached)
1454	                   \
1455	                    Fin
1456	                   /
1457	             B-----

1459	   Target A now is initial and has a CRTT(), so the client sends to it.

1461	   In the unlikely case that sending to A fails, A is deleted from the
1462	   graph, and the client immediately sends to B because B is the only
1463	   remaining target.

1465	   This situation parallels the standard "Happy Eyeballs" situation in
1466	   HTTP, where the client has two (or more) unordered addresses for the
1467	   server, one IPv4 and one IPv6.  The client requests connections with
1468	   both addresses simultaneously, and the first connection that succeeds
1469	   is used to send the HTTP request [RFC6555].

1471	8.2.5.  Three Unordered Targets, One Cached

1473	   Suppose there are three unordered targets, only one of which has a
1474	   CRTT():

1476	         A (cached)
1477	        /          \
1478	       *-----B------Fin
1479	        \          /
1480	         ----C-----

1482	   The initial dummy node is deleted from the graph.

1484	         A (cached)
1485	                   \
1486	             B------Fin
1487	                   /
1488	             C-----

1490	   Only target A is initial and has a CRTT(), so the client sends to it.

1492	   Since B is initial and has no CRTT(), the client sends a probe to it.
1493	   Similarly for C.

1495	   If the send to A succeeds, the algorithm terminates successfully.

1497	   If the send to A fails, it is deleted from the graph.

1499	             B------Fin
1500	                   /
1501	             C-----

1503	   The client waits until one of the probes (to B and C) either succeeds
1504	   or fails.  Suppose the probe to B succeeds.

1506	             B (cached)------Fin
1507	                            /
1508	             C--------------

1510	   Target B is now initial and has a CRTT(), so the client sends to it.

1512	8.2.6.  Two Prioritized Targets, Both Cached

1514	   Suppose there are two prioritized targets, both of which have CRTT():

1516	       *----A (cached)----B (cached)----Fin

1518	   If neither A nor B is slow, processing proceeds according to the
1519	   baseline specification: the client must send to A first, and then if
1520	   that fails, send to B.  In detail:

1522	   The initial dummy node is deleted from the graph.

1524	            A (cached)----B (cached)----Fin

1526	   The client sends to A.

1528	   If sending to A succeeds, the algorithm terminates successfully.  If
1529	   sending to A fails, A is removed from the graph.

1531	                          B (cached)----Fin

1533	   Target B is now initial, and the client sends to B.

1535	   If sending to B succeeds, the algorithm terminates successfully.  If
1536	   sending to B fails, B is removed from the graph, then Fin is removed
1537	   from the graph, and the algorithm terminates unsuccessfully.

1539	   However, as processing begins, it is possible that A is slow because
1540	   CRTT(A) > S = Limit(CRTT(B)).  In that case A is moved after Fin.

1542	            B (cached)----Fin----A (slow)

1544	   Target B is now initial, and the client sends to B.

1546	   If sending to B succeeds, the algorithm terminates successfully.  If
1547	   sending to B fails, B is removed from the graph, then Fin is removed
1548	   from the graph.  Then A is initial, and the client sends to A.

1550	8.2.7.  Two Prioritized Targets, the Second Cached

1552	   Suppose there are two prioritized targets, but only the second has a
1553	   CRTT():

1555	       *----A----B (cached)----Fin

1557	   The initial dummy node is deleted from the graph.

1559	            A----B (cached)----Fin

1561	   The client sends a probe to A.  If the probe response (the new
1562	   CRTT(A)) is less than S = Limit(CRTT(B), the the graph becomes:

1564	            A (cached)----B (cached)----Fin

1566	   The client then sends to A.

1568	   But if the probe remains outstanding longer than S, A becomes slow
1569	   and is moved to after Fin.

1571	            B (cached)----Fin----A (slow)

1573	   At that point, the client sends to B.

1575	8.2.8.  Two Prioritized Targets, the First Cached

1577	   Suppose there are two prioritized targets, but only the first has a
1578	   CRTT():

1580	       *----A (cached)----B ----Fin

1582	   The initial dummy node is deleted from the graph.

1584	            A (cached)----B----Fin

1586	   The client can send to A immediately.  If CRTT(A) <= Limit(0), the
1587	   client must send to A immediately since A is quick, and cannot be
1588	   revealed to be slow by probing other targets.

1590	   But if RTT(A) is large enough that there is a reasonable chance that
1591	   RTT(B) is smaller than Limit(RTT(A)) (which would make A slow), the
1592	   client may choose to send a probe to B first.  If the probe response
1593	   returns quickly enough, CRTT(A) > Limit(CRTT(B)), which makes A slow.
1594	   Then, A would be moved to after Fin, and the client would send to B.

1596	   At any point before the probe response arrives, the client can decide
1597	   to send to A.  But if the probe does not return within
1598	   Limit(CRTT(B)), the client should immediately send to A, as a will
1599	   not be shown to be slow.

1601	8.2.9.  Two Prioritized Targets, Neither Cached

1603	   Suppose there are two prioritized targets, neither of which have
1604	   CRTT():

1606	       *----A----B----Fin

1608	   The initial dummy node is deleted from the graph.

1610	            A----B----Fin

1612	   The client must now send a probe to A before it can send the message
1613	   to A.  But if A and B use different network paths (as they likely
1614	   will), it is probably advantageous to send probes to both A and B, in
1615	   case B is sufficiently more responsive than A to make A slow and
1616	   override the prioritization.  If the client chooses to send both
1617	   probes:

1619	   If the response to the probe of A returns first, the client
1620	   immediately sends to A.

1622	   If the response to the probe of B returns, and the response to the
1623	   probe of A returns before Limit(CRTT(B)), the client immediately
1624	   sends to A.

1626	   If the response to the probe of B returns, and the response to the
1627	   probe of A does not returns before Limit(CRTT(B)), the client moves A
1628	   to after Fin, and then immediately sends to B.

1630	8.2.10.  Three Targets

1632	   A more complex case is when the client must choose between three
1633	   source addresses for a destination address.  One, the address
1634	   selected by the usual rules rules, is prioritized and the other two
1635	   are unordered.  Consider the example where the destination address is
1636	   192.0.2.1 and the client has interfaces with addresses 192.0.2.65,
1637	   192.0.2.129, and 192.0.2.197, with the first of those being the
1638	   interface on the same subnet as the default outbound gateway.  Then
1639	   the graph is:

1641	                                  B
1642	                             192.0.2.129/
1643	                             192.0.2.1
1644	                 A          /            \
1645	       *----192.0.2.65/----*              Fin
1646	            192.0.2.1       \     C      /
1647	                             192.0.2.197/
1648	                             192.0.2.1

1650	   Assuming that there are no CRTT() values, the client starts by
1651	   probing A.  Unless it is being very conservative with probes, the
1652	   client also sends probes via B and C.

1654	   As the probe responses arrive, CRTT() values are recorded, which can
1655	   cause targets to be declared slow and relocated after Fin.
1656	   Eventually, some target will be both initial and have a CRTT(), and
1657	   the client sends to it.

1659	   If the probe response from A arrives before Limit(RTT(B)) and
1660	   Limit(RTT(C)), the graph is updated to:

1662	                            B
1663	                           / \
1664	            A (cached)----*   Fin
1665	                           \ /
1666	                            C

1668	   At that point, the client sends to A.

1670	   On the other hand, if a probe response arrives from B but one does
1671	   not arrive from A before Limit(CRTT(B)), then A becomes slow, and it
1672	   is moved to after Fin.  Then the dummy target after A and before B
1673	   and C is deleted.

1675	         B (cached)
1676	                   \
1677	                    Fin----A (slow)
1678	                   /
1679	         C---------

1681	   At this point, the client sends to B.

1683	9.  Using a Simplified Data Structure

1685	   Instead of maintaining a directed acyclic graph to control the
1686	   client's operation, the client can replace the graph with a sequence
1687	   of sets of targets based on their "rank".  The rank of a target is
1688	   defined as:

1690	      The rank of an initial target is 0.

1692	      The rank of a non-initial target is 1 more than the highest rank
1693	      of any target that it is prioritized after.

1695	   In this scheme, a target is prioritized after all targets with lower
1696	   ranks.  As processing progresses, all targets in the lowest non-empty
1697	   rank are initial, and all targets in higher ranks are non-initial.

1699	   This is compatible with the generic algorithm because it only adds to
1700	   the prioritization relationships, it does not delete any of them.
1701	   Thus, the behavior it causes is compatible with the Happy Earballs
1702	   requirements.

1704	   For example, consider a server with two addresses, IPv6 and IPv4,
1705	   with IPv6 prioritized via SRV records.  Both addresses accept both
1706	   TCP and UDP traffic:

1708	       _sip._udp.example.com.    SRV    1 1 5060 sip1.example.com
1709	       _sip._udp.example.com.    SRV    2 1 5060 sip2.example.com
1710	       _sip._tcp.example.com.    SRV    1 1 5060 sip1.example.com
1711	       _sip._tcp.example.com.    SRV    2 1 5060 sip2.example.com
1712	       sip1.example.com.         AAAA   2001:DB8::1
1713	       sip2.example.com.         A      192.0.2.1

1715	   Suppose a request can be transported using either TCP or UDP, and the
1716	   situation does not specify a priority between them.  The tree of
1717	   targets is then:

1719	                                       |
1720	                      -------------unordered-----------
1721	                      |                               |
1722	              -----priority-----              -----priority-----
1723	              |                |              |                |
1724	       TCP 2001:DB8::1  TCP 192.0.2.1  UDP 2001:DB8::1  UDP 192.0.2.1

1726	   The graph, annotating each target with its rank, is:

1728	         (0) TCP 2001:DB8::1----(1) TCP 192.0.2.1
1729	        /                                        \
1730	       *                                          *
1731	        \                                        /
1732	         (0) UDP 2001:DB8::1----(1) UDP 192.0.2.1

1734	   Which can be turned into a list of lists as:

1736	       rank 0: TCP 2001:DB8::1, UDP 2001:DB8::1

1738	       rank 1: TCP 192.0.2.1, UDP 192.0.2.1

1740	   The rank representation is functionally equivalent to the following
1741	   graph, which is the original graph with additional lines, showing
1742	   that the rank representation constrains the client's behavior more
1743	   than the original graph does:

1745	         (0) TCP 2001:DB8::1   (1) TCP 192.0.2.1
1746	        /                   \ /                 \
1747	       *                     *                   *
1748	        \                   / \                 /
1749	         (0) UDP 2001:DB8::1   (1) UDP 192.0.2.1

1751	   The rank lists can be built (without first constructing the graph) by
1752	   walking the target tree from top to bottom and left to right (highest
1753	   priority to lowest priority), with each node passing downward MRdown,
1754	   the minimum rank that any descendant target is allowed, and each node
1755	   passing upward MRup, the minimum rank allowed for any target
1756	   prioritized after that node.

1758	   o  The root node's MRdown is 0.

1760	   o  For an unordered node:

1762	      *  Each child's MRdown is the node's MRdown.

1764	      *  The node's MRup is the maximum of the node's MRdown and all of
1765	         its children's MRup's.

1767	   o  For a priority node (with the children ordered by priority):

1769	      *  The first child's MRdown is the node's MRdown.

1771	      *  A later child's MRdown is the preceding child's MRup.

1773	      *  The node's MRup is the final child's MRup, or if there are no
1774	         children, the node's MRdown.

1776	   o  For a load-balancing node, the children are first prioritized
1777	      randomly ([RFC2782] and Appendix A), then processed as for a
1778	      prioritized node.

1780	   o  For a target node:

1782	      *  The target has rank MRdown.

1784	      *  The node's MRup is MRdown + 1.

1786	   Here is the preceding example's tree, with each node annotated with
1787	   its MRdown on the left of the node label and its MRup on the right of
1788	   the node label:

1790	                                      |
1791	                     ---------(0) unordered (2)-------
1792	                     |                               |
1793	             -(0) priority (2)-              -(0) priority (2)-
1794	             |                |              |                |
1795	  (0) TCP 2001:DB8::1 (1)     |   (0) UDP 2001:DB8::1 (1)     |
1796	                              |                               |
1797	                   (1) TCP 192.0.2.1 (2)           (1) UDP 192.0.2.1 (2)

1799	   Note that the target tree does not have to be explicitly constructed;
1800	   it can be implicitly walked by a series of recursive function calls,
1801	   with the functions passing MRdown and MRup values between themselves,
1802	   and each target being inserted into the rank list-of-lists as it is
1803	   generated.

1805	   The address family preference rule Section 6.1 can be implemented
1806	   within the rank representation by first constructing the ranks based
1807	   on the baseline specifications, and then splitting rank 0 into two
1808	   ordered sub-ranks, 0.0 and 0.1, with 0.0 containing all rank 0
1809	   targets of the preferred address family and rank 0.1 containing all
1810	   other rank 0 targets.

1812	   An example of address family preference processing is the ordinary
1813	   case of two prioritized servers each with an IPv6 and IPv4 address:

1815	       _sip._udp.example.com.    SRV    1 1 5060 sip1.example.com
1816	       _sip._udp.example.com.    SRV    2 1 5060 sip2.example.com
1817	       sip1.example.com.         AAAA   2001:DB8::1
1818	       sip1.example.com.         A      192.0.2.1
1819	       sip2.example.com.         AAAA   2001:DB8::2
1820	       sip2.example.com.         A      192.0.2.2

1822	                                       |
1823	                      ----------(0) priority (2)--------
1824	                      |                                |
1825	             -(0) unordered (1)-              -(1) unordered (2)-
1826	             |                 |              |                 |
1827	    (0) 2001:DB8::1 (1)        |   (1) 2001:DB8::2 (2)          |
1828	                               |                                |
1829	                      (0) 192.0.2.1 (1)                (1) 192.0.2.2 (2)

1831	   After splitting rank 0 based on the address family preference, the
1832	   the ranks are:

1834	       rank 0.0: 2001:DB8::1

1836	       rank 0.1: 192.0.2.1

1838	       rank 1:   2001:DB8::2, 192.0.2.2

1840	10.  Security Considerations

1842	   This document changes the order in which a client will send to
1843	   targets but does not change the set of targets that it will send to.
1844	   There are no known SIP systems whose security depends on the order in
1845	   which a client sends to targets.  Given that network connectivity is
1846	   unreliable, it is unlikely that the security of any SIP system
1847	   depends on the reliable ordering of targets.

1849	   The specific security vulnerabilities, attacks and threat models of
1850	   the various protocols mentioned in this document (SIP, DNS, SRV
1851	   records, etc.) are well-documented in their respective
1852	   specifications, and their effect on the security of SIP systems is
1853	   unchanged.

1855	11.  IANA Considerations

1857	   This document does not require any actions by IANA.

1859	12.  History

1861	   Note to RFC Editor: Upon publication, please remove this section.

1863	12.1.  Changes from draft-worley-sip-happy-earballs-00 to draft-worley-
1864	       sip-happy-earballs-01

1866	   Overhaul of the exposition.

1868	12.2.  Changes from draft-worley-sip-he-connection-01 to draft-worley-
1869	       sip-happy-earballs-00

1871	   Complete overhaul.

1873	   Changed "EarBalls" to "Earballs".

1875	12.3.  Changes from draft-worley-sip-he-connection-00 to draft-worley-
1876	       sip-he-connection-01

1878	   Minor changes.

1880	   Add note that WebSocket is out of scope, because there is only one
1881	   possible transport in WebSocket.

1883	12.4.  Changes from draft-johansson-sip-he-connection-01 to draft-
1884	       worley-sip-he-connection-00

1886	   This version has a different name for technical reasons.  It is, in
1887	   reality, the successor to draft-johansson-sip-he-connection-01.

1889	   Move Acknowledgments after References, as that is the style the
1890	   Editor prefers.

1892	   Updated Security Considerations: This increment of the H.E. work does
1893	   not make normative changes in existing SIP.

1895	   Copy a lot of text from RFC 6555, as this I-D is parallel to RFC
1896	   6555.

1898	   Changed "hostname" to "host name", as the latter form is more common
1899	   in RFCs by a moderate margin.

1901	   Revised some of the introduction text to parallel the introduction of
1902	   RFC 7984.

1904	   Changed name of algorithm to "Happy EarBalls", added reference to
1905	   Urban Dictionary.

1907	   Many expansions of the discussion and revisions of the wording.

1909	13.  References

1911	13.1.  Normative References

1913	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1914	              Requirement Levels", BCP 14, RFC 2119,
1915	              DOI 10.17487/RFC2119, March 1997,
1916	              <http://www.rfc-editor.org/info/rfc2119>.

1918	   [RFC2782]  Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
1919	              specifying the location of services (DNS SRV)", RFC 2782,
1920	              DOI 10.17487/RFC2782, February 2000,
1921	              <http://www.rfc-editor.org/info/rfc2782>.

1923	   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
1924	              A., Peterson, J., Sparks, R., Handley, M., and E.
1925	              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
1926	              DOI 10.17487/RFC3261, June 2002,
1927	              <http://www.rfc-editor.org/info/rfc3261>.

1929	   [RFC3263]  Rosenberg, J. and H. Schulzrinne, "Session Initiation
1930	              Protocol (SIP): Locating SIP Servers", RFC 3263,
1931	              DOI 10.17487/RFC3263, June 2002,
1932	              <http://www.rfc-editor.org/info/rfc3263>.

1934	   [RFC3581]  Rosenberg, J. and H. Schulzrinne, "An Extension to the
1935	              Session Initiation Protocol (SIP) for Symmetric Response
1936	              Routing", RFC 3581, DOI 10.17487/RFC3581, August 2003,
1937	              <http://www.rfc-editor.org/info/rfc3581>.

1939	   [RFC6555]  Wing, D. and A. Yourtchenko, "Happy Eyeballs: Success with
1940	              Dual-Stack Hosts", RFC 6555, DOI 10.17487/RFC6555, April
1941	              2012, <http://www.rfc-editor.org/info/rfc6555>.

1943	   [RFC6724]  Thaler, D., Ed., Draves, R., Matsumoto, A., and T. Chown,
1944	              "Default Address Selection for Internet Protocol Version 6
1945	              (IPv6)", RFC 6724, DOI 10.17487/RFC6724, September 2012,
1946	              <http://www.rfc-editor.org/info/rfc6724>.

1948	   [RFC7984]  Johansson, O., Salgueiro, G., Gurbani, V., and D. Worley,
1949	              Ed., "Locating Session Initiation Protocol (SIP) Servers
1950	              in a Dual-Stack IP Network", RFC 7984,
1951	              DOI 10.17487/RFC7984, September 2016,
1952	              <http://www.rfc-editor.org/info/rfc7984>.

1954	13.2.  Informative References

1956	   [I-D.johansson-sip-he-connection]
1957	              Johansson, O., Salgueiro, G., and D. Worley, "Setting up a
1958	              SIP (Session Initiation Protocol) connection in a dual
1959	              stack network using connection oriented transports",
1960	              draft-johansson-sip-he-connection-01 (work in progress),
1961	              October 2016.

1963	   [RFC3319]  Schulzrinne, H. and B. Volz, "Dynamic Host Configuration
1964	              Protocol (DHCPv6) Options for Session Initiation Protocol
1965	              (SIP) Servers", RFC 3319, DOI 10.17487/RFC3319, July 2003,
1966	              <http://www.rfc-editor.org/info/rfc3319>.

1968	   [RFC3361]  Schulzrinne, H., "Dynamic Host Configuration Protocol
1969	              (DHCP-for-IPv4) Option for Session Initiation Protocol
1970	              (SIP) Servers", RFC 3361, DOI 10.17487/RFC3361, August
1971	              2002, <http://www.rfc-editor.org/info/rfc3361>.

1973	   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
1974	              for IPv6 Hosts and Routers", RFC 4213,
1975	              DOI 10.17487/RFC4213, October 2005,
1976	              <http://www.rfc-editor.org/info/rfc4213>.

1978	   [RFC4436]  Aboba, B., Carlson, J., and S. Cheshire, "Detecting
1979	              Network Attachment in IPv4 (DNAv4)", RFC 4436,
1980	              DOI 10.17487/RFC4436, March 2006,
1981	              <http://www.rfc-editor.org/info/rfc4436>.

1983	   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
1984	              (ICE): A Protocol for Network Address Translator (NAT)
1985	              Traversal for Offer/Answer Protocols", RFC 5245,
1986	              DOI 10.17487/RFC5245, April 2010,
1987	              <http://www.rfc-editor.org/info/rfc5245>.

1989	   [RFC5626]  Jennings, C., Ed., Mahy, R., Ed., and F. Audet, Ed.,
1990	              "Managing Client-Initiated Connections in the Session
1991	              Initiation Protocol (SIP)", RFC 5626,
1992	              DOI 10.17487/RFC5626, October 2009,
1993	              <http://www.rfc-editor.org/info/rfc5626>.

1995	   [RFC6059]  Krishnan, S. and G. Daley, "Simple Procedures for
1996	              Detecting Network Attachment in IPv6", RFC 6059,
1997	              DOI 10.17487/RFC6059, November 2010,
1998	              <http://www.rfc-editor.org/info/rfc6059>.

2000	   [RFC6947]  Boucadair, M., Kaplan, H., Gilman, R., and S.
2001	              Veikkolainen, "The Session Description Protocol (SDP)
2002	              Alternate Connectivity (ALTC) Attribute", RFC 6947,
2003	              DOI 10.17487/RFC6947, May 2013,
2004	              <http://www.rfc-editor.org/info/rfc6947>.

2006	   [UD]       "The Jews Who Stole Christmas", , "Urban Dictionary, entry
2007	              'Earballs'", December 2011,
2008	              <http://www.urbandictionary.com/define.php?term=Earballs>.

2010	Appendix A.  Implementing Load Balancing

2012	   Load-balancing is specified by the "weight" field of DNS SRV records.
2013	   The defining algorithm is specified in [RFC2782].  The same result
2014	   can be obtained with a simpler algorithm: For each server, calculate
2015	   a "score": If its weight is 0, its score is "infinity" (in practice,
2016	   100 suffices).  If its weight is non-zero, its score is calculated by
2017	   choosing a random number between 0 and 1, taking the negative of the
2018	   logarithm of that number, and dividing the result by the weight.
2019	   (Thus, the score is always a positive number.)  (The resulting score
2020	   has an exponential distribution whose parameter is the weight.)
2021	   Then, sort the servers into order of increasing scores, so that the
2022	   servers with the smallest scores are used first.

2024	   This alternative algorithm is analyzed and sample implementations are
2025	   provided in the files in the directory
2026	   sipXrouter/sipXtackLib/doc/developer/scores in the GitHub Sipfoundry
2027	   project (https://github.com/sipfoundry/sipXrouter), among other
2028	   repositories.

2030	Acknowledgments

2032	   The authors would like to acknowledge the support and contribution of
2033	   the SIP Forum IPv6 Working Group.  This document is based on a lot of
2034	   tests and discussions at SIPit events, organized by the SIP Forum.

2036	   The foundation of this document is the work done by Olle Johansson
2037	   and Gonzalo Salgueiro in earlier documents, including
2038	   [I-D.johansson-sip-he-connection].  In turn, the foundation of that
2039	   work is [RFC6555], whose authors are Dan Wing and Andrew Yourtchenko.

2041	   Scott O.  Bradner suggested that the formula for determining
2042	   responsiveness should contain a constant term.

2044	   Roman Shpount described the need for configuration to override the
2045	   source address selection mechanism.

2047	   Tolga Asveren suggested requiring "rport".

2049	Author's Address

2051	   Dale R. Worley
2052	   Ariadne Internet Services
2053	   738 Main St.
2054	   Waltham, MA  02451
2055	   US

2057	   Email: worley@ariadne.com