idnits 2.17.1

draft-ietf-v6ops-happy-eyeballs-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 3, 2011) is 4800 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 3484 (Obsoleted by RFC 6724)

  -- Obsolete informational reference (is this intentional?): RFC 2766
     (Obsoleted by RFC 4966)

  -- Obsolete informational reference (is this intentional?): RFC 5245
     (Obsoleted by RFC 8445, RFC 8839)

     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

v6ops                                                            D. Wing
Internet-Draft                                            A. Yourtchenko
Intended status: Standards Track                                   Cisco
Expires: September 4, 2011                                 March 3, 2011


     Happy Eyeballs: Trending Towards Success with Dual-Stack Hosts
                   draft-ietf-v6ops-happy-eyeballs-00

Abstract

   This document describes how a dual-stack client can determine the
   functioning path to a dual-stack server.  This provides a seamless
   user experience during initial deployment of dual-stack networks and
   during outages of IPv4 or outages of IPv6.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 4, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Notational Conventions
   3.  Problem Statement
     3.1.  URIs and hostnames
     3.2.  IPv6 connectivity
   4.  Client Recommendations
     4.1.  Dualstack behavior
     4.2.  Implementation details
     4.3.  Additional Considerations
       4.3.1.  Additional Network and Host Traffic
       4.3.2.  Abandon Non-Winning Connections
       4.3.3.  Flush or Expire Cache
       4.3.4.  Determining Address Type
       4.3.5.  Debugging and Troubleshooting
       4.3.6.  DNS Behavior
       4.3.7.  Middlebox Issues
       4.3.8.  Multiple Interfaces
     4.4.  Content Provider Recommendations
     4.5.  Security Considerations
     4.6.  Acknowledgements
     4.7.  IANA Considerations
   5.  References
     5.1.  Normative References
     5.2.  Informational References
   Authors' Addresses

1.  Introduction

   In order to use HTTP successfully over IPv6, it is necessary that the
   user enjoys nearly identical performance as compared to IPv4.  A
   combination of today's applications, IPv6 tunneling and IPv6 service
   providers, and some of today's content providers all cause the user
   experience to suffer (Section 3).  For IPv6, a content provider may
   ensure a positive user experience by using a DNS white list of IPv6
   service providers who peer directly with them, e.g. [whitelist].
   However, this is not scalable to all service providers worldwide, nor
   is it scalable for other content providers to operate their own DNS
   white list.

   Instead, this document suggests a mechanism for applications to
   quickly determine whether IPv6 or IPv4 is the better choice for
   connecting to a server.  The suggestions in this document provide a
   user experience that is superior to connecting to ordered IP
   addresses, which is helpful during the IPv6/IPv4 transition with
   dual-stack hosts.

   This problem is described also in [RFC1671]: "The dual-stack code
   may get two addresses back from DNS; which does it use?  During the
   many years of transition the Internet will contain black holes.  For
   example, somewhere on the way from IPng host A to IPng host B there
   will sometimes (unpredictably) be IPv4-only routers which discard
   IPng packets.  Also, the state of the DNS does not necessarily
   correspond to reality.  A host for which DNS claims to know an IPng
   address may in fact not be running IPng at a particular moment; thus
   an IPng packet to that host will be discarded on delivery.  Knowing
   that a host has both IPv4 and IPng addresses gives no information
   about black holes.  A solution to this must be proposed and it must
   not depend on manually maintained information.  (If this is not
   solved, the dual stack approach is no better than the packet
   translation approach.)"

   Following the procedures in this document, once a certain address
   family is successful, the application trends towards preferring that
   address family.  Thus, repeated use of the application does not cause
   repeated probes over both address families.

   While the application recommendations in this document are described
   in the context of HTTP clients ("web browsers"), they are also useful
   and applicable to other interactive applications.

   Code which implements some of the ideas described in this document
   has been made available [Perreault] [Andrews].

2.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Problem Statement

   As discussed in more detail in Section 3.1, it is important that the
   same URI and hostname be used for IPv4 and IPv6.  Using separate
   namespaces causes namespace fragmentation, reduces the ability of
   users to share URIs and hostnames, and complicates printed material
   that includes the URI or hostname.

   As discussed in more detail in Section 3.2, IPv6 connectivity is
   sometimes broken entirely or slower than native IPv4 connectivity.

3.1.  URIs and hostnames

   URIs are often used between users to exchange pointers to content --
   such as on social networks, email, instant messaging, or other
   systems.  Thus, production URIs and production hostnames containing
   references to IPv4 or IPv6 will only function if the other party is
   also using an application, OS, and a network that can access the URI
   or the hostname.

3.2.  IPv6 connectivity

   When IPv6 connectivity is impaired, today's IPv6-capable web browsers
   incur many seconds of delay before falling back to IPv4.  This harms
   the user's experience with IPv6 and slows its acceptance, because
   IPv6 is then frequently disabled in its entirety on the end systems
   to improve the user experience.

   Reasons for such failure include no connection to the IPv6 Internet,
   broken 6to4 or Teredo tunnels, and broken IPv6 peering.

   DNS Server                  Client                  Server
       |                          |                       |
 1.    |<--www.example.com A?-----|                       |
 2.    |<--www.example.com AAAA?--|                       |
 3.    |---192.0.2.1------------->|                       |
 4.    |---2001:db8::1----------->|                       |
 5.    |                          |                       |
 6.    |                          |--TCP SYN, IPv6--->X   |
 7.    |                          |--TCP SYN, IPv6--->X   |
 8.    |                          |--TCP SYN, IPv6--->X   |
 9.    |                          |                       |
 10.   |                          |--TCP SYN, IPv4------->|
 11.   |                          |<-TCP SYN+ACK, IPv4----|
 12.   |                          |--TCP ACK, IPv4------->|

                 Figure 1: Existing behavior message flow

   The client obtains the IPv4 and IPv6 records for the server (1-4).
   The client attempts to connect to the server using IPv6, but the IPv6
   path is broken (6-8), which consumes several seconds of time.
   Eventually, the client attempts to connect using IPv4 (10), which
   succeeds.

4.  Client Recommendations

   To provide fast connections for users, clients should make
   connections quickly over various technologies, automatically tune
   themselves to avoid flooding the network with unnecessary connections
   (i.e., for technologies that have not made successful connections),
   and occasionally flush their self-tuning.

4.1.  Dualstack behavior

   If a TCP client supports IPv6 and IPv4 and is connected to IPv4 and
   IPv6 networks, it can perform the procedures described in this
   section.

   DNS Server                  Client                  Server
       |                          |                       |
 1.    |<--www.example.com A?-----|                       |
 2.    |<--www.example.com AAAA?--|                       |
 3.    |---192.0.2.1------------->|                       |
 4.    |---2001:db8::1----------->|                       |
 5.    |                          |                       |
 6.    |                          |==TCP SYN, IPv6===>X   |
 7.    |                          |--TCP SYN, IPv4------->|
 8.    |                          |<-TCP SYN+ACK, IPv4----|
 9.    |                          |--TCP ACK, IPv4------->|
 10.   |                          |==TCP SYN, IPv6===>X   |

               Figure 2: Happy Eyeballs flow 1, IPv6 broken

   In the diagram above, the client sends two TCP SYNs at the same time
   over IPv6 (6) and IPv4 (7).  The IPv6 path is broken, but this has
   little impact on the user because there is no long delay before
   using IPv4.  The IPv6 path is retried until the application gives up
   (10).

   DNS Server                  Client                  Server
       |                          |                       |
 1.    |<--www.example.com A?-----|                       |
 2.    |<--www.example.com AAAA?--|                       |
 3.    |---192.0.2.1------------->|                       |
 4.    |---2001:db8::1----------->|                       |
 5.    |                          |                       |
 6.    |                          |==TCP SYN, IPv6=======>|
 7.    |                          |--TCP SYN, IPv4------->|
 8.    |                          |<=TCP SYN+ACK, IPv6====|
 9.    |                          |<-TCP SYN+ACK, IPv4----|
 10.   |                          |==TCP ACK, IPv6=======>|
 11.   |                          |--TCP ACK, IPv4------->|
 12.   |                          |--TCP RST, IPv4------->|

              Figure 3: Happy Eyeballs flow 2, IPv6 working

   The diagram above shows a case where both IPv6 and IPv4 are working,
   and IPv4 is abandoned (12).

4.2.  Implementation details

   This section details how to provide robust dual stack service for
   both IPv6 and IPv4, so that the user perceives very fast application
   response.

   The TCP client application is configured with one application-wide
   value of P.  A positive value indicates a preference for IPv6 and a
   negative value indicates a preference for IPv4.  A value of 0
   indicates equal weight, which means the A and AAAA queries and
   associated connection attempts will be sent as quickly as possible.
   The absolute value of P is the measure of a delay before initiating a
   DNS lookup and a connection attempt on the other address family.
   Two P values are maintained: one is application-wide and the other is
   specific to each destination (hostname and port).

   The algorithm attempts to delay the DNS query until it expects that
   address family will be necessary; that is, if the preference is
   towards IPv6, then AAAA will be queried immediately and the A query
   will be delayed.
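
   The effect of P on the two address families can be illustrated with
   a short sketch.  This is an illustration only, not part of the
   mechanism's definition: the function name start_delays_ms and the
   unit_ms parameter are hypothetical, and the default of 10
   milliseconds per unit of P is the scale used in the thread steps of
   this section.

```python
from typing import Tuple


def start_delays_ms(p: int, unit_ms: int = 10) -> Tuple[int, int]:
    """Return (ipv6_delay_ms, ipv4_delay_ms) for a preference value p.

    Hypothetical helper for illustration:
      p > 0  prefers IPv6, so only the A query / IPv4 attempt is delayed
      p < 0  prefers IPv4, so only the AAAA query / IPv6 attempt is delayed
      p == 0 means equal weight, so neither is delayed
    """
    delay = abs(p) * unit_ms  # |P| scaled to milliseconds (assumed unit)
    if p > 0:
        return 0, delay
    if p < 0:
        return delay, 0
    return 0, 0
```

   For example, under these assumptions P=5 delays the A query and the
   IPv4 connection attempt by 50 milliseconds, while P=-3 delays the
   AAAA query and the IPv6 connection attempt by 30 milliseconds.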

   The TCP client application starts two concurrent execution flows
   (they will be referred to as "threads" but this reference does not
   imply the implementation detail of using the threading library,
   merely the property of mutual concurrency) in order to minimize the
   user-noticeable delay ("dead time") during the connection attempts:

   thread 1: (IPv6)

      *  If P<0, wait for the absolute value of P*10 milliseconds

      *  Send DNS query for AAAA

      *  Wait until DNS response is received

      *  Attempt to connect over IPv6 using TCP

   thread 2: (IPv4)

      *  If P>0, wait for P*10 milliseconds

      *  Send DNS query for A

      *  Wait until DNS response is received

      *  Attempt to connect over IPv4 using TCP

   The first thread that succeeds returns the completed connection to
   the parent code and aborts the other thread (Section 4.3.2).

   After a connection is successful, we want to adjust the application-
   wide preference and the per-destination preference.  The value of P
   is incremented (decremented) each time an IPv6 (IPv4) connection wins
   the race.  When a connection using the less-preferred address family
   is successful, it indicates the wrong address family was preferred
   and the value of P is halved:

   o  If P>0 (indicating IPv6 is preferred over IPv4) and the first
      thread to finish was the IPv6 thread, it indicates the IPv6
      preference is correct and we need to reinforce this by increasing
      the application-wide P value by 1.  However, if the first thread
      to finish was the IPv4 thread, it indicates an IPv6 connection
      problem occurred and we need to aggressively prefer IPv4 more by
      halving P and rounding towards 0.

   o  If P<0 (indicating IPv4 is preferred over IPv6) and the first
      thread to finish was the IPv4 thread, it indicates the preference
      is correct and we need to reinforce this gently by decreasing the
      application-wide P value by 1.  However, if the first thread to
      finish was the IPv6 thread, it indicates an IPv4 connection
      problem and we need to aggressively avoid IPv4 by halving P and
      rounding towards 0.

   o  If P=0 (indicating equal preference), P is incremented by one if
      the first thread to complete was the IPv6 thread, or decremented
      by one if the first thread to complete was the IPv4 thread.

   After adjusting P, the resulting delay should never be larger than 4
   seconds -- which is similar to the value used by many IPv6-capable
   TCP client applications to switch to an alternate A or AAAA record.

      Editor's Note 01: Proof of concept tests on fast networks show
      that an even smaller value (around 0.5 seconds) may be practical.
      More extensive testing would be useful to find the best upper
      boundary that still ensures a good user experience.

      Editor's Note 02: A strict implementation of the above steps
      results in "P" being adjusted if there are no AAAA records or
      no A records.  This is undesirable.  Thus, a future version of
      this specification is expected to recommend that "P" only be
      adjusted if there was both an A and a AAAA record.

4.3.  Additional Considerations

   This section discusses considerations and requirements that are
   common to new technology deployment.

4.3.1.  Additional Network and Host Traffic

   Additional network traffic and additional server load are created by
   these recommendations and mitigated by application-wide and per-
   destination timer adjustments.  The procedures described in this
   document retain a quality user experience while transitioning from
   IPv4-only to dual stack.  The quality user experience benefits the
   user, but at the expense of the network and server that are serving
   the user.

4.3.2.  Abandon Non-Winning Connections

   It is RECOMMENDED that the non-winning connections be abandoned, even
   though they could be used to download content.  This is because some
   web sites provide HTTP clients with cookies (after logging in) that
   incorporate the client's IP address, or use IP addresses to identify
   users.  If some connections from the same HTTP client are arriving
   from different IP addresses, such HTTP applications will break.  It
   is also important to abandon connections to avoid consuming server or
   middlebox (e.g., NAT) resources (file descriptors, memory, TCP
   control blocks) and to avoid sending TCP or application-level
   keepalives on otherwise unused connections.

4.3.3.  Flush or Expire Cache

   Because every network has different characteristics (e.g., working or
   broken IPv6 connectivity), the IPv6/IPv4 preference value (P) SHOULD
   be reset to its default whenever the host is connected to a new
   network ([cx-osx], [cx-win]).  However, in some instances the
   application and the host are unaware the network connectivity has
   changed, so it is RECOMMENDED that per-destination values expire
   after 10 minutes of inactivity.

4.3.4.  Determining Address Type

   For some transitional technologies such as a dual-stack host, it is
   easy for the application to recognize the native IPv6 address
   (learned via a AAAA query) and the native IPv4 address (learned via
   an A query).  For other transitional technologies [RFC2766] it is
   impossible for the host to differentiate a transitional technology
   IPv6 address from a native IPv6 address (see Section 4.1 of
   [RFC4966]).  Replacement transitional technologies are attempting to
   bridge this gap.  It is necessary for applications to distinguish
   between native and transitional addresses in order to provide the
   most seamless user experience.

   Application awareness of transitional technologies, if implemented,
   SHOULD provide a facility to give preference only to native IPv6
   addresses.

4.3.5.  Debugging and Troubleshooting

   This mechanism is aimed at ensuring a reliable user experience
   regardless of connectivity problems affecting any single transport.

   However, this naturally means that applications employing these
   techniques are by default less useful for diagnosing issues with any
   particular transport.  To assist in that regard, applications
   implementing the proposal in this document SHOULD also provide a
   mechanism to revert to the default behavior provided by the operating
   system, that is, [RFC3484].

   [[[ To be discussed.

      Some sites may wish to be informed when the hosts adjust their
      "P" value, in order to troubleshoot the underlying cause.  To help
      these sites, a strawman proposal is to send a syslog message or
      other notification to an address that may be configured by a site
      administrator in a centralized fashion.  (The exact method TBD -
      DHCP option, domain name, etc.)  This syslog message should be
      sent only the first N times that the host expects to prefer IPv6
      but has to use IPv4, i.e., the first N times it decreases the
      value of P.  N - TBD.

   ]]]

4.3.6.  DNS Behavior

   Unique to DNS AAAA queries are the problems described in [RFC4074]
   which, if they still persist, require applications to perform an A
   query before the AAAA query.

      [[Editor's Note 03: It is believed these defective DNS servers
      have long since been upgraded.  If so, we can remove this
      section.]]

4.3.7.  Middlebox Issues

   Some devices are known to exhibit what amounts to a bug: when the A
   and AAAA requests are sent back-to-back over the same 4-tuple, they
   drop one of the requests or replies [DNS-middlebox].  In some cases,
   fixing this behaviour may not be possible, either due to
   architectural limitations or due to administrative constraints
   (location of the faulty device is unknown to the end hosts or not
   controlled by the end hosts).  The algorithm described in this
   draft, in the case of this erroneous behaviour, will eventually pace
   the queries such that this issue will be avoided.  The algorithm
   described in this draft also avoids calling the operating system's
   getaddrinfo() with "any", which should prevent the operating system
   from sending the A and AAAA queries on the same port.

   For the most part, these issues are believed to be fixed, in which
   case getaddrinfo() can be called with AF_UNSPEC as the address
   family in its hints.

4.3.8.  Multiple Interfaces

   Interaction of the suggestions in this document with multiple
   interfaces, and interaction with the MIF working group, is for
   further study.

4.4.  Content Provider Recommendations

   Content providers SHOULD provide both AAAA and A records for servers
   using the same DNS name for both IPv4 and IPv6.

4.5.  Security Considerations

   [[Placeholder.]]

   See Section 4.3.2.

4.6.  Acknowledgements

   The mechanism described in this paper was inspired by Stuart
   Cheshire's discussion at the IAB Plenary at IETF 72, the author's
   understanding of Safari's operation with SRV records, Interactive
   Connectivity Establishment (ICE [RFC5245]), and the current IPv4/IPv6
   behavior of SMTP mail transfer agents.

   Thanks to Fred Baker, Jeff Kinzli, Christian Kuhtz, and Iljitsch van
   Beijnum for fostering the creation of this document.

   Thanks to Scott Brim, Rick Jones, Stig Venaas, Erik Kline, and Bjoern
   Zeeb for providing feedback on the document.

   Thanks to Javier Ubillos, Simon Perreault and Mark Andrews for the
   active feedback and the experimental work on the independent
   practical implementations that they created.

   Also, the authors would like to thank the following individuals who
   participated in various email discussions on this topic: Mohacsi
   Janos, Pekka Savola, Ted Lemon, Carlos Martinez-Cagnazzo, Simon
   Perreault, Jack Bates, Jeroen Massar, Fred Baker, Javier Ubillos,
   Teemu Savolainen, Scott Brim, Erik Kline, Cameron Byrne, Daniel
   Roesen, Guillaume Leclanche, Mark Smith, Gert Doering, Martin
   Millnert, and Tim Durack.

4.7.  IANA Considerations

   This document has no IANA actions.

5.  References

5.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3484]  Draves, R., "Default Address Selection for Internet
              Protocol version 6 (IPv6)", RFC 3484, February 2003.

5.2.  Informational References

   [Andrews]  Andrews, M., "How to connect to a multi-homed server over
              TCP", January 2011.

   [DNS-middlebox]
              Various, "DNS middlebox behavior with multiple queries
              over same source port", June 2009.

   [Perreault]
              Perreault, S., "Happy Eyeballs in Erlang", February 2011.

   [RFC1671]  Carpenter, B., "IPng White Paper on Transition and Other
              Considerations", RFC 1671, August 1994.

   [RFC2766]  Tsirtsis, G. and P. Srisuresh, "Network Address
              Translation - Protocol Translation (NAT-PT)", RFC 2766,
              February 2000.

   [RFC4074]  Morishita, Y. and T. Jinmei, "Common Misbehavior Against
              DNS Queries for IPv6 Addresses", RFC 4074, May 2005.

   [RFC4966]  Aoun, C. and E. Davies, "Reasons to Move the Network
              Address Translator - Protocol Translator (NAT-PT) to
              Historic Status", RFC 4966, July 2007.

   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols", RFC 5245,
              April 2010.

   [cx-osx]   Adium, "AIHostReachabilityMonitor", June 2009.

   [cx-win]   Microsoft, "NetworkChange.NetworkAvailabilityChanged
              Event", June 2009.

   [whitelist]
              Google, "Google IPv6 DNS Whitelist", January 2009.

Authors' Addresses

   Dan Wing
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA 95134
   USA

   Email: dwing@cisco.com


   Andrew Yourtchenko
   Cisco Systems, Inc.
   De Kleetlaan, 7
   Diegem B-1831
   Belgium

   Email: ayourtch@cisco.com