Network Working Group                                       J. Dickinson
Internet-Draft                             Sinodun Internet Technologies
Updates: 5966 (if approved)                                    R. Bellis
Intended status: Standards Track                                 Nominet
Expires: April 29, 2015                                        A. Mankin
                                                              D. Wessels
                                                           Verisign Labs
                                                        October 26, 2014


          DNS Transport over TCP - Implementation Requirements
                   draft-dickinson-dnsop-5966-bis-00

Abstract

   This document updates the requirements for the support of TCP as a
   transport protocol for DNS implementations.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 29, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Discussion
   4.  Transport Protocol Selection
   5.  Connection Handling
   6.  Query Pipelining
   7.  Response Reordering
   8.  TCP Fast Open
   9.  Summary of Advantages and Disadvantages to using TCP for DNS
   10. IANA Considerations
   11. Security Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Appendix A.  Changes to RFC 5966
   Authors' Addresses

1.  Introduction

   Most DNS [RFC1034] transactions take place over UDP [RFC0768].  TCP
   [RFC0793] is always used for full zone transfers (AXFR) and is often
   used for messages whose sizes exceed the DNS protocol's original
   512-byte limit.

   Section 6.1.3.2 of [RFC1123] states:

      DNS resolvers and recursive servers MUST support UDP, and SHOULD
      support TCP, for sending (non-zone-transfer) queries.

   However, some implementors have taken the text quoted above to mean
   that TCP support is an optional feature of the DNS protocol.

   The majority of DNS server operators already support TCP and the
   default configuration for most software implementations is to support
   TCP.  The primary audience for this document is those implementors
   whose failure to support TCP restricts interoperability and limits
   deployment of new DNS features.

   This document therefore updates the core DNS protocol specifications
   such that support for TCP is henceforth a REQUIRED part of a full DNS
   protocol implementation.

   There are several advantages and disadvantages to the increased use
   of TCP as well as implementation details that need to be considered.
   This document addresses these issues and updates [RFC5966], with
   additional considerations and lessons learned from new research and
   implementations [Connection-Oriented-DNS].

   Whilst this document makes no specific requirements for operators of
   DNS servers to meet, it does offer some suggestions to operators to
   help ensure that support for TCP on their servers and network is
   optimal.  It should be noted that failure to support TCP (or the
   blocking of DNS over TCP at the network layer) may result in
   resolution failure and/or application-level timeouts.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Discussion

   In the absence of EDNS0 (Extension Mechanisms for DNS 0) (see below),
   the normal behaviour of any DNS server needing to send a UDP response
   that would exceed the 512-byte limit is for the server to truncate
   the response so that it fits within that limit and then set the TC
   flag in the response header.  When the client receives such a
   response, it takes the TC flag as an indication that it should retry
   over TCP instead.

   RFC 1123 also says:

      ... it is also clear that some new DNS record types defined in the
      future will contain information exceeding the 512 byte limit that
      applies to UDP, and hence will require TCP.  Thus, resolvers and
      name servers should implement TCP services as a backup to UDP
      today, with the knowledge that they will require the TCP service
      in the future.
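
   The truncation behaviour described above can be illustrated with a
   minimal, non-normative sketch.  It assumes the header layout of
   [RFC1035], in which the TC flag is bit 0x0200 of the 16-bit flags
   word, and the two-byte length prefix that Section 4.2.2 of [RFC1035]
   requires for messages sent over TCP.  The function names
   (build_query, is_truncated, frame_for_tcp) are hypothetical and are
   not taken from any existing implementation.

```python
import struct

# Assumption: TC is bit 0x0200 of the 16-bit flags word in the DNS header.
TC_FLAG = 0x0200


def build_query(msg_id: int, qname: str, qtype: int = 1, qclass: int = 1) -> bytes:
    """Build a minimal DNS query (no EDNS0), for illustration only."""
    # ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", msg_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.split(".")
        if label
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, qclass)


def is_truncated(message: bytes) -> bool:
    """Return True if the TC flag is set, i.e. the client should retry over TCP."""
    (flags,) = struct.unpack_from("!H", message, 2)
    return bool(flags & TC_FLAG)


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with the two-byte length field used over TCP."""
    return struct.pack("!H", len(message)) + message
```

   A client following this pattern would send build_query(...) over UDP
   and, if is_truncated(response) is true, resend frame_for_tcp(query)
   over a TCP connection to the same server.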

   Existing deployments of DNS Security (DNSSEC) [RFC4033] have shown
   that truncation at the 512-byte boundary is now commonplace.  For
   example, a Non-Existent Domain (NXDOMAIN) (RCODE == 3) response from
   a DNSSEC-signed zone using NextSECure 3 (NSEC3) [RFC5155] is almost
   invariably larger than 512 bytes.

   Since the original core specifications for DNS were written, the
   Extension Mechanisms for DNS (EDNS0 [RFC6891]) have been introduced.
   These extensions can be used to indicate that the client is prepared
   to receive UDP responses larger than 512 bytes.  An EDNS0-compatible
   server receiving a request from an EDNS0-compatible client may send
   UDP packets up to that client's announced buffer size without
   truncation.

   However, transport of UDP packets that exceed the size of the path
   MTU causes IP packet fragmentation, which has been found to be
   unreliable in some circumstances.  Many firewalls routinely block
   fragmented IP packets, and some do not implement the algorithms
   necessary to reassemble fragmented packets.  Worse still, some
   network devices deliberately refuse to handle DNS packets containing
   EDNS0 options.  Other issues relating to UDP transport and packet
   size are discussed in [RFC5625].

   The MTU most commonly found in the core of the Internet is around
   1500 bytes, and even that limit is routinely exceeded by DNSSEC-
   signed responses.

   The future that was anticipated in RFC 1123 has arrived, and the only
   standardised UDP-based mechanism that may have resolved the packet
   size issue has been found inadequate.

4.  Transport Protocol Selection

   All general-purpose DNS implementations MUST support both UDP and TCP
   transport.

   o  Authoritative server implementations MUST support TCP so that they
      do not limit the size of responses to what fits in a single UDP
      packet.

   o  Recursive server (or forwarder) implementations MUST support TCP
      so that they do not prevent large responses from a TCP-capable
      server from reaching its TCP-capable clients.

   o  Stub resolver implementations (e.g., an operating system's DNS
      resolution library) MUST support TCP since to do otherwise would
      limit their interoperability with their own clients and with
      upstream servers.

   Regarding the choice of when to use UDP or TCP, Section 6.1.3.2 of
   RFC 1123 also says:

      ... a DNS resolver or server that is sending a non-zone-transfer
      query MUST send a UDP query first.

   This requirement is hereby relaxed.  A resolver MAY elect to send
   either TCP or UDP queries depending on local operational reasons.  If
   it already has an open TCP connection to the server it SHOULD reuse
   this connection.

   In essence, TCP SHOULD be considered as valid a transport as UDP.  It
   SHOULD NOT be used only for zone transfers and as a fallback.

   In addition it is noted that all Recursive and Authoritative servers
   MUST send responses using the same transport as the query arrived on.
   In the case of TCP this MUST also be the same connection.

5.  Connection Handling

   One perceived disadvantage to DNS over TCP is the added connection
   setup latency, generally equal to one RTT.  To amortize connection
   setup costs, both clients and servers SHOULD support connection reuse
   by sending multiple queries and responses over a single TCP
   connection.

   DNS currently has no connection signaling mechanism.  Clients and
   servers may close a connection at any time.  Clients MUST be prepared
   to retry failed queries on broken connections.

   Section 4.2.2 of [RFC1035] says:

      If the server needs to close a dormant connection to reclaim
      resources, it should wait until the connection has been idle for a
      period on the order of two minutes.  In particular, the server
      should allow the SOA and AXFR request sequence (which begins a
      refresh operation) to be made on a single connection.  Since the
      server would be unable to answer queries anyway, a unilateral
      close or reset may be used instead of a graceful close.

   Other more modern protocols (e.g., HTTP/1.1 [RFC7230]) have support
   for persistent TCP connections and operational experience has shown
   that long timeouts can easily cause resource exhaustion and poor
   response under heavy load.  Intentionally opening many connections
   and leaving them dormant can trivially create a "denial-of-service"
   attack.

   It is therefore RECOMMENDED that the default application-level idle
   period should be of the order of seconds, but no particular value is
   specified.  In practice, the idle period may vary dynamically, and
   servers MAY allow dormant connections to remain open for longer
   periods as resources permit.

   To mitigate the risk of unintentional server overload, DNS clients
   MUST take care to minimize the number of concurrent TCP connections
   made to any individual server.  Similarly, servers MAY impose limits
   on the number of concurrent TCP connections being handled for any
   particular client.  It is RECOMMENDED that for any given client -
   server interaction there SHOULD be no more than one connection for
   regular queries, one for zone transfers and one for each protocol
   that is being used on top of TCP, for example, if the resolver was
   using TLS.  The server MUST NOT enforce these rules for a particular
   client because it does not know if the client IP address belongs to a
   single client or is, for example, multiple clients behind NAT.

6.  Query Pipelining

   Due to the use of TCP primarily for zone transfer and truncated
   responses, no existing RFC discusses the idea of pipelining DNS
   queries over a TCP connection.
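
   As a rough, non-normative illustration of what carrying several
   queries on one TCP connection involves, the sketch below
   concatenates length-prefixed messages for sending, splits a received
   byte stream back into individual messages, and matches each response
   to an outstanding query by its Message ID.  It assumes the two-byte
   length prefix of Section 4.2.2 of [RFC1035].  All function names are
   hypothetical, and matching on the ID alone is a simplification made
   for brevity.

```python
import struct
from typing import Dict, List, Tuple


def frame(message: bytes) -> bytes:
    """Prefix one DNS message with its two-byte length."""
    return struct.pack("!H", len(message)) + message


def pipeline(messages: List[bytes]) -> bytes:
    """Concatenate framed queries so they can be written back to back,
    without waiting for earlier replies."""
    return b"".join(frame(m) for m in messages)


def split_stream(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split received bytes into complete DNS messages.

    Returns the complete messages and any trailing bytes that belong to
    a message which has not fully arrived yet.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 2:
        (length,) = struct.unpack_from("!H", buffer, offset)
        if len(buffer) - offset - 2 < length:
            break  # partial message: wait for more data
        messages.append(buffer[offset + 2 : offset + 2 + length])
        offset += 2 + length
    return messages, buffer[offset:]


def match_by_id(outstanding: Dict[int, str], response: bytes) -> str:
    """Remove and return the outstanding query that this response answers.

    Simplification: only the Message ID is compared here.
    """
    (msg_id,) = struct.unpack_from("!H", response, 0)
    return outstanding.pop(msg_id)
```

   Because split_stream() returns the unconsumed remainder, a caller
   can append newly received bytes to it and call the function again,
   which tolerates responses that arrive split across TCP segments or
   in a different order from the queries.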

   In order to achieve performance on par with UDP, it is therefore
   RECOMMENDED that DNS clients should pipeline their queries.  When a
   DNS client sends multiple queries to a server, it should not wait for
   an outstanding reply before sending the next query.  Clients should
   treat TCP and UDP equivalently when considering the time at which to
   send a particular query.

   DNS servers (especially recursive) SHOULD expect to receive pipelined
   queries.  The server should process TCP queries in parallel, just as
   it would for UDP.  The handling of responses to pipelined queries is
   covered in the following section.

   When pipelining queries over TCP it is very easy to send more DNS
   queries than there are DNS Message IDs.  Implementations MUST take
   care to check their list of outstanding DNS Message IDs before
   sending a new query over an existing TCP connection.  This is
   especially important if the server could be performing out-of-order
   processing.  In addition, when sending multiple queries over TCP it
   is very easy for a name server to overwhelm its own network
   interface.  Implementations MUST take care to manage buffer sizes or
   to throttle writes to the network interface.

7.  Response Reordering

   RFC 1035 is ambiguous on the question of whether TCP responses may be
   reordered -- the only relevant text is in Section 4.2.1, which
   relates to UDP:

      Queries or their responses may be reordered by the network, or by
      processing in name servers, so resolvers should not depend on them
      being returned in order.

   For the avoidance of future doubt, this requirement is clarified.
   Authoritative servers and recursive resolvers are RECOMMENDED to
   support the sending of responses in parallel and/or out-of-order,
   regardless of the transport protocol in use.
   Stub and recursive resolvers MUST be able to process responses that
   arrive in a different order to that in which the requests were sent,
   regardless of the transport protocol in use.

   In order to achieve performance on par with UDP, recursive resolvers
   SHOULD process TCP queries in parallel and return individual
   responses as soon as they are available, possibly out-of-order.

   Since responses may arrive out-of-order, clients must take care to
   match responses to outstanding queries, using the ID field, port
   number, query name/type/class, and any other relevant protocol
   features.

8.  TCP Fast Open

   This section is non-normative.

   TCP Fast Open (TFO) [I-D.ietf-tcpm-fastopen] allows data to be
   carried in the SYN packet.  It also saves up to one RTT compared to
   standard TCP.

   TFO mitigates the security vulnerabilities inherent in sending data
   in the SYN, especially on a system like DNS where amplification
   attacks are possible, by use of a server-supplied cookie.  TFO
   clients request a server cookie in the initial SYN packet at the
   start of a new connection.  The server returns a cookie in its SYN-
   ACK.  The client caches the cookie and reuses it when opening
   subsequent connections to the same server.

   The cookie is stored by the client's TCP stack (kernel) and persists
   if either the client or server processes are restarted.  TFO also
   falls back to a regular TCP handshake gracefully.

   Adding support for this to existing name server implementations is
   relatively easy, but does require source code modifications.  On the
   client, the call to connect() is replaced with a TFO-aware version of
   sendmsg() or sendto().
   On the server, TFO must be switched into server mode by changing the
   kernel parameter (net.ipv4.tcp_fastopen on Linux) to enable the
   server bit (set the integer value to 2 for server only, or to 3 for
   client and server) and by setting a socket option between the bind()
   and listen() calls.

   DNS services taking advantage of IP anycast [RFC4786] may need to
   take additional steps when enabling TFO.  From
   [I-D.ietf-tcpm-fastopen]:

      Servers that accept connection requests to the same server IP
      address should use the same key such that they generate identical
      Fast Open Cookies for a particular client IP address.  Otherwise a
      client may get different cookies across connections; its Fast Open
      attempts would fall back to regular 3WHS.

9.  Summary of Advantages and Disadvantages to using TCP for DNS

   The TCP handshake generally prevents address spoofing and, therefore,
   the reflection/amplification attacks which plague UDP.

   TCP does not suffer from UDP's issues with fragmentation.
   Middleboxes are known to block IP fragments, leading to timeouts and
   forcing client implementations to "hunt" for EDNS0 reply size values
   supported by the network path.  Additionally, fragmentation may lead
   to cache poisoning [fragmentation-considered-poisonous].

   TCP setup costs an additional RTT compared to UDP queries.  Setup
   costs can be amortized by reusing connections, pipelining queries,
   and enabling TCP Fast Open.

   TCP imposes additional state-keeping requirements on clients and
   servers.  The use of TCP Fast Open reduces the cost of closing and
   re-opening TCP connections.

   Long-lived TCP connections to anycast servers may be disrupted due to
   routing changes.  Clients utilizing TCP for DNS must always be
   prepared to re-establish connections or otherwise retry outstanding
   queries.
   It may also be possible for TCP Multipath [RFC6824] to allow a server
   to hand a connection over from the anycast address to a unicast
   address.

   There are many "Middleboxes" in use today that interfere with TCP
   over port 53 [RFC5625].  This document does not propose any
   solutions, other than to make it absolutely clear that TCP is a valid
   transport for DNS and must be supported by all implementations.

10.  IANA Considerations

   This memo includes no request to IANA.

11.  Security Considerations

   Some DNS server operators have expressed concern that wider use of
   DNS over TCP will expose them to a higher risk of denial-of-service
   (DoS) attacks.

   Although there is a higher risk of such attacks against TCP-enabled
   servers, techniques for the mitigation of DoS attacks at the network
   level have improved substantially since DNS was first designed.

   Readers are advised to familiarise themselves with [CPNI-TCP].

   Operators of recursive servers should ensure that they only accept
   connections from expected clients, and do not accept them from
   unknown sources.  In the case of UDP traffic, this will help protect
   against reflector attacks [RFC5358] and in the case of TCP traffic it
   will prevent an unknown client from exhausting the server's limits on
   the number of concurrent connections.

12.  Acknowledgements

   The authors would like to thank Liang Zhu, Zi Hu, and John Heidemann
   for extensive DNS-over-TCP discussions and code; and Lucie Guiraud
   and Danny McPherson for reviewing early versions of this document.
   We would also like to thank all those who contributed to RFC 5966.

13.  References

13.1.  Normative References

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              August 1980.

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7, RFC
              793, September 1981.

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, November 1987.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [RFC1123]  Braden, R., "Requirements for Internet Hosts - Application
              and Support", STD 3, RFC 1123, October 1989.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4033]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "DNS Security Introduction and Requirements", RFC
              4033, March 2005.

   [RFC4786]  Abley, J. and K. Lindqvist, "Operation of Anycast
              Services", BCP 126, RFC 4786, December 2006.

   [RFC5155]  Laurie, B., Sisson, G., Arends, R., and D. Blacka, "DNS
              Security (DNSSEC) Hashed Authenticated Denial of
              Existence", RFC 5155, March 2008.

   [RFC5358]  Damas, J. and F. Neves, "Preventing Use of Recursive
              Nameservers in Reflector Attacks", BCP 140, RFC 5358,
              October 2008.

   [RFC5625]  Bellis, R., "DNS Proxy Implementation Guidelines", BCP
              152, RFC 5625, August 2009.

   [RFC5966]  Bellis, R., "DNS Transport over TCP - Implementation
              Requirements", RFC 5966, August 2010.

   [RFC6891]  Damas, J., Graff, M., and P. Vixie, "Extension Mechanisms
              for DNS (EDNS(0))", STD 75, RFC 6891, April 2013.

   [RFC7230]  Fielding, R. and J. Reschke, "Hypertext Transfer Protocol
              (HTTP/1.1): Message Syntax and Routing", RFC 7230, June
              2014.

13.2.  Informative References

   [CPNI-TCP]
              CPNI, "Security Assessment of the Transmission Control
              Protocol (TCP)", 2009.

   [Connection-Oriented-DNS]
              Zhu, L., Hu, Z., Heidemann, J., Wessels, D., Mankin, A.,
              and N. Somaiya, "T-DNS: Connection-Oriented DNS to Improve
              Privacy and Security (extended)".

   [I-D.ietf-tcpm-fastopen]
              Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
              Fast Open", draft-ietf-tcpm-fastopen-09 (work in
              progress), July 2014.

   [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
              "TCP Extensions for Multipath Operation with Multiple
              Addresses", RFC 6824, January 2013.

   [fragmentation-considered-poisonous]
              Herzberg, A. and H. Shulman, "Fragmentation Considered
              Poisonous", May 2012.

Appendix A.  Changes to RFC 5966

   This document differs from RFC 5966 in four additions:

   1.  DNS implementations are recommended not only to support TCP but
       to support it on an equal footing with UDP

   2.  DNS implementations are recommended to support reuse of TCP
       connections

   3.  DNS implementations are recommended to support pipelining and out
       of order processing of the query stream

   4.  A non-normative discussion of use of TCP Fast Open is added

Authors' Addresses

   John Dickinson
   Sinodun Internet Technologies
   Magdalen Centre
   Oxford Science Park
   Oxford  OX4 4GA
   UK

   Email: jad@sinodun.com
   URI:   http://sinodun.com


   Ray Bellis
   Nominet
   Edmund Halley Road
   Oxford  OX4 4DQ
   UK

   Phone: +44 1865 332211
   Email: ray.bellis@nominet.org.uk
   URI:   http://www.nominet.org.uk/


   Allison Mankin
   Verisign Labs
   12061 Bluemont Way
   Reston, VA  20190

   Phone: +1 703 948-3200
   Email: amankin@verisign.com


   Duane Wessels
   Verisign Labs
   12061 Bluemont Way
   Reston, VA  20190

   Phone: +1 703 948-3200
   Email: dwessels@verisign.com