idnits 2.17.1

draft-ietf-dnsop-5966bis-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC5966, but the abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph:

     When sending multiple queries over a TCP connection clients MUST take care to avoid Message ID collisions. In other words, they MUST not re-use the DNS Message ID of an in-flight query. This is especially important if the server could be performing out-of-order processing (see Section 7).

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'SHOULD not' in this paragraph:

     In order to achieve performance on par with UDP DNS clients SHOULD pipeline their queries. When a DNS client sends multiple queries to a server, it SHOULD not wait for an outstanding reply before sending the next query.
     Clients SHOULD treat TCP and UDP equivalently when considering the time at which to send a particular query.

  -- The document date (September 20, 2015) is 3140 days in the past. Is this intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 5966 (Obsoleted by RFC 7766)

  ** Obsolete normative reference: RFC 7230 (Obsoleted by RFC 9110, RFC 9112)

  -- Obsolete informational reference (is this intentional?): RFC 5405 (Obsoleted by RFC 8085)

  -- Obsolete informational reference (is this intentional?): RFC 6824 (Obsoleted by RFC 8684)

     Summary: 3 errors (**), 0 flaws (~~), 3 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about the items above.

--------------------------------------------------------------------------------

dnsop                                                       J. Dickinson
Internet-Draft                                              S. Dickinson
Obsoletes: 5966 (if approved)                                    Sinodun
Intended status: Standards Track                               R. Bellis
Expires: March 23, 2016                                              ISC
                                                               A. Mankin
                                                              D. Wessels
                                                           Verisign Labs
                                                      September 20, 2015

          DNS Transport over TCP - Implementation Requirements
                      draft-ietf-dnsop-5966bis-03

Abstract

This document specifies the requirement for support of TCP as a transport protocol for DNS implementations and provides guidelines towards DNS-over-TCP performance on par with that of DNS-over-UDP.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts.
The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on March 23, 2016.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Requirements Terminology
   3. Terminology
   4. Discussion
   5. Transport Protocol Selection
   6. Connection Handling
      6.1. Current practices
         6.1.1. Clients
         6.1.2. Servers
      6.2. Recommendations
         6.2.1. Connection Re-use
            6.2.1.1. Query Pipelining
         6.2.2. Concurrent connections
         6.2.3. Idle Timeouts
         6.2.4. Tear Down
   7. Response Reordering
   8. TCP Message Length Field
   9. TCP Fast Open
   10. IANA Considerations
   11. Security Considerations
   12. Acknowledgements
   13. References
      13.1. Normative References
      13.2. Informative References
   Appendix A. Summary of Advantages and Disadvantages to using TCP for DNS
   Appendix B. Changes between revisions
      B.1. Changes -02 to -03
      B.2. Changes -01 to -02
      B.3. Changes -00 to -01
      B.4. Changes to RFC 5966
   Authors' Addresses

1. Introduction

Most DNS [RFC1034] transactions take place over UDP [RFC0768]. TCP [RFC0793] is always used for full zone transfers (AXFR) and is often used for messages whose sizes exceed the DNS protocol's original 512-byte limit. The growing deployment of DNSSEC and IPv6 has increased response sizes and therefore the use of TCP. The need for increased TCP use has also been driven by the protection it provides against address spoofing and therefore exploitation of DNS in reflection/amplification attacks. It is now widely used in Response Rate Limiting [RRL].

Section 6.1.3.2 of [RFC1123] states:

   DNS resolvers and recursive servers MUST support UDP, and SHOULD support TCP, for sending (non-zone-transfer) queries.

However, some implementors have taken the text quoted above to mean that TCP support is an optional feature of the DNS protocol.

The majority of DNS server operators already support TCP and the default configuration for most software implementations is to support TCP. The primary audience for this document is those implementors whose limited support for TCP restricts interoperability and hinders deployment of new DNS features.

This document therefore updates the core DNS protocol specifications such that support for TCP is henceforth a REQUIRED part of a full DNS protocol implementation.

There are several advantages and disadvantages to the increased use of TCP (see Appendix A) as well as implementation details that need to be considered. This document addresses these issues and presents TCP as a valid transport alternative for DNS. It extends the content of [RFC5966], with additional considerations and lessons learned from research, developments and implementation of TCP in DNS and in other internet protocols.

Whilst this document makes no specific requirements for operators of DNS servers to meet, it does offer some suggestions to operators to help ensure that support for TCP on their servers and network is optimal. It should be noted that failure to support TCP (or the blocking of DNS over TCP at the network layer) may result in resolution failure and/or application-level timeouts.

2. Requirements Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3. Terminology

o Persistent connection: a TCP connection that is closed neither by the server after sending the first response nor by the client after receiving the first response.

o Connection Reuse: the sending of multiple queries and responses over a single TCP connection.

o Idle DNS-over-TCP session: Clients and servers view application level idleness differently. A DNS client considers an established DNS-over-TCP session to be idle when it has no pending queries to send and there are no outstanding responses. A DNS server considers an established DNS-over-TCP session to be idle when it has sent responses to all the queries it has received on that connection.

o Pipelining: the sending of multiple queries and responses over a single TCP connection but not waiting for any outstanding replies before sending another query.

o Out-Of-Order Processing: The processing of queries concurrently and the returning of individual responses as soon as they are available, possibly out-of-order. This will most likely occur in recursive servers; however, it is possible in authoritative servers that, for example, have different backend data stores.

4. Discussion

In the absence of EDNS0 (Extension Mechanisms for DNS 0) (see below), the normal behaviour of any DNS server needing to send a UDP response that would exceed the 512-byte limit is for the server to truncate the response so that it fits within that limit and then set the TC flag in the response header. When the client receives such a response, it takes the TC flag as an indication that it should retry over TCP instead.

RFC 1123 also says:

   ... it is also clear that some new DNS record types defined in the future will contain information exceeding the 512 byte limit that applies to UDP, and hence will require TCP.
   Thus, resolvers and name servers should implement TCP services as a backup to UDP today, with the knowledge that they will require the TCP service in the future.

Existing deployments of DNS Security (DNSSEC) [RFC4033] have shown that truncation at the 512-byte boundary is now commonplace. For example, a Non-Existent Domain (NXDOMAIN) (RCODE == 3) response from a DNSSEC-signed zone using NextSECure 3 (NSEC3) [RFC5155] is almost invariably larger than 512 bytes.

Since the original core specifications for DNS were written, the Extension Mechanisms for DNS (EDNS0 [RFC6891]) have been introduced. These extensions can be used to indicate that the client is prepared to receive UDP responses larger than 512 bytes. An EDNS0-compatible server receiving a request from an EDNS0-compatible client may send UDP packets up to that client's announced buffer size without truncation.

However, transport of UDP packets that exceed the size of the path MTU causes IP packet fragmentation, which has been found to be unreliable in many circumstances. Many firewalls routinely block fragmented IP packets, and some do not implement the algorithms necessary to reassemble fragmented packets. Worse still, some network devices deliberately refuse to handle DNS packets containing EDNS0 options. Other issues relating to UDP transport and packet size are discussed in [RFC5625].

The MTU most commonly found in the core of the Internet is around 1500 bytes, and even that limit is routinely exceeded by DNSSEC-signed responses.

The future that was anticipated in RFC 1123 has arrived, and the only standardised UDP-based mechanism that may have resolved the packet size issue has been found inadequate.

5. Transport Protocol Selection

All general-purpose DNS implementations MUST support both UDP and TCP transport.

o Authoritative server implementations MUST support TCP so that they do not limit the size of responses to what fits in a single UDP packet.

o Recursive server (or forwarder) implementations MUST support TCP so that they do not prevent large responses from a TCP-capable server from reaching its TCP-capable clients.

o Stub resolver implementations (e.g., an operating system's DNS resolution library) MUST support TCP since to do otherwise would limit their interoperability with their own clients and with upstream servers.

Regarding the choice of when to use UDP or TCP, Section 6.1.3.2 of RFC 1123 also says:

   ... a DNS resolver or server that is sending a non-zone-transfer query MUST send a UDP query first.

This requirement is hereby relaxed. A resolver MAY elect to send either TCP or UDP queries depending on local operational reasons. TCP MAY be used before sending any UDP queries. If it already has an open TCP connection to the server it SHOULD reuse this connection. In essence, TCP ought to be considered a valid alternative transport to UDP, not purely a fallback option.

In addition it is noted that all Recursive and Authoritative servers MUST send responses using the same transport as the query arrived on. In the case of TCP this MUST also be the same connection.

6. Connection Handling

6.1. Current practices

Section 4.2.2 of [RFC1035] says:

o The server should assume that the client will initiate connection closing, and should delay closing its end of the connection until all outstanding client requests have been satisfied.

o If the server needs to close a dormant connection to reclaim resources, it should wait until the connection has been idle for a period on the order of two minutes. In particular, the server should allow the SOA and AXFR request sequence (which begins a refresh operation) to be made on a single connection.
  Since the server would be unable to answer queries anyway, a unilateral close or reset may be used instead of graceful close.

Other more modern protocols (e.g., HTTP/1.1 [RFC7230]) have support by default for persistent TCP connections for all requests. Connections are then normally closed via a 'connection close' signal from one party.

The description in [RFC1035] is clear that servers should view connections as persistent (particularly after receiving an SOA), but unfortunately does not provide enough detail for an unambiguous interpretation of client behaviour for queries other than an SOA. Additionally, DNS does not yet have a signalling mechanism for connection timeout or close, although some have been proposed.

6.1.1. Clients

There is no clear guidance today in any RFC as to when a DNS client should close a TCP connection, and there are no specific recommendations with regard to DNS client idle timeouts. However, it is common practice for clients to close the TCP connection after sending a single request (apart from the SOA/AXFR case).

6.1.2. Servers

Many DNS server implementations use a long fixed idle timeout and default to a small number of TCP connections. They also offer little by way of TCP connection management options. The disadvantages of this include:

o Operational experience has shown that long server timeouts can easily cause resource exhaustion and poor response under heavy load.

o Intentionally opening many connections and leaving them idle can trivially create a TCP "denial-of-service" attack as many DNS servers are poorly equipped to defend against this by modifying their idle timeouts or other connection management policies.

o A modest number of clients that all concurrently attempt to use persistent connections with non-zero idle timeouts to such a server could unintentionally cause the same "denial-of-service" problem.

Note that this denial-of-service is only on the TCP service. However, in these cases it affects not only clients wishing to use TCP for their queries for operational reasons, but all clients who choose to fall back to TCP from UDP after receiving a TC=1 flag.

6.2. Recommendations

The following sections include recommendations that are intended to result in more consistent and scalable implementations of DNS-over-TCP.

6.2.1. Connection Re-use

One perceived disadvantage to DNS over TCP is the added connection setup latency, generally equal to one RTT. To amortize connection setup costs, both clients and servers SHOULD support connection reuse by sending multiple queries and responses over a single persistent TCP connection.

When sending multiple queries over a TCP connection, clients MUST take care to avoid Message ID collisions. In other words, they MUST NOT re-use the DNS Message ID of an in-flight query. This is especially important if the server could be performing out-of-order processing (see Section 7).

6.2.1.1. Query Pipelining

Due to the historical use of TCP primarily for zone transfer and truncated responses, no existing RFC discusses the idea of pipelining DNS queries over a TCP connection.

In order to achieve performance on par with UDP, DNS clients SHOULD pipeline their queries. When a DNS client sends multiple queries to a server, it SHOULD NOT wait for an outstanding reply before sending the next query. Clients SHOULD treat TCP and UDP equivalently when considering the time at which to send a particular query.
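
As a non-normative illustration of the two client requirements above (no re-use of the Message ID of an in-flight query, and no waiting for a reply before sending the next query), a client might be sketched in Python as follows. The names InFlightIds, build_query and send_pipelined are hypothetical and are not taken from this document or from any DNS library. The sketch assumes an already connected stream socket and the usual two-octet length prefix for DNS messages carried over TCP; reading responses and releasing their IDs is left out.

```python
import secrets
import socket
import struct


class InFlightIds:
    """Tracks Message IDs of queries still awaiting a response
    on one connection (hypothetical helper)."""

    def __init__(self):
        self._in_flight = set()

    def allocate(self):
        # The Message ID is a 16-bit field, so at most 65536 queries
        # can be outstanding on a connection at once.
        if len(self._in_flight) >= 0x10000:
            raise RuntimeError("all Message IDs are in flight")
        while True:
            msg_id = secrets.randbelow(0x10000)
            if msg_id not in self._in_flight:
                self._in_flight.add(msg_id)
                return msg_id

    def release(self, msg_id):
        # Called once the matching response has arrived; only then
        # may the ID be used again on this connection.
        self._in_flight.discard(msg_id)


def build_query(msg_id, qname, qtype=1):
    """Builds a minimal query: RD flag set, one question, class IN."""
    header = struct.pack("!HHHHHH", msg_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def send_pipelined(sock, ids, qnames):
    """Sends every query back-to-back without reading any reply."""
    sent = []
    for qname in qnames:
        msg_id = ids.allocate()
        message = build_query(msg_id, qname)
        # Length prefix and message are handed to TCP in one call.
        sock.sendall(struct.pack("!H", len(message)) + message)
        sent.append(msg_id)
    return sent


if __name__ == "__main__":
    # Local socket pair used in place of a real server connection.
    client, server = socket.socketpair()
    tracker = InFlightIds()
    queued = send_pipelined(client, tracker, ["example.com", "example.net"])
    print(len(queued), "queries sent without waiting for replies")
    client.close()
    server.close()
```

In this sketch an ID only becomes available again after release() is called for it, which is one simple way to satisfy the collision requirement when the server answers out of order.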

DNS clients will benefit from noting that DNS servers that do not both process pipelined queries concurrently and send out-of-order responses will likely not provide performance on a par with UDP. If TCP performance is of importance, clients might find it useful to use server processing times as input to server and transport selection algorithms.

DNS servers (especially recursive) SHOULD expect to receive pipelined queries. The server SHOULD process TCP queries concurrently, just as it would for UDP. The server SHOULD answer all pipelined queries, even if they are sent in quick succession. The handling of responses to pipelined queries is covered in Section 7.

6.2.2. Concurrent connections

To mitigate the risk of unintentional server overload, DNS clients MUST take care to minimize the number of concurrent TCP connections made to any individual server. It is RECOMMENDED that for any given client/server interaction there SHOULD be no more than one connection for regular queries, one for zone transfers and one for each protocol that is being used on top of TCP, for example, if the resolver was using TLS. It is however noted that certain primary/secondary configurations with many busy zones might need to use more than one TCP connection for zone transfers for operational reasons.

Similarly, servers MAY impose limits on the number of concurrent TCP connections being handled for any particular client IP address or subnet. These limits SHOULD be much looser than the client guidelines above, because the server does not know, for example, if a client IP address belongs to a single client or is multiple resolvers on a single machine, or multiple clients behind NAT.

6.2.3. Idle Timeouts

To mitigate the risk of unintentional server overload, DNS clients MUST take care to minimize the idle time of established DNS-over-TCP sessions made to any individual server. DNS clients SHOULD close the TCP connection of an idle session, unless an idle timeout has been established using some other signalling mechanism, for example, [edns-tcp-keepalive].

To mitigate the risk of unintentional server overload it is RECOMMENDED that the default server application-level idle period be of the order of seconds, but no particular value is specified. In practice, the idle period can vary dynamically, and servers MAY allow idle connections to remain open for longer periods as resources permit. A timeout of at least a few seconds is advisable for normal operations to support those clients that expect the SOA and AXFR request sequence to be made on a single connection as originally specified in [RFC1035]. Servers MAY use zero timeouts when they are experiencing heavy load or are under attack.

6.2.4. Tear Down

Under normal operation clients typically initiate connection closing on idle connections; however, servers can close the connection if their local idle timeout policy is exceeded. Connections can also be closed by either end under unusual conditions such as defending against an attack or system failure/reboot.

Clients SHOULD retry unanswered queries if the connection closes before receiving all outstanding responses. No specific retry algorithm is specified in this document.

If a server finds that a client has closed a TCP session, or if the session has been otherwise interrupted, before all pending responses have been sent, then the server MUST NOT attempt to send those responses. Of course the server MAY cache those responses.

7. Response Reordering

RFC 1035 is ambiguous on the question of whether TCP responses may be reordered -- the only relevant text is in Section 4.2.1, which relates to UDP:

   Queries or their responses may be reordered by the network, or by processing in name servers, so resolvers should not depend on them being returned in order.

For the avoidance of future doubt, this requirement is clarified. Authoritative servers and recursive resolvers are RECOMMENDED to support the sending of responses in parallel and/or out-of-order, regardless of the transport protocol in use. Stub and recursive resolvers MUST be able to process responses that arrive in a different order to that in which the requests were sent, regardless of the transport protocol in use.

In order to achieve performance on par with UDP, recursive resolvers SHOULD process TCP queries in parallel and return individual responses as soon as they are available, possibly out-of-order.

Since pipelined responses can arrive out-of-order, clients MUST match responses to outstanding queries using the ID field and port number. Failure by clients to properly match responses to outstanding queries can have serious consequences for interoperability.

8. TCP Message Length Field

For reasons of efficiency, DNS clients and servers SHOULD pass the two-octet length field, and the message described by that length field, to the TCP layer at the same time (e.g., in a single "write" system call) to make it more likely that all the data will be transmitted in a single TCP segment. This additionally avoids problems due to some DNS servers being very sensitive to timeout conditions on receiving messages (they might abort a TCP session if the first TCP segment does not contain both the length field and the entire message).

9. TCP Fast Open

This section is non-normative.

TCP Fast Open [RFC7413] (TFO) allows data to be carried in the SYN packet, reducing the cost of re-opening TCP connections. It also saves up to one RTT compared to standard TCP.

TFO mitigates the security vulnerabilities inherent in sending data in the SYN, especially on a system like DNS where amplification attacks are possible, by use of a server-supplied cookie. TFO clients request a server cookie in the initial SYN packet at the start of a new connection. The server returns a cookie in its SYN-ACK. The client caches the cookie and reuses it when opening subsequent connections to the same server.

The cookie is stored by the client's TCP stack (kernel) and persists if either the client or server processes are restarted. TFO also falls back to a regular TCP handshake gracefully.

DNS services taking advantage of IP anycast [RFC4786] might need to take additional steps when enabling TFO. From [RFC7413]:

   Servers that accept connection requests to the same server IP address should use the same key such that they generate identical Fast Open Cookies for a particular client IP address. Otherwise a client may get different cookies across connections; its Fast Open attempts would fall back to regular 3WHS.

10. IANA Considerations

This memo includes no request to IANA.

11. Security Considerations

Some DNS server operators have expressed concern that wider promotion and use of DNS over TCP will expose them to a higher risk of denial-of-service (DoS) attacks on TCP (both accidental and deliberate).

Although there is a higher risk of some specific attacks against TCP-enabled servers, techniques for the mitigation of DoS attacks at the network level have improved substantially since DNS was first designed.

Readers are advised to familiarise themselves with [CPNI-TCP], a security assessment of TCP detailing known TCP attacks and countermeasures which references most of the relevant RFCs on this topic.

To mitigate the risk of DoS attacks, DNS servers are advised to engage in TCP connection management. This could include maintaining state on existing connections, re-using existing connections and controlling request queues to enable fair use. It is likely to be advantageous to provide configurable connection management options, for example:

o total number of TCP connections

o maximum TCP connections per source IP address or subnet

o TCP connection idle timeout

o maximum DNS transactions per TCP connection

o maximum TCP connection duration

No specific values are recommended for these parameters.

Operators are advised to familiarise themselves with the configuration and tuning parameters available in the operating system TCP stack. However, detailed advice on this is outside the scope of this document.

Operators of recursive servers are advised to ensure that they only accept connections from expected clients, and do not accept them from unknown sources. In the case of UDP traffic, this will help protect against reflector attacks [RFC5358] and in the case of TCP traffic it will prevent an unknown client from exhausting the server's limits on the number of concurrent connections.

12. Acknowledgements

The authors would like to thank Francis Dupont and Paul Vixie for detailed review, and Andrew Sullivan, Tony Finch, Stephane Bortzmeyer and the many others who contributed to the mailing list discussion. Thanks also go to Liang Zhu, Zi Hu, and John Heidemann for extensive DNS-over-TCP discussions and code, and to Lucie Guiraud and Danny McPherson for reviewing early versions of this document. We would also like to thank all those who contributed to RFC 5966.

13. References

13.1. Normative References

[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, August 1980.

[RFC0793] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, September 1981.

[RFC1034] Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, November 1987.

[RFC1035] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, November 1987.

[RFC1123] Braden, R., "Requirements for Internet Hosts - Application and Support", STD 3, RFC 1123, October 1989.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4033] Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose, "DNS Security Introduction and Requirements", RFC 4033, March 2005.

[RFC4786] Abley, J. and K. Lindqvist, "Operation of Anycast Services", BCP 126, RFC 4786, December 2006.

[RFC5155] Laurie, B., Sisson, G., Arends, R., and D. Blacka, "DNS Security (DNSSEC) Hashed Authenticated Denial of Existence", RFC 5155, March 2008.

[RFC5358] Damas, J. and F. Neves, "Preventing Use of Recursive Nameservers in Reflector Attacks", BCP 140, RFC 5358, October 2008.

[RFC5625] Bellis, R., "DNS Proxy Implementation Guidelines", BCP 152, RFC 5625, August 2009.

[RFC5966] Bellis, R., "DNS Transport over TCP - Implementation Requirements", RFC 5966, August 2010.

[RFC6891] Damas, J., Graff, M., and P. Vixie, "Extension Mechanisms for DNS (EDNS(0))", STD 75, RFC 6891, April 2013.

[RFC7230] Fielding, R. and J. Reschke, "Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing", RFC 7230, June 2014.

13.2. Informative References

[CPNI-TCP] CPNI, "Security Assessment of the Transmission Control Protocol (TCP)", 2009, .

[Connection-Oriented-DNS] Zhu, L., Hu, Z., Heidemann, J., Wessels, D., Mankin, A., and N. Somaiya, "Connection-Oriented DNS to Improve Privacy and Security", .

[RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines for Application Designers", BCP 145, RFC 5405, DOI 10.17487/RFC5405, November 2008, .

[RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, "TCP Extensions for Multipath Operation with Multiple Addresses", RFC 6824, January 2013.

[RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP Fast Open", RFC 7413, December 2014.

[RRL] Vixie, P. and V. Schryver, "DNS Response Rate Limiting (DNS RRL)", ISC-TN 2012-1-Draft1, April 2012.

[edns-tcp-keepalive] Wouters, P., Abley, J., Dickinson, S., and R. Bellis, "The edns-tcp-keepalive EDNS0 Option", draft-ietf-dnsop-edns-tcp-keepalive-02 (work in progress), May 2015.

[fragmentation-considered-poisonous] Herzberg, A. and H. Shulman, "Fragmentation Considered Poisonous", May 2012, .

Appendix A. Summary of Advantages and Disadvantages to using TCP for DNS

The TCP handshake generally prevents address spoofing and, therefore, the reflection/amplification attacks which plague UDP.

IP fragmentation is less of a problem for TCP than it is for UDP. TCP stacks generally implement Path MTU Discovery so they can avoid IP fragmentation of TCP segments. UDP, on the other hand, does not provide reassembly, which means datagrams that exceed the path MTU size must experience fragmentation [RFC5405]. Middleboxes are known to block IP fragments, leading to timeouts and forcing client implementations to "hunt" for EDNS0 reply size values supported by the network path. Additionally, fragmentation may lead to cache poisoning [fragmentation-considered-poisonous].

TCP setup costs an additional RTT compared to UDP queries. Setup costs can be amortized by reusing connections, pipelining queries, and enabling TCP Fast Open.

   TCP imposes additional state-keeping requirements on clients and
   servers.  The use of TCP Fast Open reduces the cost of closing and
   re-opening TCP connections.

   Long-lived TCP connections to anycast servers might be disrupted due
   to routing changes.  Clients utilizing TCP for DNS need to always be
   prepared to re-establish connections or otherwise retry outstanding
   queries.  It might also be possible for TCP Multipath [RFC6824] to
   allow a server to hand a connection over from the anycast address to
   a unicast address.

   There are many "Middleboxes" in use today that interfere with TCP
   over port 53 [RFC5625].  This document does not propose any
   solutions, other than to make it absolutely clear that TCP is a valid
   transport for DNS and support for it is a requirement for all
   implementations.

   A more in-depth discussion of connection-oriented DNS can be found
   elsewhere [Connection-Oriented-DNS].

Appendix B.  Changes between revisions

   [Note to RFC Editor: please remove this section prior to
   publication.]

B.1.  Changes -02 to -03

   o  Replaced certain lower case RFC2119 keywords to improve clarity.

   o  Updated section 6.2.2 to recognise requirements for concurrent
      zone transfers.

   o  Changed 'client IP address' to 'client IP address or subnet' when
      discussing restrictions on TCP connections from clients.

   o  Added reference to edns-tcp-keepalive draft.

   o  Added wording to introduction to reference Appendix A and state
      TCP is a valid transport alternative for DNS.

   o  Improved description of CPNI-TCP as a general reference source on
      TCP security related RFCs.

B.2.  Changes -01 to -02

   o  Added more text to Introduction as background to TCP use.

   o  Added definitions of Persistent connection and Idle session to
      Terminology section.

   o  Separated Connection Handling section into Current Practice and
      Recommendations.  Provided more detail on current practices and
      divided Recommendations up into more granular sub-sections.

   o  Added section on Idle time with new text on recommendations for
      client idle behaviour.

   o  Moved TCP message field length discussion to separate section.

   o  Removed references to system calls in TFO section.

   o  Added more discussion on DoS mitigation in Security Considerations
      section.

   o  Added statement that servers MAY use 0 idle timeout.

   o  Re-stated position of TCP as an alternative to UDP in Discussion.

   o  Updated text on server limits on concurrent connections from a
      particular client.

   o  Added text that client retry logic is outside the scope of this
      document.

   o  Clarified that servers should answer all pipelined queries even if
      sent very close together.

B.3.  Changes -00 to -01

   o  Changed "updates" to "obsoletes" RFC 5966.

   o  Improved text in Section 4 Transport Protocol Selection to change
      "TCP SHOULD NOT be used only for the transfers and as a fallback"
      to make the intention clearer and more consistent.

   o  Reference to TCP FASTOPEN updated now that it is an RFC.

   o  Added paragraph to say that implementations MUST NOT send the TCP
      framing 2 byte length field in a separate packet to the DNS
      message.

   o  Added Terminology section.

   o  Changed should and RECOMMENDED in reference to parallel processing
      to SHOULD in sections 7 and 8.

   o  Added text to address what a server should do when a client closes
      the TCP connection before pending responses are sent.

   o  Moved the Advantages and Disadvantages section to an appendix.

B.4.  Changes to RFC 5966

   This document differs from RFC 5966 in four additions:

   1.  DNS implementations are recommended not only to support TCP but
       to support it on an equal footing with UDP

   2.  DNS implementations are recommended to support reuse of TCP
       connections

   3.  DNS implementations are recommended to support pipelining and
       out-of-order processing of the query stream

   4.  A non-normative discussion of use of TCP Fast Open is added

Authors' Addresses

   John Dickinson
   Sinodun Internet Technologies
   Magdalen Centre
   Oxford Science Park
   Oxford  OX4 4GA
   UK

   Email: jad@sinodun.com
   URI:   http://sinodun.com

   Sara Dickinson
   Sinodun Internet Technologies
   Magdalen Centre
   Oxford Science Park
   Oxford  OX4 4GA
   UK

   Email: sara@sinodun.com
   URI:   http://sinodun.com

   Ray Bellis
   Internet Systems Consortium, Inc
   950 Charter Street
   Redwood City  CA 94063
   USA

   Phone: +1 650 423 1200
   Email: ray@isc.org
   URI:   http://www.isc.org

   Allison Mankin
   Verisign Labs
   12061 Bluemont Way
   Reston, VA  20190
   US

   Phone: +1 703 948-3200
   Email: amankin@verisign.com

   Duane Wessels
   Verisign Labs
   12061 Bluemont Way
   Reston, VA  20190
   US

   Phone: +1 703 948-3200
   Email: dwessels@verisign.com