idnits 2.17.1

draft-ietf-dnsop-5966bis-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC1035, updated by this document, for
     RFC5378 checks: 1987-11-01)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (January 15, 2016) is 3023 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 5966 (Obsoleted by RFC 7766)

  ** Obsolete normative reference: RFC 7230 (Obsoleted by RFC 9110, RFC 9112)

  ** Obsolete normative reference: RFC 7540 (Obsoleted by RFC 9113)

  -- Obsolete informational reference (is this intentional?): RFC 5405
     (Obsoleted by RFC 8085)

  -- Obsolete informational reference (is this intentional?): RFC 6824
     (Obsoleted by RFC 8684)

     Summary: 4 errors (**), 0 flaws (~~), 1 warning (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

dnsop                                                       J. Dickinson
Internet-Draft                                              S. Dickinson
Obsoletes: 5966 (if approved)                                    Sinodun
Updates: 1035,1123 (if approved)                               R. Bellis
Intended status: Standards Track                                     ISC
Expires: July 18, 2016                                         A. Mankin
                                                              D. Wessels
                                                           Verisign Labs
                                                        January 15, 2016

          DNS Transport over TCP - Implementation Requirements
                      draft-ietf-dnsop-5966bis-06

Abstract

This document specifies the requirement for support of TCP as a transport protocol for DNS implementations and provides guidelines towards DNS-over-TCP performance on par with that of DNS-over-UDP. This document obsoletes RFC5966 and therefore updates RFC1035 and RFC1123.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on July 18, 2016.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1.  Introduction
2.  Requirements Terminology
3.  Terminology
4.  Discussion
5.  Transport Protocol Selection
6.  Connection Handling
  6.1.  Current practices
    6.1.1.  Clients
    6.1.2.  Servers
  6.2.  Recommendations
    6.2.1.  Connection Re-use
      6.2.1.1.  Query Pipelining
    6.2.2.  Concurrent connections
    6.2.3.  Idle Timeouts
    6.2.4.  Tear Down
7.  Response Reordering
8.  TCP Message Length Field
9.  TCP Fast Open
10. IANA Considerations
11. Security Considerations
12. Acknowledgements
13. References
  13.1.  Normative References
  13.2.  Informative References
Appendix A.  Summary of Advantages and Disadvantages to using TCP for DNS
Appendix B.  Changes between revisions
  B.1.  Changes -05 to -06
  B.2.  Changes -04 to -05
  B.3.  Changes -03 to -04
  B.4.  Changes -02 to -03
  B.5.  Changes -01 to -02
  B.6.  Changes -00 to -01
Appendix C.  Changes to RFC5966
Authors' Addresses

1.  Introduction

Most DNS [RFC1034] transactions take place over UDP [RFC0768]. TCP [RFC0793] is always used for full zone transfers (AXFR) and is often used for messages whose sizes exceed the DNS protocol's original 512-byte limit. The growing deployment of DNSSEC and IPv6 has increased response sizes and therefore the use of TCP. The need for increased TCP use has also been driven by the protection it provides against address spoofing and therefore exploitation of DNS in reflection/amplification attacks.
It is now widely used in Response Rate Limiting [RRL1][RRL2]. Additionally, recent work on DNS privacy solutions such as [DNS-over-TLS] is another motivation to re-visit DNS-over-TCP requirements.

Section 6.1.3.2 of [RFC1123] states:

   DNS resolvers and recursive servers MUST support UDP, and SHOULD support TCP, for sending (non-zone-transfer) queries.

However, some implementors have taken the text quoted above to mean that TCP support is an optional feature of the DNS protocol.

The majority of DNS server operators already support TCP and the default configuration for most software implementations is to support TCP. The primary audience for this document is those implementors whose limited support for TCP restricts interoperability and hinders deployment of new DNS features.

This document therefore updates the core DNS protocol specifications such that support for TCP is henceforth a REQUIRED part of a full DNS protocol implementation.

There are several advantages and disadvantages to the increased use of TCP (see Appendix A) as well as implementation details that need to be considered. This document addresses these issues and presents TCP as a valid transport alternative for DNS. It extends the content of [RFC5966], with additional considerations and lessons learned from research, developments and implementation of TCP in DNS and in other internet protocols.

Whilst this document makes no specific requirements for operators of DNS servers to meet, it does offer some suggestions to operators to help ensure that support for TCP on their servers and network is optimal. It should be noted that failure to support TCP (or the blocking of DNS over TCP at the network layer) will probably result in resolution failure and/or application-level timeouts.

2.  Requirements Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3.  Terminology

o  Persistent connection: a TCP connection that is closed neither by the server after sending the first response nor by the client after receiving the first response.

o  Connection Reuse: the sending of multiple queries and responses over a single TCP connection.

o  Idle DNS-over-TCP session: Clients and servers view application level idleness differently. A DNS client considers an established DNS-over-TCP session to be idle when it has no pending queries to send and there are no outstanding responses. A DNS server considers an established DNS-over-TCP session to be idle when it has sent responses to all the queries it has received on that connection.

o  Pipelining: the sending of multiple queries and responses over a single TCP connection but not waiting for any outstanding replies before sending another query.

o  Out-Of-Order Processing: The processing of queries concurrently and the returning of individual responses as soon as they are available, possibly out-of-order. This will most likely occur in recursive servers; however, it is possible in authoritative servers that, for example, have different backend data stores.

4.  Discussion

In the absence of EDNS0 (Extension Mechanisms for DNS 0 [RFC6891]) (see below), the normal behaviour of any DNS server needing to send a UDP response that would exceed the 512-byte limit is for the server to truncate the response so that it fits within that limit and then set the TC flag in the response header. When the client receives such a response, it takes the TC flag as an indication that it should retry over TCP instead.

RFC 1123 also says:

   ...
   it is also clear that some new DNS record types defined in the future will contain information exceeding the 512 byte limit that applies to UDP, and hence will require TCP. Thus, resolvers and name servers should implement TCP services as a backup to UDP today, with the knowledge that they will require the TCP service in the future.

Existing deployments of DNS Security (DNSSEC) [RFC4033] have shown that truncation at the 512-byte boundary is now commonplace. For example, a Non-Existent Domain (NXDOMAIN) (RCODE == 3) response from a DNSSEC-signed zone using NextSECure 3 (NSEC3) [RFC5155] is almost invariably larger than 512 bytes.

Since the original core specifications for DNS were written, the Extension Mechanisms for DNS have been introduced. These extensions can be used to indicate that the client is prepared to receive UDP responses larger than 512 bytes. An EDNS0-compatible server receiving a request from an EDNS0-compatible client may send UDP packets up to that client's announced buffer size without truncation.

However, transport of UDP packets that exceed the size of the path MTU causes IP packet fragmentation, which has been found to be unreliable in many circumstances. Many firewalls routinely block fragmented IP packets, and some do not implement the algorithms necessary to reassemble fragmented packets. Worse still, some network devices deliberately refuse to handle DNS packets containing EDNS0 options. Other issues relating to UDP transport and packet size are discussed in [RFC5625].

The MTU most commonly found in the core of the Internet is around 1500 bytes, and even that limit is routinely exceeded by DNSSEC-signed responses.

The future that was anticipated in RFC 1123 has arrived, and the only standardised UDP-based mechanism that may have resolved the packet size issue has been found inadequate.

5.  Transport Protocol Selection

Section 6.1.3.2 of [RFC1123] is updated: All general-purpose DNS implementations MUST support both UDP and TCP transport.

o  Authoritative server implementations MUST support TCP so that they do not limit the size of responses to what fits in a single UDP packet.

o  Recursive server (or forwarder) implementations MUST support TCP so that they do not prevent large responses from a TCP-capable server from reaching its TCP-capable clients.

o  Stub resolver implementations (e.g., an operating system's DNS resolution library) MUST support TCP since to do otherwise would limit the interoperability between their own clients and upstream servers.

Regarding the choice of when to use UDP or TCP, Section 6.1.3.2 of RFC 1123 also says:

   ... a DNS resolver or server that is sending a non-zone-transfer query MUST send a UDP query first.

This requirement is hereby relaxed. Stub resolvers and recursive resolvers MAY elect to send either TCP or UDP queries depending on local operational reasons. TCP MAY be used before sending any UDP queries. If the resolver already has an open TCP connection to the server it SHOULD reuse this connection. In essence, TCP ought to be considered a valid alternative transport to UDP, not purely a retry option.

In addition it is noted that all Recursive and Authoritative servers MUST send responses using the same transport as the query arrived on. In the case of TCP this MUST also be the same connection.

6.  Connection Handling

6.1.  Current practices

Section 4.2.2 of [RFC1035] says:

   o  The server should assume that the client will initiate connection closing, and should delay closing its end of the connection until all outstanding client requests have been satisfied.

   o  If the server needs to close a dormant connection to reclaim resources, it should wait until the connection has been idle for a period on the order of two minutes. In particular, the server should allow the SOA and AXFR request sequence (which begins a refresh operation) to be made on a single connection. Since the server would be unable to answer queries anyway, a unilateral close or reset may be used instead of graceful close.

Other more modern protocols (e.g., HTTP/1.1 [RFC7230], HTTP/2 [RFC7540]) have support by default for persistent TCP connections for all requests. Connections are then normally closed via a 'connection close' signal from one party.

The description in [RFC1035] is clear that servers should view connections as persistent (particularly after receiving an SOA), but unfortunately does not provide enough detail for an unambiguous interpretation of client behaviour for queries other than an SOA. Additionally, DNS does not yet have a signalling mechanism for connection timeout or close, although some have been proposed.

6.1.1.  Clients

There is no clear guidance today in any RFC as to when a DNS client should close a TCP connection, and there are no specific recommendations with regard to DNS client idle timeouts. However, at the time of writing, it is common practice for clients to close the TCP connection after sending a single request (apart from the SOA/AXFR case).

6.1.2.  Servers

Many DNS server implementations use a long fixed idle timeout and default to a small number of TCP connections. They also offer little by way of TCP connection management options. The disadvantages of this include:

o  Operational experience has shown that long server timeouts can easily cause resource exhaustion and poor response under heavy load.

o  Intentionally opening many connections and leaving them idle can trivially create a TCP "denial-of-service" attack as many DNS servers are poorly equipped to defend against this by modifying their idle timeouts or other connection management policies.

o  A modest number of clients that all concurrently attempt to use persistent connections with non-zero idle timeouts to such a server could unintentionally cause the same "denial-of-service" problem.

Note that this denial-of-service is only on the TCP service. However, in these cases it affects not only clients wishing to use TCP for their queries for operational reasons, but all clients who choose to fall back to TCP from UDP after receiving a TC=1 flag.

6.2.  Recommendations

The following sections include recommendations that are intended to result in more consistent and scalable implementations of DNS-over-TCP.

6.2.1.  Connection Re-use

One perceived disadvantage to DNS over TCP is the added connection setup latency, generally equal to one RTT. To amortize connection setup costs, both clients and servers SHOULD support connection reuse by sending multiple queries and responses over a single persistent TCP connection.

When sending multiple queries over a TCP connection clients MUST NOT re-use the DNS Message ID of an in-flight query on that connection in order to avoid Message ID collisions. This is especially important if the server could be performing out-of-order processing (see Section 7).

6.2.1.1.  Query Pipelining

Due to the historical use of TCP primarily for zone transfer and truncated responses, no existing RFC discusses the idea of pipelining DNS queries over a TCP connection.

In order to achieve performance on par with UDP, DNS clients SHOULD pipeline their queries.
When a DNS client sends multiple queries to a server, it SHOULD NOT wait for an outstanding reply before sending the next query. Clients SHOULD treat TCP and UDP equivalently when considering the time at which to send a particular query.

It is likely that DNS servers need to process pipelined queries concurrently and also send out-of-order responses over TCP in order to provide the level of performance possible with UDP transport. If TCP performance is of importance, clients might find it useful to use server processing times as input to server and transport selection algorithms.

DNS servers (especially recursive) MUST expect to receive pipelined queries. The server SHOULD process TCP queries concurrently, just as it would for UDP. The server SHOULD answer all pipelined queries, even if they are received in quick succession. The handling of responses to pipelined queries is covered in Section 7.

6.2.2.  Concurrent connections

To mitigate the risk of unintentional server overload, DNS clients MUST take care to minimize the number of concurrent TCP connections made to any individual server. It is RECOMMENDED that for any given client/server interaction there SHOULD be no more than one connection for regular queries, one for zone transfers and one for each protocol that is being used on top of TCP, for example, if the resolver was using TLS. It is however noted that certain primary/secondary configurations with many busy zones might need to use more than one TCP connection for zone transfers for operational reasons (for example, to support concurrent transfers of multiple zones).

Similarly, servers MAY impose limits on the number of concurrent TCP connections being handled for any particular client IP address or subnet.
These limits SHOULD be much looser than the client guidelines above, because the server does not know, for example, whether a client IP address belongs to a single client, to multiple resolvers on a single machine, or to multiple clients behind a device performing Network Address Translation (NAT).

6.2.3.  Idle Timeouts

To mitigate the risk of unintentional server overload, DNS clients MUST take care to minimize the idle time of established DNS-over-TCP sessions made to any individual server. DNS clients SHOULD close the TCP connection of an idle session, unless an idle timeout has been established using some other signalling mechanism, for example, [edns-tcp-keepalive].

To mitigate the risk of unintentional server overload it is RECOMMENDED that the default server application-level idle period be of the order of seconds, but no particular value is specified. In practice, the idle period can vary dynamically, and servers MAY allow idle connections to remain open for longer periods as resources permit. A timeout of at least a few seconds is advisable for normal operations to support those clients that expect the SOA and AXFR request sequence to be made on a single connection as originally specified in [RFC1035]. Servers MAY use zero timeouts when they are experiencing heavy load or are under attack.

DNS messages delivered over TCP might arrive in multiple segments. A DNS server that resets its idle timeout after receiving a single segment might be vulnerable to a "slow read attack." For this reason, servers SHOULD reset the idle timeout on the receipt of a full DNS message, rather than on receipt of any part of a DNS message.

6.2.4.  Tear Down

Under normal operation DNS clients typically initiate connection closing on idle connections; however, DNS servers can close the connection if their local idle timeout policy is exceeded.
Connections can also be closed by either end under unusual conditions such as defending against an attack or system failure/reboot.

DNS clients SHOULD retry unanswered queries if the connection closes before receiving all outstanding responses. No specific retry algorithm is specified in this document.

If a DNS server finds that a DNS client has closed a TCP session, or if the session has been otherwise interrupted, before all pending responses have been sent then the server MUST NOT attempt to send those responses. Of course the DNS server MAY cache those responses.

7.  Response Reordering

RFC 1035 is ambiguous on the question of whether TCP responses may be reordered -- the only relevant text is in Section 4.2.1, which relates to UDP:

   Queries or their responses may be reordered by the network, or by processing in name servers, so resolvers should not depend on them being returned in order.

For the avoidance of future doubt, this requirement is clarified. Authoritative servers and recursive resolvers are RECOMMENDED to support the preparing of responses in parallel and sending them out-of-order, regardless of the transport protocol in use. Stub and recursive resolvers MUST be able to process responses that arrive in a different order to that in which the requests were sent, regardless of the transport protocol in use.

In order to achieve performance on par with UDP, recursive resolvers SHOULD process TCP queries in parallel and return individual responses as soon as they are available, possibly out-of-order.

Since pipelined responses can arrive out-of-order, clients MUST match responses to outstanding queries on the same TCP connection using the Message ID. If the response contains a question section the client MUST match the QNAME, QCLASS and QTYPE fields.
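
The matching rule above can be illustrated with a minimal, non-normative sketch. The names build_query, parse_id_and_question and OutstandingQueries are hypothetical and are not defined by this document. The sketch assumes the question section carries an uncompressed name, and it compares QNAMEs case-insensitively; both are simplifying assumptions of the example.

```python
import struct


def encode_qname(name):
    """Encode a dotted name as length-prefixed labels (no compression)."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(msg_id, qname, qtype=1, qclass=1):
    """Build a one-question DNS query: 12-byte header plus question."""
    header = struct.pack("!HHHHHH", msg_id, 0x0100, 1, 0, 0, 0)
    return header + encode_qname(qname) + struct.pack("!HH", qtype, qclass)


def parse_id_and_question(message):
    """Return (msg_id, question); question is None when QDCOUNT is 0.

    Assumption: the question name is not compressed.
    """
    msg_id, _flags, qdcount = struct.unpack("!HHH", message[:6])
    if qdcount == 0:
        return msg_id, None
    labels, pos = [], 12
    while message[pos] != 0:
        length = message[pos]
        label = message[pos + 1:pos + 1 + length]
        labels.append(label.decode("ascii").lower())
        pos += 1 + length
    qtype, qclass = struct.unpack("!HH", message[pos + 1:pos + 5])
    return msg_id, (".".join(labels), qtype, qclass)


class OutstandingQueries:
    """Tracks the in-flight queries of ONE TCP connection (hypothetical)."""

    def __init__(self):
        self._pending = {}

    def add(self, query):
        msg_id, question = parse_id_and_question(query)
        # An in-flight Message ID must not be re-used on this connection.
        if msg_id in self._pending:
            raise ValueError("Message ID already in flight")
        self._pending[msg_id] = question

    def match(self, response):
        """Return the matched question, or None if nothing matches."""
        msg_id, question = parse_id_and_question(response)
        if msg_id not in self._pending:
            return None
        expected = self._pending[msg_id]
        # If the response has a question section, it must match as well.
        if question is not None and question != expected:
            return None
        del self._pending[msg_id]
        return expected
```

A client would keep one such table per TCP connection, call add() before writing each pipelined query and match() for each response read, discarding any response for which match() returns None.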
Failure by clients to properly match responses to outstanding queries can have serious consequences for interoperability.

8.  TCP Message Length Field

DNS clients and servers SHOULD pass the two-octet length field, and the message described by that length field, to the TCP layer at the same time (e.g., in a single "write" system call) to make it more likely that all the data will be transmitted in a single TCP segment. This is both for reasons of efficiency and to avoid problems due to some DNS server implementations behaving undesirably when reading data from the TCP layer (due to a lack of clarity in previous standards). For example, some DNS server implementations might abort a TCP session if the first "read" from the TCP layer does not contain both the length field and the entire message.

To clarify, DNS servers MUST NOT close a connection simply because the first "read" from the TCP layer does not contain the entire DNS message, and servers SHOULD apply the connection timeouts as specified in Section 6.2.3.

9.  TCP Fast Open

This section is non-normative.

TCP Fast Open [RFC7413] (TFO) allows data to be carried in the SYN packet, reducing the cost of re-opening TCP connections. It also saves up to one RTT compared to standard TCP.

TFO mitigates the security vulnerabilities inherent in sending data in the SYN, especially on a system like DNS where amplification attacks are possible, by use of a server-supplied cookie. TFO clients request a server cookie in the initial SYN packet at the start of a new connection. The server returns a cookie in its SYN-ACK. The client caches the cookie and reuses it when opening subsequent connections to the same server.

The cookie is stored by the client's TCP stack (kernel) and persists if either the client or server processes are restarted. TFO also falls back to a regular TCP handshake gracefully.

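
As a non-normative illustration, a server might try to enable TFO on its listening socket and carry on with ordinary TCP when the platform refuses. The function name open_dns_listener and the queue length of 16 are assumptions made for this sketch. The option value is interpreted here as on Linux, where it is the maximum number of pending TFO requests; other platforms may treat it differently or reject it, in which case the sketch simply leaves TFO off.

```python
import socket


def open_dns_listener(host="127.0.0.1", port=0, tfo_queue_len=16):
    """Return (listening_socket, tfo_enabled).

    Port 0 asks the OS for any free port, which suits a demonstration;
    a real DNS server would normally listen on port 53.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(128)

    tfo_enabled = False
    # TCP_FASTOPEN is only exposed by Python where the platform defines it.
    tfo_option = getattr(socket, "TCP_FASTOPEN", None)
    if tfo_option is not None:
        try:
            sock.setsockopt(socket.IPPROTO_TCP, tfo_option, tfo_queue_len)
            tfo_enabled = True
        except OSError:
            # Option rejected: keep serving with the regular handshake.
            pass
    return sock, tfo_enabled
```

Whether TFO is actually used for a given connection still depends on kernel configuration and on the client presenting a valid cookie; the boolean only reports that the socket option was accepted.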

DNS services taking advantage of IP anycast [RFC4786] might need to take additional steps when enabling TFO. From [RFC7413]:

   Servers that accept connection requests to the same server IP address should use the same key such that they generate identical Fast Open Cookies for a particular client IP address. Otherwise a client may get different cookies across connections; its Fast Open attempts would fall back to regular 3WHS.

When DNS-over-TCP is a transport for DNS private exchange, as in [DNS-over-TLS], the implementor needs to be aware of TFO and to ensure that data requiring protection (e.g. data for a DNS query) is not accidentally transported in the clear. See [DNS-over-TLS] for discussion.

10.  IANA Considerations

This memo includes no request to IANA.

11.  Security Considerations

Some DNS server operators have expressed concern that wider promotion and use of DNS over TCP will expose them to a higher risk of denial-of-service (DoS) attacks on TCP (both accidental and deliberate).

Although there is a higher risk of some specific attacks against TCP-enabled servers, techniques for the mitigation of DoS attacks at the network level have improved substantially since DNS was first designed.

Readers are advised to familiarise themselves with [CPNI-TCP], a security assessment of TCP detailing known TCP attacks and countermeasures which references most of the relevant RFCs on this topic.

To mitigate the risk of DoS attacks, DNS servers are advised to engage in TCP connection management. This could include maintaining state on existing connections, re-using existing connections and controlling request queues to enable fair use.
It is likely to be advantageous to provide configurable connection management options, for example:

o  total number of TCP connections

o  maximum TCP connections per source IP address or subnet

o  TCP connection idle timeout

o  maximum DNS transactions per TCP connection

o  maximum TCP connection duration

No specific values are recommended for these parameters.

Operators are advised to familiarise themselves with the configuration and tuning parameters available in the operating system TCP stack. However, detailed advice on this is outside the scope of this document.

Operators of recursive servers are advised to ensure that they only accept connections from expected clients (for example by the use of an ACL), and do not accept them from unknown sources. In the case of UDP traffic, this will help protect against reflection attacks [RFC5358] and in the case of TCP traffic it will prevent an unknown client from exhausting the server's limits on the number of concurrent connections.

12.  Acknowledgements

The authors would like to thank Francis Dupont and Paul Vixie for detailed review, and Andrew Sullivan, Tony Finch, Stephane Bortzmeyer, Joe Abley, Tatuya Jinmei and the many others who contributed to the mailing list discussion. Thanks also go to Liang Zhu, Zi Hu, and John Heidemann for extensive DNS-over-TCP discussions and code, and to Lucie Guiraud and Danny McPherson for reviewing early versions of this document. We would also like to thank all those who contributed to RFC5966.

13.  References

13.1.  Normative References

[RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI 10.17487/RFC0768, August 1980.

[RFC0793]  Postel, J., "Transmission Control Protocol", STD 7, RFC 793, DOI 10.17487/RFC0793, September 1981.
590 [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", 591 STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987, 592 . 594 [RFC1035] Mockapetris, P., "Domain names - implementation and 595 specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, 596 November 1987, . 598 [RFC1123] Braden, R., Ed., "Requirements for Internet Hosts - 599 Application and Support", STD 3, RFC 1123, 600 DOI 10.17487/RFC1123, October 1989, 601 . 603 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 604 Requirement Levels", BCP 14, RFC 2119, 605 DOI 10.17487/RFC2119, March 1997, 606 . 608 [RFC4033] Arends, R., Austein, R., Larson, M., Massey, D., and S. 609 Rose, "DNS Security Introduction and Requirements", 610 RFC 4033, DOI 10.17487/RFC4033, March 2005, 611 . 613 [RFC4786] Abley, J. and K. Lindqvist, "Operation of Anycast 614 Services", BCP 126, RFC 4786, DOI 10.17487/RFC4786, 615 December 2006, . 617 [RFC5155] Laurie, B., Sisson, G., Arends, R., and D. Blacka, "DNS 618 Security (DNSSEC) Hashed Authenticated Denial of 619 Existence", RFC 5155, DOI 10.17487/RFC5155, March 2008, 620 . 622 [RFC5358] Damas, J. and F. Neves, "Preventing Use of Recursive 623 Nameservers in Reflector Attacks", BCP 140, RFC 5358, 624 DOI 10.17487/RFC5358, October 2008, 625 . 627 [RFC5625] Bellis, R., "DNS Proxy Implementation Guidelines", 628 BCP 152, RFC 5625, DOI 10.17487/RFC5625, August 2009, 629 . 631 [RFC5966] Bellis, R., "DNS Transport over TCP - Implementation 632 Requirements", RFC 5966, DOI 10.17487/RFC5966, August 633 2010, . 635 [RFC6891] Damas, J., Graff, M., and P. Vixie, "Extension Mechanisms 636 for DNS (EDNS(0))", STD 75, RFC 6891, 637 DOI 10.17487/RFC6891, April 2013, 638 . 640 [RFC7230] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer 641 Protocol (HTTP/1.1): Message Syntax and Routing", 642 RFC 7230, DOI 10.17487/RFC7230, June 2014, 643 . 645 [RFC7540] Belshe, M., Peon, R., and M. 
              Thomson, Ed., "Hypertext Transfer Protocol Version 2
              (HTTP/2)", RFC 7540, DOI 10.17487/RFC7540, May 2015.

13.2.  Informative References

   [Connection-Oriented-DNS]
              Zhu, L., Hu, Z., Heidemann, J., Wessels, D., Mankin, A.,
              and N. Somaiya, "Connection-Oriented DNS to Improve
              Privacy and Security".

   [CPNI-TCP]
              CPNI, "Security Assessment of the Transmission Control
              Protocol (TCP)", 2009.

   [DNS-over-TLS]
              Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D.,
              and P. Hoffman, "TLS for DNS: Initiation and Performance
              Considerations", draft-ietf-dprive-dns-over-tls (work in
              progress), January 2016.

   [edns-tcp-keepalive]
              Wouters, P., Abley, J., Dickinson, S., and R. Bellis, "The
              edns-tcp-keepalive EDNS0 Option",
              draft-ietf-dnsop-edns-tcp-keepalive-05 (work in progress),
              Jan 2015.

   [fragmentation-considered-poisonous]
              Herzberg, A. and H. Shulman, "Fragmentation Considered
              Poisonous", May 2012.

   [RFC5405]  Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines
              for Application Designers", BCP 145, RFC 5405,
              DOI 10.17487/RFC5405, November 2008.

   [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
              "TCP Extensions for Multipath Operation with Multiple
              Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013.

   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014.

   [RRL1]     Vixie, P. and V. Schryver, "DNS Response Rate Limiting
              (DNS RRL)", ISC-TN 2012-1-Draft1, August 2014.

   [RRL2]     "BIND RRL", ISC Knowledge Base AA-00994, April 2012.

Appendix A.  Summary of Advantages and Disadvantages to using TCP for
             DNS

   The TCP handshake generally prevents address spoofing and, therefore,
   the reflection/amplification attacks which plague UDP.

   IP fragmentation is less of a problem for TCP than it is for UDP.
   TCP stacks generally implement Path MTU Discovery so they can avoid
   IP fragmentation of TCP segments.  UDP, on the other hand, does not
   provide reassembly, which means datagrams that exceed the path MTU
   size must experience fragmentation [RFC5405].  Middleboxes are known
   to block IP fragments, leading to timeouts and forcing client
   implementations to "hunt" for EDNS0 reply size values supported by
   the network path.  Additionally, fragmentation may lead to cache
   poisoning [fragmentation-considered-poisonous].

   TCP setup costs an additional RTT compared to UDP queries.  Setup
   costs can be amortized by reusing connections, pipelining queries,
   and enabling TCP Fast Open.

   TCP imposes additional state-keeping requirements on clients and
   servers.  The use of TCP Fast Open reduces the cost of closing and
   re-opening TCP connections.

   Long-lived TCP connections to anycast servers might be disrupted due
   to routing changes.  Clients utilizing TCP for DNS need to always be
   prepared to re-establish connections or otherwise retry outstanding
   queries.  It might also be possible for Multipath TCP [RFC6824] to
   allow a server to hand a connection over from the anycast address to
   a unicast address.

   There are many "Middleboxes" in use today that interfere with TCP
   over port 53 [RFC5625].  This document does not propose any
   solutions, other than to make it absolutely clear that TCP is a valid
   transport for DNS and support for it is a requirement for all
   implementations.

   A more in-depth discussion of connection-oriented DNS can be found
   elsewhere [Connection-Oriented-DNS].

Appendix B.  Changes between revisions

   [Note to RFC Editor: please remove this section prior to
   publication.]

B.1.
Changes -05 to -06

   Introduction: Add reference to DNS-over-TLS

   Section 5: 's/it/the resolver/' and 's/fallback/retry/'

   Section 6.1.1: Make clear behaviour is 'at the time of writing', not
   a recommendation

   Section 6.2.1.1: Change SHOULD to MUST.

   Section 6.2.2: Clarify 'operational reasons' for zone transfers.

   Section 8: Re-word to remove reference to TCP segments.

   Section 9: Add sentence about use of TFO with DNS privacy solutions.

B.2.  Changes -04 to -05

   Added second RRL reference to introduction

   Introduction, paragraph 5: s/may result/will probably result/

   Section 5: Strengthened wording on update of RFC1123

   Section 5: Added reference to HTTP/2

   Section 6.2.1: Simplify wording of Message ID collisions

   Section 6.2.2: Clarify wording on idle timeout reset

   Section 6.2.4: Use DNS Server/client for consistency

   Section 8: Re-word to reduce confusion of timeout vs TCP reads

   Appendix C: Updated differences to RFC5966.

B.3.  Changes -03 to -04

   o  Re-stated how messages received over TCP should be mapped to
      queries.

   o  Added wording to cover timeouts for server side behaviour for when
      receiving TCP messages.

   o  Added sentence to abstract stating this obsoletes RFC5966.

   o  Moved reference to RFC6891 earlier in Discussion section.

   o  Several minor wording updates to improve clarity.

   o  Corrected nits and updated references.

B.4.  Changes -02 to -03

   o  Replaced certain lower case RFC2119 keywords to improve clarity.

   o  Updated section 6.2.2 to recognise requirements for concurrent
      zone transfers.

   o  Changed 'client IP address' to 'client IP address or subnet' when
      discussing restrictions on TCP connections from clients.

   o  Added reference to edns-tcp-keepalive draft.

   o  Added wording to introduction to reference Appendix A and state
      TCP is a valid transport alternative for DNS.

   o  Improved description of CPNI-TCP as a general reference source on
      TCP security related RFCs.

B.5.  Changes -01 to -02

   o  Added more text to Introduction as background to TCP use.

   o  Added definitions of Persistent connection and Idle session to
      Terminology section.

   o  Separated Connection Handling section into Current Practice and
      Recommendations.  Provide more detail on current practices and
      divided Recommendations up into more granular sub-sections.

   o  Add section on Idle time with new text on recommendations for
      client idle behaviour.

   o  Move TCP message field length discussion to separate section.

   o  Removed references to system calls in TFO section.

   o  Added more discussion on DoS mitigation in Security Considerations
      section.

   o  Added statement that servers MAY use 0 idle timeout.

   o  Re-stated position of TCP as an alternative to UDP in Discussion.

   o  Updated text on server limits on concurrent connections from a
      particular client.

   o  Added text that client retry logic is outside the scope of this
      document.

   o  Clarified that servers should answer all pipelined queries even if
      sent very close together.

B.6.  Changes -00 to -01

   o  Changed updates to obsoletes RFC 5966.

   o  Improved text in Section 4 Transport Protocol Selection to change
      "TCP SHOULD NOT be used only for the transfers and as a fallback"
      to make the intention clearer and more consistent.

   o  Reference to TCP FASTOPEN updated now that it is an RFC.

   o  Added paragraph to say that implementations MUST NOT send the TCP
      framing 2 byte length field in a separate packet to the DNS
      message.

   o  Added Terminology section.

   o  Changed should and RECOMMENDED in reference to parallel processing
      to SHOULD in sections 7 and 8.

   o  Added text to address what a server should do when a client closes
      the TCP connection before pending responses are sent.

   o  Moved the Advantages and Disadvantages section to an appendix.

Appendix C.  Changes to RFC5966

   [Note to RFC Editor: please leave this section in the final
   document.]

   This document obsoletes [RFC5966] and differs from it in several
   respects.  An overview of the most substantial changes/updates that
   implementors should take note of is given below:

   1.   A Terminology section (Section 3) is added defining several new
        concepts.

   2.   Paragraph 3 of Section 5 puts TCP on a more equal footing with
        UDP than RFC5966.  For example, it states:

        1.  TCP MAY be used before sending any UDP queries.

        2.  TCP ought to be considered a valid alternative transport to
            UDP, not purely a fallback option.

   3.   Section 6.2.1 adds a new recommendation that TCP connection-
        reuse SHOULD be supported.

   4.   Section 6.2.1.1 adds a new recommendation that DNS clients
        SHOULD pipeline their queries and DNS servers SHOULD process
        pipelined queries concurrently.

   5.   Section 6.2.2 adds new recommendations on the number and usage
        of TCP connections for client/server interactions.

   6.   Section 6.2.3 adds a new recommendation that DNS clients SHOULD
        close idle sessions unless using a signalling mechanism.

   7.   Section 7 clarifies that servers are RECOMMENDED to prepare TCP
        responses in parallel and send answers out-of-order.  It also
        clarifies how TCP queries and responses should be matched by
        clients.

   8.   Section 8 adds a new recommendation about how DNS clients and
        servers should handle the 2 byte message length field for TCP
        messages.

   9.   Section 9 adds a non-normative discussion of the use of TCP Fast
        Open.

   10.  Section 11 adds new advice regarding DoS mitigation techniques.
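
   As a non-normative illustration of item 8 above, the following
   Python sketch shows one way the 2 byte message length field used for
   DNS over TCP [RFC1035] might be handled.  The function names
   frame_dns_message and split_dns_messages are hypothetical and do not
   come from this document or from any particular implementation; the
   sketch only assumes that the length field is an unsigned 16-bit
   value in network byte order that precedes each DNS message.

```python
import struct


def frame_dns_message(message: bytes) -> bytes:
    """Return the message prefixed with its 2 byte length field.

    Hypothetical helper: the prefix and the message are built into one
    buffer so that a caller can hand both to the TCP layer together.
    """
    if len(message) > 0xFFFF:
        raise ValueError("DNS message does not fit a 2 byte length field")
    # "!H" is an unsigned 16-bit integer in network (big-endian) order.
    return struct.pack("!H", len(message)) + message


def split_dns_messages(buffer: bytes):
    """Split bytes read from a TCP stream into complete DNS messages.

    Returns (messages, leftover), where leftover holds any trailing
    bytes that do not yet form a complete length-prefixed message.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 2:
        (length,) = struct.unpack_from("!H", buffer, offset)
        if len(buffer) - offset - 2 < length:
            break  # message body not fully received yet
        start = offset + 2
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]
```

   A caller could pass the result of frame_dns_message to a single
   socket sendall call, and keep the leftover bytes returned by
   split_dns_messages to prepend to the next read from the connection.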

Authors' Addresses

   John Dickinson
   Sinodun Internet Technologies
   Magdalen Centre
   Oxford Science Park
   Oxford  OX4 4GA
   UK

   Email: jad@sinodun.com
   URI:   http://sinodun.com

   Sara Dickinson
   Sinodun Internet Technologies
   Magdalen Centre
   Oxford Science Park
   Oxford  OX4 4GA
   UK

   Email: sara@sinodun.com
   URI:   http://sinodun.com

   Ray Bellis
   Internet Systems Consortium, Inc
   950 Charter Street
   Redwood City  CA 94063
   USA

   Phone: +1 650 423 1200
   Email: ray@isc.org
   URI:   http://www.isc.org

   Allison Mankin
   Verisign Labs
   12061 Bluemont Way
   Reston, VA  20190
   US

   Phone: +1 703 948-3200
   Email: amankin@verisign.com

   Duane Wessels
   Verisign Labs
   12061 Bluemont Way
   Reston, VA  20190
   US

   Phone: +1 703 948-3200
   Email: dwessels@verisign.com