idnits 2.17.1

draft-ietf-dnsop-session-signal-14.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC1035, updated by this document, for
     RFC5378 checks: 1987-11-01)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (August 02, 2018) is 2093 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'TBA1' is mentioned on line 2018, but not defined

  == Missing Reference: 'TBA2' is mentioned on line 2024, but not defined

  == Outdated reference: A later version (-23) exists of
     draft-ietf-dnsop-no-response-issue-11

  == Outdated reference: A later version (-04) exists of
     draft-ietf-dnssd-mdns-relay-01

  == Outdated reference: A later version (-25) exists of
     draft-ietf-dnssd-push-14

  == Outdated reference: A later version (-14) exists of
     draft-ietf-doh-dns-over-https-12


     Summary: 0 errors (**), 0 flaws (~~), 7 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

DNSOP Working Group                                            R. Bellis
Internet-Draft                                                       ISC
Updates: 1035, 7766 (if approved)                            S. Cheshire
Intended status: Standards Track                              Apple Inc.
Expires: February 3, 2019                                   J. Dickinson
                                                            S. Dickinson
                                                                 Sinodun
                                                                T. Lemon
                                                     Nibbhaya Consulting
                                                             T. Pusateri
                                                            Unaffiliated
                                                         August 02, 2018


                        DNS Stateful Operations
                   draft-ietf-dnsop-session-signal-14

Abstract

   This document defines a new DNS OPCODE for DNS Stateful Operations
   (DSO). DSO messages communicate operations within persistent
   stateful sessions, using type-length-value (TLV) syntax. Three TLVs
   are defined that manage session timeouts, termination, and encryption
   padding, and a framework is defined for extensions to enable new
   stateful operations. This document updates RFC 1035 by adding a new
   DNS header opcode and result code which has different message
   semantics.
   This document updates RFC 7766 by redefining a session,
   providing new guidance on connection re-use, and providing a new
   mechanism for handling session idle timeouts.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 3, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements Language
   3.  Terminology
   4.  Discussion
   5.  Applicability
   6.  Protocol Details
     6.1.  DSO Session Establishment
       6.1.1.  Connection Sharing
       6.1.2.  Zero Round-Trip Operation
       6.1.3.  Middlebox Considerations
     6.2.  Message Format
       6.2.1.  DNS Header Fields in DSO Messages
       6.2.2.  DSO Data
       6.2.3.  EDNS(0) and TSIG
     6.3.  Message Handling
       6.3.1.  Error Responses
     6.4.  Flow Control Considerations
     6.5.  Responder-Initiated Operation Cancellation
   7.  DSO Session Lifecycle and Timers
     7.1.  DSO Session Initiation
     7.2.  DSO Session Timeouts
     7.3.  Inactive DSO Sessions
     7.4.  The Inactivity Timeout
       7.4.1.  Closing Inactive DSO Sessions
       7.4.2.  Values for the Inactivity Timeout
     7.5.  The Keepalive Interval
       7.5.1.  Keepalive Interval Expiry
       7.5.2.  Values for the Keepalive Interval
     7.6.  Server-Initiated Session Termination
       7.6.1.  Server-Initiated Retry Delay Message
   8.  Base TLVs for DNS Stateful Operations
     8.1.  Keepalive TLV
       8.1.1.  Client handling of received Session Timeout values
       8.1.2.  Relation to edns-tcp-keepalive EDNS0 Option
     8.2.  Retry Delay TLV
       8.2.1.
               Retry Delay TLV used as a Primary TLV
       8.2.2.  Retry Delay TLV used as a Response Additional TLV
     8.3.  Encryption Padding TLV
   9.  Summary Highlights
     9.1.  QR bit and MESSAGE ID
     9.2.  TLV Usage
   10. IANA Considerations
     10.1.  DSO OPCODE Registration
     10.2.  DSO RCODE Registration
     10.3.  DSO Type Code Registry
   11. Security Considerations
     11.1.  TCP Fast Open Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document specifies a mechanism for managing stateful DNS
   connections. DNS most commonly operates over a UDP transport, but
   can also operate over streaming transports; the original DNS RFC
   specifies DNS over TCP [RFC1035] and a profile for DNS over TLS
   [RFC7858] has been specified. These transports can offer persistent,
   long-lived sessions and therefore when using them for transporting
   DNS messages it is of benefit to have a mechanism that can establish
   parameters associated with those sessions, such as timeouts. In such
   situations it is also advantageous to support server-initiated
   messages (such as DNS Push Notifications [I-D.ietf-dnssd-push]).

   The existing EDNS(0) Extension Mechanism for DNS [RFC6891] is
   explicitly defined to only have "per-message" semantics. While
   EDNS(0) has been used to signal at least one session-related
   parameter (edns-tcp-keepalive EDNS0 Option [RFC7828]) the result is
   less than optimal due to the restrictions imposed by the EDNS(0)
   semantics and the lack of server-initiated signalling. For example,
   a server cannot arbitrarily instruct a client to close a connection
   because the server can only send EDNS(0) options in responses to
   queries that contained EDNS(0) options.

   This document defines a new DNS OPCODE, DSO ([TBA1], tentatively 6),
   for DNS Stateful Operations. DSO messages are used to communicate
   operations within persistent stateful sessions, expressed using
   type-length-value (TLV) syntax. This document defines an initial set
   of three TLVs, used to manage session timeouts, termination, and
   encryption padding.

   The three TLVs defined here are all mandatory for all implementations
   of DSO. Further TLVs may be defined in additional specifications.

   DSO messages may or may not be acknowledged; this is signaled by
   providing a non-zero message ID for messages that must be
   acknowledged and a zero message ID for messages that are not to be
   acknowledged, and is also part of the definition of a particular
   message type. Messages are pipelined; answers may appear out of
   order when more than one answer is pending.

   The format for DSO messages (Section 6.2) differs somewhat from the
   traditional DNS message format used for standard queries and
   responses. The standard twelve-byte header is used, but the four
   count fields (QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT) are set to zero and
   accordingly their corresponding sections are not present.

   The actual data pertaining to DNS Stateful Operations (expressed in
   TLV syntax) is appended to the end of the DNS message header.
   The stream protocol carrying the DSO message frames it with a 16-bit
   message length, so the length of the DSO data is determined from that
   length, rather than from any of the DNS header counts.

   When displayed using packet analyzer tools that have not been updated
   to recognize the DSO format, this will result in the DSO data being
   displayed as unknown additional data after the end of the DNS
   message.

   This new format has distinct advantages over an RR-based format
   because it is more explicit and more compact. Each TLV definition is
   specific to its use case, and as a result contains no redundant or
   overloaded fields. Importantly, it completely avoids conflating DNS
   Stateful Operations in any way with normal DNS operations or with
   existing EDNS(0)-based functionality. A goal of this approach is to
   avoid the operational issues that have befallen EDNS(0), particularly
   relating to middlebox behaviour (see for example
   [I-D.ietf-dnsop-no-response-issue] sections 3.2 and 4).

   With EDNS(0), multiple options may be packed into a single OPT
   pseudo-RR, and there is no generalized mechanism for a client to be
   able to tell whether a server has processed or otherwise acted upon
   each individual option within the combined OPT pseudo-RR. The
   specifications for each individual option need to define how each
   different option is to be acknowledged, if necessary.

   In contrast to EDNS(0), with DSO there is no compelling motivation to
   pack multiple operations into a single message for efficiency
   reasons, because DSO always operates using a connection-oriented
   transport protocol. Each DSO operation is communicated in its own
   separate DNS message, and the transport protocol can take care of
   packing several DNS messages into a single IP packet if appropriate.
   For example, TCP can pack multiple small DNS messages into a single
   TCP segment.
   This simplification allows for clearer semantics. Each
   DSO request message communicates just one primary operation, and the
   RCODE in the corresponding response message indicates the success or
   failure of that operation.

2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Terminology

   "DSO" is used to mean DNS Stateful Operation.

   The term "connection" means a bidirectional byte (or message) stream,
   where the bytes (or messages) are delivered reliably and in-order,
   such as provided by using DNS over TCP [RFC1035] [RFC7766] or DNS
   over TLS [RFC7858].

   The unqualified term "session" in the context of this document means
   the exchange of DNS messages over a connection where:

   o  The connection between client and server is persistent and
      relatively long-lived.

   o  Either end of the connection may initiate messages to the other.

   In this document the term "session" is used exclusively as described
   above. The term has no relationship to the "session layer" of the
   OSI "seven-layer model".

   A "DSO Session" is established between two endpoints that acknowledge
   persistent DNS state via the exchange of DSO messages over the
   connection. This is distinct from a DNS-over-TCP session as
   described in the previous specification for DNS over TCP [RFC7766].

   A "DSO Session" is terminated when the underlying connection is
   closed. The underlying connection can be closed in two ways:

   Where this specification says, "close gracefully," that means sending
   a TLS close_notify (if TLS is in use) followed by a TCP FIN, or the
   equivalents for other protocols.
   Where this specification requires a
   connection to be closed gracefully, the requirement to initiate that
   graceful close is placed on the client, to place the burden of TCP's
   TIME-WAIT state on the client rather than the server.

   Where this specification says, "forcibly abort," that means sending a
   TCP RST, or the equivalent for other protocols. In the BSD Sockets
   API this is achieved by setting the SO_LINGER option to zero before
   closing the socket.

   The term "server" means the software with a listening socket,
   awaiting incoming connection requests in the usual DNS sense.

   The term "client" means the software which initiates a connection to
   the server's listening socket in the usual DNS sense.

   The terms "initiator" and "responder" correspond respectively to the
   initial sender and subsequent receiver of a DSO request message or
   unacknowledged message, regardless of which was the "client" and
   "server" in the usual DNS sense.

   The term "sender" may apply to either an initiator (when sending a
   DSO request message or unacknowledged message) or a responder (when
   sending a DSO response message).

   Likewise, the term "receiver" may apply to either a responder (when
   receiving a DSO request message or unacknowledged message) or an
   initiator (when receiving a DSO response message).

   In protocol implementation there are generally two kinds of errors
   that software writers have to deal with. The first is situations
   that arise due to factors in the environment, such as temporary loss
   of connectivity. While undesirable, these situations do not indicate
   a flaw in the software, and they are situations that software should
   generally be able to recover from. The second is situations that
   should never happen when communicating with a correctly-implemented
   peer.
   If they do happen, they indicate a serious flaw in the
   protocol implementation, beyond what it is reasonable to expect
   software to recover from. This document describes this latter form
   of error condition as a "fatal error" and specifies that an
   implementation encountering a fatal error condition "MUST forcibly
   abort the connection immediately". Given that these fatal error
   conditions signify defective software, and given that defective
   software is likely to remain defective for some time until it is
   fixed, after forcibly aborting a connection, a client SHOULD refrain
   from automatically reconnecting to that same service instance for at
   least one hour.

   This document uses the term "same service instance" as follows:

   o  In cases where a server is specified or configured using an IP
      address and TCP port number, two different configurations are
      referring to the same service instance if they contain the same IP
      address and TCP port number.

   o  In cases where a server is specified or configured using a
      hostname and TCP port number, such as in the content of a DNS SRV
      record [RFC2782], two different configurations (or DNS SRV
      records) are considered to be referring to the same service
      instance if they contain the same hostname (subject to the usual
      case insensitive DNS name matching rules [RFC1034] [RFC1035]) and
      TCP port number. In these cases, configurations with different
      hostnames are considered to be referring to different service
      instances, even if those different hostnames happen to be aliases,
      or happen to resolve to the same IP address(es). Implementations
      SHOULD NOT resolve hostnames and then perform matching of IP
      address(es) in order to evaluate whether two entities should be
      determined to be the "same service instance".
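
   As an illustration only, and not as part of this specification, the
   "same service instance" rules above might be sketched in Python as
   follows. The names ServiceConfig and same_service_instance are
   hypothetical. The sketch assumes that a configuration is held as a
   host string plus a TCP port, that IP-address configurations and
   hostname configurations are never equal to each other, and that
   folding ASCII letters to lower case is an adequate stand-in for DNS
   case-insensitive name matching. It never resolves hostnames.

```python
from dataclasses import dataclass
import ipaddress

# Fold only ASCII A-Z, since DNS name matching is ASCII case-insensitive.
_ASCII_LOWER = {c: c + 32 for c in range(ord("A"), ord("Z") + 1)}


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical record of how a server was specified or configured."""
    host: str  # IP address literal or hostname, exactly as configured
    port: int  # TCP port number


def _instance_key(cfg: ServiceConfig):
    try:
        # IP literal: compare parsed addresses, so different spellings of
        # the same IPv6 address are treated as the same address.
        return ("ip", ipaddress.ip_address(cfg.host), cfg.port)
    except ValueError:
        # Hostname: compare names case-insensitively, without resolving them.
        return ("name", cfg.host.translate(_ASCII_LOWER), cfg.port)


def same_service_instance(a: ServiceConfig, b: ServiceConfig) -> bool:
    """Return True if both configurations name the same service instance."""
    return _instance_key(a) == _instance_key(b)
```

   A client could use such a key to remember which service instances it
   should not automatically reconnect to after a fatal error.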

   When an anycast service is configured on a particular IP address and
   port, it must be the case that although there is more than one
   physical server responding on that IP address, each such server can
   be treated as equivalent. What we mean by "equivalent" here is that
   both servers can provide the same service and, where appropriate, the
   same authentication information, such as PKI certificates, when
   establishing connections.

   In principle, anycast servers could maintain sufficient state that
   they can both handle packets in the same TCP connection. In order
   for this to work with DSO, they would need to also share DSO state.
   It is unlikely that this can be done successfully, however, so we
   recommend that each anycast server instance maintain its own session
   state.

   If a change in network topology causes packets in a particular TCP
   connection to be sent to an anycast server instance that does not
   know about the connection, the new server will automatically
   terminate the connection with a TCP reset, since it will have no
   record of the connection, and then the client can reconnect or stop
   using the connection, as appropriate.

   If after the connection is re-established, the client's assumption
   that it is connected to the same service is violated in some way,
   that would be considered to be incorrect behavior in this context.
   It is, however, outside the scope of this specification to
   make specific recommendations in this regard; that would be up to
   follow-on documents that describe specific uses of DNS Stateful
   Operations.

   The term "long-lived operations" refers to operations such as Push
   Notification subscriptions [I-D.ietf-dnssd-push], Discovery Relay
   interface subscriptions [I-D.ietf-dnssd-mdns-relay], and other future
   long-lived DNS operations that choose to use DSO as their basis.
   These operations establish state that persists beyond the lifetime of
   a traditional brief request/response transaction. This document, the
   base specification for DNS Stateful Operations, defines a framework
   for supporting long-lived operations, but does not itself define any
   long-lived operations. Nonetheless, to appreciate the design
   rationale behind DNS Stateful Operations, it is helpful to understand
   the kind of long-lived operations that it is intended to support.

   DNS Stateful Operations uses three kinds of message: "DSO request
   messages", "DSO response messages", and "DSO unacknowledged
   messages". A DSO request message elicits a DSO response message.
   DSO unacknowledged messages are unidirectional messages and do not
   induce a DNS response.

   Both DSO request messages and DSO unacknowledged messages are
   formatted as DNS request messages (the header QR bit is set to zero,
   as described in Section 6.2). One difference is that in DSO request
   messages the MESSAGE ID field is nonzero; in DSO unacknowledged
   messages it is zero.

   The content of DSO messages is expressed using type-length-value
   (TLV) syntax.

   In a DSO request message or DSO unacknowledged message the first TLV
   is referred to as the "Primary TLV" and determines the nature of the
   operation being performed, including whether it is an acknowledged or
   unacknowledged operation; any other TLVs in a DSO request message or
   unacknowledged message are referred to as "Additional TLVs" and serve
   additional non-primary purposes, which may be related to the primary
   purpose, or not, as in the case of the encryption padding TLV.

   A DSO response message may contain no TLVs, or it may contain one or
   more TLVs as appropriate to the information being communicated.
   In the context of DSO response messages, one or more TLVs with the
   same DSO-TYPE as the Primary TLV in the corresponding DSO request
   message are referred to as "Response Primary TLVs". Any other TLVs with
   different DSO-TYPEs are referred to as "Response Additional TLVs".
   The Response Primary TLV(s), if present, MUST occur first in the
   response message, before any Response Additional TLVs.

   Two timers (elapsed time since an event) are defined in this
   document:

   o  an inactivity timer (see Section 7.4 and Section 8.1)

   o  a keepalive timer (see Section 7.5 and Section 8.1)

   The timeouts associated with these timers are called the inactivity
   timeout and the keepalive interval, respectively. The term "Session
   Timeouts" is used to refer to this pair of timeout values.

   Resetting a timer means resetting the timer value to zero and
   starting the timer again. Clearing a timer means resetting the timer
   value to zero but NOT starting the timer again.

4.  Discussion

   There are several use cases for DNS Stateful Operations that can be
   described here.

   Firstly, establishing session parameters such as server-defined
   timeouts is of great use in the general management of persistent
   connections. For example, using DSO sessions for stub-to-recursive
   DNS-over-TLS [RFC7858] is more flexible for both the client and the
   server than attempting to manage sessions using just the
   edns-tcp-keepalive EDNS0 Option [RFC7828]. The simple set of TLVs
   defined in this document is sufficient to greatly enhance connection
   management for this use case.

   Secondly, DNS-SD [RFC6763] has evolved into a naturally session-based
   mechanism where, for example, long-lived subscriptions lend
   themselves to 'push' mechanisms as opposed to polling. Long-lived
   stateful connections and server-initiated messages align with this
   use case [I-D.ietf-dnssd-push].

   A general use case is that DNS traffic is often bursty but session
   establishment can be expensive. One challenge with long-lived
   connections is to maintain sufficient traffic to maintain NAT and
   firewall state. To mitigate this issue this document introduces a
   new concept for the DNS, that is DSO "Keepalive traffic". This
   traffic carries no DNS data and is not considered 'activity' in the
   classic DNS sense, but serves to maintain state in middleboxes, and
   to assure client and server that they still have connectivity to each
   other.

5.  Applicability

   DNS Stateful Operations are applicable in cases where it is useful to
   maintain an open session between a DNS client and server, where the
   transport allows such a session to be maintained, and where the
   transport guarantees in-order delivery of messages, on which DSO
   depends. Examples of transports that can support session signaling
   are DNS-over-TCP [RFC1035] [RFC7766] and DNS-over-TLS [RFC7858].

   Note that in the case of DNS over TLS, there is no mechanism for
   upgrading from DNS-over-TCP to DNS-over-TLS (see [RFC7858] section
   7).

   DNS Stateful Operations are not applicable for transports that cannot
   support clean session semantics, or that do not guarantee in-order
   delivery. While in principle such a transport could be constructed
   over UDP, the current DNS specification over UDP transport [RFC1035]
   does not provide in-order delivery or session semantics, and hence
   cannot be used. Similarly, DNS-over-HTTPS
   [I-D.ietf-doh-dns-over-https] cannot be used because HTTP has its own
   mechanism for managing sessions, and this is incompatible with the
   mechanism specified here.

   No other transports are currently defined for use with DNS Stateful
   Operations. Such transports can be added in the future, if they meet
   the requirements set out in the first paragraph of this section.

6.  Protocol Details

6.1.
      DSO Session Establishment

   In order for a session to be established between a client and a
   server, the client must first establish a connection to the server,
   using an applicable transport (see Section 5).

   In some environments it may be known in advance by external means
   that both client and server support DSO, and in these cases either
   client or server may initiate DSO messages at any time. In this
   case, the session is established as soon as the connection is
   established; this is referred to as implicit session establishment.

   However, in the typical case a server will not know in advance
   whether a client supports DSO, so in general, unless it is known in
   advance by other means that a client does support DSO, a server MUST
   NOT initiate DSO request messages or DSO unacknowledged messages
   until a DSO Session has been mutually established by at least one
   successful DSO request/response exchange initiated by the client, as
   described below. This is referred to as explicit session
   establishment.

   Until a DSO session has been implicitly or explicitly established, a
   client MUST NOT initiate DSO unacknowledged messages.

   A DSO Session is established over a connection by the client sending
   a DSO request message, such as a DSO Keepalive request message
   (Section 8.1), and receiving a response, with matching MESSAGE ID,
   and RCODE set to NOERROR (0), indicating that the DSO request was
   successful.

   If the RCODE in the response is set to DSOTYPENI ("DSO-TYPE Not
   Implemented", [TBA2] tentatively RCODE 11) this indicates that the
   server does support DSO, but does not implement the DSO-TYPE of the
   primary TLV in this DSO request message. A server implementing DSO
   MUST NOT return DSOTYPENI for a DSO Keepalive request message,
   because the Keepalive TLV is mandatory to implement.
   But in the future, if a client attempts to establish a DSO Session
   using a response-requiring DSO request message using some
   newly-defined DSO-TYPE that the server does not understand, that
   would result in a DSOTYPENI response. If the server returns
   DSOTYPENI then a DSO
   Session is not considered established, but the client is permitted to
   continue sending DNS messages on the connection, including other DSO
   messages such as the DSO Keepalive, which may result in a successful
   NOERROR response, yielding the establishment of a DSO Session.

   If the RCODE is set to any value other than NOERROR (0) or DSOTYPENI
   ([TBA2] tentatively 11), then the client MUST assume that the server
   does not implement DSO at all. In this case the client is permitted
   to continue sending DNS messages on that connection, but the client
   MUST NOT issue further DSO messages on that connection.

   Two other possibilities exist: the server might drop the connection,
   or the server might send no response to the DSO message. In the
   first case, the client SHOULD mark the server as not supporting DSO,
   and not attempt a DSO connection for some period of time (at least an
   hour) after the failed attempt. The client MAY reconnect but not use
   DSO, if appropriate.

   In the second case, the client SHOULD set a reasonable timeout, after
   which time the server will be assumed not to support DSO. At this
   point the client MUST drop the connection to the server, since the
   server's behavior is out of spec, and hence its state is undefined.
   The client MAY reconnect, but not use DSO, if appropriate.

   When the server receives a DSO request message from a client, and
   transmits a successful NOERROR response to that request, the server
   considers the DSO Session established.

   When the client receives the server's NOERROR response to its DSO
   request message, the client considers the DSO Session established.
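
   As an illustration only, the client-side decision described above
   can be sketched in Python. The names Outcome and
   classify_first_response are hypothetical, and the DSOTYPENI value of
   11 is the tentative assignment ([TBA2]) mentioned in this document,
   not a confirmed code point. The sketch covers only the case where a
   response actually arrives; a dropped connection or a missing
   response would be handled by the caller. Raising an exception on a
   mismatched MESSAGE ID is a choice made for this sketch, not
   behaviour taken from this section.

```python
from enum import Enum

RCODE_NOERROR = 0
RCODE_DSOTYPENI = 11  # tentative value ([TBA2]); an assumption of this sketch


class Outcome(Enum):
    """Hypothetical result of a client's first DSO request/response exchange."""
    ESTABLISHED = "DSO Session established"
    NOT_ESTABLISHED_DSO_SUPPORTED = "DSO-TYPE not implemented; other DSO requests may be tried"
    NO_DSO_SUPPORT = "server assumed not to implement DSO"


def classify_first_response(request_id: int, response_id: int, rcode: int) -> Outcome:
    """Classify the response to a client's session-establishing DSO request."""
    if request_id == 0:
        # A zero MESSAGE ID marks an unacknowledged message, which a client
        # must not send before a session has been established.
        raise ValueError("a DSO request message needs a nonzero MESSAGE ID")
    if response_id != request_id:
        raise ValueError("response MESSAGE ID does not match the request")
    if rcode == RCODE_NOERROR:
        return Outcome.ESTABLISHED
    if rcode == RCODE_DSOTYPENI:
        # The server speaks DSO but not this DSO-TYPE; a Keepalive may still succeed.
        return Outcome.NOT_ESTABLISHED_DSO_SUPPORTED
    # Any other RCODE: no further DSO messages on this connection.
    return Outcome.NO_DSO_SUPPORT
```

   A caller might treat Outcome.NO_DSO_SUPPORT as the signal to stop
   issuing DSO messages on that connection while continuing to send
   ordinary DNS messages, as the text above permits.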

   Once a DSO Session has been established, either end may unilaterally
   send appropriate DSO messages at any time, and therefore either
   client or server may be the initiator of a message.

   Once a DSO Session has been established, clients and servers should
   behave as described in this specification with regard to inactivity
   timeouts and session termination, not as previously prescribed in the
   earlier specification for DNS over TCP [RFC7766].

   Because the Keepalive TLV can't fail (that is, can't return an RCODE
   other than NOERROR), it is an ideal candidate for use in establishing
   a DSO session. Any other option that can only succeed MAY also be
   used to establish a DSO session. For clients that implement only the
   DSO-TYPEs defined in this base specification, sending a Keepalive TLV
   is the only DSO request message they have available to initiate a DSO
   Session. Even for clients that do implement other future DSO-TYPEs,
   for simplicity they MAY elect to always send an initial DSO Keepalive
   request message as their way of initiating a DSO Session. A future
   definition of a new response-requiring DSO-TYPE gives implementers
   the option of using that new DSO-TYPE if they wish, but does not
   change the fact that sending a Keepalive TLV remains a valid way of
   initiating a DSO Session.

6.1.1.  Connection Sharing

   As previously specified for DNS over TCP [RFC7766]:

      To mitigate the risk of unintentional server overload, DNS
      clients MUST take care to minimize the number of concurrent
      TCP connections made to any individual server. It is RECOMMENDED
      that for any given client/server interaction there SHOULD be
      no more than one connection for regular queries, one for zone
      transfers, and one for each protocol that is being used on top
      of TCP (for example, if the resolver was using TLS).
However, 565 it is noted that certain primary/secondary configurations 566 with many busy zones might need to use more than one TCP 567 connection for zone transfers for operational reasons (for 568 example, to support concurrent transfers of multiple zones). 570 A single server may support multiple services, including DNS Updates 571 [RFC2136], DNS Push Notifications [I-D.ietf-dnssd-push], and other 572 services, for one or more DNS zones. When a client discovers that 573 the target server for several different operations is the same target 574 hostname and port, the client SHOULD use a single shared DSO Session 575 for all those operations. A client SHOULD NOT open multiple 576 connections to the same target host and port just because the names 577 being operated on are different or happen to fall within different 578 zones. This requirement has two benefits. First, it reduces 579 unnecessary connection load on the DNS server. Second, it avoids 580 paying the TCP slow start penalty when making subsequent connections 581 to the same server. 583 However, server implementers and operators should be aware that 584 connection sharing may not be possible in all cases. A single host 585 device may be home to multiple independent client software instances 586 that don't coordinate with each other. Similarly, multiple 587 independent client devices behind the same NAT gateway will also 588 typically appear to the DNS server as different source ports on the 589 same client IP address. Because of these constraints, a DNS server 590 MUST be prepared to accept multiple connections from different source 591 ports on the same client IP address. 593 6.1.2. Zero Round-Trip Operation 595 DSO permits zero round-trip operation using TCP Fast Open [RFC7413] 596 and TLS 1.3 [I-D.ietf-tls-tls13] to reduce or eliminate round trips 597 in session establishment. 
599 A client MAY send multiple response-requiring DSO messages using TCP 600 Fast Open or TLS 1.3 early data, without having to wait for a 601 response to the first request message to confirm successful 602 establishment of a DSO session. 604 However, a client MUST NOT send non-response-requiring DSO request 605 messages until after a DSO Session has been mutually established. 607 Similarly, a server MUST NOT send DSO request messages until it has 608 received a response-requiring DSO request message from a client and 609 transmitted a successful NOERROR response for that request. 611 Caution must be taken to ensure that DSO messages sent before the 612 first round-trip is completed are idempotent, or are otherwise immune 613 to any problems that could result from the inadvertent replay that 614 can occur with zero round-trip operation. 616 6.1.3. Middlebox Considerations 618 Where an application-layer middlebox (e.g., a DNS proxy, forwarder, 619 or session multiplexer) is in the path, care must be taken to avoid 620 inappropriately passing session signaling through the middlebox. 622 In cases where a DSO session is terminated on one side of a 623 middlebox, and then some session is opened on the other side of the 624 middlebox in order to satisfy requests sent over the first DSO 625 session, any such session MUST be treated as a separate session. If 626 the middlebox does implement DSO sessions, it MUST handle 627 unrecognized TLVs in the same way as any other DSO implementation as 628 described below in Section 6.2.2.4. 630 This does not preclude the use of DSO messages in the presence of an 631 IP-layer middlebox, such as a NAT that rewrites IP-layer and/or 632 transport-layer headers but otherwise preserves the effect of a 633 single session between the client and the server. And of course it 634 does not apply to middleboxes that do not implement DNS Stateful 635 Operations.
637 These restrictions do not apply to such middleboxes: since they have 638 no way to understand a DSO message, a pass-through middlebox like the 639 one described in the previous paragraph will pass DSO messages 640 unchanged or drop them (or possibly drop the connection). A 641 middlebox that is not doing a strict pass-through will have no way to 642 know on which connection to forward a DSO message, and therefore will 643 not be able to behave incorrectly. 645 To illustrate the above, consider a network where a middlebox 646 terminates one or more TCP connections from clients and multiplexes 647 the queries therein over a single TCP connection to an upstream 648 server. The DSO messages and any associated state are specific to 649 the individual TCP connections. A DSO-aware middlebox MAY in some 650 circumstances be able to retain associated state and pass it between 651 the client and server (or vice versa) but this would be highly TLV- 652 specific. For example, the middlebox may be able to maintain a list 653 of which clients have made Push Notification subscriptions 654 [I-D.ietf-dnssd-push] and make its own subscription(s) on their 655 behalf, relaying any subsequent notifications to the client (or 656 clients) that have subscribed to that particular notification. 658 6.2. Message Format 660 A DSO message begins with the standard twelve-byte DNS message header 661 [RFC1035] with the OPCODE field set to the DSO OPCODE ([TBA1] 662 tentatively 6). However, unlike standard DNS messages, the question 663 section, answer section, authority records section and additional 664 records sections are not present. The corresponding count fields 665 (QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT) MUST be set to zero on 666 transmission. 668 If a DSO message is received where any of the count fields are not 669 zero, then a FORMERR MUST be returned. 
671                                             1   1   1   1   1   1
672     0   1   2   3   4   5   6   7   8   9   0   1   2   3   4   5
673   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
674   |                          MESSAGE ID                           |
675   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
676   |QR |    OPCODE     |            Z              |     RCODE     |
677   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
678   |                    QDCOUNT (MUST be zero)                     |
679   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
680   |                    ANCOUNT (MUST be zero)                     |
681   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
682   |                    NSCOUNT (MUST be zero)                     |
683   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
684   |                    ARCOUNT (MUST be zero)                     |
685   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
686   |                                                               |
687   /                           DSO Data                            /
688   /                                                               /
689   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
691 6.2.1. DNS Header Fields in DSO Messages 693 In an unacknowledged message the MESSAGE ID field MUST be set to 694 zero. In an acknowledged request message the MESSAGE ID field MUST 695 be set to a unique nonzero value, that the initiator is not currently 696 using for any other active operation on this connection. For the 697 purposes here, a MESSAGE ID is in use in this DSO Session if the 698 initiator has used it in a request for which it is still awaiting a 699 response, or if the client has used it to set up a long-lived 700 operation that has not yet been cancelled. For example, a long-lived 701 operation could be a Push Notification subscription 702 [I-D.ietf-dnssd-push] or a Discovery Relay interface subscription 703 [I-D.ietf-dnssd-mdns-relay]. 705 Whether a message is acknowledged or unacknowledged is determined 706 only by the specification for the Primary TLV.
An acknowledgment 707 cannot be requested by including a nonzero message ID in a message 708 the primary TLV of which is specified to be unacknowledged, nor can 709 an acknowledgment be prevented by sending a message ID of zero in a 710 message with a primary TLV that is specified to be acknowledged. A 711 responder that receives either such malformed message MUST treat it 712 as a fatal error and forcibly abort the connection immediately. 714 In a request or unacknowledged message the DNS Header QR bit MUST be 715 zero (QR=0). If the QR bit is not zero the message is not a request 716 or unacknowledged message. 718 In a response message the DNS Header QR bit MUST be one (QR=1). 719 If the QR bit is not one the message is not a response message. 721 In a response message (QR=1) the MESSAGE ID field MUST contain a copy 722 of the value of the MESSAGE ID field in the request message being 723 responded to. In a response message (QR=1) the MESSAGE ID field MUST 724 NOT be zero. If a response message (QR=1) is received where the 725 MESSAGE ID is zero this is a fatal error and the recipient MUST 726 forcibly abort the connection immediately. 728 The DNS Header OPCODE field holds the DSO OPCODE value ([TBA1] 729 tentatively 6). 731 The Z bits are currently unused in DSO messages, and in both DSO 732 requests and DSO responses the Z bits MUST be set to zero (0) on 733 transmission and MUST be silently ignored on reception. 735 In a DNS request message (QR=0) the RCODE is set according to the 736 definition of the request. For example, in a Retry Delay message 737 (Section 7.6.1) the RCODE indicates the reason for termination. 738 However, in most cases, except where clearly specified otherwise, in 739 a DNS request message (QR=0) the RCODE is set to zero on 740 transmission, and silently ignored on reception. 
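The header rules above, together with the zero-count requirement of Section 6.2, can be checked mechanically. The following Python sketch is illustrative only: DsoHeader, parse_header and header_problem are hypothetical names, the string results are placeholders for whatever an implementation does internally, and the value 6 for the DSO OPCODE is the tentative [TBA1] assignment. The sketch does not cover the check that a zero or nonzero MESSAGE ID agrees with the specification of the Primary TLV, since that requires knowledge of the TLV itself.

```python
import struct
from typing import NamedTuple, Optional, Tuple

DSO_OPCODE = 6  # tentative value for [TBA1] in this draft


class DsoHeader(NamedTuple):
    message_id: int
    qr: int
    opcode: int
    rcode: int
    counts: Tuple[int, int, int, int]  # QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT


def parse_header(data: bytes) -> DsoHeader:
    """Unpack the twelve-byte DNS header. The Z bits are ignored on reception."""
    if len(data) < 12:
        raise ValueError("short DNS header")
    message_id, flags, qd, an, ns, ar = struct.unpack("!6H", data[:12])
    return DsoHeader(
        message_id=message_id,
        qr=flags >> 15,              # 1 bit
        opcode=(flags >> 11) & 0xF,  # 4 bits
        rcode=flags & 0xF,           # 4 bits; the 7 Z bits in between are dropped
        counts=(qd, an, ns, ar),
    )


def header_problem(h: DsoHeader) -> Optional[str]:
    """Return None if the header is acceptable, else a short reason string."""
    if h.opcode != DSO_OPCODE:
        return "not-dso"   # not a DSO message; handled by other DNS code
    if any(h.counts):
        return "formerr"   # count fields MUST be zero
    if h.qr == 1 and h.message_id == 0:
        return "abort"     # a response MUST NOT carry MESSAGE ID zero
    return None
```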
742 The RCODE value in a response message (QR=1) may be one of the 743 following values:
745   +---------+-----------+---------------------------------------------+
746   |  Code   | Mnemonic  | Description                                 |
747   +---------+-----------+---------------------------------------------+
748   |    0    | NOERROR   | Operation processed successfully            |
749   |         |           |                                             |
750   |    1    | FORMERR   | Format error                                |
751   |         |           |                                             |
752   |    2    | SERVFAIL  | Server failed to process request due to a   |
753   |         |           | problem with the server                     |
754   |         |           |                                             |
755   |    3    | NXDOMAIN  | Name Error -- Named entity does not exist   |
756   |         |           | (TLV-dependent)                             |
757   |         |           |                                             |
758   |    4    | NOTIMP    | DSO not supported                           |
759   |         |           |                                             |
760   |    5    | REFUSED   | Operation declined for policy reasons       |
761   |         |           |                                             |
762   |    9    | NOTAUTH   | Not Authoritative (TLV-dependent)           |
763   |         |           |                                             |
764   | [TBA2]  | DSOTYPENI | Primary TLV's DSO-TYPE is not implemented   |
765   |   11    |           |                                             |
766   +---------+-----------+---------------------------------------------+
768 Use of the above RCODEs is likely to be common in DSO but does not 769 preclude the definition and use of other codes in future documents 770 that make use of DSO. 772 If a document defining a new DSO-TYPE makes use of NXDOMAIN (Name 773 Error) or NOTAUTH (Not Authoritative) then that document MUST specify 774 the specific interpretation of these RCODE values in the context of 775 that new DSO TLV. 777 6.2.2. DSO Data 779 The standard twelve-byte DNS message header with its zero-valued 780 count fields is followed by the DSO Data, expressed using TLV syntax, 781 as described below in Section 6.2.2.1. 783 A DSO request message or DSO unacknowledged message MUST contain at 784 least one TLV. The first TLV in a DSO request message or DSO 785 unacknowledged message is referred to as the "Primary TLV" and 786 determines the nature of the operation being performed, including 787 whether it is an acknowledged or unacknowledged operation.
In some 788 cases it may be appropriate to include other TLVs in a request 789 message or unacknowledged message, such as the Encryption Padding TLV 790 (Section 8.3), and these extra TLVs are referred to as the 791 "Additional TLVs" and are not limited to what is defined in this 792 document. New "Additional TLVs" may be defined in the future and 793 those definitions will describe when their use is appropriate. 795 A DSO response message may contain no TLVs, or it may be specified to 796 contain one or more TLVs appropriate to the information being 797 communicated. This includes "Primary TLVs" and "Additional TLVs" 798 defined in this document as well as in future TLV definitions. It 799 may be permissible for an additional TLV to appear in a response to a 800 primary TLV even though the specification of that primary TLV does 801 not specify it explicitly. See Section 9.2 for more information. 803 A DSO response message may contain one or more TLVs with DSO-TYPE the 804 same as the Primary TLV from the corresponding DSO request message, 805 in which case those TLV(s) are referred to as "Response Primary 806 TLVs". A DSO response message is not required to carry Response 807 Primary TLVs. The MESSAGE ID field in the DNS message header is 808 sufficient to identify the DSO request message to which this response 809 message relates. 811 A DSO response message may contain one or more TLVs with DSO-TYPEs 812 different from the Primary TLV from the corresponding DSO request 813 message, in which case those TLV(s) are referred to as "Response 814 Additional TLVs". 816 Response Primary TLV(s), if present, MUST occur first in the response 817 message, before any Response Additional TLVs. 819 It is anticipated that most DSO operations will be specified to use 820 request messages, which generate corresponding responses. In some 821 specialized high-traffic use cases, it may be appropriate to specify 822 unacknowledged messages. 
Unacknowledged messages can be more 823 efficient on the network, because they don't generate a stream of 824 corresponding reply messages. Using unacknowledged messages can also 825 simplify software in some cases, by removing the need for an initiator to 826 maintain state while it waits to receive replies it doesn't care 827 about. When the specification for a particular TLV states that, when 828 used as a Primary TLV (i.e., first) in an outgoing DNS request 829 message (i.e., QR=0), that message is to be unacknowledged, the 830 MESSAGE ID field MUST be set to zero and the receiver MUST NOT 831 generate any response message corresponding to this unacknowledged 832 message. 834 The previous point, that the receiver MUST NOT generate responses to 835 unacknowledged messages, applies even in the case of errors. When a 836 DSO message is received where both the QR bit and the MESSAGE ID 837 field are zero, the receiver MUST NOT generate any response. For 838 example, if the DSO-TYPE in the Primary TLV is unrecognized, then a 839 DSOTYPENI error MUST NOT be returned; instead the receiver MUST 840 forcibly abort the connection immediately. 842 Unacknowledged messages MUST NOT be used "speculatively" in cases 843 where the sender doesn't know if the receiver supports the Primary 844 TLV in the message, because there is no way to receive any response 845 to indicate success or failure. Unacknowledged messages are only 846 appropriate in cases where the sender already knows that the receiver 847 supports, and wishes to receive, these messages. 849 For example, after a client has subscribed for Push Notifications 850 [I-D.ietf-dnssd-push], the subsequent event notifications are then 851 sent as unacknowledged messages, and this is appropriate because the 852 client initiated the message stream by virtue of its Push 853 Notification subscription, thereby indicating its support of Push 854 Notifications, and its desire to receive those notifications.
856 Similarly, after a Discovery Relay client has subscribed to receive 857 inbound mDNS (multicast DNS, [RFC6762]) traffic from a Discovery 858 Relay, the subsequent stream of received packets is then sent using 859 unacknowledged messages, and this is appropriate because the client 860 initiated the message stream by virtue of its Discovery Relay link 861 subscription, thereby indicating its support of Discovery Relay, and 862 its desire to receive inbound mDNS packets over that DSO session 863 [I-D.ietf-dnssd-mdns-relay]. 865 6.2.2.1. TLV Syntax 867 All TLVs, whether used as "Primary", "Additional", "Response 868 Primary", or "Response Additional", use the same encoding syntax. 870 Specifications that define new TLVs must specify whether the DSO-TYPE 871 can be used as the Primary TLV, used as an Additional TLV, or used in 872 either context, both in the case of requests and of responses. The 873 specification for a TLV must also state whether, when used as the 874 Primary (i.e., first) TLV in a DNS request message (i.e., QR=0), that 875 DSO message is to be acknowledged. If the DSO message is to be 876 acknowledged, the specification must also state which TLVs, if any, 877 are to be included in the response. The Primary TLV may or may not 878 be contained in the response, depending on what is specified for that 879 TLV.
881                                             1   1   1   1   1   1
882     0   1   2   3   4   5   6   7   8   9   0   1   2   3   4   5
883   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
884   |                           DSO-TYPE                            |
885   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
886   |                          DSO-LENGTH                           |
887   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
888   |                                                               |
889   /                           DSO-DATA                            /
890   /                                                               /
891   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
893 DSO-TYPE: A 16-bit unsigned integer, in network (big endian) byte 894 order, giving the DSO-TYPE of the current DSO TLV per the IANA DSO 895 Type Code Registry.
897 DSO-LENGTH: A 16-bit unsigned integer, in network (big endian) byte 898 order, giving the size in bytes of the DSO-DATA. 900 DSO-DATA: Type-code specific format. The generic DSO machinery 901 treats the DSO-DATA as an opaque "blob" without attempting to 902 interpret it. Interpretation of the meaning of the DSO-DATA for a 903 particular DSO-TYPE is the responsibility of the software that 904 implements that DSO-TYPE. 906 6.2.2.2. Request TLVs 908 The first TLV in a DSO request message or unacknowledged message is 909 the "Primary TLV" and indicates the operation to be performed. A DSO 910 request message or unacknowledged message MUST contain at least 911 one TLV, the Primary TLV. 913 Immediately following the Primary TLV, a DSO request message or 914 unacknowledged message MAY contain one or more "Additional TLVs", 915 which specify additional parameters relating to the operation. 917 6.2.2.3. Response TLVs 919 Depending on the operation, a DSO response message MAY contain no 920 TLVs, because it is simply a response to a previous request message, 921 and the MESSAGE ID in the header is sufficient to identify the 922 request in question. Or it may contain a single response TLV, with 923 the same DSO-TYPE as the Primary TLV in the request message. 924 Alternatively it may contain one or more TLVs of other types, or a 925 combination of the above, as appropriate for the information that 926 needs to be communicated. The specification for each DSO TLV 927 determines what TLVs are required in a response to a request using 928 that TLV. 930 If a DSO response is received for an operation where the 931 specification requires that the response carry a particular TLV or 932 TLVs, and the required TLV(s) are not present, then this is a fatal 933 error and the recipient of the defective response message MUST 934 forcibly abort the connection immediately. 936 6.2.2.4.
Unrecognized TLVs 938 If a DSO request message is received containing an unrecognized Primary 939 TLV, with a nonzero MESSAGE ID (indicating that a response is 940 expected), then the receiver MUST send an error response with 941 matching MESSAGE ID, and RCODE DSOTYPENI ([TBA2] tentatively 11). 942 The error response MUST NOT contain a copy of the unrecognized 943 Primary TLV. 945 If a DSO unacknowledged message is received containing an unrecognized 946 Primary TLV, with a zero MESSAGE ID (indicating that no response is 947 expected), then this is a fatal error and the recipient MUST forcibly 948 abort the connection immediately. 950 If a DSO request message or unacknowledged message is received where 951 the Primary TLV is recognized, containing one or more unrecognized 952 Additional TLVs, the unrecognized Additional TLVs MUST be silently 953 ignored, and the remainder of the message is interpreted and handled 954 as if the unrecognized parts were not present. 956 Similarly, if a DSO response message is received containing one or 957 more unrecognized TLVs, the unrecognized TLVs MUST be silently 958 ignored, and the remainder of the message is interpreted and handled 959 as if the unrecognized parts were not present. 961 6.2.3. EDNS(0) and TSIG 963 Since the ARCOUNT field MUST be zero, a DSO message can't contain a 964 valid EDNS(0) option in the additional records section. If 965 functionality provided by current or future EDNS(0) options is 966 desired for DSO messages, one or more new DSO TLVs need to be defined 967 to carry the necessary information. 969 For example, the EDNS(0) Padding Option [RFC7830] used for security 970 purposes is not permitted in a DSO message, so if message padding is 971 desired for DSO messages then the Encryption Padding TLV described in 972 Section 8.3 MUST be used. 974 Similarly, a DSO message MUST NOT contain a TSIG record.
A TSIG 975 record in a conventional DNS message is added as the last record in 976 the additional records section, and carries a signature computed over 977 the preceding message content. Since DSO data appears *after* the 978 additional records section, it would not be included in the signature 979 calculation. If use of signatures with DSO messages becomes 980 necessary in the future, a new DSO TLV needs to be defined to perform 981 this function. 983 Note however that, while DSO *messages* cannot include EDNS(0) or 984 TSIG records, a DSO *session* is typically used to carry a whole 985 series of DNS messages of different kinds, including DSO messages, 986 and other DNS message types like Query [RFC1034] [RFC1035] and Update 987 [RFC2136], and those messages can carry EDNS(0) and TSIG records. 989 Although messages may contain other EDNS(0) options as appropriate, 990 this specification explicitly prohibits use of the edns-tcp-keepalive 991 EDNS0 Option [RFC7828] in *any* messages sent on a DSO Session 992 (because it is obsoleted by the functionality provided by the DSO 993 Keepalive operation). If any message sent on a DSO Session contains 994 an edns-tcp-keepalive EDNS0 Option this is a fatal error and the 995 recipient of the defective message MUST forcibly abort the connection 996 immediately. 998 6.3. Message Handling 1000 The initiator MUST set the value of the QR bit in the DNS header to 1001 zero (0), and the responder MUST set it to one (1). 1003 As described above in Section 6.2.1 whether an outgoing message with 1004 QR=0 is unacknowledged or acknowledged is determined by the 1005 specification for the Primary TLV, which in turn determines whether 1006 the MESSAGE ID field in that outgoing message will be zero or 1007 nonzero. 1009 A DSO unacknowledged message has both the QR bit and the MESSAGE ID 1010 field set to zero, and MUST NOT elicit a response. 
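The acknowledgment rule just stated combines with the TLV syntax of Section 6.2.2.1 and the unrecognized-TLV rules of Section 6.2.2.4 into a simple receive path for messages with QR=0. The following Python sketch is illustrative only: parse_tlvs and dispatch are hypothetical names, known_types stands for whatever set of DSO-TYPEs an implementation supports, and the value 11 for DSOTYPENI is the tentative [TBA2] assignment. Raising ValueError for malformed DSO Data or a missing Primary TLV is an assumption of this sketch; the caller decides how to react.

```python
import struct
from typing import List, Optional, Set, Tuple

DSOTYPENI = 11  # tentative value for [TBA2] in this draft
Tlv = Tuple[int, bytes]  # (DSO-TYPE, DSO-DATA)


def parse_tlvs(data: bytes) -> List[Tlv]:
    """Split DSO Data into (DSO-TYPE, DSO-DATA) pairs; DSO-DATA stays opaque."""
    tlvs: List[Tlv] = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated DSO-TYPE/DSO-LENGTH")
        # Both fields are 16-bit unsigned integers in network byte order.
        dso_type, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if len(data) - offset < length:
            raise ValueError("truncated DSO-DATA")
        tlvs.append((dso_type, data[offset:offset + length]))
        offset += length
    return tlvs


def dispatch(message_id: int, dso_data: bytes,
             known_types: Set[int]) -> Tuple[str, Optional[Tlv], List[Tlv]]:
    """Decide how to treat an incoming DSO message that has QR=0."""
    tlvs = parse_tlvs(dso_data)
    if not tlvs:
        raise ValueError("no Primary TLV")  # reaction is left to the caller
    primary, additional = tlvs[0], tlvs[1:]
    if primary[0] not in known_types:
        if message_id == 0:
            # Unacknowledged message: no response is ever sent; abort instead.
            return ("abort", None, [])
        # Request: answer with RCODE DSOTYPENI and no copy of the Primary TLV.
        return ("dsotypeni", None, [])
    # Unrecognized Additional TLVs are silently ignored.
    recognized = [t for t in additional if t[0] in known_types]
    return ("process", primary, recognized)
```

Keeping DSO-DATA as raw bytes mirrors the text's description of the generic DSO machinery treating it as an opaque blob; interpretation is left to the code that implements each DSO-TYPE.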
1012 Every DSO request message (QR=0) with a nonzero MESSAGE ID field is 1013 an acknowledged DSO request, and MUST elicit a corresponding response 1014 (QR=1), which MUST have the same MESSAGE ID in the DNS message header 1015 as in the corresponding request. 1017 Valid DSO request messages sent by the client with a nonzero MESSAGE 1018 ID field elicit a response from the server, and valid DSO request 1019 messages sent by the server with a nonzero MESSAGE ID field elicit a 1020 response from the client. 1022 The namespaces of 16-bit MESSAGE IDs are independent in each 1023 direction. This means it is *not* an error for both client and 1024 server to send request messages at the same time as each other, using 1025 the same MESSAGE ID, in different directions. This simplification is 1026 necessary in order for the protocol to be implementable. It would be 1027 infeasible, and is also unnecessary, to require the client and server 1028 to coordinate with each other regarding allocation of new unique 1029 MESSAGE IDs. 1030 The value of 1031 the 16-bit MESSAGE ID combined with the identity of the initiator 1032 (client or server) is sufficient to unambiguously identify the 1033 operation in question. This can be thought of as a 17-bit message 1034 identifier space, using message identifiers 0x00001-0x0FFFF for 1035 client-to-server DSO request messages, and message identifiers 1036 0x10001-0x1FFFF for server-to-client DSO request messages. The 1037 least-significant 16 bits are stored explicitly in the MESSAGE ID 1038 field of the DSO message, and the most-significant bit is implicit 1039 from the direction of the message. 1041 As described above in Section 6.2.1, an initiator MUST NOT reuse a 1042 MESSAGE ID that it already has in use for an outstanding request 1043 (unless specified otherwise by the relevant specification for the 1044 DSO-TYPE in question).
At the very least, this means that a MESSAGE 1045 ID can't be reused in a particular direction on a particular DSO 1046 Session while the initiator is waiting for a response to a previous 1047 request using that MESSAGE ID on that DSO Session (unless specified 1048 otherwise by the relevant specification for the DSO-TYPE in 1049 question), and for a long-lived operation the MESSAGE ID for the 1050 operation can't be reused while that operation remains active. 1052 If a client or server receives a response (QR=1) where the MESSAGE ID 1053 is zero, or is any other value that does not match the MESSAGE ID of 1054 any of its outstanding operations, this is a fatal error and the 1055 recipient MUST forcibly abort the connection immediately. 1057 If a responder receives a request (QR=0) where the MESSAGE ID is not 1058 zero, and the responder tracks query MESSAGE IDs, and the MESSAGE ID 1059 matches the MESSAGE ID of a query it received for which a response 1060 has not yet been sent, it MUST forcibly abort the connection 1061 immediately. This behavior is required to prevent a hypothetical 1062 attack that takes advantage of undefined behavior in this case. 1063 However, if the server does not track MESSAGE IDs in this way, no 1064 such risk exists, so tracking MESSAGE IDs just to implement this 1065 sanity check is not required. 1067 6.3.1. Error Responses 1069 When an unacknowledged DSO message type is received (MESSAGE ID field 1070 is zero), the receiver SHOULD already be expecting this DSO message 1071 type. Section 6.2.2.4 describes the handling of unknown DSO message 1072 types. Parsing errors MUST also result in the receiver aborting the 1073 connection. When an unacknowledged DSO message of an unexpected type 1074 is received, the receiver should abort the connection. For other 1075 internal errors in processing an unacknowledged DSO message, whether 1076 the connection should be aborted is implementation dependent and 1077 depends on the severity of the error.
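The MESSAGE ID rules of Section 6.3 lend themselves to a small bookkeeping structure on the initiator side. The following Python sketch is illustrative only: MessageIdTracker and its methods are hypothetical names, and choosing candidate IDs at random is an assumption of this sketch, since the text only requires a nonzero value that is not currently in use. Each endpoint would keep one such tracker per DSO Session for the requests it initiates, which reflects the independence of the two directions.

```python
import random
from typing import Dict


class MessageIdTracker:
    """Tracks the MESSAGE IDs this endpoint has in use as an initiator."""

    def __init__(self) -> None:
        # message_id -> True if the ID belongs to a long-lived operation
        self._in_use: Dict[int, bool] = {}

    def allocate(self, long_lived: bool = False) -> int:
        """Pick a nonzero 16-bit MESSAGE ID that is not currently in use."""
        if len(self._in_use) >= 0xFFFF:
            raise RuntimeError("no free MESSAGE ID on this DSO Session")
        while True:
            candidate = random.randint(1, 0xFFFF)
            if candidate not in self._in_use:
                self._in_use[candidate] = long_lived
                return candidate

    def on_response(self, message_id: int, rcode: int) -> bool:
        """Handle a response (QR=1). Return False if the connection must be aborted."""
        if message_id == 0 or message_id not in self._in_use:
            return False  # fatal error: no matching outstanding operation
        long_lived = self._in_use[message_id]
        if not long_lived or rcode != 0:
            # A short-lived request is finished; a long-lived operation ends
            # when a nonzero RCODE arrives for it.
            del self._in_use[message_id]
        return True

    def cancel(self, message_id: int) -> None:
        """Free the ID of a long-lived operation the initiator has cancelled."""
        self._in_use.pop(message_id, None)
```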
1079 When an acknowledged DSO request message is unsuccessful for some 1080 reason, the responder returns an error code to the initiator. 1082 In the case of a server returning an error code to a client in 1083 response to an unsuccessful DSO request message, the server MAY 1084 choose to end the DSO Session, or MAY choose to allow the DSO Session 1085 to remain open. For error conditions that only affect the single 1086 operation in question, the server SHOULD return an error response to 1087 the client and leave the DSO Session open for further operations. 1089 For error conditions that are likely to make all operations 1090 unsuccessful in the immediate future, the server SHOULD return an 1091 error response to the client and then end the DSO Session by sending 1092 a Retry Delay message, as described in Section 7.6.1. 1094 Upon receiving an error response from the server, a client SHOULD NOT 1095 automatically close the DSO Session. An error relating to one 1096 particular operation on a DSO Session does not necessarily imply that 1097 all other operations on that DSO Session have also failed, or that 1098 future operations will fail. The client should assume that the 1099 server will make its own decision about whether or not to end the DSO 1100 Session, based on the server's determination of whether the error 1101 condition pertains to this particular operation, or would also apply 1102 to any subsequent operations. If the server does not end the DSO 1103 Session by sending the client a Retry Delay message (Section 7.6.1) 1104 then the client SHOULD continue to use that DSO Session for 1105 subsequent operations. 1107 6.4. Flow Control Considerations 1109 Because unacknowledged DSO messages do not generate an immediate 1110 response from the responder, if there is no other traffic flowing 1111 from the responder to the initiator, this can result in a 200ms delay 1112 before the TCP acknowledgment is sent to the initiator [NagleDA]. 
If 1113 the initiator has another message pending, but has not yet filled its 1114 output buffer, this can delay the delivery of that message by more 1115 than 200ms. In many cases, this will make no difference. However, 1116 implementors should be aware of this issue. Some operating systems 1117 offer ways to disable the 200ms TCP acknowledgment delay; this may be 1118 useful for relatively low-traffic sessions, or sessions with bursty 1119 traffic flows. 1121 6.5. Responder-Initiated Operation Cancellation 1123 This document, the base specification for DNS Stateful Operations, 1124 does not itself define any long-lived operations, but it defines a 1125 framework for supporting long-lived operations, such as Push 1126 Notification subscriptions [I-D.ietf-dnssd-push] and Discovery Relay 1127 interface subscriptions [I-D.ietf-dnssd-mdns-relay]. 1129 Generally speaking, a long-lived operation is initiated by the 1130 initiator, and, if successful, remains active until the initiator 1131 terminates the operation. 1133 However, it is possible that a long-lived operation may be valid at 1134 the time it was initiated, but then a later change of circumstances 1135 may render that previously valid operation invalid. 1137 For example, a long-lived client operation may pertain to a name that 1138 the server is authoritative for, but then the server configuration is 1139 changed such that it is no longer authoritative for that name. 1141 In such cases, instead of terminating the entire session it may be 1142 desirable for the responder to be able to cancel selectively only 1143 those operations that have become invalid. 1145 The responder performs this selective cancellation by sending a new 1146 response message, with the MESSAGE ID field containing the MESSAGE ID 1147 of the long-lived operation that is to be terminated (that it had 1148 previously acknowledged with a NOERROR RCODE), and the RCODE field of 1149 the new response message giving the reason for cancellation. 
1151 After a response message with nonzero RCODE has been sent, that 1152 operation has been terminated from the responder's point of view, and 1153 the responder sends no more messages relating to that operation. 1155 After a response message with nonzero RCODE has been received by the 1156 initiator, that operation has been terminated from the initiator's 1157 point of view, and the cancelled operation's MESSAGE ID is now free 1158 for reuse. 1160 7. DSO Session Lifecycle and Timers 1162 7.1. DSO Session Initiation 1164 A DSO Session begins as described in Section 6.1. 1166 The client may perform as many DNS operations as it wishes using the 1167 newly created DSO Session. When the client has multiple messages to 1168 send, it SHOULD NOT wait for each response before sending the next 1169 message. This prevents TCP's delayed acknowledgement algorithm from 1170 forcing the client into a slow lock-step. The server MUST act on 1171 messages in the order they are transmitted, but SHOULD NOT delay 1172 sending responses to those messages as they become available in order 1173 to return them in the order the requests were received. [RFC7766] 1174 section 3.3 specifies this in more detail. 1176 7.2. DSO Session Timeouts 1178 Two timeout values are associated with a DSO Session: the inactivity 1179 timeout, and the keepalive interval. Both values are communicated in 1180 the same TLV, the Keepalive TLV (Section 8.1). 1182 The first timeout value, the inactivity timeout, is the maximum time 1183 for which a client may speculatively keep a DSO Session open with no 1184 operations pending (e.g., an outstanding DNS Push request) in the 1185 expectation that it may have future requests to send to that server. 1187 The second timeout value, the keepalive interval, is the maximum 1188 permitted interval between messages if the client wishes to keep the 1189 DSO Session alive. 1191 The two timeout values are independent. 
The inactivity timeout may 1192 be lower, the same, or higher than the keepalive interval, though in 1193 most cases the inactivity timeout is expected to be shorter than the 1194 keepalive interval. 1196 A shorter inactivity timeout with a longer keepalive interval signals 1197 to the client that it should not speculatively keep an inactive DSO 1198 Session open for very long without reason, but when it does have an 1199 active reason to keep a DSO Session open, it doesn't need to be 1200 sending an aggressive level of DSO keepalive traffic to maintain that 1201 session. An example of this would be a client that has subscribed to 1202 DNS Push notifications: in this case, the client is not sending any 1203 traffic to the server, but the session is not inactive, because there 1204 is a pending request to the server to receive push notifications. 1206 A longer inactivity timeout with a shorter keepalive interval signals 1207 to the client that it may speculatively keep an inactive DSO Session 1208 open for a long time, but to maintain that inactive DSO Session it 1209 should be sending a lot of DSO keepalive traffic. This configuration 1210 is expected to be less common. 1212 In the usual case where the inactivity timeout is shorter than the 1213 keepalive interval, it is only when a client has a very long-lived, 1214 low-traffic, operation that the keepalive interval comes into play, 1215 to ensure that a sufficient residual amount of traffic is generated 1216 to maintain NAT and firewall state and to assure client and server 1217 that they still have connectivity to each other. 1219 On a new DSO Session, if no explicit DSO Keepalive message exchange 1220 has taken place, the default value for both timeouts is 15 seconds. 1222 For both timeouts, lower values of the timeout result in higher 1223 network traffic and higher CPU load on the server. 1225 7.3. 
Inactive DSO Sessions 1227 At both servers and clients, the generation or reception of any 1228 complete DNS message, including DNS requests, responses, updates, or 1229 DSO messages, resets both timers for that DSO Session, with the 1230 exception that a DSO Keepalive message resets only the keepalive 1231 timer, not the inactivity timeout timer. 1233 In addition, for as long as the client has an outstanding operation 1234 in progress, the inactivity timer remains cleared, and an inactivity 1235 timeout cannot occur. 1237 For short-lived DNS operations like traditional queries and updates, 1238 an operation is considered in progress for the time between request 1239 and response, typically a period of a few hundred milliseconds at 1240 most. At the client, the inactivity timer is cleared upon 1241 transmission of a request and remains cleared until reception of the 1242 corresponding response. At the server, the inactivity timer is 1243 cleared upon reception of a request and remains cleared until 1244 transmission of the corresponding response. 1246 For long-lived DNS Stateful operations (such as a Push Notification 1247 subscription [I-D.ietf-dnssd-push] or a Discovery Relay interface 1248 subscription [I-D.ietf-dnssd-mdns-relay]), an operation is considered 1249 in progress for as long as the operation is active, until it is 1250 cancelled. This means that a DSO Session can exist, with active 1251 operations, with no messages flowing in either direction, for far 1252 longer than the inactivity timeout, and this is not an error. This 1253 is why there are two separate timers: the inactivity timeout, and the 1254 keepalive interval. Just because a DSO Session has no traffic for an 1255 extended period of time does not automatically make that DSO Session 1256 "inactive", if it has an active operation that is awaiting events. 1258 7.4. 
The Inactivity Timeout 1260 The purpose of the inactivity timeout is for the server to balance 1261 its trade off between the costs of setting up new DSO Sessions and 1262 the costs of maintaining inactive DSO Sessions. A server with 1263 abundant DSO Session capacity can offer a high inactivity timeout, to 1264 permit clients to keep a speculative DSO Session open for a long 1265 time, to save the cost of establishing a new DSO Session for future 1266 communications with that server. A server with scarce memory 1267 resources can offer a low inactivity timeout, to cause clients to 1268 promptly close DSO Sessions whenever they have no outstanding 1269 operations with that server, and then create a new DSO Session later 1270 when needed. 1272 7.4.1. Closing Inactive DSO Sessions 1274 When a connection's inactivity timeout is reached the client MUST 1275 begin closing the idle connection, but a client is not required to 1276 keep an idle connection open until the inactivity timeout is reached. 1277 A client MAY close a DSO Session at any time, at the client's 1278 discretion. If a client determines that it has no current or 1279 reasonably anticipated future need for a currently inactive DSO 1280 Session, then the client SHOULD gracefully close that connection. 1282 If, at any time during the life of the DSO Session, the inactivity 1283 timeout value (i.e., 15 seconds by default) elapses without there 1284 being any operation active on the DSO Session, the client MUST close 1285 the connection gracefully. 1287 If, at any time during the life of the DSO Session, twice the 1288 inactivity timeout value (i.e., 30 seconds by default), or five 1289 seconds, if twice the inactivity timeout value is less than five 1290 seconds, elapses without there being any operation active on the DSO 1291 Session, the server MUST consider the client delinquent, and MUST 1292 forcibly abort the DSO Session. 
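The two deadlines described above can be sketched as follows. This is a minimal illustration, assuming the timeout is held in milliseconds; the function and constant names are hypothetical, and any special timeout values defined elsewhere in this document are not handled here.

```python
# Hypothetical helper names; values are in milliseconds.
MIN_SERVER_ABORT_MS = 5_000  # the five-second floor described above


def client_close_deadline_ms(inactivity_timeout_ms):
    """Idle time after which the client must begin closing gracefully."""
    return inactivity_timeout_ms


def server_abort_deadline_ms(inactivity_timeout_ms):
    """Idle time after which the server forcibly aborts the session:
    twice the inactivity timeout, but never less than five seconds."""
    return max(2 * inactivity_timeout_ms, MIN_SERVER_ABORT_MS)
```

With the default 15-second inactivity timeout this yields a client deadline of 15000 ms and a server deadline of 30000 ms; with a 2-second timeout the server deadline is the 5000 ms floor.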
1294 In this context, an operation being active on a DSO Session includes 1295 a query waiting for a response, an update waiting for a response, or 1296 an active long-lived operation, but not a DSO Keepalive message 1297 exchange itself. A DSO Keepalive message exchange resets only the 1298 keepalive interval timer, not the inactivity timeout timer. 1300 If the client wishes to keep an inactive DSO Session open for longer 1301 than the default duration then it uses the DSO Keepalive message to 1302 request longer timeout values, as described in Section 8.1. 1304 7.4.2. Values for the Inactivity Timeout 1306 For the inactivity timeout value, lower values result in more 1307 frequent DSO Session teardown and re-establishment. Higher values 1308 result in lower traffic and lower CPU load on the server, but higher 1309 memory burden to maintain state for inactive DSO Sessions. 1311 A server may dictate any value it chooses for the inactivity timeout 1312 (either in a response to a client-initiated request, or in a server- 1313 initiated message) including values under one second, or even zero. 1315 An inactivity timeout of zero informs the client that it should not 1316 speculatively maintain idle connections at all, and as soon as the 1317 client has completed the operation or operations relating to this 1318 server, the client should immediately begin closing this session. 1320 A server will abort an idle client session after twice the inactivity 1321 timeout value, or five seconds, whichever is greater. In the case of 1322 a zero inactivity timeout value, this means that if a client fails to 1323 close an idle client session then the server will forcibly abort the 1324 idle session after five seconds. 1326 An inactivity timeout of 0xFFFFFFFF represents "infinity" and informs 1327 the client that it may keep an idle connection open as long as it 1328 wishes. 
Note that after granting an unlimited inactivity timeout in 1329 this way, at any point the server may revise that inactivity timeout 1330 by sending a new DSO Keepalive message dictating new Session Timeout 1331 values to the client. 1333 The largest *finite* inactivity timeout supported by the current 1334 Keepalive TLV is 0xFFFFFFFE (2^32-2 milliseconds, approximately 49.7 1335 days). 1337 7.5. The Keepalive Interval 1339 The purpose of the keepalive interval is to manage the generation of 1340 sufficient messages to maintain state in middleboxes (such as NAT 1341 gateways or firewalls) and for the client and server to periodically 1342 verify that they still have connectivity to each other. This allows 1343 them to clean up state when connectivity is lost, and to establish a 1344 new session if appropriate. 1346 7.5.1. Keepalive Interval Expiry 1348 If, at any time during the life of the DSO Session, the keepalive 1349 interval value (i.e., 15 seconds by default) elapses without any DNS 1350 messages being sent or received on a DSO Session, the client MUST 1351 take action to keep the DSO Session alive, by sending a DSO Keepalive 1352 message (Section 8.1). A DSO Keepalive message exchange resets only 1353 the keepalive timer, not the inactivity timer. 1355 If a client disconnects from the network abruptly, without cleanly 1356 closing its DSO Session, perhaps leaving a long-lived operation 1357 uncancelled, the server learns of this after failing to receive the 1358 required DSO keepalive traffic from that client. If, at any time 1359 during the life of the DSO Session, twice the keepalive interval 1360 value (i.e., 30 seconds by default) elapses without any DNS messages 1361 being sent or received on a DSO Session, the server SHOULD consider 1362 the client delinquent, and SHOULD forcibly abort the DSO Session. 1364 7.5.2.
Values for the Keepalive Interval 1366 For the keepalive interval value, lower values result in a higher 1367 volume of DSO keepalive traffic. Higher values of the keepalive 1368 interval reduce traffic and CPU load, but have minimal effect on the 1369 memory burden at the server, because clients keep a DSO Session open 1370 for the same length of time (determined by the inactivity timeout) 1371 regardless of the level of DSO keepalive traffic required. 1373 It may be appropriate for clients and servers to select different 1374 keepalive interval values depending on the nature of the network they 1375 are on. 1377 A corporate DNS server that knows it is serving only clients on the 1378 internal network, with no intervening NAT gateways or firewalls, can 1379 impose a higher keepalive interval, because frequent DSO keepalive 1380 traffic is not required. 1382 A public DNS server that is serving primarily residential consumer 1383 clients, where it is likely there will be a NAT gateway on the path, 1384 may impose a lower keepalive interval, to generate more frequent DSO 1385 keepalive traffic. 1387 A smart client may be adaptive to its environment. A client using a 1388 private IPv4 address [RFC1918] to communicate with a DNS server at an 1389 address outside that IPv4 private address block, may conclude that 1390 there is likely to be a NAT gateway on the path, and accordingly 1391 request a lower keepalive interval. 1393 By default it is RECOMMENDED that clients request, and servers grant, 1394 a keepalive interval of 60 minutes. This keepalive interval provides 1395 for reasonably timely detection if a client abruptly disconnects 1396 without cleanly closing the session, and is sufficient to maintain 1397 state in firewalls and NAT gateways that follow the IETF recommended 1398 Best Current Practice that the "established connection idle-timeout" 1399 used by middleboxes be at least 2 hours 4 minutes [RFC5382] 1400 [RFC7857]. 
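The adaptive-client heuristic mentioned above could be sketched as follows. This is only one possible reading: the function names are hypothetical, the 5-minute value requested when a NAT gateway seems likely is an arbitrary assumption (this document recommends only the 60-minute default), and "outside the private address block" is simplified to "not in any RFC 1918 block".

```python
import ipaddress

# The three IPv4 private address blocks from RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

RECOMMENDED_KEEPALIVE_MS = 60 * 60 * 1000  # 60-minute default from the text
NAT_LIKELY_KEEPALIVE_MS = 5 * 60 * 1000    # assumed value, illustration only


def is_rfc1918(addr):
    """True if addr is an IPv4 address inside an RFC 1918 block."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)


def keepalive_to_request_ms(local_addr, server_addr):
    """Pick the keepalive interval (in ms) a client might request."""
    if is_rfc1918(local_addr) and not is_rfc1918(server_addr):
        # Private source talking to a non-private server: a NAT gateway
        # is probably on the path, so ask for more frequent keepalives.
        return NAT_LIKELY_KEEPALIVE_MS
    return RECOMMENDED_KEEPALIVE_MS
```

Whatever value the client requests, the server remains the authority on the value actually used.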
1402 Note that the lower the keepalive interval value, the higher the load 1403 on client and server. For example, a hypothetical keepalive interval 1404 value of 100ms would result in a continuous stream of at least ten 1405 messages per second, in both directions, to keep the DSO Session 1406 alive. And, in this extreme example, a single packet loss and 1407 retransmission over a long path could introduce a momentary pause in 1408 the stream of messages, long enough to cause the server to 1409 overzealously abort the connection. 1411 Because of this concern, the server MUST NOT send a DSO Keepalive 1412 message (either a response to a client-initiated request, or a 1413 server-initiated message) with a keepalive interval value less than 1414 ten seconds. If a client receives a DSO Keepalive message specifying 1415 a keepalive interval value less than ten seconds this is a fatal 1416 error and the client MUST forcibly abort the connection immediately. 1418 A keepalive interval value of 0xFFFFFFFF represents "infinity" and 1419 informs the client that it should generate no DSO keepalive traffic. 1420 Note that after signaling that the client should generate no DSO 1421 keepalive traffic in this way, at any point the server may revise 1422 that DSO keepalive traffic requirement by sending a new DSO Keepalive 1423 message dictating new Session Timeout values to the client. 1425 The largest *finite* keepalive interval supported by the current 1426 Keepalive TLV is 0xFFFFFFFE (2^32-2 milliseconds, approximately 49.7 1427 days). 1429 7.6. Server-Initiated Session Termination 1431 In addition to cancelling individual long-lived operations 1432 selectively (Section 6.5) there are also occasions where a server may 1433 need to terminate one or more entire sessions. An entire session may 1434 need to be terminated if the client is defective in some way, or 1435 departs from the network without closing its session. 
Sessions may 1436 also need to be terminated if the server becomes overloaded, or if 1437 the server is reconfigured and lacks the ability to be selective 1438 about which operations need to be cancelled. 1440 This section discusses various reasons a session may be terminated, 1441 and the mechanisms for doing so. 1443 In normal operation, closing a DSO Session is the client's 1444 responsibility. The client makes the determination of when to close 1445 a DSO Session based on an evaluation of both its own needs, and the 1446 inactivity timeout value dictated by the server. A server only 1447 causes a DSO Session to be ended in the exceptional circumstances 1448 outlined below. 1450 Some of the exceptional situations in which a server may terminate a 1451 DSO Session include: 1453 o The server application software or underlying operating system is 1454 shutting down or restarting. 1456 o The server application software terminates unexpectedly (perhaps 1457 due to a bug that makes it crash). 1459 o The server is undergoing a reconfiguration or maintenance 1460 procedure, that, due to the way the server software is 1461 implemented, requires clients to be disconnected. For example, 1462 some software is implemented such that it reads a configuration 1463 file at startup, and changing the server's configuration entails 1464 modifying the configuration file and then killing and restarting 1465 the server software, which generally entails a loss of network 1466 connections. 1468 o The client fails to meet its obligation to generate the required 1469 DSO keepalive traffic, or to close an inactive session by the 1470 prescribed time (twice the time interval dictated by the server, 1471 or five seconds, whichever is greater, as described in 1472 Section 7.2). 1474 o The client sends a grossly invalid or malformed request that is 1475 indicative of a seriously defective client implementation. 1477 o The server is over capacity and needs to shed some load. 1479 7.6.1.
Server-Initiated Retry Delay Message 1481 In the cases described above where a server elects to terminate a DSO 1482 Session, it could do so simply by forcibly aborting the connection. 1483 However, if it did this the likely behavior of the client might be 1484 simply to treat this as a network failure and reconnect 1485 immediately, putting more burden on the server. 1487 Therefore, to avoid this reconnection implosion, a server SHOULD 1488 instead choose to shed client load by sending a Retry Delay message, 1489 with an appropriate RCODE value informing the client of the reason 1490 the DSO Session needs to be terminated. The format of the Retry 1491 Delay TLV, and the interpretations of the various RCODE values, are 1492 described in Section 8.2. After sending a Retry Delay message, the 1493 server MUST NOT send any further messages on that DSO Session. 1495 The server MAY randomize retry delays in situations where many retry 1496 delays are sent in quick succession, so as to avoid all the clients 1497 attempting to reconnect at once. In general, implementations should 1498 avoid using the Retry Delay message in a way that would result in 1499 many clients reconnecting at the same time, if every client attempts 1500 to reconnect at the exact time specified. 1502 Upon receipt of a Retry Delay message from the server, the client 1503 MUST make note of the reconnect delay for this server, and then 1504 immediately close the connection gracefully. 1506 After sending a Retry Delay message the server SHOULD allow the 1507 client five seconds to close the connection, and if the client has 1508 not closed the connection after five seconds then the server SHOULD 1509 forcibly abort the connection. 1511 A Retry Delay message MUST NOT be initiated by a client. If a server 1512 receives a Retry Delay message this is a fatal error and the server 1513 MUST forcibly abort the connection immediately. 1515 7.6.1.1.
Outstanding Operations 1517 At the instant a server chooses to initiate a Retry Delay message 1518 there may be DNS requests already in flight from client to server on 1519 this DSO Session, which will arrive at the server after its Retry 1520 Delay message has been sent. The server MUST silently ignore such 1521 incoming requests, and MUST NOT generate any response messages for 1522 them. When the Retry Delay message from the server arrives at the 1523 client, the client will determine that any DNS requests it previously 1524 sent on this DSO Session, that have not yet received a response, now 1525 will certainly not be receiving any response. Such requests should 1526 be considered failed, and should be retried at a later time, as 1527 appropriate. 1529 In the case where some, but not all, of the existing operations on a 1530 DSO Session have become invalid (perhaps because the server has been 1531 reconfigured and is no longer authoritative for some of the names), 1532 but the server is terminating all affected DSO Sessions en masse by 1533 sending them all a Retry Delay message, the reconnect delay MAY be 1534 zero, indicating that the clients SHOULD immediately attempt to re- 1535 establish operations. 1537 It is likely that some of the attempts will be successful and some 1538 will not, depending on the nature of the reconfiguration. 1540 In the case where a server is terminating a large number of DSO 1541 Sessions at once (e.g., if the system is restarting) and the server 1542 doesn't want to be inundated with a flood of simultaneous retries, it 1543 SHOULD send different reconnect delay values to each client. These 1544 adjustments MAY be selected randomly, pseudorandomly, or 1545 deterministically (e.g., incrementing the time value by one tenth of 1546 a second for each successive client, yielding a post-restart 1547 reconnection rate of ten clients per second). 1549 7.6.1.2. 
Client Reconnection 1551 After a DSO Session is ended by the server (either by sending the 1552 client a Retry Delay message, or by forcibly aborting the underlying 1553 transport connection) the client SHOULD try to reconnect, to that 1554 service instance, or to another suitable service instance, if more 1555 than one is available. If reconnecting to the same service instance, 1556 the client MUST respect the indicated delay, if available, before 1557 attempting to reconnect. Clients should not attempt to randomize the 1558 delay; the server will randomly jitter the retry delay values it 1559 sends to each client if this behavior is desired. 1561 If the service instance will only be out of service for a short 1562 maintenance period, it should use a value a little longer than the 1563 expected maintenance window. It should not default to a very large 1564 delay value, or clients may not attempt to reconnect after it resumes 1565 service. 1567 If a particular service instance does not want a client to reconnect 1568 ever (perhaps the service instance is being de-commissioned), it 1569 SHOULD set the retry delay to the maximum value 0xFFFFFFFF (2^32-1 1570 milliseconds, approximately 49.7 days). It is not possible to 1571 instruct a client to stay away for longer than 49.7 days. If, after 1572 49.7 days, the DNS or other configuration information still indicates 1573 that this is the valid service instance for a particular service, 1574 then clients MAY attempt to reconnect. In reality, if a client is 1575 rebooted or otherwise loses state, it may well attempt to reconnect 1576 before 49.7 days elapses, for as long as the DNS or other 1577 configuration information continues to indicate that this is the 1578 service instance the client should use. 1580 8. Base TLVs for DNS Stateful Operations 1582 This section describes the three base TLVs for DNS Stateful 1583 Operations: Keepalive, Retry Delay, and Encryption Padding. 1585 8.1.
Keepalive TLV 1587 The Keepalive TLV (DSO-TYPE=1) performs two functions: to reset the 1588 keepalive timer for the DSO Session, and to establish the values for 1589 the Session Timeouts. The client will request the desired session 1590 timeout values and the server will acknowledge with the response 1591 values that it requires the client to use. 1593 The DSO-DATA for the Keepalive TLV is as follows: 1595 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 1596 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1597 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1598 | INACTIVITY TIMEOUT (32 bits) | 1599 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1600 | KEEPALIVE INTERVAL (32 bits) | 1601 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1603 INACTIVITY TIMEOUT: The inactivity timeout for the current DSO 1604 Session, specified as a 32-bit unsigned integer, in network (big 1605 endian) byte order, in units of milliseconds. This is the timeout 1606 at which the client MUST begin closing an inactive DSO Session. 1607 The inactivity timeout can be any value of the server's choosing. 1608 If the client does not gracefully close an inactive DSO Session, 1609 then after twice this interval, or five seconds, whichever is 1610 greater, the server will forcibly abort the connection. 1612 KEEPALIVE INTERVAL: The keepalive interval for the current DSO 1613 Session, specified as a 32-bit unsigned integer, in network (big 1614 endian) byte order, in units of milliseconds. This is the 1615 interval at which a client MUST generate DSO keepalive traffic to 1616 maintain connection state. The keepalive interval MUST NOT be 1617 less than ten seconds. If the client does not generate the 1618 mandated DSO keepalive traffic, then after twice this interval the 1619 server will forcibly abort the connection.
Since the minimum 1620 allowed keepalive interval is ten seconds, the minimum time at 1621 which a server will forcibly disconnect a client for failing to 1622 generate the mandated DSO keepalive traffic is twenty seconds. 1624 The transmission or reception of DSO Keepalive messages (i.e., 1625 messages where the Keepalive TLV is the first TLV) resets only the 1626 keepalive timer, not the inactivity timer. The reason for this is 1627 that periodic DSO Keepalive messages are sent for the sole purpose of 1628 keeping a DSO Session alive, when that DSO Session has current or 1629 recent non-maintenance activity that warrants keeping that DSO 1630 Session alive. Sending DSO keepalive traffic itself is not 1631 considered a client activity; it is considered a maintenance activity 1632 that is performed in service of other client activities. If DSO 1633 keepalive traffic itself were to reset the inactivity timer, then 1634 that would create a circular livelock where keepalive traffic would 1635 be sent indefinitely to keep a DSO Session alive, where the only 1636 activity on that DSO Session would be the keepalive traffic keeping 1637 the DSO Session alive so that further keepalive traffic can be sent. 1638 For a DSO Session to be considered active, it must be carrying 1639 something more than just keepalive traffic. This is why merely 1640 sending or receiving a DSO Keepalive message does not reset the 1641 inactivity timer. 1643 When sent by a client, the DSO Keepalive request message MUST be sent 1644 as an acknowledged request, with a nonzero MESSAGE ID. If a server 1645 receives a DSO Keepalive message with a zero MESSAGE ID then this is 1646 a fatal error and the server MUST forcibly abort the connection 1647 immediately. The DSO Keepalive request message resets a DSO 1648 Session's keepalive timer, and at the same time communicates to the 1649 server the client's requested Session Timeout values.
In a 1650 server response to a client-initiated DSO Keepalive request message, 1651 the Session Timeouts contain the server's chosen values from this 1652 point forward in the DSO Session, which the client MUST respect. 1653 This is modeled after the DHCP protocol, where the client requests a 1654 certain lease lifetime using DHCP option 51 [RFC2132], but the server 1655 is the ultimate authority for deciding what lease lifetime is 1656 actually granted. 1658 When a client is sending its second and subsequent DSO Keepalive 1659 requests to the server, the client SHOULD continue to request its 1660 preferred values each time. This allows flexibility, so that if 1661 conditions change during the lifetime of a DSO Session, the server 1662 can adapt its responses to better fit the client's needs. 1664 Once a DSO Session is in progress (Section 6.1) a DSO Keepalive 1665 message MAY be initiated by a server. When sent by a server, the DSO 1666 Keepalive message MUST be sent as an unacknowledged message, with the 1667 MESSAGE ID set to zero. The client MUST NOT generate a response to a 1668 server-initiated DSO Keepalive message. If a client receives a DSO 1669 Keepalive request message with a nonzero MESSAGE ID then this is a 1670 fatal error and the client MUST forcibly abort the connection 1671 immediately. The unacknowledged DSO Keepalive message from the 1672 server resets a DSO Session's keepalive timer, and at the same time 1673 unilaterally informs the client of the new Session Timeout values to 1674 use from this point forward in this DSO Session. No client DSO 1675 response message to this unilateral declaration is required or 1676 allowed. 1678 In DSO Keepalive response messages, the Keepalive TLV is REQUIRED and 1679 is used only as a Response Primary TLV sent as a reply to a DSO 1680 Keepalive request message from the client. A Keepalive TLV MUST NOT 1681 be added to other responses as a Response Additional TLV. 
If the 1682 server wishes to update a client's Session Timeout values other than 1683 in response to a DSO Keepalive request message from the client, then 1684 it does so by sending an unacknowledged DSO Keepalive message of its 1685 own, as described above. 1687 It is not required that the Keepalive TLV be used in every DSO 1688 Session. While many DNS Stateful operations will be used in 1689 conjunction with a long-lived session state, not all DNS Stateful 1690 operations require long-lived session state, and in some cases the 1691 default 15-second value for both the inactivity timeout and keepalive 1692 interval may be perfectly appropriate. However, note that for 1693 clients that implement only the DSO-TYPEs defined in this document, a 1694 Keepalive request message is the only way for a client to initiate a 1695 DSO Session. 1697 8.1.1. Client handling of received Session Timeout values 1699 When a client receives a response to its client-initiated DSO 1700 Keepalive message, or receives a server-initiated DSO Keepalive 1701 message, the client has then received Session Timeout values dictated 1702 by the server. The two timeout values contained in the Keepalive TLV 1703 from the server may each be higher, lower, or the same as the 1704 respective Session Timeout values the client previously had for this 1705 DSO Session. 1707 In the case of the keepalive timer, the handling of the received 1708 value is straightforward. When a client receives a server-initiated 1709 message with the Keepalive TLV as its primary TLV, it resets the 1710 keepalive timer. Whenever it receives a Keepalive TLV from the 1711 server, either in a server-initiated message or a reply to its own 1712 client-initiated Keepalive message, it updates the keepalive interval 1713 for the DSO Session. The new keepalive interval indicates the 1714 maximum time that may elapse before another message must be sent or 1715 received on this DSO Session, if the DSO Session is to remain alive. 
1716 If the client receives a response to a keepalive message that 1717 specifies a keepalive interval shorter than the current keepalive 1718 timer, the client MUST immediately send a Keepalive message. 1719 However, this should not normally happen in practice: it would 1720 require that the keepalive interval specified by the server be shorter than the round- 1721 trip time of the connection. 1723 In the case of the inactivity timeout, the handling of the received 1724 value is a little more subtle, though the meaning of the inactivity 1725 timeout remains as specified -- it still indicates the maximum 1726 permissible time allowed without useful activity on a DSO Session. 1727 The act of receiving the message containing the Keepalive TLV does 1728 not itself reset the inactivity timer. The time elapsed since the 1729 last useful activity on this DSO Session is unaffected by exchange of 1730 DSO Keepalive messages. The new inactivity timeout value in the 1731 Keepalive TLV in the received message does update the timeout 1732 associated with the running inactivity timer; that becomes the new 1733 maximum permissible time without activity on a DSO Session. 1735 o If the current inactivity timer value is less than the new 1736 inactivity timeout, then the DSO Session may remain open for now. 1737 When the inactivity timer value reaches the new inactivity 1738 timeout, the client MUST then begin closing the DSO Session, as 1739 described above. 1741 o If the current inactivity timer value is equal to the new 1742 inactivity timeout, then this DSO Session has been inactive for 1743 exactly as long as the server will permit, and now the client MUST 1744 immediately begin closing this DSO Session. 1746 o If the current inactivity timer value is already greater than the 1747 new inactivity timeout, then this DSO Session has already been 1748 inactive for longer than the server permits, and the client MUST 1749 immediately begin closing this DSO Session.
1751 o If the current inactivity timer value is already more than twice 1752 the new inactivity timeout, then the client is immediately 1753 considered delinquent (this DSO Session is immediately eligible to 1754 be forcibly terminated by the server) and the client MUST 1755 immediately begin closing this DSO Session. However if a server 1756 abruptly reduces the inactivity timeout in this way, then, to give 1757 the client time to close the connection gracefully before the 1758 server resorts to forcibly aborting it, the server SHOULD give the 1759 client an additional grace period of one quarter of the new 1760 inactivity timeout, or five seconds, whichever is greater. 1762 8.1.2. Relation to edns-tcp-keepalive EDNS0 Option 1764 The inactivity timeout value in the Keepalive TLV (DSO-TYPE=1) has 1765 similar intent to the edns-tcp-keepalive EDNS0 Option [RFC7828]. A 1766 client/server pair that supports DSO MUST NOT use the edns-tcp- 1767 keepalive EDNS0 Option within any message after a DSO Session has 1768 been established. A client that has sent a DSO message to establish 1769 a session MUST NOT send an edns-tcp-keepalive EDNS0 Option from this 1770 point on. Once a DSO Session has been established, if either client 1771 or server receives a DNS message over the DSO Session that contains 1772 an edns-tcp-keepalive EDNS0 Option, this is a fatal error and the 1773 receiver of the edns-tcp-keepalive EDNS0 Option MUST forcibly abort 1774 the connection immediately. 1776 8.2. Retry Delay TLV 1778 The Retry Delay TLV (DSO-TYPE=2) can be used as a Primary TLV 1779 (unacknowledged) in a server-to-client message, or as a Response 1780 Additional TLV in either direction. 
1782 The DSO-DATA for the Retry Delay TLV is as follows: 1784 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 1785 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1786 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1787 | RETRY DELAY (32 bits) | 1788 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1790 RETRY DELAY: A time value, specified as a 32-bit unsigned integer, 1791 in network (big endian) byte order, in units of milliseconds, 1792 within which the initiator MUST NOT retry this operation, or retry 1793 connecting to this server. Recommendations for the RETRY DELAY 1794 value are given in Section 7.6.1. 1796 8.2.1. Retry Delay TLV used as a Primary TLV 1798 When sent from server to client, the Retry Delay TLV is used as the 1799 Primary TLV in an unacknowledged message. It is used by a server to 1800 instruct a client to close the DSO Session and underlying connection, 1801 and not to reconnect for the indicated time interval. 1803 In this case it applies to the DSO Session as a whole, and the client 1804 MUST begin closing the DSO Session, as described in Section 7.6.1. 1805 The RCODE in the message header SHOULD indicate the principal reason 1806 for the termination: 1808 o NOERROR indicates a routine shutdown or restart. 1810 o FORMERR indicates that the client requests are too badly malformed 1811 for the session to continue. 1813 o SERVFAIL indicates that the server is overloaded due to resource 1814 exhaustion and needs to shed load. 1816 o REFUSED indicates that the server has been reconfigured, and at 1817 this time it is now unable to perform one or more of the long- 1818 lived client operations that were previously being performed on 1819 this DSO Session.
1821 o NOTAUTH indicates that the server has been reconfigured and at 1822 this time it is now unable to perform one or more of the long- 1823 lived client operations that were previously being performed on 1824 this DSO Session because it does not have authority over the names 1825 in question (for example, a DNS Push Notification server could be 1826 reconfigured such that it is no longer accepting DNS Push 1827 Notification requests for one or more of the currently subscribed 1828 names). 1830 This document specifies only these RCODE values for Retry Delay 1831 messages. Servers sending Retry Delay messages SHOULD use one of 1832 these values. However, future circumstances may create situations 1833 where other RCODE values are appropriate in Retry Delay messages, so 1834 clients MUST be prepared to accept Retry Delay messages with any 1835 RCODE value. 1837 In some cases, when a server sends a Retry Delay message to a client, 1838 there may be more than one reason for the server wanting to end the 1839 session. Possibly the configuration could have been changed such 1840 that some long-lived client operations can no longer be continued due 1841 to policy (REFUSED), and other long-lived client operations can no 1842 longer be performed due to the server no longer being authoritative 1843 for those names (NOTAUTH). In such cases the server MAY use any of 1844 the applicable RCODE values, or RCODE=NOERROR (routine shutdown or 1845 restart). 1847 Note that the selection of RCODE value in a Retry Delay message is 1848 not critical, since the RCODE value is generally used only for 1849 informational purposes, such as writing to a log file for future human 1850 analysis regarding the nature of the disconnection. Generally 1851 clients do not modify their behavior depending on the RCODE value. 1852 The RETRY DELAY in the message tells the client how long it should 1853 wait before attempting a new connection to this service instance.
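The following Python sketch illustrates how the Retry Delay DSO-DATA (a 32-bit unsigned integer in network byte order, in milliseconds) might be encoded and decoded, and how a client might turn the RCODE of a Retry Delay message into a log string. The function names and the wording of the log strings are hypothetical choices for this example; the numeric RCODE values are the standard DNS assignments. Only the four-byte DSO-DATA is handled here, not the surrounding TLV header or DNS message.

```python
import struct

# Standard DNS RCODE numbers, mapped to the reasons listed for Retry Delay.
RCODE_REASONS = {
    0: "NOERROR: routine shutdown or restart",
    1: "FORMERR: client requests too badly malformed to continue",
    2: "SERVFAIL: server overloaded, shedding load",
    5: "REFUSED: server reconfigured, operation no longer permitted",
    9: "NOTAUTH: server no longer has authority over the names",
}


def encode_retry_delay(delay_ms: int) -> bytes:
    """Encode RETRY DELAY as 4 bytes, big endian, in milliseconds."""
    if not 0 <= delay_ms <= 0xFFFFFFFF:
        raise ValueError("RETRY DELAY must fit in 32 unsigned bits")
    return struct.pack("!I", delay_ms)


def decode_retry_delay(dso_data: bytes) -> int:
    """Decode the 4-byte DSO-DATA of a Retry Delay TLV into milliseconds."""
    if len(dso_data) != 4:
        raise ValueError("Retry Delay DSO-DATA must be exactly 4 bytes")
    return struct.unpack("!I", dso_data)[0]


def describe_rcode(rcode: int) -> str:
    """Return a log-friendly reason; any RCODE not listed above is
    described the same as NOERROR in this sketch."""
    return RCODE_REASONS.get(rcode, RCODE_REASONS[0])
```

A delay of 60 seconds, for instance, is carried as the four bytes 00 00 EA 60. Because the RCODE is informational, the sketch never raises on an unrecognized RCODE value.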
1855 Clients that do in some way modify their behavior depending on 1856 the RCODE value should treat unknown RCODE values the same as 1857 RCODE=NOERROR (routine shutdown or restart). 1859 A Retry Delay message from server to client is an unacknowledged 1860 message; the MESSAGE ID MUST be set to zero in the outgoing message 1861 and the client MUST NOT send a response. 1863 A client MUST NOT send a Retry Delay DSO message to a server. If a 1864 server receives a DSO message where the Primary TLV is the Retry 1865 Delay TLV, this is a fatal error and the server MUST forcibly abort 1866 the connection immediately. 1868 8.2.2. Retry Delay TLV used as a Response Additional TLV 1870 In the case of a request that returns a nonzero RCODE value, the 1871 responder MAY append a Retry Delay TLV to the response, indicating 1872 the time interval during which the initiator SHOULD NOT attempt this 1873 operation again. 1875 The indicated time interval during which the initiator SHOULD NOT 1876 retry applies only to the failed operation, not to the DSO Session as 1877 a whole. 1879 8.3. Encryption Padding TLV 1881 The Encryption Padding TLV (DSO-TYPE=3) can only be used as an 1882 Additional or Response Additional TLV. It is only applicable when 1883 the DSO Transport layer uses encryption such as TLS. 1885 The DSO-DATA for the Padding TLV is optional and is a variable 1886 length field containing non-specified values. A DSO-LENGTH of 0 1887 essentially provides for 4 bytes of padding (the minimum amount). 1889 1 1 1 1 1 1 1890 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 1891 +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ 1892 / / 1893 / VARIABLE NUMBER OF BYTES / 1894 / / 1895 +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ 1897 As specified for the EDNS(0) Padding Option [RFC7830] the PADDING 1898 bytes SHOULD be set to 0x00.
Other values MAY be used, for example, 1899 in cases where there is a concern that the padded message could be 1900 subject to compression before encryption. PADDING bytes of any value 1901 MUST be accepted in the messages received. 1903 The Encryption Padding TLV may be included in either a DSO request, 1904 response, or both. As specified for the EDNS(0) Padding Option 1905 [RFC7830] if a request is received with an Encryption Padding TLV, 1906 then the response MUST also include an Encryption Padding TLV. 1908 The length of padding is intentionally not specified in this document 1909 and is a function of current best practices with respect to the type 1910 and length of data in the preceding TLVs 1911 [I-D.ietf-dprive-padding-policy]. 1913 9. Summary Highlights 1915 This section summarizes some noteworthy highlights about various 1916 components of the DSO protocol. 1918 9.1. QR bit and MESSAGE ID 1920 In DSO Request Messages the QR bit is 0 and the MESSAGE ID is 1921 nonzero. 1923 In DSO Response Messages the QR bit is 1 and the MESSAGE ID is 1924 nonzero. 1926 In DSO Unacknowledged Messages the QR bit is 0 and the MESSAGE ID is 1927 zero. 1929 The table below illustrates which combinations are legal and how they 1930 are interpreted: 1932 +--------------------------+------------------------+ 1933 | MESSAGE ID zero | MESSAGE ID nonzero | 1934 +--------+--------------------------+------------------------+ 1935 | QR=0 | Unacknowledged Message | Request Message | 1936 +--------+--------------------------+------------------------+ 1937 | QR=1 | Invalid - Fatal Error | Response Message | 1938 +--------+--------------------------+------------------------+ 1940 9.2. TLV Usage 1942 The table below indicates, for each of the three TLVs defined in this 1943 document, whether they are valid in each of ten different contexts. 
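The QR bit and MESSAGE ID combinations summarized in Section 9.1 can be sketched as a small classifier. This is an illustrative Python sketch only: the function name and the returned strings are hypothetical, and the caller is assumed to have already extracted the QR bit and the 16-bit MESSAGE ID from the DNS header.

```python
def classify_dso_message(qr: int, message_id: int) -> str:
    """Classify a DSO message by its QR bit and MESSAGE ID.

    Returns "request", "response", "unacknowledged", or "invalid".
    "invalid" corresponds to the fatal-error cell of the table
    (QR=1 with a zero MESSAGE ID).
    """
    if qr not in (0, 1):
        raise ValueError("QR is a single bit")
    if not 0 <= message_id <= 0xFFFF:
        raise ValueError("MESSAGE ID is a 16-bit field")

    if qr == 0:
        # Zero MESSAGE ID means no response is expected.
        return "unacknowledged" if message_id == 0 else "request"
    # QR=1: a response must echo a nonzero MESSAGE ID.
    return "invalid" if message_id == 0 else "response"
```

A receiver that gets "invalid" back from such a check would treat the message as a fatal error, as the table indicates.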
1945 The first five contexts are requests or unacknowledged messages from 1946 client to server, and the corresponding responses from server back to 1947 client: 1949 o C-P - Primary TLV, sent in DSO Request message, from client to 1950 server, with nonzero MESSAGE ID indicating that this request MUST 1951 generate response message. 1953 o C-U - Primary TLV, sent in DSO Unacknowledged message, from client 1954 to server, with zero MESSAGE ID indicating that this request MUST 1955 NOT generate response message. 1957 o C-A - Additional TLV, optionally added to request message or 1958 unacknowledged message from client to server. 1960 o CRP - Response Primary TLV, included in response message sent back 1961 to the client (in response to a client "C-P" request with nonzero 1962 MESSAGE ID indicating that a response is required) where the DSO- 1963 TYPE of the Response TLV matches the DSO-TYPE of the Primary TLV 1964 in the request. 1966 o CRA - Response Additional TLV, included in response message sent 1967 back to the client (in response to a client "C-P" request with 1968 nonzero MESSAGE ID indicating that a response is required) where 1969 the DSO-TYPE of the Response TLV does not match the DSO-TYPE of 1970 the Primary TLV in the request. 1972 The second five contexts are their counterparts in the opposite 1973 direction: requests or unacknowledged messages from server to client, 1974 and the corresponding responses from client back to server. 1976 o S-P - Primary TLV, sent in DSO Request message, from server to 1977 client, with nonzero MESSAGE ID indicating that this request MUST 1978 generate response message. 1980 o S-U - Primary TLV, sent in DSO Unacknowledged message, from server 1981 to client, with zero MESSAGE ID indicating that this request MUST 1982 NOT generate response message. 1984 o S-A - Additional TLV, optionally added to request message or 1985 unacknowledged message from server to client. 
1987 o SRP - Response Primary TLV, included in response message sent back 1988 to the server (in response to a server "S-P" request with nonzero 1989 MESSAGE ID indicating that a response is required) where the DSO- 1990 TYPE of the Response TLV matches the DSO-TYPE of the Primary TLV 1991 in the request. 1993 o SRA - Response Additional TLV, included in response message sent 1994 back to the server (in response to a server "S-P" request with 1995 nonzero MESSAGE ID indicating that a response is required) where 1996 the DSO-TYPE of the Response TLV does not match the DSO-TYPE of 1997 the Primary TLV in the request. 1999 +-------------------------+-------------------------+ 2000 | C-P C-U C-A CRP CRA | S-P S-U S-A SRP SRA | 2001 +------------+-------------------------+-------------------------+ 2002 | KeepAlive | X X | X | 2003 +------------+-------------------------+-------------------------+ 2004 | RetryDelay | X | X | 2005 +------------+-------------------------+-------------------------+ 2006 | Padding | X X | X X | 2007 +------------+-------------------------+-------------------------+ 2009 Note that some of the columns in this table are currently empty. The 2010 table provides a template for future TLV definitions to follow. It 2011 is recommended that definitions of future TLVs include a similar 2012 table summarizing the contexts where the new TLV is valid. 2014 10. IANA Considerations 2016 10.1. DSO OPCODE Registration 2018 The IANA is requested to record the value ([TBA1] tentatively) 6 for 2019 the DSO OPCODE in the DNS OPCODE Registry. DSO stands for DNS 2020 Stateful Operations. 2022 10.2. DSO RCODE Registration 2024 The IANA is requested to record the value ([TBA2] tentatively) 11 for 2025 the DSOTYPENI error code in the DNS RCODE Registry. 
The DSOTYPENI 2026 error code ("DSO-TYPE Not Implemented") indicates that the receiver 2027 does implement DNS Stateful Operations, but does not implement the 2028 specific DSO-TYPE of the primary TLV in the DSO request message. 2030 10.3. DSO Type Code Registry 2032 The IANA is requested to create the 16-bit DSO Type Code Registry, 2033 with initial (hexadecimal) values as shown below: 2035 +-----------+--------------------------------+----------+-----------+ 2036 | Type | Name | Status | Reference | 2037 +-----------+--------------------------------+----------+-----------+ 2038 | 0000 | Reserved | Standard | RFC-TBD | 2039 | | | | | 2040 | 0001 | KeepAlive | Standard | RFC-TBD | 2041 | | | | | 2042 | 0002 | RetryDelay | Standard | RFC-TBD | 2043 | | | | | 2044 | 0003 | EncryptionPadding | Standard | RFC-TBD | 2045 | | | | | 2046 | 0004-003F | Unassigned, reserved for | | | 2047 | | DSO session-management TLVs | | | 2048 | | | | | 2049 | 0040-F7FF | Unassigned | | | 2050 | | | | | 2051 | F800-FBFF | Reserved for | | | 2052 | | experimental/local use | | | 2053 | | | | | 2054 | FC00-FFFF | Reserved for future expansion | | | 2055 +-----------+--------------------------------+----------+-----------+ 2057 DSO Type Code zero is reserved and is not currently intended for 2058 allocation. 2060 Registrations of new DSO Type Codes in the "Reserved for DSO session- 2061 management" range 0004-003F and the "Reserved for future expansion" 2062 range FC00-FFFF require publication of an IETF Standards Action 2063 document [RFC8126]. 2065 Requests to register additional new DSO Type Codes in the 2066 "Unassigned" range 0040-F7FF are to be recorded by IANA after Expert 2067 Review [RFC8126]. The expert review should validate that the 2068 requested type code is specified in a way that conforms to this 2069 specification, and that the intended use for the code would not be 2070 addressed with an experimental/local assignment. 
2072 DSO Type Codes in the "experimental/local" range F800-FBFF may be 2073 used as Experimental Use or Private Use values [RFC8126] and may be 2074 used freely for development purposes, or for other purposes within a 2075 single site. No attempt is made to prevent multiple sites from using 2076 the same value in different (and incompatible) ways. There is no 2077 need for IANA to review such assignments (since IANA does not record 2078 them) and assignments are not generally useful for broad 2079 interoperability. It is the responsibility of the sites making use 2080 of "experimental/local" values to ensure that no conflicts occur 2081 within the intended scope of use. 2083 11. Security Considerations 2085 If this mechanism is to be used with DNS over TLS, then these 2086 messages are subject to the same constraints as any other DNS-over- 2087 TLS messages and MUST NOT be sent in the clear before the TLS session 2088 is established. 2090 The data field of the "Encryption Padding" TLV could be used as a 2091 covert channel. 2093 When designing new DSO TLVs, the potential for data in the TLV to be 2094 used as a tracking identifier should be taken into consideration, and 2095 should be avoided when not required. 2097 When used without TLS or similar cryptographic protection, a 2098 malicious entity may be able to inject a malicious Retry Delay 2099 Unacknowledged Message into the data stream, specifying an 2100 unreasonably large RETRY DELAY, causing a denial-of-service attack 2101 against the client. 2103 The establishment of DSO sessions has an increasing impact on the 2104 number of open TCP connections on a DNS server. Additional resources 2105 may be used on the server as a result. However, because the server 2106 can limit the number of DSO sessions established and can also close 2107 existing DSO sessions as needed, denial of service or resource 2108 exhaustion should not be a concern. 2110 11.1.
TCP Fast Open Considerations 2112 It would be possible to add a TLV that requires the server to do some 2113 significant work, and send that to the server as initial data in a 2114 TCP SYN packet. A flood of such packets could be used as a DoS 2115 attack on the server. None of the TLVs defined here have this 2116 property. If a new TLV is specified that does have this property, 2117 the specification should require that some kind of exchange be done 2118 with the server before work is done. That is, the TLV that requires 2119 work could not be processed without a round-trip from the server to 2120 the client to verify that the source address of the packet is 2121 reachable. 2123 One way to accomplish this would be to have the client send a TLV 2124 indicating that it wishes to have the server do work of this sort; 2125 this TLV would not actually result in work being done, but would 2126 request a nonce from the server. The client could then use that 2127 nonce to request that work be done. 2129 Alternatively, the server could simply disable TCP fast open. This 2130 same problem would exist for DNS-over-TLS with TLS early data; the 2131 same remedies would apply. 2133 12. Acknowledgements 2135 Thanks to Stephane Bortzmeyer, Tim Chown, Ralph Droms, Paul Hoffman, 2136 Jan Komissar, Edward Lewis, Allison Mankin, Rui Paulo, David 2137 Schinazi, Manju Shankar Rao, and Bernie Volz for their helpful 2138 contributions to this document. 2140 13. References 2142 13.1. Normative References 2144 [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", 2145 STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987, 2146 . 2148 [RFC1035] Mockapetris, P., "Domain names - implementation and 2149 specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, 2150 November 1987, . 2152 [RFC1918] Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G., 2153 and E. Lear, "Address Allocation for Private Internets", 2154 BCP 5, RFC 1918, DOI 10.17487/RFC1918, February 1996, 2155 . 
2157 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 2158 Requirement Levels", BCP 14, RFC 2119, 2159 DOI 10.17487/RFC2119, March 1997, . 2162 [RFC2136] Vixie, P., Ed., Thomson, S., Rekhter, Y., and J. Bound, 2163 "Dynamic Updates in the Domain Name System (DNS UPDATE)", 2164 RFC 2136, DOI 10.17487/RFC2136, April 1997, 2165 . 2167 [RFC6891] Damas, J., Graff, M., and P. Vixie, "Extension Mechanisms 2168 for DNS (EDNS(0))", STD 75, RFC 6891, 2169 DOI 10.17487/RFC6891, April 2013, . 2172 [RFC7766] Dickinson, J., Dickinson, S., Bellis, R., Mankin, A., and 2173 D. Wessels, "DNS Transport over TCP - Implementation 2174 Requirements", RFC 7766, DOI 10.17487/RFC7766, March 2016, 2175 . 2177 [RFC7830] Mayrhofer, A., "The EDNS(0) Padding Option", RFC 7830, 2178 DOI 10.17487/RFC7830, May 2016, . 2181 [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for 2182 Writing an IANA Considerations Section in RFCs", BCP 26, 2183 RFC 8126, DOI 10.17487/RFC8126, June 2017, 2184 . 2186 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2187 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2188 May 2017, . 2190 13.2. Informative References 2192 [I-D.ietf-dnsop-no-response-issue] 2193 Andrews, M. and R. Bellis, "A Common Operational Problem 2194 in DNS Servers - Failure To Respond.", draft-ietf-dnsop- 2195 no-response-issue-11 (work in progress), July 2018. 2197 [I-D.ietf-dnssd-mdns-relay] 2198 Lemon, T. and S. Cheshire, "Multicast DNS Discovery 2199 Relay", draft-ietf-dnssd-mdns-relay-01 (work in progress), 2200 July 2018. 2202 [I-D.ietf-dnssd-push] 2203 Pusateri, T. and S. Cheshire, "DNS Push Notifications", 2204 draft-ietf-dnssd-push-14 (work in progress), March 2018. 2206 [I-D.ietf-doh-dns-over-https] 2207 Hoffman, P. and P. McManus, "DNS Queries over HTTPS 2208 (DoH)", draft-ietf-doh-dns-over-https-12 (work in 2209 progress), June 2018. 
2211 [I-D.ietf-dprive-padding-policy] 2212 Mayrhofer, A., "Padding Policy for EDNS(0)", draft-ietf- 2213 dprive-padding-policy-06 (work in progress), July 2018. 2215 [I-D.ietf-tls-tls13] 2216 Rescorla, E., "The Transport Layer Security (TLS) Protocol 2217 Version 1.3", draft-ietf-tls-tls13-28 (work in progress), 2218 March 2018. 2220 [NagleDA] Cheshire, S., "TCP Performance problems caused by 2221 interaction between Nagle's Algorithm and Delayed ACK", 2222 May 2005, 2223 . 2225 [RFC2132] Alexander, S. and R. Droms, "DHCP Options and BOOTP Vendor 2226 Extensions", RFC 2132, DOI 10.17487/RFC2132, March 1997, 2227 . 2229 [RFC2782] Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for 2230 specifying the location of services (DNS SRV)", RFC 2782, 2231 DOI 10.17487/RFC2782, February 2000, . 2234 [RFC5382] Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P. 2235 Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142, 2236 RFC 5382, DOI 10.17487/RFC5382, October 2008, 2237 . 2239 [RFC6762] Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762, 2240 DOI 10.17487/RFC6762, February 2013, . 2243 [RFC6763] Cheshire, S. and M. Krochmal, "DNS-Based Service 2244 Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013, 2245 . 2247 [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP 2248 Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, 2249 . 2251 [RFC7828] Wouters, P., Abley, J., Dickinson, S., and R. Bellis, "The 2252 edns-tcp-keepalive EDNS0 Option", RFC 7828, 2253 DOI 10.17487/RFC7828, April 2016, . 2256 [RFC7857] Penno, R., Perreault, S., Boucadair, M., Ed., Sivakumar, 2257 S., and K. Naito, "Updates to Network Address Translation 2258 (NAT) Behavioral Requirements", BCP 127, RFC 7857, 2259 DOI 10.17487/RFC7857, April 2016, . 2262 [RFC7858] Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D., 2263 and P. Hoffman, "Specification for DNS over Transport 2264 Layer Security (TLS)", RFC 7858, DOI 10.17487/RFC7858, May 2265 2016, . 
2267 Authors' Addresses 2269 Ray Bellis 2270 Internet Systems Consortium, Inc. 2271 950 Charter Street 2272 Redwood City CA 94063 2273 USA 2275 Phone: +1 (650) 423-1200 2276 Email: ray@isc.org 2277 Stuart Cheshire 2278 Apple Inc. 2279 One Apple Park Way 2280 Cupertino CA 95014 2281 USA 2283 Phone: +1 (408) 996-1010 2284 Email: cheshire@apple.com 2286 John Dickinson 2287 Sinodun Internet Technologies 2288 Magadalen Centre 2289 Oxford Science Park 2290 Oxford OX4 4GA 2291 United Kingdom 2293 Email: jad@sinodun.com 2295 Sara Dickinson 2296 Sinodun Internet Technologies 2297 Magadalen Centre 2298 Oxford Science Park 2299 Oxford OX4 4GA 2300 United Kingdom 2302 Email: sara@sinodun.com 2304 Ted Lemon 2305 Nibbhaya Consulting 2306 P.O. Box 958 2307 Brattleboro VT 05302-0958 2308 USA 2310 Email: mellon@fugue.com 2312 Tom Pusateri 2313 Unaffiliated 2314 Raleigh NC 27608 2315 USA 2317 Phone: +1 (919) 867-1330 2318 Email: pusateri@bangj.com