2	DNSOP Working Group                                            R. Bellis
3	Internet-Draft                                                       ISC
4	Updates: 1035, 7766 (if approved)                            S. Cheshire
5	Intended status: Standards Track                              Apple Inc.
6	Expires: February 3, 2019                                   J. Dickinson
7	                                                            S. Dickinson
8	                                                                 Sinodun
9	                                                                T. Lemon
10	                                                     Nibbhaya Consulting
11	                                                             T. Pusateri
12	                                                            Unaffiliated
13	                                                         August 02, 2018

15	                        DNS Stateful Operations
16	                   draft-ietf-dnsop-session-signal-14

18	Abstract

20	   This document defines a new DNS OPCODE for DNS Stateful Operations
21	   (DSO).  DSO messages communicate operations within persistent
22	   stateful sessions, using type-length-value (TLV) syntax.  Three TLVs
23	   are defined that manage session timeouts, termination, and encryption
24	   padding, and a framework is defined for extensions to enable new
25	   stateful operations.  This document updates RFC 1035 by adding a new
26	   DNS header opcode with different message semantics, along with a new
27	   result code.  This document updates RFC 7766 by redefining a session,
28	   providing new guidance on connection re-use, and providing a new
29	   mechanism for handling session idle timeouts.

31	Status of This Memo

33	   This Internet-Draft is submitted in full conformance with the
34	   provisions of BCP 78 and BCP 79.

36	   Internet-Drafts are working documents of the Internet Engineering
37	   Task Force (IETF).  Note that other groups may also distribute
38	   working documents as Internet-Drafts.  The list of current Internet-
39	   Drafts is at http://datatracker.ietf.org/drafts/current/.

41	   Internet-Drafts are draft documents valid for a maximum of six months
42	   and may be updated, replaced, or obsoleted by other documents at any
43	   time.  It is inappropriate to use Internet-Drafts as reference
44	   material or to cite them other than as "work in progress."

46	   This Internet-Draft will expire on February 3, 2019.

48	Copyright Notice

50	   Copyright (c) 2018 IETF Trust and the persons identified as the
51	   document authors.  All rights reserved.

53	   This document is subject to BCP 78 and the IETF Trust's Legal
54	   Provisions Relating to IETF Documents
55	   (http://trustee.ietf.org/license-info) in effect on the date of
56	   publication of this document.  Please review these documents
57	   carefully, as they describe your rights and restrictions with respect
58	   to this document.  Code Components extracted from this document must
59	   include Simplified BSD License text as described in Section 4.e of
60	   the Trust Legal Provisions and are provided without warranty as
61	   described in the Simplified BSD License.

63	Table of Contents

65	   1.  Introduction
66	   2.  Requirements Language
67	   3.  Terminology
68	   4.  Discussion
69	   5.  Applicability
70	   6.  Protocol Details
71	     6.1.  DSO Session Establishment
72	       6.1.1.  Connection Sharing
73	       6.1.2.  Zero Round-Trip Operation
74	       6.1.3.  Middlebox Considerations
75	     6.2.  Message Format
76	       6.2.1.  DNS Header Fields in DSO Messages
77	       6.2.2.  DSO Data
78	       6.2.3.  EDNS(0) and TSIG
79	     6.3.  Message Handling
80	       6.3.1.  Error Responses
81	     6.4.  Flow Control Considerations
82	     6.5.  Responder-Initiated Operation Cancellation
83	   7.  DSO Session Lifecycle and Timers
84	     7.1.  DSO Session Initiation
85	     7.2.  DSO Session Timeouts
86	     7.3.  Inactive DSO Sessions
87	     7.4.  The Inactivity Timeout
88	       7.4.1.  Closing Inactive DSO Sessions
89	       7.4.2.  Values for the Inactivity Timeout
90	     7.5.  The Keepalive Interval
91	       7.5.1.  Keepalive Interval Expiry
92	       7.5.2.  Values for the Keepalive Interval
93	     7.6.  Server-Initiated Session Termination
94	       7.6.1.  Server-Initiated Retry Delay Message
95	   8.  Base TLVs for DNS Stateful Operations
96	     8.1.  Keepalive TLV
97	       8.1.1.  Client handling of received Session Timeout values
98	       8.1.2.  Relation to edns-tcp-keepalive EDNS0 Option
99	     8.2.  Retry Delay TLV
100	       8.2.1.  Retry Delay TLV used as a Primary TLV
101	       8.2.2.  Retry Delay TLV used as a Response Additional TLV
102	     8.3.  Encryption Padding TLV
103	   9.  Summary Highlights
104	     9.1.  QR bit and MESSAGE ID
105	     9.2.  TLV Usage
106	   10. IANA Considerations
107	     10.1.  DSO OPCODE Registration
108	     10.2.  DSO RCODE Registration
109	     10.3.  DSO Type Code Registry
110	   11. Security Considerations
111	     11.1.  TCP Fast Open Considerations
112	   12. Acknowledgements
113	   13. References
114	     13.1.  Normative References
115	     13.2.  Informative References
116	   Authors' Addresses

118	1.  Introduction

120	   This document specifies a mechanism for managing stateful DNS
121	   connections.  DNS most commonly operates over a UDP transport, but
122	   can also operate over streaming transports; the original DNS RFC
123	   specifies DNS over TCP [RFC1035] and a profile for DNS over TLS
124	   [RFC7858] has been specified.  These transports can offer persistent,
125	   long-lived sessions, so when they are used to transport DNS messages
126	   it is beneficial to have a mechanism that can establish parameters
127	   associated with those sessions, such as timeouts.  In such
128	   situations it is also advantageous to support server-initiated
129	   messages (such as DNS Push Notifications [I-D.ietf-dnssd-push]).

131	   The existing EDNS(0) Extension Mechanism for DNS [RFC6891] is
132	   explicitly defined to only have "per-message" semantics.  While
133	   EDNS(0) has been used to signal at least one session-related
134	   parameter (edns-tcp-keepalive EDNS0 Option [RFC7828]) the result is
135	   less than optimal due to the restrictions imposed by the EDNS(0)
136	   semantics and the lack of server-initiated signalling.  For example,
137	   a server cannot arbitrarily instruct a client to close a connection
138	   because the server can only send EDNS(0) options in responses to
139	   queries that contained EDNS(0) options.

141	   This document defines a new DNS OPCODE, DSO ([TBA1], tentatively 6),
142	   for DNS Stateful Operations.  DSO messages are used to communicate
143	   operations within persistent stateful sessions, expressed using type-
144	   length-value (TLV) syntax.  This document defines an initial set of
145	   three TLVs, used to manage session timeouts, termination, and
146	   encryption padding.

148	   The three TLVs defined here are all mandatory for all implementations
149	   of DSO.  Further TLVs may be defined in additional specifications.

151	   DSO messages may or may not be acknowledged.  Whether a message is
152	   acknowledged is part of the definition of its message type, and is
153	   signaled by a non-zero message ID for messages that must be
154	   acknowledged and a zero message ID for messages that are not to be
155	   acknowledged.  Messages are pipelined; answers may appear out of
156	   order when more than one answer is pending.

158	   The format for DSO messages (Section 6.2) differs somewhat from the
159	   traditional DNS message format used for standard queries and
160	   responses.  The standard twelve-byte header is used, but the four
161	   count fields (QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT) are set to zero and
162	   accordingly their corresponding sections are not present.

164	   The actual data pertaining to DNS Stateful Operations (expressed in
165	   TLV syntax) is appended to the end of the DNS message header.  The
166	   stream protocol carrying the DSO message frames it with a 16-bit
167	   message length, so the length of the DSO data is determined from that
168	   length, rather than from any of the DNS header counts.
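
   As a non-normative illustration of the layout described above, the
   following Python sketch builds a DSO message and frames it for a
   stream transport.  The helper names are hypothetical.  The 16-bit
   DSO-TYPE and DSO-LENGTH fields, the Keepalive type code of 1, and
   its two 32-bit values are assumptions that are not stated in this
   section; the OPCODE value 6 is the tentative [TBA1] assignment.

```python
import struct

DSO_OPCODE = 6        # tentative [TBA1] value quoted in this document
KEEPALIVE_TYPE = 1    # assumption: DSO-TYPE code, not given in this section


def encode_tlv(dso_type: int, data: bytes) -> bytes:
    # Assumed TLV layout: 16-bit DSO-TYPE, 16-bit DSO-LENGTH, then the data.
    return struct.pack("!HH", dso_type, len(data)) + data


def encode_dso_message(message_id: int, tlvs: bytes,
                       response: bool = False, rcode: int = 0) -> bytes:
    # Flags word: QR in the top bit, OPCODE in the next four bits,
    # RCODE in the low four bits; all other flag bits are left at zero.
    flags = (int(response) << 15) | (DSO_OPCODE << 11) | (rcode & 0xF)
    # The four count fields (QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT) are zero.
    header = struct.pack("!HHHHHH", message_id, flags, 0, 0, 0, 0)
    return header + tlvs


def frame(message: bytes) -> bytes:
    # Stream framing: a 16-bit length followed by the message itself.
    return struct.pack("!H", len(message)) + message


def dso_data(framed: bytes) -> bytes:
    # The DSO data is whatever follows the twelve-byte header, with its
    # extent taken from the framing length rather than the header counts.
    (length,) = struct.unpack_from("!H", framed)
    return framed[2:2 + length][12:]
```

   For example, frame(encode_dso_message(0x1234, tlv)) yields the bytes
   to write to the connection, and dso_data() recovers the TLV bytes on
   the receiving side.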

170	   When a DSO message is displayed by a packet analyzer that has not been
171	   updated to recognize the DSO format, the DSO data is shown as unknown
172	   additional data after the end of the DNS message.

175	   This new format has distinct advantages over an RR-based format
176	   because it is more explicit and more compact.  Each TLV definition is
177	   specific to its use case, and as a result contains no redundant or
178	   overloaded fields.  Importantly, it completely avoids conflating DNS
179	   Stateful Operations in any way with normal DNS operations or with
180	   existing EDNS(0)-based functionality.  A goal of this approach is to
181	   avoid the operational issues that have befallen EDNS(0), particularly
182	   relating to middlebox behaviour (see for example
183	   [I-D.ietf-dnsop-no-response-issue] sections 3.2 and 4).

185	   With EDNS(0), multiple options may be packed into a single OPT
186	   pseudo-RR, and there is no generalized mechanism for a client to be
187	   able to tell whether a server has processed or otherwise acted upon
188	   each individual option within the combined OPT pseudo-RR.  The
189	   specifications for each individual option need to define how each
190	   different option is to be acknowledged, if necessary.

192	   In contrast to EDNS(0), with DSO there is no compelling motivation to
193	   pack multiple operations into a single message for efficiency
194	   reasons, because DSO always operates using a connection-oriented
195	   transport protocol.  Each DSO operation is communicated in its own
196	   separate DNS message, and the transport protocol can take care of
197	   packing several DNS messages into a single IP packet if appropriate.
198	   For example, TCP can pack multiple small DNS messages into a single
199	   TCP segment.  This simplification allows for clearer semantics.  Each
200	   DSO request message communicates just one primary operation, and the
201	   RCODE in the corresponding response message indicates the success or
202	   failure of that operation.

204	2.  Requirements Language

206	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
207	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
208	   "OPTIONAL" in this document are to be interpreted as described in BCP
209	   14 [RFC2119] [RFC8174] when, and only when, they appear in all
210	   capitals, as shown here.

212	3.  Terminology

214	   "DSO" is used to mean DNS Stateful Operation.

216	   The term "connection" means a bidirectional byte (or message) stream,
217	   where the bytes (or messages) are delivered reliably and in-order,
218	   such as provided by using DNS over TCP [RFC1035] [RFC7766] or DNS
219	   over TLS [RFC7858].

221	   The unqualified term "session" in the context of this document means
222	   the exchange of DNS messages over a connection where:

224	   o  The connection between client and server is persistent and
225	      relatively long-lived.

227	   o  Either end of the connection may initiate messages to the other.

229	   In this document the term "session" is used exclusively as described
230	   above.  The term has no relationship to the "session layer" of the
231	   OSI "seven-layer model".

233	   A "DSO Session" is established between two endpoints that acknowledge
234	   persistent DNS state via the exchange of DSO messages over the
235	   connection.  This is distinct from a DNS-over-TCP session as
236	   described in the previous specification for DNS over TCP [RFC7766].

238	   A "DSO Session" is terminated when the underlying connection is
239	   closed.  The underlying connection can be closed in two ways:

241	   Where this specification says, "close gracefully," that means sending
242	   a TLS close_notify (if TLS is in use) followed by a TCP FIN, or the
243	   equivalents for other protocols.  Where this specification requires a
244	   connection to be closed gracefully, the requirement to initiate that
245	   graceful close is placed on the client, so that the burden of TCP's
246	   TIME-WAIT state falls on the client rather than the server.

248	   Where this specification says, "forcibly abort," that means sending a
249	   TCP RST, or the equivalent for other protocols.  In the BSD Sockets
250	   API this is achieved by enabling the SO_LINGER option with a linger
251	   time of zero before closing the socket.
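
   A minimal sketch of a forcible abort using Python's socket module,
   which wraps the BSD Sockets API.  It assumes a POSIX platform where
   struct linger consists of two C ints (l_onoff, l_linger); the
   function names are hypothetical.

```python
import socket
import struct


def enable_abortive_close(sock: socket.socket) -> None:
    # l_onoff = 1 enables lingering, l_linger = 0 sets a zero linger time,
    # so a later close() on a connected TCP socket sends RST instead of FIN.
    linger = struct.pack("ii", 1, 0)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)


def forcibly_abort(sock: socket.socket) -> None:
    # Set the option first, then close; the order matters.
    enable_abortive_close(sock)
    sock.close()
```

   A graceful close, by contrast, leaves SO_LINGER at its default so
   that close() results in a normal FIN exchange.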

253	   The term "server" means the software with a listening socket,
254	   awaiting incoming connection requests in the usual DNS sense.

256	   The term "client" means the software which initiates a connection to
257	   the server's listening socket in the usual DNS sense.

259	   The terms "initiator" and "responder" correspond respectively to the
260	   initial sender and subsequent receiver of a DSO request message or
261	   unacknowledged message, regardless of which was the "client" and
262	   "server" in the usual DNS sense.

264	   The term "sender" may apply to either an initiator (when sending a
265	   DSO request message or unacknowledged message) or a responder (when
266	   sending a DSO response message).

268	   Likewise, the term "receiver" may apply to either a responder (when
269	   receiving a DSO request message or unacknowledged message) or an
270	   initiator (when receiving a DSO response message).

272	   In protocol implementation there are generally two kinds of errors
273	   that software writers have to deal with.  The first is situations
274	   that arise due to factors in the environment, such as temporary loss
275	   of connectivity.  While undesirable, these situations do not indicate
276	   a flaw in the software, and they are situations that software should
277	   generally be able to recover from.  The second is situations that
278	   should never happen when communicating with a correctly-implemented
279	   peer.  If they do happen, they indicate a serious flaw in the
280	   protocol implementation, beyond what it is reasonable to expect
281	   software to recover from.  This document describes this latter form
282	   of error condition as a "fatal error" and specifies that an
283	   implementation encountering a fatal error condition "MUST forcibly
284	   abort the connection immediately".  Given that these fatal error
285	   conditions signify defective software, and given that defective
286	   software is likely to remain defective for some time until it is
287	   fixed, after forcibly aborting a connection, a client SHOULD refrain
288	   from automatically reconnecting to that same service instance for at
289	   least one hour.

291	   This document uses the term "same service instance" as follows:

293	   o  In cases where a server is specified or configured using an IP
294	      address and TCP port number, two different configurations are
295	      referring to the same service instance if they contain the same IP
296	      address and TCP port number.

298	   o  In cases where a server is specified or configured using a
299	      hostname and TCP port number, such as in the content of a DNS SRV
300	      record [RFC2782], two different configurations (or DNS SRV
301	      records) are considered to be referring to the same service
302	      instance if they contain the same hostname (subject to the usual
303	      case insensitive DNS name matching rules [RFC1034] [RFC1035]) and
304	      TCP port number.  In these cases, configurations with different
305	      hostnames are considered to be referring to different service
306	      instances, even if those different hostnames happen to be aliases,
307	      or happen to resolve to the same IP address(es).  Implementations
308	      SHOULD NOT resolve hostnames and then perform matching of IP
309	      address(es) in order to evaluate whether two entities should be
310	      determined to be the "same service instance".
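
   The matching rules above, together with the one-hour reconnection
   hold-down after a fatal error, might be sketched as follows.  The
   names service_instance_key and ReconnectGuard are hypothetical, and
   using str.lower() for DNS name comparison is a simplification that
   is only exact for ASCII names.

```python
import ipaddress

RECONNECT_HOLD_DOWN = 3600.0   # seconds; "at least one hour"


def service_instance_key(host: str, port: int):
    # Configurations given as IP literals match on address and port.
    try:
        return ("ip", ipaddress.ip_address(host), port)
    except ValueError:
        # Hostnames match case-insensitively and are never resolved, so
        # aliases of the same server remain distinct service instances.
        return ("name", host.lower(), port)


class ReconnectGuard:
    """Remembers when a connection was forcibly aborted after a fatal error."""

    def __init__(self):
        self._aborted_at = {}

    def note_fatal_error(self, host: str, port: int, now: float) -> None:
        self._aborted_at[service_instance_key(host, port)] = now

    def may_reconnect(self, host: str, port: int, now: float) -> bool:
        aborted = self._aborted_at.get(service_instance_key(host, port))
        return aborted is None or now - aborted >= RECONNECT_HOLD_DOWN
```

   The caller supplies the current time (for example from
   time.monotonic()), which keeps the hold-down logic easy to test.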

312	   When an anycast service is configured on a particular IP address and
313	   port, it must be the case that although there is more than one
314	   physical server responding on that IP address, each such server can
315	   be treated as equivalent.  What we mean by "equivalent" here is that
316	   all such servers can provide the same service and, where appropriate,
317	   the same authentication information, such as PKI certificates, when
318	   establishing connections.

320	   In principle, anycast servers could maintain sufficient state that
321	   any of them can handle packets in the same TCP connection.  In order
322	   for this to work with DSO, they would need to also share DSO state.
323	   It is unlikely that this can be done successfully, however, so we
324	   recommend that each anycast server instance maintain its own session
325	   state.

327	   If a change in network topology causes packets in a particular TCP
328	   connection to be sent to an anycast server instance that does not
329	   know about the connection, the new server will automatically
330	   terminate the connection with a TCP reset, since it will have no
331	   record of the connection, and then the client can reconnect or stop
332	   using the connection, as appropriate.

334	   If, after the connection is re-established, the client's assumption
335	   that it is connected to the same service is violated in some way,
336	   that would be considered incorrect behavior in this context.  It is,
337	   however, outside the scope of this specification to make specific
338	   recommendations in this regard; that is left to follow-on documents
339	   that describe specific uses of DNS Stateful Operations.

342	   The term "long-lived operations" refers to operations such as Push
343	   Notification subscriptions [I-D.ietf-dnssd-push], Discovery Relay
344	   interface subscriptions [I-D.ietf-dnssd-mdns-relay], and other future
345	   long-lived DNS operations that choose to use DSO as their basis.
346	   These operations establish state that persists beyond the lifetime of
347	   a traditional brief request/response transaction.  This document, the
348	   base specification for DNS Stateful Operations, defines a framework
349	   for supporting long-lived operations, but does not itself define any
350	   long-lived operations.  Nonetheless, to appreciate the design
351	   rationale behind DNS Stateful Operations, it is helpful to understand
352	   the kind of long-lived operations that it is intended to support.

354	   DNS Stateful Operations uses three kinds of message: "DSO request
355	   messages", "DSO response messages", and "DSO unacknowledged
356	   messages".  A DSO request message elicits a DSO response message.
357	   DSO unacknowledged messages are unidirectional messages and do not
358	   induce a DNS response.

360	   Both DSO request messages and DSO unacknowledged messages are
361	   formatted as DNS request messages (the header QR bit is set to zero,
362	   as described in Section 6.2).  One difference is that in DSO request
363	   messages the MESSAGE ID field is nonzero; in DSO unacknowledged
364	   messages it is zero.
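
   The distinction between the three kinds of message can be read
   directly from the first four bytes of the header.  The following
   sketch is illustrative only; the function name is hypothetical, and
   it relies on the standard DNS convention that responses have the QR
   bit set to one.

```python
import struct


def classify_dso_message(message: bytes) -> str:
    """Return 'request', 'response', or 'unacknowledged'."""
    if len(message) < 12:
        raise ValueError("a DNS header is twelve bytes long")
    message_id, flags = struct.unpack_from("!HH", message)
    if flags & 0x8000:
        # QR bit set: this is a DSO response message.
        return "response"
    # QR bit clear: the MESSAGE ID tells the two apart.
    return "request" if message_id != 0 else "unacknowledged"
```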

366	   The content of DSO messages is expressed using type-length-value
367	   (TLV) syntax.

369	   In a DSO request message or DSO unacknowledged message the first TLV
370	   is referred to as the "Primary TLV" and determines the nature of the
371	   operation being performed, including whether it is an acknowledged or
372	   unacknowledged operation; any other TLVs in a DSO request message or
373	   unacknowledged message are referred to as "Additional TLVs" and serve
374	   additional non-primary purposes, which may be related to the primary
375	   purpose, or not, as in the case of the encryption padding TLV.

377	   A DSO response message may contain no TLVs, or it may contain one or
378	   more TLVs as appropriate to the information being communicated.  In
379	   the context of DSO response messages, one or more TLVs with the same
380	   DSO-TYPE as the Primary TLV in the corresponding DSO request message
381	   are referred to as "Response Primary TLVs".  Any other TLVs with
382	   different DSO-TYPEs are referred to as "Response Additional TLVs".
383	   The Response Primary TLV(s), if present, MUST occur first in the
384	   response message, before any Response Additional TLVs.
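
   A receiver could check the ordering rule for response TLVs along the
   following lines.  This is a sketch under the assumption that TLVs
   have already been parsed into (dso_type, data) tuples; the function
   name is hypothetical.

```python
def split_response_tlvs(request_primary_type, response_tlvs):
    """Split parsed response TLVs into (primary, additional) lists.

    Raises ValueError if a Response Primary TLV appears after a
    Response Additional TLV.
    """
    primary, additional = [], []
    for dso_type, data in response_tlvs:
        if dso_type == request_primary_type:
            if additional:
                raise ValueError(
                    "Response Primary TLV after a Response Additional TLV")
            primary.append((dso_type, data))
        else:
            additional.append((dso_type, data))
    return primary, additional
```

   An empty list is accepted, since a DSO response message may contain
   no TLVs at all.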

386	   Two timers (elapsed time since an event) are defined in this
387	   document:

389	   o  an inactivity timer (see Section 7.4 and Section 8.1)

391	   o  a keepalive timer (see Section 7.5 and Section 8.1)

393	   The timeouts associated with these timers are called the inactivity
394	   timeout and the keepalive interval, respectively.  The term "Session
395	   Timeouts" is used to refer to this pair of timeout values.

397	   Resetting a timer means resetting the timer value to zero and
398	   starting the timer again.  Clearing a timer means resetting the timer
399	   value to zero but NOT starting the timer again.
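
   The difference between resetting and clearing a timer can be made
   concrete with a small sketch.  The class name ElapsedTimer and the
   injectable clock are assumptions made for illustration.

```python
import time


class ElapsedTimer:
    """Measures elapsed time since the last reset, if running."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._started_at = None      # None means the timer is not running

    def reset(self) -> None:
        # Resetting: value returns to zero and the timer starts again.
        self._started_at = self._clock()

    def clear(self) -> None:
        # Clearing: value returns to zero and the timer is NOT restarted.
        self._started_at = None

    @property
    def running(self) -> bool:
        return self._started_at is not None

    def elapsed(self) -> float:
        if self._started_at is None:
            return 0.0
        return self._clock() - self._started_at

    def expired(self, timeout: float) -> bool:
        # A cleared timer never expires.
        return self.running and self.elapsed() >= timeout
```

   An inactivity timer and a keepalive timer would each be one instance
   of such a class, compared against the inactivity timeout and the
   keepalive interval respectively.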

401	4.  Discussion

403	   There are several use cases for DNS Stateful Operations, which are
404	   described here.

406	   Firstly, establishing session parameters such as server-defined
407	   timeouts is of great use in the general management of persistent
408	   connections.  For example, using DSO sessions for stub-to-recursive
409	   DNS-over-TLS [RFC7858] is more flexible for both the client and the
410	   server than attempting to manage sessions using just the edns-tcp-
411	   keepalive EDNS0 Option [RFC7828].  The simple set of TLVs defined in
412	   this document is sufficient to greatly enhance connection management
413	   for this use case.

415	   Secondly, DNS-SD [RFC6763] has evolved into a naturally session-based
416	   mechanism where, for example, long-lived subscriptions lend
417	   themselves to 'push' mechanisms as opposed to polling.  Long-lived
418	   stateful connections and server-initiated messages align with this
419	   use case [I-D.ietf-dnssd-push].

421	   A general use case is that DNS traffic is often bursty but session
422	   establishment can be expensive.  One challenge with long-lived
423	   connections is to maintain sufficient traffic to maintain NAT and
424	   firewall state.  To mitigate this issue this document introduces a
425	   new concept for the DNS, that is DSO "Keepalive traffic".  This
426	   traffic carries no DNS data and is not considered 'activity' in the
427	   classic DNS sense, but serves to maintain state in middleboxes, and
428	   to assure client and server that they still have connectivity to each
429	   other.

431	5.  Applicability

433	   DNS Stateful Operations are applicable in cases where it is useful to
434	   maintain an open session between a DNS client and server, where the
435	   transport allows such a session to be maintained, and where the
436	   transport guarantees in-order delivery of messages, on which DSO
437	   depends.  Examples of transports that can support DSO are
438	   DNS-over-TCP [RFC1035] [RFC7766] and DNS-over-TLS [RFC7858].

440	   Note that in the case of DNS-over-TLS, there is no mechanism for
441	   upgrading from DNS-over-TCP to DNS-over-TLS mid-connection (see
442	   Section 7 of [RFC7858]).

444	   DNS Stateful Operations are not applicable for transports that cannot
445	   support clean session semantics, or that do not guarantee in-order
446	   delivery.  While in principle such a transport could be constructed
447	   over UDP, the current DNS specification over UDP transport [RFC1035]
448	   does not provide in-order delivery or session semantics, and hence
449	   cannot be used.  Similarly, DNS-over-HTTPS
450	   [I-D.ietf-doh-dns-over-https] cannot be used because HTTP has its own
451	   mechanism for managing sessions, and this is incompatible with the
452	   mechanism specified here.

454	   No other transports are currently defined for use with DNS Stateful
455	   Operations.  Such transports can be added in the future, if they meet
456	   the requirements set out in the first paragraph of this section.

458	6.  Protocol Details

460	6.1.  DSO Session Establishment

462	   In order for a session to be established between a client and a
463	   server, the client must first establish a connection to the server,
464	   using an applicable transport (see Section 5).

466	   In some environments it may be known in advance by external means
467	   that both client and server support DSO, and in these cases either
468	   client or server may initiate DSO messages at any time.  In this
469	   case, the session is established as soon as the connection is
470	   established; this is referred to as implicit session establishment.

472	   However, in the typical case a server will not know in advance
473	   whether a client supports DSO, so in general, unless it is known in
474	   advance by other means that a client does support DSO, a server MUST
475	   NOT initiate DSO request messages or DSO unacknowledged messages
476	   until a DSO Session has been mutually established by at least one
477	   successful DSO request/response exchange initiated by the client, as
478	   described below.  This is referred to as explicit session
479	   establishment.

481	   Until a DSO session has been implicitly or explicitly established, a
482	   client MUST NOT initiate DSO unacknowledged messages.
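
   The restrictions on initiating DSO messages before a session exists
   can be summarized as a single predicate.  This is a sketch only; the
   function name and the string values for role and kind are
   hypothetical, and implicit establishment is modelled by passing
   session_established=True as soon as the connection is up.

```python
def may_initiate(role: str, kind: str, session_established: bool) -> bool:
    """Whether 'role' may initiate a DSO message of the given kind."""
    if role not in ("client", "server"):
        raise ValueError("role must be 'client' or 'server'")
    if kind not in ("request", "unacknowledged"):
        raise ValueError("kind must be 'request' or 'unacknowledged'")
    if session_established:
        # Implicitly or explicitly established: either end may initiate.
        return True
    if role == "server":
        # The server waits for a successful client-initiated exchange.
        return False
    # The client may send a DSO request to establish the session, but
    # no unacknowledged messages until the session is established.
    return kind == "request"
```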

484	   A DSO Session is established over a connection by the client sending
485	   a DSO request message, such as a DSO Keepalive request message
486	   (Section 8.1), and receiving a response, with matching MESSAGE ID,
487	   and RCODE set to NOERROR (0), indicating that the DSO request was
488	   successful.

490	   If the RCODE in the response is set to DSOTYPENI ("DSO-TYPE Not
491	   Implemented", [TBA2] tentatively RCODE 11) this indicates that the
492	   server does support DSO, but does not implement the DSO-TYPE of the
493	   primary TLV in this DSO request message.  A server implementing DSO
494	   MUST NOT return DSOTYPENI for a DSO Keepalive request message,
495	   because the Keepalive TLV is mandatory to implement.  But in the
496	   future, if a client attempts to establish a DSO Session using a
497	   response-requiring DSO request message using some newly-defined DSO-
498	   TYPE that the server does not understand, that would result in a
499	   DSOTYPENI response.  If the server returns DSOTYPENI then a DSO
500	   Session is not considered established, but the client is permitted to
501	   continue sending DNS messages on the connection, including other DSO
502	   messages such as the DSO Keepalive, which may result in a successful
503	   NOERROR response, yielding the establishment of a DSO Session.

505	   If the RCODE is set to any value other than NOERROR (0) or DSOTYPENI
506	   ([TBA2] tentatively 11), then the client MUST assume that the server
507	   does not implement DSO at all.  In this case the client is permitted
508	   to continue sending DNS messages on that connection, but the client
509	   MUST NOT issue further DSO messages on that connection.
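
   The three RCODE cases described above could be handled by a client
   roughly as follows.  The enum and function names are hypothetical,
   and the value 11 for DSOTYPENI is the tentative [TBA2] assignment.

```python
from enum import Enum

RCODE_NOERROR = 0
RCODE_DSOTYPENI = 11   # tentative [TBA2] value quoted in this document


class Outcome(Enum):
    ESTABLISHED = "DSO Session established"
    TYPE_NOT_IMPLEMENTED = "server supports DSO but not this DSO-TYPE"
    DSO_UNSUPPORTED = "server does not implement DSO"


def outcome_for_rcode(rcode: int) -> Outcome:
    """Interpret the RCODE of the response to a client's DSO request."""
    if rcode == RCODE_NOERROR:
        return Outcome.ESTABLISHED
    if rcode == RCODE_DSOTYPENI:
        return Outcome.TYPE_NOT_IMPLEMENTED
    return Outcome.DSO_UNSUPPORTED


def may_send_further_dso(outcome: Outcome) -> bool:
    # After DSOTYPENI the client may still try other DSO messages, such
    # as a DSO Keepalive; after any other error it must not send DSO
    # messages on that connection.
    return outcome is not Outcome.DSO_UNSUPPORTED
```

   In every case the client remains free to keep sending ordinary DNS
   messages on the connection.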

511	   Two other possibilities exist: the server might drop the connection,
512	   or the server might send no response to the DSO message.  In the
513	   first case, the client SHOULD mark the server as not supporting DSO,
514	   and not attempt a DSO connection for some period of time (at least an
515	   hour) after the failed attempt.  The client MAY reconnect but not use
516	   DSO, if appropriate.

518	   In the second case, the client SHOULD set a reasonable timeout, after
519	   which time the server will be assumed not to support DSO.  At this
520	   point the client MUST drop the connection to the server, since the
521	   server's behavior is out of spec, and hence its state is undefined.
522	   The client MAY reconnect, but not use DSO, if appropriate.
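
   The outcomes described above can be summarized as a small decision
   function.  The following is only an illustrative sketch, not part of
   the protocol: the names Outcome and classify_establishment are
   hypothetical, DSOTYPENI uses the tentative value 11, and timers such
   as the "at least an hour" back-off are left to the caller.

```python
from enum import Enum

NOERROR = 0
DSOTYPENI = 11  # tentative value ([TBA2]) in this draft


class Outcome(Enum):
    SESSION_ESTABLISHED = "session established"
    RETRY_OTHER_DSO_TYPE = "no session yet; other DSO requests allowed"
    NO_DSO_ON_CONNECTION = "keep connection; send no further DSO messages"
    MARK_NO_DSO = "mark server as not supporting DSO for a while"
    DROP_CONNECTION = "drop connection; server assumed not to support DSO"


def classify_establishment(rcode=None, connection_dropped=False,
                           timed_out=False):
    """Map the result of the first DSO request to the client's next step."""
    if connection_dropped:
        # First case: the server dropped the connection.
        return Outcome.MARK_NO_DSO
    if timed_out:
        # Second case: no response within the client's timeout.
        return Outcome.DROP_CONNECTION
    if rcode is None:
        raise ValueError("an RCODE is required unless dropped or timed out")
    if rcode == NOERROR:
        return Outcome.SESSION_ESTABLISHED
    if rcode == DSOTYPENI:
        return Outcome.RETRY_OTHER_DSO_TYPE
    # Any other RCODE: assume the server does not implement DSO at all.
    return Outcome.NO_DSO_ON_CONNECTION
```

   In the MARK_NO_DSO and DROP_CONNECTION cases the client may still
   reconnect without using DSO, if appropriate.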

524	   When the server receives a DSO request message from a client, and
525	   transmits a successful NOERROR response to that request, the server
526	   considers the DSO Session established.

528	   When the client receives the server's NOERROR response to its DSO
529	   request message, the client considers the DSO Session established.

531	   Once a DSO Session has been established, either end may unilaterally
532	   send appropriate DSO messages at any time, and therefore either
533	   client or server may be the initiator of a message.

535	   Once a DSO Session has been established, clients and servers should
536	   behave as described in this specification with regard to inactivity
537	   timeouts and session termination, not as previously prescribed in the
538	   earlier specification for DNS over TCP [RFC7766].

540	   Because the Keepalive TLV can't fail (that is, can't return an RCODE
541	   other than NOERROR), it is an ideal candidate for use in establishing
542	   a DSO session.  Any other option that can only succeed MAY also be
543	   used to establish a DSO session.  For clients that implement only the
544	   DSO-TYPEs defined in this base specification, sending a Keepalive TLV
545	   is the only DSO request message they have available to initiate a DSO
546	   Session.  Even for clients that do implement other future DSO-TYPEs,
547	   for simplicity they MAY elect to always send an initial DSO Keepalive
548	   request message as their way of initiating a DSO Session.  A future
549	   definition of a new response-requiring DSO-TYPE gives implementers
550	   the option of using that new DSO-TYPE if they wish, but does not
551	   change the fact that sending a Keepalive TLV remains a valid way of
552	   initiating a DSO Session.

554	6.1.1.  Connection Sharing

556	   As previously specified for DNS over TCP [RFC7766]:

558	      To mitigate the risk of unintentional server overload, DNS
559	      clients MUST take care to minimize the number of concurrent
560	      TCP connections made to any individual server.  It is RECOMMENDED
561	      that for any given client/server interaction there SHOULD be
562	      no more than one connection for regular queries, one for zone
563	      transfers, and one for each protocol that is being used on top
564	      of TCP (for example, if the resolver was using TLS). However,
565	      it is noted that certain primary/secondary configurations
566	      with many busy zones might need to use more than one TCP
567	      connection for zone transfers for operational reasons (for
568	      example, to support concurrent transfers of multiple zones).

570	   A single server may support multiple services, including DNS Updates
571	   [RFC2136], DNS Push Notifications [I-D.ietf-dnssd-push], and other
572	   services, for one or more DNS zones.  When a client discovers that
573	   the target server for several different operations is the same target
574	   hostname and port, the client SHOULD use a single shared DSO Session
575	   for all those operations.  A client SHOULD NOT open multiple
576	   connections to the same target host and port just because the names
577	   being operated on are different or happen to fall within different
578	   zones.  This requirement has two benefits.  First, it reduces
579	   unnecessary connection load on the DNS server.  Second, it avoids
580	   paying the TCP slow start penalty when making subsequent connections
581	   to the same server.

583	   However, server implementers and operators should be aware that
584	   connection sharing may not be possible in all cases.  A single host
585	   device may be home to multiple independent client software instances
586	   that don't coordinate with each other.  Similarly, multiple
587	   independent client devices behind the same NAT gateway will also
588	   typically appear to the DNS server as different source ports on the
589	   same client IP address.  Because of these constraints, a DNS server
590	   MUST be prepared to accept multiple connections from different source
591	   ports on the same client IP address.

593	6.1.2.  Zero Round-Trip Operation

595	   DSO permits zero round-trip operation using TCP Fast Open [RFC7413]
596	   and TLS 1.3 [I-D.ietf-tls-tls13] to reduce or eliminate round trips
597	   in session establishment.

599	   A client MAY send multiple response-requiring DSO messages using TCP
600	   fast open or TLS 1.3 early data, without having to wait for a
601	   response to the first request message to confirm successful
602	   establishment of a DSO session.

604	   However, a client MUST NOT send non-response-requiring DSO request
605	   messages until after a DSO Session has been mutually established.

607	   Similarly, a server MUST NOT send DSO request messages until it has
608	   received a response-requiring DSO request message from a client and
609	   transmitted a successful NOERROR response for that request.

611	   Caution must be taken to ensure that DSO messages sent before the
612	   first round-trip is completed are idempotent, or are otherwise immune
613	   to any problems that could result from the inadvertent replay that
614	   can occur with zero round-trip operation.

616	6.1.3.  Middlebox Considerations

618	   Where an application-layer middlebox (e.g., a DNS proxy, forwarder,
619	   or session multiplexer) is in the path, care must be taken to avoid
620	   inappropriately passing session signaling through the middlebox.

622	   In cases where a DSO session is terminated on one side of a
623	   middlebox, and then some session is opened on the other side of the
624	   middlebox in order to satisfy requests sent over the first DSO
625	   session, any such session MUST be treated as a separate session.  If
626	   the middlebox does implement DSO sessions, it MUST handle
627	   unrecognized TLVs in the same way as any other DSO implementation as
628	   described below in Section 6.2.2.4.

630	   This does not preclude the use of DSO messages in the presence of an
631	   IP-layer middlebox, such as a NAT that rewrites IP-layer and/or
632	   transport-layer headers but otherwise preserves the effect of a
633	   single session between the client and the server.  And of course it
634	   does not apply to middleboxes that do not implement DNS Stateful
635	   Operations.

637	   These restrictions do not apply to such middleboxes: since they have
638	   no way to understand a DSO message, a pass-through middlebox like the
639	   one described in the previous paragraph will pass DSO messages
640	   unchanged or drop them (or possibly drop the connection).  A
641	   middlebox that is not doing a strict pass-through will have no way to
642	   know on which connection to forward a DSO message, and therefore will
643	   not be able to behave incorrectly.

645	   To illustrate the above, consider a network where a middlebox
646	   terminates one or more TCP connections from clients and multiplexes
647	   the queries therein over a single TCP connection to an upstream
648	   server.  The DSO messages and any associated state are specific to
649	   the individual TCP connections.  A DSO-aware middlebox MAY in some
650	   circumstances be able to retain associated state and pass it between
651	   the client and server (or vice versa) but this would be highly TLV-
652	   specific.  For example, the middlebox may be able to maintain a list
653	   of which clients have made Push Notification subscriptions
654	   [I-D.ietf-dnssd-push] and make its own subscription(s) on their
655	   behalf, relaying any subsequent notifications to the client (or
656	   clients) that have subscribed to that particular notification.

658	6.2.  Message Format

660	   A DSO message begins with the standard twelve-byte DNS message header
661	   [RFC1035] with the OPCODE field set to the DSO OPCODE ([TBA1]
662	   tentatively 6).  However, unlike standard DNS messages, the question
663	   section, answer section, authority records section and additional
664	   records sections are not present.  The corresponding count fields
665	   (QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT) MUST be set to zero on
666	   transmission.

668	   If a DSO message is received where any of the count fields are not
669	   zero, then a FORMERR MUST be returned.

671	                                                1   1   1   1   1   1
672	        0   1   2   3   4   5   6   7   8   9   0   1   2   3   4   5
673	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
674	      |                          MESSAGE ID                           |
675	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
676	      |QR |    OPCODE     |            Z              |     RCODE     |
677	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
678	      |                     QDCOUNT (MUST be zero)                    |
679	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
680	      |                     ANCOUNT (MUST be zero)                    |
681	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
682	      |                     NSCOUNT (MUST be zero)                    |
683	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
684	      |                     ARCOUNT (MUST be zero)                    |
685	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
686	      |                                                               |
687	      /                           DSO Data                            /
688	      /                                                               /
689	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
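
   As an illustration of the header layout shown above, the following
   sketch packs and parses the twelve-byte header.  It assumes the
   tentative DSO OPCODE value 6; the function names are hypothetical,
   and the handling of the DSO Data that follows the header is omitted.

```python
import struct

DSO_OPCODE = 6  # tentative value ([TBA1]) in this draft
FORMERR = 1
# MESSAGE ID, flags word, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
HEADER = struct.Struct("!6H")


def pack_dso_header(message_id, is_response=False, rcode=0):
    """Build the 12-byte header with Z bits and all four counts zero."""
    flags = (int(is_response) << 15) | (DSO_OPCODE << 11) | (rcode & 0xF)
    return HEADER.pack(message_id, flags, 0, 0, 0, 0)


def parse_dso_header(data):
    """Return (message_id, qr, opcode, rcode, error).

    error is FORMERR if any count field is nonzero, otherwise None.
    The Z bits are ignored on reception.
    """
    if len(data) < HEADER.size:
        raise ValueError("message shorter than the 12-byte header")
    message_id, flags, qd, an, ns, ar = HEADER.unpack_from(data)
    qr = flags >> 15
    opcode = (flags >> 11) & 0xF
    rcode = flags & 0xF
    error = FORMERR if (qd or an or ns or ar) else None
    return message_id, qr, opcode, rcode, error
```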

691	6.2.1.  DNS Header Fields in DSO Messages

693	   In an unacknowledged message the MESSAGE ID field MUST be set to
694	   zero.  In an acknowledged request message the MESSAGE ID field MUST
695	   be set to a unique nonzero value that the initiator is not currently
696	   using for any other active operation on this connection.  For the
697	   purposes here, a MESSAGE ID is in use in this DSO Session if the
698	   initiator has used it in a request for which it is still awaiting a
699	   response, or if the client has used it to set up a long-lived
700	   operation that has not yet been cancelled.  For example, a long-lived
701	   operation could be a Push Notification subscription
702	   [I-D.ietf-dnssd-push] or a Discovery Relay interface subscription
703	   [I-D.ietf-dnssd-mdns-relay].
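
   One possible way for an initiator to satisfy the uniqueness rule
   above is to track the MESSAGE IDs it has in use on a session.  This
   is a minimal sketch under stated assumptions: the class name
   MessageIdAllocator is hypothetical, and random selection is merely
   one choice of allocation strategy, not something this document
   requires.

```python
import random


class MessageIdAllocator:
    """Hands out nonzero 16-bit MESSAGE IDs not in use on one session."""

    def __init__(self, rng=None):
        self._in_use = set()
        self._rng = rng or random.Random()

    def allocate(self):
        if len(self._in_use) >= 0xFFFF:
            raise RuntimeError("all 65535 nonzero MESSAGE IDs are in use")
        while True:
            # Zero is reserved for unacknowledged messages.
            candidate = self._rng.randint(1, 0xFFFF)
            if candidate not in self._in_use:
                self._in_use.add(candidate)
                return candidate

    def release(self, message_id):
        """Call when the response arrives, or for a long-lived
        operation, when that operation has been cancelled."""
        self._in_use.discard(message_id)

    def in_use(self, message_id):
        return message_id in self._in_use
```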

705	   Whether a message is acknowledged or unacknowledged is determined
706	   only by the specification for the Primary TLV.  An acknowledgment
707	   cannot be requested by including a nonzero message ID in a message
708	   the primary TLV of which is specified to be unacknowledged, nor can
709	   an acknowledgment be prevented by sending a message ID of zero in a
710	   message with a primary TLV that is specified to be acknowledged.  A
711	   responder that receives either such malformed message MUST treat it
712	   as a fatal error and forcibly abort the connection immediately.

714	   In a request or unacknowledged message the DNS Header QR bit MUST be
715	   zero (QR=0).  If the QR bit is not zero the message is not a request
716	   or unacknowledged message.

718	   In a response message the DNS Header QR bit MUST be one (QR=1).
719	   If the QR bit is not one the message is not a response message.

721	   In a response message (QR=1) the MESSAGE ID field MUST contain a copy
722	   of the value of the MESSAGE ID field in the request message being
723	   responded to.  In a response message (QR=1) the MESSAGE ID field MUST
724	   NOT be zero.  If a response message (QR=1) is received where the
725	   MESSAGE ID is zero this is a fatal error and the recipient MUST
726	   forcibly abort the connection immediately.

728	   The DNS Header OPCODE field holds the DSO OPCODE value ([TBA1]
729	   tentatively 6).

731	   The Z bits are currently unused in DSO messages, and in both DSO
732	   requests and DSO responses the Z bits MUST be set to zero (0) on
733	   transmission and MUST be silently ignored on reception.

735	   In a DNS request message (QR=0) the RCODE is set according to the
736	   definition of the request.  For example, in a Retry Delay message
737	   (Section 7.6.1) the RCODE indicates the reason for termination.
738	   However, in most cases, except where clearly specified otherwise, in
739	   a DNS request message (QR=0) the RCODE is set to zero on
740	   transmission, and silently ignored on reception.

742	   The RCODE value in a response message (QR=1) may be one of the
743	   following values:

745	   +---------+-----------+---------------------------------------------+
746	   |    Code | Mnemonic  | Description                                 |
747	   +---------+-----------+---------------------------------------------+
748	   |       0 | NOERROR   | Operation processed successfully            |
749	   |         |           |                                             |
750	   |       1 | FORMERR   | Format error                                |
751	   |         |           |                                             |
752	   |       2 | SERVFAIL  | Server failed to process request due to a   |
753	   |         |           | problem with the server                     |
754	   |         |           |                                             |
755	   |       3 | NXDOMAIN  | Name Error -- Named entity does not exist   |
756	   |         |           | (TLV-dependent)                             |
757	   |         |           |                                             |
758	   |       4 | NOTIMP    | DSO not supported                           |
759	   |         |           |                                             |
760	   |       5 | REFUSED   | Operation declined for policy reasons       |
761	   |         |           |                                             |
762	   |       9 | NOTAUTH   | Not Authoritative (TLV-dependent)           |
763	   |         |           |                                             |
764	   |  [TBA2] | DSOTYPENI | Primary TLV's DSO-Type is not implemented   |
765	   |      11 |           |                                             |
766	   +---------+-----------+---------------------------------------------+

768	   Use of the above RCODEs is likely to be common in DSO but does not
769	   preclude the definition and use of other codes in future documents
770	   that make use of DSO.

772	   If a document defining a new DSO-TYPE makes use of NXDOMAIN (Name
773	   Error) or NOTAUTH (Not Authoritative) then that document MUST specify
774	   the specific interpretation of these RCODE values in the context of
775	   that new DSO TLV.

777	6.2.2.  DSO Data

779	   The standard twelve-byte DNS message header with its zero-valued
780	   count fields is followed by the DSO Data, expressed using TLV syntax,
781	   as described below in Section 6.2.2.1.

783	   A DSO request message or DSO unacknowledged message MUST contain at
784	   least one TLV.  The first TLV in a DSO request message or DSO
785	   unacknowledged message is referred to as the "Primary TLV" and
786	   determines the nature of the operation being performed, including
787	   whether it is an acknowledged or unacknowledged operation.  In some
788	   cases it may be appropriate to include other TLVs in a request
789	   message or unacknowledged message, such as the Encryption Padding TLV
790	   (Section 8.3), and these extra TLVs are referred to as the
791	   "Additional TLVs" and are not limited to what is defined in this
792	   document.  New "Additional TLVs" may be defined in the future and
793	   those definitions will describe when their use is appropriate.

795	   A DSO response message may contain no TLVs, or it may be specified to
796	   contain one or more TLVs appropriate to the information being
797	   communicated.  This includes "Primary TLVs" and "Additional TLVs"
798	   defined in this document as well as in future TLV definitions.  It
799	   may be permissible for an additional TLV to appear in a response to a
800	   primary TLV even though the specification of that primary TLV does
801	   not specify it explicitly.  See Section 9.2 for more information.

803	   A DSO response message may contain one or more TLVs with DSO-TYPE the
804	   same as the Primary TLV from the corresponding DSO request message,
805	   in which case those TLV(s) are referred to as "Response Primary
806	   TLVs".  A DSO response message is not required to carry Response
807	   Primary TLVs.  The MESSAGE ID field in the DNS message header is
808	   sufficient to identify the DSO request message to which this response
809	   message relates.

811	   A DSO response message may contain one or more TLVs with DSO-TYPEs
812	   different from the Primary TLV from the corresponding DSO request
813	   message, in which case those TLV(s) are referred to as "Response
814	   Additional TLVs".

816	   Response Primary TLV(s), if present, MUST occur first in the response
817	   message, before any Response Additional TLVs.
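
   The ordering rule above can be checked mechanically.  The sketch
   below is illustrative only; the function name is hypothetical, and
   the DSO-TYPE values used with it are arbitrary placeholders rather
   than registry assignments.

```python
def response_tlv_order_ok(request_primary_type, response_types):
    """Return True if every Response Primary TLV precedes every
    Response Additional TLV.

    response_types lists the DSO-TYPE of each TLV in the response,
    in wire order.  An empty response is acceptable.
    """
    seen_additional = False
    for dso_type in response_types:
        if dso_type == request_primary_type:
            if seen_additional:
                return False  # a Response Primary TLV came too late
        else:
            seen_additional = True
    return True
```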

819	   It is anticipated that most DSO operations will be specified to use
820	   request messages, which generate corresponding responses.  In some
821	   specialized high-traffic use cases, it may be appropriate to specify
822	   unacknowledged messages.  Unacknowledged messages can be more
823	   efficient on the network, because they don't generate a stream of
824	   corresponding reply messages.  Using unacknowledged messages can also
825	   simplify software in some cases by removing the need for an initiator
826	   to maintain state while it waits to receive replies it doesn't care
827	   about.  When the specification for a particular TLV states that, when
828	   used as a Primary TLV (i.e., first) in an outgoing DNS request
829	   message (i.e., QR=0), that message is to be unacknowledged, the
830	   MESSAGE ID field MUST be set to zero and the receiver MUST NOT
831	   generate any response message corresponding to this unacknowledged
832	   message.

834	   The previous point, that the receiver MUST NOT generate responses to
835	   unacknowledged messages, applies even in the case of errors.  When a
836	   DSO message is received where both the QR bit and the MESSAGE ID
837	   field are zero, the receiver MUST NOT generate any response.  For
838	   example, if the DSO-TYPE in the Primary TLV is unrecognized, then a
839	   DSOTYPENI error MUST NOT be returned; instead the receiver MUST
840	   forcibly abort the connection immediately.

842	   Unacknowledged messages MUST NOT be used "speculatively" in cases
843	   where the sender doesn't know if the receiver supports the Primary
844	   TLV in the message, because there is no way to receive any response
845	   to indicate success or failure.  Unacknowledged messages are only
846	   appropriate in cases where the sender already knows that the receiver
847	   supports, and wishes to receive, these messages.

849	   For example, after a client has subscribed for Push Notifications
850	   [I-D.ietf-dnssd-push], the subsequent event notifications are then
851	   sent as unacknowledged messages, and this is appropriate because the
852	   client initiated the message stream by virtue of its Push
853	   Notification subscription, thereby indicating its support of Push
854	   Notifications, and its desire to receive those notifications.

856	   Similarly, after a Discovery Relay client has subscribed to receive
857	   inbound mDNS (multicast DNS, [RFC6762]) traffic from a Discovery
858	   Relay, the subsequent stream of received packets is then sent using
859	   unacknowledged messages, and this is appropriate because the client
860	   initiated the message stream by virtue of its Discovery Relay link
861	   subscription, thereby indicating its support of Discovery Relay, and
862	   its desire to receive inbound mDNS packets over that DSO session
863	   [I-D.ietf-dnssd-mdns-relay].

865	6.2.2.1.  TLV Syntax

867	   All TLVs, whether used as "Primary", "Additional", "Response
868	   Primary", or "Response Additional", use the same encoding syntax.

870	   Specifications that define new TLVs must specify whether the DSO-TYPE
871	   can be used as the Primary TLV, used as an Additional TLV, or used in
872	   either context, both in the case of requests and of responses.  The
873	   specification for a TLV must also state whether, when used as the
874	   Primary (i.e., first) TLV in a DNS request message (i.e., QR=0), that
875	   DSO message is to be acknowledged.  If the DSO message is to be
876	   acknowledged, the specification must also state which TLVs, if any,
877	   are to be included in the response.  The Primary TLV may or may not
878	   be contained in the response, depending on what is specified for that
879	   TLV.

881	                                                1   1   1   1   1   1
882	        0   1   2   3   4   5   6   7   8   9   0   1   2   3   4   5
883	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
884	      |                           DSO-TYPE                            |
885	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
886	      |                          DSO-LENGTH                           |
887	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
888	      |                                                               |
889	      /                           DSO-DATA                            /
890	      /                                                               /
891	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

893	   DSO-TYPE:  A 16-bit unsigned integer, in network (big endian) byte
894	      order, giving the DSO-TYPE of the current DSO TLV per the IANA DSO
895	      Type Code Registry.

897	   DSO-LENGTH:  A 16-bit unsigned integer, in network (big endian) byte
898	      order, giving the size in bytes of the DSO-DATA.

900	   DSO-DATA:  Type-code specific format.  The generic DSO machinery
901	      treats the DSO-DATA as an opaque "blob" without attempting to
902	      interpret it.  Interpretation of the meaning of the DSO-DATA for a
903	      particular DSO-TYPE is the responsibility of the software that
904	      implements that DSO-TYPE.
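
   The TLV encoding above can be sketched as follows.  This is an
   illustrative example only: the function names are hypothetical, and
   DSO-DATA is treated as an opaque byte string, as the generic DSO
   machinery is described as doing.

```python
import struct

TLV_HEADER = struct.Struct("!HH")  # DSO-TYPE, DSO-LENGTH (big endian)


def encode_tlv(dso_type, data=b""):
    """Serialize one TLV: 16-bit type, 16-bit length, then the data."""
    if len(data) > 0xFFFF:
        raise ValueError("DSO-DATA too long for a 16-bit DSO-LENGTH")
    return TLV_HEADER.pack(dso_type, len(data)) + data


def decode_tlvs(buf):
    """Split DSO Data into (dso_type, dso_data) pairs.

    Raises ValueError if a TLV header or its data is truncated.
    """
    tlvs = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < TLV_HEADER.size:
            raise ValueError("truncated TLV header")
        dso_type, length = TLV_HEADER.unpack_from(buf, offset)
        offset += TLV_HEADER.size
        if len(buf) - offset < length:
            raise ValueError("truncated DSO-DATA")
        tlvs.append((dso_type, bytes(buf[offset:offset + length])))
        offset += length
    return tlvs
```

   The first pair returned for a request or unacknowledged message is
   the Primary TLV; any remaining pairs are Additional TLVs.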

906	6.2.2.2.  Request TLVs

908	   The first TLV in a DSO request message or unacknowledged message is
909	   the "Primary TLV" and indicates the operation to be performed.  A DSO
910	   request message or unacknowledged message MUST contain at least
911	   one TLV, the Primary TLV.

913	   Immediately following the Primary TLV, a DSO request message or
914	   unacknowledged message MAY contain one or more "Additional TLVs",
915	   which specify additional parameters relating to the operation.

917	6.2.2.3.  Response TLVs

919	   Depending on the operation, a DSO response message MAY contain no
920	   TLVs, because it is simply a response to a previous request message,
921	   and the MESSAGE ID in the header is sufficient to identify the
922	   request in question.  Or it may contain a single response TLV, with
923	   the same DSO-TYPE as the Primary TLV in the request message.
924	   Alternatively it may contain one or more TLVs of other types, or a
925	   combination of the above, as appropriate for the information that
926	   needs to be communicated.  The specification for each DSO TLV
927	   determines what TLVs are required in a response to a request using
928	   that TLV.

930	   If a DSO response is received for an operation where the
931	   specification requires that the response carry a particular TLV or
932	   TLVs, and the required TLV(s) are not present, then this is a fatal
933	   error and the recipient of the defective response message MUST
934	   forcibly abort the connection immediately.

936	6.2.2.4.  Unrecognized TLVs

938	   If a DSO request message is received containing an unrecognized Primary
939	   TLV, with a nonzero MESSAGE ID (indicating that a response is
940	   expected), then the receiver MUST send an error response with
941	   matching MESSAGE ID, and RCODE DSOTYPENI ([TBA2] tentatively 11).
942	   The error response MUST NOT contain a copy of the unrecognized
943	   Primary TLV.

945	   If a DSO unacknowledged message is received containing an unrecognized
946	   Primary TLV, with a zero MESSAGE ID (indicating that no response is
947	   expected), then this is a fatal error and the recipient MUST forcibly
948	   abort the connection immediately.

950	   If a DSO request message or unacknowledged message is received where
951	   the Primary TLV is recognized, containing one or more unrecognized
952	   Additional TLVs, the unrecognized Additional TLVs MUST be silently
953	   ignored, and the remainder of the message is interpreted and handled
954	   as if the unrecognized parts were not present.

956	   Similarly, if a DSO response message is received containing one or
957	   more unrecognized TLVs, the unrecognized TLVs MUST be silently
958	   ignored, and the remainder of the message is interpreted and handled
959	   as if the unrecognized parts were not present.
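
   The rules above for a received request or unacknowledged message
   (QR=0) can be sketched as a single decision function.  This is an
   illustration under stated assumptions: the names Action and
   handle_incoming are hypothetical, the sets of recognized DSO-TYPEs
   are supplied by the caller, and a message with no TLVs at all is
   rejected by the sketch without modeling how it should be handled.

```python
from enum import Enum


class Action(Enum):
    PROCESS = "process the operation"
    RESPOND_DSOTYPENI = "send error response with RCODE DSOTYPENI"
    ABORT = "forcibly abort the connection"


def handle_incoming(message_id, tlv_types, known_primary, known_additional):
    """Return (action, additional_types_to_process) for a QR=0 message.

    tlv_types lists the DSO-TYPE of each TLV in wire order.
    """
    if not tlv_types:
        raise ValueError("no Primary TLV; not covered by this sketch")
    primary, additional = tlv_types[0], tlv_types[1:]
    if primary not in known_primary:
        if message_id != 0:
            # A response is expected, so report the unknown DSO-TYPE.
            return Action.RESPOND_DSOTYPENI, []
        # Unacknowledged message: no response may be sent.
        return Action.ABORT, []
    # Unrecognized Additional TLVs are silently ignored.
    kept = [t for t in additional if t in known_additional]
    return Action.PROCESS, kept
```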

961	6.2.3.  EDNS(0) and TSIG

963	   Since the ARCOUNT field MUST be zero, a DSO message can't contain a
964	   valid EDNS(0) option in the additional records section.  If
965	   functionality provided by current or future EDNS(0) options is
966	   desired for DSO messages, one or more new DSO TLVs need to be defined
967	   to carry the necessary information.

969	   For example, the EDNS(0) Padding Option [RFC7830] used for security
970	   purposes is not permitted in a DSO message, so if message padding is
971	   desired for DSO messages then the Encryption Padding TLV described in
972	   Section 8.3 MUST be used.

974	   Similarly, a DSO message MUST NOT contain a TSIG record.  A TSIG
975	   record in a conventional DNS message is added as the last record in
976	   the additional records section, and carries a signature computed over
977	   the preceding message content.  Since DSO data appears *after* the
978	   additional records section, it would not be included in the signature
979	   calculation.  If use of signatures with DSO messages becomes
980	   necessary in the future, a new DSO TLV needs to be defined to perform
981	   this function.

983	   Note however that, while DSO *messages* cannot include EDNS(0) or
984	   TSIG records, a DSO *session* is typically used to carry a whole
985	   series of DNS messages of different kinds, including DSO messages,
986	   and other DNS message types like Query [RFC1034] [RFC1035] and Update
987	   [RFC2136], and those messages can carry EDNS(0) and TSIG records.

989	   Although messages may contain other EDNS(0) options as appropriate,
990	   this specification explicitly prohibits use of the edns-tcp-keepalive
991	   EDNS0 Option [RFC7828] in *any* messages sent on a DSO Session
992	   (because it is obsoleted by the functionality provided by the DSO
993	   Keepalive operation).  If any message sent on a DSO Session contains
994	   an edns-tcp-keepalive EDNS0 Option this is a fatal error and the
995	   recipient of the defective message MUST forcibly abort the connection
996	   immediately.

998	6.3.  Message Handling

1000	   The initiator MUST set the value of the QR bit in the DNS header to
1001	   zero (0), and the responder MUST set it to one (1).

1003	   As described above in Section 6.2.1, whether an outgoing message with
1004	   QR=0 is unacknowledged or acknowledged is determined by the
1005	   specification for the Primary TLV, which in turn determines whether
1006	   the MESSAGE ID field in that outgoing message will be zero or
1007	   nonzero.

1009	   A DSO unacknowledged message has both the QR bit and the MESSAGE ID
1010	   field set to zero, and MUST NOT elicit a response.

1012	   Every DSO request message (QR=0) with a nonzero MESSAGE ID field is
1013	   an acknowledged DSO request, and MUST elicit a corresponding response
1014	   (QR=1), which MUST have the same MESSAGE ID in the DNS message header
1015	   as in the corresponding request.

1017	   Valid DSO request messages sent by the client with a nonzero MESSAGE
1018	   ID field elicit a response from the server, and valid DSO request
1019	   messages sent by the server with a nonzero MESSAGE ID field elicit a
1020	   response from the client.

1022	   The namespaces of 16-bit MESSAGE IDs are independent in each
1023	   direction.  This means it is *not* an error for both client and
1024	   server to send request messages at the same time as each other, using
1025	   the same MESSAGE ID, in different directions.  This simplification is
1026	   necessary in order for the protocol to be implementable.  It would be
1027	   infeasible to require the client and server to coordinate with each
1028	   other regarding allocation of new unique MESSAGE IDs.  Nor is such
1029	   coordination necessary, because each direction has its own
1030	   namespace.  The value of
1031	   the 16-bit MESSAGE ID combined with the identity of the initiator
1032	   (client or server) is sufficient to unambiguously identify the
1033	   operation in question.  This can be thought of as a 17-bit message
1034	   identifier space, using message identifiers 0x00001-0x0FFFF for
1035	   client-to-server DSO request messages, and message identifiers
1036	   0x10001-0x1FFFF for server-to-client DSO request messages.  The
1037	   least-significant 16 bits are stored explicitly in the MESSAGE ID
1038	   field of the DSO message, and the most-significant bit is implicit
1039	   from the direction of the message.
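
   The 17-bit view described above can be written out directly.  This
   is a small illustrative sketch; the function name operation_key is
   hypothetical, and an implementation could equally keep two separate
   tables, one per direction.

```python
def operation_key(message_id, initiated_by_server):
    """Combine a nonzero 16-bit MESSAGE ID with the implicit
    direction bit to form a 17-bit operation identifier."""
    if not 1 <= message_id <= 0xFFFF:
        raise ValueError("MESSAGE ID of a request must be 1..65535")
    return (int(initiated_by_server) << 16) | message_id
```

   With this mapping, a client request and a server request that use
   the same MESSAGE ID at the same time yield different keys.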

1041	   As described above in Section 6.2.1, an initiator MUST NOT reuse a
1042	   MESSAGE ID that it already has in use for an outstanding request
1043	   (unless specified otherwise by the relevant specification for the
1044	   DSO-TYPE in question).  At the very least, this means that a MESSAGE
1045	   ID can't be reused in a particular direction on a particular DSO
1046	   Session while the initiator is waiting for a response to a previous
1047	   request using that MESSAGE ID on that DSO Session (unless specified
1048	   otherwise by the relevant specification for the DSO-TYPE in
1049	   question), and for a long-lived operation the MESSAGE ID for the
1050	   operation can't be reused while that operation remains active.

1052	   If a client or server receives a response (QR=1) where the MESSAGE ID
1053	   is zero, or is any other value that does not match the MESSAGE ID of
1054	   any of its outstanding operations, this is a fatal error and the
1055	   recipient MUST forcibly abort the connection immediately.
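
   A receiver of responses might apply the check above as shown in the
   following sketch.  The exception type FatalSessionError and the
   function match_response are hypothetical names, and the table of
   outstanding operations is assumed to be maintained by the caller.

```python
class FatalSessionError(Exception):
    """Signals that the connection must be forcibly aborted."""


def match_response(message_id, outstanding):
    """Return the operation that a received response (QR=1) belongs to.

    outstanding maps in-use MESSAGE IDs to caller-defined operation
    objects.  The entry is left in place, because a long-lived
    operation keeps its MESSAGE ID in use after the first response.
    """
    if message_id == 0 or message_id not in outstanding:
        raise FatalSessionError("response matches no outstanding operation")
    return outstanding[message_id]
```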

1057	   If a responder receives a request (QR=0) where the MESSAGE ID is not
1058	   zero, and the responder tracks query MESSAGE IDs, and the MESSAGE ID
1059	   matches the MESSAGE ID of a query it received for which a response
1060	   has not yet been sent, it MUST forcibly abort the connection
1061	   immediately.  This behavior is required to prevent a hypothetical
1062	   attack that takes advantage of undefined behavior in this case.
1063	   However, if the server does not track MESSAGE IDs in this way, no
1064	   such risk exists, so tracking MESSAGE IDs just to implement this
1065	   sanity check is not required.
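
The two MESSAGE ID checks above might be sketched as follows.  This is
a minimal illustration, not a definitive implementation: the function
name, the string results, and the use of None to mean "this endpoint
does not track request MESSAGE IDs" are all assumptions made for the
example.

```python
def check_inbound_message_id(qr, message_id, outstanding, unanswered):
    """Return "abort" when the rules above require aborting, else "accept".

    outstanding: MESSAGE IDs of this endpoint's own outstanding operations.
    unanswered:  MESSAGE IDs of received requests not yet answered, or
                 None if the endpoint does not track them (assumption).
    """
    if qr == 1:
        # A response must match an outstanding operation; zero never does.
        if message_id == 0 or message_id not in outstanding:
            return "abort"
        return "accept"
    # A request is checked only when nonzero and when tracking is in use.
    if message_id != 0 and unanswered is not None and message_id in unanswered:
        return "abort"
    return "accept"
```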

1067	6.3.1.  Error Responses

1069	   When an unacknowledged DSO message type is received (MESSAGE ID field
1070	   is zero), the receiver SHOULD already be expecting this DSO message
1071	   type.  Section 6.2.2.4 describes the handling of unknown DSO message
1072	   types.  Parsing errors MUST also result in the receiver aborting the
1073	   connection.  When an unacknowledged DSO message of an unexpected type
1074	   is received, the receiver should abort the connection.  For other
1075	   internal errors encountered while processing an unacknowledged DSO
1076	   message, the decision whether to abort the connection is left to
1077	   the implementation, according to the severity of the error.

1079	   When an acknowledged DSO request message is unsuccessful for some
1080	   reason, the responder returns an error code to the initiator.

1082	   In the case of a server returning an error code to a client in
1083	   response to an unsuccessful DSO request message, the server MAY
1084	   choose to end the DSO Session, or MAY choose to allow the DSO Session
1085	   to remain open.  For error conditions that only affect the single
1086	   operation in question, the server SHOULD return an error response to
1087	   the client and leave the DSO Session open for further operations.

1089	   For error conditions that are likely to make all operations
1090	   unsuccessful in the immediate future, the server SHOULD return an
1091	   error response to the client and then end the DSO Session by sending
1092	   a Retry Delay message, as described in Section 7.6.1.

1094	   Upon receiving an error response from the server, a client SHOULD NOT
1095	   automatically close the DSO Session.  An error relating to one
1096	   particular operation on a DSO Session does not necessarily imply that
1097	   all other operations on that DSO Session have also failed, or that
1098	   future operations will fail.  The client should assume that the
1099	   server will make its own decision about whether or not to end the DSO
1100	   Session, based on the server's determination of whether the error
1101	   condition pertains to this particular operation, or would also apply
1102	   to any subsequent operations.  If the server does not end the DSO
1103	   Session by sending the client a Retry Delay message (Section 7.6.1)
1104	   then the client SHOULD continue to use that DSO Session for
1105	   subsequent operations.

1107	6.4.  Flow Control Considerations

1109	   Because unacknowledged DSO messages do not generate an immediate
1110	   response from the responder, if there is no other traffic flowing
1111	   from the responder to the initiator, this can result in a 200ms delay
1112	   before the TCP acknowledgment is sent to the initiator [NagleDA].  If
1113	   the initiator has another message pending, but has not yet filled its
1114	   output buffer, this can delay the delivery of that message by more
1115	   than 200ms.  In many cases, this will make no difference.  However,
1116	   implementors should be aware of this issue.  Some operating systems
1117	   offer ways to disable the 200ms TCP acknowledgment delay; this may be
1118	   useful for relatively low-traffic sessions, or sessions with bursty
1119	   traffic flows.

1121	6.5.  Responder-Initiated Operation Cancellation

1123	   This document, the base specification for DNS Stateful Operations,
1124	   does not itself define any long-lived operations, but it defines a
1125	   framework for supporting long-lived operations, such as Push
1126	   Notification subscriptions [I-D.ietf-dnssd-push] and Discovery Relay
1127	   interface subscriptions [I-D.ietf-dnssd-mdns-relay].

1129	   Generally speaking, a long-lived operation is initiated by the
1130	   initiator, and, if successful, remains active until the initiator
1131	   terminates the operation.

1133	   However, it is possible that a long-lived operation may be valid at
1134	   the time it was initiated, but then a later change of circumstances
1135	   may render that previously valid operation invalid.

1137	   For example, a long-lived client operation may pertain to a name that
1138	   the server is authoritative for, but then the server configuration is
1139	   changed such that it is no longer authoritative for that name.

1141	   In such cases, instead of terminating the entire session it may be
1142	   desirable for the responder to be able to cancel selectively only
1143	   those operations that have become invalid.

1145	   The responder performs this selective cancellation by sending a new
1146	   response message, with the MESSAGE ID field containing the MESSAGE ID
1147	   of the long-lived operation that is to be terminated (that it had
1148	   previously acknowledged with a NOERROR RCODE), and the RCODE field of
1149	   the new response message giving the reason for cancellation.

1151	   After a response message with nonzero RCODE has been sent, that
1152	   operation has been terminated from the responder's point of view, and
1153	   the responder sends no more messages relating to that operation.

1155	   After a response message with nonzero RCODE has been received by the
1156	   initiator, that operation has been terminated from the initiator's
1157	   point of view, and the cancelled operation's MESSAGE ID is now free
1158	   for reuse.
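
From the initiator's side, the cancellation rule above could be modeled
roughly as shown here.  The dictionary of active operations keyed by
MESSAGE ID and the function name are hypothetical, and the sketch
assumes the caller has already verified that the MESSAGE ID matches an
active operation.

```python
NOERROR = 0  # RCODE value zero


def apply_response(active_operations: dict, message_id: int, rcode: int):
    """Apply a response for a long-lived operation at the initiator.

    Returns the cancelled operation, or None if it remains active.
    """
    if rcode != NOERROR:
        # Terminated: dropping the state frees the MESSAGE ID for reuse.
        return active_operations.pop(message_id)
    return None
```

Only the operation named by the MESSAGE ID is removed; other
operations on the same DSO Session are left untouched.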

1160	7.  DSO Session Lifecycle and Timers

1162	7.1.  DSO Session Initiation

1164	   A DSO Session begins as described in Section 6.1.

1166	   The client may perform as many DNS operations as it wishes using the
1167	   newly created DSO Session.  When the client has multiple messages to
1168	   send, it SHOULD NOT wait for each response before sending the next
1169	   message.  This prevents TCP's delayed acknowledgement algorithm from
1170	   forcing the client into a slow lock-step.  The server MUST act on
1171	   messages in the order they are transmitted, but SHOULD NOT delay
1172	   sending responses to those messages as they become available in order
1173	   to return them in the order the requests were received.  [RFC7766]
1174	   section 3.3 specifies this in more detail.

1176	7.2.  DSO Session Timeouts

1178	   Two timeout values are associated with a DSO Session: the inactivity
1179	   timeout, and the keepalive interval.  Both values are communicated in
1180	   the same TLV, the Keepalive TLV (Section 8.1).

1182	   The first timeout value, the inactivity timeout, is the maximum time
1183	   for which a client may speculatively keep a DSO Session open with no
1184	   operations pending (such as an outstanding DNS Push request), in the
1185	   expectation that it may have future requests to send to that server.

1187	   The second timeout value, the keepalive interval, is the maximum
1188	   permitted interval between messages if the client wishes to keep the
1189	   DSO Session alive.

1191	   The two timeout values are independent.  The inactivity timeout may
1192	   be lower than, the same as, or higher than the keepalive interval,
1193	   though in most cases the inactivity timeout is expected to be shorter
1194	   than the keepalive interval.

1196	   A shorter inactivity timeout with a longer keepalive interval signals
1197	   to the client that it should not speculatively keep an inactive DSO
1198	   Session open for very long without reason, but when it does have an
1199	   active reason to keep a DSO Session open, it doesn't need to be
1200	   sending an aggressive level of DSO keepalive traffic to maintain that
1201	   session.  An example of this would be a client that has subscribed to
1202	   DNS Push notifications: in this case, the client is not sending any
1203	   traffic to the server, but the session is not inactive, because there
1204	   is a pending request to the server to receive push notifications.

1206	   A longer inactivity timeout with a shorter keepalive interval signals
1207	   to the client that it may speculatively keep an inactive DSO Session
1208	   open for a long time, but to maintain that inactive DSO Session it
1209	   should be sending a lot of DSO keepalive traffic.  This configuration
1210	   is expected to be less common.

1212	   In the usual case where the inactivity timeout is shorter than the
1213	   keepalive interval, it is only when a client has a very long-lived,
1214	   low-traffic operation that the keepalive interval comes into play,
1215	   to ensure that a sufficient residual amount of traffic is generated
1216	   to maintain NAT and firewall state and to assure client and server
1217	   that they still have connectivity to each other.

1219	   On a new DSO Session, if no explicit DSO Keepalive message exchange
1220	   has taken place, the default value for both timeouts is 15 seconds.

1222	   For both timeouts, lower values of the timeout result in higher
1223	   network traffic and higher CPU load on the server.

1225	7.3.  Inactive DSO Sessions

1227	   At both servers and clients, the generation or reception of any
1228	   complete DNS message, including DNS requests, responses, updates, or
1229	   DSO messages, resets both timers for that DSO Session, with the
1230	   exception that a DSO Keepalive message resets only the keepalive
1231	   timer, not the inactivity timeout timer.

1233	   In addition, for as long as the client has an outstanding operation
1234	   in progress, the inactivity timer remains cleared, and an inactivity
1235	   timeout cannot occur.

1237	   For short-lived DNS operations like traditional queries and updates,
1238	   an operation is considered in progress for the time between request
1239	   and response, typically a period of a few hundred milliseconds at
1240	   most.  At the client, the inactivity timer is cleared upon
1241	   transmission of a request and remains cleared until reception of the
1242	   corresponding response.  At the server, the inactivity timer is
1243	   cleared upon reception of a request and remains cleared until
1244	   transmission of the corresponding response.

1246	   For long-lived DNS Stateful operations (such as a Push Notification
1247	   subscription [I-D.ietf-dnssd-push] or a Discovery Relay interface
1248	   subscription [I-D.ietf-dnssd-mdns-relay]), an operation is considered
1249	   in progress for as long as the operation is active, until it is
1250	   cancelled.  This means that a DSO Session can exist, with active
1251	   operations, with no messages flowing in either direction, for far
1252	   longer than the inactivity timeout, and this is not an error.  This
1253	   is why there are two separate timers: the inactivity timeout, and the
1254	   keepalive interval.  Just because a DSO Session has no traffic for an
1255	   extended period of time does not automatically make that DSO Session
1256	   "inactive", if it has an active operation that is awaiting events.
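
The timer rules in this section can be sketched as a small bookkeeping
class.  This is an illustrative model under stated assumptions: the
class and method names are hypothetical, time is passed in explicitly
as a number of seconds, and restarting the inactivity timer when the
last operation finishes is one reasonable reading of "remains cleared".

```python
class SessionTimers:
    """Tracks the two DSO Session timers using caller-supplied times."""

    def __init__(self, now: float):
        self.last_keepalive_reset = now
        self.last_inactivity_reset = now
        self.active_operations = 0

    def on_message(self, now: float, is_dso_keepalive: bool) -> None:
        # Any complete DNS message resets the keepalive timer.
        self.last_keepalive_reset = now
        # A DSO Keepalive message does not reset the inactivity timer.
        if not is_dso_keepalive:
            self.last_inactivity_reset = now

    def operation_started(self) -> None:
        self.active_operations += 1

    def operation_finished(self, now: float) -> None:
        self.active_operations -= 1
        if self.active_operations == 0:
            self.last_inactivity_reset = now

    def idle_for(self, now: float) -> float:
        # While an operation is in progress the inactivity timer is cleared.
        if self.active_operations > 0:
            return 0.0
        return now - self.last_inactivity_reset

    def silent_for(self, now: float) -> float:
        return now - self.last_keepalive_reset
```

With a long-lived operation active, idle_for() stays at zero however
long the session is silent, while silent_for() keeps growing and is
what drives DSO keepalive traffic.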

1258	7.4.  The Inactivity Timeout

1260	   The purpose of the inactivity timeout is for the server to balance
1261	   the trade-off between the costs of setting up new DSO Sessions and
1262	   the costs of maintaining inactive DSO Sessions.  A server with
1263	   abundant DSO Session capacity can offer a high inactivity timeout, to
1264	   permit clients to keep a speculative DSO Session open for a long
1265	   time, to save the cost of establishing a new DSO Session for future
1266	   communications with that server.  A server with scarce memory
1267	   resources can offer a low inactivity timeout, to cause clients to
1268	   promptly close DSO Sessions whenever they have no outstanding
1269	   operations with that server, and then create a new DSO Session later
1270	   when needed.

1272	7.4.1.  Closing Inactive DSO Sessions

1274	   When a connection's inactivity timeout is reached the client MUST
1275	   begin closing the idle connection, but a client is not required to
1276	   keep an idle connection open until the inactivity timeout is reached.
1277	   A client MAY close a DSO Session at any time, at the client's
1278	   discretion.  If a client determines that it has no current or
1279	   reasonably anticipated future need for a currently inactive DSO
1280	   Session, then the client SHOULD gracefully close that connection.

1282	   If, at any time during the life of the DSO Session, the inactivity
1283	   timeout value (i.e., 15 seconds by default) elapses without there
1284	   being any operation active on the DSO Session, the client MUST close
1285	   the connection gracefully.

1287	   If, at any time during the life of the DSO Session, twice the
1288	   inactivity timeout value (i.e., 30 seconds by default), or five
1289	   seconds, if twice the inactivity timeout value is less than five
1290	   seconds, elapses without there being any operation active on the DSO
1291	   Session, the server MUST consider the client delinquent, and MUST
1292	   forcibly abort the DSO Session.

1294	   In this context, an operation being active on a DSO Session includes
1295	   a query waiting for a response, an update waiting for a response, or
1296	   an active long-lived operation, but not a DSO Keepalive message
1297	   exchange itself.  A DSO Keepalive message exchange resets only the
1298	   keepalive interval timer, not the inactivity timeout timer.

1300	   If the client wishes to keep an inactive DSO Session open for longer
1301	   than the default duration then it uses the DSO Keepalive message to
1302	   request longer timeout values, as described in Section 8.1.

1304	7.4.2.  Values for the Inactivity Timeout

1306	   For the inactivity timeout value, lower values result in more
1307	   frequent DSO Session teardown and re-establishment.  Higher values
1308	   result in lower traffic and lower CPU load on the server, but higher
1309	   memory burden to maintain state for inactive DSO Sessions.

1311	   A server may dictate any value it chooses for the inactivity timeout
1312	   (either in a response to a client-initiated request, or in a server-
1313	   initiated message) including values under one second, or even zero.

1315	   An inactivity timeout of zero informs the client that it should not
1316	   speculatively maintain idle connections at all, and as soon as the
1317	   client has completed the operation or operations relating to this
1318	   server, the client should immediately begin closing this session.

1320	   A server will abort an idle client session after twice the inactivity
1321	   timeout value, or five seconds, whichever is greater.  In the case of
1322	   a zero inactivity timeout value, this means that if a client fails to
1323	   close an idle client session then the server will forcibly abort the
1324	   idle session after five seconds.

1326	   An inactivity timeout of 0xFFFFFFFF represents "infinity" and informs
1327	   the client that it may keep an idle connection open as long as it
1328	   wishes.  Note that after granting an unlimited inactivity timeout in
1329	   this way, at any point the server may revise that inactivity timeout
1330	   by sending a new DSO Keepalive message dictating new Session Timeout
1331	   values to the client.

1333	   The largest *finite* inactivity timeout supported by the current
1334	   Keepalive TLV is 0xFFFFFFFE (2^32-2 milliseconds, approximately 49.7
1335	   days).
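
The server-side abort threshold and the special values described in
this section work out as in the sketch below.  The names are
hypothetical, and returning None for the "infinity" value (meaning the
server does not abort for inactivity) is an interpretation made for
the example.

```python
INFINITE_TIMEOUT = 0xFFFFFFFF        # "infinity"
LARGEST_FINITE_TIMEOUT = 0xFFFFFFFE  # 2^32-2 milliseconds
MS_PER_DAY = 24 * 60 * 60 * 1000


def server_abort_after_ms(inactivity_timeout_ms: int):
    """Idle time after which the server aborts, or None for infinity."""
    if inactivity_timeout_ms == INFINITE_TIMEOUT:
        return None
    # Twice the inactivity timeout, or five seconds, whichever is greater.
    return max(2 * inactivity_timeout_ms, 5_000)
```

For the default of 15 seconds this gives 30 seconds; for a zero
timeout it gives five seconds.  Dividing LARGEST_FINITE_TIMEOUT by
MS_PER_DAY gives approximately 49.7 days.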

1337	7.5.  The Keepalive Interval

1339	   The purpose of the keepalive interval is to manage the generation of
1340	   sufficient messages to maintain state in middleboxes (such as NAT
1341	   gateways or firewalls) and for the client and server to periodically
1342	   verify that they still have connectivity to each other.  This allows
1343	   them to clean up state when connectivity is lost, and to establish a
1344	   new session if appropriate.

1346	7.5.1.  Keepalive Interval Expiry

1348	   If, at any time during the life of the DSO Session, the keepalive
1349	   interval value (i.e., 15 seconds by default) elapses without any DNS
1350	   messages being sent or received on a DSO Session, the client MUST
1351	   take action to keep the DSO Session alive, by sending a DSO Keepalive
1352	   message (Section 8.1).  A DSO Keepalive message exchange resets only
1353	   the keepalive timer, not the inactivity timer.

1355	   If a client disconnects from the network abruptly, without cleanly
1356	   closing its DSO Session, perhaps leaving a long-lived operation
1357	   uncancelled, the server learns of this after failing to receive the
1358	   required DSO keepalive traffic from that client.  If, at any time
1359	   during the life of the DSO Session, twice the keepalive interval
1360	   value (i.e., 30 seconds by default) elapses without any DNS messages
1361	   being sent or received on a DSO Session, the server SHOULD consider
1362	   the client delinquent, and SHOULD forcibly abort the DSO Session.
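
A minimal sketch of the expiry rules above, assuming each endpoint
knows how long the DSO Session has been silent.  The function name,
the role strings, and the returned action strings are hypothetical.

```python
def keepalive_action(role: str, silent_ms: int,
                     keepalive_interval_ms: int = 15_000) -> str:
    """Decide what an endpoint does after silent_ms without DNS messages."""
    if role == "client" and silent_ms >= keepalive_interval_ms:
        # The client must send a DSO Keepalive message.
        return "send-keepalive"
    if role == "server" and silent_ms >= 2 * keepalive_interval_ms:
        # The server considers the client delinquent.
        return "abort"
    return "wait"
```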

1364	7.5.2.  Values for the Keepalive Interval

1366	   For the keepalive interval value, lower values result in a higher
1367	   volume of DSO keepalive traffic.  Higher values of the keepalive
1368	   interval reduce traffic and CPU load, but have minimal effect on the
1369	   memory burden at the server, because clients keep a DSO Session open
1370	   for the same length of time (determined by the inactivity timeout)
1371	   regardless of the level of DSO keepalive traffic required.

1373	   It may be appropriate for clients and servers to select different
1374	   keepalive interval values depending on the nature of the network they
1375	   are on.

1377	   A corporate DNS server that knows it is serving only clients on the
1378	   internal network, with no intervening NAT gateways or firewalls, can
1379	   impose a higher keepalive interval, because frequent DSO keepalive
1380	   traffic is not required.

1382	   A public DNS server that is serving primarily residential consumer
1383	   clients, where it is likely there will be a NAT gateway on the path,
1384	   may impose a lower keepalive interval, to generate more frequent DSO
1385	   keepalive traffic.

1387	   A smart client may be adaptive to its environment.  A client using a
1388	   private IPv4 address [RFC1918] to communicate with a DNS server at an
1389	   address outside that IPv4 private address block, may conclude that
1390	   there is likely to be a NAT gateway on the path, and accordingly
1391	   request a lower keepalive interval.

1393	   By default it is RECOMMENDED that clients request, and servers grant,
1394	   a keepalive interval of 60 minutes.  This keepalive interval provides
1395	   for reasonably timely detection if a client abruptly disconnects
1396	   without cleanly closing the session, and is sufficient to maintain
1397	   state in firewalls and NAT gateways that follow the IETF recommended
1398	   Best Current Practice that the "established connection idle-timeout"
1399	   used by middleboxes be at least 2 hours 4 minutes [RFC5382]
1400	   [RFC7857].

1402	   Note that the lower the keepalive interval value, the higher the load
1403	   on client and server.  For example, a hypothetical keepalive interval
1404	   value of 100ms would result in a continuous stream of at least ten
1405	   messages per second, in both directions, to keep the DSO Session
1406	   alive.  And, in this extreme example, a single packet loss and
1407	   retransmission over a long path could introduce a momentary pause in
1408	   the stream of messages, long enough to cause the server to
1409	   overzealously abort the connection.

1411	   Because of this concern, the server MUST NOT send a DSO Keepalive
1412	   message (either a response to a client-initiated request, or a
1413	   server-initiated message) with a keepalive interval value less than
1414	   ten seconds.  If a client receives a DSO Keepalive message specifying
1415	   a keepalive interval value less than ten seconds this is a fatal
1416	   error and the client MUST forcibly abort the connection immediately.
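
On the client side, the ten-second floor could be checked as in this
sketch.  The constant and function names are hypothetical.

```python
MIN_KEEPALIVE_INTERVAL_MS = 10_000  # ten seconds


def client_check_keepalive_interval(keepalive_interval_ms: int) -> str:
    """Return "abort" for an interval below ten seconds, else "accept"."""
    if keepalive_interval_ms < MIN_KEEPALIVE_INTERVAL_MS:
        # Fatal error: the client forcibly aborts the connection.
        return "abort"
    return "accept"
```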

1418	   A keepalive interval value of 0xFFFFFFFF represents "infinity" and
1419	   informs the client that it should generate no DSO keepalive traffic.
1420	   Note that after signaling that the client should generate no DSO
1421	   keepalive traffic in this way, at any point the server may revise
1422	   that DSO keepalive traffic requirement by sending a new DSO Keepalive
1423	   message dictating new Session Timeout values to the client.

1425	   The largest *finite* keepalive interval supported by the current
1426	   Keepalive TLV is 0xFFFFFFFE (2^32-2 milliseconds, approximately 49.7
1427	   days).

1429	7.6.  Server-Initiated Session Termination

1431	   In addition to cancelling individual long-lived operations
1432	   selectively (Section 6.5) there are also occasions where a server may
1433	   need to terminate one or more entire sessions.  An entire session may
1434	   need to be terminated if the client is defective in some way, or
1435	   departs from the network without closing its session.  Sessions may
1436	   also need to be terminated if the server becomes overloaded, or if
1437	   the server is reconfigured and lacks the ability to be selective
1438	   about which operations need to be cancelled.

1440	   This section discusses various reasons a session may be terminated,
1441	   and the mechanisms for doing so.

1443	   In normal operation, closing a DSO Session is the client's
1444	   responsibility.  The client makes the determination of when to close
1445	   a DSO Session based on an evaluation of both its own needs, and the
1446	   inactivity timeout value dictated by the server.  A server only
1447	   causes a DSO Session to be ended in the exceptional circumstances
1448	   outlined below.

1450	   Some of the exceptional situations in which a server may terminate a
1451	   DSO Session include:

1453	   o  The server application software or underlying operating system is
1454	      shutting down or restarting.

1456	   o  The server application software terminates unexpectedly (perhaps
1457	      due to a bug that makes it crash).

1459	   o  The server is undergoing a reconfiguration or maintenance
1460	      procedure, that, due to the way the server software is
1461	      implemented, requires clients to be disconnected.  For example,
1462	      some software is implemented such that it reads a configuration
1463	      file at startup, and changing the server's configuration entails
1464	      modifying the configuration file and then killing and restarting
1465	      the server software, which generally entails a loss of network
1466	      connections.

1468	   o  The client fails to meet its obligation to generate the required
1469	      DSO keepalive traffic, or to close an inactive session by the
1470	      prescribed time (twice the time interval dictated by the server,
1471	      or five seconds, whichever is greater, as described in
1472	      Section 7.2).

1474	   o  The client sends a grossly invalid or malformed request that is
1475	      indicative of a seriously defective client implementation.

1477	   o  The server is over capacity and needs to shed some load.

1479	7.6.1.  Server-Initiated Retry Delay Message

1481	   In the cases described above where a server elects to terminate a DSO
1482	   Session, it could do so simply by forcibly aborting the connection.
1483	   However, if it did this the likely behavior of the client might be
1484	   simply to treat this as a network failure and reconnect
1485	   immediately, putting more burden on the server.

1487	   Therefore, to avoid this reconnection implosion, a server SHOULD
1488	   instead choose to shed client load by sending a Retry Delay message,
1489	   with an appropriate RCODE value informing the client of the reason
1490	   the DSO Session needs to be terminated.  The format of the Retry
1491	   Delay TLV, and the interpretations of the various RCODE values, are
1492	   described in Section 8.2.  After sending a Retry Delay message, the
1493	   server MUST NOT send any further messages on that DSO Session.

1495	   The server MAY randomize retry delays in situations where many retry
1496	   delays are sent in quick succession, so as to avoid all the clients
1497	   attempting to reconnect at once.  In general, implementations should
1498	   avoid using the Retry Delay message in a way that would result in
1499	   many clients reconnecting at the same time, if every client attempts
1500	   to reconnect at the exact time specified.

1502	   Upon receipt of a Retry Delay message from the server, the client
1503	   MUST make note of the reconnect delay for this server, and then
1504	   immediately close the connection gracefully.

1506	   After sending a Retry Delay message the server SHOULD allow the
1507	   client five seconds to close the connection, and if the client has
1508	   not closed the connection after five seconds then the server SHOULD
1509	   forcibly abort the connection.

1511	   A Retry Delay message MUST NOT be initiated by a client.  If a server
1512	   receives a Retry Delay message this is a fatal error and the server
1513	   MUST forcibly abort the connection immediately.

1515	7.6.1.1.  Outstanding Operations

1517	   At the instant a server chooses to initiate a Retry Delay message
1518	   there may be DNS requests already in flight from client to server on
1519	   this DSO Session, which will arrive at the server after its Retry
1520	   Delay message has been sent.  The server MUST silently ignore such
1521	   incoming requests, and MUST NOT generate any response messages for
1522	   them.  When the Retry Delay message from the server arrives at the
1523	   client, the client will determine that any DNS requests it previously
1524	   sent on this DSO Session, that have not yet received a response, now
1525	   will certainly not be receiving any response.  Such requests should
1526	   be considered failed, and should be retried at a later time, as
1527	   appropriate.

1529	   In the case where some, but not all, of the existing operations on a
1530	   DSO Session have become invalid (perhaps because the server has been
1531	   reconfigured and is no longer authoritative for some of the names),
1532	   but the server is terminating all affected DSO Sessions en masse by
1533	   sending them all a Retry Delay message, the reconnect delay MAY be
1534	   zero, indicating that the clients SHOULD immediately attempt to re-
1535	   establish operations.

1537	   It is likely that some of the attempts will be successful and some
1538	   will not, depending on the nature of the reconfiguration.

1540	   In the case where a server is terminating a large number of DSO
1541	   Sessions at once (e.g., if the system is restarting) and the server
1542	   doesn't want to be inundated with a flood of simultaneous retries, it
1543	   SHOULD send different reconnect delay values to each client.  These
1544	   adjustments MAY be selected randomly, pseudorandomly, or
1545	   deterministically (e.g., incrementing the time value by one tenth of
1546	   a second for each successive client, yielding a post-restart
1547	   reconnection rate of ten clients per second).
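
The deterministic option mentioned above (one tenth of a second per
successive client) might look like the following sketch.  The function
name and parameters are hypothetical, and delays are assumed to be in
milliseconds and capped at the 32-bit maximum.

```python
MAX_RETRY_DELAY_MS = 0xFFFFFFFF


def staggered_retry_delays(base_delay_ms: int, client_count: int,
                           step_ms: int = 100) -> list:
    """Give each successive client a delay step_ms longer than the last."""
    return [
        min(base_delay_ms + index * step_ms, MAX_RETRY_DELAY_MS)
        for index in range(client_count)
    ]
```

With step_ms at 100, at most ten clients become eligible to reconnect
in any one-second window.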

1549	7.6.1.2.  Client Reconnection

1551	   After a DSO Session is ended by the server (either by sending the
1552	   client a Retry Delay message, or by forcibly aborting the underlying
1553	   transport connection) the client SHOULD try to reconnect, to that
1554	   service instance, or to another suitable service instance, if more
1555	   than one is available.  If reconnecting to the same service instance,
1556	   the client MUST respect the indicated delay, if available, before
1557	   attempting to reconnect.  Clients should not attempt to randomize the
1558	   delay; the server will randomly jitter the retry delay values it
1559	   sends to each client if this behavior is desired.
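
As a sketch of the client's side of this rule, the earliest permitted
reconnect time to the same service instance can be computed without
adding any client-side jitter.  The function name is hypothetical, and
treating a missing delay (for example, after a forcible abort) as
"no delay indicated" is an assumption of this example.

```python
def earliest_reconnect_time(ended_at_s: float, retry_delay_ms=None) -> float:
    """Earliest time (in seconds) to reconnect to the same instance."""
    if retry_delay_ms is None:
        # No Retry Delay was received, so there is no delay to respect.
        return ended_at_s
    # Use the server's value as is; the client adds no randomization.
    return ended_at_s + retry_delay_ms / 1000.0
```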

1561	   If the service instance will only be out of service for a short
1562	   maintenance period, it should use a value a little longer than the
1563	   expected maintenance window.  It should not default to a very large
1564	   delay value, or clients may not attempt to reconnect after it resumes
1565	   service.

1567	   If a particular service instance does not want a client to reconnect
1568	   ever (perhaps the service instance is being de-commissioned), it
1569	   SHOULD set the retry delay to the maximum value 0xFFFFFFFF (2^32-1
1570	   milliseconds, approximately 49.7 days).  It is not possible to
1571	   instruct a client to stay away for longer than 49.7 days.  If, after
1572	   49.7 days, the DNS or other configuration information still indicates
1573	   that this is the valid service instance for a particular service,
1574	   then clients MAY attempt to reconnect.  In reality, if a client is
1575	   rebooted or otherwise loses state, it may well attempt to reconnect
1576	   before 49.7 days elapses, for as long as the DNS or other
1577	   configuration information continues to indicate that this is the
1578	   service instance the client should use.

1580	8.  Base TLVs for DNS Stateful Operations

1582	   This section describes the three base TLVs for DNS Stateful
1583	   Operations: Keepalive, Retry Delay, and Encryption Padding.

1585	8.1.  Keepalive TLV

1587	   The Keepalive TLV (DSO-TYPE=1) performs two functions: to reset the
1588	   keepalive timer for the DSO Session, and to establish the values for
1589	   the Session Timeouts.  The client will request the desired session
1590	   timeout values and the server will acknowledge with the response
1591	   values that it requires the client to use.

1593	   The DSO-DATA for the Keepalive TLV is as follows:

1595	                           1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
1596	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1597	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1598	      |                 INACTIVITY TIMEOUT (32 bits)                  |
1599	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1600	      |                 KEEPALIVE INTERVAL (32 bits)                  |
1601	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1603	   INACTIVITY TIMEOUT:  The inactivity timeout for the current DSO
1604	      Session, specified as a 32-bit unsigned integer, in network (big
1605	      endian) byte order, in units of milliseconds.  This is the timeout
1606	      at which the client MUST begin closing an inactive DSO Session.
1607	      The inactivity timeout can be any value of the server's choosing.
1608	      If the client does not gracefully close an inactive DSO Session,
1609	      then after twice this interval, or five seconds, whichever is
1610	      greater, the server will forcibly abort the connection.

1612	   KEEPALIVE INTERVAL:  The keepalive interval for the current DSO
1613	      Session, specified as a 32-bit unsigned integer, in network (big
1614	      endian) byte order, in units of milliseconds.  This is the
1615	      interval at which a client MUST generate DSO keepalive traffic to
1616	      maintain connection state.  The keepalive interval MUST NOT be
1617	      less than ten seconds.  If the client does not generate the
1618	      mandated DSO keepalive traffic, then after twice this interval the
1619	      server will forcibly abort the connection.  Since the minimum
1620	      allowed keepalive interval is ten seconds, the minimum time at
1621	      which a server will forcibly disconnect a client for failing to
1622	      generate the mandated DSO keepalive traffic is twenty seconds.
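
The DSO-DATA layout above is two 32-bit unsigned integers in network
byte order, so it can be encoded and decoded with Python's standard
struct module.  This sketch covers only the eight bytes of DSO-DATA,
not the enclosing TLV header or DNS message; the function names are
hypothetical.

```python
import struct

# "!II" means network (big endian) byte order, two unsigned 32-bit ints.
KEEPALIVE_DATA_FORMAT = "!II"


def pack_keepalive_data(inactivity_timeout_ms: int,
                        keepalive_interval_ms: int) -> bytes:
    """Encode INACTIVITY TIMEOUT followed by KEEPALIVE INTERVAL."""
    return struct.pack(KEEPALIVE_DATA_FORMAT,
                       inactivity_timeout_ms, keepalive_interval_ms)


def unpack_keepalive_data(data: bytes) -> tuple:
    """Decode DSO-DATA into (inactivity_timeout_ms, keepalive_interval_ms)."""
    if len(data) != struct.calcsize(KEEPALIVE_DATA_FORMAT):
        raise ValueError("Keepalive DSO-DATA must be exactly 8 bytes")
    return struct.unpack(KEEPALIVE_DATA_FORMAT, data)
```

For example, the default of 15 seconds (15000 milliseconds, 0x3A98)
for both fields encodes as the bytes 00 00 3A 98 00 00 3A 98.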

1624	   The transmission or reception of DSO Keepalive messages (i.e.,
1625	   messages where the Keepalive TLV is the first TLV) reset only the
1626	   keepalive timer, not the inactivity timer.  The reason for this is
1627	   that periodic DSO Keepalive messages are sent for the sole purpose of
1628	   keeping a DSO Session alive, when that DSO Session has current or
1629	   recent non-maintenance activity that warrants keeping that DSO
1630	   Session alive.  Sending DSO keepalive traffic itself is not
1631	   considered a client activity; it is considered a maintenance activity
1632	   that is performed in service of other client activities.  If DSO
1633	   keepalive traffic itself were to reset the inactivity timer, then
1634	   that would create a circular livelock where keepalive traffic would
1635	   be sent indefinitely to keep a DSO Session alive, where the only
1636	   activity on that DSO Session would be the keepalive traffic keeping
1637	   the DSO Session alive so that further keepalive traffic can be sent.
1638	   For a DSO Session to be considered active, it must be carrying
1639	   something more than just keepalive traffic.  This is why merely
1640	   sending or receiving a DSO Keepalive message does not reset the
1641	   inactivity timer.
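
   The separation between the two timers can be sketched as follows.
   This is a simplified model, not a complete implementation: the class
   and attribute names are hypothetical, times are plain millisecond
   counts supplied by the caller, and treating every non-Keepalive
   message as "activity" is an assumption made to keep the example
   short.

```python
class SessionTimers:
    """Records when each of the two session timers was last reset."""

    def __init__(self, now_ms: int) -> None:
        self.last_keepalive_reset_ms = now_ms
        self.last_activity_ms = now_ms

    def on_message(self, now_ms: int, is_keepalive: bool) -> None:
        # Sending or receiving any message resets the keepalive timer.
        self.last_keepalive_reset_ms = now_ms
        # A DSO Keepalive message is maintenance traffic, so it leaves
        # the inactivity timer untouched.
        if not is_keepalive:
            self.last_activity_ms = now_ms

    def inactive_for_ms(self, now_ms: int) -> int:
        """Time elapsed since the last non-maintenance activity."""
        return now_ms - self.last_activity_ms
```

   With this model, a session that exchanges only Keepalive messages
   keeps accumulating inactivity time and eventually reaches its
   inactivity timeout.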

1643	   When sent by a client, the DSO Keepalive request message MUST be sent
1644	   as an acknowledged request, with a nonzero MESSAGE ID.  If a server
1645	   receives a DSO Keepalive message with a zero MESSAGE ID then this is
1646	   a fatal error and the server MUST forcibly abort the connection
1647	   immediately.  The DSO Keepalive request message resets a DSO
1648	   Session's keepalive timer, and at the same time communicates to the
1649	   server the client's requested Session Timeout values.  In a
1650	   server response to a client-initiated DSO Keepalive request message,
1651	   the Session Timeouts contain the server's chosen values from this
1652	   point forward in the DSO Session, which the client MUST respect.
1653	   This is modeled after the DHCP protocol, where the client requests a
1654	   certain lease lifetime using DHCP option 51 [RFC2132], but the server
1655	   is the ultimate authority for deciding what lease lifetime is
1656	   actually granted.

1658	   When a client is sending its second and subsequent DSO Keepalive
1659	   requests to the server, the client SHOULD continue to request its
1660	   preferred values each time.  This allows flexibility, so that if
1661	   conditions change during the lifetime of a DSO Session, the server
1662	   can adapt its responses to better fit the client's needs.

1664	   Once a DSO Session is in progress (Section 6.1) a DSO Keepalive
1665	   message MAY be initiated by a server.  When sent by a server, the DSO
1666	   Keepalive message MUST be sent as an unacknowledged message, with the
1667	   MESSAGE ID set to zero.  The client MUST NOT generate a response to a
1668	   server-initiated DSO Keepalive message.  If a client receives a DSO
1669	   Keepalive request message with a nonzero MESSAGE ID then this is a
1670	   fatal error and the client MUST forcibly abort the connection
1671	   immediately.  The unacknowledged DSO Keepalive message from the
1672	   server resets a DSO Session's keepalive timer, and at the same time
1673	   unilaterally informs the client of the new Session Timeout values to
1674	   use from this point forward in this DSO Session.  No client DSO
1675	   response message to this unilateral declaration is required or
1676	   allowed.
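
   The MESSAGE ID rules for Keepalive messages in the two directions
   can be summarized in a small check.  The sketch below applies only
   to QR=0 messages whose first TLV is the Keepalive TLV; the function
   name and the "server"/"client" strings are hypothetical.

```python
def keepalive_is_fatal(receiver: str, message_id: int) -> bool:
    """Return True if a received QR=0 Keepalive message is a fatal error.

    receiver is "server" or "client", naming the side that received
    the message.
    """
    if receiver == "server":
        # Clients must send Keepalive as an acknowledged request.
        return message_id == 0
    if receiver == "client":
        # Servers must send Keepalive as an unacknowledged message.
        return message_id != 0
    raise ValueError("receiver must be 'server' or 'client'")
```

   When the function returns True, the receiver forcibly aborts the
   connection immediately.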

1678	   In DSO Keepalive response messages, the Keepalive TLV is REQUIRED and
1679	   is used only as a Response Primary TLV sent as a reply to a DSO
1680	   Keepalive request message from the client.  A Keepalive TLV MUST NOT
1681	   be added to other responses as a Response Additional TLV.  If the
1682	   server wishes to update a client's Session Timeout values other than
1683	   in response to a DSO Keepalive request message from the client, then
1684	   it does so by sending an unacknowledged DSO Keepalive message of its
1685	   own, as described above.

1687	   It is not required that the Keepalive TLV be used in every DSO
1688	   Session.  While many DNS Stateful operations will be used in
1689	   conjunction with a long-lived session state, not all DNS Stateful
1690	   operations require long-lived session state, and in some cases the
1691	   default 15-second value for both the inactivity timeout and keepalive
1692	   interval may be perfectly appropriate.  However, note that for
1693	   clients that implement only the DSO-TYPEs defined in this document, a
1694	   Keepalive request message is the only way for a client to initiate a
1695	   DSO Session.

1697	8.1.1.  Client handling of received Session Timeout values

1699	   When a client receives a response to its client-initiated DSO
1700	   Keepalive message, or receives a server-initiated DSO Keepalive
1701	   message, the client has then received Session Timeout values dictated
1702	   by the server.  The two timeout values contained in the Keepalive TLV
1703	   from the server may each be higher, lower, or the same as the
1704	   respective Session Timeout values the client previously had for this
1705	   DSO Session.

1707	   In the case of the keepalive timer, the handling of the received
1708	   value is straightforward.  When a client receives a server-initiated
1709	   message with the Keepalive TLV as its primary TLV, it resets the
1710	   keepalive timer.  Whenever it receives a Keepalive TLV from the
1711	   server, either in a server-initiated message or a reply to its own
1712	   client-initiated Keepalive message, it updates the keepalive interval
1713	   for the DSO Session.  The new keepalive interval indicates the
1714	   maximum time that may elapse before another message must be sent or
1715	   received on this DSO Session, if the DSO Session is to remain alive.
1716	   If the client receives a response to a keepalive message that
1717	   specifies a keepalive interval shorter than the current keepalive
1718	   timer, the client MUST immediately send a Keepalive message.
1719	   However, this should not normally happen in practice: it would
1720	   require that the keepalive interval chosen by the server be shorter
1721	   than the round-trip time of the connection.

1723	   In the case of the inactivity timeout, the handling of the received
1724	   value is a little more subtle, though the meaning of the inactivity
1725	   timeout remains as specified -- it still indicates the maximum
1726	   permissible time allowed without useful activity on a DSO Session.
1727	   The act of receiving the message containing the Keepalive TLV does
1728	   not itself reset the inactivity timer.  The time elapsed since the
1729	   last useful activity on this DSO Session is unaffected by exchange of
1730	   DSO Keepalive messages.  The new inactivity timeout value in the
1731	   Keepalive TLV in the received message does update the timeout
1732	   associated with the running inactivity timer; that becomes the new
1733	   maximum permissible time without activity on a DSO Session.

1735	   o  If the current inactivity timer value is less than the new
1736	      inactivity timeout, then the DSO Session may remain open for now.
1737	      When the inactivity timer value reaches the new inactivity
1738	      timeout, the client MUST then begin closing the DSO Session, as
1739	      described above.

1741	   o  If the current inactivity timer value is equal to the new
1742	      inactivity timeout, then this DSO Session has been inactive for
1743	      exactly as long as the server will permit, and now the client MUST
1744	      immediately begin closing this DSO Session.

1746	   o  If the current inactivity timer value is already greater than the
1747	      new inactivity timeout, then this DSO Session has already been
1748	      inactive for longer than the server permits, and the client MUST
1749	      immediately begin closing this DSO Session.

1751	   o  If the current inactivity timer value is already more than twice
1752	      the new inactivity timeout, then the client is immediately
1753	      considered delinquent (this DSO Session is immediately eligible to
1754	      be forcibly terminated by the server) and the client MUST
1755	      immediately begin closing this DSO Session.  However, if a server
1756	      abruptly reduces the inactivity timeout in this way, then, to give
1757	      the client time to close the connection gracefully before the
1758	      server resorts to forcibly aborting it, the server SHOULD give the
1759	      client an additional grace period of one quarter of the new
1760	      inactivity timeout, or five seconds, whichever is greater.
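
   The four cases above reduce to two comparisons, sketched below
   under the assumption that the client tracks the time elapsed since
   the last useful activity in milliseconds.  The function names and
   return strings are hypothetical.

```python
def client_action(inactive_ms: int, new_timeout_ms: int) -> str:
    """Decide what the client does after receiving a new inactivity timeout."""
    if inactive_ms < new_timeout_ms:
        # May stay open until inactive_ms reaches new_timeout_ms.
        return "keep-open"
    # Equal to, greater than, or more than twice the new timeout:
    # in every such case the client begins closing immediately.
    return "close-now"


def client_is_delinquent(inactive_ms: int, new_timeout_ms: int) -> bool:
    """True if the session is already eligible for forcible termination."""
    return inactive_ms > 2 * new_timeout_ms


def server_grace_ms(new_timeout_ms: int) -> int:
    """Grace period the server SHOULD allow after an abrupt reduction."""
    return max(new_timeout_ms // 4, 5_000)
```

   For example, if a session has been inactive for 30 seconds and the
   server lowers the inactivity timeout to 8 seconds, the client is
   delinquent and closes at once, and the server would wait a further
   5 seconds before aborting.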

1762	8.1.2.  Relation to edns-tcp-keepalive EDNS0 Option

1764	   The inactivity timeout value in the Keepalive TLV (DSO-TYPE=1) has
1765	   similar intent to the edns-tcp-keepalive EDNS0 Option [RFC7828].  A
1766	   client/server pair that supports DSO MUST NOT use the edns-tcp-
1767	   keepalive EDNS0 Option within any message after a DSO Session has
1768	   been established.  A client that has sent a DSO message to establish
1769	   a session MUST NOT send an edns-tcp-keepalive EDNS0 Option from this
1770	   point on.  Once a DSO Session has been established, if either client
1771	   or server receives a DNS message over the DSO Session that contains
1772	   an edns-tcp-keepalive EDNS0 Option, this is a fatal error and the
1773	   receiver of the edns-tcp-keepalive EDNS0 Option MUST forcibly abort
1774	   the connection immediately.

1776	8.2.  Retry Delay TLV

1778	   The Retry Delay TLV (DSO-TYPE=2) can be used as a Primary TLV
1779	   (unacknowledged) in a server-to-client message, or as a Response
1780	   Additional TLV in either direction.

1782	   The DSO-DATA for the Retry Delay TLV is as follows:

1784	                           1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
1785	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1786	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1787	      |                     RETRY DELAY (32 bits)                     |
1788	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1790	   RETRY DELAY:  A time value, specified as a 32-bit unsigned integer,
1791	      in network (big endian) byte order, in units of milliseconds,
1792	      within which the initiator MUST NOT retry this operation, or retry
1793	      connecting to this server.  Recommendations for the RETRY DELAY
1794	      value are given in Section 7.6.1.
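
   A minimal encoding sketch for this TLV follows.  It assumes the
   general TLV layout of a 16-bit DSO-TYPE followed by a 16-bit
   DSO-LENGTH, both in network byte order, ahead of the DSO-DATA; the
   function names are hypothetical.

```python
import struct

DSO_TYPE_RETRY_DELAY = 2


def encode_retry_delay_tlv(delay_ms: int) -> bytes:
    """Build a complete Retry Delay TLV (header plus 4 bytes of data)."""
    data = struct.pack("!I", delay_ms)  # 32-bit unsigned, big endian
    header = struct.pack("!HH", DSO_TYPE_RETRY_DELAY, len(data))
    return header + data


def decode_retry_delay_data(data: bytes) -> int:
    """Return the RETRY DELAY in milliseconds from the DSO-DATA."""
    if len(data) != 4:
        raise ValueError("Retry Delay DSO-DATA must be exactly 4 bytes")
    (delay_ms,) = struct.unpack("!I", data)
    return delay_ms
```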

1796	8.2.1.  Retry Delay TLV used as a Primary TLV

1798	   When sent from server to client, the Retry Delay TLV is used as the
1799	   Primary TLV in an unacknowledged message.  It is used by a server to
1800	   instruct a client to close the DSO Session and underlying connection,
1801	   and not to reconnect for the indicated time interval.

1803	   In this case it applies to the DSO Session as a whole, and the client
1804	   MUST begin closing the DSO Session, as described in Section 7.6.1.
1805	   The RCODE in the message header SHOULD indicate the principal reason
1806	   for the termination:

1808	   o  NOERROR indicates a routine shutdown or restart.

1810	   o  FORMERR indicates that the client requests are too badly malformed
1811	      for the session to continue.

1813	   o  SERVFAIL indicates that the server is overloaded due to resource
1814	      exhaustion and needs to shed load.

1816	   o  REFUSED indicates that the server has been reconfigured, and at
1817	      this time it is now unable to perform one or more of the long-
1818	      lived client operations that were previously being performed on
1819	      this DSO Session.

1821	   o  NOTAUTH indicates that the server has been reconfigured and at
1822	      this time it is now unable to perform one or more of the long-
1823	      lived client operations that were previously being performed on
1824	      this DSO Session because it does not have authority over the names
1825	      in question (for example, a DNS Push Notification server could be
1826	      reconfigured such that it is no longer accepting DNS Push
1827	      Notification requests for one or more of the currently subscribed
1828	      names).

1830	   This document specifies only these RCODE values for Retry Delay
1831	   messages.  Servers sending Retry Delay messages SHOULD use one of
1832	   these values.  However, future circumstances may create situations
1833	   where other RCODE values are appropriate in Retry Delay messages, so
1834	   clients MUST be prepared to accept Retry Delay messages with any
1835	   RCODE value.

1837	   In some cases, when a server sends a Retry Delay message to a client,
1838	   there may be more than one reason for the server wanting to end the
1839	   session.  Possibly the configuration could have been changed such
1840	   that some long-lived client operations can no longer be continued due
1841	   to policy (REFUSED), and other long-lived client operations can no
1842	   longer be performed due to the server no longer being authoritative
1843	   for those names (NOTAUTH).  In such cases the server MAY use any of
1844	   the applicable RCODE values, or RCODE=NOERROR (routine shutdown or
1845	   restart).

1847	   Note that the selection of RCODE value in a Retry Delay message is
1848	   not critical, since the RCODE value is generally used only for
1849	   information purposes, such as writing to a log file for future human
1850	   analysis regarding the nature of the disconnection.  Generally
1851	   clients do not modify their behavior depending on the RCODE value.
1852	   The RETRY DELAY in the message tells the client how long it should
1853	   wait before attempting a new connection to this service instance.

1855	   For clients that do in some way modify their behavior depending on
1856	   the RCODE value, they should treat unknown RCODE values the same as
1857	   RCODE=NOERROR (routine shutdown or restart).
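
   A client that logs the reason for a Retry Delay message might map
   RCODE values as sketched below.  The numeric values are the
   standard DNS RCODE assignments; the description strings and the
   function name are illustrative only.  Unknown values fall back to
   the NOERROR interpretation, as recommended above.

```python
RETRY_DELAY_REASONS = {
    0: "NOERROR: routine shutdown or restart",
    1: "FORMERR: client requests too badly malformed to continue",
    2: "SERVFAIL: server overloaded and shedding load",
    5: "REFUSED: server reconfigured, operation no longer permitted",
    9: "NOTAUTH: server no longer authoritative for the names",
}


def retry_delay_reason(rcode: int) -> str:
    """Return a log-friendly reason; unknown RCODEs are treated as NOERROR."""
    return RETRY_DELAY_REASONS.get(rcode, RETRY_DELAY_REASONS[0])
```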

1859	   A Retry Delay message from server to client is an unacknowledged
1860	   message; the MESSAGE ID MUST be set to zero in the outgoing message
1861	   and the client MUST NOT send a response.

1863	   A client MUST NOT send a Retry Delay DSO message to a server.  If a
1864	   server receives a DSO message where the Primary TLV is the Retry
1865	   Delay TLV, this is a fatal error and the server MUST forcibly abort
1866	   the connection immediately.

1868	8.2.2.  Retry Delay TLV used as a Response Additional TLV

1870	   In the case of a request that returns a nonzero RCODE value, the
1871	   responder MAY append a Retry Delay TLV to the response, indicating
1872	   the time interval during which the initiator SHOULD NOT attempt this
1873	   operation again.

1875	   The indicated time interval during which the initiator SHOULD NOT
1876	   retry applies only to the failed operation, not to the DSO Session as
1877	   a whole.

1879	8.3.  Encryption Padding TLV

1881	   The Encryption Padding TLV (DSO-TYPE=3) can only be used as an
1882	   Additional or Response Additional TLV.  It is only applicable when
1883	   the DSO Transport layer uses encryption such as TLS.

1885	   The DSO-DATA for the Padding TLV is optional and is a variable
1886	   length field containing non-specified values.  A DSO-LENGTH of 0
1887	   essentially provides for 4 bytes of padding (the minimum amount).

1889	                                                1   1   1   1   1   1
1890	        0   1   2   3   4   5   6   7   8   9   0   1   2   3   4   5
1891	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
1892	      /                                                               /
1893	      /                   VARIABLE NUMBER OF BYTES                    /
1894	      /                                                               /
1895	      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

1897	   As specified for the EDNS(0) Padding Option [RFC7830] the PADDING
1898	   bytes SHOULD be set to 0x00.  Other values MAY be used, for example,
1899	   in cases where there is a concern that the padded message could be
1900	   subject to compression before encryption.  PADDING bytes of any value
1901	   MUST be accepted in the messages received.

1903	   The Encryption Padding TLV may be included in either a DSO request,
1904	   response, or both.  As specified for the EDNS(0) Padding Option
1905	   [RFC7830] if a request is received with an Encryption Padding TLV,
1906	   then the response MUST also include an Encryption Padding TLV.

1908	   The length of padding is intentionally not specified in this document
1909	   and is a function of current best practices with respect to the type
1910	   and length of data in the preceding TLVs
1911	   [I-D.ietf-dprive-padding-policy].
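
   Purely as an illustration, the sketch below builds an Encryption
   Padding TLV and uses it to round a message up to a multiple of a
   block size.  The block-based policy, the function names, and the
   assumed 4-byte TLV header (16-bit DSO-TYPE, 16-bit DSO-LENGTH) are
   assumptions of this example; this document does not prescribe a
   padding length.

```python
import struct

DSO_TYPE_ENCRYPTION_PADDING = 3
TLV_HEADER_LEN = 4  # 16-bit DSO-TYPE plus 16-bit DSO-LENGTH


def padding_tlv(total_len: int) -> bytes:
    """Build a padding TLV that occupies total_len bytes in the message."""
    if total_len < TLV_HEADER_LEN:
        raise ValueError("a padding TLV occupies at least 4 bytes")
    data_len = total_len - TLV_HEADER_LEN
    if data_len > 0xFFFF:
        raise ValueError("DSO-LENGTH does not fit in 16 bits")
    header = struct.pack("!HH", DSO_TYPE_ENCRYPTION_PADDING, data_len)
    return header + b"\x00" * data_len  # PADDING bytes SHOULD be 0x00


def pad_to_block(message: bytes, block: int) -> bytes:
    """Append one padding TLV so the result is a multiple of block bytes."""
    extra = -(len(message) + TLV_HEADER_LEN) % block
    return message + padding_tlv(TLV_HEADER_LEN + extra)
```

   Note that pad_to_block always appends a TLV, so a message that is
   already block-aligned grows by one full block.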

1913	9.  Summary Highlights

1915	   This section summarizes some noteworthy highlights about various
1916	   components of the DSO protocol.

1918	9.1.  QR bit and MESSAGE ID

1920	   In DSO Request Messages the QR bit is 0 and the MESSAGE ID is
1921	   nonzero.

1923	   In DSO Response Messages the QR bit is 1 and the MESSAGE ID is
1924	   nonzero.

1926	   In DSO Unacknowledged Messages the QR bit is 0 and the MESSAGE ID is
1927	   zero.

1929	   The table below illustrates which combinations are legal and how they
1930	   are interpreted:

1932	               +--------------------------+------------------------+
1933	               |     MESSAGE ID zero      |   MESSAGE ID nonzero   |
1934	      +--------+--------------------------+------------------------+
1935	      |  QR=0  |  Unacknowledged Message  |    Request Message     |
1936	      +--------+--------------------------+------------------------+
1937	      |  QR=1  |  Invalid - Fatal Error   |    Response Message    |
1938	      +--------+--------------------------+------------------------+
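
   The table can be expressed directly as a classification function.
   The function name and the returned strings below are hypothetical.

```python
def classify_dso_message(qr: int, message_id: int) -> str:
    """Classify a DSO message from its QR bit and MESSAGE ID."""
    if qr not in (0, 1):
        raise ValueError("QR is a single bit")
    if qr == 0:
        return "request" if message_id != 0 else "unacknowledged"
    if message_id != 0:
        return "response"
    # QR=1 with a zero MESSAGE ID is invalid and is a fatal error.
    raise ValueError("QR=1 with zero MESSAGE ID is a fatal error")
```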

1940	9.2.  TLV Usage

1942	   The table below indicates, for each of the three TLVs defined in this
1943	   document, whether they are valid in each of ten different contexts.

1945	   The first five contexts are requests or unacknowledged messages from
1946	   client to server, and the corresponding responses from server back to
1947	   client:

1949	   o  C-P - Primary TLV, sent in DSO Request message, from client to
1950	      server, with nonzero MESSAGE ID indicating that this request MUST
1951	      generate a response message.

1953	   o  C-U - Primary TLV, sent in DSO Unacknowledged message, from client
1954	      to server, with zero MESSAGE ID indicating that this request MUST
1955	      NOT generate a response message.

1957	   o  C-A - Additional TLV, optionally added to request message or
1958	      unacknowledged message from client to server.

1960	   o  CRP - Response Primary TLV, included in response message sent back
1961	      to the client (in response to a client "C-P" request with nonzero
1962	      MESSAGE ID indicating that a response is required) where the DSO-
1963	      TYPE of the Response TLV matches the DSO-TYPE of the Primary TLV
1964	      in the request.

1966	   o  CRA - Response Additional TLV, included in response message sent
1967	      back to the client (in response to a client "C-P" request with
1968	      nonzero MESSAGE ID indicating that a response is required) where
1969	      the DSO-TYPE of the Response TLV does not match the DSO-TYPE of
1970	      the Primary TLV in the request.

1972	   The second five contexts are their counterparts in the opposite
1973	   direction: requests or unacknowledged messages from server to client,
1974	   and the corresponding responses from client back to server.

1976	   o  S-P - Primary TLV, sent in DSO Request message, from server to
1977	      client, with nonzero MESSAGE ID indicating that this request MUST
1978	      generate a response message.

1980	   o  S-U - Primary TLV, sent in DSO Unacknowledged message, from server
1981	      to client, with zero MESSAGE ID indicating that this request MUST
1982	      NOT generate a response message.

1984	   o  S-A - Additional TLV, optionally added to request message or
1985	      unacknowledged message from server to client.

1987	   o  SRP - Response Primary TLV, included in response message sent back
1988	      to the server (in response to a server "S-P" request with nonzero
1989	      MESSAGE ID indicating that a response is required) where the DSO-
1990	      TYPE of the Response TLV matches the DSO-TYPE of the Primary TLV
1991	      in the request.

1993	   o  SRA - Response Additional TLV, included in response message sent
1994	      back to the server (in response to a server "S-P" request with
1995	      nonzero MESSAGE ID indicating that a response is required) where
1996	      the DSO-TYPE of the Response TLV does not match the DSO-TYPE of
1997	      the Primary TLV in the request.

1999	                +-------------------------+-------------------------+
2000	                | C-P  C-U  C-A  CRP  CRA | S-P  S-U  S-A  SRP  SRA |
2001	   +------------+-------------------------+-------------------------+
2002	   | KeepAlive  |  X              X       |       X                 |
2003	   +------------+-------------------------+-------------------------+
2004	   | RetryDelay |                      X  |       X                 |
2005	   +------------+-------------------------+-------------------------+
2006	   | Padding    |            X         X  |            X         X  |
2007	   +------------+-------------------------+-------------------------+

2009	   Note that some of the columns in this table are currently empty.  The
2010	   table provides a template for future TLV definitions to follow.  It
2011	   is recommended that definitions of future TLVs include a similar
2012	   table summarizing the contexts where the new TLV is valid.
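
   An implementation might transcribe the table above into a lookup
   structure for validating received TLVs.  The sketch below copies
   the marked cells exactly as they appear in the table; the
   dictionary and function names are hypothetical.

```python
VALID_CONTEXTS = {
    "KeepAlive": {"C-P", "CRP", "S-U"},
    "RetryDelay": {"CRA", "S-U"},
    "Padding": {"C-A", "CRA", "S-A", "SRA"},
}


def tlv_allowed(tlv_name: str, context: str) -> bool:
    """Return True if the table marks tlv_name as valid in context."""
    return context in VALID_CONTEXTS[tlv_name]
```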

2014	10.  IANA Considerations

2016	10.1.  DSO OPCODE Registration

2018	   The IANA is requested to record the value ([TBA1] tentatively) 6 for
2019	   the DSO OPCODE in the DNS OPCODE Registry.  DSO stands for DNS
2020	   Stateful Operations.

2022	10.2.  DSO RCODE Registration

2024	   The IANA is requested to record the value ([TBA2] tentatively) 11 for
2025	   the DSOTYPENI error code in the DNS RCODE Registry.  The DSOTYPENI
2026	   error code ("DSO-TYPE Not Implemented") indicates that the receiver
2027	   does implement DNS Stateful Operations, but does not implement the
2028	   specific DSO-TYPE of the primary TLV in the DSO request message.

2030	10.3.  DSO Type Code Registry

2032	   The IANA is requested to create the 16-bit DSO Type Code Registry,
2033	   with initial (hexadecimal) values as shown below:

2035	   +-----------+--------------------------------+----------+-----------+
2036	   | Type      | Name                           | Status   | Reference |
2037	   +-----------+--------------------------------+----------+-----------+
2038	   | 0000      | Reserved                       | Standard | RFC-TBD   |
2039	   |           |                                |          |           |
2040	   | 0001      | KeepAlive                      | Standard | RFC-TBD   |
2041	   |           |                                |          |           |
2042	   | 0002      | RetryDelay                     | Standard | RFC-TBD   |
2043	   |           |                                |          |           |
2044	   | 0003      | EncryptionPadding              | Standard | RFC-TBD   |
2045	   |           |                                |          |           |
2046	   | 0004-003F | Unassigned, reserved for       |          |           |
2047	   |           | DSO session-management TLVs    |          |           |
2048	   |           |                                |          |           |
2049	   | 0040-F7FF | Unassigned                     |          |           |
2050	   |           |                                |          |           |
2051	   | F800-FBFF | Reserved for                   |          |           |
2052	   |           | experimental/local use         |          |           |
2053	   |           |                                |          |           |
2054	   | FC00-FFFF | Reserved for future expansion  |          |           |
2055	   +-----------+--------------------------------+----------+-----------+

2057	   DSO Type Code zero is reserved and is not currently intended for
2058	   allocation.

2060	   Registrations of new DSO Type Codes in the "Reserved for DSO session-
2061	   management" range 0004-003F and the "Reserved for future expansion"
2062	   range FC00-FFFF require publication of an IETF Standards Action
2063	   document [RFC8126].

2065	   Requests to register additional new DSO Type Codes in the
2066	   "Unassigned" range 0040-F7FF are to be recorded by IANA after Expert
2067	   Review [RFC8126].  The expert review should validate that the
2068	   requested type code is specified in a way that conforms to this
2069	   specification, and that the intended use for the code would not be
2070	   addressed with an experimental/local assignment.

2072	   DSO Type Codes in the "experimental/local" range F800-FBFF may be
2073	   used as Experimental Use or Private Use values [RFC8126] and may be
2074	   used freely for development purposes, or for other purposes within a
2075	   single site.  No attempt is made to prevent multiple sites from using
2076	   the same value in different (and incompatible) ways.  There is no
2077	   need for IANA to review such assignments (since IANA does not record
2078	   them) and assignments are not generally useful for broad
2079	   interoperability.  It is the responsibility of the sites making use
2080	   of "experimental/local" values to ensure that no conflicts occur
2081	   within the intended scope of use.
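
   The registry ranges above can be checked with a simple
   classification function, sketched here with hypothetical names and
   labels.  The "assigned" label covers the three type codes defined
   in this document.

```python
def dso_type_range(code: int) -> str:
    """Return the registry range that a 16-bit DSO Type Code falls in."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("DSO Type Code must fit in 16 bits")
    if code == 0x0000:
        return "reserved"
    if code <= 0x0003:
        return "assigned"            # KeepAlive, RetryDelay, EncryptionPadding
    if code <= 0x003F:
        return "session-management"  # requires Standards Action
    if code <= 0xF7FF:
        return "unassigned"          # registered after Expert Review
    if code <= 0xFBFF:
        return "experimental/local"  # not recorded by IANA
    return "future-expansion"        # requires Standards Action
```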

2083	11.  Security Considerations

2085	   If this mechanism is to be used with DNS over TLS, then these
2086	   messages are subject to the same constraints as any other DNS-over-
2087	   TLS messages and MUST NOT be sent in the clear before the TLS session
2088	   is established.

2090	   The data field of the "Encryption Padding" TLV could be used as a
2091	   covert channel.

2093	   When designing new DSO TLVs, the potential for data in the TLV to be
2094	   used as a tracking identifier should be taken into consideration, and
2095	   should be avoided when not required.

2097	   When used without TLS or similar cryptographic protection, a
2098	   malicious entity may be able to inject a malicious Retry Delay
2099	   Unacknowledged Message into the data stream, specifying an
2100	   unreasonably large RETRY DELAY, causing a denial-of-service attack
2101	   against the client.

2103	   The establishment of DSO sessions has an increasing impact on the
2104	   number of open TCP connections on a DNS server.  Additional resources
2105	   may be used on the server as a result.  However, because the server
2106	   can limit the number of DSO sessions established and can also close
2107	   existing DSO sessions as needed, denial of service or resource
2108	   exhaustion should not be a concern.

2110	11.1.  TCP Fast Open Considerations

2112	   It would be possible to add a TLV that requires the server to do some
2113	   significant work, and send that to the server as initial data in a
2114	   TCP SYN packet.  A flood of such packets could be used as a DoS
2115	   attack on the server.  None of the TLVs defined here have this
2116	   property.  If a new TLV is specified that does have this property,
2117	   the specification should require that some kind of exchange be done
2118	   with the server before work is done.  That is, the TLV that requires
2119	   work could not be processed without a round-trip from the server to
2120	   the client to verify that the source address of the packet is
2121	   reachable.

2123	   One way to accomplish this would be to have the client send a TLV
2124	   indicating that it wishes to have the server do work of this sort;
2125	   this TLV would not actually result in work being done, but would
2126	   request a nonce from the server.  The client could then use that
2127	   nonce to request that work be done.

2129	   Alternatively, the server could simply disable TCP fast open.  This
2130	   same problem would exist for DNS-over-TLS with TLS early data; the
2131	   same remedies would apply.

2133	12.  Acknowledgements

2135	   Thanks to Stephane Bortzmeyer, Tim Chown, Ralph Droms, Paul Hoffman,
2136	   Jan Komissar, Edward Lewis, Allison Mankin, Rui Paulo, David
2137	   Schinazi, Manju Shankar Rao, and Bernie Volz for their helpful
2138	   contributions to this document.

2140	13.  References

2142	13.1.  Normative References

2144	   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
2145	              STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987,
2146	              <https://www.rfc-editor.org/info/rfc1034>.

2148	   [RFC1035]  Mockapetris, P., "Domain names - implementation and
2149	              specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
2150	              November 1987, <https://www.rfc-editor.org/info/rfc1035>.

2152	   [RFC1918]  Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G.,
2153	              and E. Lear, "Address Allocation for Private Internets",
2154	              BCP 5, RFC 1918, DOI 10.17487/RFC1918, February 1996,
2155	              <https://www.rfc-editor.org/info/rfc1918>.

2157	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
2158	              Requirement Levels", BCP 14, RFC 2119,
2159	              DOI 10.17487/RFC2119, March 1997, <https://www.rfc-
2160	              editor.org/info/rfc2119>.

2162	   [RFC2136]  Vixie, P., Ed., Thomson, S., Rekhter, Y., and J. Bound,
2163	              "Dynamic Updates in the Domain Name System (DNS UPDATE)",
2164	              RFC 2136, DOI 10.17487/RFC2136, April 1997,
2165	              <https://www.rfc-editor.org/info/rfc2136>.

2167	   [RFC6891]  Damas, J., Graff, M., and P. Vixie, "Extension Mechanisms
2168	              for DNS (EDNS(0))", STD 75, RFC 6891,
2169	              DOI 10.17487/RFC6891, April 2013, <https://www.rfc-
2170	              editor.org/info/rfc6891>.

2172	   [RFC7766]  Dickinson, J., Dickinson, S., Bellis, R., Mankin, A., and
2173	              D. Wessels, "DNS Transport over TCP - Implementation
2174	              Requirements", RFC 7766, DOI 10.17487/RFC7766, March 2016,
2175	              <https://www.rfc-editor.org/info/rfc7766>.

2177	   [RFC7830]  Mayrhofer, A., "The EDNS(0) Padding Option", RFC 7830,
2178	              DOI 10.17487/RFC7830, May 2016, <https://www.rfc-
2179	              editor.org/info/rfc7830>.

2181	   [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
2182	              Writing an IANA Considerations Section in RFCs", BCP 26,
2183	              RFC 8126, DOI 10.17487/RFC8126, June 2017,
2184	              <https://www.rfc-editor.org/info/rfc8126>.

2186	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2187	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
2188	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

2190	13.2.  Informative References

2192	   [I-D.ietf-dnsop-no-response-issue]
2193	              Andrews, M. and R. Bellis, "A Common Operational Problem
2194	              in DNS Servers - Failure To Respond.", draft-ietf-dnsop-
2195	              no-response-issue-11 (work in progress), July 2018.

2197	   [I-D.ietf-dnssd-mdns-relay]
2198	              Lemon, T. and S. Cheshire, "Multicast DNS Discovery
2199	              Relay", draft-ietf-dnssd-mdns-relay-01 (work in progress),
2200	              July 2018.

2202	   [I-D.ietf-dnssd-push]
2203	              Pusateri, T. and S. Cheshire, "DNS Push Notifications",
2204	              draft-ietf-dnssd-push-14 (work in progress), March 2018.

2206	   [I-D.ietf-doh-dns-over-https]
2207	              Hoffman, P. and P. McManus, "DNS Queries over HTTPS
2208	              (DoH)", draft-ietf-doh-dns-over-https-12 (work in
2209	              progress), June 2018.

2211	   [I-D.ietf-dprive-padding-policy]
2212	              Mayrhofer, A., "Padding Policy for EDNS(0)", draft-ietf-
2213	              dprive-padding-policy-06 (work in progress), July 2018.

2215	   [I-D.ietf-tls-tls13]
2216	              Rescorla, E., "The Transport Layer Security (TLS) Protocol
2217	              Version 1.3", draft-ietf-tls-tls13-28 (work in progress),
2218	              March 2018.

2220	   [NagleDA]  Cheshire, S., "TCP Performance problems caused by
2221	              interaction between Nagle's Algorithm and Delayed ACK",
2222	              May 2005,
2223	              <http://www.stuartcheshire.org/papers/nagledelayedack/>.

2225	   [RFC2132]  Alexander, S. and R. Droms, "DHCP Options and BOOTP Vendor
2226	              Extensions", RFC 2132, DOI 10.17487/RFC2132, March 1997,
2227	              <https://www.rfc-editor.org/info/rfc2132>.

2229	   [RFC2782]  Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
2230	              specifying the location of services (DNS SRV)", RFC 2782,
2231	              DOI 10.17487/RFC2782, February 2000, <https://www.rfc-
2232	              editor.org/info/rfc2782>.

2234	   [RFC5382]  Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P.
2235	              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
2236	              RFC 5382, DOI 10.17487/RFC5382, October 2008,
2237	              <https://www.rfc-editor.org/info/rfc5382>.

2239	   [RFC6762]  Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762,
2240	              DOI 10.17487/RFC6762, February 2013, <https://www.rfc-
2241	              editor.org/info/rfc6762>.

2243	   [RFC6763]  Cheshire, S. and M. Krochmal, "DNS-Based Service
2244	              Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013,
2245	              <https://www.rfc-editor.org/info/rfc6763>.

2247	   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
2248	              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
2249	              <https://www.rfc-editor.org/info/rfc7413>.

2251	   [RFC7828]  Wouters, P., Abley, J., Dickinson, S., and R. Bellis, "The
2252	              edns-tcp-keepalive EDNS0 Option", RFC 7828,
2253	              DOI 10.17487/RFC7828, April 2016, <https://www.rfc-
2254	              editor.org/info/rfc7828>.

2256	   [RFC7857]  Penno, R., Perreault, S., Boucadair, M., Ed., Sivakumar,
2257	              S., and K. Naito, "Updates to Network Address Translation
2258	              (NAT) Behavioral Requirements", BCP 127, RFC 7857,
2259	              DOI 10.17487/RFC7857, April 2016, <https://www.rfc-
2260	              editor.org/info/rfc7857>.

2262	   [RFC7858]  Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D.,
2263	              and P. Hoffman, "Specification for DNS over Transport
2264	              Layer Security (TLS)", RFC 7858, DOI 10.17487/RFC7858, May
2265	              2016, <https://www.rfc-editor.org/info/rfc7858>.

2267	Authors' Addresses

2269	   Ray Bellis
2270	   Internet Systems Consortium, Inc.
2271	   950 Charter Street
2272	   Redwood City  CA 94063
2273	   USA

2275	   Phone: +1 (650) 423-1200
2276	   Email: ray@isc.org

2277	   Stuart Cheshire
2278	   Apple Inc.
2279	   One Apple Park Way
2280	   Cupertino  CA 95014
2281	   USA

2283	   Phone: +1 (408) 996-1010
2284	   Email: cheshire@apple.com

2286	   John Dickinson
2287	   Sinodun Internet Technologies
2288	   Magdalen Centre
2289	   Oxford Science Park
2290	   Oxford  OX4 4GA
2291	   United Kingdom

2293	   Email: jad@sinodun.com

2295	   Sara Dickinson
2296	   Sinodun Internet Technologies
2297	   Magdalen Centre
2298	   Oxford Science Park
2299	   Oxford  OX4 4GA
2300	   United Kingdom

2302	   Email: sara@sinodun.com

2304	   Ted Lemon
2305	   Nibbhaya Consulting
2306	   P.O. Box 958
2307	   Brattleboro  VT 05302-0958
2308	   USA

2310	   Email: mellon@fugue.com

2312	   Tom Pusateri
2313	   Unaffiliated
2314	   Raleigh  NC 27608
2315	   USA

2317	   Phone: +1 (919) 867-1330
2318	   Email: pusateri@bangj.com