idnits 2.17.1 

draft-ietf-tcpm-1323bis-16.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The abstract seems to indicate that this document obsoletes RFC1323, but
     the header doesn't have an 'Obsoletes:' line to match this.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 12, 2013) is 3815 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'Ekstroem04' is defined on line 1294, but no explicit
     reference was found in the text

  == Unused Reference: 'Hamming77' is defined on line 1312, but no explicit
     reference was found in the text

  == Unused Reference: 'Jain86' is defined on line 1336, but no explicit
     reference was found in the text

  == Unused Reference: 'Mathis08' is defined on line 1366, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC0896' is defined on line 1391, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC1110' is defined on line 1397, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2581' is defined on line 1415, but no explicit
     reference was found in the text

  == Unused Reference: 'Watson81' is defined on line 1459, but no explicit
     reference was found in the text

  == Unused Reference: 'Zhang86' is defined on line 1464, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC  793 (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC  896
     (Obsoleted by RFC 7805)

  -- Obsolete informational reference (is this intentional?): RFC 1072
     (Obsoleted by RFC 1323, RFC 2018, RFC 6247)

  -- Obsolete informational reference (is this intentional?): RFC 1110
     (Obsoleted by RFC 6247)

  -- Obsolete informational reference (is this intentional?): RFC 1185
     (Obsoleted by RFC 1323)

  -- Obsolete informational reference (is this intentional?): RFC 1323
     (Obsoleted by RFC 7323)

  -- Obsolete informational reference (is this intentional?): RFC 1981
     (Obsoleted by RFC 8201)

  -- Obsolete informational reference (is this intentional?): RFC 2581
     (Obsoleted by RFC 5681)

  -- Obsolete informational reference (is this intentional?): RFC 6528
     (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 6691
     (Obsoleted by RFC 9293)


     Summary: 1 error (**), 0 flaws (~~), 10 warnings (==), 12 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	TCP Maintenance (TCPM)                                         D. Borman
3	Internet-Draft                                       Quantum Corporation
4	Intended status: Standards Track                               B. Braden
5	Expires: May 16, 2014                             University of Southern
6	                                                              California
7	                                                             V. Jacobson
8	                                                            Google, Inc.
9	                                                   R. Scheffenegger, Ed.
10	                                                            NetApp, Inc.
11	                                                       November 12, 2013

13	                  TCP Extensions for High Performance
14	                       draft-ietf-tcpm-1323bis-16

16	Abstract

18	   This document specifies a set of TCP extensions to improve
19	   performance over paths with a large bandwidth * delay product and to
20	   provide reliable operation over very high-speed paths.  It defines
21	   TCP options for scaled windows and timestamps.  The timestamps can be
22	   used for two distinct mechanisms, PAWS (Protection Against Wrapped
23	   Sequences) and RTTM (Round Trip Time Measurement).

25	   This document obsoletes RFC 1323 and describes changes from it.

27	Status of this Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts.  The list of current Internet-
35	   Drafts is at http://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on May 16, 2014.

44	Copyright Notice

46	   Copyright (c) 2013 IETF Trust and the persons identified as the
47	   document authors.  All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents
51	   (http://trustee.ietf.org/license-info) in effect on the date of
52	   publication of this document.  Please review these documents
53	   carefully, as they describe your rights and restrictions with respect
54	   to this document.  Code Components extracted from this document must
55	   include Simplified BSD License text as described in Section 4.e of
56	   the Trust Legal Provisions and are provided without warranty as
57	   described in the Simplified BSD License.

59	Table of Contents

61	   1.  Introduction
62	     1.1.  TCP Performance
63	     1.2.  TCP Reliability
64	     1.3.  Using TCP options
65	     1.4.  Terminology
66	   2.  TCP Window Scale Option
67	     2.1.  Introduction
68	     2.2.  Window Scale Option
69	     2.3.  Using the Window Scale Option
70	     2.4.  Addressing Window Retraction
71	   3.  TCP Timestamps option
72	     3.1.  Introduction
73	     3.2.  Timestamps option
74	   4.  The RTTM Mechanism
75	     4.1.  Introduction
76	     4.2.  Updating the RTO value
77	     4.3.  Which Timestamp to Echo
78	   5.  PAWS - Protection Against Wrapped Sequence Numbers
79	     5.1.  Introduction
80	     5.2.  The PAWS Mechanism
81	     5.3.  Basic PAWS Algorithm
82	     5.4.  Timestamp Clock
83	     5.5.  Outdated Timestamps
84	     5.6.  Header Prediction
85	     5.7.  IP Fragmentation
86	     5.8.  Duplicates from Earlier Incarnations of Connection
87	   6.  Conclusions and Acknowledgements
88	   7.  Security Considerations
89	     7.1.  Privacy Considerations
90	   8.  IANA Considerations
91	   9.  References
92	     9.1.  Normative References
93	     9.2.  Informative References
94	   Appendix A.  Implementation Suggestions
95	   Appendix B.  Duplicates from Earlier Connection Incarnations
96	     B.1.  System Crash with Loss of State
97	     B.2.  Closing and Reopening a Connection
98	   Appendix C.  Summary of Notation
99	   Appendix D.  Event Processing Summary
100	   Appendix E.  Timestamps Edge Cases
101	   Appendix F.  Window Retraction Example
102	   Appendix G.  RTO calculation modification
103	   Appendix H.  Changes from RFC 1323
104	   Authors' Addresses

106	1.  Introduction

108	   The TCP protocol [RFC0793] was designed to operate reliably over
109	   almost any transmission medium regardless of transmission rate,
110	   delay, corruption, duplication, or reordering of segments.  Over the
111	   years, advances in networking technology have resulted in ever-higher
112	   transmission speeds, and the fastest paths are well beyond the domain
113	   for which TCP was originally engineered.

115	   This document defines a set of modest extensions to TCP to extend the
116	   domain of its application to match the increasing network capability.
117	   It is an update to and obsoletes [RFC1323], which in turn is based
118	   upon and obsoletes [RFC1072] and [RFC1185].

120	   Changes between [RFC1323] and this document are detailed in
121	   Appendix H.  These changes are partly due to errata in [RFC1323], and
122	   partly due to the improved understanding of how the involved
123	   components interact.

125	   For brevity, the full discussions of the merits and history behind
126	   the TCP options defined within this document have been omitted.
127	   [RFC1323] should be consulted for reference.  It is recommended that
128	   a modern TCP stack implement and make use of the extensions
129	   described in this document.

131	1.1.  TCP Performance

133	   TCP performance problems arise when the bandwidth * delay product is
134	   large.  A network having such paths is referred to as a "long, fat
135	   network" (LFN).

137	   There are two fundamental performance problems with basic TCP over
138	   LFN paths:

140	   (1)  Window Size Limit

142	        The TCP header uses a 16 bit field to report the receive window
143	        size to the sender.  Therefore, the largest window that can be
144	        used is 2^16 = 64 KiB.  For LFN paths where the bandwidth *
145	        delay product exceeds 64 KiB, the receive window limits the
146	        maximum throughput of the TCP connection over the path, i.e.,
147	        the amount of unacknowledged data that TCP can send in order to
148	        keep the pipeline full.

150	        To circumvent this problem, Section 2 of this memo defines a TCP
151	        option, "Window Scale", to allow windows larger than 2^16.  This
152	        option defines an implicit scale factor, which is used to
153	        multiply the window size value found in a TCP header to obtain
154	        the true window size.

156	        It must be noted that the use of large receive windows
157	        increases the chance of too quickly wrapping sequence numbers,
158	        as described below in Section 1.2, (1).

160	   (2)  Recovery from Losses

162	        Packet losses in an LFN can have a catastrophic effect on
163	        throughput.

165	        To generalize the Fast Retransmit / Fast Recovery mechanism to
166	        handle multiple packets dropped per window, Selective
167	        Acknowledgments are required.  Unlike the normal cumulative
168	        acknowledgments of TCP, Selective Acknowledgments give the
169	        sender a complete picture of which segments are queued at the
170	        receiver and which have not yet arrived.

172	        Selective acknowledgements and their use are specified in
173	        separate documents, "TCP Selective Acknowledgment Options"
174	        [RFC2018], "An Extension to the Selective Acknowledgement (SACK)
175	        Option for TCP" [RFC2883], and "A Conservative Selective
176	        Acknowledgment (SACK)-based Loss Recovery Algorithm for TCP"
177	        [RFC6675], and not further discussed in this document.
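
   The window size limit in item (1) above can be made concrete with a
   short calculation.  The following is a minimal sketch, not part of
   the specification: it assumes that at most one receive window of
   data can be in flight per round trip, and the 100 ms round-trip time
   and 1 Gbit/s bandwidth are hypothetical values chosen only for
   illustration.

```python
UNSCALED_MAX_WINDOW = 2**16 - 1  # largest value of the 16-bit window field


def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput with one window in flight per RTT."""
    return window_bytes * 8 / rtt_seconds


def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth * delay product of a path, in bytes."""
    return bandwidth_bps * rtt_seconds / 8


# Hypothetical LFN path: 1 Gbit/s with a 100 ms round-trip time.
limit = max_throughput_bps(UNSCALED_MAX_WINDOW, 0.100)
needed = bdp_bytes(1e9, 0.100)

print(f"throughput ceiling without scaling: {limit / 1e6:.2f} Mbit/s")
print(f"window needed to fill the path:     {needed / 2**20:.1f} MiB")
```

   On such a path the unscaled window caps throughput at a few Mbit/s,
   far below the path capacity, which is the motivation for the Window
   Scale option of Section 2.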

179	1.2.  TCP Reliability

181	   An especially serious kind of error may result from an accidental
182	   reuse of TCP sequence numbers in data segments.  TCP reliability
183	   depends upon the existence of a bound on the lifetime of a segment:
184	   the "Maximum Segment Lifetime" or MSL.

186	   Duplication of sequence numbers might happen in either of two ways:

188	   (1)  Sequence number wrap-around on the current connection

190	        A TCP sequence number contains 32 bits.  At a high enough
191	        transfer rate of large volumes of data (at least 4 GiB in the
192	        same session), the 32-bit sequence space may be "wrapped"
193	        (cycled) within the time that a segment is delayed in queues.

195	   (2)  Earlier incarnation of the connection

197	        Suppose that a connection terminates, either by a proper close
198	        sequence or due to a host crash, and the same connection (i.e.,
199	        using the same pair of port numbers) is immediately reopened.  A
200	        delayed segment from the terminated connection could fall within
201	        the current window for the new incarnation and be accepted as
202	        valid.

204	   Duplicates from earlier incarnations, case (2), are avoided by
205	   enforcing the current fixed MSL of the TCP specification, as
206	   explained in Section 5.8 and Appendix B.  In addition, the randomizing
207	   of ephemeral ports can also help to probabilistically reduce the
208	   chances of duplicates from earlier connections.  However, case (1),
209	   avoiding the reuse of sequence numbers within the same connection,
210	   requires an upper bound on MSL that depends upon the transfer rate,
211	   and at high enough rates, a dedicated mechanism is required.
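
   The dependence on transfer rate in case (1) can be illustrated with
   a rough calculation.  This is a sketch under stated assumptions: it
   computes the time to cycle the full 32-bit sequence space at a
   sustained rate, the example rates are hypothetical, and the MSL
   value of 2 minutes is the one given in [RFC0793] rather than
   anything defined in this section.

```python
SEQ_SPACE_BYTES = 2**32   # a TCP sequence number contains 32 bits
MSL_SECONDS = 120         # MSL of 2 minutes, as given in RFC 793


def wrap_time_seconds(rate_bps: float) -> float:
    """Time to send 2^32 bytes at a sustained rate in bits per second."""
    return SEQ_SPACE_BYTES * 8 / rate_bps


# Hypothetical sustained transfer rates.
for name, rate in [("10 Mbit/s", 10e6), ("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]:
    t = wrap_time_seconds(rate)
    verdict = "shorter" if t < MSL_SECONDS else "longer"
    print(f"{name:>10}: wraps in {t:8.1f} s ({verdict} than the MSL)")
```

   At 1 Gbit/s the sequence space cycles in roughly 34 seconds, well
   inside the MSL, so a delayed segment could be mistaken for new data
   unless a dedicated mechanism is in place.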

213	   A possible fix for the problem of cycling the sequence space would be
214	   to increase the size of the TCP sequence number field.  For example,
215	   the sequence number field (and also the acknowledgment field) could
216	   be expanded to 64 bits.  This could be done either by changing the
217	   TCP header or by means of an additional option.

219	   Section 5 presents a different mechanism, which we call PAWS
220	   (Protection Against Wrapped Sequence numbers), to extend TCP
221	   reliability to transfer rates well beyond the foreseeable upper limit
222	   of network bandwidths.  PAWS uses the TCP Timestamps option defined
223	   in Section 3.2 to protect against old duplicates from the same
224	   connection.

226	1.3.  Using TCP options

228	   The extensions defined in this document all use TCP options.

230	   When [RFC1323] was published, there was concern that some buggy TCP
231	   implementation might crash on the first appearance of an option on a
232	   non-<SYN> segment.  However, bugs like that can lead to DoS attacks
233	   against a TCP.  Research has shown that most TCP implementations will
234	   properly handle unknown options on non-<SYN> segments ([Medina04],
235	   [Medina05]).  But it is still prudent to be conservative in what you
236	   send, and avoiding buggy TCP implementations is not the only reason
237	   for negotiating TCP options on <SYN> segments.

239	   The window scale option negotiates fundamental parameters of the TCP
240	   session.  Therefore, it is only sent during the initial handshake.
241	   Furthermore, the window scale option will be sent in a <SYN,ACK>
242	   segment only if the corresponding option was received in the initial
243	   <SYN> segment.

245	   The Timestamps option may appear in any data or <ACK> segment, adding
246	   10 bytes (up to 12 bytes including padding) to the 20-byte TCP
247	   header.  It is required that this TCP option will be sent on all non-
248	   <SYN> segments after an exchange of options on the <SYN> segments has
249	   indicated that both sides understand this extension.

251	   Research has shown that the use of the Timestamps option to take
252	   additional RTT samples within each RTT has little effect on the
253	   ultimate retransmission timeout value [Allman99].  However, there are
254	   other uses of the Timestamps option, such as the Eifel mechanism
255	   [RFC3522], [RFC4015], and PAWS (see Section 5) which improve overall
256	   TCP security and performance.  The extra header bandwidth used by
257	   this option should be evaluated for the gains in performance and
258	   security in an actual deployment.

260	   Appendix A contains a recommended layout of the options in TCP
261	   headers to achieve reasonable data field alignment.

263	   Finally, we observe that most of the mechanisms defined in this
264	   document are important for LFN's and/or very high-speed networks.
265	   For low-speed networks, it might be a performance optimization to NOT
266	   use these mechanisms.  A TCP vendor concerned about optimal
267	   performance over low-speed paths might consider turning these
268	   extensions off for low-speed paths, or allow a user or installation
269	   manager to disable them.

271	1.4.  Terminology

273	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
274	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
275	   document are to be interpreted as described in [RFC2119].

277	   In this document, these words will appear with that interpretation
278	   only when in UPPER CASE.  Lower case uses of these words are not to
279	   be interpreted as carrying [RFC2119] significance.

281	2.  TCP Window Scale Option

283	2.1.  Introduction

285	   The window scale extension expands the definition of the TCP window
286	   to 30 bits and then uses an implicit scale factor to carry this 30-
287	   bit value in the 16-bit Window field of the TCP header (SEG.WND in
288	   [RFC0793]).  The exponent of the scale factor is carried in a TCP
289	   option, Window Scale.  This option is sent only in a <SYN> segment (a
290	   segment with the SYN bit on), hence the window scale is fixed in each
291	   direction when a connection is opened.

293	   The maximum receive window, and therefore the scale factor, is
294	   determined by the maximum receive buffer space.  In a typical modern
295	   implementation, this maximum buffer space is set by default but can
296	   be overridden by a user program before a TCP connection is opened.
297	   This determines the scale factor, and therefore no new user interface
298	   is needed for window scaling.

300	2.2.  Window Scale Option

302	   The three-byte Window Scale option MAY be sent in a <SYN> segment by
303	   a TCP.  It has two purposes: (1) indicate that the TCP is prepared to
304	   both send and receive window scaling, and (2) communicate the
305	   exponent of a scale factor to be applied to its receive window.
306	   Thus, a TCP that is prepared to scale windows SHOULD send the option,
307	   even if its own scale factor is 1 and the exponent 0.  The scale
308	   factor is limited to a power of two and encoded logarithmically, so
309	   it may be implemented by binary shift operations.  The maximum scale
310	   exponent is limited to 14 for a maximum permissible receive window
311	   size of 1 GiB (2^(14+16)).

313	   TCP Window Scale Option (WSopt):

315	   Kind: 3

317	   Length: 3 bytes

319	          +---------+---------+---------+
320	          | Kind=3  |Length=3 |shift.cnt|
321	          +---------+---------+---------+
322	               1         1         1

324	   This option is an offer, not a promise; both sides MUST send Window
325	   Scale options in their <SYN> segments to enable window scaling in
326	   either direction.  If window scaling is enabled, then the TCP that
327	   sent this option will right-shift its true receive-window values by
328	   'shift.cnt' bits for transmission in SEG.WND.  The value 'shift.cnt'
329	   MAY be zero (offering to scale, while applying a scale factor of 1 to
330	   the receive window).

332	   This option MAY be sent in an initial <SYN> segment (i.e., a segment
333	   with the SYN bit on and the ACK bit off).  It MAY also be sent in a
334	   <SYN,ACK> segment, but only if a Window Scale option was received in
335	   the initial <SYN> segment.  A Window Scale option in a segment
336	   without a SYN bit MUST be ignored.

338	   The window field in a segment where the SYN bit is set (i.e., a <SYN>
339	   or <SYN,ACK>) MUST NOT be scaled.
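
   The three-byte layout shown above can be encoded and decoded as
   follows.  This is an illustrative sketch only; the function names
   build_wsopt and parse_wsopt are hypothetical, and rejecting exponents
   above 14 when building the option is a choice made for this example.

```python
import struct

WSOPT_KIND = 3
WSOPT_LEN = 3
MAX_SHIFT_CNT = 14


def build_wsopt(shift_cnt: int) -> bytes:
    """Encode a Window Scale option: Kind=3, Length=3, shift.cnt."""
    if not 0 <= shift_cnt <= MAX_SHIFT_CNT:
        raise ValueError("shift.cnt must be in the range 0..14")
    return struct.pack("!BBB", WSOPT_KIND, WSOPT_LEN, shift_cnt)


def parse_wsopt(option: bytes) -> int:
    """Decode a three-byte Window Scale option and return shift.cnt."""
    kind, length, shift_cnt = struct.unpack("!BBB", option)
    if kind != WSOPT_KIND or length != WSOPT_LEN:
        raise ValueError("not a Window Scale option")
    return shift_cnt


# A shift.cnt of zero is a valid offer: scale factor 1, exponent 0.
print(build_wsopt(0).hex(), build_wsopt(7).hex())
```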

341	2.3.  Using the Window Scale Option

343	   A model implementation of window scaling is as follows, using the
344	   notation of [RFC0793]:

346	   o  The connection state MUST be augmented by two window shift
347	      counters, Snd.Wind.Shift and Rcv.Wind.Shift, to be applied to the
348	      incoming and outgoing window fields, respectively.

350	   o  If a TCP receives a <SYN> segment containing a Window Scale
351	      option, it SHOULD send its own Window Scale option in the
352	      <SYN,ACK> segment.

354	   o  The Window Scale option MUST be sent with shift.cnt = R, where R
355	      is the value that the TCP would like to use for its receive
356	      window.

358	   o  Upon receiving a <SYN> segment with a Window Scale option
359	      containing shift.cnt = S, a TCP MUST set Snd.Wind.Shift to S and
360	      MUST set Rcv.Wind.Shift to R; otherwise, it MUST set both
361	      Snd.Wind.Shift and Rcv.Wind.Shift to zero.

363	   o  The window field (SEG.WND) in the header of every incoming
364	      segment, with the exception of <SYN> segments, MUST be left-
365	      shifted by Snd.Wind.Shift bits before updating SND.WND:

367	                    SND.WND = SEG.WND << Snd.Wind.Shift

369	      (assuming the other conditions of [RFC0793] are met, and using the
370	      "C" notation "<<" for left-shift).

372	   o  The window field (SEG.WND) of every outgoing segment, with the
373	      exception of <SYN> segments, MUST be right-shifted by
374	      Rcv.Wind.Shift bits:

376	                    SEG.WND = RCV.WND >> Rcv.Wind.Shift
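
   The model implementation above can be sketched as follows.  This is
   a minimal illustration, not a complete TCP: the class and function
   names are hypothetical, and only the two shift counters and the two
   shift operations from the list above are modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WindowScaleState:
    snd_wind_shift: int = 0  # Snd.Wind.Shift, applied to incoming SEG.WND
    rcv_wind_shift: int = 0  # Rcv.Wind.Shift, applied to outgoing SEG.WND


def on_syn_received(peer_shift: Optional[int], our_shift: int) -> WindowScaleState:
    """Set both counters from the <SYN> exchange.

    peer_shift is S from the peer's Window Scale option, or None if the
    peer sent no option; our_shift is R, the value we offered.
    """
    if peer_shift is None:
        return WindowScaleState(0, 0)
    return WindowScaleState(snd_wind_shift=peer_shift, rcv_wind_shift=our_shift)


def incoming_window(seg_wnd: int, state: WindowScaleState) -> int:
    """SND.WND = SEG.WND << Snd.Wind.Shift (non-<SYN> segments only)."""
    return seg_wnd << state.snd_wind_shift


def outgoing_window(rcv_wnd: int, state: WindowScaleState) -> int:
    """SEG.WND = RCV.WND >> Rcv.Wind.Shift (non-<SYN> segments only)."""
    return rcv_wnd >> state.rcv_wind_shift


state = on_syn_received(peer_shift=7, our_shift=3)
print(incoming_window(1000, state), outgoing_window(4096, state))
```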

378	   TCP determines if a data segment is "old" or "new" by testing whether
379	   its sequence number is within 2^31 bytes of the left edge of the
380	   window, and if it is not, discarding the data as "old".  To ensure
381	   that new data is never mistakenly considered old and vice versa, the
382	   left edge of the sender's window has to be at most 2^31 away from the
383	   right edge of the receiver's window.  Similarly with the sender's
384	   right edge and receiver's left edge.  Since the right and left edges
385	   of either the sender's or receiver's window differ by the window
386	   size, and since the sender and receiver windows can be out of phase
387	   by at most the window size, the above constraints imply that two
388	   times the maximum window size must be less than 2^31, or

390	                             max window < 2^30

392	   Since the max window is 2^S (where S is the scaling shift count)
393	   times at most 2^16 - 1 (the maximum unscaled window), the maximum
394	   window is guaranteed to be < 2^30 if S <= 14.  Thus, the shift count
395	   MUST be limited to 14 (which allows windows of 2^30 = 1 GiB).  If a
396	   Window Scale option is received with a shift.cnt value larger than
397	   14, the TCP SHOULD log the error but MUST use 14 instead of the
398	   specified value.  This is safe as a sender can always choose to only
399	   partially use any signaled receive window.  If the receiver is
400	   scaling by a factor larger than 14 and the sender is only scaling by
401	   14 then the receive window used by the sender will appear smaller
402	   than it is in reality.
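
   The bound on the shift count can be checked numerically, and the
   handling of an oversized shift.cnt can be sketched as below.  The
   function name effective_shift is hypothetical, and the use of the
   Python logging module stands in for whatever error logging an
   implementation provides.

```python
import logging

MAX_SHIFT = 14
MAX_UNSCALED_WINDOW = 2**16 - 1


def effective_shift(received_shift_cnt: int) -> int:
    """Return the shift to use for a received shift.cnt value."""
    if received_shift_cnt > MAX_SHIFT:
        # SHOULD log the error, MUST use 14 instead of the received value.
        logging.warning("shift.cnt %d exceeds 14, using 14", received_shift_cnt)
        return MAX_SHIFT
    return received_shift_cnt


# Largest window for S = 14 stays below 2^30; S = 15 would not.
largest_ok = MAX_UNSCALED_WINDOW << MAX_SHIFT
largest_bad = MAX_UNSCALED_WINDOW << (MAX_SHIFT + 1)
print(largest_ok < 2**30, largest_bad < 2**30)
```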

404	   The scale factor applies only to the Window field as transmitted in
405	   the TCP header; each TCP using extended windows will maintain the
406	   window values locally as 32-bit numbers.  For example, the
407	   "congestion window" computed by Slow Start and Congestion Avoidance
408	   (see [RFC5681]) is not affected by the scale factor, so window
409	   scaling will not introduce quantization into the congestion window.

411	2.4.  Addressing Window Retraction

413	   When a non-zero scale factor is in use, there are instances when a
414	   retracted window can be offered - see Appendix F for a detailed
415	   example.  The end of the window will be on a boundary based on the
416	   granularity of the scale factor being used.  If the sequence number
417	   is then updated by a number of bytes smaller than that granularity,
418	   the TCP will have to either advertise a new window that is beyond
419	   what it previously advertised (and perhaps beyond the buffer), or
420	   will have to advertise a smaller window, which will cause the TCP
421	   window to shrink.  Implementations MUST ensure that they handle a
422	   shrinking window, as specified in section 4.2.2.16 of [RFC1122].

424	   For the receiver, this implies that:

426	   1)  The receiver MUST honor, as in-window, any segment that would
427	       have been in-window for any <ACK> sent by the receiver.

429	   2)  When window scaling is in effect, the receiver SHOULD track the
430	       actual maximum window sequence number (which is likely to be
431	       greater than the window announced by the most recent <ACK>, if
432	       more than one segment has arrived since the application consumed
433	       any data in the receive buffer).

435	   On the sender side:

437	   3)  The initial transmission MUST be within the window announced by
438	       the most recent <ACK>.

440	   4)  On first retransmission, or if the sequence number is out-of-
441	       window by less than 2^Rcv.Wind.Shift then do normal
442	       retransmission(s) without regard to receiver window as long as
443	       the original segment was in window when it was sent.

445	   5)  Subsequent retransmissions MAY only be sent if they are within
446	       the window announced by the most recent <ACK>.
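
   How a scaled window can appear to retract may be seen in a small
   numeric sketch.  The values are hypothetical and deliberately tiny
   (a shift of 2, so a granularity of 4 bytes, and a receive buffer
   ending at sequence number 10), and the example assumes the receiver
   simply right-shifts its true window, which rounds down.

```python
def advertised_right_edge(rcv_nxt: int, rcv_wnd: int, shift: int) -> int:
    """Right window edge as the sender reconstructs it from SEG.WND."""
    seg_wnd = rcv_wnd >> shift            # value carried in the header
    return rcv_nxt + (seg_wnd << shift)   # sender left-shifts it again


SHIFT = 2        # hypothetical scale exponent: granularity of 4 bytes
BUFFER_END = 10  # hypothetical: buffer covers sequence numbers 0..9

edges = []
# Data arrives in pieces smaller than the granularity while the
# application has not yet consumed anything from the buffer.
for rcv_nxt in (0, 1, 3):
    rcv_wnd = BUFFER_END - rcv_nxt
    edges.append(advertised_right_edge(rcv_nxt, rcv_wnd, SHIFT))

print(edges)  # [8, 9, 7]
```

   The advertised right edge moves from 9 back to 7 even though the
   buffer itself did not shrink, which is why the receiver is asked to
   honor segments that were in-window for any earlier <ACK>.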

448	3.  TCP Timestamps option

450	3.1.  Introduction

452	   The Timestamps option is introduced to address some of the issues
453	   mentioned in Section 1.1 and Section 1.2.  The Timestamps option is
454	   specified in a symmetrical manner, so that TSval timestamps are
455	   carried in both data and <ACK> segments and are echoed in TSecr
456	   fields carried in returning <ACK> or data segments.  Originally used
457	   primarily for timestamping individual segments, the properties of the
458	   Timestamps option allow not only the use for taking time measurements
459	   (Section 4), but additional uses as well (Section 5).

461	   It is necessary to remember that there is a distinction between the
462	   Timestamps option conveying timestamp information, and the use of
463	   that information.  In particular, the Round Trip Time Measurement
464	   (RTTM) mechanism must be viewed independently from updating the
465	   Retransmission Timeout (RTO) (see Section 4.2).  In this case, the
466	   sample granularity also needs to be taken into account.  Other
467	   mechanisms, such as PAWS, or Eifel, are not built upon the timestamp
468	   information itself, but are based on the intrinsic property of
469	   monotonically increasing values.

471	   The Timestamps option is important when large receive windows are
472	   used, to allow the use of the PAWS mechanism (see Section 5).
473	   Furthermore, the option may be useful for all TCP's, since it
474	   simplifies the sender and allows the use of additional optimizations
475	   such as Eifel ([RFC3522], [RFC4015]) and others ([RFC6817],
476	   [Kuzmanovic03], [Kuehlewind10]).

478	3.2.  Timestamps option

480	   TCP is a symmetric protocol, allowing data to be sent at any time in
481	   either direction, and therefore timestamp echoing may occur in either
482	   direction.  For simplicity and symmetry, we specify that timestamps
483	   always be sent and echoed in both directions.  For efficiency, we
484	   combine the timestamp and timestamp reply fields into a single TCP
485	   Timestamps option.

487	   TCP Timestamps option (TSopt):

489	   Kind: 8

491	   Length: 10 bytes

493	          +-------+-------+---------------------+---------------------+
494	          |Kind=8 |  10   |   TS Value (TSval)  |TS Echo Reply (TSecr)|
495	          +-------+-------+---------------------+---------------------+
496	              1       1              4                     4

498	   The Timestamps option carries two four-byte timestamp fields.  The
499	   Timestamp Value field (TSval) contains the current value of the
500	   timestamp clock of the TCP sending the option.

502	   The Timestamp Echo Reply field (TSecr) is valid if the ACK bit is set
503	   in the TCP header; if it is valid, it echoes a timestamp value that
504	   was sent by the remote TCP in the TSval field of a Timestamps option.
505	   When TSecr is not valid, its value MUST be zero.  However, a value of
506	   zero does not imply TSecr being invalid.  The TSecr value will
507	   generally be from the most recent Timestamps option that was
508	   received; however, there are exceptions that are explained below.
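
   The ten-byte layout shown above can be encoded and decoded as in the
   following sketch.  The function names are hypothetical; the example
   assumes network byte order for the two four-byte fields and masks
   the values to 32 bits.

```python
import struct

TSOPT_KIND = 8
TSOPT_LEN = 10


def build_tsopt(tsval: int, tsecr: int = 0) -> bytes:
    """Encode a Timestamps option; TSecr defaults to zero (not valid)."""
    return struct.pack(
        "!BBII", TSOPT_KIND, TSOPT_LEN, tsval & 0xFFFFFFFF, tsecr & 0xFFFFFFFF
    )


def parse_tsopt(option: bytes) -> tuple:
    """Decode a ten-byte Timestamps option and return (TSval, TSecr)."""
    kind, length, tsval, tsecr = struct.unpack("!BBII", option)
    if kind != TSOPT_KIND or length != TSOPT_LEN:
        raise ValueError("not a Timestamps option")
    return tsval, tsecr


# Initial <SYN>: the ACK bit is off, so TSecr is not valid and is zero.
syn_option = build_tsopt(tsval=0x01020304)
print(syn_option.hex(), parse_tsopt(syn_option))
```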

510	   A TCP MAY send the Timestamps option (TSopt) in an initial <SYN>
511	   segment (i.e., segment containing a SYN bit and no ACK bit), and MAY
512	   send a TSopt in <SYN,ACK> only if it received a TSopt in the initial
513	   <SYN> segment for the connection.

515	   Once TSopt has been successfully negotiated, that is, both <SYN> and
516	   <SYN,ACK> contain TSopt, the TSopt MUST be sent in every non-<RST>
517	   segment for the duration of the connection, and SHOULD be sent in an
518	   <RST> segment (see Section 5.2 for details).  The TCP SHOULD remember
519	   this state by setting a flag, referred to as Snd.TS.OK, to one.  If a
520	   non-<RST> segment is received without a TSopt, a TCP SHOULD silently
521	   drop the segment.  A TCP MUST NOT abort a TCP connection because any
522	   segment lacks an expected TSopt.

524	   Implementations are strongly encouraged to follow the above rules for
525	   handling a missing Timestamps option, and the order of precedence
526	   mentioned in Section 5.3 when deciding on the acceptance of a
527	   segment.

529	   If a receiver chooses to accept a segment without an expected
530	   Timestamps option, it must be clear that undetectable data corruption
531	   may occur.

533	   Such a TCP receiver may experience undetectable wrapped-sequence
534	   effects, such as data (payload) corruption or session stalls.  In
535	   order to maintain the integrity of the payload data, in particular on
536	   high speed networks, it is paramount to follow the described
537	   processing rules.

539	   However, it has been mentioned that under some circumstances, the
540	   above guidelines are too strict, and some paths sporadically suppress
541	   the Timestamps option, while maintaining payload integrity.  A path
542	   behaving in this manner should be deemed unacceptable, but it has
543	   been noted that some implementations relax the acceptance rules as a
544	   workaround, and allow TCP to run across such paths [Oppermann13].

546	   If a TSopt is received on a connection where TSopt was not negotiated
547	   in the initial three-way handshake, the TSopt MUST be ignored and the
548	   packet processed normally.

550	   In the case of crossing <SYN> segments where one <SYN> contains a
551	   TSopt and the other doesn't, both sides MAY send a TSopt in the
552	   <SYN,ACK> segment.

554	   TSopt is required for the two mechanisms described in sections 4 and
555	   5.  There are also other mechanisms that rely on the presence of the
556	   TSopt, e.g., [RFC3522].  If a TCP stops sending TSopt at any time
557	   during an established session, it interferes with these mechanisms.
558	   This update to [RFC1323] describes explicitly the previous assumption
559	   (see Section 5.2), that each TCP segment must have TSopt, once
560	   negotiated.

562	4.  The RTTM Mechanism

564	4.1.  Introduction

566	   One use of the Timestamps option is to measure the round trip time of
567	   virtually every packet acknowledged.  The Round Trip Time Measurement
568	   (RTTM) mechanism requires a Timestamps option in every measured
569	   segment, with a TSval that is obtained from a (virtual) "timestamp
570	   clock".  Values of this clock MUST be at least approximately
571	   proportional to real time, in order to measure actual RTT.

573	   TCP measures the round trip time (RTT), primarily for the purpose of
574	   arriving at a reasonable value for the Retransmission Timeout (RTO)
575	   timer interval.  Accurate and current RTT estimates are necessary to
576	   adapt to changing traffic conditions, while a conservative estimate
577	   of the RTO interval is necessary to minimize spurious RTOs.

579	   These TSval values are echoed in TSecr values in the reverse
580	   direction.  The difference between a received TSecr value and the
581	   current timestamp clock value provides an RTT measurement.

583	   When timestamps are used, every segment that is received will contain
584	   a TSecr value.  However, these values cannot all be used to update
585	   the measured RTT.  The following example illustrates why.  It shows a
586	   one-way data flow with segments arriving in sequence without loss.
587	   Here A, B, C... represent data blocks occupying successive blocks of
588	   sequence numbers, and ACK(A),... represent the corresponding
589	   cumulative acknowledgments.  The two timestamp fields of the
590	   Timestamps option are shown symbolically as <TSval=x,TSecr=y>.  Each
591	   TSecr field contains the value most recently received in a TSval
592	   field.

594	              TCP  A                                     TCP B

596	                              <A,TSval=1,TSecr=120> ----->

598	                   <---- <ACK(A),TSval=127,TSecr=1>

600	                              <B,TSval=5,TSecr=127> ----->

602	                   <---- <ACK(B),TSval=131,TSecr=5>

604	                . . . . . . . . . . . . . . . . . . . . . .

606	                              <C,TSval=65,TSecr=131> ---->

608	                   <---- <ACK(C),TSval=191,TSecr=65>
609	                                  (etc.)

611	   The dotted line marks a pause (60 time units long) in which A had
612	   nothing to send.  Note that this pause inflates the RTT which B could
613	   infer from receiving TSecr=131 in data segment C. Thus, in one-way
614	   data flows, RTTM in the reverse direction measures a value that is
615	   inflated by gaps in sending data.  However, the following rule
616	   prevents a resulting inflation of the measured RTT:

618	   RTTM Rule: A TSecr value received in a segment MAY be used to update
619	              the averaged RTT measurement only if the segment advances
620	              the left edge of the send window, i.e., SND.UNA is
621	              increased.

623	   Since TCP B is not sending data, the data segment C does not
624	   acknowledge any new data when it arrives at B. Thus, the inflated
625	   RTTM measurement is not used to update B's RTTM measurement.
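
   The RTTM Rule can be sketched as below.  This is an illustration
   under simplifying assumptions: the names rtt_sample and seq_gt are
   hypothetical, the result is in timestamp clock ticks, and the usual
   check that SEG.ACK does not exceed SND.NXT is omitted.

```python
MOD32 = 1 << 32


def seq_gt(a: int, b: int) -> bool:
    """True if 32-bit sequence number a is after b (modular compare)."""
    return 0 < ((a - b) % MOD32) < (1 << 31)


def rtt_sample(seg_ack: int, seg_tsecr: int, snd_una: int, ts_now: int):
    """Return an RTT sample in timestamp ticks, or None.

    The sample is taken only if the segment advances the left edge of
    the send window (SEG.ACK is beyond SND.UNA), per the RTTM Rule.
    """
    if not seq_gt(seg_ack, snd_una):
        return None
    return (ts_now - seg_tsecr) % MOD32
```

   In the example above, segment C carries TSecr=131 but acknowledges
   no new data at B, so the function returns None and the inflated
   value is never used.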

627	4.2.  Updating the RTO value

629	   When [RFC1323] was originally written, it was perceived that taking
630	   RTT measurements for each segment, and also during retransmissions,
631	   would contribute to reducing spurious RTOs, while maintaining the
632	   timeliness of necessary RTOs.  At the time, RTO was also the only
633	   mechanism to make use of the measured RTT.  It has been shown that
634	   taking more RTT samples has only a very limited effect on optimizing
635	   RTOs [Allman99].

637	   Implementers should note that with timestamps multiple RTTMs can be
638	   taken per RTT.  The [RFC6298] RTO estimator has weighting factors,
639	   alpha and beta, based on an implicit assumption that at most one RTTM
640	   will be sampled per RTT.  When multiple RTTMs per RTT are available
641	   to update the RTO estimator, an implementation SHOULD try to adhere
642	   to the spirit of the history specified in [RFC6298].  An
643	   implementation suggestion is detailed in Appendix G.

645	   [Ludwig00] and [Floyd05] have highlighted the problem that an
646	   unmodified RTO calculation, which is updated with per-packet RTT
647	   samples, will truncate the path history too soon.  This can lead to
648	   an increase in spurious retransmissions, when the path properties
649	   vary in the order of a few RTTs, but a high number of RTT samples are
650	   taken on a much shorter timescale.
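
   One possible way to keep the path history when several samples
   arrive per RTT is to scale the [RFC6298] gains by the number of
   samples expected in one RTT.  The sketch below assumes roughly one
   sample per two full-sized segments (delayed ACKs); the names
   expected_samples and update_rtt, and this particular scaling, are
   assumptions made for illustration rather than text from this
   section.

```python
import math

ALPHA = 1 / 8   # SRTT gain from RFC 6298
BETA = 1 / 4    # RTTVAR gain from RFC 6298


def expected_samples(flight_size: int, smss: int) -> int:
    """Assumed number of RTT samples per RTT (one per two segments)."""
    return max(1, math.ceil(flight_size / (smss * 2)))


def update_rtt(srtt, rttvar, r, flight_size, smss):
    """Apply one RTT sample r using gains scaled by expected samples.

    Returns the new (srtt, rttvar).  With one expected sample this
    reduces to the unmodified RFC 6298 update.
    """
    n = expected_samples(flight_size, smss)
    a, b = ALPHA / n, BETA / n
    rttvar = (1 - b) * rttvar + b * abs(srtt - r)   # uses the old SRTT
    srtt = (1 - a) * srtt + a * r
    return srtt, rttvar
```

   With a large flight size, each individual sample moves SRTT and
   RTTVAR less, so a burst of per-packet samples does not overwrite
   the history faster than one sample per RTT would.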

652	4.3.  Which Timestamp to Echo

654	   If more than one Timestamps option is received before a reply segment
655	   is sent, the TCP must choose only one of the TSvals to echo, ignoring
656	   the others.  To minimize the state kept in the receiver (i.e., the
657	   number of unprocessed TSvals), the receiver should be required to
658	   retain at most one timestamp in the connection control block.

660	   There are three situations to consider:

662	   (A)  Delayed ACKs.

664	        Many TCPs acknowledge only every second segment out of a group
665	        of segments arriving within a short time interval; this policy
666	        is known generally as "delayed ACKs".  The data-sender TCP must
667	        measure the effective RTT, including the additional time due to
668	        delayed ACKs, or else it will retransmit unnecessarily.  Thus,
669	        when delayed ACKs are in use, the receiver SHOULD reply with the
670	        TSval field from the earliest unacknowledged segment.

672	   (B)  A hole in the sequence space (segment(s) have been lost).

674	        The sender will continue sending until the window is filled, and
675	        the receiver may be generating <ACK>s as these out-of-order
676	        segments arrive (e.g., to aid "fast retransmit").

678	        The lost segment is probably a sign of congestion, and in that
679	        situation the sender should be conservative about
680	        retransmission.  Furthermore, it is better to overestimate than
681	        underestimate the RTT.  An <ACK> for an out-of-order segment
682	        SHOULD therefore contain the timestamp from the most recent
683	        segment that advanced RCV.NXT.

685	        The same situation occurs if segments are re-ordered by the
686	        network.

688	   (C)  A filled hole in the sequence space.

690	        The segment that fills the hole and advances the window
691	        represents the most recent measurement of the network
692	        characteristics.  An RTT computed from an earlier segment would
693	        probably include the sender's retransmit time-out, badly biasing
694	        the sender's average RTT estimate.  Thus, the timestamp from the
695	        latest segment (which filled the hole) MUST be echoed.

697	   An algorithm that covers all three cases is described in the
698	   following rules for Timestamps option processing on a synchronized
699	   connection:

701	   (1)  The connection state is augmented with two 32-bit slots:

703	        TS.Recent holds a timestamp to be echoed in TSecr whenever a
704	        segment is sent, and Last.ACK.sent holds the ACK field from the
705	        last segment sent.  Last.ACK.sent will equal RCV.NXT except when
706	        <ACK>s have been delayed.

708	   (2)  If:

710	            SEG.TSval >= TS.Recent and SEG.SEQ <= Last.ACK.sent

712	        then SEG.TSval is copied to TS.Recent; otherwise, it is ignored.

714	   (3)  When a TSopt is sent, its TSecr field is set to the current
715	        TS.Recent value.
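
   Rules (1) to (3) can be sketched as a small receiver model.  This
   is a toy illustration: the class name EchoState and its methods are
   hypothetical, segments are identified only by sequence number and
   length, and old duplicates below RCV.NXT are not handled.

```python
MOD32 = 1 << 32


def mod_leq(a: int, b: int) -> bool:
    """a <= b in 32-bit modular arithmetic."""
    return ((b - a) % MOD32) < (1 << 31)


class EchoState:
    """Receiver-side state for choosing which timestamp to echo."""

    def __init__(self, rcv_nxt: int = 0, ts_recent: int = 0):
        self.rcv_nxt = rcv_nxt
        self.last_ack_sent = rcv_nxt     # rule (1)
        self.ts_recent = ts_recent       # rule (1)
        self.queued = {}                 # out-of-order: seq -> length

    def receive(self, seq: int, tsval: int, length: int = 1) -> None:
        # Rule (2): SEG.TSval >= TS.Recent and SEG.SEQ <= Last.ACK.sent
        if mod_leq(self.ts_recent, tsval) and mod_leq(seq, self.last_ack_sent):
            self.ts_recent = tsval
        # Toy reassembly: advance RCV.NXT over contiguous segments.
        self.queued[seq] = length
        while self.rcv_nxt in self.queued:
            step = self.queued.pop(self.rcv_nxt)
            self.rcv_nxt = (self.rcv_nxt + step) % MOD32

    def send_ack(self):
        """Return (ACK field, TSecr) for an outgoing segment."""
        self.last_ack_sent = self.rcv_nxt
        return self.rcv_nxt, self.ts_recent      # rule (3)
```

   Calling receive() for several in-order segments before a single
   send_ack() models delayed ACKs: the echoed value stays at the TSval
   of the earliest unacknowledged segment.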

717	   The following examples illustrate these rules.  Here A, B, C...
718	   represent data segments occupying successive blocks of sequence
719	   numbers, and ACK(A),... represent the corresponding acknowledgment
720	   segments.  Note that ACK(A) has the same sequence number as B. We
721	   show only one direction of timestamp echoing, for clarity.

723	   o  Segments arrive in sequence, and some of the <ACK>s are delayed.

725	      By case (A), the timestamp from the oldest unacknowledged segment
726	      is echoed.

728	                                                    TS.Recent
729	                  <A, TSval=1> ------------------->
730	                                                        1
731	                  <B, TSval=2> ------------------->
732	                                                        1
733	                  <C, TSval=3> ------------------->
734	                                                        1
735	                           <---- <ACK(C), TSecr=1>
736	                  (etc)

738	   o  Segments arrive out of order, and every segment is acknowledged.

740	      By case (B), the timestamp from the last segment that advanced the
741	      left window edge is echoed, until the missing segment arrives; it
742	      is echoed according to Case (C).  The same sequence would occur if
743	      segments B and D were lost and retransmitted.

745	                                                    TS.Recent
746	                  <A, TSval=1> ------------------->
747	                                                        1
748	                           <---- <ACK(A), TSecr=1>
749	                                                        1
750	                  <C, TSval=3> ------------------->
751	                                                        1
752	                           <---- <ACK(A), TSecr=1>
753	                                                        1
754	                  <B, TSval=2> ------------------->
755	                                                        2
756	                           <---- <ACK(C), TSecr=2>
757	                                                        2
758	                  <E, TSval=5> ------------------->
759	                                                        2
760	                           <---- <ACK(C), TSecr=2>
761	                                                        2
762	                  <D, TSval=4> ------------------->
763	                                                        4
764	                           <---- <ACK(E), TSecr=4>
765	                  (etc)

767	5.  PAWS - Protection Against Wrapped Sequence Numbers

769	5.1.  Introduction

771	   Another use for the Timestamps option is the mechanism to Protect
772	   Against Wrapped Sequence numbers (PAWS).  Section 5.2 describes a
773	   simple mechanism to reject old duplicate segments that might corrupt
774	   an open TCP connection.  PAWS operates within a single TCP
775	   connection, using state that is saved in the connection control
776	   block.  Section 5.8 and Appendix B discuss the implications of the
777	   PAWS mechanism for avoiding old duplicates from previous incarnations
778	   of the same connection.

780	5.2.  The PAWS Mechanism

782	   PAWS uses the TCP Timestamps option described earlier, and assumes
783	   that every received TCP segment (including data and <ACK> segments)
784	   contains a timestamp SEG.TSval whose values are monotonically non-
785	   decreasing in time.  The basic idea is that a segment can be
786	   discarded as an old duplicate if it is received with a timestamp
787	   SEG.TSval less than some timestamp recently received on this
788	   connection.

790	   In the PAWS mechanism, the "timestamps" are 32-bit unsigned integers
791	   in a modular 32-bit space.  Thus, "less than" is defined the same way
792	   it is for TCP sequence numbers, and the same implementation
793	   techniques apply.  If s and t are timestamp values,

795	                       s < t  if 0 < (t - s) < 2^31,

797	   computed in unsigned 32-bit arithmetic.
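
   The modular comparison can be written directly from the definition.
   The function name ts_lt is hypothetical; the masking emulates
   unsigned 32-bit arithmetic.

```python
def ts_lt(s: int, t: int) -> bool:
    """True if timestamp s is "less than" t in modular 32-bit space.

    Implements: s < t  if 0 < (t - s) < 2^31, with (t - s) computed
    in unsigned 32-bit arithmetic.
    """
    return 0 < ((t - s) & 0xFFFFFFFF) < (1 << 31)
```

   Note that a value just before the wrap point compares as less than
   a small value just after it, e.g., ts_lt(0xFFFFFFFF, 0) is true.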

799	   The choice of incoming timestamps to be saved for this comparison
800	   MUST guarantee a value that is monotonically non-decreasing.  For
801	   example, an implementation might save the timestamp from the segment
802	   that last advanced the left edge of the receive window, i.e., the
803	   most recent in-sequence segment.  For simplicity, the value TS.Recent
804	   introduced in Section 4.3 is used instead, as using a common value
805	   for both PAWS and RTTM simplifies the implementation.  As Section 4.3
806	   explained, TS.Recent differs from the timestamp from the last in-
807	   sequence segment only in the case of delayed <ACK>s, and therefore by
808	   less than one window.  Either choice will therefore protect against
809	   sequence number wrap-around.

811	   PAWS submits all incoming segments to the same test, and therefore
812	   protects against duplicate <ACK> segments as well as data segments.
813	   (An alternative non-symmetric algorithm would protect against old
814	   duplicate <ACK>s: the sender of data would reject incoming <ACK>
815	   segments whose TSecr values were less than the TSecr saved from the
816	   last segment whose ACK field advanced the left edge of the send
817	   window.  This algorithm was deemed to lack economy of mechanism and
818	   symmetry.)

820	   TSval timestamps sent on <SYN> and <SYN,ACK> segments are used to
821	   initialize PAWS.  PAWS protects against old duplicate non-<SYN>
822	   segments, and duplicate <SYN> segments received while there is a
823	   synchronized connection.  Duplicate <SYN> and <SYN,ACK> segments
824	   received when there is no connection will be discarded by the normal
825	   3-way handshake and sequence number checks of TCP.

827	   [RFC1323] recommended that <RST> segments NOT carry timestamps, and
828	   that they be acceptable regardless of their timestamp.  At that time,
829	   the thinking was that old duplicate <RST> segments should be
830	   exceedingly unlikely, and their cleanup function should take
831	   precedence over timestamps.  More recently, discussions about various
832	   blind attacks on TCP connections have raised the suggestion that if
833	   the Timestamps option is present, SEG.TSecr could be used to provide
834	   stricter acceptance tests for <RST> segments.

836	   While still under discussion, to enable research into this area it is
837	   now RECOMMENDED that, when generating an <RST>, if the segment
838	   causing the <RST> to be generated contained a Timestamps option, the
839	   <RST> also contain a Timestamps option.  In the <RST> segment,
840	   SEG.TSecr SHOULD be set to SEG.TSval from the incoming segment and
841	   SEG.TSval SHOULD be set to zero.  If an <RST> is being generated
842	   because of a user abort, and Snd.TS.OK is set, then a Timestamps
843	   option SHOULD be included in the <RST>.  When an <RST> segment is
844	   received, it MUST NOT be subjected to the PAWS check by verifying an
845	   acceptable value in SEG.TSval, and information from the Timestamps
846	   option MUST NOT be used to update connection state information.
847	   SEG.TSecr MAY be used to provide stricter <RST> acceptance checks.

849	5.3.  Basic PAWS Algorithm

851	   If the PAWS algorithm is used, the following processing MUST be
852	   performed on all incoming segments for a synchronized connection.
853	   Also, PAWS processing MUST take precedence over the regular TCP
854	   acceptability check (Section 3.3 in [RFC0793]), which is performed
855	   after verification of the received Timestamps option:

857	   R1)  If there is a Timestamps option in the arriving segment,
858	        SEG.TSval < TS.Recent, TS.Recent is valid (see later discussion)
859	        and the RST bit is not set, then treat the arriving segment as
860	        not acceptable:

862	           Send an acknowledgement in reply as specified in [RFC0793]
863	           page 69 and drop the segment.

865	           Note: it is necessary to send an <ACK> segment in order to
866	           retain TCP's mechanisms for detecting and recovering from
867	           half-open connections.  For example, see Figure 10 of
868	           [RFC0793].

870	   R2)  If the segment is outside the window, reject it (normal TCP
871	        processing)

873	   R3)  If an arriving segment satisfies: SEG.SEQ <= Last.ACK.sent (see
874	        Section 4.3), then record its timestamp in TS.Recent.

876	   R4)  If an arriving segment is in-sequence (i.e., at the left window
877	        edge), then accept it normally.

879	   R5)  Otherwise, treat the segment as a normal in-window, out-of-
880	        sequence TCP segment (e.g., queue it for later delivery to the
881	        user).

883	   Steps R2, R4, and R5 are the normal TCP processing steps specified by
884	   [RFC0793].
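
   Steps R1 to R5 can be sketched as a single decision function.  This
   is an illustration under stated assumptions: the names Conn, Seg,
   and paws_receive are hypothetical, the window test in R2 looks only
   at the first sequence number of the segment, and advancing RCV.NXT
   is left to the caller.

```python
from dataclasses import dataclass
from typing import Optional

MOD32 = 1 << 32


def mod_lt(a: int, b: int) -> bool:
    """a < b in 32-bit modular arithmetic."""
    return 0 < ((b - a) % MOD32) < (1 << 31)


@dataclass
class Conn:
    rcv_nxt: int
    rcv_wnd: int
    last_ack_sent: int
    ts_recent: int
    ts_recent_valid: bool = True


@dataclass
class Seg:
    seq: int
    tsval: Optional[int] = None   # None: no Timestamps option present
    rst: bool = False


def paws_receive(c: Conn, s: Seg) -> str:
    # R1: old timestamp, TS.Recent valid, RST bit not set
    if (s.tsval is not None and not s.rst and c.ts_recent_valid
            and mod_lt(s.tsval, c.ts_recent)):
        return "ack-and-drop"
    # R2: outside the window (simplified test, see lead-in)
    if ((s.seq - c.rcv_nxt) % MOD32) >= c.rcv_wnd:
        return "reject"
    # R3: SEG.SEQ <= Last.ACK.sent; <RST> never updates state
    if (s.tsval is not None and not s.rst
            and not mod_lt(c.last_ack_sent, s.seq)):
        c.ts_recent = s.tsval
        c.ts_recent_valid = True
    # R4: in-sequence segment
    if s.seq == c.rcv_nxt:
        return "accept"
    # R5: in-window, out-of-sequence segment
    return "queue"
```

   The timestamp is examined once, when the segment arrives; a segment
   that was queued under R5 is not checked again when it is later
   delivered to the user.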

886	   It is important to note that the timestamp MUST be checked only when
887	   a segment first arrives at the receiver, regardless of whether it is
888	   in-sequence or it must be queued for later delivery.

890	   Consider the following example.

892	      Suppose the segment sequence: A.1, B.1, C.1, ..., Z.1 has been
893	      sent, where the letter indicates the sequence number and the digit
894	      represents the timestamp.  Suppose also that segment B.1 has been
895	      lost.  The timestamp in TS.Recent is 1 (from A.1), so C.1, ...,
896	      Z.1 are considered acceptable and are queued.  When B is
897	      retransmitted as segment B.2 (using the latest timestamp), it
898	      fills the hole and causes all the segments through Z to be
899	      acknowledged and passed to the user.  The timestamps of the queued
900	      segments are *not* inspected again at this time, since they have
901	      already been accepted.  When B.2 is accepted, TS.Recent is set to
902	      2.

904	   This rule allows reasonable performance under loss.  A full window of
905	   data is in transit at all times, and after a loss a full window less
906	   one segment will show up out-of-sequence to be queued at the receiver
907	   (e.g., up to ~2^30 bytes of data); the Timestamps option must not
908	   result in discarding this data.

910	   In certain unlikely circumstances, the algorithm of rules R1-R5 could
911	   lead to discarding some segments unnecessarily, as shown in the
912	   following example:

914	      Suppose again that segments: A.1, B.1, C.1, ..., Z.1 have been
915	      sent in sequence and that segment B.1 has been lost.  Furthermore,
916	      suppose delivery of some of C.1, ..., Z.1 is delayed until *after*
917	      the retransmission B.2 arrives at the receiver.  These delayed
918	      segments will be discarded unnecessarily when they do arrive,
919	      since their timestamps are now out of date.

921	   This case is very unlikely to occur.  If the retransmission was
922	   triggered by a timeout, some of the segments C.1, ..., Z.1 must have
923	   been delayed longer than the RTO time.  This is presumably an
924	   unlikely event, or there would be many spurious timeouts and
925	   retransmissions.  If B's retransmission was triggered by the "fast
926	   retransmit" algorithm, i.e., by duplicate <ACK>s, then the queued
927	   segments that caused these <ACK>s must have been received already.

929	   Even if a segment were delayed past the RTO, the Fast Retransmit
930	   mechanism [Jacobson90c] will cause the delayed segments to be
931	   retransmitted at the same time as B.2, avoiding an extra RTT and
932	   therefore causing a very small performance penalty.

934	   We know of no case with a significant probability of occurrence in
935	   which timestamps will cause performance degradation by unnecessarily
936	   discarding segments.

938	5.4.  Timestamp Clock

940	   It is important to understand that the PAWS algorithm does not
941	   require clock synchronization between sender and receiver.  The
942	   sender's timestamp clock is used as a source of monotonic non-
943	   decreasing values to stamp the segments.  The receiver treats the
944	   timestamp value as simply a monotonically non-decreasing serial
945	   number, without any connection to time.  From the receiver's
946	   viewpoint, the timestamp is acting as a logical extension of the
947	   high-order bits of the sequence number.

949	   The receiver algorithm does place some requirements on the frequency
950	   of the timestamp clock.

952	   (a)  The timestamp clock must not be "too slow".

954	        It MUST tick at least once for each 2^31 bytes sent.  In fact,
955	        in order to be useful to the sender for round trip timing, the
956	        clock SHOULD tick at least once per window's worth of data, and
957	        even with the window extension defined in Section 2.2, 2^31
958	        bytes must be at least two windows.

960	        To make this more quantitative, any clock faster than 1 tick/sec
961	        will reject old duplicate segments for link speeds of ~8 Gbps.
962	        A 1 ms timestamp clock will work at link speeds up to 8 Tbps
963	        (8*10^12 bps)!

965	   (b)  The timestamp clock must not be "too fast".

967	        The recycling time of the timestamp clock MUST be greater than
968	        MSL seconds.  Since the clock (timestamp) is 32 bits and the
969	        worst-case MSL is 255 seconds, the maximum acceptable clock
970	        frequency is one tick every 59 ns.

972	        However, it is desirable to establish a much longer recycle
973	        period, in order to handle outdated timestamps on idle
974	        connections (see Section 5.5), and to relax the MSL requirement
975	        for preventing sequence number wrap-around.  With a 1 ms
976	        timestamp clock, the 32-bit timestamp will wrap its sign bit in
977	        24.8 days.  Thus, it will reject old duplicates on the same
978	        connection if MSL is 24.8 days or less.  This appears to be a
979	        very safe figure; an MSL of 24.8 days or longer can probably be
980	        assumed in the Internet without requiring precise MSL
981	        enforcement.
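
   The two timing figures in (b) follow from simple arithmetic, which
   can be checked with the sketch below.  The function names are
   hypothetical.

```python
SECONDS_PER_DAY = 86_400


def sign_bit_wrap_days(tick_seconds: float) -> float:
    """Days until a 32-bit timestamp wraps its sign bit (2^31 ticks)."""
    return (2 ** 31) * tick_seconds / SECONDS_PER_DAY


def min_tick_seconds(msl_seconds: float = 255) -> float:
    """Shortest tick for which all 2^32 values last at least one MSL."""
    return msl_seconds / 2 ** 32
```

   For a 1 ms tick the first function gives a little over 24.8 days,
   and for the worst-case MSL of 255 seconds the second gives a little
   over 59 ns.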

983	   Based upon these considerations, we choose a timestamp clock
984	   frequency in the range 1 ms to 1 sec per tick.  This range also
985	   matches the requirements of the RTTM mechanism, which does not need
986	   much more resolution than the granularity of the retransmit timer,
987	   e.g., tens or hundreds of milliseconds.

989	   The PAWS mechanism also puts a strong monotonicity requirement on the
990	   sender's timestamp clock.  The method of implementation of the
991	   timestamp clock to meet this requirement depends upon the system
992	   hardware and software.

994	   o  Some hosts have a hardware clock that is guaranteed to be
995	      monotonic between hardware resets.

997	   o  A clock interrupt may be used to simply increment a binary integer
998	      by 1 periodically.

1000	   o  The timestamp clock may be derived from a system clock that is
1001	      subject to being abruptly changed, by adding a variable offset
1002	      value.  This offset is initialized to zero.  When a new timestamp
1003	      clock value is needed, the offset can be adjusted as necessary to
1004	      make the new value equal to or larger than the previous value
1005	      (which was saved for this purpose).

1007	   o  A random offset may be added to the timestamp clock on a per
1008	      connection basis.  See [RFC6528], section 3, on randomizing the
1009	      initial sequence number (ISN).  The same function with a different
1010	      secret key can be used to generate the per connection timestamp
1011	      offset.
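
   The variable-offset approach from the third bullet can be sketched
   as follows.  This is a minimal illustration assuming a 1 ms tick;
   the class name TimestampClock and the now_ms parameter are
   hypothetical.

```python
import time


def _system_ms() -> int:
    # Wall-clock milliseconds; may jump if the system clock is reset.
    return time.time_ns() // 1_000_000


class TimestampClock:
    """Monotonic non-decreasing 32-bit timestamp clock."""

    def __init__(self, now_ms=_system_ms):
        self._now_ms = now_ms
        self._offset = 0      # initialized to zero, in ticks
        self._last = None     # previous value, saved for comparison

    def read(self) -> int:
        value = self._now_ms() + self._offset
        if self._last is not None and value < self._last:
            # System clock stepped backwards: grow the offset so the
            # new value equals the previous one.
            self._offset += self._last - value
            value = self._last
        self._last = value
        return value & 0xFFFFFFFF   # truncate to the 32-bit TSval
```

   The comparison is made before truncation to 32 bits, so the normal
   wrap of the timestamp value is not mistaken for a backwards step.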

1013	5.5.  Outdated Timestamps

1015	   If a connection remains idle long enough for the timestamp clock of
1016	   the other TCP to wrap its sign bit, then the value saved in TS.Recent
1017	   will become too old; as a result, the PAWS mechanism will cause all
1018	   subsequent segments to be rejected, freezing the connection (until
1019	   the timestamp clock wraps its sign bit again).

1021	   With the chosen range of timestamp clock frequencies (1 sec to 1 ms),
1022	   the time to wrap the sign bit will be between 24.8 days and 24800
1023	   days.  A TCP connection that is idle for more than 24 days and then
1024	   comes to life is exceedingly unusual.  However, it is undesirable in
1025	   principle to place any limitation on TCP connection lifetimes.

1027	   We therefore require that an implementation of PAWS include a
1028	   mechanism to "invalidate" the TS.Recent value when a connection is
1029	   idle for more than 24 days.  (An alternative solution to the problem
1030	   of outdated timestamps would be to send keep-alive segments at a very
1031	   low rate, but still more often than the wrap-around time for
1032	   timestamps, e.g., once a day.  This would impose negligible overhead.
1033	   However, the TCP specification has never included keep-alives, so the
1034	   solution based upon invalidation was chosen.)

1036	   Note that a TCP does not know the frequency, and therefore the
1037	   wrap-around time, of the other TCP, so it must assume the worst.  The
1038	   validity of TS.Recent needs to be checked only if the basic PAWS
1039	   timestamp check fails, i.e., only if SEG.TSval < TS.Recent.  If
1040	   TS.Recent is found to be invalid, then the segment is accepted,
1041	   regardless of the failure of the timestamp check, and rule R3 updates
1042	   TS.Recent with the TSval from the new segment.

1044	   To detect how long the connection has been idle, the TCP MAY update a
1045	   clock or timestamp value associated with the connection whenever
1046	   TS.Recent is updated, for example.  The details will be
1047	   implementation-dependent.
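
   One possible way to track idleness is to record the local time of
   each TS.Recent update.  The sketch below is an assumption about how
   this might be done; the class name TsRecent and the use of seconds
   as the local time unit are hypothetical.

```python
TS_RECENT_MAX_IDLE = 24 * 86_400   # 24 days, in seconds


class TsRecent:
    """TS.Recent together with the local time of its last update."""

    def __init__(self, value: int, now: float):
        self.value = value
        self.updated_at = now

    def update(self, tsval: int, now: float) -> None:
        self.value = tsval
        self.updated_at = now

    def is_valid(self, now: float) -> bool:
        """False once the connection has been idle for over 24 days."""
        return (now - self.updated_at) <= TS_RECENT_MAX_IDLE
```

   The validity test needs to run only when the basic timestamp check
   has failed, so it adds no cost to the common case.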

1049	5.6.  Header Prediction

1051	   "Header prediction" [Jacobson90a] is a high-performance transport
1052	   protocol implementation technique that is most important for high-
1053	   speed links.  This technique optimizes the code for the most common
1054	   case, receiving a segment correctly and in order.  Using header
1055	   prediction, the receiver asks the question, "Is this segment the next
1056	   in sequence?"  This question can be answered in fewer machine
1057	   instructions than the question, "Is this segment within the window?"

1059	   Adding header prediction to our timestamp procedure leads to the
1060	   following recommended sequence for processing an arriving TCP
1061	   segment:

1063	   H1)  Check timestamp (same as step R1 above)

1065	   H2)  Do header prediction: if segment is next in sequence and if
1066	        there are no special conditions requiring additional processing,
1067	        accept the segment, record its timestamp, and skip H3.

1069	   H3)  Process the segment normally, as specified in RFC 793.  This
1070	        includes dropping segments that are outside the window and
1071	        possibly sending acknowledgments, and queuing in-window, out-of-
1072	        sequence segments.

1074	   Another possibility would be to interchange steps H1 and H2, i.e., to
1075	   perform the header prediction step H2 *first*, and perform H1 and H3
1076	   only when header prediction fails.  This could be a performance
1077	   improvement, since the timestamp check in step H1 is very unlikely to
1078	   fail, and it requires unsigned modulo arithmetic.  To perform this
1079	   check on every single segment is contrary to the philosophy of header
1080	   prediction.  We believe that this change might produce a measurable
1081	   reduction in CPU time for TCP protocol processing on high-speed
1082	   networks.

1084	   However, putting H2 first would create a hazard: a segment from 2^32
1085	   bytes in the past might arrive at exactly the wrong time and be
1086	   accepted mistakenly by the header-prediction step.  The following
1087	   reasoning has been introduced in [RFC1185] to show that the
1088	   probability of this failure is negligible.

1090	      If all segments are equally likely to show up as old duplicates,
1091	      then the probability of an old duplicate exactly matching the left
1092	      window edge is the maximum segment size (MSS) divided by the size
1093	      of the sequence space.  This ratio must be less than 2^-16, since
1094	      MSS must be < 2^16; for example, it will be (2^12)/(2^32) = 2^-20
1095	      for a 100 Mbit/s link.  However, the older a segment is, the less
1096	      likely it is to be retained in the Internet, and under any
1097	      reasonable model of segment lifetime the probability of an old
1098	      duplicate exactly at the left window edge must be much smaller
1099	      than 2^-16.

1101	      The 16 bit TCP checksum also allows a basic unreliability of one
1102	      part in 2^16.  A protocol mechanism whose reliability exceeds the
1103	      reliability of the TCP checksum should be considered "good
1104	      enough", i.e., it won't contribute significantly to the overall
1105	      error rate.  We therefore believe we can ignore the problem of an
1106	      old duplicate being accepted by doing header prediction before
1107	      checking the timestamp.
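
   The ratio used in this argument can be checked with exact
   arithmetic.  The function name below is hypothetical, and the
   uniform-likelihood assumption is the one stated in the quoted
   reasoning.

```python
from fractions import Fraction


def left_edge_hit_probability(mss: int, seq_space_bits: int = 32) -> Fraction:
    """MSS divided by the size of the sequence space."""
    return Fraction(mss, 2 ** seq_space_bits)
```

   For an MSS of 2^12 bytes this yields exactly 2^-20, and for any MSS
   below 2^16 the result is below 2^-16.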

1109	   However, this probabilistic argument is not universally accepted, and
1110	   the consensus at present is that the performance gain does not
1111	   justify the hazard in the general case.  It is therefore recommended
1112	   that H2 follow H1.

1114	5.7.  IP Fragmentation

1116	   At high data rates, the protection against old segments provided by
1117	   PAWS can be circumvented by errors in IP fragment reassembly (see
1118	   [RFC4963]).  The only way to protect against incorrect IP fragment
1119	   reassembly is to not allow the segments to be fragmented.  This is
1120	   done by setting the Don't Fragment (DF) bit in the IP header.
1121	   Setting the DF bit implies the use of Path MTU Discovery as described
1122	   in [RFC1191], [RFC1981], and [RFC4821], thus any TCP implementation
1123	   that implements PAWS MUST also implement Path MTU Discovery.

1125	5.8.  Duplicates from Earlier Incarnations of Connection

1127	   The PAWS mechanism protects against errors due to sequence number
1128	   wrap-around on high-speed connections.  Segments from an earlier
1129	   incarnation of the same connection are also a potential cause of old
1130	   duplicate errors.  In both cases, the TCP mechanisms to prevent such
1131	   errors depend upon the enforcement of a maximum segment lifetime
1132	   (MSL) by the Internet (IP) layer (see Appendix of RFC 1185 for a
1133	   detailed discussion).  Unlike the case of sequence space wrap-around,
1134	   the MSL required to prevent old duplicate errors from earlier
1135	   incarnations does not depend upon the transfer rate.  If the IP layer
1136	   enforces the recommended 2 minute MSL of TCP, and if the TCP rules
1137	   are followed, TCP connections will be safe from earlier incarnations,
1138	   no matter how high the network speed.  Thus, the PAWS mechanism is
1139	   not required for this case.

1141	   We may still ask whether the PAWS mechanism can provide additional
1142	   security against old duplicates from earlier connections, allowing us
1143	   to relax the enforcement of MSL by the IP layer.  Appendix B explores
1144	   this question, showing that further assumptions and/or mechanisms are
1145	   required, beyond those of PAWS.  This is not part of the current
1146	   extension.

1148	6.  Conclusions and Acknowledgements

1150	   This memo presented a set of extensions to TCP to provide efficient
1151	   operation over large bandwidth * delay product paths and reliable
1152	   operation over very high-speed paths.  These extensions are designed
1153	   to provide compatible interworking with TCP stacks that do not
1154	   implement the extensions.

1156	   These mechanisms are implemented using TCP options for scaled windows
1157	   and timestamps.  The timestamps are used for two distinct mechanisms:
1158	   RTTM (Round Trip Time Measurement) and PAWS (Protection Against
1159	   Wrapped Sequences).

1161	   The Window Scale option was originally suggested by Mike St. Johns of
1162	   USAF/DCA.  The present form of the option was suggested by Mike
1163	   Karels of UC Berkeley in response to a more cumbersome scheme defined
1164	   by Van Jacobson.  Lixia Zhang helped formulate the PAWS mechanism
1165	   description in [RFC1185].

1167	   Finally, much of this work originated as the result of discussions
1168	   within the End-to-End Task Force on the theoretical limitations of
1169	   transport protocols in general and TCP in particular.  Task force
1170	   members and others on the end2end-interest list have made valuable
1171	   contributions by pointing out flaws in the algorithms and the
1172	   documentation.  Continued discussion and development since the
1173	   publication of [RFC1323] originally occurred in the IETF TCP Large
1174	   Windows Working Group, later on in the End-to-End Task Force, and
1175	   most recently in the IETF TCP Maintenance Working Group.  The authors
1176	   are grateful for all these contributions.

1178	7.  Security Considerations

1180	   The TCP sequence space is a fixed size, and as the window becomes
1181	   larger it becomes easier for an attacker to generate forged packets
1182	   that can fall within the TCP window, and be accepted as valid
1183	   segments.  While use of timestamps and PAWS can help to mitigate
1184	   this, when using PAWS, if an attacker is able to forge a packet that
1185	   is acceptable to the TCP connection, a timestamp that is in the
1186	   future would cause valid segments to be dropped due to PAWS checks.
1187	   Hence, implementers should take care to not open the TCP window
1188	   drastically beyond the requirements of the connection.

1190	   A naive implementation that derives the timestamp clock value
1191	   directly from a system uptime clock may unintentionally leak this
1192	   information to an attacker.  This does not directly compromise any of
1193	   the mechanisms described in this document.  However, this may be
1194	   valuable information to a potential attacker.  An implementer should
1195	   evaluate the potential impact and mitigate this accordingly (e.g., by
1196	   using a random offset for the timestamp clock on each connection, or
1197	   using an external, real-time derived timestamp clock source).

1199	   Expanding the TCP window beyond 64 KiB for IPv6 allows Jumbograms
1200	   [RFC2675] to be used when the local network supports packets larger
1201	   than 64 KiB.  When larger TCP segments are used, the TCP checksum
1202	   becomes weaker.

1204	   Mechanisms to protect the TCP header from modification should also
1205	   protect the TCP options.

1207	   Middleboxes and TCP options:

1209	      Some middleboxes have been known to remove the TCP options
1210	      described in this document from TCP segments [Honda11].
1211	      Middleboxes that remove TCP options described in this document
1212	      from the <SYN> segment interfere with the selection of parameters
1213	      appropriate for the session.  Removing any of these options in a
1214	      <SYN,ACK> segment will leave the end hosts in a state that
1215	      destroys the proper operation of the protocol.

1217	      *  If a Window Scale option is removed from a <SYN,ACK> segment,
1218	         the end hosts will not negotiate the window scaling factor
1219	         correctly.  Middleboxes must not remove or modify the Window
1220	         Scale option from <SYN,ACK> segments.

1222	      *  If a stateful firewall uses the window field to detect whether
1223	         a received segment is inside the current window, and does not
1224	         support the Window Scale option, it will not be able to
1225	         correctly determine whether or not a packet is in the window.
1226	         Such middleboxes must also support the Window Scale option
1227	         and apply the scale factor when processing segments.  If the
1228	         window scale factor cannot be determined, the middlebox must
1229	         not do window-based processing.

1231	      *  If the Timestamps option is removed from the <SYN> or <SYN,ACK>
1232	         segment, high speed connections that need PAWS would not have
1233	         that protection.  Successful negotiation of the Timestamps
1234	         option enforces a stricter verification of incoming segments
1235	         at the receiver.  If the Timestamps option is removed from a
1236	         subsequent data segment after a successful negotiation (e.g. as
1237	         part of re-segmentation), the segment is discarded by the
1238	         receiver without further processing.  Middleboxes should not
1239	         remove the Timestamps option.

1241	      *  It must be noted that [RFC1323] doesn't address the case of the
1242	         Timestamps option being dropped or selectively omitted after
1243	         being negotiated, and that the update in this document may
1244	         cause some broken middlebox behavior to be detected
1245	         (potentially unresponsive TCP sessions).
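
The window-scale requirement on stateful middleboxes in the list above
can be illustrated with a minimal Python sketch.  The function names
and the use of None for "scale factor unknown" are assumptions made
for this example only; they are not defined by this document.

```python
from typing import Optional

SEQ_MODULUS = 1 << 32  # TCP sequence numbers are 32 bits wide


def scaled_window(seg_wnd: int, shift_cnt: Optional[int]) -> Optional[int]:
    """Apply the negotiated scale factor to a 16-bit window field.

    shift_cnt is None when the middlebox did not observe the Window
    Scale option on the handshake (an assumption of this sketch).
    """
    if shift_cnt is None:
        return None
    return seg_wnd << shift_cnt


def in_window(seq: int, rcv_nxt: int, seg_wnd: int,
              shift_cnt: Optional[int]) -> Optional[bool]:
    """Return True/False for an in-window test, or None when the scale
    factor is unknown and no window-based processing may be done."""
    window = scaled_window(seg_wnd, shift_cnt)
    if window is None:
        return None
    # Distance from the left window edge, modulo 2^32.
    return (seq - rcv_nxt) % SEQ_MODULUS < window
```

A middlebox that ignores the scale factor (shift of 0) would wrongly
reject a segment 70000 bytes into a window advertised as 65535 with a
shift of 2, whereas the scaled test accepts it.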

1247	   Implementations that depend on PAWS could provide a mechanism for the
1248	   application to determine whether or not PAWS is in use on the
1249	   connection, and choose to terminate the connection if that protection
1250	   doesn't exist.  This is not just to protect the connection against
1251	   middleboxes that might remove the Timestamps option, but also against
1252	   remote hosts that do not have Timestamp support.

1254	7.1.  Privacy Considerations

1256	   The TCP options described in this document do not expose individual
1257	   users' data.  However, a naive implementation simply using the system
1258	   clock as source for the Timestamps option will reveal characteristics
1259	   of the TCP, potentially allowing more targeted attacks.  It is
1260	   therefore RECOMMENDED to generate a random, per-connection offset to
1261	   be used with the clock source when generating the Timestamps option
1262	   value (see Section 5.4).
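
A minimal Python sketch of the per-connection offset follows.  It
assumes the relation Snd.TSclock = my.TSclock + Snd.TSoffset from
Appendix C, with arithmetic modulo 2^32; the function names and the
choice of the `secrets` module as the random source are illustrative
assumptions, not requirements of this document.

```python
import secrets

TS_MODULUS = 1 << 32  # TSval is a 32-bit field


def new_ts_offset() -> int:
    """Pick a random Snd.TSoffset once, when the connection is created."""
    return secrets.randbits(32)


def snd_tsclock(my_tsclock: int, snd_tsoffset: int) -> int:
    """Snd.TSclock = my.TSclock + Snd.TSoffset, wrapped to 32 bits."""
    return (my_tsclock + snd_tsoffset) % TS_MODULUS
```

Because the offset is constant for the lifetime of a connection,
differences between two TSval values on that connection are unchanged,
so RTTM and PAWS are unaffected while the absolute clock value is
hidden.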

1264	   Furthermore, the combination, relative ordering and padding of the
1265	   TCP options described in Section 2.2 and Section 3.2 will reveal
1266	   additional clues to allow the fingerprinting of the system.

1268	8.  IANA Considerations

1270	   This document has no actions for IANA.  The described TCP options are
1271	   well known from the superseded [RFC1323].

1273	9.  References

1275	9.1.  Normative References

1277	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
1278	              RFC 793, September 1981.

1280	   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
1281	              November 1990.

1283	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1284	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1286	9.2.  Informative References

1288	   [Allman99]
1289	              Allman, M. and V. Paxson, "On Estimating End-to-End
1290	              Network Path Properties", Proc. ACM SIGCOMM Technical
1291	              Symposium, Cambridge, MA, September 1999,
1292	              <http://aciri.org/mallman/papers/estimation-la.pdf>.

1294	   [Ekstroem04]
1295	              Ekstroem, H. and R. Ludwig, "The Peak-Hopper: A New End-
1296	              to-End Retransmission Timer for Reliable Unicast
1297	              Transport", INFOCOM 2004 IEEE, March 2004, <http://
1298	              citeseerx.ist.psu.edu/viewdoc/
1299	              download?doi=10.1.1.76.2748&rep=rep1&type=pdf>.

1301	   [Floyd05]  Floyd, S., "[tcpm] How the RTO should be estimated with
1302	              timestamps", Message from 26.Jan.2007 to the tcpm mailing
1303	              list, August 2005, <http://www.ietf.org/mail-archive/web/
1304	              tcpm/current/msg02508.html>.

1306	   [Garlick77]
1307	              Garlick, L., Rom, R., and J. Postel, "Issues in Reliable
1308	              Host-to-Host Protocols", Proc. Second Berkeley Workshop on
1309	              Distributed Data Management and Computer Networks,
1310	              May 1977, <http://www.rfc-editor.org/ien/ien12.txt>.

1312	   [Hamming77]
1313	              Hamming, R., "Digital Filters", Prentice Hall, Englewood
1314	              Cliffs, N.J. ISBN 0-13-212571-4, 1977.

1316	   [Honda11]  Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A.,
1317	              Handley, M., and H. Tokuda, "Is it still possible to
1318	              extend TCP?", Proc. of ACM Internet Measurement
1319	              Conference (IMC) '11, November 2011.

1321	   [Jacobson88a]
1322	              Jacobson, V., "Congestion Avoidance and Control", SIGCOMM
1323	              '88, Stanford,  CA., August 1988,
1324	              <http://ee.lbl.gov/papers/congavoid.pdf>.

1326	   [Jacobson90a]
1327	              Jacobson, V., "4BSD Header Prediction", ACM Computer
1328	              Communication Review, April 1990.

1330	   [Jacobson90c]
1331	              Jacobson, V., "Modified TCP congestion avoidance
1332	              algorithm", Message to the end2end-interest mailing list,
1333	              April 1990,
1334	              <ftp://ftp.isi.edu/end2end/end2end-interest-1990.mail>.

1336	   [Jain86]   Jain, R., "Divergence of Timeout Algorithms for Packet
1337	              Retransmissions", Proc. Fifth Phoenix Conf. on Comp. and
1338	              Comm., Scottsdale, Arizona, March 1986,
1339	              <http://arxiv.org/ftp/cs/papers/9809/9809097.pdf>.

1341	   [Karn87]   Karn, P. and C. Partridge, "Estimating Round-Trip Times in
1342	              Reliable Transport Protocols", Proc. SIGCOMM '87,
1343	              August 1987.

1345	   [Kuehlewind10]
1346	              Kuehlewind, M. and B. Briscoe, "Chirping for Congestion
1347	              Control - Implementation Feasibility", November 2010,
1348	              <bobbriscoe.net/projects/netsvc_i-f/chirp_pfldnet10.pdf>.

1350	   [Kuzmanovic03]
1351	              Kuzmanovic, A. and E. Knightly, "TCP-LP: Low-Priority
1352	              Service via End-Point Congestion Control", 2003,
1353	              <www.cs.northwestern.edu/~akuzma/doc/TCP-LP-ToN.pdf>.

1355	   [Ludwig00]
1356	              Ludwig, R. and K. Sklower, "The Eifel Retransmission
1357	              Timer", ACM SIGCOMM Computer Communication Review Volume
1358	              30 Issue 3, July 2000, <http://ccr.sigcomm.org/archive/
1359	              2000/july00/LudwigFinal.pdf>.

1361	   [Martin03]
1362	              Martin, D., "[Tsvwg] RFC 1323.bis", Message to the tsvwg
1363	              mailing list, September 2003, <http://www.ietf.org/
1364	              mail-archive/web/tsvwg/current/msg04435.html>.

1366	   [Mathis08]
1367	              Mathis, M., "[tcpm] Example of 1323 window retraction
1368	              problem", Message to the tcpm mailing list, March 2008,
1369	              <http://www.ietf.org/mail-archive/web/tcpm/current/
1370	              msg03564.html>.

1372	   [Medina04]
1373	              Medina, A., Allman, M., and S. Floyd, "Measuring
1374	              Interactions Between Transport Protocols and Middleboxes",
1375	              Proc. ACM SIGCOMM/USENIX Internet Measurement Conference,
1376	              October 2004,
1377	              <http://www.icir.net/tbit/tbit-Aug2004.pdf>.

1379	   [Medina05]
1380	              Medina, A., Allman, M., and S. Floyd, "Measuring the
1381	              Evolution of Transport Protocols in the Internet", ACM
1382	              Computer Communication Review 35(2), April 2005,
1383	              <http://icir.net/floyd/papers/TCPevolution-Mar2005.pdf>.

1385	   [Oppermann13]
1386	              Oppermann, A., "[tcpm] Explanation to the relaxation of
1387	              TSopt acceptance rules", Message to the tcpm mailing list,
1388	              Jun 2013, <http://www.ietf.org/mail-archive/web/tcpm/
1389	              current/msg08001.html>.

1391	   [RFC0896]  Nagle, J., "Congestion control in IP/TCP internetworks",
1392	              RFC 896, January 1984.

1394	   [RFC1072]  Jacobson, V. and R. Braden, "TCP extensions for long-delay
1395	              paths", RFC 1072, October 1988.

1397	   [RFC1110]  McKenzie, A., "Problem with the TCP big window option",
1398	              RFC 1110, August 1989.

1400	   [RFC1122]  Braden, R., "Requirements for Internet Hosts -
1401	              Communication Layers", STD 3, RFC 1122, October 1989.

1403	   [RFC1185]  Jacobson, V., Braden, B., and L. Zhang, "TCP Extension for
1404	              High-Speed Paths", RFC 1185, October 1990.

1406	   [RFC1323]  Jacobson, V., Braden, B., and D. Borman, "TCP Extensions
1407	              for High Performance", RFC 1323, May 1992.

1409	   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
1410	              for IP version 6", RFC 1981, August 1996.

1412	   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
1413	              Selective Acknowledgment Options", RFC 2018, October 1996.

1415	   [RFC2581]  Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
1416	              Control", RFC 2581, April 1999.

1418	   [RFC2675]  Borman, D., Deering, S., and R. Hinden, "IPv6 Jumbograms",
1419	              RFC 2675, August 1999.

1421	   [RFC2883]  Floyd, S., Mahdavi, J., Mathis, M., and M. Podolsky, "An
1422	              Extension to the Selective Acknowledgement (SACK) Option
1423	              for TCP", RFC 2883, July 2000.

1425	   [RFC3522]  Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm
1426	              for TCP", RFC 3522, April 2003.

1428	   [RFC4015]  Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm
1429	              for TCP", RFC 4015, February 2005.

1431	   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
1432	              Discovery", RFC 4821, March 2007.

1434	   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
1435	              Errors at High Data Rates", RFC 4963, July 2007.

1437	   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
1438	              Control", RFC 5681, September 2009.

1440	   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
1441	              "Computing TCP's Retransmission Timer", RFC 6298,
1442	              June 2011.

1444	   [RFC6528]  Gont, F. and S. Bellovin, "Defending against Sequence
1445	              Number Attacks", RFC 6528, February 2012.

1447	   [RFC6675]  Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M.,
1448	              and Y. Nishida, "A Conservative Loss Recovery Algorithm
1449	              Based on Selective Acknowledgment (SACK) for TCP",
1450	              RFC 6675, August 2012.

1452	   [RFC6691]  Borman, D., "TCP Options and Maximum Segment Size (MSS)",
1453	              RFC 6691, July 2012.

1455	   [RFC6817]  Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
1456	              "Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
1457	              December 2012.

1459	   [Watson81]
1460	              Watson, R., "Timer-based Mechanisms in Reliable Transport
1461	              Protocol Connection Management", Computer Networks, Vol.
1462	              5, 1981.

1464	   [Zhang86]  Zhang, L., "Why TCP Timers Don't Work Well", Proc. SIGCOMM
1465	              '86, Stowe, VT, August 1986.

1467	Appendix A.  Implementation Suggestions

1469	   TCP Option Layout

1471	      The following layout is recommended for sending options on non-
1472	      <SYN> segments, to achieve maximum feasible alignment of 32-bit
1473	      and 64-bit machines.

1475	                   +--------+--------+--------+--------+
1476	                   |   NOP  |  NOP   |  TSopt |   10   |
1477	                   +--------+--------+--------+--------+
1478	                   |          TSval timestamp          |
1479	                   +--------+--------+--------+--------+
1480	                   |          TSecr timestamp          |
1481	                   +--------+--------+--------+--------+
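
The 12-byte layout above can be produced and parsed with a short
Python sketch.  The option kinds used (1 for NOP, 8 for Timestamps)
and the length of 10 are the standard values; the function names are
assumptions made for this example.

```python
import struct

TCPOPT_NOP = 1          # No-Operation option kind
TCPOPT_TIMESTAMP = 8    # Timestamps option kind
TCPOLEN_TIMESTAMP = 10  # kind + length + TSval + TSecr

# Network byte order: four single bytes, then two 32-bit words.
_LAYOUT = "!BBBBII"


def pack_tsopt(tsval: int, tsecr: int) -> bytes:
    """Build NOP, NOP, TSopt, 10, TSval, TSecr (12 bytes in total)."""
    return struct.pack(_LAYOUT, TCPOPT_NOP, TCPOPT_NOP,
                       TCPOPT_TIMESTAMP, TCPOLEN_TIMESTAMP,
                       tsval & 0xFFFFFFFF, tsecr & 0xFFFFFFFF)


def unpack_tsopt(data: bytes) -> tuple:
    """Parse the recommended layout; raise ValueError on anything else."""
    nop1, nop2, kind, length, tsval, tsecr = struct.unpack(_LAYOUT, data)
    expected = (TCPOPT_NOP, TCPOPT_NOP, TCPOPT_TIMESTAMP, TCPOLEN_TIMESTAMP)
    if (nop1, nop2, kind, length) != expected:
        raise ValueError("not the recommended Timestamps layout")
    return tsval, tsecr
```

The two leading NOPs place TSval and TSecr on 4-byte boundaries
relative to the start of the options area.  A receiver must still
accept other legal layouts; this parser only recognizes the
recommended one.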

1483	   Interaction with the TCP Urgent Pointer

1485	      The TCP Urgent pointer, like the TCP window, is a 16 bit value.
1486	      Some of the original discussion for the TCP Window Scale option
1487	      included proposals to increase the Urgent pointer to 32 bits.  As
1488	      it turns out, this is unnecessary.  There are two observations
1489	      that should be made:

1491	      (1)  With IP Version 4, the largest amount of TCP data that can be
1492	           sent in a single packet is 65495 bytes (64 KiB - 1 - size of
1493	           fixed IP and TCP headers).

1495	      (2)  Updates to the urgent pointer while the user is in "urgent
1496	           mode" are invisible to the user.

1498	      This means that if the Urgent Pointer points beyond the end of the
1499	      TCP data in the current segment, then the user will remain in
1500	      urgent mode until the next TCP segment arrives.  That segment will
1501	      update the urgent pointer to a new offset, and the user will never
1502	      have left urgent mode.

1504	      Thus, to properly implement the Urgent Pointer, the sending TCP
1505	      only has to check for overflow of the 16 bit Urgent Pointer field
1506	      before filling it in.  If it does overflow, then a value of 65535
1507	      should be inserted into the Urgent Pointer.
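
The overflow check described above amounts to a clamp, sketched here
in Python.  The names are illustrative assumptions; the 20-byte header
sizes are the fixed IPv4 and TCP headers without options.

```python
URG_FIELD_MAX = 0xFFFF  # largest value of the 16-bit Urgent Pointer

# Largest TCP payload in one IPv4 packet: 65535 minus the fixed
# 20-byte IP header and the fixed 20-byte TCP header.
MAX_IPV4_TCP_DATA = 65535 - 20 - 20


def urgent_pointer_field(urgent_offset: int) -> int:
    """Value to place in the Urgent Pointer field of an outgoing segment.

    urgent_offset is the distance from SEG.SEQ to the urgent data; it
    may exceed 16 bits when large windows are in use.
    """
    if urgent_offset < 0:
        raise ValueError("urgent offset must be non-negative")
    return min(urgent_offset, URG_FIELD_MAX)
```

Since a clamped pointer of 65535 lies beyond the at most 65495 bytes
of data in the segment, the receiver stays in urgent mode until a
later segment carries the updated offset.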

1509	      The same technique applies to IP Version 6, except in the case of
1510	      IPv6 Jumbograms.  When IPv6 Jumbograms are supported, [RFC2675]
1511	      requires additional steps for dealing with the Urgent Pointer;
1512	      these are described in Section 5.2 of [RFC2675].

1514	Appendix B.  Duplicates from Earlier Connection Incarnations

1516	   There are two cases to be considered: (1) a system crashing (and
1517	   losing connection state) and restarting, and (2) the same connection
1518	   being closed and reopened without a loss of host state.  These will
1519	   be described in the following two sections.

1521	B.1.  System Crash with Loss of State

1523	   TCP's quiet time of one MSL upon system startup handles the loss of
1524	   connection state in a system crash/restart.  For an explanation, see
1525	   for example "When to Keep Quiet" in the TCP protocol specification
1526	   [RFC0793].  The MSL that is required here does not depend upon the
1527	   transfer speed.  The current TCP MSL of 2 minutes seemed acceptable
1528	   as an operational compromise, when many host systems used to take
1529	   this long to boot after a crash.  Current host systems can boot
1530	   considerably faster.

1532	   The Timestamps option may be used to ease the MSL requirements (or to
1533	   provide additional security against data corruption).  If timestamps
1534	   are being used and if the timestamp clock can be guaranteed to be
1535	   monotonic over a system crash/restart, i.e., if the first value of
1536	   the sender's timestamp clock after a crash/restart can be guaranteed
1537	   to be greater than the last value before the restart, then a quiet
1538	   time is unnecessary.

1540	   To dispense totally with the quiet time would require that the host
1541	   clock be synchronized to a time source that is stable over the crash/
1542	   restart period, with an accuracy of one timestamp clock tick or
1543	   better.  We can back off from this strict requirement to take
1544	   advantage of approximate clock synchronization.  Suppose that the
1545	   clock is always re-synchronized to within N timestamp clock ticks and
1546	   that booting (extended with a quiet time, if necessary) takes more
1547	   than N ticks.  This will guarantee monotonicity of the timestamps,
1548	   which can then be used to reject old duplicates even without an
1549	   enforced MSL.

1551	B.2.  Closing and Reopening a Connection

1553	   When a TCP connection is closed, a delay of 2*MSL in TIME-WAIT state
1554	   ties up the socket pair for 4 minutes (see Section 3.5 of [RFC0793]).
1555	   Applications built upon TCP that close one connection and open a new
1556	   one (e.g., an FTP data transfer connection using Stream mode) must
1557	   choose a new socket pair each time.  The TIME-WAIT delay serves two
1558	   different purposes:

1560	   (a)  Implement the full-duplex reliable close handshake of TCP.

1562	        The proper time to delay the final close step is not really
1563	        related to the MSL; it depends instead upon the RTO for the FIN
1564	        segments and therefore upon the RTT of the path.  (It could be
1565	        argued that the side that is sending a FIN knows what degree of
1566	        reliability it needs, and therefore it should be able to
1567	        determine the length of the TIME-WAIT delay for the FIN's
1568	        recipient.  This could be accomplished with an appropriate TCP
1569	        option in FIN segments.)

1571	        Although there is no formal upper-bound on RTT, common network
1572	        engineering practice makes an RTT greater than 1 minute very
1573	        unlikely.  Thus, the 4 minute delay in TIME-WAIT state works
1574	        satisfactorily to provide a reliable full-duplex TCP close.
1575	        Note again that this is independent of MSL enforcement and
1576	        network speed.

1578	        The TIME-WAIT state could cause an indirect performance problem
1579	        if an application needed to repeatedly close one connection and
1580	        open another at a very high frequency, since the number of
1581	        available TCP ports on a host is less than 2^16.  However, high
1582	        network speeds are not the major contributor to this problem;
1583	        the RTT is the limiting factor in how quickly connections can be
1584	        opened and closed.  Therefore, this problem will be no worse at
1585	        high transfer speeds.

1587	   (b)  Allow old duplicate segments to expire.

1589	        To replace this function of TIME-WAIT state, a mechanism would
1590	        have to operate across connections.  PAWS is defined strictly
1591	        within a single connection; the last timestamp (TS.Recent) is
1592	        kept in the connection control block, and discarded when a
1593	        connection is closed.

1595	        An additional mechanism could be added to the TCP, a per-host
1596	        cache of the last timestamp received from any connection.  This
1597	        value could then be used in the PAWS mechanism to reject old
1598	        duplicate segments from earlier incarnations of the connection,
1599	        if the timestamp clock can be guaranteed to have ticked at least
1600	        once since the old connection was open.  This would require that
1601	        the TIME-WAIT delay plus the RTT together must be at least one
1602	        tick of the sender's timestamp clock.  Such an extension is not
1603	        part of the proposal of this RFC.

1605	        Note that this is a variant on the mechanism proposed by
1606	        Garlick, Rom, and Postel [Garlick77], which required each host
1607	        to maintain connection records containing the highest sequence
1608	        numbers on every connection.  Using timestamps instead, it is
1609	        only necessary to keep one quantity per remote host, regardless
1610	        of the number of simultaneous connections to that host.
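
The per-host cache discussed under (b) is not part of this
specification; the following Python sketch only illustrates the idea.
The class and method names are hypothetical, timestamps are compared
modulo 2^32, and the sketch assumes that the remote host uses a single
timestamp clock across all of its connections (that is, no
per-connection offset), which the text above implicitly requires.

```python
TS_MODULUS = 1 << 32


def ts_lt(a: int, b: int) -> bool:
    """True if timestamp a is older than b in 32-bit modular arithmetic."""
    return 0 < ((b - a) % TS_MODULUS) < (1 << 31)


class HostTimestampCache:
    """One 'last timestamp received' value per remote host."""

    def __init__(self) -> None:
        self._last = {}

    def update(self, host: str, tsval: int) -> None:
        """Record tsval if it is newer than what is cached for host."""
        prev = self._last.get(host)
        if prev is None or ts_lt(prev, tsval):
            self._last[host] = tsval

    def is_old_duplicate(self, host: str, seg_tsval: int) -> bool:
        """PAWS-style test applied across connection incarnations."""
        prev = self._last.get(host)
        return prev is not None and ts_lt(seg_tsval, prev)
```

The cache holds one integer per remote host regardless of how many
connections exist to it, which is the saving over the per-connection
records of [Garlick77] noted above.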

1612	Appendix C.  Summary of Notation

1614	   The following notation has been used in this document.

1616	   Options

1618	      WSopt:            TCP Window Scale Option
1619	      TSopt:            TCP Timestamps option

1621	   Option Fields

1623	      shift.cnt:        Window scale byte in WSopt
1624	      TSval:            32-bit Timestamp Value field in TSopt
1625	      TSecr:            32-bit Timestamp Reply field in TSopt

1627	   Option Fields in Current Segment

1629	      SEG.TSval:        TSval field from TSopt in current segment
1630	      SEG.TSecr:        TSecr field from TSopt in current segment
1631	      SEG.WSopt:        8-bit value in WSopt

1633	   Clock Values

1635	      my.TSclock:       System wide source of 32-bit timestamp values
1636	      my.TSclock.rate:  Period of my.TSclock (1 ms to 1 sec)
1637	      Snd.TSoffset:     An offset for randomizing Snd.TSclock
1638	      Snd.TSclock:      my.TSclock + Snd.TSoffset

1640	   Per-Connection State Variables

1642	      TS.Recent:        Latest received Timestamp
1643	      Last.ACK.sent:    Last ACK field sent
1644	      Snd.TS.OK:        1-bit flag
1645	      Snd.WS.OK:        1-bit flag
1646	      Rcv.Wind.Shift:   Receive window scale exponent
1647	      Snd.Wind.Shift:   Send window scale exponent
1648	      Start.Time:       Snd.TSclock value when segment being timed was
1649	                        sent (used by pre-1323 code).

1651	   Procedure

1653	      Update_SRTT(m)    Procedure to update the smoothed RTT and RTT
1654	                        variance estimates, using the rules of
1655	                        [Jacobson88a], given m, a new RTT measurement

1657	Appendix D.  Event Processing Summary

1659	   OPEN Call

1661	      ...

1663	      An initial send sequence number (ISS) is selected.  Send a <SYN>
1664	      segment of the form:

1666	        <SEQ=ISS><CTL=SYN><TSval=Snd.TSclock><WSopt=Rcv.Wind.Shift>

1668	      ...

1670	   SEND Call

1672	      CLOSED STATE (i.e., TCB does not exist)

1674	         ...

1676	      LISTEN STATE

1678	         If the foreign socket is specified, then change the connection
1679	         from passive to active, select an ISS.  Send a <SYN> segment
1680	         containing the options: <TSval=Snd.TSclock> and
1681	         <WSopt=Rcv.Wind.Shift>.  Set SND.UNA to ISS, SND.NXT to ISS+1.
1682	         Enter SYN-SENT state. ...

1684	      SYN-SENT STATE
1685	      SYN-RECEIVED STATE

1687	         ...

1689	      ESTABLISHED STATE
1690	      CLOSE-WAIT STATE

1692	         Segmentize the buffer and send it with a piggybacked
1693	         acknowledgment (acknowledgment value = RCV.NXT). ...

1695	         If the urgent flag is set ...

1697	         If the Snd.TS.OK flag is set, then include the TCP Timestamps
1698	         option <TSval=Snd.TSclock,TSecr=TS.Recent> in each data
1699	         segment.

1701	         Scale the receive window for transmission in the segment
1702	         header:

1704	                   SEG.WND = (RCV.WND >> Rcv.Wind.Shift).
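
The right shift above, and the matching left shift that the peer
applies on receipt, can be sketched in Python as follows.  The
function names are assumptions of this example, and the clamp to
65535 is an added safeguard for a RCV.WND larger than the shift count
can represent.

```python
WND_FIELD_MAX = 0xFFFF  # the window field in the TCP header is 16 bits


def seg_wnd_from_rcv_wnd(rcv_wnd: int, rcv_wind_shift: int) -> int:
    """SEG.WND = RCV.WND >> Rcv.Wind.Shift, limited to 16 bits."""
    return min(rcv_wnd >> rcv_wind_shift, WND_FIELD_MAX)


def true_window(seg_wnd: int, snd_wind_shift: int) -> int:
    """Window as seen by the peer: SEG.WND << Snd.Wind.Shift."""
    return seg_wnd << snd_wind_shift
```

For example, a receive window of 1000000 bytes with a shift of 7 is
sent as 7812 and reconstructed as 999936 bytes.  The low-order bits
lost in the right shift mean that the advertised window is rounded
down to a multiple of 2^shift.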

1706	   SEGMENT ARRIVES

1708	      ...

1710	      If the state is LISTEN then

1712	         first check for an RST

1714	            ...

1716	         second check for an ACK

1718	            ...

1720	         third check for a SYN

1722	            if the SYN bit is set, check the security.  If the ...

1724	               ...

1726	            if the SEG.PRC is less than the TCB.PRC then continue.

1728	            Check for a Window Scale option (WSopt); if one is found,
1729	            save SEG.WSopt in Snd.Wind.Shift and set Snd.WS.OK flag on.
1730	            Otherwise, set both Snd.Wind.Shift and Rcv.Wind.Shift to
1731	            zero and clear Snd.WS.OK flag.

1733	            Check for a TSopt option; if one is found, save SEG.TSval in
1734	            the variable TS.Recent and turn on the Snd.TS.OK bit.

1736	            Set RCV.NXT to SEG.SEQ+1, IRS is set to SEG.SEQ and any
1737	            other control or text should be queued for processing later.
1738	            ISS should be selected and a <SYN> segment sent of the form:

1740	                    <SEQ=ISS><ACK=RCV.NXT><CTL=SYN,ACK>

1742	            If the Snd.WS.OK bit is on, include a WSopt option
1743	            <WSopt=Rcv.Wind.Shift> in this segment.  If the Snd.TS.OK
1744	            bit is on, include a TSopt <TSval=Snd.TSclock,
1745	            TSecr=TS.Recent> in this segment.  Last.ACK.sent is set to
1746	            RCV.NXT.
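
The option handling for a <SYN> arriving in LISTEN state, as described
in the preceding paragraphs, can be sketched in Python.  The Tcb
class, the dictionary representation of parsed options, and the
function name are assumptions made for illustration; only the state
variable names follow Appendix C.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Tcb:
    rcv_wind_shift: int        # Rcv.Wind.Shift we would like to use
    snd_wind_shift: int = 0    # Snd.Wind.Shift
    snd_ws_ok: bool = False    # Snd.WS.OK
    snd_ts_ok: bool = False    # Snd.TS.OK
    ts_recent: int = 0         # TS.Recent
    last_ack_sent: int = 0     # Last.ACK.sent


def on_syn_in_listen(tcb: Tcb, syn_opts: Dict[str, int],
                     rcv_nxt: int, snd_tsclock: int) -> Dict[str, int]:
    """Update tcb from the <SYN> options; return options for <SYN,ACK>."""
    if "WSopt" in syn_opts:
        tcb.snd_wind_shift = syn_opts["WSopt"]
        tcb.snd_ws_ok = True
    else:
        # Peer did not offer window scaling: disable it both ways.
        tcb.snd_wind_shift = 0
        tcb.rcv_wind_shift = 0
        tcb.snd_ws_ok = False

    if "TSval" in syn_opts:
        tcb.ts_recent = syn_opts["TSval"]
        tcb.snd_ts_ok = True

    reply: Dict[str, int] = {}
    if tcb.snd_ws_ok:
        reply["WSopt"] = tcb.rcv_wind_shift
    if tcb.snd_ts_ok:
        reply["TSval"] = snd_tsclock
        reply["TSecr"] = tcb.ts_recent
    tcb.last_ack_sent = rcv_nxt
    return reply
```

An option is echoed in the <SYN,ACK> only if the peer offered it in
the <SYN>, which is what keeps the extensions compatible with stacks
that do not implement them.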

1748	            SND.NXT is set to ISS+1 and SND.UNA to ISS.  The connection
1749	            state should be changed to SYN-RECEIVED.  Note that any
1750	            other incoming control or data (combined with SYN) will be
1751	            processed in the SYN-RECEIVED state, but processing of SYN
1752	            and ACK should not be repeated.  If the listen was not fully
1753	            specified (i.e., the foreign socket was not fully
1754	            specified), then the unspecified fields should be filled in
1755	            now.

1757	         fourth other text or control

1759	            ...

1761	      If the state is SYN-SENT then

1763	         first check the ACK bit

1765	            ...

1767	         ...

1769	         fourth check the SYN bit

1771	            ...

1773	            If the SYN bit is on and the security/compartment and
1774	            precedence are acceptable then, RCV.NXT is set to SEG.SEQ+1,
1775	            IRS is set to SEG.SEQ, and any segments on the
1776	            retransmission queue which are thereby acknowledged should
1777	            be removed.

1779	            Check for a Window Scale option (WSopt); if it is found,
1780	            save SEG.WSopt in Snd.Wind.Shift; otherwise, set both
1781	            Snd.Wind.Shift and Rcv.Wind.Shift to zero.

1783	            Check for a TSopt option; if one is found, save SEG.TSval in
1784	            variable TS.Recent and turn on the Snd.TS.OK bit in the
1785	            connection control block.  If the ACK bit is set, use
1786	            Snd.TSclock - SEG.TSecr as the initial RTT estimate.
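
The initial RTT estimate is a difference of two 32-bit timestamp
values and so has to be taken modulo 2^32.  A minimal Python sketch,
with an illustrative function name, is:

```python
TS_MODULUS = 1 << 32  # timestamp values are 32 bits wide


def initial_rtt_ticks(snd_tsclock: int, seg_tsecr: int) -> int:
    """Snd.TSclock - SEG.TSecr in timestamp clock ticks, modulo 2^32."""
    return (snd_tsclock - seg_tsecr) % TS_MODULUS
```

The result is in ticks of the local timestamp clock; converting it to
seconds requires the local tick period (my.TSclock.rate), which is
known only to the sender of the original TSval.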

1788	            If SND.UNA > ISS (our <SYN> has been ACKed), change the
1789	            connection state to ESTABLISHED, form an <ACK> segment:

1791	                    <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>

1793	            and send it.  If the Snd.TS.OK bit is on, include a TSopt
1794	            option <TSval=Snd.TSclock,TSecr=TS.Recent> in this <ACK>
1795	            segment.  Last.ACK.sent is set to RCV.NXT.

1797	            Data or controls which were queued for transmission may be
1798	            included.  If there are other controls or text in the
1799	            segment then continue processing at the sixth step below
1800	            where the URG bit is checked, otherwise return.

1802	            Otherwise enter SYN-RECEIVED, form a <SYN,ACK> segment:

1804	                    <SEQ=ISS><ACK=RCV.NXT><CTL=SYN,ACK>

1806	            and send it.  If the Snd.TS.OK bit is on, include a TSopt
1807	            option <TSval=Snd.TSclock,TSecr=TS.Recent> in this segment.
1808	            If the Snd.WS.OK bit is on, include a WSopt option
1809	            <WSopt=Rcv.Wind.Shift> in this segment.  Last.ACK.sent is
1810	            set to RCV.NXT.

1812	            If there are other controls or text in the segment, queue
1813	            them for processing after the ESTABLISHED state has been
1814	            reached, return.

1816	         fifth, if neither of the SYN or RST bits is set then drop the
1817	         segment and return.

1819	      Otherwise,

1821	      First, check sequence number

1823	         SYN-RECEIVED STATE
1824	         ESTABLISHED STATE
1825	         FIN-WAIT-1 STATE
1826	         FIN-WAIT-2 STATE
1827	         CLOSE-WAIT STATE
1828	         CLOSING STATE
1829	         LAST-ACK STATE
1830	         TIME-WAIT STATE

1832	            Segments are processed in sequence.  Initial tests on
1833	            arrival are used to discard old duplicates, but further
1834	            processing is done in SEG.SEQ order.  If a segment's
1835	            contents straddle the boundary between old and new, only the
1836	            new parts should be processed.

1838	            Rescale the received window field:

1840	                  TrueWindow = SEG.WND << Snd.Wind.Shift,

1842	            and use "TrueWindow" in place of SEG.WND in the following
1843	            steps.
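The rescaling step amounts to a single left shift.  A minimal sketch,
assuming a hypothetical helper name and a 16-bit window field:

```python
def true_window(seg_wnd, snd_wind_shift):
    """Scale the 16-bit window field from the segment header."""
    return (seg_wnd & 0xFFFF) << snd_wind_shift
```

For example, true_window(2, 7) gives 256, the value used in the
window retraction example of Appendix F.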

1845	            Check whether the segment contains a Timestamp Option and
1846	            bit Snd.TS.OK is on.  If so:

1848	               If SEG.TSval < TS.Recent and the RST bit is off, then
1849	               test whether connection has been idle less than 24 days;
1850	               if all are true, then the segment is not acceptable;
1851	               follow steps below for an unacceptable segment.

1853	               If SEG.SEQ is less than or equal to Last.ACK.sent, then
1854	               save SEG.TSval in variable TS.Recent.
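The two timestamp checks above might be sketched as follows.  The
function names are hypothetical, the idle time is assumed to be
supplied by the caller in seconds, and both timestamps and sequence
numbers are compared modulo 2**32.

```python
MOD32 = 1 << 32
HALF32 = 1 << 31
IDLE_LIMIT_SECONDS = 24 * 24 * 60 * 60  # 24 days


def mod32_lt(a, b):
    """True if a precedes b in 32-bit modular arithmetic."""
    return ((a - b) % MOD32) >= HALF32


def paws_rejects(seg_tsval, ts_recent, rst_bit, idle_seconds):
    """True if the segment is not acceptable under the PAWS test."""
    return (mod32_lt(seg_tsval, ts_recent)
            and not rst_bit
            and idle_seconds < IDLE_LIMIT_SECONDS)


def next_ts_recent(seg_seq, seg_tsval, last_ack_sent, ts_recent):
    """New TS.Recent for a segment that passed the PAWS test."""
    # SEG.SEQ <= Last.ACK.sent, i.e. Last.ACK.sent does not precede SEG.SEQ
    if not mod32_lt(last_ack_sent, seg_seq):
        return seg_tsval
    return ts_recent
```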

1856	            There are four cases for the acceptability test for an
1857	            incoming segment:

1859	               ...

1861	            If an incoming segment is not acceptable, an acknowledgment
1862	            should be sent in reply (unless the RST bit is set, if so
1863	            drop the segment and return):

1865	                    <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>

1867	            If the Snd.TS.OK bit is on, include the Timestamps option
1868	            <TSval=Snd.TSclock,TSecr=TS.Recent> in this <ACK> segment.
1869	            Set Last.ACK.sent to SEG.ACK of the acknowledgment and send
1870	            the <ACK> segment.  After sending the acknowledgment, drop
1871	            the unacceptable segment and return.

1874	      ...

1876	      fifth check the ACK field.

1878	         if the ACK bit is off drop the segment and return.

1880	         if the ACK bit is on

1882	            ...

1884	            ESTABLISHED STATE

1886	               If SND.UNA < SEG.ACK <= SND.NXT then, set SND.UNA <-
1887	               SEG.ACK.  Also compute a new estimate of round-trip time.
1888	               If Snd.TS.OK bit is on, use Snd.TSclock - SEG.TSecr;
1889	               otherwise use the elapsed time since the first segment in
1890	               the retransmission queue was sent.  Any segments on the
1891	               retransmission queue which are thereby entirely
1892	               acknowledged...
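A minimal sketch of the ACK test and the choice of RTT sample
described above.  The helper names are hypothetical, and the sketch
assumes the timestamp clock and the retransmission timer use the same
time unit.

```python
MOD32 = 1 << 32


def ack_is_new(snd_una, seg_ack, snd_nxt):
    """SND.UNA < SEG.ACK <= SND.NXT, evaluated modulo 2**32."""
    return 0 < (seg_ack - snd_una) % MOD32 <= (snd_nxt - snd_una) % MOD32


def rtt_sample(snd_ts_ok, snd_tsclock, seg_tsecr, now, first_unacked_sent_at):
    """RTT sample for an ACK that advances SND.UNA."""
    if snd_ts_ok:
        # Timestamps in use: Snd.TSclock - SEG.TSecr
        return (snd_tsclock - seg_tsecr) % MOD32
    # Otherwise: elapsed time since the first queued segment was sent
    return now - first_unacked_sent_at
```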

1894	      ...

1896	      Seventh, process the segment text.

1898	         ESTABLISHED STATE
1899	         FIN-WAIT-1 STATE
1900	         FIN-WAIT-2 STATE
1901	            ...

1903	            Send an acknowledgment of the form:

1905	                    <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>

1907	            If the Snd.TS.OK bit is on, include Timestamp Option
1908	            <TSval=Snd.TSclock,TSecr=TS.Recent> in this <ACK> segment.
1909	            Set Last.ACK.sent to SEG.ACK of the acknowledgment, and send
1910	            it.  This acknowledgment should be piggy-backed on a segment
1911	            being transmitted if possible without incurring undue delay.

1913	            ...

1915	Appendix E.  Timestamps Edge Cases

1917	   While the rules laid out for when to calculate RTTM produce the
1918	   correct results most of the time, there are some edge cases where an
1919	   incorrect RTTM can be calculated.  All of these situations involve
1920	   the loss of segments.  It is felt that these scenarios are rare, and
1921	   that if they should happen, they will cause a single RTTM measurement
1922	   to be inflated, which mitigates its effects on RTO calculations.

1924	   [Martin03] cites two similar cases when the returning <ACK> is lost,
1925	   and before the retransmission timer fires, another returning <ACK>
1926	   segment arrives, which acknowledges the data.  In this case, the RTTM
1927	   calculated will be inflated:

1929	           clock
1930	             tc=1   <A, TSval=1> ------------------->

1932	             tc=2   (lost) <---- <ACK(A), TSecr=1, win=n>
1933	                 (RTTM would have been 1)

1935	                    (receive window opens, window update is sent)
1936	             tc=5        <---- <ACK(A), TSecr=1, win=m>
1937	                    (RTTM is calculated at 4)

1939	   One thing to note about this situation is that it is somewhat bounded
1940	   by RTO + RTT, limiting how far off the RTTM calculation will be.
1941	   While more complex scenarios can be constructed that produce larger
1942	   inflations (e.g., retransmissions are lost), those scenarios involve
1943	   multiple segment losses, and the connection will have other more
1944	   serious operational problems than using an inflated RTTM in the RTO
1945	   calculation.
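The arithmetic of the diagram above can be reproduced in a few lines.
The names are illustrative only; the clock values are those shown in
the diagram.

```python
def rttm(tc_now, tsecr):
    """RTTM sample: current timestamp clock minus the echoed TSval."""
    return tc_now - tsecr


tsval_a = 1                        # <A, TSval=1> sent at tc=1
lost_ack = rttm(2, tsval_a)        # sample the lost <ACK> would have given
window_update = rttm(5, tsval_a)   # sample taken from the window update
inflation = window_update - lost_ack
```

Here the sample taken from the window update (4) exceeds the sample
the lost <ACK> would have produced (1) by 3 clock ticks.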

1947	Appendix F.  Window Retraction Example

1949	   Consider an established TCP connection using a scale factor of 128,
1950	   Snd.Wind.Shift=7 and Rcv.Wind.Shift=7, that is running with a very
1951	   small window because the receiver is bottlenecked and both ends are
1952	   doing small reads and writes.

1954	   Consider the ACKs coming back:

1956	   SEG.ACK  SEG.WIN computed SND.WIN   receiver's actual window
1957	   1000     2       1256               1300

1959	   The sender writes 40 bytes and receiver ACKs:

1961	   1040     2       1296               1300

1963	   The sender writes 5 additional bytes and the receiver has a problem.
1964	   Two choices:

1966	   1045     2       1301               1300   - BEYOND BUFFER

1968	   1045     1       1173               1300   - RETRACTED WINDOW

1970	   This is a general problem and can happen any time the sender does a
1971	   write which is smaller than the window scale factor.
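The rows above can be reproduced with a short sketch.  It assumes that
the "computed SND.WIN" and "receiver's actual window" columns are
right window edges in sequence space (SEG.ACK plus the scaled window),
which is consistent with the numbers shown; the function name and the
round_up switch are hypothetical.

```python
def right_edges(acks, buffer_edge, shift, round_up=False):
    """Return (SEG.WIN, advertised right edge) for each SEG.ACK."""
    edges = []
    for seg_ack in acks:
        avail = buffer_edge - seg_ack          # bytes the receiver can accept
        if round_up:
            seg_win = (avail + (1 << shift) - 1) >> shift
        else:
            seg_win = avail >> shift
        edges.append((seg_win, seg_ack + (seg_win << shift)))
    return edges
```

Rounding down for SEG.ACK values 1000, 1040 and 1045 yields right
edges 1256, 1296 and 1173, so the third advertisement retracts the
window.  Rounding up at 1045 yields 1301, one byte beyond the
receiver's buffer.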

1973	   In most stacks it is at least partially obscured when the window size
1974	   is larger than some small number of segments because the stacks
1975	   prefer to announce windows that are an integral number of segments,
1976	   rounded up to the next scale factor.  This plus silly window
1977	   suppression tends to cause less frequent, larger window updates.  If
1978	   the window were rounded down to a segment size, there would be more
1979	   opportunity to advance the window (the BEYOND BUFFER case above)
1980	   rather than retracting it.

1982	Appendix G.  RTO calculation modification

1984	   Taking multiple RTT samples per window would shorten the history
1985	   calculated by the RTO mechanism in [RFC6298]; the algorithm below
1986	   aims to maintain a history similar to the one originally intended by
1987	   [RFC6298].

1989	   It is roughly known how many samples a congestion window's worth of
1990	   data will yield, not accounting for ACK compression and ACK losses.
1991	   Such events will result in more history of the path being reflected
1992	   in the final value for RTO, and are not critical.  This modification
1993	   will ensure that a similar amount of time is taken into account for
1994	   the RTO estimation, regardless of how many samples are taken per
1995	   window:

1997	      ExpectedSamples = ceiling(FlightSize / (SMSS * 2))

1999	      alpha' = alpha / ExpectedSamples

2001	      beta' = beta / ExpectedSamples

2003	   Note that the factor 2 in ExpectedSamples is due to "Delayed ACKs".

2005	   Instead of using alpha and beta in the algorithm of [RFC6298], use
2006	   alpha' and beta':

2008	      RTTVAR <- (1 - beta') * RTTVAR + beta' * |SRTT - R'|

2010	      SRTT <- (1 - alpha') * SRTT + alpha' * R'

2012	      (for each sample R')
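A minimal sketch of the modified update, assuming the [RFC6298] gains
alpha = 1/8 and beta = 1/4.  The function names are hypothetical, and
the max(1, ...) guard for an empty flight is an addition that is not
part of the formulas above.

```python
import math

ALPHA = 1 / 8   # [RFC6298] gain for SRTT
BETA = 1 / 4    # [RFC6298] gain for RTTVAR


def expected_samples(flight_size, smss):
    """ceiling(FlightSize / (SMSS * 2)), never less than 1."""
    # max(1, ...) is an added guard (assumption) for FlightSize == 0
    return max(1, math.ceil(flight_size / (smss * 2)))


def update_rtt(srtt, rttvar, r, flight_size, smss):
    """Apply one RTT sample r using the scaled gains alpha' and beta'."""
    n = expected_samples(flight_size, smss)
    alpha_p = ALPHA / n
    beta_p = BETA / n
    # RTTVAR is updated first, using the SRTT from before this sample
    rttvar = (1 - beta_p) * rttvar + beta_p * abs(srtt - r)
    srtt = (1 - alpha_p) * srtt + alpha_p * r
    return srtt, rttvar
```

With a FlightSize of at most 2 * SMSS, ExpectedSamples is 1 and the
update is identical to the unmodified [RFC6298] computation.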

2014	Appendix H.  Changes from RFC 1323

2016	   Several important updates and clarifications to the specification in
2017	   RFC 1323 are made in this document.  The technical changes are
2018	   summarized below:

2020	   (a)  A wrong reference to SND.WND was corrected to SEG.WND in
2021	        Section 2.3.

2023	   (b)  Section 2.4 was added describing the unavoidable window
2024	        retraction issue, and explicitly describing the mitigation steps
2025	        necessary.

2027	   (c)  In Section 3.2, the text describing how the Timestamps option
2028	        negotiation is to be performed was updated with RFC2119 wording.
2029	        Further, a number of paragraphs were added to clarify the
2030	        expected behavior of a compliant implementation using TSopt,
2031	        as RFC1323 left room for interpretation, e.g., potential late
2032	        enablement of TSopt.

2034	   (d)  The description of which TSecr values can be used to update the
2035	        measured RTT has been clarified.  Specifically, with timestamps,
2036	        the Karn algorithm [Karn87] is disabled.  The Karn algorithm
2037	        disables all RTT measurements during retransmission, since it is
2038	        ambiguous whether the <ACK> is for the original segment, or the
2039	        retransmitted segment.  With timestamps, that ambiguity is
2040	        removed since the TSecr in the <ACK> will contain the TSval from
2041	        whichever data segment made it to the destination.

2043	   (e)  RTTM update processing explicitly excludes segments not updating
2044	        SND.UNA.  The original text could be interpreted to allow taking
2045	        RTT samples when SACK acknowledges some new, non-continuous
2046	        data.

2048	   (f)  In RFC1323, section 3.4, step (2) of the algorithm to control
2049	        which timestamp is echoed was incorrect in two regards:

2051	        (1)  It failed to update TS.Recent for a retransmitted segment
2052	             that resulted from a lost <ACK>.

2054	        (2)  It failed if SEG.LEN = 0.

2056	        In the new algorithm, the case of SEG.TSval >= TS.Recent is
2057	        included for consistency with the PAWS test.

2059	   (g)  It is now recommended that the Timestamps option is included in
2060	        <RST> segments if the incoming segment contained a Timestamps
2061	        option.

2063	   (h)  <RST> segments are explicitly excluded from PAWS processing.

2065	   (i)  Added text to clarify the precedence between regular TCP
2066	        [RFC0793] and this document's Timestamps option/PAWS processing.
2067	        Discussions about combined acceptability checks are ongoing.

2069	   (j)  Snd.TSoffset and Snd.TSclock variables have been added.
2070	        Snd.TSclock is the sum of my.TSclock and Snd.TSoffset.  This
2071	        allows the starting points for timestamp values to be randomized
2072	        on a per-connection basis.  Setting Snd.TSoffset to zero yields
2073	        the same results as [RFC1323].  Text was added to guide
2074	        implementers to the proper selection of these offsets, as
2075	        entirely random offsets for each new connection will conflict
2076	        with PAWS.

2078	   (k)  Appendix A has been expanded with information about the TCP
2079	        Urgent Pointer.  An earlier revision contained text around the
2080	        TCP MSS option, which was split off into [RFC6691].

2082	   (l)  One correction was made to the Event Processing Summary in
2083	        Appendix D.  In SEND CALL/ESTABLISHED STATE, RCV.WND is used to
2084	        fill in the SEG.WND value, not SND.WND.

2086	   (m)  Appendix G was added to exemplify how an RTO calculation might
2087	        be updated to properly take the much higher RTT sampling
2088	        frequency enabled by the Timestamps option into account.

2090	   Editorial changes to the document that do not impact the
2091	   implementation or function of the mechanisms described in this
2092	   document include:

2094	   (a)  Removed much of the discussion in Section 1 to streamline the
2095	        document.  However, detailed examples and discussions in
2096	        Section 2, Section 3 and Section 5 are kept as guidelines for
2097	        implementers.

2099	   (b)  Added short text that the use of WS increases the chances of
2100	        sequence number wrap, thus the PAWS mechanism is required in
2101	        certain environments.

2103	   (c)  Removed references to "new" options, as the options were
2104	        introduced in [RFC1323] already.  Changed the text in
2105	        Section 1.3 to specifically address TS and WS options.

2107	   (d)  Section 1.4 was added for [RFC2119] wording.  Normative text was
2108	        updated with the appropriate phrases.

2110	   (e)  Added < > brackets to mark specific types of segments, and
2111	        replaced most occurrences of "packet" with "segment", where TCP
2112	        segments are referred to.

2114	   (f)  Updated the text in Section 3 to take into account what has been
2115	        learned since [RFC1323].

2117	   (g)  Removed the list of changes between [RFC1323] and prior
2118	        versions.  These changes are mentioned in Appendix C of
2119	        [RFC1323].

2121	   (h)  Moved the appendix "Changes from RFC 1323" to the end of the
2122	        appendices for easier lookup.  In addition, the entries were
2123	        split into a technical and an editorial part, and sorted to
2124	        roughly correspond with the sections in the text where they
2125	        apply.

2127	Authors' Addresses

2129	   David Borman
2130	   Quantum Corporation
2131	   Mendota Heights  MN 55120
2132	   USA

2134	   Email: david.borman@quantum.com

2136	   Bob Braden
2137	   University of Southern California
2138	   4676 Admiralty Way
2139	   Marina del Rey  CA 90292
2140	   USA

2142	   Email: braden@isi.edu

2144	   Van Jacobson
2145	   Google, Inc.
2146	   1600 Amphitheatre Parkway
2147	   Mountain View  CA 94043
2148	   USA

2150	   Email: vanj@google.com

2152	   Richard Scheffenegger (editor)
2153	   NetApp, Inc.
2154	   Am Euro Platz 2
2155	   Vienna,   1120
2156	   Austria

2158	   Email: rs@netapp.com