TCP Maintenance and Minor Extensions (tcpm)             R. Scheffenegger
Internet-Draft                                              NetApp, Inc.
Intended status: Experimental                              M. Kuehlewind
Expires: January 16, 2014                        University of Stuttgart
                                                             B. Trammell
                                                              ETH Zurich
                                                           July 15, 2013

        Encoding of Time Intervals for the TCP Timestamp Option
             draft-trammell-tcpm-timestamp-interval-01.txt

Abstract

   The TCP Timestamp option would be useful for additional measurements
   if it could be assumed that the interval between ticks of the
   timestamp clock is regular, and if that interval were known.  In
   practice, many implementations do use a timestamp clock source that
   has a regular interval.
   This draft specifies a compact encoding for exposing the timestamp
   interval to a receiver, and discusses applications therefor.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 16, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Timestamp interval exposure
     3.1.  Interval encoding requirements
     3.2.  Interval encoding specification
     3.3.  Timestamp Interval experimental TCP option
   4.  Guidelines for defined-interval timestamp export
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Methodology for one-way delay variation measurement
                using known timestamp intervals
   Authors' Addresses

1.  Introduction

   The Timestamp option originally introduced in [RFC1323] was designed
   to support only two very specific mechanisms, round trip time
   measurement (RTTM) and protection against wrapped sequence numbers
   (PAWS), assuming a particular TCP algorithm (Reno).

   While [RFC1323] specifies only that timestamps "must be at least
   approximately proportional to real time" to support RTTM, many
   implementations generate timestamp values from a regular timing
   source.  Determining the real-time interval represented by a single
   tick makes additional measurements possible.  In addition to easing
   passive measurements using the timestamp option, it also makes
   possible the measurement of inter-departure time; the comparison of
   inter-departure time to inter-arrival time can be used for one-way
   delay variation measurement, useful for congestion control
   algorithms as well as in QoS applications.

   This document specifies a compact encoding for timestamp intervals
   which can be exported via any number of mechanisms, either through a
   new TCP option, by piggybacking on the timestamp option as in
   [I-D.scheffenegger-tcpm-timestamp-negotiation], or through other in-
   or out-of-band means.  This document specifies an experimental TCP
   option for experiments with interval exposure separate from any other
   mechanism.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Terms defined in [RFC1323] are used in this document as defined
   there.

   This document defines the following additional term:

   Timestamp interval
      The interval between two ticks of the timestamp clock source
      running at a constant frequency.  Note that the timestamp clock is
      not required to be identical with the TCP clock, even though most
      implementations use the same clock for practical purposes.

3.  Timestamp interval exposure

   This section describes the requirements for interval encoding, then
   specifies an encoding that meets these requirements, based on a
   16-bit reduced-precision encoding of a 42-bit fixed-point unsigned
   integer.

3.1.  Interval encoding requirements

   The choice of a timestamp interval is generally
   implementation-specific, and there are a small number of commonly
   chosen intervals.  However, a general solution must support not only
   common cases but also uncommon ones, and provide future flexibility
   to allow an implementation to dynamically choose new timestamp
   intervals for new sockets, based on network conditions and specific
   requirements for timestamp measurements.

   There are some sensible bounds on the range of timestamp intervals
   that must be reasonably supported.  The minimum inter-packet interval
   for 64-byte packets (i.e., back-to-back ACK segments) on a future 400
   Gigabit Ethernet would be about 1 ns; smaller intervals need not be
   supported with current technology, even for applications for which a
   unique timestamp for every packet would be useful.  On the other side
   of the scale, low-bandwidth, high-latency links may operate with
   timestamp intervals on the order of seconds.

   The precision required by timestamp interval export, on the other
   hand, is determined by the applications for which the information
   will be used and the precision of the underlying clock source.  As
   many clock sources may provide less than maximum precision (due to
   e.g. interrupt jitter), there should be some way to represent
   variable precision.

   As a timestamp interval will need to be bound to a connection in-band
   at runtime, a space-efficient encoding is necessary.

   These requirements indicate a reduced-precision encoding of a
   fixed-point interval, expressed in seconds, as described in the next
   subsection.

3.2.  Interval encoding specification

   A 42-bit fixed-point unsigned integer with 4 bits before the binary
   point and 38 bits after, expressed in seconds, is sufficient to
   encode an interval range from just under 16 seconds (0x3ff ffff ffff)
   down to 2^-38 s or 3.64 ps (0x000 0000 0001), meeting the range
   requirement.  Sufficient precision for the applications envisioned by
   this document is provided by exporting just the 11 most significant
   bits of the interval value (here, the "value"), coupled with a 5-bit
   "scale" which locates the least significant bit of the value within
   the larger field: a scale of 31 places the value field between bits
   41 and 31 inclusive of the fixed-point integer for the largest
   intervals, while a scale of 0 places the value field between bits 10
   and 0 inclusive.  By using a scale such that the most significant bit
   of the value is not 1, fewer than 11 bits of precision can be
   signaled as well; implementations SHOULD NOT represent more
   precision in an exported timestamp interval than they actually
   support.  Full precision export is available down to 2^-27 s (or 7.45
   ns) with diminishing precision down to 3.64 ps.
   This arrangement therefore allows the representation of timestamp
   intervals over 13 orders of magnitude and 11 bits of precision with
   only two octets.  The details of this encoding are illustrated in
   Figure 1.

      MSb                                          LSb
      41   37     31       23       15       7       0
      +----+------+--------+--------+--------+-------+
      | int|                  frac                   |  full value
      +----+------+--------+--------+--------+-------+
                 /                \
               +-+                 \
              /                     \
             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             |  scale  |        value        |  encoded interval
             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             15      11 10                  0

    Figure 1: Timestamp interval encoding using scaled fixed-point
                                integer

   This encoded 16-bit interval is then exported for a given connection
   as a standalone TCP option or as part of the extended timestamp
   negotiation described in the following subsections.

   A sender explicitly signals that it uses an irregular timestamp clock
   by sending zero for both scale and value (i.e., 0x0000).
   Combinations of a value of zero and a non-zero scale are reserved for
   future use.  These values MUST NOT be sent as a timestamp interval,
   and SHOULD presently be interpreted by the receiver as exposing an
   irregular timestamp clock.

   For implementations that support only a single timestamp interval for
   all flows in all situations, the encoded interval can be implemented
   as a constant.  Encodings for common timestamp intervals with maximum
   precision are given in Table 1.  Encodings for 9-bit precision, the
   maximum available from common software interrupt clock sources, are
   given in Table 2.
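
   The encoding described in this subsection can be sketched in a few
   lines of code.  The following is a minimal, non-normative
   illustration, not a reference implementation.  The function names
   (encode_interval, decode_interval) and the precision_bits parameter
   are assumptions made for this example; the choice to round to the
   nearest representable value, to saturate at the largest code point,
   and to signal reduced precision by dropping low-order bits of the
   full-precision value is likewise an assumption, chosen because it
   agrees with entries such as 1 s (0xe400) and 100 ms (0xc666) in
   Table 1.

```python
from fractions import Fraction

FRAC_BITS = 38                        # bits after the binary point (units of 2^-38 s)
VALUE_BITS = 11                       # width of the exported "value" field
SCALE_BITS = 5                        # width of the exported "scale" field
MAX_VALUE = (1 << VALUE_BITS) - 1     # 0x7ff
MAX_SCALE = (1 << SCALE_BITS) - 1     # 31


def encode_interval(seconds, precision_bits=VALUE_BITS):
    """Encode an interval in seconds as a 16-bit (scale << 11 | value) word.

    precision_bits (hypothetical parameter) is the number of significant
    value bits to signal; 11 means full precision.
    """
    if not 1 <= precision_bits <= VALUE_BITS:
        raise ValueError("precision_bits must be between 1 and 11")
    fixed = Fraction(seconds) * (1 << FRAC_BITS)   # exact count of 2^-38 s units
    scale = 0
    value = round(fixed)
    # Find the smallest scale at which the rounded value fits in 11 bits.
    while value > MAX_VALUE and scale < MAX_SCALE:
        scale += 1
        value = round(fixed / (1 << scale))
    value = min(value, MAX_VALUE)                  # saturate just under 16 s
    # Signal reduced precision by dropping low-order bits, which leaves
    # leading zero bits in the value field and raises the scale.
    drop = VALUE_BITS - precision_bits
    scale += drop
    value >>= drop
    if value == 0 or scale > MAX_SCALE:
        raise ValueError("interval not representable at this precision")
    return (scale << VALUE_BITS) | value


def decode_interval(encoded):
    """Return the interval in seconds as a Fraction, or None if irregular."""
    scale = encoded >> VALUE_BITS
    value = encoded & MAX_VALUE
    if value == 0:
        # 0x0000 signals an irregular clock; a zero value with a non-zero
        # scale is reserved and is treated the same way here.
        return None
    return Fraction(value << scale, 1 << FRAC_BITS)


if __name__ == "__main__":
    print(hex(encode_interval(1)))        # 0xe400
    print(hex(encode_interval(1, 9)))     # 0xf100
    print(decode_interval(0xe400))        # 1
```

   Exact rational arithmetic (Fraction) is used here only to keep the
   sketch free of floating-point edge cases; an implementation with a
   single fixed interval would simply use a precomputed constant.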

          +----------+-----------+-------+-------+----------+
          | interval | frequency | scale | value | combined |
          +----------+-----------+-------+-------+----------+
          |   16 s   |  0.06 Hz  |  0x1f | 0x7ff |  0xffff  |
          |   1 s    |    1 Hz   |  0x1c | 0x400 |  0xe400  |
          |  0.5 s   |    2 Hz   |  0x1b | 0x400 |  0xdc00  |
          |  100 ms  |   10 Hz   |  0x18 | 0x666 |  0xc666  |
          |  10 ms   |   100 Hz  |  0x15 | 0x51f |  0xad1f  |
          |   4 ms   |   250 Hz  |  0x14 | 0x419 |  0xa419  |
          |   1 ms   |   1 kHz   |  0x12 | 0x419 |  0x9419  |
          |  200 us  |   5 kHz   |  0x0f | 0x68e |  0x7e8e  |
          |  50 us   |   20 kHz  |  0x0d | 0x68e |  0x6e8e  |
          |   1 us   |   1 MHz   |  0x08 | 0x432 |  0x4432  |
          |  60 ns   |  16.7 MHz |  0x04 | 0x407 |  0x2407  |
          |   none   |  -------- |  0x00 | 0x000 |  0x0000  |
          +----------+-----------+-------+-------+----------+

     Table 1: Encodings for common timestamp intervals with maximum
                               precision

          +----------+-----------+-------+-------+----------+
          | interval | frequency | scale | value | combined |
          +----------+-----------+-------+-------+----------+
          |  1.0 s   |    1 Hz   |  0x1e | 0x100 |  0xf100  |
          |  0.5 s   |    2 Hz   |  0x1d | 0x100 |  0xe900  |
          |  100 ms  |   10 Hz   |  0x1a | 0x199 |  0xd199  |
          |  10 ms   |   100 Hz  |  0x17 | 0x147 |  0xb947  |
          |   4 ms   |   250 Hz  |  0x16 | 0x106 |  0xb106  |
          |   1 ms   |   1 kHz   |  0x14 | 0x106 |  0xa106  |
          |  200 us  |   5 kHz   |  0x11 | 0x1a3 |  0x89a3  |
          |  50 us   |   20 kHz  |  0x0f | 0x1a3 |  0x79a3  |
          |   1 us   |   1 MHz   |  0x0a | 0x10c |  0x510c  |
          |  60 ns   |  16.7 MHz |  0x06 | 0x101 |  0x3101  |
          |   none   |  -------- |  0x00 | 0x000 |  0x0000  |
          +----------+-----------+-------+-------+----------+

      Table 2: Encodings for common timestamp intervals with 9-bit
                               precision

3.3.  Timestamp Interval experimental TCP option

   This section specifies an experimental TCP option, using an ExID and
   magic number as described in [I-D.ietf-tcpm-experimental-options],
   for exporting timestamp intervals.
   This option MAY appear in any TCP segment after the SYN segment to
   advertise the sender's timestamp interval, encoded as in Section 3.2
   above.  If the receiver uses timestamp interval information, it
   stores the interval for the duration of the connection, or until a
   subsequent Timestamp Interval option is received.  The receiver may
   assume the Timestamp Interval is applicable from the point of receipt
   of the option; i.e., that all subsequently received segments with the
   same or a later sequence number than the segment containing the
   option carry timestamps with the stated interval.

   If a sender has previously sent a timestamp interval to a receiver,
   and changes the timestamp interval on the connection, it MUST send a
   new Timestamp Interval option.

   This option MUST NOT appear in a segment that does not also carry a
   TCP Timestamp option.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Kind = 253   |  Length = 8   |         ExID = 0x75ec         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      more magic = 0xffee      |  encoded advertised interval  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 2: Structure of Timestamp Interval Experimental TCP option for
                            interval export

   Should timestamp interval exposure prove useful, and a separate TCP
   Option be chosen as the preferred method to send it in-band, this
   option would have a length of 4, and the use of an ExID and magic
   number would preserve word alignment in implementations transitioning
   from experimental to production TCP Option usage.

4.  Guidelines for defined-interval timestamp export

   As noted above, implementations SHOULD NOT indicate more precision
   than they support.  As common software interrupt clock sources
   provide about 9 bits of precision, these should be indicated with 2
   leading zero bits in the value field.  Low variance software clocks
   (e.g.
   CPU cycle counters) should be indicated with a single leading zero
   bit, and hardware injecting the timestamp into the header with high
   precision should use the full precision.  Similarly, if the clock
   source exhibits a very high variability (e.g. when running in a
   virtualized environment), 3 or more leading zeros should be used in
   the value field.

   Timestamp intervals faster than about 1 ms SHOULD be implemented by
   inserting the timestamp "late" before transmitting a segment to avoid
   unnecessary timing jitter.

   Intervals on the order of 1 us or less are intended for use with
   hardware-assisted implementations, e.g. direct use of a (shifted) CPU
   cycle counter as clock source.

5.  IANA Considerations

   This document uses the Experimental Option Experiment Identifier
   (ExID) 0x75ec ffee to identify the Timestamp Interval experimental
   option in Section 3.3; an application for this codepoint in the IANA
   TCP Experimental Option ExID registry has already been submitted.

6.  Security Considerations

   [EDITOR'S NOTE: discuss implications of misuse -- what can I break by
   sending a bad interval?]

7.  References

7.1.  Normative References

   [I-D.ietf-tcpm-experimental-options]
              Touch, J., "Shared Use of Experimental TCP Options",
              draft-ietf-tcpm-experimental-options-06 (work in
              progress), June 2013.

   [I-D.scheffenegger-tcpm-timestamp-negotiation]
              Scheffenegger, R., Kuehlewind, M., and B. Trammell,
              "Additional negotiation in the TCP Timestamp Option field
              during the TCP handshake",
              draft-scheffenegger-tcpm-timestamp-negotiation-05 (work in
              progress), October 2012.

   [RFC1323]  Jacobson, V., Braden, B., and D. Borman, "TCP Extensions
              for High Performance", RFC 1323, May 1992.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

7.2.  Informative References

   [Chirp]    Kuehlewind, M. and B.
              Briscoe, "Chirping for Congestion Control - Implementation
              Feasibility", Nov 2010, .

   [I-D.ietf-ledbat-congestion]
              Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
              "Low Extra Delay Background Transport (LEDBAT)",
              draft-ietf-ledbat-congestion-10 (work in progress),
              September 2012.

Appendix A.  Methodology for one-way delay variation measurement using
             known timestamp intervals

   New congestion control algorithms are currently being proposed that
   react to the measured one-way delay variation (see
   [I-D.ietf-ledbat-congestion], [Chirp]).  This control variable is
   updated after each received ACK.

      C(t) = TSval(t) - TSecr(t)

      V(t) = C(t) - C(t-1)

   provided that the timestamp clocks at both ends are running at
   roughly the same rate.  Without prior knowledge of the timestamp
   clock interval used by the partner, a sender can try to learn this
   interval by observing the exchanged segments for a duration of a few
   RTTs.  However, such a scheme fails if the partner uses some form of
   implicit integrity check of the timestamp values, which would appear
   as either random scrambling of the least significant bits in the
   timestamp, or give the impression of much shorter clock intervals
   than what is actually used.  If the partner uses some form of segment
   counting as timestamp value, without any direct relationship to the
   wall-clock time, the above formula will fail to yield meaningful
   results.  Finally, the network conditions need to remain stable
   during any such training phase, so that the sender can arrive at
   reasonable estimates of the partner's timestamp clock tick duration.

   [EDITOR'S NOTE: the following refers to a mask field which doesn't
   exist anymore, needs a rewrite.  Shouldn't we define C(t) = (TSecr(t)
   - TSval(t)) * (TSinterval(remote) / TSinterval(local))?]

   This note addresses these concerns by providing a means by which both
   hosts are required to use a timestamp clock that is closely related
   to the wall-clock time, with known clock rate, and also provides
   means by which a host can signal the use of a few least significant
   bits for timestamp value integrity checks.  To arrive at a valid
   one-way delay (OWD) variation, first the timestamp received from the
   partner has to be right-shifted by a known number of bits as defined
   by the mask field.  Next the local and remote timestamp values need
   to be normalized to a common base clock interval (typically, the
   local clock interval):

                                                         remote interval
   C  = (TSecr >> local mask) - (TSval >> remote mask) * ---------------
    t                                                    local interval

      V(t) = C(t) - C(t-1)

   [EDITOR'S NOTE: the following refers to field definitions from the
   old TS nego draft; needs a rewrite.]

   The adjustment factor can be calculated once during the timestamp
   capability negotiation phase, and pure integer arithmetic can be used
   during per-segment processing:

      EXP.min = min(EXP.loc, EXP.rem)

      EXP.rem -= EXP.min

      EXP.loc -= EXP.min

      FRAC.rem = (0x800 | FRAC.rem) << EXP.rem

      FRAC.loc = (0x800 | FRAC.loc) << EXP.loc

   and assuming that the local clock tick duration is lower

      ADJ = FRAC.rem / FRAC.loc

   with ADJ being an integer variable.  For higher precision, two
   appropriately calculated integers can be used.

   Any previously required training on the remote clock interval can be
   removed, resulting in a simpler and more dependable algorithm.
   Furthermore, transient network effects during the training phase
   which may result in a wrong inference of the remote clock interval
   are eliminated completely.

   Though specified for endpoint usage for congestion control, the
   difference between interarrival and interdeparture time used by this
   algorithm is applicable for passive measurement of jitter, as well.
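
   As a rough, non-normative illustration of the normalization step, the
   sketch below derives the adjustment factor directly from two 16-bit
   encoded intervals (scale in the upper 5 bits, value in the lower 11
   bits, in units of 2^-38 s) and computes V(t) from a sequence of
   (TSval, TSecr) pairs.  It rests on several assumptions that are not
   settled by this document: the mask shifts are omitted, since the mask
   field no longer exists; C is taken as TSecr - TSval * ADJ in local
   ticks, following the normalized formula above; the ratio is kept as
   an exact rational rather than a single integer; and 32-bit timestamp
   wraparound is ignored.  All function names are hypothetical.

```python
from fractions import Fraction

VALUE_BITS = 11                      # low 11 bits carry the value
VALUE_MASK = (1 << VALUE_BITS) - 1   # upper 5 bits carry the scale


def interval_units(encoded):
    """Return an encoded interval as an integer count of 2^-38 s units."""
    scale = encoded >> VALUE_BITS
    value = encoded & VALUE_MASK
    if value == 0:
        raise ValueError("irregular or reserved timestamp interval")
    return value << scale


def adjustment(remote_encoded, local_encoded):
    """ADJ = remote interval / local interval, as an exact ratio."""
    return Fraction(interval_units(remote_encoded),
                    interval_units(local_encoded))


def delay_variation(samples, adj):
    """Yield V(t) = C(t) - C(t-1), with C = TSecr - TSval * ADJ.

    samples is an iterable of (TSval, TSecr) pairs taken from received
    ACKs; results are expressed in local timestamp ticks.  Wraparound
    of the 32-bit timestamp fields is not handled in this sketch.
    """
    previous = None
    for tsval, tsecr in samples:
        c = tsecr - tsval * adj
        if previous is not None:
            yield c - previous
        previous = c


if __name__ == "__main__":
    # Remote clock ticks every 1 s (0xe400), local clock every 0.5 s (0xdc00).
    adj = adjustment(0xE400, 0xDC00)
    acks = [(100, 1000), (101, 1003), (102, 1004)]
    print([int(v) for v in delay_variation(acks, adj)])   # [1, -1]
```

   Because ADJ is computed once per connection, the per-segment work is
   one multiplication and two subtractions; an implementation
   restricted to integer arithmetic could store the numerator and
   denominator of the ratio separately, in the spirit of the "two
   appropriately calculated integers" mentioned above.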

Authors' Addresses

   Richard Scheffenegger
   NetApp, Inc.
   Am Euro Platz 2
   1120 Vienna
   Austria

   Phone: +43 1 3676811 3146
   Email: rs@netapp.com


   Mirja Kuehlewind
   University of Stuttgart
   Pfaffenwaldring 47
   70569 Stuttgart
   Germany

   Email: mirja.kuehlewind@ikr.uni-stuttgart.de


   Brian Trammell
   Swiss Federal Institute of Technology Zurich
   Gloriastrasse 35
   8092 Zurich
   Switzerland

   Phone: +41 44 632 70 13
   Email: trammell@tik.ee.ethz.ch