INTERNET-DRAFT                                                 N. Elkins
                                                         Inside Products
                                                             R. Hamilton
                                              Chemical Abstracts Service
                                                            M. Ackermann
Intended Status: Proposed Standard                         BCBS Michigan
Expires: October 2, 2015                                  March 31, 2015

    IPv6 Performance and Diagnostic Metrics (PDM) Destination Option
                  draft-elkins-ippm-6man-pdm-option-00

Table of Contents

   1  Background
     1.1  Terminology
     1.2  End User Quality of Service (QoS)
     1.3  Need for a Packet Sequence Number
     1.4  Rationale for proposed solution
     1.5  PDM Works in Collaboration with Other Headers
   2  Measurement Information Derived from PDM
     2.1  Round-Trip Delay
     2.2  Server Delay
   3  Performance and Diagnostic Metrics Destination Option Layout
     3.1  Destination Options Header
     3.2  Performance and Diagnostic Metrics Destination Option
     3.3  Header Placement
     3.4  Implementation Considerations
     3.5  Dynamic Configuration Options
     3.6  5-tuple Aging
   4  Considerations of Timing Representation
     4.1  Encoding the Delta-Time Values
     4.2  Timer registers are different on different hardware
     4.3  Timer Units on Other Systems
     4.4  Time Base
     4.5  Timer-value scaling
     4.6  Limitations with this encoding method
     4.7  Lack of precision induced by timer value truncation
   5  PDM Flow - Simple Client Server
     5.1  Step 1
     5.2  Step 2
     5.3  Step 3
     5.4  Step 4
     5.5  Step 5
   6  Other Flows
     6.1  PDM Flow - One Way Traffic
     6.2  PDM Flow - Multiple Send Traffic
     6.3  PDM Flow - Multiple Send with Errors
   7  Potential Overhead Considerations
   8  Security Considerations
   9  IANA Considerations
   10  References
     10.1  Normative References
     10.2  Informative References
   11  Acknowledgments
   Authors' Addresses

Abstract

   To assess performance problems, measurements based on optional
   sequence numbers and timing may be embedded in each packet.  Such
   measurements may be interpreted in real-time or after the fact.  An
   implementation of the existing IPv6 Destination Options extension
   header, the Performance and Diagnostic Metrics (PDM) Destination
   Options extension header as well as the field limits, calculations,
   and usage of the PDM in measurement are included in this document.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1  Background

   To assess performance problems, measurements based on optional
   sequence numbers and timing may be embedded in each packet.  Such
   measurements may be interpreted in real-time or after the fact.  An
   implementation of the existing IPv6 Destination Options extension
   header, the Performance and Diagnostic Metrics (PDM) Destination
   Options extension header has been proposed in a companion document.
   This document specifies the layout, field limits, calculations, and
   usage of the PDM in measurement.

   As defined in RFC2460 [RFC2460], destination options are carried by
   the IPv6 Destination Options extension header.  Destination options
   include optional information that need be examined only by the IPv6
   node given as the destination address in the IPv6 header, not by
   routers or other "middle boxes".  This document specifies a new
   destination option, the Performance and Diagnostic Metrics (PDM)
   destination option.

1.1  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.2  End User Quality of Service (QoS)

   The difference between timing values in the PDM traveling along with
   the packet will be used to estimate QoS as experienced by an end user
   device.

   For many applications, the key user performance indicator is response
   time.  When the end user is an individual, he is generally
   indifferent to what is happening along the network; what he really
   cares about is how long it takes to get a response back.  But this is
   not just a matter of individuals' personal convenience.  In many
   cases, rapid response is critical to the business being conducted.

   When the end user is a device (e.g. with the Internet of Things),
   what matters is the speed with which requested data can be
   transferred -- specifically, whether the requested data can be
   transferred in time to accomplish the desired actions.  This can be
   important when the relevant external conditions are subject to rapid
   change.

   Response time and consistency are not just "nice to have".  On many
   networks, poor response time can cause financial hardship or endanger
   human life.  In some cities, the emergency police contact system
   operates over IP, law enforcement uses TCP/IP networks, and
   transactions on our stock exchanges are settled using IP networks.
   The critical nature of such activities to our daily lives and
   financial well-being demands a simple solution to support
   measurements.

1.3  Need for a Packet Sequence Number

   While performing network diagnostics of an end-to-end connection, it
   often becomes necessary to find the device along the network path
   creating problems.
   Diagnostic data may be collected at multiple places along the path
   (if possible), or at the source and destination.  Then, in
   post-collection processing, the diagnostic data corresponding to each
   packet at different observation points must be matched for proper
   measurements.  A sequence number in each packet provides sufficient
   basis for the matching process.  If need be, the timing fields may be
   used along with the sequence number to ensure uniqueness.

   This method of data collection along the path is of special use to
   determine where packet loss or packet corruption is happening.

   The packet sequence number needs to be unique in the context of the
   session (5-tuple).  See section 2 for a definition of 5-tuple.

1.4  Rationale for proposed solution

   The current IPv6 specification provides neither a timing field nor
   any similar field in the IPv6 main header or in any extension
   header.  So, we propose the IPv6 Performance and Diagnostic Metrics
   destination option (PDM).

   Advantages include:

   1. Real measure of actual transactions.
   2. Independence from transport layer protocols.
   3. Ability to span organizational boundaries with consistent
      instrumentation.
   4. No time synchronization needed between session partners.

   The PDM provides the ability to quickly determine if the (latency)
   problem is in the network or in the server (application).  More
   intermediate measurements may be needed if the host or network
   discrimination is not sufficient.  At the client, TCP/IP stack time
   vs. applications time may still need to be broken out by client
   software.

1.5  PDM Works in Collaboration with Other Headers

   The purpose of the PDM is not to supplant all the variables present
   in all other headers but to provide data which is not available or
   very difficult to get.  The way PDM would be used is by a technician
   (or tool) looking at a packet capture.
   Within the packet capture, they would have available to them the
   layer 2 header, IP header (v6 or v4), TCP, UDP, ICMP, SCTP or other
   headers.  All information would be looked at together to make sense
   of the packet flow.  The technician or processing tool could analyze,
   report or ignore the data from PDM, as necessary.

   For an example of how PDM can help with TCP retransmit problems,
   please look at section 8.

2  Measurement Information Derived from PDM

   Each packet contains information about the sender and receiver.  In
   IP, the identifying information is called a "5-tuple".

   The 5-tuple consists of:

      SADDR : IP address of the sender
      SPORT : Port for sender
      DADDR : IP address of the destination
      DPORT : Port for destination
      PROTC : Protocol for upper layer (e.g., TCP, UDP, ICMP)

   The PDM contains the following base fields:

      PSNTP    : Packet Sequence Number This Packet
      PSNLR    : Packet Sequence Number Last Received
      DELTATLR : Delta Time Last Received
      DELTATLS : Delta Time Last Sent

   Other fields for scaling and time base are also in the PDM and will
   be described in section 3.

   This information, combined with the 5-tuple, allows the measurement
   of the following metrics:

   1. Round-trip delay
   2. Server delay

2.1  Round-Trip Delay

   Round-trip *Network* delay is the delay for packet transfer from a
   source host to a destination host and then back to the source host.
   This measurement has been defined, and the advantages and
   disadvantages discussed, in "A Round-trip Delay Metric for IPPM"
   [RFC2681].

2.2  Server Delay

   Server delay is the interval between when a packet is received by a
   device and the first corresponding packet is sent back in response.
   This may be "Server Processing Time".  It may also be a delay caused
   by acknowledgements.
   Server processing time includes the time taken by the combination of
   the stack and application to return the response.  The stack delay
   may be related to network performance.  If this aggregate time is
   seen as a problem, and there is a need to make a clear distinction
   between application processing time and stack delay, including that
   caused by the network, then more client based measurements are
   needed.

3  Performance and Diagnostic Metrics Destination Option Layout

3.1  Destination Options Header

   The IPv6 Destination Options Header is used to carry optional
   information that need be examined only by a packet's destination
   node(s).  The Destination Options Header is identified by a Next
   Header value of 60 in the immediately preceding header and is defined
   in RFC2460 [RFC2460].  The IPv6 Performance and Diagnostic Metrics
   Destination Option (PDM) is an implementation of the Destination
   Options Header (Next Header value = 60).  The PDM does not require
   time synchronization.

3.2  Performance and Diagnostic Metrics Destination Option

   The IPv6 Performance and Diagnostic Metrics Destination Option (PDM)
   contains the following fields:

      TIMEBASE : Base timer unit
      SCALEDTLR: Scale for Delta Time Last Received
      SCALEDTLS: Scale for Delta Time Last Sent
      PSNTP    : Packet Sequence Number This Packet
      PSNLR    : Packet Sequence Number Last Received
      DELTATLR : Delta Time Last Received
      DELTATLS : Delta Time Last Sent

   The PDM destination option is encoded in type-length-value (TLV)
   format as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Option Type  | Option Length |TB |  ScaleDTLR  |  ScaleDTLS  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        PSN This Packet        |       PSN Last Received       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Delta Time Last Received    |     Delta Time Last Sent      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Option Type

      TBD = 0xXX (TBD) [To be assigned by IANA] [RFC2780]

   Option Length

      8-bit unsigned integer.  Length of the option, in octets,
      excluding the Option Type and Option Length fields.  This field
      MUST be set to 10, which is the size of the fields that follow it
      in the layout above (2 + 7 + 7 bits of time base and scale, plus
      four 16-bit fields).

   Time Base

      2-bit unsigned integer.  It will indicate the lowest granularity
      possible for this device.  That is, for a value of 00 in the Time
      Base field, a value of 1 in the DELTA fields indicates 1
      millisecond.

      This field is being included so that a device may choose the
      granularity which most suits its timer ticks.  That is, so that it
      does not have to do more work than needed to convert values
      required for the PDM.

      The possible values of Time Base are as follows:

         00 - milliseconds
         01 - microseconds
         10 - nanoseconds
         11 - picoseconds

   Scale Delta Time Last Received (SCALEDTLR)

      7-bit signed integer.
      This is the scaling value for the Delta Time Last Received
      (DELTATLR) field.  As a 7-bit signed integer, its possible values
      are from -64 to +63.  See Section 4 for further discussion on
      Timing Considerations and formatting of the scaling values.

   Scale Delta Time Last Sent (SCALEDTLS)

      7-bit signed integer.  This is the scaling value for the Delta
      Time Last Sent (DELTATLS) field.  The possible values are from
      -64 to +63.

   Packet Sequence Number This Packet (PSNTP)

      16-bit unsigned integer.  This field will wrap.  It is intended
      for human use.  That is, it is to be used while analyzing packet
      traces.

      It is initialized to a random number and monotonically
      incremented for each packet on the 5-tuple.  The 5-tuple consists
      of the source and destination IP addresses, the source and
      destination ports, and the upper layer protocol (e.g., TCP,
      ICMP).  The random number initialization is to make it harder to
      spoof and insert such packets.

      Operating systems MUST implement a separate packet sequence
      number counter per 5-tuple.  Operating systems MUST NOT implement
      a single counter for all connections.

   Packet Sequence Number Last Received (PSNLR)

      16-bit unsigned integer.  This is the PSN of the packet last
      received on the 5-tuple.

   Delta Time Last Received (DELTATLR)

      A 16-bit unsigned integer field.  The value is interpreted
      according to the scale in SCALEDTLR.

         DELTATLR = Send time packet 2 - Receive time packet 1

   Delta Time Last Sent (DELTATLS)

      A 16-bit unsigned integer field.  The value is interpreted
      according to the scale in SCALEDTLS.

         DELTATLS = Receive time packet 2 - Send time packet 1

   Option Type

      The two highest-order bits of the Option Type field are encoded
      to indicate specific processing of the option; for the PDM
      destination option, these two bits MUST be set to 00.  This
      indicates the following processing requirements:

      00 - skip over this option and continue processing the header.

      RFC2460 [RFC2460] defines other values for the Option Type field.
      These MUST NOT be used in the PDM.  The other values are as
      follows:

      01 - discard the packet.

      10 - discard the packet and, regardless of whether or not the
      packet's Destination Address was a multicast address, send an ICMP
      Parameter Problem, Code 2, message to the packet's Source Address,
      pointing to the unrecognized Option Type.

      11 - discard the packet and, only if the packet's Destination
      Address was not a multicast address, send an ICMP Parameter
      Problem, Code 2, message to the packet's Source Address, pointing
      to the unrecognized Option Type.

      In keeping with RFC2460 [RFC2460], the third-highest-order bit of
      the Option Type specifies whether or not the Option Data of that
      option can change en-route to the packet's final destination.

      In the PDM, the value of the third-highest-order bit MUST be 0.
      The possible values are as follows:

      0 - Option Data does not change en-route

      1 - Option Data may change en-route

      The three high-order bits described above are to be treated as
      part of the Option Type, not independent of the Option Type.  That
      is, a particular option is identified by a full 8-bit Option Type,
      not just the low-order 5 bits of an Option Type.

3.3  Header Placement

   The PDM destination option MUST be placed as follows:

   -  Before the upper-layer header.  That is, this is the last
      extension header.

   This follows the order defined in RFC2460 [RFC2460]:

      IPv6 header
      Hop-by-Hop Options header
      Destination Options header
      Routing header
      Fragment header
      Authentication header
      Encapsulating Security Payload header
      Destination Options header
      upper-layer header

   For each IPv6 packet header, the PDM MUST NOT appear more than once.
   However, an encapsulated packet MAY contain a separate PDM associated
   with each encapsulated IPv6 header.
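
   The layout in section 3.2 can be illustrated with a short Python
   sketch.  This is a hedged example rather than a reference
   implementation: the function names and the option type value 0x1E
   are hypothetical placeholders (the real Option Type is still to be
   assigned by IANA), and the scale fields are assumed to be encoded in
   two's complement.

```python
import struct

# Hypothetical placeholder: the real Option Type is to be assigned by IANA.
# Its three high-order bits are 000, as section 3.2 requires.
PDM_OPTION_TYPE = 0x1E

# Type, Length, one 16-bit word holding TB/ScaleDTLR/ScaleDTLS, then
# PSNTP, PSNLR, DELTATLR, DELTATLS (network byte order, no padding).
_FMT = "!BBHHHHH"

# Octets that follow the Option Type and Option Length fields.
OPTION_DATA_LEN = struct.calcsize(_FMT) - 2


def pack_pdm(timebase, scale_dtlr, scale_dtls, psntp, psnlr, deltatlr, deltatls):
    """Build the 12-octet PDM option from its field values."""
    if not 0 <= timebase <= 3:
        raise ValueError("Time Base is a 2-bit field")
    for scale in (scale_dtlr, scale_dtls):
        # Assumption: a 7-bit two's-complement field holds -64..63.
        if not -64 <= scale <= 63:
            raise ValueError("scale does not fit in 7 signed bits")
    word = (timebase << 14) | ((scale_dtlr & 0x7F) << 7) | (scale_dtls & 0x7F)
    return struct.pack(_FMT, PDM_OPTION_TYPE, OPTION_DATA_LEN, word,
                       psntp, psnlr, deltatlr, deltatls)


def _signed7(value):
    """Interpret a 7-bit field as a two's-complement integer."""
    return value - 128 if value >= 64 else value


def unpack_pdm(data):
    """Parse a PDM option and return its fields as a dict."""
    opt_type, opt_len, word, psntp, psnlr, deltatlr, deltatls = struct.unpack(_FMT, data)
    if opt_type >> 5 != 0:
        raise ValueError("three high-order Option Type bits must be 000")
    if opt_len != OPTION_DATA_LEN:
        raise ValueError("unexpected Option Length")
    return {
        "timebase": word >> 14,
        "scale_dtlr": _signed7((word >> 7) & 0x7F),
        "scale_dtls": _signed7(word & 0x7F),
        "psntp": psntp,
        "psnlr": psnlr,
        "deltatlr": deltatlr,
        "deltatls": deltatls,
    }
```

   Packing the time base and both scales into a single 16-bit word keeps
   the bit arithmetic in one place; the remaining fields map directly
   onto 16-bit unsigned integers.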

3.4  Implementation Considerations

   The PDM destination options extension header SHOULD be turned on by
   each stack on a host node.  It MAY also be turned on only in case of
   diagnostics needed for problem resolution.

3.5  Dynamic Configuration Options

   If implemented, each operating system MUST have a default
   configuration parameter, e.g. diag_header_sys_default_value=yes/no.
   The operating system MAY also have a dynamic configuration option to
   change the configuration setting as needed.

   If the PDM destination options extension header is used, then it MAY
   be turned on for all packets flowing through the host, or applied
   only to an upper-layer protocol (TCP, UDP, SCTP, etc.), a local port,
   or an IP address.  These choices are at the discretion of the
   implementation.

   The PDM MUST NOT be changed dynamically via packet flow, as this may
   create a potential security violation or DoS attack by numerous
   packets turning the header on and off.

   As with all other destination options extension headers, the PDM is
   for destination nodes only.  As specified above, intermediate devices
   MUST neither set nor modify this field.

3.6  5-tuple Aging

   Within the operating system, metrics must be kept on a 5-tuple basis.

   The 5-tuple is:

      SADDR : IP address of the sender
      SPORT : Port for sender
      DADDR : IP address of the destination
      DPORT : Port for destination
      PROTC : Protocol for upper layer (e.g., TCP, UDP, ICMP)

   The question arises of when to stop keeping data, or to restart the
   numbering, for a 5-tuple.  For example, in the case of TCP, at some
   point, the connection will terminate.  Keeping data in control blocks
   forever will have unfortunate consequences for the operating system.

   So, the recommendation is to use a known aging parameter such as
   Maximum Segment Lifetime (MSL) as defined in Transmission Control
   Protocol [RFC0793] to reuse or drop the control block.
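
   One way to picture per-5-tuple state with aging is the following
   Python sketch.  It is illustrative only: the class name, the method
   names, and the 120-second default age are assumptions made for this
   example, not requirements of this document.

```python
import random
from typing import Dict, Tuple

# (SADDR, SPORT, DADDR, DPORT, PROTC)
FiveTuple = Tuple[str, int, str, int, int]


class PsnTable:
    """Hypothetical per-5-tuple packet sequence number state with aging."""

    def __init__(self, max_age: float = 120.0, rng=None):
        # Assumed aging parameter in seconds; the choice is up to the
        # implementation.
        self.max_age = max_age
        self._rng = rng if rng is not None else random.SystemRandom()
        # Maps a 5-tuple to (last PSN sent, time it was last used).
        self._state: Dict[FiveTuple, Tuple[int, float]] = {}

    def next_psn(self, key: FiveTuple, now: float) -> int:
        """Return the PSN for the next packet sent on this 5-tuple."""
        entry = self._state.get(key)
        if entry is None or now - entry[1] > self.max_age:
            # New or aged-out 5-tuple: start at a random 16-bit value.
            psn = self._rng.randrange(0x10000)
        else:
            # The 16-bit counter wraps from 0xFFFF to 0.
            psn = (entry[0] + 1) & 0xFFFF
        self._state[key] = (psn, now)
        return psn

    def expire(self, now: float) -> int:
        """Drop control blocks idle for longer than max_age."""
        stale = [k for k, (_, used) in self._state.items()
                 if now - used > self.max_age]
        for k in stale:
            del self._state[k]
        return len(stale)
```

   Passing the current time in as an argument, rather than reading a
   clock inside the class, keeps the sketch easy to test.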
   The choice of aging parameter is left up to the implementation.

4  Considerations of Timing Representation

4.1  Encoding the Delta-Time Values

   This section makes reference to and expands on the document "Encoding
   of Time Intervals for the TCP Timestamp Option" [TRAM-TCPM].

4.2  Timer registers are different on different hardware

   One of the problems with timestamp recording is the variety of
   hardware that generates the time value to be used.  Different CPUs
   track the time in registers of different sizes, and the
   most-frequently-iterated bit could be the first on the left or the
   first on the right.  In order to generate some examples here it is
   necessary to indicate the type of timer register being used.

   As described in the "IBM z/Architecture Principles of Operation"
   [IBM-POPS], the Time-Of-Day clock in a zSeries CPU is a 104-bit
   register, where bit 51 is incremented approximately every
   microsecond:

                                                                    1
    0        1         2         3         4         5         6   0
   +--------+---------+---------+---------+---------+---------+--+...+
   |        |         |         |         |         |*        |  |
   +--------+---------+---------+---------+---------+---------+--+...+
   ^                                                  ^             ^
   0                                                 51 = 1 usec   103

   To represent these values concisely a hexadecimal representation will
   be used, where each digit represents 4 binary bits.  Thus:

      0000 0000 0000 0001 = 1 timer unit (2**-12 usec, or about 244 psec)
      0000 0000 0000 1000 = 1 microsecond
      0000 0000 003E 8000 = 1 millisecond
      0000 0000 F424 0000 = 1 second
      0000 0039 3870 0000 = 1 minute
      0000 0D69 3A40 0000 = 1 hour
      0001 41DD 7600 0000 = 1 day

   Note that only the first 64 bits of the register are commonly
   represented, as that represents a count of timer units on this
   hardware.  Commonly the first 52 bits are all that are displayed, as
   that represents a count of microseconds.

4.3  Timer Units on Other Systems

   This encoding method works the same with other hardware clock
   formats.
   The method uses a microsecond as the basic value and allows for large
   time differentials.

4.4  Time Base

   This specification allows for the fact that different CPU TOD clocks
   use different binary points.  For some clocks, a value of 1 could
   indicate 1 microsecond, whereas other clocks could use the value 1 to
   indicate 1 millisecond.  In the former case, the binary digits to the
   right of that binary point measure 2**(-n) microseconds, and in the
   latter case, 2**(-n) milliseconds.

   The Time Base allows us to ensure we have a common reference or, at
   the very least, common knowledge of what the binary point is for the
   transmitted values.

   We propose a base unit for the time.  This is a 2-bit integer
   indicating the lowest granularity possible for this device.  That is,
   for a value of 00 in the Time Base field, a value of 1 in the DELTA
   fields indicates 1 millisecond.

   The possible values of Time Base are as follows:

      00 - milliseconds
      01 - microseconds
      10 - nanoseconds
      11 - picoseconds

   Time base is not necessarily equivalent to the length of one timer
   tick.  That is, on many, if not all, systems, the timer tick value
   will not be in complete units of nanoseconds, milliseconds, etc.  For
   example, on an IBM zSeries machine, one timer tick (or clock unit) is
   2**(-12) microseconds.

   Therefore, some amount of conversion may be needed to approximate
   Time Base units.

4.5  Timer-value scaling

   As discussed in [TRAM-TCPM] we propose storing not an entire
   time-interval value, but just the most significant bits of that
   value, along with a scaling factor to indicate the magnitude of the
   time-interval value.  In our case, we will use the high-order 16
   bits.  The scaling value will be the number of bits in the timer
   register to the right of the 16th significant bit.
   That is, if the timer register contains this binary value:

      1110100011010100101001010001000000000000
      <--- 16 bits --><------- 24 bits ------>

   then, the values stored would be 1110 1000 1101 0100 in binary (E8D4
   hexadecimal) for the time value and 24 for the scaling value.  Note
   that the displayed value is the binary equivalent of 1 second
   expressed in picoseconds.

   The table below represents a device which has a TimeBase of
   picosecond (or 11).  The smallest and simplest value to represent is
   1 picosecond; the time value stored is 1, and the scaling value is 0.
   Using values from the table below, we have:

                      Time value in        Encoded     Scaling
   Delta time         picoseconds (hex)    value       (decimal)
   -------------------------------------------------------------
   1 picosecond       1                    1            0
   1 nanosecond       3E8                  3E8          0
   1 microsecond      F4240                F424         4
   1 millisecond      3B9ACA00             3B9A        16
   1 second           E8D4A51000           E8D4        24
   1 minute           3691D6AFC000         3691        32
   1 hour             CCA2E51310000        CCA2        36
   1 day              132F4579C980000      132F        44
   365 days           1B5A660EA44B80000    1B5A        52

   Sample binary values (high order 16 bits taken)

   1 psec  1                                                        0001
   1 nsec  3E8                                            0011 1110 1000
   1 usec  F4240                                1111 0100 0010 0100 0000
   1 msec  3B9ACA00              0011 1011 1001 1010 1100 1010 0000 0000
   1 sec   E8D4A51000  1110 1000 1101 0100 1010 0101 0001 0000 0000 0000

4.6  Limitations with this encoding method

   If we follow the specification in [TRAM-TCPM], the size of one of
   these time-interval fields is limited to an 11-bit value and a
   five-bit scale, so that they fit into a 16-bit space.  With that
   limitation, the maximum value that could be stored in 16 bits is:

      11-bit value    Scale
      =============   ======
      1111 1111 111   1 1111

   or an encoded value of 7FF and a scale value of 31.
   This value corresponds to any time differential between:

                   ||
      11 1111 1111 1000 0000 0000 0000 0000 0000 0000 0000  (binary)
       3    F    F    8    0    0    0    0    0    0    0  (hexadecimal)

   and

      11 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111  (binary)
       3    F    F    F    F    F    F    F    F    F    F  (hexadecimal)

   This time value, 3FFFFFFFFFF, converts to 50 days, 21 hours, 40
   minutes and 46.511103 seconds.  A time differential 1 microsecond
   longer won't fit into 16 bits using this encoding method.

4.7  Lack of precision induced by timer value truncation

   When the bit values following the first 11 significant bits are
   truncated, there is obviously a loss of precision in the value.  The
   range of values that will be truncated to the same encoded value is
   2**(Scale)-1 microseconds.

   The smallest time differential value that will be truncated is

      1000 0000 0000 = 2.048 msec

   The value

      1000 0000 0001 = 2.049 msec

   will be truncated to the same encoded value, which is 400 in hex,
   with a scale value of 1.  With the scale value of 1, the value range
   is calculated as 2**1 - 1, or 1 usec, which you can see is the
   difference between these minimum and maximum values.

   With that in mind, let's look at that table of delta time values
   again, where the Precision is the range from the smallest value
   corresponding to this encoded value to the largest:

                     Time value in
                     microseconds    Encoded
   Delta time        (hex)           value    Scale   Precision
   --------------------------------------------------------------
   1 microsecond     1               1          0      0:00.000000
   1 millisecond     3E8             3E8        0      0:00.000000
   1 second          F4240           7A1        9      0:00.000511
   1 minute          3938700         727       15      0:00.032767
   1 hour            D693A400        6B4       21      0:02.097151
   1 day             141DD76000      507       26      1:07.108863
   Maximum value     3FFFFFFFFFF     7FF       31     35:47.483647

   So, when measuring the delay between transmission of two packets, or
   between the reception of two packets, any delay shorter than 50 days
   21 hours and change can be stored in this encoded fashion within 16
   bits.
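
   The 11-bit value and 5-bit scale encoding of sections 4.6 and 4.7 can
   be sketched in Python as follows.  This is a minimal illustration
   that assumes the time differential is already expressed as an
   integer count of microseconds; the function names are hypothetical.

```python
from typing import Tuple

VALUE_BITS = 11   # width of the encoded value
MAX_SCALE = 31    # largest scale that fits in five bits


def encode_delta(usec: int) -> Tuple[int, int]:
    """Return (encoded value, scale) for a delta time in microseconds.

    The encoded value keeps the 11 most significant bits; the scale is
    the number of low-order bits that were truncated.
    """
    if usec < 0:
        raise ValueError("delta time must be non-negative")
    scale = max(usec.bit_length() - VALUE_BITS, 0)
    if scale > MAX_SCALE:
        raise OverflowError("delta time does not fit in 16 bits")
    return usec >> scale, scale


def decode_delta(value: int, scale: int) -> int:
    """Return the smallest delta time that encodes to (value, scale)."""
    return value << scale


def precision_usec(scale: int) -> int:
    """Range of microsecond values sharing one encoding: 2**scale - 1."""
    return (1 << scale) - 1


# One second is F4240 microseconds in hexadecimal.
value, scale = encode_delta(1_000_000)
print(f"{value:X} {scale} {precision_usec(scale)}")  # prints: 7A1 9 511
```

   Both 2.048 msec and 2.049 msec encode to the value 400 (hex) with a
   scale of 1, which matches the truncation example in section 4.7.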
   When you encode, for example, a DTN response time delay of 50 days,
   21 hours and 40 minutes, you can be assured of accuracy within 35
   minutes.

5  PDM Flow - Simple Client Server

   Following is a sample simple flow for the PDM with one packet sent
   from Host A and one packet received by Host B.  The PDM does not
   require time synchronization between Host A and Host B.  The
   calculations to derive meaningful metrics for network diagnostics are
   shown below each packet sent or received.

   Each packet, in addition to the PDM, contains information on the
   sender and receiver.  As discussed before, a 5-tuple consists of:

      SADDR : IP address of the sender
      SPORT : Port for sender
      DADDR : IP address of the destination
      DPORT : Port for destination
      PROTC : Protocol for upper layer (e.g., TCP, UDP, ICMP)

   It should be understood that the packet identification information is
   in each packet.  We will not repeat that in each of the following
   steps.

5.1  Step 1

   Packet 1 is sent from Host A to Host B.  The time for Host A is set
   initially to 10:00AM.

   The time and packet sequence number are saved by the sender
   internally.  The packet sequence number and delta times are sent in
   the packet.

                            Packet 1

              +----------+             +----------+
              |          |             |          |
              |   Host   | ----------> |   Host   |
              |    A     |             |    B     |
              |          |             |          |
              +----------+             +----------+

   PDM Contents:

      PSNTP    : Packet Sequence Number This Packet:    25
      PSNLR    : Packet Sequence Number Last Received:  -
      DELTATLR : Delta Time Last Received:              -
      SCALEDTLR: Scale of Delta Time Last Received:     0
      DELTATLS : Delta Time Last Sent:                  -
      SCALEDTLS: Scale of Delta Time Last Sent:         0
      TIMEBASE : Granularity of Time:                   00 (Milliseconds)

   Internally, within the sender, Host A, it must keep:

      Packet Sequence Number of the last packet sent:   25
      Time the last packet was sent:                    10:00:00

   Note, the initial PSNTP from Host A starts at a random number.
   In this case, 25.  The time in these examples is shown in seconds for
   the sake of simplicity.

5.2 Step 2

   Packet 1 is received at Host B.  Its time is set to one hour later
   than Host A.  In this case, 11:00AM.

   Internally, the receiver, Host B, must note:

   Packet Sequence Number of the last packet received:    25
   Time the last packet was received:               11:00:03

   Note, this timestamp is in Host B time.  It has nothing whatsoever to
   do with Host A time.  The Packet Sequence Number of the last packet
   received will become PSNLR, which will be sent out in the packet sent
   by Host B in the next step.  The time last received will be used to
   calculate the DELTATLR value to be sent out in the packet sent by
   Host B in the next step.

5.3 Step 3

   Packet 2 is sent by Host B to Host A.  Note, the initial packet
   sequence number (PSNTP) from Host B starts at a random number.  In
   this case, 12.  Before sending the packet, Host B does a calculation
   of deltas.  Since Host B knows when it is sending the packet, and it
   knows when it received the previous packet, it can do the following
   calculation:

   Sending time (packet 2) - receive time (packet 1)

   We will call the result of this calculation: Delta Time Last
   Received

   That is:

   DELTATLR = Sending time (packet 2) - receive time (packet 1)

   Note, both sending time and receive time are saved internally in Host
   B.  They do not travel in the packet.  Only the Delta is in the
   packet.

   Assume that within Host B is the following:

   Packet Sequence Number of the last packet received:    25
   Time the last packet was received:               11:00:03
   Packet Sequence Number of this packet:                 12
   Time this packet is being sent:                  11:00:07

   We can now calculate a delta value to be sent out in the packet.
   DELTATLR becomes:

   4 seconds = 11:00:07 - 11:00:03

   This is the derived metric: Server Delay.  The time and scaling
   factor must be calculated.
   Then, this value, along with the packet sequence numbers, will be
   sent to Host A as follows:

   Packet 2

   +----------+                             +----------+
   |          |                             |          |
   |   Host   |        <----------          |   Host   |
   |    A     |                             |    B     |
   |          |                             |          |
   +----------+                             +----------+

   PDM Contents:

   PSNTP    : Packet Sequence Number This Packet:     12
   PSNLR    : Packet Sequence Number Last Received:   25
   DELTATLR : Delta Time Last Received:               3A35 (4 seconds)
   SCALEDTLR: Scale of Delta Time Last Received:      25
   DELTATLS : Delta Time Last Sent:                   -
   SCALEDTLS: Scale of Delta Time Last Sent:          0
   TIMEBASE : Granularity of Time:                    00 (Milliseconds)

   The metric left to be calculated is the Round-Trip Delay.  This will
   be calculated by Host A when it receives Packet 2.

5.4 Step 4

   Packet 2 is received at Host A.  Remember, its time is set to one
   hour earlier than Host B.  Internally, it must note:

   Packet Sequence Number of the last packet received:    12
   Time the last packet was received:               10:00:12

   Note, this timestamp is in Host A time.  It has nothing whatsoever to
   do with Host B time.

   So, now, Host A can calculate total end-to-end time.  That is:

   End-to-End Time = Time Last Received - Time Last Sent

   For example, packet 25 was sent by Host A at 10:00:00.  Packet 12 was
   received by Host A at 10:00:12 so:

   End-to-End time = 10:00:12 - 10:00:00 or 12 (Server and Network RT
   delay combined).  This time may also be called total Overall
   Round-trip time (which includes Network RTT and Host Response Time).

   This derived metric we will call DELTATLS or Delta Time Last Sent.

   We can now also calculate round trip delay.  The formula is:

   Round trip delay = DELTATLS - DELTATLR

   Or:

   Round trip delay = 12 - 4 or 8

   Now, the only problem is that at this point all metrics are in Host A
   only and not exposed in a packet.  To do that, we need a third
   packet.

   Note: this simple example assumes one send and one receive.
   That is done only for purposes of explaining the function of the PDM.
   In cases where there are multiple packets returned, one would take
   the time in the last packet in the sequence.  The calculation of such
   timings and intelligent processing are the function of
   post-processing of the data.

5.5 Step 5

   Packet 3 is sent from Host A to Host B.

   +----------+                             +----------+
   |          |                             |          |
   |   Host   |        ---------->          |   Host   |
   |    A     |                             |    B     |
   |          |                             |          |
   +----------+                             +----------+

   PDM Contents:

   PSNTP    : Packet Sequence Number This Packet:     26
   PSNLR    : Packet Sequence Number Last Received:   12
   DELTATLR : Delta Time Last Received:               0
   SCALEDTLR: Scale of Delta Time Last Received:      0
   DELTATLS : Delta Time Last Sent:                   105e (12 seconds)
   SCALEDTLS: Scale of Delta Time Last Sent:          26
   TIMEBASE : Granularity of Time:                    00 (Milliseconds)

   To calculate Two-Way Delay, any packet capture device may look at
   these packets and do what is necessary.

6 Other Flows

   What we have discussed so far is a simple flow with one packet sent
   and one returned.  Let's look at how PDM may be useful in other
   types of flows.

6.1 PDM Flow - One Way Traffic

   The flow on a particular session may not be a send-receive paradigm.
   Let us consider some other situations.  In the case of a one-way
   flow, one might see the following:

   Packet   Sender      PSN            PSN        Delta Time  Delta Time
                      This Packet    Last Recvd   Last Recvd  Last Sent
   =====================================================================
   1        Server       1              0            0            0
   2        Server       2              0            0            5
   3        Server       3              0            0           12
   4        Server       4              0            0           20

   What does this mean and how is it useful?

   In a one-way flow, only the Delta Time Last Sent will be seen as
   used.  Recall, Delta Time Last Sent is the difference between the
   send of one packet from a device and the next.  This is a measure of
   throughput for the sender - according to the sender's point of view.
   That is, it is a measure of how fast the application itself (with
   stack time included) is able to send packets.

   How might this be useful?  If one is having a performance issue at
   the client and sees that packet 2, for example, is sent after 5
   microseconds from the server but takes 3 minutes to arrive at the
   destination, then one may safely conclude that there are delays in
   the path other than at the server which may be causing the delivery
   issue of that packet.  Such delays may include the network links,
   middle-boxes, etc.

   Now, true one-way traffic is quite rare.  What people often mean by
   "one-way" traffic is an application such as FTP where a group of
   packets (for example, a TCP window size worth) is sent, then the
   sender waits for acknowledgment.  This type of flow would actually
   fall into the "multiple-send" traffic model.

6.2 PDM Flow - Multiple Send Traffic

   Assume that two packets are sent for each ACK from the server.

   Packet   Sender      PSN            PSN        Delta Time  Delta Time
                      This Packet    Last Recvd   Last Recvd  Last Sent
   =====================================================================
   1        Server       1              0            0            0
   2        Server       2              0            0            5
   3        Client       1              2           20            0
   4        Server       3              1           10           15

   How might this be used?

   Notice that in packet 3, the client has a value of Delta Time Last
   Received of 20.  Recall that Delta Time Last Received is the send
   time of packet 3 minus the receive time of packet 2.  So, what does
   one know now?  In this case, Delta Time Last Received is the
   processing time for the Client to send the next packet.

   How to interpret this depends on what is actually being sent.
   Remember, PDM is not being used in isolation, but to supplement the
   fields found in other headers.  Let's take some examples:

   1.  Client is sending a standalone TCP ACK.  One would find this by
   looking at the payload length in the IPv6 header and the TCP
   Acknowledgement field in the TCP header.
   So, in this case, the client is taking 20 units to send back the ACK.
   This may or may not be interesting.

   2.  Client is sending data with the packet.  Again, one would find
   this by looking at the payload length in the IPv6 header and the TCP
   Acknowledgement field in the TCP header.  So, in this case, the
   client is taking 20 units to send back data.  This may represent
   "User Think Time".  Again, this may or may not be interesting, in
   isolation.  But, if there is a performance problem receiving data at
   the server, then taken in conjunction with RTT or other packet timing
   information, this information may be quite interesting.

   Of course, one also needs to look at the PSN Last Received field to
   make sure of the interpretation of this data.  That is, to make
   sure that the Delta Last Received corresponds to the packet of
   interest.

   The benefits of PDM are that we have such information available in a
   uniform manner for all applications and all protocols without
   extensive changes required to applications.

6.3 PDM Flow - Multiple Send with Errors

   One might wonder if all of the functions of PDM might be better
   suited to TCP or a TCP option.  Let us take the case of how PDM may
   help in a case of TCP retransmissions in a way that TCP options or
   TCP ACK / SEQ would not.

   Assume that three packets are sent with each send from the server.

   From the server, this is what is seen.

   Pkt Sender  PSN       PSN        Delta Time  Delta Time  TCP   Data
               This Pkt  LastRecvd  LastRecvd   LastSent    SEQ   Bytes
   =====================================================================
   1   Server    1         0          0           0         123   100
   2   Server    2         0          0           5         223   100
   3   Server    3         0          0           5         333   100

   The client, however, does not get all the packets.  From the client,
   this is what is seen for the packets sent from the server.

   Pkt Sender  PSN       PSN        Delta Time  Delta Time  TCP   Data
               This Pkt  LastRecvd  LastRecvd   LastSent    SEQ   Bytes
   =====================================================================
   1   Server    1         0          0           0         123   100
   2   Server    3         0          0           5         333   100

   Let's assume that the server now retransmits the packet.
   (Obviously, a duplicate acknowledgment sequence for fast retransmit
   or a retransmit timeout would occur.  To illustrate the point, these
   packets are being left out.)

   So, if a TCP retransmission is done, then from the client, this is
   what is seen for the packets sent from the server.

   Pkt Sender  PSN       PSN        Delta Time  Delta Time  TCP   Data
               This Pkt  LastRecvd  LastRecvd   LastSent    SEQ   Bytes
   =====================================================================
   1   Server    4         0          0          30         223   100

   The server has resent the old packet 2 with TCP sequence number of
   223.  The retransmitted packet now has a PSN This Packet value of 4.
   The Delta Last Sent is 30 - the time between sending the packet with
   PSN of 3 and this current packet.

   Let's say that packet 4 STILL does not make it.  Then, after some
   amount of time (RTO), the packet with TCP sequence number of 223 is
   resent.

   From the client, this is what is seen for the packets sent from the
   server.

   Pkt Sender  PSN       PSN        Delta Time  Delta Time  TCP   Data
               This Pkt  LastRecvd  LastRecvd   LastSent    SEQ   Bytes
   =====================================================================
   1   Server    5         0          0          60         223   100

   If this packet now makes it, one has a very good idea that packets
   exist which are being sent from the server as retransmissions and not
   making it to the client.  This is because the PSN of the resent
   packet from the server is 5 rather than 4.  If we had used TCP
   sequence number alone, we would never have seen this situation,
   because the TCP sequence number in all situations is 223.
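
   The reasoning above can be sketched as a small post-processing step
   over a client-side trace.  This is a minimal illustration, not part
   of the PDM itself: the function name find_psn_gaps is hypothetical,
   and it assumes that the sender increments PSN This Packet by one for
   every packet it sends (as in the tables above), that the trace
   covers a single 5-tuple, and that the PSN does not wrap within the
   trace.

```python
def find_psn_gaps(observed_psns: list[int]) -> list[int]:
    """Return the PSNs the sender used that never appear in the trace.

    observed_psns holds the PSN This Packet values seen at the client,
    in arrival order, for packets from one sender on one 5-tuple.
    """
    missing = []
    for previous, current in zip(observed_psns, observed_psns[1:]):
        # Every PSN strictly between two neighbours was sent but lost
        # (or is still in flight) as far as this trace can tell.
        missing.extend(range(previous + 1, current))
    return missing


if __name__ == "__main__":
    # Client-side view from Section 6.3: PSN 1 (SEQ 123), PSN 3
    # (SEQ 333), then the second retransmission of SEQ 223 as PSN 5.
    print(find_psn_gaps([1, 3, 5]))   # prints [2, 4]
```

   Here the gap at PSN 2 is the original packet and the gap at PSN 4 is
   the first retransmission.  The TCP sequence number alone would not
   reveal the second loss.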

   This situation would be experienced by the user of the application
   (the human being actually sitting somewhere) as a "hang" or a long
   delay between packets.  On large networks, to diagnose problems such
   as these where packets are lost somewhere on the network, one has to
   take multiple traces to find out exactly where.

   The first thing is to start with doing a trace at the client and the
   server, so that we can see if the server sent a particular packet and
   the client received it.  If the client did not receive it, then we
   start tracking back to trace points at the router right after the
   server and the router right before the client.  Did they get these
   packets which the server has sent?  This is a time-consuming
   activity.

   With PDM, we can speed up the diagnostic time because we may be able
   to use only the trace taken at the client to see what the server is
   sending.

7 Potential Overhead Considerations

   Questions have been posed as to the potential overhead of PDM.
   First, PDM is entirely optional.  That is, a site may choose to
   implement PDM or not as they wish.  If they are happy with the costs
   of PDM vs. the benefits, then the choice should be theirs.

   Below is a table outlining the potential overhead in terms of
   additional time to deliver the response to the end user for various
   assumed RTTs.

   Packet
   Bytes       RTT          Bytes       Bytes     New       Overhead
   in Packet                Per Milli   in PDM    RTT
   =====================================================================
   1000        1000 milli      1         16     1016.000   16.000 milli
   1000         100 milli     10         16      101.600    1.600 milli
   1000          10 milli    100         16       10.160     .160 milli
   1000           1 milli   1000         16        1.016     .016 milli

   Below are some examples of actual RTTs for packets traversing large
   enterprise networks.  The first example is for packets going to
   multiple business partners.

   Packet
   Bytes       RTT          Bytes       Bytes     New       Overhead
   in Packet                Per Milli   in PDM    RTT
   =====================================================================
   1000          17 milli     58         16       17.360     .360 milli

   The second example is for packets at a large enterprise customer
   within a data center.  Notice that the scale is now in microseconds
   rather than milliseconds.

   Packet
   Bytes       RTT          Bytes       Bytes     New       Overhead
   in Packet                Per Micro   in PDM    RTT
   =====================================================================
   1000          20 micro     50         16       20.320     .320 micro

8 Security Considerations

   The PDM MUST NOT be changed dynamically via packet flow as this
   creates a possibility for potential security violations or DoS
   attacks by numerous packets turning the header on and off.

   Attackers may also send many packets from multiple ports, for example
   by doing a port scan.  This will cause the stack to create many
   control blocks.  This is the same problem as seen for SYN flood
   attacks.  Similar protections should be implemented by the stack to
   preserve the integrity of memory.

9 IANA Considerations

   Option Type to be assigned by IANA [RFC2780].

10 References

10.1 Normative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7, RFC
              793, September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

   [RFC2681]  Almes, G., Kalidindi, S., and M. Zekauskas, "A Round-trip
              Delay Metric for IPPM", RFC 2681, September 1999.

   [RFC2780]  Bradner, S. and V. Paxson, "IANA Allocation Guidelines For
              Values In the Internet Protocol and Related Headers", BCP
              37, RFC 2780, March 2000.

10.2 Informative References

   [TRAM-TCPM]  Trammell, B., "Encoding of Time Intervals for the TCP
                Timestamp Option-01", Internet Draft, July 2013.  [Work
                in Progress]

   [IBM-POPS]   IBM Corporation, "IBM z/Architecture Principles of
                Operation", SA22-7832, 1990-2012.

11 Acknowledgments

   The authors would like to thank Keven Haining, Al Morton, Brian
   Trammell, David Boyes, Bill Jouris, Richard Scheffenegger, and Rick
   Troth for their comments and assistance.

Authors' Addresses

   Nalini Elkins
   Inside Products, Inc.
   36A Upper Circle
   Carmel Valley, CA 93924
   United States
   Phone: +1 831 659 8360
   Email: nalini.elkins@insidethestack.com
   http://www.insidethestack.com

   Robert Hamilton
   Chemical Abstracts Service
   A Division of the American Chemical Society
   2540 Olentangy River Road
   Columbus, Ohio 43202
   United States
   Phone: +1 614 447 3600 x2517
   Email: rhamilton@cas.org
   http://www.cas.org

   Michael S. Ackermann
   Blue Cross Blue Shield of Michigan
   P.O. Box 2888
   Detroit, Michigan 48231
   United States
   Phone: +1 310 460 4080
   Email: mackermann@bcbsmi.com
   http://www.bcbsmi.com