INTERNET-DRAFT                                                 N. Elkins
                                                         Inside Products
                                                             R. Hamilton
                                              Chemical Abstracts Service
                                                            M. Ackermann
Intended Status: Proposed Standard                         BCBS Michigan
Expires: April 12, 2016                                 October 10, 2015

    IPv6 Performance and Diagnostic Metrics (PDM) Destination Option
                   draft-ietf-ippm-6man-pdm-option-01

Table of Contents

   1  Background
     1.1  Terminology
     1.2  End User Quality of Service (QoS)
     1.3  Need for a Packet Sequence Number
     1.4  Rationale for proposed solution
     1.5  PDM Works in Collaboration with Other Headers
     1.6  IPv6 Transition Technologies
   2  Measurement Information Derived from PDM
     2.1  Round-Trip Delay
     2.2  Server Delay
   3  Performance and Diagnostic Metrics Destination Option Layout
     3.1  Destination Options Header
     3.2  Performance and Diagnostic Metrics Destination Option
     3.3  Header Placement
     3.4  Header Placement Using IPSec ESP Mode
     3.5  Implementation Considerations
     3.6  Dynamic Configuration Options
     3.6  5-tuple Aging
   4  Considerations of Timing Representation
     4.1  Encoding the Delta-Time Values
     4.2  Timer registers are different on different hardware
     4.3  Timer Units on Other Systems
     4.4  Time Base
     4.5  Timer-value scaling
     4.6  Limitations with this encoding method
     4.7  Lack of precision induced by timer value truncation
   5  PDM Flow - Simple Client Server
     5.1  Step 1
     5.2  Step 2
     5.3  Step 3
     5.4  Step 4
     5.5  Step 5
   6  Other Flows
     6.1  PDM Flow - One Way Traffic
     6.2  PDM Flow - Multiple Send Traffic
     6.3  PDM Flow - Multiple Send with Errors
   7  Potential Overhead Considerations
   8  Security Considerations
   9  IANA Considerations
   10  References
     10.1  Normative References
     10.2  Informative References
   11  Acknowledgments
   Authors' Addresses

Abstract

   To assess performance problems, measurements based on optional
   sequence numbers and timing may be embedded in each packet.  Such
   measurements may be interpreted in real-time or after the fact.  An
   implementation of the existing IPv6 Destination Options extension
   header, the Performance and Diagnostic Metrics (PDM) Destination
   Options extension header, as well as the field limits, calculations,
   and usage of the PDM in measurement are included in this document.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph
   3: This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1  Background

   To assess performance problems, measurements based on optional
   sequence numbers and timing may be embedded in each packet.  Such
   measurements may be interpreted in real-time or after the fact.  An
   implementation of the existing IPv6 Destination Options extension
   header, the Performance and Diagnostic Metrics (PDM) Destination
   Options extension header, has been proposed in a companion document.
   This document specifies the layout, field limits, calculations, and
   usage of the PDM in measurement.

   As defined in RFC2460 [RFC2460], destination options are carried by
   the IPv6 Destination Options extension header.  Destination options
   include optional information that need be examined only by the IPv6
   node given as the destination address in the IPv6 header, not by
   routers or other "middle boxes".
   This document specifies a new
   destination option, the Performance and Diagnostic Metrics (PDM)
   destination option.

1.1  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.2  End User Quality of Service (QoS)

   The difference between timing values in the PDM traveling along with
   the packet will be used to estimate QoS as experienced by an end user
   device.

   For many applications, the key user performance indicator is response
   time.  When the end user is an individual, that person is generally
   indifferent to what is happening along the network; what matters is
   how long it takes to get a response back.  But this is not just a
   matter of individuals' personal convenience.  In many cases, rapid
   response is critical to the business being conducted.

   When the end user is a device (e.g. with the Internet of Things),
   what matters is the speed with which requested data can be
   transferred -- specifically, whether the requested data can be
   transferred in time to accomplish the desired actions.  This can be
   important when the relevant external conditions are subject to rapid
   change.

   Response time and consistency are not just "nice to have".  On many
   networks, the impact of poor response time can be financial hardship
   or danger to human life.  In some cities, the emergency police
   contact system operates over IP, law enforcement uses TCP/IP
   networks, and transactions on our stock exchanges are settled using
   IP networks.  The critical nature of such activities to our daily
   lives and financial well-being demands a simple solution to support
   measurements.

1.3  Need for a Packet Sequence Number

   While performing network diagnostics of an end-to-end connection, it
   often becomes necessary to find the device along the network path
   creating problems.  Diagnostic data may be collected at multiple
   places along the path (if possible), or at the source and
   destination.  Then, in post-collection processing, the diagnostic
   data corresponding to each packet at different observation points
   must be matched for proper measurements.  A sequence number in each
   packet provides sufficient basis for the matching process.  If need
   be, the timing fields may be used along with the sequence number to
   ensure uniqueness.

   This method of data collection along the path is of special use to
   determine where packet loss or packet corruption is happening.

   The packet sequence number needs to be unique in the context of the
   session (5-tuple).  See section 2 for a definition of 5-tuple.

1.4  Rationale for proposed solution

   The current IPv6 specification does not provide a timing field or
   any similar field in the IPv6 main header or in any extension
   header.  So, we propose the IPv6 Performance and Diagnostic Metrics
   destination option (PDM).

   Advantages include:

   1. Real measure of actual transactions.
   2. Independence from transport layer protocols.
   3. Ability to span organizational boundaries with consistent
      instrumentation.
   4. No time synchronization needed between session partners.

   The PDM provides the ability to quickly determine if the (latency)
   problem is in the network or in the server (application).  More
   intermediate measurements may be needed if the host or network
   discrimination is not sufficient.  At the client, TCP/IP stack time
   vs. applications time may still need to be broken out by client
   software.

1.5  PDM Works in Collaboration with Other Headers

   The purpose of the PDM is not to supplant all the variables present
   in all other headers but to provide data which is not available or
   very difficult to get.  The way PDM would be used is by a technician
   (or tool) looking at a packet capture.  Within the packet capture,
   they would have available to them the layer 2 header, IP header (v6
   or v4), TCP, UDP, ICMP, SCTP or other headers.  All information
   would be looked at together to make sense of the packet flow.  The
   technician or processing tool could analyze, report or ignore the
   data from PDM, as necessary.

   For an example of how PDM can help with TCP retransmit problems,
   please look at section 8.

1.6  IPv6 Transition Technologies

   In the path to full implementation of IPv6, transition technologies
   such as translation or tunneling may be employed.  The PDM header is
   not expected to work in such scenarios.  It is likely that an IPv6
   packet containing PDM will be dropped if using IPv6 transition
   technologies.

2  Measurement Information Derived from PDM

   Each packet contains information about the sender and receiver.  In
   the IP protocol, the identifying information is called a "5-tuple".

   The 5-tuple consists of:

      SADDR : IP address of the sender
      SPORT : Port for sender
      DADDR : IP address of the destination
      DPORT : Port for destination
      PROTC : Protocol for upper layer (ex. TCP, UDP, ICMP, etc.)

   The PDM contains the following base fields:

      PSNTP    : Packet Sequence Number This Packet
      PSNLR    : Packet Sequence Number Last Received
      DELTATLR : Delta Time Last Received
      DELTATLS : Delta Time Last Sent

   Other fields for scaling and time base are also in the PDM and will
   be described in section 3.

   This information, combined with the 5-tuple, allows the measurement
   of the following metrics:

      1. Round-trip delay
      2. Server delay

2.1  Round-Trip Delay

   Round-trip *Network* delay is the delay for packet transfer from a
   source host to a destination host and then back to the source host.
   This measurement has been defined, and the advantages and
   disadvantages discussed in "A Round-trip Delay Metric for IPPM"
   [RFC2681].

2.2  Server Delay

   Server delay is the interval between when a packet is received by a
   device and the first corresponding packet is sent back in response.
   This may be "Server Processing Time".  It may also be a delay caused
   by acknowledgements.  Server processing time includes the time taken
   by the combination of the stack and application to return the
   response.  The stack delay may be related to network performance.  If
   this aggregate time is seen as a problem, and there is a need to make
   a clear distinction between application processing time and stack
   delay, including that caused by the network, then more client based
   measurements are needed.

3  Performance and Diagnostic Metrics Destination Option Layout

3.1  Destination Options Header

   The IPv6 Destination Options Header is used to carry optional
   information that need be examined only by a packet's destination
   node(s).  The Destination Options Header is identified by a Next
   Header value of 60 in the immediately preceding header and is defined
   in RFC2460 [RFC2460].  The IPv6 Performance and Diagnostic Metrics
   Destination Option (PDM) is an implementation of the Destination
   Options Header (Next Header value = 60).  The PDM does not require
   time synchronization.
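
   As an illustration only, the following Python sketch shows how a
   receiver might walk the options carried in a Destination Options
   Header, using the type-length-value layout and Pad1 option of
   RFC2460 [RFC2460].  It is not part of this specification.  The name
   iter_dest_options and the option type 0x0F in the sample are
   hypothetical; the PDM Option Type is still to be assigned by IANA.

```python
NEXT_HEADER_DEST_OPTS = 60  # Next Header value for Destination Options (RFC 2460)
PAD1 = 0                    # Pad1 is a single zero octet with no length field


def iter_dest_options(header: bytes):
    """Yield (option_type, option_data) for each option in the header.

    `header` starts at the Next Header octet of a Destination Options
    Header. Hdr Ext Len counts 8-octet units beyond the first 8 octets.
    """
    total_len = (header[1] + 1) * 8
    pos = 2  # skip the Next Header and Hdr Ext Len octets
    while pos < total_len:
        opt_type = header[pos]
        if opt_type == PAD1:
            pos += 1
            continue
        opt_len = header[pos + 1]
        yield opt_type, header[pos + 2:pos + 2 + opt_len]
        pos += 2 + opt_len


# Hypothetical 8-octet header: next header 6 (TCP), then one option of
# type 0x0F (an assumed placeholder) carrying four data octets.
sample = bytes([6, 0, 0x0F, 4, 1, 2, 3, 4])
options = list(iter_dest_options(sample))
```

   A capture tool would call such a routine only after seeing a Next
   Header value of 60 in the preceding header, and would then pick out
   the option whose type matches the assigned PDM value.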

3.2  Performance and Diagnostic Metrics Destination Option

   The IPv6 Performance and Diagnostic Metrics Destination Option (PDM)
   contains the following fields:

      TIMEBASE : Base timer unit
      SCALEDTLR: Scale for Delta Time Last Received
      SCALEDTLS: Scale for Delta Time Last Sent
      PSNTP    : Packet Sequence Number This Packet
      PSNLR    : Packet Sequence Number Last Received
      DELTATLR : Delta Time Last Received
      DELTATLS : Delta Time Last Sent

   The PDM destination option is encoded in type-length-value (TLV)
   format as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Option Type  | Option Length |TB |  ScaleDTLR  |  ScaleDTLS  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        PSN This Packet        |       PSN Last Received       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Delta Time Last Received    |     Delta Time Last Sent      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Option Type

   TBD = 0xXX (TBD) [To be assigned by IANA] [RFC2780]

   Option Length

   8-bit unsigned integer.  Length of the option, in octets, excluding
   the Option Type and Option Length fields.  This field MUST be set to
   16.

   Time Base

   2-bit unsigned integer.  It will indicate the lowest granularity
   possible for this device.  That is, for a value of 00 in the Time
   Base field, a value of 1 in the DELTA fields indicates 1
   millisecond.

   This field is being included so that a device may choose the
   granularity which most suits its timer ticks.  That is, so that it
   does not have to do more work than needed to convert values required
   for the PDM.

   The possible values of Time Base are as follows:

      00 - milliseconds
      01 - microseconds
      10 - nanoseconds
      11 - picoseconds

   Scale Delta Time Last Received (SCALEDTLR)

   7-bit signed integer.
   This is the scaling value for the Delta Time
   Last Received (DELTATLR) field.  The possible values are from -128 to
   +127.  See Section 4 for further discussion on Timing Considerations
   and formatting of the scaling values.

   Scale Delta Time Last Sent (SCALEDTLS)

   7-bit signed integer.  This is the scaling value for the Delta Time
   Last Sent (DELTATLS) field.  The possible values are from -128 to
   +127.

   Packet Sequence Number This Packet (PSNTP)

   16-bit unsigned integer.  This field will wrap.  It is intended for
   human use.  That is, it is to be used while analyzing packet traces.

   It is initialized at a random number and monotonically incremented
   for each packet on the 5-tuple.  The 5-tuple consists of the source
   and destination IP addresses, the source and destination ports, and
   the upper layer protocol (ex. TCP, ICMP, etc).  The random number
   initialization is to make it harder to spoof and insert such packets.

   Operating systems MUST implement a separate packet sequence number
   counter per 5-tuple.  Operating systems MUST NOT implement a single
   counter for all connections.

   Packet Sequence Number Last Received (PSNLR)

   16-bit unsigned integer.  This is the PSN of the packet last received
   on the 5-tuple.

   Delta Time Last Received (DELTATLR)

   A 16-bit unsigned integer field.  The value is according to the scale
   in SCALEDTLR.

      DELTATLR = Send time packet 2 - Receive time packet 1

   Delta Time Last Sent (DELTATLS)

   A 16-bit unsigned integer field.  The value is according to the
   scale in SCALEDTLS.

      DELTATLS = Receive time packet 2 - Send time packet 1

   Option Type

   The two highest-order bits of the Option Type field are encoded to
   indicate specific processing of the option; for the PDM destination
   option, these two bits MUST be set to 00.  This indicates the
   following processing requirements:

   00 - skip over this option and continue processing the header.
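
   The requirement on the two highest-order bits can be checked with a
   shift and a mask.  The following Python fragment is a minimal
   sketch, not a normative algorithm; the helper names and the sample
   Option Type values are assumptions made for illustration.

```python
def option_action_bits(option_type: int) -> int:
    """Return the two highest-order bits of an 8-bit Option Type."""
    return (option_type >> 6) & 0b11


def is_skippable(option_type: int) -> bool:
    """True when a node that does not recognize the option skips it (bits 00)."""
    return option_action_bits(option_type) == 0b00


# 0x0F is a hypothetical placeholder for the not-yet-assigned PDM type.
pdm_ok = is_skippable(0x0F)
```

   An Option Type such as 0xC2, whose two highest-order bits are 11,
   would fail this check and so could not be used for the PDM.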

   RFC2460 [RFC2460] defines other values for the Option Type field.
   These MUST NOT be used in the PDM.  The other values are as follows:

   01 - discard the packet.

   10 - discard the packet and, regardless of whether or not the
   packet's Destination Address was a multicast address, send an ICMP
   Parameter Problem, Code 2, message to the packet's Source Address,
   pointing to the unrecognized Option Type.

   11 - discard the packet and, only if the packet's Destination Address
   was not a multicast address, send an ICMP Parameter Problem, Code 2,
   message to the packet's Source Address, pointing to the unrecognized
   Option Type.

   In keeping with RFC2460 [RFC2460], the third-highest-order bit of the
   Option Type specifies whether or not the Option Data of that option
   can change en-route to the packet's final destination.

   In the PDM, the value of the third-highest-order bit MUST be 0.  The
   possible values are as follows:

   0 - Option Data does not change en-route

   1 - Option Data may change en-route

   The three high-order bits described above are to be treated as part
   of the Option Type, not independent of the Option Type.  That is, a
   particular option is identified by a full 8-bit Option Type, not just
   the low-order 5 bits of an Option Type.

3.3  Header Placement

   The PDM destination option MUST be placed as follows:

   - Before the upper-layer header or the ESP header.

   This follows the order defined in RFC2460 [RFC2460]:

      IPv6 header

      Hop-by-Hop Options header

      Destination Options header <--------

      Routing header

      Fragment header

      Authentication header

      Encapsulating Security Payload header

      Destination Options header <------------

      upper-layer header

   Note that there is a choice of where to place the Destination Options
   header.  If using ESP mode, please see section 3.4 of this document
   for placement of the PDM Destination Options header.

   For each IPv6 packet header, the PDM MUST NOT appear more than once.
   However, an encapsulated packet MAY contain a separate PDM associated
   with each encapsulated IPv6 header.

3.4  Header Placement Using IPSec ESP Mode

   IP Encapsulating Security Payload (ESP) is defined in [RFC4303] and
   is widely used.  Section 3.1.1 of [RFC4303] discusses placement of
   Destination Options Headers.  Below is the diagram from [RFC4303]
   discussing placement.  PDM MUST be placed before the ESP header in
   order to work.  If placed before the ESP header, the PDM header will
   flow in the clear over the network thus allowing gathering of
   performance and diagnostic data without sacrificing security.

                     BEFORE APPLYING ESP

         ---------------------------------------
   IPv6  |             | ext hdrs |     |      |
         | orig IP hdr |if present| TCP | Data |
         ---------------------------------------

                     AFTER APPLYING ESP

         ---------------------------------------------------------
   IPv6  | orig |hop-by-hop,dest*,|   |dest|   |    | ESP   | ESP|
         |IP hdr|routing,fragment.|ESP|opt*|TCP|Data|Trailer| ICV|
         ---------------------------------------------------------
                                      |<--- encryption ---->|
                                  |<------ integrity ------>|

             * = if present, could be before ESP, after ESP, or both

3.5  Implementation Considerations

   The PDM destination options extension header SHOULD be turned on by
   each stack on a host node.  It MAY also be turned on only in case of
   diagnostics needed for problem resolution.

3.6  Dynamic Configuration Options

   If implemented, each operating system MUST have a default
   configuration parameter, e.g. diag_header_sys_default_value=yes/no.
   The operating system MAY also have a dynamic configuration option to
   change the configuration setting as needed.

   If the PDM destination options extension header is used, then it MAY
   be turned on for all packets flowing through the host, applied to an
   upper-layer protocol (TCP, UDP, SCTP, etc), a local port, or IP
   address only.  These are at the discretion of the implementation.

   The PDM MUST NOT be changed dynamically via packet flow as this may
   create a potential security violation or DoS attack by numerous
   packets turning the header on and off.

   As with all other destination options extension headers, the PDM is
   for destination nodes only.  As specified above, intermediate devices
   MUST neither set nor modify this field.

3.6  5-tuple Aging

   Within the operating system, metrics must be kept on a 5-tuple basis.

   The 5-tuple is:

      SADDR : IP address of the sender
      SPORT : Port for sender
      DADDR : IP address of the destination
      DPORT : Port for destination
      PROTC : Protocol for upper layer (ex. TCP, UDP, ICMP)

   The question arises of when to stop keeping data or to restart the
   numbering for a 5-tuple.  For example, in the case of TCP, at some
   point, the connection will terminate.  Keeping data in control blocks
   forever will have unfortunate consequences for the operating system.

   So, the recommendation is to use a known aging parameter such as Max
   Segment Lifetime (MSL) as defined in Transmission Control Protocol
   [RFC0793] to reuse or drop the control block.  The choice of aging
   parameter is left up to the implementation.

4  Considerations of Timing Representation

4.1  Encoding the Delta-Time Values

   This section makes reference to and expands on the document "Encoding
   of Time Intervals for the TCP Timestamp Option" [TRAM-TCPM].

4.2  Timer registers are different on different hardware

   One of the problems with timestamp recording is the variety of
   hardware that generates the time value to be used.
   Different CPUs
   track the time in registers of different sizes, and the
   most-frequently-iterated bit could be the first on the left or the
   first on the right.  In order to generate some examples here it is
   necessary to indicate the type of timer register being used.

   As described in the "IBM z/Architecture Principles of Operation"
   [IBM-POPS], the Time-Of-Day clock in a zSeries CPU is a 104-bit
   register, where bit 51 is incremented approximately every
   microsecond:

                                                                     1
   0        1         2         3         4         5         6      0
   +--------+---------+---------+---------+---------+---------+--+...+
   |        |         |         |         |         |*        |  |   |
   +--------+---------+---------+---------+---------+---------+--+...+
   ^                                                 ^               ^
   0                                                 51 = 1 usec   103

   To represent these values concisely a hexadecimal representation will
   be used, where each digit represents 4 binary bits.  Thus:

      0000 0000 0000 0001 = 1 timer unit (2**-12 usec, or about 244 psec)
      0000 0000 0000 1000 = 1 microsecond
      0000 0000 003E 8000 = 1 millisecond
      0000 0000 F424 0000 = 1 second
      0000 0039 3870 0000 = 1 minute
      0000 0D69 3A40 0000 = 1 hour
      0001 41DD 7600 0000 = 1 day

   Note that only the first 64 bits of the register are commonly
   represented, as that represents a count of timer units on this
   hardware.  Commonly the first 52 bits are all that are displayed, as
   that represents a count of microseconds.

4.3  Timer Units on Other Systems

   This encoding method works the same with other hardware clock
   formats.  The method uses a microsecond as the basic value and allows
   for large time differentials.

4.4  Time Base

   This specification allows for the fact that different CPU TOD clocks
   use different binary points.  For some clocks, a value of 1 could
   indicate 1 microsecond, whereas other clocks could use the value 1 to
   indicate 1 millisecond.
   In the former case, the binary digits to the
   right of that binary point measure 2**(-n) microseconds, and in the
   latter case, 2**(-n) milliseconds.

   The Time Base allows us to ensure we have a common reference, or at
   the very least, common knowledge of what the binary point is for the
   transmitted values.

   We propose a base unit for the time.  This is a 2-bit integer
   indicating the lowest granularity possible for this device.  That is,
   for a value of 00 in the Time Base field, a value of 1 in the DELTA
   fields indicates 1 millisecond.

   The possible values of Time Base are as follows:

      00 - milliseconds
      01 - microseconds
      10 - nanoseconds
      11 - picoseconds

   Time base is not necessarily equivalent to length of one timer tick.
   That is, on many, if not all, systems, the timer tick value will not
   be in complete units of nanoseconds, milliseconds, etc.  For example,
   on an IBM zSeries machine, one timer tick (or clock unit) is 2 to the
   -12th microseconds.

   Therefore, some amount of conversion may be needed to approximate
   Time Base units.

4.5  Timer-value scaling

   As discussed in [TRAM-TCPM] we propose storing not an entire
   time-interval value, but just the most significant bits of that
   value, along with a scaling factor to indicate the magnitude of the
   time-interval value.  In our case, we will use the high-order 16
   bits.  The scaling value will be the number of bits in the timer
   register to the right of the 16th significant bit.  That is, if the
   timer register contains this binary value:

      1110100011010100101001010001000000000000
      <-- 16 bits ---><------ 24 bits ------->

   then, the values stored would be 1110 1000 1101 0100 in binary (E8D4
   hexadecimal) for the time value and 24 for the scaling value.  Note
   that the displayed value is the binary equivalent of 1 second
   expressed in picoseconds.
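
   The scaling rule just described can be sketched in a few lines of
   Python.  This is an illustration under stated assumptions rather
   than a reference implementation: the function names are
   hypothetical, and negative scale values are not handled.

```python
VALUE_BITS = 16  # width of the DELTATLR and DELTATLS fields


def scale_delta(ticks: int):
    """Return (value, scale): the high-order 16 significant bits of
    `ticks` and the number of bits dropped to the right of them."""
    scale = max(ticks.bit_length() - VALUE_BITS, 0)
    return ticks >> scale, scale


def unscale_delta(value: int, scale: int) -> int:
    """Recover the truncated interval from a stored value and scale."""
    return value << scale


# The 40-bit register value shown above: 1 second in picoseconds.
one_second_psec = 0b1110100011010100101001010001000000000000
value, scale = scale_delta(one_second_psec)
```

   For the register value above, this yields E8D4 hexadecimal and a
   scale of 24.  The low-order 24 bits are discarded, so unscaling
   gives E8D4000000 rather than the original E8D4A51000.  An encoder
   that aligns the stored value on 4-bit boundaries instead would
   produce a different value and scale for the same interval; both
   decode in the same way.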

   The table below represents a device which has a TimeBase of
   picosecond (or 11).  The smallest and simplest value to represent is
   1 picosecond; the time value stored is 1, and the scaling value is 0.
   Using values from the table below, we have:

                     Time value in        Encoded    Scaling
   Delta time        picoseconds          value      decimal
   --------------------------------------------------------
   1 picosecond      1                    1          0
   1 nanosecond      3E8                  3E8        0
   1 microsecond     F4240                F424       4
   1 millisecond     3B9ACA00             3B9A       16
   1 second          E8D4A51000           E8D4       24
   1 minute          3691D6AFC000         3691       32
   1 hour            CCA2E51310000        CCA2       36
   1 day             132F4579C980000      132F       44
   365 days          1B5A660EA44B80000    1B5A       52

   Sample binary values (high order 16 bits taken)

   1 psec   1           0001
   1 nsec   3E8         0011 1110 1000
   1 usec   F4240       1111 0100 0010 0100 0000
   1 msec   3B9ACA00    0011 1011 1001 1010 1100 1010 0000 0000
   1 sec    E8D4A51000  1110 1000 1101 0100 1010 0101 0001 0000 0000 0000

4.6  Limitations with this encoding method

   If we follow the specification in [TRAM-TCPM], the size of one of
   these time-interval fields is limited to an 11-bit value and a
   five-bit scale, so that they fit into a 16-bit space.  With that
   limitation, the maximum value that could be stored in 16 bits is:

      11-bit value     Scale
      =============    ======
      1111 1111 111    1 1111

   or an encoded value of 7FF and a scale value of 31.  This value
   corresponds to any time differential between:

      11 1111 1111 1000 0000 0000 0000 0000 0000 0000 0000 (binary)
       3    F    F    8    0    0    0    0    0    0    0 (hexadecimal)

   and

      11 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 (binary)
       3    F    F    F    F    F    F    F    F    F    F (hexadecimal)

   This time value, 3FFFFFFFFFF, converts to 50 days, 21 hours, 40
   minutes and 46.511103 seconds.  A time differential 1 microsecond
   longer won't fit into 16 bits using this encoding method.
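
   The limit can be reproduced with a short calculation.  The Python
   sketch below assumes a microsecond time unit, an 11-bit value (all
   ones is 7FF hexadecimal) and a 5-bit scale, as in the [TRAM-TCPM]
   layout discussed above; the function names are hypothetical and the
   code is illustrative only.

```python
VALUE_BITS = 11
SCALE_BITS = 5
MAX_SCALE = (1 << SCALE_BITS) - 1  # 31


def encode_interval(usec: int):
    """Return (value, scale) with an 11-bit value and a 5-bit scale."""
    scale = max(usec.bit_length() - VALUE_BITS, 0)
    if scale > MAX_SCALE:
        raise OverflowError("interval does not fit in 16 bits")
    return usec >> scale, scale


def split_usec(usec: int):
    """Break microseconds into (days, hours, minutes, seconds, usec)."""
    seconds, micro = divmod(usec, 1_000_000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return days, hours, minutes, seconds, micro


# Largest interval that still encodes: value 7FF with 31 one bits below it.
largest = (0x7FF << MAX_SCALE) | ((1 << MAX_SCALE) - 1)
```

   Here largest equals 3FFFFFFFFFF hexadecimal, which split_usec breaks
   down into 50 days, 21 hours, 40 minutes, 46 seconds and 511103
   microseconds.  Passing largest + 1 to encode_interval raises
   OverflowError, because the scale would need a sixth bit.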

4.7  Lack of precision induced by timer value truncation

   When the bit values following the first 11 significant bits are
   truncated, there is obviously a loss of precision in the value.  The
   range of values that will be truncated to the same encoded value is
   2**(Scale)-1 microseconds.

   The smallest time differential value that will be truncated is

      1000 0000 0000 = 2.048 msec

   The value

      1000 0000 0001 = 2.049 msec

   will be truncated to the same encoded value, which is 400 in hex,
   with a scale value of 1.  With the scale value of 1, the value range
   is calculated as 2**1 - 1, or 1 usec, which you can see is the
   difference between these minimum and maximum values.

   With that in mind, let's look at that table of delta time values
   again, where the Precision is the range from the smallest value
   corresponding to this encoded value to the largest:

                     Time value in    Encoded
   Delta time        microseconds     value    Scale   Precision
   1 microsecond     1                1        0        0:00.000000
   1 millisecond     3E8              3E8      0        0:00.000000
   1 second          F4240            7A1      9        0:00.000511
   1 minute          3938700          727      15       0:00.032767
   1 hour            D693A400         6B4      21       0:02.097151
   1 day             141DD76000       507      26       1:07.108863
   Maximum value     3FFFFFFFFFF      7FF      31      35:47.483647

   So, when measuring the delay between transmission of two packets, or
   between the reception of two packets, any delay of up to 50 days, 21
   hours, 40 minutes and 46.511103 seconds can be stored in this
   encoded fashion within 16 bits.  When you encode, for example, a DTN
   response time delay of 50 days, 21 hours and 40 minutes, you can be
   assured of accuracy within 35 minutes and 48 seconds.

5  PDM Flow - Simple Client Server

   Following is a sample simple flow for the PDM with one packet sent
   from Host A and one packet received by Host B.  The PDM does not
   require time synchronization between Host A and Host B.  The
   calculations to derive meaningful metrics for network diagnostics are
   shown below each packet sent or received.

   Each packet, in addition to the PDM, contains information on the
   sender and receiver.  As discussed before, a 5-tuple consists of:

      SADDR : IP address of the sender
      SPORT : Port for sender
      DADDR : IP address of the destination
      DPORT : Port for destination
      PROTC : Protocol for upper layer (e.g., TCP, UDP, ICMP)

   It should be understood that the packet identification information is
   in each packet.  We will not repeat that in each of the following
   steps.

5.1 Step 1

   Packet 1 is sent from Host A to Host B.  The time for Host A is set
   initially to 10:00AM.

   The time and packet sequence number are saved by the sender
   internally.  The packet sequence number and delta times are sent in
   the packet.

   Packet 1

   +----------+             +----------+
   |          |             |          |
   |   Host   | ----------> |   Host   |
   |    A     |             |    B     |
   |          |             |          |
   +----------+             +----------+

   PDM Contents:

   PSNTP    : Packet Sequence Number This Packet:     25
   PSNLR    : Packet Sequence Number Last Received:   -
   DELTATLR : Delta Time Last Received:               -
   SCALEDTLR: Scale of Delta Time Last Received:      0
   DELTATLS : Delta Time Last Sent:                   -
   SCALEDTLS: Scale of Delta Time Last Sent:          0
   TIMEBASE : Granularity of Time:                    00 (Milliseconds)

   Internally, the sender, Host A, must keep:

      Packet Sequence Number of the last packet sent:     25
      Time the last packet was sent:                      10:00:00

   Note, the initial PSNTP from Host A starts at a random number.  In
   this case, 25.  The time in these examples is shown in seconds for
   the sake of simplicity.

5.2 Step 2

   Packet 1 is received at Host B.  Its time is set to one hour later
   than Host A.  In this case, 11:00AM.

   Internally, the receiver, Host B, must note:

      Packet Sequence Number of the last packet received: 25
      Time the last packet was received:                  11:00:03

   Note, this timestamp is in Host B time.  It has nothing whatsoever to
   do with Host A time.
   The Packet Sequence Number of the last packet received will become
   PSNLR, which will be sent out in the packet sent by Host B in the
   next step.  The time last received will be used to calculate the
   DELTATLR value to be sent out in the packet sent by Host B in the
   next step.

5.3 Step 3

   Packet 2 is sent by Host B to Host A.  Note, the initial packet
   sequence number (PSNTP) from Host B starts at a random number.  In
   this case, 12.  Before sending the packet, Host B does a calculation
   of deltas.  Since Host B knows when it is sending the packet, and it
   knows when it received the previous packet, it can do the following
   calculation:

      Sending time (packet 2) - receive time (packet 1)

   We will call the result of this calculation: Delta Time Last
   Received.

   That is:

      DELTATLR = Sending time (packet 2) - receive time (packet 1)

   Note, both sending time and receive time are saved internally in Host
   B.  They do not travel in the packet.  Only the Delta is in the
   packet.

   Assume that within Host B is the following:

      Packet Sequence Number of the last packet received: 25
      Time the last packet was received:                  11:00:03
      Packet Sequence Number of this packet:              12
      Time this packet is being sent:                     11:00:07

   We can now calculate a delta value to be sent out in the packet.
   DELTATLR becomes:

      4 seconds = 11:00:07 - 11:00:03

   This is the derived metric: Server Delay.  The time and scaling
   factor must be calculated.
   Then, this value, along with the packet sequence numbers, will be
   sent to Host A as follows:

   Packet 2

   +----------+             +----------+
   |          |             |          |
   |   Host   | <---------- |   Host   |
   |    A     |             |    B     |
   |          |             |          |
   +----------+             +----------+

   PDM Contents:

   PSNTP    : Packet Sequence Number This Packet:     12
   PSNLR    : Packet Sequence Number Last Received:   25
   DELTATLR : Delta Time Last Received:               3A35 (4 seconds)
   SCALEDTLR: Scale of Delta Time Last Received:      25
   DELTATLS : Delta Time Last Sent:                   -
   SCALEDTLS: Scale of Delta Time Last Sent:          0
   TIMEBASE : Granularity of Time:                    00 (Milliseconds)

   The metric left to be calculated is the Round-Trip Delay.  This will
   be calculated by Host A when it receives Packet 2.

5.4 Step 4

   Packet 2 is received at Host A.  Remember, its time is set to one
   hour earlier than Host B.  Internally, it must note:

      Packet Sequence Number of the last packet received: 12
      Time the last packet was received:                  10:00:12

   Note, this timestamp is in Host A time.  It has nothing whatsoever to
   do with Host B time.

   So, now, Host A can calculate total end-to-end time.  That is:

      End-to-End Time = Time Last Received - Time Last Sent

   For example, packet 25 was sent by Host A at 10:00:00.  Packet 12 was
   received by Host A at 10:00:12 so:

      End-to-End time = 10:00:12 - 10:00:00 or 12 seconds

   This is the Server and Network RT delay combined.  This time may also
   be called total Overall Round-trip time (which includes Network RTT
   and Host Response Time).

   This derived metric we will call DELTATLS or Delta Time Last Sent.

   We can now also calculate round trip delay.  The formula is:

      Round trip delay = DELTATLS - DELTATLR

   Or:

      Round trip delay = 12 - 4 or 8 seconds

   Now, the only problem is that at this point all metrics are in Host A
   only and not exposed in a packet.  To do that, we need a third
   packet.

   Note: this simple example assumes one send and one receive.
   That is done only for purposes of explaining the function of the PDM.
   In cases where there are multiple packets returned, one would take
   the time in the last packet in the sequence.  The calculation of such
   timings and intelligent processing is the function of post-processing
   of the data.

5.5 Step 5

   Packet 3 is sent from Host A to Host B.

   +----------+             +----------+
   |          |             |          |
   |   Host   | ----------> |   Host   |
   |    A     |             |    B     |
   |          |             |          |
   +----------+             +----------+

   PDM Contents:

   PSNTP    : Packet Sequence Number This Packet:     26
   PSNLR    : Packet Sequence Number Last Received:   12
   DELTATLR : Delta Time Last Received:               0
   SCALEDTLR: Scale of Delta Time Last Received:      0
   DELTATLS : Delta Time Last Sent:                   105e (12 seconds)
   SCALEDTLS: Scale of Delta Time Last Sent:          26
   TIMEBASE : Granularity of Time:                    00 (Milliseconds)

   To calculate Two-Way Delay, any packet capture device may look at
   these packets and do what is necessary.

6 Other Flows

   What we have discussed so far is a simple flow with one packet sent
   and one returned.  Let's look at how PDM may be useful in other
   types of flows.

6.1 PDM Flow - One Way Traffic

   The flow on a particular session may not be a send-receive paradigm.
   Let us consider some other situations.  In the case of a one-way
   flow, one might see the following:

   Packet  Sender    PSN          PSN         Delta Time  Delta Time
                     This Packet  Last Recvd  Last Recvd  Last Sent
   =====================================================================
   1       Server    1            0           0           0
   2       Server    2            0           0           5
   3       Server    3            0           0           12
   4       Server    4            0           0           20

   What does this mean and how is it useful?

   In a one-way flow, only the Delta Time Last Sent will be seen as
   used.  Recall, Delta Time Last Sent is the difference between the
   send of one packet from a device and the next.  This is a measure of
   throughput for the sender - according to the sender's point of view.
   That is, it is a measure of how fast the application itself (with
   stack time included) is able to send packets.

   How might this be useful?  If one is having a performance issue at
   the client and sees that packet 2, for example, is sent after 5
   microseconds from the server but takes 3 minutes to arrive at the
   destination, then one may safely conclude that there are delays in
   the path other than at the server which may be causing the delivery
   issue of that packet.  Such delays may include the network links,
   middle-boxes, etc.

   Now, true one-way traffic is quite rare.  What people often mean by
   "one-way" traffic is an application such as FTP where a group of
   packets (for example, a TCP window size worth) is sent, then the
   sender waits for acknowledgment.  This type of flow would actually
   fall into the "multiple-send" traffic model.

6.2 PDM Flow - Multiple Send Traffic

   Assume that two packets are sent for each ACK from the server.

   Packet  Sender    PSN          PSN         Delta Time  Delta Time
                     This Packet  Last Recvd  Last Recvd  Last Sent
   =====================================================================
   1       Server    1            0           0           0
   2       Server    2            0           0           5
   3       Client    1            2           20          0
   4       Server    3            1           10          15

   How might this be used?

   Notice that in packet 3, the client has a value of Delta Time Last
   Received of 20.  Recall that Delta Time Last Received is the send
   time of packet 3 - receive time of packet 2.  So, what does one know
   now?  In this case, Delta Time Last Received is the processing time
   for the Client to send the next packet.

   How to interpret this depends on what is actually being sent.
   Remember, PDM is not being used in isolation, but to supplement the
   fields found in other headers.  Let's take some examples:

   1.  Client is sending a standalone TCP ACK.  One would find this by
   looking at the payload length in the IPv6 header and the TCP
   Acknowledgement field in the TCP header.
   So, in this case, the client is taking 20 units to send back the ACK.
   This may or may not be interesting.

   2.  Client is sending data with the packet.  Again, one would find
   this by looking at the payload length in the IPv6 header and the TCP
   Acknowledgement field in the TCP header.  So, in this case, the
   client is taking 20 units to send back data.  This may represent
   "User Think Time".  Again, this may or may not be interesting, in
   isolation.  But, if there is a performance problem receiving data at
   the server, then taken in conjunction with RTT or other packet timing
   information, this information may be quite interesting.

   Of course, one also needs to look at the PSN Last Received field to
   make sure of the interpretation of this data.  That is, to make
   sure that the Delta Last Received corresponds to the packet of
   interest.

   The benefits of PDM are that we have such information available in a
   uniform manner for all applications and all protocols without
   extensive changes required to applications.

6.3 PDM Flow - Multiple Send with Errors

   One might wonder if all of the functions of PDM might be better
   suited to TCP or a TCP option.  Let us take the case of how PDM may
   help in a case of TCP retransmissions in a way that TCP options or
   TCP ACK / SEQ would not.

   Assume that three packets are sent with each send from the server.

   From the server, this is what is seen.

   Pkt  Sender  PSN       PSN        Delta Time  Delta Time  TCP  Data
                This Pkt  LastRecvd  LastRecvd   LastSent    SEQ  Bytes
   =====================================================================
   1    Server  1         0          0           0           123  100
   2    Server  2         0          0           5           223  100
   3    Server  3         0          0           5           333  100

   The client, however, does not get all the packets.  From the client,
   this is what is seen for the packets sent from the server.

   Pkt  Sender  PSN       PSN        Delta Time  Delta Time  TCP  Data
                This Pkt  LastRecvd  LastRecvd   LastSent    SEQ  Bytes
   =====================================================================
   1    Server  1         0          0           0           123  100
   2    Server  3         0          0           5           333  100

   Let's assume that the server now retransmits the packet.
   (Obviously, a duplicate acknowledgment sequence for fast retransmit
   or a retransmit timeout would occur.  To illustrate the point, these
   packets are being left out.)

   So, if a TCP retransmission is done, then from the client, this is
   what is seen for the packets sent from the server.

   Pkt  Sender  PSN       PSN        Delta Time  Delta Time  TCP  Data
                This Pkt  LastRecvd  LastRecvd   LastSent    SEQ  Bytes
   =====================================================================
   1    Server  4         0          0           30          223  100

   The server has resent the old packet 2 with TCP sequence number of
   223.  The retransmitted packet now has a PSN This Packet value of 4.
   The Delta Last Sent is 30 - the time between sending the packet with
   PSN of 3 and this current packet.

   Let's say that packet 4 STILL does not make it.  Then, after some
   amount of time (RTO), the packet with TCP sequence number of 223 is
   resent.

   From the client, this is what is seen for the packets sent from the
   server.

   Pkt  Sender  PSN       PSN        Delta Time  Delta Time  TCP  Data
                This Pkt  LastRecvd  LastRecvd   LastSent    SEQ  Bytes
   =====================================================================
   1    Server  5         0          0           60          223  100

   If this packet now makes it, one has a very good idea that packets
   exist which are being sent from the server as retransmissions and not
   making it to the client.  This is because the PSN of the resent
   packet from the server is 5 rather than 4.  If we had used TCP
   sequence number alone, we would never have seen this situation,
   because the TCP sequence number in all situations is 223.
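
   The reasoning above can be sketched as a small post-processing step
   over a trace taken at the client.  This is only an illustration under
   stated assumptions: the trace is a list of (PSN, TCP SEQ) pairs from
   one sender in arrival order, and PSN wraparound and reordering are
   ignored.  The function name missing_psns is hypothetical.

```python
from typing import List, Tuple


def missing_psns(observed: List[Tuple[int, int]]) -> List[int]:
    """Return the PSNs skipped between consecutive packets from one sender.

    observed holds (psn, tcp_seq) pairs in arrival order.  The TCP
    sequence number is carried along only for context; the gaps are
    found from the PSN alone.
    """
    missing: List[int] = []
    for (prev_psn, _), (psn, _) in zip(observed, observed[1:]):
        # Every PSN strictly between two neighbours never arrived.
        missing.extend(range(prev_psn + 1, psn))
    return missing


if __name__ == "__main__":
    # Client-side view from the example: PSN 1, PSN 3, then the second
    # retransmission with PSN 5.  TCP SEQ 223 is seen only once.
    client_trace = [(1, 123), (3, 333), (5, 223)]
    print(missing_psns(client_trace))
```

   With the example trace, the sketch reports PSN 2 (the original
   packet) and PSN 4 (the first retransmission) as never received, which
   the single occurrence of TCP sequence number 223 could not show.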

   This situation would be experienced by the user of the application
   (the human being actually sitting somewhere) as a "hang" or a long
   delay between packets.  On large networks, to diagnose problems such
   as these where packets are lost somewhere on the network, one has to
   take multiple traces to find out exactly where.

   The first thing is to start with doing a trace at the client and the
   server.  So, we can see if the server sent a particular packet and
   the client received it.  If the client did not receive it, then we
   start tracking back to trace points at the router right after the
   server and the router right before the client.  Did they get these
   packets which the server has sent?  This is a time-consuming
   activity.

   With PDM, we can speed up the diagnostic time because we may be able
   to use only the trace taken at the client to see what the server is
   sending.

7 Potential Overhead Considerations

   Questions have been posed as to the potential overhead of PDM.
   First, PDM is entirely optional.  That is, a site may choose to
   implement PDM or not as they wish.  If they are happy with the costs
   of PDM vs. the benefits, then the choice should be theirs.

   Below is a table outlining the potential overhead in terms of
   additional time to deliver the response to the end user for various
   assumed RTTs.

   Bytes       RTT          Bytes      Bytes   New       Overhead
   in Packet                Per Milli  in PDM  RTT
   =====================================================================
   1000        1000 milli   1          16      1016.000  16.000 milli
   1000        100 milli    10         16      101.600   1.600 milli
   1000        10 milli     100        16      10.160    .160 milli
   1000        1 milli      1000       16      1.016     .016 milli

   Below are some examples of actual RTTs for packets traversing large
   enterprise networks.  The first example is for packets going to
   multiple business partners.

   Bytes       RTT          Bytes      Bytes   New       Overhead
   in Packet                Per Milli  in PDM  RTT
   =====================================================================
   1000        17 milli     58         16      17.360    .360 milli

   The second example is for packets at a large enterprise customer
   within a data center.  Notice that the scale is now in microseconds
   rather than milliseconds.

   Bytes       RTT          Bytes      Bytes   New       Overhead
   in Packet                Per Micro  in PDM  RTT
   =====================================================================
   1000        20 micro     50         16      20.320    .320 micro

8 Security Considerations

   The PDM MUST NOT be changed dynamically via packet flow as this
   creates a possibility for potential security violations or DoS
   attacks by numerous packets turning the header on and off.

   Attackers may also send many packets from multiple ports, for example
   by doing a port scan.  This will cause the stack to create many
   control blocks.  This is the same problem as seen for SYN flood
   attacks.  Similar protections should be implemented by the stack to
   preserve the integrity of memory.

9 IANA Considerations

   Option Type to be assigned by IANA [RFC2780].

10 References

10.1 Normative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

   [RFC2681]  Almes, G., Kalidindi, S., and M. Zekauskas, "A Round-trip
              Delay Metric for IPPM", RFC 2681, September 1999.

   [RFC2780]  Bradner, S. and V. Paxson, "IANA Allocation Guidelines For
              Values In the Internet Protocol and Related Headers",
              BCP 37, RFC 2780, March 2000.

   [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
              RFC 4303, December 2005.

10.2 Informative References

   [TRAM-TCPM]  Trammel, B., "Encoding of Time Intervals for the TCP
                Timestamp Option-01", Internet Draft, July 2013.  [Work
                in Progress]

   [IBM-POPS]   IBM Corporation, "IBM z/Architecture Principles of
                Operation", SA22-7832, 1990-2012.

11 Acknowledgments

   The authors would like to thank Keven Haining, Al Morton, Brian
   Trammel, David Boyes, Bill Jouris, Richard Scheffenegger, and Rick
   Troth for their comments and assistance.

Authors' Addresses

   Nalini Elkins
   Inside Products, Inc.
   36A Upper Circle
   Carmel Valley, CA 93924
   United States
   Phone: +1 831 659 8360
   Email: nalini.elkins@insidethestack.com
   http://www.insidethestack.com

   Robert Hamilton
   Chemical Abstracts Service
   A Division of the American Chemical Society
   2540 Olentangy River Road
   Columbus, Ohio 43202
   United States
   Phone: +1 614 447 3600 x2517
   Email: rhamilton@cas.org
   http://www.cas.org

   Michael S. Ackermann
   Blue Cross Blue Shield of Michigan
   P.O. Box 2888
   Detroit, Michigan 48231
   United States
   Phone: +1 310 460 4080
   Email: mackermann@bcbsmi.com
   http://www.bcbsmi.com