idnits 2.17.1

draft-elkins-v6ops-ipv6-end-to-end-rt-needed-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 3, 2013) is 3857 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 1323 (Obsoleted by RFC 7323)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 2679 (Obsoleted by RFC 7679)

     Summary: 3 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

INTERNET-DRAFT                                                 N. Elkins
Intended Status: Informational                           Inside Products
                                                            M. Ackermann
                                                           BCBS Michigan
                                                               W. Jouris
                                                         Inside Products
                                                              K. Haining
                                                                 US Bank
                                                              S. Perdomo
                                                                    DTCC
Expires: April 2014                                      October 3, 2013

          End-to-end Response Time Needed for IPv6 Diagnostics
            draft-elkins-v6ops-ipv6-end-to-end-rt-needed-01

Abstract

   To diagnose performance and connectivity problems, metrics on real
   (non-synthetic) transmission are critical for timely end-to-end
   problem resolution.
   Such diagnostics may be real-time or after the fact, but must not
   impact an operational production network. The base metrics are:
   packet sequence number and packet timestamp. Metrics derived from
   these will be described separately. This document provides the
   background and rationale for the requirement for end-to-end response
   time.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1  Background
     1.1  Why End-to-end Response Time is Needed
     1.2  Trending of Response Time Data
     1.3  What to measure?
     1.4  TCP Timestamp not enough
     1.5  Inadequacy of Current Instrumentation Technology
       1.5.1  Synthetic transactions
       1.5.2  PING
       1.5.3  Other Estimates of Network Time
       1.5.4  Server / Client Agents
   2  Solution Parameters
     2.1  Rationale for proposed solution
     2.2  Merits of timestamp in PDM
     2.3  What kind of timestamp?
   3  Backward Compatibility
   4  Security Considerations
   5  IANA Considerations
   6  References
     6.1  Normative References
     6.2  Informative References
   7  Acknowledgments
   Authors' Addresses

1  Background

   To diagnose performance and connectivity problems, metrics on real
   (non-synthetic) transmission are critical for timely end-to-end
   problem resolution. Such diagnostics may be real-time or after the
   fact, but must not impact an operational production network. The base
   metrics are: packet sequence number and packet timestamp. Metrics
   derived from these will be described separately. This document
   provides the background and rationale for the requirement for
   end-to-end response time.

   For background, please see draft-ackermann-tictoc-pdm-ntp-usage-00
   [ACKPDM], draft-elkins-v6ops-ipv6-packet-sequence-needed-01 [ELKPSN],
   draft-elkins-v6ops-ipv6-pdm-recommended-usage-01 [ELKPUSE],
   draft-elkins-6man-ipv6-pdm-dest-option-02 [ELKPDM] and
   draft-elkins-ippm-pdm-metrics-00 [ELKIPPM]. These drafts are
   companions to this document.

   As discussed in the above Internet-Drafts, current methods are
   inadequate for these purposes because they assume unreasonable access
   to intermediate devices, are cost prohibitive, require infeasible
   changes to a running production network, or do not provide timely
   data. The IPv6 Performance and Diagnostic Metrics destination option
   (PDM) provides a solution to these problems. This document details
   the background and need for end-to-end response time.

1.1  Why End-to-end Response Time is Needed

   The timestamps in the PDM, traveling along with the packet, will be
   used to calculate end-to-end response time without requiring agents
   in devices along the path. In many networks, end-to-end response
   times are a critical component of Service Level Agreements (SLAs).

   End-to-end response is what the user of a network system actually
   experiences. When the end user is an individual, he is generally
   indifferent to what is happening along the network; what he really
   cares about is how long it takes to get a response back. But this is
   not just a matter of individuals' personal convenience. In many
   cases, rapid response is critical to the business being conducted.

   When the end user is a device (e.g., in the Internet of Things),
   what matters is the speed with which requested data can be
   transferred -- specifically, whether the requested data can be
   transferred in time to accomplish the desired actions. This can be
   important when the relevant external conditions are subject to rapid
   change.

   Response time and consistency are not just "nice to have". On many
   networks, poor response time can cause financial hardship or endanger
   human life. In some cities, the emergency police contact system
   operates over IP, law enforcement uses TCP/IP networks, and our stock
   exchanges are settled using IP networks. The critical nature of such
   activities to our daily lives and financial well-being demands a
   solution. Section 1.5 details the current state of end-to-end
   response time monitoring.

1.2  Trending of Response Time Data

   In addition to the need for tracking current service, end-to-end
   response time is valuable for capacity planning. By tracking
   response times, and identifying trends, it becomes possible to
   determine when network capacity is being approached. This allows
   additional capacity to be obtained before service levels fall below
   requirements. Without that kind of tracking, the only option is to
   wait until there is a problem, and then scramble to get additional
   capacity on an emergency (and probably high cost) basis.

   The documents draft-elkins-v6ops-ipv6-pdm-recommended-usage-01
   [ELKPUSE] and draft-elkins-ippm-pdm-metrics-00 [ELKIPPM] detail the
   use of the PDM for capacity planning purposes.

1.3  What to measure?

   End-to-end response time can be broken down into three parts:

   - Network delay
   - Application (or server) delay
   - Client delay

   Network delay may be one-way delay [RFC2679] or round-trip delay
   [RFC2681].

   Additionally, network delay may include multiple hops. Application
   and server delay include operating system and stack time. By and
   large, the three timings are 'good enough' measurements to allow
   rapid triage to the failing component.

   Ways are available (provided by operating systems) to measure
   Application and Client times.
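
   As an illustration only, the three-part breakdown and the triage step
   described above might be sketched as follows. This is a minimal
   sketch, not part of the PDM proposal: the Exchange type, its field
   names, and the breakdown() and triage() helpers are hypothetical, and
   client delay is assumed here to be the time between receiving a
   response and sending the next request.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    """Timestamps (in seconds) for one request/response pair.

    Hypothetical field names. The client_* values are read from the
    client's clock and the server_* values from the server's clock.
    """
    client_sent: float       # client sends the request
    server_received: float   # server receives the request
    server_sent: float       # server sends the response
    client_received: float   # client receives the response
    client_next_sent: float  # client sends its next request


def breakdown(x: Exchange) -> dict:
    """Split one exchange into network, server, and client delay."""
    server = x.server_sent - x.server_received
    # Round-trip network delay: total time seen by the client minus the
    # time the server held the request. Each difference is taken on a
    # single clock, so this value does not need synchronized clocks.
    network = (x.client_received - x.client_sent) - server
    # Assumption: client delay is the gap before the next request.
    client = x.client_next_sent - x.client_received
    return {"network": network, "server": server, "client": client}


def triage(delays: dict) -> str:
    """Return the name of the component with the largest delay."""
    return max(delays, key=delays.get)
```

   For example, with made-up values in which the server holds a request
   for 240 ms of a 280 ms exchange, triage(breakdown(...)) names the
   server as the component to examine first. One-way delay, unlike the
   round-trip value used here, would require synchronized clocks.
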
   Network time can also be measured in isolation via some of the
   measurement techniques described in section 1.5. The most difficult
   portion is to integrate network time with the server or application
   times. Products exist to do this, but they are available only at an
   exorbitant cost, require agents, and will likely become more
   cost-prohibitive as the speed of networks grows and as the world
   becomes more connected via mobile devices. This is discussed in
   detail in section 1.5.

   Measuring network time requires precise timestamps. Furthermore,
   those timestamps need to occur at the end-points of the transactions
   being measured. They also need to be available regardless of the
   protocol being used by the transaction. That is, the timestamp has
   to be available in one of the extensions to the IP header; this is
   provided by the PDM.

1.4  TCP Timestamp not enough

   Some suggest that the TCP Timestamp option might be sufficient to
   calculate end-to-end response time.

   The TCP Timestamp option is defined in RFC1323 [RFC1323]. The option
   supports round-trip time measurement (RTTM) and Protection Against
   Wrapped Sequences (PAWS), which allows old packets to be discarded
   when the TCP sequence number wraps.

   The problems with the TCP Timestamp option are:

   1. Not everyone turns this on.

   2. It is only available for TCP applications.

   3. No time synchronization between sender and receiver.

   4. No indication of date in long-running connections (that is,
   connections which last longer than one day).

   5. The granularity of the timestamp is at best at the millisecond
   level. In the future, as speeds of devices and networks grow, this
   level of granularity will be inadequate. Even today, on many
   networks, the timings are at the microsecond level, not the
   millisecond level.

1.5  Inadequacy of Current Instrumentation Technology

   The current technology includes:

   1. Synthetic transactions

   2. Pings

   3. Other Estimates of network time

   4. Server / Client Agents

1.5.1  Synthetic transactions

1.5.2  PING

   An ICMP ping measures network time. First, you can PING the
   remote device. Then you assume that the time it takes to get a
   response to a PING is the same as the time that a transaction
   (regardless of packet size) would take to traverse the network.
   However, QoS rules, firewalls, etc. may mean that PING (and other
   synthetic transactions) is not subject to the same conditions as
   real traffic.

1.5.3  Other Estimates of Network Time

   If a packet trace is done, it is possible to look at the time between
   when a response was seen to be sent at the packet capture device and
   when the ACK for the response comes back.

   If you assume that the ACK took the same amount of time as the
   original query, you have the network time. Unfortunately, the time
   for the ACK may not be the same as the time for a much larger query
   transaction to traverse the network.

   The biggest problem with this method is that of TCP delayed
   acknowledgements. If the client is doing delayed ACKs, then the ACK
   will be held until the next request is ready to go out. In this
   case, the time to receive the ACK has no correlation with network
   time.

1.5.4  Server / Client Agents

   There are also products which claim that they can determine
   end-to-end response times, integrating server and network times, and
   indeed they can do so. But they require agents which must be placed
   at each point which is to be monitored. That is, it is necessary to
   add those agents EVERYWHERE around the network, at a very high cost.
   These kinds of products can be purchased by only the richest 1% of
   corporations. As the speed of networks grows, and as the world
   becomes more connected via mobile devices, such products will only
   become more expensive, if indeed their technology can keep up.

   TCP/IP networks today are used throughout the world.
   The need for adequate performance will become more and more critical.
   A method that is scalable and affordable is needed to support this
   growth.

2  Solution Parameters

   What is needed is:

   1) A method to identify and/or track the behavior of a connection
   without assuming access to the transport devices.

   2) A method to observe a connection in flight without introducing
   agents.

   3) A method to observe arbitrary flows at multiple points within a
   network and correlate the results of those observations in a
   consistent manner.

   4) A method to signal and correlate transport issues to application
   end-to-end behavior.

   5) A method which does not require changes to a production network in
   real time.

   6) Adequate granularity in the measurement technique to provide the
   needed metrics.

   7) A method that is scalable to very large networks.

   8) A method that is affordable to all.

2.1  Rationale for proposed solution

   The current IPv6 specification does not provide a timestamp or
   similar field in the IPv6 main header or in any extension header.
   So, we propose the IPv6 Performance and Diagnostic Metrics
   destination option (PDM) [ELKPDM].

2.2  Merits of timestamp in PDM

   Advantages include:

   1. Less overhead than other alternatives.

   2. Real measure of actual transactions.

   3. Less cost to provide solutions.

   4. More accurate and complete information.

   5. Independence from transport layer protocols.

   6. Ability to span organizational boundaries with consistent
   instrumentation.

   In other words, this is a solution to a long-standing problem. The
   PDM will provide a metric which will allow those responsible for
   network support to determine what is happening in their network
   without expensive equipment (agents) at each device.

   The PDM does not solve every response time issue for every situation.
   Network connections with multiple hops will still need more granular
   metrics, as will the differentiation between multiple components at
   each host. That is, TCP/IP stack time vs. application time will
   still need to be broken out by client software. What the PDM does
   provide is triage: a quick determination of whether the problem is
   in the network or in the server or application.

2.3  What kind of timestamp?

   Questions arise about exactly the kind of timestamp to use. Both the
   Network Time Protocol (NTP) [RFC5905] and Precision Time Protocol
   (PTP) [IEEE1588] are used to provide timing on TCP/IP networks.

   NTP has evolved within the IETF structure while PTP has evolved
   within the Institute of Electrical and Electronics Engineers (IEEE)
   community. By and large, Windows, Linux, and IBM mainframe systems
   use NTP. These are the source and destination systems for packets.
   Intermediate nodes such as routers and switches may prefer PTP.

   Since we are describing a new extension header for destination
   systems, the timestamp to be used will be in accordance with NTP. In
   the documents draft-ackermann-tictoc-pdm-ntp-usage-00 [ACKPDM] and
   draft-elkins-v6ops-ipv6-pdm-recommended-usage-01 [ELKPUSE], we will
   discuss guidelines for implementing NTP for use with the PDM.

3  Backward Compatibility

   The scheme proposed in this document is backward compatible with all
   the currently defined IPv6 extension headers. According to RFC2460
   [RFC2460], if the destination node does not recognize this option, it
   should skip over this option and continue processing the header.

4  Security Considerations

   There are no security considerations.

5  IANA Considerations

   There are no IANA considerations.

6  References

6.1  Normative References

   [RFC1323]  Jacobson, V., Braden, R., and D. Borman, "TCP Extensions
              for High Performance", RFC 1323, May 1992.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2681]  Almes, G., Kalidindi, S., and M. Zekauskas, "A Round-trip
              Delay Metric for IPPM", RFC 2681, September 1999.

   [RFC5905]  Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch,
              "Network Time Protocol Version 4: Protocol and Algorithms
              Specification", RFC 5905, June 2010.

   [IEEE1588] IEEE 1588-2002 standard, "Standard for a Precision Clock
              Synchronization Protocol for Networked Measurement and
              Control Systems".

6.2  Informative References

   [ACKPDM]   Ackermann, M., "draft-ackermann-tictoc-pdm-ntp-usage-00",
              Internet Draft, September 2013.

   [ELKPSN]   Elkins, N., "draft-elkins-v6ops-ipv6-packet-sequence-
              needed-01", Internet Draft, September 2013.

   [ELKPDM]   Elkins, N., "draft-elkins-6man-ipv6-pdm-dest-option-02",
              Internet Draft, September 2013.

   [ELKPUSE]  Elkins, N., "draft-elkins-v6ops-ipv6-pdm-recommended-
              usage-01", Internet Draft, September 2013.

   [ELKIPPM]  Elkins, N., "draft-elkins-ippm-pdm-metrics-00", Internet
              Draft, September 2013.

7  Acknowledgments

   The authors would like to thank Rick Troth, David Boyes,
   and Fred Baker for their comments.

Authors' Addresses

   Nalini Elkins
   Inside Products, Inc.
   36A Upper Circle
   Carmel Valley, CA 93924
   United States
   Phone: +1 831 659 8360
   Email: nalini.elkins@insidethestack.com
   http://www.insidethestack.com

   Michael S. Ackermann
   Blue Cross Blue Shield of Michigan
   P.O. Box 2888
   Detroit, Michigan 48231
   United States
   Phone: +1 310 460 4080
   Email: mackermann@bcbsmi.com
   http://www.bcbsmi.com

   Keven Haining
   US Bank
   16900 W Capitol Drive
   Brookfield, WI 53005
   United States
   Phone: +1 262 790 3551
   Email: keven.haining@usbank.com
   http://www.usbank.com

   Sigfrido Perdomo
   Depository Trust and Clearing Corporation
   55 Water Street
   New York, NY 10055
   United States
   Phone: +1 917 842 7375
   Email: s.perdomo@dtcc.com
   http://www.dtcc.com

   William Jouris
   Inside Products, Inc.
   36A Upper Circle
   Carmel Valley, CA 93924
   United States
   Phone: +1 925 855 9512
   Email: bill.jouris@insidethestack.com
   http://www.insidethestack.com