IPPM Working Group                                             G. Mirsky
Internet-Draft                                                 ZTE Corp.
Intended status: Standards Track                            W. Lingqiang
Expires: April 20, 2019                                          G. Zhui
                                                         ZTE Corporation
                                                        October 17, 2018


             Hybrid Two-Step Performance Measurement Method
                  draft-mirsky-ippm-hybrid-two-step-02

Abstract

   Development of, and advancements in, automation of network operations
   brought new requirements for measurement methodology.
   Among them is the ability to collect instant network state as the
   packet is being processed by the networking elements along its path
   through the domain.  This document introduces a new hybrid
   measurement method, referred to as hybrid two-step, as it separates
   the act of measuring and/or calculating the performance metric from
   the act of collecting and transporting network state.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 20, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions used in this document
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Problem Overview
   4.  Theory of Operation
     4.1.  Operation of the HTS Ingress Node
     4.2.  Operation of the HTS Transient Node
     4.3.  Operation of the HTS Egress Node
     4.4.  Considerations for HTS Timers
   5.  IANA Considerations
   6.  Security Considerations
   7.  Acknowledgments
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   Successful resolution of challenges of automated network operation,
   as part of, for example, overall service orchestration or data center
   operation, relies on a timely collection of accurate information that
   reflects the state of network elements on an unprecedented scale.
   Because performing the analysis and acting upon the collected
   information requires considerable computing and storage resources,
   the network state information is unlikely to be processed by the
   network elements themselves but will be relayed into the data storage
   facilities, e.g., data lakes.  The process of producing and
   collecting network state information, also referred to in this
   document as network telemetry, and transporting it for
   post-processing should work equally well with data flows or with
   test packets injected into the network.  RFC 7799 [RFC7799]
   describes a combination of elements of passive and active
   measurement as a hybrid measurement.

   Several technical methods have been proposed to enable collection of
   network state information at the instant of packet processing,
   among them [P4.INT] and [I-D.ietf-ippm-ioam-data].

   This document introduces Hybrid Two-Step (HTS) as a new hybrid
   measurement method that separates measuring or calculating the
   performance metric from collecting and transporting this
   information.  The Hybrid Two-Step method extends the two-step mode of
   Residence Time Measurement (RTM) defined in [RFC8169] to on-path
   network state collection and transport.

2.  Conventions used in this document

2.1.  Terminology

   RTM Residence Time Measurement

   ECMP Equal Cost Multipath

   MTU Maximum Transmission Unit

   HTS Hybrid Two-Step

   Network telemetry - the process of collecting and reporting network
   state

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Problem Overview

   Performance measurements are meant to provide data that characterize
   conditions experienced by traffic flows in the network and possibly
   trigger operational changes (e.g., re-route of flows, or changes in
   resource allocations).  Modifications to a network are determined
   based on the performance metric information available at the time
   that a change is to be made.  The correctness of this determination
   is based on the quality of the collected metrics data.  The quality
   of collected measurement data is defined by:

   o  the resolution and accuracy of each measurement;

   o  predictability of both the time at which each measurement is made
      and the timeliness of measurement collection data delivery for
      use.

   Consider the case of delay measurement that relies on collecting the
   time of packet arrival at the ingress interface and the time of
   packet transmission at the egress interface.  The method includes
   recording a local clock value on receiving the first octet of an
   affected message at the device ingress, and again recording the clock
   value on transmitting the first octet of the same message at the
   device egress.  In this ideal case, the difference between the two
   recorded clock times corresponds to the time that the message spent
   in traversing the device.  In practice, the time that has been
   recorded can differ from the ideal case by a fixed amount, and a
   correction can be applied to compute the same time difference taking
   into account the known fixed time associated with the actual
   measurement.  In this way, the resulting time difference reflects any
   variable delay associated with queuing.

   Depending on the implementation, it may be a challenge to compute the
   difference between message arrival and departure times and - on the
   fly - add the necessary residence time information to the same
   message.  That task may become even more challenging if the packet is
   encrypted.  Implementations SHOULD NOT record a message departure
   time that may be significantly inaccurate in the same message, as the
   result of estimating the departure time that includes the variable
   time component (such as that associated with buffering and queuing of
   the message).  A similar problem may cause a lower quality of, for
   example, information that characterizes utilization of the egress
   interface.  If the data cannot be obtained consistently, without
   variable delays for additional processing, the information may not
   accurately reflect the state at the egress interface.  To mitigate
   this problem, [RFC8169] defined an RTM two-step mode.

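   As a minimal sketch of the residence time computation described
   above, the following Python fragment subtracts the two recorded
   clock values and applies the fixed correction.  The function name
   and the two latency parameters are hypothetical and are not defined
   by this document; the sketch assumes that the ingress clock is read
   a known fixed time after the first octet arrives and that the egress
   clock is read a known fixed time before the first octet leaves.

```python
def residence_time_ns(ingress_ts_ns: int, egress_ts_ns: int,
                      ingress_latency_ns: int = 0,
                      egress_latency_ns: int = 0) -> int:
    """Return the corrected residence time in nanoseconds.

    ingress_latency_ns and egress_latency_ns are hypothetical,
    implementation-specific constants (assumption, not from the draft).
    """
    # Move the ingress timestamp back to the real arrival of the first octet.
    arrival = ingress_ts_ns - ingress_latency_ns
    # Move the egress timestamp forward to the real departure of the first octet.
    departure = egress_ts_ns + egress_latency_ns
    # What remains is the time spent in the device, including queuing.
    return departure - arrival
```

   With both corrections left at zero the function reduces to the
   ideal-case difference between the two recorded clock values.
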
   Another challenge associated with methods that collect network state
   information into the actual data packet is the risk of exceeding the
   Maximum Transmission Unit (MTU) size, especially if the packet
   traverses overlay domains or VPNs.  Since fragmentation is not
   available at the transport network, operators may have to reduce the
   MTU size advertised to the client layer or risk missing network state
   data for part, most probably the latter part, of the path.

4.  Theory of Operation

   The HTS method consists of two phases:

   o  performing a measurement or obtaining network state information,
      of one or more types, on a node;

   o  collecting and transporting the measurement.

   HTS uses the HTS Trigger carried in a data packet or a specially
   constructed test packet.  The nature of the HTS Trigger is transport
   network layer specific, and its description is outside the scope of
   this document.  The packet that includes the HTS Trigger is also
   referred to in this document as the trigger packet.

   The HTS method uses the HTS Follow-up packet, in this document also
   referred to as the follow-up packet, to collect measurement and
   network state data from the nodes.  The node that creates the HTS
   Trigger also generates the HTS Follow-up packet.  The follow-up
   packet contains characteristic information, copied from the trigger
   packet, sufficient for participating HTS nodes to associate it with
   the original packet.  The exact composition of the characteristic
   information is specific for each transport network, and its
   definition is outside the scope of this document.  The follow-up
   packet also uses the same encapsulation as the data packet.
   If only network information, and not the payload, is used to
   load-balance flows in equal cost multipath (ECMP), use of the network
   encapsulation identical to that of the trigger packet should
   guarantee that the follow-up packet remains in-band, i.e., traverses
   the same set of network elements as the original data packet with the
   HTS Trigger.  A node MUST NOT hold more than one outstanding
   follow-up packet for the given path.  That means that if the node
   receives an HTS Trigger for a flow on which it still waits for the
   follow-up packet to the previous HTS Trigger, the node will originate
   a follow-up packet to transport the former set of network state data
   and transmit it before it sends the follow-up packet with the latest
   collection of network state information.

4.1.  Operation of the HTS Ingress Node

   A node that originates the HTS Trigger is referred to as the HTS
   ingress node.  As stated, the ingress node originates the follow-up
   packet.  The follow-up packet has the transport network encapsulation
   identical to that of the trigger packet, followed by the HTS shim and
   one or more telemetry information elements encoded as
   Type-Length-Value (TLV).  Figure 1 displays an example of the
   follow-up packet format.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Transport Network                       ~
   |                         Encapsulation                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|HTS Shim Len|    Flags     |        Sequence Number        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Telemetry Data Profile                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                      Telemetry Data TLVs                      ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 1: Follow-up Packet Format

   Fields of the HTS shim are as follows:

      Version (Ver) is a two-bit field.  It specifies the version of
      the HTS shim format.  This document defines the format for the
      0b00 value of the field.

      HTS Shim Length is a six-bit field.  It defines the length of the
      HTS shim in bytes.  The minimal value of the field is four bytes.

      Flags is an eight-bit field.  The format of the Flags field is
      displayed in Figure 2.

         Full (F) flag MUST be set to zero by the node originating the
         HTS follow-up packet and MUST be set to one by the node that
         does not add its telemetry data to avoid exceeding MTU size.

         The node originating the follow-up packet MUST zero the
         Reserved field, and the field MUST be ignored on receipt.

      Sequence Number is a 16-bit field.  The value of the field
      reflects the number of the HTS follow-up packet in the sequence of
      the HTS follow-up packets originated in response to the same HTS
      trigger.  The ingress node MUST set the value of the field to
      zero.

      Telemetry Data Profile is an optional variable-length field of
      bit-sized flags.  Each flag indicates a requested type of
      telemetry data to be collected at each HTS node.  The increment
      of the field is four bytes with a minimum length of zero.

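   The shim layout in Figure 1 can be illustrated with a short Python
   sketch that encodes and decodes the fixed part of the shim and the
   optional Telemetry Data Profile.  The function names are
   hypothetical.  The sketch assumes network byte order, assumes that
   HTS Shim Length counts the four fixed bytes plus the profile, and
   assumes that the Full flag is the most significant bit of the Flags
   octet (bit 0 in Figure 2); none of these details is stated
   normatively in this document.

```python
import struct

HTS_VERSION = 0b00
FLAG_FULL = 0x80  # assumption: F is bit 0 (most significant) of Flags


def pack_shim(seq: int, profile: bytes = b"", full: bool = False) -> bytes:
    """Build an HTS shim: 4 fixed bytes plus an optional profile."""
    if len(profile) % 4:
        raise ValueError("Telemetry Data Profile grows in 4-byte steps")
    shim_len = 4 + len(profile)
    if shim_len > 0x3F:
        raise ValueError("shim does not fit the six-bit length field")
    first = (HTS_VERSION << 6) | shim_len      # Ver (2 bits) + length (6 bits)
    flags = FLAG_FULL if full else 0           # Reserved bits stay zero
    return struct.pack("!BBH", first, flags, seq) + profile


def unpack_shim(data: bytes) -> dict:
    """Parse the shim at the start of data; TLVs may follow it."""
    first, flags, seq = struct.unpack_from("!BBH", data)
    shim_len = first & 0x3F
    if shim_len < 4 or shim_len % 4 or shim_len > len(data):
        raise ValueError("invalid HTS Shim Length")
    return {
        "version": first >> 6,
        "shim_len": shim_len,
        "full": bool(flags & FLAG_FULL),       # Reserved bits are ignored
        "seq": seq,
        "profile": data[4:shim_len],
    }
```

   An ingress node would call pack_shim(seq=0, ...) in this sketch,
   since the ingress node sets the Sequence Number to zero.
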
    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |F|  Reserved   |
   +-+-+-+-+-+-+-+-+

                      Figure 2: Flags Field Format

4.2.  Operation of the HTS Transient Node

   Upon receiving the trigger packet, the HTS transient node MUST:

   o  copy the transport information;

   o  start the HTS Follow-up timer for the obtained flow.

   Upon receiving the follow-up packet, the HTS transient node MUST:

   o  verify that the matching transport information exists and the Full
      flag is cleared, then stop the associated HTS Follow-up timer;

   o  collect telemetry data requested in the Telemetry Data Profile
      field or defined by the local HTS policy;

   o  if adding the collected telemetry would not exceed the MTU, then
      append the data to the Telemetry Data TLVs field and transmit the
      follow-up packet;

   o  otherwise, set the value of the Full flag to one and transmit the
      received follow-up packet;

   o  in the latter case, also originate a new follow-up packet using
      the same transport information.  The value of the Sequence Number
      field in the HTS shim MUST be set to the value of the field in the
      received follow-up packet incremented by one.  Copy the collected
      telemetry data into the new packet and transmit it.

   If the follow-up timer expires, the transient node MUST:

   o  originate the follow-up packet using transport information
      associated with the expired timer;

   o  initialize the HTS shim by setting the Version field to 0b00 and
      the Sequence Number field to 0.  Values of the HTS Shim Length and
      Telemetry Data Profile fields MAY be set according to the local
      policy;

   o  copy telemetry information into the Telemetry Data TLVs field and
      transmit the packet.

4.3.  Operation of the HTS Egress Node

   Upon receiving the trigger packet, the HTS egress node MUST:

   o  copy the transport information;

   o  start the HTS Collection timer for the obtained flow.

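   Returning to the follow-up handling at a transient node described in
   Section 4.2, the steps can be sketched in Python as follows.  This
   is an illustration under stated assumptions, not a definitive
   implementation: the FollowUp class, the process_follow_up function,
   and the use of a set of flow identifiers in place of per-flow
   Follow-up timers are all hypothetical, the mtu argument is taken as
   the byte budget for the shim plus TLVs only, and a packet that fails
   verification is assumed to be forwarded unchanged, which this
   document does not specify.

```python
from dataclasses import dataclass, field

SHIM_LEN = 4  # fixed part of the HTS shim, in bytes


@dataclass
class FollowUp:
    flow: str                 # stands in for the transport information
    seq: int = 0
    full: bool = False
    tlvs: list = field(default_factory=list)

    def size(self) -> int:
        return SHIM_LEN + sum(len(t) for t in self.tlvs)


def process_follow_up(pkt, pending_flows, local_tlv, mtu):
    """Return the packets to transmit, in order."""
    # Verify matching transport information and a cleared Full flag.
    if pkt.flow not in pending_flows or pkt.full:
        return [pkt]          # assumption: forward unchanged
    pending_flows.discard(pkt.flow)   # stands in for stopping the timer
    # Append local telemetry if the result still fits.
    if pkt.size() + len(local_tlv) <= mtu:
        pkt.tlvs.append(local_tlv)
        return [pkt]
    # Otherwise mark the received packet Full and originate a new one.
    pkt.full = True
    new_pkt = FollowUp(flow=pkt.flow, seq=pkt.seq + 1, tlvs=[local_tlv])
    return [pkt, new_pkt]
```

   In this sketch the received packet is always transmitted ahead of
   any newly originated one, so sequence numbers leave the node in
   increasing order.
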
   When the egress node receives the follow-up packet for a known flow,
   i.e., a flow for which the Collection timer is running, the node
   MUST:

   o  copy telemetry information;

   o  restart the corresponding Collection timer.

   When the Collection timer expires, the egress node relays the
   collected telemetry information for processing and analysis to a
   local or remote agent.

4.4.  Considerations for HTS Timers

   This specification defines two timers - HTS Follow-up and HTS
   Collection.  Because for a particular flow there MUST NOT be more
   than one HTS Trigger, the values of the HTS timers are bounded by
   the rate of trigger generation for that flow.

5.  IANA Considerations

   TBD

6.  Security Considerations

   Nodes that practice the HTS method are presumed to share a trust
   model that depends on the existence of a trusted relationship among
   nodes.  This is necessary as these nodes are expected to correctly
   modify the specific content of the data in the follow-up packet, and
   the degree to which HTS measurement is useful for network operation
   depends on this ability.  In practice, this means that neither
   confidentiality nor integrity protection can cover those portions of
   messages that contain the network state data.  Though there are
   methods that make it possible in theory to provide either or both
   such protections and still allow for intermediate nodes to make
   detectable yet authenticated modifications, such methods do not seem
   practical at present, particularly for protocols that are used to
   measure latency and/or jitter.

   The ability to potentially authenticate and/or encrypt the network
   state data for scenarios both with and without the participation of
   intermediate nodes that participate in HTS measurement is left for
   further study.

   While it is possible for a supposedly compromised node to intercept
   and modify the network state information in the follow-up packet,
   this is an issue that exists for nodes in general - for all data that
   is to be carried over the particular networking technology - and is
   therefore the basis for an additional presumed trust model associated
   with an existing network.

7.  Acknowledgments

   The authors express their gratitude and appreciation to Joel Halpern
   for the most helpful and insightful discussion on the applicability
   of HTS in a Service Function Chaining domain.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

8.2.  Informative References

   [I-D.ietf-ippm-ioam-data]
              Brockners, F., Bhandari, S., Pignataro, C., Gredler, H.,
              Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov,
              P., Chang, R., Bernier, D., and J. Lemon, "Data Fields
              for In-situ OAM", draft-ietf-ippm-ioam-data-03 (work in
              progress), June 2018.

   [P4.INT]   "In-band Network Telemetry (INT)", P4.org Specification,
              October 2017.

   [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
              Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
              May 2016.

   [RFC8169]  Mirsky, G., Ruffini, S., Gray, E., Drake, J., Bryant, S.,
              and A. Vainshtein, "Residence Time Measurement in MPLS
              Networks", RFC 8169, DOI 10.17487/RFC8169, May 2017.

Authors' Addresses

   Greg Mirsky
   ZTE Corp.

   Email: gregimirsky@gmail.com


   Wang Lingqiang
   ZTE Corporation
   No 19, East Huayuan Road
   Beijing  100191
   P.R.China

   Phone: +86 10 82963945
   Email: wang.lingqiang@zte.com.cn


   Guo Zhui
   ZTE Corporation
   No 19, East Huayuan Road
   Beijing  100191
   P.R.China

   Phone: +86 10 82963945
   Email: guo.zhui@zte.com.cn