IPPM Working Group                                             G. Mirsky
Internet-Draft                                                 ZTE Corp.
Intended status: Standards Track                            W. Lingqiang
Expires: April 10, 2020                                          G. Zhui
                                                         ZTE Corporation
                                                         October 8, 2019


             Hybrid Two-Step Performance Measurement Method
                  draft-mirsky-ippm-hybrid-two-step-04

Abstract

   Development of, and advancements in, automation of network operations
   have brought new requirements for measurement methodology.
   Among them is the ability to collect instant network state as the
   packet is being processed by the networking elements along its path
   through the domain. This document introduces a new hybrid
   measurement method, referred to as hybrid two-step, as it separates
   the act of measuring and/or calculating the performance metric from
   the act of collecting and transporting network state.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 10, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions used in this document
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Problem Overview
   4.  Theory of Operation
     4.1.  Operation of the HTS Ingress Node
     4.2.  Operation of the HTS Transient Node
     4.3.  Operation of the HTS Egress Node
     4.4.  Considerations for HTS Timers
     4.5.  Deploying HTS in a Multicast Network
   5.  IANA Considerations
   6.  Security Considerations
   7.  Acknowledgments
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   Successful resolution of challenges of automated network operation,
   as part of, for example, overall service orchestration or data center
   operation, relies on a timely collection of accurate information that
   reflects the state of network elements on an unprecedented scale.
   Because analyzing and acting upon the collected information requires
   considerable computing and storage resources, the network state
   information is unlikely to be processed by the network elements
   themselves but will be relayed into the data storage facilities,
   e.g., data lakes. The process of producing and collecting network
   state information, also referred to in this document as network
   telemetry, and transporting it for post-processing should work
   equally well with data flows and with test packets injected into the
   network.
   RFC 7799 [RFC7799] describes a combination of elements of passive and
   active measurement as a hybrid measurement.

   Several technical methods have been proposed to enable collection of
   network state information at the instant a packet is processed,
   among them [P4.INT] and [I-D.ietf-ippm-ioam-data].

   This document introduces Hybrid Two-Step (HTS) as a new hybrid
   measurement method that separates measuring or calculating the
   performance metric from collecting and transporting this
   information. The Hybrid Two-Step method extends the two-step mode of
   Residence Time Measurement (RTM) defined in [RFC8169] to on-path
   network state collection and transport.

2.  Conventions used in this document

2.1.  Terminology

   RTM   Residence Time Measurement

   ECMP  Equal Cost Multipath

   MTU   Maximum Transmission Unit

   HTS   Hybrid Two-Step

   Network telemetry - the process of collecting and reporting network
   state

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Problem Overview

   Performance measurements are meant to provide data that characterize
   conditions experienced by traffic flows in the network and possibly
   trigger operational changes (e.g., re-route of flows, or changes in
   resource allocations). Modifications to a network are determined
   based on the performance metric information available at the time
   that a change is to be made. The correctness of this determination
   is based on the quality of the collected metrics data.
   The quality of collected measurement data is defined by:

   o  the resolution and accuracy of each measurement;

   o  predictability of both the time at which each measurement is made
      and the timeliness of delivering the collected measurement data
      for use.

   Consider the case of delay measurement that relies on collecting the
   time of packet arrival at the ingress interface and the time of
   packet transmission at the egress interface. The method includes
   recording a local clock value on receiving the first octet of an
   affected message at the device ingress, and again recording the clock
   value on transmitting the first octet of the same message at the
   device egress. In this ideal case, the difference between the two
   recorded clock times corresponds to the time that the message spent
   in traversing the device. In practice, the time that has been
   recorded can differ from the ideal case by any fixed amount, and a
   correction can be applied to compute the same time difference taking
   into account the known fixed time associated with the actual
   measurement. In this way, the resulting time difference reflects any
   variable delay associated with queuing.

   Depending on the implementation, it may be a challenge to compute the
   difference between message arrival and departure times and - on the
   fly - add the necessary residence time information to the same
   message. That task may become even more challenging if the packet is
   encrypted. Implementations SHOULD NOT record in the same message a
   departure time that may be significantly inaccurate as the result of
   estimating a departure time that includes a variable time component
   (such as that associated with buffering and queuing of the message).
   A similar problem may cause a lower quality of, for example,
   information that characterizes utilization of the egress interface.
   If the data cannot be obtained consistently, without variable delays
   for additional processing, the information may not accurately reflect
   the state at the egress interface. To mitigate this problem,
   [RFC8169] defined an RTM two-step mode.

   Another challenge associated with methods that collect network state
   information into the actual data packet is the risk of exceeding the
   Maximum Transmission Unit (MTU) size, especially if the packet
   traverses overlay domains or VPNs. Since fragmentation is not
   available in the transport network, operators may have to reduce the
   MTU size advertised to the client layer or risk missing network state
   data for part, most probably the latter part, of the path.

4.  Theory of Operation

   The HTS method consists of two phases:

   o  performing a measurement or obtaining network state information,
      of one or more types, on a node;

   o  collecting and transporting the measurement.

   HTS uses an HTS Trigger carried in a data packet or a specially
   constructed test packet. The nature of the HTS Trigger is transport
   network layer specific, and its description is outside the scope of
   this document. The packet that includes the HTS Trigger is also
   referred to in this document as the trigger packet.

   The HTS method uses the HTS Follow-up packet, also referred to in
   this document as the follow-up packet, to collect measurement and
   network state data from the nodes. The node that creates the HTS
   Trigger also generates the HTS Follow-up packet. The follow-up
   packet contains characteristic information, copied from the trigger
   packet, sufficient for participating HTS nodes to associate it with
   the original packet. The exact composition of the characteristic
   information is specific to each transport network, and its
   definition is outside the scope of this document. The follow-up
   packet also uses the same encapsulation as the data packet.
   If only network information, and not the payload, is used to
   load-balance flows over equal cost multipath (ECMP), then using
   network encapsulation identical to that of the trigger packet should
   ensure that the follow-up packet remains in-band with the original
   data packet that carried the HTS Trigger, i.e., traverses the same
   set of network elements. There MUST be at most one outstanding
   follow-up packet on the node for a given path. That means that if
   the node receives an HTS Trigger for a flow on which it is still
   waiting for the follow-up packet to the previous HTS Trigger, the
   node will originate a follow-up packet to transport the former set of
   network state data and transmit it before it sends the follow-up
   packet with the latest collection of network state information.

4.1.  Operation of the HTS Ingress Node

   A node that originates the HTS Trigger is referred to as the HTS
   ingress node. As stated, the ingress node originates the follow-up
   packet. The follow-up packet has transport network encapsulation
   identical to that of the trigger packet, followed by the HTS shim and
   one or more telemetry information elements encoded as
   Type-Length-Value (TLV). Figure 1 displays an example of the
   follow-up packet format.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Transport Network                       ~
   |                         Encapsulation                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|HTS Shim Len|    Flags     |        Sequence Number        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Telemetry Data Profile                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                      Telemetry Data TLVs                      ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 1: Follow-up Packet Format

   Fields of the HTS shim are as follows:

      Version (Ver) is a two-bit field. It specifies the version of the
      HTS shim format. This document defines the format for the 0b00
      value of the field.

      HTS Shim Length is a six-bit field. It defines the length of the
      HTS shim in bytes. The minimum value of the field is four bytes.

      Flags is an eight-bit field. The format of the Flags field is
      displayed in Figure 2.

         Full (F) flag MUST be set to zero by the node originating the
         HTS follow-up packet and MUST be set to one by the node that
         does not add its telemetry data to avoid exceeding MTU size.

         The node originating the follow-up packet MUST zero the
         Reserved field; the field MUST be ignored on receipt.

      Sequence Number is a 16-bit field. The value of the field
      reflects the number of the HTS follow-up packet in the sequence of
      HTS follow-up packets originated in response to the same HTS
      trigger. The ingress node MUST set the value of the field to
      zero.

      Telemetry Data Profile is an optional variable-length field of
      one-bit flags. Each flag indicates a requested type of telemetry
      data to be collected at each HTS node. The length of the field is
      a multiple of four bytes, with a minimum length of zero.
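
   The shim layout described above can be illustrated with a short
   encoding and decoding sketch. It is not part of this specification.
   The function names are hypothetical, and the sketch assumes that HTS
   Shim Length covers the four fixed bytes plus the Telemetry Data
   Profile (consistent with the stated minimum of four bytes), that
   fields are in network byte order, and that the Full flag occupies
   the most significant bit of the Flags octet, as drawn in Figure 2.

```python
import struct

FLAG_FULL = 0x80      # assumed: F is bit 0, the most significant Flags bit
FIXED_SHIM_LEN = 4    # Ver + HTS Shim Length, Flags, Sequence Number


def pack_hts_shim(sequence, profile=b"", full=False, version=0):
    """Encode an HTS shim: four fixed bytes plus an optional profile."""
    if len(profile) % 4 != 0:
        raise ValueError("profile length must be a multiple of 4 bytes")
    shim_len = FIXED_SHIM_LEN + len(profile)
    if shim_len > 0x3F:  # the length field is six bits wide
        raise ValueError("shim does not fit the 6-bit length field")
    if not 0 <= version <= 3:
        raise ValueError("version is a 2-bit field")
    if not 0 <= sequence <= 0xFFFF:
        raise ValueError("sequence number is a 16-bit field")
    first_octet = (version << 6) | shim_len
    flags = FLAG_FULL if full else 0  # Reserved bits are left at zero
    return struct.pack("!BBH", first_octet, flags, sequence) + profile


def unpack_hts_shim(data):
    """Decode an HTS shim; bytes after the shim are returned as TLVs."""
    if len(data) < FIXED_SHIM_LEN:
        raise ValueError("buffer shorter than the minimal HTS shim")
    first_octet, flags, sequence = struct.unpack_from("!BBH", data)
    shim_len = first_octet & 0x3F
    if shim_len < FIXED_SHIM_LEN or shim_len % 4 or shim_len > len(data):
        raise ValueError("invalid HTS Shim Length")
    return {
        "version": first_octet >> 6,
        "shim_len": shim_len,
        "full": bool(flags & FLAG_FULL),  # Reserved bits are ignored
        "sequence": sequence,
        "profile": data[FIXED_SHIM_LEN:shim_len],
        "tlvs": data[shim_len:],
    }
```

   An ingress node in this sketch would call pack_hts_shim(0), since the
   ingress node sets Sequence Number to zero and clears the Full flag.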

    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |F|  Reserved   |
   +-+-+-+-+-+-+-+-+

                      Figure 2: Flags Field Format

4.2.  Operation of the HTS Transient Node

   Upon receiving the trigger packet, the HTS transient node MUST:

   o  copy the transport information;

   o  start the HTS Follow-up timer for the obtained flow.

   Upon receiving the follow-up packet, the HTS transient node MUST:

   o  verify that the matching transport information exists and the Full
      flag is cleared, then stop the associated HTS Follow-up timer;

   o  collect telemetry data requested in the Telemetry Data Profile
      field or defined by the local HTS policy;

   o  if adding the collected telemetry would not exceed the MTU, then
      append the data to the Telemetry Data TLVs field and transmit the
      follow-up packet;

   o  otherwise, set the value of the Full flag to one and transmit the
      received follow-up packet;

   o  originate a new follow-up packet using the same transport
      information. The value of the Sequence Number field in the HTS
      shim MUST be set to the value of the field in the received
      follow-up packet incremented by one. Copy the collected telemetry
      data and transmit the packet.

   If the follow-up timer expires, the transient node MUST:

   o  originate the follow-up packet using transport information
      associated with the expired timer;

   o  initialize the HTS shim by setting the Version field to 0b00 and
      the Sequence Number field to 0. Values of the HTS Shim Length and
      Telemetry Data Profile fields MAY be set according to the local
      policy;

   o  copy telemetry information into the Telemetry Data TLVs field and
      transmit the packet.

4.3.  Operation of the HTS Egress Node

   Upon receiving the trigger packet, the HTS egress node MUST:

   o  copy the transport information;

   o  start the HTS Collection timer for the obtained flow.
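
   The egress node behavior in this section (trigger handling, follow-up
   handling, and Collection timer expiry) can be pictured as a small
   per-flow state machine. The sketch below is illustrative only and is
   not part of this specification. The class and method names, the
   flow identifier, the explicit "now" clock argument, and the handling
   of follow-up packets for unknown flows and of repeated triggers are
   all assumptions made for the example.

```python
class HtsEgressNode:
    """Hypothetical model of HTS egress processing for several flows."""

    def __init__(self, collection_timeout, relay):
        self.collection_timeout = collection_timeout
        self.relay = relay  # callable(flow_id, telemetry) for the agent
        self.flows = {}     # flow_id -> {"deadline": t, "telemetry": [...]}

    def on_trigger(self, flow_id, now):
        """Copy the transport information and start the Collection timer."""
        # Assumption: a repeated trigger keeps telemetry already collected.
        state = self.flows.setdefault(flow_id, {"telemetry": []})
        state["deadline"] = now + self.collection_timeout

    def on_follow_up(self, flow_id, telemetry_tlvs, now):
        """Copy telemetry and restart the timer for a known flow."""
        state = self.flows.get(flow_id)
        if state is None:
            # Assumption: follow-up packets for unknown flows are ignored.
            return False
        state["telemetry"].extend(telemetry_tlvs)
        state["deadline"] = now + self.collection_timeout
        return True

    def on_tick(self, now):
        """Relay collected telemetry for every expired Collection timer."""
        expired = [f for f, s in self.flows.items() if s["deadline"] <= now]
        for flow_id in expired:
            state = self.flows.pop(flow_id)
            self.relay(flow_id, state["telemetry"])


# Usage: collect one TLV for a flow, then let the Collection timer expire.
collected = []
node = HtsEgressNode(2.0, lambda flow, data: collected.append((flow, data)))
node.on_trigger("flow-a", now=0.0)
node.on_follow_up("flow-a", [b"tlv-1"], now=1.0)
node.on_tick(now=3.0)
```

   Passing the current time explicitly keeps the sketch deterministic; a
   real implementation would more likely use platform timers.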

   When the egress node receives the follow-up packet for a known flow,
   i.e., a flow for which the Collection timer is running, the node
   MUST:

   o  copy telemetry information;

   o  restart the corresponding Collection timer.

   When the Collection timer expires, the egress node relays the
   collected telemetry information for processing and analysis to a
   local or remote agent.

4.4.  Considerations for HTS Timers

   This specification defines two timers - HTS Follow-up and HTS
   Collection. Because for a particular flow there MUST be no more than
   one HTS Trigger, the values of the HTS timers are bounded by the rate
   of trigger generation for that flow.

4.5.  Deploying HTS in a Multicast Network

   Previous sections discussed the operation of HTS in a unicast
   network. Multicast services are important, and the ability to
   collect telemetry information is an invaluable component in
   delivering a high quality of experience. While the replication of
   data packets is necessary, replication of HTS follow-up packets is
   not. Replication of multicast data packets down a multicast tree may
   be set based on multicast routing information or explicit information
   included in a special header, as, for example, in Bit-Indexed
   Explicit Replication [RFC8296]. A replicating node processes the HTS
   packet as defined below:

   o  the first transmitted multicast packet MUST be followed by the
      received corresponding HTS packet as described in Section 4.2;

   o  each consecutively transmitted copy of the original multicast
      packet MUST be followed by a new HTS packet originated by the
      replicating node, which acts as a transient HTS node whose
      Follow-up timer has expired.

   As a result, there are no duplicate copies of Telemetry Data TLV for
   the same pair of ingress and egress interfaces.
   At the same time, all ingress/egress pairs traversed by the given
   multicast packet are reflected in their respective Telemetry Data
   TLV. Consequently, a centralized controller would be able to
   reconstruct and analyze the state of the particular multicast
   distribution tree based on HTS packets collected from egress nodes.

5.  IANA Considerations

   TBD

6.  Security Considerations

   Nodes that practice the HTS method are presumed to share a trust
   model that depends on the existence of a trusted relationship among
   nodes. This is necessary as these nodes are expected to correctly
   modify the specific content of the data in the follow-up packet, and
   the degree to which HTS measurement is useful for network operation
   depends on this ability. In practice, this means either
   confidentiality or integrity protection cannot cover those portions
   of messages that contain the network state data. Though there are
   methods that make it possible in theory to provide either or both
   such protections and still allow for intermediate nodes to make
   detectable yet authenticated modifications, such methods do not seem
   practical at present, particularly for protocols that are used to
   measure latency and/or jitter.

   The ability to potentially authenticate and/or encrypt the network
   state data for scenarios both with and without the participation of
   intermediate nodes that participate in HTS measurement is left for
   further study.

   While it is possible for a supposedly compromised node to intercept
   and modify the network state information in the follow-up packet,
   this is an issue that exists for nodes in general - for all data that
   is to be carried over the particular networking technology - and is
   therefore the basis for an additional presumed trust model associated
   with an existing network.

7.  Acknowledgments

   Authors express their gratitude and appreciation to Joel Halpern for
   the most helpful and insightful discussion on the applicability of
   HTS in a Service Function Chaining domain.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

8.2.  Informative References

   [I-D.ietf-ippm-ioam-data]
              Brockners, F., Bhandari, S., Pignataro, C., Gredler, H.,
              Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov,
              P., Chang, R., Bernier, D., and J. Lemon, "Data Fields
              for In-situ OAM", draft-ietf-ippm-ioam-data-07 (work in
              progress), September 2019.

   [P4.INT]   "In-band Network Telemetry (INT)", P4.org Specification,
              October 2017.

   [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
              Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
              May 2016.

   [RFC8169]  Mirsky, G., Ruffini, S., Gray, E., Drake, J., Bryant, S.,
              and A. Vainshtein, "Residence Time Measurement in MPLS
              Networks", RFC 8169, DOI 10.17487/RFC8169, May 2017.

   [RFC8296]  Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A.,
              Tantsura, J., Aldrin, S., and I. Meilik, "Encapsulation
              for Bit Index Explicit Replication (BIER) in MPLS and
              Non-MPLS Networks", RFC 8296, DOI 10.17487/RFC8296,
              January 2018.

Authors' Addresses

   Greg Mirsky
   ZTE Corp.

   Email: gregimirsky@gmail.com


   Wang Lingqiang
   ZTE Corporation
   No 19, East Huayuan Road
   Beijing  100191
   P.R.China

   Phone: +86 10 82963945
   Email: wang.lingqiang@zte.com.cn


   Guo Zhui
   ZTE Corporation
   No 19, East Huayuan Road
   Beijing  100191
   P.R.China

   Phone: +86 10 82963945
   Email: guo.zhui@zte.com.cn