| < draft-mirsky-ippm-hybrid-two-step-00.txt | draft-mirsky-ippm-hybrid-two-step-01.txt > | |||
|---|---|---|---|---|
| IPPM Working Group G. Mirsky | IPPM Working Group G. Mirsky | |||
| Internet-Draft ZTE Corp. | Internet-Draft ZTE Corp. | |||
| Intended status: Informational W. Lingqiang | Intended status: Informational W. Lingqiang | |||
| Expires: August 31, 2018 G. Zhui | Expires: January 3, 2019 G. Zhui | |||
| ZTE Corporation | ZTE Corporation | |||
| February 27, 2018 | July 2, 2018 | |||
| Hybrid Two-Step Performance Measurement Method | Hybrid Two-Step Performance Measurement Method | |||
| draft-mirsky-ippm-hybrid-two-step-00 | draft-mirsky-ippm-hybrid-two-step-01 | |||
| Abstract | Abstract | |||
| Development of and advancements in automation of network operations | Development of, and advancements in, automation of network operations | |||
| brought new requirements toward measurement methodology. Among them | brought new requirements for measurement methodology. Among them is | |||
| is ability to collect the instant telemetry as the packet being | the ability to collect instant network state as the packet is being | |||
| processed by the networking elements along its path through the | processed by the networking elements along its path through the | |||
| domain. This document introduces new hybrid measurement method, | domain. This document introduces a new hybrid measurement method, | |||
| referred to as hybrid two-step, as it separates act of measuring and/ | referred to as hybrid two-step, as it separates the act of measuring | |||
| or calculating performance metric from the act of collecting and | and/or calculating performance metric from the act of collecting and | |||
| transporting telemetry. | transporting network state. | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on August 31, 2018. | This Internet-Draft will expire on January 3, 2019. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 18 ¶ | skipping to change at page 2, line 18 ¶ | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
| 2. Conventions used in this document . . . . . . . . . . . . . . 3 | 2. Conventions used in this document . . . . . . . . . . . . . . 3 | |||
| 2.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3 | 2.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 2.2. Requirements Language . . . . . . . . . . . . . . . . . . 3 | 2.2. Requirements Language . . . . . . . . . . . . . . . . . . 3 | |||
| 3. Problem Overview . . . . . . . . . . . . . . . . . . . . . . 3 | 3. Problem Overview . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 4. Theory of Operation . . . . . . . . . . . . . . . . . . . . . 4 | 4. Theory of Operation . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 4 | 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 6. Security Considerations . . . . . . . . . . . . . . . . . . . 5 | 6. Security Considerations . . . . . . . . . . . . . . . . . . . 5 | |||
| 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 5 | 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 8.1. Normative References . . . . . . . . . . . . . . . . . . 5 | 8.1. Normative References . . . . . . . . . . . . . . . . . . 6 | |||
| 8.2. Informative References . . . . . . . . . . . . . . . . . 6 | 8.2. Informative References . . . . . . . . . . . . . . . . . 6 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 6 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 1. Introduction | 1. Introduction | |||
| Successful resolution of challenges of automated network operation, | Successful resolution of challenges of automated network operation, | |||
| as part of overall life-cycle service orchestration, relies on | as part of, for example, overall service orchestration or data center | |||
| collection of accurate and timely information that reflects the state | operation, relies on a timely collection of accurate information that | |||
| of network elements on unprecedented massive, even grandiose scale. | reflects the state of network elements on an unprecedented scale. | |||
| Because analysis and action upon the it requires considerable | Because performing the analysis and acting upon the collected | |||
| computing and storage resources, the network state information, also | information requires considerable computing and storage resources, | |||
| referred to as telemetry, is unlikely to be processed by network | the network state information is unlikely to be processed by network | |||
| elements themselves but will be relayed into data lakes. The process | elements themselves but will be relayed into the data storage | |||
| of producing telemetry information, collecting and transporting it | facilities, e.g., data lakes. The process of producing and collecting | |||
| for post-processing should equally work with data flows and specially | network state information, also referred to in this document as network | |||
| inserted in the network test packets. Per [RFC7799] classification | telemetry, and transporting it for post-processing should work | |||
| such process classified as hybrid measurement method. | equally well with data flows or test packets injected in the network. | |||
| RFC 7799 [RFC7799] describes a combination of elements of passive and | ||||
| active measurement as a hybrid measurement. | ||||
| Several technical methods were proposed to enable collection of | Several technical methods have been proposed to enable collection of | |||
| telemetry information instantaneous to the packet processing. Among | network state information instantaneous to the packet processing, | |||
| them [P4.INT] and [I-D.ietf-ippm-ioam-data]. | among them [P4.INT] and [I-D.ietf-ippm-ioam-data]. | |||
| This document introduces new hybrid measurement method, referred to | This document introduces Hybrid Two-Step (HTS) as a new hybrid | |||
| as Hybrid Two-step (HTS), that it separates measuring and/or | measurement method that separates measuring or calculating | |||
| calculating performance metric from the collecting and transporting | performance metric from the collecting and transporting this | |||
| telemetry. The hybrid two-step method extends two-step mode of | information. The Hybrid Two-Step method extends the two-step mode of | |||
| Residence Time Measurement (RTM) defined in [RFC8169] to on-path | Residence Time Measurement (RTM) defined in [RFC8169] to on-path | |||
| telemetry collection and transport. | network state collection and transport. | |||
| 2. Conventions used in this document | 2. Conventions used in this document | |||
| 2.1. Terminology | 2.1. Terminology | |||
| RTM Residence Time Measurement | RTM Residence Time Measurement | |||
| ECMP Equal Cost Multipath | ECMP Equal Cost Multipath | |||
| MTU Maximum Transmission Unit | MTU Maximum Transmission Unit | |||
| HTS Hybrid Two-step | HTS Hybrid Two-Step | |||
| Network telemetry - the process of collecting and reporting of | ||||
| network state | ||||
| 2.2. Requirements Language | 2.2. Requirements Language | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
| "OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
| 14 [RFC2119] [RFC8174] when, and only when, they appear in all | 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
| capitals, as shown here. | capitals, as shown here. | |||
| 3. Problem Overview | 3. Problem Overview | |||
| Performance measurements are meant to provide data that characterize | Performance measurements are meant to provide data that characterize | |||
| conditions experienced by data in the network and possibly trigger | conditions experienced by traffic flows in the network and possibly | |||
| operations to re-route flows, allocate additional or free excess of | trigger operational changes (e.g., re-routing of flows or changes in | |||
| resources. All changes to the network depend on the quality of | resource allocations). Changes to a network are determined based on | |||
| collected data and calculated based on its performance metrics. The | the performance metric information available at the time that a | |||
| quality of measurements defined not only by resolution but by how | change is to be made. The correctness of this determination is based | |||
| consistent are performed measurements, how predictable is the moment | on the quality of the collected metrics data. The quality of | |||
| of measurement making, of obtaining the data. Consider case of delay | collected measurement data is defined by: | |||
| measurement that relies on collection of time of packet arrival at | ||||
| the ingress interface and time of packet transmission at egress | ||||
| interface. The ideal method may read wall clock value as the very | ||||
| first octet of the packet being received at ingress, and another | ||||
| value, as the first octet being transmitted. That way all nodal | ||||
| processing delays be accounted for as this method excludes packet | ||||
| queuing. But if the measurement method requires the original packet | ||||
| to carry either both time values of the calculated delay value, then | ||||
| the packet must be modified on-the-fly, while being transmitted. And | ||||
| that task may become even more challenging if the packet is | ||||
| encrypted. As result, at egress time may be obtained before the | ||||
| packet transmission begins, thus leaving variable delays unmeasured. | ||||
| Similar problem may cause lower quality of, for example, information | ||||
| that characterizes utilization of the egress interface. If unable to | ||||
| obtain the data consistently, without variable delays for additional | ||||
| processing, information may not accurately reflect the state at the | ||||
| egress interface. To mitigate this problem [RFC8169] defined RTM | ||||
| two-step mode. | ||||
| Another challenge facing methods that collect telemetry into the | o the resolution and accuracy of each measurement; | |||
| actual data packet is risk of exceeding size of Maximum Transmission | ||||
| Unit (MTU), particularly if the packet traverses overlay domains or | o predictability of both the time at which each measurement is made | |||
| VPNs. Since the fragmentation is not available at the transport | and the timeliness of measurement collection data delivery for | |||
| network, operators may have to reduce MTU size advertised to client | use. | |||
| layer or risk missing telemetry data for the part, most probably the | ||||
| latter part, of the path. | Consider the case of delay measurement that relies on collecting time | |||
| of packet arrival at the ingress interface and time of the packet | ||||
| transmission at egress interface. The method may be to record a | ||||
| local clock value on receiving the first octet of an affected message | ||||
| at the device ingress, and again to record the clock value on sending | ||||
| the first byte of the same message at the device egress. In this | ||||
| ideal case, the difference between the two recorded clock times | ||||
| corresponds to the time that the message spent in traversing the | ||||
| device. In practice, the times actually recorded can differ from the | ||||
| ideal case by any fixed amount and a correction may then be applied | ||||
| to compute the same time difference taking into account the known | ||||
| fixed time associated with the actual measurement. In this way, the | ||||
| resulting time difference reflects any variable delay associated with | ||||
| queuing. | ||||
| Depending on the implementation, it may be a challenge to compute the | ||||
| difference between message arrival and departure times and - on the | ||||
| fly - add the necessary residence time information to the same | ||||
| message. And that task may become even more challenging if the | ||||
| packet is encrypted. Implementations SHOULD NOT record a message | ||||
| departure time that may be significantly inaccurate in an effort to | ||||
| include a correlated/computed delay value, in the same message, as a | ||||
| result of estimating the departure time while including any variable | ||||
| time component (such as that associated with buffering and queuing of | ||||
| messages). A similar problem may cause a lower quality of, for | ||||
| example, information that characterizes utilization of the egress | ||||
| interface. If unable to obtain the data consistently, without | ||||
| variable delays for additional processing, information may not | ||||
| accurately reflect the state at the egress interface. To mitigate | ||||
| this problem [RFC8169] defined RTM two-step mode. | ||||
| Another challenge associated with methods that collect network state | ||||
| information into the actual data packet is the risk to exceed the | ||||
| Maximum Transmission Unit (MTU) size, particularly if the packet | ||||
| traverses overlay domains or VPNs. Since the fragmentation is not | ||||
| available at the transport network, operators may have to reduce MTU | ||||
| size advertised to client layer or risk missing network state data | ||||
| for the part, most probably the latter part, of the path. | ||||
| 4. Theory of Operation | 4. Theory of Operation | |||
| HTS method consists of the two phases: | The HTS method consists of the two phases: | |||
| o performing a measurement or obtaining telemetry information, one | o performing a measurement or obtaining network state information, | |||
| or more than one type, on a node; | one or more than one type, on a node; | |||
| o collecting and transporting the measurement. | o collecting and transporting the measurement. | |||
| HTS uses HTS Control message to define types of measurement or | HTS uses HTS Control message to define types of measurement or | |||
| telemetry data collection requested from a node. HTS Control message | network state data collection requested from a node. HTS Control | |||
| may be inserted into the data packet, as meta-data or shim, or be | message may be inserted into the data packet, as meta-data or shim, | |||
| transmitted in the specially constructed test packet. | or be transmitted in a specially constructed test packet. | |||
| To collect measurement and telemetry data from the nodes HTS method | To collect measurement and network state data from the nodes HTS | |||
| uses the follow-up packet. The node that creates the HTS Control | method uses the follow-up packet. The node that creates the HTS | |||
| message also originates the HTS follow-up packet. The follow-up | Control message also originates the HTS follow-up packet. The | |||
| packet contains characteristic information, copied from the data | follow-up packet contains characteristic information, copied from the | |||
| packet, sufficient for participating nodes to associate it with the | data packet, sufficient for participating nodes to associate it with | |||
| original packet. Exact composition of the characteristic information | the original packet. The exact composition of the characteristic | |||
| is specific for each transport network and its definition is outside | information is specific for each transport network and its definition | |||
| the scope of this document. The follow-up packet also uses the same | is outside the scope of this document. The follow-up packet also | |||
| encapsulation as the data packet. If not payload but only network | uses the same encapsulation as the data packet. If not payload but | |||
| information used to load-balance flows in equal cost multipath | only network information used to load-balance flows in equal cost | |||
| (ECMP), use of the network encapsulation identical to the data packet | multipath (ECMP), use of the network encapsulation identical to the | |||
| should guarantee that the follow-up packet remains in-band, i.e. | data packet should guarantee that the follow-up packet remains in- | |||
| traverses the same set of network elements, with the original data | band, i.e. traverses the same set of network elements, with the | |||
| packet. Only one outstanding follow-up packet may be on the node for | original data packet. Only one outstanding follow-up packet may be | |||
| the given path. That means that if the node receives HTS Control | on the node for the given path. That means that if the node receives | |||
| message for the flow on which it still waits for the follow-up packet | HTS Control message for the flow on which it still waits for the | |||
| to the previous HTS Control message, the node will originate the | follow-up packet to the previous HTS Control message, the node will | |||
| follow-up packet to transport the former set of the telemetry data | originate the follow-up packet to transport the former set of the | |||
| and transmit it before it transmits the follow-up packet with the | network state data and transmit it before it transmits the follow-up | |||
| latest set of telemetry information. | packet with the latest set of network state information. | |||
| 5. IANA Considerations | 5. IANA Considerations | |||
| This document doesn't have any IANA requirements. The section may be | This document doesn't have any IANA requirements. The section may be | |||
| deleted before the publication. | deleted before the publication. | |||
| 6. Security Considerations | 6. Security Considerations | |||
| Nodes that practice HTS method are presumed to share a trust model | Nodes that practice HTS method are presumed to share a trust model | |||
| that depends on the existence of a trusted relationship among them. | that depends on the existence of a trusted relationship among nodes. | |||
| This is necessary as these nodes are expected to correctly modify | This is necessary as these nodes are expected to correctly modify the | |||
| specific content of the data in the follow-up packet, and degree to | specific content of the data in the follow-up packet, and the degree | |||
| which HTS measurement is useful for network operation depends on this | to which HTS measurement is useful for network operation depends on | |||
| ability. In practice, this means that those portions of messages | this ability. In practice, this means that those portions of | |||
| that contain the telemetry data cannot be covered by either | messages that contain the network state data cannot be covered by | |||
| confidentiality or integrity protection. Though there are methods | either confidentiality or integrity protection. Though there are | |||
| that make it possible in theory to provide either or both such | methods that make it possible in theory to provide either or both | |||
| protections and still allow for intermediate nodes to make detectable | such protections and still allow for intermediate nodes to make | |||
| but authenticated modifications, such methods do not seem practical | detectable but authenticated modifications, such methods do not seem | |||
| at present, particularly for protocols that used to measure latency | practical at present, particularly for protocols that are used to measure | |||
| and/or jitter. | latency and/or jitter. | |||
| The ability to potentially authenticate and/or encrypt the telemetry | The ability to potentially authenticate and/or encrypt the network | |||
| data for scenarios both with and without participation of | state data for scenarios both with and without the participation of | |||
| intermediate nodes that participate in HTS measurement is left for | intermediate nodes that participate in HTS measurement is left for | |||
| further study. | further study. | |||
| While it is possible for a supposed compromised node to intercept and | While it is possible for a supposed compromised node to intercept and | |||
| modify the telemetry information in the follow-up packet, this is an | modify the network state information in the follow-up packet, this is | |||
| issue that exists for nodes in general - for any and all data that | an issue that exists for nodes in general - for any and all data that | |||
| may be carried over the particular networking technology - and is | may be carried over the particular networking technology - and is | |||
| therefore the basis for an additional presumed trust model associated | therefore the basis for an additional presumed trust model associated | |||
| with existing network. | with an existing network. | |||
| 7. Acknowledgements | 7. Acknowledgments | |||
| TBD | TBD | |||
| 8. References | 8. References | |||
| 8.1. Normative References | 8.1. Normative References | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
| skipping to change at page 6, line 10 ¶ | skipping to change at page 6, line 34 ¶ | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| 8.2. Informative References | 8.2. Informative References | |||
| [I-D.ietf-ippm-ioam-data] | [I-D.ietf-ippm-ioam-data] | |||
| Brockners, F., Bhandari, S., Pignataro, C., Gredler, H., | Brockners, F., Bhandari, S., Pignataro, C., Gredler, H., | |||
| Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, | Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov, | |||
| P., Chang, R., and D. Bernier, "Data Fields | P., Chang, R., Bernier, D., and J. Lemon, | |||
| for In-situ OAM", draft-ietf-ippm-ioam-data-01 (work in | "Data Fields for In-situ OAM", draft-ietf-ippm-ioam- | |||
| progress), October 2017. | data-03 (work in progress), June 2018. | |||
| [P4.INT] "In-band Network Telemetry (INT)", P4.org Specification, | [P4.INT] "In-band Network Telemetry (INT)", P4.org Specification, | |||
| October 2017. | October 2017. | |||
| [RFC7799] Morton, A., "Active and Passive Metrics and Methods (with | [RFC7799] Morton, A., "Active and Passive Metrics and Methods (with | |||
| Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799, | Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799, | |||
| May 2016, <https://www.rfc-editor.org/info/rfc7799>. | May 2016, <https://www.rfc-editor.org/info/rfc7799>. | |||
| [RFC8169] Mirsky, G., Ruffini, S., Gray, E., Drake, J., Bryant, S., | [RFC8169] Mirsky, G., Ruffini, S., Gray, E., Drake, J., Bryant, S., | |||
| and A. Vainshtein, "Residence Time Measurement in MPLS | and A. Vainshtein, "Residence Time Measurement in MPLS | |||
| End of changes. 26 change blocks. | ||||
| 115 lines changed or deleted | 139 lines changed or added | |||
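
The delay-measurement case added to Section 3 of the -01 text (record a local clock value at ingress and at egress, then apply a correction for any known fixed offset) can be illustrated with a minimal sketch. The function name, the nanosecond unit, and the single `fixed_correction_ns` parameter are assumptions made for illustration; neither draft revision defines them.

```python
def residence_time_ns(ingress_ns: int, egress_ns: int,
                      fixed_correction_ns: int = 0) -> int:
    """Return the time a message spent traversing a device.

    ingress_ns: local clock value recorded for the first octet received.
    egress_ns:  local clock value recorded for the first octet sent.
    fixed_correction_ns: known, fixed offset between the points where the
        clock was actually read and the ideal measurement points
        (hypothetical parameter; may be negative).
    """
    if egress_ns < ingress_ns:
        raise ValueError("egress timestamp precedes ingress timestamp")
    # Variable delay (queuing, buffering) stays in the raw difference;
    # only the known fixed component is corrected.
    return (egress_ns - ingress_ns) + fixed_correction_ns


if __name__ == "__main__":
    # Clock read 1000 ns at ingress and 1750 ns at egress, with a known
    # 50 ns fixed offset to add.
    print(residence_time_ns(1000, 1750, 50))
```

Because the value is computed after the message has left, it cannot be written into that same message without estimating the departure time, which is the motivation the text gives for carrying it in a separate follow-up packet.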
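
The rule in Section 4 that only one follow-up packet may be outstanding on a node for a given path can be sketched as a small state machine. This is an illustration under stated assumptions, not the draft's specification: the class `HtsNode`, its method names, the `flow_id` key, and the list-of-records payload are all hypothetical, and the draft leaves the actual packet formats out of scope.

```python
class HtsNode:
    """Toy model of a node taking part in Hybrid Two-Step collection."""

    def __init__(self, name):
        self.name = name
        self.pending = {}  # flow_id -> data collected for the last HTS Control message
        self.sent = []     # follow-up packets transmitted, in order

    def on_control(self, flow_id, data):
        """Handle an HTS Control message: collect data and wait for the follow-up."""
        if flow_id in self.pending:
            # Still waiting for the previous follow-up: originate one locally
            # to carry the former data set before storing the latest one.
            former = self.pending.pop(flow_id)
            self.sent.append((flow_id, [(self.name, former)]))
        self.pending[flow_id] = data

    def on_follow_up(self, flow_id, records):
        """Handle a follow-up packet: append local data and forward it."""
        out = list(records)
        if flow_id in self.pending:
            out.append((self.name, self.pending.pop(flow_id)))
        self.sent.append((flow_id, out))
        return out


if __name__ == "__main__":
    node = HtsNode("n1")
    node.on_control("flow-a", {"queue_depth": 1})
    node.on_control("flow-a", {"queue_depth": 2})  # previous follow-up never arrived
    node.on_follow_up("flow-a", [("n0", {"queue_depth": 0})])
    for packet in node.sent:
        print(packet)
```

In this sketch the locally originated follow-up carrying the former data set is always transmitted before the follow-up that carries the latest set, matching the ordering the text describes.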