Network Working Group                             Srivathsa. Sarangapani
Internet-Draft                                             Peyush. Gupta
Intended status: Standards Track                        Juniper Networks
Expires: January 18, 2017                                       V. Hegde
                                                              Consultant
                                                                   Q. Wu
                                                                  Huawei
                                                           July 17, 2016

             KPI Metrics for Service Monitoring using TWAMP
         draft-spv-ippm-monitor-implementation-services-kpi-02

Abstract

   This document describes a new method for calculating services KPIs
   and metrics in the network using the TWAMP protocol.
   This draft outlines the implementation of the service KPIs and their
   use cases in the service plane of the network.  The KPIs discussed
   in this draft include Service Latency and Application Liveliness
   detection.

   Service latency is defined as the time that elapses between the
   injection of a packet into the service module or service card and
   the reception of the serviced packet by the TWAMP server.  The TWAMP
   server records a timestamp when it injects the packet into the
   service module and records a second timestamp when it receives the
   packet after the service has been applied in the data plane.

   Application Liveliness detection determines whether an application
   is up and running in the network.  For example, this method can be
   used to monitor an HTTP application or a DNS server and verify that
   it is up and running.  The implementation can be used for liveliness
   detection of any service in the network.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 18, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions used in this document
       1.1.1.  Requirements Language
     1.2.  Terminology
   2.  Services KPIs
     2.1.  Services Keepalive Monitoring
     2.2.  Service Latency
   3.  Acknowledgements
   4.  IANA Considerations
   5.  Security Considerations
   6.  Normative References
   Authors' Addresses

1.  Introduction

   TWAMP-Test runs between a Session-Sender and a Session-Reflector
   [RFC5357].  The existing TWAMP-Test packet format contains padding
   octets that are currently unused (they are set either to zero or to
   pseudo-random values).  These octets can be used to carry additional
   information between the Session-Sender and the Session-Reflector.
   The proposed extension uses these padding octets to provide a method
   for monitoring services KPIs in the network.  This feature is termed
   Services KPI Monitoring using TWAMP.
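
   As a rough illustration of how padding octets can carry such
   information, the following Python sketch packs and parses the
   4-octet keepalive header that Section 2.1 (Figure 1) places at the
   start of the Packet Padding, followed by a Service PDU.  The
   function names, constants, and the example HTTP payload are
   hypothetical and are not defined by this document.  Bit 0 (X) is
   assumed to be the most significant bit of the first octet, which is
   the usual reading of such packet diagrams.

```python
from typing import Tuple

HEADER_LEN = 4    # X bit + 7 reserved bits + 3 MBZ octets (Figure 1)
ALIVE_BIT = 0x80  # assumption: bit 0 is the MSB of the first octet


def build_keepalive_padding(alive: bool, service_pdu: bytes) -> bytes:
    """Return Packet Padding: 4-octet keepalive header, then the PDU."""
    first_octet = ALIVE_BIT if alive else 0x00
    # Reserved bits and the three MBZ octets are left as zero.
    return bytes([first_octet, 0, 0, 0]) + service_pdu


def parse_keepalive_padding(padding: bytes) -> Tuple[bool, bytes]:
    """Split Packet Padding into (service alive?, Service PDU)."""
    if len(padding) < HEADER_LEN:
        raise ValueError("padding shorter than 4-octet keepalive header")
    alive = bool(padding[0] & ALIVE_BIT)
    return alive, padding[HEADER_LEN:]


if __name__ == "__main__":
    # Hypothetical Service PDU: a response from an HTTP service.
    padding = build_keepalive_padding(True, b"HTTP/1.1 200 OK\r\n\r\n")
    print(parse_keepalive_padding(padding))
```

   The sketch only covers the padding layout; how the padding is placed
   inside a TWAMP-Test packet is left to the TWAMP implementation.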

   The TWAMP server injects the packet into the service plane in order
   to calculate latency and liveliness.  This is done as part of the
   TWAMP data connection.  Each packet sent out by the TWAMP server is
   a UDP packet carrying the payload for the service whose KPIs are to
   be calculated.  Timestamping is done at the TWAMP server.  Depending
   on the service model, the TWAMP server may run on the same box that
   hosts the service or on a remote server.

   The interface between the TWAMP server and the service plane is
   implementation specific.  The underlying transport is UDP, since
   this interface is in the data path.  The use cases presented in this
   draft are service latency and keepalive monitoring.  Service latency
   is calculated for services such as DPI, TDF, and video caching.
   Similarly, liveliness is calculated for applications such as an HTTP
   server or a DNS application.

   As per the proposed extension, both the TWAMP-Control and the TWAMP-
   Test packet formats are modified.  One TWAMP-Test session SHALL be
   used to monitor KPIs for a specific service.  However, multiple KPIs
   can be monitored using a single test session for a specific service.
   A single TWAMP-Control connection MAY establish multiple TWAMP-Test
   sessions that measure KPIs for multiple services in the network.

   This extension can be used to monitor KPIs for a standalone service
   or for a set of services.

1.1.  Conventions used in this document

1.1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.2.  Terminology

   TWAMP: Two-Way Active Measurement Protocol

   KPI: Key Performance Indicator

   DPI: Deep Packet Inspection

   CGNAT: Carrier Grade Network Address Translation

   SFW: Stateful Firewall

   TDF: Traffic Detection Function

   DNS: Domain Name System

   HTTP: Hypertext Transfer Protocol

   FTP: File Transfer Protocol

   SKMC: Services KPI Monitoring Command

   PDU: Protocol Data Unit

2.  Services KPIs

2.1.  Services Keepalive Monitoring

   Metric Name: Services Keepalive Monitoring

   Metric Description: This metric indicates whether the service is
   running at a given point in time.

   Method of Measurement or Calculation: The Session-Reflector SHALL
   inject the Service PDU into the Service Block for service
   processing.  Based on whether the Session-Reflector receives a
   response, the Session-Reflector SHALL decide whether the Service is
   alive.

   Units of Measurement: This metric is expressed as a single-bit
   boolean value.  If this bit is set, the Service Block is functional.
   If this bit is NOT set, the Service Block is not functional.

   Measurement Point(s) with Potential Measurement Domain: This metric
   is calculated at the Session-Reflector.

   Measurement Timing: This metric is an instantaneous value.
   Depending on the kind of service, it MAY be useful to store the
   history of this value.  For example, it can be stored as hourly
   averages for the last 24 hours, and then as averages of all values
   over the previous day, week, month, year, etc.  These data MAY be
   used for analytics.

   Implementation: The Session-Sender SHALL send the Service PDU as
   part of the TWAMP-Test Packet Padding.  When the Session-Reflector
   receives the TWAMP-Test packet, it SHALL extract the Service PDU
   from the TWAMP payload and inject it into the service module.
   For example, in the case of an HTTP server, the Service PDU can
   simply be an HTTP request.  The service module applies its services
   to this PDU and, once service processing is done, sends the response
   (an HTTP response) back to the Session-Reflector.  The Session-
   Reflector SHALL then reply to the TWAMP client (Session-Sender) with
   the actual TWAMP data packet, whose payload is the boolean flag
   followed by the response Service PDU (the HTTP response).

   Verification: The metric value is a boolean, which SHOULD be either
   0 (Service NOT Alive) or 1 (Service is Alive).

   The Session-Reflector MUST start the Packet Padding with the 4
   octets indicated below.  These are followed by the Service PDU
   (which MAY be the same as whatever was sent by the Session-Sender or
   can be the reply/response packet of the Service Block).

   Setting Bit 0 (X) indicates that the Session-Reflector successfully
   sent the Service Request to the Service Block and received the
   response from the Service Block.  If this bit is NOT set, the
   Service Block is not functional.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |X|  Reserved   |                MBZ (3 octets)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 1: Services Keepalive Monitoring

   Use and Applications: This metric is useful for monitoring the
   liveliness of a service.  Normally, the liveliness of the server or
   network element alone is not enough to know whether the service is
   really alive.  For example, if there is a web server application,
   using ping to monitor whether the server that hosts it is alive is
   not enough to tell whether the application itself is alive.  There
   could be instances in which the server is up and running while the
   web server application is not running because of a software bug.
   Scenarios of this kind can be caught only by probing the application
   itself, and not just the server on which the application is running.

   Reporting Model: This metric needs to be associated with a defined
   time interval, which could be a fixed interval.  If needed, the
   TWAMP client can negotiate checking the liveliness of a service
   during connection establishment.  If so, the liveliness of the
   service is measured for each data packet and reported back to the
   Session-Sender (client) for the entire session.

2.2.  Service Latency

   Metric Name: Service Latency

   Metric Description: This metric indicates the total latency
   introduced by the service for a data packet that undergoes the
   specific treatment offered by that service node in the path.  Note
   that the latency calculation is service agnostic.  Service Latency
   SHALL include the transit time and the actual service time.  The
   transit time is the round-trip time on the path between the Session-
   Reflector and the service node.  The service time is the service
   processing time (or service treatment time).

   Method of Measurement or Calculation: The Session-Reflector SHALL
   record the current time as the Service latency measurement Sender
   Timestamp and inject the Service PDU into the Service Block for
   service processing.  When the Session-Reflector receives the
   response, it SHALL record the current time as the Service latency
   measurement Receiver Timestamp.

   Units of Measurement: This metric is expressed using timestamps,
   each of which is a 64-bit value.  The format of the timestamp is the
   same as in [RFC1305] and is as follows: the first 32 bits represent
   the unsigned integer number of seconds elapsed since 0h on 1 January
   1900; the next 32 bits represent the fractional part of a second
   that has elapsed since then.
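
   The timestamp format and the latency calculation can be sketched in
   Python as follows.  This is only an illustration under stated
   assumptions: the function names are hypothetical, the two timestamps
   are assumed to be laid out back to back as the first 16 octets of
   the Packet Padding (Sender Timestamp first, as Section 2.2 specifies
   for the reply), and wraparound of the 32-bit seconds field is not
   handled.

```python
import struct
from typing import Tuple

# A timestamp is modelled as (seconds since 0h 1 January 1900,
# fraction of a second in units of 2**-32 s), both unsigned 32-bit.
Timestamp = Tuple[int, int]


def pack_latency_timestamps(sender: Timestamp, receiver: Timestamp) -> bytes:
    """Encode the two 64-bit timestamps as 16 octets, network order."""
    return struct.pack("!IIII", sender[0], sender[1],
                       receiver[0], receiver[1])


def service_latency_seconds(padding: bytes) -> float:
    """Receiver Timestamp minus Sender Timestamp, in seconds."""
    if len(padding) < 16:
        raise ValueError("padding shorter than two 64-bit timestamps")
    s_sec, s_frac, r_sec, r_frac = struct.unpack("!IIII", padding[:16])
    # Treat each timestamp as one 64-bit fixed-point number.
    sent = (s_sec << 32) | s_frac
    received = (r_sec << 32) | r_frac
    return (received - sent) / 2**32


if __name__ == "__main__":
    # Hypothetical values: the service took three quarters of a second.
    octets = pack_latency_timestamps((3900000000, 0x80000000),
                                     (3900000001, 0x40000000))
    print(service_latency_seconds(octets))
```

   Working on the combined 64-bit integer avoids losing precision in
   the fractional part before the subtraction is done.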

   The timestamp is represented as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Integer part of seconds                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Fractional part of seconds                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      Figure 2: Timestamp Format

   Measurement Point(s) with Potential Measurement Domain: This metric
   is calculated at the Session-Reflector.

   Measurement Timing: This metric is an instantaneous value.
   Depending on the kind of service, it MAY be useful to store the
   history of this value.  For example, it can be stored as hourly
   averages for the last 24 hours, and then as averages of all values
   over the previous day, week, month, year, etc.  These data MAY be
   used for analytics.

   Implementation: The Session-Reflector SHALL extract the Service PDU
   from the TWAMP payload and inject it into the service module.  For
   example, in the case of an HTTP server, the Service PDU can simply
   be an HTTP request.  The injection into the service module can be
   broken into the following steps:

   -  The Session-Reflector should note down the time (say T5) at which
      the Service PDU (the HTTP request) is injected into the service
      module.

   -  The service module applies its services to this PDU and, once
      service processing is done, sends the response (an HTTP response)
      back to the Session-Reflector.

   -  Once this serviced response PDU is received at the Session-
      Reflector, the time (say T6) SHOULD be noted.

   -  The Session-Reflector SHALL then reply to the TWAMP client
      (Session-Sender) with the actual TWAMP data packet, whose payload
      is the two timestamps followed by the response Service PDU (the
      HTTP response).

   -  After receiving the packet back, the TWAMP client (Session-
      Sender) can calculate the service latency, which is T6 - T5.

   Verification: This metric value is a pair of timestamps.
   Service Latency is calculated by subtracting the "Service latency
   measurement Sender Timestamp" from the "Service latency measurement
   Receiver Timestamp".

   The Session-Reflector MUST start the Packet Padding with the 16
   octets indicated below.  These SHALL be followed by the Service PDU
   (which MAY be the same as whatever was sent by the Session-Sender or
   can be the reply/response packet of the Service Block).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |         Service latency measurement Sender Timestamp          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |        Service latency measurement Receiver Timestamp         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 3: Service Latency

3.  Acknowledgements

   We would like to thank Perceival A Monteiro for their comments,
   suggestions, reviews, helpful discussion, and proof-reading.

4.  IANA Considerations

   TWAMP Services KPIs Registry

   IANA is requested to reserve and maintain the following Services
   KPIs:

   +-------+-------------+---------------------------------------------+
   | Value | Description | Explanation                                 |
   +-------+-------------+---------------------------------------------+
   | 0     | None        |                                             |
   | 1     | Keepalive   | Whether the respective service is running   |
   |       |             | or not                                      |
   | 2     | Service     | Service Latency which SHALL include the     |
   |       | Latency     | transit time and actual service time        |
   +-------+-------------+---------------------------------------------+

                 Table 1: TWAMP Services KPIs Registry

   In the Request-TW-Session message defined in [RFC6038], IANA is
   requested to reserve 2 octets for the Service ID as follows:

   +-------+-------------+--------------------------------+------------+
   | Value | Description | Semantics                      | Reference  |
   +-------+-------------+--------------------------------+------------+
   | X     | Service ID  | 2 octets starting from the     | This       |
   |       |             | 92nd octet                     | document   |
   +-------+-------------+--------------------------------+------------+

           Table 2: New Services KPIs Monitoring Capability

5.  Security Considerations

   The TWAMP protocol [RFC5357] supports authenticated and encrypted
   modes for TWAMP sessions and data.  The implementation discussed in
   the proposed extension supports the authenticated and encrypted
   modes and therefore provides a secure mechanism to monitor services
   KPIs in the network.

6.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5357]  Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
              Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
              RFC 5357, DOI 10.17487/RFC5357, October 2008.

   [RFC6038]  Morton, A. and L. Ciavattone, "Two-Way Active Measurement
              Protocol (TWAMP) Reflect Octets and Symmetrical Size
              Features", RFC 6038, DOI 10.17487/RFC6038, October 2010.

Authors' Addresses

   Srivathsa Sarangapani
   Juniper Networks
   89, Asthagrama Layout 2nd Stage, Basavehwaranagar
   Bangalore  560079
   INDIA

   Phone: +91 9845052354
   Email: srivathsas@juniper.net

   Peyush Gupta
   Juniper Networks
   Flat #206, Keerti Royal Apartment, Outer Ring Road
   Bangalore, Karnataka  560043
   INDIA

   Phone: +91 9449251927
   Email: peyushg@juniper.net

   Vinayak Hegde
   Consultant
   Brahma Sun City, Wadgaon-Sheri
   Pune, Maharashtra  411014
   INDIA

   Phone: +91 944984401
   Email: vinayakh@gmail.com
   URI:   http://www.vinayakhegde.com

   Qin Wu
   Huawei
   101 Software Avenue, Yuhua District
   Nanjing, Jiangsu  210012
   China

   Phone: +86-25-84565892
   Email: bill.wu@huawei.com