Network Working Group
Internet-Draft
Intended status: Informational
Expires: May 3, 2017

Authors:
F. Brockners, S. Bhandari, S. Dara, C. Pignataro (Cisco);
H. Gredler (RtBrick Inc.); J. Leddy (Comcast); S. Youell (JMPC);
D. Mozes (Mellanox Technologies Ltd.); T. Mizrahi (Marvell);
P. Lapukhov (Facebook); R.
Chang (Barefoot Networks)
October 30, 2016

Requirements for In-situ OAM
draft-brockners-inband-oam-requirements-02

Abstract

This document discusses the motivation and requirements for including
specific operational and telemetry information into data packets
while the data packet traverses a path between two points in the
network. This method is referred to as "in-situ" Operations,
Administration, and Maintenance (OAM), given that the OAM information
is carried with the data packets as opposed to in "out-of-band"
packets dedicated to OAM. In-situ OAM complements other OAM
mechanisms which use dedicated probe packets to convey OAM
information.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current
Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on May 3, 2017.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Table of Contents

1. Introduction
2. Conventions
3. Motivation for in-situ OAM
  3.1. Path Congruency Issues with Dedicated OAM Packets
  3.2. Results Sent to a System Other Than the Sender
  3.3. Overlay and Underlay Correlation
  3.4. SLA Verification
  3.5. Analytics and Diagnostics
  3.6. Frame Replication/Elimination Decision for
       Bi-casting/Active-active Networks
  3.7. Proof of Transit
  3.8. Use Cases
4. Considerations for In-situ OAM
  4.1. Type of Information to Be Recorded
  4.2. MTU and Packet Size
  4.3. Administrative Boundaries
  4.4. Selective Enablement
  4.5. Optimization of Node and Interface Identifiers
  4.6. Loop Communication Path (IPv6-specifics)
5. Requirements for In-situ OAM Data Types
  5.1. Generic Requirements
  5.2. In-situ OAM Data with Per-hop Scope
  5.3. In-situ OAM with Selected Hop Scope
  5.4. In-situ OAM with End-to-end Scope
6. Security Considerations and Requirements
  6.1. General considerations
  6.2. Proof of Transit
7. IANA Considerations
8. Acknowledgements
9. References
  9.1. Normative References
  9.2. Informative References
Authors' Addresses

1. Introduction

This document discusses requirements for "in-situ" Operations,
Administration, and Maintenance (OAM) mechanisms. In this context,
"in-situ OAM" refers to the concept of directly encoding telemetry
information within the data packet as it traverses the network or
telemetry domain. Mechanisms which add tracing or other types of
telemetry information to the regular data traffic, sometimes also
referred to as "in-band" OAM, can complement active, probe-based
mechanisms such as ping or traceroute, which are sometimes considered
"out-of-band" because the messages are transported independently
from regular data traffic. In terms of "active" or "passive" OAM,
"in-situ" OAM can be considered a hybrid OAM type. While no extra
packets are sent, in-situ OAM adds information to the packets and
therefore cannot be considered passive. In terms of the
classification given in [RFC7799], in-situ OAM could be portrayed as
"hybrid OAM, type 1". "In-situ" mechanisms do not require extra
packets to be sent and hence don't change the packet traffic mix
within the network. Traceroute and ping, for example, use ICMP
messages: new packets are injected to get tracing information. Those
add to the number of messages in a network, which already might be
highly loaded or suffering performance issues for a particular path
or traffic type.

A number of in-situ as well as in-band OAM mechanisms have been
discussed, such as the INT spec for the P4 programming language [P4]
or the SPUD prototype [I-D.hildebrand-spud-prototype]. The SPUD
prototype uses a similar logic that allows network devices on the
path between endpoints to participate explicitly in the tube outside
the end-to-end context. Even the IPv4 route-record option defined in
[RFC0791] can be considered an in-situ OAM mechanism. As stated
above, in-situ OAM complements "out-of-band" mechanisms such
as ping or traceroute, or more recent active probing mechanisms, as
described in [I-D.lapukhov-dataplane-probe]. In-situ OAM mechanisms
can be leveraged where current out-of-band mechanisms do not apply or
do not offer the desired characteristics or requirements, such as
proving that a certain set of traffic takes a pre-defined path,
verifying that strict congruency between overlay and underlay
transports is in place, checking service level agreements for the
live data traffic, gathering detailed statistics on or verifying path
selections within a domain, or handling scenarios where probe traffic
is potentially treated differently from regular data traffic by the
network devices. [RFC7276] presents an overview of OAM tools.

Compared to probably the most basic example of "in-situ OAM", which
is IPv4 route recording [RFC0791], an in-situ OAM approach has the
following capabilities:

a. A flexible data format to allow different types of information to
   be captured as part of an in-situ OAM operation, including but
   not limited to path tracing information, operational and
   telemetry information such as timestamps, sequence numbers, or
   even generic data such as queue size, geo-location of the node
   that forwarded the packet, etc.

b. A data format to express node as well as link identifiers to
   record the path a packet takes with a fixed amount of added data.

c. The ability to determine whether any nodes were skipped while
   recording in-situ OAM information (i.e., in-situ OAM is not
   supported or not enabled on those nodes).

d. The ability to actively process information in the packet, for
   example to prove in a cryptographically secure way that a packet
   really took a pre-defined path using some traffic steering method
   such as service chaining or traffic engineering.

e. The ability to include OAM data beyond simple path information,
   such as timestamps or even generic data of a particular use case.

f. The ability to carry in-situ OAM data in a variety of transport
   protocols.

2. Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].

Abbreviations used in this document:

ECMP: Equal Cost Multi-Path

LISP: Locator/ID Separation Protocol

MTU: Maximum Transmission Unit

NSH: Network Service Header

NFV: Network Function Virtualization

OAM: Operations, Administration, and Maintenance

PMTU: Path MTU

SFC: Service Function Chain

SLA: Service Level Agreement

SR: Segment Routing

This document defines in-situ Operations, Administration, and
Maintenance (in-situ OAM) as the subset of OAM in which OAM
information is carried along with data packets. This is as opposed
to "out-of-band OAM", where specific packets are dedicated to
carrying OAM information.

3. Motivation for in-situ OAM

In several scenarios it is beneficial to make information about the
path a packet took through the network or through a network device,
as well as associated telemetry information, available to the operator.
This includes not only tasks such as debugging, troubleshooting,
network planning, and network optimization, but also policy or
service level agreement compliance checks. This section discusses
the motivation to introduce new methods for enhanced in-situ network
diagnostics.

3.1. Path Congruency Issues with Dedicated OAM Packets

Packet scheduling algorithms, especially for balancing traffic across
equal cost paths or links, often leverage information contained
within the packet, such as protocol number, IP address, or MAC
address. Probe packets would thus either need to be sent from the
exact same endpoints with the exact same parameters, or probe packets
would need to be artificially constructed as "fake" packets and
inserted along the path. Both approaches are often not feasible from
an operational perspective, be it that access to the end-system is
not feasible, or that the diversity of parameters and associated
probe packets to be created is simply too large. An in-situ
mechanism is an alternative in those cases.

In-situ mechanisms are not impacted by differences in the handling of
probe traffic compared to other data packets, where probe traffic is
handled differently (and potentially forwarded differently) by a
router than regular data traffic. This assumes that the
addition of in-situ information does not change the forwarding
behavior of the packet. Note that in certain implementations, the
addition of information to a transport protocol changes the
forwarding behavior. IPv6 extension header processing is one
example: some implementations process IPv6 packets with extension
headers in the "slow" path of a router, as opposed to the "fast"
path.

3.2. Results Sent to a System Other Than the Sender

Traditional ping and traceroute tools return the OAM results to the
sender of the probe.
Even when the ICMP messages that are used with
these tools are enhanced, and additional telemetry is collected
(e.g., ICMP Multi-Part [RFC4884] supporting MPLS information
[RFC4950], Interface and Next-Hop Identification [RFC5837], etc.), it
would be advantageous to separate the sending of an OAM probe from
the receiving of the telemetry data. In this context, it is helpful
to eliminate the requirement that there be a working bidirectional
path.

3.3. Overlay and Underlay Correlation

Several network deployments leverage tunneling mechanisms to create
overlay or service-layer networks. Examples include VXLAN-GPE, GRE,
or LISP. One often observed attribute of overlay networks is that
they do not offer the user of the overlay any insight into the
underlay network. This means that neither the path that a particular
tunneled packet takes nor other operational details such as the
per-hop delay/jitter in the underlay are visible to the user of the
overlay network, giving rise to diagnosis and debugging challenges in
case of connectivity or performance issues. The scope of OAM tools
like ping or traceroute is limited to either the overlay or the
underlay, which means that the user of the overlay typically has no
access to OAM in the underlay unless specific operational procedures
are put in place. With in-situ OAM the operator of the underlay can
offer details of the connectivity in the underlay to the user of the
overlay. This could include the ability to find out which underlay
elements are shared by overlays and the ability to know which
overlays are mapped to the same underlay elements. Depending on the
deployment, underlay transit nodes can be configured to update OAM
information in the overlay transport encapsulation. The operator of
the egress tunnel router could choose to share the recorded
information about the path with the user of the overlay.
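
As an illustration of the correlation described above, the following
Python sketch assumes that the egress tunnel router exports, for each
overlay, the lists of underlay node identifiers recorded in packets.
The export format, the function names (overlays_by_underlay_node,
shared_underlay_nodes), and the sample identifiers are hypothetical
and are not defined by this document.

```python
from collections import defaultdict


def overlays_by_underlay_node(recorded_paths):
    """Map each underlay node ID to the set of overlays that crossed it.

    recorded_paths: dict of overlay name -> list of recorded paths, where
    each path is a list of underlay node IDs (hypothetical export format).
    """
    index = defaultdict(set)
    for overlay, paths in recorded_paths.items():
        for path in paths:
            for node in path:
                index[node].add(overlay)
    return dict(index)


def shared_underlay_nodes(recorded_paths):
    """Return only the underlay nodes used by more than one overlay."""
    index = overlays_by_underlay_node(recorded_paths)
    return {node: ovs for node, ovs in index.items() if len(ovs) > 1}


# Illustrative data only: two overlays whose packets recorded these paths.
recorded_paths = {
    "overlay-A": [[1, 2, 4], [1, 3, 4]],
    "overlay-B": [[5, 3, 6]],
}
shared = shared_underlay_nodes(recorded_paths)
```

With the sample data, only underlay node 3 is reported as shared, since
it appears in recorded paths of both overlays.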

Coupled with mechanisms such as Segment Routing (SR)
[I-D.ietf-spring-segment-routing], overlay network and underlay
network can be more tightly coupled: the user of the overlay has
detailed diagnostic information available in case of failure
conditions. The user of the overlay can also use the path recording
information as input to traffic steering or traffic engineering
mechanisms, for example to achieve path symmetry for the traffic
between two endpoints. [I-D.brockners-lisp-sr] is an example of how
these methods can be applied to LISP.

3.4. SLA Verification

In-situ OAM can help users of an overlay service to verify that
negotiated SLAs for the real traffic are met by the underlay network
provider. Unlike solutions which rely on active probes to
test an SLA, in-situ OAM based mechanisms avoid wrong interpretations
and "cheating", which can happen if the probe traffic that is used to
perform the SLA check is prioritized by the network provider of the
underlay. In active/standby deployments in-situ OAM would only allow
for SLA verification of the active path.

3.5. Analytics and Diagnostics

Network planners and operators benefit from knowledge of the actual
traffic distribution in the network. When deriving an overall
network connectivity traffic matrix one typically needs to correlate
data gathered from each individual device in the network. If the
path of a packet is recorded while the packet is forwarded, the
entire path that a packet took through the network is available to
the egress system. This obviates the need to retrieve individual
traffic statistics from every device in the network and correlate
those statistics, or employ other mechanisms such as leveraging
traffic engineering with null-bandwidth tunnels just to retrieve the
appropriate statistics to generate the traffic matrix.
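
The derivation of a traffic matrix at the egress system can be
sketched as follows. This is a minimal Python illustration, assuming
the egress system has collected (recorded path, octet count) tuples
from received packets; the record format and the names traffic_matrix
and path_counts are assumptions made for this example only.

```python
from collections import Counter


def traffic_matrix(records):
    """Aggregate per-packet records into an edge-to-edge traffic matrix.

    records: iterable of (path, octets) tuples. path is the list of node
    IDs recorded in the packet; its first entry is taken as the ingress
    edge and its last entry as the egress edge (an assumption of this
    sketch). Returns a Counter keyed by (ingress, egress) with octets.
    """
    matrix = Counter()
    for path, octets in records:
        if not path:
            continue  # nothing was recorded for this packet
        matrix[(path[0], path[-1])] += octets
    return matrix


def path_counts(records):
    """Count packets per distinct recorded path."""
    return Counter(tuple(path) for path, _ in records if path)


# Illustrative data only.
records = [([1, 2, 4], 1500), ([1, 3, 4], 500), ([5, 3, 4], 1000)]
matrix = traffic_matrix(records)
counts = path_counts(records)
```

No per-device statistics are consulted here: both results are computed
solely from data carried in the packets that reached the egress system.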

In addition, with individual path tracing, information is available
at packet-level granularity, rather than only at aggregate level - as
is usually the case with IPFIX-style methods which employ
flow-filters at the network elements. Data-center networks which use
equal-cost multipath (ECMP) forwarding are one example where detailed
statistics on flow distribution in the network are highly desired.
If a network supports ECMP, one can create detailed statistics for
the different paths packets take through the network at the egress
system, without a need to correlate/aggregate statistics from every
router in the system. Transit devices are off-loaded from the task
of gathering packet statistics.

In high-speed networks one can leverage and benefit from
packet-accurate measurements with, for example, hardware-accurate
timestamping (i.e., nanosecond-level verification) to support
optimized packet scheduling and queuing mechanisms.

3.6. Frame Replication/Elimination Decision for
     Bi-casting/Active-active Networks

Bandwidth- and power-constrained, time-sensitive, or loss-intolerant
networks (e.g., networks for industry automation/control, health
care) require efficient OAM methods to decide when to replicate
packets to a secondary path in order to keep the loss/error-rate for
the receiver at a tolerable level - and also when to stop replication
and eliminate the redundant flow. Many Internet of Things (IoT)
networks are time sensitive and cannot leverage automatic
retransmission requests (ARQ) to cope with transmission errors or
lost packets. Transmitting the data over multiple disparate paths
(often called bi-casting or live-live) is a method used to reduce the
error rate observed by the receiver.
Time sensitive networks (TSN)
receive a lot of attention from the manufacturing industry, as shown
by various standardization activities and industry forums being
formed (see e.g., IETF 6TiSCH, IEEE P802.1CB, AVnu).

3.7. Proof of Transit

Several deployments use traffic engineering, policy routing, segment
routing or Service Function Chaining (SFC) [RFC7665] to steer packets
through a specific set of nodes. In certain cases regulatory
obligations or a compliance policy require proof that all packets
that are supposed to follow a specific path are indeed being
forwarded across the exact set of nodes specified. If a packet flow
is supposed to go through a series of service functions or network
nodes, it has to be proven that all packets of the flow actually went
through the service chain or collection of nodes specified by the
policy. In case the packets of a flow weren't appropriately
processed, a verification device would be required to identify the
policy violation and take the actions defined by the policy (e.g.,
drop or redirect the packet, send an alert, etc.).
In today's deployments, the proof that a packet traversed a
particular service chain is typically delivered in an indirect way:
service appliances and network forwarding are in different trust
domains. Physical hand-off-points are defined between these trust
domains (i.e., physical interfaces). In other terms, in the
"network forwarding domain" things are wired up in a way that traffic
is delivered to the ingress interface of a service appliance and
received back from an egress interface of a service appliance. This
"wiring" is verified and trusted. The evolution to Network Function
Virtualization (NFV) and modern service chaining concepts (using
technologies such as Locator/ID Separation Protocol (LISP), Network
Service Header (NSH), Segment Routing (SR), etc.)
blurs the line
between the different trust domains, because the hand-off-points are
no longer clearly defined physical interfaces, but are virtual
interfaces. For that very reason, network operators require
that different trust layers not be mixed in the same device. For
an NFV scenario a different proof is required. Offering a proof that
a packet traversed a specific set of service functions would allow
network operators to move away from the above described indirect
methods of proving that a service chain is in place for a particular
application.

Deployed service chains without the presence of a "proof of transit"
mechanism are typically operated as a fail-open system: the packets
that arrive at the end of a service chain are processed. Adding
"proof of transit" capabilities to a service chain allows an operator
to turn a fail-open system into a fail-close system, i.e., packets
that did not properly traverse the service chain can be blocked.

A solution approach could be based on OAM data which is added to
every packet for achieving Proof Of Transit (POT). The OAM data is
updated at every hop and is used to verify whether a packet traversed
all required nodes. When the verifier receives each packet, it can
validate whether the packet traversed the service chain correctly.
The detailed mechanisms used for path verification along with the
procedures applied to the OAM data carried in the packet for path
verification are beyond the scope of this document. Details are
addressed in [I-D.brockners-proof-of-transit]. In this document the
term "proof" refers to a discrete set of bits that represents an
integer or string carried as OAM data. The OAM data is used to
verify whether a packet traversed the nodes it is supposed to
traverse.

3.8.
Use Cases

In-situ OAM could be leveraged for several use cases, including:

o  Traffic Matrix: Derive the network traffic matrix, i.e., the
   traffic for a given time interval between any two edge nodes of a
   given domain. This could be performed for all traffic or per
   Quality of Service (QoS) class.

o  Flow Debugging: Discover which path(s) a particular set of traffic
   (identified by an n-tuple) takes in the network. Such a procedure
   is particularly useful in case traffic is balanced across multiple
   paths, like with link aggregation (LACP) or equal cost
   multi-pathing (ECMP).

o  Loss Statistics per Path: Retrieve loss statistics per flow and
   path in the network.

o  Path Heat Maps: Discover highly utilized links in the network.

o  Trend Analysis on Traffic Patterns: Analyze if (and if so how) the
   forwarding path for a specific set of traffic changes over time
   (can give hints to routing issues, unstable links, etc.).

o  Network Delay Distribution: Show delay distribution across the
   network by node or link. If enabled per application or for a
   specific flow, display the path taken along with the delay
   incurred at every hop.

o  SLA Verification: Verify that a negotiated service level agreement
   (SLA), e.g., for packet drop rates or delay/jitter, is conformed to
   by the actual traffic.

o  Low-power Networks: Include application level OAM information
   (e.g., battery charge level, cache or buffer fill level) into data
   traffic to avoid sending extra OAM traffic which incurs an extra
   cost on the devices. Using the battery charge level as an example,
   one could avoid sending extra OAM packets just to communicate
   battery health, and as such would save battery on sensors.

o  Path Verification or Service Function Path Verification: Proof and
   verification of packets traversing check points in the network,
   where check points can be nodes in the network or service
   functions.

o  Geo-location Policy: Network policy implemented based on which
   path packets took. Example: Only if packets originated and stayed
   within the trading-floor department, access to specific
   applications or servers is granted.

o  Device-level Troubleshooting and Optimization: In many cases,
   network operators could benefit from information specific to a
   single device. A non-exhaustive list of useful information
   includes: queue depths, buffer utilization (either shared or
   per-port), packet latency measured from a known starting point,
   packet latency introduced by a single device, and resource
   utilization (CPU, memory, link bandwidth) of a given device or
   link. In some cases, this information changes over per-packet
   timescales (i.e., nanoseconds) and as such it is extremely
   challenging to collect and report this information in an accurate
   and scalable manner. By encoding the information from the
   forwarding element directly within a data packet (i.e., within the
   'fast-path') this information can be added to some or all data
   packets and then collected and analyzed by human or machine tools.
   This type of information is particularly valuable for
   troubleshooting low-level device errors as well as providing a
   knowledge feedback loop for network and device optimization.

o  Custom Network Probing: Active network probing and in-situ OAM can
   be combined for customized and efficient network probing. This
   could for example be a customized traceroute.

4. Considerations for In-situ OAM

The implementation of an in-situ OAM mechanism needs to take several
considerations into account, including administrative boundaries, how
information is recorded, Maximum Transmission Unit (MTU), Path MTU
Discovery (PMTUD) and packet size, etc.

4.1.
Type of Information to Be Recorded

The information gathered for in-situ OAM can be categorized into
three main categories: information with a per-hop scope, such as path
tracing; information which applies to a specific set of hops, such as
path or service chain verification; and information which only
applies to the edges of a domain, such as sequence numbers. Note
that a single network device could comprise several in-situ OAM hops,
for example in case one wants to trace the path of a packet through
that device.

o  "edge to edge": Information that needs to be shared between
   network edges (the "edge" of a network could either be a host or a
   domain edge device). For example, packet and octet counts of data
   entering and leaving a well-defined domain help in building a
   traffic matrix, and a sequence number (also called a "path packet
   counter") helps detect packet loss within a flow.

o  "selected hops": Information that applies to a specific set of
   nodes only. In case of path verification, only the nodes which
   are "check points" are required to interpret and update the
   information in the packet.

o  "per hop": Information that is gathered at every hop along the
   path a packet traverses within an administrative domain:

   *  Hop-by-hop information, e.g., nodes visited for path tracing or
      timestamps at each hop to find delays along the path.

   *  Statistics collected at each hop to optimize communication in
      resource-constrained networks. For example, battery, CPU, or
      memory status of each node piggybacked in a data packet is
      useful in low-power lossy networks where network nodes are
      mostly asleep and communication is expensive.

4.2. MTU and Packet Size

The data recorded at every hop might lead to the packet size
exceeding the Maximum Transmission Unit (MTU). A detailed discussion
of the implications of oversized IPv6 header chains is found in
[RFC7112].
The Path MTU restricts the amount of data that can be recorded for
the purpose of OAM within a data packet.

If in-situ OAM data is inserted at the edge of the domain (e.g., by
intermediate routers), then the MTU on all interfaces within the
domain (MTU_INT) MUST be >= the maximum MTU on any "external" facing
interfaces (MTU_EXT) and the total size of in-situ OAM data to be
recorded MUST be <= (MTU_INT - MTU_EXT).

In-situ OAM comprises two approaches to insert OAM data records in
the packets:

o  Pre-allocated: In this case, the encapsulating node inserts empty
   data records into the packet to cover the entire domain. The data
   records will be incrementally updated/filled as the packet
   progresses through the network. With pre-allocation the packet
   size is only changed at the encapsulating node and is kept
   constant throughout the domain. The pre-allocated approach is
   beneficial for software data-plane implementations, where
   allocating the required space only once and indexing into the
   array to populate the data during transit avoids copy operations
   at every hop.

o  Incremental: Every node that desires to include in-situ OAM
   information extends the packet as needed. The incremental
   approach is beneficial for hardware data-plane implementations as
   it eliminates the need for the transit nodes to read the full
   array and look up the pointer in the option prior to updating the
   data record contents.

The "incremental" and the "pre-allocated" approaches could even be
combined in the same deployment - in which case two in-situ OAM
headers would be present in the packet: one for the incremental
approach and one for the pre-allocated approach. In such a case one
would expect that nodes with a hardware data-plane would update the
incremental header, whereas nodes with a software data-plane would
process the pre-allocated header.

4.3.
Administrative Boundaries

There are several challenges in enabling in-situ OAM in the public
Internet as well as in corporate/enterprise networks across
administrative domains, which include but are not limited to:

o  Depending on the deployment, the data fields that in-situ OAM
   requires as part of a specific transport protocol may not be
   supported across administrative boundaries.

o  Current OAM implementations are often done in the slow path, i.e.,
   OAM packets are punted to the router's CPU for processing. This
   leads to performance and scaling issues and opens up routers for
   attacks such as Denial of Service (DoS) attacks.

o  Discovery of network topology and details of the network devices
   across administrative boundaries may open up attack vectors
   compromising network security.

o  Specifically on IPv6: At the administrative boundaries IPv6
   packets with extension headers are dropped for several reasons
   described in [RFC7872].

The following considerations will be discussed in a future version of
this document: whether the packet is dropped due to the presence of
in-situ OAM data; and whether a policy failure is treated as feature
disablement, so that further recording is stopped but the packet
itself is not dropped, which may lead to every node in the path
having to make this policy decision.

4.4. Selective Enablement

The ability to selectively enable in-situ OAM is valuable. While it
may be desirable to enable data collection on all traffic or devices,
this may not always be feasible. In-situ OAM collection may also
come with a performance impact to forwarding rates or feature
capabilities, which may be acceptable in only some locations. For
example, the SPUD prototype uses the notion of "pipes" to describe
the portion of the traffic that could be subject to in-path
inspection.
   Mechanisms to decide which traffic would be subject to in-situ OAM
   are outside the scope of this document.

4.5.  Optimization of Node and Interface Identifiers

   Since packets have a finite maximum size, the data recording or
   carrying capacity of one packet in which the in-situ OAM metadata is
   present is limited.  In-situ OAM should use its own dedicated
   namespace (confined to the domain in-situ OAM operates in) to
   represent node and interface IDs to save space in the header.
   Generic representations of node and interface identifiers which are
   globally unique (such as a UUID) would consume significantly more
   bits of in-situ OAM data.

4.6.  Loop Communication Path (IPv6-specifics)

   When recorded data is required to be analyzed on a source node that
   issues a packet and inserts in-situ OAM data, the recorded data needs
   to be carried back to the source node.

   One way to carry the in-situ OAM data back to the source is to
   utilize an ICMP Echo Request/Reply (ping) or ICMPv6 Echo Request/
   Reply (ping6) mechanism.  In order to run the in-situ OAM mechanism
   appropriately on the ping/ping6 mechanism, the following two
   operations should be implemented by the ping/ping6 target node:

   1.  All of the in-situ OAM fields would be copied from an Echo
       Request message to an Echo Reply message.

   2.  The Hop Limit field of the IPv6 header of these messages would be
       copied as a continuous sequence.  Further considerations will be
       addressed in a future version of this document.

5.  Requirements for In-situ OAM Data Types

   The use cases discussed above require different types of in-situ OAM
   data.  This section details requirements for in-situ OAM derived from
   the discussion above.

5.1.  Generic Requirements

   REQ-G1:  Classification: It should be possible to enable in-situ OAM
            on a selected set of traffic (e.g., per interface, based on
            an access control list specifying a specific set of traffic,
            etc.).  The selected set of traffic can also be all traffic.

   REQ-G2:  Scope: If in-situ OAM is used only within a specific domain,
            provisions need to be put in place to ensure that in-situ
            OAM data stays within the specific domain only.

   REQ-G3:  Transport independence: Data formats for in-situ OAM shall
            be defined in a transport independent way.  In-situ OAM
            applies to a variety of transport protocols.  Encapsulations
            should be defined that specify how the generic data formats
            are carried by a specific protocol.

   REQ-G4:  Layering: It should be possible to have in-situ OAM
            information for different transport protocol layers be
            present in several fields within a single packet.  This
            could for example be the case when tunnels are employed and
            in-situ OAM information is to be gathered for both the
            underlay as well as the overlay network.  Layering support
            should not be limited to just underlay and overlay, but
            should extend to more than two layers.

   REQ-G5:  MTU size: With in-situ OAM information added, packets MUST
            NOT become larger than the path MTU.

            REQ-G5.1:  If, for some reason, a packet that contains
                       in-situ OAM data records cannot be forwarded due
                       to the presence of those records, the node SHOULD
                       remove the in-situ OAM data records and forward
                       the packet, rather than drop the entire packet.

            REQ-G5.2:  If the encapsulating router is unable to insert
                       in-situ OAM data records into a packet, e.g., due
                       to MTU issues, even though it is configured to do
                       so, it should use some operational means (e.g.,
                       syslog) to inform the operator about the
                       inability to add in-situ OAM data records.
                       Even if the in-situ OAM encapsulating node fails
                       to add in-situ OAM data records, it should
                       forward the packet normally.

            REQ-G5.3:  MTU size considerations for in-situ OAM MUST take
                       domain specifics into account; for example,
                       changes to the domain topology due to path
                       protection mechanisms might extend the hop count
                       of a path.

   REQ-G6:  Data structure reuse: The data types and data formats
            defined and used for in-situ OAM ought to be reusable for
            out-of-band OAM telemetry as well.

   REQ-G7:  Data records format: It is desirable that the format of
            in-situ OAM data records leverages already defined data
            formats for OAM as much as feasible.

   REQ-G8:  Combination with active OAM mechanisms: In-situ OAM should
            be usable for active network probing, for example with a
            customized version of traceroute.  Decapsulating in-situ OAM
            nodes may have an ability to send the in-situ OAM
            information retrieved from the packet back to the source
            address of the packet or to the encapsulating node.

5.2.  In-situ OAM Data with Per-hop Scope

   REQ-H1:  Missing nodes detection: Data shall be present that allows a
            node to detect whether all nodes that might participate in
            in-situ OAM operations have indeed participated.

   REQ-H2:  Node, instance or device identifier: Data shall be present
            that allows the identity of the entity reporting telemetry
            information to be retrieved.  The entity can be a device, or
            a subsystem/component within a device.  The latter will
            allow for packet tracing within a device in much the same
            way as between devices.

   REQ-H3:  Ingress interface identifier: Data shall be present that
            allows the identification of the interface a particular
            packet was received on.  The interface can be a logical
            and/or physical entity.

   REQ-H4:  Egress interface identifier: Data shall be present that
            allows the identification of the interface a particular
            packet was forwarded to.  The interface can be a logical or
            physical entity.

   REQ-H5:  Time-related requirements

            REQ-H5.1:  Delay: Data shall be present that allows the
                       delay between two or more points of interest
                       within the system to be retrieved.  Those points
                       can be within the same device or on different
                       devices.

            REQ-H5.2:  Jitter: Data shall be present that allows the
                       jitter between two or more points of interest
                       within the system to be retrieved.  Those points
                       can be within the same device or on different
                       devices.  Jitter can be derived from the
                       different timestamps gathered and does not
                       necessarily need to be an explicit data record.

            REQ-H5.3:  Wall-clock time: Data shall be present that
                       allows the wall-clock time at which a packet
                       visited a particular point of interest in the
                       system to be retrieved.

            REQ-H5.4:  Time precision: Time with different precision
                       should be supported.  Depending on the use case,
                       the required precision could be, e.g.,
                       nanoseconds, microseconds, milliseconds, or
                       seconds.

   REQ-H6:  Generic data records (e.g., GPS/geo-location information):
            It should be possible to add user-defined OAM data at select
            hops to the packet.  The semantics of the data are defined
            by the user.

5.3.  In-situ OAM with Selected Hop Scope

   REQ-S1:  Proof of transit: Data shall be present that makes it
            possible to securely prove that a packet has visited one or
            several particular points of interest (i.e., a particular
            set of nodes).

            REQ-S1.1:  In case "Shamir's secret sharing scheme" is used
                       for proof of transit, two data records, "random"
                       and "cumulative", shall be present.  The number
                       of bits used for "random" and "cumulative" data
                       records can vary between deployments and should
                       thus be configurable.

            REQ-S1.2:  Enable a fail-open service chaining system to be
                       converted into a fail-closed service chaining
                       system.

5.4.  In-situ OAM with End-to-end Scope

   REQ-E1:  Sequence numbering:

            REQ-E1.1:  Reordering detection: It should be possible to
                       detect whether packets have been reordered while
                       traversing an in-situ OAM domain.

            REQ-E1.2:  Duplicates detection: It should be possible to
                       detect whether packets have been duplicated while
                       traversing an in-situ OAM domain.

            REQ-E1.3:  Detection of packet drops: It should be possible
                       to detect whether packets have been dropped while
                       traversing an in-situ OAM domain.

6.  Security Considerations and Requirements

6.1.  General Considerations

   General security considerations will be expanded on in a later
   version of this document.

   In-situ OAM is considered a "per domain" feature, where one or
   several operators decide on leveraging and configuring in-situ OAM
   according to their needs.  Still, operators need to properly secure
   the in-situ OAM domain to avoid malicious configuration and use,
   which could include injecting malicious in-situ OAM packets into a
   domain.

6.2.  Proof of Transit

   Threat Model: Attacks on the deployments could be due to malicious
   administrators or accidental misconfiguration resulting in bypassing
   of certain nodes.  The solution approach should meet the following
   requirements:

   REQ-SEC1:  Sound Proof of Transit: A valid and verifiable proof that
              the packet definitively traversed all the nodes as
              expected.  Probabilistic methods to achieve this should be
              avoided, as they could be exploited by an attacker.

   REQ-SEC2:  Tampering with metadata: An active attacker should not be
              able to insert, modify, or delete metadata in whole or in
              part and thereby bypass a few (or all) nodes.  Any
              deviation from the expected path should be accurately
              determined.

   REQ-SEC3:  Replay Attacks: An attacker (active/passive) should not
              be able to reuse the POT bits in the packet by observing
              the OAM data in the packet, packet characteristics (like
              IP addresses, octets transferred, timestamps) or even the
              proof bits themselves.  The solution approach should be
              cautious about using these parameters to derive any
              secrets.  Mitigating replay attacks over a longer time
              window could be intractable with a fixed number of bits
              allocated for the proof.

   REQ-SEC4:  Pre-play Attacks: An active attacker should not be able to
              generate or reuse valid POT bits from legitimate packets
              in order to have crafted packets accepted as valid by the
              verifier.  This is a slight variant of the replay attack:
              the attacker extracts POT bits from legitimate packets,
              ensures that those packets do not reach the verifier, and
              subsequently reuses the POT bits in crafted packets.

   REQ-SEC5:  Recycle Secrets: Any configuration of the secrets (like
              cryptographic keys, initialization vectors, etc.) either
              in the controller or service functions should be
              reconfigurable.  The solution approach should provide the
              controls, API calls, etc. needed in order to perform such
              recycling.  It is desirable to provide recommendations on
              the duration of rotation cycles needed for the secure
              functioning of the overall system.

   REQ-SEC6:  Secret storage and distribution: Secrets should be shared
              with the devices over secure channels.  Methods should be
              put in place so that secrets cannot be retrieved by
              non-authorized personnel from the devices.

7.  IANA Considerations

   [RFC Editor: please remove this section prior to publication.]

   This document has no IANA actions.

8.  Acknowledgements

   The authors would like to thank Jen Linkova, LJ Wobker, Eric Vyncke,
   Nalini Elkins, Srihari Raghavan, Ranganathan T S, Karthik Babu
   Harichandra Babu, Akshaya Nadahalli, Ignas Bagdonas, Erik Nordmark,
   and Andrew Yourtchenko for their comments and advice.  This document
   leverages and builds on top of several concepts described in
   [I-D.kitamura-ipv6-record-route].  The authors would like to
   acknowledge the work done by the author Hiroshi Kitamura and people
   involved in writing it.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

9.2.  Informative References

   [I-D.brockners-lisp-sr]
              Brockners, F., Bhandari, S., Maino, F., and D. Lewis,
              "LISP Extensions for Segment Routing",
              draft-brockners-lisp-sr-01 (work in progress), February
              2014.

   [I-D.brockners-proof-of-transit]
              Brockners, F., Bhandari, S., Dara, S., Pignataro, C.,
              Leddy, J., and S. Youell, "Proof of Transit",
              draft-brockners-proof-of-transit-01 (work in progress),
              July 2016.

   [I-D.hildebrand-spud-prototype]
              Hildebrand, J. and B. Trammell, "Substrate Protocol for
              User Datagrams (SPUD) Prototype",
              draft-hildebrand-spud-prototype-03 (work in progress),
              March 2015.

   [I-D.ietf-spring-segment-routing]
              Filsfils, C., Previdi, S., Decraene, B., Litkowski, S.,
              and R. Shakir, "Segment Routing Architecture",
              draft-ietf-spring-segment-routing-09 (work in progress),
              July 2016.

   [I-D.kitamura-ipv6-record-route]
              Kitamura, H., "Record Route for IPv6 (PR6) Hop-by-Hop
              Option Extension", draft-kitamura-ipv6-record-route-00
              (work in progress), November 2000.

   [I-D.lapukhov-dataplane-probe]
              Lapukhov, P. and R. Chang, "Data-plane probe for in-band
              telemetry collection", draft-lapukhov-dataplane-probe-01
              (work in progress), June 2016.

   [P4]       Kim, "P4: In-band Network Telemetry (INT)", September
              2015.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              DOI 10.17487/RFC0791, September 1981.

   [RFC4884]  Bonica, R., Gan, D., Tappan, D., and C. Pignataro,
              "Extended ICMP to Support Multi-Part Messages", RFC 4884,
              DOI 10.17487/RFC4884, April 2007.

   [RFC4950]  Bonica, R., Gan, D., Tappan, D., and C. Pignataro, "ICMP
              Extensions for Multiprotocol Label Switching", RFC 4950,
              DOI 10.17487/RFC4950, August 2007.

   [RFC5837]  Atlas, A., Ed., Bonica, R., Ed., Pignataro, C., Ed., Shen,
              N., and JR. Rivers, "Extending ICMP for Interface and
              Next-Hop Identification", RFC 5837, DOI 10.17487/RFC5837,
              April 2010.

   [RFC7112]  Gont, F., Manral, V., and R. Bonica, "Implications of
              Oversized IPv6 Header Chains", RFC 7112,
              DOI 10.17487/RFC7112, January 2014.

   [RFC7276]  Mizrahi, T., Sprecher, N., Bellagamba, E., and Y.
              Weingarten, "An Overview of Operations, Administration,
              and Maintenance (OAM) Tools", RFC 7276,
              DOI 10.17487/RFC7276, June 2014.

   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
              Chaining (SFC) Architecture", RFC 7665,
              DOI 10.17487/RFC7665, October 2015.

   [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
              Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
              May 2016.

   [RFC7872]  Gont, F., Linkova, J., Chown, T., and W. Liu,
              "Observations on the Dropping of Packets with IPv6
              Extension Headers in the Real World", RFC 7872,
              DOI 10.17487/RFC7872, June 2016.

Authors' Addresses

   Frank Brockners
   Cisco Systems, Inc.
   Hansaallee 249, 3rd Floor
   DUESSELDORF, NORDRHEIN-WESTFALEN  40549
   Germany

   Email: fbrockne@cisco.com

   Shwetha Bhandari
   Cisco Systems, Inc.
   Cessna Business Park, Sarjapura Marathalli Outer Ring Road
   Bangalore, KARNATAKA  560 087
   India

   Email: shwethab@cisco.com

   Sashank Dara
   Cisco Systems, Inc.
   Cessna Business Park, Sarjapura Marathalli Outer Ring Road
   Bangalore, KARNATAKA  560 087
   India

   Email: sadara@cisco.com

   Carlos Pignataro
   Cisco Systems, Inc.
   7200-11 Kit Creek Road
   Research Triangle Park, NC  27709
   United States

   Email: cpignata@cisco.com

   Hannes Gredler
   RtBrick Inc.

   Email: hannes@rtbrick.com

   John Leddy
   Comcast

   Email: John_Leddy@cable.comcast.com

   Stephen Youell
   JP Morgan Chase
   25 Bank Street
   London  E14 5JP
   United Kingdom

   Email: stephen.youell@jpmorgan.com

   David Mozes
   Mellanox Technologies Ltd.

   Email: davidm@mellanox.com

   Tal Mizrahi
   Marvell
   6 Hamada St.
   Yokneam  20692
   Israel

   Email: talmi@marvell.com

   Petr Lapukhov
   Facebook
   1 Hacker Way
   Menlo Park, CA  94025
   USA

   Email: petr@fb.com

   Remy Chang
   Barefoot Networks

   Email: remy@barefootnetworks.com