Network Working Group                                       F. Brockners
Internet-Draft                                               S. Bhandari
Intended status: Informational                                   S. Dara
Expires: September 14, 2017                                 C. Pignataro
                                                                   Cisco
                                                              H. Gredler
                                                            RtBrick Inc.
                                                                J. Leddy
                                                                 Comcast
                                                               S. Youell
                                                                    JMPC
                                                                D. Mozes
                                              Mellanox Technologies Ltd.
                                                              T. Mizrahi
                                                                 Marvell
                                                             P. Lapukhov
                                                                Facebook
                                                                R. Chang
                                                       Barefoot Networks
                                                          March 13, 2017

                      Requirements for In-situ OAM
               draft-brockners-inband-oam-requirements-03

Abstract

   This document discusses the motivation and requirements for including
   specific operational and telemetry information in data packets while
   they traverse a path between two points in the network. This method
   is referred to as "in-situ" Operations, Administration, and
   Maintenance (OAM), given that the OAM information is carried with the
   data packets as opposed to in "out-of-band" packets dedicated to OAM.
   In-situ OAM complements other OAM mechanisms which use dedicated
   probe packets to convey OAM information.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 14, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions
   3.  Motivation for in-situ OAM
     3.1.  Path Congruency Issues with Dedicated OAM Packets
     3.2.  Results Sent to a System Other Than the Sender
     3.3.  Overlay and Underlay Correlation
     3.4.  SLA Verification
     3.5.  Analytics and Diagnostics
     3.6.  Frame Replication/Elimination Decision for
           Bi-casting/Active-active Networks
     3.7.  Proof of Transit
     3.8.  Use Cases
   4.  Considerations for In-situ OAM
     4.1.  Type of Information to be Recorded
     4.2.  MTU and Packet Size
     4.3.  Administrative Boundaries
       4.3.1.  Layered In-Situ OAM Domains
     4.4.  Selective Enablement
     4.5.  Forwarding Behavior
     4.6.  Optimization of Node and Interface Identifiers
     4.7.  Loop Communication Path (IPv6-specifics)
   5.  Requirements for In-situ OAM Data Types
     5.1.  Generic Requirements
     5.2.  In-situ OAM Data with Per-hop Scope
     5.3.  In-situ OAM with Selected Hop Scope
     5.4.  In-situ OAM with End-to-end Scope
   6.  Security Considerations and Requirements
     6.1.  General considerations
     6.2.  Proof of Transit
   7.  IANA Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document discusses requirements for "in-situ" Operations,
   Administration, and Maintenance (OAM) mechanisms. In this context,
   "in-situ OAM" refers to the concept of directly encoding telemetry
   information within the data packet as it traverses the network or
   telemetry domain. Mechanisms which add tracing or other types of
   telemetry information to the regular data traffic, sometimes also
   referred to as "in-band" OAM, can complement active, probe-based
   mechanisms such as ping or traceroute, which are sometimes considered
   "out-of-band" because the messages are transported independently
   from regular data traffic. In terms of "active" or "passive" OAM,
   "in-situ" OAM can be considered a hybrid OAM type. While no extra
   packets are sent, in-situ OAM adds information to the packets and
   therefore cannot be considered passive. In terms of the
   classification given in [RFC7799], in-situ OAM could be portrayed as
   "hybrid OAM, type 1". "In-situ" mechanisms do not require extra
   packets to be sent and hence don't change the packet traffic mix
   within the network. Traceroute and ping, for example, use ICMP
   messages: New packets are injected to get tracing information. Those
   add to the number of messages in a network, which already might be
   highly loaded or suffering performance issues for a particular path
   or traffic type.

   A number of in-situ as well as in-band OAM mechanisms have been
   discussed, such as the INT spec for the P4 programming language [P4]
   or the SPUD prototype [I-D.hildebrand-spud-prototype]. The SPUD
   prototype uses a similar logic that allows network devices on the
   path between endpoints to participate explicitly in the tube outside
   the end-to-end context. Even the IPv4 route-record option defined in
   [RFC0791] can be considered an in-situ OAM mechanism. As stated
   above, in-situ OAM complements "out-of-band" mechanisms such as ping
   or traceroute, or more recent active probing mechanisms, as described
   in [I-D.lapukhov-dataplane-probe]. In-situ OAM mechanisms can be
   leveraged where current out-of-band mechanisms do not apply or do not
   offer the desired characteristics or requirements, such as proving
   that a certain set of traffic takes a pre-defined path, verifying
   that strict congruency between overlay and underlay transports is in
   place, checking service level agreements for the live data traffic,
   obtaining detailed statistics or verification of path selections
   within a domain, or handling scenarios where probe traffic is
   potentially handled differently from regular data traffic by the
   network devices. [RFC7276] presents an overview of OAM tools.

   Compared to probably the most basic example of "in-situ OAM", which
   is IPv4 route recording [RFC0791], an in-situ OAM approach has the
   following capabilities:

   a.  A flexible data format to allow different types of information to
       be captured as part of an in-situ OAM operation, including but
       not limited to path tracing information, operational and
       telemetry information such as timestamps, sequence numbers, or
       even generic data such as queue size, geo-location of the node
       that forwarded the packet, etc.

   b.  A data format to express node as well as link identifiers to
       record the path a packet takes with a fixed amount of added data.

   c.  The ability to determine whether any nodes were skipped while
       recording in-situ OAM information (i.e., in-situ OAM is not
       supported or not enabled on those nodes).

   d.  The ability to actively process information in the packet, for
       example to prove in a cryptographically secure way that a packet
       really took a pre-defined path using some traffic steering method
       such as service chaining or traffic engineering.

   e.  The ability to include OAM data beyond simple path information,
       such as timestamps or even generic data of a particular use case.

   f.  The ability to carry in-situ OAM data in various transport
       protocols.

2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Abbreviations used in this document:

   ECMP:      Equal Cost Multi-Path

   IOAM:      In-situ Operations, Administration, and Maintenance

   LISP:      Locator/ID Separation Protocol

   MTU:       Maximum Transmission Unit

   NSH:       Network Service Header

   NFV:       Network Function Virtualization

   OAM:       Operations, Administration, and Maintenance

   PMTU:      Path MTU

   SFC:       Service Function Chain

   SLA:       Service Level Agreement

   SR:        Segment Routing

   SID:       Segment Identifier

   VXLAN-GPE: Virtual eXtensible Local Area Network, Generic Protocol
              Extension

   This document defines in-situ Operations, Administration, and
   Maintenance (in-situ OAM) as the subset of OAM in which OAM
   information is carried along with data packets. This is as opposed
   to "out-of-band OAM", where specific packets are dedicated to
   carrying OAM information.

3.  Motivation for in-situ OAM

   In several scenarios it is beneficial to make information about the
   path a packet took through the network or through a network device,
   as well as associated telemetry information, available to the
   operator. This includes not only tasks like debugging,
   troubleshooting, network planning, and network optimization, but also
   policy or service level agreement compliance checks. This section
   discusses the motivation to introduce new methods for enhanced
   in-situ network diagnostics.

3.1.  Path Congruency Issues with Dedicated OAM Packets

   Packet scheduling algorithms, especially for balancing traffic across
   equal cost paths or links, often leverage information contained
   within the packet, such as protocol number, IP address, or MAC
   address. Probe packets would thus either need to be sent from the
   exact same endpoints with the exact same parameters, or probe packets
   would need to be artificially constructed as "fake" packets and
   inserted along the path. Both approaches are often not feasible from
   an operational perspective, be it that access to the end-system is
   not feasible, or that the diversity of parameters and associated
   probe packets to be created is simply too large. An in-situ
   mechanism is an alternative in those cases.

   In-situ mechanisms are not affected when a router handles (and
   potentially forwards) probe traffic differently from regular data
   traffic. This assumes that the addition of in-situ information does
   not change the forwarding behavior of the packet. Note that in
   certain implementations, the addition of information to a transport
   protocol changes the forwarding behavior. IPv6 extension header
   processing is one example: Some implementations process IPv6 packets
   with extension headers in the "slow" path of a router, as opposed to
   the "fast" path.

3.2.  Results Sent to a System Other Than the Sender

   Traditional ping and traceroute tools return the OAM results to the
   sender of the probe. Even when the ICMP messages that are used with
   these tools are enhanced, and additional telemetry is collected
   (e.g., ICMP Multi-Part [RFC4884] supporting MPLS information
   [RFC4950], Interface and Next-Hop Identification [RFC5837], etc.), it
   would be advantageous to separate the sending of an OAM probe from
   the receiving of the telemetry data. In this context, it is helpful
   to eliminate the requirement that there be a working bidirectional
   path.

3.3.  Overlay and Underlay Correlation

   Several network deployments leverage tunneling mechanisms to create
   overlay or service-layer networks. Examples include VXLAN-GPE, GRE,
   or LISP. One often observed attribute of overlay networks is that
   they do not offer the user of the overlay any insight into the
   underlay network. This means that neither the path that a particular
   tunneled packet takes nor other operational details, such as the
   per-hop delay/jitter in the underlay, are visible to the user of the
   overlay network, giving rise to diagnosis and debugging challenges in
   case of connectivity or performance issues. The scope of OAM tools
   like ping or traceroute is limited to either the overlay or the
   underlay, which means that the user of the overlay typically has no
   access to OAM in the underlay unless specific operational procedures
   are put in place. With in-situ OAM the operator of the underlay can
   offer details of the connectivity in the underlay to the user of the
   overlay. This could include the ability to find out which underlay
   elements are shared by overlays and the ability to know which
   overlays are mapped to the same underlay elements. Depending on the
   deployment, underlay transit nodes can be configured to update OAM
   information in the overlay transport encapsulation. The operator of
   the egress tunnel router could choose to share the recorded
   information about the path with the user of the overlay.

   Coupled with mechanisms such as Segment Routing (SR)
   [I-D.ietf-spring-segment-routing], overlay network and underlay
   network can be more tightly coupled: The user of the overlay has
   detailed diagnostic information available in case of failure
   conditions. The user of the overlay can also use the path recording
   information as input to traffic steering or traffic engineering
   mechanisms, for example to achieve path symmetry for the traffic
   between two endpoints. [I-D.brockners-lisp-sr] is an example of how
   these methods can be applied to LISP.

3.4.  SLA Verification

   In-situ OAM can help users of an overlay service to verify that
   negotiated SLAs for the real traffic are met by the underlay network
   provider. Unlike solutions which rely on active probes to test an
   SLA, in-situ OAM based mechanisms avoid wrong interpretations and
   "cheating", which can happen if the probe traffic that is used to
   perform the SLA check is prioritized by the network provider of the
   underlay. In active/standby deployments in-situ OAM would only allow
   for SLA verification of the active path.

3.5.  Analytics and Diagnostics

   Network planners and operators benefit from knowledge of the actual
   traffic distribution in the network. When deriving an overall
   network connectivity traffic matrix one typically needs to correlate
   data gathered from each individual device in the network. If the
   path of a packet is recorded while the packet is forwarded, the
   entire path that a packet took through the network is available to
   the egress system. This obviates the need to retrieve individual
   traffic statistics from every device in the network and correlate
   those statistics, or employ other mechanisms such as leveraging
   traffic engineering with null-bandwidth tunnels just to retrieve the
   appropriate statistics to generate the traffic matrix.

   In addition, with individual path tracing, information is available
   at packet level granularity, rather than only at aggregate level - as
   is usually the case with IPFIX-style methods which employ
   flow-filters at the network elements. Data-center networks which use
   equal-cost multipath (ECMP) forwarding are one example where detailed
   statistics on flow distribution in the network are highly desired.
   If a network supports ECMP, one can create detailed statistics for
   the different paths packets take through the network at the egress
   system, without a need to correlate/aggregate statistics from every
   router in the system. Transit devices are off-loaded from the task
   of gathering packet statistics.

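   The egress-side aggregation described above can be illustrated with
   a minimal sketch. It assumes, purely for illustration, that the
   egress system has already extracted the recorded path of each packet
   as a tuple of node identifiers; the function name and the input
   format are hypothetical and are not defined by this document.

```python
from collections import Counter


def path_statistics(recorded_paths):
    """Aggregate per-packet path records at the egress system.

    recorded_paths: iterable of tuples of node IDs, one tuple per
    packet, in the order the nodes were recorded (hypothetical format).
    Returns (packets per full path, packets per ingress/egress pair).
    """
    per_path = Counter()
    traffic_matrix = Counter()
    for path in recorded_paths:
        if not path:
            continue  # no in-situ OAM data was recorded for this packet
        path = tuple(path)
        per_path[path] += 1
        # First and last recorded node stand in for the two domain edges.
        traffic_matrix[(path[0], path[-1])] += 1
    return per_path, traffic_matrix


# Three packets between edge nodes 1 and 4, spread over two ECMP paths.
packets = [(1, 2, 4), (1, 3, 4), (1, 2, 4)]
per_path, matrix = path_statistics(packets)
```

   In this sketch both the per-path distribution and the edge-to-edge
   traffic matrix are derived from data available at the egress alone,
   without querying any transit device.
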
   In high-speed networks one can leverage and benefit from
   packet-accurate measurements with, for example, hardware-accurate
   timestamping (i.e., nanosecond-level verification) to support
   optimized packet scheduling and queuing mechanisms.

3.6.  Frame Replication/Elimination Decision for Bi-casting/
      Active-active Networks

   Bandwidth- and power-constrained, time-sensitive, or loss-intolerant
   networks (e.g., networks for industry automation/control, health
   care) require efficient OAM methods to decide when to replicate
   packets to a secondary path in order to keep the loss/error-rate for
   the receiver at a tolerable level - and also when to stop replication
   and eliminate the redundant flow. Many Internet of Things (IoT)
   networks are time sensitive and cannot leverage automatic
   retransmission requests (ARQ) to cope with transmission errors or
   lost packets. Transmitting the data over multiple disparate paths
   (often called bi-casting or live-live) is a method used to reduce the
   error rate observed by the receiver. Time sensitive networks (TSN)
   receive a lot of attention from the manufacturing industry, as shown
   by various standardization activities and industry forums being
   formed (see e.g., IETF 6TiSCH, IEEE P802.1CB, AVnu).

3.7.  Proof of Transit

   Several deployments use traffic engineering, policy routing, segment
   routing or Service Function Chaining (SFC) [RFC7665] to steer packets
   through a specific set of nodes. In certain cases regulatory
   obligations or a compliance policy require proof that all packets
   that are supposed to follow a specific path are indeed being
   forwarded across the exact set of nodes specified. If a packet flow
   is supposed to go through a series of service functions or network
   nodes, it has to be proven that all packets of the flow actually went
   through the service chain or collection of nodes specified by the
   policy. In case the packets of a flow weren't appropriately
   processed, a verification device would be required to identify the
   policy violation and take the actions the policy prescribes (e.g.,
   drop or redirect the packet, send an alert, etc.). In today's
   deployments, the proof that a packet traversed a particular service
   chain is typically delivered in an indirect way: Service appliances
   and network forwarding are in different trust domains. Physical
   hand-off-points are defined between these trust domains (i.e.,
   physical interfaces). Or in other terms, in the "network forwarding
   domain" things are wired up in a way that traffic is delivered to the
   ingress interface of a service appliance and received back from an
   egress interface of a service appliance. This "wiring" is verified
   and trusted. The evolution to Network Function Virtualization (NFV)
   and modern service chaining concepts (using technologies such as
   Locator/ID Separation Protocol (LISP), Network Service Header (NSH),
   Segment Routing (SR), etc.) blurs the line between the different
   trust domains, because the hand-off-points are no longer clearly
   defined physical interfaces, but are virtual interfaces. For that
   very reason, network operators require that different trust layers
   not be mixed in the same device. For an NFV scenario a different
   proof is required. Offering a proof that a packet traversed a
   specific set of service functions would allow network operators to
   move away from the above described indirect methods of proving that a
   service chain is in place for a particular application.

   Deployed service chains without the presence of a "proof of transit"
   mechanism are typically operated as fail-open systems: The packets
   that arrive at the end of a service chain are processed. Adding
   "proof of transit" capabilities to a service chain allows an operator
   to turn a fail-open system into a fail-closed system, i.e., packets
   that did not properly traverse the service chain can be blocked.

   A solution approach could be based on OAM data which is added to
   every packet for achieving Proof Of Transit (POT). The OAM data is
   updated at every hop and is used to verify whether a packet traversed
   all required nodes. When the verifier receives each packet, it can
   validate whether the packet traversed the service chain correctly.
   The detailed mechanisms used for path verification along with the
   procedures applied to the OAM data carried in the packet for path
   verification are beyond the scope of this document. Details are
   addressed in [I-D.brockners-proof-of-transit]. In this document the
   term "proof" refers to a discrete set of bits that represents an
   integer or string carried as OAM data. The OAM data is used to
   verify whether a packet traversed the nodes it is supposed to
   traverse.

3.8.  Use Cases

   In-situ OAM could be leveraged for several use cases, including:

   o  Traffic Matrix: Derive the network traffic matrix, i.e., the
      traffic for a given time interval between any two edge nodes of a
      given domain. This could be performed for all traffic or per
      Quality of Service (QoS) class.

   o  Flow Debugging: Discover which path(s) a particular set of traffic
      (identified by an n-tuple) takes in the network. Such a procedure
      is particularly useful in case traffic is balanced across multiple
      paths, like with link aggregation (LACP) or equal cost
      multi-pathing (ECMP).

   o  Loss Statistics per Path: Retrieve loss statistics per flow and
      path in the network.

   o  Path Heat Maps: Discover highly utilized links in the network.

   o  Trend Analysis on Traffic Patterns: Analyze if (and if so how) the
      forwarding path for a specific set of traffic changes over time
      (this can hint at routing issues, unstable links, etc.).

   o  Network Delay Distribution: Show delay distribution across the
      network by node or link. If enabled per application or for a
      specific flow, display the path taken along with the delay
      incurred at every hop.

   o  SLA Verification: Verify that a negotiated service level agreement
      (SLA), e.g., for packet drop rates or delay/jitter, is met by the
      actual traffic.

   o  Low-power Networks: Include application level OAM information
      (e.g., battery charge level, cache or buffer fill level) in data
      traffic to avoid sending extra OAM traffic which incurs an extra
      cost on the devices. Using the battery charge level as an
      example, one could avoid sending extra OAM packets just to
      communicate battery health, and as such would save battery on
      sensors.

   o  Path Verification or Service Function Path Verification: Proof and
      verification of packets traversing check points in the network,
      where check points can be nodes in the network or service
      functions.

   o  Geo-location Policy: Network policy implemented based on which
      path packets took. Example: Access to specific applications or
      servers is granted only if packets originated and stayed within
      the trading-floor department.

   o  Device-level Troubleshooting and Optimization: In many cases,
      network operators could benefit from information specific to a
      single device. A non-exhaustive list of useful information
      includes: queue depths, buffer utilization (either shared or
      per-port), packet latency measured from a known starting point,
      packet latency introduced by a single device, and resource
      utilization (CPU, memory, link bandwidth) of a given device or
      link. In some cases, this information changes over per-packet
      timescales (i.e., nanoseconds) and as such it is extremely
      challenging to collect and report this information in an accurate
      and scalable manner. By encoding the information from the
      forwarding element directly within a data packet (i.e., within the
      'fast-path') this information can be added to some or all data
      packets and then collected and analyzed by human or machine tools.
      This type of information is particularly valuable for
      troubleshooting low-level device errors as well as providing a
      knowledge feedback loop for network and device optimization.

   o  Custom Network Probing: Active network probing and in-situ OAM can
      be combined for customized and efficient network probing. This
      could for example be a customized traceroute.

4.  Considerations for In-situ OAM

   The implementation of an in-situ OAM mechanism needs to take several
   considerations into account, including administrative boundaries, how
   information is recorded, Maximum Transmission Unit (MTU), Path MTU
   Discovery (PMTUD) and packet size, etc.

4.1.  Type of Information to be Recorded

   The information gathered for in-situ OAM can be categorized into
   three main categories: Information with a per-hop scope, such as path
   tracing; information which applies to a specific set of hops, such as
   path or service chain verification; information which only applies to
   the edges of a domain, such as sequence numbers. Note that a single
   network device could comprise several in-situ OAM hops, for example
   in case one wants to trace the path of a packet through that device.

   o  "edge to edge": Information that needs to be shared between
      network edges (the "edge" of a network could either be a host or a
      domain edge device). For example, packet and octet counts of data
      entering and leaving a well-defined domain help in building a
      traffic matrix, and a sequence number (also called "path packet
      counter") allows packet loss to be detected for a flow.

   o  "selected hops": Information that applies to a specific set of
      nodes only. In case of path verification, only the nodes which
      are "check points" are required to interpret and update the
      information in the packet.

   o  "per hop": Information that is gathered at every hop along the
      path a packet traverses within an administrative domain:

      *  Hop-by-hop information, e.g., nodes visited for path tracing,
         or timestamps at each hop to find delays along the path.

      *  Statistics collected at each hop to optimize communication in
         resource constrained networks, e.g., battery, CPU, or memory
         status of each node piggybacked in a data packet, which is
         useful in low power lossy networks where network nodes are
         mostly asleep and communication is expensive.

4.2.  MTU and Packet Size

   The data recorded at every hop might cause the packet size to exceed
   the Maximum Transmission Unit (MTU). A detailed discussion of the
   implications of oversized IPv6 header chains is found in [RFC7112].
   The Path MTU restricts the amount of data that can be recorded for
   the purpose of OAM within a data packet.

   If in-situ OAM data is inserted at the edge of the domain (e.g., by
   intermediate routers) then the MTU on all interfaces within the
   domain (MTU_INT) MUST be >= the maximum MTU on any "external" facing
   interfaces (MTU_EXT) and the total size of in-situ OAM data to be
   recorded MUST be <= (MTU_INT - MTU_EXT).

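   The two constraints above can be checked with a few lines of code.
   The following is a minimal sketch, not a normative procedure: the
   function names, the per-hop record size, and the fixed header
   overhead are hypothetical values chosen for illustration only.

```python
def ioam_budget(mtu_int, mtu_ext):
    """Return the octets available for in-situ OAM data in a domain.

    mtu_int: MTU of the interfaces within the domain (MTU_INT).
    mtu_ext: maximum MTU of any external facing interface (MTU_EXT).
    """
    if mtu_int < mtu_ext:
        raise ValueError("MTU_INT must be >= MTU_EXT")
    return mtu_int - mtu_ext


def fits(mtu_int, mtu_ext, hops, octets_per_hop, fixed_overhead=0):
    """Check that the total in-situ OAM data is <= (MTU_INT - MTU_EXT).

    hops, octets_per_hop and fixed_overhead are illustrative
    assumptions about how much data is recorded.
    """
    total = fixed_overhead + hops * octets_per_hop
    return total <= ioam_budget(mtu_int, mtu_ext)


# Example: MTU_INT = 1600 and MTU_EXT = 1500 leave 100 octets.
budget = ioam_budget(1600, 1500)
ten_hops_ok = fits(1600, 1500, hops=10, octets_per_hop=8, fixed_overhead=8)
twelve_hops_ok = fits(1600, 1500, hops=12, octets_per_hop=8, fixed_overhead=8)
```

   With these example numbers, ten hops of 8 octets each plus an 8-octet
   header (88 octets) fit into the 100-octet budget, while twelve hops
   (104 octets) do not.
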
   In-situ OAM comprises two approaches to insert OAM data fields in the
   packets:

   o  Pre-allocated: In this case, the encapsulating node inserts empty
      data fields into the packet to cover the entire domain. The data
      fields will be incrementally updated/filled as the packet
      progresses through the network. With pre-allocation the packet
      size is only changed at the encapsulating node and is kept
      constant throughout the domain. The pre-allocated approach is
      beneficial for software data-plane implementations, where
      allocating the required space only once and indexing into the
      array to populate the data during transit avoids copy operations
      at every hop.

   o  Incremental: Every node that desires to include in-situ OAM
      information extends the packet as needed. The incremental
      approach is beneficial for hardware data-plane implementations as
      it eliminates the need for the transit nodes to read the full
      array and look up the pointer in the option prior to updating the
      contents of the data fields.

   The "incremental" and the "pre-allocated" approaches could even be
   combined in the same deployment - in which case two in-situ OAM
   headers would be present in the packet: One for the incremental
   approach and one for the pre-allocated approach. In such a case one
   would expect that nodes with a hardware data-plane would update the
   incremental header, whereas nodes with a software data-plane would
   process the pre-allocated header.

4.3.  Administrative Boundaries

   There are several challenges in enabling in-situ OAM in the public
   Internet as well as in corporate/enterprise networks across
   administrative domains, which include but are not limited to:

   o  Depending on the deployment, the data fields that in-situ OAM
      requires as part of a specific transport protocol may not be
      supported across administrative boundaries.

588 o Current OAM implementations are often done in the slow path, i.e., 589 OAM packets are punted to router's CPU for processing. This leads 590 to performance and scaling issues and opens up routers for attacks 591 such as Denial of Service (DoS) attacks. 593 o Discovery of network topology and details of the network devices 594 across administrative boundaries may open up attack vectors 595 compromising network security. 597 o Specifically on IPv6: At the administrative boundaries IPv6 598 packets with extension headers are dropped for several reasons 599 described in [RFC7872]. 601 The following considerations will be discussed in a future version of 602 this document: If the packet is dropped due to the presence of the 603 in-situ OAM; If the policy failure is treated as feature disablement 604 and any further recording is stopped but the packet itself is not 605 dropped, it may lead to every node in the path to make this policy 606 decision. 608 4.3.1. Layered In-Situ OAM Domains 610 Like any OAM domain, in-situ OAM domains could also be layered/ 611 nested. Layering/nesting of in-situ OAM follows the general approach 612 of OAM layering: An in-situ OAM domain consists of maintenance end- 613 points (MEP) and maintenance intermediate points (MIP). MEP add to 614 or remove the entire set of in-situ OAM data fields from the traffic, 615 while only MIP update or add in-situ OAM data fields. When in-situ 616 OAM layering is employed, a MEP of one layer becomes a MIP in the 617 layer above, while MIP of the lower layer are not visible to the 618 layer above - unless specifically configured otherwise. 620 Consider the following examples: 622 o NSH over IPv6: In-situ OAM data fields could be present in both 623 transport protocols: NSH and IPv6, with NSH forming the overlay 624 network and IPv6 forming the underlay network. The network which 625 deploys NSH would form an in-situ OAM domain. 
      In addition, each IPv6 underlay network which connects two NSH
      nodes forms an in-situ OAM domain. The in-situ OAM domain with
      NSH as transport could be considered as layered on top of the
      different in-situ OAM domains which use IPv6 as transport.

   o  NSH using an in-situ OAM aware transport: Consider a case where
      the underlay network does not natively support in-situ OAM, but
      the individual transport nodes have the capability to "look deep
      into the packet" and update/add in-situ OAM information in the
      NSH header. The in-situ OAM domain with NSH as transport could be
      considered as layered on top of the different in-situ OAM domains
      which are in-situ OAM aware and connect the individual NSH nodes.

4.4.  Selective Enablement

   The ability to selectively enable in-situ OAM is valuable. While it
   may be desirable to enable data collection on all traffic or devices,
   this may not always be feasible. In-situ OAM collection may also
   come with a performance impact on forwarding rates or feature
   capabilities, which may be acceptable in only some locations. For
   example, the SPUD prototype uses the notion of "pipes" to describe
   the portion of the traffic that could be subject to in-path
   inspection. Mechanisms to decide which traffic would be subject to
   in-situ OAM are outside the scope of this document.

4.5.  Forwarding Behavior

   In-situ OAM adds data fields to live user traffic and as such changes
   the packet, which is why in-situ OAM is characterized as "hybrid,
   type 1" OAM. The effectiveness of in-situ OAM as a tool for
   operations depends on forwarding nodes not altering their forwarding
   behavior when in-situ OAM data fields are present in the packet. As
   a consequence, an implementation of in-situ OAM should not change the
   forwarding behavior of the packet, i.e.,
   packets with or without in-situ OAM data fields should be handled the
   same way by a forwarding node (see also the associated requirement
   further below). Note that there are implementations where the
   addition of meta-data to live user traffic might cause the forwarding
   behavior of the packet to change, e.g., certain implementations
   handle IPv6 packets with or without extension headers differently
   (see [RFC7872]).

4.6.  Optimization of Node and Interface Identifiers

   Since packets have a finite maximum size, the data recording or
   carrying capacity of one packet in which the in-situ OAM metadata is
   present is limited. In-situ OAM should use its own dedicated
   namespace (confined to the domain in-situ OAM operates in) to
   represent node and interface IDs to save space in the header.
   Generic representations of node and interface identifiers which are
   globally unique (such as a UUID) would consume significantly more
   bits of in-situ OAM data.

4.7.  Loop Communication Path (IPv6-specifics)

   When recorded data is required to be analyzed on a source node that
   issues a packet and inserts in-situ OAM data, the recorded data needs
   to be carried back to the source node.

   One way to carry the in-situ OAM data back to the source is to
   utilize an ICMP Echo Request/Reply (ping) or ICMPv6 Echo Request/
   Reply (ping6) mechanism. In order to run the in-situ OAM mechanism
   appropriately on the ping/ping6 mechanism, the following two
   operations should be implemented by the ping/ping6 target node:

   1.  All of the in-situ OAM fields would be copied from an Echo
       Request message to an Echo Reply message.

   2.  The Hop Limit field of the IPv6 header of these messages would be
       copied as a continuous sequence. Further considerations will be
       addressed in a future version of this document.

5.  Requirements for In-situ OAM Data Types

   The use cases discussed above require different types of in-situ OAM
   data. This section details requirements for in-situ OAM derived from
   the discussion above.

5.1.  Generic Requirements

   REQ-G1:  Classification: It should be possible to enable in-situ OAM
            on a selected set of traffic (e.g., per interface, based on
            an access control list specifying a specific set of
            traffic, etc.). The selected set of traffic can also be all
            traffic.

   REQ-G2:  Scope: If in-situ OAM is used only within a specific
            domain, provisions need to be put in place to ensure that
            in-situ OAM data stays within the specific domain only.

   REQ-G3:  Transport independence: Data formats for in-situ OAM shall
            be defined in a transport independent way. In-situ OAM
            applies to a variety of transport protocols.
            Encapsulations should define how the generic data formats
            are carried by a specific protocol.

   REQ-G4:  Layering: It should be possible to have in-situ OAM
            information for different transport protocol layers be
            present in several fields within a single packet. This
            could for example be the case when tunnels are employed and
            in-situ OAM information is to be gathered for both the
            underlay and the overlay network. Layering support should
            not be limited to just underlay and overlay, but should
            allow for more than two layers.

   REQ-G5:  MTU size: With in-situ OAM information added, packets MUST
            NOT become larger than the path MTU.

            REQ-G5.1:  If a packet cannot be forwarded because of the
                       presence of in-situ OAM data fields, the node
                       SHOULD remove the in-situ OAM data fields and
                       forward the packet, rather than drop the entire
                       packet.

            REQ-G5.2:  If the encapsulating router is unable to insert
                       in-situ OAM data fields into a packet, e.g., due
                       to MTU issues, even though it is configured to
                       do so, it should use some operational means
                       (e.g., syslog) to inform the operator about the
                       inability to add in-situ OAM data fields. Even
                       if the in-situ OAM encapsulating node fails to
                       add in-situ OAM data fields, it should forward
                       the packet normally.

            REQ-G5.3:  MTU size considerations for in-situ OAM MUST
                       take domain specifics into account, e.g.,
                       changes of the domain topology due to path
                       protection mechanisms might extend the hop count
                       of a path.

   REQ-G6:  Data structure reuse: The data fields and associated types
            defined and used for in-situ OAM ought to be reusable for
            out-of-band OAM telemetry as well.

   REQ-G7:  Data fields: It is desirable that the format of in-situ OAM
            data fields leverages already defined data formats for OAM
            as much as feasible.

   REQ-G8:  Combination with active OAM mechanisms: In-situ OAM should
            be usable for active network probing, for example with a
            customized version of traceroute. Decapsulating in-situ
            OAM nodes may have the ability to send the in-situ OAM
            information retrieved from the packet back to the source
            address of the packet or to the encapsulating node.

   REQ-G9:  Unaltered forwarding behavior of in-situ OAM nodes: The
            addition of in-situ OAM data fields should not change the
            way packets are forwarded within the in-situ OAM domain.

   REQ-G10: Layering of in-situ OAM domains: It should be possible to
            layer in-situ OAM domains on each other. Layering should
            be supported within the same, as well as across different,
            transport protocols which carry in-situ OAM data fields.

5.2.  In-situ OAM Data with Per-hop Scope

   REQ-H1:  Missing nodes detection: Data shall be present that allows a
            node to detect whether all nodes that might participate in
            in-situ OAM operations have indeed participated.

   REQ-H2:  Node, instance or device identifier: Data shall be present
            that allows retrieval of the identity of the entity
            reporting telemetry information. The entity can be a
            device, or a subsystem/component within a device. The
            latter will allow for packet tracing within a device in
            much the same way as between devices.

   REQ-H3:  Ingress interface identifier: Data shall be present that
            allows the identification of the interface a particular
            packet was received from. The interface can be a logical
            and/or physical entity.

   REQ-H4:  Egress interface identifier: Data shall be present that
            allows the identification of the interface a particular
            packet was forwarded to. The interface can be a logical
            and/or physical entity.

   REQ-H5:  Time-related requirements

            REQ-H5.1:  Delay: Data shall be present that allows
                       retrieval of the delay between two or more
                       points of interest within the system. Those
                       points can be within the same device or on
                       different devices.

            REQ-H5.2:  Jitter: Data shall be present that allows
                       retrieval of the jitter between two or more
                       points of interest within the system. Those
                       points can be within the same device or on
                       different devices. Jitter can be derived from
                       the different timestamps gathered and does not
                       necessarily need to be an explicit data field.

            REQ-H5.3:  Wall-clock time: Data shall be present that
                       allows retrieval of the wall-clock time at which
                       a packet visited a particular point of interest
                       in the system.

            REQ-H5.4:  Time precision: Time with different precision
                       should be supported. Depending on the use case,
                       the required precision could be, e.g.,
                       nanoseconds, microseconds, milliseconds, or
                       seconds.

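   As an illustration of REQ-H5.1 and REQ-H5.2, the following minimal
   sketch shows how delay and jitter could be derived from per-hop
   timestamps without a dedicated jitter field. It is not part of any
   in-situ OAM data format: the function names are hypothetical, the
   timestamps are assumed to be nanosecond values taken from
   synchronized clocks, and jitter is assumed here to mean the mean
   absolute difference between the delays of consecutive packets, which
   is only one of several possible definitions.

```python
def hop_delays(timestamps_ns):
    """Delay between consecutive points of interest for one packet.

    timestamps_ns: timestamps recorded along the path, in path order
    (assumed to come from synchronized clocks).
    """
    return [later - earlier
            for earlier, later in zip(timestamps_ns, timestamps_ns[1:])]


def path_delay(timestamps_ns):
    """Delay between the first and the last point of interest."""
    return timestamps_ns[-1] - timestamps_ns[0]


def jitter(delays_ns):
    """Mean absolute difference between delays of consecutive packets.

    This definition is an assumption made for this sketch; the
    requirements above do not prescribe a particular jitter metric.
    """
    diffs = [abs(b - a) for a, b in zip(delays_ns, delays_ns[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0


# Three packets of one flow, each carrying three per-hop timestamps.
packets = [
    [1000, 1400, 2100],
    [5000, 5450, 6100],
    [9000, 9380, 10120],
]
end_to_end = [path_delay(p) for p in packets]  # [1100, 1100, 1120]
print(hop_delays(packets[0]))                  # [400, 700]
print(jitter(end_to_end))                      # 10.0
```

   The same functions apply whether the points of interest are on
   different devices or within a single device, as long as the
   timestamps share a common time base.
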
   REQ-H6:  Generic data fields (e.g., GPS/Geo-location information):
            It should be possible to add user-defined OAM data to the
            packet at select hops. The semantics of the data are
            defined by the user.

5.3.  In-situ OAM with Selected Hop Scope

   REQ-S1:  Proof of transit: Data shall be present which allows
            securely proving that a packet has visited one or several
            particular points of interest (i.e., a particular set of
            nodes).

            REQ-S1.1:  In case "Shamir's secret sharing scheme" is used
                       for proof of transit, two data fields, "random"
                       and "cumulative", shall be present. The number
                       of bits used for the "random" and "cumulative"
                       data fields can vary between deployments and
                       should thus be configurable.

            REQ-S1.2:  Enable a fail-open service chaining system to be
                       converted into a fail-closed service chaining
                       system.

5.4.  In-situ OAM with End-to-end Scope

   REQ-E1:  Sequence numbering:

            REQ-E1.1:  Reordering detection: It should be possible to
                       detect whether packets have been reordered while
                       traversing an in-situ OAM domain.

            REQ-E1.2:  Duplicates detection: It should be possible to
                       detect whether packets have been duplicated while
                       traversing an in-situ OAM domain.

            REQ-E1.3:  Detection of packet drops: It should be possible
                       to detect whether packets have been dropped while
                       traversing an in-situ OAM domain.

6.  Security Considerations and Requirements

6.1.  General considerations

   General security considerations will be expanded on in a later
   version of this document.

   In-situ OAM is considered a "per domain" feature, where one or
   several operators decide on leveraging and configuring in-situ OAM
   according to their needs. Still, operators need to properly secure
   the in-situ OAM domain to avoid malicious configuration and use,
   which could include injecting malicious in-situ OAM packets into a
   domain.

6.2.  Proof of Transit

   Threat Model: Attacks on the deployments could be due to malicious
   administrators or accidental misconfiguration resulting in bypassing
   of certain nodes. The solution approach should meet the following
   requirements:

   REQ-SEC1:  Sound Proof of Transit: A valid and verifiable proof that
              the packet definitively traversed all the nodes as
              expected. Probabilistic methods to achieve this should be
              avoided, as they could be exploited by an attacker.

   REQ-SEC2:  Tampering of meta data: An active attacker should not be
              able to insert, modify, or delete meta data in whole or
              in part and bypass a few (or all) nodes. Any deviation
              from the expected path should be accurately determined.

   REQ-SEC3:  Replay Attacks: An attacker (active/passive) should not be
              able to reuse the POT bits in the packet by observing the
              OAM data in the packet, packet characteristics (like IP
              addresses, octets transferred, timestamps) or even the
              proof bits themselves. The solution approach should be
              cautious about using these parameters for deriving any
              secrets. Mitigating replay attacks beyond a window of
              longer duration could be intractable with a fixed number
              of bits allocated for the proof.

   REQ-SEC4:  Pre-play Attacks: An active attacker should not be able to
              generate or reuse valid POT bits from legitimate packets
              in order to have crafted packets accepted by the verifier
              as valid. This is a slight variant of replay attacks: the
              attacker extracts POT bits from legitimate packets,
              ensures that these packets do not reach the verifier, and
              subsequently reuses those POT bits in crafted packets.

   REQ-SEC5:  Recycle Secrets: Any configuration of the secrets (like
              cryptographic keys, initialization vectors, etc.) either
              in the controller or in the service functions should be
              re-configurable. The solution approach should enable the
              controls, API calls, etc. needed to perform such
              recycling.
              It is desirable to provide recommendations on the duration
              of rotation cycles needed for the secure functioning of
              the overall system.

   REQ-SEC6:  Secret storage and distribution: Secrets should be shared
              with the devices over secure channels. Methods should be
              put in place so that secrets cannot be retrieved from the
              devices by non-authorized personnel.

7.  IANA Considerations

   [RFC Editor: please remove this section prior to publication.]

   This document has no IANA actions.

8.  Acknowledgements

   The authors would like to thank Jen Linkova, LJ Wobker, Eric Vyncke,
   Nalini Elkins, Srihari Raghavan, Ranganathan T S, Karthik Babu
   Harichandra Babu, Akshaya Nadahalli, Ignas Bagdonas, Erik Nordmark,
   Vengada Prasad Govindan, and Andrew Yourtchenko for their comments
   and advice. This document leverages and builds on top of several
   concepts described in [I-D.kitamura-ipv6-record-route]. The authors
   would like to acknowledge the work done by its author, Hiroshi
   Kitamura, and the people involved in writing it.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

9.2.  Informative References

   [I-D.brockners-lisp-sr]
              Brockners, F., Bhandari, S., Maino, F., and D. Lewis,
              "LISP Extensions for Segment Routing",
              draft-brockners-lisp-sr-01 (work in progress),
              February 2014.

   [I-D.brockners-proof-of-transit]
              Brockners, F., Bhandari, S., Dara, S., Pignataro, C.,
              Leddy, J., Youell, S., Mozes, D., and T. Mizrahi, "Proof
              of Transit", draft-brockners-proof-of-transit-02 (work in
              progress), October 2016.

   [I-D.hildebrand-spud-prototype]
              Hildebrand, J. and B. Trammell, "Substrate Protocol for
              User Datagrams (SPUD) Prototype",
              draft-hildebrand-spud-prototype-03 (work in progress),
              March 2015.

   [I-D.ietf-spring-segment-routing]
              Filsfils, C., Previdi, S., Decraene, B., Litkowski, S.,
              and R. Shakir, "Segment Routing Architecture",
              draft-ietf-spring-segment-routing-10 (work in progress),
              November 2016.

   [I-D.kitamura-ipv6-record-route]
              Kitamura, H., "Record Route for IPv6 (PR6) Hop-by-Hop
              Option Extension", draft-kitamura-ipv6-record-route-00
              (work in progress), November 2000.

   [I-D.lapukhov-dataplane-probe]
              Lapukhov, P. and R. Chang, "Data-plane probe for in-band
              telemetry collection", draft-lapukhov-dataplane-probe-01
              (work in progress), June 2016.

   [P4]       Kim, "P4: In-band Network Telemetry (INT)", September
              2015.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              DOI 10.17487/RFC0791, September 1981.

   [RFC4884]  Bonica, R., Gan, D., Tappan, D., and C. Pignataro,
              "Extended ICMP to Support Multi-Part Messages", RFC 4884,
              DOI 10.17487/RFC4884, April 2007.

   [RFC4950]  Bonica, R., Gan, D., Tappan, D., and C. Pignataro, "ICMP
              Extensions for Multiprotocol Label Switching", RFC 4950,
              DOI 10.17487/RFC4950, August 2007.

   [RFC5837]  Atlas, A., Ed., Bonica, R., Ed., Pignataro, C., Ed., Shen,
              N., and JR. Rivers, "Extending ICMP for Interface and
              Next-Hop Identification", RFC 5837, DOI 10.17487/RFC5837,
              April 2010.

   [RFC7112]  Gont, F., Manral, V., and R. Bonica, "Implications of
              Oversized IPv6 Header Chains", RFC 7112,
              DOI 10.17487/RFC7112, January 2014.

   [RFC7276]  Mizrahi, T., Sprecher, N., Bellagamba, E., and Y.
              Weingarten, "An Overview of Operations, Administration,
              and Maintenance (OAM) Tools", RFC 7276,
              DOI 10.17487/RFC7276, June 2014.

   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
              Chaining (SFC) Architecture", RFC 7665,
              DOI 10.17487/RFC7665, October 2015.

   [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
              Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
              May 2016.

   [RFC7872]  Gont, F., Linkova, J., Chown, T., and W. Liu,
              "Observations on the Dropping of Packets with IPv6
              Extension Headers in the Real World", RFC 7872,
              DOI 10.17487/RFC7872, June 2016.

Authors' Addresses

   Frank Brockners
   Cisco Systems, Inc.
   Hansaallee 249, 3rd Floor
   DUESSELDORF, NORDRHEIN-WESTFALEN 40549
   Germany

   Email: fbrockne@cisco.com

   Shwetha Bhandari
   Cisco Systems, Inc.
   Cessna Business Park, Sarjapura Marathalli Outer Ring Road
   Bangalore, KARNATAKA 560 087
   India

   Email: shwethab@cisco.com

   Sashank Dara
   Cisco Systems, Inc.
   Cessna Business Park, Sarjapura Marathalli Outer Ring Road
   Bangalore, KARNATAKA 560 087
   India

   Email: sadara@cisco.com

   Carlos Pignataro
   Cisco Systems, Inc.
   7200-11 Kit Creek Road
   Research Triangle Park, NC 27709
   United States

   Email: cpignata@cisco.com

   Hannes Gredler
   RtBrick Inc.

   Email: hannes@rtbrick.com

   John Leddy
   Comcast

   Email: John_Leddy@cable.comcast.com

   Stephen Youell
   JP Morgan Chase
   25 Bank Street
   London E14 5JP
   United Kingdom

   Email: stephen.youell@jpmorgan.com

   David Mozes
   Mellanox Technologies Ltd.

   Email: davidm@mellanox.com

   Tal Mizrahi
   Marvell
   6 Hamada St.
   Yokneam 20692
   Israel

   Email: talmi@marvell.com

   Petr Lapukhov
   Facebook
   1 Hacker Way
   Menlo Park, CA 94025
   USA

   URI: petr@fb.com

   Remy Chang
   Barefoot Networks

   Email: remy@barefootnetworks.com