idnits 2.17.1 draft-ietf-ippm-ioam-data-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (March 5, 2018) is 2241 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '0' on line 536 -- Looks like a reference, but probably isn't: '1' on line 388 == Missing Reference: 'IEEE1588' is mentioned on line 1067, but not defined == Unused Reference: 'I-D.hildebrand-spud-prototype' is defined on line 1395, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE1588v2' -- Possible downref: Non-RFC (?) normative reference: ref. 'POSIX' == Outdated reference: A later version (-09) exists of draft-ietf-ntp-packet-timestamps-00 == Outdated reference: A later version (-16) exists of draft-ietf-nvo3-geneve-05 == Outdated reference: A later version (-13) exists of draft-ietf-nvo3-vxlan-gpe-05 Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 5 comments (--). 
Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 ippm F. Brockners 3 Internet-Draft S. Bhandari 4 Intended status: Standards Track C. Pignataro 5 Expires: September 6, 2018 Cisco 6 H. Gredler 7 RtBrick Inc. 8 J. Leddy 9 Comcast 10 S. Youell 11 JPMC 12 T. Mizrahi 13 Marvell 14 D. Mozes 16 P. Lapukhov 17 Facebook 18 R. Chang 19 Barefoot Networks 20 D. Bernier 21 Bell Canada 22 J. Lemon 23 Broadcom 24 March 5, 2018 26 Data Fields for In-situ OAM 27 draft-ietf-ippm-ioam-data-02 29 Abstract 31 In-situ Operations, Administration, and Maintenance (IOAM) records 32 operational and telemetry information in the packet while the packet 33 traverses a path between two points in the network. This document 34 discusses the data fields and associated data types for in-situ OAM. 35 In-situ OAM data fields can be embedded into a variety of transports 36 such as NSH, Segment Routing, Geneve, native IPv6 (via extension 37 header), or IPv4. In-situ OAM can be used to complement OAM 38 mechanisms based on e.g. ICMP or other types of probe packets. 40 Status of This Memo 42 This Internet-Draft is submitted in full conformance with the 43 provisions of BCP 78 and BCP 79. 45 Internet-Drafts are working documents of the Internet Engineering 46 Task Force (IETF). Note that other groups may also distribute 47 working documents as Internet-Drafts. The list of current Internet- 48 Drafts is at http://datatracker.ietf.org/drafts/current/. 50 Internet-Drafts are draft documents valid for a maximum of six months 51 and may be updated, replaced, or obsoleted by other documents at any 52 time. It is inappropriate to use Internet-Drafts as reference 53 material or to cite them other than as "work in progress." 55 This Internet-Draft will expire on September 6, 2018. 57 Copyright Notice 59 Copyright (c) 2018 IETF Trust and the persons identified as the 60 document authors. 
All rights reserved. 62 This document is subject to BCP 78 and the IETF Trust's Legal 63 Provisions Relating to IETF Documents 64 (http://trustee.ietf.org/license-info) in effect on the date of 65 publication of this document. Please review these documents 66 carefully, as they describe your rights and restrictions with respect 67 to this document. Code Components extracted from this document must 68 include Simplified BSD License text as described in Section 4.e of 69 the Trust Legal Provisions and are provided without warranty as 70 described in the Simplified BSD License. 72 Table of Contents 74 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 75 2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 3 76 3. Scope, Applicability, and Assumptions . . . . . . . . . . . . 4 77 4. IOAM Data Types and Formats . . . . . . . . . . . . . . . . . 5 78 4.1. IOAM Tracing Options . . . . . . . . . . . . . . . . . . 6 79 4.1.1. Pre-allocated and Incremental Trace Options . . . . . 8 80 4.1.2. IOAM node data fields and associated formats . . . . 12 81 4.1.3. Examples of IOAM node data . . . . . . . . . . . . . 17 82 4.2. IOAM Proof of Transit Option . . . . . . . . . . . . . . 19 83 4.2.1. IOAM Proof of Transit Type 0 . . . . . . . . . . . . 20 84 4.3. IOAM Edge-to-Edge Option . . . . . . . . . . . . . . . . 22 85 5. Timestamp Formats . . . . . . . . . . . . . . . . . . . . . . 23 86 5.1. PTP Truncated Timestamp Format . . . . . . . . . . . . . 23 87 5.2. NTP 64-bit Timestamp Format . . . . . . . . . . . . . . . 25 88 5.3. POSIX-based Timestamp Format . . . . . . . . . . . . . . 26 89 6. IOAM Data Export . . . . . . . . . . . . . . . . . . . . . . 27 90 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 28 91 7.1. Creation of a new In-Situ OAM Protocol Parameters 92 Registry (IOAM) Protocol Parameters IANA registry . . . . 28 93 7.2. IOAM Type Registry . . . . . . . . . . . . . . . . . . . 28 94 7.3. IOAM Trace Type Registry . . . . . 
. . . . . . . . . . . 29 95 7.4. IOAM Trace Flags Registry . . . . . . . . . . . . . . . . 29 96 7.5. IOAM POT Type Registry . . . . . . . . . . . . . . . . . 29 97 7.6. IOAM POT Flags Registry . . . . . . . . . . . . . . . . . 29 98 7.7. IOAM E2E Type Registry . . . . . . . . . . . . . . . . . 29 99 8. Manageability Considerations . . . . . . . . . . . . . . . . 29 100 9. Security Considerations . . . . . . . . . . . . . . . . . . . 30 101 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 30 102 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 30 103 11.1. Normative References . . . . . . . . . . . . . . . . . . 30 104 11.2. Informative References . . . . . . . . . . . . . . . . . 31 105 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 32 107 1. Introduction 109 This document defines data fields for "in-situ" Operations, 110 Administration, and Maintenance (IOAM). In-situ OAM records OAM 111 information within the packet while the packet traverses a particular 112 network domain. The term "in-situ" refers to the fact that the OAM 113 data is added to the data packets rather than being sent within 114 packets specifically dedicated to OAM. IOAM is intended to complement 115 mechanisms such as Ping or Traceroute, or more recent active probing 116 mechanisms as described in [I-D.lapukhov-dataplane-probe]. In terms 117 of "active" or "passive" OAM, "in-situ" OAM can be considered a 118 hybrid OAM type. While no extra packets are sent, IOAM adds 119 information to the packets and therefore cannot be considered passive. 120 In terms of the classification given in [RFC7799], IOAM could be 121 classified as Hybrid Type 1. "In-situ" mechanisms do not require 122 extra packets to be sent and hence don't change the packet traffic 123 mix within the network. IOAM mechanisms can be leveraged where 124 mechanisms using e.g.
ICMP do not apply or do not offer the desired 125 results, such as proving that a certain traffic flow takes a pre- 126 defined path, SLA verification for the live data traffic, detailed 127 statistics on traffic distribution paths in networks that distribute 128 traffic across multiple paths, or scenarios in which probe traffic is 129 potentially handled differently from regular data traffic by the 130 network devices. 132 2. Conventions 134 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 135 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 136 document are to be interpreted as described in [RFC2119]. 138 Abbreviations used in this document: 140 E2E Edge to Edge 142 Geneve: Generic Network Virtualization Encapsulation 143 [I-D.ietf-nvo3-geneve] 145 IOAM: In-situ Operations, Administration, and Maintenance 146 MTU: Maximum Transmit Unit 148 NSH: Network Service Header [I-D.ietf-sfc-nsh] 150 OAM: Operations, Administration, and Maintenance 152 POT: Proof of Transit 154 SFC: Service Function Chain 156 SID: Segment Identifier 158 SR: Segment Routing 160 VXLAN-GPE: Virtual eXtensible Local Area Network, Generic Protocol 161 Extension [I-D.ietf-nvo3-vxlan-gpe] 163 3. Scope, Applicability, and Assumptions 165 IOAM deployment assumes a set of constraints, requirements, and 166 guiding principles which are described in this section. 168 Scope: This document defines the data fields and associated data 169 types for in-situ OAM. The in-situ OAM data field can be transported 170 by a variety of transport protocols, including NSH, Segment Routing, 171 Geneve, IPv6, or IPv4. Specification details for these different 172 transport protocols are outside the scope of this document. 174 Deployment domain (or scope) of in-situ OAM deployment: IOAM is a 175 network domain focused feature, with "network domain" being a set of 176 network devices or entities within a single administration. 
For 177 example, a network domain can include an enterprise campus using 178 physical connections between devices or an overlay network using 179 virtual connections / tunnels for connectivity between said devices. 180 A network domain is defined by its perimeter or edge. Designers of 181 carrier protocols for IOAM must specify mechanisms to ensure that 182 IOAM data stays within an IOAM domain. In addition, the operator of 183 such a domain is expected to put provisions in place to ensure that 184 IOAM data does not leak beyond the edge of an IOAM domain, e.g. using 185 packet filtering methods. The operator should consider the 186 potential operational impact of IOAM on mechanisms such as ECMP 187 processing (e.g. load-balancing schemes based on packet length could 188 be impacted by the increased packet size due to IOAM), path MTU (i.e. 189 ensure that the MTU of all links within a domain is sufficiently 190 large to support the increased packet size due to IOAM) and ICMP 191 message handling (i.e. in case of a native IPv6 transport, IOAM 192 support for ICMPv6 Echo Request/Reply could be desired, which would 193 translate into ICMPv6 extensions to enable IOAM data fields to be 194 copied from an Echo Request message to an Echo Reply message). 196 IOAM control points: IOAM data fields are added to or removed from 197 the live user traffic by the devices which form the edge of a domain. 198 Devices within an IOAM domain can update and/or add IOAM data-fields. 199 Domain edge devices can be hosts or network devices. 201 Traffic-sets that IOAM is applied to: IOAM can be deployed on all or 202 only on subsets of the live user traffic. It SHOULD be possible to 203 enable IOAM on a selected set of traffic (e.g., per interface, based 204 on an access control list or flow specification defining a specific 205 set of traffic, etc.). The selected set of traffic can also be all 206 traffic.
208 Encapsulation independence: Data formats for IOAM SHOULD be defined 209 in a transport-independent manner. IOAM applies to a variety of 210 encapsulating protocols. A definition of how IOAM data fields are 211 carried by different transport protocols is outside the scope of this 212 document. 214 Layering: If several encapsulation protocols (e.g., in case of 215 tunneling) are stacked on top of each other, IOAM data-records could 216 be present at every layer. The behavior follows the ships-in-the- 217 night model. 219 Combination with active OAM mechanisms: IOAM should be usable for 220 active network probing, enabling for example a customized version of 221 traceroute. Decapsulating IOAM nodes may have an ability to send the 222 IOAM information retrieved from the packet back to the source address 223 of the packet or to the encapsulating node. 225 IOAM implementation: The IOAM data-field definitions take the 226 specifics of devices with hardware data-plane and software data-plane 227 into account. 229 4. IOAM Data Types and Formats 231 This section defines IOAM data types and data fields and associated 232 data types required for IOAM. 234 To accommodate the different uses of IOAM, IOAM data fields fall into 235 different categories, e.g. edge-to-edge, per node tracing, or for 236 proof of transit. In IOAM these categories are referred to as IOAM- 237 Types. A common registry is maintained for IOAM-Types, see 238 Section 7.2 for details. Corresponding to these IOAM-Types, 239 different IOAM data fields are defined. IOAM data fields can be 240 encapsulated into a variety of protocols, such as NSH, Geneve, IPv6, 241 etc. The definition of how IOAM data fields are encapsulated into 242 other protocols is outside the scope of this document. 244 IOAM is expected to be deployed in a specific domain rather than on 245 the overall Internet. The part of the network which employs IOAM is 246 referred to as the "IOAM-domain". 
IOAM data is added to a packet 247 upon entering the IOAM-domain and is removed from the packet when 248 exiting the domain. Within the IOAM-domain, the IOAM data may be 249 updated by network nodes that the packet traverses. The device which 250 adds an IOAM data container to the packet to capture IOAM data is 251 called the "IOAM encapsulating node", whereas the device which 252 removes the IOAM data container is referred to as the "IOAM 253 decapsulating node". Nodes within the domain which are aware of IOAM 254 data and read and/or write or process the IOAM data are called "IOAM 255 transit nodes". IOAM nodes which add or remove the IOAM data 256 container can also update the IOAM data fields at the same time. Or 257 in other words, IOAM encapsulation or decapsulating nodes can also 258 serve as IOAM transit nodes at the same time. Note that not every 259 node in an IOAM domain needs to be an IOAM transit node. For 260 example, a Segment Routing deployment might require the segment 261 routing path to be verified. In that case, only the SR nodes would 262 also be IOAM transit nodes rather than all nodes. 264 4.1. IOAM Tracing Options 266 "IOAM tracing data" is expected to be collected at every node that a 267 packet traverses to ensure visibility into the entire path a packet 268 takes within an IOAM domain, i.e., in a typical deployment all nodes 269 in an in-situ OAM-domain would participate in IOAM and thus be IOAM 270 transit nodes, IOAM encapsulating or IOAM decapsulating nodes. If 271 not all nodes within a domain are IOAM capable, IOAM tracing 272 information will only be collected on those nodes which are IOAM 273 capable. Nodes which are not IOAM capable will forward the packet 274 without any changes to the IOAM data fields. The maximum number of 275 hops and the minimum path MTU of the IOAM domain is assumed to be 276 known. 278 To optimize hardware and software implementations tracing is defined 279 as two separate options. 
Any deployment MAY choose to configure and 280 support one or both of the following options. An implementation of 281 the transport protocol that carries this in-situ OAM data MAY choose 282 to support only one of the options. In the event that both options 283 are utilized at the same time, the Incremental Trace Option MUST be 284 placed before the Pre-allocated Trace Option. Given that the 285 operator knows which equipment is deployed in a particular IOAM domain, the 286 operator will decide by means of configuration which type(s) of trace 287 options will be enabled for a particular domain. 289 Pre-allocated Trace Option: This trace option is defined as a 290 container of node data fields with pre-allocated space for each 291 node to populate its information. This option is useful for 292 software implementations where it is efficient to allocate the 293 space once and index into the array to populate the data during 294 transit. The IOAM encapsulating node allocates the option header 295 and sets the fields in the option header. The in-situ OAM 296 encapsulating node allocates an array which is used to store 297 operational data retrieved from every node while the packet 298 traverses the domain. IOAM transit nodes update the content of 299 the array. A pointer which is part of the IOAM trace data points 300 to the next empty slot in the array, which is where the next IOAM 301 transit node fills in its data. 303 Incremental Trace Option: This trace option is defined as a 304 container of node data fields where each node allocates and pushes 305 its node data immediately following the option header. This type 306 of trace recording is useful for some hardware 307 implementations because it eliminates the need for the transit 308 network elements to read the full array in the option and allows 309 packets to grow as long as the MTU permits. The in-situ OAM 310 encapsulating node allocates the option header.
The in-situ OAM 311 encapsulating node based on operational state and configuration 312 sets the fields in the header that control what node data fields 313 should be collected, and how large the node data list can grow. 314 The in-situ OAM transit nodes push their node data to the node 315 data list, decrease the remaining length available to subsequent 316 nodes, and adjust the lengths and possibly checksums in outer 317 headers. 319 Every node data entry is to hold information for a particular IOAM 320 transit node that is traversed by a packet. The in-situ OAM 321 decapsulating node removes the IOAM data and processes and/or exports 322 the metadata. IOAM data uses its own name-space for information such 323 as node identifier or interface identifier. This allows for a 324 domain-specific definition and interpretation. For example: In one 325 case an interface-id could point to a physical interface (e.g., to 326 understand which physical interface of an aggregated link is used 327 when receiving or transmitting a packet) whereas in another case it 328 could refer to a logical interface (e.g., in case of tunnels). 330 The following IOAM data is defined for IOAM tracing: 332 o Identification of the IOAM node. An IOAM node identifier can 333 match to a device identifier or a particular control point or 334 subsystem within a device. 336 o Identification of the interface that a packet was received on, 337 i.e. ingress interface. 339 o Identification of the interface that a packet was sent out on, 340 i.e. egress interface. 342 o Time of day when the packet was processed by the node. Different 343 definitions of processing time are feasible and expected, though 344 it is important that all devices of an in-situ OAM domain follow 345 the same definition. 347 o Generic data: Format-free information where syntax and semantic of 348 the information is defined by the operator in a specific 349 deployment. 
For a specific deployment, all IOAM nodes should 350 interpret the generic data the same way. Examples for generic 351 IOAM data include geo-location information (location of the node 352 at the time the packet was processed), buffer queue fill level or 353 cache fill level at the time the packet was processed, or even a 354 battery charge level. 356 o A mechanism to detect whether IOAM trace data was added at every 357 hop or whether certain hops in the domain weren't in-situ OAM 358 transit nodes. 360 The "node data list" array in the packet is populated iteratively as 361 the packet traverses the network, starting with the last entry of the 362 array, i.e., "node data list [n]" is the first entry to be populated, 363 "node data list [n-1]" is the second one, etc. 365 4.1.1. Pre-allocated and Incremental Trace Options 367 The in-situ OAM pre-allocated trace option and the in-situ OAM 368 incremental trace option have similar formats. Except where noted 369 below, the internal formats and fields of the two trace options are 370 identical. 372 Pre-allocated and incremental trace option headers: 374 0 1 2 3 375 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 376 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 377 | IOAM-Trace-Type | NodeLen | Flags |RemainingLen | 378 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 380 The trace option data MUST be 4-octet aligned: 382 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ 383 | | | 384 | node data list [0] | | 385 | | | 386 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ D 387 | | a 388 | node data list [1] | t 389 | | a 390 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 391 ~ ... 
~ S 392 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ p 393 | | a 394 | node data list [n-1] | c 395 | | e 396 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 397 | | | 398 | node data list [n] | | 399 | | | 400 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ 402 IOAM-Trace-Type: A 16-bit identifier which specifies which data 403 types are used in this node data list. 405 The IOAM-Trace-Type value is a bit field. The following bit 406 fields are defined in this document, with details on each field 407 described in the Section 4.1.2. The order of packing the data 408 fields in each node data element follows the bit order of the 409 IOAM-Trace-Type field, as follows: 411 Bit 0 (Most significant bit) When set indicates presence of 412 Hop_Lim and node_id in the node data. 414 Bit 1 When set indicates presence of ingress_if_id and 415 egress_if_id (short format) in the node data. 417 Bit 2 When set indicates presence of timestamp seconds in the 418 node data. 420 Bit 3 When set indicates presence of timestamp subseconds in 421 the node data. 423 Bit 4 When set indicates presence of transit delay in the node 424 data. 426 Bit 5 When set indicates presence of app_data (short format) in 427 the node data. 429 Bit 6 When set indicates presence of queue depth in the node 430 data. 432 Bit 7 When set indicates presence of variable length Opaque 433 State Snapshot field. 435 Bit 8 When set indicates presence of Hop_Lim and node_id in 436 wide format in the node data. 438 Bit 9 When set indicates presence of ingress_if_id and 439 egress_if_id in wide format in the node data. 441 Bit 10 When set indicates presence of app_data wide in the node 442 data. 444 Bit 11 When set indicates presence of the Checksum Complement 445 node data. 447 Bit 12-15 Undefined. An IOAM encapsulating node must set the 448 value of each of these bits to 0. 
If an IOAM transit 449 node receives a packet with one or more of these bits set 450 to 1, it must either: 452 1. Add corresponding node data filled with the reserved 453 value 0xFFFFFFFF, after the node data fields for the 454 IOAM-Trace-Type bits defined above, such that the 455 total node data added by this node in units of 456 4-octets is equal to NodeLen, or 458 2. Not add any node data fields to the packet, even for 459 the IOAM-Trace-Type bits defined above. 461 Section 4.1.2 describes the IOAM data types and their formats. 462 Within an in-situ OAM domain, the permitted combinations of these bits 463 making up the IOAM-Trace-Type can be restricted by configuration 464 knobs. 466 NodeLen: 5-bit unsigned integer. This field specifies the length of 467 data added by each node in multiples of 4-octets, excluding the 468 length of the "Opaque State Snapshot" field. 470 If IOAM-Trace-Type bit 7 is not set, then NodeLen specifies the 471 actual length added by each node. If IOAM-Trace-Type bit 7 is 472 set, then the actual length added by a node would be (NodeLen + 473 Opaque Data Length). 475 For example, if 3 IOAM-Trace-Type bits are set and none of them 476 are wide, then NodeLen would be 3. If 3 IOAM-Trace-Type bits are 477 set and 2 of them are wide, then NodeLen would be 5. 479 An IOAM encapsulating node must set NodeLen. 481 A node receiving an IOAM Pre-allocated or Incremental Trace Option 482 may rely on the NodeLen value, or it may ignore the NodeLen value 483 and calculate the node length from the IOAM-Trace-Type bits. 485 Flags: 4-bit field. The following flags are defined: 487 Bit 0 "Overflow" (O-bit) (most significant bit). If there are 488 not enough octets left to record node data, the network element 489 adds no field and must set the overflow "O-bit" to "1" in the 490 header. This allows subsequent transit nodes to skip further 491 processing of the option. 493 Bit 1 "Loopback" (L-bit).
Loopback mode is used to send a copy 494 of a packet back towards the source. Loopback mode assumes 495 that a return path from transit nodes and destination nodes 496 towards the source exists. The encapsulating node decides 497 (e.g. using a filter) which packets loopback mode is enabled 498 for by setting the loopback bit. The encapsulating node also 499 needs to ensure that sufficient space is available in the IOAM 500 header for loopback operation. The loopback bit when set 501 indicates to the transit nodes processing this option to create 502 a copy of the packet received and send this copy of the packet 503 back to the source of the packet while it continues to forward 504 the original packet towards the destination. The source 505 address of the original packet is used as destination address 506 in the copied packet. The address of the node performing the 507 copy operation is used as the source address. The L-bit MUST 508 be cleared in the copy of the packet that a node sends back 509 towards the source. On its way back towards the source, the 510 packet is processed like a regular packet with IOAM 511 information. Once the return packet reaches the IOAM domain 512 boundary IOAM decapsulation occurs as with any other packet 513 containing IOAM information. 515 Bit 2-3 Reserved: Must be zero. 517 RemainingLen: 7-bit unsigned integer. This field specifies the data 518 space in multiples of 4-octets remaining for recording the node 519 data, before the node data list is considered to have overflowed. 520 When RemainingLen reaches 0, nodes are no longer allowed to add 521 node data. Given that the sender knows the minimum path MTU, the 522 sender MAY set the initial value of RemainingLen according to the 523 number of node data bytes allowed before exceeding the MTU. 
524 Subsequent nodes can carry out a simple comparison between 525 RemainingLen and NodeLen, along with the length of the "Opaque 526 State Snapshot" if applicable, to determine whether or not data 527 can be added by this node. When node data is added, the node MUST 528 decrease RemainingLen by the amount of data added. In the pre- 529 allocated trace option, this is used as an offset in data space to 530 record the node data element. 532 Node data List [n]: Variable-length field, the type of which is 533 determined by the IOAM-Trace-Type bit representing the n-th node 534 data in the node data list. The node data list is encoded 535 starting from the last node data of the path. The first element 536 of the node data list (node data list [0]) contains the last node 537 data of the path while the last node data of the node data list (node 538 data list[n]) contains the first node data of the path traced. In 539 the pre-allocated trace option, the index contained in 540 RemainingLen identifies the offset for the currently active node data to 541 be populated. 543 4.1.2. IOAM node data fields and associated formats 545 All the data fields MUST be 4-octet aligned. If a node which is 546 supposed to update an IOAM data field is not capable of populating 547 the value of a field set in the IOAM-Trace-Type, the field value MUST 548 be set to 0xFFFFFFFF for 4-octet fields or 0xFFFFFFFFFFFFFFFF for 549 8-octet fields, indicating that the value is not populated, except 550 when explicitly specified in the field description below. 552 The data fields and the associated data type for each field are 553 shown below: 555 Hop_Lim and node_id: 4-octet field defined as follows: 557 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 558 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 559 | Hop_Lim | node_id | 560 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 561 Hop_Lim: 1-octet unsigned integer.
It is set to the Hop Limit 562 value in the packet at the node that records this data. Hop 563 Limit information is used to identify the location of the node 564 in the communication path. This is copied from the lower 565 layer, e.g., TTL value in IPv4 header or hop limit field from 566 IPv6 header of the packet when the packet is ready for 567 transmission. The semantics of the Hop_Lim field depend on the 568 lower layer protocol that IOAM is encapsulated over, and 569 therefore its specific semantics are outside the scope of this 570 memo. 572 node_id: 3-octet unsigned integer. Node identifier field to 573 uniquely identify a node within in-situ OAM domain. The 574 procedure to allocate, manage and map the node_ids is beyond 575 the scope of this document. 577 ingress_if_id and egress_if_id: 4-octet field defined as follows: 579 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 580 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 581 | ingress_if_id | egress_if_id | 582 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 584 ingress_if_id: 2-octet unsigned integer. Interface identifier to 585 record the ingress interface the packet was received on. 587 egress_if_id: 2-octet unsigned integer. Interface identifier to 588 record the egress interface the packet is forwarded out of. 590 timestamp seconds: 4-octet unsigned integer. Absolute timestamp in 591 seconds that specifies the time at which the packet was received 592 by the node. This field has three possible formats; based on 593 either PTP [IEEE1588v2], NTP [RFC5905], or POSIX [POSIX]. The 594 three timestamp formats are specified in Section 5. In all three 595 cases, the Timestamp Seconds field contains the 32 most 596 significant bits of the timestamp format that is specified in 597 Section 5. If a node is not capable of populating this field, it 598 assigns the value 0xFFFFFFFF. 
Note that this is a legitimate 599 value that is valid for 1 second in approximately 136 years; the 600 analyzer should correlate several packets or compare the timestamp 601 value to its own time-of-day in order to detect the error 602 indication. 604 timestamp subseconds: 4-octet unsigned integer. Absolute timestamp 605 in subseconds that specifies the time at which the packet was 606 received by the node. This field has three possible formats; 607 based on either PTP [IEEE1588v2], NTP [RFC5905], or POSIX [POSIX]. 608 The three timestamp formats are specified in Section 5. In all 609 three cases, the Timestamp Subseconds field contains the 32 least 610 significant bits of the timestamp format that is specified in 611 Section 5. If a node is not capable of populating this field, it 612 assigns the value 0xFFFFFFFF. Note that this is a legitimate 613 value in the NTP format, valid for approximately 233 picoseconds 614 in every second. If the NTP format is used, the analyzer should 615 correlate several packets in order to detect the error indication. 617 transit delay: 4-octet unsigned integer in the range 0 to 2^31-1. 618 It is the time in nanoseconds the packet spent in the transit 619 node. This can serve as an indication of the queuing delay at the 620 node. If the transit delay exceeds 2^31-1 nanoseconds then the 621 top bit 'O' is set to indicate overflow and the value is set to 622 0x80000000. When this field is part of the node data but the node 623 populating it is not able to fill it, the field 624 must be filled with the value 0xFFFFFFFF to mean not 625 populated. 627 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 628 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 629 |O| transit delay | 630 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 632 app_data: 4-octet placeholder which can be used by the node to add 633 application specific data.
      App_data represents a "free-format" 4-octet bit field with its
      semantics defined by a specific deployment.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           app_data                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   queue depth: 4-octet unsigned integer field. This field indicates
      the current length of the egress interface queue of the interface
      from where the packet is forwarded out. The queue depth is
      expressed as the current number of memory buffers used by the
      queue (a packet may consume one or more memory buffers, depending
      on its size).

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          queue depth                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Opaque State Snapshot: Variable length field. It allows the network
      element to store an arbitrary state in the node data field,
      without a pre-defined schema. The schema needs to be made known
      to the analyzer by some out-of-band mechanism. The specification
      of this mechanism is beyond the scope of this document. The
      24-bit "Schema ID" field indicates which particular schema is
      used, and should be configured on the network element by the
      operator.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Length     |                   Schema ID                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                                                               |
   |                          Opaque data                          |
   ~                                                               ~
   .                                                               .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Length: 1-octet unsigned integer. It is the length, in multiples
         of 4 octets, of the Opaque data field that follows Schema ID.

      Schema ID: 3-octet unsigned integer identifying the schema of
         Opaque data.

      Opaque data: Variable length field. This field is interpreted as
         specified by the schema identified by the Schema ID.

      When this field is part of the data field but a node populating
      the field has no opaque state data to report, the Length must be
      set to 0 and the Schema ID must be set to 0xFFFFFF to mean no
      schema.

   Hop_Lim and node_id wide: 8-octet field defined as follows:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Hop_Lim    |                    node_id                    ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                        node_id (contd)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Hop_Lim: 1-octet unsigned integer. It is set to the Hop Limit
         value in the packet at the node that records this data. Hop
         Limit information is used to identify the location of the node
         in the communication path. This is copied from the lower
         layer, e.g., TTL value in IPv4 header or hop limit field from
         IPv6 header of the packet. The semantics of the Hop_Lim field
         depend on the lower layer protocol that IOAM is encapsulated
         over, and therefore its specific semantics are outside the
         scope of this memo.

      node_id: 7-octet unsigned integer. Node identifier field to
         uniquely identify a node within the in-situ OAM domain. The
         procedure to allocate, manage and map the node_ids is beyond
         the scope of this document.
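
   As an illustration only, the 8-octet "Hop_Lim and node_id wide"
   field could be packed and unpacked as sketched below. The function
   names are hypothetical, and big-endian (network) byte order on the
   wire is an assumption of this sketch, not a statement made by the
   field definition above.

```python
import struct
from typing import Tuple

NODE_ID_WIDE_MAX = (1 << 56) - 1  # largest value that fits in 7 octets


def pack_hop_lim_node_id_wide(hop_lim: int, node_id: int) -> bytes:
    """Return the 8-octet field: Hop_Lim first, then the 7-octet node_id."""
    if not 0 <= hop_lim <= 0xFF:
        raise ValueError("Hop_Lim must fit in 1 octet")
    if not 0 <= node_id <= NODE_ID_WIDE_MAX:
        raise ValueError("node_id must fit in 7 octets")
    # Assumption: big-endian (network byte order) on the wire.
    return struct.pack("!Q", (hop_lim << 56) | node_id)


def unpack_hop_lim_node_id_wide(field: bytes) -> Tuple[int, int]:
    """Split an 8-octet field back into (Hop_Lim, node_id)."""
    if len(field) != 8:
        raise ValueError("the wide field is exactly 8 octets")
    (value,) = struct.unpack("!Q", field)
    return value >> 56, value & NODE_ID_WIDE_MAX
```

   Packing Hop_Lim 64 with node_id 0x01020304050607 yields the octets
   40 01 02 03 04 05 06 07 (hexadecimal); unpacking returns the same
   pair.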

   ingress_if_id and egress_if_id wide: 8-octet field defined as
      follows:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         ingress_if_id                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         egress_if_id                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      ingress_if_id: 4-octet unsigned integer. Interface identifier to
         record the ingress interface the packet was received on.

      egress_if_id: 4-octet unsigned integer. Interface identifier to
         record the egress interface the packet is forwarded out of.

   app_data wide: 8-octet placeholder which can be used by the node to
      add application specific data. App data represents a
      "free-format" 8-octet bit field with its semantics defined by a
      specific deployment.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           app data                            ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                       app data (contd)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Checksum Complement: 4-octet node data which contains a 2-octet
      Checksum Complement field and a 2-octet reserved field. The
      Checksum Complement is useful when IOAM is transported over
      encapsulations that make use of a UDP transport, such as VXLAN-GPE
      or Geneve. Without the Checksum Complement, nodes adding IOAM
      node data must update the UDP Checksum field. When the Checksum
      Complement is present, an IOAM encapsulating node or IOAM transit
      node adding node data MUST carry out one of the following two
      alternatives in order to maintain the correctness of the UDP
      Checksum value:

      1. Recompute the UDP Checksum field.

      2. Use the Checksum Complement to make a checksum-neutral update
         in the UDP payload; the Checksum Complement is assigned a
         value that complements the rest of the node data fields that
         were added by the current node, causing the existing UDP
         Checksum field to remain correct.

      IOAM decapsulating nodes MUST recompute the UDP Checksum field,
      since they do not know whether previous hops modified the UDP
      Checksum field or the Checksum Complement field.

      Checksum Complement fields are used in a similar manner in
      [RFC7820] and [RFC7821].

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Checksum Complement      |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.1.3.  Examples of IOAM node data

   An entry in the "node data list" array can have different formats,
   following the needs of the deployment. Some deployments might only
   be interested in recording the node identifiers, whereas others might
   be interested in recording node identifier and timestamp. This
   section defines different types that an entry in "node data list" can
   take.
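
   The relationship between an IOAM-Trace-Type value and the resulting
   node data layout can be sketched as follows. This is a hedged
   illustration, not part of the specification: the table FIELDS and
   both function names are hypothetical, and the bit assignment (bit 0
   is the most significant bit, with fields assigned in the order in
   which they are described in this document) is an assumption that
   happens to agree with the example values in this section. The
   authoritative bit definitions are those of Section 4.1.1.

```python
# (field name, size in octets); None marks the variable-length field.
# Assumption: list index == bit number, bit 0 being the most significant.
FIELDS = [
    ("Hop_Lim and node_id", 4),
    ("ingress_if_id and egress_if_id", 4),
    ("timestamp seconds", 4),
    ("timestamp subseconds", 4),
    ("transit delay", 4),
    ("app_data", 4),
    ("queue depth", 4),
    ("Opaque State Snapshot", None),
    ("Hop_Lim and node_id wide", 8),
    ("ingress_if_id and egress_if_id wide", 8),
    ("app_data wide", 8),
    ("Checksum Complement", 4),
]


def node_data_fields(trace_type):
    """List the fields selected by a 16-bit trace type, in packing order."""
    return [name for bit, (name, _) in enumerate(FIELDS)
            if trace_type & (0x8000 >> bit)]


def min_node_data_length(trace_type):
    """Octets per node, counting only the 4-octet Length/Schema ID header
    of the Opaque State Snapshot (its Opaque data may add more)."""
    total = 0
    for bit, (_, size) in enumerate(FIELDS):
        if trace_type & (0x8000 >> bit):
            total += 4 if size is None else size
    return total
```

   Under these assumptions, 0xD400 selects Hop_Lim and node_id, the
   interface identifiers, timestamp subseconds and app_data, for 16
   octets of node data per node.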

   0xD400: When the IOAM-Trace-Type is 0xD400, the format of node data
      is:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Hop_Lim    |                    node_id                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         ingress_if_id         |         egress_if_id          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     timestamp subseconds                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           app_data                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   0xC000: When the IOAM-Trace-Type is 0xC000, the format is:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Hop_Lim    |                    node_id                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         ingress_if_id         |         egress_if_id          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   0x9000: When the IOAM-Trace-Type is 0x9000, the format is:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Hop_Lim    |                    node_id                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     timestamp subseconds                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   0x8400: When the IOAM-Trace-Type is 0x8400, the format is:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Hop_Lim    |                    node_id                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           app_data                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   0x9400: When the IOAM-Trace-Type is 0x9400, the format is:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Hop_Lim    |                    node_id                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     timestamp subseconds                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           app_data                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   0x3180: When the IOAM-Trace-Type is 0x3180, the format is:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       timestamp seconds                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     timestamp subseconds                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Length     |                   Schema ID                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                                                               |
   |                          Opaque data                          |
   ~                                                               ~
   .                                                               .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Hop_Lim    |                    node_id                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        node_id(contd)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.2.  IOAM Proof of Transit Option

   IOAM Proof of Transit data supports the path or service function
   chain [RFC7665] verification use cases. Proof-of-transit uses
   methods like nested hashing or nested encryption of the IOAM data or
   mechanisms such as Shamir's Secret Sharing Scheme (SSSS). While
   details on how the IOAM data for the proof of transit option is
   processed at IOAM encapsulating, decapsulating and transit nodes are
   outside the scope of this document, all of these approaches share
   the need to uniquely identify a packet as well as iteratively operate
   on a set of information that is handed from node to node.
   Correspondingly, two pieces of information are added as IOAM data to
   the packet:

   o  Random: Unique identifier for the packet (e.g., 64 bits allow for
      the unique identification of 2^64 packets).

   o  Cumulative: Information which is handed from node to node and
      updated by every node according to a verification algorithm.

   IOAM proof of transit option:

   IOAM proof of transit option header:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |IOAM POT Type  | IOAM POT flags|           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IOAM proof of transit option data MUST be 4-octet aligned:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       POT Option data field determined by IOAM-POT-Type       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IOAM POT Type: 8-bit identifier of a particular POT variant that
      specifies the POT data that is included. This document defines
      POT Type 0:

      0: POT data is a 16-octet field as described below.

   IOAM POT flags: 8-bit field. The following flags are defined:

      Bit 0 "Profile-to-use" (P-bit) (most significant bit). For IOAM
         POT types that use a maximum of two profiles to drive
         computation, indicates which POT-profile is used. The two
         profiles are numbered 0, 1.

      Bit 1-7 Reserved: MUST be set to zero upon transmission and
         ignored upon receipt.

   Reserved: 16-bit Reserved bits are present for future use. The
      reserved bits MUST be set to zero upon transmission and ignored
      upon receipt.

   POT Option data: Variable-length field, the type of which is
      determined by the IOAM-POT-Type.

4.2.1.  IOAM Proof of Transit Type 0

   IOAM proof of transit option of IOAM POT Type 0:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |IOAM POT Type=0|P|R R R R R R R|           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+
   |                            Random                             |  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  P
   |                         Random(contd)                         |  O
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  T
   |                          Cumulative                           |  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  |
   |                      Cumulative (contd)                       |  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+

   IOAM POT Type: 8-bit identifier of a particular POT variant that
      specifies the POT data that is included. This section defines the
      POT data when the IOAM POT Type is set to the value 0.

   P bit: 1-bit. "Profile-to-use" (P-bit) (most significant bit).
      Indicates which POT-profile is used to generate the Cumulative.
      Any node participating in POT will have a maximum of 2 profiles
      configured that drive the computation of the Cumulative. The two
      profiles are numbered 0, 1. This bit conveys whether profile 0 or
      profile 1 is used to compute the Cumulative.

   R (7 bits): 7-bit IOAM POT flags for future use. MUST be set to
      zero upon transmission and ignored upon receipt.

   Reserved: 16-bit Reserved bits are present for future use. The
      reserved bits MUST be set to zero upon transmission and ignored
      upon receipt.

   Random: 64-bit per-packet random number.

   Cumulative: 64-bit Cumulative that is updated at specific nodes by
      processing the per-packet Random number field and configured
      parameters.

   Note: Larger or smaller sizes of "Random" and "Cumulative" data are
   feasible and could be required for certain deployments (e.g., in
   case of space constraints in the transport protocol used).
   Future versions of this document will address different sizes of
   data for "proof of transit".

4.3.  IOAM Edge-to-Edge Option

   The IOAM edge-to-edge option carries data that is added by the IOAM
   encapsulating node and interpreted by the IOAM decapsulating node.
   The IOAM transit nodes MAY process the data without modifying it.

   IOAM edge-to-edge option:

   IOAM edge-to-edge option header:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         IOAM-E2E-Type         |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IOAM edge-to-edge option data MUST be 4-octet aligned:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       E2E Option data field determined by IOAM-E2E-Type       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IOAM-E2E-Type: A 16-bit identifier which specifies which data types
      are used in the E2E option data. The IOAM-E2E-Type value is a bit
      field. The order of packing the E2E option data field elements
      follows the bit order of the IOAM-E2E-Type field, as follows:

      Bit 0    (Most significant bit) When set indicates presence of a
               64-bit sequence number added to a specific tube which is
               used to detect packet loss, packet reordering, or packet
               duplication for that tube. Each tube leverages a
               dedicated namespace for its sequence numbers.

      Bit 1    When set indicates presence of a 32-bit sequence number
               added to a specific tube which is used to detect packet
               loss, packet reordering, or packet duplication for that
               tube. Each tube leverages a dedicated namespace for its
               sequence numbers.

      Bit 2    When set indicates presence of timestamp seconds for the
               transmission of the frame.
               This 4-octet field has three possible formats, based on
               either PTP [IEEE1588v2], NTP [RFC5905], or POSIX
               [POSIX]. The three timestamp formats are specified in
               Section 5. In all three cases, the Timestamp Seconds
               field contains the 32 most significant bits of the
               timestamp format that is specified in Section 5. If a
               node is not capable of populating this field, it assigns
               the value 0xFFFFFFFF. Note that this is a legitimate
               value that is valid for 1 second in approximately 136
               years; the analyzer should correlate several packets or
               compare the timestamp value to its own time-of-day in
               order to detect the error indication.

      Bit 3    When set indicates presence of timestamp subseconds for
               the transmission of the frame. This 4-octet field has
               three possible formats, based on either PTP
               [IEEE1588v2], NTP [RFC5905], or POSIX [POSIX]. The three
               timestamp formats are specified in Section 5. In all
               three cases, the Timestamp Subseconds field contains the
               32 least significant bits of the timestamp format that
               is specified in Section 5. If a node is not capable of
               populating this field, it assigns the value 0xFFFFFFFF.
               Note that this is a legitimate value in the NTP format,
               valid for approximately 233 picoseconds in every second.
               If the NTP format is used, the analyzer should correlate
               several packets in order to detect the error indication.

      Bit 4-15 Undefined. An IOAM encapsulating node MUST set the value
               of these bits to zero upon transmission, and the bits
               MUST be ignored upon receipt.

   Reserved: 16-bit Reserved bits are present for future use. The
      reserved bits MUST be set to zero upon transmission and ignored
      upon receipt.

   E2E Option data: Variable-length field, the type of which is
      determined by the IOAM-E2E-Type.

5.  Timestamp Formats

   The IOAM data fields include a timestamp field which is represented
   in one of three possible timestamp formats. It is assumed that the
   management plane is responsible for determining which timestamp
   format is used.

5.1.  PTP Truncated Timestamp Format

   The Precision Time Protocol (PTP) [IEEE1588v2] uses an 80-bit
   timestamp format. The truncated timestamp format is a 64-bit field,
   which is the 64 least significant bits of the 80-bit PTP timestamp.
   The PTP truncated format is specified in Section 4.3 of
   [I-D.ietf-ntp-packet-timestamps], and the details are presented below
   for the sake of completeness.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Seconds                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Nanoseconds                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 1: PTP [IEEE1588v2] Truncated Timestamp Format

   Timestamp field format:

      Seconds: specifies the integer portion of the number of seconds
      since the epoch.

      + Size: 32 bits.

      + Units: seconds.

      Nanoseconds: specifies the fractional portion of the number of
      seconds since the epoch.

      + Size: 32 bits.

      + Units: nanoseconds. The value of this field is in the range 0
        to (10^9)-1.

   Epoch:

      The PTP [IEEE1588v2] epoch is 1 January 1970 00:00:00 TAI, which
      is 31 December 1969 23:59:51.999918 UTC.

   Resolution:

      The resolution is 1 nanosecond.

   Wraparound:

      This time format wraps around every 2^32 seconds, which is roughly
      136 years. The next wraparound will occur in the year 2106.

   Synchronization Aspects:

      It is assumed that nodes that run this protocol are synchronized
      among themselves. Nodes may be synchronized to a global reference
      time.
      Note that if PTP [IEEE1588v2] is used for synchronization, the
      timestamp may be derived from the PTP-synchronized clock,
      allowing the timestamp to be measured with respect to the clock
      of a PTP Grandmaster clock.

      The PTP truncated timestamp format is not affected by leap
      seconds.

5.2.  NTP 64-bit Timestamp Format

   The Network Time Protocol (NTP) [RFC5905] timestamp format is 64 bits
   long. This format is specified in Section 4.2.1 of
   [I-D.ietf-ntp-packet-timestamps], and the details are presented below
   for the sake of completeness.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Seconds                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Fraction                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 2: NTP [RFC5905] 64-bit Timestamp Format

   Timestamp field format:

      Seconds: specifies the integer portion of the number of seconds
      since the epoch.

      + Size: 32 bits.

      + Units: seconds.

      Fraction: specifies the fractional portion of the number of
      seconds since the epoch.

      + Size: 32 bits.

      + Units: the unit is 2^(-32) seconds, which is roughly equal to
        233 picoseconds.

   Epoch:

      The epoch is 1 January 1900 at 00:00 UTC.

   Resolution:

      The resolution is 2^(-32) seconds.

   Wraparound:

      This time format wraps around every 2^32 seconds, which is roughly
      136 years. The next wraparound will occur in the year 2036.

   Synchronization Aspects:

      Nodes that use this timestamp format will typically be
      synchronized to UTC using NTP [RFC5905]. Thus, the timestamp may
      be derived from the NTP-synchronized clock, allowing the timestamp
      to be measured with respect to the clock of an NTP server.
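
   As a hedged illustration of the NTP 64-bit format, the sketch below
   derives the Seconds and Fraction fields from a POSIX time expressed
   in nanoseconds. The function name and the nanosecond input are
   assumptions of this example; the constant 2208988800 is the number
   of seconds between the NTP epoch (1 January 1900) and 1 January
   1970.

```python
NTP_UNIX_EPOCH_OFFSET = 2208988800  # seconds from 1 Jan 1900 to 1 Jan 1970
NANOS_PER_SECOND = 1_000_000_000


def unix_ns_to_ntp_fields(unix_ns):
    """Return (Seconds, Fraction) for a non-negative POSIX time in ns."""
    seconds, nanos = divmod(unix_ns, NANOS_PER_SECOND)
    # The Seconds field keeps only 32 bits, so it wraps every 2^32 seconds.
    ntp_seconds = (seconds + NTP_UNIX_EPOCH_OFFSET) & 0xFFFFFFFF
    # Fraction is expressed in units of 2^(-32) seconds.
    fraction = (nanos << 32) // NANOS_PER_SECOND
    return ntp_seconds, fraction
```

   For example, half a second past a whole second maps to a Fraction
   of 0x80000000, and the Seconds field returns to 0 at the
   wraparound in 2036.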

      The NTP timestamp format is affected by leap seconds; it
      represents the number of seconds since the epoch minus the number
      of leap seconds that have occurred since the epoch. The value of
      a timestamp during or slightly after a leap second may be
      temporarily inaccurate.

5.3.  POSIX-based Timestamp Format

   This timestamp format is based on the POSIX time format [POSIX]. The
   detailed specification of the timestamp format used in this document
   is presented below.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Seconds                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Microseconds                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 3: POSIX-based Timestamp Format

   Timestamp field format:

      Seconds: specifies the integer portion of the number of seconds
      since the epoch.

      + Size: 32 bits.

      + Units: seconds.

      Microseconds: specifies the fractional portion of the number of
      seconds since the epoch.

      + Size: 32 bits.

      + Units: the unit is microseconds. The value of this field is in
        the range 0 to (10^6)-1.

   Epoch:

      The epoch is 1 January 1970 00:00:00 TAI, which is 31 December
      1969 23:59:51.999918 UTC.

   Resolution:

      The resolution is 1 microsecond.

   Wraparound:

      This time format wraps around every 2^32 seconds, which is roughly
      136 years. The next wraparound will occur in the year 2106.

   Synchronization Aspects:

      It is assumed that nodes that use this timestamp format run the
      Linux operating system, and hence use the POSIX time. In some
      cases nodes may be synchronized to UTC using a synchronization
      mechanism that is outside the scope of this document, such as NTP
      [RFC5905].
      Thus, the timestamp may be derived from the NTP-synchronized
      clock, allowing the timestamp to be measured with respect to the
      clock of an NTP server.

      The POSIX-based timestamp format is affected by leap seconds; it
      represents the number of seconds since the epoch minus the number
      of leap seconds that have occurred since the epoch. The value of
      a timestamp during or slightly after a leap second may be
      temporarily inaccurate.

6.  IOAM Data Export

   IOAM nodes collect information for packets traversing a domain that
   supports IOAM. IOAM decapsulating nodes as well as IOAM transit
   nodes can choose to retrieve IOAM information from the packet,
   process the information further and export the information using,
   e.g., IPFIX.

   The discussion of IOAM data processing and export is left for a
   future version of this document.

7.  IANA Considerations

   This document requests the following IANA Actions.

7.1.  Creation of a New In-Situ OAM (IOAM) Protocol Parameters IANA
      Registry

   IANA is requested to create a new protocol registry for "In-Situ OAM
   (IOAM) Protocol Parameters". This is the common registry that will
   include registrations for all IOAM namespaces. Each of the
   registries listed below:

      IOAM Type

      IOAM Trace Type

      IOAM Trace flags

      IOAM POT Type

      IOAM POT flags

      IOAM E2E Type

   will contain the current set of possibilities defined in this
   document. New registries in this name space are created via the RFC
   Required process as per [RFC8126].

   The subsequent sub-sections detail the registries herein contained.

7.2.  IOAM Type Registry

   This registry defines 128 code points for the IOAM-Type field for
   identifying IOAM options as explained in Section 4.
   The following code points are defined in this draft:

   0  IOAM Pre-allocated Trace Option Type

   1  IOAM Incremental Trace Option Type

   2  IOAM POT Option Type

   3  IOAM E2E Option Type

   4 - 127 are available for assignment via the RFC Required process as
   per [RFC8126].

7.3.  IOAM Trace Type Registry

   This registry defines a code point for each bit in the 16-bit
   IOAM-Trace-Type field for the Pre-allocated trace option and the
   Incremental trace option defined in Section 4.1. The meanings of
   Bits 0 - 11 for the trace type are defined in this document in
   Paragraph 1 of Section 4.1.1. Bits 12 - 15 are available for
   assignment via the RFC Required process as per [RFC8126].

7.4.  IOAM Trace Flags Registry

   This registry defines a code point for each bit in the 4-bit flags
   for the Pre-allocated trace option and the Incremental trace option
   defined in Section 4.1. The meanings of Bits 0 - 1 for the trace
   flags are defined in this document in Paragraph 5 of Section 4.1.1.
   Bits 2 - 3 are available for assignment via the RFC Required process
   as per [RFC8126].

7.5.  IOAM POT Type Registry

   This registry defines 256 code points to define the IOAM POT Type for
   the IOAM proof of transit option (Section 4.2). The code point value
   0 is defined in this document; 1 - 255 are available for assignment
   via the RFC Required process as per [RFC8126].

7.6.  IOAM POT Flags Registry

   This registry defines a code point for each bit in the 8-bit flags
   for the IOAM POT option defined in Section 4.2. The meaning of Bit 0
   for the IOAM POT flags is defined in this document in Section 4.2.
   Bits 1 - 7 are available for assignment via the RFC Required process
   as per [RFC8126].

7.7.  IOAM E2E Type Registry

   This registry defines a code point for each bit in the 16-bit
   IOAM-E2E-Type field for the IOAM E2E option (Section 4.3).
   The meanings of Bits 0 - 3 are defined in this document. Bits 4 - 15
   are available for assignment via the RFC Required process as per
   [RFC8126].

8.  Manageability Considerations

   Manageability considerations will be addressed in a later version of
   this document.

9.  Security Considerations

   Security considerations will be addressed in a later version of this
   document.

10.  Acknowledgements

   The authors would like to thank Eric Vyncke, Nalini Elkins, Srihari
   Raghavan, Ranganathan T S, Karthik Babu Harichandra Babu, Akshaya
   Nadahalli, LJ Wobker, Erik Nordmark, Vengada Prasad Govindan, and
   Andrew Yourtchenko for their comments and advice.

   This document leverages and builds on top of several concepts
   described in [I-D.kitamura-ipv6-record-route]. The authors would
   like to acknowledge the work done by the author Hiroshi Kitamura and
   people involved in writing it.

   The authors would like to gratefully acknowledge the useful review
   and insightful comments received from Joe Clarke, Al Morton, and
   Mickey Spiegel.

11.  References

11.1.  Normative References

   [IEEE1588v2]
              Institute of Electrical and Electronics Engineers, "IEEE
              Std 1588-2008 - IEEE Standard for a Precision Clock
              Synchronization Protocol for Networked Measurement and
              Control Systems", IEEE Std 1588-2008, 2008.

   [POSIX]    Institute of Electrical and Electronics Engineers, "IEEE
              Std 1003.1-2008 (Revision of IEEE Std 1003.1-2004) - IEEE
              Standard for Information Technology - Portable Operating
              System Interface (POSIX(R))", IEEE Std 1003.1-2008, 2008.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5905]  Mills, D., Martin, J., Ed., Burbank, J., and W.
              Kasch, "Network Time Protocol Version 4: Protocol and
              Algorithms Specification", RFC 5905,
              DOI 10.17487/RFC5905, June 2010.

   [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017.

11.2.  Informative References

   [I-D.hildebrand-spud-prototype]
              Hildebrand, J. and B. Trammell, "Substrate Protocol for
              User Datagrams (SPUD) Prototype", draft-hildebrand-spud-
              prototype-03 (work in progress), March 2015.

   [I-D.ietf-ntp-packet-timestamps]
              Mizrahi, T., Fabini, J., and A. Morton, "Guidelines for
              Defining Packet Timestamps", draft-ietf-ntp-packet-
              timestamps-00 (work in progress), October 2017.

   [I-D.ietf-nvo3-geneve]
              Gross, J., Ganga, I., and T. Sridhar, "Geneve: Generic
              Network Virtualization Encapsulation", draft-ietf-
              nvo3-geneve-05 (work in progress), September 2017.

   [I-D.ietf-nvo3-vxlan-gpe]
              Maino, F., Kreeger, L., and U. Elzur, "Generic Protocol
              Extension for VXLAN", draft-ietf-nvo3-vxlan-gpe-05 (work
              in progress), October 2017.

   [I-D.ietf-sfc-nsh]
              Quinn, P., Elzur, U., and C. Pignataro, "Network Service
              Header (NSH)", draft-ietf-sfc-nsh-28 (work in progress),
              November 2017.

   [I-D.kitamura-ipv6-record-route]
              Kitamura, H., "Record Route for IPv6 (PR6) Hop-by-Hop
              Option Extension", draft-kitamura-ipv6-record-route-00
              (work in progress), November 2000.

   [I-D.lapukhov-dataplane-probe]
              Lapukhov, P. and r. remy@barefootnetworks.com, "Data-plane
              probe for in-band telemetry collection", draft-lapukhov-
              dataplane-probe-01 (work in progress), June 2016.

   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
              Chaining (SFC) Architecture", RFC 7665,
              DOI 10.17487/RFC7665, October 2015.

   [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
              Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
              May 2016.

   [RFC7820]  Mizrahi, T., "UDP Checksum Complement in the One-Way
              Active Measurement Protocol (OWAMP) and Two-Way Active
              Measurement Protocol (TWAMP)", RFC 7820,
              DOI 10.17487/RFC7820, March 2016.

   [RFC7821]  Mizrahi, T., "UDP Checksum Complement in the Network Time
              Protocol (NTP)", RFC 7821, DOI 10.17487/RFC7821, March
              2016.

Authors' Addresses

   Frank Brockners
   Cisco Systems, Inc.
   Hansaallee 249, 3rd Floor
   DUESSELDORF, NORDRHEIN-WESTFALEN 40549
   Germany

   Email: fbrockne@cisco.com

   Shwetha Bhandari
   Cisco Systems, Inc.
   Cessna Business Park, Sarjapura Marathalli Outer Ring Road
   Bangalore, KARNATAKA 560 087
   India

   Email: shwethab@cisco.com

   Carlos Pignataro
   Cisco Systems, Inc.
   7200-11 Kit Creek Road
   Research Triangle Park, NC 27709
   United States

   Email: cpignata@cisco.com

   Hannes Gredler
   RtBrick Inc.

   Email: hannes@rtbrick.com

   John Leddy
   Comcast
   United States

   Email: John_Leddy@cable.comcast.com

   Stephen Youell
   JP Morgan Chase
   25 Bank Street
   London E14 5JP
   United Kingdom

   Email: stephen.youell@jpmorgan.com

   Tal Mizrahi
   Marvell
   6 Hamada St.
   Yokneam 2066721
   Israel

   Email: talmi@marvell.com

   David Mozes

   Email: mosesster@gmail.com

   Petr Lapukhov
   Facebook
   1 Hacker Way
   Menlo Park, CA 94025
   US

   Email: petr@fb.com

   Remy Chang
   Barefoot Networks
   4750 Patrick Henry Drive
   Santa Clara, CA 95054
   US

   Daniel Bernier
   Bell Canada
   Canada

   Email: daniel.bernier@bell.ca

   John Lemon
   Broadcom
   270 Innovation Drive
   San Jose, CA 95134
   US

   Email: john.lemon@broadcom.com