opsawg                                                       P. Lapukhov
Internet-Draft                                                  Facebook
Intended status: Standards Track                                R.
Chang
Expires: December 12, 2016                             Barefoot Networks
                                                           June 10, 2016

           Data-plane probe for in-band telemetry collection
                   draft-lapukhov-dataplane-probe-01

Abstract

   Detecting and isolating network faults in IP networks has
   traditionally been done using tools like ping and traceroute (see
   [RFC7276]) or more complex systems built on similar concepts of
   active probing and path tracing. While active synthetic probes have
   proven helpful in detecting data-plane faults, isolating the fault
   location is a much harder problem, especially in diverse networks
   with multiple active forwarding planes (e.g. IP and MPLS). Moreover,
   existing end-to-end tools do not generally support functionality
   beyond dealing with packet loss - for example, they are hardly
   useful for detecting and reporting transient (i.e. milli- or even
   micro-second) network congestion.

   Modern network forwarding hardware can allow for more sophisticated
   data-plane functionality that provides substantial improvement to the
   isolation and identification capabilities of network elements. For
   example, it has become possible to encode a snapshot of a network
   element's state within the packet payload as it transits the device.
   One example of such state would be queue depth on the egress port
   taken by that specific packet. When combined with a unique device
   identifier embedded in the same packet, this could allow for precise
   time and topological identification of the congested location
   within the network.

   This document proposes a format for requesting and embedding
   telemetry information in active probes, i.e. packets designated for
   actively testing the network while not carrying application traffic.
   These active probes could be conveyed over multiple protocols (ICMP,
   UDP, TCP, etc.) and the document does not prescribe any particular
   transport.
   In addition, this document provides recommendations on
   handling the active probes by devices that do not support the
   required data-plane functionality.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 12, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Data plane probe
     2.1.  Probe transport
     2.2.  Probe structure
     2.3.  Header Format
     2.4.  Telemetry Data Frame and Telemetry Data Records
   3.  Telemetry Record Types
     3.1.  Device Identifier
     3.2.  Timestamp
     3.3.  Queueing Delay
     3.4.  Ingress/Egress Port IDs
     3.5.  Opaque State Snapshot
   4.  Operating in loopback mode
   5.  Processing Probe Packet
     5.1.  Detecting a probe
   6.  Non-Capable Devices
   7.  Handling data-plane probes in the MPLS domain
   8.  Multi-chip device considerations
   9.  IANA Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   Detecting and isolating faults in IP networks may involve multiple
   tools and approaches, but by far the two most popular utilities used
   by operators are ping and traceroute. The ping utility provides the
   basic end-to-end connectivity check by sending a special ICMP packet.
   There are other variants of ping that work using TCP or UDP probes,
   but these may require a special responder application (for UDP) on
   the other end of the probed connection.

   This type of active probing approach has its limitations. First, it
   operates end-to-end and thus it is impossible to tell where in the
   path the fault has happened from simply observing the packet loss
   ratios.
   Secondly, in multipath (ECMP) scenarios it can be difficult
   to fully and/or deterministically exercise all the possible paths
   connecting two end-points.

   The traceroute utility has multiple variants as well - UDP, ICMP and
   TCP based, for instance, and a special variant for MPLS LSP testing.
   Practically all variants follow the same model of operation: varying
   the TTL field setting in outgoing probes and analyzing the returned
   ICMP error messages (typically Time Exceeded). This does allow
   isolating the fault down to the IP hop that is losing packets, but
   has its own limitations. As with the ping utility, it becomes
   complicated to explore all possible ECMP paths in the network. This
   is especially problematic in large Clos fabric topologies that are
   very common in large data-center networks. Next, many network
   devices limit the rate of outgoing ICMP messages as well as the rate
   of "exception" packets "punted" to the control plane processor. This
   puts a functional limit on the packet rate at which traceroute can
   probe a given hop, and hence impacts the resolution and time to
   isolate a fault. Lastly, the treatment of these control packets is
   often different from that of packets taking the regular forwarding
   path: the latter are normally not redirected to the control plane
   processor and are handled purely in the data-plane hardware.

   Modern network processing elements (both hardware and software based)
   are capable of packet handling beyond basic forwarding and simple
   header modifications. Of special interest is the ability to capture
   and embed instantaneous state from the network element and encode
   this state directly into the transit packet. One example would be to
   record the transit device's name, ingress and egress port
   identifiers, queueing delays, timestamps and so on.
   By collecting
   this state at each network device along the path, it becomes
   straightforward to trace a probe's path through the network as well
   as record transit device characteristics. Extending this model, one
   could build a tool that combines the useful properties of ping and
   traceroute using a single packet flight through the network, without
   the constraints of control plane (aka "slow path") processing. To
   aid in the development of such tooling, this document defines a
   format for requesting and embedding telemetry information in the
   body of active probing packets.

2.  Data plane probe

   This section defines the structure of the active data-plane probe.

2.1.  Probe transport

   This document does not prescribe any specific encapsulation for the
   data-plane probe. For example, the probe could be embedded inside a
   UDP packet, or within an IPv6 extension header.

2.2.  Probe structure

   The probe consists of a fixed-size "Header" and an arbitrary number
   of "telemetry data frames" following the header. Frames are variable
   length, and each frame, in turn, consists of multiple "telemetry
   record" fields defined below in this document. The records are added
   in response to the telemetry information requested in the header.

   +---------------------------------------------------------+
   |                         Header                          |
   +---------------------------------------------------------+
   |                 Telemetry data frame N                  |
   +---------------------------------------------------------+
   |                Telemetry data frame N-1                 |
   +---------------------------------------------------------+
   .                                                         .
   .                                                         .
   .                                                         .
   +---------------------------------------------------------+
   |                 Telemetry data frame 1                  |
   +---------------------------------------------------------+

                          Figure 1: Probe layout

   Notice that the first frame is at the end of the packet.
   For
   efficient hardware implementation, new frames are pushed onto the
   stack at each hop. This eliminates the need for the transit network
   elements to inspect the full packet and allows packets to grow as
   long as the MTU permits.

2.3.  Header Format

   The probe payload starts with a fixed-size header. The header
   identifies the packet as a data-plane probe packet, and encodes basic
   information shared by all telemetry records.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Probe Marker (1)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Probe Marker (2)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Version    | Message Type  |             Flags             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Telemetry Request Vector                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Hop Limit   |   Hop Count   |         Must Be Zero          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Maximum Length         |        Current Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Sender's Handle        |        Sequence Number        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                         Figure 2: Header Format

   (1)   The "Probe Marker" fields are arbitrary 32-bit values generally
         used by the network elements to identify the packet as a probe
         packet. These fields should be interpreted as unsigned integer
         values, stored in network byte order. For example, a network
         element may be configured to recognize a UDP packet destined to
         port 31337 and having 0xDEAD 0xBEEF as the values in the "Probe
         Marker" fields as an active probe, and treat it accordingly.

   (2)   The "Version" field is currently set to 1.
   (3)   The "Message Type" field value is either "1" ("Probe") or
         "2" ("Probe Reply").

   (4)   The "Flags" field is 8 bits, and defines the following flags:

   (5)

         (1)  "Overflow" (O-bit) (least significant bit). This bit is
              set by the network element if the number of records in the
              packet is at the maximum limit as specified by the packet:
              i.e. the packet is already "full" of telemetry
              information.

   (6)   "Telemetry Request Vector" is a 32-bit field that requests
         well-known in-band telemetry information from the network
         elements on the path. A bit set in this vector translates to a
         request for a particular type of information. The following
         types/bits are currently defined, starting with the least
         significant bit:

         (1)  Bit 0: Device identifier.

         (2)  Bit 1: Timestamp.

         (3)  Bit 2: Queueing delay.

         (4)  Bit 3: Ingress/Egress port identifiers.

         (5)  Bit 31: Opaque state snapshot request.

   (7)   "Hop Limit" is defined only for "Message Type" of "1"
         ("Probe"). For "Probe Reply" the "Hop Limit" field must be set
         to zero. This field is treated as an integer value
         representing the number of network elements. See Section 4
         for the intended use of the field.

   (8)   The "Hop Count" field specifies the current number of capable
         network elements the packet has transited. It begins with zero
         and must be incremented by one for every network element that
         adds a telemetry record. Combined with a push mechanism, this
         simplifies the work for the subsequent network element and the
         packet receiver. The subsequent network element just needs to
         parse the template and then insert new record(s) immediately
         after the template.

   (9)   The "Max Length" field specifies the maximum length of the
         telemetry payload in bytes.
         Given that the sender knows the
         minimum path MTU, the sender can set the maximum number of
         payload bytes allowed before exceeding the MTU. Thus, a simple
         comparison between "Current Length" and "Max Length" allows a
         network element to decide whether or not data can be added.

   (10)  The "Current Length" field specifies the current length of data
         stored in the probe. This field is incremented by each network
         element by the number of bytes it has added with the telemetry
         data frame.

   (11)  The "Sender's Handle" field is set by the sender to allow the
         receiver to identify a particular originator of probe packets.
         Along with "Sequence Number" it allows for tracking of packet
         order and loss within the network.

2.4.  Telemetry Data Frame and Telemetry Data Records

   Each telemetry data frame is constructed by concatenating multiple
   telemetry data records, per the request in the "Telemetry Request
   Vector" field of the data-plane probe header. The frame starts with
   a 16-bit length field, which reflects the frame size in bytes,
   excluding the length of the field itself. Following the "Frame
   Length" field is a "Telemetry Response Vector" field: this vector
   corresponds to the records the network element was capable of
   recording in the frame. The body of the frame is constructed by
   appending fixed-size records corresponding to every bit set in
   "Telemetry Response Vector". All of the records, except the one
   requested by bit 31 ("Opaque State Snapshot"), are fixed size, with
   their lengths defined in Section 3. The order of the records in the
   frame follows the order of the bits in the "Telemetry Request
   Vector" (also reflected in "Telemetry Response Vector"). Finally, if
   requested, a variable-length field is appended at the end of the
   frame, with the length field occupying the first 8 bits. This
   "length" field reflects the length of the opaque data excluding the
   length field itself.
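   As a minimal sketch of the frame construction described above, the
   following Python fragment assembles and parses a frame that carries
   only fixed-size records, and applies the "Current Length" versus
   "Max Length" comparison. It is illustrative, not normative. The
   names RECORD_SIZES, build_frame, parse_frame and account_for_frame
   are hypothetical; the record sizes are those given in Section 3; the
   16 must-be-zero bits after the length field follow Figure 3; and
   counting those bits and the response vector in "Frame Length" is one
   reading of the text. The opaque state snapshot (bit 31) is omitted.

```python
import struct

# Fixed record sizes in bytes, keyed by request-vector bit (Section 3).
RECORD_SIZES = {0: 4, 1: 16, 2: 4, 3: 4}


def build_frame(request_vector, available):
    """Build one telemetry data frame.

    `available` maps a bit number to the already-encoded record bytes
    this element can supply. Requested bits it cannot supply are left
    clear in the response vector.
    """
    response_vector = 0
    body = b""
    for bit in sorted(RECORD_SIZES):  # records follow bit order
        if request_vector & (1 << bit) and bit in available:
            record = available[bit]
            if len(record) != RECORD_SIZES[bit]:
                raise ValueError(f"record for bit {bit} has wrong size")
            response_vector |= 1 << bit
            body += record
    # Assumption: Frame Length counts everything after the 16-bit length
    # field: 2 must-be-zero bytes, 4 vector bytes, then the records.
    frame_length = 2 + 4 + len(body)
    return struct.pack("!HHI", frame_length, 0, response_vector) + body


def parse_frame(data):
    """Return (frame_length, response_vector, {bit: record_bytes})."""
    frame_length, _mbz, response_vector = struct.unpack_from("!HHI", data, 0)
    offset = 8
    records = {}
    for bit in sorted(RECORD_SIZES):
        if response_vector & (1 << bit):
            size = RECORD_SIZES[bit]
            records[bit] = data[offset:offset + size]
            offset += size
    return frame_length, response_vector, records


def account_for_frame(current_length, max_length, frame):
    """Return (new_current_length, overflow) for a candidate frame."""
    if current_length + len(frame) > max_length:
        return current_length, True  # frame is not added; O-bit is set
    return current_length + len(frame), False
```

   For instance, a probe requesting bits 0, 2 and 3 from an element
   that can only report bits 0 and 3 yields a response vector with just
   those two bits set, so the receiver can tell which records follow.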

   If inserting a new telemetry record would cause "Current Length" to
   exceed "Max Length", no record is added and the overflow "O-bit" must
   be set to "1" in the probe header.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Frame Length          |         Must be Zero          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Telemetry Response Vector                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   .                      Fixed Size Field 0                       .
   .                        (if requested)                         .
   .                                                               .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   .                      Fixed Size Field 1                       .
   .                        (if requested)                         .
   .                                                               .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                                                               .
   .                                                               .
   ~                                                               ~
   .                                                               .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   .                      Fixed Size Field 30                      .
   .                        (if requested)                         .
   .                                                               .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Length     |                                               |
   +-+-+-+-+-+-+-+-+                                               +
   |                                                               |
   .                     Opaque State Snapshot                     .
   .                        (if requested)                         .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 3: Telemetry Frame Format

3.  Telemetry Record Types

   This section defines some of the telemetry record types that could be
   supported by the network elements.

3.1.  Device Identifier

   This record is used to identify the device reporting telemetry
   information. This document does not prescribe any specific
   identifier format. In general, it is expected to be configured by
   the operator. The length of this record is 32 bits.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Device ID                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 4: Device Identifier

3.2.  Timestamp

   This telemetry record encodes the time data associated with the
   packet. Most existing hardware supports timestamping for IEEE 1588.
   To leverage existing hardware capabilities, packet receive time is
   stored in a similar format, as 48 bits of seconds and 32 bits of
   nanoseconds, and residence time is stored as 48 bits of nanoseconds.
   The length of this record is 128 bits.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Receive Seconds [47:16]                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Receive Seconds [15:0]     |  Receive Nanoseconds [31:16]  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Receive Nanoseconds [15:0]   |    Residence Time [47:32]     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Residence Time [31:0]                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                           Figure 5: Timestamp

3.3.  Queueing Delay

   This record encodes the amount of time that the frame has spent
   queued in the network element. This is only recorded if the packet
   has been queued, and defines the time spent in memory buffers. This
   could be helpful to detect queueing-related delays in the network.
   If the queueing delay exceeds the maximum representable in the
   31-bit nanosecond field (2^31 - 1 nanoseconds, roughly 2.1 seconds),
   the network element must set the overflow "O-bit". In the case of
   cut-through switching, this field must be set to zero. The length of
   this record is 32 bits.
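
   The encoding rule above can be sketched in Python as follows. This
   is an illustration under stated assumptions, not a normative
   encoding: the function names are hypothetical, the "O-bit" is taken
   to be the most significant bit of the record as drawn in Figure 6,
   and saturating the nanosecond field on overflow is a choice made
   here because the text does not say what the field holds in that
   case.

```python
import struct

# Largest delay the 31-bit nanosecond field can hold (about 2.147 s).
MAX_DELAY_NS = (1 << 31) - 1


def encode_queueing_delay(delay_ns):
    """Encode a queueing delay as a 32-bit record in network byte order."""
    if delay_ns < 0:
        raise ValueError("delay must be non-negative")
    if delay_ns > MAX_DELAY_NS:
        # Assumption: set the O-bit (top bit) and saturate the field.
        word = (1 << 31) | MAX_DELAY_NS
    else:
        word = delay_ns
    return struct.pack("!I", word)


def decode_queueing_delay(record):
    """Return (overflow, delay_ns) from a 4-byte queueing delay record."""
    (word,) = struct.unpack("!I", record)
    return bool(word >> 31), word & MAX_DELAY_NS
```

   A cut-through element would call encode_queueing_delay(0), which
   produces four zero bytes.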

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |O|                         Nanoseconds                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                         Figure 6: Queueing Delay

3.4.  Ingress/Egress Port IDs

   This record stores the ingress and egress physical ports used to
   receive and send the packet, respectively. Here, "physical port"
   means a unit with actual MAC and PHY devices associated - not any
   logical subdivision based, for example, on protocol level tags
   (e.g. VLAN). The port identifiers are opaque, and defined as 16-bit
   entries. For example, those could be the corresponding SNMP ifIndex
   values. The length of this record is 32 bits.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Ingress Port ID        |        Egress Port ID         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 7: Ingress/Egress Port IDs

3.5.  Opaque State Snapshot

   This record has variable size. It allows the network element to
   store arbitrary state in the probe, without a pre-defined schema.
   The schema needs to be made known to the analyzer by some
   out-of-band means. The 16-bit "Schema Id" field in the record is
   supposed to let the analyzer know which particular schema to use,
   and it is expected to be configured on the network element by the
   operator.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Length             |           Schema Id           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                                                               |
   |                          Opaque Data                          |
   ~                                                               ~
   .                                                               .
   .                                                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                         Figure 8: Opaque State

4.  Operating in loopback mode

   In "loopback" mode the flow of probes is "turned back" at some
   network element. The network element that "turns" packets around is
   identified using the "Hop Limit" field. A network element that
   receives a "Probe" type packet whose "Hop Limit" value is equal to
   "Hop Count" is required to perform the following:

      Change the "Message Type" field to "Probe Reply", and set the
      "Hop Limit" to zero.

      Swap the destination/source IP addresses in the transport header
      to send the packet back to the originator.

      Add a new telemetry data frame corresponding to the new forwarding
      information.

   This way, the original probe is routed back to the originator.
   Notice that the return path may be different from the path that the
   original probe has taken. This path will be recorded by the network
   elements as the reply is transported back to the sender. Using this
   technique one may progressively test a path until its breaking
   point.

   If a network element is incapable of redirecting packets back to the
   originator, another option would be exporting those packets to a
   network analyzer device, using some sort of encapsulation header.

5.  Processing Probe Packet

5.1.  Detecting a probe

   As mentioned previously, a combination of techniques needs to be
   used to differentiate the active probes. This may include, but
   should not be limited to, using the known position of the "Probe
   Marker" fields.

6.  Non-Capable Devices

   Non-capable devices are those that cannot process a probe natively in
   the fast-path data plane. Further, there could be two types of such
   devices: those that can still process it via the control-plane
   software, and those that cannot.
   The control-plane processing
   should be triggered by use of the "Router-Alert" option for IPv4 or
   IPv6 packets (see [RFC2113] or [RFC2711]) added by the originator of
   the probe. A control-plane capable device is expected to interpret
   and fill in as much telemetry-record data as it possibly can, given
   its limited abilities.

   Network elements that are not capable of processing the data-plane
   probes are expected to perform regular packet forwarding. If a
   network element receives a packet with the router-alert option set,
   but has no special configuration to detect such probes, it should
   process it according to [RFC6398]. Absence of the router-alert
   option leaves devices that are not data-plane capable with the only
   option of processing the probe using traditional forwarding.

7.  Handling data-plane probes in the MPLS domain

   In general, the payload of an MPLS packet is opaque to the network
   element. However, in many cases the network element still performs a
   lookup beyond the MPLS label stack, e.g. to obtain information such
   as L4 ports for load balancing. It may be possible to perform
   data-plane probe classification in the same manner, additionally
   using the "Probe Marker" to distinguish the probe packets.

   In accordance with [RFC6178], Label Edge Routers (LERs) are required
   not to impose an MPLS router-alert label for packets carrying the
   router-alert option. It may be beneficial to enable such
   translation, so that an end-to-end validation could be performed if
   a control-plane capable MPLS network element is present on the
   probe's path.

8.  Multi-chip device considerations

   TBD

9.  IANA Considerations

   None

10.  References

10.1.  Normative References

   [RFC2113]  Katz, D., "IP Router Alert Option", RFC 2113,
              DOI 10.17487/RFC2113, February 1997.

   [RFC2711]  Partridge, C. and A.
              Jackson, "IPv6 Router Alert Option",
              RFC 2711, DOI 10.17487/RFC2711, October 1999.

   [RFC6398]  Le Faucheur, F., Ed., "IP Router Alert Considerations and
              Usage", BCP 168, RFC 6398, DOI 10.17487/RFC6398, October
              2011.

   [RFC6178]  Smith, D., Mullooly, J., Jaeger, W., and T. Scholl, "Label
              Edge Router Forwarding of IPv4 Option Packets", RFC 6178,
              DOI 10.17487/RFC6178, March 2011.

10.2.  Informative References

   [RFC7276]  Mizrahi, T., Sprecher, N., Bellagamba, E., and Y.
              Weingarten, "An Overview of Operations, Administration,
              and Maintenance (OAM) Tools", RFC 7276,
              DOI 10.17487/RFC7276, June 2014.

Authors' Addresses

   Petr Lapukhov
   Facebook
   1 Hacker Way
   Menlo Park, CA 94025
   US

   Email: petr@fb.com

   Remy Chang
   Barefoot Networks
   2185 Park Boulevard
   Palo Alto, CA 94306
   US

   Email: remy@barefootnetworks.com