SPRING Working Group                                              Z. Ali
Internet Draft                                               C. Filsfils
Intended status: Standards Track                                N. Kumar
Expires: June 23, 2018                                      C. Pignataro
                                                                F. Iqbal
                                                     Cisco Systems, Inc.
                                                                J. Leddy
                                                                 Comcast
                                                           S. Matsushima
                                                                SoftBank
                                                               R. Raszuk
                                                            Bloomberg LP
                                                              B. Peirens
                                                                Proximus
                                                                 G. Naik
                                                       Drexel University
                                                       December 23, 2017

    Operations, Administration, and Maintenance (OAM) in Segment Routing
                  Networks with IPv6 Dataplane (SRv6)
                     draft-spring-srv6-oam-00.txt

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on June 23, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Abstract

   This document outlines various use-cases for Operations,
   Administration, and Maintenance (OAM) in Segment Routing networks
   with the IPv6 data plane (SRv6). It also specifies solutions to
   address the SRv6 OAM requirements.

Table of Contents

   1. Introduction
      1.1. Terminology and Reference Topology
   2. Use-cases
      2.1. Connectivity Verification
      2.2. Monitoring a Specific Flow
      2.3. Monitoring all ECMP/UCMP Paths
      2.4. Traceroute
      2.5. Proof of Transit
      2.6. Anycast Server selection
      2.7. Detecting Path Divergence
      2.8. Fault Isolation
      2.9. Connectivity Verification from arbitrary node
   3. OAM Mechanisms
      3.1. Ping
         3.1.1. Classic Ping
         3.1.2. Pinging a SID Function
            3.1.2.1. End-to-end ping using END.OTP
            3.1.2.2. Segment-by-segment ping using O-bit (Proof of
                     Transit)
      3.2. Error Reporting
      3.3. Traceroute
         3.3.1. Classic Traceroute
         3.3.2. Traceroute to a SID Function
            3.3.2.1. Hop-by-hop traceroute using END.OTP
            3.3.2.2. Tracing SRv6 Overlay
      3.4. In-situ OAM
      3.5. Seamless BFD Applicability
      3.6. Connectivity Verification from arbitrary SR node
   4. Security Considerations
   5. IANA Considerations
   6. References
      6.1. Normative References
      6.2. Informative References
   7. Acknowledgments

1. Introduction

   This document outlines various SRv6 OAM use-cases. It also describes
   OAM mechanisms that can be used to address SRv6 OAM requirements.

   Additional OAM use-cases and mechanisms will be added in a future
   revision of the document.

1.1. Terminology and Reference Topology

   This document uses the terminology defined in
   [I-D.draft-filsfils-spring-srv6-network-programming]. Readers are
   expected to be familiar with that terminology.

   Throughout the document, the following simple topology is used for
   illustration.

   +--------------------------| N100 |------------------------+
   |                                                          |
      ====== link1====== link3------ link5====== link9------
      ||N1||======||N2||======| N3 |======||N4||======| N5 |
      ||  ||------||  ||------|    |------||  ||------|    |
      ====== link2====== link4------ link6======link10------
                    |                      |
                    |       ------         |
                    +-------| N6 |---------+
                      link7 |    | link8
                            ------

                    Figure 1: Reference Topology

   In the reference topology:

   Nodes N1, N2, and N4 are SRv6 capable nodes.

   Nodes N3, N5 and N6 are classic IPv6 nodes.

   Node N100 is a controller.

   Node Nk has a classic IPv6 loopback address Bk::/128.

   Node Nk has Ak::/48 for its local SID space from which Local SIDs
   are explicitly allocated.

   The IPv6 address of the nth Link between node X and Y at the X side
   is represented as 99:X:Y::Xn. E.g., the IPv6 address of link6 (the
   2nd link) between N3 and N4 at N3 in Figure 1 is 99:3:4::32.
   Similarly, the IPv6 address of link5 (the 1st link between N3 and
   N4) at node N3 is 99:3:4::31.

   Ak::0 is explicitly allocated as the END function at Node k.

   Ak::Cij is explicitly allocated as the END.X function at node k
   towards neighbor node i via the jth Link between node k and node i.
   E.g., A2::C31 represents END.X at N2 towards N3 via link3 (the 1st
   link between N2 and N3). Similarly, A4::C52 represents the END.X at
   N4 towards N5 via link10.

   SRH is the abbreviation for the Segment Routing Header.

   SL is the abbreviation for the Segments Left.

   SID is the abbreviation for the Segment ID.

   <S1, S2, S3> represents a SID list where S1 is the first SID and S3
   is the last SID. (S3, S2, S1; SL) represents the same SID list but
   encoded in the SRH format where the rightmost SID (S1) in the SRH is
   the first SID and the leftmost SID (S3) in the SRH is the last SID.

   ECMP is the abbreviation for the Equal Cost Multi-Path.

   UCMP is the abbreviation for the Unequal Cost Multi-Path.

2. Use-cases

   This section outlines some of the basic OAM use-cases in an SRv6
   network. Additional use-cases will be added in a future revision of
   the document.

2.1. Connectivity Verification

   One of the basic OAM use-cases for any network is the capability to
   perform path monitoring between different end points over any
   possible shortest path without any path preference. Such essential
   path monitoring helps to monitor the path availability and the
   liveness of the remote end point.

   The shortest path monitoring can be done continuously or can be
   triggered on demand by an external event such as a script or a
   CLI trigger.
   It may be required to perform the connectivity
   verification in the order of milliseconds, or at a slower pace.

   In the reference topology in Figure 1, N1 can send an OAM probe
   packet destined to the loopback address of N5 (B5::) to monitor the
   path liveness between N1 and N5. N1 may optionally include any
   relevant segment list in the SRH. N1 is not concerned about which
   route is taken by the probe between N1 and N5 as long as N1 receives
   the response back from N5. All transit nodes treat the probe packet
   like any other data packet and forward it based on the Destination
   Address (DA). N5 looks into the payload of the probe packet and
   responds to the source address of the probe packet (N1).

2.2. Monitoring a Specific Flow

   The network OAM needs to have the ability to monitor a particular
   path from the available ECMP paths. For example, in the reference
   topology in Figure 1, there are many ECMP paths between N1 and N5.
   However, the service provider may like to monitor a flow that
   follows [N1]--[N2]--[N6]--[N4]--[N5].

   The flow monitoring can be done continuously or can be triggered on
   demand. It may be required to perform the connectivity
   verification in the order of milliseconds, or at a slower pace.

2.3. Monitoring all ECMP/UCMP Paths

   In any network, it is common to see multiple ECMP paths between end
   points that are used for load balancing or redundancy. While
   monitoring the shortest path helps to verify the path and the
   liveness of the remote node, it may not be sufficient to detect a
   failure in one of the ECMP paths. In our reference topology in
   Figure 1, N6 has 2 ECMP paths to reach N5 as below:

      N6----N4----N5

      N6----N4----N5

   If the probe packet from N6 to N5 uses link10, it may not detect any
   failure on link9. It is critical and beneficial to discover and
   monitor all ECMP/UCMP paths.
   Monitoring of all ECMP/UCMP paths can
   be done by probing the candidate paths end to end or by having each
   node monitor its own data plane.

2.4. Traceroute

   It is essential to trace the path between different end points for
   troubleshooting and fault localization purposes. In the SRv6
   network, depending on the forwarding instruction encoded in the SRH,
   a packet may traverse zero or more SRv6 transit nodes which in turn
   are connected through transit IPv6 nodes. For example, the best
   effort traffic may traverse the shortest path between ingress and
   egress nodes while SLA constrained traffic may follow a specific
   path that involves one or more transit SRv6 nodes.

   In either of these cases, traceroute functionality allows an
   operator to discover the set of SRv6 and/or IPv6 nodes along the
   path between different end points. Multipath being inevitable in any
   network, it is also essential to identify the exact path (among the
   available equal cost multi paths) that a particular flow or packet
   is traversing.

2.5. Proof of Transit

   Various scenarios require the packet to be steered over particular
   links or nodes. For example:

   - Voice traffic in an SLA constrained network needs to traverse a
     low latency path between endpoints which may not be the shortest
     path, i.e. the voice traffic needs to be traffic engineered and
     steered over the specified segment list that satisfies the SLA
     constraint.

   - In a service chaining environment, the traffic may need to
     traverse an ordered list of service functions.

   In these scenarios, the SRH contains the list of SID functions that
   the packet should execute before reaching the destination. It is
   possible, due to an error, that the packet may reach the destination
   without visiting all the segments in the segment list.
   It is,
   therefore, important to have the ability to verify that all the
   function SIDs have been executed correctly before the packet is
   delivered to the destination. It is also important to ensure that
   the order of execution of the SID functions has been consistent
   with the SRH contents.

2.6. Anycast Server selection

   For application redundancy and load sharing purposes, it is
   prevalent to see anycast deployments where the same service address
   is assigned to different application servers spread across the
   network. While traditionally this type of deployment model was used
   to terminate the client session at the nearest server, the recent
   capability of collecting network and application telemetry along
   with the traffic steering characteristics of SRv6 allows an operator
   to leverage the knowledge to choose the right server and path based
   on not just the shortest path, but also on other performance
   metrics.

   It is therefore essential to have the ability to monitor the anycast
   server performance, detect any deviation, and take corrective
   actions.

2.7. Detecting Path Divergence

   Path divergence occurs when network traffic diverges from the
   path that the packet was expected to take. Path divergence may
   result in congestion, delay, or breakage of strict SLAs promised to
   customers. It is, therefore, important to exercise mechanisms that
   can detect path divergence in the SRv6 network.

2.8. Fault Isolation

   In the cases where a monitoring technique discovers an issue, it is
   required to have the ability to pinpoint the failure location. The
   fault isolation mechanisms are required to help service providers
   troubleshoot failures in an SRv6 network.

2.9. Connectivity Verification from arbitrary node

   Recently, network operators have become interested in performing
   network operations, administration, and maintenance configuration in
   a centralized manner. In this use-case, one of the requirements is
   to implement OAM functionality like connectivity verification
   between different SRv6 end points in a centralized manner by
   triggering it from any arbitrary node. The other requirement in this
   use-case is to perform the connectivity verification between end
   points without any control plane intervention at the monitored or
   other transit nodes.

   Additional OAM use-cases will be included in a future revision of
   the document.

3. OAM Mechanisms

   This section describes how ping and traceroute mechanisms can be
   used in an SRv6 network. Additional OAM mechanisms will be added in
   a future revision of the document.

3.1. Ping

   [RFC4443] describes the Internet Control Message Protocol for IPv6
   (ICMPv6), which is used by IPv6 devices for network diagnostic and
   error reporting purposes. As Segment Routing with IPv6 data plane
   (SRv6) simply adds a new type of Routing Extension Header, existing
   ICMPv6 mechanisms can be used in an SRv6 network. This section
   describes the applicability of ICMPv6 in the SRv6 network and how
   the existing ICMPv6 mechanisms can be used for providing OAM
   functionality to address many use-cases outlined in Section 2.

   Throughout this document, unless otherwise specified, the acronym
   ICMPv6 refers to multi-part ICMPv6 messages [RFC4884]. The document
   does not propose any changes to the standard ICMPv6 [RFC4443],
   [RFC4884] or standard ICMPv4 [RFC792].

   There is no hardware or software change required for ping operation
   at the classic IPv6 nodes in an SRv6 network. That includes the
   classic IPv6 node with ingress, egress or transit roles.
   Furthermore, no protocol changes are required to the standard ICMPv6
   [RFC4443], [RFC4884] or standard ICMPv4 [RFC792]. In other words,
   existing ICMP ping mechanisms work seamlessly in the SRv6 networks.

   The following subsections outline some use cases of the ICMP ping in
   the SRv6 networks.

3.1.1. Classic Ping

   The existing mechanism to ping a remote IP prefix, along the
   shortest path, continues to work without any modification. The
   initiator may be an SRv6 node or a classic IPv6 node. Similarly, the
   egress or transit may be an SRv6 capable node or a classic IPv6
   node.

   If an SRv6 capable ingress node wants to ping an IPv6 prefix via an
   arbitrary segment list <S1, S2, S3>, it needs to initiate ICMPv6
   ping with an SR header containing the SID list <S1, S2, S3>. This is
   illustrated using the topology in Figure 1. Assume all the links
   have IGP metric 10 except both links between node N2 and node N3,
   which have IGP metric set to 100. The user issues a ping from node
   N1 to a loopback of node N5 via segment list <A2::C31, A4::C52>.

   Figure 2 contains sample output for a ping request initiated at node
   N1 to the loopback address of node N5 via a segment list
   <A2::C31, A4::C52>.

   > ping B5:: via segment-list A2::C31, A4::C52

   Sending 5, 100-byte ICMP Echos to B5::, timeout is 2 seconds:
   !!!!!
   Success rate is 100 percent (5/5), round-trip min/avg/max = 0.625
   /0.749/0.931 ms

         Figure 2: A sample ping output at an SRv6 capable node

   All transit nodes process the echo request message like any other
   data packet carrying an SR header and hence do not require any
   change. Similarly, the egress node (IPv6 classic or SRv6 capable)
   does not require any change to process the ICMPv6 echo request. For
   example, in the ping example of Figure 2:

   - Node N1 initiates an ICMPv6 ping packet with SRH as follows (B1::,
     A2::C31)(B5::, A4::C52, A2::C31; SL=2; NH=ICMPv6)(ICMPv6 Echo
     Request).
   - Node N2, which is an SRv6 capable node, performs the standard SRH
     processing. Specifically, it executes the END.X function (A2::C31)
     on the echo request packet.
   - Node N3, which is a classic IPv6 node, performs the standard IPv6
     processing. Specifically, it forwards the echo request based on DA
     A4::C52 in the IPv6 header.
   - Node N4, which is an SRv6 capable node, performs the standard SRH
     processing. Specifically, it executes the END.X function (A4::C52)
     with PSP (Penultimate Segment POP) on the echo request packet,
     removes the SRH, and forwards the packet across link10 to N5.
   - The echo request packet at N5 arrives as an IPv6 packet without an
     SRH. Node N5, which is a classic IPv6 node, performs the standard
     IPv6/ICMPv6 processing on the echo request and responds
     accordingly.

3.1.2. Pinging a SID Function

   The classic ping described in the previous section cannot be used to
   ping a remote SID function, as the following example shows.
   Consider the case where the user wants to ping the remote
   SID function A4::C52, via A2::C31, from node N1. Node N1 constructs
   the ping packet (B1::0, A2::C31)(A4::C52, A2::C31; SL=1;
   NH=ICMPv6)(ICMPv6 Echo Request). When the node N4 receives the
   ICMPv6 echo request with DA set to A4::C52 and next header set to
   ICMPv6, it silently drops it (see
   [I-D.draft-filsfils-spring-srv6-network-programming] for details).
   To solve this problem, the initiator needs to mark the ICMPv6 echo
   request as an OAM packet.

   The OAM packets are identified either by setting the O-bit in SRH or
   by inserting the END.OTP SID at an appropriate place in the SRH
   [I-D.draft-filsfils-spring-srv6-network-programming].

   In an SRv6 network, the user can exercise two flavors of the ping:
   end-to-end ping or segment-by-segment ping, as outlined in the
   following.

3.1.2.1. End-to-end ping using END.OTP

   Consider the same example where the user wants to ping a remote SID
   function A4::C52, via A2::C31, from node N1. To force a punt of the
   ICMPv6 echo request at the node N4, node N1 inserts the END.OTP SID
   just before the target SID A4::C52 in the SRH. The ICMPv6 echo
   request is processed at the individual nodes along the path as
   follows.

   - Node N1 initiates an ICMPv6 ping packet with SRH as follows
     (B1::0, A2::C31)(A4::C52, A4::OTP, A2::C31; SL=2;
     NH=ICMPv6)(ICMPv6 Echo Request).
   - Node N2, which is an SRv6 capable node, performs the standard SRH
     processing. Specifically, it executes the END.X function (A2::C31)
     on the echo request packet.
   - Node N3 receives the packet as follows (B1::0, A4::OTP)(A4::C52,
     A4::OTP, A2::C31; SL=1; NH=ICMPv6)(ICMPv6 Echo Request). Node N3,
     which is a classic IPv6 node, performs the standard IPv6
     processing. Specifically, it forwards the echo request based on DA
     A4::OTP in the IPv6 header.
   - When node N4 receives the packet (B1::0, A4::OTP)(A4::C52,
     A4::OTP, A2::C31; SL=1; NH=ICMPv6)(ICMPv6 Echo Request), it
     processes the END.OTP SID, as described in the pseudocode in
     [I-D.draft-filsfils-spring-srv6-network-programming]. The packet
     gets punted to the ICMPv6 process for processing. The ICMPv6
     process checks if the next SID in SRH (the target SID A4::C52) is
     locally programmed. If the target SID is not locally programmed,
     N4 responds with the ICMPv6 message (Type: "SRv6 OAM (TBA)", Code:
     "SID not locally implemented (TBA)"); otherwise a success is
     returned.

3.1.2.2. Segment-by-segment ping using O-bit (Proof of Transit)

   Consider the same example where the user wants to ping a remote SID
   function A4::C52, via A2::C31, from node N1. However, in this ping,
   the node N1 wants to get a response from each segment node in the
   SRH.
   In other words, in the segment-by-segment ping case, the node
   N1 expects a response from node N2 and node N4 for their respective
   local SID function.

   To force a punt of the ICMPv6 echo request at node N2 and node N4,
   node N1 sets the O-bit in SRH. The ICMPv6 echo request is processed
   at the individual nodes along the path as follows.

   - Node N1 initiates an ICMPv6 ping packet with SRH as follows
     (B1::0, A2::C31)(A4::C52, A2::C31; SL=1, Flags.O=1;
     NH=ICMPv6)(ICMPv6 Echo Request).
   - When node N2 receives the packet (B1::0, A2::C31)(A4::C52,
     A2::C31; SL=1, Flags.O=1; NH=ICMPv6)(ICMPv6 Echo Request),
     it processes the O-bit in SRH, as described in the pseudocode in
     [I-D.draft-filsfils-spring-srv6-network-programming]. A time-
     stamped copy of the packet gets punted to the ICMPv6 process for
     processing. Node N2 continues to apply the A2::C31 SID function on
     the original packet and forwards it accordingly. As
     SRH.Flags.O=1, Node N2 also disables the PSP flavour, i.e., does
     not remove the SRH. The ICMPv6 process at node N2 checks if its
     local SID (A2::C31) is locally programmed or not and responds to
     the ICMPv6 Echo Request. If the target SID is not locally
     programmed, N2 responds with the ICMPv6 message (Type: "SRv6 OAM
     (TBA)", Code: "SID not locally implemented (TBA)"); otherwise a
     success is returned. Please note that, as mentioned in
     [I-D.draft-filsfils-spring-srv6-network-programming], if node N2
     does not support the O-bit, it simply ignores it and processes the
     local SID, A2::C31.
   - Node N3, which is a classic IPv6 node, performs the standard IPv6
     processing. Specifically, it forwards the echo request based on DA
     A4::C52 in the IPv6 header.
   - When node N4 receives the packet (B1::0, A4::C52)(A4::C52,
     A2::C31; SL=0, Flags.O=1; NH=ICMPv6)(ICMPv6 Echo Request), it
     processes the O-bit in SRH, as described in the pseudocode in
     [I-D.draft-filsfils-spring-srv6-network-programming]. A
     time-stamped copy of the packet gets punted to the ICMPv6 process
     for processing. The ICMPv6 process at node N4 checks if its local
     SID (A4::C52) is locally programmed or not and responds to the
     ICMPv6 Echo Request. If the target SID is not locally programmed,
     N4 responds with the ICMPv6 message (Type: "SRv6 OAM (TBA)", Code:
     "SID not locally implemented (TBA)"); otherwise a success is
     returned.

   Support for the O-bit is part of the node capability advertisement.
   That enables node N1 to know which segment nodes are capable of
   responding to the ICMPv6 echo request. Node N1 processes the echo
   responses and presents the data to the user accordingly.

   Please note that segment-by-segment ping can be used to address the
   proof of transit use-case discussed earlier.

3.2. Error Reporting

   Any IPv6 node can use ICMPv6 control messages to report packet
   processing errors to the host that originated the datagram.
   To name a few such scenarios:

   - If the router receives an undeliverable IP datagram, or
   - If the router receives a packet with a Hop Limit of zero, or
   - If the router receives a packet such that if the router decrements
     the packet's Hop Limit it becomes zero, or
   - If the router receives a packet with a problem in a field of the
     IPv6 header or the extension headers such that it cannot complete
     processing the packet, or
   - If the router cannot forward a packet because the packet is larger
     than the MTU of the outgoing link.
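
   As a rough illustration of the scenarios listed above, the following
   Python sketch maps each condition to the ICMPv6 error message type
   defined in [RFC4443] and computes how many octets of the invoking
   packet fit in the error message. The function names, the argument
   names, and the order in which the checks are applied are assumptions
   made for this example only; the sketch also assumes that the ICMPv6
   error message itself carries no extension headers.

```python
from enum import IntEnum
from typing import Optional

IPV6_MIN_MTU = 1280           # minimum IPv6 MTU, in octets
IPV6_HEADER_LEN = 40          # fixed IPv6 header of the ICMPv6 error itself
ICMPV6_ERROR_HEADER_LEN = 8   # type, code, checksum + 4-octet type-specific field


class Icmpv6Error(IntEnum):
    """ICMPv6 error message types from RFC 4443."""
    DESTINATION_UNREACHABLE = 1
    PACKET_TOO_BIG = 2
    TIME_EXCEEDED = 3
    PARAMETER_PROBLEM = 4


def classify_error(deliverable: bool, header_ok: bool, hop_limit: int,
                   packet_len: int, out_mtu: int) -> Optional[Icmpv6Error]:
    """Return the error a router would report, or None if it can forward.

    The order of the checks is hypothetical and chosen for readability.
    """
    if not header_ok:
        return Icmpv6Error.PARAMETER_PROBLEM
    if not deliverable:
        return Icmpv6Error.DESTINATION_UNREACHABLE
    if hop_limit <= 1:            # zero on arrival, or zero after decrement
        return Icmpv6Error.TIME_EXCEEDED
    if packet_len > out_mtu:
        return Icmpv6Error.PACKET_TOO_BIG
    return None


def quoted_octets(invoking_len: int) -> int:
    """Octets of the invoking packet that fit within the minimum MTU."""
    room = IPV6_MIN_MTU - IPV6_HEADER_LEN - ICMPV6_ERROR_HEADER_LEN
    return min(invoking_len, room)


if __name__ == "__main__":
    # A probe arriving with Hop Limit 1 expires at this router.
    print(classify_error(True, True, 1, 100, 1500).name)
    # A 1500-octet invoking packet is truncated in the error message.
    print(quoted_octets(1500))
```

   Under these assumptions, at most 1280 - 40 - 8 = 1232 octets of the
   invoking packet are quoted, which normally leaves ample room for
   the IPv6 header and the SR header of a probe.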

   In the scenarios listed above, the ICMPv6 response also contains the
   IP header, IP extension headers and leading payload octets of the
   "original datagram" to which the ICMPv6 message is a response.
   Specifically, the "Destination Unreachable Message", "Time Exceeded
   Message", "Packet Too Big Message" and "Parameter Problem Message"
   ICMPv6 messages can contain as much of the invoking packet as
   possible without the ICMPv6 packet exceeding the minimum IPv6 MTU
   [RFC4443], [RFC4884]. In an SRv6 network, the copy of the invoking
   packet contains the SR header. The packet originator can use this
   information for diagnostic purposes. For example, traceroute can use
   this information as detailed in the following.

3.3. Traceroute

   There is no hardware or software change required for traceroute
   operation at the classic IPv6 nodes in an SRv6 network. That
   includes the classic IPv6 node with ingress, egress or transit
   roles. Furthermore, no protocol changes are required to the standard
   traceroute operations. In other words, existing traceroute
   mechanisms work seamlessly in the SRv6 networks.

   The following subsections outline some use cases of the traceroute
   in the SRv6 networks.

3.3.1. Classic Traceroute

   The existing mechanism to traceroute a remote IP prefix, along the
   shortest path, continues to work without any modification. The
   initiator may be an SRv6 node or a classic IPv6 node. Similarly, the
   egress or transit may be an SRv6 node or a classic IPv6 node.

   If an SRv6 capable ingress node wants to traceroute to an IPv6
   prefix via an arbitrary segment list <S1, S2, S3>, it needs to
   initiate a traceroute probe with an SR header containing the SID
   list <S1, S2, S3>. That is illustrated using the topology in
   Figure 1. Assume all the links have IGP metric 10 except both links
   between node N2 and node N3, which have IGP metric set to 100.
   The user issues a traceroute from node N1 to a loopback of node N5,
   via segment list <A2::C31, A4::C52>. Figure 3 contains sample output
   for the traceroute request.

   > traceroute B5:: via segment-list A2::C31, A4::C52

   Tracing the route to B5::

    1  99:1:2::21 0.512 msec 0.425 msec 0.374 msec
       SRH: (B5::, A4::C52, A2::C31, SL=2)

    2  99:2:3::31 0.721 msec 0.810 msec 0.795 msec
       SRH: (B5::, A4::C52, A2::C31, SL=1)

    3  99:3:4::41 0.921 msec 0.816 msec 0.759 msec
       SRH: (B5::, A4::C52, A2::C31, SL=1)

    4  99:4:5::52 0.879 msec 0.916 msec 1.024 msec

      Figure 3: A sample traceroute output at an SRv6 capable node

   Please note that information for hop2 is returned by N3, which is a
   classic IPv6 node. Nonetheless, the ingress node is able to display
   SR header contents as the packet travels through the IPv6 classic
   node. This is because the "Time Exceeded Message" ICMPv6 message can
   contain as much of the invoking packet as possible without the
   ICMPv6 packet exceeding the minimum IPv6 MTU [RFC4443]. The SR
   header is also included in these ICMPv6 messages initiated by the
   classic IPv6 transit nodes that are not running SRv6 software.
   Specifically, a node generating an ICMPv6 message containing a copy
   of the invoking packet does not need to understand the extension
   header(s) in the invoking packet.

   The segment list information for hop1 is returned by N2,
   which is an SRv6 capable node. Just like for hop2, the ingress node
   is able to display SR header contents for hop1.

   There is no difference in processing of the traceroute probe at an
   IPv6 classic node and an SRv6 capable node. Similarly, both IPv6
   classic and SRv6 capable nodes use the address of the interface on
   which the probe was received as the source address in the ICMPv6
   response.
   ICMP extensions defined in [RFC5837] can also be used to
   display the IP interface through which the datagram would have been
   forwarded had it been forwardable, the IP next hop to which the
   datagram would have been forwarded, the IP interface upon which the
   datagram arrived, and the sub-IP component of an IP interface upon
   which the datagram arrived.

   The information about the IP address of the incoming interface on
   which the traceroute probe was received by the reporting node is
   very useful. This information can also be used to verify if SID
   functions A2::C31 and A4::C52 are executed correctly by N2 and N4,
   respectively. Specifically, the information displayed for hop2
   contains the incoming interface address 99:2:3::31 at N3. This
   matches the expected interface bound to END.X function A2::C31
   (link3). Similarly, the information displayed for hop4 contains the
   incoming interface address 99:4:5::52 at N5. This matches the
   expected interface bound to the END.X function A4::C52 (link10).

3.3.2. Traceroute to a SID Function

   The classic traceroute described in the previous section cannot be
   used to traceroute a remote SID function, as the following example
   shows.

   Consider the case where the user wants to traceroute the remote SID
   function A4::C52, via A2::C31, from node N1. Node N1 constructs the
   traceroute packet (B1::0, A2::C31, HC=1)(A4::C52, A2::C31; SL=1;
   NH=UDP)(traceroute probe). Even though Hop Count of the packet is
   set to 1, when the node N4 receives the traceroute probe with DA set
   to A4::C52 and next header set to UDP, it silently drops it (see
   [I-D.draft-filsfils-spring-srv6-network-programming] for details).
   To solve this problem, the initiator needs to mark the traceroute
   probe as an OAM packet.
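
   The packet notation used in these examples can be made concrete
   with a short sketch. The Python below is illustrative only: the
   Packet class and the build_probe, end_processing, and show helpers
   are hypothetical names introduced here, SIDs are kept as the
   symbolic strings used in this document rather than parsed as
   addresses, and only the Segments Left and Destination Address
   handling is modeled (no Hop Limit, O-bit, or PSP behavior).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    src: str
    dst: str                  # IPv6 Destination Address (DA)
    segment_list: List[str]   # SRH order: index 0 holds the LAST SID
    segments_left: int        # SL, index of the active SID


def build_probe(src: str, sid_path: List[str]) -> Packet:
    """Encode a SID list <S1, ..., Sn> as (Sn, ..., S1; SL=n-1)."""
    srh = list(reversed(sid_path))
    sl = len(srh) - 1
    return Packet(src, srh[sl], srh, sl)


def end_processing(pkt: Packet) -> Packet:
    """Model the SL decrement and DA update done at a segment endpoint."""
    if pkt.segments_left == 0:
        raise ValueError("no segments left to advance to")
    sl = pkt.segments_left - 1
    return Packet(pkt.src, pkt.segment_list[sl], pkt.segment_list, sl)


def show(pkt: Packet) -> str:
    """Render a packet in the (SA, DA)(SID list; SL) notation."""
    sids = ", ".join(pkt.segment_list)
    return f"({pkt.src}, {pkt.dst})({sids}; SL={pkt.segments_left})"


if __name__ == "__main__":
    probe = build_probe("B1::0", ["A2::C31", "A4::C52"])
    print(show(probe))                  # packet as sent by N1
    print(show(end_processing(probe)))  # packet after N2 applies A2::C31
```

   For the SID list <A2::C31, A4::C52>, build_probe yields the SRH
   (A4::C52, A2::C31; SL=1) with DA A2::C31, and one call to
   end_processing models the update at N2, after which SL=0 and the DA
   is A4::C52.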
645 The OAM packets are identified either by setting the O-bit in SRH or 646 by inserting the END.OTP SID at an appropriate place in the SRH [I- 647 D.draft-filsfils-spring-srv6-network-programming]. 649 In an SRv6 network, the user can exercise two flavors of the 650 traceroute: hop-by-hop traceroute or overlay traceroute. 652 In hop-by-hop traceroute, the user gets responses from all nodes, 653 including classic IPv6 transit nodes, SRv6 capable transit nodes as 654 well as SRv6 capable segment endpoints. E.g., consider the example 655 where the user wants to traceroute to a remote SID function A4::C52, 656 via A2::C31, from node N1. The traceroute output will also display 657 information about node N3, which is a transit (underlay) node. 659 The overlay traceroute, on the other hand, does not trace the 660 underlay nodes. In other words, the overlay traceroute only displays 661 the nodes that act as SRv6 segments along the route. I.e., in the 662 example where the user wants to traceroute to a remote SID function 663 A4::C52, via A2::C31, from node N1, the overlay traceroute would 664 only display the traceroute information from node N2 and node N4 and 665 will not display information from node N3. 667 3.3.2.1. Hop-by-hop traceroute using END.OTP 669 In this section, hop-by-hop traceroute to a SID function is 670 exemplified using UDP probes. However, the procedure is equally 671 applicable to other implementations of the traceroute mechanism. 673 Consider the same example where the user wants to traceroute to a 674 remote SID function A4::C52, via A2::C31, from node N1. To force a 675 punt of the traceroute probe only at node N4, node N1 inserts 676 the END.OTP SID just before the target SID A4::C52 in the SRH. The 677 traceroute probe is processed at the individual nodes along the path 678 as follows.
680 - Node N1 initiates traceroute probe packets with a monotonically 681 increasing value of hop count and SRH as follows (B1::0, 682 A2::C31)(A4::C52, A4::OTP, A2::C31; SL=2; NH=UDP)(Traceroute 683 probe). 684 - When node N2 receives the packet with hop-count = 1, it processes 685 the hop count expiry. Specifically, node N2 responds with the 686 ICMPv6 message (Type: "Time Exceeded", Code: "Hop limit 687 exceeded in transit"). 688 - When node N2 receives the packet with hop-count > 1, it performs 689 the standard SRH processing. Specifically, it executes the END.X 690 function (A2::C31) on the traceroute probe. 691 - When node N3, which is a classic IPv6 node, receives the packet 692 (B1::0, A4::OTP)(A4::C52, A4::OTP, A2::C31; HC=1, SL=1; 693 NH=UDP)(Traceroute probe) with hop-count = 1, it processes the hop 694 count expiry. Specifically, node N3 responds with the ICMPv6 695 message (Type: "Time Exceeded", Code: "Hop limit exceeded in 696 transit"). 697 - When node N3, which is a classic IPv6 node, receives the packet 698 with hop-count > 1, it performs the standard IPv6 processing. 699 Specifically, it forwards the traceroute probe based on DA A4::OTP 700 in the IPv6 header. 701 - When node N4 receives the packet (B1::0, A4::OTP)(A4::C52, 702 A4::OTP, A2::C31; SL=1; HC=1, NH=UDP)(Traceroute probe), it 703 processes the END.OTP SID, as described in the pseudocode in [I- 704 D.draft-filsfils-spring-srv6-network-programming]. The packet gets 705 punted to the traceroute process for processing. The traceroute 706 process checks if the next SID in SRH (the target SID A4::C52) is 707 locally programmed. If the target SID A4::C52 is locally 708 programmed, node N4 responds with the ICMPv6 message (Type: 709 Destination unreachable, Code: Port Unreachable). If the target 710 SID A4::C52 is not a local SID, node N4 silently drops the 711 traceroute probe. 713 Figure 4 displays a sample traceroute output for this example.
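The per-node processing listed above can be approximated by the following Python sketch. It is a toy model under stated assumptions, not an implementation of the referenced pseudocode: the PATH, LOCAL_SIDS and OTP_SIDS tables, the function names, and the reply labels are hypothetical and only mirror the N1-N2-N3-N4 example.

```python
# Path taken by the probe in the example: N1 -> N2 -> N3 -> N4.
PATH = ["N2", "N3", "N4"]
# Assumed local SIDs per node; N3 is a classic IPv6 node and has none.
LOCAL_SIDS = {"N2": {"A2::C31"}, "N3": set(), "N4": {"A4::C52", "A4::OTP"}}
OTP_SIDS = {"A4::OTP"}  # SIDs assumed to be bound to the END.OTP function


def send_probe(hop_limit, segments):
    """Trace one probe; segments[0] is the last segment, as in the SRH.

    Returns (replying node, reply label, SL seen by the replying node).
    """
    sl = len(segments) - 1
    da = segments[sl]
    for node in PATH:
        if da in OTP_SIDS and da in LOCAL_SIDS[node]:
            # END.OTP: the probe is punted to the traceroute process, which
            # checks whether the next SID (the target) is locally programmed.
            if sl > 0 and segments[sl - 1] in LOCAL_SIDS[node]:
                return node, "port-unreachable", sl
            return node, "silent-drop", sl
        if hop_limit <= 1:
            return node, "time-exceeded", sl   # hop count expiry
        if da in LOCAL_SIDS[node]:
            sl -= 1                            # standard SRH processing
            da = segments[sl]
        hop_limit -= 1                         # forward to the next node
    return None, "no-reply", sl


def traceroute(segments, max_hops=8):
    """Send probes with hop count 1, 2, ... until a final reply arrives."""
    replies = []
    for hop_limit in range(1, max_hops + 1):
        reply = send_probe(hop_limit, segments)
        replies.append(reply)
        if reply[1] != "time-exceeded":
            break
    return replies


for hop, (node, reply, sl) in enumerate(
        traceroute(["A4::C52", "A4::OTP", "A2::C31"]), start=1):
    print(hop, node, reply, f"SL={sl}")
```

In this model, N2 and N3 answer with a hop count expiry and N4 answers with Port Unreachable, with the SL values 2, 1 and 1 respectively.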
715 > traceroute srv6 A4::C52 via segment-list A2::C31 717 Tracing the route to SID function A4::C52 719 1 99:1:2::21 0.512 msec 0.425 msec 0.374 msec 720 SRH: (A4::C52, A4::OTP, A2::C31; SL=2) 722 2 99:2:3::31 0.721 msec 0.810 msec 0.795 msec 723 SRH: (A4::C52, A4::OTP, A2::C31; SL=1) 725 3 99:3:4::41 0.921 msec 0.816 msec 0.759 msec 726 SRH: (A4::C52, A4::OTP, A2::C31; SL=1) 728 A sample output for hop-by-hop traceroute to a SID function 730 3.3.2.2. Tracing SRv6 Overlay 732 The overlay traceroute does not trace the underlay nodes, i.e., it only 733 displays the nodes that act as SRv6 segments along the path. This 734 is achieved by setting the SRH.Flags.O bit. 736 In this section, overlay traceroute to a SID function is exemplified 737 using UDP probes. However, the procedure is equally applicable to 738 other implementations of the traceroute mechanism. 740 Consider the same example where the user wants to traceroute to a 741 remote SID function A4::C52, via A2::C31, from node N1. 743 - Node N1 initiates a traceroute probe with SRH as follows (B1::0, 744 A2::C31)(A4::C52, A2::C31; HC=64, SL=1, Flags.O=1; 745 NH=UDP)(Traceroute Probe). Please note that the hop-count is set 746 to 64 to exclude the underlay nodes from the trace. The O-bit in SRH is 747 set to make the overlay nodes (nodes processing the SRH) respond. 748 - When node N2 receives the packet (B1::0, A2::C31)(A4::C52, 749 A2::C31; SL=1, HC=64, Flags.O=1; NH=UDP)(Traceroute Probe), it 750 processes the O-bit in SRH, as described in the pseudocode in [I- 751 D.draft-filsfils-spring-srv6-network-programming]. A time-stamped 752 copy of the packet gets punted to the traceroute process for 753 processing. Node N2 continues to apply the A2::C31 SID function on 754 the original packet and forwards it accordingly. As 755 SRH.Flags.O=1, node N2 also disables the PSP flavor, i.e., does 756 not remove the SRH. The traceroute process at node N2 checks if 757 the SID (A2::C31) is locally programmed.
If the SID is not 758 locally programmed, it silently drops the packet. Otherwise, it 759 performs the egress check by looking at the SL value in SRH. As SL 760 is not equal to zero (i.e., N2 is not the egress node), node N2 761 responds with the ICMPv6 message (Type: "SRv6 OAM (TBA)", Code: 762 "O-bit punt at Transit (TBA)"). Please note that, as mentioned in 763 [I-D.draft-filsfils-spring-srv6-network-programming], if node N2 764 does not support the O-bit, it simply ignores it and processes the 765 local SID, A2::C31. 766 - When node N3 receives the packet (B1::0, A4::C52)(A4::C52, 767 A2::C31; SL=0, HC=63, Flags.O=1; NH=UDP)(Traceroute Probe), it 768 performs the standard IPv6 processing. Specifically, it forwards 769 the traceroute probe based on DA A4::C52 in the IPv6 header. 770 Please note that there is no hop-count expiration at the transit 771 nodes. 772 - When node N4 receives the packet (B1::0, A4::C52)(A4::C52, 773 A2::C31; SL=0, HC=62, Flags.O=1; NH=UDP)(Traceroute Probe), it 774 processes the O-bit in SRH, as described in the pseudocode in [I- 775 D.draft-filsfils-spring-srv6-network-programming]. A time-stamped 776 copy of the packet gets punted to the traceroute process for 777 processing. The traceroute process at node N4 checks if the 778 SID (A4::C52) is locally programmed. If the SID is not locally 779 programmed, it silently drops the packet. Otherwise, it performs 780 the egress check by looking at the SL value in SRH. As SL is equal 781 to zero (i.e., N4 is the egress node), node N4 tries to consume 782 the UDP probe. As the UDP probe is set to access an invalid port, 783 node N4 responds with the ICMPv6 message (Type: Destination 784 unreachable, Code: Port Unreachable). 786 Figure 5 displays a sample overlay traceroute output for this 787 example. Please note that the underlay node N3 does not appear in 788 the output.
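The O-bit processing walked through above can be approximated by the following Python sketch. It is a toy model under stated assumptions: the PATH and LOCAL_SIDS tables, the function name, and the reply labels are hypothetical, and the hop count is not modeled because the probe is sent with HC=64 and never expires in this topology.

```python
# Path taken by the probe in the example: N1 -> N2 -> N3 -> N4.
PATH = ["N2", "N3", "N4"]
# Assumed local SIDs per node; N3 is a classic IPv6 node and has none.
LOCAL_SIDS = {"N2": {"A2::C31"}, "N3": set(), "N4": {"A4::C52"}}


def overlay_traceroute(segments):
    """Send one probe with the O-bit set; segments[0] is the last segment.

    Returns the replies as (node, reply label, SL in the punted copy).
    """
    replies = []
    sl = len(segments) - 1
    da = segments[sl]
    for node in PATH:
        if da not in LOCAL_SIDS[node]:
            continue  # underlay node (e.g. N3): plain IPv6 forwarding, no reply
        # DA is a local SID and the O-bit is set: a copy is punted to the
        # traceroute process, which performs the egress check on SL.
        if sl == 0:
            # Egress node: the UDP probe targets an invalid port.
            replies.append((node, "port-unreachable", sl))
            break
        replies.append((node, "o-bit-punt-at-transit", sl))
        sl -= 1            # the original packet continues with the next SID
        da = segments[sl]
    return replies


for hop, (node, reply, sl) in enumerate(
        overlay_traceroute(["A4::C52", "A2::C31"]), start=1):
    print(hop, node, reply, f"SL={sl}")
```

Only N2 and N4 produce replies in this model, which reflects why the underlay node N3 is absent from an overlay trace.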
790 > traceroute srv6 A4::C52 via segment-list A2::C31 792 Tracing the route to SID function A4::C52 794 1 99:1:2::21 0.512 msec 0.425 msec 0.374 msec 795 SRH: (A4::C52, A2::C31; SL=1) 797 2 99:3:4::41 0.921 msec 0.816 msec 0.759 msec 798 SRH: (A4::C52, A2::C31; SL=0) 800 A sample output for overlay traceroute to a SID function 802 3.4. In-situ OAM 804 [I-D.draft-brockners-inband-oam-requirements] describes the motivation 805 and requirements for In-situ OAM (iOAM). iOAM records operational 806 and telemetry information in the data packet while the packet 807 traverses the network of the telemetry domain. iOAM complements out-of- 808 band probe based OAM mechanisms such as ICMP ping and traceroute by 809 directly encoding tracing and other kinds of telemetry 810 information into the regular data traffic. 812 [I-D.brockners-inband-oam-transport] describes transport mechanisms 813 for iOAM data including IPv6 and Segment Routing traffic. 814 Furthermore, [I-D.brockners-inband-oam-data] defines the information 815 encoding for iOAM data. 817 One of the applications of iOAM is to perform inband traceroute. In 818 an SRv6 network, the iOAM traceroute feature can be used to trace the ordered 819 set of segment IDs executed by SRv6 nodes for packet forwarding along 820 the packet path. This is achieved by recording the details of the nodes that 821 the packet traversed in the packet header itself. 823 Another important application of iOAM is to perform delay 824 measurement in anycast server scenarios. Anycast server deployment 825 is commonly seen for redundancy and load balancing purposes. In an SRv6 826 network, iOAM can be used to collect the timestamps from different 827 anycast servers to measure the delay induced by each server within 828 the anycast cluster, which helps to provide SLA-constrained services. 830 Another application of iOAM is to provide Proof of 831 Transit (POT).
SRv6 networks can use 832 the POT feature of iOAM to verify that all the SID functions in the SRH 833 have been executed before the packet is delivered to the 834 destination. It can also ensure that the order of execution of the 835 SID functions is consistent with the SRH contents. 837 More details on various applications of iOAM in SRv6 networks will 838 be included in future versions of this document. 840 3.5. Seamless BFD Applicability 842 [RFC7880] defines the Seamless BFD (S-BFD) architecture, which simplifies 843 the BFD mechanism and enables it to perform path monitoring in a 844 controlled and scalable manner. [RFC7881] describes the procedure to 845 perform continuity checks using S-BFD in different environments 846 including IPv6 networks. Section 5.1 of [RFC7881] explains the 847 SBFDInitiator specification and the procedure to initiate S-BFD control 848 packets in IP and MPLS networks. The specification described for IP- 849 routed S-BFD control packets is also directly applicable to the SRv6 850 network. 852 S-BFD has a fast bootstrapping capability. Furthermore, in S-BFD, 853 only the ingress is required to keep BFD state; the egress and 854 transit nodes do not have any knowledge of the BFD session. These 855 attributes of S-BFD make it an excellent candidate for rapid failure 856 detection in the SRv6 network. More details on various S-BFD usages 857 in the SRv6 network will be included in a future version. 859 3.6. Connectivity Verification from arbitrary SR node 861 Recently, network operators have become interested in performing 862 network operations, administration, and maintenance configuration in 863 a centralized manner. Data models written in languages such as YANG are available to 864 collect data from the network and manage it from a centralized 865 entity. 867 SR technology enables a centralized OAM entity to perform path 868 monitoring without control plane 869 intervention on the monitored nodes.
[I.D-draft-ietf-spring-oam-usecase] 870 describes such a centralized OAM mechanism. Specifically, the draft 871 describes a procedure that can be used to perform a path continuity 872 check between any nodes within an SR domain from a centralized 873 monitoring system, with minimal or no control plane intervention on the 874 nodes. However, the draft focuses on SR networks with the MPLS data 875 plane. The same concept applies to SRv6 networks. This document 876 describes how the concept can be used to perform path monitoring in 877 an SRv6 network. 879 In the above reference topology, N100 is the centralized monitoring 880 system implementing an END function A100::. In order to verify a 881 segment list <A2::C31, A4::C52>, N100 generates a probe packet with 882 SRH set to (A100::, A4::C52, A2::C31, SL=2). The controller routes 883 the probe packet towards the first segment, which is A2::C31. N2 884 performs the standard SRH processing and forwards it over link3 with 885 the DA of the IPv6 packet set to A4::C52. N4 also performs the normal 886 SRH processing and forwards it over link10 with the DA of the IPv6 packet 887 set to A100::. This makes the probe loop back to the centralized 888 monitoring system. 890 In our reference topology in Figure 1, N100 uses an IGP 891 like OSPF or IS-IS to get the topology view within the IGP domain. 892 N100 can also use BGP-LS to get the complete view of an inter-domain 893 topology. In other words, the controller leverages the visibility of 894 the topology to monitor the paths between the various endpoints 895 without control plane intervention required at the monitored nodes. 897 4. Security Considerations 899 This document does not define any new protocol extensions and relies 900 on existing procedures defined for ICMP. This document does not 901 impose any additional security challenges to be considered beyond 902 the security considerations described in [RFC4884], [RFC4443], [RFC792] 903 and the RFCs that update these RFCs. 905 5.
IANA Considerations 907 This document does not define any new protocol or any extension to 908 an existing protocol. 910 6. References 912 6.1. Normative References 914 [RFC4884] Extended ICMP to Support Multi-Part Messages. R. Bonica, 915 D. Gan, D. Tappan, C. Pignataro. April 2007. 917 [RFC4443] Internet Control Message Protocol (ICMPv6) for the 918 Internet Protocol Version 6 (IPv6) Specification. A. 919 Conta, S. Deering, M. Gupta, Ed. March 2006. 921 [RFC792] Internet Control Message Protocol. J. Postel. September 922 1981. 924 [RFC5837] Extending ICMP for Interface and Next-Hop Identification. 925 A. Atlas, Ed., R. Bonica, Ed., C. Pignataro, Ed., N. Shen, 926 JR. Rivers. April 2010. 928 [RFC7880] Seamless Bidirectional Forwarding Detection (S-BFD). 929 C. Pignataro, D. Ward, N. Akiya, M. Bhatia, S. Pallagatti. July 930 2016. 932 [RFC7881] Seamless Bidirectional Forwarding Detection (S-BFD) for 933 IPv4, IPv6, and MPLS. C. Pignataro, D. Ward, N. Akiya. July 934 2016. 936 [I.D-filsfils-spring-srv6-network-programming] SRv6 Network 937 Programming, draft-filsfils-spring-srv6-network- 938 programming, C. Filsfils, work in progress. 940 6.2. Informative References 942 [I.D-draft-ietf-spring-oam-usecase] A Scalable and Topology-Aware 943 MPLS Dataplane Monitoring System. R. Geib, C. Filsfils, C. 944 Pignataro, N. Kumar, work in progress. 946 [I-D.brockners-inband-oam-data] Data Formats for In-situ OAM. F. 947 Brockners, work in progress. 949 [I-D.brockners-inband-oam-transport] Encapsulations for In-situ OAM 950 Data. F. Brockners, work in progress. 952 [I-D.brockners-inband-oam-requirements] Requirements for In-situ 953 OAM. F. Brockners, work in progress. 955 7. Acknowledgments 957 To be added. 959 Authors' Addresses 961 Clarence Filsfils 962 963 Email: cfilsfil@cisco.com 965 Zafar Ali 966 Cisco Systems, Inc. 967 Email: zali@cisco.com 969 Nagendra Kumar 970 Cisco Systems, Inc. 971 Email: naikumar@cisco.com 973 Carlos Pignataro 974 Cisco Systems, Inc.
975 Email: cpignata@cisco.com 977 Faisal Iqbal 978 Cisco Systems, Inc. 979 Email: faiqbal@cisco.com 981 John Leddy 982 Comcast 983 Email: John_Leddy@cable.comcast.com 985 Robert Raszuk 986 Bloomberg LP 987 731 Lexington Ave 988 New York City, NY10022, USA 989 Email: robert@raszuk.net 991 Satoru Matsushima 992 SoftBank 993 Japan 994 Email: satoru.matsushima@g.softbank.co.jp 996 Bart Peirens 997 Proximus 998 Email: bart.peirens@proximus.com 1000 Gaurav Naik 1001 Drexel University 1002 United States of America 1003 Email: gn@drexel.edu