idnits 2.17.1 draft-ninan-mpls-spring-inter-domain-oam-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There is 1 instance of too long lines in the document, the longest one being 1 character in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (June 11, 2021) is 1043 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'N-P1' is mentioned on line 597, but not defined == Missing Reference: 'N-ASBR1' is mentioned on line 597, but not defined == Missing Reference: 'EPE-ASBR1-ASBR4' is mentioned on line 597, but not defined == Missing Reference: 'N-PE4' is mentioned on line 597, but not defined == Missing Reference: 'N-ASBR4' is mentioned on line 655, but not defined == Missing Reference: 'EPE-ASBR4-ASBR1' is mentioned on line 655, but not defined == Missing Reference: 'N-PE1' is mentioned on line 797, but not defined == Missing Reference: 'PE1' is mentioned on line 650, but not defined == Missing Reference: 'ASBR1' is mentioned on line 650, but not defined == Missing Reference: 'ASBR4' is mentioned on line 650, but not defined == Missing Reference: 'ASBR6' 
is mentioned on line 650, but not defined == Missing Reference: 'ASBR8' is mentioned on line 650, but not defined == Missing Reference: 'PE5' is mentioned on line 650, but not defined == Missing Reference: 'N-ABR1' is mentioned on line 797, but not defined == Missing Reference: 'N-ABR2' is mentioned on line 797, but not defined == Unused Reference: 'I-D.ietf-idr-segment-routing-te-policy' is defined on line 845, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-mpls-interas-lspping' is defined on line 888, but no explicit reference was found in the text == Outdated reference: A later version (-26) exists of draft-ietf-idr-segment-routing-te-policy-11 ** Downref: Normative reference to an Informational draft: draft-ietf-spring-segment-routing-central-epe (ref. 'I-D.ietf-spring-segment-routing-central-epe') ** Obsolete normative reference: RFC 4379 (Obsoleted by RFC 8029) == Outdated reference: A later version (-22) exists of draft-ietf-spring-segment-routing-policy-11 -- Obsolete informational reference (is this intentional?): RFC 3107 (Obsoleted by RFC 8277) Summary: 3 errors (**), 0 flaws (~~), 20 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Routing area S. Hegde 3 Internet-Draft K. Arora 4 Intended status: Standards Track M. Srivastava 5 Expires: December 13, 2021 Juniper Networks Inc. 6 S. Ninan 7 Individual Contributor 8 N. Kumar 9 Cisco Systems, Inc. 10 June 11, 2021 12 PMS/Head-end based MPLS Ping and Traceroute in Inter-domain SR Networks 13 draft-ninan-mpls-spring-inter-domain-oam-03 15 Abstract 17 Segment Routing (SR) architecture leverages source routing and 18 tunneling paradigms and can be directly applied to the use of a 19 Multiprotocol Label Switching (MPLS) data plane. 
A network may 20 consist of multiple IGP domains or multiple ASes under the control of 21 the same organization. It is useful to have the LSP Ping and traceroute 22 procedures when an SR end-to-end path spans multiple ASes or 23 domains. This document describes mechanisms to facilitate LSP ping 24 and traceroute in inter-AS/inter-domain SR networks in an efficient 25 manner with a simple OAM protocol extension that uses dataplane 26 forwarding alone for sending the echo reply. 28 Requirements Language 30 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 31 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 32 document are to be interpreted as described in RFC 2119 [RFC2119]. 34 Status of This Memo 36 This Internet-Draft is submitted in full conformance with the 37 provisions of BCP 78 and BCP 79. 39 Internet-Drafts are working documents of the Internet Engineering 40 Task Force (IETF). Note that other groups may also distribute 41 working documents as Internet-Drafts. The list of current Internet- 42 Drafts is at https://datatracker.ietf.org/drafts/current/. 44 Internet-Drafts are draft documents valid for a maximum of six months 45 and may be updated, replaced, or obsoleted by other documents at any 46 time. It is inappropriate to use Internet-Drafts as reference 47 material or to cite them other than as "work in progress." 48 This Internet-Draft will expire on December 13, 2021. 50 Copyright Notice 52 Copyright (c) 2021 IETF Trust and the persons identified as the 53 document authors. All rights reserved. 55 This document is subject to BCP 78 and the IETF Trust's Legal 56 Provisions Relating to IETF Documents 57 (https://trustee.ietf.org/license-info) in effect on the date of 58 publication of this document. Please review these documents 59 carefully, as they describe your rights and restrictions with respect 60 to this document.
Code Components extracted from this document must 61 include Simplified BSD License text as described in Section 4.e of 62 the Trust Legal Provisions and are provided without warranty as 63 described in the Simplified BSD License. 65 Table of Contents 67 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 68 1.1. Definition of Domain . . . . . . . . . . . . . . . . . . 4 69 2. Inter domain networks with multiple IGPs . . . . . . . . . . 5 70 3. Return Path TLV . . . . . . . . . . . . . . . . . . . . . . . 5 71 4. Segment sub-TLV . . . . . . . . . . . . . . . . . . . . . . . 6 72 4.1. Type 1: SID only, in the form of MPLS Label . . . . . . . 6 73 4.2. Type 3: IPv4 Node Address with optional SID for SR-MPLS . 7 74 4.3. Type 4: IPv6 Node Address with optional SID for SR MPLS . 9 75 4.4. Segment Flags . . . . . . . . . . . . . . . . . . . . . . 10 76 5. SRv6 Dataplane . . . . . . . . . . . . . . . . . . . . . . . 10 77 6. Detailed Procedures . . . . . . . . . . . . . . . . . . . . . 10 78 6.1. Sending an echo request . . . . . . . . . . . . . . . . . 10 79 6.2. Receiving an echo request . . . . . . . . . . . . . . . . 11 80 6.3. Sending an echo reply . . . . . . . . . . . . . . . . . . 11 81 6.4. Receiving an echo reply . . . . . . . . . . . . . . . . . 12 82 7. Detailed Example . . . . . . . . . . . . . . . . . . . . . . 12 83 7.1. Procedures for Segment Routing LSP ping . . . . . . . . . 12 84 7.2. Procedures for Segment Routing LSP Traceroute . . . . . . 13 85 8. Building Return Path TLV dynamically . . . . . . . . . . . . 15 86 8.1. The procedures to build the return path . . . . . . . . . 15 87 8.2. Details with example . . . . . . . . . . . . . . . . . . 17 88 9. Security Considerations . . . . . . . . . . . . . . . . . . . 18 89 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 90 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 18 91 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 18 92 13. 
References . . . . . . . . . . . . . . . . . . . . . . . . . 19 93 13.1. Normative References . . . . . . . . . . . . . . . . . . 19 94 13.2. Informative References . . . . . . . . . . . . . . . . . 19 95 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 21 97 1. Introduction 99 +----------------+ 100 | Controller/PMS | 101 +----------------+ 103 |---AS1-----| |------AS2------| |----AS3---| 105 ASBR2----ASBR3 ASBR5------ASBR7 106 / \ / \ 107 / \ / \ 108 PE1----P1---P2 P3---P4---PE4 P5---P6--PE5 109 \ / \ / 110 \ / \ / 111 ASBR1----ASBR4 ASBR6------ASBR8 113 Figure 1: Inter-AS Segment Routing topology 115 Many network deployments have built their networks consisting of 116 multiple Autonomous Systems either for ease of operations or as a 117 result of network mergers and acquisitions. Segment Routing can be 118 deployed in such scenarios to provide end-to-end paths traversing 119 multiple Autonomous Systems (ASes). These paths consist of Segment 120 Identifiers (SIDs) of different types as per [RFC8402]. 122 [RFC8660] specifies the forwarding plane behaviour to allow Segment 123 Routing to operate on top of the MPLS data plane. 124 [I-D.ietf-spring-segment-routing-central-epe] describes BGP peering 125 SIDs, which help in steering packets from one Autonomous System 126 to another. Using the above SR capabilities, paths that span 127 multiple Autonomous Systems can be created. 129 For example, Figure 1 describes an inter-AS network scenario 130 consisting of ASes AS1 and AS2. Both AS1 and AS2 are Segment Routing 131 enabled and the EPE links have EPE labels configured and advertised 132 via [I-D.ietf-idr-bgpls-segment-routing-epe]. A controller or head-end 133 can build an end-to-end Traffic-Engineered path consisting of Node-SIDs, 134 Adjacency-SIDs and EPE-SIDs. It is advantageous for operations to be 135 able to perform LSP ping and traceroute procedures on these inter-AS 136 SR paths.
LSP ping/traceroute procedures use IP connectivity for 137 the echo reply to reach the head-end. In inter-AS networks, IP 138 connectivity may not be available from each router in the path. For 139 example, in Figure 1, P3 and P4 may not have IP connectivity to PE1. 141 [RFC8403] describes mechanisms to carry out the MPLS ping/traceroute 142 from a PMS. It is possible to build GRE tunnels or static routes to 143 each router in the network to get IP connectivity for the reverse 144 path. This mechanism is operationally very heavy and requires the PMS to 145 be capable of building a huge number of GRE tunnels, which may not be 146 feasible. 148 It is not possible to carry out LSP ping and traceroute functionality 149 on these paths to verify basic connectivity and fault isolation using 150 the existing LSP ping and traceroute mechanisms ([RFC8287] and [RFC8029]). 151 This is because there is no IP connectivity from the destination of the 152 ping/traceroute to the source address of the ping packet, which is in a 153 different AS. 155 [RFC7743] describes an Echo-relay based solution that advertises 156 a new Relay Node Address Stack TLV containing a stack of Echo-relay IP 157 addresses. These mechanisms can be applied to segment routing 158 networks as well. The [RFC7743] mechanism requires the return ping 159 packet to reach the control plane on every relay node. The 160 motivation of the current document is to provide an alternate 161 mechanism for ping/traceroute in inter-domain segment routing 162 networks. 164 This document describes a new mechanism which is efficient and simple 165 and can be easily deployed in SR networks. This mechanism uses an MPLS 166 path for the reply packet and does not require the reply packet to 167 visit the control plane as in [RFC7743]. It simplifies the operations to 168 a greater extent in SR networks. The current draft describes a 169 mechanism that uses the Return path TLV [RFC7110] to convey the reverse 170 path.
Three new sub-TLVs for the Return path TLV are defined, which 171 facilitate encoding a segment routing label stack. The TLV can be 172 derived either by a smart application or by a controller that has a full 173 topology view. This document also proposes mechanisms to derive the 174 Return path dynamically during traceroute procedures. 176 1.1. Definition of Domain 178 The term domain used in this document implies an IGP domain where 179 every node is visible to every other node for the purposes of 180 shortest path computation. The domain implies an IGP area or level. 181 This document is applicable to SR networks where all nodes in each of 182 the domains are SR capable. It is also applicable to SR networks 183 where SR acts as an overlay having SR incapable underlay nodes. In 184 such networks, the traceroute procedure is executed only on the 185 overlay SR nodes. 187 2. Inter domain networks with multiple IGPs 189 |-Domain 1|-------Domain 2-----|--Domain 3-| 191 PE1------ABR1--------P--------ABR2------PE4 192 \ / \ /\ / 193 -------- ----------------- ------- 194 BGP-LU BGP-LU BGP-LU 196 Figure 2: Inter-domain networks with multiple IGPs 198 When the network consists of a large number of nodes, the nodes are 199 segregated into multiple IGP domains. The connectivity to the 200 remote PEs can be achieved using BGP-LU [RFC3107] or by stacking the 201 labels for each domain as described in [RFC8604]. It is useful to 202 support MPLS ping and traceroute mechanisms for these networks. The 203 procedures described in this document for constructing the Return path 204 TLV and its use in the echo reply are equally applicable to networks 205 consisting of multiple IGP domains that use BGP-LU or label stacking. 207 3. Return Path TLV 209 Segment Routing networks statically assign the labels to nodes and 210 PMS/Head-end may know the entire database. The reverse path can be 211 built from PMS/Head-end by stacking segments for the reverse path.
212 Return path TLV as defined in [RFC7110] is used to carry the return 213 path. While using the procedures described in this document, the 214 reply mode MUST be set to 5 and Return Path TLV MUST be included in 215 the echo request message. The procedures described in [RFC7110] are 216 applicable for constructing the Return Path TLV. This document 217 defines three new sub-TLVs to encode the Segment Routing path. 219 The type of segment that the head-end chooses to send in the Return 220 Path TLV is governed by local policy. Implementations may provide 221 CLI input parameters as labels, IPv4 addresses, IPv6 addresses or a 222 combination of these, which get encoded in the Return Path TLV. 223 Implementations may also provide mechanisms to acquire the database 224 of remote domains and compute the return path based on the acquired 225 database. For traceroute purposes, the return path will have to 226 consider the reply being sent from every node along the path. The 227 return path changes as the traceroute progresses and crosses each 228 domain. For traceroute purposes, the headend/PMS needs to acquire the 229 entire database or use a dynamically computed return path as described 230 in Section 8. 231 Some networks may consist of pure IPv4 domains and pure IPv6 domains. 232 Handling end-to-end MPLS OAM for such networks is out of scope for 233 this document. It is recommended to use dual stack in such cases and 234 use end-to-end IPv6 addresses for MPLS ping and traceroute 235 procedures. 237 4. Segment sub-TLV 239 [I-D.ietf-spring-segment-routing-policy] defines various types of 240 segments. The segments applicable to this document have been re- 241 defined here. One or more segment sub-TLVs can be included in the 242 Return Path TLV. The segment sub-TLVs included in a Return Path TLV 243 MAY be of different types. 245 The following types of segment sub-TLVs are applicable to the Reverse Path 246 Segment List TLV.
248 Type 1: SID only, in the form of MPLS Label 250 Type 3: IPv4 Node Address with optional SID for SR-MPLS 252 Type 4: IPv6 Node Address with optional SID for SR MPLS 254 4.1. Type 1: SID only, in the form of MPLS Label 256 The Type-1 Segment Sub-TLV encodes a single SID in the form of an 257 MPLS label. The format is as follows: 259 0 1 2 3 260 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 261 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 262 | Type | Length | 263 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 264 | Flags | RESERVED | 265 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 266 | Label | TC |S| TTL | 267 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 269 Figure 3: Type 1 Segment sub-TLV 271 where: 273 Type: TBD1 (to be assigned by IANA from the registry "Sub-TLV Target 274 FEC stack TLV"). 276 Length is 8. 278 Flags: 1 octet of flags as defined in Section 4.4. 280 RESERVED: 3 octets of reserved bits. SHOULD be unset on transmission 281 and MUST be ignored on receipt. 283 Label: 20 bits of label value. 285 TC: 3 bits of traffic class. 287 S: 1 bit of bottom-of-stack. 289 TTL: 1 octet of TTL. 291 The following applies to the Type-1 Segment sub-TLV: 293 The S bit SHOULD be zero upon transmission, and MUST be ignored upon 294 reception. 296 If the originator wants the receiver to choose the TC value, it sets 297 the TC field to zero. 299 If the originator wants the receiver to choose the TTL value, it sets 300 the TTL field to 255. 302 If the originator wants to recommend a value for these fields, it 303 puts those values in the TC and/or TTL fields. 305 The receiver MAY override the originator's values for these fields. 306 This would be determined by local policy at the receiver. One 307 possible policy would be to override the fields only if the fields 308 have the default values specified above. 310 4.2.
Type 3: IPv4 Node Address with optional SID for SR-MPLS 312 The Type-3 Segment Sub-TLV encodes an IPv4 node address, SR Algorithm 313 and an optional SID in the form of an MPLS label. The format is as 314 follows: 316 0 1 2 3 317 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 318 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 319 | Type | Length | 320 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 321 | Flags | RESERVED | SR Algorithm | 322 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 323 | IPv4 Node Address (4 octets) | 324 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 325 | SID (optional, 4 octets) | 326 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 328 Figure 4: Type 3 Segment sub-TLV 330 where: 332 Type: TBD3 (to be assigned by IANA from the registry "Sub-TLV Target 333 FEC stack TLV"). 335 Length is 8 or 12. 337 Flags: 1 octet of flags as defined in Section 4.4. 339 SR Algorithm: 1 octet specifying SR Algorithm as described in section 340 3.1.1 in [RFC8402], when the A-Flag as defined in Section 4.4 is 341 present. SR Algorithm is used by the receiver to derive the label. 342 When the A-Flag is not set, this field SHOULD be unset on 343 transmission and MUST be ignored on receipt. 345 RESERVED: 2 octets of reserved bits. SHOULD be unset on transmission 346 and MUST be ignored on receipt. 348 IPv4 Node Address: a 4 octet IPv4 address representing a node. 350 SID: 4 octet MPLS label. 352 The following applies to the Type-3 Segment sub-TLV: 354 The IPv4 Node Address MUST be present. 356 The SID is optional and specifies a 4 octet MPLS SID containing 357 label, TC, S and TTL as defined in Section 4.1. 359 If length is 8, then only the IPv4 Node Address is present. 361 If length is 12, then the IPv4 Node Address and the MPLS SID are 362 present. 364 4.3.
Type 4: IPv6 Node Address with optional SID for SR MPLS 366 The Type-4 Segment Sub-TLV encodes an IPv6 node address, SR Algorithm 367 and an optional SID in the form of an MPLS label. The format is as 368 follows: 370 0 1 2 3 371 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 372 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 373 | Type | Length | 374 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 375 | Flags | RESERVED | SR Algorithm | 376 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 377 // IPv6 Node Address (16 octets) // 378 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 379 | SID (optional, 4 octets) | 380 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 382 Figure 5: Type 4 Segment sub-TLV 384 where: 386 Type: TBD4 (to be assigned by IANA from the registry "Sub-TLV Target 387 FEC stack TLV"). 389 Length is 20 or 24. 391 Flags: 1 octet of flags as defined in Section 4.4. 393 SR Algorithm: 1 octet specifying SR Algorithm as described in section 394 3.1.1 in [RFC8402], when the A-Flag as defined in Section 4.4 is 395 present. SR Algorithm is used by the receiver to derive the label. 396 When the A-Flag is not set, this field SHOULD be unset on 397 transmission and MUST be ignored on receipt. 399 RESERVED: 2 octets of reserved bits. SHOULD be unset on transmission 400 and MUST be ignored on receipt. 402 IPv6 Node Address: a 16 octet IPv6 address representing a node. 404 SID: 4 octet MPLS label. 406 The following applies to the Type-4 Segment sub-TLV: 408 The IPv6 Node Address MUST be present. 410 The SID is optional and specifies a 4 octet MPLS SID containing 411 label, TC, S and TTL as defined in Section 4.1. 413 If length is 20, then only the IPv6 Node Address is present. 415 If length is 24, then the IPv6 Node Address and the MPLS SID are 416 present. 418 4.4.
Segment Flags 420 The Segment Types described above MAY contain the following flags in the 421 "Flags" field (codes to be assigned by IANA from the registry "Return 422 path sub-TLV Flags"). 424 0 1 2 3 4 5 6 7 425 +-+-+-+-+-+-+-+-+ 426 | |A| | 427 +-+-+-+-+-+-+-+-+ 429 Figure 6: Flags 431 where: 433 A-Flag: This flag indicates the presence of an SR Algorithm id in the 434 "SR Algorithm" field applicable to various Segment Types. 436 Unused bits in the Flag octet SHOULD be set to zero upon transmission 437 and MUST be ignored upon receipt. 439 The following applies to the Segment Flags: 441 The A-Flag is applicable to Segment Types 3 and 4. If the A-Flag appears with 442 any other Segment Type, it MUST be ignored. 444 5. SRv6 Dataplane 446 The SRv6 dataplane is not in the scope of this document and will be 447 addressed in a separate document. 449 6. Detailed Procedures 451 6.1. Sending an echo request 453 In the inter-AS scenario when there is no reverse path connectivity, 454 the procedures described in this document should be used. The LSP ping 455 initiator MUST set the Reply Mode of the echo request to "Reply via 456 Specified Path", and a Reply Path TLV MUST be carried in the echo 457 request message correspondingly. The Return Path TLV must contain 458 the Segment Routing Path in the reverse direction encoded as an 459 ordered list of segments. The first Segment MUST correspond to the 460 top Segment in the MPLS header that the responder MUST use while sending 461 the echo reply. 463 6.2. Receiving an echo request 465 As described in [RFC7110], when Reply mode is set to 5 (Reply via 466 Specified Path), the echo request MUST contain the Return path TLV. 467 Absence of the Return path TLV is treated as a malformed echo request.
When 468 an echo request is received, if the egress LSR does not know the 469 Reply Mode 5 defined in [RFC7110], an echo reply with the return code 470 set to "Malformed echo request received" and the Subcode set to zero 471 will be sent back to the ingress LSR according to the rules of 472 [RFC4379]. When a Return Path TLV is received and the responder 473 supports processing it, the responder MUST use the segments in the Return Path 474 TLV to build the echo reply. The responder MUST follow the normal FEC 475 validation procedures as described in [RFC8029] and [RFC8287]; 476 this document does not suggest any change to those procedures. When 477 the echo reply has to be sent out, the Return Path TLV is used to 478 construct the MPLS packet to send out. 480 6.3. Sending an echo reply 482 The echo reply message is sent as an MPLS packet with an MPLS label 483 stack. The echo reply message MUST be constructed as described in 484 [RFC8029]. An MPLS packet is constructed with the echo reply in the 485 payload. The top label MUST be constructed from the first Segment 486 from the Return Path TLV. The remaining labels MUST follow the order 487 from the Return Path TLV. The responder MAY check the reachability 488 of the top label in its own LFIB before sending the echo reply. In 489 certain scenarios the head-end may choose to send Type 3/Type 4 490 segments consisting of an IPv4 address or IPv6 address. Optionally a 491 SID may also be associated with the Type 3/Type 4 segment. In such cases 492 the node sending the echo reply MUST derive the MPLS labels based on 493 Node-SIDs associated with the IPv4/IPv6 addresses or from the 494 optional MPLS SIDs in the Type 3/Type 4 segments and encode the echo 495 reply with MPLS labels. 497 The reply path return code MUST be set as described in section 7.4 of 498 [RFC7110].
The Return Path TLV MUST be included in the echo reply 499 indicating the specified return path that the echo reply message is 500 required to follow as described in section 5.3 of [RFC7110]. 502 When the node is configured to dynamically create the return path for 503 the next echo request, the procedures described in Section 8 MUST be 504 used. The reply path return code MUST be set to 6 and the same Return 505 Path TLV or a new Return Path TLV MUST be included in the echo reply. 507 6.4. Receiving an echo reply 509 The rules and process defined in Section 4.6 of [RFC4379] and section 510 5.4 of [RFC7110] apply here. In addition, if the Return Path Reply 511 code is "Use Return Path TLV in echo reply for next echo request", 512 the Return Path TLV from the echo reply MUST be sent in the next echo 513 request with TTL incremented by 1. 515 7. Detailed Example 517 The example topologies given in Figure 1 and Figure 2 are used in 518 the sections below to explain LSP Ping and Traceroute procedures. The 519 PMS/Head-end has a complete view of the topology. PE1, P1, P2, ASBR1 and 520 ASBR2 are in AS1. Similarly, ASBR3, ASBR4, P3, P4 and PE4 are in AS2. 522 AS1 and AS2 have Segment Routing enabled. IGPs like OSPF/ISIS are 523 used to flood SIDs in each Autonomous System. ASBR1, ASBR2, 524 ASBR3 and ASBR4 advertise BGP EPE SIDs for the inter-AS links. The topology 525 of AS1 and AS2 is advertised via BGP-LS to the controller/PMS or 526 Head-end node. The EPE-SIDs are also advertised via BGP-LS as 527 described in [I-D.ietf-idr-bgpls-segment-routing-epe]. 529 The description in the document uses the notations below for Segment 530 Identifiers (SIDs). 532 Node SIDs : N-PE1, N-P1, N-ASBR1, N-ABR1, N-ABR2 etc. 534 Adjacency SIDs : Adj-PE1-P1, Adj-P1-P2 etc. 536 EPE SIDs : EPE-ASBR2-ASBR3, EPE-ASBR1-ASBR4, EPE-ASBR3-ASBR2 etc. 538 Let us consider a traffic engineered path built from PE1 to PE4 with 539 the Segment List stack N-P1, N-ASBR1, EPE-ASBR1-ASBR4, N-PE4 540 for the following procedures.
This stack may be programmed by 541 the controller/PMS, or the Head-end router PE1 may have imported the whole 542 topology information from BGP-LS and computed the inter-AS path. 544 7.1. Procedures for Segment Routing LSP ping 546 To perform the LSP ping procedure on an SR path from PE1 to PE4 547 consisting of the label stack [N-P1,N-ASBR1,EPE-ASBR1-ASBR4, N-PE4], the 548 remote end (PE4) needs IP connectivity to the head-end (PE1) for the 549 Segment Routing ping to succeed, because the echo reply needs to travel 550 back to PE1 from PE4. In a typical deployment scenario, however, there will 551 be no IP route from PE4 to PE1 as they belong to different ASes. 553 PE1 adds the Return Path from PE4 to PE1 in the MPLS echo request using 554 multiple Segments in the "Return Path TLV" as defined above. An example 555 return path TLV for PE1 to PE4 for LSP ping is [N-ASBR4, EPE- 556 ASBR4-ASBR1, N-PE1]. An implementation may also build a Return Path 557 consisting of labels to reach its own AS. Once the label stack is 558 popped off, the echo reply message is exposed, and further 559 packet forwarding is based on IP lookup. An example Return Path 560 for this case could be [N-ASBR4, EPE-ASBR4-ASBR1]. 562 On receiving the MPLS echo request, PE4 first validates the FEC in the echo 563 request. PE4 then builds the label stack to send the response from PE4 564 to PE1 by copying the labels from the "Return Path TLV". PE4 builds the 565 echo reply packet with the MPLS label stack constructed, imposes 566 MPLS headers on top of the echo reply packet and sends out the packet 567 towards PE1. This Segment List stack can successfully steer the reply 568 back to the Head-end node (PE1). 570 7.2. Procedures for Segment Routing LSP Traceroute 572 The traceroute procedure involves visiting every node on the path, with an 573 echo reply sent from every node. In this section, we describe the 574 traceroute mechanisms when the headend/PMS has complete visibility of 575 the database.
The headend/PMS computes the return path from each node in 576 the entire SR-MPLS path that is being tracerouted. The return path 577 computation is implementation dependent. As the headend/PMS 578 completely controls the return path, it can use proprietary 579 computations to build the return path. 581 One of the ways the return path can be built is to use the principle 582 of building label stacks by adding each domain border node's Node SID 583 on the return path label stack as the traceroute progresses. For 584 inter-AS networks, in addition to the border node's Node-SID, the EPE-SID in 585 the reverse direction also needs to be added to the label stack. 587 The inter-domain/inter-AS traceroute procedure uses the TTL expiry 588 mechanism as specified in [RFC8029] and [RFC8287]. In every echo 589 request packet, the headend/PMS MUST include the appropriate return path 590 in the Return Path TLV. The node that receives the echo request MUST 591 follow the procedures described in Section 6.1 and 592 Section 6.2 to send out the echo reply. 594 For example: 596 Let us consider the topology from Figure 1 and an SR path 597 [N-P1,N-ASBR1,EPE-ASBR1-ASBR4, N-PE4]. The traceroute is being 598 executed for this inter-AS path for destination PE4. PE1 sends the first 599 echo request with TTL set to 1 and includes a return path TLV 600 consisting of a Type 1 Segment containing a label derived from its own 601 SRGB. Note that the type of segment used in constructing the return 602 path is a matter of local policy. If the entire network has the same SRGB 603 configured, Type 1 segments can be used. The TTL expires on P1 and 604 P1 sends the echo reply using the return path. Note that implementations 605 may choose to exclude the return path TLV until the traceroute reaches the 606 first domain border, as the return IP path to PE1 is expected to be 607 available inside the first domain. 609 TTL is set to 2 and the next echo request is sent out.
Until the 610 traceroute procedure reaches the domain border node ASBR1, the same 611 return path TLV consisting of a single label (PE1's node label) is used. 612 When the echo request reaches ASBR1 and the echo reply is received, the next 613 echo request needs to include an additional label as ASBR1 is a border 614 node. The return path TLV is built based on the forward path. As 615 the forward path consists of EPE-ASBR1-ASBR4, an EPE-SID in the 616 reverse direction is included in the return path TLV. The return 617 path now consists of two labels [N-PE1, EPE-ASBR4-ASBR1]. The echo 618 reply from ASBR4 will use this return path to send the reply. 620 The next echo request after visiting the border node ASBR4 will 621 update the return path with the Node-SID label of ASBR4. The return path 622 beyond ASBR4 will be [N-PE1, EPE-ASBR4-ASBR1, N-ASBR4]. This same 623 return path is used until the traceroute procedure reaches the next set 624 of border nodes. When there are multiple ASes, the traceroute 625 procedure will continue by adding a set of Node labels and EPE labels 626 as the border nodes are visited. 628 Note that the above return path building procedure requires the 629 database of all the domains to be available at the headend/PMS. 631 The above description assumed the same SRGB is configured on all 632 nodes along the path. The SRGB may differ from one node to another 633 and the SR architecture [RFC8402] allows the nodes to use 634 different SRGBs. In such scenarios PE1 sends a Type 3 (or Type 4 in 635 case of IPv6 networks) segment with the Node address of PE1 and with 636 the optional MPLS SID associated with the Node address. The receiving 637 node derives the label for the return path based on its own SRGB. 638 When the traceroute procedure crosses the border ASBR1, headend PE1 639 should send a Type 1 segment for N-PE1 based on the label derived from 640 ASBR1's SRGB.
   This is required because in AS2, nodes such as ASBR4, P3, and P4
   may not have the topology information to derive the SRGB for PE1.
   After the traceroute procedure reaches ASBR4, the return path will
   be [N-PE1 (Type 1 with label based on ASBR1's SRGB),
   EPE-ASBR4-ASBR1, N-ASBR4 (Type 3)].

   To extend the example to three or more ASes, consider a traceroute
   from PE1 to PE5 in Figure 1.  In this example, the PE1 to PE5 path
   has to cross three domains: AS1, AS2, and AS3.  Consider a path
   from PE1 to PE5 that goes through [PE1, ASBR1, ASBR4, ASBR6,
   ASBR8, PE5].  When the traceroute procedure is visiting the nodes
   in AS1, the Return Path TLV sent from the headend consists of
   [N-PE1].  When the traceroute procedure reaches ASBR4, the return
   path consists of [N-PE1, EPE-ASBR4-ASBR1].  While visiting nodes
   in AS2, the echo request carries the Return Path TLV [N-PE1,
   EPE-ASBR4-ASBR1, N-ASBR4].  Similarly, while visiting ASBR8, the
   Return Path TLV adds the EPE-SID from ASBR8 to ASBR6.  While
   visiting nodes in AS3, the Node-SID of ASBR8 is also added, which
   makes the return path [N-PE1, EPE-ASBR4-ASBR1, N-ASBR4,
   EPE-ASBR8-ASBR6, N-ASBR8].

   Consider another example using the topology in Figure 2.  This
   topology consists of a multi-domain IGP with a common border node
   between the domains.  This could be achieved with a multi-area or
   multi-level IGP, or with multiple instances of an IGP deployed on
   the same node.  The return path computation for this topology is
   similar to the multi-AS computation, except that the return path
   consists of a single border node label.  When the traceroute
   procedure is visiting node P, the return path consists of [N-PE1,
   N-ABR1].

8.  Building Return Path TLV dynamically

   In some cases, the head-end may not have complete visibility of
   the inter-AS/inter-domain topology.
   In such cases, it can rely on downstream routers to build the
   reverse path for MPLS traceroute procedures.  For this purpose, a
   new reply path return code is defined, which indicates that the
   Return Path TLV in the echo reply corresponds to the return path
   to be used in the next echo request.

      Value    Meaning
      ------   ----------------------
      0x0006   Use Return Path TLV in echo reply for next echo request.
      (TBA by IANA)

                        Figure 7: Return Code

8.1.  The procedures to build the return path

   In order to dynamically build the return path for traceroute
   procedures, the domain border nodes along the path being
   tracerouted MUST support the procedures described in this section.
   Local policy on the domain border nodes SHOULD determine whether
   the domain border node participates in building the return path
   dynamically during traceroute.

   The headend/PMS node MAY include its own node label while
   initiating the traceroute procedure.  When an ABR receives the
   echo request, if local policy calls for building a dynamic return
   path, the ABR MUST include its own node label.  If there is a
   Return Path TLV included in the received echo request message, the
   ABR's node label is added before the existing segments.  The type
   of segment added is based on local policy.  In cases where the
   SRGB is not uniform across the network, it is RECOMMENDED to add a
   Type 3 or Type 4 segment.  If the existing segment in the Return
   Path TLV is a Type 3/Type 4 segment, that segment MUST be
   converted to a Type 1 segment based on the ABR's own SRGB.  This
   is because downstream nodes will not know which SRGB to use to
   translate the IP address to a label.  As the ABR added its own
   node label, it is guaranteed that this ABR will be on the return
   path and will forward the traffic based on the next label after
   its own label.
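
   The ABR behavior described above can be illustrated with a minimal
   sketch.  It is not part of the specification.  The Segment class,
   the function names, the address-to-SID-index table, and the
   uniform_srgb flag are all hypothetical, and the sketch assumes
   that a Type 1 label is computed as SRGB base plus SID index.  The
   segment list is held top-of-stack first, which is the reverse of
   the bracket notation used in the examples of this document.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Segment type numbers follow the naming used in this document.
TYPE_LABEL = 1       # SID only, in the form of an MPLS label
TYPE_IPV4_NODE = 3   # IPv4 node address with optional SID
TYPE_IPV6_NODE = 4   # IPv6 node address with optional SID


@dataclass(frozen=True)
class Segment:
    """Hypothetical in-memory form of one Return Path segment."""
    seg_type: int
    label: Optional[int] = None
    address: Optional[str] = None


def to_type1(seg: Segment, srgb_base: int,
             sid_index: Dict[str, int]) -> Segment:
    """Convert a Type 3/4 segment to Type 1 using the local SRGB."""
    if seg.seg_type == TYPE_LABEL:
        return seg
    # Assumption: label = local SRGB base + SID index of the node.
    return Segment(TYPE_LABEL, label=srgb_base + sid_index[seg.address])


def abr_update_return_path(received: List[Segment], own_address: str,
                           srgb_base: int, sid_index: Dict[str, int],
                           uniform_srgb: bool) -> List[Segment]:
    """Return the segment list the ABR places in its echo reply.

    Index 0 is the top of the stack (sketch convention).
    """
    # Existing address-based segments become labels from this ABR's SRGB,
    # since nodes further downstream may not know which SRGB to apply.
    existing = [to_type1(s, srgb_base, sid_index) for s in received]
    if uniform_srgb:
        own = Segment(TYPE_LABEL,
                      label=srgb_base + sid_index[own_address])
    else:
        # Type 3 is used here; an IPv6 network would use Type 4.
        own = Segment(TYPE_IPV4_NODE, address=own_address)
    # The ABR's own segment goes on top of the existing segments.
    return [own] + existing


# Example with made-up addresses, SID indexes, and SRGB base.
sids = {"192.0.2.1": 1, "192.0.2.11": 11}          # PE1, ABR1
from_pe1 = [Segment(TYPE_IPV4_NODE, address="192.0.2.1")]
reply_tlv = abr_update_return_path(from_pe1, "192.0.2.11", 16000,
                                   sids, uniform_srgb=False)
```

   In this example, reply_tlv holds a Type 3 segment for ABR1 on top
   of a Type 1 segment for PE1 whose label comes from ABR1's SRGB.
   The optional SID carried in a Type 3/Type 4 segment is ignored in
   this sketch.
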

   When an ASBR receives an echo request from another AS and the ASBR
   is configured to build the return path dynamically, the ASBR MUST
   build a Return Path TLV and include it in the echo reply.  The
   Return Path TLV MUST consist of its own node label and an EPE-SID
   to the AS from which the traceroute message was received.  A reply
   path return code of 6 MUST be set in the echo reply to indicate
   that the next echo request should use the return path from the
   Return Path TLV in the echo reply.  The ASBR MUST locally decide
   the outgoing interface for the echo reply packet.  Generally, the
   remote ASBR will choose the interface on which the incoming OAM
   packet was received to send the echo reply out.  The Return Path
   TLV is built by adding two segment sub-TLVs.  The top segment
   sub-TLV consists of the ASBR's Node-SID, and the second segment
   consists of the EPE-SID in the reverse direction to reach the AS
   from which the OAM packet was received.  The type of segment
   chosen to build the Return Path TLV is a local policy.  It is
   RECOMMENDED to use a Type 3/Type 4 segment for the top segment
   when the SRGB is not guaranteed to be uniform in the domain.

   Irrespective of which type of segment is included in the Return
   Path TLV, the responder to the echo request MUST always translate
   the Return Path TLV to a label stack and build the MPLS header for
   the echo reply packet.  This procedure can be applied to an
   end-to-end path consisting of multiple ASes.  Each ASBR that
   receives an echo request from another AS adds its Node-SID and
   EPE-SID on top of the existing segments in the Return Path TLV.

   An ASBR that receives the echo request from a neighbor belonging
   to the same AS MUST look at the Return Path TLV received in the
   echo request.  If the Return Path TLV consists of a Type 3/Type 4
   segment, it MUST convert the Type 3/4 segment to a Type 1 segment
   by deriving the label from its own SRGB.
   The ASBR MUST set the reply path return code to 6 and send the
   newly constructed Return Path TLV in the echo reply.

   Internal nodes, that is, nodes that are not domain border nodes,
   MAY omit setting the reply path return code to 6 (TBA by IANA) in
   the echo reply message, as there is no change in the return path.
   In these cases, the headend node/PMS that initiates the traceroute
   procedure MUST continue to send the previously sent Return Path
   TLV in every subsequent echo request.

8.2.  Details with example

   Consider the topology in Figure 1 and an SR policy path built from
   PE1 to PE4 with the label stack [N-P1, N-ASBR1, EPE-ASBR1-ASBR4,
   N-PE4].  PE1 begins the traceroute with TTL set to 1 and includes
   [N-PE1] in the Return Path TLV.  The traceroute packet TTL expires
   on P1, and P1 processes the traceroute as per the procedures
   described in [RFC8029] and [RFC8287].  P1 sends an echo reply with
   the same Return Path TLV and with the reply path return code set
   to 6.  The return code of the echo reply itself is set as per
   [RFC8029] and [RFC8287].  This traceroute does not need any
   changes to the Return Path TLV until it leaves AS1.  P1 and P2 may
   either include the received Return Path TLV unchanged in the echo
   reply or include no Return Path TLV, so that the headend continues
   to use the same return path in the echo request that it used in
   the previous echo request.

   When ASBR1 receives the echo request, if it received a Type 3/
   Type 4 segment in the Return Path TLV in the echo request, it
   converts that Type 3/4 segment to Type 1 based on its own SRGB.
   When ASBR4 receives the echo request, it should form the Return
   Path TLV using its own Node-SID (N-ASBR4) and EPE-SID
   (EPE-ASBR4-ASBR1) labels and set the reply path return code to 6.
   PE1 should then use this Return Path TLV in subsequent echo
   requests.
   In this example, when the subsequent echo request reaches P3, P3
   should use this Return Path TLV for sending the echo reply.  The
   same Return Path TLV is sufficient for any router in AS2 to send
   the reply, because the first label (N-ASBR4) directs the echo
   reply to ASBR4 and the second label (EPE-ASBR4-ASBR1) directs the
   echo reply to AS1.  Once the echo reply reaches AS1, normal IP
   forwarding or the N-PE1 label helps it reach PE1.

   The example described in the above paragraphs can be extended to
   multiple ASes by following the same procedure: each ASBR adds its
   Node-SID and EPE-SID on receiving an echo request from a
   neighboring AS.

   Consider the topology in Figure 2.  It consists of multiple IGP
   domains with multiple areas/levels or separate IGP instances.
   There is a single border node that separates the two domains.  In
   this case, PE1 sends a traceroute packet with TTL set to 1 and
   includes N-PE1 in the Return Path TLV.  ABR1 receives the echo
   request and, while sending the echo reply, adds its own node label
   to the Return Path TLV and sets the reply path return code to 6.
   The Return Path TLV in the echo reply from ABR1 consists of
   [N-PE1, N-ABR1].  The next echo request, with TTL 2, reaches node
   P.  It is an internal node, so it does not change the return path.
   The echo request with TTL 3 reaches ABR2, which adds its own node
   label, so the Return Path TLV sent in the echo reply will be
   [N-PE1, N-ABR1, N-ABR2].  The echo request with TTL 4 reaches PE4,
   which sends an echo reply with the return code indicating that it
   is the egress.  PE4 does not include any Return Path TLV in the
   echo reply.  The above example assumes a uniform SRGB throughout
   the domain.  In case of different SRGBs, the top segment will be a
   Type 3/4 segment and all other segments will be Type 1.  Each
   border node converts the Type 3/Type 4 segment to Type 1 before
   adding its own segment to the Return Path TLV.

9.  Security Considerations

   TBD

10.
IANA Considerations

   Sub-TLVs for TLV Types 1, 16, and 21

      SID only in the form of MPLS label : TBD (Range 32768-65535)

      IPv4 Node Address with optional SID for SR-MPLS : TBD (Range
      32768-65535)

      IPv6 Node Address with optional SID for SR-MPLS : TBD (Range
      32768-65535)

11.  Contributors

   1. Carlos Pignataro

      Cisco Systems, Inc.

      cpignata@cisco.com

   2. Zafar Ali

      Cisco Systems, Inc.

      zali@cisco.com

12.  Acknowledgments

   Thanks to Bruno Decraene for suggesting the use of the generic
   Segment sub-TLV.  Thanks to Adrian Farrel for careful review and
   comments.  Thanks to Mach Chen for suggesting the use of the
   Return Path TLV.

13.  References

13.1.  Normative References

   [I-D.ietf-idr-segment-routing-te-policy]
              Previdi, S., Filsfils, C., Talaulikar, K., Mattes, P.,
              Rosen, E., Jain, D., and S. Lin, "Advertising Segment
              Routing Policies in BGP",
              draft-ietf-idr-segment-routing-te-policy-11 (work in
              progress), November 2020.

   [I-D.ietf-spring-segment-routing-central-epe]
              Filsfils, C., Previdi, S., Dawra, G., Aries, E., and D.
              Afanasiev, "Segment Routing Centralized BGP Egress Peer
              Engineering",
              draft-ietf-spring-segment-routing-central-epe-10 (work
              in progress), December 2017.

   [RFC4379]  Kompella, K. and G. Swallow, "Detecting Multi-Protocol
              Label Switched (MPLS) Data Plane Failures", RFC 4379,
              DOI 10.17487/RFC4379, February 2006.

   [RFC7110]  Chen, M., Cao, W., Ning, S., Jounay, F., and S. Delord,
              "Return Path Specified Label Switched Path (LSP) Ping",
              RFC 7110, DOI 10.17487/RFC7110, January 2014.

   [RFC8029]  Kompella, K., Swallow, G., Pignataro, C., Ed., Kumar, N.,
              Aldrin, S., and M. Chen, "Detecting Multiprotocol Label
              Switched (MPLS) Data-Plane Failures", RFC 8029,
              DOI 10.17487/RFC8029, March 2017.

   [RFC8287]  Kumar, N., Ed., Pignataro, C., Ed., Swallow, G., Akiya,
              N., Kini, S., and M.
              Chen, "Label Switched Path (LSP) Ping/Traceroute for
              Segment Routing (SR) IGP-Prefix and IGP-Adjacency
              Segment Identifiers (SIDs) with MPLS Data Planes",
              RFC 8287, DOI 10.17487/RFC8287, December 2017.

13.2.  Informative References

   [I-D.ietf-idr-bgpls-segment-routing-epe]
              Previdi, S., Talaulikar, K., Filsfils, C., Patel, K.,
              Ray, S., and J. Dong, "BGP-LS extensions for Segment
              Routing BGP Egress Peer Engineering",
              draft-ietf-idr-bgpls-segment-routing-epe-19 (work in
              progress), May 2019.

   [I-D.ietf-mpls-interas-lspping]
              Nadeau, T. and G. Swallow, "Detecting MPLS Data Plane
              Failures in Inter-AS and inter-provider Scenarios",
              draft-ietf-mpls-interas-lspping-00 (work in progress),
              March 2007.

   [I-D.ietf-spring-segment-routing-policy]
              Filsfils, C., Talaulikar, K., Voyer, D., Bogdanov, A.,
              and P. Mattes, "Segment Routing Policy Architecture",
              draft-ietf-spring-segment-routing-policy-11 (work in
              progress), April 2021.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
              BGP-4", RFC 3107, DOI 10.17487/RFC3107, May 2001.

   [RFC7743]  Luo, J., Ed., Jin, L., Ed., Nadeau, T., Ed., and G.
              Swallow, Ed., "Relayed Echo Reply Mechanism for Label
              Switched Path (LSP) Ping", RFC 7743,
              DOI 10.17487/RFC7743, January 2016.

   [RFC8402]  Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
              Decraene, B., Litkowski, S., and R. Shakir, "Segment
              Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
              July 2018.

   [RFC8403]  Geib, R., Ed., Filsfils, C., Pignataro, C., Ed., and N.
              Kumar, "A Scalable and Topology-Aware MPLS Data-Plane
              Monitoring System", RFC 8403, DOI 10.17487/RFC8403,
              July 2018.

   [RFC8604]  Filsfils, C., Ed., Previdi, S., Dawra, G., Ed.,
              Henderickx, W., and D.
              Cooper, "Interconnecting Millions of Endpoints with
              Segment Routing", RFC 8604, DOI 10.17487/RFC8604,
              June 2019.

   [RFC8660]  Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S.,
              Decraene, B., Litkowski, S., and R. Shakir, "Segment
              Routing with the MPLS Data Plane", RFC 8660,
              DOI 10.17487/RFC8660, December 2019.

Authors' Addresses

   Shraddha Hegde
   Juniper Networks Inc.
   Exora Business Park
   Bangalore, KA  560103
   India

   Email: shraddha@juniper.net

   Kapil Arora
   Juniper Networks Inc.

   Email: kapilaro@juniper.net

   Mukul Srivastava
   Juniper Networks Inc.

   Email: msri@juniper.net

   Samson Ninan
   Individual Contributor

   Email: samson.cse@gmail.com

   Nagendra Kumar
   Cisco Systems, Inc.

   Email: naikumar@cisco.com