idnits 2.17.1

draft-smack-mpls-rfc4379bis-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 9 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  == There are 1 instance of lines with multicast IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use the 233.252.0.x range defined in RFC 5771

  == There are 8 instances of lines with private range IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 3, 2015) is 3127 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'FEC-stack-depth' is mentioned on line 1751, but not
     defined

  ** Downref: Normative reference to an Informational RFC: RFC 4026

  ** Obsolete normative reference: RFC 4379 (Obsoleted by RFC 8029)

  ** Obsolete normative reference: RFC 5226 (Obsoleted by RFC 8126)

  -- Obsolete informational reference (is this intentional?): RFC 3107
     (Obsoleted by RFC 8277)

  -- Obsolete informational reference (is this intentional?): RFC 4447
     (Obsoleted by RFC 8077)


     Summary: 3 errors (**), 0 flaws (~~), 5 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                       C. Pignataro
Internet-Draft                                                  N. Kumar
Obsoletes: 4379 (if approved)                                      Cisco
Intended status: Standards Track                               S. Aldrin
Expires: April 5, 2016                                            Google
                                                                 M. Chen
                                                                  Huawei
                                                         October 3, 2015


   Detecting Multi-Protocol Label Switched (MPLS) Data Plane Failures
                     draft-smack-mpls-rfc4379bis-05

Abstract

   This document describes a simple and efficient mechanism that can be
   used to detect data plane failures in Multi-Protocol Label Switching
   (MPLS) Label Switched Paths (LSPs).  There are two parts to this
   document: information carried in an MPLS "echo request" and "echo
   reply" for the purposes of fault detection and isolation, and
   mechanisms for reliably sending the echo reply.

   This document obsoletes RFC 4379.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).
   Note that other groups may also distribute working documents as
   Internet-Drafts.  The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 5, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions
     1.2.  Structure of This Document
     1.3.  Contributors
     1.4.  Scope of RFC4379bis work
     1.5.  ToDo
   2.  Motivation
     2.1.  Use of Address Range 127/8
   3.  Packet Format
     3.1.  Return Codes
     3.2.  Target FEC Stack
       3.2.1.  LDP IPv4 Prefix
       3.2.2.  LDP IPv6 Prefix
       3.2.3.  RSVP IPv4 LSP
       3.2.4.  RSVP IPv6 LSP
       3.2.5.  VPN IPv4 Prefix
       3.2.6.  VPN IPv6 Prefix
       3.2.7.  L2 VPN Endpoint
       3.2.8.  FEC 128 Pseudowire (Deprecated)
       3.2.9.  FEC 128 Pseudowire (Current)
       3.2.10. FEC 129 Pseudowire
       3.2.11. BGP Labeled IPv4 Prefix
       3.2.12. BGP Labeled IPv6 Prefix
       3.2.13. Generic IPv4 Prefix
       3.2.14. Generic IPv6 Prefix
       3.2.15. Nil FEC
     3.3.  Downstream Mapping
       3.3.1.  Multipath Information Encoding
       3.3.2.  Downstream Router and Interface
     3.4.  Pad TLV
     3.5.  Vendor Enterprise Number
     3.6.  Interface and Label Stack
     3.7.  Errored TLVs
     3.8.  Reply TOS Byte TLV
   4.  Theory of Operation
     4.1.  Dealing with Equal-Cost Multi-Path (ECMP)
     4.2.  Testing LSPs That Are Used to Carry MPLS Payloads
     4.3.  Sending an MPLS Echo Request
     4.4.  Receiving an MPLS Echo Request
       4.4.1.  FEC Validation
     4.5.  Sending an MPLS Echo Reply
     4.6.  Receiving an MPLS Echo Reply
     4.7.  Issue with VPN IPv4 and IPv6 Prefixes
     4.8.  Non-compliant Routers
   5.  Security Considerations
   6.  IANA Considerations
     6.1.  Message Types, Reply Modes, Return Codes
     6.2.  TLVs
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document describes a simple and efficient mechanism that can be
   used to detect data plane failures in MPLS Label Switched Paths
   (LSPs).  There are two parts to this document: information carried in
   an MPLS "echo request" and "echo reply", and mechanisms for
   transporting the echo reply.  The first part aims at providing enough
   information to check correct operation of the data plane, as well as
   a mechanism to verify the data plane against the control plane, and
   thereby localize faults.  The second part suggests two methods of
   reliable reply channels for the echo request message for more robust
   fault isolation.

   An important consideration in this design is that MPLS echo requests
   follow the same data path that normal MPLS packets would traverse.
   MPLS echo requests are meant primarily to validate the data plane,
   and secondarily to verify the data plane against the control plane.
   Mechanisms to check the control plane are valuable, but are not
   covered in this document.

   This document makes special use of the address range 127/8.
   This is an exception to the behavior defined in RFC 1122 [RFC1122]
   and updates that RFC.  The motivation for this change and the
   details of this exceptional use are discussed in section 2.1 below.

1.1.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   The term "Must Be Zero" (MBZ) is used in object descriptions for
   reserved fields.  These fields MUST be set to zero when sent and
   ignored on receipt.

   Terminology pertaining to L2 and L3 Virtual Private Networks (VPNs)
   is defined in [RFC4026].

   Since this document refers to the MPLS Time to Live (TTL) far more
   frequently than the IP TTL, the authors have chosen the convention of
   using the unqualified "TTL" to mean "MPLS TTL" and using "IP TTL" for
   the TTL value in the IP header.

1.2.  Structure of This Document

   The body of this memo contains four main parts: motivation, MPLS echo
   request/reply packet format, LSP ping operation, and a reliable
   return path.  It is suggested that first-time readers skip the actual
   packet formats and read the Theory of Operation first; the document
   is structured the way it is to avoid forward references.

1.3.  Contributors

   A mechanism used to detect data plane failures in Multi-Protocol
   Label Switching (MPLS) Label Switched Paths (LSPs) was originally
   published as RFC 4379 in February 2006.  It was produced by the MPLS
   Working Group of the IETF and was jointly authored by Kireeti
   Kompella and George Swallow.

   The following made vital contributions to all aspects of the original
   RFC 4379, and much of the material came out of debate and discussion
   among this group.

      Ronald P. Bonica, Juniper Networks, Inc.
      Dave Cooper, Global Crossing
      Ping Pan, Hammerhead Systems
      Nischal Sheth, Juniper Networks, Inc.
      Sanjay Wadhwa, Juniper Networks, Inc.

1.4.  Scope of RFC4379bis work

   The goal of this document is to take LSP Ping to an Internet
   Standard.

   [RFC4379] defines the basic mechanism for MPLS LSP validation that
   can be used for fault detection and isolation.  The scope of this
   document is also to address various updates to MPLS LSP Ping,
   including:

   1.  Updates to all references and citations.  Obsoleted RFCs 2434,
       2030, and 3036 are respectively replaced with RFCs 5226, 5905,
       and 5036.  Additionally, these three documents have since been
       published as RFCs: RFCs 4447, 5085, and 4761.
   2.  Incorporate all outstanding Errata.  These include the errata
       with IDs: 108, 1418, 1714, 1786, 3399, 742, and 2978.
   3.  Replace EXP with Traffic Class (TC), based on the update from RFC
       5462.

1.5.  ToDo

   This section should be empty, and removed, prior to publication.
   ToDos:

   1.  Evaluation of which of the RFCs that updated RFC 4379 need to be
       incorporated into this 4379bis document.  Specifically, these
       RFCs updated RFC 4379: 6424, 6425, 6426, 6829, 7506, and 7537.
       RFCs that updated RFC 4379 and are incorporated into this
       4379bis will be obsoleted by 4379bis.
   2.  Review IANA Allocations

2.  Motivation

   When an LSP fails to deliver user traffic, the failure cannot always
   be detected by the MPLS control plane.  There is a need to provide a
   tool that would enable users to detect such traffic "black holes" or
   misrouting within a reasonable period of time, and a mechanism to
   isolate faults.

   In this document, we describe a mechanism that accomplishes these
   goals.  This mechanism is modeled after the ping/traceroute paradigm:
   ping (ICMP echo request [RFC0792]) is used for connectivity checks,
   and traceroute is used for hop-by-hop fault localization as well as
   path tracing.  This document specifies a "ping" mode and a
   "traceroute" mode for testing MPLS LSPs.

   The basic idea is to verify that packets that belong to a particular
   Forwarding Equivalence Class (FEC) actually end their MPLS path on a
   Label Switching Router (LSR) that is an egress for that FEC.  This
   document proposes that this test be carried out by sending a packet
   (called an "MPLS echo request") along the same data path as other
   packets belonging to this FEC.  An MPLS echo request also carries
   information about the FEC whose MPLS path is being verified.  This
   echo request is forwarded just like any other packet belonging to
   that FEC.  In "ping" mode (basic connectivity check), the packet
   should reach the end of the path, at which point it is sent to the
   control plane of the egress LSR, which then verifies whether it is
   indeed an egress for the FEC.  In "traceroute" mode (fault
   isolation), the packet is sent to the control plane of each transit
   LSR, which performs various checks that it is indeed a transit LSR
   for this path; this LSR also returns further information that helps
   check the control plane against the data plane, i.e., that forwarding
   matches what the routing protocols determined as the path.

   One way these tools can be used is to periodically ping an FEC to
   ensure connectivity.  If the ping fails, one can then initiate a
   traceroute to determine where the fault lies.  One can also
   periodically traceroute FECs to verify that forwarding matches the
   control plane; however, this places a greater burden on transit LSRs
   and thus should be used with caution.

2.1.  Use of Address Range 127/8

   As described above, LSP ping is intended as a diagnostic tool.  It is
   intended to enable providers of an MPLS-based service to isolate
   network faults.  In particular, LSP ping needs to diagnose situations
   where the control and data planes are out of sync.  It performs this
   by routing an MPLS echo request packet based solely on its label
   stack.
   That is, the IP destination address is never used in a forwarding
   decision.  In fact, the sender of an MPLS echo request packet may
   not know, a priori, the address of the router at the end of the LSP.

   Providers of MPLS-based services also need the ability to trace all
   of the possible paths that an LSP may take.  Since most MPLS services
   are based on IP unicast forwarding, these paths are subject to equal-
   cost multi-path (ECMP) load sharing.

   This leads to the following requirements:

   1.  Although the LSP in question may be broken in unknown ways, the
       likelihood of a diagnostic packet being delivered to a user of an
       MPLS service MUST be held to an absolute minimum.

   2.  If an LSP is broken in such a way that it prematurely terminates,
       the diagnostic packet MUST NOT be IP forwarded.

   3.  A means of varying the diagnostic packets such that they exercise
       all ECMP paths is thus REQUIRED.

   Clearly, using general unicast addresses satisfies neither of the
   first two requirements.  A number of other options for addresses were
   considered, including a portion of the private address space (as
   determined by the network operator) and the newly designated IPv4
   link local addresses.  Use of the private address space was deemed
   ineffective since the leading MPLS-based service is an IPv4 Virtual
   Private Network (VPN).  VPNs often use private addresses.

   The IPv4 link local addresses are more attractive in that the scope
   over which they can be forwarded is limited.  However, if one were to
   use an address from this range, it would still be possible for the
   first recipient of a diagnostic packet that "escaped" from a broken
   LSP to have that address assigned to the interface on which it
   arrived and thus could mistakenly receive such a packet.
   Furthermore, the IPv4 link local address range has only recently been
   allocated.
   Many deployed routers would forward a packet with an address from
   that range toward the default route.

   The 127/8 range for IPv4 and that same range embedded in IPv4-mapped
   IPv6 addresses for IPv6 were chosen for a number of reasons.

   RFC 1122 allocates the 127/8 as "Internal host loopback address" and
   states: "Addresses of this form MUST NOT appear outside a host."
   Thus, the default behavior of hosts is to discard such packets.  This
   helps to ensure that if a diagnostic packet is misdirected to a host,
   it will be silently discarded.

   RFC 1812 [RFC1812] states:

      A router SHOULD NOT forward, except over a loopback interface, any
      packet that has a destination address on network 127.  A router
      MAY have a switch that allows the network manager to disable these
      checks.  If such a switch is provided, it MUST default to
      performing the checks.

   This helps to ensure that diagnostic packets are never IP forwarded.

   The 127/8 address range provides 16M addresses allowing wide
   flexibility in varying addresses to exercise ECMP paths.  Finally, as
   an implementation optimization, the 127/8 provides an easy means of
   identifying possible LSP packets.

3.  Packet Format

   An MPLS echo request is a (possibly labeled) IPv4 or IPv6 UDP packet;
   the contents of the UDP packet have the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Version Number        |         Global Flags          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Message Type |   Reply mode  |  Return Code  | Return Subcode|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sender's Handle                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    TimeStamp Sent (seconds)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               TimeStamp Sent (seconds fraction)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  TimeStamp Received (seconds)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             TimeStamp Received (seconds fraction)             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            TLVs ...                           |
   .                                                               .
   .                                                               .
   .                                                               .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Version Number is currently 1.  (Note: the version number is to
   be incremented whenever a change is made that affects the ability of
   an implementation to correctly parse or process an MPLS echo request/
   reply.  These changes include any syntactic or semantic changes made
   to any of the fixed fields, or to any Type-Length-Value (TLV) or sub-
   TLV assignment or format that is defined at a certain version number.
   The version number may not need to be changed if an optional TLV or
   sub-TLV is added.)
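
   As a rough illustration of the fixed header laid out above, the
   following Python sketch packs and unpacks the 32 octets that precede
   the TLVs.  The names ECHO_HEADER, pack_echo_header, and
   unpack_echo_header are hypothetical helpers chosen for this example
   and are not defined by this document; the Message Type, Reply Mode,
   and V flag values are the ones listed later in this section.  It is
   a minimal sketch, not a complete implementation.

```python
import struct

# Fixed 32-octet header that precedes the TLVs, in network byte order:
# Version Number and Global Flags (16 bits each), Message Type, Reply
# Mode, Return Code and Return Subcode (8 bits each), Sender's Handle
# and Sequence Number (32 bits each), then four 32-bit timestamp words.
ECHO_HEADER = struct.Struct("!HHBBBBIIIIII")

MSG_ECHO_REQUEST = 1   # Message Type: MPLS echo request
MSG_ECHO_REPLY = 2     # Message Type: MPLS echo reply
REPLY_VIA_UDP = 2      # Reply Mode: reply via an IPv4/IPv6 UDP packet
FLAG_V = 0x0001        # V (Validate FEC Stack) is the low-order flag bit


def pack_echo_header(msg_type, reply_mode, handle, seq,
                     sent=(0, 0), received=(0, 0),
                     return_code=0, return_subcode=0,
                     version=1, global_flags=0):
    """Pack the fixed header; the caller appends the TLVs."""
    return ECHO_HEADER.pack(version, global_flags, msg_type, reply_mode,
                            return_code, return_subcode, handle, seq,
                            sent[0], sent[1], received[0], received[1])


def unpack_echo_header(data):
    """Split a UDP payload into a dict of header fields and raw TLV bytes."""
    fields = ECHO_HEADER.unpack_from(data)
    names = ("version", "global_flags", "msg_type", "reply_mode",
             "return_code", "return_subcode", "handle", "seq",
             "sent_sec", "sent_frac", "recv_sec", "recv_frac")
    return dict(zip(names, fields)), data[ECHO_HEADER.size:]
```

   A sender could, for instance, call pack_echo_header(MSG_ECHO_REQUEST,
   REPLY_VIA_UDP, handle, seq, global_flags=FLAG_V) and append its TLVs
   to the result; timestamps are passed as (seconds, fraction) pairs.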

   The Global Flags field is a bit vector with the following format:

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             MBZ             |V|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   One flag is defined for now, the V bit; the rest MUST be set to zero
   when sending and ignored on receipt.

   The V (Validate FEC Stack) flag is set to 1 if the sender wants the
   receiver to perform FEC Stack validation; if V is 0, the choice is
   left to the receiver.

   The Message Type is one of the following:

       Value    Meaning
       -----    -------
           1    MPLS echo request
           2    MPLS echo reply

   The Reply Mode can take one of the following values:

       Value    Meaning
       -----    -------
           1    Do not reply
           2    Reply via an IPv4/IPv6 UDP packet
           3    Reply via an IPv4/IPv6 UDP packet with Router Alert
           4    Reply via application level control channel

   An MPLS echo request with 1 (Do not reply) in the Reply Mode field
   may be used for one-way connectivity tests; the receiving router may
   log gaps in the Sequence Numbers and/or maintain delay/jitter
   statistics.  An MPLS echo request would normally have 2 (Reply via an
   IPv4/IPv6 UDP packet) in the Reply Mode field.  If the normal IP
   return path is deemed unreliable, one may use 3 (Reply via an IPv4/
   IPv6 UDP packet with Router Alert).  Note that this requires that all
   intermediate routers understand and know how to forward MPLS echo
   replies.  The echo reply uses the same IP version number as the
   received echo request, i.e., an IPv4 encapsulated echo reply is sent
   in response to an IPv4 encapsulated echo request.

   Some applications support an IP control channel.  One such example is
   the associated control channel defined in Virtual Circuit
   Connectivity Verification (VCCV) [RFC5085].
   Any application that supports an IP control channel between its
   control entities may set the Reply Mode to 4 (Reply via application
   level control channel) to ensure that replies use that same channel.
   Further definition of this codepoint is application specific and
   thus beyond the scope of this document.

   Return Codes and Subcodes are described in the next section.

   The Sender's Handle is filled in by the sender, and returned
   unchanged by the receiver in the echo reply (if any).  There are no
   semantics associated with this handle, although a sender may find
   this useful for matching up requests with replies.

   The Sequence Number is assigned by the sender of the MPLS echo
   request and can be (for example) used to detect missed replies.

   The TimeStamp Sent is the time-of-day (according to the sender's
   clock) in NTP format [RFC5905] when the MPLS echo request is sent.
   The TimeStamp Received in an echo reply is the time-of-day (according
   to the receiver's clock) in NTP format that the corresponding echo
   request was received.

   TLVs (Type-Length-Value tuples) have the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Type              |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             Value                             |
   .                                                               .
   .                                                               .
   .                                                               .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Types are defined below; Length is the length of the Value field in
   octets.  The Value field depends on the Type; it is zero padded to
   align to a 4-octet boundary.  TLVs may be nested within other TLVs,
   in which case the nested TLVs are called sub-TLVs.  Sub-TLVs have
   independent types and MUST also be 4-octet aligned.

   Two examples follow.
   The Label Distribution Protocol (LDP) IPv4 FEC sub-TLV has the
   following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type = 1 (LDP IPv4 FEC)    |          Length = 5           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          IPv4 prefix                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Prefix Length |                 Must Be Zero                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Length for this TLV is 5.  A Target FEC Stack TLV that contains
   an LDP IPv4 FEC sub-TLV and a VPN IPv4 prefix sub-TLV has the
   following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Type = 1 (FEC TLV)       |          Length = 32          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  sub-Type = 1 (LDP IPv4 FEC)  |          Length = 5           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          IPv4 prefix                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Prefix Length |                 Must Be Zero                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | sub-Type = 6 (VPN IPv4 prefix)|          Length = 13          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Route Distinguisher                      |
   |                          (8 octets)                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          IPv4 prefix                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Prefix Length |                 Must Be Zero                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   A description of the Types and Values of the top-level TLVs for LSP
   ping is given below:

       Type #                  Value Field
       ------                  -----------
            1                  Target FEC Stack
            2                  Downstream Mapping
            3                  Pad
            4                  Not Assigned
            5                  Vendor Enterprise Number
            6                  Not Assigned
            7                  Interface and Label Stack
            8                  Not Assigned
            9                  Errored TLVs
           10                  Reply TOS Byte

   Types less than 32768 (i.e., with the high-order bit equal to 0) are
   mandatory TLVs that MUST either be supported by an implementation or
   result in the return code of 2 ("One or more of the TLVs was not
   understood") being sent in the echo response.

   Types greater than or equal to 32768 (i.e., with the high-order bit
   equal to 1) are optional TLVs that SHOULD be ignored if the
   implementation does not understand or support them.

3.1.  Return Codes

   The Return Code is set to zero by the sender.  The receiver can set
   it to one of the values listed below.  The notation <RSC> refers to
   the Return Subcode.  This field is filled in with the stack-depth for
   those codes that specify that.  For all other codes, the Return
   Subcode MUST be set to zero.

       Value    Meaning
       -----    -------
           0    No return code
           1    Malformed echo request received
           2    One or more of the TLVs was not understood
           3    Replying router is an egress for the FEC at stack-
                depth <RSC>
           4    Replying router has no mapping for the FEC at stack-
                depth <RSC>
           5    Downstream Mapping Mismatch (See Note 1)
           6    Upstream Interface Index Unknown (See Note 1)
           7    Reserved
           8    Label switched at stack-depth <RSC>
           9    Label switched but no MPLS forwarding at stack-depth
                <RSC>
          10    Mapping for this FEC is not the given label at stack-
                depth <RSC>
          11    No label entry at stack-depth <RSC>
          12    Protocol not associated with interface at FEC stack-
                depth <RSC>
          13    Premature termination of ping due to label stack
                shrinking to a single label

   Note 1

      The Return Subcode contains the point in the label stack where
      processing was terminated.  If the RSC is 0, no labels were
      processed.  Otherwise the packet would have been label switched at
      depth RSC.

3.2.  Target FEC Stack

   A Target FEC Stack is a list of sub-TLVs.
   The number of elements is determined by looking at the sub-TLV
   length fields.

       Sub-Type       Length            Value Field
       --------       ------            -----------
              1            5            LDP IPv4 prefix
              2           17            LDP IPv6 prefix
              3           20            RSVP IPv4 LSP
              4           56            RSVP IPv6 LSP
              5                         Not Assigned
              6           13            VPN IPv4 prefix
              7           25            VPN IPv6 prefix
              8           14            L2 VPN endpoint
              9           10            "FEC 128" Pseudowire (deprecated)
             10           14            "FEC 128" Pseudowire
             11          16+            "FEC 129" Pseudowire
             12            5            BGP labeled IPv4 prefix
             13           17            BGP labeled IPv6 prefix
             14            5            Generic IPv4 prefix
             15           17            Generic IPv6 prefix
             16            4            Nil FEC

   Other FEC Types will be defined as needed.

   Note that this TLV defines a stack of FECs, the first FEC element
   corresponding to the top of the label stack, etc.

   An MPLS echo request MUST have a Target FEC Stack that describes the
   FEC Stack being tested.  For example, if an LSR X has an LDP mapping
   [RFC5036] for 192.168.1.1 (say, label 1001), then to verify that
   label 1001 does indeed reach an egress LSR that announced this prefix
   via LDP, X can send an MPLS echo request with an FEC Stack TLV with
   one FEC in it, namely, of type LDP IPv4 prefix, with prefix
   192.168.1.1/32, and send the echo request with a label of 1001.

   Say LSR X wanted to verify that a label stack of <1001, 23456> is the
   right label stack to use to reach a VPN IPv4 prefix [see section
   3.2.5] of 10/8 in VPN foo.  Say further that LSR Y with loopback
   address 192.168.1.1 announced prefix 10/8 with Route Distinguisher
   RD-foo-Y (which may in general be different from the Route
   Distinguisher that LSR X uses in its own advertisements for VPN foo),
   label 23456 and BGP next hop 192.168.1.1 [RFC4271].  Finally, suppose
   that LSR X receives a label binding of 1001 for 192.168.1.1 via LDP.
   X has two choices in sending an MPLS echo request: X can send an MPLS
   echo request with an FEC Stack TLV with a single FEC of type VPN IPv4
   prefix with a prefix of 10/8 and a Route Distinguisher of RD-foo-Y.
   Alternatively, X can send an FEC Stack TLV with two FECs, the first
   of type LDP IPv4 prefix with a prefix of 192.168.1.1/32 and the
   second of type VPN IPv4 prefix with a prefix of 10/8 and a Route
   Distinguisher of RD-foo-Y.  In either case, the MPLS echo request
   would have a label stack of <1001, 23456>.  (Note: in this example,
   1001 is the "outer" label and 23456 is the "inner" label.)

3.2.1.  LDP IPv4 Prefix

   The IPv4 Prefix FEC is defined in [RFC5036].  When an LDP IPv4 prefix
   is encoded in a label stack, the following format is used.  The value
   consists of 4 octets of an IPv4 prefix followed by 1 octet of prefix
   length in bits; the format is given below.  The IPv4 prefix is in
   network byte order; if the prefix is shorter than 32 bits, trailing
   bits SHOULD be set to zero.  See [RFC5036] for an example of a
   Mapping for an IPv4 FEC.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          IPv4 prefix                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Prefix Length |                 Must Be Zero                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.2.  LDP IPv6 Prefix

   The IPv6 Prefix FEC is defined in [RFC5036].  When an LDP IPv6 prefix
   is encoded in a label stack, the following format is used.  The value
   consists of 16 octets of an IPv6 prefix followed by 1 octet of prefix
   length in bits; the format is given below.  The IPv6 prefix is in
   network byte order; if the prefix is shorter than 128 bits, the
   trailing bits SHOULD be set to zero.  See [RFC5036] for an example of
   a Mapping for an IPv6 FEC.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          IPv6 prefix                          |
   |                          (16 octets)                          |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Prefix Length |                 Must Be Zero                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.3.  RSVP IPv4 LSP

   The value has the format below.  The value fields are taken from RFC
   3209, sections 4.6.1.1 and 4.6.2.1.  See [RFC3209].

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 IPv4 tunnel end point address                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Must Be Zero          |           Tunnel ID           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Extended Tunnel ID                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  IPv4 tunnel sender address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Must Be Zero          |            LSP ID             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.4.  RSVP IPv6 LSP

   The value has the format below.  The value fields are taken from RFC
   3209, sections 4.6.1.2 and 4.6.2.2.  See [RFC3209].

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                 IPv6 tunnel end point address                 |
      |                                                               |
      |                                                               |
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         Must Be Zero          |           Tunnel ID           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      Extended Tunnel ID                       |
      |                                                               |
      |                                                               |
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                  IPv6 tunnel sender address                   |
      |                                                               |
      |                                                               |
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         Must Be Zero          |            LSP ID             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.5. VPN IPv4 Prefix

   VPN-IPv4 Network Layer Routing Information (NLRI) is defined in
   [RFC4365]. This document uses the term VPN IPv4 prefix for a
   VPN-IPv4 NLRI that has been advertised with an MPLS label in BGP.
   See [RFC3107].

   When a VPN IPv4 prefix is encoded in a label stack, the following
   format is used. The value field consists of the Route Distinguisher
   advertised with the VPN IPv4 prefix, the IPv4 prefix (with trailing 0
   bits to make 32 bits in all), and a prefix length, as follows:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      Route Distinguisher                      |
      |                          (8 octets)                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          IPv4 prefix                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Prefix Length |                 Must Be Zero                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Route Distinguisher (RD) is an 8-octet identifier; it does not
   contain any inherent information. The purpose of the RD is solely to
   allow one to create distinct routes to a common IPv4 address prefix.
   The encoding of the RD is not important here.
   When matching this field to the local FEC information, it is treated
   as an opaque value.

3.2.6. VPN IPv6 Prefix

   VPN-IPv6 Network Layer Routing Information (NLRI) is defined in
   [RFC4365]. This document uses the term VPN IPv6 prefix for a
   VPN-IPv6 NLRI that has been advertised with an MPLS label in BGP.
   See [RFC3107].

   When a VPN IPv6 prefix is encoded in a label stack, the following
   format is used. The value field consists of the Route Distinguisher
   advertised with the VPN IPv6 prefix, the IPv6 prefix (with trailing 0
   bits to make 128 bits in all), and a prefix length, as follows:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      Route Distinguisher                      |
      |                          (8 octets)                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          IPv6 prefix                          |
      |                                                               |
      |                                                               |
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Prefix Length |                 Must Be Zero                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Route Distinguisher is identical to the VPN IPv4 Prefix RD,
   except that it functions here to allow the creation of distinct
   routes to IPv6 prefixes. See section 3.2.5. When matching this
   field to local FEC information, it is treated as an opaque value.

3.2.7. L2 VPN Endpoint

   VPLS stands for Virtual Private LAN Service. The terms VPLS BGP NLRI
   and VE ID (VPLS Edge Identifier) are defined in [RFC4761]. This
   document uses the simpler term L2 VPN endpoint when referring to a
   VPLS BGP NLRI. The Route Distinguisher is an 8-octet identifier used
   to distinguish information about various L2 VPNs advertised by a
   node. The VE ID is a 2-octet identifier used to identify a
   particular node that serves as the service attachment point within a
   VPLS.
   The structure of these two identifiers is unimportant here; when
   matching these fields to local FEC information, they are treated as
   opaque values. The encapsulation type is identical to the PW Type
   in section 3.2.8 below.

   When an L2 VPN endpoint is encoded in a label stack, the following
   format is used. The value field consists of a Route Distinguisher (8
   octets), the VE ID of the sender of the ping (2 octets), the
   receiver's VE ID (2 octets), and an encapsulation type (2 octets),
   formatted as follows:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      Route Distinguisher                      |
      |                          (8 octets)                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |        Sender's VE ID         |       Receiver's VE ID        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |      Encapsulation Type       |         Must Be Zero          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.8. FEC 128 Pseudowire (Deprecated)

   FEC 128 (0x80) is defined in [RFC4447], as are the terms PW ID
   (Pseudowire ID) and PW Type (Pseudowire Type). A PW ID is a non-zero
   32-bit connection ID. The PW Type is a 15-bit number indicating the
   encapsulation type. It is carried right justified in the field below
   termed encapsulation type with the high-order bit set to zero. Both
   of these fields are treated in this protocol as opaque values.

   When an FEC 128 is encoded in a label stack, the following format is
   used.
   The value field consists of the remote PE address (the destination
   address of the targeted LDP session), the PW ID, and the
   encapsulation type as follows:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       Remote PE Address                       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                             PW ID                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |            PW Type            |         Must Be Zero          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This FEC is deprecated and is retained only for backward
   compatibility. Implementations of LSP ping SHOULD accept and process
   this TLV, but SHOULD send LSP ping echo requests with the new TLV
   (see next section), unless explicitly configured to use the old TLV.

   An LSR receiving this TLV SHOULD use the source IP address of the LSP
   echo request to infer the sender's PE address.

3.2.9. FEC 128 Pseudowire (Current)

   FEC 128 (0x80) is defined in [RFC4447], as are the terms PW ID
   (Pseudowire ID) and PW Type (Pseudowire Type). A PW ID is a non-zero
   32-bit connection ID. The PW Type is a 15-bit number indicating the
   encapsulation type. It is carried right justified in the field below
   termed encapsulation type with the high-order bit set to zero.

   Both of these fields are treated in this protocol as opaque values.
   When matching these fields to the local FEC information, the match
   MUST be exact.

   When an FEC 128 is encoded in a label stack, the following format is
   used.
   The value field consists of the sender's PE address (the source
   address of the targeted LDP session), the remote PE address (the
   destination address of the targeted LDP session), the PW ID, and
   the encapsulation type as follows:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      Sender's PE Address                      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       Remote PE Address                       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                             PW ID                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |            PW Type            |         Must Be Zero          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.10. FEC 129 Pseudowire

   FEC 129 (0x81) and the terms PW Type, Attachment Group Identifier
   (AGI), Attachment Group Identifier Type (AGI Type), Attachment
   Individual Identifier Type (AII Type), Source Attachment Individual
   Identifier (SAII), and Target Attachment Individual Identifier (TAII)
   are defined in [RFC4447]. The PW Type is a 15-bit number indicating
   the encapsulation type. It is carried right justified in the field
   below termed PW Type with the high-order bit set to zero. All the
   other fields are treated as opaque values and copied directly from
   the FEC 129 format. All of these values together uniquely define the
   FEC within the scope of the LDP session identified by the source and
   remote PE addresses.

   When an FEC 129 is encoded in a label stack, the following format is
   used. The Length of this TLV is 16 + AGI length + SAII length + TAII
   length. Padding is used to make the total length a multiple of 4;
   the length of the padding is not included in the Length field.
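
   The Length and padding rule for FEC 129 can be illustrated with a
   short Python sketch. The function name and the one-octet bound on
   each identifier length are assumptions made for illustration (the
   bound reflects the 8-bit AGI, SAII, and TAII Length fields in the
   format); nothing here is normative.

```python
from typing import Tuple


def fec129_length(agi_len: int, saii_len: int, taii_len: int) -> Tuple[int, int]:
    """Return (Length field value, number of zero padding octets).

    Hypothetical helper: the 16 fixed octets are the two PE addresses
    (4 + 4), the PW Type (2), and three type/length pairs (2 + 2 + 2).
    """
    for n in (agi_len, saii_len, taii_len):
        if not 0 <= n <= 255:
            raise ValueError("identifier length must fit in one octet")
    length = 16 + agi_len + saii_len + taii_len
    # Padding rounds the encoded size up to a multiple of 4 and is
    # not counted in the Length field.
    padding = (-length) % 4
    return length, padding
```

   For example, an AGI of 8 octets with a 5-octet SAII and a 4-octet
   TAII gives a Length of 33 followed by 3 octets of padding.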

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      Sender's PE Address                      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       Remote PE Address                       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |            PW Type            |   AGI Type    |  AGI Length   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      ~                           AGI Value                           ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |   AII Type    |  SAII Length  |          SAII Value           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      ~                    SAII Value (continued)                     ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |   AII Type    |  TAII Length  |          TAII Value           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      ~                    TAII Value (continued)                     ~
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | TAII (cont.)  |          0-3 octets of zero padding           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.11. BGP Labeled IPv4 Prefix

   BGP labeled IPv4 prefixes are defined in [RFC3107]. When a BGP
   labeled IPv4 prefix is encoded in a label stack, the following format
   is used. The value field consists of the IPv4 prefix (with trailing
   0 bits to make 32 bits in all), and the prefix length, as follows:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          IPv4 Prefix                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Prefix Length |                 Must Be Zero                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.12. BGP Labeled IPv6 Prefix

   BGP labeled IPv6 prefixes are defined in [RFC3107]. When a BGP
   labeled IPv6 prefix is encoded in a label stack, the following format
   is used.
   The value consists of 16 octets of an IPv6 prefix followed by 1 octet
   of prefix length in bits; the format is given below. The IPv6 prefix
   is in network byte order; if the prefix is shorter than 128 bits, the
   trailing bits SHOULD be set to zero.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          IPv6 prefix                          |
      |                          (16 octets)                          |
      |                                                               |
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Prefix Length |                 Must Be Zero                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.13. Generic IPv4 Prefix

   The value consists of 4 octets of an IPv4 prefix followed by 1 octet
   of prefix length in bits; the format is given below. The IPv4 prefix
   is in network byte order; if the prefix is shorter than 32 bits,
   trailing bits SHOULD be set to zero. This FEC is used if the
   protocol advertising the label is unknown or may change during the
   course of the LSP. An example is an inter-AS LSP that may be
   signaled by LDP in one Autonomous System (AS), by RSVP-TE [RFC3209]
   in another AS, and by BGP between the ASes, such as is common for
   inter-AS VPNs.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          IPv4 prefix                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Prefix Length |                 Must Be Zero                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.14. Generic IPv6 Prefix

   The value consists of 16 octets of an IPv6 prefix followed by 1 octet
   of prefix length in bits; the format is given below. The IPv6 prefix
   is in network byte order; if the prefix is shorter than 128 bits, the
   trailing bits SHOULD be set to zero.
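
   The LDP, BGP labeled, and Generic prefix FECs share one value
   layout: the prefix in network byte order, one octet of prefix
   length, and Must Be Zero octets up to the 4-octet boundary. A
   minimal Python sketch of that layout, using only the standard
   ipaddress module, is given here. The function name is hypothetical,
   and emitting the three Must Be Zero octets as drawn in the diagrams
   is an assumption of this sketch.

```python
import ipaddress


def encode_prefix_fec(prefix: str) -> bytes:
    """Encode an IPv4 or IPv6 prefix as prefix + length + 3 MBZ octets.

    Hypothetical helper for illustration only.
    """
    # strict=False clears any bits beyond the prefix length, which
    # yields the "trailing bits set to zero" form of the prefix.
    net = ipaddress.ip_network(prefix, strict=False)
    # .packed is already in network byte order.
    return net.network_address.packed + bytes([net.prefixlen]) + b"\x00" * 3
```

   Under these assumptions an IPv4 prefix encodes to 8 octets and an
   IPv6 prefix to 20 octets.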

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          IPv6 prefix                          |
      |                          (16 octets)                          |
      |                                                               |
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Prefix Length |                 Must Be Zero                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.15. Nil FEC

   At times, labels from the reserved range, e.g., Router Alert and
   Explicit-null, may be added to the label stack for various diagnostic
   purposes such as influencing load-balancing. These labels may have
   no explicit FEC associated with them. The Nil FEC Stack is defined
   to allow a Target FEC Stack sub-TLV to be added to the Target FEC
   Stack to account for such labels so that proper validation can still
   be performed.

   The Length is 4. Labels are 20-bit values treated as numbers.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                 Label                 |          MBZ          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Label is the actual label value inserted in the label stack; the MBZ
   fields MUST be zero when sent and ignored on receipt.

3.3. Downstream Mapping

   The Downstream Mapping object is a TLV that MAY be included in an
   echo request message. Only one Downstream Mapping object may appear
   in an echo request. The presence of a Downstream Mapping object is a
   request that Downstream Mapping objects be included in the echo
   reply. If the replying router is the destination of the FEC, then a
   Downstream Mapping TLV SHOULD NOT be included in the echo reply.
   Otherwise the replying router SHOULD include a Downstream Mapping
   object for each interface over which this FEC could be forwarded.
   For a more precise definition of the notion of "downstream", see
   section 3.3.2, "Downstream Router and Interface".

   The Length is K + M + 4*N octets, where M is the Multipath Length,
   and N is the number of Downstream Labels. Values for K are found in
   the description of Address Type below. The Value field of a
   Downstream Mapping has the following format:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |              MTU              | Address Type  |   DS Flags    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |            Downstream IP Address (4 or 16 octets)             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         Downstream Interface Address (4 or 16 octets)         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Multipath Type|  Depth Limit  |       Multipath Length        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      .                                                               .
      .                    (Multipath Information)                    .
      .                                                               .
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |               Downstream Label                |   Protocol    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      .                                                               .
      .                                                               .
      .                                                               .
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |               Downstream Label                |   Protocol    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Maximum Transmission Unit (MTU)

      The MTU is the size in octets of the largest MPLS frame (including
      label stack) that fits on the interface to the Downstream LSR.

   Address Type

      The Address Type indicates if the interface is numbered or
      unnumbered. It also determines the length of the Downstream IP
      Address and Downstream Interface fields. The resulting total for
      the initial part of the TLV is listed in the table below as "K
      Octets".
      The Address Type is set to one of the following values:

      Type #        Address Type           K Octets
      ------        ------------           --------
           1        IPv4 Numbered                16
           2        IPv4 Unnumbered              16
           3        IPv6 Numbered                40
           4        IPv6 Unnumbered              28

   DS Flags

      The DS Flags field is a bit vector with the following format:

       0 1 2 3 4 5 6 7
      +-+-+-+-+-+-+-+-+
      | Rsvd(MBZ) |I|N|
      +-+-+-+-+-+-+-+-+

      Two flags are defined currently, I and N. The remaining flags MUST
      be set to zero when sending and ignored on receipt.

      Flag  Name and Meaning
      ----  ----------------
         I  Interface and Label Stack Object Request

            When this flag is set, it indicates that the replying
            router SHOULD include an Interface and Label Stack
            Object in the echo reply message.

         N  Treat as a Non-IP Packet

            Echo request messages will be used to diagnose non-IP
            flows. However, these messages are carried in IP
            packets. For a router that alters its ECMP algorithm
            based on the FEC or deep packet examination, this flag
            requests that the router treat this as it would if the
            determination of an IP payload had failed.

   Downstream IP Address and Downstream Interface Address

      IPv4 addresses and interface indices are encoded in 4 octets; IPv6
      addresses are encoded in 16 octets.

      If the interface to the downstream LSR is numbered, then the
      Address Type MUST be set to IPv4 Numbered or IPv6 Numbered, the
      Downstream IP Address MUST be set to either the downstream LSR's
      Router ID or the interface address of the downstream LSR, and the
      Downstream Interface Address MUST be set to the downstream LSR's
      interface address.

      If the interface to the downstream LSR is unnumbered, the Address
      Type MUST be IPv4 Unnumbered or IPv6 Unnumbered, the Downstream IP
      Address MUST be the downstream LSR's Router ID, and the Downstream
      Interface Address MUST be set to the index assigned by the
      upstream LSR to the interface.
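
      The K Octets values and the Length formula K + M + 4*N given
      above can be cross-checked with a small Python sketch. All names
      below are hypothetical; the field sizes are read from the
      Downstream Mapping format diagram.

```python
# Fixed fields of the Downstream Mapping value, in octets.
FIXED_HEAD = 4  # MTU (2) + Address Type (1) + DS Flags (1)
FIXED_TAIL = 4  # Multipath Type (1) + Depth Limit (1) + Multipath Length (2)

# Address Type -> (Downstream IP Address size, Downstream Interface
# Address size). Interface indices occupy 4 octets.
ADDRESS_SIZES = {
    1: (4, 4),    # IPv4 Numbered
    2: (4, 4),    # IPv4 Unnumbered
    3: (16, 16),  # IPv6 Numbered
    4: (16, 4),   # IPv6 Unnumbered
}


def k_octets(address_type: int) -> int:
    """Size of the initial part of the TLV for a given Address Type."""
    ip_size, if_size = ADDRESS_SIZES[address_type]
    return FIXED_HEAD + ip_size + if_size + FIXED_TAIL


def downstream_mapping_length(address_type: int, multipath_length: int,
                              num_labels: int) -> int:
    """Length = K + M + 4*N, with one 4-octet entry per Downstream Label."""
    return k_octets(address_type) + multipath_length + 4 * num_labels
```

      With these sizes, k_octets reproduces the table (16, 16, 40, and
      28), and an IPv4 Numbered mapping with 8 octets of Multipath
      Information and two Downstream Labels has a Length of 32.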

      If an LSR does not know the IP address of its neighbor, then it
      MUST set the Address Type to either IPv4 Unnumbered or IPv6
      Unnumbered. For IPv4, it must set the Downstream IP Address to
      127.0.0.1; for IPv6 the address is set to 0::1. In both cases,
      the interface index MUST be set to 0. If an LSR receives an Echo
      Request packet with either of these addresses in the Downstream IP
      Address field, this indicates that it MUST bypass interface
      verification but continue with label validation.

      If the originator of an Echo Request packet wishes to obtain
      Downstream Mapping information but does not know the expected
      label stack, then it SHOULD set the Address Type to either IPv4
      Unnumbered or IPv6 Unnumbered. For IPv4, it MUST set the
      Downstream IP Address to 224.0.0.2; for IPv6 the address MUST be
      set to FF02::2. In both cases, the interface index MUST be set to
      0. If an LSR receives an Echo Request packet with the all-routers
      multicast address, then this indicates that it MUST bypass both
      interface and label stack validation, but return Downstream
      Mapping TLVs using the information provided.

   Multipath Type

      The following Multipath Types are defined:

      Key   Type                  Multipath Information
      ---   ----------------      ---------------------
       0    no multipath          Empty (Multipath Length = 0)
       2    IP address            IP addresses
       4    IP address range      low/high address pairs
       8    Bit-masked IP         IP address prefix and bit mask
              address set
       9    Bit-masked label set  Label prefix and bit mask

      Type 0 indicates that all packets will be forwarded out this one
      interface.

      Types 2, 4, 8, and 9 specify that the supplied Multipath
      Information will serve to exercise this path.

   Depth Limit

      The Depth Limit is applicable only to a label stack and is the
      maximum number of labels considered in the hash; this SHOULD be
      set to zero if unspecified or unlimited.
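
      The two reserved Downstream IP Address values described under
      "Downstream IP Address and Downstream Interface Address" above
      select how much validation a receiving LSR performs. A minimal
      Python sketch of that selection follows; the function name and
      the returned strings are hypothetical labels chosen for
      illustration, not protocol values.

```python
import ipaddress

# Reserved Downstream IP Address values named in the text above.
LOOPBACKS = (ipaddress.ip_address("127.0.0.1"), ipaddress.ip_address("::1"))
ALL_ROUTERS = (ipaddress.ip_address("224.0.0.2"), ipaddress.ip_address("ff02::2"))


def validation_mode(downstream_ip: str) -> str:
    """Classify a received Downstream IP Address (illustrative only)."""
    addr = ipaddress.ip_address(downstream_ip)
    if addr in LOOPBACKS:
        # Neighbor address unknown to the sender: bypass interface
        # verification but continue with label validation.
        return "skip-interface"
    if addr in ALL_ROUTERS:
        # Expected label stack unknown to the sender: bypass both
        # interface and label stack validation.
        return "skip-interface-and-label"
    return "full"
```

      Any other address leads to the normal interface and label stack
      validation.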

   Multipath Length

      The length in octets of the Multipath Information.

   Multipath Information

      Address or label values encoded according to the Multipath Type.
      See section 3.3.1 for encoding details.

   Downstream Label(s)

      The set of labels in the label stack as it would have appeared if
      this router were forwarding the packet through this interface.
      Any Implicit Null labels are explicitly included. Labels are
      treated as numbers, i.e., they are right justified in the field.

      A Downstream Label is 24 bits, in the same format as an MPLS label
      minus the TTL field, i.e., the MSBit of the label is bit 0, the
      LSBit is bit 19, the Traffic Class (TC) bits are bits 20-22, and
      bit 23 is the S bit. The replying router SHOULD fill in the TC
      and S bits; the LSR receiving the echo reply MAY choose to ignore
      these bits.

   Protocol

      The Protocol is taken from the following table:

      Protocol #        Signaling Protocol
      ----------        ------------------
               0        Unknown
               1        Static
               2        BGP
               3        LDP
               4        RSVP-TE

3.3.1. Multipath Information Encoding

   The Multipath Information encodes labels or addresses that will
   exercise this path. The Multipath Information depends on the
   Multipath Type. The contents of the field are shown in the table
   above. IPv4 addresses are drawn from the range 127/8; IPv6 addresses
   are drawn from the range 0:0:0:0:0:FFFF:7F00/104. Labels are treated
   as numbers, i.e., they are right justified in the field. For Type 4,
   ranges indicated by Address pairs MUST NOT overlap and MUST be in
   ascending sequence.

   Type 8 allows a more dense encoding of IP addresses. The IP prefix
   is formatted as a base IP address with the non-prefix low-order bits
   set to zero. The maximum prefix length is 27. Following the prefix
   is a mask of length 2^(32-prefix length) bits for IPv4 and
   2^(128-prefix length) bits for IPv6.
   Each bit set to 1 represents a valid address. The address is the
   base IPv4 address plus the position of the bit in the mask where the
   bits are numbered left to right beginning with zero. For example,
   the IPv4 addresses 127.2.1.0, 127.2.1.5-127.2.1.15, and
   127.2.1.20-127.2.1.29 would be encoded as follows:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Those same addresses embedded in IPv6 would be encoded as follows:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type 9 allows a more dense encoding of labels. The label prefix is
   formatted as a base label value with the non-prefix low-order bits
   set to zero. The maximum prefix (including leading zeros due to
   encoding) length is 27.
   Following the prefix is a mask of length 2^(32-prefix length) bits.
   Each bit set to one represents a valid label. The label is the base
   label plus the position of the bit in the mask where the bits are
   numbered left to right beginning with zero. Label values of all the
   odd numbers between 1152 and 1279 would be encoded as follows:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   If the received Multipath Information is non-null, the labels and IP
   addresses MUST be picked from the set provided. If none of these
   labels or addresses map to a particular downstream interface, then
   for that interface, the type MUST be set to 0. If the received
   Multipath Information is null (i.e., Multipath Length = 0, or for
   Types 8 and 9, a mask of all zeros), the type MUST be set to 0.

   For example, suppose LSR X at hop 10 has two downstream LSRs, Y and
   Z, for the FEC in question. X could return Multipath Type 4, with
   low/high IP addresses of 127.1.1.1->127.1.1.255 for downstream LSR Y
   and 127.2.1.1->127.2.1.255 for downstream LSR Z. The head end
   reflects this information to LSR Y.
   Y, which has three downstream LSRs, U, V, and W, computes that
   127.1.1.1->127.1.1.127 would go to U and 127.1.1.128->127.1.1.255
   would go to V. Y would then respond with 3 Downstream Mappings: to
   U, with Multipath Type 4 (127.1.1.1->127.1.1.127); to V, with
   Multipath Type 4 (127.1.1.128->127.1.1.255); and to W, with
   Multipath Type 0.

   Note that computing Multipath Information may impose a significant
   processing burden on the receiver. A receiver MAY thus choose to
   process a subset of the received prefixes. The sender, on receiving
   a reply to a Downstream Mapping with partial information, SHOULD
   assume that the prefixes missing in the reply were skipped by the
   receiver, and MAY re-request information about them in a new echo
   request.

3.3.2. Downstream Router and Interface

   The notion of "downstream router" and "downstream interface" should
   be explained. Consider an LSR X. If a packet that was originated
   with TTL n>1 arrived with outermost label L and TTL=1 at LSR X, X
   must be able to compute which LSRs could receive the packet if it was
   originated with TTL=n+1, over which interface the request would
   arrive and what label stack those LSRs would see. (It is outside the
   scope of this document to specify how this computation is done.) The
   set of these LSRs/interfaces consists of the downstream
   routers/interfaces (and their corresponding labels) for X with
   respect to L. Each pair of downstream router and interface requires
   a separate Downstream Mapping to be added to the reply.

   The case where X is the LSR originating the echo request is a special
   case. X needs to figure out what LSRs would receive the MPLS echo
   request for a given FEC Stack that X originates with TTL=1.

   The set of downstream routers at X may be alternative paths (see the
   discussion below on ECMP) or simultaneous paths (e.g., for MPLS
   multicast).
   In the former case, the Multipath Information is used as a hint to
   the sender as to how it may influence the choice of these
   alternatives.

3.4. Pad TLV

   The value part of the Pad TLV contains a variable number (>= 1) of
   octets. The first octet takes values from the following table; all
   the other octets (if any) are ignored. The receiver SHOULD verify
   that the TLV is received in its entirety, but otherwise ignores the
   contents of this TLV, apart from the first octet.

      Value        Meaning
      -----        -------
          1        Drop Pad TLV from reply
          2        Copy Pad TLV to reply
      3-255        Reserved for future use

3.5. Vendor Enterprise Number

   SMI Private Enterprise Numbers are maintained by IANA. The Length is
   always 4; the value is the SMI Private Enterprise code, in network
   octet order, of the vendor with a Vendor Private extension to any of
   the fields in the fixed part of the message. When such an extension
   is used, this TLV MUST be present. If none of the fields in the
   fixed part of the message have Vendor Private extensions, inclusion
   of this TLV is OPTIONAL. Vendor Private ranges for Message Types,
   Reply Modes, and Return Codes have been defined. When any of these
   are used, the Vendor Enterprise Number TLV MUST be included in the
   message.

3.6. Interface and Label Stack

   The Interface and Label Stack TLV MAY be included in a reply message
   to report the interface on which the request message was received and
   the label stack that was on the packet when it was received. Only
   one such object may appear. The purpose of the object is to allow
   the upstream router to obtain the exact interface and label stack
   information as it appears at the replying LSR.

   The Length is K + 4*N octets; N is the number of labels in the label
   stack. Values for K are found in the description of Address Type
   below.
   The Value field of an Interface and Label Stack TLV has the
   following format:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Address Type  |                 Must Be Zero                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                  IP Address (4 or 16 octets)                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                  Interface (4 or 16 octets)                   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      .                                                               .
      .                                                               .
      .                          Label Stack                          .
      .                                                               .
      .                                                               .
      .                                                               .
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Address Type

      The Address Type indicates if the interface is numbered or
      unnumbered. It also determines the length of the IP Address and
      Interface fields. The resulting total for the initial part of the
      TLV is listed in the table below as "K Octets". The Address Type
      is set to one of the following values:

      Type #        Address Type           K Octets
      ------        ------------           --------
           1        IPv4 Numbered                12
           2        IPv4 Unnumbered              12
           3        IPv6 Numbered                36
           4        IPv6 Unnumbered              24

   IP Address and Interface

      IPv4 addresses and interface indices are encoded in 4 octets; IPv6
      addresses are encoded in 16 octets.

      If the interface upon which the echo request message was received
      is numbered, then the Address Type MUST be set to IPv4 Numbered or
      IPv6 Numbered, the IP Address MUST be set to either the LSR's
      Router ID or the interface address, and the Interface MUST be set
      to the interface address.

      If the interface is unnumbered, the Address Type MUST be either
      IPv4 Unnumbered or IPv6 Unnumbered, the IP Address MUST be the
      LSR's Router ID, and the Interface MUST be set to the index
      assigned to the interface.

   Label Stack

      The label stack of the received echo request message.
      If any TTL values have been changed by this router, they SHOULD be
      restored.

3.7.  Errored TLVs

   The Errored TLVs TLV MAY be included in an echo reply to inform the
   sender of an echo request of mandatory TLVs either not supported by
   an implementation or parsed and found to be in error.

   The Value field contains the TLVs that were not understood, encoded
   as sub-TLVs.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |             Type = 9          |            Length             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                             Value                             |
      .                                                               .
      .                                                               .
      .                                                               .
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.8.  Reply TOS Byte TLV

   This TLV MAY be used by the originator of the echo request to request
   that an echo reply be sent with the IP header TOS byte set to the
   value specified in the TLV.  This TLV has a length of 4 with the
   following value field.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Reply-TOS Byte|                 Must Be Zero                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.  Theory of Operation

   An MPLS echo request is used to test a particular LSP.  The LSP to be
   tested is identified by the "FEC Stack"; for example, if the LSP was
   set up via LDP, and is to an egress IP address of 10.1.1.1, the FEC
   Stack contains a single element, namely, an LDP IPv4 prefix sub-TLV
   with value 10.1.1.1/32.  If the LSP being tested is an RSVP LSP, the
   FEC Stack consists of a single element that captures the RSVP Session
   and Sender Template that uniquely identifies the LSP.

   FEC Stacks can be more complex.
   For example, one may wish to test a VPN IPv4 prefix of 10.1/8 that
   is tunneled over an LDP LSP with egress 10.10.1.1.  The FEC Stack
   would then contain two sub-TLVs, the bottom being a VPN IPv4 prefix,
   and the top being an LDP IPv4 prefix.  If the underlying (LDP) tunnel
   were not known, or was considered irrelevant, the FEC Stack could be
   a single element with just the VPN IPv4 sub-TLV.

   When an MPLS echo request is received, the receiver is expected to
   verify that the control plane and data plane are both healthy (for
   the FEC Stack being pinged) and that the two planes are in sync.  The
   procedures for this are in section 4.4 below.

4.1.  Dealing with Equal-Cost Multi-Path (ECMP)

   LSPs need not be simple point-to-point tunnels.  Frequently, a single
   LSP may originate at several ingresses, and terminate at several
   egresses; this is very common with LDP LSPs.  LSPs for a given FEC
   may also have multiple "next hops" at transit LSRs.  At an ingress,
   there may also be several different LSPs to choose from to get to the
   desired endpoint.  Finally, LSPs may have backup paths, detour paths,
   and other alternative paths to take should the primary LSP go down.

   To deal with the last two first: it is assumed that the LSR sourcing
   MPLS echo requests can force the echo request into any desired LSP,
   so choosing among multiple LSPs at the ingress is not an issue.  The
   problem of probing the various flavors of backup paths that will
   typically not be used for forwarding data unless the primary LSP is
   down will not be addressed here.

   Since the actual LSP and path that a given packet may take may not be
   known a priori, it is useful if MPLS echo requests can exercise all
   possible paths.  This, although desirable, may not be practical,
   because the algorithms that a given LSR uses to distribute packets
   over alternative paths may be proprietary.

   To achieve some degree of coverage of alternate paths, there is a
   certain latitude in choosing the destination IP address and source
   UDP port for an MPLS echo request.  This is clearly not sufficient;
   in the case of traceroute, more latitude is offered by means of the
   Multipath Information of the Downstream Mapping TLV.  This is used as
   follows.  An ingress LSR periodically sends an MPLS traceroute
   message to determine whether there are multipaths for a given LSP.
   If so, each hop will provide some information on how each of its
   downstream paths can be exercised.  The ingress can then send MPLS
   echo requests that exercise these paths.  If several transit LSRs
   have ECMP, the ingress may attempt to compose these to exercise all
   possible paths.  However, full coverage may not be possible.

4.2.  Testing LSPs That Are Used to Carry MPLS Payloads

   To detect certain LSP breakages, it may be necessary to encapsulate
   an MPLS echo request packet with at least one additional label when
   testing LSPs that are used to carry MPLS payloads (such as LSPs used
   to carry L2VPN and L3VPN traffic).  For example, when testing LDP or
   RSVP-TE LSPs, just sending an MPLS echo request packet may not detect
   instances where the router immediately upstream of the destination of
   the LSP ping may forward the MPLS echo request successfully over an
   interface not configured to carry MPLS payloads because of the use of
   penultimate hop popping.  Since the receiving router has no means to
   differentiate whether the IP packet was sent unlabeled or implicitly
   labeled, the addition of labels shimmed above the MPLS echo request
   (using the Nil FEC) will prevent a router from forwarding such a
   packet out unlabeled interfaces.

4.3.  Sending an MPLS Echo Request

   An MPLS echo request is a UDP packet.
   The IP header is set as follows: the source IP address is a routable
   address of the sender; the destination IP address is a (randomly
   chosen) IPv4 address from the range 127/8 or an IPv6 address from the
   range 0:0:0:0:0:FFFF:7F00:0/104.  The IP TTL is set to 1.  The source
   UDP port is chosen by the sender; the destination UDP port is set to
   3503 (assigned by IANA for MPLS echo requests).  The Router Alert
   option MUST be set in the IP header.

   An MPLS echo request is sent with a label stack corresponding to the
   FEC Stack being tested.  Note that further labels could be applied
   if, for example, the normal route to the topmost FEC in the stack is
   via a Traffic Engineered Tunnel [RFC3209].  If all of the FECs in the
   stack correspond to Implicit Null labels, the MPLS echo request is
   considered unlabeled even if further labels will be applied in
   sending the packet.

   If the echo request is labeled, one MAY (depending on what is being
   pinged) set the TTL of the innermost label to 1, to prevent the ping
   request going farther than it should.  Examples of where this SHOULD
   be done include pinging a VPN IPv4 or IPv6 prefix, an L2 VPN endpoint
   or a pseudowire.  Preventing the ping request from going too far can
   also be accomplished by inserting a Router Alert label above this
   label; however, this may lead to the undesired side effect that MPLS
   echo requests take a different data path than actual data.  For more
   information on how these mechanisms can be used for pseudowire
   connectivity verification, see [RFC5085].

   In "ping" mode (end-to-end connectivity check), the TTL in the
   outermost label is set to 255.  In "traceroute" mode (fault isolation
   mode), the TTL is set successively to 1, 2, and so on.

   The sender chooses a Sender's Handle and a Sequence Number.
   When sending subsequent MPLS echo requests, the sender SHOULD
   increment the Sequence Number by 1.  However, a sender MAY choose to
   send a group of echo requests with the same Sequence Number to
   improve the chance of arrival of at least one packet with that
   Sequence Number.

   The TimeStamp Sent is set to the time-of-day in NTP format that the
   echo request is sent.  The TimeStamp Received is set to zero.

   An MPLS echo request MUST have an FEC Stack TLV.  Also, the Reply
   Mode must be set to the desired reply mode; the Return Code and
   Subcode are set to zero.  In the "traceroute" mode, the echo request
   SHOULD include a Downstream Mapping TLV.

4.4.  Receiving an MPLS Echo Request

   Sending an MPLS echo request to the control plane is triggered by one
   of the following packet processing exceptions: Router Alert option,
   IP TTL expiration, MPLS TTL expiration, MPLS Router Alert label, or
   the destination address in the 127/8 address range.  The control
   plane further identifies it by UDP destination port 3503.

   For reporting purposes the bottom of stack is considered to be
   stack-depth of 1.  This is to establish an absolute reference for the
   case where the actual stack may have more labels than there are FECs
   in the Target FEC Stack.

   Furthermore, in all the error codes listed in this document, a
   stack-depth of 0 means "no value specified".  This allows
   compatibility with existing implementations that do not use the
   Return Subcode field.

   An LSR X that receives an MPLS echo request then processes it as
   follows.

   1. General packet sanity is verified.  If the packet is not
      well-formed, LSR X SHOULD send an MPLS Echo Reply with the Return
      Code set to "Malformed echo request received" and the Subcode to
      zero.
      If there are any TLVs not marked as "Ignore" that LSR X does not
      understand, LSR X SHOULD send an MPLS echo reply with the Return
      Code set to "TLV not understood" (as appropriate) and the Subcode
      set to zero.  In the latter case, the misunderstood TLVs (only)
      are included as sub-TLVs in an Errored TLVs TLV in the reply.  The
      header fields Sender's Handle, Sequence Number, and Timestamp Sent
      are not examined, but are included in the MPLS echo reply message.

   The algorithm uses the following variables and identifiers:

   Interface-I:        the interface on which the MPLS echo request was
                       received.

   Stack-R:            the label stack on the packet as it was received.

   Stack-D:            the label stack carried in the Downstream Mapping
                       TLV (not always present).

   Label-L:            the label from the actual stack currently being
                       examined.  Requires no initialization.

   Label-stack-depth:  the depth of the label being verified.
                       Initialized to the number of labels in the
                       received label stack Stack-R.

   FEC-stack-depth:    depth of the FEC in the Target FEC Stack that
                       should be used to verify the current actual
                       label.  Requires no initialization.

   Best-return-code:   contains the return code for the echo reply
                       packet as currently best known.  As the algorithm
                       progresses, this code may change depending on the
                       results of further checks that it performs.

   Best-rtn-subcode:   similar to Best-return-code, but for the Echo
                       Reply Subcode.

   FEC-status:         result value returned by the FEC Checking
                       algorithm described in section 4.4.1.

   /* Save receive context information */

   2. If the echo request is good, LSR X stores the interface over
      which the echo was received in Interface-I, and the label stack
      with which it came in Stack-R.

   /* The rest of the algorithm iterates over the labels in Stack-R,
   verifies validity of label values, reports associated label switching
   operations (for traceroute), verifies correspondence between the
   Stack-R and the Target FEC Stack description in the body of the echo
   request, and reports any errors. */

   /* The algorithm iterates as follows. */

   3. Label Validation:

      If Label-stack-depth is 0 {

      /* The LSR needs to report that it is a tail-end for the LSP */

         Set FEC-stack-depth to 1, set Label-L to 3 (Implicit Null).
         Set Best-return-code to 3 ("Replying router is an egress for
         the FEC at stack depth"), set Best-rtn-subcode to the value of
         FEC-stack-depth (1) and go to step 5 (Egress Processing).

      }

      /* This step assumes there is always an entry for well-known label
      values */

      Set Label-L to the value extracted from Stack-R at depth
      Label-stack-depth.  Look up Label-L in the Incoming Label Map
      (ILM) to determine if the label has been allocated and an
      operation is associated with it.

      If there is no entry for Label-L {

      /* Indicates a temporary or permanent label synchronization
      problem; the LSR needs to report an error */

         Set Best-return-code to 11 ("No label entry at stack-depth")
         and Best-rtn-subcode to Label-stack-depth.  Go to step 7 (Send
         Reply Packet).

      }

      Else {

         Retrieve the associated label operation from the corresponding
         NHLFE and proceed to step 4 (Label Operation Check).

      }

   4. Label Operation Check

      If the label operation is "Pop and Continue Processing" {

      /* Includes Explicit Null and Router Alert label cases */

         Iterate to the next label by decrementing Label-stack-depth and
         loop back to step 3 (Label Validation).

      }

      If the label operation is "Swap or Pop and Switch based on Popped
      Label" {

         Set Best-return-code to 8 ("Label switched at stack-depth") and
         Best-rtn-subcode to Label-stack-depth to report transit
         switching.

         If a Downstream Mapping TLV is present in the received echo
         request {

            If the IP address in the TLV is 127.0.0.1 or 0::1 {

               Set Best-return-code to 6 ("Upstream Interface Index
               Unknown").  An Interface and Label Stack TLV SHOULD be
               included in the reply and filled with Interface-I and
               Stack-R.

            }

            Else {

               Verify that the IP address, interface address, and label
               stack in the Downstream Mapping TLV match Interface-I and
               Stack-R.  If there is a mismatch, set Best-return-code to
               5, "Downstream Mapping Mismatch".  An Interface and Label
               Stack TLV SHOULD be included in the reply and filled in
               based on Interface-I and Stack-R.  Go to step 7 (Send
               Reply Packet).

            }

         }

         For each available downstream ECMP path {

            Retrieve output interface from the NHLFE entry.

            /* Note: this return code is set even if Label-stack-depth
            is one */

            If the output interface is not MPLS enabled {

               Set Best-return-code to 9 ("Label switched but no MPLS
               forwarding at stack-depth"), set Best-rtn-subcode to
               Label-stack-depth, and go to step 7 (Send Reply Packet).

            }

            If a Downstream Mapping TLV is present {

               A Downstream Mapping TLV SHOULD be included in the echo
               reply (see section 3.3) filled in with information about
               the current ECMP path.

            }

         }

         If no Downstream Mapping TLV is present, or the Downstream IP
         Address is set to the ALLROUTERS multicast address, go to step
         7 (Send Reply Packet).

         If the "Validate FEC Stack" flag is not set and the LSR is not
         configured to perform FEC checking by default, go to step 7
         (Send Reply Packet).

         /* Validate the Target FEC Stack in the received echo request.

         First determine FEC-stack-depth from the Downstream Mapping
         TLV.  This is done by walking through Stack-D (the Downstream
         labels) from the bottom, decrementing the number of labels for
         each non-Implicit Null label, while incrementing
         FEC-stack-depth for each label.  If the Downstream Mapping TLV
         contains one or more Implicit Null labels, FEC-stack-depth may
         be greater than Label-stack-depth.  To be consistent with the
         above stack-depths, the bottom is considered to be entry 1.
         */

         Set FEC-stack-depth to 0.  Set i to Label-stack-depth.

         While (i > 0) do {

            ++FEC-stack-depth.
            if Stack-D[FEC-stack-depth] != 3 (Implicit Null)
               --i.
         }

         If the number of FECs in the FEC stack is greater than or equal
         to FEC-stack-depth {

            Perform the FEC Checking procedure (see subsection 4.4.1
            below).

            If FEC-status is 2, set Best-return-code to 10 ("Mapping for
            this FEC is not the given label at stack-depth").

            If FEC-status is 1, set Best-return-code to FEC-return-code
            and Best-rtn-subcode to FEC-stack-depth.
         }

         Go to step 7 (Send Reply Packet).
      }

   5. Egress Processing:

      /* These steps are performed by the LSR that identified itself as
      the tail-end LSR for an LSP. */

      If received echo request contains no Downstream Mapping TLV, or
      the Downstream IP Address is set to 127.0.0.1 or 0::1, go to step
      6 (Egress FEC Validation).

      Verify that the IP address, interface address, and label stack in
      the Downstream Mapping TLV match Interface-I and Stack-R.  If not,
      set Best-return-code to 5, "Downstream Mapping Mismatch".  A
      Received Interface and Label Stack TLV SHOULD be created for the
      echo response packet.  Go to step 7 (Send Reply Packet).

   6. Egress FEC Validation:

      /* This is a loop for all entries in the Target FEC Stack starting
      with FEC-stack-depth. */

      Perform FEC checking by following the algorithm described in
      subsection 4.4.1 for Label-L and the FEC at FEC-stack-depth.

      Set Best-return-code to FEC-return-code and Best-rtn-subcode to
      the value in FEC-stack-depth.

      If FEC-status (the result of the check) is 1,
         go to step 7 (Send Reply Packet).

      /* Iterate to the next FEC entry */

      ++FEC-stack-depth.
      If FEC-stack-depth > the number of FECs in the FEC Stack,
         go to step 7 (Send Reply Packet).

      If FEC-status is 0 {

         ++Label-stack-depth.
         If Label-stack-depth > the number of labels in Stack-R,
            go to step 7 (Send Reply Packet).

         Label-L = extracted label from Stack-R at depth
         Label-stack-depth.
         Loop back to step 6 (Egress FEC Validation).
      }

   7. Send Reply Packet:

      Send an MPLS echo reply with a Return Code of Best-return-code,
      and a Return Subcode of Best-rtn-subcode.  Include any TLVs
      created during the above process.  The procedures for sending the
      echo reply are found in subsection 4.5.

4.4.1.  FEC Validation

   /* This subsection describes validation of an FEC entry within the
   Target FEC Stack and accepts an FEC, Label-L, and Interface-I.  The
   algorithm performs the following steps. */

   1. Two return values, FEC-status and FEC-return-code, are
      initialized to 0.

   2. If the FEC is the Nil FEC {

         If Label-L is either Explicit_Null or Router_Alert, return.

         Else {

            Set FEC-return-code to 10 ("Mapping for this FEC is not the
            given label at stack-depth").
            Set FEC-status to 1.
            Return.
         }

      }

   3. Check the FEC label mapping that describes how traffic received
      on the LSP is further switched or which application it is
      associated with.
      If no mapping exists, set FEC-return-code to 4 ("Replying router
      has no mapping for the FEC at stack-depth").  Set FEC-status to 1.
      Return.

   4. If the label mapping for FEC is Implicit Null, set FEC-status to
      2 and proceed to step 5.  Otherwise, if the label mapping for FEC
      is Label-L, proceed to step 5.  Otherwise, set FEC-return-code to
      10 ("Mapping for this FEC is not the given label at
      stack-depth"), set FEC-status to 1, and return.

   5. This is a protocol check.  Check what protocol would be used to
      advertise FEC.  If it can be determined that no protocol
      associated with Interface-I would have advertised an FEC of that
      FEC-Type, set FEC-return-code to 12 ("Protocol not associated
      with interface at FEC stack-depth").  Set FEC-status to 1.

   6. Return.

4.5.  Sending an MPLS Echo Reply

   An MPLS echo reply is a UDP packet.  It MUST ONLY be sent in response
   to an MPLS echo request.  The source IP address is a routable address
   of the replier; the source port is the well-known UDP port for LSP
   ping.  The destination IP address and UDP port are copied from the
   source IP address and UDP port of the echo request.  The IP TTL is
   set to 255.  If the Reply Mode in the echo request is "Reply via an
   IPv4 UDP packet with Router Alert", then the IP header MUST contain
   the Router Alert IP option.  If the reply is sent over an LSP, the
   topmost label MUST in this case be the Router Alert label (1) (see
   [RFC3032]).

   The format of the echo reply is the same as the echo request.  The
   Sender's Handle, the Sequence Number, and TimeStamp Sent are copied
   from the echo request; the TimeStamp Received is set to the
   time-of-day that the echo request is received (note that this
   information is most useful if the time-of-day clocks on the requester
   and the replier are synchronized).  The FEC Stack TLV from the echo
   request MAY be copied to the reply.
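
   The header handling described in the preceding paragraph can be
   sketched as follows.  This is a non-normative illustration only: the
   EchoHeader record, its field names, and build_reply_header are
   hypothetical names chosen for this sketch, and the fixed header is
   modeled as a plain record rather than as wire-format octets.  The
   timestamps are held as seconds since the NTP epoch, which is an
   assumption about representation, not a statement of the encoding.

```python
import time
from dataclasses import dataclass, replace
from typing import Optional

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2208988800

ECHO_REQUEST = 1  # Message Type values listed in Section 6.1
ECHO_REPLY = 2


@dataclass(frozen=True)
class EchoHeader:
    """Hypothetical record of the fixed-header fields used in this sketch."""
    message_type: int
    senders_handle: int
    sequence_number: int
    timestamp_sent: float      # NTP seconds, set by the requester
    timestamp_received: float  # NTP seconds, zero in a request
    return_code: int = 0
    return_subcode: int = 0


def ntp_now() -> float:
    """Current time of day as seconds since the NTP epoch."""
    return time.time() + NTP_UNIX_OFFSET


def build_reply_header(request: EchoHeader, return_code: int,
                       return_subcode: int,
                       received_at: Optional[float] = None) -> EchoHeader:
    """Copy the echoed fields and fill in the reply-specific ones."""
    if request.message_type != ECHO_REQUEST:
        raise ValueError("a reply is only built in response to a request")
    if received_at is None:
        received_at = ntp_now()
    # Sender's Handle, Sequence Number, and TimeStamp Sent carry over
    # unchanged because replace() overrides only the named fields.
    return replace(request,
                   message_type=ECHO_REPLY,
                   timestamp_received=received_at,
                   return_code=return_code,
                   return_subcode=return_subcode)
```

   In practice the receive time would be captured when the request
   arrives and passed in as received_at, rather than sampled when the
   reply is built.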

   The replier MUST fill in the Return Code and Subcode, as determined
   in the previous subsection.

   If the echo request contains a Pad TLV, the replier MUST interpret
   the first octet for instructions regarding how to reply.

   If the replying router is the destination of the FEC, then Downstream
   Mapping TLVs SHOULD NOT be included in the echo reply.

   If the echo request contains a Downstream Mapping TLV, and the
   replying router is not the destination of the FEC, the replier SHOULD
   compute its downstream routers and corresponding labels for the
   incoming label, and add Downstream Mapping TLVs for each one to the
   echo reply it sends back.

   If the Downstream Mapping TLV contains Multipath Information
   requiring more processing than the receiving router is willing to
   perform, the responding router MAY choose to respond with only a
   subset of multipaths contained in the echo request Downstream
   Mapping.  (Note: The originator of the echo request MAY send another
   echo request with the Multipath Information that was not included in
   the reply.)

   Except in the case of Reply Mode 4, "Reply via application level
   control channel", echo replies are always sent in the context of the
   IP/MPLS network.

4.6.  Receiving an MPLS Echo Reply

   An LSR X should only receive an MPLS echo reply in response to an
   MPLS echo request that it sent.  Thus, on receipt of an MPLS echo
   reply, X should parse the packet to ensure that it is well-formed,
   then attempt to match up the echo reply with an echo request that it
   had previously sent, using the destination UDP port and the Sender's
   Handle.  If no match is found, then X jettisons the echo reply;
   otherwise, it checks the Sequence Number to see if it matches.
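
   The matching procedure in the preceding paragraph can be sketched as
   follows.  This is a non-normative illustration: ReplyMatcher and its
   method names are hypothetical, one outstanding Sequence Number per
   (UDP port, Sender's Handle) pair is assumed for brevity, and the
   text does not say what to do on a Sequence Number mismatch, so the
   sketch only reports it and leaves the decision to the caller.

```python
from typing import Dict, Tuple


class ReplyMatcher:
    """Hypothetical table of outstanding echo requests."""

    def __init__(self) -> None:
        # Key: (UDP port, Sender's Handle); value: expected Sequence Number.
        self._pending: Dict[Tuple[int, int], int] = {}

    def record_request(self, source_udp_port: int, senders_handle: int,
                       sequence_number: int) -> None:
        """Remember a request so that its reply can be matched later."""
        self._pending[(source_udp_port, senders_handle)] = sequence_number

    def classify_reply(self, dest_udp_port: int, senders_handle: int,
                       sequence_number: int) -> str:
        """Return "discard", "sequence-mismatch", or "match"."""
        # The reply's destination UDP port is the request's source port,
        # since the replier copies it from the request.
        expected = self._pending.get((dest_udp_port, senders_handle))
        if expected is None:
            return "discard"            # no matching request: drop the reply
        if sequence_number != expected:
            return "sequence-mismatch"  # handling is left to the caller
        return "match"
```

   A real implementation would likely track several outstanding
   Sequence Numbers per handle and expire entries after a timeout; both
   are omitted here.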

   If the echo reply contains Downstream Mappings, and X wishes to
   traceroute further, it SHOULD copy the Downstream Mapping(s) into its
   next echo request(s) (with TTL incremented by one).

4.7.  Issue with VPN IPv4 and IPv6 Prefixes

   Typically, an LSP ping for a VPN IPv4 prefix or VPN IPv6 prefix is
   sent with a label stack of depth greater than 1, with the innermost
   label having a TTL of 1.  This is to terminate the ping at the egress
   PE, before it gets sent to the customer device.  However, under
   certain circumstances, the label stack can shrink to a single label
   before the ping hits the egress PE; this will result in the ping
   terminating prematurely.  One such scenario is a multi-AS Carrier's
   Carrier VPN.

   To get around this problem, one approach is for the LSR that receives
   such a ping to realize that the ping terminated prematurely, and send
   back error code 13.  In that case, the initiating LSR can retry the
   ping after incrementing the TTL on the VPN label.  In this fashion,
   the ingress LSR will sequentially try TTL values until it finds one
   that allows the VPN ping to reach the egress PE.

4.8.  Non-compliant Routers

   If the egress for the FEC Stack being pinged does not support MPLS
   ping, then no reply will be sent, resulting in possible "false
   negatives".  If in "traceroute" mode, a transit LSR does not support
   LSP ping, then no reply will be forthcoming from that LSR for some
   TTL, say, n.  The LSR originating the echo request SHOULD try sending
   the echo request with TTL=n+1, n+2, ..., n+k to probe LSRs further
   down the path.  In such a case, the echo request for TTL > n SHOULD
   be sent with Downstream Mapping TLV "Downstream IP Address" field set
   to the ALLROUTERS multicast address until a reply is received with a
   Downstream Mapping TLV.  The label stack MAY be omitted from the
   Downstream Mapping TLV.
   Furthermore, the "Validate FEC Stack" flag SHOULD NOT be set until an
   echo reply packet with a Downstream Mapping TLV is received.

5.  Security Considerations

   Overall, the security needs for LSP ping are similar to those of ICMP
   ping.

   There are at least three approaches to attacking LSRs using the
   mechanisms defined here.  One is a Denial-of-Service attack, by
   sending MPLS echo requests/replies to LSRs and thereby increasing
   their workload.  The second is obfuscating the state of the MPLS data
   plane liveness by spoofing, hijacking, replaying, or otherwise
   tampering with MPLS echo requests and replies.  The third is an
   unauthorized source using an LSP ping to obtain information about the
   network.

   To avoid potential Denial-of-Service attacks, it is RECOMMENDED that
   implementations regulate the LSP ping traffic going to the control
   plane.  A rate limiter SHOULD be applied to the well-known UDP port
   defined below.

   Unsophisticated replay and spoofing attacks involving faking or
   replaying MPLS echo reply messages are unlikely to be effective.
   These replies would have to match the Sender's Handle and Sequence
   Number of an outstanding MPLS echo request message.  A non-matching
   replay would be discarded as the sequence has moved on, thus a spoof
   has only a small window of opportunity.  However, to provide a
   stronger defense, an implementation MAY also validate the TimeStamp
   Sent by requiring an exact match on this field.

   To protect against unauthorized sources using MPLS echo request
   messages to obtain network information, it is RECOMMENDED that
   implementations provide a means of checking the source addresses of
   MPLS echo request messages against an access list before accepting
   the message.
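
   The source-address check recommended in the preceding paragraph
   could look like the following non-normative sketch.  The function
   name source_permitted and the contents of EXAMPLE_ACCESS_LIST are
   assumptions made for illustration (the prefixes are documentation
   ranges, not values taken from this document); the ipaddress module
   is part of the Python standard library.

```python
import ipaddress
from typing import Iterable, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Hypothetical access list built from documentation prefixes.
EXAMPLE_ACCESS_LIST = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def source_permitted(source_address: str,
                     access_list: Iterable[Network]) -> bool:
    """Return True if the echo request source falls in a permitted prefix."""
    address = ipaddress.ip_address(source_address)
    # Membership is False when the address and network differ in IP
    # version, so a mixed IPv4/IPv6 list can be scanned in one pass.
    return any(address in network for network in access_list)
```

   An echo request whose source address is not permitted would be
   dropped before any further processing; how the list is provisioned
   is outside the scope of this sketch.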

   It is not clear how to prevent hijacking (non-delivery) of echo
   requests or replies; however, if these messages are indeed hijacked,
   LSP ping will report that the data plane is not working as it should.

   It does not seem vital (at this point) to secure the data carried in
   MPLS echo requests and replies, although knowledge of the state of
   the MPLS data plane may be considered confidential by some.
   Implementations SHOULD, however, provide a means of filtering the
   addresses to which echo reply messages may be sent.

   Although this document makes special use of 127/8 addresses, these
   are used only in conjunction with the UDP port 3503.  Furthermore,
   these packets are only processed by routers.  All other hosts MUST
   treat all packets with a destination address in the range 127/8 in
   accordance with RFC 1122.  Any packet received by a router with a
   destination address in the range 127/8 without a destination UDP port
   of 3503 MUST be treated in accordance with RFC 1812.  In particular,
   the default behavior is to treat packets destined to a 127/8 address
   as "martians".

6.  IANA Considerations

   The TCP and UDP port number 3503 has been allocated by IANA for LSP
   echo requests and replies.

   The following sections detail the new name spaces to be managed by
   IANA.  For each of these name spaces, the space is divided into
   assignment ranges; the following terms are used in describing the
   procedures by which IANA allocates values: "Standards Action" (as
   defined in [RFC5226]), "Specification Required", and "Vendor Private
   Use".

   Values from "Specification Required" ranges MUST be registered with
   IANA.  The request MUST be made via an Experimental RFC that
   describes the format and procedures for using the code point; the
   actual assignment is made during the IANA actions for the RFC.

   Values from "Vendor Private" ranges MUST NOT be registered with IANA;
   however, the message MUST contain an enterprise code as registered
   with the IANA SMI Private Network Management Private Enterprise
   Numbers.  For each name space that has a Vendor Private range, it
   must be specified where exactly the SMI Private Enterprise Number
   resides; see below for examples.  In this way, several enterprises
   (vendors) can use the same code point without fear of collision.

6.1.  Message Types, Reply Modes, Return Codes

   The IANA has created and will maintain registries for Message Types,
   Reply Modes, and Return Codes.  Each of these can take values in the
   range 0-255.  Assignments in the range 0-191 are via Standards
   Action; assignments in the range 192-251 are made via "Specification
   Required"; values in the range 252-255 are for Vendor Private Use,
   and MUST NOT be allocated.

   If any of these fields fall in the Vendor Private range, a top-level
   Vendor Enterprise Number TLV MUST be present in the message.

   Message Types defined in this document are the following:

      Value    Meaning
      -----    -------
          1    MPLS echo request
          2    MPLS echo reply

   Reply Modes defined in this document are the following:

      Value    Meaning
      -----    -------
          1    Do not reply
          2    Reply via an IPv4/IPv6 UDP packet
          3    Reply via an IPv4/IPv6 UDP packet with Router Alert
          4    Reply via application level control channel

   Return Codes defined in this document are listed in section 3.1.

6.2.  TLVs

   The IANA has created and will maintain a registry for the Type field
   of top-level TLVs as well as for any associated sub-TLVs.  Note the
   meaning of a sub-TLV is scoped by the TLV.  The number spaces for the
   sub-TLVs of various TLVs are independent.

   The valid range for TLVs and sub-TLVs is 0-65535.
   Assignments in the range 0-16383 and 32768-49161 are made via
   Standards Action as defined in [RFC5226]; assignments in the range
   16384-31743 and 49162-64511 are made via "Specification Required" as
   defined above; values in the range 31744-32767 and 64512-65535 are
   for Vendor Private Use, and MUST NOT be allocated.

   If a TLV or sub-TLV has a Type that falls in the range for Vendor
   Private Use, the Length MUST be at least 4, and the first four octets
   MUST be that vendor's SMI Private Enterprise Number, in network octet
   order.  The rest of the Value field is private to the vendor.

   TLVs and sub-TLVs defined in this document are the following:

      Type       Sub-Type        Value Field
      ----       --------        -----------
         1                       Target FEC Stack
                        1        LDP IPv4 prefix
                        2        LDP IPv6 prefix
                        3        RSVP IPv4 LSP
                        4        RSVP IPv6 LSP
                        5        Not Assigned
                        6        VPN IPv4 prefix
                        7        VPN IPv6 prefix
                        8        L2 VPN endpoint
                        9        "FEC 128" Pseudowire (Deprecated)
                       10        "FEC 128" Pseudowire
                       11        "FEC 129" Pseudowire
                       12        BGP labeled IPv4 prefix
                       13        BGP labeled IPv6 prefix
                       14        Generic IPv4 prefix
                       15        Generic IPv6 prefix
                       16        Nil FEC
         2                       Downstream Mapping
         3                       Pad
         4                       Not Assigned
         5                       Vendor Enterprise Number
         6                       Not Assigned
         7                       Interface and Label Stack
         8                       Not Assigned
         9                       Errored TLVs
                  Any value      The TLV not understood
        10                       Reply TOS Byte

7.  Acknowledgements

   The original acknowledgements from RFC 4379 state the following:

      This document is the outcome of many discussions among many
      people, including Manoj Leelanivas, Paul Traina, Yakov Rekhter,
      Der-Hwa Gan, Brook Bailey, Eric Rosen, Ina Minei, Shivani
      Aggarwal, and Vanson Lim.

      The description of the Multipath Information sub-field of the
      Downstream Mapping TLV was adapted from text suggested by Curtis
      Villamizar.

   We would like to thank Loa Andersson for motivating the advancement
   of this bis specification.

8. References

8.1. Normative References

   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122,
              DOI 10.17487/RFC1122, October 1989.

   [RFC1812]  Baker, F., Ed., "Requirements for IP Version 4 Routers",
              RFC 1812, DOI 10.17487/RFC1812, June 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001.

   [RFC4026]  Andersson, L. and T. Madsen, "Provider Provisioned Virtual
              Private Network (VPN) Terminology", RFC 4026,
              DOI 10.17487/RFC4026, March 2005.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4379]  Kompella, K. and G. Swallow, "Detecting Multi-Protocol
              Label Switched (MPLS) Data Plane Failures", RFC 4379,
              February 2006.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              DOI 10.17487/RFC5226, May 2008.

   [RFC5905]  Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch,
              "Network Time Protocol Version 4: Protocol and Algorithms
              Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010.

8.2. Informative References

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, September 1981.

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
              BGP-4", RFC 3107, DOI 10.17487/RFC3107, May 2001.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001.

   [RFC4365]  Rosen, E., "Applicability Statement for BGP/MPLS IP
              Virtual Private Networks (VPNs)", RFC 4365,
              DOI 10.17487/RFC4365, February 2006.

   [RFC4447]  Martini, L., Ed., Rosen, E., El-Aawar, N., Smith, T., and
              G. Heron, "Pseudowire Setup and Maintenance Using the
              Label Distribution Protocol (LDP)", RFC 4447,
              DOI 10.17487/RFC4447, April 2006.

   [RFC4761]  Kompella, K., Ed. and Y. Rekhter, Ed., "Virtual Private
              LAN Service (VPLS) Using BGP for Auto-Discovery and
              Signaling", RFC 4761, DOI 10.17487/RFC4761, January 2007.

   [RFC5036]  Andersson, L., Ed., Minei, I., Ed., and B. Thomas, Ed.,
              "LDP Specification", RFC 5036, DOI 10.17487/RFC5036,
              October 2007.

   [RFC5085]  Nadeau, T. and C. Pignataro, "Pseudowire Virtual Circuit
              Connectivity Verification (VCCV): A Control Channel for
              Pseudowires", RFC 5085, December 2007.

Authors' Addresses

   Carlos Pignataro
   Cisco Systems, Inc.

   Email: cpignata@cisco.com


   Nagendra Kumar
   Cisco Systems, Inc.

   Email: naikumar@cisco.com


   Sam Aldrin
   Google

   Email: aldrin.ietf@gmail.com


   Mach(Guoyi) Chen
   Huawei

   Email: mach.chen@huawei.com