idnits 2.17.1

draft-smack-mpls-rfc4379bis-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 9 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  == There are 1 instance of lines with multicast IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use the 233.252.0.x range defined in RFC 5771

  == There are 8 instances of lines with private range IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (August 25, 2015) is 3167 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'FEC-stack-depth' is mentioned on line 1730, but not
     defined

  ** Obsolete normative reference: RFC 2434 (ref.
'IANA') (Obsoleted by RFC 5226) ** Obsolete normative reference: RFC 2030 (ref. 'NTP') (Obsoleted by RFC 4330) ** Downref: Normative reference to an Informational RFC: RFC 4026 ** Obsolete normative reference: RFC 4379 (Obsoleted by RFC 8029) -- Obsolete informational reference (is this intentional?): RFC 3107 (ref. 'BGP-LABEL') (Obsoleted by RFC 8277) -- Obsolete informational reference (is this intentional?): RFC 3036 (ref. 'LDP') (Obsoleted by RFC 5036) -- Duplicate reference: RFC3036, mentioned in 'PW-CONTROL', was also mentioned in 'LDP'. -- Obsolete informational reference (is this intentional?): RFC 3036 (ref. 'PW-CONTROL') (Obsoleted by RFC 5036) -- Duplicate reference: RFC3209, mentioned in 'VCCV', was also mentioned in 'RSVP-TE'. -- Duplicate reference: RFC3209, mentioned in 'VPLS-BGP', was also mentioned in 'VCCV'. Summary: 4 errors (**), 0 flaws (~~), 5 warnings (==), 8 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group C. Pignataro 3 Internet-Draft N. Kumar 4 Intended status: Standards Track Cisco 5 Expires: February 26, 2016 S. Aldrin 6 Google 7 M. Chen 8 Huawei 9 August 25, 2015 11 Detecting Multi-Protocol Label Switched (MPLS) Data Plane Failures 12 draft-smack-mpls-rfc4379bis-00.txt 14 Abstract 16 This document describes a simple and efficient mechanism that can be 17 used to detect data plane failures in Multi-Protocol Label Switching 18 (MPLS) Label Switched Paths (LSPs). There are two parts to this 19 document: information carried in an MPLS "echo request" and "echo 20 reply" for the purposes of fault detection and isolation, and 21 mechanisms for reliably sending the echo reply. 23 Status of This Memo 25 This Internet-Draft is submitted in full conformance with the 26 provisions of BCP 78 and BCP 79. 
28 Internet-Drafts are working documents of the Internet Engineering 29 Task Force (IETF). Note that other groups may also distribute 30 working documents as Internet-Drafts. The list of current Internet- 31 Drafts is at http://datatracker.ietf.org/drafts/current/. 33 Internet-Drafts are draft documents valid for a maximum of six months 34 and may be updated, replaced, or obsoleted by other documents at any 35 time. It is inappropriate to use Internet-Drafts as reference 36 material or to cite them other than as "work in progress." 38 This Internet-Draft will expire on February 26, 2016. 40 Copyright Notice 42 Copyright (c) 2015 IETF Trust and the persons identified as the 43 document authors. All rights reserved. 45 This document is subject to BCP 78 and the IETF Trust's Legal 46 Provisions Relating to IETF Documents 47 (http://trustee.ietf.org/license-info) in effect on the date of 48 publication of this document. Please review these documents 49 carefully, as they describe your rights and restrictions with respect 50 to this document. Code Components extracted from this document must 51 include Simplified BSD License text as described in Section 4.e of 52 the Trust Legal Provisions and are provided without warranty as 53 described in the Simplified BSD License. 55 Table of Contents 57 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 58 1.1. Conventions . . . . . . . . . . . . . . . . . . . . . . 3 59 1.2. Structure of This Document . . . . . . . . . . . . . . . 4 60 1.3. Contributors . . . . . . . . . . . . . . . . . . . . . . 4 61 1.4. Scope of RFC4379bis work . . . . . . . . . . . . . . . . 4 62 2. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 5 63 2.1. Use of Address Range 127/8 . . . . . . . . . . . . . . . 5 64 3. Packet Format . . . . . . . . . . . . . . . . . . . . . . . 7 65 3.1. Return Codes . . . . . . . . . . . . . . . . . . . . . . 11 66 3.2. Target FEC Stack . . . . . . . . . . . . . . . . . . . . 12 67 3.2.1. 
LDP IPv4 Prefix . . . . . . . . . . . . . . . . . . 14 68 3.2.2. LDP IPv6 Prefix . . . . . . . . . . . . . . . . . . 14 69 3.2.3. RSVP IPv4 LSP . . . . . . . . . . . . . . . . . . . 14 70 3.2.4. RSVP IPv6 LSP . . . . . . . . . . . . . . . . . . . 15 71 3.2.5. VPN IPv4 Prefix . . . . . . . . . . . . . . . . . . 15 72 3.2.6. VPN IPv6 Prefix . . . . . . . . . . . . . . . . . . 16 73 3.2.7. L2 VPN Endpoint . . . . . . . . . . . . . . . . . . 17 74 3.2.8. FEC 128 Pseudowire (Deprecated) . . . . . . . . . . 17 75 3.2.9. FEC 128 Pseudowire (Current) . . . . . . . . . . . . 18 76 3.2.10. FEC 129 Pseudowire . . . . . . . . . . . . . . . . . 19 77 3.2.11. BGP Labeled IPv4 Prefix . . . . . . . . . . . . . . 20 78 3.2.12. BGP Labeled IPv6 Prefix . . . . . . . . . . . . . . 20 79 3.2.13. Generic IPv4 Prefix . . . . . . . . . . . . . . . . 21 80 3.2.14. Generic IPv6 Prefix . . . . . . . . . . . . . . . . 21 81 3.2.15. Nil FEC . . . . . . . . . . . . . . . . . . . . . . 22 82 3.3. Downstream Mapping . . . . . . . . . . . . . . . . . . . 22 83 3.3.1. Multipath Information Encoding . . . . . . . . . . . 26 84 3.3.2. Downstream Router and Interface . . . . . . . . . . 28 85 3.4. Pad TLV . . . . . . . . . . . . . . . . . . . . . . . . 29 86 3.5. Vendor Enterprise Number . . . . . . . . . . . . . . . . 29 87 3.6. Interface and Label Stack . . . . . . . . . . . . . . . 29 88 3.7. Errored TLVs . . . . . . . . . . . . . . . . . . . . . . 31 89 3.8. Reply TOS Byte TLV . . . . . . . . . . . . . . . . . . . 31 90 4. Theory of Operation . . . . . . . . . . . . . . . . . . . . 32 91 4.1. Dealing with Equal-Cost Multi-Path (ECMP) . . . . . . . 32 92 4.2. Testing LSPs That Are Used to Carry MPLS Payloads . . . 33 93 4.3. Sending an MPLS Echo Request . . . . . . . . . . . . . . 33 94 4.4. Receiving an MPLS Echo Request . . . . . . . . . . . . . 34 95 4.4.1. FEC Validation . . . . . . . . . . . . . . . . . . . 40 96 4.5. Sending an MPLS Echo Reply . . . . . . . . . . . . . . . 41 97 4.6. 
Receiving an MPLS Echo Reply . . . . . . . . . . . . . . 42 98 4.7. Issue with VPN IPv4 and IPv6 Prefixes . . . . . . . . . 42 99 4.8. Non-compliant Routers . . . . . . . . . . . . . . . . . 43 100 5. Security Considerations . . . . . . . . . . . . . . . . . . 43 101 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . 44 102 6.1. Message Types, Reply Modes, Return Codes . . . . . . . . 45 103 6.2. TLVs . . . . . . . . . . . . . . . . . . . . . . . . . . 45 104 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 46 105 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 47 106 8.1. Normative References . . . . . . . . . . . . . . . . . . 47 107 8.2. Informative References . . . . . . . . . . . . . . . . . 47 108 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 48 110 1. Introduction 112 This document describes a simple and efficient mechanism that can be 113 used to detect data plane failures in MPLS Label Switched Paths 114 (LSPs). There are two parts to this document: information carried in 115 an MPLS "echo request" and "echo reply", and mechanisms for 116 transporting the echo reply. The first part aims at providing enough 117 information to check correct operation of the data plane, as well as 118 a mechanism to verify the data plane against the control plane, and 119 thereby localize faults. The second part suggests two methods of 120 reliable reply channels for the echo request message for more robust 121 fault isolation. 123 An important consideration in this design is that MPLS echo requests 124 follow the same data path that normal MPLS packets would traverse. 125 MPLS echo requests are meant primarily to validate the data plane, 126 and secondarily to verify the data plane against the control plane. 127 Mechanisms to check the control plane are valuable, but are not 128 covered in this document. 130 This document makes special use of the address range 127/8. 
This is 131 an exception to the behavior defined in RFC 1122 [RFC1122] and 132 updates that RFC. The motivation for this change and the details of 133 this exceptional use are discussed in section 2.1 below. 135 1.1. Conventions 137 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 138 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 139 document are to be interpreted as described in RFC 2119 [KEYWORDS]. 141 The term "Must Be Zero" (MBZ) is used in object descriptions for 142 reserved fields. These fields MUST be set to zero when sent and 143 ignored on receipt. 145 Terminology pertaining to L2 and L3 Virtual Private Networks (VPNs) 146 is defined in [RFC4026]. 148 Since this document refers to the MPLS Time to Live (TTL) far more 149 frequently than the IP TTL, the authors have chosen the convention of 150 using the unqualified "TTL" to mean "MPLS TTL" and using "IP TTL" for 151 the TTL value in the IP header. 153 1.2. Structure of This Document 155 The body of this memo contains four main parts: motivation, MPLS echo 156 request/reply packet format, LSP ping operation, and a reliable 157 return path. It is suggested that first-time readers skip the actual 158 packet formats and read the Theory of Operation first; the document 159 is structured the way it is to avoid forward references. 161 1.3. Contributors 163 A mechanism used to detect data plane failures in Multi-Protocol 164 Label Switching (MPLS) Label Switched Paths (LSPs) was originally 165 published as RFC 4379 in February 2006. It was produced by the MPLS 166 Working Group of the IETF and was jointly authored by Kireeti 167 Kompella and George Swallow. 169 The following made vital contributions to all aspects of the original 170 RFC 4379, and much of the material came out of debate and discussion 171 among this group. 173 Ronald P. Bonica, Juniper Networks, Inc. 
174 Dave Cooper, Global Crossing 175 Ping Pan, Hammerhead Systems 176 Nischal Sheth, Juniper Networks, Inc. 177 Sanjay Wadhwa, Juniper Networks, Inc. 179 1.4. Scope of RFC4379bis work 181 The goal of this document is to take LSP Ping to an Internet 182 Standard. 184 [RFC4379] defines the basic mechanism for MPLS LSP validation that 185 can be used for fault detection and isolation. The scope of this 186 document also is to address the various errata and updates to MPLS 187 LSP Ping, including:To-be-Updated. 189 2. Motivation 191 When an LSP fails to deliver user traffic, the failure cannot always 192 be detected by the MPLS control plane. There is a need to provide a 193 tool that would enable users to detect such traffic "black holes" or 194 misrouting within a reasonable period of time, and a mechanism to 195 isolate faults. 197 In this document, we describe a mechanism that accomplishes these 198 goals. This mechanism is modeled after the ping/traceroute paradigm: 199 ping (ICMP echo request [ICMP]) is used for connectivity checks, and 200 traceroute is used for hop-by-hop fault localization as well as path 201 tracing. This document specifies a "ping" mode and a "traceroute" 202 mode for testing MPLS LSPs. 204 The basic idea is to verify that packets that belong to a particular 205 Forwarding Equivalence Class (FEC) actually end their MPLS path on a 206 Label Switching Router (LSR) that is an egress for that FEC. This 207 document proposes that this test be carried out by sending a packet 208 (called an "MPLS echo request") along the same data path as other 209 packets belonging to this FEC. An MPLS echo request also carries 210 information about the FEC whose MPLS path is being verified. This 211 echo request is forwarded just like any other packet belonging to 212 that FEC. 
In "ping" mode (basic connectivity check), the packet 213 should reach the end of the path, at which point it is sent to the 214 control plane of the egress LSR, which then verifies whether it is 215 indeed an egress for the FEC. In "traceroute" mode (fault 216 isolation), the packet is sent to the control plane of each transit 217 LSR, which performs various checks that it is indeed a transit LSR 218 for this path; this LSR also returns further information that helps 219 check the control plane against the data plane, i.e., that forwarding 220 matches what the routing protocols determined as the path. 222 One way these tools can be used is to periodically ping an FEC to 223 ensure connectivity. If the ping fails, one can then initiate a 224 traceroute to determine where the fault lies. One can also 225 periodically traceroute FECs to verify that forwarding matches the 226 control plane; however, this places a greater burden on transit LSRs 227 and thus should be used with caution. 229 2.1. Use of Address Range 127/8 231 As described above, LSP ping is intended as a diagnostic tool. It is 232 intended to enable providers of an MPLS-based service to isolate 233 network faults. In particular, LSP ping needs to diagnose situations 234 where the control and data planes are out of sync. It performs this 235 by routing an MPLS echo request packet based solely on its label 236 stack. That is, the IP destination address is never used in a 237 forwarding decision. In fact, the sender of an MPLS echo request 238 packet may not know, a priori, the address of the router at the end 239 of the LSP. 241 Providers of MPLS-based services also need the ability to trace all 242 of the possible paths that an LSP may take. Since most MPLS services 243 are based on IP unicast forwarding, these paths are subject to 244 equal-cost multi-path (ECMP) load sharing. 246 This leads to the following requirements: 248 1. 
Although the LSP in question may be broken in unknown ways, the 249 likelihood of a diagnostic packet being delivered to a user of an 250 MPLS service MUST be held to an absolute minimum. 252 2. If an LSP is broken in such a way that it prematurely terminates, 253 the diagnostic packet MUST NOT be IP forwarded. 255 3. A means of varying the diagnostic packets such that they exercise 256 all ECMP paths is thus REQUIRED. 258 Clearly, using general unicast addresses satisfies neither of the 259 first two requirements. A number of other options for addresses were 260 considered, including a portion of the private address space (as 261 determined by the network operator) and the newly designated IPv4 262 link local addresses. Use of the private address space was deemed 263 ineffective since the leading MPLS-based service is an IPv4 Virtual 264 Private Network (VPN). VPNs often use private addresses. 266 The IPv4 link local addresses are more attractive in that the scope 267 over which they can be forwarded is limited. However, if one were to 268 use an address from this range, it would still be possible for the 269 first recipient of a diagnostic packet that "escaped" from a broken 270 LSP to have that address assigned to the interface on which it 271 arrived and thus could mistakenly receive such a packet. 272 Furthermore, the IPv4 link local address range has only recently been 273 allocated. Many deployed routers would forward a packet with an 274 address from that range toward the default route. 276 The 127/8 range for IPv4 and that same range embedded in as 277 IPv4-mapped IPv6 addresses for IPv6 was chosen for a number of 278 reasons. 280 RFC 1122 allocates the 127/8 as "Internal host loopback address" and 281 states: "Addresses of this form MUST NOT appear outside a host." 282 Thus, the default behavior of hosts is to discard such packets. This 283 helps to ensure that if a diagnostic packet is misdirected to a host, 284 it will be silently discarded. 
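The address choice described above can be sketched in a few lines. The following is a minimal illustration, not part of this specification: it checks whether a destination falls in 127/8 (or in the same range embedded as an IPv4-mapped IPv6 address) and picks distinct 127/8 destinations that a sender might cycle through to exercise different ECMP paths. The function names are hypothetical, and how a given LSR hashes the destination address onto its ECMP paths is implementation specific and not modeled here.

```python
import ipaddress
import random

# 127/8 for IPv4, and the same range embedded as IPv4-mapped IPv6
# (::ffff:127.0.0.0/104). Constant names are illustrative only.
LOOPBACK_V4 = ipaddress.ip_network("127.0.0.0/8")
LOOPBACK_V4_MAPPED = ipaddress.ip_network("::ffff:127.0.0.0/104")


def is_lsp_ping_destination(addr: str) -> bool:
    """Return True if addr lies in 127/8 or its IPv4-mapped IPv6 form."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return ip in LOOPBACK_V4
    return ip in LOOPBACK_V4_MAPPED


def pick_destinations(count: int, seed: int = 0) -> list:
    """Pick `count` distinct 127/8 addresses (hypothetical helper).

    The /8 holds 2**24 addresses; the all-zeros and all-ones host
    parts are skipped here purely as a conservative assumption.
    """
    rng = random.Random(seed)
    base = int(LOOPBACK_V4.network_address)
    hosts = rng.sample(range(1, LOOPBACK_V4.num_addresses - 1), count)
    return [str(ipaddress.IPv4Address(base + h)) for h in hosts]


def to_v4_mapped(addr: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address as an IPv4-mapped IPv6 address."""
    return ipaddress.IPv6Address("::ffff:" + addr)
```

A sender could, for example, call `pick_destinations(8)` and send one echo request per returned address; any packet that escapes the LSP still carries a destination that hosts discard and routers should not IP forward.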
286 RFC 1812 [RFC1812] states: 288 A router SHOULD NOT forward, except over a loopback interface, any 289 packet that has a destination address on network 127. A router 290 MAY have a switch that allows the network manager to disable these 291 checks. If such a switch is provided, it MUST default to 292 performing the checks. 294 This helps to ensure that diagnostic packets are never IP forwarded. 296 The 127/8 address range provides 16M addresses allowing wide 297 flexibility in varying addresses to exercise ECMP paths. Finally, as 298 an implementation optimization, the 127/8 provides an easy means of 299 identifying possible LSP packets. 301 3. Packet Format 303 An MPLS echo request is a (possibly labeled) IPv4 or IPv6 UDP packet; 304 the contents of the UDP packet have the following format: 306 0 1 2 3 307 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 308 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 309 | Version Number | Global Flags | 310 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 311 | Message Type | Reply mode | Return Code | Return Subcode| 312 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 313 | Sender's Handle | 314 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 315 | Sequence Number | 316 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 317 | TimeStamp Sent (seconds) | 318 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 319 | TimeStamp Sent (microseconds) | 320 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 321 | TimeStamp Received (seconds) | 322 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 323 | TimeStamp Received (microseconds) | 324 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 325 | TLVs ... | 326 . . 327 . . 328 . . 329 | | 330 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 332 The Version Number is currently 1. 
(Note: the version number is to 333 be incremented whenever a change is made that affects the ability of 334 an implementation to correctly parse or process an MPLS echo 335 request/reply. These changes include any syntactic or semantic 336 changes made to any of the fixed fields, or to any Type-Length-Value 337 (TLV) or sub-TLV assignment or format that is defined at a certain 338 version number. The version number may not need to be changed if an 339 optional TLV or sub-TLV is added.) 341 The Global Flags field is a bit vector with the following format: 343 0 1 344 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 345 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 346 | MBZ |V| 347 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 349 One flag is defined for now, the V bit; the rest MUST be set to zero 350 when sending and ignored on receipt. 352 The V (Validate FEC Stack) flag is set to 1 if the sender wants the 353 receiver to perform FEC Stack validation; if V is 0, the choice is 354 left to the receiver. 356 The Message Type is one of the following: 358 Value Meaning 359 ----- ------- 360 1 MPLS echo request 361 2 MPLS echo reply 363 The Reply Mode can take one of the following values: 365 Value Meaning 366 ----- ------- 367 1 Do not reply 368 2 Reply via an IPv4/IPv6 UDP packet 369 3 Reply via an IPv4/IPv6 UDP packet with Router Alert 370 4 Reply via application level control channel 372 An MPLS echo request with 1 (Do not reply) in the Reply Mode field 373 may be used for one-way connectivity tests; the receiving router may 374 log gaps in the Sequence Numbers and/or maintain delay/jitter 375 statistics. An MPLS echo request would normally have 2 (Reply via an 376 IPv4/IPv6 UDP packet) in the Reply Mode field. If the normal IP 377 return path is deemed unreliable, one may use 3 (Reply via an IPv4/ 378 IPv6 UDP packet with Router Alert). Note that this requires that all 379 intermediate routers understand and know how to forward MPLS echo 380 replies. 
The echo reply uses the same IP version number as the 381 received echo request, i.e., an IPv4 encapsulated echo reply is sent 382 in response to an IPv4 encapsulated echo request. 384 Some applications support an IP control channel. One such example is 385 the associated control channel defined in Virtual Circuit 386 Connectivity Verification (VCCV) [VCCV]. Any application that 387 supports an IP control channel between its control entities may set 388 the Reply Mode to 4 (Reply via application level control channel) to 389 ensure that replies use that same channel. Further definition of 390 this codepoint is application specific and thus beyond the scope of 391 this document. 393 Return Codes and Subcodes are described in the next section. 395 The Sender's Handle is filled in by the sender, and returned 396 unchanged by the receiver in the echo reply (if any). There are no 397 semantics associated with this handle, although a sender may find 398 this useful for matching up requests with replies. 400 The Sequence Number is assigned by the sender of the MPLS echo 401 request and can be (for example) used to detect missed replies. 403 The TimeStamp Sent is the time-of-day (in seconds and microseconds, 404 according to the sender's clock) in NTP format [NTP] when the MPLS 405 echo request is sent. The TimeStamp Received in an echo reply is the 406 time-of-day (according to the receiver's clock) in NTP format that 407 the corresponding echo request was received. 409 TLVs (Type-Length-Value tuples) have the following format: 411 0 1 2 3 412 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 413 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 414 | Type | Length | 415 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 416 | Value | 417 . . 418 . . 419 . . 420 | | 421 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 423 Types are defined below; Length is the length of the Value field in 424 octets. 
The Value field depends on the Type; it is zero padded to
425    align to a 4-octet boundary.  TLVs may be nested within other TLVs,
426    in which case the nested TLVs are called sub-TLVs.  Sub-TLVs have
427    independent types and MUST also be 4-octet aligned.

429    Two examples follow.  The Label Distribution Protocol (LDP) IPv4 FEC
430    sub-TLV has the following format:

432        0                   1                   2                   3
433        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
434       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
435       |  Type = 1 (LDP IPv4 FEC)      |          Length = 5           |
436       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
437       |                          IPv4 prefix                          |
438       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
439       | Prefix Length |         Must Be Zero                          |
440       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

442    The Length for this TLV is 5.  A Target FEC Stack TLV that contains
443    an LDP IPv4 FEC sub-TLV and a VPN IPv4 prefix sub-TLV has the
444    following format:

446        0                   1                   2                   3
447        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
448       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
449       |  Type = 1 (FEC TLV)           |          Length = 32          |
450       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
451       |  sub-Type = 1 (LDP IPv4 FEC)  |          Length = 5           |
452       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
453       |                          IPv4 prefix                          |
454       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
455       | Prefix Length |         Must Be Zero                          |
456       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
457       | sub-Type = 6 (VPN IPv4 prefix)|          Length = 13          |
458       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
459       |                      Route Distinguisher                      |
460       |                          (8 octets)                           |
461       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
462       |                          IPv4 prefix                          |
463       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
464       | Prefix Length |         Must Be Zero                          |
465       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
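The TLV layout and padding rule described above can be illustrated with a short encoder. This is a minimal sketch under stated assumptions, not a reference implementation: the helper names are hypothetical, the prefixes are documentation addresses chosen for the example, and the Route Distinguisher is an all-zero placeholder. Each TLV carries a 2-octet Type, a 2-octet Length giving the unpadded Value length, and a Value zero padded to a 4-octet boundary.

```python
import socket
import struct


def pad4(data: bytes) -> bytes:
    """Zero-pad data up to the next 4-octet boundary."""
    return data + b"\x00" * (-len(data) % 4)


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Type and Length in network byte order, then the padded Value.

    Length counts the Value octets only, excluding trailing padding.
    """
    return struct.pack("!HH", tlv_type, len(value)) + pad4(value)


def ldp_ipv4_value(prefix: str, prefix_len: int) -> bytes:
    """4 octets of IPv4 prefix plus 1 octet of prefix length (5 octets)."""
    return socket.inet_aton(prefix) + struct.pack("!B", prefix_len)


def vpn_ipv4_value(rd: bytes, prefix: str, prefix_len: int) -> bytes:
    """8-octet RD, 4-octet IPv4 prefix, 1-octet prefix length (13 octets)."""
    if len(rd) != 8:
        raise ValueError("Route Distinguisher must be 8 octets")
    return rd + socket.inet_aton(prefix) + struct.pack("!B", prefix_len)


def decode_tlvs(buf: bytes):
    """Yield (type, value) pairs from a run of TLVs, skipping padding."""
    offset = 0
    while offset + 4 <= len(buf):
        tlv_type, length = struct.unpack_from("!HH", buf, offset)
        yield tlv_type, buf[offset + 4:offset + 4 + length]
        offset += 4 + length + (-length % 4)


# Example values are assumptions made for illustration.
ldp = encode_tlv(1, ldp_ipv4_value("192.0.2.1", 32))
vpn = encode_tlv(6, vpn_ipv4_value(bytes(8), "198.51.100.0", 24))
fec_stack = encode_tlv(1, ldp + vpn)  # Target FEC Stack holding both sub-TLVs
```

With these rules the LDP IPv4 sub-TLV occupies 12 octets on the wire (4 of header, 5 of Value, 3 of padding) and the VPN IPv4 sub-TLV occupies 20, so the Value of the enclosing Target FEC Stack TLV is 32 octets long.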
467    A description of the Types and Values of the top-level TLVs for LSP
468    ping is given below:

470           Type #                  Value Field
471           ------                  -----------
472                1                  Target FEC Stack
473                2                  Downstream Mapping
474                3                  Pad
475                4                  Not Assigned
476                5                  Vendor Enterprise Number
477                6                  Not Assigned
478                7                  Interface and Label Stack
479                8                  Not Assigned
480                9                  Errored TLVs
481               10                  Reply TOS Byte

483    Types less than 32768 (i.e., with the high-order bit equal to 0) are
484    mandatory TLVs that MUST either be supported by an implementation or
485    result in the return code of 2 ("One or more of the TLVs was not
486    understood") being sent in the echo response.

488    Types greater than or equal to 32768 (i.e., with the high-order bit
489    equal to 1) are optional TLVs that SHOULD be ignored if the
490    implementation does not understand or support them.

492 3.1.  Return Codes

494    The Return Code is set to zero by the sender.  The receiver can set
495    it to one of the values listed below.  The notation <RSC> refers to
496    the Return Subcode.  This field is filled in with the stack-depth for
497    those codes that specify that.  For all other codes, the Return
498    Subcode MUST be set to zero.
500        Value    Meaning
501        -----    -------
502            0    No return code
503            1    Malformed echo request received
504            2    One or more of the TLVs was not understood
505            3    Replying router is an egress for the FEC at
506                 stack-depth <RSC>
507            4    Replying router has no mapping for the FEC at
508                 stack-depth <RSC>
509            5    Downstream Mapping Mismatch (See Note 1)
510            6    Upstream Interface Index Unknown (See Note 1)
511            7    Reserved
512            8    Label switched at stack-depth <RSC>
513            9    Label switched but no MPLS forwarding at stack-depth
514                 <RSC>
515           10    Mapping for this FEC is not the given label at
516                 stack-depth <RSC>
517           11    No label entry at stack-depth <RSC>
518           12    Protocol not associated with interface at FEC
519                 stack-depth <RSC>
520           13    Premature termination of ping due to label stack
521                 shrinking to a single label

523    Note 1

525       The Return Subcode contains the point in the label stack where
526       processing was terminated.  If the RSC is 0, no labels were
527       processed.  Otherwise the packet would have been label switched at
528       depth RSC.

530 3.2.  Target FEC Stack

532    A Target FEC Stack is a list of sub-TLVs.  The number of elements is
533    determined by looking at the sub-TLV length fields.

535        Sub-Type       Length            Value Field
536        --------       ------            -----------
537               1            5            LDP IPv4 prefix
538               2           17            LDP IPv6 prefix
539               3           20            RSVP IPv4 LSP
540               4           56            RSVP IPv6 LSP
541               5                         Not Assigned
542               6           13            VPN IPv4 prefix
543               7           25            VPN IPv6 prefix
544               8           14            L2 VPN endpoint
545               9           10            "FEC 128" Pseudowire (deprecated)
546              10           14            "FEC 128" Pseudowire
547              11          16+            "FEC 129" Pseudowire
548              12            5            BGP labeled IPv4 prefix
549              13           17            BGP labeled IPv6 prefix
550              14            5            Generic IPv4 prefix
551              15           17            Generic IPv6 prefix
552              16            4            Nil FEC

554    Other FEC Types will be defined as needed.

556    Note that this TLV defines a stack of FECs, the first FEC element
557    corresponding to the top of the label stack, etc.

559    An MPLS echo request MUST have a Target FEC Stack that describes the
560    FEC Stack being tested.
For example, if an LSR X has an LDP mapping 561 [LDP] for 192.168.1.1 (say, label 1001), then to verify that label 562 1001 does indeed reach an egress LSR that announced this prefix via 563 LDP, X can send an MPLS echo request with an FEC Stack TLV with one 564 FEC in it, namely, of type LDP IPv4 prefix, with prefix 565 192.168.1.1/32, and send the echo request with a label of 1001. 567 Say LSR X wanted to verify that a label stack of <1001, 23456> is the 568 right label stack to use to reach a VPN IPv4 prefix [see section 569 3.2.5] of 10/8 in VPN foo. Say further that LSR Y with loopback 570 address 192.168.1.1 announced prefix 10/8 with Route Distinguisher 571 RD-foo-Y (which may in general be different from the Route 572 Distinguisher that LSR X uses in its own advertisements for VPN foo), 573 label 23456 and BGP next hop 192.168.1.1 [BGP]. Finally, suppose 574 that LSR X receives a label binding of 1001 for 192.168.1.1 via LDP. 575 X has two choices in sending an MPLS echo request: X can send an MPLS 576 echo request with an FEC Stack TLV with a single FEC of type VPN IPv4 577 prefix with a prefix of 10/8 and a Route Distinguisher of RD-foo-Y. 578 Alternatively, X can send an FEC Stack TLV with two FECs, the first 579 of type LDP IPv4 with a prefix of 192.168.1.1/32 and the second of 580 type of IP VPN with a prefix 10/8 with Route Distinguisher of RD-foo- 581 Y. In either case, the MPLS echo request would have a label stack of 582 <1001, 23456>. (Note: in this example, 1001 is the "outer" label and 583 23456 is the "inner" label.) 585 3.2.1. LDP IPv4 Prefix 587 The IPv4 Prefix FEC is defined in [LDP]. When an LDP IPv4 prefix is 588 encoded in a label stack, the following format is used. The value 589 consists of 4 octets of an IPv4 prefix followed by 1 octet of prefix 590 length in bits; the format is given below. The IPv4 prefix is in 591 network byte order; if the prefix is shorter than 32 bits, trailing 592 bits SHOULD be set to zero. 
See [LDP] for an example of a Mapping 593 for an IPv4 FEC. 595 0 1 2 3 596 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 597 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 598 | IPv4 prefix | 599 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 600 | Prefix Length | Must Be Zero | 601 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 603 3.2.2. LDP IPv6 Prefix 605 The IPv6 Prefix FEC is defined in [LDP]. When an LDP IPv6 prefix is 606 encoded in a label stack, the following format is used. The value 607 consists of 16 octets of an IPv6 prefix followed by 1 octet of prefix 608 length in bits; the format is given below. The IPv6 prefix is in 609 network byte order; if the prefix is shorter than 128 bits, the 610 trailing bits SHOULD be set to zero. See [LDP] for an example of a 611 Mapping for an IPv6 FEC. 613 0 1 2 3 614 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 615 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 616 | IPv6 prefix | 617 | (16 octets) | 618 | | 619 | | 620 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 621 | Prefix Length | Must Be Zero | 622 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 624 3.2.3. RSVP IPv4 LSP 626 The value has the format below. The value fields are taken from RFC 627 3209, sections 4.6.1.1 and 4.6.2.1. See [RSVP-TE]. 
629 0 1 2 3 630 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 631 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 632 | IPv4 tunnel end point address | 633 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 634 | Must Be Zero | Tunnel ID | 635 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 636 | Extended Tunnel ID | 637 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 638 | IPv4 tunnel sender address | 639 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 640 | Must Be Zero | LSP ID | 641 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 643 3.2.4. RSVP IPv6 LSP 645 The value has the format below. The value fields are taken from RFC 646 3209, sections 4.6.1.2 and 4.6.2.2. See [RSVP-TE]. 648 0 1 2 3 649 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 650 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 651 | IPv6 tunnel end point address | 652 | | 653 | | 654 | | 655 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 656 | Must Be Zero | Tunnel ID | 657 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 658 | Extended Tunnel ID | 659 | | 660 | | 661 | | 662 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 663 | IPv6 tunnel sender address | 664 | | 665 | | 666 | | 667 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 668 | Must Be Zero | LSP ID | 669 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 671 3.2.5. VPN IPv4 Prefix 673 VPN-IPv4 Network Layer Routing Information (NLRI) is defined in 674 [RFC4365]. This document uses the term VPN IPv4 prefix for a VPN- 675 IPv4 NLRI that has been advertised with an MPLS label in BGP. See 676 [BGP-LABEL]. 678 When a VPN IPv4 prefix is encoded in a label stack, the following 679 format is used. 
The value field consists of the Route Distinguisher 680 advertised with the VPN IPv4 prefix, the IPv4 prefix (with trailing 0 681 bits to make 32 bits in all), and a prefix length, as follows: 683 0 1 2 3 684 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 685 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 686 | Route Distinguisher | 687 | (8 octets) | 688 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 689 | IPv4 prefix | 690 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 691 | Prefix Length | Must Be Zero | 692 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 694 The Route Distinguisher (RD) is an 8-octet identifier; it does not 695 contain any inherent information. The purpose of the RD is solely to 696 allow one to create distinct routes to a common IPv4 address prefix. 697 The encoding of the RD is not important here. When matching this 698 field to the local FEC information, it is treated as an opaque value. 700 3.2.6. VPN IPv6 Prefix 702 VPN-IPv6 Network Layer Routing Information (NLRI) is defined in 703 [RFC4365]. This document uses the term VPN IPv6 prefix for a VPN- 704 IPv6 NLRI that has been advertised with an MPLS label in BGP. See 705 [BGP-LABEL]. 707 When a VPN IPv6 prefix is encoded in a label stack, the following 708 format is used. 
The value field consists of the Route Distinguisher 709 advertised with the VPN IPv6 prefix, the IPv6 prefix (with trailing 0 710 bits to make 128 bits in all), and a prefix length, as follows: 712 0 1 2 3 713 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 714 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 715 | Route Distinguisher | 716 | (8 octets) | 717 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 718 | IPv6 prefix | 719 | | 720 | | 721 | | 722 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 723 | Prefix Length | Must Be Zero | 724 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 726 The Route Distinguisher is identical to the VPN IPv4 Prefix RD, 727 except that it functions here to allow the creation of distinct 728 routes to IPv6 prefixes. See section 3.2.5. When matching this 729 field to local FEC information, it is treated as an opaque value. 731 3.2.7. L2 VPN Endpoint 733 VPLS stands for Virtual Private LAN Service. The terms VPLS BGP NLRI 734 and VE ID (VPLS Edge Identifier) are defined in [VPLS-BGP]. This 735 document uses the simpler term L2 VPN endpoint when referring to a 736 VPLS BGP NLRI. The Route Distinguisher is an 8-octet identifier used 737 to distinguish information about various L2 VPNs advertised by a 738 node. The VE ID is a 2-octet identifier used to identify a 739 particular node that serves as the service attachment point within a 740 VPLS. The structure of these two identifiers is unimportant here; 741 when matching these fields to local FEC information, they are treated 742 as opaque values. The encapsulation type is identical to the PW Type 743 in section 3.2.8 below. 745 When an L2 VPN endpoint is encoded in a label stack, the following 746 format is used. 
The value field consists of a Route Distinguisher (8 747 octets), the sender (of the ping)'s VE ID (2 octets), the receiver's 748 VE ID (2 octets), and an encapsulation type (2 octets), formatted as 749 follows: 751 0 1 2 3 752 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 753 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 754 | Route Distinguisher | 755 | (8 octets) | 756 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 757 | Sender's VE ID | Receiver's VE ID | 758 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 759 | Encapsulation Type | Must Be Zero | 760 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 762 3.2.8. FEC 128 Pseudowire (Deprecated) 764 FEC 128 (0x80) is defined in [PW-CONTROL], as are the terms PW ID 765 (Pseudowire ID) and PW Type (Pseudowire Type). A PW ID is a non-zero 766 32-bit connection ID. The PW Type is a 15-bit number indicating the 767 encapsulation type. It is carried right justified in the field below 768 termed encapsulation type with the high-order bit set to zero. Both 769 of these fields are treated in this protocol as opaque values. 771 When an FEC 128 is encoded in a label stack, the following format is 772 used. The value field consists of the remote PE address (the 773 destination address of the targeted LDP session), the PW ID, and the 774 encapsulation type as follows: 776 0 1 2 3 777 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 778 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 779 | Remote PE Address | 780 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 781 | PW ID | 782 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 783 | PW Type | Must Be Zero | 784 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 786 This FEC is deprecated and is retained only for backward 787 compatibility. 
Implementations of LSP ping SHOULD accept and process 788 this TLV, but SHOULD send LSP ping echo requests with the new TLV 789 (see next section), unless explicitly configured to use the old TLV. 791 An LSR receiving this TLV SHOULD use the source IP address of the LSP 792 echo request to infer the sender's PE address. 794 3.2.9. FEC 128 Pseudowire (Current) 796 FEC 128 (0x80) is defined in [PW-CONTROL], as are the terms PW ID 797 (Pseudowire ID) and PW Type (Pseudowire Type). A PW ID is a non-zero 798 32-bit connection ID. The PW Type is a 15-bit number indicating the 799 encapsulation type. It is carried right justified in the field below 800 termed encapsulation type with the high-order bit set to zero. 802 Both of these fields are treated in this protocol as opaque values. 803 When matching these fields to the local FEC information, the match 804 MUST be exact. 806 When an FEC 128 is encoded in a label stack, the following format is 807 used. The value field consists of the sender's PE address (the 808 source address of the targeted LDP session), the remote PE address 809 (the destination address of the targeted LDP session), the PW ID, and 810 the encapsulation type as follows: 812 0 1 2 3 813 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 814 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 815 | Sender's PE Address | 816 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 817 | Remote PE Address | 818 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 819 | PW ID | 820 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 821 | PW Type | Must Be Zero | 822 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 824 3.2.10.
FEC 129 Pseudowire 826 FEC 129 (0x81) and the terms PW Type, Attachment Group Identifier 827 (AGI), Attachment Group Identifier Type (AGI Type), Attachment 828 Individual Identifier Type (AII Type), Source Attachment Individual 829 Identifier (SAII), and Target Attachment Individual Identifier (TAII) 830 are defined in [PW-CONTROL]. The PW Type is a 15-bit number 831 indicating the encapsulation type. It is carried right justified in 832 the PW Type field below, with the high-order bit set to zero. All the 833 other fields are treated as opaque values and copied directly from 834 the FEC 129 format. All of these values together uniquely define the 835 FEC within the scope of the LDP session identified by the source and 836 remote PE addresses. 838 When an FEC 129 is encoded in a label stack, the following format is 839 used. The Length of this TLV is 16 + AGI length + SAII length + TAII 840 length. Padding is used to make the total length a multiple of 4; 841 the length of the padding is not included in the Length field.
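As an informal illustration of the Length rule just stated, the following Python sketch computes the Length field and the number of padding octets for an FEC 129 sub-TLV. The function name fec129_lengths is hypothetical and is not defined by this document; the sketch assumes the AGI, SAII, and TAII values are supplied as byte strings.

```python
def fec129_lengths(agi: bytes, saii: bytes, taii: bytes):
    """Return (Length field, padding octets) for an FEC 129 sub-TLV.

    Hypothetical helper for illustration only.
    """
    # 16 fixed octets: two 4-octet PE addresses, the 2-octet PW Type,
    # and three 1-octet type fields, each paired with a 1-octet length.
    length = 16 + len(agi) + len(saii) + len(taii)
    # Pad to a 4-octet boundary; the padding is not counted in Length.
    padding = (-length) % 4
    return length, padding
```

For instance, an empty AGI with 4-octet SAII and TAII values needs no padding, while odd-sized values require between 1 and 3 zero octets.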
843 0 1 2 3 844 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 845 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 846 | Sender's PE Address | 847 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 848 | Remote PE Address | 849 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 850 | PW Type | AGI Type | AGI Length | 851 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 852 ~ AGI Value ~ 853 | | 854 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 855 | AII Type | SAII Length | SAII Value | 856 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 857 ~ SAII Value (continued) ~ 858 | | 859 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 860 | AII Type | TAII Length | TAII Value | 861 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 862 ~ TAII Value (continued) ~ 863 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 864 | TAII (cont.) | 0-3 octets of zero padding | 865 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 867 3.2.11. BGP Labeled IPv4 Prefix 869 BGP labeled IPv4 prefixes are defined in [BGP-LABEL]. When a BGP 870 labeled IPv4 prefix is encoded in a label stack, the following format 871 is used. The value field consists of the IPv4 prefix (with trailing 0 872 bits to make 32 bits in all), and the prefix length, as follows: 874 0 1 2 3 875 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 876 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 877 | IPv4 Prefix | 878 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 879 | Prefix Length | Must Be Zero | 880 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 882 3.2.12. BGP Labeled IPv6 Prefix 884 BGP labeled IPv6 prefixes are defined in [BGP-LABEL]. When a BGP 885 labeled IPv6 prefix is encoded in a label stack, the following format 886 is used.
The value consists of 16 octets of an IPv6 prefix followed 887 by 1 octet of prefix length in bits; the format is given below. The 888 IPv6 prefix is in network byte order; if the prefix is shorter than 889 128 bits, the trailing bits SHOULD be set to zero. 891 0 1 2 3 892 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 893 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 894 | IPv6 prefix | 895 | (16 octets) | 896 | | 897 | | 898 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 899 | Prefix Length | Must Be Zero | 900 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 902 3.2.13. Generic IPv4 Prefix 904 The value consists of 4 octets of an IPv4 prefix followed by 1 octet 905 of prefix length in bits; the format is given below. The IPv4 prefix 906 is in network byte order; if the prefix is shorter than 32 bits, 907 trailing bits SHOULD be set to zero. This FEC is used if the 908 protocol advertising the label is unknown or may change during the 909 course of the LSP. An example is an inter-AS LSP that may be 910 signaled by LDP in one Autonomous System (AS), by RSVP-TE [RSVP-TE] 911 in another AS, and by BGP between the ASes, such as is common for 912 inter-AS VPNs. 914 0 1 2 3 915 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 916 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 917 | IPv4 prefix | 918 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 919 | Prefix Length | Must Be Zero | 920 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 922 3.2.14. Generic IPv6 Prefix 924 The value consists of 16 octets of an IPv6 prefix followed by 1 octet 925 of prefix length in bits; the format is given below. The IPv6 prefix 926 is in network byte order; if the prefix is shorter than 128 bits, the 927 trailing bits SHOULD be set to zero. 
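As a non-normative sketch of the Generic IPv6 Prefix layout, the Python fragment below produces the 20 octets drawn in the diagram for this sub-TLV: the 16-octet prefix in network byte order, the prefix length, and the three Must Be Zero octets. The function name encode_ipv6_prefix is hypothetical, and the address used in any example is taken from the 2001:db8::/32 documentation range.

```python
import ipaddress


def encode_ipv6_prefix(prefix: str) -> bytes:
    """Encode an IPv6 prefix as 16 prefix octets, 1 length octet, 3 MBZ octets.

    Hypothetical helper for illustration only.
    """
    # strict=False clears any bits beyond the prefix length, which matches
    # the recommendation that trailing bits be set to zero.
    net = ipaddress.IPv6Network(prefix, strict=False)
    return net.network_address.packed + bytes([net.prefixlen, 0, 0, 0])
```

Calling encode_ipv6_prefix("2001:db8:1::5/48") yields the octets of 2001:db8:1::, followed by the length octet 48 and three zero octets.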
929 0 1 2 3 930 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 931 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 932 | IPv6 prefix | 933 | (16 octets) | 934 | | 935 | | 936 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 937 | Prefix Length | Must Be Zero | 938 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 940 3.2.15. Nil FEC 942 At times, labels from the reserved range, e.g., Router Alert and 943 Explicit-null, may be added to the label stack for various diagnostic 944 purposes such as influencing load-balancing. These labels may have 945 no explicit FEC associated with them. The Nil FEC Stack is defined 946 to allow a Target FEC Stack sub-TLV to be added to the Target FEC 947 Stack to account for such labels so that proper validation can still 948 be performed. 950 The Length is 4. Labels are 20-bit values treated as numbers. 952 0 1 2 3 953 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 954 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 955 | Label | MBZ | 956 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 958 Label is the actual label value inserted in the label stack; the MBZ 959 fields MUST be zero when sent and ignored on receipt. 961 3.3. Downstream Mapping 963 The Downstream Mapping object is a TLV that MAY be included in an 964 echo request message. Only one Downstream Mapping object may appear 965 in an echo request. The presence of a Downstream Mapping object is a 966 request that Downstream Mapping objects be included in the echo 967 reply. If the replying router is the destination of the FEC, then a 968 Downstream Mapping TLV SHOULD NOT be included in the echo reply. 969 Otherwise the replying router SHOULD include a Downstream Mapping 970 object for each interface over which this FEC could be forwarded. 
971 For a more precise definition of the notion of "downstream", see 972 section 3.3.2, "Downstream Router and Interface". 974 The Length is K + M + 4*N octets, where M is the Multipath Length, 975 and N is the number of Downstream Labels. Values for K are found in 976 the description of Address Type below. The Value field of a 977 Downstream Mapping has the following format: 979 0 1 2 3 980 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 981 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 982 | MTU | Address Type | DS Flags | 983 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 984 | Downstream IP Address (4 or 16 octets) | 985 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 986 | Downstream Interface Address (4 or 16 octets) | 987 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 988 | Multipath Type| Depth Limit | Multipath Length | 989 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 990 . . 991 . (Multipath Information) . 992 . . 993 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 994 | Downstream Label | Protocol | 995 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 996 . . 997 . . 998 . . 999 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1000 | Downstream Label | Protocol | 1001 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1003 Maximum Transmission Unit (MTU) 1005 The MTU is the size in octets of the largest MPLS frame (including 1006 label stack) that fits on the interface to the Downstream LSR. 1008 Address Type 1010 The Address Type indicates if the interface is numbered or 1011 unnumbered. It also determines the length of the Downstream IP 1012 Address and Downstream Interface fields. The resulting total for 1013 the initial part of the TLV is listed in the table below as "K 1014 Octets". 
The Address Type is set to one of the following values: 1016 Type # Address Type K Octets 1017 ------ ------------ -------- 1018 1 IPv4 Numbered 16 1019 2 IPv4 Unnumbered 16 1020 3 IPv6 Numbered 40 1021 4 IPv6 Unnumbered 28 1023 DS Flags 1025 The DS Flags field is a bit vector with the following format: 1027 0 1 2 3 4 5 6 7 1028 +-+-+-+-+-+-+-+-+ 1029 | Rsvd(MBZ) |I|N| 1030 +-+-+-+-+-+-+-+-+ 1032 Two flags are defined currently, I and N. The remaining flags MUST 1033 be set to zero when sending and ignored on receipt. 1035 Flag Name and Meaning 1036 ---- ---------------- 1037 I Interface and Label Stack Object Request 1039 When this flag is set, it indicates that the replying 1040 router SHOULD include an Interface and Label Stack 1041 Object in the echo reply message. 1043 N Treat as a Non-IP Packet 1045 Echo request messages will be used to diagnose non-IP 1046 flows. However, these messages are carried in IP 1047 packets. For a router that alters its ECMP algorithm 1048 based on the FEC or deep packet examination, this flag 1049 requests that the router treat this as it would if the 1050 determination of an IP payload had failed. 1052 Downstream IP Address and Downstream Interface Address 1054 IPv4 addresses and interface indices are encoded in 4 octets; IPv6 1055 addresses are encoded in 16 octets. 1057 If the interface to the downstream LSR is numbered, then the 1058 Address Type MUST be set to IPv4 or IPv6, the Downstream IP 1059 Address MUST be set to either the downstream LSR's Router ID or 1060 the interface address of the downstream LSR, and the Downstream 1061 Interface Address MUST be set to the downstream LSR's interface 1062 address. 1064 If the interface to the downstream LSR is unnumbered, the Address 1065 Type MUST be IPv4 Unnumbered or IPv6 Unnumbered, the Downstream IP 1066 Address MUST be the downstream LSR's Router ID, and the Downstream 1067 Interface Address MUST be set to the index assigned by the 1068 upstream LSR to the interface. 
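The Length rule (K + M + 4*N) and the DS Flags layout given earlier in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the names DSMAP_K_OCTETS, downstream_mapping_length, and parse_ds_flags are hypothetical, and the flag masks assume bit 0 of the DS Flags diagram is the most significant bit of the octet.

```python
# "K Octets" per Address Type, copied from the Address Type table.
DSMAP_K_OCTETS = {1: 16, 2: 16, 3: 40, 4: 28}


def downstream_mapping_length(address_type: int,
                              multipath_length: int,
                              num_labels: int) -> int:
    """Length = K + M + 4*N octets for a Downstream Mapping TLV."""
    return DSMAP_K_OCTETS[address_type] + multipath_length + 4 * num_labels


def parse_ds_flags(ds_flags: int):
    """Return (I, N) as booleans; reserved bits are ignored on receipt."""
    i_flag = bool(ds_flags & 0x02)  # bit 6 in the diagram
    n_flag = bool(ds_flags & 0x01)  # bit 7 in the diagram
    return i_flag, n_flag
```

For example, an IPv4 Numbered mapping with no Multipath Information and two Downstream Labels has a Length of 16 + 0 + 8 = 24 octets.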
1070 If an LSR does not know the IP address of its neighbor, then it 1071 MUST set the Address Type to either IPv4 Unnumbered or IPv6 1072 Unnumbered. For IPv4, it must set the Downstream IP Address to 1073 127.0.0.1; for IPv6 the address is set to 0::1. In both cases, 1074 the interface index MUST be set to 0. If an LSR receives an Echo 1075 Request packet with either of these addresses in the Downstream IP 1076 Address field, this indicates that it MUST bypass interface 1077 verification but continue with label validation. 1079 If the originator of an Echo Request packet wishes to obtain 1080 Downstream Mapping information but does not know the expected 1081 label stack, then it SHOULD set the Address Type to either IPv4 1082 Unnumbered or IPv6 Unnumbered. For IPv4, it MUST set the 1083 Downstream IP Address to 224.0.0.2; for IPv6 the address MUST be 1084 set to FF02::2. In both cases, the interface index MUST be set to 1085 0. If an LSR receives an Echo Request packet with the all-routers 1086 multicast address, then this indicates that it MUST bypass both 1087 interface and label stack validation, but return Downstream 1088 Mapping TLVs using the information provided. 1090 Multipath Type 1092 The following Multipath Types are defined: 1094 Key Type Multipath Information 1095 --- ---------------- --------------------- 1096 0 no multipath Empty (Multipath Length = 0) 1097 2 IP address IP addresses 1098 4 IP address range low/high address pairs 1099 8 Bit-masked IP IP address prefix and bit mask 1100 address set 1101 9 Bit-masked label set Label prefix and bit mask 1103 Type 0 indicates that all packets will be forwarded out this one 1104 interface. 1106 Types 2, 4, 8, and 9 specify that the supplied Multipath Information 1107 will serve to exercise this path. 1109 Depth Limit 1111 The Depth Limit is applicable only to a label stack and is the 1112 maximum number of labels considered in the hash; this SHOULD be 1113 set to zero if unspecified or unlimited. 
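The two special Downstream IP Address cases described earlier in this section (loopback for an unknown neighbor, all-routers multicast for an unknown label stack) can be summarized by the following Python sketch. The function name validation_mode and the returned strings are hypothetical labels chosen for this illustration; they do not appear in the protocol.

```python
import ipaddress

# Addresses named in the text for the two special cases.
_LOOPBACK = (ipaddress.ip_address("127.0.0.1"), ipaddress.ip_address("::1"))
_ALL_ROUTERS = (ipaddress.ip_address("224.0.0.2"),
                ipaddress.ip_address("ff02::2"))


def validation_mode(downstream_ip: str) -> str:
    """Classify how a receiving LSR validates a Downstream Mapping."""
    addr = ipaddress.ip_address(downstream_ip)
    if addr in _ALL_ROUTERS:
        # Bypass both interface and label stack validation.
        return "skip-interface-and-label"
    if addr in _LOOPBACK:
        # Bypass interface verification, continue with label validation.
        return "skip-interface"
    return "full"
```

Any other address, such as the documentation address 192.0.2.1, results in full validation in this sketch.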
1115 Multipath Length 1117 The length in octets of the Multipath Information. 1119 Multipath Information 1121 Address or label values encoded according to the Multipath Type. 1122 See the next section below for encoding details. 1124 Downstream Label(s) 1126 The set of labels in the label stack as it would have appeared if 1127 this router were forwarding the packet through this interface. 1128 Any Implicit Null labels are explicitly included. Labels are 1129 treated as numbers, i.e., they are right justified in the field. 1131 A Downstream Label is 24 bits, in the same format as an MPLS label 1132 minus the TTL field, i.e., the MSBit of the label is bit 0, the 1133 LSBit is bit 19, the EXP bits are bits 20-22, and bit 23 is the S 1134 bit. The replying router SHOULD fill in the EXP and S bits; the 1135 LSR receiving the echo reply MAY choose to ignore these bits. 1136 Protocol 1138 The Protocol is taken from the following table: 1140 Protocol # Signaling Protocol 1141 ---------- ------------------ 1142 0 Unknown 1143 1 Static 1144 2 BGP 1145 3 LDP 1146 4 RSVP-TE 1148 3.3.1. Multipath Information Encoding 1150 The Multipath Information encodes labels or addresses that will 1151 exercise this path. The Multipath Information depends on the 1152 Multipath Type. The contents of the field are shown in the table 1153 above. IPv4 addresses are drawn from the range 127/8; IPv6 addresses 1154 are drawn from the range 0:0:0:0:0:FFFF:127/104. Labels are treated 1155 as numbers, i.e., they are right justified in the field. For Type 4, 1156 ranges indicated by Address pairs MUST NOT overlap and MUST be in 1157 ascending sequence. 1159 Type 8 allows a more dense encoding of IP addresses. The IP prefix 1160 is formatted as a base IP address with the non-prefix low-order bits 1161 set to zero. The maximum prefix length is 27. Following the prefix 1162 is a mask of length 2^(32-prefix length) bits for IPv4 and 1163 2^(128-prefix length) bits for IPv6. 
Each bit set to 1 represents a 1164 valid address. The address is the base IPv4 address plus the 1165 position of the bit in the mask where the bits are numbered left to 1166 right beginning with zero. For example, the IPv4 addresses 1167 127.2.1.0, 127.2.1.5-127.2.1.15, and 127.2.1.20-127.2.1.29 would be 1168 encoded as follows: 1170 0 1 2 3 1171 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1172 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1173 |0 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0| 1174 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1175 |1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0| 1176 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1178 Those same addresses embedded in IPv6 would be encoded as follows: 1180 0 1 2 3 1181 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1182 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1183 |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0| 1184 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1185 |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0| 1186 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1187 |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0| 1188 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1189 |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1| 1190 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1191 |0 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0| 1192 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1193 |1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0| 1194 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1196 Type 9 allows a more dense encoding of labels. 
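The bit-mask rule for Type 8 (base address plus the position of each set bit, with bits numbered left to right from zero) can be checked against the IPv4 example above with a short Python sketch. The function name decode_bitmasked_ipv4 is hypothetical; the sketch assumes the caller has already separated the 4-octet base address from the mask octets.

```python
import ipaddress


def decode_bitmasked_ipv4(base: str, mask: bytes) -> list:
    """Expand a Type 8 base address and bit mask into a list of addresses."""
    base_int = int(ipaddress.IPv4Address(base))
    addresses = []
    for position in range(len(mask) * 8):
        # Bit 0 is the leftmost (most significant) bit of the first octet.
        if mask[position // 8] & (0x80 >> (position % 8)):
            addresses.append(str(ipaddress.IPv4Address(base_int + position)))
    return addresses
```

With base 127.2.1.0 and the mask octets 0x87 0xFF 0x0F 0xFC (the second word of the IPv4 diagram), the sketch returns 127.2.1.0, 127.2.1.5 through 127.2.1.15, and 127.2.1.20 through 127.2.1.29.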
The label prefix is 1197 formatted as a base label value with the non-prefix low-order bits 1198 set to zero. The maximum prefix (including leading zeros due to 1199 encoding) length is 27. Following the prefix is a mask of length 1200 2^(32-prefix length) bits. Each bit set to one represents a valid 1201 label. The label is the base label plus the position of the bit in 1202 the mask where the bits are numbered left to right beginning with 1203 zero. Label values of all the odd numbers between 1152 and 1279 1204 would be encoded as follows: 1206 0 1 2 3 1207 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1208 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1209 |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0| 1210 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1211 |0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1| 1212 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1213 |0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1| 1214 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1215 |0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1| 1216 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1217 |0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1| 1218 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1220 If the received Multipath Information is non-null, the labels and IP 1221 addresses MUST be picked from the set provided. If none of these 1222 labels or addresses map to a particular downstream interface, then 1223 for that interface, the type MUST be set to 0. If the received 1224 Multipath Information is null (i.e., Multipath Length = 0, or for 1225 Types 8 and 9, a mask of all zeros), the type MUST be set to 0. 1227 For example, suppose LSR X at hop 10 has two downstream LSRs, Y and 1228 Z, for the FEC in question. 
X could return Multipath 1229 Type 4, with low/high IP addresses of 127.1.1.1->127.1.1.255 for 1230 downstream LSR Y and 127.2.1.1->127.2.1.255 for downstream LSR Z. 1231 The head end reflects this information to LSR Y. Y, which has three 1232 downstream LSRs, U, V, and W, computes that 127.1.1.1->127.1.1.127 1233 would go to U and 127.1.1.128->127.1.1.255 would go to V. Y would 1234 then respond with 3 Downstream Mappings: to U, with Multipath Type 4 1235 (127.1.1.1->127.1.1.127); to V, with Multipath Type 4 1236 (127.1.1.128->127.1.1.255); and to W, with Multipath Type 0. 1238 Note that computing Multipath Information may impose a significant 1239 processing burden on the receiver. A receiver MAY thus choose to 1240 process a subset of the received prefixes. The sender, on receiving 1241 a reply to a Downstream Mapping with partial information, SHOULD 1242 assume that the prefixes missing in the reply were skipped by the 1243 receiver, and MAY re-request information about them in a new echo 1244 request. 1246 3.3.2. Downstream Router and Interface 1248 The notion of "downstream router" and "downstream interface" should 1249 be explained. Consider an LSR X. If a packet that was originated 1250 with TTL n>1 arrived with outermost label L and TTL=1 at LSR X, X 1251 must be able to compute which LSRs could receive the packet if it was 1252 originated with TTL=n+1, over which interface the request would 1253 arrive and what label stack those LSRs would see. (It is outside the 1254 scope of this document to specify how this computation is done.) The 1255 set of these LSRs/interfaces consists of the downstream routers/ 1256 interfaces (and their corresponding labels) for X with respect to L. 1257 Each pair of downstream router and interface requires a separate 1258 Downstream Mapping to be added to the reply. 1260 The case where X is the LSR originating the echo request is a special 1261 case.
X needs to figure out what LSRs would receive the MPLS echo 1262 request for a given FEC Stack that X originates with TTL=1. 1264 The set of downstream routers at X may be alternative paths (see the 1265 discussion below on ECMP) or simultaneous paths (e.g., for MPLS 1266 multicast). In the former case, the Multipath Information is used as 1267 a hint to the sender as to how it may influence the choice of these 1268 alternatives. 1270 3.4. Pad TLV 1272 The value part of the Pad TLV contains a variable number (>= 1) of 1273 octets. The first octet takes values from the following table; all 1274 the other octets (if any) are ignored. The receiver SHOULD verify 1275 that the TLV is received in its entirety, but otherwise ignores the 1276 contents of this TLV, apart from the first octet. 1278 Value Meaning 1279 ----- ------- 1280 1 Drop Pad TLV from reply 1281 2 Copy Pad TLV to reply 1282 3-255 Reserved for future use 1284 3.5. Vendor Enterprise Number 1286 SMI Private Enterprise Numbers are maintained by IANA. The Length is 1287 always 4; the value is the SMI Private Enterprise code, in network 1288 octet order, of the vendor with a Vendor Private extension to any of 1289 the fields in the fixed part of the message, in which case this TLV 1290 MUST be present. If none of the fields in the fixed part of the 1291 message have Vendor Private extensions, inclusion of this TLV is 1292 OPTIONAL. Vendor Private ranges for Message Types, Reply Modes, and 1293 Return Codes have been defined. When any of these are used, the 1294 Vendor Enterprise Number TLV MUST be included in the message. 1296 3.6. Interface and Label Stack 1298 The Interface and Label Stack TLV MAY be included in a reply message 1299 to report the interface on which the request message was received and 1300 the label stack that was on the packet when it was received. Only 1301 one such object may appear. 
The purpose of the object is to allow 1302 the upstream router to obtain the exact interface and label stack 1303 information as it appears at the replying LSR. 1305 The Length is K + 4*N octets; N is the number of labels in the label 1306 stack. Values for K are found in the description of Address Type 1307 below. The Value field of this TLV has the following 1308 format: 1310 0 1 2 3 1311 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1312 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1313 | Address Type | Must Be Zero | 1314 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1315 | IP Address (4 or 16 octets) | 1316 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1317 | Interface (4 or 16 octets) | 1318 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1319 . . 1320 . . 1321 . Label Stack . 1322 . . 1323 . . 1324 . . 1325 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1327 Address Type 1329 The Address Type indicates if the interface is numbered or 1330 unnumbered. It also determines the length of the IP Address and 1331 Interface fields. The resulting total for the initial part of the 1332 TLV is listed in the table below as "K Octets". The Address Type 1333 is set to one of the following values: 1335 Type # Address Type K Octets 1336 ------ ------------ -------- 1337 1 IPv4 Numbered 12 1338 2 IPv4 Unnumbered 12 1339 3 IPv6 Numbered 36 1340 4 IPv6 Unnumbered 24 1342 IP Address and Interface 1344 IPv4 addresses and interface indices are encoded in 4 octets; IPv6 1345 addresses are encoded in 16 octets. 1347 If the interface upon which the echo request message was received 1348 is numbered, then the Address Type MUST be set to IPv4 or IPv6, 1349 the IP Address MUST be set to either the LSR's Router ID or the 1350 interface address, and the Interface MUST be set to the interface 1351 address.
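The following Python sketch illustrates how the fixed part of an Interface and Label Stack TLV value (the first K octets) could be assembled, and how the Length of K + 4*N follows from the Address Type table above. The names ILS_K_OCTETS, ils_fixed_part, and ils_length are hypothetical; the sketch assumes a numbered interface is given as an address string and an unnumbered interface as an integer index encoded in 4 octets.

```python
import ipaddress
import struct

# "K Octets" per Address Type, copied from the table above.
ILS_K_OCTETS = {1: 12, 2: 12, 3: 36, 4: 24}


def ils_fixed_part(address_type: int, ip_address: str, interface) -> bytes:
    """Build Address Type, Must Be Zero, IP Address, and Interface fields."""
    ip = ipaddress.ip_address(ip_address).packed
    if address_type in (1, 3):
        # Numbered: the Interface field carries an address.
        iface = ipaddress.ip_address(interface).packed
    else:
        # Unnumbered: the Interface field carries a 4-octet index.
        iface = struct.pack("!I", interface)
    # One octet of Address Type followed by three Must Be Zero octets.
    fixed = struct.pack("!B3x", address_type) + ip + iface
    if len(fixed) != ILS_K_OCTETS[address_type]:
        raise ValueError("address family does not match Address Type")
    return fixed


def ils_length(address_type: int, num_labels: int) -> int:
    """Length = K + 4*N octets."""
    return ILS_K_OCTETS[address_type] + 4 * num_labels
```

For example, ils_fixed_part(1, "192.0.2.1", "192.0.2.9") yields 12 octets, and a three-label stack received on that interface gives a Length of 24.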
1353 If the interface is unnumbered, the Address Type MUST be either 1354 IPv4 Unnumbered or IPv6 Unnumbered, the IP Address MUST be the 1355 LSR's Router ID, and the Interface MUST be set to the index 1356 assigned to the interface. 1358 Label Stack 1360 The label stack of the received echo request message. If any TTL 1361 values have been changed by this router, they SHOULD be restored. 1363 3.7. Errored TLVs 1365 The Errored TLVs TLV MAY be included in an echo reply to 1366 inform the sender of an echo request of mandatory TLVs that are either not 1367 supported by an implementation or were parsed and found to be in error. 1369 The Value field contains the TLVs that were not understood, encoded 1370 as sub-TLVs. 1372 0 1 2 3 1373 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1374 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1375 | Type = 9 | Length | 1376 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1377 | Value | 1378 . . 1379 . . 1380 . . 1381 | | 1382 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1384 3.8. Reply TOS Byte TLV 1386 This TLV MAY be used by the originator of the echo request to request 1387 that an echo reply be sent with the IP header TOS byte set to the 1388 value specified in the TLV. This TLV has a length of 4 with the 1389 following value field. 1391 0 1 2 3 1392 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1393 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1394 | Reply-TOS Byte| Must Be Zero | 1395 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1397 4. Theory of Operation 1399 An MPLS echo request is used to test a particular LSP. The LSP to be 1400 tested is identified by the "FEC Stack"; for example, if the LSP was 1401 set up via LDP, and is to an egress IP address of 10.1.1.1, the FEC 1402 Stack contains a single element, namely, an LDP IPv4 prefix sub-TLV 1403 with value 10.1.1.1/32.
If the LSP being tested is an RSVP LSP, the 1404 FEC Stack consists of a single element that captures the RSVP Session 1405 and Sender Template that uniquely identifies the LSP. 1407 FEC Stacks can be more complex. For example, one may wish to test a 1408 VPN IPv4 prefix of 10.1/8 that is tunneled over an LDP LSP with 1409 egress 10.10.1.1. The FEC Stack would then contain two sub-TLVs, the 1410 bottom being a VPN IPv4 prefix, and the top being an LDP IPv4 prefix. 1411 If the underlying (LDP) tunnel were not known, or was considered 1412 irrelevant, the FEC Stack could be a single element with just the VPN 1413 IPv4 sub-TLV. 1415 When an MPLS echo request is received, the receiver is expected to 1416 verify that the control plane and data plane are both healthy (for 1417 the FEC Stack being pinged) and that the two planes are in sync. The 1418 procedures for this are in section 4.4 below. 1420 4.1. Dealing with Equal-Cost Multi-Path (ECMP) 1422 LSPs need not be simple point-to-point tunnels. Frequently, a single 1423 LSP may originate at several ingresses, and terminate at several 1424 egresses; this is very common with LDP LSPs. LSPs for a given FEC 1425 may also have multiple "next hops" at transit LSRs. At an ingress, 1426 there may also be several different LSPs to choose from to get to the 1427 desired endpoint. Finally, LSPs may have backup paths, detour paths, 1428 and other alternative paths to take should the primary LSP go down. 1430 To deal with the last two first: it is assumed that the LSR sourcing 1431 MPLS echo requests can force the echo request into any desired LSP, 1432 so choosing among multiple LSPs at the ingress is not an issue. The 1433 problem of probing the various flavors of backup paths that will 1434 typically not be used for forwarding data unless the primary LSP is 1435 down will not be addressed here. 
1437 Since the actual LSP and path that a given packet may take may not be 1438 known a priori, it is useful if MPLS echo requests can exercise all 1439 possible paths. This, although desirable, may not be practical, 1440 because the algorithms that a given LSR uses to distribute packets 1441 over alternative paths may be proprietary. 1443 To achieve some degree of coverage of alternate paths, there is a 1444 certain latitude in choosing the destination IP address and source 1445 UDP port for an MPLS echo request. This is clearly not sufficient; 1446 in the case of traceroute, more latitude is offered by means of the 1447 Multipath Information of the Downstream Mapping TLV. This is used as 1448 follows. An ingress LSR periodically sends an MPLS traceroute 1449 message to determine whether there are multipaths for a given LSP. 1450 If so, each hop will provide some information on how each of its 1451 downstream paths can be exercised. The ingress can then send MPLS 1452 echo requests that exercise these paths. If several transit LSRs 1453 have ECMP, the ingress may attempt to compose these to exercise all 1454 possible paths. However, full coverage may not be possible. 1456 4.2. Testing LSPs That Are Used to Carry MPLS Payloads 1458 To detect certain LSP breakages, it may be necessary to encapsulate 1459 an MPLS echo request packet with at least one additional label when 1460 testing LSPs that are used to carry MPLS payloads (such as LSPs used 1461 to carry L2VPN and L3VPN traffic). For example, when testing LDP or 1462 RSVP-TE LSPs, just sending an MPLS echo request packet may not detect 1463 instances where the router immediately upstream of the destination of 1464 the LSP ping may forward the MPLS echo request successfully over an 1465 interface not configured to carry MPLS payloads because of the use of 1466 penultimate hop popping.
Since the receiving router has no means to 1467 differentiate whether the IP packet was sent unlabeled or implicitly 1468 labeled, the addition of labels shimmed above the MPLS echo request 1469 (using the Nil FEC) will prevent a router from forwarding such a 1470 packet out unlabeled interfaces. 1472 4.3. Sending an MPLS Echo Request 1474 An MPLS echo request is a UDP packet. The IP header is set as 1475 follows: the source IP address is a routable address of the sender; 1476 the destination IP address is a (randomly chosen) IPv4 address from 1477 the range 127/8 or IPv6 address from the range 1478 0:0:0:0:0:FFFF:7F00:0/104. The IP TTL is set to 1. The source UDP port 1479 is chosen by the sender; the destination UDP port is set to 3503 1480 (assigned by IANA for MPLS echo requests). The Router Alert option 1481 MUST be set in the IP header. 1483 An MPLS echo request is sent with a label stack corresponding to the 1484 FEC Stack being tested. Note that further labels could be applied 1485 if, for example, the normal route to the topmost FEC in the stack is 1486 via a Traffic Engineered Tunnel [RSVP-TE]. If all of the FECs in the 1487 stack correspond to Implicit Null labels, the MPLS echo request is 1488 considered unlabeled even if further labels will be applied in 1489 sending the packet. 1491 If the echo request is labeled, one MAY (depending on what is being 1492 pinged) set the TTL of the innermost label to 1, to prevent the ping 1493 request going farther than it should. Examples of where this SHOULD 1494 be done include pinging a VPN IPv4 or IPv6 prefix, an L2 VPN endpoint, 1495 or a pseudowire. Preventing the ping request from going too far can 1496 also be accomplished by inserting a Router Alert label above this 1497 label; however, this may lead to the undesired side effect that MPLS 1498 echo requests take a different data path than actual data.
For more 1499 information on how these mechanisms can be used for pseudowire 1500 connectivity verification, see [VCCV]. 1502 In "ping" mode (end-to-end connectivity check), the TTL in the 1503 outermost label is set to 255. In "traceroute" mode (fault isolation 1504 mode), the TTL is set successively to 1, 2, and so on. 1506 The sender chooses a Sender's Handle and a Sequence Number. When 1507 sending subsequent MPLS echo requests, the sender SHOULD increment 1508 the Sequence Number by 1. However, a sender MAY choose to send a 1509 group of echo requests with the same Sequence Number to improve the 1510 chance of arrival of at least one packet with that Sequence Number. 1512 The TimeStamp Sent is set to the time-of-day (in seconds and 1513 microseconds) that the echo request is sent. The TimeStamp Received 1514 is set to zero. 1516 An MPLS echo request MUST have an FEC Stack TLV. Also, the Reply 1517 Mode must be set to the desired reply mode; the Return Code and 1518 Subcode are set to zero. In the "traceroute" mode, the echo request 1519 SHOULD include a Downstream Mapping TLV. 1521 4.4. Receiving an MPLS Echo Request 1523 Sending an MPLS echo request to the control plane is triggered by one 1524 of the following packet processing exceptions: Router Alert option, 1525 IP TTL expiration, MPLS TTL expiration, MPLS Router Alert label, or 1526 the destination address in the 127/8 address range. The control 1527 plane further identifies it by UDP destination port 3503. 1529 For reporting purposes the bottom of stack is considered to be stack- 1530 depth of 1. This is to establish an absolute reference for the case 1531 where the actual stack may have more labels than there are FECs in 1532 the Target FEC Stack. 1534 Furthermore, in all the error codes listed in this document, a stack- 1535 depth of 0 means "no value specified". This allows compatibility 1536 with existing implementations that do not use the Return Subcode 1537 field. 
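As a non-normative illustration, the control-plane identification described in the first paragraph of this section can be sketched in Python. The sketch covers only the destination-address exception and the UDP port check; the other exceptions (Router Alert, TTL expiration) are forwarding-plane events and are not modeled. The function and constant names are hypothetical, and the IPv6 range is assumed to be the IPv4-mapped form of 127/8.

```python
import ipaddress

# UDP destination port named in the text for MPLS echo requests.
LSP_PING_UDP_PORT = 3503

# 127/8, plus its IPv4-mapped IPv6 equivalent (an assumption of this sketch).
ECHO_DESTINATION_RANGES = (
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("::ffff:7f00:0/104"),
)


def looks_like_echo_request(dst_ip: str, udp_dst_port: int) -> bool:
    """Return True if the packet matches the address-range and port checks."""
    addr = ipaddress.ip_address(dst_ip)
    # Membership tests across IP versions simply evaluate to False.
    in_range = any(addr in net for net in ECHO_DESTINATION_RANGES)
    return in_range and udp_dst_port == LSP_PING_UDP_PORT
```

A packet that fails either check would be handled by normal forwarding rules rather than by the procedure that follows.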
1539 An LSR X that receives an MPLS echo request then processes it as 1540 follows. 1542 1. General packet sanity is verified. If the packet is not well- 1543 formed, LSR X SHOULD send an MPLS echo reply with the Return Code 1544 set to "Malformed echo request received" and the Subcode set to zero. 1545 If there are any TLVs not marked as "Ignore" that LSR X does not 1546 understand, LSR X SHOULD send an MPLS echo reply with the Return Code set to "TLV not understood" (as 1547 appropriate) and the Subcode set to zero. In the latter case, 1548 the misunderstood TLVs (only) are included as sub-TLVs in an 1549 Errored TLVs TLV in the reply. The header fields Sender's 1550 Handle, Sequence Number, and TimeStamp Sent are not examined, but 1551 are included in the MPLS echo reply message. 1553 The algorithm uses the following variables and identifiers: 1555 Interface-I: the interface on which the MPLS echo request was 1556 received. 1558 Stack-R: the label stack on the packet as it was received. 1560 Stack-D: the label stack carried in the Downstream Mapping 1561 TLV (not always present). 1563 Label-L: the label from the actual stack currently being 1564 examined. Requires no initialization. 1566 Label-stack-depth: the depth of the label being verified. Initialized 1567 to the number of labels in the received label 1568 stack Stack-R. 1570 FEC-stack-depth: depth of the FEC in the Target FEC Stack that 1571 should be used to verify the current actual 1572 label. Requires no initialization. 1574 Best-return-code: contains the return code for the echo reply 1575 packet as currently best known. As the algorithm 1576 progresses, this code may change depending on the 1577 results of further checks that it performs. 1579 Best-rtn-subcode: similar to Best-return-code, but for the Echo 1580 Reply Subcode. 1582 FEC-status: result value returned by the FEC Checking 1583 algorithm described in section 4.4.1. 1585 /* Save receive context information */ 1587 2.
If the echo request is good, LSR X stores the interface over 1588 which the echo request was received in Interface-I, and the label stack 1589 with which it came in Stack-R. 1591 /* The rest of the algorithm iterates over the labels in Stack-R, 1592 verifies validity of label values, reports associated label switching 1593 operations (for traceroute), verifies correspondence between the 1594 Stack-R and the Target FEC Stack description in the body of the echo 1595 request, and reports any errors. */ 1597 /* The algorithm iterates as follows. */ 1599 3. Label Validation: 1601 If Label-stack-depth is 0 { 1603 /* The LSR needs to report that it is a tail-end for the LSP */ 1605 Set FEC-stack-depth to 1, set Label-L to 3 (Implicit Null). 1606 Set Best-return-code to 3 ("Replying router is an egress for 1607 the FEC at stack-depth"), set Best-rtn-subcode to the value of 1608 FEC-stack-depth (1) and go to step 5 (Egress Processing). 1610 } 1612 /* This step assumes there is always an entry for well-known label 1613 values */ 1615 Set Label-L to the value extracted from Stack-R at depth Label- 1616 stack-depth. Look up Label-L in the Incoming Label Map (ILM) to 1617 determine if the label has been allocated and an operation is 1618 associated with it. 1620 If there is no entry for Label-L { 1622 /* Indicates a temporary or permanent label synchronization 1623 problem; the LSR needs to report an error */ 1625 Set Best-return-code to 11 ("No label entry at stack-depth") 1626 and Best-rtn-subcode to Label-stack-depth. Go to step 7 (Send 1627 Reply Packet). 1629 } 1631 Else { 1633 Retrieve the associated label operation from the corresponding 1634 NHLFE and proceed to step 4 (Label Operation Check). 1636 } 1638 4. Label Operation Check: 1639 If the label operation is "Pop and Continue Processing" { 1641 /* Includes Explicit Null and Router Alert label cases */ 1643 Iterate to the next label by decrementing Label-stack-depth and 1644 loop back to step 3 (Label Validation).
1646 } 1648 If the label operation is "Swap or Pop and Switch based on Popped 1649 Label" { 1651 Set Best-return-code to 8 ("Label switched at stack-depth") and 1652 Best-rtn-subcode to Label-stack-depth to report transit 1653 switching. 1655 If a Downstream Mapping TLV is present in the received echo 1656 request { 1658 If the IP address in the TLV is 127.0.0.1 or 0::1 { 1660 Set Best-return-code to 6 ("Upstream Interface Index 1661 Unknown"). An Interface and Label Stack TLV SHOULD be 1662 included in the reply and filled with Interface-I and 1663 Stack-R. 1665 } 1667 Else { 1669 Verify that the IP address, interface address, and label 1670 stack in the Downstream Mapping TLV match Interface-I and 1671 Stack-R. If there is a mismatch, set Best-return-code to 1672 5 ("Downstream Mapping Mismatch"). An Interface and Label 1673 Stack TLV SHOULD be included in the reply and filled in 1674 based on Interface-I and Stack-R. Go to step 7 (Send 1675 Reply Packet). 1677 } 1679 } 1681 For each available downstream ECMP path { 1683 Retrieve the output interface from the NHLFE entry. 1685 /* Note: this return code is set even if Label-stack-depth 1686 is one */ 1687 If the output interface is not MPLS enabled { 1689 Set Best-return-code to 9 ("Label switched 1690 but no MPLS forwarding at stack-depth"), set Best-rtn-subcode 1691 to Label-stack-depth, and go to step 7 (Send Reply Packet). 1693 } 1695 If a Downstream Mapping TLV is present { 1697 A Downstream Mapping TLV SHOULD be included in the echo 1698 reply (see section 3.3) filled in with information about 1699 the current ECMP path. 1701 } 1703 } 1705 If no Downstream Mapping TLV is present, or the Downstream IP 1706 Address is set to the ALLROUTERS multicast address, go to step 1707 7 (Send Reply Packet). 1709 If the "Validate FEC Stack" flag is not set and the LSR is not 1710 configured to perform FEC checking by default, go to step 7 1711 (Send Reply Packet).
1713 /* Validate the Target FEC Stack in the received echo request. 1715 First determine FEC-stack-depth from the Downstream Mapping 1716 TLV. This is done by walking through Stack-D (the Downstream 1717 labels) from the bottom, decrementing the number of labels for 1718 each non-Implicit Null label, while incrementing FEC-stack- 1719 depth for each label. If the Downstream Mapping TLV contains 1720 one or more Implicit Null labels, FEC-stack-depth may be 1721 greater than Label-stack-depth. To be consistent with the 1722 above stack-depths, the bottom is considered to be entry 1. 1723 */ 1725 Set FEC-stack-depth to 0. Set i to Label-stack-depth. 1727 While (i > 0) do { 1729 ++FEC-stack-depth. 1730 if Stack-D[FEC-stack-depth] != 3 (Implicit Null) 1731 --i. 1732 } 1733 If the number of FECs in the Target FEC Stack is greater than or 1734 equal to FEC-stack-depth { 1735 Perform the FEC Checking procedure (see subsection 4.4.1 1736 below). 1738 If FEC-status is 2, set Best-return-code to 10 ("Mapping for 1739 this FEC is not the given label at stack-depth"). 1741 If FEC-status is 1, set Best-return-code to FEC-return- 1742 code and Best-rtn-subcode to FEC-stack-depth. 1743 } 1745 Go to step 7 (Send Reply Packet). 1746 } 1748 5. Egress Processing: 1750 /* These steps are performed by the LSR that identified itself as 1751 the tail-end LSR for an LSP. */ 1753 If the received echo request contains no Downstream Mapping TLV, or 1754 the Downstream IP Address is set to 127.0.0.1 or 0::1, go to step 6 1755 (Egress FEC Validation). 1757 Verify that the IP address, interface address, and label stack in 1758 the Downstream Mapping TLV match Interface-I and Stack-R. If not, 1759 set Best-return-code to 5 ("Downstream Mapping Mismatch"). An 1760 Interface and Label Stack TLV SHOULD be created for the 1761 echo reply packet. Go to step 7 (Send Reply Packet). 1763 6.
Egress FEC Validation: 1765 /* This is a loop for all entries in the Target FEC Stack starting 1766 with FEC-stack-depth. */ 1768 Perform FEC checking by following the algorithm described in 1769 subsection 4.4.1 for Label-L and the FEC at FEC-stack-depth. 1771 Set Best-return-code to FEC-return-code and Best-rtn-subcode to the value 1772 in FEC-stack-depth. 1774 If FEC-status (the result of the check) is 1, 1775 go to step 7 (Send Reply Packet). 1777 /* Iterate to the next FEC entry */ 1778 ++FEC-stack-depth. 1779 If FEC-stack-depth > the number of FECs in the Target FEC Stack, 1780 go to step 7 (Send Reply Packet). 1782 If FEC-status is 0 { 1784 ++Label-stack-depth. 1785 If Label-stack-depth > the number of labels in Stack-R, 1786 go to step 7 (Send Reply Packet). 1788 Label-L = extracted label from Stack-R at depth 1789 Label-stack-depth. 1790 Loop back to step 6 (Egress FEC Validation). 1791 } 1793 7. Send Reply Packet: 1795 Send an MPLS echo reply with a Return Code of Best-return-code, 1796 and a Return Subcode of Best-rtn-subcode. Include any TLVs 1797 created during the above process. The procedures for sending the 1798 echo reply are found in section 4.5. 1800 4.4.1. FEC Validation 1802 /* This subsection describes validation of an FEC entry within the 1803 Target FEC Stack and accepts an FEC, Label-L, and Interface-I. The 1804 algorithm performs the following steps. */ 1806 1. Two return values, FEC-status and FEC-return-code, are 1807 initialized to 0. 1809 2. If the FEC is the Nil FEC { 1811 If Label-L is either Explicit_Null or Router_Alert, return. 1813 Else { 1815 Set FEC-return-code to 10 ("Mapping for this FEC is not the 1816 given label at stack-depth"). 1817 Set FEC-status to 1. 1818 Return. 1819 } 1821 } 1823 3. Check the FEC label mapping that describes how traffic received 1824 on the LSP is further switched or which application it is 1825 associated with.
If no mapping exists, set FEC-return-code to 1826 4 ("Replying router has no mapping for the FEC at stack- 1827 depth"). Set FEC-status to 1. Return. 1829 4. If the label mapping for the FEC is Implicit Null, set FEC-status to 1830 2 and proceed to step 5. Otherwise, if the label mapping for the FEC 1831 is Label-L, proceed to step 5. Otherwise, set FEC-return-code to 1832 10 ("Mapping for this FEC is not the given label at stack- 1833 depth"), set FEC-status to 1, and return. 1835 5. This is a protocol check. Check what protocol would be used to 1836 advertise the FEC. If it can be determined that no protocol 1837 associated with Interface-I would have advertised an FEC of that 1838 FEC-Type, set FEC-return-code to 12 ("Protocol not associated 1839 with interface at FEC stack-depth") and set FEC-status to 1. 1841 6. Return. 1843 4.5. Sending an MPLS Echo Reply 1845 An MPLS echo reply is a UDP packet. It MUST ONLY be sent in response 1846 to an MPLS echo request. The source IP address is a routable address 1847 of the replier; the source port is the well-known UDP port for LSP 1848 ping. The destination IP address and UDP port are copied from the 1849 source IP address and UDP port of the echo request. The IP TTL is 1850 set to 255. If the Reply Mode in the echo request is "Reply via an 1851 IPv4/IPv6 UDP packet with Router Alert", then the IP header MUST contain 1852 the Router Alert IP option. If the reply is sent over an LSP, the 1853 topmost label MUST in this case be the Router Alert label (1) (see 1854 [LABEL-STACK]). 1856 The format of the echo reply is the same as the echo request. The 1857 Sender's Handle, the Sequence Number, and TimeStamp Sent are copied 1858 from the echo request; the TimeStamp Received is set to the time-of- 1859 day that the echo request is received (note that this information is 1860 most useful if the time-of-day clocks on the requester and the 1861 replier are synchronized).
The FEC Stack TLV from the echo request 1862 MAY be copied to the reply. 1864 The replier MUST fill in the Return Code and Subcode, as determined 1865 in the previous subsection. 1867 If the echo request contains a Pad TLV, the replier MUST interpret 1868 the first octet for instructions regarding how to reply. 1870 If the replying router is the destination of the FEC, then Downstream 1871 Mapping TLVs SHOULD NOT be included in the echo reply. 1873 If the echo request contains a Downstream Mapping TLV, and the 1874 replying router is not the destination of the FEC, the replier SHOULD 1875 compute its downstream routers and corresponding labels for the 1876 incoming label, and add Downstream Mapping TLVs for each one to the 1877 echo reply it sends back. 1879 If the Downstream Mapping TLV contains Multipath Information 1880 requiring more processing than the receiving router is willing to 1881 perform, the responding router MAY choose to respond with only a 1882 subset of multipaths contained in the echo request Downstream 1883 Mapping. (Note: The originator of the echo request MAY send another 1884 echo request with the Multipath Information that was not included in 1885 the reply.) 1887 Except in the case of Reply Mode 4, "Reply via application level 1888 control channel", echo replies are always sent in the context of the 1889 IP/MPLS network. 1891 4.6. Receiving an MPLS Echo Reply 1893 An LSR X should only receive an MPLS echo reply in response to an 1894 MPLS echo request that it sent. Thus, on receipt of an MPLS echo 1895 reply, X should parse the packet to ensure that it is well-formed, 1896 then attempt to match up the echo reply with an echo request that it 1897 had previously sent, using the destination UDP port and the Sender's 1898 Handle. If no match is found, then X jettisons the echo reply; 1899 otherwise, it checks the Sequence Number to see if it matches. 
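The matching procedure described above can be sketched as follows. This is a minimal, non-normative illustration: the class and method names are hypothetical, and the handling of a reply whose Sequence Number does not match is left to the caller, since the text does not prescribe it.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class OutstandingRequests:
    # Key: (local UDP port used as the request's source port, Sender's Handle).
    # Value: Sequence Numbers of echo requests still awaiting a reply.
    pending: Dict[Tuple[int, int], Set[int]] = field(default_factory=dict)

    def record_request(self, udp_port: int, handle: int, seq: int) -> None:
        """Remember an echo request that has just been sent."""
        self.pending.setdefault((udp_port, handle), set()).add(seq)

    def classify_reply(self, udp_dst_port: int, handle: int, seq: int) -> str:
        """Match a well-formed reply by port and handle, then check the sequence."""
        seqs = self.pending.get((udp_dst_port, handle))
        if seqs is None:
            return "discard"  # no matching request: the reply is dropped
        return "match" if seq in seqs else "sequence-mismatch"
```

Keying on the pair of UDP port and Sender's Handle mirrors the two fields the text names for matching; the Sequence Number is consulted only after that lookup succeeds.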
1901 If the echo reply contains Downstream Mappings, and X wishes to 1902 traceroute further, it SHOULD copy the Downstream Mapping(s) into its 1903 next echo request(s) (with TTL incremented by one). 1905 4.7. Issue with VPN IPv4 and IPv6 Prefixes 1907 Typically, an LSP ping for a VPN IPv4 prefix or VPN IPv6 prefix is 1908 sent with a label stack of depth greater than 1, with the innermost 1909 label having a TTL of 1. This is to terminate the ping at the egress 1910 PE, before it gets sent to the customer device. However, under 1911 certain circumstances, the label stack can shrink to a single label 1912 before the ping hits the egress PE; this will result in the ping 1913 terminating prematurely. One such scenario is a multi-AS Carrier's 1914 Carrier VPN. 1916 To get around this problem, one approach is for the LSR that receives 1917 such a ping to realize that the ping terminated prematurely, and send 1918 back error code 13. In that case, the initiating LSR can retry the 1919 ping after incrementing the TTL on the VPN label. In this fashion, 1920 the ingress LSR will sequentially try TTL values until it finds one 1921 that allows the VPN ping to reach the egress PE. 1923 4.8. Non-compliant Routers 1925 If the egress for the FEC Stack being pinged does not support MPLS 1926 ping, then no reply will be sent, resulting in possible "false 1927 negatives". If in "traceroute" mode, a transit LSR does not support 1928 LSP ping, then no reply will be forthcoming from that LSR for some 1929 TTL, say, n. The LSR originating the echo request SHOULD try sending 1930 the echo request with TTL=n+1, n+2, ..., n+k to probe LSRs further 1931 down the path. In such a case, the echo request for TTL > n SHOULD 1932 be sent with Downstream Mapping TLV "Downstream IP Address" field set 1933 to the ALLROUTERs multicast address until a reply is received with a 1934 Downstream Mapping TLV. The label stack MAY be omitted from the 1935 Downstream Mapping TLV. 
Furthermore, the "Validate FEC Stack" flag 1936 SHOULD NOT be set until an echo reply packet with a Downstream 1937 Mapping TLV is received. 1939 5. Security Considerations 1941 Overall, the security needs for LSP ping are similar to those of ICMP 1942 ping. 1944 There are at least three approaches to attacking LSRs using the 1945 mechanisms defined here. One is a Denial-of-Service attack, by 1946 sending MPLS echo requests/replies to LSRs and thereby increasing 1947 their workload. The second is obfuscating the state of the MPLS data 1948 plane liveness by spoofing, hijacking, replaying, or otherwise 1949 tampering with MPLS echo requests and replies. The third is an 1950 unauthorized source using an LSP ping to obtain information about the 1951 network. To avoid potential Denial-of-Service attacks, it is 1952 RECOMMENDED that implementations regulate the LSP ping traffic going 1953 to the control plane. A rate limiter SHOULD be applied to the well- 1954 known UDP port defined below. 1956 Unsophisticated replay and spoofing attacks involving faking or 1957 replaying MPLS echo reply messages are unlikely to be effective. 1958 These replies would have to match the Sender's Handle and Sequence 1959 Number of an outstanding MPLS echo request message. A non-matching 1960 replay would be discarded as the sequence has moved on; thus, a spoof 1961 has only a small window of opportunity. However, to provide a 1962 stronger defense, an implementation MAY also validate the TimeStamp 1963 Sent by requiring an exact match on this field. 1965 To protect against unauthorized sources using MPLS echo request 1966 messages to obtain network information, it is RECOMMENDED that 1967 implementations provide a means of checking the source addresses of 1968 MPLS echo request messages against an access list before accepting 1969 the message.
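The two recommendations above (rate limiting and source-address checking) can be combined as in the following non-normative sketch. This document does not prescribe a rate-limiting algorithm; a token bucket is used here only as one common choice. All class, function, and parameter names are hypothetical.

```python
import ipaddress
from typing import Iterable


class TokenBucket:
    """Simple token bucket; the caller supplies the current time in seconds."""

    def __init__(self, rate_per_sec: float, burst: int) -> None:
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def source_permitted(src_ip: str, access_list: Iterable[str]) -> bool:
    """Check the source address against a list of permitted prefixes."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in ipaddress.ip_network(prefix) for prefix in access_list)


def accept_echo_request(src_ip: str, now: float, bucket: TokenBucket,
                        access_list: Iterable[str]) -> bool:
    # The access list is checked first so that unauthorized sources
    # do not consume tokens intended for legitimate requests.
    return source_permitted(src_ip, access_list) and bucket.allow(now)
```

Checking the access list before the rate limiter is a design choice of this sketch, not a requirement of the text.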
1971 It is not clear how to prevent hijacking (non-delivery) of echo 1972 requests or replies; however, if these messages are indeed hijacked, 1973 LSP ping will report that the data plane is not working as it should. 1975 It does not seem vital (at this point) to secure the data carried in 1976 MPLS echo requests and replies, although knowledge of the state of 1977 the MPLS data plane may be considered confidential by some. 1978 Implementations SHOULD, however, provide a means of filtering the 1979 addresses to which echo reply messages may be sent. 1981 Although this document makes special use of 127/8 addresses, these are 1982 used only in conjunction with UDP port 3503. Furthermore, these 1983 packets are only processed by routers. All other hosts MUST treat 1984 all packets with a destination address in the range 127/8 in 1985 accordance with RFC 1122. Any packet received by a router with a 1986 destination address in the range 127/8 without a destination UDP port 1987 of 3503 MUST be treated in accordance with RFC 1812. In particular, 1988 the default behavior is to treat packets destined to a 127/8 address 1989 as "martians". 1991 6. IANA Considerations 1993 The TCP and UDP port number 3503 has been allocated by IANA for LSP 1994 echo requests and replies. 1996 The following sections detail the new name spaces to be managed by 1997 IANA. For each of these name spaces, the space is divided into 1998 assignment ranges; the following terms are used in describing the 1999 procedures by which IANA allocates values: "Standards Action" (as 2000 defined in [IANA]), "Specification Required", and "Vendor Private 2001 Use". 2003 Values from "Specification Required" ranges MUST be registered with 2004 IANA. The request MUST be made via an Experimental RFC that 2005 describes the format and procedures for using the code point; the 2006 actual assignment is made during the IANA actions for the RFC.
2008 Values from "Vendor Private" ranges MUST NOT be registered with IANA; 2009 however, the message MUST contain an enterprise code as registered 2010 with the IANA SMI Private Network Management Private Enterprise 2011 Numbers. For each name space that has a Vendor Private range, it 2012 must be specified where exactly the SMI Private Enterprise Number 2013 resides; see below for examples. In this way, several enterprises 2014 (vendors) can use the same code point without fear of collision. 2016 6.1. Message Types, Reply Modes, Return Codes 2018 The IANA has created and will maintain registries for Message Types, 2019 Reply Modes, and Return Codes. Each of these can take values in the 2020 range 0-255. Assignments in the range 0-191 are via Standards 2021 Action; assignments in the range 192-251 are made via "Specification 2022 Required"; values in the range 252-255 are for Vendor Private Use, 2023 and MUST NOT be allocated. 2025 If any of these fields fall in the Vendor Private range, a top-level 2026 Vendor Enterprise Number TLV MUST be present in the message. 2028 Message Types defined in this document are the following: 2030 Value Meaning 2031 ----- ------- 2032 1 MPLS echo request 2033 2 MPLS echo reply 2035 Reply Modes defined in this document are the following: 2037 Value Meaning 2038 ----- ------- 2039 1 Do not reply 2040 2 Reply via an IPv4/IPv6 UDP packet 2041 3 Reply via an IPv4/IPv6 UDP packet with Router Alert 2042 4 Reply via application level control channel 2044 Return Codes defined in this document are listed in section 3.1. 2046 6.2. TLVs 2048 The IANA has created and will maintain a registry for the Type field 2049 of top-level TLVs as well as for any associated sub-TLVs. Note the 2050 meaning of a sub-TLV is scoped by the TLV. The number spaces for the 2051 sub-TLVs of various TLVs are independent. 2053 The valid range for TLVs and sub-TLVs is 0-65535. 
Assignments in the 2054 range 0-16383 and 32768-49161 are made via Standards Action as 2055 defined in [IANA]; assignments in the range 16384-31743 and 2056 49162-64511 are made via "Specification Required" as defined above; 2057 values in the range 31744-32767 and 64512-65535 are for Vendor 2058 Private Use, and MUST NOT be allocated. 2060 If a TLV or sub-TLV has a Type that falls in the range for Vendor 2061 Private Use, the Length MUST be at least 4, and the first four octets 2062 MUST be that vendor's SMI Private Enterprise Number, in network octet 2063 order. The rest of the Value field is private to the vendor. TLVs 2064 and sub-TLVs defined in this document are the following: 2066 Type Sub-Type Value Field 2067 ---- -------- ----------- 2068 1 Target FEC Stack 2069 1 LDP IPv4 prefix 2070 2 LDP IPv6 prefix 2071 3 RSVP IPv4 LSP 2072 4 RSVP IPv6 LSP 2073 5 Not Assigned 2074 6 VPN IPv4 prefix 2075 7 VPN IPv6 prefix 2076 8 L2 VPN endpoint 2077 9 "FEC 128" Pseudowire (Deprecated) 2078 10 "FEC 128" Pseudowire 2079 11 "FEC 129" Pseudowire 2080 12 BGP labeled IPv4 prefix 2081 13 BGP labeled IPv6 prefix 2082 14 Generic IPv4 prefix 2083 15 Generic IPv6 prefix 2084 16 Nil FEC 2085 2 Downstream Mapping 2086 3 Pad 2087 4 Not Assigned 2088 5 Vendor Enterprise Number 2089 6 Not Assigned 2090 7 Interface and Label Stack 2091 8 Not Assigned 2092 9 Errored TLVs 2093 Any value The TLV not understood 2094 10 Reply TOS Byte 2096 7. Acknowledgements 2098 The original acknowledgements from RFC 4379 state the following: 2100 This document is the outcome of many discussions among many 2101 people, including Manoj Leelanivas, Paul Traina, Yakov Rekhter, 2102 Der-Hwa Gan, Brook Bailey, Eric Rosen, Ina Minei, Shivani 2103 Aggarwal, and Vanson Lim. 2105 The description of the Multipath Information sub-field of the 2106 Downstream Mapping TLV was adapted from text suggested by Curtis 2107 Villamizar. 
2109 We would like to thank Loa Andersson for motivating the advancement 2110 of this bis specification. 2112 8. References 2114 8.1. Normative References 2116 [BGP] Rekhter, Y. and S. Hares, "A Border Gateway Protocol 4 2117 (BGP-4)", 2006, . 2119 [IANA] Narten, T. and H. Alvestrand, "Guidelines for Writing an 2120 IANA Considerations Section in RFCs", 1998, . 2122 [KEYWORDS] 2123 Bradner, S., "Key words for use in RFCs to Indicate 2124 Requirement Levels", 1997, . 2126 [LABEL-STACK] 2127 Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., 2128 Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack 2129 Encoding", 2001, . 2131 [NTP] Mills, D., "Simple Network Time Protocol (SNTP) Version 4 2132 for IPv4, IPv6 and OSI", 1996, . 2134 [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - 2135 Communication Layers", STD 3, RFC 1122, DOI 10.17487/ 2136 RFC1122, October 1989, 2137 . 2139 [RFC1812] Baker, F., Ed., "Requirements for IP Version 4 Routers", 2140 RFC 1812, DOI 10.17487/RFC1812, June 1995, 2141 . 2143 [RFC4026] Andersson, L. and T. Madsen, "Provider Provisioned Virtual 2144 Private Network (VPN) Terminology", RFC 4026, DOI 2145 10.17487/RFC4026, March 2005, 2146 . 2148 [RFC4379] Kompella, K. and G. Swallow, "Detecting Multi-Protocol 2149 Label Switched (MPLS) Data Plane Failures", RFC 4379, 2150 February 2006. 2152 8.2. Informative References 2154 [BGP-LABEL] 2155 Rekhter, Y. and E. Rosen, "Carrying Label Information in 2156 BGP-4", 2001, . 2158 [ICMP] Postel, J., "Internet Control Message Protocol", 1998, 2159 . 2161 [LDP] Andersson, L., Doolan, P., Feldman, N., Fredette, A., and 2162 B. Thomas, "LDP Specification", 2001, . 2164 [PW-CONTROL] 2165 Martini, L., El-Aawar, N., Heron, G., Rosen, E., Tappan, 2166 D., and T. Smith, "Pseudowire Setup and Maintenance using 2167 the Label Distribution Protocol", 2001, .
2169 [RFC4365] Rosen, E., "Applicability Statement for BGP/MPLS IP 2170 Virtual Private Networks (VPNs)", RFC 4365, DOI 10.17487/ 2171 RFC4365, February 2006, 2172 . 2174 [RSVP-TE] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 2175 and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 2176 Tunnels", 2001, . 2178 [VCCV] Nadeau, T. and R. Aggarwal, "Pseudo Wire Virtual Circuit 2179 Connectivity Verification (VCCV)", 2005, . 2181 [VPLS-BGP] 2182 Kompella, K. and Y. Rekhter, "Virtual Private LAN 2183 Service", 2005, . 2185 Authors' Addresses 2187 Carlos Pignataro 2188 Cisco Systems, Inc. 2190 Email: cpignata@cisco.com 2192 Nagendra Kumar 2193 Cisco Systems, Inc. 2195 Email: naikumar@cisco.com 2197 Sam Aldrin 2198 Google 2200 Email: aldrin.ietf@gmail.com 2202 Mach(Guoyi) Chen 2203 Huawei 2205 Email: mach.chen@huawei.com