idnits 2.17.1 draft-ietf-mpls-lsp-ping-06.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3667, Section 5.1 on line 16. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1291. ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line 1297), which is fine, but *also* found old RFC 2026, Section 10.4C, paragraph 1 text on line 36. ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement -- however, there's a paragraph with a matching beginning. Boilerplate error? ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document seems to lack an RFC 3979 Section 5, para. 1 IPR Disclosure Acknowledgement -- however, there's a paragraph with a matching beginning. Boilerplate error? ( - It does however have an RFC 2026 Section 10.4(A) Disclaimer.) ** The document seems to lack an RFC 3979 Section 5, para. 2 IPR Disclosure Acknowledgement. ** The document seems to lack an RFC 3979 Section 5, para. 3 IPR Disclosure Invitation -- however, there's a paragraph with a matching beginning. Boilerplate error? ( - It does however have an RFC 2026 Section 10.4(B) IPR Disclosure Invitation.) ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. 
The following non-3978 patterns matched text found in the document. That text should be removed or replaced: By submitting this Internet-Draft, I certify that any applicable patent or other IPR claims of which I am aware have been disclosed, or will be disclosed, and any of which I become aware will be disclosed, in accordance with RFC 3668. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document is more than 15 pages and seems to lack a Table of Contents. == It seems as if not all pages are separated by form feeds - found 0 form feeds but 29 pages Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 5 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. == There are 8 instances of lines with private range IPv4 addresses in the document. If these are generic example addresses, they should be changed to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x, 198.51.100.x or 203.0.113.x. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (July 2004) is 7215 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RSVP' is defined on line 1107, but no explicit reference was found in the text == Unused Reference: 'RSVP-REFRESH' is defined on line 1111, but no explicit reference was found in the text == Unused Reference: 'RSVP-TE' is defined on line 1114, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2434 (ref. 'IANA') (Obsoleted by RFC 5226) -- Obsolete informational reference (is this intentional?): RFC 3036 (ref. 'LDP') (Obsoleted by RFC 5036) Summary: 11 errors (**), 0 flaws (~~), 7 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group K. Kompella 2 Internet Draft Juniper Networks 3 Category: Standards Track G. Swallow 4 Expires: January 2005 Cisco Systems 5 July 2004 7 Detecting MPLS Data Plane Failures 8 draft-ietf-mpls-lsp-ping-06.txt 9 *** DRAFT *** 11 Status of this Memo 13 By submitting this Internet-Draft, I certify that any applicable 14 patent or other IPR claims of which I am aware have been disclosed, 15 and any of which I become aware will be disclosed, in accordance with 16 RFC 3668. 18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. Note that 20 other groups may also distribute working documents as Internet- 21 Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or to cite them other than as "work in progress." 
28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt. 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html. 34 Copyright Notice 36 Copyright (C) The Internet Society (2004). All Rights Reserved. 38 Abstract 40 This document describes a simple and efficient mechanism that can be 41 used to detect data plane failures in Multi-Protocol Label Switching 42 (MPLS) Label Switched Paths (LSPs). There are two parts to this 43 document: information carried in an MPLS "echo request" and "echo 44 reply" for the purposes of fault detection and isolation; and 45 mechanisms for reliably sending the echo reply. 47 Changes since last revision 49 (This section to be removed before publication.) 51 *** Changed the format of an L2 circuit ID FEC back to what it was, 52 on demand. Added a new FEC with sender's PE address field to 53 uniquely identify the VC ID *** 55 *** Added a FEC TLV for "Labeled BGP IPv4" *** 57 Reformatted section on Downstream Mapping 59 Described issue with (and solution to) problem with VPN IPv4/6 61 Rephrased section on receiving an LSP ping 63 Clarified "Expert Review" allocation policy. 65 1. Introduction 67 This document describes a simple and efficient mechanism that can be 68 used to detect data plane failures in MPLS LSPs. There are two parts 69 to this document: information carried in an MPLS "echo request" and 70 "echo reply"; and mechanisms for transporting the echo reply. The 71 first part aims at providing enough information to check correct 72 operation of the data plane, as well as a mechanism to verify the 73 data plane against the control plane, and thereby localize faults. 74 The second part suggests two methods of reliable reply channels for 75 the echo request message, for more robust fault isolation. 77 An important consideration in this design is that MPLS echo requests 78 follow the same data path that normal MPLS packets would traverse. 
79 MPLS echo requests are meant primarily to validate the data plane, 80 and secondarily to verify the data plane against the control plane. 81 Mechanisms to check the control plane are valuable, but are not 82 covered in this document. 84 To avoid potential Denial of Service attacks, it is recommended to 85 regulate the LSP ping traffic going to the control plane. A rate 86 limiter should be applied to the well-known UDP port defined below. 88 1.1. Conventions 90 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 91 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 92 document are to be interpreted as described in RFC 2119 [KEYWORDS]. 94 1.2. Structure of this document 96 The body of this memo contains four main parts: motivation, MPLS echo 97 request/reply packet format, LSP ping operation, and a reliable 98 return path. It is suggested that first-time readers skip the actual 99 packet formats and read the Theory of Operation first; the document 100 is structured the way it is to avoid forward references. 102 1.3. Contributors 104 The following made vital contributions to all aspects of this 105 document, and much of the material came out of debate and discussion 106 among this group. 108 Ronald P. Bonica, MCI 109 Dave Cooper, Global Crossing 110 Ping Pan, Ciena 111 Nischal Sheth, Juniper Networks, Inc. 112 Sanjay Wadhwa, Juniper Networks 114 2. Motivation 116 When an LSP fails to deliver user traffic, the failure cannot always 117 be detected by the MPLS control plane. There is a need to provide a 118 tool that would enable users to detect such traffic "black holes" or 119 misrouting within a reasonable period of time; and a mechanism to 120 isolate faults. 122 In this document, we describe a mechanism that accomplishes these 123 goals. 
This mechanism is modeled after the ping/traceroute paradigm: 124 ping (ICMP echo request [ICMP]) is used for connectivity checks, and 125 traceroute is used for hop-by-hop fault localization as well as path 126 tracing. This document specifies a "ping" mode and a "traceroute" 127 mode for testing MPLS LSPs. 129 The basic idea is to verify that packets that belong to a particular 130 Forwarding Equivalence Class (FEC) actually end their MPLS path on an 131 LSR that is an egress for that FEC. This document proposes that this 132 test be carried out by sending a packet (called an "MPLS echo 133 request") along the same data path as other packets belonging to this 134 FEC. An MPLS echo request also carries information about the FEC 135 whose MPLS path is being verified. This echo request is forwarded 136 just like any other packet belonging to that FEC. In "ping" mode 137 (basic connectivity check), the packet should reach the end of the 138 path, at which point it is sent to the control plane of the egress 139 LSR, which then verifies whether it is indeed an egress for the FEC. 140 In "traceroute" mode (fault isolation), the packet is sent to the 141 control plane of each transit LSR, which performs various checks that 142 it is indeed a transit LSR for this path; this LSR also returns 143 further information that helps check the control plane against the 144 data plane, i.e., that forwarding matches what the routing protocols 145 determined as the path. 147 One way these tools can be used is to periodically ping a FEC to 148 ensure connectivity. If the ping fails, one can then initiate a 149 traceroute to determine where the fault lies. One can also 150 periodically traceroute FECs to verify that forwarding matches the 151 control plane; however, this places a greater burden on transit LSRs 152 and thus should be used with caution. 154 3. 
Packet Format 156 An MPLS echo request is a (possibly labelled) IPv4 or IPv6 UDP 157 packet; the contents of the UDP packet have the following format: 159 0 1 2 3 160 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 161 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 162 | Version Number | Must Be Zero | 163 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 164 | Message Type | Reply mode | Return Code | Return Subcode| 165 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 166 | Sender's Handle | 167 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 168 | Sequence Number | 169 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 170 | TimeStamp Sent (seconds) | 171 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 172 | TimeStamp Sent (microseconds) | 173 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 174 | TimeStamp Received (seconds) | 175 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 176 | TimeStamp Received (microseconds) | 177 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 178 | TLVs ... | 179 . . 180 . . 181 . . 182 | | 183 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 185 The Version Number is currently 1. (Note: the Version Number is to 186 be incremented whenever a change is made that affects the ability of 187 an implementation to correctly parse or process an MPLS echo 188 request/reply. These changes include any syntactic or semantic 189 changes made to any of the fixed fields, or to any TLV or sub-TLV 190 assignment or format that is defined at a certain version number. 191 The Version Number may not need to be changed if an optional TLV or 192 sub-TLV is added.) 
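The fixed part of the message shown in the diagram above can be sketched as follows. This is an illustrative sketch only, not part of the specification. It assumes the field widths suggested by the diagram (a 16-bit Version Number, a 16-bit Must Be Zero field, four 8-bit fields, and six 32-bit fields, all in network byte order); the function names pack_echo_header and unpack_echo_header are hypothetical.

```python
import struct

# Assumed layout, read off the diagram: 16-bit Version Number, 16-bit
# Must Be Zero, four 8-bit fields (Message Type, Reply Mode, Return Code,
# Return Subcode), then six 32-bit fields, all in network byte order.
HEADER = struct.Struct("!HHBBBBIIIIII")

FIELD_NAMES = (
    "version", "must_be_zero", "message_type", "reply_mode",
    "return_code", "return_subcode", "senders_handle", "sequence_number",
    "ts_sent_sec", "ts_sent_usec", "ts_rcvd_sec", "ts_rcvd_usec",
)


def pack_echo_header(msg_type, reply_mode, handle, seq, sent_s, sent_us,
                     return_code=0, return_subcode=0,
                     rcvd_s=0, rcvd_us=0, version=1):
    """Build the fixed header; any TLVs would be appended by the caller."""
    return HEADER.pack(version, 0, msg_type, reply_mode, return_code,
                       return_subcode, handle, seq, sent_s, sent_us,
                       rcvd_s, rcvd_us)


def unpack_echo_header(data):
    """Split the fixed header into a dict keyed by field name."""
    return dict(zip(FIELD_NAMES, HEADER.unpack_from(data)))
```

A sender would fill in the Sender's Handle, Sequence Number and TimeStamp Sent fields and leave the Return Code, Return Subcode and TimeStamp Received fields at zero; under the assumed widths the fixed header is 32 octets long.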
194 The Message Type is one of the following: 196 Value Meaning 197 ----- ------- 198 1 MPLS Echo Request 199 2 MPLS Echo Reply 201 The Reply Mode can take one of the following values: 203 Value Meaning 204 ----- ------- 205 1 Do not reply 206 2 Reply via an IPv4/IPv6 UDP packet 207 3 Reply via an IPv4/IPv6 UDP packet with Router Alert 208 4 Reply via application level control channel 210 An MPLS echo request with "Do not reply" may be used for one-way 211 connectivity tests; the receiving router may log gaps in the sequence 212 numbers and/or maintain delay/jitter statistics. An MPLS echo 213 request would normally have "Reply via an IPv4/IPv6 UDP packet"; if 214 the normal IP return path is deemed unreliable, one may use "Reply 215 via an IPv4/IPv6 UDP packet with Router Alert" (note that this 216 requires that all intermediate routers understand and know how to 217 forward MPLS echo replies). The echo reply uses the same IP version 218 number as the received echo request, i.e., an IPv4 encapsulated echo 219 reply is sent in response to an IPv4 encapsulated echo request. 221 Any application which supports an IP control channel between its 222 control entities may set the Reply Mode to 4 to ensure that replies 223 use that same channel. Further definition of this codepoint is 224 application specific and thus beyond the scope of this document. 226 Return Codes and Subcodes are described in the next section. 228 The Sender's Handle is filled in by the sender, and returned 229 unchanged by the receiver in the echo reply (if any). There are no 230 semantics associated with this handle, although a sender may find 231 this useful for matching up requests with replies. 233 The Sequence Number is assigned by the sender of the MPLS echo 234 request, and can be (for example) used to detect missed replies. 236 The TimeStamp Sent is the time-of-day (in seconds and microseconds, 237 with respect to the sender's clock) when the MPLS echo request is sent. 
The 238 TimeStamp Received in an echo reply is the time-of-day (with respect to the 239 receiver's clock) that the corresponding echo request was received. 241 TLVs (Type-Length-Value tuples) have the following format: 243 0 1 2 3 244 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 245 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 246 | Type | Length | 247 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 248 | Value | 249 . . 250 . . 252 . . 253 | | 254 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 256 Types are defined below; Length is the length of the Value field in 257 octets. The Value field depends on the Type; it is zero padded to 258 align to a four-octet boundary. 260 Type # Value Field 261 ------ ----------- 262 1 Target FEC Stack 263 2 Downstream Mapping 264 3 Pad 265 4 Error Code 266 5 Vendor Enterprise Code 268 3.1. Return Codes 270 The Return Code is set to zero by the sender. The receiver can set 271 it to one of the values listed below. The Return Subcode is 272 filled in with the stack-depth for 273 those codes which specify one. For all other codes the Return 274 Subcode MUST be set to zero. 276 Value Meaning 277 ----- ------- 279 0 No return code or return code contained in the Error 280 Code TLV 282 1 Malformed echo request received 284 2 One or more of the TLVs was not understood 286 3 Replying router is an egress for the FEC at stack 287 depth 289 4 Replying router has no mapping for the FEC at stack 290 depth 292 5 Reserved 294 6 Reserved 296 7 Reserved 298 8 Label switched at stack-depth 299 9 Label switched but no MPLS forwarding at stack-depth 300 302 10 Mapping for this FEC is not the given label at stack 303 depth 305 11 No label entry at stack-depth 307 12 Protocol not associated with interface at FEC stack 308 depth 310 13 Premature termination of ping due to label stack 311 shrinking to a single label 313 3.2. 
Target FEC Stack 315 A Target FEC Stack is a list of sub-TLVs. The number of elements is 316 determined by looking at the sub-TLV length fields. 318 Sub-Type # Length Value Field 319 ---------- ------ ----------- 320 1 5 LDP IPv4 prefix 321 2 17 LDP IPv6 prefix 322 3 20 RSVP IPv4 Session Query 323 4 56 RSVP IPv6 Session Query 324 5 Reserved; see Appendix 325 6 13 VPN IPv4 prefix 326 7 25 VPN IPv6 prefix 327 8 14 L2 VPN endpoint 328 9 10 "FEC 128" Pseudowire (old) 329 10 14 "FEC 128" Pseudowire (new) 330 11 13+ "FEC 129" Pseudowire 331 12 10 BGP labeled IPv4 prefix 333 Other FEC Types will be defined as needed. 335 Note that this TLV defines a stack of FECs, the first FEC element 336 corresponding to the top of the label stack, etc. 338 An MPLS echo request MUST have a Target FEC Stack that describes the 339 FEC stack being tested. For example, if an LSR X has an LDP mapping 340 for 192.168.1.1 (say label 1001), then to verify that label 1001 does 341 indeed reach an egress LSR that announced this prefix via LDP, X can 342 send an MPLS echo request with a FEC Stack TLV with one FEC in it, 343 namely of type LDP IPv4 prefix, with prefix 192.168.1.1/32, and send 344 the echo request with a label of 1001. 346 Say LSR X wanted to verify that a label stack of <1001, 23456> is the 347 right label stack to use to reach a VPN IPv4 prefix of 10/8 in VPN 348 foo. Say further that LSR Y with loopback address 192.168.1.1 349 announced prefix 10/8 with Route Distinguisher RD-foo-Y (which may in 350 general be different from the Route Distinguisher that LSR X uses in 351 its own advertisements for VPN foo), label 23456 and BGP nexthop 352 192.168.1.1. Finally, suppose that LSR X receives a label binding of 353 1001 for 192.168.1.1 via LDP. X has two choices in sending an MPLS 354 echo request: X can send an MPLS echo request with a FEC Stack TLV 355 with a single FEC of type VPN IPv4 prefix with a prefix of 10/8 and a 356 Route Distinguisher of RD-foo-Y. 
Alternatively, X can send a FEC 357 Stack TLV with two FECs, the first of type LDP IPv4 prefix with a prefix of 358 192.168.1.1/32 and the second of type VPN IPv4 prefix with a prefix of 10/8 359 and a Route Distinguisher of RD-foo-Y. In either case, the MPLS echo 360 request would have a label stack of <1001, 23456>. (Note: in this 361 example, 1001 is the "outer" label and 23456 is the "inner" label.) 363 3.2.1. LDP IPv4 Prefix 365 The value consists of four octets of an IPv4 prefix followed by one 366 octet of prefix length in bits; the format is given below. The IPv4 367 prefix is in network byte order; if the prefix is shorter than 32 368 bits, trailing bits SHOULD be set to zero. See [LDP] for an example 369 of a Mapping for an IPv4 FEC. 371 0 1 2 3 372 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 373 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 374 | IPv4 prefix | 375 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 376 | Prefix Length | Must Be Zero | 377 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 379 3.2.2. LDP IPv6 Prefix 381 The value consists of sixteen octets of an IPv6 prefix followed by 382 one octet of prefix length in bits; the format is given below. The 383 IPv6 prefix is in network byte order; if the prefix is shorter than 384 128 bits, the trailing bits SHOULD be set to zero. See [LDP] for an 385 example of a Mapping for an IPv6 FEC. 387 0 1 2 3 388 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 389 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 390 | IPv6 prefix | 391 | (16 octets) | 392 | | 393 | | 394 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 395 | Prefix Length | Must Be Zero | 396 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 398 3.2.3. RSVP IPv4 Session 400 The value has the format below. The value fields are taken from 401 [RFC3209, sections 4.6.1.1 and 4.6.2.1]. 
403 0 1 2 3 404 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 405 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 406 | IPv4 tunnel end point address | 407 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 408 | Must Be Zero | Tunnel ID | 409 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 410 | Extended Tunnel ID | 411 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 412 | IPv4 tunnel sender address | 413 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 414 | Must Be Zero | LSP ID | 415 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 417 3.2.4. RSVP IPv6 Session 419 The value has the format below. The value fields are taken from 420 [RFC3209, sections 4.6.1.2 and 4.6.2.2]. 422 0 1 2 3 423 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 424 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 425 | IPv6 tunnel end point address | 426 | | 427 | | 428 | | 429 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 430 | Must Be Zero | Tunnel ID | 431 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 432 | Extended Tunnel ID | 433 | | 434 | | 435 | | 436 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 437 | IPv6 tunnel sender address | 438 | | 439 | | 440 | | 441 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 442 | Must Be Zero | LSP ID | 443 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 445 3.2.5. 
VPN IPv4 Prefix 447 The value field consists of the Route Distinguisher advertised with 448 the VPN IPv4 prefix, the IPv4 prefix (with trailing 0 bits to make 32 449 bits in all) and a prefix length, as follows: 451 0 1 2 3 452 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 453 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 454 | Route Distinguisher | 455 | (8 octets) | 456 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 457 | IPv4 prefix | 458 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 459 | Prefix Length | Must Be Zero | 460 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 462 3.2.6. VPN IPv6 Prefix 464 The value field consists of the Route Distinguisher advertised with 465 the VPN IPv6 prefix, the IPv6 prefix (with trailing 0 bits to make 466 128 bits in all) and a prefix length, as follows: 468 0 1 2 3 469 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 470 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 471 | Route Distinguisher | 472 | (8 octets) | 473 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 474 | IPv6 prefix | 475 | | 476 | | 477 | | 478 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 479 | Prefix Length | Must Be Zero | 480 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 482 3.2.7. 
L2 VPN Endpoint 484 The value field consists of a Route Distinguisher (8 octets), the 485 CE ID of the sender of the ping (2 octets), the receiver's CE ID (2 486 octets), and an encapsulation type (2 octets), formatted as follows: 488 0 1 2 3 489 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 490 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 491 | Route Distinguisher | 492 | (8 octets) | 493 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 494 | Sender's CE ID | Receiver's CE ID | 495 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 496 | Encapsulation Type | Must Be Zero | 497 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 499 3.2.8. FEC 128 Pseudowire (Deprecated) 501 The value field consists of the remote PE address (the destination 502 address of the targeted LDP session), a VC ID and an encapsulation 503 type, as follows: 505 0 1 2 3 506 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 507 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 508 | Remote PE Address | 509 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 510 | VC ID | 511 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 512 | Encapsulation Type | Must Be Zero | 513 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 515 This FEC will be deprecated, and is retained only for backward 516 compatibility. Implementations of LSP ping SHOULD accept and process 517 this TLV, but SHOULD send LSP ping echo requests with the new TLV 518 (see next section), unless explicitly asked by configuration to use 519 the old TLV. 521 An LSR receiving this TLV SHOULD use the source IP address of the LSP 522 echo request to infer the Sender's PE Address. 524 3.2.9. 
FEC 128 Pseudowire (Current) 526 The value field consists of the sender's PE address (the source 527 address of the targeted LDP session), the remote PE address (the 528 destination address of the targeted LDP session), a VC ID and an 529 encapsulation type, as follows: 531 0 1 2 3 532 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 533 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 534 | Sender's PE Address | 535 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 536 | Remote PE Address | 537 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 538 | VC ID | 539 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 540 | Encapsulation Type | Must Be Zero | 541 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 543 3.2.10. FEC 129 Pseudowire 545 0 1 2 3 546 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 547 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 548 | Sender's PE Address | 549 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 550 | Remote PE Address | 551 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 552 | PW Type | AGI Length | SAII Length | 553 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 554 | TAII Length | AGI Value ... SAII Value ... TAII Value ... 555 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 556 . ... . 557 . . 558 . . 559 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 560 ... | 0-3 octets of zero padding | 561 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 563 The Length of this TLV is 13 + AGI length + SAII length + TAII 564 length. Padding is used to make the total length a multiple of 4; 565 the length of the padding is not included in the Length field. 567 3.2.11. 
BGP Labeled IPv4 Prefix 569 The value field consists of the BGP Next Hop associated with the NLRI 570 advertising the prefix and label, the IPv4 prefix (with trailing 0 571 bits to make 32 bits in all), and the prefix length, as follows: 573 0 1 2 3 574 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 575 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 576 | BGP Next Hop | 577 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 578 | IPv4 Prefix | 579 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 580 | Prefix Length | Must Be Zero | 581 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 583 3.3. Downstream Mapping 585 The Downstream Mapping object is an optional TLV. Only one 586 Downstream Mapping request may appear in an echo request. The 587 presence of a Downstream Mapping object is a request that Downstream 588 Mapping objects be included in the echo reply. If the replying 589 router is the destination of the FEC, then a Downstream Mapping TLV 590 SHOULD NOT be included in the echo reply. Otherwise the echo 591 reply SHOULD include a Downstream Mapping object for each 592 interface over which this FEC could be forwarded. For a more precise 593 definition of the notion of "downstream", see the section named 594 "Downstream". 596 The Length is 16 + M + 4*N octets, where M is the Multipath Length, 597 and N is the number of Downstream Labels. 
The Value field of a 598 Downstream Mapping has the following format: 600 0 1 2 3 601 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 602 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 603 | MTU | Address Type | Resvd (SBZ) | 604 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 605 | Downstream IP Address (4 or 16 octets) | 606 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 607 | Downstream Interface Address (4 or 16 octets) | 608 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 609 | Hash Key Type | Depth Limit | Multipath Length | 610 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 611 . . 612 . (Multipath Information) . 613 . . 614 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 615 | Downstream Label | Protocol | 616 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 617 . . 618 . . 619 . . 620 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 621 | Downstream Label | Protocol | 622 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 624 Maximum Transmission Unit (MTU) 626 The MTU is the largest MPLS frame (including label stack) that 627 fits on the interface to the Downstream LSR. 629 Address Type 630 The Address Type indicates if the interface is numbered or 631 unnumbered and is set to one of the following values: 633 Type # Address Type 634 ------ ------------ 635 1 IPv4 636 2 Unnumbered 637 3 IPv6 639 Reserved 641 The field marked SBZ SHOULD be set to zero when sending and 642 SHOULD be ignored on receipt. 
644 Downstream IP Address and Downstream Interface Address 646 If the interface to the downstream LSR is numbered, then the 647 Address Type MUST be set to IPv4 or IPv6, the Downstream IP 648 Address MUST be set to either the downstream LSR's Router ID or 649 the interface address of the downstream LSR, and the Downstream 650 Interface Address MUST be set to the downstream LSR's interface 651 address. 653 If the interface to the downstream LSR is unnumbered, the Address 654 Type MUST be Unnumbered, the Downstream IP Address MUST be the 655 downstream LSR's Router ID (4 octets), and the Downstream 656 Interface Address MUST be set to the index assigned by the 657 upstream LSR to the interface. 659 Multipath Length 661 The length in octets of the Multipath Information. 663 Downstream Label(s) 665 The set of labels in the label stack as it would have appeared if 666 this router were forwarding the packet through this interface. 667 Any Implicit Null labels are explicitly included. Labels are 668 treated as numbers, i.e., they are right justified in the field. 670 A Downstream Label is 24 bits, in the same format as an MPLS 671 label minus the TTL field, i.e., the MSBit of the label is bit 0, 672 the LSbit is bit 19, the EXP bits are bits 20-22, and bit 23 is 673 the S bit. The replying router SHOULD fill in the EXP and S 674 bits; the LSR receiving the echo reply MAY choose to ignore these 675 bits. 677 Protocol 678 The Protocol is taken from the following table: 680 Protocol # Signaling Protocol 681 ---------- ------------------ 682 0 Unknown 683 1 Static 684 2 BGP 685 3 LDP 686 4 RSVP-TE 687 5 Reserved; see Appendix 689 Depth Limit 691 The Depth Limit is applicable only to a label stack, and is the 692 maximum number of labels considered in the hash; this SHOULD be 693 set to zero if unspecified or unlimited. 695 Multipath Information 697 The multipath information encodes labels or addresses which will 698 exercise this path. 
The multipath information depends on the 699 hash key type. The contents of the field are shown in the table 700 in Section 3.3.1. IP addresses are drawn from the range 127/8. Labels are 701 treated as numbers, i.e., they are right justified in the field. 702 Label and Address pairs MUST NOT overlap and MUST be in ascending 703 sequence. 705 Hash key 8 allows a denser encoding of IP addresses. The IPv4 706 prefix is formatted as a base IPv4 address with the non-prefix 707 low order bits set to zero. The maximum prefix length is 27. 708 Following the prefix is a mask of length 2^(32-prefix length) 709 bits. Each bit set to one represents a valid address. The 710 address is the base IPv4 address plus the position of the bit in 711 the mask where the bits are numbered left to right beginning with 712 zero. 714 Hash key 9 allows a denser encoding of Labels. The label prefix 715 is formatted as a base label value with the non-prefix low order 716 bits set to zero. The maximum prefix (including leading zeros 717 due to encoding) length is 27. Following the prefix is a mask of 718 length 2^(32-prefix length) bits. Each bit set to one represents 719 a valid Label. The label is the base label plus the position of 720 the bit in the mask where the bits are numbered left to right 721 beginning with zero. 723 If the received multipath information is non-null, the labels and 724 IP addresses MUST be picked from the set provided or the Hash Key 725 Type MUST be set to 7. If the received multipath information is 726 null, the receiver simply returns null. 728 For example, suppose LSR X at hop 10 has two downstream LSRs Y 729 and Z for the FEC in question. X could return Hash Key Type 4, 730 with low/high IP addresses of 1.1.1.1->1.1.1.255 for downstream 731 LSR Y and 2.1.1.1->2.1.1.255 for downstream LSR Z. The head end 732 reflects this information to LSR Y.
Y, which has three 733 downstream LSRs U, V and W, computes that 1.1.1.1->1.1.1.127 734 would go to U and 1.1.1.128->1.1.1.255 would go to V. Y would 735 then respond with 3 Downstream Mappings: to U, with Hash Key Type 736 4 (1.1.1.1->1.1.1.127); to V, with Hash Key Type 4 737 (1.1.1.128->1.1.1.255); and to W, with Hash Key Type 7. 739 3.3.1. "Downstream" 741 The notion of "downstream router" and "downstream interface" should 742 be explained. Consider an LSR X. If a packet that was originated 743 with TTL n>1 arrived with outermost label L at LSR X, X must be able 744 to compute which LSRs could receive the packet if it was originated 745 with TTL=n+1, over which interface the request would arrive and what 746 label stack those LSRs would see. (It is outside the scope of this 747 document to specify how this computation is done.) The set of these 748 LSRs/interfaces is the set of downstream routers/interfaces (and their 749 corresponding labels) for X with respect to L. Each pair of 750 downstream router and interface requires a separate Downstream 751 Mapping to be added to the reply. (Note that there are multiple 752 Downstream Label fields in each TLV as the incoming label L may be 753 swapped with a label stack.) 755 The case where X is the LSR originating the echo request is a special 756 case. X needs to figure out what LSRs would receive the MPLS echo 757 request for a given FEC Stack that X originates with TTL=1. 759 The set of downstream routers at X may be alternative paths (see the 760 discussion below on ECMP) or simultaneous paths (e.g., for MPLS 761 multicast). In the former case, the Multipath sub-field is used as a 762 hint to the sender as to how it may influence the choice of these 763 alternatives. The "No of Multipaths" is the number of IP 764 Address/Next Label fields.
The Hash Key Type is taken from the 765 following table: 767 Key Type Multipath Information 768 --- ---------------- --------------------- 769 0 no multipath (empty; M = 0) 770 1 label labels 771 2 IP address IP addresses 772 3 label range low/high label pairs 773 4 IP address range low/high address pairs 774 5 no more labels (empty; M = 0) 775 6 All IP addresses (empty; M = 0) 776 7 no match (empty; M = 0) 777 8 Bit-masked IPv4 IP address prefix and bit mask 778 address set 779 9 Bit-masked label set Label prefix and bit mask 781 Type 0 indicates that all packets will be forwarded out this one 782 interface. 784 Types 1, 2, 3, 4, 8 and 9 specify that the supplied Multipath 785 Information will serve to exercise this path. 787 Types 5 and 6 are TBD. 789 Type 7 indicates that no matches are possible given the Multipath 790 Information in the received DS mapping information. 792 3.4. Pad TLV 794 The value part of the Pad TLV contains a variable number (>= 1) of 795 octets. The first octet takes values from the following table; all 796 the other octets (if any) are ignored. The receiver SHOULD verify 797 that the TLV is received in its entirety, but otherwise ignores the 798 contents of this TLV, apart from the first octet. 800 Value Meaning 801 ----- ------- 802 1 Drop Pad TLV from reply 803 2 Copy Pad TLV to reply 804 3-255 Reserved for future use 806 3.5. Error Code 808 The Error Code TLV is currently not defined; its purpose is to 809 provide a mechanism for a more elaborate error reporting structure, 810 should the reason arise. 812 3.6. Vendor Enterprise Code 814 The Length is always 4; the value is the SMI Enterprise code, in 815 network octet order, of the vendor with a Vendor Private extension to 816 any of the fields in the fixed part of the message, in which case 817 this TLV MUST be present. If none of the fields in the fixed part of 818 the message have vendor private extensions, this TLV is OPTIONAL. 820 4.
Theory of Operation 822 An MPLS echo request is used to test a particular LSP. The LSP to be 823 tested is identified by the "FEC Stack"; for example, if the LSP was 824 set up via LDP, and is to an egress IP address of 10.1.1.1, the FEC 825 stack contains a single element, namely, an LDP IPv4 prefix sub-TLV 826 with value 10.1.1.1/32. If the LSP being tested is an RSVP LSP, the 827 FEC stack consists of a single element that captures the RSVP Session 828 and Sender Template which uniquely identifies the LSP. 830 FEC stacks can be more complex. For example, one may wish to test a 831 VPN IPv4 prefix of 10.1/8 that is tunneled over an LDP LSP with 832 egress 10.10.1.1. The FEC stack would then contain two sub-TLVs, the 833 first being a VPN IPv4 prefix, and the second being an LDP IPv4 834 prefix. If the underlying (LDP) tunnel were not known, or was 835 considered irrelevant, the FEC stack could be a single element with 836 just the VPN IPv4 sub-TLV. 838 When an MPLS echo request is received, the receiver is expected to do 839 a number of tests that verify that the control plane and data plane 840 are both healthy (for the FEC stack being pinged), and that the two 841 planes are in sync. 843 4.1. Dealing with Equal-Cost Multi-Path (ECMP) 845 LSPs need not be simple point-to-point tunnels. Frequently, a single 846 LSP may originate at several ingresses, and terminate at several 847 egresses; this is very common with LDP LSPs. LSPs for a given FEC 848 may also have multiple "next hops" at transit LSRs. At an ingress, 849 there may also be several different LSPs to choose from to get to the 850 desired endpoint. Finally, LSPs may have backup paths, detour paths 851 and other alternative paths to take should the primary LSP go down. 853 To deal with the last two first: it is assumed that the LSR sourcing 854 MPLS echo requests can force the echo request into any desired LSP, 855 so choosing among multiple LSPs at the ingress is not an issue. 
The 856 problem of probing the various flavors of backup paths that will 857 typically not be used for forwarding data unless the primary LSP is 858 down will not be addressed here. 860 Since the actual LSP and path that a given packet may take may not be 861 known a priori, it is useful if MPLS echo requests can exercise all 862 possible paths. This, while desirable, may not be practical, because 863 the algorithms that a given LSR uses to distribute packets over 864 alternative paths may be proprietary. 866 To achieve some degree of coverage of alternate paths, there is a 867 certain latitude in choosing the destination IP address and source 868 UDP port for an MPLS echo request. This is clearly not sufficient; 869 in the case of traceroute, more latitude is offered by means of the 870 "Multipath Exercise" sub-TLV of the Downstream Mapping TLV. This is 871 used as follows. An ingress LSR periodically sends an MPLS 872 traceroute message to determine whether there are multipaths for a 873 given LSP. If so, each hop will provide some information on how each of 874 its downstreams can be exercised. The ingress can then send MPLS 875 echo requests that exercise these paths. If several transit LSRs 876 have ECMP, the ingress may attempt to compose these to exercise all 877 possible paths. However, full coverage may not be possible. 879 4.2. Sending an MPLS Echo Request 881 An MPLS echo request is a (possibly) labelled UDP packet. The IP 882 header is set as follows: the source IP address is a routable address 883 of the sender; the destination IP address is a (randomly chosen) 884 address from 127/8; the IP TTL is set to 1. The source UDP port is 885 chosen by the sender; the destination UDP port is set to 3503 886 (assigned by IANA for MPLS echo requests). The Router Alert option 887 is set in the IP header.
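The IP and UDP header settings above can be sketched as follows. This is only an illustrative sketch, not a packet builder: the function name echo_request_ip_params, the dictionary keys, the choice of source port range (49152-65535) and the exclusion of the first and last addresses of 127/8 are all assumptions made for this example and are not requirements of this document.

```python
import ipaddress
import random

LSP_PING_UDP_PORT = 3503  # destination UDP port assigned by IANA


def echo_request_ip_params(sender_address, rng=random):
    """Return the IP/UDP header settings for an MPLS echo request.

    Hypothetical helper: it only selects the values, it does not
    build or send a packet.
    """
    loopback = ipaddress.ip_network("127.0.0.0/8")
    # Randomly chosen destination from 127/8.  Skipping the first and
    # last address of the range is a choice made for this sketch.
    offset = rng.randrange(1, loopback.num_addresses - 1)
    destination = loopback.network_address + offset
    return {
        "src_ip": str(ipaddress.ip_address(sender_address)),
        "dst_ip": str(destination),
        "ip_ttl": 1,
        # The source port is chosen by the sender; this range is an
        # assumption of the sketch.
        "src_port": rng.randrange(49152, 65536),
        "dst_port": LSP_PING_UDP_PORT,
        "router_alert": True,
    }
```

Passing a seeded random.Random instance as rng makes the selection repeatable, which may be convenient when the same destination address and source port must be reused to exercise a particular path.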
889 If the echo request is labelled, one may (depending on what is being 890 pinged) set the TTL of the innermost label to 1, to prevent the ping 891 request going farther than it should. Examples of this include 892 pinging a VPN IPv4 or IPv6 prefix, an L2 VPN end point or a 893 pseudowire. This can also be accomplished by inserting a router 894 alert label above this label; however, this may lead to the undesired 895 side effect that MPLS echo requests take a different data path than 896 actual data. 898 In "ping" mode (end-to-end connectivity check), the TTL in the 899 outermost label is set to 255. In "traceroute" mode (fault isolation 900 mode), the TTL is set successively to 1, 2, .... 902 The sender chooses a Sender's Handle, and a Sequence Number. When 903 sending subsequent MPLS echo requests, the sender SHOULD increment 904 the sequence number by 1. However, a sender MAY choose to send a 905 group of echo requests with the same sequence number to improve the 906 chance of arrival of at least one packet with that sequence number. 908 The TimeStamp Sent is set to the time-of-day (in seconds and 909 microseconds) that the echo request is sent. The TimeStamp Received 910 is set to zero. 912 An MPLS echo request MUST have a FEC Stack TLV. Also, the Reply Mode 913 must be set to the desired reply mode; the Return Code and Subcode 914 are set to zero. 916 In the "traceroute" mode, the echo request SHOULD contain one or more 917 Downstream Mapping TLVs. For TTL=1, all the downstream routers (and 918 corresponding labels) for the sender with respect to the FEC Stack 919 being pinged SHOULD be sent in the echo request. 
For n>1, the 920 Downstream Mapping TLVs from the echo reply for TTL=(n-1) are copied 921 to the echo request with TTL=n; the sender MAY choose to reduce the 922 size of a "Downstream Multipath Mapping TLV" when copying into the 923 next echo request as long as the Hash Key Type matching the label or 924 IP address used to exercise the current MP is still present. 926 4.3. Receiving an MPLS Echo Request 928 An LSR X that receives an MPLS echo request first parses the packet 929 to ensure that it is a well-formed packet, and that the TLVs that are 930 not marked "Ignore" are understood. If not, X SHOULD send an MPLS 931 echo reply with the Return Code set to "Malformed echo request 932 received" or "TLV not understood" (as appropriate), and the Subcode 933 set to zero. In the latter case, the misunderstood TLVs (only) are 934 included in the reply. 936 If the echo request is good, X notes the interface I over which the 937 echo was received, and the label stack with which it came. 939 X matches up the labels in the received label stack with the FECs 940 contained in the FEC stack. The matching is done beginning at the 941 bottom of both stacks, and working up. For reporting purposes the 942 bottom of stack is considered to be stack-depth of 1. This is to 943 establish an absolute reference for the case where the stack may have 944 more labels than are in the FEC stack. 946 If there are more FECs than labels, the extra FECs are assumed to 947 correspond to Implicit Null Labels. That is, extra Implicit Null 948 Labels are added to the top of the received label stack and the stack 949 depth is set to the depth of the FEC stack. Thus for the processing 950 below, there is never the case where there is a FEC with no 951 corresponding label. Further, the label operation associated with an 952 assumed Null Label is 'pop and continue processing'. 954 Note: in all the error codes listed in this draft a stack-depth of 0 955 means "no value specified".
This allows compatibility with existing 956 implementations which do not use the Return Subcode field. 958 X sets a variable, call it current-stack-depth, to the number of 959 labels in the received label stack. Processing now continues with 960 the following steps: 962 1. Check if there is a FEC corresponding to the current-stack- 963 depth. If there is, go to step 2. If not, check if the label is 964 valid on interface I. If it is, continue with step 4. Otherwise 965 X MUST send an MPLS echo reply with a Return Code 11, "No label 966 entry at stack-depth" and a Return Subcode set to current-stack- 967 depth. 969 2. Check the FEC at the current-stack-depth to determine what 970 protocol would be used to advertise it. If X can determine that 971 no protocol associated with interface I would have advertised a 972 FEC of that FEC-Type, X MUST send an MPLS echo reply with a 973 Return Code 12, "Protocol not associated with interface at FEC 974 stack-depth" and a Return Subcode set to current-stack-depth. 976 3. Check that the mapping for the FEC at the current-stack-depth is 977 the corresponding label. 979 If no mapping for the FEC exists, X MUST send an MPLS echo reply 980 with a Return Code 4, "Replying router has no mapping for the FEC 981 at stack-depth" and a Return Subcode set to current-stack-depth. 983 If a mapping is found, but the mapping is not the corresponding 984 label, X MUST send an MPLS echo reply with a Return Code 10, 985 "Mapping for this FEC is not the given label at stack-depth" and 986 a Return Subcode set to current-stack-depth. 988 4. X determines the label operation. If the operation is to pop and 989 continue processing, X checks the current-stack-depth. If it is 990 one, X MUST send an MPLS echo reply with a Return Code 3, 991 "Replying router is an egress for the FEC at stack depth" and a 992 Return Subcode set to one. Otherwise, X decrements current-stack- 993 depth and goes back to step 1.
995 If the label operation is pop and switch based on the popped 996 label, X then checks if it is valid to forward a labelled packet. 997 If it is, X MUST send an MPLS echo reply with a Return Code 8, 998 "Label switched at stack-depth" and a Return Subcode set to 999 current-stack-depth. If it is not valid to forward a labelled 1000 packet, X MUST send an MPLS echo reply with a Return Code 9, 1001 "Label switched but no MPLS forwarding at stack-depth" and a 1002 Return Subcode set to current-stack-depth. This return code is 1003 sent even if current-stack-depth is one. 1005 If the label operation is swap, X MUST send an MPLS echo reply 1006 with a Return Code 8, "Label switched at stack-depth" and a 1007 Return Subcode set to current-stack-depth. 1009 If the MPLS echo request contains a downstream mapping TLV, and the 1010 MPLS echo reply has either a Return Code of 8, or a Return Code of 9 1011 with a Return Subcode of 1 then Downstream mapping TLVs SHOULD be 1012 included for each multipath. 1014 X uses the procedure in the next subsection to send the echo reply. 1016 4.4. Sending an MPLS Echo Reply 1018 An MPLS echo reply is a UDP packet. It MUST ONLY be sent in response 1019 to an MPLS echo request. The source IP address is a routable address 1020 of the replier; the source port is the well-known UDP port for LSP 1021 ping. The destination IP address and UDP port are copied from the 1022 source IP address and UDP port of the echo request. The IP TTL is 1023 set to 255. If the Reply Mode in the echo request is "Reply via an 1024 IPv4 UDP packet with Router Alert", then the IP header MUST contain 1025 the Router Alert IP option. If the reply is sent over an LSP, the 1026 topmost label MUST in this case be the Router Alert label (1) (see 1027 [LABEL-STACK]). 1029 The format of the echo reply is the same as the echo request. 
The 1030 Sender's Handle, the Sequence Number and TimeStamp Sent are copied 1031 from the echo request; the TimeStamp Received is set to the time-of- 1032 day that the echo request is received (note that this information is 1033 most useful if the time-of-day clocks on the requestor and the 1034 replier are synchronized). The FEC Stack TLV from the echo request 1035 MAY be copied to the reply. 1037 The replier MUST fill in the Return Code and Subcode, as determined 1038 in the previous subsection. 1040 If the echo request contains a Pad TLV, the replier MUST interpret 1041 the first octet for instructions regarding how to reply. 1043 If the echo request contains a Downstream Mapping TLV, the replier 1044 SHOULD compute its downstream routers and corresponding labels for 1045 the incoming label, and add Downstream Mapping TLVs for each one to 1046 the echo reply it sends back. 1048 4.5. Receiving an MPLS Echo Reply 1050 An LSR X should only receive an MPLS Echo Reply in response to an 1051 MPLS Echo Request that it sent. Thus, on receipt of an MPLS Echo 1052 Reply, X should parse the packet to assure that it is well-formed, 1053 then attempt to match up the Echo Reply with an Echo Request that it 1054 had previously sent, using the destination UDP port and the Sender's 1055 Handle. If no match is found, then X jettisons the Echo Reply; 1056 otherwise, it checks the Sequence Number to see if it matches. Gaps 1057 in the Sequence Number MAY be logged and SHOULD be counted. Once an 1058 Echo Reply is received for a given Sequence Number (for a given UDP 1059 port and Handle), the Sequence Number for subsequent Echo Requests 1060 for that UDP port and Handle SHOULD be incremented. 1062 If the Echo Reply contains Downstream Mappings, and X wishes to 1063 traceroute further, it SHOULD copy the Downstream Mappings into its 1064 next Echo Request (with TTL incremented by one). 1066 4.6. 
Issue with VPN IPv4 and IPv6 Prefixes 1068 Typically, a LSP ping for a VPN IPv4 or IPv6 prefix is sent with a 1069 label stack of depth greater than 1, with the innermost label having 1070 a TTL of 1. This is to terminate the ping at the egress PE, before 1071 it gets sent to the customer device. However, under certain 1072 circumstances, the label stack can shrink to a single label before 1073 the ping hits the egress PE; this will result in the ping terminating 1074 prematurely. One such scenario is a multi-AS Carrier's Carrier VPN. 1076 To get around this problem, one approach is for the LSR that receives 1077 such a ping to realize that the ping terminated prematurely, and send 1078 back error code 13. In that case, the initiating LSR can retry the 1079 ping after incrementing the TTL on the VPN label. In this fashion, 1080 the ingress LSR will sequentially try TTL values until it finds one 1081 that allows the VPN ping to reach the egress PE. 1083 4.7. Non-compliant Routers 1085 If the egress for the FEC Stack being pinged does not support MPLS 1086 ping, then no reply will be sent, resulting in possible "false 1087 negatives". If in "traceroute" mode, a transit LSR does not support 1088 LSP ping, then no reply will be forthcoming from that LSR for some 1089 TTL, say n. The LSR originating the echo request SHOULD try sending 1090 the echo request with TTL=n+1, n+2, ..., n+k in the hope that some 1091 transit LSR further downstream may support MPLS echo requests and 1092 reply. In such a case, the echo request for TTL>n MUST NOT have 1093 Downstream Mapping TLVs, until a reply is received with a Downstream 1094 Mapping. 1096 Normative References 1098 [IANA] Narten, T. and H. Alvestrand, "Guidelines for IANA 1099 Considerations", BCP 26, RFC 2434, October 1998. 1101 [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate 1102 Requirement Levels", BCP 14, RFC 2119, March 1997. 
1104 [LABEL-STACK] Rosen, E., et al, "MPLS Label Stack Encoding", RFC 1105 3032, January 2001. 1107 [RSVP] Braden, R. (Editor), et al, "Resource ReSerVation protocol 1108 (RSVP) -- Version 1 Functional Specification," RFC 2205, 1109 September 1997. 1111 [RSVP-REFRESH] Berger, L., et al, "RSVP Refresh Overhead Reduction 1112 Extensions", RFC 2961, April 2001. 1114 [RSVP-TE] Awduche, D., et al, "RSVP-TE: Extensions to RSVP for LSP 1115 tunnels", RFC 3209, December 2001. 1117 Informative References 1119 [ICMP] Postel, J., "Internet Control Message Protocol", RFC 792. 1121 [LDP] Andersson, L., et al, "LDP Specification", RFC 3036, January 1122 2001. 1124 Security Considerations 1126 There are at least two approaches to attacking LSRs using the 1127 mechanisms defined here. One is a Denial of Service attack, by 1128 sending MPLS echo requests/replies to LSRs and thereby increasing 1129 their workload. The other is obfuscating the state of the MPLS data 1130 plane liveness by spoofing, hijacking, replaying or otherwise 1131 tampering with MPLS echo requests and replies. 1133 Authentication will help reduce the number of seemingly valid MPLS 1134 echo requests, and thus cut down the Denial of Service attacks; 1135 beyond that, each LSR must protect itself. 1137 Authentication sufficiently addresses spoofing, replay and most 1138 tampering attacks; one hopes to use some mechanism devised or 1139 suggested by the RPSec WG. It is not clear how to prevent hijacking 1140 (non-delivery) of echo requests or replies; however, if these 1141 messages are indeed hijacked, LSP ping will report that the data 1142 plane isn't working as it should. 1144 It doesn't seem vital (at this point) to secure the data carried in 1145 MPLS echo requests and replies, although knowledge of the state of 1146 the MPLS data plane may be considered confidential by some. 1148 5. IANA Considerations 1150 The TCP and UDP port number 3503 has been allocated by IANA for LSP 1151 echo requests and replies. 
1153 The following sections detail the new name spaces to be managed by 1154 IANA. For each of these name spaces, the space is divided into 1155 assignment ranges; the following terms are used in describing the 1156 procedures by which IANA allocates values: "Standards Action" (as 1157 defined in [IANA]); "Expert Review" and "Vendor Private Use". 1159 Values from "Expert Review" ranges MUST be registered with IANA, and 1160 MUST be accompanied by an Experimental RFC that describes the format 1161 and procedures for using the code point; the actual assignment is 1162 made during the IANA actions for the RFC. 1164 Values from "Vendor Private" ranges MUST NOT be registered with IANA; 1165 however, the message MUST contain an enterprise code as registered 1166 with the IANA SMI Network Management Private Enterprise Codes. For 1167 each name space that has a Vendor Private range, it must be specified 1168 where exactly the SMI Enterprise Code resides; see below for 1169 examples. In this way, several enterprises (vendors) can use the 1170 same code point without fear of collision. 1172 5.1. Message Types, Reply Modes, Return Codes 1174 It is requested that IANA maintain registries for Message Types, 1175 Reply Modes, Return Codes and Return Subcodes. Each of these can 1176 take values in the range 0-255. Assignments in the range 0-191 are 1177 via Standards Action; assignments in the range 192-251 are made via 1178 Expert Review; values in the range 252-255 are for Vendor Private 1179 Use, and MUST NOT be allocated. 1181 If any of these fields fall in the Vendor Private range, a top-level 1182 Vendor Enterprise Code TLV MUST be present in the message. 1184 5.2. TLVs 1186 It is requested that IANA maintain registries for the Type field of 1187 top-level TLVs as well as for sub-TLVs. The valid range for each of 1188 these is 0-65535. 
Assignments in the range 0-32767 are made via 1189 Standards Action as defined in [IANA]; assignments in the range 1190 32768-64511 are made via Expert Review (see below); values in the 1191 range 64512-65535 are for Vendor Private Use, and MUST NOT be 1192 allocated. 1194 If a TLV or sub-TLV has a Type that falls in the range for Vendor 1195 Private Use, the Length MUST be at least 4, and the first four octets 1196 MUST be that vendor's SMI Enterprise Code, in network octet order. 1197 The rest of the Value field is private to the vendor. 1199 Acknowledgments 1201 This document is the outcome of many discussions among many people, 1202 that include Manoj Leelanivas, Paul Traina, Yakov Rekhter, Der-Hwa 1203 Gan, Brook Bailey, Eric Rosen, Ina Minei and Shivani Aggarwal. 1205 The description of the Multipath Information sub-field of the 1206 Downstream Mapping TLV was adapted from text suggested by Curtis 1207 Villamizar. 1209 Appendix 1211 This appendix specifies non-normative aspects of detecting MPLS data 1212 plane liveness. 1214 5.1. CR-LDP FEC 1216 This section describes how a CR-LDP FEC can be included in an Echo 1217 Request using the following FEC subtype: 1219 Sub-Type # Length Value Field 1220 ---------- ------ ------------- 1221 5 6 CR-LDP LSP ID 1223 The value consists of the LSPID of the LSP being pinged. An LSPID is 1224 a four octet IPv4 address (a local address on the ingress LSR, for 1225 example, the Router ID) plus a two octet identifier that is unique 1226 per LSP on a given ingress LSR. 1228 0 1 2 3 1229 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1231 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1232 | Ingress LSR Router ID | 1233 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1234 | Must Be Zero | LSP ID | 1235 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1237 5.2.
Downstream Mapping for CR-LDP 1239 If a label in a Downstream Mapping was learned via CR-LDP, the 1240 Protocol field in the Mapping TLV can use the following entry: 1242 Protocol # Signaling Protocol 1243 ---------- ------------------ 1244 5 CR-LDP 1246 Authors' Address 1248 Kireeti Kompella 1249 Juniper Networks 1250 1194 N.Mathilda Ave 1251 Sunnyvale, CA 94089 1252 Email: kireeti@juniper.net 1254 George Swallow 1255 Cisco Systems 1256 1414 Massachusetts Ave, 1257 Boxborough, MA 01719 1258 Phone: +1 978 936 1398 1259 Email: swallow@cisco.com 1261 Intellectual Property Rights Notices 1263 The IETF takes no position regarding the validity or scope of any 1264 intellectual property or other rights that might be claimed to 1265 pertain to the implementation or use of the technology described in 1266 this document or the extent to which any license under such rights 1267 might or might not be available; neither does it represent that it 1268 has made any effort to identify any such rights. Information on the 1269 IETF's procedures with respect to rights in standards-track and 1270 standards-related documentation can be found in BCP-11. Copies of 1271 claims of rights made available for publication and any assurances of 1272 licenses to be made available, or the result of an attempt made to 1273 obtain a general license or permission for the use of such 1274 proprietary rights by implementors or users of this specification can 1275 be obtained from the IETF Secretariat. 1277 The IETF invites any interested party to bring to its attention any 1278 copyrights, patents or patent applications, or other proprietary 1279 rights which may cover technology that may be required to practice 1280 this standard. Please address the information to the IETF Executive 1281 Director. 
1283 Disclaimer of Validity 1285 This document and the information contained herein are provided on an 1286 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 1287 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 1288 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 1289 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 1290 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 1291 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 1293 Copyright Statement 1295 Copyright (C) The Internet Society (2004). This document is subject 1296 to the rights, licenses and restrictions contained in BCP 78, and 1297 except as set forth therein, the authors retain all their rights.