Internet Engineering Task Force  Inter-Domain Multicast Routing Working Group
INTERNET-DRAFT                                              W. Fenner
draft-ietf-idmr-traceroute-ipm-04.txt                       Xerox PARC
                                                            S. Casner
                                                            Cisco Systems
                                                            February 26, 1999
                                                            Expires August 1999

              A "traceroute" facility for IP Multicast.

Status of this Memo

This document is an Internet Draft and is in full conformance with all
provisions of Section 10 of RFC2026.  Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups.  Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Distribution of this document is unlimited.

Abstract

     This draft describes the IGMP multicast traceroute facility.
     As the deployment of IP multicast has spread, it has become clear
     that a method for tracing the route that a multicast IP packet takes
     from a source to a particular receiver is absolutely required.
     Unlike unicast traceroute, multicast traceroute requires a special
     packet type and implementation on the part of routers.  This
     specification describes the required functionality.

This document is a product of the Inter-Domain Multicast Routing working
group within the Internet Engineering Task Force.  Comments are
solicited and should be addressed to the working group's mailing list at
idmr@cs.ucl.ac.uk and/or the author(s).

Key Words

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [Bradner97].

1.  Introduction

The unicast "traceroute" program allows the tracing of a path from one
machine to another, using a mechanism that already existed in IP.
Unfortunately, no such existing mechanism can be applied to IP multicast
paths.  The key mechanism for unicast traceroute is the ICMP TTL
exceeded message, which is specifically precluded as a response to
multicast packets.  Thus, we specify the multicast "traceroute" facility
to be implemented in multicast routers and accessed by diagnostic
programs.  While it is a disadvantage that a new mechanism is required,
the multicast traceroute facility can provide additional information
about packet rates and losses that the unicast traceroute cannot, and
generally requires fewer packets to be sent.

Goals:

o    To be able to trace the path that a packet would take from some
     source to some destination.

o    To be able to isolate packet loss problems (e.g., congestion).

o    To be able to isolate configuration problems (e.g., TTL threshold).

o    To minimize packets sent (e.g. no flooding, no implosion).

2.  Overview

Given a multicast distribution tree, tracing from a source to a
multicast destination is hard, since you don't know down which branch of
the multicast tree the destination lies.  This means that you have to
flood the whole tree to find the path from one source to one
destination.  However, walking up the tree from destination to source is
easy, as most existing multicast routing protocols know the previous hop
for each source.  Tracing from destination to source can involve only
routers on the direct path.

The party requesting the traceroute (which need be neither the source
nor the destination) sends a traceroute Query packet to the last-hop
multicast router for the given destination.  The last-hop router turns
the Query into a Request packet by adding a response data block
containing its interface addresses and packet statistics, and then
forwards the Request packet via unicast to the router that it believes
is the proper previous hop for the given source and group.  Each hop
adds its response data to the end of the Request packet, then unicast
forwards it to the previous hop.  The first hop router (the router that
believes that packets from the source originate on one of its directly
connected networks) changes the packet type to indicate a Response
packet and sends the completed response to the response destination
address.  The response may be returned before reaching the first hop
router if a fatal error condition such as "no route" is encountered
along the path.

Multicast traceroute uses any information available to it in the router
to attempt to determine a previous hop to forward the trace towards.
Multicast routing protocols vary in the type and amount of state they
keep; multicast traceroute endeavors to work with all of them by using
whatever is available.
For example, if a DVMRP router has no active
state for a particular source but does have a DVMRP route, it chooses
the parent of the DVMRP route as the previous hop.  If a PIM-SM router
is on the (*,G) tree, it chooses the parent towards the RP as the
previous hop.  In these cases, no source/group-specific state is
available, but the path may still be traced.

3.  Multicast Traceroute header

The header for all multicast traceroute packets is as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   IGMP Type   |    # hops     |           checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Multicast Group Address                    |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                        Source Address                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Destination Address                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Response Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   resp ttl    |                   Query ID                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.1.  IGMP Type: 8 bits

     The IGMP type field is defined to be 0x1F for traceroute queries
     and requests.  The IGMP type field is changed to 0x1E when the
     packet is completed and sent as a response from the first hop
     router to the querier.  Two codes are required so that multicast
     routers won't attempt to process a completed response in those
     cases where the initial query was issued from a router or the
     response is sent via multicast.

3.2.  # hops: 8 bits

     This field specifies the maximum number of hops that the requester
     wants to trace.
     If there is some error condition in the middle of
     the path that keeps the traceroute request from reaching the
     first-hop router, this field can be used to perform an
     expanding-length search to trace the path to just before the
     problem.

3.3.  Checksum: 16 bits

     The checksum is the 16-bit one's complement of the one's complement
     sum of the whole IGMP message (the entire IP payload)[Brad88].
     When computing the checksum, the checksum field is set to zero.
     When transmitting packets, the checksum MUST be computed and
     inserted into this field.  When receiving packets, the checksum
     MUST be verified before processing a packet.

3.4.  Group address

     This field specifies the group address to be traced, or zero if no
     group-specific information is desired.  Note that
     non-group-specific traceroutes may not be possible with certain
     multicast routing protocols.

3.5.  Source address

     This field specifies the IP address of the multicast source for the
     path being traced, or 0xFFFFFFFF if no source-specific information
     is desired.  Note that non-source-specific traceroutes may not be
     possible with certain multicast routing protocols.

3.6.  Destination address

     This field specifies the IP address of the multicast receiver for
     the path being traced.  The trace starts at this destination and
     proceeds toward the traffic source.

3.7.  Response Address

     This field specifies where the completed traceroute response packet
     gets sent.  It can be a unicast address or a multicast address, as
     explained in section 6.2.

3.8.  resp ttl: 8 bits

     This field specifies the TTL at which to multicast the response, if
     the response address is a multicast address.

3.9.  Query ID: 24 bits

     This field is used as a unique identifier for this traceroute
     request so that duplicate or delayed responses may be detected and
     to minimize collisions when a multicast response address is used.

4.  Definitions

Since multicast traceroutes flow in the opposite direction to the data
flow, we always refer to "upstream" and "downstream" with respect to
data, unless explicitly specified.

Incoming Interface
     The interface on which traffic is expected from the specified
     source and group.

Outgoing Interface
     The interface on which traffic is forwarded from the specified
     source and group towards the destination.  Also called the
     "Reception Interface", since it is the interface on which the
     multicast traceroute Request was received.

Previous-Hop Router
     The router, on the Incoming Interface, which is responsible for
     forwarding traffic for the specified source and group.

5.  Response data

Each router adds a "response data" segment to the traceroute packet
before it forwards it on.  The response data looks like this:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Query Arrival Time                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Incoming Interface Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Outgoing Interface Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Previous-Hop Router Address                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Input packet count on incoming interface            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Output packet count on outgoing interface           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Total number of packets for this source-group pair       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |               |M| |           |               |
   | Rtg Protocol  |    FwdTTL     |B|S| Src Mask  |Forwarding Code|
   |               |               |Z| |           |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5.1.  Query Arrival Time

     The Query Arrival Time is a 32-bit NTP timestamp specifying the
     arrival time of the traceroute request packet at this router.  The
     32-bit form of an NTP timestamp consists of the middle 32 bits of
     the full 64-bit form; that is, the low 16 bits of the integer part
     and the high 16 bits of the fractional part.

     The following formula converts from a UNIX timeval to a 32-bit NTP
     timestamp:

          query_arrival_time = ((tv.tv_sec + 32384) << 16) +
                               ((tv.tv_usec << 10) / 15625)

     The constant 32384 is the number of seconds from Jan 1, 1900 to Jan
     1, 1970 truncated to 16 bits.  ((tv.tv_usec << 10) / 15625) is a
     reduction of ((tv.tv_usec / 1000000) << 16).  The parentheses
     around the first shift are needed in C, where + binds more tightly
     than <<.

5.2.  Incoming Interface Address

     This field specifies the address of the interface on which packets
     from this source and group are expected to arrive, or 0 if unknown.

5.3.  Outgoing Interface Address

     This field specifies the address of the interface on which packets
     from this source and group flow to the specified destination, or 0
     if unknown.

5.4.  Previous-Hop Router Address

     This field specifies the router from which this router expects
     packets from this source.  This may be a multicast group if the
     previous hop is not known because of the workings of the multicast
     routing protocol.  However, it should be 0 if the incoming
     interface address is unknown.

5.5.  Input packet count on incoming interface

     This field contains the number of multicast packets received for
     all groups and sources on the incoming interface, or 0xffffffff if
     no count can be reported.

5.6.  Output packet count on outgoing interface

     This field contains the number of multicast packets that have been
     transmitted for all groups and sources on the outgoing interface,
     or 0xffffffff if no count can be reported.

5.7.  Total number of packets for this source-group pair

     This field counts the number of packets from the specified source
     forwarded by this router to the specified group, or 0xffffffff if
     no count can be reported.  If the S bit is set, the count is for
     the source network, as specified by the Src Mask field.  If the S
     bit is set and the Src Mask field is 63, indicating no
     source-specific state, the count is for all sources sending to this
     group.

5.8.  Rtg Protocol: 8 bits

     This field describes the routing protocol in use between this
     router and the previous-hop router.  Specified values include:

          1    DVMRP
          2    MOSPF
          3    PIM
          4    CBT
          5    PIM using special routing table
          6    PIM using a static route
          7    DVMRP using a static route
          8    PIM using MBGP (aka BGP4+) route
          9    CBT using special routing table
          10   CBT using a static route
          11   PIM using state created by Assert processing

5.9.  FwdTTL: 8 bits

     This field contains the TTL that a packet is required to have
     before it will be forwarded over the outgoing interface.

5.10.  MBZ: 1 bit

     Must be zeroed on transmission and ignored on reception.

5.11.  S: 1 bit

     If this bit is set, it indicates that the packet count for the
     source-group pair is for the source network, as determined by
     masking the source address with the Src Mask field.

5.12.  Src Mask: 6 bits

     This field contains the number of 1's in the netmask this router
     has for the source (i.e. a value of 24 means the netmask is
     0xffffff00).  If the router is forwarding solely on group state,
     this field is set to 63 (0x3f).

5.13.  Forwarding Code: 8 bits

     This field contains a forwarding information/error code.
     Defined values include:

     Value  Name            Description
     -----  --------------  ---------------------------------------------
     0x00   NO_ERROR        No error
     0x01   WRONG_IF        Traceroute request arrived on an interface to
                            which this router would not forward for this
                            source,group,destination.
     0x02   PRUNE_SENT      This router has sent a prune upstream which
                            applies to the source and group in the
                            traceroute request.
     0x03   PRUNE_RCVD      This router has stopped forwarding for this
                            source and group in response to a request from
                            the next hop router.
     0x04   SCOPED          The group is subject to administrative scoping
                            at this hop.
     0x05   NO_ROUTE        This router has no route for the source.
     0x06   WRONG_LAST_HOP  This router is not the proper last-hop router.
     0x07   NOT_FORWARDING  This router is not forwarding this
                            source,group for an unspecified reason.
     0x08   REACHED_RP      Reached Rendez-vous Point or Core
     0x09   RPF_IF          Traceroute request arrived on the expected RPF
                            interface for this source,group.
     0x0A   NO_MULTICAST    Traceroute request arrived on an interface
                            which is not enabled for multicast.
     0x81   NO_SPACE        There was not enough room to insert another
                            response data block in the packet.
     0x82   OLD_ROUTER      The previous hop router does not understand
                            traceroute requests.
     0x83   ADMIN_PROHIB    Traceroute is administratively prohibited.

     Note that if a router discovers there is not enough room in a
     packet to insert its response, it puts the 0x81 error code in the
     previous router's Forwarding Code field, overwriting any error the
     previous router placed there.  It is expected that a multicast
     traceroute client, upon receiving this error, will restart the
     trace at the last hop listed in the packet.

     The 0x80 bit of the Forwarding Code is used to indicate a fatal
     error.  A fatal error is one where the router may know the previous
     hop but cannot forward the message to it.

6.  Router Behavior

All of these actions are performed in addition to (NOT instead of)
forwarding the packet, if applicable.  E.g. a multicast packet that has
TTL remaining MUST be forwarded normally, as MUST a unicast packet that
has TTL remaining and is not addressed to this router.

6.1.  Traceroute Query

     A traceroute Query message is a traceroute message with no response
     blocks filled in, and uses IGMP type 0x1F.

6.1.1.  Packet Verification

     Upon receiving a traceroute Query message, a router must examine
     the Query to see if it is the proper last-hop router for the
     destination address in the packet.  It is the proper last-hop
     router if it has a multicast-capable interface on the same subnet
     as the Destination Address and is the router that would forward
     traffic from the given source onto that subnet.

     If the router determines that it is not the proper last-hop router,
     or it cannot make that determination, it does one of two things
     depending on whether the Query was received via multicast or
     unicast.  If the Query was received via multicast, then it MUST be
     silently dropped.  If it was received via unicast, a forwarding
     code of WRONG_LAST_HOP is noted and processing continues as in
     section 6.2.

     Duplicate Query messages as identified by the tuple (IP Source,
     Query ID) SHOULD be ignored.

6.1.2.  Normal Processing

     When a router receives a traceroute Query and it determines that it
     is the proper last-hop router, it treats it like a traceroute
     Request and performs the steps listed in section 6.2.

6.2.  Traceroute Request

     A traceroute Request is a traceroute message with some number of
     response blocks filled in, and also uses IGMP type 0x1F.  Routers
     can tell the difference between Queries and Requests by checking
     the length of the packet.

6.2.1.  Packet Verification

     If the traceroute Request is not addressed to this router, or if
     the Request is addressed to a multicast group which is not a
     link-scoped group (e.g. 224.0.0.x), it MUST be silently ignored.

6.2.2.  Normal Processing

     When a router receives a traceroute Request, it performs the
     following steps.  Note that it is possible to have multiple
     situations covered by the Forwarding Codes.  The first one
     encountered is the one that is reported, i.e. all "note forwarding
     code N" should be interpreted as "if forwarding code is not already
     set, set forwarding code to N".

     1.   Insert a new response block into the packet and fill in the
          Query Arrival Time, Outgoing Interface Address, Output Packet
          Count, and FwdTTL.

     2.   Attempt to determine the forwarding information for the source
          and group specified, using the same mechanisms as would be used
          when a packet is received from the source destined for the
          group.  State need not be instantiated, it can be "phantom"
          state created only for the purpose of the trace.

     3.   If no forwarding information can be determined, an error code
          of NO_ROUTE is inserted in the Forwarding Code field, the
          remaining fields that have not yet been filled in are set to
          zero, and the packet is forwarded to the requester as described
          in "Forwarding Traceroute Requests".

     4.   Fill in the Incoming Interface Address, Previous-Hop Router
          Address, Input Packet Count, Total Number of Packets, Routing
          Protocol, S, and Src Mask from the forwarding information that
          was determined.

     5.   If traceroute is administratively prohibited or the previous
          hop router does not understand traceroute requests, note the
          appropriate forwarding code (ADMIN_PROHIB or OLD_ROUTER).  If
          traceroute is administratively prohibited and any of the fields
          as filled in step 4 are considered private information, zero
          out the applicable fields.
          Then the packet is forwarded to the
          requester as described in "Forwarding Traceroute Requests".

     6.   If the reception interface is not enabled for multicast, note
          forwarding code NO_MULTICAST.  If the reception interface is
          the interface from which the router would expect data to arrive
          from the source, a forwarding code of RPF_IF is noted.
          Otherwise, if the reception interface is not one to which the
          router would forward data from the source, a forwarding code of
          WRONG_IF is noted.

     7.   If the group is subject to administrative scoping on either the
          Outgoing or Incoming interfaces, a forwarding code of SCOPED is
          noted.

     8.   If this router is the Rendez-vous Point or Core for the group,
          a forwarding code of REACHED_RP is noted.

     9.   If this router has sent a prune upstream which applies to the
          source and group in the traceroute Request, it notes forwarding
          code PRUNE_SENT.  If the router has stopped forwarding
          downstream in response to a prune sent by the next hop router,
          it notes forwarding code PRUNE_RCVD.  If the router should
          normally forward traffic for this source and group downstream
          but is not, it notes forwarding code NOT_FORWARDING.

     10.  The packet is then sent on to the previous hop or the requester
          as described in "Forwarding Traceroute Requests".

6.3.  Traceroute response

     A router must forward all traceroute response packets normally,
     with no special processing.  If a router has initiated a traceroute
     with a Query or Request message, it may listen for Responses to
     that traceroute but MUST still forward them as well.

6.4.  Forwarding Traceroute Requests

     If the Previous-hop router is known for the source and group (or,
     if no group is specified, the previous-hop router for the source,
     or if no source is specified, the previous-hop router for the
     group) and the number of response blocks is less than the number
     requested, the packet is sent to that router.  If the Incoming
     Interface is known but the Previous-hop router is not known, the
     packet is sent to an appropriate multicast address on the Incoming
     Interface.  The appropriate multicast address may depend on the
     routing protocol in use, MUST be a link-scoped group (i.e.
     224.0.0.x), MUST NOT be ALL-SYSTEMS.MCAST.NET (224.0.0.1) and may
     be ALL-ROUTERS.MCAST.NET (224.0.0.2) if the routing protocol in use
     does not define a more appropriate group.  Otherwise, it is sent to
     the Response Address in the header, as described in "Sending
     Traceroute Responses".

6.5.  Sending Traceroute Responses

6.5.1.  Destination Address

     A traceroute response must be sent to the Response Address in the
     traceroute header.

6.5.2.  TTL

     If the Response Address is unicast, the router inserts its normal
     unicast TTL in the IP header.  If the Response Address is
     multicast, the router copies the Response TTL from the traceroute
     header into the IP header.

6.5.3.  Source Address

     If the Response Address is unicast, the router may use any of its
     interface addresses as the source address.  Since some multicast
     routing protocols forward based on source address, if the Response
     Address is multicast, the router MUST use an address that is known
     in the multicast routing table if it can make that determination.

6.5.4.  Sourcing Multicast Responses

     When a router sources a multicast response, the response packet
     MUST be sent on a single interface, then forwarded as if it were
     received on that interface.
     It MUST NOT source the response packet
     individually on each interface, since that causes duplicate
     packets.

7.  Using multicast traceroute

7.1.  Sample Client

This section describes the behavior of an example multicast traceroute
client.

7.1.1.  Sending Initial Query

     When the destination of the trace is the machine running the
     client, the traceroute Query packet can be sent to the ALL-ROUTERS
     multicast group (224.0.0.2).  This will ensure that the packet is
     received by the last-hop router on the subnet.  Otherwise, if the
     proper last-hop router is known for the trace destination, the
     Query could be unicasted to that router.  Otherwise, the Query
     packet should be multicasted to the group being queried; if the
     destination of the trace is a member of the group this will get the
     Query to the proper last-hop router.  In this final case, the
     packet should contain the Router Alert option, to make sure that
     routers that are not members of the multicast group notice the
     packet.  See also section 7.2 on determining the last-hop router.

7.1.2.  Determining the Path

     The client could send a small number of Initial Query messages with
     a large "# hops" field, in order to try to trace the full path.  If
     this attempt fails, one strategy is to perform a linear search (as
     the traditional unicast traceroute program does); set the "# hops"
     field to 1 and try to get a response, then 2, and so on.  If no
     response is received at a certain hop, the hop count can continue
     past the non-responding hop, in the hopes that further hops may
     respond.  These attempts should continue until a user-defined
     timeout has occurred.

     See also sections 7.3 and 7.4 on receiving the results of a trace.

7.1.3.  Collecting Statistics

     After a client has determined that it has traced the whole path or
     as much as it can expect to (see section 7.5), it might collect
     statistics by waiting a short time and performing a second trace.
     If the path is the same in the two traces, statistics can be
     displayed as described in sections 8.3 and 8.4.

Details of performing a multicast traceroute:

7.2.  Last hop router

     The traceroute querier may not know which is the last hop router,
     or that router may be behind a firewall that blocks unicast packets
     but passes multicast packets.  In these cases, the traceroute
     request should be multicasted to the group being traced (since the
     last hop router listens to that group).  All routers except the
     correct last hop router should ignore any multicast traceroute
     request received via multicast.  Traceroute requests which are
     multicasted to the group being traced must include the Router Alert
     IP option [Katz97].

     Another alternative is to unicast to the trace destination.
     Traceroute requests which are unicasted to the trace destination
     must include the Router Alert IP option [Katz97], in order that the
     last-hop router is aware of the packet.

     If the traceroute querier is attached to the same router as the
     destination of the request, the traceroute request may be
     multicasted to 224.0.0.2 (ALL-ROUTERS.MCAST.NET) if the last-hop
     router is not known.

7.3.  First hop router

     The traceroute querier may not be unicast reachable from the first
     hop router.  In this case, the querier should set the traceroute
     response address to a multicast address, and should set the
     response TTL to a value sufficient for the response from the first
     hop router to reach the querier.  It may be appropriate to start
     with a small TTL and increase in subsequent attempts until a
     sufficient TTL is reached, up to an appropriate maximum (such as
     192).
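
     The expanding response-TTL search described above can be sketched
     as follows.  This is only an illustration: the starting TTL of 8
     and the doubling step are assumptions (the text says only "small"
     and "increase"), and try_trace is a hypothetical callback that
     sends one Query with the given resp ttl and reports whether a
     Response arrived.

```python
from typing import Callable, Iterator, Optional


def response_ttl_schedule(start: int = 8, factor: int = 2,
                          maximum: int = 192) -> Iterator[int]:
    """Yield increasing response TTLs, always ending at `maximum`.

    `start` and `factor` are assumed values; 192 is the example
    maximum mentioned in the text.
    """
    if start < 1 or factor < 2 or maximum < start:
        raise ValueError("need start >= 1, factor >= 2, maximum >= start")
    ttl = start
    while ttl < maximum:
        yield ttl
        ttl *= factor
    yield maximum


def first_sufficient_ttl(try_trace: Callable[[int], bool],
                         **schedule: int) -> Optional[int]:
    """Return the first TTL for which try_trace(ttl) succeeds, else None."""
    for ttl in response_ttl_schedule(**schedule):
        if try_trace(ttl):
            return ttl
    return None
```

     A client would typically pass a try_trace that waits for its own
     timeout before reporting failure, so a larger step trades fewer
     timeouts for a wider response scope.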

   The IANA has assigned 224.0.1.32, MTRACE.MCAST.NET, as the default
   multicast group for multicast traceroute responses. Other groups
   may be used if needed, e.g. when using mtrace to diagnose problems
   with the IANA-assigned group.

7.4. Broken intermediate router

   A broken intermediate router might simply not understand
   traceroute packets, and drop them. The querier would then get no
   response at all from its traceroute requests. It should then
   perform a hop-by-hop search by stepping the number of responses
   ("# hops") field through successive values until it gets a
   response (both linear and binary search are options, but binary is
   likely to be slower because a failure requires waiting for a
   timeout).

7.5. Trace termination

   When performing an expanding hop-by-hop trace, it is necessary to
   determine when to stop expanding.

7.5.1. Arriving at source

   A trace can be determined to have arrived at the source if the
   Incoming Interface of the last router in the trace is non-zero,
   but the Previous Hop router is zero.

7.5.2. Fatal Error

   A trace has encountered a fatal error if the last Forwarding Error
   in the trace has the 0x80 bit set.

7.5.3. No Previous Hop

   A trace cannot continue if the last Previous Hop in the trace is
   set to 0.

7.5.4. Trace shorter than requested

   If the trace that is returned is shorter than requested (i.e. the
   number of Response blocks is smaller than the "# hops" field), the
   trace encountered an error and could not continue.

7.6. Continuing after an error

   When the NO_SPACE error occurs, the client might try to continue
   the trace by starting it at the last hop in the trace. It can do
   this by unicasting to this router's outgoing interface address,
   keeping all fields the same. If this results in a single hop and a
   "WRONG_IF" error, the client may try setting the trace destination
   to the same outgoing interface address.
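
   Before attempting to continue, a client has to recognize why a
   trace stopped. The termination tests of section 7.5 can be
   sketched as below. This is an illustration only: ResponseBlock is
   a hypothetical, simplified record (addresses held as integers, 0
   meaning unset), the last element of the list is assumed to be the
   last router in the trace, and the order in which the tests are
   applied is an assumption. The 0x80 fatal bit is taken from section
   7.5.2.

```python
from dataclasses import dataclass
from typing import List

FATAL_BIT = 0x80  # a Forwarding Error with this bit set is fatal (7.5.2)


@dataclass
class ResponseBlock:
    """Hypothetical, simplified view of one Response block."""
    incoming_interface: int  # address as an integer, 0 if unset
    previous_hop: int        # address as an integer, 0 if unset
    forwarding_error: int    # Forwarding Error code


def termination_reason(blocks: List[ResponseBlock], hops_requested: int) -> str:
    """Classify why an expanding trace should stop, or return 'continue'."""
    if not blocks:
        return "no-response"
    last = blocks[-1]  # assumed to be the last router in the trace
    if last.forwarding_error & FATAL_BIT:
        return "fatal-error"                # section 7.5.2
    if last.previous_hop == 0:
        if last.incoming_interface != 0:
            return "arrived-at-source"      # section 7.5.1
        return "no-previous-hop"            # section 7.5.3
    if len(blocks) < hops_requested:
        return "shorter-than-requested"     # section 7.5.4
    return "continue"
```

   A result of "continue" means none of the stop conditions applied
   and the client may request more hops.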

   If a trace times out, it is likely to be because a router in the
   middle of the path does not support multicast traceroute. That
   router's address will be in the Previous Hop field of the last
   entry in the last reply packet received. A client may be able to
   determine (via mrinfo [Pusa98] or SNMP [Thal98a, Thal98b]) a list
   of neighbors of the non-responding router. If desired, each of
   those neighbors could be probed to determine the remainder of the
   path. Unfortunately, this heuristic may end up with multiple
   paths, since there is no way of knowing what the non-responding
   router's algorithm for choosing a previous-hop router is. However,
   if all paths but one flow back towards the non-responding router,
   it is possible to be sure that this is the correct path.

7.7. Multicast Traceroute and shared-tree routing protocols

   When using shared-tree routing protocols like PIM-SM and CBT, a
   more advanced client may use multicast traceroute to determine
   paths or potential paths.

7.7.1. PIM-SM

   When a multicast traceroute reaches a PIM-SM RP and the RP does
   not forward the trace on, it means that the RP has not performed a
   source-specific join, so there is no more state to trace. However,
   the path that traffic would use if the RP did perform a
   source-specific join can be traced by setting the trace
   destination to the RP, the trace source to the traffic source, and
   the trace group to 0. This trace Query may be unicast to the RP.

7.7.2. CBT

   When a multicast traceroute reaches a CBT Core, it must simply
   stop, since CBT does not have source-specific state. However, a
   second trace can be performed, setting the trace destination to
   the traffic source, the trace group to the group being traced, and
   the trace source to the Core (or to 0, since CBT does not have
   source-specific state). This trace Query may be unicast to the
   Core.
   There are two possibilities when combining the two traces:

7.7.2.1. No overlap

   If there is no overlap between the two traces, the second trace
   can be reversed and appended to the first trace. This composite
   trace shows the full path from the source to the destination.

7.7.2.2. Overlapping paths

   If there is a portion of the path that is common to the ends of
   the two traces, that portion is removed from both traces. Then, as
   in the no-overlap case, the second trace is reversed and appended
   to the first trace, and the composite trace again contains the
   full path.

   This algorithm works whether the source has joined the CBT tree or
   not.

7.8. Protocol-specific considerations

7.8.1. DVMRP

   DVMRP's dominant router election and route exchange guarantee that
   DVMRP routers know whether or not they are the last-hop forwarder
   for the link and who the previous hop is.

7.8.2. PIM Dense Mode

   Routers running PIM Dense Mode do not know the path packets would
   take unless traffic is flowing. Without some extra protocol
   mechanism, this means that in an environment with multiple
   possible paths with branch points on shared media, multicast
   traceroute can only trace existing paths, not potential paths.
   When there are multiple possible paths but the branch points are
   not on shared media, the previous-hop router is known, but the
   last-hop router may not know that it is the appropriate last hop.

   When traffic is flowing, PIM Dense Mode routers know whether or
   not they are the last-hop forwarder for the link (because they won
   or lost an Assert battle) and know who the previous hop is
   (because it won an Assert battle). Therefore, multicast traceroute
   is always able to follow the proper path when traffic is flowing.

8. Problem Diagnosis

8.1. Forwarding Inconsistencies

   The forwarding error code can tell if a group is unexpectedly
   pruned or administratively scoped.

8.2. TTL problems

   By taking the maximum of (hops from source + forwarding TTL
   threshold) over all hops, you can discover the TTL required for
   the source to reach the destination.

8.3. Congestion

   By taking two traces, you can find packet loss information by
   comparing the difference in input packet counts to the difference
   in output packet counts at the previous hop. On a point-to-point
   link, any difference in these numbers implies packet loss. Since
   the packet counts may be changing as the trace query is
   propagating, there may be small errors (off by 1 or 2) in these
   statistics. However, these errors will not accumulate if multiple
   traces are taken to expand the measurement period. On a shared
   link, the count of input packets can be larger than the number of
   output packets at the previous hop, due to other routers or hosts
   on the link injecting packets. This appears as "negative loss",
   which may mask real packet loss.

   In addition to the counts of input and output packets for all
   multicast traffic on the interfaces, the response data includes a
   count of the packets forwarded by a node for the specified
   source-group pair. Taking the difference in this count between two
   traces and then comparing those differences between two hops gives
   a measure of packet loss just for traffic from the specified
   source to the specified receiver via the specified group. This
   measure is not affected by shared links.

   On a point-to-point link that is a multicast tunnel, packet loss
   is usually due to congestion in unicast routers along the path of
   that tunnel. On native multicast links, loss is more likely in the
   output queue of one hop, perhaps due to priority dropping, or in
   the input queue at the next hop.
   The counters in the response data do not allow these cases to be
   distinguished. Differences in packet counts between the incoming
   and outgoing interfaces on one node cannot generally be used to
   measure queue overflow in the node because some packets may be
   routed only to or from other interfaces on that node.

   In the multicast extensions for SunOS 4.1.x from Xerox PARC, both
   the output packet count and the packet forwarding count for the
   source-group pair are incremented before priority dropping for
   rate limiting occurs and before the packets are put onto the
   interface output queue, which may overflow. These drops will
   appear as (positive) loss on the link even though they occur
   within the router.

   In release 3.3/3.4 of the UNIX multicast extensions, a multicast
   packet generated on a router will be counted as having come in an
   interface even though it did not. This can create the appearance
   of negative loss even on a point-to-point link.

   In releases up through 3.5/3.6, packets were not counted as input
   on an interface if the reverse-path forwarding check decided that
   the packets should be dropped. That causes the packets to appear
   as lost on the link if they were output by the upstream hop. This
   situation can arise when two routers on the path for the group
   being traced are connected by a shared link, and the path for some
   other group does not flow between those two routers because the
   downstream router receives packets for the other group on another
   interface, but the upstream router is the elected forwarder to
   other routers or hosts on the shared link.

8.4. Link Utilization

   Again, with two traces, you can divide the difference in the input
   or output packet counts at some hop by the difference in time
   stamps from the same hop to obtain the packet rate over the link.
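
   The two-trace arithmetic of sections 8.3 and 8.4 can be sketched
   as follows. This is an illustration under stated assumptions:
   HopSample is a hypothetical record of the values read for one hop
   from one trace, timestamps are assumed to have been converted to
   seconds already, and counter wrap-around is ignored.

```python
from dataclasses import dataclass


@dataclass
class HopSample:
    """Hypothetical per-hop values read from one trace."""
    timestamp: float   # time stamp reported by the hop, in seconds
    input_pkts: int    # input packet count on the incoming interface
    output_pkts: int   # output packet count on the outgoing interface


def packet_rate(first: HopSample, second: HopSample,
                use_output: bool = True) -> float:
    """Packets per second at one hop between two traces (section 8.4)."""
    elapsed = second.timestamp - first.timestamp
    if elapsed <= 0:
        raise ValueError("second trace must be later than the first")
    if use_output:
        delta = second.output_pkts - first.output_pkts
    else:
        delta = second.input_pkts - first.input_pkts
    return delta / elapsed


def link_loss(prev_first: HopSample, prev_second: HopSample,
              hop_first: HopSample, hop_second: HopSample) -> int:
    """Packets lost between the previous hop's output and this hop's
    input (section 8.3). A negative result is the "negative loss"
    that can appear on shared links."""
    sent = prev_second.output_pkts - prev_first.output_pkts
    received = hop_second.input_pkts - hop_first.input_pkts
    return sent - received
```

   As the text notes, results that are off by 1 or 2 packets are
   expected, because the counters keep changing while the query
   propagates.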
   If the average packet size is known, then the link utilization can
   also be estimated to see whether packet loss may be due to the
   rate limit or the physical capacity on a particular link being
   exceeded.

8.5. Time delay

   If the routers have synchronized clocks, it is possible to
   estimate propagation and queueing delay from the differences
   between the timestamps at successive hops.

9. Acknowledgments

   This specification started largely as a transcription of Van
   Jacobson's slides from the 30th IETF, and the implementation in
   mrouted 3.3 by Ajit Thyagarajan. Van's original slides credit
   Steve Casner, Steve Deering, Dino Farinacci and Deb Agrawal. A
   multicast traceroute client, mtrace, has been implemented by Ajit
   Thyagarajan, Steve Casner and Bill Fenner.

   The idea of unicasting a multicast traceroute Query to the
   destination of the trace with Router Alert set is due to Tony
   Ballardie. The idea of the "S" bit to allow statistics for a
   source subnet is due to Tom Pusateri.

10. IANA Considerations

10.1. Routing Protocols

   The IANA is responsible for allocating new Routing Protocol codes.
   The Routing Protocol code is somewhat problematic, since in the
   case of protocols like CBT and PIM it must encode both a unicast
   routing algorithm and a multicast tree-building protocol. The
   space was not divided into two fields because it was already small
   and some combinations (e.g. DVMRP) would be wasted.

   Routing Protocol codes should be allocated for any combination of
   protocols that are in common use in the Internet.

10.2. Forwarding Codes

   New Forwarding codes must only be created by an RFC that modifies
   this document's section 7, fully describing the conditions under
   which the new forwarding code is used. The IANA may act as a
   central repository so that there is a single place to look up
   forwarding codes and the document in which they are defined.

11. Security Considerations

11.1. Topology discovery

   mtrace can be used to discover any actively-used topology. If your
   network topology is a secret, mtrace may be restricted at the
   border of your domain, using the ADMIN_PROHIB forwarding code.

11.2. Traffic rates

   mtrace can be used to discover what sources are sending to what
   groups and at what rates. If this information is a secret, mtrace
   may be restricted at the border of your domain, using the
   ADMIN_PROHIB forwarding code.

11.3. Unicast replies

   The "Response address" field may be used to send a single packet
   (the traceroute Reply packet) to an arbitrary unicast address. It
   is possible to use this facility as a packet amplifier, as a small
   multicast traceroute Query may turn into a large Reply packet.

12. References

   Brad88     Braden, B., D. Borman, C. Partridge, "Computing the
              Internet Checksum", RFC 1071, ISI, September 1988.

   Bradner97  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", RFC 2119/BCP 14, Harvard
              University, March 1997.

   Katz97     Katz, D., "IP Router Alert Option", RFC 2113, Cisco
              Systems, February 1997.

13. Authors' Addresses

   William C. Fenner
   Xerox PARC
   3333 Coyote Hill Road
   Palo Alto, CA 94304

   Phone: +1 650 812 4816

   Email: fenner@parc.xerox.com

   Stephen L. Casner
   Cisco Systems
   1072 Arastradero Road
   Palo Alto, CA 94304

   Email: casner@cisco.com