Internet Engineering Task Force  Inter-Domain Multicast Routing Working Group
INTERNET-DRAFT                                                      W. Fenner
draft-ietf-idmr-traceroute-ipm-07.txt                           AT&T Research
                                                                    S. Casner
                                                                Cisco Systems
                                                                July 14, 2000
                                                         Expires January 2001

               A "traceroute" facility for IP Multicast.

Status of this Memo

This document is an Internet Draft and is in full conformance with all
provisions of Section 10 of RFC2026. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Distribution of this document is unlimited.

Abstract

     This draft describes the IGMP multicast traceroute facility.
     Unlike unicast traceroute, multicast traceroute requires a special
     packet type and implementation on the part of routers.
     This specification describes the required functionality in
     multicast routers, as well as how management applications can use
     the new router functionality.

This document is a product of the Inter-Domain Multicast Routing working
group within the Internet Engineering Task Force. Comments are
solicited and should be addressed to the working group's mailing list at
idmr@cs.ucl.ac.uk and/or the author(s).

Key Words

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [Brad97].

1. Introduction

The unicast "traceroute" program allows the tracing of a path from one
machine to another, using a mechanism that already existed in IP.
Unfortunately, no such existing mechanism can be applied to IP multicast
paths. The key mechanism for unicast traceroute is the ICMP TTL
exceeded message, which is specifically precluded as a response to
multicast packets. Thus, we specify the multicast "traceroute" facility
to be implemented in multicast routers and accessed by diagnostic
programs. While it is a disadvantage that a new mechanism is required,
the multicast traceroute facility can provide additional information
about packet rates and losses that the unicast traceroute cannot, and
generally requires fewer packets to be sent.

Goals:

o  To be able to trace the path that a packet would take from some
   source to some destination.

o  To be able to isolate packet loss problems (e.g., congestion).

o  To be able to isolate configuration problems (e.g., TTL threshold).

o  To minimize packets sent (e.g. no flooding, no implosion).

2. Overview

Given a multicast distribution tree, tracing from a source to a
multicast destination is hard, since you don't know down which branch of
the multicast tree the destination lies.
This means that you have to flood the whole tree to find the path from
one source to one destination. However, walking up the tree from
destination to source is easy, as most existing multicast routing
protocols know the previous hop for each source. Tracing from
destination to source can involve only routers on the direct path.

The party requesting the traceroute (which need be neither the source
nor the destination) sends a traceroute Query packet to the last-hop
multicast router for the given destination. The last-hop router turns
the Query into a Request packet by adding a response data block
containing its interface addresses and packet statistics, and then
forwards the Request packet via unicast to the router that it believes
is the proper previous hop for the given source and group. Each hop
adds its response data to the end of the Request packet, then unicast
forwards it to the previous hop. The first hop router (the router that
believes that packets from the source originate on one of its directly
connected networks) changes the packet type to indicate a Response
packet and sends the completed response to the response destination
address. The response may be returned before reaching the first hop
router if a fatal error condition such as "no route" is encountered
along the path.

Multicast traceroute uses any information available to it in the router
to attempt to determine a previous hop to forward the trace towards.
Multicast routing protocols vary in the type and amount of state they
keep; multicast traceroute endeavors to work with all of them by using
whatever is available. For example, if a DVMRP router has no active
state for a particular source but does have a DVMRP route, it chooses
the parent of the DVMRP route as the previous hop. If a PIM-SM router
is on the (*,G) tree, it chooses the parent towards the RP as the
previous hop.
In these cases, no source/group-specific state is available, but the
path may still be traced.

3. Multicast Traceroute header

The header for all multicast traceroute packets is as follows. The
header is only filled in by the originator of the traceroute Query;
intermediate hops MUST NOT modify any of the fields.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   IGMP Type   |    # hops     |           checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Multicast Group Address                    |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                        Source Address                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Destination Address                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Response Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   resp ttl    |                   Query ID                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.1. IGMP Type: 8 bits

     The IGMP type field is defined to be 0x1F for traceroute queries
     and requests. The IGMP type field is changed to 0x1E when the
     packet is completed and sent as a response from the first hop
     router to the querier. Two codes are required so that multicast
     routers won't attempt to process a completed response in those
     cases where the initial query was issued from a router or the
     response is sent via multicast.

3.2. # hops: 8 bits

     This field specifies the maximum number of hops that the requester
     wants to trace. If there is some error condition in the middle of
     the path that keeps the traceroute request from reaching the
     first-hop router, this field can be used to perform an
     expanding-length search to trace the path to just before the
     problem.

3.3. Checksum: 16 bits

     The checksum is the 16-bit one's complement of the one's complement
     sum of the whole IGMP message (the entire IP payload) [Brad88].
     When computing the checksum, the checksum field is set to zero.
     When transmitting packets, the checksum MUST be computed and
     inserted into this field. When receiving packets, the checksum
     MUST be verified before processing a packet.

3.4. Group address

     This field specifies the group address to be traced, or zero if no
     group-specific information is desired. Note that
     non-group-specific traceroutes may not be possible with certain
     multicast routing protocols.

3.5. Source address

     This field specifies the IP address of the multicast source for the
     path being traced, or 0xFFFFFFFF if no source-specific information
     is desired. Note that non-source-specific traceroutes may not be
     possible with certain multicast routing protocols.

3.6. Destination address

     This field specifies the IP address of the multicast receiver for
     the path being traced. The trace starts at this destination and
     proceeds toward the traffic source.

3.7. Response Address

     This field specifies where the completed traceroute response packet
     gets sent. It can be a unicast address or a multicast address, as
     explained in section 6.2.

3.8. resp ttl: 8 bits

     This field specifies the TTL at which to multicast the response, if
     the response address is a multicast address.

3.9. Query ID: 24 bits

     This field is used as a unique identifier for this traceroute
     request so that duplicate or delayed responses may be detected and
     to minimize collisions when a multicast response address is used.

4. Definitions

Since multicast traceroutes flow in the opposite direction to the data
flow, we always refer to "upstream" and "downstream" with respect to
data, unless explicitly specified.
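
As an illustration of the header layout and checksum rule given in
section 3, the following Python sketch packs a Query header and
computes its checksum. It is not part of the specification: the
function names, constant names, and the example addresses are
assumptions chosen for illustration only.

```python
import socket
import struct

IGMP_MTRACE_QUERY = 0x1F       # queries and requests (section 3.1)
IGMP_MTRACE_RESPONSE = 0x1E    # completed responses (section 3.1)
NO_SOURCE = "255.255.255.255"  # 0xFFFFFFFF: no source-specific information
NO_GROUP = "0.0.0.0"           # zero: no group-specific information


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_query(hops, group, source, destination, response, resp_ttl, query_id):
    """Pack the 24-byte traceroute header with no response blocks."""
    if not 0 <= hops <= 0xFF or not 0 <= resp_ttl <= 0xFF:
        raise ValueError("# hops and resp ttl are 8-bit fields")
    if not 0 <= query_id <= 0xFFFFFF:
        raise ValueError("Query ID is a 24-bit field")

    def pack(checksum):
        return struct.pack(
            "!BBH4s4s4s4sI",
            IGMP_MTRACE_QUERY, hops, checksum,
            socket.inet_aton(group), socket.inet_aton(source),
            socket.inet_aton(destination), socket.inet_aton(response),
            (resp_ttl << 24) | query_id,  # resp ttl in the top 8 bits
        )

    # The checksum is computed with the checksum field set to zero.
    return pack(internet_checksum(pack(0)))
```

A receiver can verify a packet by running internet_checksum() over the
whole IGMP message, checksum field included; a result of zero means the
checksum is valid.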

Incoming Interface
     The interface on which traffic is expected from the specified
     source and group.

Outgoing Interface
     The interface on which traffic is forwarded from the specified
     source and group towards the destination. Also called the
     "Reception Interface", since it is the interface on which the
     multicast traceroute Request was received.

Previous-Hop Router
     The router, on the Incoming Interface, which is responsible for
     forwarding traffic for the specified source and group.

5. Response data

Each router adds a "response data" segment to the traceroute packet
before it forwards it on. The response data looks like this:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Query Arrival Time                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Incoming Interface Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Outgoing Interface Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Previous-Hop Router Address                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Input packet count on incoming interface            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Output packet count on outgoing interface           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Total number of packets for this source-group pair       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |               |M| |           |               |
   | Rtg Protocol  |    FwdTTL     |B|S| Src Mask  |Forwarding Code|
   |               |               |Z| |           |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5.1. Query Arrival Time

     The Query Arrival Time is a 32-bit NTP timestamp specifying the
     arrival time of the traceroute request packet at this router.
     The 32-bit form of an NTP timestamp consists of the middle 32 bits
     of the full 64-bit form; that is, the low 16 bits of the integer
     part and the high 16 bits of the fractional part.

     The following formula converts from a UNIX timeval to a 32-bit NTP
     timestamp:

          query_arrival_time = ((tv.tv_sec + 32384) << 16) +
                               ((tv.tv_usec << 10) / 15625)

     The constant 32384 is the number of seconds from Jan 1, 1900 to Jan
     1, 1970 truncated to 16 bits. ((tv.tv_usec << 10) / 15625) is a
     reduction of ((tv.tv_usec << 16) / 1000000). The parentheses
     around the shift of the seconds term are needed because, in C, "+"
     binds more tightly than "<<".

5.2. Incoming Interface Address

     This field specifies the address of the interface on which packets
     from this source and group are expected to arrive, or 0 if unknown.

5.3. Outgoing Interface Address

     This field specifies the address of the interface on which packets
     from this source and group flow to the specified destination, or 0
     if unknown.

5.4. Previous-Hop Router Address

     This field specifies the router from which this router expects
     packets from this source. This may be a multicast group (e.g.
     ALL-[protocol]-ROUTERS.MCAST.NET) if the previous hop is not known
     because of the workings of the multicast routing protocol.
     However, it should be 0 if the incoming interface address is
     unknown.

5.5. Packet counts

     Note that these packet counts SHOULD be as up to date as possible.
     If packet counts are not being maintained on the processor that
     handles the traceroute request in a multi-processor router
     architecture, the packet SHOULD be delayed while the counters are
     gathered from the remote processor(s). If this occurs, the Query
     Arrival Time should be updated to reflect the time at which the
     packet counts were learned.

5.6. Input packet count on incoming interface

     This field contains the number of multicast packets received for
     all groups and sources on the incoming interface, or 0xffffffff if
     no count can be reported.
     This counter should have the same value as ifInMulticastPkts from
     the IF-MIB for this interface.

5.7. Output packet count on outgoing interface

     This field contains the number of multicast packets that have been
     transmitted or queued for transmission for all groups and sources
     on the outgoing interface, or 0xffffffff if no count can be
     reported. This counter should have the same value as
     ifOutMulticastPkts from the IF-MIB for this interface.

5.8. Total number of packets for this source-group pair

     This field counts the number of packets from the specified source
     forwarded by this router to the specified group, or 0xffffffff if
     no count can be reported. If the S bit is set, the count is for
     the source network, as specified by the Src Mask field. If the S
     bit is set and the Src Mask field is 63, indicating no
     source-specific state, the count is for all sources sending to this
     group. This counter should have the same value as ipMRoutePkts
     from the IPMROUTE-STD-MIB for this forwarding entry.

5.9. Rtg Protocol: 8 bits

     This field describes the routing protocol in use between this
     router and the previous-hop router. Specified values include:

     1   DVMRP
     2   MOSPF
     3   PIM
     4   CBT
     5   PIM using special routing table
     6   PIM using a static route
     7   DVMRP using a static route
     8   PIM using MBGP (aka BGP4+) route
     9   CBT using special routing table
     10  CBT using a static route
     11  PIM using state created by Assert processing

5.10. FwdTTL: 8 bits

     This field contains the TTL that a packet is required to have
     before it will be forwarded over the outgoing interface.

5.11. MBZ: 1 bit

     Must be zeroed on transmission and ignored on reception.

5.12. S: 1 bit

     If this bit is set, it indicates that the packet count for the
     source-group pair is for the source network, as determined by
     masking the source address with the Src Mask field.

5.13. Src Mask: 6 bits

     This field contains the number of 1's in the netmask this router
     has for the source (i.e. a value of 24 means the netmask is
     0xffffff00). If the router is forwarding solely on group state,
     this field is set to 63 (0x3f).

5.14. Forwarding Code: 8 bits

     This field contains a forwarding information/error code. Defined
     values include:

     Value  Name            Description
     --------------------------------------------------------------------
     0x00   NO_ERROR        No error
     0x01   WRONG_IF        Traceroute request arrived on an interface
                            to which this router would not forward for
                            this source,group,destination.
     0x02   PRUNE_SENT      This router has sent a prune upstream which
                            applies to the source and group in the
                            traceroute request.
     0x03   PRUNE_RCVD      This router has stopped forwarding for this
                            source and group in response to a request
                            from the next hop router.
     0x04   SCOPED          The group is subject to administrative
                            scoping at this hop.
     0x05   NO_ROUTE        This router has no route for the source or
                            group and no way to determine a potential
                            route.
     0x06   WRONG_LAST_HOP  This router is not the proper last-hop
                            router.
     0x07   NOT_FORWARDING  This router is not forwarding this
                            source,group out the outgoing interface for
                            an unspecified reason.
     0x08   REACHED_RP      Reached Rendez-vous Point or Core
     0x09   RPF_IF          Traceroute request arrived on the expected
                            RPF interface for this source,group.
     0x0A   NO_MULTICAST    Traceroute request arrived on an interface
                            which is not enabled for multicast.
     0x0B   INFO_HIDDEN     One or more hops have been hidden from this
                            trace.
     0x81   NO_SPACE        There was not enough room to insert another
                            response data block in the packet.
     0x82   OLD_ROUTER      The previous hop router does not understand
                            traceroute requests.
     0x83   ADMIN_PROHIB    Traceroute is administratively prohibited.
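
The response data layout of section 5 can be sketched in Python as
follows. This is a minimal illustration rather than a reference
implementation: the function names, dictionary keys, and example
addresses are assumptions, and ntp32() uses the Query Arrival Time
formula of section 5.1 with the shift of the seconds term parenthesized
and the result masked to 32 bits.

```python
import socket
import struct

FATAL_BIT = 0x80  # section 5.14: this bit marks a fatal error
FORWARDING_CODES = {
    0x00: "NO_ERROR", 0x01: "WRONG_IF", 0x02: "PRUNE_SENT",
    0x03: "PRUNE_RCVD", 0x04: "SCOPED", 0x05: "NO_ROUTE",
    0x06: "WRONG_LAST_HOP", 0x07: "NOT_FORWARDING", 0x08: "REACHED_RP",
    0x09: "RPF_IF", 0x0A: "NO_MULTICAST", 0x0B: "INFO_HIDDEN",
    0x81: "NO_SPACE", 0x82: "OLD_ROUTER", 0x83: "ADMIN_PROHIB",
}
BLOCK_FORMAT = "!I4s4s4sIIIBBBB"  # eight 32-bit words, 32 bytes in total


def ntp32(tv_sec: int, tv_usec: int) -> int:
    """Middle 32 bits of the NTP timestamp for a UNIX (sec, usec) pair."""
    return (((tv_sec + 32384) << 16) + ((tv_usec << 10) // 15625)) & 0xFFFFFFFF


def pack_block(arrival, in_if, out_if, prev_hop, in_count, out_count,
               sg_count, rtg_protocol, fwd_ttl, s_bit, src_mask, fwd_code):
    """Pack one response data block; the MBZ bit is always sent as zero."""
    flags = (int(bool(s_bit)) << 6) | (src_mask & 0x3F)
    return struct.pack(
        BLOCK_FORMAT, arrival,
        socket.inet_aton(in_if), socket.inet_aton(out_if),
        socket.inet_aton(prev_hop),
        in_count, out_count, sg_count,
        rtg_protocol, fwd_ttl, flags, fwd_code,
    )


def unpack_block(block: bytes) -> dict:
    """Decode one 32-byte response data block into a dictionary."""
    (arrival, in_if, out_if, prev_hop, in_count, out_count, sg_count,
     rtg_protocol, fwd_ttl, flags, fwd_code) = struct.unpack(BLOCK_FORMAT, block)
    return {
        "arrival": arrival,
        "incoming_if": socket.inet_ntoa(in_if),
        "outgoing_if": socket.inet_ntoa(out_if),
        "previous_hop": socket.inet_ntoa(prev_hop),
        "in_count": in_count,
        "out_count": out_count,
        "sg_count": sg_count,
        "rtg_protocol": rtg_protocol,
        "fwd_ttl": fwd_ttl,
        "s_bit": bool(flags & 0x40),   # MBZ (0x80) is ignored on reception
        "src_mask": flags & 0x3F,
        "fwd_code": fwd_code,
        "code_name": FORWARDING_CODES.get(fwd_code, "UNKNOWN"),
        "fatal": bool(fwd_code & FATAL_BIT),
    }
```

A client walking a received packet would call unpack_block() on each
successive 32-byte slice after the 24-byte header and stop interpreting
the path at the first block whose "fatal" entry is true.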

     Note that if a router discovers there is not enough room in a
     packet to insert its response, it puts the 0x81 error code in the
     previous router's Forwarding Code field, overwriting any error the
     previous router placed there. A multicast traceroute client, upon
     receiving this error, MAY restart the trace at the last hop listed
     in the packet.

     The 0x80 bit of the Forwarding Code is used to indicate a fatal
     error. A fatal error is one where the router may know the previous
     hop but cannot forward the message to it.

6. Router Behavior

All of these actions are performed in addition to (NOT instead of)
forwarding the packet, if applicable. E.g. a multicast packet that has
TTL remaining MUST be forwarded normally, as MUST a unicast packet that
has TTL remaining and is not addressed to this router.

6.1. Traceroute Query

     A traceroute Query message is a traceroute message with no response
     blocks filled in, and uses IGMP type 0x1F.

6.1.1. Packet Verification

     Upon receiving a traceroute Query message, a router must examine
     the Query to see if it is the proper last-hop router for the
     destination address in the packet. It is the proper last-hop
     router if it has a multicast-capable interface on the same subnet
     as the Destination Address and is the router that would forward
     traffic from the given source onto that subnet.

     If the router determines that it is not the proper last-hop router,
     or it cannot make that determination, it does one of two things
     depending on whether the Query was received via multicast or
     unicast. If the Query was received via multicast, then it MUST be
     silently dropped. If it was received via unicast, a forwarding
     code of WRONG_LAST_HOP is noted and processing continues as in
     section 6.2.

     Duplicate Query messages as identified by the tuple (IP Source,
     Query ID) SHOULD be ignored.
     This MAY be implemented using a simple 1-back cache (i.e.
     remembering the IP source and Query ID of the previous Query
     message that was processed, and ignoring future messages with the
     same IP Source and Query ID). Duplicate Request messages MUST NOT
     be ignored in this manner.

6.1.2. Normal Processing

     When a router receives a traceroute Query and it determines that it
     is the proper last-hop router, it treats it like a traceroute
     Request and performs the steps listed in section 6.2.

6.2. Traceroute Request

     A traceroute Request is a traceroute message with some number of
     response blocks filled in, and also uses IGMP type 0x1F. Routers
     can tell the difference between Queries and Requests by checking
     the length of the packet.

6.2.1. Packet Verification

     If the traceroute Request is not addressed to this router, or if
     the Request is addressed to a multicast group which is not a
     link-scoped group (e.g. 224.0.0.x), it MUST be silently ignored.

6.2.2. Normal Processing

     When a router receives a traceroute Request, it performs the
     following steps. Note that it is possible to have multiple
     situations covered by the Forwarding Codes. The first one
     encountered is the one that is reported, i.e. all "note forwarding
     code N" should be interpreted as "if forwarding code is not already
     set, set forwarding code to N".

     1.  If there is room in the current buffer (or the router can
         efficiently allocate more space to use), insert a new response
         block into the packet and fill in the Query Arrival Time,
         Outgoing Interface Address, Output Packet Count, and FwdTTL.
         If there was no room, fill in the response code "NO_SPACE" in
         the *previous* hop's response block, and forward the packet to
         the requester as described in "Forwarding Traceroute Requests".

     2.  Attempt to determine the forwarding information for the source
         and group specified, using the same mechanisms as would be used
         when a packet is received from the source destined for the
         group. State need not be instantiated; it can be "phantom"
         state created only for the purpose of the trace.

         If using a shared-tree protocol and there is no source-specific
         state, or if the source is specified as 0xFFFFFFFF, group state
         should be used. If there is no group state or the group is
         specified as 0, potential source state (i.e. the path that
         would be followed for a source-specific Join) should be used.
         If this router is the Core or RP and no source-specific
         information is available, note an error code of REACHED_RP.

     3.  If no forwarding information can be determined, the router
         notes an error code of NO_ROUTE, sets the remaining fields that
         have not yet been filled in to zero, and then forwards the
         packet to the requester as described in "Forwarding Traceroute
         Requests".

     4.  Fill in the Incoming Interface Address, Previous-Hop Router
         Address, Input Packet Count, Total Number of Packets, Routing
         Protocol, S, and Src Mask from the forwarding information that
         was determined.

     5.  If traceroute is administratively prohibited or the previous
         hop router does not understand traceroute requests, note the
         appropriate forwarding code (ADMIN_PROHIB or OLD_ROUTER). If
         traceroute is administratively prohibited and any of the fields
         as filled in step 4 are considered private information, zero
         out the applicable fields. Then the packet is forwarded to the
         requester as described in "Forwarding Traceroute Requests".

     6.  If the reception interface is not enabled for multicast, note
         forwarding code NO_MULTICAST. If the reception interface is
         the interface from which the router would expect data to arrive
         from the source, note forwarding code RPF_IF.
         Otherwise, if the reception interface is not one to which the
         router would forward data from the source to the group, a
         forwarding code of WRONG_IF is noted.

     7.  If the group is subject to administrative scoping on either the
         Outgoing or Incoming interfaces, a forwarding code of SCOPED is
         noted.

     8.  If this router is the Rendez-vous Point or Core for the group,
         a forwarding code of REACHED_RP is noted.

     9.  If this router has sent a prune upstream which applies to the
         source and group in the traceroute Request, it notes forwarding
         code PRUNE_SENT. If the router has stopped forwarding
         downstream in response to a prune sent by the next hop router,
         it notes forwarding code PRUNE_RCVD. If the router should
         normally forward traffic for this source and group downstream
         but is not, it notes forwarding code NOT_FORWARDING.

     10. The packet is then sent on to the previous hop or the requester
         as described in "Forwarding Traceroute Requests".

6.3. Traceroute response

     A router must forward all traceroute response packets normally,
     with no special processing. If a router has initiated a traceroute
     with a Query or Request message, it may listen for Responses to
     that traceroute but MUST still forward them as well.

6.4. Forwarding Traceroute Requests

     If the Previous-hop router is known for this request and the number
     of response blocks is less than the number requested, the packet is
     sent to that router. If the Incoming Interface is known but the
     Previous-hop router is not known, the packet is sent to an
     appropriate multicast address on the Incoming Interface. The
     appropriate multicast address may depend on the routing protocol in
     use, MUST be a link-scoped group (i.e. 224.0.0.x), MUST NOT be
     ALL-SYSTEMS.MCAST.NET (224.0.0.1) and MAY be ALL-ROUTERS.MCAST.NET
     (224.0.0.2) if the routing protocol in use does not define a more
     appropriate group.
     Otherwise, it is sent to the Response Address in the header, as
     described in "Sending Traceroute Responses". Note that it is not
     an error for the number of response blocks to be greater than the
     number requested; such a packet should simply be forwarded to the
     requester as described in "Sending Traceroute Responses".

6.5. Sending Traceroute Responses

6.5.1. Destination Address

     A traceroute response must be sent to the Response Address in the
     traceroute header.

6.5.2. TTL

     If the Response Address is unicast, the router inserts its normal
     unicast TTL in the IP header. If the Response Address is
     multicast, the router copies the Response TTL from the traceroute
     header into the IP header.

6.5.3. Source Address

     If the Response Address is unicast, the router may use any of its
     interface addresses as the source address. Since some multicast
     routing protocols forward based on source address, if the Response
     Address is multicast, the router MUST use an address that is known
     in the multicast routing table if it can make that determination.

6.5.4. Sourcing Multicast Responses

     When a router sources a multicast response, the response packet
     MUST be sent on a single interface, then forwarded as if it were
     received on that interface. It MUST NOT source the response packet
     individually on each interface, in order to avoid duplicate
     packets.

6.6. Hiding information

     Information about a domain's topology and connectivity may be
     hidden from multicast traceroute requests. The exact mechanism is
     not specified here; however, the INFO_HIDDEN forwarding code may be
     used to note that, for example, the incoming interface address and
     packet count are for the entrance to the domain and the outgoing
     interface address and packet count are the exit from the domain.
     The source-group packet count may be from either router or not
     specified (0xffffffff).
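
The destination choice of section 6.4 and the TTL rule of section 6.5.2
can be sketched in Python as shown below. The function names, the
returned labels, and the default unicast TTL of 64 are assumptions made
for illustration. The sketch also assumes that the "number of response
blocks is less than the number requested" test applies to the
link-scoped multicast case as well as to the known previous-hop case,
which the text does not state explicitly.

```python
import ipaddress

ALL_SYSTEMS = ipaddress.ip_address("224.0.0.1")
ALL_ROUTERS = ipaddress.ip_address("224.0.0.2")
LINK_SCOPED = ipaddress.ip_network("224.0.0.0/24")


def next_destination(prev_hop, incoming_if_known, blocks, hops_requested,
                     response_addr, protocol_group=None):
    """Return (kind, address) for a Request that has just been filled in.

    prev_hop is the previous-hop router address, or None if unknown.
    protocol_group is the routing protocol's own link-scoped group, if
    it defines one; otherwise ALL-ROUTERS is used.
    """
    if blocks < hops_requested:
        if prev_hop is not None:
            return ("previous-hop", prev_hop)
        if incoming_if_known:
            group = (ipaddress.ip_address(protocol_group)
                     if protocol_group else ALL_ROUTERS)
            if group not in LINK_SCOPED or group == ALL_SYSTEMS:
                raise ValueError("group must be 224.0.0.x and not 224.0.0.1")
            return ("link-multicast", str(group))
    # Hop limit reached or exceeded, or nothing is known about the
    # upstream direction: hand the packet back to the requester.
    return ("response", response_addr)


def response_ip_ttl(response_addr, resp_ttl, unicast_ttl=64):
    """IP TTL for a Response: header resp ttl if multicast, else normal TTL."""
    if ipaddress.ip_address(response_addr).is_multicast:
        return resp_ttl
    return unicast_ttl
```

Returning a label together with the address keeps the caller in charge
of the remaining steps, such as changing the IGMP type to 0x1E before a
packet is sent to the Response Address.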

7. Using multicast traceroute

7.1. Sample Client

This section describes the behavior of an example multicast traceroute
client.

7.1.1. Sending Initial Query

     When the destination of the trace is the machine running the
     client, the traceroute Query packet can be sent to the ALL-ROUTERS
     multicast group (224.0.0.2). This will ensure that the packet is
     received by the last-hop router on the subnet. Otherwise, if the
     proper last-hop router is known for the trace destination, the
     Query could be unicasted to that router. Otherwise, the Query
     packet should be multicasted to the group being queried; if the
     destination of the trace is a member of the group this will get the
     Query to the proper last-hop router. In this final case, the
     packet should contain the Router Alert option, to make sure that
     routers that are not members of the multicast group notice the
     packet. See also section 7.2 on determining the last-hop router.

7.1.2. Determining the Path

     The client could send a small number of Initial Query messages with
     a large "# hops" field, in order to try to trace the full path. If
     this attempt fails, one strategy is to perform a linear search (as
     the traditional unicast traceroute program does); set the "# hops"
     field to 1 and try to get a response, then 2, and so on. If no
     response is received at a certain hop, the hop count can continue
     past the non-responding hop, in the hopes that further hops may
     respond. These attempts should continue until a user-defined
     timeout has occurred.

     See also section 7.3 and 7.4 on receiving the results of a trace.

7.1.3. Collecting Statistics

     After a client has determined that it has traced the whole path or
     as much as it can expect to (see section 7.5), it might collect
     statistics by waiting a short time and performing a second trace.
646 If the path is the same in the two traces, statistics can be dis- 647 played as described in sections 8.3 and 8.4. 649 Details of performing a multicast traceroute: 651 7.2. Last hop router 653 The traceroute querier may not know which is the last hop router, 654 or that router may be behind a firewall that blocks unicast packets 655 but passes multicast packets. In these cases, the traceroute 656 request should be multicasted to the group being traced (since the 657 last hop router listens to that group). All routers except the 658 correct last hop router should ignore any multicast traceroute 659 request received via multicast. Traceroute requests which are mul- 660 ticasted to the group being traced must include the Router Alert IP 661 option [Katz97]. 663 Another alternative is to unicast to the trace destination. 664 Traceroute requests which are unicasted to the trace destination 665 must include the Router Alert IP option [Katz97], in order that the 666 last-hop router is aware of the packet. 668 If the traceroute querier is attached to the same router as the 669 destination of the request, the traceroute request may be multicas- 670 ted to 224.0.0.2 (ALL-ROUTERS.MCAST.NET) if the last-hop router is 671 not known. 673 7.3. First hop router 675 The traceroute querier may not be unicast reachable from the first 676 hop router. In this case, the querier should set the traceroute 677 response address to a multicast address, and should set the 678 response TTL to a value sufficient for the response from the first 679 hop router to reach the querier. It may be appropriate to start 680 with a small TTL and increase in subsequent attempts until a suffi- 681 cient TTL is reached, up to an appropriate maximum (such as 192). 683 The IANA has assigned 224.0.1.32, MTRACE.MCAST.NET, as the default 684 multicast group for multicast traceroute responses. Other groups 685 may be used if needed, e.g. when using mtrace to diagnose problems 686 with the IANA-assigned group.
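The expanding response-TTL search described in section 7.3 might look like the sketch below. The text does not say how the TTL should grow; doubling is an assumption made here, as are the function names and the probe callback, which is a hypothetical stand-in for "send a query with this response TTL and report whether a response arrived".

```python
def response_ttls(start=1, maximum=192):
    """Yield response TTLs to try: doubling from start, capped at maximum.

    Doubling is an assumption of this sketch; the text only asks for a
    small starting value that increases up to an appropriate maximum.
    """
    ttl = start
    while ttl < maximum:
        yield ttl
        ttl *= 2
    yield maximum


def find_response_ttl(probe, start=1, maximum=192):
    """Return the first TTL for which probe(ttl) succeeds, or None.

    probe is a hypothetical callback supplied by the client.
    """
    for ttl in response_ttls(start, maximum):
        if probe(ttl):
            return ttl
    return None
```

A client would pass its own probe function; if no TTL up to the maximum yields a response, the sketch returns None and the client can fall back to another strategy.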
688 7.4. Broken intermediate router 690 A broken intermediate router might simply not understand traceroute 691 packets and drop them. The querier would then get no response at 692 all from its traceroute requests. It should then perform a hop-by- 693 hop search by varying the number of responses field until it gets a 694 response (both linear and binary search are options, but binary is 695 likely to be slower because a failure requires waiting for a time- 696 out). 698 7.5. Trace termination 700 When performing an expanding hop-by-hop trace, it is necessary to 701 determine when to stop expanding. 703 7.5.1. Arriving at source 705 A trace can be determined to have arrived at the source if the 706 Incoming Interface of the last router in the trace is non-zero, but 707 the Previous Hop router is zero. 709 7.5.2. Fatal Error 711 A trace has encountered a fatal error if the last Forwarding Error 712 in the trace has the 0x80 bit set. 714 7.5.3. No Previous Hop 716 A trace cannot continue if the last Previous Hop in the trace is 717 set to 0. 719 7.5.4. Trace shorter than requested 721 If the trace that is returned is shorter than requested (i.e. the 722 number of Response blocks is smaller than the "# hops" field), the 723 trace encountered an error and could not continue. 725 7.6. Continuing after an error 727 When the NO_SPACE error occurs, the client might try to continue 728 the trace by starting it at the last hop in the trace. It can do 729 this by unicasting to this router's outgoing interface address, 730 keeping all fields the same. If this results in a single hop and a 731 "WRONG_IF" error, the client may try setting the trace destination 732 to the same outgoing interface address. 734 If a trace times out, it is likely to be because a router in the 735 middle of the path does not support multicast traceroute. That 736 router's address will be in the Previous Hop field of the last 737 entry in the last reply packet received.
A client may be able to 738 determine (via mrinfo[Pusa99] or SNMP[Thal99,Thal00]) a list of 739 neighbors of the non-responding router. If desired, each of those 740 neighbors could be probed to determine the remainder of the path. 741 Unfortunately, this heuristic may end up with multiple paths, since 742 there is no way of knowing what the non-responding router's algo- 743 rithm for choosing a previous-hop router is. However, if all paths 744 but one flow back towards the non-responding router, it is possible 745 to be sure that this is the correct path. 747 7.7. Multicast Traceroute and shared-tree routing protocols 749 When using shared-tree routing protocols like PIM-SM and CBT, a 750 more advanced client may use multicast traceroute to determine 751 paths or potential paths. 753 7.7.1. PIM-SM 755 When a multicast traceroute reaches a PIM-SM RP and the RP does not 756 forward the trace on, it means that the RP has not performed a 757 source-specific join so there is no more state to trace. However, 758 the path that traffic would use if the RP did perform a source-spe- 759 cific join can be traced by setting the trace destination to the 760 RP, the trace source to the traffic source, and the trace group to 761 0. This trace Query may be unicasted to the RP. 763 7.7.2. CBT 765 When a multicast traceroute reaches a CBT Core, it must simply stop 766 since CBT does not have source-specific state. However, a second 767 trace can be performed, setting the trace destination to the traf- 768 fic source, the trace group to the group being traced, and the 769 trace source to the Core (or to 0, since CBT does not have source- 770 specific state). This trace Query may be unicasted to the Core. 771 There are two possibilities when combining the two traces: 773 7.7.2.1. No overlap 775 If there is no overlap between the two traces, the second trace can 776 be reversed and appended to the first trace. 
This composite trace 777 shows the full path from the source to the destination. 779 7.7.2.2. Overlapping paths 781 If there is a portion of the path that is common to the ends of the 782 two traces, that portion is removed from both traces. Then, as in 783 the no overlap case, the second trace is reversed and appended to 784 the first trace, and the composite trace again contains the full 785 path. 787 This algorithm works whether the source has joined the CBT tree or 788 not. 790 7.8. Protocol-specific considerations 792 7.8.1. DVMRP 794 DVMRP's dominant router election and route exchange guarantee that 795 DVMRP routers know whether or not they are the last-hop forwarder 796 for the link and who the previous hop is. 798 7.8.2. PIM Dense Mode 800 Routers running PIM Dense Mode do not know the path packets would 801 take unless traffic is flowing. Without some extra protocol mecha- 802 nism, this means that in an environment with multiple possible 803 paths with branch points on shared media, multicast traceroute can 804 only trace existing paths, not potential paths. When there are 805 multiple possible paths but the branch points are not on shared 806 media, the previous hop router is known, but the last hop router 807 may not know that it is the appropriate last hop. 809 When traffic is flowing, PIM Dense Mode routers know whether or not 810 they are the last-hop forwarder for the link (because they won or 811 lost an Assert battle) and know who the previous hop is (because it 812 won an Assert battle). Therefore, multicast traceroute is always 813 able to follow the proper path when traffic is flowing. 815 8. Problem Diagnosis 817 8.1. Forwarding Inconsistencies 819 The forwarding error code can tell if a group is unexpectedly 820 pruned or administratively scoped. 822 8.2. TTL problems 824 By taking the maximum of (hops from source + forwarding TTL thresh- 825 old) over all hops, you can discover the TTL required for the 826 source to reach the destination.
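The TTL calculation of section 8.2 can be written out as a short sketch. The input representation, a list of (hops from source, forwarding TTL threshold) pairs taken from a completed trace, and the function name are assumptions made for this example.

```python
def required_ttl(hops):
    """Return the TTL the source needs to reach the destination.

    hops: iterable of (hops_from_source, forwarding_ttl_threshold)
    pairs, one per hop in the trace (representation assumed here).
    """
    return max(distance + threshold for distance, threshold in hops)


# Example: the second hop has a forwarding TTL threshold of 16, so it
# dominates the result even though it is not the farthest hop.
example_trace = [(1, 1), (2, 16), (3, 1)]
print(required_ttl(example_trace))  # prints 18
```

In the example, the per-hop sums are 2, 18 and 4, so the source must send with a TTL of at least 18.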
828 8.3. Packet Loss 830 By taking two traces, you can find packet loss information by com- 831 paring the difference in input packet counts to the difference in 832 output packet counts at the previous hop. On a point-to-point 833 link, any difference in these numbers implies packet loss. Since 834 the packet counts may be changing as the trace query is propagat- 835 ing, there may be small errors (off by 1 or 2) in these statistics. 836 However, these errors will not accumulate if multiple traces are 837 taken to expand the measurement period. On a shared link, the 838 count of input packets can be larger than the number of output 839 packets at the previous hop, due to other routers or hosts on the 840 link injecting packets. This appears as "negative loss" which may 841 mask real packet loss. 843 In addition to the counts of input and output packets for all mul- 844 ticast traffic on the interfaces, the response data includes a 845 count of the packets forwarded by a node for the specified source- 846 group pair. Taking the difference in this count between two traces 847 and then comparing those differences between two hops gives a mea- 848 sure of packet loss just for traffic from the specified source to 849 the specified receiver via the specified group. This measure is 850 not affected by shared links. 852 On a point-to-point link that is a multicast tunnel, packet loss is 853 usually due to congestion in unicast routers along the path of that 854 tunnel. On native multicast links, loss is more likely in the out- 855 put queue of one hop, perhaps due to priority dropping, or in the 856 input queue at the next hop. The counters in the response data do 857 not allow these cases to be distinguished. Differences in packet 858 counts between the incoming and outgoing interfaces on one node 859 cannot generally be used to measure queue overflow in the node. 861 8.4. 
Link Utilization 863 Again, with two traces, you can divide the difference in the input 864 or output packet counts at some hop by the difference in time 865 stamps from the same hop to obtain the packet rate over the link. 866 If the average packet size is known, then the link utilization can 867 also be estimated to see whether packet loss may be due to the rate 868 limit or the physical capacity on a particular link being exceeded. 870 8.5. Time delay 872 If the routers have synchronized clocks, it is possible to estimate 873 propagation and queuing delay from the differences between the 874 timestamps at successive hops. However, this delay includes con- 875 trol processing overhead, so is not necessarily indicative of the 876 delay that data traffic would experience. 878 9. Implementation-specific Caveats 880 Some routers with distributed forwarding architectures may not update 881 the main processor's packet counts often enough for the packet counters 882 to be meaningful on a small time scale. This can be recognized during a 883 periodic trace by seeing positive loss in one trace and negative loss in 884 the next, with no (or small) net loss over a longer interval. The sug- 885 gested solution to this problem is to simply collect statistics over a 886 longer interval. 888 In the multicast extensions for SunOS 4.1.x from Xerox PARC, which are 889 the basis for many UNIX-based multicast routers, both the output packet 890 count and the packet forwarding count for the source-group pair are 891 incremented before priority dropping for rate limiting occurs and before 892 the packets are put onto the interface output queue which may overflow. 893 These drops will appear as (positive) loss on the link even though they 894 occur within the router. 896 In release 3.3/3.4 of the UNIX multicast extensions, a multicast packet 897 generated on a router will be counted as having come in an interface 898 even though it did not. 
This can create the appearance of negative loss 899 even on a point-to-point link. 901 In releases up through 3.5/3.6, packets were not counted as input on an 902 interface if the reverse-path forwarding check decided that the packets 903 should be dropped. That causes the packets to appear as lost on the 904 link if they were output by the upstream hop. This situation can arise 905 when two routers on the path for the group being traced are connected by 906 a shared link, and the path for some other group does not flow between 907 those two routers because the downstream router receives packets for the 908 other group on another interface, but the upstream router is the elected 909 forwarder to other routers or hosts on the shared link. 911 The packet counts for source/group pairs are generally kept in router 912 forwarding caches. These cache entries may be occasionally garbage-col- 913 lected on routers, so a multicast traceroute client should be prepared 914 to see packet counts decrease. If a long-running traceroute is keeping 915 a "base" to compare against, it should use the post-reset trace as the 916 new "base", as previous values returned by this hop are no longer valid. 917 In addition, it may choose to discard the data for all other hops to 918 cover the same amount of time for all hops. 920 Some routers (notably the obsolete mrouted 3.3 and 3.4) can constantly 921 reset these packet counts. A client might want to detect routers that 922 are constantly resetting and simply fail to collect statistics for that 923 hop (instead of allowing it to cause all other data to be discarded). 925 Some routers send byte-swapped counter values. If the difference 926 between a pair of measurements is extremely large, a traceroute client 927 may want to see if the difference is more reasonable when byte-swapped. 
928 Note that this heuristic may start misfiring when packet rates get high, 929 so implementations may want to only attempt this heuristic when the 930 packet rate is much different on one router than on surrounding routers. 932 Some implementations (e.g. UNIX mrouted 3.8 and before) return incorrect 933 time values; the difference between the time values for the same hop in 934 two traces may have no relationship with the amount of time that passed 935 between making the traces. Implementations should check that time val- 936 ues look valid before using them. 938 10. Acknowledgments 940 This specification started largely as a transcription of Van Jacobson's 941 slides from the 30th IETF, and the implementation in mrouted 3.3 by Ajit 942 Thyagarajan. Van's original slides credit Steve Casner, Steve Deering, 943 Dino Farinacci and Deb Agrawal. A multicast traceroute client, mtrace, 944 has been implemented by Ajit Thyagarajan, Steve Casner and Bill Fenner. 946 The idea of unicasting a multicast traceroute Query to the destination 947 of the trace with Router Alert set is due to Tony Ballardie. The idea 948 of the "S" bit to allow statistics for a source subnet is due to Tom 949 Pusateri. 951 11. IANA Considerations 953 11.1. Routing Protocols 955 The IANA is responsible for allocating new Routing Protocol codes. 956 The Routing Protocol code is somewhat problematic, since in the 957 case of protocols like CBT and PIM it must encode both a unicast 958 routing algorithm and a multicast tree-building protocol. The 959 space was not divided into two fields because it was already small 960 and some combinations (e.g. DVMRP) would be wasted. 962 Routing Protocol codes should be allocated for any combination of 963 protocols that are in common use in the Internet. 965 11.2. Forwarding Codes 967 New Forwarding codes must only be created by an RFC that modifies 968 this document's section 7, fully describing the conditions under 969 which the new forwarding code is used. 
The IANA may act as a cen- 970 tral repository so that there is a single place to look up forward- 971 ing codes and the document in which they are defined. 973 12. Security Considerations 975 12.1. Topology discovery 977 mtrace can be used to discover any actively-used topology. If your 978 network topology is a secret, mtrace may be restricted at the bor- 979 der of your domain, using the ADMIN_PROHIB forwarding code. 981 12.2. Traffic rates 983 mtrace can be used to discover what sources are sending to what 984 groups and at what rates. If this information is a secret, mtrace 985 may be restricted at the border of your domain, using the 986 ADMIN_PROHIB forwarding code. 988 12.3. Unicast replies 990 The "Response address" field may be used to send a single packet 991 (the traceroute Reply packet) to an arbitrary unicast address. It 992 is possible to use this facility as a packet amplifier, as a small 993 multicast traceroute Query may turn into a large Reply packet. 995 13. References 997 Brad88 Braden, B., D. Borman, C. Partridge, "Computing the 998 Internet Checksum", RFC 1071, ISI, September 1988. 1000 Brad97 Bradner, S., "Key words for use in RFCs to Indicate 1001 Requirement Levels", RFC 2119/BCP 14, Harvard University, 1002 March 1997. 1004 Katz97 Katz, D., "IP Router Alert Option," RFC 2113, Cisco Sys- 1005 tems, February 1997. 1007 Pusa99 Pusateri, T., "DVMRP Version 3", work in progress, 1008 September 1999. 1010 Thal00 Thaler, D., "PIM MIB", work in progress, July 2000. 1012 Thal99 Thaler, D., "DVMRP MIB", work in progress, October 1999. 1014 14. Authors' Addresses 1016 William C. Fenner 1017 AT&T Labs -- Research 1018 75 Willow Rd. 1019 Menlo Park, CA 94025 1020 United States 1021 Email: fenner@research.att.com 1023 Stephen L. Casner 1024 Cisco Systems, Inc. 1025 170 West Tasman Drive 1026 San Jose, CA 95134 1027 United States 1028 Email: casner@cisco.com 1030 15. Change History 1032 (To be removed before publication as RFC) 1034 15.1. 
Changes from draft-ietf-idmr-traceroute-ipm-06.txt: 1036 - Added implementation-specific notes as suggested by Dave Thaler: 1038 - Forwarding cache entries going away while traffic is flowing, 1039 causing reset counters. 1041 - mrouted 3.3 and 3.4 constant resets 1043 - byte-swapped counters 1045 - bogus time due to missed ntohl() parenthesis in mrouted <= 3.8 1047 - Add example of ALL-[protocol]-ROUTERS.MCAST.NET for the multicast- 1048 on-prev-hop. (Maybe this isn't important any more; PIM used to be 1049 allowed to not know the proper prev hop but that's not true any 1050 more) 1052 15.2. Changes from draft-ietf-idmr-traceroute-ipm-05.txt: 1054 - Changes section added. 1056 - Updated abstract 1057 - Added mention of up-to-date packet counts, in particular allowing 1058 the delay of an mtrace packet while the counts are fetched in a 1059 distributed architecture. 1061 - Added mention of ifInMulticastPkts, ifOutMulticastPkts, and ipM- 1062 RoutePkts for clarification of what counts should be used. 1064 - Note that the dropping of duplicate Queries MAY be a 1-back cache 1065 and that duplicate Requests MUST NOT be dropped 1067 - Add no-space processing rule 1069 - Note that it's not an error for there to be more blocks than 1070 requested, just send it back after adding yours. 1072 - Clean up some of section 8 - move implementation-specific stuff to 1073 a separate section, rename "Congestion" to "Packet Loss", note that 1074 time delay isn't actually that useful.