Internet Engineering Task Force
Inter-Domain Multicast Routing Working Group
INTERNET-DRAFT                                              W. Fenner
draft-ietf-idmr-traceroute-ipm-06.txt                   AT&T Research
                                                            S. Casner
                                                        Cisco Systems
                                                       March 10, 2000
                                                  Expires August 2000


              A "traceroute" facility for IP Multicast.

Status of this Memo

   This document is an Internet Draft and is in full conformance with
   all provisions of Section 10 of RFC2026. Internet Drafts are
   working documents of the Internet Engineering Task Force (IETF), its
   areas, and its working groups. Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   Distribution of this document is unlimited.

Abstract

   This draft describes the IGMP multicast traceroute facility. Unlike
   unicast traceroute, multicast traceroute requires a special packet
   type and implementation on the part of routers.
   This specification describes the required functionality in multicast
   routers, as well as how management applications can use the new
   router functionality.

   This document is a product of the Inter-Domain Multicast Routing
   working group within the Internet Engineering Task Force. Comments
   are solicited and should be addressed to the working group's mailing
   list at idmr@cs.ucl.ac.uk and/or the author(s).

Key Words

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [Brad97].

1. Introduction

   The unicast "traceroute" program allows the tracing of a path from
   one machine to another, using a mechanism that already existed in
   IP. Unfortunately, no such existing mechanism can be applied to IP
   multicast paths. The key mechanism for unicast traceroute is the
   ICMP TTL exceeded message, which is specifically precluded as a
   response to multicast packets. Thus, we specify the multicast
   "traceroute" facility to be implemented in multicast routers and
   accessed by diagnostic programs. While it is a disadvantage that a
   new mechanism is required, the multicast traceroute facility can
   provide additional information about packet rates and losses that
   the unicast traceroute cannot, and generally requires fewer packets
   to be sent.

   Goals:

   o  To be able to trace the path that a packet would take from some
      source to some destination.

   o  To be able to isolate packet loss problems (e.g., congestion).

   o  To be able to isolate configuration problems (e.g., TTL
      threshold).

   o  To minimize packets sent (e.g. no flooding, no implosion).

2. Overview

   Given a multicast distribution tree, tracing from a source to a
   multicast destination is hard, since you don't know down which
   branch of the multicast tree the destination lies.
   This means that you have to flood the whole tree to find the path
   from one source to one destination. However, walking up the tree
   from destination to source is easy, as most existing multicast
   routing protocols know the previous hop for each source. Tracing
   from destination to source can involve only routers on the direct
   path.

   The party requesting the traceroute (which need be neither the
   source nor the destination) sends a traceroute Query packet to the
   last-hop multicast router for the given destination. The last-hop
   router turns the Query into a Request packet by adding a response
   data block containing its interface addresses and packet statistics,
   and then forwards the Request packet via unicast to the router that
   it believes is the proper previous hop for the given source and
   group. Each hop adds its response data to the end of the Request
   packet, then unicast forwards it to the previous hop. The first hop
   router (the router that believes that packets from the source
   originate on one of its directly connected networks) changes the
   packet type to indicate a Response packet and sends the completed
   response to the response destination address. The response may be
   returned before reaching the first hop router if a fatal error
   condition such as "no route" is encountered along the path.

   Multicast traceroute uses any information available to it in the
   router to attempt to determine a previous hop to forward the trace
   towards. Multicast routing protocols vary in the type and amount of
   state they keep; multicast traceroute endeavors to work with all of
   them by using whatever is available. For example, if a DVMRP router
   has no active state for a particular source but does have a DVMRP
   route, it chooses the parent of the DVMRP route as the previous hop.
   If a PIM-SM router is on the (*,G) tree, it chooses the parent
   towards the RP as the previous hop.
   In these cases, no source/group-specific state is available, but the
   path may still be traced.

3. Multicast Traceroute header

   The header for all multicast traceroute packets is as follows. The
   header is only filled in by the originator of the traceroute Query;
   intermediate hops MUST NOT modify any of the fields.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   IGMP Type   |    # hops     |           checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Multicast Group Address                    |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                        Source Address                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Destination Address                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Response Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   resp ttl    |                   Query ID                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.1. IGMP Type: 8 bits

   The IGMP type field is defined to be 0x1F for traceroute queries and
   requests. The IGMP type field is changed to 0x1E when the packet is
   completed and sent as a response from the first hop router to the
   querier. Two codes are required so that multicast routers won't
   attempt to process a completed response in those cases where the
   initial query was issued from a router or the response is sent via
   multicast.

3.2. # hops: 8 bits

   This field specifies the maximum number of hops that the requester
   wants to trace. If there is some error condition in the middle of
   the path that keeps the traceroute request from reaching the
   first-hop router, this field can be used to perform an
   expanding-length search to trace the path to just before the
   problem.

3.3. Checksum: 16 bits

   The checksum is the 16-bit one's complement of the one's complement
   sum of the whole IGMP message (the entire IP payload) [Brad88].
   When computing the checksum, the checksum field is set to zero.
   When transmitting packets, the checksum MUST be computed and
   inserted into this field. When receiving packets, the checksum MUST
   be verified before processing a packet.

3.4. Group address

   This field specifies the group address to be traced, or zero if no
   group-specific information is desired. Note that
   non-group-specific traceroutes may not be possible with certain
   multicast routing protocols.

3.5. Source address

   This field specifies the IP address of the multicast source for the
   path being traced, or 0xFFFFFFFF if no source-specific information
   is desired. Note that non-source-specific traceroutes may not be
   possible with certain multicast routing protocols.

3.6. Destination address

   This field specifies the IP address of the multicast receiver for
   the path being traced. The trace starts at this destination and
   proceeds toward the traffic source.

3.7. Response Address

   This field specifies where the completed traceroute response packet
   gets sent. It can be a unicast address or a multicast address, as
   explained in section 6.2.

3.8. resp ttl: 8 bits

   This field specifies the TTL at which to multicast the response, if
   the response address is a multicast address.

3.9. Query ID: 24 bits

   This field is used as a unique identifier for this traceroute
   request so that duplicate or delayed responses may be detected and
   to minimize collisions when a multicast response address is used.

4. Definitions

   Since multicast traceroutes flow in the opposite direction to the
   data flow, we always refer to "upstream" and "downstream" with
   respect to data, unless explicitly specified.
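
   As an illustration of the header layout and checksum rule given in
   section 3, the following is a minimal sketch in C of how a querier
   might serialize a Query header. The struct, the function names
   (mtrace_pack, mtrace_checksum, put_be32) and the constant names are
   assumptions made for this example; they are not defined by this
   draft. The byte offsets are read directly from the header diagram.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MTRACE_HDR_LEN    24    /* six 32-bit words, per the diagram */
#define IGMP_MTRACE       0x1F  /* Query and Request (section 3.1) */
#define IGMP_MTRACE_RESP  0x1E  /* completed Response (section 3.1) */

/* Hypothetical host-byte-order view of the header fields. */
struct mtrace_hdr {
    uint8_t  type;       /* IGMP_MTRACE or IGMP_MTRACE_RESP */
    uint8_t  hops;       /* maximum number of hops to trace */
    uint32_t group;      /* 0 = no group-specific information */
    uint32_t source;     /* 0xFFFFFFFF = no source-specific information */
    uint32_t dest;       /* multicast receiver; the trace starts here */
    uint32_t resp_addr;  /* unicast or multicast response address */
    uint8_t  resp_ttl;   /* used only for a multicast response */
    uint32_t query_id;   /* only the low 24 bits are transmitted */
};

/* Store a 32-bit value in network (big-endian) byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* 16-bit one's complement of the one's complement sum of the buffer. */
uint16_t mtrace_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)                        /* odd trailing byte, zero-padded */
        sum += (uint32_t)buf[i] << 8;
    while (sum >> 16)                   /* fold carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Serialize a header into out[]; returns the number of bytes written. */
size_t mtrace_pack(const struct mtrace_hdr *h, uint8_t out[MTRACE_HDR_LEN])
{
    uint16_t csum;

    assert((h->query_id >> 24) == 0);   /* Query ID is a 24-bit field */
    memset(out, 0, MTRACE_HDR_LEN);     /* checksum field starts as zero */
    out[0] = h->type;
    out[1] = h->hops;
    put_be32(out + 4, h->group);
    put_be32(out + 8, h->source);
    put_be32(out + 12, h->dest);
    put_be32(out + 16, h->resp_addr);
    put_be32(out + 20, ((uint32_t)h->resp_ttl << 24) | h->query_id);
    csum = mtrace_checksum(out, MTRACE_HDR_LEN);
    out[2] = (uint8_t)(csum >> 8);
    out[3] = (uint8_t)csum;
    return MTRACE_HDR_LEN;
}
```

   A receiver can verify a message by running the same checksum
   routine over the whole message with the checksum field left in
   place; a result of zero indicates a valid checksum.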

   Incoming Interface
      The interface on which traffic is expected from the specified
      source and group.

   Outgoing Interface
      The interface on which traffic is forwarded from the specified
      source and group towards the destination. Also called the
      "Reception Interface", since it is the interface on which the
      multicast traceroute Request was received.

   Previous-Hop Router
      The router, on the Incoming Interface, which is responsible for
      forwarding traffic for the specified source and group.

5. Response data

   Each router adds a "response data" segment to the traceroute packet
   before it forwards it on. The response data looks like this:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Query Arrival Time                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Incoming Interface Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Outgoing Interface Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Previous-Hop Router Address                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Input packet count on incoming interface            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Output packet count on outgoing interface           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Total number of packets for this source-group pair       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |               |M| |           |               |
   | Rtg Protocol  |    FwdTTL     |B|S| Src Mask  |Forwarding Code|
   |               |               |Z| |           |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5.1. Query Arrival Time

   The Query Arrival Time is a 32-bit NTP timestamp specifying the
   arrival time of the traceroute request packet at this router.
   The 32-bit form of an NTP timestamp consists of the middle 32 bits
   of the full 64-bit form; that is, the low 16 bits of the integer
   part and the high 16 bits of the fractional part.

   The following formula converts from a UNIX timeval to a 32-bit NTP
   timestamp:

      query_arrival_time = ((tv.tv_sec + 32384) << 16) +
                           ((tv.tv_usec << 10) / 15625)

   The shifted seconds term is parenthesized because in C the +
   operator binds more tightly than <<. The constant 32384 is the
   number of seconds from Jan 1, 1900 to Jan 1, 1970 truncated to 16
   bits. ((tv.tv_usec << 10) / 15625) is a reduction of
   ((tv.tv_usec << 16) / 1000000).

5.2. Incoming Interface Address

   This field specifies the address of the interface on which packets
   from this source and group are expected to arrive, or 0 if unknown.

5.3. Outgoing Interface Address

   This field specifies the address of the interface on which packets
   from this source and group flow to the specified destination, or 0
   if unknown.

5.4. Previous-Hop Router Address

   This field specifies the router from which this router expects
   packets from this source. This may be a multicast group if the
   previous hop is not known because of the workings of the multicast
   routing protocol. However, it should be 0 if the incoming interface
   address is unknown.

5.5. Packet counts

   Note that these packet counts SHOULD be as up to date as possible.
   If packet counts are not being maintained on the processor that
   handles the traceroute request in a multi-processor router
   architecture, the packet SHOULD be delayed while the counters are
   gathered from the remote processor(s). If this occurs, the Query
   Arrival Time should be updated to reflect the time at which the
   packet counts were learned.

5.6. Input packet count on incoming interface

   This field contains the number of multicast packets received for all
   groups and sources on the incoming interface, or 0xffffffff if no
   count can be reported.
   This counter should have the same value as ifInMulticastPkts from
   the IF-MIB for this interface.

5.7. Output packet count on outgoing interface

   This field contains the number of multicast packets that have been
   transmitted or queued for transmission for all groups and sources on
   the outgoing interface, or 0xffffffff if no count can be reported.
   This counter should have the same value as ifOutMulticastPkts from
   the IF-MIB for this interface.

5.8. Total number of packets for this source-group pair

   This field counts the number of packets from the specified source
   forwarded by this router to the specified group, or 0xffffffff if no
   count can be reported. If the S bit is set, the count is for the
   source network, as specified by the Src Mask field. If the S bit is
   set and the Src Mask field is 63, indicating no source-specific
   state, the count is for all sources sending to this group. This
   counter should have the same value as ipMRoutePkts from the
   IPMROUTE-STD-MIB for this forwarding entry.

5.9. Rtg Protocol: 8 bits

   This field describes the routing protocol in use between this router
   and the previous-hop router. Specified values include:

   Value  Routing protocol
   -----  --------------------------------------------
     1    DVMRP
     2    MOSPF
     3    PIM
     4    CBT
     5    PIM using special routing table
     6    PIM using a static route
     7    DVMRP using a static route
     8    PIM using MBGP (aka BGP4+) route
     9    CBT using special routing table
    10    CBT using a static route
    11    PIM using state created by Assert processing

5.10. FwdTTL: 8 bits

   This field contains the TTL that a packet is required to have before
   it will be forwarded over the outgoing interface.

5.11. MBZ: 1 bit

   Must be zeroed on transmission and ignored on reception.

5.12. S: 1 bit

   If this bit is set, it indicates that the packet count for the
   source-group pair is for the source network, as determined by
   masking the source address with the Src Mask field.

5.13. Src Mask: 6 bits

   This field contains the number of 1's in the netmask this router has
   for the source (i.e. a value of 24 means the netmask is 0xffffff00).
   If the router is forwarding solely on group state, this field is set
   to 63 (0x3f).

5.14. Forwarding Code: 8 bits

   This field contains a forwarding information/error code. Defined
   values include:

   Value  Name            Description
   -----  --------------  -------------------------------------------
   0x00   NO_ERROR        No error
   0x01   WRONG_IF        Traceroute request arrived on an interface
                          to which this router would not forward for
                          this source,group,destination.
   0x02   PRUNE_SENT      This router has sent a prune upstream which
                          applies to the source and group in the
                          traceroute request.
   0x03   PRUNE_RCVD      This router has stopped forwarding for this
                          source and group in response to a request
                          from the next hop router.
   0x04   SCOPED          The group is subject to administrative
                          scoping at this hop.
   0x05   NO_ROUTE        This router has no route for the source or
                          group and no way to determine a potential
                          route.
   0x06   WRONG_LAST_HOP  This router is not the proper last-hop
                          router.
   0x07   NOT_FORWARDING  This router is not forwarding this
                          source,group out the outgoing interface for
                          an unspecified reason.
   0x08   REACHED_RP      Reached Rendez-vous Point or Core
   0x09   RPF_IF          Traceroute request arrived on the expected
                          RPF interface for this source,group.
   0x0A   NO_MULTICAST    Traceroute request arrived on an interface
                          which is not enabled for multicast.
   0x0B   INFO_HIDDEN     One or more hops have been hidden from this
                          trace.
   0x81   NO_SPACE        There was not enough room to insert another
                          response data block in the packet.
   0x82   OLD_ROUTER      The previous hop router does not understand
                          traceroute requests.
   0x83   ADMIN_PROHIB    Traceroute is administratively prohibited.
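
   The response data fields of section 5 lend themselves to a few small
   helpers. The following C sketch shows one possible encoding of the
   forwarding codes, the timestamp conversion of section 5.1, and the
   decoding of the octet that carries MBZ, S and Src Mask. All
   identifiers (the FWD_ prefix, ntp32_from_unix, decode_src_mask,
   struct src_mask_info) are hypothetical names chosen for this
   example, and the bit positions assume that the leftmost bit in the
   diagram is the most significant bit of the octet.

```c
#include <assert.h>
#include <stdint.h>

/* Forwarding codes, values taken from the table in section 5.14. */
enum mtrace_fwd_code {
    FWD_NO_ERROR       = 0x00,
    FWD_WRONG_IF       = 0x01,
    FWD_PRUNE_SENT     = 0x02,
    FWD_PRUNE_RCVD     = 0x03,
    FWD_SCOPED         = 0x04,
    FWD_NO_ROUTE       = 0x05,
    FWD_WRONG_LAST_HOP = 0x06,
    FWD_NOT_FORWARDING = 0x07,
    FWD_REACHED_RP     = 0x08,
    FWD_RPF_IF         = 0x09,
    FWD_NO_MULTICAST   = 0x0A,
    FWD_INFO_HIDDEN    = 0x0B,
    FWD_NO_SPACE       = 0x81,
    FWD_OLD_ROUTER     = 0x82,
    FWD_ADMIN_PROHIB   = 0x83
};

/* The 0x80 bit of the Forwarding Code marks a fatal error. */
int fwd_code_is_fatal(uint8_t code)
{
    return (code & 0x80) != 0;
}

/* 32-bit NTP timestamp from UNIX seconds and microseconds (5.1). */
uint32_t ntp32_from_unix(uint32_t sec, uint32_t usec)
{
    assert(usec < 1000000u);
    /* Unsigned arithmetic keeps only the low 16 bits of the seconds. */
    return (uint32_t)((sec + 32384u) << 16) + ((usec << 10) / 15625u);
}

/* Decoded view of the MBZ / S / Src Mask octet (5.11 to 5.13). */
struct src_mask_info {
    int      s_bit;       /* count is for the source network */
    unsigned mask_len;    /* number of 1 bits in the netmask, or 63 */
    int      group_only;  /* 1 if forwarding solely on group state */
    uint32_t netmask;     /* 0 when mask_len is 0 or not a valid length */
};

struct src_mask_info decode_src_mask(uint8_t octet)
{
    struct src_mask_info m;

    m.s_bit = (octet >> 6) & 1;        /* MBZ (0x80) is ignored */
    m.mask_len = octet & 0x3F;
    m.group_only = (m.mask_len == 63);
    if (m.mask_len == 0 || m.mask_len > 32)
        m.netmask = 0;
    else
        m.netmask = 0xFFFFFFFFu << (32 - m.mask_len);
    return m;
}
```

   For example, half a second past the UNIX epoch converts to
   0x7E808000: the high half is 32384 (0x7E80) and the low half is
   one half expressed as a 16-bit fraction (0x8000).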

   Note that if a router discovers there is not enough room in a packet
   to insert its response, it puts the 0x81 error code in the previous
   router's Forwarding Code field, overwriting any error the previous
   router placed there. A multicast traceroute client, upon receiving
   this error, MAY restart the trace at the last hop listed in the
   packet.

   The 0x80 bit of the Forwarding Code is used to indicate a fatal
   error. A fatal error is one where the router may know the previous
   hop but cannot forward the message to it.

6. Router Behavior

   All of these actions are performed in addition to (NOT instead of)
   forwarding the packet, if applicable. E.g. a multicast packet that
   has TTL remaining MUST be forwarded normally, as MUST a unicast
   packet that has TTL remaining and is not addressed to this router.

6.1. Traceroute Query

   A traceroute Query message is a traceroute message with no response
   blocks filled in, and uses IGMP type 0x1F.

6.1.1. Packet Verification

   Upon receiving a traceroute Query message, a router must examine the
   Query to see if it is the proper last-hop router for the destination
   address in the packet. It is the proper last-hop router if it has a
   multicast-capable interface on the same subnet as the Destination
   Address and is the router that would forward traffic from the given
   source onto that subnet.

   If the router determines that it is not the proper last-hop router,
   or it cannot make that determination, it does one of two things
   depending on whether the Query was received via multicast or
   unicast. If the Query was received via multicast, then it MUST be
   silently dropped. If it was received via unicast, a forwarding code
   of WRONG_LAST_HOP is noted and processing continues as in section
   6.2.

   Duplicate Query messages as identified by the tuple (IP Source,
   Query ID) SHOULD be ignored.
   This MAY be implemented using a simple 1-back cache (i.e.
   remembering the IP source and Query ID of the previous Query message
   that was processed, and ignoring future messages with the same IP
   Source and Query ID). Duplicate Request messages MUST NOT be
   ignored in this manner.

6.1.2. Normal Processing

   When a router receives a traceroute Query and it determines that it
   is the proper last-hop router, it treats it like a traceroute
   Request and performs the steps listed in section 6.2.

6.2. Traceroute Request

   A traceroute Request is a traceroute message with some number of
   response blocks filled in, and also uses IGMP type 0x1F. Routers
   can tell the difference between Queries and Requests by checking the
   length of the packet.

6.2.1. Packet Verification

   If the traceroute Request is not addressed to this router, or if the
   Request is addressed to a multicast group which is not a link-scoped
   group (e.g. 224.0.0.x), it MUST be silently ignored.

6.2.2. Normal Processing

   When a router receives a traceroute Request, it performs the
   following steps. Note that it is possible to have multiple
   situations covered by the Forwarding Codes. The first one
   encountered is the one that is reported, i.e. all "note forwarding
   code N" should be interpreted as "if forwarding code is not already
   set, set forwarding code to N".

   1.  If there is room in the current buffer (or the router can
       efficiently allocate more space to use), insert a new response
       block into the packet and fill in the Query Arrival Time,
       Outgoing Interface Address, Output Packet Count, and FwdTTL. If
       there was no room, fill in the response code "NO_SPACE" in the
       *previous* hop's response block, and forward the packet to the
       requester as described in "Forwarding Traceroute Requests".

   2.  Attempt to determine the forwarding information for the source
       and group specified, using the same mechanisms as would be used
       when a packet is received from the source destined for the
       group. State need not be instantiated; it can be "phantom"
       state created only for the purpose of the trace.

       If using a shared-tree protocol and there is no source-specific
       state, or if the source is specified as 0xFFFFFFFF, group state
       should be used. If there is no group state or the group is
       specified as 0, potential source state (i.e. the path that
       would be followed for a source-specific Join) should be used.
       If this router is the Core or RP and no source-specific
       information is available, note an error code of REACHED_RP.

   3.  If no forwarding information can be determined, the router
       notes an error code of NO_ROUTE, sets the remaining fields that
       have not yet been filled in to zero, and then forwards the
       packet to the requester as described in "Forwarding Traceroute
       Requests".

   4.  Fill in the Incoming Interface Address, Previous-Hop Router
       Address, Input Packet Count, Total Number of Packets, Routing
       Protocol, S, and Src Mask from the forwarding information that
       was determined.

   5.  If traceroute is administratively prohibited or the previous
       hop router does not understand traceroute requests, note the
       appropriate forwarding code (ADMIN_PROHIB or OLD_ROUTER). If
       traceroute is administratively prohibited and any of the fields
       as filled in step 4 are considered private information, zero
       out the applicable fields. Then the packet is forwarded to the
       requester as described in "Forwarding Traceroute Requests".

   6.  If the reception interface is not enabled for multicast, note
       forwarding code NO_MULTICAST. If the reception interface is
       the interface from which the router would expect data to arrive
       from the source, note forwarding code RPF_IF.
       Otherwise, if the reception interface is not one to which the
       router would forward data from the source to the group, a
       forwarding code of WRONG_IF is noted.

   7.  If the group is subject to administrative scoping on either the
       Outgoing or Incoming interfaces, a forwarding code of SCOPED is
       noted.

   8.  If this router is the Rendez-vous Point or Core for the group,
       a forwarding code of REACHED_RP is noted.

   9.  If this router has sent a prune upstream which applies to the
       source and group in the traceroute Request, it notes forwarding
       code PRUNE_SENT. If the router has stopped forwarding
       downstream in response to a prune sent by the next hop router,
       it notes forwarding code PRUNE_RCVD. If the router should
       normally forward traffic for this source and group downstream
       but is not, it notes forwarding code NOT_FORWARDING.

   10. The packet is then sent on to the previous hop or the requester
       as described in "Forwarding Traceroute Requests".

6.3. Traceroute response

   A router must forward all traceroute response packets normally, with
   no special processing. If a router has initiated a traceroute with
   a Query or Request message, it may listen for Responses to that
   traceroute but MUST still forward them as well.

6.4. Forwarding Traceroute Requests

   If the Previous-hop router is known for this request and the number
   of response blocks is less than the number requested, the packet is
   sent to that router. If the Incoming Interface is known but the
   Previous-hop router is not known, the packet is sent to an
   appropriate multicast address on the Incoming Interface. The
   appropriate multicast address may depend on the routing protocol in
   use, MUST be a link-scoped group (i.e. 224.0.0.x), MUST NOT be
   ALL-SYSTEMS.MCAST.NET (224.0.0.1) and MAY be ALL-ROUTERS.MCAST.NET
   (224.0.0.2) if the routing protocol in use does not define a more
   appropriate group.
   Otherwise, it is sent to the Response Address in the header, as
   described in "Sending Traceroute Responses". Note that it is not an
   error for the number of response blocks to be greater than the
   number requested; such a packet should simply be forwarded to the
   requester as described in "Sending Traceroute Responses".

6.5. Sending Traceroute Responses

6.5.1. Destination Address

   A traceroute response must be sent to the Response Address in the
   traceroute header.

6.5.2. TTL

   If the Response Address is unicast, the router inserts its normal
   unicast TTL in the IP header. If the Response Address is multicast,
   the router copies the Response TTL from the traceroute header into
   the IP header.

6.5.3. Source Address

   If the Response Address is unicast, the router may use any of its
   interface addresses as the source address. Since some multicast
   routing protocols forward based on source address, if the Response
   Address is multicast, the router MUST use an address that is known
   in the multicast routing table if it can make that determination.

6.5.4. Sourcing Multicast Responses

   When a router sources a multicast response, the response packet MUST
   be sent on a single interface, then forwarded as if it were received
   on that interface. It MUST NOT source the response packet
   individually on each interface, in order to avoid duplicate packets.

6.6. Hiding information

   Information about a domain's topology and connectivity may be hidden
   from multicast traceroute requests. The exact mechanism is not
   specified here; however, the INFO_HIDDEN forwarding code may be used
   to note that, for example, the incoming interface address and packet
   count are for the entrance to the domain and the outgoing interface
   address and packet count are the exit from the domain. The
   source-group packet count may be from either router or not specified
   (0xffffffff).
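
   The forwarding rule of section 6.4 and the TTL rule of section 6.5.2
   can be sketched in C as follows. This is one reading of the text,
   not a normative algorithm: the struct and function names are
   hypothetical, an address of 0 is taken to mean "unknown", and the
   sketch assumes that a fatal Forwarding Code (0x80 bit set) or a full
   hop count sends the packet to the Response Address before any other
   rule is considered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum mtrace_action {
    TO_PREV_HOP,       /* unicast to the Previous-Hop Router */
    TO_LINK_GROUP,     /* link-scoped group on the Incoming Interface */
    TO_RESPONSE_ADDR   /* finish: send to the Response Address */
};

/* Hypothetical summary of what the router knows after section 6.2.2. */
struct mtrace_state {
    uint32_t incoming_if;  /* 0 if unknown */
    uint32_t prev_hop;     /* 0 if unknown */
    uint8_t  fwd_code;     /* Forwarding Code noted by this router */
    unsigned blocks;       /* response blocks now in the packet */
    unsigned hops;         /* "# hops" from the header */
};

/* IPv4 multicast addresses are 224.0.0.0 through 239.255.255.255. */
int is_multicast(uint32_t addr)
{
    return (addr >> 28) == 0xE;
}

enum mtrace_action mtrace_next_action(const struct mtrace_state *s)
{
    assert(s != NULL);
    if ((s->fwd_code & 0x80) || s->blocks >= s->hops)
        return TO_RESPONSE_ADDR;
    if (s->prev_hop != 0)
        return TO_PREV_HOP;
    if (s->incoming_if != 0)
        return TO_LINK_GROUP;
    return TO_RESPONSE_ADDR;
}

/* Section 6.5.2: IP TTL to use when sending the Response. */
uint8_t response_ttl(uint32_t resp_addr, uint8_t resp_ttl,
                     uint8_t unicast_ttl)
{
    return is_multicast(resp_addr) ? resp_ttl : unicast_ttl;
}
```

   Which link-scoped group TO_LINK_GROUP maps to depends on the routing
   protocol in use; as stated above, 224.0.0.2 may be used when the
   protocol defines nothing more appropriate.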

7. Using multicast traceroute

7.1. Sample Client

   This section describes the behavior of an example multicast
   traceroute client.

7.1.1. Sending Initial Query

   When the destination of the trace is the machine running the client,
   the traceroute Query packet can be sent to the ALL-ROUTERS multicast
   group (224.0.0.2). This will ensure that the packet is received by
   the last-hop router on the subnet. Otherwise, if the proper
   last-hop router is known for the trace destination, the Query could
   be unicasted to that router. Otherwise, the Query packet should be
   multicasted to the group being queried; if the destination of the
   trace is a member of the group this will get the Query to the proper
   last-hop router. In this final case, the packet should contain the
   Router Alert option, to make sure that routers that are not members
   of the multicast group notice the packet. See also section 7.2 on
   determining the last-hop router.

7.1.2. Determining the Path

   The client could send a small number of Initial Query messages with
   a large "# hops" field, in order to try to trace the full path. If
   this attempt fails, one strategy is to perform a linear search (as
   the traditional unicast traceroute program does); set the "# hops"
   field to 1 and try to get a response, then 2, and so on. If no
   response is received at a certain hop, the hop count can continue
   past the non-responding hop, in the hopes that further hops may
   respond. These attempts should continue until a user-defined
   timeout has occurred.

   See also sections 7.3 and 7.4 on receiving the results of a trace.

7.1.3. Collecting Statistics

   After a client has determined that it has traced the whole path or
   as much as it can expect to (see section 7.5), it might collect
   statistics by waiting a short time and performing a second trace.
If the path is the same in the two traces, statistics can be displayed as described in sections 8.3 and 8.4.

Details of performing a multicast traceroute:

7.2. Last hop router

The traceroute querier may not know which is the last hop router, or that router may be behind a firewall that blocks unicast packets but passes multicast packets. In these cases, the traceroute request should be multicasted to the group being traced (since the last hop router listens to that group). All routers except the correct last hop router should ignore any multicast traceroute request received via multicast. Traceroute requests which are multicasted to the group being traced must include the Router Alert IP option [Katz97].

Another alternative is to unicast to the trace destination. Traceroute requests which are unicasted to the trace destination must include the Router Alert IP option [Katz97], in order that the last-hop router is aware of the packet.

If the traceroute querier is attached to the same router as the destination of the request, the traceroute request may be multicasted to 224.0.0.2 (ALL-ROUTERS.MCAST.NET) if the last-hop router is not known.

7.3. First hop router

The traceroute querier may not be unicast reachable from the first hop router. In this case, the querier should set the traceroute response address to a multicast address, and should set the response TTL to a value sufficient for the response from the first hop router to reach the querier. It may be appropriate to start with a small TTL and increase in subsequent attempts until a sufficient TTL is reached, up to an appropriate maximum (such as 192).

The IANA has assigned 224.0.1.32, MTRACE.MCAST.NET, as the default multicast group for multicast traceroute responses. Other groups may be used if needed, e.g. when using mtrace to diagnose problems with the IANA-assigned group.
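
The expanding response TTL strategy of section 7.3 could be sketched roughly as below. The starting TTL of 8 and the doubling factor are assumptions chosen for illustration; only the maximum of 192 is taken from the text. The send_query callable is hypothetical: it is assumed to send one traceroute query with the given response TTL and to return True if a response arrived before the client's timeout.

```python
def find_response_ttl(send_query, start=8, factor=2, maximum=192):
    """Try increasing response TTLs until a response is received.

    send_query(ttl) is a hypothetical callable supplied by the client.
    Returns the first TTL that produced a response, or None if even
    the maximum TTL did not.
    """
    ttl = start
    while True:
        if send_query(ttl):
            return ttl
        if ttl >= maximum:
            # The cap has been tried; give up rather than grow further.
            return None
        ttl = min(ttl * factor, maximum)
```

With the defaults, the TTLs tried are 8, 16, 32, 64, 128 and finally 192. A client might prefer a smaller step when it wants to limit how far responses spread.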
7.4. Broken intermediate router

A broken intermediate router might simply not understand traceroute packets, and drop them. The querier would then get no response at all from its traceroute requests. It should then perform a hop-by-hop search by adjusting the number of responses field until it gets a response (both linear and binary search are options, but binary is likely to be slower because a failure requires waiting for a timeout).

7.5. Trace termination

When performing an expanding hop-by-hop trace, it is necessary to determine when to stop expanding.

7.5.1. Arriving at source

A trace can be determined to have arrived at the source if the Incoming Interface of the last router in the trace is non-zero, but the Previous Hop router is zero.

7.5.2. Fatal Error

A trace has encountered a fatal error if the last Forwarding Error in the trace has the 0x80 bit set.

7.5.3. No Previous Hop

A trace cannot continue if the last Previous Hop in the trace is set to 0.

7.5.4. Trace shorter than requested

If the trace that is returned is shorter than requested (i.e. the number of Response blocks is smaller than the "# hops" field), the trace encountered an error and could not continue.

7.6. Continuing after an error

When the NO_SPACE error occurs, the client might try to continue the trace by starting it at the last hop in the trace. It can do this by unicasting to this router's outgoing interface address, keeping all fields the same. If this results in a single hop and a "WRONG_IF" error, the client may try setting the trace destination to the same outgoing interface address.

If a trace times out, it is likely to be because a router in the middle of the path does not support multicast traceroute. That router's address will be in the Previous Hop field of the last entry in the last reply packet received.
A client may be able to determine (via mrinfo [Pusa99] or SNMP [Thal99a, Thal99b]) a list of neighbors of the non-responding router. If desired, each of those neighbors could be probed to determine the remainder of the path. Unfortunately, this heuristic may end up with multiple paths, since there is no way of knowing what the non-responding router's algorithm for choosing a previous-hop router is. However, if all paths but one flow back towards the non-responding router, it is possible to be sure that this is the correct path.

7.7. Multicast Traceroute and shared-tree routing protocols

When using shared-tree routing protocols like PIM-SM and CBT, a more advanced client may use multicast traceroute to determine paths or potential paths.

7.7.1. PIM-SM

When a multicast traceroute reaches a PIM-SM RP and the RP does not forward the trace on, it means that the RP has not performed a source-specific join so there is no more state to trace. However, the path that traffic would use if the RP did perform a source-specific join can be traced by setting the trace destination to the RP, the trace source to the traffic source, and the trace group to 0. This trace Query may be unicasted to the RP.

7.7.2. CBT

When a multicast traceroute reaches a CBT Core, it must simply stop since CBT does not have source-specific state. However, a second trace can be performed, setting the trace destination to the traffic source, the trace group to the group being traced, and the trace source to the Core (or to 0, since CBT does not have source-specific state). This trace Query may be unicasted to the Core. There are two possibilities when combining the two traces:

7.7.2.1. No overlap

If there is no overlap between the two traces, the second trace can be reversed and appended to the first trace.
This composite trace shows the full path from the source to the destination.

7.7.2.2. Overlapping paths

If there is a portion of the path that is common to the ends of the two traces, that portion is removed from both traces. Then, as in the no overlap case, the second trace is reversed and appended to the first trace, and the composite trace again contains the full path.

This algorithm works whether the source has joined the CBT tree or not.

7.8. Protocol-specific considerations

7.8.1. DVMRP

DVMRP's dominant router election and route exchange guarantee that DVMRP routers know whether or not they are the last-hop forwarder for the link and who the previous hop is.

7.8.2. PIM Dense Mode

Routers running PIM Dense Mode do not know the path packets would take unless traffic is flowing. Without some extra protocol mechanism, this means that in an environment with multiple possible paths with branch points on shared media, multicast traceroute can only trace existing paths, not potential paths. When there are multiple possible paths but the branch points are not on shared media, the previous hop router is known, but the last hop router may not know that it is the appropriate last hop.

When traffic is flowing, PIM Dense Mode routers know whether or not they are the last-hop forwarder for the link (because they won or lost an Assert battle) and know who the previous hop is (because it won an Assert battle). Therefore, multicast traceroute is always able to follow the proper path when traffic is flowing.

8. Problem Diagnosis

8.1. Forwarding Inconsistencies

The forwarding error code can tell if a group is unexpectedly pruned or administratively scoped.

8.2. TTL problems

By taking the maximum of (hops from source + forwarding TTL threshold) over all hops, you can discover the TTL required for the source to reach the destination.
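
The TTL computation of section 8.2 can be written out as a small sketch. The function name and the input representation, a list of (hops from source, forwarding TTL threshold) pairs with one pair per hop in the trace, are assumptions made for this example; how a client derives the hop distance from the response blocks is left to the caller.

```python
def required_source_ttl(hops):
    """Return the TTL the source needs for traffic to reach the destination.

    hops is an iterable of (hops_from_source, ttl_threshold) pairs, one
    per hop in the trace (a hypothetical representation). Raises
    ValueError if the trace contains no hops.
    """
    return max(distance + threshold for distance, threshold in hops)
```

For example, a three-hop trace with thresholds 1, 16 and 1 at distances 1, 2 and 3 from the source gives max(2, 18, 4), so the source would need a TTL of 18.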
8.3. Packet Loss

By taking two traces, you can find packet loss information by comparing the difference in input packet counts to the difference in output packet counts at the previous hop. On a point-to-point link, any difference in these numbers implies packet loss. Since the packet counts may be changing as the trace query is propagating, there may be small errors (off by 1 or 2) in these statistics. However, these errors will not accumulate if multiple traces are taken to expand the measurement period. On a shared link, the count of input packets can be larger than the number of output packets at the previous hop, due to other routers or hosts on the link injecting packets. This appears as "negative loss" which may mask real packet loss.

In addition to the counts of input and output packets for all multicast traffic on the interfaces, the response data includes a count of the packets forwarded by a node for the specified source-group pair. Taking the difference in this count between two traces and then comparing those differences between two hops gives a measure of packet loss just for traffic from the specified source to the specified receiver via the specified group. This measure is not affected by shared links.

On a point-to-point link that is a multicast tunnel, packet loss is usually due to congestion in unicast routers along the path of that tunnel. On native multicast links, loss is more likely in the output queue of one hop, perhaps due to priority dropping, or in the input queue at the next hop. The counters in the response data do not allow these cases to be distinguished. Differences in packet counts between the incoming and outgoing interfaces on one node cannot generally be used to measure queue overflow in the node.

8.4. Link Utilization

Again, with two traces, you can divide the difference in the input or output packet counts at some hop by the difference in time stamps from the same hop to obtain the packet rate over the link. If the average packet size is known, then the link utilization can also be estimated to see whether packet loss may be due to the rate limit or the physical capacity on a particular link being exceeded.

8.5. Time delay

If the routers have synchronized clocks, it is possible to estimate propagation and queuing delay from the differences between the timestamps at successive hops. However, this delay includes control processing overhead, so is not necessarily indicative of the delay that data traffic would experience.

9. Implementation-specific Caveats

Some routers with distributed forwarding architectures may not update the main processor's packet counts often enough for the packet counters to be meaningful on a small time scale. This can be recognized during a periodic trace by seeing positive loss in one trace and negative loss in the next, with no (or small) net loss over a longer interval. The suggested solution to this problem is to simply collect statistics over a longer interval.

In the multicast extensions for SunOS 4.1.x from Xerox PARC, which are the basis for many UNIX-based multicast routers, both the output packet count and the packet forwarding count for the source-group pair are incremented before priority dropping for rate limiting occurs and before the packets are put onto the interface output queue which may overflow. These drops will appear as (positive) loss on the link even though they occur within the router.

In release 3.3/3.4 of the UNIX multicast extensions, a multicast packet generated on a router will be counted as having come in an interface even though it did not.
This can create the appearance of negative loss even on a point-to-point link.

In releases up through 3.5/3.6, packets were not counted as input on an interface if the reverse-path forwarding check decided that the packets should be dropped. That causes the packets to appear as lost on the link if they were output by the upstream hop. This situation can arise when two routers on the path for the group being traced are connected by a shared link, and the path for some other group does not flow between those two routers because the downstream router receives packets for the other group on another interface, but the upstream router is the elected forwarder to other routers or hosts on the shared link.

10. Acknowledgments

This specification started largely as a transcription of Van Jacobson's slides from the 30th IETF, and the implementation in mrouted 3.3 by Ajit Thyagarajan. Van's original slides credit Steve Casner, Steve Deering, Dino Farinacci and Deb Agrawal. A multicast traceroute client, mtrace, has been implemented by Ajit Thyagarajan, Steve Casner and Bill Fenner.

The idea of unicasting a multicast traceroute Query to the destination of the trace with Router Alert set is due to Tony Ballardie. The idea of the "S" bit to allow statistics for a source subnet is due to Tom Pusateri.

11. IANA Considerations

11.1. Routing Protocols

The IANA is responsible for allocating new Routing Protocol codes. The Routing Protocol code is somewhat problematic, since in the case of protocols like CBT and PIM it must encode both a unicast routing algorithm and a multicast tree-building protocol. The space was not divided into two fields because it was already small and some combinations (e.g. DVMRP) would be wasted.

Routing Protocol codes should be allocated for any combination of protocols that are in common use in the Internet.

11.2. Forwarding Codes

New Forwarding codes must only be created by an RFC that modifies this document's section 7, fully describing the conditions under which the new forwarding code is used. The IANA may act as a central repository so that there is a single place to look up forwarding codes and the document in which they are defined.

12. Security Considerations

12.1. Topology discovery

mtrace can be used to discover any actively-used topology. If your network topology is a secret, mtrace may be restricted at the border of your domain, using the ADMIN_PROHIB forwarding code.

12.2. Traffic rates

mtrace can be used to discover what sources are sending to what groups and at what rates. If this information is a secret, mtrace may be restricted at the border of your domain, using the ADMIN_PROHIB forwarding code.

12.3. Unicast replies

The "Response address" field may be used to send a single packet (the traceroute Reply packet) to an arbitrary unicast address. It is possible to use this facility as a packet amplifier, as a small multicast traceroute Query may turn into a large Reply packet.

13. References

Brad88   Braden, B., D. Borman, C. Partridge, "Computing the Internet Checksum", RFC 1071, ISI, September 1988.

Brad97   Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119/BCP 14, Harvard University, March 1997.

Katz97   Katz, D., "IP Router Alert Option", RFC 2113, Cisco Systems, February 1997.

Pusa99   Pusateri, T., "DVMRP Version 3", work in progress, June 1999.

Thal99a  Thaler, D., "PIM MIB", work in progress, June 1999.

Thal99b  Thaler, D., "DVMRP MIB", work in progress, May 1998.

14. Authors' Addresses

William C. Fenner
AT&T Labs -- Research
75 Willow Rd.
Menlo Park, CA 94025
United States
Email: fenner@research.att.com

Stephen L. Casner
Cisco Systems, Inc.
170 West Tasman Drive
San Jose, CA 95134
United States
Email: casner@cisco.com

15. Changes from the last revision:

- Changes section added.

- Updated abstract

- Added mention of up-to-date packet counts, in particular allowing the delay of an mtrace packet while the counts are fetched in a distributed architecture.

- Added mention of ifInMulticastPkts, ifOutMulticastPkts, and ipMRoutePkts for clarification of what counts should be used.

- Note that the dropping of duplicate Queries MAY be a 1-back cache and that duplicate Requests MUST NOT be dropped

- Add no-space processing rule

- Note that it's not an error for there to be more blocks than requested, just send it back after adding yours.

- Clean up some of section 8 - move implementation-specific stuff to a separate section, rename "Congestion" to "Packet Loss", note that time delay isn't actually that useful.