idnits 2.17.1 draft-ietf-mboned-mrm-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. == No 'Intended status' indicated for this document; assuming Proposed Standard == It seems as if not all pages are separated by form feeds - found 0 form feeds but 22 pages Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 52 instances of too long lines in the document, the longest one being 2 characters in excess of 72. == There are 6 instances of lines with multicast IPv4 addresses in the document. If these are generic example addresses, they should be changed to use the 233.252.0.x range defined in RFC 5771 Miscellaneous warnings: ---------------------------------------------------------------------------- == The "Author's Address" (or "Authors' Addresses") section title is misspelled. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. 
If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'AH' is mentioned on line 165, but not defined == Missing Reference: 'RFC1889' is mentioned on line 467, but not defined ** Obsolete undefined reference: RFC 1889 (Obsoleted by RFC 3550) == Unused Reference: 'UDP' is defined on line 1024, but no explicit reference was found in the text == Unused Reference: 'MD5' is defined on line 1030, but no explicit reference was found in the text ** Downref: Normative reference to an Informational RFC: RFC 1321 (ref. 'MD5') -- Possible downref: Non-RFC (?) normative reference: ref. 'KA98' Summary: 7 errors (**), 0 flaws (~~), 8 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 MBone Deployment Working Group Kevin Almeroth (ed) 2 Internet Engineering Task Force UCSB 3 Internet Draft Liming Wei 4 October 1999 Siara Systems, Inc. 5 Expires: April 1999 Dino Farinacci 6 Cisco 8 Multicast Reachability Monitor (MRM) 9 11 Status of this Memo 13 This document is an Internet-Draft and is in full conformance 14 with all provisions of Section 10 of RFC2026. 16 Internet-Drafts are working documents of the Internet Engineering 17 Task Force (IETF), its areas, and its working groups. 
Note that 18 other groups may also distribute working documents as 19 Internet-Drafts. 21 Internet-Drafts are draft documents valid for a maximum of six 22 months and may be updated, replaced, or obsoleted by other 23 documents at any time. It is inappropriate to use Internet- 24 Drafts as reference material or to cite them other than as 25 "work in progress." 27 The list of current Internet-Drafts can be accessed at 28 http://www.ietf.org/ietf/1id-abstracts.txt 30 The list of Internet-Draft Shadow Directories can be accessed at 31 http://www.ietf.org/shadow.html. 33 Abstract 35 MRM facilitates automated fault detection and fault isolation in a 36 large multicast routing infrastructure. It is designed to alarm a 37 network administrator of multicast reachability problems in close 38 to real-time. 40 There are two basic types of components in MRM, MRM manager and MRM 41 testers. This document specifies the protocol with which the two MRM 42 components communicate, the types of operations the testers perform, 43 and information an MRM manager can obtain. 45 Table of Contents 47 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 49 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 50 1.1 Partitioning Network Monitoring Tasks . . . . . . . . . . . . 3 52 2. Functions of the MRM Mechanism . . . . . . . . . . . . . . . . . . 4 53 2.1 Fault Detection . . . . . . . . . . . . . . . . . . . . . . . 4 54 2.2 Fault Isolation . . . . . . . . . . . . . . . . . . . . . . . 5 55 2.3 The Protocol . . . . . . . . . . . . . . . . . . . . . . . . . 5 56 2.3.1 MRM Manager Requests . . . . . . . . . . . . . . . . . . 6 57 2.3.1.1 MRM Manager Beacon Message . . . . . . . . . . . . . 7 58 2.3.1.2 Test Sender Requests (TSRs) . . . . . . . . . . . . 7 59 2.3.1.3 Test Receiver Requests (TRRs) . . . . . . . . . . . 8 60 2.3.2 Status Reports . . . . . . . . . . . . . . . . . . . . . 10 62 3. Use of MRM Well Known Addresses and Ports . . . . . . . . . . . 
. 11 64 4. Message Formats . . . . . . . . . . . . . . . . . . . . . . . . . 11 65 4.1 MRM Message Header . . . . . . . . . . . . . . . . . . . . . . 12 66 4.2 MRM Manager Beacon Message . . . . . . . . . . . . . . . . . . 13 67 4.3 Test Sender Request (TSR) . . . . . . . . . . . . . . . . . . 13 68 4.4 Test Receiver Requests (TRR) . . . . . . . . . . . . . . . . . 14 69 4.5 Status Report to the MRM Manager . . . . . . . . . . . . . . . 16 70 4.6 MRM Test Packet . . . . . . . . . . . . . . . . . . . . . . . 17 71 4.7 MRM Request-Ack Messages . . . . . . . . . . . . . . . . . . . 17 73 5. Authenticating MRM Messages . . . . . . . . . . . . . . . . . . . 17 74 5.1 Generating Authenticated Messages . . . . . . . . . . . . . . 18 75 5.2 Receiving Authenticated Messages . . . . . . . . . . . . . . . 18 76 5.3 Key Management . . . . . . . . . . . . . . . . . . . . . . . . 18 78 6. Security Considerations . . . . . . . . . . . . . . . . . . . . . 19 80 7. Different Approaches to Implement MRM . . . . . . . . . . . . . . 19 82 8. Example of an MRM Setup . . . . . . . . . . . . . . . . . . . . . 19 84 9. Acknowledgment . . . . . . . . . . . . . . . . . . . . . . . . . . 21 86 10. Authors addresses . . . . . . . . . . . . . . . . . . . . . . . . 21 88 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 90 Appendix A - Change History . . . . . . . . . . . . . . . . . . . . . 22 91 1. Introduction 93 The Multicast Reachability Monitor (MRM) is a network fault detection 94 and isolation mechanism for administering a multicast routing 95 infrastructure. It is proposed in response to requests from network 96 managers and users who need more systematic ways to get up-to-date 97 multicast reachability status. For these purposes, existing tools are 98 inefficient and inconvenient to use across large numbers of systems. 99 The companion document [mrm-use] contains additional information on 100 justification and usage guidelines for MRM. 
102 The design goals for MRM include: 104 (1) Close to real-time detection and alarm of network problems, 105 independent of user input; 107 (2) Good coverage over the network, both in terms of the number of 108 systems to be monitored, and the types of diagnostics to be 109 performed; 111 (3) Good extensibility and relative independence of other specific 112 diagnostic tools and protocols (we borrow packet formats from 113 RTPv2, but almost nothing else from the RTP protocol). This makes 114 it easy to incorporate newer diagnostic tools as they become 115 available. 117 1.1 Partitioning Network Monitoring Tasks 119 Functionally, the task of monitoring a multicast domain can be 120 divided into two subtasks: 122 (1) Fault detection 123 (2) Fault isolation 125 In the fault detection phase, the participating MRM systems do not 126 need much detail about the nature of the fault. The mechanism can 127 be very simple and brute force. Data packets can be originated 128 from designated locations in the network and reception conditions 129 monitored from other locations. 131 In the fault isolation phase, depending on the types of fault 132 identified, the MRM manager can use appropriate tools to isolate the 133 fault and, ideally, pinpoint its location or cause. 135 The rest of this document is organized as follows: Section 2 136 describes the MRM framework and details of the MRM protocol; Section 137 3 describes the usage of the well known MRM addresses and ports; 138 Section 4 specifies packet formats; Section 5 discusses the MRM 139 authentication mechanisms; Section 6 discusses a few security issues; 140 and Section 8 gives an example of MRM setup. 142 2. Functions of the MRM Mechanism 144 An MRM based fault monitoring system consists of two types of 145 components: (1) an MRM manager that configures tests, collects and 146 presents fault information, and (2) MRM testers that source or sink 147 test traffic.
These components collaborate to accomplish the two 148 functions of MRM: fault detection and fault isolation. 150 The MRM testers can be any routing devices or trusted end hosts. 151 They provide statistics about received data packets, to be used to 152 derive the network reachability status. These data packets can be 153 sourced by a router acting as an MRM tester, in response to a request 154 from the MRM manager. A system originating MRM data packets for 155 testing purposes is also called a Test Sender (TS). The configured 156 MRM testers that receive the test traffic and collect 157 receiver statistics are also called Test Receivers (TRs). 159 An MRM manager initiates configuration requests to the MRM testers 160 and assigns the roles of TSs and TRs. The MRM manager informs the TSs 161 and TRs of the types of monitoring or diagnostic tests to run. The MRM 162 manager also specifies the type of reports the TRs should send. 164 To guard against attacks on the MRM systems, IPsec Authentication 165 Header (AH) [AH] is used with HMAC-MD5 transformation as the standard 166 authentication algorithm. Authentication should always be enabled, 167 especially when MRM is used to monitor production services. 169 Note that this document only specifies the types of information an MRM 170 manager can obtain, and the protocol used to acquire such 171 information. How an MRM manager processes or presents the diagnostic 172 information is an implementation issue. An MRM manager can be as 173 simple as a command line wrapper of requests with simple display 174 functions; it can also be more sophisticated and incorporated as part 175 of an operational network monitoring tool in daily use by a network 176 operation center (NOC). 178 2.1 Fault Detection 180 Multicast routing can behave abnormally in different ways. The 181 following are a few common types of faults: 183 (1) Topological disconnectivity 185 The network topology for multicast routing is disconnected.
For 186 example, when routes for a subset of the networks are not in the 187 topology table. 189 (2) Black holes in forwarding path 191 No multicast packets can get through to certain receivers, even 192 though the network topology is perhaps intact. A possible cause 193 could be disabled multicast forwarding. Another possibility is 194 pruning errors, e.g. due to inconsistent actions and timer 195 values on a multi-access LAN. 197 (3) Excessive/persistent Losses 199 Packets flow, but with excessive losses over an extended period of 200 time. The possible causes include heavy congestion, line errors 201 or misuse of forwarding modes, etc. 203 (4) Excessive duplicates 205 Packets arrive at the receivers, but with large numbers of 206 duplicates. 208 (5) Others 210 Other types of fault can also be detected, e.g. non-pruners 211 as a failure mode. A non-pruning neighbor can be a sink for all 212 multicast traffic at all times, even if no receivers exist behind 213 that neighbor. This is "outlawed" by the "MBONE-community" [jhawk]. 214 Detecting the existence of such a system in an inter-domain scenario, 215 however, is not trivial. We leave this task to the next iteration 216 of MRM refinement. 218 2.2 Fault Isolation 220 Fault isolation is initiated by the MRM manager. For different types 221 of faults detected, various tools can be used to isolate the faults 222 to small areas in the network. Currently, the tools available for 223 this purpose include, but are not limited to, mtrace [MTRACE], MIB-based 224 debugging tools, http-based status report mechanisms and remote 225 execution mechanisms. 227 When one tool is not sufficient, a combination of tools can be 228 applied. In general, MRM is designed to be flexible about the types 229 of tools it can utilize. Integrating the functionality of other 230 tools into MRM is an implementation issue for the MRM manager.
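As a rough illustration of how the fault categories listed in Section 2.1 might be told apart from receiver-side observations, the sketch below classifies one measurement window from the sequence numbers a Test Receiver saw. The function name, the threshold defaults and the returned labels are assumptions made for this example only; the draft does not define them.

```python
from collections import Counter


def classify_window(seen_seqs, expected_count,
                    loss_threshold_pct=20.0, dup_threshold_pct=10.0):
    """Classify one measurement window (hypothetical helper).

    seen_seqs      -- sequence numbers of received test packets, repeats included
    expected_count -- packets the Test Sender should have sent in the window
    The thresholds are illustrative defaults, not values from the draft.
    """
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")

    counts = Counter(seen_seqs)
    unique = len(counts)                    # distinct packets that arrived
    duplicates = len(seen_seqs) - unique    # extra copies of those packets

    if unique == 0:
        # Nothing arrived: disconnected topology or a forwarding black hole.
        return "no-reception"

    loss_pct = 100.0 * max(expected_count - unique, 0) / expected_count
    dup_pct = 100.0 * duplicates / len(seen_seqs)

    if loss_pct >= loss_threshold_pct:
        return "excessive-loss"
    if dup_pct >= dup_threshold_pct:
        return "excessive-duplicates"
    return "ok"
```

A receiver-side check of this kind cannot by itself separate a disconnected topology from a black hole; that distinction is left to the isolation tools named in Section 2.2.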
232 2.3 The Protocol 234 As stated above, the task of monitoring multicast reachability is 235 accomplished by letting an MRM manager configure the MRM testers to 236 perform fault detection and isolation tests. The MRM manager 237 summarizes or displays the collected reports for the network 238 operators, in an implementation specific way. 240 The MRM manager keeps a list of tester addresses. The relevant 241 routing devices are administratively configured as candidate MRM 242 testers. These testers will become active TSs and TRs once they 243 accept and process requests from an MRM manager. 245 We chose to use RTPv2 encapsulation for the following MRM messages: 246 fault report messages from TRs and optionally some test data packets. 247 This is to allow re-use of existing RTP based reception mechanisms. 248 Note that despite the use of the RTPv2 packet format, the design 249 goals and rules for the MRM message exchange protocol are entirely 250 different from those specified in RTP. 252 2.3.1 MRM Manager Requests 254 An MRM manager sends Test Sender requests to the TSs, and Test 255 Receiver requests to the TRs. 257 The MRM manager optionally transmits periodic beacon requests 258 to the well-known MRM multicast address MRM-ADDR (224.0.1.111) 259 that all TSs and TRs listen to. This beacon mechanism has three 260 purposes: 262 (1) For the TSs and TRs to learn the liveness of the MRM manager; 264 (2) As a medium to periodically refresh requests, in order for 265 testers to recover lost MRM messages, configurations or state 266 (e.g. across reboots). 268 (3) Inform a large group of test participants that an MRM session 269 has been changed or cancelled. 271 The use of beacon messages by the manager is optional primarily 272 because multicast connectivity between the manager and TSs and 273 TRs may not exist. As a result, while beacon messages may add 274 robustness, they should not be relied on to provide critical 275 functionality. 
While the manager chooses whether or not to 276 send beacon messages, TSs and TRs must be prepared to handle 277 these messages. 279 The MRM manager may send a request to either a unicast address, 280 or multicast address 224.0.1.111. When the message is sent via 281 unreliable unicast transport (UDP), the recipient must send a 282 positive acknowledgement after it has received that request. 283 Unacknowledged request messages are retransmitted. 285 2.3.1.1 MRM Manager Beacon Message 287 The MRM manager periodically transmits beacon messages to advertise its 288 liveness to all MRM testers. This message is UDP-encapsulated. The 289 sender's timestamp can be used to calculate the jitters in delay 290 between subsequent beacon messages. 292 The recommended default Beacon message interval is 1 minute. The MRM 293 manager may piggyback the manager requests on the beacon messages. 294 This potentially reduces the need to individually check and repair 295 each tester's setup state, while still able to provide reliability 296 through a soft-state refresh mechanism. 298 2.3.1.2 Test Sender Requests (TSRs) 300 A Test Sender request is first unicast delivered to a TS, then 301 refreshed through multicast delivery via the MRM beacon mechanism. 302 A Test Sender request specifies one of the following two ways to 303 generate test packets: 305 (1) Local packet trigger. This request includes the following 306 parameters: 308 (a) intervals between two consecutive test packets; 309 (b) format and length of test packets (e.g. RTP/UDP); 310 (c) multicast address for the test group. 312 If a TS accepts this local packet trigger, it will start sending 313 periodic test packets, at intervals specified in the MRM request 314 message. The IP address of the MRM manager will be used as the ID 315 for all test packets originated by the TS under this request. 
To 316 detect loops and packet losses, all test packets also contain a 317 monotonically increasing sequence number (if encapsulated in RTP, 318 this would be the RTP sequence number). 320 (2) Proxy packet trigger (see Section 5 for security impacts). 322 This request lets a TS send a (sequence of) MRM test packet(s), 323 using the IP source address provided by the manager request 324 message. This request contains all parameters a local packet 325 trigger has, plus a proxy-source address. 327 This request is useful for monitoring intra-domain multicast 328 connectivity for external sources. A proxy packet trigger can be 329 used to inject packets into the local domain, pretending there is 330 an active source external to the local domain. Inside the domain, 331 as far as forwarding is concerned, these packets are 332 indistinguishable from packets originated from a real external 333 source. For security reasons, proxy packet triggers should be 334 enabled very carefully. 336 TSR messages are also used to stop ongoing tests. A test is stopped by 337 re-sending the original TSR packet with a holdtime of zero. 338 NOTE: TRR messages with a holdtime of zero should 339 also be sent to each test receiver participating in the test. 341 2.3.1.3 Test Receiver Requests (TRRs) 343 An MRM status request is first addressed to a unicast address of a 344 TR, and subsequently should be carried in the MRM manager beacon 345 messages sent to 224.0.1.111. 347 Each such request carries a holdtime of the request, after which the 348 TR can safely discard any information collected. A TRR with a 349 holdtime of zero implies that an ongoing test should be terminated. 350 The TRR specifies how each TR should collect the reception data. 352 The following are the request types for the TRs: 354 (1) Monitor multicast group. This request has the following fields: 356 (a) J-bit. If set, the TR will join the specified group, as if it 357 were a host that is a member of that group.
359 If a tester did an IGMP join at the beginning of a test, when 360 the MRM request expires, the IGMP group membership should be 361 withdrawn. 363 When a TR is instructed to join a data group of an existing 364 application (e.g. a heartbeat [heartbeat] group), it is wise 365 to assess the impact on the TR system if the data rate is 366 non-trivial. 368 Furthermore, the use of existing groups introduces uncertainty 369 as to whether the source is actually transmitting. Because 370 TRs expect a constant flow of packets, using existing group 371 traffic, which may be bursty, introduces uncertainty at the 372 receiver as to whether traffic is flowing but is being lost 373 or not being sent. 375 (b) The address of the group to be monitored; 377 (c) List of source addresses to record reception quality 378 information; 380 (d) Threshold description for triggering fault reports. 382 This draft revision only specifies packet loss based 383 threshold. A fault is detected if the packet loss percentage 384 has reached the threshold during the specified time window for 385 measurement. Once set, the width of this window is fixed. But 386 the starting point (or left edge) of the window keeps moving 387 forward. 389 Reception quality data within the measurement window should be 390 kept so that threshold calculations can be made continuously 391 as the window moves forward in time. 393 (e) Maximum and minimum delays to trigger fault report. The report 394 is sent at a randomized delay between the minimum and the 395 maximum value. 397 (f) Type of error reports solicited. It is possible to specify an 398 RTCP report (as if the test session uses RTP), or a native MRM 399 report. Currently, MRM only supports RTP-based reports. 401 (2) Fault isolation request. This request is sent after a fault is 402 detected and identified by the MRM manager. It specifies the tool 403 and its associated parameters. 
405 Details about this request message will be added in a future 406 revision of the MRM specification. 408 (3) Poll for receiver statistics. This instructs the TR to report the 409 statistics (historic data) it has collected via Status Reports. 410 The TR will send Status Reports, even if the fault threshold has 411 not been reached. Section 2.3.2 describes the status report 412 mechanism in detail. 414 When large numbers of TRs are activated, a fault in the upstream of a 415 tree may result in many TRs sending reports at the same time. To 416 address the issue of possible report implosion, each TR may use one 417 of the following two strategies: 419 (1) Report via unicast message. The MRM manager assigns a pre- 420 determined report-delay (as part of the configuration design 421 task) to each TR. Each TR upon detecting a fault, will randomly 422 delay the sending of its report based on the pre-set delay 423 period. This would allow an MRM system to monitor networks with 424 up to thousands of systems without unreasonable compromises in 425 detection response times. 427 (2) Each TR may be instructed to report the detected faults to the 428 well-known MRM group address 224.0.1.111 using the RTCP format 429 [RFC1889] and does back-off or suppression when duplicate reports 430 from other Testers are seen. If using this strategy the manager 431 should realize that using multicast to report a problem with 432 multicast may not be particularly robust. 434 This method allows the use of existing RTP-based monitoring tools 435 in the initial deployment and experiments with MRM. However, it 436 will prevent the MRM manager from learning a complete list of 437 receivers affected by a specific fault. When multicast routing is 438 not working correctly, these reports may not be heard by the MRM 439 manager, leaving faults undetected and not alarmed by the MRM 440 manager. 
It is recommended that all designs include at least a 441 subset of TRs that (take turns to) unicast their reports. 443 It is ambiguous when the MRM manager hears no fault report from a given 444 TR. It could be due to fault-free network status, the crash of the 445 TR, or problems in the transport mechanism between the TR and the MRM 446 manager. Requiring each TR to frequently report its liveness and to 447 send only unicast fault reports may work for a moderate number of 448 testers, but may put undue burden on the network for larger numbers 449 of testers. A compromise solution is to report liveness only from 450 a critical portion of the network and to send unicast fault reports from a 451 subset of the testers. The periodic liveness reports serve two 452 purposes: (1) they provide evidence that the tester is still alive; 453 (2) they indicate the condition of the tester functions. The 454 request-ack messages are used as tester liveness reports. 456 Note that the fault isolation phase does not necessarily require the 457 MRM manager to send a Fault Isolation Request to a TR. For example, in a 458 typical network today, a third party mtrace issued by the MRM manager 459 may be sufficient to identify the faulty hop that is excessively dropping 460 packets if the tester is not completely blacked out. 462 2.3.2 Status Reports 464 These reports are sent by the TRs to the MRM manager, in response to 465 a status request. 467 For now, we use the RTP [RFC1889] "receiver report (RR)" packet format to 468 carry receivers' status reports. It is expected that the MRM-native 469 report format (to be defined in future draft revisions) will carry 470 more useful information about the routing state and statistics. 472 Please refer to RFC1889 for details on the packet formats. Here we 473 define the few RTCP items used by MRM (loosely referred to as the RTP 474 profile for MRM): 476 SSRC (Synchronization source) of packet sender: 477 IP address of the Test Sender.
479 Extended highest sequence number received: 480 Highest sequence number seen by the Test Receiver. 482 Fraction loss: 483 Percent loss of Test Sender data. 485 Cumulative number of packets lost: 486 Total number of RTP data packets from SSRC lost within this 487 reception window period. 489 Inter-arrival Jitter: 490 Set to zero when sent, ignored when received. 492 When this report is UDP encapsulated and unicast addressed to the MRM 493 manager, it is explicitly acknowledged. The acknowledgement packet 494 contains the RTCP header portion of the original packet after the MRM 495 header. 497 3. Use of MRM Well Known Addresses and Ports 499 Once all TS and TR systems are configured, they join the well-known 500 MRM control group MRM-ADDR (224.0.1.111) and listen to the well-known 501 MRM UDP port MRM-MANAGER-PORT (679). 503 The MRM beacon messages are periodically sent to 224.0.1.111 UDP 504 port 679. 506 4. Message Formats 508 By default, MRM control messages are encapsulated inside UDP, and an 509 IP authentication header (AH) [KA98], is inserted in between the IP 510 header and the UDP header, as shown below: 512 +-----------+------+------------+------------+--------------+ 513 | IP Header | AH | UDP header | MRM header | MRM payload | 514 +-----------+------+------------+------------+--------------+ 516 The MRM status report in RTCP format is: 518 +-----------+------+------------+------------------+------------+ 519 | IP Header | AH | UDP header | RTCP Rcvr Report | MRM header | 520 +-----------+------+------------+------------------+------------+ 522 The MRM ACK packet format is: 524 +-----------+------+------------+------------+-------------+ 525 | IP Header | AH | UDP header | MRM header | RTCP Header | 526 +-----------+------+------------+------------+-------------+ 528 The inserted AH is reproduced below: 530 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 531 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 532 | Next 
Header | Payload Len | RESERVED | 533 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 534 | Security Parameters Index (SPI) | 535 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 536 | Sequence Number | 537 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 538 | | 539 | Authentication Data (variable) | 540 | | 541 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 543 As specified in [KA98], the following are the default values for the 544 fields above: 546 Next Header: 17, the value for UDP protocol. 548 Payload Len: 5, when MD5 is used (total length is 7 32-bit words). 550 RESERVED: 0 when sent, ignored when received. 552 SPI: 0 - 50, when using configured MD5 keys 554 Sequence Number: the sequence number 556 Authentication Data: message digest 558 4.1 MRM Message Header 560 The MRM message header contains 4 32-bit words. 562 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 563 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 564 |Version| Type | Code | Holdtime | 565 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 566 | Target IP address | 567 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 568 |M| Reserved | MRM message length | 569 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 570 | Timestamp (in milliseconds) | 571 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 573 Version: 4 bits 574 This revision defines version 1 of MRM. 576 Type: 4 bits 577 The defined message types are: 579 0 = Beacon (from MRM manager to all testers) 580 1 = TS Request (from MRM manager to Test Senders) 581 2 = TR Request (from MRM manager to Test Receivers) 582 3 = Status Response (from TR to the MRM manager) 583 4 = TS Request Ack (from TS to MRM manager) 584 5 = TR request Ack (from TR to MRM manager) 585 6 = Status Response Ack (from MRM manager to TR) 587 Code: 8 bits 588 Defined according to each packet type. 
590 Holdtime: 16 bits 591 Maximum duration in seconds this message should be honored. 593 Target IP address: 32 bits 594 The unicast address of the intended recipient of this message. 596 M: 1 bit 597 0: last MRM request message in this packet. 598 1: more MRM request messages follow in the same packet. 600 When multiple MRM messages are grouped into one packet, the IP/AH/UDP 601 headers of the second and all subsequent MRM messages are omitted. The 602 total length of the IP packet will reflect the sum of the lengths of 603 all MRM messages in the packet. 605 4.2 MRM Manager Beacon Message 607 This message is UDP encapsulated, addressed to UDP port 608 MRM-MANAGER-PORT. The outstanding Test Sender Requests and Test Receiver 609 Requests are included in the beacon message. The individual MRM 610 headers are included with these TSR/TRRs. 612 4.3 Test Sender Request (TSR) 614 There are two code values for a TSR: 616 0: Local packet trigger 617 1: Proxy packet trigger 619 NOTE: A host-based implementation is not expected to provide 620 proxy packet capability. 622 Following the MRM message header are the fields for the source 623 specification request: 625 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 626 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 627 | UDP port of test packets |R| S | LEN | Reserved | 628 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 629 | Test group address | 630 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 631 | Inter-packet delay (millisecond) | 632 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 633 | Proxy source IP address (for proxy packet trigger) | 634 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 636 UDP port of test packets: 16 bits 637 UDP port of test packets.
639 R: 1 bit 640 0: Tester will originate RTP/UDP encapsulated test packets 641 1: Tester will originate another kind of packet (not used) 643 S: 2 bits 644 00: send on the targeted interface only 645 01: send on all the multicast enabled interfaces 646 10: send on test-send enabled interfaces 647 11: Unused 649 LEN: 3 bits (optional) 650 Size of the packets to be sourced. The length field represents 651 a multiple of 16 bytes. The range of possible packet sizes is 652 16 bytes to 2048 bytes (2^7)*(16 bytes). The LEN field is 653 optional. If ignored, test senders should send 16 byte packets. 655 Reserved: 10 bits 656 Set to zero when sent. Ignored when received. 658 Inter-packet delay: 32 bits 659 Number of milliseconds between consecutive test packets. 661 Test group address: 32 bits 662 Multicast address of the test group. 664 Proxy source IP address: 32 bits 665 IP address of the source to proxy packet for. This field 666 exists only for a proxy packet trigger request. 668 4.4 Test Receiver Requests (TRR) 670 The following are code values for status request messages: 672 0: Monitor multicast group (Monitor request) 673 1: Poll for receiver statistics (Poll request) 674 2: Fault isolation request (not used in this revision) 676 Message format for monitor and poll requests: 678 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 679 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 680 |J|R| Reserved | Number of sources to monitor | 681 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 682 |Thres index (0)| Pkt loss (%) | Reception window (seconds) | 683 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 684 | Min report delay (seconds) | Max report delay (seconds) | 685 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 686 | Max startup delay (seconds) | Reserved | 687 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 688 | UDP port of test packets | UDP port for status
reports | 689 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 690 / Threshold description block / 691 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 692 | Test group address | 693 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 694 | IP address of Source 1 | 695 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 696 | Inter-Packet delay interval from source 1 | 697 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 698 / ... / 699 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 700 | IP address of Source n | 701 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 702 | Inter-Packet delay from source n | 703 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 705 J: 1 bit 706 0: Don't join the multicast group to be monitored. 707 1: Join the multicast group to be monitored. 709 R: 1 bit 710 0: Fault report should be sent in RTCP format 711 1: Fault report should be sent in native MRM format (not used). 713 Reserved: 714 Zeroed when sent, ignored when received. 716 Number of sources to monitor: 16 bit 717 The number of sources this target tester should monitor. When 718 all sources for the test group are monitored, this field is 719 set to 1, and the corresponding source address field is set 720 to 0.0.0.0. 722 Thres index: 8 bits 723 Always 0. Index of the criteria for determining a threshold 724 for a fault. The value of this index determines the content 725 for the "Threshold description Block". 727 Pkt loss (%): 8 bits 728 Percentage of packet loss. A criteria to determine whether a 729 fault has occurred. 731 Max report delay (seconds): 16 bits 732 Maximum number of seconds within which a fault report must be 733 sent after it is detected. 735 Min report delay (seconds): 16 bits 736 Minimum number of seconds a fault report needs to be sent after 737 it is detected. A report should not be sent in less than this 738 delay. 
740      Max startup delay (seconds): 16 bits
741         Max number of seconds the TR can wait before the start of the
742         test. The test is considered started if a test packet is
743         received, or the "max startup delay" has passed after the
744         receipt of this request.

746      Reception window (seconds): 16 bits
747         Number of seconds used for calculating packet loss percentage.

749      UDP port of test packets: 16 bits
750         UDP port used by the test data packets.

752      UDP port for status reports: 16 bits
753         UDP port used by the status report packets.

755      Threshold description block: 0 bits
756         Variable length, depending on "Thres index". This revision only
757         defines threshold index 0, with no threshold description block.

759      Test group address: 32 bits
760         The IP multicast address for the test group.

762      IP address of source 1 .. n: 32 bits
763         The IP address of the sources the targeted tester should monitor.
764         When the address is 0.0.0.0, all sources to this group will be
765         monitored.

767      Inter-packet delay from source 1 .. n: 32 bits
768         Intervals between consecutive packets from the source
769         (milliseconds).

771   4.5 Status Report to the MRM Manager

773      This MRM revision uses the reception report (RTCP) format based on
774      Section 2.3.2. Future revisions will define MRM specific report
775      formats.

777   4.6 MRM Test Packet

779      MRM test packets are RTPv2/UDP encapsulated. The RTPv2 packet header
780      is replicated below for ease of description.

782       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
783      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
784      |V=2|P|X|  CC   |M|     PT      |       sequence number         |
785      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
786      |                           timestamp                           |
787      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
788      |           synchronization source (SSRC) identifier            |
789      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
790      |                   IP address of MRM manager                   |
791      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

793      CC:
794         Set to 0 when sent, ignored when received.

796      M:
797         Set to 0 when sent, ignored when received.

799      PT:
800         Set to 0 when sent, ignored when received.

802      Sequence number:
803         Sequence number. Set to 0, when a tester is activated.

805      Timestamp:
806         System timestamp, in milliseconds.

808      SSRC:
809         IP address of the tester, or a configured 32-bit number that
810         uniquely identifies the tester.

812   4.7 MRM Request-Ack Messages

814      The Acknowledgement messages for the Test Sender Request and the
815      Status Request provide guarantees that the requests are indeed
816      received by the testers, instead of being lost. The acknowledgement
817      packets contain the MRM header and trailer for the respective
818      messages, except that the message length and authentication data
819      fields are recalculated.

821   5. Authenticating MRM Messages

823      All MRM messages should be authenticated with the MD5 mechanism
824      specified here. The fields in the messages are transmitted in the
825      clear. Packets that fail the authentication check are discarded by
826      the receivers.

828   5.1 Generating Authenticated Messages

830      The sender of the MRM message decides which authentication key
831      is used.

833      (1) The MRM message length field is filled with the number of
834          bytes in the message;

836      (2) The rest of the message is composed;

838      (3) The IPSEC AH is constructed;

840      (4) The "authentication data" field is zeroed;

842      (5) The MRM authentication key (16 bytes long) is appended
843          to the MRM message;

845      (6) The pad for the key is added. The digest is calculated and
846          written into the "authentication data" field.

848      The part with the MD5 secret is not transmitted.

850   5.2 Receiving Authenticated Messages

852      The receiver performs the following steps when processing an
853      incoming message:

855      (1) The digest is stored away and the "authentication data"
856          field zeroed;

858      (2) It finds the key according to the value of "Key ID", and
859          the key is appended and the packet properly padded;

861      (3) A new digest is calculated.

863      A message is discarded if the new digest is different from the one
864      carried in the packet.

866   5.3 Key Management

868      We expect to rely on manual key distribution in the initial stages.
869      MRM should be able to use a standard secure key management
870      mechanism when one becomes available.

872   6. Security Considerations

874      The strength of the security mechanism here depends on the strength
875      of the key and the MD5 algorithm.

877      Insufficiently protected TSs and TRs (e.g. by weak keys) can be
878      subject to attacks that can cause the TSs and TRs to take actions
879      causing harm to the network.

881   7. Different Approaches to Implement MRM

883      MRM is originally targeted at two types of users: network operation
884      centers that provide production quality services; and network
885      administrators who oversee semi-production or experimental multicast
886      services. The former often rely on SNMP-based tools for management
887      tasks and typically desire all types of monitoring functions to
888      be wrapped into the same set of tools. The latter, who usually
889      set the stage for production quality offerings, do not normally rely
890      on SNMP-based tools and favor task-oriented tools.

892      For this reason, this document specifies the native MRM messages and
893      operations. A companion document will define the MRM MIB that can
894      accomplish the majority of the native MRM tasks.

896   8. Example of an MRM Setup

898      The example shown in this section is for illustration purposes only,
899      and does not cover all possible functionalities of the MRM framework.

900              .                                             .
901      Neighbor.   T1                                T2      . Neighbor
902      Domain  . +----+           +----+           +-----+   . Domain
903            ----| BR1|-----------| R2 |-----------| BR3 |--------
904              . +----+           +----+           +-----+   .
905              .   |                 .                |      .
906                  |                 .                |
907                  |     .-----------------------.    |
908                  |                 .                |
909                +----+                            +-----+
910                | R4 |                            | R5  |
911                +----+                            +-----+
912                    .                                /
913                      .            T3               /
914                        .        +----+            /
915                         --------| R6 |-----------/
916                                 +----+
917                                   |
918                                   |
919                             -------------
920                            | MRM Manager |
921                             -------------

923      The above is a simple topology used to demonstrate the use of various
924      MRM features. Border routers BR1, BR3 and an internal router R6 are
925      administratively configured as candidate MRM Testers. The MRM manager
926      configures T1 to be a TS, and T2 and T3 to be TRs. The following are
927      the messages sent by the MRM components.

929      (1) MRM manager sends Test Sender request (TSR) to T1.
930          Req1 = {Local packet trigger,
931                  test packet interval = 60,000 (ms),
932                  RTP/UDP test packet = TRUE,
933                  Test group = 239.255.255.2}

935          T1 acknowledges receipt of Req1.

937      (2) MRM manager sends TR request Req2 to T2. Req2 has the following
938          content:

940          J-bit = TRUE,
941          list of source addresses = {T1's IP address},
942          threshold for fault detection = {20% loss over 10 minutes},
943          max delay for fault report = 10 seconds,
944          min delay for fault report = 0 seconds,
945          Test group = 239.255.255.2

947          T2 acknowledges receipt of Req2. Req2 is retransmitted if the
948          retransmission timer expires.

950      (3) MRM manager sends TR request Req3 to T3. Req3 is similar to
951          Req2, except that the target is T3, and

953          max delay for fault report = 20 seconds,
954          min delay for fault report = 10 seconds

956          Using different (min, max) report delays avoids report
957          implosion at the MRM manager when a fault is detected by T2
958          and T3 at the same time.

960      (4) MRM manager periodically sends beacon messages, carrying Req1,
961          Req2 and Req3. The holdtime is set to the remaining lifetime of
962          the original request.

964      Assume T1 has a fault such that it can forward only 1% of all
965      multicast packets, and the fault is detected by T2 and T3. T2
966      randomly delays between 0-10 seconds, and sends a fault report to the
967      MRM manager. The MRM manager acknowledges this report. T3 randomly
968      delays between 10-20 seconds, and sends its fault report to the MRM
969      manager, which is also acknowledged. This concludes the fault
970      detection phase.

972      In the fault isolation phase, assume the MRM manager sends a third
973      party mtrace request to T2 or T3, and isolates the fault to between
974      T1 and R2 and between T1 and R4. The MRM manager can then issue an
975      alarm to the network operator, with proper descriptions of the problem.

977      The fault isolation phase might be more complicated for other types
978      of fault. For example, if T1 has lost the ability to forward
979      multicast packets completely, T2 and T3 would not have any multicast
980      routing state or statistics for mtrace to work with, and some other
981      mechanism would have to be used.

983   9. Acknowledgment

985      We'd like to thank John Meylor, Beau Williamson, Stephen Deering,
986      Ishan Wu, Louis Mamakos, Manoj Leelanivas, David Meyer, Bill Fenner
987      and Dave Thaler for their comments and suggestions. We'd especially
988      like to thank TY Lin and Kamil Sarac for filling in missing details
989      from the previous version of the specification.

991   10. Authors' Addresses

993      Kevin Almeroth
994      Department of Computer Science
995      University of California
996      Santa Barbara, CA 93106-5110
997      almeroth@cs.ucsb.edu

999      Liming Wei
1000     Siara Systems, Inc.
1001     300 Ferguson Drive
1002     Mountain View, California 94043
1003     lwei@siara.com

1005     Dino Farinacci
1006     cisco Systems, Inc.
1007     170 West Tasman Drive
1008     San Jose, CA 95134
1009     dino@cisco.com

1011  11. References

1013     [mtrace]  Steven Casner, Bill Fenner et al. The mtrace tool.

1015     [mrm-use] Kevin Almeroth, Liming Wei, "Justification and Use of MRM",
1016               draft, Jan 15, 1999.

1018     [aboba]   Bernard Aboba, "The Use of SNTP as a Multicast Heartbeat",
1019               Draft, draft-ietf-mboned-sntp-heart-02.txt.

1021     [ping]    Jon Postel, "Internet Control Message Protocol", RFC792,
1022               Information Sciences Institute, 1981.

1024     [UDP]     Jon Postel, "User Datagram Protocol", RFC768, Information
1025               Sciences Institute.

1027     [scope]   Dave Meyer, "Administratively Scoped IP Multicast",
1028               Draft, draft-ietf-mboned-admin-ip-space-03.txt.

1030     [MD5]     R. Rivest, "The MD5 Message-Digest Algorithm", RFC1321,
1031               April 1992.

1033     [KA98]    Stephen Kent, Randall Atkinson, "IP Authentication Header",
1034               draft-ietf-ipsec-auth-header-07.txt, July 1998.

1036  Appendix A - Change History

1038     October 1999 -- revisions since draft-ietf-mboned-mrm-00.txt

1040     (1) Added a TS length field to allow test send packets to be
1041         specified between 16 bytes and 2048 bytes in 16 byte
1042         increments.

1044     (2) Made usage of beacon messages by the manager optional.
1045         Test agents are required to be able to process beacon
1046         messages.

1048     (3) Monitoring existing groups is relegated to a later version
1049         because of the difficulty in monitoring the source to
1050         determine if it is sending a packet. When an MRM Test
1051         Source is used, Test Receivers know when, how many, and
1052         for how long packets will be sent. If no packets are
1053         received the test receiver knows to report 100% loss.
1054         This assumption is not possible when monitoring existing
1055         groups.

1057     (4) Added additional detail about packet formats and packet
1058         handling procedures to reduce ambiguity.
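
   As an informal illustration of the TSR layout in Section 4.3, the
   following Python sketch packs and unpacks the fields that follow the
   MRM message header. It is not part of the specification. The function
   names, the dictionary keys and the example UDP port 16384 are
   hypothetical. LEN is handled as the raw 3-bit value given in the
   field list; its mapping to a packet size is deliberately left
   uninterpreted. The MRM message header and the authentication data
   are out of scope here.

```python
import socket
import struct

# Code values from Section 4.3.
CODE_LOCAL_TRIGGER = 0
CODE_PROXY_TRIGGER = 1


def pack_tsr_body(code, udp_port, r, s, length, group, delay_ms, proxy_src=None):
    """Pack the TSR fields that follow the MRM message header (sketch)."""
    if not (0 <= r <= 1 and 0 <= s <= 3 and 0 <= length <= 7):
        raise ValueError("R, S or LEN out of range")
    # Second 16-bit half of the first word: R (1 bit), S (2 bits),
    # LEN (3 bits, assumed raw value), then 10 reserved bits set to zero.
    flags = (r << 15) | (s << 13) | (length << 10)
    body = struct.pack("!HH4sI", udp_port, flags,
                       socket.inet_aton(group), delay_ms)
    # The proxy source field exists only for a proxy packet trigger.
    if code == CODE_PROXY_TRIGGER:
        if proxy_src is None:
            raise ValueError("proxy trigger needs a proxy source address")
        body += socket.inet_aton(proxy_src)
    return body


def unpack_tsr_body(code, data):
    """Inverse of pack_tsr_body; returns a dict of field values."""
    udp_port, flags, group, delay_ms = struct.unpack("!HH4sI", data[:12])
    fields = {
        "udp_port": udp_port,
        "r": (flags >> 15) & 0x1,
        "s": (flags >> 13) & 0x3,
        "len": (flags >> 10) & 0x7,
        "group": socket.inet_ntoa(group),
        "delay_ms": delay_ms,
    }
    if code == CODE_PROXY_TRIGGER:
        fields["proxy_src"] = socket.inet_ntoa(data[12:16])
    return fields


# Req1 from Section 8: local trigger, RTP/UDP packets (R=0), one packet
# every 60,000 ms to 239.255.255.2. The port number is an assumption.
req1 = pack_tsr_body(CODE_LOCAL_TRIGGER, 16384, 0, 0, 0,
                     "239.255.255.2", 60000)
print(len(req1))  # 12
```

   A local packet trigger yields three 32-bit words; a proxy packet
   trigger adds a fourth word carrying the proxy source address.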