INTERNET-DRAFT                                                 T. Maufer
Expire in six months                                          C. Semeria
Category: Informational                                 3Com Corporation
                                                              March 1997

                  Introduction to IP Multicast Routing

Status of this Memo

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas, and
its Working Groups. Note that other groups may also distribute working
documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months.
Internet Drafts may be updated, replaced, or obsoleted by other
documents at any time. It is not appropriate to use Internet Drafts as
reference material or to cite them other than as a "working draft" or
"work in progress."

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the internet-drafts Shadow
Directories on:

   ftp.is.co.za      (Africa)
   nic.nordu.net     (Europe)
   ds.internic.net   (US East Coast)
   ftp.isi.edu       (US West Coast)
   munnari.oz.au     (Pacific Rim)

FOREWORD

This document is introductory in nature. We have not attempted to
describe every detail of each protocol, but rather to give a concise
overview in all cases, with enough specifics to allow a reader to grasp
the essential details and operation of protocols related to multicast
IP. Every effort has been made to ensure the accurate representation of
any cited works, especially any works-in-progress. For the complete
details, we refer you to the relevant specification(s).

If internet-drafts are cited in this document, it is only because they
are the only sources of certain technical information at the time of
this writing. We expect that many of the internet-drafts which we have
cited will eventually become RFCs. See the shadow directories above for
the status of any of these drafts, their follow-on drafts, or possibly
the resulting RFCs.

ABSTRACT

The first part of this paper describes the benefits of multicasting,
the MBone, Class D addressing, and the operation of the Internet Group
Management Protocol (IGMP). The second section explores a number of
different techniques that may potentially be employed by multicast
routing protocols:

   o Flooding
   o Spanning Trees
   o Reverse Path Broadcasting (RPB)
   o Truncated Reverse Path Broadcasting (TRPB)
   o Reverse Path Multicasting (RPM)
   o "Shared-Tree" Techniques

The third part contains the main body of the paper. It describes how
the previous techniques are implemented in multicast routing protocols
available today (or under development).
69 o Distance Vector Multicast Routing Protocol (DVMRP) 70 o Multicast Extensions to OSPF (MOSPF) 71 o Protocol-Independent Multicast - Dense Mode (PIM-DM) 72 o Protocol-Independent Multicast - Sparse Mode (PIM-SM) 73 o Core-Based Trees (CBT) 75 Table of Contents 76 Section 78 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . INTRODUCTION 79 1.1 . . . . . . . . . . . . . . . . . . . . . . . . . Multicast Groups 80 1.2 . . . . . . . . . . . . . . . . . . . . . Group Membership Protocol 81 1.3 . . . . . . . . . . . . . . . . . . . . Multicast Routing Protocols 82 1.3.1 . . . . . . . . . . . Multicast Routing vs. Multicast Forwarding 83 2 . . . . . . . . MULTICAST SUPPORT FOR EMERGING INTERNET APPLICATIONS 84 2.1 . . . . . . . . . . . . . . . . . . . . . . . Reducing Network Load 85 2.2 . . . . . . . . . . . . . . . . . . . . . . . . Resource Discovery 86 2.3 . . . . . . . . . . . . . . . Support for Datacasting Applications 87 3 . . . . . . . . . . . . . . THE INTERNET'S MULTICAST BACKBONE (MBone) 88 4 . . . . . . . . . . . . . . . . . . . . . . . . MULTICAST ADDRESSING 89 4.1 . . . . . . . . . . . . . . . . . . . . . . . . Class D Addresses 90 4.2 . . . . . . . Mapping a Class D Address to an IEEE-802 MAC Address 91 4.3 . . . . . . . . . Transmission and Delivery of Multicast Datagrams 92 5 . . . . . . . . . . . . . . INTERNET GROUP MANAGEMENT PROTOCOL (IGMP) 93 5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . IGMP Version 1 94 5.2 . . . . . . . . . . . . . . . . . . . . . . . . . . IGMP Version 2 95 5.3 . . . . . . . . . . . . . . . . . . . . . . . . . . IGMP Version 3 96 6 . . . . . . . . . . . . . . . . . . . MULTICAST FORWARDING TECHNIQUES 97 6.1 . . . . . . . . . . . . . . . . . . . . . "Simpleminded" Techniques 98 6.1.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . Flooding 99 6.1.2 . . . . . . . . . . . . . . . . . . . . . . . . . . Spanning Tree 100 6.2 . . . . . . . . . . . . . . . . . . . Source-Based Tree Techniques 101 6.2.1 . . . . . . . . . . . . . . . . . Reverse Path Broadcasting (RPB) 102 6.2.1.1 . . . . . . . . . . . . . Reverse Path Broadcasting: Operation 103 6.2.1.2 . . . . . . . . . . . . . . . . . RPB: Benefits and Limitations 104 6.2.2 . . . . . . . . . . . Truncated Reverse Path Broadcasting (TRPB) 105 6.2.3 . . . . . . . . . . . . . . . . . Reverse Path Multicasting (RPM) 106 6.2.3.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . Operation 107 6.2.3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . Limitations 108 6.3 . . . . . . . . . . . . . . . . . . . . . . Shared Tree Techniques 109 6.3.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . Operation 110 6.3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . Benefits 111 6.3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . Limitations 112 7 . . . . . . . . . . . . . . . . . . . "DENSE MODE" ROUTING PROTOCOLS 113 7.1 . . . . . . . . Distance Vector Multicast Routing Protocol (DVMRP) 114 7.1.1 . . . . . . . . . . . . . . . . . Physical and Tunnel Interfaces 115 7.1.2 . . . . . . . . . . . . . . . . . . . . . . . . . Basic Operation 116 7.1.3 . . . . . . . . . . . . . . . . . . . . . DVMRP Router Functions 117 7.1.4 . . . . . . . . . . . . . . . . . . . . . . . DVMRP Routing Table 118 7.1.5 . . . . . . . . . . . . . . . . . . . . . DVMRP Forwarding Table 119 7.2 . . . . . . . . . . . . . . . Multicast Extensions to OSPF (MOSPF) 120 7.2.1 . . . . . . . . . . . . . . . . . . Intra-Area Routing with MOSPF 121 7.2.1.1 . . . . . . . . . . . . . . . . . . . . . 
Local Group Database 122 7.2.1.2 . . . . . . . . . . . . . . . . . Datagram's Shortest Path Tree 123 7.2.1.3 . . . . . . . . . . . . . . . . . . . . . . . Forwarding Cache 124 7.2.2 . . . . . . . . . . . . . . . . . . Mixing MOSPF and OSPF Routers 125 7.2.3 . . . . . . . . . . . . . . . . . . Inter-Area Routing with MOSPF 126 7.2.3.1 . . . . . . . . . . . . . . . . Inter-Area Multicast Forwarders 127 7.2.3.2 . . . . . . . . . . . Inter-Area Datagram's Shortest Path Tree 128 7.2.4 . . . . . . . . . Inter-Autonomous System Multicasting with MOSPF 129 7.3 . . . . . . . . . . . . . . . Protocol-Independent Multicast (PIM) 130 7.3.1 . . . . . . . . . . . . . . . . . . . . PIM - Dense Mode (PIM-DM) 131 8 . . . . . . . . . . . . . . . . . . . "SPARSE MODE" ROUTING PROTOCOLS 132 8.1 . . . . . . . Protocol-Independent Multicast - Sparse Mode (PIM-SM) 133 8.1.1 . . . . . . . . . . . . . . Directly Attached Host Joins a Group 134 8.1.2 . . . . . . . . . . . . Directly Attached Source Sends to a Group 135 8.1.3 . . . . . . . Shared Tree (RP-Tree) or Shortest Path Tree (SPT)? 136 8.1.4 . . . . . . . . . . . . . . . . . . . . . . . Unresolved Issues 137 8.2 . . . . . . . . . . . . . . . . . . . . . . Core Based Trees (CBT) 138 8.2.1 . . . . . . . . . . . . . . . . . . Joining a Group's Shared Tree 139 8.2.2 . . . . . . . . . . . . . . . . . . . . . Data Packet Forwarding 140 8.2.3 . . . . . . . . . . . . . . . . . . . . . . . Non-Member Sending 141 8.2.4 . . . . . . . . . . . . . . . . . CBT Multicast Interoperability 142 9 . . . . . . INTEROPERABILITY FRAMEWORK FOR MULTICAST BORDER ROUTERS 143 9.1 . . . . . . . . . . . . . Requirements for Multicast Border Routers 144 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . REFERENCES 145 10.1 . . . . . . . . . . . . . . . . . . . Requests for Comments (RFCs) 146 10.2 . . . . . . . . . . . . . . . . . . . . . . . . . Internet-Drafts 147 10.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . Textbooks 148 10.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Other 149 11 . . . . . . . . . . . . . . . . . . . . . . SECURITY CONSIDERATIONS 150 12 . . . . . . . . . . . . . . . . . . . . . . . . . . ACKNOWLEDGEMENTS 151 13 . . . . . . . . . . . . . . . . . . . . . . . . . AUTHORS' ADDRESSES 152 1. INTRODUCTION 154 There are three fundamental types of IPv4 addresses: unicast, 155 broadcast, and multicast. A unicast address is used to transmit a 156 packet to a single destination. A broadcast address is used to send a 157 datagram to an entire subnetwork. A multicast address is designed to 158 enable the delivery of datagrams to a set of hosts that have been 159 configured as members of a multicast group across various 160 subnetworks. 162 Multicasting is not connection-oriented. A multicast datagram is 163 delivered to destination group members with the same "best-effort" 164 reliability as a standard unicast IP datagram. This means that 165 multicast datagrams are not guaranteed to reach all members of a group, 166 nor to arrive in the same order in which they were transmitted. 168 The only difference between a multicast IP packet and a unicast IP 169 packet is the presence of a 'group address' in the Destination Address 170 field of the IP header. Instead of a Class A, B, or C IP destination 171 address, multicasting employs a Class D address format, which ranges 172 from 224.0.0.0 to 239.255.255.255. 174 1.1 Multicast Groups 176 Individual hosts are free to join or leave a multicast group at any 177 time. 
There are no restrictions on the physical location or the number 178 of members in a multicast group. A host may be a member of more than 179 one multicast group at any given time and does not have to belong to a 180 group to send packets to members of a group. 182 1.2 Group Membership Protocol 184 A group membership protocol is employed by routers to learn about the 185 presence of group members on their directly attached subnetworks. When 186 a host joins a multicast group, it transmits a group membership protocol 187 message for the group(s) that it wishes to receive, and sets its IP 188 process and network interface card to receive frames addressed to the 189 multicast group. This receiver-initiated join process has excellent 190 scaling properties since, as the multicast group increases in size, it 191 becomes ever more likely that a new group member will be able to locate 192 a nearby branch of the multicast delivery tree. 194 [This space was intentionally left blank.] 195 ======================================================================== 196 _ _ _ _ 197 |_| |_| |_| |_| 198 '-' '-' '-' '-' 199 | | | | 200 <- - - - - - - - - -> 201 | 202 | 203 v 204 Router 205 ^ 206 / \ 207 _ ^ + + ^ _ 208 |_|-| / \ |-|_| 209 '_' | + + | '_' 210 _ | v v | _ 211 |_|-|- - >|Router| <- + - + - + -> |Router|<- -|-|_| 212 '_' | | '_' 213 _ | | _ 214 |_|-| |-|_| 215 '_' | | '_' 216 v v 218 LEGEND 220 <- - - -> Group Membership Protocol 221 <-+-+-+-> Multicast Routing Protocol 223 Figure 1: Multicast IP Delivery Service 224 ======================================================================= 226 1.3 Multicast Routing Protocols 228 Multicast routers execute a multicast routing protocol to define 229 delivery paths that enable the forwarding of multicast datagrams 230 across an internetwork. 232 1.3.1 Multicast Routing vs. Multicast Forwarding 234 Multicast routing protocols establish or help establish the distribution 235 tree for a given group, which enables multicast forwarding of packets 236 addressed to the group. In the case of unicast, routing protocols are 237 also used to build a forwarding table (commonly called a routing table). 238 Unicast destinations are entered in the routing table, and associated 239 with a metric and a next-hop router toward the destination. The key 240 difference between unicast forwarding and multicast forwarding is that 241 multicast packets must be forwarded away from their source. If a packet 242 is ever forwarded back toward its source, a forwarding loop could have 243 formed, possibly leading to a multicast "storm." 244 Each routing protocol constructs a forwarding table in its own way; the 245 forwarding table tells each router that for a certain source, or for a 246 given source sending to a certain group (called a (source, group) pair), 247 packets are expected to arrive on a certain "inbound" or "upstream" 248 interface and must be copied to certain (set of) "outbound" or 249 "downstream" interface(s) in order to reach all known subnetworks with 250 group members. 252 2. MULTICAST SUPPORT FOR EMERGING INTERNET APPLICATIONS 254 Today, the majority of Internet applications rely on point-to-point 255 transmission. The utilization of point-to-multipoint transmission has 256 traditionally been limited to local area network applications. Over the 257 past few years the Internet has seen a rise in the number of new 258 applications that rely on multicast transmission. 
Multicast IP 259 conserves bandwidth by forcing the network to do packet replication only 260 when necessary, and offers an attractive alternative to unicast 261 transmission for the delivery of network ticker tapes, live stock 262 quotes, multiparty videoconferencing, and shared whiteboard applications 263 (among others). It is important to note that the applications for IP 264 Multicast are not solely limited to the Internet. Multicast IP can also 265 play an important role in large commercial internetworks. 267 2.1 Reducing Network Load 269 Assume that a stock ticker application is required to transmit packets 270 to 100 stations within an organization's network. Unicast transmission 271 to this set of stations will require the periodic transmission of 100 272 packets where many packets may in fact be traversing the same link(s). 273 Multicast transmission is the ideal solution for this type of 274 application since it requires only a single packet stream to be 275 transmitted by the source which is replicated at forks in the multicast 276 delivery tree. 278 Broadcast transmission is not an effective solution for this type of 279 application since it affects the CPU performance of each and every 280 station that sees the packet. Besides, it wastes bandwidth. 282 2.2 Resource Discovery 284 Some applications utilize multicast instead of broadcast transmission 285 to transmit packets to group members residing on the same subnetwork. 286 However, there is no reason to limit the extent of a multicast 287 transmission to a single LAN. The time-to-live (TTL) field in the IP 288 header can be used to limit the range (or "scope") of a multicast 289 transmission. 291 2.3 Support for Datacasting Applications 293 Since 1992, the IETF has conducted a series of "audiocast" experiments 294 in which live audio and video were multicast from the IETF meeting site 295 to destinations around the world. In this case, "datacasting" takes 296 compressed audio and video signals from the source station and transmits 297 them as a sequence of UDP packets to a group address. Multicast 298 delivery today is not limited to audio and video. Stock quote systems 299 are one example of a (connectionless) data-oriented multicast 300 application. Someday reliable multicast transport protocols may 301 facilitate efficient inter-computer communication. Reliable multicast 302 transport protocols are currently an active area of research and 303 development. 305 3. THE INTERNET'S MULTICAST BACKBONE (MBone) 307 The Internet Multicast Backbone (MBone) is an interconnected set of 308 subnetworks and routers that support the delivery of IP multicast 309 traffic. The goal of the MBone is to construct a semipermanent IP 310 multicast testbed to enable the deployment of multicast applications 311 without waiting for the ubiquitous deployment of multicast-capable 312 routers in the Internet. 314 The MBone has grown from 40 subnets in four different countries in 1992, 315 to more than 3400 subnets in over 25 countries by March 1997. With 316 new multicast applications and multicast-based services appearing, it 317 seems likely that the use of multicast technology in the Internet will 318 keep growing at an ever-increasing rate. 320 The MBone is a virtual network that is layered on top of sections of the 321 physical Internet. It is composed of islands of multicast routing 322 capability connected to other islands by virtual point-to-point links 323 called "tunnels." 
The tunnels allow multicast traffic to pass through the
non-multicast-capable parts of the Internet. Tunneled IP multicast
packets are encapsulated as IP-over-IP (i.e., the protocol number is
set to 4) so they look like normal unicast packets to intervening
routers. The encapsulation is added on entry to a tunnel and stripped
off on exit from a tunnel. This set of multicast routers, their
directly-connected subnetworks, and the interconnecting tunnels
comprise the MBone.

Since the MBone and the Internet have different topologies, multicast
routers execute a separate routing protocol to decide how to forward
multicast packets. The majority of the MBone routers currently use the
Distance Vector Multicast Routing Protocol (DVMRP), although some
portions of the MBone execute either Multicast OSPF (MOSPF) or the
Protocol-Independent Multicast (PIM) routing protocols. The operation
of each of these protocols is discussed later in this paper.

========================================================================

                           +++++++
                      /    |Island |    \
                   /T/     |   A   |     \T\
                /U/        +++++++        \U\
             /N/              |              \N\
          /N/                 |                 \N\
       /E/                    |                    \E\
    /L/                       |                       \L\
 ++++++++                  +++++++                  ++++++++
 | Island |                | Island|      ---------| Island |
 |   B    |                |   C   |       Tunnel  |   D    |
 ++++++++                  +++++++        ---------++++++++
     \  \                     |
      \T\                     |
       \U\                    |
        \N\                   |
         \N\               +++++++
          \E\              |Island |
           \L\|            |   E   |
             \             +++++++

           Figure 2: Internet Multicast Backbone (MBone)

========================================================================

As multicast routing software features become more widely available on
the routers of the Internet, providers may gradually decide to use
"native" multicast as an alternative to using lots of tunnels.

The MBone carries audio and video multicasts of Internet Engineering
Task Force (IETF) meetings, NASA Space Shuttle missions, US House and
Senate sessions, and live satellite weather photos. There are public
and private sessions on the MBone. Sessions that are meant for public
viewing or participation are announced via the session directory (SDR)
tool. A user of this tool can see a list of current and future public
sessions, provided the user is within the administrative scope of the
sender.

4. MULTICAST ADDRESSING

A multicast address is assigned to a set of receivers defining a
multicast group. Senders use the multicast address as the destination
IP address of a packet that is to be transmitted to all group members.

4.1 Class D Addresses

An IP multicast group is identified by a Class D address. Class D
addresses have their high-order four bits set to "1110" followed by a
28-bit multicast group ID. Expressed in standard "dotted-decimal"
notation, multicast group addresses range from 224.0.0.0 to
239.255.255.255 (shorthand: 224.0.0.0/4).

Figure 3 shows the format of a 32-bit Class D address.

========================================================================

 0 1 2 3                                                      31
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|1|1|0|                   Multicast Group ID                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |------------------------28 bits------------------------|

              Figure 3: Class D Multicast Address Format

========================================================================

The Internet Assigned Numbers Authority (IANA) maintains a list of
registered IP multicast groups.
The base address 224.0.0.0 is reserved and cannot be assigned to any
group. The block of multicast addresses ranging from 224.0.0.1 to
224.0.0.255 is reserved for permanent assignment to various uses,
including routing protocols and other protocols that require a
well-known permanent address. Multicast routers should not forward any
multicast datagram with a destination address in this range, regardless
of the packet's TTL.

Some of the well-known groups include:

   "all systems on this subnet"       224.0.0.1
   "all routers on this subnet"       224.0.0.2
   "all DVMRP routers"                224.0.0.4
   "all OSPF routers"                 224.0.0.5
   "all OSPF designated routers"      224.0.0.6
   "all RIP2 routers"                 224.0.0.9
   "all PIM routers"                  224.0.0.13
   "all CBT routers"                  224.0.0.15

The remaining groups, ranging from 224.0.1.0 to 239.255.255.255, are
assigned to various multicast applications or remain unassigned. From
this range, the addresses from 239.0.0.0 to 239.255.255.255 are
reserved for various "administratively scoped" applications, which are
not necessarily Internet-wide applications.

The complete list may be found in the Assigned Numbers RFC (RFC 1700 or
its successor) or at the IANA web site.

4.2 Mapping a Class D Address to an IEEE-802 MAC Address

The IANA has been allocated a reserved portion of the IEEE-802
MAC-layer multicast address space. All of the addresses in IANA's
reserved block begin with 01-00-5E (hex); the block runs from
01-00-5E-00-00-00 to 01-00-5E-FF-FF-FF, and its lower half
(01-00-5E-00-00-00 through 01-00-5E-7F-FF-FF) is used for IP multicast
groups.

A simple procedure was developed to map Class D addresses to this
reserved MAC-layer multicast address block. This allows IP multicasting
to easily take advantage of the hardware-level multicasting supported
by network interface cards.

The mapping between a Class D IP address and an IEEE-802 (e.g., FDDI,
Ethernet) MAC-layer multicast address is obtained by placing the
low-order 23 bits of the Class D address into the low-order 23 bits of
IANA's reserved MAC-layer multicast address block. This simple
procedure removes the need for an explicit protocol for multicast
address resolution on LANs akin to ARP for unicast. All LAN stations
know this simple transformation, and can easily send any IP multicast
over any IEEE-802-based LAN.

Figure 4 illustrates how the multicast group address 234.138.8.5 (or
EA-8A-08-05 expressed in hex) is mapped into an IEEE-802 multicast
address. Note that the high-order nine bits of the IP address are not
mapped into the MAC-layer multicast address.

The mapping in Figure 4 places the low-order 23 bits of the IP
multicast group ID into the low-order 23 bits of the IEEE-802 multicast
address. Note that the mapping may place up to 32 different IP groups
into the same IEEE-802 address, because the upper five bits of the IP
group ID are not used. Thus, there is a 32-to-1 ratio of IP Class D
addresses to valid MAC-layer multicast addresses. In practice, there is
a small chance of collision, should multiple groups happen to pick
Class D addresses that map to the same MAC-layer multicast address.
However, higher-layer protocols generally allow hosts to determine
which packets are intended for them (i.e., the chance of two different
groups picking both the same Class D address and the same set of UDP
ports is extremely small).
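To make the transformation concrete, the following short Python sketch
(purely illustrative; it is not part of any protocol specification)
computes the IEEE-802 multicast address for a given Class D address by
prefixing 01-00-5E to the low-order 23 bits of the group ID:

   import ipaddress

   def group_mac(class_d: str) -> str:
       """Map a Class D IP address to its IEEE-802 multicast MAC address."""
       group = int(ipaddress.IPv4Address(class_d))
       if group >> 28 != 0b1110:
           raise ValueError("not a Class D (multicast) address")
       low23 = group & 0x7FFFFF          # low-order 23 bits of the group ID
       mac = (0x01005E << 24) | low23    # prepend the fixed 01-00-5E prefix
       return "-".join(f"{(mac >> s) & 0xFF:02X}" for s in range(40, -1, -8))

   print(group_mac("234.138.8.5"))   # 01-00-5E-0A-08-05 (the Figure 4 example)
   print(group_mac("224.10.8.5"))    # 01-00-5E-0A-08-05 (a colliding group)

Because only 23 of the 28 group ID bits survive the mapping, any two
groups whose addresses differ only in the unmapped bits (as in the
second call above) share the same MAC-layer address.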
For example, the class D addresses 224.10.8.5 (E0-0A-08-05) 475 and 225.138.8.5 (E1-8A-08-05) map to the same IEEE-802 MAC-layer 476 multicast address (01-00-5E-0A-08-05) used in this example. 478 ======================================================================== 480 Class D Address: 234.138.8.5 (EA-8A-08-05) 482 | E A | 8 483 Class-D IP |_______ _______|__ _ _ _ 484 Address |-+-+-+-+-+-+-+-|-+ - - - 485 |1 1 1 0 1 0 1 0|1 486 |-+-+-+-+-+-+-+-|-+ - - - 487 ................... 488 IEEE-802 ....not......... 489 MAC-Layer .............. 490 Multicast ....mapped.. 491 Address ........... 492 |-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+ - - - 493 |0 0 0 0 0 0 0 1|0 0 0 0 0 0 0 0|0 1 0 1 1 1 1 0|0 494 |-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+ - - - 495 |_______ _______|_______ _______|_______ _______|_______ 496 | 0 1 | 0 0 | 5 E | 0 498 [Address mapping below continued from half above] 500 | 8 A | 0 8 | 0 5 | 501 |_______ _______|_______ _______|_______ _______| Class-D IP 502 - - - -|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-| Address 503 | 0 0 0 1 0 1 0|0 0 0 0 1 0 0 0|0 0 0 0 0 1 0 1| 504 - - - -|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-| 505 \____________ ____________________/ 506 \___ ___/ 507 \ / 508 | 509 23 low-order bits mapped 510 | 511 v 513 - - - -|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-| IEEE-802 514 | 0 0 0 1 0 1 0|0 0 0 0 1 0 0 0|0 0 0 0 0 1 0 1| MAC-Layer 515 - - - -|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-| Multicast 516 |_______ _______|_______ _______|_______ _______| Address 517 | 0 A | 0 8 | 0 5 | 519 Figure 4: Mapping between Class D and IEEE-802 Multicast Addresses 520 ======================================================================== 521 4.3 Transmission and Delivery of Multicast Datagrams 523 When the sender and receivers are members of the same (LAN) subnetwork, 524 the transmission and reception of multicast frames is a straightforward 525 process. The source station simply addresses the IP packet to the 526 multicast group, the network interface card maps the Class D address to 527 the corresponding IEEE-802 multicast address, and the frame is sent. 528 Receivers that wish to capture the frame notify their MAC and IP layers 529 that they want to receive datagrams addressed to the group. 531 Things become somewhat more complex when the sender is attached to one 532 subnetwork and receivers reside on different subnetworks. In this case, 533 the routers must implement a multicast routing protocol that permits the 534 construction of multicast delivery trees and supports multicast packet 535 forwarding. In addition, each router needs to implement a group 536 membership protocol that allows it to learn about the existence of group 537 members on its directly attached subnetworks. 539 5. INTERNET GROUP MANAGEMENT PROTOCOL (IGMP) 541 The Internet Group Management Protocol (IGMP) runs between hosts and 542 their immediately-neighboring multicast routers. The mechanisms of the 543 protocol allow a host to inform its local router that it wishes to 544 receive transmissions addressed to a specific multicast group. Also, 545 routers periodically query the LAN to determine if any group members are 546 still active. If there is more than one IP multicast router on the LAN, 547 one of the routers is elected "querier" and assumes the responsibility of 548 querying the LAN for the presence of any group members. 
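From an application's point of view, membership in a group is normally
requested through the host's sockets interface; the host's IP stack
then originates the IGMP messages described in this section. The
following minimal Python sketch of a receiver joining a group is
illustrative only (the group address and UDP port are arbitrary
examples):

   import socket
   import struct

   GROUP = "233.252.0.1"   # example group address
   PORT = 5004             # example UDP port

   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
   sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
   sock.bind(("", PORT))

   # Joining the group is what causes the host to transmit an IGMP Report;
   # dropping membership (IP_DROP_MEMBERSHIP) is what IGMP version 2's
   # Leave Group message, described below, reports to the router.
   mreq = struct.pack("4s4s", socket.inet_aton(GROUP),
                      socket.inet_aton("0.0.0.0"))
   sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

   while True:
       data, sender = sock.recvfrom(1500)
       print(len(data), "bytes from", sender)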
550 Based on the group membership information learned from the IGMP, a 551 router is able to determine which (if any) multicast traffic needs to be 552 forwarded to each of its "leaf" subnetworks. Multicast routers use this 553 information, in conjunction with a multicast routing protocol, to 554 support IP multicasting across the Internet. 556 5.1 IGMP Version 1 558 IGMP Version 1 was specified in RFC-1112. According to the 559 specification, multicast routers periodically transmit Host Membership 560 Query messages to determine which host groups have members on their 561 directly-attached networks. IGMP Query messages are addressed to the 562 all-hosts group (224.0.0.1) and have an IP TTL = 1. This means that 563 Query messages sourced from a router are transmitted onto the 564 directly-attached subnetwork but are not forwarded by any other 565 multicast routers. 567 When a host receives an IGMP Query message, it responds with a Host 568 Membership Report for each group to which it belongs, sent to each group 569 to which it belongs. (This is an important point: While IGMP Queries 570 ======================================================================== 572 Group 1 _____________________ 573 ____ ____ | multicast | 574 | | | | | router | 575 |_H2_| |_H4_| |_____________________| 576 ---- ---- +-----+ | 577 | | <-----|Query| | 578 | | +-----+ | 579 | | | 580 |---+----+-------+-------+--------+-----------------------+----| 581 | | | 582 | | | 583 ____ ____ ____ 584 | | | | | | 585 |_H1_| |_H3_| |_H5_| 586 ---- ---- ---- 587 Group 2 Group 1 Group 1 588 Group 2 590 Figure 5: Internet Group Management Protocol-Query Message 591 ======================================================================== 593 are sent to the "all hosts on this subnet" class D address (224.0.0.1), 594 IGMP Reports are sent to the group(s) to which the host(s) belong. 595 IGMP Reports, like Queries, are sent with the IP TTL = 1, and thus are 596 not forwarded beyond the local subnetwork.) 598 In order to avoid a flurry of Reports, each host starts a randomly- 599 chosen Report delay timer for each of its group memberships. If, during 600 the delay period, another Report is heard for the same group, every 601 other host in that group must reset its timer to a new random value. 602 This procedure spreads Reports out over a period of time and thus 603 minimizes Report traffic for each group that has at least one member on 604 a given subnetwork. 606 It should be noted that multicast routers do not need to be directly 607 addressed since their interfaces are required to promiscuously receive 608 all multicast IP traffic. Also, a router does not need to maintain a 609 detailed list of which hosts belong to each multicast group; the router 610 only needs to know that at least one group member is present on a given 611 network interface. 613 Multicast routers periodically transmit IGMP Queries to update their 614 knowledge of the group members present on each network interface. If 615 the router does not receive a Report from any members of a particular 616 group after a number of Queries, the router assumes that group members 617 are no longer present on an interface. Assuming this is a leaf subnet 618 (i.e., a subnet with group members but no multicast routers connecting 619 to additional group members further downstream), this interface is 620 removed from the delivery tree(s) for this group. 
Multicasts will continue to be sent on this interface only if the
router can tell (via multicast routing protocols) that there are
additional group members further downstream reachable via this
interface.

When a host first joins a group, it immediately transmits an IGMP
Report for the group rather than waiting for a router's IGMP Query.
This reduces the "join latency" for the first host to join a given
group on a particular subnetwork. "Join latency" is measured from the
time when a host's first IGMP Report is sent, until the transmission of
the first packet for that group onto that host's subnetwork. Of course,
if the group is already active, the join latency is precisely zero.

5.2 IGMP Version 2

IGMP version 2 was distributed as part of the Distance Vector Multicast
Routing Protocol (DVMRP) implementation ("mrouted") source code, from
version 3.3 through 3.8. Initially, there was no detailed specification
for IGMP version 2 other than this source code. However, the complete
specification has recently been published as an Internet-Draft, which
will update the specification contained in the first appendix of
RFC-1112. IGMP version 2 extends IGMP version 1 while maintaining
backward compatibility with version 1 hosts.

IGMP version 2 defines a procedure for the election of the multicast
querier for each LAN. In IGMP version 2, the multicast router with the
lowest IP address on the LAN is elected the multicast querier. In IGMP
version 1, the querier election was determined by the multicast routing
protocol.

IGMP version 2 defines a new type of Query message: the Group-Specific
Query. Group-Specific Query messages allow a router to transmit a Query
to a specific multicast group rather than to all groups residing on a
directly attached subnetwork.

Finally, IGMP version 2 defines a Leave Group message to lower IGMP's
"leave latency." When the last host to respond to a Query with a Report
wishes to leave that specific group, the host transmits a Leave Group
message to the all-routers group (224.0.0.2) with the group field set
to the group being left. In response to a Leave Group message, the
router begins the transmission of Group-Specific Query messages on the
interface that received the Leave Group message. If there are no
Reports in response to the Group-Specific Query messages, then (if this
is a leaf subnet) this interface is removed from the delivery tree(s)
for this group, as was the case with IGMP version 1. Again, multicasts
will continue to be sent on this interface if the router can tell (via
multicast routing protocols) that there are additional group members
further downstream reachable via this interface.

"Leave latency" is measured from a router's perspective. In version 1
of IGMP, leave latency was the time from a router's hearing the last
Report for a given group, until the router aged out that interface from
the delivery tree for that group (assuming this is a leaf subnet, of
course). Note that the only way for the router to tell that this was
the last group member is that no Reports are heard in some multiple of
the Query Interval (this is on the order of minutes). IGMP version 2,
with the addition of the Leave Group message, allows a group member to
more quickly inform the router that it is done receiving traffic for a
group.
The router then must determine if this host was the last member of this
group on this subnetwork. To do this, the router quickly queries the
subnetwork for other group members via the Group-Specific Query
message. If no members send Reports after several of these
Group-Specific Queries, the router can infer that the last member of
that group has, indeed, left the subnetwork. The benefit of lowering
the leave latency is that prune messages can be sent as soon as
possible after the last member host drops out of the group, instead of
having to wait for several minutes' worth of Query intervals to pass.
If a group was experiencing high traffic levels, it can be very
beneficial to stop transmitting data for this group as soon as
possible.

5.3 IGMP Version 3

IGMP version 3 is a preliminary specification, published as an
Internet-Draft. IGMP version 3 introduces support for Group-Source
Report messages so that a host can elect to receive traffic from
specific sources of a multicast group. An Inclusion Group-Source Report
message allows a host to specify the IP addresses of the specific
sources it wants to receive. An Exclusion Group-Source Report message
allows a host to explicitly identify the sources that it does not want
to receive. With IGMP version 1 and version 2, if a host wants to
receive any traffic for a group, the traffic from all sources for the
group must be forwarded onto the host's subnetwork.

IGMP version 3 will help conserve bandwidth by allowing a host to
select the specific sources from which it wants to receive traffic.
Also, multicast routing protocols will be able to make use of this
information to conserve bandwidth when constructing the branches of
their multicast delivery trees.

Finally, support for Leave Group messages, first introduced in IGMP
version 2, has been enhanced to support Group-Source Leave messages.
This feature allows a host to leave an entire group or to specify the
specific IP address(es) of the (source, group) pair(s) that it wishes
to leave. Note that at this time, not all existing multicast routing
protocols have mechanisms to support such requests from group members.
This is one issue that will be addressed during the development of IGMP
version 3.

6. MULTICAST FORWARDING TECHNIQUES

IGMP provides the final step in a multicast packet delivery service
since it is only concerned with the forwarding of multicast traffic
from a router to group members on its directly-attached subnetworks.
IGMP is not concerned with the delivery of multicast packets between
neighboring routers or across an internetwork.

To provide an internetwork delivery service, it is necessary to define
multicast routing protocols. A multicast routing protocol is
responsible for the construction of multicast delivery trees and
enabling multicast packet forwarding.
This section explores a number of different techniques that may
potentially be employed by multicast routing protocols:

   o "Simpleminded" Techniques
       - Flooding
       - Spanning Trees

   o Source-Based Tree (SBT) Techniques
       - Reverse Path Broadcasting (RPB)
       - Truncated Reverse Path Broadcasting (TRPB)
       - Reverse Path Multicasting (RPM)

   o "Shared-Tree" Techniques

Later sections will describe how these algorithms are implemented in
the most prevalent multicast routing protocols in the Internet today
(e.g., the Distance Vector Multicast Routing Protocol (DVMRP),
Multicast Extensions to OSPF (MOSPF), Protocol-Independent Multicast
(PIM), and Core-Based Trees (CBT)).

6.1 "Simpleminded" Techniques

Flooding and Spanning Trees are two algorithms that can be used to
build primitive multicast routing protocols. These techniques are
primitive because they tend to waste bandwidth or to require a large
amount of computational resources within the multicast routers
involved. Protocols built on these techniques may work for small
networks with few senders, groups, and routers, but they do not scale
well to larger numbers of senders, groups, or routers. Also, the
ability to handle arbitrary topologies may be absent, or present only
in limited ways.

6.1.1 Flooding

The simplest technique for delivering multicast datagrams to all
routers in an internetwork is to implement a flooding algorithm. The
flooding procedure begins when a router receives a packet that is
addressed to a multicast group. The router employs a protocol mechanism
to determine whether or not it has seen this particular packet before.
If it is the first reception of the packet, the packet is forwarded on
all interfaces (except the one on which it arrived), guaranteeing that
the multicast packet reaches all routers in the internetwork. If the
router has seen the packet before, the packet is discarded.

A flooding algorithm is very simple to implement since a router does
not have to maintain a routing table and only needs to keep track of
the most recently seen packets. However, flooding does not scale for
Internet-wide applications since it generates a large number of
duplicate packets and uses all available paths across the internetwork
instead of just a limited number. Also, the flooding algorithm makes
inefficient use of router memory resources since each router is
required to maintain a distinct table entry for each recently seen
packet.

6.1.2 Spanning Tree

A more effective solution than flooding would be to select a subset of
the internetwork topology which forms a spanning tree. The spanning
tree defines a structure in which only one active path connects any two
routers of the internetwork. Figure 6 shows an internetwork and a
spanning tree rooted at router RR.

Once the spanning tree has been built, a multicast router simply
forwards each multicast packet to all interfaces that are part of the
spanning tree except the one on which the packet originally arrived.
Forwarding along the branches of a spanning tree guarantees that the
multicast packet will not loop and that it will eventually reach all
routers in the internetwork.

A spanning tree solution is powerful and would be relatively easy to
implement since there is a great deal of experience with spanning tree
protocols in the Internet community.
However, a spanning tree solution 800 can centralize traffic on a small number of links, and may not provide 801 the most efficient path between the source subnetwork and group members. 802 Also, it is computationally difficult to compute a spanning tree in 803 large, complex topologies. 805 6.2 Source-Based Tree Techniques 807 The following techniques all generate a source-based tree by various 808 means. The techniques differ in the efficiency of the tree building 809 process, and the bandwidth and router resources (i.e., state tables) 810 used to build a source-based tree. 812 6.2.1 Reverse Path Broadcasting (RPB) 814 A more efficient solution than building a single spanning tree for the 815 entire internetwork would be to build a spanning tree for each potential 816 source [subnetwork]. These spanning trees would result in source-based 817 delivery trees emanating from the subnetworks directly connected to the 818 ======================================================================== 820 A Sample Internetwork 822 #----------------# 823 / |\ / \ 824 | | \ / \ 825 | | \ / \ 826 | | \ / \ 827 | | \ / \ 828 | | #------# \ 829 | | / | \ \ 830 | | / | \ \ 831 | \ / | \-------# 832 | \ / | -----/| 833 | #-----------#----/ | 834 | /|\--- --/| \ | 835 | / | \ / \ \ | 836 | / \ /\ | \ / 837 | / \ / \ | \ / 838 #---------#-- \ | ----# 839 \ \ | / 840 \--- #-/ 842 A Spanning Tree for this Sample Internetwork 844 # # 845 \ / 846 \ / 847 \ / 848 \ / 849 \ / 850 #------RR 851 | \ 852 | \ 853 | \-------# 854 | 855 #-----------#---- 856 /| | \ 857 / | \ \ 858 / \ | \ 859 / \ | \ 860 # # | # 861 | 862 # 863 LEGEND 865 # Router 866 RR Root Router 868 Figure 6: Spanning Tree 869 ======================================================================== 870 source stations. Since there are many potential sources for a group, a 871 different delivery tree is constructed rooted at each active source. 873 6.2.1.1 Reverse Path Broadcasting: Operation 875 The fundamental algorithm to construct these source-based trees is 876 referred to as Reverse Path Broadcasting (RPB). The RPB algorithm is 877 actually quite simple. For each source, if a packet arrives on a link 878 that the local router believes to be on the shortest path back toward 879 the packet's source, then the router forwards the packet on all 880 interfaces except the incoming interface. If the packet does not 881 arrive on the interface that is on the shortest path back toward the 882 source, then the packet is discarded. The interface over which the 883 router expects to receive multicast packets from a particular source is 884 referred to as the "parent" link. The outbound links over which the 885 router forwards the multicast packet are called "child" links for this 886 source. 888 This basic algorithm can be enhanced to reduce unnecessary packet 889 duplication. If the local router making the forwarding decision can 890 determine whether a neighboring router on a child link is "downstream," 891 then the packet is multicast toward the neighbor. (A "downstream" 892 neighbor is a neighboring router which considers the local router to be 893 on the shortest path back toward a given source.) Otherwise, the packet 894 is not forwarded on the potential child link since the local router 895 knows that the neighboring router will just discard the packet (since it 896 will arrive on a non-parent link for the source, relative to that 897 downstream router). 
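Expressed as pseudocode (Python is used here only for illustration; the
router object and its helper functions are assumptions of this sketch,
not part of any protocol specification), the enhanced RPB forwarding
decision looks roughly like this:

   def rpb_forward(packet, arrival_iface, router):
       # parent_iface(src):   interface on the shortest path back toward src
       # child_ifaces(src):   every other interface
       # neighbors_on(iface): neighboring routers reachable via iface
       # is_downstream(nbr, src): True if neighbor nbr considers this router
       #     to be on its own shortest path back toward src
       src = packet.source
       if arrival_iface != router.parent_iface(src):
           return                       # not the parent link: discard
       for iface in router.child_ifaces(src):
           neighbors = router.neighbors_on(iface)
           # Forward unless every router on this child link would discard it.
           if not neighbors or any(router.is_downstream(n, src)
                                   for n in neighbors):
               router.send(packet, iface)

In practice, the "downstream" test would be derived from the unicast
routing information, as discussed below.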
899 ======================================================================== 901 Source 902 | ^ 903 | : shortest path back to the 904 | : source for THIS router 905 | : 906 "parent link" 907 _ 908 ______|!2|_____ 909 | | 910 --"child -|!1| |!3| - "child -- 911 link" | ROUTER | link" 912 |_______________| 914 Figure 7: Reverse Path Broadcasting - Forwarding Algorithm 915 ======================================================================== 916 The information to make this "downstream" decision is relatively easy to 917 derive from a link-state routing protocol since each router maintains a 918 topological database for the entire routing domain. If a distance- 919 vector routing protocol is employed, a neighbor can either advertise its 920 previous hop for the source as part of its routing update messages or 921 "poison reverse" the route toward a source if it is not on the 922 distribution tree for that source. Either of these techniques allows an 923 upstream router to determine if a downstream neighboring router is on an 924 active branch of the delivery tree for a certain source. 926 Please refer to Figure 8 for a discussion describing the basic operation 927 of the enhanced RPB algorithm. 929 ======================================================================== 931 Source Station------>O 932 A # 933 +|+ 934 + | + 935 + O + 936 + + 937 1 2 938 + + 939 + + 940 + + 941 B + + C 942 O-#- - - - -3- - - - -#-O 943 +|+ -|+ 944 + | + - | + 945 + O + - O + 946 + + - + 947 + + - + 948 4 5 6 7 949 + + - + 950 + + E - + 951 + + - + 952 D #- - - - -8- - - - -#- - - - -9- - - - -# F 953 | | | 954 O O O 956 LEGEND 958 O Leaf 959 + + Shortest-path 960 - - Branch 961 # Router 963 Figure 8: Reverse Path Broadcasting - Example 964 ======================================================================== 965 Note that the source station (S) is attached to a leaf subnetwork 966 directly connected to Router A. For this example, we will look at the 967 RPB algorithm from Router B's perspective. Router B receives the 968 multicast packet from Router A on link 1. Since Router B considers link 969 1 to be the parent link for the (source, group) pair, it forwards the 970 packet on link 4, link 5, and the local leaf subnetworks if they contain 971 group members. Router B does not forward the packet on link 3 because 972 it knows from routing protocol exchanges that Router C considers link 2 973 as its parent link for the source. Router B knows that if it were to 974 forward the packet on link 3, it would be discarded by Router C since 975 the packet would not be arriving on Router C's parent link for this 976 source. 978 6.2.1.2 RPB: Benefits and Limitations 980 The key benefit to reverse path broadcasting is that it is reasonably 981 efficient and easy to implement. It does not require that the router 982 know about the entire spanning tree, nor does it require a special 983 mechanism to stop the forwarding process (as flooding does). In 984 addition, it guarantees efficient delivery since multicast packets 985 always follow the "shortest" path from the source station to the 986 destination group. Finally, the packets are distributed over multiple 987 links, resulting in better network utilization since a different tree is 988 computed for each source. 990 One of the major limitations of the RPB algorithm is that it does not 991 take into account multicast group membership when building the delivery 992 tree for a source. 
As a result, datagrams may be unnecessarily 993 forwarded onto subnetworks that have no members in a destination group. 995 6.2.2 Truncated Reverse Path Broadcasting (TRPB) 997 Truncated Reverse Path Broadcasting (TRPB) was developed to overcome the 998 limitations of Reverse Path Broadcasting. With information provided by 999 IGMP, multicast routers determine the group memberships on each leaf 1000 subnetwork and avoid forwarding datagrams onto a leaf subnetwork if it 1001 does not contain at least one member of a given destination group. Thus, 1002 the delivery tree is "truncated" by the router if a leaf subnetwork has 1003 no group members. 1005 Figure 9 illustrates the operation of TRPB algorithm. In this example 1006 the router receives a multicast packet on its parent link for the 1007 Source. The router forwards the datagram on interface 1 since that 1008 interface has at least one member of G1. The router does not forward 1009 the datagram to interface 3 since this interface has no members in the 1010 destination group. The datagram is forwarded on interface 4 if and only 1011 if a downstream router considers this subnetwork to be part of its 1012 "parent link" for the Source. 1014 ====================================================================== 1016 Source 1017 | : 1018 : 1019 | : (Source, G1) 1020 v 1021 | 1022 "parent link" 1023 | 1024 "child link" ___ 1025 G1 _______|2|_____ 1026 \ | | 1027 G3\\ _____ ___ ROUTER ___ ______ / G2 1028 \| hub |--|1| |3|-----|switch|/ 1029 /|_____| ^-- ___ --^ |______|\ 1030 / ^ |______|4|_____| ^ \ 1031 G1 ^ .^--- ^ G3 1032 ^ .^ | ^ 1033 ^ .^ "child link" ^ 1034 Forward | Truncate 1036 Figure 9: Truncated Reverse Path Broadcasting - (TRPB) 1037 ====================================================================== 1039 TRPB removes some limitations of RPB but it solves only part of the 1040 problem. It eliminates unnecessary traffic on leaf subnetworks but it 1041 does not consider group memberships when building the branches of the 1042 delivery tree. 1044 6.2.3 Reverse Path Multicasting (RPM) 1046 Reverse Path Multicasting (RPM) is an enhancement to Reverse Path 1047 Broadcasting and Truncated Reverse Path Broadcasting. 1049 RPM creates a delivery tree that spans only: 1051 o Subnetworks with group members, and 1053 o Routers and subnetworks along the shortest 1054 path to subnetworks with group members. 1056 RPM allows the source-based "shortest-path" tree to be pruned so that 1057 datagrams are only forwarded along branches that lead to active members 1058 of the destination group. 1060 6.2.3.1 Operation 1062 When a multicast router receives a packet for a (source, group) pair, 1063 the first packet is forwarded following the TRPB algorithm across all 1064 routers in the internetwork. Routers on the edge of the network (which 1065 have only leaf subnetworks) are called leaf routers. The TRPB algorithm 1066 guarantees that each leaf router will receive at least the first 1067 multicast packet. If there is a group member on one of its leaf 1068 subnetworks, a leaf router forwards the packet based on this group 1069 membership information. 
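The decision taken at a leaf router can be sketched as follows (again
in illustrative Python with invented helper names); the truncation step
is the TRPB behavior, and the prune message in the final step
anticipates the RPM mechanism described below:

   def leaf_router_deliver(packet, router):
       # leaf_ifaces(): interfaces with no downstream multicast routers
       # igmp_members(iface, group): True if IGMP has heard a Report for
       #     group on iface
       delivered = False
       for iface in router.leaf_ifaces():
           if router.igmp_members(iface, packet.group):
               router.send(packet, iface)   # at least one member: deliver
               delivered = True
           # otherwise truncate: send no copy onto a memberless leaf subnet
       if not delivered:
           # RPM only: with no interested leaf subnetworks, ask the upstream
           # router to stop sending this (source, group) pair.
           router.send_prune(packet.source, packet.group,
                             router.parent_iface(packet.source))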
1071 ======================================================================== 1073 Source 1074 | : 1075 | : (Source, G) 1076 | v 1077 | 1078 | 1079 o-#-G 1080 |********** 1081 ^ | * 1082 , | * 1083 ^ | * o 1084 , | * / 1085 o-#-o #*********** 1086 ^ |\ ^ |\ * 1087 ^ | o ^ | G * 1088 , | , | * 1089 ^ | ^ | * 1090 , | , | * 1091 # # # 1092 /|\ /|\ /|\ 1093 o o o o o o G o G 1094 LEGEND 1096 # Router 1097 o Leaf without group member 1098 G Leaf with group member 1099 *** Active Branch 1100 --- Pruned Branch 1101 ,>, Prune Message (direction of flow --> 1103 Figure 10: Reverse Path Multicasting (RPM) 1104 ======================================================================== 1106 If none of the subnetworks connected to the leaf router contain group 1107 members, the leaf router may transmit a "prune" message on its parent 1108 link, informing the upstream router that it should not forward packets 1109 for this particular (source, group) pair on the child interface on which 1110 it received the prune message. Prune messages are sent just one hop 1111 back toward the source. 1113 An upstream router receiving a prune message is required to store the 1114 prune information in memory. If the upstream router has no recipients 1115 on local leaf subnetworks and has received prune messages from each 1116 downstream neighbor on each of the child interfaces for this (source, 1117 group) pair, then the upstream router does not need to receive 1118 additional packets for the (source, group) pair. This implies that the 1119 upstream router can also generate a prune message of its own, one hop 1120 further back toward the source. This cascade of prune messages results 1121 in an active multicast delivery tree, consisting exclusively of "live" 1122 branches (i.e., branches that lead to active receivers). 1124 Since both the group membership and internetwork topology can change 1125 dynamically, the pruned state of the multicast delivery tree must be 1126 refreshed periodically. At regular intervals, the prune information 1127 expires from the memory of all routers and the next packet for the 1128 (source, group) pair is forwarded toward all downstream routers. This 1129 allows "stale state" (prune state for groups that are no longer active) 1130 to be reclaimed by the multicast routers. 1132 6.2.3.2 Limitations 1134 Despite the improvements offered by the RPM algorithm, there are still 1135 several scaling issues that need to be addressed when attempting to 1136 develop an Internet-wide delivery service. The first limitation is that 1137 multicast packets must be periodically flooded across every router in 1138 the internetwork, onto every leaf subnetwork. This flooding is wasteful 1139 of bandwidth (until the updated prune state is constructed). 1141 This "flood and prune" paradigm is very powerful, but it wastes 1142 bandwidth and does not scale well, especially if there are receivers at 1143 the edge of the delivery tree which are connected via low-speed 1144 technologies (e.g., ISDN or modem). Also, note that every router 1145 participating in the RPM algorithm must either have a forwarding table 1146 entry for a (source, group) pair, or have prune state information for 1147 that (source, group) pair. 1149 It is clearly wasteful (especially as the number of active sources and 1150 groups increase) to place such a burden on routers that are not on every 1151 (or perhaps any) active delivery tree. 
Shared tree techniques are an 1152 attempt to address these scaling issues, which become quite acute when 1153 most groups' senders and receivers are sparsely distributed across the 1154 internetwork. 1156 6.3 Shared Tree Techniques 1158 The most recent additions to the set of multicast forwarding techniques 1159 are based on a shared delivery tree. Unlike shortest-path tree 1160 algorithms which build a source-based tree for each source, or each 1161 (source, group) pair, shared tree algorithms construct a single delivery 1162 tree that is shared by all members of a group. The shared tree approach 1163 is quite similar to the spanning tree algorithm except it allows the 1164 definition of a different shared tree for each group. Stations wishing 1165 to receive traffic for a multicast group must explicitly join the shared 1166 delivery tree. Multicast traffic for each group is sent and received 1167 over the same delivery tree, regardless of the source. 1169 6.3.1 Operation 1171 A shared tree may involve a single router, or set of routers, which 1172 comprise(s) the "core" of a multicast delivery tree. Figure 11 1173 illustrates how a single multicast delivery tree is shared by all 1174 sources and receivers for a multicast group. 1176 ======================================================================== 1178 Source Source Source 1179 | | | 1180 | | | 1181 v v v 1183 [#] * * * * * [#] * * * * * [#] 1184 * 1185 ^ * ^ 1186 | * | 1187 join | * | join 1188 | [#] | 1189 [x] [x] 1190 : : 1191 member member 1192 host host 1194 LEGEND 1196 [#] Shared Tree "Core" Routers 1197 * * Shared Tree Backbone 1198 [x] Member-hosts' directly-attached routers 1200 Figure 11: Shared Multicast Delivery Tree 1202 ======================================================================== 1204 Similar to other multicast forwarding algorithms, shared tree algorithms 1205 do not require that the source of a multicast packet be a member of a 1206 destination group in order to send to a group. 1208 6.3.2 Benefits 1210 In terms of scalability, shared tree techniques have several advantages 1211 over source-based trees. Shared tree algorithms make efficient use of 1212 router resources since they only require a router to maintain state 1213 information for each group, not for each source, or for each (source, 1214 group) pair. (Remember that source-based tree techniques required all 1215 routers in an internetwork to either a) be on the delivery tree for a 1216 given source or (source, group) pair, or b) to have prune state for 1217 that source or (source, group) pair: So the entire internetwork must 1218 participate in the source-based tree protocol.) This improves the 1219 scalability of applications with many active senders since the number of 1220 source stations is no longer a scaling issue. Also, shared tree 1221 algorithms conserve network bandwidth since they do not require that 1222 multicast packets be periodically flooded across all multicast routers 1223 in the internetwork onto every leaf subnetwork. This can offer 1224 significant bandwidth savings, especially across low-bandwidth WAN 1225 links, and when receivers sparsely populate the domain of operation. 1226 Finally, since receivers are required to explicitly join the shared 1227 delivery tree, data only ever flows over those links that lead to active 1228 receivers. 1230 6.3.3 Limitations 1232 Despite these benefits, there are still several limitations to protocols 1233 that are based on a shared tree algorithm. 
Shared trees may result in 1234 traffic concentration and bottlenecks near core routers since traffic 1235 from all sources traverses the same set of links as it approaches the 1236 core. In addition, a single shared delivery tree may create suboptimal 1237 routes (a shortest path between the source and the shared tree, a 1238 suboptimal path across the shared tree, a shortest path between the 1239 egress core router and the receiver's directly attached router) 1240 resulting in increased delay which may be a critical issue for some 1241 multimedia applications. (Simulations indicate that latency over a 1242 shared tree may be approximately 10% larger than source-based trees in 1243 many cases, but by the same token, this may be negligible for many 1244 applications.) Finally, expanding-ring searches are not supported 1245 inside shared-tree domains. 1247 7. "DENSE MODE" ROUTING PROTOCOLS 1249 Certain multicast routing protocols are designed to work well in 1250 environments that have plentiful bandwidth and where it is reasonable 1251 to assume that receivers are rather densely distributed. In such 1252 scenarios, it is very reasonable to use periodic flooding, or other 1253 bandwidth-intensive techniques that would not necessarily be very 1254 scalable over a wide-area network. In section 8, we will examine 1255 different protocols that are specifically geared toward efficient WAN 1256 operation, especially for groups that have widely dispersed (i.e., 1257 sparse) membership. 1259 These routing protocols include: 1261 o Distance Vector Multicast Routing Protocol (DVMRP), 1263 o Multicast Extensions to Open Shortest Path First (MOSPF), 1265 o Protocol Independent Multicast - Dense Mode (PIM-DM). 1267 These protocols' underlying designs assume that the amount of protocol 1268 overhead (in terms of the amount of state that must be maintained by 1269 each router, the number of router CPU cycles required, and the amount of 1270 bandwidth consumed by protocol operation) is appropriate since receivers 1271 densely populate the area of operation. 1273 7.1. Distance Vector Multicast Routing Protocol (DVMRP) 1275 The Distance Vector Multicast Routing Protocol (DVMRP) is a distance- 1276 vector routing protocol designed to support the forwarding of multicast 1277 datagrams through an internetwork. DVMRP constructs source-based 1278 multicast delivery trees using the Reverse Path Multicasting (RPM) 1279 algorithm. Originally, the entire MBone ran only DVMRP. Today, over 1280 half of the MBone routers still run some version of DVMRP. 1282 DVMRP was first defined in RFC-1075. The original specification was 1283 derived from the Routing Information Protocol (RIP) and employed the 1284 Truncated Reverse Path Broadcasting (TRPB) technique. The major 1285 difference between RIP and DVMRP is that RIP calculates the next-hop 1286 toward a destination, while DVMRP computes the previous-hop back toward 1287 a source. Since mrouted 3.0, DVMRP has employed the Reverse Path 1288 Multicasting (RPM) algorithm. Thus, the latest implementations of DVMRP 1289 are quite different from the original RFC specification in many regards. 1290 There is an active effort within the IETF Inter-Domain Multicast Routing 1291 (IDMR) working group to specify DVMRP version 3 in a standard form. 
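The orientation difference just described (RIP answers "which next hop reaches
this destination?", while DVMRP answers "which hop leads back toward this
source?") can be illustrated with a small sketch. The table layout and helper
names below are ours rather than anything defined by the DVMRP specification,
and a real router would perform a classless longest-prefix match instead of the
exact-key lookup used here for brevity.

========================================================================

# Illustrative sketch of DVMRP's "previous-hop" orientation.

previous_hop = {
    # source prefix : interface leading back toward that source
    "128.1.0.0/16": "if1",
    "128.2.0.0/16": "if2",
}

def accept_for_forwarding(source_prefix, arrival_interface):
    """Reverse-path check: a multicast datagram is considered for
    forwarding only if it arrives on the interface this router would
    itself use to reach the source."""
    return previous_hop.get(source_prefix) == arrival_interface

# A datagram from 128.1.0.0/16 arriving on if2 fails the check and is
# discarded rather than forwarded.
assert accept_for_forwarding("128.1.0.0/16", "if1")
assert not accept_for_forwarding("128.1.0.0/16", "if2")

========================================================================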
1293 The current DVMRP v3 Internet-Draft is: 1295 , or 1296 1298 7.1.1 Physical and Tunnel Interfaces 1300 The ports of a DVMRP router may be either a physical interface to a 1301 directly-attached subnetwork or a tunnel interface to another multicast- 1302 capable island. All interfaces are configured with a metric specifying 1303 cost for the given port, and a TTL threshold that limits the scope of a 1304 multicast transmission. In addition, each tunnel interface must be 1305 explicitly configured with two additional parameters: The IP address of 1306 the local router's tunnel interface and the IP address of the remote 1307 router's interface. 1309 ======================================================================== 1311 TTL Scope 1312 Threshold 1313 ________________________________________________________________________ 1314 0 Restricted to the same host 1315 1 Restricted to the same subnetwork 1316 15 Restricted to the same site 1317 63 Restricted to the same region 1318 127 Worldwide 1319 191 Worldwide; limited bandwidth 1320 255 Unrestricted in scope 1322 Table 1: TTL Scope Control Values 1323 ======================================================================== 1325 A multicast router will only forward a multicast datagram across an 1326 interface if the TTL field in the IP header is greater than the TTL 1327 threshold assigned to the interface. Table 1 lists the conventional 1328 TTL values that are used to restrict the scope of an IP multicast. For 1329 example, a multicast datagram with a TTL of less than 16 is restricted 1330 to the same site and should not be forwarded across an interface to 1331 other sites in the same region. 1333 TTL-based scoping is not always sufficient for all applications. 1334 Conflicts arise when trying to simultaneously enforce limits on 1335 topology, geography, and bandwidth. In particular, TTL-based scoping 1336 cannot handle overlapping regions, which is a necessary characteristic 1337 of administrative regions. In light of these issues, "administrative" 1338 scoping was created in 1994, to provide a way to do scoping based on 1339 multicast address. Certain addresses would be usable within a given 1340 administrative scope (e.g., a corporate internetwork) but would not be 1341 forwarded onto the global MBone. This allows for privacy, and address 1342 reuse within the class D address space. The range from 239.0.0.0 to 1343 239.255.255.255 has been reserved for administrative scoping. While 1344 administrative scoping has been in limited use since 1994 or so, it has 1345 yet to be widely deployed. The IETF MBoneD working group is working on 1346 the deployment of administrative scoping. For additional information, 1347 please see or its successor, 1348 entitled "Administratively Scoped IP Multicast." 1350 7.1.2 Basic Operation 1352 DVMRP implements the Reverse Path Multicasting (RPM) algorithm. 1353 According to RPM, the first datagram for any (source, group) pair is 1354 forwarded across the entire internetwork (providing the packet's TTL and 1355 router interface thresholds permit this). Upon receiving this traffic, 1356 leaf routers may transmit prune messages back toward the source if there 1357 are no group members on their directly-attached leaf subnetworks. The 1358 prune messages remove all branches that do not lead to group members 1359 from the tree, leaving a source-based shortest path tree. 
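Before continuing with RPM operation, here is a brief sketch of the two scoping
checks described in section 7.1.1: the per-interface TTL threshold test and the
administratively scoped address test for the 239.0.0.0 - 239.255.255.255 range.
The helper names are ours; the threshold values are the conventional ones from
Table 1.

========================================================================

import ipaddress

# Conventional TTL scope thresholds from Table 1 (illustrative subset).
SITE_THRESHOLD = 15
REGION_THRESHOLD = 63

def crosses_interface(datagram_ttl, interface_threshold):
    """A multicast datagram is forwarded across an interface only if its
    TTL is greater than the threshold configured on that interface."""
    return datagram_ttl > interface_threshold

def is_administratively_scoped(group_address):
    """239.0.0.0 - 239.255.255.255 is reserved for administrative
    scoping and should not be forwarded onto the global MBone."""
    return ipaddress.ip_address(group_address) in \
        ipaddress.ip_network("239.0.0.0/8")

# A datagram with a TTL of less than 16 does not cross an interface
# whose threshold is 15, so it stays within the site.
assert not crosses_interface(12, SITE_THRESHOLD)
assert is_administratively_scoped("239.255.1.1")

========================================================================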
1361 After a period of time, the prune state for each (source, group) pair 1362 expires to reclaim stale prune state (from groups that are no longer in 1363 use). If those groups are actually still in use, a subsequent datagram 1364 for the (source, group) pair will be flooded across all downstream 1365 routers. This flooding will result in a new set of prune messages, 1366 serving to regenerate the source-based shortest-path tree for this 1367 (source, group) pair. In current implementations of RPM (notably 1368 DVMRP), prune messages are not reliably transmitted, so the prune 1369 lifetime must be kept short to compensate for lost prune messages. 1371 DVMRP also implements a mechanism to quickly "graft" back a previously 1372 pruned branch of a group's delivery tree. If a router that had sent a 1373 prune message for a (source, group) pair discovers new group members on 1374 a leaf network, it sends a graft message to the previous-hop router for 1375 this source. When an upstream router receives a graft message, it 1376 cancels out the previously-received prune message. Graft messages 1377 cascade (reliably) hop-by-hop back toward the source until they reach 1378 the nearest "live" branch point on the delivery tree. In this way, 1379 previously-pruned branches are quickly restored to a given delivery 1380 tree. 1382 7.1.3 DVMRP Router Functions 1384 In Figure 13, Router C is downstream and may potentially receive 1385 datagrams from the source subnetwork from Router A or Router B. If 1386 Router A's metric to the source subnetwork is less than Router B's 1387 metric, then Router A is dominant over Router B for this source. 1389 This means that Router A will forward any traffic from the source 1390 subnetwork and Router B will discard traffic received from that source. 1391 However, if Router A's metric is equal to Router B's metric, then the 1392 router with the lower IP address on its downstream interface (child 1393 link) becomes the Dominant Router for this source. Note that on a 1394 subnetwork with multiple routers forwarding to groups with multiple 1395 sources, different routers may be dominant for each source. 1397 7.1.4 DVMRP Routing Table 1399 The DVMRP process periodically exchanges routing table updates with its 1400 DVMRP neighbors. These updates are logically independent of those 1401 generated by any unicast Interior Gateway Protocol. 1403 Since the DVMRP was developed to route multicast and not unicast 1404 traffic, a router will probably run multiple routing processes in 1405 practice: One to support the forwarding of unicast traffic and another 1406 to support the forwarding of multicast traffic. (This can be convenient: 1407 A router can be configured to only route multicast IP, with no unicast 1408 ======================================================================== 1410 To 1411 .-<-<-<-<-<-<-Source Subnetwork->->->->->->->->--. 1412 v v 1413 | | 1414 parent link parent link 1415 | | 1416 _____________ _____________ 1417 | Router A | | Router B | 1418 | | | | 1419 ------------- ------------- 1420 | | 1421 child link child link 1422 | | 1423 --------------------------------------------------------------------- 1424 | 1425 parent link 1426 | 1427 _____________ 1428 | Router C | 1429 | | 1430 ------------- 1431 | 1432 child link 1433 | 1435 Figure 12. DVMRP Dominant Router in a Redundant Topology 1436 ======================================================================== 1438 IP routing. This may be a useful capability in firewalled 1439 environments.) 
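As a concrete rendering of the Dominant Router rule from section 7.1.3 (the
lower metric to the source subnetwork wins; equal metrics are broken by the
lower IP address on the downstream child link), here is a small sketch that
refers to the topology of Figure 12. The candidate tuples and addresses are
illustrative only and do not come from any DVMRP implementation.

========================================================================

import ipaddress

def dominant_router(candidates):
    """candidates: list of (metric_to_source, child_link_ip) tuples.

    The router with the lowest metric to the source subnetwork is
    dominant for that source; if metrics are equal, the router with the
    lower IP address on its downstream (child) interface wins."""
    return min(candidates,
               key=lambda c: (c[0], ipaddress.ip_address(c[1])))

# Routers A and B of Figure 12 with equal metrics to the source: the
# lower child-link address becomes the Dominant Router for this source.
router_a = (3, "128.6.3.1")
router_b = (3, "128.6.3.2")
assert dominant_router([router_a, router_b]) == router_a

========================================================================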
1441 Again, consider Figure 12: There are two types of routers in this 1442 figure: dominant and subordinate; assume in this example that Router B 1443 is dominant, Router A is subordinate, and Router C is part of the 1444 downstream distribution tree. In general, which routers are dominant 1445 or subordinate may be different for each source! A subordinate router 1446 is one that is NOT on the shortest path tree back toward a source. The 1447 dominant router can tell this because the subordinate router will 1448 'poison-reverse' the route for this source in its routing updates which 1449 are sent on the common LAN (i.e., Router A sets the metric for this 1450 source to 'infinity'). The dominant router keeps track of subordinate 1451 routers on a per-source basis...it never needs or expects to receive a 1452 prune message from a subordinate router. Only routers that are truly on 1453 the downstream distribution tree will ever need to send prunes to the 1454 dominant router. If a dominant router on a LAN has received either a 1455 poison-reversed route for a source, or prunes for all groups emanating 1456 from that source subnetwork, then it may itself send a prune upstream 1457 toward the source (assuming also that IGMP has told it that there are no 1458 local receivers for any group from this source). 1460 A sample routing table for a DVMRP router is shown in Figure 13. Unlike 1462 ======================================================================== 1464 Source Subnet From Metric Status TTL 1465 Prefix Mask Gateway 1467 128.1.0.0 255.255.0.0 128.7.5.2 3 Up 200 1468 128.2.0.0 255.255.0.0 128.7.5.2 5 Up 150 1469 128.3.0.0 255.255.0.0 128.6.3.1 2 Up 150 1470 128.3.0.0 255.255.0.0 128.6.3.1 4 Up 200 1472 Figure 13: DVMRP Routing Table 1473 ======================================================================== 1475 the table that would be created by a unicast routing protocol such as 1476 the RIP, OSPF, or the BGP, the DVMRP routing table contains Source 1477 Prefixes and From-Gateways instead of Destination Prefixes and Next-Hop 1478 Gateways. 1480 The routing table represents the shortest path (source-based) spanning 1481 tree to every possible source prefix in the internetwork--the Reverse 1482 Path Broadcasting (RPB) tree. The DVMRP routing table does not 1483 represent group membership or received prune messages. 1485 The key elements in DVMRP routing table include the following items: 1487 Source Prefix A subnetwork which is a potential or actual 1488 source of multicast datagrams. 1490 Subnet Mask The subnet mask associated with the Source 1491 Prefix. Note that the DVMRP provides the subnet 1492 mask for each source subnetwork (in other words, 1493 the DVMRP is classless). 1495 From-Gateway The previous-hop router leading back toward a 1496 particular Source Prefix. 1498 TTL The time-to-live is used for table management 1499 and indicates the number of seconds before an 1500 entry is removed from the routing table. This 1501 TTL has nothing at all to do with the TTL used 1502 in TTL-based scoping. 1504 7.1.5 DVMRP Forwarding Table 1506 Since the DVMRP routing table is not aware of group membership, the 1507 DVMRP process builds a forwarding table based on a combination of the 1508 information contained in the multicast routing table, known groups, and 1509 received prune messages. The forwarding table represents the local 1510 router's understanding of the shortest path source-based delivery tree 1511 for each (source, group) pair--the Reverse Path Multicasting (RPM) tree. 
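The derivation just described--routing table plus group membership plus
received prune messages--can be sketched as follows. The structures loosely
mirror Figures 13 and 14, but all names and the combining rule shown are
illustrative rather than quoted from the DVMRP specification.

========================================================================

# Illustrative sketch of assembling a DVMRP forwarding entry from the
# routing table, IGMP-learned membership, and received prunes.

# DVMRP routing table: source prefix -> (From-Gateway, parent interface)
routing_table = {
    "128.1.0.0/16": ("128.7.5.2", 1),
    "128.2.0.0/16": ("128.7.5.2", 2),
}

interfaces = {1, 2, 3}                           # multicast interfaces
members = {2: {"224.1.1.1"}, 3: {"224.2.2.2"}}   # groups heard via IGMP
prunes = {("128.1.0.0/16", "224.1.1.1", 3)}      # prunes from downstream

def forwarding_entry(source_prefix, group):
    from_gateway, parent = routing_table[source_prefix]
    out = set()
    for interface in interfaces - {parent}:
        has_member = group in members.get(interface, set())
        pruned = (source_prefix, group, interface) in prunes
        # A child interface carries the traffic if it has local members,
        # or if no prune has been received from the routers behind it.
        if has_member or not pruned:
            out.add(interface)
    return {"InIntf": parent, "OutIntfs": out, "From-Gateway": from_gateway}

# For (128.1.0.0/16, 224.1.1.1): interface 1 is the parent, interface 2
# has a local member, and interface 3 has been pruned downstream.
print(forwarding_entry("128.1.0.0/16", "224.1.1.1"))
# -> {'InIntf': 1, 'OutIntfs': {2}, 'From-Gateway': '128.7.5.2'}

========================================================================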
1513 ======================================================================== 1515 Source Multicast TTL InIntf OutIntf(s) 1516 Prefix Group 1518 128.1.0.0 224.1.1.1 200 1 Pr 2p3p 1519 224.2.2.2 100 1 2p3 1520 224.3.3.3 250 1 2 1521 128.2.0.0 224.1.1.1 150 2 2p3 1523 Figure 14: DVMRP Forwarding Table 1524 ======================================================================== 1526 The forwarding table for a sample DVMRP router is shown in Figure 14. 1527 The elements in this display include the following items: 1529 Source Prefix The subnetwork sending multicast datagrams 1530 to the specified groups (one group per row). 1532 Multicast Group The Class D IP address to which multicast 1533 datagrams are addressed. Note that a given 1534 Source Prefix may contain sources for several 1535 Multicast Groups. 1537 InIntf The parent interface for the (source, group) 1538 pair. A 'Pr' in this column indicates that a 1539 prune message has been sent to the upstream 1540 router (the From-Gateway for this Source Prefix 1541 in the DVMRP routing table). 1543 OutIntf(s) The child interfaces over which multicast 1544 datagrams for this (source, group) pair are 1545 forwarded. A 'p' in this column indicates 1546 that the router has received a prune message(s) 1547 from a (all) downstream router(s) on this port. 1549 7.2. Multicast Extensions to OSPF (MOSPF) 1551 Version 2 of the Open Shortest Path First (OSPF) routing protocol is 1552 defined in RFC-1583. OSPF is an Interior Gateway Protocol (IGP) that 1553 distributes unicast topology information among routers belonging to a 1554 single OSPF "Autonomous System." OSPF is based on link-state algorithms 1555 which permit rapid route calculation with a minimum of routing protocol 1556 traffic. In addition to efficient route calculation, OSPF is an open 1557 standard that supports hierarchical routing, load balancing, and the 1558 import of external routing information. 1560 The Multicast Extensions to OSPF (MOSPF) are defined in RFC-1584. MOSPF 1561 routers maintain a current image of the network topology through the 1562 unicast OSPF link-state routing protocol. The multicast extensions to 1563 OSPF are built on top of OSPF Version 2 so that a multicast routing 1564 capability can be incrementally introduced into an OSPF Version 2 1565 routing domain. Routers running MOSPF will interoperate with non-MOSPF 1566 routers when forwarding unicast IP data traffic. MOSPF does not support 1567 tunnels. 1569 7.2.1 Intra-Area Routing with MOSPF 1571 Intra-Area Routing describes the basic routing algorithm employed by 1572 MOSPF. This elementary algorithm runs inside a single OSPF area and 1573 supports multicast forwarding when a source and all destination group 1574 members reside in the same OSPF area, or when the entire OSPF Autonomous 1575 System is a single area (and the source is inside that area...). The 1576 following discussion assumes that the reader is familiar with OSPF. 1578 7.2.1.1 Local Group Database 1580 Similar to all other multicast routing protocols, MOSPF routers use the 1581 Internet Group Management Protocol (IGMP) to monitor multicast group 1582 membership on directly-attached subnetworks. MOSPF routers maintain a 1583 "local group database" which lists directly-attached groups and 1584 determines the local router's responsibility for delivering multicast 1585 datagrams to these groups. 1587 On any given subnetwork, the transmission of IGMP Host Membership 1588 Queries is performed solely by the Designated Router (DR). 
However, 1589 the responsibility of listening to IGMP Host Membership Reports is 1590 performed by not only the Designated Router (DR) but also the Backup 1591 Designated Router (BDR). Therefore, in a mixed LAN containing both 1592 MOSPF and OSPF routers, an MOSPF router must be elected the DR for the 1593 subnetwork. This can be achieved by setting the OSPF RouterPriority to 1594 zero in each non-MOSPF router to prevent them from becoming the (B)DR. 1596 The DR is responsible for communicating group membership information to 1597 all other routers in the OSPF area by flooding Group-Membership LSAs. 1598 Similar to Router-LSAs and Network-LSAs, Group-Membership LSAs are only 1599 flooded within a single area. 1601 7.2.1.2 Datagram's Shortest Path Tree 1603 The datagram's shortest path tree describes the path taken by a 1604 multicast datagram as it travels through the area from the source 1605 subnetwork to each of the group members' subnetworks. The shortest 1606 path tree for each (source, group) pair is built "on demand" when a 1607 router receives the first multicast datagram for a particular (source, 1608 group) pair. 1610 When the initial datagram arrives, the source subnetwork is located in 1611 the MOSPF link state database. The MOSPF link state database is simply 1612 the standard OSPF link state database with the addition of Group- 1613 Membership LSAs. Based on the Router- and Network-LSAs in the OSPF 1614 link state database, a source-based shortest-path tree is constructed 1615 using Dijkstra's algorithm. After the tree is built, Group-Membership 1616 LSAs are used to prune the tree such that the only remaining branches 1617 lead to subnetworks containing members of this group. The output of 1618 these algorithms is a pruned source-based tree rooted at the datagram's 1619 source. 1621 ======================================================================== 1623 S 1624 | 1625 | 1626 A # 1627 / \ 1628 / \ 1629 1 2 1630 / \ 1631 B # # C 1632 / \ \ 1633 / \ \ 1634 3 4 5 1635 / \ \ 1636 D # # E # F 1637 / \ \ 1638 / \ \ 1639 6 7 8 1640 / \ \ 1641 G # # H # I 1643 LEGEND 1645 # Router 1647 Figure 15. Shortest Path Tree for a (S, G) pair 1648 ======================================================================== 1649 To forward multicast datagrams to downstream members of a group, each 1650 router must determine its position in the datagram's shortest path tree. 1651 Assume that Figure 15 illustrates the shortest path tree for a given 1652 (source, group) pair. Router E's upstream node is Router B and there 1653 are two downstream interfaces: one connecting to Subnetwork 6 and 1654 another connecting to Subnetwork 7. 1656 Note the following properties of the basic MOSPF routing algorithm: 1658 o For a given multicast datagram, all routers within an OSPF 1659 area calculate the same source-based shortest path delivery 1660 tree. Tie-breakers have been defined to guarantee that if 1661 several equal-cost paths exist, all routers agree on a single 1662 path through the area. Unlike unicast OSPF, MOSPF does not 1663 support the concept of equal-cost multipath routing. 1665 o Synchronized link state databases containing Group-Membership 1666 LSAs allow an MOSPF router to build a source-based shortest- 1667 path tree in memory, working forward from the source to the 1668 group member(s). Unlike the DVMRP, this means that the first 1669 datagram of a new transmission does not have to be flooded to 1670 all routers in an area. 
1672 o The "on demand" construction of the source-based delivery tree 1673 has the benefit of spreading calculations over time, resulting 1674 in a lesser impact for participating routers. Of course, this 1675 may strain the CPU(s) in a router if many new (source, group) 1676 pairs appear at about the same time, or if there are a lot of 1677 events which force the MOSPF process to flush and rebuild its 1678 forwarding cache. In a stable topology with long-lived 1679 multicast sessions, these effects should be minimal. 1681 7.2.1.3 Forwarding Cache 1683 Each MOSPF router makes its forwarding decision based on the contents of 1684 its forwarding cache. Contrary to DVMRP, MOSPF forwarding is not RPF- 1685 based. The forwarding cache is built from the source-based shortest- 1686 path tree for each (source, group) pair, and the router's local group 1687 database. After the router discovers its position in the shortest path 1688 tree, a forwarding cache entry is created containing the (source, group) 1689 pair, its expected upstream interface, and the necessary downstream 1690 interface(s). The forwarding cache entry is now used to quickly 1691 forward all subsequent datagrams from this source to this group. If 1692 a new source begins sending to a new group, MOSPF must first calculate 1693 the distribution tree so that it may create a cache entry that can be 1694 used to forward the packet. 1696 Figure 16 displays the forwarding cache for an example MOSPF router. 1697 The elements in the display include the following items: 1699 Dest. Group A known destination group address to which 1700 datagrams are currently being forwarded, or to 1701 which traffic was sent "recently" (i.e., since 1702 the last topology or group membership or other 1703 event which (re-)initialized MOSPF's forwarding 1704 cache. 1706 Source The datagram's source host address. Each (Dest. 1707 Group, Source) pair uniquely identifies a 1708 separate forwarding cache entry. 1710 ======================================================================== 1712 Dest. Group Source Upstream Downstream TTL 1714 224.1.1.1 128.1.0.2 11 12 13 5 1715 224.1.1.1 128.4.1.2 11 12 13 2 1716 224.1.1.1 128.5.2.2 11 12 13 3 1717 224.2.2.2 128.2.0.3 12 11 7 1719 Figure 16: MOSPF Forwarding Cache 1720 ======================================================================== 1722 Upstream Datagrams matching this row's Dest. Group and 1723 Source must be received on this interface. 1725 Downstream If a datagram matching this row's Dest. Group 1726 and Source is received on the correct Upstream 1727 interface, then it is forwarded across the listed 1728 Downstream interfaces. 1730 TTL The minimum number of hops a datagram must cross 1731 to reach any of the Dest. Group's members. An 1732 MOSPF router may discard a datagram if it can see 1733 that the datagram has insufficient TTL to reach 1734 even the closest group member. 1736 The information in the forwarding cache is not aged or periodically 1737 refreshed: It is maintained as long as there are system resources 1738 available (e.g., memory) or until the next topology change. The 1739 contents of the forwarding cache will change when: 1741 o The topology of the OSPF internetwork changes, forcing all of 1742 the shortest path trees to be recalculated. (Once the cache 1743 has been flushed, entries are not rebuilt. If another packet 1744 for one of the previous (Dest. Group, Source) pairs is 1745 received, then a "new" cache entry for that pair will be 1746 created then.) 
1748 o There is a change in the Group-Membership LSAs indicating that 1749 the distribution of individual group members has changed. 1751 7.2.2 Mixing MOSPF and OSPF Routers 1753 MOSPF routers can be combined with non-multicast OSPF routers. This 1754 permits the gradual deployment of MOSPF and allows experimentation with 1755 multicast routing on a limited scale. 1757 It is important to note that an MOSPF router is required to eliminate 1758 all non-multicast OSPF routers when it builds its source-based shortest- 1759 path delivery tree. (An MOSPF router can determine the multicast 1760 capability of any other router based on the setting of the multicast- 1761 capable bit (MC-bit) in the Options field of each router's link state 1762 advertisements.) The omission of non-multicast routers may create a 1763 number of potential problems when forwarding multicast traffic: 1765 o The Designated Router for a multi-access network must be an 1766 MOSPF router. If a non-multicast OSPF router is elected the 1767 DR, the subnetwork will not be selected to forward multicast 1768 datagrams since a non-multicast DR cannot generate Group- 1769 Membership LSAs for its subnetwork (because it is not running 1770 IGMP, so it won't hear IGMP Host Membership Reports). 1772 o Even though there may be unicast connectivity to a destination, 1773 there may not be multicast connectivity. For example, the only 1774 path between two points could require traversal of a non- 1775 multicast-capable OSPF router. 1777 o The forwarding of multicast and unicast datagrams between two 1778 points may follow different paths, making some routing problems 1779 a bit more challenging to solve. 1781 7.2.3 Inter-Area Routing with MOSPF 1783 Inter-area routing involves the case where a datagram's source and some 1784 of its destination group members reside in different OSPF areas. It 1785 should be noted that the forwarding of multicast datagrams continues to 1786 be determined by the contents of the forwarding cache which is still 1787 built from the local group database and the datagram source-based trees. 1788 The major differences are related to the way that group membership 1789 information is propagated and the way that the inter-area source-based 1790 tree is constructed. 1792 7.2.3.1 Inter-Area Multicast Forwarders 1794 In MOSPF, a subset of an area's Area Border Routers (ABRs) function as 1795 "inter-area multicast forwarders." An inter-area multicast forwarder is 1796 responsible for the forwarding of group membership information and 1797 multicast datagrams between areas. Configuration parameters determine 1798 whether or not a particular ABR also functions as an inter-area 1799 multicast forwarder. 1801 Inter-area multicast forwarders summarize their attached areas' group 1802 membership information to the backbone by originating new Group- 1803 Membership LSAs into the backbone area. Note that the summarization of 1804 group membership in MOSPF is asymmetric. This means that while group 1805 membership information from non-backbone areas is flooded into the 1806 backbone, group membership from the backbone (or from any other 1807 non-backbone areas) is not flooded into any non-backbone area(s). 1809 To permit the forwarding of multicast traffic between areas, MOSPF 1810 introduces the concept of a "wild-card multicast receiver." A wild-card 1811 multicast receiver is a router that receives all multicast traffic 1812 generated in an area.
In non-backbone areas, all inter-area multicast 1813 forwarders operate as wild-card multicast receivers. This guarantees 1814 that all multicast traffic originating in any non-backbone area is 1815 delivered to its inter-area multicast forwarder, and then if necessary 1816 into the backbone area. Since the backbone knows group membership for 1817 all areas, the datagram can be forwarded to the appropriate location(s) 1818 in the OSPF autonomous system, if only it is forwarded into the backbone 1819 by the source area's multicast ABR. 1821 7.2.3.2 Inter-Area Datagram's Shortest-Path Tree 1823 In the case of inter-area multicast routing, it is usually impossible to 1824 build a complete shortest-path delivery tree. Incomplete trees are a 1825 fact of life because each OSPF area's complete topological and group 1826 membership information is not distributed between OSPF areas. 1827 Topological estimates are made through the use of wild-card receivers 1828 and OSPF Summary-Links LSAs. 1830 If the source of a multicast datagram resides in the same area as the 1831 router performing the calculation, the pruning process must be careful 1832 to ensure that branches leading to other areas are not removed from the 1833 tree. Only those branches having no group members nor wild-card 1834 multicast receivers are pruned. Branches containing wild-card multicast 1835 receivers must be retained since the local routers do not know whether 1836 there are any group members residing in other areas. 1838 If the source of a multicast datagram resides in a different area than 1839 the router performing the calculation, the details describing the local 1840 topology surrounding the source station are not known. However, this 1841 information can be estimated using information provided by Summary-Links 1842 LSAs for the source subnetwork. In this case, the base of the tree 1843 begins with branches directly connecting the source subnetwork to each 1844 of the local area's inter-area multicast forwarders. Datagrams sourced 1845 from outside the local area will enter the area via one of its inter- 1846 area multicast forwarders, so they all must be part of the candidate 1847 distribution tree. 1849 Since each inter-area multicast forwarder is also an ABR, it must 1850 maintain a separate link state database for each attached area. Thus 1851 each inter-area multicast forwarder is required to calculate a separate 1852 forwarding tree for each of its attached areas. 1854 7.2.4 Inter-Autonomous System Multicasting with MOSPF 1856 Inter-Autonomous System multicasting involves the situation where a 1857 datagram's source or some of its destination group members are in 1858 different OSPF Autonomous Systems. In OSPF terminology, "inter-AS" 1859 communication also refers to connectivity between an OSPF domain and 1860 another routing domain which could be within the same Autonomous System 1861 from the perspective of an Exterior Gateway Protocol. 1863 To facilitate inter-AS multicast routing, selected Autonomous System 1864 Boundary Routers (ASBRs) are configured as "inter-AS multicast 1865 forwarders." MOSPF makes the assumption that each inter-AS multicast 1866 forwarder executes an inter-AS multicast routing protocol which forwards 1867 multicast datagrams in a reverse path forwarding (RPF) manner. Since 1868 the publication of the MOSPF RFC, a term has been defined for such a 1869 router: Multicast Border Router. See section 9 for an overview of the 1870 MBR concepts. 
Each inter-AS multicast forwarder is a wildcard multicast 1871 receiver in each of its attached areas. This guarantees that each 1872 inter-AS multicast forwarder remains on all pruned shortest-path trees 1873 and receives all multicast datagrams. 1875 The details of inter-AS forwarding are very similar to inter-area 1876 forwarding. On the "inside" of the OSPF domain, the multicast ASBR 1877 must conform to all the requirements of intra-area and inter-area 1878 forwarding. Within the OSPF domain, group members are reached by the 1879 usual forward path computations, and paths to external sources are 1880 approximated by a reverse-path source-based tree, with the multicast 1881 ASBR standing in for the actual source. When the source is within the 1882 OSPF AS, and there are external group members, it falls to the inter- 1883 AS multicast forwarders, in their role as wildcard receivers, to make 1884 sure that the data gets out of the OSPF domain and sent off in the 1885 correct direction. 1887 7.3 Protocol-Independent Multicast (PIM) 1889 The Protocol Independent Multicast (PIM) routing protocols have been 1890 developed by the Inter-Domain Multicast Routing (IDMR) working group of 1891 the IETF. The objective of the IDMR working group is to develop one--or 1892 possibly more than one--standards-track multicast routing protocol(s) 1893 that can provide scaleable multicast routing across the Internet. 1895 PIM is actually two protocols: PIM - Dense Mode (PIM-DM) and PIM - 1896 Sparse Mode (PIM-SM). In the remainder of this introduction, any 1897 references to "PIM" apply equally well to either of the two protocols... 1898 there is no intention to imply that there is only one PIM protocol. 1899 While PIM-DM and PIM-SM share part of their names, and they do have 1900 related control messages, they are actually two completely independent 1901 protocols. 1903 PIM receives its name because it is not dependent on the mechanisms 1904 provided by any particular unicast routing protocol. However, any 1905 implementation supporting PIM requires the presence of a unicast routing 1906 protocol to provide routing table information and to adapt to topology 1907 changes. 1909 PIM makes a clear distinction between a multicast routing protocol that 1910 is designed for dense environments and one that is designed for sparse 1911 environments. Dense-mode refers to a protocol that is designed to 1912 operate in an environment where group members are relatively densely 1913 packed and bandwidth is plentiful. Sparse-mode refers to a protocol 1914 that is optimized for environments where group members are distributed 1915 across many regions of the Internet and bandwidth is not necessarily 1916 widely available. It is important to note that sparse-mode does not 1917 imply that the group has a few members, just that they are widely 1918 dispersed across the Internet. 1920 The designers of PIM-SM argue that DVMRP and MOSPF were developed for 1921 environments where group members are densely distributed, and bandwidth 1922 is relatively plentiful. They emphasize that when group members and 1923 senders are sparsely distributed across a wide area, DVMRP and MOSPF 1924 do not provide the most efficient multicast delivery service. The 1925 DVMRP periodically sends multicast packets over many links that do not 1926 lead to group members, while MOSPF can send group membership 1927 information over links that do not lead to senders or receivers. 
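The practical consequence of the dense/sparse distinction can be reduced to a
difference in default forwarding behavior, a point that is spelled out again in
section 8.1. The sketch below is purely illustrative and the names are ours.

========================================================================

# Illustrative contrast of default forwarding behavior.

def dense_mode_forwards(interface, prunes_received):
    """Dense mode assumes downstream membership: traffic is forwarded on
    a downstream interface until an explicit prune is received on it."""
    return interface not in prunes_received

def sparse_mode_forwards(interface, joins_received):
    """Sparse mode assumes no downstream membership: traffic is blocked
    until an explicit join has been received on the interface."""
    return interface in joins_received

# With no control messages received at all, dense mode forwards and
# sparse mode does not.
assert dense_mode_forwards("if1", set())
assert not sparse_mode_forwards("if1", set())

========================================================================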
1929 7.3.1 PIM - Dense Mode (PIM-DM) 1931 While the PIM architecture was driven by the need to provide scaleable 1932 sparse-mode delivery trees, PIM also defines a new dense-mode protocol 1933 instead of relying on existing dense-mode protocols such as DVMRP and 1934 MOSPF. It is envisioned that PIM-DM would be deployed in resource rich 1935 environments, such as a campus LAN where group membership is relatively 1936 dense and bandwidth is likely to be readily available. PIM-DM's control 1937 messages are similar to PIM-SM's by design. 1939 [This space was intentionally left blank.] 1940 PIM - Dense Mode (PIM-DM) is similar to DVMRP in that it employs the 1941 Reverse Path Multicasting (RPM) algorithm. However, there are several 1942 important differences between PIM-DM and DVMRP: 1944 o To find routes back to sources, PIM-DM relies on the presence 1945 of an existing unicast routing table. PIM-DM is independent of 1946 the mechanisms of any specific unicast routing protocol. In 1947 contrast, DVMRP contains an integrated routing protocol that 1948 makes use of its own RIP-like exchanges to build its own unicast 1949 routing table (so a router may orient itself with respect to 1950 active source(s)). MOSPF augments the information in the OSPF 1951 link state database, thus MOSPF must run in conjunction with 1952 OSPF. 1954 o Unlike the DVMRP which calculates a set of child interfaces for 1955 each (source, group) pair, PIM-DM simply forwards multicast 1956 traffic on all downstream interfaces until explicit prune 1957 messages are received. PIM-DM is willing to accept packet 1958 duplication to eliminate routing protocol dependencies and 1959 to avoid the overhead inherent in determining the parent/child 1960 relationships. 1962 For those cases where group members suddenly appear on a pruned branch 1963 of the delivery tree, PIM-DM, like DVMRP, employs graft messages to 1964 re-attach the previously pruned branch to the delivery tree. 1966 8. "SPARSE MODE" ROUTING PROTOCOLS 1968 The most recent additions to the set of multicast routing protocols are 1969 called "sparse mode" protocols. They are designed from a different 1970 perspective than the "dense mode" protocols that we have already 1971 examined. Often, they are not data-driven, in the sense that forwarding 1972 state is set up in advance, and they trade off using bandwidth liberally 1973 (which is a valid thing to do in a campus LAN environment) for other 1974 techniques that are much more suited to scaling over large WANs, where 1975 bandwidth is scarce and expensive. 1977 These emerging routing protocols include: 1979 o Protocol Independent Multicast - Sparse Mode (PIM-SM), and 1981 o Core-Based Trees (CBT). 1983 While these routing protocols are designed to operate efficiently over a 1984 wide area network where bandwidth is scarce and group members may be 1985 quite sparsely distributed, this is not to imply that they are only 1986 suitable for small groups. Sparse doesn't mean small, rather it is 1987 meant to convey that the groups are widely dispersed, and thus it is 1988 wasteful to (for instance) flood their data periodically across the 1989 entire internetwork. 1991 8.1 Protocol-Independent Multicast - Sparse Mode (PIM-SM) 1993 As described previously, PIM also defines a "dense-mode" or source-based 1994 tree variant. Again, the two protocols are quite unique, and other than 1995 control messages, they have very little in common. 
Note that while PIM 1996 integrates control message processing and data packet forwarding among 1997 PIM-Sparse and -Dense Modes, PIM-SM and PIM-DM must run in separate 1998 regions. All groups in a region are either sparse-mode or dense-mode. 2000 PIM-Sparse Mode (PIM-SM) has been developed to provide a multicast 2001 routing protocol that provides efficient communication between members 2002 of sparsely distributed groups--the type of groups that are likely to 2003 be common in wide-area internetworks. PIM's designers observed that 2004 several hosts wishing to participate in a multicast conference do not 2005 justify flooding the entire internetwork periodically with the group's 2006 multicast traffic. 2008 Noting today's existing MBone scaling problems, and extrapolating to a 2009 future of ubiquitous multicast (overlaid with perhaps thousands of 2010 small, widely dispersed groups), it is not hard to imagine that existing 2011 multicast routing protocols will experience scaling problems. To 2012 eliminate these potential scaling issues, PIM-SM is designed to limit 2013 multicast traffic so that only those routers interested in receiving 2014 traffic for a particular group "see" it. 2016 PIM-SM differs from existing dense-mode protocols in two key ways: 2018 o Routers with adjacent or downstream members are required to 2019 explicitly join a sparse mode delivery tree by transmitting 2020 join messages. If a router does not join the pre-defined 2021 delivery tree, it will not receive multicast traffic addressed 2022 to the group. 2024 In contrast, dense-mode protocols assume downstream group 2025 membership and forward multicast traffic on downstream links 2026 until explicit prune messages are received. Thus, the default 2027 forwarding action of dense-mode routing protocols is to forward 2028 all traffic, while the default action of a sparse-mode protocol 2029 is to block traffic unless it has been explicitly requested. 2031 o PIM-SM evolved from the Core-Based Trees (CBT) approach in that 2032 it employs the concept of a "core" (or rendezvous point (RP) in 2033 PIM-SM terminology) where receivers "meet" sources. 2035 [This space was intentionally left blank.] 2036 ======================================================================== 2038 S1 S2 2039 ___|___ ___|___ 2040 | | 2041 | | 2042 # # 2043 \ / 2044 \ / 2045 \_____________RP______________/ 2046 ./|\. 2047 ________________// | \\_______________ 2048 / _______/ | \______ \ 2049 # # # # # 2050 ___|___ ___|___ ___|___ ___|___ ___|___ 2051 | | | | | | 2052 R R R R R R 2053 LEGEND 2055 # PIM Router 2056 R Multicast Receiver 2058 Figure 17: Rendezvous Point 2059 ======================================================================== 2061 When joining a group, each receiver uses IGMP to notify its directly- 2062 attached router, which in turn joins the multicast delivery tree by 2063 sending an explicit PIM-Join message hop-by-hop toward the group's 2064 RP. A source uses the RP to announce its presence, and act as a conduit 2065 to members that have joined the group. This model requires sparse-mode 2066 routers to maintain a bit of state (the RP-set for the sparse-mode 2067 region) prior to the arrival of data. In contrast, because dense-mode 2068 protocols are data-driven, they do not store any state for a group until 2069 the arrival of its first data packet. 2071 There is only one RP-set per sparse-mode domain, not per group. 2072 Moreover, the creator of a group is not involved in RP selection. 
Also, 2073 there is no such concept as a "primary" RP. Each group has precisely 2074 one RP at any given time. In the event of the failure of an RP, a new 2075 RP-set is distributed which does not include the failed RP. 2077 8.1.1 Directly Attached Host Joins a Group 2079 When there is more than one PIM router connected to a multi-access LAN, 2080 the router with the highest IP address is selected to function as the 2081 Designated Router (DR) for the LAN. The DR may or may not be 2082 responsible for the transmission of IGMP Host Membership Query messages, 2083 but does send Join/Prune messages toward the RP, and maintains the 2084 status of the active RP for local senders to multicast groups. 2086 When the DR receives an IGMP Report message for a new group, the DR 2087 determines if the group is RP-based or not by examining the group 2088 address. If the address indicates a SM group (by virtue of the group- 2089 specific state that even inactive groups have stored in all PIM 2090 routers), the DR performs a deterministic hash function over the 2091 sparse-mode region's RP-set to uniquely determine the RP for the 2092 group. 2094 ======================================================================== 2096 Source (S) 2097 _|____ 2098 | 2099 | 2100 # 2101 / \ 2102 / \ 2103 / \ 2104 # # 2105 / \ 2106 Designated / \ 2107 Host | Router / \ Rendezvous Point 2108 -----|- # - - - - - -#- - - - - - - -RP for group G 2109 (receiver) | ----Join--> ----Join--> 2110 | 2112 LEGEND 2114 # PIM Router RP Rendezvous Point 2116 Figure 18: Host Joins a Multicast Group 2117 ======================================================================== 2119 After performing the lookup, the DR creates a multicast forwarding entry 2120 for the (*, group) pair and transmits a unicast PIM-Join message toward 2121 the primary RP for this specific group. The (*, group) notation 2122 indicates an (any source, group) pair. The intermediate routers forward 2123 the unicast PIM-Join message, creating a forwarding entry for the 2124 (*, group) pair only if such a forwarding entry does not yet exist. 2125 Intermediate routers must create a forwarding entry so that they will be 2126 able to forward future traffic downstream toward the DR which originated 2127 the PIM-Join message. 2129 8.1.2 Directly Attached Source Sends to a Group 2131 When a source first transmits a multicast packet to a group, its DR 2132 forwards the datagram to the primary RP for subsequent distribution 2133 along the group's delivery tree. The DR encapsulates the initial 2134 multicast packets in a PIM-SM-Register packet and unicasts them toward 2135 the primary RP for the group. The PIM-SM-Register packet informs the 2136 RP of a new source which causes the active RP to transmit PIM-Join 2137 messages back toward the source's DR. The routers between the RP and 2138 the source's DR use the received PIM-Join messages (from the RP) to 2139 create forwarding state for the new (source, group) pair. Now all 2140 routers from the active RP for this sparse-mode group to the source's DR 2141 will be able to forward future unencapsulated multicast packets from 2142 this source subnetwork to the RP. Until the (source, group) state has 2143 been created in all the routers between the RP and source's DR, the DR 2144 must continue to send the source's multicast IP packets to the RP as 2145 unicast packets encapsulated within unicast PIM-Register packets. 
The 2146 DR may stop forwarding multicast packets encapsulated in this manner 2147 once it has received a PIM-Register-Stop message from the active RP for 2148 this group. The RP may send PIM-Register-Stop messages if there are no 2149 downstream receivers for a group, or if the RP has successfully joined 2150 the (source, group) tree (which originates at the source's DR). 2152 ======================================================================== 2154 Source (S) 2155 _|____ 2156 | 2157 | 2158 # v 2159 /.\ , 2160 / ^\ v 2161 / .\ , 2162 # ^# v 2163 / .\ , 2164 Designated / ^\ v 2165 Host | Router / .\ , | Host 2166 -----|-#- - - - - - -#- - - - - - - -RP- - - # - - -|----- 2167 (receiver) | <~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~> | (receiver) 2169 LEGEND 2171 # PIM Router 2172 RP Rendezvous Point 2173 > , > PIM-Register 2174 < . < PIM-Join 2175 ~ ~ ~ Resend to group members 2177 Figure 19: Source sends to a Multicast Group 2178 ======================================================================== 2179 8.1.3 Shared Tree (RP-Tree) or Shortest Path Tree (SPT)? 2181 The RP-tree provides connectivity for group members but does not 2182 optimize the delivery path through the internetwork. PIM-SM allows 2183 routers to either a) continue to receive multicast traffic over the 2184 shared RP-tree, or b) subsequently create a source-based shortest-path 2185 tree on behalf of their attached receiver(s). Besides reducing the 2186 delay between this router and the source (beneficial to its attached 2187 receivers), the switch to a source-based tree also reduces traffic 2188 concentration effects on the RP-tree. 2190 A PIM-SM router with local receivers has the option of switching to the 2191 source's shortest-path tree (i.e., source-based tree) once it starts 2192 receiving data packets from the source. The change-over may be 2193 triggered if the data rate from the source exceeds a predefined 2194 threshold. The local receiver's last-hop router does this by sending a 2195 Join message toward the active source. After the source-based SPT is 2196 active, protocol mechanisms allow a Prune message for the same source 2197 to be transmitted to the active RP, thus removing this router from the 2198 shared RP-tree. Alternatively, the DR may be configured to continue 2199 using the shared RP-tree and never switch over to the source-based SPT, 2200 or a router could perhaps use a different administrative metric to 2201 decide if and when to switch to a source-based tree. 2203 ======================================================================== 2205 Source (S) 2206 _|____ 2207 | 2208 %| 2209 % # 2210 % / \* 2211 % / \* 2212 % / \* 2213 Designated % # #* 2214 Router % / \* 2215 % / \* 2216 Host | <-% % % % % % / \v 2217 -----|-#- - - - - - -#- - - - - - - -RP 2218 (receiver) | <* * * * * * * * * * * * * * * 2219 | 2220 LEGEND 2222 # PIM Router 2223 RP Rendezvous Point 2224 * * RP-Tree (Shared) 2225 % % Shortest-Path Tree (Source-based) 2227 Figure 20: Shared RP-Tree and Shortest Path Tree (SPT) 2228 ======================================================================== 2229 Besides a last-hop router being able to switch to a source-based tree, 2230 there is also the capability of the RP for a group to transition to a 2231 source's shortest-path tree. Similar controls (bandwidth threshold, 2232 administrative weights, etc.) can be used at an RP to influence these 2233 decisions. 2235 8.2 Core Based Trees (CBT) 2237 Core Based Trees is another multicast architecture that is based on a 2238 shared delivery tree.
It is specifically intended to address the 2239 important issue of scalability when supporting multicast applications 2240 across the public Internet. 2242 Similar to PIM-SM, CBT is protocol-independent. CBT employs the 2243 information contained in the unicast routing table to build its shared 2244 delivery tree. It does not care how the unicast routing table is 2245 derived, only that a unicast routing table is present. This feature 2246 allows CBT to be deployed without requiring the presence of any specific 2247 unicast routing protocol. 2249 Another similarity to PIM-SM is that CBT has adopted the core discovery 2250 mechanism ("bootstrap") defined in the PIM-SM specification. For 2251 inter-domain discovery, efforts are underway to standardize (or at least 2252 separately specify) a common RP/Core discovery mechanism. The intent is 2253 that any shared tree protocol could implement this common discovery 2254 mechanism using its own protocol message types. 2256 In a significant departure from PIM-SM, CBT has decided to maintain its 2257 scaling characteristics by not offering the option of shifting from a 2258 Shared Tree (e.g., PIM-SM's RP-Tree) to a Shortest Path Tree (SPT) to 2259 optimize delay. The designers of CBT believe that this is a critical 2260 decision since, when multicasting becomes widely deployed, the need for 2261 routers to maintain large amounts of state information will become the 2262 overpowering scaling factor. 2264 Finally, unlike PIM-SM's shared tree state, CBT state is bi-directional. 2265 Data may therefore flow in either direction along a branch. Thus, data 2266 from a source which is directly attached to an existing tree branch need 2267 not be encapsulated. 2269 8.2.1 Joining a Group's Shared Tree 2271 A host that wants to join a multicast group issues an IGMP host 2272 membership report. This message informs its local CBT-aware router(s) 2273 that it wishes to receive traffic addressed to the multicast group. 2274 Upon receipt of an IGMP host membership report for a new group, the 2275 local CBT router issues a JOIN_REQUEST hop-by-hop toward the group's 2276 core router. 2278 If the JOIN_REQUEST encounters a router that is already on the group's 2279 shared tree before it reaches the core router, then that router issues a 2280 JOIN_ACK hop-by-hop back toward the sending router. If the JOIN_REQUEST 2281 does not encounter an on-tree CBT router along its path towards the 2282 core, then the core router is responsible for responding with a 2283 JOIN_ACK. In either case, each intermediate router that forwards the 2284 JOIN_REQUEST towards the core is required to create a transient "join 2285 state." This transient "join state" includes the multicast group, and 2286 the JOIN_REQUEST's incoming and outgoing interfaces. This information 2287 allows an intermediate router to forward returning JOIN_ACKs along the 2288 exact reverse path to the CBT router which initiated the JOIN_REQUEST. 2290 As the JOIN_ACK travels towards the CBT router that issued the 2291 JOIN_REQUEST, each intermediate router creates new "active state" for 2292 this group. New branches are established by having the intermediate 2293 routers remember which interface is upstream, and which interface(s) 2294 is(are) downstream. Once a new branch is created, each child router 2295 monitors the status of its parent router with a keepalive mechanism, 2296 the CBT "Echo" protocol.
A child router periodically unicasts a 2297 CBT_ECHO_REQUEST to its parent router, which is then required to respond 2298 with a unicast CBT_ECHO_REPLY message. 2300 ======================================================================== 2302 #- - - -#- - - - -# 2303 | \ 2304 | # 2305 | 2306 # - - - - # 2307 member | | 2308 host --| | 2309 | --Join--> --Join--> --Join--> | 2310 |- [DR] - - - [:] - - - -[:] - - - - [@] 2311 | <--ACK-- <--ACK-- <--ACK-- 2312 | 2314 LEGEND 2316 [DR] CBT Designated Router 2317 [:] CBT Router 2318 [@] Target Core Router 2319 # CBT Router that is already on the shared tree 2321 Figure 21: CBT Tree Joining Process 2322 ======================================================================== 2323 If, for any reason, the link between an on-tree router and its parent 2324 should fail, or if the parent router is otherwise unreachable, the 2325 on-tree router transmits a FLUSH_TREE message on its child interface(s) 2326 which initiates the tearing down of all downstream branches for the 2327 multicast group. Each downstream router is then responsible for 2328 re-attaching itself (provided it has a directly attached group member) 2329 to the group's shared delivery tree. 2331 The Designated Router (DR) is elected by CBT's "Hello" protocol and 2332 functions as THE single upstream router for all groups using that link. 2333 The DR is not necessarily the best next-hop router to every core for 2334 every multicast group. The implication is that it is possible for a 2335 JOIN_REQUEST to be redirected by the DR across a link to the best 2336 next-hop router providing access a given group's core. Note that data 2337 traffic is never duplicated across a link, only JOIN_REQUESTs, and the 2338 volume of this JOIN_REQUEST traffic should be negligible. 2340 8.2.2 Data Packet Forwarding 2342 When a JOIN_ACK is received by an intermediate router, it either adds 2343 the interface over which the JOIN_ACK was received to an existing 2344 forwarding cache entry, or creates a new entry if one does not already 2345 exist for the multicast group. When a CBT router receives a data packet 2346 addressed to the multicast group, it simply forwards the packet over all 2347 outgoing interfaces as specified by the forwarding cache entry for the 2348 group. 2350 8.2.3 Non-Member Sending 2352 Similar to other multicast routing protocols, CBT does not require that 2353 the source of a multicast packet be a member of the multicast group. 2354 However, for a multicast data packet to reach the active core for the 2355 group, at least one CBT-capable router must be present on the non-member 2356 source station's subnetwork. The local CBT-capable router employs 2357 IP-in-IP encapsulation and unicasts the data packet to the active core 2358 for delivery to the rest of the multicast group. 2360 8.2.4 CBT Multicast Interoperability 2362 Multicast interoperability is currently being defined. Work is underway 2363 in the IDMR working group to describe the attachment of stub-CBT and 2364 stub-PIM domains to a DVMRP backbone. Future work will focus on 2365 developing methods of connecting non-DVMRP transit domains to a DVMRP 2366 backbone. 2368 CBT interoperability will be achieved through the deployment of domain 2369 border routers (BRs) which enable the forwarding of multicast traffic 2370 between the CBT and DVMRP domains. The BR implements DVMRP and CBT on 2371 different interfaces and is responsible for forwarding data across the 2372 domain boundary. 
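The per-group forwarding step of section 8.2.2 can be sketched as follows. The
cache layout is illustrative, and excluding the arrival interface from the copy
set is an assumption made here to avoid sending a packet back onto the branch
it arrived from; it is not wording taken from the CBT specification.

========================================================================

# Illustrative sketch of CBT data packet forwarding (section 8.2.2).
# Because CBT tree state is bi-directional, the cache is keyed on the
# group alone rather than on (source, group) pairs.

forwarding_cache = {
    "224.1.1.1": {"if1", "if2", "if3"},   # interfaces on the shared tree
}

def cbt_forward(group, arrival_interface):
    """Return the interfaces over which a data packet for 'group' is
    copied: the tree interfaces in the cache entry, excluding the one
    the packet arrived on (an assumption of this sketch)."""
    tree_interfaces = forwarding_cache.get(group, set())
    return tree_interfaces - {arrival_interface}

# A packet for 224.1.1.1 arriving on if2 is copied onto if1 and if3.
print(sorted(cbt_forward("224.1.1.1", "if2")))   # -> ['if1', 'if3']

========================================================================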
8.2.3 Non-Member Sending

Similar to other multicast routing protocols, CBT does not require
that the source of a multicast packet be a member of the multicast
group.  However, for a multicast data packet to reach the active core
for the group, at least one CBT-capable router must be present on the
non-member source station's subnetwork.  The local CBT-capable router
employs IP-in-IP encapsulation and unicasts the data packet to the
active core for delivery to the rest of the multicast group.

8.2.4 CBT Multicast Interoperability

Multicast interoperability is currently being defined.  Work is
underway in the IDMR working group to describe the attachment of
stub-CBT and stub-PIM domains to a DVMRP backbone.  Future work will
focus on developing methods of connecting non-DVMRP transit domains to
a DVMRP backbone.

CBT interoperability will be achieved through the deployment of domain
border routers (BRs) which enable the forwarding of multicast traffic
between the CBT and DVMRP domains.  The BR implements DVMRP and CBT on
different interfaces and is responsible for forwarding data across the
domain boundary.

========================================================================

    /---------------\        /---------------\
    |               |        |               |
    |               |        |               |
    |     DVMRP     |--[BR]--|   CBT Domain  |
    |    Backbone   |        |               |
    |               |        |               |
    \---------------/        \---------------/

              Figure 22: Domain Border Routers (BRs)
========================================================================

The BR is also responsible for exporting selected routes out of the
CBT domain into the DVMRP domain.  While the CBT stub domain never
needs to import routes, the DVMRP backbone needs to import routes to
any sources of traffic which are inside the CBT domain.  The routes
must be imported so that DVMRP can perform its RPF check.

9. INTEROPERABILITY FRAMEWORK FOR MULTICAST BORDER ROUTERS

In late 1996, the IETF IDMR working group began discussing a formal
structure that would describe the way different multicast routing
protocols should interact inside a multicast border router (MBR).
The work can be found in the internet draft "Interoperability Rules
for Multicast Routing Protocols" (see Section 10.2), or its successor.
The draft covers explicit rules for the major multicast routing
protocols that existed at the end of 1996: DVMRP, MOSPF, PIM-DM,
PIM-SM, and CBT, but it applies to any potential multicast routing
protocol as well.

The IDMR standards work will focus on this generic inter-protocol MBR
scheme, rather than on writing 25 documents: 20 detailing how each of
those 5 protocols must interwork with the other 4, plus 5 detailing
how two disjoint regions running the same protocol must interwork.

9.1 Requirements for Multicast Border Routers

In order to ensure reliable multicast delivery across a network with
an arbitrary mixture of multicast routing protocols, some constraints
are imposed to limit the scope of the problem space.

Each multicast routing domain, or region, may be connected in a "tree
of regions" topology.  If more arbitrary inter-regional topologies are
desired, a hierarchical multicast routing protocol (such as H-DVMRP)
must be employed, because it carries topology information about how
the regions are interconnected.  Until this information is available,
we must consider the case of a tree of regions with one
centrally-placed "backbone" region.  Each pair of regions is
interconnected by one or more MBRs.

An MBR is responsible for injecting a default route into its "child
regions," and also for injecting subnetwork reachability information
into its "parent region," optionally using aggregation techniques to
reduce the volume of the information while preserving its meaning.
MBRs which comply with the interoperability rules draft have other
characteristics and duties, including:

o  The MBR consists of at least two active routing components, each an
   instance of some multicast routing protocol.  No assumption is made
   about the type of routing protocol (e.g., broadcast-and-prune or
   explicit-join; distance-vector or link-state; etc.) any component
   runs, or about the nature of a "component".  Multiple components
   running the same protocol are allowed.

o  An MBR forwards packets between two or more independent regions,
   with one or more active interfaces per region, but only one
   component per region.

o  Each interface for which multicast is enabled is "owned" by exactly
   one of the components at a time.

o  All components share a common forwarding cache of (S,G) entries,
   which are created when data packets are received, and which can be
   deleted at any time.  The component owning an interface is the only
   component that may change forwarding cache entries pertaining to
   that interface.  Each forwarding cache entry has a single incoming
   interface (iif) and a list of outgoing interfaces (oiflist).
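A minimal sketch of such a shared forwarding cache is shown below
(Python again; the component and interface names are invented for
illustration, and this is not taken from the interoperability draft).
It enforces the ownership rule above: only the component that owns an
interface may change cache state pertaining to that interface.

    class ForwardingEntry:
        """One (S,G) entry: a single incoming interface and a list
        of outgoing interfaces."""
        def __init__(self, iif=None):
            self.iif = iif
            self.oiflist = []

    # Shared by all of the MBR's routing components.
    forwarding_cache = {}                 # (S, G) -> ForwardingEntry

    # Each multicast-enabled interface is owned by exactly one
    # component at a time (names are purely illustrative).
    iface_owner = {"if0": "dvmrp", "if1": "cbt"}

    def add_outgoing_interface(component, source, group, iface):
        """Let a component add an interface to an (S,G) entry's
        oiflist, creating the entry on demand (e.g., when a data
        packet for (S,G) is first seen)."""
        if iface_owner.get(iface) != component:
            raise ValueError("%s does not own %s" % (component, iface))
        entry = forwarding_cache.setdefault((source, group),
                                            ForwardingEntry())
        if iface not in entry.oiflist:
            entry.oiflist.append(iface)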
10. REFERENCES

10.1 Requests for Comments (RFCs)

1075  "Distance Vector Multicast Routing Protocol," D. Waitzman,
      C. Partridge, and S. Deering, November 1988.

1112  "Host Extensions for IP Multicasting," Steve Deering,
      August 1989.

1583  "OSPF Version 2," John Moy, March 1994.

1584  "Multicast Extensions to OSPF," John Moy, March 1994.

1585  "MOSPF: Analysis and Experience," John Moy, March 1994.

1700  "Assigned Numbers," J. Reynolds and J. Postel, October 1994.
      (STD 2)

1812  "Requirements for IP Version 4 Routers," Fred Baker, Editor,
      June 1995.

2000  "Internet Official Protocol Standards," Jon Postel, Editor,
      February 1997.

10.2 Internet-Drafts

"Core Based Trees (CBT) Multicast: Architectural Overview,"
A. J. Ballardie, March 1997.

"Core Based Trees (CBT) Multicast: Protocol Specification,"
A. J. Ballardie, March 1997.

"Core Based Tree (CBT) Multicast Border Router Specification for
Connecting a CBT Stub Region to a DVMRP Backbone," A. J. Ballardie,
March 1997.

"Distance Vector Multicast Routing Protocol," T. Pusateri,
February 19, 1997.

"Internet Group Management Protocol, Version 2," William Fenner,
January 22, 1997.

"Internet Group Management Protocol, Version 3," Brad Cain,
Ajit Thyagarajan, and Steve Deering, Expired.

"Protocol Independent Multicast-Dense Mode (PIM-DM): Protocol
Specification," D. Estrin, D. Farinacci, A. Helmy, V. Jacobson, and
L. Wei, September 12, 1996.

"Protocol Independent Multicast-Sparse Mode (PIM-SM): Motivation and
Architecture," S. Deering, D. Estrin, D. Farinacci, V. Jacobson,
C. Liu, and L. Wei, November 19, 1996.

"Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol
Specification," D. Estrin, D. Farinacci, A. Helmy, D. Thaler,
S. Deering, M. Handley, V. Jacobson, C. Liu, P. Sharma, and L. Wei,
October 9, 1996.

(Note: Results of IESG review were announced on December 23, 1996:
this internet-draft is to be published as an Experimental RFC.)

"PIM Multicast Border Router (PMBR) specification for connecting
PIM-SM domains to a DVMRP Backbone," D. Estrin, A. Helmy, D. Thaler,
February 3, 1997.

"Administratively Scoped IP Multicast," D. Meyer, December 23, 1996.

"Interoperability Rules for Multicast Routing Protocols," D. Thaler,
November 7, 1996.

See the IDMR home pages for an archive of specifications.

10.3 Textbooks

Comer, Douglas E.  Internetworking with TCP/IP, Volume 1: Principles,
Protocols, and Architecture, Second Edition, Prentice Hall, Inc.,
Englewood Cliffs, New Jersey, 1991.

Huitema, Christian.  Routing in the Internet, Prentice Hall, Inc.,
Englewood Cliffs, New Jersey, 1995.

Stevens, W. Richard.  TCP/IP Illustrated, Volume 1: The Protocols,
Addison Wesley Publishing Company, Reading, MA, 1994.
Wright, Gary and W. Richard Stevens.  TCP/IP Illustrated, Volume 2:
The Implementation, Addison Wesley Publishing Company, Reading, MA,
1995.

10.4 Other

Deering, Steven E.  "Multicast Routing in a Datagram Internetwork,"
Ph.D. Thesis, Stanford University, December 1991.

Ballardie, Anthony J.  "A New Approach to Multicast Communication in
a Datagram Internetwork," Ph.D. Thesis, University of London,
May 1995.

"Hierarchical Distance Vector Multicast Routing for the MBone,"
Ajit Thyagarajan and Steve Deering, July 1995.

11. SECURITY CONSIDERATIONS

Security issues are not discussed in this memo.

12. ACKNOWLEDGEMENTS

This document would not have been possible without the encouragement
of Mike O'Dell and the support of Joel Halpern and David Meyer.  Also
invaluable were the feedback and comments of the IETF MBoneD and IDMR
working groups.  Certain people spent considerable time commenting on
and discussing this paper with the authors, and deserve to be
mentioned by name: Tony Ballardie, Steve Casner, Jon Crowcroft, Steve
Deering, Bill Fenner, Hugh Holbrook, Cyndi Jung, Shuching Shieh, Dave
Thaler, and Nair Venugopal.  Our apologies to anyone we
unintentionally neglected to list here.

13. AUTHORS' ADDRESSES

Tom Maufer
3Com Corporation
5400 Bayfront Plaza
P.O. Box 58145
Santa Clara, CA 95052-8145

Phone: +1 408 764-8814
Email:

Chuck Semeria
3Com Corporation
5400 Bayfront Plaza
P.O. Box 58145
Santa Clara, CA 95052-8145

Phone: +1 408 764-7201
Email: