Nimrod Working Group                                        Ram Ramanathan
Internet Draft                                BBN Systems and Technologies
February 1995                                      Expires 30 July 1995
draft-ietf-nimrod-multicast-00.txt

    Multicast Support for Nimrod : Requirements and Solution Approaches

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents
of the Internet Engineering Task Force (IETF), its areas, and its working
groups. Note that other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and
may be updated, replaced, or obsoleted by other documents at any time. It
is inappropriate to use Internet-Drafts as reference material or to cite
them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe),
ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).

Distribution of this draft is unlimited.
Please send comments to nimrod-wg@bbn.com.

Abstract

We discuss the issue of multicasting in Nimrod. We identify the
requirements that Nimrod has of any solution for multicast support. We
compare existing approaches for multicasting within an internetwork and
discuss their advantages and disadvantages. Finally, as an example, we
outline the mechanisms to support multicast in Nimrod using the scheme
currently being developed within the IETF - namely, the Protocol
Independent Multicast (PIM) protocol.

Contents

1 Introduction
2 Multicast vs Unicast
3 Goals and Requirements
4 Approaches
5 A Multicasting Scheme based on PIM
  5.1 Overview
  5.2 Joining and Leaving a Tree
      5.2.1 An Example
  5.3 Establishing a Shared Tree
  5.4 Switching to a Source-Rooted Shortest Path Tree
  5.5 Miscellaneous Issues
6 Security Considerations
7 Summary
8 Acknowledgements
9 Author's Address

1 Introduction

The nature of emerging applications such as videoconferencing, remote
classroom, etc. makes the support for multicasting essential for any
future routing architecture. Multicasting is performed by using a
multicast delivery tree whose leaves are the multicast destinations.

Nimrod does not propose a solution for the multicasting problem. There
are two chief reasons for this. First, multicasting is a non-trivial
problem whose requirements are still not well understood. Second, a
number of groups (for instance the IDMR working group of the IETF) are
studying the problem in its own right, and it is not our intention to
duplicate those efforts.
This attitude towards multicasting is consistent with Nimrod's general
philosophy of flexibility, adaptability and incremental change.

While a multicasting solution per se is not part of the ``core'' Nimrod
architecture, Nimrod does require that the solution have certain
characteristics. It is the purpose of this document to discuss some of
these requirements and evaluate approaches towards meeting them.

This document is organized as follows. In section 2 we discuss why
multicasting is treated a little differently than unicast despite the
fact that the former is essentially a generalization of the latter. In
section 3 we present the goals and requirements of Nimrod multicasting.
Following that, in section 4 we discuss current approaches toward
multicasting. In section 5, we give an example of how Nimrod
multicasting may be done using PIM [DEF+94a]. For readers who do not
have the time to go through the entire document, a summary is given at
the end.

This document uses many terms and concepts from the Nimrod Architecture
document [CCS94] and some terms and concepts (in section 5) from the
Nimrod Functionality document [Ste94]. Much of the discussion assumes
that you have read at least the Nimrod Architecture document [CCS94].

2 Multicast vs Unicast

We begin by looking at the similarities and differences between unicast
routing and multicast routing. Both unicast and multicast routing
require two phases - route generation and packet forwarding. In the case
of unicast routing, Nimrod specifies modes of packet forwarding; route
generation itself is not specified but left to the particular routing
agent. For multicasting, Nimrod leaves both route generation and packet
forwarding mechanisms unspecified. To explain why, we first point out
three aspects that make multicasting quite different from unicasting:

o Groups and group dynamism. In multicasting, the destinations are part
  of a group, whose membership is dynamic.
This brings up the following issues:

  -- Maintaining an association between the multicast group and the EIDs
     and locators of the members comprising that group. This is
     especially relevant in the case of sender initiated multicasting
     and policy support.

  -- A mechanism to add new members to the delivery tree in response to
     joins, and a mechanism to ``prune'' the delivery tree in response
     to departures.

o State creation. Most solutions to multicasting can essentially be
  viewed as creating state in routers for multicast packet forwarding.
  Multicasting solutions differ based on who creates this state; there
  are several options - e.g., the sender, the receivers or the
  intermediate routers.

o Route generation. Even more so than in unicast routing, one can choose
  from a rich spectrum of heuristics with different tradeoffs between a
  number of parameters (such as cost and delay, algorithmic time
  complexity and optimality, etc.). For instance, some heuristics
  produce a low-cost tree with high end-to-end delay and some produce
  trees that give the shortest path to each destination but with a
  higher cost. Heuristics for multicasting are a significant research
  area today, and we expect advances to result in sophisticated
  heuristics in the near future.

Noting that there are various possible combinations of route generation,
group dynamism handling and state creation for a solution, and that each
solution conceivably has applications for which it is the most suitable,
we do not specify one particular approach to multicasting in Nimrod.
Every implementation of Nimrod is free to use its own multicasting
technique, as long as it meets the goals and requirements of Nimrod.
However, for interoperability, it is necessary that certain things be
agreed upon - for instance, the structure of the forwarding information
database that the implementations create (we discuss this in more detail
in section 4).

Thus, we do not discuss the details of any multicast solution here, only
its requirements in the context of Nimrod. Specifically, we structure
the discussion in the remainder of this document on the following two
themes:

o What are the goals that we want to meet in providing multicasting in
  Nimrod, and what specific requirements do these goals imply for the
  multicast solution?

o What are some of the approaches to multicasting being discussed
  currently, and how relevant is each of these approaches to Nimrod?

3 Goals and Requirements

The chief goals of Nimrod multicasting and their implications on
solution requirements are as follows:

1. Scalability. Nimrod multicasting must scale in terms of the size of
   the internetwork, the number of groups supported and the number of
   members per group. It must also support group dynamism efficiently.
   This has the following implications for the solution:

   o Routers not on the direct path to the multicast destinations
     should not be involved in state management. In a network with a
     large number of routers, a solution that does involve such routers
     is unlikely to scale (e.g., the current implementation of DVMRP).

   o It is likely that there will be a number of applications that have
     a few members per group (e.g., medical imaging) and a number of
     applications that have a large number of members per group (e.g.,
     news distribution). Nimrod multicasting should scale for both
     these situations. If no single mechanism adequately scales for
     both sparse and dense group memberships simultaneously, a
     combination of mechanisms should be considered.
   o In the face of group membership change, there must be a facility
     for incremental addition or deletion of ``branches'' in the
     multicast tree. Reconstructing the tree from scratch is not
     likely to scale.

   o It is likely that we will have some well-known groups (i.e.,
     groups which are more or less permanent in existence) and some
     ephemeral groups. The dynamics of group membership are likely to
     be different for each class of groups, and the solution should
     take that into account as appropriate.

2. Policy support. This includes both quality of service (QOS) and
   access restrictions, although currently, demand is probably higher
   for QOS. In particular, every path from the source to each
   destination in the multicast group should satisfy the requested
   quality of service and conform to the access restrictions. The
   implications for the multicasting solution are:

   o It is likely that many multicasting applications will be cost
     conscious in addition to having strict quality of service bounds
     (such as delay and jitter). Balancing these will necessitate
     dealing with some new parameters - e.g., the tree cost (sum of the
     ``cost'' of each link), the tree delay (maximum, mean and variance
     in end-to-end delay), etc.

   o In order to support policy-based routing, we need to know where
     the destinations are (so that we can decide what route we can take
     to them). In such a case, a mechanism that provides an
     association between a group id and a set of destination locators
     is probably required.

   o Some policy constraints are likely to be destination specific.
     For instance, a domain might refuse transit service to traffic
     going to certain destination domains. This presents certain
     unique problems - in particular, for a single group, multiple
     trees may need to be built, each tree ``servicing'' disjoint
     partitions of the multicast destinations.

3.
   Resource sharing. Multicasting typically goes hand in hand with large
   traffic volume or applications with a high demand for resources.
   These, in turn, imply efficient resource management and sharing if
   possible. Therefore, it is important that we place an emphasis on
   interaction with resource reservation. For instance, Nimrod must be
   able to provide information on which tree resources are shareable and
   which are not, so that resource reservation may use it while
   allocating resources to flows.

4. Interoperability. There are two issues in this context. First, the
   solution must be independent of mechanisms that provide the solution
   with information it needs. For instance, many multicast solutions
   (e.g., PIM) make use of information supplied by unicast routing
   protocols. The multicast solution must not be dependent on which
   unicast protocol is used.

   Second, a multicast solution must interoperate with other multicast
   solutions in the construction of a delivery tree. This implies some
   kind of ``agreement'' at some ``level''. For instance, the agreement
   could be that everybody use the same structure for storing forwarding
   information in the routers. Since the delivery tree is defined by the
   nature of forwarding information in the routers and not by the
   particular mechanism used to create that information, multiple
   implementations can coexist.

4 Approaches

The approaches to multicasting currently in operation and those being
considered by the IETF include the following:

1. Distance vector multicast routing protocol (DVMRP) [DC90]. This
   approach is based upon distance-vector routing information
   distribution and hop-by-hop forwarding. It uses Reverse Path
   Forwarding (RPF) [DM78] - a distributed algorithm for constructing an
   internetwork broadcast tree.
   DVMRP uses a modified RPF algorithm, essentially a truncated
   broadcast tree, to build a reverse shortest path sender-based
   multicast delivery tree. A reverse shortest path from s to d is a
   path that uses the same intermediate nodes as those in the shortest
   path from d to s. (If the paths are symmetric, i.e., cost the same in
   either direction, the reverse shortest path is the same as the
   shortest path.) An implementation of RPF exists in the current
   Internet in what is commonly referred to as the MBONE. An improvement
   to this is in the process of being deployed. It incorporates
   ``prune'' messages to further truncate the routers not on the path to
   the destinations and ``graft'' messages to undo this truncation, if
   later necessary.

   The main advantage of this scheme is that it is simple. The major
   handicap is scalability. Two issues have been raised in this
   context [BFC93]. First, if S is the number of active sources and G
   the number of groups, then the state overhead is O(GS) and might be
   unacceptable when resources are limited. Second, routers not on a
   multicast tree are involved (in terms of sending/tracking prune and
   graft messages) even though they might not be interested in the
   particular source-group pair. The performance of this scheme is
   expected to be relatively poor for large networks with sparsely
   distributed group membership. Furthermore, no support for policies or
   QOS is provided.

2. Core Based Trees (CBT) [BFC93]. This scheme uses a single tree per
   group, shared by all sources. This tree has a single router as the
   core (with additional routers for robustness) from which branches
   emanate. The chief distinguishing characteristic of CBT is that it is
   receiver initiated, i.e., receivers wishing to join a multicast group
   find the tree (or its core) and attach themselves to it, without any
   participation from the sources.

   The chief motivation behind this scheme is the reduction of the state
   overhead, to O(G), in comparison to DVMRP and PIM (described below).
   Also, only routers in the path between the core and the potential
   members are involved in the process. Core-based tree formation and
   packet flow are decoupled from underlying unicast routing.

   The main disadvantage is that packets no longer traverse the shortest
   path from the source to their destinations. The performance in
   general depends on judicious placement of cores and coordination
   between them. Traffic concentration on links incident to the core is
   another problem. There is also a dependence on network entities (in
   other administrative domains, for instance) for resource reservation
   and policy routing.

3. Protocol Independent Multicasting (PIM) [DEFJ93]. Yet another
   approach based on the receiver initiated philosophy, this is designed
   to reap the advantages of DVMRP and CBT. Using a ``rendezvous
   point'', a concept similar to the core discussed above, it allows for
   the simultaneous existence of shared and source-specific multicast
   trees. In the steady state, data can be delivered over the reverse
   shortest path from the sender to the receiver (for better end-to-end
   delay) or over the shared tree.

   Using two modes of operation, sparse and dense, this provides
   improved performance both when the group membership in an
   internetwork is sparse and when it is dense. It is, however, a
   complex protocol. A limitation of PIM is that the shortest paths are
   based on the reverse metrics and are therefore truly ``shortest''
   only when the links are symmetric.

4. Multicast Open Shortest Path First (MOSPF) [Moy92]. Unlike the
   above-mentioned approaches, this is based on link-state routing
   information distribution. The packet forwarding mechanism is
   hop-by-hop.
   Since every router has complete topology information, every router
   computes the shortest path multicast tree from any source to any
   group using Dijkstra's algorithm. If the router doing the computation
   falls within the tree computed, it can determine which links it must
   forward copies onto.

   MOSPF inherits the advantages of OSPF and link-state distribution,
   namely localized route computation (and easy verification of
   loop-freedom), fast convergence to link-state changes, etc. However,
   group membership information is sent throughout the network,
   including links that are not in the direct path to the multicast
   destinations. Thus, like DVMRP, this is most suitable for small
   internetworks, that is, as an intra-domain routing mechanism.

5. Inter-Domain Policy Routing (IDPR) [Ste]. This approach uses
   link-state routing information distribution like MOSPF, but uses
   source-specified packet forwarding. Using the link-state database,
   the source generates a policy multicast route to the destinations.
   Using this, the IDPR path-setup procedure sets up state in
   intermediate entities for packet duplication and forwarding. The
   state contains information about the next-hop entities for the
   multicast flow. When a data packet arrives, it is forwarded to each
   next-hop entity obtained from the state.

   Among the advantages of this approach are its ability to support
   policy-based multicast routing with ease and independence
   (flexibility) in the choice of the multicasting algorithm used at the
   source. IDPR also allows resource sharing over multiple multicast
   trees. The major disadvantage is that it is relatively more difficult
   to handle group membership changes (additions and deletions), since
   such changes must first be communicated to the source of the tree,
   which will then add branches appropriately.

We now discuss the applicability of these approaches to Nimrod.
Common to all of the approaches described is the fact that we need to
set up state in the intermediate routers for multicast packet
forwarding. The approaches differ mainly on who initiates the state
creation - the sender (e.g., IDPR, PIM), the receiver (e.g., CBT, PIM),
or the routers themselves, which create state without initiation by the
sender or receivers (e.g., DVMRP, MOSPF).

Nimrod should be able to accommodate both sender initiated and receiver
initiated state creation for multicasting. In the remainder of this
section, we discuss the pros and cons of these approaches for Nimrod.

Nimrod uses link-state routing information distribution (topology maps)
and has four modes of packet forwarding - flow mode, Connectivity
Specification Chain (CSC) mode, BTES mode and datagram mode [CCS94].

An approach similar to that used in IDPR is viable for multicasting
using the flow mode. The source can set up state in intermediate routers
which can then appropriately duplicate packets. For the CSC, BTES and
datagram modes, an approach similar to the one used in MOSPF is
applicable. In these situations, the advantages and disadvantages of
these approaches in the context of Nimrod are similar to the advantages
and disadvantages of IDPR and MOSPF respectively.

Sender based trees can be set up using an approach similar to IDPR's,
generalizing it to an ``n''-level hierarchy. A significant advantage of
this approach is policy-based routing. The source knows about the
policies of nodes that care to advertise them and can choose a route the
way it wants (i.e., it need not depend upon other entities to choose the
route, as in some schemes mentioned above). Another advantage is that
each source can use the multicast route generation algorithm and packet
forwarding scheme that best suits it, instead of being forced to use
whatever is implemented elsewhere in the network.
Further, this approach allows for incrementally deploying new multicast
tree generation algorithms as research in that area progresses.

CBT-like methods may be used to set up receiver initiated trees. Nimrod
provides link-state maps for generating routes, and a CBT-like method is
compatible with this. For instance, a receiver wishing to join a group
may generate a (policy) route to the core for that group using its
link-state map and attach itself to the tree.

A disadvantage of sender based methods in general seems to be the
support of group dynamism. Specifically, if there is a change in the
membership of the group, the particular database which contains the
group-destination mapping must be updated. In comparison, receiver
oriented approaches seem to be able to accommodate group dynamism more
naturally.

Nimrod does not preclude the simultaneous existence of multiple
approaches to multicasting and the possibility of switching from one to
the other depending on the dynamics of group distributions.
Interoperability is an issue - that is, the question of whether or not
different implementations of Nimrod can participate in the same tree.
However, as long as there is agreement on the structure of the state
created (i.e., the states can be interpreted uniformly for packet
forwarding), this should not be a problem. For instance, a receiver
wishing to join a sender created tree might set up state on a path
between itself and a router on the tree, with the sender itself being
unaware of it. Packets entering the router would now be additionally
forwarded along this new ``branch'' to the new receiver.

In conclusion, the architecture of Nimrod can accommodate diverse
approaches to multicasting. Each approach has its disadvantages with
respect to the requirements mentioned in the previous section.
The architecture does not demand that one particular solution be used,
and indeed, we expect that a combination of approaches will be employed
and engineered in a manner most appropriate to the requirements of the
particular application or subscriber.

5 A Multicasting Scheme based on PIM

The Inter-Domain Multicast Routing (IDMR) working group of the IETF has
drafted a specification for a new multicast scheme, namely, Protocol
Independent Multicasting (PIM), for use in the Internet [DEF+94a,
DEF+94b]. In this section, we describe how the schemes mentioned therein
may be implemented using the facilities provided by Nimrod.

We note that the path setup facility provided in Nimrod makes it well
suited to PIM-style multicasting; despite the length of the description
given here, we assure the reader that it is quite simple to implement
PIM-style multicasting in Nimrod.

Before reading this section, we recommend that the reader acquire some
familiarity with PIM (see [DEF+94a, DEF+94b]).

5.1 Overview

The PIM architecture maintains the traditional IP multicast service
model of receiver-initiated membership and is independent of any
specific unicast routing protocol (hence the name).

A significant aspect of PIM is that it provides mechanisms for
establishing two kinds of trees - a shared tree, which is intended for
low ``cost'' multicasting, and a source-based tree, intended for low
delay multicasting.

A shared tree is rooted at a rendezvous point (RP), which is typically a
prespecified router for the multicast group in question. In order to
establish a shared tree, a designated router (DR) for a host H wishing
to join a group G initiates a flow setup from the RP for G to the DR. A
source S wishing to send to a group G initiates a flow setup between S
and the RP for group G.
At the conclusion of these flow setups, packets can be forwarded from S
to H through the RP. For details on the protocol used to implement this
flow setup, please refer to [DEF+94b].

After the shared tree has been set up, a recipient for group G has the
option of switching to a source-based shortest path tree. In such a
tree, packets are delivered from the source to each recipient along the
shortest path. To establish a source-based shortest path tree, the DR
for H looks at the source S of the packets it is receiving via the
shared tree and establishes a flow between S and the DR. The flow is
established along the shortest path from the DR to S (thus, strictly
speaking, it is the reverse shortest path that is being used).
Subsequently, packets can be forwarded from S to H using this shortest
path, thereby bypassing the RP. For details on the protocol used to
implement source-based trees in PIM, please refer to [DEF+94b].

When a host wishes to leave a multicast group, its designated router
sends a prune message towards the source (for source-based trees) or
towards the RP (for shared trees). For details on this and other
features of PIM, please refer to [DEF+94b].

In Nimrod, PIM is implemented as follows (we refer to PIM-based
multicast as Nimpim). In order to join a shared tree, an endpoint (or an
agent acting on behalf of the endpoint) wishing to join a group G
queries the association database for the EID and locator of the RP for G
(for well-known groups the association may be configured). It is
required that such an association be maintained for every multicast
group G. The endpoint gets a route to the RP and initiates a multicast
flow setup to the RP. (A multicast flow setup is similar to a unicast
flow setup described in [CCS94] except for one feature - when a
multicast flow setup request reaches a node that already has that flow
present, the request is not forwarded further. The new flow gets
``spliced'' in as a new branch of the existing multicast tree.)
Similarly, the source establishes a flow to the RP. The RP creates state
to associate these two flows, and now packets can be forwarded to the
endpoints from the source. Note that each flow setup may be
``hierarchical'' and involve many subflows. All this, however, is
transparent to Nimpim. For details on management of hierarchical flows,
please refer to [CCS94].

To create the source-based tree, the representative for a recipient
node N obtains the EID and locator of the source from the data packets
and initiates a multicast flow setup to the source. The route agent for
node N uses its map in order to calculate the shortest path from the
source to N. The flow request is sent along the reverse of this path. We
note that the ``shortness'' of the path is constrained by the amount of
routing information available locally. However, since the map is
available locally, one can find the actual shortest path from the source
to N rather than use the shortest path from N to S. Thus, with Nimrod
one can actually surmount a shortcoming of PIM with relative ease.

We now discuss some more details of Nimpim. We start with a description
of multicast flow setup. This is the ``basic'' functionality required to
implement multicasting. Having this ``building-block'' spelt out, we use
it to specify the establishment of the shared tree (in section 5.3) and
the establishment of a source-based tree (in section 5.4).

We only discuss sparse-mode multicasting, as described in [DEF+94a],
here. Further, to simplify the discussion, we assume a single Rendezvous
Point per group.
Finally, we ``address'' all entities in terms of their EIDs alone
for reasons of conciseness; the locators could be used in conjunction to
reduce the overhead of database lookups.

5.2 Joining and Leaving a Tree

Nimpim uses two control packets in order to set up a flow: the Nimrod
Multicast Flow-Request packet (NMFReq) and the Nimrod Multicast Flow-Reply
packet (NMFRep).

The NMFReq packet is a control packet identified by a prespecified ``payload
type''. The protocol-specific part of this packet includes the following
fields (except for the Code field, these fields are present in the Unicast
Flow-Request packet too):

1. S-EID : The EID of the initiator of the flow.

2. T-EID : The EID of the target of the flow.

3. Flow-id : A label denoting the flow.

4. Direction : The direction of the flow - whether from the initiator to
   the target (FORW), from the target to the initiator (REVERSE), or both
   (BOTH).

5. Code : Denotes whether the packet is for joining a flow (NMFReq-Join)
   or for leaving a flow (NMFReq-Prune).

6. Source Route : A sequence of node locators through which the packet
   must travel.

The processing of the NMFReq by a forwarding agent at node N is similar to
that of the unicast flow request (see [CCS94]), except for the fact that now
we provide the ability for the new flow to ``splice'' onto an existing
delivery tree or ``un-splice'' from an existing delivery tree.
Specifically:

   begin
     if the flow-id F in NMFReq-Join is in flow-list then
       if T-EID in NMFReq-Join = target in flow state for F then
         if Direction in NMFReq-Join is REVERSE or BOTH then
           add the node preceding N in source route to child list for F
         else
           discard packet
       else
         discard packet
     else
       begin
         install state for F in N, i.e.,
           assign parent(F) = node succeeding N in source route
           assign child(F)  = node preceding N in source route
           assign target(F) = T-EID in NMFReq-Join
         forward NMFReq-Join to parent(F)
       end
   end.

   Figure 1: Algorithm executed by a forwarding agent for node N when it
   receives an NMFReq-Join.

o If the Code is NMFReq-Join, then the algorithm executed by the
  forwarding agent for node N is shown in Figure 1.

o If the Code is NMFReq-Prune, then the algorithm executed by the
  forwarding agent at node N is shown in Figure 2.

The NMFRep packet is used to accept or reject an NMFReq-Join or
NMFReq-Prune. The packet format is the same as that for the unicast flow
request. However, an NMFRep packet is now generated by the first node N
that grafts the new flow to the existing tree. This may be different from
the target of the NMFReq.

The NMFReq - NMFRep exchanges constitute a procedure for joining a multicast
delivery tree (when the Code is Join) and for leaving a multicast delivery
tree (when the Code is Prune). We term these procedures Tree-Join and
Tree-Leave respectively; we shall be using these procedures as
``building-blocks'' in the construction of shared trees (section 5.3) and of
source-based trees (section 5.4).
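As a concrete illustration, the NMFReq fields used by these procedures can
be sketched as a simple structure. This is a minimal sketch; the type and
field names are illustrative assumptions, not taken from any Nimrod
specification:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List

class Direction(Enum):
    FORW = "FORW"        # from the initiator to the target
    REVERSE = "REVERSE"  # from the target to the initiator
    BOTH = "BOTH"

class Code(Enum):
    JOIN = "NMFReq-Join"    # joining a flow
    PRUNE = "NMFReq-Prune"  # leaving a flow

@dataclass
class NMFReq:
    s_eid: str               # S-EID: EID of the initiator of the flow
    t_eid: str               # T-EID: EID of the target of the flow
    flow_id: str             # Flow-id: a label denoting the flow
    direction: Direction
    code: Code
    source_route: List[str]  # node locators the packet must travel through

# A receiver-side join request, of the kind used for the shared tree
# in section 5.3 (all EIDs and locators here are placeholders):
req = NMFReq(s_eid="H-EID", t_eid="RP-EID", flow_id="(RP-EID, G)",
             direction=Direction.REVERSE, code=Code.JOIN,
             source_route=["R", "A", "B", "C", "RP"])
```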
   begin
     if the flow-id F in NMFReq-Prune is in flow-list then
       begin
         delete previous hop in source route from child list for F, if it exists
         if child list for F is empty then
           begin
             delete the flow-id and the state associated with it
             forward to next hop in source route
           end
         else
           discard packet
       end
     else
       forward to next hop in source route
   end.

   Figure 2: Algorithm executed by a forwarding agent for node N when it
   receives an NMFReq-Prune.

5.2.1 An Example

An example of how a tree is joined is given here with the help of Figure 3.
In the figure, bold lines indicate an existing tree. Representative R, on
behalf of host H, joins the tree by sending an NMFReq-Join towards a target
T. When used in the shared tree mode, the target is the RP; when used in
the source tree mode, it is the source (root) of the multicast tree.
Suppose that a host H wants to join the multicast tree. The following steps
are executed:

Step 1. A representative R of H queries the route agent for a route from
   T to R. It obtains the route T - C - B - A - R. It builds an NMFReq-Join
   packet with source route R, A, B, C, T and flow-id F, and forwards it to
   A.

Step 2. A looks for flow F in its installed flow database and doesn't
   find it. It installs state for F (makes R a child and B a parent in
   the multicast tree) and sends the NMFReq-Join packet to B.

Step 3. B looks for flow F in its installed flow database and finds it.
   It adds A to its child list, constructs an NMFRep packet, and
   sends it to A.

Step 4. A forwards the packet to R and the tree joining is complete.
   Branch B-A-R is now added to the tree.

Figure 3: Illustration for the example describing joining an existing
multicast tree.
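To make the splicing behavior concrete, the join logic of Figure 1 and the
prune logic of Figure 2 can be sketched in executable form on the topology
of the example above. This is an illustrative sketch under simplifying
assumptions (string node names, placeholder children "X" and "Y" for the
bold branches of Figure 3); it is not part of any protocol specification:

```python
class FlowState:
    """Per-flow state installed at a forwarding agent (Figure 1)."""
    def __init__(self, parent, child, target):
        self.parent = parent        # node succeeding N in the source route
        self.children = {child}     # downstream branches of the tree
        self.target = target        # T-EID from the NMFReq-Join

# Flow tables, keyed by node name, then by flow-id.
flows = {n: {} for n in ("R", "A", "B", "C", "T")}

# Figure 3's pre-existing tree: flow F already passes through C and B
# toward target T.
flows["C"]["F"] = FlowState(parent="T", child="X", target="T")
flows["B"]["F"] = FlowState(parent="C", child="Y", target="T")

def join(route, flow_id, t_eid, direction="REVERSE"):
    """Walk an NMFReq-Join along `route` (initiator first), per Figure 1.
    Returns the node that generates the NMFRep, or None on a discard."""
    for i in range(1, len(route)):
        n, prev = route[i], route[i - 1]
        nxt = route[i + 1] if i + 1 < len(route) else None
        table = flows[n]
        if flow_id in table:
            st = table[flow_id]
            if st.target == t_eid and direction in ("REVERSE", "BOTH"):
                st.children.add(prev)   # splice the new branch in
                return n                # NMFRep is generated here
            return None                 # discard packet
        # Flow not present: install state and keep forwarding to parent.
        table[flow_id] = FlowState(parent=nxt, child=prev, target=t_eid)
    return route[-1]                    # request reached the target itself

def prune(route, flow_id):
    """Walk an NMFReq-Prune along `route`, per Figure 2."""
    for i in range(1, len(route)):
        n, prev = route[i], route[i - 1]
        table = flows[n]
        if flow_id in table:
            table[flow_id].children.discard(prev)
            if table[flow_id].children:
                return                  # other branches remain: discard
            del table[flow_id]          # last branch gone: prune upward

# Steps 1-4 of the example: R joins with source route R, A, B, C, T.
graft = join(["R", "A", "B", "C", "T"], "F", "T")
print(graft)   # -> B, the node where the NMFRep originates
```

Running the sketch shows the branch B-A-R being added exactly as in Step 4:
A installs fresh state for F, and B, already on the tree, splices A in as a
child and answers with the NMFRep.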
5.3 Establishing a Shared Tree

There are two parts to establishing a shared tree: the receiver-to-RP
communication, wherein the receiver joins the delivery tree rooted at the
RP, and the sender-to-RP communication, wherein the RP joins the delivery
tree rooted at the sender.

Receiver-RP Communications When an endpoint wishes to join a multicast group
G, the endpoint representative obtains the Rendezvous Point EID for G. We
assume that the association database contains such a mapping. For details
on how the association database query is implemented, please refer
to [CCS94].

The representative also obtains the flow-id to be used for the flow. The
flow-id is constructed as the tuple (RP-EID, G) or an equivalent thereof.
Note that the flow-id must be unique to the particular multicast flow. This
is not the only method, or perhaps even the best method, for obtaining a
flow-id. Alternate methods for obtaining the flow-id are discussed in
section 5.5.

The representative then initiates a Tree-Join procedure.

The NMFReq packet fields are as follows:

o S-EID : The EID of the endpoint wishing to join.

o T-EID : The RP EID (obtained from the Association Database).

o Flow-id : The flow-id for this group (obtained as mentioned above).

o Direction : REVERSE (from the RP to the receiving endpoint).

o Code : Join.

o Source Route : Reverse of the route obtained from the map agent for a
  query ``from RP-EID to Receiver-EID''.

At the first node already containing this Flow-id, or at the RP, an NMFRep
packet is generated. The S-EID, T-EID, Direction and Flow-id fields are
copied from the NMFReq packet, and the Code is set to Join-Accept or
Join-Refuse as the case may be. The source route is the reverse of that in
the NMFReq packet.

Sender-RP Communications When an endpoint wishes to send to a multicast
group G, the endpoint representative obtains the Rendezvous Point EID for G.
We assume that the association database contains such a mapping. For
details on how the association database query is implemented, please
refer to [CCS94].

The representative also obtains the flow-id to be used for the flow. The
flow-id is constructed as the tuple (RP-EID, G) or an equivalent thereof.
Note that the flow-id must be unique to the particular multicast flow. This
is not the only method, or perhaps even the best method, for obtaining a
flow-id. Alternate methods for obtaining the flow-id are discussed in
section 5.5.

The representative then sends an RP-Register message to the RP. This
register message is equivalent to the PIM-Register described in [DEF+94b].
The RP-Register message contains the group G, the flow-id (obtained as
discussed above) and the sender EID.

The RP then initiates a Tree-Join with the Sender EID as the target. The
NMFReq fields are as follows:

o S-EID : RP-EID.

o T-EID : Sender EID (copied from the RP-Register message).

o Flow-id : The flow-id field from the RP-Register message.

o Code : Join.

o Direction : REVERSE.

o Source Route : Reverse of the route obtained from the map agent query
  ``from Sender-EID to RP-EID''.

The NMFRep fields are obvious.

Shared Tree Data Forwarding Packets sent from the source for group G contain
the Flow-id used by the sender(s) and receiver(s) for setting up the
delivery tree. The packets from the sender are sent to the RP, where they
are multicast, using the state created for the flow, into the delivery tree
rooted at the RP to all of the receivers that did a Tree-Join.
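The receiver-side Tree-Join above can be sketched as a small helper that
fills in the NMFReq fields. The function names and the tuple encoding of
the flow-id are illustrative assumptions, not protocol definitions:

```python
def shared_tree_flow_id(rp_eid, group):
    # The tuple (RP-EID, G), unique to this multicast flow.
    return (rp_eid, group)

def receiver_join_fields(receiver_eid, rp_eid, group, route_rp_to_receiver):
    """NMFReq fields for the Receiver-RP Tree-Join of section 5.3."""
    return {
        "S-EID": receiver_eid,
        "T-EID": rp_eid,
        "Flow-id": shared_tree_flow_id(rp_eid, group),
        "Direction": "REVERSE",  # packets flow from the RP to the receiver
        "Code": "Join",
        # Reverse of the route obtained for ``from RP-EID to Receiver-EID'':
        "Source Route": list(reversed(route_rp_to_receiver)),
    }

req = receiver_join_fields("H-EID", "RP-EID", "G",
                           ["RP", "C", "B", "A", "R"])
print(req["Source Route"])   # -> ['R', 'A', 'B', 'C', 'RP']
```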
5.4 Switching to a Source-Rooted Shortest Path Tree

There are two parts involved in switching to a Source-Rooted Shortest Path
Tree: the receiver-source communications, wherein the receiver joins a
multicast delivery tree rooted at the source, and the receiver-RP
communications, wherein the receiver leaves the shared tree.

Receiver-Source Communications An endpoint E that is receiving packets
through the shared tree from source S has the option of switching to a
delivery tree rooted at the source such that packets from S to E traverse
the shortest path (using whatever metric).

The endpoint representative of E obtains the flow-id to be used for the
flow. The flow-id is constructed equivalently to the tuple (Source-EID, G).
Note that the flow-id must be unique to the particular multicast flow. This
is not the only method, or perhaps even the best method, for obtaining a
flow-id. Alternate methods for obtaining the flow-id are discussed in
section 5.5.

The representative for E initiates a Tree-Join toward S with NMFReq fields
as follows:

o S-EID : EID of the endpoint E.

o T-EID : EID of the source.

o Flow-id : Flow-id for the multicast (obtained as mentioned above).

o Code : Join.

o Direction : REVERSE.

o Source Route : To obtain the route, the route agent is queried for a
  shortest path route (based on the chosen metric, typically the delay)
  from the source to the endpoint. We note that the quality of the route
  is constrained by the amount of routing information available, directly
  or indirectly, to the route agent. The Source Route is the reverse of
  the route thus obtained.

A comment is in order on the difference between the shortest-path trees
obtained using the RPF scheme, as in [DEF+94b, DC90], and the trees that
are obtained here.
When using the RPF scheme, the packets from the source S to the endpoint E
follow a path that is the shortest path from E to S. This is the desired
path if and only if the path is symmetric in either direction. However, in
the mechanism described for Nimrod here, the packets do follow the
``actual'' shortest path from S to E, whether or not the path is symmetric.

The NMFRep fields are obvious.

Receiver-RP Communications After the receiver has joined the source-rooted
tree, it can optionally disassociate itself from the shared tree. This is
done by initiating a Tree-Leave procedure.

The representative sends an NMFReq packet toward the RP with the fields as
follows:

o S-EID : The EID of the endpoint wishing to leave the shared tree.

o T-EID : The RP-EID.

o Flow-id : The flow-id it used to join the shared tree.

o Code : Prune.

o Direction : REVERSE.

o Source Route : Obtained as for the Tree-Join.

The prune packet is processed by the intermediate forwarding agents as
described in section 5.2. When the receiver gets back the NMFRep packet,
the receiver has left the shared tree.

Source Tree Data Forwarding Packets from the source contain the flow-id that
was used to join the source tree for a given multicast group. Forwarding
agents simply use the state created by the Tree-Join procedure in order to
duplicate and forward packets toward the receivers.

5.5 Miscellaneous Issues

Obtaining the Flow-Id In the above scheme, the flow-id for a particular
multicast group G was obtained by combining the RP-EID and the group set-id
(G-SID) (in the case of a shared tree) or by combining the Source-EID and
the G-SID (in the case of a source-based tree). A disadvantage of this
approach is that the bit-length of an EID/SID is potentially high (more than
64 bits) and thus the flow-id could be very long.
While there do exist bit-based data structures and search algorithms (such
as Patricia trees) that may be used for an efficient implementation, it is
worth considering some other methods in lieu of using the EID/SID
combination. We describe some methods below:

1. For shared trees, the flow-id for a particular group G may be stored
   and updated in the association database. Since we have to use the
   association database anyway to obtain the RP-EID, this does not cause
   much additional burden.

   However, this cannot be used efficiently for source-based trees, because
   we need a flow-id for each combination of source and group.

2. The flow-id for shared trees could be obtained as above. When the sender
   does an RP-Register, it could send the RP the flow-id that it wishes to
   be used by receivers when they switch to a source-based tree. This
   could be included in the RP-Register message. The RP could then
   multicast that flow-id to all receivers in a special packet. When the
   receivers wish to switch, they use that flow-id.

   This needs the definition of the ``special'' packet.

3. The flow-id is handed out only by the source (for source-based trees)
   or the RP (for shared trees). The receivers use a ``dummy'' flow-id in
   the NMFReq when doing a Tree-Join. The correct flow-id to be used is
   returned in the NMFRep message generated by the forwarding agent where
   the new branch meets the existing tree. Forwarding agents in the path
   of the NMFRep packet update the state information by rewriting the
   dummy flow-id with the correct flow-id contained in the NMFRep packet.

   This requires the re-definition of the NMFRep packet. Note that now
   there must be space for two flow-ids in the NMFRep packet: one for the
   ``dummy'' flow-id and the other for the ``correct'' flow-id that must
   replace the dummy flow-id.
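The rewrite step of scheme 3 can be sketched as follows: each agent on the
NMFRep's return path replaces the dummy flow-id in its installed state with
the correct flow-id carried in the reply. The data structures and names
here are illustrative assumptions:

```python
def rewrite_flow_id(flow_tables, reply_path, dummy_id, correct_id):
    """At each forwarding agent on the NMFRep's path back toward the new
    receiver, rewrite the dummy flow-id with the correct one (scheme 3)."""
    for node in reply_path:
        table = flow_tables[node]
        if dummy_id in table:
            table[correct_id] = table.pop(dummy_id)

# A and B installed state under a dummy flow-id during the Tree-Join;
# the NMFRep travels B -> A carrying the correct (Source-EID, G-SID) id.
tables = {"A": {"dummy-0": "state-A"}, "B": {"dummy-0": "state-B"}}
rewrite_flow_id(tables, ["B", "A"], "dummy-0", ("S-EID", "G-SID"))
print(tables["A"])   # -> {('S-EID', 'G-SID'): 'state-A'}
```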
We claim that each of the above schemes achieves synchronization of the
flow-id in the various parts of the internetwork and that each flow-id is
unique to the multicast delivery tree. A formal proof of these claims is
beyond the scope of this document.

Dense Mode Multicast The PIM architecture [DEF+94a] includes a multicast
protocol for use when the group membership is densely distributed within the
internetwork. In this mode, no Rendezvous Points are used and a
source-rooted tree is formed based on Reverse Path Forwarding, in a manner
similar to that of DVMRP [DC90].

We do not give details of how Nimrod can implement Dense Mode Multicast
here.

Multiple RPs Our discussion above has been based on the assumption that
there is one RP per group. PIM allows more than one RP per group. We do
not discuss multiple-RP PIM here.

6 Security Considerations

This document is not a protocol specification and therefore does not contain
a description of security mechanisms for Nimrod.

7 Summary

o Nimrod does not specify a particular multicast route generation
  algorithm or state creation procedure. Nimrod can accommodate diverse
  multicast techniques and leaves the choice of the technique to the
  particular instantiation of Nimrod.

o A solution for multicasting within Nimrod should be capable of:

  -- Scaling to large networks, large numbers of multicast groups and
     large multicast groups.

  -- Supporting policy, including quality of service and access
     restrictions.

  -- Resource sharing.

  -- Interoperability with other solutions.

o Multicasting typically requires the setting up of state in intermediate
  routers for packet forwarding. The state setup may be initiated by the
  sender (e.g., IDPR), by the receiver (e.g., CBT), by both (e.g., PIM),
  or by neither. The architecture of Nimrod provides sufficient
  flexibility to accommodate any of these approaches.
o A receiver-initiated multicast protocol, PIM, is being designed by the
  IDMR working group of the IETF. The facilities provided by Nimrod make
  the use of PIM as a multicast protocol quite straightforward.

8 Acknowledgements

We thank Isidro Castineyra (BBN), Charles Lynn (BBN), Martha Steenstrup
(BBN) and other members of the Nimrod Working Group for their comments and
suggestions on this draft.

9 Author's Address

Ram Ramanathan
BBN Systems and Technologies
10 Moulton Street
Cambridge, MA 02138
Phone : (617) 873-2736
Email : ramanath@bbn.com

References

[BFC93]   A. J. Ballardie, P. F. Francis, and J. Crowcroft. Core based
          trees. In Proceedings of ACM SIGCOMM, 1993.

[CCS94]   I. Castineyra, J. N. Chiappa, and M. Steenstrup. The Nimrod
          architecture. Working Draft, July 1994.
          (draft-castineyra-routing-arch-00.ps (.txt)).

[DC90]    S. Deering and D. Cheriton. Multicast routing in datagram
          internetworks and extended LANs. ACM Transactions on Computer
          Systems, pages 85--111, May 1990.

[DEF+94a] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and
          L. Wei. Protocol Independent Multicast (PIM): Motivation and
          architecture. Working Draft, Mar 1994.
          (draft-ietf-idmr-pim-arch-00.ps).

[DEF+94b] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and
          L. Wei. Protocol Independent Multicast (PIM): Sparse mode
          protocol specification. Working Draft, Mar 1994.
          (draft-ietf-idmr-pim-sparse-spec-00.ps).

[DEFJ93]  S. Deering, D. Estrin, D. Farinacci, and V. Jacobson. IGMP router
          extensions for routing to sparse multicast groups. Working Draft,
          July 1993.

[DM78]    Y. K. Dalal and R. M. Metcalfe. Reverse path forwarding of
          broadcast packets. Communications of the ACM, 21(12), pages
          1040--1048, 1978.

[Moy92]   J. Moy. Multicast extensions to OSPF. Working Draft, Sep 1992.

[Ste]     M. Steenstrup.
          Inter-Domain Policy Routing protocol
          specification: Version 2. Working Draft.

[Ste94]   M. Steenstrup. A perspective on Nimrod functionality. Working
          Draft, July 1994. (draft-steenstrup-fun-perspective-00.ps
          (.txt)).