Network Working Group                        Deborah Estrin (USC)
Internet Draft                               Dino Farinacci (CISCO)
Expire in six months                         Ahmed Helmy (USC)
draft-ietf-idmr-pim-sm-spec-09.txt           David Thaler (UMICH)
                                             Steven Deering (XEROX)
                                             Mark Handley (UCL)
                                             Van Jacobson (LBL)
                                             Chinggung Liu (USC)
                                             Puneet Sharma (USC)
                                             Liming Wei (CISCO) *

Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol
Specification

Status of This Memo

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas,
and its Working Groups. (Note that other groups may also distribute
working documents as Internet Drafts).

Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by
other documents at any time.
It is not appropriate to use Internet
Drafts as reference material or to cite them other than as a
``working draft'' or ``work in progress.''

Please check the I-D abstract listing contained in each Internet
Draft directory to learn the current status of this or any other
Internet Draft.

[*] The author list has been reordered to reflect the involvement in
detailed editorial work on this specification document.
The first four authors are the primary editors and are listed
alphabetically.
The rest of the authors, also listed alphabetically, participated
in all aspects of the architectural and detailed design but
managed to get away without hacking the latex!

1 Introduction

This document describes a protocol for efficiently routing to
multicast groups that may span wide-area (and inter-domain)
internets. We refer to the approach as Protocol Independent
Multicast-Sparse Mode (PIM-SM) because it is not dependent on any
particular unicast routing protocol, and because it is designed to
support sparse groups as defined in [1][2]. This document describes
the protocol details. For the motivation behind the design and a
description of the architecture, see [1][2]. Section 2 summarizes
PIM-SM operation. It describes the protocol from a network
perspective, in particular, how the participating routers interact to
create and maintain the multicast distribution tree. Section 3
describes PIM-SM operations from the perspective of a single router
implementing the protocol; this section constitutes the main body of
the protocol specification. It is organized according to PIM-SM
message type; for each message type we describe its contents, its
generation, and its processing.

Sections 3.8 and 3.9 summarize the timers and flags referred to
throughout this document. Section 4 provides packet format details.

The most significant functional changes since the January '95 version
involve the Rendezvous Point-related mechanisms, several resulting
simplifications to the protocol, and removal of the PIM-DM protocol
details to a separate document [3] (for clarity).

2 PIM-SM Protocol Overview

In this section we provide an overview of the architectural
components of PIM-SM.

A router receives explicit Join/Prune messages from those neighboring
routers that have downstream group members. The router then forwards
data packets addressed to a multicast group, G, only onto those
interfaces on which explicit joins have been received. Note that all
routers mentioned in this document are assumed to be PIM-SM capable,
unless otherwise specified.

A Designated Router (DR) sends periodic Join/Prune messages toward a
group-specific Rendezvous Point (RP) for each group for which it has
active members. Each router along the path toward the RP builds a
wildcard (any-source) state for the group and sends Join/Prune
messages on toward the RP. We use the term route entry to refer to
the state maintained in a router to represent the distribution tree.
A route entry may include such fields as the source address, the
group address, the incoming interface from which packets are
accepted, the list of outgoing interfaces to which packets are sent,
timers, flag bits, etc. The wildcard route entry's incoming interface
points toward the RP; the outgoing interfaces point to the
neighboring downstream routers that have sent Join/Prune messages
toward the RP. This state creates a shared, RP-centered, distribution
tree that reaches all group members. When a data source first sends
to a group, its DR unicasts Register messages to the RP with the
source's data packets encapsulated within.
If the data rate is high,
the RP can send source-specific Join/Prune messages back towards the
source and the source's data packets will follow the resulting
forwarding state and travel unencapsulated to the RP. Whether they
arrive encapsulated or natively, the RP forwards the source's
decapsulated data packets down the RP-centered distribution tree
toward group members. If the data rate warrants it, routers with
local receivers can join a source-specific, shortest path,
distribution tree, and prune this source's packets off of the shared
RP-centered tree. For low data rate sources, neither the RP nor the
last-hop routers need to join a source-specific shortest path tree,
and data packets can be delivered via the shared RP-tree.

The following subsections describe SM operation in more detail, in
particular, the control messages, and the actions they trigger.

2.1 Local hosts joining a group

In order to join a multicast group, G, a host conveys its membership
information through the Internet Group Management Protocol (IGMP),
as specified in [4][5] (see figure 1).
From this point on we refer to such a host as
a receiver, R, (or member) of the group G.

Note that all figures used in this section are for illustration and
are not intended to be complete. For complete and detailed protocol
action see Section 3.

[Figures are present only in the postscript version]
Fig. 1 Example: how a receiver joins, and sets up shared tree

When a DR (e.g., router A in figure 1) gets a membership
indication from IGMP for a new group, G, the DR looks up the associated
RP. The DR creates a wildcard multicast route entry for the group,
referred to here as a (*,G) entry; if there is no more specific match
for a particular source, the packet will be forwarded according to
this entry.
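
As a rough illustration of the route entry and the (*,G) fallback
described above, the following Python sketch models a route table
keyed by (source, group). The names RouteEntry, on_igmp_membership
and lookup, and the use of interface names and dotted-quad strings,
are assumptions made for this example only; they are not part of the
protocol specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

WILDCARD = "*"  # stands for "any source" in a (*,G) entry


@dataclass
class RouteEntry:
    """Illustrative route entry; field names are assumptions."""
    source: str                      # source address, or WILDCARD
    group: str                       # multicast group address G
    rp: Optional[str] = None         # RP address kept with the entry
    iif: Optional[str] = None        # incoming interface
    oifs: Set[str] = field(default_factory=set)  # outgoing interfaces


Table = Dict[Tuple[str, str], RouteEntry]


def on_igmp_membership(table: Table, group: str, member_iface: str,
                       rp: str, iface_toward_rp: str) -> RouteEntry:
    """Create (or update) the (*,G) entry when a local member appears."""
    key = (WILDCARD, group)
    entry = table.get(key)
    if entry is None:
        # The wildcard entry's incoming interface points toward the RP.
        entry = RouteEntry(WILDCARD, group, rp=rp, iif=iface_toward_rp)
        table[key] = entry
    # The interface on which membership was reported becomes an oif.
    entry.oifs.add(member_iface)
    return entry


def lookup(table: Table, source: str, group: str) -> Optional[RouteEntry]:
    """Prefer a source-specific entry; otherwise fall back to (*,G)."""
    entry = table.get((source, group))
    if entry is None:
        entry = table.get((WILDCARD, group))
    return entry


table: Table = {}
on_igmp_membership(table, "233.252.0.1", "eth1", "192.0.2.1", "eth0")
match = lookup(table, "198.51.100.7", "233.252.0.1")
```

Here a packet from 198.51.100.7 has no source-specific entry, so the
lookup returns the (*,G) entry created for the local member.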

The RP address is included in a special field in the route entry and
is included in periodic upstream Join/Prune messages. The outgoing
interface is set to that included in the IGMP membership indication
for the new member. The incoming interface is set to the interface
used to send unicast packets to the RP.

When there are no longer directly connected members for the group,
IGMP notifies the DR. If the DR has neither local members nor
downstream receivers, the (*,G) state is deleted.

2.2 Establishing the RP-rooted shared tree

Triggered by the (*,G) state, the DR creates a Join/Prune message
with the RP address in its join list and the wildcard bit (WC-bit)
and RP-tree bit (RPT-bit) set to 1. The WC-bit indicates that
any source may match and be forwarded according to this entry if
there is no longer (more specific) match; the RPT-bit indicates that
this join is being sent up the shared RP-tree. The prune list is left
empty. When the RPT-bit is set to 1 it indicates that the join is
associated with the shared RP-tree and therefore the Join/Prune
message is propagated along the RP-tree. When the WC-bit is set to 1
it indicates that the address is an RP and the downstream receivers
expect to receive packets from all sources via this (shared tree)
path. The term RPT-bit is used to refer to both the RPT-bit flags
associated with route entries, and the RPT-bit included in each
encoded address in a Join/Prune message.

Each upstream router creates or updates its multicast route entry for
(*,G) when it receives a Join/Prune with the RPT-bit and WC-bit set.
The interface on which the Join/Prune message arrived is added to the
list of outgoing interfaces (oifs) for (*,G). Based on this entry
each upstream router between the receiver and the RP sends a
Join/Prune message in which the join list includes the RP.
The packet
payload contains Multicast-Address=G, Join=RP,WC-bit,RPT-bit,
Prune=NULL.

2.3 Hosts sending to a group

When a host starts sending multicast data packets to a group,
initially its DR must deliver each packet to the RP for distribution
down the RP-tree (see figure 2). The sender's DR initially
encapsulates each data packet in a Register message and unicasts it
to the RP for that group. The RP decapsulates each Register message
and forwards the enclosed data packet natively to downstream members
on the shared RP-tree.

[Figures are present only in the postscript version]
Fig. 2 Example: a host sending to a group

If the data rate of the source warrants the use of a source-specific
shortest path tree (SPT), the RP may construct a new multicast route
entry that is specific to the source, hereafter referred to as (S,G)
state, and send periodic Join/Prune messages toward the source. Note
that over time, the rules for when to switch can be modified without
global coordination. When and if the RP does switch to the SPT, the
routers between the source and the RP build and maintain (S,G) state
in response to these messages and send (S,G) messages upstream toward
the source.

The source's DR must stop encapsulating data packets in Registers
when (and so long as) it receives Register-Stop messages from the RP.
The RP triggers Register-Stop messages in response to Registers, if
the RP has no downstream receivers for the group (or for that
particular source), or if the RP has already joined the (S,G) tree
and is receiving the data packets natively. Each source's DR
maintains, per (S,G), a Register-Suppression-timer. The
Register-Suppression-timer is started by the Register-Stop message;
upon expiration, the source's DR resumes sending data packets to the
RP, encapsulated in Register messages.
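
The Register-Suppression-timer behaviour at the source's DR can be
sketched as follows. This is a minimal illustration, not the
normative procedure: the class and method names are hypothetical,
time is passed in explicitly as a number of seconds, and the timeout
value is a caller-supplied placeholder rather than a value taken
from this document.

```python
from typing import Dict, Tuple


class RegisterSuppression:
    """Per-(S,G) Register suppression state kept by a source's DR."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout  # placeholder value chosen by the caller
        self.expires_at: Dict[Tuple[str, str], float] = {}

    def on_register_stop(self, source: str, group: str, now: float) -> None:
        # Each Register-Stop (re)starts the timer for this (S,G).
        self.expires_at[(source, group)] = now + self.timeout

    def should_send_register(self, source: str, group: str,
                             now: float) -> bool:
        deadline = self.expires_at.get((source, group))
        if deadline is None:
            return True          # never suppressed: keep encapsulating
        if now >= deadline:
            # Timer expired: resume sending Registers to the RP.
            del self.expires_at[(source, group)]
            return True
        return False             # timer running: stay suppressed


dr = RegisterSuppression(timeout=60.0)
before = dr.should_send_register("198.51.100.7", "233.252.0.1", now=0.0)
dr.on_register_stop("198.51.100.7", "233.252.0.1", now=1.0)
during = dr.should_send_register("198.51.100.7", "233.252.0.1", now=30.0)
after = dr.should_send_register("198.51.100.7", "233.252.0.1", now=61.0)
```

With these inputs the DR encapsulates before the Register-Stop,
stays silent while the timer runs, and resumes once it expires.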

2.4 Switching from shared tree (RP-tree) to shortest path tree (SP-tree)

A router with directly-connected members first joins the shared
RP-tree. The router can switch to a source's shortest path tree
(SP-tree) after receiving packets from that source over the shared
RP-tree. The recommended policy is to initiate the switch to the
SP-tree after receiving a significant number of data packets during a
specified time interval from a particular source. To realize this
policy the router can monitor data packets from sources for which it
has no source-specific multicast route entry and initiate such an
entry when the data rate exceeds the configured threshold. As shown
in figure 3, router `A' initiates a (S,G) state.

[Figures are present only in the postscript version]
Fig. 3 Example: Switching from shared tree to shortest path tree

When a (S,G) entry is activated (and periodically so long as the
state exists), a Join/Prune message is sent upstream towards the
source, S, with S in the join list. The payload contains
Multicast-Address=G, Join=S, Prune=NULL. When the (S,G) entry is
created, the outgoing interface list is copied from (*,G), i.e., all
local shared tree branches are replicated in the new shortest path
tree. In this way when a data packet from S arrives and matches on
this entry, all receivers will continue to receive the source's
packets along this path. (In more complicated scenarios, other
entries in the router have to be considered, as described in
Section 3). Note that (S,G) state must be maintained in each last-hop
router that is responsible for initiating and maintaining an SP-tree.
Even when (*,G) and (S,G) overlap, both states are needed to trigger
the source-specific Join/Prune messages. (S,G) state is kept alive by
data packets arriving from that source.
A timer, Entry-timer, is set for the (S,G)
entry and this timer is restarted whenever data packets for (S,G) are
forwarded out at least one oif, or Registers are sent. When the
Entry-timer expires, the state is deleted. The last-hop router is the
router that delivers the packets to their ultimate end-system
destination. This is the router that monitors whether there is group
membership and joins or prunes the appropriate distribution trees in
response. In general the last-hop router is the Designated Router
(DR) for the LAN. However, under various conditions described later,
a parallel router connected to the same LAN may take over as the
last-hop router in place of the DR.

Only the RP and routers with local members can initiate switching to
the SP-tree; intermediate routers do not. Consequently, last-hop
routers create (S,G) state in response to data packets from the
source, S; whereas intermediate routers only create (S,G) state in
response to Join/Prune messages from downstream that have S in the
Join list.

The (S,G) entry is initialized with the SPT-bit cleared, indicating
that the shortest path tree branch from S has not yet been set up
completely, and the router can still accept packets from S that
arrive on the (*,G) entry's indicated incoming interface (iif). Each
PIM multicast entry has an associated incoming interface on which
packets are expected to arrive.

When a router with a (S,G) entry and a cleared SPT-bit starts to
receive packets from the new source S on the iif for the (S,G) entry,
and that iif differs from the (*,G) entry's iif, the router sets the
SPT-bit, and sends a Join/Prune message towards the RP, indicating
that the router no longer wants to receive packets from S via the
shared RP-tree.
The Join/Prune message sent towards the RP includes S
in the prune list, with the RPT-bit set indicating that S's packets
must not be forwarded down this branch of the shared tree. If the
router receiving the Join/Prune message has (S,G) state (with or
without the route entry's RPT-bit flag set), it deletes the arriving
interface from the (S,G) oif list. If the router has only (*,G)
state, it creates an (S,G) entry with the RPT-bit flag set to 1. For
brevity we refer to an (S,G) entry that has the RPT-bit flag set to 1
as an (S,G)RPT-bit entry. This notational distinction is useful to
point out the different actions taken for (S,G) entries depending on
the setting of the RPT-bit flag. Note that a router can have no more
than one active (S,G) entry for any particular S and G at any
particular time, whether the RPT-bit flag is set or not. In other
words, a router never has both an (S,G) and an (S,G)RPT-bit entry for
the same S and G at the same time. The Join/Prune message payload
contains Multicast-Address=G, Join=NULL, Prune=S,RPT-bit.

A new receiver may join an existing RP-tree on which source-specific
prune state has been established (e.g., because downstream receivers
have switched to SP-trees). In this case the prune state must be
eradicated upstream of the new receiver to bring all sources' data
packets down to the new receiver. Therefore, when a (*,G) Join
arrives at a router that has any (Si,G)RPT-bit entries (i.e., entries
that cause the router to send source-specific prunes toward the RP),
these entries must be updated upstream of the router so as to bring
all sources' packets down to the new member. To accomplish this, each
router that receives a (*,G) Join/Prune message updates all existing
(S,G)RPT-bit entries.
The router may also trigger a (*,G) Join/Prune
message upstream to cause the same updating of RPT-bit settings
upstream and pull down all active sources' packets. If the arriving
(*,G) join has some sources included in its prune list, then the
corresponding (S,G)RPT-bit entries are left unchanged (i.e., the
RPT-bit remains set and no oif is added).

2.5 Steady state maintenance of distribution tree (i.e., router state)

In the steady state each router sends periodic Join/Prune messages
for each active PIM route entry; the Join/Prune messages are sent to
the neighbor indicated in the corresponding entry. These messages are
sent periodically to capture state, topology, and membership changes.
A Join/Prune message is also sent on an event-triggered basis each
time a new route entry is established for some new source (note that
some damping function may be applied, e.g., a short delay to allow
for merging of new Join information). Join/Prune messages do not
elicit any form of explicit acknowledgment; routers recover from lost
packets using the periodic refresh mechanism.

2.6 Obtaining RP information

To obtain the RP information, all routers within a PIM domain collect
Bootstrap messages. Bootstrap messages are sent hop-by-hop within the
domain; the domain's bootstrap router (BSR) is responsible for
originating the Bootstrap messages. Bootstrap messages are used to
carry out a dynamic BSR election when needed and to distribute RP
information in steady state.

A domain in this context is a contiguous set of routers that all
implement PIM and are configured to operate within a common boundary
defined by PIM Multicast Border Routers (PMBRs). PMBRs connect each
PIM domain to the rest of the internet.

Routers use a set of available RPs (called the RP-Set) distributed
in Bootstrap messages to get the proper Group to RP mapping.
The
following paragraphs summarize the mechanism; details of the
mechanism may be found in Section 3.6 and Appendix 6.2. A (small)
set of routers, within a domain, are configured as candidate BSRs
(C-BSRs) and, through a simple election mechanism, a single BSR is
selected for that domain. A set of routers within a domain are also
configured as candidate RPs (C-RPs); typically these will be the same
routers that are configured as C-BSRs. Candidate RPs periodically
unicast Candidate-RP-Advertisement messages (C-RP-Advs) to the BSR of
that domain. C-RP-Advs include the address of the advertising C-RP, as
well as an optional group address and a mask length field, indicating
the group prefix(es) for which the candidacy is advertised. The BSR
then includes a set of these Candidate-RPs (the RP-Set), along with
the corresponding group prefixes, in Bootstrap messages it
periodically originates. Bootstrap messages are distributed
hop-by-hop throughout the domain.

Routers receive and store Bootstrap messages originated by the BSR.
When a DR gets a membership indication from IGMP for (or a data
packet from) a directly connected host, for a group for which it
has no entry, the DR uses a hash function to map the group address
to one of the C-RPs whose Group-prefix includes the group (see
Section 3.7).
The DR then sends a Join/Prune message towards (or unicasts Registers
to) that RP.

The Bootstrap message indicates liveness of the RPs included therein.
If an RP is included in the message, then it is tagged as `up' at the
routers; while RPs not included in the message are removed from the
list of RPs over which the hash algorithm acts. Each router continues
to use the contents of the most recently received Bootstrap message
until it receives a new Bootstrap message.
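
The way a router might keep the RP-Set and map a group to an RP can
be sketched as below. This is an illustration under stated
assumptions: the RPSet class and its method names are hypothetical,
and placeholder_hash is a stand-in built on CRC-32. It is NOT the
hash function of Section 3.7, which is not reproduced here; the only
property relied on is that every router computes the same value from
the same inputs.

```python
import ipaddress
import zlib
from typing import List, Optional, Tuple


def placeholder_hash(group: str, rp: str) -> int:
    """Stand-in for the hash of Section 3.7 (assumption, not the spec)."""
    return zlib.crc32(f"{group}|{rp}".encode())


class RPSet:
    """RP-Set learned from the most recently received Bootstrap message."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, ipaddress.IPv4Network]] = []

    def on_bootstrap(self, advertised: List[Tuple[str, str]]) -> None:
        # The new message replaces the old contents wholesale, so RPs
        # that are no longer listed drop out of the hash candidates.
        self.entries = [(rp, ipaddress.ip_network(prefix))
                        for rp, prefix in advertised]

    def candidates(self, group: str) -> List[str]:
        g = ipaddress.ip_address(group)
        return [rp for rp, prefix in self.entries if g in prefix]

    def select_rp(self, group: str) -> Optional[str]:
        cands = self.candidates(group)
        if not cands:
            return None
        # Deterministic choice among the C-RPs covering the group.
        return max(cands, key=lambda rp: (placeholder_hash(group, rp),
                                          ipaddress.ip_address(rp)))


rp_set = RPSet()
rp_set.on_bootstrap([("192.0.2.1", "224.0.0.0/4"),
                     ("192.0.2.2", "233.252.0.0/24")])
covering = rp_set.candidates("233.252.0.5")
```

Because each Bootstrap message replaces the stored set, an RP that
disappears from the message is no longer a candidate for any group.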

If a PIM domain partitions, each area separated from the old BSR will
elect its own BSR, which will distribute an RP-Set containing RPs
that are reachable within that partition. When the partition heals,
another election will occur automatically and only one of the BSRs
will continue to send out Bootstrap messages. As is expected at the
time of a partition or healing, some disruption in packet delivery
may occur. This time will be on the order of the region's round-trip
time and the bootstrap router timeout value.

2.7 Interoperation with dense mode protocols such as DVMRP

In order to interoperate with networks that run dense-mode,
broadcast and prune, protocols, such as DVMRP, all packets generated
within a PIM-SM region must be pulled out to that region's PIM
Multicast Border Routers (PMBRs) and injected (i.e., broadcast) into
the DVMRP network. A PMBR is a router that sits at the boundary of a
PIM-SM domain and interoperates with other types of multicast routers
such as those that run DVMRP. Generally a PMBR would speak both
protocols and implement interoperability functions not required by
regular PIM routers. To support interoperability, a special entry
type, referred to as (*,*,RP), must be supported by all PIM routers.
For this reason we include details about (*,*,RP) entry handling in
this general PIM specification.

A data packet will match on a (*,*,RP) entry if there is no more
specific entry (such as (S,G) or (*,G)) and the destination group
address in the packet maps to the RP listed in the (*,*,RP) entry. In
this sense, a (*,*,RP) entry represents an aggregation of all the
groups that hash to that RP. PMBRs initialize (*,*,RP) state for each
RP in the domain's RP-Set. The (*,*,RP) state causes the PMBRs to send
(*,*,RP) Join/Prune messages toward each of the active RPs in the
domain.
As a result distribution trees are built that carry all data
packets originated within the PIM domain (and sent to the RPs) down
to the PMBRs.

PMBRs are also responsible for delivering externally-generated
packets to routers within the PIM domain. To do so, PMBRs initially
encapsulate externally-originated packets (i.e., received on DVMRP
interfaces) in Register messages and unicast them to the
corresponding RP within the PIM domain. The Register message has a
bit indicating that it was originated by a border router and the RP
caches the originating PMBR's address in the route entry so that
duplicate Registers from other PMBRs can be declined with a
Register-Stop message.

All PIM routers must be capable of supporting (*,*,RP) state and
interpreting associated Join/Prune messages. We describe the handling
of (*,*,RP) entries and messages throughout this document; however,
detailed PIM Multicast Border Router (PMBR) functions will be
specified in a separate interoperability document (see directory,
http://catarina.usc.edu/pim/interop/).

2.8 Multicast data packet processing

Data packets are processed in a manner similar to other multicast
schemes. A router first performs a longest match on the source and
group address in the data packet. A (S,G) entry is matched first if
one exists; a (*,G) entry is matched otherwise. If neither state
exists, then a (*,*,RP) entry match is attempted as follows: the
router hashes on G to identify the RP for group G, and looks for a
(*,*,RP) entry that has this RP address associated with it. If none
of the above exists, then the packet is dropped. If a state is
matched, the router compares the interface on which the packet
arrived to the incoming interface field in the matched route entry.
If the iif check fails the packet is dropped, otherwise the packet is
forwarded to all interfaces listed in the outgoing interface list.

Some special actions are needed to deliver packets continuously while
switching from the shared to shortest-path tree. In particular, when
a (S,G) entry is matched, incoming packets are forwarded as follows:

1  If the SPT-bit is set, then:

   1  if the incoming interface is the same as a matching
      (S,G) iif, the packet is forwarded to the oif-list of
      (S,G).

   2  if the incoming interface is different than a matching
      (S,G) iif, the packet is discarded.

2  If the SPT-bit is cleared, then:

   1  if the incoming interface is the same as a matching
      (S,G) iif, the packet is forwarded to the oif-list of
      (S,G). In addition, the SPT-bit is set for that entry
      if the incoming interface differs from the incoming
      interface of the (*,G) or (*,*,RP) entry.

   2  if the incoming interface is different than a matching
      (S,G) iif, the incoming interface is tested against a
      matching (*,G) or (*,*,RP) entry. If the iif is the
      same as one of those, the packet is forwarded to the
      oif-list of the matching entry.

   3  Otherwise the iif does not match any entry for G and
      the packet is discarded.

Data packets never trigger prunes. However, data packets may
trigger actions that in turn trigger prunes. For example, when
router B in figure 3 decides to switch to SP-tree at step 3, it
creates a (S,G) entry with SPT-bit set to 0. When data packets
from S arrive at interface 2 of B, B sets the SPT-bit to 1
since the iif for (*,G) is different than that for (S,G). This
triggers the sending of prunes towards the RP.
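
The matching and forwarding rules of this section can be sketched in
Python as follows. The sketch rests on several assumptions: the
Entry type, the table keyed by tuples, and the rp_of callback (which
stands for the group-to-RP hash) are illustrative names only; the
prune sent toward the RP when the SPT-bit is set is not modelled;
and when no (*,G) or (*,*,RP) entry exists the SPT-bit is left
unchanged, a case the rules above do not address.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass
class Entry:
    iif: str                                   # expected incoming interface
    oifs: Set[str] = field(default_factory=set)
    spt_bit: bool = False                      # meaningful for (S,G) only


def process_packet(table: Dict[Tuple[str, ...], Entry], source: str,
                   group: str, in_iface: str,
                   rp_of: Callable[[str], str]) -> Set[str]:
    """Return the interfaces to forward on; an empty set means drop."""
    sg: Optional[Entry] = table.get((source, group))
    shared = table.get(("*", group))
    if shared is None:
        # Fall back to the (*,*,RP) entry for the RP that G maps to.
        shared = table.get(("*", "*", rp_of(group)))

    if sg is None:
        if shared is not None and in_iface == shared.iif:
            return set(shared.oifs)
        return set()                           # no state or iif check failed

    if in_iface == sg.iif:
        # Rules 1.1 and 2.1: forward on the (S,G) oif-list.
        if not sg.spt_bit and shared is not None and shared.iif != sg.iif:
            sg.spt_bit = True                  # shortest path branch is live
        return set(sg.oifs)

    if not sg.spt_bit and shared is not None and in_iface == shared.iif:
        return set(shared.oifs)                # rule 2.2: still on shared tree
    return set()                               # rules 1.2 and 2.3: discard


rp_of = lambda g: "192.0.2.1"
table = {("*", "G"): Entry(iif="to_rp", oifs={"lan"}),
         ("S", "G"): Entry(iif="to_src", oifs={"lan"})}
via_shared = process_packet(table, "S", "G", "to_rp", rp_of)
via_spt = process_packet(table, "S", "G", "to_src", rp_of)
late_shared = process_packet(table, "S", "G", "to_rp", rp_of)
```

In this run the first packet still arrives over the shared tree and
is forwarded, the second arrives on the (S,G) iif and sets the
SPT-bit, and a later copy arriving over the shared tree is discarded.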

2.9 Operation over Multi-access Networks

This section describes a few additional protocol mechanisms
needed to operate PIM over multi-access networks: Designated
Router election, Assert messages to resolve parallel paths, and
the Join/Prune-Suppression-Timer to suppress redundant Joins on
multi-access networks.

*  Designated router election

   When there are multiple routers connected to a multi-access
   network, one of them must be chosen to operate as the designated
   router (DR) at any point in time. The DR is responsible for
   sending triggered Join/Prune and Register messages toward the
   RP.

   A simple designated router (DR) election mechanism is used for
   both SM and traditional IP multicast routing. Neighboring
   routers send Hello messages to each other. The sender with the
   largest IP address assumes the role of DR. Each router connected
   to the multi-access LAN sends the Hellos periodically in order
   to adapt to changes in router status.

*  Parallel paths to a source or the RP--Assert process

   If a router receives a multicast datagram on a multi-access LAN
   from a source whose corresponding (S,G) outgoing interface list
   includes the interface to that LAN, the packet must be a
   duplicate. In this case a single forwarder must be elected.
   Using Assert messages addressed to `224.0.0.13' (ALL-PIM-ROUTERS
   group) on the LAN, upstream routers can resolve which one will
   act as the forwarder. Downstream routers listen to the Asserts
   so they know which one was elected, and therefore where to send
   subsequent Joins. Typically this is the same as the downstream
   router's RPF (Reverse Path Forwarding) neighbor; but there are
   circumstances where this might not be the case, e.g., when using
   multiple unicast routing protocols on that LAN.
   The RPF neighbor for a particular source (or RP) is the next-hop
   router to which packets are forwarded en route to that source (or
   RP); and therefore is considered a good path via which to accept
   packets from that source.

   The upstream router elected is the one that has the shortest
   distance to the source. Therefore, when a packet is received on
   an outgoing interface a router sends an Assert message on the
   multi-access LAN indicating what metric it uses to reach the
   source of the data packet. The router with the smallest
   numerical metric (with ties broken by highest address) will
   become the forwarder. All other upstream routers will delete the
   interface from their outgoing interface list. The downstream
   routers also do the comparison in case the forwarder is
   different than the RPF neighbor.

   Associated with the metric is a metric preference value. This is
   provided to deal with the case where the upstream routers may
   run different unicast routing protocols. The numerically smaller
   metric preference is always preferred. The metric preference is
   treated as the high-order part of an assert metric comparison.
   Therefore, a metric value can be compared with another metric
   value provided both metric preferences are the same. A metric
   preference can be assigned per unicast routing protocol and
   needs to be consistent for all routers on the multi-access
   network.

   Asserts are also needed for (*,G) entries since an RP-Tree and
   an SP-Tree for the same group may both cross the same
   multi-access network. When an assert is sent for a (*,G) entry,
   the first bit in the metric preference (RPT-bit) is always set
   to 1 to indicate that this path corresponds to the RP tree, and
   that the match must be done on (*,G) if it exists.
   Furthermore, the RPT-bit is always cleared for metric preferences
   that refer to SP-tree entries; this causes an SP-tree path to
   always look better than an RP-tree path. When the SP-tree and
   RP-tree cross the same LAN, this mechanism eliminates the
   duplicates that would otherwise be carried over the LAN.

   In case the packet, or the Assert message, matches on oif for
   (*,*,RP) entry, a (*,G) entry is created, and asserts take place
   as if the matching state were (*,G).

   The DR may lose the (*,G) Assert process to another router on
   the LAN if there are multiple paths to the RP through the LAN.
   From then on, the DR is no longer the last-hop router for local
   receivers and removes the LAN from its (*,G) oif list. The
   winning router becomes the last-hop router and is responsible
   for sending (*,G) join messages to the RP.

*  Join/Prune suppression

   Join/Prune suppression may be used on multi-access LANs to
   reduce duplicate control message overhead; it is not required
   for correct performance of the protocol. If a Join/Prune message
   arrives and matches on the incoming interface for an existing
   (S,G), (*,G), or (*,*,RP) route entry, and the Holdtime included
   in the Join/Prune message is greater than the recipient's own
   [Join/Prune-Holdtime] (with ties resolved in favor of the higher
   IP address), a timer (the Join/Prune-Suppression-timer) in the
   recipient's route entry may be started to suppress further
   Join/Prune messages. After this timer expires, the recipient
   triggers a Join/Prune message, and resumes sending periodic
   Join/Prunes, for this entry. The Join/Prune-Suppression-timer
   should be restarted each time a Join/Prune message is received
   with a higher Holdtime.

2.10 Unicast Routing Changes

When unicast routing changes, an RPF check is done on all active
(S,G), (*,G) and (*,*,RP) entries, and all affected expected
incoming interfaces are updated.
In particular, if the new incoming interface appears in the
outgoing interface list, it is deleted from the outgoing interface
list. The previous incoming interface may be added to the outgoing
interface list by a subsequent Join/Prune from downstream.
Join/Prune messages received on the current incoming interface are
ignored. Join/Prune messages received on new interfaces or existing
outgoing interfaces are not ignored. Other outgoing interfaces
are left as is until they are explicitly pruned by downstream
routers or are timed out due to lack of appropriate Join/Prune
messages. If the router has a (S,G) entry with the SPT-bit set,
and the updated iif(S,G) does not differ from iif(*,G) or
iif(*,*,RP), then the router resets the SPT-bit.

The router must send a Join/Prune message with S in the Join
list out any new incoming interfaces to inform upstream routers
that it expects multicast datagrams over the interface. It may
also send a Join/Prune message with S in the Prune list out the
old incoming interface, if the link is operational, to inform
upstream routers that this part of the distribution tree is
going away.

2.11 PIM-SM for Inter-Domain Multicast

Future documents will address the use of PIM-SM as a backbone
inter-domain multicast routing protocol. Design choices center
primarily around the distribution and usage of RP information
for wide area, inter-domain groups.

2.12 Security

All PIM control messages may use IPsec [6] to address security
concerns. Security mechanisms are likely to be enhanced in the
near future.

3 Detailed Protocol Description

This section describes the protocol operations from the
perspective of an individual router implementation. In
particular, for each message type we describe how it is
generated and processed.

3.1 Hello

Hello messages are sent so neighboring routers can discover each
other.
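
As a rough illustration of what Hello-based neighbor discovery
provides, the following Python sketch models a per-interface neighbor
table together with the highest-address DR election described in
Section 2.9. The class NeighborTable and its method names are
hypothetical, timer handling is reduced to absolute expiry times
passed in by the caller, and IPv4 addresses are assumed; the example
addresses are documentation addresses chosen for illustration.

```python
import ipaddress
from typing import Dict


class NeighborTable:
    """Hypothetical per-interface PIM neighbor state learned from Hellos."""

    def __init__(self, own_address: str) -> None:
        self.own = ipaddress.IPv4Address(own_address)
        # Neighbor address -> time at which its Neighbor-timer expires.
        self.expiry: Dict[ipaddress.IPv4Address, float] = {}

    def on_hello(self, sender: str, holdtime: float, now: float) -> None:
        # Each Hello (re)starts the Neighbor-timer at the advertised Holdtime.
        self.expiry[ipaddress.IPv4Address(sender)] = now + holdtime

    def expire(self, now: float) -> None:
        # Drop neighbors that have not sent a Hello within their Holdtime.
        self.expiry = {a: t for a, t in self.expiry.items() if t > now}

    def designated_router(self) -> str:
        # The highest address among this router and its live neighbors wins.
        return str(max([self.own, *self.expiry]))
```

Comparing parsed address objects rather than strings keeps the
ordering numeric, so that, for example, 192.0.2.10 ranks above
192.0.2.9.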

3.1.1 Sending Hellos

Hello messages are sent periodically between PIM neighbors,
every [Hello-Period] seconds. This informs routers which
interfaces have PIM neighbors. Hello messages are multicast
using address 224.0.0.13 (ALL-PIM-ROUTERS group). The packet
includes a Holdtime, set to [Hello-Holdtime], for neighbors to
keep the information valid. Hellos are sent on all types of
communication links.

3.1.2 Receiving Hellos

When a router receives a Hello message, it stores the IP address
for that neighbor, sets its Neighbor-timer for the Hello sender
to the Holdtime included in the Hello, and determines the
Designated Router (DR) for that interface. The system with the
highest IP address is elected DR. Each Hello received causes the
DR's address to be updated.

When a router that is the active DR receives a Hello from a new
neighbor (i.e., from an IP address that is not yet in the DR's
neighbor table), the DR unicasts its most recent RP-set
information to the new neighbor.

3.1.3 Timing out neighbor entries

A periodic process is run to time out PIM neighbors that have
not sent Hellos. If the DR has gone down, a new DR is chosen by
scanning all neighbors on the interface and selecting the new DR
to be the one with the highest IP address. If an interface has
gone down, the router may optionally time out all PIM neighbors
associated with the interface.

3.2 Join/Prune

Join/Prune messages are sent to join or prune a branch off of
the multicast distribution tree. A single message contains both
a join and prune list, either one of which may be null. Each
list contains a set of source addresses, indicating the
source-specific trees or shared tree that the router wants to
join or prune.
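
The content of a Join/Prune message as described above can be
pictured with the following Python sketch. It is a logical model for
illustration and not the wire format: all class and field names are
hypothetical, the grouping of lists by group address and the
per-address RPT and WC bits follow the processing rules in the rest
of Section 3.2, and the addresses in the usage note are arbitrary
documentation addresses.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ListedAddress:
    """One element of a join or prune list (hypothetical model)."""
    address: str
    rpt_bit: bool = False
    wc_bit: bool = False

    def is_rp(self) -> bool:
        # An address with both bits set names the RP (shared tree).
        return self.rpt_bit and self.wc_bit


@dataclass
class GroupRecord:
    """Join and prune lists for one group; either list may be empty."""
    group: str
    joins: List[ListedAddress] = field(default_factory=list)
    prunes: List[ListedAddress] = field(default_factory=list)


@dataclass
class JoinPrune:
    upstream_neighbor: str
    holdtime: int
    groups: List[GroupRecord] = field(default_factory=list)


def shared_tree_joins(msg: JoinPrune) -> List[str]:
    """Groups for which the message joins the shared (RP) tree."""
    return [g.group for g in msg.groups if any(a.is_rp() for a in g.joins)]
```

For example, a message that joins the shared tree for 233.252.0.1
through RP 192.0.2.1 while pruning source 198.51.100.7 off that tree
would carry one GroupRecord whose join list holds the RP address with
both bits set and whose prune list holds the source with only the
RPT-bit set.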

3.2.1 Sending Join/Prune Messages

Join/Prune messages are merged such that a message sent to a
particular upstream neighbor, N, includes all of the current
joined and pruned sources that are reached via N according to
unicast routing. Join/Prune messages are multicast to all routers
on multi-access networks with the target address set to the next
hop router towards S or RP. Join/Prune messages are sent every
[Join/Prune-Period] seconds. In the future we will introduce
mechanisms to rate-limit this control traffic on a hop by hop
basis, in order to avoid excessive overhead on small links. In
addition, certain events cause triggered Join/Prune messages to
be sent.

3.2.1.1 Periodic Join/Prune Messages

A router sends a periodic Join/Prune message to each distinct
RPF neighbor associated with each (S,G), (*,G) and (*,*,RP)
entry. Join/Prune messages are only sent if the RPF neighbor is
a PIM neighbor. A periodic Join/Prune message sent to a
particular RPF neighbor is constructed as follows:

   1  Each router determines the RP for a (*,G) entry by using
      the hash function described in this document. The RP address
      (with RPT and WC bits set) is included in the join list of a
      periodic Join/Prune message under the following conditions:

      1  The Join/Prune message is being sent to the RPF
         neighbor toward the RP for an active (*,G) or (*,*,RP)
         entry, and

      2  The outgoing interface list in the (*,G) or (*,*,RP)
         entry is non-NULL, or the router is the DR on the same
         interface as the RPF neighbor.

   2  A particular source address, S, is included in the join
      list with the RPT and WC bits cleared under the following
      conditions:

      1  The Join/Prune message is being sent to the RPF
         neighbor toward S, and

      2  There exists an active (S,G) entry with the RPT-bit
         flag cleared, and

      3  The oif list in the (S,G) entry is not null.

   3  A particular source address, S, is included in the prune
      list with the RPT and WC bits cleared under the following
      conditions:

      1  The Join/Prune message is being sent to the RPF
         neighbor toward S, and

      2  There exists an active (S,G) entry with the RPT-bit
         flag cleared, and

      3  The oif list in the (S,G) entry is null.

   4  A particular source address, S, is included in the prune
      list with the RPT-bit set and the WC bit cleared under the
      following conditions:

      1  The Join/Prune message is being sent to the RPF
         neighbor toward the RP and there exists a (S,G) entry
         with the RPT-bit flag set and null oif list, or

      2  The Join/Prune message is being sent to the RPF
         neighbor toward the RP, there exists a (S,G) entry
         with the RPT-bit flag cleared and SPT-bit set, and the
         incoming interface toward S is different than the
         incoming interface toward the RP, or

      3  The Join/Prune message is being sent to the RPF
         neighbor toward the RP, and there exists a (*,G) entry
         and (S,G) entry for a directly connected source.

   5  The RP address (with RPT and WC bits set) is included in
      the prune list if:

      1  The Join/Prune message is being sent to the RPF
         neighbor toward the RP and there exists a (*,G) entry
         with a null oif list (see Section 3.5.2).

3.2.1.2 Triggered Join/Prune Messages

In addition to periodic messages, the following events will
trigger Join/Prune messages if as a result, a) a new entry is
created, or b) the oif list changes from null to non-null or
non-null to null. The contents of triggered messages are the
same as the periodic, described above.

   1  Receipt of an indication from IGMP that the state of
      directly-connected membership has changed (i.e., new
      members have just joined `membership indication' or all
      members have left), for a group G, may cause the last-hop
      router to build or modify corresponding (*,G) state.
      When IGMP indicates that there are no longer directly
      connected members, the oif is removed from the oif list if
      the oif-timer is not running. A Join/Prune message is
      triggered if and only if a) a new entry is created, or b)
      the oif list changes from null to non-null or non-null to
      null, as follows:

      1  If the receiving router does not have a route entry
         for G the router creates a (*,G) entry, copies the oif
         list from the corresponding (*,*,RP) entry (if it
         exists), and includes the interface included in the
         IGMP membership indication in the oif list; as always,
         the router never includes the entry's iif in the oif
         list. The router sends a Join/Prune message towards
         the RP with the RP address and RPT-bit and WC-bits set
         in the join list. Or,

      2  If a (S,G)RPT-bit or (*,G) entry already exists, the
         interface included in the IGMP membership indication is
         added to the oif list (if it was not included
         already).

   2  Receipt of a Join/Prune message for (S,G), (*,G) or
      (*,*,RP) will cause building or modifying corresponding
      state, and subsequent triggering of upstream Join/Prune
      messages, in the following cases:

      1  When there is no current route entry, the RP address
         included in the Join/Prune message is checked against
         the local RP-Set information. If it matches, an entry
         will be created and the new entry will in turn trigger
         an upstream Join/Prune message. If the router has no
         RP-Set information it may discard the message, or
         optionally use the RP address included in the message.

      2  When the outgoing interface list of an (S,G)RPT-bit
         entry becomes null, the triggered Join/Prune message
         will contain S in the prune list.

      3  When there exists a (S,G)RPT-bit with null oif list,
         and an (*,G) Join/Prune message is received, the
         arriving interface is added to the oif list and a
         (*,G) Join/Prune message is triggered upstream.

      4  When there exists a (*,G) with null oif list, and a
         (*,*,RP) Join/Prune message is received, the receiving
         interface is added to the oif list and a (*,*,RP)
         Join/Prune message is triggered upstream.

   3  Receipt of a packet that matches on a (S,G) entry whose
      SPT-bit is cleared triggers the following if the packet
      arrived on the correct incoming interface and there is a
      (*,G) or (*,*,RP) entry with a different incoming
      interface: a) the router sets the SPT-bit on the (S,G)
      entry, and b) the router sends a Join/Prune message towards
      the RP with S and a set RPT-bit in the prune list.

   4  When a Join/Prune message is received for a group G, the
      prune list is checked. If the prune list contains a source
      or RP for which the receiving router has a corresponding
      active (S,G), (*,G) or (*,*,RP) entry, and whose iif is
      that on which the Join/Prune was received, then a join for
      (S,G), (*,G) or (*,*,RP) is triggered to override the
      prune, respectively. (This is necessary in the case of
      parallel downstream routers connected to a multi-access
      network.)

   5  When the RP fails, the RP will not be included in the
      Bootstrap messages sent to all routers in that domain. This
      triggers the DRs to send (*,G) Join/Prune messages towards
      the new RP for the group, as determined by the RP-Set and
      the hash function. As described earlier, PMBRs trigger
      (*,*,RP) joins towards each RP in the RP-Set.

   6  When an entry's Join/Prune-Suppression timer expires, a
      Join/Prune message is triggered upstream corresponding to
      that entry, even if the outgoing interface has not
      transitioned between null and non-null states.

   7  When the RPF neighbor changes (whether due to an Assert or
      changes in unicast routing), the router sets a random delay
      timer (the Random-Delay-Join-Timer) whose expiration
      triggers sending of a Join/Prune message for the asserted
      route entry to the Assert winner (if the
      Join/Prune-Suppression timer has expired).

We do not trigger prunes onto interfaces based on data packets.
Data packets that arrive on the wrong incoming interface are
silently dropped. However, on point-to-point interfaces
triggered prunes may be sent as an optimization.

3.2.1.3 Fragmentation

It is possible that a Join/Prune message constructed according to
the preceding rules could exceed the MTU of a network. In this
case, the message can undergo semantic fragmentation whereby
information corresponding to different groups can be sent in
different messages. However, if a Join/Prune message must be
fragmented, the complete prune list corresponding to a group G
must be included in the same Join/Prune message as the associated
RP-tree Join for G. If such semantic fragmentation is not
possible, IP fragmentation should be used between the two
neighboring hops.

3.2.2 Receiving Join/Prune Messages

When a router receives a Join/Prune message, it processes it as
follows.

The receiver of the Join/Prune notes the interface on which the
PIM message arrived, call it I. The receiver then checks to see
if the Join/Prune message was addressed to the receiving router
itself (i.e., the router's address appears in the Unicast
Upstream Neighbor Router field of the Join/Prune message). (If
the router is connected to a multiaccess LAN, the message could
be intended for a different router.) If the Join/Prune is for
this router the following actions are taken.

For each group address G, in the Join/Prune message, the
associated join list is processed as follows.
We refer to each address in the join list as Sj; Sj refers to the
RP if the RPT-bit and WC-bit are both set. For each Sj in the join
list of the Join/Prune message:

   1  If an address, Sj, in the join list of the Join/Prune
      message has the RPT-bit and WC-bit set, then Sj is the RP
      address used by the downstream router(s) and the following
      actions are taken:

      1  If Sj is not the same as the receiving router's RP
         mapping for G, the receiving router may ignore the
         Join/Prune message with respect to that group entry.
         If the router does not have any RP-Set information, it
         may use the address Sj included in the Join/Prune
         message as the RP for the group.

      2  If Sj is the same as the receiving router's RP mapping
         for G, the receiving router adds I to the outgoing
         interface list of the (*,G) route entry (if there is
         no (*,G) entry, the router creates one first) and sets
         the Oif-timer for that interface to the Holdtime
         specified in the Join/Prune message.
         In addition, the Oif-Deletion-Delay for that interface
         is set to 1/3rd the Holdtime specified in the Join/Prune
         message.

         If a (*,*,RP) entry exists, for the RP associated with
         G, then the oif list of the newly created (*,G) entry
         is copied from that (*,*,RP) entry.

      3  For each (Si,G) entry associated with group G, if Si
         is not included in the prune list, and if I is not the
         iif then interface I is added to the oif list and
         the Oif-timer for that interface in each affected
         entry is increased (never decreased) to the Holdtime
         included in the Join/Prune message.
         In addition, if the Oif-timer for that interface is
         increased, the Oif-Deletion-Delay for that interface
         is set to 1/3rd the Holdtime specified in the
         Join/Prune message.

         If the group address in the Join/Prune message is `*'
         then every (*,G) and (S,G) entry, whose group address
         hashes to the RP indicated in the (*,*,RP) Join/Prune
         message, is updated accordingly. A `*' in the group
         field of the Join/Prune is represented by a group
         address 224.0.0.0 and a group mask length of 4,
         indicating a (*,*,RP) Join.

      4  If the (Si,G) entry has its RPT-bit flag set to 1, and
         its oif list is the same as the (*,G) oif
         list, then the (Si,G)RPT-bit entry is deleted.

      5  The incoming interface is set to the interface used to
         send unicast packets to the RP in the (*,G) route
         entry, i.e., RPF interface toward the RP.

   2  For each address, Sj, in the join list whose RPT-bit and
      WC-bit are not set, and for which there is no existing
      (Sj,G) route entry, the router initiates one. The router
      creates a (S,G) entry and copies all outgoing interfaces
      from the (S,G)RPT-bit entry, if it exists. If there is no
      (S,G)RPT-bit entry, the oif list is copied from the (*,G)
      entry; and if there is no (*,G) entry, the oif list is
      copied from the (*,*,RP) entry, if it exists. In all cases,
      the iif of the (S,G) entry is always excluded from the oif
      list.

      1  The outgoing interface for (Sj,G) is set to I. The
         incoming interface for (Sj,G) is set to the interface
         used to send unicast packets to Sj (i.e., the RPF
         interface toward Sj).

      2  If the interface used to reach Sj, is the same as I,
         this represents an error (or a unicast routing change)
         and the Join/Prune must not be processed.

   3  For each address, Sj, in the join list of the Join/Prune
      message, for which there is an existing (Sj,G) route entry:

      1  If the RPT-bit is not set for Sj listed in the
         Join/Prune message, but the RPT-bit flag is set on the
         existing (Sj,G) entry, the router clears the RPT-bit
         flag on the (Sj,G) entry, sets the incoming interface
         to point towards Sj for that (Sj,G) entry, and sends a
         Join/Prune message corresponding to that entry through
         the new incoming interface; and

      2  If I is not the same as the existing incoming
         interface, the router adds I to the list of outgoing
         interfaces.

      3  The Oif-timer for I is increased (never decreased)
         to the Holdtime included in the Join/Prune message.
         In addition, if the Oif-timer for that interface
         is increased, the Oif-Deletion-Delay for that interface
         is set to 1/3rd the Holdtime specified in the
         Join/Prune message.

      4  The (Sj,G) entry's SPT bit is cleared until data comes
         down the shortest path tree.

For each group address G, in the Join/Prune message, the
associated prune list is processed as follows. We refer to each
address in the prune list as Sp; Sp refers to the RP if the
RPT-bit and WC-bit are both set. For each Sp in the prune list
of the Join/Prune message:

   1  For each address, Sp, in the prune list whose RPT-bit and
      WC-bit are cleared:

      1  If there is an existing (Sp,G) route entry, the router
         lowers the Oif-timer for I to its Oif-Deletion-Delay,
         allowing for other downstream routers on a
         multi-access LAN to override the prune. However, on
         point-to-point links, the oif-timer is expired
         immediately.

      2  If the router has a current (*,G), or (*,*,RP), route
         entry, and if the existing (Sp,G) entry has its
         RPT-bit flag set to 1, then this (Sp,G)RPT-bit entry is
         maintained (not deleted) even if its outgoing
         interface list is null.

   2  For each address, Sp, in the prune list whose RPT-bit is
      set and whose WC-bit is cleared:

      1  If there is an existing (Sp,G) route entry, the router
         lowers the entry's Oif-timer for I to its
         Oif-Deletion-Delay, allowing for other downstream
         routers on a multi-access LAN to override the prune.
         However, on point-to-point links, the oif-timer is
         expired immediately.

      2  If the router has a current (*,G), or (*,*,RP), route
         entry, and if the existing (Sp,G) entry has its
         RPT-bit flag set to 1, then this (Sp,G)RPT-bit entry is
         not deleted, and the Entry-timer is restarted, even if
         its outgoing interface list is null.

      3  If (*,G), or corresponding (*,*,RP), state exists, but
         there is no (Sp,G) entry, an (Sp,G)RPT-bit entry is
         created. The outgoing interface list is copied from
         the (*,G), or (*,*,RP), entry, with the interface, I,
         on which the prune was received, deleted. Packets
         from the pruned source, Sp, match on this state and
         are not forwarded toward the pruned receivers.

      4  If there exists a (Sp,G) entry, with or without the
         RPT-bit set, the oif-timer for I is expired, and the
         Entry-timer is restarted.

   3  For each address, Sp, in the prune list whose RPT-bit and
      WC-bit are both set:

      1  If there is an existing (*,G) entry, with Sp as the RP
         for G, the router lowers the entry's Oif-timer for I
         to its Oif-Deletion-Delay, allowing for other
         downstream routers on a multi-access LAN to override
         the prune. However, on point-to-point links, the
         oif-timer is expired immediately.

      2  If the corresponding (*,*,RP) state exists, but there
         is no (*,G) entry, a (*,G) entry is created. The
         outgoing interface list is copied from (*,*,RP) entry,
         with the interface, I, on which the prune was
         received, deleted.
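
The per-interface timer handling in the join and prune rules above
can be summarized with the following Python sketch. It is an
illustration under simplifying assumptions and not a complete
implementation: the class OifState and its method names are
hypothetical, the Oif-timer is modeled as a number of remaining
seconds rather than a running clock, and only the "increase (never
decrease)" join rule and the "lower to Oif-Deletion-Delay" prune
rule are covered.

```python
from dataclasses import dataclass


@dataclass
class OifState:
    """Hypothetical timer state for one outgoing interface of an entry."""
    timer: float = 0.0           # remaining Oif-timer, in seconds
    deletion_delay: float = 0.0  # Oif-Deletion-Delay, in seconds

    def on_join(self, holdtime: float) -> None:
        # The Oif-timer is increased, never decreased, to the Holdtime.
        if holdtime > self.timer:
            self.timer = holdtime
            # Only when the timer is increased is the delay recomputed.
            self.deletion_delay = holdtime / 3

    def on_prune(self, point_to_point: bool) -> None:
        if point_to_point:
            # No other downstream router can override the prune.
            self.timer = 0.0
        else:
            # Leave time for another router on the LAN to override.
            self.timer = min(self.timer, self.deletion_delay)
```

With a Holdtime of 210 seconds, a prune received on a multi-access
LAN leaves 70 seconds in which a Join from another downstream router
can restore the interface before it is removed from the oif list.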

For any new (S,G), (*,G) or (*,*,RP) entry created by an
incoming Join/Prune message, the SPT-bit is cleared (and if
a Join/Prune-Suppression timer is used, it is left off).

If the entry has a Join/Prune-Suppression timer associated with
it, and if the received Join/Prune does not indicate the router
as its target, then the receiving router examines the join and
prune lists to see if any addresses in the list
`completely-match' existing (S,G), (*,G), or (*,*,RP) state for
which the receiving router currently schedules Join/Prune
messages. An element on the join or prune list
`completely-matches' a route entry only if both the IP addresses
and RPT-bit flag are the same. If the incoming Join/Prune message
completely matches an existing (S,G), (*,G), or (*,*,RP) entry
and the Join/Prune arrived on the iif for that entry, then the
router compares the Holdtime included in the Join/Prune message,
to its own [Join/Prune-Holdtime]. If its own [Join/Prune-Holdtime]
is lower, the Join/Prune-Suppression-timer is started at the
[Join/Prune-Suppression-Timeout]. If the [Join/Prune-Holdtime]
is equal, the tie is resolved in favor of the Join/Prune Message
originator that has the higher IP address. When the
Join/Prune-Suppression-timer expires, the router triggers a
Join/Prune message for the corresponding entry(ies).

3.3 Register and Register-Stop

When a source first starts sending to a group its packets are
encapsulated in Register messages and sent to the RP. If the
data rate warrants source-specific paths, the RP sets up source
specific state and starts sending (S,G) Join/Prune messages
toward the source, with S in the join list.

3.3.1 Sending Registers and Receiving Register-Stops

Register messages are sent as follows:

   1  When a DR receives a packet from a directly connected
      source, S:

      1  If there is no corresponding (S,G) entry, and the
         router has RP-Set information, the DR creates one with
         the Register-Suppression-timer turned off and the RP
         address set according to the hash function mapping for
         the corresponding group. The oif list is copied from
         existing (*,G) or (*,*,RP) entries, if they exist. The
         iif of the (S,G) entry is always excluded from the oif
         list.

      2  If there is a (S,G) entry in existence, the DR simply
         restarts the corresponding Entry-timer.

      When a PMBR (e.g., a router that connects the PIM-SM region
      to a dense mode region running DVMRP or PIM-DM) receives a
      packet from a source in the dense mode region, the router
      treats the packet as if it were from a directly connected
      source. A separate document will describe the details of
      interoperability.

   2  If the new or previously-existing (S,G) entry's
      Register-Suppression-timer is not running, the data packet is
      encapsulated in a Register message and unicast to the RP
      for that group. The data packet is also forwarded according
      to (S,G) state in the DR if the oif list is not null; since
      a receiver may join the SP-tree while the DR is still
      registering to the RP.

   3  If the (S,G) entry's Register-Suppression-timer is running,
      the data packet is not sent in a Register message, it is
      just forwarded according to the (S,G) oif list.

When the DR receives a Register-Stop message, it restarts the
Register-Suppression-timer in the corresponding (S,G) entry(ies)
at [Register-Suppression-Timeout] seconds.
If there is data to be registered, the DR may send a null Register
(a Register message with a zero-length data portion in the inner
IP packet) to the RP, [Probe-Time] seconds before the
Register-Suppression-timer expires, to avoid sending occasional
bursts of traffic to an RP unnecessarily.

3.3.2 Receiving Register Messages and Sending Register-Stops

When a router (i.e., the RP) receives a Register message, the
router does the following:

   1  Decapsulates the data packet, and checks for a
      corresponding (S,G) entry.

      1  If a (S,G) entry with cleared (0) SPT bit exists, and
         the received Register does not have the
         Null-Register-Bit set to 1, the packet is forwarded; and
         the SPT bit is left cleared (0). If the SPT bit is 1,
         the packet is dropped, and Register-Stop messages are
         triggered. Register-Stops should be rate-limited (in
         an implementation-specific manner) so that no more
         than a few are sent per round trip time. This prevents
         a high data-rate stream of packets from triggering a
         large number of Register-Stop messages between the
         time that the first packet is received and the time
         when the source receives the first Register-Stop.

      2  If there is no (S,G) entry, but there is a (*,G)
         entry, and the received Register does not have the
         Null-Register-Bit set to 1, the packet is forwarded
         according to the (*,G) entry.

      3  If there is a (*,*,RP) entry but no (*,G) entry, and
         the Register received does not have the
         Null-Register-Bit set to 1, a (*,G) or (S,G) entry is
         created and the oif list is copied from the (*,*,RP)
         entry to the new entry. The packet is forwarded
         according to the created entry.

      4  If there is no G or (*,*,RP) entry corresponding to G,
         the packet is dropped, and a Register-Stop is
         triggered.

      5  A ``Border bit'' is included in the Register message,
         to facilitate interoperability mechanisms. PMBRs set
         this bit when registering for external sources (see
         Section 2.7). If the ``Border bit'' is set in the
         Register, the RP does the following:

         1  If there is no matching (S,G) state, but there
            exists (*,G) or (*,*,RP) entry, the RP creates a
            (S,G) entry, with a `PMBR' field. This field
            holds the source of the Register (i.e. the outer
            IP address of the register packet). The RP
            triggers a (S,G) join towards the source of the
            data packet, and clears the SPT bit for the (S,G)
            entry. If the received Register is not a `null
            Register' the packet is forwarded according to
            the created state. Else,

         2  If the `PMBR' field for the corresponding (S,G)
            entry matches the source of the Register packet,
            and the received Register is not a `null
            Register', the decapsulated packet is forwarded
            to the oif list of that entry. Else,

         3  If the `PMBR' field for the corresponding (S,G)
            entry matches the source of the Register packet,
            the decapsulated packet is forwarded to the oif
            list of that entry, else

         4  The packet is dropped, and a Register-stop is
            triggered towards the source of the Register.

      The (S,G) Entry-timer is restarted by Registers arriving
      from that source to that group.

   2  If the matching (S,G) or (*,G) state contains a null oif
      list, the RP unicasts a Register-Stop message to the source
      of the Register message; in the latter case, the
      source-address field, within the Register-Stop message, is
      set to the wildcard value (all 0's). This message is not
      processed by intermediate routers, hence no (S,G) state is
      constructed between the RP and the source.
1200 3 If the Register message arrival rate warrants it and there 1201 is no existing (S,G) entry, the RP sets up a (S,G) route 1202 entry with the outgoing interface list, excluding iif(S,G), 1203 copied from the (*,G) outgoing interface list; its SPT-bit 1204 is initialized to 0. If a (*,G) entry does not exist, but 1205 there exists a (*,*,RP) entry with the RP corresponding to 1206 G, the oif list for (S,G) is copied (excluding the iif) 1207 from that (*,*,RP) entry. 1209 A timer (Entry-timer) is set for the (S,G) entry and this 1210 timer is restarted by receipt of data packets for (S,G). 1211 The (S,G) entry causes the RP to send a Join/Prune message 1212 for the indicated group towards the source of the register 1213 message. 1215 If the (S,G) oif list becomes null, Join/Prune messages 1216 will not be sent towards the source, S. 1218 3.4 Multicast Data Packet Forwarding 1220 Processing a multicast data packet involves the following steps: 1222 1 Look up route state based on a longest match of the source 1223 address, and an exact match of the destination address in 1224 the data packet. If neither S nor G finds a longest-match 1225 entry, and the RP for the packet's destination group 1226 address has a corresponding (*,*,RP) entry, then the 1227 longest match does not require an exact match on the 1228 destination group address. In summary, the longest match is 1229 performed in the following order: (1) (S,G), (2) (*,G). If 1230 neither is matched, then a lookup is performed on (*,*,RP) 1231 entries. 1233 2 If the packet arrived on the interface found in the 1234 matching-entry's iif field, and the oif list is not 1235 null: 1237 1 Forward the packet to the oif list for that entry 1238 and restart the Entry-timer if the matching entry is 1239 (S,G). Optionally, the (S,G) Entry-timer may be 1240 restarted by periodic checking of the matching packet 1241 count.
1243 2 If the entry is a (S,G) entry with a cleared SPT-bit, 1244 and a (*,G) or associated (*,*,RP) also exists whose 1245 incoming interface is different than that for (S,G), 1246 set the SPT-bit for the (S,G) entry and trigger an 1247 (S,G) RPT-bit prune towards the RP. 1249 3 If the source of the packet is a directly-connected 1250 host and the router is the DR on the receiving 1251 interface, check the Register-Suppression-timer 1252 associated with the (S,G) entry. If it is not running, 1253 then the router encapsulates the data packet in a 1254 register message and sends it to the RP. 1256 This covers the common case of a packet arriving on the RPF 1257 interface to the source or RP and being forwarded to all 1258 joined branches. It also detects when packets arrive on the 1259 SP-tree, and triggers their pruning from the RP-tree. If it 1260 is the DR for the source, it sends data packets 1261 encapsulated in Registers to the RPs. 1263 3 If the packet matches to an entry but did not arrive on the 1264 interface found in the entry's iif field, check the 1265 SPT-bit of the entry. If the SPT-bit is set, drop the 1266 packet. If the SPT-bit is cleared, then lookup the (*,G), 1267 or (*,*,RP), entry for G. If the packet arrived on the 1268 iif found in (*,G), or the corresponding (*,*,RP), 1269 forward the packet to the oif list of the matching 1270 entry. This covers the case when a data packet matches on a 1271 (S,G) entry for which the SP-tree has not yet been 1272 completely established upstream. 1274 4 If the packet does not match any entry, but the source of 1275 the data packet is a local, directly-connected host, and 1276 the router is the DR on a multi-access LAN and has RP-Set 1277 information, the DR uses the hash function to determine the 1278 RP associated with the destination group, G. 
The DR creates 1279 a (S,G) entry, with the Register-Suppression-timer not 1280 running, encapsulates the data packet in a Register message 1281 and unicasts it to the RP. 1283 5 If the packet does not match to any entry, and it is not a 1284 local host or the router is not the DR, drop the packet. 1286 3.4.1 Data triggered switch to shortest path tree (SP-tree) 1288 Different criteria can be applied to trigger switching over from 1289 the RP-based shared tree to source-specific, shortest path 1290 trees. 1292 One proposed example is to do so based on data rate. For 1293 example, when a (*,G), or corresponding (*,*,RP), entry is 1294 created, a data rate counter may be initiated at the last-hop 1295 routers. The counter is incremented with every data packet 1296 received for directly connected members of an SM group, if the 1297 longest match is (*,G) or (*,*,RP). If and when the data rate 1298 for the group exceeds a certain configured threshold (t1), the 1299 router initiates `source-specific' data rate counters for the 1300 following data packets. Then, each counter for a source, is 1301 incremented when packets matching on (*,G), or (*,*,RP), are 1302 received from that source. If the data rate from the particular 1303 source exceeds a configured threshold (t2), a (S,G) entry is 1304 created and a Join/Prune message is sent towards the source. If 1305 the RPF interface for (S,G) is 1306 not the same as that for (*,G) -or (*,*,RP), then the SPT-bit 1307 is cleared in the (S,G) entry. 1309 Other configured rules may be enforced to cause or prevent 1310 establishment of (S,G) state. 1312 3.5 Assert 1314 Asserts are used to resolve which of the parallel routers 1315 connected to a multi-access LAN is responsible for forwarding 1316 packets onto the LAN. 
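As a rough illustration of the election that Sections 3.5.1 and 3.5.2 specify, the comparison of two Assert metrics can be sketched as follows. This is a minimal sketch and not part of the protocol specification: the AssertMetric class, its field names, the assert_winner function, and the two NO_ROUTE constants are hypothetical names chosen for the example. It assumes only the rules stated in Sections 3.5.1 and 3.5.2.1: the RPT-bit and metric preference (in that order) form the high-order part of the comparison, the lower value wins, and ties are broken in favor of the higher IP address.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address

# Values a router uses when the unicast lookup finds no route (Section 3.5.1).
NO_ROUTE_PREFERENCE = 0x7FFFFFFF
NO_ROUTE_METRIC = 0xFFFFFFFF


@dataclass(frozen=True)
class AssertMetric:
    """Illustrative container; field names are assumptions, not spec terms."""
    rpt_bit: int            # 1 if the data packet was routed down the RP-tree
    metric_preference: int  # preference of the unicast routing protocol
    metric: int             # unicast routing metric from the table lookup
    address: IPv4Address    # address of the router the metric belongs to

    def sort_key(self):
        # RPT-bit, then metric preference, form the high-order part.
        return (self.rpt_bit, self.metric_preference, self.metric)


def assert_winner(a: AssertMetric, b: AssertMetric) -> AssertMetric:
    """Return the metric of the router that keeps forwarding onto the LAN."""
    if a.sort_key() != b.sort_key():
        # The lower combined value wins the election.
        return a if a.sort_key() < b.sort_key() else b
    # Equal metrics: the higher IP address wins.
    return a if a.address > b.address else b


if __name__ == "__main__":
    r1 = AssertMetric(0, 110, 20, IPv4Address("192.0.2.1"))
    r2 = AssertMetric(0, 110, 20, IPv4Address("192.0.2.2"))
    print(assert_winner(r1, r2).address)
```

In this sketch a router with the RPT-bit set always loses to a router with the bit cleared, since the RPT-bit is the most significant part of the comparison. A router with no unicast route would use NO_ROUTE_PREFERENCE and NO_ROUTE_METRIC and so cannot beat a router with the same RPT-bit and a better preference.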
1318 3.5.1 Sending Asserts 1320 The following Assert rules are provided when a multicast packet 1321 is received on an outgoing multi-access interface ``I'' of an 1322 existing (S,G) entry: 1324 1 Do unicast routing table lookup on source IP address from 1325 data packet, and send assert on interface ``I'' for source 1326 IP address in data packet; include metric preference of 1327 routing protocol and metric from routing table lookup. 1329 2 If route is not found, use metric preference of 0x7fffffff 1330 and metric 0xffffffff. 1332 When an assert is sent for a (*,G) entry, the first bit in the 1333 metric preference (the RPT-bit) is set to 1, indicating the data 1334 packet is routed down the RP-tree. 1336 Asserts should be rate-limited in an implementation-specific 1337 manner. 1339 3.5.2 Receiving Asserts 1341 When an Assert is received the router performs a longest match 1342 on the source and group address in the Assert message. The 1343 router checks the first bit of the metric preference (RPT-bit). 1345 1 If the RPT-bit is set, the router first does a match on 1346 (*,G), or (*,*,RP), entries; if no matching entry is found, 1347 it ignores the Assert. 1349 2 If the RPT-bit is not set in the Assert, the router first 1350 does a match on (S,G) entries; if no matching entry is 1351 found, the router matches (*,G) or (*,*,RP) entries. 1353 3.5.2.1 Receiving Asserts on an entry's outgoing interface 1355 If the interface that received the Assert message is in the 1356 oif list of the matched entry, then this Assert is processed 1357 by this router as follows: 1359 1 If the Assert's RPT-bit is set and the matching entry is 1360 (*,*,RP), the router creates a (*,G) entry. If the Assert's 1361 RPT-bit is cleared and the matching entry is (*,G), or 1362 (*,*,RP), the router creates a (S,G)RPT-bit entry. 1363 Otherwise, no new entry is created in response to the 1364 Assert. 
1366 2 The router then compares the metric values received in the 1367 Assert with the metric values associated with the matched 1368 entry. The RPT-bit and metric preference (in that order) 1369 are treated as the high-order part of an Assert metric 1370 comparison. If the value in the Assert is less than the 1371 router's value (with ties broken by the IP address, where 1372 higher IP address wins), delete the interface from the 1373 entry. When the deletion occurs for a (*,G) or (*,*,RP) 1374 entry, the interface is also deleted from any associated 1375 (S,G)RPT-bit or (*,G) entries, respectively. The Entry- 1376 timer for the affected entries is restarted. 1378 3 If the router has won the election, the router keeps the 1379 interface in its outgoing interface list. It acts as the 1380 forwarder for the LAN. 1382 The winning router sends an Assert message containing its own 1383 metric to that outgoing interface. This will cause other routers 1384 on the LAN to prune that interface from their route entries. The 1385 winning router sets the RPT-bit in the Assert message if a (*,G) 1386 or (S,G)RPT-bit entry was matched. 1388 3.5.2.2 Receiving Asserts on an entry's incoming interface 1390 If the Assert arrived on the incoming interface of an existing 1391 (S,G), (*,G), or (*,*,RP) entry, the Assert is processed as 1392 follows. If the Assert message does not match the entry 1393 exactly, it is ignored; i.e., longest-match is not used in this 1394 case. If the Assert message does match exactly, then: 1396 1 Downstream routers will select the upstream router with the 1397 smallest metric preference and metric as their RPF 1398 neighbor. If two metrics are the same, the highest IP 1399 address is chosen to break the tie. This is important so 1400 that downstream routers send subsequent Joins/Prunes (in 1401 SM) to the correct neighbor. An Assert-timer is initiated 1402 when changing the RPF neighbor to the Assert winner.
When 1403 the timer expires, the router resets its RPF neighbor 1404 according to its unicast routing tables to capture network 1405 dynamics and router failures. 1407 2 If the downstream routers have downstream members, and if 1408 the Assert caused the RPF neighbor to change, the 1409 downstream routers must trigger a Join/Prune message to 1410 inform the upstream router that packets are to be forwarded 1411 on the multi-access network. 1413 3.6 Candidate-RP-Advertisements and Bootstrap messages 1415 Candidate-RP-Advertisements (C-RP-Advs) are periodic PIM 1416 messages unicast to the BSR by those routers that are configured 1417 as Candidate-RPs (C-RPs). 1419 Bootstrap messages are periodic PIM messages originated by the 1420 Bootstrap router (BSR) within a domain, and forwarded hop-by-hop 1421 to distribute the current RP-set to all routers in that domain. 1423 The Bootstrap messages also support a simple mechanism by which 1424 the Candidate BSR (C-BSR) with the highest BSR-priority and IP 1425 address (referred to as the preferred BSR) is elected as the BSR 1426 for the domain. We recommend that each router configured as a 1427 C-RP also be configured as a C-BSR. Sections 3.6.2 and 3.6.3 1428 describe the combined function of Bootstrap messages as the 1429 vehicle for BSR election and RP-Set distribution. 1431 A Finite State Machine description of the BSR election and RP- 1432 Set distribution mechanisms is included in Appendix II. 1434 3.6.1 Sending Candidate-RP-Advertisements 1436 C-RPs periodically unicast C-RP-Advs to the BSR for that domain. 1437 The interval for sending these messages is subject to local 1438 configuration at the C-RP. 1440 Candidate-RP-Advertisements carry group address and group mask 1441 fields. This enables the advertising router to limit the 1442 advertisement to certain prefixes or scopes of groups. The 1443 advertising router may enforce this scope acceptance when 1444 receiving Registers or Join/Prune messages. 
C-RPs should send 1445 C-RP-Adv messages with the Authoritative bit cleared. 1447 3.6.2 Receiving C-RP-Advs and Originating Bootstrap 1449 Upon receiving a C-RP-Adv, a router does the following: 1451 1 If the router is not the elected BSR, it ignores the 1452 message, else 1454 2 The BSR adds the RP address to its local pool of candidate 1455 RPs, according to the associated group prefix(es) in the 1456 C-RP-Adv message. The Holdtime in the C-RP-Adv message is 1457 also stored with the corresponding RP, to be included later 1458 in the Bootstrap message. The BSR may apply a local 1459 policy to limit the number of Candidate RPs included 1460 in the Bootstrap message. 1461 The BSR may override the prefix indicated in a C-RP-Adv 1462 unless the Authoritative bit in the C-RP-Adv is set. 1464 The BSR keeps an RP-timer per RP in its local RP-set. The RP- 1465 timer is initialized to the Holdtime in the RP's C-RP-Adv. When 1466 the timer expires, the corresponding RP is removed from the RP- 1467 set. The RP-timer is restarted by the C-RP-Advs from the 1468 corresponding RP. 1470 The BSR also uses its Bootstrap-timer to periodically send 1471 Bootstrap messages. In particular, when the Bootstrap-timer 1472 expires, the BSR originates a Bootstrap message on each of its 1473 PIM interfaces. The message is sent with a TTL of 1 to the 1474 `ALL-PIM-ROUTERS' group. In steady state, the BSR originates 1475 Bootstrap messages periodically. At startup, the Bootstrap-timer 1476 is initialized to [Bootstrap-Timeout], causing the first 1477 Bootstrap message to be originated only when and if the timer 1478 expires. For timer details, see Section 3.6.3. A DR unicasts a 1479 Bootstrap message to each new PIM neighbor, i.e., after the DR 1480 receives the neighbor's Hello message (it does so even if the 1481 new neighbor becomes the DR). 1483 The Bootstrap message is subdivided into sets of {group- 1484 prefix,RP-Count,RP-addresses}.
1485 For each RP-address, the corresponding Holdtime is included in the 1486 ``RP-Holdtime'' field. The format of the Bootstrap 1487 message allows `semantic fragmentation', if the length of the 1488 original Bootstrap message exceeds the packet maximum boundaries 1489 (see Section 4). However, we recommend against configuring a 1490 large number of routers as C-RPs, to reduce the semantic 1491 fragmentation required. 1493 3.6.3 Receiving and Forwarding Bootstrap 1495 Each router keeps a Bootstrap-timer, initialized to [Bootstrap- 1496 Timeout] at startup. 1498 When a router receives a Bootstrap message sent to the `ALL-PIM- 1499 ROUTERS' group, it performs the following: 1501 1 If the message was not sent by the RPF neighbor towards the 1502 BSR address included, the message is dropped. Else 1504 2 If the included BSR is not preferred over, and not equal 1505 to, the currently active BSR: 1507 1 If the Bootstrap-timer has not yet expired, or if the 1508 receiving router is a C-BSR, then the Bootstrap 1509 message is dropped. Else 1511 2 If the Bootstrap-timer has expired and the receiving 1512 router is not a C-BSR, the receiving router stores the 1513 RP-Set and BSR address and priority found in the 1514 message, and restarts the timer by setting it to 1515 [Bootstrap-Timeout]. The Bootstrap message is then 1516 forwarded out all PIM interfaces, excluding the one 1517 over which the message arrived, to `ALL-PIM-ROUTERS' 1518 group, with a TTL of 1. 1520 3 If the Bootstrap message includes a BSR address that is 1521 preferred over, or equal to, the currently active BSR, the 1522 router restarts its Bootstrap-timer at [Bootstrap-Timeout] 1523 seconds, and stores the BSR address and RP-Set information. 1525 The Bootstrap message is then forwarded out all PIM 1526 interfaces, excluding the one over which the message 1527 arrived, to `ALL-PIM-ROUTERS' group, with a TTL of 1.
1529 4 If the receiving router has no current RP set information 1530 and the Bootstrap was unicast to it from a directly 1531 connected neighbor, the router stores the information as 1532 its new RP-set. This covers the startup condition when a 1533 newly booted router obtains the RP-Set and BSR address from 1534 its DR. 1536 When a router receives a new RP-Set, it checks if each of the 1537 RPs referred to by existing state (i.e., by (*,G), (*,*,RP), or 1538 (S,G)RPT-bit entries) is in the new RP-Set. If an RP is not in 1539 the new RP-set, that RP is considered unreachable and the hash 1540 algorithm (see below) is re-performed for each group with 1541 locally active state that previously hashed to that RP. This 1542 will cause those groups to be distributed among the remaining 1543 RPs. When the new RP-Set contains a new RP, the value of the new 1544 RP is calculated for each group covered by that C-RP's Group- 1545 prefix. Any group for which the new RP's value is greater than 1546 the previously active RP's value is switched over to the new RP. 1548 3.7 Hash Function 1550 The hash function is used by all routers within a domain, to map 1551 a group to one of the C-RPs from the RP-Set. For a particular 1552 group, G, the hash function uses only those C-RPs whose Group- 1553 prefix covers G. The algorithm takes as input the group address, 1554 and the addresses of the Candidate RPs, and gives as output one 1555 RP address to be used. 1557 The protocol requires that all routers hash to the same RP 1558 within a domain (except for transients). The following hash 1559 function must be used in each router: 1561 1 For each RP address C(i) in the RP-Set, whose Group-prefix 1562 covers G, compute a value: 1564 Value(G,M,C(i))= 1565 (1103515245 * ((1103515245 * (G&M)+12345) XOR C(i)) + 12345) mod 2^31 1567 where M is a hash-mask included in Bootstrap messages. 1568 This hash-mask allows a small number of consecutive groups 1569 (e.g., 4) to always hash to the same RP. 
For instance, 1570 hierarchically-encoded data can be sent on consecutive 1571 group addresses to get the same delay and fate-sharing 1572 characteristics. 1574 2 The candidate with the highest resulting value is then 1575 chosen as the RP for that group, and its identity and hash 1576 value are stored with the entry created. 1578 Ties between C-RPs having the same hash value, are broken 1579 in advantage of the highest address. 1581 The hash function algorithm is invoked by a DR, upon reception 1582 of a packet, or IGMP membership indication, for a group, for 1583 which the DR has no entry. It is invoked by any router that has 1584 (*,*,RP) state when a packet is received for which there is no 1585 corresponding (S,G) or (*,G) entry. Furthermore, the hash 1586 function is invoked by all routers upon receiving a (*,G) or 1587 (*,*,RP) Join/Prune message. 1589 3.8 Processing Timer Events 1591 In this subsection, we enumerate all timers that have been 1592 discussed or implied. Since some critical timer events are not 1593 associated with the receipt or sending of messages, they are not 1594 fully covered by earlier subsections. 1596 Timers are implemented in an implementation-specific manner. For 1597 example, a timer may count up or down, or may simply expire at a 1598 specific time. Setting a timer to a value T means that it will 1599 expire after T seconds. 1601 3.8.1 Timers related to tree maintenance 1603 Each (S,G), (*,G), and (*,*,RP) route entry has multiple timers 1604 associated with it: one for each interface in the outgoing 1605 interface list, one for the multicast routing entry itself, and 1606 one optional Join/Prune-Suppression-Timer. Each (S,G) and (*,G) 1607 entry also has an Assert-timer and a Random-Delay-Join-Timer for 1608 use with Asserts. In addition, DR's have a Register- 1609 Suppression-timer for each (S,G) entry and every router has a 1610 single Join/Prune-timer. 
(A router may optionally keep separate 1611 Join/Prune-timers for different interfaces or route entries if 1612 different Join/Prune periods are desired.) 1614 * [Join/Prune-Timer] This timer is used for periodically 1615 sending aggregate Join/Prune messages. To avoid 1616 synchronization among routers booting simultaneously, it is 1617 initially set to a random value between 1 and [Join/Prune- 1618 Period]. When it expires, the timer is immediately 1619 restarted to [Join/Prune-Period]. A Join/Prune message is 1620 then sent out each interface. This timer should not be 1621 restarted by other events. 1623 * [Join/Prune-Suppression-Timer (kept per route entry)] A 1624 route entry's (optional) Join/Prune-Suppression-Timer may 1625 be used to suppress duplicate joins from multiple 1626 downstream routers on the same LAN. When a Join message is 1627 received from a neighbor on the entry's incoming interface 1628 in which the included Holdtime is higher than the router's 1629 own [Join/Prune-Holdtime] (with ties broken by higher IP 1630 address), the timer is set to [Join/Prune-Suppression- 1631 Timeout], with some random jitter introduced to avoid 1632 synchronization of triggered Join/Prune messages on 1633 expiration. (The random timeout value must be < 1.5 * 1634 [Join/Prune-Period] to prevent losing data after 2 dropped 1635 Join/Prunes.) The timer is restarted every time a 1636 subsequent Join/Prune message (with higher Holdtime/IP 1637 address) for the entry is received on its incoming 1638 interface. While the timer is running, Join/Prune messages 1639 for the entry are not sent. This timer is idle (not 1640 running) for point-to-point links. 1642 * [Oif-Timer (kept per oif for each route entry)] A timer for 1643 each oif of a route entry is used to time out that oif. 
1644 Because some of the outgoing interfaces in an (S,G) entry 1645 are copied from the (*,G) outgoing interface list, they may 1646 not have explicit (S,G) join messages from some of the 1647 downstream routers (i.e., where members are joining to the 1648 (*,G) tree only). Thus, when an Oif-timer is restarted in a 1649 (*,G) entry, the Oif-timer is restarted for that interface 1650 in each existing (S,G) entry whose oif list contains that 1651 interface. The same rule applies to (*,G) and (S,G) entries 1652 when restarting an Oif-timer on a (*,*,RP) entry. 1654 The following table shows its usage when first adding the 1655 oif to the entry's oiflist, when it should be restarted 1656 (unless it is already higher), and when it should be 1657 decreased (unless it is already lower). 1659 Set to | When | Applies to 1660 included Holdtime | adding oif off Join/Prune | (S,G) (*,G) (*,*,RP) 1662 Increased (only) to | When | Applies to 1663 included Holdtime | received Join/Prune | (S,G) (*,G) (*,*,RP) 1664 (*,*,RP) oif-timer value | (*,*,RP) oif-timer restarted | (S,G) (*,G) 1665 (*,G) oif-timer value | (*,G) oif-timer restarted | (S,G) 1667 Decreased (only) to | When | Applies to 1668 Oif-Deletion-Delay | prune received | (S,G) (*,G) 1669 When the timer expires, the oif is removed from the oiflist 1670 if there are no directly-connected members. When deleted, 1671 the oif is also removed in any associated (S,G) or (*,G) 1672 entries. 1674 * [Entry-Timer (kept per route entry)] A timer for each route 1675 entry is used to time out that entry. The following table 1676 summarizes its usage when first adding the oif to the 1677 entry's oiflist, and when it should be restarted (unless it 1678 is already higher). 
1680 Set to | When | Applies to 1681 [Data-Timeout] | created off data packet | (S,G) 1682 included Holdtime | created off Join/Prune | (S,G) (*,G) (*,*,RP) 1684 Increased (only) to | When | Applies to 1685 [Data-Timeout] | receiving data packets | (S,G)no RPT-bit 1686 oif-timer value | any oif-timer restarted | (S,G)RPT-bit (*,G) (*,*,RP) 1687 [Assert-Timeout] | assert received | (S,G)RPT-bit (*,G) w/null oif 1689 When the timer expires, the route entry is deleted; if the 1690 entry is a (*,G) or (*,*,RP) entry, all associated 1691 (S,G)RPT-bit entries are also deleted. 1693 * [Register-Suppression-Timer (kept per (S,G) route entry)] 1694 An (S,G) route entry's Register-Suppression-Timer is used 1695 to suppress registers when the RP is receiving data packets 1696 natively. When a Register-Stop message for the entry is 1697 received from the RP, the timer is set to a random value in 1698 the range 0.5 * [Register-Suppression-Timeout] to 1.5 * 1699 [Register-Suppression-Timeout]. While the timer is running, 1700 Registers for that entry will be suppressed. If null 1701 registers are used, a null register is sent [Probe-Time] 1702 seconds before the timer expires. 1704 * [Assert-Timer (per (S,G) or (*,G) route entry)] The 1705 Assert-Timer for an (S,G) or (*,G) route entry is used for 1706 timing out Asserts received. When an Assert is received and 1707 the RPF neighbor is changed to the Assert winner, the 1708 Assert-Timer is set to [Assert-Timeout], and is restarted 1709 to this value every time a subsequent Assert for the entry 1710 is received on its incoming interface. When the timer 1711 expires, the router resets its RPF neighbor according to 1712 its unicast routing table. 1714 * [Random-Delay-Join-Timer (per (S,G) or (*,G) route entry)] 1715 The Random-Delay-Join-Timer for an (S,G) or (*,G) route 1716 entry is used to prevent synchronization among downstream 1717 routers on a LAN when their RPF neighbor changes.
When the 1718 RPF neighbor changes, this timer is set to a random value 1719 between 0 and [Random-Delay-Join-Timeout] seconds. When the 1720 timer expires, a triggered Join/Prune message is sent for 1721 the entry unless its Join/Prune-Suppression-Timer is 1722 running. 1724 3.8.2 Timers relating to neighbor discovery 1726 * [Hello-Timer] This timer is used to periodically send Hello 1727 messages. To avoid synchronization among routers booting 1728 simultaneously, it is initially set to a random value 1729 between 1 and [Hello-Period]. When it expires, the timer is 1730 immediately restarted to [Hello-Period]. A Hello message is 1731 then sent out each interface. This timer should not be 1732 restarted by other events. 1734 * [Neighbor-Timer (kept per neighbor)] A Neighbor-Timer for 1735 each neighbor is used to time out the neighbor state. When 1736 a Hello message is received from a new neighbor, the timer 1737 is initially set to the Holdtime included in the Hello 1738 message (which is equal to the neighbor's value of [Hello- 1739 Holdtime]). Every time a subsequent Hello is received from 1740 that neighbor, the timer is restarted to the Holdtime in 1741 the Hello. When the timer expires, the neighbor state is 1742 removed. 1744 3.8.3 Timers relating to RP information 1746 * [C-RP-Adv-Timer (C-RP's only)] Routers configured as 1747 candidate RP's use this timer to periodically send C-RP-Adv 1748 messages. To avoid synchronization among routers booting 1749 simultaneously, the timer is initially set to a random 1750 value between 1 and [C-RP-Adv-Period]. When it expires, the 1751 timer is immediately restarted to [C-RP-Adv-Period]. A C- 1752 RP-Adv message is then sent to the elected BSR. This timer 1753 should not be restarted by other events. 1755 * [RP-Timer (BSR only, kept per RP in RP-Set)] The BSR uses a 1756 timer per RP in the RP-Set to monitor liveness. 
When a C-RP 1757 is added to the RP-Set, its timer is set to the Holdtime 1758 included in the C-RP-Adv message from that C-RP (which is 1759 equal to the C-RP's value of [RP-Holdtime]). Every time a 1760 subsequent C-RP-Adv is received from that RP, its timer is 1761 restarted to the Holdtime in the C-RP-Adv. When the timer 1762 expires, the RP is removed from the RP-Set included in 1763 Bootstrap messages. 1765 * [Bootstrap-Timer] This timer is used by the BSR to 1766 periodically originate Bootstrap messages, and by other 1767 routers to time out the BSR (see 1768 Section 3.6.3). This timer is initially set to [Bootstrap- 1769 Timeout]. A C-BSR restarts this timer to [Bootstrap- 1770 Timeout] upon receiving a Bootstrap message from a 1771 preferred router, and originates a Bootstrap message and 1772 restarts the timer to [Bootstrap-Period] when it expires. 1773 Routers not configured as C-BSR's restart this timer to 1774 [Bootstrap-Timeout] upon receiving a Bootstrap message from 1775 the elected or a more preferred BSR, and ignore Bootstrap 1776 messages from non-preferred C-BSRs while it is running. 1778 3.8.4 Default timer values 1780 Most of the default timeout values for state information are 3.5 1781 times the refresh period. For example, Hellos refresh Neighbor 1782 state and the default Hello-timer period is 30 seconds, so a 1783 default Neighbor-timer duration of 105 seconds is included in 1784 the Holdtime field of the Hellos. In order to improve 1785 convergence, however, the default timeout value for information 1786 related to RP liveness and Bootstrap messages is 2.5 times the 1787 refresh period. 1789 In this version of the spec, we suggest particular numerical 1790 timer settings. A future version of the specification will 1791 specify a mechanism for timer values to be scaled based upon 1792 observed network parameters. 1794 * [Join/Prune-Period] This is the interval between 1795 sending Join/Prune messages.
{Default: 60 seconds.} This 1796 value may be set to take into account such things as the 1797 configured bandwidth and expected average number of 1798 multicast route entries for the attached network or link 1799 (e.g., the period would be longer for lower-speed links, or 1800 for routers in the center of the network that expect to 1801 have a larger number of entries). In addition, a router 1802 could modify this value (and corresponding Join/Prune- 1803 Holdtime value) if the number of route entries changes 1804 significantly (e.g., by an order of magnitude). For 1805 example, given a default minimum Join/Prune-Period value, 1806 if the number of route entries with a particular iif 1807 increases from N to N*100, the router could increase its 1808 Join/Prune-Period (and Join/Prune-Holdtime), for that 1809 interface, by a factor of 10; and if/when the number of 1810 entries decreases back to N, the Join/Prune-Period (and 1811 Join/Prune-Holdtime) could be decreased to its previous 1812 value. If the Join/Prune-Period is modified, these changes 1813 should be made relatively infrequently and the router 1814 should continue to refresh at its previous Join/Prune- 1815 Period for at least Join/Prune-Holdtime, in order to allow 1816 the upstream router to adapt. 1818 * [Join/Prune-Holdtime] This is the Holdtime specified in 1819 Join/Prune messages, and is used to time out oifs. This 1820 should be set to 3.5 * [Join/Prune-Period]. {Default: 210 1821 seconds.} 1823 * [Join/Prune-Suppression-Timeout] This is the mean 1824 interval between receiving a Join/Prune with a higher 1825 Holdtime (with ties broken by higher IP address) and 1826 allowing duplicate Join/Prunes to be sent again. This 1827 should be set to approximately 1.25 * [Join/Prune-Period]. 1828 {Default: 75 seconds.} 1830 * [Data-Timeout] This is the time after which (S,G) state 1831 for a silent source will be deleted.
{Default: 210 1832 seconds.} 1834 * [Register-Suppression-Timeout] This is the mean 1835 interval between receiving a Register-Stop and allowing 1836 Registers to be sent again. A lower value means more 1837 frequent register bursts at RP, while a higher value means 1838 longer join latency for new receivers. {Default: 60 1839 seconds.} (Note that if null Registers are sent [Probe- 1840 Time] seconds before the timeout, register bursts are 1841 prevented, and [Register-Suppression-Timeout] may be lowered 1842 to decrease join latency.) 1844 * [Probe-Time] When null Registers are used, this is the 1845 time between sending a null Register and the Register- 1846 Suppression-Timer expiring unless it is restarted by 1847 receiving a Register-Stop. Thus, a null Register would be 1848 sent when the Register-Suppression-Timer reaches this 1849 value. {Default: 5 seconds.} 1851 * [Assert-Timeout] This is the interval between the last 1852 time an Assert is received, and the time at which the 1853 assert is timed out. {Default: 180 seconds.} 1855 * [Random-Delay-Join-Timeout] This is the maximum 1856 interval between the time when the RPF neighbor changes, 1857 and the time at which a triggered Join/Prune message is 1858 sent. {Default: 4.5 seconds.} 1860 * [Hello-Period] This is the interval between sending 1861 Hello messages. {Default: 30 seconds.} 1863 * [Hello-Holdtime] This is the Holdtime specified in 1864 Hello messages, after which neighbors will time out their 1865 neighbor entries for the router. This should be set to 3.5 1866 * [Hello-Period]. {Default: 105 seconds.} 1868 * [C-RP-Adv-Period] For C-RPs, this is the interval 1869 between sending C-RP-Adv messages. {Default: 60 seconds.} 1871 * [RP-Holdtime] For C-RPs, this is the Holdtime specified 1872 in C-RP-Adv messages, and is used by the BSR to time out 1873 RPs. This should be set to 2.5 * [C-RP-Adv-Period].
     {Default: 150 seconds.}

   * [Bootstrap-Period] At the elected BSR, this is the interval
     between originating Bootstrap messages, and should be equal to
     60 seconds.

   * [Bootstrap-Timeout] This is the time after which the elected BSR
     will be assumed unreachable when Bootstrap messages are not
     received from it. This should be set to
     2.5 * [Bootstrap-Period]. {Default: 150 seconds.}

3.9 Summary of flags used

   Following is a summary of all the flags used in our scheme.

   Bit           | Used in     | Definition

   Authoritative | C-RP-Adv    | Group-prefix information should not be
                 |             | overridden by BSR
   Border        | Register    | Register for external sources is coming
                 |             | from PIM multicast border router
   Null          | Register    | Register sent as Probe of RP, the
                 |             | encapsulated IP data packet should not
                 |             | be forwarded
   RPT           | Route entry | Entry represents state on the RP-tree
   RPT           | Join/Prune  | Join is associated with the shared tree
                 |             | and therefore the Join/Prune message is
                 |             | propagated along the RP-tree (source
                 |             | encoded is an RP address)
   RPT           | Assert      | The data packet was routed down the
                 |             | shared tree; thus, the path indicated
                 |             | corresponds to the RP tree
   SPT           | (S,G) entry | Packets have arrived on the iif towards
                 |             | S, and the iif is different from the
                 |             | (*,G) iif
   WC            | Join        | The receiver expects to receive packets
                 |             | from all sources via this (shared tree)
                 |             | path. Thus, the Join/Prune applies to a
                 |             | (*,G) entry
   WC            | Route entry | Wildcard entry; if there is no more
                 |             | specific match for a particular source,
                 |             | packets will be forwarded according to
                 |             | this entry

3.10 Security

   All PIM control messages may use IPSec [6] to address security
   concerns.

4 Packet Formats

   This section describes the details of the packet formats for PIM
   control messages.

   All PIM control messages have protocol number 103.

   Basically, PIM messages are either unicast (e.g.
   Registers and Register-Stop), or multicast hop-by-hop to
   `ALL-PIM-ROUTERS' group `224.0.0.13' (e.g. Join/Prune, Asserts,
   etc.).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  | Addr length   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   PIM Ver
      PIM Version number is 2.

   Type
      Types for specific PIM messages. PIM Types are:

         0 = Hello
         1 = Register
         2 = Register-Stop
         3 = Join/Prune
         4 = Bootstrap
         5 = Assert
         6 = Graft (used in PIM-DM only)
         7 = Graft-Ack (used in PIM-DM only)
         8 = Candidate-RP-Advertisement

   Addr length
      Address length in bytes. Throughout this section this would
      indicate the number of bytes in the Address field of an
      address, including unicast and group addresses.

   Checksum
      The checksum is the 16-bit one's complement of the one's
      complement sum of the entire PIM message (excluding the data
      portion in the Register message). For computing the checksum,
      the checksum field is zeroed.

4.1 Encoded Source and Group Address formats

   1  Unicast address: Only the address is included. The length of
      the unicast address in bytes is specified in the `Addr length'
      field in the header.

   2  Encoded-Group-Address: Takes the following format:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |   Reserved    |  Mask Len     | Group multicast Address ...   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | ...Group multicast Address ...|
      +-+-+-+-+-+-+-+-+-+-+~+~+~+~+~+~+

      Reserved
         Transmitted as zero. Ignored upon receipt.

      Mask Len
         The Mask length is 8 bits.
         The value is the number of contiguous bits, left-justified,
         used as a mask which describes the address. It is less than
         or equal to Addr length * 8. If the message is sent for a
         single group then the Mask length must equal
         Addr length * 8 (i.e. 32 for IPv4 and 128 for IPv6).

      Group multicast Address
         Contains the group address, and has a number of bytes equal
         to that specified in the Addr length field.

   3  Encoded-Source-Address: Takes the following format:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Rsrvd   |S|W|R|  Mask Len     | Source Address ...            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | ... Source Address            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+~+~+-+

      Reserved
         Transmitted as zero, ignored on receipt.

      S,W,R
         See Section 4.5 for details.

      Mask Length
         Mask length is 8 bits. The value is the number of contiguous
         bits, left-justified, used as a mask which describes the
         address. The mask length must be less than or equal to
         Addr length * 8. If the message is sent for a single source
         then the Mask length must equal Addr length * 8. In version
         2 of PIM, it is strongly recommended that this field be set
         to 32 for IPv4.

      Source Address
         The address length is indicated from the Addr length field
         at the beginning of the header. For IPv4, the address length
         is 4 octets.

4.2 Hello Message

   It is sent periodically by routers on all interfaces.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  | Addr length   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          OptionType           |         OptionLength          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          OptionValue                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+~+~+
   |                               .                               |
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          OptionType           |         OptionLength          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          OptionValue                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+~+~+

   PIM Version, Type, Addr length, Checksum
      Described above.

   OptionType
      The type of the option given in the following OptionValue
      field.

   OptionLength
      The length of the OptionValue field in bytes.

   OptionValue
      A variable length field, carrying the value of the option.

   The Option fields may contain the following values:

   * OptionType = 1; OptionLength = 2; OptionValue = Holdtime; where
     Holdtime is the amount of time a receiver must keep the neighbor
     reachable, in seconds. If the Holdtime is set to `0xffff', the
     receiver of this message never times out the neighbor. This may
     be used with ISDN lines, to avoid keeping the link up with
     periodic Hello messages.

     Furthermore, if the Holdtime is set to `0', the information is
     timed out immediately.

   * OptionType 2 to 16: reserved

   * The rest of the OptionTypes are defined in another document.

   In general, options may be ignored; but a router must not ignore
   the

4.3 Register Message

   A Register message is sent by the DR or a PMBR to the RP when a
   multicast packet needs to be transmitted on the RP-tree.
   The source IP address is set to the address of the DR; the
   destination IP address is set to the RP's address.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  | Addr length   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |B|N|                         Reserved                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                     Multicast data packet                     ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   PIM Version, Type, Addr length, Checksum
      Described above. {Note that the checksum for Registers is done
      only on the PIM header, excluding the data packet portion.}

   B
      The Border bit. If the router is a DR for a source that it is
      directly connected to, it sets the B bit to 0. If the router is
      a PMBR for a source in a directly connected cloud, it sets the
      B bit to 1.

   N
      The Null-Register bit. Set to 1 by a DR that is probing the RP
      before expiring its local Register-Suppression timer. Set to 0
      otherwise.

   Multicast data packet
      The original packet sent by the source.

      For (S,G) null Registers, the Multicast data packet portion
      contains only a dummy IP header with S as the source address,
      G as the destination address, and a data length of zero.

4.4 Register-Stop Message

   A Register-Stop is unicast from the RP to the sender of the
   Register message. The source IP address is the address to which
   the Register was addressed. The destination IP address is the
   source address of the Register message.
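
   The Register header of Section 4.3 and the Register-Stop
   addressing rule stated above can be illustrated with a short
   Python sketch. It is not part of the specification: the function
   names (`internet_checksum', `build_register',
   `register_stop_addresses') and the 192.0.2.x example addresses are
   assumptions made for illustration. The sketch assumes the 8-byte
   header layout shown in Section 4.3, with B and N in the two
   high-order bits of the second 32-bit word.

```python
import struct

PIM_VERSION = 2          # "PIM Version number is 2"
PIM_TYPE_REGISTER = 1    # Type 1 = Register


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"                 # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_register(inner_packet: bytes, border: bool = False,
                   null: bool = False, addr_len: int = 4) -> bytes:
    """Hypothetical helper: PIM Register = 8-byte header + data packet."""
    first_byte = (PIM_VERSION << 4) | PIM_TYPE_REGISTER
    flags = (int(border) << 31) | (int(null) << 30)   # B and N bits, rest zero
    # The checksum field is zeroed while computing the checksum, and for
    # Registers it covers the PIM header only, not the data packet.
    header = struct.pack("!BBHI", first_byte, addr_len, 0, flags)
    csum = internet_checksum(header)
    header = struct.pack("!BBHI", first_byte, addr_len, csum, flags)
    return header + inner_packet


def register_stop_addresses(register_src: str, register_dst: str):
    """Return (source, destination) for the Register-Stop reply."""
    return register_dst, register_src


msg = build_register(b"dummy-ip-header", null=True)
stop_src, stop_dst = register_stop_addresses("192.0.2.1", "192.0.2.9")
```

   A receiver can validate the header by running the same checksum
   over the first 8 bytes of the message; a correct header yields 0.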

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  | Addr length   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Encoded-Group Address                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Unicast-Source Address                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   PIM Version, Type, Addr length, Checksum
      Described above.

   Encoded-Group Address
      Format described above. Note that for Register-Stops the Mask
      Len field contains Addr length * 8 (32 for IPv4), if the
      message is sent for a single group.

   Unicast-Source Address
      IP host address of source from multicast data packet in
      register. The length of this field in bytes is specified in the
      Addr length field. A special wild card value (0.0.0.0) can be
      used to indicate any source.

4.5 Join/Prune Message

   A Join/Prune message is sent by routers towards upstream sources
   and RPs. Joins are sent to build shared trees (RP trees) or source
   trees (SPT). Prunes are sent to prune source trees when members
   leave groups as well as sources that do not use the shared tree.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  | Addr length   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Unicast-Upstream Neighbor Address               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Reserved     | Num groups    |          Holdtime             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Encoded-Multicast Group Address-1               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Number of Joined Sources    |   Number of Pruned Sources    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Encoded-Joined Source Address-1                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Encoded-Joined Source Address-n                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Encoded-Pruned Source Address-1                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Encoded-Pruned Source Address-n                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               .                               |
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Encoded-Multicast Group Address-n               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Number of Joined Sources    |   Number of Pruned Sources    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Encoded-Joined Source Address-1                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Encoded-Joined Source Address-n                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Encoded-Pruned Source Address-1                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Encoded-Pruned Source Address-n                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   PIM Version, Type, Addr length, Checksum
      Described above.

   Upstream Neighbor Address
      The IP address of the RPF or upstream neighbor.

   Reserved
      Transmitted as zero, ignored on receipt.

   Holdtime
      The amount of time a receiver must keep the Join/Prune state
      alive, in seconds. If the Holdtime is set to `0xffff', the
      receiver of this message never times out the oif. This may be
      used with ISDN lines, to avoid keeping the link up with
      periodic Join/Prune messages. Furthermore, if the Holdtime is
      set to `0', the information is timed out immediately.

   Number of Groups
      The number of multicast group sets contained in the message.

   Encoded-Multicast group address
      For format description see Section 4.1. A wild card group in
      the (*,*,RP) join is represented by a 224.0.0.0 in the group
      address field and `4' in the mask length field. A (*,*,RP)
      join also has the WC-bit and the RPT-bit set.

   Number of Joined Sources
      Number of join source addresses listed for a given group.

   Join Source Address-1 .. n
      This list contains the sources that the sending router will
      forward multicast datagrams for if received on the interface
      this message is sent on.

      See format section 4.1. The fields explanation for the
      Encoded-Source-Address format follows:

      Reserved
         Described above.

      S  The Sparse bit is a 1 bit value, set to 1 for PIM-SM.
         It is used for PIM v.1 compatibility.

      W  The WC bit is a 1 bit value. If 1, the join or prune applies
         to the (*,G) or (*,*,RP) entry. If 0, the join or prune
         applies to the (S,G) entry where S is Source Address. Joins
         and prunes sent towards the RP must have this bit set.

      R  The RPT-bit is a 1 bit value. If 1, the information about
         (S,G) is sent towards the RP. If 0, the information must be
         sent toward S, where S is the Source Address.

      Mask Length, Source Address
         Described above.

      Represented in the form of
      < WC-bit >< RPT-bit >< Mask length >< Source address >:

      A source address could be a host IP address:

         < 0 >< 0 >< 32 >< 192.1.1.17 >

      A source address could be the RP's IP address:

         < 1 >< 1 >< 32 >< 131.108.13.111 >

      A source address could be a subnet address to prune from the
      RP-tree:

         < 0 >< 1 >< 28 >< 192.1.1.16 >

      A source address could be a general aggregate:

         < 0 >< 0 >< 16 >< 192.1.0.0 >

   Number of Pruned Sources
      Number of prune source addresses listed for a group.

   Prune Source Address-1 .. n
      This list contains the sources that the sending router does not
      want to forward multicast datagrams for when received on the
      interface this message is sent on. If the Join/Prune message
      boundary exceeds the maximum packet size, then the join and
      prune lists for the same group must be included in the same
      packet.

4.6 Bootstrap Message

   The Bootstrap messages are multicast to the `ALL-PIM-ROUTERS'
   group, out all interfaces having PIM neighbors (excluding the one
   over which the message was received). Bootstrap messages are sent
   with a TTL value of 1. Bootstrap messages originate at the BSR,
   and are forwarded by intermediate routers.

   A Bootstrap message is divided into `semantic fragments' if the
   original message exceeds the maximum packet size.
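
   One possible way for a receiver to handle semantic fragments is
   sketched below in Python. The sketch relies on the `Fragment Tag',
   `RP-Count' and `Frag RP-Cnt' fields described later in this
   section: the RP-Set for a group prefix is replaced only once
   `RP-Count' addresses have been collected from fragments carrying
   the same tag. The class and attribute names (`PrefixFragment',
   `RPSetAssembler', `rp_set') are hypothetical, and the choice to
   drop partial state when a fragment with a different tag arrives is
   an assumption of this sketch, not a rule stated in the text.

```python
from dataclasses import dataclass


@dataclass
class PrefixFragment:
    """Part of one fragment that concerns a single group prefix."""
    group_prefix: str   # e.g. "224.0.0.0/4"
    rp_count: int       # RP-Count: C-RPs for this prefix in the whole message
    rps: list           # the Frag RP-Cnt entries carried here: (rp, holdtime)


class RPSetAssembler:
    def __init__(self):
        self.rp_set = {}      # committed RP-Set: prefix -> [(rp, holdtime)]
        self._tag = None      # Fragment Tag of the message being assembled
        self._pending = {}    # prefix -> entries collected so far

    def receive(self, fragment_tag, prefix_fragments):
        if fragment_tag != self._tag:
            # Assumption: a new tag means a new Bootstrap message, so any
            # partially received prefixes are discarded without an update.
            self._tag = fragment_tag
            self._pending = {}
        for pf in prefix_fragments:
            collected = self._pending.setdefault(pf.group_prefix, [])
            collected.extend(pf.rps)
            if len(collected) == pf.rp_count:
                # All RP-Count addresses arrived: replace the old RP-Set.
                self.rp_set[pf.group_prefix] = list(collected)
                del self._pending[pf.group_prefix]


asm = RPSetAssembler()
asm.rp_set["224.0.0.0/4"] = [("192.0.2.1", 150)]
asm.receive(7, [PrefixFragment("224.0.0.0/4", 3,
                               [("192.0.2.2", 150), ("192.0.2.3", 150)])])
partial_view = list(asm.rp_set["224.0.0.0/4"])   # still the old RP-Set
asm.receive(7, [PrefixFragment("224.0.0.0/4", 3, [("192.0.2.4", 150)])])
```

   After the second call all three addresses have been received, so
   the old single-entry RP-Set for 224.0.0.0/4 is replaced.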

   The semantics of a single `fragment' is given below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  | Addr length   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Fragment Tag          | Hash Mask len | BSR-priority  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Unicast-BSR-Address                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Encoded-Group Address-1                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  RP-Count-1   | Frag RP-Cnt-1 |         Reserved              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Unicast-RP-Address-1                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          RP1-Holdtime         |        Unicast- . . .         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | . . . RP-Address-2            |          RP2-Holdtime         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Unicast-RP-Address-m                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          RPm-Holdtime         |        Encoded- . . .         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | . . . Group Address-2 . . .                                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Encoded-Group Address-n                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  RP-Count-m   | Frag RP-Cnt-m |         Reserved              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Unicast-RP-Address-1                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          RP1-Holdtime         |        Unicast- . . .         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | . . . RP-Address-2            |          RP2-Holdtime         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               .                               |
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Unicast-RP-Address-m                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          RPm-Holdtime         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   PIM Version, Type, Addr length, Checksum
      Described above.

   Fragment Tag
      A randomly generated number that distinguishes the fragments
      belonging to different Bootstrap messages; fragments belonging
      to the same Bootstrap message carry the same `Fragment Tag'.

   Hash Mask len
      The length (in bits) of the mask to use in the hash function.
      For IPv4 we recommend a value of 30. For IPv6 we recommend a
      value of 126.

   BSR-priority
      Contains the BSR priority value of the included BSR. This field
      is considered as a high order byte when comparing BSR
      addresses.

   Unicast-BSR-Address
      The IP address of the bootstrap router for the domain. The
      length of this field in bytes is specified in Addr length.

   Encoded-Group Address-1..n
      The group prefix (address and mask) with which the Candidate
      RPs are associated. Format previously described.

   RP-Count-1..n
      The number of Candidate RP addresses included in the whole
      Bootstrap message for the corresponding group prefix. A router
      does not replace its old RP-Set for a given group prefix
      until/unless it receives `RP-Count' addresses for that prefix;
      the addresses could be carried over several fragments. If only
      part of the RP-Set for a given group prefix was received, the
      router discards it, without updating that specific group
      prefix's RP-Set.

   Frag RP-Cnt-1..m
      The number of Candidate RP addresses included in this fragment
      of the Bootstrap message, for the corresponding group prefix.
      The `Frag RP-Cnt' field facilitates parsing of the RP-Set for a
      given group prefix, when carried over more than one fragment.

   Unicast-RP-address-1..m
      The address of the Candidate RPs, for the corresponding group
      prefix. The length of this field in bytes is specified in Addr
      length.

   RP1..m-Holdtime
      The Holdtime for the corresponding RP. This field is copied
      from the `Holdtime' field of the associated RP stored at the
      BSR.

4.7 Assert Message

   The Assert message is sent when a multicast data packet is
   received on an outgoing interface corresponding to the (S,G) or
   (*,G) associated with the source.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  | Addr length   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Encoded-Group Address                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Unicast-Source Address                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |R|                      Metric Preference                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Metric                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   PIM Version, Type, Addr length, Checksum
      Described above.

   Encoded-Group Address
      The group address to which the data packet was addressed, and
      which triggered the Assert. Format previously described.

   Unicast-Source Address
      Source IP address from IP multicast datagram that triggered the
      Assert packet to be sent. The length of this field in bytes is
      specified in Addr length.

   R  RPT-bit is a 1 bit value.
      If the IP multicast datagram that triggered the Assert packet
      is routed down the RP tree, then the RPT-bit is 1; if the IP
      multicast datagram is routed down the SPT, it is 0.

   Metric Preference
      Preference value assigned to the unicast routing protocol that
      provided the route to Host address.

   Metric
      The unicast routing table metric. The metric is in units
      applicable to the unicast routing protocol used.

4.8 Graft Message

   Used in dense-mode. Refer to PIM dense mode specification.

4.9 Graft-Ack Message

   Used in dense-mode. Refer to PIM dense mode specification.

4.10 Candidate-RP-Advertisement

   Candidate-RP-Advertisements are periodically unicast from the
   C-RPs to the BSR.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  | Addr length   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Prefix-Cnt   |A|  Reserved   |           Holdtime            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Unicast-RP-Address                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Encoded-Group Address-1                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               .                               |
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Encoded-Group Address-n                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   PIM Version, Type, Addr length, Checksum
      Described above.

   Prefix-Cnt
      The number of encoded group addresses included in the message,
      indicating the group prefixes for which the C-RP is
      advertising. A Prefix-Cnt of `0' implies a prefix of 224.0.0.0
      with mask length of 4; i.e. all multicast groups.
      If the C-RP is not configured with Group-prefix information,
      the C-RP puts a default value of `0' in this field.

   A  The Authoritative bit. This bit indicates that the BSR should
      not override the group-prefix information indicated in the C-RP
      Advertisement. In most cases C-RPs set this bit to 0.

   Holdtime
      The amount of time the advertisement is valid. This field
      allows advertisements to be aged out.

   Unicast-RP-Address
      The address of the interface to advertise as a Candidate RP.
      The length of this field in bytes is specified in Addr length.

   Encoded-Group Address-1..n
      The group prefixes for which the C-RP is advertising. Format
      previously described.

5 Acknowledgments

   Tony Ballardie, Scott Brim, Jon Crowcroft, Bill Fenner, Paul
   Francis, Joel Halpern, Horst Hodel, Polly Huang, Stephen
   Ostrowski, Lixia Zhang and Girish Chandranmenon provided detailed
   comments on previous drafts. The authors of CBT [7] and membership
   of the IDMR WG provided many of the motivating ideas for this work
   and useful feedback on design details.

   This work was supported by the National Science Foundation, ARPA,
   cisco Systems and Sun Microsystems.

6 Appendices

6.1 Appendix I: Major Changes and Updates to the Spec

   This appendix lists the major changes in the specification
   document as compared to `draft-ietf-idmr-pim-spec-01.ps,txt'.

   * Major Changes

   List of changes since March '96 IETF:

      (*,*,RP) Joins state and data forwarding check; replaces
      (*,G-Prefix) Joins state for interoperability. (*,G) negative
      cache introduced for the (*,*,RP) state supporting mechanisms.

      Semantic fragmentation for the Bootstrap message.

      Refinement of Assert details.

      Addition and refinement of Join/Prune suppression and Register
      suppression (introduction of null Registers).

      Editorial changes and clarifications to the timers section.

      Addition of Appendix II (BSR Election and RP-Set Distribution),
      and Appendix III (Glossary of Terms).

      Addition of table of contents.

   List of changes incurred since version 1 of the spec.:

      Proposal and refinement of bootstrap router (BSR) election
      mechanisms

      Introduction of hash functions for Group to RP mapping

      New RP-liveness indication mechanisms based upon the Bootstrap
      Router (BSR) and the Bootstrap messages.

      Removal of reachability messages, RP reports and multiple RPs
      per group.

   * Packet Format Changes

   Packet formats were updated to accommodate different address
   lengths, and address aggregation.

   1  The `Addr length' field was added to the PIM fixed header to
      specify the address length in bytes of the underlying protocol,
      see section 4.

   2  The Encoded source and group address formats were introduced,
      with the use of a `Mask length' field to allow aggregation,
      section 4.1.

   3  Packet formats are no longer IGMP messages; rather PIM
      messages.

   PIM message types and formats were also modified:

   [Note: most changes were made to the May 95 version, unless
   otherwise specified].

   1  Obsolete messages:

      Register-Ack [Feb. 96]

      Poll and Poll Response [Feb. 96]

      RP-Reachability [Feb. 96]

      RPlist-Mapping [Feb. 96]

   2  New messages:

      Candidate-RP-Advertisement [change made in October 95]
      RP-Set [Feb. 96]

   3  Modified messages:

      Join/Prune [Feb. 96]
      Register [Feb. 96]
      Register-Stop [Feb. 96]
      Hello (addition of OptionTypes) [Aug 96]

   4  Renamed messages:

      Query messages are renamed as Hello messages [Aug. 96]
      RP-Set messages are renamed as Bootstrap messages [Aug.
      96]

6.2 Appendix II: BSR Election and RP-Set Distribution

   For simplicity, the Bootstrap message is used in both the BSR
   election and the RP-Set distribution.

   The two mechanisms above, the BSR election and the RP-Set
   distribution, are realized through the following state machine,
   illustrated in figure 4:

      [Figures are present only in the postscript version]
      Fig. 4 State Diagram for the BSR election and RP-Set
      distribution mechanisms

   The protocol transitions for a C-BSR are given in state diagram
   (a). For routers not configured as C-BSRs, the protocol
   transitions are given in state diagram (b).

   Each PIM router keeps a Bootstrap-timer, initialized to
   [Bootstrap-Timeout], in addition to a local BSR field `LclBSR'
   (initialized to a local address if C-BSR, or to 0 otherwise), and
   a local RP-Set `LclRP-Set' (initially empty). The two main stimuli
   to the state machine are the timer events, and receiving a
   Bootstrap message:

   * Initial States and Timer Events

   1  If the router is a C-BSR:

      1  The router operates initially in the `CandBSR' state, where
         it does not originate any Bootstrap messages.

      2  If the Bootstrap-timer expires, and the current state is
         `CandBSR', the router originates a Bootstrap message
         (carrying the local RP-Set, and its own BSR priority and
         address), restarts the Bootstrap-timer at [Bootstrap-Period]
         seconds and transits into the `ElectedBSR' state.

      3  If the Bootstrap-timer expires, and the current state is
         `ElectedBSR', the router originates a Bootstrap message,
         and restarts the RP-Set timer at [Bootstrap-Period]. No
         state transition is incurred.

      This way, the elected BSR originates periodic Bootstrap
      messages every [Bootstrap-Period].

   2  If a router is not a C-BSR:

      1  The router operates initially in the `AxptAny' state.
         In this state, a router accepts the first Bootstrap message
         from the RPF neighbor toward the included BSR. The Reverse
         Path Forwarding (RPF) neighbor in this case is the next hop
         router en route to the included BSR.

      2  If the Bootstrap-timer expires, and the current state is
         `AxptPref' (where the router accepts only preferred
         Bootstrap messages from the RPF neighbor toward the included
         BSR), the router transits into the `AxptAny' state.
         Preferred Bootstrap messages are those that carry a
         BSR-priority and address higher than, or equal to, `LclBSR'.

      In this case, if an elected BSR becomes unreachable, the
      routers start accepting Bootstrap messages from another C-BSR
      after the Bootstrap-timer expires. All PIM routers within a
      domain converge on the preferred (with highest priority and
      address) reachable C-BSR.

   * Receiving Bootstrap Message

   To avoid loops, an RPF check is performed on the included BSR
   address. Upon receiving a Bootstrap message from the RPF neighbor
   toward the included BSR, the following actions are taken:

   1  If the router is not a C-BSR:

      1  If the current state is `AxptAny', the router accepts the
         Bootstrap message, and transits into the `AxptPref' state.

      2  If the current state is `AxptPref', and the Bootstrap
         message is preferred, the message is accepted. No state
         transition is incurred.

   2  If the router is a C-BSR, and the Bootstrap message is
      preferred, the message is accepted. Further, if this happens
      when the current state is

   When a Bootstrap message is accepted, the router restarts the
   Bootstrap-timer at [Bootstrap-Timeout], stores the received BSR
   priority and address in `LclBSR', and the received RP-Set in
   `LclRP-Set', and forwards the Bootstrap message out all interfaces
   except the receiving interface.
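
   The transitions described above for a router that is not a C-BSR
   (state diagram (b)) can be sketched in Python as follows. This is
   an illustrative sketch only: the class `NonCandidateRouter', its
   method names, and the representation of `LclBSR' as a
   (priority, address) tuple of integers are assumptions, and
   forwarding of accepted messages is left to the caller. Tuple
   comparison is used so that the BSR-priority acts as the high order
   part of the comparison, as described for the BSR-priority field.

```python
from dataclasses import dataclass

BOOTSTRAP_TIMEOUT = 150   # seconds; the [Bootstrap-Timeout] default


@dataclass
class NonCandidateRouter:
    state: str = "AxptAny"          # initial state for a non-C-BSR router
    lcl_bsr: tuple = (0, 0)         # LclBSR as (BSR-priority, BSR address)
    lcl_rp_set: tuple = ()          # LclRP-Set, initially empty
    timer: int = BOOTSTRAP_TIMEOUT  # Bootstrap-timer, in seconds remaining

    def on_bootstrap(self, priority, address, rp_set,
                     from_rpf_neighbor=True):
        """Return True if the message is accepted (caller then forwards it)."""
        if not from_rpf_neighbor:
            return False            # RPF check on the included BSR address
        candidate = (priority, address)
        if self.state == "AxptPref" and candidate < self.lcl_bsr:
            return False            # not preferred: rejected, no transition
        # Accepted: AxptAny moves to AxptPref, AxptPref stays where it is.
        self.state = "AxptPref"
        self.lcl_bsr = candidate
        self.lcl_rp_set = tuple(rp_set)
        self.timer = BOOTSTRAP_TIMEOUT
        return True

    def on_timer_expired(self):
        if self.state == "AxptPref":
            self.state = "AxptAny"  # accept any BSR again


router = NonCandidateRouter()
first = router.on_bootstrap(10, 5, ["rp-a"])    # accepted in AxptAny
lower = router.on_bootstrap(9, 200, ["rp-b"])   # lower priority: rejected
```

   After the timer expires the router is back in `AxptAny' and will
   accept the next Bootstrap message from its RPF neighbor, even one
   carrying a lower BSR priority.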

   If a Bootstrap message is rejected, no state transitions are
   triggered.

6.3 Appendix III: Glossary of Terms

   Following is an alphabetized list of terms and definitions used
   throughout this specification.

   * {Bootstrap router (BSR)}. A BSR is a dynamically elected router
     within a PIM domain. It is responsible for constructing the
     RP-Set and originating Bootstrap messages.

   * {Candidate-BSR (C-BSR)}. A C-BSR is a router configured to
     participate in the BSR election and act as the BSR if elected.

   * {Candidate RP (C-RP)}. A C-RP is a router configured to send
     periodic Candidate-RP-Advertisement messages to the BSR, and act
     as an RP when it receives Join/Prune or Register messages for
     the advertised group prefix.

   * {Designated Router (DR)}. The DR sets up multicast route entries
     and sends corresponding Join/Prune and Register messages on
     behalf of directly-connected receivers and sources,
     respectively. The DR may or may not be the same router as the
     IGMP Querier. The DR may or may not be the long-term, last-hop
     router for the group; a router on the LAN that has a lower
     metric route to the data source, or to the group's RP, may take
     over the role of sending Join/Prune messages.

   * {Incoming interface (iif)}. The iif of a multicast route entry
     indicates the interface from which multicast data packets are
     accepted for forwarding. The iif is initialized when the entry
     is created.

   * {Join list}. The Join list is one of two lists of addresses that
     is included in a Join/Prune message; each address refers to a
     source or RP. It indicates those sources or RPs to which
     downstream receiver(s) wish to join.

   * {Last-hop router}. The last-hop router is the last router to
     receive multicast data packets before they are delivered to
     directly-connected member hosts. In general the last-hop router
     is the DR for the LAN.
        However, under various conditions described in this document,
        a parallel router connected to the same LAN may take over as
        the last-hop router in place of the DR.

   *    {Outgoing interface (oif) list}. Each multicast route entry has
        an oif list containing the outgoing interfaces to which
        multicast packets should be forwarded.

   *    {Prune list}. The Prune list is the second list of addresses that
        is included in a Join/Prune message. It indicates those sources
        or RPs from which downstream receiver(s) wish to prune.

   *    {PIM Multicast Border Router (PMBR)}. A PMBR connects a PIM
        domain to other multicast routing domain(s).

   *    {Rendezvous Point (RP)}. Each multicast group has a shared-tree
        via which receivers hear of new sources and new receivers hear
        of all sources. The RP is the root of this per-group shared
        tree, called the RP-Tree.

   *    {RP-Set}. The RP-Set is a set of RP addresses constructed by
        the BSR based on Candidate-RP advertisements received. The
        RP-Set information is distributed to all PIM routers in the
        BSR's PIM domain.

   *    {Reverse Path Forwarding (RPF)}. RPF is used to select the
        appropriate incoming interface for a multicast route entry. The
        RPF neighbor for an IP address X is the next-hop router used
        to forward packets toward X. The RPF interface is the interface
        to that RPF neighbor. In the common case this is the next hop
        used by the unicast routing protocol for sending unicast packets
        toward X. It can be different, for example, in cases where
        unicast and multicast routes are not congruent.

   *    {Route entry}. A multicast route entry is state maintained in a
        router along the distribution tree and is created and updated
        based on incoming control messages. The route entry may be
        different from the forwarding entry; the latter is used to
        forward data packets in real time.
        Typically a forwarding entry is not created until data packets
        arrive; the forwarding entry's iif and oif list are copied from
        the route entry; and the forwarding entry may be flushed and
        recreated at will.

   *    {Shortest path tree (SPT)}. The SPT is the multicast
        distribution tree created by the merger of all of the shortest
        paths that connect receivers to the source (as determined by
        unicast routing).

   *    {Sparse Mode (SM)}. SM is one mode of operation of a multicast
        protocol. PIM SM uses explicit Join/Prune messages and
        Rendezvous Points in place of Dense Mode PIM's and DVMRP's
        broadcast and prune mechanism.

   *    {Wildcard (WC) multicast route entry}. Wildcard multicast route
        entries are those entries that may be used to forward packets
        for any source sending to the specified group. Wildcard bits in
        the join list of a Join/Prune message represent either a (*,G)
        or (*,*,RP) join; in the prune list they represent a (*,G)
        prune.

   *    {(S,G) route entry}. (S,G) is a source-specific route entry. It
        may be created in response to data packets, Join/Prune messages,
        or Asserts. The (S,G) state in routers creates a source-rooted,
        shortest path (or reverse shortest path) distribution tree.
        (S,G)RPT bit entries are source-specific entries on the shared
        RP-Tree; these entries are used to prune particular sources off
        of the shared tree.

   *    {(*,G) route entry}. Group members join the shared RP-Tree for
        a particular group. This tree is represented by (*,G) multicast
        route entries along the shortest path branches between the RP
        and the group members.

   *    {(*,*,RP) route entry}. (*,*,RP) refers to any source and any
        multicast group that maps to the RP included in the entry.
        The routers along the shortest path branches between a domain's
        RP(s) and its PMBRs keep (*,*,RP) state and use it to determine
        how to deliver packets toward the PMBRs if data packets arrive
        for which there is not a longer match. The wildcard group in the
        (*,*,RP) route entry is represented by a group address of
        224.0.0.0 and a mask length of 4 bits.

References

   1. S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, L. Wei,
      P. Sharma, and A. Helmy. Protocol independent multicast (PIM):
      Motivation and architecture. Internet Draft, May 1995.

   2. S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and
      L. Wei. The PIM architecture for wide-area multicast routing.
      ACM Transactions on Networks, April 1996.

   3. D. Estrin, D. Farinacci, V. Jacobson, C. Liu, L. Wei, P. Sharma,
      and A. Helmy. Protocol independent multicast-dense mode (PIM-DM):
      Protocol specification. Internet Draft, November 1995.

   4. S. Deering. Host extensions for IP multicasting. RFC 1112,
      August 1989.

   5. W. Fenner. Internet group management protocol, version 2.
      Internet Draft, May 1996.

   6. R. Atkinson. Security architecture for the internet protocol.
      RFC 1825, August 1995.

   7. A.J. Ballardie, P.F. Francis, and J. Crowcroft. Core based trees.
      In Proceedings of the ACM SIGCOMM, San Francisco, 1993.

Addresses of Authors:

   Deborah Estrin
   Computer Science Dept/ISI
   University of Southern Calif.
   Los Angeles, CA 90089
   estrin@usc.edu

   Dino Farinacci
   Cisco Systems Inc.
   170 West Tasman Drive,
   San Jose, CA 95134
   dino@cisco.com

   Ahmed Helmy
   Computer Science Dept.
   University of Southern Calif.
   Los Angeles, CA 90089
   ahelmy@catarina.usc.edu

   David Thaler
   EECS Department
   University of Michigan
   Ann Arbor, MI 48109
   thalerd@eecs.umich.edu

   Stephen Deering
   Xerox PARC
   3333 Coyote Hill Road
   Palo Alto, CA 94304
   deering@parc.xerox.com

   Mark Handley
   Department of Computer Science
   University College London
   Gower Street
   London, WC1E 6BT
   UK
   m.handley@cs.ucl.ac.uk

   Van Jacobson
   Lawrence Berkeley Laboratory
   1 Cyclotron Road
   Berkeley, CA 94720
   van@ee.lbl.gov

   Ching-gung Liu
   Computer Science Dept.
   University of Southern Calif.
   Los Angeles, CA 90089
   charley@catarina.usc.edu

   Puneet Sharma
   Computer Science Dept.
   University of Southern Calif.
   Los Angeles, CA 90089
   puneet@catarina.usc.edu

   Liming Wei
   Cisco Systems Inc.
   170 West Tasman Drive,
   San Jose, CA 95134
   lwei@cisco.com