Network Working Group                                          B. Fenner
Internet Draft                                      AT&T Labs - Research
Intended Status: Draft Standard                               M. Handley
Expires: January 8, 2012                                             UCL
                                                             H. Holbrook
                                                                 Arastra
                                                             I. Kouvelas
                                                               R. Parekh
                                                     Cisco Systems, Inc.
                                                                Z. Zhang
                                                        Juniper Networks
                                                                L. Zheng
                                                     Huawei Technologies
                                                            July 7, 2011

         Protocol Independent Multicast - Sparse Mode (PIM-SM):
                    Protocol Specification (Revised)

                     draft-parekh-pim-rfc4601bis-01

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 25, 2011.

Abstract

   This document specifies Protocol Independent Multicast - Sparse Mode
   (PIM-SM). PIM-SM is a multicast routing protocol that can use the
   underlying unicast routing information base or a separate multicast-
   capable routing information base. It builds unidirectional shared
   trees rooted at a Rendezvous Point (RP) per group, and optionally
   creates shortest-path trees per source.

   This document addresses errata filed against RFC 4601, and removes
   the optional (*,*,RP) feature that lacks sufficient deployment
   experience.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
      2.1. Definitions
      2.2. Pseudocode Notation
   3. PIM-SM Protocol Overview
      3.1. Phase One: RP Tree
      3.2. Phase Two: Register-Stop
      3.3. Phase Three: Shortest-Path Tree
      3.4. Source-Specific Joins
      3.5. Source-Specific Prunes
      3.6. Multi-Access Transit LANs
      3.7. RP Discovery
   4. Protocol Specification
      4.1. PIM Protocol State
         4.1.1. General Purpose State
         4.1.2. (*,G) State
         4.1.3. (S,G) State
         4.1.4. (S,G,rpt) State
         4.1.5. State Summarization Macros
      4.2. Data Packet Forwarding Rules
         4.2.1. Last-Hop Switchover to the SPT
         4.2.2. Setting and Clearing the (S,G) SPTbit
      4.3. Designated Routers (DR) and Hello Messages
         4.3.1. Sending Hello Messages
         4.3.2. DR Election
         4.3.3. Reducing Prune Propagation Delay on LANs
         4.3.4. Maintaining Secondary Address Lists
      4.4. PIM Register Messages
         4.4.1. Sending Register Messages from the DR
         4.4.2. Receiving Register Messages at the RP
      4.5. PIM Join/Prune Messages
         4.5.1. Receiving (*,G) Join/Prune Messages
         4.5.2. Receiving (S,G) Join/Prune Messages
         4.5.3. Receiving (S,G,rpt) Join/Prune Messages
         4.5.4. Sending (*,G) Join/Prune Messages
         4.5.5. Sending (S,G) Join/Prune Messages
         4.5.6. (S,G,rpt) Periodic Messages
         4.5.7. State Machine for (S,G,rpt) Triggered Messages
      4.6. PIM Assert Messages
         4.6.1. (S,G) Assert Message State Machine
         4.6.2. (*,G) Assert Message State Machine
         4.6.3. Assert Metrics
         4.6.4. AssertCancel Messages
         4.6.5. Assert State Macros
      4.7. PIM Bootstrap and RP Discovery
         4.7.1. Group-to-RP Mapping
         4.7.2. Hash Function
      4.8. Source-Specific Multicast
         4.8.1. Protocol Modifications for SSM Destination Addresses
         4.8.2. PIM-SSM-Only Routers
      4.9. PIM Packet Formats
         4.9.1. Encoded Source and Group Address Formats
         4.9.2. Hello Message Format
         4.9.3. Register Message Format
         4.9.4. Register-Stop Message Format
         4.9.5. Join/Prune Message Format
            4.9.5.1. Group Set Source List Rules
            4.9.5.2. Group Set Fragmentation
         4.9.6. Assert Message Format
      4.10. PIM Timers
      4.11. Timer Values
   5. IANA Considerations
      5.1. PIM Address Family
      5.2. PIM Hello Options
   6. Security Considerations
      6.1. Attacks Based on Forged Messages
         6.1.1. Forged Link-Local Messages
         6.1.2. Forged Unicast Messages
      6.2. Non-Cryptographic Authentication Mechanisms
      6.3. Authentication Using IPsec
         6.3.1. Protecting Link-Local Multicast Messages
         6.3.2. Protecting Unicast Messages
            6.3.2.1. Register Messages
            6.3.2.2. Register-Stop Messages
      6.4. Denial-of-Service Attacks
   7. Acknowledgements
   8. Normative References
   9. Informative References
   Authors' Addresses

List of Figures

   Figure 1. Per-(S,G) register state machine at a DR
   Figure 2. Downstream per-interface (*,G) state machine
   Figure 3. Downstream per-interface (S,G) state machine
   Figure 4. Downstream per-interface (S,G,rpt) state machine
   Figure 5. Upstream (*,G) state machine
   Figure 6. Upstream (S,G) state machine
   Figure 7. Upstream (S,G,rpt) state machine for triggered messages
   Figure 8. Per-interface (S,G) Assert State machine
   Figure 9. Per-interface (*,G) Assert State machine

1. Introduction

   This document specifies a protocol for efficiently routing multicast
   groups that may span wide-area (and inter-domain) internets. This
   protocol is called Protocol Independent Multicast - Sparse Mode
   (PIM-SM) because, although it may use the underlying unicast routing
   to provide reverse-path information for multicast tree building, it
   is not dependent on any particular unicast routing protocol.

   PIM-SM version 2 was specified in RFC 4601 as a Proposed Standard.
   This document is intended to address the reported errata and to
   remove the optional (*,*,RP) feature that lacks sufficient deployment
   experience, to advance PIM-SM to Draft Standard.

   This document specifies the same protocol as RFC 4601 and
   implementations per the specification in this document will be able
   to interoperate successfully with implementations per RFC 4601.

2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

2.1. Definitions

   The following terms have special significance for PIM-SM:

   Rendezvous Point (RP):
         An RP is a router that has been configured to be used as the
         root of the non-source-specific distribution tree for a
         multicast group.
         Join messages from receivers for a group are
         sent towards the RP, and data from senders is sent to the RP so
         that receivers can discover who the senders are and start to
         receive traffic destined for the group.

   Designated Router (DR):
         A shared-media LAN like Ethernet may have multiple PIM-SM
         routers connected to it. A single one of these routers, the
         DR, will act on behalf of directly connected hosts with respect
         to the PIM-SM protocol. A single DR is elected per interface
         (LAN or otherwise) using a simple election process.

   MRIB  Multicast Routing Information Base. This is the multicast
         topology table, which is typically derived from the unicast
         routing table, or routing protocols such as Multiprotocol BGP
         (MBGP) that carry multicast-specific topology information. In
         PIM-SM, the MRIB is used to decide where to send Join/Prune
         messages. A secondary function of the MRIB is to provide
         routing metrics for destination addresses; these metrics are
         used when sending and processing Assert messages.

   RPF Neighbor
         RPF stands for "Reverse Path Forwarding". The RPF Neighbor of
         a router with respect to an address is the neighbor that the
         MRIB indicates should be used to forward packets to that
         address. In the case of a PIM-SM multicast group, the RPF
         neighbor is the router that a Join message for that group would
         be directed to, in the absence of modifying Assert state.

   TIB   Tree Information Base. This is the collection of state at a
         PIM router that has been created by receiving PIM Join/Prune
         messages, PIM Assert messages, and Internet Group Management
         Protocol (IGMP) or Multicast Listener Discovery (MLD)
         information from local hosts. It essentially stores the state
         of all multicast distribution trees at that router.

   MFIB  Multicast Forwarding Information Base. The TIB holds all the
         state that is necessary to forward multicast packets at a
         router. However, although this specification defines forwarding
         in terms of the TIB, to actually forward packets using the TIB
         is very inefficient. Instead, a real router implementation
         will normally build an efficient MFIB from the TIB state to
         perform forwarding. How this is done is implementation-specific
         and is not discussed in this document.

   Upstream
         Towards the root of the tree. The root of the tree may be
         either the source or the RP, depending on the context.

   Downstream
         Away from the root of the tree.

   GenID Generation Identifier, used to detect reboots.

2.2. Pseudocode Notation

   We use set notation in several places in this specification.

   A (+) B is the union of two sets, A and B.

   A (-) B is the elements of set A that are not in set B.

   NULL is the empty set or list.

   In addition, we use C-like syntax:

   = denotes assignment of a variable.

   == denotes a comparison for equality.

   != denotes a comparison for inequality.

   Braces { and } are used for grouping.

   Unless otherwise noted, operations specified by statements having
   multiple (+) and (-) operators should be evaluated from left to
   right, i.e., A (+) B (-) C is the set resulting from the union of
   sets A and B minus elements in set C.

3. PIM-SM Protocol Overview

   This section provides an overview of PIM-SM behavior. It is intended
   as an introduction to how PIM-SM works, and it is NOT definitive.
   For the definitive specification, see Section 4.

   PIM relies on an underlying topology-gathering protocol to populate a
   routing table with routes. This routing table is called the
   Multicast Routing Information Base (MRIB). The routes in this table
   may be taken directly from the unicast routing table, or they may be
   different and provided by a separate routing protocol such as MBGP
   [10].
   Regardless of how it is created, the primary role of the MRIB
   in the PIM protocol is to provide the next-hop router along a
   multicast-capable path to each destination subnet. The MRIB is used
   to determine the next-hop neighbor to which any PIM Join/Prune
   message is sent. Data flows along the reverse path of the Join
   messages. Thus, in contrast to the unicast RIB, which specifies the
   next hop that a data packet would take to get to some subnet, the
   MRIB gives reverse-path information and indicates the path that a
   multicast data packet would take from its origin subnet to the router
   that has the MRIB.

   Like all multicast routing protocols that implement the service model
   from RFC 1112 [3], PIM-SM must be able to route data packets from
   sources to receivers without either the sources or receivers knowing
   a priori of the existence of the others. This is essentially done in
   three phases, although as senders and receivers may come and go at
   any time, all three phases may occur simultaneously.

3.1. Phase One: RP Tree

   In phase one, a multicast receiver expresses its interest in
   receiving traffic destined for a multicast group. Typically, it does
   this using IGMP [2] or MLD [4], but other mechanisms might also serve
   this purpose. One of the receiver's local routers is elected as the
   Designated Router (DR) for that subnet. On receiving the receiver's
   expression of interest, the DR then sends a PIM Join message towards
   the RP for that multicast group. This Join message is known as a
   (*,G) Join because it joins group G for all sources to that group.
   The (*,G) Join travels hop-by-hop towards the RP for the group, and
   in each router it passes through, multicast tree state for group G is
   instantiated. Eventually, the (*,G) Join either reaches the RP or
   reaches a router that already has (*,G) Join state for that group.
   When many receivers join the group, their Join messages converge on
   the RP and form a distribution tree for group G that is rooted at the
   RP. This is known as the RP Tree (RPT), and is also known as the
   shared tree because it is shared by all sources sending to that
   group. Join messages are resent periodically so long as the receiver
   remains in the group. When all receivers on a leaf-network leave the
   group, the DR will send a PIM (*,G) Prune message towards the RP for
   that multicast group. However, if the Prune message is not sent for
   any reason, the state will eventually time out.

   A multicast data sender just starts sending data destined for a
   multicast group. The sender's local router (DR) takes those data
   packets, unicast-encapsulates them, and sends them directly to the
   RP. The RP receives these encapsulated data packets, decapsulates
   them, and forwards them onto the shared tree. The packets then
   follow the (*,G) multicast tree state in the routers on the RP Tree,
   being replicated wherever the RP Tree branches, and eventually
   reaching all the receivers for that multicast group. The process of
   encapsulating data packets to the RP is called registering, and the
   encapsulation packets are known as PIM Register packets.

   At the end of phase one, multicast traffic is flowing encapsulated to
   the RP, and then natively over the RP tree to the multicast
   receivers.

3.2. Phase Two: Register-Stop

   Register-encapsulation of data packets is inefficient for two
   reasons:

   o Encapsulation and decapsulation may be relatively expensive
     operations for a router to perform, depending on whether or not the
     router has appropriate hardware for these tasks.

   o Traveling all the way to the RP, and then back down the shared tree
     may result in the packets traveling a relatively long distance to
     reach receivers that are close to the sender. For some
     applications, this increased latency or bandwidth consumption is
     undesirable.

   Although Register-encapsulation may continue indefinitely, for these
   reasons, the RP will normally choose to switch to native forwarding.
   To do this, when the RP receives a register-encapsulated data packet
   from source S on group G, it will normally initiate an (S,G) source-
   specific Join towards S. This Join message travels hop-by-hop
   towards S, instantiating (S,G) multicast tree state in the routers
   along the path. (S,G) multicast tree state is used only to forward
   packets for group G if those packets come from source S. Eventually
   the Join message reaches S's subnet or a router that already has
   (S,G) multicast tree state, and then packets from S start to flow
   following the (S,G) tree state towards the RP. These data packets
   may also reach routers with (*,G) state along the path towards the
   RP; if they do, they can shortcut onto the RP tree at this point.

   While the RP is in the process of joining the source-specific tree
   for S, the data packets will continue being encapsulated to the RP.
   When packets from S also start to arrive natively at the RP, the RP
   will be receiving two copies of each of these packets. At this
   point, the RP starts to discard the encapsulated copy of these
   packets, and it sends a Register-Stop message back to S's DR to
   prevent the DR from unnecessarily encapsulating the packets.

   At the end of phase two, traffic will be flowing natively from S
   along a source-specific tree to the RP, and from there along the
   shared tree to the receivers. Where the two trees intersect, traffic
   may transfer from the source-specific tree to the RP tree and thus
   avoid taking a long detour via the RP.

   Note that a sender may start sending before or after a receiver joins
   the group, and thus phase two may happen before the shared tree to
   the receiver is built.
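
   As an informal illustration of the phase two behavior described
   above, the following Python sketch models how an RP might react to
   Register packets and to natively arriving packets. It is not part
   of the protocol specification: the class name RendezvousPoint, its
   method names, and the returned action strings are hypothetical and
   chosen only for this example. The normative Register rules are
   those of Section 4.4.

```python
from dataclasses import dataclass, field


@dataclass
class RendezvousPoint:
    """Toy model of the RP in phase two (illustrative names only)."""

    # (S,G) pairs for which an (S,G) Join has been sent towards S.
    joined: set = field(default_factory=set)
    # (S,G) pairs whose packets have started to arrive natively.
    native_seen: set = field(default_factory=set)
    # Control messages the RP would emit, recorded for inspection.
    log: list = field(default_factory=list)

    def on_register(self, source, group, dr):
        """Handle a register-encapsulated data packet sent by a DR."""
        key = (source, group)
        if key in self.native_seen:
            # Two copies are arriving: discard the encapsulated one
            # and ask the DR to stop encapsulating.
            self.log.append(("register-stop", dr, source, group))
            return "discard"
        if key not in self.joined:
            # First Register for (S,G): start joining the
            # source-specific tree towards S.
            self.joined.add(key)
            self.log.append(("join", source, group))
        # Until native packets arrive, keep decapsulating and
        # forwarding onto the shared tree.
        return "forward-on-shared-tree"

    def on_native_packet(self, source, group):
        """Handle a packet from S that arrived along (S,G) state."""
        self.native_seen.add((source, group))
        return "forward-on-shared-tree"
```

   In this sketch, the first Register for a given (S,G) triggers a
   Join, later Registers are still forwarded while the Join propagates,
   and the first Register seen after native traffic has started yields
   a Register-Stop to the DR.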

3.3. Phase Three: Shortest-Path Tree

   Although having the RP join back towards the source removes the
   encapsulation overhead, it does not completely optimize the
   forwarding paths. For many receivers, the route via the RP may
   involve a significant detour when compared with the shortest path
   from the source to the receiver.

   To obtain lower latencies or more efficient bandwidth utilization, a
   router on the receiver's LAN, typically the DR, may optionally
   initiate a transfer from the shared tree to a source-specific
   shortest-path tree (SPT). To do this, it issues an (S,G) Join
   towards S. This instantiates state in the routers along the path to
   S. Eventually, this join either reaches S's subnet or reaches a
   router that already has (S,G) state. When this happens, data packets
   from S start to flow following the (S,G) state until they reach the
   receiver.

   At this point, the receiver (or a router upstream of the receiver)
   will be receiving two copies of the data: one from the SPT and one
   from the RPT. When the first traffic starts to arrive from the SPT,
   the DR or upstream router starts to drop the packets for G from S
   that arrive via the RP tree. In addition, it sends an (S,G) Prune
   message towards the RP. This is known as an (S,G,rpt) Prune. The
   Prune message travels hop-by-hop, instantiating state along the path
   towards the RP indicating that traffic from S for G should NOT be
   forwarded in this direction. The prune is propagated until it reaches
   the RP or a router that still needs the traffic from S for other
   receivers.

   By now, the receiver will be receiving traffic from S along the
   shortest-path tree between the receiver and S. In addition, the RP
   is receiving the traffic from S, but this traffic is no longer
   reaching the receiver along the RP tree. As far as the receiver is
   concerned, this is the final distribution tree.

3.4. Source-Specific Joins

   IGMPv3 permits a receiver to join a group and specify that it only
   wants to receive traffic for a group if that traffic comes from a
   particular source. If a receiver does this, and no other receiver on
   the LAN requires all the traffic for the group, then the DR may omit
   performing a (*,G) join to set up the shared tree, and instead issue
   a source-specific (S,G) join only.

   The range of multicast addresses from 232.0.0.0 to 232.255.255.255 is
   currently set aside for source-specific multicast in IPv4. For
   groups in this range, receivers should only issue source-specific
   IGMPv3 joins. If a PIM router receives a non-source-specific join for
   a group in this range, it should ignore it, as described in Section
   4.8.

3.5. Source-Specific Prunes

   IGMPv3 also permits a receiver to join a group and to specify that it
   only wants to receive traffic for a group if that traffic does not
   come from a specific source or sources. In this case, the DR will
   perform a (*,G) join as normal, but may combine this with an
   (S,G,rpt) prune for each of the sources the receiver does not wish to
   receive.

3.6. Multi-Access Transit LANs

   The overview so far has concerned itself with point-to-point transit
   links. However, using multi-access LANs such as Ethernet for transit
   is not uncommon. This can cause complications for three reasons:

   o Two or more routers on the LAN may issue (*,G) Joins to different
     upstream routers on the LAN because they have inconsistent MRIB
     entries regarding how to reach the RP. Both paths on the RP tree
     will be set up, causing two copies of all the shared tree traffic
     to appear on the LAN.

   o Two or more routers on the LAN may issue (S,G) Joins to different
     upstream routers on the LAN because they have inconsistent MRIB
     entries regarding how to reach source S. Both paths on the source-
     specific tree will be set up, causing two copies of all the traffic
     from S to appear on the LAN.

   o A router on the LAN may issue a (*,G) Join to one upstream router
     on the LAN, and another router on the LAN may issue an (S,G) Join
     to a different upstream router on the same LAN. Traffic from S may
     reach the LAN over both the RPT and the SPT. If the receiver
     behind the downstream (*,G) router doesn't issue an (S,G,rpt)
     prune, then this condition would persist.

   All of these problems are caused by there being more than one
   upstream router with join state for the group or source-group pair.
   PIM does not prevent such duplicate joins from occurring; instead,
   when duplicate data packets appear on the LAN from different routers,
   these routers notice this and then elect a single forwarder. This
   election is performed using PIM Assert messages, which resolve the
   problem in favor of the upstream router that has (S,G) state; or, if
   neither or both routers have (S,G) state, then the problem is
   resolved in favor of the router with the best metric to the RP for RP
   trees, or the best metric to the source for source-specific trees.

   These Assert messages are also received by the downstream routers on
   the LAN, and these cause subsequent Join messages to be sent to the
   upstream router that won the Assert.

3.7. RP Discovery

   PIM-SM routers need to know the address of the RP for each group for
   which they have (*,G) state. This address is obtained automatically
   (e.g., embedded-RP), through a bootstrap mechanism, or through static
   configuration.

   One dynamic way to do this is to use the Bootstrap Router (BSR)
   mechanism [11]. One router in each PIM domain is elected the
   Bootstrap Router through a simple election process. All the routers
   in the domain that are configured to be candidates to be RPs
   periodically unicast their candidacy to the BSR.
   From the
   candidates, the BSR picks an RP-set, and periodically announces this
   set in a Bootstrap message. Bootstrap messages are flooded hop-by-hop
   throughout the domain until all routers in the domain know the RP-
   Set.

   To map a group to an RP, a router hashes the group address into the
   RP-set using an order-preserving hash function (one that minimizes
   changes if the RP-Set changes). The resulting RP is the one that it
   uses as the RP for that group.

4. Protocol Specification

   The specification of PIM-SM is broken into several parts:

   o Section 4.1 details the protocol state stored.

   o Section 4.2 specifies the data packet forwarding rules.

   o Section 4.3 specifies Designated Router (DR) election and the rules
     for sending and processing Hello messages.

   o Section 4.4 specifies the PIM Register generation and processing
     rules.

   o Section 4.5 specifies the PIM Join/Prune generation and processing
     rules.

   o Section 4.6 specifies the PIM Assert generation and processing
     rules.

   o Section 4.7 specifies the RP discovery mechanisms.

   o The subset of PIM required to support Source-Specific Multicast,
     PIM-SSM, is described in Section 4.8.

   o PIM packet formats are specified in Section 4.9.

   o A summary of PIM-SM timers and their default values is given in
     Section 4.10.

4.1. PIM Protocol State

   This section specifies all the protocol state that a PIM
   implementation should maintain in order to function correctly. We
   term this state the Tree Information Base (TIB), as it holds the
   state of all the multicast distribution trees at this router. In
   this specification, we define PIM mechanisms in terms of the TIB.
   However, only a very simple implementation would actually implement
   packet forwarding operations in terms of this state. Most
   implementations will use this state to build a multicast forwarding
   table, which would then be updated when the relevant state in the TIB
   changes.

   Although we specify precisely the state to be kept, this does not
   mean that an implementation of PIM-SM needs to hold the state in this
   form. This is actually an abstract state definition, which is needed
   in order to specify the router's behavior. A PIM-SM implementation
   is free to hold whatever internal state it requires and will still be
   conformant with this specification so long as it results in the same
   externally visible protocol behavior as an abstract router that holds
   the following state.

   We divide TIB state into three sections:

   (*,G) state
        State that maintains the RP tree for G.

   (S,G) state
        State that maintains a source-specific tree for source S and
        group G.

   (S,G,rpt) state
        State that maintains source-specific information about source S
        on the RP tree for G. For example, if a source is being
        received on the source-specific tree, it will normally have been
        pruned off the RP tree. This prune state is (S,G,rpt) state.

   The state that should be kept is described below. Of course,
   implementations will only maintain state when it is relevant to
   forwarding operations; for example, the "NoInfo" state might be
   assumed from the lack of other state information rather than being
   held explicitly.

4.1.1. General Purpose State

   A router holds the following non-group-specific state:

      For each interface:

         o Effective Override Interval

         o Effective Propagation Delay

         o Suppression state: One of {"Enable", "Disable"}

         Neighbor State:

            For each neighbor:

               o Information from neighbor's Hello

               o Neighbor's GenID.

               o Neighbor Liveness Timer (NLT)

         Designated Router (DR) State:

            o Designated Router's IP Address

            o DR's DR Priority

   The Effective Override Interval, the Effective Propagation Delay and
   the Interface suppression state are described in Section 4.3.3.
   Designated Router state is described in Section 4.3.

4.1.2. (*,G) State

   For every group G, a router keeps the following state:

   (*,G) state:
      For each interface:

         Local Membership:
            State: One of {"NoInfo", "Include"}

         PIM (*,G) Join/Prune State:

            o State: One of {"NoInfo" (NI), "Join" (J),
              "Prune-Pending" (PP)}

            o Prune-Pending Timer (PPT)

            o Join/Prune Expiry Timer (ET)

         (*,G) Assert Winner State

            o State: One of {"NoInfo" (NI), "I lost Assert" (L),
              "I won Assert" (W)}

            o Assert Timer (AT)

            o Assert winner's IP Address (AssertWinner)

            o Assert winner's Assert Metric (AssertWinnerMetric)

      Not interface specific:

         Upstream (*,G) Join/Prune State:

            o State: One of {"NotJoined(*,G)", "Joined(*,G)"}

         o Upstream Join/Prune Timer (JT)

         o Last RP Used

         o Last RPF Neighbor towards RP that was used

   Local membership is the result of the local membership mechanism
   (such as IGMP or MLD) running on that interface. It need not be kept
   if this router is not the DR on that interface unless this router won
   a (*,G) assert on this interface for this group, although
   implementations may optionally keep this state in case they become
   the DR or assert winner. It is RECOMMENDED to store this information
   if possible, as it reduces latency converging to stable operating
   conditions after a failure causing a change of DR. This information
   is used by the pim_include(*,G) macro described in Section 4.1.5.

   PIM (*,G) Join/Prune state is the result of receiving PIM (*,G)
   Join/Prune messages on this interface and is specified in Section
   4.5.2.
   The state is used by the macros that calculate the outgoing
   interface list in Section 4.1.5, and in the JoinDesired(*,G) macro
   (defined in Section 4.5.6) that is used in deciding whether a
   Join(*,G) should be sent upstream.

   (*,G) Assert Winner state is the result of sending or receiving (*,G)
   Assert messages on this interface. It is specified in Section 4.6.2.

   The upstream (*,G) Join/Prune State reflects the state of the
   upstream (*,G) state machine described in Section 4.5.6.

   The upstream (*,G) Join/Prune Timer is used to send out periodic
   Join(*,G) messages, and to override Prune(*,G) messages from peers on
   an upstream LAN interface.

   The last RP used must be stored because if the RP-Set changes
   (Section 4.7), then state must be torn down and rebuilt for groups
   whose RP changes.

   The last RPF neighbor towards the RP is stored because if the MRIB
   changes, then the RPF neighbor towards the RP may change. If it does
   so, then we need to trigger a new Join(*,G) to the new upstream
   neighbor and a Prune(*,G) to the old upstream neighbor. Similarly,
   if a router detects through a changed GenID in a Hello message that
   the upstream neighbor towards the RP has rebooted, then it SHOULD
   re-instantiate state by sending a Join(*,G). These mechanisms are
   specified in Section 4.5.6.

4.1.3. (S,G) State

   For every source/group pair (S,G), a router keeps the following
   state:

   (S,G) state:

      For each interface:

         Local Membership:
            State: One of {"NoInfo", "Include"}

         PIM (S,G) Join/Prune State:

            o State: One of {"NoInfo" (NI), "Join" (J),
              "Prune-Pending" (PP)}

            o Prune-Pending Timer (PPT)

            o Join/Prune Expiry Timer (ET)

         (S,G) Assert Winner State

            o State: One of {"NoInfo" (NI), "I lost Assert" (L),
              "I won Assert" (W)}

            o Assert Timer (AT)

            o Assert winner's IP Address (AssertWinner)

            o Assert winner's Assert Metric (AssertWinnerMetric)

      Not interface specific:

         Upstream (S,G) Join/Prune State:

            o State: One of {"NotJoined(S,G)", "Joined(S,G)"}

         o Upstream (S,G) Join/Prune Timer (JT)

         o Last RPF Neighbor towards S that was used

         o SPTbit (indicates (S,G) state is active)

         o (S,G) Keepalive Timer (KAT)

         Additional (S,G) state at the DR:

            o Register state: One of {"Join" (J), "Prune" (P),
              "Join-Pending" (JP), "NoInfo" (NI)}

            o Register-Stop timer

   Local membership is the result of the local source-specific
   membership mechanism (such as IGMP version 3) running on that
   interface and specifying that this particular source should be
   included. As stored here, this state is the resulting state after
   any IGMPv3 inconsistencies have been resolved. It need not be kept
   if this router is not the DR on that interface unless this router won
   an (S,G) assert on this interface for this group. However, it is
   RECOMMENDED to store this information if possible, as it reduces
   latency converging to stable operating conditions after a failure
   causing a change of DR. This information is used by the
   pim_include(S,G) macro described in Section 4.1.5.

   PIM (S,G) Join/Prune state is the result of receiving PIM (S,G)
   Join/Prune messages on this interface and is specified in Section
   4.5.3.
   The state is used by the macros that calculate the outgoing
   interface list in Section 4.1.5, and in the JoinDesired(S,G) macro
   (defined in Section 4.5.7) that is used in deciding whether a
   Join(S,G) should be sent upstream.

   (S,G) Assert Winner state is the result of sending or receiving (S,G)
   Assert messages on this interface. It is specified in Section 4.6.1.

   The upstream (S,G) Join/Prune State reflects the state of the
   upstream (S,G) state machine described in Section 4.5.7.

   The upstream (S,G) Join/Prune Timer is used to send out periodic
   Join(S,G) messages, and to override Prune(S,G) messages from peers on
   an upstream LAN interface.

   The last RPF neighbor towards S is stored because if the MRIB
   changes, then the RPF neighbor towards S may change. If it does so,
   then we need to trigger a new Join(S,G) to the new upstream neighbor
   and a Prune(S,G) to the old upstream neighbor. Similarly, if the
   router detects through a changed GenID in a Hello message that the
   upstream neighbor towards S has rebooted, then it SHOULD
   re-instantiate state by sending a Join(S,G). These mechanisms are
   specified in Section 4.5.7.

   The SPTbit is used to indicate whether forwarding is taking place on
   the (S,G) Shortest Path Tree (SPT) or on the (*,G) tree. A router
   can have (S,G) state and still be forwarding on (*,G) state during
   the interval when the source-specific tree is being constructed.
   When SPTbit is FALSE, only (*,G) forwarding state is used to forward
   packets from S to G. When SPTbit is TRUE, both (*,G) and (S,G)
   forwarding state are used.

   The (S,G) Keepalive Timer is updated by data being forwarded using
   this (S,G) forwarding state. It is used to keep (S,G) state alive in
   the absence of explicit (S,G) Joins.
   Amongst other things, this is necessary for the so-called
   "turnaround rules" -- when the RP uses (S,G) joins to stop
   encapsulation, and then (S,G) prunes to prevent traffic from
   unnecessarily reaching the RP.

   On a DR, the (S,G) Register State is used to keep track of whether to
   encapsulate data to the RP on the Register Tunnel; the (S,G)
   Register-Stop timer tracks how long before encapsulation begins again
   for a given (S,G).

4.1.4. (S,G,rpt) State

   For every source/group pair (S,G) for which a router also has (*,G)
   state, it also keeps the following state:

   (S,G,rpt) state:

      For each interface:

         Local Membership:
            State: One of {"NoInfo", "Exclude"}

         PIM (S,G,rpt) Join/Prune State:

            o State: One of {"NoInfo", "Pruned", "Prune-Pending"}

            o Prune-Pending Timer (PPT)

            o Join/Prune Expiry Timer (ET)

      Not interface specific:

         Upstream (S,G,rpt) Join/Prune State:

            o State: One of {"RPTNotJoined(G)",
              "NotPruned(S,G,rpt)", "Pruned(S,G,rpt)"}

            o Override Timer (OT)

   Local membership is the result of the local source-specific
   membership mechanism (such as IGMPv3) running on that interface and
   specifying that although there is (*,G) Include state, this
   particular source should be excluded. As stored here, this state is
   the resulting state after any IGMPv3 inconsistencies between LAN
   members have been resolved. It need not be kept if this router is
   not the DR on that interface unless this router won a (*,G) assert on
   this interface for this group. However, we recommend storing this
   information if possible, as it reduces latency converging to stable
   operating conditions after a failure causing a change of DR. This
   information is used by the pim_exclude(S,G) macro described in
   Section 4.1.5.

   PIM (S,G,rpt) Join/Prune state is the result of receiving PIM
   (S,G,rpt) Join/Prune messages on this interface and is specified in
   Section 4.5.4.
   The state is used by the macros that calculate the outgoing
   interface list in Section 4.1.5, and in the rules for adding
   Prune(S,G,rpt) messages to Join(*,G) messages specified in Section
   4.5.8.

   The upstream (S,G,rpt) Join/Prune state is used along with the
   Override Timer to send the correct override messages in response to
   Join/Prune messages sent by upstream peers on a LAN. This state and
   behavior are specified in Section 4.5.9.

4.1.5. State Summarization Macros

   Using this state, we define the following "macro" definitions, which
   we will use in the descriptions of the state machines and pseudocode
   in the following sections.

   The most important macros are those that define the outgoing
   interface list (or "olist") for the relevant state. An olist can be
   "immediate" if it is built directly from the state of the relevant
   type. For example, the immediate_olist(S,G) is the olist that would
   be built if the router only had (S,G) state and no (*,G) or (S,G,rpt)
   state. In contrast, the "inherited" olist inherits state from other
   types. For example, the inherited_olist(S,G) is the olist that is
   relevant for forwarding a packet from S to G using both
   source-specific and group-specific state.

   There is no immediate_olist(S,G,rpt) as (S,G,rpt) state is negative
   state; it removes interfaces in the (*,G) olist from the olist that
   is actually used to forward traffic. The inherited_olist(S,G,rpt) is
   therefore the olist that would be used for a packet from S to G
   forwarding on the RP tree. It is always a subset of
   immediate_olist(*,G).

   Generally speaking, the inherited olists are used for forwarding, and
   the immediate_olists are used to make decisions about state
   maintenance.
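
   The olist macros defined in this section are plain set expressions
   over interfaces. The following is a minimal Python sketch of how they
   compose, not a normative definition. It assumes that (+) is set union
   and (-) is set difference, applied left to right, and that every
   input set (joins, prunes, pim_include, pim_exclude, lost_assert) has
   already been computed elsewhere. The function and parameter names are
   illustrative only.

```python
# Illustrative sketch only: each argument is a set of interface names that
# some other part of a (hypothetical) implementation has already derived.

def immediate_olist(joins, pim_include, lost_assert):
    """joins (+) pim_include (-) lost_assert, evaluated left to right."""
    return (joins | pim_include) - lost_assert


def inherited_olist_sg_rpt(joins_star_g, prunes_sg_rpt,
                           pim_include_star_g, pim_exclude_sg,
                           lost_assert_star_g, lost_assert_sg_rpt):
    """(*,G) olist with the (S,G,rpt) negative state removed."""
    kept_joins = joins_star_g - prunes_sg_rpt
    kept_local = pim_include_star_g - pim_exclude_sg
    lost = lost_assert_star_g | lost_assert_sg_rpt
    return (kept_joins | kept_local) - lost


def inherited_olist_sg(inherited_rpt, joins_sg, pim_include_sg,
                       lost_assert_sg):
    """inherited_olist(S,G,rpt) extended with the (S,G) state."""
    return (inherited_rpt | joins_sg | pim_include_sg) - lost_assert_sg


# Example: eth2 pruned S off the RP tree but joined the (S,G) tree.
rpt_olist = inherited_olist_sg_rpt(
    joins_star_g={"eth1", "eth2"}, prunes_sg_rpt={"eth2"},
    pim_include_star_g={"eth3"}, pim_exclude_sg=set(),
    lost_assert_star_g=set(), lost_assert_sg_rpt=set())
sg_olist = inherited_olist_sg(rpt_olist, joins_sg={"eth2"},
                              pim_include_sg=set(), lost_assert_sg=set())
```

   In this example the RP-tree olist for S omits eth2, while the olist
   used once forwarding takes place on (S,G) state includes it again.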

     immediate_olist(*,G) =
         joins(*,G) (+) pim_include(*,G) (-) lost_assert(*,G)

     immediate_olist(S,G) =
         joins(S,G) (+) pim_include(S,G) (-) lost_assert(S,G)

     inherited_olist(S,G,rpt) =
             ( joins(*,G) (-) prunes(S,G,rpt) )
         (+) ( pim_include(*,G) (-) pim_exclude(S,G) )
         (-) ( lost_assert(*,G) (+) lost_assert(S,G,rpt) )

     inherited_olist(S,G) =
         inherited_olist(S,G,rpt) (+)
         joins(S,G) (+) pim_include(S,G) (-) lost_assert(S,G)

   The macros pim_include(*,G) and pim_include(S,G) indicate the
   interfaces to which traffic might be forwarded because of hosts that
   are local members on that interface. Note that normally only the DR
   cares about local membership, but when an assert happens, the assert
   winner takes over responsibility for forwarding traffic to local
   members that have requested traffic on a group or source/group pair.

     pim_include(*,G) =
         { all interfaces I such that:
           ( ( I_am_DR( I ) AND lost_assert(*,G,I) == FALSE )
             OR AssertWinner(*,G,I) == me )
           AND local_receiver_include(*,G,I) }

     pim_include(S,G) =
         { all interfaces I such that:
           ( ( I_am_DR( I ) AND lost_assert(S,G,I) == FALSE )
             OR AssertWinner(S,G,I) == me )
           AND local_receiver_include(S,G,I) }

     pim_exclude(S,G) =
         { all interfaces I such that:
           ( ( I_am_DR( I ) AND lost_assert(*,G,I) == FALSE )
             OR AssertWinner(*,G,I) == me )
           AND local_receiver_exclude(S,G,I) }

   The clause "local_receiver_include(S,G,I)" is true if the IGMP/MLD
   module or other local membership mechanism has determined that local
   members on interface I desire to receive traffic sent specifically by
   S to G. "local_receiver_include(*,G,I)" is true if the IGMP/MLD
   module or other local membership mechanism has determined that local
   members on interface I desire to receive all traffic sent to G
   (possibly excluding traffic from a specific set of sources).
   "local_receiver_exclude(S,G,I)" is true if
   "local_receiver_include(*,G,I)" is true but none of the local members
   desire to receive traffic from S.

   The set "joins(*,G)" is the set of all interfaces on which the router
   has received (*,G) Joins:

     joins(*,G) =
         { all interfaces I such that
           DownstreamJPState(*,G,I) is either Join or Prune-Pending }

   DownstreamJPState(*,G,I) is the state of the finite state machine in
   Section 4.5.2.

   The set "joins(S,G)" is the set of all interfaces on which the router
   has received (S,G) Joins:

     joins(S,G) =
         { all interfaces I such that
           DownstreamJPState(S,G,I) is either Join or Prune-Pending }

   DownstreamJPState(S,G,I) is the state of the finite state machine in
   Section 4.5.3.

   The set "prunes(S,G,rpt)" is the set of all interfaces on which the
   router has received (*,G) joins and (S,G,rpt) prunes:

     prunes(S,G,rpt) =
         { all interfaces I such that
           DownstreamJPState(S,G,rpt,I) is Prune or PruneTmp }

   DownstreamJPState(S,G,rpt,I) is the state of the finite state machine
   in Section 4.5.4.

   The set "lost_assert(*,G)" is the set of all interfaces on which the
   router has received (*,G) joins but has lost a (*,G) assert. The
   macro lost_assert(*,G,I) is defined in Section 4.6.5.

     lost_assert(*,G) =
         { all interfaces I such that
           lost_assert(*,G,I) == TRUE }

   The set "lost_assert(S,G,rpt)" is the set of all interfaces on which
   the router has received (*,G) joins but has lost an (S,G) assert.
   The macro lost_assert(S,G,rpt,I) is defined in Section 4.6.5.

     lost_assert(S,G,rpt) =
         { all interfaces I such that
           lost_assert(S,G,rpt,I) == TRUE }

   The set "lost_assert(S,G)" is the set of all interfaces on which the
   router has received (S,G) joins but has lost an (S,G) assert. The
   macro lost_assert(S,G,I) is defined in Section 4.6.5.

     lost_assert(S,G) =
         { all interfaces I such that
           lost_assert(S,G,I) == TRUE }

   The following pseudocode macro definitions are also used in many
   places in the specification. Basically, RPF' is the RPF neighbor
   towards an RP or source unless a PIM-Assert has overridden the normal
   choice of neighbor.

     neighbor RPF'(*,G) {
         if ( I_Am_Assert_Loser(*, G, RPF_interface(RP(G))) ) {
             return AssertWinner(*, G, RPF_interface(RP(G)) )
         } else {
             return NBR( RPF_interface(RP(G)), MRIB.next_hop( RP(G) ) )
         }
     }

     neighbor RPF'(S,G,rpt) {
         if ( I_Am_Assert_Loser(S, G, RPF_interface(RP(G)) ) ) {
             return AssertWinner(S, G, RPF_interface(RP(G)) )
         } else {
             return RPF'(*,G)
         }
     }

     neighbor RPF'(S,G) {
         if ( I_Am_Assert_Loser(S, G, RPF_interface(S) ) ) {
             return AssertWinner(S, G, RPF_interface(S) )
         } else {
             return NBR( RPF_interface(S), MRIB.next_hop( S ) )
         }
     }

   RPF'(*,G) and RPF'(S,G) indicate the neighbor from which data packets
   should be coming and to which joins should be sent on the RP tree and
   SPT, respectively.

   RPF'(S,G,rpt) is basically RPF'(*,G) modified by the result of an
   Assert(S,G) on RPF_interface(RP(G)). In such a case, packets from S
   will be originating from a different router than RPF'(*,G). If we
   only have active (*,G) Join state, we need to accept packets from
   RPF'(S,G,rpt) and add a Prune(S,G,rpt) to the periodic Join(*,G)
   messages that we send to RPF'(*,G) (see Section 4.5.8).

   The function MRIB.next_hop( S ) returns an address of the next-hop
   PIM neighbor toward the host S, as indicated by the current MRIB. If
   S is directly adjacent, then MRIB.next_hop( S ) returns NULL. At the
   RP for G, MRIB.next_hop( RP(G)) returns NULL.
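
   The fallback chain in the RPF' pseudocode above can be illustrated
   with a small Python sketch. This is only an illustration under stated
   assumptions: the assert state and the MRIB next hop are passed in as
   plain values, the NBR() address mapping is assumed to have been
   applied already, None stands in for NULL, and all function and
   parameter names are hypothetical.

```python
from typing import Optional


def rpf_prime_star_g(lost_star_g_assert: bool,
                     star_g_assert_winner: Optional[str],
                     next_hop_to_rp: Optional[str]) -> Optional[str]:
    """RPF'(*,G): the (*,G) assert winner on RPF_interface(RP(G)) if this
    router lost that assert, otherwise the next hop towards RP(G)."""
    if lost_star_g_assert:
        return star_g_assert_winner
    return next_hop_to_rp


def rpf_prime_sg_rpt(lost_sg_assert_on_rp_if: bool,
                     sg_assert_winner: Optional[str],
                     rpf_prime_star: Optional[str]) -> Optional[str]:
    """RPF'(S,G,rpt): the (S,G) assert winner on RPF_interface(RP(G)) if
    this router lost that assert, otherwise whatever RPF'(*,G) is."""
    if lost_sg_assert_on_rp_if:
        return sg_assert_winner
    return rpf_prime_star


# No asserts lost: both resolve to the MRIB next hop towards the RP.
star = rpf_prime_star_g(False, None, "192.0.2.1")
rpt = rpf_prime_sg_rpt(False, None, star)

# An (S,G) assert lost on the RP-facing interface redirects only the
# (S,G,rpt) neighbor; RPF'(*,G) is unchanged.
rpt_after_assert = rpf_prime_sg_rpt(True, "192.0.2.9", star)
```

   The case where rpt_after_assert differs from star is the one in which
   a Prune(S,G,rpt) is added to the periodic Join(*,G) messages.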

   The function NBR( I, A ) uses information gathered through PIM Hello
   messages to map the IP address A of a directly connected PIM neighbor
   router on interface I to the primary IP address of the same router
   (Section 4.3.4). The primary IP address of a neighbor is the address
   that it uses as the source of its PIM Hello messages. Note that a
   neighbor's IP address may be non-unique within the PIM neighbor
   database due to scope issues. The address must, however, be unique
   amongst the addresses of all the PIM neighbors on a specific
   interface.

   I_Am_Assert_Loser(S, G, I) is true if the Assert state machine (in
   Section 4.6.1) for (S,G) on Interface I is in "I am Assert Loser"
   state.

   I_Am_Assert_Loser(*, G, I) is true if the Assert state machine (in
   Section 4.6.2) for (*,G) on Interface I is in "I am Assert Loser"
   state.

4.2. Data Packet Forwarding Rules

   The PIM-SM packet forwarding rules are defined below in pseudocode.

      iif is the incoming interface of the packet.
      S is the source address of the packet.
      G is the destination address of the packet (group address).
      RP is the address of the Rendezvous Point for this group.
      RPF_interface(S) is the interface the MRIB indicates would be used
      to route packets to S.
      RPF_interface(RP) is the interface the MRIB indicates would be
      used to route packets to the RP, except at the RP when it is the
      decapsulation interface (the "virtual" interface on which register
      packets are received).

   First, we restart (or start) the Keepalive Timer if the source is on
   a directly connected subnet.

   Second, we check to see if the SPTbit should be set because we've now
   switched from the RP tree to the SPT.

   Next, we check to see whether the packet should be accepted based on
   TIB state and the interface that the packet arrived on.

   If the packet should be forwarded using (S,G) state, we then build an
   outgoing interface list for the packet. If this list is not empty,
   then we restart the (S,G) state Keepalive Timer.

   If the packet should be forwarded using (*,G) state, then we just
   build an outgoing interface list for the packet. We also check if we
   should initiate a switch to start receiving this source on a shortest
   path tree.

   Finally we remove the incoming interface from the outgoing interface
   list we've created, and if the resulting outgoing interface list is
   not empty, we forward the packet out of those interfaces.

   On receipt of data from S to G on interface iif:
     if( DirectlyConnected(S) == TRUE AND iif == RPF_interface(S) ) {
         set KeepaliveTimer(S,G) to Keepalive_Period
         # Note: a register state transition or UpstreamJPState(S,G)
         # transition may happen as a result of restarting
         # KeepaliveTimer, and must be dealt with here.
     }

     if( iif == RPF_interface(S) AND UpstreamJPState(S,G) == Joined AND
         inherited_olist(S,G) != NULL ) {
         set KeepaliveTimer(S,G) to Keepalive_Period
     }

     Update_SPTbit(S,G,iif)
     oiflist = NULL

     if( iif == RPF_interface(S) AND SPTbit(S,G) == TRUE ) {
         oiflist = inherited_olist(S,G)
     } else if( iif == RPF_interface(RP(G)) AND SPTbit(S,G) == FALSE ) {
         oiflist = inherited_olist(S,G,rpt)
         CheckSwitchToSpt(S,G)
     } else {
         # Note: RPF check failed
         # A transition in an Assert FSM may cause an Assert(S,G)
         # or Assert(*,G) message to be sent out interface iif.
         # See Section 4.6 for details.
         if ( SPTbit(S,G) == TRUE AND iif is in inherited_olist(S,G) ) {
             send Assert(S,G) on iif
         } else if ( SPTbit(S,G) == FALSE AND
                     iif is in inherited_olist(S,G,rpt) ) {
             send Assert(*,G) on iif
         }
     }

     oiflist = oiflist (-) iif
     forward packet on all interfaces in oiflist

   This pseudocode employs several "macro" definitions:

   DirectlyConnected(S) is TRUE if the source S is on any subnet that is
   directly connected to this router (or for packets originating on this
   router).

   inherited_olist(S,G) and inherited_olist(S,G,rpt) are defined in
   Section 4.1.

   Basically, inherited_olist(S,G) is the outgoing interface list for
   packets forwarded on (S,G) state, taking into account (*,G) state,
   asserts, etc.

   inherited_olist(S,G,rpt) is the outgoing interface list for packets
   forwarded on (*,G) state, taking into account (S,G,rpt) prune state,
   asserts, etc.

   Update_SPTbit(S,G,iif) is defined in Section 4.2.2.

   CheckSwitchToSpt(S,G) is defined in Section 4.2.1.

   UpstreamJPState(S,G) is the state of the finite state machine in
   Section 4.5.7.

   Keepalive_Period is defined in Section 4.10.

   Data-triggered PIM-Assert messages sent from the above forwarding
   code SHOULD be rate-limited in an implementation-dependent manner.

4.2.1. Last-Hop Switchover to the SPT

   In Sparse-Mode PIM, last-hop routers join the shared tree towards the
   RP. Once traffic from sources to joined groups arrives at a last-hop
   router, it has the option of switching to receive the traffic on a
   shortest path tree (SPT).

   The decision for a router to switch to the SPT is controlled as
   follows:

     void
     CheckSwitchToSpt(S,G) {
         if ( ( pim_include(*,G) (-) pim_exclude(S,G)
                (+) pim_include(S,G) != NULL )
              AND SwitchToSptDesired(S,G) ) {
             # Note: Restarting the KAT will result in the SPT switch
             set KeepaliveTimer(S,G) to Keepalive_Period
         }
     }

   SwitchToSptDesired(S,G) is a policy function that is implementation
   defined. An "infinite threshold" policy can be implemented by making
   SwitchToSptDesired(S,G) return false all the time. A "switch on
   first packet" policy can be implemented by making
   SwitchToSptDesired(S,G) return true once a single packet has been
   received for the source and group.

4.2.2. Setting and Clearing the (S,G) SPTbit

   The (S,G) SPTbit is used to distinguish whether to forward on (*,G)
   or on (S,G) state. When switching from the RP tree to the source
   tree, there is a transition period when data is arriving due to
   upstream (*,G) state while upstream (S,G) state is being established,
   during which time a router should continue to forward only on (*,G)
   state. This prevents temporary black-holes that would be caused by
   sending a Prune(S,G,rpt) before the upstream (S,G) state has finished
   being established.

   Thus, when a packet arrives, the (S,G) SPTbit is updated as follows:

     void
     Update_SPTbit(S,G,iif) {
         if ( iif == RPF_interface(S)
              AND JoinDesired(S,G) == TRUE
              AND ( DirectlyConnected(S) == TRUE
                    OR RPF_interface(S) != RPF_interface(RP(G))
                    OR inherited_olist(S,G,rpt) == NULL
                    OR ( ( RPF'(S,G) == RPF'(*,G) ) AND
                         ( RPF'(S,G) != NULL ) )
                    OR ( I_Am_Assert_Loser(S,G,iif) ) ) ) {
             Set SPTbit(S,G) to TRUE
         }
     }

   Additionally, a router can set SPTbit(S,G) to TRUE in other cases,
   such as when it receives an Assert(S,G) on RPF_interface(S) (see
   Section 4.6.1).
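
   The condition tested by Update_SPTbit can be written as a pure
   predicate, which makes the grouping of the clauses explicit. The
   Python sketch below is an illustration only. It assumes that the five
   OR-ed clauses form one group that is AND-ed with the first two
   conditions, that an empty set stands in for a NULL olist, and that
   None stands in for a NULL neighbor. All names are hypothetical, and
   the inputs are assumed to be computed by the caller.

```python
def should_set_sptbit(iif, rpf_if_s, rpf_if_rp, join_desired_sg,
                      directly_connected, inherited_olist_sg_rpt,
                      rpf_prime_sg, rpf_prime_star_g,
                      assert_loser_sg_on_iif):
    """Return True when a packet from S arriving on iif should cause
    SPTbit(S,G) to be set, following the pseudocode condition."""
    # The packet must arrive on the RPF interface towards S, and this
    # router must want (S,G) Join state.
    if iif != rpf_if_s or not join_desired_sg:
        return False
    same_known_neighbor = (rpf_prime_sg is not None
                           and rpf_prime_sg == rpf_prime_star_g)
    return bool(directly_connected
                or rpf_if_s != rpf_if_rp
                or not inherited_olist_sg_rpt
                or same_known_neighbor
                or assert_loser_sg_on_iif)


# Same RPF interface for S and the RP, different upstream neighbors, and
# receivers still on the RP tree: wait for an Assert(S,G).
waiting = dict(iif="eth0", rpf_if_s="eth0", rpf_if_rp="eth0",
               join_desired_sg=True, directly_connected=False,
               inherited_olist_sg_rpt={"eth1"},
               rpf_prime_sg="192.0.2.2", rpf_prime_star_g="192.0.2.3",
               assert_loser_sg_on_iif=False)
```

   Calling should_set_sptbit(**waiting) exercises the case in which the
   router has to wait for an Assert(S,G) before switching; changing
   rpf_if_rp to a different interface makes the predicate true.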

   JoinDesired(S,G) is defined in Section 4.5.7 and indicates whether we
   have the appropriate (S,G) Join state to wish to send a Join(S,G)
   upstream.

   Basically, Update_SPTbit(S,G,iif) will set the SPTbit if we have the
   appropriate (S,G) join state, and if the packet arrived on the
   correct upstream interface for S, and if one or more of the following
   conditions applies:

   1. The source is directly connected, in which case the switch to the
      SPT is a no-op.

   2. The RPF interface to S is different from the RPF interface to the
      RP. The packet arrived on RPF_interface(S), and so the SPT must
      have been completed.

   3. No one wants the packet on the RP tree.

   4. RPF'(S,G) == RPF'(*,G). In this case, the router will never be
      able to tell if the SPT has been completed, so it should just
      switch immediately.

   In the case where the RPF interface is the same for the RP and for S,
   but RPF'(S,G) and RPF'(*,G) differ, we wait for an Assert(S,G), which
   indicates that the upstream router with (S,G) state believes the SPT
   has been completed. However, item (3) above is needed because there
   may not be any (*,G) state to trigger an Assert(S,G) to happen.

   The SPTbit is cleared in the (S,G) upstream state machine (see
   Section 4.5.7) when JoinDesired(S,G) becomes FALSE.

4.3. Designated Routers (DR) and Hello Messages

   A shared-media LAN like Ethernet may have multiple PIM-SM routers
   connected to it. A single one of these routers, the DR, will act on
   behalf of directly connected hosts with respect to the PIM-SM
   protocol. Because the distinction between LANs and point-to-point
   interfaces can sometimes be blurred, and because routers may also
   have multicast host functionality, the PIM-SM specification makes no
   distinction between the two. Thus, DR election will happen on all
   interfaces, LAN or otherwise.

   DR election is performed using Hello messages. Hello messages are
   also the way that option negotiation takes place in PIM, so that
   additional functionality can be enabled, or parameters tuned.

4.3.1. Sending Hello Messages

   PIM Hello messages are sent periodically on each PIM-enabled
   interface. They allow a router to learn about the neighboring PIM
   routers on each interface. Hello messages are also the mechanism
   used to elect a Designated Router (DR), and to negotiate additional
   capabilities. A router must record the Hello information received
   from each PIM neighbor.

   Hello messages MUST be sent on all active interfaces, including
   physical point-to-point links, and are multicast to the
   'ALL-PIM-ROUTERS' group address ('224.0.0.13' for IPv4 and 'ff02::d'
   for IPv6).

   We note that some implementations do not send Hello messages on
   point-to-point interfaces. This is non-compliant behavior. A
   compliant PIM router MUST send Hello messages, even on point-to-point
   interfaces.

   A per-interface Hello Timer (HT(I)) is used to trigger sending Hello
   messages on each active interface. When PIM is enabled on an
   interface or a router first starts, the Hello Timer of that interface
   is set to a random value between 0 and Triggered_Hello_Delay. This
   prevents synchronization of Hello messages if multiple routers are
   powered on simultaneously. After the initial randomized interval,
   Hello messages MUST be sent every Hello_Period seconds. The Hello
   Timer SHOULD NOT be reset except when it expires.

   Note that neighbors will not accept Join/Prune or Assert messages
   from a router unless they have first heard a Hello message from that
   router.
   Thus, if a router needs to send a Join/Prune or Assert
   message on an interface on which it has not yet sent a Hello message
   with the currently configured IP address, then it MUST immediately
   send the relevant Hello message without waiting for the Hello Timer
   to expire, followed by the Join/Prune or Assert message.

   The DR_Priority Option allows a network administrator to give
   preference to a particular router in the DR election process by
   giving it a numerically larger DR Priority. The DR_Priority Option
   SHOULD be included in every Hello message, even if no DR Priority is
   explicitly configured on that interface. This is necessary because
   priority-based DR election is only enabled when all neighbors on an
   interface advertise that they are capable of using the DR_Priority
   Option. The default priority is 1.

   The Generation_Identifier (GenID) Option SHOULD be included in all
   Hello messages. The GenID option contains a randomly generated
   32-bit value that is regenerated each time PIM forwarding is started
   or restarted on the interface, including when the router itself
   restarts. When a Hello message with a new GenID is received from a
   neighbor, any old Hello information about that neighbor SHOULD be
   discarded and superseded by the information from the new Hello
   message. This may cause a new DR to be chosen on that interface.

   The LAN Prune Delay Option SHOULD be included in all Hello messages
   sent on multi-access LANs. This option advertises a router's
   capability to use values other than the defaults for the
   Propagation_Delay and Override_Interval, which affect the setting of
   the Prune-Pending, Upstream Join, and Override Timers (defined in
   Section 4.10).

   The Address List Option advertises all the secondary addresses
   associated with the source interface of the router originating the
   message.
   The option MUST be included in all Hello messages if there
   are secondary addresses associated with the source interface and MAY
   be omitted if no secondary addresses exist.

   To allow new or rebooting routers to learn of PIM neighbors quickly,
   when a Hello message is received from a new neighbor, or a Hello
   message with a new GenID is received from an existing neighbor, a new
   Hello message SHOULD be sent on this interface after a randomized
   delay between 0 and Triggered_Hello_Delay. This triggered message
   need not change the timing of the scheduled periodic message. If a
   router needs to send a Join/Prune to the new neighbor or send an
   Assert message in response to an Assert message from the new neighbor
   before this randomized delay has expired, then it MUST immediately
   send the relevant Hello message without waiting for the Hello Timer
   to expire, followed by the Join/Prune or Assert message. If it does
   not do this, then the new neighbor will discard the Join/Prune or
   Assert message.

   Before an interface goes down or changes primary IP address, a Hello
   message with a zero HoldTime SHOULD be sent immediately (with the old
   IP address if the IP address changed). This will cause PIM neighbors
   to remove this neighbor (or its old IP address) immediately. After
   an interface has changed its IP address, it MUST send a Hello message
   with its new IP address. If an interface changes one of its
   secondary IP addresses, a Hello message with an updated Address_List
   option and a non-zero HoldTime SHOULD be sent immediately. This will
   cause PIM neighbors to update this neighbor's list of secondary
   addresses immediately.

4.3.2. DR Election

   When a PIM Hello message is received on interface I, the following
   information about the sending neighbor is recorded:

     neighbor.interface
          The interface on which the Hello message arrived.

     neighbor.primary_ip_address
          The IP address that the PIM neighbor used as the source
          address of the Hello message.

     neighbor.genid
          The Generation ID of the PIM neighbor.

     neighbor.dr_priority
          The DR Priority field of the PIM neighbor, if it is present in
          the Hello message.

     neighbor.dr_priority_present
          A flag indicating if the DR Priority field was present in the
          Hello message.

     neighbor.timeout
          A timer value to time out the neighbor state when it becomes
          stale, also known as the Neighbor Liveness Timer.

          The Neighbor Liveness Timer (NLT(N,I)) is reset to
          Hello_Holdtime (from the Hello Holdtime option) whenever a
          Hello message is received containing a Holdtime option, or to
          Default_Hello_Holdtime if the Hello message does not contain
          the Holdtime option.

   Neighbor state is deleted when the neighbor timeout expires.

   The function for computing the DR on interface I is:

     host
     DR(I) {
         dr = me
         for each neighbor on interface I {
             if ( dr_is_better( neighbor, dr, I ) == TRUE ) {
                 dr = neighbor
             }
         }
         return dr
     }

   The function used for comparing DR "metrics" on interface I is:

     bool
     dr_is_better(a,b,I) {
         if( there is a neighbor n on I for which n.dr_priority_present
                 is false ) {
             return a.primary_ip_address > b.primary_ip_address
         } else {
             return ( a.dr_priority > b.dr_priority ) OR
                    ( a.dr_priority == b.dr_priority AND
                      a.primary_ip_address > b.primary_ip_address )
         }
     }

   The trivial function I_am_DR(I) is defined to aid readability:

     bool
     I_am_DR(I) {
         return DR(I) == me
     }

   The DR Priority is a 32-bit unsigned number, and the numerically
   larger priority is always preferred.
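
   The DR(I) and dr_is_better() pseudocode above can be exercised with a
   short Python sketch. It is a simplified illustration, not an
   implementation of the protocol: the Router record and the elect_dr()
   name are hypothetical, all addresses on the interface are assumed to
   belong to one address family, and addresses are compared numerically
   through the standard ipaddress module rather than as strings.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    """Hypothetical record of the Hello fields used by the election."""
    primary_ip_address: str
    dr_priority: int = 1              # default priority stated above
    dr_priority_present: bool = True


def dr_is_better(a, b, neighbors):
    """True if a should be preferred over b as DR on this interface."""
    a_ip = ipaddress.ip_address(a.primary_ip_address)
    b_ip = ipaddress.ip_address(b.primary_ip_address)
    if any(not n.dr_priority_present for n in neighbors):
        # Some neighbor did not send DR_Priority: compare addresses only.
        return a_ip > b_ip
    # Higher priority wins; the higher address breaks ties.
    return (a.dr_priority, a_ip) > (b.dr_priority, b_ip)


def elect_dr(me, neighbors):
    """Start with this router and keep the best candidate seen so far."""
    dr = me
    for neighbor in neighbors:
        if dr_is_better(neighbor, dr, neighbors):
            dr = neighbor
    return dr


me = Router("192.0.2.1", dr_priority=10)
n1 = Router("192.0.2.9")
n2 = Router("192.0.2.5", dr_priority_present=False)

by_priority = elect_dr(me, [n1])        # every neighbor sent DR_Priority
by_address = elect_dr(me, [n1, n2])     # n2 omitted the option
```

   With only n1 present, the configured priority decides the election.
   Once n2 appears without the DR_Priority option, priorities are
   ignored and the highest primary address on the interface is chosen.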
   A router's idea of the current
   DR on an interface can change when a PIM Hello message is received,
   when a neighbor times out, or when a router's own DR Priority
   changes. If the router becomes the DR or ceases to be the DR, this
   will normally cause the DR Register state machine to change state.
   Subsequent actions are determined by that state machine.

   We note that some PIM implementations do not send Hello messages on
   point-to-point interfaces and thus cannot perform DR election on
   such interfaces. This is non-compliant behavior. DR election MUST
   be performed on ALL active PIM-SM interfaces.

4.3.3. Reducing Prune Propagation Delay on LANs

   In addition to the information recorded for the DR Election, the
   following per neighbor information is obtained from the LAN Prune
   Delay Hello option:

     neighbor.lan_prune_delay_present
          A flag indicating if the LAN Prune Delay option was present in
          the Hello message.

     neighbor.tracking_support
          A flag storing the value of the T bit in the LAN Prune Delay
          option if it is present in the Hello message. This indicates
          the neighbor's capability to disable Join message suppression.

     neighbor.propagation_delay
          The Propagation Delay field of the LAN Prune Delay option (if
          present) in the Hello message.

     neighbor.override_interval
          The Override_Interval field of the LAN Prune Delay option (if
          present) in the Hello message.

   The additional state described above is deleted along with the DR
   neighbor state when the neighbor timeout expires.

   Just like the DR_Priority option, the information provided in the LAN
   Prune Delay option is not used unless all neighbors on a link
   advertise the option.
   The function below computes this state:

     bool
     lan_delay_enabled(I) {
         for each neighbor on interface I {
             if ( neighbor.lan_prune_delay_present == false ) {
                 return false
             }
         }
         return true
     }

   The Propagation Delay inserted by a router in the LAN Prune Delay
   option expresses the expected message propagation delay on the link
   and SHOULD be configurable by the system administrator. It is used
   by upstream routers to figure out how long they should wait for a
   Join override message before pruning an interface.

   PIM implementers SHOULD enforce a lower bound on the permitted values
   for this delay to allow for scheduling and processing delays within
   their router. Such delays may cause received messages to be
   processed later as well as triggered messages to be sent later than
   intended. Setting this Propagation Delay to too low a value may
   result in temporary forwarding outages because a downstream router
   will not be able to override a neighbor's Prune message before the
   upstream neighbor stops forwarding.

   When all routers on a link are in a position to negotiate a
   Propagation Delay different from the default, the largest value from
   those advertised by each neighbor is chosen. The function for
   computing the Effective Propagation Delay of interface I is:

     time_interval
     Effective_Propagation_Delay(I) {
         if ( lan_delay_enabled(I) == false ) {
             return Propagation_delay_default
         }
         delay = Propagation_Delay(I)
         for each neighbor on interface I {
             if ( neighbor.propagation_delay > delay ) {
                 delay = neighbor.propagation_delay
             }
         }
         return delay
     }

   To avoid synchronization of override messages when multiple
   downstream routers share a multi-access link, sending of such
   messages is delayed by a small random amount of time.
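
   The negotiation of the Effective Propagation Delay above can be
   illustrated with a short, non-normative Python sketch. The
   LanNeighbor record, the millisecond unit, and passing the default as
   a parameter are assumptions of this example, not part of the
   specification.

```python
from dataclasses import dataclass


@dataclass
class LanNeighbor:
    """Hypothetical per-neighbor state taken from the LAN Prune Delay option."""
    lan_prune_delay_present: bool
    propagation_delay: int = 0  # milliseconds (unit assumed for the example)


def lan_delay_enabled(neighbors):
    """True only if every neighbor on the interface advertised the option."""
    return all(n.lan_prune_delay_present for n in neighbors)


def effective_propagation_delay(own_delay, neighbors, default_delay):
    """Largest advertised delay, or the default if negotiation is impossible."""
    if not lan_delay_enabled(neighbors):
        return default_delay
    return max([own_delay] + [n.propagation_delay for n in neighbors])
```

   A single neighbor that omits the option forces the whole link back
   to the default, which is why the check is made before any advertised
   value is considered.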
   The period of
   randomization should represent the size of the PIM router population
   on the link. Each router expresses its view of the amount of
   randomization necessary in the Override Interval field of the LAN
   Prune Delay option.

   When all routers on a link are in a position to negotiate an Override
   Interval different from the default, the largest value from those
   advertised by each neighbor is chosen. The function for computing
   the Effective Override Interval of interface I is:

     time_interval
     Effective_Override_Interval(I) {
         if ( lan_delay_enabled(I) == false ) {
             return t_override_default
         }
         delay = Override_Interval(I)
         for each neighbor on interface I {
             if ( neighbor.override_interval > delay ) {
                 delay = neighbor.override_interval
             }
         }
         return delay
     }

   Although the mechanisms are not specified in this document, it is
   possible for upstream routers to explicitly track the join membership
   of individual downstream routers if Join suppression is disabled. A
   router can advertise its willingness to disable Join suppression by
   using the T bit in the LAN Prune Delay Hello option. Unless all PIM
   routers on a link negotiate this capability, explicit tracking and
   the disabling of the Join suppression mechanism are not possible.
   The function for computing the state of Suppression on interface I
   is:

     bool
     Suppression_Enabled(I) {
         if ( lan_delay_enabled(I) == false ) {
             return true
         }
         for each neighbor on interface I {
             if ( neighbor.tracking_support == false ) {
                 return true
             }
         }
         return false
     }

   Note that the setting of Suppression_Enabled(I) affects the value of
   t_suppressed (see Section 4.10).

4.3.4. Maintaining Secondary Address Lists

   Communication of a router's interface secondary addresses to its PIM
   neighbors is necessary to provide the neighbors with a mechanism for
   mapping next_hop information obtained through their MRIB to a primary
   address that can be used as a destination for Join/Prune messages.
   The mapping is performed through the NBR macro. The primary address
   of a PIM neighbor is obtained from the source IP address used in its
   PIM Hello messages. Secondary addresses are carried within the Hello
   message in an Address List Hello option. The primary address of the
   source interface of the router MUST NOT be listed within the Address
   List Hello option.

   In addition to the information recorded for the DR Election, the
   following per neighbor information is obtained from the Address List
   Hello option:

     neighbor.secondary_address_list
          The list of secondary addresses used by the PIM neighbor on
          the interface through which the Hello message was transmitted.

   When processing a received PIM Hello message containing an Address
   List Hello option, the list of secondary addresses in the message
   completely replaces any previously associated secondary addresses for
   that neighbor. If a received PIM Hello message does not contain an
   Address List Hello option, then all secondary addresses associated
   with the neighbor MUST be deleted. If a received PIM Hello message
   contains an Address List Hello option that includes the primary
   address of the sending router in the list of secondary addresses
   (although this is not expected), then the addresses listed in the
   message, excluding the primary address, are used to update the
   associated secondary addresses for that neighbor.
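
   The three replacement rules above can be summarized in a minimal,
   non-normative Python sketch. The function name and the convention
   that None means "no Address List Hello option in this Hello" are
   assumptions of the example.

```python
def update_secondary_addresses(primary_address, address_list):
    """Return the neighbor's new secondary address list after one Hello.

    address_list is the content of the Address List Hello option, or None
    (assumed convention) when the Hello did not carry the option.
    """
    if address_list is None:
        # No option present: all previously learned secondaries are deleted.
        return []
    # The received list fully replaces the old one; a (not expected)
    # occurrence of the neighbor's primary address is skipped.
    return [addr for addr in address_list if addr != primary_address]
```

   Because the result never depends on the previously stored list, a
   receiver does not need to diff old and new addresses; it can simply
   overwrite the stored value on every Hello.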

   All the advertised secondary addresses in received Hello messages
   must be checked against those previously advertised by all other PIM
   neighbors on that interface. If there is a conflict and the same
   secondary address was previously advertised by another neighbor, then
   only the most recently received mapping MUST be maintained, and an
   error message SHOULD be logged to the administrator in a rate-limited
   manner.

   Within one Address List Hello option, all the addresses MUST be of
   the same address family. It is not permitted to mix IPv4 and IPv6
   addresses within the same message. In addition, the address family
   of the fields in the message SHOULD be the same as the IP source and
   destination addresses of the packet header.

4.4. PIM Register Messages

   The Designated Router (DR) on a LAN or point-to-point link
   encapsulates multicast packets from local sources to the RP for the
   relevant group unless it recently received a Register-Stop message
   for that (S,G) or (*,G) from the RP. When the DR receives a
   Register-Stop message from the RP, it starts a Register-Stop Timer to
   maintain this state. Just before the Register-Stop Timer expires,
   the DR sends a Null-Register Message to the RP to allow the RP to
   refresh the Register-Stop information at the DR. If the Register-Stop
   Timer actually expires, the DR will resume encapsulating packets
   from the source to the RP.

4.4.1. Sending Register Messages from the DR

   Every PIM-SM router has the capability to be a DR. The state machine
   below is used to implement Register functionality. For the purposes
   of specification, we represent the mechanism to encapsulate packets
   to the RP as a Register-Tunnel interface, which is added to or
   removed from the (S,G) olist. The tunnel interface then takes part
   in the normal packet forwarding rules as specified in Section 4.2.

   If register state is maintained, it is maintained only for directly
   connected sources and is per-(S,G). There are four states in the
   DR's per-(S,G) Register state machine:

   Join (J)
        The register tunnel is "joined" (the join is actually implicit,
        but the DR acts as if the RP has joined the DR on the tunnel
        interface).

   Prune (P)
        The register tunnel is "pruned" (this occurs when a
        Register-Stop is received).

   Join-Pending (JP)
        The register tunnel is pruned but the DR is contemplating adding
        it back.

   NoInfo (NI)
        No information. This is the initial state, and the state when
        the router is not the DR.

   In addition, a Register-Stop Timer (RST) is kept if the state machine
   is not in the NoInfo state.

   Figure 1: Per-(S,G) register state machine at a DR in tabular form

+----------++----------------------------------------------------------+
|          ||                          Event                           |
|          ++----------+-----------+-----------+-----------+-----------+
|Prev State||Register- | Could     | Could     | Register- | RP changed|
|          ||Stop Timer| Register  | Register  | Stop      |           |
|          ||expires   | ->True    | ->False   | received  |           |
+----------++----------+-----------+-----------+-----------+-----------+
|NoInfo    ||-         | -> J state| -         | -         | -         |
|(NI)      ||          | add reg   |           |           |           |
|          ||          | tunnel    |           |           |           |
+----------++----------+-----------+-----------+-----------+-----------+
|          ||-         | -         | -> NI     | -> P state| -> J state|
|          ||          |           | state     |           |           |
|          ||          |           | remove reg| remove reg| update reg|
|Join (J)  ||          |           | tunnel    | tunnel;   | tunnel    |
|          ||          |           |           | set       |           |
|          ||          |           |           | Register- |           |
|          ||          |           |           | Stop      |           |
|          ||          |           |           | Timer(*)  |           |
+----------++----------+-----------+-----------+-----------+-----------+
|          ||-> J state| -         | -> NI     | -> P state| -> J state|
|          ||          |           | state     |           |           |
|Join-     ||add reg   |           |           | set       | add reg   |
|Pending   ||tunnel    |           |           | Register- | tunnel;   |
|(JP)      ||          |           |           | Stop      | cancel    |
|          ||          |           |           | Timer(*)  | Register- |
|          ||          |           |           |           | Stop Timer|
+----------++----------+-----------+-----------+-----------+-----------+
|          ||-> JP     | -         | -> NI     | -         | -> J state|
|          ||state     |           | state     |           |           |
|          ||set       |           |           |           | add reg   |
|Prune (P) ||Register- |           |           |           | tunnel;   |
|          ||Stop      |           |           |           | cancel    |
|          ||Timer(**);|           |           |           | Register- |
|          ||send Null-|           |           |           | Stop Timer|
|          ||Register  |           |           |           |           |
+----------++----------+-----------+-----------+-----------+-----------+

   Notes:

   (*) The Register-Stop Timer is set to a random value chosen
       uniformly from the interval ( 0.5 * Register_Suppression_Time,
       1.5 * Register_Suppression_Time) minus Register_Probe_Time.

       Subtracting off Register_Probe_Time is a bit unnecessary because
       it is really small compared to Register_Suppression_Time, but
       this was in the old spec and is kept for compatibility.

   (**) The Register-Stop Timer is set to Register_Probe_Time.

   The following three actions are defined:

   Add Register Tunnel

     A Register-Tunnel virtual interface, VI, is created (if it doesn't
     already exist) with its encapsulation target being RP(G).
     DownstreamJPState(S,G,VI) is set to Join state, causing the tunnel
     interface to be added to immediate_olist(S,G) and
     inherited_olist(S,G).

   Remove Register Tunnel

     VI is the Register-Tunnel virtual interface with encapsulation
     target of RP(G). DownstreamJPState(S,G,VI) is set to NoInfo
     state, causing the tunnel interface to be removed from
     immediate_olist(S,G) and inherited_olist(S,G). If
     DownstreamJPState(S,G,VI) is NoInfo for all (S,G), then VI can be
     deleted.

   Update Register Tunnel

     This action occurs when RP(G) changes.

     VI_old is the Register-Tunnel virtual interface with encapsulation
     target old_RP(G).
     A Register-Tunnel virtual interface, VI_new, is
     created (if it doesn't already exist) with its encapsulation
     target being new_RP(G). DownstreamJPState(S,G,VI_old) is set to
     NoInfo state and DownstreamJPState(S,G,VI_new) is set to Join
     state. If DownstreamJPState(S,G,VI_old) is NoInfo for all (S,G),
     then VI_old can be deleted.

     Note that we cannot simply change the encapsulation target of
     VI_old because not all groups using that encapsulation tunnel will
     have moved to the same new RP.

   CouldRegister(S,G)

     The macro "CouldRegister" in the state machine is defined as:

       bool CouldRegister(S,G) {
          return ( I_am_DR( RPF_interface(S) ) AND
                   KeepaliveTimer(S,G) is running AND
                   DirectlyConnected(S) == TRUE )
       }

     Note that on reception of a packet at the DR from a directly
     connected source, KeepaliveTimer(S,G) needs to be set by the
     packet forwarding rules before computing CouldRegister(S,G) in the
     register state machine, or the first packet from a source won't be
     registered.

   Encapsulating Data Packets in the Register Tunnel

     Conceptually, the Register Tunnel is an interface with a smaller
     MTU than the underlying IP interface towards the RP. IP
     fragmentation on packets forwarded on the Register Tunnel is
     performed based upon this smaller MTU. The encapsulating DR may
     perform Path MTU Discovery to the RP to determine the effective
     MTU of the tunnel. Fragmentation for the smaller MTU should take
     both the outer IP header and the PIM register header overhead into
     account. If a multicast packet is fragmented on the way into the
     Register Tunnel, each fragment is encapsulated individually so it
     contains IP, PIM, and inner IP headers.
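
     As a rough, non-normative illustration of the overhead accounting
     described above, the effective tunnel MTU can be estimated as
     follows. The header sizes are assumptions of this sketch: a
     20-octet IPv4 or 40-octet IPv6 outer header without options or
     extension headers, and an 8-octet PIM Register header. The
     function names are hypothetical.

```python
# Assumed outer header sizes in octets (no IPv4 options, no IPv6
# extension headers).
OUTER_IP_HEADER_LEN = {4: 20, 6: 40}

# Assumed size in octets of the PIM header plus the Register flags word.
PIM_REGISTER_HEADER_LEN = 8


def register_tunnel_mtu(path_mtu_to_rp, ip_version):
    """Largest inner packet that fits in one Register message."""
    overhead = OUTER_IP_HEADER_LEN[ip_version] + PIM_REGISTER_HEADER_LEN
    return path_mtu_to_rp - overhead


def fits_in_tunnel(inner_packet_len, path_mtu_to_rp, ip_version):
    """True if the inner packet can be encapsulated without fragmentation."""
    return inner_packet_len <= register_tunnel_mtu(path_mtu_to_rp, ip_version)
```

     Under these assumptions, a 1500-octet path MTU towards the RP
     leaves 1472 octets for an inner IPv4 packet and 1452 octets for an
     inner IPv6 packet.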

     In IPv6, the DR MUST perform Path MTU discovery, and an ICMP
     Packet Too Big message MUST be sent by the encapsulating DR if it
     receives a packet that will not fit in the effective MTU of the
     tunnel. If the MTU between the DR and the RP results in the
     effective tunnel MTU being smaller than 1280 (the IPv6 minimum
     MTU), the DR MUST send Fragmentation Required messages with an MTU
     value of 1280 and MUST fragment its PIM register messages as
     required, using an IPv6 fragmentation header between the outer
     IPv6 header and the PIM Register header.

     The TTL of a forwarded data packet is decremented before it is
     encapsulated in the Register Tunnel. The encapsulating packet
     uses the normal TTL that the router would use for any
     locally-generated IP packet.

     The IP ECN bits should be copied from the original packet to the
     IP header of the encapsulating packet. They SHOULD NOT be set
     independently by the encapsulating router.

     The Diffserv Code Point (DSCP) should be copied from the original
     packet to the IP header of the encapsulating packet. It MAY be
     set independently by the encapsulating router, based upon static
     configuration or traffic classification. See [12] for more
     discussion on setting the DSCP on tunnels.

   Handling Register-Stop(*,G) Messages at the DR

     An old RP might send a Register-Stop message with the source
     address set to all zeros. This was the normal course of action in
     RFC 2362 when the Register message matched against (*,G) state at
     the RP, and it was defined as meaning "stop encapsulating all
     sources for this group". However, the behavior of such a
     Register-Stop(*,G) is ambiguous or incorrect in some
     circumstances.

     We specify that an RP should not send Register-Stop(*,G) messages,
     but for compatibility, a DR should be able to accept one if it is
     received.

     A Register-Stop(*,G) should be treated as a Register-Stop(S,G) for
     all (S,G) Register state machines that are not in the NoInfo
     state. A router should not apply a Register-Stop(*,G) to sources
     that become active after the Register-Stop(*,G) was received.

4.4.2. Receiving Register Messages at the RP

   When an RP receives a Register message, the course of action is
   decided according to the following pseudocode:

     packet_arrives_on_rp_tunnel( pkt ) {
         if( outer.dst is not one of my addresses ) {
             drop the packet silently.
             # Note: this may be a spoofing attempt
         }
         if( I_am_RP(G) AND outer.dst == RP(G) ) {
             sentRegisterStop = FALSE;
             if ( SPTbit(S,G) OR
                  ( SwitchToSptDesired(S,G) AND
                    ( inherited_olist(S,G) == NULL ))) {
                 send Register-Stop(S,G) to outer.src
                 sentRegisterStop = TRUE;
             }
             if ( SPTbit(S,G) OR SwitchToSptDesired(S,G) ) {
                 if ( sentRegisterStop == TRUE ) {
                     set KeepaliveTimer(S,G) to RP_Keepalive_Period;
                 } else {
                     set KeepaliveTimer(S,G) to Keepalive_Period;
                 }
             }
             if( !SPTbit(S,G) AND ! pkt.NullRegisterBit ) {
                 decapsulate and forward the inner packet to
                     inherited_olist(S,G,rpt) # Note (+)
             }
         } else {
             send Register-Stop(S,G) to outer.src
             # Note (*)
         }
     }

   outer.dst is the IP destination address of the encapsulating header.

   outer.src is the IP source address of the encapsulating header, i.e.,
   the DR's address.

   I_am_RP(G) is true if the group-to-RP mapping indicates that this
   router is the RP for the group.

   Note (*): This may block traffic from S for Register_Suppression_Time
      if the DR learned about a new group-to-RP mapping before the RP
      did. However, this doesn't matter unless we figure out some way
      for the RP also to accept (*,G) joins when it doesn't yet realize
      that it is about to become the RP for G.
      This will all get sorted
      out once the RP learns the new group-to-RP mapping. We decided to
      do nothing about this and just accept the fact that PIM may suffer
      interrupted (*,G) connectivity following an RP change.

   Note (+): Implementations SHOULD NOT make this a special case, but
      should instead arrange for this path to rejoin the normal packet
      forwarding path. All of the appropriate actions from the "On
      receipt of data from S to G on interface iif" pseudocode in
      Section 4.2 should be performed.

   KeepaliveTimer(S,G) is restarted at the RP when packets arrive on the
   proper tunnel interface and the RP desires to switch to the SPT or
   the SPTbit is already set. This may cause the upstream (S,G) state
   machine to trigger a join if the inherited_olist(S,G) is not NULL.

   An RP should preserve (S,G) state that was created in response to a
   Register message for at least ( 3 * Register_Suppression_Time );
   otherwise, the RP may stop joining (S,G) before the DR for S has
   restarted sending registers. Traffic would then be interrupted until
   the Register-Stop Timer expires at the DR.

   Thus, at the RP, KeepaliveTimer(S,G) should be restarted to ( 3 *
   Register_Suppression_Time + Register_Probe_Time ).

   When forwarding a packet from the Register Tunnel, the TTL of the
   original data packet is decremented after it is decapsulated.

   The IP ECN bits should be copied from the IP header of the Register
   packet to the decapsulated packet.

   The Diffserv Code Point (DSCP) should be copied from the IP header of
   the Register packet to the decapsulated packet. The RP MAY retain
   the DSCP of the inner packet or re-classify the packet and apply a
   different DSCP. Scenarios where each of these might be useful are
   discussed in [12].

4.5. PIM Join/Prune Messages

   A PIM Join/Prune message consists of a list of groups and a list of
   Joined and Pruned sources for each group.
   When processing a received
   Join/Prune message, each Joined or Pruned source for a Group is
   effectively considered individually, and applies to one or more of
   the following state machines. When considering a Join/Prune message
   whose Upstream Neighbor Address field addresses this router, (*,G)
   Joins and Prunes can affect both the (*,G) and (S,G,rpt) downstream
   state machines, while (S,G), and (S,G,rpt) Joins and Prunes can only
   affect their respective downstream state machines. When considering
   a Join/Prune message whose Upstream Neighbor Address field addresses
   another router, most Join or Prune messages could affect each
   upstream state machine.

   In general, a PIM Join/Prune message should only be accepted for
   processing if it comes from a known PIM neighbor. A PIM router hears
   about PIM neighbors through PIM Hello messages. If a router receives
   a Join/Prune message from a particular IP source address and it has
   not seen a PIM Hello message from that source address, then the
   Join/Prune message SHOULD be discarded without further processing.
   In addition, if the Hello message from a neighbor was authenticated
   using IPsec AH (see Section 6.3), then all Join/Prune messages from
   that neighbor MUST also be authenticated using IPsec AH.

   We note that some older PIM implementations incorrectly fail to send
   Hello messages on point-to-point interfaces, so we also RECOMMEND
   that a configuration option be provided to allow interoperation with
   such older routers, but that this configuration option SHOULD NOT be
   enabled by default.

4.5.1. Receiving (*,G) Join/Prune Messages

   When a router receives a Join(*,G), it must first check to see
   whether the RP in the message matches RP(G) (the router's idea of who
   the RP is). If the RP in the message does not match RP(G), the
   Join(*,G) should be silently dropped.
   (Note that other source list
   entries, such as (S,G,rpt) or (S,G), in the same Group-Specific Set
   should still be processed.) If a router has no RP information (e.g.,
   has not recently received a BSR message), then it may choose to
   accept Join(*,G) and treat
   the RP in the message as RP(G). Received Prune(*,G) messages are
   processed even if the RP in the message does not match RP(G).

   The per-interface state machine for receiving (*,G) Join/Prune
   Messages is given below. There are three states:

     NoInfo (NI)
          The interface has no (*,G) Join state and no timers running.

     Join (J)
          The interface has (*,G) Join state, which will cause the
          router to forward packets destined for G from this interface
          except if there is also (S,G,rpt) prune information (see
          Section 4.5.4) or the router lost an assert on this interface.

     Prune-Pending (PP)
          The router has received a Prune(*,G) on this interface from a
          downstream neighbor and is waiting to see whether the prune
          will be overridden by another downstream router. For
          forwarding purposes, the Prune-Pending state functions exactly
          like the Join state.

   In addition, the state machine uses two timers:

     Expiry Timer (ET)
          This timer is restarted when a valid Join(*,G) is received.
          Expiry of the Expiry Timer causes the interface state to
          revert to NoInfo for this group.

     Prune-Pending Timer (PPT)
          This timer is set when a valid Prune(*,G) is received. Expiry
          of the Prune-Pending Timer causes the interface state to
          revert to NoInfo for this group.

 Figure 2: Downstream per-interface (*,G) state machine in tabular form

+------------++--------------------------------------------------------+
|            ||                         Event                          |
|            ++-------------+--------------+-------------+-------------+
|Prev State  ||Receive      | Receive      | Prune-      | Expiry Timer|
|            ||Join(*,G)    | Prune(*,G)   | Pending     | Expires     |
|            ||             |              | Timer       |             |
|            ||             |              | Expires     |             |
+------------++-------------+--------------+-------------+-------------+
|            ||-> J state   | -> NI state  | -           | -           |
|NoInfo (NI) ||start Expiry |              |             |             |
|            ||Timer        |              |             |             |
+------------++-------------+--------------+-------------+-------------+
|            ||-> J state   | -> PP state  | -           | -> NI state |
|Join (J)    ||restart      | start Prune- |             |             |
|            ||Expiry Timer | Pending      |             |             |
|            ||             | Timer        |             |             |
+------------++-------------+--------------+-------------+-------------+
|Prune-      ||-> J state   | -> PP state  | -> NI state | -> NI state |
|Pending (PP)||restart      |              | Send Prune- |             |
|            ||Expiry Timer |              | Echo(*,G)   |             |
+------------++-------------+--------------+-------------+-------------+

   The transition events "Receive Join(*,G)" and "Receive Prune(*,G)"
   imply receiving a Join or Prune targeted to this router's primary IP
   address on the received interface. If the upstream neighbor address
   field is not correct, these state transitions in this state machine
   MUST NOT occur, although seeing such a packet may cause state
   transitions in other state machines.

   On unnumbered interfaces on point-to-point links, the router's
   address should be the same as the source address it chose for the
   Hello message it sent over that interface. However, on
   point-to-point links it is RECOMMENDED that for backwards
   compatibility PIM Join/Prune messages with an upstream neighbor
   address field of all zeros also be accepted.
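
   The table in Figure 2 can be read as a lookup from (state, event) to
   (next state, actions). The following Python sketch is a non-normative
   transcription of that table only; the enum names, the action strings,
   and the step() helper are assumptions of the example, and cells
   marked "-" are modeled as "stay in the same state, do nothing".

```python
from enum import Enum


class State(Enum):
    NI = "NoInfo"
    J = "Join"
    PP = "Prune-Pending"


class Event(Enum):
    RECV_JOIN = "Receive Join(*,G)"
    RECV_PRUNE = "Receive Prune(*,G)"
    PPT_EXPIRES = "Prune-Pending Timer Expires"
    ET_EXPIRES = "Expiry Timer Expires"


# One entry per non-"-" cell of Figure 2.
TRANSITIONS = {
    (State.NI, Event.RECV_JOIN): (State.J, ["start Expiry Timer"]),
    (State.NI, Event.RECV_PRUNE): (State.NI, []),
    (State.J, Event.RECV_JOIN): (State.J, ["restart Expiry Timer"]),
    (State.J, Event.RECV_PRUNE): (State.PP, ["start Prune-Pending Timer"]),
    (State.J, Event.ET_EXPIRES): (State.NI, []),
    (State.PP, Event.RECV_JOIN): (State.J, ["restart Expiry Timer"]),
    (State.PP, Event.RECV_PRUNE): (State.PP, []),
    (State.PP, Event.PPT_EXPIRES): (State.NI, ["send PruneEcho(*,G)"]),
    (State.PP, Event.ET_EXPIRES): (State.NI, []),
}


def step(state, event):
    """Return (next_state, actions); "-" cells leave the state unchanged."""
    return TRANSITIONS.get((state, event), (state, []))
```

   A caller is expected to invoke step() only for Join/Prune messages
   whose Upstream Neighbor Address check has already passed, since the
   table itself does not encode that precondition.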

   Transitions from NoInfo State

   When in NoInfo state, the following event may trigger a transition:

     Receive Join(*,G)
          A Join(*,G) is received on interface I with its Upstream
          Neighbor Address set to the router's primary IP address on I.

          The (*,G) downstream state machine on interface I transitions
          to the Join state. The Expiry Timer (ET) is started and set
          to the HoldTime from the triggering Join/Prune message.

   Transitions from Join State

   When in Join state, the following events may trigger a transition:

     Receive Join(*,G)
          A Join(*,G) is received on interface I with its Upstream
          Neighbor Address set to the router's primary IP address on I.

          The (*,G) downstream state machine on interface I remains in
          Join state, and the Expiry Timer (ET) is restarted, set to
          maximum of its current value and the HoldTime from the
          triggering Join/Prune message.

     Receive Prune(*,G)
          A Prune(*,G) is received on interface I with its Upstream
          Neighbor Address set to the router's primary IP address on I.

          The (*,G) downstream state machine on interface I transitions
          to the Prune-Pending state. The Prune-Pending Timer is
          started. It is set to the J/P_Override_Interval(I) if the
          router has more than one neighbor on that interface;
          otherwise, it is set to zero, causing it to expire
          immediately.

     Expiry Timer Expires
          The Expiry Timer for the (*,G) downstream state machine on
          interface I expires.

          The (*,G) downstream state machine on interface I transitions
          to the NoInfo state.

   Transitions from Prune-Pending State

   When in Prune-Pending state, the following events may trigger a
   transition:

     Receive Join(*,G)
          A Join(*,G) is received on interface I with its Upstream
          Neighbor Address set to the router's primary IP address on I.

          The (*,G) downstream state machine on interface I transitions
          to the Join state.
          The Prune-Pending Timer is canceled
          (without triggering an expiry event). The Expiry Timer is
          restarted, set to maximum of its current value and the
          HoldTime from the triggering Join/Prune message.

     Expiry Timer Expires
          The Expiry Timer for the (*,G) downstream state machine on
          interface I expires.

          The (*,G) downstream state machine on interface I transitions
          to the NoInfo state.

     Prune-Pending Timer Expires
          The Prune-Pending Timer for the (*,G) downstream state machine
          on interface I expires.

          The (*,G) downstream state machine on interface I transitions
          to the NoInfo state. A PruneEcho(*,G) is sent onto the subnet
          connected to interface I.

          The action "Send PruneEcho(*,G)" is triggered when the router
          stops forwarding on an interface as a result of a prune. A
          PruneEcho(*,G) is simply a Prune(*,G) message sent by the
          upstream router on a LAN with its own address in the Upstream
          Neighbor Address field. Its purpose is to add additional
          reliability so that if a Prune that should have been
          overridden by another router is lost locally on the LAN, then
          the PruneEcho may be received and cause the override to
          happen. A PruneEcho(*,G) need not be sent on an interface
          that contains only a single PIM neighbor during the time this
          state machine was in Prune-Pending state.

4.5.2. Receiving (S,G) Join/Prune Messages

   The per-interface state machine for receiving (S,G) Join/Prune
   messages is given below and is almost identical to that for (*,G)
   messages. There are three states:

     NoInfo (NI)
          The interface has no (S,G) Join state and no (S,G) timers
          running.

     Join (J)
          The interface has (S,G) Join state, which will cause the
          router to forward packets from S destined for G from this
          interface if the (S,G) state is active (the SPTbit is set)
          except if the router lost an assert on this interface.

     Prune-Pending (PP)
          The router has received a Prune(S,G) on this interface from a
          downstream neighbor and is waiting to see whether the prune
          will be overridden by another downstream router. For
          forwarding purposes, the Prune-Pending state functions exactly
          like the Join state.

   In addition, there are two timers:

     Expiry Timer (ET)
          This timer is set when a valid Join(S,G) is received. Expiry
          of the Expiry Timer causes this state machine to revert to
          NoInfo state.

     Prune-Pending Timer (PPT)
          This timer is set when a valid Prune(S,G) is received. Expiry
          of the Prune-Pending Timer causes this state machine to revert
          to NoInfo state.

 Figure 3: Downstream per-interface (S,G) state machine in tabular form

+------------++--------------------------------------------------------+
|            ||                         Event                          |
|            ++-------------+--------------+-------------+-------------+
|Prev State  ||Receive      | Receive      | Prune-      | Expiry Timer|
|            ||Join(S,G)    | Prune(S,G)   | Pending     | Expires     |
|            ||             |              | Timer       |             |
|            ||             |              | Expires     |             |
+------------++-------------+--------------+-------------+-------------+
|            ||-> J state   | -> NI state  | -           | -           |
|NoInfo (NI) ||start Expiry |              |             |             |
|            ||Timer        |              |             |             |
+------------++-------------+--------------+-------------+-------------+
|            ||-> J state   | -> PP state  | -           | -> NI state |
|Join (J)    ||restart      | start Prune- |             |             |
|            ||Expiry Timer | Pending      |             |             |
|            ||             | Timer        |             |             |
+------------++-------------+--------------+-------------+-------------+
|Prune-      ||-> J state   | -> PP state  | -> NI state | -> NI state |
|Pending (PP)||restart      |              | Send Prune- |             |
|            ||Expiry Timer |              | Echo(S,G)   |             |
+------------++-------------+--------------+-------------+-------------+

   The transition events "Receive Join(S,G)" and "Receive Prune(S,G)"
   imply receiving a Join or Prune targeted to this router's primary IP
2160 address on the received interface. If the upstream neighbor address 2161 field is not correct, these state transitions in this state machine 2162 MUST NOT occur, although seeing such a packet may cause state 2163 transitions in other state machines. 2165 On unnumbered interfaces on point-to-point links, the router's 2166 address SHOULD be the same as the source address it chose for the 2167 Hello message it sent over that interface. However, on point-to- 2168 point links it is RECOMMENDED that for backwards compatibility PIM 2169 Join/Prune messages with an upstream neighbor address field of all 2170 zeros also be accepted. 2172 Transitions from NoInfo State 2174 When in NoInfo state, the following event may trigger a transition: 2176 Receive Join(S,G) 2177 A Join(S,G) is received on interface I with its Upstream 2178 Neighbor Address set to the router's primary IP address on I. 2180 The (S,G) downstream state machine on interface I transitions 2181 to the Join state. The Expiry Timer (ET) is started and set 2182 to the HoldTime from the triggering Join/Prune message. 2184 Transitions from Join State 2186 When in Join state, the following events may trigger a transition: 2188 Receive Join(S,G) 2189 A Join(S,G) is received on interface I with its Upstream 2190 Neighbor Address set to the router's primary IP address on I. 2192 The (S,G) downstream state machine on interface I remains in 2193 Join state, and the Expiry Timer (ET) is restarted, set to 2194 maximum of its current value and the HoldTime from the 2195 triggering Join/Prune message. 2197 Receive Prune(S,G) 2198 A Prune(S,G) is received on interface I with its Upstream 2199 Neighbor Address set to the router's primary IP address on I. 2201 The (S,G) downstream state machine on interface I transitions 2202 to the Prune-Pending state. The Prune-Pending Timer is 2203 started. 
It is set to the J/P_Override_Interval(I) if the 2204 router has more than one neighbor on that interface; 2205 otherwise, it is set to zero, causing it to expire 2206 immediately. 2208 Expiry Timer Expires 2209 The Expiry Timer for the (S,G) downstream state machine on 2210 interface I expires. 2212 The (S,G) downstream state machine on interface I transitions 2213 to the NoInfo state. 2215 Transitions from Prune-Pending State 2217 When in Prune-Pending state, the following events may trigger a 2218 transition: 2220 Receive Join(S,G) 2221 A Join(S,G) is received on interface I with its Upstream 2222 Neighbor Address set to the router's primary IP address on I. 2224 The (S,G) downstream state machine on interface I transitions 2225 to the Join state. The Prune-Pending Timer is canceled 2226 (without triggering an expiry event). The Expiry Timer is 2227 restarted, set to maximum of its current value and the 2228 HoldTime from the triggering Join/Prune message. 2230 Expiry Timer Expires 2231 The Expiry Timer for the (S,G) downstream state machine on 2232 interface I expires. 2234 The (S,G) downstream state machine on interface I transitions 2235 to the NoInfo state. 2237 Prune-Pending Timer Expires 2238 The Prune-Pending Timer for the (S,G) downstream state machine 2239 on interface I expires. 2241 The (S,G) downstream state machine on interface I transitions 2242 to the NoInfo state. A PruneEcho(S,G) is sent onto the subnet 2243 connected to interface I. 2245 The action "Send PruneEcho(S,G)" is triggered when the router 2246 stops forwarding on an interface as a result of a prune. A 2247 PruneEcho(S,G) is simply a Prune(S,G) message sent by the 2248 upstream router on a LAN with its own address in the Upstream 2249 Neighbor Address field. Its purpose is to add additional 2250 reliability so that if a Prune that should have been 2251 overridden by another router is lost locally on the LAN, then 2252 the PruneEcho may be received and cause the override to 2253 happen. 
A PruneEcho(S,G) need not be sent on an interface 2254 that contains only a single PIM neighbor during the time this 2255 state machine was in Prune-Pending state. 2257 4.5.3. Receiving (S,G,rpt) Join/Prune Messages 2259 The per-interface state machine for receiving (S,G,rpt) Join/Prune 2260 messages is given below. There are five states: 2262 NoInfo (NI) 2263 The interface has no (S,G,rpt) Prune state and no (S,G,rpt) 2264 timers running. 2266 Prune (P) 2267 The interface has (S,G,rpt) Prune state, which will cause the 2268 router not to forward packets from S destined for G from this 2269 interface even though the interface has active (*,G) Join 2270 state. 2272 Prune-Pending (PP) 2273 The router has received a Prune(S,G,rpt) on this interface 2274 from a downstream neighbor and is waiting to see whether the 2275 prune will be overridden by another downstream router. For 2276 forwarding purposes, the Prune-Pending state functions exactly 2277 like the NoInfo state. 2279 PruneTmp (P') 2280 This state is a transient state that for forwarding purposes 2281 behaves exactly like the Prune state. A (*,G) Join has been 2282 received (which may cancel the (S,G,rpt) Prune). As we parse 2283 the Join/Prune message from top to bottom, we first enter this 2284 state if the message contains a (*,G) Join. Later in the 2285 message, we will normally encounter an (S,G,rpt) prune to 2286 reinstate the Prune state. However, if we reach the end of 2287 the message without encountering such an (S,G,rpt) prune, then 2288 we will revert to NoInfo state in this state machine. 2290 As no time is spent in this state, no timers can expire. 2292 Prune-Pending-Tmp (PP') 2293 This state is a transient state that is identical to P' except 2294 that it is associated with the PP state rather than the P 2295 state. For forwarding purposes, PP' behaves exactly like PP 2296 state. 
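   As an illustration only, the forwarding behavior of the five states
   described above can be sketched in Python. The names RptState and
   blocks_forwarding are hypothetical and not part of this
   specification; the mapping simply restates which state each one is
   said to behave like for forwarding purposes.

```python
from enum import Enum


class RptState(Enum):
    """Downstream per-interface (S,G,rpt) states, named as in the text."""
    NO_INFO = "NI"
    PRUNE = "P"
    PRUNE_PENDING = "PP"
    PRUNE_TMP = "P'"
    PRUNE_PENDING_TMP = "PP'"


# Forwarding-equivalent state for each state, per the descriptions above:
# PP forwards like NI, P' like P, and PP' like PP (hence like NI).
_FORWARDS_LIKE = {
    RptState.NO_INFO: RptState.NO_INFO,
    RptState.PRUNE: RptState.PRUNE,
    RptState.PRUNE_PENDING: RptState.NO_INFO,
    RptState.PRUNE_TMP: RptState.PRUNE,
    RptState.PRUNE_PENDING_TMP: RptState.NO_INFO,
}


def blocks_forwarding(state: RptState) -> bool:
    """True if the state stops packets from S to G on this interface."""
    return _FORWARDS_LIKE[state] is RptState.PRUNE
```

   Only the Prune and PruneTmp states withhold traffic; while a prune is
   still pending, the interface keeps forwarding as if no prune had
   arrived.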
   In addition, there are two timers:

     Expiry Timer (ET)
          This timer is set when a valid Prune(S,G,rpt) is received.
          Expiry of the Expiry Timer causes this state machine to revert
          to NoInfo state.

     Prune-Pending Timer (PPT)
          This timer is set when a valid Prune(S,G,rpt) is received.
          Expiry of the Prune-Pending Timer causes this state machine to
          move on to Prune state.

Figure 4: Downstream per-interface (S,G,rpt) state machine
in tabular form

+----------++----------------------------------------------------------+
|          ||                          Event                           |
|          ++---------+----------+----------+--------+--------+--------+
|Prev      ||Receive  | Receive  | Receive  | End of | Prune- | Expiry |
|State     ||Join(*,G)| Join     | Prune    | Message| Pending| Timer  |
|          ||         | (S,G,rpt)| (S,G,rpt)|        | Timer  | Expires|
|          ||         |          |          |        | Expires|        |
+----------++---------+----------+----------+--------+--------+--------+
|          ||-        | -        | -> PP    | -      | -      | -      |
|          ||         |          | state    |        |        |        |
|          ||         |          | start    |        |        |        |
|NoInfo    ||         |          | Prune-   |        |        |        |
|(NI)      ||         |          | Pending  |        |        |        |
|          ||         |          | Timer;   |        |        |        |
|          ||         |          | start    |        |        |        |
|          ||         |          | Expiry   |        |        |        |
|          ||         |          | Timer    |        |        |        |
+----------++---------+----------+----------+--------+--------+--------+
|          ||-> P'    | -> NI    | -> P     | -      | -      | -> NI  |
|          ||state    | state    | state    |        |        | state  |
|Prune (P) ||         |          | restart  |        |        |        |
|          ||         |          | Expiry   |        |        |        |
|          ||         |          | Timer    |        |        |        |
+----------++---------+----------+----------+--------+--------+--------+
|Prune-    ||-> PP'   | -> NI    | -        | -      | -> P   | -      |
|Pending   ||state    | state    |          |        | state  |        |
|(PP)      ||         |          |          |        |        |        |
+----------++---------+----------+----------+--------+--------+--------+
|          ||-        | -        | -> P     | -> NI  | -      | -      |
|PruneTmp  ||         |          | state    | state  |        |        |
|(P')      ||         |          | restart  |        |        |        |
|          ||         |          | Expiry   |        |        |        |
|          ||         |          | Timer    |        |        |        |
+----------++---------+----------+----------+--------+--------+--------+
|          ||-        | -        | -> PP    | -> NI  | -      | -      |
|Prune-    ||         |          | state    | state  |        |        |
|Pending-  ||         |          | restart  |        |        |        |
|Tmp (PP') ||         |          | Expiry   |        |        |        |
|          ||         |          | Timer    |        |        |        |
+----------++---------+----------+----------+--------+--------+--------+

   The transition events "Receive Join(S,G,rpt)", "Receive
   Prune(S,G,rpt)", and "Receive Join(*,G)" imply receiving a Join or
   Prune targeted to this router's primary IP address on the received
   interface. If the upstream neighbor address field is not correct,
   these state transitions in this state machine MUST NOT occur,
   although seeing such a packet may cause state transitions in other
   state machines.

   On unnumbered interfaces on point-to-point links, the router's
   address should be the same as the source address it chose for the
   Hello message it sent over that interface. However, on point-to-point
   links it is RECOMMENDED that PIM Join/Prune messages with an
   upstream neighbor address field of all zeros also be accepted.

   Transitions from NoInfo State

   When in NoInfo (NI) state, the following event may trigger a
   transition:

     Receive Prune(S,G,rpt)
          A Prune(S,G,rpt) is received on interface I with its Upstream
          Neighbor Address set to the router's primary IP address on I.

          The (S,G,rpt) downstream state machine on interface I
          transitions to the Prune-Pending state. The Expiry Timer (ET)
          is started and set to the HoldTime from the triggering
          Join/Prune message. The Prune-Pending Timer is started. It
          is set to the J/P_Override_Interval(I) if the router has more
          than one neighbor on that interface; otherwise, it is set to
          zero, causing it to expire immediately.
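   A minimal Python sketch of the NoInfo transition above, assuming
   timers are represented as plain numbers of seconds. The function
   name, the dictionary layout, and the example values are
   hypothetical; only the rule for choosing the Prune-Pending Timer
   value comes from the text.

```python
def start_timers_on_prune(holdtime: float, neighbor_count: int,
                          jp_override_interval: float) -> dict:
    """Timer values set when Prune(S,G,rpt) arrives in NoInfo state."""
    # More than one neighbor: leave time for another router to override.
    # Otherwise nobody else can object, so the timer expires immediately.
    ppt = jp_override_interval if neighbor_count > 1 else 0.0
    return {"state": "PP", "ET": holdtime, "PPT": ppt}


# Illustrative values only: a HoldTime of 210 s and an override interval
# of 3 s are assumptions made for this example.
lan = start_timers_on_prune(210.0, neighbor_count=3, jp_override_interval=3.0)
p2p = start_timers_on_prune(210.0, neighbor_count=1, jp_override_interval=3.0)
```

   With a single neighbor the Prune-Pending Timer is zero, so the
   machine passes through Prune-Pending and reaches Prune state without
   delay.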
2385 Transitions from Prune-Pending State 2387 When in Prune-Pending (PP) state, the following events may trigger a 2388 transition: 2390 Receive Join(*,G) 2391 A Join(*,G) is received on interface I with its Upstream 2392 Neighbor Address set to the router's primary IP address on I. 2394 The (S,G,rpt) downstream state machine on interface I 2395 transitions to Prune-Pending-Tmp state whilst the remainder of 2396 the compound Join/Prune message containing the Join(*,G) is 2397 processed. 2399 Receive Join(S,G,rpt) 2400 A Join(S,G,rpt) is received on interface I with its Upstream 2401 Neighbor Address set to the router's primary IP address on I. 2403 The (S,G,rpt) downstream state machine on interface I 2404 transitions to NoInfo state. ET and PPT are canceled. 2406 Prune-Pending Timer Expires 2407 The Prune-Pending Timer for the (S,G,rpt) downstream state 2408 machine on interface I expires. 2410 The (S,G,rpt) downstream state machine on interface I 2411 transitions to the Prune state. 2413 Transitions from Prune State 2415 When in Prune (P) state, the following events may trigger a 2416 transition: 2418 Receive Join(*,G) 2419 A Join(*,G) is received on interface I with its Upstream 2420 Neighbor Address set to the router's primary IP address on I. 2422 The (S,G,rpt) downstream state machine on interface I 2423 transitions to PruneTmp state whilst the remainder of the 2424 compound Join/Prune message containing the Join(*,G) is 2425 processed. 2427 Receive Join(S,G,rpt) 2428 A Join(S,G,rpt) is received on interface I with its Upstream 2429 Neighbor Address set to the router's primary IP address on I. 2431 The (S,G,rpt) downstream state machine on interface I 2432 transitions to NoInfo state. ET and PPT are canceled. 2434 Receive Prune(S,G,rpt) 2435 A Prune(S,G,rpt) is received on interface I with its Upstream 2436 Neighbor Address set to the router's primary IP address on I. 2438 The (S,G,rpt) downstream state machine on interface I remains 2439 in Prune state. 
The Expiry Timer (ET) is restarted, set to 2440 maximum of its current value and the HoldTime from the 2441 triggering Join/Prune message. 2443 Expiry Timer Expires 2444 The Expiry Timer for the (S,G,rpt) downstream state machine on 2445 interface I expires. 2447 The (S,G,rpt) downstream state machine on interface I 2448 transitions to the NoInfo state. 2450 Transitions from Prune-Pending-Tmp State 2452 When in Prune-Pending-Tmp (PP') state and processing a compound 2453 Join/Prune message, the following events may trigger a transition: 2455 Receive Prune(S,G,rpt) 2456 The compound Join/Prune message contains a Prune(S,G,rpt) that 2457 is received on interface I with its Upstream Neighbor Address 2458 set to the router's primary IP address on I. 2460 The (S,G,rpt) downstream state machine on interface I 2461 transitions back to the Prune-Pending state. The Expiry Timer 2462 (ET) is restarted, set to maximum of its current value and the 2463 HoldTime from the triggering Join/Prune message. 2465 End of Message 2466 The end of the compound Join/Prune message is reached. 2468 The (S,G,rpt) downstream state machine on interface I 2469 transitions to the NoInfo state. ET and PPT are canceled. 2471 Transitions from PruneTmp State 2473 When in PruneTmp (P') state and processing a compound Join/Prune 2474 message, the following events may trigger a transition: 2476 Receive Prune(S,G,rpt) 2477 The compound Join/Prune message contains a Prune(S,G,rpt). 2479 The (S,G,rpt) downstream state machine on interface I 2480 transitions back to the Prune state. The Expiry Timer (ET) is 2481 restarted, set to maximum of its current value and the 2482 HoldTime from the triggering Join/Prune message. 2484 End of Message 2485 The end of the compound Join/Prune message is reached. 2487 The (S,G,rpt) downstream state machine on interface I 2488 transitions to the NoInfo state. ET is canceled. 2490 Notes: 2492 Receiving a Prune(*,G) does not affect the (S,G,rpt) downstream state 2493 machine. 
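   The transitions of this section can be condensed into a lookup
   table. The following Python sketch is illustrative and not
   normative: event names such as join_star_g are invented labels,
   timers are left out entirely (timer expiry is modelled as an
   externally delivered event), and the check of the Upstream Neighbor
   Address is assumed to have been done already.

```python
# Transition table for the downstream per-interface (S,G,rpt) machine.
# Keys are (state, event); a missing key means "no change", which is
# shown as "-" in Figure 4.
TRANSITIONS = {
    ("NI", "prune_sg_rpt"): "PP",
    ("P", "join_star_g"): "P'",
    ("P", "join_sg_rpt"): "NI",
    ("P", "prune_sg_rpt"): "P",
    ("P", "et_expires"): "NI",
    ("PP", "join_star_g"): "PP'",
    ("PP", "join_sg_rpt"): "NI",
    ("PP", "ppt_expires"): "P",
    ("P'", "prune_sg_rpt"): "P",
    ("P'", "end_of_message"): "NI",
    ("PP'", "prune_sg_rpt"): "PP",
    ("PP'", "end_of_message"): "NI",
}


def step(state: str, event: str) -> str:
    """Apply one event; unknown (state, event) pairs leave the state as is."""
    return TRANSITIONS.get((state, event), state)


def process_message(state: str, entries: list) -> str:
    """Run one compound Join/Prune message, top to bottom, then end it."""
    for event in entries:
        state = step(state, event)
    return step(state, "end_of_message")
```

   For example, a message carrying a Join(*,G) followed by a
   Prune(S,G,rpt) leaves an interface in Prune state unchanged, while a
   message carrying only the Join(*,G) clears the prune and returns the
   machine to NoInfo.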
2495 4.5.4. Sending (*,G) Join/Prune Messages 2497 The per-interface state machines for (*,G) hold join state from 2498 downstream PIM routers. This state then determines whether a router 2499 needs to propagate a Join(*,G) upstream towards the RP. 2501 If a router wishes to propagate a Join(*,G) upstream, it must also 2502 watch for messages on its upstream interface from other routers on 2503 that subnet, and these may modify its behavior. If it sees a 2504 Join(*,G) to the correct upstream neighbor, it should suppress its 2505 own Join(*,G). If it sees a Prune(*,G) to the correct upstream 2506 neighbor, it should be prepared to override that prune by sending a 2507 Join(*,G) almost immediately. Finally, if it sees the Generation ID 2508 (see Section 4.3) of the correct upstream neighbor change, it knows 2509 that the upstream neighbor has lost state, and it should be prepared 2510 to refresh the state by sending a Join(*,G) almost immediately. 2512 If a (*,G) Assert occurs on the upstream interface, and this changes 2513 this router's idea of the upstream neighbor, it should be prepared to 2514 ensure that the Assert winner is aware of downstream routers by 2515 sending a Join(*,G) almost immediately. 2517 In addition, if the MRIB changes to indicate that the next hop 2518 towards the RP has changed, and either the upstream interface changes 2519 or there is no Assert winner on the upstream interface, the router 2520 should prune off from the old next hop and join towards the new next 2521 hop. 2523 The upstream (*,G) state machine only contains two states: 2525 Not Joined 2526 The downstream state machines indicate that the router does not 2527 need to join the RP tree for this group. 2529 Joined 2530 The downstream state machines indicate that the router should join 2531 the RP tree for this group. 2533 In addition, one timer JT(*,G) is kept that is used to trigger the 2534 sending of a Join(*,G) to the upstream next hop towards the RP, 2535 RPF'(*,G). 
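   A minimal sketch of this two-state machine in Python, using the
   actions that Figure 5 assigns to each transition. The class name,
   the message log, and the value chosen for t_periodic are
   hypothetical; messages are recorded rather than sent, and the
   adjustments triggered by overheard messages are not modelled here.

```python
class UpstreamStarG:
    """Two-state upstream (*,G) machine; actions are logged, not sent."""

    def __init__(self, t_periodic: float):
        self.joined = False
        self.join_timer = None   # seconds until JT(*,G) fires; None = stopped
        self.t_periodic = t_periodic
        self.sent = []           # messages that would go to RPF'(*,G)

    def on_join_desired(self, desired: bool) -> None:
        """React to a change in the value of JoinDesired(*,G)."""
        if desired and not self.joined:
            self.joined = True
            self.sent.append("Join(*,G)")
            self.join_timer = self.t_periodic
        elif not desired and self.joined:
            self.joined = False
            self.sent.append("Prune(*,G)")
            self.join_timer = None

    def on_join_timer_expiry(self) -> None:
        """Periodic refresh while in Joined state."""
        if self.joined:
            self.sent.append("Join(*,G)")
            self.join_timer = self.t_periodic
```

   Calling on_join_desired(True) twice in a row sends a single Join,
   since the second call finds the machine already in Joined state.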
Figure 5: Upstream (*,G) state machine in tabular form

+-------------------++-------------------------------------------------+
|                   ||                      Event                      |
| Prev State        ++------------------------+------------------------+
|                   || JoinDesired(*,G)       | JoinDesired(*,G)       |
|                   || ->True                 | ->False                |
+-------------------++------------------------+------------------------+
|                   || -> J state             | -                      |
| NotJoined (NJ)    || Send Join(*,G);        |                        |
|                   || Set Join Timer to      |                        |
|                   || t_periodic             |                        |
+-------------------++------------------------+------------------------+
| Joined (J)        || -                      | -> NJ state            |
|                   ||                        | Send Prune(*,G);       |
|                   ||                        | Cancel Join Timer      |
+-------------------++------------------------+------------------------+

   In addition, we have the following transitions, which occur within
   the Joined state:

+----------------------------------------------------------------------+
|                         In Joined (J) State                          |
+----------------+-----------------+-----------------+-----------------+
|Timer Expires   | See Join(*,G)   | See Prune(*,G)  | RPF'(*,G)       |
|                | to RPF'(*,G)    | to RPF'(*,G)    | changes due to  |
|                |                 |                 | an Assert       |
+----------------+-----------------+-----------------+-----------------+
|Send            | Increase Join   | Decrease Join   | Decrease Join   |
|Join(*,G); Set  | Timer to        | Timer to        | Timer to        |
|Join Timer to   | t_joinsuppress  | t_override      | t_override      |
|t_periodic      |                 |                 |                 |
+----------------+-----------------+-----------------+-----------------+

+----------------------------------------------------------------------+
|                         In Joined (J) State                          |
+----------------------------------+-----------------------------------+
| RPF'(*,G) changes not            | RPF'(*,G) GenID changes           |
| due to an Assert                 |                                   |
+----------------------------------+-----------------------------------+
| Send Join(*,G) to new            | Decrease Join Timer to            |
| next hop; Send                   | t_override                        |
| Prune(*,G) to old next           |                                   |
| hop; Set Join Timer to           |                                   |
| t_periodic                       |                                   |
+----------------------------------+-----------------------------------+

   This state machine uses the following macro:

     bool JoinDesired(*,G) {
        if (immediate_olist(*,G) != NULL)
            return TRUE
        else
            return FALSE
     }

   JoinDesired(*,G) is true when the router has forwarding state that
   would cause it to forward traffic for G using shared tree state.
   Note that although JoinDesired is true, the router's sending of a
   Join(*,G) message may be suppressed by another router sending a
   Join(*,G) onto the upstream interface.

   Transitions from NotJoined State

   When the upstream (*,G) state machine is in NotJoined state, the
   following event may trigger a state transition:

     JoinDesired(*,G) becomes True
          The macro JoinDesired(*,G) becomes True, e.g., because the
          downstream state for (*,G) has changed so that at least one
          interface is in immediate_olist(*,G).

          The upstream (*,G) state machine transitions to Joined state.
          Send Join(*,G) to the appropriate upstream neighbor, which is
          RPF'(*,G). Set the Join Timer (JT) to expire after t_periodic
          seconds.

   Transitions from Joined State

   When the upstream (*,G) state machine is in Joined state, the
   following events may trigger state transitions:

     JoinDesired(*,G) becomes False
          The macro JoinDesired(*,G) becomes False, e.g., because the
          downstream state for (*,G) has changed so no interface is in
          immediate_olist(*,G).

          The upstream (*,G) state machine transitions to NotJoined
          state. Send Prune(*,G) to the appropriate upstream neighbor,
          which is RPF'(*,G). Cancel the Join Timer (JT).

     Join Timer Expires
          The Join Timer (JT) expires, indicating time to send a
          Join(*,G).

          Send Join(*,G) to the appropriate upstream neighbor, which is
          RPF'(*,G).
Restart the Join Timer (JT) to expire after 2632 t_periodic seconds. 2634 See Join(*,G) to RPF'(*,G) 2635 This event is only relevant if RPF_interface(RP(G)) is a 2636 shared medium. This router sees another router on 2637 RPF_interface(RP(G)) send a Join(*,G) to RPF'(*,G). This 2638 causes this router to suppress its own Join. 2640 The upstream (*,G) state machine remains in Joined state. 2642 Let t_joinsuppress be the minimum of t_suppressed and the 2643 HoldTime from the Join/Prune message triggering this event. If 2644 the Join Timer is set to expire in less than t_joinsuppress 2645 seconds, reset it so that it expires after t_joinsuppress 2646 seconds. If the Join Timer is set to expire in more than 2647 t_joinsuppress seconds, leave it unchanged. 2649 See Prune(*,G) to RPF'(*,G) 2650 This event is only relevant if RPF_interface(RP(G)) is a 2651 shared medium. This router sees another router on 2652 RPF_interface(RP(G)) send a Prune(*,G) to RPF'(*,G). As this 2653 router is in Joined state, it must override the Prune after a 2654 short random interval. 2656 The upstream (*,G) state machine remains in Joined state. If 2657 the Join Timer is set to expire in more than t_override 2658 seconds, reset it so that it expires after t_override seconds. 2659 If the Join Timer is set to expire in less than t_override 2660 seconds, leave it unchanged. 2662 RPF'(*,G) changes due to an Assert 2663 The current next hop towards the RP changes due to an 2664 Assert(*,G) on the RPF_interface(RP(G)). 2666 The upstream (*,G) state machine remains in Joined state. If 2667 the Join Timer is set to expire in more than t_override 2668 seconds, reset it so that it expires after t_override seconds. 2669 If the Join Timer is set to expire in less than t_override 2670 seconds, leave it unchanged. 2672 RPF'(*,G) changes not due to an Assert 2673 An event occurred that caused the next hop towards the RP for 2674 G to change. 
This may be caused by a change in the MRIB 2675 routing database or the group-to-RP mapping. Note that this 2676 transition does not occur if an Assert is active and the 2677 upstream interface does not change. 2679 The upstream (*,G) state machine remains in Joined state. Send 2680 Join(*,G) to the new upstream neighbor, which is the new value 2681 of RPF'(*,G). Send Prune(*,G) to the old upstream neighbor, 2682 which is the old value of RPF'(*,G). Use the new value of 2683 RP(G) in the Prune(*,G) message or all zeros if RP(G) becomes 2684 unknown (old value of RP(G) may be used instead to improve 2685 behavior in routers implementing older versions of this spec). 2686 Set the Join Timer (JT) to expire after t_periodic seconds. 2688 RPF'(*,G) GenID changes 2689 The Generation ID of the router that is RPF'(*,G) changes. 2690 This normally means that this neighbor has lost state, and so 2691 the state must be refreshed. 2693 The upstream (*,G) state machine remains in Joined state. If 2694 the Join Timer is set to expire in more than t_override 2695 seconds, reset it so that it expires after t_override seconds. 2697 4.5.5. Sending (S,G) Join/Prune Messages 2699 The per-interface state machines for (S,G) hold join state from 2700 downstream PIM routers. This state then determines whether a router 2701 needs to propagate a Join(S,G) upstream towards the source. 2703 If a router wishes to propagate a Join(S,G) upstream, it must also 2704 watch for messages on its upstream interface from other routers on 2705 that subnet, and these may modify its behavior. If it sees a 2706 Join(S,G) to the correct upstream neighbor, it should suppress its 2707 own Join(S,G). If it sees a Prune(S,G), Prune(S,G,rpt), or 2708 Prune(*,G) to the correct upstream neighbor towards S, it should be 2709 prepared to override that prune by scheduling a Join(S,G) to be sent 2710 almost immediately. 
Finally, if it sees the Generation ID of its 2711 upstream neighbor change, it knows that the upstream neighbor has 2712 lost state, and it should refresh the state by scheduling a Join(S,G) 2713 to be sent almost immediately. 2715 If an (S,G) Assert occurs on the upstream interface, and this changes 2716 this router's idea of the upstream neighbor, it should be prepared to 2717 ensure that the Assert winner is aware of downstream routers by 2718 scheduling a Join(S,G) to be sent almost immediately. 2720 In addition, if MRIB changes cause the next hop towards the source to 2721 change, and either the upstream interface changes or there is no 2722 Assert winner on the upstream interface, the router should send a 2723 prune to the old next hop and a join to the new next hop. 2725 The upstream (S,G) state machine only contains two states: 2727 Not Joined 2728 The downstream state machines and local membership information do 2729 not indicate that the router needs to join the shortest-path tree 2730 for this (S,G). 2732 Joined 2733 The downstream state machines and local membership information 2734 indicate that the router should join the shortest-path tree for 2735 this (S,G). 2737 In addition, one timer JT(S,G) is kept that is used to trigger the 2738 sending of a Join(S,G) to the upstream next hop towards S, RPF'(S,G). 
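   The reactions listed above (suppression on an overheard Join,
   override on an overheard Prune, a GenID change, or an Assert-driven
   change of the upstream neighbor) all reduce to two adjustments of
   the Join Timer, as spelled out for (*,G) in Section 4.5.4. The
   Python sketch below is illustrative only: the function names are
   hypothetical, timers are modelled as the number of seconds remaining,
   and the numeric values in the usage lines are assumptions.

```python
def on_see_join(remaining: float, t_suppressed: float,
                holdtime: float) -> float:
    """Overheard Join to our upstream neighbor: push our own Join back."""
    t_joinsuppress = min(t_suppressed, holdtime)
    # The timer is only ever increased by this event, never shortened.
    return max(remaining, t_joinsuppress)


def on_see_prune(remaining: float, t_override: float) -> float:
    """Overheard Prune, GenID change, or Assert change: pull Join forward."""
    # The timer is only ever decreased by these events, never extended.
    return min(remaining, t_override)


# Illustrative values, in seconds.
suppressed = on_see_join(remaining=10.0, t_suppressed=66.0, holdtime=210.0)
overridden = on_see_prune(remaining=40.0, t_override=1.5)
```

   Using max and min keeps both rules one-directional: a Join that is
   already scheduled later than t_joinsuppress, or sooner than
   t_override, is left unchanged.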
Figure 6: Upstream (S,G) state machine in tabular form

+-------------------+--------------------------------------------------+
|                   |                      Event                       |
| Prev State        +-------------------------+------------------------+
|                   | JoinDesired(S,G)        | JoinDesired(S,G)       |
|                   | ->True                  | ->False                |
+-------------------+-------------------------+------------------------+
| NotJoined (NJ)    | -> J state              | -                      |
|                   | Send Join(S,G);         |                        |
|                   | Set Join Timer to       |                        |
|                   | t_periodic              |                        |
+-------------------+-------------------------+------------------------+
| Joined (J)        | -                       | -> NJ state            |
|                   |                         | Send Prune(S,G);       |
|                   |                         | Set SPTbit(S,G) to     |
|                   |                         | FALSE; Cancel Join     |
|                   |                         | Timer                  |
+-------------------+-------------------------+------------------------+

   In addition, we have the following transitions, which occur within
   the Joined state:

+----------------------------------------------------------------------+
|                         In Joined (J) State                          |
+-----------------+-----------------+-----------------+----------------+
| Timer Expires   | See Join(S,G)   | See Prune(S,G)  | See Prune      |
|                 | to RPF'(S,G)    | to RPF'(S,G)    | (S,G,rpt) to   |
|                 |                 |                 | RPF'(S,G)      |
+-----------------+-----------------+-----------------+----------------+
| Send            | Increase Join   | Decrease Join   | Decrease Join  |
| Join(S,G); Set  | Timer to        | Timer to        | Timer to       |
| Join Timer to   | t_joinsuppress  | t_override      | t_override     |
| t_periodic      |                 |                 |                |
+-----------------+-----------------+-----------------+----------------+

+----------------------------------------------------------------------+
|                         In Joined (J) State                          |
+-----------------+-----------------+----------------+-----------------+
| See Prune(*,G)  | RPF'(S,G)       | RPF'(S,G)      | RPF'(S,G)       |
| to RPF'(S,G)    | changes not     | GenID changes  | changes due to  |
|                 | due to an       |                | an Assert       |
|                 | Assert          |                |                 |
+-----------------+-----------------+----------------+-----------------+
| Decrease Join   | Send Join(S,G)  | Decrease Join  | Decrease Join   |
| Timer to        | to new next     | Timer to       | Timer to        |
| t_override      | hop; Send       | t_override     | t_override      |
|                 | Prune(S,G) to   |                |                 |
|                 | old next hop;   |                |                 |
|                 | Set Join Timer  |                |                 |
|                 | to t_periodic   |                |                 |
+-----------------+-----------------+----------------+-----------------+

   This state machine uses the following macro:

     bool JoinDesired(S,G) {
        return( immediate_olist(S,G) != NULL
                OR ( KeepaliveTimer(S,G) is running
                     AND inherited_olist(S,G) != NULL ) )
     }

   JoinDesired(S,G) is true when the router has forwarding state that
   would cause it to forward traffic for G using source tree state. The
   source tree state can be as a result of either active source-specific
   join state, or the (S,G) Keepalive Timer and active non-source-specific
   state. Note that although JoinDesired is true, the router's
   sending of a Join(S,G) message may be suppressed by another router
   sending a Join(S,G) onto the upstream interface.

   Transitions from NotJoined State

   When the upstream (S,G) state machine is in NotJoined state, the
   following event may trigger a state transition:

     JoinDesired(S,G) becomes True
          The macro JoinDesired(S,G) becomes True, e.g., because the
          downstream state for (S,G) has changed so that at least one
          interface is in inherited_olist(S,G).

          The upstream (S,G) state machine transitions to Joined state.
          Send Join(S,G) to the appropriate upstream neighbor, which is
          RPF'(S,G). Set the Join Timer (JT) to expire after t_periodic
          seconds.
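   The JoinDesired(S,G) macro that drives this transition can be
   restated as a small Python function. This is a sketch under stated
   assumptions: the two olists are modelled as sets of interface names
   (an empty set standing for NULL), the Keepalive Timer is reduced to
   a boolean, and the interface name "eth1" is made up for the example.

```python
def join_desired_sg(immediate_olist: set, inherited_olist: set,
                    keepalive_running: bool) -> bool:
    """JoinDesired(S,G) with olists modelled as sets of interface names."""
    # Either explicit (S,G) join state exists, or traffic from S is
    # flowing (Keepalive Timer running) and some interface inherits it.
    return bool(immediate_olist) or (
        keepalive_running and bool(inherited_olist))


# Explicit (S,G) joins alone are enough.
explicit = join_desired_sg({"eth1"}, {"eth1"}, keepalive_running=False)
# Inherited state counts only while the Keepalive Timer is running.
inherited_live = join_desired_sg(set(), {"eth1"}, keepalive_running=True)
inherited_idle = join_desired_sg(set(), {"eth1"}, keepalive_running=False)
```

   When the result changes from false to true, the machine performs the
   NotJoined transition described above.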
2823 Transitions from Joined State 2825 When the upstream (S,G) state machine is in Joined state, the 2826 following events may trigger state transitions: 2828 JoinDesired(S,G) becomes False 2829 The macro JoinDesired(S,G) becomes False, e.g., because the 2830 downstream state for (S,G) has changed so no interface is in 2831 inherited_olist(S,G). 2833 The upstream (S,G) state machine transitions to NotJoined 2834 state. Send Prune(S,G) to the appropriate upstream neighbor, 2835 which is RPF'(S,G). Cancel the Join Timer (JT), and set 2836 SPTbit(S,G) to FALSE. 2838 Join Timer Expires 2839 The Join Timer (JT) expires, indicating time to send a 2840 Join(S,G) 2842 Send Join(S,G) to the appropriate upstream neighbor, which is 2843 RPF'(S,G). Restart the Join Timer (JT) to expire after 2844 t_periodic seconds. 2846 See Join(S,G) to RPF'(S,G) 2847 This event is only relevant if RPF_interface(S) is a shared 2848 medium. This router sees another router on RPF_interface(S) 2849 send a Join(S,G) to RPF'(S,G). This causes this router to 2850 suppress its own Join. 2852 The upstream (S,G) state machine remains in Joined state. 2854 Let t_joinsuppress be the minimum of t_suppressed and the 2855 HoldTime from the Join/Prune message triggering this event. 2857 If the Join Timer is set to expire in less than t_joinsuppress 2858 seconds, reset it so that it expires after t_joinsuppress 2859 seconds. If the Join Timer is set to expire in more than 2860 t_joinsuppress seconds, leave it unchanged. 2862 See Prune(S,G) to RPF'(S,G) 2863 This event is only relevant if RPF_interface(S) is a shared 2864 medium. This router sees another router on RPF_interface(S) 2865 send a Prune(S,G) to RPF'(S,G). As this router is in Joined 2866 state, it must override the Prune after a short random 2867 interval. 2869 The upstream (S,G) state machine remains in Joined state. 
If 2870 the Join Timer is set to expire in more than t_override 2871 seconds, reset it so that it expires after t_override seconds. 2873 See Prune(S,G,rpt) to RPF'(S,G) 2874 This event is only relevant if RPF_interface(S) is a shared 2875 medium. This router sees another router on RPF_interface(S) 2876 send a Prune(S,G,rpt) to RPF'(S,G). If the upstream router is 2877 an RFC-2362-compliant PIM router, then the Prune(S,G,rpt) will 2878 cause it to stop forwarding. For backwards compatibility, 2879 this router should override the prune so that forwarding 2880 continues. 2882 The upstream (S,G) state machine remains in Joined state. If 2883 the Join Timer is set to expire in more than t_override 2884 seconds, reset it so that it expires after t_override seconds. 2886 See Prune(*,G) to RPF'(S,G) 2887 This event is only relevant if RPF_interface(S) is a shared 2888 medium. This router sees another router on RPF_interface(S) 2889 send a Prune(*,G) to RPF'(S,G). If the upstream router is an 2890 RFC-2362-compliant PIM router, then the Prune(*,G) will cause 2891 it to stop forwarding. For backwards compatibility, this 2892 router should override the prune so that forwarding continues. 2894 The upstream (S,G) state machine remains in Joined state. If 2895 the Join Timer is set to expire in more than t_override 2896 seconds, reset it so that it expires after t_override seconds. 2898 RPF'(S,G) changes due to an Assert 2899 The current next hop towards S changes due to an Assert(S,G) 2900 on the RPF_interface(S). 2902 The upstream (S,G) state machine remains in Joined state. If 2903 the Join Timer is set to expire in more than t_override 2904 seconds, reset it so that it expires after t_override seconds. 2905 If the Join Timer is set to expire in less than t_override 2906 seconds, leave it unchanged. 2908 RPF'(S,G) changes not due to an Assert 2909 An event occurred that caused the next hop towards S to 2910 change. 
Note that this transition does not occur if an Assert
2911         is active and the upstream interface does not change.

2913         The upstream (S,G) state machine remains in Joined state. Send
2914         Join(S,G) to the new upstream neighbor, which is the new value
2915         of RPF'(S,G). Send Prune(S,G) to the old upstream neighbor,
2916         which is the old value of RPF'(S,G). Set the Join Timer (JT)
2917         to expire after t_periodic seconds.

2919      RPF'(S,G) GenID changes
2920         The Generation ID of the router that is RPF'(S,G) changes.
2921         This normally means that this neighbor has lost state, and so
2922         the state must be refreshed.

2924         The upstream (S,G) state machine remains in Joined state. If
2925         the Join Timer is set to expire in more than t_override
2926         seconds, reset it so that it expires after t_override seconds.

2928   4.5.6. (S,G,rpt) Periodic Messages

2930   (S,G,rpt) Joins and Prunes are (S,G) Joins or Prunes sent on the RP
2931   tree with the RPT bit set, either to modify the results of (*,G)
2932   Joins, or to override the behavior of other upstream LAN peers. The
2933   next section describes the rules for sending triggered messages.
2934   This section describes the rules for including a Prune(S,G,rpt)
2935   message with a Join(*,G).

2937   When a router is going to send a Join(*,G), it should use the
2938   following pseudocode, for each (S,G) for which it has state, to
2939   decide whether to include a Prune(S,G,rpt) in the compound Join/Prune
2940   message:

2942     if( SPTbit(S,G) == TRUE ) {
2943         # Note: If receiving (S,G) on the SPT, we only prune off the
2944         # shared tree if the RPF neighbors differ.
2945         if( RPF'(*,G) != RPF'(S,G) ) {
2946             add Prune(S,G,rpt) to compound message
2947         }
2948     } else if ( inherited_olist(S,G,rpt) == NULL ) {
2949         # Note: all (*,G) olist interfaces received RPT prunes for (S,G).
2950         add Prune(S,G,rpt) to compound message
2951     } else if ( RPF'(*,G) != RPF'(S,G,rpt) ) {
2952         # Note: we joined the shared tree, but there was an (S,G) assert
2953         # and the source tree RPF neighbor is different.
2954         add Prune(S,G,rpt) to compound message
2955     }

2957   Note that Join(S,G,rpt) is normally sent not as a periodic message,
2958   but only as a triggered message.

2960   4.5.7. State Machine for (S,G,rpt) Triggered Messages

2962   The state machine for (S,G,rpt) triggered messages is required per-
2963   (S,G) when there is (*,G) join state at a router, and the router or
2964   any of its upstream LAN peers wishes to prune S off the RP tree.

2966   There are three states in the state machine. One of the states is
2967   when there is no (*,G) join state at this router. If there is (*,G)
2968   join state at the router, then the state machine must be at one of
2969   the other two states. The three states are:

2971      Pruned(S,G,rpt)
2972         (*,G) Joined, but (S,G,rpt) pruned

2974      NotPruned(S,G,rpt)
2975         (*,G) Joined, and (S,G,rpt) not pruned

2977      RPTNotJoined(G)
2978         (*,G) has not been joined.

2980   In addition, there is an (S,G,rpt) Override Timer, OT(S,G,rpt), which
2981   is used to delay triggered Join(S,G,rpt) messages to prevent
2982   implosions of triggered messages.
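
As an informal illustration (not part of the specification), the three states listed above can be modelled as an enumeration selected by two conditions: whether the router has (*,G) join state, and whether (S,G,rpt) is pruned. The names RptState and steady_state and the two boolean parameters are hypothetical and chosen only for this sketch.

```python
from enum import Enum


class RptState(Enum):
    """The three upstream (S,G,rpt) states named in the text."""
    RPT_NOT_JOINED = "RPTNotJoined(G)"
    PRUNED = "Pruned(S,G,rpt)"
    NOT_PRUNED = "NotPruned(S,G,rpt)"


def steady_state(star_g_joined: bool, sg_rpt_pruned: bool) -> RptState:
    """Pick the state that matches the two conditions described above.

    star_g_joined and sg_rpt_pruned are illustrative inputs, not macros
    defined by the specification.
    """
    if not star_g_joined:
        # Without (*,G) join state there is exactly one possible state.
        return RptState.RPT_NOT_JOINED
    # With (*,G) join state, the machine is in one of the other two states.
    return RptState.PRUNED if sg_rpt_pruned else RptState.NOT_PRUNED
```

For example, steady_state(True, False) selects NotPruned(S,G,rpt), the only state in which the Override Timer OT(S,G,rpt) is used.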
2984     Figure 7: Upstream (S,G,rpt) state machine for triggered messages
2985                               in tabular form

2987   +------------++--------------------------------------------------------+
2988   |            ||                         Event                          |
2989   |            ++--------------+--------------+-------------+------------+
2990   |Prev State  || PruneDesired | PruneDesired | RPTJoin     | inherited_ |
2991   |            || (S,G,rpt)    | (S,G,rpt)    | Desired(G)  | olist      |
2992   |            || ->True       | ->False      | ->False     | (S,G,rpt)  |
2993   |            ||              |              |             | ->non-NULL |
2994   +------------++--------------+--------------+-------------+------------+
2995   |RPTNotJoined|| -> P state   | -            | -           | -> NP state|
2996   |(G) (NJ)    ||              |              |             |            |
2997   +------------++--------------+--------------+-------------+------------+
2998   |Pruned      || -            | -> NP state  | -> NJ state | -          |
2999   |(S,G,rpt)   ||              | Send Join    |             |            |
3000   |(P)         ||              | (S,G,rpt)    |             |            |
3001   +------------++--------------+--------------+-------------+------------+
3002   |NotPruned   || -> P state   | -            | -> NJ state | -          |
3003   |(S,G,rpt)   || Send Prune   |              | Cancel OT   |            |
3004   |(NP)        || (S,G,rpt);   |              |             |            |
3005   |            || Cancel OT    |              |             |            |
3006   +------------++--------------+--------------+-------------+------------+

3007   Additionally, we have the following transitions within the
3008   NotPruned(S,G,rpt) state, which are all used for prune override
3009   behavior.
3011   +----------------------------------------------------------------------+
3012   |                      In NotPruned(S,G,rpt) State                     |
3013   +----------+--------------+--------------+--------------+--------------+
3014   |Override  | See Prune    | See Join     | See Prune    | RPF'         |
3015   |Timer     | (S,G,rpt) to | (S,G,rpt) to | (S,G) to     | (S,G,rpt) -> |
3016   |expires   | RPF'         | RPF'         | RPF'         | RPF' (*,G)   |
3017   |          | (S,G,rpt)    | (S,G,rpt)    | (S,G,rpt)    |              |
3018   +----------+--------------+--------------+--------------+--------------+
3019   |Send Join | OT = min(OT, | Cancel OT    | OT = min(OT, | OT = min(OT, |
3020   |(S,G,rpt);| t_override)  |              | t_override)  | t_override)  |
3021   |Leave OT  |              |              |              |              |
3022   |unset     |              |              |              |              |
3023   +----------+--------------+--------------+--------------+--------------+

3025   Note that the min function in the above state machine considers a
3026   non-running timer to have an infinite value (e.g., min(not-running,
3027   t_override) = t_override).

3029   This state machine uses the following macros:

3031     bool RPTJoinDesired(G) {
3032       return (JoinDesired(*,G))
3033     }

3035   RPTJoinDesired(G) is true when the router has forwarding state that
3036   would cause it to forward traffic for G using (*,G) shared tree
3037   state.

3039     bool PruneDesired(S,G,rpt) {
3040          return ( RPTJoinDesired(G) AND
3041                   ( inherited_olist(S,G,rpt) == NULL
3042                     OR (SPTbit(S,G)==TRUE
3043                         AND (RPF'(*,G) != RPF'(S,G)) )))
3044     }

3046   PruneDesired(S,G,rpt) can only be true if RPTJoinDesired(G) is true.
3047   If RPTJoinDesired(G) is true, then PruneDesired(S,G,rpt) is true
3048   either if there are no outgoing interfaces that S would be forwarded
3049   on, or if the router has active (S,G) forwarding state but RPF'(*,G)
3050   != RPF'(S,G).

3052   The state machine contains the following transition events:

3054   See Join(S,G,rpt) to RPF'(S,G,rpt)
3055      This event is only relevant in the "NotPruned" state.

3057      The router sees a Join(S,G,rpt) from someone else to
3058      RPF'(S,G,rpt), which is the correct upstream neighbor.
If we're 3059 in "NotPruned" state and the (S,G,rpt) Override Timer is running, 3060 then this is because we have been triggered to send our own 3061 Join(S,G,rpt) to RPF'(S,G,rpt). Someone else beat us to it, so 3062 there's no need to send our own Join. 3064 The action is to cancel the Override Timer. 3066 See Prune(S,G,rpt) to RPF'(S,G,rpt) 3067 This event is only relevant in the "NotPruned" state. 3069 The router sees a Prune(S,G,rpt) from someone else to 3070 RPF'(S,G,rpt), which is the correct upstream neighbor. If we're 3071 in the "NotPruned" state, then we want to continue to receive 3072 traffic from S destined for G, and that traffic is being supplied 3073 by RPF'(S,G,rpt). Thus, we need to override the Prune. 3075 The action is to set the (S,G,rpt) Override Timer to the 3076 randomized prune-override interval, t_override. However, if the 3077 Override Timer is already running, we only set the timer if doing 3078 so would set it to a lower value. At the end of this interval, if 3079 no one else has sent a Join, then we will do so. 3081 See Prune(S,G) to RPF'(S,G,rpt) 3082 This event is only relevant in the "NotPruned" state. 3084 This transition and action are the same as the above transition 3085 and action, except that the Prune does not have the RPT bit set. 3086 This transition is necessary to be compatible with routers 3087 implemented from RFC2362 that don't maintain separate (S,G) and 3088 (S,G,rpt) state. 3090 The (S,G,rpt) prune Override Timer expires 3091 This event is only relevant in the "NotPruned" state. 3093 When the Override Timer expires, we must send a Join(S,G,rpt) to 3094 RPF'(S,G,rpt) to override the Prune message that caused the timer 3095 to be running. We only send this if RPF'(S,G,rpt) equals 3096 RPF'(*,G); if this were not the case, then the Join might be sent 3097 to a router that does not have (*,G) Join state, and so the 3098 behavior would not be well defined. 
If RPF'(S,G,rpt) is not the 3099 same as RPF'(*,G), then it may stop forwarding S. However, if 3100 this happens, then the router will send an AssertCancel(S,G), 3101 which would then cause RPF'(S,G,rpt) to become equal to RPF'(*,G) 3102 (see below). 3104 RPF'(S,G,rpt) changes to become equal to RPF'(*,G) 3105 This event is only relevant in the "NotPruned" state. 3107 RPF'(S,G,rpt) can only be different from RPF'(*,G) if an (S,G) 3108 Assert has happened, which means that traffic from S is arriving 3109 on the SPT, and so Prune(S,G,rpt) will have been sent to 3110 RPF'(*,G). When RPF'(S,G,rpt) changes to become equal to 3111 RPF'(*,G), we need to trigger a Join(S,G,rpt) to RPF'(*,G) to 3112 cause that router to start forwarding S again. 3114 The action is to set the (S,G,rpt) Override Timer to the 3115 randomized prune-override interval t_override. However, if the 3116 timer is already running, we only set the timer if doing so would 3117 set it to a lower value. At the end of this interval, if no one 3118 else has sent a Join, then we will do so. 3120 PruneDesired(S,G,rpt)->TRUE 3121 See macro above. This event is relevant in the "NotPruned" and 3122 "RPTNotJoined(G)" states. 3124 The router wishes to receive traffic for G, but does not wish to 3125 receive traffic from S destined for G. This causes the router to 3126 transition into the Pruned state. 3128 If the router was previously in NotPruned state, then the action 3129 is to send a Prune(S,G,rpt) to RPF'(S,G,rpt), and to cancel the 3130 Override Timer. If the router was previously in RPTNotJoined(G) 3131 state, then there is no need to trigger an action in this state 3132 machine because sending a Prune(S,G,rpt) is handled by the rules 3133 for sending the Join(*,G). 3135 PruneDesired(S,G,rpt)->FALSE 3136 See macro above. This transition is only relevant in the "Pruned" 3137 state. 
3139 If the router is in the Pruned(S,G,rpt) state, and 3140 PruneDesired(S,G,rpt) changes to FALSE, this could be because the 3141 router no longer has RPTJoinDesired(G) true, or it now wishes to 3142 receive traffic from S again. If it is the former, then this 3143 transition should not happen, but instead the 3144 "RPTJoinDesired(G)->FALSE" transition should happen. Thus, this 3145 transition should be interpreted as "PruneDesired(S,G,rpt)->FALSE 3146 AND RPTJoinDesired(G)==TRUE". 3148 The action is to send a Join(S,G,rpt) to RPF'(S,G,rpt). 3150 RPTJoinDesired(G)->FALSE 3151 This event is relevant in the "Pruned" and "NotPruned" states. 3153 The router no longer wishes to receive any traffic destined for G 3154 on the RP Tree. This causes a transition to the RPTNotJoined(G) 3155 state, and the Override Timer is canceled if it was running. Any 3156 further actions are handled by the appropriate upstream state 3157 machine for (*,G). 3159 inherited_olist(S,G,rpt) becomes non-NULL 3160 This transition is only relevant in the RPTNotJoined(G) state. 3162 The router has joined the RP tree (handled by the (*,G) upstream 3163 state machine as appropriate) and wants to receive traffic from S. 3164 This does not trigger any events in this state machine, but 3165 causes a transition to the NotPruned(S,G,rpt) state. 3167 4.6. PIM Assert Messages 3169 Where multiple PIM routers peer over a shared LAN, it is possible for 3170 more than one upstream router to have valid forwarding state for a 3171 packet, which can lead to packet duplication (see Section 3.6). PIM 3172 does not attempt to prevent this from occurring. Instead, it detects 3173 when this has happened and elects a single forwarder amongst the 3174 upstream routers to prevent further duplication. This election is 3175 performed using PIM Assert messages. 
Assert messages are also 3176 received by downstream routers on the LAN, and these cause subsequent 3177 Join/Prune messages to be sent to the upstream router that won the 3178 Assert. 3180 In general, a PIM Assert message should only be accepted for 3181 processing if it comes from a known PIM neighbor. A PIM router hears 3182 about PIM neighbors through PIM Hello messages. If a router receives 3183 an Assert message from a particular IP source address and it has not 3184 seen a PIM Hello message from that source address, then the Assert 3185 message SHOULD be discarded without further processing. In addition, 3186 if the Hello message from a neighbor was authenticated using the 3187 IPsec Authentication Header (AH) (see Section 6.3), then all Assert 3188 messages from that neighbor MUST also be authenticated using IPsec 3189 AH. 3191 We note that some older PIM implementations incorrectly fail to send 3192 Hello messages on point-to-point interfaces, so we also RECOMMEND 3193 that a configuration option be provided to allow interoperation with 3194 such older routers, but that this configuration option SHOULD NOT be 3195 enabled by default. 3197 4.6.1. (S,G) Assert Message State Machine 3199 The (S,G) Assert state machine for interface I is shown in Figure 8. 3200 There are three states: 3202 NoInfo (NI) 3203 This router has no (S,G) assert state on interface I. 3205 I am Assert Winner (W) 3206 This router has won an (S,G) assert on interface I. It is now 3207 responsible for forwarding traffic from S destined for G out of 3208 interface I. Irrespective of whether it is the DR for I, while a 3209 router is the assert winner, it is also responsible for forwarding 3210 traffic onto I on behalf of local hosts on I that have made 3211 membership requests that specifically refer to S (and G). 3213 I am Assert Loser (L) 3214 This router has lost an (S,G) assert on interface I. It must not 3215 forward packets from S destined for G onto interface I. 
If it is
3216      the DR on I, it is no longer responsible for forwarding traffic
3217      onto I to satisfy local hosts with membership requests that
3218      specifically refer to S and G.

3220   In addition, there is also an Assert Timer (AT) that is used to time
3221   out asserts on the assert losers and to resend asserts on the assert
3222   winner.

3224     Figure 8: Per-interface (S,G) Assert State machine in tabular form

3226   +----------------------------------------------------------------------+
3227   |                         In NoInfo (NI) State                         |
3228   +---------------+-------------------+------------------+---------------+
3229   | Receive       | Receive Assert    | Data arrives     | Receive       |
3230   | Inferior      | with RPTbit       | from S to G on   | Acceptable    |
3231   | Assert with   | set and           | I and            | Assert with   |
3232   | RPTbit clear  | CouldAssert       | CouldAssert      | RPTbit clear  |
3233   |               | (S,G,I)           | (S,G,I)          | and AssTrDes  |
3234   |               |                   |                  | (S,G,I)       |
3235   +---------------+-------------------+------------------+---------------+
3236   | -> W state    | -> W state        | -> W state       | -> L state    |
3237   | [Actions A1]  | [Actions A1]      | [Actions A1]     | [Actions A6]  |
3238   +---------------+-------------------+------------------+---------------+

3239   +----------------------------------------------------------------------+
3240   |                    In I Am Assert Winner (W) State                   |
3241   +----------------+------------------+-----------------+----------------+
3242   | Assert Timer   | Receive          | Receive         | CouldAssert    |
3243   | Expires        | Inferior         | Preferred       | (S,G,I) ->     |
3244   |                | Assert           | Assert          | FALSE          |
3245   +----------------+------------------+-----------------+----------------+
3246   | -> W state     | -> W state       | -> L state      | -> NI state    |
3247   | [Actions A3]   | [Actions A3]     | [Actions A2]    | [Actions A4]   |
3248   +----------------+------------------+-----------------+----------------+

3250   +---------------------------------------------------------------------+
3251   |                    In I Am Assert Loser (L) State                   |
3252   +-------------+-------------+-------------+-------------+-------------+
3253   |Receive      |Receive      |Receive      |Assert Timer |Current      |
3254   |Preferred    |Acceptable   |Inferior     |Expires      |Winner's     |
3255   |Assert       |Assert with  |Assert or    |             |GenID        |
3256   |             |RPTbit clear |Assert       |             |Changes or   |
3257   |             |from Current |Cancel from  |             |NLT Expires  |
3258   |             |Winner       |Current      |             |             |
3259   |             |             |Winner       |             |             |
3260   +-------------+-------------+-------------+-------------+-------------+
3261   |-> L state   |-> L state   |-> NI state  |-> NI state  |-> NI state  |
3262   |[Actions A2] |[Actions A2] |[Actions A5] |[Actions A5] |[Actions A5] |
3263   +-------------+-------------+-------------+-------------+-------------+

3265   +----------------------------------------------------------------------+
3266   |                    In I Am Assert Loser (L) State                    |
3267   +----------------+-----------------+------------------+----------------+
3268   | AssTrDes       | my_metric ->    | RPF_interface    | Receive        |
3269   | (S,G,I) ->     | better than     | (S) stops        | Join(S,G) on   |
3270   | FALSE          | winner's        | being I          | interface I    |
3271   |                | metric          |                  |                |
3272   +----------------+-----------------+------------------+----------------+
3273   | -> NI state    | -> NI state     | -> NI state      | -> NI State    |
3274   | [Actions A5]   | [Actions A5]    | [Actions A5]     | [Actions A5]   |
3275   +----------------+-----------------+------------------+----------------+

3277   Note that for reasons of compactness, "AssTrDes(S,G,I)" is used in
3278   the state machine table to refer to AssertTrackingDesired(S,G,I).

3280   Terminology:

3282      A "preferred assert" is one with a better metric than the current
3283      winner.

3285      An "acceptable assert" is one that has a better metric than
3286      my_assert_metric(S,G,I). An assert is never considered acceptable
3287      if its metric is infinite.

3289      An "inferior assert" is one with a worse metric than
3290      my_assert_metric(S,G,I). An assert is never considered inferior
3291      if my_assert_metric(S,G,I) is infinite.
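
The three terms above can be sketched as predicates over assert metrics. This is a minimal, non-normative illustration. It assumes the field-by-field comparison that the specification defines outside this section (lower RPT bit, lower metric preference, and lower route metric win, with the higher address breaking ties), and it assumes that an "infinite" metric is one whose preference and route metric are both infinite. The Metric type, its field names, and the helper functions are hypothetical.

```python
import math
from typing import NamedTuple


class Metric(NamedTuple):
    """Illustrative assert metric; field names are assumptions."""
    rpt_bit: int
    preference: float
    route_metric: float
    address: int  # numeric form of the sender's address


def is_infinite(m: Metric) -> bool:
    # Assumption: "infinite" means infinite preference and route metric.
    return math.isinf(m.preference) and math.isinf(m.route_metric)


def is_better(a: Metric, b: Metric) -> bool:
    # Lower values win field by field; the higher address breaks ties,
    # so the address is negated to fit a "smaller is better" comparison.
    key_a = (a.rpt_bit, a.preference, a.route_metric, -a.address)
    key_b = (b.rpt_bit, b.preference, b.route_metric, -b.address)
    return key_a < key_b


def is_preferred(received: Metric, current_winner: Metric) -> bool:
    return is_better(received, current_winner)


def is_acceptable(received: Metric, mine: Metric) -> bool:
    # Never acceptable if the received metric is infinite.
    return not is_infinite(received) and is_better(received, mine)


def is_inferior(received: Metric, mine: Metric) -> bool:
    # Never inferior if my own metric is infinite.
    return not is_infinite(mine) and is_better(mine, received)
```

Note that "acceptable" and "inferior" are both measured against my_assert_metric(S,G,I), while "preferred" is measured against the stored winner's metric, so an assert can be acceptable without being preferred.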
3293   The state machine uses the following macros:

3295     CouldAssert(S,G,I) =
3296          SPTbit(S,G)==TRUE
3297          AND (RPF_interface(S) != I)
3298          AND (I in ( ( joins(*,G) (-) prunes(S,G,rpt) )
3299                      (+) ( pim_include(*,G) (-) pim_exclude(S,G) )
3300                      (-) lost_assert(*,G)
3301                      (+) joins(S,G) (+) pim_include(S,G) ) )

3303   CouldAssert(S,G,I) is true for downstream interfaces that would be in
3304   the inherited_olist(S,G) if (S,G) assert information was not taken
3305   into account.

3307     AssertTrackingDesired(S,G,I) =
3308          (I in ( joins(*,G) (-) prunes(S,G,rpt)
3309                  (+) ( pim_include(*,G) (-) pim_exclude(S,G) )
3310                  (-) lost_assert(*,G)
3311                  (+) joins(S,G) ) )
3312          OR (local_receiver_include(S,G,I) == TRUE
3313              AND (I_am_DR(I) OR (AssertWinner(S,G,I) == me)))
3314          OR ((RPF_interface(S) == I) AND (JoinDesired(S,G) == TRUE))
3315          OR ((RPF_interface(RP(G)) == I) AND (JoinDesired(*,G) == TRUE)
3316              AND (SPTbit(S,G) == FALSE))

3318   AssertTrackingDesired(S,G,I) is true on any interface in which an
3319   (S,G) assert might affect our behavior.

3321   The first three lines of AssertTrackingDesired account for (*,G) join
3322   and local membership information received on I that might cause the
3323   router to be interested in asserts on I.

3325   The 4th line accounts for (S,G) join information received on I that
3326   might cause the router to be interested in asserts on I.

3328   The 5th and 6th lines account for (S,G) local membership information
3329   on I. Note that we can't use the pim_include(S,G) macro since it
3330   uses lost_assert(S,G,I) and would result in the router forgetting
3331   that it lost an assert if the only reason it was interested was local
3332   membership. The AssertWinner(S,G,I) check forces an assert winner to
3333   keep on being responsible for forwarding as long as local receivers
3334   are present.
Removing this check would make the assert winner give 3335 up forwarding as soon as the information that originally caused it to 3336 forward went away, and the task of forwarding for local receivers 3337 would revert back to the DR. 3339 The last three lines account for the fact that a router must keep 3340 track of assert information on upstream interfaces in order to send 3341 joins and prunes to the proper neighbor. 3343 Transitions from NoInfo State 3345 When in NoInfo state, the following events may trigger transitions: 3347 Receive Inferior Assert with RPTbit cleared 3348 An assert is received for (S,G) with the RPT bit cleared that 3349 is inferior to our own assert metric. The RPT bit cleared 3350 indicates that the sender of the assert had (S,G) forwarding 3351 state on this interface. If the assert is inferior to our 3352 metric, then we must also have (S,G) forwarding state (i.e., 3353 CouldAssert(S,G,I)==TRUE) as (S,G) asserts beat (*,G) asserts, 3354 and so we should be the assert winner. We transition to the 3355 "I am Assert Winner" state and perform Actions A1 (below). 3357 Receive Assert with RPTbit set AND CouldAssert(S,G,I)==TRUE 3358 An assert is received for (S,G) on I with the RPT bit set 3359 (it's a (*,G) assert). CouldAssert(S,G,I) is TRUE only if we 3360 have (S,G) forwarding state on this interface, so we should be 3361 the assert winner. We transition to the "I am Assert Winner" 3362 state and perform Actions A1 (below). 3364 An (S,G) data packet arrives on interface I, AND 3365 CouldAssert(S,G,I)==TRUE 3366 An (S,G) data packet arrived on a downstream interface that is 3367 in our (S,G) outgoing interface list. We optimistically 3368 assume that we will be the assert winner for this (S,G), and 3369 so we transition to the "I am Assert Winner" state and perform 3370 Actions A1 (below), which will initiate the assert negotiation 3371 for (S,G). 
3373 Receive Acceptable Assert with RPT bit clear AND 3374 AssertTrackingDesired(S,G,I)==TRUE 3375 We're interested in (S,G) Asserts, either because I is a 3376 downstream interface for which we have (S,G) or (*,G) 3377 forwarding state, or because I is the upstream interface for S 3378 and we have (S,G) forwarding state. The received assert has a 3379 better metric than our own, so we do not win the Assert. We 3380 transition to "I am Assert Loser" and perform Actions A6 3381 (below). 3383 Transitions from "I am Assert Winner" State 3385 When in "I am Assert Winner" state, the following events trigger 3386 transitions: 3388 Assert Timer Expires 3389 The (S,G) Assert Timer expires. As we're in the Winner state, 3390 we must still have (S,G) forwarding state that is actively 3391 being kept alive. We resend the (S,G) Assert and restart the 3392 Assert Timer (Actions A3 below). Note that the assert 3393 winner's Assert Timer is engineered to expire shortly before 3394 timers on assert losers; this prevents unnecessary thrashing 3395 of the forwarder and periodic flooding of duplicate packets. 3397 Receive Inferior Assert 3398 We receive an (S,G) assert or (*,G) assert mentioning S that 3399 has a worse metric than our own. Whoever sent the assert is 3400 in error, and so we resend an (S,G) Assert and restart the 3401 Assert Timer (Actions A3 below). 3403 Receive Preferred Assert 3404 We receive an (S,G) assert that has a better metric than our 3405 own. We transition to "I am Assert Loser" state and perform 3406 Actions A2 (below). Note that this may affect the value of 3407 JoinDesired(S,G) and PruneDesired(S,G,rpt), which could cause 3408 transitions in the upstream (S,G) or (S,G,rpt) state machines. 3410 CouldAssert(S,G,I) -> FALSE 3411 Our (S,G) forwarding state or RPF interface changed so as to 3412 make CouldAssert(S,G,I) become false. 
We can no longer 3413 perform the actions of the assert winner, and so we transition 3414 to NoInfo state and perform Actions A4 (below). This includes 3415 sending a "canceling assert" with an infinite metric. 3417 Transitions from "I am Assert Loser" State 3419 When in "I am Assert Loser" state, the following transitions can 3420 occur: 3422 Receive Preferred Assert 3423 We receive an assert that is better than that of the current 3424 assert winner. We stay in Loser state and perform Actions A2 3425 below. 3427 Receive Acceptable Assert with RPTbit clear from Current Winner 3428 We receive an assert from the current assert winner that is 3429 better than our own metric for this (S,G) (although the metric 3430 may be worse than the winner's previous metric). We stay in 3431 Loser state and perform Actions A2 below. 3433 Receive Inferior Assert or Assert Cancel from Current Winner 3434 We receive an assert from the current assert winner that is 3435 worse than our own metric for this group (typically, because 3436 the winner's metric became worse or because it is an assert 3437 cancel). We transition to NoInfo state, deleting the (S,G) 3438 assert information and allowing the normal PIM Join/Prune 3439 mechanisms to operate. Usually, we will eventually re-assert 3440 and win when data packets from S have started flowing again. 3442 Assert Timer Expires 3443 The (S,G) Assert Timer expires. We transition to NoInfo 3444 state, deleting the (S,G) assert information (Actions A5 3445 below). 3447 Current Winner's GenID Changes or NLT Expires 3448 The Neighbor Liveness Timer associated with the current winner 3449 expires or we receive a Hello message from the current winner 3450 reporting a different GenID from the one it previously 3451 reported. This indicates that the current winner's interface 3452 or router has gone down (and may have come back up), and so we 3453 must assume it no longer knows it was the winner. 
We 3454 transition to the NoInfo state, deleting this (S,G) assert 3455 information (Actions A5 below). 3457 AssertTrackingDesired(S,G,I)->FALSE 3458 AssertTrackingDesired(S,G,I) becomes FALSE. Our forwarding 3459 state has changed so that (S,G) Asserts on interface I are no 3460 longer of interest to us. We transition to the NoInfo state, 3461 deleting the (S,G) assert information. 3463 My metric becomes better than the assert winner's metric 3464 my_assert_metric(S,G,I) has changed so that now my assert 3465 metric for (S,G) is better than the metric we have stored for 3466 current assert winner. This might happen when the underlying 3467 routing metric changes, or when CouldAssert(S,G,I) becomes 3468 true; for example, when SPTbit(S,G) becomes true. We 3469 transition to NoInfo state, delete this (S,G) assert state 3470 (Actions A5 below), and allow the normal PIM Join/Prune 3471 mechanisms to operate. Usually, we will eventually re-assert 3472 and win when data packets from S have started flowing again. 3474 RPF_interface(S) stops being interface I 3475 Interface I used to be the RPF interface for S, and now it is 3476 not. We transition to NoInfo state, deleting this (S,G) 3477 assert state (Actions A5 below). 3479 Receive Join(S,G) on Interface I 3480 We receive a Join(S,G) that has the Upstream Neighbor Address 3481 field set to my primary IP address on interface I. The action 3482 is to transition to NoInfo state, delete this (S,G) assert 3483 state (Actions A5 below), and allow the normal PIM Join/Prune 3484 mechanisms to operate. If whoever sent the Join was in error, 3485 then the normal assert mechanism will eventually re-apply, and 3486 we will lose the assert again. However, whoever sent the 3487 assert may know that the previous assert winner has died, and 3488 so we may end up being the new forwarder. 3490 (S,G) Assert State machine Actions 3492 A1: Send Assert(S,G). 3493 Set Assert Timer to (Assert_Time - Assert_Override_Interval). 
3494 Store self as AssertWinner(S,G,I). 3495 Store spt_assert_metric(S,I) as AssertWinnerMetric(S,G,I). 3497 A2: Store new assert winner as AssertWinner(S,G,I) and assert 3498 winner metric as AssertWinnerMetric(S,G,I). 3499 Set Assert Timer to Assert_Time. 3501 A3: Send Assert(S,G). 3502 Set Assert Timer to (Assert_Time - Assert_Override_Interval). 3504 A4: Send AssertCancel(S,G). 3505 Delete assert info (AssertWinner(S,G,I) and 3506 AssertWinnerMetric(S,G,I) will then return to their default 3507 values). 3509 A5: Delete assert info (AssertWinner(S,G,I) and 3510 AssertWinnerMetric(S,G,I) will then return to their default 3511 values). 3513 A6: Store new assert winner as AssertWinner(S,G,I) and assert 3514 winner metric as AssertWinnerMetric(S,G,I). 3515 Set Assert Timer to Assert_Time. 3516 If (I is RPF_interface(S)) AND (UpstreamJPState(S,G) == 3517 Joined) set SPTbit(S,G) to TRUE. 3519 Note that some of these actions may cause the value of 3520 JoinDesired(S,G), PruneDesired(S,G,rpt), or RPF'(S,G) to change, 3521 which could cause further transitions in other state machines. 3523 4.6.2. (*,G) Assert Message State Machine 3525 The (*,G) Assert state machine for interface I is shown in Figure 9. 3526 There are three states: 3528 NoInfo (NI) 3529 This router has no (*,G) assert state on interface I. 3531 I am Assert Winner (W) 3532 This router has won an (*,G) assert on interface I. It is now 3533 responsible for forwarding traffic destined for G onto interface I 3534 with the exception of traffic for which it has (S,G) "I am Assert 3535 Loser" state. Irrespective of whether it is the DR for I, it is 3536 also responsible for handling the membership requests for G from 3537 local hosts on I. 3539 I am Assert Loser (L) 3540 This router has lost an (*,G) assert on interface I. It must not 3541 forward packets for G onto interface I with the exception of 3542 traffic from sources for which it has (S,G) "I am Assert Winner" 3543 state. 
If it is the DR, it is no longer responsible for handling 3544 the membership requests for group G from local hosts on I. 3546 In addition, there is also an Assert Timer (AT) that is used to time 3547 out asserts on the assert losers and to resend asserts on the assert 3548 winner. 3550 When an Assert message is received with a source address other than 3551 zero, a PIM implementation must first match it against the possible 3552 events in the (S,G) assert state machine and process any transitions 3553 and actions, before considering whether the Assert message matches 3554 against the (*,G) assert state machine. 3556 It is important to note that NO TRANSITION CAN OCCUR in the (*,G) 3557 state machine as a result of receiving an Assert message unless the 3558 (S,G) assert state machine for the relevant S and G is in the 3559 "NoInfo" state after the (S,G) state machine has processed the 3560 message. Also, NO TRANSITION CAN OCCUR in the (*,G) state machine as 3561 a result of receiving an assert message if that message triggers any 3562 change of state in the (S,G) state machine. Obviously, when the 3563 source address in the received message is set to zero, an (S,G) state 3564 machine for the S and G does not exist and can be assumed to be in 3565 the "NoInfo" state. 3567 For example, if both the (S,G) and (*,G) assert state machines are in 3568 the NoInfo state when an Assert message arrives, and the message 3569 causes the (S,G) state machine to transition to either "W" or "L" 3570 state, then the assert will not be processed by the (*,G) assert 3571 state machine. 3573 Another example: if the (S,G) assert state machine is in "L" state 3574 when an assert message is received, and the assert metric in the 3575 message is worse than my_assert_metric(S,G,I), then the (S,G) assert 3576 state machine will transition to NoInfo state. 
   In such a case, if
   the (*,G) assert state machine were in NoInfo state, it might appear
   that it would transition to "W" state, but this is not the case
   because this message already triggered a transition in the (S,G)
   assert state machine.

   Figure 9: Per-interface (*,G) Assert State machine in tabular form

+----------------------------------------------------------------------+
|                         In NoInfo (NI) State                         |
+-----------------------+-----------------------+----------------------+
| Receive Inferior      | Data arrives for G    | Receive Acceptable   |
| Assert with RPTbit    | on I and              | Assert with RPTbit   |
| set and               | CouldAssert           | set and AssTrDes     |
| CouldAssert(*,G,I)    | (*,G,I)               | (*,G,I)              |
+-----------------------+-----------------------+----------------------+
| -> W state            | -> W state            | -> L state           |
| [Actions A1]          | [Actions A1]          | [Actions A2]         |
+-----------------------+-----------------------+----------------------+

+---------------------------------------------------------------------+
|                   In I Am Assert Winner (W) State                   |
+----------------+-----------------+-----------------+----------------+
| Assert Timer   | Receive         | Receive         | CouldAssert    |
| Expires        | Inferior        | Preferred       | (*,G,I) ->     |
|                | Assert          | Assert          | FALSE          |
+----------------+-----------------+-----------------+----------------+
| -> W state     | -> W state      | -> L state      | -> NI state    |
| [Actions A3]   | [Actions A3]    | [Actions A2]    | [Actions A4]   |
+----------------+-----------------+-----------------+----------------+

+---------------------------------------------------------------------+
|                   In I Am Assert Loser (L) State                    |
+-------------+-------------+-------------+-------------+-------------+
|Receive      |Receive      |Receive      |Assert Timer |Current      |
|Preferred    |Acceptable   |Inferior     |Expires      |Winner's     |
|Assert with  |Assert from  |Assert or    |             |GenID        |
|RPTbit set   |Current      |Assert       |             |Changes or   |
|             |Winner with  |Cancel from  |             |NLT Expires  |
|             |RPTbit set   |Current      |             |             |
|             |             |Winner       |             |             |
+-------------+-------------+-------------+-------------+-------------+
|-> L state   |-> L state   |-> NI state  |-> NI state  |-> NI state  |
|[Actions A2] |[Actions A2] |[Actions A5] |[Actions A5] |[Actions A5] |
+-------------+-------------+-------------+-------------+-------------+

+----------------------------------------------------------------------+
|                    In I Am Assert Loser (L) State                    |
+----------------+----------------+-----------------+------------------+
| AssTrDes       | my_metric ->   | RPF_interface   | Receive          |
| (*,G,I) ->     | better than    | (RP(G)) stops   | Join(*,G) on     |
| FALSE          | Winner's       | being I         | Interface I      |
|                | metric         |                 |                  |
+----------------+----------------+-----------------+------------------+
| -> NI state    | -> NI state    | -> NI state     | -> NI state      |
| [Actions A5]   | [Actions A5]   | [Actions A5]    | [Actions A5]     |
+----------------+----------------+-----------------+------------------+

   The state machine uses the following macros:

   CouldAssert(*,G,I) =
       ( I in ( joins(*,G) (+) pim_include(*,G)) )
       AND (RPF_interface(RP(G)) != I)

   CouldAssert(*,G,I) is true on downstream interfaces for which we have
   (*,G) join state, or local members that requested any traffic
   destined for G.

   AssertTrackingDesired(*,G,I) =
       CouldAssert(*,G,I)
       OR (local_receiver_include(*,G,I)==TRUE
           AND (I_am_DR(I) OR AssertWinner(*,G,I) == me))
       OR (RPF_interface(RP(G)) == I AND RPTJoinDesired(G))

   AssertTrackingDesired(*,G,I) is true on any interface on which an
   (*,G) assert might affect our behavior.

   Note that for reasons of compactness, "AssTrDes(*,G,I)" is used in
   the state machine table to refer to AssertTrackingDesired(*,G,I).
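
   As an informal illustration only, the two macros above can be
   evaluated as in the following Python sketch. The StarGState
   container, its field names, and the ME marker are hypothetical
   names introduced for this example; the sketch assumes that the sets
   joins(*,G) and pim_include(*,G) and the other inputs have already
   been computed as defined elsewhere in this specification, and it
   treats "(+)" as set union.

```python
from dataclasses import dataclass, field

ME = "me"  # hypothetical marker meaning "this router is the assert winner"


@dataclass
class StarGState:
    """Hypothetical per-group snapshot of the inputs the macros read."""
    joins: set = field(default_factory=set)            # joins(*,G)
    pim_include: set = field(default_factory=set)      # pim_include(*,G)
    local_receivers: set = field(default_factory=set)  # I with local_receiver_include(*,G,I)
    dr_on: set = field(default_factory=set)            # I with I_am_DR(I)
    assert_winner: dict = field(default_factory=dict)  # I -> AssertWinner(*,G,I)
    rpf_interface_rp: str = ""                         # RPF_interface(RP(G))
    rpt_join_desired: bool = False                     # RPTJoinDesired(G)


def could_assert(st, i):
    """CouldAssert(*,G,I): I is a (*,G) downstream interface, not the RPF one."""
    return i in (st.joins | st.pim_include) and st.rpf_interface_rp != i


def assert_tracking_desired(st, i):
    """AssertTrackingDesired(*,G,I), term by term as in the macro."""
    return (
        could_assert(st, i)
        or (i in st.local_receivers
            and (i in st.dr_on or st.assert_winner.get(i) == ME))
        or (st.rpf_interface_rp == i and st.rpt_join_desired)
    )
```

   In this sketch an upstream interface never satisfies CouldAssert,
   but it still satisfies AssertTrackingDesired whenever
   RPTJoinDesired(G) holds, which matches the third term of the macro.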

   Terminology:

      A "preferred assert" is one with a better metric than the current
      winner.

      An "acceptable assert" is one that has a better metric than
      my_assert_metric(*,G,I). An assert is never considered acceptable
      if its metric is infinite.

      An "inferior assert" is one with a worse metric than
      my_assert_metric(*,G,I). An assert is never considered inferior
      if my_assert_metric(*,G,I) is infinite.

   Transitions from NoInfo State

   When in NoInfo state, the following events trigger transitions, but
   only if the (S,G) assert state machine is in NoInfo state before and
   after consideration of the received message:

     Receive Inferior Assert with RPTbit set AND
        CouldAssert(*,G,I)==TRUE
          An Inferior (*,G) assert is received for G on Interface I. If
          CouldAssert(*,G,I) is TRUE, then I is our downstream
          interface, and we have (*,G) forwarding state on this
          interface, so we should be the assert winner. We transition
          to the "I am Assert Winner" state and perform Actions A1
          (below).

     A data packet destined for G arrives on interface I, AND
        CouldAssert(*,G,I)==TRUE
          A data packet destined for G arrived on a downstream interface
          that is in our (*,G) outgoing interface list. We therefore
          believe we should be the forwarder for this (*,G), and so we
          transition to the "I am Assert Winner" state and perform
          Actions A1 (below).

     Receive Acceptable Assert with RPTbit set AND
        AssertTrackingDesired(*,G,I)==TRUE
          We're interested in (*,G) Asserts, either because I is a
          downstream interface for which we have (*,G) forwarding state,
          or because I is the upstream interface for RP(G) and we have
          (*,G) forwarding state. We get a (*,G) Assert that has a
          better metric than our own, so we do not win the Assert. We
          transition to "I am Assert Loser" and perform Actions A2
          (below).
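
   The receive-side transitions from NoInfo state can be illustrated
   with the following Python sketch. It is not a complete
   implementation: the data-arrival event is omitted, metrics are
   compared using the ordering of Section 4.6.3 (lower rpt_bit_flag,
   metric_preference, and route_metric win, then the higher IP address),
   and the names AssertMetric, is_better, and noinfo_transition are
   hypothetical. The sg_in_noinfo argument is an assumption that
   stands for "the (S,G) assert state machine is in NoInfo state before
   and after consideration of the received message".

```python
from collections import namedtuple

# Fields follow struct assert_metric; ip is held as an integer here.
AssertMetric = namedtuple("AssertMetric", "rpt_bit pref metric ip")
INF = float("inf")


def is_better(a, b):
    """True if a beats b: lower (rpt_bit, pref, metric), then higher IP."""
    ka = (a.rpt_bit, a.pref, a.metric)
    kb = (b.rpt_bit, b.pref, b.metric)
    return ka < kb or (ka == kb and a.ip > b.ip)


def is_infinite(m):
    """True for a metric shaped like infinite_assert_metric()."""
    return m.pref == INF and m.metric == INF


def noinfo_transition(rx, mine, could_assert, tracking_desired, sg_in_noinfo):
    """Return (next_state, actions) for a received Assert with RPTbit set.

    rx is the metric carried in the Assert; mine is my_assert_metric(*,G,I).
    """
    if not sg_in_noinfo:
        return ("NI", None)   # the (S,G) machine consumed the message
    # Inferior assert: worse than ours, and ours is not infinite.
    if could_assert and not is_infinite(mine) and is_better(mine, rx):
        return ("W", "A1")
    # Acceptable assert: better than ours, and not infinite itself.
    if tracking_desired and not is_infinite(rx) and is_better(rx, mine):
        return ("L", "A2")
    return ("NI", None)
```

   Passing an AssertCancel-style metric as rx never yields the Loser
   state in this sketch, because an infinite metric is never
   acceptable.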

   Transitions from "I am Assert Winner" State

   When in "I am Assert Winner" state, the following events trigger
   transitions, but only if the (S,G) assert state machine is in NoInfo
   state before and after consideration of the received message:

     Receive Inferior Assert
          We receive a (*,G) assert that has a worse metric than our
          own. Whoever sent the assert has lost, and so we resend a
          (*,G) Assert and restart the Assert Timer (Actions A3 below).

     Receive Preferred Assert
          We receive a (*,G) assert that has a better metric than our
          own. We transition to "I am Assert Loser" state and perform
          Actions A2 (below).

   When in "I am Assert Winner" state, the following events trigger
   transitions:

     Assert Timer Expires
          The (*,G) Assert Timer expires. As we're in the Winner state,
          we must still have (*,G) forwarding state that is
          actively being kept alive. To prevent unnecessary thrashing
          of the forwarder and periodic flooding of duplicate packets,
          we resend the (*,G) Assert and restart the Assert Timer
          (Actions A3 below).

     CouldAssert(*,G,I) -> FALSE
          Our (*,G) forwarding state or RPF interface changed so as to
          make CouldAssert(*,G,I) become false. We can no longer
          perform the actions of the assert winner, and so we transition
          to NoInfo state and perform Actions A4 (below).

   Transitions from "I am Assert Loser" State

   When in "I am Assert Loser" state, the following events trigger
   transitions, but only if the (S,G) assert state machine is in NoInfo
   state before and after consideration of the received message:

     Receive Preferred Assert with RPTbit set
          We receive a (*,G) assert that is better than that of the
          current assert winner. We stay in Loser state and perform
          Actions A2 below.

     Receive Acceptable Assert from Current Winner with RPTbit set
          We receive a (*,G) assert from the current assert winner that
          is better than our own metric for this group (although the
          metric may be worse than the winner's previous metric). We
          stay in Loser state and perform Actions A2 below.

     Receive Inferior Assert or Assert Cancel from Current Winner
          We receive an assert from the current assert winner that is
          worse than our own metric for this group (typically because
          the winner's metric became worse or is now an assert cancel).
          We transition to NoInfo state, delete this (*,G) assert state
          (Actions A5), and allow the normal PIM Join/Prune mechanisms
          to operate. Usually, we will eventually re-assert and win
          when data packets for G have started flowing again.

   When in "I am Assert Loser" state, the following events trigger
   transitions:

     Assert Timer Expires
          The (*,G) Assert Timer expires. We transition to NoInfo state
          and delete this (*,G) assert info (Actions A5).

     Current Winner's GenID Changes or NLT Expires
          The Neighbor Liveness Timer associated with the current winner
          expires or we receive a Hello message from the current winner
          reporting a different GenID from the one it previously
          reported. This indicates that the current winner's interface
          or router has gone down (and may have come back up), and so we
          must assume it no longer knows it was the winner. We
          transition to the NoInfo state, deleting the (*,G) assert
          information (Actions A5).

     AssertTrackingDesired(*,G,I)->FALSE
          AssertTrackingDesired(*,G,I) becomes FALSE. Our forwarding
          state has changed so that (*,G) Asserts on interface I are no
          longer of interest to us. We transition to NoInfo state and
          delete this (*,G) assert info (Actions A5).

     My metric becomes better than the assert winner's metric
          My routing metric, rpt_assert_metric(G,I), has changed so that
          now my assert metric for (*,G) is better than the metric we
          have stored for current assert winner. We transition to
          NoInfo state, delete this (*,G) assert state (Actions A5), and
          allow the normal PIM Join/Prune mechanisms to operate.
          Usually, we will eventually re-assert and win when data
          packets for G have started flowing again.

     RPF_interface(RP(G)) stops being interface I
          Interface I used to be the RPF interface for RP(G), and now it
          is not. We transition to NoInfo state and delete this (*,G)
          assert state (Actions A5).

     Receive Join(*,G) on interface I
          We receive a Join(*,G) that has the Upstream Neighbor Address
          field set to my primary IP address on interface I. The action
          is to transition to NoInfo state, delete this (*,G) assert
          state (Actions A5), and allow the normal PIM Join/Prune
          mechanisms to operate. If whoever sent the Join was in error,
          then the normal assert mechanism will eventually re-apply, and
          we will lose the assert again. However, whoever sent the
          Join may know that the previous assert winner has died, so
          we may end up being the new forwarder.

   (*,G) Assert State machine Actions

     A1:  Send Assert(*,G).
          Set Assert Timer to (Assert_Time - Assert_Override_Interval).
          Store self as AssertWinner(*,G,I).
          Store rpt_assert_metric(G,I) as AssertWinnerMetric(*,G,I).

     A2:  Store new assert winner as AssertWinner(*,G,I) and assert
          winner metric as AssertWinnerMetric(*,G,I).
          Set Assert Timer to Assert_Time.

     A3:  Send Assert(*,G).
          Set Assert Timer to (Assert_Time - Assert_Override_Interval).

     A4:  Send AssertCancel(*,G).
          Delete assert info (AssertWinner(*,G,I) and
          AssertWinnerMetric(*,G,I) will then return to their default
          values).

     A5:  Delete assert info (AssertWinner(*,G,I) and
          AssertWinnerMetric(*,G,I) will then return to their default
          values).

   Note that some of these actions may cause the value of
   JoinDesired(*,G) or RPF'(*,G) to change, which could cause further
   transitions in other state machines.

4.6.3. Assert Metrics

   Assert metrics are defined as:

     struct assert_metric {
       rpt_bit_flag;
       metric_preference;
       route_metric;
       ip_address;
     };

   When comparing assert_metrics, the rpt_bit_flag, metric_preference,
   and route_metric fields are compared in order, where the first lower
   value wins. If all fields are equal, the primary IP address of the
   router that sourced the Assert message is used as a tie-breaker, with
   the highest IP address winning.

   An assert metric for (S,G) to include in (or compare against) an
   Assert message sent on interface I should be computed using the
   following pseudocode:

     assert_metric
     my_assert_metric(S,G,I) {
         if( CouldAssert(S,G,I) == TRUE ) {
             return spt_assert_metric(S,I)
         } else if( CouldAssert(*,G,I) == TRUE ) {
             return rpt_assert_metric(G,I)
         } else {
             return infinite_assert_metric()
         }
     }

   spt_assert_metric(S,I) gives the assert metric we use if we're
   sending an assert based on active (S,G) forwarding state:

     assert_metric
     spt_assert_metric(S,I) {
        return {0,MRIB.pref(S),MRIB.metric(S),my_ip_address(I)}
     }

   rpt_assert_metric(G,I) gives the assert metric we use if we're
   sending an assert based only on (*,G) forwarding state:

     assert_metric
     rpt_assert_metric(G,I) {
         return {1,MRIB.pref(RP(G)),MRIB.metric(RP(G)),my_ip_address(I)}
     }

   MRIB.pref(X) and MRIB.metric(X) are the routing preference and
   routing metrics associated with the route to a particular (unicast)
   destination X, as determined by the MRIB.
   my_ip_address(I) is simply
   the router's primary IP address that is associated with the local
   interface I.

   infinite_assert_metric() gives the assert metric we use when we need
   to send an assert but do not match either (S,G) or (*,G) forwarding
   state:

     assert_metric
     infinite_assert_metric() {
          return {1,infinity,infinity,0}
     }

4.6.4. AssertCancel Messages

   An AssertCancel message is simply an RPT Assert message but with
   infinite metric. It is sent by the assert winner when it deletes the
   forwarding state that had caused the assert to occur. Other routers
   will see this metric, and it will cause any other router that has
   forwarding state to send its own assert, and to take over forwarding.

   An AssertCancel(S,G) is an infinite metric assert with the RPT bit
   set that names S as the source.

   An AssertCancel(*,G) is an infinite metric assert with the RPT bit
   set and the source set to zero.

   AssertCancel messages are simply an optimization. The original
   Assert timeout mechanism will allow a subnet to eventually become
   consistent; the AssertCancel mechanism simply causes faster
   convergence. No special processing is required for an AssertCancel
   message, since it is simply an Assert message from the current
   winner.

4.6.5. Assert State Macros

   The macros lost_assert(S,G,rpt,I), lost_assert(S,G,I), and
   lost_assert(*,G,I) are used in the olist computations of Section 4.1,
   and are defined as:

     bool lost_assert(S,G,rpt,I) {
       if ( RPF_interface(RP(G)) == I OR
            ( RPF_interface(S) == I AND SPTbit(S,G) == TRUE ) ) {
          return FALSE
       } else {
          return ( AssertWinner(S,G,I) != NULL AND
                   AssertWinner(S,G,I) != me )
       }
     }

     bool lost_assert(S,G,I) {
       if ( RPF_interface(S) == I ) {
          return FALSE
       } else {
          return ( AssertWinner(S,G,I) != NULL AND
                   AssertWinner(S,G,I) != me AND
                   (AssertWinnerMetric(S,G,I) is better
                      than spt_assert_metric(S,I)) )
       }
     }

   Note: the term "AssertWinnerMetric(S,G,I) is better than
   spt_assert_metric(S,I)" is required to correctly handle the
   transition phase when a router has (S,G) join state, but has not yet
   set the SPTbit. In this case, it needs to ignore the assert state if
   it will win the assert once the SPTbit is set.

     bool lost_assert(*,G,I) {
       if ( RPF_interface(RP(G)) == I ) {
          return FALSE
       } else {
          return ( AssertWinner(*,G,I) != NULL AND
                   AssertWinner(*,G,I) != me )
       }
     }

   AssertWinner(S,G,I) is the IP source address of the Assert(S,G)
   packet that won an Assert.

   AssertWinner(*,G,I) is the IP source address of the Assert(*,G)
   packet that won an Assert.

   AssertWinnerMetric(S,G,I) is the Assert metric of the Assert(S,G)
   packet that won an Assert.

   AssertWinnerMetric(*,G,I) is the Assert metric of the Assert(*,G)
   packet that won an Assert.

   AssertWinner(S,G,I) defaults to NULL and AssertWinnerMetric(S,G,I)
   defaults to Infinity when in the NoInfo state.

   Summary of Assert Rules and Rationale

   This section summarizes the key rules for sending and reacting to
   asserts and the rationale for these rules.
   This section is not
   intended to be and should not be treated as a definitive
   specification of protocol behavior. The state machines and
   pseudocode should be consulted for that purpose. Rather, this
   section is intended to document important aspects of the Assert
   protocol behavior and to provide information that may prove helpful
   to the reader in understanding and implementing this part of the
   protocol.

   1. Behavior: Downstream neighbors send Join(*,G) and Join(S,G)
      periodic messages to the appropriate RPF' neighbor, i.e., the RPF
      neighbor as modified by the assert process. They are not always
      sent to the RPF neighbor as indicated by the MRIB. Normal
      suppression and override rules apply.

      Rationale: By sending the periodic and triggered Join messages to
      the RPF' neighbor instead of to the RPF neighbor, the downstream
      router avoids re-triggering the Assert process with every Join.
      A side effect of sending Joins to the Assert winner is that
      traffic will not switch back to the "normal" RPF neighbor until
      the Assert times out. This will not happen until data stops
      flowing, if item 8, below, is implemented.

   2. Behavior: The assert winner for (*,G) acts as the local DR for
      (*,G) on behalf of IGMP/MLD members.

      Rationale: This is required to allow a single router to merge PIM
      and IGMP/MLD joins and leaves. Without this, overrides don't
      work.

   3. Behavior: The assert winner for (S,G) acts as the local DR for
      (S,G) on behalf of IGMPv3 members.

      Rationale: Same rationale as for item 2.

   4. Behavior: (S,G) and (*,G) prune overrides are sent to the RPF'
      neighbor and not to the regular RPF neighbor.

      Rationale: Same rationale as for item 1.

   5. Behavior: An (S,G,rpt) prune override is not sent (at all) if
      RPF'(S,G,rpt) != RPF'(*,G).

      Rationale: This avoids keeping state alive on the (S,G) tree when
      only (*,G) downstream members are left. Also, it avoids sending
      (S,G,rpt) joins to a router that is not on the (*,G) tree. This
      behavior might be confusing although this specification does
      indicate that such a join SHOULD be dropped.

   6. Behavior: An assert loser that receives a Join(S,G) with an
      Upstream Neighbor Address that is its primary IP address on that
      interface expires the (S,G) Assert Timer.

      Rationale: This is necessary in order to have rapid convergence
      in the event that the downstream router that initially sent a
      join to the prior Assert winner has undergone a topology change.

   7. Behavior: An assert loser that receives a Join(*,G) with an
      Upstream Neighbor Address that is its primary IP address on that
      interface cancels the (*,G) Assert Timer and all (S,G) assert
      timers that do not have corresponding Prune(S,G,rpt) messages in
      the compound Join/Prune message.

      Rationale: Same rationale as for item 6.

   8. Behavior: An assert winner for (*,G) or (S,G) sends a canceling
      assert when it is about to stop forwarding on a (*,G) or an (S,G)
      entry. This behavior does not apply to (S,G,rpt).

      Rationale: This allows switching back to the shared tree after
      the last SPT router on the LAN leaves. Doing this prevents
      downstream routers on the shared tree from keeping SPT state
      alive.

   9. Behavior: Resend the assert messages before timing out an assert.
      (This behavior is optional.)

      Rationale: This prevents the periodic duplicates that would
      otherwise occur each time that an assert times out and is then
      re-established.

   10. Behavior: When RPF'(S,G,rpt) changes to be the same as RPF'(*,G)
       we need to trigger a Join(S,G,rpt) to RPF'(*,G).

       Rationale: This allows switching back to the RPT after the last
       SPT member leaves.

4.7. PIM Bootstrap and RP Discovery

   For correct operation, every PIM router within a PIM domain must be
   able to map a particular multicast group address to the same RP. If
   this is not the case, then black holes may appear, where some
   receivers in the domain cannot receive some groups. A domain in this
   context is a contiguous set of routers that all implement PIM and are
   configured to operate within a common boundary.

   A notable exception to this is where a PIM domain is broken up into
   multiple administrative scope regions; these are regions where a
   border has been configured so that a range of multicast groups will
   not be forwarded across that border. For more information on
   Administratively Scoped IP Multicast, see RFC 2365. The modified
   criteria for admin-scoped regions are that the region is convex with
   respect to forwarding based on the MRIB, and that all PIM routers
   within the scope region map scoped groups to the same RP within that
   region.

   This specification does not mandate the use of a single mechanism to
   provide routers with the information to perform the group-to-RP
   mapping. Currently four mechanisms are possible, and all four have
   associated problems:

   Static Configuration
      A PIM router MUST support the static configuration of group-to-RP
      mappings. Such a mechanism is not robust to failures, but
      does at least provide a basic interoperability mechanism.

   Embedded-RP
      Embedded-RP defines an address allocation policy in which the
      address of the Rendezvous Point (RP) is encoded in an IPv6
      multicast group address [17].

   Cisco's Auto-RP
      Auto-RP uses a PIM Dense-Mode multicast group to announce
      group-to-RP mappings from a central location. This mechanism is
      not useful if PIM Dense-Mode is not being run in parallel with PIM
      Sparse-Mode, and was only intended for use with PIM Sparse-Mode
      Version 1.
      No standard specification currently exists.

   BootStrap Router (BSR)
      RFC 2362 specifies a bootstrap mechanism based on the automatic
      election of a bootstrap router (BSR). Any router in the domain
      that is configured to be a possible RP reports its candidacy to
      the BSR, and then a domain-wide flooding mechanism distributes
      the BSR's chosen set of RPs throughout the domain. As specified
      in RFC 2362, BSR is flawed in its handling of admin-scoped
      regions that are smaller than a PIM domain, but the mechanism
      does work for global-scoped groups.

   As far as PIM-SM is concerned, the only important requirement is that
   all routers in the domain (or admin scope zone for scoped regions)
   receive the same set of group-range-to-RP mappings. This may be
   achieved through the use of any of these mechanisms, or through
   alternative mechanisms not currently specified.

   It must be operationally ensured that any RP address configured,
   learned, or advertised is reachable from all routers in the PIM
   domain.

4.7.1. Group-to-RP Mapping

   Using one of the mechanisms described above, a PIM router receives
   one or more possible group-range-to-RP mappings. Each mapping
   specifies a range of multicast groups (expressed as a group and mask)
   and the RP to which such groups should be mapped. Each mapping may
   also have an associated priority. It is possible to receive multiple
   mappings, all of which might match the same multicast group; this is
   the common case with BSR. The algorithm for performing the
   group-to-RP mapping is as follows:

   1. Perform longest match on group-range to obtain a list of RPs.

   2. From this list of matching RPs, find the ones with highest
      priority.

      Eliminate any RPs from the list that have lower priorities.

   3. If only one RP remains in the list, use that RP.

   4. If multiple RPs are in the list, use the PIM hash function to
      choose one.

   Thus, if two or more group-range-to-RP mappings cover a particular
   group, the one with the longest mask is the mapping to use. If the
   mappings have the same mask length, then the one with the highest
   priority is chosen. If there is more than one matching entry with
   the same longest mask and the priorities are identical, then a hash
   function (see Section 4.7.2) is applied to choose the RP.

   This algorithm is invoked by a DR when it needs to determine an RP
   for a given group, e.g., upon reception of a packet or IGMP/MLD
   membership indication for a group for which the DR does not know the
   RP.

   Furthermore, the mapping function is invoked by all routers upon
   receiving a (*,G) Join/Prune message.

   Note that if the set of possible group-range-to-RP mappings changes,
   each router will need to check whether any existing groups are
   affected. This may, for example, cause a DR or acting DR to re-join a
   group, or cause it to restart register encapsulation to the new RP.

   Implementation note: the bootstrap mechanism described in RFC 2362
   omitted step 1 above. However, of the implementations we are aware
   of, approximately half performed step 1 anyway. Note that
   implementations of BSR that omit step 1 will not correctly
   interoperate with implementations of this specification when used
   with the BSR mechanism described in [11].

4.7.2. Hash Function

   The hash function is used by all routers within a domain, to map a
   group to one of the RPs from the matching set of group-range-to-RP
   mappings (this set all have the same longest mask length and same
   highest priority). The algorithm takes as input the group address,
   and the addresses of the candidate RPs from the mappings, and gives
   as output one RP address to be used.
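
   As an informal illustration, the selection procedure of Section
   4.7.1, combined with the Value(G,M,C(i)) computation and the IPv6
   digest that this section specifies, might be sketched in Python as
   follows for IPv4 groups. The Mapping type, the function names, and
   the "rank" field are hypothetical. In particular, how a given
   mechanism encodes priority is outside the scope of this sketch,
   which simply assumes that a smaller rank means a higher priority.
   The default hash-mask used here is the IPv4 default (most
   significant 30 bits set).

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical group-range-to-RP mapping entry."""
    group_range: ipaddress.IPv4Network
    rp: ipaddress.IPv4Address
    rank: int  # assumption: smaller rank = higher priority


def hash_value(group, rp, hash_mask=0xFFFFFFFC):
    """Value(G,M,C(i)) as given in this section."""
    g, c = int(group), int(rp)
    inner = (1103515245 * (g & hash_mask) + 12345) ^ c
    return (1103515245 * inner + 12345) % 2**31


def ipv6_digest(addr):
    """XOR of the four 32-bit segments of an IPv6 address."""
    n = int(ipaddress.IPv6Address(addr))
    digest = 0
    for shift in (96, 64, 32, 0):
        digest ^= (n >> shift) & 0xFFFFFFFF
    return digest


def select_rp(group, mappings, hash_mask=0xFFFFFFFC):
    """Return the RP for group, or None if no mapping covers it."""
    matches = [m for m in mappings if group in m.group_range]
    if not matches:
        return None
    # Step 1: keep only the longest-match group ranges.
    longest = max(m.group_range.prefixlen for m in matches)
    matches = [m for m in matches if m.group_range.prefixlen == longest]
    # Step 2: keep only the highest-priority RPs (smallest rank here).
    best = min(m.rank for m in matches)
    rps = {m.rp for m in matches if m.rank == best}
    # Steps 3 and 4: highest hash value wins; ties go to the highest address.
    return max(rps, key=lambda rp: (hash_value(group, rp, hash_mask), int(rp)))
```

   With the default hash-mask, four consecutive groups that share their
   upper 30 bits produce the same Value for each candidate and are
   therefore mapped to the same RP.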

   The protocol requires that all routers hash to the same RP within a
   domain (except for transients). The following hash function must be
   used in each router:

   1. For RP addresses in the matching group-range-to-RP mappings,
      compute a value:

      Value(G,M,C(i))=
      (1103515245 * ((1103515245 * (G&M)+12345) XOR C(i)) + 12345) mod 2^31

      where C(i) is the RP address and M is a hash-mask. If BSR is
      being used, the hash-mask is given in the Bootstrap messages. If
      BSR is not being used, the alternative mechanism that supplies
      the group-range-to-RP mappings may supply the value, or else it
      defaults to a mask with the most significant 30 bits being one
      for IPv4 and the most significant 126 bits being one for IPv6.
      The hash-mask allows a small number of consecutive groups (e.g.,
      4) to always hash to the same RP. For instance,
      hierarchically-encoded data can be sent on consecutive group
      addresses to get the same delay and fate-sharing characteristics.

      For address families other than IPv4, a 32-bit digest to be used
      as C(i) and G must first be derived from the actual RP or group
      address. Such a digest method must be used consistently
      throughout the PIM domain. For IPv6 addresses, it is RECOMMENDED
      to use the equivalent IPv4 address for an IPv4-compatible
      address, and the exclusive-or of each 32-bit segment of the
      address for all other IPv6 addresses. For example, the digest of
      the IPv6 address 3ffe:b00:c18:1::10 would be computed as
      0x3ffe0b00 ^ 0x0c180001 ^ 0x00000000 ^ 0x00000010, where ^
      represents the exclusive-or operation.

   2. The candidate RP with the highest resulting hash value is then
      the RP chosen by this Hash Function. If more than one RP has the
      same highest hash value, the RP with the highest IP address is
      chosen.

4.8. Source-Specific Multicast

   The Source-Specific Multicast (SSM) service model [6] can be
   implemented with a strict subset of the PIM-SM protocol mechanisms.
   Both regular IP Multicast and SSM semantics can coexist on a single
   router, and both can be implemented using the PIM-SM protocol. A
   range of multicast addresses, currently 232.0.0.0/8 in IPv4 and
   FF3x::/32 for IPv6, is reserved for SSM, and the choice of semantics
   is determined by the multicast group address in both data packets and
   PIM messages.

4.8.1. Protocol Modifications for SSM Destination Addresses

   The following rules override the normal PIM-SM behavior for a
   multicast address G in the SSM range:

   o A router MUST NOT send a (*,G) Join/Prune message for any reason.

   o A router MUST NOT send an (S,G,rpt) Join/Prune message for any
     reason.

   o A router MUST NOT send a Register message for any packet that is
     destined to an SSM address.

   o A router MUST NOT forward packets based on (*,G) or (S,G,rpt)
     state. The (*,G)- and (S,G,rpt)-related state summarization macros
     are NULL for any SSM address, for the purposes of packet
     forwarding.

   o A router acting as an RP MUST NOT forward any Register-encapsulated
     packet that has an SSM destination address, and SHOULD respond with
     a Register-Stop message to such a Register message.

   o A router MAY optimize out the creation and maintenance of (S,G,rpt)
     and (*,G) state for SSM destination addresses -- this state is not
     needed for SSM packets.

   The last three rules are present to deal with SSM-unaware "legacy"
   routers that may be sending (*,G) and (S,G,rpt) Join/Prunes, or
   Register messages for SSM destination addresses. Note that this
   specification does not attempt to aid an SSM-unaware "legacy" router
   with SSM operations.

4.8.2. PIM-SSM-Only Routers

   An implementer may choose to implement only the subset of PIM
   Sparse-Mode that provides SSM forwarding semantics.

   A PIM-SSM-only router MUST implement the following portions of this
   specification:

   o Upstream (S,G) state machine (Section 4.5.7)

   o Downstream (S,G) state machine (Section 4.5.3)

   o (S,G) Assert state machine (Section 4.6.1)

   o Hello messages, neighbor discovery, and DR election (Section 4.3)

   o Packet forwarding rules (Section 4.2)

   A PIM-SSM-only router does not need to implement the following
   protocol elements:

   o Register state machine (Section 4.4)

   o (*,G) and (S,G,rpt) Downstream state machines (Sections 4.5.2,
     4.5.4, and 4.5.1)

   o (*,G) and (S,G,rpt) Upstream state machines (Sections 4.5.6, 4.5.8,
     and 4.5.5)

   o (*,G) Assert state machine (Section 4.6.2)

   o Bootstrap RP Election (Section 4.7)

   o Keepalive Timer

   o SPTbit (Section 4.2.2)

   The Keepalive Timer should be treated as always running, and SPTbit
   should be treated as always being set for an SSM address.
   Additionally, the Packet forwarding rules of Section 4.2 can be
   simplified in a PIM-SSM-only router:

     oiflist = NULL

     if( iif == RPF_interface(S) AND UpstreamJPState(S,G) == Joined ) {
         oiflist = inherited_olist(S,G)
     } else if( iif is in inherited_olist(S,G) ) {
         send Assert(S,G) on iif
     }

     oiflist = oiflist (-) iif
     forward packet on all interfaces in oiflist

   This is nothing more than the reduction of the normal PIM-SM
   forwarding rule, with all (S,G,rpt) and (*,G) clauses replaced with
   NULL.

4.9. PIM Packet Formats

   This section describes the details of the packet formats for PIM
   control messages.

   All PIM control messages have IP protocol number 103.

   PIM messages are either unicast (e.g., Registers and Register-Stop)
   or multicast with TTL 1 to the 'ALL-PIM-ROUTERS' group (e.g.,
   Join/Prune, Asserts, etc.). The source address used for unicast
   messages is a domain-wide reachable address; the source address used
   for multicast messages is the link-local address of the interface on
   which the message is being sent.

   The IPv4 'ALL-PIM-ROUTERS' group is '224.0.0.13'. The IPv6
   'ALL-PIM-ROUTERS' group is 'ff02::d'.

   The PIM header common to all PIM messages is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  |   Reserved    |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   PIM Ver
      PIM Version number is 2.

   Type
      Types for specific PIM messages. PIM Types are:

   Message Type                          Destination
   ---------------------------------------------------------------------
   0 = Hello                             Multicast to ALL-PIM-ROUTERS
   1 = Register                          Unicast to RP
   2 = Register-Stop                     Unicast to source of Register
                                            packet
   3 = Join/Prune                        Multicast to ALL-PIM-ROUTERS
   4 = Bootstrap                         Multicast to ALL-PIM-ROUTERS
   5 = Assert                            Multicast to ALL-PIM-ROUTERS
   6 = Graft (used in PIM-DM only)       Unicast to RPF'(S)
   7 = Graft-Ack (used in PIM-DM only)   Unicast to source of Graft
                                            packet
   8 = Candidate-RP-Advertisement        Unicast to Domain's BSR

   Reserved
      Set to zero on transmission. Ignored upon receipt.

   Checksum
      The checksum is a standard IP checksum, i.e., the 16-bit one's
      complement of the one's complement sum of the entire PIM
      message, excluding the "Multicast data packet" section of the
      Register message. For computing the checksum, the checksum
      field is zeroed.
          If the packet's length is not an integral number of 16-bit
          words, the packet is padded with a trailing byte of zero
          before performing the checksum.

          For IPv6, the checksum also includes the IPv6 "pseudo-header",
          as specified in RFC 2460, Section 8.1 [5].  This
          "pseudo-header" is prepended to the PIM header for the
          purposes of calculating the checksum.  The "Upper-Layer Packet
          Length" in the pseudo-header is set to the length of the PIM
          message, except in Register messages where it is set to the
          length of the PIM Register header (8).  The Next Header value
          used in the pseudo-header is 103.

   If a message is received with an unrecognized PIM Ver or Type field,
   or if a message's destination does not correspond to the table above,
   the message MUST be discarded, and an error message SHOULD be logged
   to the administrator in a rate-limited manner.

4.9.1.  Encoded Source and Group Address Formats

   Encoded-Unicast Address

   An Encoded-Unicast address takes the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Addr Family  | Encoding Type |     Unicast Address
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...

   Addr Family
        The PIM address family of the 'Unicast Address' field of this
        address.

        Values 0-127 are as assigned by the IANA for Internet Address
        Families in [7].  Values 128-250 are reserved to be assigned by
        the IANA for PIM-specific Address Families.  Values 251 through
        255 are designated for private use.  As there is no assignment
        authority for this space, collisions should be expected.

   Encoding Type
        The type of encoding used within a specific Address Family.  The
        value '0' is reserved for this field and represents the native
        encoding of the Address Family.
4416 Unicast Address 4417 The unicast address as represented by the given Address Family 4418 and Encoding Type. 4420 Encoded-Group Address 4422 Encoded-Group addresses take the following format: 4424 0 1 2 3 4425 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 4426 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4427 | Addr Family | Encoding Type |B| Reserved |Z| Mask Len | 4428 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4429 | Group multicast Address 4430 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+... 4432 Addr Family 4433 Described above. 4435 Encoding Type 4436 Described above. 4438 [B]idirectional PIM 4439 Indicates the group range should use Bidirectional PIM [13]. 4440 For PIM-SM defined in this specification, this bit MUST be zero. 4442 Reserved 4443 Transmitted as zero. Ignored upon receipt. 4445 Admin Scope [Z]one 4446 indicates the group range is an admin scope zone. This is used 4447 in the Bootstrap Router Mechanism [11] only. For all other 4448 purposes, this bit is set to zero and ignored on receipt. 4450 Mask Len 4451 The Mask length field is 8 bits. The value is the number of 4452 contiguous one bits that are left justified and used as a mask; 4453 when combined with the group address, it describes a range of 4454 groups. It is less than or equal to the address length in bits 4455 for the given Address Family and Encoding Type. If the message 4456 is sent for a single group, then the Mask length must equal the 4457 address length in bits for the given Address Family and Encoding 4458 Type (e.g., 32 for IPv4 native encoding, 128 for IPv6 native 4459 encoding). 4461 Group multicast Address 4462 Contains the group address. 
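
   The Encoded-Unicast and Encoded-Group layouts above can be
   illustrated with a short decoding sketch.  It is written in Python
   purely for illustration and is not part of the specification.  The
   function names, the NATIVE_ADDR_LEN table, and the returned
   dictionary keys are assumptions made for this example; only the
   native encoding (Encoding Type 0) of IPv4 and IPv6 (IANA address
   families 1 and 2) is handled.

```python
import ipaddress
import struct

# Native-encoding address lengths in bytes, keyed by IANA address family
# number (1 = IPv4, 2 = IPv6).  Hypothetical helper table for this sketch.
NATIVE_ADDR_LEN = {1: 4, 2: 16}


def _native_length(family, encoding_type):
    # Only Encoding Type 0 (native encoding) is handled in this sketch.
    if encoding_type != 0 or family not in NATIVE_ADDR_LEN:
        raise ValueError("unsupported address family or encoding type")
    return NATIVE_ADDR_LEN[family]


def _read_address(buf, start, length):
    raw = bytes(buf[start:start + length])
    if len(raw) != length:
        raise ValueError("truncated address")
    return ipaddress.ip_address(raw)


def decode_encoded_unicast(buf, offset=0):
    """Decode an Encoded-Unicast address; return (address, next_offset)."""
    family, encoding_type = struct.unpack_from("!BB", buf, offset)
    length = _native_length(family, encoding_type)
    address = _read_address(buf, offset + 2, length)
    return address, offset + 2 + length


def decode_encoded_group(buf, offset=0):
    """Decode an Encoded-Group address; return (fields, next_offset)."""
    family, encoding_type, flags, mask_len = struct.unpack_from(
        "!BBBB", buf, offset)
    length = _native_length(family, encoding_type)
    # Mask Len may not exceed the address length in bits.
    if mask_len > length * 8:
        raise ValueError("Mask Len exceeds the address length in bits")
    fields = {
        "bidirectional": bool(flags & 0x80),     # B: first bit of octet 3
        "admin_scope_zone": bool(flags & 0x01),  # Z: last bit of octet 3
        "mask_len": mask_len,
        "group": _read_address(buf, offset + 4, length),
    }
    return fields, offset + 4 + length
```

   Both functions return the offset of the next field, so a caller can
   walk a message that carries several encoded addresses back to back.
   A buffer too short to hold the fixed leading octets makes
   struct.unpack_from raise struct.error, which a real parser would
   need to catch along with ValueError.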

   Encoded-Source Address

   An Encoded-Source address takes the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Addr Family  | Encoding Type | Rsrvd   |S|W|R|  Mask Len     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Source Address
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...

   Addr Family
        Described above.

   Encoding Type
        Described above.

   Reserved
        Transmitted as zero, ignored on receipt.

   S    The Sparse bit is a 1-bit value, set to 1 for PIM-SM.  It is
        used for PIM version 1 compatibility.

   W    The WC (or WildCard) bit is a 1-bit value for use with PIM
        Join/Prune messages (see Section 4.9.5.1).

   R    The RPT (or Rendezvous Point Tree) bit is a 1-bit value for use
        with PIM Join/Prune messages (see Section 4.9.5.1).  If the WC
        bit is 1, the RPT bit MUST be 1.

   Mask Len
        The mask length field is 8 bits.  The value is the number of
        contiguous one bits, left justified, used as a mask which,
        combined with the Source Address, describes a source subnet.
        The mask length MUST be equal to the address length in bits for
        the given Address Family and Encoding Type (32 for IPv4 native
        and 128 for IPv6 native).  A router SHOULD ignore any messages
        received with any other mask length.

   Source Address
        The source address.

4.9.2.  Hello Message Format

   A Hello message is sent periodically by routers on all interfaces.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  |   Reserved    |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          OptionType           |         OptionLength          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          OptionValue                          |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               .                               |
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          OptionType           |         OptionLength          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          OptionValue                          |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   PIM Version, Type, Reserved, Checksum
        Described in Section 4.9.

   OptionType
        The type of the option given in the following OptionValue field.

   OptionLength
        The length of the OptionValue field in bytes.

   OptionValue
        A variable length field, carrying the value of the option.

   The Option fields may contain the following values:

   o  OptionType 1: Holdtime

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |          Type = 1             |         Length = 2            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |           Holdtime            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Holdtime is the amount of time a receiver must keep the neighbor
      reachable, in seconds.  If the Holdtime is set to '0xffff', the
      receiver of this message never times out the neighbor.  This may
      be used with dial-on-demand links, to avoid keeping the link up
      with periodic Hello messages.

      An implementation MAY provide a configuration mechanism to reject
      a Hello message with Holdtime 0xffff, and/or provide a mechanism
      to remove a neighbor.

      Hello messages with a Holdtime value set to '0' are also sent by a
      router on an interface about to go down or changing IP address
      (see Section 4.3.1).  These are effectively goodbye messages, and
      the receiving routers SHOULD immediately time out the neighbor
      information for the sender.
4571 o OptionType 2: LAN Prune Delay 4573 0 1 2 3 4574 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 4575 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4576 | Type = 2 | Length = 4 | 4577 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4578 |T| Propagation_Delay | Override_Interval | 4579 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4581 The LAN Prune Delay option is used to tune the prune propagation 4582 delay on multi-access LANs. The T bit specifies the ability of the 4583 sending router to disable join suppression. Propagation_Delay and 4584 Override_Interval are time intervals in units of milliseconds. A 4585 router originating a LAN Prune Delay option on interface I sets the 4586 Propagation_Delay field to the configured value of 4587 Propagation_Delay(I) and the value of the Override_Interval field 4588 to the value of Override_Interval(I). On a receiving router, the 4589 values of the fields are used to tune the value of the 4590 Effective_Override_Interval(I) and its derived timer values. 4592 Section 4.3.3 describes how these values affect the behavior of a 4593 router. 4595 o OptionType 3 to 16: reserved to be defined in future versions of 4596 this document. 4598 o OptionType 18: deprecated and should not be used. 4600 o OptionType 19: DR Priority 4602 0 1 2 3 4603 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 4604 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4605 | Type = 19 | Length = 4 | 4606 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4607 | DR Priority | 4608 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4610 DR Priority is a 32-bit unsigned number and should be considered in 4611 the DR election as described in Section 4.3.2. 

   o  OptionType 20: Generation ID

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |          Type = 20            |         Length = 4            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         Generation ID                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Generation ID is a random 32-bit value for the interface on which
      the Hello message is sent.  The Generation ID is regenerated
      whenever PIM forwarding is started or restarted on the interface.

   o  OptionType 24: Address List

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |          Type = 24            |      Length = <Variable>      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |          Secondary Address 1 (Encoded-Unicast format)         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                     ...
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |          Secondary Address N (Encoded-Unicast format)         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      The contents of the Address List Hello option are described in
      Section 4.3.4.  All addresses within a single Address List must
      belong to the same address family.

   OptionTypes 17 through 65000 are assigned by the IANA.  OptionTypes
   65001 through 65535 are reserved for Private Use, as defined in [9].

   Unknown options MUST be ignored and MUST NOT prevent a neighbor
   relationship from being formed.  The "Holdtime" option MUST be
   implemented; the "DR Priority" and "Generation ID" options SHOULD be
   implemented.  The "Address List" option MUST be implemented for IPv6.

4.9.3.  Register Message Format

   A Register message is sent by the DR to the RP when a multicast
   packet needs to be transmitted on the RP-tree.
   The IP source address is set to the address of the DR, the
   destination address to the RP's address.  The IP TTL of the PIM
   packet is the system's normal unicast TTL.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  |   Reserved    |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |B|N|                         Reserved2                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   .                     Multicast data packet                     .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   PIM Version, Type, Reserved, Checksum
        Described in Section 4.9.  Note that in order to reduce
        encapsulation overhead, the checksum for Registers is computed
        only over the first 8 bytes of the packet, including the PIM
        header and the next 4 bytes, excluding the data packet portion.
        For interoperability reasons, a message carrying a checksum
        calculated over the entire PIM Register message should also be
        accepted.  When calculating the checksum, the IPv6
        pseudo-header "Upper-Layer Packet Length" is set to 8.

   B    The Border bit.  This specification deprecates the Border bit.
        A router MUST set the B bit to 0 on transmission and MUST
        ignore this bit on reception.

   N    The Null-Register bit.  Set to 1 by a DR that is probing the RP
        before expiring its local Register-Suppression Timer.  Set to 0
        otherwise.

   Reserved2
        Transmitted as zero, ignored on receipt.

   Multicast data packet
        The original packet sent by the source.  This packet must be of
        the same address family as the encapsulating PIM packet, e.g.,
        an IPv6 data packet must be encapsulated in an IPv6 PIM packet.
        Note that the TTL of the original packet is decremented before
        encapsulation, just like any other packet that is forwarded.
In 4699 addition, the RP decrements the TTL after decapsulating, before 4700 forwarding the packet down the shared tree. 4702 For (S,G) Null-Registers, the Multicast data packet portion 4703 contains a dummy IP header with S as the source address, G as 4704 the destination address. When generating an IPv4 Null-Register 4705 message, the fields in the dummy IPv4 header SHOULD be filled in 4706 according to the following table. Other IPv4 header fields may 4707 contain any value that is valid for that field. 4709 Field Value 4710 --------------------------------------- 4711 IP Version 4 4712 Header Length 5 4713 Checksum Header checksum 4714 Fragmentation offset 0 4715 More Fragments 0 4716 Total Length 20 4717 IP Protocol 103 (PIM) 4719 On receipt of an (S,G) Null-Register, if the Header Checksum 4720 field is non-zero, the recipient SHOULD check the checksum and 4721 discard null registers that have a bad checksum. The recipient 4722 SHOULD NOT check the value of any individual fields; a correct 4723 IP header checksum is sufficient. If the Header Checksum field 4724 is zero, the recipient MUST NOT check the checksum. 4726 With IPv6, an implementation generates a dummy IP header 4727 followed by a dummy PIM header with values according to the 4728 following table in addition to the source and group. Other IPv6 4729 header fields may contain any value that is valid for that 4730 field. 4732 Header Field Value 4733 -------------------------------------- 4734 IP Version 6 4735 Next Header 103 (PIM) 4736 Length 4 4737 PIM Version 0 4738 PIM Type 0 4739 PIM Reserved 0 4740 PIM Checksum PIM checksum including 4741 IPv6 "pseudo-header"; 4742 see Section 4.9 4744 On receipt of an IPv6 (S,G) Null-Register, if the dummy PIM 4745 header is present, the recipient SHOULD check the checksum and 4746 discard Null-Registers that have a bad checksum. 4748 4.9.4. Register-Stop Message Format 4750 A Register-Stop is unicast from the RP to the sender of the Register 4751 message. 
   The IP source address is the address to which the Register was
   addressed.  The IP destination address is the source address of the
   Register message.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |PIM Ver| Type  |   Reserved    |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Group Address (Encoded-Group format)              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Source Address (Encoded-Unicast format)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   PIM Version, Type, Reserved, Checksum
        Described in Section 4.9.

   Group Address
        The group address from the multicast data packet in the
        Register.  Format described in Section 4.9.1.  Note that for
        Register-Stops the Mask Len field contains the full address
        length in bits (e.g., 32 for IPv4 native encoding), if the
        message is sent for a single group.

   Source Address
        The host address of the source from the multicast data packet in
        the Register.  The format for this address is given in the
        Encoded-Unicast address in Section 4.9.1.  A special wild card
        value consisting of an address field of all zeros can be used to
        indicate any source.

4.9.5.  Join/Prune Message Format

   A Join/Prune message is sent by routers towards upstream sources and
   RPs.  Joins are sent to build shared trees (RP trees) or source trees
   (SPT).  Prunes are sent to prune source trees when members leave
   groups, and to prune sources that do not use the shared tree.
4789 0 1 2 3 4790 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 4791 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4792 |PIM Ver| Type | Reserved | Checksum | 4793 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4794 | Upstream Neighbor Address (Encoded-Unicast format) | 4795 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4796 | Reserved | Num groups | Holdtime | 4797 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4798 | Multicast Group Address 1 (Encoded-Group format) | 4799 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4800 | Number of Joined Sources | Number of Pruned Sources | 4801 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4802 | Joined Source Address 1 (Encoded-Source format) | 4803 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4804 | . | 4805 | . | 4806 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4807 | Joined Source Address n (Encoded-Source format) | 4808 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4809 | Pruned Source Address 1 (Encoded-Source format) | 4810 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4811 | . | 4812 | . | 4813 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4814 | Pruned Source Address n (Encoded-Source format) | 4815 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4816 | . | 4817 | . | 4818 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4819 | Multicast Group Address m (Encoded-Group format) | 4820 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4821 | Number of Joined Sources | Number of Pruned Sources | 4822 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4823 | Joined Source Address 1 (Encoded-Source format) | 4824 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4825 | . | 4826 | . 
| 4827 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4828 | Joined Source Address n (Encoded-Source format) | 4829 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4830 | Pruned Source Address 1 (Encoded-Source format) | 4831 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4832 | . | 4833 | . | 4834 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4835 | Pruned Source Address n (Encoded-Source format) | 4836 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 4837 PIM Version, Type, Reserved, Checksum 4838 Described in Section 4.9. 4840 Unicast Upstream Neighbor Address 4841 The address of the upstream neighbor that is the target of the 4842 message. The format for this address is given in the Encoded- 4843 Unicast address in Section 4.9.1. For IPv6 the source address 4844 used for multicast messages is the link-local address of the 4845 interface on which the message is being sent. For IPv4, the 4846 source address is the primary address associated with that 4847 interface. 4849 Reserved 4850 Transmitted as zero, ignored on receipt. 4852 Holdtime 4853 The amount of time a receiver MUST keep the Join/Prune state 4854 alive, in seconds. If the Holdtime is set to '0xffff', the 4855 receiver of this message SHOULD hold the state until canceled by 4856 the appropriate canceling Join/Prune message, or timed out 4857 according to local policy. This may be used with dial-on-demand 4858 links, to avoid keeping the link up with periodic Join/Prune 4859 messages. 4861 Note that the HoldTime MUST be larger than the 4862 J/P_Override_Interval(I). 4864 Number of Groups 4865 The number of multicast group sets contained in the message. 4867 Multicast group address 4868 For format description, see Section 4.9.1. 4870 Number of Joined Sources 4871 Number of joined source addresses listed for a given group. 4873 Joined Source Address 1 .. 
n 4874 This list contains the sources for a given group that the 4875 sending router will forward multicast datagrams from if received 4876 on the interface on which the Join/Prune message is sent. 4878 See Encoded-Source-Address format in Section 4.9.1. 4880 Number of Pruned Sources 4881 Number of pruned source addresses listed for a group. 4883 Pruned Source Address 1 .. n 4884 This list contains the sources for a given group that the 4885 sending router does not want to forward multicast datagrams from 4886 when received on the interface on which the Join/Prune message 4887 is sent. 4889 Within one PIM Join/Prune message, all the Multicast Group Addresses, 4890 Joined Source addresses, and Pruned Source addresses MUST be of the 4891 same address family. It is NOT PERMITTED to mix IPv4 and IPv6 4892 addresses within the same message. In addition, the address family 4893 of the fields in the message SHOULD be the same as the IP source and 4894 destination addresses of the packet. This permits maximum 4895 implementation flexibility for dual-stack IPv4/IPv6 routers. If a 4896 router receives a message with mixed family addresses, it SHOULD only 4897 process the addresses that are of the same family as the unicast 4898 upstream neighbor address. 4900 4.9.5.1. Group Set Source List Rules 4902 As described above, Join/Prune messages are composed of one or more 4903 group sets. Each set contains two source lists, the Joined Sources 4904 and the Pruned Sources. This section describes the different types 4905 of group sets and source list entries that can exist in a Join/Prune 4906 message. 4908 There is one valid group set type: 4910 Group-Specific Set 4911 A Group-Specific Set is represented by a valid IP multicast 4912 address in the group address field and the full length of the IP 4913 address in the mask length field of the Multicast Group Address. 
4914 Each Join/Prune message SHOULD NOT contain more than one group- 4915 specific set for the same IP multicast address. Each group- 4916 specific set may contain (*,G), (S,G,rpt), and (S,G) source list 4917 entries in the Joined or Pruned lists. 4919 (*,G) 4920 The (*,G) source list entry is used in Join/Prune messages 4921 sent towards the RP for the specified group. It expresses 4922 interest (or lack thereof) in receiving traffic sent to the 4923 group through the Rendezvous-Point shared tree. There MUST 4924 only be one such entry in both the Joined and Pruned lists of 4925 a group-specific set. 4927 (*,G) source list entries have the Source-Address set to the 4928 address of the RP for group G, the Source-Address Mask-Len set 4929 to the full length of the IP address, and both the WC and RPT 4930 bits of the Encoded-Source-Address set. 4932 (S,G,rpt) 4933 The (S,G,rpt) source list entry is used in Join/Prune messages 4934 sent towards the RP for the specified group. It expresses 4935 interest (or lack thereof) in receiving traffic through the 4936 shared tree sent by the specified source to this group. For 4937 each source address, the entry MUST exist in only one of the 4938 Joined and Pruned source lists of a group-specific set, but 4939 not both. 4941 (S,G,rpt) source list entries have the Source-Address set to 4942 the address of the source S, the Source-Address Mask-Len set 4943 to the full length of the IP address, and the WC bit cleared 4944 and the RPT bit set in the Encoded-Source-Address. 4946 (S,G) 4947 The (S,G) source list entry is used in Join/Prune messages 4948 sent towards the specified source. It expresses interest (or 4949 lack thereof) in receiving traffic through the shortest path 4950 tree sent by the source to the specified group. For each 4951 source address, the entry MUST exist in only one of the Joined 4952 and Pruned source lists of a group-specific set, but not both. 
4954 (S,G) source list entries have the Source-Address set to the 4955 address of the source S, the Source-Address Mask-Len set to 4956 the full length of the IP address, and both the WC and RPT 4957 bits of the Encoded-Source-Address cleared. 4959 The rules described above are sufficient to prevent invalid 4960 combinations of source list entries in group-specific sets. There 4961 are, however, a number of combinations that have a valid 4962 interpretation but that are not generated by the protocol as 4963 described in this specification: 4965 o Combining a (*,G) Join and an (S,G,rpt) Join entry in the same 4966 message is redundant as the (*,G) entry covers the information 4967 provided by the (S,G,rpt) entry. 4969 o The same applies for a (*,G) Prune and an (S,G,rpt) Prune. 4971 o The combination of a (*,G) Prune and an (S,G,rpt) Join is also not 4972 generated. (S,G,rpt) Joins are only sent when the router is 4973 receiving all traffic for a group on the shared tree and it wishes 4974 to indicate a change for the particular source. As a (*,G) prune 4975 indicates that the router no longer wishes to receive shared tree 4976 traffic, the (S,G,rpt) Join would be meaningless. 4978 o As Join/Prune messages are targeted to a single PIM neighbor, 4979 including both an (S,G) Join and an (S,G,rpt) Prune in the same 4980 message is usually redundant. The (S,G) Join informs the neighbor 4981 that the sender wishes to receive the particular source on the 4982 shortest path tree. It is therefore unnecessary for the router to 4983 say that it no longer wishes to receive it on the shared tree. 4984 However, there is a valid interpretation for this combination of 4985 entries. A downstream router may have to instruct its upstream 4986 only to start forwarding a specific source once it has started 4987 receiving the source on the shortest-path tree. 
4989 o The combination of an (S,G) Prune and an (S,G,rpt) Join could 4990 possibly be used by a router to switch from receiving a particular 4991 source on the shortest-path tree back to receiving it on the shared 4992 tree (provided that the RPF neighbor for the shortest-path and 4993 shared trees is common). However, Sparse-Mode PIM does not provide 4994 a mechanism for explicitly switching back to the shared tree. 4996 The rules are summarized in the tables below. 4998 +----------++------+-------+-----------+-----------+-------+-------+ 4999 | ||Join | Prune | Join | Prune | Join | Prune | 5000 | ||(*,G) | (*,G) | (S,G,rpt) | (S,G,rpt) | (S,G) | (S,G) | 5001 +----------++------+-------+-----------+-----------+-------+-------+ 5002 |Join ||- | no | ? | yes | yes | yes | 5003 |(*,G) || | | | | | | 5004 +----------++------+-------+-----------+-----------+-------+-------+ 5005 |Prune ||no | - | ? | ? | yes | yes | 5006 |(*,G) || | | | | | | 5007 +----------++------+-------+-----------+-----------+-------+-------+ 5008 |Join ||? | ? | - | no | yes | ? | 5009 |(S,G,rpt) || | | | | | | 5010 +----------++------+-------+-----------+-----------+-------+-------+ 5011 |Prune ||yes | ? | no | - | yes | ? | 5012 |(S,G,rpt) || | | | | | | 5013 +----------++------+-------+-----------+-----------+-------+-------+ 5014 |Join ||yes | yes | yes | yes | - | no | 5015 |(S,G) || | | | | | | 5016 +----------++------+-------+-----------+-----------+-------+-------+ 5017 |Prune ||yes | yes | ? | ? | no | - | 5018 |(S,G) || | | | | | | 5019 +----------++------+-------+-----------+-----------+-------+-------+ 5021 yes Allowed and expected. 5023 no Combination is not allowed by the protocol and MUST NOT be 5024 generated by a router. A router MAY accept these messages, but 5025 the result is undefined. An error message MAY be logged to the 5026 administrator in a rate-limited manner. 5028 ? Combination not expected by the protocol, but well-defined. 
A 5029 router MAY accept it but SHOULD NOT generate it. 5031 The order of source list entries in a group set source list is not 5032 important, except where limited by the packet format itself. 5034 4.9.5.2. Group Set Fragmentation 5036 When building a Join/Prune for a particular neighbor, a router should 5037 try to include in the message as much of the information it needs to 5038 convey to the neighbor as possible. This implies adding one group 5039 set for each multicast group that has information pending 5040 transmission and within each set including all relevant source list 5041 entries. 5043 On a router with a large amount of multicast state, the number of 5044 entries that must be included may result in packets that are larger 5045 than the maximum IP packet size. In most such cases, the information 5046 may be split into multiple messages. 5048 There is an exception with group sets that contain a (*,G) Joined 5049 source list entry. The group set expresses the router's interest in 5050 receiving all traffic for the specified group on the shared tree, and 5051 it MUST include an (S,G,rpt) Pruned source list entry for every 5052 source that the router does not wish to receive. This list of 5053 (S,G,rpt) Pruned source-list entries MUST NOT be split in multiple 5054 messages. 5056 If only N (S,G,rpt) Prune entries fit into a maximum-sized Join/Prune 5057 message, but the router has more than N (S,G,rpt) Prunes to add, then 5058 the router MUST choose to include the first N (numerically smallest 5059 in network byte order) IP addresses, and the rest are ignored (not 5060 included). 5062 4.9.6. Assert Message Format 5064 The Assert message is used to resolve forwarder conflicts between 5065 routers on a link. It is sent when a router receives a multicast 5066 data packet on an interface on which the router would normally have 5067 forwarded that packet. Assert messages may also be sent in response 5068 to an Assert message from another router. 
5070 0 1 2 3 5071 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 5072 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 5073 |PIM Ver| Type | Reserved | Checksum | 5074 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 5075 | Group Address (Encoded-Group format) | 5076 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 5077 | Source Address (Encoded-Unicast format) | 5078 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 5079 |R| Metric Preference | 5080 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 5081 | Metric | 5082 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 5084 PIM Version, Type, Reserved, Checksum 5085 Described in Section 4.9. 5087 Group Address 5088 The group address for which the router wishes to resolve the 5089 forwarding conflict. This is an Encoded-Group address, as 5090 specified in Section 4.9.1. 5092 Source Address 5093 Source address for which the router wishes to resolve the 5094 forwarding conflict. The source address MAY be set to zero for 5095 (*,G) asserts (see below). The format for this address is given 5096 in Encoded-Unicast-Address in Section 4.9.1. 5098 R RPT-bit is a 1-bit value. The RPT-bit is set to 1 for 5099 Assert(*,G) messages and 0 for Assert(S,G) messages. 5101 Metric Preference 5102 Preference value assigned to the unicast routing protocol that 5103 provided the route to the multicast source or Rendezvous-Point. 5105 Metric 5106 The unicast routing table metric associated with the route used 5107 to reach the multicast source or Rendezvous-Point. The metric 5108 is in units applicable to the unicast routing protocol used. 5110 Assert messages can be sent to resolve a forwarding conflict for all 5111 traffic to a given group or for a specific source and group. 5113 Assert(S,G) 5114 Source-specific asserts are sent by routers forwarding a 5115 specific source on the shortest-path tree (SPTbit is TRUE). 
        (S,G) Asserts have the Group-Address field set to the group G
        and the Source-Address field set to the source S.  The RPT-bit
        is set to 0, the Metric-Preference is set to MRIB.pref(S) and
        the Metric is set to MRIB.metric(S).

   Assert(*,G)
        Group-specific asserts are sent by routers forwarding data for
        the group and source(s) under contention on the shared tree.
        (*,G) asserts have the Group-Address field set to the group G.
        For data-triggered Asserts, the Source-Address field MAY be set
        to the IP source address of the data packet that triggered the
        Assert and is set to zero otherwise.  The RPT-bit is set to 1,
        the Metric-Preference is set to MRIB.pref(RP(G)), and the Metric
        is set to MRIB.metric(RP(G)).

4.10.  PIM Timers

   PIM-SM maintains the following timers, as discussed in Section 4.1.
   All timers are countdown timers; they are set to a value and count
   down to zero, at which point they typically trigger an action.  Of
   course they can just as easily be implemented as count-up timers,
   where the absolute expiry time is stored and compared against a real-
   time clock, but the language in this specification assumes that they
   count downwards to zero.
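
   As an illustration of the count-up alternative mentioned above, the
   following non-normative sketch stores an absolute expiry time and
   derives the countdown value from a clock.  The class name
   CountdownTimer, its method names, and the injectable clock are
   assumptions made for this example; they are not defined by this
   specification.

```python
import time


class CountdownTimer:
    """Countdown-style timer backed by a stored absolute expiry time.

    Hypothetical helper for illustration only; not part of the protocol.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock    # injectable clock, so tests can control time
        self._expiry = None    # None means the timer is not running

    def set(self, seconds):
        """Start or restart the timer so that it expires after `seconds`."""
        self._expiry = self._clock() + seconds

    def cancel(self):
        """Stop the timer without triggering its expiry action."""
        self._expiry = None

    def is_running(self):
        return self._expiry is not None

    def remaining(self):
        """Value a true countdown timer would hold now (never negative)."""
        if self._expiry is None:
            return 0.0
        return max(0.0, self._expiry - self._clock())

    def expired(self):
        """True once a running timer has counted down to zero."""
        return self._expiry is not None and self._clock() >= self._expiry
```

   A caller would poll expired() (or schedule a wake-up for the stored
   expiry time) and then perform the action that the specification
   associates with the timer reaching zero.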

   Global Timers

     Per interface (I):

       Hello Timer: HT(I)

       Per neighbor (N):

         Neighbor Liveness Timer: NLT(N,I)

       Per Group (G):

         (*,G) Join Expiry Timer: ET(*,G,I)

         (*,G) Prune-Pending Timer: PPT(*,G,I)

         (*,G) Assert Timer: AT(*,G,I)

         Per Source (S):

           (S,G) Join Expiry Timer: ET(S,G,I)

           (S,G) Prune-Pending Timer: PPT(S,G,I)

           (S,G) Assert Timer: AT(S,G,I)

           (S,G,rpt) Prune Expiry Timer: ET(S,G,rpt,I)

           (S,G,rpt) Prune-Pending Timer: PPT(S,G,rpt,I)

     Per Group (G):

       (*,G) Upstream Join Timer: JT(*,G)

       Per Source (S):

         (S,G) Upstream Join Timer: JT(S,G)

         (S,G) Keepalive Timer: KAT(S,G)

         (S,G,rpt) Upstream Override Timer: OT(S,G,rpt)

   At the DRs or relevant Assert Winners only:

     Per Source,Group pair (S,G):

       Register-Stop Timer: RST(S,G)

4.11.  Timer Values

   When timers are started or restarted, they are set to default values.
   This section summarizes those default values.

   Note that protocol events or configuration may change the default
   value of a timer on a specific interface.  When timers are
   initialized in this document, the value specific to the interface in
   context must be used.

   Some of the timers listed below (Prune-Pending, Upstream Join,
   Upstream Override) can be set to values that depend on the settings
   of the Propagation_Delay and Override_Interval of the corresponding
   interface.  The default values for these are given below.

   Variable Name: Propagation_Delay(I)

+-------------------------------+--------------+----------------------+
| Value Name                    | Value        | Explanation          |
+-------------------------------+--------------+----------------------+
| Propagation_delay_default     | 0.5 secs     | Expected             |
|                               |              | propagation delay    |
|                               |              | over the local       |
|                               |              | link.                |
+-------------------------------+--------------+----------------------+

   The default value of the Propagation_delay_default is chosen to be
   relatively large to provide compatibility with older PIM
   implementations.

   Variable Name: Override_Interval(I)

+--------------------------+-----------------+-------------------------+
| Value Name               | Value           | Explanation             |
+--------------------------+-----------------+-------------------------+
| t_override_default       | 2.5 secs        | Default delay           |
|                          |                 | interval over           |
|                          |                 | which to randomize      |
|                          |                 | when scheduling a       |
|                          |                 | delayed Join            |
|                          |                 | message.                |
+--------------------------+-----------------+-------------------------+

   Timer Name: Hello Timer (HT(I))

+---------------------+--------+---------------------------------------+
|Value Name           | Value  | Explanation                           |
+---------------------+--------+---------------------------------------+
|Hello_Period         | 30 secs| Periodic interval for Hello messages. |
+---------------------+--------+---------------------------------------+
|Triggered_Hello_Delay| 5 secs | Randomized interval for initial Hello |
|                     |        | message on bootup or triggered Hello  |
|                     |        | message to a rebooting neighbor.      |
+---------------------+--------+---------------------------------------+

   At system power-up, the timer is initialized to rand(0,
   Triggered_Hello_Delay) to prevent synchronization.  When a new or
   rebooting neighbor is detected, a responding Hello is sent within
   rand(0, Triggered_Hello_Delay).
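
   The Hello Timer initialization above can be sketched as follows.
   This is a non-normative illustration: it assumes that rand(x, y)
   denotes a uniformly distributed value between x and y, and that
   later Hellos follow the first at exactly Hello_Period intervals.
   The function names initial_hello_delay and hello_schedule are
   hypothetical.

```python
import random

HELLO_PERIOD = 30.0           # seconds, default from the Hello Timer table
TRIGGERED_HELLO_DELAY = 5.0   # seconds, default from the Hello Timer table


def initial_hello_delay(rng=random):
    """Delay before the first Hello after power-up, or before the Hello
    sent in response to a new or rebooting neighbor.

    Assumption: rand(0, Triggered_Hello_Delay) is modeled as uniform.
    """
    return rng.uniform(0.0, TRIGGERED_HELLO_DELAY)


def hello_schedule(count, rng=random):
    """Send times, in seconds since power-up, of the first `count` Hellos."""
    first = initial_hello_delay(rng)
    return [first + i * HELLO_PERIOD for i in range(count)]
```

   Because each router draws its own initial delay, routers that power
   up together are unlikely to send their Hello messages at the same
   instant.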

   Timer Name: Neighbor Liveness Timer (NLT(N,I))

+--------------------------+----------------------+--------------------+
| Value Name               | Value                | Explanation        |
+--------------------------+----------------------+--------------------+
| Default_Hello_Holdtime   | 3.5 * Hello_Period   | Default holdtime   |
|                          |                      | to keep neighbor   |
|                          |                      | state alive        |
+--------------------------+----------------------+--------------------+
| Hello_Holdtime           | from message         | Holdtime from      |
|                          |                      | Hello Message      |
|                          |                      | Holdtime option.   |
+--------------------------+----------------------+--------------------+

   The Holdtime in a Hello Message should be set to (3.5 *
   Hello_Period), giving a default value of 105 seconds.

   Timer Names: Expiry Timer (ET(*,G,I), ET(S,G,I), ET(S,G,rpt,I))

+----------------+----------------+------------------------------------+
| Value Name     | Value          | Explanation                        |
+----------------+----------------+------------------------------------+
| J/P_HoldTime   | from message   | Holdtime from Join/Prune Message   |
+----------------+----------------+------------------------------------+

   See details of JT(*,G) for the Holdtime that is included in
   Join/Prune Messages.
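
   The Neighbor Liveness and Expiry Timer values above can be worked
   through in a short non-normative sketch.  The function names are
   hypothetical, and the sketch assumes that a Hello without a Holdtime
   option is represented by None; any special Holdtime values defined
   elsewhere in this specification are deliberately not modeled.

```python
HELLO_PERIOD = 30.0                             # seconds
DEFAULT_HELLO_HOLDTIME = 3.5 * HELLO_PERIOD     # 105 seconds


def neighbor_liveness_value(hello_holdtime=None):
    """Value loaded into NLT(N,I) when a Hello is received.

    hello_holdtime: the Holdtime option from the Hello message, or None
    (an assumption of this sketch) when the option is absent.
    """
    if hello_holdtime is None:
        return DEFAULT_HELLO_HOLDTIME
    return float(hello_holdtime)


def expiry_timer_value(jp_holdtime):
    """Value loaded into ET(*,G,I), ET(S,G,I), or ET(S,G,rpt,I): the
    Holdtime carried in the received Join/Prune message."""
    return float(jp_holdtime)
```

   With the default Hello_Period of 30 seconds, DEFAULT_HELLO_HOLDTIME
   evaluates to the 105 seconds stated above.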

   Timer Names: Prune-Pending Timer (PPT(*,G,I), PPT(S,G,I),
   PPT(S,G,rpt,I))

+--------------------------+---------------------+---------------------+
|Value Name                | Value               | Explanation         |
+--------------------------+---------------------+---------------------+
|J/P_Override_Interval(I)  | Default:            | Short period after  |
|                          | Effective_          | a join or prune to  |
|                          | Propagation_        | allow other         |
|                          | Delay(I) +          | routers on the LAN  |
|                          | Effective_Override_ | to override the     |
|                          | Interval(I)         | join or prune       |
+--------------------------+---------------------+---------------------+

   Note that both the Effective_Propagation_Delay(I) and the
   Effective_Override_Interval(I) are interface-specific values that may
   change when Hello messages are received (see Section 4.3.3).

   Timer Names: Assert Timer (AT(*,G,I), AT(S,G,I))

+---------------------------+---------------------+--------------------+
| Value Name                | Value               | Explanation        |
+---------------------------+---------------------+--------------------+
| Assert_Override_Interval  | Default: 3 secs     | Short interval     |
|                           |                     | before an assert   |
|                           |                     | times out where    |
|                           |                     | the assert winner  |
|                           |                     | resends an Assert  |
|                           |                     | message            |
+---------------------------+---------------------+--------------------+
| Assert_Time               | Default: 180 secs   | Period after last  |
|                           |                     | assert before      |
|                           |                     | assert state is    |
|                           |                     | timed out          |
+---------------------------+---------------------+--------------------+

   Note that for historical reasons, the Assert message lacks a Holdtime
   field.  Thus, changing the Assert Time from the default value is not
   recommended.
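
   The Prune-Pending and Assert Timer defaults above reduce to simple
   arithmetic, shown in the non-normative sketch below.  The function
   names are hypothetical.  The second function reflects one reading of
   the Assert_Override_Interval row, namely that the assert winner
   resends its Assert that interval before Assert_Time would elapse.

```python
PROPAGATION_DELAY_DEFAULT = 0.5   # seconds (Propagation_delay_default)
T_OVERRIDE_DEFAULT = 2.5          # seconds (t_override_default)
ASSERT_TIME = 180.0               # seconds
ASSERT_OVERRIDE_INTERVAL = 3.0    # seconds


def jp_override_interval(
    effective_propagation_delay=PROPAGATION_DELAY_DEFAULT,
    effective_override_interval=T_OVERRIDE_DEFAULT,
):
    """J/P_Override_Interval(I): the value used for the Prune-Pending
    Timers on interface I."""
    return effective_propagation_delay + effective_override_interval


def assert_winner_resend_delay(
    assert_time=ASSERT_TIME,
    assert_override_interval=ASSERT_OVERRIDE_INTERVAL,
):
    """Delay after which the assert winner resends its Assert, so that
    the refresh arrives before the losers' assert state times out."""
    return assert_time - assert_override_interval
```

   With the defaults, jp_override_interval() yields 3.0 seconds and
   assert_winner_resend_delay() yields 177 seconds.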

   Timer Names: Upstream Join Timer (JT(*,G), JT(S,G))

+-------------+--------------------+-----------------------------------+
|Value Name   | Value              | Explanation                       |
+-------------+--------------------+-----------------------------------+
|t_periodic   | Default: 60 secs   | Period between Join/Prune Messages|
+-------------+--------------------+-----------------------------------+
|t_suppressed | rand(1.1 *         | Suppression period when someone   |
|             | t_periodic, 1.4 *  | else sends a J/P message so we    |
|             | t_periodic) when   | don't need to do so.              |
|             | Suppression_       |                                   |
|             | Enabled(I) is      |                                   |
|             | true, 0 otherwise  |                                   |
+-------------+--------------------+-----------------------------------+
|t_override   | rand(0, Effective_ | Randomized delay to prevent       |
|             | Override_          | response implosion when sending a |
|             | Interval(I))       | join message to override someone  |
|             |                    | else's Prune message.             |
+-------------+--------------------+-----------------------------------+

   t_periodic may be set to take into account such things as the
   configured bandwidth and expected average number of multicast route
   entries for the attached network or link (e.g., the period would be
   longer for lower-speed links, or for routers in the center of the
   network that expect to have a larger number of entries).  If the
   Join/Prune-Period is modified during operation, these changes should
   be made relatively infrequently, and the router should continue to
   refresh at its previous Join/Prune-Period for at least
   Join/Prune-Holdtime, in order to allow the upstream router to adapt.

   The holdtime specified in a Join/Prune message should be set to (3.5
   * t_periodic).

   t_override depends on the Effective Override Interval of the upstream
   interface, which may change when Hello messages are received.
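
   The Upstream Join Timer values in the table above can be sketched as
   follows.  This is a non-normative illustration that assumes rand(x,
   y) is a uniform draw between x and y; the function names mirror the
   value names in the table but are otherwise hypothetical.

```python
import random

T_PERIODIC = 60.0   # seconds, default period between Join/Prune messages


def t_suppressed(suppression_enabled, t_periodic=T_PERIODIC, rng=random):
    """Suppression period applied when another router's Join is seen.

    Returns 0 when Suppression_Enabled(I) is false.
    """
    if not suppression_enabled:
        return 0.0
    return rng.uniform(1.1 * t_periodic, 1.4 * t_periodic)


def t_override(effective_override_interval, rng=random):
    """Randomized delay before sending a Join that overrides another
    router's Prune."""
    return rng.uniform(0.0, effective_override_interval)


def jp_holdtime(t_periodic=T_PERIODIC):
    """Holdtime to place in an outgoing Join/Prune message."""
    return 3.5 * t_periodic
```

   With the default t_periodic of 60 seconds, jp_holdtime() yields 210
   seconds, and t_suppressed(True) falls between 66 and 84 seconds.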

   t_suppressed depends on the Suppression State of the upstream
   interface (Section 4.3.3) and becomes zero when suppression is
   disabled.

   Timer Name: Upstream Override Timer (OT(S,G,rpt))

+---------------+--------------------------+---------------------------+
| Value Name    | Value                    | Explanation               |
+---------------+--------------------------+---------------------------+
| t_override    | see Upstream Join Timer  | see Upstream Join Timer   |
+---------------+--------------------------+---------------------------+

   The upstream Override Timer is only ever set to t_override; this
   value is defined in the section on Upstream Join Timers.

   Timer Name: Keepalive Timer (KAT(S,G))

+-----------------------+-----------------------+----------------------+
| Value Name            | Value                 | Explanation          |
+-----------------------+-----------------------+----------------------+
| Keepalive_Period      | Default: 210 secs     | Period after last    |
|                       |                       | (S,G) data packet    |
|                       |                       | during which (S,G)   |
|                       |                       | Join state will be   |
|                       |                       | maintained even in   |
|                       |                       | the absence of       |
|                       |                       | (S,G) Join           |
|                       |                       | messages.            |
+-----------------------+-----------------------+----------------------+
| RP_Keepalive_Period   | ( 3 * Register_       | As                   |
|                       | Suppression_Time )    | Keepalive_Period,    |
|                       | + Register_           | but at the RP when   |
|                       | Probe_Time            | a Register-Stop is   |
|                       |                       | sent.                |
+-----------------------+-----------------------+----------------------+

   The normal keepalive period for the KAT(S,G) defaults to 210 seconds.
   However, at the RP, the keepalive period must be at least the
   Register_Suppression_Time, or the RP may time out the (S,G) state
   before the next Null-Register arrives.  Thus, the KAT(S,G) is set to
   max(Keepalive_Period, RP_Keepalive_Period) when a Register-Stop is
   sent.
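
   The Keepalive Timer rule at the RP can be worked through in a short
   non-normative sketch.  It uses the defaults of 60 seconds for
   Register_Suppression_Time and 5 seconds for Register_Probe_Time that
   are given for the Register-Stop Timer; the function names are
   hypothetical.

```python
KEEPALIVE_PERIOD = 210.0            # seconds
REGISTER_SUPPRESSION_TIME = 60.0    # seconds
REGISTER_PROBE_TIME = 5.0           # seconds


def rp_keepalive_period(
    register_suppression_time=REGISTER_SUPPRESSION_TIME,
    register_probe_time=REGISTER_PROBE_TIME,
):
    """RP_Keepalive_Period = (3 * Register_Suppression_Time)
    + Register_Probe_Time."""
    return 3 * register_suppression_time + register_probe_time


def kat_on_register_stop(
    keepalive_period=KEEPALIVE_PERIOD,
    register_suppression_time=REGISTER_SUPPRESSION_TIME,
    register_probe_time=REGISTER_PROBE_TIME,
):
    """Value the RP loads into KAT(S,G) when it sends a Register-Stop."""
    return max(
        keepalive_period,
        rp_keepalive_period(register_suppression_time, register_probe_time),
    )
```

   With the defaults, RP_Keepalive_Period is 185 seconds, so the timer
   is set to the larger Keepalive_Period of 210 seconds.  If
   Register_Suppression_Time were configured to 120 seconds, the RP
   value of 365 seconds would apply instead.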

   Timer Name: Register-Stop Timer (RST(S,G))

+---------------------------+--------------------+---------------------+
|Value Name                 | Value              | Explanation         |
+---------------------------+--------------------+---------------------+
|Register_Suppression_Time  | Default: 60 secs   | Period during       |
|                           |                    | which a DR stops    |
|                           |                    | sending Register-   |
|                           |                    | encapsulated data   |
|                           |                    | to the RP after     |
|                           |                    | receiving a         |
|                           |                    | Register-Stop       |
|                           |                    | message.            |
+---------------------------+--------------------+---------------------+
|Register_Probe_Time        | Default: 5 secs    | Time before RST     |
|                           |                    | expires when a DR   |
|                           |                    | may send a Null-    |
|                           |                    | Register to the RP  |
|                           |                    | to cause it to      |
|                           |                    | resend a Register-  |
|                           |                    | Stop message.       |
+---------------------------+--------------------+---------------------+

   If the Register_Suppression_Time or the Register_Probe_Time is
   configured to a value other than the default, it MUST be ensured that
   the value of the Register_Probe_Time is less than half the value of
   the Register_Suppression_Time to prevent a possible negative value in
   the setting of the Register-Stop Timer.

5.  IANA Considerations

5.1.  PIM Address Family

   The PIM Address Family field was chosen to be 8 bits as a tradeoff
   between packet format and use of the IANA assigned numbers.  When the
   PIM packet format was designed, only 15 values were assigned for
   Address Families and large numbers of new Address Family values were
   not envisioned, so 8 bits seemed large enough.  However, the IANA
   assigns Address Families in a 16-bit field.  Therefore, the PIM
   Address Family is allocated as follows:

      Values 0 through 127 are designated to have the same meaning as
      IANA-assigned Address Family Numbers [7].

      Values 128 through 250 are designated to be assigned for PIM by
      the IANA based upon IESG Approval, as defined in [9].

      Values 251 through 255 are designated for Private Use, as defined
      in [9].

5.2.  PIM Hello Options

   Values 17 through 65000 are to be assigned by the IANA.  Since the
   space is large, they may be assigned as First Come First Served as
   defined in [9].  Such assignments are valid for one year and may be
   renewed.  Permanent assignments require a specification (see
   "Specification Required" in [9]).

6.  Security Considerations

   This section describes various possible security concerns related to
   the PIM-SM protocol, including a description of how to use IPsec to
   secure the protocol.  The reader is referred to [15] and [16] for
   further discussion of PIM-SM and multicast security.  The IPsec
   authentication header [8] MAY be used to provide data integrity
   protection and groupwise data origin authentication of PIM protocol
   messages.  Authentication of PIM messages can protect against
   unwanted behaviors caused by unauthorized or altered PIM messages.

   Note that PIM relies upon an MRIB populated outside of PIM so
   securing the sources of change to the MRIB is RECOMMENDED.

6.1.  Attacks Based on Forged Messages

   The extent of possible damage depends on the type of counterfeit
   messages accepted.  We next consider the impact of possible
   forgeries, including forged link-local (Join/Prune, Hello, and
   Assert) and forged unicast (Register and Register-Stop) messages.

6.1.1.  Forged Link-Local Messages

   Join/Prune, Hello, and Assert messages are all sent to the link-local
   ALL_PIM_ROUTERS multicast addresses and thus are not forwarded by a
   compliant router.  A forged message of this type can only reach a LAN
   if it was sent by a local host or if it was allowed onto the LAN by a
   compromised or non-compliant router.

   1.  A forged Join/Prune message can cause multicast traffic to be
       delivered to links where there are no legitimate requesters,
       potentially wasting bandwidth on that link.  A forged leave
       message on a multi-access LAN is generally not a significant
       attack in PIM, because any legitimately joined router on the LAN
       would override the leave with a join before the upstream router
       stops forwarding data to the LAN.

   2.  By forging a Hello message, an unauthorized router can cause
       itself to be elected as the designated router on a LAN.  The
       designated router on a LAN is (in the absence of asserts)
       responsible for forwarding traffic to that LAN on behalf of any
       local members.  The designated router is also responsible for
       register-encapsulating to the RP any packets that are originated
       by hosts on the LAN.  Thus, the ability of local hosts to send
       and receive multicast traffic may be compromised by a forged
       Hello message.

   3.  By forging an Assert message on a multi-access LAN, an attacker
       could cause the legitimate designated forwarder to stop
       forwarding traffic to the LAN.  Such a forgery would prevent any
       hosts downstream of that LAN from receiving traffic.

6.1.2.  Forged Unicast Messages

   Register messages and Register-Stop messages are forwarded by
   intermediate routers to their destination using normal IP forwarding.
   Without data origin authentication, an attacker who is located
   anywhere in the network may be able to forge a Register or Register-
   Stop message.  We consider the effect of a forgery of each of these
   messages next.

   1.  By forging a Register message, an attacker can cause the RP to
       inject forged traffic onto the shared multicast tree.

   2.  By forging a Register-Stop message, an attacker can prevent a
       legitimate DR from Registering packets to the RP.  This can
       prevent local hosts on that LAN from sending multicast packets.

   The above two PIM messages are not changed by intermediate routers
   and need only be examined by the intended receiver.  Thus, these
   messages can be authenticated end-to-end, using AH.  Attacks on
   Register and Register-Stop messages do not apply to a PIM-SSM-only
   implementation, as these messages are not required for PIM-SSM.

6.2.  Non-Cryptographic Authentication Mechanisms

   A PIM router SHOULD provide an option to limit the set of neighbors
   from which it will accept Join/Prune, Assert, and Hello messages.
   Either static configuration of IP addresses or an IPsec security
   association MAY be used.  Furthermore, a PIM router SHOULD NOT accept
   protocol messages from a router from which it has not yet received a
   valid Hello message.

   A Designated Router MUST NOT register-encapsulate a packet and send
   it to the RP unless the source address of the packet is a legal
   address for the subnet on which the packet was received.  Similarly,
   a Designated Router SHOULD NOT accept a Register-Stop packet whose IP
   source address is not a valid RP address for the local domain.

   An implementation SHOULD provide a mechanism to allow an RP to
   restrict the range of source addresses from which it accepts
   Register-encapsulated packets.

   All options that restrict the range of addresses from which packets
   are accepted MUST default to allowing all packets.

6.3.  Authentication Using IPsec

   The IPsec [8] transport mode using the Authentication Header (AH) is
   the recommended method to prevent the above attacks against PIM.  The
   specific AH authentication algorithm and parameters, including the
   choice of authentication algorithm and the choice of key, are
   configured by the network administrator.  When IPsec authentication
   is used, a PIM router should reject (drop without processing) any
   unauthorized PIM protocol messages.

   To use IPsec, the administrator of a PIM network configures each PIM
   router with one or more security associations (SAs) and associated
   Security Parameter Indexes (SPIs) that are used by senders to
   authenticate PIM protocol messages and are used by receivers to
   authenticate received PIM protocol messages.  This document does not
   describe protocols for establishing SAs.  It assumes that manual
   configuration of SAs is performed, but it does not preclude the use
   of a negotiation protocol such as the Internet Key Exchange [14] to
   establish SAs.

   IPsec [8] provides protection against replayed unicast and multicast
   messages.  The anti-replay option for IPsec SHOULD be enabled on all
   SAs.

   The following sections describe the SAs required to protect PIM
   protocol messages.

6.3.1.  Protecting Link-Local Multicast Messages

   The network administrator defines an SA and SPI that are to be used
   to authenticate all link-local PIM protocol messages (Hello,
   Join/Prune, and Assert) on each link in a PIM domain.

   IPsec [8] allows (but does not require) different Security Policy
   Databases (SPD) for each router interface.  If available, it may be
   desirable to configure the Security Policy Database at a PIM router
   such that all incoming and outgoing Join/Prune, Assert, and Hello
   packets use a different SA for each incoming or outgoing interface.

6.3.2.  Protecting Unicast Messages

   IPsec can also be used to provide data origin authentication and data
   integrity protection for the Register and Register-Stop unicast
   messages.

6.3.2.1.  Register Messages

   The Security Policy Database at every PIM router is configured to
   select an SA to use when sending PIM Register packets to each
   rendezvous point.

   In the most general mode of operation, the Security Policy Database
   at each DR is configured to select a unique SA and SPI for traffic
   sent to each RP.  This allows each DR to have a different
   authentication algorithm and key to talk to the RP.  However, this
   creates a daunting key management and distribution problem for the
   network administrator.  Therefore, it may be preferable in PIM
   domains where all Designated Routers are under a single
   administrative control that the same authentication algorithm
   parameters (including the key) be used for all Register packets in a
   domain, regardless of which routers are the RP and the DR.

   In this "single shared key" mode of operation, the network
   administrator must choose an SPI for each DR that will be used to
   send the PIM protocol packets.  The Security Policy Database at every
   DR is configured to select an SA (including the authentication
   algorithm, authentication parameters, and this SPI) when sending
   Register messages to the RP.

   By using a single authentication algorithm and associated parameters,
   the key distribution problem is simplified.  Note, however, that this
   method has the property that, in order to change the authentication
   method or authentication key used, all routers in the domain must be
   updated.

6.3.2.2.  Register-Stop Messages

   Similarly, the Security Policy Database at each Rendezvous Point
   should be configured to choose an SA to use when sending Register-
   Stop messages.  Because Register-Stop messages are unicast to the
   destination DR, a different SA and a potentially unique SPI are
   required for each DR.

   In order to simplify the management problem, it may be acceptable to
   use the same authentication algorithm and authentication parameters,
   regardless of the sending RP and regardless of the destination DR.
   Although a unique SA is needed for each DR, the same authentication
   algorithm and authentication algorithm parameters (secret key) can be
   shared by all DRs and by all RPs.

6.4.  Denial-of-Service Attacks

   There are a number of possible denial-of-service attacks against PIM
   that can be caused by generating false PIM protocol messages or even
   by generating false traffic.  Authenticating PIM protocol traffic
   prevents some, but not all, of these attacks.  Two of the possible
   attacks include:

   -  Sending packets to many different group addresses quickly can be a
      denial-of-service attack in and of itself.  This will cause many
      register-encapsulated packets, loading the DR, the RP, and the
      routers between the DR and the RP.

   -  Forging Join messages can cause a multicast tree to get set up.  A
      large number of forged joins can consume router resources and
      result in denial of service.

7.  Acknowledgements

   PIM-SM was designed over many years by a large group of people,
   including ideas, comments, and corrections from Deborah Estrin, Dino
   Farinacci, Ahmed Helmy, David Thaler, Steve Deering, Van Jacobson, C.
   Liu, Puneet Sharma, Liming Wei, Tom Pusateri, Tony Ballardie, Scott
   Brim, Jon Crowcroft, Paul Francis, Joel Halpern, Horst Hodel, Polly
   Huang, Stephen Ostrowski, Lixia Zhang, Girish Chandranmenon, Brian
   Haberman, Hal Sandick, Mike Mroz, Garry Kump, Pavlin Radoslavov, Mike
   Davison, James Huang, Christopher Thomas Brown, and James Lingard.

   Thanks are due to the American Licorice Company, for its obscure but
   possibly essential role in the creation of this document.

8.  Normative References

   [1]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
         Levels", BCP 14, RFC 2119, March 1997.

   [2]   Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
         Thyagarajan, "Internet Group Management Protocol, Version 3",
         RFC 3376, October 2002.

   [3]   Deering, S., "Host extensions for IP multicasting", STD 5, RFC
         1112, August 1989.

   [4]   Deering, S., Fenner, W., and B. Haberman, "Multicast Listener
         Discovery (MLD) for IPv6", RFC 2710, October 1999.

   [5]   Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
         Specification", RFC 2460, December 1998.

   [6]   Holbrook, H. and B. Cain, "Source-Specific Multicast for IP",
         RFC 4607, August 2006.

   [7]   IANA, "Address Family Numbers".

   [8]   Kent, S. and K. Seo, "Security Architecture for the Internet
         Protocol", RFC 4301, December 2005.

   [9]   Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
         Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.

9.  Informative References

   [10]  Bates, T., Rekhter, Y., Chandra, R., and D. Katz,
         "Multiprotocol Extensions for BGP-4", RFC 4760, January 2007.

   [11]  Bhaskar, N., Gall, A., Lingard, J., and S. Venaas, "Bootstrap
         Router (BSR) Mechanism for Protocol Independent Multicast
         (PIM)", RFC 5059, January 2008.

   [12]  Black, D., "Differentiated Services and Tunnels", RFC 2983,
         October 2000.

   [13]  Handley, M., Kouvelas, I., Speakman, T., and L. Vicisano,
         "Bidirectional Protocol Independent Multicast (BIDIR-PIM)",
         RFC 5015, October 2007.

   [14]  Kaufman, C., Hoffman, P., Nir, Y., and P. Eronen, "Internet Key
         Exchange (IKEv2) Protocol", RFC 5996, September 2010.

   [15]  Savola, P., Lehtonen, R., and D. Meyer, "Protocol Independent
         Multicast - Sparse Mode (PIM-SM) Multicast Routing Security
         Issues and Enhancements", RFC 4609, August 2006.

   [16]  Savola, P. and J. Lingard, "Host Threats to Protocol
         Independent Multicast (PIM)", RFC 5294, August 2008.

   [17]  Savola, P. and B. Haberman, "Embedding the Rendezvous Point
         (RP) Address in an IPv6 Multicast Address", RFC 3956, November
         2004.

Authors' Addresses

   Bill Fenner
   AT&T Labs - Research
   1 River Oaks Place
   San Jose, CA 95134

   EMail: fenner@research.att.com

   Mark Handley
   Department of Computer Science
   University College London
   Gower Street
   London WC1E 6BT
   United Kingdom

   EMail: M.Handley@cs.ucl.ac.uk

   Hugh Holbrook
   Arastra, Inc.
   P.O. Box 10905
   Palo Alto, CA 94303

   EMail: holbrook@arastra.com

   Isidor Kouvelas
   Cisco Systems, Inc.
   170 W. Tasman Drive
   San Jose, CA 95134

   EMail: kouvelas@cisco.com

   Rishabh Parekh
   Cisco Systems, Inc.
   170 W. Tasman Drive
   San Jose, CA 95134

   EMail: riparekh@cisco.com

   Zhaohui (Jeffrey) Zhang
   Juniper Networks
   10 Technology Park Drive
   Westford, MA 01886

   EMail: zzhang@juniper.net

   Lianshu Zheng
   Huawei Technologies Co., Ltd
   No. 3 Xinxi Road, Shang-di, Hai-dian District
   Beijing 100085
   China

   EMail: verozheng@huawei.com