idnits 2.17.1

draft-ietf-pim-port-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 1122 has weird spacing: '...Capable thi...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 24, 2011) is 4561 days in the past. Is this
     intentional?

  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 4601 (Obsoleted by RFC 7761)

  ** Obsolete normative reference: RFC 4960 (Obsoleted by RFC 9260)

  -- Duplicate reference: RFC4601, mentioned in 'HELLO-OPT', was also
     mentioned in 'RFC4601'.

  -- Obsolete informational reference (is this intentional?): RFC 4601 (ref.
     'HELLO-OPT') (Obsoleted by RFC 7761)

     Summary: 3 errors (**), 0 flaws (~~), 2 warnings (==), 4 comments (--).
     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2     Network Working Group                                       D. Farinacci
3     Internet-Draft                                              IJ. Wijnands
4     Intended status: Experimental                                  S. Venaas
5     Expires: April 26, 2012                                    cisco Systems
6                                                                 M. Napierala
7                                                                    AT&T Labs
8                                                             October 24, 2011

10                     A Reliable Transport Mechanism for PIM
11                           draft-ietf-pim-port-09.txt

13    Abstract

15       This document defines a reliable transport mechanism for the PIM
16       protocol for transmission of Join/Prune messages. This eliminates
17       the need for periodic Join/Prune message transmission and processing.
18       The reliable transport mechanism can use either TCP or SCTP as the
19       transport protocol.

21    Status of this Memo

23       This Internet-Draft is submitted in full conformance with the
24       provisions of BCP 78 and BCP 79.

26       Internet-Drafts are working documents of the Internet Engineering
27       Task Force (IETF). Note that other groups may also distribute
28       working documents as Internet-Drafts. The list of current Internet-
29       Drafts is at http://datatracker.ietf.org/drafts/current/.

31       Internet-Drafts are draft documents valid for a maximum of six months
32       and may be updated, replaced, or obsoleted by other documents at any
33       time. It is inappropriate to use Internet-Drafts as reference
34       material or to cite them other than as "work in progress."

36       This Internet-Draft will expire on April 26, 2012.

38    Copyright Notice

40       Copyright (c) 2011 IETF Trust and the persons identified as the
41       document authors. All rights reserved.

43       This document is subject to BCP 78 and the IETF Trust's Legal
44       Provisions Relating to IETF Documents
45       (http://trustee.ietf.org/license-info) in effect on the date of
46       publication of this document. Please review these documents
47       carefully, as they describe your rights and restrictions with respect
48       to this document. Code Components extracted from this document must
49       include Simplified BSD License text as described in Section 4.e of
50       the Trust Legal Provisions and are provided without warranty as
51       described in the Simplified BSD License.

53    Table of Contents

55       1.  Introduction
56         1.1.  Requirements Notation
57         1.2.  Definitions
58       2.  Protocol Overview
59       3.  PIM Hello Options
60         3.1.  PIM over the TCP Transport Protocol
61         3.2.  PIM over the SCTP Transport Protocol
62         3.3.  Interface ID
63       4.  Establishing Transport Connections
64         4.1.  Connection Security
65         4.2.  Connection Maintenance
66         4.3.  Actions When a Connection Goes Down
67         4.4.  Moving from PORT to Datagram Mode
68         4.5.  On-demand versus Pre-configured Connections
69         4.6.  Possible Hello Suppression Considerations
70         4.7.  Avoiding a Pair of TCP Connections between Neighbors
71       5.  PORT Message Definitions
72         5.1.  PORT Join/Prune Message
73         5.2.  PORT Keep-alive Message
74         5.3.  PORT Options
75           5.3.1.  PIM IPv4 Join/Prune Option
76           5.3.2.  PIM IPv6 Join/Prune Option
77       6.  Explicit Tracking
78       7.  Multiple Address-Family Support
79       8.  Miscellany
80       9.  Transport Considerations
81       10. Manageability Considerations
82       11. Security Considerations
83       12. IANA Considerations
84         12.1. PORT Port Number
85         12.2. PORT Hello Options
86         12.3. PORT Message Type Registry
87         12.4. PORT Option Type Registry
88       13. Contributors
89       14. Acknowledgments
90       15. References
91         15.1. Normative References
92         15.2. Informative References
93       Authors' Addresses

95    1. Introduction

97       The goals of this specification are:

99       o  To create a simple incremental mechanism to provide reliable PIM
100         Join/Prune message delivery in PIM version 2 for use with PIM
101         Sparse-Mode [RFC4601] (including Source-Specific Multicast) and
102         Bidirectional PIM [RFC5015].

104      o  When a router supports this specification, it need not use the
105         reliable transport mechanism with every neighbor. It can be
106         negotiated on a per neighbor basis.

108      The explicit non-goals of this specification are:

110      o  Changes to the PIM message formats as defined in [RFC4601].

112      o  Provide support for automatic switching between the reliable
113         transport mechanism and the regular PIM mechanism defined in
114         [RFC4601]. Two routers that are PIM neighbors on a link will
115         always use the reliable transport mechanism if and only if both
116         have enabled support for the reliable transport mechanism.
118      This document will specify how periodic Join/Prune message
119      transmission can be eliminated by using TCP [RFC0793] or SCTP
120      [RFC4960] as the reliable transport mechanism for Join/Prune
121      messages. The destination port number is 8471 for both TCP and SCTP.

123      This specification enables greater scalability in terms of control
124      traffic overhead. However, for routers connected to multi-access
125      links, that comes at the price of increased PIM state and the overhead
126      required to maintain this state.

128      In many existing and emerging networks, particularly wireless and
129      mobile satellite systems, link degradation due to weather,
130      interference, and other impairments can result in temporary spikes in
131      packet loss. In these environments, periodic PIM joining can
132      cause join latency when messages are lost, because a lost join is
133      resent only 60 seconds later. By applying a reliable transport, a lost join
134      is retransmitted rapidly. Furthermore, when the last user leaves a
135      multicast group, any lost prune is similarly repaired and the
136      multicast stream is quickly removed from the wireless/satellite link.
137      Without a reliable transport, the multicast transmission could
138      otherwise continue until it timed out, roughly 3 minutes later. As
139      network resources are at a premium in many of these environments,
140      rapid termination of the multicast stream is critical for maintaining
141      efficient use of bandwidth.

143      This is an experimental extension to PIM. It makes some fundamental
144      changes to how PIM works in that Join/Prune state does not require
145      periodic updates, and partly turns PIM into a hard-state protocol.
146      Also, using reliable delivery for PIM messages is a new concept, and
147      it is likely that experiences from early implementations and
148      deployments will lead to at least minor changes in the protocol.
149      Making this a standards track protocol should be considered once
150      there is some deployment experience. Experiments using this protocol
151      only require support by pairs of PIM neighbors, and need not be
152      constrained to isolated networks.

154   1.1. Requirements Notation

156      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
157      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
158      document are to be interpreted as described in [RFC2119].

160   1.2. Definitions

162      PORT: Stands for PIM Over Reliable Transport, the short form for
163         the mechanism in this specification whereby PIM can use the
164         TCP or SCTP transport protocol.

166      Periodic Join/Prune message: A Join/Prune message sent periodically
167         to refresh state.

169      Incremental Join/Prune message: A Join/Prune message sent as a
170         result of state creation or deletion events. Also known as a
171         triggered message.

173      Native Join/Prune message: A Join/Prune message that is carried
174         with an IP protocol type of PIM.

176      PORT Join/Prune message: A Join/Prune message using TCP or SCTP for
177         transport.

179      Datagram Mode: The procedures whereby PIM encapsulates
180         Join/Prune messages in IP packets, sent either triggered or
181         periodically.

183      PORT Mode: Procedures used by PIM defined in this specification for
184         sending Join/Prune messages over the TCP or SCTP transport layer.

186   2. Protocol Overview

188      PIM Over Reliable Transport (PORT) is a simple extension to PIMv2 for
189      refresh reduction of PIM Join/Prune messages. It involves sending
190      incremental rather than periodic Join/Prune messages over a TCP/SCTP
191      connection between PIM neighbors.

193      PORT only applies to PIM Sparse-Mode [RFC4601] and Bidirectional PIM
194      [RFC5015] Join/Prune messages.

196      This document does not restrict PORT to any specific link types.
197      However, the use of PORT on e.g. multi-access LANs with many PIM
198      neighbors should be carefully evaluated. This is due to the fact that
199      there may be a full mesh of PORT connections, and that explicit
200      tracking of all PIM PORT routers is required.

202      PORT can be incrementally used on a link between PORT-capable
203      neighbors. Routers that are not PORT-capable can continue to use PIM
204      in Datagram Mode. PORT capability is detected using new PORT-Capable
205      PIM Hello Options.

207      Once PORT is enabled on an interface and a PIM neighbor also
208      announces that it is PORT enabled, only PORT Join/Prune messages will
209      be used. That is, only PORT Join/Prune messages are accepted from,
210      and sent to, that particular neighbor. Native Join/Prune messages
211      are still used for PIM neighbors that are not PORT enabled.

213      PORT Join/Prune messages are sent using a TCP/SCTP connection. When
214      two PIM neighbors are PORT enabled, both for TCP or both for SCTP,
215      they will immediately, or on-demand, establish a connection. If the
216      connection goes down, they will again immediately, or on-demand, try
217      to reestablish the connection. No Join/Prune messages (neither
218      Native nor PORT) are sent while there is no connection. Also, any
219      received native Join/Prune messages from that neighbor are discarded,
220      even when the connection is down.

222      When PORT is used, only incremental Join/Prune messages are sent from
223      downstream routers to upstream routers. As such, downstream routers
224      do not generate periodic Join/Prune messages for state for which the
225      RPF neighbor is PORT-capable.

227      For Joins and Prunes that are received over a TCP/SCTP connection,
228      the upstream router does not start or maintain timers on the outgoing
229      interface entry. Instead, it keeps track of which downstream routers
230      have expressed interest. An interface is deleted from the outgoing
231      interface list only when all downstream routers on the interface no
232      longer wish to receive traffic. If there also are native joins/
233      prunes from a non-PORT neighbor, then one can maintain timers on the
234      outgoing interface entry as usual, while at the same time keeping track
235      of each of the downstream PORT joins/prunes.

237      This document does not update the PIM Join/Prune packet format. In
238      the procedures described in this document, each PIM Join/Prune
239      message is included in the payload of a PORT message carried over
240      TCP/SCTP. See Section 5 for details on the PORT message.

242   3. PIM Hello Options

244   3.1. PIM over the TCP Transport Protocol

246      Option Type: PIM-over-TCP-Capable

248        0                   1                   2                   3
249        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
250       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
251       |          Type = 27            |        Length = 4 + X         |
252       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
253       |     TCP Connection ID AFI     |   Reserved    |      Exp      |
254       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
255       |                       TCP Connection ID                       |
256       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

258      Assigned Hello Type values can be found in [HELLO-OPT].

260      When a router is configured to use PIM over TCP on a given interface,
261      it MUST include the PIM-over-TCP-Capable hello option in its Hello
262      messages for that interface. If a router is explicitly disabled from
263      using PIM over TCP, it MUST NOT include the PIM-over-TCP-Capable
264      hello option in its Hello messages.

266      All Hello messages containing the PIM-over-TCP-Capable hello option
267      MUST also contain the Interface ID hello option; see
268      Section 3.3.

270      Implementations MAY provide a configuration option to enable or
271      disable PORT functionality. It is RECOMMENDED that this capability
272      be disabled by default.

274      Length: Length in bytes for the value part of the Type/Length/Value
275         encoding; where X is the number of bytes that make up the
276         Connection ID field. X is 4 when AFI of value 1 (IPv4) [AFI] is
277         used, 16 when AFI of value 2 (IPv6) [AFI] is used, and 0 if AFI of
278         value 0 is used.

280      TCP Connection ID AFI: The AFI value to describe the address-family
281         of the address of the TCP Connection ID field. When this field is
282         0, a mechanism outside the scope of this document is used to
283         obtain the addresses used to establish the TCP connection.

285      Reserved: Set to zero on transmission and ignored on receipt.

287      Exp: For experimental use [RFC3692]. One expected use of these
288         bits would be to signal experimental capabilities. E.g. if a
289         router supports an experimental feature, it may set a bit to
290         indicate this. The default behavior, unless a router supports a
291         particular experiment, is to ignore the bits on receipt.

293      TCP Connection ID: An IPv4 or IPv6 address used to establish the
294         TCP connection. This field is omitted (length 0) for the
295         Connection ID AFI 0.

297   3.2. PIM over the SCTP Transport Protocol

299      Option Type: PIM-over-SCTP-Capable

301        0                   1                   2                   3
302        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
303       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
304       |          Type = 28            |        Length = 4 + X         |
305       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
306       |    SCTP Connection ID AFI     |   Reserved    |      Exp      |
307       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
308       |                      SCTP Connection ID                       |
309       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

311      Assigned Hello Type values can be found in [HELLO-OPT].

313      When a router is configured to use PIM over SCTP on a given
314      interface, it MUST include the PIM-over-SCTP-Capable hello option in
315      its Hello messages for that interface. If a router is explicitly
316      disabled from using PIM over SCTP, it MUST NOT include the PIM-over-
317      SCTP-Capable hello option in its Hello messages.
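
The two capability options share one layout, which can be sketched in Python as follows. This is only an illustration, not part of the draft: the helper names are hypothetical, and the 16/8/8-bit split of the AFI, Reserved, and Exp fields is an assumption (the text only fixes their total at 4 bytes through "Length = 4 + X").

```python
import ipaddress
import struct

# Option types and AFI values as given in the text above.
PIM_OVER_TCP_CAPABLE = 27
PIM_OVER_SCTP_CAPABLE = 28
AFI_NONE, AFI_IPV4, AFI_IPV6 = 0, 1, 2
ADDR_LEN = {AFI_NONE: 0, AFI_IPV4: 4, AFI_IPV6: 16}   # X for each AFI


def encode_port_capable(option_type, connection_id=None, exp=0):
    """Encode a PORT-capable Hello option (hypothetical helper).

    Assumption: 16-bit AFI, 8-bit Reserved, 8-bit Exp.
    connection_id is an address string, or None for AFI 0.
    """
    if connection_id is None:
        afi, addr = AFI_NONE, b""
    else:
        ip = ipaddress.ip_address(connection_id)
        afi = AFI_IPV4 if ip.version == 4 else AFI_IPV6
        addr = ip.packed                               # 4 or 16 bytes
    value = struct.pack("!HBB", afi, 0, exp) + addr    # Reserved is sent as zero
    return struct.pack("!HH", option_type, len(value)) + value


def decode_port_capable(data):
    """Return (option_type, connection_id or None, exp); Reserved is ignored."""
    option_type, length = struct.unpack_from("!HH", data, 0)
    afi, _reserved, exp = struct.unpack_from("!HBB", data, 4)
    if afi not in ADDR_LEN:
        raise ValueError("unsupported AFI")
    x = ADDR_LEN[afi]
    if length != 4 + x or len(data) < 4 + length:
        raise ValueError("Length does not match 4 + X")
    addr = data[8:8 + x]
    connection_id = str(ipaddress.ip_address(addr)) if addr else None
    return option_type, connection_id, exp
```

For an IPv4 Connection ID the encoded option is 12 bytes long: a 4-byte Type/Length header followed by a value part of 4 + 4 bytes.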
319      All Hello messages containing the PIM-over-SCTP-Capable hello option
320      MUST also contain the Interface ID hello option; see
321      Section 3.3.

323      Implementations MAY provide a configuration option to enable or
324      disable PORT functionality. It is RECOMMENDED that this capability
325      be disabled by default.

327      Length: Length in bytes for the value part of the Type/Length/Value
328         encoding; where X is the number of bytes that make up the
329         Connection ID field. X is 4 when AFI of value 1 (IPv4) [AFI] is
330         used, 16 when AFI of value 2 (IPv6) [AFI] is used, and 0 if AFI of
331         value 0 is used.

333      SCTP Connection ID AFI: The AFI value to describe the address-
334         family of the address of the SCTP Connection ID field. When this
335         field is 0, a mechanism outside the scope of this document is used
336         to obtain the addresses used to establish the SCTP connection.

338      Reserved: Set to zero on transmission and ignored on receipt.

340      Exp: For experimental use [RFC3692]. One expected use of these
341         bits would be to signal experimental capabilities. E.g. if a
342         router supports an experimental feature, it may set a bit to
343         indicate this. The default behavior, unless a router supports a
344         particular experiment, is to ignore the bits on receipt.

346      SCTP Connection ID: An IPv4 or IPv6 address used to establish the
347         SCTP connection. This field is omitted (length 0) for the
348         Connection ID AFI 0.

350   3.3. Interface ID

352      All Hello messages containing PIM-over-TCP-Capable or PIM-over-SCTP-
353      Capable hello options MUST also contain the Interface ID hello
354      option [RFC6395].

356      The Interface ID is used to associate a PORT Join/Prune message with
357      the PIM neighbor that it is coming from. When unnumbered interfaces
358      are used or when a single Transport connection is used for sending
359      and receiving Join/Prune messages over multiple interfaces, the
360      Interface ID is used to convey the interface from Join/Prune message
361      sender to Join/Prune message receiver. The value of the Interface ID
362      hello option in Hellos sent on an interface MUST be the same as the
363      Interface ID value in all PORT Join/Prune messages sent to a PIM
364      neighbor on that interface.

366      The Interface ID need only uniquely identify an interface of a
367      router; it does not need to identify which router the interface
368      belongs to. This means that the Router ID part of the Interface ID
369      MAY be 0. For details on the Router ID and the value 0, see
370      [RFC6395].

372   4. Establishing Transport Connections

374      While a router interface is PORT enabled, a PIM-over-TCP or a PIM-
375      over-SCTP option MUST be included in the PIM Hello messages sent on
376      that interface. When a router on a PORT-enabled interface receives a
377      Hello message containing a PIM-over-TCP/PIM-over-SCTP Option from a
378      new neighbor, or an existing neighbor that did not previously include
379      the option, it switches to PORT mode for that particular neighbor.

381      When a router switches to PORT mode for a neighbor, it stops sending
382      and accepting Native Join/Prune messages for that neighbor. Any
383      state from previous Native Join/Prune messages is left to expire as
384      normal. It will also attempt to establish a Transport connection
385      (TCP or SCTP) with the neighbor. If both the router and its neighbor
386      have announced both PIM-over-TCP and PIM-over-SCTP options, SCTP MUST
387      be used. This resolves the issue where two transports are both
388      offered. The method prefers SCTP over TCP, because SCTP has benefits
389      such as call collision handling and support for multiple streams, as
390      discussed later in this document.
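
The transport choice described above can be sketched as a small Python function. This is only an illustration: the function name and the string labels are hypothetical, and treating the case where only one side announces both options as "use whatever transport is common" is an interpretation of the overview text (neighbors must be enabled "both for TCP or both for SCTP"), not something this section spells out.

```python
def select_transport(local, remote):
    """Pick the PORT transport for one neighbor (hypothetical helper).

    local and remote are the sets of capabilities announced in Hellos,
    e.g. {"tcp", "sctp"}. Returns "sctp", "tcp", or None when the two
    routers have no transport in common.
    """
    common = set(local) & set(remote)
    if "sctp" in common:
        # Covers the MUST case: when both sides announce both options,
        # SCTP is used.
        return "sctp"
    if "tcp" in common:
        return "tcp"
    return None
```

Keeping the decision in one pure function makes it easy to re-run whenever a neighbor's Hello options change.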
392      When the router is using TCP, it will compare the TCP Connection ID
393      it announced in the PIM-over-TCP-Capable Option with the TCP
394      Connection ID in the Hello received from the neighbor. Unless
395      connections are opened on-demand (see below), the router with the
396      lower Connection ID MUST do an active Transport open to the neighbor
397      Connection ID. The router with the higher Connection ID MUST do a
398      passive Transport open. An implementation MAY open connections only
399      on-demand; in that case, it may be that the neighbor with the higher
400      Connection ID does the active open; see Section 4.5. If the router
401      with the lower Connection ID chooses to only do an active open on-
402      demand, it MUST do a passive open, allowing for the neighbor to
403      initiate the connection. Note that the source address of the active
404      open MUST be the announced Connection ID.

406      When the router is using SCTP, the IP address comparison need not be
407      done since the SCTP protocol can handle call collision.

409      The decisions whether to use PORT, which transport, and which
410      Connection IDs to use are performed independently for IPv4 and IPv6.
411      Thus, if PORT is used both for IPv4 and IPv6, both IPv4 and IPv6 PIM
412      Hello messages MUST be sent, both containing PORT Hello options. If
413      two neighbors announce the same transport (TCP or SCTP) and the same
414      Connection IDs in the IPv4 and IPv6 Hello messages, then only one
415      connection is established and is shared. Otherwise, two connections
416      are established and are used separately.

418      The PIM router that performs the active open initiates the connection
419      with a locally generated source transport port number and a well-
420      known destination transport port number. The PIM router that
421      performs the passive open listens on the well-known local transport
422      port number and does not qualify the remote transport port number.
423      See Section 5 for well-known port number assignment for PORT.
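
The TCP open rules above can be sketched in Python as follows. This is an illustration under stated assumptions: the function and parameter names are hypothetical, Connection IDs are assumed to be compared numerically as addresses of the same family, and the on-demand behavior of the higher-ID router follows the description in Section 4.5.

```python
import ipaddress

PORT_TCP_PORT = 8471   # destination port given in the Introduction


def tcp_open_role(local_id, remote_id, on_demand=False, pending_join_prune=False):
    """Return "active" or "passive" for this router (hypothetical helper).

    on_demand: this router opens connections only when it has data to send.
    pending_join_prune: a Join/Prune is waiting and no connection exists.
    """
    local = ipaddress.ip_address(local_id)
    remote = ipaddress.ip_address(remote_id)
    if local.version != remote.version or local == remote:
        raise ValueError("Connection IDs must differ and share an address family")
    if local < remote and not on_demand:
        # Lower Connection ID and not on-demand: open towards the neighbor
        # right away, using local_id as the source address.
        return "active"
    # Higher Connection ID, or a lower-ID router running on-demand: listen,
    # and only initiate when there is a Join/Prune to send.
    return "active" if pending_join_prune else "passive"
```

A caller acting on "active" would bind its socket to local_id before connecting to (remote_id, PORT_TCP_PORT), since the source address of the active open has to be the announced Connection ID.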
425      When a Transport connection is established (or reestablished), the
426      two routers MUST both send a full set of Join/Prune messages for
427      state for which the other router is the upstream neighbor. This is
428      needed to ensure that the upstream neighbor has the correct state.
429      When moving from Datagram mode, or when the connection has gone down,
430      the router cannot be sure that all the previous Join/Prune state was
431      received by the neighbor. Any state created before the connection
432      was established (or reestablished) that is not refreshed MUST be
433      left to expire and be deleted. When the non-refreshed state has
434      expired and been deleted, the two neighbors will be in sync.

436      When not running PORT, a full update is only needed when a router
437      restarts; with PORT it must be done every time a connection is
438      established. This can be costly, although it is expected that it is
439      a rare event for a PORT connection to go up and down. There may be a
440      need for extensions to better handle this.

442      It is possible that a router starts sending Hello messages with a new
443      Connection ID, e.g. due to configuration changes. A router MUST
444      always use the last announced and last seen Connection IDs. A
445      connection is identified by the local Connection ID (the one we are
446      announcing on a particular interface), and the remote Connection ID
447      (the one we are receiving from a neighbor on the same interface).
448      When either the local or remote ID changes, the Connection ID pair we
449      need a connection for changes. There may be an existing connection
450      with the same pair, in which case the router will share that
451      connection. Or a new connection may need to be established. Note
452      that for link-local addresses, the interface should be regarded as
453      part of the ID, so that connection sharing is not attempted when the
454      same link-local addresses are seen on different interfaces.
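
One way to picture connection sharing by Connection ID pair is a table keyed on that pair. The sketch below is an assumption-laden illustration, not the draft's data model: the class, the method names, and the tuple key are all hypothetical.

```python
import ipaddress


def connection_key(local_id, remote_id, interface):
    """Key identifying one transport connection (hypothetical helper).

    For link-local addresses the interface becomes part of the key, so the
    same link-local pair on two interfaces is never shared.
    """
    local = ipaddress.ip_address(local_id)
    remote = ipaddress.ip_address(remote_id)
    scope = interface if (local.is_link_local or remote.is_link_local) else None
    return (str(local), str(remote), scope)


class ConnectionTable:
    """Tracks which PIM neighborships use which Connection ID pair."""

    def __init__(self):
        self._users = {}   # key -> set of neighborships using that connection

    def attach(self, key, neighborship):
        """Register a neighborship; True means a new connection is needed."""
        users = self._users.setdefault(key, set())
        is_new = not users
        users.add(neighborship)
        return is_new

    def detach(self, key, neighborship):
        """Unregister a neighborship; True means the connection is unused."""
        users = self._users.get(key, set())
        users.discard(neighborship)
        if not users:
            self._users.pop(key, None)
            return True
        return False
```

When a Connection ID changes, a caller would detach the neighborship from the old key (resetting the connection if detach returns True) and attach it to the new key (opening a connection if attach returns True).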
456      When a Connection ID changes, if the previously used connection is
457      not needed (there are no other PIM neighborships using the same
458      Connection ID pair), both peers MUST attempt to reset the transport
459      connection. Next (even if the old connection is still needed), they
460      MUST, unless a connection already exists with the new Connection ID
461      pair, immediately or on-demand attempt to establish a new connection
462      with the new Connection ID pair.

464      Normally the Interface ID would not change while a connection is up.
465      However, if it does, it does not affect the connection. It just
466      means that when subsequent PORT join/prune messages are received,
467      they should be matched against the last seen Interface ID.

469      Note that a Join sent over a Transport connection will only be seen
470      by the upstream router, and thus will not cause routers on the link
471      that do not use PIM PORT with the upstream router to possibly delay
472      the refresh of Join state for the same state. Similarly, a Prune
473      sent over a Transport connection will only be seen by the upstream
474      router, and will thus never cause routers on the link that do not use
475      PIM PORT with the upstream router to send a Join to override this
476      Prune.

478      Note also that a datagram PIM Join/Prune message for a given (S,G) or
479      (*,G) sent by some router on a link will not cause routers on the
480      same link that use a Transport connection with the upstream router
481      for that state to suppress the refresh of that state to the upstream
482      router (because they don't need to periodically refresh this state)
483      or to send a Join to override a Prune (as the upstream router will
484      only stop forwarding the traffic when all joined routers that use a
485      Transport connection have explicitly sent a Prune for this state, as
486      explained in Section 6).

488   4.1. Connection Security

490      TCP/SCTP packets used for PORT MUST be sent with a TTL/Hop Limit of
491      255 to facilitate enabling of the Generalized TTL Security Mechanism
492      (GTSM) [RFC5082]. Implementations SHOULD provide a configuration
493      option to enable the GTSM check at the receiver. This means checking
494      that inbound packets from directly connected neighbors have a TTL/Hop
495      Limit of 255, but MAY also allow for a different TTL/Hop Limit
496      threshold to check that the sender is within a certain number of
497      router hops. The GTSM check SHOULD be disabled by default.

499      Implementations SHOULD support the TCP Authentication Option (TCP-AO)
500      [RFC5925] and SCTP Authenticated Chunks [RFC4895].

502   4.2. Connection Maintenance

504      TCP is designed to keep connections up indefinitely during a period
505      of network disconnection. If a PIM-over-TCP router fails, the TCP
506      connection may stay up until the neighbor actually reboots, and even
507      then it may continue to stay up until PORT tries to send the neighbor
508      some information. This is particularly relevant to PIM, since the
509      flow of Join/Prune messages might be in only one direction, and the
510      downstream neighbor might never get any indication via TCP that the
511      other end of the connection is not really there.

513      SCTP has a heartbeat mechanism that can be used to detect that a
514      connection is not working, even when no data is sent. Many TCP
515      implementations also support sending keep-alives for this purpose.
516      Implementations MAY make use of TCP keep-alives, but the PORT
517      keep-alive mechanism defined below allows for more control and
518      flexibility.

520      One can detect that a PORT connection is not working by regularly
521      sending PORT messages. This applies to both TCP and SCTP. E.g., for
522      TCP the connection will be reset if no TCP ACKs are received after
523      several retries. PORT in itself does not require any periodic
524      signaling. PORT Join/Prune messages are only sent when there is a
525      state change. If the state changes are not frequent enough, a PORT
526      Keep-Alive message (defined in Section 5.2) can be sent instead.
527      E.g., if an implementation wants to send a PORT message, to check
528      that the connection is working, at least every 60 seconds, then
529      whenever 60 seconds have passed since the previous message, a Keep-Alive
530      message could be sent. If there were less than 60 seconds between
531      each Join/Prune, no Keep-Alive messages would be needed.
532      Implementations SHOULD support the use of PORT Keep-Alive messages.
533      It is RECOMMENDED that a configuration option be available to network
534      administrators to enable it when needed. Note that Keep-Alives can
535      be used by a peer, independently of whether the other peer supports
536      it.

538      An implementation that supports Keep-Alive messages acts as follows
539      when processing a received PORT message. When processing a Keep-
540      Alive message with a non-zero Holdtime value, it MUST set a timer to
541      the value. We call this timer Connection Expiry Timer (CET). If the
542      CET is already running, it MUST be reset to the new value. When
543      processing a Keep-Alive message with a zero Holdtime value, the CET
544      MUST be stopped if running. When processing a PORT message other
545      than Keep-Alive, the CET MUST be reset to the last received Holdtime
546      value if running. If the CET is not running, no action is taken. If
547      the CET expires, the connection SHOULD be shut down. This
548      specification does not mandate a specific default Holdtime value.
549      However, the dynamic congestion and flow control in TCP and SCTP can
550      result in variable transit delay between the endpoints when capacity
551      varies, when there is loss in the network, or when link performance
552      is variable. Consistent behaviour therefore requires a sufficiently
553      large Holdtime value, e.g., 60 seconds, to prevent premature
554      termination.
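
The Connection Expiry Timer rules above can be sketched as a small state holder in Python. This is a minimal illustration, not a prescribed implementation: the class and method names are hypothetical, and the caller is assumed to supply the current time in seconds rather than the class reading a clock itself.

```python
class ConnectionExpiryTimer:
    """Bookkeeping for the CET of one PORT connection (hypothetical class)."""

    def __init__(self):
        self.holdtime = 0      # last non-zero Holdtime received
        self.deadline = None   # None means the CET is not running

    def on_keepalive(self, holdtime, now):
        """Handle a received Keep-Alive message."""
        if holdtime == 0:
            self.deadline = None             # zero Holdtime stops the CET
        else:
            self.holdtime = holdtime
            self.deadline = now + holdtime   # start or reset to the new value

    def on_other_message(self, now):
        """Handle any received PORT message other than Keep-Alive."""
        if self.deadline is not None:
            # Reset to the last received Holdtime, only if already running.
            self.deadline = now + self.holdtime

    def expired(self, now):
        """True when the connection SHOULD be shut down."""
        return self.deadline is not None and now >= self.deadline
```

Passing the time in explicitly keeps the logic deterministic; a real router would typically call these methods from its event loop with a monotonic clock.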
556 It is possible that a router receives Join/Prune messages for an 557 interface/link that is down. As long as the neighbor has not 558 expired, it is RECOMMENDED processing those messages as usual. If 559 they are ignored, then the router SHOULD ensure it gets a full update 560 for that interface when it comes back up. This can be done by 561 changing the GenID (Generation Identifier, see [RFC4601]), or by 562 terminating and reestablishing the connection. 564 If a PORT neighbor changes its GenID and a connection is established 565 or attempting to be established, the local side should generally tear 566 down the connection and do as described in Section 4.3. However, if 567 the connection is shared by multiple interfaces and the GenID changes 568 only for one of them, the local side SHOULD simply send a full 569 update, similar to other cases when a GenID changes for an upstream 570 neighbor. 572 4.3. Actions When a Connection Goes Down 574 A connection may go down for a variety of reasons. It may be due to 575 an error condition, or a configuration change. A connection SHOULD 576 be shut down as soon as there are no more PIM neighbors using it. 577 That is, for the connection we have associated local and remote 578 Connection IDs. When there is no PIM neighbor with that particular 579 remote connection ID on any interface where we announce the local 580 connection ID, the connection SHOULD be shut down. This may happen 581 when a new connection ID is configured, PORT is disabled, or a PIM 582 neighbor expires. 584 If a PIM neighbor expires, one should free connection state and 585 downstream oif-list state for the neighbor. A downstream router, 586 when an upstream neighboring router has expired, will simply update 587 the RPF neighbor for the corresponding state to a new neighbor where 588 it would trigger Join/Prune messages. This behavior is according to 589 [RFC4601] where also the term RPF neighbor is defined. 
   It is required of a PIM router to clear its neighbor table for a
   neighbor that has timed out due to neighbor holdtime expiration.

   When a connection is no longer available between two PORT enabled PIM
   neighbors, they MUST immediately, or on-demand, try to reestablish
   the connection following the normal rules for connection
   establishment.  The neighbors MUST also start expiry timers so that
   all oif-list state for the neighbor using the connection expires
   after J/P_Holdtime, unless it is later refreshed by receiving new
   Join/Prunes.

   The value of J/P_Holdtime is 215 seconds.  This value is based on
   section 4.11 of [RFC4601] which says that J/P_HoldTime should be 3.5
   * t_periodic where the default for t_periodic is 60 seconds.

4.4.  Moving from PORT to Datagram Mode

   There may be situations where an administrator decides to stop using
   PORT.  If PORT is disabled on a router interface, or a previously
   PORT enabled neighbor no longer announces any of the PORT Hello
   options, the router follows the rules in Section 4.3 for taking down
   connections and starting timers.  Next, the router SHOULD trigger a
   full state update similar to what would be done if the GenID changed
   in Datagram Mode.  The router SHOULD send Join/Prune messages for any
   state where the router switched from PORT to Datagram Mode for the
   upstream neighbor.

4.5.  On-demand versus Pre-configured Connections

   Transport connections could be established when they are needed or
   when a router interface to other PIM neighbors has come up.  The
   advantage of on-demand Transport connection establishment is the
   reduced use of router resources, especially in the case where there
   is no need for a full mesh of connections on a network interface.
   The disadvantage is additional delay and queueing when a Join/Prune
   message needs to be sent and a Transport connection is not
   established yet.
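
   The queueing cost of on-demand establishment can be illustrated with
   a small sketch.  It assumes a hypothetical start_connect hook that
   begins connection setup and an on_established callback invoked once
   the Transport connection is up; neither name comes from this
   specification, and the list "sent" merely stands in for writes to
   the connection.

```python
from collections import deque
from typing import Callable, Deque, List


class OnDemandSender:
    """Queues PORT messages until the Transport connection is established."""

    def __init__(self, start_connect: Callable[[], None]) -> None:
        self._start_connect = start_connect  # hypothetical connect hook
        self._pending: Deque[bytes] = deque()
        self._connecting = False
        self.connected = False
        self.sent: List[bytes] = []          # stand-in for the connection

    def send(self, message: bytes) -> None:
        if self.connected:
            self.sent.append(message)
            return
        # No connection yet: queue the message and open one on demand.
        self._pending.append(message)
        if not self._connecting:
            self._connecting = True
            self._start_connect()

    def on_established(self) -> None:
        # Flush queued messages in the order they were submitted.
        self.connected = True
        self._connecting = False
        while self._pending:
            self.sent.append(self._pending.popleft())
```

   Messages submitted before the connection exists are delayed until
   on_established runs, which is the additional delay referred to
   above.
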

   If a router interface has become operational and PIM neighbors are
   learned from Hello messages, at that time, Transport connections may
   be established.  The advantage is that a connection is ready to
   transport data by the time a Join/Prune message needs to be sent.
   The disadvantage is there can be more connections established than
   needed.  This can occur when there is a small set of RPF neighbors
   for the active distribution trees compared to the total number of
   neighbors.  Even when Transport connections are pre-established
   before they are needed, a connection can go down and an
   implementation will have to deal with an on-demand situation.

   Note that for TCP, it is the router with the lower Connection ID that
   decides whether to open a connection immediately, or on-demand.  The
   router with the higher Connection ID SHOULD only initiate a
   connection on-demand, that is, only if it needs to send a Join/Prune
   message and there is no currently established connection.

   Therefore, this specification RECOMMENDS but does not mandate the use
   of on-demand Transport connection establishment.

4.6.  Possible Hello Suppression Considerations

   Based on this specification, a Transport connection cannot be
   established until a Hello message is received.  One reason for this
   is to determine if the PIM neighbor supports this specification and
   the other is to determine the remote address to use to establish the
   Transport connection.

   There are cases where it is desirable to suppress entirely the
   transmission of Hello messages.  In this case, how to determine
   whether the PIM neighbor supports this specification, and which
   out-of-band (outside of the PIM protocol) method to use to determine
   the remote address for establishing the Transport connection, are
   outside the scope of this document.

4.7.  Avoiding a Pair of TCP Connections between Neighbors

   To ensure that there is only one TCP connection between a pair of PIM
   neighbors, the following set of rules MUST be followed.  Note that
   this section applies only to TCP; for SCTP this is not an issue.  Let
   A and B be two PIM neighbors where A's Connection ID is numerically
   smaller than B's Connection ID, and each is known to the other as
   having a potential PIM adjacency relationship.

   At node A:

   o  If there is already an established TCP connection to B, on the
      PIM-over-TCP port, then A MUST NOT attempt to establish a new
      connection to B.  Rather it uses the established connection to
      send Join/Prune messages to B.  (This is independent of which node
      initiated the connection.)

   o  If A has initiated a connection to B, but the connection is still
      in the process of being established, then A MUST refuse any
      connection on the PIM-over-TCP port from B.

   o  At any time when A does not have a connection to B which is either
      established or in the process of being established, A MUST accept
      connections from B.

   At node B:

   o  If there is already an established TCP connection to A, on the
      PIM-over-TCP port, then B MUST NOT attempt to establish a new
      connection to A.  Rather it uses the established connection to
      send Join/Prune messages to A.  (This is independent of which node
      initiated the connection.)

   o  If B has initiated a connection to A, but the connection is still
      in the process of being established, then if A initiates a
      connection too, B MUST accept the connection initiated by A and
      must release the connection which it (B) initiated.

5.  PORT Message Definitions

   It may be desirable for scaling purposes to allow Join/Prune messages
   from different PIM protocol families to be sent over the same
   Transport connection.
   Also, it may be desirable to have a set of Join/Prune messages for
   one address-family sent over a Transport connection that is
   established over a different address-family network layer.

   To be able to do this we need a common PORT message format.  This
   will provide both record boundary and demux points when sending over
   a stream protocol like TCP/SCTP.

   A PORT message may contain PORT options, see Section 5.3.  We will
   define two PORT options for carrying PIM Join/Prune messages, one
   for IPv4 and one for IPv6.  For each PIM Join/Prune message to be
   sent over the Transport connection, we send a PORT Join/Prune message
   containing exactly one such option.

   Each PORT message will have the Type/Length/Value format.  Multiple
   different TLV types can be sent over the same Transport connection.

   To make sure PIM Join/Prune messages are delivered as soon as the TCP
   transport layer receives the Join/Prune buffer, the TCP Push flag
   will be set in all outgoing Join/Prune messages sent over a TCP
   transport connection.

   PORT messages will be sent using destination TCP port number 8471.
   When using SCTP as the reliable transport, destination port number
   8471 will be used.  See Section 12 for IANA considerations.

   PORT messages are error checked.  This includes unknown/illegal type
   fields, or a truncated message.  If the PORT message contains a PIM
   Join/Prune Message, then that is subject to the normal PIM error
   checks, including checksum verification.  If any parsing errors occur
   in a PORT message, it is skipped, and we proceed to any following
   PORT messages.

   When an unknown type field is encountered, that message MUST be
   ignored.  As specified above, one then proceeds as usual processing
   further PORT messages.  This is important in order to allow new
   message types to be specified in the future, without breaking
   existing implementations.
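
   A receiver loop that follows this rule might look like the sketch
   below.  It assumes the 16-bit Type and 16-bit Message Length header
   used by the message formats in this document, in network byte order,
   and treats a message whose value has not fully arrived as "wait for
   more data", which is one possible choice on a byte stream.  The
   function name split_port_messages is illustrative only.

```python
import struct
from typing import List, Tuple

KNOWN_TYPES = {1, 2}  # Join/Prune and Keep-alive, the two types defined here


def split_port_messages(buf: bytes) -> Tuple[List[Tuple[int, bytes]], int, bytes]:
    """Return (known messages, number of skipped messages, leftover bytes)."""
    known: List[Tuple[int, bytes]] = []
    skipped = 0
    offset = 0
    while len(buf) - offset >= 4:
        msg_type, length = struct.unpack_from("!HH", buf, offset)
        end = offset + 4 + length
        if end > len(buf):
            break  # value not fully received yet; keep bytes for later
        if msg_type in KNOWN_TYPES:
            known.append((msg_type, buf[offset + 4:end]))
        else:
            skipped += 1  # unknown type: ignore, continue with next message
        offset = end
    return known, skipped, buf[offset:]
```

   Because the length field is trusted to find the next message
   boundary, a single wrong length desynchronizes everything that
   follows on the stream.
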
   However, if only unknown or invalid messages are received for a
   longer period of time, an implementation MAY alert the operator.
   E.g., if a message is sent with a wrong length, the receiver is
   likely to see only unknown/invalid messages thereafter.

   The checksum of the PIM Join/Prune message MUST be calculated exactly
   as specified in section 4.9 of [RFC4601].  For IPv6, [RFC4601]
   specifies the use of a pseudo-header.  For PORT, the exact same
   pseudo-header MUST be used, but its source and destination address
   fields MUST be set to 0 when calculating the checksum.

   The TLV type field is 16 bits.  The range 65532 - 65535 is for
   experimental use [RFC3692].

   This document defines two message types.

5.1.  PORT Join/Prune Message

   PORT Join/Prune Message

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Type = 1            |        Message Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Reserved                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Interface                           |
   |                              ID                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       PORT Option Type        |      Option Value Length      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             Value                             |
   |                               .                               |
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   \                               .                               \
   /                               .                               /
   \                               .                               \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       PORT Option Type        |      Option Value Length      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             Value                             |
   |                               .                               |
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The PORT Join/Prune Message is used for sending a PIM Join/Prune.

   Message Length:  Length in bytes for the value part of the
      Type/Length/Value encoding.
      If no PORT Options were included, the length would be 12.  If n
      PORT Options with Option Value lengths L1, L2, ..., Ln are
      included, the message length will be 12 + 4*n + L1 + L2 + ... +
      Ln.

   Reserved:  Set to zero on transmission and ignored on receipt.

   Interface ID:  This MUST be the Interface ID of the Interface ID
      Hello option contained in the PIM Hello messages the PIM router is
      sending to the PIM neighbor.  It indicates to the PIM neighbor
      what interface to associate the Join/Prune with.  The Interface ID
      allows us to do connection sharing.

   PORT Options:  The message MUST contain exactly one PIM Join/Prune
      PORT Option, either one PIM IPv4 Join/Prune or one PIM IPv6
      Join/Prune.  It MUST NOT contain both.  It MAY contain additional
      options not defined in this document.  The behavior when receiving
      a message containing unknown options depends on the option type.
      See Section 5.3 for option definitions.

5.2.  PORT Keep-alive Message

   PORT Keep-alive Message

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Type = 2            |        Message Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Reserved                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Holdtime            |       PORT Option Type        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Option Value Length      |             Value             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               .               +
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   \                               .                               \
   /                               .                               /
   \                               .                               \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       PORT Option Type        |      Option Value Length      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             Value                             |
   |                               .                               |
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The PORT Keep-alive Message is used to regularly send PORT messages
   to verify that a connection is alive.  They are used when other PORT
   messages are not sent at the desired frequency.

   Message Length:  Length in bytes for the value part of the
      Type/Length/Value encoding.  If no PORT Options were included, the
      length would be 6.  If n PORT Options with Option Value lengths
      L1, L2, ..., Ln are included, the message length will be 6 + 4*n +
      L1 + L2 + ... + Ln.

   Reserved:  Set to zero on transmission and ignored on receipt.

   Holdtime:  This specifies a Holdtime in seconds for the connection.
      A non-zero value means that the connection SHOULD be gracefully
      shut down if no further PORT messages are received within the
      specified time.  This is measured on the receiving side as the
      time from when one PORT message has been processed until the next
      has been processed.  Note that this MUST be done for any PORT
      message, not just keep-alive messages.  A hold time of 0 disables
      the keep-alive mechanism.

   PORT Options:  A keep-alive message MUST NOT contain any of the
      options defined in this document.  It MAY contain other options
      not defined in this document.  The behavior when receiving a
      message containing unknown options depends on the option type.
      See Section 5.3 for option definitions.

5.3.  PORT Options

   Each PORT Option is a TLV.  The type is 16 bits.  The PORT Option
   type space is split in two ranges.  The types in the range 0 - 32767
   (the most significant bit is not set) are for Critical Options.  The
   types in the range 32768 - 65535 (the most significant bit is set)
   are for Non-Critical Options.

   The behavior of a router receiving a message with an unknown PORT
   Option is determined by whether the option is a critical option.
   If the message contains an unknown critical option, the entire
   message must be ignored.  If the option is non-critical, only that
   particular option is ignored, and the message is processed as if the
   option was not present.

   PORT Option types are assigned by IANA, except the ranges 32764 -
   32767 and 65532 - 65535 that are for experimental use [RFC3692].  The
   length specifies the length of the value in bytes.  Below are the two
   options defined in this document.

5.3.1.  PIM IPv4 Join/Prune Option

   PIM IPv4 Join/Prune Option Format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     PORT Option Type = 1      |      Option Value Length      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   PIMv2 Join/Prune Message                    |
   |                               .                               |
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The IPv4 Join/Prune Option is used to carry a PIMv2 Join/Prune
   message that has all IPv4 encoded addresses in the PIM payload.

   Option Value Length:  The number of bytes that make up the PIMv2
      Join/Prune message.

   PIMv2 Join/Prune Message:  PIMv2 Join/Prune message and payload with
      no IP header in front of it.

5.3.2.  PIM IPv6 Join/Prune Option

   PIM IPv6 Join/Prune Option Format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     PORT Option Type = 2      |      Option Value Length      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   PIMv2 Join/Prune Message                    |
   |                               .                               |
   |                               .                               |
   |                               .                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The IPv6 Join/Prune Option is used to carry a PIMv2 Join/Prune
   message that has all IPv6 encoded addresses in the PIM payload.
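
   As an illustration of how the option formats combine with the PORT
   Join/Prune Message of Section 5.1, the sketch below wraps one PIMv2
   Join/Prune message in a single option.  It assumes network byte
   order, a 64-bit Interface ID (two 32-bit words in the diagram), and
   no padding after the option value, none of which is stated
   explicitly in the text.  The function names are illustrative only.

```python
import struct

PORT_JOIN_PRUNE = 1   # PORT message type (Section 5.1)
OPT_PIM_IPV4_JP = 1   # PORT Option type (Section 5.3.1)
OPT_PIM_IPV6_JP = 2   # PORT Option type (Section 5.3.2)


def build_port_join_prune(interface_id: int, pim_jp: bytes, ipv6: bool) -> bytes:
    """Wrap one PIMv2 Join/Prune (no IP header) in a PORT Join/Prune Message."""
    opt_type = OPT_PIM_IPV6_JP if ipv6 else OPT_PIM_IPV4_JP
    option = struct.pack("!HH", opt_type, len(pim_jp)) + pim_jp
    # Value part: 4 bytes Reserved + 8 bytes Interface ID + the option,
    # i.e. 12 + 4*n + L1 with n = 1.
    message_length = 12 + len(option)
    header = struct.pack("!HHIQ", PORT_JOIN_PRUNE, message_length, 0,
                         interface_id)
    return header + option


def is_critical(option_type: int) -> bool:
    """Critical Options have the most significant bit of the type clear."""
    return (option_type & 0x8000) == 0
```

   With a 10-byte Join/Prune payload, the Message Length comes out as
   12 + 4 + 10 = 26, matching the formula in Section 5.1.
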

   Option Value Length:  The number of bytes that make up the PIMv2
      Join/Prune message.

   PIMv2 Join/Prune Message:  PIMv2 Join/Prune message and payload with
      no IP header in front of it.

6.  Explicit Tracking

   When explicit tracking is used, a router keeps track of join state
   for individual downstream neighbors on a given interface.  This MUST
   be done for all PORT joins and prunes.  Note that it may also be done
   for native join/prune messages, if all neighbors on the LAN have set
   the T bit of the LAN Prune Delay option (see definition in section
   4.9.2 of [RFC4601]).  In the discussion below we will talk about ET
   (explicit tracking) neighbors, and non-ET neighbors.  The set of ET
   neighbors MUST include the PORT neighbors.  The set of non-ET
   neighbors consists of all the non-PORT neighbors unless all neighbors
   have set the LAN Prune Delay T bit.  Then the ET neighbors set
   contains all neighbors.

   For some link-types, e.g. point-to-point, tracking neighbors is no
   different than tracking interfaces.  It may also be possible for an
   implementation to treat different downstream neighbors as being on
   different logical interfaces, even if they are on the same physical
   link.  Exactly how this is implemented and for which link types, is
   left to the implementer.

   For (*,G) and (S,G) state, the router starts forwarding traffic on an
   interface when a Join is received from a neighbor on such an
   interface.  When a non-ET neighbor sends a Prune, as specified in
   [RFC4601], if no Join is sent to override this Prune before the
   expiration of the Override Timer, the upstream router concludes that
   no non-ET neighbor is interested.  If no ET neighbors are interested,
   the interface can be removed from the oif-list.  When an ET neighbor
   sends a Prune, one removes the join state for that neighbor.
   If no other ET or non-ET neighbors are interested, the interface can
   be removed from the oif-list.  When a PORT neighbor sends a prune,
   there can be no Prune Override, since the Prune is not visible to
   other neighbors.

   For (S,G,rpt) state, the router needs to track Prune state on the
   shared tree.  It needs to know which ET neighbors have sent prunes,
   and whether any non-ET neighbors have sent prunes.  Normally one
   would forward a packet from a source S to a group G out on an
   interface if a (*,G)-join is received, but no (S,G,rpt)-prune.  With
   ET one needs to do this check per ET neighbor.  That is, the packet
   should be forwarded unless all ET neighbors that have sent
   (*,G)-joins have also sent (S,G,rpt)-prunes and, if a non-ET
   neighbor has sent a (*,G)-join, there is also non-ET (S,G,rpt)-prune
   state.

7.  Multiple Address-Family Support

   To allow for efficient use of router resources, one can mux
   Join/Prune messages of different address families on the same
   Transport connection.  There are two ways this can be accomplished,
   one using a common message format over a TCP connection and the
   other using multiple streams over a single SCTP connection.

   Using the common message format described previously in this
   specification, using different PORT options, both IPv4 and IPv6 based
   Join/Prune messages can be encoded within the same Transport
   connection.

   When using SCTP multi-streaming, the common message format is still
   used to convey address family information but an SCTP association is
   used, on a per-family basis, to send data concurrently for multiple
   families.  When data is sent concurrently, head of line blocking,
   which can occur when using TCP, is avoided.

8.  Miscellany

   There are no changes to processing of other PIM messages like PIM
   Asserts, Grafts, Graft-Acks, Registers, and Register-Stops.
   This goes for BSR and Auto-RP type messages as well.

   This extension is applicable only to PIM-SM, PIM-SSM and Bidir-PIM.
   It does not take requirements for PIM-DM into consideration.

9.  Transport Considerations

   As noted in the introduction, this is an experimental extension to
   PIM, and using reliable delivery for PIM messages is a new concept.
   There are several potential transport related concerns.  Hopefully
   experiences from early implementations and deployments will reveal
   what concerns are relevant and how to resolve them.

   One consideration is keep-alive mechanisms.  We have defined an
   optional Keep-alive mechanism for PORT, see Section 4.2.  Also SCTP
   and many TCP implementations provide keep-alive mechanisms that could
   be used.  When to use keep-alive messages, and which mechanism to
   use, is unclear, although we believe the PORT keep-alive allows for
   better application control.  It is unclear what holdtimes are
   preferred for the PORT Keep-alives.  For now it is RECOMMENDED that
   administrators can configure whether to use keep-alives and what
   holdtimes to use.

   In a stable state it is expected that only occasional small messages
   are sent over a PORT connection.  This depends on how often PIM
   Join/Prune state changes.  Thus, over a long period of time, there
   may be only small messages that never use the entire TCP congestion
   window, and the window may become very large.  This would then be an
   issue if there is a state change making PORT send a very large
   message.  It may be good if the TCP stack provides some
   rate-limiting or burst-limiting.  The congestion control mechanism
   defined in [RFC3465] may be of help.

   With PORT, it is possible as discussed in the previous paragraph that
   only occasional small messages are sent.  This may cause problems for
   the TCP retransmit mechanism.
   In particular, the TCP Fast Retransmit algorithm may never get
   triggered.  For further discussion of this and a possible solution,
   see [RFC3042].

   While the above two paragraphs only discuss TCP issues, there may
   also be similar issues regarding SCTP.

10.  Manageability Considerations

   This document defines using TCP or SCTP transports between pairs of
   PIM neighbors.  It is recommended that this mechanism is disabled by
   default.  An administrator can then enable PORT TCP and/or SCTP on
   PIM enabled interfaces.  If two neighbors both have PORT SCTP enabled
   (or, failing that, both have PORT TCP enabled), they will use only
   SCTP (or, respectively, TCP) for PIM Join/Prune messages.  This is
   the case even when the connection is down (there is no fallback to
   native Join/Prune messages).

   When PORT support is enabled, a router sends PIM Hello messages
   announcing support for TCP and/or SCTP and also Connection IDs.  It
   should be possible to configure a local Connection ID, and also to
   see what PORT capabilities and Connection IDs PIM neighbors are
   announcing.  Based on these advertisements, pairs of PIM neighbors
   will decide whether to try to establish a PORT connection.  There
   should be a way for an operator to check the current connection
   state.  Statistics on the number of PORT messages sent and received
   (including number of invalid messages) may also be helpful.

   For connection security (see Section 4.1), it should be possible to
   enable a GTSM check to only accept connections (TCP/SCTP packets)
   when the sender is within a certain number of router hops.  Also one
   should be able to configure the use of TCP-AO.

   For connection maintenance (see Section 4.2), it is recommended to
   support Keep-Alive messages.  It should be configurable whether to
   send Keep-Alives and, if so, whether to use a Holdtime and what
   Holdtime to use.
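
   One way such configuration could map onto the Keep-alive Message of
   Section 5.2 is sketched below.  The names KeepAliveConfig and
   build_keep_alive are hypothetical, network byte order is assumed,
   and the message carries no options; how often a message is sent is
   left out of the sketch.

```python
import struct
from dataclasses import dataclass
from typing import Optional

PORT_KEEP_ALIVE = 2  # PORT message type (Section 5.2)


@dataclass
class KeepAliveConfig:
    enabled: bool = False  # whether to send Keep-Alives at all
    holdtime: int = 0      # seconds; 0 disables the peer's expiry timer


def build_keep_alive(cfg: KeepAliveConfig) -> Optional[bytes]:
    """Encode a Keep-alive Message without options, or None if disabled."""
    if not cfg.enabled:
        return None
    if not 0 <= cfg.holdtime <= 0xFFFF:
        raise ValueError("Holdtime must fit in 16 bits")
    # Value part: 4 bytes Reserved + 2 bytes Holdtime, so Message Length 6.
    return struct.pack("!HHIH", PORT_KEEP_ALIVE, 6, 0, cfg.holdtime)
```

   Sending Keep-Alives with a Holdtime of 0 still exercises the
   connection but does not ask the peer to run a Connection Expiry
   Timer.
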

   There should be some way to alert an operator when PORT connections
   are going down, or when there is a failure in establishing a PORT
   connection.  Also information like the number of connection failures,
   and how long the connection has been up or down, is useful.

11.  Security Considerations

   There are several security issues related to the use of TCP or SCTP
   transports.  One can do off-path attacks by sending packets with a
   spoofed source address, either establishing a connection or
   injecting packets into an existing connection.  This might allow
   someone to send spoofed join/prune messages, and may also allow
   someone to reset the connection.  Mechanisms that help protect
   against this are discussed in Section 4.1.

   For authentication one may for TCP use TCP-AO [RFC5925], and for SCTP
   use Authenticated Chunks [RFC4895].  Also GTSM [RFC5082] can be used
   to help prevent spoofing.

12.  IANA Considerations

   This specification makes use of a TCP port number and an SCTP port
   number for the use of the pim-port service that has been assigned by
   IANA.  It also makes use of IANA PIM Hello Options assignments that
   should be made permanent.

12.1.  PORT Port Number

   IANA has already assigned a port number that is used as a destination
   port for pim-port TCP and SCTP transports.  The assigned number is
   8471.  References to this document should be added to the Service
   Name and Transport Protocol Port Number Registry for pim-port.

12.2.  PORT Hello Options

   In the Protocol Independent Multicast (PIM) Hello Options registry,
   the following options are needed for PORT.

   Value    Length      Name                     Reference
   -------  ----------  -----------------------  ---------------
   27       Variable    PIM-over-TCP-Capable     this document
   28       Variable    PIM-over-SCTP-Capable    this document

12.3.  PORT Message Type Registry

   A registry for PORT message types is requested.
   The message type is a 16-bit integer, with values from 0 to 65535.
   An RFC is required for assignments in the range 0 - 65531.  This
   document defines two PORT message types: Type 1, Join/Prune; and
   Type 2, Keep-alive.  The type range 65532 - 65535 is for experimental
   use [RFC3692].

   The initial content of the registry should be as follows:

   Type           Name                             Reference
   -------------  -------------------------------  ---------------
   0              Reserved                         this document
   1              Join/Prune                       this document
   2              Keep-alive                       this document
   3-65531        Unassigned
   65532-65535    Experimental                     this document

12.4.  PORT Option Type Registry

   A registry for PORT option types is requested.  The option type is a
   16-bit integer, with values from 0 to 65535.  The type space is split
   in two ranges, 0 - 32767 for Critical Options and 32768 - 65535 for
   Non-Critical Options.  Option types are assigned by IANA, except the
   ranges 32764 - 32767 and 65532 - 65535 that are for experimental use
   [RFC3692].  An RFC is required for the IANA assignments.  An RFC
   defining a new option type must specify whether the option is
   Critical or Non-Critical in order for IANA to assign a type.  This
   document defines two Critical PORT Option types: Type 1, PIM IPv4
   Join/Prune Message; and Type 2, PIM IPv6 Join/Prune Message.

   The initial content of the registry should be as follows:

   Type           Name                                Reference
   -------------  ----------------------------------  ---------------
   0              Reserved                            this document
   1              PIM IPv4 Join/Prune                 this document
   2              PIM IPv6 Join/Prune                 this document
   3-32763        Unassigned Critical Options
   32764-32767    Experimental                        this document
   32768-65531    Unassigned Non-Critical Options
   65532-65535    Experimental                        this document

13.  Contributors

   In addition to the persons listed as authors, significant
   contributions were provided by Apoorva Karan and Arjen Boers.

14.  Acknowledgments

   The authors would like to give a special thank you and appreciation
   to Nidhi Bhaskar for her initial design and early prototype of this
   idea.

   Appreciation goes to Randall Stewart for his authoritative review and
   recommendation for using SCTP.

   Thanks also go to the following for their ideas and commentary
   review of this specification: Mike McBride, Toerless Eckert, Yiqun
   Cai, Albert Tian, Suresh Boddapati, Nataraj Batchu, Daniel Voce, John
   Zwiebel, Yakov Rekhter, Lenny Giuliano, Gorry Fairhurst, Sameer
   Gulrajani, Thomas Morin, Dimitri Papadimitriou, Bharat Joshi, Rishabh
   Parekh, Manav Bhatia, Pekka Savola, Tom Petch and Joe Touch.

   A special thank you goes to Eric Rosen for his very detailed review
   and commentary.  Many of his comments are reflected as text in this
   specification.

15.  References

15.1.  Normative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4601]  Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
              "Protocol Independent Multicast - Sparse Mode (PIM-SM):
              Protocol Specification (Revised)", RFC 4601, August 2006.

   [RFC4895]  Tuexen, M., Stewart, R., Lei, P., and E. Rescorla,
              "Authenticated Chunks for the Stream Control Transmission
              Protocol (SCTP)", RFC 4895, August 2007.

   [RFC4960]  Stewart, R., "Stream Control Transmission Protocol",
              RFC 4960, September 2007.

   [RFC5015]  Handley, M., Kouvelas, I., Speakman, T., and L. Vicisano,
              "Bidirectional Protocol Independent Multicast
              (BIDIR-PIM)", RFC 5015, October 2007.

   [RFC5082]  Gill, V., Heasley, J., Meyer, D., Savola, P., and C.
              Pignataro, "The Generalized TTL Security Mechanism
              (GTSM)", RFC 5082, October 2007.

   [RFC5925]  Touch, J., Mankin, A., and R.
              Bonica, "The TCP Authentication Option", RFC 5925,
              June 2010.

   [RFC6395]  Gulrajani, S. and S. Venaas, "An Interface Identifier (ID)
              Hello Option for PIM", RFC 6395, October 2011.

15.2.  Informative References

   [AFI]      IANA, "Address Family Numbers", ADDRESS FAMILY NUMBERS
              http://www.iana.org/assignments/address-family-numbers,
              February 2007.

   [HELLO-OPT]
              IANA, "PIM Hello Options", PIM-HELLO-OPTIONS per RFC4601
              http://www.iana.org/assignments/pim-hello-options,
              March 2007.

   [RFC3042]  Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing
              TCP's Loss Recovery Using Limited Transmit", RFC 3042,
              January 2001.

   [RFC3465]  Allman, M., "TCP Congestion Control with Appropriate Byte
              Counting (ABC)", RFC 3465, February 2003.

   [RFC3692]  Narten, T., "Assigning Experimental and Testing Numbers
              Considered Useful", BCP 82, RFC 3692, January 2004.

Authors' Addresses

   Dino Farinacci
   cisco Systems
   Tasman Drive
   San Jose, CA  95134
   USA

   Email: dino@cisco.com


   IJsbrand Wijnands
   cisco Systems
   Tasman Drive
   San Jose, CA  95134
   USA

   Email: ice@cisco.com


   Stig Venaas
   cisco Systems
   Tasman Drive
   San Jose, CA  95134
   USA

   Email: stig@cisco.com


   Maria Napierala
   AT&T Labs
   200 Laurel Drive
   Middletown, New Jersey  07748
   USA

   Email: mnapierala@att.com