idnits 2.17.1 draft-ietf-idmr-gum-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity. ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Introduction section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** The abstract seems to contain references ([MASC]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. 
Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'BR32' is mentioned on line 222, but not defined == Missing Reference: 'BR41' is mentioned on line 222, but not defined == Missing Reference: 'BR31' is mentioned on line 224, but not defined == Missing Reference: 'BR42' is mentioned on line 224, but not defined == Missing Reference: 'BR43' is mentioned on line 224, but not defined == Missing Reference: 'BR22' is mentioned on line 226, but not defined == Missing Reference: 'BR52' is mentioned on line 226, but not defined == Missing Reference: 'BR53' is mentioned on line 226, but not defined == Missing Reference: 'BR21' is mentioned on line 228, but not defined == Missing Reference: 'BR51' is mentioned on line 228, but not defined == Missing Reference: 'BR12' is mentioned on line 230, but not defined == Missing Reference: 'BR61' is mentioned on line 230, but not defined == Missing Reference: 'BR13' is mentioned on line 232, but not defined == Missing Reference: 'BR71' is mentioned on line 236, but not defined == Missing 
Reference: 'BR81' is mentioned on line 236, but not defined == Missing Reference: 'BRXY' is mentioned on line 240, but not defined == Missing Reference: 'HPIM' is mentioned on line 259, but not defined == Missing Reference: 'PIM-SM' is mentioned on line 259, but not defined == Unused Reference: 'DVMRP' is defined on line 1758, but no explicit reference was found in the text == Unused Reference: 'MOSPF' is defined on line 1784, but no explicit reference was found in the text == Unused Reference: 'PIMDM' is defined on line 1787, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2283 (ref. 'MBGP') (Obsoleted by RFC 2858) ** Downref: Normative reference to an Historic RFC: RFC 2189 (ref. 'CBT') -- Possible downref: Normative reference to a draft: ref. 'CBTDM' == Outdated reference: A later version (-11) exists of draft-ietf-idmr-dvmrp-v3-06 ** Downref: Normative reference to an Informational draft: draft-ietf-idmr-dvmrp-v3 (ref. 'DVMRP') -- Possible downref: Non-RFC (?) normative reference: ref. 'DWR' ** Downref: Normative reference to an Informational draft: draft-thaler-multicast-interop (ref. 'INTEROP') -- Possible downref: Non-RFC (?) normative reference: ref. 'IPv6MAA' == Outdated reference: A later version (-03) exists of draft-ietf-mboned-imrp-some-issues-02 -- Possible downref: Normative reference to a draft: ref. 'ISSUES' == Outdated reference: A later version (-06) exists of draft-ietf-malloc-masc-01 ** Downref: Normative reference to an Historic draft: draft-ietf-malloc-masc (ref. 'MASC') ** Downref: Normative reference to an Historic RFC: RFC 1584 (ref. 'MOSPF') == Outdated reference: A later version (-03) exists of draft-ietf-pim-v2-dm-00 -- Possible downref: Normative reference to a draft: ref. 'PIMDM' ** Obsolete normative reference: RFC 2362 (ref. 'PIMSM') (Obsoleted by RFC 4601, RFC 5059) ** Obsolete normative reference: RFC 1966 (ref. 
'REFLECT') (Obsoleted by RFC 4456) ** Obsolete normative reference: RFC 1700 (Obsoleted by RFC 3232) ** Obsolete normative reference: RFC 1771 (Obsoleted by RFC 4271) Summary: 20 errors (**), 0 flaws (~~), 27 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 IDMR Working Group D. Thaler 3 Internet Engineering Task Force U. Michigan 4 INTERNET-DRAFT D. Estrin 5 August 5, 1998 USC/ISI 6 Expires February 1999 D. Meyer 7 U. Oregon 8 Editors 10 Border Gateway Multicast Protocol (BGMP): 11 Protocol Specification 12 14 Status of this Memo 16 This document is an Internet Draft. Internet Drafts are working 17 documents of the Internet Engineering Task Force (IETF), its Areas, and 18 its Working Groups. Note that other groups may also distribute working 19 documents as Internet Drafts. 21 Internet Drafts are valid for a maximum of six months and may be 22 updated, replaced, or obsoleted by other documents at any time. It is 23 inappropriate to use Internet Drafts as reference material or to cite 24 them other than as a "work in progress". 26 Copyright Notice 28 Copyright (C) The Internet Society (1998). All Rights Reserved. 30 Abstract 32 This document describes BGMP, a protocol for inter-domain multicast 33 routing. BGMP builds shared trees for active multicast groups, and 34 allows receiver domains to build source-specific, inter-domain, 35 distribution branches where needed. Building upon concepts from CBT and 36 PIM-SM, BGMP requires that each multicast group be associated with a 37 single root (in BGMP it is referred to as the root domain). BGMP 38 assumes that at any point in time, different ranges of the class D space 39 Draft BGMP August 1998 41 are associated (e.g., with MASC [MASC]) with various domains. Each of 42 these domains then becomes the root of the shared domain-trees for all 43 groups in its range. 
Multicast participants will generally receive 44 better multicast service if the session initiator's address allocator 45 selects addresses from its own domain's part of the space, thereby 46 causing the root domain to be local to at least one of the session 47 participants. 49 1. Acknowledgements 51 In addition to the editors, the following individuals have contributed 52 to the design of BGMP: Cengiz Alaettinoglu, Tony Ballardie, Steve 53 Casner, Steve Deering, Dino Farinacci, Bill Fenner, Mark Handley, Ahmed 54 Helmy, Van Jacobson, and Satish Kumar. 56 This document is the product of the IETF IDMR Working Group with Dave 57 Thaler, Deborah Estrin, and David Meyer as editors. 59 2. Purpose 61 It has been suggested that inter-domain multicast is better supported 62 with a rendezvous mechanism whereby members receive sources' data 63 packets without any sort of global broadcast (e.g., DVMRP and PIM-DM 64 broadcast initial data packets and MOSPF broadcasts membership 65 information). CBT [CBT] and PIM-SM [PIMSM] use a shared group-tree, to 66 which all members join and thereby hear from all sources (and to which 67 non-members do not join and thereby hear from no sources). 69 This document describes BGMP, a protocol for inter-domain multicast 70 routing. BGMP builds shared trees for active multicast groups, and 71 allows domains to build source-specific, inter-domain, distribution 72 branches where needed. Building upon concepts from CBT and PIM-SM, BGMP 73 requires that each global multicast group be associated with a single 74 root. However, in BGMP, the root is an entire exchange or domain, 75 rather than a single router. 77 BGMP assumes that ranges of the class D space have been associated 78 (e.g., with MASC [MASC]) with selected domains. Each such domain then 79 becomes the root of the shared domain-trees for all groups in its range. 
80 An address allocator will generally achieve better distribution trees if 81 it takes its multicast addresses from its own domain's part of the 82 space, thereby causing the root domain to be local. 84 Draft BGMP August 1998 86 BGMP uses TCP as its transport protocol. This eliminates the need to 87 implement message fragmentation, retransmission, acknowledgement, and 88 sequencing. BGMP uses TCP port 264 for establishing its connections. 89 This port is distinct from BGP's port to provide protocol independence, 90 and to facilitate distinguishing between protocol packets (e.g., by 91 packet classifiers, diagnostic utilities, etc.) 93 Two BGMP peers form a TCP connection between one another, and exchange 94 messages to open and confirm the connection parameters. They then send 95 incremental Join/Prune Updates as group memberships change. BGMP does 96 not require periodic refresh of individual entries. KeepAlive messages 97 are sent periodically to ensure the liveness of the connection. 98 Notification messages are sent in response to errors or special 99 conditions. If a connection encounters an error condition, a 100 notification message is sent and the connection is closed. 102 3. Terminology 104 This document uses the following technical terms: 106 Domain: 107 A set of one or more contiguous links and zero or more routers 108 surrounded by one or more multicast border routers. Note that this 109 loose definition of domain also applies to an exchange. 111 Root Domain: 112 When constructing a shared tree of domains for some group, one 113 domain will be the "root" of the tree. The root domain receives 114 data from each sender to the group, and functions as a rendezvous 115 domain toward which member domains can send inter-domain joins, and 116 toward which sender domains can send data. 118 Multicast RIB: 119 The Routing Information Base, or routing table, used to calculate 120 the "next-hop" towards a particular address for multicast traffic. 
122 Multicast IGP (M-IGP): 123 A generic term for any multicast routing protocol used for tree 124 construction within a domain. Typical examples of M-IGPs are: 125 DVMRP, PIM-DM, PIM-SM, CBT, and MOSPF. 127 EGP: A generic term for the interdomain unicast routing protocol in use. 128 Typically, this will be some version of BGP, such as BGP4+ [MBGP], 132 which can support a Multicast RIB containing both unicast and 133 multicast address prefixes. 135 Component: 136 The portion of a border router associated with (and logically 137 inside) a particular domain that runs the multicast IGP (M-IGP) for 138 that domain, if any. Each border router thus has zero or more 139 components inside routing domains. In addition, each border router 140 with external links that do not fall inside any routing domain will 141 have an inter-domain component that runs BGMP. 143 External peer: 144 A border router in another multicast AS (autonomous system, as used 145 in BGP), to which a BGMP TCP-connection is open. Assuming BGP4+ is 146 being used, a separate "eBGP" TCP-connection will also be open to 147 the same peer. 149 Internal peer: 150 Another border router of the same multicast AS. A border router 151 either speaks iBGP ("internal" BGP) directly to internal peers in a 152 full mesh, or indirectly through a route reflector [REFLECT]. A 153 border router is only required to establish a BGMP TCP-connection 154 to an internal peer when one border router acts as a data 155 injector for another. 157 Next-hop peer: 158 The next-hop peer towards a given IP address is the next EGP router 159 on the path to the given address, according to multicast RIB routes 160 in the EGP's routing table (e.g., in BGP4+, routes whose Subsequent 161 Address Family Identifier field indicates that the route is valid 162 for multicast traffic). 164 Target: 165 Either an EGP peer, or an M-IGP component. 
167 Tree State Table: 168 This is a table of (S-prefix,G-prefix) entries (including (*,G- 169 prefix) entries) that have been explicitly joined by a set of 170 targets. Each entry has, in addition to the source and group 171 addresses and masks, a "parent" target (towards the root), and a 172 list of "child" targets that have explicitly requested data (on 173 behalf of directly connected hosts or downstream routers). The 174 generic term "target list" refers to the combination of the parent 175 target plus the child target list. (S,G) entries also have an 176 "SPT" bit. 178 Draft BGMP August 1998 180 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 181 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 182 document are to be interpreted as described in RFC 2119 [RFC2119]. 184 4. Protocol Overview 186 BGMP maintains group-prefix state in response to messages from BGMP 187 peers and notifications from M-IGP components. Group-shared trees are 188 rooted at the domain advertising the group prefix covering those groups. 189 When a receiver joins a specific group address, the border router 190 towards the root domain generates a group-specific Join message, which 191 is then forwarded Border-Router-by-Border-Router towards the root 192 domain. BGMP Join and Prune messages are sent over TCP connections 193 between BGMP peers, and BGMP protocol state is refreshed by KEEPALIVE 194 messages periodically sent over TCP. 196 BGMP routers build group-specific bidirectional forwarding state as they 197 process the BGMP Join messages. Bidirectional forwarding state means 198 that packets received from any target are forwarded to all other targets 199 in the target list without any RPF checks. No group-specific state or 200 traffic exists in parts of the network where there are no members of 201 that group. 
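As an illustration only, a tree state table entry and the bidirectional forwarding rule described above might be modelled as follows. The names TreeStateEntry, target_list, and bidirectional_forward are hypothetical and do not appear in this specification; for brevity the sketch keys entries on single addresses rather than on prefixes and masks, and represents targets as plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TreeStateEntry:
    """One (S,G) or (*,G) entry; all field names are assumptions of this sketch."""
    group: str                          # G (a single address; prefixes omitted)
    source: Optional[str] = None        # S, or None for a (*,G) entry
    parent: Optional[str] = None        # target towards the root
    children: List[str] = field(default_factory=list)  # targets that joined
    spt: bool = False                   # "SPT" bit, meaningful for (S,G) only

    def target_list(self) -> List[str]:
        """Parent target plus child target list, parent first, no duplicates."""
        targets = [] if self.parent is None else [self.parent]
        targets.extend(c for c in self.children if c != self.parent)
        return targets


def bidirectional_forward(entry: TreeStateEntry, received_from: str) -> List[str]:
    """Return every target in the target list except the arrival target.

    No RPF check is made: a packet arriving from the parent or from any
    child is sent to all the other targets.
    """
    return [t for t in entry.target_list() if t != received_from]
```

For example, an entry whose parent is BR22 and whose children are an M-IGP component and BR13 forwards a packet arriving from BR13 to BR22 and to the M-IGP component, and forwards a packet arriving from BR22 to both children.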
203 BGMP routers build source-specific unidirectional forwarding state, only 204 where needed, to be compatible with source-specific trees (SPTs) used by 205 some M-IGPs (e.g., DVMRP and PIM-DM). A domain that uses an SPT-based 206 M-IGP may need to inject multicast packets from external sources via 207 other border routers (to be compatible with the M-IGP Reverse Path 208 Forwarding checks) which thus act as "surrogates". For example, in 209 Figure 1, data from Src_A arrives at BR12 but must be injected into the 210 Transit_1 domain (which runs, say, DVMRP) by BR11 or it will be dropped by 211 routers inside the domain. A surrogate router may create a source- 212 specific BGMP branch if no shared tree state exists. Note: stub domains 213 with a single border router, such as Rcvr_Stub_7 in Figure 1, receive 214 all multicast data packets through that router, to which all RPF checks 215 point. Therefore, stub domains never build source-specific state.

219            Root_Domain
220              [BR91]--------------------------\
221                 |                             |
222              [BR32]                        [BR41]
223             Transit_3                     Transit_4
224              [BR31]                     [BR42] [BR43]
225                 |                         |      |
226              [BR22]                     [BR52] [BR53]
227             Transit_2                     Transit_5
228              [BR21]                        [BR51]
229                 |                             |
230              [BR12]                        [BR61]
231             Transit_1[BR11]----------[BR62]Stub_6
232              [BR13]                        (Src_A)
233                 |                          (Rcvr_D)
234        -------------------
235        |                 |
236     [BR71]            [BR81]
237   Rcvr_Stub_7     Src_only_Stub_8
238    (Rcvr_C)          (Src_B)

240 Figure 1: Example inter-domain topology. [BRXY] represents a BGMP border 241 router. Transit_X is a transit domain network. *_Stub_X is a stub 242 domain network. 244 Data packets are forwarded based on a combination of BGMP and M-IGP 245 rules. The router forwards to a set of targets according to a matching 246 (S,G) BGMP tree state entry if it exists. If not found, the router 247 checks for a matching (*,G) BGMP tree state entry. 
If neither is found, 248 then the packet is sent natively to the next-hop EGP peer for G, 249 according to the Multicast RIB (for example, in the case of a non-member 250 sender such as Src_B in Figure 1). If a matching entry was found, the 251 packet is forwarded to all other targets in the target list. In this way 252 BGMP trees forward data in a bidirectional manner. If a target is an 253 M-IGP component then forwarding is subject to the rules of that M-IGP 254 protocol. 256 4.1. Design Rationale 258 Several other protocols, or protocol proposals, build shared trees 259 within domains [CBT, HPIM, PIM-SM]. The design choices made for BGMP 260 result from our focus on Inter-Domain multicast in particular. The 261 design choices made by CBT and PIM-SM are better suited to the wide-area 262 Draft BGMP August 1998 264 intra-domain case. There are three major differences between BGMP and 265 other shared-tree protocols: 267 (1) Unidirectional vs. Bidirectional trees 269 Bidirectional trees (using bidirectional forwarding state as described 270 above) minimize third party dependence which is essential in the inter- 271 domain context. For example, in Figure 1, stub domains 7 and 8 would 272 like to exchange multicast packets without being dependent on the 273 quality of connectivity of the root domain. However, unidirectional 274 shared trees (i.e., those using RPF checks) have more aggressive loop 275 prevention and share the same processing rules as source-specific 276 entries which are inherently unidirectional. 278 The lack of third party dependence concerns in the INTRA domain case 279 reduces the incentive to employ bidirectional trees. BGMP supports 280 bidirectional trees because it has to, and because it can without 281 excessive cost. 
283 (2) Source-specific distribution trees/branches 285 In a departure from other shared tree protocols, source-specific BGMP 286 state is built ONLY where (a) it is needed to pull the multicast traffic 287 down to a BGMP router that has source-specific (S,G) state, and (b) that 288 router is NOT already on the shared tree (i.e., has no (*,G) state), and 289 (c) that router does not want to receive packets via encapsulation from 290 a router which is on the shared tree. BGMP provides source-specific 291 branches because most M-IGP protocols in use today build source-specific 292 trees. BGMP's source-specific branches eliminate the unnecessary 293 overhead of encapsulations for high data rate sources from the shared 294 tree's ingress router to the surrogate injector (e.g. from BR12 to BR11 295 in Figure 1). Moreover, cases in which SPT paths are significantly 296 shorter than shared paths will also benefit. 298 However, we do not build source-specific inter-domain trees in general 299 because (a) inter-domain connectivity is generally less rich than 300 intra-domain connectivity, so shared distribution trees should have more 301 acceptable path length and traffic concentration properties in the 302 inter-domain context than in the intra-domain case, and (b) by having 303 the shared tree state always take precedence over source-specific tree 304 state, we avoid loops that would otherwise arise. 306 In summary, BGMP trees are, in a sense, a hybrid between CBT and PIM-SM 307 trees. 309 Draft BGMP August 1998 311 (3) Method of choosing root of group shared tree 313 The choice of a group's shared-tree-root has implications for 314 performance and policy. In the intra-domain case it can be assumed that 315 all potential shared-tree roots (RPs/Cores) within the domain are 316 equally suited to be the root for a group that is initiated within that 317 domain. 
In the INTER-domain case, there is far more opportunity for 318 unacceptably poor locality, and for administrative control of a group's 319 shared-tree root. Therefore in the intra-domain case, other protocols 320 treat all candidate roots (RPs or Cores) as equivalent and emphasize 321 load sharing and stability to maximize performance. In the Inter-Domain 322 case, not all roots are equivalent, and we adopt an approach whereby a 323 group's root domain is not random but is subject to administrative and 324 performance input. 326 5. Protocol Details 328 In this section, we describe the detailed protocol that border routers 329 perform. We assume that each border router conforms to the component- 330 based model described in [INTEROP]. 332 5.1. Interaction with the EGP 334 A fundamental requirement imposed by BGMP on the design of an EGP is 335 that it be able to carry multicast prefixes. For example, a multi- 336 protocol BGP (MBGP) must be able to carry a multicast prefix in the 337 Network Layer Reachability Information (NLRI) field of the UPDATE 338 message (i.e., either an IPv4 class D prefix or an IPv6 prefix with 339 high-order octet equal to FF [IPv6MAA]). This capability is required by 340 BGMP in the implementation of bi-directional trees; BGMP must be able to 341 forward data and control packets to the next hop towards either a 342 unicast source S or a multicast group G (see section 5.2). It is also 343 required that the path attributes defined in [RFC1771] have the same 344 semantics whether they accompany unicast or multicast NLRI. 346 BGP4+ [MBGP] satisfies the requirement described above. [MBGP] defines 347 the optional non-transitive attributes Multiprotocol Reachable NLRI 348 (MP_REACH_NLRI) and Multiprotocol Unreachable NLRI (MP_UNREACH_NLRI) to carry 349 sets of reachable or unreachable destinations, and the appropriate next 350 hop in the case of MP_REACH_NLRI. 
These attributes contain an Address 351 Family Information field [RFC1700] which indicates the type of NLRI 352 carried in the attribute. In addition, the attribute carries another 353 field, the Subsequent Address Family Identifier, or SAFI, which can be 354 Draft BGMP August 1998 356 used to provide additional information about the type of NLRI. For 357 example, SAFI value two indicates that the NLRI is valid for multicast 358 forwarding. BGMP's requirement can be satisfied by allowing the NLRI 359 field of the MP_REACH_NLRI (or MP_UNREACH_NLRI) to carry a multicast 360 prefix in the Prefix field of the NLRI encoding. 362 Finally, while not required for correct BGMP operation, the design of an 363 EGP should also provide a mechanism that allows discrimination between 364 NLRI that is to be used for unicast forwarding and NLRI to be used for 365 multicast forwarding. This property is required to support multicast- 366 specific policy. As mentioned above, BGP4+ specified in [MBGP] has this 367 capability. 369 5.2. Multicast Data Packet Processing 371 For BGMP rules to be applied, an incoming packet must first be 372 "accepted": 374 o If the packet was received from an external peer, the packet is 375 accepted. 377 o If the packet arrived on an interface owned by an M-IGP, the M-IGP 378 component determines whether the packet should be accepted or dropped 379 according to its rules. If the packet is accepted, the packet is 380 forwarded (or not forwarded) out any other interfaces owned by the 381 same component, as specified by the M-IGP. 383 If the packet is accepted, then the router checks the tree state table 384 for a matching (S,G) entry. If one is found, but the packet was not 385 received from the next hop target towards S (if the entry's SPT bit is 386 True), or was not received from the next hop target towards G (if the 387 entry's SPT bit is False) then the packet is dropped and no further 388 actions are taken. 
If no (S,G) entry was found, the router then checks 389 for a matching (*,G) entry. 391 If neither is found, then the packet is forwarded to the parent target 392 for G. If a matching entry was found, the packet is forwarded to all 393 other targets (parent and child) listed in the entry. 395 Forwarding to a target which is an M-IGP component means that the packet 396 is forwarded out any interfaces owned by that component according to 397 that component's multicast forwarding rules. 401 5.3. BGMP processing of Join and Prune messages and notifications 403 5.3.1. Receiving (*,G) Joins 405 When the BGMP component receives a (*,G) Join alert from another 406 component, or a BGMP (*,G) Join message from an external peer, it 407 searches the tree state table for a matching entry. If an entry is 408 found, and that peer is already listed in the child target list, then no 409 further actions are taken. 411 Otherwise, if no (*,G) entry was found, one is created. The parent 412 target is set to the next-hop peer towards G, if it is an external peer. 413 If the peer is internal, the parent target is set to the M-IGP component 414 owning the next-hop interface. If there is no next-hop peer (because G 415 is inside the domain), then the parent target is set to the next-hop 416 component. If an (S,G) entry exists for the same G for which the (*,G) 417 Join is being processed, and the next-hop peers toward S and G are 418 different, the BGMP router must first send an (S,G) Prune to the (S,G) 419 parent target and clear the SPT bit on the (S,G) entry, before 420 activating the (*,G) entry. 422 The target from which the Join was received is then added to the (*,G) 423 child target list. If the child target list becomes non-null as a 424 result, the parent target must be notified as follows: 426 a) If the parent target is an external peer, a BGMP (*,G) Join message 427 is unicast to the external peer. 
429 b) If the parent target is an M-IGP component, a (*,G) Join alert is 430 sent to the M-IGP component. 432 5.3.2. Receiving (S,G) Joins 434 When the BGMP component receives an (S,G) Join alert from another 435 component, or a BGMP (S,G) Join message from an external peer, it 436 searches the tree state table for a matching entry. If an entry is 437 found, and that peer is already listed in the child target list, then no 438 further actions are taken. 440 Otherwise, if no (S,G) entry was found, one is created. The router then 441 looks up S in the Multicast RIB to find the next-hop EGP peer and sets 442 the entry's parent target to be the peer (if external) or the 443 appropriate M-IGP component. The target from which the Join was 444 received is then added to the child target list. If the child target 445 list becomes non-null as a result, the parent target must be notified as 446 Draft BGMP August 1998 448 follows: 450 a) If the parent target is an external peer, and the router has NO (*,G) 451 state for the group G, a BGMP (S,G) Join message is unicast to the 452 external peer. A BGMP (S,G) Join message is never sent to an external 453 peer by a router that also contains (*,G) state for the same group. 454 If the parent target is an external peer and the router DOES have 455 active (*,G) state for that group G, the SPT bit is always set to 456 False. 458 b) If the parent target is an M-IGP component, an (S,G) Join alert is 459 sent to the M-IGP component. 461 5.3.3. Receiving (*,G) Prunes 463 When the BGMP component receives a (*,G) Prune alert from another 464 component, or a BGMP (*,G) Prune message from an external peer, it 465 searches the tree state table for a matching entry. If no matching 466 entry exists, or if the component or peer is not listed in the child 467 target list, no further actions are taken. 469 Otherwise, the component or peer is removed from the child target list. 
470 If the child target list becomes null as a result, the parent target 471 must be notified as follows: 473 a) If the parent target is an external peer, a BGMP (*,G) Prune message 474 is unicast to it. 476 b) If the parent target is an M-IGP component, a (*,G) Prune alert is 477 sent to the M-IGP component. 479 5.3.4. Receiving (S,G) Prunes 481 When the BGMP component receives an (S,G) Prune alert from another 482 component, or a BGMP (S,G) Prune message from an external peer, it 483 searches the tree state table for a matching entry. If no (S,G) entry 484 was found, but (*,G) state exists, an (S,G) entry is created, with the 485 child target list copied from the (*,G) entry, and the (*,G) parent 486 target added. If no matching entry exists, or if the component or peer 487 is not listed in the child target list, no further actions are taken. 489 Otherwise, the component or peer is removed from the child target list. 490 If the child target list becomes null as a result, and the BGMP router 493 has no corresponding (*,G) entry, then the parent target must be 494 notified as follows: 496 a) If the parent target is an external peer, a BGMP (S,G) Prune message 497 is unicast to it. 499 b) If the parent target is an M-IGP component, an (S,G) Prune alert is 500 sent to the M-IGP component. 502 5.3.5. Receiving Route Change Notifications 504 When a border router receives a route for a new prefix in the multicast 505 RIB, or an existing route for a prefix is withdrawn, a route change 506 notification for that prefix must be sent to the BGMP component. In 507 addition, when the next-hop peer (according to the multicast RIB) 508 changes, a route change notification for that prefix must be sent to the 509 BGMP component. 511 Furthermore, an internal route for each class-D prefix associated with 512 the domain (if any) MUST be injected into the multicast RIB in the EGP 513 by the domain's border routers. 
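As an illustration only, the (*,G) Join and Prune handling of sections 5.3.1 and 5.3.3 can be sketched as follows. The class name StarGTable, the next_hop_target callback, and the sent log are hypothetical names introduced for this sketch. It does not distinguish BGMP messages sent to external peers from alerts sent to M-IGP components, and it omits the (S,G) Prune and SPT-bit handling that section 5.3.1 requires when an (S,G) entry already exists.

```python
class StarGTable:
    """Minimal (*,G) state: group -> parent target and child target list."""

    def __init__(self, next_hop_target):
        # Hypothetical callback mapping a group G to its parent target,
        # standing in for the Multicast RIB lookup.
        self.next_hop_target = next_hop_target
        self.entries = {}
        self.sent = []  # log of (kind, group, target) notifications

    def join(self, group, child):
        entry = self.entries.get(group)
        if entry is None:
            # No (*,G) entry found: create one with the parent towards G.
            entry = {"parent": self.next_hop_target(group), "children": []}
            self.entries[group] = entry
        if child in entry["children"]:
            return  # already listed: no further action
        became_non_null = not entry["children"]
        entry["children"].append(child)
        if became_non_null:
            # Child list just became non-null: notify the parent target.
            self.sent.append(("Join", group, entry["parent"]))

    def prune(self, group, child):
        entry = self.entries.get(group)
        if entry is None or child not in entry["children"]:
            return  # no matching entry or child not listed: no action
        entry["children"].remove(child)
        if not entry["children"]:
            # Child list just became null: notify the parent target.
            self.sent.append(("Prune", group, entry["parent"]))
```

Only the first Join and the last Prune for a group are propagated towards the root, which is what keeps the protocol incremental and free of periodic per-entry refresh.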
515 When a route for a new group prefix is learned, or an existing route for 516 a group prefix is withdrawn, or the next-hop peer for a group prefix 517 changes, a BGMP router updates all affected (*,G) parent targets. The 518 router sends a (*,G) Join to the new parent target, and a (*,G) Prune to 519 the old parent target, as appropriate. 521 When an existing route for a source prefix is withdrawn, or the next-hop 522 peer for a source prefix changes, a BGMP router updates all affected 523 (S,G) parent targets. The router sends an (S,G) Join to the new parent 524 target, and an (S,G) Prune to the old parent target, as appropriate. 526 5.4. Interaction with M-IGP components 528 When an M-IGP component on a border router first learns that there are 529 internally-reached members for a group G (whose scope is larger than 530 a single domain), a (*,G) Join alert is sent to the BGMP component. 531 Similarly, when an M-IGP component on a border router learns that there 532 are no longer internally-reached members for a group G (whose scope is 533 larger than a single domain), a (*,G) Prune alert is sent to the BGMP 534 component. 538 At any time, any M-IGP domain MAY decide to join a source-specific 539 branch for some external source S and group G. When the M-IGP component 540 in the border router that is the next-hop router for a particular source 541 S learns that a receiver wishes to receive data from S on a source- 542 specific path, an (S,G) Join alert is sent to the BGMP component. When 543 it is learned that such receivers no longer exist, an (S,G) Prune alert 544 is sent to the BGMP component. Recall that the BGMP component will 545 generate external source-specific Joins only where the source-specific 546 branch does not coincide with the shared distribution tree for that 547 group. 
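As an illustration only, the (*,G) parent-target update of section 5.3.5 might be sketched as follows. The function name on_group_route_change and the dictionary layout are hypothetical. The sketch makes two assumptions that this specification does not state: an entry is treated as "affected" whenever its group address falls inside the changed prefix (no longest-match comparison against other routes is made), and "as appropriate" is read as "only when the parent actually changes and the entry has at least one child target".

```python
import ipaddress


def on_group_route_change(entries, prefix, new_parent):
    """Re-parent (*,G) entries covered by `prefix`; return messages to send.

    entries: dict mapping a group address string to a dict with keys
    "parent" (target or None) and "children" (list of targets).
    new_parent: the new parent target, or None if the route was withdrawn.
    """
    network = ipaddress.ip_network(prefix)
    messages = []
    for group, entry in entries.items():
        if ipaddress.ip_address(group) not in network:
            continue  # entry not covered by the changed prefix
        old_parent = entry["parent"]
        if old_parent == new_parent:
            continue  # parent target unchanged: nothing to send
        entry["parent"] = new_parent
        if entry["children"]:
            # Join towards the new parent, then Prune from the old one.
            if new_parent is not None:
                messages.append(("Join", group, new_parent))
            if old_parent is not None:
                messages.append(("Prune", group, old_parent))
    return messages
```

The same shape would apply to (S,G) entries, with the source address matched against the changed source prefix instead of the group address.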
549 Finally, we will require that the border router that is the next-hop 550 internal peer for a particular address S or G be able to forward data 551 for a matching tree state table entry to all members within the domain. 552 This requirement has implications on specific M-IGPs as follows. 554 5.4.1. Interaction with DVMRP and PIM-DM 556 DVMRP and PIM-DM are both "broadcast and prune" protocols in which every 557 data packet must pass an RPF check against the packet's source address, 558 or be dropped. If the border router receiving packets from an external 559 source is the only BR to inject the route for the source into the 560 domain, then there are no problems. For example, this will always be 561 true for stub domains with a single border router (see Figure 1). 562 Otherwise, the border router receiving packets externally is responsible 563 for encapsulating the data to any other border routers that must inject 564 the data into the domain for RPF checks to succeed. Although peering 565 sessions to internal peers are normally not required, in this situation, 566 BGMP TCP-connections must exist between such internal peers, and the 567 "virtual" interfaces used for encapsulation are owned by BGMP. 569 When an intended border router injector for a source receives 570 encapsulated packets from another border router in its domain, it should 571 create source-specific (S,G) BGMP state. Note that the border router 572 may be configured to do this on a data-rate triggered basis so that the 573 state is not created for very low data-rate/intermittent sources. If 574 source-specific state is created, then its incoming interface points to 575 the virtual encapsulation interface from the border router that 576 forwarded the packet, and it has an SPT flag that is initialized to be 577 False. 
579 When the (S,G) BGMP state is created, the BGMP component will in turn 580 send a BGMP (S,G) Join message to the next-hop external peer towards S 581 if there is no (*,G) state for that same group, G. The (S,G) BGMP state 584 will have the SPT flag set to False if (*,G) BGMP state is present. 586 When the first data packet from S arrives from the external peer and 587 matches the BGMP (S,G) state, and if there is no (*,G) state, the router 588 sets the SPT flag to True, resets the incoming interface to point to the 589 external peer, and sends a BGMP (S,G) Prune message to the border router 590 that was encapsulating the packets (e.g., in Figure 1, BR11 sends the 591 (Src_A,G) Prune to BR12). When the border router with (*,G) state 592 receives the prune for (S,G), it then deletes that border router from 593 its child target list for (S,G). 595 PIM-DM and DVMRP present an additional problem: no protocol 596 mechanism exists for joining and pruning entire groups; only joins and 597 prunes for individual sources are available. We therefore require that 598 some form of Domain-Wide Reports (DWRs) [DWR] be available within such 599 domains. Such messages provide the ability to join and prune an entire 600 group across the domain. One simple heuristic to approximate DWRs is to 601 assume that if there are any internally-reached members, then at least 602 one of them is a sender. With this heuristic, the presence of any M-IGP 603 (S,G) state for internally-reached sources can be used instead. Sending 604 a data packet to a group is then equivalent to sending a DWR for the 605 group. 607 5.4.2. Interaction with PIM-SM 609 Protocols such as PIM-SM build unidirectional shared and source-specific 610 trees. As with DVMRP and PIM-DM, every data packet must pass an RPF 611 check against some group-specific or source-specific address.
613 The fewest encapsulations/decapsulations will be done when the intra- 614 domain tree is rooted at the ingress/egress router for G (which becomes 615 the RP), since in general that router will receive the most packets from 616 external sources. To achieve this, each BGMP border router to a PIM-SM 617 domain should send Candidate-RP-Advertisements within the domain for 618 those groups for which it is the shared-domain tree ingress router. When 619 the border router that is the RP for a group G receives an external data 620 packet, it forwards the packet according to the M-IGP (i.e., PIM-SM) 621 shared-tree outgoing interface list. 623 Other border routers will receive data packets from external sources 624 that are farther down the bidirectional tree of domains. When a border 625 router that is not the RP receives an external packet for which it does 626 not have a source-specific entry, the border router treats it like a 629 local source by creating (S,G) state with the Register flag set, based on 630 normal PIM-SM rules; the border router then encapsulates the data 631 packets in PIM-SM Registers and unicasts them to the RP for the group. 632 As explained above, the RP for the inter-domain group will be one of the 633 other border routers of the domain. 635 If a source's data rate is high enough, DRs within the PIM-SM domain may 636 switch to the shortest-path tree. If the shortest path to an external 637 source is via the group's ingress router for the shared tree, the new 638 (S,G) state in the BGMP border router will not cause BGMP (S,G) Joins 639 because that border router will already have (*,G) state. If, however, 640 the shortest path to an external source is via some other border router, 641 that border router will create (S,G) BGMP state in response to the M-IGP 642 (S,G) Join alert.
In this case, because there is no local (*,G) state to 643 suppress it, the border router will send a BGMP (S,G) Join to the next- 644 hop external peer towards S, in order to pull the data down directly. 645 (See BR11 in Figure 1.) As in normal PIM-SM operation, those PIM-SM 646 routers that have (*,G) and (S,G) state pointing to different incoming 647 interfaces will prune that source off the shared tree. Therefore, all 648 internal interfaces may eventually be pruned off the internal shared 649 tree. 651 5.4.3. Interaction with CBTv2 653 CBT builds bidirectional shared trees but must address two points of 654 compatibility with BGMP. First, CBT cannot accommodate more than one 655 border router injecting a packet. Therefore, if a CBT domain does have 656 multiple external connections, the M-IGP components of the border 657 routers are responsible for ensuring that only one of them will inject 658 data from any given source. This mechanism is provided in [CBTDM]. 660 Second, CBTv2 cannot process source-specific Joins or Prunes. Two 661 options thus exist for each CBTv2 domain: 663 Option A: 664 The CBT component interprets an (S,G) Join alert as if it were a 665 (*,G) Join alert, as described in [INTEROP]. That is, if it is not 666 already on the core-tree for G, then it sends a CBT (*,G) JOIN- 667 REQUEST message towards the core for G. Similarly, when the CBT 668 component receives an (S,G) Prune alert, and the child interface list 669 for a group is NULL, then it sends a (*,G) QUIT_NOTIFICATION towards 670 the core for G. This option has the disadvantage of pulling all data 671 for the group G down to the CBT domain when no members exist.
675 Option B: 676 The CBT domain does not propagate any source routes (i.e., non-class 677 D routes) to its external peers for the Multicast RIB unless it is 678 known that no other path exists to that prefix (e.g., routes for 679 prefixes internal to the domain or in a singly-homed customer's 680 domain may be propagated). This ensures that source-specific joins 681 are never received unless the source's data already passes through 682 the domain on the shared tree, in which case the (S,G) Join need not 683 be propagated anyway. BGMP border routers will only send source- 684 specific Joins or Prunes to an external peer if that external peer 685 advertises source-prefixes in the EGP. If a BGMP-CBT border router 686 does receive an (S,G) Join or Prune, that border router should ignore 687 the message. 689 To minimize encapsulations/decapsulations, CBTv2 BRs may follow the same scheme as 690 described under PIM-SM above, in which Candidate-Core advertisements are 691 sent by each BR for those groups for which it is the shared-tree ingress router. 693 5.4.4. Interaction with MOSPF 695 As with CBTv2, MOSPF cannot process source-specific Joins or Prunes, and 696 the same two options are available. Therefore, an MOSPF domain may 697 either: 699 Option A: 700 send a Group-Membership-LSA for all of G in response to an (S,G) Join 701 alert, and "prematurely age" it out (when no other downstream members 702 exist) in response to an (S,G) Prune alert, OR 704 Option B: 705 not propagate any source routes (i.e., non-class D routes) to its 706 external peers for the Multicast RIB unless it is known that no other 707 path exists to that prefix (e.g., routes for prefixes internal to the 708 domain or in a singly-homed customer's domain may be propagated). 710 6. Interaction with address allocation 712 6.1.
Requirements for BGMP components 714 Each border router must be able to determine (e.g., from MASC [MASC]) 715 which class-D prefixes (if any) belong to each domain in which a 716 component resides. 718 Draft BGMP August 1998 720 7. Transition Strategy 722 There have been significant barriers to multicast deployment in Internet 723 backbones. While many of the problems with the current DVMRP backbone 724 (MBONE) have been documented in [ISSUES], most of these problems require 725 longer term engineering solutions. However, there is much that can be 726 done with existing technologies to enable deployment and put in place an 727 architecture that will enable a smooth transition to the next generation 728 of inter-domain multicast routing protocols (i.e., BGMP). This section 729 proposes a near-term transition strategy and architecture that is 730 designed to be simple, risk-neutral, and provide a smooth, incremental 731 transition path to BGMP. In addition, the transition architecture 732 provides for improved convergence properties, some initial policy 733 control, and the opportunity for providers to run either native or 734 tunneled multicast backbones and exchanges. 736 The transition strategy proposed here is to initially use BGP4+ [MBGP] 737 to provide the desired convergence and policy control properties, and 738 PIM-DM for multicast data forwarding. Once this architecture is in 739 place, backbones and exchanges can incrementally transition to BGMP and 740 domains running other M-IGPs may be incorporated more fully. 742 Since the current MBone uses a broadcast-and-prune backbone running 743 DVMRP, BGMP may view the entire MBone as a single multi-homed stub 744 domain (with a new AS number). The members-are-senders heuristic can 745 then be used initially to provide membership notifications within this 746 stub domain. 748 A BGMP backbone can then be formed by designating one or more neutral 749 PIM-DM domains (say, exchanges) as initial BGMP backbones. 
Each 750 exchange is then associated with a group prefix which is injected into 751 the Multicast RIB by all BGP4+/BGMP border routers on that exchange. 753 Any domain which meets the following constraints may then transition 754 from a normal MBone-connected domain to one running BGMP: 756 (1) Must peer with another BGMP domain and participate in M-BGP to 757 propagate routes in the Multicast RIB. 759 (2) Must establish an internal (to the MBone AS) EGP (e.g., iBGP) peer 760 relationship with other border routers of the MBone "stub" domain, 761 as is done with unicast routing. We expect this to eventually 762 involve the use of one or more route reflectors [REFLECT] inside 763 the MBone domain. 767 (3) If the transition will partition the MBone "stub" domain, then it 768 must be ensured that the MBone domain will be administratively 769 split into multiple domains, each with a different multicast AS 770 number. 774 7.1. Preventing transit through the MBone stub 776 We desire that two ASes which are mutually reachable through BGMP use 777 paths which do not pass through the MBone stub domain. This is 778 illustrated in Figure 2, where the MBone stub is AS 5, which is multi- 779 homed to both AS 3 and AS 4. Paths between sources and destinations 780 which have already transitioned to BGP4+/BGMP should not use AS 5 as 781 transit unless no other path exists.
783 ----------------------\ /---------------------------- 784 | | 785 DVMRP /----\ | | /----\ IGP/iBGP 786 ..............| BR |+++++++++| BR |----------- 787 \----/ | E | \----/ 788 + | B | + AS 3 789 MBone + | G | + 790 + | P \-----+---------------------- 791 AS 5 iBGP + | + eBGP 792 + | /-----+---------------------- 793 + | | + 794 + | | + 795 DVMRP /----\ | | /----\ IGP/iBGP 796 ..............| BR |+++++++++| BR |----------- 797 \----/ | | \----/ 798 | | AS 4 799 | | 800 ----------------------/ \---------------------------- 802 Figure 2: Preventing Transit through MBone Stub 804 This requirement is easily solved using standard BGP policy mechanisms. 805 The MBone border routers should prefer EGP routes to DVMRP routes, since 806 DVMRP cannot tag routes as being external. Thus, external routes may 807 appear in the DVMRP routing table, but will not be imported into the EGP 808 since they will be overridden by iBGP routes. 810 Other EGP routers should prefer routes whose ASpath does not contain the 811 well-known MBone AS number. This will insure that the route through the 812 MBone stub is not used unless no other path exists. For safety, routes 813 whose ASpath begins with the MBone AS should receive the worst 814 preference. 816 Draft BGMP August 1998 818 8. Message Formats 820 This section describes message formats used by BGMP. 822 Messages are sent over a reliable transport protocol connection (BGMP 823 uses TCP port 264 to listen for incoming connections). A message is 824 processed only after it is entirely received. The maximum message size 825 is 4096 octets. All implementations are required to support this 826 maximum message size. 828 All fields labelled "Reserved" below must be transmitted as 0, and 829 ignored upon receipt. 831 8.1. Message Header Format 833 Each message has a fixed-size (4-byte) header. There may or may not be 834 a data portion following the header, depending on the message type. 
The 835 layout of these fields is shown below: 837 0 1 2 3 838 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 839 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 840 | Length | Type | Reserved | 841 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 843 Length: 844 This 2-octet unsigned integer indicates the total length of the 845 message, including the header, in octets. Thus, e.g., it allows 846 one to locate in the transport-level stream the start of the next 847 message. The value of the Length field must always be at least 4 848 and no greater than 4096, and may be further constrained, depending 849 on the message type. No "padding" of extra data after the message 850 is allowed, so the Length field must have the smallest value 851 required given the rest of the message. 853 Type: 854 This 1-octet unsigned integer indicates the type code of the 855 message. The following type codes are defined: 857 1 - OPEN 858 2 - UPDATE 859 3 - NOTIFICATION 861 Draft BGMP August 1998 863 4 - KEEPALIVE 865 8.2. OPEN Message Format 867 After a transport protocol connection is established, the first message 868 sent by each side is an OPEN message. If the OPEN message is 869 acceptable, a KEEPALIVE message confirming the OPEN is sent back. Once 870 the OPEN is confirmed, UPDATE, KEEPALIVE, and NOTIFICATION messages may 871 be exchanged. 
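The fixed header of Section 8.1 and its Length and Type constraints can be sketched as follows. This is a minimal illustration, not a reference implementation: the names HEADER, MESSAGE_TYPES, HeaderError, parse_header, and split_stream are hypothetical. It assumes network byte order and a 1-octet Reserved field, as implied by the 4-byte header layout.

```python
import struct

# Length (2 octets), Type (1 octet), Reserved (1 octet), network byte order.
HEADER = struct.Struct("!HBB")
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


class HeaderError(ValueError):
    """Raised for conditions reported as a Message Header Error."""


def parse_header(data):
    """Return (length, type name) for the header at the start of data."""
    length, msg_type, _reserved = HEADER.unpack_from(data)  # Reserved is ignored
    if not 4 <= length <= 4096:
        raise HeaderError("Bad Message Length")
    if msg_type not in MESSAGE_TYPES:
        raise HeaderError("Bad Message Type")
    return length, MESSAGE_TYPES[msg_type]


def split_stream(buffer):
    """Split a byte stream into complete (type name, body) messages.

    Returns the messages found and any trailing bytes that do not yet
    form a complete message, since a message is processed only after
    it has been entirely received.
    """
    messages, offset = [], 0
    while len(buffer) - offset >= HEADER.size:
        length, name = parse_header(buffer[offset:])
        if len(buffer) - offset < length:
            break  # wait for the rest of this message
        messages.append((name, buffer[offset + HEADER.size:offset + length]))
        offset += length
    return messages, buffer[offset:]
```

The per-type length checks (for example, that a KEEPALIVE is exactly 4 octets) are left out of this sketch.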
873 In addition to the fixed-size BGMP header, the OPEN message contains the 874 following fields: 876 0 1 2 3 877 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 878 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 879 | Version | Reserved | Hold Time | 880 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 881 | BGMP Identifier | 882 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 883 | | 884 + (Optional Parameters) | 885 | | 886 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 888 Version: 889 This 1-octet unsigned integer indicates the protocol version number 890 of the message. The current BGMP version number is 1. 892 Hold Time: 893 This 2-octet unsigned integer indicates the number of seconds that 894 the sender proposes for the value of the Hold Timer. Upon receipt 895 of an OPEN message, a BGMP speaker MUST calculate the value of the 896 Hold Timer by using the smaller of its configured Hold Time and the 897 Hold Time received in the OPEN message. The Hold Time MUST be 898 either zero or at least three seconds. An implementation may 899 reject connections on the basis of the Hold Time. The calculated 900 value indicates the maximum number of seconds that may elapse 901 between the receipt of successive KEEPALIVE and/or UPDATE messages 902 by the sender. 906 BGMP Identifier: 907 This 4-octet unsigned integer indicates the BGMP Identifier of the 908 sender. A given BGMP speaker sets the value of its BGMP Identifier 909 to a globally-unique value assigned to that BGMP speaker (e.g., an 910 IPv4 address). The value of the BGMP Identifier is determined on 911 startup and is the same for every BGMP session opened. 913 Optional Parameters: 914 This field may contain a list of optional parameters, where each 915 parameter is encoded as a (Parameter Type, Parameter Length, Parameter Value) triplet.
The combined length of all optional 917 parameters can be derived from the Length field in the message 918 header. 920 0 1 921 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 922 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... 923 | Parm. Type | Parm. Length | Parameter Value (variable) 924 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... 926 Parameter Type is a one octet field that unambiguously identifies 927 individual parameters. Parameter Length is a one octet field that 928 contains the length of the Parameter Value field in octets. 929 Parameter Value is a variable length field that is interpreted 930 according to the value of the Parameter Type field. 932 This document defines the following Optional Parameters: 934 a) Authentication Information (Parameter Type 1): 935 This optional parameter may be used to authenticate a BGMP peer. 936 The Parameter Value field contains a 1-octet Authentication Code 937 followed by a variable length Authentication Data. 939 0 1 2 3 4 5 6 7 8 940 +-+-+-+-+-+-+-+-+ 941 | Auth. Code | 942 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 943 | | 944 | Authentication Data | 945 | | 946 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 948 Authentication Code: 950 Draft BGMP August 1998 952 This 1-octet unsigned integer indicates the authentication 953 mechanism being used. Whenever an authentication mechanism is 954 specified for use within BGMP, three things must be included in 955 the specification: 957 - the value of the Authentication Code which indicates use of the 958 mechanism, 960 - the form and meaning of the Authentication Data, and 962 - the algorithm for computing values of Marker fields. 964 Note that a separate authentication mechanism may be used in 965 establishing the transport level connection. 967 Authentication Data: 969 The form and meaning of this field depend on the Authentication 970 Code. 972 The minimum length of the OPEN message is 12 octets (including 973 message header). 
975 b) Capability Information (Parameter Type 2): 976 This is an Optional Parameter that is used by a BGMP speaker to 977 convey to its peer a list of capabilities supported by the speaker. 978 The parameter contains one or more triples (Capability Code, Capability Length, Capability Value), where each triple is encoded 980 as shown below: 981 +------------------------------+ 982 | Capability Code (1 octet) | 983 +------------------------------+ 984 | Capability Length (1 octet) | 985 +------------------------------+ 986 | Capability Value (variable) | 987 +------------------------------+ 988 Capability Code: 990 Capability Code is a one octet field that unambiguously identifies 991 individual capabilities. 993 Capability Length: 995 Capability Length is a one octet field that contains the length of 999 the Capability Value field in octets. 1001 Capability Value: 1003 Capability Value is a variable length field that is interpreted 1004 according to the value of the Capability Code field. 1006 A particular capability, as identified by its Capability Code, may 1007 occur more than once within the Optional Parameter. 1009 This document reserves Capability Codes 128-255 for vendor-specific 1010 applications. 1012 This document reserves Capability Code 0. 1014 Capability Codes (other than those reserved for vendor-specific use) 1015 are assigned only by the IETF consensus process and IESG approval. 1017 8.3. UPDATE Message Format 1019 UPDATE messages are used to transfer Join/Prune information between BGMP 1020 peers. The UPDATE message always includes the fixed-size BGMP header, 1021 and one or more attributes as described below. 1023 The message format below allows compact encoding of (*,G) Joins and 1024 Prunes, while allowing the flexibility needed to do other updates such 1025 as (S,G) Joins and Prunes towards sources as well as on the shared tree.
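The OPEN body of Section 8.2 can be decoded along the following lines, since Optional Parameters and Capability triples share the same shape: a 1-octet code, a 1-octet length, and a value. This is an illustrative sketch only. The function names parse_tlvs and parse_open are hypothetical, and the sketch assumes network byte order and a 1-octet Reserved field after Version, which is what the 12-octet minimum OPEN length implies.

```python
import struct


def parse_tlvs(data):
    """Split data into (code, value) pairs of 1-octet code, 1-octet length, value."""
    items, offset = [], 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated code/length")
        code, length = data[offset], data[offset + 1]
        value = data[offset + 2:offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated value")
        items.append((code, value))
        offset += 2 + length
    return items


def parse_open(body):
    """Decode an OPEN message body (everything after the 4-byte header)."""
    # Version (1), Reserved (1), Hold Time (2), BGMP Identifier (4).
    version, _reserved, hold_time, identifier = struct.unpack_from("!BBHI", body)
    params = parse_tlvs(body[8:])
    capabilities = []
    for param_type, value in params:
        if param_type == 2:  # Capability Information
            capabilities.extend(parse_tlvs(value))
    return version, hold_time, identifier, params, capabilities
```

A capability may legitimately appear more than once, so the result is kept as a list rather than a dictionary keyed by Capability Code.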
1026 In the discussion below, an Encoded-Address-Prefix is of the form: 1027 0 1 2 3 1028 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1029 +-+-+-+-+-+-+-+-+ 1030 |EnTyp| AddrFam | 1031 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1032 | Address (variable length) | 1033 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1034 | Mask (variable length) | 1035 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1037 EnTyp: 1038 0 - All 1's Mask. The Mask field is 0 bytes long. 1039 1 - Mask length included. The Mask field is 4 bytes long, and 1040 contains the mask length, in bits. 1041 2 - Full Mask included. The Mask field is the same length 1043 Draft BGMP August 1998 1045 as the Address field, and contains the full bitmask. 1047 AddrFam: 1048 The IANA-assigned address family number of the encoded prefix. 1049 These include (among others): 1051 Number Description 1052 ------ ----------- 1053 1 IP (IP version 4) 1054 2 IPv6 (IP version 6) 1056 Address: 1057 The address associated with the given prefix to be encoded. The 1058 length is determined based on the Address Family. 1060 Mask: 1061 The mask associated with the given prefix. The format (or absence) 1062 of this field is determined by the EnTyp field. 1064 Each attribute is of the form: 1066 0 1 2 3 1067 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1068 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1069 | Length | Type | Data ... 1070 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1071 All attributes are 4-byte aligned. 1073 Length: 1074 The Length is the length of the entire attribute, including the 1075 length, type, and data fields. If other attributes are nested 1076 within the data field, the length includes the size of all such 1077 nested attributes. 1079 Type: 1081 Types 128-255 are reserved for "optional" attributes. 
If a 1082 required attribute is unrecognized, a NOTIFICATION will be sent and 1083 the connection will be closed. Unrecognized optional attributes 1084 are simply ignored. 1088 0 - JOIN 1089 1 - PRUNE 1090 2 - GROUP 1091 3 - SOURCE 1093 a) JOIN (Type Code 0) 1095 The JOIN attribute indicates that all GROUP or SOURCE attributes 1096 nested immediately within the JOIN attribute should be joined. 1098 0 1 2 3 1099 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1100 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1101 | Length | Type=0 | Reserved | 1102 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1103 | Nested Attributes ... 1104 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1105 No JOIN or PRUNE attributes may be immediately nested within a JOIN 1106 attribute. 1108 b) PRUNE (Type Code 1) 1110 The PRUNE attribute indicates that all GROUP or SOURCE attributes 1111 nested immediately within the PRUNE attribute should be pruned. 1113 0 1 2 3 1114 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1115 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1116 | Length | Type=1 | Reserved | 1117 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1118 | Nested Attributes ... 1119 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1120 No JOIN or PRUNE attributes may be immediately nested within a PRUNE 1121 attribute. 1123 c) GROUP (Type Code 2) 1125 The GROUP attribute identifies a given group-prefix. In addition, 1126 any attributes nested immediately within the GROUP attribute also 1127 apply to the given group-prefix.
1131 0 1 2 3 1132 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1133 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1134 | Length | Type=2 | | 1135 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + 1136 | | 1137 | Encoded-Address-Prefix | 1138 | | 1139 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1140 | Nested Attributes (optional) ... 1141 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1142 No GROUP or SOURCE attributes may be immediately nested within a 1143 GROUP attribute. 1145 Encoded-Address-Prefix 1146 The multicast group prefix to be joined or pruned, in the format 1147 described above. 1149 d) SOURCE (Type Code 3): 1151 The SOURCE attribute identifies a given source-prefix. In addition, any 1152 attributes nested immediately within the SOURCE attribute also apply to 1153 the given source-prefix. 1155 The SOURCE attribute has the following format: 1157 0 1 2 3 1158 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1159 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1160 | Length | Type=3 | | 1161 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + 1162 | | 1163 | Encoded-Address-Prefix | 1164 | | 1165 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1166 | Nested Attributes (optional) ... 1167 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1169 Encoded-Address-Prefix 1170 The Source-prefix in the format described above. 1172 Nested Attributes 1173 No GROUP or SOURCE attributes may be immediately nested within a 1174 SOURCE attribute. 1178 8.4. Encoding examples 1180 Below are enumerated examples of how various updates are built using 1181 nested attributes, where A ( B ) denotes that attribute B is nested 1182 within attribute A.
1183 (*,G-prefix) Join: JOIN ( GROUP ) 1184 (*,G-prefix) Prune: PRUNE ( GROUP ) 1185 (S,G) Join towards S : GROUP ( JOIN ( SOURCE ) ) 1186 (S,G) Join cancelling prune towards G: GROUP ( JOIN ( SOURCE ) ) 1187 (S,G) Prune towards S: GROUP ( PRUNE ( SOURCE ) ) 1188 (S,G) Prune towards G: GROUP ( PRUNE ( SOURCE ) ) 1189 Switch from (*,G) to (S,G): PRUNE ( GROUP ( JOIN ( SOURCE ) ) ) 1190 Switch from (S,G) to (*,G): JOIN ( GROUP ) 1191 Initial (*,G) Join with S pruned: JOIN ( GROUP ( PRUNE ( SOURCE ) ) ) 1193 8.5. KEEPALIVE Message Format 1195 BGMP does not use any transport protocol-based keep-alive mechanism to 1196 determine if peers are reachable. Instead, KEEPALIVE messages are 1197 exchanged between peers often enough as not to cause the Hold Timer to 1198 expire. A reasonable maximum time between the last KEEPALIVE or UPDATE 1199 message sent, and the time at which a KEEPALIVE message is sent, would 1200 be one third of the Hold Time interval. KEEPALIVE messages MUST NOT be 1201 sent more frequently than one per second. An implementation MAY adjust 1202 the rate at which it sends KEEPALIVE messages as a function of the Hold 1203 Time interval. 1205 If the negotiated Hold Time interval is zero, then periodic KEEPALIVE 1206 messages MUST NOT be sent. 1208 A KEEPALIVE message consists of only a message header, and has a length 1209 of 4 octets. 1211 8.6. NOTIFICATION Message Format 1213 A NOTIFICATION message is sent when an error condition is detected. The 1214 BGMP connection is closed immediately after sending it. 
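The Hold Time negotiation of Section 8.2 and the KEEPALIVE pacing of Section 8.5 can be sketched as follows. The function names are hypothetical, and the one-third ratio is the "reasonable maximum" suggested in the text rather than a required value, so an implementation may choose a different interval.

```python
def negotiate_hold_time(configured, received):
    """Return the Hold Time to use: the smaller of the two proposals.

    Values of one or two seconds must be rejected; zero or at least
    three seconds is acceptable.
    """
    if received in (1, 2):
        raise ValueError("Unacceptable Hold Time")
    return min(configured, received)


def keepalive_interval(hold_time):
    """Return seconds between KEEPALIVEs, or None if none are to be sent."""
    if hold_time == 0:
        return None  # periodic KEEPALIVE messages MUST NOT be sent
    # One third of the Hold Time, but never more often than once per second.
    return max(1.0, hold_time / 3)
```

Taking the smaller proposal means that either side can disable keepalives for the session by proposing a Hold Time of zero.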
1216 In addition to the fixed-size BGMP header, the NOTIFICATION message 1217 contains the following fields: 1219 Draft BGMP August 1998 1221 0 1 2 3 1222 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1223 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1224 | Error code | Error subcode | Data | 1225 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + 1226 | | 1227 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1229 Error Code: 1231 This 1-octet unsigned integer indicates the type of 1232 NOTIFICATION. The following Error Codes have been defined: 1234 Error Code Symbolic Name Reference 1236 1 Message Header Error Section 9.1 1238 2 OPEN Message Error Section 9.2 1240 3 UPDATE Message Error Section 9.3 1242 4 Hold Timer Expired Section 9.5 1244 5 Finite State Machine Error Section 9.6 1246 6 Cease Section 9.7 1248 Error subcode: 1250 This 1-octet unsigned integer provides more specific 1251 information about the nature of the reported error. Each Error 1252 Code may have one or more Error Subcodes associated with it. 1253 If no appropriate Error Subcode is defined, then a zero 1254 (Unspecific) value is used for the Error Subcode field. 1256 Message Header Error subcodes: 1258 2 - Bad Message Length. 1259 3 - Bad Message Type. 1261 OPEN Message Error subcodes: 1263 1 - Unsupported Version Number 1264 4 - Unsupported Optional Parameter 1265 5 - Authentication Failure 1267 Draft BGMP August 1998 1269 6 - Unacceptable Hold Time 1270 7 - Unsupported Capability 1272 UPDATE Message Error subcodes: 1274 1 - Malformed Attribute List 1275 2 - Unrecognized Well-known Attribute 1276 5 - Attribute Length Error 1277 10 - Invalid Prefix Field 1278 Data: 1279 This variable-length field is used to diagnose the reason for the 1280 NOTIFICATION. The contents of the Data field depend upon the 1281 Error Code and Error Subcode. See Section 9 below for more 1282 details. 
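Building and decoding a NOTIFICATION as laid out above can be sketched as follows. The helper names are hypothetical, and the sketch assumes network byte order. The total length is the 4-byte header plus the Error Code and Error Subcode octets plus the Data field.

```python
import struct

NOTIFICATION = 3  # message type code from Section 8.1

# Length, Type, Reserved, Error Code, Error Subcode.
FIXED = struct.Struct("!HBBBB")


def build_notification(error_code, error_subcode=0, data=b""):
    """Return a complete NOTIFICATION message, header included."""
    length = FIXED.size + len(data)
    if length > 4096:
        raise ValueError("message exceeds the 4096-octet maximum")
    return FIXED.pack(length, NOTIFICATION, 0, error_code, error_subcode) + data


def parse_notification(message):
    """Return (error code, error subcode, data) from a complete message."""
    length, msg_type, _reserved, code, subcode = FIXED.unpack_from(message)
    if msg_type != NOTIFICATION or length != len(message):
        raise ValueError("not a well-formed NOTIFICATION")
    return code, subcode, message[FIXED.size:]
```

For example, a Message Header Error (code 1) with subcode 2 (Bad Message Length) would carry the erroneous Length field as its Data, giving an 8-octet message.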
1284 Note that the length of the Data field can be determined from the 1285 message Length field by the formula: 1287 Message Length = 6 + Data Length 1289 The minimum length of the NOTIFICATION message is 6 octets 1290 (including message header). 1292 9. BGMP Error Handling 1294 This section describes actions to be taken when errors are detected 1295 while processing BGMP messages. BGMP Error Handling is similar to that 1296 of BGP [RFC1771]. 1298 When any of the conditions described here are detected, a NOTIFICATION 1299 message with the indicated Error Code, Error Subcode, and Data fields is 1300 sent, and the BGMP connection is closed. If no Error Subcode is 1301 specified, then a zero must be used. 1303 The phrase "the BGMP connection is closed" means that the transport 1304 protocol connection has been closed and that all resources for that BGMP 1305 connection have been deallocated. The remote peer is removed from the 1306 target list of all tree state entries. 1308 Unless specified explicitly, the Data field of the NOTIFICATION message 1309 that is sent to indicate an error is empty. 1311 Draft BGMP August 1998 1313 9.1. Message Header error handling 1315 All errors detected while processing the Message Header are indicated by 1316 sending the NOTIFICATION message with Error Code Message Header Error. 1317 The Error Subcode elaborates on the specific nature of the error. 1319 If the Length field of the message header is less than 4 or greater than 1320 4096, or if the Length field of an OPEN message is less than the minimum 1321 length of the OPEN message, or if the Length field of an UPDATE message 1322 is less than the minimum length of the UPDATE message, or if the Length 1323 field of a KEEPALIVE message is not equal to 4, then the Error Subcode 1324 is set to Bad Message Length. The Data field contains the erroneous 1325 Length field. 1327 If the Type field of the message header is not recognized, then the 1328 Error Subcode is set to Bad Message Type. 
The Data field contains the erroneous Type field.

9.2. OPEN message error handling

All errors detected while processing the OPEN message are indicated by
sending the NOTIFICATION message with Error Code OPEN Message Error.
The Error Subcode elaborates on the specific nature of the error.

If the version number contained in the Version field of the received
OPEN message is not supported, then the Error Subcode is set to
Unsupported Version Number. The Data field is a 2-octet unsigned
integer, which indicates the largest locally supported version number
less than the version the remote BGMP peer bid (as indicated in the
received OPEN message).

If the Hold Time field of the OPEN message is unacceptable, then the
Error Subcode MUST be set to Unacceptable Hold Time. An implementation
MUST reject Hold Time values of one or two seconds. An implementation
MAY reject any proposed Hold Time. An implementation which accepts a
Hold Time MUST use the negotiated value for the Hold Time.

If one of the Optional Parameters in the OPEN message is not recognized,
then the Error Subcode is set to Unsupported Optional Parameter.

If the OPEN message carries Authentication Information (as an Optional
Parameter), then the corresponding authentication procedure is invoked.
If the authentication procedure (based on Authentication Code and
Authentication Data) fails, then the Error Subcode is set to
Authentication Failure.

If the OPEN message indicates that the peer does not support a
capability which the receiver requires, the receiver may send a
NOTIFICATION message to the peer, and terminate peering. The Error
Subcode in the message is set to Unsupported Capability. The Data field
in the NOTIFICATION message lists the set of capabilities that cause the
speaker to send the message.
Each such capability is encoded the same way as in OPEN messages.

9.3. UPDATE message error handling

All errors detected while processing the UPDATE message are indicated by
sending the NOTIFICATION message with Error Code UPDATE Message Error.
The Error Subcode elaborates on the specific nature of the error.

If any recognized attribute has an Attribute Length that conflicts with
the expected length (based on the attribute type code), then the Error
Subcode is set to Attribute Length Error. The Data field contains the
erroneous attribute (type, length and value).

If the Encoded-Address-Prefix field in some attribute is syntactically
incorrect, then the Error Subcode is set to Invalid Prefix Field.

If any other error is encountered when processing attributes (such as
invalid nestings), then the Error Subcode is set to Malformed Attribute
List, and the problematic attribute is included in the Data field.

9.4. NOTIFICATION message error handling

If a peer sends a NOTIFICATION message, and there is an error in that
message, there is unfortunately no means of reporting this error via a
subsequent NOTIFICATION message. Any such error, such as an
unrecognized Error Code or Error Subcode, should be noticed, logged
locally, and brought to the attention of the administration of the peer.
The means to do this, however, lies outside the scope of this document.

9.5. Hold Timer Expired error handling

If a system does not receive successive KEEPALIVE and/or UPDATE and/or
NOTIFICATION messages within the period specified in the Hold Time field
of the OPEN message, then the NOTIFICATION message with Hold Timer
Expired Error Code must be sent and the BGMP connection closed.

9.6. Finite State Machine error handling

Any error detected by the BGMP Finite State Machine (e.g., receipt of an
unexpected event) is indicated by sending the NOTIFICATION message with
Error Code Finite State Machine Error.

9.7. Cease

In the absence of any fatal errors (that are indicated in this section),
a BGMP peer may choose at any given time to close its BGMP connection by
sending the NOTIFICATION message with Error Code Cease. However, the
Cease NOTIFICATION message must not be used when a fatal error indicated
by this section does exist.

9.8. Connection collision detection

If a pair of BGMP speakers try simultaneously to establish a TCP
connection to each other, then two parallel connections between this
pair of speakers might well be formed. We refer to this situation as
connection collision. Clearly, one of these connections must be closed.

Based on the value of the BGMP Identifier, a convention is established
for detecting which BGMP connection is to be preserved when a collision
does occur. The convention is to compare the BGMP Identifiers of the
peers involved in the collision and to retain only the connection
initiated by the BGMP speaker with the higher-valued BGMP Identifier.

Upon receipt of an OPEN message, the local system must examine all of
its connections that are in the OpenConfirm state. A BGMP speaker may
also examine connections in an OpenSent state if it knows the BGMP
Identifier of the peer by means outside of the protocol. If among these
connections there is a connection to a remote BGMP speaker whose BGMP
Identifier equals the one in the OPEN message, then the local system
performs the following collision resolution procedure:

1. The BGMP Identifier of the local system is compared to the BGMP
   Identifier of the remote system (as specified in the OPEN message).

2. If the value of the local BGMP Identifier is less than the remote
   one, the local system closes the BGMP connection that already exists
   (the one that is already in the OpenConfirm state), and accepts the
   BGMP connection initiated by the remote system.

3. Otherwise, the local system closes the newly created BGMP connection
   (the one associated with the newly received OPEN message), and
   continues to use the existing one (the one that is already in the
   OpenConfirm state).

Comparing BGMP Identifiers is done by treating them as (4-octet long)
unsigned integers.

A connection collision with an existing BGMP connection that is in the
Established state causes unconditional closing of the newly created
connection. Note that a connection collision cannot be detected with
connections that are in Idle, or Connect, or Active states.

Closing the BGMP connection (that results from the collision resolution
procedure) is accomplished by sending the NOTIFICATION message with the
Error Code Cease.

10. BGMP Version Negotiation

BGMP speakers may negotiate the version of the protocol by making
multiple attempts to open a BGMP connection, starting with the highest
version number each supports. If an open attempt fails with an Error
Code OPEN Message Error, and an Error Subcode Unsupported Version
Number, then the BGMP speaker has available the version number it tried,
the version number its peer tried, the version number passed by its peer
in the NOTIFICATION message, and the version numbers that it supports.
If the two peers do support one or more common versions, then this will
allow them to rapidly determine the highest common version. In order to
support BGMP version negotiation, future versions of BGMP must retain
the format of the OPEN and NOTIFICATION messages.

10.1. BGMP Capability Negotiation

When a BGMP speaker sends an OPEN message to its BGMP peer, the message
may include an Optional Parameter, called Capabilities. The parameter
lists the capabilities supported by the speaker.

A BGMP speaker may use a particular capability when peering with another
speaker only if both speakers support that capability. A BGMP speaker
determines the capabilities supported by its peer by examining the list
of capabilities present in the Capabilities Optional Parameter carried
by the OPEN message that the speaker receives from the peer.

11. BGMP Finite State machine

This section specifies BGMP operation in terms of a Finite State Machine
(FSM). Following is a brief summary and overview of BGMP operations by
state as determined by this FSM.

Initially BGMP is in the Idle state.

   Idle state:

      In this state BGMP refuses all incoming BGMP connections. No
      resources are allocated to the peer. In response to the Start
      event (initiated by either system or operator) the local system
      initializes all BGMP resources, starts the ConnectRetry timer,
      initiates a transport connection to the other BGMP peer, while
      listening for a connection that may be initiated by the remote
      BGMP peer, and changes its state to Connect. The exact value of
      the ConnectRetry timer is a local matter, but should be
      sufficiently large to allow TCP initialization.

      If a BGMP speaker detects an error, it shuts down the connection
      and changes its state to Idle. Getting out of the Idle state
      requires generation of the Start event. If such an event is
      generated automatically, then persistent BGMP errors may result in
      persistent flapping of the speaker.
      To avoid such a condition it
      is recommended that Start events should not be generated
      immediately for a peer that was previously transitioned to Idle
      due to an error. For a peer that was previously transitioned to
      Idle due to an error, the time between consecutive generation of
      Start events, if such events are generated automatically, shall
      exponentially increase. The value of the initial timer shall be 60
      seconds. The time shall be doubled for each consecutive retry.

      Any other event received in the Idle state is ignored.

   Connect state:

      In this state BGMP is waiting for the transport protocol
      connection to be completed.

      If the transport protocol connection succeeds, the local system
      clears the ConnectRetry timer, completes initialization, sends an
      OPEN message to its peer, and changes its state to OpenSent. If
      the transport protocol connect fails (e.g., retransmission
      timeout), the local system restarts the ConnectRetry timer,
      continues to listen for a connection that may be initiated by the
      remote BGMP peer, and changes its state to the Active state.

      In response to the ConnectRetry timer expired event, the local
      system restarts the ConnectRetry timer, initiates a transport
      connection to the other BGMP peer, continues to listen for a
      connection that may be initiated by the remote BGMP peer, and
      stays in the Connect state.

      The Start event is ignored in the Connect state.

      In response to any other event (initiated by either system or
      operator), the local system releases all BGMP resources associated
      with this connection and changes its state to Idle.

   Active state:

      In this state BGMP is trying to acquire a peer by initiating a
      transport protocol connection.
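
As an illustration of the Idle-state retry rule given earlier in this
section (initial timer of 60 seconds, doubled for each consecutive
retry), the following minimal Python sketch generates the delays between
automatic Start events. The function name and the optional `cap`
argument are assumptions made for the example; the text above does not
define an upper bound on the delay.

```python
from itertools import islice

INITIAL_DELAY_S = 60  # initial timer value stated in the Idle-state text


def start_event_delays(initial=INITIAL_DELAY_S, cap=None):
    """Yield the delay (seconds) before each consecutive automatic Start event.

    The delay starts at `initial` and doubles on every retry.
    `cap` is a hypothetical upper bound; the draft does not specify one.
    """
    delay = initial
    while True:
        yield delay if cap is None else min(delay, cap)
        delay *= 2


# First five delays after a peer was moved to Idle because of an error.
print(list(islice(start_event_delays(), 5)))  # [60, 120, 240, 480, 960]
```

An implementation would presumably restart the sequence once a
connection succeeds; that policy is not stated here and is left open.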

      If the transport protocol connection succeeds, the local system
      clears the ConnectRetry timer, completes initialization, sends an
      OPEN message to its peer, sets its Hold Timer to a large value,
      and changes its state to OpenSent. A Hold Timer value of 4
      minutes is suggested.

      In response to the ConnectRetry timer expired event, the local
      system restarts the ConnectRetry timer, initiates a transport
      connection to the other BGMP peer, continues to listen for a
      connection that may be initiated by the remote BGMP peer, and
      changes its state to Connect.

      If the local system detects that a remote peer is trying to
      establish a BGMP connection to it, and the IP address of the
      remote peer is not an expected one, the local system restarts the
      ConnectRetry timer, rejects the attempted connection, continues to
      listen for a connection that may be initiated by the remote BGMP
      peer, and stays in the Active state.

      The Start event is ignored in the Active state.

      In response to any other event (initiated by either system or
      operator), the local system releases all BGMP resources associated
      with this connection and changes its state to Idle.

   OpenSent state:

      In this state BGMP waits for an OPEN message from its peer. When
      an OPEN message is received, all fields are checked for
      correctness. If the BGMP message header checking or OPEN message
      checking detects an error (see Section 9.2), or a connection
      collision (see Section 9.8), the local system sends a NOTIFICATION
      message and changes its state to Idle.

      If there are no errors in the OPEN message, BGMP sends a KEEPALIVE
      message and sets a KeepAlive timer. The Hold Timer, which was
      originally set to a large value (see above), is replaced with the
      negotiated Hold Time value.
      If the negotiated Hold Time value is
      zero, then the Hold Timer and KeepAlive timer are not
      started. Finally, the state is changed to OpenConfirm.

      If a disconnect notification is received from the underlying
      transport protocol, the local system closes the BGMP connection,
      restarts the ConnectRetry timer, continues to listen for a
      connection that may be initiated by the remote BGMP peer, and goes
      into the Active state.

      If the Hold Timer expires, the local system sends a NOTIFICATION
      message with Error Code Hold Timer Expired and changes its state
      to Idle.

      In response to the Stop event (initiated by either system or
      operator) the local system sends a NOTIFICATION message with Error
      Code Cease and changes its state to Idle.

      The Start event is ignored in the OpenSent state.

      In response to any other event, the local system sends a
      NOTIFICATION message with Error Code Finite State Machine Error
      and changes its state to Idle.

      Whenever BGMP changes its state from OpenSent to Idle, it closes
      the BGMP (and transport-level) connection and releases all
      resources associated with that connection.

   OpenConfirm state:

      In this state BGMP waits for a KEEPALIVE or NOTIFICATION message.

      If the local system receives a KEEPALIVE message, it changes its
      state to Established.

      If the Hold Timer expires before a KEEPALIVE message is received,
      the local system sends a NOTIFICATION message with Error Code Hold
      Timer Expired and changes its state to Idle.

      If the local system receives a NOTIFICATION message, it changes
      its state to Idle.

      If the KeepAlive timer expires, the local system sends a KEEPALIVE
      message and restarts its KeepAlive timer.

      If a disconnect notification is received from the underlying
      transport protocol, the local system changes its state to Idle.
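
The OpenConfirm rules listed so far can be read as a small transition
function. The Python sketch below is one possible rendering, covering
only the five events described up to this point; the event strings,
action strings and the `State` enumeration are illustrative names chosen
for the example and are not protocol constants.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    OPEN_CONFIRM = auto()
    ESTABLISHED = auto()


def open_confirm_step(event):
    """Return (next_state, actions) for one event received in OpenConfirm.

    Event and action names are hypothetical labels for this sketch.
    """
    if event == "recv_keepalive":
        return State.ESTABLISHED, []
    if event == "hold_timer_expired":
        return State.IDLE, ["send NOTIFICATION (Hold Timer Expired)"]
    if event in ("recv_notification", "transport_disconnect"):
        return State.IDLE, []
    if event == "keepalive_timer_expired":
        # The state does not change; only the timer is rearmed.
        return State.OPEN_CONFIRM, ["send KEEPALIVE", "restart KeepAlive timer"]
    # Remaining events are deliberately left out of this partial sketch.
    raise ValueError(f"event not covered by this sketch: {event}")


print(open_confirm_step("recv_keepalive")[0].name)  # ESTABLISHED
```

Returning the actions as data, rather than performing them, keeps the
function easy to check against the text paragraph by paragraph.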

      In response to the Stop event (initiated by either system or
      operator) the local system sends a NOTIFICATION message with Error
      Code Cease and changes its state to Idle.

      The Start event is ignored in the OpenConfirm state.

      In response to any other event the local system sends a
      NOTIFICATION message with Error Code Finite State Machine Error
      and changes its state to Idle.

      Whenever BGMP changes its state from OpenConfirm to Idle, it
      closes the BGMP (and transport-level) connection and releases all
      resources associated with that connection.

   Established state:

      In the Established state BGMP can exchange UPDATE, NOTIFICATION,
      and KEEPALIVE messages with its peer.

      If the local system receives an UPDATE or KEEPALIVE message, and
      the negotiated Hold Time value is non-zero, then it restarts its
      Hold Timer.

      If the local system receives a NOTIFICATION message, it changes
      its state to Idle.

      If the local system receives an UPDATE message and the UPDATE
      message error handling procedure (see Section 9.3) detects an
      error, the local system sends a NOTIFICATION message and changes
      its state to Idle.

      If a disconnect notification is received from the underlying
      transport protocol, the local system changes its state to Idle.

      If the Hold Timer expires, the local system sends a NOTIFICATION
      message with Error Code Hold Timer Expired and changes its state
      to Idle.

      If the KeepAlive timer expires, the local system sends a KEEPALIVE
      message and restarts its KeepAlive timer.

      Each time the local system sends a KEEPALIVE or UPDATE message, it
      restarts its KeepAlive timer, unless the negotiated Hold Time
      value is zero.

      In response to the Stop event (initiated by either system or
      operator), the local system sends a NOTIFICATION message with
      Error Code Cease and changes its state to Idle.
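
The timer rules of the Established state can be pictured as simple
deadline bookkeeping. The Python sketch below is illustrative only: the
class name, its methods and the `keepalive_interval` parameter are
assumptions, since this section says that a KeepAlive timer exists but
not how its period is chosen. Time is passed in explicitly so that the
example does not depend on a real clock.

```python
class SessionTimers:
    """Track Hold Timer and KeepAlive timer deadlines for one peer.

    `keepalive_interval` is a hypothetical parameter; the draft text
    quoted here does not say how the KeepAlive period is derived.
    """

    def __init__(self, hold_time, keepalive_interval, now=0.0):
        self.hold_time = hold_time
        self.keepalive_interval = keepalive_interval
        # A negotiated Hold Time of zero means neither timer is started.
        self.enabled = hold_time != 0
        self.hold_deadline = now + hold_time if self.enabled else None
        self.keepalive_deadline = (
            now + keepalive_interval if self.enabled else None
        )

    def on_receive(self, msg_type, now):
        """Restart the Hold Timer when an UPDATE or KEEPALIVE arrives."""
        if self.enabled and msg_type in ("UPDATE", "KEEPALIVE"):
            self.hold_deadline = now + self.hold_time

    def on_send(self, msg_type, now):
        """Restart the KeepAlive timer when an UPDATE or KEEPALIVE is sent."""
        if self.enabled and msg_type in ("UPDATE", "KEEPALIVE"):
            self.keepalive_deadline = now + self.keepalive_interval

    def hold_expired(self, now):
        """True once the Hold Timer deadline has passed (never if disabled)."""
        return self.enabled and now >= self.hold_deadline


timers = SessionTimers(hold_time=90, keepalive_interval=30)
timers.on_receive("KEEPALIVE", now=50)
print(timers.hold_deadline)  # 140
```

When `hold_expired` becomes true, the rule above applies: the local
system sends a NOTIFICATION with Error Code Hold Timer Expired and moves
to Idle.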

      The Start event is ignored in the Established state.

      In response to any other event, the local system sends a
      NOTIFICATION message with Error Code Finite State Machine Error
      and changes its state to Idle.

      Whenever BGMP changes its state from Established to Idle, it
      closes the BGMP (and transport-level) connection, releases all
      resources associated with that connection, and deletes all routes
      derived from that connection.

12. Security Considerations

Security issues are not discussed in this memo.

13. Authors' Addresses

   Dave Thaler
   Microsoft
   One Microsoft Way
   Redmond, WA 98052
   Phone: +1 425 703 8835
   EMail: thalerd@eecs.umich.edu

   Deborah Estrin
   Computer Science Dept./ISI
   University of Southern California
   Los Angeles, CA 90089
   Email: estrin@usc.edu

   David Meyer
   University of Oregon
   1225 Kincaid St.
   Eugene, OR 97403
   Phone: (541) 346-1747
   EMail: meyer@antc.uoregon.edu

14. References

[MBGP]
     Bates, T., Chandra, R., Katz, D., and Y. Rekhter, "Multiprotocol
     Extensions for BGP-4", RFC 2283, February 1998.

[CBT]
     Ballardie, A., "Core Based Trees (CBT) Multicast Routing", RFC
     2189, September 1997.

[CBTDM]
     Ballardie, A., "Core Based Tree (CBT) Multicast Border Router
     Specification", draft-ietf-idmr-cbt-br-spec-02.txt, April 1998.

[DVMRP]
     Pusateri, T., "Distance Vector Multicast Routing Protocol",
     draft-ietf-idmr-dvmrp-v3-06.txt, April 1998.

[DWR]
     Fenner, W., "Domain-Wide Reports", Work in progress.

[INTEROP]
     Thaler, D., "Interoperability Rules for Multicast Routing
     Protocols", draft-thaler-multicast-interop-03.txt, July 1998.

[IPv6MAA]
     Hinden, R. and S. Deering, "IPv6 Multicast Address Assignments",
     RFC 2375, July 1998.

[ISSUES]
     Meyer, D., "Some Issues for an Inter-domain Multicast Routing
     Protocol", draft-ietf-mboned-imrp-some-issues-02.txt, June 1997.

[MASC]
     Estrin, D., Handley, M., and D. Thaler, "The Multicast-Address-Set
     Claim (MASC) Protocol", draft-ietf-malloc-masc-01.txt, August 1998.

[MOSPF]
     Moy, J., "Multicast Extensions to OSPF", RFC 1584, March 1994.

[PIMDM]
     Deering, et al., "Protocol Independent Multicast Version 2 Dense
     Mode Specification", draft-ietf-pim-v2-dm-00.txt, August 1998.

[PIMSM]
     Estrin, et al., "Protocol Independent Multicast-Sparse Mode
     (PIM-SM): Protocol Specification", RFC 2362, June 1998.

[REFLECT]
     Bates, T., and R. Chandra, "BGP Route Reflection: An alternative to
     full mesh IBGP", RFC 1966, June 1996.

[RFC1700]
     Reynolds, S. J., and J. Postel, "ASSIGNED NUMBERS", STD 1, RFC
     1700, October 1994.

[RFC1771]
     Rekhter, Y., and T. Li, "A Border Gateway Protocol 4 (BGP-4)", RFC
     1771, March 1995.

[RFC2119]
     S. Bradner, "Key words for use in RFCs to Indicate Requirement
     Levels", BCP 14, RFC 2119, March 1997.

15. Full Copyright Statement

Copyright (C) The Internet Society (1998). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it or
assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are included
on all such copies and derivative works.
However, this document itself
may not be modified in any way, such as by removing the copyright notice
or references to the Internet Society or other Internet organizations,
except as needed for the purpose of developing Internet standards in
which case the procedures for copyrights defined in the Internet
Standards process must be followed, or as required to translate it into
languages other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS
IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.

Table of Contents

1 Acknowledgements
2 Purpose
3 Terminology
4 Protocol Overview
4.1 Design Rationale
5 Protocol Details
5.1 Interaction with the EGP
5.2 Multicast Data Packet Processing
5.3 BGMP processing of Join and Prune messages and notifications
5.3.1 Receiving (*,G) Joins
5.3.2 Receiving (S,G) Joins
5.3.3 Receiving (*,G) Prunes
5.3.4 Receiving (S,G) Prunes
5.3.5 Receiving Route Change Notifications
5.4 Interaction with M-IGP components
5.4.1 Interaction with DVMRP and PIM-DM
5.4.2 Interaction with PIM-SM
5.4.3 Interaction with CBTv2
5.4.4 Interaction with MOSPF
6 Interaction with address allocation
6.1 Requirements for BGMP components
7 Transition Strategy
7.1 Preventing transit through the MBone stub
8 Message Formats
8.1 Message Header Format
8.2 OPEN Message Format
8.3 UPDATE Message Format
8.4 Encoding examples
8.5 KEEPALIVE Message Format
8.6 NOTIFICATION Message Format
9 BGMP Error Handling
9.1 Message Header error handling
9.2 OPEN message error handling
9.3 UPDATE message error handling
9.4 NOTIFICATION message error handling
9.5 Hold Timer Expired error handling
9.6 Finite State Machine error handling
9.7 Cease
9.8 Connection collision detection
10 BGMP Version Negotiation
10.1 BGMP Capability Negotiation
11 BGMP Finite State machine
12 Security Considerations
13 Authors' Addresses
14 References
15 Full Copyright Statement