Inter-Domain Multicast Routing (IDMR)                    A. J. Ballardie
INTERNET-DRAFT                                 University College London
                                                                S. Reeve
                                                      Bay Networks, Inc.
                                                                 N. Jain
                                                      Bay Networks, Inc.

                                                             April, 1996


                    Core Based Trees (CBT) Multicast

                      -- Protocol Specification --

                    draft-ietf-idmr-cbt-spec-05.txt

Status of this Memo

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas, and
its Working Groups.
(Note that other groups may also distribute working documents as
Internet Drafts.)

Internet Drafts are draft documents valid for a maximum of six months.
Internet Drafts may be updated, replaced, or obsoleted by other
documents at any time. It is not appropriate to use Internet Drafts as
reference material or to cite them other than as a "working draft" or
"work in progress."

Please check the I-D abstract listing contained in each Internet Draft
directory to learn the current status of this or any other Internet
Draft.

Abstract

This document describes the Core Based Tree (CBT) network layer
multicast protocol. CBT is a next-generation multicast protocol that
makes use of a shared delivery tree rather than the separate per-sender
trees utilized by most other multicast schemes [1, 2, 3].

This specification includes an optimization whereby unencapsulated
(native) IP-style multicasts are forwarded by CBT, resulting in very
good forwarding performance. This mode of operation is called CBT
"native mode". Native mode can only be used in CBT-only domains or
"clouds".

This document is progressing through the IDMR working group of the
IETF. The CBT architecture is described in an accompanying document:
ftp://cs.ucl.ac.uk/darpa/IDMR/draft-ietf-idmr-arch-03.txt. Other
related documents include [4, 5]. For all IDMR-related documents, see
http://www.cs.ucl.ac.uk/ietf/idmr.

1. Changes since Previous Revision (04)

This note summarizes the changes to this document since the previous
revision (revision 04).

   + inclusion of a "group mask" field for aggregated joins/join-acks
     (sections 10.2, 8.1, and Appendix A).

   + removal of the term "Group DR (G-DR)", which was only a "token"
     identity.

   + more complete explanation of the use of CBT's IP protocol and
     UDP port numbers (section 11).

   + more complete explanation of non-member sender case (section 6).

   + the term FIB (forwarding information base) has been replaced
     throughout with the term "forwarding database (db)".

   + editorial changes throughout for extra clarity.

Finally, in keeping with CBT's tradition of simplicity, this revision
is 1 page shorter than the previous revision :-) .

2. Some Terminology

In CBT, the core routers for a particular group are categorised into
PRIMARY CORE, and NON-PRIMARY (secondary) CORES.

The "core tree" is the part of a tree linking all core routers of a
particular group together.

3. Protocol Specification

3.1. Tree Joining Process -- Overview

A CBT router is notified of a local host's desire to join a group via
IGMP [6]. We refer to a CBT router with directly attached hosts as a
"leaf CBT router", or just "leaf" router.

The following CBT control messages come into play subsequent to a
subnet's CBT leaf router receiving an IGMP membership report (also
termed "IGMP join"):

   + JOIN_REQUEST

   + JOIN_ACK

If the CBT leaf router is the subnet's default designated router (see
next section), it generates a CBT join-request in response to
receiving an IGMP group membership report from a directly connected
host. The CBT join is sent to the next-hop on the unicast path to a
target core, specified in the join packet; a router elects a "target
core" based on a static configuration. If, on receipt of an IGMP-join,
the locally-elected DR has already joined the corresponding tree, then
it need do nothing more with respect to joining.

The join is processed by each such hop on the path to the core, until
either the join reaches the target core itself, or hits a router that
is already part of the corresponding distribution tree (as identified
by the group address). In both cases, the router concerned terminates
the join, and responds with a join-ack, which traverses the
reverse-path of the corresponding join.
This is possible due to the transient path state created by a join
traversing a CBT router. The ack fixes that state.

3.2. DR Election

Multiple CBT routers may be connected to a multi-access subnetwork.
In such cases it is necessary to elect a subnetwork designated router
(D-DR) that is responsible for generating and sending CBT joins
upstream, on behalf of the subnetwork.

CBT DR election happens "on the back" of IGMP [6]; on a subnet with
multiple multicast routers, an IGMP "querier" is elected as part of
IGMP. At start-up, a multicast router assumes no other multicast
routers are present on its subnetwork, and so begins by believing it
is the subnet's IGMP querier. It sends a small number of
IGMP-HOST-MEMBERSHIP-QUERYs in short succession in order to quickly
learn about any group memberships on the subnet. If other multicast
routers are present on the same subnet, they will receive these IGMP
queries; a multicast router yields querier duty as soon as it hears an
IGMP query from a lower-addressed router on the same subnetwork.

The CBT default DR (D-DR) is always (footnote 1) the subnet's
IGMP-querier. As a result, there is no protocol overhead whatsoever
associated with electing a CBT D-DR.

3.3. Tree Joining Process -- Details

The receipt of an IGMP group membership report by a CBT D-DR for a
CBT group not previously heard from triggers the tree joining
process; the D-DR unicasts a JOIN-REQUEST to the first hop on the
(unicast) path to the target core specified in the CBT join packet.

Each CBT-capable router traversed on the path between the sending DR
and the core processes the join. However, if a join hits a CBT router
that is already on-tree (footnote 2), the join is not propagated
further, but ACK'd downstream from that point.

JOIN-REQUESTs carry the identity of all the cores associated with the
group.
Assuming there are no on-tree routers in between, once the join
(subcode ACTIVE_JOIN) reaches the target core, if the target core is
not the primary core (as indicated in a separate field of the join
packet) it first acknowledges the received join by means of a
JOIN-ACK, then sends a JOIN-REQUEST, subcode REJOIN-ACTIVE, to the
primary core router.

_________________________

1  This document does not address the case where some routers on a
   multi-access subnet may be running multicast routing protocols
   other than CBT. In such cases, the IGMP querier may be a non-CBT
   router, in which case the CBT DR election breaks. This will be
   discussed in a CBT interoperability document, to appear shortly.

2  "on-tree" refers to whether a router has a forwarding db entry for
   the corresponding group.

If the rejoin-active reaches the primary core, it responds by sending
a JOIN-ACK, subcode PRIMARY-REJOIN-ACK, which traverses the
reverse-path of the join. The primary-rejoin-ack serves to confirm no
loop is present without requiring explicit loop detection.

If some other on-tree router is encountered before the rejoin-active
reaches the primary, that router responds with a JOIN-ACK, subcode
NORMAL. On receipt of the ack, subcode normal, the router that sent
the rejoin-active sends a join, subcode REJOIN-NACTIVE, which acts as
a loop detection packet (see section 8.3). Note that loop detection is
not necessary subsequent to receiving a join-ack with subcode
PRIMARY-REJOIN-ACK.

To facilitate detailed protocol description, we use a sample
topology, illustrated in Figure 1. Member hosts are shown as
individual capital letters, routers are prefixed with R, and subnets
are prefixed with S.

            A                                                             B
            |   S1                                                  S4    |
   -------------------           -----------------------------------------------
             |                           |             |                |
           ------                      ------        ------           ------
           | R1 |                      | R2 |        | R5 |           | R6 |
           ------                      ------        ------           ------
   C        |  |                         |             |                |
   |        |  |                         |     S2      |           S8   |
   ----------  ------------------------------------------          -------------
      S3                      |
                            ------
                            | R3 |
   |                        ------                              D
   |  S9         |            |                           S5    |
   |             |        ---------------------------------------------
   |  |----|     |                    |
   ---| R7 |-----|                  ------
   |  |----|     |------------------| R4 |
   |         S7  |                  ------          F
   |             |                    |      S6     |
   |-E           |       ---------------------------------
   |                                |
   |                              ------
   |---|    |---------------------| R8 |
   |R12 ----|                     ------            G
   |---|    |                       |               |  S10
            |  S14            ----------------------------
            |                           |
        I --|                         ------
            |                         | R9 |
                                      ------
                                        |    S12
            |                 ----------------------------
       S15  |                        |
            |                      ------
            |----------------------|R10 |
       J ---|                      ------           H
            |                        |              |
            |                 ----------------------------
            |                             S13

                   Figure 1. Example Network Topology

Taking the example topology in figure 1, host A is the group
initiator, and has configured core routers R4 (primary core) and R9
(secondary core).

Router R1 receives an IGMP host membership report, and proceeds to
unicast a JOIN-REQUEST, subcode ACTIVE-JOIN, to the next-hop on the
path to R4 (R3), the target core. R3 receives the join, caches the
necessary group information, and forwards it to R4 -- the target of
the join.

R4, being the target of the join, sends a JOIN_ACK (subcode NORMAL)
back out of the receiving interface to the previous-hop sender of the
join, R3. A JOIN-ACK, like a JOIN-REQUEST, is processed hop-by-hop by
each router on the reverse-path of the corresponding join. The
receipt of a join-ack establishes the receiving router on the
corresponding CBT tree, i.e. the router becomes part of a branch on
the delivery tree. Finally, R3 sends a join-ack to R1.
A new CBT branch has been created, attaching subnet S1 to the CBT
delivery tree for the corresponding group.

For the period between any CBT-capable router forwarding (or
originating) a JOIN_REQUEST and receiving a JOIN_ACK, the router is
not permitted to acknowledge any subsequent joins received for the
same group; rather, the router caches such joins until it has itself
received a JOIN_ACK for the original join. Only then can it
acknowledge any cached joins. A router is said to be in a
"pending-join" state if it is awaiting a JOIN_ACK itself.

Note that the presence of asymmetric routes in the underlying unicast
routing does not affect the tree-building process; CBT tree branches
are symmetric by the nature in which they are built. Joins set up
transient state (incoming and outgoing interface state) in all
routers along a path to a particular core. The corresponding join-ack
traverses the reverse-path of the join as dictated by the transient
state, and not the path that underlying routing would dictate. Whilst
permanent asymmetric routes could pose a problem for CBT, transient
asymmetry is detected by the CBT protocol.

3.4. Forwarding Joins on Multi-Access Subnets

The DR election mechanism does not guarantee that the DR will be the
router that actually forwards a join off a multi-access network; the
first hop on the path to a particular core might be via another
router on the same subnetwork, which actually forwards off-subnet.

Although the procedure is very much the same, consider another
example using the topology of figure 1: a host joining a CBT tree
where more than one CBT router exists on the host subnetwork.

B's subnet, S4, has 3 CBT routers attached. Assume also that R6 has
been elected IGMP-querier and CBT D-DR.

R6 (S4's D-DR) receives an IGMP group membership report.
R6's configured information suggests R4 as the target core for this
group. R6 thus generates a join-request for target core R4, subcode
ACTIVE_JOIN. R6's routing table says the next-hop on the path to R4
is R2, which is on the same subnet as R6. This is irrelevant to R6,
which unicasts the join to R2. R2 unicasts it to R3, which happens to
be already on-tree for the specified group (from R1's join). R3
therefore can acknowledge the arrived join and unicast the ack back to
R2. R2 forwards it to R6, the origin of the join-request.

If an IGMP membership report is received by a D-DR with a join for
the same group already pending, or if the D-DR is already on-tree for
the group, it takes no action.

3.5. On-Demand "Core Tree" Building

The "core tree", the part of a CBT tree linking all of its cores
together, is built on-demand. That is, the core tree is only built
subsequent to a non-primary (secondary) core receiving a
join-request. This triggers the secondary core to join the primary
core; the primary need never join anything.

Join-requests carry an ordered list of core routers (and the identity
of the primary core in its own separate field), making it possible
for the secondary cores to know where to join when they themselves
receive a join. Hence, the primary core must be uniquely identified
as such across a whole group. A secondary joins the primary
subsequent to sending an ack for the join just received.

3.6. Tree Teardown

There are two scenarios whereby a tree branch may be torn down:

   + During a re-configuration. If a router's best next-hop to the
     specified core is one of its existing children, then before
     sending the join it must tear down that particular downstream
     branch. It does so by sending a FLUSH_TREE message which is
     processed hop-by-hop down the branch.
     All routers receiving this message must process it and forward
     it to all their children. Routers that have received a flush
     message will re-establish themselves on the delivery tree if
     they have directly connected subnets with group presence.

   + If a CBT router has no children it periodically checks all its
     directly connected subnets for group member presence. If no
     member presence is ascertained on any of its subnets it sends a
     QUIT_REQUEST upstream to remove itself from the tree.

     The receipt of a quit-request triggers the receiving parent
     router to immediately query its forwarding database, and
     establish whether there remains any directly connected group
     membership, or any children, for the said group. If not, the
     router itself sends a quit-request upstream.

The following example, using the example topology of figure 1, shows
how a tree branch is gracefully torn down using a QUIT_REQUEST.

Assume group member B leaves group G on subnet S4. B issues an IGMP
HOST-MEMBERSHIP-LEAVE message (relevant only to IGMPv2 and later
versions), which is multicast to the "all-routers" group (224.0.0.2).
R6, the subnet's D-DR and IGMP-querier, responds with a
group-specific-QUERY. No hosts respond within the required response
interval, so the D-DR assumes group G traffic is no longer wanted on
subnet S4.

Since R6 has no CBT children, and no other directly attached subnets
with group G presence, it immediately follows on by sending a
QUIT_REQUEST to R2, its parent on the tree for group G. R2 responds
with a QUIT-ACK, unicast to R6; R2 removes the corresponding child
information. R2 in turn sends a QUIT upstream to R3 (since it has no
other children or subnet(s) with group presence).

NOTE: immediately subsequent to sending a QUIT-REQUEST, the sender
      removes the corresponding parent information, i.e. it does not
      wait for the receipt of a QUIT-ACK.
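
The quit processing described in this section can be sketched in a
few lines of Python. This is an illustrative model only, not part of
the protocol definition: the GroupEntry and Router classes, their
method names, and the "sent" log (which stands in for packets on the
wire) are all assumptions made for the example. The sketch treats a
router as able to quit once it has neither children nor member
subnets for the group, and it delivers each QUIT_REQUEST by hand
rather than over a network.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class GroupEntry:
    """Simplified per-group forwarding db entry (hypothetical layout)."""
    parent: Optional[str] = None
    children: Set[str] = field(default_factory=set)
    member_subnets: Set[str] = field(default_factory=set)


class Router:
    """Toy CBT router that only models quit handling (an assumption)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.db: Dict[str, GroupEntry] = {}
        # Log of (message, destination, group); stands in for the wire.
        self.sent: List[Tuple[str, str, str]] = []

    def _quit_if_idle(self, group: str) -> None:
        entry = self.db.get(group)
        if entry is None or entry.children or entry.member_subnets:
            return  # still needed on the tree
        if entry.parent is not None:
            self.sent.append(("QUIT_REQUEST", entry.parent, group))
        # Parent information is dropped at once; no wait for a QUIT-ACK.
        del self.db[group]

    def on_membership_gone(self, group: str, subnet: str) -> None:
        """Called when no host on `subnet` answers the group query."""
        entry = self.db.get(group)
        if entry is not None:
            entry.member_subnets.discard(subnet)
            self._quit_if_idle(group)

    def on_quit_request(self, group: str, child: str) -> None:
        """Called when `child` sends this router a QUIT_REQUEST."""
        entry = self.db.get(group)
        if entry is not None:
            entry.children.discard(child)
            self.sent.append(("QUIT_ACK", child, group))
            self._quit_if_idle(group)


# The branch R3-R2-R6 from the example; R1 is R3's other child.
r6, r2, r3 = Router("R6"), Router("R2"), Router("R3")
r6.db["G"] = GroupEntry(parent="R2", member_subnets={"S4"})
r2.db["G"] = GroupEntry(parent="R3", children={"R6"})
r3.db["G"] = GroupEntry(parent="R4", children={"R1", "R2"})

r6.on_membership_gone("G", "S4")   # no IGMP reports heard on S4
r2.on_quit_request("G", "R6")      # delivery of R6's QUIT_REQUEST
r3.on_quit_request("G", "R2")      # delivery of R2's QUIT_REQUEST

print(r6.sent, r2.sent, r3.sent, sorted(r3.db["G"].children))
```

In this model R6 and R2 each leave the tree for group G, while R3
acknowledges R2's quit but stays on-tree because R1 is still its
child.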

R3 responds to the QUIT by unicasting a QUIT-ACK to R2. R3
subsequently checks whether it in turn can send a quit by checking
group G presence on its directly attached subnets, and any group G
children. It has the latter (R1 is its child on the group G tree), and
so R3 cannot itself send a quit. However, the branch R3-R2-R6 has been
removed from the tree.

4. Data Packet Forwarding Rules

4.1. Native Mode

In native mode, when a router receives a data packet, it decrements
the packet's TTL and, provided the TTL remains greater than or equal
to 1, forwards the data packet over all outgoing interfaces that are
part of the corresponding CBT tree.

4.2. CBT Mode

In CBT mode, routers ignore all non-locally originated native mode
multicast data packets. Locally-originated multicast data is only
processed by a subnet's D-DR; in this case, the D-DR forwards the
native multicast data packet, TTL 1, over any outgoing member subnets
for which that router is D-DR. Additionally, the D-DR encapsulates
the locally-originated multicast and forwards it, CBT mode, over all
tree interfaces, as dictated by the CBT forwarding database.

When a router, operating in CBT mode, receives an encapsulated
multicast data packet, it decapsulates one copy to send, native mode
and TTL 1, over any directly attached member subnets for which it is
D-DR. Additionally, an encapsulated copy is forwarded over all
outgoing tree interfaces, as dictated by the CBT forwarding database.

Like the outer encapsulating IP header, the TTL value of the
encapsulating CBT header is decremented each time it is processed by a
CBT router.

An example of CBT mode forwarding is provided towards the end of the
next section.

5. CBT Mode -- Encapsulation Details

In a multi-protocol environment, whose infrastructure may include
non-multicast-capable routers, it is necessary to tunnel data packets
between CBT-capable routers. This is called "CBT mode". Data packets
are decapsulated by CBT routers (such that they become native mode
data packets) before being forwarded over subnets with member hosts.
When multicasting (native mode) to member hosts, the TTL value of the
original IP header is set to one. CBT mode encapsulation is as
follows:

   ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   | encaps IP hdr | CBT hdr | original IP hdr | data ....|
   ++++++++++++++++++++++++++++++++++++++++++++++++++++++++

                 Figure 2. Encapsulation for CBT mode

The TTL value of the CBT header is set by the encapsulating CBT
router directly attached to the origin of a data packet. This value
is decremented each time it is processed by a CBT router. An
encapsulated data packet is discarded when the CBT header TTL value
reaches zero.

The purpose of the (outer) encapsulating IP header is to "tunnel"
data packets between CBT-capable routers (or "islands"). The outer IP
header's TTL value is set to the "length" of the corresponding
tunnel, or MAX_TTL (255) if this is not known, or subject to change.

It is worth pointing out here the distinction between subnetworks and
tree branches (especially apparent in CBT mode), although they can be
one and the same. For example, a multi-access subnetwork containing
routers and end-systems could potentially be both a CBT tree branch
and a subnetwork with group member presence. A tree branch which is
not simultaneously a subnetwork is either a "tunnel" or a
point-to-point link.

In CBT mode there are three forwarding methods used by CBT routers:

   + IP multicasting.
     This method sends an unaltered (unencapsulated) data packet
     across a directly-connected subnetwork with group member
     presence. Any host originating multicast data does so in this
     form.

   + CBT unicasting. This method is used for sending data packets
     encapsulated (as illustrated above) across a tunnel or
     point-to-point link. En/de-capsulation takes place in CBT
     routers.

   + CBT multicasting. Routers on multi-access links use this method
     to send data packets encapsulated (as illustrated above) but the
     outer encapsulating IP header contains a multicast address. This
     method is used when a parent or multiple children are reachable
     over a single physical interface, as could be the case on a
     multi-access Ethernet. The IP module of end-systems subscribed
     to the same group will discard these multicasts since the CBT
     payload type (protocol id) of the outer IP header is not
     recognizable by hosts.

CBT routers create forwarding database (db) entries whenever they
send or receive a JOIN_ACK. The forwarding database describes the
parent-child relationships on a per-group basis. A forwarding
database entry dictates over which tree interfaces, and how (unicast
or multicast) a data packet is to be sent. A forwarding db entry is
shown below:

    32-bits          4            4           4              8
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  group-id   | parent addr | parent vif | No. of  |                    |
|             |    index    |   index    |children |      children      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                                   |chld addr |chld vif |
                                                   |  index   |  index  |
                                                   |+-+-+-+-+-+-+-+-+-+-+
                                                   |chld addr |chld vif |
                                                   |  index   |  index  |
                                                   |+-+-+-+-+-+-+-+-+-+-+
                                                   |chld addr |chld vif |
                                                   |  index   |  index  |
                                                   |+-+-+-+-+-+-+-+-+-+-+
                                                   |                    |
                                                   |        etc.        |
                                                   |+-+-+-+-+-+-+-+-+-+-|

                Figure 3. CBT forwarding database entry

Note that a CBT forwarding db is required for both CBT-mode and
native-mode multicasting.

The field lengths shown above assume a maximum of 16 directly
connected neighbouring routers.

Using our example topology in figure 1, let's assume the CBT routers
are operating in CBT mode.

Member G originates an IP multicast (native mode) packet. R8 is the
DR for subnet S10. R8 therefore sends a (native mode) copy over any
member subnets for which it is DR - S14 and S10 (the copy over S10 is
not sent, since the packet was originally received from S10). The
multicast packet is CBT mode encapsulated by R8, and unicast to each
of its children, R9 and R12; these children are not reachable over
the same interface, otherwise R8 could have sent a CBT mode
multicast. R9, the DR for S12, need not IP multicast (native mode)
onto S12 since there are no members present there. R9, in CBT mode,
unicasts the packet to R10, which is the DR for S13 and S15. R10
decapsulates the CBT mode packet and IP multicasts (native mode) to
each of S13 and S15.

Going upstream from R8, R8 CBT mode unicasts to R4. R4 is DR for all
its directly connected subnets and therefore IP multicasts (native
mode) the data packet onto S5, S6 and S7, all of which have member
presence. R4 unicasts, CBT mode, the packet to all outgoing children,
R3 and R7 (NOTE: R4 does not have a parent since it is the primary
core router for the group). R7 IP multicasts (native mode) onto S9.
R3 CBT mode unicasts to R1 and R2, its children. Finally, R1 IP
multicasts (native mode) onto S1 and S3, and R2 IP multicasts (native
mode) onto S4.

6. Non-Member Sending

For a multicast data packet to span beyond the scope of the
originating subnetwork at least one CBT-capable router must be
present on that subnetwork.
The default DR (D-DR) for the group on the subnetwork must
encapsulate the (native) IP-style packet and unicast it to a core for
the group. The encapsulation required is shown in figure 2; CBT mode
encapsulation is necessary so the receiving CBT router can
demultiplex the packet accordingly.

If the encapsulated packet hits the tree at a non-core router, the
packet is forwarded according to the forwarding rules of section 4.2.

If the first on-tree router encountered is the target core, various
scenarios define what happens next:

   + if the target core is not the primary, and the target core has
     not yet joined the tree (because it has not yet itself received
     any join-requests), the target core simply forwards the
     encapsulated packet to the primary core.

   + if the target core is not the primary, but has children, the
     target core forwards the data according to the rules of section
     4.2.

   + if the target core is the primary, the primary forwards the data
     according to the rules of section 4.2.

7. Eliminating the Topology-Discovery Protocol in the Presence of
   Tunnels

Traditionally, multicast protocols operating within a virtual
topology, i.e. an overlay of the physical topology, have required the
assistance of a multicast topology discovery protocol, such as that
present in DVMRP [1]. However, it is possible to have a multicast
protocol operate within a virtual topology without the need for a
multicast topology discovery protocol. One way to achieve this is by
having a router configure all its tunnels to its virtual neighbours
in advance. A tunnel is identified by a local interface address and a
remote interface address.
Routing is replaced by "ranking" each such 556 tunnel interface associated with a particular core address; if the 557 highest-ranked route is unavailable (tunnel end-points are required 558 to run an Hello-like protocol between themselves) then the next- 559 highest ranked available route is selected, and so on. The exact 560 specification of the Hello protocol is outside the scope of this 561 document. 563 CBT trees are built using the same join/join-ack mechanisms as 564 before, only now some branches of a delivery tree run in native mode, 565 whilst others (tunnels) run in CBT mode. Underlying unicast routing 566 dictates which interface a packet should be forwarded over. Each 567 interface is configured as either native mode or CBT mode, so a 568 packet can be encapsulated (decapsulated) accordingly. 570 As an example, router R's configuration would be as follows: 572 intf type mode remote addr 573 ----------------------------------- 574 #1 phys native - 575 #2 tunnel cbt 128.16.8.117 576 #3 phys native - 577 #4 tunnel cbt 128.16.6.8 578 #5 tunnel cbt 128.96.41.1 580 core backup-intfs 581 -------------------- 582 A #5, #2 583 B #3, #5 584 C #2, #4 586 The CBT forwarding database needs to be slightly modified to accommo- 587 date an extra field, "backup-intfs" (backup interfaces). The entry in 588 this field specifies a backup interface whenever a tunnel interface 589 specified in the forwarding db is down. Additional backups (should 590 the first-listed backup be down) are specified for each core in the 591 core backup table. For example, if interface (tunnel) #2 were down, 592 and the target core of a CBT control packet were core A, the core 593 backup table suggests using interface #5 as a replacement. If inter- 594 face #5 happened to be down also, then the same table recommends 595 interface #2 as a backup for core A. 597 8. Tree Maintenance 599 Once a tree branch has been created, i.e. 
   a CBT router has received a JOIN-ACK for a JOIN-REQUEST previously
   sent (or forwarded), a child router is required to monitor the
   status of its parent, and of the link to its parent, at fixed
   intervals by means of a "keepalive" mechanism operating between
   them. The "keepalive" mechanism is implemented by means of two CBT
   control messages: CBT-ECHO-REQUEST and CBT-ECHO-REPLY. Adjacent CBT
   routers only need to send one keepalive per link, regardless of how
   many groups are present on that link. This aggregation strategy is
   expected to conserve considerable bandwidth on "busy" links, such
   as transit network or backbone network links.

   The keepalive protocol is simple, as follows: a child unicasts a
   CBT-ECHO-REQUEST to its parent, which unicasts a CBT-ECHO-REPLY in
   response.

   For any CBT router, if its parent router, or path to the parent,
   fails, the child is initially responsible for re-attaching itself,
   and therefore all routers subordinate to it on the same branch, to
   the tree.

   CBT echo requests and replies can be aggregated and sent on a
   per-link basis, rather than individually for each group; the CBT
   control packet header (section 10.2) accommodates such aggregation.

8.1. Router Failure

   An on-tree router can detect a failure in either of the following
   two cases:

   +  if the child responsible for sending keepalives across a
      particular link stops receiving CBT-ECHO-REPLY messages. In this
      case the child realises that its parent has become unreachable
      and must therefore try to re-connect to the tree for all groups
      represented on the parent/child link. For all groups sharing a
      common core set (corelist), provided those groups can be
      specified as a CIDR-like aggregate, an aggregated join can be
      sent representing a range of groups. Aggregated joins are made
      possible by the presence of a "group mask" field in the CBT
      control packet header.
      Aggregated joins are also discussed in Appendix A.

      If a range of groups cannot be represented by a mask, then each
      group must be re-joined individually.

      CBT's re-join strategy is as follows: the rejoining router which
      is immediately subordinate to the failure sends a JOIN-REQUEST
      (subcode ACTIVE-JOIN if it has no children attached, and subcode
      REJOIN-ACTIVE if at least one child is attached) to the best
      next-hop router on the path to the elected core. If no JOIN-ACK
      is received after three retransmissions, each transmission being
      at PEND-JOIN-INTERVAL (10 secs), the next-highest priority core
      is elected from the core list, and the process is repeated. If
      all cores have been tried unsuccessfully, the D-DR has no option
      but to give up.

   +  if a parent stops receiving CBT-ECHO-REQUESTs from a child. In
      this case, if the parent has not received an expected keepalive
      after CHILD-ASSERT-EXPIRE-TIME, all children reachable across
      that link are removed from the parent's forwarding database.

8.2. Router Re-Starts

   There are two cases to consider here:

   +  Core re-start. All JOIN-REQUESTs (all types) carry the
      identities (i.e. IP addresses) of each of the cores for a group.
      If a router is a core for a group, but has only recently
      re-started, it will not be aware that it is a core for any
      group(s). In such circumstances, a core only becomes aware that
      it is such by receiving a JOIN-REQUEST. Subsequent to a core
      learning its status in this way, if it is not the primary core
      it acknowledges the received join, then sends a JOIN-REQUEST
      (subcode REJOIN-ACTIVE) to the primary core. If the re-started
      router is the primary core, it need take no action, i.e. in all
      circumstances, the primary core simply waits to be joined by
      other routers.

   +  Non-core re-start.
      In this case, the router can only join the tree again if a
      downstream router sends a JOIN-REQUEST through it, or it is
      elected DR for one of its directly attached subnets, and
      subsequently receives an IGMP membership report.

8.3. Route Loops

   Routing loops are only a concern when a router with at least one
   child is attempting to re-join a CBT tree. In this case the
   re-joining router sends a JOIN-REQUEST (subcode REJOIN-ACTIVE) to
   the best next-hop on the path to an elected core. This join is
   forwarded as normal until it reaches either the specified core,
   another core, or a non-core router that is already part of the
   tree. If the rejoin reaches the primary core, loop detection is not
   necessary because the primary never has a parent. The primary core
   acks an active-rejoin by means of a JOIN-ACK, subcode
   PRIMARY-REJOIN-ACK. This ack must be processed by each router on
   the reverse-path of the active-rejoin; this ack creates tree state,
   just like a normal join-ack.

   If an active-rejoin is terminated by any router on the tree other
   than the primary core, loop detection must take place, as we now
   describe.

   If, in response to an active-rejoin, a JOIN-ACK is returned,
   subcode NORMAL (as opposed to an ack with subcode
   PRIMARY-REJOIN-ACK), the router receiving the ack subsequently
   generates a JOIN-REQUEST, subcode REJOIN-NACTIVE (non-active
   rejoin). This packet serves only to detect loops; it does not
   create any transient state in the routers it traverses, other than
   the originating router. Any on-tree router receiving a non-active
   rejoin is required to forward it over its parent interface for the
   specified group.
   In this way, it will either reach the primary core, which returns,
   directly to the sender, a join ack with subcode PRIMARY-NACTIVE-ACK
   (so the sender knows no loop is present), or the sender receives
   the non-active rejoin it sent, via one of its child interfaces, in
   which case the rejoin obviously formed a loop.

   If a loop is present, the non-active join originator immediately
   sends a QUIT-REQUEST to its newly-established parent and the loop
   is broken.

   Using figure 4 to demonstrate this, suppose R3 is attempting to
   re-join the tree (R1 is the core in figure 4). If R3 believes its
   best next-hop to R1 is R6, R6 believes R5 is its best next-hop to
   R1, and R5 sees R4 as its best next-hop to R1, then a loop is
   formed. R3 begins by sending a JOIN-REQUEST (subcode REJOIN-ACTIVE,
   since R4 is its child) to R6. R6 forwards the join to R5. R5 is
   on-tree for the group, so it responds to the active-rejoin with a
   JOIN-ACK, subcode NORMAL (the ack traverses R6 on its way to R3).

   R3 now generates a JOIN-REQUEST, subcode REJOIN-NACTIVE, and
   forwards this to its parent, R6. R6 forwards the non-active rejoin
   to R5, its parent. R5 does similarly, as does R4. Now, the
   non-active rejoin has reached R3, which originated it, so R3
   concludes a loop is present on the parent interface for the
   specified group. It immediately sends a QUIT-REQUEST to R6, which
   in turn sends a quit to R5, unless R6 has already received an ack
   from R5 and itself has a child or subnets with member presence. In
   that case R6 does not send a quit; the loop has already been broken
   by R3 sending the first quit.

   QUIT-REQUESTs are typically acknowledged by means of a QUIT-ACK. A
   child removes its parent information immediately subsequent to
   sending its first QUIT-REQUEST. The ack here serves to notify the
   (old) child that it (the parent) has in fact removed its child
   information.
   However, there might be cases where, due to failure, the parent
   cannot respond. The child sends a QUIT-REQUEST a maximum of three
   times, at PEND-QUIT-INTERVAL (10 sec) intervals.

                    ------
                    | R1 |
                    ------
                      |
         ---------------------------
                      |
                    ------
                    | R2 |
                    ------
                      |
         ---------------------------
              |                             |
            ------                          |
            | R3 |--------------------------|
            ------                          |
              |                             |
         ---------------------------        |
              |                             |       ------
            ------                          |       |    |
            | R4 |                          |-------| R6 |
            ------                          |       |----|
              |                             |
         ---------------------------        |
              |                             |
            ------                          |
            | R5 |--------------------------|
            ------                          |
                                            |

                 Figure 4: Example Loop Topology

   In another scenario the rejoin travels over a loop-free path, and
   the first on-tree router encountered is the primary core, R1. In
   figure 4, R3 sends a join, subcode REJOIN-ACTIVE, to R2, the
   next-hop on the path to core R1. R2 forwards the re-join to R1, the
   primary core, which returns a JOIN-ACK, subcode PRIMARY-REJOIN-ACK,
   over the reverse-path of the rejoin-active. Whenever a router
   receives a PRIMARY-REJOIN-ACK no loop detection is necessary.

   If we assume R2 is on-tree for the corresponding group, R3 sends a
   join, subcode REJOIN-ACTIVE, to R2, which replies with a join ack,
   subcode NORMAL. R3 must then generate a loop detection packet (join
   request, subcode REJOIN-NACTIVE) which is forwarded to its parent,
   R2, which does similarly. On receipt of the rejoin-nactive, the
   primary core unicasts a join ack back directly to R3, with subcode
   PRIMARY-NACTIVE-ACK. This confirms to R3 that its rejoin does not
   form a loop.

9. Data Packet Loops

   The CBT protocol builds a loop-free distribution tree. If all
   routers that comprise a particular tree function correctly, data
   packets should never traverse a tree branch more than once.

   CBT mode data packets from a non-member sender must arrive on a
   tree via an "off-tree" interface. The CBT mode data packet's header
   includes an "on-tree" field, which contains the value 0x00 until
   the data packet reaches an on-tree router. The first on-tree router
   must convert this value to 0xff. This value remains unchanged, and
   from here on the packet should traverse only on-tree interfaces. If
   an encapsulated packet happens to "wander" off-tree and back on
   again, an on-tree router will receive the CBT encapsulated packet
   via an off-tree interface. However, this router will recognise that
   the "on-tree" field of the encapsulating CBT header is set to 0xff,
   and so immediately discards the packet.

10. CBT Packet Formats and Message Types

   We distinguish between two types of CBT packet: CBT mode data
   packets, and CBT control packets. CBT control packets carry a CBT
   control packet header.

   For "conventional router" implementations, it is recommended that
   CBT control packets be encapsulated in IP, as illustrated below:

      +++++++++++++++++++++++++++++++
      | IP header | CBT control pkt |
      +++++++++++++++++++++++++++++++

   In CBT mode, the original data packet is encapsulated in a CBT
   header and an IP header, as illustrated below:

      ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
      | IP header | CBT header | original IP hdr | data .... |
      ++++++++++++++++++++++++++++++++++++++++++++++++++++++++

   The IP protocol field of the IP header is used to demultiplex a
   packet correctly; CBT has been assigned IP protocol number 7. The
   CBT module then demultiplexes based on the encapsulating CBT
   header's "type" field, thereby distinguishing between CBT control
   packets and CBT mode data packets (the first 16 bits of both the
   CBT control and CBT data packet headers are identical).
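
   As an illustration only, the two checks described above (the
   demultiplexing on the IP protocol number and CBT "type" field, and
   the "on-tree" rule of section 9) might be sketched in Python as
   follows. The function names, the Verdict enumeration and the
   assumption that the "type" field occupies the second octet of the
   CBT header are ours and are not part of this specification; the
   values 7, 0x00 and 0xff are those given in this document. The case
   of an "on-tree" value of 0x00 arriving over an on-tree interface is
   not described in the text, and the sketch simply marks and forwards
   such a packet.

```python
from enum import Enum
from typing import Tuple

IPPROTO_CBT = 7           # IP protocol number stated in this document
CBT_TYPE_CONTROL = 0x00   # "type" value for CBT control packets
CBT_TYPE_DATA = 0xFF      # "type" value for CBT mode data packets
ON_TREE = 0xFF            # "on-tree" field value once the packet is on-tree
OFF_TREE = 0x00           # "on-tree" field value before reaching the tree


class Verdict(Enum):      # hypothetical names, for illustration only
    NOT_CBT = "not-cbt"
    CONTROL = "control"
    DATA = "data"
    DISCARD = "discard"


def demultiplex(ip_protocol: int, cbt_bytes: bytes) -> Verdict:
    """Classify by IP protocol number, then by the CBT "type" octet."""
    if ip_protocol != IPPROTO_CBT:
        return Verdict.NOT_CBT
    if len(cbt_bytes) < 2:
        return Verdict.DISCARD
    # Assumed layout: octet 0 holds vers (4 bits) and unused (4 bits),
    # octet 1 holds "type"; both header formats share these 16 bits.
    cbt_type = cbt_bytes[1]
    if cbt_type == CBT_TYPE_CONTROL:
        return Verdict.CONTROL
    if cbt_type == CBT_TYPE_DATA:
        return Verdict.DATA
    return Verdict.DISCARD


def on_tree_check(on_tree_field: int,
                  arrived_on_tree_interface: bool) -> Tuple[bool, int]:
    """Apply the section 9 rule at an on-tree router.

    Returns (forward, new_on_tree_field).
    """
    if on_tree_field == ON_TREE and not arrived_on_tree_interface:
        # The packet wandered off-tree and back on again: discard it.
        return (False, on_tree_field)
    # The first on-tree router sets the field to 0xff; after that the
    # value stays unchanged. (0x00 arriving over an on-tree interface
    # is not covered by the text; this sketch marks it as well.)
    return (True, ON_TREE)
```

   For example, on_tree_check(OFF_TREE, False) models the first
   on-tree router marking a packet, and on_tree_check(ON_TREE, False)
   models the discard of a packet that has wandered off-tree and back.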

   Some implementations of CBT encapsulate CBT control packets in UDP
   (for example, the workstation router version). In these
   implementations, the encapsulation of CBT control packets is as
   follows:

      ++++++++++++++++++++++++++++++++++++++++++++
      | IP header | UDP header | CBT control pkt |
      ++++++++++++++++++++++++++++++++++++++++++++

   CBT has been assigned UDP port number 7777 for this purpose.

   It is recommended for performance reasons that conventional router
   implementations implement the IP encapsulation for control packets,
   not the UDP encapsulation.

   The CBT data packet header is illustrated below:

10.1. CBT Header Format (for CBT Mode data)

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | vers  |unused |     type      |  hdr length   | on-tree|unused|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           checksum            |    IP TTL     |    unused     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       group identifier                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   reserved    |   reserved    |     Type      |    Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    .....Flow-id value.....                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    unused     |    unused     |     Type      |    Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                .....Security Information.....                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                        Figure 5. CBT Header

   Each of the fields is described below:

      +  Vers: Version number -- this release specifies version 1.

      +  type: indicates CBT payload; values are defined for control
         (0x00), and data (0xff). For the value 0x00 (control), a CBT
         control header is assumed present rather than a CBT header.

      +  hdr length: length of the header, for purpose of checksum
         calculation.

      +  on-tree: indicates whether the packet is on-tree (0xff) or
         off-tree (0x00).

      +  checksum: the 16-bit one's complement of the one's complement
         sum of the CBT header, calculated across all fields.

      +  IP TTL: TTL value gleaned from the IP header where the packet
         originated.

      +  group identifier: multicast group address.

      +  The TLV fields at the end of the header are for a
         flow-identifier, and/or security options, if and when
         implemented. A "type" value of zero implies a "length" of
         zero, implying there is no "value" field.

10.2. Control Packet Header Format

   The control packet header is illustrated in figure 6; the
   individual fields are described after the figure.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | vers  |unused |     type      |     code      |    # cores    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          hdr length           |           checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       group identifier                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          group mask                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         packet origin                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     primary core address                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 target core address (core #1)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Core #2                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Core #3                            |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   reserved    |   reserved    |     Type      |    Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    .....Flow-id value.....                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    unused     |    unused     |     Type      |    Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    .....Security data.....                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 6. CBT Control Packet Header

      +  Vers: Version number -- this release specifies version 1.

      +  type: indicates control message type (see section 10.3).

      +  code: indicates subcode of control message type.

      +  # cores: number of core addresses carried by this control
         packet.

      +  hdr length: length of the header, for purpose of checksum
         calculation.

      +  checksum: the 16-bit one's complement of the one's complement
         sum of the CBT control header, calculated across all fields.

      +  group identifier: multicast group address.

      +  group mask: mask value for aggregated CBT joins/join-acks.
         Zero for non-aggregated joins/join-acks.

      +  packet origin: address of the CBT router that originated the
         control packet.

      +  primary core address: the address of the primary core for the
         group.

      +  target core address: desired core affiliation of control
         message.

      +  Core #1, #2, #3 etc.: IP address for each of a group's cores.

      +  The TLV fields at the end of the header are for a
         flow-identifier, and/or security options, if implemented. A
         "type" value of zero implies a "length" of zero, implying
         there is no "value" field.

10.3. CBT Control Message Types

   There are ten types of CBT message. All are encoded in the CBT
   control header, shown in figure 6.

      +  JOIN-REQUEST (type 1): generated by a router and unicast to
         the specified core address. It is processed hop-by-hop on its
         way to the specified core. Its purpose is to establish the
         originating CBT router, and all intermediate CBT routers, as
         part of the corresponding delivery tree. Note that all cores
         are carried in join-requests.

      +  JOIN-ACK (type 2): an acknowledgement to the above. The full
         list of core addresses is carried in a JOIN-ACK, together
         with the actual core affiliation (the join may have been
         terminated by an on-tree router on its journey to the
         specified core, and the terminating router may or may not be
         affiliated to the core specified in the original join). A
         JOIN-ACK traverses the reverse of the path taken by the
         corresponding JOIN-REQUEST, with each CBT router on the path
         processing the ack. It is the receipt of a JOIN-ACK that
         actually "fixes" tree state.

      +  JOIN-NACK (type 3): a negative acknowledgement, indicating
         that the tree join process has not been successful.

      +  QUIT-REQUEST (type 4): a request, sent from a child to a
         parent, to be removed as a child of that parent.

      +  QUIT-ACK (type 5): acknowledgement to the above. If the
         parent, or the path to it, is down, no acknowledgement will
         be received within the timeout period. This results in the
         child nevertheless removing its parent information.

      +  FLUSH-TREE (type 6): a message sent from parent to all
         children, which traverses a complete branch. This message
         results in all tree interface information being removed from
         each router on the branch, possibly because of a
         re-configuration scenario.

      +  CBT-ECHO-REQUEST (type 7): once a tree branch is established,
         this message acts as a "keepalive", and is unicast from child
         to parent (can be aggregated from one per group to one per
         link).

      +  CBT-ECHO-REPLY (type 8): positive reply to the above.

      +  CBT-BR-KEEPALIVE (type 9): applicable to border routers only,
         when attaching a CBT domain to some other domain. See [11]
         for more information.

      +  CBT-BR-KEEPALIVE-ACK (type 10): acknowledgement to the above.

10.3.1. CBT Control Message Subcodes

   The JOIN-REQUEST has three valid subcodes:

      +  ACTIVE-JOIN (code 0) - sent from a CBT router that has no
         children for the specified group.

      +  REJOIN-ACTIVE (code 1) - sent from a CBT router that has at
         least one child for the specified group.

      +  REJOIN-NACTIVE (code 2) - generated by a router subsequent to
         receiving a join ack, subcode NORMAL, in response to an
         active-rejoin.

   A JOIN-ACK has three valid subcodes:

      +  NORMAL (code 0) - sent by a core router, or on-tree non-core
         router, acknowledging joins with subcodes ACTIVE-JOIN and
         REJOIN-ACTIVE.

      +  PRIMARY-REJOIN-ACK (code 1) - sent by a primary core to
         acknowledge the receipt of a join-request received with
         subcode REJOIN-ACTIVE. This message traverses the
         reverse-path of the corresponding re-join, and is processed
         by each router on that path.

      +  PRIMARY-NACTIVE-ACK (code 2) - sent by a primary core to
         acknowledge the receipt of a join-request received with
         subcode REJOIN-NACTIVE. This ack is unicast directly to the
         router that generated the rejoin-nactive, i.e. the ack is not
         processed hop-by-hop.

11. CBT Protocol and Port Numbers

   CBT has been assigned IP protocol number 7, and UDP port number
   7777. The UDP port number is only required for certain CBT
   implementations, as described at the beginning of section 10.

12. Default Timer Values

   There are several CBT control messages which are transmitted at
   fixed intervals. These intervals, together with retransmission
   times and timeout values, are given below. Note these are
   recommended default values only, and are configurable with each
   implementation (all times are in seconds):

   +  CBT-ECHO-INTERVAL 30 (time between sending successive
      CBT-ECHO-REQUESTs to parent)

   +  PEND-JOIN-INTERVAL 10 (retransmission time for join-request if
      no ack received)

   +  PEND-JOIN-TIMEOUT 30 (time to try joining a different core, or
      give up)

   +  EXPIRE-PENDING-JOIN 90 (remove transient state for join that has
      not been acked)

   +  PEND-QUIT-INTERVAL 10 (retransmission time for quit-request if
      no ack received)

   +  CBT-ECHO-TIMEOUT 90 (time to consider parent unreachable)

   +  CHILD-ASSERT-INTERVAL 90 (increment child timeout if no ECHO
      received from a child)

   +  CHILD-ASSERT-EXPIRE-TIME 180 (time to consider child gone)

   +  IFF-SCAN-INTERVAL 300 (scan all interfaces for group presence.
      If none, send QUIT)

   +  BR-KEEPALIVE-INTERVAL 200 (backup designated BR to designated BR
      keepalive interval)

   +  BR-KEEPALIVE-RETRY-INTERVAL 30 (keepalive interval if BR fails
      to respond)

13. Interoperability Issues

   Interoperability between CBT and DVMRP has recently been defined in
   ftp://cs.ucl.ac.uk/darpa/IDMR/draft-ietf-idmr-cbt-dvmrp-00.txt.

   Interoperability with other multicast protocols will be fully
   specified shortly.

14. CBT Security Architecture

   See [4].

Acknowledgements

   Special thanks go to Paul Francis, NTT Japan, for the original
   brainstorming sessions that brought about this work.

   Thanks too to Sue Thompson (Bellcore). Her detailed reviews led to
   the identification of some subtle protocol flaws, and she suggested
   several simplifications.

   Thanks also to the networking team at Bay Networks for their
   comments and suggestions, in particular Steve Ostrowski for his
   suggestion of using "native mode" as a router optimization, and
   Eric Crawley.

   Thanks also to Ken Carlberg (SAIC) for reviewing the text, and
   generally providing constructive comments throughout.

   I would also like to thank the participants of the IETF IDMR
   working group meetings for their general constructive comments and
   suggestions since the inception of CBT.

APPENDIX A

   There are situations where it is advantageous to send a single
   join-request that represents potentially many groups. One such
   example is provided in [11], whereby a designated border router is
   required to join all groups inside a CBT domain.

   Such aggregated joining is only possible if each of the groups the
   join represents shares a common corelist. Furthermore, aggregation
   is only efficient over contiguous ranges of group addresses; the
   "group mask" field in the CBT control packet header is used to
   specify a CIDR-like group address mask.

Authors' Addresses:

   Tony Ballardie,
   Department of Computer Science,
   University College London,
   Gower Street,
   London, WC1E 6BT,
   ENGLAND, U.K.

   Tel: ++44 (0)71 419 3462
   e-mail: A.Ballardie@cs.ucl.ac.uk

   Scott Reeve,
   Bay Networks, Inc.
   3, Federal Street,
   Billerica, MA 01821,
   USA.

   Tel: ++1 508 670 8888
   e-mail: sreeve@BayNetworks.com

   Nitin Jain,
   Bay Networks, Inc.
   3, Federal Street,
   Billerica, MA 01821,
   USA.

   Tel: ++1 508 670 8888
   e-mail: njain@BayNetworks.com

References

   [1] DVMRP. Described in "Multicast Routing in a Datagram
   Internetwork", S. Deering, PhD Thesis, 1990. Available via
   anonymous ftp from: gregorio.stanford.edu:vmtp/sd-thesis.ps. NOTE:
   DVMRP version 3 is specified as a working draft.

   [2] J. Moy. Multicast Routing Extensions to OSPF. Communications of
   the ACM, 37(8): 61-66, August 1994.

   [3] D. Farinacci, S. Deering, D. Estrin, and V. Jacobson. Protocol
   Independent Multicast (PIM) Dense-Mode Specification
   (draft-ietf-idmr-pim-spec-01.ps). Working draft, 1994.

   [4] A. J. Ballardie.
   Scalable Multicast Key Distribution; RFC XXXX, SRI Network
   Information Center, 1996.

   [5] A. J. Ballardie. "A New Approach to Multicast Communication in
   a Datagram Internetwork", PhD Thesis, 1995. Available via anonymous
   ftp from: cs.ucl.ac.uk:darpa/IDMR/ballardie-thesis.ps.Z.

   [6] W. Fenner. Internet Group Management Protocol, version 2
   (IGMPv2), (draft-idmr-igmp-v2-01.txt).

   [7] B. Cain, S. Deering, A. Thyagarajan. Internet Group Management
   Protocol Version 3 (IGMPv3) (draft-cain-igmp-00.txt).

   [8] M. Handley, J. Crowcroft, I. Wakeman. Hierarchical Rendezvous
   Point proposal, work in progress.
   (http://www.cs.ucl.ac.uk/staff/M.Handley/hpim.ps) and
   (ftp://cs.ucl.ac.uk/darpa/IDMR/IETF-DEC95/hpim-slides.ps).

   [9] D. Estrin et al. USC/ISI, Work in progress.
   (http://netweb.usc.edu/pim/).

   [10] D. Estrin et al. PIM Sparse Mode Specification.
   (draft-ietf-idmr-pim-sparse-spec-00.txt).

   [11] A. Ballardie. CBT Multicast Interoperability - Stage 1;
   Working draft, April 1996. Also available from:
   ftp://cs.ucl.ac.uk/darpa/IDMR/draft-ietf-idmr-cbt-dvmrp-00.txt