idnits 2.17.1

draft-ietf-malloc-masc-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  == There are 4 instances of lines with multicast IPv4 addresses in the
     document. If these are generic example addresses, they should be changed
     to use the 233.252.0.x range defined in RFC 5771

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 983 has weird spacing: '...is less than ...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'HOLDTIME' is mentioned on line 1330, but not defined

  == Unused Reference: 'API' is defined on line 2340, but no explicit
     reference was found in the text

  -- Possible downref: Normative reference to a draft: ref. 'AAP'

  ** Downref: Normative reference to an Informational RFC: RFC 2771 (ref.
     'API')

  == Outdated reference: A later version (-06) exists of
     draft-ietf-bgmp-spec-01

  ** Downref: Normative reference to an Historic draft: draft-ietf-bgmp-spec
     (ref. 'BGMP')

  ** Obsolete normative reference: RFC 1771 (ref. 'BGP') (Obsoleted by RFC
     4271)

  ** Downref: Normative reference to an Historic RFC: RFC 1520 (ref. 'CIDR')

  ** Obsolete normative reference: RFC 1700 (ref. 'IANA') (Obsoleted by RFC
     3232)

  ** Obsolete normative reference: RFC 2434 (ref. 'IANA-CONSIDERATIONS')
     (Obsoleted by RFC 5226)

  ** Obsolete normative reference: RFC 2401 (ref. 'IPSEC') (Obsoleted by RFC
     4301)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'KAMPAI'

  ** Downref: Normative reference to an Historic draft:
     draft-ietf-malloc-arch (ref. 'MALLOC')

  ** Obsolete normative reference: RFC 2283 (ref. 'MBGP') (Obsoleted by RFC
     2858)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-idmr-traceroute-ipm-06

  -- Possible downref: Normative reference to a draft: ref. 'MTRACE'

  ** Downref: Normative reference to an Historic RFC: RFC 2776 (ref. 'MZAP')

  ** Obsolete normative reference: RFC 2373 (Obsoleted by RFC 3513)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)


     Summary: 16 errors (**), 0 flaws (~~), 8 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

2     MALLOC Working Group                            Deborah Estrin (USC/ISI)
3     Internet Engineering Task Force                    Ramesh Govindan (ISI)
4     INTERNET-DRAFT                                      Mark Handley (ACIRI)
5     July, 2000                                        Satish Kumar (USC/ISI)
6     Expires January 2001                         Pavlin Radoslavov (USC/ISI)
7                                                      Dave Thaler (Microsoft)

9                 The Multicast Address-Set Claim (MASC) Protocol
10

12    Status of this Memo

14       This document is an Internet-Draft and is in full conformance with
15       all provisions of Section 10 of RFC2026.

17       Internet Drafts are working documents of the Internet Engineering
18       Task Force (IETF), its Areas, and its Working Groups. Note that
19       other groups may also distribute working documents as Internet
20       Drafts.

22       Internet Drafts are valid for a maximum of six months and may be
23       updated, replaced, or obsoleted by other documents at any time. It
24       is inappropriate to use Internet Drafts as reference material or to
25       cite them other than as a "work in progress".

27       The list of current Internet-Drafts can be accessed at
28       http://www.ietf.org/ietf/1id-abstracts.txt

30       The list of Internet-Draft Shadow Directories can be accessed at
31       http://www.ietf.org/shadow.html.

33    Copyright Notice

35       Copyright (C) The Internet Society (2000). All Rights Reserved.

37    Abstract

39       This document describes the Multicast Address-Set Claim (MASC)
40       protocol which can be used for inter-domain multicast address set
41       allocation. MASC is used by a node (typically a router) to claim and
42       allocate one or more address prefixes to that node's domain. While a
43       domain does not necessarily need to allocate an address set for hosts
44       in that domain to be able to allocate group addresses, allocating an
45       address set to the domain does ensure that inter-domain group-
46       specific distribution trees will be locally-rooted, and that traffic
47       will be sent outside the domain only when and where external
48       receivers exist.

50    1. Introduction

52       This document describes MASC, a protocol for inter-domain multicast
53       address set allocation. The MASC protocol (a Layer-3 protocol in the
54       multicast address allocation architecture [MALLOC]) is used by a node
55       (typically a router) to claim and allocate one or more address
56       prefixes to that node's domain. Each prefix has an associated
57       lifetime, and is chosen out of a larger prefix with a lifetime at
58       least as long, in a manner such that prefixes are aggregatable. At
59       any time, each MASC node (a Prefix Coordinator in [MALLOC]) will
60       typically advertise several prefixes with different lifetimes and
61       scopes, allowing Multicast Address Allocation Servers (MAAS's) in
62       that domain or child MASC domains to choose appropriate addresses for
63       their clients.

65       The set of prefixes ("address set") associated with a domain is
66       injected into an inter-domain routing protocol (e.g., BGP4+ [MBGP]),
67       where it can be used by an inter-domain multicast tree construction
68       protocol (e.g., BGMP [BGMP]) to construct inter-domain group-shared
69       trees.

71       Note that a domain does not need to allocate an address set for the
72       hosts in that domain to be able to allocate group addresses, nor does
73       allocating necessarily guarantee that hosts in other domains will not
74       use an address in the set (since, for example, hosts are not forced
75       to contact a MAAS before using a group address). Allocating an
76       address set to a domain does, however, ensure that inter-domain
77       group-specific multicast distribution trees for any group in the
78       address set will be locally-rooted, and that traffic will be sent
79       outside the given domain only when and where external receivers
80       exist.

82    1.1. Terminology

84       The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
85       "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
86       document are to be interpreted as described in RFC 2119 [RFC2119].

88       Constants used by this protocol are shown as [NAME_OF_CONSTANT], and
89       summarized in Section 6.

91    1.2. Definitions

93       This specification uses a number of terms that may not be familiar to
94       the reader. This section defines some of these and refers to other
95       documents for definitions of others.

97       MAAS (Multicast Address Allocation Server)
98          A host providing multicast address allocation services to end users
99          (e.g. via MADCAP [MADCAP]).

101      MASC server
102         A node running MASC.

104      Peer
105         Other MASC speakers a node directly communicates with.

107      Multicast
108         IP Multicast, as defined for IPv4 in [RFC1112] and for IPv6 in
109         [RFC2460].

111      Multicast Address
112         An IP multicast address or group address, as defined in [RFC1112]
113         and [RFC2373]. An identifier for a group of nodes.

115   2. Requirements for Inter-Domain Address Allocation

117      The key design requirements for the inter-domain address allocation
118      mechanism are:

120      o Efficient address space utilization when space is scarce, which
121        naturally implies that address allocations be based on the actual
122        address usage patterns, and therefore that it be dynamic.

124      o Address aggregation, which implies that the address allocation
125        mechanism be hierarchical.

127      o Minimize flux in the allocated address sets (e.g. the address sets
128        should be reused when possible).

130      o Robustness, by using decentralized mechanisms.

132      The timeliness in obtaining an address set is not a major design
133      constraint as this is taken care of at a lower level [MALLOC].

135   3. Overall Architecture

137      The Multicast Address Set Claim (MASC) protocol is used by MASC
138      domains to claim and allocate address sets for use by Multicast
139      Address Allocation Servers (MAASs) within each domain. Typically one
140      or more border routers of each domain that requires multicast address
141      space of its own would run MASC.
         Throughout this document, the term
142      "MASC domain" refers to a domain that has at least one node running
143      MASC; typically these domains will be Autonomous Systems (AS's). A
144      MASC node (on behalf of its domain) chooses an address set to claim,
145      sends a claim to other MASC domains in the network, and waits while
146      listening for any colliding claims. If there is a collision, the
147      losing claimer gives up the colliding claim and claims a different
148      address set.

150      After a sufficiently long collision-free waiting period, the address
151      set chosen by a MASC node is considered allocated to that node's
152      domain. Three things may then happen:

154      a) The allocated prefix can then be injected as a "multicast route"
155         into the inter-domain routing protocol (e.g., BGP4+ [MBGP]) as
156         "G-RIB" Network Layer Reachability Information (NLRI), where it
157         may be used by an inter-domain multicast routing protocol (e.g.,
158         BGMP [BGMP]) to construct group-shared trees. To reduce the size
159         and slow the growth of the G-RIB, MASC nodes may perform CIDR-like
160         aggregation [CIDR] of the multicast NLRI information. This
161         motivates the need for an algorithm to select prefixes for domains
162         in such a way as to ensure good aggregation in addition to
163         achieving good address space utilization.

165      b) The node's domain may assign to itself a sub-prefix which can be
166         used by MAASs within the domain.

168      c) Sub-prefixes may be allocated to child domains, if any.

170   3.1. Claim-Collide vs. Query-Response Rationale

172      We choose a claim-collide mechanism instead of a query-response
173      mechanism for the following reasons. In a query-response mechanism,
174      replicas of the MASC node would be needed in parent MASC domains in
175      order to make their responses robust to failures. This brings
176      about the associated problem of synchronization of the replicas and
177      possibly additional fragmentation of the address space.
         In addition,
178      even in this mechanism, address collisions would still need to be
179      handled. We believe the proposed claim-collide mechanism is simpler
180      and more robust than a query-response mechanism.

182   4. MASC Topology

184      The domain hierarchy used by MASC is congruent to the somewhat
185      hierarchical structure of the inter-domain topology, e.g., backbones
186      connected to regionals, regionals connected to metropolitan
187      providers, etc. As in BGP, MASC connections are locally configured.
188      A MASC domain that is a customer of other MASC domains will have one
189      or more of those provider domains as its parent. For example, a MASC
190      domain that is a regional provider will choose one (or more) of its
191      backbone provider domains as its parent(s). Children are configured
192      with their parent MASC domain, and parents are configured with their
193      children domains. At the top, a number of Top-Level Domains are
194      connected in a (sparse) mesh and share the global multicast address
195      space. To improve the robustness, a pair of children of the same
196      parent domain MAY be configured as siblings with regard to that
197      parent.

199      Figure 1 illustrates a sample topology. Double-line links denote
200      intra-domain TCP peering sessions, and single-line links denote
201      inter-domain TCP connections. T1 and T2 are Top-Level Domains (e.g.,
202      backbone providers), containing MASC speakers T1a and T2a,
203      respectively. P3 and P4 are regional domains, containing (P3a, P3b),
204      and (P4a, P4b) respectively. P3 has a single customer (or "child"),
205      C5, containing (C5a, C5b, C5c). P4 has three children, C5, C6, C7,
206      containing (C5a, C5b, C5c), (C6a, C6b), and (C7a) respectively.
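
   As an informal illustration only (it is not part of the protocol),
   the domain relationships of this sample topology can be written down
   as plain data. The Python sketch below rests on assumptions: the
   names TOPOLOGY, is_top_level, children_of, possible_siblings, and
   common_parents are hypothetical, and the parents of P3 (T1) and P4
   (T2) are read off the peering sessions drawn in Figure 1.

```python
# Hypothetical representation of the Figure 1 sample topology.
# Each entry maps a MASC domain to (its MASC speakers, its parent domains).
TOPOLOGY = {
    "T1": (["T1a"], []),                       # Top-Level Domain
    "T2": (["T2a"], []),                       # Top-Level Domain
    "P3": (["P3a", "P3b"], ["T1"]),            # regional domain
    "P4": (["P4a", "P4b"], ["T2"]),            # regional domain
    "C5": (["C5a", "C5b", "C5c"], ["P3", "P4"]),
    "C6": (["C6a", "C6b"], ["P4"]),
    "C7": (["C7a"], ["P4"]),
}


def is_top_level(domain):
    """A Top-Level Domain is one that has no parent domain."""
    return not TOPOLOGY[domain][1]


def children_of(parent):
    """Domains that list `parent` among their configured parents."""
    return sorted(d for d, (_, parents) in TOPOLOGY.items() if parent in parents)


def possible_siblings(domain, parent):
    """Other children of `parent`; these MAY be configured as siblings."""
    return [d for d in children_of(parent) if d != domain]


def common_parents(a, b):
    """Parents shared by two domains."""
    return sorted(set(TOPOLOGY[a][1]) & set(TOPOLOGY[b][1]))
```

   For example, children_of("P4") yields C5, C6 and C7, and C5 is the
   only domain in this sketch that has two parents.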

208                  T1a-----------T2a
209                   |             |
210                   |             |
211                   |             |
212           P3a====P3b           P4a====P4b
213            |      |           / |    / | \
214            |      |   _______/  |   /  |  \
215            |      |  /          |  /   |   \______
216            |      | /           | /    |          \
217           C5a====C5b           C6a====C6b----------C7a
218             \\  //
219              \\//
220              C5c

222                     Figure 1: Example MASC Topology

224      All MASC communications use TCP. Each MASC node is connected to and
225      communicates directly with other MASC nodes. The local node acts in
226      exactly one of the following four roles with respect to each remote
227      node:

229      INTERNAL_PEER
230         The local and remote nodes are both in the same MASC domain. For
231         example, P4b is an INTERNAL_PEER of P4a.

233      CHILD
234         A customer relationship exists whereby the local node may obtain
235         address space from the remote node. For example, C6a is a CHILD in
236         its session with P4a.

238      PARENT
239         A provider relationship exists whereby the remote node may obtain
240         address space from the local node. For example, T2a is a PARENT in
241         its session with P4a. Whether space is actually requested is up to
242         the implementation and local policy configuration.

244      SIBLING
245         No customer-provider relationship exists. For example, T2a is a
246         SIBLING in its session with T1a (Top-Level Domain SIBLING peering).
247         Also, C6b is a SIBLING in its session with C7a with regard to their
248         common parent P4.

250      A node's message will be propagated to its parent, all siblings with
251      the same parent, and its children. Since a domain need not have a
252      direct peering session with every sibling, a MASC domain must
253      propagate messages from a child domain to other children, can
254      propagate messages from a parent domain to other siblings, and, if a
255      Top-Level Domain, it must propagate messages from a sibling to other
256      siblings, otherwise may propagate messages from a sibling domain to
257      its parent and other siblings.

259   4.1. Managed vs Locally-Allocated Space

261      Each domain has a "Managed" Address Set, and a "Locally-Allocated"
262      Address Set.
         The "managed" space includes all address space which a
263      domain has successfully claimed via MASC. The "locally-allocated"
264      space, on the other hand, includes all address space which MAASs
265      inside the domain may use. Thus, the locally-allocated space is a
266      subset of the managed space, and refers to the portion which a domain
267      allocates for its own use.

269      For leaf domains (ones with no children), these two sets are
270      identical, since all claimed space is allocated for local use. A
271      parent domain, on the other hand, "manages" all address space which
272      it has claimed via MASC, while sub-prefixes can be allocated to
273      itself and to its children.

275   4.2. Prefix Lifetime

277      Each prefix has an associated lifetime. If a domain wants to use a
278      prefix longer than its lifetime, that domain must "renew" the prefix
279      BEFORE its lifetime expires (see Section 5.2). If the lifetime
280      cannot be extended, then the domain should either retry later to
281      extend, or should choose and claim another prefix.

283      After a prefix's lifetime expires, MASC nodes in the domain that owns
284      that prefix must stop using that prefix. The corresponding entry
285      from the G-RIB database must be removed, and all information
286      associated with the expired prefix may be deleted from the MASC
287      node's local memory.

289   4.3. Active vs. Deprecated Prefixes

291      Each prefix advertised by a parent to its children can be either
292      "active" or "deprecated". A "deprecated" prefix is a prefix that the
293      parent wishes to stop using after its lifetime expires. Only
294      "active" prefixes are candidates for size expansion or lifetime
295      extension. Usually, this information will be used by a child as a
296      hint to know which of the parent's prefixes might have their lifetime
297      extended.

299   4.4. Multi-Parent Sibling-to-Sibling and Internal Peering

301      Two sibling nodes that have more than one common parent will create
302      and use between them a number of transport-level connections, one per
303      common parent. The information associated with a parent will be
304      sent over the connection that corresponds to the same parent.
305      Internal peers do not need to open multiple connections between them;
306      a single connection is used for all information.

308   4.5. Administratively-Scoped Address Allocation

310      MASC can also be used for sub-allocating prefixes of addresses within
311      an administrative scope zone [SCOPE], but only if the scope is
312      "divisible" (as described in [MALLOC] and [MZAP]). A MASC node can
313      learn what scopes it resides within by listening to MZAP [MZAP]
314      messages.

316      A "Zone TLD" is a domain which has no parent domain within the scope
317      zone. Zone TLDs act as TLDs for the prefix associated with the
318      scope. Figure 2 gives an example, where a scope boundary around
319      domains P3 and C5 has been added to Figure 1. Domain P3 is a Zone
320      TLD, since its only parent (T1) is outside the boundary. Hence, P3
321      can claim space directly out of the prefix associated with the scope
322      itself. Domain C5, on the other hand, has a parent within the scope
323      (namely, P3), and hence is not a Zone TLD.

325                  T1a-----------T2a
326                   |             |
327       ............|.......      |
328       .           |      .      |
329       .   P3a====P3b     .     P4a
330       .    |      |      .    /
331       .    |      |   _______/
332       .    |      |  /   .
333       .    |      | /    .
334       .   C5a====C5b     .
335       .     \\  //       .
336       .      \\//        .
337       .      C5c         .
338       .                  .
339       . Admin Scope Zone .
340       ....................

342                      Figure 2: Scope Zone Example

344      It is assumed that the role of a node (as discussed in Section 4)
345      with respect to a given peering session is the same for every scope
346      in which both ends are contained.
         A peering session that crosses a
347      scope boundary (such as the session between C5b and P4a in Figure 2)
348      is ignored when propagating messages that pertain to the given scope.
349      That is, such messages are not sent across such sessions.

351   5. Protocol Details

353   5.1. Claiming Space

355      When a MASC node, on behalf of a MASC domain, needs more address
356      space, it decides locally the size and the value of the address
357      prefix(es) it will claim from one of its parents. For example, the
358      decision might be based on the knowledge this node has about its
359      parent's address set, its siblings' claims and allocations, its own
360      address set, the claim messages from its siblings, and/or the demand
361      pattern of its children and the local domain. A sample algorithm is
362      given in Appendix A.

364      A MASC node which is not in a top-level domain can initiate a claim
365      toward a parent MASC domain if and only if it currently has an
366      established connection with at least one node in that parent domain.

368      After the prefix address and size are decided, the claim proceeds as
369      follows:

371      a) The claim is scheduled to be sent after a random delay in the
372         interval (0, [INITIATE_CLAIM_DELAY]). If a claim originated by a
373         node from the same MASC domain is received, and that claim
374         eliminates the need for the local claim, the local claim is
375         canceled and no further action is taken.

377      b) The claim is sent to one of the parents (if the domain is not a
378         top-level domain), all known siblings with the same parent, and
379         all internal peers. A Claim-Timer is then started at
380         [WAITING_PERIOD], and the MASC node starts listening for colliding
381         claims.

383      c) If a colliding claim is received while the Claim-Timer is running,
384         that claim is compared with the locally initiated claim using the
385         function described in Section 5.1.1.
            If the local claim is the
386         loser, a new prefix must be chosen to claim, and the loser claim's
387         Claim-Timer must be canceled. The loser claim can be either
388         explicitly withdrawn, or can be left to expire without taking
389         further actions. If the winning claim was originated by a node
390         from the same MASC domain, no new claim will be initiated. If the
391         local claim is the winner, no actions need to be taken.

393      d) If the Claim-Timer expires, the claimed prefix becomes associated
394         with the claimer's domain, i.e. it is considered allocated to that
395         domain and the following actions can be performed:

397         o Advertise the prefix to its parent, and to all siblings with
398           the same parent, by sending a PREFIX_IN_USE claim to them.

400         o Inject the prefix into the G-RIB of the inter-domain routing
401           protocol.

403         o Send a PREFIX_MANAGED message to all children and internal
404           peers, informing them that they may issue claims within the
405           managed space. A sub-prefix may then be claimed for local
406           usage (see Section 12.2).

408      Each MASC node receives all claims from its siblings and children. A
409      received claim must be evaluated against all claims saved in the
410      local cache using the function described in Section 5.1.1. The
411      output of the function will define the further processing of that
412      claim (see Section 11).

414   5.1.1. Claim Comparison Function

416      Each claim message includes:

418      o a "type", being one of: PREFIX_IN_USE, CLAIM_DENIED,
419        CLAIM_TO_EXPAND, or NEW_CLAIM (PREFIX_MANAGED and WITHDRAW are
420        not considered as claims that have to be compared)

422      o timestamp when the claim was initiated

424      o the claimed prefix and lifetime

426      o MASC Identifier of the node that originated the claim

428      When two claims are compared, first the type is compared based on the
429      following precedence:

431         PREFIX_IN_USE > CLAIM_DENIED > CLAIM_TO_EXPAND > NEW_CLAIM

433      If the type is the same, then the timestamps are used to compare the
434      claims. In practice, two claims will have the same type if the type
435      is either NEW_CLAIM (ordinary collision) or PREFIX_IN_USE (signal for
436      a clash). When the timestamps are compared, the claim with the
437      smallest, i.e. earliest timestamp wins. If the timestamps are the
438      same, then the claim with the smallest Origin Node Identifier wins.

440   5.2. Renewing an Existing Claim

442      The procedure for extending the lifetime of prefixes already in use
443      is the same as claiming new space (see Section 5.1), except that the
444      claim type must be CLAIM_TO_EXPAND, while the Address and the Mask of
445      the claim (see Section 7.3) must be the same as the already allocated
446      prefix. If the Claim-Timer expires and there is no collision, the
447      desired lifetime is assumed.

449   5.3. Expanding an Existing Prefix

451      The procedure for expanding the size of a prefix already in use
452      is the same as claiming new space (see Section 5.1), except that the
453      claim type must be CLAIM_TO_EXPAND, while the Address and the Mask of
454      the claim (see Section 7.3) must be set to the desired values. If
455      the Claim-Timer expires and there is no collision, the desired larger
456      prefix is associated with the local domain.

458   5.4. Releasing Allocated Space

460      If the lifetime of a prefix allocated to the local domain expires and
461      the domain does not need to reuse it, all resources associated with
462      this prefix are deleted and no further actions are taken. If the
463      lifetime of the prefix has not expired, and if no subranges of that
464      prefix have been allocated for local usage or by some of the
465      children domains, the space may be released by sending a withdraw
466      message to the parent domain, all known siblings with the same
467      parent, and all internal peers.

469   6. Constants

471      MASC uses the following constants:

473      [PORT_NUMBER]
474         2587. The TCP port number used to listen for incoming MASC
475         connections, as assigned by IANA.

477      [WAITING_PERIOD]
478         The amount of time (in seconds) that must pass between a NEW_CLAIM
479         (or CLAIM_TO_EXPAND), and a PREFIX_IN_USE for the same prefix.
480         This must be long enough to reasonably span any single inter-domain
481         network partition. Default: 172800 seconds (i.e. 48 hours).

483      [INITIATE_CLAIM_DELAY]
484         The amount of time (in seconds) a MASC node must wait before
485         initiating a new claim or a claim for space expansion. This must be
486         a random value in the interval (0, [INITIATE_CLAIM_DELAY]).
487         Default value for [INITIATE_CLAIM_DELAY]: 600 seconds (i.e. 10
488         minutes).

490      [TLD_ID]
491         The Parent Domain Identifier used by a Top-Level Domain (which has
492         no parent). Must be 0.

494      [HOLDTIME]
495         The amount of time (in seconds) that must pass without any messages
496         received from a remote node before the connection is considered
497         down. Default: 240 seconds (i.e. 4 minutes).

499   7. Message Formats

501      This section describes message formats used by MASC.

503      Messages are sent over a reliable transport protocol connection. A
504      message is processed only after it is entirely received. The maximum
505      message size is 4096 octets. All implementations are required to
506      support this maximum message size.

508   7.1. Message Header Format

510      Each message has a fixed-size (4-octets) header. There may or may
511      not be a data portion following the header, depending on the message
512      type. The layout of these fields is shown below:

514       0                   1                   2                   3
515       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
516      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
517      |            Length             |      Type     |    Reserved   |
518      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

520      Length:
521         This 2-octet unsigned integer indicates the total length of the
522         message, including the header, in octets. Thus, e.g., it allows
523         one to locate in the transport-level stream the start of the next
524         message. The value of the Length field must always be at least 4
525         and no greater than 4096, and may be further constrained, depending
526         on the message type. No "padding" of extra data after the message
527         is allowed, so the Length field must have the smallest value
528         required given the rest of the message.

530      Type:
531         This 1-octet unsigned integer indicates the type code of the
532         message. The following type codes are defined:

534            1 - OPEN
535            2 - UPDATE
536            3 - NOTIFICATION
537            4 - KEEPALIVE

539      Reserved:
540         This 1-octet field is reserved. MUST be set to zero by the sender,
541         and MUST be ignored by the receiver.

543   7.2. OPEN Message Format

545      After a transport protocol connection is established, the first
546      message sent by each side is an OPEN message. If the OPEN message is
547      acceptable, a KEEPALIVE message confirming the OPEN is sent back.
548      Once the OPEN is confirmed, UPDATE, KEEPALIVE, and NOTIFICATION
549      messages may be exchanged.

551      The minimum length of the OPEN message is 20 octets (including message
552      header).
         In addition to the fixed-size MASC header, the OPEN message
553      contains the following fields:

555       0                   1                   2                   3
556       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
557      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
558      |    Version    |R| AddrFam |Rol|           Hold Time           |
559      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
560      |          Sender Domain Identifier (variable length)           |
561      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
562      |         Sender MASC Node Identifier (variable length)         |
563      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
564      |         Parent's Domain Identifier (variable length)          |
565      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
566      |                                                               |
567      +                     (Optional Parameters)                     |
568      |                                                               |
569      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

571      Version:
572         This 1-octet unsigned integer indicates the protocol version number
573         of the message. The current MASC version number is 1.

575      R bit:
576         This 1-bit field is reserved. MUST be set to zero by the sender,
577         and MUST be ignored by the receiver.

579      AddrFam:
580         This 5-bit field is the IANA-assigned address family number of the
581         encoded prefix [IANA]. These include (among others):

583            Number    Description
584            ------    -----------
585               1      IP (IP version 4)
586               2      IPv6 (IP version 6)

588      My Role (Rol):
589         This 2-bit field indicates the proposed relationship of the sending
590         system to the receiving system:
591            00 = INTERNAL_PEER (sent from one internal peer to another)
592            01 = CHILD (sent from a child to its parent)
593            10 = SIBLING (sent from one sibling to another)
594            11 = PARENT (sent from a parent to its child)

596      Hold Time:
597         This 2-octet unsigned integer indicates the number of seconds that
598         the sender proposes for the value of the Hold Timer.
            Upon receipt
599         of an OPEN message, a MASC speaker MUST calculate the value of the
600         Hold Timer by using the smaller of its configured Hold Time for
601         that peer and the Hold Time received in the OPEN message. The Hold
602         Time MUST be either zero or at least three seconds. An
603         implementation may reject connections on the basis of the Hold
604         Time. The calculated value indicates the maximum number of seconds
605         that may elapse between the receipt of successive KEEPALIVE and/or
606         UPDATE messages by the sender. RECOMMENDED value is [HOLDTIME]
607         seconds.

609      Sender Domain Identifier:
610         A globally unique identifier. Its length is determined based on
611         the Address Family, and should be treated as an unsigned integer
612         (e.g. a 4-octet integer for IPv4, or a 16-octet integer for IPv6),
613         but must be at least 4 octets long. It should be set to the
614         Autonomous System number of the sender, but the network unicast
615         prefix address is also acceptable.

617      Sender MASC Node Identifier:
618         This field's length and format are the same as the Sender Domain
619         Identifier field, and indicates the MASC Node Identifier of the
620         sender. A given MASC speaker sets the value of its MASC Node
621         Identifier to a globally-unique value assigned to that MASC speaker
622         (e.g., an IPv4 or IPv6 address). The value of the MASC Node
623         Identifier is determined on startup and is the same for every MASC
624         session opened.

626      Parent's Domain Identifier:
627         This field's length and format are the same as the Sender Domain
628         Identifier field, and is set to the Domain Identifier of the
629         sender's parent (e.g. the parent's Autonomous System number, or
630         network prefix address), or is set to [TLD_ID] if the sender is a
631         TLD. Used only when Rol is INTERNAL_PEER or SIBLING, otherwise is
632         ignored.
This field is used to determine the common parents 633 between siblings, to associate each sibling-to-sibling connection 634 with a particular parent, and to discover TLD-related configuration 635 problems among internal peers. If a non-TLD node does not know yet 636 the Domain ID of any of its parents, it can use its own Domain ID 637 in the OPEN messages to its internal peers. 639 Optional Parameters: 640 This field may contain a list of optional parameters, where each 641 parameter is encoded as a triplet. The combined length of all optional 643 parameters can be derived from the Length field in the message 644 header. 646 0 1 647 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 648 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... 649 | Parm. Length | Parm. Type | Parameter Value (variable) 650 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-... 652 Parameter Length is a one octet field that contains the length of 653 the Parameter Value field in octets. Parameter Type is a one octet 654 field that unambiguously identifies individual parameters. 655 Parameter Value is a variable length field that is interpreted 656 according to the value of the Parameter Type field. Unrecognized 657 optional parameters MUST be silently ignored. 659 This document does not define any optional parameters. 661 7.3. UPDATE Message Format 663 UPDATE messages are used to transfer Claim/Collision/PrefixManaged 664 information between MASC speakers. The UPDATE message always 665 includes the fixed-size MASC header, and one or more attributes as 666 described below. The minimum length of the UPDATE message is 40 667 octets (including the message header). 669 Each attribute is of the form: 671 0 1 2 3 672 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 673 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 674 | Length | Type | Reserved | 675 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 676 | Data ... | 677 . . 678 . . 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   All attributes are 4-octet aligned.

   Length:
      The Length is the length of the entire attribute, including the
      length, type, and data fields. If other attributes are nested
      within the data field, the length includes the size of all such
      nested attributes.

   Type:
      This 1-octet unsigned integer indicates the type code of the
      attribute. The following type codes are defined:

         0 = PREFIX_IN_USE   (prefix is being used by the origin)
         1 = CLAIM_DENIED    (the claim is refused (probably by the
                              origin's parent domain))
         2 = CLAIM_TO_EXPAND (origin is trying to expand the size of
                              an existing prefix)
         3 = NEW_CLAIM       (origin is trying to claim a new prefix)
         4 = PREFIX_MANAGED  (parent is informing child of space
                              available)
         5 = WITHDRAW        (origin is withdrawing a previous claim)

      Types 128-255 are reserved for "optional" attributes. If a
      required attribute is unrecognized, a NOTIFICATION with UPDATE
      Error Code and Unrecognized Required Attribute subcode will be
      sent. Unrecognized optional attributes are simply ignored.

   Reserved:
      This 1-octet field is reserved. MUST be set to zero by the sender,
      and MUST be ignored by the receiver.

   Types 0-3 are collectively called "CLAIMs". The message format below
   describes the encoding of a CLAIM, PREFIX_MANAGED and WITHDRAW.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved1   |D| AddrFam |Rol|           Reserved2           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Claim Timestamp                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Claim Lifetime                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Claim Holdtime                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Origin Domain Identifier (variable length)           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Origin Node Identifier (variable length)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Address (variable length)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Mask (variable length)                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                     (Optional Parameters)                     |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Reserved1:
      This 1-octet field is reserved. MUST be set to zero by the
      sender, and MUST be ignored by the receiver.

   D-bit:
      DEPRECATED_PREFIX bit. If set, indicates that the advertised
      address prefix is Deprecated, otherwise the prefix is Active (see
      Section 4.3).

   AddrFam:
      This 5-bit field is the IANA-assigned address family number of the
      encoded prefix [IANA].

   Rol:
      This 2-bit field indicates the relationship/role of the Origin of
      the message to the node sending that message:

         00 = INTERNAL (originated by the sender's domain)
         01 = CHILD    (originated by a child of the sender's domain)
         10 = SIBLING  (originated by a sibling of the sender's domain)
         11 = PARENT   (originated by a parent of the sender's domain)

   Reserved2:
      This 2-octet field is reserved.
      MUST be set to zero by the sender, and MUST be ignored by the
      receiver.

   Claim Timestamp:
      The timestamp of the claim when it was originated. The timestamp
      is expressed as the number of seconds since midnight (0 hour),
      January 1, 1970, Greenwich.

   Claim Lifetime:
      The time in seconds between the Claim Timestamp and the time at
      which the prefix will become free.

   Claim Holdtime:
      The time in seconds between the Claim Timestamp and the time at
      which the claim should be deleted from the local cache. For
      PREFIX_IN_USE and PREFIX_MANAGED claims it should be equal to
      Claim Lifetime; for CLAIM_TO_EXPAND, NEW_CLAIM, and CLAIM_DENIED
      it should be equal to [WAITING_PERIOD].

   Origin Domain Identifier:
      The domain identifier of the claim originator. Its length and
      format definition are the same as those of the Sender Domain
      Identifier (see Section 7.2).

   Origin Node Identifier:
      The MASC Node ID of the claim originator. Its length and format
      definition are the same as those of the Sender MASC Node
      Identifier (see Section 7.2).

   Address:
      The address associated with the given prefix to be encoded. The
      length is determined based on the Address Family (e.g. 4 octets
      for IPv4, 16 for IPv6).

   Mask:
      The mask associated with the given prefix. The length is the same
      as the Address field and is determined based on the Address
      Family. The field contains the full bitmask.

   Optional Parameters:
      This field may contain a list of optional parameters, where each
      parameter is encoded using the same format as the optional
      parameters of an OPEN message (see Section 7.2). Unrecognized
      optional parameters MUST be silently ignored. This document does
      not define any optional parameters.

7.4. KEEPALIVE Message Format

   MASC does not use any transport protocol-based keep-alive mechanism
   to determine if peers are reachable.
   Instead, KEEPALIVE messages are exchanged between peers often enough
   as not to cause the Hold Timer to expire. A reasonable maximum time
   between the last KEEPALIVE or UPDATE message sent, and the time at
   which a KEEPALIVE message is sent, would be one third of the Hold
   Time interval. KEEPALIVE messages MUST NOT be sent more frequently
   than one per second. An implementation MAY adjust the rate at which
   it sends KEEPALIVE messages as a function of the Hold Time interval.

   If the negotiated Hold Time interval is zero, then periodic KEEPALIVE
   messages MUST NOT be sent.

   A KEEPALIVE message consists of only a message header, and has a
   length of 4 octets.

7.5. NOTIFICATION Message Format

   A NOTIFICATION message is sent when an error condition is detected.
   Depending on the error condition, the MASC connection might or must
   be closed immediately after sending the message. If the sender of
   the NOTIFICATION decides that the connection is to be closed, it will
   indicate this by zeroing the O-bit in the NOTIFICATION message (see
   below).

   In addition to the fixed-size MASC header, the NOTIFICATION message
   contains the following fields:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |O| Error code  | Error subcode |           Data                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   O-bit:
      Open-bit. If zero, it indicates that the sender will close the
      connection. If '1', it indicates that the sender has chosen to
      keep the connection open.

   Error Code:
      This 7-bit unsigned integer indicates the type of NOTIFICATION.
      The following Error Codes have been defined:

         Error Code       Symbolic Name               Reference

             1         Message Header Error           Section 8.1
             2         OPEN Message Error             Section 8.2
             3         UPDATE Message Error           Section 8.3
             4         Hold Timer Expired             Section 8.4
             5         Finite State Machine Error     Section 8.5
             6         NOTIFICATION Message Error     Section 8.6
             7         Cease                          Section 8.7

   Error subcode:
      This 1-octet unsigned integer provides more specific information
      about the nature of the reported error. Each Error Code may have
      one or more Error Subcodes associated with it. If no appropriate
      Error Subcode is defined, then a zero (Unspecific) value is used
      for the Error Subcode field, and the O-bit must be zero (i.e. the
      connection will be closed). The notation used in the error
      description below is: MC = Must Close connection = O-bit is zero;
      CC = Can Close connection = O-bit might be zero.

      Message Header Error subcodes:

          0 - Unspecific (MC)
          1 - Bad Message Length (MC)
          2 - Bad Message Type (CC)

      OPEN Message Error subcodes:

          0 - Unspecific (MC)
          1 - Unsupported Version Number (MC)
          2 - Bad Peer Domain ID (MC)
          3 - Bad Peer MASC Node ID (MC)
          6 - Unacceptable Hold Time (MC)
          7 - Invalid Parent Configuration (MC)
          8 - Inconsistent Role (MC)
          9 - Bad Parent Domain ID (MC)
         10 - No Common Parent (MC)
         13 - Unrecognized Address Family (MC)

      UPDATE Message Error subcodes:

          0 - Unspecific (MC)
          1 - Malformed Attribute List (MC)
          2 - Unrecognized Required Attribute (CC)
          5 - Attribute Length Error (MC)
         10 - Invalid Address field (CC)
         11 - Invalid Mask field (CC)
         12 - Non-Contiguous Mask (CC)
         13 - Unrecognized Address Family (MC)
         14 - Claim Type Error (CC)
         15 - Origin Domain ID Error (CC)
         16 - Origin Node ID Error (CC)
         17 - Claim Lifetime Too Short (CC)
         18 - Claim Lifetime Too Long (CC)
         19 - Claim Timestamp Too Old (CC)
         20 - Claim Timestamp Too New (CC)
         21 - Claim Prefix Size Too Small (CC)
         22 - Claim Prefix Size Too Large (CC)
         23 - Illegal Origin Role Error (CC)
         24 - No Appropriate Parent Prefix (CC)
         25 - No Appropriate Child Prefix (CC)
         26 - No Appropriate Internal Prefix (CC)
         27 - No Appropriate Sibling Prefix (CC)
         28 - Claim Holdtime Too Short (CC)
         29 - Claim Holdtime Too Long (CC)

      Hold Timer Expired subcodes (the O-bit is always zero):

          0 - Unspecific (MC)

      Finite State Machine Error subcodes:

          0 - Unspecific (MC)
          1 - Open/Close MASC Connection FSM Error (MC)
          2 - Unexpected Message Type FSM Error (MC)

      Cease subcodes (the O-bit is always zero):

          0 - Unspecific (MC)

      NOTIFICATION subcodes (the O-bit is always zero):

          0 - Unspecific (MC)

   Data:
      This variable-length field is used to diagnose the reason for the
      NOTIFICATION. The contents of the Data field depend upon the
      Error Code and Error Subcode. See Section 8 for more details.

      Note that the length of the Data field can be determined from the
      message Length field by the formula:

         Message Length = 6 + Data Length

   The minimum length of the NOTIFICATION message is 6 octets
   (including message header).

8. MASC Error Handling

   This section describes actions to be taken when errors are detected
   while processing MASC messages. MASC Error Handling is similar to
   that of BGP [BGP].

   When any of the conditions described here are detected, a
   NOTIFICATION message with the indicated Error Code, Error Subcode,
   and Data fields is sent. In addition, the MASC connection might be
   closed. If no Error Subcode is specified, then a zero (Unspecific)
   must be used.

   The phrase "the MASC connection is closed" means that the transport
   protocol connection has been closed and that all resources for that
   MASC connection have been deallocated.

   Unless specified explicitly, the Data field of the NOTIFICATION
   message is empty.

8.1. Message Header Error Handling

   All errors detected while processing the Message Header are indicated
   by sending the NOTIFICATION message with Error Code Message Header
   Error. The Error Subcode elaborates on the specific nature of the
   error. The Data field contains the erroneous Message (including the
   message header).

   If the Length field of the message header is less than 4 or greater
   than 4096, or if the length of an OPEN message is less than the
   minimum length of the OPEN message, or if the length of an UPDATE
   message is less than the minimum length of the UPDATE message, or if
   the length of a KEEPALIVE message is not equal to 4, then the Error
   Subcode is set to Bad Message Length.

   If the Type field of the message header is not recognized, then the
   Error Subcode is set to Bad Message Type.

8.2. OPEN Message Error Handling

   All errors detected while processing the OPEN message are indicated
   by sending the NOTIFICATION message with Error Code OPEN Message
   Error. The Error Subcode elaborates on the specific nature of the
   error. The Data field contains the erroneous OPEN Message (excluding
   the Message Header), unless stated otherwise.

   If the version number contained in the Version field of the received
   OPEN message is not supported, then the Error Subcode is set to
   Unsupported Version Number. The Data field is a 1-octet unsigned
   integer, which indicates the largest locally supported version number
   less than the version the remote MASC node bid (as indicated in the
   received OPEN message).

   If the Sender Domain Identifier field of the OPEN message is
   unacceptable, then the Error Subcode is set to Bad Peer Domain ID.
   The determination of acceptable Domain IDs is outside the scope of
   this protocol.

   If the Sender MASC Node Identifier field of the OPEN message is
   unacceptable, then the Error Subcode is set to Bad Peer MASC Node ID.
   The determination of acceptable Node IDs is outside the scope of this
   protocol.

   If the Hold Time field of the OPEN message is unacceptable, then the
   Error Subcode MUST be set to Unacceptable Hold Time. An
   implementation MUST reject Hold Time values of one or two seconds.
   An implementation MAY reject any proposed Hold Time. An
   implementation which accepts a Hold Time MUST use the negotiated
   value for the Hold Time.

   If the remote system's proposed Role is INTERNAL_PEER, and either
   (but not both) the local system or the remote system's Parent Domain
   ID is [TLD_ID], then the Error Subcode is set to Invalid Parent
   Configuration. The Data field must be filled with all the local
   system's Parent Domain IDs.

   If the remote system's proposed Role conflicts with its expected role
   (based on the local system's configured Role), then the Error Subcode
   is set to Inconsistent Role. The Data field is 1-octet long, and
   contains the local system's configured Role.

   If the remote system's Parent Domain ID is unacceptable, then the
   Error Subcode is set to Bad Parent Domain ID, and the Data field is
   filled with the erroneous Parent Domain ID. The determination of
   acceptable Parent Domain ID is outside the scope of this protocol.

   If the remote system is supposed to be a sibling, but it does not
   have a common parent with the local system (based on the Parent
   Domain ID information in the OPEN message), the Error Subcode is set
   to No Common Parent, and the Data field is filled with all Parent
   Domain IDs of the local MASC domain.

   If the Address Family is unrecognized, then the Error Subcode is set
   to Unrecognized Address Family.

8.3. UPDATE Message Error Handling

   All errors detected while processing the UPDATE message are indicated
   by sending the NOTIFICATION message with Error Code UPDATE Message
   Error.
   The error subcode elaborates on the specific nature of the error.
   The Data field contains the erroneous UPDATE Message (including the
   attribute header, but excluding the Message Header), unless stated
   otherwise.

   If any recognized attribute has an Attribute Length that conflicts
   with the expected length (based on the attribute type code), then the
   Error Subcode is set to Attribute Length Error.

   If any of the mandatory well-known attributes are not recognized,
   then the Error Subcode is set to Unrecognized Required Attribute.

   If the Address field includes an invalid address (except 0), then the
   Error Subcode is set to Invalid Address.

   If the Mask field includes an invalid mask (for example, starting
   with 0), then the Error Subcode is set to Invalid Mask.

   If the Mask field includes a non-contiguous bitmask, and that MASC
   server does not support, or is not configured to use non-contiguous
   masks, then the Error Subcode is set to Non-Contiguous Mask.

   If the Address Family is unrecognized, then the Error Subcode is set
   to Unrecognized Address Family.

   If the Origin Role/Claim Type combination is not one of the
   following, then the Error Subcode is set to Claim Type Error.

      Origin     Claim
      Role       Type

      ICS        PREFIX_IN_USE   (0)
      I  P       CLAIM_DENIED    (1)
      ICS        CLAIM_TO_EXPAND (2)
      ICS        NEW_CLAIM       (3)
      I  P       PREFIX_MANAGED  (4)
      ICSP       WITHDRAW        (5)

   If there is a reason to believe that the Origin Domain ID is invalid,
   then the Error Subcode is set to Origin Domain ID Error. The same
   applies for Origin Node ID (the corresponding error is Origin Node ID
   Error).

   If a node (usually a parent receiving a claim from a child) decides
   that the Claim Lifetime is too short (for example, less than 172800,
   i.e. 48 hours), it MAY send an UPDATE Message Error with subcode
   Claim Lifetime Too Short.
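
   The Origin Role/Claim Type table and the Non-Contiguous Mask check
   above can be sketched as follows. This is an illustrative sketch
   only: the function names are hypothetical, and the role letters I,
   C, S and P are assumed to stand for the INTERNAL, CHILD, SIBLING and
   PARENT values of the Rol field (Section 7.3).

```python
# Sketch only: names are hypothetical and not part of the protocol.
# Assumption: I/C/S/P = INTERNAL/CHILD/SIBLING/PARENT origin roles.
ALLOWED_ROLES = {
    0: frozenset("ICS"),   # PREFIX_IN_USE
    1: frozenset("IP"),    # CLAIM_DENIED
    2: frozenset("ICS"),   # CLAIM_TO_EXPAND
    3: frozenset("ICS"),   # NEW_CLAIM
    4: frozenset("IP"),    # PREFIX_MANAGED
    5: frozenset("ICSP"),  # WITHDRAW
}


def claim_type_ok(role: str, claim_type: int) -> bool:
    """True if the Origin Role/Claim Type pair appears in the table."""
    return role in ALLOWED_ROLES.get(claim_type, frozenset())


def mask_is_contiguous(mask: bytes) -> bool:
    """True if the mask is a run of 1 bits followed only by 0 bits."""
    width = len(mask) * 8
    value = int.from_bytes(mask, "big")
    inverted = value ^ ((1 << width) - 1)  # the 0 bits become 1 bits
    # A contiguous mask inverts to 2**k - 1, so adding 1 shares no bits.
    return (inverted & (inverted + 1)) == 0
```

   A receiver that does not support non-contiguous masks would report
   Non-Contiguous Mask when mask_is_contiguous() returns False, and
   Claim Type Error when claim_type_ok() returns False.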

   If a node (usually a parent receiving a claim from a child) decides
   that the Claim Lifetime is too long (for example, more than
   15,768,000, i.e. half a year), then it MAY send an UPDATE Message
   Error with subcode Claim Lifetime Too Long. Note that usually a
   parent MASC node should first send CLAIM_DENIED collision messages
   with the Claim Lifetime field filled with the longest acceptable
   lifetime. If the child refuses to claim with a shorter lifetime, then
   Claim Lifetime Too Long should be sent.

   If a node (usually a parent receiving a claim from a child) decides
   that the Claim Timestamp is too small, i.e. too old (for example, if
   a node is self-confident that its clock is quite accurate), then it
   MUST send an UPDATE Message Error with subcode Claim Timestamp Too
   Old. Claim Timestamp Too New is defined similarly.

   If a node (usually a parent receiving a claim from a child) decides
   that the prefix size implied by the Mask field is too small (for
   example, smaller than 16 addresses), then it MAY send an UPDATE
   Message Error with subcode Claim Prefix Size Too Small.

   If a node (usually a parent receiving a claim from a child) decides
   that the prefix size implied by the Mask field is too large, then it
   MAY send an UPDATE Message Error with subcode Claim Prefix Size Too
   Large. Note that usually a parent MASC node should first send
   CLAIM_DENIED collision messages for some subrange of the child's
   large claimed address range. If the child refuses to shrink the
   claim size, then Claim Prefix Size Too Large should be sent.

   If the received UPDATE message's computed Updated Origin Role is
   illegal (see Table 1 in Section 11.1), then the Error Subcode is set
   to Illegal Origin Role Error.
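
   The lifetime and prefix-size policy checks described above might be
   sketched as follows. The thresholds are the example values quoted in
   the text and are a local policy matter; the function names are
   hypothetical. The sketch covers only the local decision, not the
   preceding CLAIM_DENIED exchange that a parent should normally try
   first.

```python
# Sketch only: thresholds are the examples from the text, not mandates.
MIN_LIFETIME = 172800        # 48 hours
MAX_LIFETIME = 15_768_000    # about half a year
MIN_PREFIX_ADDRESSES = 16    # example minimum prefix size


def prefix_size(mask: bytes) -> int:
    """Number of addresses covered by a contiguous mask."""
    width = len(mask) * 8
    ones = bin(int.from_bytes(mask, "big")).count("1")
    return 1 << (width - ones)


def check_claim(lifetime: int, mask: bytes):
    """Return the UPDATE Error Subcode name to report, or None."""
    if lifetime < MIN_LIFETIME:
        return "Claim Lifetime Too Short"
    if lifetime > MAX_LIFETIME:
        return "Claim Lifetime Too Long"
    if prefix_size(mask) < MIN_PREFIX_ADDRESSES:
        return "Claim Prefix Size Too Small"
    return None
```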

   If the received UPDATE message needs to be associated with a parent's
   prefix, but the association is not successful, then the Error Subcode
   is set to No Appropriate Parent Prefix. The No Appropriate Child
   Prefix, No Appropriate Internal Prefix, and No Appropriate Sibling
   Prefix Error Subcodes are defined similarly.

   If a node decides that the Claim Holdtime is too short (for example,
   just a few seconds), it MAY send an UPDATE Message Error with subcode
   Claim Holdtime Too Short.

   If a node decides that the Claim Holdtime is too long (for example,
   more than 15,768,000, i.e. half a year), then it SHOULD send an
   UPDATE Message Error with subcode Claim Holdtime Too Long.

   If any other error is encountered when processing attributes, then
   the Error Subcode is set to Malformed Attribute List, and the
   erroneous attribute is included in the data field.

8.4. Hold Timer Expired Error Handling

   If a system does not receive successive KEEPALIVE and/or UPDATE
   and/or NOTIFICATION messages within the period specified in the Hold
   Time field of the OPEN message, then the NOTIFICATION message with
   Hold Timer Expired Error Code must be sent and the MASC connection
   closed.

8.5. Finite State Machine Error Handling

   Any error detected by the MASC Finite State Machine (e.g., receipt of
   an unexpected event) is indicated by sending the NOTIFICATION message
   with Error Code Finite State Machine Error. The Error Subcode
   elaborates on the specific nature of the error.

8.6. NOTIFICATION Message Error Handling

   If a node sends a NOTIFICATION message, and there is an error in that
   message, and the O-bit of that message is not zero, a NOTIFICATION
   with O-bit zeroed, Error Code of NOTIFICATION Error, and subcode
   Unspecific must be sent. In addition, the Data field must include
   the erroneous NOTIFICATION message.
   However, if the erroneous NOTIFICATION message had the O-bit zeroed,
   then any error, such as an unrecognized Error Code or Error Subcode,
   should be noticed, logged locally, and brought to the attention of
   the administrator of the remote node. The means to do this, however,
   lies outside the scope of this document.

8.7. Cease

   In absence of any fatal errors (that are indicated in this section),
   a MASC node may choose at any given time to close its MASC connection
   by sending the NOTIFICATION message with Error Code Cease. However,
   the Cease NOTIFICATION message must not be used when a fatal error
   indicated by this section does exist.

8.8. Connection Collision Detection

   If a pair of MASC speakers try simultaneously to establish a TCP
   connection to each other, then two parallel connections between this
   pair of speakers might well be formed. We refer to this situation as
   connection collision. Clearly, one of these connections must be
   closed. Note that if the nodes were siblings, and each of those
   connections was associated with a different parent, then we do not
   consider this situation as collision (see Section 4.4).

   Based on the value of the MASC Node Identifier a convention is
   established for detecting which MASC connection is to be preserved
   when a connection collision does occur. The convention is to compare
   the MASC Node Identifiers of the remote nodes involved in the
   collision and to retain only the connection initiated by the MASC
   speaker with the higher-valued MASC Node Identifier.

   Upon receipt of an OPEN message, the local system must examine all of
   its connections that are in the OpenConfirm state. A MASC speaker
   may also examine connections in an OpenSent state if it knows the
   MASC Node Identifier of the remote node by means outside of the
   protocol.
   If among these connections there is a connection to a remote MASC
   speaker whose MASC Node Identifier equals the one in the OPEN
   message, and, in case of a sibling-to-sibling connection, the Parent
   Domain ID of that connection equals the one in the OPEN message, then
   the local system performs the following connection collision
   resolution procedure:

   1. The MASC Node Identifier of the local system is compared to the
      MASC Node Identifier of the remote system (as specified in the
      OPEN message). Comparing MASC Node Identifiers is done by
      treating them as unsigned integers (e.g. 4-octets long for IPv4
      and 16-octets long for IPv6).

   2. If the value of the local MASC Node Identifier is less than the
      remote one, the local system closes the MASC connection that
      already exists (the one that is already in the OpenConfirm state),
      and accepts the MASC connection initiated by the remote system.

   3. Otherwise, the local system closes the newly created MASC
      connection (the one associated with the newly received OPEN
      message), and continues to use the existing one (the one that is
      already in the OpenConfirm state).

   A connection collision with an existing MASC connection that is in
   the Established state causes unconditional closing of the newly
   created connection. Note that a connection collision cannot be
   detected with connections that are in Idle, or Connect, or Active
   states (see Section 10).

   Closing the MASC connection (that results from the collision
   resolution procedure) is accomplished by sending the NOTIFICATION
   message with the Error Code Cease.

9. MASC Version Negotiation

   MASC speakers may negotiate the version of the protocol by making
   multiple attempts to open a MASC connection, starting with the
   highest version number each supports.
   If an open attempt fails with an Error Code OPEN Message Error, and
   an Error Subcode Unsupported Version Number, then the MASC speaker
   has available the version number it tried, the version number the
   remote node tried, the version number passed by the remote node in
   the NOTIFICATION message, and the version numbers that it supports.
   If the two MASC speakers do support one or more common versions, then
   this will allow them to rapidly determine the highest common version.
   In order to support MASC version negotiation, future versions of MASC
   must retain the format of the OPEN and NOTIFICATION messages.

10. MASC Finite State Machine

   This section specifies MASC operation in terms of a Finite State
   Machine (FSM). The FSM and the operations are per peering session.
   Following is a brief summary and overview of MASC operations by state
   as determined by this FSM.

   Initially the peering session is in the Idle state.

10.1. Open/Close MASC Connection FSM

   Idle state:

      In this state MASC refuses all incoming MASC connections from the
      peer. No resources are allocated to the remote node. In response
      to the Start event (initiated by either system or operator) the
      local system initializes all MASC resources, starts the
      ConnectRetry timer, initiates a transport connection to the remote
      node, while listening for a connection that may be initiated by
      the remote MASC node, and changes its state to Connect. The exact
      value of the ConnectRetry timer is a local matter, but should be
      sufficiently large to allow TCP initialization.

      If a MASC speaker detects an error, it shuts down the connection
      and changes its state to Idle. Getting out of the Idle state
      requires generation of the Start event. If such an event is
      generated automatically, then persistent MASC errors may result in
      persistent flapping of the speaker.
      To avoid such a condition it is recommended that Start events
      should not be generated immediately for a node that was previously
      transitioned to Idle due to an error. For a node that was
      previously transitioned to Idle due to an error, the time between
      consecutive generation of Start events, if such events are
      generated automatically, shall exponentially increase. The value
      of the initial timer shall be 60 seconds. The time shall be
      doubled for each consecutive retry, but shall not be longer than
      24 hours.

      Any other event received in the Idle state is ignored.

   Connect state:

      In this state MASC is waiting for the transport protocol
      connection to be completed.

      If the transport protocol connection succeeds, the local system
      clears the ConnectRetry timer, completes initialization, sends an
      OPEN message to the remote node, and changes its state to
      OpenSent. If the transport protocol connect fails (e.g.,
      retransmission timeout), the local system restarts the
      ConnectRetry timer, continues to listen for a connection that may
      be initiated by the remote MASC node, and changes its state to
      Active state.

      In response to the ConnectRetry timer expired event, the local
      system restarts the ConnectRetry timer, initiates a transport
      connection to the other MASC node, continues to listen for a
      connection that may be initiated by the remote MASC node, and
      stays in the Connect state.

      The Start event is ignored in the Connect state.

      In response to any other event (initiated by either system or
      operator), the local system releases all MASC resources associated
      with this connection and changes its state to Idle.

   Active state:

      In this state MASC is trying to acquire a remote node by listening
      for a transport protocol connection initiated by the remote node.
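
   The spacing of automatically generated Start events described for
   the Idle state (60 seconds initially, doubled for each consecutive
   retry, never longer than 24 hours) can be sketched as follows. The
   function name and its retry-counting convention are hypothetical.

```python
# Sketch only: start_delay() and its argument convention are assumptions.
INITIAL_DELAY = 60            # seconds, initial timer value from the text
MAX_DELAY = 24 * 60 * 60      # 24 hours, upper bound from the text


def start_delay(consecutive_errors: int) -> int:
    """Seconds to wait before the next automatic Start event.

    consecutive_errors is 1 for the first retry after an error.
    """
    if consecutive_errors < 1:
        return 0
    # Double the delay for each further retry, then apply the cap.
    return min(INITIAL_DELAY << (consecutive_errors - 1), MAX_DELAY)
```

   With these values the delay reaches the 24-hour cap at the twelfth
   consecutive retry.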

      If the transport protocol connection succeeds, the local system
      clears the ConnectRetry timer, completes initialization, sends an
      OPEN message to the remote node, sets its Hold Timer to a large
      value, and changes its state to OpenSent. A Hold Timer value of
      [HOLDTIME] seconds is suggested.

      In response to the ConnectRetry timer expired event, the local
      system restarts the ConnectRetry timer, initiates a transport
      connection to the other MASC node, continues to listen for a
      connection that may be initiated by the remote MASC node, and
      changes its state to Connect.

      If the local system detects that a remote node is trying to
      establish a MASC connection to it, and the IP address of the
      remote node is not an expected one, the local system restarts the
      ConnectRetry timer, rejects the attempted connection, continues to
      listen for a connection that may be initiated by the remote MASC
      node, and stays in the Active state.

      The Start event is ignored in the Active state.

      In response to any other event (initiated by either system or
      operator), the local system releases all MASC resources associated
      with this connection and changes its state to Idle.

   OpenSent state:

      In this state MASC waits for an OPEN message from the remote node.
      When an OPEN message is received, all fields are checked for
      correctness. If the MASC message header checking or OPEN message
      checking detects an error (see Section 8.2), or a connection
      collision (see Section 8.8) the local system sends a NOTIFICATION
      message and, if the connection is to be closed, it changes its
      state to Idle.
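
   Whether such a NOTIFICATION closes the connection is signalled by
   the O-bit, which shares one octet with the 7-bit Error Code (see
   Section 7.5). A minimal sketch of encoding and decoding the part of
   the NOTIFICATION that follows the 4-octet message header is given
   below. The function names are hypothetical, and the message header
   itself is not built here.

```python
# Sketch only: covers the NOTIFICATION fields after the message header.
def encode_notification_body(error_code: int, subcode: int,
                             data: bytes = b"",
                             keep_open: bool = False) -> bytes:
    """Pack O-bit, Error Code, Error Subcode and Data."""
    if not 0 <= error_code <= 0x7F:
        raise ValueError("Error Code is a 7-bit field")
    if not 0 <= subcode <= 0xFF:
        raise ValueError("Error Subcode is a 1-octet field")
    if subcode == 0 and keep_open:
        # Unspecific (0) requires the O-bit to be zero.
        raise ValueError("Unspecific subcode must close the connection")
    first = (0x80 if keep_open else 0x00) | error_code  # O-bit is the MSB
    return bytes([first, subcode]) + data


def decode_notification_body(body: bytes):
    """Return (keep_open, error_code, subcode, data)."""
    if len(body) < 2:
        raise ValueError("NOTIFICATION body is at least 2 octets")
    return bool(body[0] & 0x80), body[0] & 0x7F, body[1], body[2:]
```

   The total message length is then 4 + len(body), which matches the
   formula Message Length = 6 + Data Length.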

      If the locally configured role is SIBLING and there is no parent
      domain with Domain ID equal to the Parent Domain ID in the OPEN
      message, the local system sends a NOTIFICATION Open Message Error
      with Error Subcode set to No Common Parent, the connection must be
      closed, and the state of the local system must be changed to Idle.

      If there are no errors in the OPEN message, MASC sends a KEEPALIVE
      message and sets a KeepAlive timer. The Hold Timer, which was
      originally set to a large value (see above), is replaced with the
      negotiated Hold Time value (see Section 7.2). If the negotiated
      Hold Time value is zero, then the Hold Time timer and KeepAlive
      timers are not started. If the value of the MASC Domain ID field
      is the same as the local MASC Domain ID, and if the Role field of
      the OPEN message is set to INTERNAL_PEER, then the connection is
      an "internal" connection; otherwise, it is "external". Finally,
      the state is changed to OpenConfirm.

      If a disconnect notification is received from the underlying
      transport protocol, the local system closes the MASC connection,
      restarts the ConnectRetry timer, continues to listen for a
      connection that may be initiated by the remote MASC node, and goes
      into the Active state.

      If the Hold Timer expires, the local system sends a NOTIFICATION
      message with error code Hold Timer Expired and changes its state
      to Idle.

      In response to the Stop event (initiated by either system or
      operator) the local system sends a NOTIFICATION message with Error
      Code Cease and changes its state to Idle.

      The Start event is ignored in the OpenSent state.

      In response to any other event the local system sends a
      NOTIFICATION message with Error Code Finite State Machine Error
      and Error Subcode Open/Close MASC Connection FSM Error, and
      changes its state to Idle.
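
   The Hold Time negotiation and the resulting KeepAlive timer setting
   referred to above can be sketched as follows. The function names are
   hypothetical. One third of the Hold Time is used because the text
   names it as a reasonable maximum interval; an implementation may
   choose a shorter interval, subject to the one-per-second limit.

```python
# Sketch only: function names are assumptions, not protocol elements.
def negotiate_hold_time(configured: int, received: int) -> int:
    """Smaller of the configured and received Hold Time, in seconds."""
    for value in (configured, received):
        # The Hold Time MUST be either zero or at least three seconds.
        if value != 0 and value < 3:
            raise ValueError("Unacceptable Hold Time")
    return min(configured, received)


def keepalive_interval(hold_time: int):
    """KeepAlive timer period in seconds, or None if timers are off."""
    if hold_time == 0:
        return None                 # no periodic KEEPALIVE messages
    # One third of the Hold Time, but never more than one per second.
    return max(hold_time / 3, 1.0)
```

   For example, a configured Hold Time of 90 seconds and a received
   value of 30 seconds give a negotiated Hold Time of 30 seconds and a
   KeepAlive period of 10 seconds.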
1399 Whenever MASC changes its state from OpenSent to Idle, it closes 1400 the MASC (and transport-level) connection and releases all 1401 resources associated with that connection. 1403 OpenConfirm state: 1405 In this state MASC waits for a KEEPALIVE or NOTIFICATION message. 1407 If the local system receives a KEEPALIVE message, it changes its 1408 state to Established. 1410 If the Hold Timer expires before a KEEPALIVE message is received, 1411 the local system sends a NOTIFICATION message with error code Hold 1412 Timer Expired and changes its state to Idle. 1414 If the local system receives a NOTIFICATION message with the O-bit 1415 zeroed, it changes its state to Idle. 1417 If the KeepAlive timer expires, the local system sends a KEEPALIVE 1418 message and restarts its KeepAlive timer. 1420 If a disconnect notification is received from the underlying 1421 transport protocol, the local system changes its state to Idle. 1423 In response to the Stop event (initiated by either system or 1424 operator) the local system sends a NOTIFICATION message with Error 1425 Code Cease and changes its state to Idle. 1427 The Start event is ignored in the OpenConfirm state. 1429 In response to any other event the local system sends a 1430 NOTIFICATION message with Error Code Finite State Machine Error 1431 and Error Subcode Unspecific, and changes its state to Idle. 1433 Whenever MASC changes its state from OpenConfirm to Idle, it 1434 closes the MASC (and transport-level) connection and releases all 1435 resources associated with that connection. 1437 Established state: 1439 In the Established state MASC can exchange UPDATE, NOTIFICATION, 1440 and KEEPALIVE messages with the remote node. 1442 If the local system receives an UPDATE, or KEEPALIVE message, or 1443 NOTIFICATION message with O-bit set, it restarts its Hold Timer, 1444 if the negotiated Hold Time value is non-zero. 
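The OpenConfirm transitions listed earlier in this section can be summarized as a small lookup table. This is a sketch under stated assumptions: the event names and the (next state, message to send) encoding are hypothetical, since the specification describes events in prose only.

```python
# Hypothetical event names; each maps to (next_state, message_to_send).
OPENCONFIRM_TRANSITIONS = {
    "KEEPALIVE_RECEIVED": ("Established", None),
    "HOLD_TIMER_EXPIRED": ("Idle", "NOTIFICATION(Hold Timer Expired)"),
    "NOTIFICATION_RECEIVED_O_BIT_ZERO": ("Idle", None),
    # The KeepAlive timer is also restarted on this event.
    "KEEPALIVE_TIMER_EXPIRED": ("OpenConfirm", "KEEPALIVE"),
    "TRANSPORT_DISCONNECT": ("Idle", None),
    "STOP": ("Idle", "NOTIFICATION(Cease)"),
    # The Start event is ignored in this state.
    "START": ("OpenConfirm", None),
}

# Any other event is a Finite State Machine Error with subcode Unspecific.
DEFAULT_TRANSITION = (
    "Idle",
    "NOTIFICATION(Finite State Machine Error, Unspecific)",
)


def openconfirm_step(event):
    """Return (next_state, message_to_send) for an event in OpenConfirm."""
    return OPENCONFIRM_TRANSITIONS.get(event, DEFAULT_TRANSITION)
```

Every transition to Idle additionally closes the MASC and transport-level connection and releases the associated resources; the sketch does not model that cleanup.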
1446 If the local system receives a NOTIFICATION message with the 1447 O-bit zeroed, it changes its state to Idle. 1449 If the local system receives an UPDATE message and the UPDATE 1450 message error handling procedure (see Section 8.3) detects an 1451 error, the local system sends a NOTIFICATION message and, if the 1452 O-bit was zeroed, changes its state to Idle. 1454 If a disconnect notification is received from the underlying 1455 transport protocol, the local system changes its state to Idle. 1457 If the Hold Timer expires, the local system sends a NOTIFICATION 1458 message with Error Code Hold Timer Expired and changes its state 1459 to Idle. 1461 If the KeepAlive timer expires, the local system sends a KEEPALIVE 1462 message and restarts its KeepAlive timer. 1464 Each time the local system sends a KEEPALIVE or UPDATE message, it 1465 restarts its KeepAlive timer, unless the negotiated Hold Time 1466 value is zero. 1468 In response to the Stop event (initiated by either system or 1469 operator), the local system sends a NOTIFICATION message with 1470 Error Code Cease and changes its state to Idle. 1472 The Start event is ignored in the Established state. 1474 After entering the Established state, if the local system has 1475 UPDATE messages that are to be sent to the remote node, they must 1476 be sent immediately (see Section 11.8). 1478 In response to any other event, the local system sends a 1479 NOTIFICATION message with Error Code Finite State Machine Error 1480 with the O-bit zeroed and Error Subcode Unspecific, and changes 1481 its state to Idle. 1483 Whenever MASC changes its state from Established to Idle, it 1484 closes the MASC (and transport-level) connection, releases all 1485 resources associated with that connection, and deletes all state 1486 derived from that connection. 1488 11. UPDATE Message Processing 1490 UPDATE messages are accepted only when the system is in the 1491 Established state. 
1493 In the text below, a MASC domain is considered a child of itself with 1494 regard to the claims that are related to the address space with local 1495 usage purpose (i.e. to be used by the MAASs within that domain). For 1496 example, a NEW_CLAIM initiated by a MASC node to obtain more space 1497 for local usage from a prefix managed by that domain will have field 1498 Role = CHILD. 1500 If an UPDATE is to be propagated further, it should not be sent back 1501 to the node that UPDATE was received from, unless there is an 1502 indication that the connection to that node was down and then 1503 restored. 1505 If the local system receives an UPDATE message, and there is no 1506 indication for error, it checks whether to accept or reject the 1507 message, and if it is not rejected, the UPDATE is processed based on 1508 its type. 1510 If an UPDATE message must be associated with a parent domain, then 1511 there must be a PREFIX_MANAGED by some parent domain for a prefix 1512 that covers the prefix of the particular UPDATE. 1514 11.1. Accept/Reject an UPDATE 1516 The Origin Role field is first compared against the local system's 1517 configured Role, according to Table 1, to determine the relationship 1518 of the origin to the local system, where Locally-Configured Role is 1519 the local configuration with regard to the peer-forwarder of the 1520 message. A result of "---" means that receiving such an UPDATE is 1521 illegal and should generate a NOTIFICATION. Any other result is the 1522 value to use as the "Updated" Origin Role when propagating the UPDATE 1523 to others. This is analogous to updating a metric upon receiving a 1524 route, based on the metric of the link. 
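The comparison against Table 1 amounts to a two-key lookup. A minimal sketch follows, assuming roles are represented as strings; the names TABLE_1 and updated_origin_role are hypothetical. It covers only the table itself, not the additional processing that the specification applies after the lookup.

```python
INTERNAL = "INTERNAL"
INTERNAL_PEER = "INTERNAL_PEER"
CHILD = "CHILD"
SIBLING = "SIBLING"
PARENT = "PARENT"

# Outer key: Origin Role carried in the received UPDATE (table rows).
# Inner key: Locally-Configured Role toward the forwarding peer (columns).
# None stands for "---", i.e. an illegal UPDATE.
TABLE_1 = {
    INTERNAL: {INTERNAL_PEER: INTERNAL_PEER, CHILD: PARENT,
               SIBLING: SIBLING, PARENT: CHILD},
    CHILD:    {INTERNAL_PEER: CHILD, CHILD: SIBLING,
               SIBLING: None, PARENT: None},
    SIBLING:  {INTERNAL_PEER: SIBLING, CHILD: None,
               SIBLING: SIBLING, PARENT: CHILD},
    PARENT:   {INTERNAL_PEER: PARENT, CHILD: None,
               SIBLING: PARENT, PARENT: None},
}


def updated_origin_role(origin_role, local_role):
    """Return the Updated Origin Role, or None if the UPDATE is illegal."""
    return TABLE_1[origin_role][local_role]
```

A None result corresponds to the "---" entries, for which a NOTIFICATION should be generated instead of propagating the UPDATE.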
1526                        Locally-Configured Role
1527   Origin
1528   Role     || INTERNAL_PEER | CHILD   | SIBLING | PARENT
1529   =========++===============+=========+=========+=========
1530   INTERNAL || INTERNAL_PEER | PARENT  | SIBLING | CHILD
1531   CHILD    || CHILD         | SIBLING | ---     | ---
1532   SIBLING  || SIBLING       | ---     | SIBLING | CHILD
1533   PARENT   || PARENT        | ---     | PARENT  | ---
1535             Table 1: Updated Origin Role Computation
1537 After the Origin Role is updated, the following additional processing 1538 needs to be applied:
1540 o If the output from the Updated Origin Role Computation is SIBLING, 1541 but the Origin Domain ID is the same as the local MASC domain, the 1542 Updated Origin Role is changed to INTERNAL. This is necessary in 1543 case a MASC node receives from a parent or sibling its own UPDATEs 1544 after reboot, or if because of internal partitioning, the 1545 INTERNAL_PEERs are exchanging UPDATEs via other MASC domains 1546 (either parent or sibling(s)).
1548 o If both Locally-Configured Role, and Origin Role are equal to 1549 PARENT, and the Origin Domain ID is the same as the local MASC 1550 domain, the Updated Origin Role is changed to INTERNAL. This is 1551 necessary to allow a parent to receive its own UPDATEs through its 1552 own children, although the parent might drop those UPDATEs if it 1553 has a reason not to believe its children.
1555 o If both Locally-Configured Role, and Origin Role are equal to 1556 PARENT, and the Origin Domain ID is the same as the remote MASC 1557 domain, and the UPDATE type is CLAIM_DENIED, the Updated Origin 1558 Role is changed to INTERNAL. This is necessary to allow a parent 1559 to receive the CLAIM_DENIED it has originated through the child 1560 whose claim was denied. 
If the Origin Domain ID is not same as 1561 the remote MASC domain, but is same as some of the other MASC 1562 children domains, the Updated Origin Role still should be changed 1563 to INTERNAL, although the parent might drop this UPDATE if it has 1564 a reason not to believe a third party child. 1566 If the Updated Origin Role is INTERNAL, but the Origin Domain ID 1567 differs from the local Domain ID, a NOTIFICATION of must be sent back, and the claim is 1569 rejected. 1571 If Claim Timestamp and Claim Holdtime indicate that the claim has 1572 expired (e.g. Timestamp + Claim Holdtime <= CurrentTime), the UPDATE 1573 is silently dropped and no further actions are taken. 1575 Each new arrival UPDATE is compared with all claims in the local 1576 cache. The following fields are compared, and if all of them are the 1577 same, the message is silently rejected and no further actions are 1578 taken: 1580 o Role, D-bit, Type 1582 o AddrFam 1584 o Claim Timestamp 1586 o Claim Lifetime 1588 o Claim Holdtime 1590 o Origin Domain Identifier 1592 o Origin Node Identifier 1594 o Address 1596 o Mask 1598 Further processing of an UPDATE is based on its type and the Updated 1599 Origin Role. 1601 11.2. PREFIX_IN_USE Message Processing 1603 11.2.1. PREFIX_IN_USE by PARENT 1605 The claim is rejected, and a NOTIFICATION of should be sent back. 1608 11.2.2. PREFIX_IN_USE by SIBLING 1610 If the claim cannot be associated with any parent's PREFIX_MANAGED, 1611 the claim is dropped, a NOTIFICATION of must be sent back and no further actions 1613 should be taken. 1615 If the claim collides with some of the local domain's pending claims, 1616 the local claims must not be considered further, and the Claim-Timer 1617 of each of them must be canceled. If the received PREFIX_IN_USE claim 1618 clashes with and wins over some of the local domain's allocated 1619 prefixes, resolve the clash according to Section 12.4. 
Finally, the 1620 claim must be propagated further to all INTERNAL_PEERs, all MASC 1621 nodes from the corresponding parent MASC domain and all known 1622 siblings with the same parent domain. 1624 11.2.3. PREFIX_IN_USE by CHILD 1626 If the claim's prefix is not a subrange of any of the local domain's 1627 PREFIX_MANAGED, the claim is dropped, a NOTIFICATION of must be sent back and no 1629 further actions should be taken. Otherwise, the claim must be 1630 propagated further to all INTERNAL_PEERs and all MASC children 1631 domains. 1633 11.2.4. PREFIX_IN_USE by INTERNAL_PEER 1635 If the MASC node decides that the local domain does not need that 1636 prefix any more, it may be withdrawn, otherwise, the claim is 1637 processed as PREFIX_MANAGED. 1639 11.3. CLAIM_DENIED Message Processing 1641 11.3.1. CLAIM_DENIED by CHILD or SIBLING 1643 The message is rejected, and a NOTIFICATION of should be sent back. 1646 11.3.2. CLAIM_DENIED by INTERNAL_PEER 1648 Propagate to all INTERNAL_PEERs and all MASC children nodes. 1650 11.3.3. CLAIM_DENIED by PARENT 1652 If the Origin Domain ID is not same as the local domain ID, and the 1653 UPDATE cannot be associated with any parent domain, the message is 1654 dropped, a NOTIFICATION of must be sent back and no further actions should be 1656 taken. 1658 If the Origin Domain ID is not same as the local domain ID, and the 1659 UPDATE can be associated with a parent domain, the message is 1660 propagated to all nodes from that parent domain, all INTERNAL_PEERs, 1661 and all known SIBLINGs with regard to that parent. 1663 If the Origin Domain ID is same as the local domain ID, and there is 1664 no corresponding pending claim originated by the local MASC domain 1665 (i.e. a NEW_CLAIM or CLAIM_TO_EXPAND with same AddrFam, Origin Domain 1666 ID, Claim Timestamp, Address and Mask), a NOTIFICATION of must be sent back and 1668 no further actions should be taken. 
Otherwise, the matching NEW_CLAIM 1669 or CLAIM_TO_EXPAND's Claim-Timer must be canceled and the claim must 1670 not be considered further. Finally, the received CLAIM_DENIED must be 1671 propagated to all INTERNAL_PEERs, all MASC nodes from the 1672 corresponding parent MASC domain, and all known SIBLINGs with regard 1673 to that parent. 1675 11.4. CLAIM_TO_EXPAND Message Processing 1677 11.4.1. CLAIM_TO_EXPAND by PARENT 1679 The claim is rejected, and a NOTIFICATION of should be sent back. 1682 11.4.2. CLAIM_TO_EXPAND by SIBLING 1684 If the claim cannot be associated with any parent's PREFIX_MANAGED, 1685 the claim is dropped, a NOTIFICATION of must be sent back and no further actions 1687 should be taken. 1689 If there is no overlapping PREFIX_IN_USE by the same MASC domain, the 1690 claim is dropped, a NOTIFICATION of must be sent back and no further actions 1692 should be taken. 1694 If the claim collides with and wins over some of the local domain's 1695 pending claims, the losing claims must not be considered further, and 1696 the Claim-Timer of each of them must be canceled. Also, the 1697 received claim must be propagated further to all INTERNAL_PEERs, all 1698 MASC nodes from the corresponding parent MASC domain and all known 1699 siblings with the same parent domain. 1701 11.4.3. CLAIM_TO_EXPAND by CHILD 1703 If the claim cannot be associated with any of the local domain's 1704 PREFIX_MANAGED, the claim is dropped, a NOTIFICATION of must be sent back and no 1706 further actions should be taken. 1708 If there is no overlapping PREFIX_IN_USE by the same MASC domain, the 1709 claim is dropped, a NOTIFICATION of must be sent back and no further actions 1711 should be taken. 1713 Otherwise, the claim has to be propagated to all INTERNAL_PEERs. 
If 1714 the lifetime of the claim is longer than the lifetime of the 1715 corresponding prefix managed by the local domain, or if there is an 1716 administratively configured reason to prevent the child from 1717 successfully allocating the claimed prefix, a CLAIM_DENIED must be sent 1718 to all MASC children nodes that have the same Domain ID as the Origin Domain 1719 ID in the received message. The CLAIM_DENIED must be the same as the 1720 received claim, except that Role = INTERNAL, and Claim Lifetime should be set 1721 to the maximum allowed lifetime. Otherwise, propagate the claim to 1722 all children as well. 1724 11.4.4. CLAIM_TO_EXPAND by INTERNAL_PEER 1726 If the claim cannot be associated with any parent's PREFIX_MANAGED, 1727 the claim is dropped, a NOTIFICATION of must be sent back and no further actions 1729 should be taken. 1731 If there is no overlapping PREFIX_IN_USE by the local MASC domain, 1732 the claim is dropped, a NOTIFICATION of must be sent back and no further actions 1734 should be taken. 1736 If the MASC node decides that the local domain does not need that 1737 pending claim any more, it MAY be withdrawn. Otherwise, the claim 1738 must be propagated to all INTERNAL_PEERs and all MASC nodes from the 1739 corresponding parent MASC domain. 1741 11.5. NEW_CLAIM Message Processing 1743 If the claim's Address field is 0 (i.e. a hint by a child to a parent 1744 to obtain more space), the claim should be propagated only among the 1745 nodes that belong to the child Origin Domain and the parent domain. 1747 Otherwise, process like CLAIM_TO_EXPAND, except that no check for 1748 overlapping PREFIX_IN_USE needs to be performed. 1750 11.6. PREFIX_MANAGED Message Processing 1752 11.6.1. PREFIX_MANAGED by PARENT 1754 If the Origin Domain ID matches one of the parents' Domain IDs, the 1755 prefix is recorded, and can be used by the address allocation 1756 algorithm for allocating subranges. 
Also, the message is propagated 1757 to all MASC nodes of the corresponding parent domain, all 1758 INTERNAL_PEERs, and SIBLINGs with same parent. 1760 11.6.2. PREFIX_MANAGED by CHILD or SIBLING 1762 The message is rejected, and a NOTIFICATION of should be sent back. 1765 11.6.3. PREFIX_MANAGED by INTERNAL_PEER 1767 The prefix is recorded as allocated to the local domain, propagated 1768 to all INTERNAL_PEERs, and can be used for (all items apply): 1770 a) address ranges/prefixes advertisements to all MASC children and 1771 local domain's MAASs; 1773 b) injection into G-RIB; 1775 c) further expansion by the address allocation algorithm (see 1776 Appendix A); 1778 11.7. WITHDRAW Message Processing 1780 11.7.1. WITHDRAW by CHILD 1782 If the WITHDRAW cannot be associated with any of the child domain's 1783 PREFIX_IN_USE (i.e. no child's PREFIX_IN_USE covers WITHDRAW's 1784 range), or if the WITHDRAW does not match any of the child domain's 1785 NEW_CLAIM or CLAIM_TO_EXPAND (i.e. there is no child's claim with 1786 same Address, Mask and Timestamp), the message is dropped, a 1787 NOTIFICATION of 1788 must be sent back and no further actions should be taken. Otherwise, 1789 propagate to all INTERNAL_PEERs and children. 1791 11.7.2. WITHDRAW by SIBLING 1793 If the WITHDRAW cannot be associated with any of the siblings' 1794 PREFIX_IN_USE (i.e. no sibling's PREFIX_IN_USE covers WITHDRAW's 1795 range), or if the WITHDRAW does not match any of the sibling domain's 1796 NEW_CLAIM or CLAIM_TO_EXPAND (i.e. there is no sibling's claim with 1797 same Address, Mask and Timestamp), the message is dropped, a 1798 NOTIFICATION of 1799 must be sent back and no further actions should be taken. Otherwise, 1800 propagate to all INTERNAL_PEERs, all MASC nodes from the same parent 1801 MASC domain and all known siblings with the same parent domain. 1803 11.7.3. 
WITHDRAW by INTERNAL 1805 If the WITHDRAW cannot be associated with any of the local domain's 1806 PREFIX_IN_USE or PREFIX_MANAGED (i.e. no local domain's prefix covers 1807 WITHDRAW's range), or if the WITHDRAW does not match any of the local 1808 domain's NEW_CLAIM or CLAIM_TO_EXPAND (i.e. there is no local 1809 domain's claim with same Address, Mask and Timestamp) the message is 1810 dropped, a NOTIFICATION of must be sent back and no further actions should be 1812 taken. 1814 Otherwise, propagate to all INTERNAL_PEERs, all MASC nodes of the 1815 corresponding parent domain of that prefix, all known siblings with 1816 that parent domain, and all children. If the WITHDRAW can be 1817 associated with some of local domain's PREFIX_IN_USE or 1818 PREFIX_MANAGED, stop advertising the WITHDRAW range to the MAASs and 1819 withdraw that range from the G-RIB database. In the special case 1820 when there is an indication that the WITHDRAW has been originated by 1821 the local domain because of a clash, and the range specified in 1822 WITHDRAW is a subrange of the local PREFIX_MANAGED, and the Claim 1823 Holdtime of WITHDRAW is shorter than the Claim Holdtime of 1824 PREFIX_MANAGED, the WITHDRAW's range should not be withdrawn from the 1825 G-RIB. If the WITHDRAW matches a local domain's NEW_CLAIM or 1826 CLAIM_TO_EXPAND, cancel the matching claim's Claim-Timer. 1828 11.7.4. WITHDRAW by PARENT 1830 If the WITHDRAW cannot be associated with any parent domain, a 1831 NOTIFICATION of 1832 must be sent back and no further actions should be taken. 1834 Otherwise, propagate to all INTERNAL_PEERs and all known siblings 1835 with the same parent domain. Also, originate a WITHDRAW message for 1836 each intersection of a locally owned PREFIX_MANAGED/PREFIX_IN_USE and 1837 the received WITHDRAW. 
The locally originated WITHDRAW message's 1838 Claim Holdtime should be at least equal to the Claim Holdtime in the 1839 WITHDRAW message received from the parent; the Origin Node ID should 1840 be the same as that of the particular PREFIX_MANAGED/PREFIX_IN_USE. 1842 11.8. UPDATE Message Ordering 1844 To simplify consistency and sanity check implementations, if there is 1845 more than one UPDATE message that needs to be sent to a peer (for 1846 example, after a connection (re)establishment), some of the UPDATEs 1847 must be sent before others. 1849 The rules that always apply are: 1851 o PREFIX_IN_USE must always be sent BEFORE CLAIM_TO_EXPAND, 1852 NEW_CLAIM, and WITHDRAW by the same MASC domain 1854 o WITHDRAW must always be sent AFTER PREFIX_IN_USE, CLAIM_TO_EXPAND, 1855 NEW_CLAIM, and PREFIX_MANAGED by the same MASC domain 1857 Any further ordering is defined below by the roles of the sender and 1858 the receiver. 1860 11.8.1. Parent to Child 1862 Messages are sent in the following order: 1864 1) Parent's PREFIX_MANAGED and WITHDRAWs. 1866 2) All children's PREFIX_IN_USE, CLAIM_TO_EXPAND, and NEW_CLAIMs. 1867 CLAIMs from third party children that are hints for more space 1868 (i.e. address = 0) should not be propagated; if propagated, the 1869 child should drop them. 1871 3) Parent initiated CLAIM_DENIED and children initiated WITHDRAWs. 1872 CLAIM_DENIED regarding third party children's claims/hints with 1873 address = 0 should not be propagated; if propagated, the child 1874 should drop them. 1876 11.8.2. Child to Parent 1878 Messages are sent in the following order: 1880 1) Parent's PREFIX_MANAGED and WITHDRAWs. 1882 2) All PREFIX_IN_USE, CLAIM_TO_EXPAND, and NEW_CLAIMs from that 1883 parent's space, initiated by that child and all its siblings. 1885 3) Parent initiated CLAIM_DENIED, and all WITHDRAWs that can be 1886 associated with that parent's space and are initiated by the local 1887 domain or all known siblings with that parent. 1889 11.8.3. 
Sibling to Sibling 1891 Messages are sent in the following order: 1893 1) All common parent's PREFIX_MANAGED and WITHDRAWs. 1895 2) PREFIX_IN_USE, CLAIM_TO_EXPAND, and NEW_CLAIMs, initiated by 1896 siblings. 1898 3) CLAIM_DENIEDs initiated by common parent, and WITHDRAWs initiated 1899 by local domain and all known siblings with that parent. 1901 11.8.4. Internal to Internal 1903 Messages are sent in the following order: 1905 1) All parents' PREFIX_MANAGED and WITHDRAWs. 1907 2) Local domain's and all siblings' PREFIX_IN_USE, CLAIM_TO_EXPAND, 1908 and NEW_CLAIMs. CLAIMs from siblings that are hints for more 1909 space (i.e. address = 0) should not be propagated; if propagated, 1910 the recipient should drop them. 1912 3) CLAIM_DENIEDs initiated by all parents, and WITHDRAWs initiated by 1913 local domain and all known siblings. 1915 4) All children's PREFIX_IN_USE, CLAIM_TO_EXPAND, and NEW_CLAIMs. 1917 5) All local domain initiated CLAIM_DENIED regarding children claims 1918 and all children initiated WITHDRAWs. 1920 12. Operational Considerations 1922 12.1. Bootup Operations 1924 To learn about its parent domains' IDs and prefixes, a MASC node 1925 SHOULD try to establish connections to its PARENT nodes before 1926 initiating a connection to a SIBLING node. To avoid learning about 1927 its own PREFIX_MANAGED from its children or siblings, a MASC node 1928 SHOULD try to establish connections to its PARENT nodes and 1929 INTERNAL_PEER nodes before initiating a connection to a CHILD or 1930 SIBLING node. 1932 12.2. Leaf and Non-leaf MASC Domain Operation 1934 A non-leaf MASC domain (i.e. a domain that has children domains) 1935 should advertise its PREFIX_MANAGED addresses to its children, and 1936 should claim from that space the sub-ranges that would be advertised 1937 to the internal MAASs (the claim wait time SHOULD be equal to 1938 [WAITING_PERIOD]). 
A MASC node that belongs to a non-leaf MASC 1939 domain should perform dual functions by being a child of itself with 1940 regard to the claiming and management of the sub-ranges for local 1941 usage. A leaf MASC domain should advertise all PREFIX_MANAGED 1942 addresses to its MAASs without explicitly claiming them for internal 1943 usage. A MASC node can assume that it belongs to a leaf domain if it 1944 simply does not have any UPDATEs by children domains. If an UPDATE 1945 by a child is received, the domain MUST switch from "leaf" to "non- 1946 leaf" mode, and if it needs more addresses for internal usage, it 1947 MUST claim them from that domain's PREFIX_MANAGED. After the last 1948 UPDATE originated by a child expires, the domain can switch back to 1949 "leaf" mode. 1951 12.3. Clock Skew Workaround 1953 Each UPDATE has "Claim Timestamp" field that is set to the absolute 1954 time of the MASC node that originated that UPDATE. The timestamp is 1955 used for two purposes: to resolve collisions, and to define how long 1956 an UPDATE should be kept in the local cache of other MASC nodes. A 1957 skew in the clock could result in unfair collision decision such that 1958 the claims originated by nodes that have their clock behind the real 1959 time will always win; however, because collisions are presumably 1960 rare, this will not be an issue. Skew in the clock however might 1961 result in expiring an UPDATE earlier than it really should be 1962 expired, and a node might assume too early that the expired 1963 UPDATE/prefix is free for allocation. To compensate for the clock 1964 skew, an UPDATE message should be kept longer than the amount of time 1965 specified in the Claim Holdtime. For example, keeping UPDATEs for an 1966 additional 24 hours will compensate for clock skew for up to 24 1967 hours. 1969 12.4. 
Clash Resolving Mechanism 1971 If a MASC node receives a PREFIX_IN_USE claim originated by a sibling 1972 and the claim overlaps with some of the local prefixes, the clash 1973 must be resolved. Two MASC domains should not manage overlapping 1974 address ranges, unless the domains have an ancestor-descendant (e.g. 1976 parent-child) relationship in the MASC hierarchy. Also, two MASC 1977 domains should not have locally-allocated overlapping address ranges. 1978 The clashed address ranges should not be advertised to the MAASs or 1979 allocated to multicast applications/sessions. If a clashed address 1980 has been allocated to an application, the application should be 1981 informed to stop using that address and switch to a new one. 1983 The G-RIB database must be consistent, such that it does not have 1984 ambiguous entries. "Ambiguous G-RIB entries" are those entries that 1985 might cause the multicast routing protocol to loop or lose 1986 connectivity. In MASC the WITHDRAW message is used to solve this 1987 problem. When a clashing PREFIX_IN_USE is received, it is compared 1988 (using the function described in Section 5.1.1) against all prefixes 1989 allocated to the local domain. If the local PREFIX_IN_USE is the 1990 winner, no further actions are taken. If the local PREFIX_IN_USE is 1991 the loser, the clashing address range must be withdrawn by initiating 1992 a WITHDRAW message. The message must have Role = INTERNAL; its Origin 1993 Node ID and Origin Domain ID must be the same as in the corresponding 1994 local PREFIX_IN_USE message, while Claim Timestamp, Claim Lifetime, 1995 Claim Holdtime, Address and Mask must be the same as in the received 1996 winning PREFIX_IN_USE. The initiated WITHDRAW message must be 1997 processed as described in Section 11.7. 1999 If a cached WITHDRAW times out and the local MASC domain owns an 2000 overlapping PREFIX_MANAGED or PREFIX_IN_USE, the overlapping prefix 2001 ranges can be injected back into the G-RIB database. 
Similarly, the 2002 address ranges that were not advertised to the local domain's MAASs 2003 due to the WITHDRAW, can now be advertised again. 2005 In addition to the automatic resolving of clashes, a MASC 2006 implementation should support manual resolving of clashes. For 2007 example, after a clash is detected, the network administrator should 2008 be informed that a clash has occurred. The specific manual 2009 mechanisms are outside the scope of this protocol. 2011 A MASC node must be configured to operate using either manual or 2012 automatic clash resolution mechanisms. 2014 12.5. Changing Network Providers 2016 If a MASC domain changes a network provider, such that the old 2017 provider cannot be used to provide connectivity, any traffic for 2018 sessions that are in progress and use that MASC domain as the root of 2019 multicast distribution trees will not be able to reach that domain. 2021 If the new network provider is willing to carry the traffic for the 2022 old sessions rooted at the customer domain, then it must propagate 2023 the customer's old prefixes through the G-RIB. However, at least one 2024 MASC node in the customer domain must maintain a TCP connection to 2025 one of the old network provider's MASC nodes. Thus, it can continue 2026 to "defend" the customer's prefixes, and should continue until the 2027 old prefixes' lifetimes expire. 2029 If the new network provider is not willing to propagate the old 2030 prefixes, then the customer should remove its prefixes from the G- 2031 RIB. If BGMP is in use, the old network provider's domain will 2032 automatically become the Root Domain for the customer's old groups 2033 due to the lack of a more specific group route. MASC nodes in the 2034 customer domain MAY still connect with the old provider's MASC nodes 2035 to defend their allocation. 2037 12.6. Debugging 2039 12.6.1. 
Prefix-to-Domain Lookup 2041 Use mtrace [MTRACE] to find the BGMP/MASC root domain for a group 2042 address chosen from that prefix. 2044 12.6.2. Domain-to-Prefix Lookup 2046 We can find the address space allocated to a particular MASC domain 2047 by directly querying one of the MASC servers within that domain, by 2048 observing the state in parents, siblings, or children MASC domains, 2049 or by observing the G-RIB information originated by that domain. 2050 Of these three methods, the first can provide the most 2051 detailed information. Finding the address of one of the MASC nodes 2052 within a particular domain is outside the scope of MASC. 2054 13. MASC Storage 2056 In general, MASC will be run by border routers, which typically 2057 do not have stable storage. In this case, MASC must use the Layer 2 2058 protocol/mechanism (e.g., [AAP]) as described in [MALLOC] to store 2059 the important information (the prefixes allocated by the local 2060 domain) in the domain's MAASs, which should have stable storage. If the 2061 MASC speaker has local storage, it should use it instead of the Layer 2062 2 protocol/mechanism. Claims that are in progress do not have to be 2063 saved by using the Layer 2 protocol/mechanism. 2065 14. Security Considerations 2067 IPsec [IPSEC] can be used to address security concerns between two 2068 MASC peering nodes. However, because of the store-and-forward nature 2069 of the UPDATE messages, it is possible that if a non-trustworthy MASC 2070 node can connect to some point of the MASC topology, then this node 2071 can undetectably inject malicious UPDATEs that may disturb the normal 2072 operation of other MASC nodes. To address this problem, each MASC 2073 node should allow peering only with trustworthy nodes. 2075 After a reboot, a MASC node/domain can restore its state from its 2076 neighbors (internal peers, parents, siblings, children). 
Typically, 2077 the state received from a parent or internal peer will be 2078 trustworthy, but a node may choose to drop its own UPDATEs that were 2079 received through a sibling or a child. 2081 A misbehaving node may attempt a Denial of Service attack by sending 2082 a large number of colliding messages that would prevent any of its 2083 siblings from allocating more addresses. A single mis-behaving node 2084 can easily be identified by all of its siblings, and all of its 2085 UPDATEs can be ignored. A Denial of Service attack that uses 2086 multiple origin addresses can be prevented if a third-party UPDATE 2087 (e.g. by a non-directly connected sibling) is accepted only if it is 2088 sent via the common parent domain, and the MASC nodes in the parent 2089 domain accept children UPDATEs only if they come via an internal 2090 peer, or come directly from a child node that is same as the Origin 2091 Node ID. 2093 15. IANA Considerations 2095 This document defines several number spaces (MASC message types, MASC 2096 OPEN message optional parameters types, MASC UPDATE message attribute 2097 types, MASC UPDATE message optional parameters types, and MASC 2098 NOTIFICATION message error codes and subcodes). For all of these 2099 number spaces, certain values are defined in this specification. New 2100 values may only be defined by IETF Consensus, as described in [IANA- 2101 CONSIDERATIONS]. Basically, this means that they are defined by RFCs 2102 approved by the IESG. 2104 16. Acknowledgments 2106 The authors would like to thank the participants of the IETF for 2107 their assistance with this protocol. 2109 17. APPENDIX A: Sample Algorithms 2111 DISCLAIMER: This section describes some preliminary suggestions by 2112 various people for algorithms which could be used with MASC. 2114 17.1. Claim Size and Prefix Selection Algorithm 2116 This section covers the algorithms used by a MASC node (on behalf of 2117 a MASC domain) to satisfy the demand for multicast addresses. 
The 2118 allocated addresses should be aggregatable, the address utilization 2119 should be reasonably high, and the allocation latency to the MAASs 2120 should be shorter than [WAITING_PERIOD] whenever possible. 2122 17.1.1. Prefix Expansion 2124 For ease of implementation and troubleshooting, MASC should use 2125 contiguous masks to specify the address ranges, i.e. prefixes. 2126 (Research indicates that sufficiently good results can be achieved 2127 using contiguous masks only.) The chosen prefixes should be as 2128 expandable as possible. The method used to choose the children sub- 2129 prefixes from the parent's prefix is the so called Reverse Bit 2130 Ordering (idea by Dave Thaler; inspired by Kampai [KAMPAI]). For 2131 example, if the parent's prefix width is four bits, the addresses of 2132 the sub-prefixes are chosen in the following order: 2134 Parent: xxxx 2136 Child A: 0000 2137 Child B: 1000 2138 Child C: 0100 2139 Child D: 1100 2141 If some of the children need to expand their sub-prefix, they try to 2142 double the corresponding sub-prefix starting from the right: 2144 Child A: 000x 2145 Child A: 00xx 2146 Child D: 110x 2147 Child D: 11xx 2149 and so on. 2151 However, because the address ordering is very strict, to reduce the 2152 probability for collision, when a new sub-prefix has to be chosen, 2153 the choice should be random among all candidates with the same 2154 potential for expandability. For example, if the free sub-prefixes 2155 are 01xx, 10xx, 110x, then the new prefix to claim should be chosen 2156 with probability of 50% for 01xx and 50% for 10xx for example. 2158 17.1.2. Reducing Allocation Latency 2160 To reduce the allocation latency, a MASC node uses pre-allocation. 2161 It constantly monitors the demand for addresses from its children (or 2162 MAASs), and predicts what would be the address usage after 2163 [WAITING_PERIOD]. 
   Only if the available addresses will be used up within
   [WAITING_PERIOD] does a MASC node claim more addresses in advance.

17.1.3. Address Space Utilization

   Because every prefix size is a power of two, if a node tries to
   allocate just a single prefix, the utilization at that node (i.e., at
   that node's domain) can be as low as 50%.  To improve the
   utilization, a MASC node can have more than one prefix allocated at a
   time (typically, each of them with a different size).  By using
   pre-allocation and allocating several prefixes of different sizes
   (see below), a MASC node should try to keep its address utilization
   in the range 70-90%.

17.1.4. Prefix Selection After Increase of Demand

   To additionally reduce the allocation latency by reducing the
   probability of collision, and to improve the aggregability of the
   allocated addresses, a MASC node carefully chooses the prefixes to
   claim.  The first prefix is chosen at random among all reasonably
   expandable candidates.  If a node chooses to allocate another,
   smaller prefix, then, instead of doubling the size of the first one,
   which might significantly reduce the address utilization, a second
   "neighbor" prefix is chosen.  For example, if prefix 224.0/16 was
   already allocated, and the MASC domain needs 256 more addresses, the
   second prefix to claim will be 224.1.0/24.  If the domain needs more
   addresses, the second prefix will eventually grow to 224.1/16, and
   then both prefixes can be automatically aggregated into 224.0/15.
   Only if 224.1.0/24 could not be allocated will a MASC node choose
   another prefix (eventually at random among the unused prefixes).
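
   The neighbor-prefix choice and the later aggregation in the example
   above can be sketched as follows.  This is only an illustration
   built on Python's standard ipaddress module, not part of the
   protocol: the helper names neighbor_block and
   first_claim_in_neighbor are hypothetical, prefixes are written as
   full dotted quads (224.0.0.0/16 rather than 224.0/16), and the
   claim-collide exchange that would confirm the claim is omitted.

```python
import ipaddress


def neighbor_block(allocated):
    """Return the equal-sized block that shares a one-bit-shorter parent."""
    parent = allocated.supernet(prefixlen_diff=1)
    low, high = parent.subnets(prefixlen_diff=1)
    return high if allocated == low else low


def first_claim_in_neighbor(allocated, needed_addresses):
    """Smallest prefix at the start of the neighbor block covering the need."""
    block = neighbor_block(allocated)
    # Host bits for the smallest power of two >= needed_addresses.
    host_bits = max(needed_addresses - 1, 0).bit_length()
    # Never claim more than the neighbor block itself.
    new_len = max(32 - host_bits, block.prefixlen)
    return ipaddress.ip_network((block.network_address, new_len))


first = ipaddress.ip_network("224.0.0.0/16")

# The domain needs 256 more addresses: claim at the start of the neighbor.
second = first_claim_in_neighbor(first, 256)
print(second)  # 224.1.0.0/24

# Once the second prefix has grown to the whole neighbor block,
# the two prefixes collapse into a single shorter prefix.
grown = neighbor_block(first)
print(list(ipaddress.collapse_addresses([first, grown])))
# [IPv4Network('224.0.0.0/15')]
```

   Placing the second claim at the start of the neighbor block is what
   keeps it expandable up to the size of the first prefix, which is the
   precondition for the aggregation shown in the last step.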

   If the number of allocated prefixes increases above some threshold,
   and none of them can be extended when more addresses are needed,
   then, to reduce the amount of state, a MASC node should claim a new
   larger prefix and should stop re-claiming the older non-expandable
   prefixes.  Research results show that up to three prefixes per MASC
   domain is a reasonable threshold, such that the address utilization
   can be in the range 70-90%, and at the same time the prefix flux will
   be reasonably low.

17.1.5. Prefix Selection After Decrease of Demand

   If the demand for addresses decreases such that a MASC node's address
   space is under-utilized, the node implicitly returns the unused
   prefixes after their lifetimes expire, or re-claims some smaller
   sub-prefixes.  For example, if prefix 224.0/15 is 50% used by the
   MAASs and/or children MASC domains, and the overall utilization is
   such that approximately 2^16 (64K) addresses should be returned, a
   MASC node should stop re-claiming 224.0/15 and should start
   re-claiming either 224.0/16 or 224.1/16 (whichever sub-prefix has the
   higher utilization).

17.1.6. Lifetime Extension Algorithm

   If the demand for addresses did not decrease, then a MASC node
   re-claims the prefixes it has allocated before their lifetime
   expires.  Each prefix (or sub-prefix if the demand has decreased)
   should be re-claimed every 48 hours.

18. APPENDIX B: Strawman Deployment

   At the moment of writing, 225.0.0.0-225.255.255.255 is temporarily
   allocated to MALLOC.  Presumably this block of addresses will be used
   for experimental deployment and testing.

   If MASC were widely deployed on the Internet, we might expect numbers
   similar to the following:

   o  Initially, there will be approximately 128 Top-Level Domains
      (TLDs).

   o  Assume initially approximately 8192 level-2 MASC domains; on
      average, a TLD will have approximately 64 children domains.

   o  MASC managed global addresses:

      The following (large) ranges are not allocated yet (2^N represents
      the size of the contiguous mask prefixes):

         225.0.0.0 - 231.255.255.255 = 2^26 + 2^25 + 2^24
         234.0.0.0 - 238.255.255.255 = 2^25 + 2^25 + 2^24
         ---------------------------
         Total: 12*2^24 addresses

      Initially, the range 228.0.0.0 - 231.255.255.255 (4*2^24 = 2^26 =
      64M) could be used by MASC as the global addresses pool.  The rest
      (8*2^24) should be reserved.  Part of it could be added later to
      MASC, or could be used to enlarge the pool of administratively
      scoped addresses (currently 239.X.X.X), or the pool for static
      allocation (233.X.X.X).

   o  If the multicast addresses are evenly distributed, each TLD would
      have a maximum of 2^19 (512K) addresses, while each level-2 MASC
      domain would have 8192 addresses.

   o  Initial claim size: 256 addresses/MASC domain

   o  Could use soft and hard thresholds to specify the maximum amount
      of claimed+allocated addresses per domain.  For example, trigger a
      warning message if the claimed+allocated addresses of a domain are
      >= 1.0*average_assumed_per_domain (a strawman default soft
      threshold):

      *  if a TLD claim+allocation >= 512K
      *  if a second level MASC domain claim+allocation >= 8K

      The hard threshold (for example, 2.0*average_assumed_per_domain)
      can be enforced by sending an explicit CLAIM_DENIED message.

      The TLD's thresholds (with regard to the claims by the second
      level MASC domains) are a private matter and are part of the
      particular TLD's policy: the thresholds could be per customer, and
      the warnings to the administrators could be a signal that it is
      time to change the policy.

   o  Initial claim lifetime is of the order of 30 days.  Prefix
      lifetime is periodically (every 48 hours) reclaimed/extended,
      unless the prefix is under-utilized (see APPENDIX A).
      Because the allocation is demand-driven, the allocated prefix
      lifetime will be automatically extended if the MAASs need a longer
      prefix lifetime (e.g., 3-6 months).

   o  A level-2 MASC domain could have children (i.e., level-3) MASC
      domains.

   o  If a level-2 or level-3 MASC domain uses fewer than 128 addresses,
      a Layer 2 protocol/mechanism (e.g., AAP) should be run between
      that domain and its parent MASC domain.

19. Authors' Addresses

   Deborah Estrin
   Computer Science Department
   University of Southern California/ISI
   Los Angeles, CA 90089
   USA
   Email: estrin@isi.edu

   Ramesh Govindan
   University of Southern California/ISI
   4676 Admiralty Way
   Marina Del Rey, CA 90292
   USA
   Email: govindan@isi.edu

   Mark Handley
   AT&T Center for Internet Research at ICSI (ACIRI)
   1947 Center St., Suite 600
   Berkeley, CA 94704-119
   USA
   Email: mjh@aciri.org

   Satish Kumar
   Computer Science Department
   University of Southern California/ISI
   Los Angeles, CA 90089
   USA
   Email: kkumar@usc.edu

   Pavlin Radoslavov
   Computer Science Department
   University of Southern California/ISI
   Los Angeles, CA 90089
   USA
   Email: pavlin@catarina.usc.edu

   David Thaler
   Microsoft
   One Microsoft Way
   Redmond, WA 98052
   USA
   Email: dthaler@microsoft.com

20. References

   [AAP]
   Handley, M. and S. Hanna, "Multicast Address Allocation Protocol
   (AAP)", draft-ietf-malloc-aap-04.txt, June 2000.  Work in progress.

   [API]
   Finlayson, R., "An Abstract API for Multicast Address Allocation",
   RFC 2771, February 2000.

   [BGMP]
   Thaler, D., Estrin, D. and D. Meyer, "Border Gateway Multicast
   Protocol (BGMP): Protocol Specification",
   draft-ietf-bgmp-spec-01.txt, March 2000.  Work in progress.

   [BGP]
   Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 (BGP-4)",
   RFC 1771, March 1995.

   [CIDR]
   Rekhter, Y. and C. Topolcic, "Exchanging Routing Information Across
   Provider Boundaries in the CIDR Environment", RFC 1520, September
   1993.

   [IANA]
   Reynolds, J. and J. Postel, "Assigned Numbers", RFC 1700, October
   1994.

   [IANA-CONSIDERATIONS]
   Alvestrand, H. and T. Narten, "Guidelines for Writing an IANA
   Considerations Section in RFCs", BCP 26, RFC 2434, October 1998.

   [IPSEC]
   Kent, S. and R. Atkinson, "Security Architecture for the Internet
   Protocol", RFC 2401, November 1998.

   [KAMPAI]
   Tsuchiya, P., "Efficient and Flexible Hierarchical Address
   Assignment", INET92, June 1992, pp. 441-450.

   [MADCAP]
   Hanna, S., Patel, B. and M. Shah, "Multicast Address Dynamic Client
   Allocation Protocol (MADCAP)", RFC 2730, December 1999.

   [MALLOC]
   Thaler, D., Handley, M. and D. Estrin, "The Internet Multicast
   Address Allocation Architecture", draft-ietf-malloc-arch-05.txt,
   July 2000.  Work in progress.

   [MBGP]
   Bates, T., Chandra, R., Katz, D. and Y. Rekhter, "Multiprotocol
   Extensions for BGP-4", RFC 2283, September 1997.

   [MTRACE]
   Fenner, W. and S. Casner, "A 'traceroute' facility for IP
   Multicast", draft-ietf-idmr-traceroute-ipm-06.txt, March 2000.  Work
   in progress.

   [MZAP]
   Handley, M., Thaler, D. and R. Kermode, "Multicast-Scope Zone
   Announcement Protocol (MZAP)", RFC 2776, February 2000.

   [RFC1112]
   Deering, S., "Host Extensions for IP Multicasting", RFC 1112,
   August 1989.

   [RFC2119]
   Bradner, S., "Key words for use in RFCs to Indicate Requirement
   Levels", RFC 2119, March 1997.

   [RFC2373]
   Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture",
   RFC 2373, July 1998.

   [RFC2460]
   Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
   Specification", RFC 2460, December 1998.

   [SCOPE]
   Meyer, D., "Administratively Scoped IP Multicast", RFC 2365, July
   1998.

21. Full Copyright Statement

   Copyright (C) The Internet Society (2000).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Table of Contents

   1 Introduction
   1.1 Terminology
   1.2 Definitions
   2 Requirements for Inter-Domain Address Allocation
   3 Overall Architecture
   3.1 Claim-Collide vs. Query-Response Rationale
   4 MASC Topology
   4.1 Managed vs. Locally-Allocated Space
   4.2 Prefix Lifetime
   4.3 Active vs. Deprecated Prefixes
   4.4 Multi-Parent Sibling-to-Sibling and Internal Peering
   4.5 Administratively-Scoped Address Allocation
   5 Protocol Details
   5.1 Claiming Space
   5.1.1 Claim Comparison Function
   5.2 Renewing an Existing Claim
   5.3 Expanding an Existing Prefix
   5.4 Releasing Allocated Space
   6 Constants
   7 Message Formats
   7.1 Message Header Format
   7.2 OPEN Message Format
   7.3 UPDATE Message Format
   7.4 KEEPALIVE Message Format
   7.5 NOTIFICATION Message Format
   8 MASC Error Handling
   8.1 Message Header Error Handling
   8.2 OPEN Message Error Handling
   8.3 UPDATE Message Error Handling
   8.4 Hold Timer Expired Error Handling
   8.5 Finite State Machine Error Handling
   8.6 NOTIFICATION Message Error Handling
   8.7 Cease
   8.8 Connection Collision Detection
   9 MASC Version Negotiation
   10 MASC Finite State Machine
   10.1 Open/Close MASC Connection FSM
   11 UPDATE Message Processing
   11.1 Accept/Reject an UPDATE
   11.2 PREFIX_IN_USE Message Processing
   11.2.1 PREFIX_IN_USE by PARENT
   11.2.2 PREFIX_IN_USE by SIBLING
   11.2.3 PREFIX_IN_USE by CHILD
   11.2.4 PREFIX_IN_USE by INTERNAL_PEER
   11.3 CLAIM_DENIED Message Processing
   11.3.1 CLAIM_DENIED by CHILD or SIBLING
   11.3.2 CLAIM_DENIED by INTERNAL_PEER
   11.3.3 CLAIM_DENIED by PARENT
   11.4 CLAIM_TO_EXPAND Message Processing
   11.4.1 CLAIM_TO_EXPAND by PARENT
   11.4.2 CLAIM_TO_EXPAND by SIBLING
   11.4.3 CLAIM_TO_EXPAND by CHILD
   11.4.4 CLAIM_TO_EXPAND by INTERNAL_PEER
   11.5 NEW_CLAIM Message Processing
   11.6 PREFIX_MANAGED Message Processing
   11.6.1 PREFIX_MANAGED by PARENT
   11.6.2 PREFIX_MANAGED by CHILD or SIBLING
   11.6.3 PREFIX_MANAGED by INTERNAL_PEER
   11.7 WITHDRAW Message Processing
   11.7.1 WITHDRAW by CHILD
   11.7.2 WITHDRAW by SIBLING
   11.7.3 WITHDRAW by INTERNAL
   11.7.4 WITHDRAW by PARENT
   11.8 UPDATE Message Ordering
   11.8.1 Parent to Child
   11.8.2 Child to Parent
   11.8.3 Sibling to Sibling
   11.8.4 Internal to Internal
   12 Operational Considerations
   12.1 Bootup Operations
   12.2 Leaf and Non-leaf MASC Domain Operation
   12.3 Clock Skew Workaround
   12.4 Clash Resolving Mechanism
   12.5 Changing Network Providers
   12.6 Debugging
   12.6.1 Prefix-to-Domain Lookup
   12.6.2 Domain-to-Prefix Lookup
   13 MASC Storage
   14 Security Considerations
   15 IANA Considerations
   16 Acknowledgments
   17 APPENDIX A: Sample Algorithms
   17.1 Claim Size and Prefix Selection Algorithm
   17.1.1 Prefix Expansion
   17.1.2 Reducing Allocation Latency
   17.1.3 Address Space Utilization
   17.1.4 Prefix Selection After Increase of Demand
   17.1.5 Prefix Selection After Decrease of Demand
   17.1.6 Lifetime Extension Algorithm
   18 APPENDIX B: Strawman Deployment
   19 Authors' Addresses
   20 References
   21 Full Copyright Statement