Network Working Group                                         K. Kinnear
INTERNET DRAFT                             American Internet Corporation
                                                                 R. Cole
                                                                AT&T MNS
                                                                R. Droms
                                                     Bucknell University
                                                               July 1997
                                                    Expires January 1998


                   An Inter-server Protocol for DHCP
                   draft-ietf-dhc-interserver-02.txt

Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts
   Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

Abstract

   The DHCP protocol is designed to allow for multiple DHCP servers, so
   that reliability of DHCP service can be improved through the use of
   redundant servers. To provide redundant service, all of the DHCP
   servers must be configured with the same information about assigned
   IP addresses and parameters; i.e., all of the servers must be
   configured with the same bindings. Because DHCP servers may
   dynamically assign new addresses or configuration parameters, or
   extend the lease on an existing address assignment, the bindings on
   some servers may become out of date. The DHCP inter-server protocol
   provides an automatic mechanism for synchronization of the bindings
   stored on a set of cooperating DHCP servers.

   This draft is a direct extension of
   draft-ietf-dhc-interserver-00.txt, and represents the merging of
   ideas from both draft-ietf-dhc-interserver-alt-00.txt and
   draft-ietf-dhc-interserver-01.txt.
   The basic protocol semantics from
   draft-ietf-dhc-interserver-alt-00.txt were used with the underlying
   message mapping to SCSP from draft-ietf-dhc-interserver-01.txt.
   Considerable additional work has been included in this current draft
   in the area of protocol correctness, detailed work on mapping the
   protocol to SCSP, and organization of the draft itself.

1. Introduction

   DHCP servers manage the assignment of IP addresses and configuration
   parameters to IP hosts. The DHCP protocol specification [1] refers
   to the collection of configuration information assigned to a client
   as a "binding". The DHCP protocol is designed to allow for multiple
   DHCP servers, so that reliability of DHCP service can be improved
   through the use of redundant servers. To provide redundant service,
   all of the DHCP servers must be configured with the same information
   about assigned IP addresses and parameters; i.e., all of the servers
   must be configured with the same bindings. Because DHCP servers may
   dynamically assign new addresses or configuration parameters, or
   extend the lease on an existing address assignment, the bindings on
   some servers may become out of date.

   The DHCP inter-server protocol provides an automatic mechanism for
   synchronization of the bindings stored on a set of cooperating DHCP
   servers.

   The remainder of this document is organized in the following
   sections:

   2. Goals and Requirements

      Defines the requirements and goals for the protocol. Discusses
      limitations of the protocol. Also contains a definition of
      several classes of failures as well as a list of specific
      failures (which provide a useful common ground for discussion).

   3. Overview

      Discusses in a general way the content of the information
      communicated between servers implementing this protocol as well
      as the way that information is communicated.

      Introduces the three aspects of the protocol: client binding
      management, address management, and group management.

      Defines some key concepts surrounding the allowable "states" of
      an IP address, including extensions critical to the operation of
      this protocol.

      Gives a brief sketch of the actions required by this protocol
      for each DHCP client request received by the server.

   4. Client Binding Management

      Discusses the fundamental messages used by this portion of the
      protocol, and the ways in which these messages are combined to
      form higher level operations. Required responses to incoming
      client binding management requests are explained in this
      section. The required responses to incoming DHCP client requests
      are explained in Section 6 below.

   5. Address Management

      The fundamental messages used by the address management portion
      of the protocol are explained, as well as how they are combined
      into higher level operations. The required responses to incoming
      address management requests are explained in this section, while
      the required responses to incoming DHCP client requests are
      explained in Section 6 below.

   6. Actions in Response to DHCP Client Messages and Events

      The required responses to incoming DHCP client messages and
      events are discussed in this section.

   7. Group Management

      The fundamental messages and their combination into higher level
      operations for the group management portion of the protocol are
      explained. The actions to take when receiving any of these
      messages as well as how to utilize them to join or leave a
      server group are explained.

   8. SCSP Message Mapping

      The messages described in sections 4, 5, and 7 are mapped into
      underlying SCSP messages in this section. This includes detailed
      information on the format of each SCSP message.

   9. IP Address State Transition

      This protocol expands the possible states for an IP address. The
      new states are described in Section 3.3. This section describes
      all of the transitions between states in detail.

   10. Security

      The security implications of this draft are discussed in this
      section.

   11. Open Questions

      Poses open questions about the protocol. Some questions from
      draft-ietf-dhc-interserver-00.txt are included verbatim with
      answers, and questions (and some answers) new to this draft are
      included as well.

   12. Acknowledgments

   13. References

   14. Author's Information

   A. Appendix A: An Overview of SCSP

1.1. The Language of Requirements

   Throughout this document, the words that are used to define the
   significance of particular requirements are capitalized. These words
   are:

   o  "MUST"

      This word or the adjective "REQUIRED" means that the item is an
      absolute requirement of this specification.

   o  "MUST NOT"

      This phrase means that the item is an absolute prohibition of
      this specification.

   o  "SHOULD"

      This word or the adjective "RECOMMENDED" means that there may
      exist valid reasons in particular circumstances to ignore this
      item, but the full implications should be understood and the
      case carefully weighed before choosing a different course.

   o  "SHOULD NOT"

      This phrase means that there may exist valid reasons in
      particular circumstances when the listed behavior is acceptable
      or even useful, but the full implications should be understood
      and the case carefully weighed before implementing any behavior
      described with this label.

   o  "MAY"

      This word or the adjective "OPTIONAL" means that this item is
      truly optional.
      One vendor may choose to include the item because a particular
      marketplace requires it or because it enhances the product, for
      example; another vendor may omit the same item.

1.2. Terminology

   This document uses the following terms:

   o  "DHCP client"

      A DHCP client is an Internet host using DHCP to obtain
      configuration parameters such as a network address.

   o  "client"

      Whenever the term client is used in this draft, it refers to a
      DHCP client (and not a server communicating with another server
      using this protocol).

   o  "DHCP server"

      A DHCP server is an Internet host that returns configuration
      parameters to DHCP clients.

   o  "binding"

      A binding is a collection of configuration parameters, including
      at least an IP address, associated with or "bound to" a DHCP
      client. Bindings are managed by DHCP servers.

   o  "active server"

      An active server is one which is capable of offering IP
      addresses to clients.

   o  "stable storage"

      Every DHCP server is assumed to have some form of what is called
      "stable storage". Stable storage is used to hold information
      concerning IP address bindings (among other things) so that this
      information is not lost in the event of a server failure which
      requires restart of the server.

2. Goals and Requirements

   There are several levels of goals for this protocol. There is a set
   of requirements with which it must comply, and then there is a set
   of goals for the protocol and the way that it is used; the goals are
   listed in priority order.

2.1. Requirements on this Protocol

   The following list of requirements must be (and are) achieved by
   this protocol.

   1. Implementations of this protocol work with existing DHCP client
      implementations based on the DHCP protocol [1]. It must work
      with today's clients!

   2. Implementations work with existing BOOTP relay implementations.

   3. The protocol can be specified with sufficient clarity that
      independent implementations will work well together the first
      time (e.g., DHCP today largely meets this requirement).

   4. The protocol works well with a minimum of two and a maximum of
      16 servers.

2.2. Goals of this Protocol

   The following are the goals of this protocol. These goals are listed
   in priority order. The protocol meets all of these goals.

   1. Avoid binding an IP address to a client while that binding is
      currently valid for another client. In other words, don't
      allocate the same IP address to two clients.

   2. Ensure that an existing client can keep its existing IP address
      binding if it can communicate with any DHCP server using this
      protocol -- not just the server that originally offered it the
      binding.

      DISCUSSION:

         There is a subtle but very important point here. For example,
         assume that there are five servers using this protocol.
         Everything is running fine, and then the network becomes
         partitioned, and three servers can communicate among
         themselves, and the other two can communicate among
         themselves -- but the set of three cannot communicate with
         the set of two. Each set, however, can communicate with some
         clients.

         In this situation, every client that can communicate with a
         DHCP server in either set should be able to continue to use
         its existing binding, even if the server that originally
         created the binding is not included in the set of servers
         with which it can communicate.

   3. Do not add any requirement for communication with another server
      to the processing between a DHCPDISCOVER and a DHCPOFFER or
      between a DHCPREQUEST and a DHCPACK.

      DISCUSSION:

         This is another subtle point. The implications of this goal
         are that "lazy" update of IP address binding information is
         required.
         In other words, because of this goal, the protocol cannot
         require one server to update another server with information
         concerning a new IP address binding prior to sending the
         DHCPACK to the DHCP client.

         As a result of this goal, a server may fail immediately after
         sending the DHCPACK to the client but prior to successfully
         sending a record of that information to any other server.
         Should this happen, the DHCP client is the only operational
         machine with a record of this binding -- and the protocol
         must be (and has been) designed to properly deal with this
         situation.

   4. Ensure that a new client can get an IP address from some server.

   5. If a server goes down, and an external agent determines that it
      is actually down as opposed to running but simply unable to
      communicate with other servers, then the addresses that it
      currently owns but that are not yet bound may be recovered for
      use by other servers.

   6. Ensure that in the face of partition, where servers continue to
      run but cannot communicate with each other, the above goals and
      requirements are met. In addition, when the partition condition
      is removed, allow graceful automatic re-integration without
      requiring human intervention.

2.3. Limitations of this Protocol

   The following are explicit limitations of this protocol. This is not
   to say that they are not useful capabilities to have (that's why
   they are explicitly listed, so that it will be clear that this
   protocol does not supply them).

   1. Determination of permanent server failure.

      The protocol provides a way to propagate information about the
      permanent failure of a server, but no way to detect a permanent
      failure. Transient failures are detected, but there is no
      mechanism in this protocol to determine when a transient failure
      is really a permanent failure.
      Some external agent must make this determination -- and must
      ensure that the server declared permanently failed is not simply
      partitioned from the other servers and unable to communicate
      with them. The server which has been declared permanently failed
      by the external agent MUST be informed of that declaration prior
      to restart.

      DISCUSSION:

         The existing configuration messages allow one server to
         declare another server as permanently failed and remove it
         from the group. That is not the issue. What makes fully
         automatic determination of permanent server failure
         impractical is distinguishing between permanent server
         failure (which is easily defined as transient server failure
         that has gone on too long) and partition of the group of
         servers.

         Once communication fails with a server, the other servers
         cannot know if it is still operating or not, and removing an
         operating server from the group is an activity fraught with
         peril.

         This protocol is designed so that a server which is
         partitioned from the group will re-integrate cleanly when it
         can communicate again with the rest of the group.

         Group membership protocols typically handle a partition
         situation (when they bother to handle it at all) by having
         the partitioned server determine that it has been partitioned
         and shut itself down. It detects a partition condition in one
         of two ways: either it can't communicate with the "master",
         or it can't communicate with the "majority" of the group. In
         either case, it shuts down.

         We believe that this is not an appropriate response for a
         DHCP server. If my DHCP client can talk to a DHCP server, I
         want my client to continue to operate -- I'm not interested
         in having the only DHCP server to which I can talk shut
         itself down!

   2. Some addresses are temporarily unavailable during transient
      server failure.

      The full range of existing IP addresses that are potentially
      available for allocation is reduced during the period of a
      transient server failure. The size of the pool of addresses that
      are available for allocation but not yet allocated SHOULD be
      configurable for each server. If the server is subsequently
      declared to have undergone a permanent failure, these addresses
      will be made available again.

      Note that it is only the addresses not yet allocated but
      available for allocation that are unusable during the period of
      a transient server failure. IP addresses that have been
      allocated to clients may continue to be used by those clients
      even during server failure. Indeed, allowing existing clients to
      renew their existing IP addresses even if the server that
      granted them the lease has failed is a primary reason why this
      protocol exists.

2.4. Failures

   This section makes explicit both classes of failures and a list of
   specific failure scenarios in order to facilitate discussion of the
   capabilities of this protocol.

   o  "transient server failure"

      A transient server failure is one where a server is unable to
      respond to requests, but later becomes operational and able to
      respond to requests. Its local stable storage (i.e., whatever
      mechanism it uses to preserve its binding information) is
      accurate as of the time that transient server failure began.

   o  "permanent server failure"

      A permanent server failure is one where a server is unable to
      respond to requests -- probably for an extended period. While
      the protocol defined in this document supports declaration of a
      permanent server failure, the decision that a transient server
      failure is in reality a permanent server failure is beyond the
      scope of this protocol.
      This determination will likely be performed by some
      administrative entity, although in the future a group membership
      protocol could be integrated with the protocol defined in this
      document to make such determinations automatically.

   o  "partition"

      A network partition is caused by a failure of the underlying
      communications substrate, such that two systems that could
      previously communicate cannot now do so. This may mimic
      transient server failure, but is not the same because in this
      case the server that appears to have failed may still be
      operational and interacting with clients.

      There is a form of partition known as "partial partition", where
      the transitivity of communication usually expected is not
      achieved. Imagine a set of servers organized (for the purposes
      of exposition only) as a ring where each server can communicate
      with its neighbors, but nobody else -- and when the number of
      servers is greater than three, a partial partition situation
      exists.

      This term may also be used as a noun, as in "each partition may
      communicate with ...", and in this case it refers to the group
      of servers which can communicate normally (as distinguished from
      those with which that group cannot communicate).

   o  "communication failure"

      Communication failure describes the condition where
      communication between two servers becomes impossible. "Partial
      communication failure" describes the case where the normally
      bidirectional communications channel becomes unidirectional,
      where one server can send to but not receive from another
      server.

   Some examples of the above failures are given below:

   1. A single server crashes and reboots. [transient failure]

   2. A single server crashes and stays down for a period of hours and
      then reboots (either automatically or through some external
      agent). [transient failure]

   3. A single server fails and never returns. No permanent failure is
      declared for this server. [transient failure]

   4. A single server fails. A permanent failure is declared for this
      server. [permanent failure]

   5. A group of two servers is partitioned so that they cannot
      communicate with each other, but each can communicate with some
      clients. [partition]

   6. A group of five servers is partitioned so that three can
      communicate together and the remaining two can also communicate,
      but the two partitions cannot communicate. Each partition can
      communicate with a subset of the clients, and these subsets are
      disjoint. [partition]

   7. A group of five servers is partitioned so that three can
      communicate together and the remaining two can also communicate,
      but the two partitions cannot communicate. Each server continues
      to be able to communicate with all of the clients. [partition]

      DISCUSSION:

         This situation is unlikely to occur, but the protocol should
         be able to handle it.

   8. Server A can send packets to server B, but cannot receive
      packets from server B. [partial communication failure]

   9. There are four servers, A, B, C, and D. A cannot communicate
      with C, B cannot communicate with D. [partial partition]

   DISCUSSION:

      This section on failures may well not belong in the final
      document. For the purposes of review of the rest of the
      protocol, however, defining a common language to describe
      failures and giving specific examples of failures as an aid to
      discussion seemed useful.

3. Overview

   At the most basic level, the DHCP protocol specifies the behavior of
   DHCP servers which communicate with DHCP clients in order to
   allocate IP addresses to the clients as well as provide a variety of
   configuration parameter information to them.
   It is the allocation of IP addresses to clients by the server that
   creates a requirement to update what is known as "stable storage" --
   typically held on disk. This information is used to "remember" the
   IP address bindings that have been made by the DHCP server in order
   to avoid allocating the same IP address to two clients.

   The key motivation for an inter-server protocol is the desire to
   allow a client to continue to use its IP address (i.e., be able to
   renew its lease on an IP address) even if the server that initially
   offered it the lease on its IP address is unavailable for some
   reason. In addition, no IP address should ever be bound to two
   clients simultaneously.

   Providing multiple DHCP servers with which each client can
   communicate is the first step in creating this reliable DHCP
   capability.

   In addition, these DHCP servers must communicate among themselves in
   order to provide this reliable DHCP capability.

3.1. Information Communicated by the Protocol

   There are three types of information which must be communicated
   between servers implementing the inter-server protocol.

   o  Client Binding Information

      This entire inter-server protocol exists in order to allow
      servers to share information about client bindings of IP
      addresses. Servers must be able to update other servers about
      client bindings that they have created, and must be able to
      receive similar updates from other servers about client bindings
      that the other servers have made or changed.

   o  Address Management Information

      In order to implement an effective strategy for client binding
      information updates, this protocol defines some additional
      states for an IP address beyond those defined or implied by RFC
      2131 [1] that are not directly connected with client binding
      information.
      The servers need to communicate among themselves concerning
      these states, and this communication is enabled by the address
      management information portion of the protocol.

   o  Group Management Information

      While it is possible to conceive of a group of servers
      statically configured to be part of a server group, the
      operational characteristics of such an approach are far from
      pleasant. The group management portion of this protocol allows a
      server to determine the groups to which another server belongs;
      determine for each group the current membership in the group;
      determine for each group the subnets and IP addresses managed by
      that group; and join or leave a server group.

3.2. Server Groups

   Fundamental to this protocol is the "group" of servers which are
   communicating and with which the clients can communicate in order to
   provide a reliable DHCP service.

   Each server group (SG) to which a server belongs is associated with
   a particular set of address pools. These address pools are those
   which exist on a single network segment (sometimes called a single
   "wire").

   An active server can be (and typically would be) a member of several
   groups simultaneously. This protocol allows a server to join an
   existing SG. Which SGs a server would join is a configuration issue
   for a particular server, and outside of the realm of this protocol
   -- although considerable support is provided in order to make this a
   solvable problem.

   The membership of a particular SG will change over time, and in
   order to ensure that each server is made aware of any changes in
   group membership in a timely way, every protocol message which is
   sent in the inter-server protocol includes a group generation number
   (with a few exceptions).
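
   As an illustration only, the following Python sketch shows one way a
   receiver might treat the group generation number carried in a
   message: a message whose number does not match the number the
   receiver has stored for that SG is discarded, and the receiver
   records that it needs to refresh its view of the group. The names
   Message, GroupManagementLayer, accept, and needs_refresh are
   assumptions made for this sketch and are not defined by the
   protocol. Treating an unknown SG as a mismatch is also an
   assumption, and the few messages that do not carry a generation
   number are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Message:
    """Illustrative inter-server message; field names are assumptions."""
    group: str       # identifies the server group (SG)
    generation: int  # group generation number carried by the message
    payload: str


@dataclass
class GroupManagementLayer:
    """Holds the generation number this server currently stores per SG."""
    generations: Dict[str, int]
    needs_refresh: List[str] = field(default_factory=list)

    def accept(self, msg: Message) -> bool:
        """Return True if msg may be processed, False if it is discarded."""
        if self.generations.get(msg.group) == msg.generation:
            return True
        # Mismatch (or, by assumption here, an unknown SG): discard the
        # message and remember to bring our view of the group up to date.
        if msg.group not in self.needs_refresh:
            self.needs_refresh.append(msg.group)
        return False
```

   In this sketch a caller would process a message only when accept()
   returns True, and would separately act on the entries collected in
   needs_refresh; how the group information is actually refreshed is
   left to the group management messages.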

   Whenever a message is received, the group management layer of the
   software MUST verify that the group generation number matches the
   current group generation number for that SG stored in the server. If
   there is a mismatch, the group management layer will discard the
   message. It will then attempt to update its knowledge of the current
   group (and incidentally bring its generation number up to date in
   the process).

   In this way, any changes in group membership become spread
   throughout the group as fast as possible -- and no messages that are
   out of synchronization with the latest concept of group membership
   can be received.

   A server attempts to become a member of a particular group by using
   the configuration messages described in Section 7 below. In
   addition, a server can remove another server from the group using
   these messages -- but in this case an external agent must ensure
   that the server being removed is truly inactive and not just
   partitioned.

3.3. Messages and Operations Defined by the Protocol

   The protocol requires that servers that implement it can
   communicate, each with the other, in a point-to-point manner (when
   all are operating correctly). It allows for the possibility that
   they can fail entirely (i.e., crash) or be unable to communicate
   with each other for a variety of reasons.

   Each server will periodically need to communicate with other servers
   in the group. There are several recurring styles of communication
   that, if defined, will assist in explaining the major concepts of
   this protocol. These major styles of group communication are as
   follows:

   There are "messages", which for the purpose of this specification
   consist of a communication between two servers.
         Messages are gath-
656      ered into higher level generic "operations", which describe the form
657      of the operation, and are made up of messages communicated between
658      more than one server.  These generic operations are then instantiated
659      into specific operations as part of the various portions of the pro-
660      tocol.

662   3.3.1. Generic Protocol Messages

664      Messages are used to communicate between a pair of servers.

666      o  QUERY

668         A QUERY message is sent when one server wishes to obtain
669         knowledge about the server cache of another server.

671      o  UPDATE

673         An UPDATE message is sent when one server wishes to update
674         the information in the cache of another server.

676   3.3.2. Generic Protocol Operations

678      These generic protocol operations are used when a server must commu-
679      nicate with more than one other server.

681      o  POLL

683         A POLL operation is used when one server must contact every other
684         server in the group using a QUERY message in order to request
685         that they respond with some information (typically concerning an
686         IP address).  Usually, if the server executing the POLL cannot
687         contact all of the other servers using the QUERY message, it will
688         use whatever information it could glean from those it could con-
689         tact.

691      o  COMPLETE POLL

695         A COMPLETE POLL is like a POLL in that one server attempts to
696         contact every other server using a QUERY message -- but in a COM-
697         PLETE POLL it must successfully complete a QUERY with each of
698         them or the operation itself fails to complete.

700      o  PUSH

702         A PUSH operation is used when one server wants to update all of
703         the other servers using an UPDATE message.  In a way similar to
704         the POLL operation, a PUSH operation will succeed if the server
705         employing it has managed to contact at least one other server in
706         the group with a successful UPDATE.
708      o  COMPLETE PUSH

710         A COMPLETE PUSH is analogous to a COMPLETE POLL -- the COMPLETE
711         PUSH operation requires the server to attempt to UPDATE every
712         other server in the group.  If every server responds successfully
713         to the UPDATE, the COMPLETE PUSH succeeds; otherwise the COMPLETE
714         PUSH fails.

716      Note that both PUSH and POLL involve operations to all of the servers
717      in the group.

719   3.3.3. Specific Protocol Operations

721      The generic forms of inter-server communication above are used in
722      the following ways in Client Binding Management and Address Management.

724      Client Binding Management:

726      o  CLIENT BINDING POLL (operation)

728         This operation involves one server asking every other server
729         using a QUERY for client binding information concerning a partic-
730         ular IP address.  If not all of the other servers are opera-
731         tional, the requesting server will use any information it
732         receives.

734      o  CLIENT BINDING COMPLETE PUSH (operation)

736         This operation involves one server informing all of the other
737         servers using an UPDATE about updated client binding information.
738         While there is utility in reaching even one other server (in some
739         cases) the operation is not deemed to have succeeded unless all
740         of the other servers were successfully updated with the new
741         information.

745      Address Management:

747      o  UNBINDABLE COMPLETE POLL (operation)

749         In this operation, all of the other servers are contacted using a
750         QUERY concerning one (or more) IP addresses, and they all report
751         on whether that IP address(es) is UNBINDABLE or not.  This opera-
752         tion fails if any server fails to respond to the QUERY or if any
753         server responds to the QUERY with a negative answer (i.e., the IP
754         address is not currently UNBINDABLE).  It succeeds only when all
755         of the servers in the server group answer that the address is
756         UNBINDABLE.
758      o  TRANSFER (message)

760         This message is used to transfer BINDABLE IP addresses from one
761         server to another (used when the SG is partitioned and the normal
762         UNBINDABLE COMPLETE POLL cannot be used to make an IP address
763         BINDABLE, but also when all of the UNBINDABLE IP addresses have
764         already been made BINDABLE by some server).

766         The information is sent from the initiating to the responding
767         server as a QUERY and includes the subnet specification and the
768         number of BINDABLE IP addresses the initiating server has avail-
769         able for that address pool, and the number of BINDABLE IP
770         addresses it is requesting.

772         The responding server is free to give the initiating server all,
773         some, or none of the number of IP addresses the initiating server
774         has requested.

776   3.4. IP Address State

778      The concept of the state of an IP address is largely implicit in the
779      DHCP RFC [1].  However, in order to manage pools of IP addresses with
780      multiple servers, the states and transitions between them must be
781      made quite explicit.

783   3.4.1. IP Address State: Basic DHCP Protocol

785      When an IP address is always controlled by a single DHCP server
786      (implicit in the definition of DHCP in the current DHCP draft [1])
787      the IP address is either in the BINDABLE state or the BOUND state.
788      The following state diagram represents the states that an IP address
789      may occupy based on the current DHCP draft.  (Note that these terms
790      do not appear in [1], but are terms that describe concepts that are
794      implicit in the RFC.)
796             +-----------------+
797             |                 |
798             |    BINDABLE     |<--+
799             |                 |   |
800             +-----------------+   |
801                      |            |
802                      V            |
803             +-----------------+   |
804             |                 |   |
805             |      BOUND      |---+
806             |                 |
807             +-----------------+

809      Figure 3.4.1-1: Basic DHCP IP address state transition diagram

811      When an IP address transitions from BINDABLE to BOUND, that transi-
812      tion must be recorded in the server's stable storage prior to the
813      transition being "published" to any observer outside of the server.

815   3.4.2. IP Address State: Extensions to Support the Interserver Protocol

817      The situation is more complex when multiple servers are managing the
818      same set of IP addresses as required by this protocol.  Four new
819      states are defined for an IP address: UNBINDABLE, POLLING, PUSHED and
820      EXPIRED.

822      This is the state diagram for IP address state required by this pro-
823      tocol:

827             +-----------------+
828             |                 |
829             |   UNBINDABLE    |<--------+
830             |                 |         |
831             +-----------------+         |
832                      |                  |
833                      V                  |
834             +-----------------+         |
835             |                 |         |
836             |     POLLING     |-------->|
837             |                 |         |
838             +-----------------+         |
839                      |                  |
840                      V                  |
841             +-----------------+         |
842             |                 |         |
843             |    BINDABLE     |-------->|
844             |                 |         |
845             +-----------------+         |
846                      |                  |
847        -----------------------------    |
848                      V                  |
849             +-----------------+         |
850             |                 |         |
851         +-->|      BOUND      |-------->|
852         |   |                 |         |
853         |   +-----------------+         |
854         |      |        |               |
855         |      |        V               |
856         |      |  +-----------------+   |
857         |      |  |                 |   |
858         |      |  |     PUSHED      |-->|
859         |      |  |                 |   |
860         |      |  +-----------------+   |
861         |      |        |               |
862         |      V        V               |
863         |   +-----------------+         |
864         |   |                 |         |
865         +<--|     EXPIRED     |-------->+
866             |                 |
867             +-----------------+

869      Figure 3.4.2-1: Extended DHCP IP address state transition diagram
870      required for the Inter-server protocol.

874      For every server which cooperates using this protocol, an IP address
875      is in one of the following six states:

877      o  UNBINDABLE

879         This state represents the default state for every IP address.
880         Explicit action must be taken to move an IP address from this
881         state into the BINDABLE state.  An UNBINDABLE COMPLETE POLL must
882         be performed and must complete successfully.

884         Any IP address that has previously been BOUND must retain infor-
885         mation concerning the server that PUSHED the binding information,
886         the client to which it was bound, and the lease time for the
887         binding.  This information is used when a server is removed from
888         the server group.

890      o  POLLING

892         While an UNBINDABLE COMPLETE POLL operation is being performed,
893         an IP address is in the POLLING state.  This ensures that if two
894         servers are simultaneously performing an UNBINDABLE COMPLETE POLL
895         operation that involves the same address, neither of them
896         will succeed in making that address BINDABLE.

898      o  BINDABLE

900         In this state, the IP address is available to be offered to a
901         DHCP client, and if the client accepts the offer, it may be bound
902         to that client.

904         An IP address is only BINDABLE by a single server at a time.  A
905         server must know precisely which IP addresses it has on its
906         list of BINDABLE addresses.  A server does not know about any
907         other server's list of BINDABLE addresses.  (Although performance
908         optimizations are possible where a server may develop hints about
909         this information, they are not required).

911         An IP address can move from the BINDABLE state into the BOUND
912         state through the normal activity of the DHCP protocol where a
913         server interacts with a client.  When this happens, the Client
914         Binding Management portion of the protocol is used to inform
915         other servers of the change.

917         A server can also transfer ownership of a BINDABLE IP address to
918         another server upon request from that other server (and without
919         any interaction beyond that with the other server).
923      o  BOUND

925         An address that is BOUND is associated with a particular DHCP
926         client, and usually is in use by that client (although the client
927         may have abandoned the lease on that IP address).  It may be termed
928         BOUND to that client.  In the BOUND state the information about
929         the client binding has not been propagated to all of the other
930         servers in the server group.

932      o  PUSHED

934         An address that is PUSHED is associated with a client in the same
935         way as a BOUND address.  However, an address in the PUSHED state
936         indicates that all of the other servers in the server group have
937         been informed of the existence of the binding to this client.

939         When a DHCP client releases a lease on an IP address it moves
940         from either the BOUND or PUSHED state into the UNBINDABLE state,
941         but no explicit PUSH operation is required.

943         When the lease time and any grace period implemented by a server
944         both expire, then an IP address moves into the EXPIRED state.

946         Note that only a server that actually completes a CLIENT BINDING
947         COMPLETE PUSH will place its IP address into the PUSHED state.
948         The servers who receive the CLIENT BINDING COMPLETE PUSH will
949         place their IP addresses into the BOUND state.

951         DISCUSSION:

953            Many DHCP servers implement something called a "grace
954            period", which is a period after the lease on a binding
955            expires that an IP address will not be offered to another
956            DHCP client.  A lease which is in this "grace period" is
957            still BOUND or PUSHED as far as the inter-server protocol is
958            concerned.

960      o  EXPIRED

962         An IP address is EXPIRED when it was BOUND and the term of the
963         lease (and any implemented grace period) has run out.  It may be
964         termed EXPIRED to that client.

966         An EXPIRED IP address will transition to the UNBINDABLE state
967         when the server who shows it as EXPIRED receives an UNBINDABLE
968         COMPLETE POLL.
            It will respond to the UNBINDABLE COMPLETE POLL
969         after making the IP address UNBINDABLE.

973         It may be moved back into the BOUND state by a REQUEST/INIT-
974         REBOOT request from the previously bound client.

976      Note that an IP address can never go from BOUND to one client to
977      BOUND to another client without first passing through the UNBINDABLE
978      state.  The line across the middle of the state transition diagram
979      helps to illustrate this.

981      Further, note that the transition from POLLING to BINDABLE requires
982      the successful completion of an UNBINDABLE COMPLETE POLL.

984   3.5. Overview of Server Operation

986      This section will give a brief sketch of the core elements of
987      the Client Binding Management and Address Management parts of the
988      protocol (from the perspective of an already configured group of
989      servers).  Many of the possible cases are not described here, and
990      this section is not to be considered definitive.  The definitive
991      description of this information is contained in Section 6 and in the
992      case of conflicts with information found there, the information in
993      Section 6 will govern.

995   3.5.1. DISCOVER

997      Prior to the receipt of a DISCOVER message, each server should have
998      built up a list of BINDABLE IP addresses -- for two reasons.  First,
999      because an UNBINDABLE COMPLETE POLL is required to move an IP address
1000     into the BINDABLE state, and an UNBINDABLE COMPLETE POLL may not be
1001     possible due to server failure at any given instant.  Second, because
1002     even if an UNBINDABLE COMPLETE POLL was possible it would generally
1003     take too long to do between a DISCOVER and an OFFER message.

1005     A server should offer a BINDABLE address to a client upon receipt of
1006     a DISCOVER message.
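
A rough sketch, in Python, of a server answering a DISCOVER purely from a list of BINDABLE addresses that it built beforehand, so that no UNBINDABLE COMPLETE POLL has to run between the DISCOVER and the OFFER. The class `BindablePool`, its in-memory `offered` map, and the choice to repeat an earlier offer to the same client are assumptions for illustration; the addresses are documentation examples.

```python
from typing import Dict, List, Optional


class BindablePool:
    """Hypothetical holder of one server's pre-built BINDABLE addresses."""

    def __init__(self, bindable: List[str]) -> None:
        self.bindable = list(bindable)        # built up before any DISCOVER
        self.offered: Dict[str, str] = {}     # client id -> address (memory only)

    def on_discover(self, client_id: str) -> Optional[str]:
        """Pick an address to OFFER, or None when nothing is BINDABLE."""
        if client_id in self.offered:         # assumption: repeat the same offer
            return self.offered[client_id]
        taken = set(self.offered.values())
        for addr in self.bindable:
            if addr not in taken:
                self.offered[client_id] = addr
                return addr
        return None                           # no inline poll is attempted here


pool = BindablePool(["192.0.2.10", "192.0.2.11"])
print(pool.on_discover("client-a"))  # 192.0.2.10
print(pool.on_discover("client-b"))  # 192.0.2.11
print(pool.on_discover("client-c"))  # None
```

Returning None instead of polling inline keeps the DISCOVER path fast; refilling the list would be a separate background activity.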
1008     There are no inter-server protocol activities required when a DIS-
1009     COVER is processed and an OFFER is returned to the client (assuming
1010     of course that a BINDABLE address was available to be offered).

1012  3.5.2. REQUEST/SELECTING

1014     When a client accepts an offer by sending a SELECTING message, then
1015     the server updates its stable storage with the binding information
1016     and ACKs the client.  It must then perform a CLIENT BINDING COMPLETE
1017     PUSH operation to push the binding information to all of the other
1021     servers (with which it can communicate at that time).  There are some
1022     limitations on the lease time that can be offered to the client until
1023     at least one CLIENT BINDING COMPLETE PUSH has succeeded
1024     for the offering server.  See Section 4.4.1 for additional details.

1026  3.5.3. REQUEST/INIT-REBOOT

1028     In the usual case where the server who created the binding for the
1029     requesting client managed to PUSH that information to the other
1030     servers using a CLIENT BINDING COMPLETE PUSH, the receiving server
1031     will have the binding information for this client.  If this informa-
1032     tion can be verified, then ACK the client -- else NAK it.

1034     If the IP address was in the EXPIRED state, then move the IP address
1035     to the PUSHED state.

1037  3.5.4. REQUEST/RENEWING

1039     Upon receipt of a RENEWAL message (which is unicast from the client
1040     to the server), it is expected that the server will have accurate
1041     information concerning the binding of the client.  If it does not,
1042     process the message like a REBINDING, below.  Given that the server
1043     has information sufficient to extend the lease, it should update its
1044     stable storage with the lease extension, and then ACK the client with
1045     the extended time.  Then it must perform a CLIENT BINDING COMPLETE
1046     PUSH operation to the other servers with the updated binding informa-
1047     tion.

1049  3.5.5. REQUEST/REBINDING

1051     Upon receipt of a REBINDING message (which is broadcast from the
1052     client), the server will check to see if it has any information about
1053     the binding for this client.  There are several possible cases:

1055     1.  Current information shows that this client owns the IP address.

1057         Extend the lease, update stable storage, ACK the client, and
1058         perform a CLIENT BINDING COMPLETE PUSH with the information to
1059         the other servers.

1061     2.  Current information shows that some other client is BOUND to
1062         this IP address.

1064         This is a problem.  Make the IP address UNAVAILABLE (see Section
1065         12 for details).

1069     3.  Current information says this IP address is UNBINDABLE.

1071         In this case, a server has probably created a binding and then
1072         failed to propagate the information to this server.  Perform a
1073         POLL operation to see if any communicating server has any better
1074         information.

1076         If information is returned, then move to the appropriate case in
1077         this list.

1079         If no information is returned, then extend the lease on the IP
1080         address, update stable storage, ACK the client, and PUSH the
1081         information to the other servers.

1083  3.5.6. RELEASE

1085     When a release is received, if the client matches the binding infor-
1086     mation in the server, then update stable storage with the release,
1087     set the IP address UNBINDABLE, and perform a CLIENT BINDING COMPLETE
1088     PUSH to inform other servers.

1090     If the CLIENT BINDING COMPLETE PUSH operation fails due to inability
1091     of an UPDATE message to succeed to another server, do nothing.

1093  3.5.7. Expiration

1095     When a lease on an IP address expires, move the IP address to the
1096     EXPIRED state and update stable storage with this information.
         From now on,
1097     if some server performs an UNBINDABLE COMPLETE POLL operation to
1098     gather information about this IP address, make the IP address UNBIND-
1099     ABLE, update stable storage, and respond with the state of the IP
1100     address as UNBINDABLE.

1102  3.6. When a server is down or partitioned and can't be contacted

1104     When a server is down or partitioned (i.e., can't be reached), then
1105     some aspects of the normal DHCP client processing are different.
1106     This section summarizes those differences:

1108     o  Client lease times for new clients will never be greater than
1109        MAXIMUM-UNPUSHED-LEASE-TIME, since a CLIENT BINDING COMPLETE PUSH
1110        cannot succeed.

1112     o  No UNBINDABLE COMPLETE POLL will succeed, and thus no server will
1113        be able to transition an address from the UNBINDABLE state into
1117        the BINDABLE state.  If a server runs low on addresses, it will
1118        have to use TRANSFER messages to acquire new addresses from other
1119        servers.

1121  4. Client Binding Management

1123     Client binding management is the aspect of the protocol which is con-
1124     cerned with communicating information about client bindings from one
1125     server to another.  It is the core of the inter-server protocol.

1127     The following messages and operations are used explicitly by a server
1128     participating in the interserver protocol when DHCP client requests
1129     and events require it, and are used implicitly by the SCSP cache
1130     alignment procedure whenever a server (re)establishes communication
1131     with another server.

1133  4.1. Client Binding Messages

1135     o  CLIENT BINDING UPDATE

1137        Update a single server with client binding information.  This
1138        operation will not complete successfully unless and until that
1139        server is updated with the information being sent.

1141     o  CLIENT BINDING QUERY

1143        Query a single server for its client binding information.

1145  4.2. Client Binding Operations

1147     The operations defined for client binding management are:

1149     o  CLIENT BINDING COMPLETE PUSH

1151        This operation involves one server using the UPDATE message to
1152        inform all of the other servers about updated client binding
1153        information.  While there is utility in reaching even one other
1154        server (in some cases) the operation is not deemed to have suc-
1155        ceeded unless all of the other servers were successfully updated
1156        with the new information.

1158     o  CLIENT BINDING POLL

1160        This operation involves one server using the QUERY message to
1161        inquire of every other server about client binding information
1162        concerning a particular IP address.  If not all of the other servers
1166        are operational, the requesting server will use any informa-
1167        tion it receives.

1169  4.3. Client Binding Information

1171     When binding data is sent as part of a message concerned with client
1172     binding management it contains the following information:

1174     o  IP Address

1176     o  Expiration [expressed as a delta seconds from the current time]

1178     o  Client ID

1180     o  MAC Address [including the hardware type]

1182     o  Last Transaction [selected from the list below]

1184     o  Last Transaction Time [expressed as a delta seconds from the cur-
1185        rent time]

1187     o  Last Transaction Server [an IP address]

1189     Each server must maintain as part of the binding information the
1190     "last transaction time", the "last transaction", and the "last trans-
1191     action server" associated with that binding.

1193     The last transaction time is the time at which the binding changed in
1194     response to a request (the last transaction) from the client.  The
1195     last transaction time is returned in an address information message
1196     as a number of seconds from "now".

1198     The possible last transactions are listed below.
         This list is
1199     ordered by the precedence of the transactions and is used to help
1200     determine if a response to an address information message contains
1201     more recent information than that currently held by a server.

1203     The last transaction is one of the following:

1205     o  DHCPREQUEST/SELECTING

1207     o  DHCPREQUEST/REBINDING

1209     o  DHCPREQUEST/INIT-REBOOT

1211     o  DHCPREQUEST/RENEWING

1215     o  DHCPRELEASE

1217     o  EXPIRATION

1219     The IP address state information is transmitted as well, and it con-
1220     sists of one of the following states:

1222     o  UNBINDABLE

1224     o  POLLING

1226     o  BINDABLE

1228     o  BOUND

1230     o  PUSHED

1232     o  EXPIRED

1234  4.4. Initiating Client Binding Operations and Messages

1236  4.4.1. CLIENT BINDING COMPLETE PUSH

1238     The CLIENT BINDING COMPLETE PUSH operation is initiated whenever the
1239     state of a server's client binding cache is changed, typically by the
1240     receipt of a DHCP client request or expiration of a lease.

1242     The lease time that is offered to a DHCP client must not be greater
1243     than the MAXIMUM-UNPUSHED-LEASE-TIME for that SG until at least one
1244     CLIENT BINDING COMPLETE PUSH has succeeded for that client binding.
1245     Thus, as long as the state of the IP address is BOUND, then the
1246     client should be offered the MAXIMUM-UNPUSHED-LEASE-TIME.

1248     The lease time that is sent to the other servers in the CLIENT BIND-
1249     ING COMPLETE PUSH is the lease time that the server would like to
1250     give to the DHCP client, and once a CLIENT BINDING COMPLETE PUSH has
1251     succeeded with that lease time in it (and the IP address state is set
1252     to PUSHED), then the server is free to actually extend the client's
1253     lease on the IP address with that lease time.

1255     The servers which receive the CLIENT BINDING COMPLETE PUSH will place
1256     their IP addresses into the BOUND state, not the PUSHED state.

1260  4.4.2. CLIENT BINDING POLL

1262     The CLIENT BINDING POLL is used when the server has received a DHCP
1263     client request but believes that it has insufficient or out-of-date
1264     information concerning this client's binding.  Thus, the CLIENT BIND-
1265     ING POLL is an attempt to gather more recent and up-to-date informa-
1266     tion from the other servers in the SG.

1268     DISCUSSION:

1270        Is this really necessary?  Given that SCSP will "align" the
1271        caches of the servers at every reconnect, then what is the
1272        value of asking "again"?

1274  4.4.3. CLIENT BINDING UPDATE

1276     The CLIENT BINDING UPDATE is initiated in three ways.

1278     It is initiated at the client binding management level as the under-
1279     lying operation in a CLIENT BINDING COMPLETE PUSH.  It is initiated
1280     at the client binding management level when a server realizes that
1281     the server who returned information as a result of a CLIENT BINDING
1282     QUERY returned information which was less up-to-date than that avail-
1283     able to the current server.  It is initiated at the SCSP level as
1284     part of the cache state alignment process.

1286  4.5. Responding to Client Binding Messages

1288     When a server receives the following client binding messages, it
1289     should respond as detailed below.  Note that operations consist of
1290     multiple messages at the initiator, but that when processing incoming
1291     requests, only individual messages are evident.

1293  4.5.1. CLIENT BINDING QUERY

1295     The proper response to a CLIENT BINDING QUERY is to respond with the
1296     current information in the client binding cache.

1298  4.5.2. CLIENT BINDING UPDATE

1300     The proper response to a CLIENT BINDING UPDATE is to determine if the
1301     information received is more current than that available in the
1302     server's cache.  If it is not, then respond negatively to this
1303     request.
         If it is, then update the client binding cache, ensure that
1307     the changes have been written to stable storage, and respond success-
1308     fully.  Note that no CLIENT BINDING UPDATE should generate additional
1309     client binding message activity (i.e., the CLIENT BINDING UPDATE
1310     should not generate a CLIENT BINDING COMPLETE PUSH).

1312     When a CLIENT BINDING UPDATE is received, the IP address should be
1313     placed into the BOUND state, not the PUSHED state.  Only the actual
1314     server performing the CLIENT BINDING COMPLETE PUSH will place its IP
1315     address into the PUSHED state.

1317  5. Address Management

1319     Address management is the aspect of the protocol concerned with man-
1320     aging the state of IP addresses that are not currently bound to any
1321     client.  It is a necessary part of the protocol in order to support
1322     certain goals in the client binding management part of the protocol,
1323     principally that of allowing a server to continue to operate even
1324     though it was partitioned from other servers in the server group.

1326  5.1. Address Management Operations

1328     o  UNBINDABLE COMPLETE POLL

1330        In this operation, all of the other servers are contacted using a
1331        QUERY operation concerning one (or more) IP addresses, and they
1332        all report on whether that IP address(es) is UNBINDABLE or not.
1333        If they are UNBINDABLE, then the current information on that IP
1334        address is also reported (as in a CLIENT BINDING POLL).  In con-
1335        trast to a CLIENT BINDING POLL, this operation fails if any
1336        server cannot be contacted or if any server answers the QUERY
1337        with a negative answer (i.e., the IP address is not currently
1338        UNBINDABLE).  It succeeds when all of the servers answer that the
1339        address is UNBINDABLE.

1341        There is a subtle interaction required with the group management
1342        layer of the protocol.
            A successful UNBINDABLE COMPLETE POLL
1343        must be inhibited in certain cases where a server has been
1344        removed from a server group.

1346        The case in question is that where a server is removed from a
1347        server group by a different server.  Immediately after this hap-
1348        pens, all UNBINDABLE COMPLETE POLLS must fail for a period equal
1349        to the MAXIMUM-UNPUSHED-LEASE-TIME.  After that time passes, then
1350        UNBINDABLE COMPLETE POLLS may operate as they normally do.

1354        DISCUSSION:

1356           This covers the situation where a server gives a lease to a
1357           client while both the client and server are partitioned.  Then, the
1358           server goes away completely.  The client stays up, but remains
1359           partitioned.  Then, the dead server is removed by another
1360           server from the server group.  At this point, UNBINDABLE COM-
1361           PLETE POLL operations could (except for the above restriction)
1362           begin to complete successfully.  However, the client that was
1363           given a lease while partitioned along with the server that
1364           died certainly has an address, and when the partition is
1365           removed (just after the UNBINDABLE COMPLETE POLL operation
1366           which declared its IP address now BINDABLE for some server),
1367           there would be a very dangerous situation developing.

1369        The solution is to only offer leases to clients of the MAXIMUM-
1370        UNPUSHED-LEASE-TIME until the information concerning their client
1371        binding reaches all of the other servers in the group.  Once that
1372        happens, then they can be offered the normal lease time.

1374        Thus, whenever any server is removed from the group (where it
1375        doesn't remove itself), then there is a possibility that it may
1376        have offered leases to clients about which no other server would
1377        have any record.
            In this case, the remaining servers must wait
1378        the MAXIMUM-UNPUSHED-LEASE-TIME before being able to complete an
1379        UNBINDABLE COMPLETE POLL and reuse the BINDABLE addresses that
1380        the removed server was using.

1382  5.2. Address Management Messages

1384     The following messages are part of the address management portion of
1385     the protocol.

1387     o  TRANSFER

1389        This message is used to transfer BINDABLE IP addresses from one
1390        server to another (especially when the SG is partitioned and the
1391        normal UNBINDABLE COMPLETE POLL cannot be used to make an IP
1392        address BINDABLE, but also when all of the UNBINDABLE IP
1393        addresses have already been made BINDABLE by some server).

1395        The information sent from the initiating to the responding server
1396        includes the subnet specification and the number of BINDABLE IP
1397        addresses the initiating server has available for that address
1398        pool, and the number of BINDABLE IP addresses it is requesting.

1402        The responding server is free to give the initiating server all,
1403        some, or none of the number of IP addresses the initiating server
1404        has requested.

1406     o  UNBINDABLE QUERY

1408        The UNBINDABLE QUERY message is the primitive query from which
1409        the UNBINDABLE COMPLETE POLL is constructed.  It is identical to
1410        the CLIENT BINDING QUERY defined above in terms of the data
1411        returned, although the actions taken when it is received are
1412        slightly different.

1414  5.3. Initiating Address Management Operations and Messages

1416     o  UNBINDABLE COMPLETE POLL (operation)

1418        This operation is initiated when the server detects that it needs
1419        to generate more BINDABLE IP addresses.  It will initiate this
1420        operation whenever the number of BINDABLE IP addresses drops
1421        below a configurable threshold.
1423        Prior to initiating this operation, the server must change the
1424        state for each IP address that will be part of the UNBINDABLE
1425        COMPLETE POLL from UNBINDABLE to POLLING, and commit this state
1426        change to stable storage.

1428        DISCUSSION:

1430           Is the commit to stable storage really necessary?  Given that
1431           we will abandon the POLL if we reboot (presumably), what is
1432           the value of remembering that we were doing it?

1434        For every IP address for which the UNBINDABLE COMPLETE POLL oper-
1435        ation fails (i.e., some server responds in such a way that indi-
1436        cates that the IP address is not UNBINDABLE, or some server fails
1437        to respond at all), the IP address' state should be reset to
1438        UNBINDABLE.

1440     o  TRANSFER (message)

1442        The TRANSFER message, which attempts to transfer some IP
1443        addresses from some other server to the initiating server, is
1444        initiated whenever the number of BINDABLE IP addresses in an
1445        address pool falls below a configurable threshold.

1449  5.4. Responding to Address Management Messages

1451     o  TRANSFER

1453        When receiving a TRANSFER message, the responding server inspects
1454        its list of BINDABLE addresses for the address pool to which the
1455        TRANSFER operation refers.  It will attempt to offer the initiat-
1456        ing server as many addresses as it requested, with the limitation
1457        that it will never give away more than half of its pool of BIND-
1458        ABLE addresses in any one request.

1460     o  UNBINDABLE QUERY

1462        The responding server will respond to this query just like it
1463        responds to a CLIENT BINDING QUERY as far as the information com-
1464        municated to the initiating server is concerned.
1466        In addition, if the IP address mentioned in this query was in the
1467        EXPIRED state, prior to responding to this message, the respond-
1468        ing server will move that IP address to the UNBINDABLE state,
1469        commit this change to stable storage, and then respond with
1470        information that indicates the IP address in question was UNBIND-
1471        ABLE.

1473        Note that an UNBINDABLE QUERY will not be generated to any server
1474        if at least one server in the SG is currently not able to be con-
1475        tacted, as known by the SCSP "Hello" subprotocol.  This will pre-
1476        vent unnecessary transitions from the EXPIRED to the UNBINDABLE
1477        state when an UNBINDABLE COMPLETE POLL would not be able to com-
1478        plete in any case.

1480  6. Actions in Response to DHCP Client Messages and Events

1482     This section defines the actions that should be taken in the client
1483     binding and address management portions of the protocol when incoming
1484     DHCP requests (messages) are received.

1486     DISCUSSION:

1488        There is considerable commonality in the sections that describe
1489        the various DHCP client messages below.  Once the details have
1490        stabilized, it should be possible to compress the explanations.

1494  6.1. DISCOVER

1496     Prior to the receipt of a DISCOVER message, each server should have
1497     built up a list of BINDABLE IP addresses -- for two reasons.  First,
1498     because an UNBINDABLE COMPLETE POLL is required to get a BINDABLE
1499     IP address, and an UNBINDABLE COMPLETE POLL may not be possible
1500     due to server failure at any given instant.  Second, because even if
1501     an UNBINDABLE COMPLETE POLL were possible, it would be unwise to
1502     require such an operation between the receipt of a DISCOVER message and
1503     the response of an OFFER to a client.

   There are several cases involved in processing a DISCOVER request,
   depending on the state of the requested IP address in the DISCOVER
   request:

   o  No specific IP address requested.

      Offer a BINDABLE address to the client. Record that this
      address was offered in the cache memory of the server, but
      there is no need to update the stable storage of the server
      with any information. The IP address continues to be BINDABLE
      as far as the inter-server protocol is concerned.

   o  Requested IP address is UNBINDABLE.

      If the IP address is UNBINDABLE, then perform an UNBINDABLE
      COMPLETE POLL operation in an attempt to make the IP address
      BINDABLE. If the operation is successful, then respond as
      though the IP address were BINDABLE, below. If the attempt to
      make the IP address BINDABLE reveals that the IP address is now
      BOUND or PUSHED, then respond as for BOUND or PUSHED, below.
      Otherwise (i.e., the IP address is BINDABLE for some other
      server, or an UNBINDABLE COMPLETE POLL was not possible),
      respond as above for "No specific IP address requested".

   o  Requested IP address is BINDABLE.

      Offer the IP address to the client. The IP address remains
      BINDABLE.

   o  Requested IP address is BOUND or EXPIRED.

      If the IP address is BOUND or EXPIRED to the requesting client,
      then set it to BOUND and offer it to the client with a lease
      time of MAXIMUM-UNPUSHED-LEASE-TIME. Otherwise (i.e., the IP
      address is BOUND or EXPIRED to some other client), respond as
      in "No specific IP address requested", above.

   o  Requested IP address is PUSHED.

      If the IP address is PUSHED to the requesting client, then
      offer it to the client with a normal lease time. Otherwise
      (i.e., the IP address is PUSHED to some other client), respond
      as in "No specific IP address requested", above.

6.2. REQUEST/SELECTING

   The client uses a REQUEST/SELECTING to accept the offer of a lease
   made by a server. When a server receives such a message and the
   server-id option reflects the IP address of that server, the
   server should respond according to the state of the IP address:

   o  UNBINDABLE

      If the IP address is UNBINDABLE, then perform an UNBINDABLE
      COMPLETE POLL operation in an attempt to make the IP address
      BINDABLE. If that operation is successful, then respond as
      though the IP address were BINDABLE, below. If the attempt to
      make the IP address BINDABLE reveals that the IP address is now
      BOUND, then respond as for BOUND, below. Otherwise (i.e., the
      IP address is BINDABLE for some other server, or a complete
      POLL was not possible), NAK the REQUEST.

   o  BINDABLE

      If the IP address is BINDABLE and has been offered to the
      requester, then bind the IP address to the client, set the IP
      address BOUND, and update stable storage. Then, ACK the
      client, and finally perform a PUSH operation of the binding
      information to the other servers.

   o  BOUND or EXPIRED

      If the IP address is BOUND or EXPIRED to the requesting client,
      then set the state to BOUND, update the expiration time using
      the normal lease time, update stable storage, ACK the client
      with the MAXIMUM-UNPUSHED-LEASE-TIME, and perform a CLIENT
      BINDING COMPLETE PUSH with the normal lease time.

      If the IP address is BOUND or EXPIRED to a different client,
      then NAK this REQUEST.

   o  PUSHED

      If the IP address is PUSHED to the requesting client, set the
      IP address to be PUSHED, update the expiration time, update
      stable storage, and ACK the client. Finally, perform a CLIENT
      BINDING COMPLETE PUSH operation of the updated binding
      information to the other servers.

      Use the normal lease time in all of the above operations.

      If the IP address is PUSHED to some other client, then NAK the
      request.

6.3. REQUEST/INIT-REBOOT

   The client uses a REQUEST/INIT-REBOOT to query the server (as part
   of the client boot process) to determine if a "remembered" binding
   is still valid. The requested IP address will be in one of the
   following states:

   o  UNBINDABLE

      If the IP address is UNBINDABLE, then perform an UNBINDABLE
      COMPLETE POLL operation in an attempt to make the IP address
      BINDABLE. If the operation is successful, then respond as
      though the IP address were BINDABLE, below. If the attempt to
      make the IP address BINDABLE reveals that the IP address is now
      BOUND, then respond as for BOUND, below. Otherwise (i.e., the
      IP address is BINDABLE for some other server, or a complete
      POLL was not possible), NAK the REQUEST.

      DISCUSSION:

         This means that if a server creates a binding for a client
         and fails to PUSH the information to any other server prior
         to undergoing a server failure, and if the client is powered
         off prior to the time when it will issue a REBINDING
         message, it will not get back the same lease when it is
         powered back on. The reasoning for this (and the difference
         from the REBINDING case below) is that in this case the
         server has no way to determine if the requested address in
         the INIT-REBOOT request is current or perhaps very old
         indeed. In the REBINDING case the client is currently using
         the address, so the client at least believes that it is
         current and not in use by some other client. In this case,
         however, no such assumption is possible.

      In the case where a server which creates a binding fails prior
      to PUSHing the information about a lease to some other server,
      and the client which receives that binding makes a REBINDING
      request prior to either failing or being shut down, it will get
      back the existing binding upon restart and INIT-REBOOT, since
      the REBINDING will have caused a recovery of the binding
      information and that will have been distributed through a
      CLIENT BINDING COMPLETE PUSH.

   o  BINDABLE

      If the IP address is BINDABLE, then bind the IP address to the
      client, set the IP address BOUND, and update stable storage.
      Then, ACK the client, and finally perform a PUSH operation of
      the binding information to the other servers.

   o  BOUND or EXPIRED

      If the IP address is BOUND or EXPIRED to the requesting client,
      then set the state to BOUND, update the expiration time using
      the normal lease time, update stable storage, ACK the client
      with the MAXIMUM-UNPUSHED-LEASE-TIME, and perform a CLIENT
      BINDING COMPLETE PUSH with the normal lease time.

      If the IP address is BOUND or EXPIRED to a different client,
      then NAK this REQUEST.

   o  PUSHED

      If the IP address is PUSHED to the requesting client, then set
      the IP address PUSHED, update the expiration time, update
      stable storage, and ACK the client. Finally, perform a CLIENT
      BINDING COMPLETE PUSH operation of the updated binding
      information to the other servers. Use the normal lease time
      for all of the above operations.

      If the IP address is PUSHED to some other client, then NAK the
      request.

6.4. REQUEST/RENEWING

   Upon receipt of a RENEWAL message (which is unicast from the
   client to the server), it is expected that the server will have
   accurate information concerning the binding of the client, since
   this is the server that the client believes most recently sent an
   ACK to the client concerning this IP address binding.

   Perform the following actions if the IP address being renewed
   (i.e., the IP address in ciaddr) is in one of these states:

   o  UNBINDABLE

      If the IP address is UNBINDABLE, then perform an UNBINDABLE
      COMPLETE POLL operation in an attempt to make the IP address
      BINDABLE. If the operation is successful, then respond as
      though the IP address were BINDABLE, below. If the attempt to
      make the IP address BINDABLE reveals that the IP address is now
      BOUND, then respond as for BOUND, below.

      If the IP address is determined to be BINDABLE for some other
      server, then NAK the request, and set the IP address to be
      UNAVAILABLE since this likely represents a duplicate allocation
      of an IP address (see Section 11, Open Questions, for details).

      Otherwise NAK the request.

   o  BINDABLE

      If the IP address is BINDABLE, then bind the IP address to the
      client, set the IP address BOUND, and update stable storage.
      Then, ACK the client, and finally perform a PUSH operation of
      the binding information to the other servers.

   o  BOUND or EXPIRED

      If the IP address is BOUND or EXPIRED to the requesting client,
      then set the state to BOUND, update the expiration time using
      the normal lease time, update stable storage, ACK the client
      with the MAXIMUM-UNPUSHED-LEASE-TIME, and perform a CLIENT
      BINDING COMPLETE PUSH with the normal lease time.

      If the IP address is BOUND or EXPIRED to a different client,
      then NAK this REQUEST.

   o  PUSHED

      If the IP address is PUSHED to the requesting client, then set
      the IP address PUSHED, update the expiration time, update
      stable storage, and ACK the client. Finally, perform a CLIENT
      BINDING COMPLETE PUSH operation of the updated binding
      information to the other servers. Use the normal lease time
      for all of the above operations.

      If the IP address is PUSHED to some other client, then NAK the
      request and set the IP address to UNAVAILABLE (see Section 11,
      Open Questions, for details).

6.5. REQUEST/REBINDING

   Upon receipt of a REBINDING message (which is broadcast from the
   client), the server will check the state of the address requested
   for rebinding (i.e., the ciaddr). There are several cases
   possible:

   o  UNBINDABLE

      If the IP address is UNBINDABLE, then perform an UNBINDABLE
      COMPLETE POLL operation in an attempt to make the IP address
      BINDABLE. If the operation is successful, then respond as
      though the IP address were BINDABLE, below. If the attempt to
      make the IP address BINDABLE reveals that the IP address is now
      BOUND, then respond as for BOUND, below.

      If the IP address is determined to be BINDABLE for some other
      server, then NAK the request. Set the IP address to be
      UNAVAILABLE since this likely represents a duplicate allocation
      of an IP address (see Section 11, Open Questions, for details).

      If no information is returned from any server that this IP
      address is anything but UNBINDABLE, then consider the address
      BOUND to this client, and proceed as in BOUND below.

      DISCUSSION:

         This is one of the key points of the inter-server protocol.
         In this case, a server has created a binding and then failed
         prior to telling any other server about that binding.
         Eventually, the client to whom that binding was made will
         attempt a REQUEST/REBINDING and contact a different server.
         That different server will be able to determine nothing
         about that IP address. As far as can be determined, it is
         not BOUND to any client, and it is not BINDABLE for any
         other server. In this restricted case, the server will
         renew the lease for the client and move the IP address into
         the BOUND state -- and PUSH this information to the rest of
         the servers.

         How can this be safe? Well, remember that the client is
         presently using the IP address to make this request. In
         this limited case where a server crashes before PUSHing
         information about a BOUND IP address to any other server,
         the client to whom the IP address is BOUND is the only
         running machine with any record of that binding. In this
         case, the DHCP servers will accept that client's information
         about the binding as correct.

   o  BINDABLE

      If the IP address is BINDABLE, then bind the IP address to the
      client, set the IP address BOUND, and update stable storage.
      Then, ACK the client, and finally perform a PUSH operation of
      the binding information to the other servers.

   o  BOUND or EXPIRED

      If the IP address is BOUND or EXPIRED to the requesting client,
      then set the state to BOUND, update the expiration time using
      the normal lease time, update stable storage, ACK the client
      with the MAXIMUM-UNPUSHED-LEASE-TIME, and perform a CLIENT
      BINDING COMPLETE PUSH with the normal lease time.

      If the IP address is BOUND or EXPIRED to a different client,
      then NAK this REQUEST.

   o  PUSHED

      If the IP address is PUSHED to the requesting client then set
      the IP address PUSHED, update the expiration time, update
      stable storage, and ACK the client.
      Finally, perform a CLIENT BINDING COMPLETE PUSH operation of
      the updated binding information to the other servers. Use the
      normal lease time for all of the above operations.

      If the IP address is PUSHED to some other client, then NAK the
      request and set the IP address to UNAVAILABLE (see Section 11,
      Open Questions, for details).

6.6. RELEASE

   When a RELEASE is received, an IP address will be in one of the
   following states:

   o  UNBINDABLE

      If the IP address is UNBINDABLE, then perform a CLIENT BINDING
      POLL operation in an attempt to determine if this IP address is
      BOUND to any client.

      If the results of the POLL operation indicate that the IP
      address is now BOUND, then respond as for BOUND, below.

      If the IP address is determined to be BINDABLE for some other
      server, then NAK the request. Set the IP address to be
      UNAVAILABLE since this likely represents a duplicate allocation
      of an IP address (see Section 11, Open Questions, for details).

      Otherwise, ignore the RELEASE.

   o  BINDABLE

      If the IP address is BINDABLE, ignore the RELEASE.

   o  BOUND, PUSHED, or EXPIRED

      If the IP address is BOUND, PUSHED, or EXPIRED to the
      requesting client, set the IP address to be UNBINDABLE, update
      stable storage, and perform a CLIENT BINDING COMPLETE PUSH to
      update the other servers with this information.

6.7. Lease Period Expiration

   When the lease period on a BOUND or PUSHED IP address expires, set
   the IP address to be EXPIRED and update stable storage.

7. Group Management

   The group management part of the protocol is concerned with
   configuring a server into or out of a server group (SG). It
   allows discovery of information concerning the configuration of an
   existing server group as well as the address pools that are
   managed by a server group.
   While it is possible to conceive of a statically defined server
   group, the operational characteristics (both for group startup and
   for removal of a server from a group) are quite painful.

   Group management messages are used to add a server to a group as
   well as to remove a server from a group. A server must add itself
   to a group -- it cannot be added by another server. A server may
   be removed by any server in the group, including itself.

   In addition to changing the group membership, group management
   messages are used to keep the various servers up to date with
   respect to the current membership of the group.

   Once a server successfully becomes part of a group using the group
   management messages, it then enters the SCSP protocol. This
   protocol determines which servers in the SG are currently in
   communication with this server, and starts an automatic "cache
   alignment" process with each connected server.

7.1. Group Management Operations

   o  SG CHANGE

      The SG CHANGE operation is a two-stage operation made up of a
      propose phase and then a commit phase. It uses the SG PROPOSE
      CHANGE and SG COMMIT CHANGE messages as part of this operation.
      It is used to change the membership of the group, either to add
      a server or to remove a server.

7.2. Group Management Messages

   o  SG DISCOVERY QUERY

      The first stage of becoming a server participating in the
      inter-server protocol is to determine the existing SG ID for
      each SG for which participation in the inter-server protocol is
      desired.

      Assuming that a server has been provided with, or can discover,
      the IP address of a server that may be in a group that it wants
      to join, a server that wants to become a member of a group will
      send an SG DISCOVERY QUERY message to that server.

      The reply to the SG DISCOVERY QUERY message is a message which
      contains the list of SG identifiers for all of the groups to
      which the replying server belongs. These SG IDs can then be
      used in SG CONFIGURATION messages to determine more information
      about each SG.

      This operation is performed only upon one server at a time,
      since at this point there is no notion of a "current" server
      group.

   o  SG CONFIGURATION QUERY

      The SG CONFIGURATION QUERY operation has several suboperations,
      corresponding to the following types of configuration
      information: subnets, IP addresses, client configuration
      information, and vendor specific information.

      Each SG CONFIGURATION QUERY operation is read-only to the
      receiving server. The particular SG CONFIGURATION QUERY
      suboperations are:

      o  Subnets

         The specific subnets managed by this SG are returned as part
         of this operation.

      o  IP Addresses

         The IP addresses which are managed by this SG within this
         subnet are returned as the result of this operation.

      o  Client Configuration Information

         The client configuration information associated with this
         subnet is returned as the result of this operation.

      o  Vendor Specific Information

         Provision is made for vendor specific configuration
         information to be returned in the SG CONFIGURATION message.
         Its format is TBD, but should be regular even though vendor
         specific.

   o  SG PROPOSE CHANGE UPDATE

      The SG PROPOSE CHANGE UPDATE message is sent to all of the
      servers in an SG to propose a new membership in the server
      group. The information sent with this message is an updated
      list of the servers in the group. The servers to add to the
      group and servers to remove from the group are both listed in
      the same message.

   o  SG COMMIT CHANGE UPDATE

      The SG COMMIT CHANGE UPDATE message is sent to all of the
      servers in the SG to commit a change that was proposed in an SG
      PROPOSE CHANGE operation.

7.3. Initiating Group Management Operations and Messages

7.3.1. SG CHANGE (operation)

   The SG CHANGE operation consists of the following steps:

   o  Determine the group membership using an SG CONFIGURATION
      message.

      Find out to whom to send all of the SG CHANGE messages.

   o  Send an SG PROPOSE CHANGE message to every member of the SG.

      This message has the current group specifier in the message,
      along with the new group membership. As the joining server
      cycles through the existing members of the group, it will be
      rationalizing the group specifiers among the group and the
      entire group's picture of the membership of the group. If it
      encounters a server whose view of the group membership lags
      behind that of the server from which the joining server
      received its idea of group membership, then it will bring that
      server up to date.

      If, on the other hand, it encounters a server that has a more
      up to date version of the group membership than the one from
      which it is operating, it will have to update its idea of the
      group membership and then start the proposal sequence over.
      All of the servers with which it has created proposals will be
      forced to update their view of group membership as part of this
      process.

      At the end of this process of proposal generation, all of the
      servers in the group share a common picture of both the group
      membership and the current proposal.

   o  Reverify the group membership from at least one server using an
      SG CONFIGURATION message.

      This is to ensure that all of the members of the group have
      actually been sent an SG PROPOSE CHANGE message.

   o  Check the proposal timer.

      The initiating server must have started a timer when it sent
      out the first SG PROPOSE CHANGE message, and if that timer has
      less than half of its time left, the joining server SHOULD
      start the process over.

   o  Send an SG COMMIT CHANGE message to every member of the SG.

      As soon as this completes successfully with one server, the
      server has changed the membership of the group, but the
      initiating server MUST continue to try to update the other
      servers as long as they remain in the server group.

7.3.2. SG DISCOVERY QUERY (message)

   This is sent when a server wishes to know the groups of which
   another server is a member. It is used primarily when starting up
   a server in the initial discovery of the server group
   configuration.

7.3.3. SG CONFIGURATION QUERY (message)

   This message is sent to determine the details of the configuration
   of the server group. A server would typically initiate these
   messages as part of the process of confirming that it wished to be
   part of a particular server group.

   The SG CONFIGURATION QUERY operation has several suboperations,
   corresponding to the following types of configuration information:

   o  Subnets

      The specific subnets managed by this SG are returned as part of
      this operation.

   o  IP Addresses

      The IP addresses which are managed by this SG within this
      subnet are returned as the result of this operation.

   o  Client Configuration Information

      The client configuration information associated with this
      subnet is returned as the result of this operation.

   o  Vendor Specific Information

      Provision is made for vendor specific configuration information
      to be returned in the SG CONFIGURATION QUERY message. Its
      format is TBD, but should be regular even though vendor
      specific.

7.4. Responding to Group Management Messages

7.4.1. SG PROPOSE CHANGE UPDATE

   Upon receipt of an SG PROPOSE CHANGE UPDATE message, if no
   proposal exists that has not timed out, a server will create a
   single "proposed" group specifier from the current group specifier
   by incrementing the group sequence number by 1. The creation of
   this proposed group specifier will inhibit the creation of another
   proposed group specifier for 30 seconds.

   If a proposal exists that has not timed out, the responding server
   will respond negatively to the SG PROPOSE CHANGE UPDATE message.

   DISCUSSION:

      Clearly a deadlock situation can occur where two servers are
      trying to join a group at the same time, and each is working
      from "opposite ends" of the group. In this case, where the
      joining server gets a failure from an SG PROPOSE CHANGE UPDATE
      message due to the existence of a valid proposal that has not
      timed out, the joining server should back off for an amount of
      time that is based in part on its IP address before trying
      again. The exact algorithm is TBD.

   This proposed group specifier will not be used in any messages
   until it moves to the accepted stage and becomes the current group
   specifier (see below for how it does that).

   If a second SG PROPOSE CHANGE UPDATE request is received from a
   server, that message will supersede the existing proposal and the
   timer will be reset.

   DISCUSSION:

      Is there some possible attack here? Should we limit one
      server's proposals from tying up the "proposal" for more than 3
      minutes at a time, for instance?

7.4.2. SG COMMIT CHANGE UPDATE

   Upon receipt of an SG COMMIT CHANGE UPDATE message, the current
   proposal is compared with the data in the SG COMMIT CHANGE UPDATE
   message, and if it compares successfully, the proposed new group
   becomes the current group and the group specifier is changed.
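
   The responder-side handling of proposals and commits described in
   Sections 7.4.1 and 7.4.2 can be pictured with the following Python
   sketch. It is an illustration under stated assumptions, not part
   of the protocol definition: the class and method names are
   hypothetical, the group specifier is reduced to a bare sequence
   number, the commit comparison is assumed to cover the membership
   list and the sequence number, and a "second" proposal is assumed
   to supersede the first only when it comes from the same proposing
   server.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

PROPOSAL_LIFETIME = 30.0  # seconds, per Section 7.4.1


@dataclass
class Proposal:
    proposer: str
    members: FrozenSet[str]
    sequence: int
    expires_at: float


class GroupState:
    """One server's view of a server group (hypothetical structure)."""

    def __init__(self, members: Iterable[str], sequence: int = 0):
        self.members = frozenset(members)
        self.sequence = sequence
        self.proposal: Optional[Proposal] = None

    def on_propose(self, proposer: str, members: Iterable[str],
                   now: float) -> bool:
        """Handle SG PROPOSE CHANGE UPDATE; False is a negative reply."""
        live = self.proposal is not None and now < self.proposal.expires_at
        if live and self.proposal.proposer != proposer:
            return False  # another server's proposal is still valid
        # New proposal, or the same proposer superseding its own:
        # sequence is current + 1 and the 30-second timer restarts.
        self.proposal = Proposal(proposer, frozenset(members),
                                 self.sequence + 1,
                                 now + PROPOSAL_LIFETIME)
        return True

    def on_commit(self, members: Iterable[str], sequence: int) -> bool:
        """Handle SG COMMIT CHANGE UPDATE; adopt the proposal on a match."""
        p = self.proposal
        if p is None or (p.members, p.sequence) != (frozenset(members),
                                                    sequence):
            return False
        self.members, self.sequence = p.members, p.sequence
        self.proposal = None
        return True
```

   Whether a commit that arrives after the proposal timer has run out
   should still be accepted is not stated in the text, so the sketch
   does not check the timer in on_commit.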

   Once an SG COMMIT CHANGE UPDATE message is received, the receiving
   server MUST examine all of its IP addresses. For every IP address
   for which the "last transaction server" is a server which was
   previously in the group and is now not in the group, the following
   action should be taken:

   If the IP address is shown as ever having been BOUND to a client,
   and if that client does not now have a different IP address, then
   the IP address should be set to BOUND to that client, and the
   lease should be restarted with the previously recorded lease time.

   DISCUSSION:

      This is a key aspect of the protocol in terms of safely
      removing possibly partitioned servers from the group. The
      specific case that this protects against is as follows.

      If a connected server creates a client binding, successfully
      performs a CLIENT BINDING COMPLETE PUSH operation, renews its
      client's lease for the full lease time, and then becomes
      partitioned, there can be problems if that server is ultimately
      removed from the group much later. Suppose the server is
      partitioned for longer than the client's lease time, all of the
      other servers move this IP address to EXPIRED, and some server
      then tries (unsuccessfully) to perform an UNBINDABLE COMPLETE
      POLL, which moves the EXPIRED addresses to UNBINDABLE. Now the
      partitioned server has updated the client several times, and
      the other servers by this time all believe that the IP address
      is UNBINDABLE. If the partitioned server then fails and is
      removed from the SG, the other servers could (in the absence of
      the above algorithm) believe that they need only wait the
      MAXIMUM-UNPUSHED-LEASE-TIME before they can make those
      UNBINDABLE addresses BINDABLE. But in this case that would
      cause a failure.
      Thus, when a server is removed from an SG, each remaining
      server must look around for any IP addresses that it previously
      PUSHED, and set them up with their previous maximum lease time
      in order to catch this case.

7.4.3. SG DISCOVERY QUERY

   The server groups to which the current server belongs are returned
   as the response to an SG DISCOVERY QUERY message.

7.4.4. SG CONFIGURATION QUERY

   The SG CONFIGURATION QUERY operation has several suboperations,
   corresponding to the following types of configuration information:

   o  Subnets

      The specific subnets managed by this SG are returned as part of
      this operation.

   o  IP Addresses

      The IP addresses which are managed by this SG within this
      subnet are returned as the result of this operation.

   o  Client Configuration Information

      The client configuration information associated with this
      subnet is returned as the result of this operation.

   o  Vendor Specific Information

      Provision is made for vendor specific configuration information
      to be returned in the SG CONFIGURATION QUERY message. Its
      format is TBD, but should be regular even though vendor
      specific.

8. SCSP Message Mapping

   This section develops the SCSP capabilities supporting the DHCP
   interserver protocol. The Server Cache Synchronization Protocol
   (SCSP) is found in [1].
   This section is organized as follows: 1) we present a brief
   overview of SCSP (and refer to appendices for a more detailed
   discussion), 2) we discuss the mapping of the DHCP interserver
   protocol onto SCSP and how the various DHCP interserver messages
   are mapped into SCSP messages, 3) we identify the modifications to
   the SCSP protocol as identified in [1] necessary for the mapping
   of the DHCP interserver protocol onto SCSP, 4) we present the
   specific formats of the DHCP protocol specific SCSP records, and
   5) we present a list of the open issues with respect to the
   mapping onto SCSP.

8.1. SCSP Overview

   The Server Cache Synchronization Protocol (SCSP) is a protocol
   which provides the generic functions necessary to provide loose
   synchronization between a set of distributed databases. The
   protocol, which is presented in [2], was developed specifically to
   address the issues associated with synchronizing the caches of
   redundant servers which provide the server functionality of a
   specific client-server protocol. SCSP was built based upon the
   extensive experience in developing and running link state routing
   protocols such as OSPF [3]. Client-server protocols for which a
   redundant server capability is being developed using SCSP are NHRP
   [4] and ATM ARP [5]. Here we present the use of SCSP to
   synchronize servers supporting the DHCPv4 client-server protocol.

   The SCSP protocol consists of three separate sub-protocols:

   o  The "Hello" protocol: this protocol defines and maintains the
      status of the inter-server connection,

   o  The "Cache Alignment" protocol: this protocol defines the cache
      synchronization capability for new servers and servers that,
      for whatever reason, have lost synchronization, and

   o  The "Client State Update" protocol: this protocol provides the
      ongoing server cache synchronization through asynchronous
      client state updates.

   These sub-protocols define the semantics and high-level syntax of
   generic message sets and their exchanges in support of the
   capabilities provided. The SCSP associates replica databases into
   Server Groups (SG). The SCSP supports both point-to-point and
   point-to-multipoint connections between the local server (LS) and
   the directly connected servers (DCSes). We discuss each of these
   sub-protocols in more detail in the appendices below.

   SCSP defines five message types in the operation of the above
   subprotocols:

   o  Hello

   o  Cache Alignment (CA)

   o  Cache State Update (CSU) Solicit (CSU_Sol)

   o  CSU Request (CSU_Req)

   o  CSU Reply (CSU_Rep).

   The Hello and the CA messages are used within the Hello and the
   Cache Alignment subprotocols, respectively. The CSU_Sol, CSU_Req
   and CSU_Rep messages are used to distribute cache records between
   the distributed servers of a server group. Full records are
   called Client State Advertisement (CSA) records. Summary records,
   which are essentially pointers to the full records, are called
   Client State Advertisement Summary (CSAS) records.

   For a server to request a particular record, it can send a CSU_Sol
   message containing the CSAS to indicate the full record of
   interest.
   A server which receives a CSU_Sol is required to respond with a
   CSU_Req message containing the full CSA record associated with the
   CSAS of the CSU_Sol. The soliciting server follows the receipt of
   the CSU_Req with a CSU_Rep to acknowledge receipt. A server which
   wishes to communicate a full record to the rest of the SG would
   transmit a CSU_Req message containing the full CSA record. This
   is acknowledged with a CSU_Rep message.

   DISCUSSION:

      In some cases the CSU_Sol, CSU_Req, CSU_Rep sequence is
      overkill when one wants to perform a simple query operation.
      See the discussion at the end of Section 8.3 for more details.

   For now we accept that these capabilities are generically provided
   and discuss the DHCPv4 interserver protocol specific overlay on
   SCSP.

8.2. Mapping DHCP interserver onto SCSP

   This section presents the relationship of SCSP to the DHCP
   interserver protocol, the assumptions made in developing this
   relationship, and the specific mappings of DHCP interserver
   messages into SCSP.

   The assumptions made in defining the DHCP client/server protocol
   mapping onto SCSP are the following:

   o  On the Issue of Protocol Encapsulation:

      The assumption is that the SCSP messages, and in fact all
      interserver messages, are to be defined over UDP. Currently
      the SCSP messages within [2] are LLC/SNAP encapsulated.

   o  On the Interserver over SCSP Layering Model:

      The interserver group management protocol will initialize a
      server into the group upon initial join, re-booting or
      re-connecting. Once this is complete the interserver group
      management protocol will initialize the SCSP protocol to handle
      the ongoing operation of the interserver cache alignment and
      address management functions.

   o  On the DHCP Interserver Sub-Protocols:

      The current thinking goes as follows.
      The draft specification defines three DHCP interserver
      sub-protocols, i.e., the 'Client Binding Management' protocol
      (see Section 4), the 'Address Management' protocol (see Section
      5), and the 'Group Management' protocol (see Section 7).  The
      'Client Binding Management' sub-protocol addresses the core of
      the interserver protocol in that it distributes and maintains the
      client binding records over the distributed SG.  This
      sub-protocol is to be mapped onto SCSP and is assigned a unique
      SCSP 'Protocol ID' value, e.g., the SCSP ProtID = 4 assigned to
      DHCP.  For this draft we assume that the Group Management
      sub-protocol is run on a separate UDP port from the SCSP UDP
      port.  The Group Management sub-protocol will be assigned a
      unique UDP port number (tbd).  We had no compelling reason to
      carry the Address Management sub-protocol on SCSP as we do the
      Client Binding sub-protocol; however, for this draft we maintain
      both of these sub-protocols within SCSP.  If at a later date it
      is deemed useful to separate these two protocols, 1) we can
      define separate SCSP protocol types for the Cache Management and
      the Address Management protocols, yet support them with a common
      Hello protocol link via the Hello protocol Family type field, or
      2) we can move the Address Management sub-protocol out of SCSP,
      as in the case of the Group Management sub-protocol.

      The mappings between the interserver messages and the SCSP
      messages will cover the interserver messages handling client
      binding and address management, but not the group management
      protocol functions of the interserver protocol.  The group
      management messages are to be defined outside of SCSP; however,
      these messages will follow the syntax of the SCSP message sets to
      simplify the parsing of the total message sets required within
      the DHCP interserver protocol.
      The client binding management operations are CLIENT BINDING
      COMPLETE PUSH and CLIENT BINDING POLL.  CLIENT BINDING COMPLETE
      PUSH is required to distribute binding information and to
      increase the initial lease period to the desirable lease period.
      The CLIENT BINDING POLL is required to solicit information on
      client bindings in the event that the specific server has no
      record of the client requested binding.  The Interserver messages
      supporting these operations are the CLIENT BINDING UPDATE and the
      CLIENT BINDING QUERY messages, respectively.  The SCSP records
      for these operations are 'Binding' records for both the update
      and the query messages.

      The Address Management operations are UNBINDABLE COMPLETE POLL
      and TRANSFER.  The UNBINDABLE COMPLETE POLL initializes an
      address as bindable by the LS.  The TRANSFER allows for the
      transfer of a block of bindable addresses between servers.  The
      Interserver messages supporting these operations are the
      UNBINDABLE QUERY and the TRANSFER messages.  The SCSP records for
      these operations are 'Address' records for the UNBINDABLE QUERY
      and 'Bindable Block Address' records for the TRANSFER messages.

      The Group Management messages are SG DISCOVERY QUERY, SG
      CONFIGURATION QUERY, SG PROPOSE CHANGE UPDATE and SG COMMIT
      CHANGE UPDATE.  The SCSP records associated with these operations
      are 'SG Specifier' records for the SG DISCOVERY QUERY, 'SG
      Subnets' records for the SG CONFIGURATION QUERY, 'SG Members'
      records for the SG DISCOVERY QUERY, and 'SG Proposed Members'
      records for the SG PROPOSE CHANGE UPDATE and SG COMMIT CHANGE
      UPDATE messages.

   o  On DHCP Interserver Authentication:

      The interserver protocol will rely on the authentication
      extensions within SCSP for the SCSP message authentication
      between servers within a server group.
      The authentication of the interserver group management protocol
      messages is tbd.

   o  On the Notion of Server Ownership of Binding Records:

      It will be assumed that once the initial client binding record is
      generated by a particular server, that record will indicate that
      server as the originating server in the SCSP 'Originating Server
      ID' field.  For any further change to that binding, whether made
      by the originating server or by another server (e.g., the
      originating server is down and the client is rebinding and
      obtains a lease extension from another server), the server making
      the change sets the Originating Server ID field of the SCSP
      record to indicate itself as the last transaction server.

   o  On a More Efficient Cache Alignment Process:

      The cache alignment process can be made more efficient if the
      servers time stamp their cache records.  In the event that the
      connections between servers fail, the servers determine and
      record the failure time.  Upon reconnecting and cache alignment,
      the SCSP CRL can be limited to those records that are more
      'recent' than the failure, which greatly reduces the time and the
      bandwidth required.  The details are presented below.

      Also, it is not necessary to perform a cache alignment of the
      address records for the proper operation of the Interserver
      protocol.  Therefore, we assume that the SCSP cache alignment
      process will not include these address records when building the
      SCSP CRL.

   o  On the More Recent Record Determination:

      SCSP relies on the ability to identify the more recent of two
      records, based upon the CSA Sequence Number, when aligning and
      updating the cache.  For binding records this implies that in
      situations where it is clear that a single server is updating the
      binding, e.g., extending the lease, that server should increment
      the CSA Sequence Number by one.
      However, there are situations in DHCP where multiple servers can
      simultaneously update the client binding and it is not clear
      which of these updated bindings is accepted by the client, e.g.,
      the client is in the rebinding state, the originating server is
      down, the other servers receive the client's broadcast request,
      and the client gets multiple DHCPACKs extending the lease.  In
      these situations each server is required to increment the CSA
      Sequence Number by one and to indicate that it is the last
      transaction server.  Then, when a server caches the record, if it
      already has a cache record for that binding (as indicated by the
      Cache Key), it should replace the existing record only if the new
      record indicates a lease period which is greater than that of the
      existing record.

   o  On Maximally Defined Binding Records (or B. Hibbs' Question):

      B. Hibbs posed the question regarding the nature of the
      configuration synchronization of the servers within the same SG:
      does the DHCP Interserver protocol require synchronization of all
      configuration parameters or of a subset?  We are assuming that
      there is a minimal set of configuration and client binding
      information to be synchronized across the members of the SG to
      ensure the correct operation of the DHCP Client/Server protocol.
      This information must be carried in the interserver messages to
      synchronize the members of the SG with respect to this
      information.  Further, there may be other client binding
      information that the members want to communicate; we currently
      have this information encoded as optional in this draft.

      The parameters encoded into the 'Client Binding' records are
      those which are minimally required for the correct operation of
      the DHCP Client/Server protocol.
      The interserver protocol should allow for situations where the
      configurations of the servers of the same server group are not
      strictly aligned; their configurations are only required to be
      aligned in the specification of the subnets and masks that are
      covered by the SG and the list of assignable addresses within
      each of the subnets.  However, because clients' DHCPDISCOVER
      messages can contain client specific requests for parameters, it
      may be desirable to embed a fuller set of parameters (committed
      to the client in the DHCPOFFER message) within the CSA record.
      This fuller set of parameters may be included in the initial
      CLIENT BINDING COMPLETE PUSH (encoded in the optional fields
      location in the record).  A server in receipt of a CLIENT BINDING
      COMPLETE PUSH may choose not to cache or forward these optional
      parameters.

   o  On Knowledge Obtained Through the SCSP Hello Protocol:

      The SCSP Hello protocol maintains the current status of the
      interserver connectivity through a polling mechanism.  This
      status information can be used to influence the actions of the
      LS, e.g., in the event that the LS has lost connectivity to a
      DCS, it should not perform a COMPLETE POLL operation.

   o  On the SG Connectivity:

      It is likely that the servers of the SG are required to be fully
      interconnected, i.e., an LS is a DCS to all other servers of the
      SG.  It was first thought that this would aid in determining the
      status of the SG, i.e., whether the SG was 'up' (fully
      functioning) or 'down' (not fully functioning).  However, on
      further inspection this is not true, i.e., the loss of
      connectivity between a pair of servers in a fully connected SG
      does not imply that the remaining servers have lost connectivity
      to one another.
      Full mesh connectivity may still be required for the correct
      operation of the Address Management protocol.  This is currently
      under study.

   When a new server wishes to join a server group, it must initialize
   itself to the other members of the server group through the
   interserver Group Management Protocol defined above.  Once this has
   occurred, the local server must initiate SCSP, which then aligns its
   client binding cache to that of the server group.  It should then
   acquire Bindable addresses and fully participate in the on-going
   client binding update functions of the server group.

   This process is outlined in the state diagram below for the DHCP
   interserver protocol.  The Group Management protocol handles the new
   server joining the group.  Once this has occurred, the new server
   and all the other servers of the server group initiate the SCSP
   Hello Protocol on a pairwise basis.  Per the discussion in the SCSP
   specification, once bi-directional connectivity is verified and is
   being monitored by the SCSP Hello protocol, the servers enter into
   the cache alignment and then the ongoing cache and address
   management functions.  In the event that the servers transition to
   the 'DOWN' state, polling will continue until connectivity is
   re-established.

   The Group Management Protocol does not allow additions to the
   membership in the event that the SG is down.  However, it does allow
   for the removal of a server from the SG while another server is
   re-booting or disconnected.  Therefore a re-booting or re-connecting
   server cannot be assured that the SG generation has remained
   constant during the 'DOWN' period.
   Therefore, in the event that the generation number of the SG has
   changed, as indicated through the generation number contained within
   the interserver messages, the server needs to update its notion of
   the server group through the procedures identified in the group
   management protocol prior to aligning its cache.

                  +------------+
                  |   Group    |
                  | Management |
                  |  Protocol  |
                  +------------+
                        |
                        |
                        V
                  +------------+
                  |    SCSP    |
                  |   Hello    |
                  +------------+
                    /   ^    \
                  /     |      \
                V       |        V
     +--------------+   |    +---------------+
     |'Binding Mgmt'|   |    |Null'Addr Mgmt'|
     |    Cache     |---+----|     Cache     |
     |  Alignment   |   |    |   Alignment   |
     +--------------+   |    +---------------+
            |           |           |
            |           |           |
            V           |           V
     +--------------+   |    +------------+
     |'Binding Mgmt'|   |    | 'Addr Mgmt'|
     | Cache Update |---+----|Cache Update|
     +--------------+        +------------+

         Figure 8.2-1 Interserver State Flow Diagram

   For operational efficiency, the servers should implement a scheme to
   limit the number of cache records to exchange during the cache
   alignment process.  For example, an SG could easily be managing
   10,000 client records, and the bandwidth required to pass even the
   summary records needed to build the CRL can be quite large.
   Therefore, for the 'Cache Management' sub-protocol, the servers
   should record the times at which the cache entries were received,
   created or modified.  When the CAFSM for a particular DCS
   transitions to the down state, the time of the transition, t(down),
   should be recorded.  Then, when the CAFSM enters the cache alignment
   state, the CRL is to be built from only those records with time
   stamps more recent than t(down) - F, where F is a factor to be set
   to a multiple of HelloInterval x DeadFactor.  We recommend that the
   multiple be 10.
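
   The cutoff rule above can be sketched as follows.  This is only an
   illustration under stated assumptions: the names CacheRecord and
   crl_candidates, and the HelloInterval and DeadFactor values used in
   the example, are hypothetical and are not defined by this draft or
   by SCSP.

```python
from dataclasses import dataclass


@dataclass
class CacheRecord:
    cache_key: bytes
    timestamp: float  # time the entry was received, created or modified


def crl_candidates(records, t_down, hello_interval, dead_factor, multiple=10):
    """Return the records whose time stamp is more recent than t(down) - F."""
    # F is a multiple of HelloInterval x DeadFactor (the text recommends 10).
    f = multiple * hello_interval * dead_factor
    cutoff = t_down - f
    return [r for r in records if r.timestamp > cutoff]


# Example values are assumptions chosen only to make the arithmetic visible:
# F = 10 * 3 * 10 = 300, so the cutoff is 900 - 300 = 600.
records = [
    CacheRecord(b"a", 100.0),
    CacheRecord(b"b", 700.0),
    CacheRecord(b"c", 1000.0),
]
recent = crl_candidates(records, t_down=900.0, hello_interval=3.0, dead_factor=10)
print([r.cache_key for r in recent])
```

   Only records "b" and "c" fall after the cutoff, so only their
   summaries would need to be exchanged during alignment.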
   In the event that the LS crashed (causing the transition to the down
   state), t(down) should be set to the last record time stamp when the
   LS reboots.  In the event that the server has just joined the SG,
   the CRL should be built from all of the current cache records.

   The interserver messages associated with Client Binding Management
   are: CLIENT BINDING QUERY for the CLIENT BINDING POLL operation, and
   CLIENT BINDING UPDATE for the CLIENT BINDING COMPLETE PUSH
   operation.  These are discussed in detail in the following list
   items:

   o  The CLIENT BINDING QUERY message queries another server regarding
      the status of a particular binding.  Within SCSP, this exchange
      is accomplished by the LS sending a CSU_Solicit (CSU_Sol) message
      with the Client State Advertisement Summary (CSAS) 'Address
      record' of the IP address in question.  The DCS responds with a
      CSU_Request message carrying the Client State Advertisement (CSA)
      record associated with the CSAS.  The LS then replies with a
      CSU_Reply with the 'A-bit' set.

   o  The CLIENT BINDING UPDATE message updates another server with a
      new, or changed, client binding.  Within SCSP, this exchange is
      accomplished with a CSU_Request message carrying the specific CSA
      'Binding record' of the client binding in question.  The DCS
      responds with a CSU_Reply with the 'A-bit' set.

   The interserver messages associated with Address Management are:
   UNBINDABLE QUERY for the UNBINDABLE COMPLETE POLL operation, and
   TRANSFER messages for the TRANSFER operation.  These are discussed
   in detail in the following list items:

   o  The UNBINDABLE QUERY message queries another server of the SG
      regarding the status of a particular address with the intent of
      making that address bindable to the LS.
      Within SCSP, this exchange is accomplished by the LS sending a
      CSU_Solicit with the CSAS 'Address' record of the IP address in
      question to all other servers of the SG.  The DCSes respond with
      a CSU_Request message carrying the CSA 'Address' record, which
      indicates the status of the address within the DCS.  The LS then
      replies to each DCS with a CSU_Reply message with the 'A-bit'
      set.

   o  The TRANSFER operation is initiated by the LS to request a
      transfer of bindable addresses from the DCS to the LS.  Within
      SCSP, this exchange is accomplished by a two-step process.
      First, the LS sends a CSU_Request message with the CSA 'Subnet
      Bindable Addresses' record to the DCS, which then responds with a
      CSU_Reply.  The CSA 'Subnet Bindable Addresses' record indicates
      the subnet in question, the number of BINDABLE addresses owned by
      the LS and the number of additional BINDABLE addresses the LS is
      requesting.  Second, this is immediately followed by the DCS
      sending a CSU_Request message with a CSA 'Subnet Bindable
      Addresses' record for the subnet in question.  The DCS' CSA
      'Subnet Bindable Addresses' record indicates the subnet in
      question and the number and the values of the IP addresses that
      the DCS is transferring to the LS in response to its previous
      request.  This is based upon the DCS' current understanding of
      the supply of bindable addresses within the LS and its local
      knowledge of its own set of bindable addresses for this subnet.
      This CSU_Request will generate a CSU_Reply from the originating
      LS.  When sending the CSU_Request message, the DCS sets the
      addresses it is transferring to the LS as UNBINDABLE.  The LS
      then moves these addresses to its list of BINDABLE addresses and
      sends a CSU_Reply to the DCS with the 'A-bit' set.
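
   The two-step TRANSFER exchange can be sketched as follows.  This is
   a minimal illustration, not a definition of the protocol: the class
   and method names are hypothetical, the SCSP message framing and the
   CSU_Reply acknowledgements are omitted, and the rule the DCS uses to
   decide how many addresses to give away is our own assumption (the
   text above only says the decision is based on the two servers'
   supplies).

```python
from dataclasses import dataclass, field
from ipaddress import IPv4Address, IPv4Network


@dataclass
class TransferRequest:
    """Step 1, LS -> DCS: the LS's 'Subnet Bindable Addresses' record."""
    subnet: IPv4Network
    owned: int   # BINDABLE addresses the LS already owns on this subnet
    wanted: int  # additional BINDABLE addresses the LS is requesting


@dataclass
class TransferGrant:
    """Step 2, DCS -> LS: the addresses the DCS is handing over."""
    subnet: IPv4Network
    addresses: list


@dataclass
class Server:
    name: str
    bindable: dict = field(default_factory=dict)    # subnet -> set of addresses
    unbindable: dict = field(default_factory=dict)  # subnet -> set of addresses

    def request_transfer(self, subnet, wanted):
        owned = len(self.bindable.get(subnet, set()))
        return TransferRequest(subnet, owned, wanted)

    def handle_request(self, req):
        pool = self.bindable.setdefault(req.subnet, set())
        # Hypothetical policy (not from the draft): give away at most half
        # of the difference between the two supplies, capped at the request.
        surplus = max(0, (len(pool) - req.owned) // 2)
        granted = sorted(pool)[:min(req.wanted, surplus)]
        # The DCS marks the addresses UNBINDABLE as it sends the grant.
        pool.difference_update(granted)
        self.unbindable.setdefault(req.subnet, set()).update(granted)
        return TransferGrant(req.subnet, granted)

    def accept_grant(self, grant):
        # The LS moves the addresses onto its BINDABLE list.
        self.bindable.setdefault(grant.subnet, set()).update(grant.addresses)


subnet = IPv4Network("192.0.2.0/24")
dcs = Server("dcs", bindable={
    subnet: {IPv4Address(f"192.0.2.{i}") for i in range(10, 20)}})
ls = Server("ls", bindable={
    subnet: {IPv4Address("192.0.2.1"), IPv4Address("192.0.2.2")}})

grant = dcs.handle_request(ls.request_transfer(subnet, wanted=6))
ls.accept_grant(grant)
print(len(grant.addresses), len(ls.bindable[subnet]), len(dcs.bindable[subnet]))
```

   Note that an address is marked UNBINDABLE at the DCS before the LS
   marks it BINDABLE, so at no point do both servers consider the same
   address bindable.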
   The interserver messages associated with the Group Management
   operations are: SG DISCOVERY QUERY, SG CONFIGURATION QUERY, SG
   PROPOSE CHANGE UPDATE, and SG COMMIT CHANGE UPDATE messages.  These
   are discussed in detail in the following list items:

   o  The SG DISCOVERY QUERY message queries the DCS for the list of
      SGs in which it is currently participating.  Within SCSP, this
      exchange is accomplished by the LS sending a CSU_Solicit with the
      CSAS 'Server Groups' record; the DCS replies with a CSU_Request
      message containing the CSA 'Server Groups' record.  This record
      contains the list of SG specifiers, i.e., SG ID and SG Generation
      Number (GN) pairs.  The LS replies with a CSU_Reply.

   o  The SG CONFIGURATION QUERY message queries the DCS for its
      configuration information.  This information is passed within the
      'SG Subnets Configuration' record.  The LS initiates this query
      by sending a CSU_Solicit containing the CSAS 'SG Subnets
      Configuration' summary record.  The DCS responds with a
      CSU_Request containing the CSA 'SG Subnets Configuration' record.
      The LS replies with a CSU_Reply message.

   o  The SG PROPOSE CHANGE UPDATE message proposes the new member to
      the rest of the SG.  This is accomplished with an SCSP CSU_Req
      message carrying the 'SG Proposed Members' record.  The SG COMMIT
      CHANGE UPDATE message consummates the new server joining the SG.
      Once the joining member has received a positive CSU_Reply from
      all of the current members of the SG as part of the proposal
      phase, it moves to the join commit phase.  The new server now
      issues an SCSP CSU_Req message with the 'SG Members' record
      carrying the list of servers of the SG with the newly joined
      member added.

   o  The SG PROPOSE CHANGE UPDATE message may also be used to propose
      the removal of an existing server from the membership of the SG.
      This is accomplished with an SCSP CSU_Req message carrying the
      'SG Proposed Members' record containing all of the existing
      members of the SG minus the server ID to be removed.  The SG
      COMMIT CHANGE UPDATE message consummates the existing server
      leaving the SG.  Once the removing member, i.e., the member who
      is actively removing the existing member from the group, has
      received a positive CSU_Reply from all of the current members of
      the SG (except for the member being removed) as part of the
      proposal phase, it moves to the remove commit phase.  The
      removing server now issues an SCSP CSU_Req message with the 'SG
      Members' record carrying the new membership minus the removed
      server.

8.3. Necessary Modifications to SCSP

   The SCSP modifications required to support the DHCP interserver
   protocol are as follows:

   o  The operation of SCSP in this application is initiated upon the
      successful completion of the interserver 'Group Management
      Protocol'.

   o  The SCSP messages, and in fact all of the DHCP interserver
      messages, are carried in UDP packets.  Therefore a UDP port
      number needs to be defined for SCSP.

      DISCUSSION:

         Currently SCSP is defined only for NBMA networks.  This
         manifests itself in two ways: a) the operation of SCSP is
         initiated upon the establishment of NBMA connectivity, i.e., a
         virtual circuit being established, and b) the SCSP messages
         are encapsulated into link level frames using the LLC/SNAP
         encapsulation method.

         Instead of relying upon the establishment of a virtual circuit
         connection, the interserver protocol will initiate SCSP based
         upon the results of the 'Group Management Protocol'.  This
         divorces the operation of the interserver protocol from the
         specifics of the link layer.
         Also, by carrying the messages within UDP, the protocol
         achieves independence in the deployment and proximity of the
         servers which are members of the same server group, i.e.,
         servers are not required to have an interface on a common
         subnet.

         Because SCSP provides a generic capability to synchronize
         caches in distributed servers, it is best to define one UDP
         port number for the 'generic' SCSP protocol and a separate UDP
         port number for the DHCP interserver Group Management
         protocol.  These UDP port numbers are tbd.

   o  An SG Generation Number SCSP extension field needs to be defined.

      DISCUSSION:

         We have defined the notion of a Server Group Generation Number
         to distinguish between the various instantiations of a
         particular SG.  The membership of a particular SG will change
         over time.  Because it is necessary for the correct operation
         of the DHCP interserver protocol for each server to know the
         current membership, it was deemed necessary to define a
         Generation Number which is incremented each time a new server
         joins the SG or an existing server is removed from the SG.
         This GN is to be carried in every interserver message.  No
         obvious place exists within the SCSP message formats to carry
         such information.  Therefore, we have chosen to define a new
         SCSP extension type and will carry the GN in this manner.

   o  Some modification to the Authentication extension in SCSP may be
      required.

      DISCUSSION:

         Currently SCSP states that the authentication extension covers
         the SCSP message other than the extensions.  However, we have
         chosen to carry a new extension within the SCSP messages: the
         Generation Number.  Ideally we would prefer that this
         extension be protected by the authentication extension.
         Because it is not, we will also include the Generation Number
         in the SG Specifier record.
         Through this record a server may reverify the current
         Generation Number through a protected channel.

   o  The three-step Solicit/Request/Reply sequence seems excessive
      when one server wishes to simply query another server.  Perhaps
      this could be simplified (when desirable) by adding a bit to the
      CSU_Solicit message indicating whether or not the soliciting
      server wishes the DCS to expect a CSU_Rep from the soliciting
      server.

      DISCUSSION:

         Currently SCSP specifies a three-step process: a CSU_Sol,
         followed by a CSU_Req, which is then followed by a CSU_Rep.
         In certain situations this may be a desirable sequence.
         However, in other situations it may not be necessary.  When
         the CSU_Sol is sent, a CSUSReXmtInterval timer is set which
         tracks the status of the receipt of the requested CSU_Req
         records.  For simple queries, this re-transmit timer may be
         sufficient.  Therefore, it seems reasonable that the DCS need
         not always expect a CSU_Rep from the LS which sent the CSU_Sol
         message.

8.4. DHCP Specific CSA and CSAS Records

   This section presents the CSA and the CSAS records specific to the
   DHCP inter-server protocol.  The mappings of the interserver
   protocol onto SCSP messages discussed in the previous section rely
   upon the definition of a number of record types.  These record types
   will be distinguished within the CSAS defined 'Cache Key', which for
   the purpose of running the DHCP interserver protocol will consist of
   a Type/Key pair.  The following CSAS and CSA record types are
   required to run the interserver protocol:

   For Client Binding Management:

   o  Binding Record - contains the complete client binding
      information.

   For Address Management:

   o  Address Record - contains the status of a specific IP address,
      e.g., unbindable, bindable, bound, expired, etc.
   o  Subnet Bindable Record - contains information regarding the
      subnet addresses, e.g., the number of bindable addresses.

   For Group Management:

   o  SG Specifier Record - contains the current Server Group
      specifiers, i.e., the SG ID (which is fixed for the duration of
      the life of the SG) and the SG Generation Number, which is
      incremented each time a server is added to or deleted from the
      SG.

   o  SG Members Record - contains the current list of member servers
      of the SG.

   o  SG Subnets Configuration Record - contains a list of all of the
      subnets served by the SG, i.e., subnet address and mask, as well
      as the assignable addresses per subnet, and potentially other
      configuration parameters necessary for the proper operation of
      the DHCP interserver protocol.

   o  SG Proposed Members Record - contains a list of the proposed
      member servers of the SG used in the group join proposal process.
      This record has a finite duration associated with it and times
      out if the proposed join fails.

8.4.1. The SCSP CSAS Records for the Interserver Protocol

   The CSAS record is completely specified in [2].
   The format of the CSAS record is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Hop Count           |         Record Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Cache Key Len |  Orig ID Len  |N|           unused            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      CSA Sequence Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Cache Key (variable)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Originator ID (variable)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 8.4.1-1 SCSP CSAS Record Format

   where:

   o  Hop Count - this represents the number of hops that the record
      may take before being dropped.

   o  Record Length - this is the length in bytes of the CSAS record if
      stand-alone; otherwise it is the length in bytes of the CSAS
      record and the protocol specific part of the cache entry
      combined, i.e., the length of the CSA record.

   o  Cache Key Length - this is the length of the Cache Key field in
      bytes.

   o  Originator ID Length - this is the length of the Originator ID
      field in bytes.

   o  N bit - this bit, when set, signifies a Null record.  This may be
      the case when the LS receives a solicitation for a record that
      has been released by the DHCP client.

   o  CSA Sequence Number - this field contains the sequence number
      that identifies the 'newness' of the CSA record instance being
      summarized.  This number is assigned by the originator of the CSA
      record, i.e., the last transaction server.

   o  Cache Key - this is an opaque string used by the receiving server
      to identify the cache entry referred to by the record.
      For the purposes of running the DHCP interserver protocol, the
      Cache Key will be encoded as a Type/Key pair, where the Type is
      an 8 bit field and the length of the Key is derived from the
      Cache Key Length field in the header.  The Type indicates the
      type of record and, equivalently, the Interserver message type,
      e.g., Unbindable Address Query, SG Configuration Query, etc.  The
      8 bit Type encodings are defined in the table below.

   o  Originator ID - this field contains an ID which is
      administratively assigned to the server which is the originator
      of the CSA record.  For the DHCP interserver mapping, the
      Originating Server ID is chosen to be the IP address of the
      server.  In the event that the server has multiple IP addresses
      assigned to it, the Originating Server ID is set to the IP
      address with the highest value.

   The CSAS record is specified by SCSP except for the specifics of the
   Cache Key and the Originator ID.

   For the purpose of the DHCP interserver specification, the
   Originating Server ID is chosen to be the IP address of the server.
   In the event that the server has multiple IP addresses assigned to
   it, the Originating Server ID is set to the IP address with the
   highest value.

   The Cache Key used is dependent upon the specific CSAS record in
   question.  The table below identifies the specific Cache Keys for
   the various CSAS records within the DHCP interserver protocol.
   These are composed of a type and a key field, both of which are
   identified in the table.

   Table 8.4.1-1 Cache Keys for the various CSAS and CSA records

   Record Type            | Encoding | Key
   --------------------------------------------------
                          |          |
   Client Binding         |   0x00   | Client ID
                          |          | or hwaddr
   Address                |   0x10   | IP addr
                          |          |
   Subnet Bindable Addrs  |   0x11   | Subnet/Mask *
                          |          |
   SG Specifiers          |   0x20   | IP addr
                          |          |
   SG Subnet Configs      |   0x21   | SG ID
                          |          |
   SG Members             |   0x22   | SG ID/SG GN **
                          |          |
   SG Proposed Members    |   0x23   | SG ID/SG GN **

   *  The subnet address and the subnet mask will be encoded as 32 bit
      strings with the subnet address followed by the subnet mask.

   ** The SG ID and SG GN are encoded as 16 bit strings with the SG ID
      first, immediately followed by the SG GN.

8.4.2. The SCSP CSA Records for the Interserver Protocol

   Several DHCP specific CSA record types are defined, one for each of
   the CSAS record types discussed above and listed in Table 8.4.1-1.

   For many of these records, DHCP options appear in the records in the
   same format as specified in [7].

   The records are:

   o  The Client Binding record carries the complete client binding
      information.  The Key for this record is the chaddr or the
      'client ID' from the optional DHCP extension.  This is utilized
      in the Cache Mgmt sub-protocol in handling the COMPLETE PUSH,
      POLL and SCSP cache alignment operations.

   o  The Address record carries the information required to achieve
      the desired response from the CSU_Solicit message.  The Key is
      the IP address.  This is utilized in the Address Mgmt
      sub-protocol in handling the UNBINDABLE COMPLETE POLL operation.

   o  The Subnet Bindable Address record carries the information
      required to determine the status of the available IP addresses
      which are bindable to the DCS and which it is willing to transfer
      to the LS.
      The Key for this record is the subnet address and mask of the
      subnet in question.  This is utilized in the Address Mgmt
      sub-protocol by the TRANSFER operation.

   o  The SG Specifier record contains the total list of SG specifiers,
      i.e., SG ID and SG GN pairs, of which the server in question is
      currently a member.  This is utilized in the Group Mgmt
      sub-protocol by the DISCOVERY operation.  The Key for this record
      is the Server ID, i.e., the IP address of the server.

   o  The SG Members record contains a list of the Server IDs which
      comprise the SG in question.  This is utilized in the Group Mgmt
      sub-protocol by the DISCOVER MEMBERS operation.  The Key for this
      record is the SG Specifier, i.e., the SG ID and SG GN pair.

   o  The SG Proposed Members record contains a list of the SG members,
      including the newly proposed member, of the server group.  This
      is utilized in the Group Mgmt sub-protocol by the PROPOSE JOIN
      operation.  The Key for this record is the SG Specifier, i.e.,
      the SG ID and SG GN pair, where the SG GN is one greater than the
      current GN of the SG.

8.4.2.1. Binding Records

   The approach taken in defining the Client Binding record is as
   follows.  It is possible, while still maintaining the correct
   operation of the DHCP client/server protocol, to have different
   server configurations within the same server group with respect to
   certain parameters.  For these parameters we do not require
   synchronization of the server configurations and we make the passing
   of these parameters optional.  However, there are some configuration
   parameters and binding information which are critical to the correct
   operation of the protocol.  We require that these be included in the
   Client Binding records.
   The minimal, required set of parameters to be sent in the Client
   Binding is the IP address (ciaddr), the lease period, the last
   transaction type, the client hardware address, the
   Client-Identifier and the Renewal (T1) and Rebinding (T2) Time
   values (if present in the DHCP options extensions of the DHCPACK).

   The format of the CSA Binding record for the DHCP inter-server
   protocol is:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    CSAS Record (variable)                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  LTT  |resrv'd|     HTYPE     |     HLEN      |    resrv'd    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    CHADDR (HLEN in octets)                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       CIADDR (4 octet)                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                Last Transaction Time (4 octet)                |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      IP Address Lease Time (encoded as tag=51) (6 octet)      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       Optional ClientID (encoded as tag=61) (variable)        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      Optional Renewal Time (encoded as tag=58) (6 octet)      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     Optional Rebinding Time (encoded as tag=59) (6 octet)     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |          Other desirable DHCP extensions (variable)           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  End Option (encoded as in BOOTP options, tag=255) (1 octet)  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure 8.4.2.1-1 DHCP inter-server CSA Binding record format

   where:

   o  CSAS Record - represents the full CSAS record as identified in
      Section 8.4.1.

   o  LTT - indicates the Last Transaction Type.  The allowed LTTs
      are: DHCPREQUEST/SELECTING (0x0), DHCPREQUEST/REBINDING (0x3),
      DHCPREQUEST/RENEWING (0x2), DHCPREQUEST/INIT-REBOOT (0x1),
      DHCPRELEASE (0x4), and EXPIRATION (0x5).

   o  HTYPE - hardware address type (defined in [1])

   o  HLEN - hardware address length

   o  CHADDR - client hardware address

   o  CIADDR - client IP address (if assigned).  If not assigned, this
      field is all 0s.

   o  Last Transaction Time - the time, in seconds relative to now, of
      the last transaction associated with the LTT indicated in the
      message.

   o  IP Address Lease Time - the IP Address Lease Time encoded as in
      the DHCP options and BOOTP vendor extensions defined in [7].
      This represents the time from now at which the client lease is
      to expire.

   o  (Optional) Client ID - this field is the optional Client ID
      encoded as in the DHCP options and BOOTP vendor extensions
      defined in RFC 2132 [7].  If present, the Client ID is the
      'search string'.

   o  (Optional) Renewal Time - this field is the optional Client
      Renewal Time (T1) as encoded in the DHCP options and BOOTP
      vendor extensions defined in RFC 2132 [7].

   o  (Optional) Rebinding Time - this field is the optional Client
      Rebinding Time (T2) as encoded in the DHCP options and BOOTP
      vendor extensions defined in RFC 2132 [7].

   o  Remaining Options - any remaining options carried in the
      original DHCPOFFER message to the client, encoded as in the DHCP
      options and BOOTP vendor extensions defined in [7].

   o  End option - determines the end of the CSAS record.

      DISCUSSION:

         As discussed in the previous section on the CSAS record
         format, the format shown above is intended to be the Binding
         type CSA record.
         The binding record is used in the PUSH and COMPLETE PUSH
         operations to transfer to the DCSes the newly created or
         changed binding, and in the cache alignment procedures.  The
         structure of the Client Binding is divided, for the purposes
         of the DHCP interserver protocol, into a mandatory part and
         an optional part.  The mandatory part is everything up to and
         including the (Optional) Rebinding Time.  The optional part
         is everything following the (Optional) Rebinding Time.  The
         PUSHing server may include within the Client Binding Record
         any additional parameters which were part of the DHCPACK
         message to the client, encoded as in the DHCP options and
         BOOTP vendor extensions defined in RFC 2132 [7].  The server
         which is the recipient of the PUSH may choose to save and
         forward these optional parameters in the record, or may
         choose not to.

8.4.2.2.  Address Records

   The format of the CSA Address record for the DHCP inter-server
   protocol is:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    CSAS Record (variable)                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      ST       |                   reserved                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure 8.4.2.2-1 DHCP inter-server CSA Address record format

   where:

   o  CSAS Record - represents the full CSAS record as identified in
      Section 8.4.1.

   o  ST - represents the state of the (client) record, e.g.,
      unbindable, bindable, bound, expired, polling, static

      DISCUSSION:

         The Address record is used within the UNBINDABLE COMPLETE
         POLL operation to move an unbindable address to a bindable
         address.
         The POLLed server returns the Address record indicating the
         current status of the address within the server.  If all of
         the servers indicate that the address is unbindable, then and
         only then will the LS move the address to its Bindable pool.

         The ST field indicates the server's view of the state of the
         address.  The states (defined in Section 3.4.2) are:
         UNBINDABLE, POLLING, BINDABLE, BOUND, PUSHED, and EXPIRED.

         The IP address states are encoded in the following manner:

            Table 8.4.2.2-1 IP Address State Encodings

            IP Address State       | Encoding
            --------------------------------------------------
                                   |
            UNBINDABLE             |   0x01
            POLLING                |   0x02
            BINDABLE               |   0x03
            BOUND                  |   0x04
            PUSHED                 |   0x05
            EXPIRED                |   0x06

8.4.2.3.  Subnet Bindable Addresses Record

   The CSA Subnet Bindable Addresses record indicates the set of
   addresses that a server is willing to TRANSFER to a requesting
   server.  This record is used in the TRANSFER operation.

   The format of the CSA Subnet Bindable Addresses record for the DHCP
   inter-server protocol is:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    CSAS Record (variable)                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | No. Addresses |No. Addr.Ranges|R|  reserved   |No.Ownd|No.Reqd|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                     List of IP Addresses                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure 8.4.2.3-1 DHCP inter-server CSA Subnet Bindable Addresses
                     record format

   where:

   o  CSAS Record - represents the full CSAS record as identified in
      Section 8.4.1.

   o  No. Addresses - indicates the number of IP addresses contained
      within the subnet record.
      These are the addresses that the DCS is transferring to the LS
      as part of the TRANSFER operation.  This is set to 0 when the
      R-bit is set to 1 (see R-bit below).

   o  No. Addr. Ranges - indicates the number of IP address ranges of
      the form 135.16.114.5 to 135.16.114.235.  These will immediately
      follow the listing of the individual addresses.  This is set to
      0 when the R-bit is set to 1 (see R-bit below).

   o  R - represents the request bit.  When this bit is set to 1, it
      indicates that the LS is requesting BINDABLE addresses from the
      DCS as part of the TRANSFER operation.  When it is set to 0, it
      indicates that the DCS is transferring these addresses to the
      LS.

   o  No. Ownd - indicates the current number of BINDABLE addresses
      owned by the LS when the R-bit is set to 1.

   o  No. Reqd - indicates the number of additional BINDABLE addresses
      requested by the LS when the R-bit is set to 1.

   o  List of IP Addresses - this is a consecutive list of IP
      addresses and address ranges.

      DISCUSSION:

         The Subnet record is used in the TRANSFER operation to
         indicate 1) the list of bindable IP addresses that the DCS is
         willing to transfer to the LS when the R bit is 0, and 2) the
         IP addresses that the LS is requesting when the R bit is 1.

         Further, it may be useful to develop similar records for
         Subnet UNBINDABLE, BOUND, PUSHED, and EXPIRED addresses.
         They can have an identical record format and be distinguished
         through the 8 bit type field encoded into the SCSP Cache Key.
         The utility of these record types is TBD.

8.4.2.4.  SG Specifier Record

   The CSA SG Specifier Record indicates the total list of DHCP
   Interserver protocol Server Groups of which the DCS is currently a
   member.  This is used in the Group Management subprotocol during
   the initial contact of a prospective new member to the Server
   Group.

   The format of the CSA SG Specifier Record for the DHCP inter-server
   protocol is:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    CSAS Record (variable)                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |No. Specifiers |                   reserved                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    List of Specifier Pairs                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure 8.4.2.4-1 DHCP inter-server CSA SG Specifiers record format

   where:

   o  CSAS Record - represents the full CSAS record as identified in
      Section 8.4.1.

   o  No. Specifiers - is a count of the number of specifier pairs
      contained within this CSA record.

   o  List of Specifier Pairs - represents a consecutive listing of
      the specifier pairs of which the DCS is currently a member.  The
      encoding of the specifier pairs is SG ID first, which is a
      16-bit string, followed by the SG Generation Number, which is
      also a 16-bit string.

      DISCUSSION:

         This record is initially requested by a server which is
         interested in joining a DHCP Interserver Server Group and has
         been configured with the IP address of a server to first
         contact.  The first contacted server then replies with the SG
         Specifier record.  This record can also be solicited when a
         server which is an existing member of a group becomes
         uncertain regarding the current Generation Number of the
         group.

         The SG Generation Number, obtained from this record, is
         carried in every DHCP Interserver protocol message, encoded
         as an extension to the SCSP message extension fields.  The
         extension encoding is TBD.

8.4.2.5.  SG Subnets Configuration Record

   The CSA SG Subnet Configuration Record carries SG configuration
   information necessary to ensure the correct protocol operation of
   the group.  The encoding of this record is essentially the subnet
   address and mask followed by the pool of addresses which are
   dynamically managed by the Server Group for this subnet.  The
   encoding of the address pool will be consistent with the address
   pool encoding of the Subnet Bindable Addresses Record discussed in
   Section 8.4.2.3 above.  Other configuration parameters may be
   included if deemed important to the correct operation of the DHCP
   interserver protocol.

   Section 7.2 specifies that additional information (specifically
   client configuration information and vendor specific configuration
   information) will also be available.  The precise details of how
   this information is encoded are TBD.

   The format of the CSA SG Subnets Configuration Record for the DHCP
   inter-server protocol is:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    CSAS Record (variable)                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  No. Subnets  |                   reserved                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Subnet Address                         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          Subnet Mask                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |            Address Pool of first subnet (variable)            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Subnet Address                         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    ...
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |            Address Pool of last subnet (variable)             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure 8.4.2.5-1 DHCP inter-server CSA SG Subnets Configuration
                     record format

   where:

   o  CSAS Record - represents the full CSAS record as identified in
      Section 8.4.1.

   o  No. Subnets - indicates the number of subnet configurations
      contained in this record.

   o  Subnet Address - this is the subnet address of the subnet to
      which the following address pool relates.

   o  Subnet Mask - this is the mask of the subnet in question.

   o  Address pool of subnet - this is a listing of the address pool
      from which this SG can allocate for this particular subnet.  The
      encoding will follow the address pool encoding for the Subnet
      Bindable Addresses record.  Therefore, the address pool should
      contain two count fields, the first indicating the number of
      individually listed addresses, followed by another field
      indicating the number of address ranges.  These are then
      followed by the list of individual IP addresses and then the
      list of address ranges.

      DISCUSSION:

         The total list of configuration items to be incorporated into
         this record needs to be further fleshed out.  Currently this
         record is planned to contain a list of the subnets and the
         address pools associated with each from which this SG can
         allocate.  If other configuration parameters are deemed
         necessary for the proper operation of the DHCP Interserver
         protocol, then these need to be incorporated into this
         record.

8.4.2.6.  SG Members Record

   The CSA SG Members Record indicates the list of the current SG
   members, in the opinion of the sending server, including itself.

   The format of the CSA SG Members Record for the DHCP inter-server
   protocol is:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    CSAS Record (variable)                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | No. Server IDs|P|                  reserved                   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      List of Server IDs                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure 8.4.2.6-1 DHCP inter-server CSA SG Members record format

   where:

   o  CSAS Record - represents the full CSAS record as identified in
      Section 8.4.1.

   o  No. Server IDs - this is the number of Server IDs contained
      within this record.

   o  P bit - the Proposal bit indicates whether this record is a
      current group members record (here set to 0) or a proposed group
      members record (discussed in the next section).

   o  List of the Server IDs - this is a consecutive list of Server
      IDs which comprise this server's view of the current SG
      membership.  The Server IDs are IP addresses associated with one
      of the server's interfaces.

8.4.2.7.  SG Proposed Members Record

   The CSA SG Proposed Members Record indicates the list of the
   current SG members, in the opinion of the sending server, plus the
   sending server itself.  This is a temporary record (with a lifetime
   associated with the period during which a Group Management SG
   CHANGE operation has to complete).  Once the SG COMMIT CHANGE
   UPDATE is received, this record replaces the old SG Members record
   as the new member record containing the newly joined server.
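
   The fixed word and Server ID list of the SG Members layout in
   Section 8.4.2.6 could be packed roughly as follows.  This is only
   a sketch: it assumes the count occupies one octet, the P bit is
   the most significant bit of the second octet, the remaining 23
   bits are reserved and sent as zero, and each Server ID is a
   4-octet IPv4 address.  The leading CSAS record is omitted, and the
   function names are illustrative, not part of the protocol.

```python
import ipaddress
import struct

P_BIT = 0x80  # assumed position: top bit of the octet after the count


def pack_sg_members(server_ids, proposed=False):
    """Pack the count/P word and the Server ID list (CSAS record omitted)."""
    if len(server_ids) > 255:
        raise ValueError("No. Server IDs must fit in one octet")
    flags = P_BIT if proposed else 0
    # "!BB2x": count, flags, then two zero pad octets (reserved bits)
    header = struct.pack("!BB2x", len(server_ids), flags)
    body = b"".join(ipaddress.IPv4Address(s).packed for s in server_ids)
    return header + body


def unpack_sg_members(data):
    """Return (list of Server IDs as dotted quads, P bit as bool)."""
    count, flags = struct.unpack_from("!BB", data, 0)
    if len(data) < 4 + 4 * count:
        raise ValueError("record shorter than its Server ID count")
    ids = [
        str(ipaddress.IPv4Address(data[4 + 4 * i: 8 + 4 * i]))
        for i in range(count)
    ]
    return ids, bool(flags & P_BIT)
```

   With this sketch, a current members record and a proposed members
   record differ only in the P bit, which matches the way the two
   record formats are described.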

   The format of the CSA SG Proposed Members Record for the DHCP
   inter-server protocol is:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    CSAS Record (variable)                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | No. Server IDs|P|                  reserved                   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      List of Server IDs                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure 8.4.2.7-1 DHCP inter-server CSA SG Proposed Members format

   where:

   o  CSAS Record - represents the full CSAS record as identified in
      Section 8.4.1.

   o  No. Server IDs - this is the number of Server IDs contained
      within this record.

   o  P bit - the Proposal bit indicates whether this record is a
      proposed group members record (here set to 1) or a current group
      members record (discussed in the previous section).

   o  List of the Server IDs - this is a consecutive list of Server
      IDs which comprise the sending server's view of the proposed SG
      membership.  The Server IDs are IP addresses associated with one
      of the server's interfaces.

      DISCUSSION:

         This record contains the proposed group membership from the
         view of the proposing server.  This record conceptually has a
         temporary lifetime associated with the period for which a
         group join proposal can live.  If a server receives an SG
         COMMIT CHANGE UPDATE message, then this record becomes the
         new SG Members record.  If an SG COMMIT CHANGE UPDATE message
         is not received within the appropriate period, then this
         record expires.  If the server receives a second SG PROPOSE
         CHANGE UPDATE message while another Proposed Members record
         is active, it should NAK this second Proposed Members record.
         Only one group join can be in process at any given time.

8.5.  Open Questions with the Mapping onto SCSP

   The following questions are identified as outstanding issues to be
   resolved for the CSAS and CSA record definitions to be considered
   complete:

   o  SCSP is currently LLC/SNAP encapsulated.  We are proposing that
      a UDP port be defined to carry SCSP messages for DHCP.  In fact
      we are proposing that the entire DHCP interserver protocol be
      run over UDP.

   o  SCSP has currently reserved its Protocol ID = 4 for DHCP.  This
      draft discusses the DHCPv4 Interserver protocol, and therefore
      the SCSP Protocol ID reservation should reflect that fact.  If a
      DHCPv6 extension to this draft were developed, it would require
      a separate SCSP Protocol ID.

   o  SCSP dropped support for message fragmentation.  We need to look
      into the size required for the various records defined in this
      draft and, if necessary, consider how to handle records larger
      than can fit into a single UDP packet.

   o  Need to give further thought to the partitioning of the DHCP
      interserver protocol into three separate but related
      subprotocols: the Group Management, the Binding Management and
      the Address Management subprotocols.  Currently this draft has
      these as separate subprotocols, with the Group Management
      subprotocol run separately from the SCSP protocol and in fact on
      a different UDP port than the SCSP protocol.  The Group
      Management subprotocol does, however, share common message
      semantics and syntax with the SCSP messages in order to simplify
      parsing the various messages associated with the DHCP
      interserver protocol.  The Binding Management and the Address
      Management subprotocols are run on top of SCSP with a single
      Protocol ID.

   o  We need to explicitly discuss the method used to authenticate
      the DHCP Interserver protocol messages.
      Current thinking is to use the SCSP authentication extensions.
      This should be investigated and should be consistent with the
      'Security Architecture for DHCP' draft [8].

9.  IP Address State Transitions

   The possible states of an IP address were defined in Section 3.2.2,
   and the state transition diagram appears there.  The state
   transitions through which an IP address can move were discussed
   implicitly in Section 6 in the context of the receipt of DHCP
   messages from DHCP clients.  However, an explicit examination of
   the processing required of a server by this protocol on each of the
   state transitions will serve to highlight some important aspects of
   this protocol.

   The IP address state transitions are handled in the following way:

   o  UNBINDABLE -> POLLING

      When a server attempts to make a particular IP address BINDABLE,
      it first moves that IP address into the POLLING state.  Once in
      this state, if queried about whether that IP address is
      UNBINDABLE, the server will reply negatively.

   o  UNBINDABLE -> BOUND

      When a server is removed from a server group, all of the IP
      addresses must be scanned to see if any of them show that server
      as the server who performed the last transaction (as set by that
      server successfully completing a CLIENT BINDING COMPLETE PUSH).
      For all of those IP addresses, if there is a client recorded in
      the IP address, and if that client does not currently have a
      different binding, then that IP address must be set to BOUND and
      the lease time must be reset to the value sent in the latest
      CLIENT BINDING COMPLETE PUSH.

      The only states from which this transition will be made are
      UNBINDABLE and EXPIRED.
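
      The scan described for the UNBINDABLE -> BOUND transition can
      be sketched as below.  The AddressEntry fields, the function
      name, and the has_other_binding callback are illustrative
      assumptions; the draft does not define a data model for the
      server's address store.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class AddressEntry:
    state: str                      # e.g. "UNBINDABLE", "EXPIRED", "BOUND"
    client: Optional[str]           # client recorded in the address, if any
    last_server: Optional[str]      # server that performed the last transaction
    pushed_lease: Optional[int]     # lease sent in the latest COMPLETE PUSH
    lease: Optional[int] = None


def rebind_after_removal(
    entries: Dict[str, AddressEntry],
    removed_server: str,
    has_other_binding: Callable[[str, str], bool],
) -> List[str]:
    """Set qualifying addresses to BOUND; return the addresses changed."""
    changed = []
    for ip, entry in entries.items():
        # Only UNBINDABLE and EXPIRED addresses may make this transition.
        if entry.state not in ("UNBINDABLE", "EXPIRED"):
            continue
        if entry.last_server != removed_server:
            continue
        # Skip addresses with no client, or whose client is bound elsewhere.
        if entry.client is None or has_other_binding(entry.client, ip):
            continue
        entry.state = "BOUND"
        entry.lease = entry.pushed_lease
        changed.append(ip)
    return changed
```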

   o  POLLING -> BINDABLE

      A fundamental point and guarantee of this state transition
      diagram is that for an IP address to move from the UNBINDABLE
      state (where it is not owned by any server) through the POLLING
      state and on to the BINDABLE state (where it is owned by a
      single server) requires the server seeking to own the IP address
      to contact all of the other servers in the group.  It requires
      an UNBINDABLE COMPLETE POLL to complete successfully.

      The server attempting to move an IP address from the UNBINDABLE
      through the POLLING and on to the BINDABLE state must ask every
      other server in the group if it believes that the IP address is
      currently UNBINDABLE using an UNBINDABLE COMPLETE POLL.  If any
      server says that the IP address is either BINDABLE (i.e., it
      currently owns the IP address) or BOUND (i.e., a client
      currently owns the IP address), then the server attempting to
      move the IP address from the UNBINDABLE to BINDABLE state MUST
      abandon the attempt.  If any server fails to respond at all, the
      server MUST abandon the attempt as well.

         DISCUSSION:

            In addition (and this is important!) if the server
            attempting to move the IP address from the UNBINDABLE
            state through the POLLING state and on to the BINDABLE
            state fails to hear from some other server, then the
            attempt cannot complete.  This means that if a server
            cannot communicate with every other server (due to
            communications failure, transient server failure, or
            network partition) then this state transition cannot be
            made.

      Thus, all addresses in the UNBINDABLE state will stay in that
      state while any server in the group is out of communication with
      the group for any reason at all.
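
      The decision rule for the UNBINDABLE COMPLETE POLL described
      above can be sketched as follows.  The names are illustrative,
      and a missing entry in the replies mapping is assumed to model a
      server that failed to respond within the polling period.

```python
from enum import Enum
from typing import Dict, Iterable


class PollResult(Enum):
    PROCEED = "move POLLING -> BINDABLE"
    ABANDON = "abandon the attempt"


def evaluate_unbindable_poll(
    other_servers: Iterable[str], replies: Dict[str, str]
) -> PollResult:
    """Decide the outcome of an UNBINDABLE COMPLETE POLL.

    other_servers: every other member of the server group.
    replies: server ID -> reported address state, for servers that answered.
    """
    for server in other_servers:
        state = replies.get(server)  # None means no response at all
        # Any answer other than UNBINDABLE (BINDABLE, BOUND, POLLING, ...)
        # or a missing answer forces the polling server to give up.
        if state != "UNBINDABLE":
            return PollResult.ABANDON
    return PollResult.PROCEED
```

      Treating silence the same as a negative answer is what keeps
      addresses UNBINDABLE while any group member is unreachable.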

      Of course, the detailed description of the protocol suggests
      that a server build up a supply of BINDABLE IP addresses so that
      in the event of server failure it has BINDABLE addresses that
      are available to offer to new DHCP clients.

   o  BINDABLE -> BOUND

      Once an IP address is BINDABLE it may be BOUND to a client
      through the normal actions of the DHCP protocol.  Once a server
      has received a DHCPREQUEST/SELECTING message from a client it
      can move the IP address into the BOUND state, update its stable
      storage, and reply with a DHCPACK message to the client.

      After the DHCPACK has been sent, the DHCP server MUST also
      attempt to update all servers in the group with information
      indicating that the IP address is now BOUND to a particular
      client.  It must perform a CLIENT BINDING COMPLETE PUSH
      operation with this information.

      An IP address that is BOUND will always result in a lease time
      that is no greater than the MAXIMUM-UNPUSHED-LEASE-TIME when
      given to a client, although the normal lease time is used in all
      interactions with other servers.

         DISCUSSION:

            In an ideal world, the server who created the binding
            would always succeed in updating all other servers in the
            group with the binding information.  Then, in the event
            that the binding server failed at some later time, another
            server to whom the client could broadcast would receive a
            DHCPREQUEST/REBINDING request and could reply with updated
            binding information.

            However, there is obviously a window where a server can
            crash after sending a DHCPACK and prior to updating even
            one additional server.
            This protocol has been designed so that not only is the
            process of updating all of the servers in the group with
            information concerning a new binding "lazy" (i.e.,
            performed after the actual binding is made), but also
            unnecessary for correct operation.  The protocol only
            requires that a server try to update the other servers --
            not that it succeed at updating even one server.

            The protocol accomplishes this by allowing a server to
            respond to a DHCPREQUEST/REBINDING message from a client
            without any information having been propagated from the
            server who created the binding.  Thus, a server who
            receives a rebinding request for an IP address about which
            it has no information must check with all available
            servers in the group, but in the absence of information to
            the contrary arriving within a relatively short timeout
            period, the server should respond to the rebinding request
            with an extension of the existing lease on the IP address.

   o  BINDABLE -> UNBINDABLE

      A server can relinquish an IP address in the BINDABLE state that
      it owns simply by responding to requests for information about
      the IP address as if it were UNBINDABLE.  No explicit action
      need be taken other than to respond correctly to POLL operations
      from other servers.

   o  BOUND -> PUSHED

      Once an IP address that is BOUND to a client has a CLIENT
      BINDING COMPLETE PUSH succeed (and that means succeed to all of
      the servers), then it moves from the BOUND to the PUSHED state.
      At this point, the normal lease time may be returned to the
      client on the next renewal or discover or rebinding.

      Note that only the server which executes the CLIENT BINDING
      COMPLETE PUSH will set its IP address into the PUSHED state.
      The state that it PUSHes to the other servers is BOUND.
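
      The effect of the BOUND and PUSHED states on the lease time
      returned to a client can be sketched as below.  The function and
      parameter names are illustrative assumptions; only the capping
      rule itself comes from the text.

```python
def lease_for_client(state: str, normal_lease: int,
                     max_unpushed_lease: int) -> int:
    """Return the lease time, in seconds, to hand to the client.

    normal_lease: the lease the server would normally grant.
    max_unpushed_lease: stands in for MAXIMUM-UNPUSHED-LEASE-TIME.
    """
    if state == "PUSHED":
        # The COMPLETE PUSH reached every server: normal lease is allowed.
        return normal_lease
    if state == "BOUND":
        # Binding not yet known to all servers: cap the client's lease.
        return min(normal_lease, max_unpushed_lease)
    raise ValueError("address is not bound to a client: " + state)
```

      Only the value given to the client is capped; the normal lease
      time is still what the server uses in its interactions with the
      other servers.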

   o  BOUND -> UNBINDABLE

      In order for an IP address to move from the BOUND to the
      UNBINDABLE state, the client that owns the IP address (i.e., to
      which it is BOUND) must send a DHCPRELEASE message.  In this
      case, the receiving server (which may or may not be the server
      who created the original binding) will update its stable storage
      with information that the IP address is not currently BOUND by
      any client.  It should then transmit this information to all
      other servers with which it can communicate at that time by
      performing a CLIENT BINDING COMPLETE PUSH operation.

      In the event that the server fails to update any other server
      with the new information about the IP address prior to
      undergoing some failure, then the worst that will happen is that
      the other servers will believe that an IP address is in the
      BOUND state when it need not be.  Ultimately the lease on the IP
      address will expire.

   o  BOUND -> EXPIRED

      Any server which has information concerning a BOUND IP address
      may determine that the lease on the IP address has expired, and
      after an appropriate grace period has elapsed, that the IP
      address should be moved to the EXPIRED state.  A record of the
      client to which the IP address was BOUND must be kept.

   o  PUSHED -> UNBINDABLE

      In order for an IP address to move from the PUSHED to the
      UNBINDABLE state, the client that owns the IP address (i.e., to
      which it is BOUND) must send a DHCPRELEASE message.  In this
      case, the receiving server (which may or may not be the server
      who created the original binding) will update its stable storage
      with information that the IP address is not currently BOUND by
      any client.  It should then transmit this information to all
      other servers with which it can communicate at that time by
      performing a CLIENT BINDING COMPLETE PUSH operation.

      In the event that the server fails to update any other server
      with the new information about the IP address prior to
      undergoing some failure, then the worst that will happen is that
      the other servers will believe that an IP address is in the
      PUSHED state when it need not be.  Ultimately the lease on the
      IP address will expire.

   o  PUSHED -> EXPIRED

      Any server which has information concerning a PUSHED IP address
      may determine that the lease on the IP address has expired, and
      after an appropriate grace period has elapsed, that the IP
      address should be moved to the EXPIRED state.  A record of the
      client to which the IP address was PUSHED must be kept.

   o  EXPIRED -> UNBINDABLE

      If any server asks for information concerning this IP address,
      then the receiving server should set the IP address to be
      UNBINDABLE, update its stable storage, and respond to the
      requesting server.

   o  EXPIRED -> BOUND

      If a server receives a message from a client and the IP address
      is EXPIRED, but was last BOUND or PUSHED to that client, then
      the IP address can be moved back into the BOUND state.  This is
      possible because no other server can have attempted to make this
      IP address BINDABLE.  If it had, the IP address would not be in
      the EXPIRED state anymore, but in the UNBINDABLE state (see the
      EXPIRED -> UNBINDABLE transition above).

      Another reason this transition can occur is as follows.  When a
      server is removed from a server group, all of the IP addresses
      must be scanned to see if any of them show that server as the
      server who performed the last transaction (as set by that server
      successfully completing a CLIENT BINDING COMPLETE PUSH).
For all 3724 of those IP addresses, if there is a client recorded for the IP 3725 address, and if that client does not currently have a different 3726 binding, then that IP address must be set to BOUND and the lease 3727 time must be reset to the value sent in the latest CLIENT BINDING 3728 COMPLETE PUSH. 3730 The only states from which this transition will be made are 3731 UNBINDABLE and EXPIRED. 3733 10. Security Considerations 3735 Minimal security would be provided by configuring every server in a 3736 group with the IP addresses of the allowable servers that could ever 3737 join that group. 3739 Some additional security is created by using the SCSP security mecha- 3740 nism, although that mechanism has limitations outside the 3741 client binding management part of the protocol. 3743 Other, more powerful security approaches are needed and must be addressed 3744 prior to further progress on this protocol. 3746 11. Open Questions 3748 The following open questions, marked with the "*" character, remain from 3749 Ralph Droms' original draft, draft-ietf-dhc-interserver-00.txt. 3750 Comments have been added in square brackets []. Additional open 3751 questions new to this draft are listed with the "o" character. 3753 * Each server must know all other servers. 3755 Requiring each server to know about every other server imposes 3756 additional administrative overhead in the configuration of DHCP 3757 servers. However, this configuration overhead is probably mini- 3758 mal relative to any other configuration required for DHCP 3759 servers. 3761 [The group management messages in Section 7 provide a step 3762 towards an answer here. A server needs to know only one other 3763 server.] 3765 * Each server must contact all other servers before reassigning an 3766 address. 3770 [This is fundamental if we wish to use the "lazy synchronization" 3771 mode -- you can't get one without the other.]
3773 There is a potential issue here in which no new DHCP clients can 3774 be configured if any of the DHCP servers cannot be contacted. 3775 Servers can mitigate this problem by maintaining a list of pre- 3776 checked addresses that can be allocated without contacting all 3777 other servers at the time of address allocation. 3779 The protocol may need additional definition of specific actions 3780 on the part of DHCP servers in response to situations in which a 3781 server cannot contact all other servers. [Added a lot of these 3782 in this draft.] 3784 * Servers cooperating to achieve "fair" distribution of available 3785 addresses. 3787 The protocol may need additional mechanisms or definition of 3788 default behavior through which servers cooperate among themselves 3789 to ensure that each has a sufficient pool of prechecked addresses 3790 on each network. 3792 [Not yet addressed, and needs work. Initial thinking is that all 3793 addresses should be allocated to some server, so that in the 3794 event of an SG where one member can't be contacted, the maximum 3795 number of addresses is available for TRANSFER operations as necessary.] 3797 * User intervention in case of database incoherency. 3799 Fixing the collective database on the DHCP servers in case of a 3800 problem could be a *real* nightmare. 3802 * Potential deadlock in checking address - suppose two servers 3803 check the same address for reassignment simultaneously? 3805 [Solved with the introduction of the POLLING state.] 3807 * Potential configuration for new server? 3809 One ancillary use of the inter-server protocol might be in con- 3810 figuring new DHCP servers. Suppose the inter-server protocol 3811 were extended to allow download of a server's configuration file 3812 and to allow addition of a new server to the list of DHCP 3813 servers. A new server might be configured by simply giving it 3814 the address of an existing server.
The new server could then 3815 download a list of all other known servers, the pool of candidate 3816 addresses, any special configuration information (e.g., vendor 3817 class information) and the existing bindings. The new server 3821 could also announce itself to all of the other existing servers. 3823 [Much of this is in the current draft, principally in the group 3824 management configuration messages. At this stage, a server can 3825 figure out which groups correspond with which subnets, which 3826 addresses that group manages on that subnet, and some additional 3827 configuration information. This goes a considerable distance towards 3828 both ensuring that all servers in the SG have compatible configu- 3829 rations and allowing one server to download configuration 3830 data from another server. 3832 Downloading configuration files would not be a great idea for 3833 servers which don't use configuration files.] 3835 * DHCP server maintenance 3837 There is likely an opportunity for the development of a server 3838 management tool that would download the database information from 3839 all servers and check for conflicts/inconsistencies such as 3840 assignment of an IP address to multiple clients, bindings that 3841 are not replicated across all servers, bindings that have incon- 3842 sistent lease expiration times, etc. 3844 o Group-id selection. 3846 The group-id's for various groups need to be sufficiently unique 3847 that no server will ever be a member of two groups with the same 3848 group-id. No mechanism is provided yet in this protocol to gen- 3849 erate group-id's which conform to this requirement. 3851 Possibly a group-id can be synthesized in some manner to ensure 3852 that it conforms to this requirement. 3854 o The original draft discussed the requirement for each server to 3855 have a synchronized clock using available time synchronization 3856 protocols.
That requirement has been removed in this draft, and 3857 in its place all times are sent in "seconds from now" as a signed 3858 32-bit number. There is clearly a bit of additional complexity 3859 required to do this, but we have been so impressed at how well 3860 DHCP works with "relative" instead of "absolute" time that we 3861 felt the complexity of using relative time was worth it (since using 3862 synchronized time is not without its own complexities). 3864 o UNAVAILABLE IP addresses 3866 There are several cases where a server can determine that some 3867 sort of serious error has occurred, and that an IP address 3868 is apparently in an inconsistent state. In these cases, the server should 3872 make the IP address UNAVAILABLE -- i.e., no other server should 3873 be able to operate on it. Just what is necessary to make this 3874 happen? Could it be a passive response to address information 3875 messages, or must it involve a complete push to all of the other 3876 servers, and a new IP address state? 3878 12. Acknowledgments 3880 Many of the ideas in this proposal are due to Jeff Mogul, Greg Min- 3881 shall, Rob Stevens, Walt Wimer, Ted Lemon and the DHC working group. 3882 Thanks to all who have contributed their ideas and participated in 3883 the discussion of the inter-server protocol. 3885 At American Internet, Brad Parker and Mark Stapp have been key con- 3886 tributors to the design discussions that have resulted in our contri- 3887 butions to this draft. They have each invested many hours of 3888 work in this protocol. 3890 13. References 3892 [1] Droms, R., "Dynamic Host Configuration Protocol", RFC 2131, 3893 March 1997. 3895 [2] Luciani, J., Armitage, G., Halpern, J., "Server Cache Synchro- 3896 nization Protocol (SCSP)", draft-ietf-ion-scsp-01.txt. 3898 [3] Moy, J., "OSPF Version 2", RFC 1247, July 1991. 3900 [4] Luciani, J., "A Distributed NHRP Service Using SCSP", draft- 3901 ietf-ion-scsp-nhrp-00.txt.
3903 [5] Luciani, J., Fox, B., "A Distributed ATMARP Service Using 3904 SCSP", draft-ietf-ion-scsp-atmarp-00.txt. 3906 [6] Reynolds, J., Postel, J., "Assigned Numbers", STD 2, 3907 RFC 1340, USC/Information Sciences Institute, July 3908 1992. 3910 [7] Alexander, S., Droms, R., "DHCP Options and BOOTP Vendor 3911 Extensions", RFC 2132, March 1997. 3913 [8] Gudmundsson, O., "Security Architecture for DHCP", draft- 3914 ietf-dhc-security-arch-00.txt. 3918 14. Authors' Information 3920 Kim Kinnear 3921 American Internet Corporation 3922 4 Preston Ct. 3923 Bedford, MA 01730-2334 3925 Phone: (617) 276-4587 3926 EMail: kinnear@american.com 3928 Robert G. Cole 3929 AT&T Laboratories 3930 Managed Network Solutions Division 3931 Rm. 3L-533 3932 101 Crawfords Corner Road 3933 Holmdel, NJ 07733 3935 Phone: (908) 949-1950 3936 EMail: rgc@qsun.att.com 3938 Ralph Droms 3939 Computer Science Department 3940 323 Dana Engineering 3941 Bucknell University 3942 Lewisburg, PA 17837 3944 Phone: (717) 524-1145 3945 EMail: droms@bucknell.edu 3949 Appendix A: An Overview of SCSP 3951 This appendix presents an overview of the SCSP protocol and supple- 3952 ments Section 8.2 in the main text of this specification. For a com- 3953 plete discussion of the SCSP protocol see [2]. 3955 The following three sections of this appendix cover the SCSP 3956 Hello, Cache Alignment and Client State Update sub-protocols, respectively. 3957 The last section of this appendix presents a summary of the SCSP mes- 3958 sage sets. 3960 A.1 The SCSP "Hello" Sub-protocol Overview 3962 The function of the SCSP "Hello" protocol is to monitor the status of 3963 the LS to DCS connection. The LS must be configured with the 3964 addresses of its DCSs. The protocol contains a 'Family ID' which 3965 allows multiple protocol-specific SCSP imple- 3966 mentations to rely on a single Hello mechanism between each server 3967 pair.
For each DCS (whether the low level connection is point-to- 3968 point or point-to-multipoint), the LS maintains a Hello Finite State 3969 Machine (HFSM). The HFSM is shown in the figure below.
3971                +---------------+
3972                |               |
3973       +------->|     DOWN      |<-------+
3974       |        |               |        |
3975       |        +---------------+        |
3976       |             |     ^             |
3977       |             |     |             |
3978       |             |     |             |
3979       |             |     |             |
3980       |             V     |             |
3981       |        +---------------+        |
3982       |        |               |        |
3983       |        |    WAITING    |        |
3984       |     +--|               |--+     |
3985       |     |  +---------------+  |     |
3986       |     |    ^           ^    |     |
3987       |     |    |           |    |     |
3988       |     V    |           |    V     |
3989     +---------------+     +---------------+
3990     |  BIDIRECTION  |---->| UNIDIRECTION  |
3991     |               |     |               |
3992     |  CONNECTION   |<----|  CONNECTION   |
3993     +---------------+     +---------------+
3995 Figure A.1-1 The Hello Finite State Machine
3999 Key:
4001 1: Link layer connection is established
4003 2: Transition based upon the receipt of a Hello message (and
4004 whether the LS ID is found in the Rec ID portion of the message)
4006 3: Hello Interval * Dead Factor exceeded
4008 4: Loss of link layer connectivity
4010 The LS to DCS connections are initialized into the Down state. The 4011 numbers in the Key refer to the actions that 4012 cause a transition in the HFSM (Note: These numbers didn't appear in 4013 the original figure in [2], and are TBD). The Hello protocol employs 4014 poll messages to monitor the status of the LS to DCS connections. 4016 The Hello messages contain the IDs of the DCSs that the LS has 4017 received a Hello message from. The LS uses these IDs to 4018 determine the state of the HFSM for each of the DCSs. Multiple DCS 4019 IDs are present in order to support point-to-multipoint connections. 4020 The messages also contain two fields: the Polling Interval and the 4021 Dead Factor.
The product of the Polling Interval and the Dead Factor 4022 determines the length of time that the HFSM will hold open a connec- 4023 tion without receiving a Hello from a peer DCS before transitioning the 4024 HFSM for that DCS to the Waiting state. 4026 A.2 The SCSP "Cache Alignment" Sub-protocol 4028 The Cache Alignment protocol supports the initial server cache syn- 4029 chronization process of an LS with its DCSs. This process may occur 4030 at initial boot time of the server, at reconnect time of the server 4031 to the network, or other possible initialization or failure recovery 4032 scenarios. Like the Hello protocol, the Cache Alignment (CA) proto- 4033 col maintains a Cache Alignment Finite State Machine (CAFSM) for each 4034 of its DCSs to monitor the status of its cache alignment. The figure 4035 below shows the CAFSM and indicates some of the triggers that would 4036 cause the state transitions to occur.
4040          +------------+
4041          |            |
4042     +--->|    DOWN    |
4043     |    |            |
4044     |    +------------+
4045     |          |
4046     |          |
4047     |          V
4048     |    +------------+
4049     |    |Master/Slave|
4050     |----|            |<---+
4051     |    |Negotiation |    |
4052     |    +------------+    |
4053     |          |           |
4054     |          |           |
4055     |          V           |
4056     |    +------------+    |
4057     |    |    Cache   |    |
4058     |----|            |----|
4059     |    | Summarize  |    |
4060     |    +------------+    |
4061     |          |           |
4062     |          |           |
4063     |          V           |
4064     |    +------------+    |
4065     |    |   Update   |    |
4066     |----|            |----|
4067     |    |    Cache   |    |
4068     |    +------------+    |
4069     |          |           |
4070     |          |           |
4071     |          V           |
4072     |    +------------+    |
4073     |    |            |    |
4074     +----|   Aligned  |----+
4075          |            |
4076          +------------+
4078 Figure A.2-1 Cache Alignment Finite State Machine
4080 Key:
4082 1: When HFSM reaches Bi-directional state
4086 2: HFSM transitions out of Bi-directional state
4088 3: Master/Slave relationship is established
4090 4: Once both LS and DCS exchange CA messages, both with O-bit set
4091 to 0, then CRL is complete
4093 5: E.g., Errored sequence number
4095 6: Full cache update achieved
4097 (Note: The
key numbers don't appear in the figure in [2], and are 4098 TBD.) 4100 Each CAFSM is coupled with the respective HFSM in the LS. 4101 The CAFSM is initialized in the Down state. It transitions to the 4102 Master/Slave Negotiation state when the corresponding HFSM transi- 4103 tions to the Bi-Directional state. The CAFSM transitions back to 4104 the Down state in the event that the corresponding HFSM transitions 4105 out of the Bi-Directional state. 4107 In the Master/Slave Negotiation state the LS-DCS pair negotiates who is to be the 4108 master of the connection during the cache alignment process. In the 4109 Cache Summarize state the LS/DCS pair exchanges Client State Advertise- 4110 ment Summary (CSAS) records within the CA messages. The servers use 4111 these message exchanges to build a Client State Advertisement Request 4112 List (CRL). The CRL indicates the portions of the respective server 4113 caches that are out of alignment. The cache mis-alignment (as indi- 4114 cated in the local CRL) is resolved in the Update Cache state where 4115 the servers exchange full client state information in CSA records 4116 within the CSU messages, but only for the entries that are mis-aligned. Once the 4117 CRL is resolved, the LS/DCS caches are aligned and the CAFSM transi- 4118 tions to the Aligned state. 4120 The protocol further defines the high-level syntax of a generic CA 4121 message as discussed in a later section of this appendix. 4123 A.3 The SCSP "Client State Update" Sub-protocol Overview 4125 The purpose of the Client State Update (CSU) protocol is to provide a 4126 capability to constantly update the server caches through asyn- 4127 chronous CSU message exchanges. These updates are necessary because 4128 the status of the clients is in constant flux. Unlike the other two 4129 sub-protocols, the Client State Update protocol does not maintain a 4130 separate finite state machine. Instead, the activity of this proto- 4131 col is tied to the CAFSM.
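The coupling between the HFSM, the CAFSM and CSU activity can be sketched as follows. This is a minimal illustration and not part of SCSP [2]: the class name PeerAlignment, the method names and the event strings are hypothetical, and because the key numbers are TBD in the figure, sending an errored sequence number back to Master/Slave Negotiation is an assumed reading of Figure A.2-1. The rule that CSUs may be exchanged only in the Update Cache and Aligned states is taken from this appendix.

```python
from enum import Enum, auto


class HelloState(Enum):
    """States of the Hello FSM (Figure A.1-1)."""
    DOWN = auto()
    WAITING = auto()
    UNIDIRECTIONAL = auto()
    BIDIRECTIONAL = auto()


class CacheState(Enum):
    """States of the Cache Alignment FSM (Figure A.2-1)."""
    DOWN = auto()
    MASTER_SLAVE_NEGOTIATION = auto()
    CACHE_SUMMARIZE = auto()
    UPDATE_CACHE = auto()
    ALIGNED = auto()


# Forward progress through cache alignment. Event names are hypothetical
# labels for key items 3, 4 and 6 of Figure A.2-1.
_FORWARD = {
    (CacheState.MASTER_SLAVE_NEGOTIATION, "master_slave_established"):
        CacheState.CACHE_SUMMARIZE,
    (CacheState.CACHE_SUMMARIZE, "crl_complete"): CacheState.UPDATE_CACHE,
    (CacheState.UPDATE_CACHE, "cache_updated"): CacheState.ALIGNED,
}


class PeerAlignment:
    """CAFSM kept by the LS for one DCS, driven by that DCS's HFSM."""

    def __init__(self):
        self.state = CacheState.DOWN

    def on_hello_state(self, hello):
        # Key 1: HFSM reaches Bi-directional, so negotiation starts.
        # Key 2: HFSM leaves Bi-directional, so the CAFSM drops to Down.
        if hello is HelloState.BIDIRECTIONAL:
            if self.state is CacheState.DOWN:
                self.state = CacheState.MASTER_SLAVE_NEGOTIATION
        else:
            self.state = CacheState.DOWN

    def on_event(self, event):
        # Key 5 (assumed reading): an error such as a bad sequence number
        # restarts alignment at Master/Slave Negotiation.
        if event == "sequence_error" and self.state is not CacheState.DOWN:
            self.state = CacheState.MASTER_SLAVE_NEGOTIATION
        else:
            # Unknown or out-of-order events leave the state unchanged.
            self.state = _FORWARD.get((self.state, event), self.state)

    def may_exchange_csu(self):
        # The CSU protocol has no FSM of its own; it is gated by the CAFSM.
        return self.state in (CacheState.UPDATE_CACHE, CacheState.ALIGNED)
```

An LS would keep one such object per DCS, feed it every HFSM state change for that DCS, and consult may_exchange_csu() before sending or accepting a CSU message.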
4133 Each CSU can contain zero or more Client State Advertisement records. 4137 The LS may send and receive CSUs when the corresponding CAFSM is in 4138 either the Aligned or the Update Cache state. The CSU protocol 4139 defines both CSU request and reply messages. As with the rest of 4140 SCSP, the CSU protocol supports both point- 4141 to-point and point-to-multipoint connections. 4143 A.4 The SCSP Message Set Overview 4145 The structure of the SCSP messages is a) a fixed-length generic 4146 header, b) an SCSP message-specific part header of variable length, c) 4147 a fixed-length message field and d) zero or more SCSP message-specific 4148 records. This is shown in the following figure.
4150      0                   1                   2                   3
4151      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
4152     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4153     |    Version    |     type      |          Packet Size          |
4154     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4155     |          IP Checksum          |      Start of Extensions      |
4156     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4157     |             SCSP Message Specific part (variable)             |
4158     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4159     |          Protocol ID          |             SG ID             |
4160     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4161     |            unused             |             Flags             |
4162     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4163     | Sender ID Len | Recvr ID Len  |        No. of Records         |
4164     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4165     |                     Sender ID (variable)                      |
4166     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4167     |                    Receiver ID (variable)                     |
4168     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4169     |           SCSP Message Specific Records (variable)            |
4170     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4172 Figure A.4-1 SCSP Message Format
4174 where
4176 o Version - is the version of the SCSP protocol defined in [2]
4178 o type - represents the SCSP message type, i.e., CA, Hello,
4179 CSU_Req, CSU_Reply, and CSU_Solicit
4181 o Packet Size -
4185 The SCSP messages have identical syntax except for 1) the SCSP 4186 message-specific part header and 2) the SCSP message-specific part 4187 record. The following table summarizes the content of these specific 4188 parts:
4190 Table A.4-1 SCSP Message Specific Parts
4192             |   Hello   |    CA     |   CSUS    |  CSU_Req  | CSU_Reply
4193 ------------------------------------------------------------------------
4194             |           |           |           |           |
4195 SCSP mesg   | hello int,|CSA Seq.No.|   null    |   null    |   null
4196 spec header | dead fac.,|           |           |           |
4197             | Family ID |           |           |           |
4198 ------------------------------------------------------------------------
4199             |           |           |           |           |
4200 SCSP mesg   |Additional |CSAS Rec.  | CSAS Rec. | CSA Rec.  | CSAS Rec.
4201 spec record | Recvr ID  |           |           |           |
4202             | records   |           |           |           |
4204 The detailed formats of the various SCSP messages are given in [2]. 4205 However, two SCSP message-specific records are of particular interest 4206 to the development of the DHCP interserver specification. These are: 4207 1) the CSAS record and 2) the CSA record.
The CSAS record is defined 4208 within the SCSP specification as:
4210      0                   1                   2                   3
4211      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
4212     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4213     |           Hop Count           |         Record Length         |
4214     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4215     | Cache Key Len |  Orig ID Len  |N|           unused            |
4216     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4217     |                      CSA Sequence Number                      |
4218     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4219     |                     Cache Key (variable)                      |
4220     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4221     |                   Originator ID (variable)                    |
4222     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4224 Figure A.4-2 SCSP CSAS Record Format
4226 See Section 8.4.1 for details. 4230 The CSA record is defined within the SCSP specification as:
4232      0                   1                   2                   3
4233      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
4234     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4235     |                          CSAS Record                          |
4236     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4237     |       Client/Server Protocol Specific Part Cache Entry        |
4238     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4240 Figure A.4-3 SCSP CSA Record Format
4242 The CSA records for the DHCP interserver mapping to SCSP are defined 4243 in Section 8.4.2. 4245 [end of document ]