idnits 2.17.1 draft-ietf-dhc-interserver-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-26) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. 
** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. ** The document is more than 15 pages and seems to lack a Table of Contents. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 582: '...EASE, the client MUST go to INIT-state...' Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 1319 has weird spacing: '...nically incre...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (August 1997) is 9751 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: '5' is defined on line 1362, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. '1' -- No information found for draft-ieft-ion-scsp - is the name correct? -- Possible downref: Normative reference to a draft: ref. '2' ** Obsolete normative reference: RFC 1533 (ref. '3') (Obsoleted by RFC 2132) ** Obsolete normative reference: RFC 1340 (ref. '4') (Obsoleted by RFC 1700) -- Possible downref: Non-RFC (?) normative reference: ref. '5' Summary: 12 errors (**), 0 flaws (~~), 3 warnings (==), 6 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group R. Droms 2 INTERNET DRAFT Bucknell University 4 R. Cole 5 AT&T MNS 7 March 1997 8 Expires August 1997 10 An Inter-server Protocol for DHCP 11 13 Status of this Memo 15 This document is an Internet-Draft. Internet-Drafts are working 16 documents of the Internet Engineering Task Force (IETF), its areas, 17 and its working groups. Note that other groups may also distribute 18 working documents as Internet-Drafts. 20 Internet-Drafts are draft documents valid for a maximum of six months 21 and may be updated, replaced, or obsoleted by other documents at any 22 time. 
It is inappropriate to use Internet-Drafts as reference 23 material or to cite them other than as ``work in progress.'' 25 To learn the current status of any Internet-Draft, please check the 26 ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow 27 Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), 28 munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or 29 ftp.isi.edu (US West Coast). 31 Abstract 33 The DHCP protocol is designed to allow for multiple DHCP servers, so 34 that reliability of DHCP service can be improved through the use of 35 redundant servers. To provide redundant service, multiple DHCP 36 servers must carry the same information about assigned IP addresses 37 and parameters; i.e., the servers must be configured with the same 38 bindings. Because DHCP servers may dynamically assign new addresses 39 or configuration parameters, or extend the lease on an existing 40 address assignment, the bindings on some servers may become out of 41 date. The DHCP inter-server protocol provides an automatic mechanism 42 for synchronization of the bindings stored on a set of cooperating 43 DHCP servers. The underlying capabilities of the DHCP inter-server 44 protocol required for multiple server cache replications are based 45 upon the Server Cache Synchronization Protocol (SCSP). 47 1. Introduction 49 DHCP servers manage the assignment of IP address and configuration 50 parameters to IP hosts. The DHCP protocol specification [1] refers 51 to the collection of configuration information assigned to a client 52 as a "binding". The DHCP protocol is designed to allow for multiple 53 DHCP servers, so that reliability of DHCP service can be improved 54 through the use of redundant servers. To provide redundant service, 55 the distributed DHCP servers' databases must be configured with the 56 same information about assigned IP addresses and parameters; i.e., 57 client bindings must be replicated in multiple server databases. 
58 Because DHCP servers may dynamically assign new addresses or 59 configuration parameters, or extend the lease on an existing address 60 assignment, the bindings on some servers may become out of date. The 61 DHCP inter-server protocol provides an automatic mechanism for 62 synchronization of the bindings stored on a set of cooperating DHCP 63 servers. 65 Many of the underlying capabilities provided by the DHCP inter-server 66 protocol will rely on the capabilities provided by another protocol, 67 the Server Cache Synchronization Protocol (SCSP) [2]. The SCSP 68 protocol defines a generic capability for the replication of 69 multiple, dispersed, replica server databases. The SCSP places no 70 topological requirements on the interconnection of the replica 71 databases other than the requirement that the resultant graph spans 72 the total set of servers. The SCSP protocol itself borrows heavily 73 from work on link-state protocol database replication. 75 The DHCP inter-server protocol uses TCP between pairs of servers. 76 Each server is configured with a list of all other servers. The 77 servers are also all configured with a pool of candidate IP addresses 78 that may be assigned dynamically to DHCP clients. Periodically or on 79 demand, a server may contact one, some or all other DHCP servers to 80 perform DHCP inter-server protocol functions. All DHCP servers have 81 synchronized clocks (e.g., using NTP). Through these protocol 82 sessions between pairs of servers, a server can inform other servers 83 about new bindings or about lease extensions on existing bindings and 84 can inform other servers about bindings that have been released. 86 The collection of bindings managed by the DHCP servers is essentially 87 a distributed database. The servers can use the inter-server 88 protocol to synchronize changes to the database and ensure coherency 89 among the individual servers. 
However, latency in the 90 synchronization process means that the bindings on some servers may 91 be stale. Potentially, clients could receive invalid configuration 92 information based on these stale bindings. The inter-server protocol 93 is designed to ensure that clients always receive valid configuration 94 information. 96 1.1 Terminology 98 This document uses the following terms: 100 + "DHCP client" 102 A DHCP client is an Internet host using DHCP to obtain 103 configuration parameters such as a network address. 105 + "DHCP server" 107 A DHCP server is an Internet host that returns configuration 108 parameters to DHCP clients. 110 + "binding" 112 A binding is a collection of configuration parameters, 113 including at least an IP address, associated with or "bound to" 114 a DHCP client. Bindings are managed by DHCP servers. 116 + "Local Server" 118 A Local Server (LS) references the particular server in 119 question. 121 + "Directly Connected Server" 123 A Directly Connected Server (DCS) references servers which are 124 directly connected to (or one hop removed from) the LS. 126 + "Remote Server" 128 A Remote Server (RS) references servers two or more hops 129 removed from the LS. 131 + "Server Group" 133 A Server Group (SG) is the set of associated servers providing 134 the redundant database for the common set of PCs, workstations, 135 etc. 137 1.2 Protocol Goals 138 The DHCP inter-server protocol is developed with the following 139 objectives: 141 + Develop a highly available DHCP server architecture. 143 + Maintain the client behavior in the current non-redundant DHCP 144 protocol [1]. 146 + Maintain the design goals of the DHCP Client/Server protocol as 147 identified in [1]. 149 + Maintain uniqueness of the assigned IP addresses. 151 + Minimize changes to the behavior of the BOOTP Relay Agents. 153 + Ease redundant server administration. Administration should be 154 primarily isolated to a single server of the replica server group. 
155 Failure recovery should be automatic. 157 The DHCP inter-server protocol provides the following functions: 159 + Distribution of address assignment information, 161 + Distribution of lease release (as a result of DHCPRELEASE) 162 information, 164 + Reallocation of available addresses and 166 + Query about whether a specific address is "in use". 168 1.3 Approach Philosophy 170 The remainder of this document discusses the SCSP as applied to the 171 problem of developing the DHCP inter-server protocol. Two redundant 172 server behavior models are developed: the Peer Redundant Server Model 173 (PRSM), where all servers are roughly equivalent in their actions, and 174 the Primary/Secondary Redundant Server Model (PSRSM), where the 175 primary server handles all interaction with the DHCP clients. Over 176 time, one of the behavior models will be chosen and fully developed 177 as the DHCP inter-server protocol. 179 Section 2 of this document presents an overview of the SCSP protocol 180 and a discussion of the issues to resolve in building the DHCP 181 inter-server protocol on the SCSP capabilities. The issues to be 182 resolved include the choice of the redundant server 183 behavior model for the DHCP inter-server protocol. Section 3 184 presents the Peer Redundant Server Model (PRSM), where all servers are 185 roughly equivalent in their actions. Section 4 presents the 186 Primary/Secondary Redundant Server Model (PSRSM). Here a primary 187 server handles all the interaction with the DHCP clients where 188 changes to the client's binding are required. Included in the 189 discussion of the PRSM and the PSRSM is a description of the ways in 190 which DHCP servers will use the protocol to coordinate assignment, 191 release and expiration of bindings to guarantee consistent 192 interactions between DHCP servers and clients. These sections also 193 contain a list of the open questions to resolve for the full 194 development of the respective models. 
We anticipate that this list 195 of open questions will be resolved in subsequent drafts. Section 5 196 presents the DHCP-specific Client State Advertisement and Client 197 State Advertisement Summary records. These are required to map the 198 DHCP inter-server protocol onto the SCSP capabilities. Section 6 199 contains conclusions. 201 2. Analysis of SCSP for DHCP Inter-server Protocol 203 This section presents a brief overview of the SCSP protocol. Further 204 details are found in the appendices and in Reference [2]. An analysis 205 of the issues to resolve to build the DHCP inter-server protocol on 206 top of the SCSP capabilities is presented following the SCSP 207 Overview. 209 2.1 SCSP Overview 211 The SCSP protocol consists of three separate sub-protocols, i.e., 213 the status of the inter-server connection, 215 + the "Cache Alignment" protocol: this protocol defines the cache 216 synchronization capability for new servers and servers that, for 217 whatever reason, have lost synchronization, and 219 + the "Client State Update" protocol: this protocol provides the 220 ongoing server cache synchronization through asynchronous client 221 state updates. 223 These sub-protocols define the semantics and high-level syntax of 224 generic message sets and their exchanges in support of the 225 capabilities provided. The SCSP associates replica databases into 226 Server Groups. The SCSP supports both point-to-point and point-to- 227 multipoint connections between the LS and the DCS(es). We discuss 228 each of these sub-protocols in more detail in the appendices below. 230 For now we accept that these capabilities are generically provided 231 and analyze possible redundant DHCP server overlays on top of the 232 SCSP. Within DHCP, the notion of SCSP Server Groups (SG) is defined 233 by those servers supporting a common set of client PCs, workstations, 234 etc. 
Then, in general, we have multiple redundant servers supporting 235 distinct sets of client PCs which may be remote from their supporting 236 servers. Logically, the remote PCs are connected to their 237 geographically dispersed servers via DHCP relay agents and IP 238 transport. The relay agents may have multiple interfaces to the 239 network. 241 For discussion purposes we say that SG A supports the client base A, 242 SG B supports the client base B and so on. Relay agents A1, A2, 243 servers A1, A2, ... 245 2.2 Issues to Resolve for DHCP Inter-server Protocol Development 247 The SCSP does not fully define the redundant DHCP inter-server 248 protocol. It does provide an underlying capability. Several issues 249 must be addressed in order to fully define the DHCP inter-server 250 protocol. These include: 252 + What behavior model will the redundant servers within a SG 253 employ? 255 + Can the DHCP inter-server protocol be developed without 256 modifying the behavior of the relay agents and the clients? 258 + How do servers in the SG identify a "failed" server? 260 + What are the DHCP protocol specific client state records defined 261 in SCSP? 263 + How does SCSP support the synchronization of pre-configured (or 264 provisioned) database information? 266 + What is the nature of the server-to-server connection? 268 + What topologies will be supported? 270 We discuss each of these separately below. Within each of the 271 appendices, which present short overviews of the SCSP sub-protocols, 272 further elaboration on some of the above issues is provided. 274 2.2.1 What behavior model will the redundant servers within a SG employ? 275 Two distinct models are Peer Behavior and Primary/Secondary Behavior. 276 These two models are more fully developed in Sections 3 and 4 277 respectively. 279 2.2.2 How do servers in the SG identify a "failed" server? 281 the LS know that the DCS is disconnected from the client pool 282 associated with their SG? 
Does the fact that the LS is disconnected 283 from the DCS yet connected to the client pool indicate that the DCS 284 is necessarily disconnected from the client pool? I.e., does routing 285 transitivity hold? 287 2.2.3 Can the DHCP inter-server protocol be developed without modifying 288 the behavior of the relay agents and the clients? 290 In particular, when a server fails and another server picks up its 291 bindings, how do the client's lease extensions, lease releases, etc. 292 get to the new server? Does the relay agent replicate the messages 293 to all servers in a Server Group? How do the servers within a single 294 Server Group respond to client requests, discovery, extension, 295 release? 297 In [3] there is a discussion of a Relay Agent caching an association 298 between a client and a server for the duration of the lease to help 299 provide some load sharing capabilities. If this is in fact 300 implemented, then the Relay Agent would have to move this association to the 301 backup server in the event that the client's server failed. 303 2.2.4 What are the DHCP protocol specific client state records defined 304 in SCSP? 306 The SCSP defines a generic message set and semantics and associated 307 client state records. The specifics of the DHCP bindings must be 308 mapped into this message set and client records. Specifically, it is 309 required to define the DHCP protocol specific CSAS and CSA records 310 which are part of the CA and CSU messages, respectively. Loosely, 311 the CSA record within a DHCP implementation is the client binding and 312 the CSAS is a summary message and pointer to the CSA on the 313 originating server. 315 2.2.5 How does SCSP support the synchronization of pre-configured (or 316 provisioned) database information? 318 The Client State Advertisement (and Summary) records are explicitly 319 defined to support client requested bindings (or summaries). 
But 320 there is information provisioned into DHCP servers which must be 321 distributed to a new replica server. How this information is 322 replicated needs definition within the DHCP inter-server protocol 323 through the exchange of SCSP messages. 325 2.2.6 What is the nature of the server-to-server connection? 327 SCSP was developed within the ION working group and relies on the 328 existence of an underlying layer-two connection. What is the nature of the 329 corresponding connection for the DHCP server-to-server case? Is it 330 none, i.e., simple UDP/IP connectivity? (Are the acknowledgment and 331 timeout procedures within SCSP robust enough to run over UDP?) Or is 332 it a TCP connection? (Need to define a TCP port number or dynamic 333 assignment of the port for this protocol to run over.) 335 2.2.7 What topologies will be supported? 337 The SCSP supports both point-to-multipoint and point-to-point 338 connections between the LS and the DCS. It also supports both full-mesh 339 and partial-mesh interconnection of servers within an SG. What 340 impact on the system performance will these different topologies 341 have? 343 Each of the above issues must be addressed for the DHCP inter-server 344 protocol independent of the use of the generic capabilities offered by SCSP. 345 The value of the SCSP is that it provides the lower-level connection 346 maintenance, database synchronization and asynchronous database 347 update capabilities that are required of any redundant server 348 architecture. By relying on SCSP for the lower-level synchronization 349 capabilities, the work of defining the DHCP inter-server protocol is 350 greatly simplified. This simplification would allow the working 351 group to focus on resolving the DHCP inter-server protocol specific 352 issues identified above, having the effect of accelerating the 353 progress of this protocol development. 355 3. 
Peer Redundant Server Models 357 In the Peer Redundant Server Model (PRSM) all servers of the SG 358 behave roughly identically. Each can respond to the initial 359 DHCPREQUESTs of the clients, each is the owner of its particular 360 bindings, etc. They all are capable of randomly servicing clients 361 from a pool and all are responsible for propagating the binding 362 information within the SG. This model has the advantages that it 363 provides load balancing and a graceful fault recovery (once defined). 364 It has the disadvantages that it is harder to ensure non-duplicate 365 address assignments and the client bindings are distributed, 366 potentially making fault isolation more difficult. 368 3.1 PRSM Description 369 The PRSM supports multiple servers within a single SG. Within the SG 370 the actions and behavior of all servers are roughly equivalent to one 371 another. Any of the servers can handle the DHCP client/server 372 interactions. The servers within the SG maintain sufficient TCP 373 connectivity that the resultant graph spans the set of servers in the 374 SG. All DHCP servers within the SG have synchronized clocks, e.g., 375 using NTP. The Relay Agents forward messages to all servers in the 376 SG. 378 The approach proposed for the PRSM, which we believe is conceptually 379 the easiest to develop, is that 1) unallocated addresses belong to 380 different servers (however, they can be redistributed through the 381 Address Redistribution Procedure), and 2) once a binding is made, and 382 for the duration of that binding, it 'belongs' to the server that made it (unless 383 the server dies or becomes disconnected from its set of clients). 384 States in which the bound client unicasts back to that server are 385 handled sufficiently well with this approach. 
(Note: There are 386 probably failure scenarios where the client unicasts back, e.g., 387 sends a DHCPDECLINE from the REQUESTING-state or a DHCPRELEASE from 388 the BOUND-state, to a server which has recently died that need to be 389 thought through in some detail.) Client states where the bound 390 client broadcasts back to the SG are handled somewhat differently. In 391 this case, only the owner of the binding should respond if a change 392 to the binding is requested, e.g., a lease extension. If a change to 393 the binding is not required, e.g., the client is in the INIT-REBOOT- 394 state and is only verifying an existing binding, then any of the 395 servers may respond. 397 When a server dies (or becomes disconnected), the bindings (and 398 unallocated addresses) belonging to it are passed to another server of 399 the SG according to some rule. The rule could be a simple list 400 administered into the definition of the SG which defines which server 401 is to pick up the bindings belonging to the dead (or disconnected) 402 server. (We suspect that this new server should change the and 403 should propagate these new CSA records to the other servers in the 404 SG.) Therefore, this model relies on the notion of server 405 'ownership' of the client binding. The ownership is communicated 406 through the 408 Prior to committing any change to a client binding, e.g., sending a 409 DHCPACK, the LS must communicate this information to at least one 410 DCS in the SG. This may cause excessive delay in servicing DHCP 411 client requests. However, this is necessary to guarantee that no 412 duplicate address assignments occur. The advantage of requiring 413 forwarding to only one backup server is that this scales well as the 414 number of servers in a SG grows; you do not have to forward to all 415 servers in a SG. There are performance improvements possible in an 416 implementation, e.g., you could forward to two, but wait for the 417 acknowledgment from only one. 
Therefore, if you are running this 418 protocol over noisy facilities, this would improve the probability of 419 getting the forwarding out to at least one other server the first 420 time. 422 When a server boots and establishes connectivity to the other servers 423 in the SG or re-establishes connectivity to other servers in the SG, 424 it synchronizes its cache according to the cache alignment protocol 425 as described in [2]. 427 When a server loses connectivity to another server, it should check 428 to see if it is picking up ownership of the dead server's bindings. If so, 429 it should appropriately modify the CSA records associated with the 430 dead server. It should then force the SCSP cache alignment process 431 with each of its remaining DCSs prior to servicing any further client 432 messages. (Note: we're assuming there is a mechanism to force the cache 433 alignment process?) 435 The available address pool is distributed over the peer servers in 436 the server group. Each unallocated address 'belongs' to a specific 437 server. The Address Redistribution Procedure distributes unallocated 438 addresses to the peer servers. If a server runs low on unallocated 439 addresses, it can request additional unallocated addresses through the 440 Address Redistribution Procedure. If it is out of unallocated 441 addresses, it must obtain more before it can make DHCPOFFERs. This 442 effectively decouples the servicing of clients from the request for 443 unallocated addresses and should provide better performance and 444 scaling. 446 In the event of a server failure, the unallocated addresses 447 associated with the failed server must be available to another server 448 or servers in the SG. These addresses are passed to another server 449 in the server group along with the bindings which belonged to the 450 failed server according to a rule as discussed above. Unallocated 451 addresses are redistributed by the Address Redistribution Procedure 452 on an as-needed basis. 
The Address Redistribution Procedure is TBD. 454 3.2 Protocol actions 456 There are several DHCP protocol interactions that can change the 457 address assignment information managed by DHCP servers: 459 + New address assignment 461 + Lease extension 463 + Lease expiration 464 + RELEASE 466 In the remainder of this section, each case is discussed along with 467 PRSM actions to avoid passing invalid configuration information to 468 clients. Server actions which do not change the nature of a binding, 469 e.g., binding verification requests from a client in the INIT- 470 REBOOT-state, can be serviced by any of the servers in the SG. 472 3.2.1 New Address Assignment 474 When a DHCP server assigns a new IP address to a DHCP client (as part 475 of an INIT-state transaction), the server adds that assignment to its 476 local database of bindings. The server must use an IP address that 477 is available for assignment from its local address pool and must 478 inform at least one of the other DHCP servers about the newly created 479 binding by completing the transmission of a CSU message containing 480 the CSA record to the other server or servers. These actions must be 481 completed just prior to sending the DHCPACK. The SCSP protocol 482 requires the DCS(s) to forward this CSU throughout the remainder of 483 the SG. (Note: Specify the options/type/priority fields in the CSA 484 message.) 486 To identify an IP address that may be assigned to the new client, the 487 server picks an address from its local pool of assignable addresses 488 (as described in the Address Redistribution Procedure) that is not 489 currently in the server's list of bindings. If the server is 'low' 490 on available addresses for assignment, it should initiate the Address 491 Redistribution Procedure (soon after servicing the immediate client 492 request) in order to obtain additional addresses. If no addresses are 493 available for local assignment, no DHCPOFFER can be sent to the 494 client. 
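The ordering constraint in Section 3.2.1 (pick a free local address, inform at least one other server, and only then send the DHCPACK) can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the protocol: the names PeerServer, receive_csu and assign_new_address are hypothetical, an in-memory method call stands in for a CSU message carrying the CSA record, and the boolean return value stands in for its acknowledgment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class PeerServer:
    """Hypothetical stand-in for one DHCP server in a Server Group (SG)."""
    name: str
    pool: Set[str] = field(default_factory=set)             # unallocated addresses owned by this server
    bindings: Dict[str, str] = field(default_factory=dict)  # address -> client id
    reachable: bool = True

    def receive_csu(self, address: str, client_id: str) -> bool:
        """Record a binding announced by a peer; the result models the acknowledgment."""
        if not self.reachable:
            return False
        self.bindings[address] = client_id
        return True


def assign_new_address(local: PeerServer, peers: List[PeerServer],
                       client_id: str) -> Optional[str]:
    """Return the address that may be DHCPACKed, or None if nothing was committed."""
    # Pick an address from the local pool that is not already bound.
    candidates = sorted(a for a in local.pool if a not in local.bindings)
    if not candidates:
        return None  # nothing available locally, so no DHCPOFFER can be sent
    address = candidates[0]
    # Inform at least one other server before committing; any() stops at the first ack.
    if not any(peer.receive_csu(address, client_id) for peer in peers):
        return None  # no peer acknowledged, so the DHCPACK must not be sent
    local.pool.discard(address)
    local.bindings[address] = client_id
    return address  # only at this point may the DHCPACK go out
```

Stopping at the first acknowledgment mirrors the "at least one DCS" requirement; forwarding through the rest of the SG is left to SCSP and is not modeled here.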
496 3.2.2 Lease Renewal 498 A DHCP server may choose to extend the lease of a DHCP client in 499 response to a DHCPREQUEST message from a client in INIT-REBOOT-state. 500 This server must be the 'owner' of the client binding. This lease 501 extension is propagated by the extending server to at least one other 502 server by successfully transmitting a CSU message containing the CSA 503 record with the lease extension. This must happen prior to the 504 server transmitting the DHCPACK to the client. The SCSP protocol 505 ensures the propagation of this information to all servers in the SG. 507 DISCUSSION: 509 The details of this propagation require a little care in their 510 design. The delay between lease extension and distribution to 511 other servers leaves a window in which some servers may have 512 different lease expiration times for a particular binding. During 513 that window, a client may reboot and get an old lease expiration 514 date or a server may determine that a lease has expired (based on 515 an old lease expiration date) after it has been extended on 516 another server. 518 If a client receives an old expiration date (that has not been 519 extended), the client will reset its expiration date to that old 520 value. If the lease is sufficiently close to expiring, the client 521 will use DHCP to extend the lease. Even if this extension takes 522 place on a different server, the servers will eventually converge 523 to agree on the expiration time last issued to the client. 525 A server may determine that a lease has expired prior to 526 notification of the extension of that lease. If the server takes 527 no explicit action other than to delete the expired binding from 528 its database, the extended lease will propagate to the server from 529 the extending server. The following section describes lease 530 expiration in more detail. 
532 It is hoped that this issue can be resolved by employing the 533 notion of binding ownership, e.g., lease extensions should not 534 happen without explicit communication with the server currently 535 owning the CSA record. The details need to be worked out and 536 changes to this section made. 538 3.2.3 Lease Expiration 540 When a DHCP server determines that the lease on a binding has 541 expired, the server simply drops that binding from its database and 542 takes no other explicit action. The address in that binding is 543 available to be allocated to another client at this time by the 544 server owning that unallocated address. 546 DISCUSSION: 548 If a server takes no other specific action than to delete the 549 binding from its database, premature expiration (expiration based 550 on a stale expiration date) will have no effect. The extending 551 server will distribute the information about the lease extension 552 to the other servers, synchronizing all of the other servers to 553 the new expiration date. 555 The only potential problem arising from premature expiration is 556 reassignment of an address that is still in use. The notion that 557 a server owns the client binding and the associated address should 558 prevent this situation from occurring. 560 3.2.4 Lease RELEASE 562 When a DHCP server receives a DHCPRELEASE from a client and the 563 server is the owner of that client binding, the server should expire 564 that binding and transmit a CSU message containing the CSA record of 565 the release notification to at least one of the other servers in the 566 SG. The other servers discard the binding record from their 567 databases upon receipt of the CSA record containing the DHCPRELEASE 568 notification. 
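The owner-only handling of DHCPRELEASE described above can be sketched as follows. This is an illustration under stated assumptions: the Server class and its method names are hypothetical, and for brevity the sketch notifies every peer directly, whereas the text requires transmission to only one other server and relies on SCSP to carry the CSA record to the rest of the SG.

```python
from typing import Dict, List


class Server:
    """Hypothetical SG member that tracks which server owns each binding."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.bindings: Dict[str, str] = {}  # address -> name of the owning server

    def receive_release_csa(self, address: str) -> None:
        """Discard the binding on receipt of a CSA record carrying the release."""
        self.bindings.pop(address, None)

    def handle_dhcprelease(self, address: str, peers: List["Server"]) -> bool:
        """Expire the binding and notify peers, but only if this server owns it."""
        if self.bindings.get(address) != self.name:
            return False  # not the owner: take no action
        del self.bindings[address]
        for peer in peers:  # simplification: models the end state after SCSP propagation
            peer.receive_release_csa(address)
        return True
```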
570 If the RELEASEing server discovers any other server that has 571 responded to a DHCPREQUEST message from the DHCP client for the 572 RELEASEd address after the RELEASE message was received, the client 573 is still using the address and the lease is still valid. In this 574 case, the server that has responded to the DHCPREQUEST message 575 retains the ownership of the binding and distributes that binding to 576 at least one of the other servers. 578 DISCUSSION: 580 The case discussed in the second paragraph is actually a DHCP 581 protocol error on the part of the client; after issuing a 582 DHCPRELEASE, the client MUST go to INIT-state and request a new 583 address. However, as there is no mechanism in DHCP through which 584 the server can inform the client of such an error, the servers 585 must accommodate the error and maintain the consistency of the 586 binding database. 588 In the event that the original server has died prior to receiving 589 a RELEASE message from the client, the RELEASE message will not be 590 propagated to the remaining servers. This is due to the fact that 591 the RELEASEing client unicasts the message to the dead server. 592 The implications of this need to be fully determined. Currently, 593 no actions are defined to try to 'capture' the client RELEASE by 594 another server in the SG. 596 3.3 Address Redistribution Procedure 598 This procedure is TBD. 600 Several requirements imposed on this procedure are identified in the 601 above PRSM. These include: 603 + The redistribution procedure must be capable of distributing the 604 unallocated addresses at SG initialization or when initializing a 605 new server of the SG. 607 + The redistribution procedure should fairly distribute 608 unallocated addresses. 610 DISCUSSION: 612 The Address Redistribution Procedure has not been fully thought 613 out. However, the procedure may be as simple as the following 614 algorithm. 
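As a rough illustration of the arithmetic in the exchange this DISCUSSION outlines (an LS holding n unallocated addresses compared against a DCS holding m), the following sketch models each pool as a plain list. The function names are hypothetical, integer division is assumed for the halving because the text does not say how to round, and the optional transfer from LS to DCS is always performed here for simplicity.

```python
from typing import List


def exchange(ls_pool: List[str], dcs_pool: List[str]) -> None:
    """One LS/DCS exchange; both pools are modified in place.

    n is the LS's count of unallocated addresses, m the DCS's. The side
    holding more hands over half of the difference.
    """
    n, m = len(ls_pool), len(dcs_pool)
    if m > n:
        for _ in range((m - n) // 2):
            ls_pool.append(dcs_pool.pop())
    elif m < n:
        # The text makes this direction optional ("may request").
        for _ in range((n - m) // 2):
            dcs_pool.append(ls_pool.pop())


def rebalance(ls_pool: List[str], dcs_pools: List[List[str]]) -> None:
    """Walk the LS's list of DCSs, performing one exchange with each."""
    for dcs_pool in dcs_pools:
        exchange(ls_pool, dcs_pool)
```

No throttling is modelled; the sketch only shows that addresses are moved, never created or lost, during the exchanges.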
A server which realizes that it is low on unallocated 615 addresses (associated with a given subnet), may initiate a request 616 to DCS(s) for more unallocated addresses. A server may find 617 itself in this situation either at initialization time, reboot, or 618 by allocating most of its owned addresses. The server then goes 619 down its list of DCS(s). For each DCS, the LS sends a request for 620 additional addresses. Contained in this request is the number of 621 unallocated addresses it currently owns, say n. The receiving DCS 622 compares this to its number of unallocated addresses, say m. If m 623 > n, then the DCS must respond to the LS with (m - n)/2 addresses. 624 If m < n, then the DCS may request the LS to provide it with (n - 625 m)/2 addresses. The LS continues this procedure until it has 626 corresponded with each of its DCS(s). 628 To avoid situations/conditions where addresses are sparse and 629 potential battles for addresses would occur, there probably needs 630 to be some sort of throttling mechanism to slow down the requests. 632 3.4 Open Questions for the PRSMs 634 + Are these the only cases in which binding information may become 635 out of date? 637 + Are these solutions correct? 639 + Need to fully develop procedures for DHCPDECLINE, DHCPRELEASE 640 and all 'lost' packet and failure scenarios. 642 + Servers cooperating to achieve "fair" distribution of available 643 addresses through the Address Redistribution Procedure. 645 + Can a cache alignment process be 'simultaneously' imposed on all 646 servers in the SG? 647 The philosophical approach taken in defining the actions of the 648 assigning server is to force it to inject the information into 649 at least one other server in the SG just prior to committing a 650 change in a client state, e.g., an IP address assignment, a 651 lease extension, etc. 
Then, force all servers to go into a 652 'simultaneous' cache alignment process in the event of a server 653 failure in the group to ensure that the most recent CSA records 654 are fully propagated prior to further assignments or extensions 655 being made by the group. This is to ensure non-duplicate 656 address assignments. But the specifics of how to force a 657 'simultaneous' cache alignment are to be determined. 659 + User intervention in case of database incoherency 660 Fixing the collective database on the DHCP servers in case of a 661 problem could be a *real* nightmare. 663 + DHCP server maintenance 664 There is likely an opportunity for the development of a server 665 management tool that would download the database information 666 from all servers and check for conflicts/inconsistencies such 667 as assignment of an IP address to multiple clients, bindings 668 that are not replicated across all servers, bindings that have 669 inconsistent lease expiration times, etc. 671 4. Primary/Secondary Redundant Server Models 673 In the Primary/Secondary Behavior model, a single server in the SG is 674 primary and is responsible for servicing all clients and for 675 distributing the resulting binding information to the other servers. All other servers 676 are secondary. Secondary servers may participate in client/server 677 interactions when no modification to an existing binding is required, 678 e.g., a client verification request. When the primary server fails, 679 one of the secondary servers becomes the new primary. One mechanism to 680 elect the primary server within an SG is described in Appendix C of 681 [2]. Another mechanism is to simply define through an administrative 682 rule the order of ascension. Currently, the Primary Election Process 683 for the PSRSM is to be determined.
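The administrative-rule alternative mentioned above can be sketched as picking the first reachable server from a configured list. This is only an illustration of that idea: the function name, the representation of servers as strings, and the way reachability is supplied are all assumptions, and the election process itself remains to be determined.

```python
from typing import Iterable, Optional, Sequence


def next_primary(order: Sequence[str], reachable: Iterable[str]) -> Optional[str]:
    """Return the first server in the configured order that is reachable.

    `order` is the administratively defined succession list; `reachable`
    is whatever set of servers the caller currently believes to be alive.
    """
    alive = set(reachable)
    for server in order:
        if server in alive:
            return server
    return None   # no configured server is reachable
```

The sketch does not address a partitioned SG, in which each side could apply the rule independently and arrive at a different primary.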
685 This model has the advantages of being conceptually simple to discuss, 686 minimizing issues associated with duplicate address assignments and 687 isolating the ownership of the bindings to a single server at any 688 point in time. It has the disadvantage of not fully supporting load 689 balancing. 691 4.1 PSRSM Description 693 The PSRSM supports multiple servers within a single SG. Within the 694 SG a single server acts as the "Primary" server; all other servers 695 act as "Secondary" servers. The Primary server is responsible for 696 handling all DHCP client/server interactions which require a change 697 to a client binding. The role of the secondary servers is to maintain 698 a redundant server cache in the event that the primary server fails. 700 However, if a change to the binding is not required, e.g., the client 701 is in the INIT-REBOOT-state and is only verifying an existing 702 binding, then any of the secondary servers may respond as well. The 703 servers within the SG maintain sufficient TCP connectivity that the 704 resultant graph spans the set of servers in the SG. All DHCP servers 705 within the SG have synchronized clocks, e.g., using NTP. The Relay 706 Agents forward messages to all servers in the SG. 708 Prior to committing to any change in a client binding, e.g., sending 709 a DHCPACK, the Primary server must communicate this change to at 710 least one secondary DCS. This may cause excessive delay in servicing 711 DHCP client requests. However, this is necessary to guarantee that 712 no duplicate address assignments occur. The advantage of requiring 713 forwarding to only one backup server is that this scales well as the 714 number of servers in an SG grows; the Primary does not have to forward to all 715 servers in the SG. There are performance improvements possible in an 716 implementation, e.g., the Primary could forward to two, but wait for the 717 acknowledgment from only one.
Therefore, if this 718 protocol is run over noisy facilities, this would improve the probability of 719 getting the forwarding out to at least one other server the first 720 time. 722 Within this model, ownership of a client binding always resides with 723 the Primary server. Because the Primary server is solely responsible 724 for the servicing of all client requests which require changes to be 725 made to the client binding, it can potentially represent a 726 performance bottleneck. A possible solution to this problem is to 727 limit the number of subnets (and hosts) supported by an SG in the 728 PSRSM. However, in situations where the majority of the 729 client/server interactions are related to verification of existing 730 bindings, load balancing can occur because the secondary servers may 731 respond to these client requests as well as the primary server. 733 When a server boots and establishes connectivity to the other servers 734 in the SG or re-establishes connectivity to other servers in the SG, 735 it synchronizes its cache as described in [2]. A newly established 736 (or reconnected) server within the SG can initiate the Primary Server 737 Election Process. The Primary Server Election Process is TBD (one 738 such election process is discussed in Appendix C of [2]). 740 When a secondary server or group of secondary servers becomes 741 disconnected from the primary server (for whatever reason), it 742 initiates the Primary Server Election Process. The servers can be 743 disconnected for many reasons, e.g., a failure of the primary server 744 process or a network failure causing the connection to be dropped. 745 When a secondary server becomes disconnected from other secondary 746 servers, this is not cause to initiate the Primary Server Election 747 Process. Once the primary server is newly elected, it should go 748 through the SCSP cache alignment protocol with each of the remaining 749 secondary servers prior to servicing client messages.
(Note: are we 750 assuming there is a mechanism to force the cache alignment process?) 751 (Note: There are probably failure scenarios where the client unicasts 752 back, e.g., sends a DHCPDECLINE from the REQUESTING-state or a 753 DHCPRELEASE from the BOUND-state, to a server which has recently died 754 that need to be thought through in some detail.) 756 4.2 Protocol Actions 758 There are several DHCP protocol interactions that can change the 759 address assignment information managed by DHCP servers: 761 + New address assignment 763 + Lease extension 765 + Lease expiration 767 + RELEASE 769 In the remainder of this section, each case is discussed along with 770 PSRSM inter-server protocol actions to avoid passing invalid 771 configuration information to clients. Server actions which do not 772 change the nature of a binding, e.g., binding verification requests 773 from a client in the INIT-REBOOT-state, can be serviced by any of the 774 servers in the SG. 776 4.2.1 New Address Assignment 778 Just prior to sending the DHCPACK, the primary server completes the 779 transmission of a CSU message containing the CSA record for the 780 client binding to at least one of the secondary DCSs. The SCSP 781 protocol requires the DCS(s) to forward this CSU throughout the 782 remainder of the SG. (Note: Specify the options/type/priority 783 fields in the CSA message.) 785 If a newly elected Primary server receives a DHCPREQUEST with a 786 'server identifier' other than its own, it should respond to this 787 DHCPREQUEST. (How would this currently happen?) 789 4.2.2 Lease Renewal 791 Just prior to sending the DHCPACK, the primary server completes the 792 transmission of a CSU message containing the CSA record for the 793 renewed client binding to at least one of the secondary DCSs. The 794 SCSP protocol requires the DCS(s) to forward this CSU throughout the 795 remainder of the SG. (Note: Specify the options/type/priority 796 fields in the CSA message.)
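The ordering required in Sections 4.2.1 and 4.2.2, where the CSU must reach at least one secondary before the DHCPACK is sent, can be sketched as follows. The callables send_csu and send_dhcpack are hypothetical hooks standing in for the real transports, and withholding the DHCPACK when no secondary acknowledges is this sketch's reading of the requirement rather than behaviour the text spells out.

```python
from typing import Callable, Iterable


def commit_binding(
    secondaries: Iterable[str],
    send_csu: Callable[[str], bool],
    send_dhcpack: Callable[[], None],
) -> bool:
    """Send the CSU to secondaries until one acknowledges, then ACK the client.

    Returns True if the DHCPACK was sent. If no secondary acknowledges the
    CSU, the DHCPACK is not sent and False is returned.
    """
    for dcs in secondaries:
        if send_csu(dcs):        # CSU transmission completed to this DCS
            send_dhcpack()       # only now is the change committed
            return True
    return False
```

Trying secondaries one at a time keeps the example short; an implementation could equally send to two in parallel and proceed on the first acknowledgment, as Section 4.1 suggests.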
798 4.2.3 Lease Expiration 800 When the primary server determines that the lease on a binding has 801 expired, the server simply drops that binding from its database and 802 takes no other explicit action. The address in that binding may be 803 assigned to a new client at this time. When a secondary server 804 determines that the lease on a binding has expired, the server simply 805 drops that binding from its database and takes no other explicit 806 action. 808 4.2.4 Lease RELEASE 810 When a primary server receives a DHCPRELEASE from a client, the 811 primary server completes the transmission of a CSU message containing 812 the CSA record for the released client binding to at least one of 813 the secondary DCSs. The servers discard the lease from their 814 databases. 816 DISCUSSION: 818 There are probably failure scenarios where the client unicasts 819 back, e.g., sends a DHCPDECLINE from the REQUESTING-state or a 820 DHCPRELEASE from the BOUND-state, to a server which has recently 821 died that need to be thought through in some detail. In this 822 case, there is no mechanism currently defined for the newly 823 elected primary server to receive the client's RELEASE message. 825 4.3 Primary Server Election Process 827 The Primary Server Election Process is to be determined. 829 DISCUSSION: 831 This may be as simple as defining an 'administrative 832 rule' to determine the order of succession (as discussed above in 833 the case of passing binding ownership in the PRSM). Or this 834 may be more automatic through the definition of an election 835 process, such as that identified in Appendix C of [2]. 837 4.4 Open Questions for the PSRSM 839 + Can a cache alignment process be 'simultaneously' imposed on all 840 servers in the SG?
841 The philosophical approach taken in defining the actions of the 842 assigning primary server is to force it to inject the 843 information into at least one other server in the SG just prior 844 to committing a change in a client state, e.g., an IP address 845 assignment, a lease extension, etc. Then, force all servers to 846 go into a 'simultaneous' cache alignment process in the event 847 of a primary server failure in the group to ensure that the 848 most recent CSA records are fully propagated prior to further 849 assignments or extensions being made by the group. This is to 850 ensure non-duplicate address assignments. But the specifics of 851 how to force a 'simultaneous' cache alignment is to be 852 determined. 854 + Need to define the new primary server election process. 856 + Need to fully develop procedures for DHCPDECLINE and all 'lost' 857 packet scenarios and failure scenarios. 859 5. DHCP Specific CSA and CSAS Records 861 This section presents the CSA and the CSAS records specific to the 862 DHCP inter-server protocol. These records apply to both the PRSM and 863 the PSRSM and so are presented separately in this section. 865 The assumptions made in defining the DHCP client/server protocol 866 specific records are the following: 868 + Must provide the capability for the auto-configuration of a new 869 server. One ancillary use of the inter-server protocol is in 870 configuring new DHCP servers. The DHCP inter-server protocol 871 should allow the download of a server's configuration file and to 872 allow addition of a new server to the list of DHCP servers. A new 873 server might be configured by simply giving it the address of an 874 existing server. The new server could then download a list of all 875 other known servers, the pool of candidate addresses, any special 876 configuration information (e.g., vendor class information) and the 877 existing bindings. The new server could also announce itself to 878 all of the other existing servers. 
880 + A 'boot record' is required which carries the provisioned 881 portion of the DHCP server cache. This is the information which 882 contains the administrative information defining the address 883 range, 'scopes', 'registered clients', etc. It is assumed that 884 this record is vendor specific (because of the different 885 implementations of the server configuration files) and will be 886 defined as such. This boot record will satisfy the capabilities 887 discussed in the previous bullet item. (Note: this requires a lot 888 more thought.) 890 + The CSAS and the CSA records are maximally defined at this 891 point. Because client DHCPDISCOVER messages can contain client 892 specific requests for parameters, it is necessary to embed the 893 full set of options (committed to the client in the DHCPOFFER 894 message) within the CSA record. If it is determined at a later 895 date that there is information in the CSA records which is 896 locally derivable, then this information will be removed from the 897 definition of the CSA records. 899 5.1 CSAS Records 901 According to the semantics of the CSAS record defined in [2], the 902 CSAS record should maximally contain the 'CSA Sequence Number', the 903 'Search String' and the server 'Originator ID'. Further, the 904 sequence number is defined in the generic portion of the CSAS record; 905 only the search string and the originator ID are DHCP protocol 906 specific.
908 The format of the CSAS record for the DHCP inter-server protocol is: 910 0 1 2 3 911 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 912 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 913 | CSA Sequence Number | 914 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 915 |type | state | htype | hlen | reserved | 916 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 917 | chaddr (16 octets) | 918 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 919 | ciaddr | 920 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 921 | Server ID (encoded as in BOOTP options, tag=54) (6 octets) | 922 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 923 | Optional ClientID (encoded as tag=61) (variable) | 924 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 925 | End Option (encoded as in BOOTP options, tag=255) (1 octet) | 926 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 928 Figure 5.1-1 DHCP inter-server CSAS record format 929 where 931 CSA Seq.No - is part of the generic SCSP CSAS record format 932 defined in [2] 934 type - represents the type of the CSAS record, e.g., client, boot 935 state - represents the state of the (client) record, e.g., 936 reserved, unbound, bound, extended 938 htype - hardware address type (defined in [4]) 940 hlen - hardware address length 942 chaddr - client hardware address 944 ciaddr - client IP address (if assigned). If not assigned, this 945 field is all 0s. 947 Server ID - the Server ID encoded as in the DHCP options and BOOTP 948 vendor extensions defined in [3]. 950 (Optional) Client ID - this field is the optional Client ID 951 encoded as in the DHCP options and BOOTP vendor extensions defined 952 in [3]. If present, the Client ID is the 'search string'. 954 End option - marks the end of the CSAS record 956 The CSA sequence number is part of the generic CSAS record defined in 957 [2].
The remainder of the CSAS record is the client/server protocol 958 specific portion of the record. The portion beginning with the 959 Server ID is encoded as defined in the DHCP Options and BOOTP Vendor 960 Extensions in [3] using a 'tag, length, variable' encoding scheme. 962 DISCUSSION: 964 The inclusion of the 'type' and 'state' fields needs more thought. 965 There is a desire to provide the capability to dynamically 966 propagate boot files between servers. There are probably other 967 ways to indicate the fact that the CSAS record points to a 'boot 968 file' versus a 'client record', but it is felt that this is the 969 most straightforward. 971 The record identified above is really meant to represent the 972 format for a 'client record', not the 'boot file' record. The 973 format of the 'boot file' record is to be determined. The 974 SCSP CSA record supports fragmentation (with a fragmentation 975 sequence number field of 15 bits). Therefore, a CSA record could 976 accommodate a large boot file transfer. 978 The 'state' field is currently included as a placeholder. There 979 may be a need to be able to explicitly identify the state of a 980 client record. This field is placed here in anticipation of this 981 requirement. 983 The SCSP requires only the 'search string', the sequence number 984 and the Originator ID (here the Server ID). The Client ID option 985 was included because it is allowed in the DHCP protocol and is 986 used as the 'search string' if it is included. The default 987 'search string' is the chaddr plus ciaddr combination. In the 988 event that the ciaddr is not assigned to the client, this field is 989 all 0s.
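The option-encoded tail of the CSAS record and the choice of 'search string' described above can be sketched as follows. Several details are assumptions made for the example and are not fixed by this document: the function names, padding chaddr to 16 octets, and forming the default search string by concatenating chaddr and ciaddr in that order. The tag values 54, 61 and 255 are those shown in Figure 5.1-1.

```python
import socket
from typing import Optional

TAG_SERVER_ID = 54    # Server ID option, as in Figure 5.1-1
TAG_CLIENT_ID = 61    # optional Client ID option
TAG_END = 255         # End option, a single octet with no length


def tlv(tag: int, value: bytes) -> bytes:
    """Encode one option as tag, one-octet length, value."""
    if len(value) > 255:
        raise ValueError("option value too long for a one-octet length")
    return bytes([tag, len(value)]) + value


def csas_options(server_ip: str, client_id: Optional[bytes] = None) -> bytes:
    """Build the part of the CSAS record that begins with the Server ID."""
    out = tlv(TAG_SERVER_ID, socket.inet_aton(server_ip))  # 6 octets in total
    if client_id is not None:
        out += tlv(TAG_CLIENT_ID, client_id)
    return out + bytes([TAG_END])


def search_string(chaddr: bytes, ciaddr: str,
                  client_id: Optional[bytes] = None) -> bytes:
    """Client ID if present; otherwise chaddr plus ciaddr (assumed layout)."""
    if client_id is not None:
        return client_id
    # An unassigned ciaddr is passed as "0.0.0.0", i.e. all 0s.
    return chaddr.ljust(16, b"\x00") + socket.inet_aton(ciaddr)
```

For a server at 192.0.2.1 with no Client ID, csas_options yields the seven octets 54, 4, 192, 0, 2, 1, 255, which matches the 6-octet Server ID and 1-octet End Option sizes given in the figure.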
991 5.2 CSA Records 993 The format of the CSA record for the DHCP inter-server protocol is: 995 0 1 2 3 996 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 997 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 998 |F| Fragment Number | TTL | 999 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1000 | CSA Sequence Number | 1001 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1002 | Server Group ID | 1003 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1004 |type | state | htype | hlen | reserved | 1005 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1006 | chaddr (16 octets) | 1007 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1008 | ciaddr | 1009 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1010 | Lease Time Stamp | 1011 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1012 | Server ID (encoded as in BOOTP options, tag=54) (6 octets) | 1013 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1014 | IP Address Lease Time (encoded as tag=51) (6 octets) | 1015 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1016 | Optional ClientID (encoded as tag=61) (variable) | 1017 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1018 | End Option (encoded as in BOOTP options, tag=255) (1 octet) | 1019 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1021 Figure 5.2-1 DHCP inter-server CSA record format 1022 where 1024 F - final bit, used to indicate the last fragment of a record 1026 Fragment Number - sequence number of the various fragments of a 1027 fragmented CSA record 1029 TTL - time to live for a packet. This represents the number of 1030 hops that a CSA takes before it is dropped. At each server that 1031 the CSA record traverses, the TTL is decremented by one.
1033 CSA Seq.No - is part of the generic SCSP CSA record format 1034 defined in [2] 1036 Server Group ID - a 32-bit identification field that uniquely 1037 identifies both the client/server protocol for which the servers 1038 of the SG are being synchronized, e.g., DHCP, and the 1039 instance of that protocol. This implies that multiple instances 1040 of that same protocol may be in operation at the same time and 1041 have their servers synchronized independently of each other. 1043 type - represents the type of the CSA record, e.g., client, boot 1045 state - represents the state of the (client) record, e.g., 1046 reserved, unbound, bound, extended 1048 htype - hardware address type (defined in [4]) 1050 hlen - hardware address length 1052 chaddr - client hardware address 1054 ciaddr - client IP address (if assigned). If not assigned, this 1055 field is all 0s. 1057 Lease Time Stamp - a time stamp indicating when the lease was made 1058 to the client. The specifics of this field are to be determined. 1059 The intent of this field is to allow another server (e.g., a newly 1060 booting server) to be able to determine the time this client's 1061 lease should expire (given as the sum of the Lease Time Stamp and 1062 the IP Address Lease Time below). 1064 Server ID - the Server ID encoded as in the DHCP options and BOOTP 1065 vendor extensions defined in [3] 1067 IP Address Lease Time - the IP Address Lease Time encoded as in 1068 the DHCP options and BOOTP vendor extensions defined in [3] 1070 (Optional) Client ID - this field is the optional Client ID 1071 encoded as in the DHCP options and BOOTP vendor extensions defined 1072 in RFC 1533. If present, the Client ID is the 'search string'.
1074 Remaining Options - any remaining options carried in the original 1075 DHCPOFFER message to the client encoded as in the DHCP options and 1076 BOOTP vendor extensions defined in [3] 1077 End option - marks the end of the CSA record 1079 The F-bit, Fragment Number, TTL, CSA sequence number and Server 1080 Group ID are part of the generic CSA record defined in [2]. The 1081 remainder of the CSA record is the client/server protocol specific 1082 portion of the record. The portion beginning with the Server ID is 1083 encoded as defined in the DHCP Options and BOOTP Vendor Extensions in 1084 [3] using a 'tag, length, variable' encoding scheme. 1086 DISCUSSION: 1088 As discussed in the previous section on the CSAS record format, 1089 the format shown above is intended to be the client-type CSA 1090 record. Given a desire to support automatic booting of new servers 1091 and that the intent here is to support this boot file exchange 1092 through the CSA record, the bootfile-type CSA 1093 record needs to be defined. This will probably be vendor specific 1094 and will probably rely on the fragmentation capability of the CSA 1095 record provided for in the SCSP [2]. 1097 5.3 Open Questions with the CSAS and CSA Records 1099 The following questions are identified as outstanding issues to be 1100 resolved for the CSAS and CSA record definitions to be considered 1101 complete: 1103 + Is the right approach for new server boot file transfers to rely 1104 on the CSA records defined within the SCSP? 1106 + Is it necessary to communicate the 'state' field information in 1107 the CSAS and CSA records? 1109 + How should the Lease Time Stamp be encoded? 1111 6. Conclusion 1113 To be determined. 1115 Appendix A: The SCSP "Hello" Sub-protocol Overview 1117 The function of the SCSP "Hello" protocol is to monitor the status of 1118 the LS to DCS connection. The LS must be configured with the 1119 addresses of its DCSs.
For each DCS (whether the low level 1120 connection is point-to-point or point-to-multipoint), the LS 1121 maintains a Hello Finite State Machine (HFSM). The HFSM is shown 1122 in the figure below. 1124 +---------------+ 1125 | | 1126 +-------@| DOWN |@-------+ 1127 | | | | 1128 | +---------------+ | 1129 | | @ | 1130 | | | | 1131 | | | | 1132 | | | | 1133 | @ | | 1134 | +---------------+ | 1135 | | | | 1136 | | WAITING | | 1137 | +--| |--+ | 1138 | | +---------------+ | | 1139 | | @ @ | | 1140 | | | | | | 1141 | @ | | @ | 1142 +---------------+ +---------------+ 1143 | BIDIRECTION |----@| UNIDIRECTION | 1144 | | | | 1145 | CONNECTION |@----| CONNECTION | 1146 +---------------+ +---------------+ 1148 Figure A-1 The Hello Finite State Machine 1150 Key: 1152 1: Link layer connection is established 1154 2: Transition based upon the receipt of a Hello message (and 1155 whether the LS ID is found in the Rec ID portion of the message) 1157 3: Hello Interval * Dead Factor exceeded 1159 4: Loss of link layer connectivity 1161 The LS to DCS connections are initialized into the down state. The 1162 numbers in the figure refer to the actions discussed in the Key that 1163 cause a transition in the HFSM. The Hello protocol employs poll 1164 messages to monitor the status of the LS to DCS connections. The 1165 format of the Hello message is shown below. 1167 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1168 | LS ID | RecID1, .....RecIDn | Hello Int | Dead Factor | 1169 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1171 Figure A-2 Hello message format 1173 The first field contains the LS ID. The following fields contain the 1174 IDs of the DCSs that the LS has received a Hello message from. The 1175 LS's HFSM uses these IDs to determine the status of the HFSM for each 1176 of the DCSs. Multiple DCS IDs are present in order to support 1177 point-to-multipoint connections.
The following field is the Polling 1178 Interval and the last field is a Dead Factor. The product of the 1179 Polling Interval and the Dead Factor determines the length of time 1180 that the HFSM will hold open a connection without receiving a Hello 1181 from a peer DCS and transitioning the HFSM for that DCS to the Wait 1182 state. 1184 Issues to resolve for DHCP Server-to-Server Implementation: 1186 + The transition from the Down to the Wait state is made when the 1187 link level connection between the servers is made. The DHCP 1188 inter-server protocol needs to generalize this trigger because the 1189 path between redundant DHCP servers may not be a link level 1190 virtual circuit. Possible triggers include a) the establishment 1191 of a TCP session between the servers or b) the return of a ping 1192 off the distant server. 1194 Appendix B: The SCSP "Cache Alignment" Sub-protocol Overview 1196 The Cache Alignment protocol supports the initial server cache 1197 synchronization process of an LS with its DCSs. This process may 1198 occur at initial boot time of the server, at reconnect time of the 1199 server to the network, or other possible initialization or failure 1200 recovery scenarios. Like the Hello protocol, the Cache Alignment 1201 (CA) protocol maintains a Cache Alignment Finite State Machine 1202 (CAFSM) for each of its DCSs to monitor the status of its cache 1203 alignment. The figure below shows the CAFSM and indicates some of 1204 the triggers that would cause the state transitions to occur. 
1206 +------------+ 1207 | | 1208 +---@| DOWN | 1209 | | | 1210 | +------------+ 1211 | | 1212 | | 1213 | @ 1214 | +------------+ 1215 | |Master/Slave| 1216 |----| |@---+ 1217 | |Negotiation | | 1218 | +------------+ | 1219 | | | 1220 | | | 1221 | @ | 1222 | +------------+ | 1223 | | Cache | | 1224 |----| |----| 1225 | | Summarize | | 1226 | +------------+ | 1227 | | | 1228 | | | 1229 | @ | 1230 | +------------+ | 1231 | | Update | | 1232 |----| |----| 1233 | | Cache | | 1234 | +------------+ | 1235 | | | 1236 | | | 1237 | @ | 1238 | +------------+ | 1239 | | | | 1240 +----| Aligned |----+ 1241 | | 1242 +------------+ 1244 Figure B-1 Cache Alignment Finite State Machine 1246 Key: 1248 1: When HFSM reaches Bi-directional state 1250 2: HFSM transitions out of Bi-directional state 1251 3: Master/Slave relationship is established 1253 4: Once both LS and DCS exchange CA messages, both with O-bit set 1254 to 0, then CRL is complete 1256 5: E.g., Errored sequence number 1258 6: Full cache update achieved 1260 Each of the CAFSMs is coupled with the respective HFSMs in the LS. 1261 The CAFSM is initialized in the Down state. It transitions to the 1262 Master/Slave Negotiation state when the corresponding HFSM 1263 transitions to the Bi-Directional state. The CAFSM transitions back 1264 to the Down state in the event that the corresponding HFSM 1265 transitions out of the Bi-Directional state. 1267 In the Master/Slave state the LS-DCS pair negotiate who is to be the 1268 master of the connection during the cache alignment process. In the 1269 Cache Summary state the LS/DCS pair exchange Client State 1270 Advertisement Summary (CSAS) records within the CA messages. The 1271 servers use these message exchanges to build a Client State 1272 Advertisement Request List (CRL). The CRL indicates the portions of 1273 the respective server caches that are out of alignment. 
The cache 1274 misalignment (as indicated in the local CRL) is resolved in the 1275 Update Cache state where the servers exchange full client state 1276 information in CSA records within the CSU messages, only where 1277 misalignment occurs. Once the CRL is resolved, the LS/DCS caches are 1278 aligned and the CAFSM transitions to the Aligned state. 1280 The protocol further defines the high-level syntax of a generic CA 1281 message. This format is shown in the figure below. 1283 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1284 | LS ID | DCS ID | CA Seq.No. | M-bit | I-bit | O-bit | 1285 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1287 Figure B-2 Cache Alignment (CA) message format 1289 The message format consists of a CA Header followed by zero or more 1290 Client State Advertisement Summary (CSAS) records. The CA header 1291 consists of the LS and DCS IDs, a Sequence Number, and the M, I, and O 1292 bits. The M bit indicates the Master/Slave relationship, the I bit 1293 indicates that the Master/Slave relationship is being negotiated and 1294 the O bit indicates more messages are to be exchanged. 1296 Issues to resolve for DHCP Server-to-Server Implementation: 1298 The SCSP generic message syntax and semantics are defined, but not 1299 the detailed mappings required for the DHCP Server-to-Server 1300 implementation. Messages to be defined include: 1302 - Client State Advertisement Summary (CSAS) records within the 1303 Cache Alignment messages 1305 - Client State Advertisement (CSA) records within the Client 1306 State Update (CSU) messages 1308 + Need to define the set of triggers which initiate the Cache 1309 Alignment Protocol. Clearly on server boot initialization. But 1310 how does a well-behaved server determine that, due to network 1311 topology changes, it needs to trigger the Cache Alignment 1312 Protocol?
1314 + When building the CRL, the LS has to be able to determine, based 1315 upon the CSAS messages, that a particular client record is 1316 "out-of-date". The SCSP defines the term "search string" which is the 1317 key used to access the server cache, e.g., the client HW 1318 address for the DHCP implementation. The CA header also contains 1319 a sequence number which is monotonically increasing and is 1320 assigned by the originating LS (e.g., a primary DHCP server in the 1321 primary/secondary behavior model discussed above). The 1322 determination of the client state record's quality has to be 1323 specified. 1325 Appendix C: The SCSP "Client State Update" Sub-protocol Overview 1327 The purpose of the Client State Update (CSU) protocol is to provide a 1328 capability to constantly update the server caches through 1329 asynchronous CSU message exchanges. These updates are necessary 1330 because the status of the clients is in constant flux. Unlike the 1331 other two sub-protocols, the Client State Update protocol does not 1332 maintain a separate finite state machine. Instead, the activity of 1333 this protocol is tied to the CAFSM. 1335 Each CSU can contain zero or more Client State Advertisement records. 1336 The LS may send and receive CSUs when the corresponding CAFSM is in 1337 either the Aligned or the Cache Update states. The CSU protocol 1338 defines both CSU request and reply messages. As is consistent 1339 throughout the definition of the SCSP, the CSU protocol supports both 1340 point-to-point and point-to-multipoint connections. 1342 Issues to resolve for DHCP Server-to-Server Implementation: 1344 The specific format of the Client State Advertisement (CSA) 1345 records within the CSU messages needs to be defined for the DHCP 1346 implementation. 1348 References 1350 [1] Droms, R., "