Network Working Group                                        Ralph Droms
INTERNET DRAFT                                       Bucknell University
                                                             Kim Kinnear
                                                              Mark Stapp
                                                           Cisco Systems
                                                             Bernie Volz
                                                            Steve Gonczi
                                                        Process Software
                                                              Greg Rabil
                                                             Mike Dooley
                                                              Arun Kapur
                                                       Quadritek Systems
                                                               June 1999
                                                   Expires December 1999

                         DHCP Failover Protocol

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (C) The Internet Society (1999). All Rights Reserved.

Droms, et. al.           Expires December 1999                  [Page 1]

Internet Draft           DHCP Failover Protocol                June 1999

Abstract

DHCP [RFC 2131] allows for multiple servers to be operating on a single network. Some sites are interested in running multiple servers in such a way so as to provide redundancy in case of server failure. In order for this to work reliably, the cooperating primary and secondary servers must maintain a consistent database of the lease information. This implies that servers will need to coordinate any and all lease activity so that this information is synchronized in case of failover.

This document defines a protocol to provide this synchronization between two servers. One server is designated the "primary" server, the other is the "secondary" server.
Additionally, this document describes a protocol which allows each server to determine to which DHCP clients it should provide service, both when both servers are operating, in order to support load balancing, and when one server has failed, in order to support increased DHCP service availability.

This document is a complete rewrite of draft-ietf-dhc-failover-03.txt. That earlier draft described a UDP based failover protocol, and this draft describes a closely related protocol which uses TCP as a transport and includes new load-balancing and security capabilities.

Table of Contents

   1.  Introduction
   2.  Terminology
   2.1.  Requirements terminology
   2.2.  DHCP and failover terminology
   3.  Background and External Requirements
   3.1.  Key aspects of the DHCP protocol
   3.2.  BOOTP relay agent implementation
   3.3.  What does it mean if a server can't communicate with its partner?
   3.4.  Challenging scenarios for a Failover protocol
   3.5.  Using TCP to detect partner server failure
   4.  Design Goals
   4.1.  Design requirements for this protocol
   4.2.  Goals for this protocol
   4.3.  Limitations of this Protocol
   5.  Protocol Overview
   5.1.  Messages and States
   5.2.  Fundamental restrictions
   5.3.  Load balancing
   5.4.  Operating in NORMAL state
   5.5.  Operating in COMMUNICATIONS-INTERRUPTED state
   5.6.  Operating in PARTNER-DOWN state
   5.7.  Operating in RECOVER state
   6.  Packet Formats
   6.1.  Common message format
   6.2.  Common option format
   6.3.  BNDUPD message format
   6.4.  BNDACK message format
   6.5.  Bulking for BNDUPD and BNDACK messages
   6.6.  UPDREQ message format
   6.7.  UPDREQALL message format
   6.8.  UPDDONE message format
   6.9.  POOLREQ message format
   6.10.  POOLRESP message format
   6.11.  CONNECT message format
   6.12.  CONNECTACK message format
   6.13.  STATE message format
   6.14.  CONTACT message format
   7.  Protocol Messages
   7.1.  BNDUPD message
   7.2.  BNDACK message
   7.3.  UPDREQ message
   7.4.  UPDREQALL message
   7.5.  UPDDONE message
   7.6.  POOLREQ message
   7.7.  POOLRESP message
   7.8.  CONNECT message
   7.9.  CONNECTACK message
   7.10.  STATE message
   7.11.  CONTACT message
   8.  Connection Management
   8.1.  Connection granularity
   8.2.  Creating the TCP connection
   8.3.  Using the TCP connection for determining communications status
   8.4.  Using the TCP connection for binding data
   8.5.  Using the TCP connection for control messages
   8.6.  Losing the TCP connection
   9.  Protocol States
   9.1.  Server Initialization
   9.2.  Server State Transitions
   9.3.  STARTUP state
   9.4.  PARTNER-DOWN state
   9.5.  RECOVER state
   9.6.  NORMAL state
   9.7.  COMMUNICATIONS-INTERRUPTED State
   9.8.  POTENTIAL-CONFLICT state
   9.9.  RECOVER-DONE state
   9.10.  PAUSED state
   9.11.  SHUTDOWN state
   10.  Safe Period
   11.  Security
   11.1.  Simple shared secret
   11.2.  TLS
   12.  Hash algorithm for load balancing
   13.  Acknowledgments
   14.  References
   15.  Author's information
   16.  Full Copyright Statement

1.  Introduction

DHCP [RFC 2131] allows for multiple servers to be operating on a single network. Some sites are interested in running multiple servers in such a way so as to provide redundancy in case of server failure since the DHCP subsystem is in many cases a critical part of the network infrastructure.

This document defines a protocol to provide synchronization between two servers in order that each can take over for the other should either one fail or become unreachable. One server is designated the "primary" server, the other is the "secondary" server, and all DHCP client requests are sent to each server.

In order to provide a high availability DHCP service, these cooperating primary and secondary servers must maintain a consistent database of lease information. This implies that servers will need to coordinate any and all lease activity so that this information is synchronized in case failover is required. The protocol messages and processing techniques required to maintain a consistent database are specified in the protocol described here.

The failover protocol also contains an algorithm which allows each server to determine to which DHCP clients it should provide service when both servers are operating normally, and this capability can be used to support load balancing.

2.  Terminology

This section discusses both the generic requirements terminology common to many IETF protocol specifications as well as specialized DHCP and failover protocol specific terminology.

2.1.  Requirements terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC 2119].

2.2.  DHCP and failover terminology

This document uses the following terms:

   o "DHCP client" or "client"

      A DHCP client is an Internet host using DHCP to obtain configuration parameters such as a network address.

   o "DHCP server" or "server"

      A DHCP server is an Internet host that returns configuration parameters to DHCP clients.

   o "binding"

      A binding is a collection of configuration parameters, including at least an IP address, associated with or "bound to" a DHCP client. Bindings are managed by DHCP servers.

   o "binding database"

      The collection of bindings managed by a primary and secondary.

   o "failover endpoint"

      The failover protocol allows for there to be a unique failover endpoint per partner per role (where role is primary or secondary). This failover endpoint can take actions and hold unique states. There are thus a maximum of two failover endpoints per server per partner (one for each partner as a primary and one for that same partner as a secondary).

   o "lazy update"

      Lazy update refers to the requirement placed on a server implementing a failover protocol to update its failover partner whenever the binding database changes. A failover protocol which didn't support lazy update would require the failover partner update to be complete before a DHCP server could respond to a DHCP client request with a DHCPACK. A failover protocol which does support lazy update places no such restriction on the update of the failover partner server, and so a server can allocate an IP address or extend a lease on an IP address and then update its failover partner as time permits. A failover protocol which supports lazy update not only removes the requirement to update the failover partner prior to responding to a DHCP client with a DHCPACK, but also allows gathering up batches of updates from one failover server to its partner.

   o "subnet address pool"

      A subnet address pool is the set of IP addresses which is associated with a particular network number and subnet mask. In the simple case, there is a single network number and subnet mask and a set of IP addresses. In the more complex case (sometimes called "secondary subnets", sometimes "superscopes"), several (apparently unrelated) network number and subnet mask combinations with their associated IP addresses may all be configured together into one subnet address pool.

   o "Primary server" or "Primary"

      A DHCP server configured to provide primary service to a set of DHCP clients for a particular set of subnet address pools.

   o "Secondary server" or "Secondary"

      A DHCP server configured to act as backup to a primary server for a particular set of subnet address pools.

   o "stable storage"

      Every DHCP server is assumed to have some form of what is called "stable storage". Stable storage is used to hold information concerning IP address bindings (among other things) so that this information is not lost in the event of a server failure which requires restart of the server.

   o "MCLT"

      The MCLT refers to maximum client lead time. This time is configured on the primary server and transmitted from the primary to the secondary server in the CONNECT message. It is the maximum amount of time that one server can give to a client for a binding beyond that known and ACKed by the partner server. See section 5.2.1 for details.

3.  Background and External Requirements

This section highlights key aspects of the DHCP protocol on which the failover protocol depends. It also discusses the requirements that the failover protocol places on other aspects of the network infrastructure, and some general issues surrounding server failure detection. Some failure scenarios that provide particular challenges to a failover protocol are discussed.
Finally, the challenges inherent in using a TCP connection as a means to detect failure of a partner server are elaborated.

3.1.  Key aspects of the DHCP protocol

The failover protocol is designed to augment the DHCP protocol as described in RFC 2131 [RFC 2131]. There are several key aspects of the DHCP protocol which are required by the failover protocol in order to successfully meet its design goals.

3.1.1.  Broadcast behavior

There are two aspects of the broadcast behavior of the DHCP protocol which are key to making the failover protocol operate successfully. The first is simply that the DHCP protocol requires a DHCP client to broadcast all DHCPDISCOVER and DHCPREQUEST/INIT-REBOOT messages. Because of this requirement, a DHCP client that was communicating with one server will automatically be able to communicate with another server if one is available.

The second aspect of broadcast behavior is similar to the first, but involves the distinction between a DHCPREQUEST/RENEW and DHCPREQUEST/REBINDING. A DHCPREQUEST/RENEW is the message that a DHCP client uses to extend its lease. It is unicast to the DHCP server from which it acquired the lease. However, the DHCP protocol (in a farsighted move) was explicitly designed so that in the event that a DHCP client cannot contact the server from which it received a lease on an IP address using a DHCPREQUEST/RENEW, the client is required to broadcast its renewal using a DHCPREQUEST/REBINDING to any available DHCP server. Since all DHCP clients were required to implement this algorithm, the failover protocol can have a different server from the one that initially granted a lease be the server to renew a lease. Thus, one server can take over for another with no interruption in the service as experienced by the DHCP client or its associated applications software.

3.1.2.  Client responsibility

In the DHCP protocol the DHCP clients are entrusted with a considerable responsibility. In particular, after they are granted a lease on an IP address, they are enjoined to only use that IP address while their lease is valid. Every DHCP client is expected to stop using an IP address if the expiration time on the lease has passed and if it cannot get an extension on the lease for that IP address from some DHCP server.

Thus, the correct behavior of every DHCP client in this regard is required to ensure the integrity of the DHCP service. On the other hand, incorrect behavior by a client in this area will tend to adversely affect at most one other DHCP client.

Furthermore, any DHCP client which sends in a DHCPREQUEST/RENEW or DHCPREQUEST/REBINDING to a DHCP server (either unicast for a RENEW or broadcast for a REBINDING) MUST still have time to run on the lease for that IP address. The DHCP server sends the DHCPACK back unicast to the IP address from which the RENEW or REBINDING originated.

Given the existing responsibility placed on the client to only use an IP address when the lease is valid, and to only send in a RENEW or REBINDING if the lease is valid, the failover protocol relies on DHCP clients to perform responsibly and will, in the absence of conflicting information, believe a DHCP client that is attempting to RENEW or REBIND a lease on an IP address is the legitimate owner of that IP address.

One troublesome issue is that of the DHCP client responsibility when sending in DHCPREQUEST/INIT-REBOOT requests. While the original DHCP RFC was written to require a DHCP client to have time left to run on the lease for an IP address if the client is sending an INIT-REBOOT request, it was sufficiently unclear that some client vendors didn't realize this until recently.
Since the INIT-REBOOT request was sent with the IP address in the dhcp-requested-address option and not in the ciaddr (for perfectly good reasons), the similarity to the RENEW and REBINDING case was lost on many people.

At present, the failover protocol does not assume that a client sending in an INIT-REBOOT request necessarily has a valid lease on the IP address appearing in the dhcp-requested-address option in the INIT-REBOOT request.

The implications of this are as follows: Assume that there is a DHCP client that gets a lease from one server while that server is unable to communicate with its failover partner. Then, assume that after that client reboots it is able only to communicate with the other failover server. If the failover servers have not been able to communicate with each other during this process, then the DHCP client will get a new IP address instead of being able to continue to use its existing IP address. This will affect no applications on the DHCP client, since it is rebooting. However, it will use up an additional IP address in this marginal case.

3.1.3.  Stable storage update before DHCPACK

The DHCP protocol allocates resources, and in order to operate correctly it requires that a DHCP server update some form of stable storage prior to sending a DHCPACK to a DHCP client in order to grant that client a lease on an IP address.

One of the goals of the failover protocol is that it not add significant additional time to this already time consuming requirement to update stable storage prior to a DHCPACK. In particular, adding a requirement to communicate with another server prior to sending a DHCPACK would simplify the failover protocol, but it would unacceptably limit the potential scalability of any DHCP server which employed the failover protocol.

3.2.  BOOTP relay agent implementation

Many DHCP clients are not resident on the same network segment as a DHCP server. In order to support this form of network architecture, most contemporary routers implement something known as a BOOTP Relay Agent. This capability inside of a router listens for all broadcasts at the DHCP port, port 67, and will relay any broadcasts that it receives on to a DHCP server. The IP address of the DHCP server must have been previously configured into the router. As part of the relay process, the relay agent will place the address of the interface on which it received the broadcast into the giaddr field of the DHCP packet.

Since the failover protocol requires two DHCP servers to receive any broadcast DHCP messages, in order to work with DHCP clients which are not local to the DHCP server, the BOOTP relay agent on the router closest to the DHCP client must be configured to point at more than one DHCP server.

Most BOOTP relay agent implementations allow this duplication of packets.

If this is not possible, an administrator might be able to configure the relay agent with a subnet broadcast address, but in this case the primary and secondary DHCP servers in a failover pair must both reside on the same subnet. While this is a realistic configuration, it is not the one that most people will use.

3.3.  What does it mean if a server can't communicate with its partner?

In any protocol designed to allow one server to take over some responsibilities from a partner server in the event of "failure" of that partner server, there is an inherent difficulty in determining when that partner server has failed. In fact, it is fundamentally impossible for one server to distinguish a network communications failure from the outright failure of the server with which it is trying to communicate.
In the case where each server is handing out resources (in this case IP addresses) to a client community, mistaking an inability to communicate with a partner server for failure of that partner server could easily cause both servers to be handing out the same IP addresses to different clients.

One way that this is sometimes handled is for there to be more than two servers. In the case of an odd number of servers, the servers that can still communicate with a majority of other servers will consider themselves operational, and any server which can't communicate with a majority of other servers must immediately cease operations.

While this technique works in some domains, having the only server with which a DHCP client can communicate voluntarily shut itself down seems like something worth avoiding.

The failover protocol will operate correctly while both servers are unable to communicate, whether they are both running or not. At some point there may be resource contention, and if one of the servers is actually down, then the operator can inform the other server and the operational server will be able to use all of the downed server's resources.

The protocol also allows detection of an orderly shutdown of a participating server.

3.4.  Challenging scenarios for a Failover protocol

There exist two failure scenarios which provide particular challenges to the correctness guarantees of a failover protocol.

3.4.1.  Primary Server crash before "lazy" update:

In the case where the primary server sends a DHCPACK to a client for a newly allocated IP address and then crashes prior to sending the corresponding update to the secondary server, the secondary server will have no record of the IP address allocation. When the secondary server takes over, it may well try to allocate that IP address to a different client.
In the case where the first client to receive the IP address is not on the net at the time (yet while there was still time to run on its lease), an ICMP echo (i.e., ping) will not prevent the secondary server from allocating that IP address to a different client.

The failover protocol deals with this situation by having the primary and secondary servers allocate addresses for new clients from disjoint address pools. See section 5.4 for details.

A more likely (in that DHCPRENEWs are presumably more common than DHCPDISCOVERs) and more subtle version of this problem is where the primary server crashes after extending a client's lease time, and before updating the secondary with a new time using a lazy update. After the secondary takes over, if the client is not connected to the network the secondary will believe the client's lease has expired when, in fact, it has not. In this case as well, the IP address might be reallocated to a different client while the first client is still using it.

This scenario is handled by the failover protocol through control of the lease time and the use of the maximum client lead time (MCLT). See section 5.2.1 for details.

3.4.2.  Network partition where DHCP servers can't communicate but each can talk to clients:

Several conditions are required for this situation to occur. First, due to a network failure, the primary and secondary servers cannot communicate. As well, some of the DHCP clients must be able to communicate with the primary server, and some of the clients must now only be able to communicate with the secondary server. When this condition occurs, both primary and secondary servers could attempt to allocate IP addresses for new clients from the same pool of available addresses. At some point, then, two clients will end up being allocated the same IP address. This will cause problems when the network failure that created this situation is corrected.
The failover protocol deals with this situation by having the primary and secondary servers allocate addresses for new clients from disjoint address pools. See section 5.4 for details.

3.5.  Using TCP to detect partner server failure

There are several characteristics of TCP that are important to the functioning of the failover protocol, which uses one TCP connection both for bulk data transfer and to assess communications integrity with the other server. Reliable and ordered message delivery are chief among these important characteristics.

It would be nice to use the capabilities built into TCP to allow it to determine if communications integrity exists to the failover partner, but this strategy contains some problems which require analysis. There exist three fundamental cases for an open TCP connection that must be examined.

   1. When no data is being sent then no messages are traveling across the TCP connection.

   2. When data is queued to be sent, and the receiver has not blocked the sending of additional data, then messages are flowing across the TCP connection containing the applications data.

   3. When data is queued to be sent, and the receiver has blocked the transmission of additional data, then persist messages are flowing from the receiver to the sender to ensure that the sender doesn't miss the receiver opening the window for further transmissions.

The first case can be turned into the second case by sending application-level keep-alive messages periodically when there is no other data queued to be sent. Note that TCP keep-alive messages might be used as well, but they present additional problems.

Thus, we can ensure that the TCP connection has messages flowing periodically across the connection fairly easily.
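
As an illustration only, the idea of turning the first case into the second can be sketched as a small timer that asks for an application-level keep-alive whenever the connection has been idle for too long. The class name KeepAliveTimer, its methods, and the 10 second interval are hypothetical choices made for this sketch; this section does not specify an interval or an implementation.

```python
from dataclasses import dataclass


@dataclass
class KeepAliveTimer:
    """Decides when an idle connection needs an application-level keep-alive.

    `interval` is a hypothetical tuning parameter in seconds; the text
    above does not give a value.
    """
    interval: float
    last_sent: float = 0.0

    def note_sent(self, now: float) -> None:
        # Call whenever any message (data or keep-alive) is written.
        self.last_sent = now

    def keepalive_due(self, now: float, data_queued: bool) -> bool:
        # Queued data already keeps messages flowing (case 2), so a
        # keep-alive is only needed on an otherwise idle connection (case 1).
        if data_queued:
            return False
        return now - self.last_sent >= self.interval


timer = KeepAliveTimer(interval=10.0)
timer.note_sent(now=100.0)
idle_early = timer.keepalive_due(now=105.0, data_queued=False)
idle_late = timer.keepalive_due(now=110.0, data_queued=False)
busy_late = timer.keepalive_due(now=120.0, data_queued=True)
```

A caller would poll keepalive_due() from its event loop and call note_sent() after every write. The sketch covers only the sending side; it says nothing about how long a silent partner should be tolerated.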
The question remains as to what TCP will do if the other end of the connection fails to respond (either because of network partition or because the receiving server crashes). TCP will attempt to retransmit a message with an exponential backoff, and will eventually timeout that retransmission. However, the length of that timeout cannot, in gen- eral, be set on a per-connection basis, and is frequently as long as nine minutes, though in some cases it may be as short as two minutes. One some systems it can be set system-wide, while on some systems it cannot be changed at all. A value for this timeout that would be appropriate for the failover protocol, say less than 1 minute, could have unpleasant side-effects on other applications running on the same server, assuming that it could be changed at all on the host operating system. Nine minutes is a long time for the DHCP service to be unavailable to any new clients that were being served by the server which has Droms, et. al. Expires December 1999 [Page 12] Internet Draft DHCP Failover Protocol June 1999 crashed, when there is another server running that could respond to them immediately as soon as it determines that its partner is not operational. The conclusion drawn from this analysis is that TCP provides very useful support for the failover protocol in the areas of reliable and ordered message delivery, but cannot by itself be relied upon to detect partner server failure in a fashion acceptable to the needs of the failover protocol. Additional failover protocol capabilities will need to be created to support timely detection of partner server failure. See section 8.3 for details on this mechanism. 4. Design Goals This section lists the design requirements, the design goals, and the limitations of the failover protocol. 4.1. Design requirements for this protocol The following list of requirements must be (and are) met by this pro- tocol. They are listed in priority order. 1. 
Implementations of this protocol must work with existing DHCP client implementations based on the DHCP protocol [1]. 2. Implementations of the protocol must work with existing BOOTP relay agent implementations. 3. The protocol must provide failover redundancy between servers that are not located on the same subnet. 4.2. Goals for this protocol The following goals are met by this protocol as well, though they are less important than the requirements listed above. These goals are listed in priority order. 1. Provide for continued service to DHCP clients through an automated mechanism in the event of failure of the primary server. 2. Avoid binding an IP address to a client while that binding is currently valid for another client. In other words, do not allocate the same IP address to two clients. 3. Minimize any need for manual administrative intervention. Droms, et. al. Expires December 1999 [Page 13] Internet Draft DHCP Failover Protocol June 1999 4. Introduce no additional delays in server response time as a result of the network communications required to implement the failover protocol, i.e., don't require communications with the partner between the receipt of a DHCPREQUEST and the corresponding DHCPACK. 5. Share IP address ranges between primary and secondary servers; i.e., impose no requirement that the pool of available addresses be divided between servers. 6. Continue to meet the goals and objectives of this protocol in the event of server failure or network partition. 7. Provide graceful reintegration of full protocol service after server failure or network partition. 8. Allow for one computer to act as a secondary server for multi- ple primary servers. Other topologies (e.g.: mesh) are also possible. primary and secondary servers SHOULD be viewed as "logical" servers and not necessarily physical computers. 9. 
Ensure that an existing client can keep its existing IP address binding if it can communicate with either the primary or secondary DHCP server implementing this protocol - not just whichever server that originally offered it the binding. 10. Ensure that a new client can get an IP address from some server. Ensure that in the face of partition, where servers continue to run but cannot communicate with each other, the above goals and requirements may be met. In addition, when the partition condition is removed, allow graceful automatic re- integration without requiring human intervention. 11. If either primary or secondary server loses all of the infor- mation that is has stored in stable storage, it should be able to refresh its stable storage from the other server. 12. Support load balancing between the primary and secondary servers, and allow configuration of the percentage of the client population served by each with a moderately fine granu- larity. 4.3. Limitations of this Protocol The following are explicit limitations of this protocol. 1. This protocol provides only one level of redundancy through a Droms, et. al. Expires December 1999 [Page 14] Internet Draft DHCP Failover Protocol June 1999 single secondary server for each primary server. 2. A subset of the address pool is reserved for secondary server use. In order to handle the failure case where both servers are able to communicate with DHCP clients, but unable to com- municate with each other, a subset of the IP address pool must be set aside as a private address pool for the secondary server. The secondary can use these to service newly arrived DHCP clients during such a period. The size of this private pool SHOULD be based only on the arrival rate of new DHCP clients and the length of expected downtime, and is not influ- enced in any way by the total number of DHCP clients supported by the server pair. 3. 
The primary and secondary servers do not respond to client requests at all while recovering from a failure that could have resulted in duplicate IP assignments (that is, while synchronizing in POTENTIAL-CONFLICT state).

5. Protocol Overview

This section will discuss the failover protocol at a relatively high level of detail. In the event that a description in this section conflicts (or appears to conflict due to the overview nature of this section) with information in later sections of this draft, the information in the later sections should be considered authoritative.

5.1. Messages and States

This protocol is centered around the message exchange used by one server to update the other server of binding database changes resulting from DHCP client activity:

o Communication of binding database changes

   The binding update (BNDUPD) message is used to send the binding database changes to the partner server, and the partner server responds with a binding acknowledgement (BNDACK) message when it has successfully committed those changes to its own stable storage.

All of the other messages involve ancillary issues:

o Management of available IP addresses

   The pool request (POOLREQ) is used by the secondary server to request an allocation of IP addresses from the primary server.

   The pool response (POOLRESP) is used by the primary server to inform the secondary server how many IP addresses it was allocated as the result of a pool request.

o Synchronization of the binding databases between the servers after they've been out of communications

   The update request (UPDREQ) message is used by one server to request that its partner send it all binding database information that it has not already seen.
   The update request all (UPDREQALL) message is used by one server to request that all binding database information be sent in order to recover from a total loss of its lease state database by the requesting server.

   The update done (UPDDONE) message is used by the responding server to indicate that all requested updates have been sent by the responding server and acked by the requesting server.

o Connection establishment

   The connect (CONNECT) message is used by either server to establish a high level connection with the other server, and to transmit several important configuration data items between the servers.

   The connect acknowledgement message (CONNECTACK) is used to respond to a CONNECT message from another server.

o Server synchronization

   The state change (STATE) message is used by either server to inform the other server of a change of failover state.

o Connection integrity management

   The contact (CONTACT) message is used by either server to ensure that the other server continues to see the connection as operational. It MUST be transmitted periodically over every established connection if other message traffic is not flowing, and it MAY be sent at any time.

5.1.1. Failover endpoints

The proper operation of the failover protocol requires more than the transmission of messages between one server and the other. Each endpoint might seem to be a single DHCP server, but in fact there are many situations where additional flexibility in configuration is useful. For instance, there might be several servers which are each primary for a distinct set of address pools, and one server which is secondary for all of those address pools.
The situation with the primaries is straightforward, but the secondary will need to maintain a separate failover state, partner state, and communications up/down status for each of the separate primary servers for which it is acting as a secondary.

The failover protocol calls for there to be a unique failover endpoint per partner per role (where role is primary or secondary). This failover endpoint can take actions and hold unique states. There are thus a maximum of two failover endpoints per partner (one for the partner as a primary and one for that same partner as a secondary).

Thus, in the case where there are two primary servers A and B each backed up by a single common secondary server C, there is one failover endpoint on each of A and B, and two different failover endpoints on C. The two different failover endpoints on C each have unique states and independent TCP connections.

This document describes the behavior of the protocol in terms of primary and secondary servers, not primary and secondary failover endpoints. However, it is important to remember that every 'server' described in this document is in reality a failover endpoint that resides in a particular process, and that many failover endpoints may reside in the same process.

It is not the case that there is a unique failover endpoint for each subnet that participates in a failover relationship. On one server, there is one failover endpoint per partner per role, regardless of how many subnets or address pools are managed by that combination of partner and role. Conversely, any given subnet or pool will be associated with exactly one failover endpoint on a single server.

When a connection is received from the partner, the unique failover endpoint to which the message is directed is determined solely by the IP address of the partner and the setting of the SECONDARY bit in the 'flags' field of the contact message.
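The endpoint selection rule described above can be pictured as a table lookup keyed on the partner's IP address and the local role. The Python fragment below is a minimal sketch only, not part of the protocol: the names Role, FailoverEndpoint and EndpointTable, the pool labels, and the example addresses are all hypothetical, and the mapping from the SECONDARY bit to a local role is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


@dataclass
class FailoverEndpoint:
    partner_ip: str
    local_role: Role                 # role this server plays toward the partner
    state: object = None             # each endpoint holds its own failover state
    pools: list = field(default_factory=list)


class EndpointTable:
    """Holds at most one endpoint per (partner, role) pair (hypothetical helper)."""

    def __init__(self):
        self._endpoints = {}

    def add(self, endpoint):
        key = (endpoint.partner_ip, endpoint.local_role)
        if key in self._endpoints:
            raise ValueError("only one failover endpoint per partner per role")
        self._endpoints[key] = endpoint

    def lookup(self, partner_ip, secondary_bit_set):
        # Assumption: a set SECONDARY bit means the sender is acting as the
        # secondary, so the matching local endpoint is the primary one.
        local_role = Role.PRIMARY if secondary_bit_set else Role.SECONDARY
        return self._endpoints.get((partner_ip, local_role))

    def __len__(self):
        return len(self._endpoints)


# Server C is secondary for two primaries, A and B (documentation addresses).
table_c = EndpointTable()
table_c.add(FailoverEndpoint("192.0.2.1", Role.SECONDARY, pools=["pool-A"]))
table_c.add(FailoverEndpoint("192.0.2.2", Role.SECONDARY, pools=["pool-B"]))

endpoint_for_a = table_c.lookup("192.0.2.1", secondary_bit_set=False)
```

In this sketch server C ends up with two independent endpoints, one per primary, which matches the A/B/C example in the text; the number of pools attached to an endpoint does not change the number of endpoints.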
Throughout this document, the states and actions taken by "servers" are described. The terms "server", "primary server", and "secondary server" are commonly used to describe the failover endpoint taking these states and performing these actions. This description is wholly accurate only for the simplest of cases, where all of the address pools on one server are backed up by all of the address pools on another server. In this case, there is a single failover endpoint in each server. In all other cases, the term "server" is used to describe one of the two possible failover endpoints per partner.

5.2. Fundamental restrictions

There are several fundamental restrictions this protocol places on what one server can do in the absence of knowledge of the other server, and these restrictions are key to the correct operation of the protocol.

5.2.1. Control of lease time

The key problem with lazy update is that when a server fails after updating a client with a particular lease time and before updating its partner, the partner will believe that a lease has expired even though the client still retains a valid lease on that IP address.

In order to handle this problem, a period of time known as the "Maximum Client Lead Time" (MCLT) is defined and must be known to both the primary and secondary servers. Proper use of this time interval places an upper bound on the difference allowed between the lease time provided to a DHCP client by a server and the lease time known by that server's partner.

However, the MCLT is typically much less than the lease time that a server has been configured to offer a client, and so some strategy must exist to allow a server to offer the configured lease time to a client.
During a lazy update the updating server typically updates its partner with a potential expiration time which is longer than the lease time previously given to the client and which is longer than the lease time that the server has been configured to give a client. This allows that server to give a longer lease time to the client the next time the client renews its lease, since the time that it will give to the client will not exceed the MCLT beyond the potential expiration time acknowledged by the partner.

When moving to the PARTNER-DOWN state (where a server is allowed to reallocate the partner's IP addresses), a server will wait the Maximum Client Lead Time before allocating any IP addresses from its partner's pool to any new DHCP clients. Thus, any clients which have a lease on an IP address with a lease time greater than that known by the server moving into PARTNER-DOWN state will either have contacted that server during the MCLT period or their leases will have expired.

When a server has transitioned to PARTNER-DOWN state, it MUST NOT reallocate an IP address from one client to another client until an additional maximum client lead time interval after the lease by the original client expires. (Actually, until the maximum client lead time after what it believes to be the lease expiration time of the first client.)

Some optimizations exist for this restriction, in that it only applies to leases that were issued BEFORE entering PARTNER-DOWN. Once a server has entered PARTNER-DOWN and it leases out an address, it need not wait this time as long as it has never communicated with the partner since the lease was given out.
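The waiting rules for PARTNER-DOWN state can be illustrated with a small calculation. The Python sketch below is a simplification under stated assumptions, not a normative algorithm: the function name and its parameters are hypothetical, all times are absolute seconds, and the two waits described above (MCLT after entering PARTNER-DOWN, and MCLT after the believed lease expiration) are combined by taking the later of the two.

```python
def earliest_reallocation_time(believed_expiry, issued_at, partner_down_at, mclt):
    """Earliest time an address may be given to a DIFFERENT client.

    believed_expiry  -- lease expiration time as this server believes it
    issued_at        -- time this server last issued or renewed the lease
    partner_down_at  -- time this server entered PARTNER-DOWN state
    mclt             -- Maximum Client Lead Time, in seconds
    """
    if issued_at >= partner_down_at:
        # Lease handed out after entering PARTNER-DOWN (and, by assumption,
        # with no partner contact since): no additional wait is needed.
        return believed_expiry
    # Lease predates PARTNER-DOWN: the partner may have extended it by up to
    # the MCLT without this server knowing, so wait one MCLT more. Taking the
    # max() with the entry time is this sketch's way of also honoring the
    # initial MCLT wait after entering PARTNER-DOWN.
    return max(believed_expiry, partner_down_at) + mclt


MCLT = 3600
entered_partner_down = 10_000

# Lease issued before PARTNER-DOWN, believed to expire afterwards.
print(earliest_reallocation_time(12_000, 5_000, entered_partner_down, MCLT))  # 15600
# Lease issued after PARTNER-DOWN: the optimization applies.
print(earliest_reallocation_time(20_000, 11_000, entered_partner_down, MCLT))  # 20000
```

Reallocation to the same client is not covered by this calculation.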
The fundamental relationship on which much of the correctness of this protocol depends is that the lease expiration time known to a DHCP client MUST NOT be more than the maximum client lead time greater than the potential expiration time known to a server's partner.

The remainder of this section makes the above fundamental relationship more explicit.

This protocol requires a DHCP server to deal with several different lease intervals and places specific restrictions on their relationships. The purpose of these restrictions is to allow the other server in the pair to be able to make certain assumptions in the absence of an ability to communicate between servers.

The different lease times are:

o desired lease interval

   The desired lease interval is the lease interval that a DHCP server would like to give to a DHCP client in the absence of any restrictions imposed by the Failover protocol. Its determination is outside of the scope of this protocol. Typically this is the result of external configuration of a DHCP server.

o actual lease interval

   The actual lease interval is the lease interval that a DHCP server gives out to a DHCP client in the dhcp-lease-time option of a DHCPACK packet. It may be shorter than the desired client lease interval (as explained below).

o potential lease interval

   The potential lease interval is the lease expiration interval the local server tells to its partner in the potential-expiration-time option of a BNDUPD message.

o acknowledged potential lease interval

   The acknowledged potential lease interval is the potential lease interval the partner server has most recently acknowledged in the potential-expiration-time option of a BNDACK message.

The key restriction (and guarantee) that any server makes with respect to lease intervals is that the actual client lease interval never exceeds the acknowledged potential lease interval (if any) by more than a fixed amount. This fixed amount is called the "Maximum Client Lead Time" (MCLT).

The MCLT MAY be configurable on the primary server, but for correct server operation it MUST be the same and known to both the primary and secondary servers. The secondary server determines the MCLT from the MCLT option sent from the primary server to the secondary server in the CONNECT or CONNECTACK message.

A server MUST record in its stable storage both the actual lease interval and the most recently acknowledged potential lease interval for each IP address binding. It is assumed that the desired client lease interval can be determined through techniques outside of the scope of this protocol.

Again, the fundamental relationship among these times which MUST be maintained is:

   actual lease interval <
       ( acknowledged potential lease interval + MCLT )

Figure 5.1-1 illustrates an initial lease to a client using the rules discussed in the example which follows it.

               DHCP                  Primary                  Secondary
   time        Client                Server                   Server

               |  (time in intervals) |     (absolute time)      |
               |                      |                          |
               | >-DHCPDISCOVER->     |                          |
               |   <---DHCPOFFER-<    |                          |
               |                      |                          |
               | >-DHCPREQUEST->      |                          |
               |    (selecting)       |                          |
               |                      |                          |
   t           | <--------DHCPACK-<   |                          |
               |   lease-time=MCLT    |                          |
               |                      | >-BNDUPD-->              |
               |                      |  lease-expiration=t+MCLT |
               |                      |  potential-expiration=t+(MCLT/2)+X
               |                      |                          |
               |                      |              <-BNDACK-<  |
               |                      |  potential-expiration=t+(MCLT/2)+X
              ...                    ...                        ...
               |                      |                          |
   t+MCLT/2    | >-DHCPREQUEST->      |                          |
               |    (renew)           |                          |
               |                      |                          |
   t1          | <--------DHCPACK-<   |                          |
               |   lease-time=X       |                          |
               |                      | >-BNDUPD-->              |
               |                      |  lease-expiration=t1+X   |
               |                      |  potential-expiration=t1+(X/2)+X
               |                      |                          |
               |                      |              <-BNDACK-<  |
               |                      |  potential-expiration=t1+(X/2)+X
              ...                    ...                        ...
            Figure 5.1-1: Lazy Update Message Traffic
                   X = Desired Lease Interval

DISCUSSION:

   This protocol mandates no algorithm concerning these lease intervals, as long as the above fundamental relationship is preserved. In the interests of clarity, however, let's examine a specific example. The MCLT in this case is 1 hour. The desired lease interval is 3 days, and its renewal time is half the lease interval.

   The rules for this example are:

   o What to tell the client:

      Take the remainder of the acknowledged potential lease interval. If this is a new lease, then this value will be zero. If this remainder plus the MCLT is greater than the desired lease interval, give the client the desired lease interval; otherwise give the client the remainder plus the MCLT.

   o What to tell the failover partner server:

      Take the renewal interval (typically half of the actual client lease interval), add to it the desired lease interval, and add it to the current time to yield the value that goes into the potential-expiration-time option.

      Also tell the failover partner the actual lease interval by adding it to the current time to yield the value that goes into the lease-expiration option.

   In operation this might work as follows:

   When a server makes an offer for a new lease on an IP address to a DHCP client, it determines the desired lease interval (in this case, 3 days). It then examines the acknowledged potential lease interval (which in this case is zero) and determines the remainder of the time left to run, which is also zero. To this it adds the MCLT. Since the actual lease interval cannot be allowed to exceed the remainder of the current acknowledged potential lease interval plus the MCLT, the offer made to the client is for the remainder of the current acknowledged potential lease interval (i.e., zero) plus the MCLT. Thus, the actual lease interval is 1 hour.
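   The two rules of this example can be written out as a short calculation. The Python sketch below is illustrative only and is not mandated by this protocol; the function names, the use of None for "no acknowledged potential expiration yet", and the choice of integer seconds are assumptions of the sketch. The constants are the 1 hour MCLT and 3 day desired lease interval used in this example.

```python
MCLT = 60 * 60                 # 1 hour, as in this example
DESIRED = 3 * 24 * 60 * 60     # 3 days, as in this example


def actual_lease_interval(now, acked_potential_expiry, desired=DESIRED, mclt=MCLT):
    """What to tell the client: min(desired, remainder + MCLT)."""
    if acked_potential_expiry is None:
        remainder = 0          # new lease: nothing acknowledged yet
    else:
        remainder = max(0, acked_potential_expiry - now)
    return min(desired, remainder + mclt)


def binding_update_times(now, actual, desired=DESIRED):
    """What to tell the partner: absolute lease and potential expirations."""
    renewal = actual // 2      # renewal time is half the actual lease interval
    return {
        "lease_expiration": now + actual,
        "potential_expiration": now + renewal + desired,
    }


# New lease at t = 0: the client gets only the MCLT.
t = 0
first = actual_lease_interval(t, None)
print(first)                                   # 3600
update = binding_update_times(t, first)
print(update["potential_expiration"])          # 261000, i.e. 3 days + 1/2 hour

# Renewal half an hour later, after the partner acknowledged the update:
t1 = t + first // 2
renewed = actual_lease_interval(t1, update["potential_expiration"])
print(renewed)                                 # 259200, the full 3 days
```

   Because the client is never given more than the remainder plus the MCLT, the actual lease never runs more than one MCLT past what the partner has acknowledged.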
   Once the server has performed the ACK to the DHCP client, it will update the secondary server with the lease information. However, the desired potential lease interval will be composed of one half of the current actual lease interval added to the desired lease interval. Thus, the secondary server is updated with a BNDUPD with a potential lease interval of 3 days + 1/2 hour specified in the potential-expiration-time option.

   When the primary server receives an ACK to its update of the secondary server's (partner's) potential lease interval, it records that as the acknowledged potential lease interval. A server MUST NOT send a BNDACK in response to a BNDUPD message until it is sure that the information in the BNDUPD message resides in its stable storage. Thus, the primary server in this case can be sure that the secondary server has recorded the potential lease interval in its stable storage when the primary server receives a BNDACK message from the secondary server.

   When the DHCP client attempts to renew at T1 (approximately one half an hour from the start of the lease), the primary server again determines the desired lease interval, which is still 3 days. It then compares this with the remaining acknowledged potential lease interval (3 days + 1/2 hour) and adjusts for the time passed since the secondary was last updated (1/2 hour). Thus the time remaining of the acknowledged potential lease interval is 3 days. Adding the MCLT to this yields 3 days plus 1 hour, which is more than the desired lease interval of 3 days. So the client is renewed for the desired lease interval -- 3 days.

   When the primary DHCP server updates the secondary DHCP server after the DHCP client's renewal ACK is complete, it will calculate the desired potential lease interval as the T1 fraction of the actual client lease interval (1/2 of 3 days this time = 1.5 days).
   To this it will add the desired client lease interval of 3 days, yielding a total desired partner server lease interval of 4.5 days. In this way, the primary attempts to have the secondary always "lead" the client in its understanding of the client's lease interval so as to be able to always offer the client the desired client lease interval.

   Once the initial actual client lease interval of the MCLT is past, the protocol operates effectively like the DHCP protocol does today in its behavior concerning lease intervals. However, the guarantee that the actual client lease interval will never exceed the remaining acknowledged partner server lease interval by more than the MCLT allows full recovery from a variety of failures.

5.2.2. Controlled re-allocation of IP addresses

When in PARTNER-DOWN state there is a waiting period after which an IP address can be re-allocated to another client. For leases which are available when the server enters PARTNER-DOWN state, the period is the MCLT from entry into PARTNER-DOWN state. For IP addresses which are not available when the server enters PARTNER-DOWN state, the period is the MCLT after the lease becomes available. See section 9.4.2 for more details.

In any other state, a server cannot reallocate an address from one client to another without first notifying its partner (through a BNDUPD message) and receiving acknowledgement (through a BNDACK message) that its partner is aware that that first client is not using the address.

This could be modeled in the following way. Though this specific implementation is in no way required, it may serve to better illustrate the concept.

An "available" IP address on a server may be allocated to any client. An IP address which was leased to a client and which expired or was released by that client would take on a new state, EXPIRED or RELEASED respectively.
The partner server would then be notified that this IP address was EXPIRED or RELEASED through a BNDUPD. When the sending server received the BNDACK for that IP address showing it was FREE, it would move the IP address from EXPIRED or RELEASED to FREE, and it would be available for allocation by the primary server to any clients.

A server MAY reallocate an IP address in the EXPIRED or RELEASED state to the same client with no restrictions.

5.3. Load balancing

In order to implement load balancing between a primary and secondary server pair, each server must respond to DHCPDISCOVER requests from some clients and not from other clients. In order to do this successfully, each server must be able to determine immediately upon receipt of a DHCP client request whether it is to service this request or to ignore it in order to allow the other server to service the request.

In addition, it should be possible to configure the percentage of clients which will be serviced by either the primary or secondary server. This configuration should be more or less continuous, from all serviced by the primary through an even split with half serviced by each, to all serviced by the secondary.

The technique chosen to support these goals is to define a hash function which must be applied to the client-identifier or to the htype concatenated with the chaddr if no client-identifier is specified. The result of this hash function is a number between 0 and 255 which maps into one of 256 "hash-buckets". Each hash bucket is assigned to one server or the other by the primary server whenever a connection is established, through use of the hash-bucket-assignment option.

The hash-bucket-assignment option uses a 32 octet value field (containing 256 bits), with one bit associated with each possible hash bucket.
If the bit corresponding to a hash bucket is a 1 in the hash-bucket-assignment option, then the secondary server is required to service all DHCP client requests that map into that hash bucket when in NORMAL state.

For example, if the primary server sends a hash-bucket-assignment option to the secondary with the following 32 octets:

                                    buckets
      FF FF FF FF FF FF FF FF    (   0 -  63 )
      FF FF FF FF FF FF FF FF    (  64 - 127 )
      00 00 00 00 00 00 00 00    ( 128 - 191 )
      00 00 00 00 00 00 00 00    ( 192 - 255 )

then the secondary MUST service any DHCP client requests where the client-identifier or htype concatenated with the chaddr hashes into the bucket values of 0 through 127.

See section 12 for the code to implement the hash bucket algorithm. Each server MUST implement this same algorithm in order for all clients to get service.

5.4. Operating in NORMAL state

When in NORMAL state, each server services DHCPDISCOVERs and all other DHCP requests other than DHCPREQUEST/RENEWAL or DHCPREQUEST/REBINDING from the client set defined by the load balancing algorithm. Each server services DHCPREQUEST/RENEWAL or DHCPREQUEST/REBINDING requests from any client.

In general, whenever the binding database is changed in stable storage, then a BNDUPD message is sent with the contents of that change to the partner server. The partner server then writes the information about that binding in its bindings database in stable storage and replies with a BNDACK message.

5.5. Operating in COMMUNICATIONS-INTERRUPTED state

When operating in COMMUNICATIONS-INTERRUPTED state, each server is operating independently, but does not assume that its partner is not operating. The partner server might be operating and simply unable to communicate with this server, or might not be operating.
Each server responds to the full range of DHCP client messages that it receives, but in such a way that graceful reintegration is always possible when its partner comes back into contact with it.

5.6. Operating in PARTNER-DOWN state

When operating in PARTNER-DOWN state, a server assumes that its partner is not currently operating, but does make allowances for the possibility that that server was operating in the past. It responds to all DHCP client requests in PARTNER-DOWN state.

Any transactions that the partner server may have had with DHCP clients but been unable to communicate to this server are allowed for in the algorithms that are used to gradually take over full control of all of the addresses configured into the server.

5.7. Operating in RECOVER state

A server operating in RECOVER state assumes that it is reintegrating with a server that has been operating in PARTNER-DOWN state, and that it needs to update its bindings database before it services DHCP client requests.

A server may also operate in RECOVER state in order to fully recover its bindings database from its partner server.

6. Packet Formats

This section discusses the message format common to all failover messages, and then defines the options used in the failover protocol.

6.1. Common message format

All failover protocol messages are sent over the TCP connection between failover endpoints and encoded using a packet format specific to the failover protocol.

There exists a common message format for all failover messages, which utilizes the options in a way similar to the DHCP protocol. For each message type, some options are required and some are optional. In addition, when a message is received any options that are not understood by the receiving server MUST be ignored.

All of the fields in the fixed portion of the packet MUST be filled with correct data in every message sent.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       packet length (2)       | msg type (1)  |payload off (1)|
   +---------------+---------------+---------------+---------------+
   |                            xid (4)                            |
   +---------------------------------------------------------------+
   |          0 or more additional header bytes (variable)         |
   +---------------------------------------------------------------+
   |                    payload data (variable)                    |
   |                                                               |
   |                formatted as DHCP-style options                |
   |        using a unique option number space in the ?R6?         |
   |                 format defined by [NAMESPACE]                 |
   +---------------------------------------------------------------+

packet length - 2 bytes, network byte order

   This is the length of the packet. It includes the two byte packet length itself.

msg type - 1 byte

   The message type field is used to distinguish between messages. The following message types are defined:

      Value   Message Type
      -----   ------------
        0     reserved     not used
        1     POOLREQ      request allocation of addresses
        2     POOLRESP     respond with allocation count
        3     BNDUPD       update partner with binding info
        4     BNDACK       acknowledge receipt of binding update
        5     CONNECT      establish connection with partner
        6     CONNECTACK   respond to attempt to establish contact
                           with partner
        7     UPDREQALL    request full transfer of binding info
        8     UPDDONE      ack send and ack of req'd binding info
        9     UPDREQ       req transfer of un-acked binding info
       10     STATE        inform partner of current state or
                           state change
       11     CONTACT      probe communications integrity with
                           partner

   New message types should be defined in one of two ranges, 0-127 or 128-255. The range of 0-127 is used for messages that MUST be supported by every server, and if a server receives a message in the range of 0-127 that it doesn't understand, it MUST drop the TCP connection.
   The range of 128-255 is used for messages which MAY be supported but are not required, and if a server receives a message in this range that it does not understand it SHOULD ignore the message.

payload offset - 1 byte

   The byte offset of the Payload Data, from the beginning of the failover packet header. The value for the current protocol version is 8.

xid - 4 bytes, network byte order

   This is the transaction id of the failover packet. The sender of a failover protocol packet is responsible for setting this number, and the receiver of the packet copies the number over into any response packet, treating it as opaque data. The sender SHOULD ensure that every packet sent from a particular failover endpoint over the associated TCP connection has a unique transaction id unless that packet is a re-transmission.

payload data - variable length

   The options are placed after the header, after skipping payload offset bytes from the beginning of the packet. The payload data options are not preceded by a "cookie" value. The payload data is formatted as DHCP style options using the two byte option number and two byte option length format as specified in the recommendations of the DHCP panel in [NAMESPACE].

   The maximum length of the payload data in octets is 2048 less the size of the header, i.e., the maximum packet length is 2048 octets.

6.2. Common option format

The options contained in the payload data section of the failover packet all use the two byte option number and two byte length format as specified by the recommendations of the DHCP panel in [NAMESPACE]. The option numbers are drawn from an option number space unique to the failover protocol. All of the message types share a common option number space and common option definitions, though not all options are required or meaningful for every message.

In contrast to the options which appear in DHCP client and server packets, the options in failover messages are ordered. That is, for some messages the order in which the options appear in the payload data area is significant. The messages for which this is the case spell it out in detail.

All options which refer to time use an absolute time in GMT. Time synchronization has already been achieved between the source and the target server using the CONNECT message. All time fields in the options defined below use a time represented as seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t time value representation). Note that this is (at present) a signed field.

Additional options can be defined for intervendor or vendor specific use with limited difficulty due to the large number of option numbers available.

6.2.1. binding-status

This option is used to convey the current state of a binding.

       Code        Len       Type
   +-----+-----+------+-----+-----+
   |  0  |  1  |  0   |  1  | 1-9 |
   +-----+-----+------+-----+-----+

Legal values for this option are:

   Value   Binding Status
   -----   ------------------------------------------------
     1     FREE            Lease has never been used
     2     ACTIVE          Lease is assigned to a client
     3     EXPIRED         Lease has expired
     4     RELEASED        Lease has been released by client
     5     ABANDONED       A server or client flagged the address
                           as unusable
     6     RESET           Lease was freed by some external agent
     7     BACKUP          Lease belongs to secondary's private
                           address pool
     8     EXPIRED-GRACE   Lease will become available after this
                           period
     9     RELEASED-GRACE  Lease will become available after this
                           period

6.2.2. assigned-IP-address

The IP address to which this message refers.

       Code        Len              Address
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  2  |  0   |  4  | a1 |  a2 |  a3 |  a4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.3.
sending-server-IP-address The IP address of the server sending this message. Code Len Address +-----+-----+------+-----+----+-----+-----+-----+ | 0 | 3 | 0 | 4 | a1 | a2 | a3 | a4 | +-----+-----+------+-----+----+-----+-----+-----+ 6.2.4. addresses-transferred A 32 bit unsigned long in network byte order. Reports the number of addresses transferred by the primary to the secondary server (addresses to be used for the secondary server's private address pool) Code Len Number of Addresses +-----+-----+------+-----+----+-----+-----+-----+ | 0 | 4 | 0 | 4 | n1 | n2 | n3 | n4 | +-----+-----+------+-----+----+-----+-----+-----+ 6.2.5. client-identifier The format, code and conventions used are identical to DHCP option 61. Code Len Client Identifier +-----+-----+------+-----+----+-----+--- | 0 | 5 | 0 | n | i1 | i2 | ... +-----+-----+------+-----+----+-----+-- Droms, et. al. Expires December 1999 [Page 30] Internet Draft DHCP Failover Protocol June 1999 6.2.6. client-hardware-address The format is similar to DHCP option 61. Byte t1 (type) MUST be set to the proper ARP hardware address code, as defined in the ARP section of RFC 1700 (it MUST NOT be zero!) Code Len MAC address +-----+-----+------+-----+----+-----+-----+--- | 0 | 6 | 0 | n | t1 | m1 | m2 | ... +-----+-----+------+-----+----+-----+-----+--- Either Client Id, Client Hardware Address or BOTH MAY be present in binding update transactions. At least one of them MUST be present. If both are present, the Client Id MUST be used to uniquely identify the owner of the binding (exactly as in RFC 2131). 6.2.7. client-FQDN If an implementation supports Dynamic DNS updates, this option can be used to communicate the DNS name that was set. Uses the format of the Client FQDN option (81) as described in [DDNS] and extended to fit in the two byte code and length approach of the DHCP panel. Code Len Flags Rcode1 Rcode2 Domain Name +-----+-----+------+-----+-----+------+------+-----+------ | 0 | 7 | 0 | n | f | r1 | r2 | d1 | d2... 
   +-----+-----+------+-----+-----+------+------+-----+------

6.2.8. reject-reason

   This option is used to selectively reject binding updates.  It MAY be
   used in a BNDACK message, always associated with an
   assigned-IP-address option, which contains the IP address of the
   update being rejected.

      Code        Len        Reason Code
   +-----+-----+------+-----+----------+
   |  0  |  8  |  0   |  1  |    R1    |
   +-----+-----+------+-----+----------+

   Reason codes:

        0     Reserved
        1     Illegal IP address (not part of any address pool)
        2     Fatal conflict exists: address in use by other client
        3     Missing binding information
        4     Connection rejected, time mismatch too great
        5     Connection rejected, invalid MCLT
        6     Connection rejected, unknown reason
        7     Connection rejected, duplicate connection
        8     Connection rejected, invalid failover partner
        9     TLS not supported
       10     TLS supported but not configured
       11     TLS required but not supported by partner
       12     Message digest not supported
       13     Message digest not configured
       14     Protocol version mismatch
       15     Missing binding information
       16     Outdated binding information
       17     Less critical binding information
       18-253 Reserved
      254     Unknown: Error occurred but does not match any reason code
      255     Reserved for code expansion

6.2.9. message

   This option is used to supply a human readable message.  It may be
   used in association with the Reject Reason Code to provide a human
   readable error message for the reject.

      Code        Len         Text
   +-----+-----+------+-----+------+-----+--
   |  0  |  9  |  0   |  n  |  c1  | c2  | ...
   +-----+-----+------+-----+------+-----+--

6.2.10. MCLT

   Maximum Client Lead Time, in seconds.  A 32-bit integer value, in
   network byte order.

      Code        Len         Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  | 10  |  0   |  4  | t1 | t2  | t3  | t4  |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.11.
vendor-class-identifier

   A string which identifies the vendor of the failover protocol
   implementation.  The code for this option is 60, and its minimum
   length is 1.

      Code        Len         vendor class string
   +-----+-----+------+-----+----+-----+---
   |  0  | 11  |  0   |  n  | c1 | c2  | ...
   +-----+-----+------+-----+----+-----+---

6.2.12. current-time

   The current time, expressed as an absolute time in GMT and represented
   as seconds elapsed since Jan 1, 1970 (i.e., ANSI C time_t time value
   representation).

      Code        Len         Current Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  | 12  |  0   |  4  | t1 | t2  | t3  | t4  |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.13. lease-expiration-time

   The lease expiration time, expressed as an absolute time in GMT and
   represented as seconds elapsed since Jan 1, 1970 (i.e., ANSI C time_t
   time value representation).

   The lease expiration time is the time that a server has ACKed to a
   DHCP client.

      Code        Len         Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  | 13  |  0   |  4  | t1 | t2  | t3  | t4  |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.14. potential-expiration-time

   The potential expiration time, expressed as an absolute time in GMT
   and represented as seconds elapsed since Jan 1, 1970 (i.e., ANSI C
   time_t time value representation).

   The potential expiration time is the time that one server tells
   another server that it may ACK to a client.

      Code        Len         Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  | 14  |  0   |  4  | t1 | t2  | t3  | t4  |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.15. grace-expiration-time

   The grace expiration time, expressed as an absolute time in GMT and
   represented as seconds elapsed since Jan 1, 1970 (i.e., ANSI C time_t
   time value representation).

   The grace expiration time is the time that a grace period will expire.

      Code        Len         Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  | 15  |  0   |  4  | t1 | t2  | t3  | t4  |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.16. client-last-transaction-time

   The time at which this server last received a DHCP request from a
   particular client, expressed as an absolute time in GMT and
   represented as seconds elapsed since Jan 1, 1970 (i.e., ANSI C time_t
   time value representation).

      Code        Len         Client Last Transaction Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  | 16  |  0   |  4  | t1 | t2  | t3  | t4  |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.17. start-time-of-state

   The time at which the state contained in this message began,
   expressed as an absolute time in GMT and represented as seconds
   elapsed since Jan 1, 1970 (i.e., ANSI C time_t time value
   representation).

   This option is used for different states in different messages.  In a
   BNDUPD message it represents the start time of the state of the lease
   in the BNDUPD message.  In a STATE message, it represents the start
   time of the partner server's failover state.

      Code        Len         Start Time of State
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  | 17  |  0   |  4  | t1 | t2  | t3  | t4  |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.18. server-state

   This option is used to convey the current state of the failover
   endpoint in the sending server.

      Code        Len        Server State
   +-----+-----+------+-----+-----+
   |  0  | 18  |  0   |  1  | 1-9 |
   +-----+-----+------+-----+-----+

   Legal values for this option are:

   Value  Server State
   -----  -------------------------------------------------------------
     0    reserved
     1    STARTUP                      Startup state (1)
     2    NORMAL                       Normal state
     3    COMMUNICATIONS-INTERRUPTED   Communication interrupted (safe)
     4    PARTNER-DOWN                 Partner down (unsafe mode)
     5    POTENTIAL-CONFLICT           Synchronizing
     6    RECOVER                      Recovering bindings from partner
     7    PAUSED                       Shutting down for a short period.
     8    SHUTDOWN                     Shutting down for an extended period.
     9    RECOVER-DONE                 Interlock state prior to NORMAL

6.2.19. server-flags

   This option is used to convey the current flags of the failover
   endpoint in the sending server.

      Code        Len        Server Flags
   +-----+-----+------+-----+-------+
   |  0  | 19  |  0   |  1  | flags |
   +-----+-----+------+-----+-------+

   Legal values for this option are:

   Currently, bit 5 is defined.  All other bits are reserved, and must be
   set to 0.

   o STARTUP

      Bit 5 is the STARTUP flag.  Bit 5 MUST be set to 1 whenever the
      server is in STARTUP state, and set to 0 otherwise.  (Note that
      when in STARTUP state, the state transmitted in the server-state
      option is usually the last recorded state from stable storage, but
      see section 9.3 for details.)

6.2.20. vendor-specific-options

   This option is used to convey options specific to a particular
   vendor's implementation.  The vendor class identifier is used to
   specify which option space the embedded options are drawn from.  It
   functions similarly to the vendor class identifier and vendor
   specific options in the DHCP protocol.

   This option contains other options in the same two-byte code,
   two-byte length format.

   If this option appears in a message without a corresponding vendor
   class identifier, it MUST be ignored.

      Code        Len         Embedded options
   +-----+-----+------+-----+----+-----+---
   |  0  | 20  |  0   |  n  | c1 | c2  | ...
   +-----+-----+------+-----+----+-----+---

6.2.21. max-unacked-bndupd

   The maximum number of BNDUPD messages that this server is prepared to
   accept over the TCP connection without causing the TCP connection to
   block.

      Code        Len         Maximum Unacked BNDUPD
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  | 21  |  0   |  4  | n1 | n2  | n3  | n4  |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.22. server-role

   This option is used to convey the role of the failover endpoint in the
   sending server.

      Code        Len        Role
   +-----+-----+------+-----+-------+
   |  0  | 22  |  0   |  1  |  r1   |
   +-----+-----+------+-----+-------+

   A value of 0 indicates that the failover endpoint is a primary server
   and a value of 1 indicates that it is a secondary server.

6.2.23. receive-timer

   The number of seconds within which the server must receive a packet
   from its partner, or it will assume that the partner is down or the
   communication path to the partner has failed.

      Code        Len         Receive Timer
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  | 23  |  0   |  4  | s1 | s2  | s3  | s4  |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.24. hash-bucket-assignment

   The set of hash values to which the receiving server MUST respond.
   See section 5.3 for more information on how this option is used.

   This option consists of a set of 32 bytes, in network byte order,
   where each bit corresponds to one of 256 possible hash bucket values.
   If a bit is set to 1, the recipient is required to service the
   requests whose client-identifier or htype concatenated with the chaddr
   (if no client-identifier exists) map into the corresponding hash
   bucket.

      Code        Len         Hash Buckets
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  | 24  |  0   | 32  | b1 | b2  | ... | b32 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.25. message-digest

   The message digest for this message.

   This option consists of a variable number of bytes which contain the
   message digest of the message prior to the inclusion of this option.
   When this option appears in a message, it MUST appear as the last
   option in the message.

      Code        Len         Message Digest
   +-----+-----+------+-----+----+-----+-----
   |  0  | 25  |  0   |  n  | d1 | d2  | ...
   +-----+-----+------+-----+----+-----+-----

6.2.26. protocol-version

   The protocol version being used by the server.  It is only sent in the
   CONNECT and CONNECTACK messages.

      Code        Len        Version
   +-----+-----+------+-----+----+
   |  0  | 26  |  0   |  1  | v1 |
   +-----+-----+------+-----+----+

6.2.27. TLS-request

   This option contains information relating to TLS security
   negotiation.  It is sent in a CONNECT message.

   The first byte, req, is the TLS request from this server.  A value of
   0 indicates no TLS operation, a value of 1 indicates that TLS
   operation is desired, and a value of 2 indicates that TLS operation is
   required to establish communications with this server.

   The second byte, acc, is what this server will accept for TLS
   operation.  A value of 0 means that this server will not accept TLS
   connections.  A value of 1 means that this server will accept TLS
   connections.  If req is not zero, then acc MUST be 1.

   This allows, for instance, a server which is not configured for TLS
   support to inform its partner that it will accept a TLS connection
   although it does not desire one.

      Code        Len        request accept
   +-----+-----+------+-----+----+----+
   |  0  | 27  |  0   |  2  | req| acc|
   +-----+-----+------+-----+----+----+

6.2.28. TLS-reply

   This option contains information relating to TLS security
   negotiation.  It is sent in a CONNECTACK message.

   A value of 0 indicates no TLS operation; a value of 1 indicates that
   TLS operation is required.

      Code        Len        TLS
   +-----+-----+------+-----+----+
   |  0  | 28  |  0   |  1  | t1 |
   +-----+-----+------+-----+----+

6.3. BNDUPD message format

   The binding update (BNDUPD) message is used to send the binding
   database changes to the partner server.

   The message type for the BNDUPD message is 3.

   The xid of the BNDUPD MUST be unique with respect to other failover
   messages transmitted from this failover endpoint.

   The following table summarizes the various options for the BNDUPD
   message.

                                      binding-status

   Option                       ACTIVE    EXPIRED   RELEASED  FREE
   ------                       ------    -------   --------  ----
   assigned-IP-address          MUST      MUST      MUST      MUST
   binding-status               MUST      MUST      MUST      MUST
   client-identifier            MAY       MAY       MAY       MAY
   client-hardware-address      MUST      MUST      MUST      MAY
   lease-expiration-time        MUST      MUST NOT  MUST NOT  MUST NOT
   potential-expiration-time    MUST      MUST NOT  MUST NOT  MUST NOT
   grace-expiration-time        MUST NOT  MUST NOT  MUST NOT  MUST NOT
   start-time-of-state          SHOULD    SHOULD    SHOULD    SHOULD
   client-last-trans.-time      SHOULD    SHOULD    SHOULD    MAY
   client-FQDN(1)               SHOULD    SHOULD    SHOULD    SHOULD
   all others                   MAY       MAY       MAY       MAY

                                      binding-status
                                                      BACKUP
                                EXPIRED-  RELEASED-   RESET
   Option                       GRACE     GRACE       ABANDONED
   ------                       ------    -----       ---------
   assigned-IP-address          MUST      MUST        MUST
   binding-status               MUST      MUST        MUST
   client-identifier            MAY       MAY         MAY(2)
   client-hardware-address      MAY       MAY         MAY(2)
   lease-expiration-time        MUST NOT  MUST NOT    MUST NOT
   potential-expiration-time    MUST NOT  MUST NOT    MUST NOT
   grace-expiration-time        MUST      MUST        MUST NOT
   start-time-of-state          SHOULD    SHOULD      SHOULD
   client-last-trans.-time      SHOULD    SHOULD      MAY
   client-FQDN(1)               SHOULD    SHOULD      SHOULD
   all others                   MAY       MAY         MAY

   (1) Only SHOULD appear if client supplies a host name and dynamic DNS
       is used.
   (2) MUST NOT if binding-status is ABANDONED.

            Table 6.3-1: Options used in a BNDUPD message

6.4. BNDACK message format

   A server sends a binding acknowledgement (BNDACK) message when it has
   successfully committed binding database changes received from a
   failover partner in a BNDUPD message to its own stable storage.

   The message type for the BNDACK message is 4.

   The xid in a BNDACK MUST be the same as the xid of the corresponding
   BNDUPD.

   The following table summarizes the options for the BNDACK message.

                                      binding-status

   Option                       ACTIVE    EXPIRED   RELEASED  FREE
   ------                       ------    -------   --------  ----
   assigned-IP-address          MUST      MUST      MUST      MUST
   binding-status               MUST      MUST      MUST      MUST
   client-identifier            MAY       MAY       MAY       MAY
   client-hardware-address      MUST      MUST      MUST      MAY
   reject-reason                MAY       MAY       MAY       MAY
   message                      MAY       MAY       MAY       MAY
   lease-expiration-time        MUST      MUST NOT  MUST NOT  MUST NOT
   potential-expiration-time    MUST      MUST NOT  MUST NOT  MUST NOT
   grace-expiration-time        MUST NOT  MUST NOT  MUST NOT  MUST NOT
   start-time-of-state          SHOULD    SHOULD    SHOULD    SHOULD
   client-last-trans.-time      SHOULD    SHOULD    SHOULD    MAY
   client-FQDN(1)               SHOULD    SHOULD    SHOULD    SHOULD
   all others                   MAY       MAY       MAY       MAY

                                      binding-status
                                                      BACKUP
                                EXPIRED-  RELEASED-   RESET
   Option                       GRACE     GRACE       ABANDONED
   ------                       ------    -----       ---------
   assigned-IP-address          MUST      MUST        MUST
   binding-status               MUST      MUST        MUST
   client-identifier            MAY       MAY         MAY
   client-hardware-address      MAY       MAY         MAY(2)
   reject-reason                MAY       MAY         MAY
   message                      MAY       MAY         MAY
   lease-expiration-time        MUST NOT  MUST NOT    MUST NOT
   potential-expiration-time    MUST NOT  MUST NOT    MUST NOT
   grace-expiration-time        MUST      MUST        MUST NOT
   start-time-of-state          SHOULD    SHOULD      SHOULD
   client-last-trans.-time      SHOULD    SHOULD      MAY
   client-FQDN(1)               SHOULD    SHOULD      SHOULD
   all others                   MAY       MAY         MAY

   (1) Only SHOULD appear if client supplies a host name and dynamic DNS
       is used.
   (2) MUST NOT if binding-status is ABANDONED.

            Table 6.4-1: Options used in a BNDACK message

6.5. Bulking for BNDUPD and BNDACK messages

   DISCUSSION:

      Bulking is planned for this protocol, but it hasn't been specified
      in this revision of the draft.  Once the draft settles down, we
      will specify the bulking approach in detail.

6.6. UPDREQ message format

   The update request (UPDREQ) message is used by one server to request
   that its partner send it all binding database information that it has
   not already seen.

   The message type for the UPDREQ message is 9.

   The xid in a UPDREQ message MUST be unique among messages transmitted
   from this failover endpoint during the life of this connection.

   There are no options that MUST appear in an UPDREQ message.  Any
   option MAY appear.

6.7. UPDREQALL message format

   The update request all (UPDREQALL) message is used by one server to
   request that all binding database information be sent in order to
   recover from a total loss of its lease state database by the
   requesting server.

   The message type for the UPDREQALL message is 7.

   The xid in a UPDREQALL message MUST be unique among messages
   transmitted from this failover endpoint during the life of this
   connection.

   There are no options that MUST appear in an UPDREQALL message.  Any
   option MAY appear.

6.8. UPDDONE message format

   The update done (UPDDONE) message is used by the responding server to
   indicate that all requested updates have been sent by the responding
   server as BNDUPD messages and acked by the requesting server using
   BNDACK messages.  While a BNDACK message MUST have been received for
   each IP address that was sent in a BNDUPD message, the BNDACK message
   could have contained a reject-reason in order to NAK that specific
   update.  Thus, this message confirms that the requesting server has
   received and responded to a BNDUPD message for all of the requested
   updates, but it does not require the requesting server to accept all
   of the offered updates.

   The message type for the UPDDONE message is 8.

   The xid in an UPDDONE message MUST be identical to the xid in the
   UPDREQ or UPDREQALL message that initiated the update process.

   There are no options that MUST appear in an UPDDONE message.  Any
   option MAY appear.

6.9. POOLREQ message format

   The pool request (POOLREQ) is used by the secondary server to request
   an allocation of IP addresses from the primary server.

   The message type for the POOLREQ message is 1.

   The xid in a POOLREQ message MUST be unique among messages transmitted
   from this failover endpoint during the life of this connection.

   There are no options that MUST appear in a POOLREQ message.  Any
   option MAY appear.

6.10. POOLRESP message format

   The pool response (POOLRESP) is used by the primary server to inform
   the secondary server how many IP addresses it was allocated as the
   result of a pool request.

   The message type for the POOLRESP message is 2.

   The xid in the POOLRESP message MUST be identical to the xid in the
   POOLREQ message for which this POOLRESP is a response.

   The following table shows the options that MUST appear in a POOLRESP
   message:

   Option
   ------
   addresses-transferred        MUST

            Table 6.10-1: Options used in a POOLRESP message

6.11. CONNECT message format

   The connect (CONNECT) message is used by either server to establish a
   high level connection with the other server, and to transmit several
   important configuration data items between the servers.

   The message type for the CONNECT message is 5.

   The xid in a CONNECT message MUST be unique among messages transmitted
   from this failover endpoint during the life of this connection.

   The CONNECT message MUST be the first message sent down a newly
   established connection.

   The following table summarizes the options that are associated with
   the CONNECT message:

                                       role
   Option                       primary   secondary
   ------                       -------   ---------
   sending-server-IP-address    MUST      MUST
   server-role                  MUST      MUST
   max-unacked-bndupd           MUST      MUST
   receive-timer                MUST      MUST
   current-time                 MUST      MUST
   vendor-class-identifier      MUST      MUST
   protocol-version             MUST      MUST
   TLS-request                  MUST(1)   MUST(1)
   MCLT                         MUST      MUST NOT
   hash-bucket-assignment       MUST      MUST NOT
   all others                   MAY       MAY

   (1) If the CONNECT message is being sent on a TLS secured connection,
       then there MUST NOT be a TLS-request option.

            Table 6.11-1: Options used in a CONNECT message

6.12. CONNECTACK message format

   The connect response (CONNECTACK) message is used by a server to
   respond to the receipt of a CONNECT message.

   The message type for the CONNECTACK message is 6.

   The xid in the CONNECTACK message MUST be identical to the xid in the
   CONNECT message for which this CONNECTACK is a response.

   The following table summarizes the options associated with the
   CONNECTACK message:

   Option
   ------
   sending-server-IP-address    MUST
   server-role                  MUST
   max-unacked-bndupd           MUST
   receive-timer                MUST
   current-time                 MUST
   vendor-class-identifier      MUST
   protocol-version             MUST
   TLS-reply                    MUST(1)
   reject-reason                MAY(2)
   message                      MAY

   (1) If the CONNECTACK is being sent over an already TLS secured
       connection, then the TLS-reply option MUST NOT appear.
   (2) Indicates a rejection of the CONNECT message.

            Table 6.12-1: Options used in a CONNECTACK message

6.13. STATE message format

   The state (STATE) message is used by either server to communicate the
   current state of the failover endpoint with the other server.  It MUST
   be sent immediately after a connection is established with another
   server, and it MUST be sent whenever the server's state changes.

   The message type for the STATE message is 10.

   The xid in a STATE message MUST be unique among messages transmitted
   from this failover endpoint during the life of this connection.

   The following table shows the options that MUST appear in a STATE
   message:

   Option
   ------
   server-state                 MUST
   server-flags                 MUST
   start-time-of-state          MUST

            Table 6.13-1: Options used in a STATE message

6.14. CONTACT message format

   The contact (CONTACT) message is used by either server to verify that
   the connection is operational to the other server.

   The message type for the CONTACT message is 11.

   The xid in a CONTACT message MUST be unique among messages transmitted
   from this failover endpoint during the life of this connection.

   The following table shows the options that MUST appear in a CONTACT
   message:

   Option
   ------
   current-time                 MUST

            Table 6.14-1: Options used in a CONTACT message

7. Protocol Messages

   This section contains the detailed definition of the protocol
   messages, including the information to include when sending the
   message, as well as the actions to take upon receiving the message.

7.1. BNDUPD message

   The binding update (BNDUPD) message is used to send the binding
   database changes to the partner server, and the partner server
   responds with a binding acknowledgement (BNDACK) message when it has
   successfully committed those changes to its own stable storage.

   The rest of the failover protocol exists to determine whether the
   partner server is able to communicate or not, and to enable the
   partners to exchange BNDUPD/BNDACK messages in order to keep their
   binding databases in stable storage synchronized.

7.1.1. Sending the BNDUPD message

   A BNDUPD message SHOULD be generated whenever any binding changes.  A
   change might be in the binding-status, the lease-expiration-time, or
   even just the last-transaction-time.  In general, any time a DHCP
   client sends in a packet that results in a DHCP server writing to its
   stable storage, a BNDUPD message SHOULD be generated.

   The BNDUPD (and BNDACK) messages refer to the binding-status of the IP
   address, and this protocol defines a series of binding-statuses,
   discussed in more detail below.  Some servers may not support all of
   these binding-statuses; in those cases they will not be sent, and upon
   receipt a reasonable interpretation should be made.

   All BNDUPD messages MUST contain the assigned-IP-address option, which
   carries the IP address about which the BNDUPD message is being sent.

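   The options carried in a BNDUPD use the two-byte code, two-byte
   length layout shown in section 6.2.  The following Python sketch
   illustrates that encoding for two of the options defined there:
   client-hardware-address (code 6) and a 4-byte time option such as
   lease-expiration-time (code 13).  The function names are hypothetical
   and chosen only for illustration; the codes for assigned-IP-address
   and binding-status are defined elsewhere in the draft and are not
   assumed here.

```python
import struct

# Option codes as given in section 6.2.
CLIENT_HARDWARE_ADDRESS = 6
LEASE_EXPIRATION_TIME = 13


def encode_option(code: int, payload: bytes) -> bytes:
    """Prefix payload with a two-byte code and a two-byte length (network order)."""
    return struct.pack("!HH", code, len(payload)) + payload


def encode_time_option(code: int, seconds_since_epoch: int) -> bytes:
    """Encode an absolute time (seconds since Jan 1, 1970 GMT) as a 32-bit value."""
    return encode_option(code, struct.pack("!I", seconds_since_epoch))


def encode_client_hardware_address(htype: int, mac: bytes) -> bytes:
    """Encode the ARP hardware type byte followed by the MAC address."""
    if htype == 0:
        raise ValueError("hardware type MUST NOT be zero")
    return encode_option(CLIENT_HARDWARE_ADDRESS, bytes([htype]) + mac)


def decode_options(data: bytes) -> dict:
    """Split a byte string of consecutive options into {code: payload}."""
    options = {}
    offset = 0
    while offset < len(data):
        code, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        options[code] = data[offset:offset + length]
        offset += length
    return options
```

   A receiver could apply decode_options() to the option area of a
   message and then look up each required option by its code.
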
   All BNDUPD messages MUST contain the binding-status option, and it
   will have one of the values in the following list.  This list
   discusses the meanings of the various binding-statuses and the
   information that should go into the BNDUPD message because of them.

   o ACTIVE

      Indicates that the IP address is currently leased to a DHCP client.

      client-hardware-address

         The client-hardware-address option MUST appear, and be set from
         the MAC address of the DHCP client to which this IP address is
         leased.

      client-identifier

         If the DHCP client to which this IP address is leased used a
         client-identifier option to identify itself, then the
         client-identifier MUST appear in the BNDUPD message, else it
         MUST NOT appear.

      lease-expiration-time

         The lease-expiration-time option MUST appear, and be set to the
         expiration time most recently ACKed to the DHCP client.  Note
         that the time ACKed to a DHCP client is a lease duration in
         seconds, while the lease-expiration-time option in a BNDUPD
         message is an absolute time value.

      potential-expiration-time

         The potential-expiration-time option MUST appear, and be set to
         a value beyond that of the lease-expiration-time.  This is the
         value that is ACKed by the BNDACK message.  A server sending a
         BNDUPD message MUST be able to recover the
         potential-expiration-time sent in every BNDUPD, not just those
         that receive a corresponding BNDACK, in order to be able to
         protect against possible duplicate allocation of IP addresses
         after transitioning to PARTNER-DOWN state.

         See section 5.2.1 for details as to why the
         potential-expiration-time exists and guidelines for how to
         decide the value.

   o EXPIRED

      A binding-status of EXPIRED is used when a client's binding on an
      IP address has expired and the server does not wish to implement an
      expired-grace period.  When the partner server ACKs the BNDUPD of
      an EXPIRED IP address, the server sets its internal state to FREE.
      It is then available for allocation to any client of the primary
      server.

      client-hardware-address

         There SHOULD be a DHCP client associated with the IP address
         whose binding has expired.  If there is, then the
         client-hardware-address option MUST appear, and be set from the
         MAC address of the DHCP client to which this IP address was
         leased.

      client-identifier

         There SHOULD be a DHCP client associated with the IP address
         whose binding has expired.  If there is, then if the DHCP client
         to which this IP address was leased used a client-identifier
         option to identify itself, then the client-identifier MUST
         appear in the BNDUPD message, else it MUST NOT appear.

   o RELEASED

      A binding-status of RELEASED is used when a DHCP client sends in a
      DHCPRELEASE message and the server does not wish to implement a
      released-grace period.  When the partner server ACKs the BNDUPD of
      a RELEASED IP address, the server sets its internal state to FREE,
      and it is available for allocation by the primary server to any
      DHCP client.

      client-hardware-address

         There SHOULD be a DHCP client associated with the IP address
         whose binding has been released.  If there is, then the
         client-hardware-address option MUST appear, and be set from the
         MAC address of the DHCP client which released this IP address.

      client-identifier

         There SHOULD be a DHCP client associated with the IP address
         whose binding has been released.  If there is, then if the DHCP
         client which released this IP address used a client-identifier
         option to identify itself, then the client-identifier MUST
         appear in the BNDUPD message, else it MUST NOT appear.

   o FREE

      A binding-status of FREE is used when a DHCP server needs to
      communicate to another server that an IP address is available for
      allocation, but it was not just released, expired, or reset by a
      network administrator.
      When the partner server ACKs the BNDUPD of a FREE IP address, the
      server sets its internal state such that it is available for
      allocation to any DHCP client.

      client-hardware-address

         There MAY be a DHCP client associated with the IP address whose
         binding is now desired to be FREE.  If there is, then the
         client-hardware-address option MUST appear, and be set from the
         MAC address of the DHCP client which released this IP address.

      client-identifier

         There MAY be a DHCP client associated with the IP address whose
         binding is now desired to be FREE.  If there is, then if the
         DHCP client which released this IP address used a
         client-identifier option to identify itself, then the
         client-identifier MUST appear in the BNDUPD message, else it
         MUST NOT appear.

   o EXPIRED-GRACE

      Some servers support a grace period after lease expiration, to
      handle clock speed differences between clients and servers as well
      as to limit the number of times names are removed and subsequently
      added to dynamic DNS.

      client-hardware-address

         There MAY be a DHCP client associated with the IP address whose
         binding has now expired.  If there is, then the
         client-hardware-address option MUST appear, and be set from the
         MAC address of the DHCP client which released this IP address.

      client-identifier

         There MAY be a DHCP client associated with the IP address whose
         binding has now expired.  If there is, then if the DHCP client
         which most recently leased this IP address used a
         client-identifier option to identify itself, then the
         client-identifier MUST appear in the BNDUPD message, else it
         MUST NOT appear.

      grace-expiration-time

         The grace-expiration-time option MUST appear, and is the length
         of time that this server will wait before trying to make the IP
         address available after the lease has expired for this IP
         address.

   o RELEASED-GRACE

      Some servers support a grace period after lease release by a DHCP
      client, to handle clock speed differences between clients and
      servers as well as to limit the number of times names are removed
      and subsequently added to dynamic DNS.

      client-hardware-address

         There MAY be a DHCP client associated with the IP address whose
         binding has now been released by sending a DHCPRELEASE.  If
         there is, then the client-hardware-address option MUST appear,
         and be set from the MAC address of the DHCP client which
         released this IP address.

      client-identifier

         There MAY be a DHCP client associated with the IP address whose
         binding has been released.  If there is, then if the DHCP client
         which most recently leased this IP address used a
         client-identifier option to identify itself, then the
         client-identifier MUST appear in the BNDUPD message, else it
         MUST NOT appear.

      client-hardware-address

         There MAY be a DHCP client associated with the IP address whose
         binding is now desired to be FREE.  If there is, then the
         client-hardware-address option MUST appear, and be set from the
         MAC address of the DHCP client which released this IP address.

      client-identifier

         There MAY be a DHCP client associated with the IP address whose
         binding is now desired to be FREE.  If there is, then if the
         DHCP client which released this IP address used a
         client-identifier option to identify itself, then the
         client-identifier MUST appear in the BNDUPD message, else it
         MUST NOT appear.

      grace-expiration-time

         The grace-expiration-time MUST appear, and is the length of time
         that this server will wait before trying to make the IP address
         available after the lease was released for this IP address.

   o ABANDONED

      An ABANDONED IP address is one that has been considered unusable by
      the DHCP subsystem.  An IP address for which a valid PING response
      was received SHOULD be set to ABANDONED.

      client-hardware-address

         There SHOULD NOT be a DHCP client associated with an ABANDONED
         IP address.  The client-hardware-address option MUST NOT appear
         in the BNDUPD message.

      client-identifier

         There SHOULD NOT be a DHCP client associated with the IP address
         whose binding has now been ABANDONED.  The client-identifier
         option MUST NOT appear in the BNDUPD message.

   o RESET

      The RESET value of the binding-status is used to indicate that this
      IP address was made available by operator command.

   o BACKUP

      The BACKUP value of binding-status indicates that this IP address
      belongs to the secondary server, and can be allocated by that
      server to a DHCP client at any time.

      client-hardware-address

         There MAY be a DHCP client associated with a BACKUP IP address.
         If there is, the client-hardware-address option MUST appear, and
         be set from the MAC address of the DHCP client to which this IP
         address was most recently associated.

      client-identifier

         There MAY be a DHCP client associated with this IP address.  If
         the DHCP client to which this IP address is leased used a
         client-identifier option to identify itself, then the
         client-identifier MUST appear in the BNDUPD message, else it
         MUST NOT appear.

   The following option information is generic to all BNDUPD messages,
   regardless of the value of the binding-status.

   o start-time-of-state

      The start-time-of-state SHOULD appear.  It is set to the time at
      which this IP address first took on the state that corresponds to
      the current value of binding-status.

   o last-transaction-time

      The last-transaction-time value SHOULD appear.  This is the time at
      which this DHCP server last received a packet from the DHCP client
      referenced by the client-identifier or client-hardware-address that
      was associated with the IP address referenced by the
      assigned-IP-address.

o client-FQDN

   If the DHCP server is performing dynamic DNS operations on behalf of the DHCP client represented by the client-identifier or client-hardware-address, then it should include a client-FQDN option containing the host name, domain name, and status of any dynamic DNS operations enabled.

The BNDUPD message SHOULD be sent as soon as possible after the DHCP client has received a response and the lease bindings database has been written to stable storage.

7.1.2. Receiving the BNDUPD message

When a server receives a BNDUPD message, it needs to decide how to process the message and whether the message represents a conflict of any sort. The conflict resolution process is used on the receipt of every BNDUPD message, not just those that are received while in POTENTIAL-CONFLICT state, in order to increase the robustness of the protocol.

There are two sorts of conflict. The first, more serious conflict occurs when a server receives a BNDUPD message from its partner for an ACTIVE IP address and finds that the client specified in the BNDUPD message is different from the client associated with this ACTIVE IP address in this server's bindings database.

The second sort of conflict occurs when the receiving server has in its bindings database the client specified in the BNDUPD message associated with a different IP address.

These two conflict cases can both occur together with the same BNDUPD message.

When receiving a BNDUPD message, the server first determines the IP address from the assigned-IP-address option, and then determines if there was any client associated with this IP address by looking for the client-identifier option. If there is no client-identifier option, then the server looks for a client-hardware-address option, and ultimately determines the client's identity specified in the BNDUPD.
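The lookup order in the preceding paragraph can be sketched as follows. This is only an illustration under stated assumptions: the BndUpd class and the client_identity function are hypothetical names, and the message is modeled as already-decoded option values rather than the wire format defined by this protocol.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BndUpd:
    """Hypothetical, simplified view of the options in a received BNDUPD."""
    assigned_ip_address: str
    client_identifier: Optional[bytes] = None
    client_hardware_address: Optional[bytes] = None


def client_identity(msg: BndUpd) -> Optional[Tuple[str, bytes]]:
    """Return a (kind, value) key for the client named in the BNDUPD."""
    # Prefer the client-identifier option when it is present.
    if msg.client_identifier is not None:
        return ("client-identifier", msg.client_identifier)
    # Otherwise fall back to the client-hardware-address option.
    if msg.client_hardware_address is not None:
        return ("client-hardware-address", msg.client_hardware_address)
    # Neither option is present: the BNDUPD names no client.
    return None
```

The returned key could then be compared with the key stored for the same IP address in the receiving server's bindings database.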
The client specified in the BNDUPD message is compared to the client currently associated with the IP address in this server's bindings database. If they are the same, continue. If there is no client in this server's bindings database, continue. If there is a client in this server's bindings database, and it is different from that specified in the BNDUPD message, a 'client conflict' exists. See the section below on conflict resolution.

If the client specified in the BNDUPD message is associated with a different IP address in this server's bindings database in the same subnet, then an 'IP address conflict' exists. This does not refer to the case where a single client has addresses in multiple different subnets or administrative domains, but rather the case where in the same subnet the client has a lease on one IP address in one server and on a different IP address on the other server. See the section below on conflict resolution.

If none of the conflicts mentioned above exist, then develop a time for both the BNDUPD message and the server's information.

The times for the BNDUPD and for the server's information are developed independently in the following way: If there is a client-last-transaction time, use that. If there isn't, but there is a start-time-of-state, use that. If there isn't, but there is a client-expiration-time, use that. If there isn't, then use the time the BNDUPD message was received for a BNDUPD message, and the current time for the server's information.

Then the server determines the binding-status in the BNDUPD, and takes the following actions based on binding-status:

(In the following list, to "accept" a BNDUPD means to update the server's bindings database with the information contained in the BNDUPD and, once that update is complete, send a BNDACK message corresponding to the BNDUPD message.)
o ACTIVE in BNDUPD

   If the BNDUPD is LATER than the server's information, accept it, else reject it.

o EXPIRED or EXPIRED-GRACE in BNDUPD

   If the binding-status in the receiving server's bindings database is ACTIVE, then reject the BNDUPD. Otherwise, accept the BNDUPD.

   If the binding-status in the BNDUPD is EXPIRED-GRACE and the server receiving the BNDUPD does not implement a grace period for expired leases, then the server MUST set its lease expiration to the value held in the grace-expiration in the BNDUPD.

o RELEASED or RELEASED-GRACE in BNDUPD

   If the BNDUPD is LATER than the server's information, accept it, else reject it.

   If the binding-status in the BNDUPD is RELEASED-GRACE and the server receiving the BNDUPD does not implement a grace period for released leases, then the server MUST set its lease expiration to the value held in the grace-expiration in the BNDUPD.

o FREE or BACKUP in BNDUPD

   If the binding-status in the receiving server's database is ACTIVE and the lease-expiration-time has not yet been reached, reject it, else accept it.

o RESET or ABANDONED in BNDUPD

   Accept it under all circumstances.

7.1.3. Conflict resolution when receiving the BNDUPD message

When either of the following conflicts exists between the information in a BNDUPD message and the information held in the receiving server's bindings database, it should be resolved in the following manner:

o client conflict

   This is the duplicate IP address allocation conflict. There are two different clients each allocated the same address. If times for both exist, use the LATER update, else use the information from the primary server.

o IP address conflict

   An IP address conflict exists when a client on one server is associated with one IP address, and on the other server with a different IP address in the same or a related subnet.
   If one binding-status is ACTIVE and the other is anything but ACTIVE, then the information in the ACTIVE binding SHOULD be used. Otherwise, if times exist, then the LATER SHOULD be used. Otherwise, if times do not exist, then the information from the primary server should be used.

7.2. BNDACK message

Every BNDUPD message that is received by a server MUST be responded to with a corresponding BNDACK message. The receiving server SHOULD respond quickly to every BNDUPD message but it MAY choose to respond preferentially to DHCP client requests instead of BNDUPD messages, since there is no absolute time period within which a BNDACK must be sent in response to a BNDUPD message, and DHCP clients frequently do have time constraints that must be met.

7.2.1. Sending the BNDACK message

The BNDACK message MUST contain the same xid as the corresponding BNDUPD message.

All of the options which appear in the BNDUPD message MUST be included in the BNDACK message. The values in the options MAY be updated to reflect current information on the server sending the BNDACK. Note that update of this information may be used for informational purposes, but MUST NOT be assumed to necessarily be recorded in the stable storage of the server who sent the BNDUPD message because there is no corresponding ACK of the BNDACK message. Any information that SHOULD be recorded in the partner server's stable storage MUST be transmitted in a subsequent BNDUPD.

If the server is accepting the BNDUPD, the BNDACK message includes only those options that appear in the BNDUPD message.

If the server is rejecting the BNDUPD, the additional option reject-reason MUST appear in the BNDACK message, and the message option SHOULD appear in this case, containing a human-readable error message describing in some detail the reason for the rejection of the BNDUPD message.

7.2.2.
Receiving the BNDACK message

When a server receives a BNDACK message, if it doesn't contain a reject-reason option that means that the BNDUPD message was accepted, and the server which sent the BNDUPD MUST update its stable storage with the potential-expiration-time value sent in the BNDUPD message and returned in the BNDACK message. Other values sent in the BNDUPD message MAY be used as desired.

7.3. UPDREQ message

The update request (UPDREQ) message is used by one server to request that its partner send it all of the binding database information that it has not already seen. Since each server is required to keep track at all times of the binding information the other server has received and ACKed, one server can request transmission of all un-ACKed binding database information held by the other server by using the UPDREQ message.

The UPDREQ message is used whenever the sending server cannot proceed before it has processed all previously un-ACKed binding update information, since the UPDREQ message should yield a corresponding UPDDONE message. The UPDDONE message is not sent until the server that sent the UPDREQ message has responded to all of the BNDUPD messages generated by the UPDREQ message with BNDACK messages. Thus, the sender of the UPDREQ message can be sure upon receipt of an UPDDONE message that it has received and committed to stable storage all outstanding binding database updates.

See section 9, Protocol state transitions, for the details of when the UPDREQ message is sent.

7.3.1. Sending the UPDREQ message

There are no options for the UPDREQ message. The UPDREQ message is sent with a unique xid.

7.3.2. Receiving the UPDREQ message

A server receiving an UPDREQ message MUST send all binding database changes that have not yet been ACKed by the sending server. These changes are sent as undistinguished BNDUPD messages.
However, the server which received and is processing the UPDREQ message MUST track the BNDACK messages that correspond to the BNDUPD messages triggered by the UPDREQ message and, when they are all received, the server MUST send an UPDDONE message.

When queuing up the BNDUPD messages for transmission to the sender of the UPDREQ message, the receiving server MUST honor the value returned in the max-unacked-bndupd option in the CONNECT or CONNECTACK message that set up the connection with the sending server. It MUST NOT send more BNDUPD messages without receiving corresponding BNDACKs than the value returned in max-unacked-bndupd.

7.4. UPDREQALL message

The update request all (UPDREQALL) message is used by one server to request that its partner send it all of the binding database information. This message is used to allow one server to recover from a failure of stable storage and to restore its binding database in its entirety from the other server.

A server which sends an UPDREQALL message cannot proceed until all of its binding update information is restored, and it knows that all of that information is restored when an UPDDONE message is received.

See section 9, Protocol state transitions, for the details of when the UPDREQALL message is sent.

7.4.1. Sending the UPDREQALL message

There are no options for the UPDREQALL message. The UPDREQALL message is sent with a unique xid.

7.4.2. Receiving the UPDREQALL message

A server receiving an UPDREQALL message MUST send all binding database information to the sending server. These changes are sent as undistinguished BNDUPD messages.

However, the server receiving the UPDREQALL message MUST track the BNDACK messages that correspond to the BNDUPD messages triggered by the UPDREQALL message and, when they are all received, the server MUST send an UPDDONE message.
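The bookkeeping required of a server answering an UPDREQ or UPDREQALL (a send window limited by max-unacked-bndupd, and an UPDDONE once every triggered BNDUPD has been acknowledged) might be sketched as below. The class and method names are hypothetical, xids are modeled as simple counters, and actual message encoding and transmission are left out.

```python
from collections import deque


class UpdateResponder:
    """Hypothetical tracker for BNDUPDs triggered by one UPDREQ/UPDREQALL."""

    def __init__(self, pending_updates, max_unacked_bndupd):
        if max_unacked_bndupd < 1:
            raise ValueError("max-unacked-bndupd must be at least 1")
        self.queue = deque(pending_updates)   # updates not yet sent
        self.max_unacked = max_unacked_bndupd
        self.unacked = set()                  # xids sent but not yet ACKed
        self.next_xid = 1
        self.done_sent = False

    def pump(self):
        """Return the (xid, update) pairs that may be sent right now."""
        out = []
        # Never exceed the partner's advertised max-unacked-bndupd window.
        while self.queue and len(self.unacked) < self.max_unacked:
            xid = self.next_xid
            self.next_xid += 1
            self.unacked.add(xid)
            out.append((xid, self.queue.popleft()))
        return out

    def on_bndack(self, xid):
        """Record a BNDACK; return True when UPDDONE should now be sent."""
        self.unacked.discard(xid)
        if not self.queue and not self.unacked and not self.done_sent:
            self.done_sent = True
            return True
        return False
```

A caller would invoke pump() after each BNDACK to refill the window, and send UPDDONE the one time on_bndack() returns True.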
When queuing up the BNDUPD messages for transmission to the sender of the UPDREQALL message, the receiving server MUST honor the value returned in the max-unacked-bndupd option in the CONNECT or CONNECTACK message that set up the connection with the sending server. It MUST NOT send more BNDUPD messages without receiving corresponding BNDACKs than the value returned in max-unacked-bndupd.

7.5. UPDDONE message

The update done (UPDDONE) message is used by a server receiving an UPDREQ or UPDREQALL message to signify that it has sent all of the BNDUPD messages requested by the UPDREQ or UPDREQALL request and that it has received a BNDACK for each of those messages.

7.5.1. Sending the UPDDONE message

The UPDDONE message SHOULD be sent as soon as the last BNDACK message corresponding to a BNDUPD message requested by the UPDREQ or UPDREQALL is received from the server which sent the UPDREQ or UPDREQALL.

7.5.2. Receiving the UPDDONE message

A server receiving the UPDDONE message knows that all of the information that it requested by sending an UPDREQ or UPDREQALL message has now been sent and that it has recorded this information in its stable storage. It typically uses the receipt of an UPDDONE message to move to a different failover state. See sections 9.5.2 and 9.8.3 for details.

7.6. POOLREQ message

The pool request (POOLREQ) message is used by the secondary server to request an allocation of IP addresses from the primary server. It MUST be sent by a secondary server to a primary server to request IP address allocation by the primary. The IP addresses allocated are transmitted using normal BNDUPD messages from the primary to the secondary.

The POOLREQ message SHOULD be sent from the secondary to the primary whenever the secondary transitions into NORMAL state. It SHOULD periodically be resent in order that any change in the number of available IP addresses on the primary be reflected in the pool on the secondary.
7.6.1. Sending the POOLREQ message

The POOLREQ message has no options. It must be sent with a unique xid.

7.6.2. Receiving the POOLREQ message

When a primary server receives a POOLREQ message it SHOULD examine the binding database and determine how many IP addresses the secondary server should have, and set these IP addresses to BACKUP state. It SHOULD then send BNDUPD messages concerning all of these IP addresses to the secondary server.

Servers frequently have several kinds of IP addresses available on a particular network segment. The failover protocol assumes that both primary and secondary servers are configured in such a way that each knows the type and number of IP addresses on every network segment participating in the failover protocol. The primary server is responsible for allocating the secondary server the correct proportion of available IP addresses of each kind, and the secondary server is responsible for being configured in such a way that it can tell the kind of every IP address based solely on the IP address itself.

A primary server MUST keep track of how many IP addresses were allocated as a result of processing the POOLREQ message, and send that number in the POOLRESP message.

A primary server MAY choose to defer processing a POOLREQ message until a more convenient time to process it, but it should not depend on the secondary server to retransmit the POOLREQ message in that case.

If a secondary server receives a POOLREQ message it SHOULD report an error.

7.7. POOLRESP message

A primary server sends a POOLRESP message to a secondary server after the allocation process for available addresses to the secondary server is complete. Typically this message will precede some of the BNDUPD messages that the primary uses to send the actual allocated IP addresses to the secondary.

7.7.1.
Sending the POOLRESP message

The POOLRESP message MUST contain the same xid as the corresponding POOLREQ message.

The only option which MUST appear in a POOLRESP message is:

o addresses-transferred

   The number of addresses allocated to the secondary server by the primary server as a result of a POOLREQ is contained in the addresses-transferred option in a POOLRESP message. Note that this is the number of addresses that are transferred to the secondary in the primary's binding database as a result of the corresponding POOLREQ message, and that it may be some time before they can all be transmitted to the secondary server through the use of BNDUPD messages.

7.7.2. Receiving the POOLRESP message

When a secondary server receives a POOLRESP message, it SHOULD send another POOLREQ message if the value of the addresses-transferred option is non-zero.

Typically, no other action is taken on the reception of a POOLRESP message.

7.8. CONNECT message

The connect (CONNECT) message is used to establish an application-level connection over a newly created TCP connection. It gives the source information for the connection, and some important configuration information. It may be sent by either the primary or the secondary server. It is sent by the initiator of a TCP connection.

7.8.1. Sending the CONNECT message

The CONNECT message MUST be the first message sent by the initiator of a TCP connection after the establishment of a new TCP connection with another server participating in the failover protocol.

The xid of the CONNECT message must be unique.

The IP address of the sending server MUST be placed in the sending-server-IP-address option. This information is placed in an option inside of the packet in order to allow the identity of the sender to be covered by a shared secret.

The role of the sending failover endpoint (i.e., either primary or secondary) MUST be placed in the server-role option.
The current time MUST be placed in the current-time option.

The number of BNDUPD messages the server can accept without blocking the TCP connection MUST be placed in the max-unacked-bndupd option. This MUST be a number equal to or greater than 1, SHOULD be a number greater than 10, and SHOULD be a number less than 100.

The length of the receive timer (tReceive, see section 8.3) MUST be placed in the receive-timer option.

If the sending server is a primary server, then the MCLT MUST be placed in the MCLT option.

If the sending server is a primary server, then the hash-bucket-assignment option MUST be included in the CONNECT message. The value of the hash-bucket-assignment option is determined from the specific buckets that the primary server has determined that the secondary server MUST service as part of the load-balancing algorithm. The way in which the primary server determines this information is outside the scope of this protocol definition. The primary server SHOULD be able to be configured with a percentage of clients that the secondary server will be instructed to service, and the primary server SHOULD convert that percentage value into a corresponding set of bits in the hash-bucket-assignment option that are set to 1, indicating that the secondary server MUST service clients which map to those hash buckets.

The vendor class identifier MUST be placed in the vendor-class-identifier option.

The protocol-version option MUST be included in every CONNECT message. The current value of the protocol version is 1.

The TLS-request option MUST be sent and contains the desired TLS connection request as well as information concerning whether TLS is supported. If this CONNECT message is being sent over an already created TLS connection, the TLS-request MUST NOT appear.

7.8.2.
Receiving the CONNECT message

When a server receives a TCP connection on the failover port, it should wait for a CONNECT message.

When a server receives a CONNECT message it should:

   1. Record the time at which the message was received.

   2. Examine the protocol-version option, and decide if this server is capable of interoperating with another server running that protocol version. If not, then send the CONNECTACK message with the appropriate reject-reason. The server MUST include its protocol-version in the CONNECTACK message.

   3. Examine the TLS-request option. Figure out the TLS-reply value based on the capabilities and configuration of this server, and save it for the CONNECTACK message. If the TLS negotiation results in a connection rejection, then go immediately to send the CONNECTACK message.

      The possibilities are:

           CONNECT      CONNECTACK
         TLS-request    TLS-reply     Reject
         req    acc        t1         Reason   Comments
         ---    ---        --         ------   --------
          0      0          0
          0      0          1           11     receiver requires TLS
          0      1          0
          0      1          1
          1      0          -                  request doesn't make sense
          1      1          0
          1      1          1
          2      0          -                  request doesn't make sense
          2      1          0         9 or 10  receiver won't do TLS
          2      1          1

   4. Check to see if there is a message-digest option in the CONNECT message. If there is, and the server does not support message-digests, then reject the connection with the appropriate reject-reason in the CONNECTACK.

   5. Determine if the sender (from the sending-server-IP-address option) and the role of the sender (from the server-role option) represent a server with which the receiver was configured to engage in failover activity. If not, then the receiving server should reject the CONNECT request by sending a CONNECTACK message with a reject-reason value of: 8, invalid failover partner. If they do, then the receiving failover endpoint should be determined.

   6.
      Decide if the time delta between the sending of the packet, in the current-time option, and the receipt of the packet, recorded in step 1 above, is acceptable. A server MAY require an arbitrarily small delta in time values in order to set up a failover connection with another server.

      If the delta between the time values is too great, the server should reject the CONNECT request by sending a CONNECTACK message with a reject-reason of 4, time mismatch too great.

      If the time mismatch is not considered too great then the receiving server MUST record the delta between the servers. The receiving server MUST use this delta to correct all of the absolute times received from the other server in all time-valued options. Note that servers can participate in failover with arbitrarily great time mismatches, as long as the mismatch is more or less constant.

   7. If the receiving server is a secondary server, it MUST examine the MCLT option in the CONNECT request and use the value of the MCLT as the MCLT for this failover endpoint.

      A receiving secondary server SHOULD be able to operate with any MCLT sent by the primary, but if it cannot, then it should send a CONNECTACK with a reject-reason of 5, MCLT mismatch.

   8. The receiving server MAY use the vendor-class-identifier to do vendor specific processing.

7.9. CONNECTACK message

The CONNECTACK message is sent to accept or reject a CONNECT message. It is sent by the server which accepted the TCP connection and received a CONNECT message.

7.9.1. Sending the CONNECTACK message

The xid of the CONNECTACK message must be that of the corresponding CONNECT message.

The IP address of the sending server MUST be placed in the sending-server-IP-address option. This information is placed in an option inside of the packet in order to allow the identity of the sender to be covered by a shared secret.
The role of the sending failover endpoint (i.e., either primary or secondary) MUST be placed in the server-role option.

The current time MUST be placed in the current-time option.

The protocol-version option MUST be included in every CONNECTACK message. The current value of the protocol version is 1.

If the connection has been rejected, the reject-reason option MUST be placed in the CONNECTACK message with an appropriate reason, and a message option SHOULD be included with a human-readable error message describing the reason for the rejection in some detail. If the reject-reason option appears, then the remaining options listed below do not appear.

The results of the TLS negotiation MUST be placed in the TLS-reply option. If this CONNECTACK message is being sent over an already TLS secured connection, then there MUST NOT be a TLS-reply option.

If there was a message-digest option in the CONNECT message, then there MUST be a message-digest in the CONNECTACK message if it does not contain a reject-reason.

The number of BNDUPD messages the server can accept without blocking the TCP connection MUST be placed in the max-unacked-bndupd option. This SHOULD be a number greater than 10, and SHOULD be a number less than 100.

The length of the receive timer (tReceive, see section 8.3) MUST be placed in the receive-timer option.

If the sending server is a primary server, then the MCLT MUST be placed in the MCLT option.

The vendor class identifier MUST be placed in the vendor-class-identifier option.

If the server is rejecting the CONNECT message, then the reject-reason option MUST appear. A message option MAY appear to give a human readable version of the rejection reason.

After sending a CONNECTACK message, the server MUST send a STATE message.

After sending a CONNECTACK message, the server MUST start two timers for the connection: tSend and tReceive.
The tSend timer SHOULD be approximately 20 percent of the time in the receive-timer option in the corresponding CONNECT message. The tReceive timer SHOULD be the time sent in the receive-timer option in the CONNECTACK message.

The tReceive timer is reset whenever a message is received from this TCP connection. If it ever expires, the TCP connection is dropped and communications with this partner are considered not ok.

The tSend timer is reset whenever a packet is sent over this connection. When it expires, a CONTACT message MUST be sent.

7.9.2. Receiving the CONNECTACK message

When a CONNECTACK message is received, the following actions should be taken:

   1. Record the time the packet was received.

   2. Check to see if there is a reject-reason option in the CONNECTACK message. If not, continue with step 3. If there is a reject-reason option, the server SHOULD report the error code. If a message option appears, a server SHOULD display the string from the message option in a user visible way. The server MUST close the connection if a reject-reason option appears.

   3. Check to see if the xid on the CONNECTACK matches an outstanding CONNECT message on this TCP connection.

   4. Check the value of the TLS-reply option, and if it was 1, then skip processing of the rest of the CONNECTACK message, and immediately enter into TLS connection setup. If it does not, a server SHOULD report an error.

   5. Examine the value of the protocol-version option. If this server is able to establish connections with another server running this protocol version, then continue, else close the connection.

   6. Check to see if the sending-server-IP-address and server-role in the CONNECTACK message correspond to the failover endpoint for which this TCP connection was created. If they do not, the server MUST drop the TCP connection and SHOULD report an error.

   7.
      Decide if the time delta between the sending of the packet, in the current-time option, and the receipt of the packet, recorded in step 1 above, is acceptable. A server MAY require an arbitrarily small delta in time values in order to set up a failover connection with another server.

      If the delta between the time values is too great, the server should drop the TCP connection.

      If the time mismatch is not considered too great then the receiving server MUST record the delta between the servers. The receiving server MUST use this delta to correct all of the absolute times received from the other server in all time-valued options. Note that the failover protocol is constructed so that two servers can be failover partners with arbitrarily great time mismatches.

   8. If the receiving server is a secondary server, it MUST examine the MCLT option in the CONNECTACK message and use the value of the MCLT as the MCLT for this failover endpoint.

      A receiving secondary server SHOULD be able to operate with any MCLT sent by the primary, but if it cannot, then it MUST drop the TCP connection.

   9. The receiving server MAY use the vendor-class-identifier to do vendor specific processing.

   10. After accepting a CONNECTACK message, the server MUST send a STATE message.

      After receiving a CONNECTACK message, the server MUST start two timers for the connection: tSend and tReceive.

      The tSend timer SHOULD be approximately 20 percent of the time in the receive-timer option in the corresponding CONNECTACK message. The tReceive timer SHOULD be set to the time sent in the receive-timer option in the CONNECT message.

      The tReceive timer is reset whenever a message is received from this TCP connection. If it ever expires, the TCP connection is dropped and communications with this partner are considered not ok.

      The tSend timer is reset whenever a packet is sent over this connection.
      When it expires, a CONTACT message MUST be sent.

7.10. STATE message

The state (STATE) message is used to communicate the current failover state to the partner server.

The STATE message MUST be sent after sending a CONNECTACK message that didn't contain a reject-reason option, and MUST be sent after receiving a CONNECTACK message without a reject-reason option.

A STATE message MUST be sent whenever the failover endpoint changes its failover state and a connection exists to the partner.

The STATE message requires no response from the failover partner.

7.10.1. Sending the STATE message

The current failover state is placed in the server-state option and the current state of the STARTUP flag is placed in the server-flags option. The message is sent with a unique xid.

A server SHOULD only send the STATE message either when the connection is created (i.e., after sending or receiving a CONNECTACK message with no reject-reason option), or when there is a change from the values sent in a previous STATE message.

7.10.2. Receiving the STATE message

Every STATE message SHOULD indicate a change in state or a change in the flags. When a STATE message is received, any state transitions specified in section 9 are taken.

No response to a STATE message is required.

7.11. CONTACT message

The contact (CONTACT) message is sent to verify communications integrity with a failover partner. The CONTACT message is sent when no messages have been sent to the failover partner for a specified period of time. This is determined by the tSend timer expiring (see section 8.3).

7.11.1. Sending the CONTACT message

The current time is placed in the current-time option, and the CONTACT message is sent.

7.11.2. Receiving the CONTACT message

When a CONTACT message is received, the tReceive timer is reset (as it is with any message that is received).
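The tSend and tReceive timers described in sections 7.9 and 7.11 can be modeled with a small sketch. In both roles, tSend is derived from the receive-timer value advertised by the partner and tReceive from the value this server advertised itself. The class below is hypothetical, takes timestamps as explicit arguments rather than reading a clock, and uses exactly 20 percent where the text says "approximately".

```python
class ConnectionTimers:
    """Hypothetical model of the tSend and tReceive timers of one connection."""

    def __init__(self, partner_receive_timer, own_receive_timer, now):
        # tSend: about 20 percent of the partner's advertised receive-timer.
        self.t_send = 0.20 * partner_receive_timer
        # tReceive: the receive-timer value this server advertised.
        self.t_receive = own_receive_timer
        self.last_sent = now
        self.last_received = now

    def on_message_sent(self, now):
        """Any message sent over the connection resets tSend."""
        self.last_sent = now

    def on_message_received(self, now):
        """Any message received (CONTACT included) resets tReceive."""
        self.last_received = now

    def contact_due(self, now):
        """True when tSend has expired and a CONTACT message must be sent."""
        return now - self.last_sent >= self.t_send

    def receive_expired(self, now):
        """True when tReceive has expired and the connection is to be dropped."""
        return now - self.last_received >= self.t_receive
```

Because tSend is a fraction of the partner's tReceive, an idle but healthy connection carries several CONTACT messages within each tReceive interval.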
A server MAY use the time in the current-time option and the time recorded above to refine the delta time calculations between the servers.

8. Connection Management

Servers participating in the failover protocol communicate over TCP connections. These TCP connections are used both to transmit binding information from one server to another as well as to allow each server to determine whether communications are possible with the other server.

Central to the operation of the failover protocol is a notion of "communications okay" or "communications failed". Failover state transitions are taken in many cases when the status of communications with the partner changes, and the existence or non-existence of a TCP connection between failover endpoints is used to determine if communications are "okay" or "failed".

A single TCP connection exists which connects two failover endpoints.

8.1. Connection granularity

There exists one TCP connection between each set of failover endpoints. See section 5.1.1 for an explanation of failover endpoint. There are a maximum of two TCP connections between any two servers implementing the failover protocol, one for each of the possible failover endpoints between these two servers. There is a minimum of one TCP connection between one server and every other failover server with which it implements the failover protocol.

8.2. Creating the TCP connection

Every server implementing the failover protocol MUST listen on port 647 for incoming failover TCP connections. The source port of the TCP connection is unimportant.

Every server implementing the failover protocol SHOULD attempt to connect to all of its partners periodically, where the period is implementation dependent and SHOULD be configurable.
In the event that a connection has been rejected by a CONNECTACK message with a reject-reason option contained in it, a server SHOULD reduce the frequency with which it attempts to connect to that server, but it SHOULD continue to attempt to connect periodically.

Once a connection is established, the first message sent across the connection MUST be a CONNECT message. This message establishes the identity of the failover endpoint making the connection.

Every CONNECT message includes a TLS-request option, and if the CONNECTACK message does not reject the CONNECT message and the TLS-reply option says TLS MUST be used, then the servers will enter into TLS negotiation.

Once that negotiation is complete, the server MUST resend the CONNECT message on the newly secured TLS connection and then wait for the CONNECTACK message in response. The TLS-request and TLS-reply options MUST have the same values in this second CONNECT and CONNECTACK message as they had in the first messages.

The second message sent over a new connection is a STATE message. Upon the receipt of this message, the receiver can consider communications up.

It is entirely possible that two servers will attempt to make connections to each other essentially simultaneously, and then each will send a CONNECT message down the new connection. In this case each server will receive a CONNECT message on one connection having already sent a CONNECT message on the other connection. In the event that the primary server receives a CONNECT message from the secondary server either while waiting for a CONNECTACK message from a secondary server or when it has a valid connection open to a secondary server, it will close the connection on which the CONNECT message was received.

8.3. Using the TCP connection for determining communications status

The TCP connection is used to determine the communications status of the other server, i.e., communications-ok, or communications-interrupted.

Three things must happen for a server to consider that communications are ok with respect to another server:

   1. A TCP connection must be established to the other server.

   2. A CONNECT message must be received and a CONNECTACK message sent in response. The CONNECT message is used to determine the identity of the failover endpoint at the other end of the TCP connection -- without it, the failover endpoint cannot be uniquely determined. Without knowledge of the failover endpoint, the entity with which communications is ok is undetermined.

   3. A STATE message must be received from the other server over the connection. This STATE message initializes important information necessary to the operation of the state machine that governs the behavior of this failover endpoint.

There are two ways that a server can determine that communications has failed:

   1. The TCP connection can go down, yielding an error when attempting to send a message. This will happen at least as often as the period of the tSend timer.

   2. The tReceive timer can expire.

In either of these cases, communications is considered interrupted.

Several difficulties arise when trying to use one TCP connection both for bulk data transfer and to sense the communications status of the other server. One aspect of the problem stems from the different requirements of the two uses.

The bulk data transfer is of course critically important to the protocol, but the speed with which it is processed is not terribly significant. It might well be minutes before a BNDUPD message is processed, and while not optimal, such an occasional delay doesn't compromise the correctness of the protocol.

However, the speed with which one server detects that the other server is up (or, more importantly, down) is more highly constrained. Generally one server should be able to detect that the other server is not communicating within a minute or less.

These differing time constraints make it difficult to use the same TCP connection for data transfer as well as to sense communications integrity. See section 3.5 for additional details on TCP.

The solution to this problem is to require that some message be received by each end of the connection within a limited time, or else the connection will be considered down. If no messages have been sent recently, then a CONTACT message is sent. In the case where there is no data queued to be sent, this is not a problem, but in the case where there is data queued to be sent to the partner, then the CONTACT message will not actually be transmitted until the queued data is sent.

Section 3.5 explains why waiting for TCP to determine that the connection is down is not acceptable, and leads to a requirement that the receiving server never block the sending server from sending CONTACT packets.

In order to meet this requirement, each server tells the other server the number of outstanding BNDUPD messages that it will accept. The receiving server is required to always be able to accept that many BNDUPD messages off of the connection's input queue even if it cannot process them immediately, and to accept all other messages immediately.

Thus, the sending server's TCP is never blocked from sending a message except for very short periods, less than a few seconds, unless the network connection itself has problems. In this case, if the CONTACT messages don't make it to the partner then the partner will close the connection.

8.4. Using the TCP connection for binding data

Binding data, in the form of BNDUPD messages and BNDACK messages to respond to them, are sent across the TCP connection.

In order to support timely detection of any failure in the partner server, the TCP connection MUST NOT block for more than a very short time, on the order of a few seconds. Therefore, a server that is sending BNDUPD messages MUST send only a restricted number before receiving BNDACK messages about previous messages sent.

The number of outstanding BNDUPD messages that each server will accept without causing TCP to block transmission of additional data (i.e., CONTACT messages) is sent by each server in the CONNECT and CONNECTACK messages in the max-unacked-bndupd option.

8.5. Using the TCP connection for control messages

The TCP connection is used for control messages: POOLREQ, UPDREQ, STATE, UPDREQALL and the corresponding reply messages: POOLRESP, UPDDONE.

A server MUST immediately accept all of these messages from the TCP connection. A server MUST immediately accept any BNDACK which is received as well.

8.6. Losing the TCP connection

When the TCP connection is lost, then communications is not ok with the other server.

A server which has lost communications SHOULD immediately attempt to reconnect to the other server, and should retry these connection attempts periodically.

Any BNDUPD or other messages that have been received but not yet processed from the partner SHOULD be processed as soon as possible.

9. Protocol States

This section discusses the various states that a failover endpoint may take, and the server actions required when entering the state, operating in the state, and leaving the state, as well as the events that cause transitions out of the state into another state.

The state transition diagram in Figure 9.2-1 is relevant for this section.
In the event that the textual description of a state differs from the state transition diagram, the textual description is to be considered authoritative.

This is the common state transition diagram for both servers in a failover pair.

9.1. Server Initialization

When a server starts it starts out in STARTUP state. See section 9.3 below for details.

9.2. Server State Transitions

Whenever a server transitions into a new state, it MUST record the state and the time at which it entered that state in stable storage. If communications is "ok", it MUST also send a STATE message to its failover partner.

Figure 9.2-1 is the diagram of the server state transitions. The remainder of this section contains information important to the understanding of that diagram.

The server stays in the current state until all of the actions specified on the state transition are complete. If communications fails during one of the actions, the server simply stays in the current state and attempts a transition whenever the conditions for a transition are later fulfilled.

In the state transition diagram below, the "+" or "-" in the upper right corner of each state is a notation about whether communication is ongoing with the other server.

The legend "responsive", "balanced", or "unresponsive" in each state indicates whether the server is responsive to all DHCP client requests, running in load balanced mode, or totally unresponsive in the respective state. The terms "responsive" and "unresponsive" have the obvious meanings, while "balanced" means that a DHCP server may respond to all DHCPREQUEST messages that are RENEWAL or REBINDING, and to all other messages from clients for which the load balancing algorithm indicates that it MUST respond. See sections 5.3 and 9.6.2 for details on load balancing.

In the state transition diagram below, when communication is reestablished between the two servers, each must record the state of the partner when communication was restored.
State transitions on one server in some cases imply state transitions on the partner server, so a record of the current state of the partner server must be kept by each server. Droms, et. al. Expires December 1999 [Page 74] Internet Draft DHCP Failover Protocol June 1999 If the state of the partner changes while communicating a server moves through the communications-failed transition and into whatever state results. It then immediately moves through whatever state transition is appropriate given the current state of the partner server. A server performing this operation SHOULD NOT drop the TCP connection to its partner. DISCUSSION: The point of this technique is simplicity, both in explanation of the protocol and in its implementation. The alternative to this technique of memory of partner state and automatic state transi- tion on change of partner state is to have every state in the fol- lowing diagram have a state transition for every possible state of the partner. With the approach adopted, only the states in which communications are reestablished require a state transition for each possible partner state. The current state of a server MUST be recorded in stable storage and thus be available to the server after a server restart. Droms, et. al. Expires December 1999 [Page 75] Internet Draft DHCP Failover Protocol June 1999 +---------------+ V +--------------+ | RECOVER - | | | STARTUP - | |(unresponsive) | +->|(unresponsive)| +---------------+ +--------------+ Comm. OK +-----------------+ Other State:-RECOVER | PARTNER DOWN - |<-----+ | | | (responsive) | | All POTENTIAL- +-----------------+ | Others CONFLICT------------ | --------+ ^(see | | Comm. OK | | 9.8.3)| UPDREQ(ALL) Other State: | +-----+ | Wait UPDDONE | | | Comm. | | Wait MCLT from fail RECOVER All Others| Failed | | +--------------+ | V V | | | |RECOVER-DONE +| +--+ +--------------+ | | |(unresponsive)| | | POTENTIAL + |<--+ | +--------------+ Wait for +>| CONFLICT | | Comm. 
OK Other | |(unresponsive)|<--- | --+ +--Other State:-+ State: | +--------------+ | | | | | RECOVER | | | | | All POTENT. DONE | Resolve Conflict | | | Others: CONFLICT-- | ----+ (see 9.8) | | | Wait for V V | | | Other State: NORMAL +-----------------+ | | | V | NORMAL + | External | | | +--+----------+-->| (balanced) |-Command-->+ | | ^ ^ +-----------------+ | | | | | | | | | Wait for Comm. OK Comm. External | | Other Other Failed Command | | State: State: | or | | |RECOVER-DONE NORMAL Start Safe Safe | | | | COMM. INT. Period Timer Period | | | Comm. OK. | V expiration | | Other State: | +------------------+ | | | RECOVER +--| COMMUNICATIONS - |-----------+ | V +-------------| INTERRUPTED | Comm. OK | RECOVER | (responsive) |--Other State:-+ RECOVER-DONE--------->+------------------+ All Others Figure 9.2-1: Server state diagram. Droms, et. al. Expires December 1999 [Page 76] Internet Draft DHCP Failover Protocol June 1999 9.3. STARTUP state The STARTUP state affords an opportunity for a server to probe its partner server, before starting to service DHCP clients. DISCUSSION: Without the STARTUP state, a server would likely start in a state derived from its previously stored state (held in stable storage), if any. However, this may be inconsistent with the current state of the partner. The STARTUP state affords the opportunity for a server to potentially learn the partner's state and determine if that state is consistent with its derived starting state or whether some significant state change has occurred at the partner that forces the server to start in another state. This is especially critical if significant time has elapsed while the server was down. 9.3.1. Operation while in STARTUP state Whenever a server is in STARTUP state, it MUST be unresponsive to DHCP client requests, and so the time spent in the STARTUP state is necessarily short, typically on the order of a few seconds to a few tens of seconds. 
The exact time spent in the STARTUP state is implementation dependent, and the primary and secondary server are not required to spend the same amount of time in the STARTUP state.

Whenever a STATE message is sent to the partner while in STARTUP state, the STARTUP bit MUST be set in the server-flags option and the previously recorded failover state MUST be placed in the server-state option.

9.3.2. Transition out of STARTUP state

Each server starts out in STARTUP state every time it initializes itself, and performs the following algorithm as part of its initialization:

   1. Do not send any messages until step 5.

   2. Is there any record in stable storage of a previous failover state? If yes, set previous-state to the last recorded state in stable storage, and continue with step 3.

      Is there any configuration information that indicates that this server was previously running but lost its stable storage? Such information must typically come from some administrative intervention, since it is difficult for a server to distinguish first startup from a startup after it has lost its stable storage. If yes, then set the previous-state to RECOVER, and set the time-of-failure to whatever time was configured, and go on to step 3. This time-of-failure will be used in the transition out of the RECOVER state into the RECOVER-DONE state, below.

      If there is no record of any previous failover state in stable storage nor of any previous operational activity for this server, then set the previous-state to PARTNER-DOWN if this server is a primary and RECOVER if this server is a secondary, and set the time-of-failure to a time before the maximum-client-lead-time before now. If using standard POSIX times, 0 would typically do quite well.

   3. Is the previous-state NORMAL? If yes, set the previous-state to COMMUNICATIONS-INTERRUPTED.

   4. Start the STARTUP state timer.
      The time that a server remains in the STARTUP state (absent any communications with its partner) is implementation dependent and SHOULD be configurable. It SHOULD be long enough for a TCP connection to be created to a heavily loaded partner across a slow network.

   5. Attempt to create a TCP connection to the failover partner. See section 8.2.

   6. Wait for "communications okay", i.e., the process discussed in section 8.2 "Creating the TCP Connection", to complete, including the receipt of a STATE message from the partner.

      When and if communications become "okay", clear the STARTUP flag, and set the current state to the previous-state.

      If the partner is in PARTNER-DOWN state, and if the time at which it entered PARTNER-DOWN state (as received in the start-time-of-state option in the STATE message) is later than the last recorded time of operation of this server, then set the current state to RECOVER.

      Then, transition to the current state and take the "communications okay" state transition based on the current state of this server and the partner.

   7. If the STARTUP state timer expires, take an implementation dependent action:

      The server MAY go to the previous-state, or the server MAY wait.

      Reasons to go to previous-state and begin processing:

      If the current server is the only operational server, then if it waits, there will be no operational DHCP servers. This situation could occur very easily where one server fails and then the other crashes and reboots. If the rebooting server doesn't start processing DHCP client requests without first being in communication with the other server, then the level of DHCP redundancy is not particularly high.

      This is an appropriate approach if the possibility of partition is low, or if the safe period expiration time is well beyond the time at which an operator would notice and react to a partition situation.
      It is also quite appropriate if the safe period will never expire.

      Reasons to wait:

      If the current server has been down for longer than the maximum-client-lead-time, and it is partitioned from the other server, then when it returns it will attempt to use its own available addresses to allocate to new DHCP clients, and the other server may well be in PARTNER-DOWN state and may have already allocated some of those available addresses to DHCP clients.

      In cases where the possibility of partition is high, and the safe period expiration time is less than the likely operator reaction time, this is a good approach to use.

9.4. PARTNER-DOWN state

PARTNER-DOWN state is a state either server can enter. When in this state, the server does not assume that the other server could still be operating and servicing a different set of clients, but instead assumes that it is the only server operating. For this reason, only one server should be operating in this state at a time.

9.4.1. Upon entry to PARTNER-DOWN state

No special actions are required when entering PARTNER-DOWN state.

The server should continue to attempt to connect to the partner periodically.

9.4.2. Operation while in PARTNER-DOWN state

A server in PARTNER-DOWN state MUST respond to DHCP client requests. It will allow renewal of all outstanding leases on IP addresses, and will allocate IP addresses from its own pool, and after a fixed period of time (the MCLT interval) has elapsed from entry into PARTNER-DOWN state, it will allocate IP addresses from the set of all available IP addresses.

Once a server has entered NORMAL state, the PARTNER-DOWN state is entered only on command of an external agency (typically an administrator of some sort) or after the expiration of an externally configured minimum safe-time after the beginning of COMMUNICATIONS-INTERRUPTED state.
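
As an illustration only, the allocation rule stated at the start of section 9.4.2 (own pool immediately, all available addresses once the MCLT has elapsed since entry into PARTNER-DOWN state) might be sketched as follows. The Address record, its owner field, and the allocatable() helper are hypothetical names introduced for this sketch and are not defined by the protocol; times are assumed to be plain seconds.

```python
from dataclasses import dataclass


@dataclass
class Address:
    ip: str
    owner: str  # assumption: "self" or "partner", as tagged at entry to PARTNER-DOWN


def allocatable(addresses, now, entered_partner_down, mclt):
    """Return the available addresses a PARTNER-DOWN server may hand out at `now`.

    Addresses from the server's own pool are usable at once; addresses
    tagged as the partner's become usable only after `mclt` seconds have
    elapsed since entry into PARTNER-DOWN state.
    """
    mclt_elapsed = now >= entered_partner_down + mclt
    return [a for a in addresses if a.owner == "self" or mclt_elapsed]
```

Under these assumptions, a server that entered PARTNER-DOWN at time 0 with an MCLT of 3600 seconds would offer only its own addresses until time 3600, and every available address from then on.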
Any available IP address tagged as belonging to the other server (at entry to PARTNER-DOWN state) MUST NOT be used until the maximum-client-lead-time beyond the entry into PARTNER-DOWN state has elapsed.

A server in PARTNER-DOWN state MUST NOT allocate an IP address to a DHCP client different from that to which it was allocated at the entrance to PARTNER-DOWN state until the maximum-client-lead-time beyond its expiration time has elapsed. If this time would be earlier than the current time plus the maximum-client-lead-time, then the current time plus the maximum-client-lead-time is used.

Two options exist for lease times given out while in PARTNER-DOWN state, with different ramifications flowing from each.

If the server wishes the Failover protocol to protect it from loss of stable storage in PARTNER-DOWN state, then it should ensure that the MCLT based lease time restrictions in Section 5.1 are maintained, even in PARTNER-DOWN state.

If the server wishes to forgo the protection of the Failover protocol in the event of loss of stable storage, then it need recognize no restrictions on actual client lease times while in PARTNER-DOWN state.

A server in PARTNER-DOWN state attempts to establish communications and synchronization with its partner.

9.4.3. Transitions out of PARTNER-DOWN state

When a server in PARTNER-DOWN state succeeds in establishing a connection to its partner, its actions are conditional on the state and flags received in the STATE message from the other server as part of the process of establishing the connection.

If the STARTUP bit is set in the server-flags option of a received STATE message, a server in PARTNER-DOWN state MUST NOT take any state transitions based on reestablishing communications.
Essentially, if a server is in PARTNER-DOWN state, it ignores all STATE messages from its partner that have the STARTUP bit set in the server-flags option of the STATE message.

If the STARTUP bit is not set in the server-flags option of a STATE message received from its partner, then a server in PARTNER-DOWN state takes the following actions based on the value of the server-state option in the received STATE message:

   o partner in NORMAL, COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN or POTENTIAL-CONFLICT state

     transition to POTENTIAL-CONFLICT state

   o partner in RECOVER state

     stay in PARTNER-DOWN state

   o partner in RECOVER-DONE state

     transition into NORMAL state

9.5. RECOVER state

This state indicates that the server has no information in its stable storage or that it is re-integrating with a server in PARTNER-DOWN state after it has been down. A server in this state will attempt to refresh its stable storage from the other server.

9.5.1. Operation in RECOVER state

A server in RECOVER state MUST NOT respond to DHCP client requests.

A server in RECOVER state will attempt to reestablish communications with the other server.

9.5.2. Transitions out of RECOVER state

If the other server is in POTENTIAL-CONFLICT state when communications are reestablished, then the server in RECOVER state will move to POTENTIAL-CONFLICT state itself.

If the other server is in RECOVER state, then this server SHOULD signal an error and halt processing.

If the other server is in any other state, then the server in RECOVER state will request an update of missing binding information by sending an UPDREQ message. If the server has been instructed (through configuration or other external agency) that it has lost its stable storage, it MUST send an UPDREQALL message, otherwise it MUST send an UPDREQ message.
It will wait for an UPDDONE message, and upon receipt of that message it will start a timer whose expiration is set to a time equal to the time the server went down (if known) or the current time (if the down-time is unknown) plus the maximum-client-lead-time. When this timer goes off, the server will transition into RECOVER-DONE state.

This is to allow the clients of any IP addresses that were allocated by this server prior to loss of its client binding information in stable storage to contact the other server, or for those leases to time out.

See Figure 9.5.2-1.

DISCUSSION:

   The actual requirement on this wait period in RECOVER is that it start when the recovering server went down, not necessarily when it came back up. If the time when the recovering server failed is known, then it could be communicated to the recovering server, and the wait period could be reduced to the maximum-client-lead-time less the difference between the current time and the time the server failed. In this way, the waiting period could be minimized.

If an UPDDONE message isn't received within an implementation dependent amount of time, and no BNDUPD messages are being received, then the UPDREQ(ALL) message will be re-transmitted.

        A                                        B
      Server                                   Server

        |                                        |
     RECOVER                               PARTNER-DOWN
        |                                        |
        | >--UPDREQ-------------------->         |
        |                                        |
        |        <---------------------BNDUPD--< |
        | >--BNDACK-------------------->         |
       ...                                      ...
        |                                        |
        |        <---------------------BNDUPD--< |
        | >--BNDACK-------------------->         |
        |                                        |
        |        <--------------------UPDDONE--< |
        |                                        |
Wait MCLT from last known                        |
    time of operation                            |
        |                                        |
   RECOVER-DONE                                  |
        |                                        |
        | >--STATE-(RECOVER-DONE)------>         |
        |                                     NORMAL
        |        <-------------(NORMAL)-STATE--< |
      NORMAL                                     |
        |                                        |
        |                                        |

Figure 9.5.2-1: Transition out of RECOVER state

9.6. NORMAL state

NORMAL state is the state used by a server when it can communicate with the other server.

9.6.1. Upon Entry to NORMAL state

When entering NORMAL state, a server will send to the other server all currently unacknowledged binding updates as BNDUPD messages.

When the above process is complete, if the server entering NORMAL state is a secondary server, then it will request IP addresses for allocation using the POOLREQ message.

9.6.2. Processing DHCP client requests and load balancing

When in NORMAL state, each server MUST process all requests from some DHCP clients, and MUST NOT process any request other than a DHCPREQUEST/RENEWAL or a DHCPREQUEST/REBINDING request from some other DHCP clients. The load balancing algorithm determines into which set a particular DHCP client falls.

As discussed in section 5.3, each server will take the client-identifier from each DHCP client request (or the htype concatenated to the front of the chaddr if no client-identifier is present in the request), and hash it with the algorithm given in section 12. The result of this hash algorithm is a number between 0 and 255. This number is used to index into the bit array received by a server in the hash-bucket-assignment option (if the server is a secondary), or into the inverse of the bit array sent to the secondary in the hash-bucket-assignment option if the server is a primary.

If the bit found from this indexing process is a 1 bit, then the server MUST process this DHCP request.

In NORMAL state, a server MUST process every DHCPREQUEST/RENEWAL or DHCPREQUEST/REBINDING request it receives.

9.6.3. Operation in NORMAL state

When in NORMAL state, for every DHCP client request that it processes, as determined by the algorithm described in section 9.6.2, above, a server will operate in the following manner:

   o Lease time calculations

     As discussed in section 5.2.1, "Control of lease time", the lease interval given to a DHCP client can never be more than the MCLT greater than the most recently received potential-expiration-time from the failover partner or the current time, whichever is later.

     As long as a server adheres to this constraint, the specifics of the lease interval that it gives to a DHCP client or the value of the potential-expiration-time sent to its failover partner are implementation dependent. One possible approach is discussed in section 5.2.1, but that particular approach is in no way required by this protocol.

   o Lazy update of partner server

     After an ACK of an IP address binding, the server servicing a DHCP client request attempts to update its partner with the new binding information. The lease time used in the update of the secondary MUST be at least that given to the DHCP client in the DHCPACK, and the potential-expiration-time MUST be at least the lease time, and SHOULD be longer.

   o Reallocation of IP addresses between clients

     Whenever a client binding is released or expires, a BNDUPD message must be sent to the partner, setting the binding state to RELEASED or EXPIRED. However, until a BNDACK is received for this message, the IP address cannot be allocated to another client. It can be allocated to the same client again.

In NORMAL state, each server receives binding updates from its partner server in BNDUPD messages. It records these in its client binding database in stable storage and then sends a corresponding BNDACK message to the primary server.
It MUST ensure that the information is recorded in stable storage prior to sending the BNDACK message back to the primary server.

9.6.4. Transitions out of NORMAL state

If an external command is received by a server in NORMAL state informing it that its partner is down, then the server transitions into PARTNER-DOWN state.

If a server in NORMAL state fails to receive acks to messages sent to its partner for an implementation dependent period of time, it MAY move into COMMUNICATIONS-INTERRUPTED state. This situation might occur if the partner server was capable of maintaining the TCP connection between the servers and also capable of sending a CONTACT message every tSend seconds, but was (for some reason) incapable of processing BNDUPD messages.

If the communications is determined to not be "ok" (as defined in section 8), then the server transitions into COMMUNICATIONS-INTERRUPTED state.

If a server in NORMAL state receives any messages from its partner where the partner has changed state from that expected by the server in NORMAL state, then the server should transition into COMMUNICATIONS-INTERRUPTED state and take the appropriate state transition from there. For example, it would be expected for the partner to transition from POTENTIAL-CONFLICT into NORMAL state, but not for the partner to transition from NORMAL into POTENTIAL-CONFLICT state.

9.7. COMMUNICATIONS-INTERRUPTED State

A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is unable to communicate with the other server. Primary and secondary servers cycle automatically (without administrative intervention) between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network connection between them fails and recovers, or as the partner server cycles between operational and non-operational. No duplicate IP address allocation can occur while the servers cycle between these states.

9.7.1. Upon Entry to COMMUNICATIONS-INTERRUPTED state

When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been configured to support an automatic transition out of COMMUNICATIONS-INTERRUPTED state and into PARTNER-DOWN state (i.e., a "safe period" has been configured, see section 10), then a timer MUST be started for the length of the configured safe period.

A server transitioning into the COMMUNICATIONS-INTERRUPTED state from the NORMAL state SHOULD raise some alarm condition to alert administrative staff to a potential problem in the DHCP subsystem.

9.7.2. Operation in COMMUNICATIONS-INTERRUPTED State

In this state a server MUST respond to all DHCP client requests, and the algorithm for load balancing described in section 5.3 MUST NOT be used. When allocating new IP addresses, each server allocates from its own IP address pool, where the primary MUST allocate only FREE IP addresses, and the secondary MUST allocate only BACKUP IP addresses. When responding to renewal requests, each server will allow continued renewal of a DHCP client's current lease on an IP address irrespective of whether that lease was given out by the receiving server or not, although the renewal period MUST NOT exceed the maximum client lead time (MCLT) beyond the potential-expiration-time already acknowledged by the other server or the lease-expiration-time or potential-expiration-time received from the partner server.

However, since the server cannot communicate with its partner in this state, the acknowledged-potential-expiration-time will not be updated in any new bindings. This is likely to eventually cause the actual-client-lease-times to be the current-time plus the maximum-client-lead-time (unless this is greater than the desired-client-lease-time).

9.7.3. Transition out of COMMUNICATIONS-INTERRUPTED State

If the safe period timer expires while a server is in the COMMUNICATIONS-INTERRUPTED state, it will transition immediately into PARTNER-DOWN state.

If an external command is received by a server in COMMUNICATIONS-INTERRUPTED state informing it that its partner is down, it will transition immediately into PARTNER-DOWN state.

If communications is restored with the other server, then the server in COMMUNICATIONS-INTERRUPTED state will transition into another state based on the state of the partner:

   o partner in NORMAL or COMMUNICATIONS-INTERRUPTED

     Transition into the NORMAL state.

   o partner in RECOVER

     Stay in COMMUNICATIONS-INTERRUPTED state.

   o partner in RECOVER-DONE

     Transition into NORMAL state.

   o partner in PARTNER-DOWN or POTENTIAL-CONFLICT

     Transition into POTENTIAL-CONFLICT state.

   o partner in PAUSED

     Stay in COMMUNICATIONS-INTERRUPTED state.

   o partner in SHUTDOWN

     Transition into PARTNER-DOWN state.

The following figure illustrates the transition from NORMAL to COMMUNICATIONS-INTERRUPTED state and then back to NORMAL state again.

     Primary                                 Secondary
      Server                                   Server

      NORMAL                                   NORMAL
        | >--CONTACT------------------->         |
        |        <--------------------CONTACT--< |
        |        [TCP connection broken]         |
 COMMUNICATIONS              :            COMMUNICATIONS
   INTERRUPTED               :              INTERRUPTED
        |      [attempt new TCP connection]      |
        |         [connection succeeds]          |
        |                                        |
        | >--CONNECT------------------->         |
        |        <-----------------CONNECTACK--< |
        |        <-------------------STATE-----< |
        |                                     NORMAL
        | >--STATE--------------------->         |
      NORMAL                                     |
        | >--BNDUPD-------------------->         |
        |        <---------------------BNDACK--< |
        |                                        |
        |        <---------------------BNDUPD--< |
        | >------BNDACK---------------->         |
       ...                                      ...
        |                                        |
        |        <--------------------POOLREQ--< |
        | >--POOLRESP-(2)-------------->         |
        |                                        |
        | >--BNDUPD-(#1)--------------->         |
        |        <---------------------BNDACK--< |
        |                                        |
        |        <--------------------POOLREQ--< |
        | >--POOLRESP-(0)-------------->         |
        |                                        |
        | >--BNDUPD-(#2)--------------->         |
        |        <---------------------BNDACK--< |
        |                                        |

Figure 9.7.3-1: Transition from NORMAL to COMMUNICATIONS-INTERRUPTED and back (example with 2 addresses allocated to secondary)

9.8. POTENTIAL-CONFLICT state

This state indicates that the two servers are attempting to reintegrate with each other, but at least one of them was running in a state that did not guarantee automatic reintegration would be possible. In POTENTIAL-CONFLICT state the servers may determine that the same IP address has been offered and accepted by two different DHCP clients.

It is a goal of this protocol to minimize the possibility that POTENTIAL-CONFLICT state is ever entered.

9.8.1. Upon Entry to POTENTIAL-CONFLICT state

When a primary server enters POTENTIAL-CONFLICT state it should request that the secondary send it all updates of which it is currently unaware by sending an UPDREQ message to the secondary server.

A secondary server entering POTENTIAL-CONFLICT state will wait for the primary to send it an UPDREQ message.

9.8.2. Operation in POTENTIAL-CONFLICT state

Any server in POTENTIAL-CONFLICT state MUST NOT process any incoming DHCP requests.

9.8.3. Transitions out of POTENTIAL-CONFLICT state

If communications fails with the partner while in POTENTIAL-CONFLICT state, then a primary server will transition to PARTNER-DOWN state and a secondary server will stay in POTENTIAL-CONFLICT state.

Whenever either server receives an UPDDONE message from its partner while in POTENTIAL-CONFLICT state, it MUST transition to NORMAL state.
This will cause the primary server to leave POTENTIAL-CONFLICT state
prior to the secondary, since the primary sends an UPDREQ message and
receives an UPDDONE before the secondary sends an UPDREQ message and
receives its UPDDONE message.

When a secondary server receives an indication that the primary server
has transitioned from POTENTIAL-CONFLICT to NORMAL state, it SHOULD
send an UPDREQ message to the primary server.

        Primary                                   Secondary
        Server                                     Server

           |                                          |
  POTENTIAL-CONFLICT                         POTENTIAL-CONFLICT
           |                                          |
           | >--UPDREQ-------------------->           |
           |                                          |
           |          <---------------------BNDUPD--< |
           | >--BNDACK-------------------->           |
          ...                                        ...
           |                                          |
           |          <---------------------BNDUPD--< |
           | >--BNDACK-------------------->           |
           |                                          |
           |          <--------------------UPDDONE--< |
        NORMAL                                        |
           | >--STATE--(NORMAL)----------->           |
           |          <---------------------UPDREQ--< |
           |                                          |
           | >--BNDUPD-------------------->           |
           |          <---------------------BNDACK--< |
          ...                                        ...
           | >--BNDUPD-------------------->           |
           |          <---------------------BNDACK--< |
           |                                          |
           | >--UPDDONE------------------->           |
           |                                       NORMAL
           |                                          |
           |          <--------------------POOLREQ--< |
           | >------POOLRESP-(?)---------->           |
           |                                          |

   Figure 9.8.3-1: Transition out of POTENTIAL-CONFLICT

9.9. RECOVER-DONE state

This state exists to allow an interlocked transition for one server
from RECOVER state and another server from PARTNER-DOWN or
COMMUNICATIONS-INTERRUPTED state into NORMAL state.

9.9.1. Operation in RECOVER-DONE state

A server in RECOVER-DONE state MUST respond only to DHCPREQUEST/RENEWAL
and DHCPREQUEST/REBINDING DHCP messages.

9.9.2. Transitions out of RECOVER-DONE state

When a server in RECOVER-DONE state determines that its partner server
has entered NORMAL state, then it will transition into NORMAL state as
well.

9.10. PAUSED state

This state exists to allow one server to inform another that it will be
out of service for what is predicted to be a relatively short time, and
to allow the other server to transition to COMMUNICATIONS-INTERRUPTED
state immediately and to begin servicing all DHCP clients with no
interruption in service to new DHCP clients.

A server which is aware that it is shutting down temporarily SHOULD
send a STATE message with the server-state option containing PAUSED
state.

While a server may or may not transition internally into PAUSED state,
the 'previous' state determined when it is restarted MUST be the state
the server was in prior to receiving the command to shutdown and
restart and which precedes its entry into the PAUSED state. See section
9.3.2 concerning the use of the previous state upon server restart.

9.10.1. Upon entry to PAUSED state

When entering PAUSED state, the server MUST store the previous state in
stable storage, and use that state as the previous state when it is
restarted.

9.10.2. Transitions out of PAUSED state

A server transitions out of PAUSED state by being restarted. At that
time, the previous state MUST be the state the server was in prior to
entering the PAUSED state.

9.11. SHUTDOWN state

This state exists to allow one server to inform another that it will be
out of service for what is predicted to be a relatively long time, and
to allow the other server to transition immediately to PARTNER-DOWN
state, and take over completely for the server going down.

A server which is aware that it is shutting down SHOULD send a STATE
message with the server-state field containing SHUTDOWN.

While a server may or may not transition internally into SHUTDOWN
state, the 'previous' state determined when it is restarted MUST be the
state active prior to the command to shutdown.
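The 'previous' state rule for PAUSED and SHUTDOWN can be sketched in C
as follows. This is only an illustration under stated assumptions, not
part of the protocol: the type and function names are hypothetical, and
the fo_stable structure merely stands in for whatever stable storage an
implementation actually uses.

```c
#include <assert.h>

/* Failover states named in this document. The enum is illustrative only. */
enum fo_state {
    FO_NORMAL,
    FO_COMMUNICATIONS_INTERRUPTED,
    FO_PARTNER_DOWN,
    FO_POTENTIAL_CONFLICT,
    FO_RECOVER,
    FO_RECOVER_DONE,
    FO_PAUSED,
    FO_SHUTDOWN
};

/* Hypothetical stand-in for stable storage; a real server would write
 * this record to disk before acting on it. */
struct fo_stable {
    enum fo_state previous_state;
};

struct fo_server {
    enum fo_state state;        /* current failover state */
    struct fo_stable *stable;   /* record that survives a restart */
};

/* Enter PAUSED or SHUTDOWN, first saving the state held before the command. */
static void fo_enter_stop_state(struct fo_server *s, enum fo_state stop)
{
    assert(stop == FO_PAUSED || stop == FO_SHUTDOWN);
    /* Assumption: if the server is already stopping, keep the saved state. */
    if (s->state != FO_PAUSED && s->state != FO_SHUTDOWN)
        s->stable->previous_state = s->state;
    s->state = stop;
}

/* On restart the 'previous' state is the saved one, never PAUSED or SHUTDOWN. */
static enum fo_state fo_previous_state_on_restart(const struct fo_stable *st)
{
    return st->previous_state;
}
```

A server that was in COMMUNICATIONS-INTERRUPTED state when told to
pause would, under this sketch, restart with
COMMUNICATIONS-INTERRUPTED as its previous state.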
See section 9.3.2 concerning the use of the previous state upon server
restart.

9.11.1. Upon entry to SHUTDOWN state

When entering SHUTDOWN state, the server MUST record the previous state
in stable storage for use when the server is restarted. It also MUST
record the current time as the last time operational.

A server which is aware that it is shutting down SHOULD send a STATE
message with the server-state field containing SHUTDOWN.

9.11.2. Operation in SHUTDOWN state

A server in SHUTDOWN state MUST NOT respond to any DHCP client input.

If a server receives any message indicating that the partner has moved
to PARTNER-DOWN state while it is in SHUTDOWN state then it MUST record
RECOVER state as the previous state to be used when it is restarted.

A server SHOULD wait for a few seconds after informing the partner of
entry into SHUTDOWN state (if communications are okay) to determine
whether the partner will enter PARTNER-DOWN state.

9.11.3. Transitions out of SHUTDOWN state

A server transitions out of SHUTDOWN state by being restarted.

10. Safe Period

Due to the restrictions imposed on each server while in
COMMUNICATIONS-INTERRUPTED state, long-term operation in this state is
not feasible for either server. One reason that these states exist at
all is to allow the servers to easily survive transient network
communications failures of a few minutes to a few days (although the
actual time periods will depend a great deal on the DHCP activity of
the network in terms of arrival and departure of DHCP clients on the
network).

Eventually, when the servers are unable to communicate, they will have
to move into a state where they no longer can re-integrate without some
possibility of a duplicate IP address allocation. There are two ways
that they can move into this state (known as PARTNER-DOWN).

They can either be informed by external command that, indeed, the
partner server is down.
In this case, there is no difficulty in moving into the PARTNER-DOWN
state since it is an accurate reflection of reality and the protocol
has been designed to operate correctly (even during reintegration) if,
when in PARTNER-DOWN state the partner is, indeed, down.

The more difficult scenario is when the servers are running unattended
for extended periods, and in this case an option is provided to
configure something called a "safe-period" into each server. This
OPTIONAL safe-period is the period after which either the primary or
secondary server will automatically transition to PARTNER-DOWN from
COMMUNICATIONS-INTERRUPTED state. If this transition is completed and
the partner is not down, then the possibility of duplicate IP address
allocations will exist.

The goal of the "safe-period" is to allow network operations staff some
time to react to a server moving into COMMUNICATIONS-INTERRUPTED state.
During the safe-period the only requirement is that the network
operations staff determine if both servers are still running -- and if
they are, to either fix the network communications failure between
them, or to take one of the servers down before the expiration of the
safe-period.

The length of the safe-period is installation dependent, and depends in
large part on the number of unallocated IP addresses within the subnet
address pool and the expected frequency of arrival of previously
unknown DHCP clients requiring IP addresses. Many environments should
be able to support safe-periods of several days.

During this safe period, either server will allow renewals from any
existing client. The only limitation concerns the need for IP addresses
for the DHCP server to hand out to new DHCP clients and the need to
re-allocate IP addresses to different DHCP clients.

The number of "extra" IP addresses required is equal to the expected
total number of new DHCP clients encountered during the safe period.
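As a rough worked example of that sizing rule, the arithmetic can be
sketched in C. The function and parameter names below are hypothetical,
and the inputs are operator estimates rather than protocol fields; the
sketch assumes a steady arrival rate of new clients.

```c
#include <assert.h>
#include <limits.h>

/* Extra addresses one server needs in order to serve new clients for
 * the whole safe period. Both inputs are operator estimates
 * (hypothetical names), not protocol fields. */
static unsigned long fo_extra_addrs_needed(unsigned long new_clients_per_hour,
                                           unsigned long safe_period_hours)
{
    return new_clients_per_hour * safe_period_hours;
}

/* Longest safe period, in whole hours, that a given spare pool can sustain. */
static unsigned long fo_max_safe_period_hours(unsigned long spare_addrs,
                                              unsigned long new_clients_per_hour)
{
    if (new_clients_per_hour == 0)
        return ULONG_MAX;   /* no new clients: the pool never runs out */
    return spare_addrs / new_clients_per_hour;
}
```

With an estimated 5 new clients per hour and a 72 hour safe period,
fo_extra_addrs_needed(5, 72) gives 360 extra addresses.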
This is dependent only on the arrival rate of new DHCP clients, not the
total number of outstanding leases on IP addresses.

In the unlikely event that a relatively short safe period of an hour is
all that can be used (given a dearth of IP addresses or a very high
arrival rate of new DHCP clients), even that can provide substantial
benefits in allowing the DHCP subsystem to ride through minor problems
that could occur and be fixed within that hour. In these cases, no
possibility of duplicate IP address allocation exists, and
re-integration after the failure is solved will be automatic and
require no operator intervention.

11. Security

It is very desirable to assure the integrity of failover partners and
to thus ensure proper operation of the servers. For example, denial of
service attacks are possible by the communication of invalid state
information to both servers.

The Failover protocol MAY be secured either by using a simple shared
secret message digest which covers each message or by using TLS [TLS]
(Transport Layer Security).

11.1. Simple shared secret

A simple shared secret message digest MAY be used to cover each
message. Since there are a number of configuration parameters that must
already be the same on each server in a pair, it is not unreasonable to
require a shared secret to be configured as well.

Only information within the packet and covered by the message digest is
used for operation of the protocol. It is for this reason that the IP
address of the sending server is sent in the sending-server-IP-address
option of the CONNECT and CONNECTACK messages.

This message digest is placed in the message-digest option. The digest
covers the message prior to the inclusion of the message-digest option.

11.2. TLS

TLS, Transport Layer Security, as specified in [TLS] MAY be used.
The use of TLS would be similar to the way it is used with SMTP
[SMTPTLS] and IMAP/POP3/ACAP [IMAPTLS].

To request the use of TLS, the server that successfully opened a
connection to its peer MUST send the TLS option as part of the CONNECT
message. The server receiving the TLS option MUST respond with a
TLS-reply option indicating its acceptance or rejection of the
TLS-request in the CONNECT message.

If the CONNECTACK message contained a TLS-reply of 1, then both servers
begin TLS negotiation. Upon completion of this negotiation, the server
which originally sent the CONNECT message MUST resend its CONNECT
message without any TLS-request, and MUST wait for a corresponding
CONNECTACK.

Implementation of the TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [TLS] cipher
suite is REQUIRED in Failover servers supporting TLS. This is important
as it assures that any two compliant implementations can be configured
to interoperate.

12. Hash algorithm for load balancing

The following hash function is an implementation of the algorithm known
as "Pearson's hash". The Pearson's hash algorithm was originally
published in the Communications of the ACM Vol.33, No. 6 (June 1990),
pp. 677-680. The author, Peter K. Pearson, has kindly granted his
permission to use this algorithm, free of any encumbrances.

To make primary-backup load balancing possible, both servers MUST use
the same hash function.

   /* A "mixing table" of 256 distinct values, in pseudo-random order. */
   unsigned char failover_hash_mx_tbl[256] = {
       251, 175, 119, 215,  81,  14,  79, 191,
       103,  49, 181, 143, 186, 157,   0, 232,
        31,  32,  55,  60, 152,  58,  17, 237,
       174,  70, 160, 144, 220,  90,  57, 223,
        59,   3,  18, 140, 111, 166, 203, 196,
       134, 243, 124,  95, 222, 179, 197,  65,
       180,  48,  36,  15, 107,  46, 233, 130,
       165,  30, 123, 161, 209,  23,  97,  16,
        40,  91, 219,  61, 100,  10, 210, 109,
       250, 127,  22, 138,  29, 108, 244,  67,
       207,   9, 178, 204,  74,  98, 126, 249,
       167, 116,  34,  77, 193, 200, 121,   5,
        20, 113,  71,  35, 128,  13, 182,  94,
        25, 226, 227, 199,  75,  27,  41, 245,
       230, 224,  43, 225, 177,  26, 155, 150,
       212, 142, 218, 115, 241,  73,  88, 105,
        39, 114,  62, 255, 192, 201, 145, 214,
       168, 158, 221, 148, 154, 122,  12,  84,
        82, 163,  44, 139, 228, 236, 205, 242,
       217,  11, 187, 146, 159,  64,  86, 239,
       195,  42, 106, 198, 118, 112, 184, 172,
        87,   2, 173, 117, 176, 229, 247, 253,
       137, 185,  99, 164, 102, 147,  45,  66,
       231,  52, 141, 211, 194, 206, 246, 238,
        56, 110,  78, 248,  63, 240, 189,  93,
        92,  51,  53, 183,  19, 171,  72,  50,
        33, 104, 101,  69,   8, 252,  83, 120,
        76, 135,  85,  54, 202, 125, 188, 213,
        96, 235, 136, 208, 162, 129, 190, 132,
       156,  38,  47,   1,   7, 254,  24,   4,
       216, 131,  89,  21,  28, 133,  37, 153,
       149,  80, 170,  68,   6, 169, 234, 151
   };

   unsigned char
   failover_p_hash(
       unsigned char *key,   /* The key to be hashed (e.g., MAC address) */
       int len               /* Length of key in bytes */
   )
   {
       /* Seed the hash with the key length (truncated to one byte). */
       unsigned char hash = (unsigned char)len;
       int i;

       /* Fold in the key bytes from last to first through the table. */
       for (i = len; i > 0; ) {
           hash = failover_hash_mx_tbl[hash ^ key[--i]];
       }
       return hash;
   }

13. Acknowledgments

Ralph Droms started it all, by sketching out an initial interserver
draft that embodied ideas from several past IETF meetings. In that
draft, he acknowledged contributions by Jeff Mogul, Greg Minshall, Rob
Stevens, Walt Wimer, Ted Lemon, and the DHC working group.
Kim Kinnear and Bob Cole each extended that draft, separately and then
together, until they created an interserver draft that supported any
number of servers. The complexity of that approach was just too great,
and that draft wasn't greeted with enthusiasm by many, including its
authors.

It did however lead to a much simpler approach embodied in the first
Failover draft by Greg Rabil, Mike Dooley, Arun Kapur and Ralph Droms.
This draft posited only two servers -- a primary and a secondary. Kim
Kinnear then wrote the Safe Failover draft to layer on top of the
Failover Draft and increase its robustness in the face of certain rare
network failures.

At the spring 1998 IETF meeting in LA, the DHC working group said that
they wanted a merged Failover and Safe Failover draft. Steve Gonczi and
Bernie Volz stepped up and produced the raw material for such a merged
draft, along with a new message format designed around DHCP options and
other extensions and clarifications. Kim Kinnear edited their work into
draft format and made other changes in time for the Summer Chicago IETF
meeting.

During the summer and fall of 1998, two groups worked on separate
implementations of the UDP failover draft. Bernie Volz and Steve Gonczi
constituted one group, and Kim Kinnear, Mark Stapp and Paul Fox made up
the other. These two groups worked together to produce considerable
changes and simplifications of the protocol during that period, and
Steve Gonczi and Kim Kinnear edited those changes into the -03 draft in
time for submission to the December 1998 Orlando IETF meeting.

In February of 1999 Kim Kinnear and Mark Stapp hosted a meeting of
people interested in the failover draft. During that meeting a general
agreement was reached to recast the failover protocol to use TCP
instead of UDP. In addition, the group together brainstormed a workable
load-balancing technique.
Kim Kinnear volunteered to rewrite the entire draft to include the
changes made at that meeting as well as to restructure the draft along
guidelines suggested by Thomas Narten. The current draft represents the
results of that effort.

The initial idea for a hash-based load balancing approach was offered
by Ted Lemon, and the determination of an algorithm and its integration
into the draft was done by Steve Gonczi. The security section was
spearheaded by Bernie Volz. Both contributed considerably to the ideas
and text in the rest of the draft with several reviews.

These most recent changes have been widely circulated among the other
authors, but that does not preclude any of them from expressing
disagreement with what is contained in this draft at any future time.

Many people have reviewed the various earlier drafts that went into
this result. At American Internet, ideas were contributed by Brad
Parker. At Cisco Systems, Paul Fox and Ellen Garvey have contributed
greatly to the form of the protocol.

Glenn Waters of Bay Networks contributed ideas and enthusiasm to make a
Failover protocol that was both "safe" and "lazy".

Many thanks to Peter K. Pearson, the author of Pearson's hash, who has
kindly granted his permission to use this algorithm for DHCP load
balancing, free of any encumbrances.

14. References

[RFC 2131]  Droms, R., "Dynamic Host Configuration Protocol", RFC 2131,
            March 1997.
[RFC 2119]  Bradner, S., "Key words for use in RFCs to Indicate
            Requirement Levels", RFC 2119, March 1997.
[RFC 2132]  Alexander, S., Droms, R., "DHCP Options and BOOTP Vendor
            Extensions", RFC 2132, March 1997.
[TLS]       Dierks, T., "The TLS Protocol, Version 1.0", RFC 2246,
            January 1999.
[SMTPTLS]   Hoffman, P., "SMTP Service Extension for Secure SMTP over
            TLS", RFC 2487, January 1999.
[IMAPTLS]   Newman, C., "Using TLS with IMAP, POP3, and ACAP", RFC 2595,
            June 1999.
[NAMESPACE] Carney, M.,
            "draft-ietf-dhc-option_review_and_namespace-00.txt", June
            1999.
[DDNS]      Rekhter, Y., Stapp, M., "draft-ietf-dhc-dhcp-dns-10.txt",
            June 1999.

15. Author's information

   Ralph Droms
   323 Dana Engineering
   Bucknell University
   Lewisburg, PA 17837
   Phone: (717) 524-1145
   EMail: droms@bucknell.edu

   Greg Rabil, Mike Dooley, Arun Kapur
   Lucent Technologies (Quadritek)
   10 Valley Stream Parkway, Suite 240
   Malvern, PA 19355
   Phone: (800) 208-2747
   EMail: grabil@lucent.com
          mdooley@lucent.com
          akapur@lucent.com

   Kim Kinnear
   Mark Stapp
   Cisco Systems
   250 Apollo Drive
   Chelmsford, MA 01824
   Phone: (978) 244-8000
   EMail: kkinnear@cisco.com
          mjs@cisco.com

   Bernie Volz
   Steve Gonczi
   Process Software Corporation
   959 Concord St.
   Framingham, MA 01701
   Phone: (508) 879-6994
   EMail: volz@process.com
          gonczi@process.com

16. Full Copyright Statement

Copyright (C) The Internet Society (1999). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it or
assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing the
copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of developing
Internet standards in which case the procedures for copyrights defined
in the Internet Standards process must be followed, or as required to
translate it into languages other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT
NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL
NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.

Open Issues

These issues need to be resolved:

   1. We need to deal with the option space, and the procedures for
      managing it. Probably IANA.

   2. Figure out a better way to identify vendors. How about an SNMP
      Enterprise MIB value?

   3. Need more clarity in the conflict resolution section, probably
      backed up by real implementation experience. Learned a lot from
      the UDP implementation and experience with it in the real world,
      and need equivalent learning from a TCP implementation with no
      messages out of order or lost.