idnits 2.17.1 

draft-ietf-dhc-failover-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-25) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack an Authors' Addresses Section.

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 1184 instances of lines with control characters in the
     document.

  ** The abstract seems to contain references ([RFC2131]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     RFC 2119 keyword, line 212: '... Secondary Servers SHOULD be viewed as...'
     RFC 2119 keyword, line 257: '...his	private	pool SHOULD be based only ...'
     RFC 2119 keyword, line 262: '...econdary Servers SHOULD pause normal D...'
     RFC 2119 keyword, line 363: '...   SHOULD ensure that every packet sen...'
     RFC 2119 keyword, line 390: '...   message MUST	have the same transact...'
     (47 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 237 has weird spacing: '...through  redun...'

  == Line 305 has weird spacing: '...ow easy  recog...'

  == Line 422 has weird spacing: '...pproach  as	in...'

  == Line 423 has weird spacing: '... one of  these...'

  == Line 513 has weird spacing: '...  could  just ...'

  == (1 more instance...)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 1999) is 9173 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: 'RFC 2131' on line 42

  == Unused Reference: '2' is defined on line 1986, but no explicit reference
     was found in the text

  == Unused Reference: '3' is defined on line 1991, but no explicit reference
     was found in the text

  == Unused Reference: '4' is defined on line 1994, but no explicit reference
     was found in the text

  == Outdated reference: A later version (-12) exists of
     draft-ietf-dhc-failover-00

  -- Possible downref: Normative reference to a draft: ref. '3' 

  == Outdated reference: A later version (-01) exists of
     draft-ietf-dhc-security-arch-00

  -- Possible downref: Normative reference to a draft: ref. '4' 


     Summary: 14 errors (**), 0 flaws (~~), 12 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Network	Working	Group					     Ralph Droms
2	INTERNET DRAFT					     Bucknell University

4								      Greg Rabil
5								     Mike Dooley
6								      Arun Kapur
7							       Quadritek Systems

9								     Kim Kinnear
10							       American	Internet

12								    Steve Gonczi
13								     Bernie Volz
14								Process	Software

16								     August 1998
17							      Expires March 1999

19				 DHCP Failover Protocol
20			    <draft-ietf-dhc-failover-02.txt>

22	Status of this Memo

24	   This	document is an Internet-Draft. Internet-Drafts are working
25	   documents of	the Internet Engineering Task Force (IETF), its	areas,
26	   and its working groups. Note	that other groups may also distribute
27	   working documents as	Internet-Drafts.

29	   Internet-Drafts are draft documents valid for a maximum of six months
30	   and may be updated, replaced, or obsoleted by other documents at any
31	   time. It is inappropriate to	use Internet-Drafts as reference
32	   material or to cite them other than as ``work in progress.''

34	   To learn the	current	status of any Internet-Draft, please check the
35	   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
36	   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
37	   munnari.oz.au (Pacific Rim),	ftp.ietf.org (US East Coast), or
38	   ftp.isi.edu (US West	Coast).

40	Abstract

42	   DHCP	[RFC 2131] allows for multiple servers to be operating on a
43	   single network. Some	sites are interested in	running	multiple servers
44	   in such a way so as to provide redundancy in	case of	server failure.
45	   In order for	this to	work reliably, the cooperating Primary and
46	   Secondary servers must maintain a consistent	database of the	lease

48	DRAFT							    January 1998

50	   information.	 This implies that servers will	need to	coordinate any
51	   and all lease activity so that this information is synchronized in
52	   case	of failover.

54	   This	document defines a protocol to provide this synchronization
55	   between two servers.	One server is designated the "Primary" server,
56	   the other is	the "Secondary"	server.	Additionally, this document
57	   describes a protocol	for the	automatic transfer of control from the
58	   Primary to the Secondary in the case of failure (failover) or of
59	   a network partition.

61	   This	document is a merge of draft-ietf-dhc-failover-01.txt and
62	   draft-ietf-dhc-safe-failover-proto-00.txt, along with substantial
63	   changes to each.  Unfortunately, this merge was not completed with
64	   sufficient time to allow review by any of the authors of draft-ietf-
65	   dhc-failover-01.txt,	and so it may well not reflect their views even
66	   though their	names appear as	authors.  See Section 11, issue	#1 and
67	   Section 12 for more details.

69	1.  Introduction

71	   As the use of DHCP servers in networked environments	grows, the
72	   dependency of those networks	on the DHCP server increases.  This is
73	   particularly	true of	the hosts that receive their configuration
74	   information from the	DHCP server.  Therefore, it is very important to
75	   be able to provide reliable,	continuous availability	of DHCP	ser-
76	   vices.

78	   This	specification describes	a protocol to support automatic	failover
79	   from	a primary to its secondary server.  The	failover mechanism
80	   allows the secondary	server to perform DHCP actions while the primary
81	   is down, or when a network failure prevents the primary and secondary
82	   from	communicating.	The protocol also specifies how	reintegration is
83	   achieved when the primary again becomes operational or when the pri-
84	   mary	and secondary can again	communicate.

86	   In providing	the specification for the failover, the	protocol speci-
87	   fies	how to guarantee reliable delivery of changes to the secondary.
88	   This	is required to synchronize the secondary's lease data with that
89	   of the primary.  The	protocol further specifies a mechanism to allow
90	   the secondary to determine if it can	communicate with the primary
91	   server.  The	secondary will automatically begin to service DHCP
92	   requests whenever it	cannot communicate with	the primary.  When the
93	   primary server becomes available again, the secondary will convey any
94	   changes that	occurred since the time	of failover back to the	primary.

96	   Through careful control of the difference between the lease times

100	   offered to DHCP clients and the lease time known by the secondary
101	   server, the protocol	allows the primary to communicate with the
102	   secondary after the primary has completed communication with	the DHCP
103	   client (a technique known as	"lazy" update) and still guarantee that
104	   duplicate IP	address	allocations do not occur.  Thus, the protocol
105	   does	not directly impact the	ability	of a DHCP server to respond to
106	   DHCP	client requests.

108	1.1.  Requirements Terminology

110	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
111	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"	in this
112	   document are	to be interpreted as described in RFC 2119 [RFC	2119].

114	1.2.  DHCP Terminology

116	   This	document uses the following terms:

118		o "DHCP	client"	or "client"

120		  A DHCP client	is an Internet host using DHCP to obtain confi-
121		  guration parameters such as a	network	address.

123		o "DHCP	server"	or "server"

125		  A DHCP server	is an Internet host that returns configuration
126		  parameters to	DHCP clients.

128		o "binding"

130		  A binding is a collection of configuration parameters, includ-
131		  ing at least an IP address, associated with or "bound	to" a
132		  DHCP client.	Bindings are managed by	DHCP servers.

134		o "binding database"

136		  The collection of bindings managed by	a primary and secondary.

138		o "subnet address pool"

140		  A subnet address pool is the set of IP addresses associated
141		  with a particular network number and subnet mask.  In
142		  the simple case, there is a single network number and	subnet
143		  mask and a set of IP addresses.  In the more complex case
144		  (sometimes called "secondary subnets", sometimes "super-
145		  scopes"), several (apparently	unrelated) network number and
146		  subnet mask combinations with	their associated IP addresses
149		  may all be configured	together into one subnet address pool.

151		o "primary server" or "primary"

153		  A DHCP server	configured to provide primary service to a set
154		  of DHCP clients for a	particular set of subnet address pools.

156		o "secondary server" or	"secondary"

158		  A DHCP server	configured to act as backup to a primary server
159		  for a	particular set of subnet address pools.

161		o "stable storage"

163		  Every	DHCP server is assumed to have some form of what is
164		  called "stable storage".  Stable storage is used to hold
165		  information concerning IP address bindings (among other
166		  things) so that this information is not lost in the event of a
167		  server failure which requires	restart	of the server.

169	1.3.  Requirements for this protocol

171	   The following list of goals must be (and are) achieved by this proto-
172	   col.

174		1. Implementations of this protocol must work with existing DHCP
175		   client implementations based	on the DHCP protocol [1].

177		2. Implementations of the protocol must	work with existing BOOTP
178		   relay implementations.

180		3. The protocol	must provide failover redundancy between servers
181		   that	are not	located	on the same subnet.

183	1.4.  Goals for	this protocol

185		1. Provide for continued service to DHCP clients through an
186		   automated mechanism in the event of failure of the Primary
187		   Server.

189		2. Avoid binding an IP address to a client while that binding is
190		   currently valid for another client.	In other words,	don't
191		   allocate the	same IP	address	to two clients.

193		3. Minimize any	need for manual	administrative intervention.

197		4. Introduce no	additional delays in server response time as a
198		   result of inter-server communication.

200		5. Share IP address ranges between primary and secondary
201		   servers; i.e., impose no requirement	that the pool of avail-
202		   able	addresses be divided between servers.

204		6. Continue to meet the	goals and objectives of	this protocol in
205		   the event of	server failure or network partition.

207		7. Provide graceful reintegration of full protocol service after
208		   server failure or network partition.

210		8. Allow for one computer to act as a Secondary	Server for mul-
211		   tiple Primary Servers. Other	topologies (e.g.: mesh)	are also
212		   possible.  Primary and Secondary Servers SHOULD be viewed as
213		   "logical" servers and not necessarily physical computers.

215		9. Ensure that an existing client can keep its existing	IP
216		   address binding if it can communicate with either the Primary
217		   or Secondary	DHCP server implementing this protocol - not
218		   just the server that originally offered it the binding.

220		10.Ensure that a new client can	get an IP address from some
221		   server. Ensure that in the face of partition, where servers
222		   continue to run but cannot communicate with each other, the
223		   above goals and requirements	may be met. In addition, when
224		   the partition condition is removed, allow graceful automatic
225		   re-integration without requiring human intervention.

227		11.If either Primary or Secondary Server loses all of the
228		   information that it has stored in stable storage, it should be
229		   able to refresh its stable storage from the other server.

231	1.5.  Limitations of this Protocol

233	   The following are explicit limitations of this protocol.

235		1. Under normal operation, only one server at a time will
236		   service DHCP client requests; this protocol provides reliability
237		   through redundancy but not load balancing.

239		2. This	protocol provides only one level of redundancy through a
240		   single Secondary Server for each Primary Server.

242		3. The protocol	provides a way to detect when the primary and
243		   secondary server cannot communicate,	but once this condition
246		   has been detected, does not (indeed,	cannot)	provide	any way
247		   to further distinguish between network failure and failure of
248		   one of the servers.

250		4. A small number of IP	addresses are reserved for Secondary
251		   Server use.	In order to handle the failure case where both
252		   servers are able to communicate with	DHCP clients, but unable
253		   to communicate with each other, a small number of IP
254		   addresses must be set aside as a private address pool for the
255		   Secondary Server. The Secondary can use these to service
256		   newly arrived DHCP clients during such a period.  The size of
257		   this	private	pool SHOULD be based only on the arrival rate of
258		   new DHCP clients and	the length of expected downtime, and is
259		   not influenced in any way by	the total number of DHCP clients
260		   supported by	the server pair.

262		5. The Primary and Secondary Servers SHOULD pause normal DHCP
263		   transaction processing while	resynchronizing, after a system
264		   failure.
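
   The sizing rule in limitation 4 can be illustrated with a short
   calculation.  This is only a sketch: the function name and the
   safety_factor parameter are assumptions made for illustration and
   are not part of the protocol.

```python
import math


def private_pool_size(arrivals_per_hour: float,
                      expected_downtime_hours: float,
                      safety_factor: float = 1.0) -> int:
    """Estimate the number of addresses the Secondary's private pool needs.

    The estimate depends only on the arrival rate of new DHCP clients and
    the expected downtime, not on the total number of clients served.
    safety_factor is a hypothetical margin chosen by the administrator.
    """
    if arrivals_per_hour < 0 or expected_downtime_hours < 0:
        raise ValueError("rates and durations must be non-negative")
    return math.ceil(arrivals_per_hour * expected_downtime_hours * safety_factor)
```

   For example, 5 new clients per hour and 4 hours of expected downtime
   give an estimate of 20 addresses, whether the server pair supports
   one hundred clients or one hundred thousand.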

266	2.  Protocol Operations

268	   The protocol messages needed to provide redundant/failover servers
269	   can be grouped into three areas:

271		o Messages to keep the Secondary Server's lease	data synchron-
272		  ized with that of the	Primary	so that	when failover occurs,
273		  there	is no degradation of service.

275		o Messages that	allow the Secondary to determine the operational
276		  state	of the Primary,	so as to know when to start servicing
277		  DHCP traffic.

279		o Messages that	are used to coordinate the Primary regaining
280		  control when it has become available again.

282	2.1.  Time synchronization between communicating servers

284	   Each	Binding	update message carries a "sent time stamp" (the	time
285	   when	the message was	sent in	GMT). This provides a simple mechanism
286	   to determine	any "time drift" between communicating servers.

288	   DISCUSSION:

290	      If a UDP packet is successfully transmitted (i.e., it does not
291	      get lost), the packet travel time is negligible in the framework

295	      of  DHCP leases.	By providing a GMT "sent time" stamp, the reci-
296	      pient can	compare	this with its notion of	the current GMT	time at
297	      the time it receives the packet.	The difference (plus the packet
298	      travel time, which we ignore) is the time	drift.	The recipient
299	      can use this time	drift value to bias all	"absolute time"	values
300	      it receives from the sender.
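
   The drift calculation described in this discussion can be sketched
   as follows.  The sign convention (receiver clock minus sender stamp)
   and both function names are assumptions made for illustration; times
   are taken to be seconds in GMT.

```python
def time_drift(sent_time: int, received_time: int) -> int:
    """Drift between the two clocks, ignoring packet travel time.

    sent_time is the "sent time stamp" carried in the message;
    received_time is the recipient's own clock when the packet arrives.
    """
    return received_time - sent_time


def to_local_time(sender_absolute_time: int, drift: int) -> int:
    """Bias an "absolute time" value from the sender onto the local clock."""
    return sender_absolute_time + drift
```

   With this convention, a message stamped 1000 that arrives when the
   local clock reads 1030 yields a drift of 30, and a time of 5000
   reported by the sender corresponds to 5030 on the recipient's clock.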

302	2.2.  Failover Protocol	Messages

304	   The Failover	Protocol messages are encoded using a packet format
305	   specific to the Failover Protocol. To allow easy  recognition of
306	   Failover Protocol messages, BOOTP packet "op" field values  3..14 are
307	   proposed to mark various Failover Protocol messages.	A Failover Pro-
308	   tocol message is always unicast from	the source to the destination.
309	   The sender, never the recipient, is responsible for reliable
310	   retransmission.

312	2.3.  Failover Protocol	packet header format

314	   0		       1		   2		       3
315	   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
316	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
317	   |	 op (1)	   |	 rev (1)   |	    payload offset (2)	   |
318	   +---------------+---------------+---------------+---------------+
319	   |				xid (4)				   |
320	   +---------------------------------------------------------------+
321	   |	     0 or more additional header bytes	(variable)	   |
322	   +---------------------------------------------------------------+
323	   |	       Payload data, formatted as DHCP-style options	   |
324	   |	       (although using a unique	option number space)	   |
325	   |			       (variable)			   |
326	   +---------------------------------------------------------------+

328	   op -	1 byte

330	   These values	extend the number space	of the existing	BOOTP message
331	   type	"Op" field.  The following types are defined:

335	   3		   DHCPPOOLREQ
336	   4		   DHCPPOOLRESP
337	   5		   DHCPBNDUPD
338	   6		   DHCPBNDACK
339	   7		   DHCPPOLL
340	   8		   DHCPPRPL
341	   9		   DHCPCTLREQ
342	   10		   DHCPCTLRET
343	   11		   DHCPCTLACK
344	   12		   DHCPCTLACKACK
345	   13		   DHCPREQUEREQ
346	   14		   DHCPREQUERESP

348	   rev - 1 byte

350	   Failover protocol version supported.	Set to 1 for the Failover Proto-
351	   col described in this draft.

353	   payload offset - 2 bytes, network byte order

355	   The byte offset of the Payload area,	from the beginning of the Fail-
356	   over	packet header. The value for the current protocol version is 8.

358	   xid - 4 bytes, network byte order

360	   The sender of a failover protocol packet is responsible for setting
361	   this	number,	and the	receiver of the	packet copies the number over
362	   into	any response packet.  To the receiver it is opaque.  The sender
363	   SHOULD ensure that every packet sent	to a particular	IP address and
364	   port	combination has	a unique transaction id	unless that packet is a
365	   re-transmission.
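
   As a rough illustration of the header layout above, the following
   sketch packs and unpacks the fixed 8-byte header.  The names
   FailoverOp, pack_header and unpack_packet are hypothetical, and the
   sketch assumes no additional header bytes (payload offset = 8).

```python
import struct
from enum import IntEnum


class FailoverOp(IntEnum):
    """Message types listed for the "op" field in this section."""
    DHCPPOOLREQ = 3
    DHCPPOOLRESP = 4
    DHCPBNDUPD = 5
    DHCPBNDACK = 6
    DHCPPOLL = 7
    DHCPPRPL = 8
    DHCPCTLREQ = 9
    DHCPCTLRET = 10
    DHCPCTLACK = 11
    DHCPCTLACKACK = 12
    DHCPREQUEREQ = 13
    DHCPREQUERESP = 14


# op (1 byte), rev (1 byte), payload offset (2 bytes), xid (4 bytes),
# all in network byte order: 8 bytes in total.
HEADER = struct.Struct("!BBHI")


def pack_header(op: FailoverOp, xid: int, rev: int = 1) -> bytes:
    """Build the fixed header with no additional header bytes."""
    return HEADER.pack(op, rev, HEADER.size, xid)


def unpack_packet(packet: bytes):
    """Return (op, rev, payload_offset, xid, payload) for one packet."""
    if len(packet) < HEADER.size:
        raise ValueError("packet shorter than the fixed header")
    op, rev, offset, xid = HEADER.unpack_from(packet)
    if offset < HEADER.size or offset > len(packet):
        raise ValueError("payload offset out of range")
    # Skipping "offset" bytes also skips any additional header bytes.
    return FailoverOp(op), rev, offset, xid, packet[offset:]
```

   Reading the payload from the stated offset, rather than from a fixed
   position, lets a receiver tolerate additional header bytes it does
   not understand.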

367	2.4.  DHCPPOOLREQ and DHCPPOOLRESP:

369	   Whenever the	Secondary server transitions into NORMAL mode, it first
370	   sends a DHCPPOOLREQ message	to initiate a transfer of a small range
371	   of IP addresses that	will serve as its private address pool.

373	   This	is necessary, because initially	the Secondary server has no such
374	   address pool, and its pool gets depleted when it hands out addresses
375	   in COMMUNICATION-INTERRUPTED	mode. This is why the request is sent
376	   every time the Secondary server transitions into NORMAL mode.  The
377	   DHCPPOOLREQ message does not	carry any payload data.	When the Primary
378	   Server gets a DHCPPOOLREQ message, it computes which	addresses should
379	   be transferred to the Secondary, and	queues up  DHCPBNDUPD transac-
380	   tions, setting the Status of	these bindings to "BACKUP".  Having done
381	   this, it sends a  DHCPPOOLRESP message. The DHCPPOOLRESP message

385	   carries the "Number of addresses transferred" as its	payload.

387	   The Secondary server	keeps sending DHCPPOOLREQ messages until it
388	   receives a  DHCPPOOLRESP with "Number of addresses transferred" = 0,
389	   or it decides that the partner is not responding.  Each one of these
390	   messages MUST have the same transaction ID.  If a new transaction ID
391	   is used in one of these messages, the receiving server will begin the
392	   transmission	of the DHCPBNDUPD messages all over again.  To be clear,
393	   if the Secondary Server receives a  DHCPPOOLRESP message with "Number
394	   of addresses	transferred" > 0, it MUST send another DHCPPOOLREQ mes-
395	   sage. This mechanism	makes it possible for the Primary Server to pace
396	   the transfer (e.g., it could generate all addresses at once, or
397	   one-by-one).

399	   The Primary Server must respond to each DHCPPOOLREQ message it
400	   receives. If	it has already generated all private addresses,	or it
401	   has no available addresses, it MUST send  DHCPPOOLRESP with "Number
402	   of addresses	transferred" = 0.
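
   The request loop run by the Secondary can be sketched as below.  The
   send_poolreq callable stands in for the real network exchange and is
   hypothetical, as is max_timeouts: the draft does not say how a server
   decides that its partner is not responding.

```python
from typing import Callable, Optional


def request_private_pool(send_poolreq: Callable[[int], Optional[int]],
                         xid: int, max_timeouts: int = 3) -> bool:
    """Repeat DHCPPOOLREQ until the Primary reports 0 addresses transferred.

    send_poolreq(xid) sends one DHCPPOOLREQ and returns the "Number of
    addresses transferred" from the DHCPPOOLRESP, or None on timeout.
    Returns True when the transfer is complete, False when the partner
    is judged unresponsive.
    """
    timeouts = 0
    while timeouts < max_timeouts:
        transferred = send_poolreq(xid)  # the same xid on every request
        if transferred is None:
            timeouts += 1
            continue
        timeouts = 0
        if transferred == 0:
            return True
        # transferred > 0: another DHCPPOOLREQ MUST be sent.
    return False
```

   Reusing one xid for the whole exchange matters because a new
   transaction ID would cause the Primary to begin transmitting the
   DHCPBNDUPD messages all over again.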

404	2.5.  DHCPREQUEREQ and DHCPREQUERESP:

406	   Whenever either server wishes to be updated with the information that
407	   the other server knows and has not yet transmitted to it, it sends a
408	   DHCPREQUEREQ.

410	   The DHCPREQUEREQ message does not carry any payload data. When
411	   either server gets a DHCPREQUEREQ message, it computes which updates
412	   should be transferred to the Secondary, and queues up DHCPBNDUPD
413	   transactions as appropriate.  Having done this, it sends a
414	   DHCPREQUERESP message. The DHCPREQUERESP message carries the "Number
415	   of addresses queued up" as its payload. The set of binding updates
416	   queued up will depend on the requesting server's state. (The state
417	   has already been communicated via prior DHCPPOLL/DHCPPRPL messages.)

419	   The Secondary server keeps sending DHCPREQUEREQ messages until it
420	   receives a DHCPREQUERESP with "Number of addresses queued up" = 0,
421	   or it decides that the partner is not responding.  This is the same
422	   approach as is used for the DHCPPOOLREQ/DHCPPOOLRESP messages.  Each
423	   one of these DHCPREQUEREQ messages MUST have the same transaction ID.
424	   Use of a new transaction ID will cause re-building of the outgoing
425	   binding update queue.

427	   The Primary Server must respond to each DHCPREQUEREQ message it
428	   receives. If it has already queued up all of the previously unsent
429	   binding updates, then it MUST send DHCPREQUERESP with "Number of
430	   addresses queued up" = 0.

434	2.6.  DHCPBNDUPD

436	   The Primary notifies the Secondary (or the other way around) of a
437	   change in binding state and data.

439	   In response to a binding update, the	recipient server MUST respond
440	   with	a  DHCPBNDACK message.	Multiple binding updates can be	batched
441	   up, and sent	in one Failover	Protocol message.

443	2.7.  DHCPBNDACK

445	   This message implements a positive or negative acknowledgement of
446	   one or more binding updates.

448	   A binding update (or a batch of binding updates sent as one message)
449	   is matched up with its associated acknowledgment by having the
450	   same xid field value in the message header.

452	   The server sending a DHCPBNDACK message MAY include any of the
453	   options that are acceptable in a DHCPBNDUPD message in the
454	   DHCPBNDACK message returned to the sender.  If any of this
455	   information differs from the information in the DHCPBNDUPD message,
456	   the receiver SHOULD update its bindings database with that
457	   information upon receipt of the DHCPBNDACK message.

459	   The DHCPBNDACK MAY selectively reject one or	more updates by	includ-
460	   ing one or more IP address -	Reject Reason option pairs in the mes-
461	   sage	body.

463	   The DHCPBNDACK implicitly acknowledges any binding updates it replies
464	   to, except those it enumerates using	Reject Reason Codes.
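
   One way to picture the acknowledgment rules above is the following
   sketch, in which the sender of the updates keeps its outstanding
   batches keyed by xid.  The data structures and the function name are
   assumptions made for illustration; the reject reason values are
   placeholders, since this section does not define them.

```python
from typing import Dict, List, Tuple


def apply_bndack(pending: Dict[int, List[str]], xid: int,
                 rejects: Dict[str, int]) -> Tuple[List[str], Dict[str, int]]:
    """Resolve one DHCPBNDACK against the batch sent with the same xid.

    pending maps xid -> IP addresses in the outstanding DHCPBNDUPD batch.
    rejects maps IP address -> reject reason, as listed in the ack.
    Returns (implicitly acknowledged addresses, rejected addresses).
    """
    batch = pending.pop(xid, None)
    if batch is None:
        # Unknown xid or duplicate ack: nothing is outstanding for it.
        return [], {}
    accepted = [ip for ip in batch if ip not in rejects]
    rejected = {ip: rejects[ip] for ip in batch if ip in rejects}
    return accepted, rejected
```

   Every update in the batch that is not explicitly enumerated with a
   reject reason is treated as acknowledged, which mirrors the implicit
   acknowledgment rule.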

466	2.8.  DHCPPOLL

468	   In order to determine the state of a given server, or to communicate
469	   a critical change in its own status, a participant can use the
470	   DHCPPOLL message.

472	   This message inquires about the current state of the recipient, and
473	   tells the recipient what state the sender is in.

475	   After sending a DHCPPOLL message, the participant will listen for
476	   a DHCPPRPL message.

480	2.9.  DHCPPRPL

482	   This	message	replies	to the DHCPPOLL	message	(PRPL=Poll reply). The
483	   DHCPPRPL also carries server	status information (see	message	payload
484	   details below).

486	   After a failover, when the Primary Server is	restarted, the following
487	   messages are	used to	coordinate the Primary taking control back from
488	   the Secondary:

490	   DHCPCTLREQ	  - Request for	control
491	   DHCPCTLRET	  - Return of control initiated
492	   DHCPCTLACK	  - Return of control completed
493	   DHCPCTLACKACK  - Return of control completed	message	acknowledged.

495	   The Primary Server sends a DHCPCTLREQ message, indicating that it
496	   would like to take control of the bindings database.	 The Secondary
497	   Server replies with a DHCPCTLRET message, which serves as a signal to
498	   the Primary "Stand by to receive binding updates".  This message then
499	   is followed by a set	of binding updates from	the secondary to the
500	   primary.  When all updates have been	transmitted (and acknowledged)
501	   from	Secondary to Primary,  a DHCPCTLACK message is sent from the
502	   Secondary to	the Primary, to	signal that "all updates from the Secon-
503	   dary	are now	completed".

505	   DISCUSSION:

507	      Note that the DHCPCTLACK message type must be transmitted
508	      reliably, as the Primary Server will not start servicing clients
509	      until it has received the DHCPCTLACK message.  To provide this
510	      reliability, the DHCPCTLACKACK message is provided. This provides
511	      an acknowledgment of the DHCPCTLACK message, and the DHCPCTLACK
512	      message will be periodically re-sent until it is acknowledged.  We
513	      could just periodically re-send the DHCPCTLACK message until we
514	      start receiving binding updates from the Primary, but the Primary
515	      may not have any updates to send at all, hence the need for an
516	      explicit DHCPCTLACKACK message.

518	   The Primary Server transitions into NORMAL state upon receiving a
519	   DHCPCTLACK from the secondary, when the secondary has completed send-
520	   ing all of its updates during synchronization. The  DHCPCTLACKACK
521	   message is needed to	prevent	the primary from waiting and not servic-
522	   ing clients if the DHCPCTLACK message got lost.  The	Secondary server
523	   will	keep re-sending	the DHCPCTLACK message,	until:

525		1. It decides that the primary is not responding, so the
526		   Secondary server goes into COMMUNICATION-INTERRUPTED mode.

530		2. It receives a DHCPCTLACKACK or a DHCPBNDUPD message from the
531		   primary.  The Primary's DHCPBNDUPD messages would start
532		   arriving at the Secondary server, if	the Primary did	get the
533		   DHCPCTLACK, but the DHCPCTLACKACK message got lost.
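
   The two stop conditions for re-sending DHCPCTLACK can be sketched as
   a small loop over incoming events.  The event strings, the function
   name and the max_resends limit are assumptions made for illustration;
   the draft does not specify how the Secondary decides that the primary
   is not responding.

```python
from typing import Iterable, Tuple


def secondary_after_ctlack(events: Iterable[str],
                           max_resends: int = 5) -> Tuple[str, int]:
    """Model the Secondary after it has sent its first DHCPCTLACK.

    events yields "DHCPCTLACKACK", "DHCPBNDUPD", or anything else, which
    is treated as a retransmission timeout.  Returns (outcome, resends).
    """
    resends = 0
    for event in events:
        if event in ("DHCPCTLACKACK", "DHCPBNDUPD"):
            # Either message shows that the Primary received DHCPCTLACK.
            return "DONE", resends
        resends += 1  # timeout: re-send DHCPCTLACK
        if resends >= max_resends:
            return "COMMUNICATION-INTERRUPTED", resends
    return "WAITING", resends
```

   A DHCPBNDUPD is accepted as proof of receipt because the Primary only
   starts sending binding updates once it has received the DHCPCTLACK,
   even if its DHCPCTLACKACK was lost.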

535	3.  Protocol Payload Data Format

537	   Payload data is encoded as a set of flexible DHCP/BOOTP style
538	   options (the usual 1 byte option code, 1 byte length, and "length"
539	   bytes of data).  The options are placed after the header, after
540	   skipping PayloadOffset bytes. The payload data options are not
541	   preceded by a "cookie" value.

543	   Since the packet is NOT a DHCP/BOOTP	protocol packet,  the options
544	   used	here do	not conflict with any existing "proper"	DHCP/BOOTP
545	   options.  In	fact, these options are	allocated in relationship to the
546	   DHCP	option space in	the following way.  In cases where the syntax
547	   and semantics of a Failover Payload Option is identical to that of a
548	   DHCP/BOOTP option, the same option number is used.  For
549	   options unique to the Failover protocol, option numbers starting at
550	   230 are used.

552	   Thus, all new Failover Protocol option numbers are assigned from a
553	   continuous range beginning with 230.	 This number is	shown as an X in
554	   the tables below.

556	   The protocol	is permissive in allowing various other	DHCP options in
557	   binding updates.  As	long as	the sender wishes to use an option, it
558	   MAY include it. On the other	hand, the recipient MUST ignore	any
559	   option it is	not expecting.

561	   Multiple DHCPBNDUPD transactions can be batched together in one UDP
562	   packet. The option set for each individual transaction MUST always
563	   begin with the IP address (Option 50). This is the only restriction
564	   on payload item ordering. In any other case, payload data items can
565	   be included in any desired order.
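
   The option encoding and the batching rule can be illustrated with
   the sketch below.  The function names are hypothetical, and the
   sketch assumes that every option carries a length byte (it makes no
   special case for DHCP's pad and end options, which this section does
   not discuss).

```python
from typing import List, Tuple

Option = Tuple[int, bytes]


def encode_option(code: int, data: bytes) -> bytes:
    """Encode one option: 1 byte code, 1 byte length, then the data."""
    if not 0 <= code <= 255 or len(data) > 255:
        raise ValueError("code and length must each fit in one byte")
    return bytes([code, len(data)]) + data


def decode_options(payload: bytes) -> List[Option]:
    """Decode the payload area into a list of (code, data) pairs."""
    options, i = [], 0
    while i < len(payload):
        if i + 2 > len(payload):
            raise ValueError("truncated option header")
        code, length = payload[i], payload[i + 1]
        data = payload[i + 2:i + 2 + length]
        if len(data) != length:
            raise ValueError("truncated option data")
        options.append((code, data))
        i += 2 + length
    return options


def split_transactions(options: List[Option]) -> List[List[Option]]:
    """Group options into transactions, each beginning at Option 50."""
    transactions: List[List[Option]] = []
    for code, data in options:
        if code == 50:
            transactions.append([(code, data)])
        elif transactions:
            transactions[-1].append((code, data))
        else:
            raise ValueError("first option of a batch must be Option 50")
    return transactions
```

   Because Option 50 is the only ordering constraint, a receiver can
   find transaction boundaries without knowing which other options a
   sender chose to include.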

567	   In case an implementation chooses to	use the	DHCPBNDNAK mechanism,
568	   the DHCPBNDNAK message SHOULD contain one or	more Option 50s	from the
569	   NAK-ed message, to indicate which specific update items are being
570	   NAK-ed.

572	   While the synchronization is	in progress, the secondary MUST	NOT
573	   accept client requests, and the primary MUST	NOT send any updates to
574	   the secondary. This is necessary to allow the Primary to be the sole
575	   arbitrator of any conflicting updates.

577	DRAFT							    January 1998

579	3.1.  DHCP Server Status

581	   This	option is used to convey the current state of a	server.

583	    Code  Len  Type
584	   +--+---+------+
585	   | X|	1 | 1-15 |
586	   +--+---+------+

588	   Allowed values for this option:

590	   Value State                  Meaning
591	   ----- -----                  -------
592	   1     UNKNOWN-STATE
593	   2     PRIMARY-NORMAL         Normal state
594	   3     BACKUP-NORMAL
595	   4     PRIMARY-COMINT         Communication interrupted (safe)
596	   5     BACKUP-COMINT
597	   6     PRIMARY-PARTNERDOWN    Partner down (unsafe mode)
598	   7     BACKUP-PARTNERDOWN
599	   8     PRIMARY-CONFLICT       Synchronizing, after a "Partner-Down"
600	                                divergence
601	   9     PRIMARY-SYNC           Synchronizing, after a
602	                                "communications-interrupted" divergence
603	   10    BACKUP-SYNC
604	   11    PRIMARY-RECOVER        Recovering ALL bindings from partner
605	   12    BACKUP-RECOVER
606	   13    FAILOVER-DISABLED      The server is running with the failover
607	                                protocol disabled (standalone)
608	   14    SERVER-PAUSED          The server is inactive, shutting down
609	                                for a short period
610	   15    SERVER-SHUTDOWN        The server is inactive, shutting down
611	                                for an extended period

621	   When	a server is being re-started, it should	send a DHCPPOLL	message
622	   to its partner, reporting its status	(SERVER-PAUSED).  In response,
623	   the recipient SHOULD	go into	COMMUNICATION-INTERRUPTED mode.

627	   When	a server is being shut down,  it should	send a DHCPPOLL	message
628	   to its partner, reporting its status	(SERVER-SHUTDOWN).

630	   In response,	the recipient SHOULD go	into PARTNER-DOWN mode.
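
   As an illustration of the status values and of the two reactions
   described above, the following sketch may help.  The enum and
   function names are hypothetical, and the option code 230 assumes
   that X is the first number of the Failover range, as stated in
   Section 3.

```python
from enum import IntEnum
from typing import Optional

OPT_SERVER_STATUS = 230  # "X": assumed to be the first Failover option number


class ServerStatus(IntEnum):
    UNKNOWN_STATE = 1
    PRIMARY_NORMAL = 2
    BACKUP_NORMAL = 3
    PRIMARY_COMINT = 4
    BACKUP_COMINT = 5
    PRIMARY_PARTNERDOWN = 6
    BACKUP_PARTNERDOWN = 7
    PRIMARY_CONFLICT = 8
    PRIMARY_SYNC = 9
    BACKUP_SYNC = 10
    PRIMARY_RECOVER = 11
    BACKUP_RECOVER = 12
    FAILOVER_DISABLED = 13
    SERVER_PAUSED = 14
    SERVER_SHUTDOWN = 15


def encode_server_status(status: ServerStatus) -> bytes:
    """Build the 3-byte option: code, length 1, status value."""
    return bytes([OPT_SERVER_STATUS, 1, int(status)])


def recommended_mode(partner_status: ServerStatus) -> Optional[str]:
    """Mode the recipient SHOULD enter for the two announcements above.

    Returns None for every other status; those cases are governed by the
    state machines elsewhere in the document, not by this sketch.
    """
    if partner_status == ServerStatus.SERVER_PAUSED:
        return "COMMUNICATION-INTERRUPTED"
    if partner_status == ServerStatus.SERVER_SHUTDOWN:
        return "PARTNER-DOWN"
    return None
```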

632	3.2.  DHCP Binding Status

634	   This	option is used to convey the current state of a	binding. This
635	   option is mandatory for DHCPBNDUPD messages.

637	   Code	  Len  Type
638	   +-----+-----+-----+
639	   | X+1 |  1  | 1-7 |
640	   +-----+-----+-----+

642	   Legal values	for this option	are:

644	   Value   State          Meaning
645	   -----   -----          -------
646	   1       FREE           The lease has never been used
647	   2       ACTIVE         Assigned to a client *
648	   3       EXPIRED
649	   4       RELEASED       A client released the lease
650	   5       ABANDONED      A server or client flagged the address
651	                          as not usable
652	   6       RESET          Lease was freed by some external agent
653	   7       BACKUP         Lease is set aside for the Secondary
654	                          server's private address pool

657	3.3.  Assigned IP address

659	   Uses	identical code and format to DHCP Option 50 (requested IP
660	   address).

662	   Code	  Len	       Address
663	   +-----+-----+-----+-----+-----+-----+
664	   |  50 |  4  |  a1 |	a2 |  a3 |  a4 |
665	   +-----+-----+-----+-----+-----+-----+

669	3.4.  Lease grant time

671	   This option carries an absolute GMT time value.  An absolute value
672	   is usable because the Sent Time Stamp option has already synchronized
673	   time between the source and the target server.  The value is seconds
674	   since Jan 1, 1970 (i.e., the ANSI C time_t representation).

676	   Code	  Len		Time
677	   +------+-----+-----+-----+-----+-----+
678	   | X+2  |  4	|  t1 |	 t2 |  t3 |  t4	|
679	   +------+-----+-----+-----+-----+-----+

681	3.5.  Sent Time	Stamp

683	   A time stamp, in GMT, of when the packet was sent.  It is used to
684	   determine the time drift between the sender and the recipient.  The
685	   time drift is defined as the difference between "Arrive Time (GMT)"
686	   and "Send Time (GMT)".  The actual packet travel time is assumed to
687	   be negligible in this context.  All Date-Time values contained in
688	   Failover messages will be corrected by the time drift before being
689	   stored by the recipient.

691	   Code	  Len		Time
692	   +-----+-----+-----+-----+-----+-----+
693	   | X+3 |  4  |  t1 |	t2 |  t3 |  t4 |
694	   +-----+-----+-----+-----+-----+-----+

696	   The time is a 32 bit	unsigned long in network byte order, in	units of
697	   seconds (GMT	since EPOCH).
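
   A minimal sketch of the time stamp encoding and drift correction
   follows.  The function names are hypothetical, the option code 233
   assumes X = 230, and the sign convention (drift = arrive - send,
   added to each received time value) is one reading of the text above
   rather than a stated rule.

```python
import struct

OPT_SENT_TIME_STAMP = 233  # X+3, assuming X = 230


def encode_sent_time(sent_time: int) -> bytes:
    """Code, length 4, then a 32 bit unsigned value in network byte order."""
    return struct.pack("!BBI", OPT_SENT_TIME_STAMP, 4, sent_time)


def decode_time(data: bytes) -> int:
    """Decode the 4 data bytes of a time option."""
    (value,) = struct.unpack("!I", data)
    return value


def time_drift(arrive_time: int, send_time: int) -> int:
    """Drift between the recipient's and the sender's clocks, in seconds."""
    return arrive_time - send_time


def correct_time(remote_time: int, drift: int) -> int:
    """Express a time value from the sender in the recipient's clock."""
    return remote_time + drift
```

   For example, a packet stamped 1000 that arrives at local time 1030
   gives a drift of 30 seconds, so a lease grant time of 5000 carried
   in that packet would be stored as 5030.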

699	3.6.  Number of	addresses transferred to Secondary Server

701	   A 32 bit unsigned long in network byte order.  Reports the number of
702	   addresses transferred by the Primary to the Secondary Server
703	   (addresses to be used for the Secondary Server's private address
704	   pool).

708	   Code   Len         Count
709	   +-----+-----+-----+-----+-----+-----+
710	   | X+4 |  4  |  n1 |  n2 |  n3 |  n4 |
711	   +-----+-----+-----+-----+-----+-----+

713	3.7.  Lease Duration

715	   Uses the format and code of the standard DHCP IP Address Lease Time
716	   option.  It is used in exactly the same way as in the DHCPOFFER
717	   message of the DHCP protocol.  The time is in units of seconds, and
718	   is specified as a 32-bit unsigned integer.  A Lease Duration of
719	   0xFFFFFFFF indicates an infinite lease.

721	   Code	  Len	      Lease Time
722	   +-----+-----+-----+-----+-----+-----+
723	   |  51 |  4  |  t1 |	t2 |  t3 |  t4 |
724	   +-----+-----+-----+-----+-----+-----+

726	3.8.  Client Identifier

728	   The format, code and	conventions used are identical to DHCP option
729	   61.

731	   Code	  Len	Type  Client-Identifier
732	   +-----+-----+-----+-----+-----+---
733	   |  61 |  n  |  t1 |	i1 |  i2 | ...
734	   +-----+-----+-----+-----+-----+---

736	3.9.  Client Hardware Address

738	   The format is similar to DHCP option 61.  T1 (type) MUST be set to
739	   the proper ARP hardware address code (it MUST NOT be zero).  TBD:
740	   Reference the ARP document here.

744	   Code   Len   Type  Hardware-Address
745	   +-----+-----+-----+-----+-----+---
746	   | X+5 |  n  |  t1 |	i1 |  i2 | ...
747	   +-----+-----+-----+-----+-----+---

749	   Either Client Id, Client Hardware Address or	BOTH MAY be present in
750	   binding update transactions.	At least one of	them MUST be present.
751	   If both are present,	the Client Id MUST be used to uniquely identify
752	   the owner of	the binding (exactly as	in RFC 2131).
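
   The precedence rule above could be applied as in the following
   sketch.  The function name and the dictionary representation of a
   transaction's options are hypothetical, and the option code 235
   assumes X = 230.

```python
from typing import Dict, Tuple

OPT_CLIENT_ID = 61
OPT_CLIENT_HW_ADDR = 235  # X+5, assuming X = 230


def binding_owner(options: Dict[int, bytes]) -> Tuple[str, bytes]:
    """Pick the value that identifies the owner of a binding.

    `options` maps option codes to data bytes for one transaction.
    """
    client_id = options.get(OPT_CLIENT_ID)
    hw_addr = options.get(OPT_CLIENT_HW_ADDR)
    if hw_addr is not None and (len(hw_addr) == 0 or hw_addr[0] == 0):
        # The first data byte is the ARP hardware type; zero is not allowed.
        raise ValueError("hardware address type must not be zero")
    if client_id is not None:
        return ("client-id", client_id)  # Client Id wins when both exist
    if hw_addr is not None:
        return ("hw-address", hw_addr)
    raise ValueError("neither Client Id nor Client Hardware Address present")
```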

754	3.10.  Host Name

756	   Uses	the format and code of DHCP option 12.

758	   Code	  Len		      Host Name
759	   +-----+-----+-----+-----+-----+-----+-----+-----+--
760	   |  12 |  n  |  h1 |	h2 |  h3 |  h4 |  h5 |	h6 |  ...
761	   +-----+-----+-----+-----+-----+-----+-----+-----+--

763	3.11.  Domain Name

765	   Uses	the format and code of DHCP option 15.

767	   Code	  Len	Domain Name
768	   +-----+-----+-----+-----+-----+-----+--
769	   |  15 |  n  |  d1 |	d2 |  d3 |  d4 |  ...
770	   +-----+-----+-----+-----+-----+-----+--

772	3.12.  Reject Reason Code

774	   This option is used to selectively reject binding updates.  It MAY be
775	   used in a DHCPBNDACK message, always following an Option 50 (which
776	   contains the IP address of the specific update being rejected).

780	   Code   Len   Reason code
781	   +-----+-----+-----+
782	   | X+6 |  1  |  R1 |
783	   +-----+-----+-----+

785	   Reason codes	:

787	   1 Illegal IP	address	(not part of any address pool)
788	   2 Fatal conflict exists: address in use by other client.

790	3.13.  MDLI

792	   Maximum Delta Lease Interval, in seconds.  A 32 bit integer value,
793	   in network byte order.

795	   Code	  Len		Time
796	   +------+-----+-----+-----+-----+-----+
797	   | X+7  |  4	|  t1 |	 t2 |  t3 |  t4	|
798	   +------+-----+-----+-----+-----+-----+

800	4.  Exchange of	control	between	Primary	and Secondary

802	   The Primary and Secondary Servers coordinate the exchange of control
803	   over the bindings database through the use of DHCPPOLL and DHCPCTLREQ
804	   messages.  In normal operation:

806	   The Primary sends notification of each change to its	bindings data-
807	   base	to the Secondary, and the Secondary keeps its bindings database
808	   synchronized	with the Primary's database.

810	   The Secondary periodically sends DHCPPOLL messages to the Primary,
811	   and the Primary responds to each DHCPPOLL message with a DHCPPRPL
812	   message. If the Secondary does not receive a	DHCPPRPL response mes-
813	   sage, the Secondary takes control of	the bindings database and begins
814	   answering requests from DHCP	clients.  Note that the	Secondary should
815	   be able to be configured to not perform the automatic switch-over.

817	   The conditions under	which a	Secondary takes	control	of the bindings
818	   database, e.g., the number of consecutive missing acknowledgments,
819	   should be configurable in the Secondary by the DHCP administrator.
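
   One possible takeover condition, the number of consecutive
   unanswered DHCPPOLL messages, can be sketched as follows.  The class
   and parameter names are hypothetical; the protocol only requires
   that the conditions be configurable and that automatic switch-over
   can be turned off.

```python
class PollMonitor:
    """Track DHCPPOLL outcomes on the Secondary (illustrative only)."""

    def __init__(self, max_missed: int, auto_takeover: bool = True) -> None:
        self.max_missed = max_missed        # administrator-configured limit
        self.auto_takeover = auto_takeover  # False disables switch-over
        self.missed = 0

    def on_poll_result(self, got_reply: bool) -> bool:
        """Record one poll; return True if the Secondary should take control."""
        if got_reply:
            self.missed = 0  # a DHCPPRPL arrived, so the count starts over
            return False
        self.missed += 1
        return self.auto_takeover and self.missed >= self.max_missed
```

   With max_missed set to 3, a single DHCPPRPL in the middle of a run
   of losses resets the count, so three further consecutive losses are
   needed before takeover.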

823	   The Secondary records any changes it	makes to the bindings database
824	   while it has	control. The Secondary continues to send DHCPPOLL mes-
825	   sages to the	Primary.  The DHCPPOLL messages	also carry information
826	   on the state	of the Secondary Server.

828	   To regain control of	the bindings database, e.g., after the Primary
829	   Server has recovered	from a failure,	or a partitioned network condi-
830	   tion, the Primary sends a DHCPCTLREQ	message	to the Secondary.  The
831	   Secondary stops answering DHCP client requests, and responds	to its
832	   Primary with	a DHCPCTLRET message.  After sending the DHCPCTLRET mes-
833	   sage, the Secondary sends DHCPBNDUPD	messages for each of the changes
834	   it has made to the bindings database.

836	   The Primary sends a DHCPBNDACK for each DHCPBNDUPD message it
837	   receives.  The Secondary completes the transfer of control by sending
838	   a DHCPCTLACK	message	to the Primary as soon as all of its updates
839	   were	acknowledged.

841	   Note that the Primary SHOULD NOT send any DHCPBNDUPD messages while
842	   synchronization is in progress with the Secondary.

844	   Once the synchronization is completed, the Primary transitions into
845	   NORMAL state and starts sending DHCPBNDUPD transactions for any
846	   accumulated binding changes it may have.
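
   The hand-back of control described in this section can be laid out
   as a message sequence.  The sketch below is illustrative only: the
   function name is hypothetical, and pairing each DHCPBNDUPD directly
   with its DHCPBNDACK is a simplifying assumption, since updates may
   also be batched.

```python
from typing import Iterable, Iterator, Tuple


def control_return_messages(
    secondary_changes: Iterable[str],
) -> Iterator[Tuple[str, str]]:
    """Yield (sender, message) pairs for the Primary regaining control."""
    yield ("Primary", "DHCPCTLREQ")    # Primary asks for control back
    yield ("Secondary", "DHCPCTLRET")  # Secondary stops serving clients
    for change in secondary_changes:
        yield ("Secondary", "DHCPBNDUPD " + change)
        yield ("Primary", "DHCPBNDACK " + change)
    yield ("Secondary", "DHCPCTLACK")  # all updates acknowledged
```

   Calling list(control_return_messages(["10.0.0.5"])) produces the
   five messages exchanged when the Secondary made a single change
   while it had control.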

848	5.  Duplicate address assignment scenarios

850	   In the following two scenarios, the protocol could end up allocating
851	   duplicate IP addresses, unless the measures recommended in Section 6
852	   are taken:

854	   Primary Server crash before "lazy" update: In the case where the
855	   Primary Server sends an ACK to a client for a newly allocated IP
856	   address and then crashes prior to sending the corresponding update to
857	   the Secondary Server, the Secondary Server will have no record of the
858	   IP address allocation.  When the Secondary Server takes over, it may
859	   well try to allocate that IP address to a different client.  In the
860	   case where the first client to receive the IP address is not on the
861	   net at the time (while there is still time to run on its lease), an
862	   ICMP echo (i.e., ping) will not prevent the Secondary Server from
863	   allocating that IP address to a different client.

865	   A more likely and subtle version of this problem is where the Primary
866	   Server crashes after	extending a client's lease time, and before
867	   updating the	Secondary with a new time using	a lazy update. After the
868	   Secondary takes over, if the	client is not connected	to the network
869	   the Secondary will believe the client's lease has expired when, in
870	   fact, it has not.  In this case as well, the IP address might be
874	   reallocated to a different client while the first client is still
875	   using it.

877	   Network partition where servers can't communicate but each can talk
878	   to clients: Several conditions are required for this	situation to
879	   occur. First, due to	a network failure, the Primary and Secondary
880	   Servers cannot communicate.	As well, some of the DHCP clients must
881	   be able to communicate with the Primary Server, and some of the
882	   clients must	now only be able to communicate	with the Secondary
883	   Server.  When this condition	occurs,	both Primary and Secondary
884	   Servers could attempt to allocate IP	addresses for new clients from
885	   the same pool of available addresses. At some point,	then, two
886	   clients will	end up being allocated the same	IP address. This will
887	   cause potentially serious problems when the network failure that
888	   created this	situation is corrected.

890	   The next section details how	the Failover Protocol prevents either of
891	   the above scenarios (and other related scenarios) from causing dupli-
892	   cate	IP address allocation.

894	6.  Duplicate Address Assignment Control

896	   There are several ways that the Failover protocol avoids the	possi-
897	   bility of duplicate address assignment.

899	6.1.  Control of lease time

901	   The key problem with	lazy update is that when the primary server
902	   fails after updating	a client with a	particular lease time and before
903	   updating the	secondary server, the secondary	server will believe that
904	   a lease has expired even though the client still retains a valid
905	   lease on that IP address.

907	   In order to handle this problem, a period of time known as the
908	   "maximum delta lease interval" (MDLI) is defined and must be known to
909	   both the primary and secondary servers.  Proper use of this time
910	   interval places an upper bound on the difference allowed between the
911	   lease time provided to a DHCP client and the lease time known by the
912	   secondary server.  So that the MDLI does not become the maximum lease
913	   time that the primary can ever provide to a client, during a lazy
914	   update the primary typically updates the secondary with a lease time
915	   that is longer than the lease time previously given to the client.

918	   In the case where the secondary needs to take over from the primary,
919	   the secondary will not reallocate any IP addresses from one client to
920	   a different client.  When transitioning to the PARTNER-DOWN state
921	   (where the secondary is allowed to reallocate IP addresses), the
925	   secondary will wait the maximum-delta-lease-interval before
926	   completing the state transition.  Thus, any clients which have a
927	   lease on an IP address with a lease time greater than that known by
928	   the secondary will either have contacted the secondary during that
929	   time or their lease will have expired.

931	   This	protocol requires a DHCP server	to deal	with several different
932	   lease intervals and places specific restrictions on their relation-
933	   ships. The purpose of these restrictions is to allow	the other server
934	   in the pair to be able to make certain assumptions in the absence of
935	   an ability to communicate between servers.

937	   The different lease times are:

939		o desired client lease interval

941		  The desired client lease interval is the lease interval that
942		  the DHCP server would	like to	give to	the DHCP client	in the
943		  absence of any restrictions imposed by the Failover Protocol.
944		  Its determination is outside of the scope of this protocol.
945		  Typically this is the	result of external configuration of a
946		  DHCP server.

948		o actual client	lease interval

950	          The actual client lease interval is the lease interval that
951	          the DHCP server gives out to the DHCP client.  It may be
952	          shorter than the desired client lease interval (as explained
953	          below).

955	        o Primary Server lease interval

957	          The Primary Server lease interval is the interval after which
958	          the Primary Server believes that the DHCP client's lease will
959	          expire.

961	        o desired Secondary Server lease interval

963	          The desired Secondary Server lease interval is the interval,
964	          as told by the Primary Server to the Secondary Server, after
965	          which the lease will expire.

967		o acknowledged Secondary Server	lease interval

969	          The acknowledged Secondary Server lease interval is the
970	          interval the Secondary Server has most recently acknowledged.
971	          The key restriction (and guarantee) that the Primary Server
972	          makes with respect to lease intervals is that the actual
975	          client lease interval never exceeds the acknowledged Secondary
976	          Server lease interval (if any) by more than a fixed amount.
977	          This fixed amount is called the "maximum delta lease interval"
978	          (MDLI).

980	   The MDLI MAY	be configurable, but for correct server	operation it
981	   MUST	be known to both the Primary and Secondary Servers.

983	   The Primary Server MUST record in its state both the	Primary	Server
984	   lease interval and the most recently	acknowledged Secondary Server
985	   lease interval. It is assumed that the desired client lease interval
986	   can be determined through techniques	outside	of the scope of	this
987	   protocol.

989	   The above lease time descriptions are written for the case where the
990	   Primary server is operating and in communication with the Secondary
991	   server.  In the case where the Secondary server is operating out of
992	   communication with the Primary server, the relationships must hold in
993	   the other direction.

995	   The fundamental relationship	among these times which	MUST be	main-
996	   tained is:

998	       actual client lease interval <
999	       ( acknowledged other server lease interval + MDLI )

1001	   The "acknowledged other server lease	interval" is the acknowledged
1002	   secondary server lease interval for the Primary server, and it would
1003	   be the acknowledged primary server lease interval for the Secondary
1004	   server when it is operating out of contact with the Primary server.
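
   The fundamental relationship can be checked, and a desired lease
   capped to satisfy it, as in the sketch below.  The function names
   are hypothetical, all intervals are assumed to be whole seconds, and
   the strict "<" of the formula above is followed literally, so the
   largest permitted lease is one second below the bound.

```python
def relationship_holds(actual: int, acked_other: int, mdli: int) -> bool:
    """True if actual client lease interval < acknowledged interval + MDLI."""
    return actual < acked_other + mdli


def cap_actual_lease(desired: int, acked_other: int, mdli: int) -> int:
    """Largest lease not above `desired` that keeps the relationship.

    `acked_other` is the acknowledged other server lease interval that
    remains at the time of the grant (zero if nothing was acknowledged).
    """
    return min(desired, acked_other + mdli - 1)
```

   With an MDLI of 3600 seconds and nothing acknowledged yet, a desired
   lease of three days is capped at 3599 seconds.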

1006	   DISCUSSION:

1008	      This protocol mandates no particular detailed algorithms
1009	      concerning these lease intervals, as long as the above fundamental
1010	      relationship is preserved.

1012	      In the interests of clarity, however, let's examine a specific
1013	      example. The MDLI	in this	case is	1 hour.	 The desired client
1014	      lease interval is	3 days.	 In operation this might work as fol-
1015	      lows:

1017	      When a Primary Server makes an offer for a new lease on an IP
1018	      address to a DHCP	client,	it determines the desired client lease
1019	      interval (in this	case, 3	days).	It then	examines the ack-
1020	      nowledged	Secondary lease	interval (which	in this	case is	 zero).

1024	      Since the	actual client lease interval can not be	allowed	to
1025	      exceed the current Secondary lease interval by more than the MDLI,
1026	      the offer	made to	the DHCP client	(the actual client lease inter-
1027	      val) is for (essentially)	the MDLI, 1 hour.

1029	      Once the Primary Server has performed the	ACK to the DHCP	client,
1030	      it will update the Secondary Server with the lease information.
1031	      However, the Secondary Server lease interval will	be composed of
1032	      the current actual client	lease interval + ( 1.5 * desired client
1033	      lease interval). Thus, the Secondary Server is updated with a
1034	      lease interval of	4.5 days + 1 hour.

1036	      When the Primary Server receives an ACK to its update of the
1037	      Secondary	Server's lease interval, it records that as the	ack-
1038	      nowledged	Secondary Server lease interval.  The Primary Server
1039	      MUST ensure that the Secondary Server has	received and recorded in
1040	      its stable storage the Secondary Server lease interval.

1042	      When the DHCP client attempts to renew at T1 (approximately one
1043	      half an hour from the start of the lease), the Primary Server
1044	      again determines the desired client lease time, which is still 3
1045	      days.  It then compares this with the remaining acknowledged
1046	      Secondary Server lease interval (adjusting for the time passed
1047	      since the Secondary Server was last updated), which is 4.5 days +
1048	      1/2 hour.  The actual client lease interval is set to the desired
1049	      client lease interval, as it is less than the acknowledged
1050	      Secondary lease interval.

1051	      When the Primary DHCP server updates the Secondary DHCP server
1052	      after the DHCP client's renewal ACK is complete, it will calculate
1053	      the Secondary Server lease interval as the actual client lease
1054	      interval (3 days this time) + (0.5 * desired client lease
1055	      interval), i.e., + 1.5 days.  In this way, the Primary attempts to
1056	      have the Secondary always "lead" the client in its understanding
1057	      of the client's lease interval.

1059	      Once the initial actual client lease interval of the MDLI	is past,
1060	      the protocol operates effectively	like the DHCP protocol does
1061	      today in its behavior concerning lease intervals.	However, the
1062	      guarantee	that the actual	client lease interval will never exceed
1063	      the acknowledged Secondary Server	lease interval by more than the
1064	      MDLI allows full recovery	from failures in lazy update.
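
      The arithmetic of this example can be replayed in a few lines.
      The sketch below is illustrative only: the names are
      hypothetical, and it follows the example in granting the full
      MDLI on the first offer, where the strict form of the fundamental
      relationship would grant one second less.

```python
HOUR = 3600
DAY = 24 * HOUR
MDLI = 1 * HOUR    # value used in the example
DESIRED = 3 * DAY  # desired client lease interval in the example


def worked_example():
    """Return the five intervals of the example, in seconds."""
    acked = 0  # nothing acknowledged by the Secondary yet
    # First offer: limited to (essentially) the MDLI.
    actual_1 = min(DESIRED, acked + MDLI)
    # First update: actual interval + 1.5 * desired interval.
    secondary_1 = actual_1 + (3 * DESIRED) // 2
    # Renewal about half an hour into the one hour lease.
    elapsed = actual_1 // 2
    remaining = secondary_1 - elapsed
    actual_2 = min(DESIRED, remaining + MDLI)
    # Second update: actual interval + 0.5 * desired interval.
    secondary_2 = actual_2 + DESIRED // 2
    return actual_1, secondary_1, remaining, actual_2, secondary_2
```

      The five values are 1 hour, 4.5 days + 1 hour, 4.5 days + 1/2
      hour, 3 days, and 4.5 days, respectively.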

1066	6.2.  Controlled re-allocation of IP addresses

1068	   When the servers cannot communicate, neither server will allow an IP
1069	   address previously used by one client to be offered to a different
1070	   client.  As a corollary, during normal operations the primary server
1074	   must update the secondary server whenever a lease expires or an IP
1075	   address is released, and must receive acknowledgement of that update
1076	   before offering the expired or released IP address to a different
1077	   client.

1079	7.  Server States

1081	   The following server	states are defined:

1083	   NORMAL State:

1085	   NORMAL state is the state used by a server when it can communicate
1086	   with the other server in the Primary-Secondary Server pair.  When in
1087	   this state, the Primary responds to DHCP client requests, while the
1088	   Secondary does not.

1090	   COMMUNICATION-INTERRUPTED state:

1092	   A server goes into this state whenever it is	unable to communicate
1093	   with	the other server. Both the Primary and Secondary Servers can go
1094	   into	this state, although the behavior changes that result are dif-
1095	   ferent. Primary and Secondary Servers cycle automatically (without
1096	   administrative intervention)	between	NORMAL and COMMUNICATION-
1097	   INTERRUPTED state as	the network connection between them fails and
1098	   recovers, or	as the partner server cycles between operational and
1099	   non-operational. No duplicate IP address allocation can occur while
1100	   the servers cycle between these states.  In this state both servers
1101	   may respond to DHCP client requests.	 When allocating new IP
1102	   addresses, each server allocates from a different pool. When	respond-
1103	   ing to renewal requests, each server	will allow continued renewal of
1104	   a DHCP client's current lease on an IP address.

1106	   PARTNER-DOWN	state:

1108	   PARTNER-DOWN state is a state either server can enter.  Once a server
1109	   has entered NORMAL state, the PARTNER-DOWN state is entered only on
1110	   command of an external agency (typically an administrator of some
1111	   sort) or after the expiration of an externally configured minimum
1112	   safe-time after the beginning of COMMUNICATION-INTERRUPTED state.
1113	   When in this state, the server no longer assumes that the other
1114	   server could still be operational and servicing a different set of
1115	   clients, but instead assumes that it is the only server operating.
1116	   Only one server should be operating in this state at a time.  The
1117	   server in this state will respond to DHCP client requests.  It will
1118	   allow renewal of all outstanding leases on IP addresses, and will
1119	   allocate IP addresses from its own pool, and after a fixed period of
1120	   time, it will allocate IP addresses from the set of all available IP
1124	   addresses.  The server will transition out of PARTNER-DOWN state
1125	   after automatic re-integration with the companion server is complete.
1126	   This automatic re-integration will typically be initiated by the
1127	   restart of the server which was down.

1129	   POTENTIAL-CONFLICT state:

1131	   This	state indicates	that the two servers are attempting to rein-
1132	   tegrate with	each other, but	at least one of	them was running in a
1133	   state that did not guarantee	automatic reintegration	would be possi-
1134	   ble.	 In POTENTIAL-CONFLICT state the servers may determine that the
1135	   same	IP address has been offered and	accepted by two	different DHCP
1136	   clients.

1138	   RECOVER state:

1140	   This	state indicates	that the server	has no information in its stable
1141	   storage. A server in	this state will	attempt	to refresh its stable
1142	   storage from	the other server.

1144	   SYNC	state:

1146	   In this state, the Secondary	Server attempts	to synchronize its
1147	   stable storage with the Primary Server.  Both the Primary and Secon-
1148	   dary	may have information that the other lacks.

1150	8.  Primary Server Operation

1152	   This	section	discusses the operation	of the primary server using the
1153	   state transition diagram in Figure 8.2-1.

1155	8.1.  Primary Server Initialization

1157	   When	the Primary Server starts, there are three possibilities:  it
1158	   has never started before and	therefore has no record	of any previous
1159	   state nor of	any client binding information;	it has started before
1160	   and has a record of a previous state	and possibly of	some client
1161	   binding information;	it has started before, but failed catastrophi-
1162	   cally, and now has no record	of any previous	state (nor of any client
1163	   binding information).

1165	   When the Primary Server starts, if it has any record of a previous
1166	   state, then if that state was NORMAL or COMMUNICATION-INTERRUPTED it
1167	   moves to COMMUNICATION-INTERRUPTED state.  If that state was
1168	   PARTNER-DOWN or POTENTIAL-CONFLICT, then it moves to PARTNER-DOWN
1169	   state.  If that state was RECOVER, then the Primary Server moves into
1170	   the RECOVER state.

1174	   If it has no record of any previous state, then either this is an
1175	   initial startup, or a recovery from a catastrophic failure where
1176	   stable storage and all client binding information were lost.  The
1177	   two cases are distinguished by an external configuration indication
1178	   that tells the Primary Server it is recovering from such a failure.
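
   The startup rules of this section can be summarized as a small
   lookup.  The function and parameter names are hypothetical.  The
   handling of the no-record case (RECOVER when the external indication
   is present, PARTNER-DOWN otherwise) is taken from the labels at the
   top of Figure 8.2-1 and is this sketch's reading of that figure, not
   a rule stated in the text.

```python
from typing import Optional


def primary_initial_state(
    previous: Optional[str], recovery_indicated: bool = False
) -> str:
    """State the Primary Server moves to when it starts.

    `previous` is the state recorded in stable storage, or None if no
    record exists.  `recovery_indicated` stands for the external
    configuration indication of a catastrophic failure.
    """
    if previous in ("NORMAL", "COMMUNICATION-INTERRUPTED"):
        return "COMMUNICATION-INTERRUPTED"
    if previous in ("PARTNER-DOWN", "POTENTIAL-CONFLICT"):
        return "PARTNER-DOWN"
    if previous == "RECOVER":
        return "RECOVER"
    if previous is None:
        # Assumption drawn from Figure 8.2-1: "<ext. cmd>" leads to
        # RECOVER and "<none>" leads to PARTNER DOWN.
        return "RECOVER" if recovery_indicated else "PARTNER-DOWN"
    raise ValueError("unknown previous state: " + previous)
```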

1180	8.2.  Primary Server State Transitions

1182	   Figure 8.2-1	is the diagram of the Primary Server's state transi-
1183	   tions. The remainder	of this	section	contains information important
1184	   to the understanding	of that	diagram.

1186	   The server stays in the current state until all of the actions
1187	   specified on the state transition are complete.  If communications
1188	   fail during one of the actions, the server simply stays in the
1189	   current state and attempts a transition whenever the conditions for a
1190	   transition are later fulfilled.

1192	   In the state	transition diagram below, the "+" or "-" in the	upper
1193	   right corner	of each	state is a notation about whether communication
1194	   is ongoing with the Secondary Server.  The legend "responsive" and
1195	   "unresponsive" in each state	indicates whether the Primary Server is
1196	   responsive to DHCP client requests in the respective	state.

1198	   In the state transition diagram below, when communication is
1199	   reestablished between the Primary and Secondary Server, the Primary
1200	   server must record the state of the Secondary Server at the time
1201	   communication was reestablished.

1203	   If the state	of the Secondary Server	changes	 while communicating,
1204	   then	the Primary Server moves through the communications-failed tran-
1205	   sition, and into whatever state results.  It	then immediately moves
1206	   through whatever state transition is	appropriate given the current
1207	   state of the	Secondary Server.

1209	   DISCUSSION:

1211	      The point	of this	technique is simplicity, both in explanation of
1212	      the protocol and in its implementation.  The alternative to this
1213	      technique	of memory of partner state and automatic state transi-
1214	      tion on change of	partner	state is to have every state in	the fol-
1215	      lowing diagram have a state transition for every possible	state of
1216	      the partner.  With the approach adopted, only the	states in which
1217	      communications are reestablished require a state transition for
1218	      each possible partner state.

1220	   All state transitions of the Primary Server must be recorded in its
1221	   stable storage, and thus be available to the server after a server
1225	   restart.

1227		       Previous	Primary	State:

1229		 NORMAL	or     RECOVER	       PARTNER DOWN
1230	       COMMUNICATION  <ext. cmd>    POTENTIAL CONFLICT
1231		INTERRUPTED	  |		<none>
1232	       +---+		  V		   |
1233	       |     +----------------+	+-----------------+
1234	       |     |		    - |	|		- |
1235	       |     |	  RECOVER     |	|  PARTNER DOWN	  |<-----+
1236	       |     | (unresponsive) |	|  (responsive)	  |	 |
1237	       |     +----------------+	+-----------------+	 |
1238	       |       |		 |	 |	 ^	 |
1239	       |   Comm. OK		 |    Comm. OK	 |	 |
1240	       |   Sec.	State:		 |  Sec. State:	Comm.	 |
1241	       |    |	   |		 V  All	Others	Failed	 |
1242	       |    |	RECOVER	    +<---+	 V	 |	 |
1243	       |   All	   |	    |	    +-------------+	 |
1244	       |  Others   |	 Comm. OK   |  POTENTIAL +|	 |
1245	       |    |	  Note	Sec. State: |  CONFLICT	  |	 |
1246	       |    |	  Poss.	 RECOVER    |(responsive) |<---- | --+
1247	       |    V	  Error	  NORMAL    +-------------+	 |   |
1248	       | Sec->Pri   |	 Pri->Sec	    |		 |   |
1249	       |   Sync	    |	  Sync.	      Resolve Conflict	 |   |
1250	       |    |	    |	    V		    V		 |   |
1251	       | Wait MDLI  |	   +-----------------+		 |   |
1252	       | from Fail. |	   |		   + | External	 |   |
1253	       |    V	    V	   |	 NORMAL	     |-Command-->+   |
1254	       |    +-----++------>|  (responsive)   |		 |   |
1255	       |	  ^	   +-----------------+		 |   |
1256	       |	  |		    |			 |   |
1257	       |      Pri<->Sec		  Comm.		    External |
1258	       |	Sync		 Failed		     Command |
1259	       |	  |		    |			or   |
1260	       |      Comm. OK		    |	       "Safe Period" |
1261	       |     Sec. State:	    V		 expiration  |
1262	       |       NORMAL	   +-----------------+		 |   |
1263	       |     COMM. INT.	   |		   - |---------->+   |
1264	       |      RECOVER------| COMMUNICATIONS  |		     |
1265	       |		   |   INTERRUPTED   |	 Comm. OK    |
1266	       +------------------>|  (responsive)   |--Sec. State:--+
1267				   +-----------------+	All Others

1269		   Figure 8.2-1:  Primary Server state diagram.

1273	8.3.  Primary Server in	PARTNER-DOWN state

1275	   When	it is in PARTNER-DOWN state, the Primary Server	operates largely
1276	   as does a normal DHCP server, with none of the special algorithms
1277	   described below.  In	PARTNER-DOWN state the Primary Server MUST
1278	   respond to DHCP client requests.

1280	   Any available IP address tagged as belonging	to the Secondary Server
1281	   (at entry to	PARTNER-DOWN state) MUST NOT be	used until the MDLI
1282	   beyond the entry into PARTNER-DOWN state has	elapsed.

1284	   The Primary Server MUST NOT allocate	an IP address to a DHCP	client
1285	   different from that to which	it was allocated at the	entrance to
1286	   PARTNER-DOWN state until the MDLI beyond its expiration time has
1287	   elapsed.  If	this time would	be earlier than	the current time plus
1288	   the MDLI, then the current time plus	the MDLI is used.
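
	   The reallocation rule above can be sketched as follows.  This is a
	   minimal illustration, not part of the protocol: the function and
	   parameter names are hypothetical, times are plain numbers in
	   seconds, and the sketch assumes that "the current time" is sampled
	   once (for example, on entry to PARTNER-DOWN state), which the text
	   does not state explicitly.

```python
def earliest_reallocation_time(expiration: float,
                               reference_time: float,
                               mdli: float) -> float:
    """Earliest time an address may be given to a different client.

    expiration     -- lease expiration time of the existing binding
    reference_time -- "the current time" of the text (assumed sampled once)
    mdli           -- maximum delta lease interval
    """
    # The address is held until MDLI past its expiration, but never
    # released earlier than MDLI past the reference time.
    return max(expiration + mdli, reference_time + mdli)
```

	   For example, with an MDLI of 10, a lease expiring at time 100 and a
	   reference time of 50, the address stays reserved until time 110.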

1290	   Two options exist for lease times, with different ramifications flow-
1291	   ing from each.

1293	   If the Primary Server wishes	the Failover Protocol to protect it from
1294	   loss	of stable storage in any state,	then it	should ensure that the
1295	   MDLI	based lease time restrictions in Section 6.1 are maintained,
1296	   even	in PARTNER-DOWN	state.

1298	   If the Primary Server wishes to forgo the protection of the Failover
1299	   Protocol in the event of loss of stable storage, then it need recog-
1300	   nize	no restrictions	on actual client lease times while in PARTNER-
1301	   DOWN	state.

1303	   The Primary Server MUST poll	the Secondary Server and attempt to
1304	   establish communications and	synchronization	with it.

1306	   Once	the Primary succeeds in	contacting the Secondary Server, the
1307	   Primary examines the	state of the Secondary Server. If the state of
1308	   the Secondary Server	is RECOVER or NORMAL, then both	servers	have
1309	   been	running	in such	a way that duplicate IP	address	allocations were
1310	   inhibited.  In this case, the Primary Server	updates	the Secondary
1311	   Server with its client binding information, and moves into the NORMAL
1312	   state.

1314	   Once	contact	has been established, if the state of the Secondary
1315	   Server is anything other than RECOVER or NORMAL then	the Primary
1316	   Server moves	into the POTENTIAL-CONFLICT state.
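
	   As an illustrative sketch only, the two outcomes described above
	   for a Primary Server leaving PARTNER-DOWN state can be written as a
	   single decision.  The function name and the use of strings for
	   state names are assumptions made for this example.

```python
def primary_next_state_from_partner_down(secondary_state: str) -> str:
    """State the Primary moves to once it has contacted the Secondary."""
    if secondary_state in ("RECOVER", "NORMAL"):
        # Duplicate allocations were inhibited; the Primary updates the
        # Secondary with its client bindings before entering NORMAL.
        return "NORMAL"
    # Any other Secondary state means allocations may have overlapped.
    return "POTENTIAL-CONFLICT"
```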

1318	8.4.  Primary Server in	RECOVER	state

1320	   When the Primary Server is initialized in RECOVER state it expects to
1324	   refresh its stable storage from an existing Secondary Server.  In
1325	   this	state the Primary Server MUST NOT respond to DHCP client
1326	   requests.

1328	   When	the Primary Server succeeds in contacting the Secondary	Server,
1329	   if it determines that the Secondary Server is itself	in the RECOVER
1330	   state (which	indicates that the Secondary Server has	no existing
1331	   client binding information),	the Primary Server will	move directly
1332	   into	NORMAL state after signaling some kind of an error (since some
1333	   person had to explicitly start the Primary Server in	RECOVER	state to
1334	   refresh its lost client binding information from the	Secondary, and
1335	   the Secondary had no	state).

1337	   If the Primary Server determines that the Secondary Server is in any
1338	   state other than RECOVER, then the Secondary	Server has some	client
1339	   binding information that the	Primary	Server needs before it moves
1340	   into	the NORMAL state.  The Primary Server will attempt to refresh
1341	   its state from the Secondary	Server,	and it will remain in the
1342	   RECOVER state until it is successful	in doing so.

1344	   The Primary Server MUST remain in RECOVER state until a period of at
1345	   least the MDLI has passed since the Primary Server was known	to have
1346	   failed.  This allows any client holding an IP address allocated by
1347	   the Primary Server before it lost its client binding information in
1348	   stable storage either to contact the Secondary Server or to have its
1349	   lease time out.

1351	   DISCUSSION:

1353	      The actual requirement on	this wait period in RECOVER is that it
1354	      start when the Primary Server went down, not necessarily when it
1355	      came back	up.  If	the time when the Primary Server failed	is
1356	      known, then it could be communicated to the recovering server, and
1357	      the wait period could be reduced to the MDLI less	the difference
1358	      between the current time and the time the	server failed. In this
1359	      way, the waiting period could be minimized.
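
	      The reduction described above is simple arithmetic.  A minimal
	      sketch, assuming the failure time is known and that all values
	      are in seconds (the function name is hypothetical):

```python
def remaining_recover_wait(now: float, failed_at: float, mdli: float) -> float:
    """Time the recovering server must still wait in RECOVER state."""
    # MDLI less the time already elapsed since the failure, never negative.
    return max(0.0, mdli - (now - failed_at))
```

	      With an MDLI of 900 seconds and a server that failed 600 seconds
	      ago, the remaining wait is 300 seconds.  Once a full MDLI has
	      passed since the failure, no further wait is needed.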

1361	8.5.  Primary Server in	NORMAL state

1363	   When	in NORMAL state, the Primary Server takes the following	actions
1364	   to implement	the Safe Failover Protocol:

1366		o Lease	Time Calculations

1368		  As discussed in Section 6.1, "Control	of lease time",	the
1369		  lease	interval given to a DHCP client	can never be more than
1370		  the maximum delta lease interval greater than	the acknowledged
1373		  Secondary Server lease interval.

1375		  As long as the Primary Server	adheres	to this	constraint, the
1376		  specifics of the lease intervals that	it gives to either the
1377		  DHCP client or the Secondary DHCP server are implementation
1378		  dependent. One possible approach is shown in Section 6.1, but
1379		  that particular approach is in no way	required by this proto-
1380		  col.

1382		o Lazy Update of Secondary Server

1384		  After an ACK of an IP address binding, the Primary Server
1385		  attempts to update the Secondary with	the binding information.
1386		  The lease time used in the update of the Secondary MUST be at
1387		  least	that given to the DHCP client in the DHCPACK.  It MAY,
1388		  however, be longer.

1390		o Reallocation of IP Addresses Between Clients

1392		  Whenever a client binding is released, a DHCPBNDUPD message
1393		  must be sent to the Secondary	Server,	setting	the binding
1394		  state	to RELEASED. However, until a DHCPBNDACK is received for
1395		  this message,	the IP address cannot be allocated to another
1396		  client.
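
	   The three NORMAL-state rules above can be sketched as small checks.
	   This is a hedged illustration rather than a required approach: all
	   names are hypothetical, times are absolute values in seconds, and
	   treating a missing or already-expired acknowledgement as "now" is an
	   assumption drawn from the current-time-plus-MDLI behavior described
	   for COMMUNICATION-INTERRUPTED state.

```python
def client_lease_expiry(desired_expiry: float,
                        acked_secondary_expiry: float,
                        now: float,
                        mdli: float) -> float:
    """Cap the client's lease at MDLI beyond what the Secondary acknowledged."""
    # Assumption: a stale or missing acknowledgement counts as "now".
    base = max(acked_secondary_expiry, now)
    return min(desired_expiry, base + mdli)


def update_expiry_ok(update_expiry: float, client_expiry: float) -> bool:
    """The lease time sent to the Secondary must be at least the client's."""
    return update_expiry >= client_expiry


def may_reallocate(released: bool, release_acked: bool) -> bool:
    """A released address is reusable only after the DHCPBNDACK arrives."""
    return released and release_acked
```

	   For instance, with an MDLI of 600, a current time of 1200 and an
	   acknowledged Secondary lease time of 1000, a client asking for a
	   lease until 5000 would receive one until 1800.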

1398	8.6.  Primary Server in COMMUNICATION-INTERRUPTED state

1400	   When in COMMUNICATION-INTERRUPTED state the Primary Server operates
1401	   in such a way that correct operation is ensured even if the Secondary
1402	   Server is still up and operational, but unable to communicate with
1403	   the Primary Server. When communications are reestablished between the
1404	   Primary and Secondary Servers, if both are still in COMMUNICATION-
1405	   INTERRUPTED state, then the re-integration of their operation will
1406	   proceed automatically and without human intervention.  The protocol
1407	   is designed to ensure that re-integration will proceed in an
1408	   error-free manner and that no actions taken by either server while
1409	   in COMMUNICATION-INTERRUPTED state will cause problems during
1410	   re-integration.

1412	   The Primary Server operates in COMMUNICATION-INTERRUPTED state as it
1413	   does	in NORMAL state.

1415	   However, since it cannot communicate	with the Secondary in this
1416	   state, the acknowledged-Secondary-lease-time	will not be updated in
1417	   any new bindings. This is likely to eventually cause	the actual-
1418	   client-lease-times to be the	current-time plus the MDLI (unless this
1419	   is greater than the desired-client-lease-time).

1423	   The Primary Server can simply queue updates to the Secondary	on com-
1424	   munication interruption and stay in the NORMAL state. If, at	the time
1425	   communication with the Secondary is reestablished, the Secondary
1426	   remains in the NORMAL state as well,	then the queued	updates	for the
1427	   Secondary will simply be processed.

1429	   COMMUNICATION-INTERRUPTED state for the Primary Server is a signal
1430	   that	it has stopped queuing updates to the Secondary, and is	able to
1431	   respond to a	variety	of possible Secondary states.

1433	   It is anticipated that some alarm condition would be	raised upon the
1434	   transition from NORMAL state	to COMMUNICATION-INTERRUPTED state. Once
1435	   the Primary Server has been in COMMUNICATION-INTERRUPTED state for a
1436	   period equal	to the safe-period, then it can	(if configured to do so)
1437	   transition into the PARTNER-DOWN state.  An external	command	may also
1438	   force a transition to PARTNER-DOWN state.

1440	9.  Secondary Server Operation

1442	   The Secondary Server	responds to DHCP client	requests only in the
1443	   PARTNER-DOWN	and COMMUNICATION-INTERRUPTED states.

1445	9.1.  Secondary	Server Initialization

1447	   When	the Secondary Server starts, there are three possibilities: it
1448	   has never started before and	therefore has no record	of any previous
1449	   state nor of	any client binding information;	it has started before
1450	   and has a record of a previous state	and possibly of	some client
1451	   binding information;	it has started before, but failed catastrophi-
1452	   cally, and now has no record	of any previous	state (nor of any client
1453	   binding information).

1455	   When	the Secondary Server starts, if	it has any record of a previous
1456	   state, then if that state was NORMAL, COMMUNICATION-INTERRUPTED, or
1457	   SYNC, it moves to COMMUNICATION-INTERRUPTED state. If that state was
1458	   PARTNER-DOWN	or POTENTIAL-CONFLICT, then it moves to	PARTNER-DOWN
1459	   state. In all other cases (both other previous states and the cases
1460	   where there is no record of a previous state), the Secondary	Server
1461	   moves into the RECOVER state.
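
	   The start-up rule above amounts to a small lookup.  The following
	   sketch is illustrative only; the function name is hypothetical, and
	   the absence of any recorded state is represented by None.

```python
from typing import Optional


def secondary_initial_state(previous_state: Optional[str]) -> str:
    """State the Secondary Server enters when it starts."""
    if previous_state in ("NORMAL", "COMMUNICATION-INTERRUPTED", "SYNC"):
        return "COMMUNICATION-INTERRUPTED"
    if previous_state in ("PARTNER-DOWN", "POTENTIAL-CONFLICT"):
        return "PARTNER-DOWN"
    # Any other recorded state, or no record at all (None).
    return "RECOVER"
```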

1463	9.2.  Secondary	Server State Transitions

1465	   The server stays in the current state until all of the actions speci-
1466	   fied on the state transition are complete.  If communication fails
1467	   during one of the actions, the server simply	stays in the current
1468	   state and attempts a transition whenever the conditions for a
1472	   transition are later	fulfilled.

1474	   In the state	transition diagram below, the "+" or "-" in the	upper
1475	   right corner	of each	state is a notation about whether communication
1476	   is ongoing with the Primary Server. The legend "responsive" or
1477	   "unresponsive" in each state	indicates whether the Secondary	Server
1478	   is responsive to DHCP client	requests in the	respective state.

1480	   In the state transition diagram below, when communication is
1481	   reestablished between the Secondary and Primary Servers, the
1482	   Secondary Server must record the state of the Primary Server at the
1483	   time communication was reestablished. If the state of the Primary
1484	   Server changes while communicating, then the Secondary Server moves
1485	   through the communications-interrupted transition, and into whatever
1486	   state results.  It then immediately moves through whatever state
1487	   transition is appropriate for the current state of the Primary
1488	   Server.

1490	   All state transitions of the	Secondary Server must be recorded in its
1491	   stable storage, and thus be available to the	server after a server
1492	   restart.

1496		       Previous	Secondary State:

1498		 NORMAL	   RECOVER	  PARTNER DOWN
1499	       COMM. INT.   <none>	POTENTIAL CONFLICT
1500		  SYNC	      |		       |
1501	       +---+	      V		       V
1502	       |     +----------------+	+-----------------+
1503	       |     |	  RECOVER   - |	|  PARTNER DOWN	- |<-----+
1504	       |     | (unresponsive) |	|  (responsive)	  |	 |
1505	       |     +----------------+	+-----------------+	 |
1506	       |       |		 |	|	 ^	 |
1507	       |   Comm. OK		 |   Comm. OK	 |	 |
1508	       |   Pri.	State:		 |  Pri. State:	Comm.	 |
1509	       |    |	   |		 V  All	Others	Failed	 |
1510	       |    |	RECOVER	    +<---+	V	 |	 |
1511	       |    |	   |	    |	    +--------------+	 |
1512	       |    |	   |	 Comm. OK   |  POTENTIAL + |	 |
1513	       |   All	   |	Pri. State: |  CONFLICT	   |	 |
1514	       |  Others   |	 RECOVER    |(unresponsive)|<--- | --+
1515	       |    |	  Note	    |	    +--------------+	 |   |
1516	       |    |	  Poss.	 Sec->Pri	    |		 |   |
1517	       |    V	  Error	  Sync.	      Resolve Conflict	 |   |
1518	       | Pri->Sec  |	    V		    V		 |   |
1519	       |   Sync	   |	   +-----------------+		 |   |
1520	       |    V	   V	   |	 NORMAL	   + |-External->+   |
1521	       |    +-----++------>| (unresponsive)  | Command	 |   |
1522	       |	  ^	   +-----------------+		 |   |
1523	       |      Pri<->Sec	      |	       ^		 |   |
1524	       |	Sync	      |	 Start Alloc Timer	 |   |
1525	       |	  |	      |	    Sec->Pri		 |   |
1526	       |  +--------------+    |	      Sync		 |   |
1527	       |  |	       + |--->+	       |	    External |
1528	       |  |	SYNC	 |  Comm.   Comm. OK	     Command |
1529	       |  | unresponsive | Failed  Pri.	State:		or   |
1530	       |  +--------------+    |	     RECOVER   "Safe Period" |
1531	       |	  ^	      V	       |	 expiration  |
1532	       |	  |	  +------------------+		 |   |
1533	       |      Comm. OK	  | COMMUNICATIONS - |---------->+   |
1534	       |     Pri. State:  |    INTERRUPTED   |	 Comm. OK    |
1535	       |       NORMAL-----|   (responsive)   |--Pri. State:--+
1536	       |     COMM. INT.	  +------------------+	All Others
1537	       |		     ^
1538	       +---------------------+

1540		  Figure 9.2-1:	 Secondary Server State	Diagram.

1544	9.3.  Secondary	Server in RECOVER state

1546	   The Secondary DHCP server comes up in the RECOVER state when	it has
1547	   no record of	any previous state (or that previous state was RECOVER).

1549	   It stays in this state until	it establishes communication with the
1550	   Primary Server, and is unresponsive to DHCP client requests in this
1551	   state. Essentially it is idle until it can contact the Primary
1552	   Server.

1554	   When	it establishes communication with the Primary Server, it
1555	   attempts to load its	client binding database	from that of the Primary
1556	   Server using	the techniques specified in section 6.

1558	   Once	the Secondary Server's client binding database is refreshed from
1559	   that	of the Primary,	the Secondary Server moves into	NORMAL state.

1561	9.4.  Secondary	Server in NORMAL state

1563	   In NORMAL state, the Secondary Server receives state updates from the
1564	   Primary Server in DHCPBNDUPD	messages.  It records these in its
1565	   client binding database in stable storage and then sends the
1566	   corresponding DHCPBNDACK message to the Primary Server.

1568	   While in NORMAL state, the Secondary	Server MUST also acquire a
1569	   series of IP	addresses from the Primary Server to be	used to	satisfy
1570	   DHCPDISCOVER requests from DHCP clients when in COMMUNICATION-
1571	   INTERRUPTED state. See Section 2.2.2 for details of this acquisition
1572	   process.

1574	   The Secondary Server	periodically polls the Primary Server with the
1575	   DHCPPOLL message. If	it fails to receive a DHCPPRPL message in reply
1576	   after a configured number of	retries	or some	administratively deter-
1577	   mined time, the Secondary Server transitions	into COMMUNICATION-
1578	   INTERRUPTED state. Both the DHCPPOLL	and DHCPPRPL messages carry the
1579	   current status of the sender.

1581	   If an external command is received by the Secondary Server, it can
1582	   move from NORMAL to PARTNER-DOWN state directly.  Such a command
1583	   might be sent when the Primary Server was removed from service, and
1584	   an operator wanted the Secondary Server to take over immediately and
1585	   completely from the Primary Server.  (Note that the Secondary Server
1586	   takes over from the Primary Server when in COMMUNICATION-INTERRUPTED
1587	   state, but less completely than in PARTNER-DOWN state.)

1591	9.5.  Secondary	Server in COMMUNICATION-INTERRUPTED state

1593	   When in COMMUNICATION-INTERRUPTED state the Secondary Server operates
1594	   in such a way that correct operation is ensured even if the Primary
1595	   Server is still up and operational, but unable to communicate with
1596	   the Secondary Server. When communications are reestablished between
1597	   the Primary and Secondary Servers, if both are still in
1598	   COMMUNICATION-INTERRUPTED state, then the re-integration of their
1599	   operation will proceed automatically and without human intervention.
1600	   The protocol is designed to ensure that re-integration will proceed
1601	   in an error-free manner and that no actions taken by either server
1602	   while in COMMUNICATION-INTERRUPTED state will cause any conflicts to
1603	   occur during re-integration.

1605	   In COMMUNICATION-INTERRUPTED	state, the Secondary Server responds to
1606	   DHCP	client requests.

1608	   When	processing a DHCPREQUEST from a	DHCP client, the Secondary
1609	   Server MUST ensure that the client-lease-time is never more than the
1610	   maximum-delta-lease-interval from the current-time, independent of
1611	   the desired-client-lease-time.

1613	   When	processing a DHCPRELEASE request from a	DHCP client or the
1614	   expiration of a lease, the Secondary	Server must not	reallocate the
1615	   IP address to a different client.  If the same client subsequently
1616	   performs a DHCPDISCOVER request, the	Secondary Server SHOULD	offer it
1617	   the previously used IP address.

1619	   When	processing a DHCPDISCOVER request from a DHCP client, the secon-
1620	   dary	MUST allocate IP addresses from	the list of IP addresses that it
1621	   acquired from the Primary Server in RECOVER state.  When it exhausts
1622	   this	list, it MUST stop responding to DHCPDISCOVER requests (except
1623	   those it can	satisfy	by offering expired or released	IP addresses to
1624	   their previously bound clients).
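
	   The DHCPREQUEST and DHCPDISCOVER rules above can be sketched as
	   follows.  This is an illustration under stated assumptions, not a
	   prescribed implementation: the function names, the dictionary of
	   former bindings (client identifier to expired or released address)
	   and the list holding the addresses acquired from the Primary Server
	   are all hypothetical.

```python
from typing import Dict, List, Optional


def capped_lease_expiry(desired_expiry: float, now: float, mdli: float) -> float:
    """Lease expiry for a DHCPREQUEST: never more than MDLI from now."""
    return min(desired_expiry, now + mdli)


def choose_offer(client_id: str,
                 former_bindings: Dict[str, str],
                 private_pool: List[str]) -> Optional[str]:
    """Address to offer for a DHCPDISCOVER, or None to stay silent."""
    # An expired or released address is only ever offered back to the
    # client that previously held it.
    if client_id in former_bindings:
        return former_bindings[client_id]
    # Otherwise allocate from the addresses acquired from the Primary.
    if private_pool:
        return private_pool.pop(0)
    # Pool exhausted: do not respond to this DHCPDISCOVER.
    return None
```

	   Returning None models the requirement to stop responding once the
	   acquired list is exhausted, while still serving returning clients.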

1626	   The Secondary Server	MUST continue to send DHCPPOLL messages	to the
1627	   Primary Server when in COMMUNICATION-INTERRUPTED state.  If it
1628	   receives a DHCPPRPL message in reply, the Secondary Server determines
1629	   the state of	the Primary Server.  If	the Primary Server is in NORMAL
1630	   or COMMUNICATION-INTERRUPTED	state, then the	Secondary Server moves
1631	   into	the SYNC state.

1633	   If, however,	the Primary Server is in RECOVER state,	then the Secon-
1634	   dary	Server updates the Primary Server with its known client	binding
1635	   information,	and moves into NORMAL state upon completion of that
1636	   update.

1638	   If instructed to by an outside agency (e.g., an administrator), the
1642	   Secondary Server SHOULD move	into PARTNER-DOWN state.  Once the
1643	   Secondary Server has	been in	COMMUNICATION-INTERRUPTED state	for a
1644	   period equal	to the safe-period, then it may	(if configured to do so)
1645	   transition into the PARTNER-DOWN state in the absence of an external
1646	   command.

1648	9.6.  Secondary Server in SYNC state

1650	   The Secondary Server does not respond to DHCP client requests when in
1651	   SYNC state.

1653	   DISCUSSION:

1655	      This is the entire reason for this state's existence; otherwise the
1656	      activities specified for this state could	happen as part of a
1657	      state transition from the	COMMUNICATION-INTERRUPTED state	to the
1658	      NORMAL state. However, in	the COMMUNICATION-INTERRUPTED state the
1659	      Secondary	Server responds	to DHCP	client requests. Having	the
1660	      Secondary	Server respond to DHCP client requests during the syn-
1661	      chronization process (and	thus taking actions requiring further
1662	      synchronization) seemed like a bad idea.

1664	   The Secondary Server	synchronizes its information with the Primary
1665	   Server while in SYNC state.  Both Primary and Secondary Servers may
1666	   have	information the	other lacks because of operations performed
1667	   while communications	were interrupted.

1669	   During the synchronization process, the Secondary Server continues to
1670	   poll	the Primary Server with	DHCPPOLL messages.  If it fails	to
1671	   receive a reply, it moves back into COMMUNICATION-INTERRUPTED state.

1673	   When	synchronization	is complete, the Secondary Server moves	into
1674	   NORMAL state.

1676	9.7.  Secondary	Server in PARTNER-DOWN state

1678	   The Secondary Server	responds to DHCP client	requests when in
1679	   PARTNER-DOWN	state.

1681	   Any available IP address which does not belong to the private pool
1682	   established by the Secondary	Server (at entry to PARTNER-DOWN state)
1683	   MUST	NOT be used until the MDLI beyond the entry into PARTNER-DOWN
1684	   state has elapsed.

1686	   The Secondary Server	MUST NOT allocate an IP	address	to a DHCP client
1687	   different from that to which it was allocated at the entrance to
1691	   PARTNER-DOWN state until the MDLI beyond its expiration time has
1692	   elapsed. If this time would be earlier than the current time	plus the
1693	   MDLI, then the current time plus the	MDLI is	used.

1695	   Two options exist for lease times, with different ramifications flow-
1696	   ing from each.

1698	   If the Secondary Server wishes the Failover Protocol	to protect it
1699	   from	loss of	stable storage in any state, then it should ensure that
1700	   the MDLI based lease	time restrictions in Section 6.1 are maintained,
1701	   even	in PARTNER-DOWN	state.

1703	   If the Secondary Server wishes to forgo the protection of the
1704	   Failover Protocol in the event of loss of stable storage, then it MAY
1705	   recognize no	restrictions on	actual client lease times while	in
1706	   PARTNER-DOWN	state.

1708	   The Secondary Server	continues to poll the Primary Server with
1709	   DHCPPOLL messages.  If the Secondary	Server receives	a reply, and the
1710	   Primary Server is in	the RECOVER state, the Secondary Server	updates
1711	   the Primary Server with all of the Secondary's client binding infor-
1712	   mation, and then moves into the NORMAL state.

1714	   If communications with the Primary Server are reestablished,	and the
1715	   Primary Server is in any state other than RECOVER, the Secondary
1716	   Server moves	into the POTENTIAL-CONFLICT state (as does the Primary
1717	   Server).

1719	9.8.  Secondary	Server in POTENTIAL-CONFLICT state

1721	   The Secondary Server enters POTENTIAL-CONFLICT state when the
1722	   combination of its state and that of the Primary indicates that a
1723	   potential conflict of IP address allocation has occurred.  There is
1724	   no guarantee that such a conflict has occurred -- just the
1725	   possibility.  In this state each server compares its client binding
1726	   information with that of the other server and any conflicts are
1727	   resolved in an implementation-dependent manner.

1729	   When	(and if) the resolution	process	completes, each	server moves
1730	   into	the NORMAL state.

1732	10.  Safe Period

1734	   Due to the restrictions imposed on each server while	in
1735	   COMMUNICATION-INTERRUPTED state, long-term operation	in this	state is
1736	   not feasible for either server. One reason that these states exist at
1737	   all is to allow the servers to easily survive transient network
1741	   communications failures of a	few minutes to a few days (although the
1742	   actual time periods will depend a great deal	on the DHCP activity of
1743	   the network in terms	of arrival and departure of DHCP clients on the
1744	   network).

1746	   Eventually, when the	servers	are unable to communicate, they	will
1747	   have	to move	into a state where they	no longer can re-integrate
1748	   without some possibility of a duplicate IP address allocation.
1749	   There are two ways that they	can move into this state (known	as
1750	   PARTNER-DOWN).

1752	   They	can either be informed by external command that, indeed, the
1753	   partner server is down. In this case, there is no difficulty	in mov-
1754	   ing into the	PARTNER-DOWN state since it is an accurate reflection of
1755	   reality and the protocol has	been designed to operate correctly (even
1756	   during reintegration) if, when in PARTNER-DOWN state	the partner is,
1757	   indeed, down.

1759	   The other way arises when the servers are running unattended for
1760	   extended periods, and in this case the option is provided to config-
1761	   ure something called a "safe-period" into each server. This OPTIONAL
1762	   safe-period is the period after which either	the Primary or Secondary
1763	   Server will automatically transition	to PARTNER-DOWN	from
1764	   COMMUNICATION-INTERRUPTED state.  If	this transition	is completed and
1765	   the partner is not down, then the possibility of duplicate IP address
1766	   allocations will exist.

1768	   The goal of the "safe-period" is to allow network operations	staff
1769	   some	time to	react to a server moving into COMMUNICATION-INTERRUPTED
1770	   state.  During the safe-period the only requirement is that the net-
1771	   work	operations staff determine if both servers are still running --
1772	   and if they are, to either fix the network communications failure
1773	   between them, or to take one	of the servers down before the	expira-
1774	   tion	of the safe-period.

1776	   The length of the safe-period is installation dependent, and	depends
1777	   in large part on the	number of unallocated IP addresses within the
1778	   subnet address pool and the expected	frequency of arrival of	previ-
1779	   ously unknown DHCP clients requiring	IP addresses.  Many environments
1780	   should be able to support safe-periods of several days.

1782	   During this safe period, either server will allow renewals from any
1783	   existing client.  The only limitation concerns the need for IP
1784	   addresses for the DHCP server to hand out to	new DHCP clients and the
1785	   need	to re-allocate IP addresses to different DHCP clients.

1787	   The number of "extra" IP addresses required is equal	to the expected
1788	   total number of new DHCP clients encountered during the safe period.
1792	   This	is dependent only on the arrival rate of new DHCP clients, not
1793	   the total number of outstanding leases on IP	addresses.
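
	   As a rough planning aid, the relationship above can be turned into
	   arithmetic.  The sketch below is illustrative only: the function
	   names are hypothetical, and it assumes a steady average arrival
	   rate of new clients, which real networks may not exhibit.

```python
import math


def extra_addresses_needed(arrivals_per_hour: float,
                           safe_period_hours: float) -> int:
    """Expected number of new clients during the safe period."""
    return math.ceil(arrivals_per_hour * safe_period_hours)


def longest_safe_period_hours(free_addresses: int,
                              arrivals_per_hour: float) -> float:
    """Longest safe period the unallocated addresses can support."""
    if arrivals_per_hour <= 0:
        return math.inf  # no new clients expected, so no address pressure
    return free_addresses / arrivals_per_hour
```

	   For example, at 2.5 new clients per hour, a safe period of 48 hours
	   calls for about 120 spare addresses, regardless of how many leases
	   are already outstanding.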

1795	   In the unlikely event that a	relatively short safe period of	an hour
1796	   is all that can be used (given a dearth of IP addresses or a	very
1797	   high arrival rate of new DHCP clients), even that can provide sub-
1798	   stantial benefits in allowing the DHCP subsystem to ride through
1799	   minor problems that could occur and be fixed within that hour.  In
1800	   these cases, no possibility of duplicate IP address allocation
1801	   exists, and re-integration after the failure is resolved will be
1802	   automatic and require no operator intervention.

1804	11.  Open Issues

1806	A number of details remain to be worked	out.  They are as follows:

1808	     1.	Level of Agreement and Completion

1810		This draft is incomplete in two	senses.	 First,	none of	the
1811		authors	agree with everything written, and quite a number of
1812		issues remain to be worked out among the various authors (to say
1813		nothing	about the rest of the community).  Second, this	draft is
1814		not yet	complete enough	to support creation of inter-operable
1815		implementations.

1817		However, we believe that even though this draft	is very	much a
1818		work in progress, there is value in sharing it with the rest
1819		of the DHCP community in its current form.

1821	     2.	Failover Port

1823		We need to resolve whether the Failover protocol runs on the
1824		same port as the DHCP protocol or on a different one.  In the
1825		interests of allowing implementation of the Failover protocol by
1826		a different process or sub-process, having it use a different
1827		port seems reasonable.

1829	     3.	High Level Operations

1831		While the detailed operations are beginning to come together,
1832		the higher level operations (like reintegration) are, as yet,
1833		incompletely specified.  This will be rectified in a later
1834		revision.

1836	     4.	Option Spaces

1838		The draft currently reflects some rather fuzzy goals of	using
1839		DHCP options where they	apply but also defining	new options.  It
1842		uses the "user defined option space" for this, which is	probably
1843		not a good idea.  Perhaps the DHCP Panel will produce a	larger
1844		option space in	which all of these options can be defined, or
1845		perhaps (as it is written in the draft) this protocol will just
1846		have to	define entirely	unique options.

1848	     5.	Subnet Level Granularity

1850		This protocol talks about a server being in one state or
1851		another; however, the desire is for this protocol to operate
1852		independently in each address pool for which a primary and
1853		secondary server are defined.  In this way, the "server" state
1854		really refers to the "subnet" state.  Once the protocol	is vali-
1855		dated, the editing work	to make	it operate at subnet granularity
1856		will be	performed.

1858	     6.	Secondary Server Communications	with DHCP Clients

1860		There are two situations where we may want to allow the	secon-
1861		dary server to communicate with	DHCP clients even though the
1862		secondary can communicate with the primary and would normally be
1863		unresponsive to	DHCP client requests.

1865		The first situation which deserves consideration is where the
1866		secondary has given a DHCP client a lease on an	IP address when
1867		it was not able	to communicate with the	primary, and then subse-
1868		quently	the secondary becomes able to communicate with the pri-
1869		mary.  When the	client unicasts	its DHCPREQUEST	to the secondary
1870		to renew its lease, the	secondary will not be able to communi-
1871		cate with the client (as this protocol is defined).  Should we
1872		allow the Secondary to extend the lease	for the	DHCP client and
1873		then inform the	primary	of that	extension using	the DHCPBNDUPD
1874		message in the same way as the Primary uses that message?

1876		The second situation arises where a client can only communicate
1877		with the secondary due to some network failure,	but the	primary
1878		and secondary server can communicate.  As written, the protocol
1879		will not allow the secondary to	offer a	lease to the DHCP
1880		client,	but it would be	straightforward	to modify the protocol
1881		to allow the secondary to do so.  The only difficult part of
1882		this change to the protocol would be to	suggest	how the	secon-
1883		dary would know	that the DHCP client could talk	only to	the
1884		secondary.  But, given that if the DHCP	primary	could talk to
1885		the DHCP client, the secondary would expect to hear about it in
1886		DHCPBNDUPD messages at some point, the absence of such messages
1887		could be used as a signal to communicate to the	DHCP client in
1888		question.

1892	     7.	UDP or TCP

1894		There has been much debate about the utility of	using UDP for
1895		the failover protocol, since it	doesn't	supply guaranteed
1896		delivery.  Certainly rebuilding	TCP out	of UDP would be	a mis-
1897		take.  Some factors to consider	in this	debate are as follows:

1899		First, it is important to recognize that mere receipt of a
1900		packet by the other server in the pair (e.g., receipt of a
1901		DHCPBNDUPD packet by the secondary server) is not sufficient for
1902		the primary to update its own bindings database	with new infor-
1903		mation about what the secondary	knows.	In all cases of
1904		transfers of bindings information, the receiver of a DHCPBNDUPD
1905		message	MUST update its	own stable storage prior to replying
1906		with a DHCPBNDACK message (except in the marginal case where all
1907		of the updates are rejected).  An action is required by	the
1908		receiving server and an	explicit ACK is	needed by the sending
1909		server to ensure the integrity of the protocol.	 So, just know-
1910		ing that the other server has received a Failover protocol
1911		packet is not intrinsically interesting.

1913		Second,	the DHCP protocol, both	the client and server side, is
1914		being implemented in progressively smaller and smaller machines.
1915		While this progression is most evident in DHCP clients,	there
1916		exist implementations today of DHCP servers embedded in	devices
1917		that are by no stretch of the imagination traditional "servers"
1918		running	mainstream operating systems.  In many ways, the Fail-
1919		over protocol is very well suited to such devices.  Adding addi-
1920		tional protocol	infrastructure requirements to implement the
1921		Failover protocol could	easily prevent its implementation in
1922		devices	that in	some ways need it most.

1924		Third, there are only a	few cases where	the Failover protocol
1925		requires guaranteed delivery of	packets.  In particular, the
1926		normal Primary to Secondary DHCPBNDUPD message does not have to be
1927		delivered reliably.  The consequences of lost DHCPBNDUPD mes-
1928		sages are handled by the use of	the MDLI.  Because these mes-
1929		sages are "lazy", they may in any case go undelivered if a
1930		server failover occurs before they are transmitted.  Given
1931		that the protocol is robust in the face	of loss	of either a
1932		DHCPBNDUPD message or a	DHCPBNDACK message, a technique	known as
1933		"fire and forget" may be used with this	protocol and two
1934		cooperating implementations.  If the DHCPBNDACK	message	contains
1935		all of the information originally in the DHCPBNDUPD message,
1936		then the DHCPBNDUPD message may	be transmitted and forgotten by
1937		the sending server (typically the primary).  When and if the
1938		secondary receives the DHCPBNDUPD and replies with a DHCPBNDACK
1939		message	and the	primary	receives it, the primary will update its
1942		stable storage with a new picture of what the secondary	knows
1943		about the lease	time.  If either of these messages is lost, the
1944		only downside is that the DHCP client associated with the bind-
1945		ing in question	may receive a shorter lease for	one lease period
1946		than it	would otherwise.   This	"fire and forget" technique
1947		could substantially ease both the complexity of	implementation
1948		and memory requirements	of an implementation of	the Failover
1949		protocol, especially where two servers were communicating over a
1950		very slow link.
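
A minimal sketch of the "fire and forget" technique follows.  The
field names, the send callback, and in particular the way the MDLI is
applied in max_client_expiry() are assumptions for illustration; the
essential feature is that the sender keeps no table of outstanding
updates, because the DHCPBNDACK repeats everything that was in the
DHCPBNDUPD.

```python
def make_bndack(bndupd):
    """Secondary side: build an ACK that echoes every field of the update.
    (A real secondary would first commit the update to stable storage.)"""
    ack = dict(bndupd)
    ack["type"] = "DHCPBNDACK"
    return ack


class Primary:
    """Sending server that transmits updates and then forgets them."""

    def __init__(self, send):
        self.send = send          # hypothetical transport callback
        # ip -> lease expiry the secondary is known to have recorded.
        self.acked_expiry = {}
        # Deliberately no record of updates awaiting acknowledgment.

    def send_bndupd(self, ip, client_id, lease_expiry):
        self.send({"type": "DHCPBNDUPD", "ip": ip,
                   "client_id": client_id, "lease_expiry": lease_expiry})

    def on_bndack(self, ack):
        # The ACK is self-describing, so it can be applied without
        # matching it against any remembered request.
        self.acked_expiry[ack["ip"]] = ack["lease_expiry"]

    def max_client_expiry(self, ip, now, mdli):
        # Assumed rule: a lease may run at most MDLI beyond the later of
        # the current time and what the secondary has acknowledged.
        return max(self.acked_expiry.get(ip, 0), now) + mdli
```

If either message is lost, acked_expiry simply keeps its old value,
and under the assumed rule the client is offered a shorter lease than
it would otherwise have received.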

1952	12.  Acknowledgments

1954	   Ralph Droms started it all, by sketching out	an initial interserver
1955	   draft that embodied ideas from several past IETF meetings.  In that
1956	   draft, he acknowledged contributions	by Jeff	Mogul, Greg Minshall,
1957	   Rob Stevens,	Walt Wimer, Ted	Lemon, and the DHC working group.

1959	   Kim Kinnear and Bob Cole each extended that draft, separately and
1960	   then	together, until	they created an	interserver draft that supported
1961	   any number of servers.  The complexity of that approach was just too
1962	   great, and led to a much simpler approach embodied in the first Fail-
1963	   over	draft by Greg Rabil, Mike Dooley, Arun Kapur, and Ralph
1964	   Droms.  This	draft posited only two servers -- a primary and	a secon-
1965	   dary.  Kim Kinnear then wrote the Safe Failover draft to layer on top
1966	   of the Failover Draft and increase its robustness in the	face of
1967	   certain rare	network	failures. At the spring	1998 IETF meeting in LA,
1968	   the DHC working group said that they	wanted a merged	Failover and
1969	   Safe	Failover draft.	 Steve Gonczi and Bernie Volz stepped up and
1970	   produced the	raw material for such a	merged draft, along with a new
1971	   message format designed around DHCP options and other extensions and
1972	   clarifications.  Kim	Kinnear	edited their work into draft format and
1973	   made	other changes, and that	is what	you have in your hands.

1975	   Many	people have reviewed the various drafts	that went into this
1976	   result.  At American	Internet, ideas	have been contributed by Mark
1977	   Stapp, Brad Parker, and Ellen Garvey.  Glenn	Waters of Bay Networks
1978	   contributed ideas and enthusiasm to make a Failover protocol	that was
1979	   both	"safe" and "lazy".

1981	13.  References

1983		[1] Droms, R., "Dynamic	Host Configuration Protocol", RFC 2131,
1984		    March 1997.

1986		[2] Alexander, S., Droms, R., "DHCP Options and BOOTP Vendor
1987		    Extensions", RFC 2132, March 1997.

1991		[3] Rabil, G., Dooley, M., Kapur, A., Droms, R., "DHCP Failover
1992		    Protocol", draft-ietf-dhc-failover-00.txt.

1994		[4] Gudmundsson, O., "Security Architecture	for DHCP",
1995		    draft-ietf-dhc-security-arch-00.txt.

1997	14.  Authors' information

1999	      Ralph Droms
2000	      323 Dana Engineering
2001	      Bucknell University
2002	      Lewisburg, PA  17837

2004	      Phone: (717) 524-1145
2005	      EMail: droms@bucknell.edu

2007	      Greg Rabil, Mike Dooley, Arun Kapur
2008	      Quadritek	Systems, Inc.
2009	      10 Valley	Stream Parkway,	Suite 240
2010	      Malvern, PA 19355

2012	      Phone: (800) 208-2747

2014	      EMail: grabil@quadritek.com
2015		     mdooley@quadritek.com
2016		     akapur@quadritek.com

2018	      Kim Kinnear
2019	      American Internet	Corporation
2020	      4	Preston	Ct.
2021	      Bedford, MA  01730-2334

2023	      Phone: (781) 276-4587
2024	      EMail: kinnear@american.com

2026	      Steve Gonczi, Bernie Volz
2027	      Process Software Corporation
2028	      959 Concord St.
2029	      Framingham, MA  01701

2031	      Phone: (508) 879-6994

2033	      EMail: gonczi@process.com
2034		     volz@process.com